package XML::RSS::Headline::Fark;
use strict;
use warnings;
use base qw(XML::RSS::Headline);
use URI::Escape qw(uri_unescape);

=head1 NAME

XML::RSS::Headline::Fark - XML::RSS::Headline Example Subclass

=head1 VERSION

2.2

=cut

our $VERSION = 2.2;


=head1 SYNOPSIS

Strip the Fark redirect prefix from each headline's URL and remove the
bracketed [blahblah] tag from the headline text.

    use XML::RSS::Feed;
    use XML::RSS::Headline::Fark;
    use LWP::Simple qw(get);

    my $feed = XML::RSS::Feed->new(
        name  => "fark",
        url   => "http://www.pluck.com/rss/fark.rss",
        hlobj => "XML::RSS::Headline::Fark",
    );

    while (1) {
        $feed->parse(get($feed->url));
        print $_->headline . "\n" for $feed->late_breaking_news;
        sleep($feed->delay);
    }

Here is the output in #news on irc.perl.org before the change:

    <rssbot> - [Sad] Elizabeth Edwards diagnosed with breast cancer
    <rssbot>   http://go.fark.com/cgi/fark/go.pl?IDLink=1200026&location=http://www.msnbc.msn.com/id/6408022

and here is the updated output:

    <rssbot> - Elizabeth Edwards diagnosed with breast cancer
    <rssbot>   http://www.msnbc.msn.com/id/6408022


=head1 MUTATED METHOD

=over 4

=item B<< $headline->item( $item ) >>

Initialize the object from a parsed RSS item returned by L<XML::RSS>, then
strip the bracketed tag from the headline and the Fark redirect prefix from
the URL.

=back

=cut

sub item {
    my ( $self, $item ) = @_;
    $self->SUPER::item($item);    # set url and description

    my $headline = $self->headline;
    $headline =~ s/\[.+?\]\s+//;
    $self->headline($headline);

    my $url     = $self->url;
    my $stripit = qr/
        http\:\/\/
        go\.fark\.com\/
        cgi\/fark\/go\.pl\?
        IDLink\=\d+\&
        location\=
    /x;
    $url =~ s/$stripit//;
    $self->url( uri_unescape($url) );
}


=head1 AUTHOR

Jeff Bisbee, C<< <jbisbee at cpan.org> >>

=head1 BUGS

Please report any bugs or feature requests to
C<bug-xml-rss-feed at rt.cpan.org>, or through the web interface at
L<http://rt.cpan.org/NoAuth/ReportBug.html?Queue=XML-RSS-Feed>.
I will be notified, and then you'll automatically be notified of progress on
your bug as I make changes.

=head1 SUPPORT

You can find documentation for this module with the perldoc command.

    perldoc XML::RSS::Headline::Fark

You can also look for information at:

=over 4

=item * AnnoCPAN: Annotated CPAN documentation

L<http://annocpan.org/dist/XML-RSS-Feed>

=item * CPAN Ratings

L<http://cpanratings.perl.org/d/XML-RSS-Feed>

=item * RT: CPAN's request tracker

L<http://rt.cpan.org/NoAuth/Bugs.html?Dist=XML-RSS-Feed>

=item * Search CPAN

L<http://search.cpan.org/dist/XML-RSS-Feed>

=back

=head1 ACKNOWLEDGEMENTS

Special thanks to Rocco Caputo, Martijn van Beers, Sean Burke, Prakash Kailasa
and Randal Schwartz for their help, guidance, patience, and bug reports. Guys,
thanks for actually taking the time to use the code and give good, honest
feedback.

=head1 COPYRIGHT & LICENSE

Copyright 2006 Jeff Bisbee, all rights reserved.

This program is free software; you can redistribute it and/or modify it
under the same terms as Perl itself.

=head1 SEE ALSO

L<XML::RSS::Feed>, L<XML::RSS::Headline>, L<XML::RSS::Headline::PerlJobs>,
L<XML::RSS::Headline::UsePerlJournals>, L<POE::Component::RSSAggregator>

=cut

1;