http://qs321.pair.com?node_id=1059348

perl.j has asked for the wisdom of the Perl Monks concerning the following question:

Hey Everyone!

I'm trying to hack together a little tool to help me parse RSS feeds. The code I have so far takes one or more keywords and prints the articles whose titles contain any of them. Here is the code:

use 5.14.2;
use strict;
use warnings;
use XML::RSSLite;
use LWP::Simple;

my @keywords = qw(approach);
my $URL      = 'http://www.theguardian.com/theguardian/mainsection/rss';

my $content = get($URL);
my %result;
parseRSS(\%result, \$content);

# Match any keyword as a whole word, case-insensitively.
my $re = join "|", @keywords;
$re = qr/\b(?:$re)\b/i;

foreach my $item (@{ $result{items} }) {
    my $title = $item->{title};
    $title =~ s{^\s+}{};     # strip leading whitespace
    $title =~ s{\s+$}{};     # strip trailing whitespace
    $title =~ s{\s+}{ }g;    # collapse internal runs of whitespace

    if ($title =~ /$re/) {
        print "$title\n\t$item->{link}\n\n";
    }
}

This gives me the desired effect with one URL, but I need to parse about 20 feeds and print the matching articles from all of them.

I tried turning $URL into an array (@URL) and changing that variable throughout the code, but that just gave me several errors.

So, my question is, how can I parse multiple RSS feeds in one script and have all of the output formatted the same way into the same file?

--perl.j

Re: Parsing multiple RSS files
by jethro (Monsignor) on Oct 23, 2013 at 21:25 UTC

    The get() function of LWP::Simple does not know how to work with arrays; it takes a single URL, so you need to give it the URLs one by one. This is done with a loop, which would look like this:

    foreach my $URL (@URL) { ... }
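
A minimal sketch of what could go inside that loop, with a check for failed downloads. To keep it runnable without network access, a hypothetical fetch_feed() stands in for LWP::Simple's get(), which likewise returns undef when a request fails. The example.com URLs and the %fake_web table are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical stand-in for LWP::Simple::get(): returns the body for a
# known URL and undef otherwise (get() also returns undef on failure).
my %fake_web = (
    'http://example.com/a.rss' => '<rss>feed A</rss>',
    'http://example.com/b.rss' => '<rss>feed B</rss>',
);

sub fetch_feed {
    my ($url) = @_;
    return $fake_web{$url};
}

# Placeholder URLs; the last one is deliberately unreachable.
my @URL = (
    'http://example.com/a.rss',
    'http://example.com/b.rss',
    'http://example.com/missing.rss',
);

my (@fetched, @failed);
foreach my $URL (@URL) {
    my $content = fetch_feed($URL);

    # Skip feeds that could not be downloaded instead of parsing undef.
    unless (defined $content) {
        push @failed, $URL;
        next;
    }

    push @fetched, $content;    # real code would call parseRSS() here
}

print scalar(@fetched), " fetched, ", scalar(@failed), " failed\n";
```

In the real script, fetch_feed($URL) would simply be get($URL), and one broken feed out of twenty would no longer stop the others from being processed.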
Re: Parsing multiple RSS files
by atcroft (Abbot) on Oct 23, 2013 at 21:19 UTC

    Would this (untested!) not work?

    use 5.14.2;
    use strict;
    use warnings;
    use XML::RSSLite;
    use LWP::Simple;

    my @keywords = qw(approach);
    my @URLlist  = (
        'http://www.theguardian.com/theguardian/mainsection/rss',
        'http://www.theguardian.com/theguardian/mainsection/rss1',
    );

    # The pattern does not depend on the feed, so build it once.
    my $re = join "|", @keywords;
    $re = qr/\b(?:$re)\b/i;

    foreach my $URL ( @URLlist ) {
        my $content = get($URL);

        # get() returns undef on failure; skip that feed and carry on.
        unless (defined $content) {
            warn "Could not fetch $URL\n";
            next;
        }

        my %result;
        parseRSS(\%result, \$content);

        foreach my $item (@{ $result{items} }) {
            my $title = $item->{title};
            $title =~ s{^\s+}{};     # strip leading whitespace
            $title =~ s{\s+$}{};     # strip trailing whitespace
            $title =~ s{\s+}{ }g;    # collapse internal runs of whitespace

            if ($title =~ /$re/) {
                print "$title\n\t$item->{link}\n\n";
            }
        }
    }
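
For the "same format, same file" part of the question, here is a rough sketch of one way to do it: run every feed's items through one formatting routine and print everything to a single filehandle. It is an illustration under assumptions, not tested against real feeds. The items in %items_for are invented and only mimic the title/link hashes the code above reads from $result{items}; clean_title() and matching_lines() are hypothetical helper names; quotemeta is an addition so that keywords containing regex metacharacters are matched literally. An in-memory filehandle is used so the sketch runs anywhere.

```perl
use strict;
use warnings;

my @keywords = qw(approach);

# Build the pattern once; quotemeta escapes any regex metacharacters.
my $alternation = join '|', map { quotemeta($_) } @keywords;
my $re = qr/\b(?:$alternation)\b/i;

# Trim both ends, then collapse every internal run of whitespace.
sub clean_title {
    my ($title) = @_;
    $title =~ s/^\s+|\s+$//g;
    $title =~ s/\s+/ /g;
    return $title;
}

# Return one formatted entry per item whose title matches the pattern.
sub matching_lines {
    my ($pattern, @items) = @_;
    my @out;
    for my $item (@items) {
        my $title = clean_title($item->{title});
        push @out, "$title\n\t$item->{link}\n\n" if $title =~ $pattern;
    }
    return @out;
}

# Invented sample data standing in for the parsed items of two feeds.
my %items_for = (
    'feed-one' => [
        { title => "  A new   approach\n to parsing ", link => 'http://example.com/1' },
        { title => 'Unrelated story',                  link => 'http://example.com/2' },
    ],
    'feed-two' => [
        { title => 'Budget approaches',     link => 'http://example.com/3' },
        { title => 'Approach with caution', link => 'http://example.com/4' },
    ],
);

# One handle for all feeds; an in-memory file keeps the sketch portable.
my $report = '';
open my $out, '>', \$report or die "Cannot open in-memory file: $!";
for my $feed (sort keys %items_for) {
    print {$out} matching_lines($re, @{ $items_for{$feed} });
}
close $out;

print $report;
```

To write to a real file, replace \$report in the open call with a path such as 'articles.txt' (a made-up name). Note that "Budget approaches" is not printed, because \b requires the keyword to appear as a whole word.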