http://qs321.pair.com?node_id=538734
Category: CGI Programming
Author/Contact Info Suresh S http://www.javajuggler.blogspot.com
Description: BBC News RSS Reader
#!/usr/bin/perl -w
###############################################
#    Author : Suresh S
#
#    Version: 0.0.1a
#
#    PexlReader
###############################################

#---------------------------------------------------------------------------------------
#  This program tracks BBC News and
#  retrieves its RSS feed.
#
#  @ Return title, link, pubDate, description
#
#---------------------------------------------------------------------------------------

use strict;
use warnings;
use LWP::UserAgent;
use XML::Parser;

my $urlh = LWP::UserAgent->new;
my $urld = $urlh->get('http://newsrss.bbc.co.uk/rss/sportonline_world_edition/front_page/rss.xml');

if ($urld->is_success) {
    # Hand the fetched feed to XML::Parser; the handlers are defined below.
    my $ps = XML::Parser->new(
        Handlers => {
            Start => \&handle_start,
            End   => \&handle_end,
            Char  => \&handle_char,
        },
    );
    $ps->parse($urld->content);
}
else {
    die $urld->status_line;
}

sub handle_start {
    my ($p, $elt, %atts) = @_;
    my $msg;
    if ($elt eq "title") {
        # $msg = $atts{$_};
    }
    # print $p;
    # print $msg;
    # print $elt;
}

sub handle_end {
    my ($p, $elt) = @_;
    if ($elt eq "title") {

    }
    # print $p;
    # print $elt;
}

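sub handle_char {
    my ($p, $con) = @_;
    # Print any text chunk that contains at least one non-word character
    # (whitespace or punctuation). Chunks made only of word characters,
    # such as a one-word title, are skipped; test /\S/ instead if that
    # is not what you want.
    print $con if $con =~ /\W/;
}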
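
The header comment promises title, link, pubDate and description, but the handlers above print all character data without telling the fields apart. Below is a minimal sketch of how the Start and End handlers could track the current element so that only those four fields of each item are collected. The function name `collect_items` and the sample feed are assumptions made for illustration; they are not part of the original script.

```perl
use strict;
use warnings;
use XML::Parser;

# Fields to collect from each <item>, as listed in the header comment.
my %wanted = map { $_ => 1 } qw(title link pubDate description);

# Hypothetical helper: parse an RSS string and return one hash ref per <item>.
sub collect_items {
    my ($xml) = @_;
    my (@items, $item, $field);

    my $parser = XML::Parser->new(
        Handlers => {
            Start => sub {
                my ($p, $elt) = @_;
                if ($elt eq 'item') {
                    $item = {};                # a new item begins
                }
                elsif ($item && $wanted{$elt}) {
                    $field = $elt;             # start collecting this field
                    $item->{$field} = '';
                }
            },
            Char => sub {
                my ($p, $text) = @_;
                # Char can fire several times per element, so append.
                $item->{$field} .= $text if $item && defined $field;
            },
            End => sub {
                my ($p, $elt) = @_;
                if ($elt eq 'item') {
                    push @items, $item;        # item finished
                    undef $item;
                }
                elsif (defined $field && $elt eq $field) {
                    undef $field;              # stop collecting
                }
            },
        },
    );
    $parser->parse($xml);
    return @items;
}

# Made-up sample feed, so the sketch runs without network access.
my $sample = <<'XML';
<?xml version="1.0"?>
<rss version="2.0"><channel><title>Sample feed</title>
<item><title>First story</title><link>http://example.com/1</link>
<pubDate>Thu, 23 Mar 2006 13:55:00 GMT</pubDate>
<description>Short summary</description></item>
</channel></rss>
XML

for my $it (collect_items($sample)) {
    print "$_: $it->{$_}\n" for grep { defined $it->{$_} } qw(title link pubDate description);
}
```

The channel-level title is ignored because it sits outside any item. In the real script, `$urld->content` would be passed in place of `$sample`.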
Re: BBC News Reader
by davorg (Chancellor) on Mar 23, 2006 at 13:55 UTC

    It looks like your start and end handlers don't do anything. Maybe you should remove them.

    Also, for parsing RSS, you might like to look at XML::RSS.
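
For anyone wondering what the XML::RSS route might look like, here is a rough sketch. It assumes XML::RSS is installed and that the feed is RSS 2.0; the helper name `rss_items` and the sample feed are made up for illustration.

```perl
use strict;
use warnings;
use XML::RSS;

# Hypothetical helper: parse an RSS string and return one hash ref per item,
# keeping only the four fields the original script is after.
sub rss_items {
    my ($xml) = @_;
    my $rss = XML::RSS->new;
    $rss->parse($xml);

    my @items;
    for my $item (@{ $rss->{items} }) {
        push @items, {
            title       => $item->{title},
            link        => $item->{link},
            pubDate     => $item->{pubDate},
            description => $item->{description},
        };
    }
    return @items;
}

# Made-up sample feed, so the sketch runs without network access.
my $sample = <<'XML';
<?xml version="1.0"?>
<rss version="2.0"><channel><title>Sample feed</title>
<link>http://example.com/</link><description>Demo</description>
<item><title>First story</title><link>http://example.com/1</link>
<pubDate>Thu, 23 Mar 2006 13:55:00 GMT</pubDate>
<description>Short summary</description></item>
</channel></rss>
XML

for my $it (rss_items($sample)) {
    print "$it->{title}\n  $it->{link}\n";
}
```

With the script above, the string to parse would be `$urld->content` rather than `$sample`.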

    --
    <http://dave.org.uk>

    "The first rule of Perl club is you do not talk about Perl club."
    -- Chip Salzenberg

      I also have to fetch BBC RSS at work. Note that you can't blindly throw their RSS at an XML parser: for some reason (laziness?) they don't always encode their content correctly, so you often get titles containing plain, unescaped ampersands (etc.).

      Easy to regex out first, but...
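
As a rough idea of the kind of regex clean-up meant here, the sketch below escapes any ampersand that does not already start an entity or character reference. The helper name `escape_bare_ampersands` is made up, and the approach is only an assumption about what such a pre-pass might look like: it would also rewrite ampersands inside CDATA sections, where they are legal, so treat it as a workaround rather than a general fix.

```perl
use strict;
use warnings;

# Hypothetical helper: escape every "&" that is not already the start of
# an entity or character reference (&name;, &#123; or &#x1F;).
sub escape_bare_ampersands {
    my ($xml) = @_;
    $xml =~ s/&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#x[0-9A-Fa-f]+);)/&amp;/g;
    return $xml;
}

my $broken = '<title>Marks & Spencer results &amp; more</title>';
print escape_bare_ampersands($broken), "\n";
# prints: <title>Marks &amp; Spencer results &amp; more</title>
```

The cleaned string can then be handed to the parser in place of the raw response content.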

        If something claims to be XML but isn't for some reason, then you should pass those problems back to the people creating the XML.

        I'm pretty sure that the BBC would be very interested in hearing of any problems with their RSS feeds.

        --
        <http://dave.org.uk>

        "The first rule of Perl club is you do not talk about Perl club."
        -- Chip Salzenberg