Beefy Boxes and Bandwidth Generously Provided by pair Networks
go ahead... be a heretic
 
PerlMonks  

comment on

( [id://3333]=superdoc: print w/replies, xml ) Need Help??

bobf asked me to expand on my scary comment in my original reply. Here goes.

#!/usr/bin/perl -w

My first beef, 'use warnings;' instead of -w unless you have a good reason to believe your code will run on pre 5.6

use strict; use XML::Parser; use LWP::Simple; my $feed;

a lone declaration, what will $feed be for I wonder?

open( FH, ">feed.xml") or die "Error: $!\n";

Whoops, using a global filehandle, and two arg open is not a good habit unless you have reason to believe your code will be run on earlier perls.

$feed = get("http://rss.news.yahoo.com/rss/science");

Aha, so $feed will be used to store the retreived XML. Cool.

print FH $feed;

Waitaminute, now we print the contets of $feed to the FH. WTF!?

my $parser = new XML::Parser ( Handlers => { Start => \&hdl_start, End => \&hdl_end, Char => \&hdl_char, } );

ok this is all pretty normal, but indirect notation is a bad habit to get into in perl, better to use XML::Parser->new()

$parser->parsefile("feed.xml");

Ok, so now we read the data back from the file we just wrote. Of course we have serious issues here, as we havent closed the FH that we wrote to this file with earlier, meaning that the buffer probably isn't flushed and we aren't reading the full contents. Were the filehandle a lexical and properly scoped the filehandle would have automatically closed and flushed as it dropped out of scope. As it is the filehandle wont be closed and flushed until the program does so explicitly or the script terminates.

sub hdl_start { my ($p, $ele, %attribs) = @_; $attribs{'string'} = ''; $feed = \%attribs;

and yet another waitaminute. Why is $feed being used here? whats going on?!

} sub hdl_end { + my ($p, $ele) = @_; display_feed($feed) if $ele eq 'title'; display_feed($feed) if $ele eq 'link';

And now we use the $feed variable as well, but its not clear what it holds yet, presumably a hashref.

} sub hdl_char { my ($p, $str) = @_; no strict 'refs'; $feed->{'string'} .= $str;

$feed appears to be a hashref now. Earlier it was a string. Sigh. and the "no strict 'refs'" is really bad here. Apparently it doesnt do anything, but it might. In fact if hdl_char is called before hdl_start then $feed will equal the original XML content and thus we could be treating the xml response as a var name. No matter how you look at this its bad. Either we have a misleading pragma in the sub that doesnt do anything confusing things, or we have some really nasty stuff happening under the surface.

} sub display_feed { my $attribs = shift;

According to the code that calls this sub, $attribs should probably hold a hashref

$attribs =~ s/\n//g;

Woah! Now its a string. Presumably something like "HASH(0xdeadbeef)"

print "$attribs->{'string'}\n\n";

Id assume that this is just an error outright. Preumably strict will choke on this line. I assume DK never actually got to this point in the call graph

}

---
demerphq

    First they ignore you, then they laugh at you, then they fight you, then you win.
    -- Gandhi

    Flux8



In reply to Re: Writing a simple RSS feed 'grabber' with XML::Parser. (detailed review) by demerphq
in thread Writing a simple RSS feed 'grabber' with XML::Parser. by DigitalKitty

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post; it's "PerlMonks-approved HTML":



  • Are you posting in the right place? Check out Where do I post X? to know for sure.
  • Posts may use any of the Perl Monks Approved HTML tags. Currently these include the following:
    <code> <a> <b> <big> <blockquote> <br /> <dd> <dl> <dt> <em> <font> <h1> <h2> <h3> <h4> <h5> <h6> <hr /> <i> <li> <nbsp> <ol> <p> <small> <strike> <strong> <sub> <sup> <table> <td> <th> <tr> <tt> <u> <ul>
  • Snippets of code should be wrapped in <code> tags not <pre> tags. In fact, <pre> tags should generally be avoided. If they must be used, extreme care should be taken to ensure that their contents do not have long lines (<70 chars), in order to prevent horizontal scrolling (and possible janitor intervention).
  • Want more info? How to link or How to display code and escape characters are good places to start.
Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others contemplating the Monastery: (3)
As of 2024-04-25 22:06 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found