Beefy Boxes and Bandwidth Generously Provided by pair Networks
Welcome to the Monastery

Re: Being a heretic and going against the party line.

by Chmrr (Vicar)
on Oct 03, 2002 at 13:23 UTC ( #202504=note: print w/replies, xml ) Need Help??

in reply to Being a heretic and going against the party line.

I can certainly see your point, and personally I would never downvote anyone for giving an example which worked and which fit the requirements.

Personally, I find that using the HTML::* modules makes the code cleaner, more compact, and easier to understand what's doing on. For example, here's how I'd write the bit of code in question, using HTML::TreeBuilder (my personal favorite for doing such manipulations):

#!/usr/bin/perl use warnings; use strict; use LWP::Simple; use HTML::TreeBuilder; my $html = get("") or die "Getting html: $!" +; my $tree = HTML::TreeBuilder->new_from_content($html) or die "Building html tree: $!"; $tree = $tree->look_down("_tag"=>"a", "href"=>"") ->look_up("_tag","tr"); $tree->objectify_text(); print join ' ', map {$_->attr("text")} $tree->look_down("_tag","~text" +);

Whenever I see a huge regex, I must admit that my eyes generally glaze over slightly. Even though ones such as the one you gave are not actually all that complex, they tend to look rather intimidating.

A personal anecdote on the use of HTML::* modules for parsing; a while back, I wrote a program which, given an ISBN number, would look up basic information off to Amazon, such as title, author, possibly series, and so on. This was nearly a year ago. Just last week, someone asked me if I still had the code around. I dug around and ran it -- and, lo and behold, it spat back information. Despite that Amazon had rearranged the webpage significantly over that time, the extractor still worked.

I shan't just tell you to drink the kool-aid, but in general most of my solutions, if I have the choice at all, will use HTML::* modules. Why? I've placed my trust in them many a time, and they have yet to let me down. I will suggest that others will do the same, but if they choose not to -- well, that is their choice, and they may well be right or wrong down the road. It's their kool-aid. ;>

perl -pe '"I lo*`+$^X$\"$]!$/"=~m%(.*)%s;$_=$1;y^`+*^e v^#$&V"+@( NO CARRIER'

Log In?

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://202504]
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others pondering the Monastery: (4)
As of 2021-04-10 15:18 GMT
Find Nodes?
    Voting Booth?

    No recent polls found