in reply to Using HTTP::LinkExtor to get URL and description info
You have to know your tools. HTML::LinkExtor was designed to extract only the links (the tag name and its link attributes), not the text between the opening and closing tags (call it the link text, cdata, or whatever).
Demo
    use strict;
    use Data::Dumper;
    use HTML::LinkExtor;

    my $base = 'http://perlmonks.org/';
    my $stringy = q{
    <tr><td><a HREF="/index.pl?node_id=188511">How does this code work (warnings.pm)?</a></td>
     <td>by <a HREF="/index.pl?node_id=80322">John M. Dlugosz</a></td></tr>
    <tr><td><a HREF="/index.pl?node_id=188509">Tk and X events</a></td>
     <td>by <a HREF="/index.pl?node_id=961">Anonymous Monk</a></td></tr>
    <tr><td><a HREF="/index.pl?node_id=188507">warnings::warnif etc. wise usage?</a></td>
     <td>by <a HREF="/index.pl?node_id=80322">John M. Dlugosz</a></td></tr>
    <tr><td><a HREF="/index.pl?node_id=188505">52-bit numbers as floating point</a></td>
     <td>by <a HREF="/index.pl?node_id=80322">John M. Dlugosz</a></td></tr>
    };

    # No callback: the links are collected and returned by links()
    my $p = HTML::LinkExtor->new(undef, $base);
    $p->parse($stringy);
    print Dumper $p->links;

    # With a callback: it is called once per link-bearing tag
    $p = HTML::LinkExtor->new( sub { print Dumper($_) for @_; }, $base );
    $p->parse($stringy);

And now for the nudge: HTML::TokeParser tutorial
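To make the nudge concrete, here is a minimal sketch of how HTML::TokeParser can pair each URL with its link text. It is an illustration, not the code from the tutorial or from my earlier node: the helper name links_with_text is made up for this example, the sample HTML is trimmed from the demo data, and I'm assuming URI is installed (HTML::LinkExtor already needs it when you pass a base).

```perl
use strict;
use warnings;
use HTML::TokeParser;
use URI;

# Hypothetical helper (not from the original post): returns a list of
# [absolute_url, link_text] pairs found in an HTML string.
sub links_with_text {
    my ($html, $base) = @_;
    my $p = HTML::TokeParser->new(\$html)
        or die "Cannot create parser for HTML string";
    my @found;
    # get_tag('a') skips ahead to the next <a ...> start tag
    while (my $tag = $p->get_tag('a')) {
        my $href = $tag->[1]{href};    # attribute names come back lowercased
        next unless defined $href;     # skip <a name="..."> anchors
        # Text up to the closing </a>, with whitespace collapsed
        my $text = $p->get_trimmed_text('/a');
        push @found, [ URI->new_abs($href, $base)->as_string, $text ];
    }
    return @found;
}

my $html = q{<tr><td><a HREF="/index.pl?node_id=188509">Tk and X events</a></td>
<td>by <a HREF="/index.pl?node_id=961">Anonymous Monk</a></td></tr>};

printf "%s => %s\n", @$_ for links_with_text($html, 'http://perlmonks.org/');
```

Unlike HTML::LinkExtor, this only looks at <a> tags, so img, frame and friends are ignored unless you ask for them with get_tag as well.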
Update: surprise, surprise, I've solved this one before: (crazyinsomniac) Re: Getting the Linking Text from a page
______crazyinsomniac_____________________________
Of all the things I've lost, I miss my mind the most.
perl -e "$q=$_;map({chr unpack qq;H*;,$_}split(q;;,q*H*));print;$q/$q;"