Re: Using HTTP::LinkExtor to get URL and description info

in reply to Using HTTP::LinkExtor to get URL and description info

You have to know your tools. HTML::LinkExtor was designed to extract only the links, not the text between the tags (the link text, character data, whatever you want to call it).

Demo

use strict;
use Data::Dumper;
use HTML::LinkExtor;

my $base = 'http://perlmonks.org/';
my $stringy = q{
 <tr><td><a HREF="/index.pl?node_id=188511">How does this code work (warnings.pm)?</a></td> <td>by  <a HREF="/index.pl?node_id=80322">John M. Dlugosz</a></td></tr>
 <tr><td><a HREF="/index.pl?node_id=188509">Tk and X events</a></td> <td>by  <a HREF="/index.pl?node_id=961">Anonymous Monk</a></td></tr>
 <tr><td><a HREF="/index.pl?node_id=188507">warnings::warnif etc. wise usage?</a></td> <td>by  <a HREF="/index.pl?node_id=80322">John M. Dlugosz</a></td></tr>
 <tr><td><a HREF="/index.pl?node_id=188505">52-bit numbers as floating point</a></td> <td>by  <a HREF="/index.pl?node_id=80322">John M. Dlugosz</a></td></tr>
};


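# Without a callback, the links accumulate and are returned by links().
my $p = HTML::LinkExtor->new(undef, $base);

$p->parse($stringy);
$p->eof;    # signal end of document so any buffered input is processed

print Dumper $p->links;

# With a callback, each link is passed in as ($tag, %attr) as it is found.
$p = HTML::LinkExtor->new( sub { print Dumper($_) for @_; } , $base);

$p->parse($stringy);
$p->eof;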

And now for the nudge: HTML::TokeParser tutorial
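
To make the nudge concrete, here is a minimal sketch of how HTML::TokeParser could pull out both the URL and the link text. The helper name links_with_text and the sample $html are my own for illustration, and I'm assuming the URI module is installed to absolutize the hrefs. Treat it as a starting point, not the one true way.

```perl
use strict;
use warnings;
use HTML::TokeParser;
use URI;

my $base = 'http://perlmonks.org/';
my $html = q{<tr><td><a HREF="/index.pl?node_id=188509">Tk and X events</a></td> <td>by <a HREF="/index.pl?node_id=961">Anonymous Monk</a></td></tr>};

# Hypothetical helper: returns a list of [ absolute_url, link_text ] pairs.
sub links_with_text {
    my ($html, $base) = @_;

    # A reference to a scalar tells TokeParser to parse the string itself.
    my $p = HTML::TokeParser->new(\$html) or die "Cannot parse document";

    my @found;
    while (my $tag = $p->get_tag('a')) {
        # Attribute names come back lowercased, so HREF is found under href.
        my $href = $tag->[1]{href};
        next unless defined $href;          # skip <a name="..."> anchors

        # Collect the text up to the closing </a>, whitespace trimmed.
        my $text = $p->get_trimmed_text('/a');

        push @found, [ URI->new_abs($href, $base)->as_string, $text ];
    }
    return @found;
}

print "$_->[0]\t$_->[1]\n" for links_with_text($html, $base);
```

Each pair holds the absolute URL and its description, which is what HTML::LinkExtor alone won't give you.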

Update: surprise, surprise, I've solved this one before: (crazyinsomniac) Re: Getting the Linking Text from a page

______crazyinsomniac_____________________________
Of all the things I've lost, I miss my mind the most.
perl -e "$q=$_;map({chr unpack qq;H*;,$_}split(q;;,q*H*));print;$q/$q;"
