PerlMonks  

Re^5: Pulling a Page with LWP::UserAgent and fixing URLs?

by teabag (Pilgrim)
on Nov 10, 2004 at 14:30 UTC (#406679=note)


in reply to Re^4: Pulling a Page with LWP::UserAgent and fixing URLs?
in thread Pulling a Page with LWP::UserAgent and fixing URLs?

I mean fixing the URLs by converting relative paths to absolute ones. Just use HTML::TokeParser as suggested and you're there.

Something like this (by the way, not my code; it's from http://perl.com):

#!/usr/bin/perl
use strict;
use warnings;
use LWP;
use URI;

my $browser = LWP::UserAgent->new;
my $url     = 'http://www.cpan.org/RECENT.html';

my $response = $browser->get($url);
die "Can't get $url -- ", $response->status_line
    unless $response->is_success;

my $html = $response->content;
while ( $html =~ m/<A HREF=\"(.*?)\"/g ) {
    print URI->new_abs( $1, $response->base ), "\n";
}
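Note that the regex above only matches links written literally as <A HREF="...">, in upper case and with double quotes. Here is a rough sketch of the same idea with HTML::TokeParser, which lowercases tag and attribute names for you. The absolute_links helper and the sample HTML are my own illustration, not from the perl.com article, and it works on a string so you can try it without fetching anything:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::TokeParser;
use URI;

# Hypothetical helper (not from the original article): return every
# <a href="..."> in $html as an absolute URL, resolved against $base.
sub absolute_links {
    my ( $html, $base ) = @_;

    # HTML::TokeParser accepts a reference to a string holding the document.
    my $parser = HTML::TokeParser->new( \$html )
        or die "Can't parse HTML";

    my @links;
    # get_tag('a') skips ahead to the next <a> start tag, whatever its case.
    while ( my $tag = $parser->get_tag('a') ) {
        my $href = $tag->[1]{href};    # attribute hash is the second element
        next unless defined $href;     # e.g. <a name="..."> has no href
        push @links, URI->new_abs( $href, $base )->as_string;
    }
    return @links;
}

# Sample input (made up for illustration).
my $html = '<a href="/RECENT.html">recent</a> <A HREF="modules/">modules</A>';
print "$_\n" for absolute_links( $html, 'http://www.cpan.org/index.html' );
```

In the real script you would pass $response->content as the HTML and $response->base as the base, just as the regex version does.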

teabag

-- Siggy Played Guitar
Sure there's more than one way, but one just needs one anyway - Teabag
