PerlMonks  

Re^4: Pulling a Page with LWP::UserAgent and fixing URLs?

by MrForsythExeter (Novice)
on Nov 09, 2004 at 15:08 UTC ( [id://406354] )


in reply to Re^3: Pulling a Page with LWP::UserAgent and fixing URLs?
in thread Pulling a Page with LWP::UserAgent and fixing URLs?

URI::URL is only used for old stuff (backward compatibility and all that), so it looks like URI is the one to use. However, are you saying I should parse out all the URLs, use URI to fix each one, and then put them back in? Or could I do a regexp with /e on the end and do it all in one line?

Re^5: Pulling a Page with LWP::UserAgent and fixing URLs?
by teabag (Pilgrim) on Nov 10, 2004 at 14:30 UTC
    I mean fixing the URLs by converting relative paths to absolute ones. Just include TokeParser as suggested and you're there.

    Something like this
    (btw, not my code; it's from http://perl.com):

    #!/usr/bin/perl
    use strict;
    use warnings;
    use LWP;
    use URI;

    my $browser = LWP::UserAgent->new;
    my $url = 'http://www.cpan.org/RECENT.html';
    my $response = $browser->get($url);
    die "Can't get $url -- ", $response->status_line
      unless $response->is_success;

    my $html = $response->content;
    while ( $html =~ m/<A HREF=\"(.*?)\"/g ) {
        print URI->new_abs( $1, $response->base ), "\n";
    }
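
    As for doing it all in one line with /e: that should work too, as far as I can tell. Below is a rough sketch, not tested against real pages. The helper name absolutize_links, the example.com base and the sample HTML are made up for illustration; in the script above you would pass $response->base as the base. It only catches double-quoted href and src values, so a proper parser is still the safer route for messy HTML.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use URI;

# Hypothetical helper (not from the perl.com code): rewrite every
# double-quoted href/src value in $html to an absolute URL resolved
# against $base. The /e flag evaluates the replacement as Perl code.
sub absolutize_links {
    my ($html, $base) = @_;
    $html =~ s{\b(href|src)\s*=\s*"([^"]*)"}
              { $1 . '="' . URI->new_abs($2, $base) . '"' }gie;
    return $html;
}

# Assumed sample input; with LWP you would use $response->content
# and $response->base instead.
my $base  = 'http://www.example.com/dir/page.html';
my $html  = '<a href="../up.html">up</a> <img src="pic.png">';
my $fixed = absolutize_links($html, $base);
print "$fixed\n";
```

    The /g flag makes the substitution hit every match and /i lets it match HREF as well as href, so the whole page is fixed in one pass instead of collecting the URLs first.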

    teabag

    -- Siggy Played Guitar
    Sure there's more than one way, but one just needs one anyway - Teabag
