PerlMonks  

Re^3: Pulling a Page with LWP::UserAgent and fixing URLs?

by teabag (Pilgrim)
on Nov 09, 2004 at 12:11 UTC (#406318)


in reply to Re^2: Pulling a Page with LWP::UserAgent and fixing URLs?
in thread Pulling a Page with LWP::UserAgent and fixing URLs?

ok then

use URI::URL;

Teabag

-- Siggy Played Guitar
Sure there's more than one way, but one just needs one anyway - Teabag

Re^4: Pulling a Page with LWP::UserAgent and fixing URLs?
by MrForsythExeter (Novice) on Nov 09, 2004 at 15:08 UTC
    URI::URL is only there for old stuff (backward compatibility and all that). Looks like URI is the one. However, using this, are you saying I should parse out all the URLs, fix them with this, and put them back in? Or could I do a regexp with /e on the end and do it all in one line?
      I mean fixing the URLs by converting relative paths to absolute ones. Just include HTML::TokeParser as suggested and you're there.

      Something like this
      (btw, not my code; it's from http://perl.com):

      #!/usr/bin/perl
      use strict;
      use warnings;
      use LWP;
      use URI;

      my $browser = LWP::UserAgent->new;
      my $url = 'http://www.cpan.org/RECENT.html';
      my $response = $browser->get($url);
      die "Can't get $url -- ", $response->status_line
          unless $response->is_success;

      my $html = $response->content;
      while( $html =~ m/<A HREF=\"(.*?)\"/g ) {
          print URI->new_abs( $1, $response->base ) ,"\n";
      }
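
As for the one-liner with /e asked about above: a substitution can rewrite the links in place instead of just printing them. This is only a rough sketch, assuming the attributes are double-quoted href or src values. The helper name absolutize_links and the sample HTML are made up for illustration and are not from the perl.com code. To stay self-contained it works on a string rather than fetching a page.

```perl
use strict;
use warnings;
use URI;

# Hypothetical helper: rewrite double-quoted href/src values in $html
# so that relative links become absolute, resolved against $base.
sub absolutize_links {
    my ($html, $base) = @_;
    # /e evaluates the replacement as Perl code, /i matches HREF or href,
    # /g applies it to every link in the string.
    $html =~ s{\b(href|src)\s*=\s*"([^"]*)"}
              {$1 . '="' . URI->new_abs($2, $base) . '"'}gie;
    return $html;
}

# Sample input (made up); in the script above, $base would be $response->base.
my $base = 'http://www.cpan.org/RECENT.html';
my $html = '<A HREF="authors/id/foo.tar.gz">foo</A> <img src="/misc/logo.png">';
print absolutize_links($html, $base), "\n";
```

A regexp like this will miss single-quoted or unquoted attributes, so a real parser such as HTML::TokeParser is the safer route for arbitrary pages.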

      teabag

      -- Siggy Played Guitar
      Sure there's more than one way, but one just needs one anyway - Teabag
