PerlMonks  

(jeffa) Re: Lack of LWP::SIMPLE information

by jeffa (Bishop)
on Jul 12, 2003 at 14:22 UTC ( [id://273626]=note )


in reply to Lack of LWP::SIMPLE information

There are (IIRC) two ways to redirect: one is to send an HTTP redirection header, and the other is to use a <meta> tag:
<meta http-equiv="Refresh" content="15; URL=redirect.html"/>
I set up two (three including the file redirected to) test files:
  1. redirect.html:
    <html>
    <meta http-equiv="Refresh" content="1; URL=redirected.html"/>
    <head>
    <title>redirect</title>
    </head>
    <body>
    you won't see me
    </body>
    </html>
  2. cgi-bin/redirect.cgi:
    #!/usr/bin/perl -Tw
    use strict;
    use CGI qw(redirect);
    print redirect('http://localhost/redirected.html');
And then i fetched both pages with LWP::Simple. Here are the results:
    [jeffa]$ perl -MLWP::Simple -le'getprint "http://localhost/cgi-bin/redirect.cgi"'
    <html>
    <head>
    <title>redirected!</title>
    </head>
    <body>
    you've been redirected!
    </body>
    </html>

    [jeffa]$ perl -MLWP::Simple -le'getprint "http://localhost/redirect.html"'
    <html>
    <meta http-equiv="Refresh" content="1; URL=redirected.html"/>
    <head>
    <title>redirect</title>
    </head>
    <body>
    you won't see me
    </body>
    </html>
So ... in semi-conclusion, it looks like LWP::Simple will transparently grab the redirected page IF the redirection was implemented with a redirection header, not a meta tag. Hope this helps.

jeffa

L-LL-L--L-LL-L--L-LL-L--
-R--R-RR-R--R-RR-R--R-RR
B--B--B--B--B--B--B--B--
H---H---H---H---H---H---
(the triplet paradiddle with high-hat)

Replies are listed 'Best First'.
Re: (jeffa) Re: Lack of LWP::SIMPLE information
by Dog and Pony (Priest) on Jul 17, 2003 at 21:19 UTC
    LWP::UserAgent (which LWP::Simple uses) follows redirects by default for GET and HEAD requests - this is a configurable behaviour, see requests_redirectable and redirect_ok in the documentation.
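
    A minimal sketch of that configuration, for anyone who wants to poke at it. The helper name report_redirects is made up for illustration and is not part of LWP; the URL is whatever you pass on the command line. With no argument the script only prints and adjusts the settings, so it makes no network request.

```perl
use strict;
use warnings;
use LWP::UserAgent;

my $ua = LWP::UserAgent->new;

# By default only GET and HEAD requests are redirected automatically.
print "redirectable: @{ $ua->requests_redirectable }\n";

# Opt in to following redirects for POST as well.
push @{ $ua->requests_redirectable }, 'POST';

# Cap the length of a redirect chain; 0 turns following off entirely.
$ua->max_redirect(3);

# Hypothetical helper: fetch a URL and report how it was redirected.
sub report_redirects {
    my ($ua, $url) = @_;
    my $res = $ua->get($url);
    # redirects() returns the intermediate responses, oldest first.
    printf "%s -> %s\n", $_->code, $_->header('Location') // ''
        for $res->redirects;
    # request() on the final response is the request that produced it.
    printf "final: %s (%s)\n", $res->request->uri, $res->status_line;
    return $res;
}

report_redirects($ua, $ARGV[0]) if @ARGV;
```

    Setting max_redirect(0) is a handy way to look at the raw 302 response and its Location header instead of the page it points to.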

    It also stands to reason that only header redirects (i.e. Status: 302 Moved) are followed. Otherwise the module would need to parse the HTML looking for meta tags. Using the proper modules, or maybe just a simple regexp, this is not too hard to implement yourself, but it doesn't belong in the core module IMO. :)
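
    The "simple regexp" route could look roughly like the sketch below. The function name meta_refresh_target is hypothetical, and the regexps assume a quoted content attribute of the usual "seconds; URL=target" shape; real-world HTML can be messier, in which case a proper HTML parser is the safer choice.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of LWP): return the target of the first
# <meta http-equiv="Refresh"> tag in $html, or undef if there is none.
sub meta_refresh_target {
    my ($html) = @_;
    while ($html =~ /<meta\b([^>]*)>/gi) {
        my $attrs = $1;
        # Only look at meta tags that declare a Refresh.
        next unless $attrs =~ /http-equiv\s*=\s*["']?refresh\b/i;
        # Pull out the quoted content attribute.
        next unless $attrs =~ /content\s*=\s*(["'])(.*?)\1/is;
        my $content = $2;
        # content looks like "1; URL=redirected.html"
        if ($content =~ /^\s*[\d.]+\s*[;,]\s*url\s*=\s*(.+?)\s*$/is) {
            my $url = $1;
            $url =~ s/^(["'])(.*)\1$/$2/;    # strip optional inner quotes
            return $url;
        }
    }
    return;
}

my $page = '<html> <meta http-equiv="Refresh" content="1; URL=redirected.html"/> </html>';
my $target = meta_refresh_target($page);
print defined $target ? "refresh to: $target\n" : "no meta refresh\n";
```

    The target is often relative, as in the test file above, so it would still need to be resolved against the URL of the page it came from (URI->new_abs($target, $base) does that) before handing it to get() for a second fetch.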

    Another way that some (thankfully few nowadays) do redirects is by scripting, such as JavaScript. This is a tougher nut to crack if needed.


    You have moved into a dark place.
    It is pitch black. You are likely to be eaten by a grue.