http://qs321.pair.com?node_id=447836


in reply to WWW::Mechanize follow meta refreshes

I've used a regex because my refresh template is fixed and very, very simple. However, if yours isn't, you should replace the regex with a call to something like HTML::TokeParser.

This is actually built into WWW::Mechanize (well, LWP...) for you, so you can do something like:-

if ($mech->response and my $refresh = $mech->response->header('Refresh')) {
    # Allow optional whitespace after the semicolon, e.g. "5; url=/next".
    my ($delay, $uri) = split /;\s*url=/i, $refresh;
    $uri ||= $mech->uri;    # No URL; reload current URL.
    sleep $delay;
    $mech->get($uri);
}

$delay should probably be validated to protect against malformed META refresh tags, and there's a whole other headache about potential loops if you hack WWW::Mechanize to follow refreshes automatically.
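One way the validation and the loop guard could look is sketched below. The names parse_refresh and follow_refreshes, the 30-second cap, and the 5-hop limit are all made up for illustration; none of them are part of WWW::Mechanize. The only Mechanize calls used are response, uri and get.

```perl
use strict;
use warnings;

# Hypothetical helper: split a Refresh value into (delay, uri).
# Returns an empty list when the delay is not a plain non-negative integer.
sub parse_refresh {
    my ($value, $max_delay) = @_;
    $max_delay = 30 unless defined $max_delay;    # arbitrary cap (assumption)
    return unless defined $value;

    my ($delay, $uri) =
        $value =~ /^\s*(\d+)\s*(?:;\s*url\s*=\s*(\S.*?))?\s*$/i
        or return;

    # Some pages quote the URL: content="5;url='/next'".
    $uri =~ s/^(['"])(.*)\1$/$2/ if defined $uri;

    $delay = $max_delay if $delay > $max_delay;
    return ($delay, $uri);
}

# Hypothetical helper: follow refreshes, but give up after $max_hops
# or as soon as a target URL comes round a second time.
sub follow_refreshes {
    my ($mech, $max_hops) = @_;
    $max_hops = 5 unless defined $max_hops;       # arbitrary limit (assumption)

    my (%seen, $hops);
    $hops = 0;
    for (1 .. $max_hops) {
        my $res = $mech->response or last;
        my ($delay, $uri) = parse_refresh(scalar $res->header('Refresh'))
            or last;
        $uri = $mech->uri unless defined $uri;    # No URL; reload current URL.
        last if $seen{"$uri"}++;                  # already been there: a loop
        sleep $delay;
        $mech->get($uri);
        $hops++;
    }
    return $hops;    # number of refreshes actually followed
}
```

Called as follow_refreshes($mech) after a get, this stops quietly on a malformed header rather than sleeping for a garbage value. Note that it also stops on a page that deliberately reloads itself, which may or may not be what you want.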

    --k.


Re^2: WWW::Mechanize follow meta refreshes
by simon.proctor (Vicar) on Apr 15, 2005 at 09:00 UTC
    The snippet I provided is from my test suite. I'll be the first to admit that it isn't great, as I've only just started hacking away with Mechanize (and wondered why I didn't start sooner ;P).

    Anyway, from a testing perspective, is it not better to follow the expected URL and not the URL in the template? It's only a minor point, but are you not then reporting a mistaken redirect but otherwise continuing as normal? I feel this is better but would welcome your comments.

    I do like the delay bit but, for my testing purposes, I would also pass that into the function. Something like:
    meta_refresh($mech, '/index.cgi?rm=home', 5);
    Or whatever :). I would also then, personally, have a default delay (of some time determined by the particular project) and simply validate the delay as being correct (for the same reasons as with the URL).
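A minimal sketch of what such a helper might look like follows. The original function body is not shown in this thread, so everything inside meta_refresh here is guesswork, and refresh_problems and $DEFAULT_DELAY are hypothetical names. It compares the URL as an exact string, which assumes the template and the test spell it the same way.

```perl
use strict;
use warnings;

# Hypothetical project-wide default delay, in seconds.
our $DEFAULT_DELAY = 5;

# Compare a Refresh value with what the test expects.
# Returns a list of problem descriptions; an empty list means it matched.
sub refresh_problems {
    my ($value, $expected_uri, $expected_delay) = @_;
    $expected_delay = $DEFAULT_DELAY unless defined $expected_delay;
    return ('no Refresh header') unless defined $value;

    my ($delay, $uri) = split /\s*;\s*url\s*=\s*/i, $value, 2;
    $delay = '' unless defined $delay;

    my @problems;
    push @problems, "delay is '$delay', expected $expected_delay"
        unless $delay =~ /^\s*(\d+)\s*$/ && $1 == $expected_delay;
    push @problems, sprintf("url is '%s', expected '%s'",
                            defined $uri ? $uri : '(none)', $expected_uri)
        unless defined $uri && $uri eq $expected_uri;
    return @problems;
}

# Report any mismatch, then carry on to the URL the test expects
# rather than the one found in the page.
sub meta_refresh {
    my ($mech, $expected_uri, $expected_delay) = @_;
    my $res   = $mech->response;
    my $value = $res ? scalar $res->header('Refresh') : undef;
    my @problems = refresh_problems($value, $expected_uri, $expected_delay);
    $mech->get($expected_uri);
    return @problems;
}
```

A test could then fail on a non-empty return value from meta_refresh($mech, '/index.cgi?rm=home', 5) while the rest of the script continues from the expected page.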

    It's funny, I only wrote this function because IIS, at the time, couldn't handle HTTP redirects and would crash (no, really). It's *fixed now* but I don't have the time to rework my app again :).