http://qs321.pair.com?node_id=863731

jfrm has asked for the wisdom of the Perl Monks concerning the following question:

Shavenheads, I run a script regularly that crawls my various websites checking for bad links. Last night my internet connection was down and, rather than just producing an error and carrying on to the next part of the script (which does a whole lot of other useful things too), the whole script crashed, viz:

Error GETing http://www.website.com/images/billiards/brass-cross-rest-head-PER-033.jpg: Can't connect to www.website.com:80 (Bad hostname 'www.website.com') at nightlychecks.pl line 2346
 at C:/Perl/lib/WWW/Mechanize.pm line 2705
    WWW::Mechanize::_die('Error ', 'GET', 'ing ', 'URI::http=SCALAR(0x93515dc)', ': ', 'Can\'t connect to www.website.com:80 (Bad hostname \'www...') called at C:/Perl/lib/WWW/Mechanize.pm line 2692
    WWW::Mechanize::die('WWW::Mechanize=HASH(0x934eda4)', 'Error ', 'GET', 'ing ', 'URI::http=SCALAR(0x93515dc)', ': ', 'Can\'t connect to www.website.com:80 (Bad hostname \'www...') called at C:/Perl/lib/WWW/Mechanize.pm line 2340
    WWW::Mechanize::_update_page('WWW::Mechanize=HASH(0x934eda4)', 'HTTP::Request=HASH(0x9643efc)', 'HTTP::Response=HASH(0x93531bc)') called at C:/Perl/lib/WWW/Mechanize.pm line 2206
    WWW::Mechanize::request('WWW::Mechanize=HASH(0x934eda4)', 'HTTP::Request=HASH(0x9643efc)') called at C:/Perl/lib/LWP/UserAgent.pm line 389
    LWP::UserAgent::get('WWW::Mechanize=HASH(0x934eda4)', 'http://www.website.com/images/billiards/brass-cross-rest...') called at C:/Perl/lib/WWW/Mechanize.pm line 407
    WWW::Mechanize::get('WWW::Mechanize=HASH(0x934eda4)', 'http://www.website.com/images/billiards/brass-cross-rest...') called at nightlychecks.pl line 2346
    main::webpage_assign_check() called at nightlychecks.pl line 560
    main::run_report('HASH(0x6fb523c)') called at nightlychecks.pl line 289
    main::nightly_checks() called at nightlychecks.pl line 86

How can I make WWW::Mechanize report an error gracefully rather than just pegging out horrifically? The WWW::Mechanize documentation doesn't enlighten me...

Re: www::mechanize - how to prevent it dying?
by Corion (Patriarch) on Oct 06, 2010 at 08:03 UTC

    See the autocheck parameter, which turns all HTTP errors into fatal errors. When you turn autocheck off, you will need to do all error checking yourself.

    my $mech = WWW::Mechanize->new( autocheck => 0 );
    my $res  = $mech->get('does.notexist.example');
    $res->is_success or print "Uhoh\n";

    Alternatively, use eval to trap fatal errors when making the original connection.

    my $mech = WWW::Mechanize->new();
    my $connected = eval {
        $mech->get('does.notexist.example');
        1
    };
    if (! $connected) {
        print "Uhoh\n";
    };

    Also see Try::Tiny, which shortens the exception dance by one step (I haven't used it myself):

    use Try::Tiny;
    my $mech = WWW::Mechanize->new();
    try {
        $mech->get('does.notexist.example');
    } catch {
        print "Uhoh: $_\n";
    };

      This was a great answer, thank you. I have switched off autocheck. But then I realised that I was already doing a check after each $mech->get() by testing $mech->status. Is testing $mech->status adequate or does checking $res->is_success do something different or better?
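On the question of $mech->status versus $res->is_success, a minimal sketch of how the checks relate, assuming autocheck is off. $mech->status gives the numeric code of the last response, while $res->is_success (and $mech->success) is true for any 2xx code, so a test of the form "status equals 200" would be slightly stricter than is_success. The file: URL below is a made-up path, chosen only so the request fails without needing a network connection.

```perl
use strict;
use warnings;
use WWW::Mechanize;

# autocheck off: a failed request returns a response object instead of dying
my $mech = WWW::Mechanize->new( autocheck => 0 );

# Hypothetical URL that should not exist; a file: URL keeps the sketch offline
my $url = 'file:///no/such/dir/no-such-file.html';
my $res = $mech->get($url);

# Three views of the same outcome
my $ok_res  = $res->is_success ? 1 : 0;   # true for any 2xx status
my $ok_mech = $mech->success   ? 1 : 0;   # same test, asked of the last response
my $code    = $mech->status;              # numeric status code of the last response

unless ($ok_res) {
    # status_line combines the numeric code and the message text
    print "Failed: $url: ", $res->status_line, "\n";
}
```

When the connection itself fails, as in the original crash, LWP still hands back a response object (with a 5xx code), so either check catches it; status_line is simply the more informative thing to log.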

Re: www::mechanize - how to prevent it dying?
by pemungkah (Priest) on Oct 06, 2010 at 23:38 UTC
    The other option is to subclass Mech, which magically turns autocheck off too; the assumption being that if you're subclassing, you know what you're doing and Mech shouldn't second-guess you on proper error handling.

    I think it shouldn't second-guess you at all, but that's my personal opinion.
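A minimal sketch of the subclassing route, assuming a WWW::Mechanize version where autocheck defaults to off for subclasses (worth confirming against the documentation of the installed version). The package name My::Mech and the file: URL are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical subclass name, used only for illustration
package My::Mech;
use parent 'WWW::Mechanize';

package main;

my $mech = My::Mech->new();   # note: no autocheck argument given
my $res  = $mech->get('file:///no/such/dir/no-such-file.html');

# If autocheck is off for subclasses, the failed get() did not die
# and execution reaches this point
my $survived = 1;
print "Request failed: ", $res->status_line, "\n" unless $res->is_success;
```

Passing autocheck => 0 explicitly does the same job and is arguably easier for the next reader to spot than an empty subclass.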

Re: www::mechanize - how to prevent it dying?
by murugu (Curate) on Oct 06, 2010 at 12:32 UTC
    Hi jfrm,

    Please use eval. Enclose your fetching code with eval and check the errors using $@.

    Regards,
    Murugesan Kandasamy
    use perl for(;;);
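A minimal sketch of the eval approach with the error text taken from $@, assuming autocheck is left at its default (on). The file: URL is a made-up path so the failure can be reproduced offline; in the nightly script it would be whatever link is being checked.

```perl
use strict;
use warnings;
use WWW::Mechanize;

my $mech = WWW::Mechanize->new();   # autocheck is on by default

# Hypothetical URL that should not exist
my $url = 'file:///no/such/dir/no-such-file.html';

my $error;
{
    local $@;
    # The trailing 1 makes eval return true only if get() did not die
    unless ( eval { $mech->get($url); 1 } ) {
        $error = $@ || 'unknown error';
    }
}

if ( defined $error ) {
    # Log the failure instead of letting it kill the whole run
    print "Could not fetch $url: $error";
}
print "Carrying on with the rest of the script\n";
```

Testing the return value of eval rather than $@ alone is a common precaution, since $@ can be clobbered before it is read.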