LWP::UserAgent Client-Warning 500 against HTTP standards?

by Discipulus (Canon)
on Sep 30, 2022 at 07:35 UTC

Hello community,

Since our halls are so quiet these days, I'm lazily inviting you to meditate on LWP::UserAgent's behaviour of returning 500 when LWP can't connect to some URL or when other failures in protocol handlers occur.

Does this break the HTTP specification? Whether or not you have ever glanced at the current RFC, you should know that all 5** status codes are server side.

The LWP documentation is very clear on this:

> There will still be a response object returned when LWP can't connect to the server specified in the URL or when other failures in protocol handlers occur. These internal responses use the standard HTTP status codes, so the responses can't be differentiated by testing the response status code alone. Error responses that LWP generates internally will have the "Client-Warning" header set to the value "Internal response". If you need to differentiate these internal responses from responses that a remote server actually generates, you need to test this header value.

In fact..

    use strict;
    use warnings;
    use LWP::UserAgent;

    my $ua = LWP::UserAgent->new();

    for my $url ( qw( https://perlmonks.org https://perlmonks.roma.it) ){
        print "\nGET $url\n";
        my $res = $ua->get( $url );
        # ..yes you can $res->status_line to have both combined
        print "code :\t", $res->code, "\n";
        print "message :\t", $res->message, "\n";
        print "Client-Warning header:\t", $res->header( "Client-Warning" ), "\n";
    }

    __END__

    GET https://perlmonks.org
    code : 200
    message : OK
    Client-Warning header:

    GET https://perlmonks.roma.it
    code : 500
    message : Can't connect to perlmonks.roma.it:443
    Client-Warning header: Internal response
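The header test described in the documentation quote could be wrapped in a small helper. This is only a sketch: `is_internal_response` is a hypothetical name, and both responses are built by hand with HTTP::Response so that the example runs without network access.

```perl
use strict;
use warnings;
use HTTP::Response;

# Hypothetical helper: true when LWP itself generated the response,
# i.e. the Client-Warning header is set to "Internal response".
sub is_internal_response {
    my ($res) = @_;
    my $warning = $res->header('Client-Warning') // '';
    return $warning eq 'Internal response' ? 1 : 0;
}

# Hand-made stand-in for what LWP returns when it cannot connect.
my $internal = HTTP::Response->new( 500, "Can't connect to perlmonks.roma.it:443" );
$internal->header( 'Client-Warning' => 'Internal response' );

# Hand-made stand-in for a 500 that a remote server really sent.
my $remote = HTTP::Response->new( 500, 'Internal Server Error' );

for my $res ( $internal, $remote ) {
    printf "%s => %s\n", $res->status_line,
        is_internal_response($res) ? 'generated by LWP' : 'sent by a server';
}
```

With a live user agent the same helper would be applied to the object returned by `$ua->get( $url )`.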

The message returned is already very clear: Can't connect.. is obviously client side. So why choose an error of the 5** class?

In the chat LanX suggested 418 I'm a teapot, which is fun and new to me, but not usable: teapots are reserved by IANA :)

The 4** class defines status codes 400-418 plus 421, 422 and 426, so there is room for something like: 419 - Can't connect

See also the other status numbers used to craft an HTTP::Response

So (and I don't want to blame the LWP authors) why did they choose to return 500, setting a header internally to disambiguate it?

What do other frameworks do? Quickly trying Mojo::UserAgent, I see it uses its own Mojo::Message::Response and does not return any status code for nonexistent URLs:

    use strict;
    use warnings;
    use Mojo::UserAgent;

    my $ua = Mojo::UserAgent->new;

    for my $url ( qw( https://perlmonks.org https://perlmonks.roma.it) ){
        print "\nGET $url\n";
        my $res = $ua->get( $url )->result;
        print "code :\t", $res->code, "\n";
        print "message :\t", $res->message, "\n";
        #print "Client-Warning header:\t", $res->header( "Client-Warning" ), "\n";
    }

    __END__

    GET https://perlmonks.org
    code : 200
    message : OK

    GET https://perlmonks.roma.it
    Can't connect: Host unknown. at testLWP500.pl line 10.

..and this error is defined in Mojo::IOLoop::Client. It seems to me a better design, but... wait, this is die behaviour! If you switch the URLs in the above code you never reach the second GET.

On the other hand, curl tells us it is unable to resolve the host:

    curl -I https://perlmonks.roma.it
    curl: (6) Could not resolve host: perlmonks.roma.it

..and it is right.
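When curl is driven from a Perl script rather than from a shell, its exit status is available in `$?` after `system`. The sketch below does not assume curl is installed: it spawns the running perl (`$^X`) as a stand-in child that exits with 6, the status curl reports above. `exit_status_of` is a hypothetical helper name.

```perl
use strict;
use warnings;

# Run a command and return its exit status (the high byte of $?).
sub exit_status_of {
    my (@cmd) = @_;
    system(@cmd);
    return -1 if $? == -1;    # the command could not be started at all
    return $? >> 8;
}

# Stand-in child process exiting with 6; with curl installed one could
# instead try: exit_status_of( 'curl', '-sI', 'https://perlmonks.roma.it' )
my $status = exit_status_of( $^X, '-e', 'exit 6' );
print "exit status: $status\n";
print "could not resolve host\n" if $status == 6;
```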

What do you think about it? What do other frameworks that I missed do?

It's 200 if you post 203, but no 204 will be accepted! :)

L*

There are no rules, there are no thumbs..
Reinvent the wheel, then learn The Wheel; may be one day you reinvent one of THE WHEELS.

Replies are listed 'Best First'.
Re: LWP::UserAgent Client-Warning 500 against HTTP standards?
by choroba (Cardinal) on Sep 30, 2022 at 08:01 UTC
    When we briefly discussed the possible alternatives, I asked "Would you like it to die directly instead?" I wasn't joking. You see, the Mojo client does it, and in fact curl does it too: if you don't display $? automatically in the prompt, check it after the call (curl is helpful enough to show us the exit status it's going to exit with: it's 6). Wrapping all calls to unstable interfaces in a try is a good habit.
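A minimal sketch of that habit, using core Perl only and a plain `eval` block as the try. `fetch_or_die` is a hypothetical stand-in for any client call that throws on connection failure (as `->result` does in the Mojo::UserAgent example of the original post); it performs no real network I/O, and `safe_fetch` is likewise an illustrative name.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a client call that dies on failure.
# No network access: only the host named in the original post "fails".
sub fetch_or_die {
    my ($url) = @_;
    die "Can't connect: Host unknown.\n" if $url =~ m{perlmonks\.roma\.it};
    return "200 OK";
}

# Wrap the unstable call so one failure does not abort the whole loop.
sub safe_fetch {
    my ($url) = @_;
    my $status;
    my $ok = eval { $status = fetch_or_die($url); 1 };
    if ( !$ok ) {
        my $err = $@ || 'unknown error';
        chomp $err;
        return "FAILED ($err)";
    }
    return $status;
}

# The failing URL comes first, yet the second GET is still reached.
for my $url ( qw( https://perlmonks.roma.it https://perlmonks.org ) ) {
    print "GET $url: ", safe_fetch($url), "\n";
}
```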

    map{substr$_->[0],$_->[1]||0,1}[\*||{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^*ARGV,3]
Re: LWP::UserAgent Client-Warning 500 against HTTP standards?
by hippo (Bishop) on Sep 30, 2022 at 08:48 UTC
    The message returned is already very clear: Can't connect.. is obviously client side. So why choose an error of the 5** class?

    The message is being returned to you by the client in the absence of any status code from the server because it cannot connect. This doesn't mean that it's a problem with the client - the server could be down or unroutable or the DNS could be screwy or whatever. The 4xx set is for operations where the client is at fault: malformed requests, unauthorized or forbidden access, unsatisfiable negotiation requests and so on.

    The HTTP spec defines response codes which the server should provide. An absence of a response at all from the server rather renders a response code irrelevant. I'd be happy for the code to be undefined in the client in such a scenario but 500 seems a reasonable compromise in the absence of anything better. Consider 500 to mean "the request did not complete due to reasons beyond our knowledge".


    🦛

      The 4xx set is for operations where the client is at fault: malformed requests, unauthorized or forbidden access, unsatisfiable negotiation requests and so on.

      Playing devil's advocate for a 4xx response: a client requesting https://valid.host/invalid/file.dne gets a 404 from the server because the file doesn't exist (so the client made a "mistake" by requesting a resource that doesn't exist); this is true even when the true error is that index.html linked to invalid/file.dne and the only "mistake" the client made was trying to download a resource linked from index.html . Likewise, requesting https://invalid.host/index.html from a server that doesn't exist is similarly the client's fault for asking for a server that doesn't exist... but in the absence of a server to tell it that the client made a mistake, it wouldn't be unreasonable for the client to admit "User, I made a mistake in trying to access invalid.host , even though you're the one who asked me to access it". I would even argue for the literal 400 BAD REQUEST response, because it "indicates that the server cannot or will not process the request due to something that is perceived to be a client error (e.g., malformed request syntax, invalid request message framing, or deceptive request routing)." -- where the client error was "asking for a server that doesn't exist (or isn't currently online)".

      But similarly, the choice of a 500 error also makes sense: normally, 500 errors are a server's response to an error that it doesn't have a better answer for (like "the underlying CGI doesn't know how to generate proper headers" or "doesn't have the proper linux permissions to execute the script"); so when the server is so dissociated that the client cannot see it, I could see the client arbitrarily deciding "this must be a 500, because the server is so confused or messed up that it refuses to respond at all".

      But honestly, I think it would be better for LWP::UserAgent to die instead, for reasons given elsewhere in this discussion. That really seems like the best and most natural choice to me.

Re: LWP::UserAgent Client-Warning 500 against HTTP standards? -- LWP::UserAgent issue 508
by Discipulus (Canon) on Oct 14, 2022 at 07:57 UTC
    Just to keep you tuned.. since Anonymous Monk pointed me to the open issue 508, I posted my idea there and it was well received.

      Mechanism already exists
          sub not500 {
              my($response, $ua, $handler) = @_;
              ...
          }
          $ua->add_handler("response_done", \&not500 );
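One way the elided handler body could be filled in, as a sketch only. It relabels responses that LWP generated itself with the hypothetical 419 code floated in the original post; 419 is not an assigned HTTP status and this is not something LWP does by itself. To stay runnable offline, the handler is also called by hand on a synthetic HTTP::Response instead of waiting for a real connection failure.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Response;

# Sketch of a response_done handler: relabel LWP's internal responses.
sub not500 {
    my ( $response, $ua, $handler ) = @_;
    my $warning = $response->header('Client-Warning') // '';
    # 419 is the hypothetical code from the original post, not a standard one.
    $response->code(419) if $warning eq 'Internal response';
    return;
}

my $ua = LWP::UserAgent->new;
$ua->add_handler( "response_done", \&not500 );

# Offline demonstration: call the handler by hand on a synthetic response.
my $fake = HTTP::Response->new( 500, "Can't connect to perlmonks.roma.it:443" );
$fake->header( 'Client-Warning' => 'Internal response' );
not500( $fake, $ua, undef );
print $fake->status_line, "\n";
```

Responses without the Client-Warning header pass through the handler unchanged, so a 500 really sent by a server keeps its code.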
Re: LWP::UserAgent Client-Warning 500 against HTTP standards? (Predates current standards)
by Anonymous Monk on Sep 30, 2022 at 09:46 UTC
    WWW::Mechanize uses autocheck ;) it's a nice default with a browser like that

    https://github.com/libwww-perl/libwww-perl/issues/258 Pseudo-500 Errors Make Debugging Harder Than It Should Be #258

    The RFC does not specify how libraries/APIs ought to be implemented. Issue 258 argues the code should be 4xx, but those are all server responses too.

    20+ years ago I learned LWP, and about the internal error, on the first day

    500 is the most famous error

      autocheck being on by default was a heinous and breaking change to WWW::Mechanize. It is the equivalent of a browser crashing every time there is a response that doesn’t match ->is_success.

        Seatbelts save lives
