PerlMonks  

Re^4: WWW::Mechnize redirect handling

by nikster (Novice)
on Nov 22, 2019 at 22:33 UTC ( [id://11109086]=note )


in reply to Re^3: WWW::Mechnize redirect handling
in thread [Solved] WWW::Mechnize redirect handling

Yes, of course, but I've just changed:

$m->max_redirect(2);

to:

$m->max_redirect(0);

On a closer look, the result is not entirely the same but also not much better (redirect loop detected):

Cache-Control: no-cache, no-store, max-age=0, must-revalidate
Date: Fri, 22 Nov 2019 22:25:09 GMT
Pragma: no-cache
Via: url
Server: servername
Vary: Accept-Encoding,Origin
Content-Encoding: gzip
Content-Language: en
Content-Length: 6336
Content-Type: text/html;charset=UTF-8
Expires: 0
Client-Date: Fri, 22 Nov 2019 22:29:13 GMT
Client-Peer: xxx.xxx.xxx.xxx:443
Client-Response-Num: 1
Client-SSL-Cert-Issuer: /certinfo
Client-SSL-Cert-Subject: /certinfo
Client-SSL-Cipher: ECDHE-RSA-AES256-GCM-SHA384
Client-SSL-Socket-Class: IO::Socket::SSL
Client-SSL-Warning: Peer certificate not verified
Client-Warning: Redirect loop detected (max_redirect = 0)
Strict-Transport-Security: max-age=15768000 ; includeSubDomains
Strict-Transport-Security: max-age=15768000
X-Content-Type-Options: nosniff
X-Frame-Options: DENY
X-XSS-Protection: 1; mode=block

Strange, isn't it?

Re^5: WWW::Mechnize redirect handling
by bliako (Monsignor) on Nov 23, 2019 at 08:39 UTC

    Change that user-agent string to something valid instead of myagent.

    When I hit facebook with that script I get a Location header item as expected.

    Regarding autocheck => 1, from the WWW::Mechanize manpage:

    Checks each request made to see if it was successful. This saves you the trouble of manually checking yourself. Any errors found are errors, not warnings.

    Setting max_redirect to 0 is good for making sure things work, and it gives you full control. But there is an easier way to do it:

    $m->max_redirect(3); # however many redirects you think you will get, or more
    my $content = $m->post($uri);
    my $ri = 0;
    foreach my $aredirect ($content->redirects()) {
        $ri++;
        print "REDIRECT $ri ******\n".($aredirect->as_string())."\nEND ****\n\n";
    }

    That is, you loop through the headers of each of the redirects encountered to get what you need and at the same time you are at your final URL to hit login.
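As a rough offline illustration of the loop above: redirects() on an HTTP::Response walks the previous() chain and returns the hops oldest first. The sketch below builds a fake two-hop chain by hand instead of hitting a real site; the example.com URLs and the redirect_locations helper are made up for illustration and are not part of the original script.

```perl
use strict;
use warnings;
use HTTP::Response;

# Hypothetical helper: return the Location header of every redirect
# that led up to the given final response, oldest first.
sub redirect_locations {
    my ($final) = @_;
    return map { scalar $_->header('Location') } $final->redirects;
}

# Build a fake redirect chain by hand (assumed URLs, no network access).
my $hop1  = HTTP::Response->new(302, 'Found', [ Location => 'https://example.com/step2' ]);
my $hop2  = HTTP::Response->new(302, 'Found', [ Location => 'https://example.com/final' ]);
my $final = HTTP::Response->new(200, 'OK');
$hop2->previous($hop1);
$final->previous($hop2);

my @locations = redirect_locations($final);
print "$_\n" for @locations;
```

With a live Mechanize object, the same helper could presumably be fed the response returned by $m->post($uri), provided max_redirect is high enough for the redirects to be followed.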

    bw, bliako

      I've inserted your code into my script.

      But the response is exactly the same as the one I've posted before.

      If I try against facebook, like you did, I get all the redirect headers.

      It must be something about the site itself; unfortunately, there is no one to ask.

      Thanks very much, though!

        For the sake of closing this thread, I have solved this.

        In the end it was not overly complex but there was just more to it than initially known.

        tl;dr: the trick was not to follow the redirect after receiving the ticket (following it already invalidated the ticket).

        Thanks for your help bliako.

        #!/usr/bin/env perl

        ### Modules
        use WWW::Mechanize;
        use HTTP::Cookies;
        use HTTP::CookieJar::LWP ();
        use IO::Socket::SSL qw();
        use Data::Dumper;
        use JSON;

        ### Variables & Declarations
        my $creds = "$ENV{'HOME'}/.credentials";
        my $uri = "https://xxx.employer.xxx/app/login?service=https://xxx.employer.xxx/app/service";
        my $cookie_jar = HTTP::Cookies->new();
        my ($username, $password) = get_credentials($creds);
        my $fields = {
            username => $username,
            password => $password,
        };
        my $m = WWW::Mechanize->new(
            cookie_jar => $cookie_jar,
            autocheck  => 0,
            ssl_opts   => {
                SSL_verify_mode => IO::Socket::SSL::SSL_VERIFY_NONE,
                verify_hostname => 0
            },
            env_proxy  => 1,
            keep_alive => 1,
            timeout    => 120,
            agent      => 'Windows IE 6'
        );
        $m->max_redirect(0);

        ############## Log in and get Ticket ##################
        my $content = $m->post($uri);
        $m->submit_form(
            form_number => 1,
            fields      => $fields,
            button      => 'submit'
        );
        my $location = $m->response()->header('Location');
        my $ticket_id = (split /ticket=/, $location)[1];
        ############## /Log in and get Ticket ##################

        ############## Create Session and get authorization id ##################
        $m->add_header('Content-Type' => 'text/plain');
        $m->add_header('Accept' => ['text/plain', 'application/json']);
        #$m->delete_header('Referer');
        my $session_url = "https://xxx.employer.xxx:<port>/session";
        my $contentp = $m->post($session_url, 'Content' => "$ticket_id");
        my $resp = $contentp->decoded_content()."\n\n";
        my $decoded_json = decode_json( $resp );
        my $id = $decoded_json->{id};
        ############## /create session and get authorization id##################

        ############## do authorized stuff #########################
        $m->add_header('Authorization' => $id);
        $m->post("whatever");
        # from here on I'm doing stuff inside the session...
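The split on /ticket=/ in the script above works as long as the ticket is the last query parameter in the Location header. A slightly more defensive variant might look like the sketch below, which stops at the next & or # and returns undef when no ticket is present. The extract_ticket helper and the sample URLs are hypothetical, not something the original script defines.

```perl
use strict;
use warnings;

# Hypothetical helper: pull the value of the "ticket" query parameter
# out of a Location URL, or return undef if there is none.
sub extract_ticket {
    my ($location) = @_;
    return undef unless defined $location;
    return $location =~ /[?&]ticket=([^&#]+)/ ? $1 : undef;
}

# Made-up Location values for illustration only.
my $with_ticket    = 'https://example.com/app/service?ticket=ST-12345-abc&lang=en';
my $without_ticket = 'https://example.com/app/login';

my $ticket = extract_ticket($with_ticket);
print "ticket: $ticket\n";
print "no ticket found\n" unless defined extract_ticket($without_ticket);
```

In the real script this would be called on $m->response()->header('Location') right after submit_form, with max_redirect(0) still in place so the redirect is not followed.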
