PerlMonks  

Using LWP to automate a login

by gw1500se (Beadle)
on Jul 29, 2007 at 18:43 UTC ( [id://629420]=perlquestion )

gw1500se has asked for the wisdom of the Perl Monks concerning the following question:

I am trying to extract some information from a page that requires authentication (I have to reverse engineer this thing). The problem is the way the authentication is done. I'm trying to use LWP::Simple (which I am beginning to suspect won't work for this), so I first do a "GET" on the login page. Looking at the JavaScript used to submit the authentication, it appears that a data string held in a hidden field is used by a script to hash the authentication information. My first problem is to extract that data string. Here is the HTML segment that I believe is relevant.
<!--@UNIQUE:bodystart@-->
<form name="postform" method="get" action="/post_login.cgi">
<input type="hidden" name="data" />
</form>
<!--@ENDUNIQUE@-->
I don't see the data string in there, but the JavaScript apparently uses it to compute the hash string (it looks like some server-side stuff is going on as well). Although this form has an action, I don't believe anything gets sent to the server, as only one page is displayed (the login page), which is then submitted via a button.
<form name="myform" action="/dummy" onsubmit="sendLogin(); return false;">
. . .
<input class="button_submit_padleft" type="button" name="Login" value="Log In" onclick="sendLogin();" />
Here is the 'onClick' javascript that submits the login from the form.
function sendLogin()
{
    // If the 'data' variable is not defined then there was probably some
    // problem with loading the page. The best guess is that the user's network
    // connection has gone down. Inform the user and try to reload the page.
    if (typeof(data) == "undefined") {
        alert ("The network connection seems to be down. Press 'Ok' to try again.");
        location.reload(true);
        return;
    }

    var a = new Array;

    // Compute the login hash.
    var shex = byteArrayToHexString(convertFromBase64(data),0,4);
    var goodp = document.myform.Password.value.substr(0,16);
    document.myform.Password.value = ""; // Make sure password never gets sent as clear text
    for (var i = goodp.length; i < 16; i++) {
        goodp = goodp.concat(String.fromCharCode(1));
    }
    var str = shex + goodp;

    // Pad the string to 64 bytes.
    for (var i = str.length; i < 63; i++) {
        str = str.concat(String.fromCharCode(1));
    }
    str = str.concat((document.myform.username.value == 'user') ? 'U' : String.fromCharCode(1));
    var hash = hex_md5(str);
    var saltHash = shex.concat(hash);
    a = convertHexString(saltHash, 20, 20);

    // Send the new configuration to the server
    sendDataToServer ("post_login.cgi?data=" + convertToBase64(a), loginReturnValue)
}
What I don't know how to do, at this point anyway, is get that data string so I can build the correct hash for authentication when I issue a "POST". I may well be missing something in the HTML code, but I'm not sure where to go from here. Can someone help me with this LWP code? Thanks.
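
For reference, the arithmetic in sendLogin() can be sketched in Perl with core modules only. This is a minimal sketch under several assumptions, because the helper functions (convertFromBase64, byteArrayToHexString, hex_md5, convertHexString, convertToBase64) are not shown in the thread: it assumes standard Base64, that byteArrayToHexString(..., 0, 4) yields the first four bytes as lowercase hex, that hex_md5 returns lowercase hex, and that the password is plain ASCII. The name login_token is hypothetical.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);
use MIME::Base64 qw(decode_base64 encode_base64);

# Hypothetical helper: mirrors the steps of sendLogin() and returns the
# Base64 string that the page appends to "post_login.cgi?data=".
sub login_token {
    my ($data, $username, $password) = @_;

    # Assumption: first 4 bytes of the decoded data string, as lowercase hex.
    my $shex = unpack 'H8', decode_base64($data);

    # At most 16 password characters, padded with chr(1) up to 16.
    my $goodp = substr($password, 0, 16);
    $goodp .= "\x01" x (16 - length($goodp));

    # Pad to 63 characters with chr(1); the 64th is 'U' only for 'user'.
    my $str = $shex . $goodp;
    $str .= "\x01" x (63 - length($str));
    $str .= $username eq 'user' ? 'U' : "\x01";

    # Salt (4 bytes) followed by the MD5 digest (16 bytes) makes 20 bytes.
    my $bytes = pack 'H*', $shex . md5_hex($str);

    return encode_base64($bytes, '');    # '' suppresses the trailing newline
}
```

If the token this produces differs from what the browser really sends, the likely culprits are the assumptions above (hex case, Base64 alphabet), so it is worth comparing against one captured login before relying on it.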

Replies are listed 'Best First'.
Re: Using LWP to automate a login
by Corion (Patriarch) on Jul 29, 2007 at 18:55 UTC

    You haven't shown any (Perl) code so it's hard to help you with that. Let me recommend WWW::Mechanize, which emulates a browser far more faithfully than LWP::Simple (and LWP::UserAgent) does. What WWW::Mechanize doesn't do is interpret JavaScript, so you will still have to do that yourself.

    The easy way to approach JavaScript is to use the Mozilla Live HTTP Headers extension for Firefox, trace what gets sent, and then replicate that with WWW::Mechanize, sending the GET or POST requests as needed. In the JS snippet you posted in particular, it seems that they use some Base64 conversion and some other crypting to make the data transfer "more secure", so your Perl code will have to make some effort to replicate that.
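
A rough sketch of what replaying the traced request with WWW::Mechanize might look like. The sub names and the URLs in the usage note are hypothetical, and the token is assumed to have been computed already by whatever code reproduces sendLogin(). Percent-encoding the token is also an assumption: whether the page escapes it is something the header trace would show.

```perl
use strict;
use warnings;

# Percent-encode everything outside the RFC 3986 unreserved set, so the
# '+', '/' and '=' of a Base64 token survive inside a query string.
sub uri_escape_token {
    my ($token) = @_;
    $token =~ s/([^A-Za-z0-9\-._~])/sprintf('%%%02X', ord($1))/ge;
    return $token;
}

# Build the URL that sendLogin() hands to sendDataToServer().
sub login_url {
    my ($base, $token) = @_;
    $base =~ s{/+\z}{};                     # tolerate a trailing slash
    return "$base/post_login.cgi?data=" . uri_escape_token($token);
}

# Fetch the login page, then replay the login request in the same session.
sub replay_login {
    my ($login_page, $base, $token) = @_;
    require WWW::Mechanize;                 # CPAN module, loaded on demand
    my $mech = WWW::Mechanize->new(autocheck => 1);   # die on HTTP errors
    $mech->get($login_page);                # picks up any session cookies
    $mech->get(login_url($base, $token));
    return $mech->content;
}
```

Usage would be along the lines of `print replay_login('http://device.example/login.html', 'http://device.example', $token);`, with both URLs replaced by whatever the trace shows.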

Re: Using LWP to automate a login
by zer (Deacon) on Jul 29, 2007 at 18:56 UTC
    This appears to be an Ajax script.

    A proxy can help you listen to what is being sent back and forth. The JavaScript probably does something with the return value, such as an XML file, that you may not have included in the code. Again, a proxy can help you get more information, especially if it has a filter feature.

        sendDataToServer ("post_login.cgi?data=" + convertToBase64(a), loginReturnValue)
        alert (loginReturnValue);
        alert (convertToBase64(a));
    }

Re: Using LWP to automate a login
by Cody Pendant (Prior) on Jul 29, 2007 at 20:38 UTC
    What you're missing is the bit of JavaScript which puts a value into that "data" field. In the HTML, it's empty, so if the login script expects a value there's got to be a script which sets document.postform.data.value = 'something'. You'll need to find that and manually add it somehow.

    WWW::Mechanize will help you do that via the update_html method.



    Nobody says perl looks like line-noise any more
    kids today don't know what line-noise IS ...
      Thanks for the replies. You did move me forward a little. There is a JavaScript file that does what you suggest. Apparently, when it is downloaded, it contains nothing more than data="some hash string". It is not a function, so it must be executed when the page loads, and thus the value of 'data' is established. So the question becomes, how do I "GET" that value?

      I don't seem to see it in the data returned by "GET". All that shows up is the JavaScript tag. Does "GET" not include the source for downloaded scripts (as opposed to embedded scripts, which do show up)?

      As an aside, the reason I didn't include any perl code is because there isn't any yet beyond a "GET" and a "PRINT" of the page. I won't write any real code until I understand what I need to do.

      Thanks again.
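
On the last question: an HTTP GET returns only the document at that URL, so a script referenced through a src attribute has to be requested separately. A minimal sketch of that step, assuming the script body really is just data="..." as described above. The sub names are hypothetical, and the regular expressions are a shortcut that a real HTML parser would handle more robustly.

```perl
use strict;
use warnings;

# Return the src attribute of every <script src="..."> tag (list context).
sub script_sources {
    my ($html) = @_;
    return $html =~ /<script\b[^>]*?\bsrc\s*=\s*["']([^"']+)["']/gi;
}

# Pull the quoted value out of a script body such as: data="...";
sub extract_data {
    my ($js) = @_;
    return $js =~ /\bdata\s*=\s*(["'])(.*?)\1/s ? $2 : undef;
}

# Fetch the login page, then each referenced script, until one sets 'data'.
sub fetch_data_string {
    my ($page_url) = @_;
    require LWP::UserAgent;                 # CPAN modules, loaded on demand
    require URI;
    my $ua   = LWP::UserAgent->new(cookie_jar => {});   # keep session cookies
    my $page = $ua->get($page_url);
    die $page->status_line unless $page->is_success;

    for my $src (script_sources($page->decoded_content)) {
        # Resolve relative script paths against the page's base URL.
        my $res = $ua->get(URI->new_abs($src, $page->base));
        next unless $res->is_success;
        my $data = extract_data($res->decoded_content);
        return $data if defined $data;
    }
    return;                                 # no script defined 'data'
}
```

Since the value acts as a salt, it may well change on every request, so it probably has to be fetched fresh, in the same session, right before each login attempt.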
