PerlMonks  

Re^3: WWW::Mechanize and fooling server for javascript

by marto (Cardinal)
on Jul 31, 2007 at 12:32 UTC


in reply to Re^2: WWW::Mechanize and fooling server for javascript
in thread WWW::Mechanize and fooling server for javascript

gw1500se,

Some routers use JavaScript to build an HTML page dynamically with document.write. As you have already been told, LWP and WWW::Mechanize do not support JavaScript. See "how to reboot adsl modem with perl?" for a similar problem, and the list of WWW::Mechanize variants that do support JavaScript.

I don't know what you mean by 'fooling server', but depending on the JavaScript in question, it is often possible to write some Perl that does the same job as the JavaScript. If all you are looking for is the IP address assigned by your ISP, perhaps you could simply use WWW::Mechanize to get http://www.whatismyip.com (or a similar service) and parse the response.
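As a rough sketch of the "get a page and parse the response" idea: the sub names extract_ipv4 and fetch_public_ip are made up for illustration, and the code assumes the service prints the address as a plain dotted quad somewhere in the page, which may not hold for every service.

```perl
use strict;
use warnings;

# Pull the first plausible dotted-quad IPv4 address out of a chunk of text.
# Returns undef (in scalar context) when nothing suitable is found.
sub extract_ipv4 {
    my ($text) = @_;
    return unless defined $text;
    while ( $text =~ /\b(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})\b/g ) {
        my @octets = ( $1, $2, $3, $4 );
        next if grep { $_ > 255 } @octets;    # skip things like 999.1.1.1
        return join '.', @octets;
    }
    return;
}

# Fetch a page with WWW::Mechanize and return the address found in it.
# The module is loaded lazily so extract_ipv4() can be used on its own.
sub fetch_public_ip {
    my ($url) = @_;
    require WWW::Mechanize;
    my $mech = WWW::Mechanize->new( autocheck => 1 );
    $mech->get($url);
    return extract_ipv4( $mech->content );
}

# Only touch the network when a URL is given on the command line.
if (@ARGV) {
    my $ip = fetch_public_ip( $ARGV[0] );
    print defined $ip ? "$ip\n" : "no address found\n";
}
```

Run it with the URL of whichever service you prefer as the only argument. If the page contains more than one address, this returns the first one, which may or may not be the one you want.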

Martin

Replies are listed 'Best First'.
Re^4: WWW::Mechanize and fooling server for javascript
by gw1500se (Beadle) on Jul 31, 2007 at 18:23 UTC
    What I meant was that I initially thought there was some kind of redirect that prevented the full page from loading if javascript was not enabled. Thus the server would serve a different page unless javascript was enabled. I was looking for a way to fool that mechanism into thinking javascript was enabled.

    I have since been convinced this is not possible, so my only alternative is to parse the javascript for the assignment I am looking for (data='some hash string'). The challenge is finding something that will let me access the javascript source. It seems that if the javascript is linked rather than embedded, LWP at least will not "GET" it. I am hoping Mech will when I try it without the format option.

      It's quite easy to see what requests your browser makes, for example with the Live HTTP Headers extension for Firefox. All you have to do then is faithfully replicate the requests made by the browser with WWW::Mechanize. In one instance, I used HTTP::Request::FromTemplate to recreate HTTP requests from templates I created from sniffer logs. Other network analysis tools, like Wireshark or Sniffer::HTTP, could also be useful in determining the difference between what your browser sends and what your script sends.
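A minimal sketch of replaying a captured request. It uses plain HTTP::Request rather than HTTP::Request::FromTemplate, and the sub name build_request and the header values are placeholders; substitute whatever your own capture shows.

```perl
use strict;
use warnings;
use HTTP::Request;

# Build a request carrying the same method, URL and headers that the
# browser was seen to send. $headers is a hash ref of header => value.
sub build_request {
    my ( $method, $url, $headers ) = @_;
    my @pairs = map { ( $_ => $headers->{$_} ) } sort keys %$headers;
    return HTTP::Request->new( $method, $url, \@pairs );
}

# Only touch the network when a URL is given on the command line.
if (@ARGV) {
    require WWW::Mechanize;
    my $mech = WWW::Mechanize->new;
    my $req  = build_request(
        GET => $ARGV[0],
        {
            'User-Agent' => 'Mozilla/5.0 (placeholder; copy from your capture)',
            'Accept'     => 'text/html,*/*',
        },
    );
    my $res = $mech->request($req);    # sent through Mech, so cookies are kept
    print $res->status_line, "\n";
}
```

Printing $req->as_string next to the sniffer output is a quick way to spot which header your script is still missing.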

        Thanks for all the replies. I guess the reason I am not making progress is that I am doing a poor job of explaining the problem. My apologies. I know what is happening, so let me try to clarify.

        When I access the login page it includes a link to a javascript file that simply does a "data=<some hash string>". The submit javascript uses that hash to encrypt the login password. Doing all that with perl is probably the easy part. The hard part is extracting that assignment from the javascript source so I can get the hash string. With both Mech and LWP it seems the embedded javascript source is there but the linked source is not.
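Once the javascript source is in a string, pulling out the assignment could look something like the sketch below. The sub name extract_js_string is made up, and it assumes the value is a simple quoted literal with no escaped quotes inside it; the real script may be written differently.

```perl
use strict;
use warnings;

# Return the string literal assigned to the JavaScript variable $name,
# e.g.  data='abc123';  or  var data = "abc123";
# Returns undef (in scalar context) when no such assignment is found.
# Escaped quotes inside the literal are not handled.
sub extract_js_string {
    my ( $js, $name ) = @_;
    if ( $js =~ /\b\Q$name\E\s*=\s*(["'])(.*?)\1/ ) {
        return $2;
    }
    return;
}

# The hash value here is invented purely for the demonstration.
my $hash = extract_js_string( q{var data='3f9a1c';}, 'data' );
print "$hash\n" if defined $hash;
```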

        After more digging I THINK I know what has to be done, so at this point I am asking for some reassurance. If I understand Mech, the first get retrieves everything, except that links are not downloaded. Instead a list of links is built. Do I need to make another call to get the links I need? If so (this is more of a browser question I think), does the server know the subsequent requests are from the same "session"? I use that word for lack of any better way to make the point, as there really is no session context here. What concerns me is that if the javascript source does not come with the initial page, the server will issue a different hash string that, when used, will not result in the correct encryption. Does a browser request links individually like Mech seems to? Perhaps the string is just a time-based thing.
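For what it's worth, a browser does fetch each linked script with its own request, and HTTP itself carries no session; a server can only tie requests together through something like cookies. WWW::Mechanize keeps a cookie jar by default, so reusing one Mech object for the follow-up requests is the closest equivalent. Whether this particular router depends on cookies, timing or something else is unknown. A sketch, with made-up sub names script_urls and fetch_page_and_scripts:

```perl
use strict;
use warnings;
use HTML::TokeParser;
use URI;

# List the absolute URLs of every <script src="..."> in an HTML page.
# Inline scripts (no src attribute) are skipped.
sub script_urls {
    my ( $html, $base ) = @_;
    my $parser = HTML::TokeParser->new( \$html );
    my @urls;
    while ( my $tag = $parser->get_tag('script') ) {
        my $src = $tag->[1]{src};
        next unless defined $src && length $src;
        push @urls, URI->new_abs( $src, $base )->as_string;
    }
    return @urls;
}

# Fetch a page, then each linked script, with the SAME Mech object, so any
# cookies set by the first response are sent along with the later requests.
sub fetch_page_and_scripts {
    my ($url) = @_;
    require WWW::Mechanize;
    my $mech = WWW::Mechanize->new( autocheck => 1 );
    $mech->get($url);
    my $html = $mech->content;
    my %source;
    for my $js_url ( script_urls( $html, $mech->base ) ) {
        $mech->get($js_url);
        $source{$js_url} = $mech->content;
        $mech->back;    # return to the login page before the next fetch
    }
    return ( $html, \%source );
}

# Only touch the network when a URL is given on the command line.
if (@ARGV) {
    my ( $html, $source ) = fetch_page_and_scripts( $ARGV[0] );
    print "$_: ", length( $source->{$_} ), " bytes\n" for sort keys %$source;
}
```

If the hash string turns out to change on every request, comparing two consecutive fetches of the script should show it quickly.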

        Thanks again.
