WWW::Mechanize and fooling server for javascript

by gw1500se (Beadle)
on Jul 30, 2007 at 23:41 UTC ( [id://629699]=perlquestion )

gw1500se has asked for the wisdom of the Perl Monks concerning the following question:

This is a followup question to node ID 629420.

I have switched my method to use 'Mech' but now have a more basic problem. Since Mech does not support javascript I do not get the whole page or the form for submitting the login. I don't need to run the javascript, but I do need its source (one of the scripts contains a unique hash for encrypting the password). Maybe I am using the wrong thing, but when I use $mech->content(format=>'text'); I get an abbreviated form with the error that javascript is not enabled.

How do I fool this thing into thinking I have javascript enabled? This was not a problem when I was using LWP and 'GET'.

use WWW::Mechanize;

my $mech = WWW::Mechanize->new();
$mech->get($URL1);
print $mech->content(format => 'text');

Thanks.

Replies are listed 'Best First'.
Re: WWW::Mechanize and fooling server for javascript
by cLive ;-) (Prior) on Jul 31, 2007 at 00:04 UTC

    Without seeing the HTML, I'll just throw out a best guess.

    My guess is that the page before uses javascript to set a form variable, so that when the form is submitted, if the dynamically generated form element is missing, it throws the Javascript error. That (or something like it) is the most likely problem.

    I suggest you install the Live HTTP Headers plugin and watch what gets sent to the server when you manually go through the system in Firefox.

      The "Javascript is not enabled" message could be generated by the server attempting browser detection, so you may need to fake the user agent.
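
A minimal sketch of faking the user agent with WWW::Mechanize, in case the server really does sniff the User-Agent header (that is only a guess here). 'Windows Mozilla' is one of the aliases the module ships with; the snippet sends no request, it only shows what would be sent.

```perl
use strict;
use warnings;
use WWW::Mechanize;

# Present a browser-like User-Agent header instead of the default
# "WWW-Mechanize/x.y" string. Nothing is fetched by this snippet.
my $mech = WWW::Mechanize->new();
$mech->agent_alias('Windows Mozilla');

# Show the header value that will be sent, and the aliases this
# installed version of the module knows about.
print "User-Agent: ", $mech->agent, "\n";
print "Known aliases: ", join(', ', $mech->known_agent_aliases), "\n";
```

If the page still reports that JavaScript is disabled after this, the message is probably static markup (for example a noscript block) rather than the result of browser detection.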

      just another cpan module author
Re: WWW::Mechanize and fooling server for javascript
by Cody Pendant (Prior) on Jul 31, 2007 at 01:21 UTC
    "Since Mech does not support javascript I do not get the whole page or the form for submitting the login."
    That doesn't make any sense at all. No, Mech doesn't support JavaScript, but neither does LWP. Unsurprisingly, since WWW::Mechanize is a subclass of LWP::UserAgent. But of course Mech gets the full page; it couldn't possibly function otherwise. print $mech->content(), without the format, will show you the source.

    If you've got the source of the page, you can use it to find the Javascript. If you can find the JavaScript you can read it and find how it sets the data field.
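
As a rough sketch of that step, the inline script bodies can be pulled out of the page source with HTML::TokeParser (part of the HTML::Parser distribution). The sample HTML and the `salt` variable are hypothetical stand-ins; in practice the input would be whatever $mech->content() returns.

```perl
use strict;
use warnings;
use HTML::TokeParser;

# Return the text of every inline <script> element in $html.
sub inline_scripts {
    my ($html) = @_;
    my $parser = HTML::TokeParser->new(\$html)
        or die "cannot parse HTML";
    my @bodies;
    while (my $tag = $parser->get_tag('script')) {
        next if defined $tag->[1]{src};    # linked script, no inline body
        my $body = $parser->get_text('/script');
        push @bodies, $body if $body =~ /\S/;
    }
    return @bodies;
}

# Hypothetical page source; normally this would be $mech->content().
my $html = <<'HTML';
<html><head>
<script src="/md5.js"></script>
<script>var salt = "abc123";</script>
</head><body><noscript>JavaScript is not enabled</noscript></body></html>
HTML

print "$_\n" for inline_scripts($html);
```

Reading the printed script text should show which form field the JavaScript fills in before the login is submitted.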

    And if you can bear to tell us what site you're trying to log into, this will get a lot easier because we'll be able to look at it as well.



    Nobody says perl looks like line-noise any more
    kids today don't know what line-noise IS ...
      I thought I replied to this last night but I don't see it. My apologies if I am missing something and this turns out to be a double post.

      Thanks for the reply. I thought 'content' returned an HTML object if I didn't use the format. I'll give that a try.

      I agree that it seems impossible to not get the whole page but since this is my first time with Mech I don't really know all it does yet.

      The bottom line of this exercise is to get ALL the javascript source, including that which comes as a link rather than embedded. That was the problem I was having with LWP: the 'GET' only gave me the source for embedded javascript.

      For a better description of what I am doing, please see the explanation in Using LWP to automate a login. I am trying to extract the assigned IP address from the ISP for a DSL line. To do that I need to log on to a D-Link EBR-2310 which will serve a status page containing that information. The trick is to authenticate to the router.
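
For the goal stated above of collecting JavaScript that arrives as a link, one possible approach is to read the src attribute of each script tag, resolve it against the page URL, and fetch it with a second request. This is a sketch only: the helper name `script_urls` is made up for illustration, and the router's actual page layout is not known here.

```perl
use strict;
use warnings;
use HTML::TokeParser;
use URI;

# Return absolute URLs for every <script src="..."> in $html,
# resolved against $base (the URL the page was fetched from).
sub script_urls {
    my ($html, $base) = @_;
    my $parser = HTML::TokeParser->new(\$html)
        or die "cannot parse HTML";
    my @urls;
    while (my $tag = $parser->get_tag('script')) {
        my $src = $tag->[1]{src};
        push @urls, URI->new_abs($src, $base)->as_string if defined $src;
    }
    return @urls;
}

# Only touch the network when a URL is given on the command line.
if (my $url = shift @ARGV) {
    require WWW::Mechanize;
    my $mech = WWW::Mechanize->new();
    $mech->get($url);
    my @urls = script_urls($mech->content, $mech->base);
    for my $js (@urls) {
        $mech->get($js);    # one extra request per linked script
        print "// ---- $js ----\n", $mech->content, "\n";
    }
}
```

Run it with the login page URL as the only argument; without an argument it defines the helper and exits.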
        gw1500se,

        Some routers use JavaScript to dynamically create an HTML page using document.write. As you have already been told, LWP and WWW::Mechanize do not support JavaScript. See "how to reboot adsl modem with perl?" for a similar problem and a list of WWW::Mechanize variants that do support JavaScript.

        I don't know what you mean by 'fooling Server', but depending on the JavaScript in question it is often possible to write some Perl that provides the same function as the JavaScript does. If all you are looking for is the IP address assigned by your ISP, perhaps you could simply use WWW::Mechanize to get http://www.whatismyip.com (or a similar service) and parse the response.
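
A sketch of the "get and parse" idea, assuming the service returns the address somewhere in the response body as a dotted quad. The helper name `first_ipv4` is invented for this example, and the response format of any particular service is not something this thread confirms.

```perl
use strict;
use warnings;

# Return the first dotted-quad IPv4 address found in $text, or undef.
sub first_ipv4 {
    my ($text) = @_;
    while ($text =~ /\b(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})\b/g) {
        my @octets = ($1, $2, $3, $4);
        # Skip candidates such as 300.1.1.1 that are not valid addresses.
        return join '.', @octets unless grep { $_ > 255 } @octets;
    }
    return;
}

# Only touch the network when a URL is given on the command line.
if (my $url = shift @ARGV) {
    require WWW::Mechanize;
    my $mech = WWW::Mechanize->new();
    $mech->get($url);
    my $ip = first_ipv4($mech->content);
    print defined $ip ? "$ip\n" : "no address found\n";
}
```

The same helper could be pointed at the router's status page once the login works, since it only depends on the address appearing in the text.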

        Martin
