
Re: Mechanize, Forms, Links, problem from Javascript?

by Cody Pendant (Prior)
on Jun 20, 2008 at 03:27 UTC

in reply to Mechanize, Forms, Links, problem from Javascript?

In a word, yes, it's a javascript issue.

This stuff here: onChange="enableIt(this,document.sipp.proyektemp); gatherInfoProp(document.sipp.fiscal, this, 'thn=','&kdprop='); getName(this,document.sipp.nmpropinsi);" is JavaScript that runs three different functions whenever something is selected in those menus.

So, figure out what's actually happening, is the usual advice.

My preferred way to do this is to use the Live HTTP Headers Add-on for Firefox. No matter what the Javascript does, sooner or later, the browser retrieves a page, from a URL, via HTTP, and once you can figure out what that URL is, you'll be able to write your script.

Nobody says perl looks like line-noise any more
kids today don't know what line-noise IS ...

Re^2: Mechanize, Forms, Links, problem from Javascript?
by goodepic (Initiate) on Jun 20, 2008 at 21:21 UTC

    Thanks guys. HTTP::Recorder looks very cool and useful, but it doesn't deal with Javascript either. It just pumps out the code that I'd already tried.

    So I tried Live HTTP Headers with Firefox. Oh man. I'm not super well versed in this stuff. I've just attached my Live HTTP Headers output below. The only thing I've been able to think of trying is to get the ../sipp2005/form_A3.php page "manually" by adding the ses_id line to the full form_A3 URL after a ?, both with the +'s and with them replaced by %20, since they appear as spaces in the page itself. That doesn't work even in my browser, already logged in to the site. Any help is GREATLY appreciated...

    There's stuff before this, but none of it, like the login, is JavaScript dependent, so I can get there fine. I can get to the site just by requesting those URLs after logging in (through Mechanize). The thn and kdprop values come from two drop-down select controls that use JavaScript, so I can't use the controls properly, but just getting the URL works fine. There's HTTP content after the top one below, but I've deleted that. The second POST is what I need and can't get to work...

      Well, the output there says that there was a POST request to with the content
      ses_id=sid&fiscal=2008&thnang=&propinsi=01&proyektemp=0905497004-110037040+-Ir.+Bambang+Erianto%2CMM++++++++&nmpinpro=-Ir.+Bambang+Erianto%2CMM++++++++&nippin=110037040&nmproyek=PUSAT-SEKRETARIAT-SNVT+PENANGANAN+MENDESAK+DAN+TANGGAP+DARURAT&proyek=0905497004&nmpropinsi=DKI+Jakarta

      Which might be the same, assuming the server accepts GET requests as well as POST requests, as this URL: g=&propinsi=01&proyektemp=0905497004-110037040+-Ir.+Bambang+Erianto%2CMM++++++++&nmpinpro=-Ir.+Bambang+Erianto%2CMM++++++++&nippin=110037040&nmproyek=PUSAT-SEKRETARIAT-SNVT+PENANGANAN+MENDESAK+DAN+TANGGAP+DARURAT&proyek=0905497004&nmpropinsi=DKI+Jakarta
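On the question of +'s versus %20: in a form-encoded body, + stands for a space and %XX is a hex-encoded byte, so the server sees the same values either way. A minimal sketch, using only core Perl, that decodes the captured body into named fields; decode_form is a hypothetical helper name, and the body string is the capture quoted in this thread with the line-wrap markers taken out:

```perl
use strict;
use warnings;

# Request body as captured by Live HTTP Headers (wrap markers removed).
my $body = 'ses_id=sid&fiscal=2008&thnang=&propinsi=01'
         . '&proyektemp=0905497004-110037040+-Ir.+Bambang+Erianto%2CMM++++++++'
         . '&nmpinpro=-Ir.+Bambang+Erianto%2CMM++++++++'
         . '&nippin=110037040'
         . '&nmproyek=PUSAT-SEKRETARIAT-SNVT+PENANGANAN+MENDESAK+DAN+TANGGAP+DARURAT'
         . '&proyek=0905497004&nmpropinsi=DKI+Jakarta';

# decode_form() is a hypothetical helper, not part of any module.
# It returns a flat list of (name, value) pairs with the encoding undone.
sub decode_form {
    my ($encoded) = @_;
    my @pairs;
    for my $pair (split /&/, $encoded) {
        my ($name, $value) = split /=/, $pair, 2;
        $value = '' unless defined $value;
        for ($name, $value) {
            tr/+/ /;                              # '+' means a space
            s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;  # %2C becomes ','
        }
        push @pairs, $name, $value;
    }
    return @pairs;
}

my %field = decode_form($body);
printf "%-11s => '%s'\n", $_, $field{$_} for sort keys %field;
```

Seen this way, the body is ten ordinary form fields, which is the kind of list a Mechanize post() call can take as its second argument.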

      Now, I've gone to that URL and ... I can't read Bahasa Indonesia, so I don't know if that's the information you want or an error message.


        Just for anyone googling a similar problem: I got this working. In the end it was just a simple POST from Mechanize. With the same $br = WWW::Mechanize->new() object, I went to the login page, logged in, followed the link to the trouble page, and manually went to the URL that the first two drop-downs direct you to. Then, to reproduce what happens when you select from the third drop-down and click one of the data table links, I did this:

        $br->post('', [
            'ses_id'     => 'sid',
            'fiscal'     => $year,
            'thnang'     => '',
            'propinsi'   => $prop,
            'proyektemp' => $satker->{'proyektemp'},
            'nmpinpro'   => $satker->{'nminpro'},
            'nippin'     => $satker->{'nippin'},
            'nmproyek'   => $satker->{'nmproyek'},
            'proyek'     => $satker->{'proyek'},
            'nmpropinsi' => $state_names{$prop},
        ]);
        my $resp = $br->content();

        All the values in the POST come from the names and values in the 3rd drop down option select input. Easy to figure out how to parse those to get the values in the HTTP above. I got those like this:

        $form = $br->form_name('sipp');
        my $input = $form->find_input('proyektemp');
        my @pt_values = $input->possible_values;
        my @pt_names = $input->value_names;

        See the HTML::Form documentation on CPAN for more on that...
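For anyone wondering what "parse those" might look like, here is a rough sketch in core Perl. It assumes, based only on the single captured POST in this thread, that each option value has the form "proyek-nippin nmpinpro" and that the option's display name is what ends up in nmproyek. The helper name split_option and the sample strings are illustrative; the real lists would come from possible_values and value_names.

```perl
use strict;
use warnings;

# Sample data shaped like the output of possible_values / value_names.
# These strings are rebuilt from the captured POST, not read from the site.
my @pt_values = ('0905497004-110037040 -Ir. Bambang Erianto,MM        ');
my @pt_names  = ('PUSAT-SEKRETARIAT-SNVT PENANGANAN MENDESAK DAN TANGGAP DARURAT');

# split_option() is a hypothetical helper. It assumes the option value is
# "<proyek>-<nippin> <nmpinpro>" and returns nothing if that does not match.
sub split_option {
    my ($value, $name) = @_;
    my ($proyek, $nippin, $nmpinpro) = $value =~ /^(\d+)-(\d+) (.*)$/s
        or return;
    return {
        proyektemp => $value,     # the whole option value is posted as-is
        proyek     => $proyek,
        nippin     => $nippin,
        nmpinpro   => $nmpinpro,  # trailing padding is kept, as in the capture
        nmproyek   => $name,      # assumption: display name of the option
    };
}

# One hash per option; options that do not match the pattern are skipped.
my @satkers = map { split_option($pt_values[$_], $pt_names[$_]) } 0 .. $#pt_values;

for my $satker (@satkers) {
    print "$_ = '$satker->{$_}'\n" for sort keys %$satker;
}
```

Each hash in @satkers then carries the per-option fields that the post() call needs; if the site's option values follow a different layout, the regex is the part to change.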

      Um, HTTP::Recorder is easier than Live HTTP Headers. You use a browser to do what you want (JS or no), and HTTP::Recorder records the conversation, which you then duplicate using Mechanize. Otherwise there really is no way without learning HTTP/CGI...
        I worked with HTTP::Recorder and went through the browser. The code it gave me back was exactly what I'd written myself in mechanize, and didn't work. Can I read up on HTTP/CGI and figure out what's going on from live/headers? I guess that's my only option now...
        From the docs:
        WWW::Mechanize can't play back Javascript actions, and HTTP::Recorder doesn't record them.
