http://qs321.pair.com?node_id=547270

mpettis has asked for the wisdom of the Perl Monks concerning the following question:

Hi,

I could not find relevant help on google or your search archives, so I am posting here.

I am scraping a page and want to submit a form on it based on a chosen radio button, which should return a different page. A Data::Dumper dump of the form is at the end of the post. I do the following AFTER my $mech object (I'm just including a form dump) returns the data following the '-------...' line (with a status of 200):

(Another update: it looks like my original 'code' snippet was stripped out by the moderators, probably because my second code segment didn't have a closing tag.)

$mech->submit_form(
    form_name => 'lmSelect',
    fields    => { selService => 'Load Manager', selEnv => 'Prod', lmRB => $_ },
    button    => 'viewSLA',
);
Note also that for lmRB => $_, $_ is one of the values in the 'menu' array in this dump at the end. I've also tried using a value from the 'value_names' array with the same result.

(Update: here is a link to the full dump):
------------------------------------------
http://www.graciegoose.com/code/sla-form-dump.txt

(Update: here is a link to the page I am parsing. Submitting anything from it will not work, since the website is internal, but it has the original content.):
------------------------------------------
http://www.graciegoose.com/code/sla-page-content.htm

Replies are listed 'Best First'.
Re: WWW::Mechanize, radio buttons, and clicking a named button
by mikeock (Hermit) on May 03, 2006 at 21:39 UTC
    You need to properly format your question with tags for a quicker response.
Re: WWW::Mechanize, radio buttons, and clicking a named button
by Kanji (Parson) on May 04, 2006 at 00:08 UTC
    I don't understand the error, as I am naming a button that is on the page, and can be seen in the dump.

    The only buttons I see in that dump are selSubmit and addService, with no mention of viewSLA anywhere, as a button or otherwise.

    If you're positive you're dumping the correct form, then the viewSLA button may be dynamically generated with JavaScript (which WWW::Mechanize doesn't handle) or not parse correctly because of invalid markup (WWW::Mechanize isn't as forgiving as some GUI browsers).

        --k.


Re: WWW::Mechanize, radio buttons, and clicking a named button
by mpettis (Beadle) on May 04, 2006 at 02:23 UTC
    Hi All,

    Sorry for the problem... my Data::Dumper output was so large that the end was truncated. I am going to see if I can find a publicly accessible webpage to put the Dumper data structure on, and link to it instead of inlining it.
      You definitely need to learn about CODE and READMORE tags.

      But someone will sort that out for you.

      Are you sure, in the meantime that you're reading the HTML correctly?

      Radio buttons form a mutually-exclusive set of options, of course. So they all have the same name, but different values. For example:

      <input type="radio" name="one_of_two_options" value="option_a">
      <input type="radio" name="one_of_two_options" value="option_b">
      So are you sure you're addressing the button correctly? I've never done radio buttons in WWW::Mech, but you obviously can't address them by name, as they all have the same name.



      ($_='kkvvttuu bbooppuuiiffss qqffssmm iibbddllffss')
      =~y~b-v~a-z~s; print

      What you've already posted is quite lengthy and rather discouraging to wade through if someone wants to help.

      Perhaps you could come up with a minimalist test case instead and post that?

      It'll make things much easier for people here to see what the problem may be, while the process of stripping out as many text fields, radio groups, etc. as you can while still getting the error may reveal the problem to you.

          --k.


Re: WWW::Mechanize, radio buttons, and clicking a named button
by Cody Pendant (Prior) on May 04, 2006 at 23:21 UTC
    I was sure I'd posted this yesterday, but I must have been distracted between preview and post.

    This works for me:

    $mech->set_fields('one_of_two_options'=>'option_b');

    Where "one_of_two_options" is the NAME of the set of radio buttons and "option_b" is the VALUE of the particular button.

    That is, despite radio buttons being buttons in one sense of the word, they're not buttons as far as Mech is concerned, they're fields.
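
A minimal sketch of the same idea, run against an inline HTML string so it needs no network access. It uses HTML::Form, the module WWW::Mechanize relies on for form handling, rather than a live $mech object; the form name "demo", the action "/choose", the "go" submit button, and the base URL are made up for illustration.

```perl
use strict;
use warnings;
use HTML::Form;

# Hypothetical form: one radio group plus a real submit button.
my $html = <<'HTML';
<form name="demo" method="post" action="/choose">
  <input type="radio" name="one_of_two_options" value="option_a" checked>
  <input type="radio" name="one_of_two_options" value="option_b">
  <input type="submit" name="go" value="Go">
</form>
HTML

# The base URL is an assumption; it is only used to resolve the action.
my ($form) = HTML::Form->parse($html, 'http://localhost/');

# Both radio inputs share a name, so they are merged into a single field.
my $radio   = $form->find_input('one_of_two_options');
my @choices = grep { defined } $radio->possible_values;

# Select a radio button by assigning its VALUE to the group's NAME.
$form->value(one_of_two_options => 'option_b');
my $chosen = $form->value('one_of_two_options');

# Clicking the submit button builds the request that would be sent.
my $request = $form->click('go');

print "field type: ", $radio->type, "\n";
print "choices:    @choices\n";
print "chosen:     $chosen\n";
print "body:       ", $request->content, "\n";
```

With a live page, the equivalent step would be the set_fields call above (or the fields argument of submit_form) on the current form.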



    ($_='kkvvttuu bbooppuuiiffss qqffssmm iibbddllffss')
    =~y~b-v~a-z~s; print
      Thanks! I'll give it a try.
Re: WWW::Mechanize, radio buttons, and clicking a named button
by mpettis (Beadle) on May 04, 2006 at 14:20 UTC
    Hi All,

    I have updated the original post, cutting out the inlined Dumper structure and replacing it with a link to the structure on a website. I had originally wrapped it in 'code' tags, but because it was too long, the closing tag was cut off, leaving the ugly thing you saw in my original post. I wasn't aware of the 'readmore' tags, and I'll figure out how to use them now, though I doubt they would have helped here: the whole thing would still have been truncated, dropping the closing tags.

    Please, if this is a bit easier to read now, consider helping me. I will continue to look for an example that would be smaller, but I am at the mercy of what I can trim down from the page I am trying to scrape.

    tia,
    matt

      While <readmore> tags are helpful, I think you would have seen more response if you had provided the minimalist test case as that makes it much easier to separate the wheat from the chaff.

      It also isn't as difficult as you seem to think: make a copy of the problem page (or code) and start hacking away bits of the copy.

      For the HTML you posted, that would have resulted in something like...

      <form name="lmSelect" method="post">
      <input type=button value="Delete Load Manager" name=delService title="Delete a Load Manager from the SLA.">
      </form>

      It really limits where the problem might be when you only have 3 lines of HTML returning the same error as the 371-line page you pointed people to. :-)

      Anyhoo, your problem:-

      WWW::Mechanize (well, HTML::Form actually) only supports "submit" and "image" type buttons for clicking, and not the "button" type buttons that are used on your form.

      The reason for this is that "button" type buttons are meant for use with JavaScript or some other client-side scripting, which WWW::Mechanize doesn't support.

      Fortunately, there's an easy workaround, which you can find details and examples of by Super Searching for "update_html".
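
A rough sketch of the kind of workaround meant here, under stated assumptions: rewrite type=button to type=submit in the HTML before the form is parsed, so the input becomes clickable. To stay self-contained it parses the three-line test case with HTML::Form directly; the base URL is made up, and the regex is one illustrative way to do the rewrite, not the only one.

```perl
use strict;
use warnings;
use HTML::Form;

# The trimmed-down test case: a single type=button input.
my $html = <<'HTML';
<form name="lmSelect" method="post">
<input type=button value="Delete Load Manager" name=delService title="Delete a Load Manager from the SLA.">
</form>
HTML

# Hypothetical base URL, needed only so a request can be built.
my $base = 'http://localhost/sla';

# Try to click the input as served and record any error instead of dying.
my ($before) = HTML::Form->parse($html, $base);
my $error_before = eval { $before->click('delService'); 1 } ? '' : $@;

# Rewrite type=button (quoted or unquoted) to type=submit, then parse again.
(my $patched = $html) =~ s/\btype=(["']?)button\1/type=${1}submit${1}/gi;
my ($after) = HTML::Form->parse($patched, $base);
my $request = $after->click('delService');

print "before: ", ($error_before ? "click failed: $error_before" : "click worked\n");
print "after:  ", $request->method, " ", $request->uri, "\n";
print "body:   ", $request->content, "\n";
```

With a live $mech object, the same substitution would be applied to a copy of $mech->content and the result handed to $mech->update_html(...) before calling submit_form. Whether the "before" click fails depends on the installed HTML::Form version treating type=button as described above.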

          --k.