PerlMonks  

help clicking radio buttons using WWW::Mechanize::Chrome

by Special_K (Monk)
on Jan 10, 2021 at 23:30 UTC ( [id://11126728]=perlquestion )

Special_K has asked for the wisdom of the Perl Monks concerning the following question:

I am trying to write a Perl script that uses WWW::Mechanize::Chrome to determine whether a particular product is in stock on a product page (https://www.tenthousand.cc/products/interval-short) by clicking the appropriate radio buttons (color, size, inseam, liner) and then reading the value of the button (either "add to bag" or "sold out"). The page does not load properly without JS; otherwise I would just use WWW::Mechanize.

Here is my script so far:


#!/usr/bin/perl -w
use strict;
use Log::Log4perl qw(:easy);
use WWW::Mechanize::Chrome;

my $url = "https://www.tenthousand.cc/products/interval-short";

Log::Log4perl->easy_init($ERROR);
my $mech = WWW::Mechanize::Chrome->new(
    headless   => 1,
    launch_exe => 'C:\Program Files (x86)\Google\Chrome\Application\chrome.exe'
);

my $post_response;
$post_response = $mech->get($url);
$post_response->is_success || die $post_response->status_line;

# step 1 - select color
$mech->click({id => 'ProductSelect-option-color-solar-8568844557'});

# step 2 - select size
$mech->click({id => 'ProductSelect-option-size-medium-8568844557'});

# step 3 - select inseam
$mech->click({id => 'ProductSelect-option-inseam-9-inch-8568844557'});

# step 4 - select liner/no liner
$mech->click({id => 'ProductSelect-option-liner-no-8568844557'});
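The script stops after the four clicks. The last step described in the question, turning the button label into an in-stock answer, might look roughly like the sketch below. The helper name `stock_status` is hypothetical, and how the label text is read from the page is deliberately left out; the function only assumes it receives the label as a string.

```perl
use strict;
use warnings;

# Hypothetical helper: map the button label to a stock status.
# Assumes the label is either "add to bag" or "sold out", as stated
# in the question; anything else is reported as 'unknown'.
sub stock_status {
    my ($label) = @_;
    return 'unknown' unless defined $label;

    my $text = lc $label;          # compare case-insensitively
    $text =~ s/\s+/ /g;            # collapse runs of whitespace
    $text =~ s/^\s+|\s+$//g;       # trim both ends

    return 'in stock' if $text eq 'add to bag';
    return 'sold out' if $text eq 'sold out';
    return 'unknown';
}
```

Keeping the comparison in a small pure function means it can be checked without launching Chrome at all.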

If I comment out the first click(), I get no output, suggesting that all buttons were recognized and successfully clicked. If I include the first click() however, I get the following error:


> ./www_mechanize_chrome_testcase.pl $VAR1 = [ { 'method' => 'Network.requestWillBeSent', 'params' => { 'loaderId' => 'EF16DAA4B81ECCB0A2A3FB76F0B0F +3B6', 'requestId' => 'EF16DAA4B81ECCB0A2A3FB76F0B0 +F3B6', 'timestamp' => '1492087.342068', 'wallTime' => '1610320199.63641', 'type' => 'Document', 'redirectResponse' => { 'connectionReused' = +> bless( do{\(my $o = 1)}, 'JSON::PP::Boolean' ), 'protocol' => 'h2', 'fromServiceWorker' +=> bless( do{\(my $o = 0)}, 'JSON::PP::Boolean' ), 'url' => 'https://pi +xel.tapad.com/idsync/ex/push/check?partner_id=2884&partner_url=https% +3A%2F%2Ftr.snapchat.com%2Fcm%2Fp%3Frand%3D1610201931907%26pnid%3D140% +26pcid%3D%24%7BTA_DEVICE_ID%7D', 'connectionId' => 80 +1, 'mimeType' => '', 'remoteIPAddress' => + '107.178.246.49', 'headers' => { 'loca +tion' => 'https://tr.snapchat.com/cm/p?rand=1610201931907&pnid=140&pc +id=f5cb7c21-5398-11eb-9bb4-ba2c2ef2346d', 'stri +ct-transport-security' => 'max-age=31536000', 'date +' => 'Sun, 10 Jan 2021 23:09:58 GMT', 'alt- +svc' => 'clear', 'via' + => '1.1 google', 'p3p' + => 'policyref="http://tapad-taptags.s3.amazonaws.com/policy/p3p.xml" +, CP="NOI DSP COR ADM PSAo PSDo OURo SAMo UNRo OTRo BUS COM NAV DEM S +TA PRE"', 'serv +er' => 'Jetty(9.4.28.v20200408)', 'set- +cookie' => 'TapAd_TS=1610320198882;Expires=Thu, 11 Mar 2021 23:09:58 +GMT;Path=/;Domain=.tapad.com;Secure;SameSite=None TapAd_DID=f5cb7c21-5398-11eb-9bb4-ba2c2ef2346d;Expires=Thu, 11 Mar 202 +1 23:09:58 GMT;Path=/;Domain=.tapad.com;Secure;SameSite=None TapAd_3WAY_SYNCS=;Expires=Thu, 11 Mar 2021 23:09:58 GMT;Path=/;Domain= +.tapad.com;Secure;SameSite=None', 'cont +ent-length' => '0' }, 'encodedDataLength' +=> 376, 'securityState' => ' +secure', 'responseTime' => '1 +610320199635.73', 'fromPrefetchCache' +=> $VAR1->[0]{'params'}{'redirectResponse'}{'fromServiceWorker'}, 'fromDiskCache' => $ +VAR1->[0]{'params'}{'redirectResponse'}{'fromServiceWorker'}, 'remotePort' => 443, 'requestHeaders' => +{ + ':scheme' => 'https', + 'referer' => 
'https://tr.snapchat.com/cm/i?pid=a8067835-6f53-4ec6-b +1d7-509ec7db92f4', + 'upgrade-insecure-requests' => '1', + 'sec-fetch-mode' => 'navigate', + 'cookie' => 'TapAd_TS=1610320198882; TapAd_DID=f5cb7c21-5398-11eb-9 +bb4-ba2c2ef2346d', + 'accept-language' => 'en-US', + ':authority' => 'pixel.tapad.com', + ':path' => '/idsync/ex/push/check?partner_id=2884&partner_url=https +%3A%2F%2Ftr.snapchat.com%2Fcm%2Fp%3Frand%3D1610201931907%26pnid%3D140 +%26pcid%3D%24%7BTA_DEVICE_ID%7D', + ':method' => 'GET', + 'sec-fetch-site' => 'cross-site', + 'accept' => 'text/html,application/xhtml+xml,application/xml;q=0.9, +image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchang +e;v=b3;q=0.9', + 'sec-fetch-dest' => 'iframe', + 'user-agent' => 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWeb +Kit/537.36 (KHTML, like Gecko) HeadlessChrome/87.0.4280.88 Safari/537 +.36', + 'accept-encoding' => 'gzip, deflate, br' +}, 'status' => 302, 'statusText' => '', 'securityDetails' => + { + 'subjectName' => '*.tapad.com', + 'cipher' => 'AES_128_GCM', + 'validTo' => 1636156800, + 'protocol' => 'TLS 1.3', + 'keyExchangeGroup' => 'X25519', + 'signedCertificateTimestampList' => [], + 'keyExchange' => '', + 'certificateId' => 0, + 'certificateTransparencyCompliance' => 'unknown', + 'sanList' => [ + '*.tapad.com', + 'tapad.com' + ], + 'issuer' => 'DigiCert SHA2 Secure Server CA', + 'validFrom' => 1601856000 + }, 'timing' => { 'sslEn +d' => -1, 'worke +rReady' => -1, 'proxy +End' => '0.359', 'conne +ctStart' => -1, 'sendS +tart' => '0.416', 'sslSt +art' => -1, 'worke +rRespondWithSettled' => -1, 'sendE +nd' => '1.285', 'reque +stTime' => '1492087.291689', 'proxy +Start' => '0.097', 'conne +ctEnd' => -1, 'worke +rStart' => -1, 'pushS +tart' => 0, 'dnsSt +art' => -1, 'pushE +nd' => 0, 'recei +veHeadersEnd' => '49.727', 'worke +rFetchStart' => -1, 'dnsEn +d' => -1 } }, 'initiator' => { 'type' => 'other' }, 'request' => { 'initialPriority' => 'VeryHig +h', 'headers' => { 'User-Agent' = +> 
'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTM +L, like Gecko) HeadlessChrome/87.0.4280.88 Safari/537.36', 'Referer' => ' +https://tr.snapchat.com/cm/i?pid=a8067835-6f53-4ec6-b1d7-509ec7db92f4 +', 'Upgrade-Insec +ure-Requests' => '1' }, 'referrerPolicy' => 'no-refer +rer-when-downgrade', 'url' => 'https://tr.snapchat +.com/cm/p?rand=1610201931907&pnid=140&pcid=f5cb7c21-5398-11eb-9bb4-ba +2c2ef2346d', 'mixedContentType' => 'none', 'method' => 'GET' }, 'hasUserGesture' => $VAR1->[0]{'params'}{'re +directResponse'}{'fromServiceWorker'}, 'documentURL' => 'https://tr.snapchat.com/cm +/p?rand=1610201931907&pnid=140&pcid=f5cb7c21-5398-11eb-9bb4-ba2c2ef23 +46d', 'frameId' => 'A04583AF0A872985D9548891D196F5 +E0' } }, { 'params' => { 'headers' => { 'cookie' => 'sc_at=v2|H4sIAAA +AAAAAAE3GwRGAMAgEwIqYuQsnot0oCVWkeL/uazFWqLwM2YdptSwfvQbMriAdZ2wKN4Pw +AV65f8UHQZX7ikAAAAA=', 'accept-language' => 'en-US', 'sec-fetch-mode' => 'navigate +', 'upgrade-insecure-requests' = +> '1', ':scheme' => 'https', 'referer' => 'https://tr.snap +chat.com/cm/i?pid=a8067835-6f53-4ec6-b1d7-509ec7db92f4', 'user-agent' => 'Mozilla/5.0 +(Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) +HeadlessChrome/87.0.4280.88 Safari/537.36', 'accept-encoding' => 'gzip, d +eflate, br', 'accept' => 'text/html,applic +ation/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apn +g,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9', 'sec-fetch-dest' => 'iframe', ':authority' => 'tr.snapchat. 
+com', ':path' => '/cm/p?rand=161020 +1931907&pnid=140&pcid=f5cb7c21-5398-11eb-9bb4-ba2c2ef2346d', ':method' => 'GET', 'sec-fetch-site' => 'cross-si +te' }, 'requestId' => 'EF16DAA4B81ECCB0A2A3FB76F0B0 +F3B6', 'frame' => {}, 'associatedCookies' => [ { 'cookie' => { 'se +cure' => $VAR1->[0]{'params'}{'redirectResponse'}{'connectionReused'} +, 'pa +th' => '/', 'ht +tpOnly' => $VAR1->[0]{'params'}{'redirectResponse'}{'fromServiceWorke +r'}, 'si +ze' => 104, 'na +me' => 'sc_at', 'se +ssion' => $VAR1->[0]{'params'}{'redirectResponse'}{'fromServiceWorker +'}, 'sa +meSite' => 'None', 'ex +pires' => '1644016199.48171', 'va +lue' => 'v2|H4sIAAAAAAAAAE3GwRGAMAgEwIqYuQsnot0oCVWkeL/uazFWqLwM2Ydpt +SwfvQbMriAdZ2wKN4PwAV65f8UHQZX7ikAAAAA=', 'do +main' => '.snapchat.com', 'pr +iority' => 'Medium' }, 'blockedReasons' +=> [] } ] }, 'method' => 'Network.requestWillBeSentExtraInfo' }, { 'params' => { 'frame' => {}, 'blockedCookies' => [], 'headers' => { 'cf-request-id' => '07902953f +a0000d28e22857000000001', 'x-shopify-generated-cart-tok +en' => '6b9d64259b09cde535dac48f68879f6b', 'set-cookie' => 'secure_custo +mer_sig=; path=/; expires=Mon, 10 Jan 2022 23:09:58 GMT; secure; Http +Only cart_currency=USD; path=/; expires=Sun, 24 Jan 2021 23:09:58 GMT; secu +re; SameSite=None cart_sig=32b470570732770b7820df9d419bfbd2; path=/; expires=Sun, 24 Jan + 2021 23:09:58 GMT; secure; HttpOnly; SameSite=None', 'cf-cache-status' => 'DYNAMIC +', 'x-liquid-rendered-at' => '20 +21-01-10T23:09:58.861847156Z', 'cache-control' => 'no-cache, + no-store', 'x-shopid' => '2517377', 'x-xss-protection' => '1; mod +e=block; report=/xss-report?source%5Baction%5D=app_liquid&source%5Bap +p%5D=Shopify&source%5Bcontroller%5D=storefront_section%2Fapp_proxy&so +urce%5Bsection%5D=storefront&source%5Buuid%5D=5a6f66a5-6c93-4bdb-9484 +-208b4ef7ecf4', 'server' => 'cloudflare', 'content-security-policy-repo +rt-only' => '; report-uri /csp-report?source%5Baction%5D=app_liquid&s 
+ource%5Bapp%5D=Shopify&source%5Bcontroller%5D=storefront_section%2Fap +p_proxy&source%5Bsection%5D=storefront&source%5Buuid%5D=5a6f66a5-6c93 +-4bdb-9484-208b4ef7ecf4', 'x-sorting-hat-shopid' => '25 +17377', 'x-dc' => 'gcp-us-central1,gc +p-us-central1,gcp-us-east1,gcp-us-east1', 'report-to' => '{"group":"net +work-errors","max_age":2592000,"endpoints":[{"url":"https://monorail- +edge.shopifycloud.com/v1/reports/nel/20190325/shopify"}]} {"group":"network-errors","max_age":2592000,"endpoints":[{"url":"https +://monorail-edge.shopifycloud.com/v1/reports/nel/20190325/shopify"}]} +', 'x-download-options' => 'noop +en', 'content-type' => 'applicatio +n/json; charset=utf-8', 'cf-ray' => '60fa11999cb9d28e +-DFW', 'content-language' => 'en', 'expect-ct' => 'max-age=60480 +0, report-uri="https://report-uri.cloudflare.com/cdn-cgi/beacon/expec +t-ct"', 'x-request-id' => '5a6f66a5-6 +c93-4bdb-9484-208b4ef7ecf4', 'content-encoding' => 'br', 'x-permitted-cross-domain-pol +icies' => 'none', 'x-sorting-hat-podid' => '86' +, 'x-content-type-options' => ' +nosniff', 'vary' => 'Accept-Encoding Accept', 'content-security-policy' => +'; report-uri /csp-report?source%5Baction%5D=app_liquid&source%5Bapp% +5D=Shopify&source%5Bcontroller%5D=storefront_section%2Fapp_proxy&sour +ce%5Bsection%5D=storefront&source%5Buuid%5D=5a6f66a5-6c93-4bdb-9484-2 +08b4ef7ecf4', 'nel' => '{"report_to":"netwo +rk-errors","max_age":2592000,"success_fraction":0.0001} {"report_to":"network-errors","max_age":2592000,"success_fraction":0.0 +001}', 'x-shardid' => '86', 'x-shopify-stage' => 'product +ion', 'strict-transport-security' = +> 'max-age=7889238', 'date' => 'Sun, 10 Jan 2021 2 +3:09:58 GMT' }, 'requestId' => '21392.213' }, 'method' => 'Network.responseReceivedExtraInfo' }, { 'params' => { 'frame' => {}, 'headersText' => 'HTTP/1.1 200 OK Access-Control-Allow-Credentials: true Access-Control-Allow-Headers: Access-Control-Allow-Methods: GET Access-Control-Allow-Origin: * Access-Control-Max-Age: 86400 Allow: 
OPTIONS, GET Cache-Control: max-age=7200 Content-Type: application/json; charset=utf-8 Date: Sun, 10 Jan 2021 23:09:58 GMT Server: nginx Vary: Cookie Content-Length: 46 Connection: keep-alive ', 'blockedCookies' => [], 'headers' => { 'Access-Control-Allow-Origin' + => '*', 'Content-Type' => 'applicatio +n/json; charset=utf-8', 'Access-Control-Max-Age' => ' +86400', 'Vary' => 'Cookie', 'Date' => 'Sun, 10 Jan 2021 2 +3:09:58 GMT', 'Cache-Control' => 'max-age=7 +200', 'Connection' => 'keep-alive', 'Server' => 'nginx', 'Allow' => 'OPTIONS, GET', 'Content-Length' => '46', 'Access-Control-Allow-Credent +ials' => 'true', 'Access-Control-Allow-Methods +' => 'GET', 'Access-Control-Allow-Headers +' => '' }, 'requestId' => '21392.241' }, 'method' => 'Network.responseReceivedExtraInfo' }, { 'method' => 'DOM.attributeModified', 'params' => { 'frame' => {}, 'value' => 'product-single__add-to-cart', 'nodeId' => 2990, 'name' => 'class' } }, { 'method' => 'DOM.childNodeCountUpdated', 'params' => { 'nodeId' => 2999, 'childNodeCount' => 1, 'frame' => {} } }, { 'method' => 'DOM.attributeModified', 'params' => { 'frame' => {}, 'value' => 'false', 'name' => 'data-available', 'nodeId' => 2325 } }, { 'params' => { 'previousNodeId' => 2307, 'parentNodeId' => 2275, 'node' => { 'attributes' => [], 'childNodeCount' => 0, 'backendNodeId' => 855, 'nodeType' => 1, 'localName' => 'div', 'nodeId' => 3011, 'nodeValue' => '', 'nodeName' => 'DIV' }, 'frame' => {} }, 'method' => 'DOM.childNodeInserted' }, { 'method' => 'DOM.childNodeRemoved', 'params' => { 'nodeId' => 3011, 'parentNodeId' => 2275, 'frame' => {} } }, { 'params' => { 'url' => 'https://www.tenthousand.cc/product +s/interval-short-no-liner?variant=32441933168727', 'frameId' => 'F7C8298C7F980A1FF5BAA43B19990B +0D' }, 'method' => 'Page.navigatedWithinDocument' } ]; Chrome behaviour problem: Didn't see a 'Network.responseReceived' even +t for frameId A04583AF0A872985D9548891D196F5E0, requestId EF16DAA4B81 +ECCB0A2A3FB76F0B0F3B6, cannot 
synthesize response. I saw Network.requestWillBeSent,Network.requestWillBeSentExtraInfo at /usr/local/share/perl5/site_perl/5.30/WWW/Mechanize/Chrome.pm line 2596.

How do I go about debugging this? I tried the following as a start:


my $response = $mech->click({id => 'ProductSelect-option-color-solar-8568844557'});
if ($response->is_success) {
    printf("click succeeded\n");
}

But that returns:


> ./www_mechanize_chrome_testcase.pl
Can't call method "is_success" on unblessed reference at ./www_mechanize_chrome_testcase.pl line 23.

Based on the documentation I thought click() returned an HTTP::Response object (the URL formatting doesn't seem to allow me to link directly to the click bookmark):

WWW::Mechanize::Chrome

Replies are listed 'Best First'.
Re: help clicking radio buttons using WWW::Mechanize::Chrome
by Corion (Patriarch) on Jan 11, 2021 at 07:13 UTC

    The error message is:

    Chrome behaviour problem: Didn't see a 'Network.responseReceived' event for frameId A04583AF0A872985D9548891D196F5E0, requestId EF16DAA4B81ECCB0A2A3FB76F0B0F3B6, cannot synthesize response. I saw Network.requestWillBeSent,Network.requestWillBeSentExtraInfo

    WWW::Mechanize::Chrome decided that the browser wants to send an HTTP request, but then did not find the expected responseReceived event.

    This seems to be a bug/unexpected behaviour by Chrome, as the log shows:

    'params' => {
        'url' => 'https://www.tenthousand.cc/products/interval-short-no-liner?variant=32441933168727',
        'frameId' => 'F7C8298C7F980A1FF5BAA43B19990B0D'
    },
    'method' => 'Page.navigatedWithinDocument'
    }

    But this event is for a different frame. I can't do it at this time, but as you posted code, maybe I can reproduce the problem using your code and suggest a fix.

Re: help clicking radio buttons using WWW::Mechanize::Chrome
by Corion (Patriarch) on Jan 11, 2021 at 16:59 UTC

    I can't reproduce the problem. I'm using WWW::Mechanize::Chrome v0.65, Chromium v87, and the following program:

    #!/usr/bin/perl -w
    use strict;
    use 5.012;
    use Log::Log4perl qw(:easy);
    use WWW::Mechanize::Chrome;

    my $url = "https://www.tenthousand.cc/products/interval-short";

    Log::Log4perl->easy_init($ERROR);
    my $mech = WWW::Mechanize::Chrome->new(
        #headless => 0,
        # headless => 1,
        # launch_exe => 'C:\Program Files (x86)\Google\Chrome\Application\chrome.exe'
    );

    my $post_response;
    $post_response = $mech->get($url);
    $post_response->is_success || die $post_response->status_line;
    $mech->sleep(5);

    # step 1 - select color
    say "Color";
    $mech->click({id => 'ProductSelect-option-color-solar-8568844557'});

    # step 2 - select size
    say "Size";
    $mech->click({id => 'ProductSelect-option-size-medium-8568844557'});

    # step 3 - select inseam
    say "Inseam";
    $mech->click({id => 'ProductSelect-option-inseam-9-inch-8568844557'});

    # step 4 - select liner/no liner
    say "Liner";
    $mech->click({id => 'ProductSelect-option-liner-no-8568844557'});

    say "Done";
    $mech->sleep(5);

    This program runs through to Done, waits 5 seconds, and then finishes. It seems to me that all elements on the website get selected.

    Maybe you are using an older version of WWW::Mechanize::Chrome?

      After further testing I verified the following:

      1. I am using WWW-Mechanize-Chrome-0.65
      2. I am using Chrome 87.0.4280.141 (Official Build) (64-bit)
      3. Adding the $mech->sleep(5); after the $post_response->is_success || die $post_response->status_line; as you have done above fixed my problem. Is this sleep() call always necessary after a get() due to the following (taken from the documentation):

      Note that the returned HTTP::Response object gets the response body filled in lazily, so you might have to wait a moment to get the response body from the result. This is a premature optimization and later releases of WWW::Mechanize::Chrome are planned to fetch the response body immediately when accessing the response body.
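A fixed sleep(5) works here, but it waits the full five seconds even when the page is ready sooner, and it may still be too short on a slow connection. One possible alternative is to poll for a condition with a timeout. The sketch below is only an illustration: `wait_for` and its `timeout`, `interval` and `sleeper` options are hypothetical names, not part of WWW::Mechanize::Chrome.

```perl
use strict;
use warnings;
use Time::HiRes qw(sleep time);

# Hypothetical helper: call $check repeatedly until it returns a true
# value or $opt{timeout} seconds have passed.
# Returns the true value, or undef (in scalar context) on timeout.
sub wait_for {
    my ($check, %opt) = @_;
    my $timeout  = $opt{timeout}  // 10;      # total seconds to wait
    my $interval = $opt{interval} // 0.25;    # pause between attempts
    my $sleeper  = $opt{sleeper}  // \&Time::HiRes::sleep;

    my $deadline = time() + $timeout;
    while (1) {
        my $found = $check->();
        return $found if $found;
        return if time() >= $deadline;
        $sleeper->($interval);
    }
}

# Assumed usage with the $mech object from this thread (not run here;
# it relies on selector() returning the list of matching nodes):
#
#   wait_for(
#       sub { my @n = $mech->selector('#ProductSelect-option-color-solar-8568844557'); scalar @n },
#       timeout => 15,
#       sleeper => sub { $mech->sleep($_[0]) },
#   ) or die "color option never appeared";
```

Passing `$mech->sleep` as the sleeper, as in the commented usage, keeps the pause on the same sleep call used in the program above instead of a blocking sleep.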

      Finally, I noticed that if I change this line:

      $mech->click({id => 'ProductSelect-option-color-solar-8568844557'});

      to the following:

      my $response = $mech->click({id => 'ProductSelect-option-color-solar-8568844557'});
      $mech->sleep(5);
      print Dumper($response);
      if ($response->is_success) {
          printf("click succeeded\n");
      }

      the output is:

      > ./www_mechanize_chrome_testcase.pl
      $VAR1 = [];
      Can't call method "is_success" on unblessed reference at ./www_mechanize_chrome_testcase.pl line 27.

      Is that expected? If $mech->click() returns an HTTP::Response object, shouldn't I be able to call the is_success method?

        Ouch - yes, that is an inconsistency/bug in the behaviour of ->click(). When the ->click() method results in external HTTP requests, the returned value is an HTTP::Response object. When the result is just some internal JavaScript code, the result is an arrayref of the triggered events. This is wrong and I'll make it so that ->click() always returns an HTTP::Response object, as documented.
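Until click() returns an HTTP::Response consistently, calling code could check what it received before calling is_success. The following is a minimal sketch under that assumption; `click_result_kind` and `click_ok` are hypothetical helper names, and treating an arrayref of events as success is a choice made for this example, not documented behaviour.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed);

# Classify whatever click() handed back:
#   'response' - an object that has is_success (e.g. HTTP::Response)
#   'events'   - a plain array reference of triggered events
#   'unknown'  - anything else
sub click_result_kind {
    my ($result) = @_;
    return 'response' if blessed($result) && $result->can('is_success');
    return 'events'   if ref($result) eq 'ARRAY';
    return 'unknown';
}

# Hypothetical helper: false only when click() returned a failed response
# or something unrecognised.
sub click_ok {
    my ($result) = @_;
    my $kind = click_result_kind($result);
    return $result->is_success ? 1 : 0 if $kind eq 'response';
    return 1 if $kind eq 'events';    # in-page JavaScript only, no HTTP request
    return 0;
}

# Assumed usage with the $mech object from this thread (not run here):
#
#   my $result = $mech->click({id => 'ProductSelect-option-color-solar-8568844557'});
#   print "click succeeded\n" if click_ok($result);
```

With the `$VAR1 = []` value shown in the outputs above, `click_ok` returns true instead of dying on an unblessed reference.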

Re: help clicking radio buttons using WWW::Mechanize::Chrome
by shmem (Chancellor) on Jan 10, 2021 at 23:46 UTC
    How do I go about debugging this?
    my $response = $mech->click({id => 'ProductSelect-option-color-solar-8568844557'});

    What is in $response? I'd dump that and go on from there.

    perl -le'print map{pack c,($-++?1:13)+ord}split//,ESEL'

      It's apparently empty. If I modify my original code to include the Dumper call as follows:

      my $response = $mech->click({id => 'ProductSelect-option-color-solar-8568844557'});
      print Dumper($response);

      The output is:

      > ./www_mechanize_chrome_testcase.pl
      $VAR1 = [];
      Can't call method "is_success" on unblessed reference at ./www_mechanize_chrome_testcase.pl line 26.