http://qs321.pair.com?node_id=11117047

vskatusa has asked for the wisdom of the Perl Monks concerning the following question:

Here is my code
use LWP::UserAgent;
use feature 'say';

my $todaysQuoteStr = ' data-reactid="49"><span class="Trsdu(0.3s) Fw(b) Fz(36px) Mb(-4px) D(ib)" data-reactid="50">';
$todaysQuoteStr =~ quotemeta($todaysQuoteStr); ## I believe this escapes all characters that need to be escaped
my $url = 'https://finance.yahoo.com/quote/AAPL?p=AAPL&.tsrc=fin-srch';
my $ua = LWP::UserAgent->new();
my $req = HTTP::Request->new(GET => $url);
$ua->agent('Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.163 Safari/537.36');
my $response = $ua->request($req);
#die $response->code if ! $response->is_success;
$string = $response->decoded_content;
say "\t\t\t=>searching for todays quote pattern [$todaysQuoteStr]";
if ($string =~ m/$todaysQuoteStr/) {
    my $quoteStr = substr $string, $position, 20;
    my @quotes = split(/<\/span>/,$quoteStr);
    $price = $quotes[0];
    say "\t\t\t[price = $price]";
}
else {
    say "\t\t\t pattern not found";
}
Interestingly, when I view the HTML source in the browser and search for the pattern, I do get a match, but programmatically it does not work. Obviously I am doing something wrong, but I am unable to troubleshoot it. Any help would be much appreciated.

Re: LWP::UserAgent & match
by haukex (Archbishop) on May 21, 2020 at 16:56 UTC
Re:LWP::UserAgent & match
by marto (Cardinal) on May 21, 2020 at 17:23 UTC

    Enjoy making life difficult for yourself huh? We've been here before.

    #!/usr/bin/perl
    use strict;
    use warnings;
    use Mojo::UserAgent;
    use feature 'say';

    my $url = 'https://finance.yahoo.com/quote/AAPL?p=AAPL&.tsrc=fin-srch';
    my $ua = Mojo::UserAgent->new;
    say $ua->get( $url )->res->dom->at('span.[class^=Trsdu]')->attr->{'data-reactid'};

    Output:

    31

    Down since last month I see. The lesson is to learn from previous working solutions you've been given.

Re: LWP::UserAgent & match
by choroba (Cardinal) on May 21, 2020 at 15:48 UTC
    LWP::UserAgent doesn't run JavaScript in the page as the browser does. Either switch to WWW::Mechanize::Chrome or try to find out what the JavaScript code does behind the scenes and emit the same requests using the original module.

    map{substr$_->[0],$_->[1]||0,1}[\*||{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^*ARGV,3]
Re: LWP::UserAgent & match
by hippo (Bishop) on May 21, 2020 at 16:00 UTC
    $todaysQuoteStr =~ quotemeta($todaysQuoteStr); ## I believe this escapes all characters that need to be escaped

    Not quite, no. The right hand side uses quotemeta to do that but then you don't assign the result to anything. This will show the difference that an assignment makes:

    #!/usr/bin/env perl
    use strict;
    use warnings;

    my $todaysQuoteStr = ' data-reactid="49"><span class="Trsdu(0.3s) Fw(b) Fz(36px) Mb(-4px) D(ib)" data-reactid="50">';
    $todaysQuoteStr =~ quotemeta($todaysQuoteStr);
    print "Bad: $todaysQuoteStr\n";
    $todaysQuoteStr = quotemeta($todaysQuoteStr);
    print "Good: $todaysQuoteStr\n";
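
As a side note, the same quoting can also be done at match time with \Q...\E, which leaves the variable itself readable for printing. A minimal sketch, assuming a shortened, made-up marker and HTML snippet (neither is taken from the real page):

```perl
use strict;
use warnings;

# Hypothetical sample: a marker containing regex metacharacters.
my $marker = 'class="Trsdu(0.3s) Fw(b)"';
my $html   = '<span class="Trsdu(0.3s) Fw(b)">example</span>';

# Unquoted, "(0.3s)" and "(b)" are treated as capture groups, so the
# literal parentheses in $html are never matched.
my $raw_hit = $html =~ /$marker/ ? 1 : 0;

# \Q...\E applies quotemeta at interpolation time; $marker is unchanged.
my $quoted_hit = $html =~ /\Q$marker\E/ ? 1 : 0;

print "unquoted: $raw_hit, quoted: $quoted_hit\n";
print "marker is still: $marker\n";
```

With this sample the unquoted match fails and the quoted one succeeds, which is the same effect as assigning the result of quotemeta before matching.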
      Thanks hippo. I made the change, but the match still fails. BTW, $string does contain the string I am matching with $todaysQuoteStr. My revised code:
      use strict;
      use warnings;
      use LWP::UserAgent;
      use feature 'say';

      my ($position,$price);
      my $todaysQuoteStr = 'data-reactid="49"><span class="Trsdu(0.3s) Fw(b) Fz(36px) Mb(-4px) D(ib)" data-reactid="50">';
      $todaysQuoteStr = quotemeta($todaysQuoteStr);
      my $url = 'https://finance.yahoo.com/quote/AAPL?p=AAPL&.tsrc=fin-srch';
      my $ua = LWP::UserAgent->new();
      my $req = HTTP::Request->new(GET => $url);
      $ua->agent('Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.163 Safari/537.36');
      my $response = $ua->request($req);
      my $string = $response->decoded_content;
      #say "string = $string";
      say "\t\t\t=>searching for todays quote pattern [$todaysQuoteStr]";
      $position = 0;
      if ($string =~ m/$todaysQuoteStr/) {
          my $quoteStr = substr $string, $position, 20;
          say "quoteStr = $quoteStr";
          my @quotes = split(/<\/span>/,$quoteStr);
          $price = $quotes[0];
          say "\t\t\t[price = $price]";
      }
      else {
          say "\t\t\t pattern not found";
      }
        worked - code below
        use strict;
        use warnings;
        use LWP::UserAgent;
        use feature 'say';

        my ($price);
        my $todaysQuoteStr = 'data-reactid="49"><span class="Trsdu(0.3s) Fw(b) Fz(36px) Mb(-4px) D(ib)" data-reactid="50">';
        $todaysQuoteStr = quotemeta($todaysQuoteStr);
        my $url = 'https://finance.yahoo.com/quote/AAPL?p=AAPL&.tsrc=fin-srch';
        my $ua = LWP::UserAgent->new();
        my $req = HTTP::Request->new(GET => $url);
        $ua->agent('Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.163 Safari/537.36');
        my $response = $ua->request($req);
        my $string = $response->decoded_content;
        #say "string = $string";
        say "\t\t\t=>searching for todays quote pattern [$todaysQuoteStr]";
        my ($counter,$position) = (0,'');
        while ($string =~ m/$todaysQuoteStr/g) {
            $counter++;
            $position = pos($string);
            if ($counter == 1) {last;}
            #
        }
        if (!(defined($position))) {
            $position = '';
        }
        if ($position eq '') {
            say "\t\t\t\t<<pattern not found!>> ";
        }
        else {
            my $quoteStr = substr $string, $position, 20;
            say "quoteStr = $quoteStr";
            my $endStr = '</span>';
            $endStr = quotemeta($endStr);
            my @quotes = split(/$endStr/,$quoteStr);
            $price = $quotes[0];
            say "\t\t\t[price = $price]";
        }
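
For comparison, the pos/substr/split sequence in the working version above could probably be collapsed into a single match with a capture group. This is only a sketch under stated assumptions: extract_after_marker is a hypothetical helper name, the marker is shortened, and the sample HTML and the value 123.45 are made up rather than fetched from the page.

```perl
use strict;
use warnings;

# Return the text between a literal marker and the next "</span>",
# or undef when the marker is absent. (Hypothetical helper.)
sub extract_after_marker {
    my ($html, $marker) = @_;
    return $html =~ m{\Q$marker\E([^<]*)</span>} ? $1 : undef;
}

# Made-up sample; not real page content or a real quote.
my $marker = 'Mb(-4px) D(ib)" data-reactid="50">';
my $sample = '<span class="Trsdu(0.3s) Fw(b) Fz(36px) Mb(-4px) D(ib)" data-reactid="50">123.45</span>';

my $price = extract_after_marker($sample, $marker);
print defined $price ? "price = $price\n" : "pattern not found\n";
```

The capture takes everything up to the next "<", so there is no fixed 20-character window to tune, and an undef return covers the "pattern not found" branch.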