PerlMonks  

searching with mojo dom

by adambot (Acolyte)
on Mar 11, 2018 at 04:04 UTC ( [id://1210642]=perlquestion )

adambot has asked for the wisdom of the Perl Monks concerning the following question:

I have a selector that has a tendency to change slightly, e.g.: #APP > div > div.hero-PwoFv7gG > table > tbody > tr:nth-child(2) > td > span > span

The div.hero-something part is what changes.

The code I have is: my $data = $dom->at('#APP > div > div.hero-PwoFv7gG > table > tbody > tr:nth-child(2) > td > span > span')->text;

However, this keeps breaking when the hero-something changes. Would someone be able to point me in the right direction as to how I could make my code more robust? I've tried wildcards in the at() selector, but that doesn't work...

Replies are listed 'Best First'.
Re: searching with mojo dom
by marto (Cardinal) on Mar 11, 2018 at 08:23 UTC

    If the hero div is always at the same position, can't you just use the nth-child selector, in the same way you do for the table row? Failing that, a short example of the HTML would help in providing a solution.
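    Another option for the changing hero-something class is a CSS attribute selector that matches only the stable prefix. This is a minimal sketch, assuming the div's class attribute begins with "hero-"; the inline HTML is made up for illustration and is not taken from the real page.

```perl
use strict;
use warnings;
use feature 'say';
use Mojo::DOM;

# Hypothetical markup standing in for the real page
my $html = <<'HTML';
<div id="APP"><div><div class="hero-PwoFv7gG">
  <table><tbody>
    <tr><td><span><span>first row</span></span></td></tr>
    <tr><td><span><span>second row</span></span></td></tr>
  </tbody></table>
</div></div></div>
HTML

my $dom = Mojo::DOM->new($html);

# [class^="hero-"] matches any class attribute whose value starts with "hero-",
# so the generated suffix no longer has to be hard-coded
my $el = $dom->at('#APP div[class^="hero-"] tr:nth-child(2) td > span > span');

# at() returns undef when nothing matches, so check before calling text
my $data = $el ? $el->text : undef;
say $data // 'not found';
```

    If the hero- class is not the first one in the class attribute, [class*="hero-"] (substring match) may be the better fit, at the cost of being less strict.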

      Here is my final code that works. If anyone has a better way to do it, please share.

      my $dom = Mojo::DOM->new($source);
      my $sidecarsource = $dom->find('div.today_nowcard-sidecar')->first(qr/Right Now/)->content;
      #print Data::Dumper::Dumper($sidecarsource);
      my $sidecardom = Mojo::DOM->new($sidecarsource);
      my $wind = $sidecardom->at('table > tbody > tr:nth-child(1) > td > span')->text;

        You might find this cleaner:

        #!/usr/bin/perl
        use strict;
        use warnings;
        use feature 'say';
        use Mojo::UserAgent;

        my $url = 'https://weather.com/weather/today/l/20001:4:US';
        my $selector = 'div.today_nowcard-sidecar.component.panel table tr td span';
        my $ua = Mojo::UserAgent->new;
        say $ua->get( $url )->res->dom->at( $selector )->all_text;

        However, you may want to check their recommended API. If it offers what you want, it'll be faster to access and your code will be more resilient to changes.

      The HTML I'm working with can be found here: https://weather.com/weather/today/l/20001:4:US

      the full/unedited selector is:

      #APP > div > div.today.section-local-suite.page > div.section-page-name > div.hero.hero-background.layout-centered > div.hero-flex.styles-d2KKDEYo__heroFlex__3UOm0 > div.region.region-hero-left > div > section > div.today_nowcard-sidecar.component.panel > table > tbody > tr:nth-child(1) > td > span
      (the wind speed text)
        I'm thinking that if I could figure out how to do a find on the div, that would be easiest -- I have my @headers = $dom->find('div ~ today_nowcard-sidecar')->map('text')->each; but it doesn't seem to find anything...
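        A likely reason that find comes back empty: in CSS, ~ is the general sibling combinator, and a bare today_nowcard-sidecar is read as a tag name rather than a class. A class needs a leading dot, as in div.today_nowcard-sidecar. Here is a small sketch with made-up markup; the HTML and the cell values are placeholders, not the real page.

```perl
use strict;
use warnings;
use feature 'say';
use Mojo::DOM;

# Hypothetical markup; the real page is far larger
my $html = <<'HTML';
<section>
  <div class="today_nowcard-sidecar component panel">
    <table><tbody>
      <tr><th>Wind</th><td><span>placeholder wind</span></td></tr>
      <tr><th>Humidity</th><td><span>placeholder humidity</span></td></tr>
    </tbody></table>
  </div>
</section>
HTML

my $dom = Mojo::DOM->new($html);

# Looks for a <today_nowcard-sidecar> element that is a later sibling of a div
my @miss = $dom->find('div ~ today_nowcard-sidecar')->each;

# Looks for spans inside table cells of the div carrying that class
my @values = $dom->find('div.today_nowcard-sidecar td > span')->map('text')->each;

say 'sibling selector matches: ', scalar @miss;
say for @values;
```

        Anchoring on the class name alone also avoids the long chain of generated parent classes, which is presumably the part most likely to change.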
