PerlMonks  

Re^4: running an example script with WWW::Mechanize* module

by marto (Cardinal)
on Apr 30, 2020 at 10:31 UTC (#11116268)


in reply to Re^3: running an example script with WWW::Mechanize* module
in thread running an example script with WWW::Mechanize* module

"Let me tell you why I would like to shift away from IMDB. It's that when I save the dom file to disc, it's 2.2 megs, which is nowhere near what I can lay eyes on and understand. Machines get it with the help of javascript, but I am only intermediate at best in my understanding of any of the matters I am writing about now."

One of the nice things about Mojo::DOM is the support for CSS Selectors (see the Mojo docs section Learning Web Technologies). You don't have to figure these out for yourself: you can use your browser's 'developer tools' GUI to click on things and copy their CSS selector/path. Searching for a tutorial for whatever browser you use should produce many videos/tutorials demoing this sort of thing. The copied selectors aren't always optimal; looking at the HTML source can often point to much shorter ones. Mojo::UserAgent makes it fairly simple to send data to web interfaces, and the return object contains the resulting DOM (->res->dom above), which you can then use to display/capture whatever data you like. Give it a shot and let me know if you have any problems.
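To make the selector point concrete, here is a minimal sketch that runs offline. The HTML snippet, the class names and both selectors are made up for illustration (they are not taken from IMDB or any real site). It compares a long path of the kind developer tools tend to copy with a shorter selector found by reading the source.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use 5.016;
use Mojo::DOM;

# Hypothetical markup standing in for a fetched page.
my $html = <<'HTML';
<html>
  <body>
    <div id="results">
      <ul class="stars">
        <li><span class="name">Sirius</span> <span class="mag">-1.46</span></li>
        <li><span class="name">Vega</span> <span class="mag">0.03</span></li>
      </ul>
    </div>
  </body>
</html>
HTML

my $dom = Mojo::DOM->new($html);

# A long path, similar to what a browser's "copy selector" might produce.
my $long = 'html > body > div#results > ul.stars > li > span.name';

# A shorter selector that matches the same elements in this snippet.
my $short = 'span.name';

# find() returns a Mojo::Collection; map('text') calls ->text on each element.
my @long_names  = $dom->find($long)->map('text')->each;
my @short_names = $dom->find($short)->map('text')->each;

# at() returns only the first matching element.
my $first_mag = $dom->at('span.mag')->text;

say "long selector:  @long_names";
say "short selector: @short_names";
say "first magnitude: $first_mag";
```

With a live page, the same calls would apply to the DOM returned by `$ua->get($url)->res->dom`.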

Replies are listed 'Best First'.
Re^5: running an example script with WWW::Mechanize* module
by Aldebaran (Deacon) on Apr 30, 2020 at 19:43 UTC
    One of the nice things about Mojo::DOM...

    I hadn't been looking there but found at the bottom a simple way to get the DOM into lexical perl that guys like me can understand. I don't get any buttons pushed here, but I'm so pleased with this script that I'm gonna post it. It represents my best achievement yet in getting the DOM information in a format I can read and not blowing me out on STDOUT using Data::Dump.

    $ ./3.mojo_fermi.pl >3.txt
    Wide character in print at /usr/local/share/perl/5.26.1/Log/Log4perl/Appender/File.pm line 313.
    Wide character in print at /usr/local/share/perl/5.26.1/Log/Log4perl/Appender/Screen.pm line 41.
    $ cat 3.mojo_fermi.pl
    #!/usr/bin/perl
    use strict;
    use warnings;
    use Mojo::URL;
    use Mojo::Util qw(dumper);
    use Mojo::UserAgent;
    use Data::Dump;
    use Log::Log4perl;
    use 5.016;
    use Mojo::DOM;

    my $log_conf3 = "/home/hogan/Documents/hogan/logs/conf_files/3.conf";
    my $log_conf4 = "/home/hogan/Documents/hogan/logs/conf_files/4.conf";

    #Log::Log4perl::init($log_conf3);    #debug
    Log::Log4perl::init($log_conf4);    #info
    my $logger = Log::Log4perl->get_logger();
    $logger->info("$0");

    my $site = 'https://www.fourmilab.ch/cgi-bin/Yoursky?z=1&lat=45.5183&ns=North&lon=122.676&ew=West';

    # pretend to be a browser
    my $uaname = 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/40.0.2214.93 Safari/537.36';
    my $ua = Mojo::UserAgent->new;
    $ua->max_redirects(5)->connect_timeout(20)->request_timeout(20);
    $ua->transactor->name($uaname);

    # find search results
    my $dom = $ua->get($site)->res->dom;
    # dd $dom; #overwhelms STDOUT
    say "===========";
    my @nodes = @$dom;

    # c-style for is good for array output with index
    for ( my $i = 0 ; $i < @nodes ; $i++ ) {
      $logger->info("i is $i ==============");
      $logger->info("$nodes[$i]");
    }
    sleep 2;    #good hygiene
    __END__
    $

    I would excerpt my beautiful, straight, demarcated logs, but they're covered in symbols that won't render well here.
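The "Wide character in print" warnings in the run above usually mean that a string containing characters beyond Latin-1 reached a filehandle without an encoding layer. Both Log::Log4perl::Appender::File and Log::Log4perl::Appender::Screen document a `utf8` option for this. The sketch below is an assumption about what might help, since the contents of 3.conf and 4.conf are not shown in the post; the inline config and the appender name `Screen` are illustrative only.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Log::Log4perl;

# Illustrative inline config (the poster's real config files are not shown).
# The utf8 line asks the appender to put its filehandle into :utf8 mode.
my $conf = <<'CONF';
log4perl.rootLogger = INFO, Screen
log4perl.appender.Screen = Log::Log4perl::Appender::Screen
log4perl.appender.Screen.utf8 = 1
log4perl.appender.Screen.layout = Log::Log4perl::Layout::PatternLayout
log4perl.appender.Screen.layout.ConversionPattern = %d %p %m%n
CONF

# init() accepts a reference to a scalar holding the configuration text.
Log::Log4perl::init(\$conf);
my $logger = Log::Log4perl->get_logger();

# A message with a character above U+00FF, the kind that triggers the warning
# when the appender's filehandle has no encoding layer.
$logger->info("sun symbol: \x{2609}");
```

In a config file, the equivalent would be one `utf8 = 1` line per appender, under whatever appender names that file already uses.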

    Give it a shot and let me know if you have any problems.

    Thx, marto, I'll keep after it....
