Beefy Boxes and Bandwidth Generously Provided by pair Networks
Perl-Sensitive Sunglasses
 
PerlMonks  

comment on

( [id://3333]=superdoc: print w/replies, xml ) Need Help??

"If the site doesn't like what your script is doing when you're signed in, they can let you know. I really would like to work up this example, but with WMC instead. My partner and I will watch Netflix, HBO, Amazon, and we're always trying to match the actors up to where we've seen them last, so I will get on my android and make the actual keystrokes on other occasions."

Have you looked at the many utilities on cpan for scraping data from IMDB? While packages exist here is a short proof of concept for accessing data, just a short (sub optimal) example to get you started:

#!/usr/bin/perl use strict; use warnings; use feature 'say'; use Mojo::URL; use Mojo::Util qw(trim); use Mojo::UserAgent; my $imdburl = 'http://www.imdb.com/search/title?title=Caddyshack'; # pretend to be a browser my $uaname = 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 ( +KHTML, like Gecko) Chrome/40.0.2214.93 Safari/537.36'; my $ua = Mojo::UserAgent->new; $ua->max_redirects(5)->connect_timeout(20)->request_timeout(20); $ua->transactor->name( $uaname ); # find search results my $dom = $ua->get( $imdburl )->res->dom; # assume first match my $filmurl = $dom->find('a[href^=/title]')->first->attr('href'); # extract film id my $filmid = Mojo::URL->new( $filmurl )->path->parts->[-1]; # get details of film $dom = $ua->get( "https://www.imdb.com/title/$filmid/" )->res->dom; # print film details say trim( $dom->at('div.title_wrapper > h1')->text ) . ' (' . trim( $d +om->at('#titleYear > a')->text ) .')'; # print actor/character names foreach my $cast ( $dom->find('table.cast_list > tr:not(:first-child)' +)->each ){ say trim ($cast->at('td:nth-of-type(2) > a')->text ) . ' as ' . trim + ( $cast->at('td.character')->all_text ); }

Output:

Caddyshack (1980) Chevy Chase as Ty Webb Rodney Dangerfield as Al Czervik Ted Knight as Judge Elihu Smails Michael O'Keefe as Danny Noonan Bill Murray as Carl Spackler Sarah Holcomb as Maggie O'Hooligan Scott Colomby as Tony D'Annunzio Cindy Morgan as Lacey Underall Dan Resin as Dr. Beeper Henry Wilcoxon as The Bishop Elaine Aiken as Mrs. Noonan Albert Salmi as Mr. Noonan Ann Ryerson as Grace Brian Doyle-Murray as Lou Loomis Hamilton Mitchell as Motormouth

See also Mojo::UserAgent, Mojo::DOM, Mojo::URL, ojo (should you want one liners). You could adapt the above to print all matches and prompt for which one you want, rather than assume the first one (since remakes, sequels/prequels etc..), allow you to select the actor and return the details of all the other films/shows they have been in.

Update: If you would prefer some sort of web interface to the results wrap the above around Mojolicious::Lite


In reply to Re^2: running an example script with WWW::Mechanize* module by marto
in thread running an example script with WWW::Mechanize* module by Aldebaran

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post; it's "PerlMonks-approved HTML":



  • Are you posting in the right place? Check out Where do I post X? to know for sure.
  • Posts may use any of the Perl Monks Approved HTML tags. Currently these include the following:
    <code> <a> <b> <big> <blockquote> <br /> <dd> <dl> <dt> <em> <font> <h1> <h2> <h3> <h4> <h5> <h6> <hr /> <i> <li> <nbsp> <ol> <p> <small> <strike> <strong> <sub> <sup> <table> <td> <th> <tr> <tt> <u> <ul>
  • Snippets of code should be wrapped in <code> tags not <pre> tags. In fact, <pre> tags should generally be avoided. If they must be used, extreme care should be taken to ensure that their contents do not have long lines (<70 chars), in order to prevent horizontal scrolling (and possible janitor intervention).
  • Want more info? How to link or How to display code and escape characters are good places to start.
Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others wandering the Monastery: (6)
As of 2024-04-19 10:58 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found