PerlMonks  

help with writing a web crawler with LWP

by butcher (Initiate)
on Sep 08, 2018 at 17:57 UTC ( [id://1221952]=perlquestion )

butcher has asked for the wisdom of the Perl Monks concerning the following question:

My dynamic language experience is solely PHP. I want to learn Perl now to broaden my knowledge and just because I like programming. :)

Well, I think it is useful to do some real-life things with Perl, so I thought I could dive into some tasks where Perl is said to be very helpful and powerful:

I need some help with writing a web crawler with LWP. I found a great tutorial with fairly nice and helpful explanations, but unfortunately it gives no hints on how to go further.

See the video here: https://www.youtube.com/watch?v=2-kU-mKrYjM

Dr. Rob Edwards from San Diego State University shows how to use Perl and LWP::Simple to write a simple web crawler (Perl part 6: Writing a web crawler with LWP). Unfortunately we cannot follow the tutorial, since the code is not visible. Can you help out here a bit? That would be a great pleasure, and a help in learning.


Replies are listed 'Best First'.
Re: help with writing a web crawler with LWP
by marto (Cardinal) on Sep 08, 2018 at 20:17 UTC

    Welcome. This question is great inasmuch as the modern alternatives are much nicer/easier to work with, so I'm really glad you asked/posted. Firstly, many universities have webpages associated with their courses, so there's a good chance the actual example code shown here (I gave up watching a little of the way in, honestly no disrespect to anyone intended) may be available online in a sane, downloadable format.

    When I talk about nicer/easier ways to work I really mean tools that make the task at hand easier to maintain, more fun to work with and, as far as I'm concerned, easier to learn/teach. I'd suggest you look at Mojo::DOM, which is simply fantastic for working with web based data, along with Mojo::UserAgent. The two can be combined easily, as seen in some examples below. Mojolicious has fantastic documentation and examples to get you started with modern web development in Perl. Here are some examples/answers to posts here I've implemented using the above tools. Some are suboptimal and I may get round to updating them one day.
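
    A minimal sketch of how Mojo::UserAgent and Mojo::DOM might be combined into a small crawler. This is not the code from the video and not one of the posts referred to above. The names links_in and crawl, the limit of 10 pages and the rule of staying on the starting host are all assumptions made for illustration.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Mojo::DOM;
use Mojo::URL;
use Mojo::UserAgent;

# Hypothetical helper: return the absolute URLs of all <a href> links in
# an HTML string, in document order, without duplicates.
# $base is the URL the HTML came from; relative links are resolved against it.
sub links_in {
    my ($html, $base) = @_;
    my $base_url = Mojo::URL->new($base);
    my %seen;
    return grep { !$seen{$_}++ }
           map  { Mojo::URL->new($_)->to_abs($base_url)->to_string }
           Mojo::DOM->new($html)->find('a[href]')->map(attr => 'href')->each;
}

# Hypothetical helper: breadth-first crawl that stays on the start URL's
# host and stops once $max_pages URLs have been tried.
sub crawl {
    my ($start, $max_pages) = @_;
    my $ua    = Mojo::UserAgent->new(max_redirects => 3);
    my $host  = Mojo::URL->new($start)->host // '';
    my @queue = ($start);
    my %visited;

    while (@queue && scalar(keys %visited) < $max_pages) {
        my $url = shift @queue;
        next if $visited{$url}++;

        # result() dies on connection errors, so trap them and move on
        my $res = eval { $ua->get($url)->result } or next;
        next unless $res->is_success;
        next unless ($res->headers->content_type // '') =~ /html/i;

        print "$url\n";

        # Queue only links that point at the same host as the start URL
        push @queue,
            grep { (Mojo::URL->new($_)->host // '') eq $host }
            links_in($res->text, $url);
    }
    return sort keys %visited;
}

# Only touch the network when a start URL is given on the command line
crawl($ARGV[0], 10) if @ARGV;
```

    Saved as, say, crawl.pl (file name assumed), it would be run as "perl crawl.pl https://example.com/". A real crawler should also honour robots.txt and pause between requests, which this sketch leaves out.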

    Modern Perl has some fantastic alternatives to older tools/methods which are really worth exploring. Please let us know if you have any follow-up questions.

Re: help with writing a web crawler with LWP
by atcroft (Abbot) on Sep 08, 2018 at 18:34 UTC

    Welcome to Perl. Hope you enjoy your time with this language. Having experience with PHP, you will likely see a lot that looks familiar.

    As to your question regarding a web crawler, may I suggest taking a look at Web Client Programming with Perl (available through the O'Reilly Open Books Project) and the LWP cookbook. The book may seem a bit dated (1st edition, March 1997), but it may provide a good background, and it includes an example (in Chapter 6) of a recursive client. (You may also want to look at Merlyn's columns.)
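
    To give a rough idea of what a recursive client can look like with LWP, here is a minimal sketch. It is not the Chapter 6 example from the book and not the code from the video. The names extract_links and crawl, the depth limit of 2, the user agent string and the same-host rule are assumptions made for illustration.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use LWP::UserAgent;
use HTML::LinkExtor;
use URI;

# Hypothetical helper: return the absolute targets of all <a href> links
# in an HTML string, fragments stripped, duplicates removed.
sub extract_links {
    my ($html, $base) = @_;
    my @hrefs;
    my $parser = HTML::LinkExtor->new(sub {
        my ($tag, %attr) = @_;
        push @hrefs, $attr{href} if $tag eq 'a' && defined $attr{href};
    });
    $parser->parse($html);
    $parser->eof;

    my %seen;
    return grep { !$seen{$_}++ } map {
        my $uri = URI->new_abs($_, $base)->canonical;
        $uri->fragment(undef);    # page.html#top and page.html are one page
        $uri->as_string;
    } @hrefs;
}

# Hypothetical helper: fetch $url, print it, then recurse into same-host
# links until $depth runs out. $visited is a hash ref shared by all calls.
sub crawl {
    my ($ua, $url, $depth, $visited) = @_;
    return if $depth < 0 || $visited->{$url}++;

    my $res = $ua->get($url);
    return unless $res->is_success && $res->content_type eq 'text/html';

    print "$url\n";

    my $host = URI->new($url)->host;
    for my $link (extract_links($res->decoded_content // '', $res->base)) {
        my $uri = URI->new($link);
        # Skip mailto:, javascript: and the like, and links to other hosts
        next unless $uri->scheme =~ /^https?$/ && $uri->host eq $host;
        crawl($ua, $link, $depth - 1, $visited);
    }
    return;
}

# Only touch the network when a start URL is given on the command line
if (@ARGV) {
    my $ua = LWP::UserAgent->new(agent => 'example-crawler/0.1', timeout => 10);
    crawl($ua, $ARGV[0], 2, {});
}
```

    The %$visited hash is what keeps the recursion from looping when pages link back to each other, and the depth limit keeps the crawl small. A polite crawler would also check robots.txt (LWP::RobotUA exists for that) and wait between requests.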

    Good luck with the project. Hope that helps.

Re: help with writing a web crawler with LWP
by alexander_lunev (Pilgrim) on Sep 08, 2018 at 18:18 UTC

Node Type: perlquestion [id://1221952]
Approved by atcroft
Front-paged by marto