PerlMonks  

Re: Weather goest thou, spider?

by Molt (Chaplain)
on Aug 29, 2002 at 10:13 UTC ( [id://193721] )


in reply to Weather goest thou, spider?

If you want a thorough discussion of writing spiders and parsing HTML, you may want to look at the new O'Reilly tome Perl and LWP. It includes many examples of mining information from websites, ranging from using a few regexps to pull out the information, to rebuilding the HTML in tree form and writing it out again, to spidering entire sites in the correct manner.

I recently had to write a spider for work, and whilst I'd got it working and doing what we needed, this book pointed out a few things I'd overlooked, allowing me to tighten things up and cut down the chances of it falling to pieces. Well recommended.
