PerlMonks  

a crawler

by hnd (Scribe)
on Jun 22, 2009 at 20:13 UTC ( [id://773752]=perlquestion )

hnd has asked for the wisdom of the Perl Monks concerning the following question:

Hey everyone, and a deep bow to the monks. I'm thinking of a project (to be submitted at my college) on a web crawler. I made one using the LWP::RobotUA module but couldn't get much out of it (not just "not much": nothing), so please help me. PS: no need for code, just tell me something that could spark the fire.

Replies are listed 'Best First'.
Re: a crawler
by CountZero (Bishop) on Jun 22, 2009 at 20:36 UTC
    Webcrawlers are a dime a dozen. A few modules and a few lines of code gluing them together and off it goes.

    Perhaps you should think of writing a very specific webcrawler, one that does something that hasn't been done yet and goes out looking for something an ordinary webcrawler does not look for.

    A wild idea: a webcrawler which starts at the top of the website and visits all on-site links and builds a nice site-map with this info. Perhaps add thumbnails of each page and make it a graphical sitemap, with all pictures active links to the actual pages.
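
    The site-map idea above boils down to a breadth-first walk over on-site links. Here is a minimal sketch of just that part, not a finished crawler. The names build_site_map, host_of and the $get_links callback are made up for illustration; in a real crawler the callback would fetch the page (for example with LWP::RobotUA) and extract its links (for example with HTML::LinkExtor), while here it reads from a canned table so the sketch runs without network access. Thumbnails and the graphical output are left out.

```perl
use strict;
use warnings;

# Hypothetical helper: lowercase host part of an absolute http(s) URL,
# or undef if the URL does not look like one.
sub host_of {
    my ($url) = @_;
    return $url =~ m{^https?://([^/:?#]+)}i ? lc($1) : undef;
}

# Breadth-first crawl from $start, following on-site links only.
# $get_links is a code ref (an assumption of this sketch) that takes a
# URL and returns the absolute URLs linked from that page.
# Returns a hash ref: page URL => array ref of on-site links found there.
sub build_site_map {
    my ($start, $get_links) = @_;
    my $site = host_of($start);
    die "start URL must be an absolute http(s) URL\n" unless defined $site;

    my %site_map;
    my %seen  = ($start => 1);    # pages already queued
    my @queue = ($start);

    while (defined(my $page = shift @queue)) {
        my %on_page;
        for my $link ($get_links->($page)) {
            $link =~ s/#.*//s;    # ignore fragments such as #top
            next if $link eq '';
            my $host = host_of($link);
            next unless defined $host && $host eq $site;    # skip off-site links
            $on_page{$link} = 1;
            push @queue, $link unless $seen{$link}++;
        }
        $site_map{$page} = [ sort keys %on_page ];
    }
    return \%site_map;
}

# Demo with a canned link table instead of real HTTP requests.
my %fake_web = (
    'http://example.com/'  => [ 'http://example.com/a', 'http://example.com/b', 'http://other.example/' ],
    'http://example.com/a' => [ 'http://example.com/', 'http://example.com/b#top' ],
    'http://example.com/b' => [],
);
my $map = build_site_map('http://example.com/', sub { @{ $fake_web{ $_[0] } || [] } });
print "$_ -> @{ $map->{$_} }\n" for sort keys %$map;
```

    The %seen hash is what keeps the walk from looping when pages link back to each other, and comparing hosts is a deliberately crude test for "on-site"; a real project would probably want the URI module for proper URL handling.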

    CountZero

    "A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little nor too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James

      I can't remember specifics, but I think we have those already, CountZero :)
        Could be. Do you have any links to them?

        CountZero


Node Type: perlquestion [id://773752]
Approved by GrandFather