Beefy Boxes and Bandwidth Generously Provided by pair Networks
Syntactic Confectionery Delight
 
PerlMonks  

Re^3: What is the fastest way to download a bunch of web pages?

by inman (Curate)
on Mar 03, 2005 at 17:11 UTC ( [id://436302]=note: print w/replies, xml ) Need Help??


in reply to Re^2: What is the fastest way to download a bunch of web pages?
in thread What is the fastest way to download a bunch of web pages?

he had no restriction was due to personal laziness rather than an optimised answer. BrowserUK's solution is more engineered since it allocates a thread pool (with a variable number of threads) and therefore manages the total amount of traffic being generated at any one time.

Let's say for example that you were trying to download 100 pages from the same website. My solution would batter the machine at the other and effectively be a denial of service attack. The thread pool managed approach allows you to tune your network use.

There's more than one way to do it (and the other guy did it better!)

  • Comment on Re^3: What is the fastest way to download a bunch of web pages?

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://436302]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others having an uproarious good time at the Monastery: (3)
As of 2024-04-25 17:37 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found