PerlMonks  

Re: Downloading URL's in Parallel with Perl

by eduardo (Curate)
on Sep 11, 2001 at 17:01 UTC ( [id://111739]=note )


in reply to Downloading URL's in Parallel with Perl

I just wanted to make a quick comment. Make sure you are not falling for the fallacious logic of believing that by calling forth the gods of "explicit parallelism" you are guaranteed a speedup. Remember, in situations like this, where you are *pulling* on the dataflow and, more importantly, your data set rests on a node from which you have no guaranteed transfer rate, it is possible that parallelizing your GETs will not increase your actual throughput or reduce your wall-clock time for the entire transaction.

Remember your Von Neumann bottleneck: it is doubtful that what is slowing down your task is the overhead of processing the data. It is much more likely that the bottleneck is in the actual data pipe (in other words, not processing BUT bandwidth!). And attempting to stuff 10k/sec of data down a 1k/sec pipe won't make the pipe bigger... it may actually end up increasing your overall wall-clock time due to TCP congestion, retransmissions, and other assorted baddies. I'm a big fan of parallelization throughout... just make sure it makes *sense* in your particular configuration.
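
To put rough numbers on that, here is a back-of-the-envelope sketch, not a benchmark. It assumes a deliberately idealized model: one shared pipe of fixed bandwidth, a fixed per-request setup latency, and workers that overlap their latency perfectly. The function name estimate_wall_clock and all of the figures are made up for illustration; real contention on the pipe could make the parallel case worse than this estimate, never better.

```perl
use strict;
use warnings;
use POSIX qw(ceil);

# Idealized wall-clock estimate (in seconds) for fetching $n files of
# $size bytes each over one shared pipe of $bw bytes/sec, with $lat
# seconds of per-request setup latency and $k concurrent workers.
# Assumption: workers overlap their latency perfectly, but the pipe
# never carries more than $bw, so the transfer term does not shrink
# as $k grows.
sub estimate_wall_clock {
    my ($n, $size, $bw, $lat, $k) = @_;
    my $latency_time  = ceil($n / $k) * $lat;   # rounds of request setup
    my $transfer_time = $n * $size / $bw;       # bounded by the pipe
    return $latency_time + $transfer_time;
}

# Hypothetical numbers: 20 files of 100 KB each, a 10 KB/sec pipe,
# and 0.5 s of latency per request.
for my $k (1, 5, 20) {
    printf "%2d worker(s): %.1f s\n",
        $k, estimate_wall_clock(20, 100_000, 10_000, 0.5, $k);
}
```

With these made-up figures, going from 1 worker to 20 only trims the estimate from 210 s to about 200.5 s, because the 200 s spent pushing bytes through the pipe is untouched. Flip the proportions (tiny files, high latency) and the same model shows parallel GETs paying off nicely, which is why it is worth measuring your own configuration first.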
