
Re: how to add threads in the script

by jettero (Monsignor)
on Jul 24, 2009 at 15:55 UTC [id://783028]


in reply to Threads problem with Tk

I recommend against threads in every situation where they seem wanted. Check out POE and the Tk mainloop in particular (POE Tk Cookbook).
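For reference, here is a minimal sketch of the POE-plus-Tk pattern the Cookbook describes, assuming only POE and Tk themselves: loading Tk before POE makes POE drive Tk's event loop, and $poe_main_window is the main window POE creates for you. The label and the one-second tick are purely illustrative.

    #!/usr/bin/perl
    use strict;
    use warnings;

    # Loading Tk before POE selects POE::Loop::Tk, so POE runs
    # inside Tk's event loop and exports $poe_main_window.
    use Tk;
    use POE;

    POE::Session->create(
        inline_states => {
            _start => sub {
                my ($kernel, $heap) = @_[KERNEL, HEAP];
                $heap->{count} = 0;
                $heap->{label} = $poe_main_window->Label(
                    -text => 'ticks: 0',
                )->pack;
                $kernel->delay(tick => 1);
            },
            tick => sub {
                my ($kernel, $heap) = @_[KERNEL, HEAP];
                $heap->{count}++;
                $heap->{label}->configure(-text => "ticks: $heap->{count}");
                $kernel->delay(tick => 1);   # reschedule; the GUI stays live
            },
        },
    );

    POE::Kernel->run;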

Also, check out WWW::Mechanize. It may simplify a lot of what you're doing there.
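And a hedged sketch of a typical WWW::Mechanize session; the URL, form position, and field names here are all made up:

    use strict;
    use warnings;
    use WWW::Mechanize;

    my $mech = WWW::Mechanize->new( autocheck => 1 );  # die on HTTP errors

    # Hypothetical login page; substitute your own URL and fields.
    $mech->get('http://example.com/login');
    $mech->submit_form(
        form_number => 1,
        fields      => { user => 'me', pass => 'secret' },
    );

    print $mech->content;   # HTML of the page after login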

-Paul

Replies are listed 'Best First'.
Re^2: how to add threads in the script
by go0913 (Novice) on Jul 24, 2009 at 16:18 UTC
    Thank you, Paul. But I think changing course is not a good thing for learning; I just want to finish my initial idea, even if it is not the smart way. Thanks anyway. POE and WWW::Mechanize are nice modules, and I will play with them after I master threads and LWP.

      Even if you master threads, they'll still be awful. Friends don't let friends use threads; they suggest forks instead.

      -Paul
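      By way of illustration, a minimal sketch of the forks suggestion: the CPAN forks module presents the familiar threads API, but each "thread" is really a forked process. The squaring job is just a placeholder.

          use strict;
          use warnings;

          # Drop-in for threads: same interface, fork() underneath.
          use forks;

          my @workers = map {
              threads->create(sub {
                  my $n = shift;
                  return $n * $n;    # any per-worker job goes here
              }, $_);
          } 1 .. 5;

          print $_->join, "\n" for @workers;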

      Perl-LWP is way cool!

      I've built maybe a dozen gizmos that cruise around various websites, some with SSL-secured logins, etc. Some sites have bogus search engines, and I just have to suck their database dry by navigating through their web pages, generating tens of thousands of requests via those bogus search engines.

      I would caution you about overwhelming the other guy. It is possible for you to be "blacklisted" if you generate too much traffic too fast on a site.
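      A minimal sketch of a polite LWP loop, with a hypothetical search URL and a stubbed-out handler; the point is the pause between requests:

          use strict;
          use warnings;
          use LWP::UserAgent;

          my $ua = LWP::UserAgent->new( timeout => 30 );

          for my $id (1 .. 100) {
              # Hypothetical query URL; substitute the real site's form.
              my $resp = $ua->get("http://example.com/search?q=$id");
              handle_page($resp->decoded_content) if $resp->is_success;
              sleep 2;    # be a good citizen: throttle the request rate
          }

          sub handle_page { my $html = shift; }   # your parsing goes here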

      The thread model is the most complex approach you can take, but it is also the highest performance. The fork() model is slower, but not by much. Instead of one multi-threaded process that whacks the doo-doo out of a single site as fast as you can, maybe run 10 processes at a more "leisurely" pace that work on 10 sites at once, as sketched below.
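      A rough sketch of that fork() layout, with hypothetical site URLs and a stubbed-out crawl() routine:

          use strict;
          use warnings;

          # One worker process per site, each at a leisurely pace.
          my @sites = map { "http://site$_.example.com" } 1 .. 10;

          my @pids;
          for my $site (@sites) {
              my $pid = fork;
              die "fork failed: $!" unless defined $pid;
              if ($pid == 0) {         # child process
                  crawl($site);        # one slow crawler per process
                  exit 0;
              }
              push @pids, $pid;        # parent remembers the child
          }
          waitpid $_, 0 for @pids;     # reap all the workers

          sub crawl {
              my $site = shift;
              # fetch pages here with LWP, sleeping between requests
          }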

      Update: Got some down votes on this post, so I'll try a clarification. The main idea is, first, don't use the most complex multi-processing model when something easier will do; and second, depending on what kind of site you are accessing, how many requests per second you send makes a difference. Some of my LWP programs access sites that are "small" in terms of bandwidth and processing power but have significant-sized DBs. I can't crash Google.com with a single machine, but on some "small" sites, too fast a flood of requests can be "disruptive", to say the least. I'm just saying to be a "good citizen" out there on the web. Don't unleash a maximally performant web query engine on a website that can't take it. I think this is just common courtesy and something to be considered.
