Thank you, Paul.
But I don't think switching approaches now would be good for my learning. I just want to finish my initial idea, even if the way I do it isn't smart. Thanks anyway. POE and WWW::Mechanize are nice modules, and I will play with them after I master threads and LWP.
| [reply] |
Perl-LWP is way cool! I've built maybe a dozen gizmos that cruise around various websites, some with SSL secure logins, etc. Some sites have bogus search engines, and I just have to suck their database dry by navigating through their web pages, generating tens of thousands of requests via their bogus search engine.

I would caution you about overwhelming the other guy. It is possible to get "blacklisted" if you generate too much traffic too fast on a site. The thread model is the most complex thing that you can do, but it gives the highest performance. The fork() model is slower, but not by much. Instead of one multi-threaded process that whacks the doo-doo out of a single site as fast as it can, maybe run 10 processes at a more "leisurely" pace that beat on 10 sites at once.

Update: Got some down votes on this post, so I'll try a clarification. The main idea is, first, not to use the most complex multi-processing model when something easier will do, and second, that how many times per second you hit a site matters, depending on what kind of site it is. Some of my LWP programs access sites that are "small" in terms of bandwidth and processing power but have DBs of significant size. I can't crash Google.com with a single machine, but on some "small" sites, too fast a flood of requests can be "disruptive", to say the least. I'm just saying to be a "good citizen" out there on the web. Don't unleash a maximally performant web query engine on a website that can't take it. I think this is just common courtesy and something to be considered.
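To make the "10 leisurely processes" idea concrete, here is a rough sketch of one way it could look, not tested against any real site. The names seconds_to_wait, crawl_site and crawl_all, the 'polite-sketch/0.1' agent string, and the choice of a fixed minimum gap between requests are all my own assumptions for illustration. Only fork, waitpid, Time::HiRes and LWP::UserAgent are real.

```perl
use strict;
use warnings;
use Time::HiRes qw(time sleep);

# Hypothetical helper: seconds still to wait so that two requests to
# the same site are at least $min_gap seconds apart.
sub seconds_to_wait {
    my ($last_request, $now, $min_gap) = @_;
    return 0 unless defined $last_request;    # first request: no wait
    my $remaining = $min_gap - ($now - $last_request);
    return $remaining > 0 ? $remaining : 0;
}

# Fetch a list of URLs from ONE site, never faster than one request
# every $min_gap seconds.
sub crawl_site {
    my ($urls, $min_gap) = @_;
    require LWP::UserAgent;                   # loaded only when crawling
    my $ua = LWP::UserAgent->new(
        timeout => 30,
        agent   => 'polite-sketch/0.1',       # assumed agent string
    );
    my $last;
    for my $url (@$urls) {
        sleep(seconds_to_wait($last, time(), $min_gap));
        $last = time();
        my $res = $ua->get($url);
        printf "%s %s\n", $res->code, $url;
    }
}

# One child process per site: the sites are worked in parallel, but
# each individual site only ever sees the slow, throttled pace.
sub crawl_all {
    my ($sites, $min_gap) = @_;               # { site_name => [ urls ] }
    my @pids;
    for my $site (sort keys %$sites) {
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;
        if ($pid == 0) {                      # child
            crawl_site($sites->{$site}, $min_gap);
            exit 0;
        }
        push @pids, $pid;                     # parent
    }
    waitpid($_, 0) for @pids;                 # wait for every child
}
```

You would call it with something like crawl_all({ site_a => \@urls_a, site_b => \@urls_b }, 2) for a two-second gap per site. What gap counts as "polite" depends on the site; two seconds is only a placeholder, and a small site may need a longer one.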
| [reply] |