http://qs321.pair.com?node_id=187153


in reply to Fetching data from a corporate websites using LWP

As many are saying, I definitely believe it depends on the site you are retrieving the data from. If they don't have advertisements on their site, you hitting it every few hours with LWP is less stressful on their server than somebody hitting it more frequently with an interactive browser. Even better, you could set up your script to run only at night, when few people would be there.

However, many sites have policies on this, which are often linked from the bottom of their pages. For example, WhoWhere.com states the following in their terms of service:

(You agree not to) Sell, distribute, or make any commercial use of data obtained from any Lycos database or make any other use of data from any Lycos database in a manner which could be expected to offend the person for whom the data is relevant

-and-

Use automated means, including spiders, robots, crawlers, or the like to download data from any Lycos Network database.

Also, the terms of service for people.yahoo.com state:

You agree not to reproduce, duplicate, copy, sell, resell or exploit for any commercial purposes, any portion of the Service, use of the Service, or access to the Service.

The above statements make it sound like retrieving any data from either of those sites for any commercial purpose may be breaking their terms of service. So, I'd just make sure you read the terms of service and such for the site you're looking into. You may want to email them, and explicitly ask their permission -- they may let you do it, particularly if you tell them it'd only be once an hour throughout the night.
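
If you do get the go-ahead, here is a rough sketch of the "only at night" idea. It assumes the script is started once an hour from cron with the URL as its only argument. The 23:00 to 05:59 window, the agent name, and the contact address are placeholders I made up, so change them to suit the site and yourself.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# True when the hour (0-23) falls in the overnight window.
# The 23:00-05:59 window is an assumption; adjust it for the site's time zone.
sub is_night {
    my ($hour) = @_;
    return ($hour >= 23 || $hour < 6) ? 1 : 0;
}

# Fetch one page, identifying the script and its owner to the site.
# The agent name and contact address are placeholders.
sub fetch_page {
    my ($url) = @_;
    require LWP::UserAgent;    # loaded only when a fetch is actually made
    my $ua = LWP::UserAgent->new(
        agent   => 'nightly-fetcher/0.1',
        from    => 'you@example.com',
        timeout => 30,
    );
    my $res = $ua->get($url);
    die "Fetch failed: ", $res->status_line, "\n" unless $res->is_success;
    return $res->decoded_content;
}

# Started hourly from cron with the URL as its argument;
# exits quietly outside the night window.
if (@ARGV) {
    my $hour = (localtime)[2];
    exit 0 unless is_night($hour);
    my $page = fetch_page($ARGV[0]);
    print length($page), " characters fetched\n";
}
```

Setting the agent and from fields means the site's admins can see who is fetching and how to reach you, which fits with asking their permission first.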

Good luck!
-Eric

--
Lucy: "What happens if you practice the piano for 20 years and then end up not being rich and famous?"
Schroeder: "The joy is in the playing."