I've got a project that would benefit from spidering and scraping a web site. Unfortunately, the web site I want to spider and scrape has very explicit TOS and robots.txt: the info I want is off limits. I want to cancel this project because of this, but management is insistent. Is it ethical to spider / scrape a site that says to stay away? Are there any possible legal ramifications? Should I tell my boss where to put his data in this job market?
Re: The Ethics of Webbots
by tilly (Archbishop) on Sep 17, 2004 at 17:37 UTC | |
You're right that it is unethical to do this project. Whether it is worth your job is your choice. Two factors to consider though. The first is that my experience is that slimeballs are usually not just slimeballs in one way - how they want you to treat others is how they'll also treat you when push comes to shove. (A tip when you eat out with people. Watch how they treat the waiter/waitress. That tends to be very revealing about what those people are really like...) The second is that your practical ability to take a stand on principle strongly depends on your personal circumstances. If nobody depends on you and you have strong skills, then you can do it pretty safely. The current job market (see http://jobs.perl.org/) is reasonable. However if your background is weaker or if you have a family depending on you, then it becomes much harder to walk away from a currently paying job. My suggestion, without knowing your exact circumstances, is that if you cannot afford to take an immediate stand on principle, when an employer does something that you don't want to stand for, quietly start shopping your resume around. In fact even if you think that you can afford an immediate stand on principle, you may feel more comfortable making sure that you have a good fallback before burning any bridges. | [reply] |
Re: The Ethics of Webbots
by dragonchild (Archbishop) on Sep 17, 2004 at 18:47 UTC | |
The idea is to have your ass completely covered. That's what the paper trail is for. You can term it as "I just want to know exactly what you want and be able to refer to it without bothering you." Signatures are best, but email is often good enough. I cannot emphasize this part strongly enough: make hardcopies of all email and photocopies of all signed documents, then store them offsite. These are your only protection when the company gets sued and the $h!t starts to roll downhill. If you're absolutely paranoid, you might even want to talk to a lawyer, just to make sure the jurisdiction you're in doesn't have some crazy laws that would skew the issue. ------
Then there are Damian modules.... *sigh* ... that's not about being less-lazy -- that's about being on some really good drugs -- you know, there is no spoon. - flyingmoose I shouldn't have to say this, but any code, unless otherwise stated, is untested | [reply] |
by Your Mother (Archbishop) on Sep 18, 2004 at 03:36 UTC | |
I'll add, from a personal work experience that ended up in a multi-million dollar lawsuit, that dragonchild's advice above is excellent and should be heeded. | [reply] |
by Rhys (Pilgrim) on Sep 18, 2004 at 16:39 UTC | |
Get it in writing. Get it in writing. Get it all in writing. --J | [reply] |
by radiantmatrix (Parson) on Sep 22, 2004 at 16:40 UTC | |
A few additional thoughts, though: IANA Attorney, so please check with one if you're interested in what your precise rights under the law may be.
-- $me = rand($hacker{perl}); All code, unless otherwise noted, is untested "All it will give you though, are headaches after headaches as it misinterprets your instructions in the most innovative yet useless ways." - Maypole and I - Tales from the Frontier of a Relationship (by Corion) | [reply] |
by EvdB (Deacon) on Sep 21, 2004 at 16:46 UTC | |
The paper trail may get you out of trouble inside your company, but would be an admission that you knew you were wrong if the other company got hold of it. Tread with care - stay legal. --tidiness is the memory loss of environmental mnemonics | [reply] [d/l] |
Re: The Ethics of Webbots
by gjb (Vicar) on Sep 17, 2004 at 17:34 UTC | |
Yes, IMO this is unethical. You can run into trouble. Once I inadvertently violated Google's TOS and the IP address was blacklisted. Unfortunately this happened to be the IP address of the proxy that served the whole company I worked for. It took a letter of apology to Google to get this corrected. Just my 2 cents, -gjb- | [reply] |
Re: The Ethics of Webbots
by Albannach (Monsignor) on Sep 17, 2004 at 17:53 UTC | |
I'd have to ask your boss: why not just contact the owners of this site and look into an agreement with them? Maybe this has not occurred to him at all. On the other hand, maybe he has already done this but his offer was rejected, or they wanted more money than he wanted to spend. Either way, proceeding to scrape the site after failed negotiations would just give the other side more evidence against your side in court. You might want to point out the potential for legal costs to your boss. -- | [reply] |
Re: The Ethics of Webbots
by jbware (Chaplain) on Sep 17, 2004 at 17:57 UTC | |
-jbWare | [reply] |
Re: The Ethics of Webbots
by petdance (Parson) on Sep 21, 2004 at 03:11 UTC | |
What if your girlfriend said "Hey, let's go shoplifting" (assuming you're not Perry Farrell)? You'd think "Do I want to maintain a relationship with this person who wants me to engage in illegal, unethical activities?" That's the situation you're in with your boss. Make no mistake, this issue is between you and your boss, not you and "management." Your boss should be protecting you from having to do anything unethical or illegal. If he/she is not, then you have a shitty boss, and it may well be time to move on anyway. My bottom line is: no, you should not do anything in the name of the company that violates your own code of ethics. Let 'em fire you for insubordination. They won't be eager to fire you, in general, nor will they relish the thought of you going to the unemployment office explaining why you were canned. Does your company have a code of ethics? A higher-up corporate parent you can talk to? My company has a code of ethics I have to sign yearly. I was in a situation like this before, at a different company, now out of business, when I was asked to write a program to create false sales reports to give to a supplier. I refused to do it, and was fully prepared to get canned. There were no repercussions, perhaps because my boss did the project instead of me.
xoxo, | [reply] |
by prodevel (Scribe) on Sep 21, 2004 at 06:43 UTC | |
Excellent advice by all - many thanks as I've been asked this same stuff before and have been able to be in the position to give a very flat, 'No.' | [reply] |
Re: The Ethics of Webbots
by Anonymous Monk on Sep 17, 2004 at 21:22 UTC | |
However, if I were to try and somehow make money with this data, I would consider it unethical (and IANAL, I'd probably be legally liable). | [reply] |
Re: The Ethics of Webbots
by inman (Curate) on Sep 22, 2004 at 08:24 UTC | |
If you are scraping the website of a competitor, someone who derives an income stream from the information or someone who has already paid someone else for the content on their website then you probably won't get very far. It is also the case that if you are re-selling the information that you get or not attributing the source of the information then you would be in trouble (breach of copyright, passing off etc.). If, on the other hand, you are crawling a website that is otherwise in the public domain (e.g. government websites) then it may be worth getting in touch with the website owners and talking to them about it. Content owners are trying harder to provide machine-to-machine services such as web services, RSS feeds etc. A small licence fee later and you could end up with a web service feed rather than a web site scrape. As a general note from a technical point of view - be nice when you are scraping/indexing. As for your relationship with your boss, you need to document your concerns and the approach that you are taking. Remember that if you can demonstrate that they knew what was happening then they get it in the neck and not you. | [reply] |
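As an illustration of the "be nice" point above, here is a minimal sketch of checking robots.txt rules before fetching anything, using Python's standard `urllib.robotparser`. The robots.txt content, the `mybot` user agent, the `example.com` URLs and the helper names are all made up for the example. Note that passing this check says nothing about a site's TOS, which still has to be read by a human.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, inlined so the sketch runs offline.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 5
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_fetch(parser: RobotFileParser, user_agent: str, url: str) -> bool:
    """Return True only if the rules allow this agent to fetch the URL."""
    return parser.can_fetch(user_agent, url)


def polite_delay(parser: RobotFileParser, user_agent: str,
                 default: float = 1.0) -> float:
    """Seconds to wait between requests: the site's Crawl-delay if it
    declares one, otherwise an assumed conservative default."""
    delay = parser.crawl_delay(user_agent)
    return float(delay) if delay is not None else default


if __name__ == "__main__":
    rules = build_parser(SAMPLE_ROBOTS_TXT)
    for url in ("https://example.com/public/index.html",
                "https://example.com/private/data.html"):
        print(url, may_fetch(rules, "mybot", url))
    print("delay:", polite_delay(rules, "mybot"))
```

In a real crawler you would load the site's own file with `set_url()` and `read()`, skip any URL for which `may_fetch` returns False, and sleep for `polite_delay` seconds between requests.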
Re: The Ethics of Webbots
by toma (Vicar) on Sep 20, 2004 at 05:02 UTC | |
If you have to do a lot of clicking, you can save your fingers from injury by using a footswitch instead of a mouse button. Take your shoe off, and click with your big toe. This motion should be the same as one of the motions involved in walking, and should not wear out the meat-ware. One type of footswitch is sometimes called a treadle.
It should work perfectly the first time! - toma
| [reply] |
Re: The Ethics of Webbots
by poqui (Deacon) on Sep 21, 2004 at 16:55 UTC | |
| [reply] |