Re: Re: Re: Web Robot

by schumi (Hermit)
on Jul 17, 2003 at 08:53 UTC ( #275152=note )

in reply to Re: Re: Web Robot
in thread Web Robot

True, most major search engines don't stick to those rules. So if you use a robots.txt on your site, be aware that it may well be ignored.

But I think that just because some big companies/search engines don't stick to the rules doesn't mean you should do the same. I always go by the maxim: don't do unto someone else what you wouldn't want done to you/your site.
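For what it's worth, playing by the rules costs a robot very little. Here is a minimal sketch in Python using the standard library's urllib.robotparser, not a definitive implementation. The robots.txt content, the "ExampleBot" agent name, and the example.com URLs are all made up for illustration; a real robot would fetch the file from the site it is about to crawl.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, inlined so the sketch needs no network.
# A real robot would download http://<site>/robots.txt first.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def make_parser(robots_txt: str) -> RobotFileParser:
    """Build a parser from the text of a robots.txt file."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def allowed(parser: RobotFileParser, url: str, agent: str = "ExampleBot") -> bool:
    """Return True if the rules permit this agent to fetch the URL."""
    return parser.can_fetch(agent, url)


parser = make_parser(ROBOTS_TXT)

# Ask before every request; skip anything the site owner has disallowed.
for url in ("http://example.com/index.html",
            "http://example.com/private/data.html"):
    verdict = "fetch" if allowed(parser, url) else "skip"
    print(verdict, url)
```

With the rules above, the first URL is fetched and the second is skipped. Note that the check is entirely voluntary on the robot's side, which is exactly the point being argued here.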

Just my 2 Rappen (Swiss equivalent to cents).


There are nights when the wolves are silent and only the moon howls. - George Carlin

Re: Re: Re: Re: Web Robot
by Anonymous Monk on Jul 17, 2003 at 09:10 UTC

    On the topic of robots.txt, why would someone even use this? If you don't want a page accessed, limit access to it. Depending on all computers to play nice isn't a very smart move; they have many hidden motives :)

      Quite true, although most major search engines do actually heed the robots file, if it is set up properly.

      I think the easiest way to restrict access to a directory is setting up a proper .htaccess-file. You could even restrict access by IP-addresses...

      On the other hand, using a robots file (in addition to the above, note!) decreases the number of 404s in your error logs... ;-)


      There are nights when the wolves are silent and only the moon howls. - George Carlin
