Quoting from Google's FAQ:
How do I request that Google not return cached
material from my site?
Google stores many web pages in its cache to retrieve
for users as a back-up in case the page's server temporarily fails.
Users can access the cached version by choosing the "Cached" link on
the search results page. If you do not want your content to be accessible
through Google's cache, you can use the NOARCHIVE meta-tag. Place this
in the <HEAD> section of your documents:
<META NAME="ROBOTS" CONTENT="NOARCHIVE">
This tag will tell robots not to archive the page.
Google will continue to index and follow links from the page, but will
not present cached material to users. If you want to allow other robots
to archive your content, but prevent Google's robots from caching,
you can use the following tag:
<META NAME="GOOGLEBOT" CONTENT="NOARCHIVE">
Note that the change will occur the next time Google
crawls the page containing the NOARCHIVE tag (typically at least once
per month). If you want the change to take effect sooner than this,
the site owner must contact us and request immediate removal of archived
content. Also, the NOARCHIVE directive only controls whether the cached page is shown. To control whether the page is indexed, use the NOINDEX
tag; to control whether links are followed, use the NOFOLLOW tag. See
the Robots
Exclusion page for more information.
So vroom would simply need to modify the header of the PM-generated pages to include such tags, but this means that no PM page would be cached.
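As a rough illustration of that header change, here is a minimal Python sketch that injects the NOARCHIVE tag into a page's head section. The function name `add_noarchive_meta` is made up for this example and is not part of the PM code base; the real change would go wherever the site builds its page header.

```python
import re

def add_noarchive_meta(html: str, robot: str = "ROBOTS") -> str:
    """Insert a NOARCHIVE meta tag right after the opening <head> tag.

    Hypothetical helper: `robot` is "ROBOTS" for all robots, or
    "GOOGLEBOT" to target only Google's crawler, as in the FAQ quoted above.
    """
    tag = f'<META NAME="{robot}" CONTENT="NOARCHIVE">'
    # Match the opening <head> tag in any case, with or without attributes.
    # The \b keeps this from matching tags such as <header>.
    head_open = re.compile(r"<head\b[^>]*>", re.IGNORECASE)
    match = head_open.search(html)
    if match is None:
        return html  # no <head> section found; leave the page untouched
    return html[:match.end()] + "\n" + tag + html[match.end():]


page = "<html><head><title>t</title></head><body></body></html>"
print(add_noarchive_meta(page))
print(add_noarchive_meta(page, "GOOGLEBOT"))
```

Passing "GOOGLEBOT" instead of the default keeps other robots free to archive the page, matching the second tag in the FAQ.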
He could also add a link that produces pages without a CB nodelet, then set up a robots.txt file to block the main pages but not the non-CB pages. But this is getting into slightly tricky territory...
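To make the robots.txt idea a bit more concrete, here is a small Python sketch that checks a candidate rule set with the standard library's `urllib.robotparser`. The paths `/index.pl` (main pages) and `/nocb.pl` (pages without the CB nodelet) and the host `www.example.com` are assumptions invented for the example; the real site layout may well differ.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: block the main pages, leave the non-CB pages crawlable.
ROBOTS_TXT = """\
User-agent: *
Disallow: /index.pl
"""

def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text held in memory, without fetching anything."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser

parser = build_parser(ROBOTS_TXT)

# Made-up URLs for the two kinds of page.
main_page = "http://www.example.com/index.pl?node_id=1"
non_cb_page = "http://www.example.com/nocb.pl?node_id=1"

print(parser.can_fetch("Googlebot", main_page))    # blocked by the Disallow rule
print(parser.can_fetch("Googlebot", non_cb_page))  # no rule matches, so allowed
```

This only shows what a well-behaved robot would conclude from the file; it does nothing about robots that ignore robots.txt.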
Dr. Michael K. Neylon - mneylon-pm@masemware.com
"You've left the lens cap of your mind on again, Pinky" - The Brain
I think most legitimate bots, such as Googlebot and others, respect the robots exclusion convention.
You can find more information on it here
For more information on how to remove the stuff already indexed by Google, try here
At the risk of sounding pedantic, though, caching and indexing by crawlers are not exactly the same thing. To stop someone from caching pages, this would be a good site to start with....
HTH