PerlMonks  

Is it possible to stop caching?

by Fingo (Monk)
on Apr 07, 2001 at 04:08 UTC ( #70656=monkdiscuss )

As many of you know, Google (and some other search engines) cache pages. This is all good and useful, except for one thing: the chatterbox. Try searching for an exact phrase that appeared in the chatterbox, and you get a whole collection of nodes that were present at the time of the message, which Google politely cached. Is there a way to stop the caching of part of a site?

Re: Is it possible to stop caching?
by Masem (Monsignor) on Apr 07, 2001 at 04:23 UTC
    Quoting from Google's FAQ:
    How do I request that Google not return cached material from my site?
    Google stores many web pages in its cache to retrieve for users as a back-up in case the page's server temporarily fails. Users can access the cached version by choosing the "Cached" link on the search results page. If you do not want your content to be accessible through Google's cache, you can use the NOARCHIVE meta-tag. Place this in the <HEAD> section of your documents:
    <META NAME="ROBOTS" CONTENT="NOARCHIVE">
    This tag will tell robots not to archive the page. Google will continue to index and follow links from the page, but will not present cached material to users. If you want to allow other robots to archive your content, but prevent Google's robots from caching, you can use the following tag:
    <META NAME="GOOGLEBOT" CONTENT="NOARCHIVE">
    Note that the change will occur the next time Google crawls the page containing the NOARCHIVE tag (typically at least once per month). If you want the change to take effect sooner than this, the site owner must contact us and request immediate removal of archived content. Also, the NOARCHIVE directive only controls whether the cached page is shown. To control whether the page is indexed, use the NOINDEX tag; to control whether links are followed, use the NOFOLLOW tag. See the Robots Exclusion page for more information.

    So vroom would simply need to modify the header of the PM-generated pages to include such tags, but this means that no PM page would be cached.
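
    A rough sketch of what such a header change could look like, written in Python rather than the site's own Perl. The `render_head` helper and the `has_chatterbox` flag are hypothetical names invented for illustration; nothing here reflects how the PM code actually builds its pages. Here the tag is emitted only for pages that carry the chatterbox:

```python
from html import escape


def render_head(title: str, has_chatterbox: bool) -> str:
    """Build the <HEAD> of a generated page (hypothetical helper).

    `has_chatterbox` is an assumed flag saying whether the page
    includes the CB nodelet; it is not part of any real site API.
    """
    lines = ["<HEAD>", f"<TITLE>{escape(title)}</TITLE>"]
    if has_chatterbox:
        # Ask robots not to offer a cached copy of this page.
        lines.append('<META NAME="ROBOTS" CONTENT="NOARCHIVE">')
    lines.append("</HEAD>")
    return "\n".join(lines)


print(render_head("Is it possible to stop caching?", has_chatterbox=True))
```

    Emitting the tag unconditionally would match the simpler case described above, where every page opts out of the cache.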

    He could also add a link that produces pages without a CB nodelet, then set up a robots.txt file to block the main pages but not the non-CB pages. But this is getting into a rather tricky area...
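
    One way the robots.txt idea might be laid out, checked here with Python's standard `urllib.robotparser`. The `/nocb/` path prefix, the `www.example.org` host, and the `crawler_may_fetch` helper are all assumptions made up for this sketch; the real site has no such layout as far as this thread shows.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical layout: chatterbox-free pages live under /nocb/,
# while every other URL is a normal page with the CB nodelet.
ROBOTS_TXT = """\
User-agent: *
Allow: /nocb/
Disallow: /
"""


def crawler_may_fetch(url: str, agent: str = "Googlebot") -> bool:
    """Return True if the rules above let `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, url)


# The non-CB copy of a node versus the regular page with the chatterbox.
print(crawler_may_fetch("http://www.example.org/nocb/index.pl?node_id=70656"))
print(crawler_may_fetch("http://www.example.org/index.pl?node_id=70656"))
```

    With these rules the first URL should be allowed and the second blocked. Because robots.txt rules match on path prefixes, the non-CB pages need a distinct path of their own, which is part of what makes this approach tricky.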


    Dr. Michael K. Neylon - mneylon-pm@masemware.com || "You've left the lens cap of your mind on again, Pinky" - The Brain
Re: Is it possible to stop caching?
by tinman (Curate) on Apr 07, 2001 at 04:23 UTC

    I think most legitimate bots, such as Googlebot, respect the robots exclusion convention. You can find more information on it here

    For more information on how to remove the stuff already indexed by Google, try here

    At the risk of sounding pedantic, though, caching and indexing by crawlers are not exactly the same thing. To stop someone from caching pages, this would be a good site to start with...
    HTH

Re: Is it possible to stop caching?
by chipmunk (Parson) on Apr 09, 2001 at 20:57 UTC
