PerlMonks  

CB history and Google's cache

by EdwardG (Vicar)
on Sep 14, 2004 at 13:01 UTC ( [id://390822]=monkdiscuss )

Any monk with a strong view on the persistence of conversations in the CB should be aware that whatever is said might end up cached by Google, and thereby not only recorded for far longer than 1 hour (as per CBHistory) but also showing up in Google search results.

By way of example - I was talking the other day about a BNF for ANSI SQL, one thing led to another, and tye coined the name sqltidy [1]. Fast forward a few days, I'm still thinking about the topic, on a whim decide to search for "sqltidy", just in case, and lo - there is my conversation, frozen in time as part of Google's cache, courtesy of thepen.

Not a big deal this time, but it made me glad I hadn't said anything I might regret (like three stooges impersonations, nyuk nyuk nyuk [2]).

____

[1] presumably a variant of /(?:perl|html)tidy/

[2] oops

 

Replies are listed 'Best First'.
Re: CB history and Google's cache (fixed)
by demerphq (Chancellor) on Sep 14, 2004 at 15:21 UTC

    For the record, when this came up yesterday in the CB I /msg'ed blakem, the owner of thepen. It seems somehow (not sure how exactly, but whatever) that the nodelets were turned on at thepen without blakem's knowledge.

    Anyway, I asked him to turn it off, and he said that he has. How long it'll take before it actually affects Google results is another question. Also, I don't know if this will totally resolve the problem: afaik googlebots are allowed on the front page, and as they log in as AM they get the CB nodelet automatically, so anything there will be indexed. I think, however, this is a lot less dramatic than what would have been available through thepen's mirror.


    ---
    demerphq

      First they ignore you, then they laugh at you, then they fight you, then you win.
      -- Gandhi

      Flux8


      Remember that Google will soon re-index the frontpage and forget the moment of CB conversation it had previously conserved (in favour of a new one…)

      Unless prohibited by robot rules, the GoogleBot will also crawl into the site up to a certain depth, despite the parametrized GETs.

      Makeshifts last the longest.

      PM's /robots.txt says:

      # sorry, but misbehaved robots have ruined it for all of you.
      User-agent: *
      Disallow: /
      so google shouldn't be indexing PM directly at all. But I believe that google still does index PM to some extent, perhaps when prompted to by people giving specific URLs to less-obvious google tools (which could be construed as not being "robot" traffic) -- though I'm just guessing wildly here.
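As a rough illustration of what those two rules mean for a crawler that honours robots.txt, here is a minimal sketch using Python's standard `urllib.robotparser`. The user-agent names and the example.org URL are assumptions made up for the demo, and the result says nothing about how Googlebot actually behaved.

```python
from urllib.robotparser import RobotFileParser

# The rules quoted above, fed to the parser as plain text.
ROBOTS_TXT = """\
# sorry, but misbehaved robots have ruined it for all of you.
User-agent: *
Disallow: /
"""


def may_fetch(user_agent: str, url: str) -> bool:
    """Return True if a crawler honouring ROBOTS_TXT may fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Hypothetical agents and URL, for illustration only.
    for agent in ("Googlebot", "SomeOtherBot"):
        print(agent, may_fetch(agent, "http://example.org/index.pl?node_id=390822"))
```

With `Disallow: /` under `User-agent: *`, every path is off limits to every compliant robot, which is why any indexing that still happens has to come from somewhere else.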

      Corion is working on building perlmonks.org/robot/... and then we'll tell robots (via robots.txt) to only scan that part of the site and we'll also detect common robots (via user agent string) and redirect them there as well.
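To make the plan above a little more concrete, here is a hedged sketch of the two pieces: a robots.txt that only opens up `/robot/`, and a user-agent check that sends recognised robots there. The rule file, the marker list and the function names are all hypothetical; this is not the code Corion wrote, and nothing below the `/robot/` prefix is assumed.

```python
from typing import Optional
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for the planned layout. Python's parser applies
# the first matching rule, so Allow has to come before the blanket Disallow.
PLANNED_ROBOTS_TXT = """\
User-agent: *
Allow: /robot/
Disallow: /
"""

# Hypothetical substrings used to recognise common robots by user agent.
KNOWN_ROBOT_MARKERS = ("googlebot", "slurp", "msnbot")


def allowed_by_robots_txt(user_agent: str, url: str) -> bool:
    """Check url against PLANNED_ROBOTS_TXT for the given user agent."""
    parser = RobotFileParser()
    parser.parse(PLANNED_ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


def is_known_robot(user_agent: str) -> bool:
    """Case-insensitive substring match against the marker list."""
    ua = user_agent.lower()
    return any(marker in ua for marker in KNOWN_ROBOT_MARKERS)


def redirect_target(user_agent: str, path: str) -> Optional[str]:
    """Return where to redirect a request, or None to serve it as-is."""
    if is_known_robot(user_agent) and not path.startswith("/robot/"):
        return "/robot/"
    return None
```

The robots.txt half only helps with robots that read and obey it; the user-agent half catches robots that identify themselves honestly but ignore the rules. Neither stops a crawler that lies about what it is.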

      - tye        

Re: CB history and Google's cache
by eric256 (Parson) on Sep 14, 2004 at 13:11 UTC

    You wouldn't assume anything uttered in the presence of another human to be secret. The more of those humans you put together, the less secret it becomes. I've never understood why people think that the CB should be any different. I always consider that what I say can be misrepresented, recorded without my knowledge and used later against my will, or even modified or twisted and used later. You just have to hope in life that people will be smart enough to recognize that certain comments have been taken out of context and don't have the meaning they appear to. If you are afraid that your comments will be misunderstood or taken out of context, then I wonder if you fear every time you open your mouth in public?

    By "you" I didn't mean you specifically, EdwardG; I just meant those people who fear long CB history. I understand that people aren't always smart enough to distinguish when comments have been taken out of context, and I've heard the stories about comments from the CB being used as evidence in a trial. None of that changes the fact that it IS a public medium and should be treated as such when you use it. Wishing it wasn't doesn't make it so. I hope this doesn't upset anybody; it represents only my views and thoughts on the matter.


    ___________
    Eric Hodges
      Howdy!

      The problem is a social one.

      Consider going to a party with lots of conversation. Surreptitiously record the conversations for future reference. Make use of those recordings. Watch your back.

      That is the essence of the issue.

      ChatterBox carries an expectation that the conversation is ephemeral. The FullPage Chat only carries about the last eight to ten minutes, or the last fifteen messages (or so). Anything older is lost, probably by design. Other chat clients and history tools work to try to preserve a larger window, but that simply facilitates putting things into context when the conversation is roaring along.

      There is no mechanism available to prevent archiving CB traffic for longer periods, save social pressure. I suppose something could be done, with much work, to try to discern monitor-bots from simple CB clients, but I'm not hinting at advocating going there.

      Google is not systematically archiving the traffic; it happens to catch snippets as it spiders about (so far as I know).

      It's not a matter of secrecy; it's a matter of social responsibility.

      yours,
      Michael

        My point was that even though people shouldn't record conversations at parties (and maybe therefore the CB), it doesn't mean you should act as if everything you do and say is lost forever 10 minutes later. We may want it to be ephemeral, but that doesn't mean you should count on it. The same goes for both cases; it doesn't mean the person doing the recording is right, legal, or moral. I just mean that when you open your mouth or use your fingers, you should take responsibility for what you say, and hope that others will notice out-of-context recordings (or chat logs) when they see them.


        ___________
        Eric Hodges
Re: CB history and Google's cache
by been42 (Curate) on Sep 14, 2004 at 13:54 UTC
    I could be wrong, and often am, but I think when people get upset in discussions of CB history persistence they're really upset about intentional persistence.

    There's a huge difference between (a) you and tye having a conversation in a restaurant where someone overhears you, and (b) you and tye having a conversation where another monk is following you around with a tape recorder.

    We can't help it if Google keeps pieces of conversations, and it's getting tougher all the time as the search engine wars heat up. I get hundreds of hits a day from search engine bots at my little site that nobody knows or cares about. Now, because of stupid little conversations between me and my brother, I get hits at my site from searches for "randy newman" and "abercrombie t shirt". So your warning is really good for someone who's likely to say things in the CB that he'll regret later (like me). But you, EdwardG, have an important life lesson to learn here: never, ever be ashamed of a good Three Stooges impersonation. They, like Irish accents, are too often imitated and too seldom perfected.

Re: CB history and Google's cache
by Your Mother (Archbishop) on Sep 14, 2004 at 21:57 UTC

    I put this in the head of lots of my HTML: <meta name="robots" content="nocache,noarchive" /> for per-page Google directions. In my experience, it's reliable for them. You can add a "noindex" too to keep something out of the index entirely ("nofollow" only tells the robot not to follow the links on the page).
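For anyone who wants to check what a page is actually telling robots, here is a small sketch that pulls the directives out of a robots meta tag using Python's standard `html.parser`. The class and function names are made up for this example, and it only reads the tag; whether a given crawler honours each directive is a separate question.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <meta ... /> also end up here.
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() != "robots":
            return
        content = attr.get("content") or ""
        for directive in content.split(","):
            directive = directive.strip().lower()
            if directive:
                self.directives.add(directive)


def robots_directives(html: str) -> set:
    """Return the set of robots directives declared in the given HTML."""
    parser = RobotsMetaParser()
    parser.feed(html)
    parser.close()
    return parser.directives


if __name__ == "__main__":
    page = '<html><head><meta name="robots" content="nocache,noarchive" /></head></html>'
    print(sorted(robots_directives(page)))
```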

    The web is not a party conversation. It's a bad analogy. Everything online is de facto recorded. Sometimes for a very short term, i.e. printed to a temporary page and then the tape is erased, but it's still recorded. So the analogy becomes everyone *is* taping your conversations at a party, you just have some vague social agreement to not save the tapes.

    I appreciate the fact that the CB isn't stored (and any attempts to keep it out of others' caches) but eric256 is correct--saying something in the presence of others, for example the 31 monks on the site right now, and having an expectation of privacy isn't sensible.
