http://qs321.pair.com?node_id=168984


in reply to XML for databases?!?! Is it just me or is the rest of the world nutz?

2-second moral of the story: archive, archive, archive with natural language decipherability if you think it's better to err on the side of preservability. And anything else you can think of that ends in "ility".

PRIORITY ONE MESSAGE FROM STARFLEET COMMAND
Attn: James T. Kirk, Cpt. USS Enterprise

Greetings, James.

Once again we are shocked at your continued violations of time protocols and your frequent flouting of edict 45899.68.45 regarding deliberate forking of the time/space continuum despite substantial evidence that these apparent paradoxes do indeed iron themselves out in the different facets of the multiverse. Until we know more about these potential effects, we severely condemn this latest breach of protocol.

We do understand that once you found the miraculously preserved digital archives of Dr. Sean Shrum (circa 21st century Earth, United States of America) there was some problem deciphering their contents, despite the best efforts of S.O. Spock and your shipboard computer. Normally, had their contents been deliberately encrypted we understand you could have used the latest quantum computing techniques to crack the code. But, since Dr. Shrum chose to use a data format that had been lost in the winds of time, decipherment proved impossible -- especially since we had no idea what sort of information was stored in his archives. Only his reputation based on the surviving seventeen cults spawned from various interpretations of his purported discoveries survived well enough to even enable us to recall his name.

Usually in these cases, especially beginning in the 21st century, data formats were specified in "Unicode" plain text markup formats, one of the prevalent extant examples being so-called "XML". We can reliably infer from the works of his contemporaries who chose to save their compendia in XML that he was the exception in this regard. Nevertheless, the apparent loss of his archives serves as no excuse to travel back in time, seduce the man's wife, and pummel the data format out of him under duress.

Our top researchers are still classifying the various continued bifurcations in the multiverse -- we can only assume that our colleagues in the other realities, where they still survive, are doing likewise. In the meantime, despite your methodology, we find his varied works to be extremely interesting -- algorithmically useful at best, anthropologically fascinating at worst. Please port his works to your nearest Perl 5678.6.8 Planetary cache.

Regards,
Fleet Commander Larry Wall XXIX
STARFLEET COMMAND OUT

Re: Re: XML for databases?!?! Is it just me or is the rest of the world nutz?
by dsheroh (Monsignor) on May 24, 2002 at 14:10 UTC
    Normally, had their contents been deliberately encrypted we understand you could have used the latest quantum computing techniques to crack the code. But, since Dr. Shrum chose to use a data format that had been lost in the winds of time, decipherment proved impossible

    Hmm... Let's see...

    # cd /var/lib/postgres/data/pg_xlog
    # strings 0000000000000011 | more
    Serial 026324 Serial 5699K11353 cn 29L02 Roof fans Mfg Greenheck Roof fan serial OOC23697 ...
    Looks like postgres (and, in my experience, most other databases) stores its data internally as plain text unless deliberately encrypted or compressed. Yeah, I'll give you that strings isn't likely to tell you the structure of the data, but getting at the content without going through the database engine is trivial.
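For anyone curious what that kind of extraction amounts to, here is a minimal Python sketch of the idea behind `strings`: scan raw bytes for runs of printable ASCII. It is not the real utility's implementation; the function name `extract_strings`, the four-character minimum, and the sample bytes are assumptions made up for illustration.

```python
import re

def extract_strings(data: bytes, min_len: int = 4) -> list[str]:
    """Return runs of printable ASCII (0x20-0x7e) at least min_len bytes long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]

# Hypothetical bytes standing in for a binary data file: text fields
# surrounded by non-printable framing bytes.
blob = b"\x00\x01\x10Serial 026324\x00\x07\xffMfg Greenheck\x00\x02ab\x00"

# Prints ['Serial 026324', 'Mfg Greenheck']; the two-byte run "ab" is too short.
print(extract_strings(blob))
```

As noted above, this recovers content but not structure: nothing in the output says which field belongs to which record.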
      Yes, you squarely nailed the weak spot in my parable. I winced when I wrote that bit, but then shrugged in favor of artistic license. The idea that a lost data format could somehow be more incomprehensible than deliberately encrypted data is of course ludicrous.

      Matt
Re: XML for databases?!?! Is it just me or is the rest of the world nutz?
by Abigail-II (Bishop) on May 28, 2002 at 19:26 UTC
    Yeah, a nice story. I don't see the point though. XML isn't a magic bullet that somehow preserves your data for eternity. If your data is mostly just content, you might as well store it in a flat file without all the XML verbosity around it *. But if you require the relations, then your data is gone when the description of what the elements of your DTD mean is gone.

    And that's the same problem as losing the data format.

    * Why on earth does XML have to be so verbose? It's just LISP, only it needs a lot more characters.

    Abigail
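To make the LISP comparison in the footnote concrete, here is a rough Python sketch that rewrites a small XML fragment as an S-expression. The helper `to_sexpr`, the `@name` convention for attributes, and the sample record are all assumptions for illustration, not an established mapping; tail text and escaping are ignored.

```python
import xml.etree.ElementTree as ET

def to_sexpr(elem) -> str:
    """Render an element as (tag (@attr "value") "text" children...)."""
    parts = [elem.tag]
    for name, value in elem.attrib.items():
        parts.append(f'(@{name} "{value}")')
    text = (elem.text or "").strip()
    if text:
        parts.append(f'"{text}"')
    for child in elem:
        parts.append(to_sexpr(child))
    return "(" + " ".join(parts) + ")"

# Hypothetical record, made up for this example.
xml_doc = '<fan type="roof"><mfg>Greenheck</mfg><serial>OOC23697</serial></fan>'
sexpr = to_sexpr(ET.fromstring(xml_doc))

# Prints (fan (@type "roof") (mfg "Greenheck") (serial "OOC23697"))
print(sexpr)
```

The nesting is the same in both forms; the S-expression simply drops the repeated closing tag names. Neither form says what `mfg` or `serial` means, which is the point made above about losing the description.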