Beefy Boxes and Bandwidth Generously Provided by pair Networks
Perl-Sensitive Sunglasses
 
PerlMonks  

Re: Version Control in Database Applications

by samtregar (Abbot)
on Jun 10, 2002 at 16:50 UTC ( [id://173218]=note: print w/replies, xml ) Need Help??


in reply to Version Control in Database Applications

Thanks guys! Your suggestions have been very helpful, particularly the ones from hossman and tomhukins.

I think I have an idea for an implementation that would be both easy on queries and on updates. Basically, instead of using an INT column for my ids I'd use a VARCHAR. Then the current version of every object is just the id number, say "1024". When I create a new version I append the version number creating a new id - "1024,1" for version 1 for example. This makes creating a new version as simple as cloning records in all affected tables under the new id. SELECTs and JOINs stay simple since selecting on the id as-is always gets you the current version.

So, how does that sound? Am I missing something that will cause me premature hair-loss later?

-sam

  • Comment on Re: Version Control in Database Applications

Replies are listed 'Best First'.
Re: Re: Version Control in Database Applications
by cowens (Beadle) on Jun 10, 2002 at 20:01 UTC
    I think you may find that a composite key on two INTEGERs is faster than a single VARCHAR key (and smaller as well). Besides the composite key allows easier gathering of history for a single item. Compare:
    SELECT * FROM tablename WHERE key = 12345 ORDER BY version;
    and
    SELECT * FROM tablename WHERE key LIKE '12345,%' ORDER BY key;
    The first most likely will get to use an index on key (especially if you build the index with key first and version second) and the second is most likely a table scan. Here are the singleton selects for the current version
    SELECT * FROM tablename o WHERE key = ? AND version = ( SELECT max(version) FROM tablename i WHERE i.key = o.key )
    vs.
    SELECT * FROM tablename WHERE key = ( SELECT max(key) FROM tablename WHERE key LIKE '$key,%' )
      Re: Speed. My guess is that given an indexed fixed-width CHAR field I should be able to get acceptable speed on lookups. All I need to do is decide on the maximum total number of objects and the maximum number of versions and I should be able to pick a field width. I'll be using MySQL and my memory is that MySQL indexes CHARs just as well as it does INTs.

      Re: Querying. I think you misunderstood (or I did). In my planned implementation the current version would always exist without an appeneded ",version". Thus, selecting the current version is just:

      SELECT * FROM foo WHERE id = "1024"

      No sub-select required! And selecting version N is simply:

      SELECT * FROM foo WHERE id = "1024,N"

      Now, I may still need an extra column to store the current version number but as long as it isn't needed on every select and every join I'll be happy.

      I think you're correct about collecting aggregate information across versions being hard. But I don't anticipate needing to do that very often, if at all.

      -sam

        But how do you make an old version current? Change it's primary key? Nasty. And if you don't do that, you'll have to know which version is live, and that means your queries won't be so simple becuase they'll need to look up which version is live first and won't be able to assume it's a simple key with no appended number. Sounds like a tangle to me. I would go with the two-part key.

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://173218]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others pondering the Monastery: (3)
As of 2024-04-24 20:22 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found