http://qs321.pair.com?node_id=895602

pileofrogs has asked for the wisdom of the Perl Monks concerning the following question:

Greetings Monks!

I'm a sysadmin who runs a few redundant web servers. I've created a lot of tools and systems to help me manage my beasties, and I'm trying to create an overall manager of them.

The key is, I need it to be super-uper-duper reliable, so I need to keep track of what I'm doing, what I've done and what I plan to do. I need this information to be written atomically and I want to be able to look back at previous state. The idea being, the machine could crash at any time in any sequence and I want to be able to piece things together. That will actually be easier than it sounds, but right now I'm planning this atomic history config thingy...

Here's an example: Say a server is in ACTIVE mode and I want to transition to RESERVE mode. I'd want to write down somewhere something like:

STATE=ACTIVE CHANGING_TO=RESERVE CHANGE_STEP=TELL_OTHER_SERVER

...as you can see, if the server crashed in the middle of this, I'd need this file to be accurate enough to piece things together and finish the job.

Okay, here's the actual question: Has this or a large piece of this already been done? I don't want to re-invent the wheel.

I was thinking I'd write a JSON/YAML blob to a file as the "current" state and keep a log of all previous states as a file containing a whole bunch of these JSON blobs. But if someone's already written something that does what I want, I'd rather go with that.
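A minimal sketch of what I have in mind (the file names and the state hash are made up for illustration): append each blob to a history log, then swap the "current" file into place with a write-to-temp-and-rename. The rename is atomic with respect to readers on POSIX filesystems, though as the replies below point out, that says nothing about what actually hit the platter.

    #!/usr/bin/perl
    use strict;
    use warnings;
    use JSON::PP ();                    # core module since Perl 5.14
    use File::Temp qw(tempfile);

    my $state = {
        STATE       => 'ACTIVE',
        CHANGING_TO => 'RESERVE',
        CHANGE_STEP => 'TELL_OTHER_SERVER',
        stamp       => time,
    };
    my $blob = JSON::PP->new->canonical->encode($state) . "\n";

    # Append to the history log first, so nothing is ever lost from it.
    open my $hist, '>>', 'state.history' or die "history: $!";
    print {$hist} $blob;
    close $hist or die "history: $!";

    # Then replace the "current" file via write-to-temp + rename.
    my ($tmp_fh, $tmp_name) = tempfile('state.XXXXXX', DIR => '.');
    print {$tmp_fh} $blob;
    close $tmp_fh or die "tmp: $!";
    rename $tmp_name, 'state.current' or die "rename: $!";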

I was also thinking this sounds a lot like a transactional database with transaction history BUT I really don't want to do that because this needs to be able to work even if major components, like a database server, are down.

Okay, hopefully that makes sense. I'm looking for something to atomically store arbitrary data and also have a history of all previous revisions (hey, maybe a versioning system should come into this...). I just want to avoid re-inventing the wheel.
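On the versioning-system idea, a hypothetical sketch of the laziest possible version: keep state.current in a local git repository and commit every change. The repository path and commit message are made up, and git commit exits non-zero when nothing changed, so a real tool would need to handle that case.

    use strict;
    use warnings;

    sub record_state {
        my ($msg) = @_;
        my $repo = '/var/lib/mymanager';              # hypothetical path
        system('git', '-C', $repo, 'add', 'state.current') == 0
            or die "git add failed";
        system('git', '-C', $repo, 'commit', '-q', '-m', $msg) == 0
            or die "git commit failed";
    }

    record_state('ACTIVE -> RESERVE: tell other server');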

Thanks!
Pileofrogs

Replies are listed 'Best First'.
Re: Atomic Config Updater with History?
by BrowserUk (Patriarch) on Mar 25, 2011 at 22:01 UTC

    Atomic IO is as rare as rocking horse do-do.

    If you write to a local text file, your blob may (and frequently will) cross a disk block boundary. Therefore, the start of the blob may get written to disk as part of one 4k disk block, and the end of it as part of another. If the machine goes down between the two writes, the update is not atomic.

    Compound that with the fact that all modern OSs use transparent file caching. Even once you've "written to disk", you've often only written to the cache, and if something crashes, what you think you've already written can get lost.

    And unless your file-system allows you to make your log file contiguous, it is quite possible that, due to write reordering, the second block in the first scenario gets written before the first. If the interruption occurs at an inappropriate time, you have the end of a blob but not its start.

    If you are prepared to bypass Perl & your CRT lib, then your OS might provide write-thru file handling APIs. If you use these, synchronous IO, and write 4K blocks every time, you can achieve something approaching atomic. But still, disk heads do occasionally crash mid-block.
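    From Perl you can get part of the way there without dropping to the OS API directly, by opening with O_SYNC and writing fixed-size records. A hypothetical sketch (whether O_SYNC is available, and what it really guarantees, depends on the OS, filesystem, and the disk's own cache):

        use strict;
        use warnings;
        use Fcntl qw(O_WRONLY O_CREAT O_APPEND O_SYNC);

        # Open for synchronous appends: each syswrite should reach the device
        # before the call returns (subject to the caveats above).
        sysopen my $fh, 'state.log', O_WRONLY | O_CREAT | O_APPEND | O_SYNC, 0644
            or die "open: $!";

        my $blob = '{"STATE":"ACTIVE","CHANGING_TO":"RESERVE"}';
        my $rec  = pack 'A4096', $blob;          # pad the record to one 4K block

        syswrite($fh, $rec) == 4096 or die "short write: $!";
        close $fh or die "close: $!";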

    If your file is on a remote system, the transmission protocol (TCP/IP or whatever) is free to, and frequently will, aggregate and/or break up writes in order to form transmission packets optimised for the comms fabric. And that can happen multiple times if the transmissions cross fabric boundaries (eg. cat5 to fiber and back; or 54Mb/s to 1Gb/s; etc) in the course of their journey.

    The point is, that if you really need total reliability under any (well most at least) circumstances, then you need to stop thinking "atomic" and start thinking Two Phase Commit.

    Personally, I think a transactional DB is your best bet. INSERT the message saying what you are about to do; do it; then commit.
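    A minimal sketch of that pattern using DBI with SQLite (the table, its columns, and the tell_other_server step are all made up for illustration):

        use strict;
        use warnings;
        use DBI;

        my $dbh = DBI->connect('dbi:SQLite:dbname=state.db', '', '',
                               { RaiseError => 1, AutoCommit => 1 });
        $dbh->do(q{CREATE TABLE IF NOT EXISTS transitions (
            id INTEGER PRIMARY KEY AUTOINCREMENT, stamp TEXT, note TEXT)});

        $dbh->begin_work;                        # open the transaction
        $dbh->do('INSERT INTO transitions (stamp, note) VALUES (?, ?)',
                 undef, scalar localtime, 'ACTIVE -> RESERVE: tell other server');
        tell_other_server();                     # the real work goes here
        $dbh->commit;                            # only now is the record durable

        sub tell_other_server { warn "pretend we told the peer\n" }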


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      All of what BrowserUk says is true, although most journaled filesystems will limit your liability when using regular files. Couple this with (at least on Linux) sync-ed writes (which you *don't* want to do a lot of, as they are dreadfully slow), and you might get by. A transactional DB is better, but you do have to remember that the two-phase commit is designed to ensure that multiple operations on the DB itself are either all done, or not done (ie rolled back). When part of what you are trying to 'commit' has nothing to do with the database (ie, transition a server to a new state), then you are still not atomic. In BrowserUk's example, if you
      1. INSERT the command msg
      2. perform the command
      3. commit the INSERT
      but the system crashes before step 3 completes, the DB will roll back the INSERT, but it has no knowledge of the command you performed. You would have to take the additional step of looking at the DB's transaction log (which many DBs allow you to record in a readable format). Upon crash recovery, if you see a 'command rollback', you would want to check the state of the execution of that command, and try to 'roll that back' too...
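      One way to sidestep reading the transaction log (a hypothetical variant, not what BrowserUk proposed): commit the intent before performing the command, mark it done afterwards, and let crash recovery find anything still pending with an ordinary SELECT. Table and column names are made up.

          use strict;
          use warnings;
          use DBI;

          my $dbh = DBI->connect('dbi:SQLite:dbname=state.db', '', '',
                                 { RaiseError => 1, AutoCommit => 1 });
          $dbh->do(q{CREATE TABLE IF NOT EXISTS intents (
              id INTEGER PRIMARY KEY AUTOINCREMENT, note TEXT, status TEXT)});

          # 1. Durably record what we are about to do, *before* doing it.
          $dbh->do(q{INSERT INTO intents (note, status) VALUES (?, 'pending')},
                   undef, 'ACTIVE -> RESERVE: tell other server');
          my $id = $dbh->last_insert_id(undef, undef, 'intents', 'id');

          # 2. Perform the external command (it must be safe to repeat).
          tell_other_server();

          # 3. Mark it done.
          $dbh->do(q{UPDATE intents SET status = 'done' WHERE id = ?}, undef, $id);

          # On startup, anything still 'pending' was interrupted and needs checking.
          my $stuck = $dbh->selectall_arrayref(
              q{SELECT id, note FROM intents WHERE status <> 'done'});

          sub tell_other_server { warn "pretend we told the peer\n" }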

      fnord

        Sync'd writes (and write-thru) can be slow due to the absence of caching, but so are most journalled file-systems.

        When part of what you are trying to 'commit' has nothing to do with the database (ie, transition a server to a new state), then you are still not atomic. In BrowserUk's example,

        Agreed. That example only works if repeating the performed-but-unlogged command is effectively a no-op.

        Mind you, breaking processing up into steps such that any given step can be repeated 2 or more times without affecting the overall result is something of a black art in itself. The basic steps are (see the sketch below):
        a) Don't discard the source data for a given step until the output data for that step has been successfully processed by the next step.
        b) Discard any source data for this step that is 'incomplete'. Sentinel values are useful for this.
        c) Once the input data for this step--ie. the output of the previous step--has been successfully processed, delete the associated input data to the previous step.
        Of course, in critical systems, 'delete' is probably spelt 'move to archive'.
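        A hypothetical sketch of those steps, using directories and an atomic rename as the completion sentinel (all directory names are made up):

            use strict;
            use warnings;
            use File::Copy qw(copy move);

            # incoming/  input to this step      work/     incomplete output (discard on restart)
            # done/      completed output        archive/  inputs already handed on
            for my $in (glob 'incoming/*') {
                (my $name = $in) =~ s{.*/}{};
                next if -e "done/$name";          # already processed: repeating is a no-op

                copy($in, "work/$name") or die "copy: $!";               # stand-in for the real work
                rename "work/$name", "done/$name" or die "rename: $!";   # atomic "step finished" sentinel
                move($in, "archive/$name") or die "archive: $!";         # don't delete, archive
            }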


        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        "Science is about questioning the status quo. Questioning authority".
        In the absence of evidence, opinion is indistinguishable from prejudice.

      Yes! Thank you! That's exactly the kind of thing I needed to know. ++