PerlMonks  

Re: Stand-Alone CGI Frame Chat

by Zaxo (Archbishop)
on Jan 23, 2002 at 13:17 UTC ( [id://140856]=note )


in reply to Stand-Alone CGI Frame Chat

You should be locking your database files, and holding the lock for as short a time as possible.

After Compline,
Zaxo

(crazyinsomniac) Re^2: Stand-Alone CGI Frame Chat
by crazyinsomniac (Prior) on Jan 24, 2002 at 01:50 UTC
    I'm sure you meant to say "you should be locking a sentinel file".

    I draw from the DB_File pod (I don't recall where *else* I've seen info on this before, or whether it applies to AnyDBM_File, though I think it does), and I quote verbatim (as purported by pod2html):

    Locking: The Trouble with fd

    Until version 1.72 of this module, the recommended technique for locking DB_File databases was to flock the filehandle returned from the ``fd'' function. Unfortunately this technique has been shown to be fundamentally flawed (Kudos to David Harris for tracking this down). Use it at your own peril!

    The locking technique went like this.

        $db = tie(%db, 'DB_File', '/tmp/foo.db', O_CREAT|O_RDWR, 0644)
            || die "dbcreat /tmp/foo.db $!";
        $fd = $db->fd;
        open(DB_FH, "+<&=$fd") || die "dup $!";
        flock (DB_FH, LOCK_EX) || die "flock: $!";
        ...
        $db{"Tom"} = "Jerry" ;
        ...
        flock(DB_FH, LOCK_UN);
        undef $db;
        untie %db;
        close(DB_FH);

    In simple terms, this is what happens:

    1. Use ``tie'' to open the database.

    2. Lock the database with fd & flock.

    3. Read & Write to the database.

    4. Unlock and close the database.

    Here is the crux of the problem. A side-effect of opening the DB_File database in step 1 is that an initial block from the database will get read from disk and cached in memory.

    To see why this is a problem, consider what can happen when two processes, say ``A'' and ``B'', both want to update the same DB_File database using the locking steps outlined above. Assume process ``A'' has already opened the database and has a write lock, but it hasn't actually updated the database yet (it has finished step 2, but not started step 3 yet). Now process ``B'' tries to open the same database - step 1 will succeed, but it will block on step 2 until process ``A'' releases the lock. The important thing to notice here is that at this point in time both processes will have cached identical initial blocks from the database.

    Now process ``A'' updates the database and happens to change some of the data held in the initial buffer. Process ``A'' terminates, flushing all cached data to disk and releasing the database lock. At this point the database on disk will correctly reflect the changes made by process ``A''.

    With the lock released, process ``B'' can now continue. It also updates the database and unfortunately it too modifies the data that was in its initial buffer. Once that data gets flushed to disk it will overwrite some/all of the changes process ``A'' made to the database.

    The result of this scenario is at best a database that doesn't contain what you expect. At worst the database will corrupt.

    The above won't happen every time competing processes update the same DB_File database, but it does illustrate why the technique should not be used.
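
    The lost update described above can be reproduced with a toy model. This is only a sketch: plain hashes stand in for the on-disk block and for each process's cache, and the names (%disk, %cache_a, %cache_b, the "Spike" key) are made up for illustration. It does not touch DB_File at all.

```perl
use strict;
use warnings;

# Toy model: the "disk" holds one block; each process caches it when it ties.
my %disk = (Tom => 'unset', Spike => 'unset');

# Both processes open the database, so both cache the same initial block.
my %cache_a = %disk;
my %cache_b = %disk;

# Process A holds the lock, updates its cached copy, and flushes on exit.
$cache_a{Tom} = 'Jerry';
%disk = %cache_a;

# Process B now gets the lock, but still works from its stale cached copy.
$cache_b{Spike} = 'Tyke';
%disk = %cache_b;

# B's flush has silently thrown away A's change to "Tom".
print "Tom   => $disk{Tom}\n";
print "Spike => $disk{Spike}\n";
```

    The lock was honoured the whole time; the damage comes from B reading its block before it held the lock.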

    Safe ways to lock a database

    Starting with version 2.x, Berkeley DB has internal support for locking. The companion module to this one, BerkeleyDB, provides an interface to this locking functionality. If you are serious about locking Berkeley DB databases, I strongly recommend using BerkeleyDB.

    If using BerkeleyDB isn't an option, there are a number of modules available on CPAN that can be used to implement locking. Each one implements locking differently and has different goals in mind. It is therefore worth knowing the difference, so that you can pick the right one for your application. Here are the three locking wrappers:

    Tie::DB_Lock
    A DB_File wrapper which creates copies of the database file for read access, so that you have a kind of a multiversioning concurrent read system. However, updates are still serial. Use for databases where reads may be lengthy and consistency problems may occur.

    Tie::DB_LockFile
    A DB_File wrapper that has the ability to lock and unlock the database while it is being used. Avoids the tie-before-flock problem by simply re-tie-ing the database when you get or drop a lock. Because of the flexibility in dropping and re-acquiring the lock in the middle of a session, this can be massaged into a system that will work with long updates and/or reads if the application follows the hints in the POD documentation.

    DB_File::Lock
    An extremely lightweight DB_File wrapper that simply flocks a lockfile before tie-ing the database and drops the lock after the untie. Allows one to use the same lockfile for multiple databases to avoid deadlock problems, if desired. Use for databases where updates and reads are quick and simple flock locking semantics are enough.
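
    For what it's worth, the lockfile-before-tie idea can also be hand-rolled with nothing but DB_File and Fcntl. The following is a minimal sketch, not how DB_File::Lock is actually implemented; the update_db helper, the temporary directory, and the file names are assumptions made up for this example.

```perl
use strict;
use warnings;
use DB_File;
use Fcntl qw(:flock O_CREAT O_RDWR);
use File::Temp qw(tempdir);

# Hypothetical locations, chosen only for this example.
my $dir       = tempdir(CLEANUP => 1);
my $db_path   = "$dir/foo.db";
my $lock_path = "$dir/foo.db.lock";

# Hypothetical helper: one short, locked update per call.
sub update_db {
    my ($db_file, $lock_file, $key, $value) = @_;

    # 1. Lock the sentinel file first, before the database is opened.
    open(my $lock_fh, '>>', $lock_file) or die "open $lock_file: $!";
    flock($lock_fh, LOCK_EX)            or die "flock $lock_file: $!";

    # 2. Only now tie, so the initial block is read while the lock is held.
    my %db;
    tie(%db, 'DB_File', $db_file, O_CREAT|O_RDWR, 0644, $DB_HASH)
        or die "tie $db_file: $!";

    $db{$key} = $value;

    # 3. Untie (flushing to disk) before the lock is released.
    untie %db;
    close($lock_fh) or die "close $lock_file: $!";   # close drops the flock
}

update_db($db_path, $lock_path, Tom => 'Jerry');
```

    Because the tie and untie both happen inside the locked region, no process ever caches a block it read before it held the lock, and the lock is held only for the duration of one update.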

     
    ______crazyinsomniac_____________________________
    Of all the things I've lost, I miss my mind the most.
    perl -e "$q=$_;map({chr unpack qq;H*;,$_}split(q;;,q*H*));print;$q/$q;"
