Re: Locking database

Other people have explained locking, but no one has answered your direct question. So I'll take a shot.

No, that's not a particularly good way to lock a database, because it has a race condition. Suppose you have two processes vying to open the file. The first one spends its timeslice executing the loop in lines 1 and 2, then is suspended in favor of the second process. The second process starts, hits line 3 and opens the database. It takes a while to open the file, so it uses its timeslice up and control returns to the first process.

The first process has already finished its check for the semaphore file, so it does not check again and proceeds to open the database itself. Uh oh.

The documentation for DB_File has a nice extended example of locking one of these databases. It's much too long to post here, but it's a good example. (The solution is an atomic locking operation. It gets the name because it opens and locks the file in a single, uninterruptible step.)
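To make "atomic" concrete, here is a minimal sketch of a test-and-create done in one step. It is not the DB_File example, just an illustration, and it is written in Python for brevity; Perl's sysopen with O_CREAT|O_EXCL should behave analogously. The helper names try_acquire and release and the file name db.lock are made up for this sketch.

```python
import os
import tempfile

def try_acquire(lock_path):
    """Create the lock file atomically; return False if it already exists."""
    try:
        # O_CREAT | O_EXCL makes "check that it is absent" and "create it"
        # a single system call, so no other process can slip in between.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    os.close(fd)
    return True

def release(lock_path):
    """Give up the lock by removing the lock file."""
    os.remove(lock_path)

# Hypothetical lock file in a scratch directory.
lock_path = os.path.join(tempfile.mkdtemp(), "db.lock")

first = try_acquire(lock_path)   # we create the file, so we hold the lock
second = try_acquire(lock_path)  # file already exists, so this attempt fails
release(lock_path)
third = try_acquire(lock_path)   # lock is free again after release
release(lock_path)
```

Compare this with a loop that first tests for the file and then creates it in a separate statement: the gap between those two statements is exactly where the second process gets in. One drawback of this particular approach is that a process that dies while holding the lock leaves a stale lock file behind.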


Do not flock dbm filehandles!
by tilly (Archbishop) on Feb 18, 2001 at 08:56 UTC

There are two basic problems with flocking the filehandle of a DB_File database. The first is that upon opening the database, the first page is read into memory. This happens before any possibility of flocking, and if that page winds up being modified by another process before you get your lock, you can get database corruption.

The second problem is that with more recent versions of Berkeley DB, the library may close and reopen the database file for internal reasons. (IIRC sendmail will cause it to do this.) When that happens, you lose the flock and there is no way for you to know that it happened.

The two basic solutions are either to synchronize all of your locks through an external means (e.g. by flocking a semaphore file) or to use the newer BerkeleyDB module, which gives you access to Berkeley DB's internal locking functions. Those not only avoid the above problems, they also allow for fine-grained locks to reduce contention between programs.
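The first of those solutions might look roughly like the sketch below. It is an illustration only, written in Python for brevity (Perl's flock on a filehandle opened on a separate lock file works along the same lines), and it assumes a Unix-like system where flock is available. The names database_lock, is_locked and mydb.lock are invented for the example.

```python
import fcntl
import os
import tempfile
from contextlib import contextmanager

@contextmanager
def database_lock(lock_path, exclusive=True):
    """Hold a flock on a separate semaphore file, never on the database."""
    fd = os.open(lock_path, os.O_CREAT | os.O_RDWR, 0o644)
    try:
        # Blocks until the lock is granted.
        fcntl.flock(fd, fcntl.LOCK_EX if exclusive else fcntl.LOCK_SH)
        yield  # open, use and close the database inside this block
        fcntl.flock(fd, fcntl.LOCK_UN)
    finally:
        os.close(fd)  # closing the descriptor also drops the lock

def is_locked(lock_path):
    """Report whether an exclusive lock could NOT be taken right now."""
    fd = os.open(lock_path, os.O_CREAT | os.O_RDWR, 0o644)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        return True
    else:
        fcntl.flock(fd, fcntl.LOCK_UN)
        return False
    finally:
        os.close(fd)

# Hypothetical semaphore file sitting next to the database.
lock_path = os.path.join(tempfile.mkdtemp(), "mydb.lock")

with database_lock(lock_path):
    held_inside = is_locked(lock_path)  # a second opener is refused
held_after = is_locked(lock_path)       # lock is free once the block exits
```

The point of the design is ordering: take the lock on the semaphore file first, only then open the database, and close the database before releasing the lock. That way the first page is never read before you hold the lock, and it does not matter if the library closes and reopens the database file behind your back. It only works if every program that touches the database follows the same protocol.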

For the record, the fault for the bad advice rests squarely upon the folks at Sleepycat who were involved in the 1.x series of Berkeley DB. In that series they recommended grabbing the file descriptor that Berkeley DB was using and flocking that. It has been many years since they realized the various reasons why it was a bad idea to have people rely on such aspects of the library's internal behaviour, but the old bad advice just keeps on floating around...

UPDATE
I misremembered the advice as being repeated in the Cookbook. It is not. However, it is repeated in perlfunc's documentation for flock and hence in the Camel.

UPDATE 2
If anyone is confused by the explanation, search for "flock" in the current documentation for DB_File and read a fuller explanation.
