PerlMonks  

Order of flock and open

by svsingh (Priest)
on Apr 30, 2003 at 16:08 UTC

svsingh has asked for the wisdom of the Perl Monks concerning the following question:

I'm updating a program I wrote that reads from one file and writes to another, and then finally renames the read file with the write file. Given the risks in doing something like this, I started searching for ways to lock files and found the flock command.

Every example I've seen for flock locks the file after opening it. While I understand you need to open the file to get the filehandle, this seems risky to me. What if another request sneaks in between the open and flock commands? Ditto for unlocking and closing the handle.

Am I just being overly paranoid? Are there safeguards built into Perl to prevent something from happening to the file between the open and flock commands?

Thank you.

Replies are listed 'Best First'.
Re: Order of flock and open
by Thelonius (Priest) on Apr 30, 2003 at 16:28 UTC
    What I have done in similar cases is lock a different file. E.g. before opening "ImportantData.txt", my programs first open and lock "ImportantData.lock" and hold it until they are really done with the file "ImportantData.txt".
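    A minimal sketch of this semaphore-file approach (the file names follow the example above; the data-file write is illustrative):

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Fcntl qw(:flock);

my $data = 'ImportantData.txt';
my $sem  = 'ImportantData.lock';

# Take the lock on the sentinel file first and hold it for the whole job.
open my $lock, '>>', $sem or die "can't open $sem: $!";
flock $lock, LOCK_EX      or die "can't lock $sem: $!";

# Now the data file can be created, rewritten, or renamed safely;
# every cooperating process contends on the .lock file instead.
open my $out, '>', $data or die "can't write $data: $!";
print $out "updated under lock\n";
close $out;

close $lock;   # dropping the handle releases the lock
```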

    Updated:Checking back, I see a lot of people tell you "no problem", "don't worry", etc. Wrong! You are right to be concerned. They do not understand the issues.

    Example: have two processes which are each going to add 1 to the value in the file.

    Process 1 opens file IN, "<in.txt"
    Process 1 locks file IN
    Process 1 opens file OUT, ">out.txt"
    Process 1 locks file OUT
    Process 2 opens file IN, "<in.txt"
    Process 2 tries to lock IN and waits
        (The vulnerable window is greatly reduced if you use LOCK_NB, but
        not eliminated. But then you have to close and reopen IN in a busy
        loop.)
    Process 1 reads the value "100" from IN
    Process 1 writes the value "101" to OUT
    In any order (or atomically):
        Process 1 closes IN (thereby unlocking it)
        Process 1 closes OUT (thereby unlocking it)
        Process 1 renames "out.txt" to "in.txt"
    Process 2 acquires lock on file IN
    Process 2 opens file OUT, ">out.txt"
    Process 2 locks file OUT
    Process 2 reads the value "100" from IN
    Process 2 writes the value "101" to OUT
    In any order (or atomically):
        Process 2 closes IN (thereby unlocking it)
        Process 2 closes OUT (thereby unlocking it)
        Process 2 renames "out.txt" to "in.txt"
    Result: file "in.txt" ends up with the value "101" instead of the correct "102"

    If you use the default blocking lock, you have a huge vulnerability. You not only have to worry about another process opening the file between your open and lock, you have to worry about any process opening it from the time you lock it until the rename is completed!

      You are absolutely right -- you must watch out in this case. Most of the other replies don't bring up the fact that you must lock the entire process (read/write/rename). If you do this with a separate file, it is rather straightforward.

      UPDATE: merlyn is right that the "code" below is fishy and should not be used. Take it as an example of what can go wrong if you try to get too tricky with locking. There are almost always conditions you will overlook (as I have).

      That said, I believe you can do this by locking both files, but you must pay attention to the order you perform the operations. You should be able to do the following in order...

      open and lock the input file
      open and lock the output file
      read the input file
      write the output file
      rename the input file
      rename the output file
      close both files (which unlocks them)
      delete the old renamed input file

      I'm not sure this works on all operating systems, but should work for Unix/Linux. You must use a non-blocking lock for the input file and you must close/reopen it in a loop if it is already locked. This is because someone can completely replace the input file before you get the lock for it and then your open file descriptor is pointing to the old stale (possibly deleted) input file. You should probably also use unique names (i.e. that don't collide with other processes) for the output file and the old input file.

      This all seems rather complicated, which is why locking a separate file is much simpler. But, YMMV...

      bluto

      This technique of locking a separate file before opening the real data file is called locking with semaphore files. Thelonius++

      Arjen

Re: Order of flock and open
by converter (Priest) on Apr 30, 2003 at 16:46 UTC

    Since flock() can fail after a successful open(), you should make sure that your open() isn't destructive. In other words, if you're opening a file for writing, open it in such a way that if the file exists and contains data it won't be truncated.

    Check out the links to Dominus' File Locking Tricks and Traps slides in the Question about Flock and die thread for more details.
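    One way to follow that advice (a sketch; 'data.txt' and its contents are hypothetical) is to open the file read/write with '+<', which never truncates, and only truncate once the lock is held:

```perl
use strict;
use warnings;
use Fcntl qw(:flock);

my $file = 'data.txt';   # hypothetical file name

# Seed the file so this sketch is self-contained; '+<' requires it to exist.
unless (-e $file) {
    open my $seed, '>', $file or die "can't create $file: $!";
    print $seed "old contents\n";
    close $seed;
}

# '+<' opens read/write WITHOUT truncating, so a failed flock
# leaves the existing data untouched.
open my $fh, '+<', $file or die "can't open $file: $!";
flock $fh, LOCK_EX or die "can't lock $file: $!";

# Only now, with the lock held, is it safe to clobber the contents.
truncate $fh, 0 or die "can't truncate $file: $!";
seek $fh, 0, 0 or die "can't seek: $!";
print $fh "new contents\n";
close $fh;   # flushes, then releases the lock
```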

Re: Order of flock and open
by ferrency (Deacon) on Apr 30, 2003 at 16:50 UTC
    Other posters are mostly correct: Most of the time, you don't need to worry about the fact that there's a delay between the open() and flock() calls. But there are some cases you should be aware of which require more careful consideration.

    One issue is if you're opening a file for something other than reading. If you open a file for append, you should seek() to the end of the file after you acquire the lock, to make sure no one else appended to the file after you opened it.

    open my $FH, ">>foo" or die "can't open";
    flock $FH, LOCK_EX or die "no lock";
    seek($FH, 0, 2);
    If you're opening a file for destructive write, you have a bigger problem. Opening destroys the file, but you can't lock the file until you open it. One solution to this is to lock a different file, as someone else described. Another solution is to open the file twice: open it once for reading, lock it, and then open it for writing once you know you have the lock. As long as you keep the first filehandle open, you'll keep the lock.

    open my $LOCK, "<foo" or die "can't open for locking";
    flock $LOCK, LOCK_EX or die "Can't lock";
    open my $FH, ">foo" or die "can't open for writing";
    print $FH "I own this file\n";
    close $FH;
    close $LOCK;
    Finally, there is the situation you describe, where you want to lock a file, rebuild it in a temporary file, and then copy the tempfile over the real file. The issue here is that, on most filesystems, when you use rename() or `mv` to replace the real file with the newly built tempfile, you lose your lock on the file. If this is okay (you're done with the file when you copy it over), then it's probably not a problem. But if you want to perform any other operations on the file while it's still locked, you probably need to use the "lock a different file" technique.
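    Combining that technique with the rebuild-and-rename pattern (a sketch using the add-one-to-a-counter example from earlier in the thread; the file names are illustrative):

```perl
use strict;
use warnings;
use Fcntl qw(:flock);

my $file = 'in.txt';
my $tmp  = "out.txt.$$";   # unique temp name per process

# Seed the counter file if absent, just so the sketch runs standalone.
unless (-e $file) {
    open my $seed, '>', $file or die "can't create $file: $!";
    print $seed "100\n";
    close $seed;
}

# Hold the separate lock file across read, write, AND rename.
open my $lock, '>>', "$file.lock" or die "can't open lock file: $!";
flock $lock, LOCK_EX or die "can't lock: $!";

open my $in, '<', $file or die "can't read $file: $!";
chomp(my $n = <$in>);
close $in;

open my $out, '>', $tmp or die "can't write $tmp: $!";
print $out $n + 1, "\n";
close $out or die "can't close $tmp: $!";

# rename() is atomic on POSIX filesystems; no reader ever sees
# a half-written file, and the lock is still held here.
rename $tmp, $file or die "can't rename: $!";

close $lock;   # release only after the rename is complete
```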

    Alan

      If you're opening a file for destructive write, you have a bigger problem. Opening destroys the file, but you can't lock the file until you open it. One solution to this is to lock a different file, as someone else described. Another solution is to open the file twice: open it once for reading, lock it, and then open it for writing once you know you have the lock. As long as you keep the first filehandle open, you'll keep the lock.
      You keep the lock on the now-dead file, but you have a new file that exists but might not yet be fully written. And someone can come along and flock that! Now there are two flocks active. Repeat ad nauseam, and you get an infinite number of flocks.

      So, no, that's not the way to do it. Others have posted the proper way. Just wanted to point out that yours is flawed, so erase that part of your brain please. {grin}

      -- Randal L. Schwartz, Perl hacker
      Be sure to read my standard disclaimer if this is a reply.

        My experiences are probably platform-dependent (these are on FreeBSD). However, they offer empirical evidence that on my preferred platform, opening the file for write a second time opens the same file, not a new file, and subsequent lock attempts on the write filehandle fail.

        Example:

        % perl -MFcntl -e 'open my $f, "<foofile"; flock $f, LOCK_EX; open my $g, ">foofile"; flock $g, LOCK_EX|LOCK_NB or die "No lock\n";'
        No lock
        %
        In summary: please don't follow my original advice. Merlyn is probably right in the general case, and will probably even find a good explanation as to why I've fooled myself into believing what I do. I'll add a caveat to the particular portion of my brain which holds this information, but I'm unlikely to erase it completely :)

        Alan

Re: Order of flock and open
by hardburn (Abbot) on Apr 30, 2003 at 16:12 UTC

    flock requires a filehandle in its argument list. How do you get a filehandle without opening it first?

    I don't think it's as dangerous as it seems. By default, flock blocks execution until the lock is actually obtained, so even though the file is open, you can't do anything potentially dangerous until you already have the lock. Someone more familiar with low-level systems stuff would have to confirm this, though.

    ----
    I wanted to explore how Perl's closures can be manipulated, and ended up creating an object system by accident.
    -- Schemer

    Note: All code is untested, unless otherwise stated

      This highlights a difference of perception that is important: many people read flock() as "protect me from all those other bad programs out there," when it's actually intended as more of a "protect any of the other programs from the ruckus I'm about to cause."

      --
      [ e d @ h a l l e y . c c ]

Re: Order of flock and open
by perlplexer (Hermit) on Apr 30, 2003 at 16:34 UTC
    What if another request sneaks in between the open and flock commands?
    As long as all applications that manipulate data in this file use the Fcntl locking API (flock(), fcntl()), you'll be OK. Just make sure you're checking the return values of all system calls.

    Ditto for unlocking and closing the handle
    If you don't need the file handle after you unlock it, you can simply close() it. Perl flushes the buffers for you before unlocking the file in either case. But you can always do it yourself if you so desire.
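    For example (a minimal sketch; 'log.txt' is illustrative), the two endings are equivalent:

```perl
use strict;
use warnings;
use Fcntl qw(:flock);

open my $fh, '>', 'log.txt' or die "can't open: $!";
flock $fh, LOCK_EX or die "can't lock: $!";
print $fh "entry\n";

# Either ending works: an explicit unlock...
# flock $fh, LOCK_UN;
# ...or simply closing, which flushes the buffer and releases the lock.
close $fh or die "can't close: $!";
```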

    --perlplexer
Re: Order of flock and open
by thor (Priest) on Apr 30, 2003 at 18:00 UTC
    While all of the other posters are correct, there is one thing that I'd like to mention. Locks are advisory in much the same way that stoplights are advisory. Nothing keeps someone from ignoring a stoplight and plowing into an intersection. Similarly, if you have a process that pays no heed to locks, you'll be in for a nasty surprise if you were relying on the locks to protect your data.
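    A short demonstration of the advisory nature (a sketch; 'shared.txt' is illustrative): a handle that never calls flock writes straight past an exclusive lock.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);

# One handle takes an exclusive lock...
open my $locked, '>', 'shared.txt' or die "can't open: $!";
flock $locked, LOCK_EX or die "can't lock: $!";

# ...but a second handle that never calls flock is not stopped at all.
open my $rogue, '>>', 'shared.txt' or die "can't open: $!";
print $rogue "ran the stoplight\n";
close $rogue;

close $locked;
```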

    thor

      Some OSes (such as Windows) have file locking at the OS level. Those locks are not advisory; they are enforced for all handles, period.

      For example, under the Win32 API the LockFile function will prevent any other process from accessing the locked region of the file, period. The other process doesn't have to call LockFile and stop itself; any read or write through a handle open to that file that touches the restricted region will simply fail.

      Note that Linux does offer kernel-enforced mandatory locking. See /usr/src/linux/Documentation/mandatory.txt for details.

      Makeshifts last the longest.

        I'd like to ask: are the two mandatory file-locking methods mentioned above used by Perl's flock call? If so, that'd be good for portable code. If not, you've got a big kludge in which you have to examine the operating system to determine what to use.

        thor

Re: Order of flock and open
by theguvnor (Chaplain) on Apr 30, 2003 at 23:41 UTC

    I'm surprised no one has pointed you to sysopen(), rather than the simpler open() function.

    From the perlopentut tutorial:

    To get an exclusive lock, typically used for writing, you have to be careful. We sysopen the file so it can be locked before it gets emptied. You can get a nonblocking version using LOCK_EX | LOCK_NB.
    use 5.004;   # make sure you have at least this level of perl
    use Fcntl qw(:DEFAULT :flock);

    sysopen(FH, "filename", O_WRONLY | O_CREAT)
        or die "can't open filename: $!";
    flock(FH, LOCK_EX)
        or die "can't lock filename: $!";
    truncate(FH, 0)
        or die "can't truncate filename: $!";
    # now write to FH

    [Jon]

      Personally, I like LockFile::Simple.

      Works great for me...
      use LockFile::Simple qw(lock trylock unlock);

      $lockmgr = LockFile::Simple->make(-format => '%f.lck',
                                        -max => 20, -delay => 1, -stale => 1);

      # first program to get the lock on the file gets the chance to run
      if ($lockmgr->trylock("$eventsLogFile")) {
          my @bytes = split(//, $serial_data);
          $lockmgr->unlock("$eventsLogFile");
      }
Re: Order of flock and open
by atnonis (Monk) on Apr 30, 2003 at 16:39 UTC
    you can also try something:
    use Fcntl;
    open FILE, $file, O_EXLOCK
        or die "Cannot open or lock $file. $!\n";
    to lock the file exactly when you open it

    Antonis!
      I'm wondering on what OS you're using that. It fails for me on both Linux and FreeBSD.

      Arjen

Node Type: perlquestion [id://254354]
Approved by broquaint
Front-paged by broquaint