OT (Perhaps) File Locking

by Anonymous Monk
on Jun 03, 2002 at 16:31 UTC ( [id://171276]=perlquestion )

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

If I have multiple forked Perl processes on a UNIX box that need to append to the same file, would it be prudent to lock the file first, or is that really only an issue if the multiple processes are re-creating the file? I know this is kinda-sorta not that Perl-related, but any pointers you could provide as to where I could RTFM would be super.

Replies are listed 'Best First'.
Re: OT (Perhaps) File Locking
by Zaxo (Archbishop) on Jun 03, 2002 at 16:41 UTC

    By all means lock them. I'd recommend perldoc -f sysopen and Fcntl as starting points. Your system's man(2) docs will help clarify the details from Fcntl.

    After Compline,
    Zaxo

Re: OT (Perhaps) File Locking
by Moonie (Friar) on Jun 03, 2002 at 17:13 UTC
Re: OT (Perhaps) File Locking
by Fletch (Bishop) on Jun 03, 2002 at 17:08 UTC

    See also perldoc -q 'lock a file'.

    And in general, if you have more than one writer you want to lock.

Re: OT (Perhaps) File Locking
by derby (Abbot) on Jun 03, 2002 at 17:15 UTC
    AM,

    Well, it kinda depends. A "write" is atomic in Unix, so if you're appending a chunk of data with only one underlying write (as in a single print, printf, or write) you'll be okay (that is, that chunk of data will be appended as one piece of data to the file).

    There are several reasons you may need locking:

    • If you're spreading your output across several writes and you want those several to stay "clumped" in the file
    • There's going to be some type of "read" dependency where you're reading and writing at the same time and you want to block out writers while reading.

    Other than that, append to your heart's content without worrying about "losing" something (you won't).
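A minimal sketch of the "one underlying write" case described above, assuming a Unix-like system and a local (non-NFS) file. The helper name append_record is hypothetical. It opens the file with O_APPEND and uses syswrite, which bypasses Perl's buffering, so each record goes out as a single write call:

```perl
use strict;
use warnings;
use Fcntl qw(O_WRONLY O_APPEND O_CREAT);

# Hypothetical helper: append one record using a single write(2) call.
sub append_record {
    my ($path, $record) = @_;

    # O_APPEND makes the kernel position every write at the current end of file.
    sysopen(my $fh, $path, O_WRONLY | O_APPEND | O_CREAT, 0644)
        or die "Cannot open $path: $!";

    # syswrite goes straight to the system call, with no Perl-level buffering.
    my $written = syswrite($fh, $record);
    die "Write to $path failed: $!" unless defined $written;
    die "Short write to $path" unless $written == length($record);

    close($fh) or die "Cannot close $path: $!";
    return $written;
}
```

This only covers small records written in one call; output spread across several writes still needs a lock to stay together.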

    -derby

    update: Just one caveat: if the file you're appending to is on an NFS partition, you do want locking for appending. (Commonsensical enough: the local kernel can only control its own writes, not those of other machines.)

      A write may be atomic, but prints and printfs go via the stdio layer (or in newer perls, via the perlio layer), which performs buffering and does writes as much as possible in fixed-size blocks (typically 1, 2 or 8k). That could mean a single print is actually performed as two writes. Or two prints will be done as one write.

      If you want to be safe, you always do: lock, seek, print, unlock.
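The lock, seek, print, unlock sequence might look like the sketch below. The helper name locked_append is hypothetical, and the explicit flush before unlocking is an added precaution so that buffered data cannot be written after the lock has been released:

```perl
use strict;
use warnings;
use Fcntl qw(:flock SEEK_END);
use IO::Handle;

# Hypothetical helper: append several lines as one uninterrupted clump.
sub locked_append {
    my ($path, @lines) = @_;

    open(my $fh, '>>', $path) or die "Cannot open $path: $!";

    # Block until we hold an exclusive advisory lock.
    flock($fh, LOCK_EX) or die "Cannot lock $path: $!";

    # Another process may have appended while we waited for the lock.
    seek($fh, 0, SEEK_END) or die "Cannot seek in $path: $!";

    print {$fh} @lines or die "Cannot write to $path: $!";

    # Push buffered data to the file while the lock is still held.
    $fh->flush or die "Cannot flush $path: $!";

    flock($fh, LOCK_UN) or die "Cannot unlock $path: $!";
    close($fh) or die "Cannot close $path: $!";
    return;
}
```

Note that flock is advisory: it only helps if every process appending to the file takes the same lock.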

      As for pointers, consult Stevens' "Advanced Programming in the UNIX Environment", IMO a more useful book for a Perl programmer than all "Perl book"s combined. After all, where C is portable assembler, Perl is portable UNIX.

      Abigail

        If you haven't "disabled" buffering [via HANDLE->autoflush(1), for example], then your advice to lock won't really help much either (except that if you use Perl's own flock, it will flush the buffer before unlocking -- on modern versions of Perl).

        derby was correct: If you are writing relatively small amounts of data in a single operation (such as via a single print or printf, for example) and have the file opened for append access under Unix, then locking is not needed but disabling of buffering is needed. If you are using multiple operations or writing large amounts of data, then you need locking but you also need to disable buffering.

        Disabling buffering is usually done via autoflush (or $| and 1-argument select) but you can get the same effect by using Perl's flock and requiring a modern version of Perl (5.004 or later).
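As a rough illustration of disabling buffering on an append handle, here is a sketch using autoflush from IO::Handle. The helper name open_unbuffered_append is hypothetical. The older idiom mentioned above, $| with 1-argument select, is shown in a comment:

```perl
use strict;
use warnings;
use IO::Handle;

# Hypothetical helper: open a file for append with buffering disabled,
# so each print is flushed to the file as soon as it completes.
sub open_unbuffered_append {
    my ($path) = @_;

    open(my $fh, '>>', $path) or die "Cannot open $path: $!";
    $fh->autoflush(1);

    # Equivalent older idiom:
    #   my $old = select($fh); $| = 1; select($old);

    return $fh;
}
```

A caller would then write each small record with a single print, for example print {$fh} "pid $$ finished\n", and skip the lock. Large records or multi-print output would still need locking as well.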

        Update: Yes, I assumed Perl, but I didn't assume flock, especially since you didn't even mention it (there are other, often better, ways to lock files in Perl than flock). The reason I said "Perl's flock" was to prevent people from getting the impression that the underlying flock() is what does the buffer flushing.

        And if I'm going to write a program that relies on a Perl-specific aspect of flock, then I tend to document this as I would not be surprised to see a Perl routine converted to C for any number of reasons. Since this feature is documented as not always having been true even in Perl, a require 5.004; makes a great place to put such documentation. But my point was more that I'd rather not rely on this feature and instead put the autoflush in place so that a change in locking method doesn't break things.

        And, yes, when answering questions, I do consider it important to note things that won't work on older versions of Perl. After all, everything covered here works even in Perl4 except for flock flushing buffers. And this lack would not be easy to notice if you ended up running this code on an old version of Perl, making it more important to point out.

                - tye (but my friends call me "Tye")
        And to mention a trap: the unlock step there is only safe if you are not depending on data read from the file. If you are, then you shouldn't explicitly unlock the file; rather, process as quickly as possible, close it, and forget what you read. I don't know where it was, but merlyn wrote a note or two to that effect here once upon a time.
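One way to picture that trap is a read-modify-write on a counter file. This is a sketch only, and the helper name bump_counter and the one-number file format are assumptions for illustration. The lock is taken before reading and is never released explicitly; closing the handle flushes the new value and drops the lock in one step:

```perl
use strict;
use warnings;
use Fcntl qw(:flock O_RDWR O_CREAT SEEK_SET);

# Hypothetical helper: increment a counter stored as one number in a file.
sub bump_counter {
    my ($path) = @_;

    sysopen(my $fh, $path, O_RDWR | O_CREAT, 0644)
        or die "Cannot open $path: $!";
    flock($fh, LOCK_EX) or die "Cannot lock $path: $!";

    # Read the current value while holding the lock; empty file counts as 0.
    my $line  = <$fh>;
    my $count = (defined $line && $line =~ /^(\d+)/) ? $1 : 0;
    $count++;

    # Rewrite the file in place, still under the same lock.
    seek($fh, 0, SEEK_SET) or die "Cannot seek in $path: $!";
    truncate($fh, 0)       or die "Cannot truncate $path: $!";
    print {$fh} "$count\n" or die "Cannot write to $path: $!";

    # No explicit unlock: close flushes the data and releases the lock.
    close($fh) or die "Cannot close $path: $!";
    return $count;
}
```

After the close, the value that was read should be treated as stale; anything that needs the current count has to lock and read again.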

        Makeshifts last the longest.
