Re: Best practices for modifying a file in place: q's about opening files, file locking, and using the rename function

by grep (Monsignor)
on Nov 03, 2006 at 02:25 UTC ( [id://581993] )


in reply to Best practices for modifying a file in place: q's about opening files, file locking, and using the rename function

Q1: In "The truly paranoid programmer would lock the file", which file are the authors referring to?
The $old file. This is the data that would get clobbered (assuming you are not using the same name for the temp $new file). BTW, I would name the files $orig and $tmp; that seems to make more sense.

Q2: Regarding the reason for being "truly paranoid" -- is this because we don't want another running instance of this script to be writing to $new while we are,
Nope. You're stuck on the $new temp file when the $old original file is what you should be concerned about. You should be using File::Temp to get a uniquely named $new temp file.
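
A minimal sketch of getting such a temp file with File::Temp. The helper name new_temp_for, the 'update' prefix, and the '.tmp' suffix are made up for illustration. Putting the temp file in the same directory as the original is an assumption that keeps a later rename on one filesystem.

```perl
use strict;
use warnings;
use File::Basename qw(dirname);
use File::Temp qw(tempfile);

# Hypothetical helper: create a uniquely named temp file next to $orig.
# Returns an open filehandle and the generated file name.
sub new_temp_for {
    my ($orig) = @_;
    my ($fh, $tmp) = tempfile(
        'updateXXXXXX',            # trailing X's are replaced with random characters
        DIR    => dirname($orig),  # same directory as the original
        SUFFIX => '.tmp',
        UNLINK => 0,               # we rename or remove it ourselves
    );
    return ($fh, $tmp);
}
```

Two instances of the script calling new_temp_for('foo.txt') at the same moment get two different temp files, so neither can overwrite the other's temp data.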

I'm not sure I completely understand the hazards of "clobbering." So...
It's when this happens:

UserA                    UserB                    Orig File
Open $orig                                        Original Content
    |
Reads $orig              Opens $orig
    |                        |
Modify $orig in Memory   Reads $orig
    |                        |
Write $orig to FS        Modify $orig in Memory   UserA Content
                             |
                         Write $orig to FS        UserB Content
There are two problems: UserA's changes last only a split second before UserB overwrites them, and, generally more important, UserB never saw the changes UserA made.
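
The same sequence can be replayed in a single process. This is only a sketch: the file name and the two "users" are hypothetical, and the interleaving is done by hand rather than by real concurrent processes.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Read the whole file into memory.
sub slurp {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot read $path: $!";
    local $/;
    my $text = <$fh>;
    close $fh;
    return $text;
}

# Overwrite the file with new content ('>' truncates first).
sub spew {
    my ($path, $text) = @_;
    open my $fh, '>', $path or die "Cannot write $path: $!";
    print {$fh} $text;
    close $fh or die "Cannot close $path: $!";
}

my $dir  = tempdir(CLEANUP => 1);
my $orig = "$dir/orig.txt";
spew($orig, "original\n");

# Both users read before either one writes.
my $user_a_copy = slurp($orig);
my $user_b_copy = slurp($orig);

# Each modifies its own in-memory copy.
$user_a_copy .= "line from UserA\n";
$user_b_copy .= "line from UserB\n";

# UserA writes first; UserB then overwrites with a copy
# that never contained UserA's line.
spew($orig, $user_a_copy);
spew($orig, $user_b_copy);

my $final = slurp($orig);
print $final;
```

The final file holds the original line plus UserB's line. UserA's line is gone, even though both writes succeeded.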

Q3: Is the problem the fact that $new might exist already because another instance of this script running at the same time had created $new a split-second ago in connection with its own update of $old, and that our process will destroy the contents of that $new due to the way ">" works,
Nope (at least if you use File::Temp). You only have to be concerned about the file that has the unchanging name. That is where 'clobbering' occurs.

Q4: In a multi-user environment, does a careful programmer need to use "sysopen/flock LOCK_EX/truncate" every time a script needs to write a file? And now a final wrinkle on the addition of a file lock for $new in the recipe.
Depends.

  • If the data is really important, then yes, you should.
  • If the data is not critical and not changed very often, locking matters much less.
  • If you are reasonably sure that only one instance of one program will be updating the file, locking is generally not required.
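
For the cases where locking is warranted, the sysopen/flock LOCK_EX/truncate sequence named in the question might look roughly like this. The helper name update_in_place is hypothetical. Keep in mind that flock is advisory, so it only protects against other programs that also call flock on the same file.

```perl
use strict;
use warnings;
use Fcntl qw(:DEFAULT :flock :seek);

# Hypothetical helper: rewrite $path under an exclusive lock.
# $modify receives the old content and returns the new content.
sub update_in_place {
    my ($path, $modify) = @_;

    # O_RDWR | O_CREAT opens without truncating, so nothing is
    # destroyed before we hold the lock.
    sysopen(my $fh, $path, O_RDWR | O_CREAT) or die "Cannot open $path: $!";
    flock($fh, LOCK_EX) or die "Cannot lock $path: $!";

    my $old = do { local $/; <$fh> };
    $old = '' unless defined $old;
    my $new = $modify->($old);

    # Rewind, truncate, and write only once the lock is ours.
    seek($fh, 0, SEEK_SET) or die "Cannot seek $path: $!";
    truncate($fh, 0)       or die "Cannot truncate $path: $!";
    print {$fh} $new       or die "Cannot write $path: $!";

    # close flushes the buffer and releases the lock.
    close($fh) or die "Cannot close $path: $!";
    return;
}
```

A call such as update_in_place('foo.txt', sub { $_[0] . "new line\n" }) appends a line while holding the lock. Unlike the temp-file-and-rename approach, a crash between truncate and close can leave the file partly written.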

The flip side: if your data is important, changed by more than one source, and changed often, then you should generally use a full database that supports locking. This is why file locking is not a huge problem in practice.

Q5: Wouldn't we want to keep $new open (and hence the LOCK_EX in place) until after the "rename( $new, $old )"?
You're still stuck on $new, but I'll rework your question towards what I think you want to ask: 'When should I be releasing a lock?'

The best strategy IMO is to create a '.lock' file and flock that. Like this:

  • Once your program decides to modify the file 'foo.txt', check for a flocked 'foo.lock' file. If you're clean, create 'foo.lock' and lock it.
  • read 'foo.txt'
  • modify
  • write it to a unique temp file via File::Temp
  • rename temp file to 'foo.txt'
  • delete 'foo.lock'
This prevents corruption from clobbering and from your program dying mid-write.
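
The steps above can be sketched as follows. This is one possible reading of the recipe, not a definitive implementation. The helper name update_with_lockfile and the 'tmp' template prefix are hypothetical. The check-then-lock step is folded into a single blocking flock call, so there is no gap between checking and locking. Removing a file that is still open is assumed to work, which holds on POSIX systems.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);
use File::Basename qw(dirname);
use File::Temp qw(tempfile);

# Hypothetical helper: update $file following the lock-file recipe.
# $modify receives the old content and returns the new content.
sub update_with_lockfile {
    my ($file, $lock, $modify) = @_;

    # Create (if needed) and lock the lock file; this blocks while
    # another cooperating process holds the lock.
    open my $lock_fh, '>>', $lock or die "Cannot open $lock: $!";
    flock($lock_fh, LOCK_EX)      or die "Cannot lock $lock: $!";

    # Read the current content.
    open my $in, '<', $file or die "Cannot read $file: $!";
    my $content = do { local $/; <$in> };
    close $in;
    $content = '' unless defined $content;

    # Modify, then write to a unique temp file in the same directory.
    my ($tmp_fh, $tmp) = tempfile('tmpXXXXXX', DIR => dirname($file), UNLINK => 0);
    print {$tmp_fh} $modify->($content) or die "Cannot write $tmp: $!";
    close $tmp_fh                       or die "Cannot close $tmp: $!";

    # Swap the finished temp file in for the original.
    rename $tmp, $file or die "Cannot rename $tmp to $file: $!";

    # Delete the lock file as in the recipe, then release the lock.
    unlink $lock or die "Cannot remove $lock: $!";
    close $lock_fh;
    return;
}
```

A call might look like update_with_lockfile('foo.txt', 'foo.lock', sub { uc $_[0] }). Because flock is advisory, every program that writes 'foo.txt' has to follow the same protocol. Some people leave the lock file in place and only release the lock, since a process already waiting on the old lock file can end up holding a lock on a file that no longer has a name.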


grep
One dead unjugged rabbit fish later