PerlMonks  

Greetings, monks!

I would like to modify a file in place using the method recommended as the "best" method in the Perl Cookbook, 2d ed., recipe 7.15, Modifying a File in Place with a Temporary File. I don't understand something the authors say; it seems critical to understand it, though.

(The file to be modified in my case is an important flat-file database (one record per line); web users can use a CGI script to either add data to the file or to edit their own records; I'm concerned about possible file corruption when two or more users are submitting new or revised data at about the same instant. I know I could use a real database but I really want to figure out file locking using Perl. Seems like this issue must come up all the time in a multiuser environment, whether web or internal network.)

The code provided in the recipe is:

    open( OLD, "<", $old ) or die "can't open $old: $!";
    open( NEW, ">", $new ) or die "can't open $new: $!";
    while (<OLD>) {
        # change $_, then ...
        print NEW $_ or die "can't write $new: $!";
    }
    close( OLD ) or die "can't close $old: $!";
    close( NEW ) or die "can't close $new: $!";
    rename( $old, "$old.orig" ) or die "can't rename $old to $old.orig: $!";
    rename( $new, $old ) or die "can't rename $new to $old: $!";

Some discussion follows, then the authors say:

Note that rename won't work across filesystems, so you should create your temporary file in the same directory as the file being modified.

The truly paranoid programmer would lock the file during the update. The tricky part is that you have to open the file for writing without destroying its contents before you can get a lock to modify it. Recipe 7.18 shows how to do this.

(Emphasis supplied by me.)
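For reference, here is a minimal sketch of what "open the file for writing without destroying its contents" can look like. It is not the code from Recipe 7.18, just one common idiom: the "+<" mode opens an existing file for reading and writing without truncating it, so the lock can be taken before anything is changed. The helper name update_locked and its callback argument are made up for illustration.

```perl
use strict;
use warnings;
use Fcntl qw( :flock SEEK_SET );

# Hypothetical helper: rewrite $path in place while holding an exclusive lock.
# $edit is a code ref that receives one line and returns its replacement.
sub update_locked {
    my ( $path, $edit ) = @_;

    # "+<" opens for reading and writing and does not truncate the file.
    open( my $fh, "+<", $path ) or die "can't open $path: $!";
    flock( $fh, LOCK_EX )       or die "can't flock $path: $!";

    # Read and transform every line only after the lock is held.
    my @lines = map { $edit->($_) } <$fh>;

    # Rewind, discard the old contents, and write the new ones.
    seek( $fh, 0, SEEK_SET ) or die "can't seek $path: $!";
    truncate( $fh, 0 )       or die "can't truncate $path: $!";
    print {$fh} @lines       or die "can't write $path: $!";

    # Closing the handle also releases the lock.
    close($fh) or die "can't close $path: $!";
    return scalar @lines;
}
```

A call such as update_locked( $old, sub { uc $_[0] } ) would upper-case every record. Note the trade-off: if the script dies between the truncate and the final write, the file is left incomplete, which is presumably why the recipe prefers a temporary file plus rename.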

Q1: In "The truly paranoid programmer would lock the file", which file are the authors referring to?

Q2: Regarding the reason for being "truly paranoid" -- is this because we don't want another running instance of this script to be writing to $new while we are, so we ought to revise this script (and hence both instances) to get a LOCK_EX before writing to $new?

To get the desired file lock, the authors caution that the "tricky part" is to first open the file for writing without clobbering its contents. I have read elsewhere in the book that "open (OUT, ">", $out)" would "clobber" any existing file named $out before a script would have a chance to get a lock on the file, and I've read (p. 421 of Programming Perl, 3d ed.) that the best method for writing to a file is to use sysopen, which does not clobber any file that exists, as in:

    use Fcntl qw( :flock :DEFAULT );
    sysopen( OUT, $out, O_WRONLY|O_CREAT ) or die "can't sysopen $out: $!";
    flock( OUT, LOCK_EX ) or die "can't flock $out: $!";
    truncate( OUT, 0 ) or die "can't truncate $out: $!";
    # now write to file...
    close( OUT ) or die "can't close $out: $!";

Q3: I'm not sure I completely understand the hazards of "clobbering." Is the problem that $new might already exist because another instance of this script, running at the same time, created it a split second ago for its own update of $old? Our process would then destroy the contents of that $new because of the way ">" works, and the other instance (e.g., another web user submitting data via the same page's form) would install mangled or empty data when it renames $new to $old. Yikes, there goes the database.
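One way to sidestep the shared-$new collision described in Q3 is to give every process its own temporary file. The sketch below uses tempfile() from the core File::Temp module, which creates and opens a new, uniquely named file rather than reusing an existing one. The helper name rewrite_via_temp, the "dbXXXXXX" template, and the callback argument are assumptions for illustration, and the backup rename to "$old.orig" from the recipe is left out for brevity.

```perl
use strict;
use warnings;
use File::Basename qw( dirname );
use File::Temp qw( tempfile );

# Hypothetical helper: write an edited copy of $old to a uniquely named
# temporary file in the same directory, then rename it over $old.
sub rewrite_via_temp {
    my ( $old, $edit ) = @_;

    open( my $in, "<", $old ) or die "can't open $old: $!";

    # Same directory as $old, so rename() stays on one filesystem.
    # tempfile() dies on failure and never hands back an existing file.
    my ( $out, $new ) =
        tempfile( "dbXXXXXX", DIR => dirname($old), UNLINK => 0 );

    while ( my $line = <$in> ) {
        print {$out} $edit->($line) or die "can't write $new: $!";
    }
    close($in)  or die "can't close $old: $!";
    close($out) or die "can't close $new: $!";

    rename( $new, $old ) or die "can't rename $new to $old: $!";
    return $new;
}
```

This only removes the risk of two processes writing to the same $new. It does not stop lost updates: two instances can still read the same $old and the later rename wins, so some form of locking is still needed around the whole read-modify-rename sequence. Also, File::Temp creates its files with restrictive permissions, so the renamed file may need a chmod afterwards.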

Q4: In a multi-user environment, does a careful programmer need to use "sysopen/flock LOCK_EX/truncate" every time a script needs to write a file? If a plain open ">" technique is used there would seem to be a potential clobbering problem.

Q5: A final wrinkle on the addition of a file lock for $new in the recipe: wouldn't we want to keep $new open (and hence the LOCK_EX in place) until after the "rename( $new, $old )"? Would that work, though? I'm concerned that the rename function implicitly closes the file being renamed and breaks the lock on it before doing something as drastic as renaming it.
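On Q5, as far as I can tell, rename works on file names rather than on open handles, so on Unix-like systems it does not close $new or drop a lock held on it. The trouble is that a lock on $new or $old protects a file that is about to be replaced, so other processes may end up locking a different file. A commonly suggested alternative is to lock a separate file that is never renamed and hold that lock across the whole update. Here is a minimal sketch, assuming a "$old.lock" naming convention and a hypothetical helper called with_lock:

```perl
use strict;
use warnings;
use Fcntl qw( :flock );

# Hypothetical helper: run $code while holding an exclusive lock on a
# separate lock file. The "$old.lock" name is an assumed convention.
sub with_lock {
    my ( $old, $code ) = @_;
    my $lockfile = "$old.lock";

    # ">>" creates the lock file if it is missing and never truncates it.
    open( my $lock, ">>", $lockfile ) or die "can't open $lockfile: $!";
    flock( $lock, LOCK_EX )           or die "can't flock $lockfile: $!";

    # Everything in here runs under the lock: read $old, write $new,
    # and rename $new to $old.
    my $result = $code->();

    # Closing the handle releases the lock. If $code dies, $lock goes
    # out of scope and is closed as well.
    close($lock) or die "can't close $lockfile: $!";
    return $result;
}
```

The whole recipe would then go inside the callback, as in with_lock( $old, sub { ... rename( $new, $old ) ... } ). This only helps if every script that reads or writes $old goes through the same lock file, since flock is advisory.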


In reply to Best practices for modifying a file in place: q's about opening files, file locking, and using the rename function by davebaker
