PerlMonks  

Mysterious Disapperance of file contents

by Gorby (Monk)
on Aug 29, 2003 at 01:43 UTC ( [id://287564]=perlquestion )

Gorby has asked for the wisdom of the Perl Monks concerning the following question:

Dear Wise Monks, One of my programs has a simple counter that counts the number of times the program has run. Below is the relevant portion that handles this task. My problem is that sometimes the counter mysteriously loses count and starts from the beginning again, as if the contents were erased somehow. This happens during times when traffic is high and people are accessing the script at the same time on the internet. Can you shed light on why the file contents "disappear"? Thanks in advance.
$completeadd = "mycounter";
$semaphore_file = "mycounterlock";
open(SEM, ">$semaphore_file") || die "Cannot create semaphore $semaphore_file: $!";
flock(SEM, LOCK_EX) || die "Lock failed: $!";
open(MFILE, ">>$completeadd") || die "file open1 failed: $!\n";
close(MFILE);
open(MFILE, "$completeadd") || die "file open2 failed: $!\n";
@filedata1=<MFILE>;
chomp @filedata1;
close(MFILE);
$hitcount=$filedata1[0];
if ($hitcount) {
    $hitcount=$hitcount + 1;
} else {
    $hitcount = 1;
}
$filedata1[0]=$hitcount;
writedata(@filedata1);
release_lock();

Replies are listed 'Best First'.
Re: Mysterious Disapperance of file contents
by converter (Priest) on Aug 29, 2003 at 06:29 UTC

    There's a fine "hit counter" example in the File Locking section at the bottom of perlopentut:

    use Fcntl qw(:DEFAULT :flock);
    sysopen(FH, "numfile", O_RDWR | O_CREAT)
        or die "can't open numfile: $!";
    # autoflush FH
    $ofh = select(FH); $| = 1; select ($ofh);
    flock(FH, LOCK_EX)
        or die "can't write-lock numfile: $!";
    $num = <FH> || 0;
    seek(FH, 0, 0)
        or die "can't rewind numfile : $!";
    print FH $num+1, "\n"
        or die "can't write numfile: $!";
    truncate(FH, tell(FH))
        or die "can't truncate numfile: $!";
    close(FH)
        or die "can't close numfile: $!";

    converter

      This is similar to the discussion of locking when writing in Camel 3, where it states that the simpler open, flock sequence used for reading isn't adequate. BUT I believe OP's use of a separate semaphore file got around that issue, as he never does anything with the counter file until the flock on the semaphore file returns.

      --Bob Niederman, http://bob-n.com

      All code given here is UNTESTED unless otherwise stated.

Re: Mysterious Disapperance of file contents
by sgifford (Prior) on Aug 29, 2003 at 02:45 UTC

    What you've posted so far looks fine. What do writedata and release_lock look like?

    One good way to do this is to create the new count in a different file, then rename it in to the actual name. You probably have a moment when the file is empty, and if the program crashes just then it will stay empty. By using rename instead, the file is never empty; it is either the old value, or the new value.
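A minimal sketch of the rename approach described above. The sub name `bump_counter`, the `.lock` and `.tmp.$$` suffixes, and the temp-directory demo are illustrative assumptions, not the OP's code. It keeps a separate lock file, as the original post does, because rename replaces the counter file itself.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);
use File::Temp qw(tempdir);

# Hypothetical helper: bump the number stored in $file. Callers are
# serialised through a separate lock file, and the counter is replaced
# by rename, so readers see either the old value or the new one.
sub bump_counter {
    my ($file) = @_;

    # '>>' creates the lock file if needed and never truncates anything.
    open(my $lock, '>>', "$file.lock") or die "can't open $file.lock: $!";
    flock($lock, LOCK_EX)              or die "can't lock $file.lock: $!";

    # Read the old value; a missing or unreadable counter counts as 0.
    my $count = 0;
    if (open(my $in, '<', $file)) {
        my $line = <$in>;
        close($in);
        my ($n) = defined $line ? $line =~ /^(\d+)/ : ();
        $count = $n if defined $n;
    }
    $count++;

    # Write the new value to a scratch file, then rename it into place.
    my $tmp = "$file.tmp.$$";
    open(my $out, '>', $tmp) or die "can't open $tmp: $!";
    print {$out} "$count\n"  or die "can't write $tmp: $!";
    close($out)              or die "can't close $tmp: $!";
    rename($tmp, $file)      or die "can't rename $tmp to $file: $!";

    close($lock);    # closing the handle releases the lock
    return $count;
}

# Demo in a throwaway directory: three runs give three increasing counts.
my $dir = tempdir(CLEANUP => 1);
print bump_counter("$dir/mycounter"), "\n" for 1 .. 3;
```

If the program dies between the open and the rename, the worst case in this sketch is a leftover scratch file; the counter itself still holds the previous value.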

    Another way is to open the file read-write, read the value, then rewind and write the new value. As long as the number is reasonably short (less than 1000 digits or so) and buffered writes are turned off (or you use syswrite), this should be atomic as well.
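The read-write variant might look like the sketch below, using unbuffered `sysread`/`syswrite`. The sub name `bump_in_place` is hypothetical, and the `flock` call is an added assumption: the read and the write are still two separate system calls, so concurrent runs need to be serialised around the pair.

```perl
use strict;
use warnings;
use Fcntl qw(:DEFAULT :flock :seek);
use File::Temp qw(tempdir);

# Hypothetical helper: increment the count in $file in place, using only
# unbuffered I/O so the new value goes out in a single write call.
sub bump_in_place {
    my ($file) = @_;

    sysopen(my $fh, $file, O_RDWR | O_CREAT) or die "can't open $file: $!";
    flock($fh, LOCK_EX)                      or die "can't lock $file: $!";

    # Read the (short) current value; an empty file yields an empty buffer.
    my $buf = '';
    defined(sysread($fh, $buf, 1024)) or die "can't read $file: $!";
    my ($old) = $buf =~ /^(\d+)/;
    my $count = defined $old ? $old + 1 : 1;

    # Rewind and overwrite. No truncate is needed here because the new
    # text is never shorter than the old text.
    sysseek($fh, 0, SEEK_SET)                or die "can't rewind $file: $!";
    defined(syswrite($fh, "$count\n"))       or die "can't write $file: $!";

    close($fh) or die "can't close $file: $!";    # also releases the lock
    return $count;
}

my $dir = tempdir(CLEANUP => 1);
print bump_in_place("$dir/mycounter"), "\n" for 1 .. 3;
```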

    Why do you do these two lines?

    open(MFILE, ">>$completeadd") || die "file open1 failed: $!\n";
    close(MFILE);
      In which case, there's no sense reinventing wheels.

      --
      I'm not belgian but I play one on TV.

      The effect of these two lines will be so that when you run the script for the first time, the open(filehandle, filename) || die doesn't die because the file does not exist.

      Otherwise, the script would die continuously until you manually created the file. In theory, the ">>" (append) should prevent data from being overwritten in the file if it does exist and you crash with an open filehandle - although that's perhaps not the safest way of doing things.

      Try removing these lines, and putting a &CreateFile() unless -e $filename; earlier on in the code instead.

Re: Mysterious Disapperance of file contents
by bbfu (Curate) on Aug 29, 2003 at 04:58 UTC

    Like sgifford, I believe the problem lies in writedata or release_lock (or in some other code you've not shown us).

    I also believe your code should be cleaned up a bit, such as:

    #!/usr/bin/perl

    use warnings;
    use strict;

    use Fcntl qw':flock :seek';

    our $COUNT_FILE = 'mycounter';

    my $cfh;

    -e $COUNT_FILE
      ? open $cfh, "+< $COUNT_FILE"
      : open $cfh, "+> $COUNT_FILE" # not needed if file always exists
      or die "Can't open $COUNT_FILE: $!\n";

    flock($cfh, LOCK_EX) or die "Can't lock: $!\n";

    chomp(my $count = <$cfh> || '');
    seek($cfh, 0, SEEK_SET);
    print $cfh ++$count, "\n";

    print "Run #$count\n";

    # truncate not needed as $count is always increasing

    close($cfh); # automagically releases the lock

    bbfu
    Black flowers blossom
    Fearless on my breath

Re: Mysterious Disapperance of file contents
by dws (Chancellor) on Aug 29, 2003 at 02:51 UTC
    My problem is that sometimes the counter mysteriously loses count and starts from the beginning again, as if the contents were erased somehow.

    Consider the following sequence of events:

    open(SEM, ">$semaphore_file") -- succeeds
    flock(SEM, LOCK_EX) -- succeeds
    open(MFILE, ">>$completeadd") -- fails
    die(...)

    During the course of the die(), files are closed, and locks are released. Because you opened the semaphore file for writing, but haven't yet written to it, it's just been erased.

    This may or may not be what's happening, but it's one possible explanation.

    Are you seeing any errors in your error logs?

    Update: Ignore everything above "Are you seeing any errors in your error logs?".

      Because you opened the semaphore file for writing, but haven't yet written to it, it's just been erased.

      Except that he's never (in the shown code, anyway) writing to the semaphore file. He's using a separate, presumably always-empty, file as the semaphore and is always opening the counter file in read or append... except possibly in writedata.

      bbfu
      Black flowers blossom
      Fearless on my breath

Re: Mysterious Disapperance of file contents
by aquarium (Curate) on Aug 29, 2003 at 02:27 UTC
    you're not supposed to open, and then lock your lock/semaphore file, as in that gap of time, another copy of your program can open it also. any decent perl book (e.g. camel book) tells you how to get around it. Alternatively, read the docs and faqs on open and flock.
      you're not supposed to open, and then lock your lock/semaphore file...

      As someone unfamiliar with flock semantics, could you explain how your comment sits with the example given in the docs for flock:

      use Fcntl ':flock'; # import LOCK_* constants

      sub lock {
          flock(MBOX,LOCK_EX);
          # and, in case someone appended
          # while we were waiting...
          seek(MBOX, 0, 2);
      }

      sub unlock {
          flock(MBOX,LOCK_UN);
      }

      open(MBOX, ">>/usr/spool/mail/$ENV{'USER'}")
          or die "Can't open mailbox: $!";

      lock();
      print MBOX $msg,"\n\n";
      unlock();

      and with flock taking a filehandle? Doesn't that mean I have to open it before I lock it?

      I'm blissfully unaware of "how you're supposed to do it", using different mechanisms for this purpose, but I find the documentation decidedly unclear. Care to explain it?


      Examine what is said, not who speaks.
      "Efficiency is intelligent laziness." -David Dunham
      "When I'm working on a problem, I never think about beauty. I think only how to solve the problem. But when I have finished, if the solution is not beautiful, I know it is wrong." -Richard Buckminster Fuller
      If I understand your problem, I can solve it! Of course, the same can be said for you.

        Keep in mind you can't use LOCK_UN safely because of buffering. Use close() for that.

        your example does not use an external lock file...i'll just leave it at that.
        ...one other thing i noticed with the original poster's code is that it's closing the counter file and then opening it again, which is just plain asking for trouble, as it all should be done under one lock. anyway, i got the camel book out:
        use Fcntl qw(:DEFAULT :flock);
        $lockfile = "counter.lck";
        sysopen(COUNTERLOCK, $lockfile, O_RDONLY | O_CREAT) or die "can't open $lockfile: $!";
        flock(COUNTERLOCK, LOCK_EX) or die "can't lock $lockfile: $!";
        ....here open your counter file in read/write mode using same locking semantics as per the lockfile...then seek to 0, read the integer, add 1, seek to 0, write new value, close counter file, close lockfile, output new value to screen. don't unlock the counterfile/lockfile, just close them. this flushes the buffers and releases locks and closes in a fairly atomic fashion....that's another bug in the original code. Just remember that all this locking is only advisory, and programs not written to respect the locks will clobber it. Also, don't try to lock files over any network file systems, such as NFS,SMBFS.

      That was my first thought (hence my award-winning post above), but after reading the doc on flock, it appears that the purpose of flock isn't to stop the second program from opening the file. Instead it stops the second program from obtaining the lock: the second program's flock(HANDLE, LOCK_EX) blocks until the first program issues flock(HANDLE, LOCK_UN).
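A small sketch that illustrates this behaviour: a forked child plays the "second program", opens the file without trouble, and is refused only when it asks for the lock. The helper name `second_process_can_lock` and the temp file are illustrative assumptions. `LOCK_NB` is used so that the child reports a refusal instead of waiting as a plain `LOCK_EX` would.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);
use File::Temp qw(tempfile);
use POSIX ();

# Hypothetical helper: fork a child that acts as the "second program" and
# report whether it managed to take an exclusive lock on $path.
sub second_process_can_lock {
    my ($path) = @_;
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) {
        # The open itself is not held up by another process's flock.
        open(my $fh, '>>', $path) or POSIX::_exit(2);
        # LOCK_NB makes flock return at once instead of blocking.
        POSIX::_exit(flock($fh, LOCK_EX | LOCK_NB) ? 0 : 1);
    }
    waitpid($pid, 0);
    my $status = $? >> 8;
    die "child could not open $path\n" if $status == 2;
    return $status == 0;
}

my ($tmp, $path) = tempfile(UNLINK => 1);
close($tmp);

# First "program": open, then lock.
open(my $holder, '>>', $path) or die "can't open $path: $!";
flock($holder, LOCK_EX)       or die "can't lock $path: $!";
print "while the lock is held: ",
    (second_process_can_lock($path) ? "granted" : "refused"), "\n";

close($holder);    # closing the handle releases the lock
print "after the holder closes: ",
    (second_process_can_lock($path) ? "granted" : "refused"), "\n";
```

On a system with a native flock, the first attempt should be refused and the second granted, which matches the open-then-lock sequence in the flock documentation.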

      The doc on flock shows a sequence much like that used by the OP, but without use of a separate semaphore file.

      Testing things very similar to the OP's code seems to work as expected. I don't see the issue.

      --Bob Niederman, http://bob-n.com

      All code given here is UNTESTED unless otherwise stated.

      That's just wrong. flock and fcntl both take a filehandle as their first argument. How are you supposed to provide a filehandle without opening a file first?
Re: Mysterious Disapperance of file contents
by Abigail-II (Bishop) on Aug 29, 2003 at 08:34 UTC
    open(MFILE, ">>$completeadd") || die "file open1 failed: $!\n";
    close(MFILE);

    What is the point of this?

    Can you shed light on why the file contents "disappear"?

    While the code you display isn't the way I would do it, I don't think there's anything wrong in the code you show. Of course, you don't show us writedata or release_lock - it could be that the problem lies there.

    Abigail

Re: Mysterious Disapperance of file contents
by Ajnabi (Initiate) on Aug 29, 2003 at 11:14 UTC
    I had the same problem. I suspected some sort of race condition. It can also happen when the server or web server crashes.

    I am taking a similar approach, but with two files in place of one. If the main counter file is blank, use the value from the counter.bak file.


    Lock Counter_File
    Read Count
    On Fail
    Read Counter_File.bak
    Increment
    Write Data to Counter_File
    Copy Counter_File to .bak file.
    Release Lock
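The steps above can be sketched as follows. The sub names `read_count` and `bump_with_backup` and the `.bak` suffix are assumptions made for illustration; the poster's actual code is not shown in the thread.

```perl
use strict;
use warnings;
use Fcntl qw(:DEFAULT :flock :seek);
use File::Copy qw(copy);
use File::Temp qw(tempdir);
use IO::Handle;

# Return the integer on the first line of $path, or undef if the file is
# missing, empty, or does not start with a number.
sub read_count {
    my ($path) = @_;
    open(my $fh, '<', $path) or return undef;
    my $line = <$fh>;
    close($fh);
    my ($n) = defined $line ? $line =~ /^(\d+)\s*$/ : ();
    return $n;
}

# Hypothetical helper following the listed steps: lock, read (falling
# back to the backup), increment, write, copy to the backup, unlock.
sub bump_with_backup {
    my ($file) = @_;

    sysopen(my $fh, $file, O_RDWR | O_CREAT) or die "can't open $file: $!";
    flock($fh, LOCK_EX)                      or die "can't lock $file: $!";

    my $count = read_count($file);
    $count = read_count("$file.bak") unless defined $count;    # "On Fail"
    $count = 0 unless defined $count;                          # first run
    $count++;

    seek($fh, 0, SEEK_SET)  or die "can't rewind $file: $!";
    truncate($fh, 0)        or die "can't truncate $file: $!";
    print {$fh} "$count\n"  or die "can't write $file: $!";
    $fh->flush              or die "can't flush $file: $!";

    # Refresh the backup only after the main file has been written.
    copy($file, "$file.bak") or die "can't copy to $file.bak: $!";

    close($fh) or die "can't close $file: $!";    # releases the lock
    return $count;
}

my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/mycounter";
print bump_with_backup($file), "\n" for 1 .. 2;

# Simulate the failure from the thread: the main file ends up empty.
open(my $wipe, '>', $file) or die "can't wipe $file: $!";
close($wipe);
print bump_with_backup($file), "\n";    # continues from the backup
```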
      Gorby is also chomping an array instead of just using a scalar all the way through. There are 3 opens and closes for the counter file: the first open in append, the second open in RW, and the third open in write. This spells disaster when not locking the lock/semaphore file properly using sysopen and flock. Therefore, when the server is busy, and the semaphore lock fails, the counter file is clobbered by another instance of the prog. It could have been written with just a single lock on the counter file itself: but locked properly (sysopen/flock) and opened and closed once, not 3 times.
        Therefore, when the server is busy, and the semaphore lock fails, the counter file is clobbered by another instance of the prog.

        No. The code says:

        flock(SEM, LOCK_EX) || die "Lock failed: $!"

        If a lock fails, the program exits, so it can't clobber.

        Abigail

Re: Mysterious Disapperance of file contents
by dga (Hermit) on Aug 29, 2003 at 19:33 UTC

    My money is on writedata(). I bet there is a case where the file is zeroed out and then, before the new value is written, the program fails, leaving an empty file. The main code, which is written to handle a missing file silently, then resets the counter to 1 and goes on with its life. Perhaps an error should be logged if the file is zero length, i.e. if reading the first line returns undef, which means your process either just created an empty file or opened an empty one. Either way your counter data has disappeared.
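One way to make that failure visible is sketched below. The sub name `read_counter_checked` and its status strings are assumptions for illustration. It separates "no file yet" (a genuine first run) from "file exists but holds no count" (data was lost) and warns in the second case.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: return (count, status), where status is
# 'missing' (first run), 'empty' (count was lost) or 'ok'.
sub read_counter_checked {
    my ($file) = @_;
    return (0, 'missing') unless -e $file;

    open(my $fh, '<', $file) or die "can't open $file: $!";
    my $line = <$fh>;    # undef when the file is zero length
    close($fh);

    my ($n) = defined $line ? $line =~ /^(\d+)/ : ();
    unless (defined $n) {
        warn "counter file $file exists but holds no count\n";
        return (0, 'empty');
    }
    return ($n, 'ok');
}

my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/mycounter";

my ($count, $status) = read_counter_checked($file);
print "before creation: $status\n";

open(my $fh, '>', $file) or die "can't create $file: $!";
close($fh);
($count, $status) = read_counter_checked($file);    # also warns on STDERR
print "zero-length file: $status\n";
```

A caller could then refuse to reset the counter on 'empty' and leave the decision to a human, instead of quietly starting again from 1.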

    I do wonder why the main file isn't opened RW then locked LOCK_EX then read, increment value, truncate(), and write, and close (unlocking the file). This method only uses one file, one lock, and only opens the file once. The locking will ensure that all the processes play well together.

    Also if you have access to a database which has sequences which don't have to relate to a specific table, one could be set up to do the counting. The interface is heavier but atomicity is guaranteed and your counter won't mysteriously reset to zero. This probably only makes sense if you are using the database for other things already on the site.

    In PostgreSQL the SQL would go like SELECT nextval('my_sequence'). One would then use DBI presumably and fetch the value which would always be the next one in the series. This works for multiple processes and is quite scalable.
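A rough sketch of the DBI side, assuming a sequence named `my_sequence` already exists (for example created with `CREATE SEQUENCE my_sequence`). The sub name `next_hit_count` is hypothetical, and the connection details in the comment are placeholders, not values from the thread.

```perl
use strict;
use warnings;

# $dbh is assumed to be a connected DBI handle, obtained with something like
#   DBI->connect('dbi:Pg:dbname=mysite', $user, $pass, { RaiseError => 1 });
# where the database name and credentials are placeholders.
sub next_hit_count {
    my ($dbh) = @_;
    # selectrow_array runs the query and returns the first row as a list.
    my ($count) = $dbh->selectrow_array(q{SELECT nextval('my_sequence')});
    return $count;
}
```

Each call hands back the next value of the sequence, so no lock file or counter file is involved on the Perl side.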

Re: Mysterious Disapperance of file contents
by bobn (Chaplain) on Aug 29, 2003 at 02:11 UTC

    Oops, nevermind.

    I believe this is a race condition:

    open(SEM, ">$semaphore_file") || die "Cannot create semaphore $semaphore_file: $!";
    flock(SEM, LOCK_EX) || die "Lock failed: $!";
    I believe it's possible for 2 instances to execute the first line above at the same time, before either gets to the second.

    --Bob Niederman, http://bob-n.com

    All code given here is UNTESTED unless otherwise stated.
