
Most efficient File Modified check (w/o warnings)

by Cefu (Beadle)
on Nov 14, 2007 at 15:33 UTC

Cefu has asked for the wisdom of the Perl Monks concerning the following question:

Monks,

I'm looking for the most efficient way to repeatedly check a file for a new last-modified date.

I currently have a script which continuously monitors a fast-growing log file for new lines and checks them using some parameters loaded from a config file. It uses (stat(FILE))[7] on the log file to notice new lines and (stat(FILE))[1] to check for a new inode number when the log gets rotated.
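A minimal sketch of that size-and-inode check, for readers who want to see it spelled out. The helper name check_log and the %state hash are hypothetical, not taken from the script described above, and the inode field may not be meaningful on Windows.

```perl
use strict;
use warnings;

# Hypothetical helper: compare the log's current inode and size with the
# values recorded on the previous pass and report what happened.
sub check_log {
    my ($path, $state) = @_;             # $state is a hash ref we update

    my @st = stat($path)
        or return 'missing';             # empty list means stat failed
    my ($inode, $size) = @st[1, 7];      # field 1 = inode, field 7 = size

    my $old_size = defined $state->{size} ? $state->{size} : 0;
    my $event;
    if (defined $state->{inode} && $inode != $state->{inode}) {
        $event = 'rotated';              # a different file now has this name
    }
    elsif ($size > $old_size) {
        $event = 'grew';                 # new bytes to read
    }
    elsif ($size < $old_size) {
        $event = 'truncated';
    }
    else {
        $event = 'unchanged';
    }

    @{$state}{qw(inode size)} = ($inode, $size);
    return $event;
}
```

A caller would keep one %state per monitored file and pass \%state on every pass of the polling loop.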

I plan to 'daemonize' this when I get it working and I want to support updating the config file on the fly. To do this I note the modified date in $lastModConfig when I open it then check as follows:

if( (stat($configFileName))[9] != $lastModConfig ){
...re-process config...
}

However, this can produce a warning about using an undefined value with a not equal comparison if the config file is ever missing (renamed etc.) Of course I could fix this by checking for definedness first with:

if( defined( (stat($configFileName))[9] ) && (stat($configFileName))[9] != $lastModConfig ){
...re-process config...
}

but then I'd be calling stat twice. Since this check happens every couple of seconds (remember the log file grows fast) I don't want to eat up any cycles I don't have to.

Is there a better (less CPU time) way to pre-check the file for existence than calling stat($configFileName), or is there an altogether better way to check for modifications to the config file?

Perhaps I should just accept an extra second or two of CPU time every hour to do the second check. Or am I being entirely too neurotic about the warnings? They are just warnings after all and the script functions as intended.

I could, of course, just throw in a
no warnings 'uninitialized';
or even let the warnings drop into the void since STDERR will go to null when this script is a daemon.

I'd like to know what course those with more experience would take.

Thanks, Cefu

Replies are listed 'Best First'.
Re: Most efficient File Modified check (w/o warnings)
by Fletch (Bishop) on Nov 14, 2007 at 15:49 UTC
Re: Most efficient File Modified check (w/o warnings)
by jettero (Monsignor) on Nov 14, 2007 at 15:39 UTC
    You could simply store the mtime... my $mtime = (stat $file)[9]; and then use that twice...

    -Paul
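A minimal sketch of that suggestion: one stat call, with the result held in a scalar so it can be tested for definedness and then compared. The sub name config_mtime_if_changed is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the new mtime when the file exists and its
# mtime differs from $last_mod, and undef otherwise.
sub config_mtime_if_changed {
    my ($file, $last_mod) = @_;

    my $mtime = (stat $file)[9];         # undef when stat fails (file missing)
    return undef unless defined $mtime;  # nothing to reload, no warning
    return undef if defined $last_mod && $mtime == $last_mod;
    return $mtime;
}
```

In the polling loop this might be used as `if (defined(my $new = config_mtime_if_changed($configFileName, $lastModConfig))) { ...re-process config...; $lastModConfig = $new }`.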

Re: Most efficient File Modified check (w/o warnings)
by tirwhan (Abbot) on Nov 14, 2007 at 16:40 UTC

    If you're on Linux with a kernel >=2.6.13 you could use Linux::Inotify2 to ... err... notify you of the various changes in the file that you'd like to monitor. That's likely to be a lot more efficient than waking up and doing stat calls.


    All dogma is stupid.

      Ahh... Linux. If only.

      For reasons I can't get into I'm actually banging out this code on Windows right now while waiting for access to the UNIX box it will eventually run on. That is why I haven't bothered with the bits I'll need to daemonize it yet.

      There isn't, by any chance, something like Inotify2 for UNIX is there?

        There isn't, by any chance, something like Inotify2 for UNIX is there?

        Depends on your definition of UNIX :-)

        For the BSDs there's IO::KQueue, and Solaris has the FEM API, though no Perl wrapper for that is available on CPAN AFAICT; you might be able to do something with dtrace there. If you need something that works on UNIX/POSIX in general then your best bet is to go with stat, since the other mechanisms are bound to be specific to the UNIX variant. Actually, I can even dimly remember reading something about Vista including a file event notification mechanism, but I haven't had to code for that OS in years so my interest in it is peripheral at best.


        All dogma is stupid.
Re: Most efficient File Modified check (w/o warnings)
by Cefu (Beadle) on Nov 14, 2007 at 16:13 UTC

    jettero - Of course! I considered all sorts of options except for the obvious. If I hold on to the value then I don't have to call stat() again.

    Fletch - Thanks. I didn't know about the _ filehandle. It will probably be very efficient to check stat(_) since it is already present and being populated behind the scenes anyway.
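Fletch's reply is not reproduced above, so the following is only a sketch of one common way to use the _ filehandle, not necessarily what was suggested. A file test such as -e performs the real stat and caches the result; stat(_) then reads that cache without another system call. The sub name is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: true when the file exists and its mtime differs
# from $last_mod.
sub changed_via_underscore {
    my ($file, $last_mod) = @_;

    return 0 unless -e $file;            # the one real stat; result cached in _
    my $mtime = (stat(_))[9];            # reuses the cached stat structure
    return $mtime != $last_mod ? 1 : 0;
}
```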

    I'll go benchmark these two approaches and see if there is any extra overhead to accessing _ with stat() versus keeping and populating my own scalar.

    EDIT: Benchmarking shows that there is very little difference between the two methods. The only significant finding was that any method using one call (storing the timestamp to a scalar, stat()ing the auto-populated _ filehandle or even my original warning-prone check) is roughly twice as fast as stat()ing twice to check for existence and then for a new timestamp.

    With enough effort I was able to measure that storing to a scalar is about 1.5% faster than calling stat() on _. The warning-prone check is less than half a percent faster than that when the config file always exists and is about 5% slower when the config file is missing (even with STDERR closed so warnings don't go anywhere).
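The benchmark script itself was not posted. A comparison along these lines could be set up with the core Benchmark module, roughly as sketched below; the temporary file, the candidate names and the iteration count are all assumptions, and the numbers will vary by OS and filesystem.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);
use File::Temp qw(tempfile);

# Stand-in for the config file (assumption: any existing file will do).
my ($fh, $file) = tempfile(UNLINK => 1);
close $fh;
my $last_mod = (stat $file)[9];

# Each candidate returns true only when the file exists and has a new mtime.
my %candidates = (
    stat_twice => sub {
        return defined((stat $file)[9]) && (stat $file)[9] != $last_mod;
    },
    stored_scalar => sub {
        my $mtime = (stat $file)[9];
        return defined $mtime && $mtime != $last_mod;
    },
    underscore => sub {
        return -e $file && (stat _)[9] != $last_mod;
    },
);

# Run each candidate a fixed number of times and print a comparison table.
cmpthese(200_000, \%candidates);
```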

    Bottom line: For my purposes the simplest/most direct method of storing the timestamp to a scalar then checking for definedness then difference is the most efficient method that shouldn't throw warnings. Thanks again.

Re: Most efficient File Modified check (w/o warnings)
by chrism01 (Friar) on Nov 14, 2007 at 23:01 UTC
