PerlMonks
file state management

by newPerlr (Novice)
on Feb 05, 2009 at 03:50 UTC ( [id://741484] )

newPerlr has asked for the wisdom of the Perl Monks concerning the following question:

I have multiple log files that are written every five minutes. I want to read and sort the data in these files, and after sorting I want to archive them, meaning I delete the source log files.

The problem I'm having is that I don't want to read a log file that is still being written at the time I start reading, so that I never delete an incomplete log file. I only want to read the files whose logs have been written and finished. Can someone please help me with this? Thank you in advance.

-------------------------------------------------------------------------------
I have no control over the logging programme, and it is a Unix system.

Replies are listed 'Best First'.
Re: file state management
by cdarke (Prior) on Feb 05, 2009 at 08:47 UTC
    Not easy. Really, to do this properly the two processes (yours and the logging program) should be co-operating.

    You will need to understand how the logging program uses the output file, in particular whether it is opened and closed for each write or kept open at all times (as asked above). You should be able to get that information from your supplier, or by running strace, truss, or tusc, depending on your version of UNIX. These programs are very similar and trace the kernel calls made by a program. Try to determine the file descriptor being used (the integer returned from open); then you may be able to tell from /proc/process-id/fd when the file is closed. Unfortunately, the format of /proc varies between versions of UNIX.
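    As a sketch of the /proc approach (Linux only; in practice you would use the logger's PID, but here we inspect our own process, $$, so the example is self-contained), each entry in /proc/&lt;pid&gt;/fd is a symlink to the open file:

```perl
use strict;
use warnings;

# Linux only: each entry in /proc/<pid>/fd is a symlink to an open file.
# We use our own PID ($$) here so the sketch runs standalone; substitute
# the logging process's PID in real use.
my $pid = $$;
for my $fd ( glob "/proc/$pid/fd/*" ) {
    my $target = readlink $fd;
    printf "%s -> %s\n", $fd, defined $target ? $target : '?';
}
```

    If the log file's path stops appearing in this listing, the logger has closed it.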

    If your logging process never closes the file then you are stuck: you have no way of knowing whether a single write is the last one in a sequence, and there may be further messages still sitting in buffers.
Re: file state management
by GrandFather (Saint) on Feb 05, 2009 at 04:21 UTC

    Are the log files updated precisely every 5 minutes according to a clock that you have access to (system time for example) or is the update period stochastic?

    If the update time is predictable you can arrange your work to fit in between updates based on the same clock that the update process is using.

    If the update process is stochastic, but there is a usable minimum time between updates, you may be able to look at the last modified time of the log files and base your work on that.
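    A minimal sketch of the modification-time approach (the directory, the *.log glob, and the 300-second margin are all assumptions):

```perl
use strict;
use warnings;

# Select only log files that have not been modified for longer than a
# safety margin; anything newer may still be being written.
my $dir    = '/var/log/myapp';   # assumed location of the logs
my $margin = 300;                # seconds; matches the five-minute cycle

# (stat $_)[9] is the file's mtime in epoch seconds.
my @safe = grep { time - (stat $_)[9] > $margin } glob "$dir/*.log";
# @safe now holds the files that are presumably complete.
```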


    Perl's payment curve coincides with its learning curve.
      It's stochastic, but I have access to the system time.
Re: file state management
by ikegami (Patriarch) on Feb 05, 2009 at 04:07 UTC
    What OS (Windows or Unix)? Do you know if the application doing the logging uses flock? Do you have any control over the code of the application doing the logging?
      I have no control over the logger, and it's a Unix system.

        One more question. Does the application keep the log file open, or does it close and reopen it. This can be answered using strace or your system's equivalent.

        If it does close and reopen the file, you can rename it. Once the application creates a new log file, you can do whatever you want with the renamed one.

Re: file state management
by Eyck (Priest) on Feb 05, 2009 at 13:17 UTC
    You do it like this:
    foreach my $log (@logfiles) {
        rename $log, "$log.tmp";
        ProcessLogfile("$log.tmp");
    }
    foreach my $log (@logfiles) {
        unlink "$log.tmp";    # del() is not a Perl builtin; unlink removes the file
    }
    This way, everything that is logging will keep on logging into its open file descriptor (i.e. the new data will go into the .tmp file).

    The problem is: how long do your external logging processes keep their log file descriptors open? The key question is whether they keep their descriptors open indefinitely. If so, you're out of luck and can't really do what you're trying to achieve.

    If their logging method is

        open(LOG, ">>logfile");
        print LOG "memo";
        close LOG;

    then the method described above will work fine.
Re: file state management
by gone2015 (Deacon) on Feb 05, 2009 at 18:26 UTC

    Perhaps you could collect the names and last-modification times of the available log files, and safely process all but the latest? Or are there multiple log files being written to at the same time?
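    For example (the path and the *.log pattern are assumptions), sorting by mtime and leaving the newest file alone:

```perl
use strict;
use warnings;

# Sort candidate logs oldest-first by last-modification time and skip the
# newest one, which may still be in use.
my @logs = sort { (stat $a)[9] <=> (stat $b)[9] } glob '/var/log/myapp/*.log';
pop @logs if @logs;   # drop the most recently modified file
# Process the remaining, presumably finished, files in @logs here.
```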
