
Re: Parsing logs and bookmarking last line parsed

by murugu (Curate)
on Aug 19, 2010 at 04:12 UTC ( #855953=note )

in reply to Parsing logs and bookmarking last line parsed


One way of doing it is to write the last time stamp seen in the log file to a text file at the end of the script's run. The next time you run the script, use the time stamp stored in that text file to skip the entries you have already processed inside the for loop.
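
A minimal sketch of that idea follows. It assumes each log line starts with a time stamp such as "Sun Aug 15 20:00:03 2010:" (the format of the log lines quoted elsewhere in this thread). The sub names `line_epoch`, `read_bookmark`, `write_bookmark` and `new_lines`, and the idea of passing the bookmark file name as an argument, are made up for illustration.

```perl
use strict;
use warnings;
use Time::Piece;

# Assumed time stamp layout, e.g. "Sun Aug 15 20:00:03 2010"
my $STAMP_FORMAT = '%a %b %d %H:%M:%S %Y';

# Return the epoch seconds of a line's leading time stamp, or nothing.
# Time::Piece->strptime treats the stamp as UTC; that is fine here because
# saved and fresh stamps are always parsed the same way.
sub line_epoch {
    my ($line) = @_;
    my ($stamp) =
        $line =~ /^(\w{3}\s+\w{3}\s+\d{1,2}\s+\d\d:\d\d:\d\d\s+\d{4}):/
        or return;
    my $t = eval { Time::Piece->strptime($stamp, $STAMP_FORMAT) } or return;
    return $t->epoch;
}

# Read the saved epoch from the bookmark file; 0 if there is none yet.
sub read_bookmark {
    my ($file) = @_;
    open my $fh, '<', $file or return 0;
    my $saved = <$fh>;
    close $fh;
    return 0 unless defined $saved && $saved =~ /^(\d+)/;
    return $1;
}

# Overwrite the bookmark file with the given epoch.
sub write_bookmark {
    my ($file, $epoch) = @_;
    open my $fh, '>', $file or die "Cannot write $file: $!";
    print {$fh} "$epoch\n";
    close $fh or die "Cannot close $file: $!";
}

# Return the lines of $log that are newer than the bookmark,
# then move the bookmark to the newest time stamp seen.
sub new_lines {
    my ($log, $bookmark) = @_;
    my $last   = read_bookmark($bookmark);
    my $newest = $last;
    my @fresh;
    open my $fh, '<', $log or die "Cannot read $log: $!";
    while (my $line = <$fh>) {
        my $epoch = line_epoch($line);
        next unless defined $epoch && $epoch > $last;
        push @fresh, $line;
        $newest = $epoch if $epoch > $newest;
    }
    close $fh;
    write_bookmark($bookmark, $newest);
    return @fresh;
}
```

Calling `new_lines($log, $bookmark)` on each run would then hand back only the unseen entries. One caveat of comparing whole seconds: a line written after the previous run but carrying the same second as the bookmark would be skipped.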

There must be a better way than what I suggested. Hang around on this site and you will get better solutions from the best monks here.

Murugesan Kandasamy
use perl for(;;);

Re^2: Parsing logs and bookmarking last line parsed
by JaeDre619 (Acolyte) on Aug 19, 2010 at 04:51 UTC
    That's kind of what I was thinking too. How would I skip the lines already read and read only the new lines? Is there a good Perl module for this?
      I missed this comment before posting my reply below. It would help if you could explain what you intend to use this "last processing occurred at XYZ date/time" for.

      I'm actually not sure that you need this concept at all. If you just need the last data for each backup set, then I would process the input file, replacing old info with new as it becomes available. Then the output becomes "hey here is the most recent stuff I have". All of this processing will be so fast that there is no need to keep track of what you did before, just do it all again to keep things simple. I mean there are 86,400 seconds in a day and running a program once per day that takes one second is nothing in the scheme of things!

      The problem I ran into was that the data for each backup set doesn't appear to be "symmetric". In other words, sometimes some parm values are "missing". This can cause a previous value to be "carried forward" when that is not the right thing to do.

      Rather than getting into some "spec war", I'll post a simple-minded use of my previously posted code to report the "last values" of each set, and then you can tell me: "Hey, this would have been right if it had done X". Below I didn't use $date; I don't know why you need $date.

      In doing this short thing, I noticed that $parm could have a leading space, so I changed a regex.

      use strict;
      use Data::Dumper;

      my %backups;

      while (<DATA>)
      {
          next if (/^\s*$/);   #skip blank lines
          chomp;
          my ($date, $backupset, $parm, $value) = parseline($_);
          if ($value)
          {
              $backups{$backupset}{$parm} = $value;
          }
      }
      print Dumper \%backups;

      sub parseline
      {
          my $line = shift;
          my ($date, $rest) = $line =~ m/(^.*\d{4}):(.*)/;
          my ($backupset, $msg) = split(/backup:INFO:/, $rest);
          $backupset =~ s/:\s*$//;          #trim some unwanted thing like ':' is ok
          $backupset =~ s/^\s*backup\.//;   #more than one step is just fine!
          my ($parm, $value) = $msg =~ m/\s*(.*)=\s*(.*)\s*/;
          $parm  ||= $msg;   #if match doesn't happen these will be undef
          $value ||= "";     #this trick makes sure that they are defined.
          return ($date, $backupset, $parm, $value);
      }

      =print
      #some reformatting to try to stop line wrap....
      $VAR1 = {
        'set1_lvm' => {
          'backup-size' => '187.24 GB',
          'backup-set' => 'backup.set1_lvm',
          'backup-time' => '01:59:04',
          'backup-date-epoch' => '1281942003',
          'backup-status' => 'Backup succeeded',
          'last-backup' => '/home/backups/backup.set1_lvm/20100815000006',
          'backup-type' => 'regular',
          'backup-date' => '20100816000003'
        },
        'set2_lvm_lvm' => {
          'backup-size' => '424.53 GB',
          'backup-time' => '04:33:12',
          'backup-status' => 'Backup succeeded',
          'last-backup' => '/home/backups/backup.set2_lvm_lvm/20100814200003'
        },
        'set2_lvm' => {
          'backup-directory' => '/home/backups/backup.set2_lvm/20100815200003',
          'backup-set' => 'backup.set2_lvm',
          'backup-date-epoch' => '1281927603',
          'backup-type' => 'regular',
          'backup-date' => '20100815200003'
        }
      };
      =cut

      __DATA__
      Sun Aug 15 20:00:03 2010: backup.set2_lvm:backup:INFO: START OF BACKUP
      Sun Aug 15 20:00:04 2010: backup.set2_lvm:backup:INFO: backup-set=backup.set2_lvm
      Sun Aug 15 20:00:04 2010: backup.set2_lvm:backup:INFO: backup-date=20100815200003
      ..... use __DATA__ segment from my previous post
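
One possible way to deal with the "carried forward" values described above is to throw away a set's stored values whenever a new run for that set begins. The sketch below assumes that a "START OF BACKUP" message marks the beginning of a run, which is a guess based on the sample lines and would need to be confirmed against the real logs. The sub name `latest_values` is hypothetical, and the parsing is a simplified stand-in for `parseline`, not a replacement for it.

```perl
use strict;
use warnings;

# Collect the latest parm => value pairs per backup set. A set's earlier
# values are discarded when a "START OF BACKUP" line is seen for it
# (assumption: that message marks the start of a new run).
sub latest_values {
    my (@lines) = @_;
    my %backups;
    for my $line (@lines) {
        next if $line =~ /^\s*$/;    # skip blank lines
        chomp $line;

        # e.g. "Sun Aug 15 20:00:03 2010: backup.set2_lvm:backup:INFO: msg"
        my ($set, $msg) =
            $line =~ /^.*?\d{4}:\s*backup\.(\S+?):backup:INFO:\s*(.*?)\s*$/
            or next;

        if ($msg eq 'START OF BACKUP') {
            $backups{$set} = {};     # forget values from the previous run
            next;
        }

        my ($parm, $value) = $msg =~ /^(.*?)=\s*(.*)$/ or next;
        $backups{$set}{$parm} = $value if length $value;
    }
    return \%backups;
}
```

With this approach a parm that is missing from the most recent run simply does not appear in the result, instead of showing a stale value from an older run. Whether that is the desired behaviour depends on what the report is meant to show.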

      If you're ok with the idea of recording to a file the last time stamp that was used, it should be pretty simple to record which line you were last on too. You'll just need to modify your while loop a bit by adding a variable to keep track of the line numbers.

      For a simple illustration, let's say that you read the last time stamp and the last line number from your new helper file, and that the last line number is stored in the variable $last_line_read. The code below illustrates the modification that you would need to make.

      my $line_count = 0;
      while (my $line = <>){
          $line_count++;
          next if ($line_count <= $last_line_read);

          # the rest of your code from the while loop remains the same
      }

      I'm not saying that this is the "best" way to do it, but it should work.
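
To show the whole round trip, here is a sketch that loads the saved line number, skips the lines already handled, and saves the new count at the end. The sub name `process_new_lines`, the state file argument and the callback interface are all made up for this illustration; the skipping logic is the same as in the loop above.

```perl
use strict;
use warnings;

# Call $handler on every line of $log after the line number stored in
# $state_file, then store the new last line number. Returns that number.
sub process_new_lines {
    my ($log, $state_file, $handler) = @_;

    # Load the bookmark; start from 0 if the state file does not exist yet.
    my $last_line_read = 0;
    if (open my $in, '<', $state_file) {
        my $saved = <$in>;
        $last_line_read = $1 if defined $saved && $saved =~ /^(\d+)/;
        close $in;
    }

    open my $fh, '<', $log or die "Cannot read $log: $!";
    my $line_count = 0;
    while (my $line = <$fh>) {
        $line_count++;
        next if $line_count <= $last_line_read;   # already processed
        $handler->($line);
    }
    close $fh;

    # Save the bookmark for the next run.
    open my $out, '>', $state_file or die "Cannot write $state_file: $!";
    print {$out} "$line_count\n";
    close $out or die "Cannot close $state_file: $!";

    return $line_count;
}
```

A call might look like `process_new_lines('backup.log', 'last_line.txt', sub { print $_[0] })`, where both file names are placeholders. Note that a line count only works as a bookmark while the log keeps growing: if the file is rotated or truncated, the saved number no longer points at the same place.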

        Hi, thanks for your input on this. I was sort of playing around with this. I figured out how to grab the last count: I found that $. holds this number, and I figured out how to dump it to a file.

        My question now is: how can I use this as a placeholder to start the regex matching from the next time I run the script? I don't know if my logic works currently, as I am grabbing data based on a given date, so I'm not sure if a record count would help. Or do you have another idea for a good key? Thanks.
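
An alternative that is not proposed anywhere in this thread, but fits the same bookmarking idea, is to remember the byte offset reported by `tell` instead of a line count or date, and to `seek` back to it on the next run. The sketch below is only that: the sub name `read_from_offset` is hypothetical, and storing the offset between runs (for example in a small text file) is left to the caller.

```perl
use strict;
use warnings;
use Fcntl qw(SEEK_SET);

# Return the new byte offset followed by every line of $log found
# after $offset.
sub read_from_offset {
    my ($log, $offset) = @_;
    open my $fh, '<', $log or die "Cannot read $log: $!";

    # If the file is now shorter than the bookmark (rotated or truncated),
    # start again from the top.
    $offset = 0 if $offset > (-s $fh);

    seek $fh, $offset, SEEK_SET or die "Cannot seek in $log: $!";
    my @lines      = <$fh>;
    my $new_offset = tell $fh;
    close $fh;
    return ($new_offset, @lines);
}
```

On the first run the offset would be 0; afterwards the caller passes in whatever offset the previous call returned. This avoids re-reading the already processed part of the file, and the date-based regex can then be applied to the returned lines only.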
