PerlMonks  

Re^2: Parsing logs and bookmarking last line parsed

by JaeDre619 (Acolyte)
on Aug 19, 2010 at 04:51 UTC


in reply to Re: Parsing logs and bookmarking last line parsed
in thread Parsing logs and bookmarking last line parsed

That's kind of what I was thinking too. How would I skip the lines already read and read only the new lines? Is there a good Perl module for this?

Replies are listed 'Best First'.
Re^3: Parsing logs and bookmarking last line parsed
by Marshall (Canon) on Aug 19, 2010 at 20:03 UTC
    I missed this comment before posting my reply below. It would help if you could explain what you intend to use this "last processing occurred at XYZ date/time" value for.

    I'm actually not sure that you need this concept at all. If you just need the last data for each backup set, then I would process the input file, replacing old info with new as it becomes available. Then the output becomes "hey here is the most recent stuff I have". All of this processing will be so fast that there is no need to keep track of what you did before, just do it all again to keep things simple. I mean there are 86,400 seconds in a day and running a program once per day that takes one second is nothing in the scheme of things!

    The problem I ran into was that the data for each backup set doesn't appear to be "symmetric". In other words, sometimes some parm values are "missing". This can cause a previous value to be "carried forward" when that is not the right thing to do.

    Rather than getting into some "spec war", I'm posting a simple-minded use of my previously posted code to report the "last values" of each set, and then you can tell me: "Hey, this would have been right if it had done X". Below I didn't use $date; I don't know why you need $date.

    In doing this short thing, I noticed that $parm could have a leading space, so I changed a regex.

    use strict;
    use Data::Dumper;

    my %backups;

    while (<DATA>)
    {
        next if (/^\s*$/);  #skip blank lines
        chomp;
        my ($date, $backupset, $parm, $value) = parseline($_);
        if ($value)
        {
            $backups{$backupset}{$parm} = $value;
        }
    }
    print Dumper \%backups;

    sub parseline
    {
        my $line = shift;
        my ($date, $rest) = $line =~ m/(^.*\d{4}):(.*)/;
        my ($backupset, $msg) = split(/backup:INFO:/, $rest);
        $backupset =~ s/:\s*$//;         #trim some unwanted thing like ':' is ok
        $backupset =~ s/^\s*backup\.//;  #more than one step is just fine!
        my ($parm, $value) = $msg =~ m/\s*(.*)=\s*(.*)\s*/;
        $parm  ||= $msg;  #if match doesn't happen these will be undef
        $value ||= "";    #this trick makes sure that they are defined.
        return ($date, $backupset, $parm, $value);
    }
    =print  #some reformatting to try to stop line wrap....
    $VAR1 = {
      'set1_lvm' => {
        'backup-size' => '187.24 GB',
        'backup-set' => 'backup.set1_lvm',
        'backup-time' => '01:59:04',
        'backup-date-epoch' => '1281942003',
        'backup-status' => 'Backup succeeded',
        'last-backup' => '/home/backups/backup.set1_lvm/20100815000006',
        'backup-type' => 'regular',
        'backup-date' => '20100816000003'
      },
      'set2_lvm_lvm' => {
        'backup-size' => '424.53 GB',
        'backup-time' => '04:33:12',
        'backup-status' => 'Backup succeeded',
        'last-backup' => '/home/backups/backup.set2_lvm_lvm/20100814200003'
      },
      'set2_lvm' => {
        'backup-directory' => '/home/backups/backup.set2_lvm/20100815200003',
        'backup-set' => 'backup.set2_lvm',
        'backup-date-epoch' => '1281927603',
        'backup-type' => 'regular',
        'backup-date' => '20100815200003'
      }
    };
    =cut
    __DATA__
    Sun Aug 15 20:00:03 2010: backup.set2_lvm:backup:INFO: START OF BACKUP
    Sun Aug 15 20:00:04 2010: backup.set2_lvm:backup:INFO: backup-set=backup.set2_lvm
    Sun Aug 15 20:00:04 2010: backup.set2_lvm:backup:INFO: backup-date=20100815200003
    ..... use __DATA__ segment from my previous post
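
    One possible way around the "carried forward" problem described above is to throw away a set's stored values whenever a new run begins. The sketch below is only an illustration and rests on an assumption the thread does not confirm: that every run of a backup set starts with a "START OF BACKUP" line. The function name latest_values and its regexes are made up for this example.

```perl
use strict;
use warnings;

# Collect the most recent values per backup set, forgetting a set's
# earlier values whenever a new run starts.
# ASSUMPTION: every run begins with a "START OF BACKUP" line.
sub latest_values {
    my @lines = @_;
    my %backups;
    for my $line (@lines) {
        next if $line =~ /^\s*$/;    # skip blank lines

        # The set name sits between "backup." and ":backup:INFO:".
        my ($set, $msg) =
            $line =~ /:\s*backup\.(\S+?):backup:INFO:\s*(.*?)\s*$/
            or next;

        if ($msg eq 'START OF BACKUP') {
            $backups{$set} = {};     # new run: drop stale values
            next;
        }

        # Keep only "name=value" messages that carry a value.
        my ($parm, $value) = $msg =~ /^([^=]+?)\s*=\s*(.*)$/ or next;
        $backups{$set}{$parm} = $value if length $value;
    }
    return \%backups;
}
```

    Calling latest_values(<DATA>) returns a hash reference that can be passed to Dumper in the same way as %backups. Whether dropping the old values is the right behavior depends on what the report is meant to show.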
Re^3: Parsing logs and bookmarking last line parsed
by dasgar (Priest) on Aug 19, 2010 at 19:08 UTC

    If you're ok with the idea of recording to a file the last time stamp that was used, it should be pretty simple to record which line you were last on too. You'll just need to modify your while loop a bit by adding a variable to keep track of the line numbers.

    For a simple illustration, let's say that you read in from your new assistant file the last time stamp and the last line number read, and that the last line number is stored in the variable $last_line_read. The code below illustrates the modification that you would need to make.

    my $line_count = 0;

    while (my $line = <>){
        $line_count++;
        next if ($line_count <= $last_line_read);

        # the rest of your code from the while loop remains the same
    }

    I'm not saying that this is the "best" way to do it, but it should work.
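
    To show how the bookmark could survive between runs, here is a minimal sketch that loads the saved line number, skips the lines already handled, and writes the new count back. The function name process_new_lines, the state-file layout (a single number on one line), and the callback argument are all assumptions made for illustration.

```perl
use strict;
use warnings;

# Process only the lines added since the previous run. The last line
# number handled is kept in a small state file (hypothetical layout:
# one integer on the first line).
sub process_new_lines {
    my ($log_file, $state_file, $callback) = @_;

    # Read the bookmark left by the previous run (0 if there is none).
    my $last_line_read = 0;
    if (open my $state, '<', $state_file) {
        my $saved = <$state>;
        $last_line_read = $1 if defined $saved && $saved =~ /^(\d+)/;
        close $state;
    }

    open my $log, '<', $log_file or die "Cannot open $log_file: $!";
    my $line_count = 0;
    while (my $line = <$log>) {
        $line_count++;
        next if $line_count <= $last_line_read;    # already handled
        $callback->($line);
    }
    close $log;

    # Save the new bookmark for the next run.
    open my $out, '>', $state_file or die "Cannot write $state_file: $!";
    print {$out} "$line_count\n";
    close $out;

    return $line_count;
}
```

    The built-in $. variable holds the same count as $line_count here. Note that this sketch does not detect a log that has been rotated or truncated; in that case the saved number would have to be reset by hand.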

      Hi, thanks for your input on this. I was playing around with this and figured out how to grab the last count. I found that $. holds this number, and I figured out how to dump it to a file.

      My question now is: how can I use this as a placeholder to start the regex the next time I run the script? I don't know if my current logic works, as I am grabbing data based on a given date, so I'm not sure whether a record count would help. Or do you have another idea for a good key? Thanks.
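
      One candidate for a key other than a record count is the byte offset where the previous run stopped, which Perl exposes through the built-in tell and seek. The sketch below assumes the log is only ever appended to; the function name read_from_offset is made up for illustration, and storing the returned offset between runs is left to the caller.

```perl
use strict;
use warnings;
use Fcntl qw(SEEK_SET);

# Return the new byte offset followed by every line found after $offset.
# ASSUMPTION: the log only grows, except when it is rotated.
sub read_from_offset {
    my ($log_file, $offset) = @_;

    open my $log, '<', $log_file or die "Cannot open $log_file: $!";

    # If the file is now shorter than the bookmark (for example after
    # rotation), start again from the beginning.
    my $size = (stat $log)[7];
    $offset = 0 if $offset > $size;

    seek $log, $offset, SEEK_SET or die "Cannot seek in $log_file: $!";
    my @lines      = <$log>;     # everything after the bookmark
    my $new_offset = tell $log;  # where the next run should resume
    close $log;

    return ($new_offset, @lines);
}
```

      Unlike a line count, an offset lets the next run jump straight to the new data without re-reading the old lines. The date-based matching could then be applied to the returned lines only.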
