shoness has asked for the wisdom of the Perl Monks concerning the following question:
Monks,
I've run some jobs that print out statistics about memory usage during their life. I use "grep" to collect relevant data into a single file. I parse that file with Perl to pull out the data that I want and create a comma-separated-values file that I can operate on with a spreadsheet tool.
Each line of the grep output file contains one of the data elements I want to collect. There is also a lot of other data that I don't want. The input data looks like this:
...
Pass #123
...
...
Elapsed Time : 1753.2 sec
CPU Time : 753.2 sec
...
Virtual memory size : 4472.6 MB
Resident set size : 4362 MB
...
Major page faults : 7153
...
Pass #124
...
...
From this data, I expect to create a line like this:
... 123, 1753.2, 753.2, 4472.6, 4362, 7153 ...
As you can see, it's sort of taking the original source data and turning it sideways, stripping off the descriptive and unwanted text. The position within the output line tells me which value is which.
My working solution is below. I just think that since "tmtowtdi", "tMBABwtdi" too. I'd love to see your thoughts on what surely must be a very common task.
Thanks!
use strict;
use warnings;

sub slurp {
    local $/ = undef;
    local *file;
    open file, $_[0] or die "Can't open $_[0]: $!";
    my $slurp = <file>;
    close file or die "Can't close $_[0]: $!";
    $slurp;
}

my $indata = slurp('noa.txt');
print "pass, wall time (sec), CPU time (sec), VM (MB), ResMem (MB), Page Faults\n";
while ($indata =~ m/^Pass\s\#(\d+).*?
                    ^Elapsed\ Time\s+:\s+([\d\.]+).*?
                    ^CPU\ Time\s+:\s+([\d\.]+).*?
                    ^Virtual\ memory\ size\s+:\s+([\d\.]+).*?
                    ^Resident\ set\ size\s+:\s+([\d\.]+).*?
                    ^Major\ page\ faults\s+:\s+([\d\.]+)
                   /msgcx) {
    print "$1, $2, $3, $4, $5, $6\n";
}
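For comparison, here is a minimal sketch of one other way to do it: read the data line by line and collect values per pass in a hash, instead of matching one large multiline regex against the slurped file. This is not the poster's method. The helper name passes_to_rows and the @fields list are made up for illustration, the labels are taken from the sample input above, and the sketch assumes each label starts at the beginning of its line and that Perl 5.10 or later is available (for the // operator).

```perl
use strict;
use warnings;

# Labels to collect, in output-column order (copied from the sample input).
my @fields = (
    'Elapsed Time',
    'CPU Time',
    'Virtual memory size',
    'Resident set size',
    'Major page faults',
);

# Hypothetical helper: takes a list of input lines and returns one
# comma-separated row per "Pass #N" block.
sub passes_to_rows {
    my @lines    = @_;
    my $label_re = join '|', map { quotemeta($_) } @fields;
    my (@rows, $pass, %seen);

    # Emit the row for the pass collected so far, if there is one.
    # A label that never appeared in the pass becomes an empty column.
    my $flush = sub {
        return unless defined $pass;
        push @rows, join ', ', $pass, map { $seen{$_} // '' } @fields;
    };

    for my $line (@lines) {
        if ($line =~ /^Pass\s+#(\d+)/) {
            $flush->();                # finish the previous pass
            ($pass, %seen) = ($1);     # start a new one with no values yet
        }
        elsif (defined $pass and $line =~ /^($label_re)\s*:\s*([\d.]+)/) {
            $seen{$1} = $2;            # remember the number after the colon
        }
    }
    $flush->();                        # do not forget the last pass
    return @rows;
}

# Small in-memory demonstration using the values from the sample input.
my @sample = (
    'Pass #123',
    '...',
    'Elapsed Time : 1753.2 sec',
    'CPU Time : 753.2 sec',
    'Virtual memory size : 4472.6 MB',
    'Resident set size : 4362 MB',
    'Major page faults : 7153',
);
print "$_\n" for passes_to_rows(@sample);   # prints: 123, 1753.2, 753.2, 4472.6, 4362, 7153
```

One difference in behavior from the regex version: a pass that lacks one of the labels still produces a row, with an empty column in that position, so the columns stay aligned in the spreadsheet. Whether that is preferable to skipping incomplete passes depends on the data.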
Re: Multiline RegExp. A Better Way?
by NetWallah (Canon) on Feb 16, 2012 at 04:54 UTC
Re: Multiline RegExp. A Better Way?
by kennethk (Abbot) on Feb 16, 2012 at 02:46 UTC
Re: Multiline RegExp. A Better Way?
by kcott (Archbishop) on Feb 16, 2012 at 03:09 UTC
Re: Multiline RegExp. A Better Way?
by sundialsvc4 (Abbot) on Feb 16, 2012 at 14:04 UTC