Re: Extract Block Of Text From Log

in reply to Extract Block Of Text From Log

Well, I'd use a classic state machine for this problem. That way we can work line by line, without having to load the whole file into memory.

#!/usr/bin/env perl
use strict;
use warnings;
use diagnostics;

my @blocklines;
my $inblock = 0;

open(my $IFH, '<', 'blockextract.txt') or die("Cannot open blockextract.txt: $!");
while(my $line = <$IFH>) {
    chomp $line;

    if($line =~ /parameters\ after\ change/) {
        # Start of a block we want to read
        $inblock = 1;
        next;
    }

    # Skip handling line unless we are in a block
    next unless($inblock);

    if($line =~ /NRG\ location/) {
        # Block ends here
        # Do whatever you want to do to the block lines
        # stored in @blocklines.
        # I'm just dumping them to STDOUT
        print "*** START ***\n";
        print join("\n", @blocklines), "\n";
        print "*** END ***\n\n";

        # Clean up block
        @blocklines = ();
        $inblock = 0;
        next;
    }

    # Just some line within the interesting block,
    # remember it for later in @blocklines
    push @blocklines, $line;
}

close $IFH;

exit(0);

That way, we can even modify the program very slightly to make it work via pipes, working live on a stream of data generated by some other program. We just remove the open and close calls and change the while loop a bit:

while(my $line = <>) {

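Then we can use the program on an arbitrary stream of this kind of data, and it extracts each block as soon as the block's end marker arrives on the program's STDIN. (If the output is itself piped into another program, set $| = 1 near the top so each block is flushed right away instead of sitting in the output buffer.)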

cat blockextract.txt | perl blockextract.pl

And the only things the state machine has to hold in memory are the block it is currently working on and a single state variable...
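To make that memory footprint visible, here is a rough sketch of the same state machine wrapped in a sub that reads from any filehandle and hands each finished block to a callback. The sub name extract_blocks, the callback argument and the sample data are all made up for illustration, they are not part of the script above. The in-memory filehandle is only there so the example runs without a log file.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper (not from the script above): runs the same two-state
# machine over any filehandle and passes each finished block to a callback.
sub extract_blocks {
    my ($fh, $start_re, $end_re, $on_block) = @_;

    my @blocklines;      # lines of the block currently being collected
    my $inblock = 0;     # the single state variable

    while (my $line = <$fh>) {
        chomp $line;

        if ($line =~ $start_re) {
            # Start marker: switch to "inside a block"
            $inblock = 1;
            next;
        }

        # Ignore everything outside a block
        next unless $inblock;

        if ($line =~ $end_re) {
            # End marker: hand a copy of the block on, then reset the state
            $on_block->([@blocklines]);
            @blocklines = ();
            $inblock    = 0;
            next;
        }

        push @blocklines, $line;
    }
    return;
}

# Made-up sample data, read through an in-memory filehandle
my $sample = join "\n",
    'noise',
    'parameters after change',
    'alpha = 1',
    'beta = 2',
    'NRG location 1',
    'more noise',
    'parameters after change',
    'gamma = 3',
    'NRG location 2',
    '';

open(my $fh, '<', \$sample) or die "Cannot open in-memory handle: $!";

my @blocks;
extract_blocks($fh, qr/parameters after change/, qr/NRG location/,
    sub { push @blocks, $_[0] });
close $fh;

print scalar(@blocks), " blocks found\n";
print join(', ', @$_), "\n" for @blocks;
```

Passing \*STDIN instead of the in-memory handle should give the piped behaviour described above. Here the callback collects the blocks only so they can be printed at the end; a callback that processes each block and throws it away keeps memory use down to one block at a time.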

perl -e 'use MIME::Base64; print decode_base64("4pmsIE5ldmVyIGdvbm5hIGdpdmUgeW91IHVwCiAgTmV2ZXIgZ29ubmEgbGV0IHlvdSBkb3duLi4uIOKZqwo=");'
