addn'l help with parsing here doc

smackdab has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks, I had some great help the other day from jonadab and others and would like a little more on adapting this code...

It works great for parsing a file, but sometimes I have already read the file into a $buffer and need to do the same processing.

If there is a way to easily convert the neat "local $/" file-handle trick, I don't see it. Do I need to somehow keep track of the character count in the $buffer?

The code parses config files made up of key=value pairs and heredocs:

 while (<CONFIG>) {
    if (/^\s*#/) {  # ignore comment line
    } elsif (/^\s*$/) { # ignore blank line
    } elsif (/(\w+)\s*=\s*[<]{2}(\w+)/) { # heredoc
      (my $name, local $/) = ($1, "\n$2"); # ++ysth
      $config{$name} = <CONFIG>;
      chomp $config{$name}; # as etcshadow points out.
    } elsif (/(\w+)\s*=\s*(.*?)\s*$/) { # regular pair
      $config{$1}=$2;
    } else {
       warn "Ptooey:  Could not parse config line: $_\n";
    }
 }
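
For anyone who wants to try the loop above without a file on disk, here is a minimal sketch that feeds it a string instead. It relies on an approach not mentioned in the post itself: opening a filehandle on a reference to the scalar (available in Perl 5.8 and later), so that reads still honor $/ and the "local $/" trick carries over unchanged. The sub name parse_config_string, the lexical $fh, and the sample config text are all assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: parse key=value / heredoc config text held in a string.
sub parse_config_string {
    my ($buffer) = @_;
    my %config;

    # Open a read-only filehandle on the scalar itself (Perl 5.8+).
    open my $fh, '<', \$buffer or die "Cannot open in-memory handle: $!";

    while (<$fh>) {
        if (/^\s*#/) {                              # ignore comment line
        } elsif (/^\s*$/) {                         # ignore blank line
        } elsif (/(\w+)\s*=\s*[<]{2}(\w+)/) {       # heredoc
            # Read everything up to "\nTERMINATOR" as one record.
            (my $name, local $/) = ($1, "\n$2");
            $config{$name} = <$fh>;
            chomp $config{$name};                   # strips "\nTERMINATOR"
        } elsif (/(\w+)\s*=\s*(.*?)\s*$/) {         # regular pair
            $config{$1} = $2;
        } else {
            chomp;
            warn "Ptooey:  Could not parse config line: $_\n";
        }
    }
    close $fh;
    return %config;
}

# Invented sample data, for illustration only.
my $sample = <<'CONFIG';
# a comment
color = blue

motd = <<END
Hello,
world
END
size = 10
CONFIG

my %config = parse_config_string($sample);
print "$_ => [$config{$_}]\n" for sort keys %config;
```

Because the handle behaves like an ordinary file, no character counting in $buffer is needed with this approach.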

Re: addn'l help with parsing here doc by graff (Chancellor) on Oct 17, 2003 at 03:29 UTC
Given that the entire config file has been loaded into $buffer, it would seem easiest to split that into lines and then behave pretty much the same way as when reading from a file, except that "local $/" is of no use in this case. The following is untested:

 my @lines = split /\n/, $buffer;
 my ( $name, $end ) = ( '', '' );
 for ( @lines ) {
     if ( $name ) {                              # inside a here-doc
         if ( /^$end$/ ) {                       # terminator line
             chomp $config{$name};               # remove trailing "\n"
             $name = '';
         } else {
             $config{$name} .= "$_\n";
         }
     } elsif ( /^\s*#/ or /^\s*$/ ) {            # skip comments, blank lines
         next;
     } elsif ( /(\w+)\s*=\s*<<(\w+)/ ) {         # heredoc; must be tested before
         ( $name, $end ) = ( $1, $2 );           # the regular pair, which also matches it
     } elsif ( /(\w+)\s*=\s*(.*?)\s*$/ ) {       # regular pair
         $config{$1} = $2;
     } else {
         warn "Ptooey: Could not parse config line: $_\n";
     }
 }

It's a little grotty, in the sense that you have to put the "within a HERE doc" behavior first in the "if...elsif..." series, because who knows whether/when the contents of a here-doc might trigger a false-alarm match on one of the other conditions. Also, as written above, there's nothing to warn about a here-doc that is not terminated by the last line in $buffer, but that should be easy to figure out.
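
The missing check for an unterminated here-doc could look something like the sketch below: if the loop ends while a here-doc name is still pending, the terminator never appeared. This is one possible reading of the line-splitting approach, not tested against the poster's real config files; the sub name parse_config_lines and the wording of the warning are assumptions.

```perl
use strict;
use warnings;

# Hypothetical helper: line-by-line parse of config text held in a string.
sub parse_config_lines {
    my ($buffer) = @_;
    my %config;
    my ( $name, $end ) = ( '', '' );

    for ( split /\n/, $buffer ) {
        if ( $name ne '' ) {                        # inside a here-doc
            if ( /^\Q$end\E$/ ) {                   # terminator line
                chomp $config{$name};               # drop the final "\n"
                $name = '';
            } else {
                $config{$name} .= "$_\n";
            }
        }
        elsif ( /^\s*#/ or /^\s*$/ ) {              # skip comments, blank lines
            next;
        }
        elsif ( /(\w+)\s*=\s*<<(\w+)/ ) {           # here-doc opener
            ( $name, $end ) = ( $1, $2 );
            $config{$name} = '';
        }
        elsif ( /(\w+)\s*=\s*(.*?)\s*$/ ) {         # regular pair
            $config{$1} = $2;
        }
        else {
            warn "Ptooey: Could not parse config line: $_\n";
        }
    }

    # Still inside a here-doc after the last line: it was never terminated.
    if ( $name ne '' ) {
        warn "Here-doc '$name' not terminated by '$end' before end of buffer\n";
        chomp $config{$name};
    }
    return %config;
}

# Invented sample: the second here-doc is deliberately left open.
my %config = parse_config_lines("a = 1\nb = <<X\nfoo\nX\nd = <<Y\nbar\n");
print "$_ => [$config{$_}]\n" for sort keys %config;
```

The sketch keeps whatever text was collected for the open here-doc and only warns; dying instead would be an equally reasonable choice.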
Re: addn'l help with parsing here doc by etcshadow (Priest) on Oct 17, 2003 at 03:46 UTC
You can outright replace any logic that goes like:

 while(<FILE>) { ...

with this:

 for (split "(?<=\Q$/\E)", $contents_of_FILE) { ...

and it does exactly the same thing. Which just means: split the contents of the file on the zero-width positive lookbehind assertion of $/ (the input record separator). This is subtly different from just splitting on $/ (or, more safely, splitting on \Q$/\E), in that splitting on $/ itself removes $/ from the output of the split, whereas splitting on the zero-width positive lookbehind assertion leaves it in. (This is because what is being split on is the empty string following each occurrence of $/, so that is the thing that gets removed.)

Anyway, this may not be the best way to deal with your particular problem, but it is the most general solution for dropping in a replacement of a <FILE> loop with some sort of loop over the contents of FILE.
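
The difference between the two kinds of split can be seen in a small sketch like the following. The sample string and the variable names ($contents, $sep, @without, @with) are made up for illustration; $sep is just a copy of $/ so that the separator can be interpolated into a regex delimited by slashes.

```perl
use strict;
use warnings;

my $contents = "alpha\nbeta\ngamma\n";    # invented sample data
my $sep      = $/;                        # input record separator, "\n" by default

# Splitting on the separator itself consumes it.
my @without = split /\Q$sep\E/, $contents;

# Splitting on the empty string just after each separator keeps it,
# so each element looks like a record read with <FILE>.
my @with = split /(?<=\Q$sep\E)/, $contents;

for my $record (@with) {
    my $copy = $record;
    chomp $copy;                          # chomp works as it would in a <FILE> loop
    print "record: [$copy]\n";
}

printf "%d records without separator, %d with\n", scalar @without, scalar @with;
```

One caveat when applying this to the here-doc parser: the split happens once, before the loop starts, so changing $/ inside the loop body has no effect on how the remaining text is divided.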

