Here's a corrected version that can parse the long sample on your scratchpad (which I also copy below so that it doesn't disappear unexpectedly). The original script (the one in the parent thread) couldn't parse the longer sample because the sample has features I couldn't have guessed from the small samples on the node, and at the time I wrote it you hadn't given us the long sample yet. There are three differences: first, this script expects the data to start with an opening parenthesis; second, it accepts a lone colon as well as a colon followed by a keyword; third, it accepts double-quoted strings.
perl -we 'use Data::Dumper; $s = \%p; @s = (); while (<>) { our $f++ or $_ = ": " . $_; while (/\G\s*(?:([-\w.]+|"[^"]*")|:([-\w.]*)\s*\(|(\)))/gc) { if (defined($1)) { defined($$s{""}) and die "parse error: two"; $$s{"@"} = $1; } elsif (defined($2)) { push @s, $s; $s = $$s{$2} = {}; } elsif (defined($3)) { @s or die "parse error: close"; $s = pop @s; } } /(\S.*)/g and die "parse error: junk: $1"; } $! and die "read error"; $s == \%p or die "parse error: open"; print Dumper(\%p);'
A historical note: I made the correction because someone asked on an IRC channel how to parse a file in this exact format.
Here's the long sample
Update:
A version of the above converted to a real script (rather than a one-liner using global variables) is here. This one also removes the double quotes from double-quoted strings and accepts multi-line strings. It seems the file format has backslash-escaped double quotes inside double-quoted strings, and possibly other things this can't parse.
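use warnings; use strict;
use Data::Dumper;

sub parse {
    my($f) = @_;
    my($s, %p, @s, $b);
    # $s points at the node being filled; @s is the stack of its parents.
    $s = \%p;
    while (<$f>) {
        # On the first line only: wrap the whole input in a block with an empty keyword.
        $b++ or $_ = ": " . $_;
        # $1 bare word, $2 quoted string, $3 string still open at end of line,
        # $4 keyword before an opening parenthesis, $5 closing parenthesis.
        while (/\G\s*(?:([-\w.]+)|"([^"]*)"|("[^"]*$)|:([-\w.]*)\s*\(|(\)))/gc) {
            if (defined($1) || defined($2)) {
                defined($$s{""}) and die "parse error: two";
                $$s{"@"} = defined($1) ? $1 : $2;
            } elsif (defined($3)) {
                # Multi-line string: restart the scan on the open string plus the next line.
                $_ = $+ . <$f>;
            } elsif (defined($4)) {
                push @s, $s;
                $s = $$s{$+} = {};
            } elsif (defined($5)) {
                @s or die "parse error: close";
                $s = pop @s;
            }
        }
        # Anything the tokenizer could not consume is an error.
        /(\S.*)/g and die "parse error: junk: $1";
    }
    $! and die "read error";
    $s == \%p or die "parse error: open";
    \%p;
}

my $p = parse(*ARGV);
print Dumper($p);
__END__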
Update:
defined($$s{""}) and die "parse error: two"; should be changed to defined($$s{"@"}) and die "parse error: two"; in both scripts, I believe.
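As for the backslash-escaped double quotes mentioned in the first update: neither script handles them, because the pattern "([^"]*)" stops at the first double quote it sees. Below is a rough sketch of a string pattern that tolerates them. It is not wired into the parser, the helper name first_quoted is made up for this illustration, and it assumes that a backslash escapes exactly the one character after it, which I have not checked against the real file format.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of the scripts above: return the contents
# of the first double-quoted string in $text with the escapes removed.
# Assumption: a backslash escapes exactly the one character that follows it.
sub first_quoted {
    my ($text) = @_;
    # Between the quotes: either a character that is neither a quote nor a
    # backslash, or a backslash followed by any single character.
    if ($text =~ /"((?:[^"\\]|\\.)*)"/s) {
        my $body = $1;
        $body =~ s/\\(.)/$1/gs;    # drop the escaping backslashes
        return $body;
    }
    return;    # no complete quoted string found
}

print first_quoted('name "say \"hi\" now" rest'), "\n";
```

In the second script the same idea would mean replacing the "([^"]*)" alternative (and adjusting the open-string alternative to match), then unescaping the captured text before storing it.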