I think the following code is self-explanatory:
use strict;
use warnings;
use Data::Dump qw/pp dd/;

my @parse;
my @path;
my $last_level = 0;
$path[$last_level] = \@parse;

while ( my $line = <DATA> ) {
    # pp
    my ($white, $key, undef, $content) =
      $line =~ /^
                (\s*)           # indent
                (.*?)           # key
                (
                  \s*->\s*      # ignore arrow
                  (.*)
                )?              # optional group
                $/x;

    my $level = length($white) / 2;
    # pp [$white, $key,$level,$last_level];

    die "indent-level $level too big (last level was $last_level)!"
      if $level > $last_level+1;

    my @children;
    push
      @{$path[$level]},
      # {
      #   $key => {
      #     children => \@children,
      #     # level => $level,
      #     # content => $content,
      #   }
      # };
      {
        $key => \@children   # terse output
      };

    $path[$level+1] = \@children;
    $last_level = $level;
}

warn "Output: " , pp \@parse;
__DATA__
interface XYZ
  given param1 -> child of "interface XYZ"
  given param2 -> child of "interface XYZ"
    given param2.1 -> child of "given param2"
    given param2.2 -> child of "given param2"
      given param2.2.1 -> child of "given param2.2"
  given param3 -> child of "interface XYZ"
I extended your input to cover the case of a bigger indent gap ("given param3" drops back two levels at once).
Output: [
  {
    "interface XYZ" => [
      { "given param1" => [] },
      {
        "given param2" => [
          { "given param2.1" => [] },
          { "given param2.2" => [{ "given param2.2.1" => [] }] },
        ],
      },
      { "given param3" => [] },
    ],
  },
] at d:/tmp/parse_indent.pl line 55, <DATA> line 7.
You can uncomment various code sections to play with the debug output and different data-structure patterns. YMMV.
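As a rough illustration of one such alternative pattern, here is a minimal sketch with the commented-out verbose node layout switched on, so each node also records its level and the text after the arrow. The sub name parse_verbose and the idea of passing the lines in as a list (instead of reading DATA) are my own assumptions for the example, not part of the code above.

```perl
use strict;
use warnings;

# Hypothetical wrapper (name is an assumption): same loop as in the post,
# but each node stores children, level and content.
sub parse_verbose {
    my @lines = @_;
    my @parse;
    my @path = (\@parse);          # $path[$level] = array receiving nodes of that level
    my $last_level = 0;

    for my $line (@lines) {
        my ($white, $key, undef, $content) =
            $line =~ /^ (\s*) (.*?) ( \s*->\s* (.*) )? $/x;
        my $level = length($white) / 2;   # two spaces per indent level
        die "indent-level $level too big (last level was $last_level)!"
            if $level > $last_level + 1;

        my @children;
        push @{ $path[$level] }, {
            $key => {
                children => \@children,
                level    => $level,
                content  => $content,     # undef when the line has no arrow
            },
        };
        $path[$level + 1] = \@children;
        $last_level = $level;
    }
    return \@parse;
}

my $verbose = parse_verbose(
    'interface XYZ',
    '  given param1 -> child of "interface XYZ"',
);
my $child = $verbose->[0]{'interface XYZ'}{children}[0]{'given param1'};
print "$child->{level}: $child->{content}\n";
```

The price of the richer nodes is one more hash level to walk through when reading the tree back.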
Update
Some people prefer to avoid empty "children" arrays for leaf nodes.
In that case, skip the empty default array and let $path[$level] point to the upper container (the parent node), where you check whether an entry for children already exists.
Extending the code should be straightforward.
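One possible way to do that, as a minimal sketch only: nodes become hashes with a name entry, $path[$level] holds the parent node rather than a children array, and the children entry is autovivified by the first push. The sub name parse_indented, the name/children keys, and the blank-line skip are assumptions made for this example.

```perl
use strict;
use warnings;

# Hypothetical helper (name and node layout are assumptions):
# leaf nodes end up with no "children" entry at all.
sub parse_indented {
    my @lines = @_;
    my %root;                      # top-level container
    my @path = (\%root);           # $path[$level] = parent node for that level
    my $last_level = 0;

    for my $line (@lines) {
        next unless $line =~ /\S/; # skip blank lines (not in the original code)
        my ($white, $key) = $line =~ /^(\s*)(.*?)(?:\s*->.*)?$/;
        my $level = length($white) / 2;
        die "indent-level $level too big (last level was $last_level)!"
            if $level > $last_level + 1;

        my %node = (name => $key);
        # "children" springs into existence only when the first child arrives
        push @{ $path[$level]{children} }, \%node;
        $path[$level + 1] = \%node;
        $last_level = $level;
    }
    return $root{children} // [];
}

my $tree = parse_indented(
    'interface XYZ',
    '  given param1 -> child of "interface XYZ"',
    '  given param2 -> child of "interface XYZ"',
    '    given param2.1 -> child of "given param2"',
);
my $kids = $tree->[0]{children};
print "$_->{name}: ", (exists $_->{children} ? "has children" : "leaf"), "\n"
    for @$kids;
```

A consumer then tests exists $node->{children} instead of checking for an empty array.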