note
golux
Hi dpelican,
<p>I think this will do what you need. It uses the <i>recursive subpatterns</i> described in [http://perldoc.perl.org/perlre.html|the perlre documentation] (see the section on 'PARNO'). By the time I got my example working, I saw that [roboticus] had already mentioned them. (I had never used recursive regexes before, so it was a good learning experience for me.)
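A minimal sketch of the recursion on its own, assuming only that parentheses must balance (the name <c>$balanced</c> and the sample strings are made up for illustration and are not part of the script below):
<c>
use strict;
use warnings;
use feature qw( say );

# Group 1 matches one balanced (...) group. (?1) re-enters group 1
# whenever a nested opening paren is encountered.
my $balanced = qr{
    (                       # Paren group 1 -- one balanced group
        \(
        (?:
            (?> [^()]+ )    # Run of non-parens without backtracking
            |
            (?1)            # Recurse to start of paren group 1
        )*
        \)
    )
}x;

# Hypothetical sample inputs: one balanced, one missing a closing paren,
# one with no parens at all.
for my $s ('(a, char(6))', '(a, char(6)', 'no parens') {
    if ($s =~ /\A$balanced\z/) {
        say "balanced:   $s";
    }
    else {
        say "unbalanced: $s";
    }
}
</c>
Group 1 is the whole pattern here, so <c>(?1)</c> re-enters it at every nested opening parenthesis. The full script applies the same idea to its groups 5 and 7.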
</p><p><b>Edit</b>: Fixed some comments (specifically capture group numbering), and captured a little bit more.
</p><p><b>Edit 2</b>: Added output.
</p><p><b>Edit 3</b>: Allow keyword '<c>report</c>' (somehow missed it the first time).
<c>
#!/usr/bin/perl
#
# References:
# http://perldoc.perl.org/perlre.html (See section on 'PARNO')
##
use strict;
use warnings;
use feature qw( say );
use Method::Signatures;
##################
## Main Program ##
##################
my $str = 'private function convert_wa_date_strings(iv_beg string, iv_end string, iv_read_date date, iv_step char(6)) returns (date, date, char(1))';
recursive_function_parsing_regex($str);
#################
## Subroutines ##
#################
func recursive_function_parsing_regex($str) {
    my $re = qr{
        (                               # Paren group 1 -- full function
            (?:
                (private|public)        # Paren group 2 -- optional 'private' or 'public'
            \s+)?
            (function|report)           # Paren group 3 -- required 'function' or 'report' keyword
            \s*                         # Optional space after the keyword
            (\w+)                       # Paren group 4 -- function name
            (                           # Paren group 5 -- args in parens
                \(
                (                       # Paren group 6 -- contents of parens
                    (?:
                        (?> [^()]+ )    # Non-parens without backtracking
                        |
                        (?5)            # Recurse to start of paren group 5
                    )*
                )
                \)
            )
            (?:                         # Optional return value
                \s+
                returns\s*
                (                       # Paren group 7 -- return args in parens
                    \(
                    (                   # Paren group 8 -- return args
                        (?:
                            (?> [^()]+ )    # Non-parens without backtracking
                            |
                            (?7)            # Recurse to start of paren group 7
                        )*
                    )
                    \)
                )
            )?
        )
    }x;
    if ($str !~ /$re/) {
        say "No match for '$str'";
        return;
    }
    # Groups 2, 7 and 8 are optional, so default them to the empty string
    my ($full, $pp, $func, $name, $par, $args, $ret, $rargs)
        = ($1, $2 || "", $3, $4, $5, $6, $7 || "", $8 || "");
    say "Match!";
    say " \$full => '$full'"; # Full expression
    say " \$pp => '$pp'"; # Optional 'private' or 'public' keyword
    say " \$func => '$func'"; # 'function' or 'report' keyword
    say " \$name => '$name'"; # Function name
    say " \$par => '$par'"; # Func args (in parens)
    say " \$args => '$args'"; # Func args (no parens)
    say " \$ret => '$ret'"; # Optional return args (in parens)
    say " \$rargs => '$rargs'"; # Optional return args (no parens)
}
</c>
</p><p>Result:
<c>
Match!
$full => 'private function convert_wa_date_strings(iv_beg string, iv_end string, iv_read_date date, iv_step char(6)) returns (date, date, char(1))'
$pp => 'private'
$func => 'function'
$name => 'convert_wa_date_strings'
$par => '(iv_beg string, iv_end string, iv_read_date date, iv_step char(6))'
$args => 'iv_beg string, iv_end string, iv_read_date date, iv_step char(6)'
$ret => '(date, date, char(1))'
$rargs => 'date, date, char(1)'
</c>
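</p><p>Numbered groups are easy to miscount once a pattern grows. One possible alternative, sketched here on the assumption that Perl 5.10 or later is available, is to name the groups and recurse with <c>(?&par)</c> instead of <c>(?5)</c>. The sub name <c>parse_signature</c>, the capture names and the sample string are hypothetical, and a plain <c>sub</c> is used instead of Method::Signatures so that only core Perl is needed:
<c>
use strict;
use warnings;
use feature qw( say );

my $re = qr{
    (?: (?<scope> private | public ) \s+ )?     # Optional visibility keyword
    function \s+
    (?<name> \w+ ) \s*                          # Function name
    (?<par>                                     # Args in parens
        \(
        (?<args>                                # Args without the outer parens
            (?: (?> [^()]+ ) | (?&par) )*       # Non-parens, or recurse into 'par'
        )
        \)
    )
    (?:                                         # Optional return value
        \s+ returns \s*
        (?<ret>                                 # Return args in parens
            \(
            (?<rargs>                           # Return args without the outer parens
                (?: (?> [^()]+ ) | (?&ret) )*   # Non-parens, or recurse into 'ret'
            )
            \)
        )
    )?
}x;

# Returns a hash of the named pieces, or an empty list when nothing matches.
sub parse_signature {
    my ($str) = @_;
    return unless $str =~ $re;
    return (
        scope => $+{scope} // '',
        name  => $+{name},
        args  => $+{args},
        rargs => $+{rargs} // '',
    );
}

my %sig = parse_signature('public function f(a char(6)) returns (date)');
say "$_ => '$sig{$_}'" for sort keys %sig;
</c>
With names, adding or removing a group does not shift the recursion targets the way it can with <c>(?5)</c> and <c>(?7)</c>.
</p>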
1218162