Count capturing parentheses in a compiled regexp

by BooK (Curate)
on May 02, 2004 at 10:28 UTC ( #349794=snippet )
Description:

This subroutine counts the number of capturing parentheses in a compiled regular expression. Since the regexp is already compiled, we know it is syntactically valid, so we only have to count the opening parentheses.

Update: added /s in the while condition.

Update: This other discussion led to a much better solution, so I commented out the original code.

#sub captures {
#    local $_ = shift;
#    croak "$_ is not a compiled regexp" unless ref eq 'Regexp';
#    my $n = 0;
#    while( /\G(?=.)/gcs ) {
#      /\G[^\\(]*/gc;     # ignore uninteresting stuff
#      /\G(?:\\.)*/gc;    # ignore backslashed stuff
#      /\G\(\?/gc;        # ignore special regexps
#      /\G\(/gc && $n++;  # a capturing (, count it!
#    }
#    $n;
#}

sub captures { ( @_ = '' =~ /(@{[shift]})??/ ) - 1; }
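For readers unpacking the one-liner, here is a more verbose sketch of what it appears to do. It assumes that a successful match in list context returns one element per capturing group, including groups that did not take part in the match. Wrapping the regexp in one extra lazy optional group lets the match succeed on the empty string, and that extra group is subtracted again. The variable names are only illustrative.

```perl
use strict;
use warnings;

sub captures {
    my $re = shift;

    # (...)?? is a lazy optional group, so the match against ''
    # succeeds without ever trying $re itself.
    # In list context the match returns one (undef) element per
    # capturing group: our wrapper group plus every group in $re.
    my @groups = '' =~ /($re)??/;

    # Subtract the wrapper group that was added here.
    return @groups - 1;
}

print captures(qr/(a)(b(c))/), "\n";    # prints 3
print captures(qr/(?:a)[(]\(/), "\n";   # prints 0
```

Because the regex engine does the counting, parentheses inside character classes, escapes and non-capturing groups should not need any special handling.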
Replies are listed 'Best First'.
Re: Count capturing parentheses in a compiled regexp
by hv (Prior) on May 02, 2004 at 12:12 UTC

    Nice snippet, but a couple of problems: the outer lookahead needs to be //s, else eg:

    qr{( x )}x;
    will fail.

    Also, this will find parens in embedded code and comments and treat them as captures. If that doesn't seem worth worrying about, it'd be enough to add a caveat I guess; otherwise I think you can mimic perl's simplistic parsing reasonably easily for the code (just count to the balancing close-brace). Comments may actually be the trickiest, since you'll need to know when //x is in force:

    qr{ (?x: # (comment) ) (?-x: # (capture) ) }

    Oops, another one: parens in [ ... ] should be ignored too; I'm not sure how easy those would be to parse, since not every ] closes the selection.
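To make the character-class problem concrete, here is a small sketch comparing a deliberately naive text scan with a count taken from the regex engine. Both `naive_count` and `engine_count` are hypothetical names introduced for this illustration. The naive scanner is a simplification and not the original commented-out code, but it has the same blind spot.

```perl
use strict;
use warnings;

# Hypothetical naive counter: counts every "(" that is not preceded
# by a backslash and not followed by "?". It knows nothing about
# character classes, comments or embedded code.
sub naive_count {
    my $src = "$_[0]";    # stringify the compiled regexp
    my $n = () = $src =~ /(?<!\\)\((?!\?)/g;
    return $n;
}

# Engine-based count, same idea as the one-line captures() above.
sub engine_count {
    my $re = shift;
    my @groups = '' =~ /($re)??/;
    return @groups - 1;
}

my $re = qr/[(](x)/;      # one literal "(" in a class, one real capture
printf "naive: %d, engine: %d\n", naive_count($re), engine_count($re);
# prints "naive: 2, engine: 1"
```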

    Hugo

      Thanks a lot for finding these shortcomings in my code. :-) I'll submit updated versions as I correct them.

Re: Count capturing parentheses in a compiled regexp
by japhy (Canon) on May 02, 2004 at 16:03 UTC
    Once I get Regexp::Parser working (it's the update to YAPE::Regex), you'll be able to do this via:
    use Regexp::Parser;   # pushes itself to @Regexp::ISA
    print qr/.../->nparens;
    _____________________________________________________
    Jeff[japhy]Pinyan: Perl, regex, and perl hacker, who'd like a job (NYC-area)
    s++=END;++y(;-P)}y js++=;shajsj<++y(p-q)}?print:??;
Re: Count capturing parentheses in a compiled regexp
by japhy (Canon) on Jun 27, 2004 at 03:35 UTC
    In the spirit of "match the regex to get the number of parens in it", here's another way:
    sub nparens { "" =~ /|$_[0]/ and $#+ }
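A slightly expanded sketch of the same idea, with usage. It relies on the behaviour documented in perlvar that `$#+` gives the number of subgroups in the last successful match. The leading empty alternative lets the match succeed against the empty string regardless of what the regexp itself would match.

```perl
use strict;
use warnings;

sub nparens {
    my $re = shift;

    # The empty first alternative always matches "", so this
    # match succeeds without adding any group of its own.
    "" =~ /|$re/ or return;

    # $#+ is the number of groups in the pattern just matched.
    return $#+;
}

print nparens(qr/(a)|(b)(c)/), "\n";    # prints 3
print nparens(qr/no groups/), "\n";     # prints 0
```

Unlike the `(...)??` version, no wrapper group is added, so nothing has to be subtracted.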