With respect to (?xism-xism:...), beware of the docs for this (my emphasis):

One or more embedded pattern-match modifiers, to be turned on (or turned off, if preceded by "-") for the remainder of the pattern or the remainder of the enclosing pattern group (if any).

If I understand that correctly, the "only look at openings" approach will fail on something like:

  /(?x:((?-x:)) # (comment)
  )/

  /((?-x:)) # (comment)
  /x
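
To make the failure concrete, here is a rough sketch, not anyone's actual code. The helper names engine_count and count_groups are made up for illustration, and the scanner deliberately ignores character classes, (?#...) comments and code blocks. It compares the engine's own group count (read from $#+ after a match that is guaranteed to succeed) with a scanner that only looks at openings, and with one that restores the /x state when a group closes.

```perl
use strict;
use warnings;

# Ask the regex engine how many capture groups a compiled pattern has.
# The empty alternative guarantees a successful match against "", after
# which $#+ holds the number of groups in the pattern.
sub engine_count {
    my ($re) = @_;
    "" =~ /$re|/ or die "cannot happen";
    return $#+;
}

# Hypothetical scanner. $x is the initial /x state. With $scoped false it
# only looks at openings, so (?-x:...) turns /x off for good; with $scoped
# true the enclosing state is restored when the group closes.
sub count_groups {
    my ($src, $x, $scoped) = @_;
    my ($count, @stack) = (0);
    pos($src) = 0;
    while (pos($src) < length $src) {
        if ($src =~ /\G\\./gcs) {
            # escaped character: never opens or closes a group
        }
        elsif ($x and $src =~ /\G#[^\n]*/gc) {
            # comment under /x: skipped up to the end of the line
        }
        elsif ($src =~ /\G\(\?([a-z]*)(?:-([a-z]*))?:/gc) {
            my ($on, $off) = ($1, $2 // '');
            push @stack, $x;              # remember the enclosing state
            $x = 1 if $on  =~ /x/;
            $x = 0 if $off =~ /x/;
        }
        elsif ($src =~ /\G\((?!\?)/gc) {
            push @stack, $x;
            $count++;                     # plain capturing group
        }
        elsif ($src =~ /\G\(/gc) {
            push @stack, $x;              # any other (?...) construct
        }
        elsif ($src =~ /\G\)/gc) {
            my $outer = pop @stack;
            $x = $outer if $scoped and defined $outer;
        }
        else {
            $src =~ /\G./gcs;             # any other character
        }
    }
    return $count;
}

my $inline  = "(?x:((?-x:)) # (comment)\n  )";   # first example above
my $flagged = "((?-x:)) # (comment)\n  ";        # second example, needs /x

for my $case ([$inline, 0], [$flagged, 1]) {
    my ($src, $x) = @$case;
    my $re = $x ? qr/$src/x : qr/$src/;
    printf "engine=%d naive=%d scoped=%d\n",
        engine_count($re),
        count_groups($src, $x, 0),
        count_groups($src, $x, 1);
}
```

For both patterns the engine and the scoped scanner should report one group, while the openings-only scanner reports two, because it never notices that /x is back on and so counts the "(comment)" in the comment as a capture.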

The simplistic (?{...}) parsing I referred to is in toke.c:scan_const(); look for the test

            else if (s[2] == '{' /* This should match regcomp.c */
                     || ((s[2] == 'p' || s[2] == '?') && s[3] == '{'))

which simply counts unescaped braces until the opening one is closed - something like:

  our $re_true = qr{(?=)}x;
  our $re_false = qr{(?!)}x;
  our $count;

  /
    # (?{ ... }) or (??{ ... }) or (legacy) (?p{ ... })
    \G \( \? (?: \? | p )? (?= \{ )
    (?{ local $count = 0; })
    (?:
       \{ (?{ local $count = $count + 1 })
    |
       \} (?{ local $count = $count - 1 })
    |
       \\ .
    |
       .
    )+?
    (??{ $count == 0 ? $re_true : $re_false })
  /xgc;

would be fitting, though I suspect there must be a simpler way.
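
One possibly simpler way is a plain loop with no embedded code blocks at all. This is only a sketch of the brace counting described above, not a transcription of toke.c; skip_code_block is a made-up name, and it assumes that counting unescaped braces is the only rule that matters.

```perl
use strict;
use warnings;

# Hypothetical helper: if a code block such as (?{ , (??{ or the legacy
# (?p{ starts at offset $start in $src, return the offset just past the
# brace that closes it; otherwise return undef.
sub skip_code_block {
    my ($src, $start) = @_;
    pos($src) = $start;
    # Consume "(?", an optional "?" or "p", and stop before the "{".
    $src =~ /\G\(\?[?p]?(?=\{)/gc or return;
    my $depth = 0;
    while ($src =~ /\G(?:\\.|([{}])|[^\\{}]+)/gcs) {
        next unless defined $1;           # escaped pair or ordinary text
        $depth += $1 eq '{' ? 1 : -1;     # only unescaped braces count
        return pos($src) if $depth == 0;  # the opening brace is closed
    }
    return;                               # ran out of input: unbalanced
}

my $src = 'a(?{ $x = "\}" . "{}" })b';
my $end = skip_code_block($src, 1);
print defined $end ? substr($src, $end) : "unbalanced", "\n";
```

The escaped brace inside the string is skipped as a pair, the inner {} is balanced, and the scan stops right after the brace that closes the block, leaving the closing parenthesis for the caller.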

(consider how lucky I am that the regular expression engine is not reentrant...)

Now now, no need for that sort of language.

Hugo

In reply to Re: Trying to count the captures in a compiled regular expression by hv
in thread Trying to count the captures in a compiled regular expression by BooK
