With respect to (?xism-xism:...), beware of the docs for this (my emphasis):
One or more embedded pattern-match modifiers, to be turned on (or turned off, if preceded by "-") for the remainder of the pattern or the remainder of the enclosing pattern group (if any).
If I understand that correctly, the "only look at openings" approach will fail on something like:
/(?x:((?-x:)) # (comment)
)/
or
/((?-x:)) # (comment)
/x
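To make the failure concrete, here is a small sketch that compares a naive count of opening parens with what perl actually compiles. I am assuming the "only look at openings" idea amounts to something like naive_groups below; both sub names are made up for this illustration and are not from the parent post.

```perl
use strict;
use warnings;

# Naive count (hypothetical): every unescaped "(" not followed by "?" is
# assumed to open a capture group. It has no idea where /x is in effect.
sub naive_groups {
    my ($src) = @_;
    my $n = () = $src =~ /(?<!\\)\((?!\?)/g;
    return $n;
}

# Actual count: compile the pattern and ask perl. The empty alternative
# guarantees a successful match, after which $#+ is the number of
# capture groups in the pattern.
sub actual_groups {
    my ($re) = @_;
    "" =~ /$re|/ or die "empty alternative should always match";
    return $#+;
}

my $inner = "(?x:((?-x:)) # (comment)\n)";   # first example, no outer /x
my $outer = "((?-x:)) # (comment)\n";        # second example, compiled with /x

printf "inner: naive=%d actual=%d\n",
    naive_groups($inner), actual_groups(qr/$inner/);
printf "outer: naive=%d actual=%d\n",
    naive_groups($outer), actual_groups(qr/$outer/x);
```

In both cases the naive count is 2 and the compiled pattern has 1 capture group: once the (?-x:) group closes, /x is back in effect, so "# (comment)" is a comment and its parens never reach the regex compiler.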
The simplistic (?{...}) parsing I referred to is in toke.c:scan_const(); look for the test
else if (s[2] == '{' /* This should match regcomp.c */
|| ((s[2] == 'p' || s[2] == '?') && s[3] == '{'))
which simply counts unescaped braces until the opening one is closed - something like:
our $re_true = qr{(?=)}x;
our $re_false = qr{(?!)}x;
our $count;
/
    # (?{ ... }) or (??{ ... }) or (legacy) (?p{ ... })
    \G \( \? (?: \? | p )? (?= \{ )
    (?{ local $count = 0; })
    (?:
        \{ (?{ local $count = $count + 1 })
    |
        \} (?{ local $count = $count - 1 })
    |
        \\ .
    |
        .
    )+?
    (??{ $count == 0 ? $re_true : $re_false })
/xgc;
would be fitting, though I suspect there must be a simpler way.
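For comparison, the same brace counting can be written as a plain loop with no regex code blocks. This is only a sketch of the idea as described above, not the code in toke.c; skip_braces is a hypothetical helper name, and it assumes the caller already knows the offset of the opening brace.

```perl
use strict;
use warnings;

# Hypothetical helper: $pos must be the offset of the "{" that opens the
# code block. Returns the offset just past the matching "}", or undef if
# $pos is not at a "{" or the braces never balance.
sub skip_braces {
    my ($str, $pos) = @_;
    return undef unless substr($str, $pos, 1) eq '{';

    my $depth = 0;
    my $len   = length $str;
    while ($pos < $len) {
        my $c = substr $str, $pos, 1;
        if ($c eq '\\') {
            $pos += 2;              # a backslash hides the next character
            next;
        }
        $depth++ if $c eq '{';
        $depth-- if $c eq '}';
        $pos++;
        return $pos if $depth == 0; # the opening brace has been closed
    }
    return undef;                   # ran off the end with braces still open
}

# Nested braces are handled: the block ends just before the final ")".
print skip_braces('(?{ if (1) { 2 } })', 2), "\n";

# A "}" inside a string literal fools the count unless it is escaped.
print skip_braces('(?{ "}" })', 2), "\n";
print skip_braces('(?{ "\}" })', 2), "\n";
```

The second call shows why counting like this is simplistic: it knows nothing about Perl quoting, so an unescaped brace inside a string ends the block early.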
(consider how lucky I am that the regular expression engine is not reentrant...)
Now now, no need for that sort of language.
Hugo