( prefix (meat)* suffix )*
is is possible for the prefix and suffix to be simultaneously (and repeatedly) empty? If so, then anything that matches /(meat){n}/ will be able to match at least 2^n different ways.
I'm sure that at the least a subset of exponential regexps should be discernible. I imagine the first stage of such an approach would be to iteratively strip out anything that can be zero-width, in which case it would be rather harder to detect that eg: /( (foo)* (?=b) (bar)? )* baz )/x
is linear. Ask me again when Perl6 has a full grammar implementation - it might be an interesting project to write such a detector targetting a pluggable regexp grammar.
Note that one generic approach to fixing such regexps is to use the 'cut' operator (?> ...), but as far as I am aware that operator is available only in perl itself and the ever-faithful PCRE. For example: /( (?> [^']* ) ('[^']*')? )* /x
avoids the exponential behaviour by disallowing backtracking into the [^']*.
Hugo |