Re^2: Regex fun

in reply to Re: Regex fun
in thread Regex fun

I think it's important to note that \1 is not a variable (which is why you can't use it outside of a regex);

But you can, sometimes, use it in the replacement part.

I think it's important to note that \1 is not a variable (which is why you can't use it outside of a regex); the variable that contains the contents of the first capture group is $1, but that's empty until the capture has completed.

But in /([0-9]+){$1}/, the first capture is completed before the quantifier. So, that's not the reason.
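A minimal sketch of what `{$1}` does in practice, assuming ordinary Perl interpolation rules: `$1` is substituted into the pattern text before matching begins, so the quantifier sees whatever an earlier successful match left behind, not the capture made in the same pattern. The helper name `stale_count_match` and the sample strings are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: prime $1 with an unrelated match, then try a
# pattern that uses $1 as a repeat count.
sub stale_count_match {
    my ($text) = @_;

    # After this, $1 holds "2".
    'x2' =~ /x([0-9])/ or die "priming match failed";

    # $1 is interpolated while the pattern is being built, so the
    # quantifier below is {2} regardless of the digits in $text.
    return $text =~ /^([0-9]+)[ab]{$1}$/ ? 1 : 0;
}

print stale_count_match('3ab'),  "\n";   # count in text is 3, two letters follow
print stale_count_match('3aba'), "\n";   # count in text is 3, three letters follow
```

With the priming match in place, the first call succeeds and the second fails, which is the opposite of what the digit in the string suggests.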

For example, /\+32767.{32767}/ is rejected at compile time

Yes, but that's considered a bug. It's a restriction that should have been removed after the regexp engine was no longer recursive.

“Why, then,” you ask, “is something like /(.)\1/, which suffers from the same compilation problem, OK?”

That's not the same problem. {...} is one of the mini-languages inside regular expressions. Compare it with [...]. [\1] doesn't refer back to something else either.
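A short sketch of the [...] comparison, with made-up test strings: outside a bracketed class, \1 is a backreference; inside one, it is read as an octal escape for the character with code 1.

```perl
use strict;
use warnings;

# Outside a class: \1 refers back to what group 1 captured.
my $backref = 'aa' =~ /(a)\1/ ? 1 : 0;

# Inside a class: [\1] does not refer back to group 1, so "aa" fails.
my $class_vs_repeat = 'aa' =~ /(a)[\1]/ ? 1 : 0;

# Inside a class: [\1] matches the character with code 1.
my $class_vs_octal = "a\x01" =~ /(a)[\1]/ ? 1 : 0;

print "$backref $class_vs_repeat $class_vs_octal\n";   # prints "1 0 1"
```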

But one can defer a subpattern. The syntax is (??{ }). This is what the OP wants, and this is what the OP ought to use.
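A minimal sketch of the deferred-subpattern approach, assuming the OP wants to remove a "+N" marker together with the N characters that follow it. The alphabet in `$bases`, the sub name `strip_insertions`, and the sample input are illustrative assumptions, not taken from the original question.

```perl
use strict;
use warnings;

# Assumed alphabet; stands in for the OP's $bases.
our $bases = 'ACGT';

# Hypothetical helper: delete each "+N" and the N characters after it.
sub strip_insertions {
    my ($text) = @_;

    # The (??{ ... }) block runs at match time, after group 1 has
    # captured the count. The string it returns is compiled and
    # matched as a subpattern at that point.
    $text =~ s/\+([0-9]+)(??{ "[$bases]{$1}" })//g;

    return $text;
}

print strip_insertions('AC+3GGTTA'), "\n";   # prints "ACTA"
```

Because the code block is written literally in the pattern rather than interpolated from a variable, this form should not need `use re 'eval'`.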


Replies are listed 'Best First'.
Re^3: Regex fun by JadeNB (Chaplain) on Dec 15, 2009 at 20:22 UTC
But you can, sometimes, use it in the replacement part.

Sure, but you're not supposed to: see "Warning on \1 Instead of $1" in perlre.

But in `/([0-9]+){$1}/`, the first capture is completed before the quantifier. So, that's not the reason.

Sorry, I don't understand: not the reason for what?

It's a restriction that should have been removed after the regexp engine was no longer recursive.

Sorry, I don't understand this, either. Do you mean ‘re-entrant’? (UPDATE: Nope, just my internals-ignorance revealed. Thanks, ikegami!)
Re^4: Regex fun by ikegami (Patriarch) on Dec 15, 2009 at 20:30 UTC
Regarding the last point, the engine was re-engineered for 5.10. It used to use the C stack, so limits were imposed to prevent stack overflows. Now, the stack it uses is on the heap. The implementation moved away from a recursive model as part of the change.
Re^4: Regex fun by JavaFan (Canon) on Dec 15, 2009 at 22:13 UTC
Sorry, I don't understand: not the reason for what?

Quoting myself where I am quoting you:

the variable that contains the contents of the first capture group is $1, but that's empty until the capture has completed.

You're claiming $1 is "empty" until the capture has completed. I'm pointing out that, in the case of the OP, said first capture has completed.

Do you mean ‘re-entrant’?

No, I don't. The current regexp-engine isn't re-entrant.
Re^5: Regex fun by JadeNB (Chaplain) on Dec 15, 2009 at 22:41 UTC
You're claiming $1 is "empty" until the capture has completed. I'm pointing out that, in the case of the OP, said first capture has completed.

I guess that the quotes around ‘empty’ are to point out that, besides the unusual choice of word (in place of ‘undefined’), it's not true. Sorry, I'll correct that. I agree that Hena's second solution doesn't suffer from the problem that I mentioned; but the post particularly asks for a single-regex solution, and I was just mentioning why the obvious substitute, `/\+([0-9]+)[$bases]{$1}/`, for the non-working regex `/\+([0-9]+)[$bases]{\1}/`, doesn't work. (Nobody suggested it anyway, so I guess it was pretty unclear what I was talking about.)

No, I don't. The current regexp-engine isn't re-entrant.

Yes, which is why I thought that the final word in “the regexp engine was no longer recursive” might be ‘re-entrant’. :-) (I don't know enough history to know whether it ever was re-entrant, so, for all I knew, the grammar was correct.) I was particularly confused because Perl 5.10 newly allows for recursive regexes, which I confused with the regex engine itself being recursive; but ikegami clarified.
Re^6: Regex fun by JavaFan (Canon) on Dec 16, 2009 at 09:00 UTC
