Broken compatibility for recursive regex in Perl 5.8.4

by gmax (Abbot)
on May 24, 2004 at 23:10 UTC (#356076=perlmeditation)

I've just got a warning from CPAN testers about one of my modules failing to compile under Perl 5.8.4.

The code, in its simplest form, is almost the same as the example in perlre:

    #!/usr/bin/perl -w
    use strict;

    our $re = qr{
        \(
        (?:
            (?> [^()]+ )
          |
            (??{ $re })
        )*
        \)
    }x;

This code used to work fine from Perl 5.6.1 to 5.8.3.

Under the latest version (5.8.4) it breaks as follows:

Variable "$re" is not imported at (re_eval 1) line 2.
Global symbol "$re" requires explicit package name at (re_eval 1) line 2.
Compilation failed in regexp at line 12.

If I want to compile the same code, now, I need to either enclose the offending regex inside a "no strict 'vars'" block or call the variable with a package name:

    #!/usr/bin/perl -w
    use strict;

    our $re = qr{
        \(
        (?:
            (?> [^()]+ )
          |
            (??{ $main::re })
        )*
        \)
    }x;

With this correction the code compiles.
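The other workaround mentioned above, relaxing strict around the regex, might look like the following. This is only a sketch: the `do` block and the `@found` variable are illustrative choices, not taken from the module, and it assumes a Perl where `no strict 'vars'` is honoured inside the regex code block (5.8.4 or later, per the discussion here).

```perl
#!/usr/bin/perl
use strict;
use warnings;

our $re = do {
    no strict 'vars';    # relax only the variable check, only in this block
    qr{
        \(
        (?:
            (?> [^()]+ )     # a run of non-parentheses, no backtracking
          |
            (??{ $re })      # unqualified name resolves to $main::re here
        )*
        \)
    }x;
};

# @found is a hypothetical name used only for this demonstration.
my @found = 'aa bb (cc (dd) ee) ff ' =~ /($re)/g;
print "$_\n" for @found;
```

Compared with writing `$main::re`, this keeps the pattern free of a hard-coded package name, at the cost of switching off a check that would otherwise catch typos inside the block.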

Proof of concept:

    $_ = 'aa bb (cc (dd) ee) ff ';
    # wanting   ^----------^
    print "$1\n" while m/($re)/g ;
    __END__
    __OUTPUT__
    (cc (dd) ee)

I wonder, though, if the "correct" behavior was the previous one or the current one.

I mean, I can't use this code with a variable declared as "my $re = qr/ ....", because by the time it gets embedded inside the regex, its value is still undefined. The example in perlre is actually using a global variable, which won't work under "strict" rules.

Any thoughts?

update (1)
The previous behavior was "correct" in the technical sense. The compiler wouldn't complain and it would produce the expected results.

I remember trying the my $re; $re = qr//; arrangement in an earlier version of Perl, and discarding it because it wouldn't work. I tried to replicate the failure but I can't isolate a simple case. My module (Chess::PGN::Parse) works as expected with a fully qualified package name, but it breaks under this syntax.

My decision to use "our" came after seeing that TheDamian's Regexp::Common was implemented without "use strict" at all. I should have looked at it again after Abigail took over the module.

update (2)
Actually, it will also work with our $re; $re = qr/ ... $re.../; (Thanks dragonchild).
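A minimal sketch of that two-statement form, for reference. The sample strings and the `@groups` variable are made up for illustration; the pattern itself is the one from the post.

```perl
#!/usr/bin/perl
use strict;
use warnings;

our $re;    # the declaration is complete once this statement ends
$re = qr{ \( (?: (?> [^()]+ ) | (??{ $re }) )* \) }x;

# Hypothetical inputs: nested, sibling, and parenthesis-free text.
for my $text ('f(a (b) c)', 'x ((1)(2)) y (3)', 'no parens') {
    my @groups = $text =~ /($re)/g;
    print "$text => @groups\n";
}
```

Because `our $re;` is a statement of its own, the name is already declared when the `(??{ $re })` block is compiled, so strict has nothing to complain about.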

update (3)
I submitted a patched module to CPAN, and now (26-May-2004) the testers' reports look much better!

 _  _ _  _  
(_|| | |(_|><

Replies are listed 'Best First'.
Re: Broken compatibility for recursive regex in Perl 5.8.4
by dave_the_m (Monsignor) on May 25, 2004 at 00:30 UTC
    This is because until 5.8.4, the use strict; wasn't being propagated into the code within the regex; now it is, so it complains that the var hasn't been declared (because the declaration doesn't come into effect until the statement following the our).

    The error message is a bit misleading due to the way Perl code within regexes is currently handled - something I'm hoping to fix for 5.10.0.
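The scoping rule described in this reply can be checked without any regex at all. The sketch below uses string `eval` so that the compile-time failure can be caught and printed; the variable names `$v` and `$w` are arbitrary, and the exact wording of the error may differ between Perl versions.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# One statement: "our $v" is not yet in effect inside its own initializer,
# so strict rejects the unqualified $v at compile time.
my $one = eval q{ our $v = defined($v) ? 1 : 0; 1 };
my $one_error = $@;

# Two statements: the declaration is complete before $w is used.
my $two = eval q{ our $w; $w = defined($w) ? 1 : 0; 1 };
my $two_error = $@;

print defined $one ? "one statement: compiled\n"  : "one statement: $one_error";
print defined $two ? "two statements: compiled\n" : "two statements: $two_error";
```

The string evals inherit `use strict` from the enclosing file, which is the same propagation that the code inside the regex gained in 5.8.4.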


      Will the 5.8.4 scope changes fix this bug: Spurious re 'eval'; warning ?
      $ perl -Mre=eval -e '/ (??{ "(?{1})" }) $_ /x'
      Eval-group not allowed at runtime, use re 'eval' in regex m/(?{1})/ at -e line 1.

        No, for that you'll need to wait for the overhaul of the hints system referred to here. I believe Mark-Jason Dominus is currently working on this.


Re: Broken compatibility for recursive regex in Perl 5.8.4
by dragonchild (Archbishop) on May 25, 2004 at 01:36 UTC
    What about:
    our $re; $re = qr/ ... /;

    We are the carpenters and bricklayers of the Information Age.

    Then there are Damian modules.... *sigh* ... that's not about being less-lazy -- that's about being on some really good drugs -- you know, there is no spoon. - flyingmoose

    I shouldn't have to say this, but any code, unless otherwise stated, is untested

Re: Broken compatibility for recursive regex in Perl 5.8.4
by diotalevi (Canon) on May 24, 2004 at 23:33 UTC
    I can't see how the previous behaviour was correct. When I've written this sort of thing the obvious and normal thing (to me of course) was to declare the lexical on the preceding line and then allow the regexp to capture it in its closure. Similarly, my $foo; $foo = sub { ... $foo->( ... ) ... }. If you're going to have it operate on something that is operating by lexical rules, then expect to have to treat it by lexical rules.
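The declare-first pattern from this reply, sketched for both a recursive sub and a lexical regex. The factorial function and the sample string are illustrative only. The sketch assumes a reasonably recent Perl; as the original post notes, the lexical regex form did not behave reliably on some older releases.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Recursive closure: $fact is declared first so the sub body can see it.
my $fact;
$fact = sub {
    my ($n) = @_;
    return $n <= 1 ? 1 : $n * $fact->($n - 1);
};

# The same arrangement with a lexical regex that refers to itself.
my $re;
$re = qr{ \( (?: (?> [^()]+ ) | (??{ $re }) )* \) }x;

print $fact->(5), "\n";

my ($group) = 'aa (bb (cc)) dd' =~ /($re)/;
print "$group\n";
```

In both cases the variable is captured by name when the code is compiled, and its value is only looked up when the sub is called or the pattern is matched, by which time the assignment has happened.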

Node Type: perlmeditation [id://356076]
Approved by Zaxo
Front-paged by hsmyers