PerlMonks  

Strange Regex Behavior

by Manchego (Acolyte)
on Dec 06, 2011 at 05:23 UTC

Manchego has asked for the wisdom of the Perl Monks concerning the following question:

OK, just noticed this odd behavior. Can someone shed some light on this?
$b = 'test 100';
%hash = (
    a => ($b =~ /(\d+)/ ? $1 : 0),
    b => ($b =~ /\d/ ? 1 : 0),
);
print "$hash{a}\n";
You'd think you'd get 100; instead you get undef.

Replies are listed 'Best First'.
Re: Strange Regex Behavior
by BrowserUk (Patriarch) on Dec 06, 2011 at 05:34 UTC

    Intriguing. If you quote the $1, you get the expected output. Quite what that tells us I'm not sure.

    Update: this also works:

    a => ($b =~ /(\d+)/ ? 0+$1: 0),

    As does this:

    a => ($b =~ /(\d+)/ ? do{ print $1; $1 }: 0),

    But not this:

    a => ($b =~ /(\d+)/ ? do{ $1 }: 0),

    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.

    The start of some sanity?

      Interesting. Similar to your +0, this seems to work:

              a => ($b =~ /(\d+)/ ? "$1" : 0),

      Could this be a Perl bug?

        It certainly looks that way to me.



Re: Strange Regex Behavior
by JavaFan (Canon) on Dec 06, 2011 at 10:36 UTC
    It's an optimization rearing its ugly head. When you use $1 (or any of its friends), no copy is actually made; there's a pointer that still needs to be followed (a pointer into the original string). However, since you are doing another match before querying $hash{a}, the structure in $1 has been overwritten. You have to treat $1 as if it were a reference (or maybe alias would be a better term).

    The "$1" forces a copy. Or you could swap the assignment:

    %hash = (
        b => ($b =~ /\d/ ? 1 : 0),
        a => ($b =~ /(\d+)/ ? $1 : 0),
    );

    Or not use $1:

    %hash = (
        a => ($b =~ /(\d+)/)[0] || 0,
        b => ($b =~ /\d/ ? 1 : 0),
    );

    It's related to the problem when passing $1 to a subroutine:

    $ perl -wE 'sub f {say @_} 3 =~ /(\d)/; f $1'
    3
    $ perl -wE 'sub f {"1" =~ "1"; say @_} 3 =~ /(\d)/; f $1'
    Use of uninitialized value $_[0] in say at -e line 1.
    $ perl -wE 'sub f {"1" =~ "1"; say @_} 3 =~ /(\d)/; f "$1"'
    3
    $

    $1 really is valid only till the next successful match.
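
JavaFan's subroutine case can also be made safe from inside the subroutine. A minimal sketch, assuming a hypothetical helper named f_safe (not from the thread): because @_ aliases the caller's $1, copying the argument into a lexical before any match runs in the sub preserves the value.

```perl
use strict;
use warnings;

# Hypothetical helper for illustration: copies its argument first,
# then runs an unrelated match, then returns the copy.
sub f_safe {
    my ($arg) = @_;     # plain copy, taken before the match below
    "1" =~ /1/;         # a successful match with no capture groups
    return $arg;        # the copy is unaffected by that match
}

"3" =~ /(\d)/;
my $result = f_safe($1);
print "$result\n";      # prints 3
```

The same habit (my ($arg) = @_; as the first statement) is what most subs do anyway, which is probably why this problem shows up so rarely in practice.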
Re: Strange Regex Behavior
by quester (Vicar) on Dec 06, 2011 at 06:34 UTC

    It might be a little more obvious like this:

    $b = 'test 100';
    %hash = (
        a => ($b =~ /(\d+)/ ? $1 : 0),
        b => ($b =~ /(\w+)/ ? 1 : 0),
    );
    print "$hash{a}\n";

    which prints

    test

    So, it did the second pattern match first, and interpreted $1 to be the result of that pattern match. It's the same sort of ambiguity that is found in, say, ($i++)+$i.

      I see the same output with perl 5.12.3. Perl seems to get confused about $1, because the following prints "100" without warnings:

      %hash = (
          a => ($b =~ /(\d+)/ ? $1 : 0),
          b => "test b",
      );
      print "$_=#$hash{$_}#\n" for keys %hash;

      And named capture seems to work fine.

      %hash = (
          a => (($b =~ /(?<tag>\d+)/) ? $+{tag} : 0),
          b => (($b =~ /(?<tag>test)/) ? $+{tag} : 0),
      );
      print "$_=#$hash{$_}#\n" for keys %hash;

      But I have no idea why named captures don't get confused...

      So, it did the second pattern match first, and interpreted $1 to be the result of that pattern match

      Then why does quoting $1 'fix' it?



        AFAIK, the order in which these subexpressions are evaluated is not defined. Adding one more operation (more or less any operation, "" or - or sqrt all work) does change the order of evaluation. But in the absence of some rule requiring the second and third operands of ?: to be evaluated after the first one rather than before, that's merely a detail of the implementation. I don't see any rule about it offhand in "conditional operator" in perlop.
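
Whatever the evaluation rules turn out to be, the question goes away if each match runs as its own statement and its captures are copied into lexicals before the hash is built. A minimal sketch along those lines; the variable names $text, $num and $has_digit are illustrative and not from the thread.

```perl
use strict;
use warnings;

my $text = 'test 100';

# In list context a match returns its captured substrings as ordinary
# values, so $num holds a copy that later matches cannot change.
my ($num) = $text =~ /(\d+)/;

# The second match runs as a separate statement, after the copy exists.
my $has_digit = $text =~ /\d/ ? 1 : 0;

my %hash = (
    a => defined $num ? $num : 0,
    b => $has_digit,
);

print "$hash{a}\n";   # prints 100
```

This never reads $1 at all, so there is nothing for a later match to overwrite.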
Re: Strange Regex Behavior
by vinian (Beadle) on Dec 06, 2011 at 10:01 UTC

    It seems to match the last 'key => value' pair first. I tried the code below several times and always got the same output.

    use strict;
    use warnings;
    use Data::Dumper;

    my $b = 'test 200';
    my %hash = (
        c => ( $b =~ /([a-z]+)/ ? $& : 0 ),
        b => ( $b =~ /(\d+)/ ? $` : 0 ),
        a => ( $b =~ /\s+/ ? $' : 0 ),
    );
    print Dumper(\%hash);

    Output:

    $VAR1 = {
              'c' => ' ',
              'a' => '200',
              'b' => 'test'
            };
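
The copy-forcing trick from earlier in the thread (quoting the variable) appears to apply to the match variables used here as well. A sketch, with the string renamed to $str purely for illustration: each value is interpolated into a fresh string right after its own match, so the later matches cannot change it.

```perl
use strict;
use warnings;

my $str = 'test 200';

# Quoting each match variable builds a new string immediately, instead
# of leaving the magical variable itself in the list.
my %hash = (
    c => ( $str =~ /([a-z]+)/ ? "$&" : 0 ),   # text of this match
    b => ( $str =~ /(\d+)/    ? "$`" : 0 ),   # text before this match
    a => ( $str =~ /\s+/      ? "$'" : 0 ),   # text after this match
);

print "$_ => '$hash{$_}'\n" for sort keys %hash;
```

With the copies in place, each key holds the value from its own match: 'test' for c, 'test ' (with the trailing space) for b, and '200' for a.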
