PerlMonks  

Re-use of a global match

by gube (Parson)
on Aug 01, 2007 at 08:49 UTC ( [id://629993]=perlquestion )

gube has asked for the wisdom of the Perl Monks concerning the following question:

Hi monks,

Please see the code below. I declared one variable and assigned a string to it. The same match succeeds the first time but fails the second time. What would be the reason?

#!/usr/local/bin/perl
use strict;
use warnings;

my $regexp = 'Perl Monks';

if($regexp =~ m/^Perl Monks/gi) {
    print "\nFound..";
}
else {
    print "\nNot Found..";
}

if($regexp =~ m/^Perl Monks/gi) {
    print "\nFound.1.";
}
else {
    print "\nNot Found.1.";
}

Update: Title changed as per ww's suggestion

Replies are listed 'Best First'.
Re: Re-use of a global match
by shmem (Chancellor) on Aug 01, 2007 at 09:16 UTC
    That's because of the /g switch in the first match. From perlop:
    The "/g" modifier specifies global pattern matching--that is, matching as many times as possible within the string. How it behaves depends on the context. In list context, it returns a list of the substrings matched by any capturing parentheses in the regular expression. If there are no parentheses, it returns a list of all the matched strings, as if there were parentheses around the whole pattern.

    In scalar context, each execution of "m//g" finds the next match, returning true if it matches, and false if there is no further match. The position after the last match can be read or set using the pos() function; see "pos" in perlfunc. A failed match normally resets the search position to the beginning of the string, but you can avoid that by adding the "/c" modifier (e.g. "m//gc"). Modifying the target string also resets the search position.

    Aha. In scalar context (boolean is scalar, and you are using the match in an if-expression) a /g match returns the next match. So the state of that match is kept (as the pos() of the string), and repeating the same match looks for the next match, and there is no further match. When the match fails, that state gets reset. Consider:
    use strict;
    use warnings;

    my $regexp = 'Perl Monks';

    if($regexp =~ m/^Perl Monks/gi) {
        print "\nFound..";
    }
    else {
        print "\nNot Found..";
    }

    if($regexp =~ m/^Perl Monks/gi) {
        print "\nFound.1.";
    }
    else {
        print "\nNot Found.1.";
    }

    if($regexp =~ m/^Perl Monks/gi) {
        print "\nFound.2.";
    }
    else {
        print "\nNot Found.2.";
    }
    print "\n";
    __END__

    Found..
    Not Found.1.
    Found.2.
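
To make the search position visible, here is a minimal sketch that repeats the same scalar-context match and records pos() after each attempt. The names $text, @trace and $attempt are made up for illustration; the string and pattern are the ones from the question.

```perl
use strict;
use warnings;

my $text = 'Perl Monks';   # same string as in the question
my @trace;                 # one line per match attempt

for my $attempt (1 .. 3) {
    if ($text =~ m/^Perl Monks/gi) {
        # success: pos() now points just past the match
        push @trace, "attempt $attempt: found, pos=" . pos($text);
    }
    else {
        # failure without /c: pos() is reset to undef
        push @trace, "attempt $attempt: not found, pos="
            . (defined pos($text) ? pos($text) : 'undef');
    }
}

print "$_\n" for @trace;
# prints:
# attempt 1: found, pos=10
# attempt 2: not found, pos=undef
# attempt 3: found, pos=10
```

The failed second attempt is what rewinds the string, which is why the third attempt matches again.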

    --shmem

    _($_=" "x(1<<5)."?\n".q·/)Oo.  G°\        /
                                  /\_¯/(q    /
    ----------------------------  \__(m.====·.(_("always off the crowd"))."·
    ");sub _{s./.($e="'Itrs `mnsgdq Gdbj O`qkdq")=~y/"-y/#-z/;$e.e && print}

      Thanks for the above replies..

      Hi shmem

      I added the /c modifier to the match as well; please check the code below. Still, the pos() for $regexp is not reset. Could you please help me reset the position for the $regexp variable using a modifier?
      #!/usr/local/bin/perl
      use strict;
      use warnings;

      my $regexp = 'Perl Monks';

      print "\nPos...", pos($regexp);
      if($regexp =~ m/^Perl Monks/gci) {
          print "\nFound..";
      }
      else {
          print "\nNot Found..";
      }

      print "\nPos..1.", pos($regexp);
      if($regexp =~ m/^Perl Monks/gi) {
          print "\nFound.1.";
      }
      else {
          print "\nNot Found.1.";
      }
        I added modifier /c also in the match. Please check the below code. But, still the pos() for regexp is not reset.

        which is the expected behaviour, since (perlop again):

        Options are:
        c  Do not reset search position on a failed match when /g is in effect.
        g  Match globally, i.e., find all occurrences.

        The pos() is retained in all cases until the next match is attempted, so you have to remove the /g modifier on the second match, as casiano correctly noted, which will reset pos() just before the regexp engine tries to match the second time.
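
Since the perlop text quoted earlier says pos() can be set as well as read, another way out is to rewind the position by hand. This is only a sketch under that assumption, with made-up variable names ($text, $first, $retry, $again); it also shows that /c does the opposite of resetting, it keeps pos() after a failed match.

```perl
use strict;
use warnings;

my $text = 'Perl Monks';

my $first = ($text =~ m/^Perl Monks/gci) ? 1 : 0;   # matches, pos($text) is now 10
my $retry = ($text =~ m/^Perl Monks/gci) ? 1 : 0;   # fails: ^ cannot match at offset 10
my $pos_after_retry = pos($text);                   # still 10, because /c kept it

pos($text) = 0;                                     # rewind the search position by hand

my $again = ($text =~ m/^Perl Monks/gi) ? 1 : 0;    # starts at offset 0 and matches

print "first=$first retry=$retry pos=$pos_after_retry again=$again\n";
# prints: first=1 retry=0 pos=10 again=1
```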

        --shmem

Re: Re-use of a global match
by akho (Hermit) on Aug 01, 2007 at 08:59 UTC
    Read up on the /g modifier. The second match only starts after the place the first one matched.
Re: Re-use of a global match
by casiano (Pilgrim) on Aug 01, 2007 at 09:06 UTC
    The reason is the "g" (global) option. The "g" option means that the next search will start where the last search succeeded. There is a counter associated with the string being searched that can be accessed using the "pos" function. The search starts from "pos($regexp)". See a modified version of your code:

    #!/usr/local/bin/perl
    use strict;
    use warnings;

    my $regexp = 'Perl Monks';

    if($regexp =~ m/^Perl Monks/gi) {
        print "\nFound..";
    }
    else {
        print "\nNot Found..";
    }

    print "\npos = ".pos($regexp)."\n";

    if($regexp =~ m/^Perl Monks/gi) {
        print "\nFound.1.";
    }
    else {
        print "\nNot Found.1.";
    }

    when you run the code you obtain:

    $ perl /tmp/prueba.pl

    Found..
    pos = 10
    Since "pos" is now 10 the regexp will not match from that position.
    Try to eliminate the "g" option the second time and the regexp will succeed.
Re: Re-use of a global match
by moritz (Cardinal) on Aug 01, 2007 at 09:01 UTC
    You are using the g modifier.

    That means the first time you are looking for the first match, the second time for the second match, but there is no second match (and there never will be, because your regexp is anchored).

    See perlre and perlop for details.
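
A small sketch of the anchoring point, using list context so that /g collects every match at once. The test string and the names @anchored and @unanchored are invented for the example.

```perl
use strict;
use warnings;

my $text = 'Perl Monks, perl monks';

# With ^ (and no /m) the pattern can only match at offset 0,
# so /g never finds a second occurrence.
my @anchored   = $text =~ m/^Perl Monks/gi;

# Without the anchor, /g finds both occurrences.
my @unanchored = $text =~ m/Perl Monks/gi;

print scalar(@anchored), " anchored, ", scalar(@unanchored), " unanchored\n";
# prints: 1 anchored, 2 unanchored
```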

Re: Re-use of a global match
by ww (Archbishop) on Aug 01, 2007 at 10:28 UTC

    Do gube's original question and the fine answers above not also point to some optimization (a single instance in the symbol table of the two identical if {...} clauses)? Note that a third test using a variant if (no 'g') produces a match there, while repeating the 'g' version on a different variable matches in test 3 but finds no match in test 4.

    #!/usr/local/bin/perl
    use strict;
    use warnings;

    my $regexp = 'Perl Monks';
    my $regexp2 = 'Perl Monks is a wonderful place to visit'; # even if Node Reaper is sometimes a bit loud

    if($regexp =~ m/^Perl Monks/gi) {
        print "\nFound.0.";
    }
    else {
        print "\nNot Found.0.";
    }

    if($regexp =~ m/^Perl Monks/gi) {
        print "\nFound.1.";
    }
    else {
        print "\nNot Found.1.";
    }

    if($regexp =~ m/^Perl Monks/i) { # No 'g'
        print "\nFound.2.";
    }
    else {
        print "\nNot Found.2.";
    }

    if($regexp2 =~ m/^Perl Monks/gi) { # 'g' is back, but different $var being tested
        print "\nFound.3.";
    }
    else {
        print "\nNot Found.3.";
    }

    if($regexp2 =~ m/^Perl Monks/gi) { # second identical use of $regex2 and match
        print "\nFound.4.";
    }
    else {
        print "\nNot Found.4.";
    }
    produces:
    perl gube_g.pl

    Found.0.
    Not Found.1.
    Found.2.
    Found.3.
    Not Found.4.
      Do gube's original question and the fine answers above not also point to some optimization (a single instance in the symbol table of the two identical if {...} clauses)?

      No*. It all has to do with match positions not being reset until a match fails, if /g is in effect.

      Without /g, those internal state variables are reset before each match; with /g, they are reset after a match fails.

      *) which doesn't mean there isn't such optimization, merely that the behaviour gives no evidence about that.
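
One way to check that the state lives with the string being searched (as casiano put it, a counter associated with the string) rather than with a particular if clause is to use two different patterns on one variable and the same pattern on another. A rough sketch, with invented names $text, $other, $first, $second and $third:

```perl
use strict;
use warnings;

my $text  = 'Perl Monks';
my $other = 'Perl Monks';

my $first  = ($text =~ m/Perl/g) ? 1 : 0;   # matches, pos($text) becomes 4
my $pos1   = pos($text);

# A different pattern in a different place, but the same string:
# the search starts at offset 4, and there is no 'P' after that.
my $second = ($text =~ m/P/g) ? 1 : 0;

# The same pattern as the first match, but a fresh string:
# its pos() is undef, so the search starts at offset 0.
my $third  = ($other =~ m/Perl/g) ? 1 : 0;

print "first=$first pos=$pos1 second=$second other=$third\n";
# prints: first=1 pos=4 second=0 other=1
```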

      --shmem

Re: Re-use of a global match
by balaji_red83 (Acolyte) on Aug 01, 2007 at 13:25 UTC
    It is better not to use the /g modifier when performing a match operation.
      That statement needs to be qualified. It is far from always true. The /g modifier is part of Perl for a reason and there are many things one can do with it that are much more difficult without it.
      $foo = (join ', ', 'abacadaeafagahaiaj' =~ m/a/g );
      print $foo . "\n";
        /g can be useful. But if you just want to know if something matches (which for me is most of the time), you don't need it.

        The cases where m///g is useful are where:

        • you need to save the position of the last match, or
        • you're counting or saving the matches for later use (assigning the result of m/// to a variable)

        In practice, I find myself using /g only in substitutions.
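
The two cases in the list above might look like this in practice; a sketch only, reusing the sample string from the earlier reply, with made-up names @offsets and $count. It uses $-[0], the start offset of the last successful match.

```perl
use strict;
use warnings;

my $text = 'abacadaeafagahaiaj';

# Case 1: walk the matches one at a time in scalar context,
# remembering where each one started.
my @offsets;
while ($text =~ m/a/g) {
    push @offsets, $-[0];
}

# Case 2: count the matches by assigning the list-context
# result to an empty list, then to a scalar.
my $count = () = $text =~ m/a/g;

print "$count matches at offsets @offsets\n";
# prints: 9 matches at offsets 0 2 4 6 8 10 12 14 16
```

For a plain yes-or-no test neither of these is needed, which is the point made above.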

Node Type: perlquestion [id://629993]
Approved by Corion
Front-paged by Corion