
While behavior

by SavannahLion (Pilgrim)
on Feb 15, 2014 at 03:34 UTC (id://1075017, perlquestion)

SavannahLion has asked for the wisdom of the Perl Monks concerning the following question:

I was knocking my head against a bug in my code when I came across Perlish behavior I don't really understand. I reduced the code to the bare minimum to best show the behavior.

#!perl
use strict;
use warnings;

my $data = 'some
random
data
to
test
this';

while (my ($line) = $data =~ m/^(.*)$/gm) {
    print "<0>" . $line ."\n";
}

The above code will output indefinitely:

<0>some
.
.
<0>some
<0>some

Now, if we modify the code just a little bit, we get the following code and behavior, which is what I wanted:

#!perl
use strict;
use warnings;

my $data = 'some
random
data
to
test
this';

while ($data =~ m/^(.*)$/gm) {
    my $line = $1;
    print "<1>" . $line ."\n";
}

The above will output what I expect:

<1>some
<1>random
<1>data
<1>to
<1>test
<1>this

To get it out of the way: someone might ask why I don't just do the below instead, since the behavior is the same as above. Without going into the dirty details, yes, that's true. I just want to preserve $_ deeper within the while{} block.

while ($data =~ m/^(.*)$/gm) {
    print "<2>" . $1."\n";
}

Can someone explain why the first code sample doesn't work as I would expect? The only thing I can think of is that the RegEx pointer isn't being preserved on each iteration, but I don't understand why....

Replies are listed 'Best First'.
Re: While behavior
by LanX (Saint) on Feb 15, 2014 at 03:41 UTC
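    The first while calls the match in list context: you get all matches at once, but keep only the first line.

    Because a list-context m//g runs through the whole string and then resets, every iteration starts again from the beginning, so the match is never empty and the condition is always true.

    while normally imposes scalar context on its condition, but the list assignment my ($line) = ... puts the match in list context and breaks that logic.

    Just try grabbing more vars to see what is happening: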

    DB<118> while (my ($line1,$line2) = $data =~ m/^(.*)$/gm) { print "$line1 $line2\n"; last if $x++>5 }
    some random
    some random
    some random
    some random
    some random
    some random
    some random
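
    The same effect can be seen outside the debugger. The following is only a minimal sketch, assuming $data holds the six words one per line as in the question; the names @all, $count and $pos are illustrative and not from the original post. It shows that the list-context match returns everything at once, that the list assignment used as a condition evaluates to the number of matches, and that the search position is reset afterwards.

```perl
#!perl
use strict;
use warnings;

# Assumption: the test data has one word per line, as the question's output suggests.
my $data = "some\nrandom\ndata\nto\ntest\nthis";

# List context: m//g returns every capture in one go.
my @all = $data =~ m/^(.*)$/gm;
print scalar(@all), " matches: @all\n";

# A list assignment evaluated in scalar context yields the number of
# elements on its right-hand side, which is what the while condition tests.
my $count = () = $data =~ m/^(.*)$/gm;
print "condition value: $count\n";

# After a list-context m//g has run to the end, pos() is reset (undef),
# so the next evaluation starts from the beginning of the string again.
my $pos = pos($data);
print "pos after list match: ", (defined $pos ? $pos : 'undef'), "\n";
```

    Since the condition value is always 6 here and the position never carries over, the loop in the question cannot terminate on its own.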

    Cheers Rolf

    ( addicted to the Perl Programming Language)

      Actually that does help. It also puts things like while(<FH>) in a better context.

Re: While behavior
by Athanasius (Archbishop) on Feb 15, 2014 at 04:10 UTC
    I just want to preserve $_ deeper within the while{} block

    The code shown has no effect on $_. Did you mean $1? If so, I can’t see any problem with using $1 at the top of the loop, then redefining it lower down as needed. However, if you do want to dispense with $1, you could use a named capture:

    while ($data =~ /^(?<line>.*)$/gm) {
        print "<3>" . $+{line} . "\n";
    }

    See Capture groups.

    Hope that helps,

    Athanasius <°(((><contra mundum Iustus alius egestas vitae, eros Piratica,

      Oh sorry, you're absolutely right. That was a bad edit. I'd blame my right hand fingers being bound up, but it's just sloppy proof reading. :\
      Yes, I really meant $1. The code I culled out reuses the string captured in $1 because I'm having problems crafting a single line Regex that parses the string correctly (A subject for a different posting). I was trying to fix that bug when I came across this bug.
      Now that I understand why I have this bug, I can focus on that bug. :)

Re: While behavior
by tobyink (Canon) on Feb 15, 2014 at 07:56 UTC

    In the first example, you are calling m//g in list context. In the second example, you are calling it in scalar context. The different behaviours of m//g in list and scalar context are documented in perlop.

    Relevant quote:

    "The /g modifier specifies global pattern matching--that is, matching as many times as possible within the string. How it behaves depends on the context. In list context, it returns a list of the substrings matched by any capturing parentheses in the regular expression. If there are no parentheses, it returns a list of all the matched strings, as if there were parentheses around the whole pattern.

    "In scalar context, each execution of m//g finds the next match, returning true if it matches, and false if there is no further match."

    You're relying on the scalar context behaviour, but calling it in list context.
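
    To make the two contexts concrete, here is a short sketch, again assuming the one-word-per-line test data from the question; the array names are hypothetical. The while loop keeps the match in scalar context and copies $1 into a named variable, while the for loop uses list context on purpose and iterates over the returned list exactly once.

```perl
#!perl
use strict;
use warnings;

# Assumption: one word per line, as in the question's expected output.
my $data = "some\nrandom\ndata\nto\ntest\nthis";

# Scalar context: each evaluation finds the next match, then fails at the end.
my @scalar_lines;
while ($data =~ m/^(.*)$/gm) {
    my $line = $1;              # copy $1 before any later match overwrites it
    push @scalar_lines, $line;
}

# List context, used deliberately: the match runs once and for walks the result.
my @list_lines;
for my $line ($data =~ m/^(.*)$/gm) {
    push @list_lines, $line;
}

print "scalar: @scalar_lines\n";
print "list:   @list_lines\n";
```

    Either form gives a named $line without the endless loop; which one reads better depends on whether the loop body needs pos() or other per-match state.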

    use Moops; class Cow :rw { has name => (default => 'Ermintrude') }; say Cow->new->name