http://qs321.pair.com?node_id=96369

iakobski has asked for the wisdom of the Perl Monks concerning the following question:

Now I usually only use $_ in very small blocks of code, such as in a map or grep block. If the section of code is large or is likely to grow, then it is much clearer to assign to a named variable, and this helps to document the code.

Today I was extending a bit of code that uses $_ absolutely everywhere and there was a point where the behaviour was a bit difficult to understand. I have always thought that you could assume that the magic of $_ made it work so that you could use it as if it were lexically scoped. So when you run this:

use strict;

for (1..3){
    print "BEFORE: " . $_;
    foo();
    print " AFTER: " . $_ . "\n";
}

sub foo{
    for(<DATA>){
        last;
    }
}

__DATA__
a
b
c
the fact that there is a call to foo() should make no difference to the value printed. And as expected, it doesn't:
BEFORE: 1 AFTER: 1
BEFORE: 2 AFTER: 2
BEFORE: 3 AFTER: 3
========== [C:\users\jake\code\komodo\test2.pl] run finished. ==========

So what's the problem? Well if you change foo() to read:

sub foo{
    while(<DATA>){
        last;
    }
}
i.e. use while instead of for, then the output is:
BEFORE: 1 AFTER: a
BEFORE: 2 AFTER: b
BEFORE: 3 AFTER: c
========== [C:\users\jake\code\komodo\test2.pl] run finished. ==========
Now I found this quite surprising. So my questions are:

-- iakobski

Re: When is $_ local and when is it not?
by Abigail (Deacon) on Jul 13, 2001 at 17:17 UTC
    The answers are:
    • Yes
    • Yes
    • Just because.... ;-)
    In a few cases, $_ is implicitly localized: in the block/expression of map and grep, and in the block of a foreach that doesn't have its own iterator mentioned. (If the iterator is mentioned, with no my preceding it, the iterator is localized too.) This is all documented where foreach, map and grep are documented: perlsyn and perlfunc.

    There is a difference between foreach and while because they are two totally different things. foreach always assigns to a variable when looping over a list, whereas while normally doesn't. It's just that while (<>) is an exception: only when the condition is a single diamond operator is there an implicit assignment to $_.

    -- Abigail
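
The difference described above can be reproduced without a __DATA__ section. The following is only a minimal sketch: the in-memory filehandle, the sample text and the helper names (read_with_foreach, read_with_while, value_after) are made up for illustration and do not come from the original post.

```perl
use strict;
use warnings;

# Illustrative input; an in-memory filehandle stands in for a real file.
my $text = "a\nb\nc\n";

sub read_with_foreach {
    open my $fh, '<', \$text or die "open failed: $!";
    for (<$fh>) { last }      # foreach aliases $_ and restores it on exit
    close $fh;
}

sub read_with_while {
    open my $fh, '<', \$text or die "open failed: $!";
    while (<$fh>) { last }    # plain assignment to the global $_
    close $fh;
}

# Calls $reader from inside a foreach loop that uses $_ and reports
# what $_ holds after the call returns.
sub value_after {
    my ($reader) = @_;
    my @items = (1);          # a modifiable element, so the assignment cannot die
    my $seen;
    for (@items) {
        $reader->();
        $seen = $_;
    }
    return $seen;
}

print "after foreach: ", value_after(\&read_with_foreach), "\n";
print "after while:   ", value_after(\&read_with_while);
```

The first call should leave the caller's $_ at 1, while the second should leave it holding the first line read ("a" plus its newline), which is the behaviour the original question ran into.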

      Thanks for the explanation. I guess a more thorough reading of the docs should have sorted this out for me.

      However, do you think this is the way it should work? After all you don't expect your variables to suddenly change as you go through a function. I know $_ is a global, but I imagine a lot of people write code like the stuff I was maintaining where there is a call to a sub in the middle of using $_. And then maybe that calls a sub and that calls another one, etc. Then someone comes along one day and puts while(<INPUT>) in one of those subs, all careful with use strict and warnings, but suddenly some bit of code miles away just breaks!

      Anyway, I know I'm getting a bit hot under the collar over this, perhaps because I spent about an hour not knowing why my change had broken the whole program. But shouldn't there at least be some kind of health warning over the use of while(<>), since it must be one of the first idioms beginners learn?

      I know I'm going to write

      local $_;
      while(<>){
      }

      from now on!

      Update: No I'm not, I'm going to follow jeffa's example! Doh! I might go through and put local in the old "code"/plate of spaghetti that I'm supposed to be maintaining.

      -- iakobski

        ... or better yet, just don't use $_ in such situations. You hit the nail on the head when you mentioned someone coming along and breaking it.
        my $line;
        while ($line = <>) {
            # stuff
        }
        i gonna wash that ambiguity outta my hair . . .

        Jeff

        R-R-R--R-R-R--R-R-R--R-R-R--R-R-R--
        L-L--L-L--L-L--L-L--L-L--L-L--L-L--
        
        Yes, I think it should work that way. Then you can write code like:
        sub next_foo {
            while (<>) {
                return $. if /^FOO:/
            }
        }

        while (next_foo) {
            do_something # with $_
        }

        Some people won't like this style. That's fine, you don't have to. Others will like it, and Perl lets you.

        -- Abigail

      Good explanation, Abigail. I'm just curious, is there a good reason why while (<>) doesn't implicitly localize $_?

      -- Hofmator

        One possible reason for why while(<>) does not implicitly localize $_ as part of its magic is that sometimes you want to access the last value of $_ outside the loop. For example:
        while (<>) {
            chomp;
            last if /\S/;
        }
        print "You said: $_\n";
        If $_ were localized, you would have to copy the value to another variable:
        my $in;
        while (local $_ = <>) {
            chomp;
            $in = $_, last if /\S/;
        }
        print "You said: $in\n";
        Although it may just be as Abigail said, that foreach always assigns to a variable as part of the loop, whereas while can have anything in the conditional and (<>) is just a special case.
Re: When is $_ local and when is it not?
by converter (Priest) on Jul 13, 2001 at 17:14 UTC

    The perlsyn manpage documents localization:

    In the Foreach loops section:

    The foreach loop iterates over a normal list value and sets the variable VAR to be each element of the list in turn. If the variable is preceded with the keyword my, then it is lexically scoped, and is therefore visible only within the loop. Otherwise, the variable is implicitly local to the loop and regains its former value upon exiting the loop.

    The Compound statements section of the perlsyn manpage for Perl 5.6.1 (the PM perlsyn doesn't include this) mentions:

    Unlike a "foreach" statement, a "while" statement never implicitly localises any variables.

    Hope this helps.

Detecting global $_ usage
by bikeNomad (Priest) on Jul 13, 2001 at 19:00 UTC
    There's a charming example in the Perl Cookbook that ties global $_ so you get diagnostics when $_ is used globally (modified to be able to carp instead of croak if you want):
    # croak on global underscore usage:
    #   no Underscore;
    # carp on global underscore usage:
    #   no Underscore 'carp';
    package Underscore;
    use Carp ();
    my $complain = \&Carp::croak;
    sub TIESCALAR { bless \(my $dummy) => shift }
    sub FETCH { $complain->("read access to \$_ forbidden") }
    sub STORE { $complain->("write access to \$_ forbidden") }
    sub unimport {
        tie($_, __PACKAGE__);
        $complain = \&Carp::carp if $_[1] eq 'carp';
    }
    sub import { untie $_ }
    tie($_, __PACKAGE__) unless tied $_;
    1;
    And you save it as Underscore.pm and use it by just adding a no Underscore; or no Underscore 'croak'; . Or you can do it at the command line, of course, using perl -M-Underscore myprog.pl or perl -M-Underscore=carp myprog.pl
Re: When is $_ local and when is it not?
by HyperZonk (Friar) on Jul 13, 2001 at 17:21 UTC
    I'll admit that this is pretty much just a guess. I think it may have something to do with the fact that the for construction essentially builds a list from <DATA>, which it then iterates over, while while pumps the data items one at a time into $_. I could very well be very wrong, but I think this may be a place to start looking for an answer.
    update: In looking at converter's reply in the thread, I see that the correct answer is given there. I wonder if that has something to do with the fact that the for construction builds a hidden list? Perl_guts wizards can fill us in, perhaps.
Re: When is $_ local and when is it not?
by dragonchild (Archbishop) on Jul 13, 2001 at 18:36 UTC
    I know that I've started to do something like:

    sub foo {
        local $_;
        my ($var1, $var2) = @_;
        ...
    }

    in all my personal "production-level" code. (I'm an avid user of $_.) While this can result in a slight performance hit, I change my personal code around enough that having that template just makes me feel safer about using $_.

    Just a suggestion. :)
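
The two defensive styles suggested in this thread (local $_ at the top of a sub, and a named lexical instead of $_) can be compared side by side. This is a rough sketch only: the sub names, the sample text and the in-memory filehandle are assumptions made for illustration, not code from any of the posts.

```perl
use strict;
use warnings;

# Illustrative input; an in-memory filehandle stands in for a real file.
my $text = "a\nb\nc\n";

# Style 1: localize $_ so the caller's value comes back when the sub returns.
sub first_line_local {
    local $_;
    open my $fh, '<', \$text or die "open failed: $!";
    while (<$fh>) {
        chomp;
        return $_;            # the value is copied out before $_ is restored
    }
    return;
}

# Style 2: avoid $_ altogether by reading into a named lexical.
sub first_line_lexical {
    open my $fh, '<', \$text or die "open failed: $!";
    while (defined(my $line = <$fh>)) {
        chomp $line;
        return $line;
    }
    return;
}

my @items = (1);
for (@items) {
    my $got_local   = first_line_local();
    my $got_lexical = first_line_lexical();
    print "local: $got_local, lexical: $got_lexical, caller's \$_: $_\n";
}
```

In both cases the caller's $_ should still be 1 after the calls. The lexical version has the extra benefit that nothing the loop body calls can change the line being processed.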

Re: When is $_ local and when is it not?
by John M. Dlugosz (Monsignor) on Jul 13, 2001 at 19:05 UTC