Well, this sums up what I believe the beginner ought to know about scoping in Perl.

I do hope I have it right, and if I haven't, don't hesitate to tell me! Stylistic suggestions are also welcome.

Scoping

One thing you need to know to master Perl is how to deal with the scoping mechanisms it provides you with. You want globals? You got 'em! You want to avoid "collisions" (two variables with the same name clobbering each other)? You got it, and there's more than one way to manage the trick. But Perl's scoping rules aren't always so well understood, and it's not just the difference between my and local that trips people up.

I've learned a lot from Coping with Scoping and from sections in various Perl books (e.g. Effective Perl Programming). So credit has to go to those authors (Dominus for the first, and Joseph N. Hall and merlyn for the second).

Namespaces

A basic idea, although one you need not master to write many scripts, is the notion of a namespace. Global variables (variables not declared with my) live in a package. A package provides a namespace, which I'm going to explain by reference to the metaphor of a family name. In English speaking countries, "Robert" is a reasonably common name, so you (assuming you live in one) probably know more than one "Robert." Usually, for us humans, the current conversational context is enough to determine for our audience which Robert we're talking about (my chums down at the pool hall know Robert the darts genius, but at work, "Robert" is the CEO of our failing dot-com). Of course these people have family names too (yes, those can be shared by different people too -- but you can't expect this metaphor to be perfect =), and if we wanted to be fully explicit we'd add the family name to allow our audience to determine which Robert we are talking about. $Smith::Robert is a creature distinct from $Jones::Robert. When you have two different variables with the same (as it were) 'first name', you can explicitly declare which one you want to refer to by using the full name of the variable. Alternately, you can say "in this bit of code, I want to talk about the Smith family" by using the package keyword, as in package Smith;.
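To make the family-name metaphor concrete, here is a minimal sketch using the two variables named above. The values and the two my variables are made up for illustration, and strict is deliberately left off so that the globals need no declaration.

```perl
use warnings;   # strict is left off on purpose so the globals need no declaration

$Smith::Robert = "Robert Smith";   # illustrative values only
$Jones::Robert = "Robert Jones";

package Smith;                     # "let's talk about the Smith family"
my $at_the_smiths = $Robert;       # unqualified $Robert means $Smith::Robert here

package Jones;                     # now the Jones family
my $at_the_joneses = $Robert;      # unqualified $Robert means $Jones::Robert here

print "Smith: $at_the_smiths\n";                           # Smith: Robert Smith
print "Jones: $at_the_joneses\n";                          # Jones: Robert Jones
print "Still reachable by full name: $Smith::Robert\n";    # Robert Smith
```

The same unqualified name picks out a different variable depending on which package statement is in effect, while the full name always reaches the same one.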

Implicitly, there's a "package main;" at the top of your scripts. That is, unless you explicitly declare a different package, all the variables you declare (keeping the caveat about my in mind) will be in main. Variables that live in a package are reasonably called "package globals", because they are accessible by default to every operator and subroutine that lives in the same package (and, if you're explicit about their names, outside the package, too).
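If you'd like to check the implicit package main for yourself, a small sketch along these lines should do. The variable names are hypothetical, and strict is again left off so the global needs no declaration; __PACKAGE__ is Perl's built-in token for the name of the current package.

```perl
use warnings;   # strict left off so the global needs no declaration

$greeting = "hello from main";          # no package statement yet, so this is $main::greeting

my $current_package = __PACKAGE__;      # name of the package being compiled: "main"
my $same_variable   = $main::greeting;  # the fully qualified name reaches the same scalar

print "package: $current_package\n";    # package: main
print "value:   $same_variable\n";      # value:   hello from main
```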

Using packages makes accessing Perl variables sort of like travelling in different circles. For example, at work, it's understood that "Robert" is "Robert Szywiecki", the boss. At the pool hall, it's understood that "Robert" is "Robert Yamauchi", the darts expert. Here's a little code to illustrate the use of packages:

#!/usr/bin/perl -w

package Szywiecki;

$Robert = "the boss";

sub terminate {
        my $name = shift;
        print "$Robert has canned $name's sorry butt\n";
}

terminate("arturo"); # prints "the boss has canned arturo's sorry butt"

The variable $Robert's full name, as it were, is $Szywiecki::Robert (note how the $ moves out to the front of the package name, indicating that this is the scalar Robert that lives in package Szywiecki). To code and, most importantly, subroutines in the Szywiecki package, an unqualified $Robert refers to $Szywiecki::Robert -- unless $Robert has been 'masked' by my or local (more on that later).

Now, if you use strict (and you should, you should, you should), you'll need to declare those global variables before you can use them, UNLESS you want to fully qualify them. That is,

#!/usr/bin/perl -w

use strict;

$Robert = "the boss";
print "\$Robert = $Robert\n";

will produce an error, whereas if we fully qualified the name (remember that implicit package main in there), there's no problem:

#!/usr/bin/perl -w

use strict;

$main::Robert = "the boss";
print "\$main::Robert = $main::Robert\n";

One way -- the preferred way -- to satisfy strict 'vars' (the part of strict that enforces variable declaration) is to declare package globals with our ($foo, $bar) (in perl 5.6.0 and above) or with use vars qw($foo $bar) (previous versions, but still works in 5.6). Notice that with use vars, you are expected to give a list of variable names as strings, not the variables themselves (as with our). Both mechanisms allow you to use globals while still maintaining one of the chief benefits of strict 'vars': you are protected from accidentally generating a new variable via a typo, because strict 'vars' demands that your variables be explicitly declared (as in "here's a list of my package globals").
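Here's a short sketch of both declaration styles side by side, plus the typo protection in action. The variable names are just placeholders, and the string eval is only there so the script can report the compile-time error rather than die from it.

```perl
use strict;
use warnings;

our ($foo, $bar);        # perl 5.6.0 and later: name the variables themselves
use vars qw($baz);       # older style: a list of names, given as strings

$foo = "declared with our";
$bar = "also declared with our";
$baz = "declared with use vars";

print "$foo / $bar / $baz\n";

# A typo is caught when the code is compiled. The string eval compiles the
# snippet separately (it inherits strict from this file), so we can capture
# the error message instead of aborting.
my $compiled_ok = eval q{ $fooo = "oops"; 1 };
my $error = $@;
print "typo rejected: $error" unless $compiled_ok;
```

The last line should report something along the lines of 'Global symbol "$fooo" requires explicit package name'; the exact wording varies a little between perl versions.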

A neat thing about packages (and potentially a bad thing, depending on how big a fan you are of "privacy") is that package globals aren't just global to that package, but they can be accessed from anywhere in your code, as long as the names are fully qualified. You can talk about Robert the darts expert at work, if you say "Robert Yamauchi":

#!/usr/bin/perl -w

package Szywiecki;

$Robert = "the boss";

package PoolHall;

$Robert = "the darts expert";

package Szywiecki; # back to work!

print "Here at work, 'Robert' is $Robert, but over at the pool hall, 'Robert' is $PoolHall::Robert\n";

See? Understanding packages isn't really all that hard. Generally, a package is like a family of variables (and subroutines! The full name of terminate in the example above is &Szywiecki::terminate; similar remarks apply to hashes and arrays, of course).

my

Variables declared with my are not globals, although they can act sort of like them. A main use of my is to operate on a variable that's only of use within a loop or subroutine, but that's by no means where it ends. Here are some basic points about my:

A my variable has a block of code as its scope (i.e. the places in which it is accessible).
A block is often declared with braces {}, but as far as Perl is concerned, a file is a block.
A variable declared with my does not belong to any package; it belongs only to its block.
Although you can name blocks (e.g. BEGIN, with which you may already be familiar), you can't fully qualify the name of the block to get to the my variable in code that doesn't occur in that block.
File-level my variables are those which are declared in a file outside of any block within that file.
You can't access a file-level my variable from outside of the file in which it is declared.

As long as you're writing one-file scripts (e.g. ones that don't import modules), some of these points don't matter a great deal. But if you're heavily into "privacy" and "encapsulation", and if you write modules and OO modules you will be, you'll need to understand all of the above.
Here's some commented code to explain some of these points:
#!/usr/bin/perl -w

use strict;

#remember we're in package main

use vars qw($foo);

$foo = "Yo!"; # sets $main::foo

print "\$foo: $foo\n"; # prints "Yo!"

my $foo = "Hey!"; # this is a file-level my variable!

print "\$foo: $foo\n"; # prints "Hey!" -- new declaration 'masks' the old one

{ # start a block
        my $foo = "Yacht-Z";
        print "\$foo: $foo\n";
        # prints "Yacht-Z" -- we have a new $foo in scope.
        print "\$main::foo: $main::foo\n"; # we can still 'see' $main::foo
        subroutine();
} # end that block

print "\$foo: $foo\n"; # there it is, our file-level $foo is visible again!
print "\$main::foo: $main::foo\n"; # whew! $main::foo is still there!

sub subroutine {
        print "\$foo: $foo\n"; # prints "Hey!"
        # why? Because the variable declared in the naked block
        # is no longer in scope -- we have a new set of braces.
        # but the file-level variable is still in scope, and
        # still 'masks' the declaration of $main::foo
}

package Bar;

print "\$foo: $foo\n"; # prints "Hey!" -- the my variable's still in scope
# if we hadn't made that declaration above, this would be an error: the
# interpreter would tell us that Bar::foo has not been defined.

As the bottom bit in the above example shows, because they don't live in any package, my variables can be visible even though a new package has been declared: the block they belong to is the file (at least for these purposes), and the package statement doesn't end that block.
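One way to convince yourself that a my variable really doesn't live in a package is to look in the package's symbol table. This is only a sketch: in_symbol_table() is a helper invented for the purpose, and the variable names are placeholders.

```perl
use strict;
use warnings;

our $pkg_var = "package global";   # lives in package main as $main::pkg_var
my  $lex_var = "lexical";          # lives only in its enclosing scope (here, the file)

# in_symbol_table() is a helper made up for this sketch: it reports whether
# package main's symbol table (the hash %main::) has an entry for a name.
sub in_symbol_table {
    my ($name) = @_;
    return exists $main::{$name} ? 1 : 0;
}

print "values: $pkg_var, $lex_var\n";
print "pkg_var in main's symbol table? ", in_symbol_table('pkg_var'), "\n";   # 1
print "lex_var in main's symbol table? ", in_symbol_table('lex_var'), "\n";   # 0
```

Since there is no entry for the my variable, there is no fully qualified name by which code elsewhere could reach it.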
Now the example above used a 'naked' block -- there's no control structure (e.g. if or while) involved. But of course that makes no difference to the scoping.
File-level my variables ARE accessible from within blocks defined within that file (as the example above shows); this is one way in which they're sort of like globals. If, however, subroutine had been defined in a different file, it could not see our file-level $foo at all: an unqualified $foo there would mean a package global, and under strict, with no such global declared, that file would not even compile. Once you know how my works, you can see, just by looking at the text of the file, where a my variable is going to be accessible. This is one reason the scoping it provides is called "lexical scoping."
One of my favorite uses of my is when you're iterating over an array. You can use foreach (@array) { #stuff }, and operate on the 'default' variable $_, which is an alias for the current element of the array on each step through the loop. But $_ is a very special kind of global variable, so any code you call from inside the loop can see it and change it (that's a subject for part II). So what if you want a loop variable that's private to the loop? The answer is simple: use foreach my $element (@array) { #stuff }. $element is then a lexical variable, visible only inside the loop body. Note, though, that it is still an alias for the current element of the array, so assigning to $element changes the array just as assigning to $_ would. If you want to scribble on a value without touching the array, copy it first (my $copy = $element;).
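It's worth checking how the loop variable relates to the array for yourself. The sketch below (array names are made up) shows that the loop variable is an alias for the current element whether it is $_ or a my variable, and that working on a copy is what leaves the array alone.

```perl
use strict;
use warnings;

my @with_default = (1, 2, 3);
foreach (@with_default) {
    $_ *= 2;                     # $_ is an alias: this changes the array
}

my @with_lexical = (1, 2, 3);
foreach my $element (@with_lexical) {
    $element *= 2;               # a my loop variable is still an alias
}

my @untouched = (1, 2, 3);
my @doubled;
foreach my $element (@untouched) {
    my $copy = $element;         # copy first, then modify the copy
    $copy *= 2;
    push @doubled, $copy;
}

print "with_default: @with_default\n";   # with_default: 2 4 6
print "with_lexical: @with_lexical\n";   # with_lexical: 2 4 6
print "untouched:    @untouched\n";      # untouched:    1 2 3
print "doubled:      @doubled\n";        # doubled:      2 4 6
```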

local

Now we arrive at local, which is only sort of like my, but due to its name, its function is sometimes confused with that of my. Here's the skinny: local $foo saves away the current value of the (package) global $foo and gives $foo a new, temporary value for the current block and any code called from the current block; when the block exits, the saved value is restored. Since local only works on globals, you can't use it on a my variable (try it, I can wait). What this means is that if you localize a variable within a block and call a subroutine from that block, that subroutine will see the value of the localized variable. This is a major difference between my and local. Compare the above example to this one:
#!/usr/bin/perl -w

use strict;
use vars qw($foo); # or "our $foo" if you're using 5.6

$foo = "global value";

print "\$foo: $foo\n"; # prints "global value"

sub mysub {
        my $foo = "my value";
        showfoo(); # prints "global value"
}

sub localsub {
        local $foo = "local value";
        showfoo(); # prints "local value"
}

sub showfoo {
        print "\$foo: $foo\n";
}

mysub();
localsub();
showfoo(); # prints "global value" -- the old value has been restored
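Two further points about local can be checked with a small sketch: the saved value comes back when the block exits, and local refuses to touch a my variable. All names here are hypothetical, and the string eval is only a way to capture the compile-time error without aborting the script.

```perl
use strict;
use warnings;

our $level = "outer";                  # a package global

sub current_level { return $level }    # reads whatever value the global has right now

my ($inside, $after);
{
    local $level = "inner";            # temporary value for this block and anything it calls
    $inside = current_level();         # the subroutine sees "inner"
}
$after = current_level();              # the block has exited, so "outer" is back

print "inside the block: $inside\n";   # inside the block: inner
print "after the block:  $after\n";    # after the block:  outer

# local on a my variable is rejected when the code is compiled.
my $lexical = "mine";
my $ok = eval q{ local $lexical = "nope"; 1 };
my $local_error = $@;
print "error: $local_error" unless $ok;
```

The final line should report that perl can't localize a lexical variable.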

Well, I hope that's enough to get you started!

In reply to (proposed)Scoping tutorial by arturo




P is for Practical
	PerlMonks