http://qs321.pair.com?node_id=768991

GaijinPunch has asked for the wisdom of the Perl Monks concerning the following question:

Monks,
Pretending for a moment globals are all fine and dandy, I have an issue. I have some shared routines that a couple of scripts use. They use global variables (one holds a WWW::Mechanize object, and another just holds the location of a directory to dump output in). These will never change throughout the course of the script, so I simply wrote the routines to use whatever variable had been defined when the routine is called.

Contents of my_subs.pl

    sub process() {
        my $url = shift;
        $mech->get( $url );
        my $html = $mech->content();
        # Do all kinds of stuff w/ $html, including get() other pages, and login() to forms
        $html = $mech->content();
        return $html;
    }

Contents of myscript.pl

    require( 'my_subs.pl' );
    my $mech = WWW::Mechanize->new();
    my $datadir = "/tmp/logs";
    foreach ( @url ) {
        my $returned_html = &process( $_ );
        # Bob Loblaw
    }

When process() is defined in an external file, it thinks $mech and $datadir are undefined. So, it doesn't work. I can't pinpoint exactly when it broke as it runs in cron, and I hadn't checked it in a while (a user pointed it out). It looks like just days ago though, and the last time I updated the packages on my system was a few weeks ago.

If process() is defined within myscript.pl and not an external file, life is good, and it works.

I know I can pass those things (and the few others the routine needs) as parameters, but my initial "parameterized" version of the routine did not work. It's not exactly the most kosher routine in the world. It scrapes a page that doesn't like to be scraped, and on enough failures, the page will do all kinds of nasty things like require visual human confirmation (boo). It's not the easiest to debug, either, as it doesn't tell you what's wrong. So, I only test for 15-20 minute intervals and stop. As such, I'd like to keep things as they are. As before, defining the rather long routine in the script file itself works, so I'm okay for now, but I'd rather have it in one centralized (external) place where it can be shared, and subsequent changes will only happen once.

So, my question now is, what am I missing about require() (and even the package approach) that I didn't know before? I thought declaring something with "my" outside of any scope made it global to everything...including external files.

Replies are listed 'Best First'.
Re: require, globals, and some various mayhem
by perrin (Chancellor) on Jun 06, 2009 at 05:21 UTC

    What you're missing is that those aren't globals. Variables declared with my() are lexical, not global, and the reason it works when the subs are defined within the same script is that the lexical variables and the sub are in the same scope.
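
A minimal sketch of the scoping rule described above, assuming nothing beyond core Perl. The names `$setting`, `local_peek`, and `external_peek` are hypothetical, and the required file is written to a temp file only so the example is self-contained; it stands in for my_subs.pl.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical stand-in for my_subs.pl. Like the original, it has no
# "use strict", so the bare $setting quietly means $main::setting.
my ($fh, $path) = tempfile(SUFFIX => '.pl', UNLINK => 1);
print {$fh} <<'END_SUBS';
sub external_peek { return defined $setting ? $setting : 'undef' }
1;
END_SUBS
close $fh;

my $setting = '/tmp/logs';   # lexical: visible only within this file

# Compiled in the same file, after the declaration, so it sees the lexical.
sub local_peek { return defined $setting ? $setting : 'undef' }

require $path;               # compiled in its own, separate lexical scope

my $seen_locally    = local_peek();
my $seen_externally = external_peek();

print "same file:     $seen_locally\n";
print "required file: $seen_externally\n";
```

The sub in the required file never sees the `my` variable; it reads the (unset) package variable of the same name instead, which is why the value appears undefined.
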

    What you should do is pass the variables to the sub. However, your globals approach will work, provided you declare the variables with our() instead of my().
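
Both options can be sketched side by side. This is an illustration under stated assumptions, not the poster's actual routine: `process_with_args` and `process_with_globals` are hypothetical names, a code reference named `$fetcher` stands in for the WWW::Mechanize object so that nothing touches the network, and a temp file again stands in for my_subs.pl.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical stand-in for my_subs.pl.
my ($fh, $path) = tempfile(SUFFIX => '.pl', UNLINK => 1);
print {$fh} <<'END_SUBS';
use strict;
use warnings;

# Option 1: everything the routine needs arrives as arguments.
sub process_with_args {
    my ($fetcher, $datadir, $url) = @_;
    return join ': ', $datadir, $fetcher->($url);
}

# Option 2: rely on package variables that the calling script set via our().
our ($fetcher, $datadir);
sub process_with_globals {
    my ($url) = @_;
    return join ': ', $datadir, $fetcher->($url);
}
1;
END_SUBS
close $fh;

# Assumption: a code ref plays the role of the Mechanize object.
our $fetcher = sub { my ($url) = @_; return "<html>$url</html>" };
our $datadir = '/tmp/logs';

require $path;

my $via_args    = process_with_args($fetcher, $datadir, 'http://example.com/');
my $via_globals = process_with_globals('http://example.com/');

print "$via_args\n$via_globals\n";
```

With `our`, both files refer to the same `$main::fetcher` and `$main::datadir`, so a plain `require` is enough. The argument-passing version makes the dependency explicit and keeps working if the subs later move into their own package.
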

      Cheers, I will try that. Any idea why it worked properly before? Maybe I hallucinated the whole thing.
        If you had them in separate files before, it couldn't have worked. Maybe you weren't running the code you thought you were.
        I suspect that for this to work well, you are going to need "use" instead of "require". I might be wrong, but I'm sure the approach below will work. Here is some boilerplate for you. Make a file called my_subs.pm, and stick a modified version of this in there. The "use my_subs" will cause this .pm code to run before your program, and the globals will exist.
        # file my_subs.pm
        use strict;
        use warnings;
        package my_subs;

        use vars qw(@ISA @EXPORT @EXPORT_OK %EXPORT_TAGS $VERSION);
        use Exporter;
        our $VERSION   = 1.0;
        our @ISA       = qw(Exporter);
        # Names exported by default; qw() is whitespace-separated and variables need their sigils
        our @EXPORT    = qw($GLOBAL1 $GLOBAL2 XYZZY);
        our @EXPORT_OK = qw();

        our $GLOBAL1 = 23;
        our $GLOBAL2;

        sub XYZZY {}

        1;  # important!!! every .pm file must return "true"; 1; is the easiest way to do that!

        ###### in main program #####
        use my_subs;
        # do something with XYZZY($x, $y);
        # my $sum = $GLOBAL1 + 23;