http://qs321.pair.com?node_id=768991

GaijinPunch has asked for the wisdom of the Perl Monks concerning the following question:

Monks,
Pretending for a moment that globals are all fine and dandy, I have an issue. I have some shared routines that a couple of scripts use. They rely on global variables (one holds a WWW::Mechanize object, and another just holds the location of a directory to dump output in). These never change over the course of the script, so I simply wrote the routines to use whatever variables are defined when the routine is called.

Contents of my_subs.pl:

    sub process() {
        my $url = shift;
        $mech->get( $url );
        my $html = $mech->content();
        # Do all kinds of stuff w/ $html, including get() other pages, and login() to forms;
        $html = $mech->content();
        return $html
    }

Contents of myscript.pl:

    require( 'my_subs.pl' );
    my $mech = WWW::Mechanize->new();
    my $datadir = "/tmp/logs";
    foreach ( @url ) {
        my $returned_html = &process( $_ );
        # Bob Loblaw
    }

When process() is defined in an external file, it sees $mech and $datadir as undefined, so it doesn't work. I can't pinpoint exactly when it broke, as it runs in cron and I hadn't checked it in a while (a user pointed it out). It looks like it broke just days ago, though, and the last time I updated the packages on my system was a few weeks ago.

If process() is defined within myscript.pl and not an external file, life is good, and it works.

I know I can pass those things (and the few others the routine needs) as parameters, but my initial "parameterized" version of the routine did not work. It's not exactly the most kosher routine in the world. It scrapes a page that doesn't like to be scraped, and after enough failures the page will do all kinds of nasty things, like require visual human confirmation (boo). It's not the easiest to debug, either, as it doesn't tell you what's wrong. So, I only test for 15-20 minute intervals and then stop. As such, I'd like to keep things as they are. As I said, defining the rather long routine in the script file itself works, so I'm okay for now, but I'd rather have it in one centralized (external) place where it can be shared and where subsequent changes only have to be made once.

So, my question now is, what am I missing about require() (and even the package approach) that I didn't know before? I thought declaring something with "my" outside of any scope made it global to everything...including external files.
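
For reference, here is a minimal sketch of the scoping behaviour in question, cut down so it runs without WWW::Mechanize. The helper name show_datadir and the temporary file standing in for my_subs.pl are made up for illustration. It assumes the external file has no "use strict", so an unqualified $datadir there resolves to the package variable $main::datadir. A variable declared with "my" is visible only in the block or file that declares it, while one declared with "our" is a package variable that code in a required file can reach.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical stand-in for my_subs.pl, written to a temporary file so the
# sketch is self-contained. The file has no "use strict", so the unqualified
# $datadir below refers to the package variable $main::datadir.
my ($fh, $lib) = tempfile(SUFFIX => '.pl', UNLINK => 1);
print {$fh} <<'END_LIB';
sub show_datadir { return defined $datadir ? $datadir : 'undefined' }
1;
END_LIB
close $fh;

require $lib;

my $with_my;
{
    # Lexical variable: visible only inside this block of this file,
    # so the sub compiled from the other file cannot see it.
    my $datadir = '/tmp/logs';
    $with_my = show_datadir();
}

my $with_our;
{
    # Package variable ($main::datadir): shared by all code in package main,
    # including subs compiled from a required file.
    our $datadir = '/tmp/logs';
    $with_our = show_datadir();
}

print "declared with my:  $with_my\n";   # undefined
print "declared with our: $with_our\n";  # /tmp/logs
```

If that matches what is going on here, declaring $mech and $datadir with "our" (or "use vars") in myscript.pl would let the external routines see them, though passing them as parameters remains the cleaner option.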