http://qs321.pair.com?node_id=671761

First of all, don't be fooled by the length of this post; it doesn't reflect its importance or substance.

I'm working on an application, and I have decided to include the CPAN modules it requires in its distribution. The modules will be pre-installed as private modules of the application under its own 'lib' directory. Let's put aside for a while the debate about bundling the prerequisite modules directly versus letting the implementor install them from CPAN. I believe everyone has their own pros and cons for either or both ways. I have my own reasons to support both myself.

My goal is to make sure that the installed application really loads the CPAN modules under its 'lib' directory, not the ones (if any) in the standard Perl module directories listed in @INC. The easiest way may be to say use lib '/path/to/private/libdir'; to unshift the private libdir onto @INC. But that could hide a missing module or two: if perl fails to find a requested module in the private libdir (should I forget to include it), it moves on and successfully loads the module from a standard directory (if it is there), so the overall application runs OK. While this is harmless in most cases, it's simply a false positive with respect to my goal.
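To make that fallback concrete, here is a minimal sketch (using the same placeholder directory as above) showing that use lib only puts the private directory in front of @INC. The standard directories stay behind it, so a module missing from the private directory can still be found elsewhere.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @before;
BEGIN { @before = @INC }             # @INC as perl built it

use lib '/path/to/private/libdir';   # placeholder path

# The private directory now comes first ...
print "first entry: $INC[0]\n";

# ... but every original entry is still searched after it.
my %now  = map { $_ => 1 } @INC;
my @kept = grep { $now{$_} } @before;
printf "%d of %d original entries still present\n",
    scalar @kept, scalar @before;
```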

Without thinking too much about other possibilities, I decided to take a route which is hopefully not the most stupid one. My idea is to restrict @INC to only two or three entries: the first is the application's 'lib' directory, and the rest is (are) where the Perl core modules are installed.

For testing purposes I wrote a dummy script (inc.pl) containing a bunch of use statements for both the core and the CPAN modules required by the application. Some of the core modules are strict, warnings, Carp, and Data::Dumper. Some of the CPAN modules are DBI, SQL::Abstract, HTML::Template, CGI::Application, CGI::Simple, and CGI::FormBuilder. The first run of the script prints nothing, which means everything is fine: my Perl installation is sane and all required CPAN modules are installed in the standard directories found in @INC.

I then start bashing @INC by setting it to an empty array before any use statement.

BEGIN { @INC = () }

Unsurprisingly, the script dies complaining that it Can't locate strict.pm in @INC (@INC contains:) at ./inc.pl line 6. I then add the private libdir to the array, but I don't need to rerun the script because I know the result would be the same, differing only in the printed @INC content.

BEGIN { @INC = qw( /path/to/apps/lib ); }

Speaking of @INC, the original @INC on my system is:

    $ perl -le 'print for @INC'
    /etc/perl
    /usr/local/lib/perl/5.8.8
    /usr/local/share/perl/5.8.8
    /usr/lib/perl5
    /usr/share/perl5
    /usr/lib/perl/5.8
    /usr/share/perl/5.8
    /usr/local/lib/site_perl
    .

I always wonder why Linux distros (mine is currently Debian 4.0) throw so much stuff at @INC. A manual installation of Perl (default configuration) is simpler, for example:

    $ /opt/bin/perl -E 'say for @INC'
    /opt/lib/perl5/5.10.0/i686-linux
    /opt/lib/perl5/5.10.0
    /opt/lib/perl5/site_perl/5.10.0/i686-linux
    /opt/lib/perl5/site_perl/5.10.0
    .

Anyway, I need to determine which path holds the Perl core modules. Nothing is probably easier than taking a core module and inspecting its entry in %INC,

    $ perl -Mstrict -le 'print $INC{q(strict.pm)}'
    /usr/share/perl/5.8/strict.pm

Bingo! It's /usr/share/perl/5.8. Tempted to try another way, or maybe just to convince myself, I fire up another test with File::Find, checking two modules at once.

    $ perl -MFile::Find -le '
      find { follow => 1, wanted => sub {
        print $File::Find::name
          if $File::Find::name =~ /strict\.pm$/
          || $File::Find::name =~ /Data\/Dumper\.pm$/;
      }}, @INC;
    '
    /usr/lib/perl/5.8/Data/Dumper.pm
    /usr/share/perl/5.8/strict.pm

Bummer! Two paths? (NOTE: I use the follow option because it turns out that /usr/lib/perl/5.8 and /usr/share/perl/5.8 are symlinks to /usr/lib/perl/5.8.8 and /usr/share/perl/5.8.8, respectively.)

    $ ls -ld /usr/lib/perl/5.8 /usr/share/perl/5.8
    lrwxrwxrwx 1 root root 5 2008-02-05 13:17 /usr/lib/perl/5.8 -> 5.8.8
    lrwxrwxrwx 1 root root 5 2008-02-05 13:17 /usr/share/perl/5.8 -> 5.8.8
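The %INC lookup used earlier also works for several modules at once. As a small sketch (the two modules loaded here are just examples taken from the core list in this post), a script can print where every file it has loaded so far came from:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Carp;
use Data::Dumper;

# %INC maps each loaded file (e.g. "Data/Dumper.pm") to the path it
# was actually read from, including files pulled in indirectly.
for my $file (sort keys %INC) {
    printf "%-24s %s\n", $file, $INC{$file};
}
```

On a layout like the one shown above, this should list strict.pm and Data/Dumper.pm under two different directories, matching what File::Find reported.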

Now I have to apologize to all of you, as this has become lengthy, much longer than I originally thought it would be. I mean, unconvinced by my simple discovery and by the fact that I found two paths, I insisted on backing myself up with supporting documentation. Among the standard POD documentation and some related modules, I think I found the canonical answer in the ExtUtils::MakeMaker and Module::Build modules.

The former mentions INSTALLDIRS with three installation layouts: perl, site, and vendor. Two of the installation directories of the perl layout that attract me are INSTALLPRIVLIB and INSTALLARCHLIB. OTOH, the latter mentions installdirs with equally three installation layouts: core, site, and vendor. As before, installprivlib and installarchlib of the core layout get my attention. Both modules refer to Config.pm, so I open its manual page (man Config). Upon reading about installprivlib and privlibexp in turn, as well as installarchlib and archlibexp in turn, I quit the pager and peek at the module source code with perldoc -m Config. There I see the privlibexp => '/usr/share/perl/5.8' and archlibexp => '/usr/lib/perl/5.8' entries in the %Config hash. OK, it's time to go back to the script. I promise.

Armed with two standard directories, I change the BEGIN block to,

BEGIN { @INC = qw( /path/to/apps/lib /usr/share/perl/5.8 /usr/lib/perl/5.8 ); }
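Since the two core directories come from the privlibexp and archlibexp entries of %Config, the same list could also be built without hardcoding them. This is only a sketch under that assumption; the application path is the placeholder used throughout this post, and strict, warnings and Config are deliberately loaded before @INC is replaced so the example does not depend on where they live.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Config;   # loaded at compile time, while the original @INC is intact

BEGIN {
    @INC = (
        '/path/to/apps/lib',                  # placeholder application libdir
        @Config{qw(privlibexp archlibexp)},   # core directories per Config.pm
    );
}

# Any "use" statement placed below this point is resolved against the
# restricted list only.
print "$_\n" for @INC;
```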

Running the script again, I get the expected result,

    $ ./inc.pl
    Can't locate DBI.pm in @INC (@INC contains: /path/to/apps/lib /usr/share/perl/5.8 /usr/lib/perl/5.8) at ./inc.pl line 21.
    BEGIN failed--compilation aborted at ./inc.pl line 21.

The DBI module is the first entry in the list of use statements for the non-core modules. This means that the core modules load OK, and it's time for the boring task of installing a number of CPAN modules, one by one, following the list. I know I can automate the perl Makefile.PL, make, make test, make install sequence, or use CPAN locally. But I enjoy the hard way. Seeing the friendly "Can't locate SomeModule.pm" message is quite fun, if you know what I mean (hint: checkpoint). But maybe it's just me. Maybe I'm simply sick. Maybe. Anyway, here is the complete test script, in case you wonder.

    $ cat inc.pl
    #!/usr/bin/perl

    BEGIN {
        @INC = qw(
            /path/to/apps/lib
            /usr/share/perl/5.8
            /usr/lib/perl/5.8
        );
    }

    # Core modules, along with the Perl version in which they were first included
    use strict;       # 5
    use warnings;     # 5.006
    use Carp;         # 5
    use File::Spec;   # 5.00405
    use CGI::Carp;    # 5.004
    use Exporter;     # 5
    use Data::Dumper; # 5.005

    # CPAN modules, list everything you need, as complete as possible
    use DBI;
    use DBD::mysql;
    use SQL::Abstract;
    use CGI::Application;
    use CGI::Application::Plugin::Forward;
    use CGI::Application::Plugin::DBH;
    use CGI::Simple;
    use CGI::FormBuilder;
    use HTML::Template;
    use URI;
    use Data::Pageset;
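For anyone who prefers to see every missing module in one pass instead of one "Can't locate" message per run, a rough sketch follows. The missing_modules helper is hypothetical (it is not part of inc.pl), and it only checks that a .pm file exists under the given directories; it does not try to load anything.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the modules from @modules that have no
# corresponding .pm file under any of the directories in @$dirs.
sub missing_modules {
    my ($dirs, @modules) = @_;
    my @missing;
    for my $module (@modules) {
        (my $file = "$module.pm") =~ s{::}{/}g;   # DBD::mysql -> DBD/mysql.pm
        push @missing, $module
            unless grep { -e "$_/$file" } @$dirs;
    }
    return @missing;
}

# A few of the CPAN modules from the list above, checked against the
# placeholder application libdir.
my @wanted = qw(DBI DBD::mysql SQL::Abstract CGI::Application HTML::Template);
print "missing: $_\n" for missing_modules(['/path/to/apps/lib'], @wanted);
```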

Comments and suggestions are very much welcome. Thank you very much.

Update: added a note about use lib, per sundialsvc4.


Open source softwares? Share and enjoy. Make profit from them if you can. Yet, share and enjoy!

Re: Restricting @INC for specific application need
by Tanktalus (Canon) on Mar 04, 2008 at 16:43 UTC

    I'm still trying to figure this out. It seems it runs contrary to some of the culture of perl. For about the same reason why there are no "private" methods (well, unless you do something really silly), or any way to prevent someone from subclassing your package, the ability to go and install new modules is generally a good thing. In fact, in my setup script, I only expand CPAN modules that haven't been superseded by whatever is on the system already - allowing for a newer module is probably a good thing.

    That said, you may be more interested in the only module. It may get you what you want without all the hackery of @INC. (Well, it will do its own hackery, rather than relying on your own hackery :-))

      Well, this has nothing to do with subclassing. The application, specifically this particular distribution, is intended to use only the required CPAN modules distributed with it. The only module is close to what I need, but then I would have to track the versions of all modules to make sure the expected targets are the ones that get loaded.

      All in all, it may really sound silly, but I really hope I only have to do this once.



Re: Restricting @INC for specific application need
by sundialsvc4 (Abbot) on Mar 07, 2008 at 21:59 UTC

    I take it that you have not yet stumbled-upon use lib ...?

    Anyway, when you do, this is probably going to be what you want. I would have one directory for application-specific modules, another for CPAN-overrides, and in the main-programs of your application simply list them both in a use lib declaration.

    Perl will search the libraries declared here, in order, before it walks down the @INC list... (I've got several deployed apps where I'm having to override CPAN-modules that are seriously-broken, and the owners of those modules continue to “upgrade” them without fixing the bugs. I will not name them here.)

      I didn't use use lib because it only prepends dir parameters to @INC, while my need is to override @INC completely by setting it manually. But yes, I should have mentioned this in the first place.


Re: Restricting @INC for specific application need
by mr_mischief (Monsignor) on Mar 06, 2008 at 05:02 UTC
    You don't really ever know where the "standard" Perl module directory is, do you? I can override any paths using configure. I can have multiple directories included for past point releases. I can have vendor lib directories, which are used by many Linux distributions because the distribution is packaged to be reliable across many installations for many users, and because many distributions use all sorts of Perl in their distribution-specific tools (like urpmi on Mandriva).

    Given those caveats, if you can get your code to be more reliable rather than less reliable by doctoring your lib paths and you don't intend to mess with the defaults for other programs, then I think every option is open to you.

    I'll give you a bit of free advice about common wisdom that not enough people seem to grasp. There are conventions, best practices, and recommendations enough that work well enough for most people in most situations. Usually, it's a good idea to follow the collective wisdom of a well-experienced, thoughtful group. When all the cards are down, so to speak, your chips depend on having the best hand at the table and not a pretty good one on average. If you're finding that you have an odd requirement that falls outside of the common wisdom, then perhaps you're right to have an odd solution. You'd just better be right about it, or someone's going to call you a fool when your system falls down and you have to explain why you stepped outside the norm. If you can make your system ten times as reliable by breaking convention, then do it and document the decision process. If you're making it 100% harder for another Perl programmer to take over the project later for a 5% increase in reliability under someone who knows the whole custom installation and configuration process, you'll be canned and for good reason.

      Long before this shower of distros, I always knew that the "standard" Perl module directory started with /usr/lib/perl5. Whenever I installed Perl myself, I never felt the need to reconfigure @INC or other stuff, except once or twice for debugging and thread options. Why? Because I always intended to replace the Perl (including non-core modules, if any) that came with the distro. Before I leave this old history, just in case it rang a bell, I want to emphasize that "I always wonder why Linux distros...." is not a real question. I can understand the reasons behind the distros' decision; I just don't like the result.

      You don't really ever know where the "standard" Perl module directory is, do you?

      If you were really asking me about the Perl installation shipped with a random distro, my answer would be: I would never be sure until I did some testing or lookup.

      So, to make a long story short, only this application, in this particular distribution, sets @INC in such a way. I believe this @INC hackery (a term borrowed from Tanktalus) won't affect other systems that I don't need, or have no interest, to control.

      I know it's against common practice, and it's against my own standard practice as well. I think the hard part of my OP is that it tries to force a frame, an unusual frame of thinking. Now I can see it fails :-) That's why I said earlier, "Let's put aside for a while the debate around....". I kind of expected the typical reactions I've got so far, which I'm grateful for because they let me know my sanity level (or is it my insanity level?)

      If you can make your system ten times as reliable by breaking convention, then do it and document the decision process
      I will, and thanks for reminding me. My OP is absolutely part of the docs :) The length was mostly to accommodate my intention to share what I went through during the decision-making process. I apologized for that part because I realized it might end up being useless to anybody else.


        It is kind of a pain to install a second perl alongside the distro's perl installation to keep from breaking things, but I've found it worthwhile for the polish and reliability I get with the management tools for my distro.

        I don't think your original post fails at all. I'm a big believer that common wisdom should be used in common situations, but that sometimes it falls down. You just need to be careful about when and how you break the rules and make sure there's actual benefit from it. I can't speak to your case very directly, because I'm not as familiar with your situation as you are. From what you've said, though, you may have one of those situations in which it's worthwhile to write an exception in your local copy of the rules. Just be sure you write it legibly, and, well, cover your behind.

Re: Restricting @INC for specific application need
by ig (Vicar) on Mar 27, 2009 at 06:28 UTC

    If you want to be sure certain modules are loaded from your application library, you can set @INC to include only the application library (or libraries as the case may be) rather than prepending the application library. In this way, if the module isn't in the application library it isn't loaded. For example:

        #!/usr/bin/perl
        #
        use strict;
        use warnings;

        BEGIN {
            local @INC = ( "/opt/app/lib" );
            eval "use Application::Module";
        }

    If the modules in your application library use modules from the standard library and assume a default @INC, then you can't change @INC when you use them, but you can check %INC after they are loaded to ensure they were loaded from the correct location.

    For example, if you wanted to load a patched version of File::Find from your application directory and fail if it was accidentally omitted from there you could do something like the following:

        #!/usr/bin/perl
        #
        use strict;
        use warnings;
        use constant APPLIB => "/opt/app/lib";
        use lib APPLIB;
        use File::Find;

        substr($INC{'File/Find.pm'},0,length(APPLIB)) eq APPLIB
            or die "File::Find incorrectly loaded from " . $INC{'File/Find.pm'};
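The same %INC check can be applied to everything that was loaded rather than to a single module. The following is only a sketch of that idea: loaded_outside is a hypothetical helper, /opt/app/lib is the example path from above, and treating the privlibexp and archlibexp entries of %Config as the allowed core directories is an assumption carried over from the original post.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Config;

# Hypothetical helper: given a hash shaped like %INC and a list of
# allowed directories, return the entries loaded from anywhere else.
sub loaded_outside {
    my ($inc, @allowed) = @_;
    my @stray;
    for my $file (sort keys %$inc) {
        my $path = $inc->{$file};
        next unless defined $path;
        next if grep { index($path, "$_/") == 0 } @allowed;
        push @stray, "$file => $path";
    }
    return @stray;
}

# Report anything that did not come from the application libdir or
# from the core directories named in %Config.
my @stray = loaded_outside(\%INC, '/opt/app/lib',
                           @Config{qw(privlibexp archlibexp)});
warn "loaded from an unexpected location: $_\n" for @stray;
```

Whether to warn or die on a stray entry is a judgment call; dying matches the File::Find example above more closely.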