Including files

by Juerd (Abbot)
on Sep 24, 2004 at 09:28 UTC

No include

Some simple languages, like PHP, offer an include() to include a file literally. Perl does not. Then how can you still load code from other files?

There are many ways to do this in Perl, but there is no easy way to get close to what most people expect of an include(). People who want to include files usually come from simpler languages, or have no programming experience at all. The hardest thing they will have to learn is that you should not want to include a file.

That may sound harsh, but it's a Perl reality. In Perl, we don't include or link libraries and code reuse isn't usually done by copying and pasting. Instead, Perl programmers use modules. The keywords use and require are meant for use with modules. Only in the simplest of situations can you get away with abusing require as if it were some kind of include(), and it's easy to get bitten by how it works: it only loads any given file once.
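The once-only behaviour is easy to see for yourself. Here is a minimal sketch, assuming a throwaway file called counter.pl (a made-up name) that is written to a temporary directory and does nothing but bump a counter. It is loaded three times with require and then three times with do, which runs the file on every call:

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Write a tiny file that increments a package variable each time it runs.
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/counter.pl";    # hypothetical file name

open my $out, '>', $file or die "Cannot write $file: $!";
print {$out} "\$main::loads++;\n1;\n";
close $out or die "Cannot close $file: $!";

our $loads = 0;

# require remembers the file in %INC, so only the first call runs it.
require $file for 1 .. 3;
my $after_require = $loads;

# do runs the file every single time.
do $file for 1 .. 3;
my $after_do = $loads;

print "after require: $after_require\n";
print "after do:      $after_do\n";
```

The counter goes up once for the three require calls and once for each of the three do calls.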

Creating a module

What is a module, anyway? perlmod defines it as: just a set of related functions in a library file, i.e., a Perl package with the same name as the file.

Creating a module is EASY, but it is a little more work than just creating a file with code in it. First, you need to think of a module name. This is the basis for both the filename and the package name, and these have to be synchronised for some features to work. For project specific modules, I always create a new top-level namespace with the name of the project. For example, NameOfTheProject/SomeModule.pm is a good filename (in real life, use something a little more meaningful). The corresponding package is NameOfTheProject::SomeModule. Use CamelCaps like this, because everyone else does too.

The file must be in a directory that is listed in @INC. To find out what your @INC is, run perl -V. On perls older than 5.26, the current working directory (listed under its symbolic name ., a single dot) should be listed; since 5.26 it is no longer in @INC by default. To begin with, putting the module in the script's directory is a good idea. It is the easiest way to keep things organized. If you want to put the module somewhere else (or . is not in your @INC), you will have to update @INC, so that perl knows where to find it. An easy way to do that is to put code like this in the main script:

use lib 'path/to/the/modules';
# The example module could be path/to/the/modules/NameOfTheProject/SomeModule.pm
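A relative path given to use lib is resolved against the current working directory, not against the script's location. If the modules live next to the script, the core FindBin module can supply the script's directory. A minimal sketch, assuming a subdirectory called lib (a made-up name) beside the script:

```perl
use strict;
use warnings;
use FindBin qw($Bin);    # $Bin holds the directory of the running script
use lib "$Bin/lib";      # "lib" is an assumed subdirectory name

# use lib puts its arguments at the front of @INC, ahead of the defaults.
my ($found) = grep { $_ eq "$Bin/lib" } @INC;
print "Modules will be searched for in: $found\n";
```

With this in place the script finds its modules no matter which directory it is started from.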

What goes in the module itself doesn't require much explanation. I'll just give a complete example of a simple NameOfTheProject::SomeModule:

use strict;

package NameOfTheProject::SomeModule;

sub some_function {
    # put sane code here
    return 123;
}

1;

Just list all the subs as you usually would, put a package statement at the top and a true value at the bottom (1 will suffice). Obviously, use strict is a good idea. You should never code a module that does not have this. Beware: A use strict; statement in the main script has no effect on the module.

A module runs only once. That means that for code to be reusable, it must be in a sub. In fact, it's considered very bad style to do anything (except declaring and initializing some variables) in the module's main code. Just don't do that.

You can now load the module and use its sub, simply by doing:

use NameOfTheProject::SomeModule;
print NameOfTheProject::SomeModule::some_function();  # prints 123

You can have the module export its sub(s) to the caller's namespace (that is: the package the use statement is in. By default, this is main). For this, put after the package statement:

use base 'Exporter';
our @EXPORT_OK = ('some_function');

Then, when useing the module, you can request that some_function be imported to your namespace with:

use NameOfTheProject::SomeModule ('some_function');
print some_function();  # prints 123

To have it exported automatically, use @EXPORT instead of @EXPORT_OK. This will eventually bite you if the function names are generic. For example, many people get bitten by LWP::Simple's get function. It is not unlikely that you already have one.
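To try the export mechanism without creating any files, the package can be put in the same file as the script. This is only a sketch for experimenting: the BEGIN block and the %INC assignment are there purely so that use does not go looking for NameOfTheProject/SomeModule.pm on disk, and other_function is a hypothetical second sub added to show what happens to a name that is not requested.

```perl
use strict;
use warnings;

# In a real project this block would be the file NameOfTheProject/SomeModule.pm.
BEGIN {
    package NameOfTheProject::SomeModule;
    use base 'Exporter';

    our @EXPORT_OK = ('some_function', 'other_function');

    sub some_function  { return 123 }
    sub other_function { return 456 }    # hypothetical, not imported below

    # Mark the module as already loaded, so "use" does not search @INC for it.
    $INC{'NameOfTheProject/SomeModule.pm'} = __FILE__;
}

# Only the names listed here end up in the caller's namespace.
use NameOfTheProject::SomeModule ('some_function');

print some_function(), "\n";                                  # imported name
print NameOfTheProject::SomeModule::other_function(), "\n";   # full name always works
print main->can('other_function') ? "imported\n" : "not imported\n";
```

Because other_function is only in @EXPORT_OK and was not asked for, it stays out of main, which is exactly the protection against name clashes that @EXPORT does not give you.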

There are more ways to export functions. I of course prefer to use my own module Exporter::Tidy. Not only because it's smaller and often faster, but mostly because it lets the user of a module define a prefix to avoid clashes. Read its documentation for instructions.

For the export/import mechanism, it is very important that the filename, the package name and the name used with use are equal. This is case sensitive. (Ever wondered why under Windows, use Strict; doesn't enable strict, but also doesn't emit any warning or error message? It has everything to do with the case insensitive filesystem that Windows uses.)

Stubbornly still wanting to include

Sometimes, a module just isn't logical. For example, when you want to use an external configuration file. (Many beginners and people who post code online put configuration in the script itself for ease of use, but this makes upgrading the script harder.) There are many configuration file reader modules you can use, but why use one of those if you can just use bare Perl?

This is where do comes in. What do does is very close to what an include would do, but with a very annoying exception: the new file gets its own lexical scope. In other words: a variable declared with my is not accessible externally. This follows all logical rules attached to lexical variables, but can be very annoying. Fortunately, this does not have to be a problem. do returns whatever the included script returned, and if you make that script just the contents of a hash, here's my favourite way to offer configurability:

# This is config.pl
mirror     => 'http://www.nl.example.com/mirror',
path       => '/var/www/example',
skip_files => [ 'png', 'gif', 'jpg' ],

(The last , is optional, but it's included to make adding a line easier.)

# This is script.pl
use strict;
my %config = do 'config.pl';
chdir $config{path};
...

Error checking is left as an exercise.

Because we used only the return value of the script, and never even touched a variable in config.pl, the inaccessibility of lexical variables is no longer a problem. Besides that, the code looks very clean and we have a very powerful config file format that automatically supports comments and all kinds of useful functions. How about interval => 24 * 60 * 60, for self-documentation? :)
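For those who would rather not do the exercise from scratch, here is one possible sketch of the same idea with some error checking. The helper name load_config is made up, the config file is written to a temporary directory so the example runs on its own, and an absolute path is used so that do does not have to search @INC for it:

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Write a sample config file so the example is self-contained.
my $dir         = tempdir(CLEANUP => 1);
my $config_file = "$dir/config.pl";
open my $out, '>', $config_file or die "Cannot write $config_file: $!";
print {$out} <<'END_CONFIG';
# This is config.pl
mirror     => 'http://www.nl.example.com/mirror',
path       => '/var/www/example',
skip_files => [ 'png', 'gif', 'jpg' ],
interval   => 24 * 60 * 60,
END_CONFIG
close $out or die "Cannot close $config_file: $!";

# load_config is a hypothetical helper: run the file, return its key/value list.
sub load_config {
    my ($file) = @_;
    -r $file or die "Cannot read $file\n";
    my @pairs = do $file;
    die "Cannot parse $file: $@" if $@;                          # syntax error in the file
    die "$file did not return key/value pairs\n" if @pairs % 2;  # also catches a lone undef
    return @pairs;
}

my %config = load_config($config_file);
print "path:     $config{path}\n";
print "interval: $config{interval}\n";
print "skip:     @{ $config{skip_files} }\n";
```

The odd-count check also covers the case where do could not find the file and handed back a single undef.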

Still not good enough

do updates %INC, which you may or may not want. To avoid this, use eval read_file instead. To find out if you want this, read perlvar.
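The difference can be observed directly. The sketch below uses two made-up file names and a small core-only slurp helper standing in for File::Slurp's read_file; it runs one file with do and the other with eval, then looks for both in %INC:

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir = tempdir(CLEANUP => 1);

# Create two tiny files that each just return a value.
my %path;
for my $name ('via_do.pl', 'via_eval.pl') {    # hypothetical file names
    $path{$name} = "$dir/$name";
    open my $out, '>', $path{$name} or die "Cannot write $path{$name}: $!";
    print {$out} "42;\n";
    close $out or die "Cannot close $path{$name}: $!";
}

# Core-only stand-in for File::Slurp::read_file in scalar context.
sub slurp {
    my ($file) = @_;
    open my $in, '<', $file or die "Cannot read $file: $!";
    local $/;    # no record separator: read the whole file at once
    return scalar <$in>;
}

my $from_do   = do $path{'via_do.pl'};
my $from_eval = eval slurp($path{'via_eval.pl'});
die $@ if $@;

my $do_in_inc   = exists $INC{ $path{'via_do.pl'} }   ? 'yes' : 'no';
my $eval_in_inc = exists $INC{ $path{'via_eval.pl'} } ? 'yes' : 'no';

print "do returned $from_do, eval returned $from_eval\n";
print "entry in %INC after do:   $do_in_inc\n";
print "entry in %INC after eval: $eval_in_inc\n";
```

Note that, unlike do, a plain eval of the file's contents can see the lexical variables of the code that calls it.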

There is a way to get an include() the way other languages have it. This is a very ugly hack that uses an internal exception made for Perl's debugger, and is possibly not future proof. As said before, you should not want to include a file. Still, because it is possible, I feel I have to tell you how. Just don't actually use it.

If you read the documentation for eval (which you of course should (don't use an operator without having read its documentation first)), you see that if it is called from within the DB package, it is executed in the caller's scope. This means that lexical values are made visible and the file behaves as a code block.

Here is an example to get an include() function that actually works the way most people expect:

use strict;

package Acme::Include;
use base 'Exporter';
use File::Slurp ();

our @EXPORT = 'include';

{
    package DB;

    # The sub's name is fully qualified to avoid getting a DB::include
    sub Acme::Include::include ($) {
        my ($filename) = @_;
        my $code = qq[#line 1 "$filename"\n] .
                   File::Slurp::read_file($filename);
        eval $code;
    }
}

1;

Documentation for the #line directive is in perlsyn.

To test this new module, save it as Acme/Include.pm and create:

# This is foo.pl
use strict;
use Acme::Include;

my $lexical = 'set in foo.pl';
include 'bar.pl';
print $lexical;  # Should print: set in bar.pl

and:

# This is bar.pl
use strict;
$lexical = 'set in bar.pl';
# There is no "my" here, because that would create a *new* lexical
# variable, hiding the existing one.

and then run perl foo.pl.

This example Acme::Include does not have any error checking. In practice, you will want to check $@ somewhere (but you also want to retain the value returned by the included file, and context to propagate properly. Good luck!).

Learning more

I wrote this tutorial to have an answer ready for the nth time someone in EFnet's #perlhelp asks why require works only once, or asks how to really include a file. Explaining the same thing over and over gets annoying over time. This is not a good guide to writing modules. For that, read chapter 10 of Beginning Perl and perlmod and perlnewmod. Of course, good code always comes with good documentation; so learn POD in 5 minutes.

One last thing

If you name your module Test, don't be surprised if it doesn't work. The current working directory comes last in @INC, so the Test module that is in the Perl distribution is probably loaded instead. This bites me at least once per year, this time while writing this tutorial :).
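When in doubt about which file perl really picked up, %INC has the answer: it maps every loaded file name to the path it was loaded from. A small sketch, using File::Basename only because it ships with perl (any module would do):

```perl
use strict;
use warnings;
use File::Basename ();    # stand-in for whatever module you are unsure about

# List every file loaded so far, together with where it came from.
for my $name (sort keys %INC) {
    print "$name => $INC{$name}\n";
}
```

If the path printed for your module points into the perl installation instead of your project directory, a module of the same name earlier in @INC has won.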

Replies are listed 'Best First'.
Re: Including files
by ambrus (Abbot) on Sep 24, 2004 at 11:44 UTC

    Couldn't one write Acme::Include based on source filters instead of the DB way you gave above?

    Source filters are a bad idea in general, but note that this one does not attempt to parse perl code, so it can not be fooled by weird-looking perl code like some source filters can be. I still don't say that doing such things would be a good idea, but it might be cleaner than the DB hack.

    Here's my example, which is just a quick draft and does not handle line numbers etc.

    Filter/Include.pm is

    package Filter::Include;
    use warnings; use strict;
    use Filter::Util::Call;

    sub import {
        @_==2 or die qq{usage: use Filter::Include "filename"};
        open my $f, "<", $_[1] or die qq{cannot open include file "$_[1]": $!};
        read $f, my $d, -s $f or die qq{cannot read file "$_[1]": $!};
        filter_add(sub { $_ = $d; filter_del(); 1; });
    }

    1;
    __END__

    first is:

    #!perl -w
    use warnings; use strict;
    my(@a, @b);
    @a = ( 1, 2, do{ use Filter::Include "./second"; },
    7, 8);
    print "a(@a) b(@b)\n";
    __END__

    and second which has to be in the current dir when first is run:

    3, 4);
    @b = (5, 6,

    Then, perl first prints a(1 2 3 4) b(5 6 7 8)

      IMO, the DB hack is a cleaner solution, because it is faster, uses a documented feature and has clean syntax for using it. To achieve such clean syntax with a source filter, you need to write a complex regex.

      Besides that, source filters don't work everywhere (like in eval) and I really think the included file should by itself be syntactically correct. Including code is bad, but including partial expressions is, IMHO, even worse.

      On the other hand, the source filter really literally includes, while the DB hack only loads and runs during runtime.

      Juerd # { site => 'juerd.nl', plp_site => 'plp.juerd.nl', do_not_use => 'spamtrap' }

        > IMO, the DB hack is a cleaner solution, because it is faster, uses a documented feature ...

        I don't think any of them would be much faster than the other. The source filter is documented too.

        > and has clean syntax for using it. To achieve such clean syntax with a source filter, you need to write a complex regex.

        It's not that simple. The difference is that the code I gave above does the include at compile time, and its syntax is use Filter::Include "file" (the do is not needed when including complete statements). The DB way includes the code at run-time, which is why a simple subroutine call include "file" is possible. While it is indeed not possible to make my solution work with such a simple syntax without actually interpreting the code (incidentally, that's what the actual Filter::Include CPAN module does), if you wanted to modify the DB solution so that it includes the code at compile time (which can be a difference in semantics, depending on what the include file contains), you'd have to use a use or BEGIN syntax too, or try to parse the code.

        > Besides that, source filters don't work everywhere (like in eval)...

        That's true. More generally, source filters can be used only at compile-time, not runtime. Also, source filters can not be used from command line (-e) it seems.

        > and I really think the included file should by itself be syntactically correct. Including code is bad, but including partial expressions is, IMHO, even worse.

        True. I just wanted to show that this is really including the file.

        Finally let me note that some include facility is already built in perl: the -P switch. If the file third contains

        #!perl -w
        use warnings; use strict;
        my(@a, @b);
        @a = ( 1, 2,
        #include "./second"
        7, 8);
        print "a(@a) b(@b)\n";
        __END__

        and you run it with perl -P third, you get the same results. Of course, perlrun warns you that there are lots of problems with the -P switch.
Re: Including files
by Aristotle (Chancellor) on Sep 26, 2004 at 16:16 UTC

    You probably want to fix this:

    my $code = join( "\n", qq[#line 1 "$filename"], File::Slurp::read_file($filename) );

    You provide list context to read_file, only to then glue all the lines back together anyway. Except that you didn't chomp them, so joining with \n doubles all EOLs which throws off your line numbers. You want a simple concatenation instead.

    my $code = qq[#line 1 "$filename"\n] . File::Slurp::read_file($filename);

    Very nice work on the node.

    Makeshifts last the longest.

      > You provide list context to read_file, only to then glue all the lines back together anyway.

      Oops. That join is a left over bit of an earlier, more complex, attempt. Because Perl doesn't care about double newlines, and I didn't test with multi-line strings, and haven't even looked at line numbers, I never noticed that anything was wrong.

      You are of course right that simple concatenation is better here. I'll update the node right away.

      Juerd # { site => 'juerd.nl', plp_site => 'plp.juerd.nl', do_not_use => 'spamtrap' }

Re: Including files
by jidanni (Initiate) on May 28, 2017 at 23:57 UTC
    Nowadays,
    $ cat config.pl
    # This is config.pl
    mirror => 'http://www.nl.example.com/mirror',
    path => '/var/www/example',
    skip_files => [ 'png', 'gif', 'jpg' ],
    $ cat script.pl
    # This is script.pl
    use strict;
    my %config = do 'config.pl';
    chdir $config{path};
    $ perl script.pl
    $ perl -w script.pl
    Odd number of elements in hash assignment at script.pl line 3.
    Use of uninitialized value in list assignment at script.pl line 3.
    Use of uninitialized value $config{"path"} in chdir at script.pl line 4.
    $

      I would avoid configuration files containing executable code. There are many better alternatives: Re: Accessing variables in an external hash without eval

      Alexander

      --
      Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)

      > Nowadays,

      FWIW, the code still works fine for me under 5.24.1 on Linux. However, I can suggest you try replacing 'config.pl' by './config.pl', as discussed here. Also, it's a good idea to add the error checking to do as shown in its documentation.

Re: Including files
by chacham (Prior) on Dec 07, 2011 at 16:54 UTC
    > use lib 'path/to/the/modules';

    and for same directory files (as i just figured out), no path is needed, just the name. For example, including my local db module:

    use lib 'db';

      If the module you're including is actually in the current directory, no use lib is required at all, since the current working directory, also known as ".", is already in @INC by default.

        Let me get this straight:
        • ~/abc/moo.pl
        • ~/abc/cow.pm
        • ~/zyx/cow.pm
        • current directory is ~/zyx

        moo.pl has:

        • use cow
        • use lib 'cow'

        When i execute moo.pl, . is set to ~/zyx, so:

        • use cow means ~/zyx/cow.pm
        • use lib 'cow' means ~/abc/cow.pm

        Did i get it right that, basically, there is a difference between the "current" directory and the directory of the pl file? Or, put another way, if i keep all my project files in the same directory, like we do during development, and the directory may change, and the directory i execute it from may change, will "use lib" work "as expected"?
