http://qs321.pair.com?node_id=291927

mp has asked for the wisdom of the Perl Monks concerning the following question:

I'm starting to recompartmentalize code that is in 250+ library files (*.pm) into module distributions. The main motivation for doing this is to be able to integrate unit tests so that modules can be tested standalone before installation.

(Is this a bad approach? Would it be better to just keep using the source module files (*.pm) directly without modulizing them, and just come up with some way to handle unit tests, rather than use the h2xs-generated framework?)

These module distributions use the same structure as CPAN modules (start with h2xs), but for the most part, they won't become CPAN modules because they are not general purpose. Previously, the mod_perl-driven website that used these modules used "use lib ..." to access the source for the libraries directly so there was no module build process, but in converting them to module distributions, they will each require building with:

    perl Makefile.PL
    make
    make test
    make install

The modules will need to be installed on multiple servers. Their source code will undergo frequent modifications, so the modules will frequently need to be upgraded (re-installed) on multiple servers. The module sources will be kept in CVS.

With these not being CPAN modules, what is a good way to determine which version of each module is currently installed on a system? Also, what is a good way to handle the build process so that only those modules that have been modified are rebuilt? I can do a CVS update of the module sources on each system, but I need some way to determine which modules need to be re-built and re-installed. I have thought of making a global Makefile (or perl script that does basically the same thing a Makefile would do) to handle this, but am not sure if that's the best approach.

For CPAN modules, I can get pretty much what I need with Bundle files and the "autobundle" command in CPAN.pm, but I'm just not sure what is the best way (and the one that requires the least additional work) to achieve similar capability for non-CPAN modules. Is there some (relatively easy) way I can leverage existing CPAN.pm functionality to treat these non-CPAN modules as though they were on CPAN?

Re: non-CPAN module distributions
by perrin (Chancellor) on Sep 16, 2003 at 19:23 UTC
    Is this a bad approach? Would it be better to just keep using the source module files (*.pm) directly without modulizing them, and just come up with some way to handle unit tests, rather than use the h2xs-generated framework?

    Yes. You have nothing to gain by making separate installations unless you intend to use these modules in separate places. I don't really understand the problem with the unit tests. You can make as many tests as you want and they can test anything you want. I have a whole bunch of different test files in my current project, all testing different modules, and they all just sit in my t/ directory.

    what is a good way to determine which version of each module is currently installed on a system?

    Add a $VERSION variable to each module.
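A minimal sketch of how that might look, assuming a hypothetical in-house module named MyCorp::Mod1 (defined inline here so the snippet runs on its own; normally it would live in its own .pm file). The VERSION method that every package inherits from UNIVERSAL returns whatever $VERSION the package sets:

```perl
use strict;
use warnings;

# Hypothetical in-house module; in practice this is lib/MyCorp/Mod1.pm.
package MyCorp::Mod1;
our $VERSION = '1.02';

package main;

# Return the $VERSION of a loaded module, or undef if it has none.
sub installed_version {
    my ($module) = @_;
    my $version = eval { $module->VERSION };
    return $version;
}

print "MyCorp::Mod1 ", installed_version('MyCorp::Mod1'), "\n";
```

From the shell, something like perl -MMyCorp::Mod1 -e 'print MyCorp::Mod1->VERSION' should report what a given server actually has installed.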

    what is a good way to handle the build process so that only those modules that have been modified are rebuilt? I can do a CVS update of the module sources on each system, but I need some way to determine which modules need to be re-built and re-installed.

    This is an example of why you shouldn't split these up. Your global Makefile idea would probably work though.
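For what it's worth, here is a rough sketch of the decision such a global script would have to make, assuming every module carries a $VERSION and that the version numbers have already been collected into two hashes (source tree vs. installed). The function name and the sample module names are made up for illustration:

```perl
use strict;
use warnings;
use version;

# Given { module => version } maps for the source tree and for what is
# installed, list the modules that are missing or older on this system.
sub needs_rebuild {
    my ($source, $installed) = @_;
    return sort grep {
        !defined $installed->{$_}
            || version->parse($source->{$_}) > version->parse($installed->{$_})
    } keys %$source;
}

# Hypothetical data: Mod1 was bumped in CVS, Mod3 was never installed.
my %source    = ('MyCorp::Mod1' => '1.02', 'MyCorp::Mod2' => '2.00', 'OtherUtil::Mod3' => '0.5');
my %installed = ('MyCorp::Mod1' => '1.01', 'MyCorp::Mod2' => '2.00');

print "$_\n" for needs_rebuild(\%source, \%installed);
```

This only works if $VERSION is bumped on every change. The source-side versions could be read without loading the modules, for example with MM->parse_version($file) from ExtUtils::MakeMaker.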

      You have nothing to gain by making separate installations unless you intend to use these modules in separate places.

      Hmm. I sometimes develop in this way. You do get some useful things from this model:

      • You can easily separate the test scripts that apply to this particular module from those that apply to another. This allows you to easily run only the relevant tests when testing a module. This can save time and make TDD considerably more pleasant.
      • People can work on modules independently from each other.
      • It can give you a nice clear separation of unit tests (distributed with module directory) and acceptance tests (distributed with application as a whole).
      • Allows you to easily release module updates rather than application updates as appropriate.

      There are certainly other ways of getting these benefits, but the CPAN-ish structure gets you a lot without any effort.

        I think you can do all of these things just by making separate test scripts for each module and only running the one that pertains to the module you're working on. That's what I'm doing now.
      Perrin, thank you for your input.

      The advantages I see of moving groups of module files (*.pm) into module distributions are that tests would be bundled with the code they are testing, support libraries needed just during testing would be bundled in as well, and in many cases, only one or two module distributions would be required for standalone development and unit testing. It seems to me that it would be easier to have someone work on and test just one module distribution without having to understand the entire design and interactions between it and other modules.

      It may well be that these advantages are outweighed by the disadvantages -- that is part of the assessement I'm currently trying to make before going very far down this path. Some disadvantages of this bundling into module distributions are that later code restructuring will become more difficult, and that additional work is needed at the very least, to setup the framework for doing module installations.

        The advantages I see of moving groups of module files (*.pm) into module distributions are that tests would be bundled with the code they are testing

        If you have a single big bundle, the tests are still bundled with the code they are testing.

        support libraries needed just during testing would be bundled in as well

        I don't see why these couldn't be part of your big bundle as well.

        in many cases, only one or two module distributions would be required for standalone development and unit testing.

        Does it hurt anything to have the other modules there if they aren't being used?

        It seems to me that it would be easier to have someone work on and test just one module distribution without having to understand the entire design and interactions between it and other modules.

        Why is this difficult to do when they're all together? I have lots of modules as part of one project with no separate installs. Each one has a separate test script. The tested modules don't interact with each other at all during tests unless it's really necessary. While working on one module, I just run that module's test. When I'm ready for integration testing, I run all the tests. If you split up your modules, integration testing will be a lot more hassle.

        You can split them up if you want to, but I just don't think you'll get much value from it, unless you plan to release individual pieces to CPAN.

Re: non-CPAN module distributions
by adrianh (Chancellor) on Sep 16, 2003 at 22:02 UTC
    Is this a bad approach? Would it be better to just keep using the source module files (*.pm) directly without modulizing them, and just come up with some way to handle unit tests, rather than use the h2xs-generated framework?

    I don't think it's a bad approach - but there are certainly other ways of going about it. For example:

    • Group your unit tests (by name, by directory, etc.) and use tools like Test::Verbose or a hand rolled test runner to run just the tests you want. For example if you prefix all test scripts with the module name you can use Test::Verbose to do things like:
      % tv t/ModuleName*.t
    • Use something like <plug bias="author">Test::Class</plug> to combine the unit tests for a module into a single test script.
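A hand-rolled runner along those lines could be as small as the sketch below. It assumes the naming convention mentioned above (test scripts prefixed with the module name, with :: written as -), which is my assumption rather than anything Test::Verbose requires, and it uses runtests from Test::Harness:

```perl
use strict;
use warnings;
use Test::Harness qw(runtests);

# Pick the test scripts whose file name starts with the module's name.
# Assumed convention: My::Mod is tested by t/My-Mod*.t
sub tests_for {
    my ($module, @candidates) = @_;
    (my $prefix = $module) =~ s/::/-/g;
    return sort grep { m{(?:^|/)\Q$prefix\E[^/]*\.t$} } @candidates;
}

# Usage: perl run_tests.pl My::Mod   (run_tests.pl is a hypothetical script name)
my $module = shift @ARGV;
if (defined $module) {
    my @tests = tests_for($module, glob('t/*.t'));
    die "No tests found for $module\n" unless @tests;
    runtests(@tests);
}
```

Run without arguments it does nothing; a fuller version might fall back to running everything in t/.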
    Previously, the mod_perl-driven website that used these modules used "use lib ..." to access the source for the libraries directly so there was no module build process, but in converting them to module distributions, they will each require building with ...

    You can get ExtUtils::MakeMaker to do most of the work for you.

    With these not being CPAN modules, what is a good way to determine which version of each module is currently installed on a system?

    That's what $VERSION is for :-)

    Also, what is a good way to handle the build process so that only those modules that have been modified are rebuilt? I can do a CVS update of the module sources on each system, but I need some way to determine which modules need to be re-built and re-installed. I have thought of making a global Makefile (or perl script that does basically the same thing a Makefile would do) to handle this, but am not sure if that's the best approach.

    When I go this route this is the structure I use:

    AppBundle/
        Makefile.PL
        t/
            acceptance_test1.t
            acceptance_test2.t
            ...
        Module1/
            Makefile.PL
            t/
            lib/
            ...
        Module2/
            Makefile.PL
            t/
            lib/
            ...

    Where the root Makefile.PL is like this one, and the individual module Makefile.PL files are as you would normally expect.

    This allows me to:

    • Check out individual modules, run the module's perl Makefile.PL, and develop a module independently of the application as a whole.
    • Check out the entire AppBundle, run the root Makefile.PL, and build, test and install the application as a whole.
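For readers without the linked file, a root Makefile.PL for this kind of layout might look roughly like the sketch below. This is not adrianh's actual file; the distribution name and version are placeholders, and the subdirectory names are taken from the layout shown earlier. DIR is the MakeMaker attribute for listing subdirectories that have their own Makefile.PL (as far as I know MakeMaker also finds them by default, so listing them is mostly for clarity):

```perl
# Makefile.PL at the top of AppBundle/ (illustrative sketch only)
use strict;
use warnings;
use ExtUtils::MakeMaker;

WriteMakefile(
    NAME    => 'AppBundle',    # placeholder distribution name
    VERSION => '0.01',         # placeholder version
    # Subdirectories that carry their own Makefile.PL.
    DIR     => [ 'Module1', 'Module2' ],
    # Acceptance tests that ship with the application as a whole.
    test    => { TESTS => 't/*.t' },
);
```

With something like this in place, perl Makefile.PL followed by make test at the top level should build the subdirectories and run their tests along with the acceptance tests.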

    In fact I have a hook script that, on commit, does a make test on the application as a whole and mails me if it goes wrong. I'm a Subversion user, but I'm sure it will carry across to CVS.

    Hope this makes sense :-)

      Thanks for posting the link to Subversion. I'd never heard of it, but it's now installed and I am a convert!


      Examine what is said, not who speaks.
      "Efficiency is intelligent laziness." -David Dunham
      "When I'm working on a problem, I never think about beauty. I think only how to solve the problem. But when I have finished, if the solution is not beautiful, I know it is wrong." -Richard Buckminster Fuller
      If I understand your problem, I can solve it! Of course, the same can be said for you.

        Subversion rocks :-)

        Actually, it just occurred to me that Subversion would offer another solution to the separation of tests issue. You could create a separate directory in a Subversion repository that had a per-module Makefile.PL and externals definitions linking to the appropriate module and test code from the main directory.

        You could then checkout a module that could be worked on stand alone, and have all changes automatically checked into the main tree.

        No idea if/how you would do this in CVS.

Re: non-CPAN module distributions
by jmanning2k (Pilgrim) on Sep 16, 2003 at 20:18 UTC
    For the basic modularization stuff, I'm working on something similar.

    Unfortunately ExtUtils::MakeMaker only really works when you have everything under a single module directory. I ended up making a top level prefix that matched my work name, and putting everything under that. The downside being that I had to change all the package lines, all the use lines, and everywhere that referenced that code. A few regex one-liners did the trick, but it was still a pain.

    This solved many of the problems you listed. The source to my module is in CVS, and I can just do a cvs update. When you want to install, perl Makefile.PL LIB=/install/path works great, and only updates changed files.

    You can also make a distribution tarball (make dist) to install on multiple machines, or just do cvs update and build on each machine - It only installs newer files.

    That said, I'll be watching this thread for any alternate solutions. There is a MAKEFILE_PL option in MakeMaker for which the docs read "put other directories with Makefile.PL files in here". I'm thinking that might be used for a suite of modules with a master Makefile.PL. I'm not sure though.

    Hope this helps.
    ~Jon

    A few links that might help:
    http://www.cpan.org/misc/cpan-faq.html#How_make_bundle
    http://search.cpan.org/author/MSCHWERN/ExtUtils-MakeMaker-6.17/lib/ExtUtils/MakeMaker/Tutorial.pod

      You don't need to put them all in one directory, just make sure you don't use the ridiculous structure that h2xs creates. Schwern has talked about modifying h2xs at some point so that it will no longer do this. With a sane directory structure, you can check things out directly from CVS and use them right away without running make (assuming they are not XS modules), even if they are multiple directories deep. The structure shown in that tutorial you linked to is correct, and you'll note that it is different from what h2xs generates.
        Really? I've been having trouble with that. Any help you can provide would be appreciated.

        What do I set NAME to? If I set NAME to 'MyCorp', then it only installs modules in lib/MyCorp/, and totally ignores lib/OtherUtil/.

        I have a structure like:

            t/tests.t
            lib/MyCorp/Mod1.pm
            lib/MyCorp/Submodules/Mod2.pm
            lib/OtherUtil/Mod3.pm
            Makefile.PL
            bin/helperscript.pl
        The problem is, Mod1.pm and Mod2.pm get installed OK, but Mod3 is ignored. NAME in Makefile.PL is set to 'MyCorp'.

        Do I need multiple makefiles for this? aka:

            Makefile.PL (master)
            MyCorp/Makefile.PL
            MyCorp/lib/MyCorp/Mod1.pm
            OtherUtil/Makefile.PL
            OtherUtil/lib/OtherUtil/Mod3.pm

        The only way it works for me now, is if I put everything under MyCorp/.

        Thanks, ~Jon

Re: non-CPAN module distributions
by leriksen (Curate) on Sep 17, 2003 at 00:35 UTC
    I think there are several advantages to breaking up the lotsa-modules structure you have
    • you get to think about exactly how all the parts relate to each other
    • whilst they may not be candidates for CPAN, you can create localised 'general' modules - I find I end up with three groups -
      1. core
      2. tools/utils
      3. specials
    • there are opportunities to refactor/simplify
    • your release process can be better defined - this release is based on fixes to library A and extensions to library B, instead of we changed a whole bunch o stuff
    • because the libraries are separated, one would hope that over time opportunities to use some of them for other projects would emerge - if they are all in one, that step may not be made
    • it shows a level of maturity in both your development and the code's

    As for the versioning, I use the $VERSION string too, but sometimes I even create a VERSION.pm file for a library, whose only job is to hold the master version identity of the library. I sometimes link this to the CVS version support via
    $VERSION = sprintf("%d.%02d", q$Revision$ =~ /(\d+)\.(\d+)/);
    or I link to a tag holding the master version identity in CVS via
    $VERSION = sprintf("%d.%02d", q$Name$ =~ /(\d+)_(\d+)/);
    Be aware that $Name$ requires you to check out with -r <tagname>, and that CVS doesn't allow '.' in tag names, which is why I use the '_' character.
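To make the keyword trick a bit more concrete, here is a small sketch of what those expressions do once CVS has expanded the keywords. The helper name and the sample tag release_2_13 are made up, and the sketch zero-pads the minor number (%02d) so that revision 1.5 becomes 1.05 and still sorts below 1.10:

```perl
use strict;
use warnings;

# Pull "major<sep>minor" out of an expanded CVS keyword and format it
# as a two-part version number with a zero-padded minor part.
sub version_from_keyword {
    my ($keyword, $sep) = @_;
    my ($major, $minor) = $keyword =~ /(\d+)\Q$sep\E(\d+)/
        or return;
    return sprintf '%d.%02d', $major, $minor;
}

# Strings as CVS would expand them (sample values, not from a real checkout).
print version_from_keyword('$Revision: 1.5 $', '.'), "\n";       # prints 1.05
print version_from_keyword('$Name: release_2_13 $', '_'), "\n";  # prints 2.13
```

Note that if a file containing these sample strings is itself committed to CVS, the keywords in it will be rewritten on checkout.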
Re: non-CPAN module distributions - summary
by mp (Deacon) on Sep 18, 2003 at 16:17 UTC
    Thank you all for the very helpful replies/comments on this. I have learned a LOT from them. To summarize, there seem to be 4 basic approaches on a scale that ranges from running straight from the source files at one end to having everything packaged up in CPAN-like module distributions at the other. There are additional points in between these 4 on the scale.
    • Run everything from library module sources and not bundle anything into a module (unless you have to for XS support or something similar). This is described by perrin here, here, here, and here. This is basically what I do currently, except that many of the modules lack unit tests and sufficient pod documentation on their interface. ExtUtils::MakeMaker would not be needed at all.
    • Make a single module distribution and stick everything in ./lib. It would support either having or not having a build process (Makefile.PL), with tradeoffs, depending on whether the application pointed to ./lib directly or not. This is actually very similar to the previous bullet item, but does let you "borrow" some ExtUtils::MakeMaker functionality. If a library needed to be broken up into module distributions for CPAN or some other reason, I think this would be a good first step.
    • Make a single module distribution but with submodules (for lack of a better name) that sit in subdirectories and each have their own Makefile.PL and their own unit tests. The top level module sits above them with its own Makefile.PL and integration tests. How to set this up is described by adrianh here. This is a useful technique that I did not know existed. I think this would be a good second step in moving towards individual module distributions, and it seems to pretty much make the global makefile for you.
    • Bundle everything up into various CPAN-like modules. Unless a particular module distribution really needed to be separate from the rest (to release to CPAN or for some other reason), it would probably not make much sense to go to this step.