PerlMonks  

Per-distro versioning and dependency specification

by creamygoodness (Curate)
on Jun 06, 2012 at 02:30 UTC ( [id://974617]=perlmeditation )

Abstract

Version numbers and dependency specifications are best expressed at the level of the distro (i.e. the atomic unit of installation) rather than at the level of the module.

CPAN authors can emulate distro-level versioning by unifying version numbers across all subcomponents and by treating the removal of a module from a distro as a backwards compatibility break.

Discussion

Per-module dependency specification

A number of distributions on CPAN provide multiple modules with disparate $VERSION numbers. Take URI 1.60:

    URI:             1.60
    URI::Escape:     3.31
    URI::Heuristic:  4.20
    URI::Split:
    URI::URL:        5.04
    ...

The CPAN toolchain supports expressing dependencies at the level of individual modules...

    requires:
        URI::Escape: 3.31

... though it is common to use only the primary module in the distro as a proxy for all the others:

    requires:
        URI: 1.60   # may be a proxy for URI::Escape 3.31

Impracticality of consistent per-module dependency specification

In theory, it is safest for downstream users to express all module dependencies individually, because individual dependency specs will continue to resolve properly if a module moves to a different distribution -- for example, if URI::Escape were to be broken out of URI.

However, there are numerous distributions out there which are structured as large collections of small modules, and for which it would be tedious and error-prone to specify each constituent dependency separately. The existence of such distributions renders strict per-module dependency specification impractical.

Removing a module from a distro breaks backwards compatibility

The only safe assumption for an author to make is that some fraction of the user base has listed the primary module of a distro as a dependency when they really want a different module within the distro. Users in this category are going to get burnt when their desired module gets removed. Therefore, removing a module from a distro must be classified as a compatibility break.
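A minimal sketch of that failure mode, assuming a hypothetical index that maps each module to the distro that ships it. The `URI-Escape` distro name and the `missing_modules` helper are invented for illustration, and the sketch ignores the case where the new distro is added as a prerequisite of the old one.

```perl
use strict;
use warnings;

# Hypothetical module-to-distro indexes, before and after URI::Escape
# is broken out of the URI distro (the split itself is an assumption).
my %before = ( 'URI' => 'URI', 'URI::Escape' => 'URI' );
my %after  = ( 'URI' => 'URI', 'URI::Escape' => 'URI-Escape' );

# Return the modules the code actually uses whose distro is not pulled in
# by the declared prerequisites.
sub missing_modules {
    my ( $index, $declared, $used ) = @_;
    my %installed;
    $installed{ $index->{$_} } = 1 for @$declared;
    return grep { !$installed{ $index->{$_} } } @$used;
}

# A downstream user who declares URI as a proxy but really needs URI::Escape.
my @declared = ('URI');
my @used     = ('URI::Escape');

print "before split: ", join( ', ', missing_modules( \%before, \@declared, \@used ) ) || 'nothing missing', "\n";
print "after split:  ", join( ', ', missing_modules( \%after,  \@declared, \@used ) ) || 'nothing missing', "\n";
```

A user who had declared `URI::Escape` directly would be unaffected by the split, which is the theoretical advantage of per-module specification noted earlier.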

Hidden major version breaks

Say that URI::Escape makes a major version increment, but URI does not:

    URI         1.60 ... 1.61
    URI::Escape 3.31 ... 4.00

The distribution's version number only moves from 1.60 to 1.61 because it is tied to the URI module -- URI::Escape is only along for the ride.

Unfortunately, this minor version increment fails to communicate to downstream users that a potentially backwards-incompatible major version break has occurred.
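The hypothetical upgrade above can be sketched with the core `version` module. The helper names (`major_of`, `is_major_break`, `satisfies`) are illustrative, and treating the part before the first dot as the "major" version is an assumption about the author's numbering scheme rather than anything the toolchain enforces.

```perl
use strict;
use warnings;
use version;

# Major component of a decimal version string such as '3.31'.
sub major_of {
    my ($v) = @_;
    return ( split /\./, $v )[0];
}

# True when an upgrade crosses a major version boundary.
sub is_major_break {
    my ( $old, $new ) = @_;
    return major_of($new) > major_of($old) ? 1 : 0;
}

# True when an installed version meets a minimum-version requirement.
sub satisfies {
    my ( $installed, $minimum ) = @_;
    return version->parse($installed) >= version->parse($minimum) ? 1 : 0;
}

# The hypothetical release from the example above.
my %old = ( 'URI' => '1.60', 'URI::Escape' => '3.31' );
my %new = ( 'URI' => '1.61', 'URI::Escape' => '4.00' );

for my $module ( sort keys %old ) {
    printf "%-12s %s -> %s  major break: %s\n",
        $module, $old{$module}, $new{$module},
        is_major_break( $old{$module}, $new{$module} ) ? 'yes' : 'no';
}

# A downstream "requires URI 1.60" is still satisfied, so nothing flags the break.
print "requires URI 1.60 satisfied by new release: ",
    ( satisfies( $new{'URI'}, '1.60' ) ? 'yes' : 'no' ), "\n";
```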

The alternative: Per-distro dependency specification

Another approach is to assign a single $VERSION variable to the primary module in the distro, leaving all other modules without versions. This yields improvements in conceptual clarity.

Authors can still do evil things in the context of per-distro packaging, such as breaking back compat within a subcomponent without incrementing the distro's major version number. However, it is more obvious that you're doing something wrong because the distro's version number unambiguously represents everything together.

Removal of a subcomponent is likewise an unambiguous compatibility break -- that is, unless the subcomponent gets broken out into a new distro which gets added to the original distro's dependency chain as a mandatory prerequisite.

The downside is that we must assume that some fraction of the user base will attempt to specify per-module dependencies and will not get their desired behavior.

Suggestions [1] (originally titled "Recommendations")

Fortunately, it is possible to emulate distro-level versioning while still supporting downstream users who make per-module dependency specifications:

  • Assign a $VERSION number in every provided package. (Admittedly, this is inconvenient and violates DRY; some projects use scripts to help ease the maintenance burden.)
  • Ensure that all $VERSION variables within a distro contain exactly the same value. (Some packaging tools provide validation routines.)
  • Treat the removal of a module from a distro as a backwards compatibility break.
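The second suggestion can be checked mechanically. What follows is a rough sketch, not one of the packaging tools alluded to above: the `declared_version` regex is a deliberate simplification that only recognises plain literal assignments, and the file names and sources are made up for the example.

```perl
use strict;
use warnings;

# Extract the first literal $VERSION assignment from module source text.
# Simplified on purpose: handles quoted or bare numeric literals only.
sub declared_version {
    my ($source) = @_;
    return $source =~ /^\s*(?:our\s+)?\$VERSION\s*=\s*['"]?([\d._]+)['"]?\s*;/m
        ? $1
        : undef;
}

# Given the expected distro version and { filename => source }, return the
# files (sorted by name) whose $VERSION is missing or differs.
sub inconsistent_files {
    my ( $expected, $sources ) = @_;
    my @bad = grep {
        my $v = declared_version( $sources->{$_} );
        !defined $v || $v ne $expected;
    } sort keys %$sources;
    return @bad;
}

# Made-up sources mirroring the URI 1.60 listing earlier in the post.
my %sources = (
    'lib/URI.pm'        => "package URI;\nour \$VERSION = '1.60';\n1;\n",
    'lib/URI/Escape.pm' => "package URI::Escape;\nour \$VERSION = '3.31';\n1;\n",
    'lib/URI/Split.pm'  => "package URI::Split;\n1;\n",
);

print "out of sync with 1.60: $_\n" for inconsistent_files( '1.60', \%sources );
```

A check like this could run as an author test before each release, so that a forgotten bump fails loudly rather than shipping.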

References

This post is a spinoff from an off-topic thread on the cpan-testers-discuss list. The recommendations about $VERSION variables have been repeated many times elsewhere.

[1] Update: "Suggestions" just sounds a little more humble the morning after. :)

Replies are listed 'Best First'.
Re: Per-distro versioning and dependency specification
by tobyink (Canon) on Jun 06, 2012 at 09:04 UTC

    I always keep the version numbers of modules in my distributions synchronised. While using different version numbers in different modules can be handled sanely, it seems to me there is at least potential for confusion. On the other hand, synchronisation doesn't appear to have many drawbacks. If I bump the version number of one module, then it doesn't seem harmful to bump the version numbers of the rest, even if they haven't seen any other changes.

    With regard to changing major version numbers when breaking backwards compatibility, personally I don't think that goes far enough. If I write:

    use Foo::Bar 2.00;

    ... but Foo::Bar 3.00 has a completely changed API, then my code may break in unexpected and confusing ways.
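A minimal sketch of why that happens: `use Foo::Bar 2.00;` only asserts a minimum, by calling `Foo::Bar->VERSION(2.00)` at compile time. Here a hypothetical `Foo::Bar` at 3.00 is defined inline, and `meets_minimum` is an illustrative helper, not part of any module.

```perl
use strict;
use warnings;

# Hypothetical module, defined inline so the sketch is self-contained.
{
    package Foo::Bar;
    our $VERSION = '3.00';
}

# Returns 1 if the loaded class satisfies the minimum version, 0 otherwise.
# UNIVERSAL::VERSION dies when the requirement is not met.
sub meets_minimum {
    my ( $class, $minimum ) = @_;
    return eval { $class->VERSION($minimum); 1 } ? 1 : 0;
}

print "minimum 2.00 accepted by Foo::Bar 3.00: ", meets_minimum( 'Foo::Bar', '2.00' ), "\n";
print "minimum 4.00 accepted by Foo::Bar 3.00: ", meets_minimum( 'Foo::Bar', '4.00' ), "\n";
```

The check has no notion of an upper bound, so a rewritten 3.00 API passes the 2.00 requirement just as 2.01 would.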

    When JSON.pm changed its API between versions 1 and 2, this caused a lot of breakage. The only way to avoid that is to continue to support your old API alongside the new one. In practice this usually means giving your new API a different module name (and perhaps rewriting the old module as a wrapper for the new one).

    perl -E'sub Monkey::do{say$_,for@_,do{($monkey=[caller(0)]->[3])=~s{::}{ }and$monkey}}"Monkey say"->Monkey::do'

      It's tough to handle backwards compatibility breaks sanely in a dynamic environment. I agree that if you want to avoid disruption entirely in Perl 5, package renaming is the only option, and have in fact done something like what you recommend in the past.

      Things would be easier if the language offered namespace aliasing a la Python and the community had a tradition of Java-like package namespaces.

      use org::apache::spamassassin3::Mail::SpamAssassin as SpamAssassin;
      my $spamtest = SpamAssassin->new();

      Using the aliased module, you can almost get there in Perl 5, though aliased only aliases one package at a time rather than the whole hierarchy.

      # Works:
      use aliased 'org::apache::spamassassin3::Mail::SpamAssassin';
      my $spamtest = SpamAssassin->new();

      # Doesn't work:
      use aliased org::apache::spamassassin3 => 'SA';
      my $spamtest = SA::Mail::SpamAssassin->new();

      Can you imagine the abuse that would rain down on someone who uploaded a tarball like org::apache::spamassassin3-3.31.tar.gz to CPAN, though?

Re: Per-distro versioning and dependency specification (DRY)
by tye (Sage) on Jun 06, 2012 at 17:31 UTC

    Why is everybody copying version numbers all over the place in their distributions?

    So, in the theoretical case of me writing a module that includes multiple packages where there isn't one "main" package that always gets used and so is the only place that needs to support "require at least version V", here is how I would implement that (because it means that there is exactly one place where I track the current version number):

    Makefile.PL:
        ...
        VERSION_FROM => 'lib/My/Widget/Version.pm',
        ...

    lib/My/Widget/Version.pm:
        package My::Widget::Version;
        require Exporter;
        our @EXPORT_OK = '$VERSION';
        *import = \&Exporter::import;
        our $VERSION = 1.011_021;

    lib/My/Widget/Flanged.pm:
        package My::Widget::Flanged;
        use My::Widget::Version '$VERSION';
        ...

    Nothing at all complicated about that. And if you don't want to pull in Exporter, then you have to have two simple lines in each "versioned" package instead of just the one.
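A minimal sketch of how the Exporter-based layout above behaves at run time, with both packages defined inline and `%INC` primed by hand (an assumption made only so the sketch fits in one file). It shows what `use`/`require` and `->VERSION` see after loading; it says nothing about how indexing tools read the files.

```perl
use strict;
use warnings;

# Stand-in for lib/My/Widget/Version.pm, defined at compile time.
BEGIN {
    package My::Widget::Version;
    require Exporter;
    our @EXPORT_OK = ('$VERSION');
    *import = \&Exporter::import;
    our $VERSION = '1.011021';

    # Mark the module as already loaded so the "use" below skips the file lookup.
    $INC{'My/Widget/Version.pm'} = __FILE__;
}

# Stand-in for lib/My/Widget/Flanged.pm: its $VERSION is an imported alias.
{
    package My::Widget::Flanged;
    use My::Widget::Version '$VERSION';
}

# Run-time version checks on the sibling package work through the alias.
print "Flanged version: ", My::Widget::Flanged->VERSION, "\n";
print "minimum 1.011 met: ",
    ( eval { My::Widget::Flanged->VERSION('1.011'); 1 } ? 'yes' : 'no' ), "\n";
```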

    Are there any ways in which that trivial solution is not better than any module that does work to copy-and-paste a version string into multiple files or that just checks that you did the copy-and-paste correctly?

    - tye        

      I don't believe that the imported version numbers will be recognized by PAUSE. See the PAUSE documentation, and also this note in Perl::Critic::Policy::Modules::RequireVersionVar:

      =head1 TO DO

      Add check that C<$VERSION> is independently evaluatable. In particular, prohibit this:

          our $VERSION = $Other::Module::VERSION;

      This doesn't work because PAUSE and other tools literally copy your version declaration out of your module and evaluates it in isolation, at which point there's nothing in C<Other::Module>, and so the C<$VERSION> is undefined.

      This may cause problems for downstream users who specify per-module dependencies. It's up to you whether you want to support them; some people feel quite strongly about it, such as "Anonymous Monk" above.
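A rough sketch of the isolated evaluation that note describes. This is a simplified stand-in, not PAUSE's actual code: `isolated_version` and the `Isolated::Sandbox` package are invented here, and the line-matching regex is far cruder than what real tools use.

```perl
use strict;
use warnings;

# Find the first line that assigns to $VERSION and evaluate only that line,
# in an otherwise empty package, with nothing else from the module loaded.
sub isolated_version {
    my ($source) = @_;
    my ($line) = $source =~ /^(.*\$VERSION\s*=.*)$/m or return;
    no warnings 'once';
    local $Isolated::Sandbox::VERSION;
    return scalar eval "package Isolated::Sandbox; no strict; no warnings; $line; \$VERSION";
}

# Three ways a module might end up with a $VERSION.
my $literal  = q{our $VERSION = '1.60';};
my $copied   = q{our $VERSION = $Other::Module::VERSION;};
my $imported = q{use My::Widget::Version '$VERSION';};

for my $source ( $literal, $copied, $imported ) {
    printf "%-45s => %s\n", $source, isolated_version($source) // 'undef';
}
```

Only the literal assignment survives: the copied form evaluates to undef because `Other::Module` was never loaded, and the imported form has no assignment line to extract at all.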

        My first reaction was that I didn't think that I cared. PAUSE will get the correct version number for the distribution, which is what matters. 'require' will get the correct version number for what it does. I didn't think of any way that I would care what PAUSE thinks the version number is of a particular package.

        Then I realized that the problem is probably that trying to set a dependency on "My::Widget::Flanged => 1.021_031" would be what could fail.

        *sigh* Stupid tools are stupid. If there are indeed "install" tools that are stupid enough to fail for that case, then somebody should consider making them (or PAUSE or the latest "meta.today's-fad-data-encapsulation-format writer") smart enough to use the dist version for the module version when the module version appears to be beyond their ken. Of course, there surely are install tools that are that stupid. There's got to be a dozen install tools by now and most of the ones I've tried I quickly decided to never use again because of how stupid they were.

        Not that I'm actually convinced that I care. One could still successfully declare a dependence on "My::Widget::Version => 1.021_031". Also, having such a distribution as I described probably means that My::Widget would be an even better place to put the version number and depending on "My::Widget => 1.021_041" is completely reasonable.

        What I don't find reasonable is copying and pasting version numbers nor then building annoying tools to assist in that work and all just to allow people to use stupid tools stupidly.

        Yeah, I'd rather document "My::Widget::Version => 1.021_041" being required than resort to copying and pasting version numbers.

        Though, I find it quite a stretch to imagine myself writing a module distribution where any of this comes up. Given the stupidity of the tools that lead to this dilemma, I might even document that if you want to use My::Widget::Flanged and require at least version V of it, then you have to code that like:

        use My::Widget::Version 1.021_031;
        use My::Widget::Flanged;

        - tye        

Re: Per-distro versioning and dependency specification
by BrowserUk (Patriarch) on Jun 06, 2012 at 06:46 UTC

    Congratulations on the stupidest least-well justified, most ill-thought-through recommendation I've seen in a very long time.

    By reductio ad absurdum: Ubuntu (substitute your favorite here) is a "distribution". It contains many "packages" -- over 1200 device drivers before you even start on the other components. If it incremented a single version 'number' every time any of those hundreds of subcomponents changed, it would seriously need to consider using 64-bit integers for each of the major(midor)minor components of the version 'number', lest it ran out of space.

    Too big for this advice? How about GCC. If it incremented its version number every time one of its hundreds (thousands?) of subcomponents changed, its version numbers would read like telephone numbers.

    Still too big a project? At what point does this suggestion cease to be viable?


    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.

    The start of some sanity?

      I would draw the line around any group of files which can only be installed together. That was what the phrase "atomic unit of installation" was intended to convey.

      Ubuntu and GCC don't really fit into that context, as they are aggregations of smaller packages. Most CPAN tarballs would qualify, though, since perl Makefile.PL; make install is generally all or nothing.

Re: Per-distro versioning and dependency specification
by Anonymous Monk on Jun 06, 2012 at 03:50 UTC

    Yeah, I disagree with pretty much everything you wrote

    use/require work with modules, modules are the basis of reusable code in perl

    Forcing the same $VERSION number in every provided package doesn't violate DRY but it also doesn't make sense as a standard or recommended practice

    If you know, through consulting various Changes files, and through testing, that your code will only work if you  require CGI 2001; then go ahead and add use CGI 2001; otherwise don't specify a version number

    You can use the output of Devel::Modlist and note in the documentation, that your module (or app, whatever) is known to work with this combination of modules/versions

    Or you can turn this list into a list of author/distribution-name.tar.gz and create a Task which only installs this combination of dependencies.

    Also, you don't say whom this recommendation is intended for :) but clearly anyone writing a lot of code (GAAS with his LWP.../URI... or say authors of Moose ) have already made up their mind and adopted a practice

      The recommendations are for upstream authors.

      FWIW, Moose's versioning is consistent with what the original post advocates (the version numbers for all modules in Moose-2.0602.tar.gz are synchronized at 2.0602). Downstream users are well-served regardless of whether their dependency specifications reference Moose itself or any number of its subcomponents.

Re: Per-distro versioning and dependency specification
by Anonymous Monk on Jun 06, 2012 at 07:29 UTC
    there are numerous distributions (…) for which it would be tedious and error-prone to specify each constituent dependency separately. The existence of such distributions renders strict per-module dependency specification impractical.

    Perl::PrereqScanner makes short work of that.

    The only safe assumption for an author to make is that some fraction of the user base has listed the primary module of a distro as a dependency when they really want a different module within the distro.
    This looks like some sort of circular argument to me: "I advocate per-distro versioning, and some people already do per-distro versioning, so per-distro versioning should be done generally."

    In the majority opinion, when someone does per-distro versioning, it constitutes an accidental bug and should be fixed to become per-module.

    leaving all other modules without versions
    This only works if the proposal is universally accepted and implemented. (This is already implied from the later section "Recommendations", I rather want to say this explicitly, too.)

      There is no need for this proposal to be "universally accepted and implemented." :) I merely hope that individual authors will consider it.

      As for exploding dependency lists using Perl::PrereqScanner, I'm agnostic. It might be good defensive programming for downstream users, but I think any upstream author who justifies recomposing a distribution on the basis of Perl::PrereqScanner's availability is irresponsible.

      Ultimately, a version number has to stand for some bounded collection of code. My argument is that assigning disparate version numbers to subdivisions within a collection of code which can only be installed as a single atomic unit is unhelpful, and thus that version numbers should be unified within such installation units.

      For the sake of argument, why not assign version numbers to individual subroutines? Subroutines can be moved to different distros, too. Is Perl::PrereqScanner paranoid enough? :) Of course the answer is that the Perl 5 interpreter and use/require deal in modules with version numbers -- but there is tension between that and how modules typically get packaged and installed as indivisible collections.

      Now I might argue that it would be better logical behavior if the interpreter understood the concept of distros and dealt with version numbers at that level -- and that angle interests me as a student of language design and informs my opinions about best practice. However, in this Meditation I have attempted to limit the concrete suggestions to those which can be implemented without either

      • a time machine, or
      • the unanimous consent of all CPAN authors.

Node Type: perlmeditation [id://974617]
Approved by davido
Front-paged by Arunbear