http://qs321.pair.com?node_id=49061


in reply to Re (tilly) 1: Supersplit
in thread Supersplit

If you are a module, there is no need to do initializations in a BEGIN block.

Maybe you haven't found any but I certainly have. They aren't easy to run into but I still find it a valuable habit. Leaving a potentially large window for a race condition to fit into isn't my idea of a good programming technique. I suspect that future features of Perl will make this even more important to avoid.

Along those lines, I'd use base qw(Exporter) myself, though I look forward to just being able to dispatch &import and avoid the sledgehammer of inheritance. (:
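For readers on a newer Perl: as far as I know, Exporter 5.57 and later (shipped with perl 5.8.3) can hand over its import() without any inheritance. The sketch below assumes such a version is installed; My::Util and shout() are made-up names, and the module is defined inline only so the example fits in one file.

```perl
use strict;
use warnings;

BEGIN {
    package My::Util;                 # hypothetical module, defined inline
    use Exporter 'import';            # borrow Exporter's import() instead of inheriting it
    our @EXPORT_OK = ('shout');
    sub shout { return uc $_[0] }
    $INC{'My/Util.pm'} = __FILE__;    # mark as loaded so "use" below does not search disk
}

use My::Util 'shout';

my $shouted  = shout('no @ISA needed');
my $inherits = My::Util->isa('Exporter') ? 'yes' : 'no';

print "$shouted\n";
print "My::Util inherits from Exporter: $inherits\n";
```

Since @ISA is never touched, the question of when to initialize it for Exporter's sake does not come up at all.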

I consider qr// to be new enough of a feature that if I were to use it in a module I would make it optional:

    BEGIN {
        my $sub;
        if( $] < 5.005 ) {
            $sub= 'return $_[0]';
        } else {
            $sub= 'return qr/$_[0]/';
        }
        eval "sub _qr { $sub }; 1" or die $@;
    }
    # ...
    _split($text, map {_qr($_)} @_);
(untested, though).

        - tye (but my friends call me "Tye")

Re (tilly) 3: Supersplit
by tilly (Archbishop) on Dec 31, 2000 at 10:22 UTC
    I disagree quite strongly on the BEGIN issue.

    When someone loads your module via use or require, there is no gap between the finish of parsing your module and the execution of code in your module. Therefore there is no possibility of a race if you don't play games with BEGIN in your module, and stupid games are not played while checking who is loading you. I assume, of course, that you are not using an unstable experimental feature (i.e. threads). And if Perl 6 gets into races with initialization code while loading modules, then that is a showstopper in my books!

    If I am wrong then please show me how exactly this race can happen in Perl. If it is something which I think could possibly happen to me, then I will start blocking it. But not unless. In general I won't put energy into defensively coding against things that I don't think I will get wrong. Conversely if it is something that I can conceivably get wrong by accident, I will become a paranoid nut. :-)

    Now I will give you a very good reason not to move your initialization into BEGIN. If you don't move your manipulation of @ISA into BEGIN, then on 5.005 (and if they fix the stupid $Carp::CarpLevel games in Exporter on 5.6 as well) if you mess up a use statement then you will by default get it correctly reported from the correct package in your module rather than from the module which uses you. If you move the manipulation of @ISA into BEGIN or use base to achieve the same effect you will mess that up. (Note that the fixed Carp that will appear in 5.8 does not have this issue.) Therefore by not playing games with what parts of your initializations occur before your module is done compiling, you will get error reporting which is more likely to be informative.

    So the BEGIN time initialization not only doesn't buy me anything that I care about, it loses me something that I consider very important!

    The qr// point is a matter of taste and environment. No, it is not supported in older Perls. If that is an issue for you, then it is easy enough to drop it and just split on /$re/.
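A minimal sketch of the two options, with made-up sample data: a pattern kept in a plain string and interpolated into the match, next to a precompiled qr// object (which needs 5.005 or later).

```perl
use strict;
use warnings;

my $text = 'a1b22c333d';

# Pattern held as a plain string and interpolated; works without qr//.
my $re_string   = '\d+';
my @from_string = split /$re_string/, $text;

# Precompiled pattern object; requires Perl 5.005 or later.
my $re_object   = qr/\d+/;
my @from_object = split $re_object, $text;

print "@from_string\n";   # a b c d
print "@from_object\n";   # a b c d
```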

      I hate having to say, "Oops".

      Everything that I said about BEGIN is true. But my promise that Perl 5.8 will have a version of Carp which fixes the begin-time Exporter issue is incorrect. Here are some details on the situation.

      Suppose that you have 3 packages, A, B, and C. Suppose that B and C both inherit from A but C does not inherit from B. Today when C makes a call into B and then B makes a call into A which generates an error, Carp will not label the error as coming from A, B, or C, but instead will report it from whoever called C. In 5.8 the error will in this case be reported from C because Carp has not been told that B trusts C.
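A rough way to try this scenario on whatever Carp you have installed. The package names Pkg::A, Pkg::B and Pkg::C are stand-ins for A, B and C (a bare B would collide with the core B:: namespace). Which line ends up in the message depends on the Carp version, so no expected output is claimed here.

```perl
use strict;
use warnings;
use Carp ();

package Pkg::A;
sub fail { Carp::croak('bad input') }      # the error originates here

package Pkg::B;
our @ISA = ('Pkg::A');                     # B inherits from A
sub work { return Pkg::A::fail() }

package Pkg::C;
our @ISA = ('Pkg::A');                     # C inherits from A, but not from B
sub run { return Pkg::B::work() }          # C calls into B

package main;
my $ok = eval { Pkg::C::run(); 1 };
# Inspect the line number in the message to see which caller Carp blamed.
print "croak said: $@";
```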

      This is good because the fact that B inherits from A is an internal implementation detail which C should not have to know about. What is not good at the moment is that the trust relationship is synonymous with "inherits". But the code for Carp has been written in a way where this can be fixed by just replacing the trusts_directly() function in Carp::Heavy with something smarter. This is all intended.

      What I had not thought through, though, is the situation where C calls a method in B, which actually is inherited from A. Now when Carp looks at the information, what it sees from caller is that C called a function in A. But C trusts A, so that call is trusted. In fact if the call generates an error, then C should be fingered because B does not trust C, and where the error is reported from should not depend on that implementation detail. But the critical piece of information required is not reported by caller: on a method call we don't know what call was intended, we only know what function was called. If we had a way to get that information, we could set $called to the appropriate thing in the short_error_loc() function in Carp::Heavy, and the underlying problem would go away in Perl 5.8. But unless someone feels motivated to champion this on p5p, that won't happen and in this situation the error will continue to be incorrectly reported.
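The missing information is easy to see in a small sketch (Pkg::A and Pkg::B are hypothetical stand-ins for A and B): the method is invoked through Pkg::B, yet caller only names the package where the sub was defined.

```perl
use strict;
use warnings;

package Pkg::A;
# Element 3 of caller(0) is the fully qualified name of the running sub.
sub where { return (caller(0))[3] }

package Pkg::B;
our @ISA = ('Pkg::A');       # Pkg::B inherits where() from Pkg::A

package main;
my $name = Pkg::B->where;    # method call made through Pkg::B
print "$name\n";             # Pkg::A::where
```

Nothing in that frame records that the call was made as Pkg::B->where.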

      This last situation is exactly what we get with Exporter and playing games with @ISA during compile time. If you do not play the games, then while your code is being compiled you do not inherit from Exporter during your use invocations, so you can be correctly fingered as causing errors. If you do set @ISA bright and early, your use of other packages will go to Exporter's import() function, and then when Carp tries to figure out the error it will decide not to finger you because what it sees is you calling a function in a package you inherit from.

      An incidental note. Capturing information about method calls in caller would allow that information to be properly propagated in Perl's SUPER mechanism. This is a necessary step towards having Perl's SUPER mechanism play nicely with multiple inheritance. Of course making that change in Perl 5 raises backwards compatibility issues, so likely a new keyword would need to be created. That would require even more championing. Since I am personally of the opinion that multiple inheritance leads to overly fragile and complex design issues, I won't be seen caring much either way, but some others might...

        First, my comment about using qr// was specifically about writing modules. If you are writing modules then you should consider other people's environments. So I consider the minor effort to support what are still quite common versions of Perl to be good manners.

        After all of the above, I am even more convinced that:

        • Inheritance should not be used for Exporter.
        • use base and initializing static variables in BEGIN blocks is a good idea.
        • Having Carp.pm pay attention to inheritance is a bad idea.

        The documentation for Carp.pm is quite simple, saying that the error will be reported as coming from the caller (not from the caller of the caller of the caller etc. for as long as it takes to get out of your inheritance hierarchy). I think this is a much better idea.

        My first run-in with Carp.pm skipping more than one level of the call stack kinda made sense since it allowed your module to have a helper routine that didn't check all of its arguments closely and could pass on bad arguments to an underlying routine and in that specific case, it makes sense to have the error report two levels up. Well, sorta. At least it doesn't help much to have an error reported in your helper routine.

        I say "sorta" because I'd better be pretty careful in my underlying routine to word my error message clearly. It doesn't do much good to have code fred("bar") report "_fetchField() called with undefined field name". So I think that in most cases, even the best case that I can come up with, skipping more than one level of call stack doesn't really give a better message than just simply always skipping exactly one level would do.

        I think it makes much more sense to have those specific error messages (that are very carefully worded and carefully chosen such that they really were caused more than one level up and will be understood when reported more than one level up) call carp() in such a way that carp() knows how far up the stack to climb.
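A minimal sketch of that idea, using only the built-in caller rather than Carp itself. The helper _die_from() and its level argument are hypothetical, not part of any module; fred() and the field-name check echo the example above.

```perl
use strict;
use warnings;

# Hypothetical helper: die with $msg, blaming the call site $levels frames up.
sub _die_from {
    my ($levels, $msg) = @_;
    # caller(0) is where _die_from was called, caller(1) is where the sub
    # that called _die_from was called, caller(2) is where *its* caller
    # was called, and so on.
    my (undef, $file, $line) = caller($levels);
    die "$msg at $file line $line.\n";
}

sub _fetch_field {
    my ($name) = @_;
    # Worded for the end user and blamed on whoever called fred().
    _die_from(2, 'fred() needs a defined field name') unless defined $name;
    return "field:$name";
}

sub fred { return _fetch_field($_[0]) }

my $ok = eval { fred(undef); 1 };
print $ok ? "no error\n" : "error: $@";
```

Only the one carefully worded message climbs two levels; any other error raised in the helper would still be reported where it happened.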

        Now skipping the whole inheritance hierarchy is just madness. So everyone who uses HTML::ParseTree (or whatever some module is that expects the user to use inheritance in order to make use of the module) should be getting their programming errors popped up to the next higher package instance?

        I can see wanting carp() to skip a couple levels of inheritance hierarchy in very rare cases where I have several packages that are developed together, probably in a single source file, and some specific arguments are not checked until we get a few levels deep. Then only errors about those specific arguments should be allowed to bubble up the appropriate level.

        But just because I inherit from a class certainly doesn't mean that I should avoid checking my arguments for validity before passing them along to that class. But this seems to be what the designers of Carp assume, no? If I inherit from someone else's package, I really do want to be told when I use their package wrong. I will write my code such that errors passed in by users of my code will be detected by my code, not blindly passed on to the other module.

        So the fact that Exporter has a "bug" in that it doesn't take into account the design flaw of Carp, and that this bug can be worked around by making the inheritance tree inconsistent over time, isn't really a good argument for encouraging such inconsistency.

        I think you may be too caught up in the whole Carp mess. (:

        And all of this just reinforces my conviction that lots of things go on between declaring @ISA and run-time initializing @ISA and it doesn't make sense to leave your inheritance in an inconsistent state during all of this processing.
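The window itself is easy to observe in a single file. In this sketch My::Base and My::Derived are hypothetical packages, and @ISA is set with a plain run-time assignment; a BEGIN block placed after it still sees an empty @ISA.

```perl
use strict;
use warnings;

package My::Base;
sub hello { return 'hello from My::Base' }

package My::Derived;
our @ISA = ('My::Base');    # run-time assignment, not yet executed while compiling
our $at_compile_time;
# This BEGIN block runs as soon as it is parsed, before the assignment above.
BEGIN { $at_compile_time = My::Derived->can('hello') ? 'yes' : 'no' }

package main;
my $at_run_time = My::Derived->can('hello') ? 'yes' : 'no';
print "can hello at compile time: $My::Derived::at_compile_time\n";   # no
print "can hello at run time:     $at_run_time\n";                    # yes
```

Anything that runs during compilation, such as the import() of a module pulled in with use, sees the "no" state unless @ISA is set in a BEGIN block or through use base.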

        Sorry, I don't have time right now to dig up the easy way to break things when you don't use BEGIN or use base. But several other people got convinced enough that patches were issued such that BEGIN blocks were actually documented as the proper way to initialize your package variables like $VERSION and @ISA.

                - tye (but my friends call me "Tye")
        In caller you have file and line number. Go ahead and parse the call yourself if you need more info.