in reply to (tye)Re: Supersplit in thread Supersplit
I disagree quite strongly on the BEGIN issue.
When someone loads your module via use or require, there
is no gap between the finish of parsing your module and
the execution of code in your module. Therefore there is
no possibility of a race if you don't play games with
BEGIN in your module, and stupid games are not played
while checking who is loading you. I assume, of course,
that you are nt using an unstable experimental feature.
(ie Threads. And if Perl 6 gets into races with
initialization code while loading modules, then that is a
showstopper in my books!)
If I am wrong then please show me how exactly this race can
happen in Perl. If it is something which I think could
possibly happen to me, then I will start blocking it. But
not unless. In general I won't put energy into defensively
coding against things that I don't think I will get wrong.
Conversely if it is something that I can conceivably get
wrong by accident, I will become a paranoid nut. :-)
Now I will give you a very good reason not to move
your initialization into BEGIN. If you don't move your
manipulation of @ISA into BEGIN, then on 5.005 (and if they
fix the stupid $Carp::CarpLevel games in Exporter on 5.6
as well) if you mess up a use statement then you will by
default get it correctly reported from the correct package
in your module rather than from the module which uses
you. If you move the manipulation of @ISA into BEGIN or
use base to achieve the same effect you will mess that
up. (Note that the fixed Carp that will appear in 5.8
does not have this issue.) Therefore by not playing games
with what parts of your initializations occur before your
module is done compiling, you will get error reporting which
is more likely to be informative.
So the BEGIN time initialization not only doesn't buy me
anything that I care about, it loses me something that I
consider very important!
The qr// point is a matter of taste and environment. No,
it is not supported in older Perl's. If that is an issue
for you, then it is easy enough to drop it and just split
on /$re/.
Re (tilly) 4 (gotcha): Supersplit
by tilly (Archbishop) on Dec 31, 2000 at 13:33 UTC
|
I hate having to say, "Oops".
Everything that I said about BEGIN is true. But my promise
that Perl 5.8 will have a version of Carp which fixes the
begin-time Exporter issue is incorrect. Here are some
details on the situation.
Suppose that you have 3 packages,
A, B, and C. Suppose that B and C both inherit from A but
C does not inherit from B. Today when C makes a call into
B and then B makes a call into A which generates an error,
Carp will not label the error as coming from A, B, or C,
but instead will report it from whoever called C. In 5.8
the error will in this case be reported from C because
Carp has not been told that B trusts C.
This is good because the fact that B inherits from A is an
internal implementation detail which C should not have to
know about. What is not good at the moment is that the
trust relationship is synonymous with "inherits". But the
code for Carp has been written in a way where this can be
fixed by just replacing the trusts_directly()
function in Carp::Heavy with something smarter. This is all
intended.
What I had not thought through, though, is the situation
where C calls a method in B, which actually is inherited
from A. Now when Carp looks at the information, what it
sees from caller is that C called a function in A. But C
trusts A, so that call is trusted. In fact if the call
generates an error, then C should be fingered because B
does not trust C, and where the error is reported from
should not depend on that implementation detail. But the
critical piece of information required is not reported by
caller, on a method call we don't know what call was
intended, we only know what function was called. If we
had a way to get that information, then set $called to the
appropriate thing in the short_error_loc() function
in Carp::Heavy, and the underlying problem would go away in
Perl 5.8. But unless someone feels motivated to champion
this on p5p, that won't happen and in this situation the
error will continue to be incorrectly reported.
This last situation is exactly what we get with Exporter
and playing games with @ISA during compile time. If you do
not play the games, then while your code is being compiled
you do not inherit from Exporter during your use
invocations, so you can be correctly fingered as causing
errors. If you do set @ISA bright and early, your use
of other packages will go to Exporter's import()
function, and then when Carp tries to figure out the
error it will decide not to finger you because what it sees
is you calling a function in a package you inherit from.
An incidental note. Capturing information about method
calls in caller would allow that information to be
properly propagated in Perl's SUPER mechanism. This is a
necessary step towards having Perl's SUPER mechanism play
nicely with multiple inheritance. Of course making that
change in Perl 5 raises backwards compatibility issues, so
likely a new keyword would need to be created. That would
require even more championining. Since I am personally of
the opinion that multiple inheritance leads to overly
fragile and complex design issues, I won't be seen caring
much either way, but some others might... | [reply] |
|
First, my comment about using qr// was specific about when writing modules. If you are writing modules then you should consider other peoples' environments. So I think the minor effort to support what are still quite common versions of Perl to be good manners.
After all of the above, I am even more convinced that:
- Inheritance should not be used for Exporter.
- use base and initializing static variables in BEGIN blocks is a good idea.
- Having Carp.pm pay attention to inheritance is a bad idea.
The documentation for Carp.pm is quite simple, saying that the error will be reported as coming from the caller (not from the caller of the caller of the caller etc. for as long as it takes to get out of your inheritance heirachy). I think this is a much better idea.
My first run-in with Carp.pm skipping more than one level of the call stack kinda made sense since it allowed your module to have a helper routine that didn't check all of its arguments closely and could pass on bad arguments to an underlying routine and in that specific case, it makes sense to have the error report two levels up. Well, sorta. At least it doesn't help much to have an error reported in your helper routine.
I say "sorta" because I'd better be pretty careful in my underlying routine to word my error message clearly. It doesn't do much good to have code fred("bar") report "_fetchField() called with undefined field name". So I think that in most cases, even the best case I that can come up with, skipping more than one level of call stack doesn't really give a better message than just simply always skipping exactly one level would do.
I think it makes much more sense to let those specific error messages (that are very carefully worded and carefully chosen such that they really were caused more than one level up and will be understood when reported more than one level up) should call carp() in such a way that carp() knows how far up the stack to climb.
Now skipping the whole inheritance heirarchy is just madness. So everyone who uses HTML::ParseTree (or whatever some module is that expects the user to use inheritance in order to make use of the module) should be getting their programming errors popped up to the next higher package instance?
I can see wanting carp() to skip a couple levels of inheritance heirarchy in very rare cases where I have several packages that are developed together, probably in a single source file, and some specific arguments are not checked until we get a few levels deep. Then only errors about those specific arguments should be allowed to bubble up the appropriate level.
But just because I inherit from a class certainly doesn't mean that I should avoid checking my arguments for validity before passing them along to that class. But this seems to be what the designers of Carp assume, no? If I inherit from someone else's package, I really do want to be told when I use their package wrong. I will write my code such that errors passed in by users of my code will be detected by my code, not blindly passed on to the other module.
So the fact that Exporter has a "bug" in that it doesn't take into account the bad design flaw of Carp and this bug can be worked around by making the inheritance tree inconsistant over time, isn't really a good argument for encouraging such inconsistancy.
I think you may be too caught up in the whole Carp mess. (:
And all of this just enforces my commitment that lots of things go on between declaring @ISA and run-time initializing @ISA and it doesn't make sense to leave your inheritance in an inconsistant state during all of this processing.
Sorry, I don't have time right now to dig up the easy way to break things when you don't use BEGIN or use base. But several other people got convinced enough that patches were issued such that BEGIN blocks were actually documented as the proper way to initialize your package variables like $VERSION and @ISA.
-
tye
(but my friends call me "Tye")
| [reply] [d/l] [select] |
|
I think we are agreed on the qr// idea. I just wanted to
mention that it was possible, not advocate its use.
So now on to the more substantial BEGIN issue.
Having read and thought about what you wrote, the following
points summarize my current position fairly well:
- I form my own opinions. Telling me what other people
have been convinced of is a waste of time. Give me
examples and reasons please.
- Complaints about poorly written error messages don't
go very far with me.
- Our differing views about debuggers explain to some
extent our disagreements on error messages.
- There are valid reasons for having a standard way to
have error messages reported from elsewhere. There needs
to be some flexibility in what that elsewhere is.
- It is a bad idea to make wrappers know a lot about
what they are wrapping or vice versa.
- It is a bad idea to make the programmer have to
be explicit about where exactly the error should be
reported from.
- Inheritance is the wrong handle to use for deciding
where errors get reported. But it is at least a relative
of the right concepts.
- I still think that having user code play cute games
with the internal definitions of what happens during
run-time and compilation is a Bad Idea.
More on those points.
- I form my own opinions. Telling me what other people
have been convinced of is a waste of time. Give me
examples and reasons please.
This is how I have always been. The fact that other
people think something doesn't tell me why I should
think that. Perhaps when I think about it I will come
to a different conclusion. More importantly, knowing
why I believe what I do is more important than having
some rote rules to learn from.
For instance Perl 5.6 introduces the idea of 'our'. A
lot of people - including many who I know are better at
both Perl and programming in general than I am, think
that it is a great idea. However that didn't stop me
from thinking about it and deciding to respectfully
disagree.
Now perhaps it is a great idea. If so then by forming
my own opinion and putting it forth I get to find out
something about how I don't understand programming in a
useful way. The opportunity for that is more important
to me than having a stupid opinion from time to time.
And not doing that leads to Cargo Cult
thinking.
So rather than tell me that some example out there
convinced some people to have the official documentation
give me the examples. Besides which I note that the
current version of Perl (ie what I see in 5.7 snapshots)
does not recommend using BEGIN blocks. Or suggest
base. So not everyone can have been convinced.
- Complaints about poorly written error messages
don't go very far with me.
This should be a, "Well, duh" item. In Perl there are
many different options for reporting an error. You can
die. You can warn. From Carp you can choose to
carp, croak, cluck, and confess. The only one
of these that I have not used in production code is
cluck. Each one makes sense in different places, and
what message you should write depends strongly on which
one you are using.
Complaining that people don't know how to write useful
error messages sounds to me like a user education issue.
If you don't know how to write an informative message
then it doesn't matter what function you use, your errors
will be uninformative at best and seriously misleading
at worst. So learn to write better messages.
For the record from most to least frequently I use
confess, croak, die, warn, carp, and cluck. When I
am writing stuff that others will use I move croak and
carp up in that list.
- Our differing views about debuggers explain to some
extent our disagreements on error messages.
I was trying to figure out why we disagree. And this is
the best explanation I came up with.
As we discussed in Are debuggers good?, I prefer to write good
error messages and then depend on them. You prefer to
use the debugger. This difference in philosophy affects
our view of error messages. If the error message gives
a good place to start in the debugger, it is likely
useful for you. I want it to give me me much more than
that. I want it to finger a piece of code and give me
information on why that code in particular is likely to
be the cause of the problem.
As a result I demand far more out of my error messages
than you do. So I put (and expect others to put) far
more energy into what is reported where and when. And
into deciding what is reported how. Which leads to an
interesting cycle. Because I put that energy in, I
constantly find that errors are pinpointing my exact
problem in a useful way, and I do not have the desire
to use the debugger. The same happens in reverse for
you.
Or at least that is the best theory I can come up with
at the moment.
- There are valid reasons for having a standard way
to have error messages reported from elsewhere. There
needs to be some flexibility in what that elsewhere
is.
There is a world of difference between where an error
is found and what the most useful place to report it
from is. What both croak and carp are for is having
a library report an error from the context of a user.
If the library does this then it is up to the library
to provide enough context about the error to make the
report useful. (The lazy way to provide context is,
of course, to just confess to your sins.)
The reason for this is that whoever is maintaining the
user-code should not need to know anything about the
internals of the library to figure out the code. And
a real-world useful library is unlikely to just belong
in one package. For instance look at LWP. So
if you encounter a problem (eg you cannot resolve the
site you are trying to connect to) that has nothing to
do with your library, deciding where to report it by
reporting the error as soon as you get out of your
current package is likely to be useless.
- It is a bad idea to make wrappers know a lot about
what they are wrapping or vice versa.
Very often people want to wrap something else. For
instance XS wrappers to allow people in Perl to use
C libraries, the DBI wrapper to give a wide variety
of DBD drivers a common Perl interface, so on and so
forth. This is both a common thing to do and often
a good design idea. This should be encouraged, not
discouraged.
When you insist that the wrapper should
know every possible thing that can go wrong inside
the thing wrapped and handle all of them, you have
just made wrapping a much more complex thing to do,
and are discouraging wrappers. If I wrap a socket
connection with an interface that makes ftp requests
transparent, and then someone uses my code, there is
no reason on Earth that I should have to do my own
tests that the connection can be made or won't break.
Not only is there no good way for me to do that, but
I am reusing code that (hopefully) already does
that for me - why should I reinvent the wheel?
Conversely the thing wrapped shouldn't need to care
that it is wrapped. After all you have an interface
into it, not vica versa. If it is written well it
should not be making assumptions about how it is
being used. If it is making such assumptions
then it is preventing itself from being easily reused.
- It is a bad idea to make the programmer have to
be explicit about where exactly the error should be
reported from.
Well you know my opinion about the stupid misuses of
$Carp::CarpLevel in the past. First of all when
proper coding depends on programmers always getting
things right, that is guaranteed to go wrong often.
For instance the most common security whole in every
year of the last ten has remained the basic buffer
overflow. It isn't hard. Don't put more into a data
structure than it has room for. But people simply
cannot get it right. (But the ever growing popularity
of languages like Perl that do dynamic allocation and
so prevent buffer overflows may someday change this.)
Well the one thing I don't want to see messed up on a
regular basis is error reporting. OK, you have to ask
for errors to be reported. But if you require people
to also have to synchronize information correctly, they
will get that wrong. Guaranteed. They will do things
like hard-code the level. (Which Carp and perldb.pl
both do with $Carp::CarpLevel.) Or they will write
their own nifty functions to figure out where they think
the error should come from. (warnings.pm does this.)
And they will get it wrong. Do you want proof? Just
look at how regularly the Perl core has messed up on
this given the opportunity! And the Perl core is
supposedly filled with competent people!
In particular woe betide the poor joe who wants to wrap
a library that for no particular reason chooses to be
special and do it differently!
Software design involves a constant balancing act
between what complexity needs to be exposed in your
interface, and what interface aspects are going to be
the cause of ongoing problems. In this case my strong
opinion is that asking for libraries to know enough
about their environment and how they are used that they
can correctly decide where it is reported from is just
begging for a nightmare.
- Inheritance is the wrong handle to use for deciding
where errors get reported. But it is at least a relative
of the right concepts.
I am not a mind reader. I am not even positive who
wrote the original Carp. However I had to think
through a lot of the decisions made, and while I have
to say that the original was a hack, it was a hack
that caught a lot of important ideas.
The use of inheritance in Carp captures the following
insight. It should not matter whether a method was
inherited or overridden. That is an implementation
detail. The rewrite takes that basic insight and
makes its application more consistent. But it is still
retained. If it matters to you whether a given method
was over-ridden or inherited, then perhaps inheritance
is not what you are looking for.
As you rightly point out, this does not generally hold.
Here are several classes of issues:
- The module (like HTML::ParseTree) expects
you to inherit and subclass from it.
- You have a has-a relationship rather than is-a.
(This is very common in wrappers, and we both agree
that using has-a is usually cleaner than relying on
is-a.)
- The library (which is complex enough to be spread
across several packages) is not object-oriented.
I claim it as a bug in Carp that it does not have a
more flexible declaration mechanism for handling these
cases. I don't think that coming up with one is
by any means impossible. However when dealing with
reasonably mature packages (eg stuff off of CPAN) it
will often turn out that if package A inherits from
package B then they shouldn't really be reporting
errors in each other.
- I still think that having user code play cute games
with the internal definitions of what happens during
run-time and compilation is generally unwarranted, and
for this case in particular is a bad idea.
This is where we started. I am personally a fan of
trying to KISS. Have what people are likely to do
pretty much by default work well. If you can, make
it work right to have people consistently do the simple
thing. If you can't then make it a big deal.
Now I have no idea that if code that doesn't play cute
BEGIN games relies on code that does, you can get all
sorts of fun games. But my answer to that is to not
play cute games in the first place, not to play more.
And in this case in particular you will guaranteed get
wrong behaviour from both old and current versions of
Perl (as well as versions likely to be released in the
near future) by playing games with BEGIN. So Don't Do
That. And if you do do that and get hurt, Don't
Complain.
OK, so this is far, far more than I intended to write, but
it hopefully is interesting. I don't think that the
concept of Carp is broken. I am emphatically unconvinced
that BEGIN games are justified. And I have just said far
more on that than I should. :-)
| [reply] |
|
|
|
In caller you have file and line number. Go ahead and parse the call yourself if you need more info.
| [reply] |
|
|