http://qs321.pair.com?node_id=740402

targetsmart has asked for the wisdom of the Perl Monks concerning the following question:

Why do most of the CPAN/core modules lack comments (though they definitely have man pages)? Is there any particular reason for it?
The reason for asking is that whenever I use a subroutine from an external module (a core module or one downloaded from CPAN), I go and look at how the author implemented it, to learn good/efficient practices. But many times, because of the lack of comments, I have had to resort to other means, such as using a debugger, or copying just that subroutine and adding debug messages to it. It would be excellent if the code came with sufficient comments (from a learner's point of view).
I believe that I have asked nothing wrong here.

Vivek
-- In accordance with the prarabdha of each, the One whose function it is to ordain makes each to act. What will not happen will never happen, whatever effort one may put forth. And what will happen will not fail to happen, however much one may seek to prevent it. This is certain. The part of wisdom therefore is to stay quiet.

Re: Why no comments?
by GrandFather (Saint) on Jan 31, 2009 at 11:02 UTC

    Writing good comments is at least as hard as writing good code. Bad comments can be worse than bad code. Comments can give the reader the impression that they understand the code without the need to actually read and understand the code. If the comments are good, that's fine. If the comments are bad that can lead to a great deal of wasted time and frustration.

    Don't get me wrong. Comments are an important part of writing good code. But it is better to write good code that speaks for itself than to write a translation of the code into some semblance of English (or whatever language is appropriate).

    Good comments convey the intent of code and may explain how a tricky algorithm works, but don't give a blow-by-blow description of the code.

    Modules provided by CPAN are written by people of vastly varying ability and can demonstrate code on many different levels, but few of those modules are written for their educative value. "Dumbing down" a module so that it is accessible to learners doesn't help CPAN and doesn't actually help learners. For one thing, there are much better ways to learn Perl basics (PerlMonks for a start), and for another: learners don't stay learners for ever, so what level should the comments and code be aimed at?

    Comments in code for CPAN should be designed to ease the job of maintenance programmers. Generally that means few comments and much thought given to appropriate coding, identifiers, documentation and test suites.

    That is not to say either that there isn't a lot to be learned from reading CPAN code. There is a great deal that can be learned from examining CPAN code. But you shouldn't expect it to be entry level stuff.


    Perl's payment curve coincides with its learning curve.

      If the comments are bad that can lead to a great deal of wasted time and frustration.
      This is a very common attitude, and I suppose it must reflect someone's experience, but it doesn't reflect mine. My experience is that even when the comment is in error it gives you a hint about the history of the code, the state of mind of the author, and so on.

      It's a basic skill of reading, if you ask me: you can't take anything at face value, but it doesn't mean it's worthless, either.

        you can't take anything at face value

        That statement is not true about code. If your compiler isn't buggy, the code does exactly what it says it does. And it's the whole reason why code is more important than comments.

        During bug fixing some time ago I came across a comment that was plain wrong - it described the effect of the code as being the opposite of the actual effect. After spending considerable time "fixing" the code I discovered that it had in fact been correct and that the bug I was looking for was in a completely unrelated piece of code.

        In that particular case there was no need for the comment. One reason the code took considerable time to "fix" was that all the identifiers were "wrong". Without the comment the code was correct, consistent and clear. However, the comment was plausible and in the context of the symptoms of the bug the code could well have been incorrect.


        Perl's payment curve coincides with its learning curve.
Re: Why no comments?
by shmem (Chancellor) on Jan 31, 2009 at 11:29 UTC

    Comments in code are for maintainers. They point out parts which need fixing, lay out assertions on which the current piece of code relies, explain unusual constructs and point out side-effects upon other parts far away. Sometimes they refer to a bug ticket. A maintainer has the skills to understand the code without third party help, and comments which explain the code further than that are annoying and obscure the facts for her.

    All other comments on the code live in the documentation, and in the skill of teachers. See also the thread An Introduction to Literate Programming with perlWEB which also discusses the use of comments.

Re: Why no comments?
by roboticus (Chancellor) on Jan 31, 2009 at 14:40 UTC
    targetsmart:

    Your code should be written to be as clear as possible, as it's supposed to adequately explain *what* it does. Normally, the types of comments I use are:

    1. Section breaks to show when I'm "shifting gears" in the code. (Such as "Initialization section", "Process results", "Print reports". These aren't really necessary, but I find them a handy way to segue between sections.)
    2. References to algorithms I intended to implement, if they're unusual. These will normally be in the comment header for the function.
    3. Specific and implied business requirements. This is the most important type of comment in my view. Business logic and programmer logic are quite different at times, so code that looks obviously wrong can be correct and vice versa. If the line of business specifies that you define "Gross Annual Sales" as the number of items you sold last Tuesday, then that's what you have to deliver. (No matter how Marketingesque the logic is.) Implicit requirements are also important--these are the ones where you have to work around flaws in other subsystems or business processes.
    4. Assumptions I'm making for input. (Gee, I assume that all our products are *heavier* than 0kg. I hope we don't start selling Helium balloons or this routine will break!)
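
A minimal sketch of comment types 3 and 4 from the list above, a business rule and an input assumption. The `shipping_cost` routine, its rounding rule and its rate are invented for illustration; they do not come from any module discussed in this thread.

```perl
use strict;
use warnings;

# Hypothetical routine, for illustration only.
# Business rule (assumed): shipping is billed per whole kilogram, rounded up.
# Input assumption: every product weighs more than 0 kg. If we ever sell
# helium balloons, this routine has to be revisited.
sub shipping_cost {
    my ($weight_kg, $rate_per_kg) = @_;
    die "weight must be positive, got $weight_kg\n" unless $weight_kg > 0;

    my $billable_kg = int($weight_kg);
    $billable_kg++ if $billable_kg < $weight_kg;    # round up to a whole kg
    return $billable_kg * $rate_per_kg;
}

print shipping_cost(2.3, 4), "\n";    # 3 billable kg at a rate of 4
```

Neither comment restates what the code does; each records something a maintainer could not recover from the code alone, and the `die` turns the stated assumption into a check.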

    Of course, commenting styles are varied, so this list is strictly my opinion...

    ...roboticus
Re: Why no comments?
by sundialsvc4 (Abbot) on Jan 31, 2009 at 16:56 UTC

    (Shrug...)   “It could be better. It could always be better.”

    I suggest that you might contact the owner of the module in question (privately...) and offer some clarified text and where you'd like to place it. Maybe you could become one of the credited maintainers of the module.

    Maybe your entreaties would be rebuffed, but maybe (likely...) they would be warmly appreciated. After all, as we all know, it's tough to look at your own work from somebody else's eyes; to know what will and will not be clear, or useful, to someone else.

    Tread delicately. There are human egos nearby. Don't name names here; start with private contact.

Re: Why no comments?
by ELISHEVA (Prior) on Feb 01, 2009 at 08:57 UTC
    One reason these commenting discussions get so difficult is that good commenting is really a product of programming experience, and sometimes history. Depending on experience, one person's commenting on goals of the code is another person's commenting on mechanics.

    Consider a block of code consisting of a while loop with internal variables named $high, $low, $mid. I think most experienced programmers skimming the code would assume that the loop is some kind of binary search algorithm. They might appreciate (at most) a very short comment saying #binary search right above the start of the while loop, but anything more than that would be annoying. On the other hand, if there was something weird about the use of a binary search in that context or it really wasn't a binary search, then any of the following comments might be appreciated:

    • #looks like binary search - but isn't - read carefully
    • #binary search - optimized using fiddle foo
    • #yes - there is a standard routine for this, but it is in a CPAN module and at the time of writing (2009-xx-xx) the client has insisted for reasons X,Y,Z that there be no external dependencies

    But each of those comments requires a great deal of context and experience even to construct. A programmer who doesn't know what a binary search algorithm looks like is hardly going to know that he or she needs to comment their code when it looks like one but isn't.
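
The loop described above might look like the following sketch. The `find_index` name and its return convention (-1 when the value is absent) are assumptions made for this example; the only comment inside the loop is the short one suggested in the text.

```perl
use strict;
use warnings;

# Hypothetical helper, not taken from any CPAN module.
# Returns the index of $target in a numerically sorted array ref, or -1.
sub find_index {
    my ($sorted, $target) = @_;
    my ($low, $high) = (0, $#{$sorted});

    # binary search
    while ($low <= $high) {
        my $mid = int(($low + $high) / 2);
        if    ($sorted->[$mid] < $target) { $low  = $mid + 1 }
        elsif ($sorted->[$mid] > $target) { $high = $mid - 1 }
        else                              { return $mid }
    }
    return -1;
}

print find_index([1, 3, 5, 7, 9], 7), "\n";
```

With the variables named $low, $high and $mid, an experienced reader needs nothing beyond the two-word comment, while a learner would still want each step explained elsewhere.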

    On the other hand, I'm sure some of us can remember when we were first learning the ideas behind a binary search. Grasping each step of the algorithm was painful and we very much appreciated our professor or text book explaining each step of the algorithm and why it was done that way.

    As a further thought, what does and doesn't need to be commented changes over time even for the software industry as a whole. Before the discussion of patterns became widespread, it would have been impossible to comment code using a visitor pattern with a one liner #process foo using visitor pattern.

    Best, beth

Re: Why no comments?
by toolic (Bishop) on Jan 31, 2009 at 17:09 UTC
    In addition to the answers already provided, here are two more reasons: money and time.

    How much did you pay for the modules you are using, or will use? My guess is "zero". It is my understanding that most (if not all) of CPAN modules are created by volunteers. Perhaps many authors (such as myself) spent nights and weekends away from work writing tests and POD -- without monetary compensation. There is no army of well-paid Micro$soft hackers churning out CPAN modules on the company's dime. Maybe you get what you pay for :)

    Do you know of any other free software repository in which most of its library components have ample/accurate/coherent comments? Please provide some links.

    most of the CPAN/core modules lacks comments
    That is an amazing claim, considering there are more than 10,000 modules on CPAN.

      If anyone were to submit a patch for my free code consisting solely of comments, I would at least consider applying it - after checking that the comments were correct and meet my arbitrary definition of useful.

      Similarly, I accept doco patches. Comments are really just a type of doco anyway - they're documentation for the maintainer.

Re: Why no comments?
by tilly (Archbishop) on Feb 01, 2009 at 00:50 UTC
    See Re (tilly) 2 (disagree): Another commenting question, for an ancient conversation where I explained my (non)commenting style, gave concrete reasons for it, and explained how that style leads to me writing better code.

    I still stand by what I said then.

    If you wish more enlightenment, I highly recommend picking up a copy of Code Complete and reading it cover to cover. Yes, it is long. Yes, it is detailed. However every last bit of it is highly worthwhile to read, think about, and try to understand.

Re: Why no comments?
by zerohero (Monk) on Jan 31, 2009 at 23:47 UTC

    Internal comments are for maintainers, not for people using the code. Since good/efficient practice is a huge topic, and subject to many larger factors, it makes no sense to embed this kind of general comment in code.

Re: Why no comments?
by gone2015 (Deacon) on Feb 01, 2009 at 02:00 UTC

    It is indeed a sad state of affairs when programmers fail to properly comment as they write code.

    Comments should be regarded as an integral part of programming, at least as important as the data and the code. Comments should cover the purpose and meaning of both data and code -- information that doesn't come from reading the code, which can tell you what it is doing, but not why and to what end.

    The key to designing good data structures and good code is a clear understanding of their purpose and the objectives. It is good discipline to include comments which describe that much. It is excellent discipline to write those comments, at least in draft form, before defining the data or writing the code -- and then refine data, code and comments together. If it's hard to express the purpose of the data or code, what hope is there of being able to implement it effectively ?

    Each subroutine should be commented so that you know what the arguments and return values are and mean, without reading all the code, and looking at everything it calls. You should not have to read all the code to reverse engineer an understanding of what data structures mean and are used for.

    Each data structure, each module and each object should contain comments to describe their meaning and purpose. Comments should also cover any key dependencies, assumptions, limitations, caveats, key algorithms, obscure or key background information, etc.

    Good commenting can:

    • make you describe what is intended. This helps clarify the problem before coding. It also reminds you, three months from now, what you thought you were doing.

      As you review the code and its associated commentary, it can become obvious that there's a mismatch between what was intended and what's been written. During debugging, the commentary can remind you what was intended... so you don't just assume that what it does is right !

    • make you consider edge conditions and check that cases that need to be covered are covered. You should do that anyway, of course, but if you write it down it's more concrete.

    • require you to express and examine assumptions.

    • justify and document the choice of algorithm or method.

    • leave notes on things which were not obvious when you worked out what needed to be done or how to do it -- so you don't need to rediscover this in six months time.

    • and so on.

    The writing of good comments is not a distraction or an overhead. It should reflect what the programmer must think about in any case -- the discipline of writing it down concentrates the mind, and contributes to the quality of the code. Later on, in maintenance and upgrade, good comments speed up the process of understanding.

    I am filled with wonder and admiration for people who can churn out reams of self-documenting code, without having to work out and write down what it is they are doing and why ! Mind you, I prefer to admire them from a (very) great distance.

      I've downvoted this post. Not because it flies in the face of everything that I've come to believe over a long (and I believe deeply thoughtful and curious) career--though it does.

      But because, as with so many of these theses, it attempts to argue and convince, not on the basis of strong pro-logic, nor strong counter-argument to the opposing views. But rather on the basis of: commenting is so obviously good, that not commenting can *only* be the product of laziness and indifference; which is absolutely not the case for many of us that prefer minimal commenting.

      I started my professional coding in assembler, and commenting every line. And every routine started with big, gaudy block comments detailing date, time, author, revisions and reasons, inputs, outputs, side effects, purpose, methods and references et al. Through Bliss, Fortran and C and several others, the level of commenting slowly reduced. Not because I was too lazy to comment, but because as I went back to maintain code--my own and others--I found that the comments were inaccurate, unhelpful or misleading. And sometimes all three.

      Comments made by the original authors--including myself--no longer made sense in the light of

      1. The new purpose to which the routine was being put.
      2. My new level of understanding of the purpose of the routine in the calling code, or my better understanding of the problem the code is to address in the light of the bugs that it accumulated in use.
      3. The fading of the memory (or no knowledge at all of it), of the mindset of the author when the code was written.

      In a nutshell: things that seemed important to the author when writing the particular line, block or subroutine, are no longer significant once the intensity of thought about those lines is no longer fresh in your mind and you are instead caught up in the malaise of the bigger picture. Comments that made sense in the light of a library routine's place in one project, no longer have any significance when that library is re-used in another.

      You make some claims for the advantages of "good commenting". So, I raise you a challenge. Show us an example! And let us pick over its bones. It could be very enlightening.


      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.

        You seem to have taken particular exception because you think I made no real case, but argued:

        ... on the basis of: commenting is so obviously good, that not commenting can *only* be the product of laziness and indifference; which is absolutely not the case for many of us that prefer minimal commenting.

        Well... what I thought I'd done was to describe in broad terms what I consider to be the attributes of good commenting, and then to enumerate what I think are some of the benefits. I fail to see how that is starting from a position of "commenting is so obviously good", far less jump from there to an accusation of "laziness and indifference". I feel you've misrepresented what I said... but, hey ho, it's an imperfect world.

        I agree that there is such a thing as excessive and pointless commenting -- which is made-work both for the original author and for later modifiers. In my assembler days I would despair of programmers who thought that every instruction required a comment saying what it was.

        I'm not sure whether you are arguing for no comments at all, or whether the question is the degree and type of comments ? If the second, then I'd be interested to hear what you think represents "good" commentary.

        However, you asked for an example. OK. I posted this some time ago. I haven't constructed it as a text book example of good commenting -- in any case, like most forms of writing the question is: "who is the audience ?". Alternatively, I can offer ensure.pm. I look forward to seeing them torn to shreds :-) [Looking back at the posting, there is quite a bit of background -- if you just want to look at the module, see here.]


        I will also give an example of where, IMO, there is a shortage of comments. This is from numeric.c in the Perl 5.10.0 source.

      Good commenting can:

      What about s/Good commenting can:/Good coding can:/ ? And that's the main point.

      • make you describe what is intended.

        That can, and should be done with code proper.

      • make you consider edge conditions and check that cases that need to be covered are covered. You should do that anyway, of course, but if you write it down it's more concrete.

        What is better than code which considers edge conditions, naming the conditions by proper set branches and variables? The requirements are (or should be) in the specs. Code is no place for that.

      • require you to express and examine assumptions

        That is what we do all the time while coding Perl. Anything which can't be done that way needs to be cleared beforehand and laid down in the specs.

      • justify and document the choice of algorithm or method.

        The choice of algorithm is made from knowledge. Now, you wouldn't document your knowledge in any (every?) piece of code you write, would you? Algorithms can be named, if they have a name. If they work out to the specs, that's fine. If they don't, they are changed, and describing the "why and whence and how" doesn't add anything to the comprehensibility of the code. The code should be written to be comprehensible in the first place.

      • leave notes on things which were not obvious when you worked out what needed to be done or how to do it

        This is the only reasonable item in the list - quoting Kernighan: "Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it."

        Teaching perl, the first code samples of your pupils look like this:

            open my $fh, '<', $file                          # try to open a file
                or die "Can't open '$file' for reading: $!\n";  # if that fails, terminate with a message
            while (<$fh>) {      # reading a line via <$fh> assigns per default to $_
                s/foo/bar/;      # s/// operates per default on $_
            }

        That's okay, since they want to remember what they have learned. Maintaining code and refactoring is a totally different issue. If you do that, you have to toss around code, and having comments piggyback is - piggyback, and a burden. If every EXPR is written properly; if every STATEMENT is clearly written; if every BLOCK defines a SCOPE containing STATEMENTS which are comprehensible on their own, or as part of the enclosing PACKAGE, or APPLICATION to which they apply, everything is fine.

      I choose to code in perl (also ;-) because that way I have a wealth of expressions at hand which makes - or could make, if I work up to my level of experience - the code self-explanatory. And that's the goal. Anything below that can only be badly mended with comments. They are workarounds. Your program is a technical paper, which should be readable on its own by technicians of the same craft, but not necessarily by pupils (unless you are a teacher) or laymen. Those technicians who can't read it need teaching (by themselves, or training by somebody else), and that fact cannot be mended with comments either.

      I'm surprised to see this node has a negative reputation -- I don't disagree strongly enough with it to downvote it, but I think it sets the 'you must comment!' bar a little high.

      I agree that code/comment mismatches are a problem -- in that case, I'd rather have no comments than bad comments.

      Writing down comments on edge cases doesn't make the code more concrete -- comments don't change the code.

      And I disagree that .. the writing of good comments is not a distraction or an overhead; if the comments get written during development, that's fine. Having to go back and add the comments afterwards is definitely overhead. Sometimes there just isn't time to spend a leisurely week adding comments and generally tightening code up.

      Pull out some code from CPAN (say) and critique it for us .. we'll be glad to give you some feedback. :)

      Alex / talexb / Toronto

      "Groklaw is the open-source mentality applied to legal research" ~ Linus Torvalds

        Writing down comments on edge cases doesn't make the code more concrete -- comments don't change the code.

        Ah, what I was trying to say was that the writing of the comment helps make the thinking more concrete: often I can think I have something clear in my mind, but when I have to write it down, I find that my thinking was sloppier than I thought... if you see what I mean.

        if the comments get written during development, that's fine.

        Absolutely. This is what I was trying to say when talking about the comments being an integral part of the programming process. You have to think about what a chunk of code (or data) is for before you write it... writing some of that down in the form of comment helps the thought process now, as well as having benefits later.

        The writing of comments also has its place when reviewing the code. First, again because the act of writing concentrates the mind. Second, if the code and the comments don't tally, something needs to be thought about !

        I must say I'm taken aback by the response... I get the feeling that there are a number of people who've been traumatised by being forced to eat their greens. Suggesting that greens should form part of a balanced diet seems to produce a gagging response :-)

      I do believe that light commenting is good.

      In your bullets, you make the following points:

      1. helps clarify the problem before coding

        Yes, but sometimes these notes should be kept separate as design documentation. Some of this can also be done by writing tests.

        Here you are saying they have served their purpose already; you need another purpose to shove them into the code.

      2. reminds you ... what you thought you were doing.

        This is true, but it could also be true of the code itself. This can also be done with tests.

      3. [make] obvious that there's a mismatch between what was intended and what's been written

        No, this is like having two compasses that don't agree. Well, if you're the author, it can work until tickling your memory doesn't help.

        Expressing your intentions for some code can be more precisely done by tests.

      4. make you consider edge conditions and check [them]

        Better done by testing.

      5. require you to express and examine assumptions.

        Better done by testing.

      6. justify and document the choice of algorithm or method.

        Agreed.

      7. leave notes on things which were not obvious when you worked out what needed to be done or how to do it

        What is being done and how it is being done is captured in the code; the mindset of a maintainer with the artifact of the code before him may not mirror that of the writer who sat down with no code in front of him.

      8. and so on.

        May your comments not be so vague, :-) and testing will probably help with whatever this is :-).
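
As a rough illustration of "better done by testing", here is a sketch using the core Test::More module. The `clamp` function and its intended behaviour are assumptions invented for this example; the point is only that each edge condition is written down as a named, executable check instead of a comment.

```perl
use strict;
use warnings;
use Test::More tests => 3;

# Hypothetical function under test, invented for this example.
# Intent: force a value into the inclusive range [$min, $max].
sub clamp {
    my ($value, $min, $max) = @_;
    return $min if $value < $min;
    return $max if $value > $max;
    return $value;
}

# Each test name states the intent that a comment would otherwise carry.
is(clamp(5, 0, 10),  5,  'value inside the range is unchanged');
is(clamp(-1, 0, 10), 0,  'value below the range is raised to the minimum');
is(clamp(11, 0, 10), 10, 'value above the range is lowered to the maximum');
```

Unlike a comment, a test that disagrees with the code fails when it is run, so the disagreement cannot go unnoticed.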

      Be well,
      rir