
Preferred method of documentation?

by Anonymous Monk
on Oct 05, 2004 at 21:01 UTC

One of the requirements for a project I'm working on is "good documentation". The project is a module about 1200 lines long. It seems to me that the most useful documentation for a maintenance programmer consists of good comments placed throughout the code. I'm sure there are other opinions, and I'd like yours.

What do YOU do for documentation?

janitored by ybiC: Retitle from "Documentation" because onewordnodetitles hinder site search

Replies are listed 'Best First'.
Re: Preferred method of documentation?
by dragonchild (Archbishop) on Oct 05, 2004 at 21:55 UTC
    Documentation comes in two flavors - developer docs and user docs. For library-type code, like CGI or DBI, they may be more similar than, say, for Template or Mason.

    User docs are the easy one, conceptually. You need to do something similar to Microsoft's helpfiles. Whatever you may feel about their products or practices, they are really good in their docs, for the average user.

    Developer docs are more specific. I would want to know the following:

    • What requirements does this satisfy?
    • What were the architecture choices? Why was this one chosen?
    • What is the API? If possible, who uses the API? How much of the API is inviolable?
    • What are the assumptions? Preconditions and postconditions are an excellent way of putting it.
    • What were the implementation choices? Why was this one chosen?
    • Most importantly, where are you going with this? What are the intended future directions? Why?
    • What tests does this pass? How are the tests run? How often?
    • What is the build process? How is the build run? How often?

    These are all the same questions you would ask if you were handed 1200 lines of code and told "Make XYZ changes to it".

    Being right, does not endow the right to be rude; politeness costs nothing.
    Being unknowing, is not the same as being stupid.
    Expressing a contrary opinion, whether to the individual or the group, is more often a sign of deeper thought than of cantankerous belligerence.
    Do not mistake your goals as the only goals; your opinion as the only opinion; your confidence as correctness. Saying you know better is not the same as explaining you know better.

      Nice post dragonchild. For large applications that need this kind of documentation, I've found the ReadySET templates to be quite helpful. The project provides html templates for all of the questions you ask and more.

Re: Preferred method of documentation?
by FoxtrotUniform (Prior) on Oct 05, 2004 at 21:10 UTC

    POD headers for a module's interface are always nice - describing stuff like assumptions (preconditions) and guarantees (postconditions) of functions. Otherwise, I try to write code that doesn't need much commenting. Well-chosen symbol names, common idioms, and a consistent coding style that errs on the side of whitespace are far more valuable than comments that tell you what each line of code does (or more likely, what it did eight revisions ago when the comment was written).
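As a minimal sketch of a POD header that states assumptions and guarantees (the function `average` and its wording are made up for illustration, not taken from this thread), the precondition can be documented and then enforced in the code:

```perl
use strict;
use warnings;
use Carp qw(croak);

=head2 average(@numbers)

Return the arithmetic mean of @numbers.

Precondition: @numbers contains at least one number.
Postcondition: the result lies between the smallest and largest input.

=cut

sub average {
    my @numbers = @_;
    # Enforce the documented precondition instead of dividing by zero.
    croak "average() needs at least one number" unless @numbers;
    my $sum = 0;
    $sum += $_ for @numbers;
    return $sum / @numbers;
}

print average(2, 4, 6), "\n";   # prints 4
```

Because the POD sits directly above the sub, a change to the precondition and a change to the check are likely to be made in the same edit.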

    --
    F o x t r o t U n i f o r m
    Found a typo in this node? /msg me
    % man 3 strfry

Re: Preferred method of documentation?
by jplindstrom (Monsignor) on Oct 05, 2004 at 22:07 UTC
    This is an example of a well documented method:

    =head2 insertRows($nameTable, $raRow)

    Insert the data in $raRow (array ref with hash refs (key: field
    name, value: field data)) into $nameTable.

    Return 1 on success, else 0. Die on fatal errors.

    =cut

    sub insertRows {
        my $self = shift;
        my ($nameTable, $raRow) = @_;

    It describes the interface to the method. So you don't have to read the code to know how to use it. But you need to keep it in sync with the implementation, so the POD should be next to the method (Locality breeds Maintainability).

    If you write the interface description before you start coding the method, chances are that you'll think one more time about edge cases, error conditions and stuff like that. Personally I find that a very useful way of organizing my thoughts before diving in.

    Names of classes, variables, and methods should be meaningful and consistent. That way you don't need so many comments. Try to avoid abbreviations.

    I find it useful to include the "domain specific data type" in the variable name to indicate what it contains, especially when using dynamically typed languages.

    $timeout   vs $secondsTimeout
    $table     vs $nameTable
    $workspace vs $dirWorkspace

    When you do comment, note the why, not the what. Write down assumptions, thoughts and non-obvious design decisions.

    Having said that, it is ok to say what a regexp does, because it's often not obvious. It's useful to provide a sample of what you're trying to match. Always use /x to make it more readable.
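A small sketch of that advice, assuming a made-up timestamp format and a hypothetical helper named `parse_timestamp`: the sample input is written next to the pattern, and /x leaves room for a comment on each part.

```perl
use strict;
use warnings;

# Sample input we want to match: "2004-10-05 21:01"
my $timestamp_re = qr{
    ^
    (\d{4}) - (\d{2}) - (\d{2})   # date: year, month, day
    \s+
    (\d{2}) : (\d{2})             # time: hour, minute
    $
}x;

# Return the five captured fields, or nothing if the text does not match.
sub parse_timestamp {
    my ($text) = @_;
    my @parts = $text =~ $timestamp_re or return;
    return @parts;
}

print join(" ", parse_timestamp("2004-10-05 21:01")), "\n";   # prints 2004 10 05 21 01
```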

    If you look at CPAN, each module has a SYNOPSIS section with short examples. These are very powerful.

    Apart from all other benefits with unit tests, they also provide great hands-on documentation, like a SYNOPSIS on steroids.
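To illustrate how tests can double as documentation, here is a short sketch using the core Test::More module. The function `seconds_to_hms` is hypothetical and defined inline so the file runs on its own; each test line reads as a worked example of the interface.

```perl
use strict;
use warnings;
use Test::More tests => 3;

# Hypothetical function under test, defined here only to keep the sketch self-contained.
sub seconds_to_hms {
    my ($secondsTotal) = @_;
    my $hours   = int( $secondsTotal / 3600 );
    my $minutes = int( ( $secondsTotal % 3600 ) / 60 );
    my $seconds = $secondsTotal % 60;
    return sprintf "%02d:%02d:%02d", $hours, $minutes, $seconds;
}

# Each test is also an example of how to call the function and what to expect.
is( seconds_to_hms(0),    "00:00:00", "zero seconds" );
is( seconds_to_hms(59),   "00:00:59", "under a minute" );
is( seconds_to_hms(3661), "01:01:01", "hours, minutes and seconds" );
```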

    /J

      I most wholeheartedly agree with the parent. I'd like to add, however, that a POD section granting an overview of the module is most helpful as well.

      Poke around on the CPAN docs -- many of the modules there are well-documented. Combine that style with that of jplindstrom above, and you have superb documentation that is both in-code and extractable using any one of the pod2 tools.

      The existence of pod2html also means that your documentation will be intranet-publishable, which looks great to your project manager.

      radiantmatrix
      require General::Disclaimer;
Re: Preferred method of documentation?
by bwelch (Curate) on Oct 06, 2004 at 13:11 UTC
    My rule for documentation is that I should be able to look over the code in a few years and see easily what the code does and whether all or part of it is useful to the task at hand.

    This leads me to a style similar to that of documenting new projects in sections:

    • Purpose: Summary of what the script does and why
    • Scope: Who is this for? Developers? Users? How much of whatever task is assumed versus handled?
    • Input and output requirements: XML input?
    • Dependencies: external programs, file formats, databases, embedded references to paths, etc.
    • Usage: How should I execute it?
    • Change history: This is more for keeping track of updates and debugging purposes but can be very helpful, especially so if more than one developer edits the script. This might be part of CVS or could be at the top of the script. An example:
      # 08/26/04 BW Changed doAssemblyJobs to read bait from file versus DB
      # 08/25/04 BW Added indexing and formatdb of gss files.
      # 08/17/04 BW Moved logic of buildMSH.pl into this script
    For documentation of individual functions that aren't completely obvious, I tend to add a couple of lines of comments to indicate algorithms or other unusual behavior. With good variable names, this isn't needed much. An example:
    # Generic job monitor for watching jobs finish.
    # - No action is taken when each job finishes.
    # - Function returns when all jobs have finished.
    # - Inputs: A hash of job IDs from LSF
    # - Returns: No specified return code

    One thing I try to remember is that while things may seem obvious while writing and testing the code, a year or so in the future they won't be. Commenting with that in mind has helped me out in many situations.

Re: Preferred method of documentation? (One comment)
by BrowserUk (Patriarch) on Oct 05, 2004 at 21:35 UTC
    ...is a module about 1200 lines long....

    If it's perl, and that is 1200 executable lines...

    It's too damn big :)

    Humour (hopefully) with a point. It's hard to believe that it takes 1200 lines to do any one thing in Perl, and a module ideally should do only one thing.

    Breaking out a few sub-modules from the main one allows whatever documentation is included to concentrate on describing the function of that code, and not mix stuff up.


    Examine what is said, not who speaks.
    "Efficiency is intelligent laziness." -David Dunham
    "Think for yourself!" - Abigail
    "Memory, processor, disk in that order on the hardware side. Algorithm, algorithm, algorithm on the code side." - tachyon
      PDF::Template is about 2500 lines. Excel::Template is around 1800 lines. Both are unfinished, meaning they're going to grow. They arguably do only one thing and (I hope!) do it well.

      Heck, DBI is a helluva lot more than 1200 lines, and so is CGI.

        Without looking, I'd bet that there are one or more subsections of each that are not used by every use of the main module and that these could be factored out into sub-module(s) and required when required rather than loading everything always.

        CGI.pm certainly could do this, even allowing for the fact that it generates a lot of its functions on demand. I fully understand the motivations and temptations for people that want to "just parse the options string and print a header or two". If the major :xxx load options to CGI.pm effectively turned into

        require CGI::HTML4    if $opts{ html4 };
        require CGI::download if $opts{ download };

        I don't think it would do any harm at all.
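A runnable sketch of the load-on-demand idea follows. The option hash and its flag names are invented for illustration, and core modules stand in for the imagined CGI sub-modules so that the example works on any Perl installation.

```perl
use strict;
use warnings;

# Hypothetical option flags; List::Util and Data::Dumper are stand-ins,
# not the CGI sub-modules discussed above.
my %opts = ( sum => 1, dump => 0 );

# require runs at run time, so a module is compiled only when its flag is set.
require List::Util   if $opts{sum};
require Data::Dumper if $opts{dump};

# No import happened, so call the function by its fully qualified name.
print List::Util::sum( 1, 2, 3 ), "\n" if $opts{sum};   # prints 6

# %INC records which module files have actually been loaded.
for my $file ( 'List/Util.pm', 'Data/Dumper.pm' ) {
    print "$file: ", ( exists $INC{$file} ? "loaded" : "not loaded" ), "\n";
}
```

Note that require takes a bareword module name here; given a quoted string, it treats the string as a file name rather than a module name.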

        The only downside of that is that Perl doesn't support a use Foo::*; syntax, though writing a use wild 'Foo::*'; pragma would be possible I think.

        I have a general preference for lots of small files, rather than 1 or 2 huge ones. With editors that can hold multiple files and provide search/replace/index across all workspaces, I find the ability to be working in several places at once in different files very useful. My editor also supports having multiple views of a single file, but I very rarely use it. Years ago I used a very excellent folding editor which encouraged you to put lots of stuff in a single file and use a collapsed view to navigate. It worked great, but then compiling became a chore.

        I like to keep as much as possible to do with one unit of code in the same file as I can. Code. Interface (user) docs. Brief modification history. Unit tests. This becomes unwieldy where the units are too all-encompassing.

        For a very large OO module, I would seriously consider putting the class definition, initialisation, constructors and destructors into one file and moving the methods into one or more separate files and loading them as procedures.

        Anyway, it's just a preference, not an edict. The OP will doubtless make up their own mind on the matter.


        I wouldn't give CGI.pm as an example of "doing one thing"...
      What do you think of all those nice ::Util modules? :) Not that I disagree with your opinion on big chunks of code in one file.

        It depends upon how you look at them. From the procedural point of view List::Util is a loosely related collection of disparate functions.

        If you take the OO view, then Perl's arrays are a generic, ordered container class, and List::Util provides a set of extra methods.

        If only it were so easy to add a whole raft of new methods to every ordered collection type in the STL's of most other OO languages.

        I have to say that List::Util is my all-time, number-one, most used module. The only things it lacks from my perspective are mapn (also known as mapcar) and mapNbyM (to be known as zip in p6).
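For readers unfamiliar with the zip idea mentioned here, this is a hand-rolled sketch for two lists. The name `zip_pairs` and the choice to pad the shorter list with undef are assumptions made for this example, not the behaviour of any particular module.

```perl
use strict;
use warnings;

# Pair up the elements of two lists by index.
# Takes two array references; the shorter list is padded with undef.
sub zip_pairs {
    my ( $raLeft, $raRight ) = @_;
    my $last = $#$raLeft > $#$raRight ? $#$raLeft : $#$raRight;
    return map { [ $raLeft->[$_], $raRight->[$_] ] } 0 .. $last;
}

my @pairs = zip_pairs( [ 1, 2, 3 ], [ 'a', 'b', 'c' ] );
print join( ", ", map { "$_->[0]=$_->[1]" } @pairs ), "\n";   # prints 1=a, 2=b, 3=c
```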


Re: Preferred method of documentation?
by pg (Canon) on Oct 06, 2004 at 01:01 UTC

    Documentation has to be useful, not written just because someone wants you to do it. I agree with you that comments in the source code are a very useful kind of documentation, but they are not enough.

    First, there is a question of timing: much of your documentation comes before you have any code. For example, your design document is a way to communicate your design to others so that they can review it. You don't want to code for failure.

    Comments in your source code are good for describing things that are local to one module, one source file or whatever, but they cannot describe things that span modules.

    Comments in source code also have to be useful. The last thing you want to do is repeat your code in natural language. If repeating is all you want to do, then don't bother to comment, as your code is most likely more precise.

    Don't do this:

    # increments counter by 1
    $counter++;
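By contrast, a comment earns its place when it records a reason the code cannot express. A small sketch with made-up sample data:

```perl
use strict;
use warnings;

my @lines = ( "name,qty", "apple,3", "pear,5" );   # made-up sample data

# Start at 1, not 0: the first line is the column header, not a data row.
my $counter = 1;
my $total   = 0;
while ( $counter < @lines ) {
    my ( undef, $qty ) = split /,/, $lines[$counter];
    $total += $qty;
    $counter++;
}

print "total quantity: $total\n";   # prints total quantity: 8
```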

    I always document, sometimes even for one-line changes.

Re: Preferred method of documentation?
by ajt (Prior) on Oct 06, 2004 at 15:01 UTC

    A long time ago I asked the same question, "How to write documentation?"

    You may wish to read that thread; I believe that many of the comments I got are still valid today.


    --
    ajt
Re: Preferred method of documentation?
by dimar (Curate) on Oct 06, 2004 at 18:20 UTC
    T(always)MTOWTDI

    For every pair of eyeballs, there is a different preferred way of seeing documentation. The spectrum seems to range from 'More is better' to 'Less is More' to 'The Source Code is the Docs'. This seems to be tied to individual learning and cognitive styles, therefore one is not very likely to discover any 'single right way' to do it. The variation in individual preference is almost like a (mutable) fingerprint.

    a personal example

    Here is an example of something I have been using, which has worked well for me.

    ### <region-file_metadata>
    ### main:
    ###   - name : tryPopupDialog
    ###     sbty : perl script
    ###     desc : here we try to pop up an input dialog box using perl
    ###     date : Wednesday, October 06, 2004
    ### see_also:
    ###   - caption : tk documentation
    ###     href : href="c:/docs/documentation/tk_docs.htm"
    ### generator:
    ###   - caption: this file was generated using a skeleton template
    ###     href: href="c:/docs/new_file/new_perl_script.pm"
    ### </region-file_metadata>

    Highlights: 1) It's eyeball friendly and machine readable (YAML); 2) The 'href' items work like hyperlinks that I can click and 'jump to' in my text editor or IDE; 3) The code uses as much auto-generation as possible so I do not have to retype things like the filename or the date; 3.5) The outline is flexible, and can be more or less detailed based on the project (you are not limited to the 'main' and 'see_also' categories; put whatever you want in there); 4) The machine-readable format makes it easy to convert to POD, JavaDocs, Python Docstrings, or whatever other 'fingerprint' I have to match to make whatever project manager happy ;-) 5) The basic format is universally compatible with any programming environment that supports comments.
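To make the "machine readable" point concrete, here is a rough sketch that pulls simple key/value pairs out of a header in the spirit of the one above. The function name `read_metadata` is made up, and this is a naive line scanner that ignores nesting, not a YAML parser.

```perl
use strict;
use warnings;

# Sample header held in a string so the sketch is self-contained.
my $source = <<'END_SOURCE';
### <region-file_metadata>
### main:
###   - name : tryPopupDialog
###     desc : here we try to pop up an input dialog box using perl
### </region-file_metadata>
print "hello\n";
END_SOURCE

# Collect "key : value" pairs found between the region markers.
sub read_metadata {
    my ($text) = @_;
    my %meta;
    my $inRegion = 0;
    for my $line ( split /\n/, $text ) {
        if ( $line =~ m{^###\s*<region-file_metadata>} )  { $inRegion = 1; next }
        if ( $line =~ m{^###\s*</region-file_metadata>} ) { last }
        next unless $inRegion;
        # Lines such as "### main:" carry no value and are skipped.
        if ( $line =~ /^###\s*-?\s*(\w+)\s*:\s*(\S.*?)\s*$/ ) {
            $meta{$1} = $2;
        }
    }
    return \%meta;
}

my $meta = read_metadata($source);
print "$_ = $meta->{$_}\n" for sort keys %$meta;
```

A real converter to POD or another format would want a proper YAML module to respect the nesting; the scanner only shows that the comment prefix is easy to strip.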

    consistency

    One discovery that no one else has mentioned, however, is that if you are serious about "good documentation" an important consideration is *consistency*. The reason is that as you evolve and grow as an individual, so will your working style. As your working style evolves, you will go through different 'phases' ... each of which will be apparent in your documentation (among other things). It may sound like a contradiction, but as you change in each 'phase', make sure you stay consistent. In other words, you should be able to go back over your work through the years and identify whatever 'phase' you were in when you created it. "That's the phase where I thought that comments on every line were essential" ... "That's the phase where I preferred minimalism" (and so forth).

    nuts-and-bolts tips for the trenches

    Here are some practical tips that may (or may not) be worth your consideration.

    • Leverage the power of 'automatic boilerplate' so you don't have to type everything out by hand each time. I've never seen a consistently-documented work product that didn't have some kind of 'keystroke-saving-fill-in-the-blank-style-time-saver' for the people doing the work. Obviously, Perl is great for this kind of thing.
    • Leverage the power of 'hyperlinks'. For any sizeable project, there will inevitably be bits and pieces of inter-dependent, semi-consistent documentation strewn about. Find a way to make connecting the dots as *easy as possible* for your working style. If you don't have something brain-dead simple, you won't use it. CTAGS is an example approach; there are many other ways to do it.
    • Learn to distinguish when documentation is unnecessary. Some work is like a quick grocery shopping list that you will never use more than once. Other work is like a Physics Textbook that you intend to sell internationally. Obviously, the level of necessary documentation is contingent upon your ultimate goal.
    • Try to find a nice balance between 'eyeball readable' and 'machine readable'. If you have a documentation system that is 'eyeball friendly', as well as easily parsed by some kind of program or script, and you are satisfied with it, you have reached a major milestone.
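The 'automatic boilerplate' tip can be sketched as a tiny generator that fills in the file name and date so nobody has to type them. The function name `make_header` and the exact set of fields are assumptions; the layout borrows from the comment header shown earlier in this reply.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Build a comment header for a new script, filling in today's date automatically.
sub make_header {
    my ( $nameFile, $description ) = @_;
    my $date = strftime( "%A, %B %d, %Y", localtime );
    return join "\n",
        "### <region-file_metadata>",
        "### main:",
        "###   - name : $nameFile",
        "###     desc : $description",
        "###     date : $date",
        "### </region-file_metadata>",
        "";
}

print make_header( "tryPopupDialog", "pop up an input dialog box" );
```

Hooking something like this up to an editor macro or a "new file" script is what keeps the headers consistent across a project.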
Re: Preferred method of documentation?
by l3nz (Friar) on Oct 07, 2004 at 06:27 UTC
    I more or less agree with what has been said in this thread; technical documentation is definitely important (and I'd use POD, so you get a nicely formatted document in no time).

    What is usually lacking in other people's docs is the "why": you find a lot of information on all possible API calls but very little on why the module was developed and on the general problem space it was trying to solve.

    Simple quick-and-dirty usage examples also help get started. (For a good example of documentation, see the LWP cookbook - first you get started, then you read the docs when you try to do fancier things).

    Simple things should be easy, and documentation should help.

Re: Preferred method of documentation?
by zentara (Archbishop) on Oct 06, 2004 at 13:45 UTC
    I have been seeing a general trend toward using pdf docs in all sorts of software. It's easily printed. Pod isn't easily printed, and only Perl programmers seem to know about it.

    I'm not really a human, but I play one on earth. flash japh

      Pod can be easily converted to troff, html, plain text, or latex. Any of those can be printed, and are well-known formats.
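As a quick illustration of the conversion, the core Pod::Text module can render POD as plain text from within a script. The POD content below (including the module name Demo::Module) is made up for the sketch.

```perl
use strict;
use warnings;
use Pod::Text;

# A small POD document held in a string; Demo::Module does not exist.
my $pod = <<'END_POD';
=head1 NAME

Demo::Module - a made-up module used only for this sketch

=head1 SYNOPSIS

    use Demo::Module;

=cut
END_POD

my $parser = Pod::Text->new( width => 72 );
my $text   = '';
$parser->output_string( \$text );        # collect the output in a string instead of STDOUT
$parser->parse_string_document($pod);

print $text;
```

The same source can be fed to the other converters, such as the pod2html and pod2man command-line tools, to produce printable or publishable output.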

Re: Preferred method of documentation?
by buttroast (Scribe) on Oct 09, 2004 at 19:43 UTC

    In college they told us that in the "real world", you have to have

    Thanks buttroast
Re: Preferred method of documentation?
by Anonymous Monk on Oct 11, 2004 at 17:21 UTC

    Consider POD first (for Perl), then maybe XML/XSLT. Either one will export easily to other formats.

    I have a modest Perl/Tk program of some umpteen modules. In all it runs to 3629 lines of code, 1052 comment lines and a longish POD tagging along at the end. It is available on-line here...

    http://starling.us/tet/gus_perl/#GUS-1

    The in-line comments are for me to know which subs do what. Any snippet of Perl, however small, gets a comment if come next year I might have forgotten what it is for. The POD I usually aim at the user...or as a general overview. For the user I'll also convert the POD into HTML for display online.

    Inside the link cited above is a button labeled "POD as HTML" which is what it says it is. But to save you the trouble of pressing it, I give the same below...

    http://starling.us/tet/gus_perl/gus_rpc_edit_pl/gus_rpc_edit.html

    Such as it is, that is the method which works for me in writing Perl. For absolutely everything else I default to XML/XSLT. The link cited above was exported to HTML from XML/XSLT. If you like, I have something on XML/XSLT here...

    http://starling.ws

      Oops! I failed to observe that my login failed before replying. So it says "Anonymous Monk". So that blame may fall where due, I confess. It was I. Gan Starling

        Change your theme and it will be easier to realize you're not logged in.

        "Cogito cogito ergo cogito sum - I think that I think, therefore I think that I am." Ambrose Bierce
