A guide to coding Perl

These are my programming guidelines. They are idiosyncratic. This list is taped to my monitor. Since my team is just me, my list is more of a cheat sheet.

I revise this list fairly often, a few times a year. It lists what I consider important in coding, what I need reminding of, and found tidbits to be integrated somehow/where/time.

The list items are short so the list can live on the bezel of my monitor, and so meanings may become overloaded. Here I have added annotations to clarify my abbreviated notes.

The list order is significant. Things will creep on and off the list as their applicability to my code is discovered or as I integrate the item into my code or mindset. For instance, "Avoid premature optimization" would be on this list, but that is not a problem for me. I've sworn off.

1. Do your best

This just cannot be overemphasized. I don't feel I have a problem with doing quality work, but bringing the issue up focuses me on the task. I am less apt to wander off into some other activity.

2. Test

Proper function of the code is the primary issue. Best & test just go together: assertion & evidence, spaghetti & sauce.

I find I never test too much. My bad habit is to throw away little tests that could go into the testing regime.

Test as play or try. Not sure how pack() works? Try it, see how it works. Often I see questions here about the usage of a function or operator that show enough insight that I wonder why the questioner didn't just write a short page of code and see for himself. Having or developing this quality is a key element of being a good programmer. It shows the right attitude, a pleasure in exploration, debugging and discovery.
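
A throwaway experiment of the kind described above might look like this. It is only a sketch: the template and the values are arbitrary choices for illustration.

```perl
use strict;
use warnings;

# Try it and see: pack three values, then look at the raw bytes.
# A3 = 3-char space-padded string, n = 16-bit big-endian, C = unsigned byte.
my $bytes = pack('A3 n C', 'abc', 258, 65);

# Hex dump of what pack() produced.
my $hex = unpack('H*', $bytes);
print "packed: $hex\n";

# Round-trip to confirm the template reads back what was written.
my ($str, $short, $byte) = unpack('A3 n C', $bytes);
print "unpacked: $str $short $byte\n";
```

A script like this is also a candidate for the test suite rather than the trash.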

Test as audition. I thought one day: why am I using double-quotes all the time? This is just a C string literal habit. So I tried using single-quotes until I knew the advantages of using double-quotes.

3. Code clearly

Readability is the second issue after function. (Hi, Ovid)

4. # to clarify

If you can't make the code clear, comment to clarify.

5. Gen code

Generate code if possible. Generated code is more consistent, more efficient to produce, and better planned than ad hoc code. The act of generating code forces a reasonable amount of planning and structure onto the process.
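
As a rough illustration of generating Perl with plain Perl, the sketch below emits accessor subs from a small table. The package name `Book`, the field table and the helper `generate_accessors` are all hypothetical, made up for this example.

```perl
use strict;
use warnings;

# Hypothetical metadata: every fact about the fields lives in one table.
my @fields = (
    { name => 'title', default => q{'untitled'} },
    { name => 'pages', default => 0 },
);

# Emit one accessor per field from a single template.
sub generate_accessors {
    my ($package, @specs) = @_;
    my $code = "package $package;\n";
    for my $f (@specs) {
        $code .= sprintf(
            "sub %s { my \$s = shift; \$s->{%s} = shift if \@_; "
          . "defined \$s->{%s} ? \$s->{%s} : %s }\n",
            $f->{name}, $f->{name}, $f->{name}, $f->{name}, $f->{default},
        );
    }
    return $code . "1;\n";
}

my $source = generate_accessors('Book', @fields);
print $source;    # inspect it, or write it out to a file
eval $source or die "generated code failed: $@";

my $book = bless {}, 'Book';
$book->pages(120);
```

A bug in the template is fixed once and every generated accessor gets the fix.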

It also tends to collect important information in one place. This harks back to Brooks' show me your data ....

6. Doc I/O & intent

Document the input and output and the intent of functions and modules. This is the coder's translation of specifications. These comments should treat the code as a black box.
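
One possible shape for such a black-box comment is sketched below; the function `mean` and the exact layout of the comment are illustrative assumptions, not a prescribed format.

```perl
use strict;
use warnings;

# mean( @numbers )
#   Input:  a list of numbers (may be empty).
#   Output: the arithmetic mean, or undef for an empty list.
#   Intent: give callers one place to get an average without
#           having to guard against division by zero themselves.
sub mean {
    my @numbers = @_;
    return undef unless @numbers;
    my $sum = 0;
    $sum += $_ for @numbers;
    return $sum / @numbers;
}

my $average = mean(2, 4, 9);
print "$average\n";
```

Nothing in the comment says how the sum is computed, so the body can change without the comment going stale.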

7. use_names_like_this

It's the Perl way and very readable. Occasionally I'm tempted to save space with modernNameStyles so this is high on the list.

8. tidy

perltidy the code. An automatic cleaning may not shine like a hand-rubbed polish, but I'm a geek, I believe in automation and I have other things to do.

Also being neat and tidy is a good thing. Well, I've heard that, maybe someday I'll know firsthand.

9. q

Double quotes are the right thing to use most of the time. However when generating Perl with Perl, I should use the generic quoting mechanisms more often.
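
A small sketch of why the generic quoting operators help when the generated code itself contains quotes. The sub name `greet` and its body are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical name for the generated sub.
my $name = 'greet';

# q{} : no interpolation, so the generated code's $who, "..." and '
# all survive exactly as written.
my $body = q{
    my $who = shift;
    return "Hello, $who, it's nice to see you";
};

# qq{} : interpolates $name and $body, with no quote characters to escape.
my $source = qq{sub ${name} {$body}};

# Compile the generated text and fetch a reference to the new sub.
my $greet = eval "$source; \\&$name" or die "bad generated code: $@";
print $greet->('world'), "\n";
```

With ordinary double quotes every `$` and `"` in the body would need a backslash.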

10. need less

Good life advice generally.

To code beyond the current requirements is to ornament your code pointlessly. Some folks think this is slovenliness or shirking. They think you need to anticipate future growth of the code. They tend to waste time building features or infrastructure that is never used or needed.

Code to the current need. Don't make up additional specs.

Also, need fewer modules in code meant for distribution. Be quick to use modules, but also consider factoring them out of your code.

o- ci

Is it time to submit to the CVS repository?

o- Use \s not tmp vars

This is somewhat experimental for me. A post by merlyn has me raising the bar greatly on meaningfulness of variable names. The idea is to use whitespace to format longer expressions instead of using meaningless variables to split large expressions into multiple statements.
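
A sketch of the idea, using made-up order data: the commented-out version needs two meaningless names, while the formatted version keeps one expression and one meaningful name.

```perl
use strict;
use warnings;

# Hypothetical order data, for illustration only.
my @orders = (
    { customer => 'ann', total => 40, paid => 1 },
    { customer => 'bob', total => 15, paid => 0 },
    { customer => 'ann', total => 25, paid => 1 },
);

# With throwaway names the pipeline is chopped into statements
# whose variables carry no meaning:
#   my @tmp  = grep { $_->{paid} } @orders;
#   my @tmp2 = map  { $_->{total} } @tmp;
#
# With whitespace the same expression stays whole and still reads
# one step per line (bottom to top):
my @paid_totals =
    sort { $b <=> $a }
    map  { $_->{total} }
    grep { $_->{paid} }
    @orders;

print "@paid_totals\n";
```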

o- Catch signals.

Some of what I'm writing should. Often what I write doesn't require this type of robustness, so I'm reminding myself.
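
A minimal sketch of catching signals, assuming a Unix-like system. The handlers only set a flag and the loop checks it between items; `process_item` is a hypothetical stand-in for real work, and the script signals itself purely for demonstration.

```perl
use strict;
use warnings;

# Flag set by the handler; the main loop decides when to act on it.
my $shutdown_requested = 0;

# Keep handlers tiny: record the event and return.
$SIG{INT}  = sub { $shutdown_requested = 1 };
$SIG{TERM} = sub { $shutdown_requested = 1 };

# Hypothetical unit of work standing in for the real job.
sub process_item { my ($item) = @_; return $item * 2 }

my @done;
for my $item (1 .. 5) {
    last if $shutdown_requested;     # stop between items, not mid-item
    push @done, process_item($item);
    kill 'INT', $$ if $item == 3;    # simulate Ctrl-C for the demo
}

print "finished ", scalar(@done), " items before shutdown\n";
```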

o- Document with pod.

I do, but I rely too much on the code and my memory.

o- 2 == $var not $var == 2

An old C-ism: put the constant first, thereby catching any = typed in place of ==.
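
The difference can be seen in a short experiment. This is a sketch; the string eval is only there so the compile error can be captured instead of killing the script.

```perl
use strict;
use warnings;

my $var = 5;

# The typo compiles and is always true, because it assigns 2 to $copy.
my $copy       = $var;
my $typo_taken = 0;
{
    no warnings;    # warnings may otherwise flag this assignment
    $typo_taken = 1 if ($copy = 2);
}

# With the constant first, the same typo cannot compile at all.
my $ok            = eval q{ if (2 = $var) { 1 } ; 1 };
my $compile_error = $@;

print "typo branch taken: $typo_taken, \$copy is now $copy\n";
print "constant-first typo rejected: $compile_error";
```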

o- id lgth =~ scope & presence.

Identifier name length is bound to scope and presence.

This is important, but we're getting to things that are second nature for me. Big long names go with large scopes, short names with small scopes. The exceptions are the pervasive variables that are central to a process and pop up again and again. $_ is the archetypal example. If you want to write $accounts_payable thousands of times, fine; I'll just use $ap. This assumes that $ap has to be understood by anyone who has a clue about the codebase.

o- perldoc something!!?

I know this list pretty well, maybe I'd be better off reading a perldoc.

o- flyweight objects!?

I'm ignorant but I plan to check these out sometime.

o- use strict

I always do or this could be number two on the list.

o- use warnings

Ditto.
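
For anyone wondering what these two items buy, here is a small sketch that compiles a snippet containing a misspelled variable and captures the complaint from strict. The snippet and its variable names are invented for the example.

```perl
use strict;
use warnings;

# Compile a snippet with a misspelled variable and capture the complaint.
my $snippet = q{
    use strict;
    my $total = 10;
    $totl = $total + 1;    # typo: $totl was never declared
    1;
};

my $compiled = eval $snippet;
my $error    = $@;

print defined $compiled ? "compiled\n" : "strict refused it: $error";
```

Without strict the typo would quietly create a new global and the bug would surface somewhere else.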

o- exit with meaning

Like handling signals, I should keep this in mind.
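
One way to keep exit statuses meaningful is to name them in a single table. Everything below is a hypothetical sketch: the outcome names, the numbers (apart from 0 meaning success by convention) and the `run` routine are made up.

```perl
use strict;
use warnings;

# Hypothetical exit codes for a small command-line tool.
my %EXIT = (
    ok          => 0,
    usage       => 2,
    input_error => 3,
    io_error    => 4,
);

# Map a named outcome to its status, falling back to a generic failure.
sub exit_status {
    my ($outcome) = @_;
    return exists $EXIT{$outcome} ? $EXIT{$outcome} : 1;
}

# Hypothetical main routine returning an outcome name instead of exiting.
sub run {
    my @args = @_;
    return 'usage' unless @args;
    return 'ok';
}

my $status = exit_status( run(@ARGV) );
print "would exit with status $status\n";
# A real script would end with: exit $status;
```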

o- No pod & code.

Abigail-II's "don't interleave POD and code" is obvious when you think about it. Having used here docs and such, I'm not sure that I'd even support a "=begin comment" usage. The ugly hashes flowing down the left gutter sure do differentiate block comments from code.
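
A sketch of the non-interleaved layout: code first with ordinary # comments, all POD gathered at the bottom. The function and the POD wording are illustrative only.

```perl
use strict;
use warnings;

# Code first, uninterrupted; ordinary # comments explain the tricky parts.
sub celsius_to_fahrenheit {
    my ($c) = @_;
    return $c * 9 / 5 + 32;
}

my $boiling = celsius_to_fahrenheit(100);
print "$boiling\n";

# All POD gathered below the code (a module would usually put it
# after __END__).

=head1 NAME

celsius_to_fahrenheit - convert a temperature

=head1 SYNOPSIS

    my $f = celsius_to_fahrenheit(100);

=cut
```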

o- < not >

petral's idea that 0 < $val <= 100 is more natural than 100 >= $val > 0. I think I just do this. But I never thought about it nor have I checked. Nice idea, not too important.
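
Written out in Perl that runs on any version, the low-to-high form might look like the sketch below; `in_range` is a hypothetical helper. As far as I know, perl 5.32 and later also accept the chained form `0 < $val <= 100` directly.

```perl
use strict;
use warnings;

# Written low-to-high, the test reads like the number line:
#   0 < $val <= 100
sub in_range {
    my ($val) = @_;
    return ( 0 < $val && $val <= 100 ) ? 1 : 0;
}

print join( ' ', map { in_range($_) } 0, 1, 100, 101 ), "\n";
```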

o- don't use select *

I never have. But I never thought about the unruliness of the beast either.

Update:

o- don't use one char var names

edited: Wed Dec 11 17:16:34 2002 by jeffa - added readmore tag


Replies are listed 'Best First'.

Re: A guide to coding Perl
by PetaMem (Priest) on Dec 11, 2002 at 11:05 UTC

I like it. It's informal, but that's ok because it's made for a one man show. Plus it has some things I never thought of that are just smart. Big ++ from me. Oh. And here's something from our PPCGs (PetaMem Perl Coding Guidelines): These are made to be applicable for a group of programmers, with bits from other styleguides shamelessly stolen and even containing some totalitarian guides like "you have to program with that editor". But I think this totalitarianism is needed if you need to keep a Perl development team in sync.

Why PPCG? Why English?
First off, be assured that other programmers *will* have to read your code. This can be in one or in two years, or in a week. Be also assured that YOU will have to read your own code in a year or so. The other programmers reading your code may not speak Czech, German, Hindi or Esperanto, but they certainly will speak and understand English. That's why all development has to happen in this language.

English documentation
English remarks in the program
English variable, package, symbol, subroutine ... names

Choose the right editor - it can help your career!
Basically you may choose whatever editor you like. If you don't choose emacs - however - you have to make pretty sure that your source code is not screwed up in emacs. Why? Because the pointy haired big boss (PHBB) is using emacs and wants to use emacs and he will review all of your code in this editor. And because you'll see that there are more requirements for PetaMem source code, which only a few editors are able to meet.
If at a later point you want to be promoted to some "inferior big boss" position and review the code of some PetaMem programmers, you must do this in emacs. Why? Because even as an "inferior big boss" it can happen to you that the real PHBB will re-review this code. If he should find any mistakes in source code approved by you, you will very quickly be transformed from an "inferior big boss" into a dishwasher. So emacs can help your career - it cannot guarantee it however.
Make use of folding at the subroutine/method level. '{{{' is the start of a fold, '}}}' is the end of it. The folding.el module for emacs does a good job at this. It should look like this:

# {{{ main_init                 main initialization routine
#
sub main_init {
    #
    # The priority of configurations is:
    # 1. command line parameters
    # 2. .elricrc file
    # 3. defaults/autodetection

    # At every beginning of a block - if you have to declare my variables...
    my $cpu     = '';
    my $get_opt = '';

    if(-e '.elricrc') {
        $config = ConfigReader::Simple->new(&globalconf('rscfile'));
        $config->parse();
        $cpu = $config->get('cpu') || &num_cpus;
    }

    $get_opt = GetOptions('cpu=i', \$cpu,
                          'c=i',   \$cpu,
                          'help',  \&print_usage_cmdline,
                          'h',     \&print_usage_cmdline,
                          'v',     \&print_version);

    &print_welcome;
    &globalconf('cpu',$cpu);
}

# }}}

Give emacs hints what this source is all about:

# For Emacs: -*- mode:cperl; mode:folding -*-

Yes, use cperl-mode

Don't use nested folds. There are at least two reasons:

Nested folds are realized using different line endings, so the source cannot be compiled correctly.
If you need them, you've made a mistake in your problem specification and/or aren't writing clean and efficient code.

The absolute basic hints

Know what the code is supposed to do BEFORE you start
Don't be too clever. Be elegant when coding and yes - cleverness is helpful, but don't be TOO clever, making it hard for someone reviewing your code a year later to find out what idea you had on that best day of your life long ago.
"You may think of a very clever way to code something this week. Unfortunately you may not be as clever next week and you might not be able to figure out what you did."

If you want to be especially clever (or need to, because the problem is hard), use appropriate documentation. Appropriate documentation is what all the rest of the development team understands (including the PHBB).
Don't mix platform dependent code with platform independent code. Test your code under various platforms.
Maintainable code should be your holy grail, above and beyond any other consideration (except for correctness). It should be more important than optimization (save for when business needs demand it).

PetaMem Specifics
You will use the CVS source code and versioning system. Get used to it and USE it. Don't see the additional information requirements it puts on you as ballast. You will miss that information at a later point. (Or regret not filling it out after a talk with the PHBB.) Every file of your Perl source code must contain the following header (<xx> are placeholders):
# Started <date> by <name>
#
# $Author$
# $Date$
# $Id$
# $Revision$
#
# [Modified <date> by <name>]*
# <name of the file> <version_of_the_source>
# Sanity checks: VARS|PROF|CERT
#
# PPCG: <version of the PPCG you are coding after>

As for the sanity checks:

VARS: Checked if there are any unnecessary variables (unused, used only once) and removed them
PROF: Code has been examined and optimized with the help of the Profiler
CERT: Certify that this file has been validated against the stated PPCG

To work out the design priorities at PetaMem consider the following priorities for your code:

Prio 1 CORRECT
    does what you intended
    what you intended was specified
Prio 2 MAINTAINABLE
    well structured/modularized
    well documented
    easy to understand
Prio 3 REUSABLE
    generic
Prio 4 EFFICIENT
    fast
    resource friendly
    small memory footprint
    etc.

Efficient being last priority doesn't mean that you can or should choose the wrong algorithm, O(n) instead of O(log n), or - to speak more generally - one with a runtime/memory complexity some level above the optimum solution. It just means that you don't need to worry about reimplementing in C immediately. If you encounter a memory/cpu tradeoff situation, going for more memory instead of hoping for a faster CPU is the way. (Most of the time, and even then it depends.)
Before you "finish" your work on a module or logical piece of code of yours, use Benchmark and the Profiler. See if there are any bottlenecks and eliminate them. Then your code is nearly "finished". So much for efficiency.
Then test it on various platforms and run it through the bytecode compiler. If it won't work on a platform, find out why and try to fix it. If fixing doesn't work (without major work), document it. Same applies to bytecompiling the code. THEN, and only after you've performed all these steps, your code can be seen as "finished" - ready for maintenance mode.

Use Comments
If your program is not worth documenting, it probably isn't worth running. The time you save by writing clean code and commenting it carefully may be your own. Your comments are as important as your code. The compiler won't see them, but people will. Be precise, don't be lengthy. Making speling errors in comments is bad. Have you forgotten? Comments are important and deserve at least a proofreading. You will see examples of comments later in the examples section.
Write your comments so that they are easy for others to use. It's hard to read and understand program code if you must be very careful not to skip some fundamental construction hidden between comment lines.
It is better to use a block of comments describing some action, followed by a block of code with the described functionality, than to comment each line of code. If you need to comment each line, place the comments at the end of these lines.
Widely accepted are comment schemes like

comment comment comment comment comment comment comment
comment comment comment comment comment comment comment
comment comment comment comment comment comment comment
code code code code code code code code code code code
code code code code code code code code code code code
code code code code code code code code code code code
code code code code code code code code code code code
code code code code code code code code code code code

or

code code code code    comment comment comment
code code code code    comment comment comment
code code code code    comment comment comment
code code code code    comment comment comment
code code code code    comment comment comment
code code code code    comment comment comment

But not

comment comment comment comment comment comment comment
code code code code code code code code code code code
comment comment comment comment comment comment comment
code code code code code code code code code code code
comment comment comment comment comment comment comment
code code code code code code code code code code code

Robust Code

use -w, don't use warnings instead
use strict
use Carp
Be robust. Write robust code. That is - your code shouldn't break. When your code is robust enough, you will realize that it is the environment that will break. Don't assume everything will go ok. Test the return values of the system calls and let subroutines have intelligent return values that allow you to be robust (i.e. end gracefully on fatal errors or - even better - be so fault tolerant at a high level that you can continue some alternative way). Remember: You're working for a company that creates AI-entities. You don't need to restart your brain when encountering an error. The AI shouldn't either. The AI may go down if the machine it is running on explodes. And only then and only if the machine explodes completely.
I said be robust, not paranoid. Testing the return value of print is paranoid. (Most of the time)
Concrete examples for robustness:

User-Level: Handle wrong inputs or completely missing inputs gracefully. In fact, expect them to go wrong.
System-Level: Expect that system calls may go wrong. Don't die because of this.
Program-Level: It shouldn't happen, but as we can create code dynamically @ runtime... Handle missing or wrong parameters gracefully.
Hardware-Level: The machine you're running on is or may get buggy. Have sanity checks. Die gracefully and informatively if you find your context broken.
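
The system-level point could be sketched as follows: check what open returns, report with Carp, and carry on with defaults rather than dying. The routine `read_config`, the file name and the key are hypothetical, not taken from the PPCG.

```perl
use strict;
use warnings;
use Carp qw(carp);

# Read a config file if possible; on failure report and fall back
# to defaults instead of dying. The file name and keys are hypothetical.
sub read_config {
    my ($path, %defaults) = @_;
    my %config = %defaults;

    my $fh;
    unless ( open $fh, '<', $path ) {
        carp "cannot read '$path' ($!), using defaults";
        return \%config;
    }
    while ( my $line = <$fh> ) {
        # Accept lines of the form: key = value
        next unless $line =~ /^\s*(\w+)\s*=\s*(.*?)\s*$/;
        $config{$1} = $2;
    }
    close $fh or carp "close failed for '$path': $!";
    return \%config;
}

my $config = read_config( 'no-such-file.rc', cpu => 1 );
print "cpu = $config->{cpu}\n";
```

The caller always gets a usable hash back, so it can continue whether or not the file exists.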

General Perl Habits @ PetaMem

All PetaMem Perl developers are members of perlmonks.org, some are at the Prague Perlmonks.

Using Alien Perl Modules

Use of alien modules (from CPAN) is strongly preferred over the use of our own reimplementation of some task, even if the module may seem overkill. However: The policy for using modules is that first the "best" module for a given task has to be chosen. Choosing is done by a committee of at least 2 programmers. These are committed to doing research at various community places like Perl newsgroups or websites to form and support their decision. After they've chosen, they have to CARVE their decision in a document at W3, stating which module was chosen over what others, when and why.
Now be very VERY careful what you choose, because: If time shows that another module is better (extended/faster functionality) and should be used instead of the old one, a committee of at least 2 programmers including the PHBB has to make the decision to replace the old module with the new one. IF the decision is made, ALL use of the old module in existing source code has to be replaced with the new one. Therefore it may well be that replacing a module won't happen even if the new one is technically superior. That's a bad thing. PetaMem programs should be so well written (functional encapsulation) and documented that a replacement could happen.
If a module gets replaced, this decision has to be CARVED in W3. The information about why we used the old module becomes part of this report.
We're a small company (for now). But we're using a powerful language and we're moving in a powerful community. These two pillars are the giants on whose shoulders we stand. Don't program it if it already exists. If you do, you're asking for trouble.

Various suggestions
Interpolation, concatenation and list context
Using interpolation is in many cases a waste of resources. Interpolation is absolutely equivalent (in terms of speed) to concatenation and this is - when printing - much slower than using list context. This is because print expects a list context.

"$blah1 $blah2" (INTERPOLATION)

$blah1.' '.$blah2 (CONCATENATION)

$blah1,' ',$blah2 (LIST CONTEXT)

If you just need to construct a string which you need in scalar context, you must concatenate. And therefore it is sometimes much clearer and more readable if you use interpolation.
You may prefer interpolation for readability AND(!) if the operation doesn't require maximum speed. In every other case use list context.
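
The three forms can be compared side by side. The sketch below only shows that they write the same bytes (as long as `$,` is left unset); the speed claim above is the poster's and is best measured with Benchmark on the code in question. The helper `captured` is hypothetical.

```perl
use strict;
use warnings;

my ( $blah1, $blah2 ) = ( 'foo', 'bar' );

# Capture what print writes so the three forms can be compared.
sub captured {
    my ($printer) = @_;
    my $out = '';
    open my $fh, '>', \$out or die "cannot open in-memory handle: $!";
    $printer->($fh);
    close $fh;
    return $out;
}

my $interpolated = captured( sub { print { $_[0] } "$blah1 $blah2" } );
my $concatenated = captured( sub { print { $_[0] } $blah1 . ' ' . $blah2 } );
my $as_list      = captured( sub { print { $_[0] } $blah1, ' ', $blah2 } );

print "all three forms write the same bytes\n"
    if $interpolated eq $concatenated && $concatenated eq $as_list;
```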
Code length
If your subroutine is longer than 70 lines of code, you can be sure that you're doing something wrong. If it is more than 100 lines you most probably have made a major mistake in the design of your code and it needs to be reviewed by someone else.

Overall module structure
Make your folded module look like this:
# CREATE methods
# {{{ new             constructor...
# {{{ read            read and parse lexicon from file...
# {{{ restore         fast restore of dumped lexicon...

# QUERY methods
# {{{ isin            is the $key in this lexicon...
# {{{ info            information about the contents of this lexicon object...

# MODIFY methods
# {{{ add_entry       add entry to lexicon...
# {{{ add_entries     add entries to lexicon...
# {{{ add_meaning     add meaning IF DISJOINT else DISCARD or REPLACE...
# {{{ del_entries     delete entries from lexicon...
# {{{ merge           merge some other lexicon to this one...
# {{{ consolidate     consolidate this lexicon...
# {{{ dump            dump lexicon for fast recovery...

# DERIVE methods
# {{{ expand          expand one entry...

To be continued...

signed the PHBB & Co.

Bye
PetaMem


Re: A guide to coding Perl
by dingus (Friar) on Dec 11, 2002 at 20:44 UTC

"statement unless condition"

What do I mean?

well I often see code like

while (cond1) {
  if (cond2) {
    statement;
    if (cond3) {
      statement;
      statement;
      for (range) {
        something;
        something;
        something;
      }
    }
  }
  else {
    last;
  }
}

while (cond1) {
  last unless (cond2);
  statement
  next unless (cond3);
  statement
  statement
  for (range) {
    something
    something
    something
  }
}
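
Reading the two blocks above as a nested loop and its flattened equivalent, a runnable sketch with concrete conditions shows that both shapes do the same work. The input list and the conditions are invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical input: stop at the first negative number, skip odd ones.
my @queue = ( 4, 7, 10, -1, 6 );

# Nested form.
sub nested {
    my @in = @_;
    my @out;
    while (@in) {
        my $n = shift @in;
        if ( $n >= 0 ) {
            if ( $n % 2 == 0 ) {
                push @out, $n * $_ for 1 .. 2;
            }
        }
        else {
            last;
        }
    }
    return @out;
}

# Flattened form using statement modifiers.
sub flattened {
    my @in = @_;
    my @out;
    while (@in) {
        my $n = shift @in;
        last unless $n >= 0;
        next unless $n % 2 == 0;
        push @out, $n * $_ for 1 .. 2;
    }
    return @out;
}

print join( ' ', nested(@queue) ),    "\n";
print join( ' ', flattened(@queue) ), "\n";
```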

Dingus

Enter any 47-digit prime number to continue.


Re: Re: A guide to coding Perl

by rir (Vicar) on Dec 12, 2002 at 02:57 UTC

To my knowledge, I've only met two people, Pascal lovers both, who could not abide fallthrough. In their code, an if statement that could return was always followed by an else clause that contained the rest of the routine.

To be fair: a couple more spaces in your indents and your first version would read much better.


Lifestyle versus coding guidelines ...
by johanvdb (Beadle) on Dec 11, 2002 at 14:22 UTC

Johan


Re: Lifestyle versus coding guidelines ...

by rir (Vicar) on Dec 11, 2002 at 18:14 UTC

As for strictness of rules, it doesn't really apply to a one man show. But if I were leading a team I'd prefer consensus on coding standards; then my standards would not change too much in style.

If the culture would not allow that then I think very clear standards are good. Clear like: Use perltidy with these options. Use strict always. If people aren't on the same page there is no point to statements like: Try to avoid deep indent levels. Avoid complex regexes if possible. The first are clear and enforceable, the latter are not. Piddly little rules like the former may indicate a sick work environment. But a business can't stop just because the work culture is diseased.

If I talk of pointy haired bosses, I am showing that I am infected.

As regards code generation we disagree. But it does seem that we are talking about different things. My generated code is distinguishable from my hand code only in the header comment that says it is generated and in that it may be slightly differently formatted. If I ran it through perltidy you could not tell the difference.

I have fooled with IDEs but have never overcome the initial learning curve. Smalltalk was really tempting me, then Java became the hot OO alternative to too-complex C++ and killed off the Smalltalk companies.

I started serious generation with the Jeeves generator described in Advanced Perl Programming by Srinivasan. It is not completely implemented as described, so I found myself descending into Perl to set up stuff for Jeeves. Eventually I decided that Jeeves was unnecessary, that using straight Perl to generate code would be faster, clearer, and more flexible. Just a little uglier in syntax.

This has allowed me to create more code, more consistent code, and less buggy code. By continually finding the differences between the entities in my code and defining them in one place I find that finding one bug and fixing it, fixes many bugs. Or points to further refinement of my meta-data. So my experience with code generation is very different than yours.

Yum, yum meta-data gooood.

Yes it does get to that visceral a level.


Blurb on Flyweights available
by scrottie (Scribe) on Dec 12, 2002 at 07:59 UTC

blurb on Flyweights

Immutable Objects

Title edit by tye


Re: A guide to coding Perl
by osama (Scribe) on Dec 14, 2002 at 10:50 UTC

Plan before you code.
perl -w
use strict;
Think in Perl!! Forget any other programming skills you may have...
Write small subroutines that do simple things.
Use CPAN modules.
Organize your file/files logically...
Separate generic stuff into modules (for reuse later, if modules are good/generic enough consider sharing).
indent works with Perl!!!! I frequently use indent -kr filename.pl


