http://qs321.pair.com?node_id=586065


in reply to Perl6 Pod -- reinventing the wheel?

Okay, so let me talk a little about why the proposed new Pod is the way it is, and why we didn't choose to jump to Texinfo or DocBook or XHTML or any other pre-existing markup system.

Let me start by rewriting your proposed POTD example using the new Pod notation:

    #!/usr/bin/perl
    use strict;
    use warnings;

    =head1 A few good subs

    =para
    This is a line of Pod. This module contains some functions and
    might be used as follows:

    =code
        do_something();   # Magic happens here!

    # ------------------
    # Subroutines
    # ------------------

    =head2 do_something

    =para
    You'd use this I<awesome> function for:

    =item When you want to do foo.
    =item When you want to do bar, since foo obviously isn't cutting it.

    sub do_something {
        print "Magic goes here.\n";
    }

    print "hi.\n";
    do_something;
    print "bye!\n";

Take a moment to compare the two versions:

To take those points one at a time...

The readability of raw mark-up matters. Not because the readers of documentation read it raw, but because the writers of documentation write it raw. The less complex and intrusive a mark-up notation is, the less likely the document writer is to make a mistake (either with the notation, or with the content) when documenting.

The choice of keywords matters too. Texinfo (and HTML and Perl 5 POD for that matter) get this wrong in subtle but important ways. For example, Texinfo and HTML provide the @emph{...} and @strong{...} (<em>...</em> and <strong>...</strong> in HTML) markers. But what do they mean? When should I use "emphasis" and when should I use "strength"? The usual answer for most people is that they simply ignore the (un)descriptive aspect of these labels, mentally translate them back into "italics" and "bold" respectively, and decide which they want on that basis. So the syntax chosen for these descriptive elements actually undermines the descriptive focus: the writer has to resort to presentational considerations to work out what they should use.

In contrast, and in a typical Perlish (or perhaps a Damianish) approach, Pod solves this problem by going so far in the other direction that it nearly comes full circle. Perl 5 POD provided the I<> and B<> markers (for "italics" and "bold"); it was not even pretending to be descriptive. Perl 6 Pod could have provided E<> and S<> markers (for "emphasis" and "strong"), but then it would have been pretending to be descriptive, since everyone would just mentally translate them back to "italics" and "bold". Instead Pod provides the U<>, I<>, and B<> markers: for "unusual", "important", and "basis". That is, Pod provides three levels of significance markers and, far more importantly, provides an easy way to decide which one to use.

Instead of asking yourself "Should this be emphasized or strong?", you ask yourself "Is this merely unusual in the surrounding text, or is it actually important in the surrounding text, or is it in fact the entire basis of the surrounding text?" So Pod gives you a much better way of deciding which mark-up tag is appropriate...by making the mark-up keywords actually mean something. Instead of deciding what the text should look like (presentational mark-up) or how much emphasis to apply to the text (descriptive mark-up), you decide how significant that text is (semantic mark-up).

Now, by sheer coincidence, the unusual tag (U<>) is typically rendered in underlining, the important tag (I<>) is typically rendered in italics, and the basis tag (B<>) is typically rendered in bold. So if you don't like (or can't cope with) Pod's semantic level of mark-up, it turns out that you can just pretend that Pod is still a presentational mark-up notation and simply choose whether you want underlining, italics, or boldness. Of course, it's a complete accident that the U<>, I<>, and B<> tags can be misunderstood in that fashion, but it's a highly convenient and backwards-compatible accident. ;-)
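As a rough illustration, a paragraph that uses all three significance levels might look something like the following. The sentences are invented for this example; only the three formatting codes come from the discussion above.

```pod
=para
The cache is refreshed once an hour. A refresh can U<occasionally> take
longer than a minute. I<Never read the cache while a refresh is running.>
B<Every reader depends on the cache being consistent.>
```

A renderer would typically show those three spans underlined, italicized, and bold respectively, but the writer only had to decide how significant each one was.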

This notion of semantic mark-up is applied throughout the design of Pod. For instance, whereas Texinfo has @verbatim blocks and HTML has <PRE> blocks, Pod has three distinct alternatives: =code, =input, and =output. That's so you can distinguish the three commonest uses for pre-formatted verbatim text in a readable way, so you can be specific about the semantics of a particular block (what it means), and so renderers can easily distinguish between those three types of block and thus present snippets of code, samples of input, and listings of output in three distinct and easily recognized formatting styles.
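To make that distinction concrete, here is a small sketch that uses all three block types. The little program and its sample run are made up for illustration; only the block names =code, =input, and =output come from the text above, and the layout simply follows the style of the earlier example.

```pod
=para
The script reads a name from standard input and greets you:

=code
    chomp(my $name = readline(STDIN));
    print "Hello, $name!\n";

=input
    Damian

=output
    Hello, Damian!
```

With a single verbatim block type, all three of those would look identical to a renderer; here each one says what it is.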

Added to all of the above motivations is the fact that Perl 6's Pod has been carefully designed to be very easy to adapt to if you're already familiar with Perl 5's POD. If you know POD, you can very quickly learn the new rules for Pod (there are fewer of them and they're less restrictive, simpler, and more consistent). Similarly, it's easy to pick up the small number of new constructs (nested list items, autonumbered list items, input and output samples, tables, definitions) because they use the same syntactic structures as existing constructs. Oh, and annoyances like =over/=back and =cut (which were often where mark-up mistakes crept in) have been removed.
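For example, nested and autonumbered list items might be written roughly as below. This is only a sketch: it assumes the =item1/=item2 level suffixes and the leading # for autonumbering as described in the Pod design document (Synopsis 26), and the exact spelling may differ between drafts. The list content itself is invented.

```pod
=para
To install the module:

=item1 # Unpack the archive.
=item1 # Build the module:
=item2 Run the configuration script.
=item2 Run the build.
=item1 # Run the tests.
```

Note that there is no =over or =back anywhere: the nesting level is part of the item's own name.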

So, yes, we could have chosen to move to an entirely different mark-up notation, but we decided that we could better meet the needs (and the expectations) of Perl programmers by tweaking the notation they're already familiar with to remove the pitfalls and annoyances, to raise the level of abstraction, and to increase the expressive power, without compromising the fundamental goal of having a truly Perlish documentation system: one that helps you get your job done efficiently, without getting in the way.

Damian