http://qs321.pair.com?node_id=11119799

Lady_Aleena has asked for the wisdom of the Perl Monks concerning the following question:

I have been writing POD for some of my modules, and one of the harder aspects of writing POD is coming up with good examples in the synopsis. I came up with, what I think is, a good example for numerical use; but I am stuck on coming up with a good example for the alpha use of the following sub.

The sub splits the values of the list into two parts, sorting by the first part, then sorting by the second part.

Please note, I do not know why I wrote this, since I am not using it anywhere in my code. I even searched my history here to see if I brought it up before and can not find anything, which is strange because most of what I write ends up here at some point.

    sub split_sort {
      my ($in_a, $in_b, $split, $sort_type) = @_;
      $split = qr($split);

      my ($numa1, $numa2) = split(/$split/, $in_a);
      my ($numb1, $numb2) = split(/$split/, $in_b);

      if ($sort_type =~ /^num/) {
        $numa1 <=> $numb1 || $numa2 <=> $numb2
      }
      elsif ($sort_type =~ /^(alpha|letter)/) {
        $numa1 cmp $numb1 || $numa2 cmp $numb2
      }
    }

The list I came up with for the numerical option is:

    my @numbers = qw(1:2 1:02 3:4 5:78 50:89 10:5);

    my @sorted = sort @numbers; # The list as it is written can not be sorted as numbers.
    # returns
    # [
    #   '10:5',
    #   '1:02',
    #   '1:2',
    #   '3:4',
    #   '50:89',
    #   '5:78'
    # ]

    my @split_sorted = sort { split_sort($a, $b, ':', 'number') } @numbers;
    # returns
    # [
    #   '1:2',
    #   '1:02',
    #   '3:4',
    #   '5:78',
    #   '10:5',
    #   '50:89'
    # ];

I included a cautionary note in the POD that 02 is the same as 2 when using this.
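
A small sketch of why that caution is needed: the numeric operator treats 02 and 2 as equal, while the string operator does not. The values are only illustrative.

```perl
use strict;
use warnings;

# '02' and '2' are the same number but different strings.
my $as_numbers = '02' <=> '2';    # 0: numerically equal
my $as_strings = '02' cmp '2';    # -1: '0' sorts before '2'

print "numeric: $as_numbers, string: $as_strings\n";
```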

I just can not come up with a list where this might be needed on alphabetical strings.

Comments on the code are always welcome.

Edit: I have a feeling I added the usage for alphabetical strings for completeness, nothing more. I have tried coming up with lists of alphabetical strings that would change if this is used and have not found any. I am now thinking that I should just put a note in saying that the alphabetical usage is redundant.

My OS is Debian 10 (Buster); my perl versions are 5.28.1 local and 5.16.3 or 5.30.0 on web host depending on the shebang.

No matter how hysterical I get, my problems are not time sensitive. So, relax, have a cookie, and a very nice day!
Lady Aleena

Re: Coming up with good examples in POD
by kcott (Archbishop) on Jul 25, 2020 at 23:58 UTC

    G'day Lady Aleena,

    "I just can not come up with a list where this might be needed on alphabetical strings."

    With split_sort(), as currently written, I can't see any use for the alpha sort type: it will return the same as a plain sort (in all cases, as far as I can tell).
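
    One edge case that may be worth checking, offered as a sketch rather than a verdict: when a first field contains a character that sorts before the separator (for example '-' or a digit before ':'), comparing field by field appears to give a different order from a plain sort. The list below is hypothetical, and split_sort is the original four-argument version with explicit return statements added.

    ```perl
    use strict;
    use warnings;

    # split_sort as in the original post ($a, $b, split expression, sort type),
    # with explicit returns added for clarity.
    sub split_sort {
        my ($in_a, $in_b, $split, $sort_type) = @_;
        $split = qr($split);
        my ($numa1, $numa2) = split(/$split/, $in_a);
        my ($numb1, $numb2) = split(/$split/, $in_b);
        if ($sort_type =~ /^num/) {
            return $numa1 <=> $numb1 || $numa2 <=> $numb2;
        }
        elsif ($sort_type =~ /^(alpha|letter)/) {
            return $numa1 cmp $numb1 || $numa2 cmp $numb2;
        }
        return 0;    # unknown sort type: treat as equal (an assumption)
    }

    # Hypothetical list: '-' and '1' both sort before ':' in ASCII.
    my @strings = qw(a1:x a:y a-b:z);

    my @plain = sort @strings;
    my @split = sort { split_sort($a, $b, ':', 'alpha') } @strings;

    print "plain: @plain\n";    # plain: a-b:z a1:x a:y
    print "split: @split\n";    # split: a:y a-b:z a1:x
    ```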

    With an alpha sort type ignoring case, you could get this difference:

        $ perl -E 'my @x = ("ade:Y", "Abc:X", "Afg:Z"); say for sort @x'
        Abc:X
        Afg:Z
        ade:Y
        $ perl -E 'my @x = ("ade:Y", "Abc:X", "Afg:Z"); say for sort { fc($a) cmp fc($b) } @x'
        Abc:X
        ade:Y
        Afg:Z

    With an alpha sort type expecting Unicode, you could get this difference:

        $ perl -C -E 'my @x = ("\x{c5}de:Y", "Abc:X", "Afg:Z"); say for sort @x'
        Abc:X
        Afg:Z
        Åde:Y
        $ perl -MUnicode::Collate -C -E 'my @x = ("\x{c5}de:Y", "Abc:X", "Afg:Z"); say for Unicode::Collate->new->sort(@x)'
        Abc:X
        Åde:Y
        Afg:Z

    "I am now thinking that I should just put a note in saying that the alphabetical usage is redundant."

    Perhaps not entirely redundant. Consider its potential use in a scenario where you process an AoA which holds a mixture of numeric and alphabetic arrays.

        my @multi_sorts = (
            [ $array1, ':', 'num' ],
            [ $array2, '-', 'alpha' ],
            [ $array3, ',', 'num' ],
        );

        handle_multi_mixed_sorts(\@multi_sorts);

        # At this point in the code, each of the arrays in @multi_sorts
        # has the original array still as the first element
        # and the sorted array now as the fourth element.

        sub handle_multi_mixed_sorts {
            my ($multi_sorts) = @_;

            for my $i (0 .. $#$multi_sorts) {
                push @{$multi_sorts->[$i]}, [
                    sort {
                        split_sort($a, $b, $multi_sorts->[$i][1], $multi_sorts->[$i][2])
                    } @{$multi_sorts->[$i][0]}
                ];
            }

            return;
        }

    Do note that I just typed that code directly into my post: it's entirely untested.

    — Ken

      Case sensitivity is outside the scope of split_sort, and should a user want to add case insensitivity to the use of split_sort, they can do something like the following.

          my @sorted = sort { split_sort( fc($a), fc($b), 'type', 'split expr') } @array;
          # I made changes to split_sort detailed in a moment.

      With a change I made to split_sort, that should work if alpha or letter is chosen for type.

      After failing to come up with a list of strings where the split would return something different from a standard sort, I was about to abandon it again. Then a thought popped into my head, and it was not the Stay Puft Marshmallow Man. There might be strings where numbers were on one side of the potential split and alpha was on the other, and someone might want the numbers sorted numerically. So, I added the left and right options. Also, since alpha (or letter) doesn't benefit from the splitting of the strings, I had those options skip the splitting.

          sub split_sort {
            my ($in_a, $in_b, $sort_type, $split) = @_;

            if ($sort_type =~ /^(alpha|letter)/) {
              $in_a cmp $in_b
            }
            else {
              $split = qr($split);
              my ($numa1, $numa2) = split(/$split/, $in_a, 2);
              my ($numb1, $numb2) = split(/$split/, $in_b, 2);

              if ($sort_type =~ /^num/) {
                $numa1 <=> $numb1 || $numa2 <=> $numb2
              }
              elsif (fc($sort_type) eq 'left' ) {
                $numa1 <=> $numb1 || $numa2 cmp $numb2
              }
              elsif (fc($sort_type) eq 'right' ) {
                $numa1 cmp $numb1 || $numa2 <=> $numb2
              }
            }
          }

      Another change I made is the order of the parameters. Since the parameter $split can be ignored for alpha, I put it at the end. I prefer to put any parameter that can be ignored or undef at the end of the parameter list. (If two or more parameters can be undef or ignored, then I think it is best to make them $opt.)

      With those changes, the POD became easier to write. It may still be incomplete, but this is what I have so far.

          =pod

          =encoding utf8

          =head1 NAME

          B<Fancy::Sort::Split> returns the expression to split the values in lists for sort.

          =head1 VERSION

          This document describes Fancy::Sort::Split version 1.0.

          =head1 SYNOPSIS

            my @numbers = qw(1:2 1:02 3:4 5:78 50:89 10:5);
            my @split_sorted = sort { split_sort($a, $b, 'number', ':') } @numbers;
            # returns
            # [
            #   '1:2',
            #   '1:02',
            #   '3:4',
            #   '5:78',
            #   '10:5',
            #   '50:89'
            # ];

            my @left = qw(2:a 02:a 4:a 28:a 89:a 5:a);
            my @split_sorted_left = sort { split_sort($a, $b, 'left', ':') } @left;
            # returns
            # [
            #   '2:a',
            #   '02:a',
            #   '4:a',
            #   '5:a',
            #   '28:a',
            #   '89:a'
            # ];

            my @right = qw(a:2 a:02 a:4 a:28 a:89 a:5);
            my @split_sorted_right = sort { split_sort($a, $b, 'right', ':') } @right;
            # returns
            # [
            #   'a:2',
            #   'a:02',
            #   'a:4',
            #   'a:5',
            #   'a:28',
            #   'a:89'
            # ];

          =head1 DESCRIPTION

          Fancy::Sort::Split returns the expression to split the values in lists for L<sort|https://perldoc.perl.org/functions/sort.html> subroutines using C<split_sort>. C<split_sort> has to be imported into your script.

          C<split_sort> has four required parameters. The first and second parameters are C<$a> and C<$b> from C<sort> or C<$b> and C<$a> if you want a descending sort. The third parameter is the expression you want to split the strings by. The fourth is the type of sort you want, C<number> or C<alpha> (C<letter>).

            split_sort($a, $b, 'type', 'expr');

          A note of caution for the numerical sorts, when a number has a leading zero (C<02>), the leading zero will be dropped. So, C<02> will be the same as C<2>.

          It requires Perl version 5.16.0 or better.

          =head2 Numerical sort

          When you have numbers on both sides of the expression, use C<number> so the numbers on both sides are numerically sorted.

            split_sort($a, $b, 'number', 'expr');

          =head2 Numerical sort on the left

          When you have numbers on the left side of the expression, use C<left> so the numbers on the left side are numerically sorted.

            split_sort($a, $b, 'left', 'expr');

          =head2 Numerical sort on the right

          When you have numbers on the right side of the expression, use C<right> so the numbers on the right side are numerically sorted.

            split_sort($a, $b, 'right', 'expr');

          =head2 Alphabetical sort

          When you have letters on both sides of the expression, use C<alpha> or C<letter>. However, the alphabetical sort is redundant and was added for completeness. The sort expression returned will be the same as C<$a cmp $b> for the entire string. So, for alphabetical sorts, you may omit the expression for the split.

            split_sort($a, $b, 'alpha', 'expr');
            split_sort($a, $b, 'letter', 'expr');

          =head1 DEPENDENCIES

          Fancy::Sort::Split depends on L<Exporter>.

          =head1 AUTHOR

          Lady Aleena

          =cut

      I am a bit disappointed that the alpha part did not pan out as I wanted, but I did not put much thought into it in the first place. I am fairly happy with it as it is now.

      Lady Aleena
        "I am fairly happy with it as it is now."

        If your changes cover all potential use cases, that's good.

        As the original question was focused on POD, I thought I'd just point out a couple of discrepancies.

        You wrote "... the parameter $split can be ignored for alpha ..."; but, your POD has "C<split_sort> has four required parameters.". As it stands, users won't know what fourth parameter to use for alpha types; an additional example wouldn't hurt.

        Also in that paragraph, the last two sentences are back-to-front: it should be, 3rd is 'type' and 4th is 'expr' (cf. "split_sort($a, $b, 'type', 'expr');" which immediately follows that paragraph).

        &split_sort contains no explicit return, so the return value will be the result of the last expression evaluated; that will be either a <=> or cmp expression, both of which return one of -1, 0 or 1 (see "perlop: Equality Operators"). The sort itself returns a list which you have correctly assigned to an array. The POD immediately afterwards indicates an ARRAYREF is returned: "returns [ ... ]".

        How you deal with that one is up to you. I'd probably show split_sort(...) and indicate it returns the same as <=> and cmp, making it useful for sort. Then show usage with sort much as you currently have. Then show that it returns a LIST instead of an ARRAYREF; just changing [...] to (...) should suffice.
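
        A sketch of that layout, assuming a condensed version of the revised split_sort from earlier in the thread (type third, split expression fourth). The sample values and the fallback for an unknown type are illustrative assumptions.

        ```perl
        use strict;
        use warnings;
        use feature 'fc';    # fc() needs Perl 5.16 or later

        # Condensed from the revised sub in this thread, with explicit returns.
        sub split_sort {
            my ($in_a, $in_b, $sort_type, $split) = @_;
            return $in_a cmp $in_b if $sort_type =~ /^(alpha|letter)/;

            $split = qr($split);
            my ($numa1, $numa2) = split(/$split/, $in_a, 2);
            my ($numb1, $numb2) = split(/$split/, $in_b, 2);
            return $numa1 <=> $numb1 || $numa2 <=> $numb2 if $sort_type =~ /^num/;
            return $numa1 <=> $numb1 || $numa2 cmp $numb2 if fc($sort_type) eq 'left';
            return $numa1 cmp $numb1 || $numa2 <=> $numb2 if fc($sort_type) eq 'right';
            return 0;    # unknown type: treat as equal (an assumption)
        }

        # On its own, split_sort returns -1, 0 or 1, like <=> and cmp.
        my $cmp = split_sort('1:2', '10:5', 'number', ':');    # -1

        # Used with sort, the result is a LIST, so it is shown with (...).
        my @split_sorted = sort { split_sort($a, $b, 'number', ':') } qw(10:5 3:4 1:2);
        # @split_sorted is now ('1:2', '3:4', '10:5')

        print "$cmp\n@split_sorted\n";
        ```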

        For your 'left' and 'right' examples, I'd mix it up a bit to give a clearer indication of how the sorting process works. For instance, here's a couple of arbitrary examples.

            IN:  10:ab 21:bb 2:bb 21:ab 2:b
            OUT: 2:b 2:bb 10:ab 21:ab 21:bb

            IN:  bb:10 b:1 ab:10 ab:2 bb:9
            OUT: ab:2 ab:10 b:1 bb:9 bb:10

        Including the '2' and '02' in your original examples was a good idea. You may want to point out that they'll keep their original order; by which, I mean:

            $ perl -E 'say for sort { $a <=> $b } qw{020 2 02 20}'
            2
            02
            020
            20
            $ perl -E 'say for sort { $a <=> $b } qw{20 02 2 020}'
            02
            2
            20
            020

        Finally, as I've not seen the actual module code, I'm guessing a bit here. The NAME should indicate what the module provides or implements, not what it returns; technically, it should just return a TRUE value: it's common for the last line of code, excluding POD, to be just 1;. Also, it's useful to know whether functions are exported by default or not.
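
        A minimal skeleton of how such a module might be laid out, again as a guess since the real module code has not been shown. It assumes split_sort is exported only on request via @EXPORT_OK (the actual module may export by default), the NAME wording in the comment is hypothetical, and the package main part merely stands in for a separate script.

        ```perl
        package Fancy::Sort::Split;
        use strict;
        use warnings;
        use feature 'fc';

        use Exporter 'import';
        our @EXPORT_OK = qw(split_sort);    # assumption: exported on request only

        # Hypothetical NAME line for the POD, saying what the module provides:
        #   Fancy::Sort::Split - sort comparison for strings split into two fields

        sub split_sort {
            my ($in_a, $in_b, $sort_type, $split) = @_;
            return $in_a cmp $in_b if $sort_type =~ /^(alpha|letter)/;

            $split = qr($split);
            my ($numa1, $numa2) = split(/$split/, $in_a, 2);
            my ($numb1, $numb2) = split(/$split/, $in_b, 2);
            return $numa1 <=> $numb1 || $numa2 <=> $numb2 if $sort_type =~ /^num/;
            return $numa1 <=> $numb1 || $numa2 cmp $numb2 if fc($sort_type) eq 'left';
            return $numa1 cmp $numb1 || $numa2 <=> $numb2 if fc($sort_type) eq 'right';
            return 0;    # unknown type: treat as equal (an assumption)
        }

        1;    # a module file ends with a true value

        package main;

        # In a real script this would be: use Fancy::Sort::Split qw(split_sort);
        Fancy::Sort::Split->import('split_sort');

        my @sorted = sort { split_sort($a, $b, 'left', ':') } qw(21:bb 2:bb 10:ab 21:ab 2:b);
        print "@sorted\n";    # 2:b 2:bb 10:ab 21:ab 21:bb
        ```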

        — Ken

Re: Coming up with good examples in POD
by stevieb (Canon) on Jul 25, 2020 at 19:51 UTC

    I usually snag chunks of my test files to use in SYNOPSIS and EXAMPLES sections.

    That, or take some test parts and slightly modify them for such a purpose. Your unit test suite should have every imaginable (to you) angled use of your software, and tests should be written before, or as you write code, so A) you know it works, B) it should be literally copy/pasteable and C) so long as the test code never changes, your examples shouldn't have to either.

      The history of this sub is probably that I wrote a one-off script that is now gone. I liked the sub, so I plonked it in my Util::Sort with all my other sorting subs. It is easy enough to figure out how to use, so no usage was written for it anywhere. It has been sitting in my Util::Sort module a long time growing metaphorical moss.

      I have not written any tests for any of my work yet. I am writing all the POD first, then I will get to the tests, eventually. The last time I looked at the Test suite of modules, I got so confused on their usage, that I just quit trying. That was years ago when I was still learning how to write and had severe confidence issues. I still have confidence issues, though not as severe.

      It will be especially hard to write tests for the modules I wrote and do not use.

      Lady Aleena
        "I have not written any tests for any of my work yet. I am writing all the POD first, then I will get to the tests, eventually. The last time I looked at the Test suite of modules, I got so confused on their usage, that I just quit trying."

        Thing is, if you get all of the POD done, then write tests, what happens if by writing tests you find glaring issues, and you need to modify the public interface? You now have to scour all of your documentation to ensure all examples, usages and explanations get updated as well.

        To be honest, I feel that test cases are the best practice for confidence. You get familiar with the process, but it also allows you to have a closer relationship with your code (as you're usually bouncing back and forth from the test and the code).

        "It will be especially hard to write tests for the modules I wrote and do not use."

        If you don't use them, they are the perfect modules to learn how to write tests against. Nobody else is using them either presumably, so if/when you find glaring bugs and problems, you already know the code isn't in production anywhere. Not so much for writing code for modules already in production.
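
        As a starting point, a first test file can be very small. This is only a sketch using the core Test::More module, with a condensed copy of the revised split_sort pasted in so that it runs on its own; in a real t/ file you would load the sub from the module instead, and the test names are made up.

        ```perl
        use strict;
        use warnings;
        use feature 'fc';
        use Test::More;

        # Condensed copy of the revised split_sort, with explicit returns.
        sub split_sort {
            my ($in_a, $in_b, $sort_type, $split) = @_;
            return $in_a cmp $in_b if $sort_type =~ /^(alpha|letter)/;

            $split = qr($split);
            my ($numa1, $numa2) = split(/$split/, $in_a, 2);
            my ($numb1, $numb2) = split(/$split/, $in_b, 2);
            return $numa1 <=> $numb1 || $numa2 <=> $numb2 if $sort_type =~ /^num/;
            return $numa1 <=> $numb1 || $numa2 cmp $numb2 if fc($sort_type) eq 'left';
            return $numa1 cmp $numb1 || $numa2 <=> $numb2 if fc($sort_type) eq 'right';
            return 0;    # unknown type: treat as equal (an assumption)
        }

        is( split_sort('1:2', '1:02', 'number', ':'), 0, '2 and 02 compare as equal numbers' );

        is_deeply(
            [ sort { split_sort($a, $b, 'number', ':') } qw(10:5 50:89 3:4 5:78) ],
            [ qw(3:4 5:78 10:5 50:89) ],
            'number: both fields sorted numerically',
        );

        is_deeply(
            [ sort { split_sort($a, $b, 'right', ':') } qw(bb:10 b:1 ab:10 ab:2 bb:9) ],
            [ qw(ab:2 ab:10 b:1 bb:9 bb:10) ],
            'right: letters by cmp, numbers by <=>',
        );

        done_testing();
        ```

        Each passing test doubles as an example that can be pasted into the SYNOPSIS.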

Re: Coming up with good examples in POD
by perlfan (Vicar) on Jul 25, 2020 at 23:05 UTC
    > good examples in the synopsis

    The SYNOPSIS is just that, a "quick start" or "tl;dr" on how to get to something more interesting. It is not a treatise expounding on the innards of your functions.

    Similarly, POD detailing the methods is best written as "this is the intent of this function". It is not a full breakdown of the implementation and what constitutes it running successfully. This is what unit tests are for - and unit tests can also serve as excellent documentation themselves. If your goal is really to expound on the gritty details of a function or method, this is probably best done around the function itself. The code should be enough to describe what is happening. And if it isn't, then a few well placed internal comments (behind the #) would work well.

    Lastly, a good rule of thumb is that updating the code of a function should require documentation updates only in extreme cases (e.g., when IN/OUT changes, exceptions are added or removed, etc).

      I may be missing something, but how is POD not for breaking down the implementations of a module and using it successfully? When I go to read how to use a module I have not seen before, I read the documentation of it. I have never looked at a module's unit tests to figure out how to use any module. I also do not look at the code searching for comments on usage either. Most of the time when I use a module, I do not even look at its code. (The few times I have looked at modules' code, I could not understand it, even metaphorically standing on my head.)

      One of my bigger problems using modules is that some make me play guessing games on how to use them. I have walked away from several modules because the documentation was lacking in examples of usage, or only showed the most common usages and ignored anything fringe. If I have to play guessing games when trying to use a module, I quit trying usually and roll my own code to get what I want.

      So, when I write the documentation for my modules, I want to come up with all the usages I can imagine and note them so users can see the full capabilities of what my modules can do. I do not want potential users to play guessing games on how to use what I wrote. I also do not want potential users to have to go on a treasure hunt (looking at the code and tests), to see how to use the module either.

      But as I said, I may be missing something.

      Lady Aleena
        "I may be missing something, but how is POD not for breaking down the implementations of a module and using it successfully? When I go to read how to use a module I have not seen before, I read the documentation of it."

        POD (and user-facing documentation in general) is for describing the purpose and usage of modules and their functions. It is not, in general cases, for describing the internal ways the module achieves its ends as these are very rarely any concern of the user.

        For each function/sub/method one could reasonably expect the documentation to include:

        • function name
        • Arguments: types, names/positions, optional/mandatory, defaults, acceptable values
        • Return values: types, names/positions, flag values
        • Conditions which will throw exceptions
        • Prose describing the purpose, side-effects, caveats, context-sensitive matters, etc.
        • Example(s) showing how to call the function with suitable arguments, etc.

        In other words, everything to do with the purpose of the call and the interface to it. Everything else internal to the function does not need documenting here (eg. private variables, other functions it calls, etc.).
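
        As an illustration of that checklist, a per-function POD entry for the revised split_sort from this thread might look something like the sketch below. The wording is only a suggestion drawn from what has been said in the thread, not the module's actual documentation.

        ```pod
        =head2 split_sort

          my @sorted = sort { split_sort($a, $b, $type, $expr) } @list;

        Compares two strings, each split into two fields, for use in a C<sort> block.

        =over

        =item Arguments

        C<$a> and C<$b> (required): the two strings to compare.
        C<$type> (required): C<number>, C<left>, C<right>, C<alpha> or C<letter>.
        C<$expr>: the expression to split on; may be omitted for C<alpha> and C<letter>.

        =item Returns

        -1, 0 or 1, like C<< <=> >> and C<cmp>.

        =item Caveats

        Leading zeros are ignored in numeric fields, so C<02> compares equal to C<2>.

        =back

        =cut
        ```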


        🦛