PerlMonks  

RFC: Tool::Box

by Limbic~Region (Chancellor)
on Nov 15, 2004 at 15:15 UTC

All,
I have this loose idea in my head about taking code re-use down to the snippet level. Presumably, individuals already do this but don't feel it warrants a new module on CPAN. Since there is no end to beginners in the language asking how to do X where the stock reply is to do Y, I was thinking maybe we should do just that. The following is a rough proof of concept:
package Tool::Box;

use strict;
use warnings;

require Exporter;    # needed so that the inherited import() actually exists

our @ISA         = qw(Exporter);
our @EXPORT      = qw();
our @EXPORT_OK   = qw( Ave Max Min Uniq Stats );
# Tag names are stored without the leading colon; callers still write ':ALL'.
our %EXPORT_TAGS = ( ALL => [ qw(Ave Max Min Uniq Stats) ] );
our $VERSION     = '0.01';

# Each generator returns a closure that remembers everything fed to it so far.
sub Max {
    my $max;
    my $find_max = sub {
        for ( @_ ) {
            $max = $_ if ! defined $max || $_ > $max;
        }
        return $max;
    };
    $find_max->( @_ );
    return $find_max;
}

sub Min {
    my $min;
    my $find_min = sub {
        for ( @_ ) {
            $min = $_ if ! defined $min || $_ < $min;
        }
        return $min;
    };
    $find_min->( @_ );
    return $find_min;
}

sub Ave {
    my $tot;
    my $cnt;
    my $find_ave = sub {
        $cnt += @_;
        $tot += $_ for @_;
        return $cnt ? $tot / $cnt : undef;
    };
    $find_ave->( @_ );
    return $find_ave;
}

sub Uniq {
    my %uniq;
    my $find_uniq = sub {
        @uniq{ @_ } = ();
        if ( defined wantarray ) {
            return wantarray ? keys %uniq : scalar keys %uniq;
        }
    };
    $find_uniq->( @_ );
    return $find_uniq;
}

sub Stats {
    my $stat = {};
    my ($cnt, $max, $min, $tot);
    $stat->{ADD} = sub {
        $cnt += @_;
        for ( @_ ) {
            $tot += $_;
            $max = $_ if ! defined $max || $_ > $max;
            $min = $_ if ! defined $min || $_ < $min;
        }
    };
    $stat->{MAX} = sub { $max };
    $stat->{MIN} = sub { $min };
    $stat->{AVE} = sub { $cnt ? $tot / $cnt : undef };
    $stat->{TOT} = sub { $tot };
    $stat->{ADD}->( @_ );
    return $stat;
}

42;
__END__

=head1 NAME

Tool::Box - A hodge podge of useful functions

=head1 VERSION

Version 0.01

=head1 SYNOPSIS

    use Tool::Box;
    use Tool::Box qw(Max Min);
    use Tool::Box ':ALL';

=head1 DESCRIPTION

This module is a collection of commonly used functions to do what you
want how you want.

=head1 EXPORTS

None by default

=head1 FUNCTIONS

=head2 Stats

Function generator that allows you to keep track of max/min/ave/tot

    use Tool::Box qw(Stats);
    my $stat = Stats(1 .. 10);
    print $stat->{TOT}();    # 55
    print $stat->{AVE}();    # 5.5
    print $stat->{MIN}();    # 1
    print $stat->{MAX}();    # 10
    for (8 .. 12) {
        $stat->{ADD}( $_ );
    }
    print $stat->{MAX}();    # 12
    print $stat->{TOT}();    # 105

=head2 Ave

Function generator that allows you to keep track of average value

    use Tool::Box qw(Ave);
    my $ave = Ave(1 .. 10)->();
    print $ave;              # 5.5
    $ave = Ave();
    for (4 .. 9) {
        $ave->( $_ );
    }
    print $ave->();          # 6.5
    $ave = Ave(20 .. 30);
    print $ave->();          # 25

=head2 Max

Function generator that allows you to keep track of maximum value

    use Tool::Box qw(Max);
    my $max = Max(1 .. 10)->();
    print $max;              # 10
    $max = Max();
    for (4 .. 9) {
        $max->( $_ );
    }
    print $max->();          # 9
    $max = Max(20 .. 30);
    print $max->();          # 30

=head2 Min

Function generator that allows you to keep track of minimum value

    use Tool::Box qw(Min);
    my $min = Min(1 .. 10)->();
    print $min;              # 1
    $min = Min();
    for (4 .. 9) {
        $min->( $_ );
    }
    print $min->();          # 4
    $min = Min(20 .. 30);
    print $min->();          # 20

=head2 Uniq

Function generator that allows you to keep track of unique things

    use Tool::Box qw(Uniq);
    my $count = Uniq(1,1,2,3,1,7)->();
    print $count;            # 4
    my @u_nums = Uniq(1,1,2,3,1,7)->();
    print join ' ' , @u_nums;    # 1 2 3 7, though the order is not guaranteed
    my $uniq = Uniq();
    while ( <DATA> ) {
        chomp;
        $uniq->( $_ );
    }
    my $unique_lines = $uniq->();

=head1 AUTHOR

Joshua Gatcomb, <Limbic_Region_2000@Yahoo.com>

=head1 ACKNOWLEDGEMENTS

Various people from PerlMonks (L<http://www.perlmonks.org>) provided
invaluable input.

=head1 BUGS

Functions that expect numeric arguments do not verify that they are numeric

=head1 TO DO

=head1 COPYRIGHT

Copyright (c) 2004 Joshua Gatcomb. All rights reserved. This program is
free software; you can redistribute it and/or modify it under the same
terms as Perl itself.

=head1 SEE ALSO

L<perl>(1)

=cut

Yes, I know that some of these functions already exist in List::Util. I chose them because the standard wheel doesn't always fit your bicycle (values are not all available at once). I got the idea from one of tmoertel's posts. I want to stress that this post is about the idea in general and not this specific proof of concept.

So is this a good idea or not? I was thinking along the lines of going through the FAQs and providing ready-made solutions (such as getting the intersection of two arrays). I can see it going both ways: if beginners don't learn fundamentals, where will the hand-holding end? If it is a good idea, what should be included? Besides going through the FAQs, I was also thinking about going through Snippets.
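As an illustration of the kind of FAQ-style snippet meant here, a minimal sketch of an array intersection follows. The name `intersection` and its calling convention (two array references) are assumptions made for this example, not part of the proposed module.

```perl
use strict;
use warnings;

# Hypothetical helper: elements that appear in both lists.
# Each element is reported once, in the order it first appears in the left list.
sub intersection {
    my ($left, $right) = @_;
    my %in_right = map { $_ => 1 } @$right;    # lookup table for the right list
    my %seen;                                  # suppress duplicates from the left list
    return grep { $in_right{$_} && !$seen{$_}++ } @$left;
}

my @common = intersection( [ 1, 2, 2, 3, 5 ], [ 2, 3, 4 ] );
print "@common\n";    # 2 3
```

Like the Uniq function above, this compares elements as hash keys, so it works on their string form.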

Cheers - L~R

Updated 2004-11-16: I changed the all-caps function names after a couple of comments against them; it is something I normally don't do anyway.

Replies are listed 'Best First'.
Re: RFC: Tool::Box
by hardburn (Abbot) on Nov 15, 2004 at 15:29 UTC

    I honestly don't see the point. A given programmer that doesn't read the FAQs now isn't going to start because of a shiny new module. It's more important that the FAQ give a good answer and that we point people toward it when these questions come up.

    "There is no shame in being self-taught, only in not trying to learn in the first place." -- Atrus, Myst: The Book of D'ni.

Re: RFC: Tool::Box
by borisz (Canon) on Nov 15, 2004 at 15:40 UTC
    - I like how you keep track of the values, but no beginner can understand the difference between your min and List::Util::min. So the hardest part is to explain why someone should use this module in favor of List::Util.
    - I personally dislike uppercase function names a lot!
    - And there is already List::MoreUtils; perhaps your functions fit better in that module than in a new one.
    Boris
      borisz,
      • ...hardest part is to explain why someone should use this module in favor of List::Util
      • This is a good point in general. Perhaps the README should include a FAQ for questions like "Why isn't there an X function?" The reply would be "Because it is part of module Y."
      • I personally dislike uppercase function names a lot!
      • I have a tendency to name my subs like Add_Name and am not sure why I did all uppercase.
      • And there is already List::MoreUtils, perhaps your functions fit more in this module.

      From my original post:

      I want to stress that this post is about the idea in general and not this specific proof of concept.

      The reason the proposed name was Tool::Box and not Functions::List::Util::Forgot was because I see it as being a hodge podge of functions (such as getting the intersection of two arrays). These functions were only examples of what I was thinking about and not specific functions that I think should be included (concepts not specifics).

      Cheers - L~R

Re: RFC: Tool::Box
by jdporter (Paladin) on Nov 15, 2004 at 15:46 UTC
    IMHO, there is a level, measured in snippet code size, below which it does not make sense to encapsulate in a module. At that level, the toolbox should exist in the programmer, not in some external library. Above that level, the snippets are going to be specific enough that they ought to live in the appropriate module. Your example above is just one such: those functions should go in List::Util. Any other routines you'd be inclined to throw in which would be inappropriate for List::Util would also be inappropriate alongside these functions in any other module. Just MHO.
      jdporter,
      Just to recap so I am clear:
      • Relatively small snippets by themselves do not belong in a module
      • Collections of related snippets may warrant an appropriately named module
      • A "left overs" or "hodge podge" of snippets doesn't belong in a module, but should be part of the programmer's knowledge base.
      Just MHO

      That's what I asked for and thanks for giving it to me. Like I said, I could see it going both ways. Still not sure I am convinced either way though.

      Cheers - L~R

        Let's carefully consider this one:
        A "left overs" or "hodge podge" of snippets doesn't belong in a module, but should be part of the programmer's knowledge base.
        The snippets that are common knowledge to a significant portion of the community ought to be taken out of our personal, back-of-the-brain closets, dusted off, and given proper names. Let us place them firmly in the common, community-wide knowledge base.

        Why should every programmer in the community have their own pet versions of common but "noncollectable" functions? If there is value in giving something a name, give it a name. If there is value in collecting something, collect it.

        Who cares how big the thing is or whether there's already enough stuff like it to form a ready-made home for it? If the cost of giving it a name and collecting it and then using it to solve problems is less than the cost of solving the problems otherwise, then it has earned its name and its place. What other consideration is there?

        Cheers,
        Tom

Re: RFC: Tool::Box
by itub (Priest) on Nov 15, 2004 at 17:10 UTC
    There is a line that separates the ridiculously simple snippets from the useful subroutines and modules; the problem is that the exact location of that line is subjective and a matter of preference. I think most people would agree that this is ridiculous:

    package Increment;
    use base qw(Exporter);
    @EXPORT = qw(increment);
    sub increment { $_[0]++ }
    # (...)

    But when you come to things like a min/max function or a slurp function, there is more room for debate. While I'll happily use the functions in List::Util because they are part of the core (at least if I'm targeting perl-5.8.0+), I just never use any of the file slurping modules on CPAN. Adding another dependency for something that I can write in one line of code seems utterly unnecessary.
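For readers who have not met it, one common form of the one-line slurp is sketched below. The post does not say which one-liner is meant, so this is only one possibility; the temporary file exists purely to make the example self-contained.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a small file so the example can run on its own.
my ($out, $path) = tempfile( UNLINK => 1 );
print {$out} "line one\nline two\n";
close $out or die "Cannot close $path: $!";

# The one-liner: localise @ARGV and $/ so that <> reads the whole file at once.
# @ARGV receives the path and $/ is left undefined for the duration of the block.
my $content = do { local ( @ARGV, $/ ) = $path; <> };

print length($content), "\n";    # 18
```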

Re: RFC: Tool::Box
by tmoertel (Chaplain) on Nov 16, 2004 at 05:48 UTC
    First, this is a mighty fine idea. It reduces the cost of certain common operations, even if hand-coding them is already inexpensive. More importantly, it solidifies common idioms behind descriptive names that the community can incorporate into its collective vocabulary.

    Second, the all-caps are a bit much. (They burn my eyes.)

    Third, I like the pre-loading refinement. (Personally, I wouldn't use them like this:

    my $min = MIN(1..10)->();
    because I find this more pleasing:
    my $min = MIN->(1..10);
    )

    But the refinement does have value for finding min-maxes and max-mins:

    my $max_of_at_least_zero = MAX(0);
    $max_of_at_least_zero->($_) while <>;
    # ...

    Fourth, let's load the tool-box up! How about some new additions?

    • read_all – slurp filehandle
    • read_all_from_path – open, slurp, close
    • zip – merge parallel arrays
    • zip_with – merge parallel arrays with a given "zipper" function
    • foldl, foldl1, foldr, foldr1, scanl, scanl1, scanr, scanr1 – more friends from functional programming
    • curry – everybody loves curry
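A rough sketch of what a few of these additions might look like follows. The signatures are assumptions for illustration (array references for the zipped lists, the function as the first argument, stopping at the shorter list); the post only names the functions.

```perl
use strict;
use warnings;

# zip_with: combine parallel arrays element by element with a "zipper" function.
# Stops at the shorter array (an assumption; the list above does not specify).
sub zip_with {
    my ($f, $xs, $ys) = @_;
    my $n = scalar(@$xs) < scalar(@$ys) ? scalar(@$xs) : scalar(@$ys);
    return map { $f->( $xs->[$_], $ys->[$_] ) } 0 .. $n - 1;
}

# zip: pair up elements, expressed here in terms of zip_with.
sub zip {
    my ($xs, $ys) = @_;
    return zip_with( sub { [ @_ ] }, $xs, $ys );
}

# foldl: left fold with an explicit starting value.
sub foldl {
    my ($f, $acc, @list) = @_;
    $acc = $f->( $acc, $_ ) for @list;
    return $acc;
}

# curry: fix some leading arguments now, supply the rest later.
sub curry {
    my ($f, @pre) = @_;
    return sub { $f->( @pre, @_ ) };
}

my @sums  = zip_with( sub { $_[0] + $_[1] }, [ 1, 2, 3 ], [ 10, 20, 30 ] );
my @pairs = zip( [ 'a', 'b' ], [ 1, 2 ] );
my $total = foldl( sub { $_[0] + $_[1] }, 0, 1 .. 4 );
my $add5  = curry( sub { $_[0] + $_[1] }, 5 );

print "@sums\n";                  # 11 22 33
print scalar(@pairs), "\n";       # 2
print "$total\n";                 # 10
print $add5->(3), "\n";           # 8
```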

    Thanks for getting the ball rolling.

    Cheers,
    Tom

      tmoertel,
      • First, this is a mighty fine idea...
      • Thank you. I think it might be, but I would have preferred to hear from some relative newcomers.
      • Second, the all-caps are a bit much...
      • I have changed them. I am not sure why I used all caps as I don't normally.
      • Third, I like the pre-loading refinement...
      • This was a hard decision with regard to DWIM. I would have preferred to have the ability in the calling syntax to determine whether the caller just wants to run once and get an answer or wants a function returned. This seemed like an acceptable compromise.
      • Fourth, let's load the tool-box up! How about some new additions?...
      • I am more than happy to be a co-author. We would decide when/if we release it who is responsible for maintenance.

      Cheers - L~R

        You can tell CPAN that you are co-maintainers so that both of you can upload new versions.
      tmoertel,
      I decided to get rid of max/min even if it is more functional than what List::Util provides. I think the only List::Util function worth making a closure out of is reduce.

      Additionally, I changed the calling syntax a bit. If you call a function with an argument list it will return the result of that list instead of a function. The trouble, as pointed out by diotalevi, is what to do if it is called with an empty list/array. Perhaps using different function names?
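The empty-list problem can be made concrete with a small sketch. The updated code is not shown in this thread, so `ave_dwim` below is a hypothetical reconstruction of the calling convention just described, not the actual implementation.

```perl
use strict;
use warnings;

# Hypothetical reconstruction: with arguments, return the average;
# without arguments, return an accumulator function.
sub ave_dwim {
    my ($tot, $cnt) = (0, 0);
    my $acc = sub {
        $cnt += @_;
        $tot += $_ for @_;
        return $cnt ? $tot / $cnt : undef;
    };
    return @_ ? $acc->(@_) : $acc;
}

my $average = ave_dwim( 1 .. 10 );

my @readings = ();                      # e.g. a data source that turned out to be empty
my $result   = ave_dwim(@readings);     # the caller wanted a number here

print "$average\n";                                                     # 5.5
print ref($result) eq 'CODE' ? "got a function\n" : "got a number\n";   # got a function
```

A caller passing an array that happens to be empty gets a code reference back where a number (or undef) was expected, which is the ambiguity raised above.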

      Cheers - L~R

      Update: Slight modification (removing a function) after diotalevi pointed out it was equivalent to another List::Util function
        Why do you define the anonymous subroutine every time you call the function? Your anonymous subs aren't closures - they're just anon subs. They don't close over anything, let alone anything in @_. Your functions would be much faster if you define the subrefs ahead of time. Your functions would also be simpler because they just dispatch to the subref or return it, as needed.

        Another item - you make the same mistake nearly every unique'ing function makes: what if I want to find unique objects or hashrefs? Much better is @uniq{@_} = @_; values %uniq. That way, you also respect any overloaded stringification I might use. Your uniq_ordered() function doesn't have this mistake.
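The difference between the two idioms can be seen in a short sketch. The variable names are made up for the example; the only technique taken from the post is `@uniq{@_} = @_; values %uniq`.

```perl
use strict;
use warnings;

my @records = ( { name => 'a' }, { name => 'b' } );
my @list    = ( $records[0], $records[1], $records[0] );   # one duplicate reference

# Keys only: the references are stringified and can no longer be dereferenced.
my %by_key;
@by_key{@list} = ();
my @keys_only = keys %by_key;

# Keys and values: the hash values are still the original references.
my %by_value;
@by_value{@list} = @list;
my @originals = values %by_value;

print scalar(@originals), "\n";                                # 2
print ref( $keys_only[0] ) ? "ref" : "plain string", "\n";     # plain string
print ref( $originals[0] ), "\n";                              # HASH
```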

        Being right, does not endow the right to be rude; politeness costs nothing.
        Being unknowing, is not the same as being stupid.
        Expressing a contrary opinion, whether to the individual or the group, is more often a sign of deeper thought than of cantankerous belligerence.
        Do not mistake your goals as the only goals; your opinion as the only opinion; your confidence as correctness. Saying you know better is not the same as explaining you know better.

        In order for a library to be useful, programmers must understand it. A chimera-like interface that shifts depending on the number of arguments passed to a function or upon magic use-time flags only places hurdles on the track to understanding. I'm squarely with diotalevi: The interface should be clear and unchanging.

        I recommend having separate functions for each behavior. For example, we could use the _acc suffix to denote accumulating functions that can be used to accumulate results iteratively. As long as we are clear and consistent in our usage, we can deliver both accumulating and non-accumulating functionality without making the library harder to understand.

        One possible implementation:

        # factor out accumulating behavior:
        sub make_acc_fn(&@) {
            my $f = shift;
            $f->(@_);
            $f;
        }

        # mean
        sub mean { mean_acc(@_)->() }    # all-at-once version
        sub mean_acc {                   # accumulating-function version
            my ($tot, $cnt);
            make_acc_fn {
                $cnt += @_;
                $tot += $_ for @_;
                $cnt ? $tot / $cnt : undef;
            } @_ ;
        }

        # uniq
        sub uniq { uniq_acc(@_)->() }
        sub uniq_acc {
            my %uniq;
            make_acc_fn {
                @uniq{ @_ } = ();
                wantarray ? keys %uniq : scalar keys %uniq;
            } @_ ;
        }

        # examples
        print mean(1,2,3,5), $/;    # 2.75

        my $m = mean_acc();
        print "$_ => ", $m->($_), $/ for 1..5;
        # 1 => 1
        # 2 => 1.5
        # 3 => 2
        # 4 => 2.5
        # 5 => 3

        print uniq(1,2,3,1..5), $/;    # 41325 (hash key order may vary)

        my $u = uniq_acc(1,2,3);
        print "$_ => ", $u->($_), $/ for 1..5;
        # 1 => 132
        # 2 => 132
        # 3 => 132
        # 4 => 4132
        # 5 => 41325

        Cheers,
        Tom

      For a read_all_from_path subroutine see File::Slurp (one of my must-have modules). An efficient read_all could probably be added there.

      For zip, fold*, and scan* see Language::Functional. If you want a zip_with, patching Language::Functional would perhaps be a good idea.

      For spices see Sub::Curry.

      ihb

      See perltoc if you don't know which perldoc to read!
      Read argumentation in its context!
