All,
I have this loose idea in my head about taking code re-use down to the snippet level. Presumably, individuals already do this but don't feel it warrants a new module on CPAN. Since there is no end to beginners in the language asking how to do X where the stock reply is to do Y, I was thinking maybe we should do just that. The following is a rough proof of concept:
package Tool::Box;
use strict;
use warnings;

require Exporter;    # must be loaded, or nothing is ever exported
our @ISA         = qw(Exporter);
our @EXPORT      = qw();
our @EXPORT_OK   = qw( Ave Max Min Uniq Stats );
# Tag names are stored without the colon; callers still write ':ALL'
our %EXPORT_TAGS = ( ALL => [ qw(Ave Max Min Uniq Stats) ] );
our $VERSION     = '0.01';

# Each generator returns a closure that remembers what it has seen so far.
# Calling the closure with more values updates and returns the result.
sub Max {
    my $max;
    my $find_max = sub {
        for ( @_ ) {
            $max = $_ if ! defined $max || $_ > $max;
        }
        return $max;
    };
    $find_max->( @_ );
    return $find_max;
}

sub Min {
    my $min;
    my $find_min = sub {
        for ( @_ ) {
            $min = $_ if ! defined $min || $_ < $min;
        }
        return $min;
    };
    $find_min->( @_ );
    return $find_min;
}

sub Ave {
    my $tot;
    my $cnt;
    my $find_ave = sub {
        $cnt += @_;
        $tot += $_ for @_;
        return $cnt ? $tot / $cnt : undef;
    };
    $find_ave->( @_ );
    return $find_ave;
}

sub Uniq {
    my %uniq;
    my $find_uniq = sub {
        @uniq{ @_ } = ();
        if ( defined wantarray ) {
            return wantarray ? keys %uniq : scalar keys %uniq;
        }
    };
    $find_uniq->( @_ );
    return $find_uniq;
}

# Returns a hash of closures sharing one set of running totals
sub Stats {
    my $stat = {};
    my ($cnt, $max, $min, $tot);
    $stat->{ADD} = sub {
        $cnt += @_;
        for ( @_ ) {
            $tot += $_;
            $max = $_ if ! defined $max || $_ > $max;
            $min = $_ if ! defined $min || $_ < $min;
        }
    };
    $stat->{MAX} = sub { $max };
    $stat->{MIN} = sub { $min };
    $stat->{AVE} = sub { $cnt ? $tot / $cnt : undef };
    $stat->{TOT} = sub { $tot };
    $stat->{ADD}->( @_ );
    return $stat;
}

42;
__END__

=head1 NAME

Tool::Box - A hodge podge of useful functions

=head1 VERSION

Version 0.01

=head1 SYNOPSIS

    use Tool::Box;
    use Tool::Box qw(Max Min);
    use Tool::Box ':ALL';

=head1 DESCRIPTION

This module is a collection of commonly used functions to do what you
want how you want.

=head1 EXPORTS

None by default

=head1 FUNCTIONS

=head2 Stats

Function generator that allows you to keep track of max/min/ave/tot

    use Tool::Box qw(Stats);
    my $stat = Stats(1 .. 10);
    print $stat->{TOT}(); # 55
    print $stat->{AVE}(); # 5.5
    print $stat->{MIN}(); # 1
    print $stat->{MAX}(); # 10
    for (8 .. 12) {
        $stat->{ADD}( $_ );
    }
    print $stat->{MAX}(); # 12
    print $stat->{TOT}(); # 105

=head2 Ave

Function generator that allows you to keep track of average value

    use Tool::Box qw(Ave);
    my $ave = Ave(1 .. 10)->();
    print $ave; # 5.5
    $ave = Ave();
    for (4 .. 9) {
        $ave->( $_ );
    }
    print $ave->(); # 6.5
    $ave = Ave(20 .. 30);
    print $ave->(); # 25

=head2 Max

Function generator that allows you to keep track of maximum value

    use Tool::Box qw(Max);
    my $max = Max(1 .. 10)->();
    print $max; # 10
    $max = Max();
    for (4 .. 9) {
        $max->( $_ );
    }
    print $max->(); # 9
    $max = Max(20 .. 30);
    print $max->(); # 30

=head2 Min

Function generator that allows you to keep track of minimum value

    use Tool::Box qw(Min);
    my $min = Min(1 .. 10)->();
    print $min; # 1
    $min = Min();
    for (4 .. 9) {
        $min->( $_ );
    }
    print $min->(); # 4
    $min = Min(20 .. 30);
    print $min->(); # 20

=head2 Uniq

Function generator that allows you to keep track of unique things

    use Tool::Box qw(Uniq);
    my $count = Uniq(1,1,2,3,1,7)->();
    print $count; # 4
    my @u_nums = Uniq(1,1,2,3,1,7)->();
    print join ' ' , @u_nums; # 1 2 3 7, though the order is not guaranteed
    my $uniq = Uniq();
    while ( <DATA> ) {
        chomp;
        $uniq->( $_ );
    }
    my $unique_lines = $uniq->();

=head1 AUTHOR

Joshua Gatcomb, <Limbic_Region_2000@Yahoo.com>

=head1 ACKNOWLEDGEMENTS

Various people from PerlMonks (L<http://www.perlmonks.org>) provided
invaluable input.

=head1 BUGS

Functions that expect numeric arguments are not verifying they are numeric

=head1 TO DO

=head1 COPYRIGHT

Copyright (c) 2004 Joshua Gatcomb. All rights reserved.
This program is free software; you can redistribute it
and/or modify it under the same terms as Perl itself.

=head1 SEE ALSO

L<perl>(1)

=cut
Yes, I know that some of these functions already exist in List::Util. I chose them because the standard wheel doesn't always fit your bicycle (values are not all available at once). I got the idea from one of tmoertel's posts. I want to stress that this post is about the idea in general and not this specific proof of concept.
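To make the "values are not all available at once" point concrete, here is a minimal sketch contrasting List::Util's max, which wants the whole list up front, with an accumulating closure that can be fed one value at a time. The name running_max is a hypothetical stand-in for the Max generator above, not part of any released module.

```perl
use strict;
use warnings;
use List::Util qw(max);

# Hypothetical stand-in for the Max generator: returns a closure that
# remembers the largest value it has been fed so far.
sub running_max {
    my $max;
    my $feed = sub {
        for (@_) {
            $max = $_ if !defined $max || $_ > $max;
        }
        return $max;
    };
    $feed->(@_);
    return $feed;
}

# All values on hand: List::Util is the simpler tool.
my $all_at_once = max(3, 9, 4);

# Values arriving one at a time (for example, while reading a stream).
my $feed = running_max();
$feed->($_) for 3, 9, 4;
my $streamed = $feed->();

print "$all_at_once $streamed\n";
```

Both approaches agree on the answer; the closure only earns its keep when the list cannot be held or is not yet complete.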
So is this a good idea or not? I was thinking along the lines of going through the FAQs and providing ready-made solutions (such as getting the intersection of two arrays). I can see it going both ways: if beginners don't learn the fundamentals, where will the hand-holding end? If it is a good idea, what should be included? Besides going through the FAQs, I was also thinking about going through Snippets.
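As an example of the kind of FAQ answer that could be packaged, here is a rough sketch of an array intersection. The function name and its calling convention (two array references) are assumptions for illustration, and it only handles plain strings or numbers.

```perl
use strict;
use warnings;

# Hypothetical helper: elements present in both arrays, each reported
# once, in the order they first appear in the first array.
sub intersection {
    my ($first, $second) = @_;
    my %in_second = map { $_ => 1 } @$second;
    my %seen;
    return grep { $in_second{$_} && !$seen{$_}++ } @$first;
}

my @common = intersection([1, 2, 3, 4, 2], [2, 4, 6]);
print "@common\n";
```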
Updated 2004-11-16: I changed the all cap function names, as it is something I normally don't do anyway, after a couple of comments against them.
Re: RFC: Tool::Box
by hardburn (Abbot) on Nov 15, 2004 at 15:29 UTC
|
I honestly don't see the point. A given programmer that doesn't read the FAQs now isn't going to start because of a shiny new module. It's more important that the FAQ give a good answer and that we point people toward it when these questions come up.
"There is no shame in being self-taught, only in not trying to learn in the first place." -- Atrus, Myst: The Book of D'ni.
Re: RFC: Tool::Box
by borisz (Canon) on Nov 15, 2004 at 15:40 UTC
|
- I like how you keep track of the values, but no beginner can understand the difference between your min and List::Util::min. So the hardest part is to explain why someone should use this module in favor of List::Util.
- I personally dislike uppercase function names a lot!
- And there is already List::MoreUtils; perhaps your functions would fit better in that module instead of a new one.
|
borisz,
- ...hardest part is to explain why someone should use this module in favor of List::Util
This is a good point in general. Perhaps the README should include a FAQ for questions like "why isn't there an X function?" The answer would be "because it is part of module Y."
- I personally dislike uppercase function names a lot!
I have a tendency to name my subs like Add_Name and am not sure why I did all uppercase.
- And there is already List::MoreUtils, perhaps your functions fit more in this module.
From my original post:
I want to stress that this post is about the idea in general and not this specific proof of concept.
The reason the proposed name was Tool::Box and not Functions::List::Util::Forgot was because I see it as being a hodge podge of functions (such as getting the intersection of two arrays). These functions were only examples of what I was thinking about and not specific functions that I think should be included (concepts not specifics).
Re: RFC: Tool::Box
by jdporter (Paladin) on Nov 15, 2004 at 15:46 UTC
|
IMHO, there is a level, measured in snippet code size, below which it does not make sense to encapsulate in a module. At that level, the toolbox should exist in the programmer, not in some external library. Above that level, the snippets are going to be specific enough that they ought to live in the appropriate module. Your example above is just one such: those functions should go in List::Util. Any other routines you'd be inclined to throw in which would be inappropriate for List::Util would also be inappropriate alongside these functions in any other module. Just MHO.
|
Let's carefully consider this one:
A "left overs" or "hodge podge" of snippets doesn't
belong in a module, but should be part of the programmer's knowledge
base.
The snippets that are common knowledge to a
significant portion of the community ought to be taken out of our
personal, back-of-the-brain closets, dusted off, and given proper
names. Let us place them firmly in the common,
community-wide knowledge base.
Why should every programmer in the community have their own pet
versions of common but "noncollectable" functions? If there
is value in giving something a name, give it a name. If there
is value in collecting something, collect it.
Who cares how big the thing is or whether there's already enough
stuff like it to form a ready-made home for it?
If the cost of giving it a name and collecting it and then using it to
solve problems is less than the cost of solving the problems
otherwise, then it has earned its name and its place. What other
consideration is there?
Cheers,
Tom
Re: RFC: Tool::Box
by itub (Priest) on Nov 15, 2004 at 17:10 UTC
|
There is a line that separates the ridiculously simple snippets from the useful subroutines and modules; the problem is that the exact location of that line is subjective and a matter of preference. I think most people would agree that this is ridiculous:
package Increment;
use base qw(Exporter);
@EXPORT = qw(increment);
sub increment {
    $_[0]++
}
# (...)
But when you come to things like a min/max function or a slurp function, there is more room for debate. While I'll happily use the functions in List::Util because they are part of the core (at least if I'm targeting perl-5.8.0+), I just never use any of the file slurping modules on CPAN. Adding another dependency for something that I can write in one line of code seems utterly unnecessary.
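For readers who have not met it, the one-line slurp alluded to here is usually written with a localised $/. A minimal sketch follows; the slurp wrapper and the temporary file are only there to make the example self-contained and are not from the post.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical wrapper around the slurp idiom: with $/ (the input record
# separator) localised to undef, <$fh> returns the whole file at once.
sub slurp {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    my $content = do { local $/; <$fh> };
    close $fh;
    return $content;
}

# Demonstration on a throwaway file.
my ($out, $path) = tempfile(UNLINK => 1);
print {$out} "line one\nline two\n";
close $out;

my $text = slurp($path);
print $text;
```

The idiom itself is the single `do { local $/; <$fh> }` expression; everything else is scaffolding.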
Re: RFC: Tool::Box
by tmoertel (Chaplain) on Nov 16, 2004 at 05:48 UTC
|
First, this is a mighty fine idea. It reduces the cost of certain
common operations, even if hand-coding them is already inexpensive.
More importantly, it solidifies common idioms behind descriptive
names that the community can incorporate into its collective
vocabulary.
Second, the all-caps are a bit much. (They burn my eyes.)
Third, I like the pre-loading refinement. (Personally, I wouldn't
use them like this:
my $min = MIN(1..10)->();
because I find this more pleasing:
my $min = MIN->(1..10);
)
But the refinement does have value for finding min-maxes and max-mins:
my $max_of_at_least_zero = MAX(0);
$max_of_at_least_zero->($_) while <>;
# ...
Fourth, let's load the tool-box up! How about some new additions?
- read_all – slurp filehandle
- read_all_from_path – open, slurp, close
- zip – merge parallel arrays
- zip_with – merge parallel arrays with a given "zipper" function
- foldl, foldl1, foldr, foldr1, scanl, scanl1, scanr, scanr1 – more friends from functional programming
- curry – everybody loves curry
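To give a feel for the size of these additions, here is a rough sketch of three of them. Only the names come from the list above; the signatures (array references for zip and zip_with, leading-argument binding for curry) are assumptions, and this curry is strictly speaking partial application.

```perl
use strict;
use warnings;

# zip: interleave parallel arrays (passed as references), padding the
# shorter ones with undef.
sub zip {
    my @arrays  = @_;
    my $longest = 0;
    for my $array (@arrays) {
        $longest = @$array if @$array > $longest;
    }
    return map { my $i = $_; map { $_->[$i] } @arrays } 0 .. $longest - 1;
}

# zip_with: combine two parallel arrays element by element using $f,
# stopping at the end of the shorter array.
sub zip_with {
    my ($f, $left, $right) = @_;
    my $last = (@$left < @$right ? @$left : @$right) - 1;
    return map { $f->($left->[$_], $right->[$_]) } 0 .. $last;
}

# curry: fix the leading arguments now, supply the rest later.
sub curry {
    my ($f, @fixed) = @_;
    return sub { $f->(@fixed, @_) };
}

my @pairs = zip([1, 2, 3], ['a', 'b', 'c']);
my @sums  = zip_with(sub { $_[0] + $_[1] }, [1, 2, 3], [10, 20, 30]);
my $add5  = curry(sub { $_[0] + $_[1] }, 5);

print "@pairs | @sums | ", $add5->(7), "\n";
```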
Thanks for getting the ball rolling.
Cheers, Tom
|
tmoertel,
- First, this is a mighty fine idea...
Thank you. I think it might be, but I would have preferred to hear from some relative newcomers.
- Second, the all-caps are a bit much...
I have changed them. I am not sure why I used all caps as I don't normally.
- Third, I like the pre-loading refinement...
This was a hard decision with regard to DWIM. I would have preferred to have the ability in calling syntax to determine if they just want to run one time and get an answer versus returning a function. This seemed like an acceptable compromise.
- Fourth, let's load the tool-box up! How about some new additions?...
I am more than happy to be a co-author. We would decide when/if we release it who is responsible for maintenance.
|
You can tell CPAN that you are co-maintainers so that both of you can upload new versions.
|
tmoertel,
I decided to get rid of max/min even if it is more functional than what List::Util provides. I think the only List::Util function worth making a closure out of is reduce.
Additionally, I changed the calling syntax a bit. If you call a function with an argument list it will return the result of that list instead of a function. The trouble, as pointed out by diotalevi, is what to do if it is called with an empty list/array. Perhaps using different function names?
Update: Slight modification (removing a function) after diotalevi pointed out it was equivalent to another List::Util function.
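The empty-list problem is easy to reproduce. The sketch below is a hypothetical dual-mode mean written only to show the ambiguity; it is not the revised code referred to above.

```perl
use strict;
use warnings;

# Hypothetical dual-mode mean: a result when given values, an
# accumulator closure when given none.
sub mean {
    my ($tot, $cnt) = (0, 0);
    my $acc = sub {
        $cnt += @_;
        $tot += $_ for @_;
        return $cnt ? $tot / $cnt : undef;
    };
    return @_ ? $acc->(@_) : $acc;
}

my $result = mean(2, 4);      # a plain number
my @empty  = ();
my $oops   = mean(@empty);    # caller wanted a result, gets a code ref

print "$result ", ref($oops), "\n";
```

A caller passing an array that happens to be empty silently gets a function back, which is the case separate function names would avoid.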
|
Why do you define the anonymous subroutine every time you call the function? Your anonymous subs aren't closures - they're just anon subs. They don't close over anything, let alone anything in @_. Your functions would be much faster if you define the subrefs ahead of time. Your functions would also be simpler because they just dispatch to the subref or return it, as needed.
Another item - you make the same mistake nearly every unique'ing function makes: what if I want to find unique objects or hashrefs? Much better is @uniq{@_} = @_; values %uniq. That way, you also respect any overloaded stringification I might use. Your uniq_ordered() function doesn't have this mistake.
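A small sketch of the difference, using two hypothetical one-liners (uniq_keys and uniq_values are names invented here): hash keys are always strings, so a key-based uniq hands back text where the caller passed a reference, while the value-based version returns the original item.

```perl
use strict;
use warnings;

# Key-based: references come back stringified.
sub uniq_keys   { my %seen; @seen{@_} = ();  return keys %seen }
# Value-based: each original item is kept as the hash value.
sub uniq_values { my %seen; @seen{@_} = @_;  return values %seen }

my $rec = { name => 'fred' };
my ($from_keys)   = uniq_keys($rec, $rec);
my ($from_values) = uniq_values($rec, $rec);

print ref($from_keys)   ? "ref" : "string", "\n";   # string
print ref($from_values) ? "ref" : "string", "\n";   # ref
```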
Being right, does not endow the right to be rude; politeness costs nothing. Being unknowing, is not the same as being stupid. Expressing a contrary opinion, whether to the individual or the group, is more often a sign of deeper thought than of cantankerous belligerence. Do not mistake your goals as the only goals; your opinion as the only opinion; your confidence as correctness. Saying you know better is not the same as explaining you know better.
|
In order for a library to be useful, programmers must understand it. A
chimera-like interface that shifts depending on the number of arguments
passed to a function or upon magic use-time flags only places hurdles
on the track to understanding. I'm squarely with diotalevi: The
interface should be clear and unchanging.
I recommend having separate functions for each behavior. For
example, we could use the _acc suffix to denote accumulating
functions that can be used to accumulate results iteratively. As long
as we are clear and consistent in our usage, we can deliver both
accumulating and non-accumulating functionality without making the
library harder to understand.
One possible implementation:
# factor out accumulating behavior:
sub make_acc_fn(&@) {
    my $f = shift;
    $f->(@_);
    $f;
}
# mean
sub mean { mean_acc(@_)->() } # all-at-once version
sub mean_acc {                # accumulating-function version
    my ($tot, $cnt);
    make_acc_fn {
        $cnt += @_;
        $tot += $_ for @_;
        $cnt ? $tot / $cnt : undef;
    } @_ ;
}
# uniq
sub uniq { uniq_acc(@_)->() }
sub uniq_acc {
    my %uniq;
    make_acc_fn {
        @uniq{ @_ } = ();
        wantarray ? keys %uniq : scalar keys %uniq;
    } @_ ;
}
# examples
print mean(1,2,3,5), $/;
# 2.75
my $m = mean_acc();
print "$_ => ", $m->($_), $/ for 1..5;
# 1 => 1
# 2 => 1.5
# 3 => 2
# 4 => 2.5
# 5 => 3
# (the uniq results below come back in hash order, so the digits may
# appear in a different order on your perl)
print uniq(1,2,3,1..5), $/;
# 41325
my $u = uniq_acc(1,2,3);
print "$_ => ", $u->($_), $/ for 1..5;
# 1 => 132
# 2 => 132
# 3 => 132
# 4 => 4132
# 5 => 41325
Cheers, Tom
|
For a read_all_from_path subroutine see File::Slurp (one of my must-have modules). An efficient read_all could probably be added there.
For zip, fold*, and scan* see Language::Functional. If you want a zip_with, patching Language::Functional would perhaps be a good idea.
For spices see Sub::Curry.
ihb
See perltoc if you don't know which perldoc to read!
Read argumentation in its context!