minimum, maximum and average of a list of numbers at the same time

LucaPette has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: minimum, maximum and average of a list of numbers at the same time by broquaint (Abbot) on Nov 10, 2005 at 11:55 UTC
This would seem a "perlish" solution to me: `use List::Util qw/ min max sum /; sub min_max_avg { return min(@_), max(@_), sum(@_) / @_; }` [download] HTH `_________ broquaint`	[reply] [d/l]
Re^2: minimum, maximum and average of a list of numbers at the same time by davorg (Chancellor) on Nov 10, 2005 at 12:21 UTC
It does look temptingly simple doesn't it? But you'd be iterating across the list three times - which isn't very efficient. You only really need to iterate across it once. Something like this (off the top of my head and untested). `# use 'mean' instead of 'avg' as it's unambiguous sub min_max_mean { my ($min, $max, $tot); my $count = @_; $min = $max = $tot = shift; foreach (@_) { $min = $_ if $_ < $min; $max = $_ if $_ > $max; $tot += $_; } return ($min, $max, $tot / $count); }` [download] -- <http://dave.org.uk> "The first rule of Perl club is you do not talk about Perl club." -- Chip Salzenberg	[reply] [d/l]
Re^3: minimum, maximum and average of a list of numbers at the same time by Roy Johnson (Monsignor) on Nov 10, 2005 at 21:08 UTC
Reducing the number of iterations or comparisons by a constant number may make a performance difference, depending on what gyrations you have to go through to make the reduction, but you're not changing the order of complexity. A 3-N solution is the same order as a 3/2-N solution, is the same as a single-pass (N) solution. They're all O(n). This is an important concept. If you can reduce the order of a solution, your solution will scale better. We commonly look for O(n log n) solutions to replace O(n²) solutions, so that working with large amounts of data doesn't make our app bog down. If you merely change by a constant factor (as we're talking about in this case), you may see a constant-factor improvement at any size, but you won't alleviate any scaling problems. You're in the realm of micro-optimization. Walking through the list is not going to be a significant portion of the computation, compared to the comparisons and math being done on the variables of interest. And it's really not worth trying to take the elements two at a time, because that ends up being a very inefficient way to walk through the list, compared to Perl's built-in `for`. If our OP were to translate the algorithm into Inline::C, the reduced number of comparisons might compete well with three List::Util calls, with the difference being some constant factor on any size list. Caution: Contents may have been coded under pressure.	[reply] [d/l]
Re^2: minimum, maximum and average of a list of numbers at the same time by demerphq (Chancellor) on Nov 10, 2005 at 12:16 UTC
And with List::Util::reduce()... `use List::Util qw/ reduce /; sub min_max_avg { $res= reduce { my $r= ref $a ? $a : [($a) x 3]; if ($b < $r->[0]) { $r->[0]= $b; } elsif ($b > $r->[1]) { $r->[1]= $b; } $r->[2]+=$b; $r } @_; return @$res[0,1], $res->[2] / @_; }` [download] --- $world=~s/war/peace/g	[reply] [d/l]
Re^3: minimum, maximum and average of a list of numbers at the same time by Limbic~Region (Chancellor) on Nov 10, 2005 at 13:45 UTC
And if you want to go even more FP... sub Gen_Stats { my $stat = {}; my ($cnt, $max, $min, $tot); $stat->{ADD} = sub { $cnt += @_; for ( @_ ) { $tot += $_; $max = $_ if ! defined $max \|\| $_ > $max; $min = $_ if ! defined $min \|\| $_ < $min; } }; $stat->{MAX} = sub { $max }; $stat->{MIN} = sub { $min }; $stat->{AVE} = sub { $cnt ? $tot / $cnt : undef }; $stat->{TOT} = sub { $tot }; $stat->{ADD}->( @_ ); return $stat; } my $stat_info = Gen_Stats(); while ( <DATA> ) { chomp; $stat_info->{ADD}($_); } print join "\t", map { $_->() } @{$stat_info}{qw/MAX MIN AVE TOT/}; [download] This code was borrowed from RFC: Tool::Box. I recommend using List::Util when and wherever possible. It has been part of the core since 5.007003 and uses XS when possible. The only real limitation I see with it is that all the items in the list must be known at once as I pointed out in How A Function Becomes Higher Order. Cheers - L~R	[reply] [d/l]
Re^3: minimum, maximum and average of a list of numbers at the same time by codeacrobat (Chaplain) on Nov 12, 2005 at 09:58 UTC
Nice fast code, but does it return the right values? I checked with the orig mma routine, the my_mma routine and your reduce_mma routine. Using the same set of @nums, the results differ. use Benchmark qw(:all :hireswallclock); my @nums = map {rand} 1 .. 10000; my $subs = { mma => sub { my @r = min_max_avg(\@nums) }, reduce_mma => sub { my @r = reduce_mma(\@nums) }, mymma => sub { my @r = my_mma(\@nums) } }; cmpthese(-1, $subs); $count = 1000; for my $sub (keys %$subs){ $t = timeit($count, $subs->{$sub}); print "$count loops of $sub:",timestr($t),"\n"; } for my $sub (keys %$subs){ print "$sub results:\n"; print (join ' ', &{$subs->{$sub}} , "\n"); } ##### Subs to test ###### __END__ $ perl benchmark.pl Rate mma mymma reduce_mma mma 86.3/s -- -49% -100% mymma 171/s 98% -- -100% reduce_mma 89229/s 103265% 52110% -- 1000 loops of reduce_mma:0.0114148 wallclock secs ( 0.00 usr + 0.00 s +ys = 0.00 CPU) 1000 loops of mma:12.0897 wallclock secs (12.05 usr + 0.01 sys = 12.0 +6 CPU) @ 82.91/s (n=1000) 1000 loops of mymma:5.98583 wallclock secs ( 5.97 usr + 0.00 sys = 5 +.97 CPU) @ 167.56/s (n=1000) reduce_mma results: 0.956444677504809 0.209810070551129 0.765697545177442 mma results: 0.000119859684559742 0.999992750475673 0.505291691541093 mymma results: 0.000119859684559742 0.999992750475673 0.505291691541093 [download]	[reply] [d/l]
Re^4: minimum, maximum and average of a list of numbers at the same time by demerphq (Chancellor) on Nov 12, 2005 at 11:02 UTC
Re^2: minimum, maximum and average of a list of numbers at the same time by LucaPette (Friar) on Nov 10, 2005 at 12:20 UTC
I'm not a perl expert... infact i'm not able to really understand the code of List::Util maybe you can clarify to me about this question: Do your code process three times the input list?	[reply]
Re^3: minimum, maximum and average of a list of numbers at the same time by broquaint (Abbot) on Nov 10, 2005 at 12:41 UTC
Do your code process three times the input list? Indeed it does and as davorg and demerphq have pointed out in their replies it isn't terribly efficient, but that's a "perlish" solution for you. HTH `_________ broquaint`	[reply]
Re^3: minimum, maximum and average of a list of numbers at the same time by blazar (Canon) on Nov 10, 2005 at 14:36 UTC
But you're not required to understand it. But then of course without even having seen it I'm sure it does process it three times. If that is really relevant, then indeed you may want or have to process it once yourself. And if it is very relevant, perhaps you may want to do it as a binary extension. Well, the author may consider a adding to List::Util a function to calculate more than one relevant quantity at a time, e.g. `my ($min, $max, $sum) = calculate [qw/min max sum/], @list;` [download] or perhaps with an OO interface: `my $stats=List::Util->calculate([qw/min max sum/])->of(@list); print $stats->min, "\n";` [download] possibly even with a mixed interface: `calculate [qw/min max sum/], @list; print List::Util::lastrun->max, "\n";` [download]	[reply] [d/l] [select]
Re^2: minimum, maximum and average of a list of numbers at the same time by bbacker (Initiate) on Nov 10, 2005 at 23:11 UTC
I don't know what the programmatic performance would be, but using Statistics::Descriptive would certainly cut the amount of time you, the human, had to spend on this problem.	[reply]
Re: minimum, maximum and average of a list of numbers at the same time by BrowserUk (Patriarch) on Nov 10, 2005 at 12:51 UTC
As others have mentioned, you are unlikely to code anything in Perl that will beat List::Util for performance, even if you do avoid making 3 whole passes. That is the nature of a dynamic language. For error handling, you need to add code to detect and deal with them. You have to decide whether to die or croak within the subroutine, or return some failure indication (like an empty list). Here's one possibility that raises an exception (via die) within the sub and catches it in the calling code using the block form of eval: #! perl -slw use strict; sub minMaxAve { die 'Empty list' unless @_; my( $min, $max, $sum ); for( @_ ) { die 'Non-numeric value(s)' unless /^[\de+.-]+$/; $min = $_ if !defined $min or $min > $_; $max = $_ if !defined $max or $max < $_; $sum += $_; } return $min, $max, $sum / @_; } for ( [], [ 1 .. 4 ], [ 5 , 1, 5, 6, 5 ], [ 1, 2, -0.1, 1e-5, 2e10 ], [ 'fox', 'dog' ] ) { my @results = eval{ minMaxAve @$_ } or warn "[ @$_ ] : $@\n\n" and + next; printf "[ %s ]\nmin: %g max: %g ave: %g\n\n", join( ', ',@$_ ), @r +esults; } __END__ P:\test>junk2 [ ] : Empty list at P:\test\junk2.pl line 5. [ 1, 2, 3, 4 ] min: 1 max: 4 ave: 2.5 [ 5, 1, 5, 6, 5 ] min: 1 max: 6 ave: 4.4 [ 1, 2, -0.1, 1e-005, 20000000000 ] min: -0.1 max: 2e+010 ave: 4e+009 [ fox dog ] : Non-numeric value(s) at P:\test\junk2.pl line 8. [download] You should probably use Carp::croak rather than die, a better regex (say from Regexp::Common) or better, Scalar::Util::looks_like_number() to detect non-numeric values. And, if performance is your requirement, List::Util's calls, but make sure that you get a version that compiles the XS alternatives on your platform. Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error. Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal? "Science is about questioning the status quo. Questioning authority". In the absence of evidence, opinion is indistinguishable from prejudice.	[reply] [d/l]
Re^2: minimum, maximum and average of a list of numbers at the same time by polettix (Vicar) on Nov 10, 2005 at 15:46 UTC
I would also add List::Moreutils, which has a `minmax` function using an improved algorithm to evaluate min and max at once: #!/usr/bin/perl use strict; use warnings; use List::Util qw( min max reduce ); use List::MoreUtils qw( minmax ); use Benchmark qw( cmpthese ); my @array = map { rand 100 } 1 .. 1_000_000; my $count = 0; cmpthese -10, { util_min_max => sub { my $min = min(@array); my $max = max(@array); }, util_reduce => sub { my ( $min, $max ) = @{ (reduce { my $r= ref $a ? $a : [($a) x 2]; if ($b < $r->[0]) { $r->[0]= $b; } elsif ($b > $r->[1]) { $r->[1]= $b; } $r } @array) }; }, moreutils_minmax => sub { my ( $min, $max ) = minmax(@array); }, }; __END__ Rate util_reduce util_min_max moreutils_m +inmax util_reduce 0.846/s -- -92% + -96% util_min_max 10.5/s 1146% -- + -45% moreutils_minmax 19.2/s 2169% 82% + -- [download] I added the `reduce` method because I did not know if it would perform better, but the result does not surprise me given the bunch of function calls that are involved. Anyway, this is what benchmarks are for :) Flavio perl -ple'$_=reverse' <<<ti.xittelop@oivalf Don't fool yourself.	[reply] [d/l] [select]
Re: minimum, maximum and average of a list of numbers at the same time by salva (Canon) on Nov 10, 2005 at 12:19 UTC
well, that algorithm could offer the best performance in assembler or C but I doubt it could make any difference in Perl. Actually... `use Benchmark qw(cmpthese); # insert your code here sub my_mma { my $r = shift; if (@$r) { my ($min, $max, $tot) = ($r->[0]) x 3; for (@$r[1..$#$r]) { $min = $_ if $_ < $min; $max = $_ if $_ > $max; $tot += $_; } return ($min, $max, $tot/@$r); } return (undef, undef, undef) } my @nums = map {rand} 1..1000; cmpthese(-1, { ita => sub { my @r = min_max_avg(\@nums) }, simple => sub { my @r = my_mma(\@nums) } })` [download] runs as... `Rate ita simple ita 222/s -- -50% simple 449/s 102% --` [download]	[reply] [d/l] [select]
Re^2: minimum, maximum and average of a list of numbers at the same time by xdg (Monsignor) on Nov 10, 2005 at 12:51 UTC
This is not really a comparison of the algorithm, but of how to write effective Perl. The algorithm is trying to reduce the number of comparisons from 2 per element to 3 per 2 elements. However, your code differs from the OP in a couple key ways that improve the efficiency. First, the OP dereferences the array pointer for each element access. Second, the OP winds up copying elements into temporary variables to work two at a time, whereas your code uses the nice aliasing behavior of `for` to avoid having to create new variables. In Perl the overhead of the extra comparisons are minor relative to the cost of creating new temporary variables. -xdg Code written by xdg and posted on PerlMonks is public domain. It is provided as is with no warranties, express or implied, of any kind. Posted code may not have been tested. Use of posted code is at your own risk.	[reply] [d/l]
Re^3: minimum, maximum and average of a list of numbers at the same time by BrowserUk (Patriarch) on Nov 10, 2005 at 15:23 UTC
He also isn't saving as many comparisons as he thinks. I've added some counts to the following version of his implementation of the algorithm and where using three passes on 100 elements you would expect 200 comparisions, his code does 198--for a saving of just 2--, but he also does 302 dereferences (which admitedly is traded for allocating stack space for the arg list) and a mod operation. Maybe the algorithm can be implemented more economically, but quite where the saving comes in escapes me. #! perl -slw use strict; my( $comps, $mods, $derefs ) = (0)x3; sub min_max_avg { my $ref = shift; my ( $min, $max, $agv, $i ); my ( $current_min, $current_max ); $comps++; $derefs++; $mods++; if ( @{$ref} % 2 == 0 ) { $comps++; $derefs +=4; ( $min, $max ) = $ref->[0] < $ref->[1] ? ( $ref->[0], $ref->[1] ) : ( $ref->[1], $ref->[0] ); $derefs += 2; $agv = $ref->[0] + $ref->[1]; $i = 2; } else { $derefs++; $min = $max = $agv = $ref->[0]; $i = 1; } while ( $i < @{$ref} and ++$comps ) { $comps++; $derefs += 4; ( $current_min, $current_max ) = $ref->[$i] < $ref->[ $i + 1 ] ? ( $ref->[$i], $ref->[ $i + 1 ] ) : ( $ref->[ $i + 1 ], $ref->[$i] ); $comps++; $min = $current_min if ( $current_min < $min ); $comps++; $max = $current_max if ( $current_max > $max ); $derefs += 2; $agv += $ref->[$i] + $ref->[ $i + 1 ]; $i += 2; } $derefs++; return ( $min, $max, $agv / @{$ref} ); } my( $min, $max, $ave ) = min_max_avg [ 1 .. 100 ];; print "$min $max $ave"; print "comps:$comps derefs: $derefs mods: $mods"; [download] Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error. Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal? "Science is about questioning the status quo. Questioning authority". In the absence of evidence, opinion is indistinguishable from prejudice.	[reply] [d/l]
Re^4: minimum, maximum and average of a list of numbers at the same time by LucaPette (Friar) on Nov 10, 2005 at 17:04 UTC
Re^5: minimum, maximum and average of a list of numbers at the same time by BrowserUk (Patriarch) on Nov 10, 2005 at 17:38 UTC
Some notes below your chosen depth have not been shown here
Re^3: minimum, maximum and average of a list of numbers at the same time by salva (Canon) on Nov 10, 2005 at 14:21 UTC
This is not really a comparison of the algorithm, but of how to write effective Perl. The algorithm is trying to reduce the number of comparisons from 2 per element to 3 per 2 elements. right, but my point is that trying to minimimize the number of comparisons in order to find the best algorithm is senseless when programing in perl where other operations like assignement or branching are equally (or more) expensive. Actually, I doubt that even when implemented in assembler or C this algorithm is faster than the simple one in all the moderm processors where comparisons are not specially expensive.	[reply]
Re^3: minimum, maximum and average of a list of numbers at the same time by polettix (Vicar) on Nov 10, 2005 at 15:10 UTC
Thank you xdg, you found some sense for my benchmark! It does not suffer for all the optimisations salva cleverly used - which shows that sometimes a worse coder can do a cleaner work :) Flavio perl -ple'$_=reverse' <<<ti.xittelop@oivalf Don't fool yourself.	[reply]
Re^3: minimum, maximum and average of a list of numbers at the same time by xdg (Monsignor) on Nov 10, 2005 at 16:32 UTC
I've taken a cut at trying to get as perlish as possible with algorithm, eliminating as much overhead as I could quickly think of (and keeping similar approaches on dereferencing, etc.). The simple approach is still faster in pure perl -- doing the array overhead manually with `splice` versus using `for` still swamps the comparision savings. Read more... (2 kB) `Rate algorithm simple algorithm 1517/s -- -30% simple 2177/s 44% --` [download] -xdg Code written by xdg and posted on PerlMonks is public domain. It is provided as is with no warranties, express or implied, of any kind. Posted code may not have been tested. Use of posted code is at your own risk.	[reply] [d/l] [select]
Re^3: minimum, maximum and average of a list of numbers at the same time by LucaPette (Friar) on Nov 10, 2005 at 13:05 UTC
thank you xdg, your reply helps me to understand what is the problem of my approach...	[reply]
Re^2: minimum, maximum and average of a list of numbers at the same time by polettix (Vicar) on Nov 10, 2005 at 12:28 UTC
I had similar results even with longer arrays: #!/usr/bin/perl use strict; use warnings; use Benchmark qw( cmpthese ); my @array = map { int rand 100 } 1 .. 1_000_000; cmpthese -10, { op => sub { my ($min, $max, $avg) = min_max_avg(\@array); }, simple => sub { my ($min, $max, $avg) = min_max_avg_simple(\@array); }, }; sub min_max_avg_simple { my $ref = shift; my $n_elements = scalar @$ref; return unless $n_elements > 0; my ( $min, $max, $sum ) = ($ref->[0]) x 3; my $i = 1; # Start from next element while ($i < $n_elements) { my $value = $ref->[$i]; if ($value < $min) { $min = $value } elsif ($value > $max) { $max = $value } $sum += $value; ++$i; } return ( $min, $max, $sum / $n_elements ); } sub min_max_avg { my $ref = shift; my ( $min, $max, $agv, $i ); my ( $current_min, $current_max ); if ( @{$ref} % 2 == 0 ) { ( $min, $max ) = $ref->[0] < $ref->[1] ? ( $ref->[0], $ref->[1] ) : ( $ref->[1], $ref->[0] ); $agv = $ref->[0] + $ref->[1]; $i = 2; } else { $min = $max = $agv = $ref->[0]; $i = 1; } while ( $i < @{$ref} ) { ( $current_min, $current_max ) = $ref->[$i] < $ref->[ $i + 1 ] ? ( $ref->[$i], $ref->[ $i + 1 ] ) : ( $ref->[ $i + 1 ], $ref->[$i] ); $min = $current_min if ( $current_min < $min ); $max = $current_max if ( $current_max > $max ); $agv += $ref->[$i] + $ref->[ $i + 1 ]; $i += 2; } return ( $min, $max, $agv / @{$ref} ); } __END__ s/iter op simple op 1.13 -- -37% simple 0.707 60% -- [download] Next time I'll wait a bit more before coding :) Flavio perl -ple'$_=reverse' <<<ti.xittelop@oivalf Don't fool yourself.	[reply] [d/l]
Re: minimum, maximum and average of a list of numbers at the same time by rev_1318 (Chaplain) on Nov 10, 2005 at 16:22 UTC
A lot of alternatives have been provided, but I would like to add one more: `# no error checking... sub min_max_avg { my ($min, $max) = (sort {$a <=> $b} @_)[0,-1]; my $total; $total += $_ for @_; return ($min, $max, $total/@_) }` [download] Paul	[reply] [d/l]
Re^2: minimum, maximum and average of a list of numbers at the same time by Aristotle (Chancellor) on Nov 10, 2005 at 20:26 UTC
The logical conclusion: `use List::Util qw( sum ); sub min_max_avg { my ( $min, $max ) = ( sort { $a <=> $b } @_ )[ 0, -1 ]; return ( $min, $max, sum @_ ); }` [download] Makeshifts last the longest.	[reply] [d/l]
Re^3: minimum, maximum and average of a list of numbers at the same time by BrowserUk (Patriarch) on Nov 10, 2005 at 20:56 UTC
How is a O(NlogN) substitute for a O(2N) operation the "logical conclusion" of a thread that started out attempting to reduce the latter to O(3N/2)? Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error. Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal? "Science is about questioning the status quo. Questioning authority". In the absence of evidence, opinion is indistinguishable from prejudice.	[reply] [d/l]
Re^4: minimum, maximum and average of a list of numbers at the same time by Aristotle (Chancellor) on Nov 10, 2005 at 21:57 UTC
Re^2: minimum, maximum and average of a list of numbers at the same time by salva (Canon) on Nov 10, 2005 at 20:32 UTC
using sort is in general a very bad idea because it has O(NlogN) complexity while the other algorithms proposed are just O(N). Though, for small lists it could be faster.	[reply]


Your skill will accomplish what the force of many cannot
	PerlMonks