in reply to Optimizing with Caching vs. Parallelizing (MCE::Map)

Great discussions! I'm sorry I missed all this; I haven't had much time to visit here recently.

I'm the one who contributed the task, so I have some interest in this beyond my own personal fascination with the Collatz sequence.

My own solution uses memoization and a couple of other optimizations. The full code is at, and my "review" of my solution is at That page also contains links to, and reviews of, every solution submitted by the participants in the challenge that week. None of them used MCE::Map, which is a shame!

Data structures and bookkeeping were key to good performance on this one, for me:

my @seqlen = (-1,1); # Memoize sequence length my $top = 20; # Report this many of the top sequences my @top = [ -1,-1 ]; # Top $top sequences my $upper = 1e6; # Upper limit starting term my $mintop = 0; # Lowest value in @top GetOptions('top=i' => \$top, 'upper=i' => \$upper);

Here is the main loop:

for (my $start = 3; $start < $upper; $start += 2) { my ($n, $len) = ($start, 0); while (! defined $seqlen[$n]) { $len += 1 + $n % 2; $n = $n % 2 ? (3*$n + 1)/2 : $n / 2; } $len += $seqlen[$n]; $seqlen[$start] = $len if $start < $upper * 2; # Cache top($start => $len) if $len > $mintop and $start < += $upper; top($n * 2 => $seqlen[$n] + 1) if $n < $upper/2 and $seqlen[$n] +> $mintop; }

I avoid function call overhead by making sure everything is done iteratively. As another optimization, instead of simply doing 3n+1 for odd numbers, I do 3n+1/2, and increment the sequence length by two instead of one. And finally, I'm able to skip even-numbered starts with a little creative arithmetic with my second call to top().

I decided to ask people to output the top 20, because that presents an interesting mini-challenge by itself. Maintaining it naively by calling sort on a million elements at the end takes longer than the above loop, and sorting a 20-item list repeatedly is even worse. Maintaining essentially a priority queue is much faster:

sub top { my ($n, $len) = @_; my $idx = first { $top[$_][1] < $len } 0..$#top; splice @top, $idx, 0, [ $n, $len ]; pop @top if @top > $top; $mintop = $top[-1][1]; }

The above sub is O(n), so it's not as good as a heap implementation, but it's only called when there is definitely a new element to be inserted, thanks to a bit of bookkeeping in $mintop, so I opted to keep it simple.


real 0m0.848s user 0m0.835s sys 0m0.012s

Purely for crude CPU comparison purposes, Laurent's solution in Re: Optimizing with Caching vs. Parallelizing (MCE::Map) runs in 1.57 sec on the same (virtual) machine.

use strict; use warnings; omitted for brevity.