Re: Optimizing with Caching vs. Parallelizing (MCE::Map)

in reply to Optimizing with Caching vs. Parallelizing (MCE::Map)

Great discussions! I'm sorry I missed all this; I haven't had much time to visit here recently.

I'm the one who contributed the task, so I have some interest in this beyond my own personal fascination with the Collatz sequence.

My own solution uses memoization and a couple of other optimizations. The full code is at https://github.com/manwar/perlweeklychallenge-club/blob/master/challenge-054/ryan-thompson/perl/ch-2.pl, and my "review" of my solution is at https://perlweeklychallenge.org/blog/review-challenge-054/#ryan-thompson2. That page also contains links to, and reviews of, every solution submitted by the participants in the challenge that week. None of them used MCE::Map, which is a shame!

Data structures and bookkeeping were key to good performance on this one, for me:

my @seqlen = (-1,1);    # Memoize sequence length
my $top    = 20;        # Report this many of the top sequences
my @top    = [ -1,-1 ]; # Top $top sequences
my $upper  = 1e6;       # Upper limit starting term
my $mintop = 0;         # Lowest value in @top

GetOptions('top=i' => \$top, 'upper=i' => \$upper);
[download]

Here is the main loop:

for (my $start = 3; $start < $upper; $start += 2) {
    my ($n, $len) = ($start, 0);
    while (! defined $seqlen[$n]) {
        $len += 1 + $n % 2;
        $n = $n % 2 ? (3*$n + 1)/2 : $n / 2;
    }
    $len += $seqlen[$n];
    $seqlen[$start] = $len if $start < $upper * 2; # Cache
    top($start => $len)            if $len > $mintop  and     $start <
+= $upper;
    top($n * 2 => $seqlen[$n] + 1) if   $n < $upper/2 and $seqlen[$n] 
+> $mintop;
}
[download]

I avoid function call overhead by making sure everything is done iteratively. As another optimization, instead of simply doing 3n+1 for odd numbers, I do 3n+1/2, and increment the sequence length by two instead of one. And finally, I'm able to skip even-numbered starts with a little creative arithmetic with my second call to top().

I decided to ask people to output the top 20, because that presents an interesting mini-challenge by itself. Maintaining it naively by calling sort on a million elements at the end takes longer than the above loop, and sorting a 20-item list repeatedly is even worse. Maintaining essentially a priority queue is much faster:

sub top {
    my ($n, $len) = @_;

    my $idx = first { $top[$_][1] < $len } 0..$#top;
    splice @top, $idx, 0, [ $n, $len ];

    pop @top if @top > $top;
    $mintop = $top[-1][1];
}
[download]

The above sub is O(n), so it's not as good as a heap implementation, but it's only called when there is definitely a new element to be inserted, thanks to a bit of bookkeeping in $mintop, so I opted to keep it simple.

Performance:

real    0m0.848s
user    0m0.835s
sys     0m0.012s
[download]

Purely for crude CPU comparison purposes, Laurent's solution in Re: Optimizing with Caching vs. Parallelizing (MCE::Map) runs in 1.57 sec on the same (virtual) machine.

use strict; use warnings; omitted for brevity.

In Section Meditations