note
marioroy
<p>Hi [Laurent_R] and fellow monks,</p>
<p>Nice speedup! I applied 3 optimizations in preparation for a follow-up post combining caching with parallelization. This was done because File::Map adds overhead (i.e. it will not be as fast as an array). So, I asked myself can anything be done to further improve the serial implementation. And if yes, is it helpful. I provide the run time for each stage at the end of this post. It turns out that improvements were possible and will be converting collatz3_d.pl to use File::Map for the [id://11115544|parallel demonstration].</p>
<p>Update: Further improvements, Step 4.</p>
<p>1. Replaced division by 2.</p>
<code>
$n >> 1;
</code>
<p>2. Removed one level of branching.</p>
<code>
while ($n != 1) {
$result += $cache[$n], last
if defined $cache[$n];
my $new_n = $n % 2 ? 3 * $n + 1 : $n >> 1;
$result++;
$cache[$n] = $cache[$new_n] + 1
if defined $cache[$new_n] and $n < $max;
$n = $new_n;
}
</code>
<p>3. Then reduced the number of loop iterations. Credit for reducing the # of loop iterations was from watching [https://www.youtube.com/watch?v=t1I9uHF9X5Y|Notation and compressed dynamics], one minute into it (i.e. the T(x) notation).</p>
<code>
while ($n != 1) {
$result += $cache[$n], last
if defined $cache[$n];
$n % 2 ? ( $result += 2, $new_n = (3 * $n + 1) >> 1 )
: ( $result += 1, $new_n = $n >> 1 );
$cache[$n] = $cache[$new_n] + ($n % 2 ? 2 : 1)
if defined $cache[$new_n] and $n < $max;
$n = $new_n;
}
</code>
<p>4. Finally, less caching.</p>
<code>
while ($n != 1) {
$result += $cache[$n], last
if defined $cache[$n];
$n % 2 ? ( $result += 2, $new_n = (3 * $n + 1) >> 1 )
: ( $result += 1, $new_n = $n >> 1 );
$n = $new_n;
}
</code>
<p>The final code optionally takes an argument.</p>
<code>
#!/usr/bin/perl
use strict;
use warnings;
use feature qw/say/;
my $max = shift || 1e6;
$max = 1e6 if $max < 1e6;
my @cache = (0, 1, 2);
sub collatz_seq {
my $input = shift;
my $n = $input;
my $result = 0;
my $new_n;
while ($n != 1) {
$result += $cache[$n], last
if defined $cache[$n];
$n % 2 ? ( $result += 2, $new_n = (3 * $n + 1) >> 1 )
: ( $result += 1, $new_n = $n >> 1 );
$n = $new_n;
}
$cache[$input] = $result if $input < $max;
return $result;
}
my @long_seqs;
for my $num (1..$max) {
my $seq_length = collatz_seq $num;
push @long_seqs, [ $num, $seq_length ] if $seq_length > 400;
}
@long_seqs = sort { $b->[1] <=> $a->[1]} @long_seqs;
say "$_->[0]: $_->[1]" for @long_seqs[0..19];
</code>
<p><b>Results:</b></p>
<code>
$ time perl collatz3_a.pl 1e7
# Intel i7 laptop, Docker Container, Ubuntu + Perlbrew Perl 5.30.1
collatz3_a.pl 1e7 32.291s (a) original code, accepts argument
collatz3_b.pl 1e7 30.134s (b) a + replaced division with >> 1
collatz3_c.pl 1e7 28.503s (c) b + removed 1 level of branching
collatz3_d.pl 1e7 21.464s (d) c + reduced loop iterations
collatz3_e.pl 1e7 19.357s (e) d + less caching
# AMD 3970x, Docker Container, Ubuntu + Perlbrew Perl 5.30.1
collatz3_a.pl 1e7 13.130s (a) original code, accepts argument
collatz3_b.pl 1e7 12.394s (b) a + replaced division with >> 1
collatz3_c.pl 1e7 12.261s (c) b + removed 1 level of branching
collatz3_d.pl 1e7 9.170s (d) c + reduced loop iterations
collatz3_e.pl 1e7 7.681s (e) d + less caching
8400511: 686
8865705: 668
6649279: 665
9973919: 663
6674175: 621
7332399: 616
7532665: 616
5649499: 613
8474249: 611
6355687: 608
8847225: 606
9533531: 606
6635419: 603
9953129: 601
7464846: 598
7464847: 598
3732423: 597
5598635: 595
8397953: 593
6298465: 590
</code>
<p>Regards, Mario</p>
11115088
11115520