note
marioroy
<p>Hi [rjt] and fellow Monks,</p>
<p>I updated the parallel demonstrations [id://11115544|here] and [id://11115780|here] to ensure orderly output, plus a cache-miss update for the parallel iM71 version. I then captured results for 1e8. Note that the parallel runs involve File::Map, pack, and unpack, and that the Inline::C runs compile the C code on the first run.</p>
<p>Testing was done in a Docker container running Ubuntu 18.04.x and Perl 5.30.1, on a Windows 10 host. The hardware is an AMD 3970X box (32 cores, SMT disabled).</p>
<p><b>1e8 Output:</b></p>
<code>
Collatz(63728127) has sequence length of 950 steps
Collatz(95592191) has sequence length of 948 steps
Collatz(96883183) has sequence length of 811 steps
Collatz(86010015) has sequence length of 798 steps
Collatz(98110761) has sequence length of 749 steps
Collatz(73583070) has sequence length of 746 steps
Collatz(73583071) has sequence length of 746 steps
Collatz(36791535) has sequence length of 745 steps
Collatz(55187303) has sequence length of 743 steps
Collatz(56924955) has sequence length of 743 steps
Collatz(82780955) has sequence length of 741 steps
Collatz(85387433) has sequence length of 741 steps
Collatz(63101607) has sequence length of 738 steps
Collatz(64040575) has sequence length of 738 steps
Collatz(93128574) has sequence length of 736 steps
Collatz(93128575) has sequence length of 736 steps
Collatz(94652411) has sequence length of 736 steps
Collatz(96060863) has sequence length of 736 steps
Collatz(46564287) has sequence length of 735 steps
Collatz(69846431) has sequence length of 733 steps
</code>
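<p>For readers who want to spot-check a line of the output above: the reported lengths count every term of the sequence, including the starting number and the final 1. Below is a minimal C sketch of that count. The function name <c>collatz_length</c> is hypothetical and this is not the code from the linked nodes; it has no caching and assumes every intermediate value fits in 64 bits, which holds for starting numbers up to 1e8.</p>

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not taken from the linked nodes.
   Returns the number of terms in the Collatz sequence that starts at n,
   counting both n itself and the final 1.
   Assumption: 3*n + 1 never overflows 64 bits for the inputs used. */
uint32_t collatz_length(uint64_t n)
{
    assert(n >= 1);                 /* the sequence is defined for n >= 1 */

    uint32_t terms = 1;             /* the starting value is the first term */
    while (n != 1) {
        n = (n & 1) ? 3 * n + 1     /* odd:  3n + 1 */
                    : n >> 1;       /* even: n / 2  */
        terms++;
    }
    return terms;
}
```

<p>Called as <c>collatz_length(63728127)</c>, this should return 950, matching the first line of the 1e8 output.</p>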
<p><b>Performance:</b></p>
<code>
1e8: parallel, 32 cores (File::Map, pack, unpack):
https://www.perlmonks.org/?node_id=11115544
https://www.perlmonks.org/?node_id=11115780

  Laurent + updates      3.474s
  iM71 + updates         2.701s
  Step counting in C     1.654s

1e8: parallel, 16 cores

  Laurent + updates      6.219s
  iM71 + updates         4.787s
  Step counting in C     2.793s

1e8: parallel, 8 cores

  Laurent + updates     12.061s
  iM71 + updates         9.200s
  Step counting in C     5.258s

1e8: parallel, 4 cores

  Laurent + updates     23.615s
  iM71 + updates        17.935s
  Step counting in C    10.056s

1e8: parallel, 2 cores

  Laurent + updates     46.146s
  iM71 + updates        34.342s
  Step counting in C    19.084s

1e8: non-parallel (Array):
https://www.perlmonks.org/?node_id=11115841

  Laurent + updates     53.961s
  iM71 + updates        48.673s
  Step counting in C    19.023s
</code>
<p>The parallel output now matches the serial output for sequences with an equal number of steps (i.e. the smaller starting number is listed first).</p>
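<p>The ordering in question, as the 1e8 output suggests, is longest sequence first with ties broken by the smaller starting number. Below is a minimal C sketch of such a comparator. The names <c>collatz_result</c>, <c>by_length_then_start</c>, and <c>sort_results</c> are hypothetical, and this is an illustration of the ordering rule rather than the merge logic used in the linked parallel demonstrations.</p>

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical record: one starting number and its sequence length. */
typedef struct {
    uint32_t start;
    uint32_t length;
} collatz_result;

/* qsort comparator: longer sequences sort first; when lengths are equal,
   the smaller starting number sorts first. */
int by_length_then_start(const void *a, const void *b)
{
    const collatz_result *x = a;
    const collatz_result *y = b;

    if (x->length != y->length)
        return (x->length < y->length) ? 1 : -1;   /* descending length */
    if (x->start != y->start)
        return (x->start < y->start) ? -1 : 1;     /* ascending start   */
    return 0;
}

/* Sorts count results in place using the ordering above. */
void sort_results(collatz_result *results, size_t count)
{
    assert(results != NULL || count == 0);
    if (count > 1)
        qsort(results, count, sizeof *results, by_length_then_start);
}
```

<p>Because the comparator never reports two distinct starting numbers as equal, the result does not depend on the order in which workers deliver their chunks, which is the property needed for parallel and serial runs to print the same list.</p>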
<p>Regards, Mario</p>
11115088
11115844