http://qs321.pair.com?node_id=1130278


in reply to The sieve of Xuedong Luo (Algorithm3) for generating prime numbers

Again, thanks for writing this up. Perl is indeed fun. MCE + Inline::C is pretty cool.

As I mentioned last year, counting primes can be much faster than sieving them. Using MCE with MPU's prime_count is counter-productive as it is doing much more work than a single thread calling it once. The analagous thing for primesieve is Kim Walisch's primecount -- also vastly faster than primesieve for counting. It's very slightly slower than MPU on a single thread, but its target is multiple threads where it cleans up. Times are on my MBP 2.2GHz (should be very similar to yours, just clocked lower). The only change is using the github version of Math::Prime::Util and changing bin/primeutil.pl to use the two new functions.

perl bin/algorithm3.pl 1_000_000_000 Prime numbers : 50847534 Compute time : 0.163 sec perl bin/primesieve.pl 1_000_000_000 Prime numbers : 50847534 Compute time : 0.071 sec perl bin/primeutil.pl 1_000_000_000 Prime numbers : 50847534 Compute time : 0.025 sec perl -Mntheory=:all -E 'say prime_count(1e9)' 50847534 0.020s user time
This really shows up when looking at larger sizes. 1e11 takes only 0.038s with prime_count, but the MCE version takes much longer. The single last thread is doing almost twice the work of a single call for the entire answer. Neither one is doing any sieving.

I added a simple sum_primes and print_primes yesterday. Not on CPAN yet. It helps quite a bit:

perl bin/algorithm3.pl 4_294_967_296 --sum Sum of primes : 425649736193687430 Compute time : 1.222 sec perl bin/primesieve.pl 4_294_967_296 --sum Sum of primes : 425649736193687430 Compute time : 0.431 sec perl bin/primeutil.pl 4_294_967_296 --sum Sum of primes : 425649736193687430 Compute time : 0.873 sec perl -Mntheory=:all -E 'say sum_primes(2**32)' 425649736193687430 4.627s user time perl -Mntheory=:all -E 'my $s=0; forprimes { $s+=$_ } 2**32; say $s' 425649736193687430 11.198s user time

The downside is that it a very specialized function, while forprimes { ... } $low,$high is very generic. There is also vecsum over primes, but that has even more overhead.

perl bin/algorithm3.pl 4_294_967_296 --print >/dev/null Compute time : 3.279 sec perl bin/primesieve.pl 4_294_967_296 --print >/dev/null Compute time : 3.126 sec perl bin/primeutil.pl 4_294_967_296 --print >/dev/null Compute time : 3.279 sec perl -Mntheory=:all -E 'print_primes(2**32)' >/dev/null 8.149 user time perl -Mntheory=:all -E 'forprimes { say } 2**32' 80.9s user time primesieve -p1 2**32 >/dev/null 172.9s user time

Printing is much faster. Your code was better than the simple forprimes here of course, but now it will quickly print primes in a range to a file descriptor. It looks like our print timing is different -- maybe a difference in the temp file speed. primesieve's print time is excessive on this machine. I rebuilt it with a printf unsigned long and it went down to 28 seconds. It didn't change the MCE time since you're doing your own efficient printing.

One of these days I'll invert my code: rip out all the C code and put it in a standalone library for use by anything, then have ntheory (MPU) link to it. Then the MCE + Inline::C could get the high performance without mucking with XS.