http://qs321.pair.com?node_id=549639

punklrokk has asked for the wisdom of the Perl Monks concerning the following question:

Hello Monks! I am trying to optimize my code for the Miller-Rabin algorithm. (Finds prime numbers.) My friend did it in Java, and it took him 6 minutes, 18 seconds. (For his algorithm to run.) Mine looks like it'll take 2000 min, about 2 minutes per number; although in java. With out explaining the code too much, here's my question:
my $tempA = Math::BigInt->new($a); $b = $tempA->bmodpow($M,$n); #set b = a^m
How can I figure out a better, either more efficient, or just faster running function, (does one exist?) to do this modular exponentiation? I am going to try sqare and multiply, but I'm wondering if the monks have a better way to do what I'm trying.

Disclaimer: This is a homework related question, but I have already completed the assignment, and I want to learn how to better optimize this algorithm. It's also a crypto course, not a Perl, or programming course. I've chosen this course as an opportunity to learn Perl.
use Math::BigInt; #allows big integers needed for this algorithm $x=$k=0; #$M=$ARGV[0]-1; #$n=$ARGV[0]; #testing variable $data_file="primelist.txt"; #imports known primes (91000-93000) open(DAT, $data_file) || die("Could not open file!"); @raw_data=<DAT>; close(DAT); foreach $line(@raw_data) { #puts text file into a stepped array @prime $tempPrime=$line; @prime = split(/:/, $line); } # foreach $line(@prime) { #makes sure primes are loaded in array # print "$line\n"; # } for ($n=91001; $n<93000; $n++) { $counter=0; $M=$n-1; print "\$n: $n\n"; while ($M%2 != 1) { # Loop to calculate $M and $k (2^k*M) $M=$M/2; $k+=1; #print "K=$k, M=$M\n"; } for($a = 2; $a<($n-1);$a++){ my $tempA = Math::BigInt->new($a); $b = $tempA->bmodpow($M,$n); #set b = a^m mod n #print "\$b=$b\n"; #tells us what b is if ($b==1) { #test 1 (does b=1?) #print "$n: Prime (test1)\n"; $counter++; } else { for ($i=0; $i<$k; $i++) { if ($b==($n-1)) { #print "\$b-temp=$b\n"; $counter++; #print "$n is a prime (test2)\n"; last; #exits loop upon $b==1 } else { $tempB = Math::BigInt->new($b); $b = $tempB->bmodpow(2,$n); #print "$b\n"; #print "$n is composite\n"; } } + #print "\$a\=: $a\n"; } } $ticker[$n]=$counter; #Keep track of number of false positives $percentage[$n]=$counter/$n; #Figure out percentage print "Counter: $ticker[$n]\n"; print "Percentage: $percentage[$n]\n"; $n++; }
JP Bourget (punklrokk) MS Information and Security Rochester Institute of Technology Rochester, NY

Replies are listed 'Best First'.
Re: optimizing the miller-rabin algorithm
by rhesa (Vicar) on May 16, 2006 at 03:41 UTC
    The way I read the algorithm at http://en.wikipedia.org/wiki/Miller-Rabin_primality_test, you're looping too much in your middle loop. I believe you're supposed to choose a fixed accuracy k, and then a random set of k bases. You're using *all* numbers less than n as a base, and that is way too strong, especially for very large numbers, for which this algorithm is designed. If you're looping over all numbers less than n, you might as well do trial division :-)

    I've tried to implement the algorithm as described on Wikipedia, and for a modest accuracy (say 25), it runs in less than 30 seconds on your entire number range.

    Note that I'm using Math::Pari to verify the next prime number. If you don't have it installed, leave it out and do the check in some other way (against your file of primes, for instance).

    I also have the GMP library installed, and use that with Math::BigInt. For small numbers (which 91000-93000 are), this may not have a big benefit.

      The way I read the algorithm ..., you're looping too much in your middle loop. I believe you're supposed to choose a fixed accuracy k, and then a random set of k bases. You're using *all* numbers less than n as a base, and that is way too strong, especially for very large numbers, for which this algorithm is designed.

      Yes, that's pretty much right. Furthermore, for any number less than 341550071728322, if the Miller-Rabin test doesn't return "composite" for any of the bases 2, 3, 5, 7, 11, 13 and 17, then the number in question is proven to be prime. For numbers less than 1373653 it is sufficient to test only for bases 2 and 3. See http://primes.utm.edu/prove/prove2_3.html

      For large numbers Math::BigInt's bmodpow (pure perl) function is just way too slow. You'll be wanting to use something like Math::Pari or Math::GMP (as rhesa has suggested).

      Cheers,
      Rob
Re: optimizing the miller-rabin algorithm
by GrandFather (Saint) on May 16, 2006 at 01:46 UTC

    I'm not sure I can help with your question. But you do say you are doing this in part to learn Perl, so I'll make a few (kindly meant) comments about your code.

    First, always add use strict; use warnings; to your code. They will save you much grief and tearing of hair

    Avoid C style for loops. For example replace

    for (my $n=91001; $n<93000; $n++)

    with:

    for my $n (91001..93000)

    Do not use $a and $b as general purpose variables. They are magical and are used in the context of sort.


    DWIM is Perl's answer to Gödel
      If I use your version of a For Loop, how do I tell it to iterate two steps instead of one?

      JP Bourget (punklrokk) MS Information and Security Rochester Institute of Technology Rochester, NY

        Either use the C style for loop in that case or, if appropriate, do something like for ($start/2..$end/2) with $start and $end both even (but that's rather clunky!).

        The Perl style for is very good for iterating over lists so you can do something like for (1.5, 2..4, @moreNumbers). Generally you should think Perl style for first, then C style if you really need it.


        DWIM is Perl's answer to Gödel

        You could do

        for (0..(5-1)/2) { # 1..5, step 2 my $i = $_*2 + 1; ... }

        but you're probably better off falling back to a C-style loop in that case.

        As an aside, the above trick is particularly useful when dealing with floats.

        >perl -le "for (my $i=0; $i<10; $i+=0.1) { print $i }" ... 8.79999999999998 8.89999999999998 8.99999999999998 9.09999999999998 9.19999999999998 9.29999999999998 9.39999999999998 9.49999999999998 9.59999999999998 9.69999999999998 9.79999999999998 9.89999999999998 9.99999999999998 >perl -le "for (0..99) { my $i = $_ / 10; print $i }" ... 8.8 8.9 9 9.1 9.2 9.3 9.4 9.5 9.6 9.7 9.8 9.9
Re: optimizing the miller-rabin algorithm
by ForgotPasswordAgain (Priest) on May 16, 2006 at 10:54 UTC
    For any kind of optimization, you need to profile the code, for example with Devel::DProf.
      ForgotPasswordAgain,
      This advice reeks of cargo-cult mantra chanting. If I was doing a linear search on a sorted list, would you need to profile in order to know that a binary search would be faster? Pure algorithm analysis doesn't require machine profiling. Implementation specific details may distort things where it is warranted. Profiling is always a good rule of thumb, but it isn't always necessary. Especially when the algorithm being implemented is well known and studied. Sometimes it is just a matter of discovering you have implemented it incorrectly.

      Cheers - L~R