http://qs321.pair.com?node_id=936209

moodywoody has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks,

I was wondering whether there is any general guidance on the performance impact of using bigint?

I was putting together some snippets for projecteuler.net (yes, I was bored and uninspired). The 14th problem asks for the length of Collatz chains: http://projecteuler.net/problem=14. Anyhow, I wrote this up: http://pastebin.com/jbYBFiRZ

Or rather, at first I had one extra line at the top: "use bigint;". Removing that line improved the performance from ~10 min to 6 sec, i.e. by a factor of roughly 100.

Am I missing something or is bigint incredibly slow? Thank you for your insights.

Replies are listed 'Best First'.
Re: bigint == horrible performance?
by syphilis (Archbishop) on Nov 06, 2011 at 01:43 UTC
    Yes, bigint operations are comparatively slow - don't use bigint if you don't need the extended precision.

    See the "l, lib, try or only" and ensuing sections of the bigint docs for ways of getting vastly improved performance out of bigint on those occasions when you *do* need the extended precision.
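    As a minimal sketch of what that section of the docs describes, you can ask bigint to use a faster XS backend at import time (this assumes Math::BigInt::GMP is available; "try" falls back silently if it is not):

    ```perl
    #!/usr/bin/perl
    use strict;
    use warnings;

    # "try" asks bigint for the (much faster) GMP backend if it is
    # installed, and silently falls back to the pure-Perl backend if
    # not; "lib" warns on fallback, and "only" dies instead.
    use bigint try => 'GMP';

    my $n = 2**100;                 # transparently a Math::BigInt
    print $n + 1, "\n";             # prints 1267650600228229401496703205377
    ```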

    Cheers,
    Rob
Re: bigint == horrible performance?
by HAL9000 (Initiate) on Jun 12, 2017 at 04:50 UTC

    For a program I wrote to find Euler bricks (rectangular solids where all three edges and all three face diagonals are integers), the run time for longest edge < 2^32 and shortest edge < 2^15.5 = 46341 was 10.8s without bigint and 549.4s with bigint. That's a 51X slowdown. However, I believe the cost is mainly in multiplication rather than addition. I noticed that the largest multiplier in my program was 3, so I tried replacing 2*$a with $a+$a and 3*$a with $a+$a+$a. The runtime was then 24.8s, only a 2.3X slowdown. I can live with that, even if it gets somewhat worse for larger numbers, in order to guarantee correctness.

    Specifically, I rewrote:

    # Now recursively generate the ternary tree.
    # (Note that the terms are the same in each branch except for +sign.)
    # Branch 1 (+-+):
    $i =   $a - 2*$b + 2*$h;
    $j = 2*$a -   $b + 2*$h;
    $k = 2*$a - 2*$b + 3*$h;
    &generate($i,$j,$k);
    # Branch 2 (+++):
    $i =   $a + 2*$b + 2*$h;
    $j = 2*$a +   $b + 2*$h;
    $k = 2*$a + 2*$b + 3*$h;
    &generate($i,$j,$k);
    # Branch 3 (-++):
    $i =  -$a + 2*$b + 2*$h;
    $j = -2*$a +  $b + 2*$h;
    $k = -2*$a + 2*$b + 3*$h;
    &generate($i,$j,$k);
    to be:
    # Now recursively generate the ternary tree.
    # (Note that the terms are the same in each branch except for +sign.)
    # Code to test whether bigint is mainly slowed by multiplies.
    $a2 = $a + $a;
    $b2 = $b + $b;
    $h2 = $h + $h;
    $h3 = $h + $h2;
    # Branch 1 (+-+):
    $i =  $a - $b2 + $h2;
    $j = $a2 -  $b + $h2;
    $k = $a2 - $b2 + $h3;
    &generate($i,$j,$k);
    # Branch 2 (+++):
    $i =  $a + $b2 + $h2;
    $j = $a2 +  $b + $h2;
    $k = $a2 + $b2 + $h3;
    &generate($i,$j,$k);
    # Branch 3 (-++):
    $i =  -$a + $b2 + $h2;
    $j = -$a2 +  $b + $h2;
    $k = -$a2 + $b2 + $h3;
    &generate($i,$j,$k);
    Replacing the multiplies with adds (and reducing the number of redundant adds) gave a 22X speedup.
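    A quick way to measure this trade-off on your own build, using the core Benchmark module, might look like the following (a sketch; the relative speeds depend on which Math::BigInt backend is loaded):

    ```perl
    #!/usr/bin/perl
    use strict;
    use warnings;
    use Math::BigInt;
    use Benchmark qw(cmpthese);

    # Compare 3*$x against $x+$x+$x on a bigint value; both produce
    # the same result, only the speed differs.
    my $x = Math::BigInt->new('123456789123456789');

    cmpthese( -1, {
        multiply => sub { my $r = 3 * $x },
        add      => sub { my $r = $x + $x + $x },
    } );
    ```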

Re: bigint == horrible performance?
by Khen1950fx (Canon) on Nov 06, 2011 at 10:05 UTC
    Instead of using bigint, use Math::NumSeq::CollatzSteps and Google. I googled #14 and found the answer: 837799. Double-check it:
    #!/usr/bin/perl -slw
    use strict;
    use warnings;
    use Math::NumSeq::CollatzSteps;

    my $seq = Math::NumSeq::CollatzSteps->new(
        step_type => 'both',
    );
    my $i     = 837799;
    my $value = $seq->ith($i);
    print $value;
    Returns 524 steps.

      It's not a hard problem to solve:

      #! perl -slw
      use strict;
      use Time::HiRes qw[ time ];

      my $start = time;
      my @res;
      for ( 1 .. 1e6 ) {
          my $c = 0;
          my $n = $_;
          while( $n > 1 ) {
              ++$c;
              $n = $n & 1 ? 3 * $n + 1 : $n / 2;
              if( defined $res[ $n ] ) {
                  $c += $res[ $n ];
                  last;
              }
          }
          $res[ $_ ] = $c;
      }
      printf "Took %.6f seconds\n", time() - $start;

      my( $n, $i ) = 0;
      $n < $res[ $_ ] and ( $n, $i ) = ( $res[ $_ ], $_ ) for 1 .. $#res;
      print "longest sequence is $n steps starting with $i";
      __END__
      c:\test>collatz.pl
      Took 2.000000 seconds
      longest sequence is 524 steps starting with 837799

      And the OP gave a link to his own 3 second solution, though I'm not sure why he pastebin'd it rather than posting it here.

      The intriguing thing is why the OP felt the need for bigint in the first place given that using Perl's native scalars he could calculate this(*) for numbers up to 2,251,799,813,685,248 before running into precision problems.

      (*)Assuming of course he had 400+ years and the exabytes of ram required to store the hash :)
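      That cutoff, 2,251,799,813,685,248, is 2**51; the underlying limit is the 53-bit mantissa of a double, since 3n+1 stays exactly representable as long as it is at most 2**53. A quick core-only check of where floating-point integer arithmetic stops being exact (assuming IEEE-754 double NVs; a perl built with 64-bit IVs keeps pure-integer arithmetic exact up to 2**63 - 1):

      ```perl
      #!/usr/bin/perl
      use strict;
      use warnings;

      # With IEEE-754 double NVs (53-bit mantissa), 2**53 is the first
      # integer whose successor is not representable: adding 1 is lost.
      my $f = 2.0**53;                  # force floating-point arithmetic
      print "2**53 + 1 == 2**53 in doubles\n" if $f + 1 == $f;
      print "2**53 - 1 is still distinct\n"   if $f - 1 != $f;

      # How wide are this perl's native integers?
      use Config;
      print "ivsize: $Config{ivsize} bytes\n";
      ```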


        So I've bumped into some of this lately: XS extensions written with no particular expectation of whether you're on an actual 64-bit platform or not. Perl's own docs even suggest building in 32-bit mode on a 64-bit architecture, on the grounds that 64-bit integers "will rarely be actually required" (a rough quote).

        So now, in the XS world, I've found myself facing a couple of things: 64 bit stuff and ipv6 (128 bits).

        In the XS world, Math::Int64 works quite nicely. However, I get way too many explosions with Math::Int128 at this point. So I end up using packed arrays and timely casting, while hopefully not misplacing network byte-order.

        32-bit (and, I guess, 64-bit?) perl has unsigned integer, signed integer, 'double', and 'string' scalar types (i.e. UV, IV, NV, and PV). There is no native perl guarantee of 64-bit or 128-bit integers other than as a string.
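        A core-only sketch of the packed-array fallback: a 64-bit quantity can be carried as two 32-bit halves using pack's "N" template, which is big-endian u32, i.e. network byte order (the single-value "Q>" template needs a 64-bit perl, "N N" does not):

        ```perl
        #!/usr/bin/perl
        use strict;
        use warnings;

        # Carry a 64-bit value as two 32-bit halves in network byte
        # order; works identically on 32-bit and 64-bit perls.
        my ( $hi, $lo ) = ( 0x12345678, 0x9abcdef0 );
        my $net = pack 'N N', $hi, $lo;      # 8 bytes, big-endian

        # Round-trip it back out and show the full 64-bit value.
        my ( $hi2, $lo2 ) = unpack 'N N', $net;
        printf "0x%08x%08x\n", $hi2, $lo2;   # prints 0x123456789abcdef0
        ```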

        I understand the perils of assuming too much about the data types. But in the world of XS, <whimper>

        Cheers,
        Matt

        Ok, to clear things up: I am _not_ interested in better libraries or algorithms for this problem. I used bigint because I was not sure whether one of the Collatz chains would produce an intermediate number exceeding the native integer range (probably my take on pessimisation). Only after reading on the projecteuler forum that everyone's solution ran in 1-5 sec while mine took 10 min, despite using the same algorithm as many others, did I get suspicious. So I removed "use bigint" and, lo and behold, it runs fast. Googling "perl bigint performance" didn't turn up anything useful, which is why I asked here.

        PS. I used pastebin because I wasn't sure about the "post code" conventions ;-)
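        For what it's worth, the intermediate-overflow worry can be checked empirically with native integers. A small core-only sketch (the 10_000 bound is hypothetical, just to keep the run quick) that tracks the peak value reached along each chain:

        ```perl
        #!/usr/bin/perl
        use strict;
        use warnings;

        # Return the largest value reached along the Collatz chain of $n.
        sub collatz_peak {
            my ($n) = @_;
            my $peak = $n;
            while ( $n > 1 ) {
                $n = $n & 1 ? 3 * $n + 1 : $n / 2;
                $peak = $n if $n > $peak;
            }
            return $peak;
        }

        # Scan all starting values up to a bound; even for starts up to
        # 1e6 the peak stays well under 2**53, so native scalars suffice.
        my $max = 0;
        for my $start ( 1 .. 10_000 ) {
            my $p = collatz_peak($start);
            $max = $p if $p > $max;
        }
        print "largest intermediate value for starts <= 10_000: $max\n";
        ```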