Beefy Boxes and Bandwidth Generously Provided by pair Networks
laziness, impatience, and hubris
 
PerlMonks  

Re^11: Module for 128-bit integer math?

by BrowserUk (Patriarch)
on Feb 10, 2011 at 08:28 UTC ( [id://887361]=note: print w/replies, xml ) Need Help??


in reply to Re^10: Module for 128-bit integer math?
in thread Module for 128-bit integer math?

BrowserUk (or anyone else, of course), if you want a ppm package

Thanks syphilis. That works perfectly.

And especially thanks to salva.

I still intend to pursue an MSVC/SSE* solution to this, but this gives me a reference point.


Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
  • Comment on Re^11: Module for 128-bit integer math?

Replies are listed 'Best First'.
Re^12: Module for 128-bit integer math?
by syphilis (Archbishop) on Feb 12, 2011 at 04:10 UTC
    And especially thanks to salva

    I'll second that !!

    There are of course, more things to consider than just *performance* when it comes to choosing which Math module to use - but I was curious to see how Math::Int128 compared for speed with Math::GMPz (which uses the gmp C library).

    I found that, if using overloaded operators with Math::GMPz, then Math::GMPz was roughly twice as slow as Math::Int128 - but if using the arithmetic functions that Math::GMPz also provides, then Math::GMPz was roughly twice as fast as Math::Int128.

    Without yet having taken the time to look closely at Math::Int128, I believe there would be a way of having it out-perform Math::GMPz. (And I'm about to embark upon a little excercise to satisfy myself that this is so.)
    For any Windows users who want to check the comparisons for themselves, there are ppm packages for 64-bit perl (5.10 and 5.12) available:
    ppm install http://www.sisyphusion.tk/ppm/Math-GMPz.ppd ppm install http://www.sisyphusion.tk/ppm/Math-Int128.ppd
    The benchmarks I ran are as follows:
    use warnings; use Math::Int128 qw(int128); use Math::GMPz qw(:mpz); use Benchmark qw(:all); $count = 40000; $mpz1 = Math::GMPz->new('676469752303423489'); $mpz2 = Math::GMPz->new('776469752999423489'); $i_1 = int128("$mpz1"); $i_2 = int128("$mpz2"); $mpz_sub = Math::GMPz->new('976469752313423489'); $i_sub = int128("$mpz_sub"); $mpz_div = Math::GMPz->new('76469752313423489'); $i_div = int128("$mpz_div"); $mpz_ret = Rmpz_init2(128); print " ****************** **MULTIPLICATION** ******************\n\n"; cmpthese($count * 9, { 'mul_M::I' => '$ri = Math::Int128::_mul($i_1, $i_2, 0)', 'mul_M::G1' =>'$mpz_ret = $mpz1 * $mpz2', 'mul_M::G2'=> 'Rmpz_mul($mpz_ret, $mpz1, $mpz2)', }); die "Error 1:\n$ri\n$mpz_ret\n" if $ri != int128("$mpz_ret") || $ri != int128('525258301482620425304858018020933121'); $i_1 *= $i_1; $i_2 *= $i_2; $mpz1 *= $mpz1; $mpz2 *= $mpz2; print " ****************** *****DIVISION***** ******************\n\n"; cmpthese($count * 9, { 'div_M::I' => '$ri = Math::Int128::_div($i_1, $i_div, 0)', 'div_M::G1' =>'$mpz_ret = $mpz1 / $mpz_div', 'div_M::G2'=> 'Rmpz_tdiv_q($mpz_ret, $mpz1, $mpz_div)', }); die "Error 2:\n$ri\n$mpz_ret\n" if $ri != int128("$mpz_ret") || $ri != int128('5984213521522366751'); print" ****************** *****ADDITION***** ******************\n\n"; cmpthese($count * 10, { 'add_M::I' => '$ri = Math::Int128::_add($i_1, $i_2, 0)', 'add_M::G1' => '$mpz_ret = $mpz1 + $mpz2', 'add_M::G2' => 'Rmpz_add($mpz_ret, $mpz1, $mpz2)', }); die "Error 3:\n$ri\n$mpz_ret\n" if $ri != int128("$mpz_ret") || $ri != int128('1060516603104440851094132036041866242'); print " ****************** ****SUBTRACTION*** ******************\n\n"; cmpthese($count * 10, { 'add_M::I' => '$ri = Math::Int128::_sub($i_1, $i_sub, 0)', 'add_M::G1' => '$mpz_ret = $mpz1 - $mpz_sub', 'add_M::G2' => 'Rmpz_sub($mpz_ret, $mpz1, $mpz_sub)', }); die "Error 4:\n$ri\n$mpz_ret\n" if $ri != int128("$mpz_ret") || $ri != int128('457611325781455127825205517363509632');
    The output I got was:
    C:\_64\pscrpt>perl bench.pl Name "main::i_div" used only once: possible typo at bench.pl line 17. Name "main::i_sub" used only once: possible typo at bench.pl line 14. ****************** **MULTIPLICATION** ****************** Rate mul_M::G1 mul_M::I mul_M::G2 mul_M::G1 156726/s -- -58% -82% mul_M::I 371517/s 137% -- -58% mul_M::G2 886700/s 466% 139% -- ****************** *****DIVISION***** ****************** Rate div_M::G1 div_M::I div_M::G2 div_M::G1 148637/s -- -57% -81% div_M::I 344168/s 132% -- -57% div_M::G2 794702/s 435% 131% -- ****************** *****ADDITION***** ****************** Rate add_M::G1 add_M::I add_M::G2 add_M::G1 148810/s -- -60% -83% add_M::I 376648/s 153% -- -56% add_M::G2 852878/s 473% 126% -- ****************** ****SUBTRACTION*** ****************** Rate add_M::G1 add_M::I add_M::G2 add_M::G1 147113/s -- -60% -83% add_M::I 365631/s 149% -- -59% add_M::G2 883002/s 500% 142% -- C:\_64\pscrpt>
    Update: I've since written an Inline::C script where the __int128 multiplications beat Math::GMPz ... but not by much:
    Rate gmpz mul_128 gmpz 900000/s -- -3% mul_128 927835/s 3% --
    The code I used is still a bit rough, but I don't see much scope for improvement at the moment.

    Cheers,
    Rob

      That's interesting, but there are reasons other than performance for wanting fix-width integer semantics.

      Try coding this with GMP (or any arbitrary precision math):

      #! perl -slw use strict; use Math::Int128 qw[int128 uint128]; my $ZERO = uint128( 0 ); my $ONE = uint128( 1 ); my $ALLONES = ~ $ZERO; my $MASK1 = $ALLONES / 3; my $MASK2 = $ALLONES / 15 * 3; my $MASK3 = $ALLONES / 255; my $MASK4 = $MASK3 * 15; sub bcount { my $v = shift; $v = $v - ( ( $v >> 1 ) & $MASK1 ); $v = ( $v & $MASK2 ) + ( ( $v >> 2 ) & $MASK2 ); $v = ( $v + ( $v >> 4 ) ) & $MASK4; my $c = ( $v * ( $MASK3 ) ) >> 120; return $c; } my $bits = uint128( 0 ); ## set every 3rd bit $bits |= $ONE << $_ for map $_*3+1, 0 .. 42; print $bits; ## count the bits (43) print bcount( $bits ); ## now invert them and count again. (85) print bcount( ~$bits ); __END__ c:\test>m128.pl 194447066811964836264785489961010406546 43 85

      Note: I would have supplied a GMPz version, but I couldn't work out how to do it from the docs. There seem to be a million ways to initialise a number; a million ways to divide two numbers etc, But not so much when it comes to doing bitwise math.

      Update: Here's my attempt at a GMP version of the above. Note that it doesn't just produce the wrong results, but it does so silently:

      #! perl -slw use strict; use Math::GMPz; my $ZERO = Math::GMPz->new( 0 ); my $ONE = Math::GMPz->new( 1 ); my $ALLONES = ~ $ZERO; my $MASK1 = $ALLONES / 3; my $MASK2 = $ALLONES / 15 * 3; my $MASK3 = $ALLONES / 255; my $MASK4 = $MASK3 * 15; sub bcount { my $v = shift; $v = $v - ( ( $v >> 1 ) & $MASK1 ); $v = ( $v & $MASK2 ) + ( ( $v >> 2 ) & $MASK2 ); $v = ( $v + ( $v >> 4 ) ) & $MASK4; my $c = ( $v * ( $MASK3 ) ) >> 120; return $c; } my $bits = Math::GMPz->new( 0 ); ## set every 3rd bit $bits |= $ONE << $_ for map $_*3+1, 0 .. 42; print $bits; ## count the bits (43) print bcount( $bits ); ## now invert them and count again. (85) print bcount( ~$bits ); __END__ c:\test>gmpz-t.pl 194447066811964836264785489961010406546 0 0

      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.
        there are reasons other than performance for wanting fix-width integer semantics

        Definitely correct ... though I must confess that the reasons you've implicitly put forward in your demos were not exactly what I was thinking of when I acknowledged this (or tried to) in my earlier post.

        For a Math::GMPz rendition of your demo I would put forward the following:
        #! perl -slw use strict; use Math::GMPz qw(:mpz); my $ZERO = Math::GMPz->new( 0 ); my $ONE = Math::GMPz->new( 1 ); #my $ALLONES = ~ $ZERO; my $ALLONES = (Math::GMPz->new(1) << 128) - 1; my $MASK1 = $ALLONES / 3; my $MASK2 = $ALLONES / 15 * 3; my $MASK3 = $ALLONES / 255; my $MASK4 = $MASK3 * 15; my $bits = Math::GMPz->new( 0 ); ## set every 3rd bit $bits |= $ONE << $_ for map $_*3+1, 0 .. 42; print $bits; ## count the bits (43) #print bcount( $bits ); print Rmpz_popcount($bits); ## now invert them and count again. (85) #print bcount( ~$bits ); print Rmpz_popcount($ALLONES ^ $bits);
        I hope it doesn't miss the point ... I think the only changes are that:
        1) It changes the way that $ALLONES is created;
        2) It removes the need for sub bcount()

        Update: And, there's also now a qw(:mpz) in the third line.

        Cheers,
        Rob
      but I don't see much scope for improvement at the moment

      Well, there is...

      The G2 version is faster than G1 and I because it is not allocating and deallocating the result object every time but reusing $mpz_ret.

      After adding to Math::Int128 a new set of operators that use a preallocated argument for output, Math::Int128 becomes faster than Math::GMPz, around 60% faster.

      The modified benchmark script:

      And the results I get on my 64bits-linux-but-with-a-not-very-optimized-for-64bits-old-processor:

      ****************** **MULTIPLICATION** ****************** Rate mul_M::G1 mul_M::I mul_M::G2 mul_M::I2 mul_M::G1 321555/s -- -72% -87% -91% mul_M::I 1147836/s 257% -- -52% -69% mul_M::G2 2406041/s 648% 110% -- -35% mul_M::I2 3709585/s 1054% 223% 54% -- ****************** *****DIVISION***** ****************** Rate div_M::G1 div_M::I div_M::G2 div_M::I2 div_M::G1 314139/s -- -71% -83% -88% div_M::I 1092266/s 248% -- -39% -59% div_M::G2 1799026/s 473% 65% -- -32% div_M::I2 2633983/s 738% 141% 46% -- ****************** *****ADDITION***** ****************** Rate add_M::G1 add_M::I add_M::G2 add_M::I2 add_M::G1 317021/s -- -71% -86% -91% add_M::I 1092266/s 245% -- -51% -70% add_M::G2 2248783/s 609% 106% -- -38% add_M::I2 3598054/s 1035% 229% 60% -- ****************** ****SUBTRACTION*** ****************** Rate sub_M::G1 sub_M::I sub_M::G2 sub_M::I2 sub_M::G1 309967/s -- -72% -84% -91% sub_M::I 1113475/s 259% -- -44% -68% sub_M::G2 1997468/s 544% 79% -- -43% sub_M::I2 3495253/s 1028% 214% 75% --
      The new version of the module can be obtained from GitHub.
        Math::Int128 becomes faster than Math::GMPz, around 60%

        Using the latest version of Math::Int128, and your modified script, I find an improvement (on Windows Vista) of around 40%. Given that we're using different operating systems and probably different processors, I think we can agree that "I find the same as you".

        Cheers,
        Rob

        I should add that even 40% is better than I could get with my approach to Math::Int128 modifications. I might learn something if I ever find the time and energy to discover why that was so. (Best I could get was to have the int128 arithemtic about 5-10% faster than Math::GMPz.)
        After adding to Math::Int128 a new set of operators that use a preallocated argument for output, Math::Int128 becomes faster than Math::GMPz, around 60% faster

        That's more like the figures I anticipated (at a guess).
        My "I don't see much scope for improvement" comment was in relation to my own enhancements to Math::Int128 which removed the allocating/deallocating you mentioned - and which made the int128 multiplication a little faster than using Math::GMPz's functions (and, of course, significantly faster than using Math::GMPz's overloaded operators).

        When I get back to my Vista64 box, I'll grab the latest github version and see for myself ... and, of course, post again with my results for that machine.

        Cheers,
        Rob

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://887361]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others admiring the Monastery: (3)
As of 2024-04-24 00:03 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found