http://qs321.pair.com?node_id=887320


in reply to Re^9: Module for 128-bit integer math?
in thread Module for 128-bit integer math?

Support for unsigned __int128 types was probably not yet implemented

A post to the mingw64 mailing list suggested that I create the __uint128 type myself (in the compiler's _mingw.h) with:
typedef unsigned int __uint128 __attribute__ ((__mode__ (TI)));
and that works fine.
Then, in Int128.xs, it was just a matter of changing:
typedef unsigned __int128 uint128_t; to typedef __uint128 uint128_t;
BrowserUk (or anyone else, of course), if you want a ppm package of that module that will work with the 64-bit builds of ActivePerl (both perl-5.10 and perl-5.12):
ppm install http://www.sisyphusion.tk/ppm/Math-Int128.ppd
Cheers,
Rob

Replies are listed 'Best First'.
Re^11: Module for 128-bit integer math?
by BrowserUk (Patriarch) on Feb 10, 2011 at 08:28 UTC
    BrowserUk (or anyone else, of course), if you want a ppm package

    Thanks syphilis. That works perfectly.

    And especially thanks to salva.

    I still intend to pursue an MSVC/SSE* solution to this, but this gives me a reference point.


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      And especially thanks to salva

      I'll second that !!

      There are of course, more things to consider than just *performance* when it comes to choosing which Math module to use - but I was curious to see how Math::Int128 compared for speed with Math::GMPz (which uses the gmp C library).

      I found that, if using overloaded operators with Math::GMPz, then Math::GMPz was roughly twice as slow as Math::Int128 - but if using the arithmetic functions that Math::GMPz also provides, then Math::GMPz was roughly twice as fast as Math::Int128.

      Without yet having taken the time to look closely at Math::Int128, I believe there would be a way of having it out-perform Math::GMPz. (And I'm about to embark upon a little excercise to satisfy myself that this is so.)
      For any Windows users who want to check the comparisons for themselves, there are ppm packages for 64-bit perl (5.10 and 5.12) available:
      ppm install http://www.sisyphusion.tk/ppm/Math-GMPz.ppd ppm install http://www.sisyphusion.tk/ppm/Math-Int128.ppd
      The benchmarks I ran are as follows:
      use warnings; use Math::Int128 qw(int128); use Math::GMPz qw(:mpz); use Benchmark qw(:all); $count = 40000; $mpz1 = Math::GMPz->new('676469752303423489'); $mpz2 = Math::GMPz->new('776469752999423489'); $i_1 = int128("$mpz1"); $i_2 = int128("$mpz2"); $mpz_sub = Math::GMPz->new('976469752313423489'); $i_sub = int128("$mpz_sub"); $mpz_div = Math::GMPz->new('76469752313423489'); $i_div = int128("$mpz_div"); $mpz_ret = Rmpz_init2(128); print " ****************** **MULTIPLICATION** ******************\n\n"; cmpthese($count * 9, { 'mul_M::I' => '$ri = Math::Int128::_mul($i_1, $i_2, 0)', 'mul_M::G1' =>'$mpz_ret = $mpz1 * $mpz2', 'mul_M::G2'=> 'Rmpz_mul($mpz_ret, $mpz1, $mpz2)', }); die "Error 1:\n$ri\n$mpz_ret\n" if $ri != int128("$mpz_ret") || $ri != int128('525258301482620425304858018020933121'); $i_1 *= $i_1; $i_2 *= $i_2; $mpz1 *= $mpz1; $mpz2 *= $mpz2; print " ****************** *****DIVISION***** ******************\n\n"; cmpthese($count * 9, { 'div_M::I' => '$ri = Math::Int128::_div($i_1, $i_div, 0)', 'div_M::G1' =>'$mpz_ret = $mpz1 / $mpz_div', 'div_M::G2'=> 'Rmpz_tdiv_q($mpz_ret, $mpz1, $mpz_div)', }); die "Error 2:\n$ri\n$mpz_ret\n" if $ri != int128("$mpz_ret") || $ri != int128('5984213521522366751'); print" ****************** *****ADDITION***** ******************\n\n"; cmpthese($count * 10, { 'add_M::I' => '$ri = Math::Int128::_add($i_1, $i_2, 0)', 'add_M::G1' => '$mpz_ret = $mpz1 + $mpz2', 'add_M::G2' => 'Rmpz_add($mpz_ret, $mpz1, $mpz2)', }); die "Error 3:\n$ri\n$mpz_ret\n" if $ri != int128("$mpz_ret") || $ri != int128('1060516603104440851094132036041866242'); print " ****************** ****SUBTRACTION*** ******************\n\n"; cmpthese($count * 10, { 'add_M::I' => '$ri = Math::Int128::_sub($i_1, $i_sub, 0)', 'add_M::G1' => '$mpz_ret = $mpz1 - $mpz_sub', 'add_M::G2' => 'Rmpz_sub($mpz_ret, $mpz1, $mpz_sub)', }); die "Error 4:\n$ri\n$mpz_ret\n" if $ri != int128("$mpz_ret") || $ri != int128('457611325781455127825205517363509632');
      The output I got was:
      C:\_64\pscrpt>perl bench.pl Name "main::i_div" used only once: possible typo at bench.pl line 17. Name "main::i_sub" used only once: possible typo at bench.pl line 14. ****************** **MULTIPLICATION** ****************** Rate mul_M::G1 mul_M::I mul_M::G2 mul_M::G1 156726/s -- -58% -82% mul_M::I 371517/s 137% -- -58% mul_M::G2 886700/s 466% 139% -- ****************** *****DIVISION***** ****************** Rate div_M::G1 div_M::I div_M::G2 div_M::G1 148637/s -- -57% -81% div_M::I 344168/s 132% -- -57% div_M::G2 794702/s 435% 131% -- ****************** *****ADDITION***** ****************** Rate add_M::G1 add_M::I add_M::G2 add_M::G1 148810/s -- -60% -83% add_M::I 376648/s 153% -- -56% add_M::G2 852878/s 473% 126% -- ****************** ****SUBTRACTION*** ****************** Rate add_M::G1 add_M::I add_M::G2 add_M::G1 147113/s -- -60% -83% add_M::I 365631/s 149% -- -59% add_M::G2 883002/s 500% 142% -- C:\_64\pscrpt>
      Update: I've since written an Inline::C script where the __int128 multiplications beat Math::GMPz ... but not by much:
      Rate gmpz mul_128 gmpz 900000/s -- -3% mul_128 927835/s 3% --
      The code I used is still a bit rough, but I don't see much scope for improvement at the moment.

      Cheers,
      Rob

        That's interesting, but there are reasons other than performance for wanting fix-width integer semantics.

        Try coding this with GMP (or any arbitrary precision math):

        #! perl -slw use strict; use Math::Int128 qw[int128 uint128]; my $ZERO = uint128( 0 ); my $ONE = uint128( 1 ); my $ALLONES = ~ $ZERO; my $MASK1 = $ALLONES / 3; my $MASK2 = $ALLONES / 15 * 3; my $MASK3 = $ALLONES / 255; my $MASK4 = $MASK3 * 15; sub bcount { my $v = shift; $v = $v - ( ( $v >> 1 ) & $MASK1 ); $v = ( $v & $MASK2 ) + ( ( $v >> 2 ) & $MASK2 ); $v = ( $v + ( $v >> 4 ) ) & $MASK4; my $c = ( $v * ( $MASK3 ) ) >> 120; return $c; } my $bits = uint128( 0 ); ## set every 3rd bit $bits |= $ONE << $_ for map $_*3+1, 0 .. 42; print $bits; ## count the bits (43) print bcount( $bits ); ## now invert them and count again. (85) print bcount( ~$bits ); __END__ c:\test>m128.pl 194447066811964836264785489961010406546 43 85

        Note: I would have supplied a GMPz version, but I couldn't work out how to do it from the docs. There seem to be a million ways to initialise a number; a million ways to divide two numbers etc, But not so much when it comes to doing bitwise math.

        Update: Here's my attempt at a GMP version of the above. Note that it doesn't just produce the wrong results, but it does so silently:

        #! perl -slw use strict; use Math::GMPz; my $ZERO = Math::GMPz->new( 0 ); my $ONE = Math::GMPz->new( 1 ); my $ALLONES = ~ $ZERO; my $MASK1 = $ALLONES / 3; my $MASK2 = $ALLONES / 15 * 3; my $MASK3 = $ALLONES / 255; my $MASK4 = $MASK3 * 15; sub bcount { my $v = shift; $v = $v - ( ( $v >> 1 ) & $MASK1 ); $v = ( $v & $MASK2 ) + ( ( $v >> 2 ) & $MASK2 ); $v = ( $v + ( $v >> 4 ) ) & $MASK4; my $c = ( $v * ( $MASK3 ) ) >> 120; return $c; } my $bits = Math::GMPz->new( 0 ); ## set every 3rd bit $bits |= $ONE << $_ for map $_*3+1, 0 .. 42; print $bits; ## count the bits (43) print bcount( $bits ); ## now invert them and count again. (85) print bcount( ~$bits ); __END__ c:\test>gmpz-t.pl 194447066811964836264785489961010406546 0 0

        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        "Science is about questioning the status quo. Questioning authority".
        In the absence of evidence, opinion is indistinguishable from prejudice.
        but I don't see much scope for improvement at the moment

        Well, there is...

        The G2 version is faster than G1 and I because it is not allocating and deallocating the result object every time but reusing $mpz_ret.

        After adding to Math::Int128 a new set of operators that use a preallocated argument for output, Math::Int128 becomes faster than Math::GMPz, around 60% faster.

        The modified benchmark script:

        And the results I get on my 64bits-linux-but-with-a-not-very-optimized-for-64bits-old-processor:

        ****************** **MULTIPLICATION** ****************** Rate mul_M::G1 mul_M::I mul_M::G2 mul_M::I2 mul_M::G1 321555/s -- -72% -87% -91% mul_M::I 1147836/s 257% -- -52% -69% mul_M::G2 2406041/s 648% 110% -- -35% mul_M::I2 3709585/s 1054% 223% 54% -- ****************** *****DIVISION***** ****************** Rate div_M::G1 div_M::I div_M::G2 div_M::I2 div_M::G1 314139/s -- -71% -83% -88% div_M::I 1092266/s 248% -- -39% -59% div_M::G2 1799026/s 473% 65% -- -32% div_M::I2 2633983/s 738% 141% 46% -- ****************** *****ADDITION***** ****************** Rate add_M::G1 add_M::I add_M::G2 add_M::I2 add_M::G1 317021/s -- -71% -86% -91% add_M::I 1092266/s 245% -- -51% -70% add_M::G2 2248783/s 609% 106% -- -38% add_M::I2 3598054/s 1035% 229% 60% -- ****************** ****SUBTRACTION*** ****************** Rate sub_M::G1 sub_M::I sub_M::G2 sub_M::I2 sub_M::G1 309967/s -- -72% -84% -91% sub_M::I 1113475/s 259% -- -44% -68% sub_M::G2 1997468/s 544% 79% -- -43% sub_M::I2 3495253/s 1028% 214% 75% --
        The new version of the module can be obtained from GitHub.