http://qs321.pair.com?node_id=887971


in reply to Re^12: Module for 128-bit integer math?
in thread Module for 128-bit integer math?

but I don't see much scope for improvement at the moment

Well, there is...

The G2 version is faster than both G1 and M::I because it does not allocate and deallocate a result object on every call; it reuses $mpz_ret.

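After adding to Math::Int128 a new set of operators that write their output into a preallocated argument, Math::Int128 becomes faster than Math::GMPz: between 46% and 75% faster than the in-place Rmpz_* functions depending on the operation, so around 60% faster overall.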
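To make the difference between the two calling styles concrete, here is a minimal sketch. It assumes the new version of Math::Int128 mentioned in this post, with the int128_* functions exported by the :op tag as used in the benchmark script; the variable names are only illustrative.

```perl
use strict;
use warnings;
use Math::Int128 qw(int128 :op);

my $x = int128('676469752303423489');
my $y = int128('776469752999423489');

# Overloaded operator: a new Math::Int128 object is created to hold
# the result every time the expression is evaluated.
my $fresh = $x * $y;

# Preallocated-output style: $ret is created once and the product is
# stored into it, so repeated calls do not allocate a new object.
my $ret = int128();
int128_mul($ret, $x, $y);

print "fresh: $fresh\n";
print "reuse: $ret\n";
```

Both forms should yield the same value; the second one only pays off in tight loops where the allocation cost is comparable to the cost of the arithmetic itself.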

The modified benchmark script:

use Math::Int128 qw(int128 :op);
use Math::GMPz qw(:mpz);
use Benchmark qw(:all);

$count = 40000;

$mpz1 = Math::GMPz->new('676469752303423489');
$mpz2 = Math::GMPz->new('776469752999423489');
$i_1 = int128("$mpz1");
$i_2 = int128("$mpz2");

$mpz_sub = Math::GMPz->new('976469752313423489');
$i_sub = int128("$mpz_sub");

$mpz_div = Math::GMPz->new('76469752313423489');
$i_div = int128("$mpz_div");

$mpz_ret = Rmpz_init2(128);
$i_ret = int128();

use warnings;

print " ****************** **MULTIPLICATION** ******************\n\n";

cmpthese(-1, {
    'mul_M::I'  => '$ri = Math::Int128::_mul($i_1, $i_2, 0)',
    'mul_M::I2' => 'int128_mul($i_ret, $i_1, $i_2)',
    'mul_M::G1' => '$mpz_ret = $mpz1 * $mpz2',
    'mul_M::G2' => 'Rmpz_mul($mpz_ret, $mpz1, $mpz2)',
});

die "Error 1:\n$ri\n$mpz_ret\n$i_ret\n"
    if $ri != int128("$mpz_ret")
    || $ri != int128('525258301482620425304858018020933121')
    || $ri != $i_ret;

$i_1 *= $i_1;
$i_2 *= $i_2;
$mpz1 *= $mpz1;
$mpz2 *= $mpz2;

# print "i_1: $i_1, i_2: $i_2\n";

print " ****************** *****DIVISION***** ******************\n\n";

cmpthese(-1, {
    'div_M::I'  => '$ri = Math::Int128::_div($i_1, $i_div, 0)',
    'div_M::I2' => 'int128_div($i_ret, $i_1, $i_div)',
    'div_M::G1' => '$mpz_ret = $mpz1 / $mpz_div',
    'div_M::G2' => 'Rmpz_tdiv_q($mpz_ret, $mpz1, $mpz_div)',
});

die "Error 2:\n$ri\n$mpz_ret\n$i_ret\n"
    if $ri != int128("$mpz_ret")
    || $ri != int128('5984213521522366751')
    || $ri != $i_ret;

print" ****************** *****ADDITION***** ******************\n\n";

cmpthese(-1, {
    'add_M::I'  => '$ri = Math::Int128::_add($i_1, $i_2, 0)',
    'add_M::I2' => 'int128_add($i_ret, $i_1, $i_2)',
    'add_M::G1' => '$mpz_ret = $mpz1 + $mpz2',
    'add_M::G2' => 'Rmpz_add($mpz_ret, $mpz1, $mpz2)',
});

die "Error 3:\n$ri\n$mpz_ret\n$i_ret\n"
    if $ri != int128("$mpz_ret")
    || $ri != int128('1060516603104440851094132036041866242')
    || $ri != $i_ret;

print " ****************** ****SUBTRACTION*** ******************\n\n";

cmpthese(-1, {
    'sub_M::I'  => '$ri = Math::Int128::_sub($i_1, $i_sub, 0)',
    'sub_M::I2' => 'int128_sub($i_ret, $i_1, $i_sub)',
    'sub_M::G1' => '$mpz_ret = $mpz1 - $mpz_sub',
    'sub_M::G2' => 'Rmpz_sub($mpz_ret, $mpz1, $mpz_sub)',
});

die "Error 4:\n$ri\n$mpz_ret\n$i_ret\n"
    if $ri != int128("$mpz_ret")
    || $ri != int128('457611325781455127825205517363509632')
    || $ri != $i_ret;

And the results I get on my 64-bit Linux box, which has an old processor that is not very well optimized for 64-bit code:

****************** **MULTIPLICATION** ******************

               Rate mul_M::G1  mul_M::I mul_M::G2 mul_M::I2
mul_M::G1  321555/s        --      -72%      -87%      -91%
mul_M::I  1147836/s      257%        --      -52%      -69%
mul_M::G2 2406041/s      648%      110%        --      -35%
mul_M::I2 3709585/s     1054%      223%       54%        --

****************** *****DIVISION***** ******************

               Rate div_M::G1  div_M::I div_M::G2 div_M::I2
div_M::G1  314139/s        --      -71%      -83%      -88%
div_M::I  1092266/s      248%        --      -39%      -59%
div_M::G2 1799026/s      473%       65%        --      -32%
div_M::I2 2633983/s      738%      141%       46%        --

****************** *****ADDITION***** ******************

               Rate add_M::G1  add_M::I add_M::G2 add_M::I2
add_M::G1  317021/s        --      -71%      -86%      -91%
add_M::I  1092266/s      245%        --      -51%      -70%
add_M::G2 2248783/s      609%      106%        --      -38%
add_M::I2 3598054/s     1035%      229%       60%        --

****************** ****SUBTRACTION*** ******************

               Rate sub_M::G1  sub_M::I sub_M::G2 sub_M::I2
sub_M::G1  309967/s        --      -72%      -84%      -91%
sub_M::I  1113475/s      259%        --      -44%      -68%
sub_M::G2 1997468/s      544%       79%        --      -43%
sub_M::I2 3495253/s     1028%      214%       75%        --

The new version of the module can be obtained from GitHub.