Re^13: Module for 128-bit integer math?

in reply to Re^12: Module for 128-bit integer math?
in thread Module for 128-bit integer math?

but I don't see much scope for improvement at the moment

Well, there is...

The G2 version is faster than both G1 and I because it does not allocate and deallocate a result object on every call; it reuses $mpz_ret.

After adding a new set of operators to Math::Int128 that write their output into a preallocated argument, Math::Int128 becomes faster than Math::GMPz: roughly 45% to 75% faster than the G2 variant, depending on the operation (see the results below).
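To make the difference concrete, here is a minimal sketch of the two calling styles. It assumes a Math::Int128 build that exports the :op functions used in the benchmark; the variable names $x, $y, $fresh and $out are mine, for illustration only.

```perl
use strict;
use warnings;
use Math::Int128 qw(int128 :op);

my $x = int128('676469752303423489');
my $y = int128('776469752999423489');

# Overloaded operator: a new Math::Int128 object is created for the result.
my $fresh = $x * $y;

# Preallocated output: the product is written into an object that already exists,
# so a tight loop can reuse the same $out on every iteration.
my $out = int128();
int128_mul($out, $x, $y);

print "operator:     $fresh\n";
print "preallocated: $out\n";
```

Both styles should give the same value; only the allocation pattern differs.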

The modified benchmark script:


use Math::Int128 qw(int128 :op);
use Math::GMPz qw(:mpz);
use Benchmark qw(:all);

$count = 40000;

$mpz1  = Math::GMPz->new('676469752303423489');
$mpz2  = Math::GMPz->new('776469752999423489');
$i_1   = int128("$mpz1");
$i_2   = int128("$mpz2");

$mpz_sub = Math::GMPz->new('976469752313423489');
$i_sub   = int128("$mpz_sub");

$mpz_div = Math::GMPz->new('76469752313423489');
$i_div   = int128("$mpz_div");

$mpz_ret = Rmpz_init2(128);
$i_ret = int128();

use warnings;

print "
******************
**MULTIPLICATION**
******************\n\n";

cmpthese(-1, {
    'mul_M::I' => '$ri = Math::Int128::_mul($i_1, $i_2, 0)',
    'mul_M::I2'=> 'int128_mul($i_ret, $i_1, $i_2)',
    'mul_M::G1'=> '$mpz_ret = $mpz1 * $mpz2',
    'mul_M::G2'=> 'Rmpz_mul($mpz_ret, $mpz1, $mpz2)',
});

die "Error 1:\n$ri\n$mpz_ret\n$i_ret\n" if $ri != int128("$mpz_ret")
 || $ri != int128('525258301482620425304858018020933121') || $ri != $i_ret;


$i_1 *= $i_1;
$i_2 *= $i_2;
$mpz1 *= $mpz1;
$mpz2 *= $mpz2;

# print "i_1: $i_1, i_2: $i_2\n";


print "
******************
*****DIVISION*****
******************\n\n";

cmpthese(-1, {
    'div_M::I' => '$ri = Math::Int128::_div($i_1, $i_div, 0)',
    'div_M::I2'=> 'int128_div($i_ret, $i_1, $i_div)',
    'div_M::G1'=> '$mpz_ret = $mpz1 / $mpz_div',
    'div_M::G2'=> 'Rmpz_tdiv_q($mpz_ret, $mpz1, $mpz_div)',
});

die "Error 2:\n$ri\n$mpz_ret\n$i_ret\n" if $ri != int128("$mpz_ret")
 || $ri != int128('5984213521522366751') || $ri != $i_ret;

print "
******************
*****ADDITION*****
******************\n\n";

cmpthese(-1, {
    'add_M::I'  => '$ri = Math::Int128::_add($i_1, $i_2, 0)',
    'add_M::I2' => 'int128_add($i_ret, $i_1, $i_2)',
    'add_M::G1' => '$mpz_ret = $mpz1  + $mpz2',
    'add_M::G2' => 'Rmpz_add($mpz_ret, $mpz1, $mpz2)',
});

die "Error 3:\n$ri\n$mpz_ret\n$i_ret\n" if $ri != int128("$mpz_ret")
 || $ri != int128('1060516603104440851094132036041866242') || $ri != $i_ret;

print "
******************
****SUBTRACTION***
******************\n\n";

cmpthese(-1, {
    'sub_M::I'  => '$ri = Math::Int128::_sub($i_1, $i_sub, 0)',
    'sub_M::I2' => 'int128_sub($i_ret, $i_1, $i_sub)',
    'sub_M::G1' => '$mpz_ret = $mpz1 - $mpz_sub',
    'sub_M::G2' => 'Rmpz_sub($mpz_ret, $mpz1, $mpz_sub)',
});

die "Error 4:\n$ri\n$mpz_ret\n$i_ret\n" if $ri != int128("$mpz_ret")
 || $ri != int128('457611325781455127825205517363509632') || $ri != $i_ret;

And the results I get on my 64-bit Linux box, which has an old processor that is not very well optimized for 64 bits:

******************
**MULTIPLICATION**
******************

               Rate mul_M::G1  mul_M::I mul_M::G2 mul_M::I2
mul_M::G1  321555/s        --      -72%      -87%      -91%
mul_M::I  1147836/s      257%        --      -52%      -69%
mul_M::G2 2406041/s      648%      110%        --      -35%
mul_M::I2 3709585/s     1054%      223%       54%        --

******************
*****DIVISION*****
******************

               Rate div_M::G1  div_M::I div_M::G2 div_M::I2
div_M::G1  314139/s        --      -71%      -83%      -88%
div_M::I  1092266/s      248%        --      -39%      -59%
div_M::G2 1799026/s      473%       65%        --      -32%
div_M::I2 2633983/s      738%      141%       46%        --

******************
*****ADDITION*****
******************

               Rate add_M::G1  add_M::I add_M::G2 add_M::I2
add_M::G1  317021/s        --      -71%      -86%      -91%
add_M::I  1092266/s      245%        --      -51%      -70%
add_M::G2 2248783/s      609%      106%        --      -38%
add_M::I2 3598054/s     1035%      229%       60%        --

******************
****SUBTRACTION***
******************

               Rate sub_M::G1  sub_M::I sub_M::G2 sub_M::I2
sub_M::G1  309967/s        --      -72%      -84%      -91%
sub_M::I  1113475/s      259%        --      -44%      -68%
sub_M::G2 1997468/s      544%       79%        --      -43%
sub_M::I2 3495253/s     1028%      214%       75%        --

The new version of the module can be obtained from GitHub.
