 Perl Monk, Perl Meditation PerlMonks

### Re^15: Module for 128-bit integer math? (updated)

by BrowserUk (Pope)
 on Feb 12, 2011 at 14:41 UTC ( #887763=note: print w/replies, xml ) Need Help??

2) It removes the need for sub bcount()

That does miss the point, which was not what the bcount() sub did, but the way it did it.

The result of using bitwise math using arbitrary precision doesn't follow the fix-width integer semantics. Ie. the math isn't automatically module 2^128.

That means you get quite different results:

```#! perl -slw
use strict;
use Math::GMPz;

my \$ZERO = Math::GMPz->new( 0 );
my \$ONE  = Math::GMPz->new( 1 );
#my \$ALLONES = ~ \$ZERO;
my \$ALLONES = (Math::GMPz->new(1) << 128) - 1;
my \$MASK1 = \$ALLONES / 3;
my \$MASK2 = \$ALLONES / 15 * 3;
my \$MASK3 = \$ALLONES / 255;

sub bcount {
my \$v = shift;
\$v = \$v - ( ( \$v >> 1 ) & \$MASK1 );
\$v = ( \$v & \$MASK2 ) + ( ( \$v >> 2 ) & \$MASK2 );
\$v = ( \$v + ( \$v >> 4 ) ) &  \$MASK4;
my \$c = ( \$v * ( \$MASK3 ) ) >> 120;
return \$c;
}

my \$bits = Math::GMPz->new( 0 );

## set every 3rd bit
\$bits |= \$ONE << \$_ for map \$_*3+1, 0 .. 42;
print \$bits;

## count the bits (43)
print bcount( \$bits );

## now invert them and count again. (85)
print bcount( ~\$bits );

__END__
C:\test>gmpz-t.pl
194447066811964836264785489961010406546
4019000903644796537894933644280473643
6698389137940470254461716813547458644

Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

Replies are listed 'Best First'.
Re^16: Module for 128-bit integer math? (updated)
by syphilis (Bishop) on Feb 13, 2011 at 00:14 UTC
the math isn't automatically module 2^128

Quite so ... I don't think the gmp library provides a way of enforcing fixed precision. Therefore "the overflow" doesn't get automatically discarded, and you would have to remove it yourself by doing a mod(2 ** 128):
```#! perl -slw
use strict;
use Math::GMPz qw(:mpz);

my \$ZERO = Math::GMPz->new( 0 );
my \$ONE  = Math::GMPz->new( 1 );
my \$MOD128  = Math::GMPz->new(1) << 128;
my \$ALLONES = \$MOD128 - 1;
my \$MASK1 = \$ALLONES / 3;
my \$MASK2 = \$ALLONES / 15 * 3;
my \$MASK3 = \$ALLONES / 255;

my \$bits = Math::GMPz->new( 0 );

## set every 3rd bit
\$bits |= \$ONE << \$_ for map \$_*3+1, 0 .. 42;
print \$bits;

## count the bits (43)
print bcount( \$bits );

## now invert them and count again. (85)
#print bcount( ~\$bits );
print bcount(\$ALLONES ^ \$bits);

sub bcount {
my \$v = shift;
\$v = \$v - ( ( \$v >> 1 ) & \$MASK1 );
\$v = ( \$v & \$MASK2 ) + ( ( \$v >> 2 ) & \$MASK2 );
\$v = ( \$v + ( \$v >> 4 ) ) &  \$MASK4;
my \$c = ( ( \$v * ( \$MASK3 ) ) % \$MOD128) >> 120;
return \$c;
}

__END__
C:\_64\pscrpt>perl buk1.pl
194447066811964836264785489961010406546
43
85
Not sure what to do with the ~ operator in Math::GMPz. Currently it just does a one's complement - which is not the right thing for your purposes.
I notice that Math::GMP doesn't overload the ~ operator ... perhaps Math::GMPz should do the same ?

Cheers,
Rob
you would have to remove it yourself by doing a mod(2 ** 128):

That's exactly what I wish to avoid. It's not even so much to do with having to perform the modulo, (pain in the ass though that it is). It's more knowing when it needs to be done.

For some calculations, doing them at arbitrary precision and doing a single modulo operation at the end will produce the right result, but at the cost of all the intermediate calculations being done at far higher precision, and therefore much more slowly, than is wanted for the final result.

As (just) an example of this, the following compares performing 128-bit FNV1 and FNV1a hashing, both fixed precision; and infinite precision with the mod'ing done per step, and once at the end:

```#! perl -slw
use strict;
use Benchmark qw[ cmpthese ];
use Math::Int128 qw[ uint128 uint128_to_hex ];
use Math::GMPz;

sub FNV_1_128 {
my \$s = shift;
my \$h = uint128( '144066263297769815596495629667062367629' );
my \$p = uint128( '309485009821345068724781371' );
\$h *= \$p, \$h ^= \$_ for unpack 'C*', \$s;
return \$h;
}

sub FNV_1a_128 {
my \$s = shift;
my \$h = uint128( '144066263297769815596495629667062367629' );
my \$p = uint128( '309485009821345068724781371' );
\$h ^= \$_, \$h *= \$p for unpack 'C*', \$s;
return \$h;
}

sub FNV_1_128_gmpz {
my \$s = shift;
my \$h = Math::GMPz->new( '144066263297769815596495629667062367629'
+ );
my \$p = Math::GMPz->new( '309485009821345068724781371' );
my \$m = Math::GMPz->new( 1 ) << 128;
\$h *= \$p, \$h ^= \$_ for unpack 'C*', \$s;
return \$h % \$m;
}

sub FNV_1a_128_gmpz {
my \$s = shift;
my \$h = Math::GMPz->new( '144066263297769815596495629667062367629'
+ );
my \$p = Math::GMPz->new( '309485009821345068724781371' );
my \$m = Math::GMPz->new( 1 ) << 128;
\$h ^= \$_, \$h *= \$p for unpack 'C*', \$s;
return \$h % \$m;
}

sub FNV_1_128_gmpz2 {
my \$s = shift;
my \$h = Math::GMPz->new( '144066263297769815596495629667062367629'
+ );
my \$p = Math::GMPz->new( '309485009821345068724781371' );
my \$m = Math::GMPz->new( 1 ) << 128;
\$h *= \$p, \$h ^= \$_, \$h %= \$m for unpack 'C*', \$s;
return \$h;
}

sub FNV_1a_128_gmpz2 {
my \$s = shift;
my \$h = Math::GMPz->new( '144066263297769815596495629667062367629'
+ );
my \$p = Math::GMPz->new( '309485009821345068724781371' );
my \$m = Math::GMPz->new( 1 ) << 128;
\$h ^= \$_, \$h *= \$p, \$h %= \$m for unpack 'C*', \$s;
return \$h;
}

our \$text = do{ local( @ARGV, \$/ ) = \$0; <> };
print length \$text;

cmpthese -1, {
int128 => q[
my \$fnv1  = uint128_to_hex( FNV_1_128( \$text ) );
my \$fnv1a = uint128_to_hex( FNV_1a_128( \$text ) );
],
GMPz => q[
my \$fnv1  = Math::GMPz::Rmpz_get_str( FNV_1_128_gmpz( \$text ),
+ 16 );
my \$fnv1a = Math::GMPz::Rmpz_get_str( FNV_1a_128_gmpz( \$text )
+, 16 );
],
GMPz2 => q[
my \$fnv1  = Math::GMPz::Rmpz_get_str( FNV_1_128_gmpz2( \$text )
+, 16 );
my \$fnv1a = Math::GMPz::Rmpz_get_str( FNV_1a_128_gmpz2( \$text
+), 16 );
],
};

__END__
C:\test>FNV128.pl
2455
Rate   GMPz  GMPz2 int128
GMPz   14.1/s     --   -77%   -95%
GMPz2  61.4/s   334%     --   -78%
int128  274/s  1840%   347%     --

Notice how the _gmpz2() functions are 3 1/2 times faster than the _gmpz() functions, despite that they perform 2,455 modulo 128 operations on intermediate values, versus the latter's single modulo 128 at the end. The cost of carrying that extra precision adds up fast.

But far more pernicious are calculations where not performing the modulo operation at the appropriate times actually changes the outcome of the calculation. I went through the same thing trying to use Math::BigInt to perform 64-bit math before I had a 64-integer capable Perl. You start sticking in modulo ops all over the place with the inevitable disastrous affects on both the readability of the calculation and performance. And then you start trying to back them out again one at a time and verifying the results to ensure you haven't screwed the calculation up in the process.

The performance affects are bad enough. But the affect on readability and maintainability of complex calculations is simply too high to pay. Same thing goes for using the function call interface rather than the overloaded interface. It renders simple calculations all but unreadable; and complex ones nigh impossible to construct.

That's why I asked up front about a 128-bit math package, not an infinite precision math package. And why I tried really hard to avoid this debate. (Note. Not with you, given that you've provided me with the to both packages :)

Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
That's why I asked up front about a 128-bit math package, not an infinite precision math package

Understood. At no stage was it was my intention to suggest that you should use Math::GMPz instead of Math::Int128 - I was merely querying the speed of Math::Int128 (whose basic math operations seem to be about 3 times slower than need be), and using Math::GMPz for benchmark comparisons.

And why I tried really hard to avoid this debate

Apologies for my part in forcing that upon you - though that was never my intention. (I know very well how annoying it is to be taken to task over matters that are irrelevant to the task at hand.)

Cheers,
Rob
Let me introduce Math::GMPn, providing fixed length arithmetic on top of the GMP library: says...
```         Rate  GMPz4   GMPz  GMPz2  GMPz3 int128   GMPn
GMPz4  2.19/s     --    -1%   -92%   -93%   -98%   -98%
GMPz   2.21/s     1%     --   -92%   -93%   -98%   -98%
GMPz2  28.1/s  1182%  1173%     --   -16%   -71%   -76%
GMPz3  33.3/s  1422%  1411%    19%     --   -66%   -71%
int128 97.2/s  4339%  4306%   246%   192%     --   -17%
GMPn    117/s  5220%  5182%   315%   250%    20%     --

update: as pointed out by BrowserUk below, there was an error in the benchmark script that has been corrected.

Create A New User
Node Status?
node history
Node Type: note [id://887763]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others imbibing at the Monastery: (5)
As of 2020-11-24 04:10 GMT
Sections?
Information?
Find Nodes?
Leftovers?
Voting Booth?

No recent polls found

Notices?