Beefy Boxes and Bandwidth Generously Provided by pair Networks
There's more than one way to do things
 
PerlMonks  

Re^17: Module for 128-bit integer math? (updated)

by BrowserUk (Patriarch)
on Feb 13, 2011 at 14:28 UTC ( [id://887867]=note: print w/replies, xml ) Need Help??


in reply to Re^16: Module for 128-bit integer math? (updated)
in thread Module for 128-bit integer math?

you would have to remove it yourself by doing a mod(2 ** 128):

That's exactly what I wish to avoid. It's not even so much to do with having to perform the modulo, (pain in the ass though that it is). It's more knowing when it needs to be done.

For some calculations, doing them at arbitrary precision and doing a single modulo operation at the end will produce the right result, but at the cost of all the intermediate calculations being done at far higher precision, and therefore much more slowly, than is wanted for the final result.

As (just) an example of this, the following compares performing 128-bit FNV1 and FNV1a hashing, both fixed precision; and infinite precision with the mod'ing done per step, and once at the end:

#! perl -slw use strict; use Benchmark qw[ cmpthese ]; use Math::Int128 qw[ uint128 uint128_to_hex ]; use Math::GMPz; sub FNV_1_128 { my $s = shift; my $h = uint128( '144066263297769815596495629667062367629' ); my $p = uint128( '309485009821345068724781371' ); $h *= $p, $h ^= $_ for unpack 'C*', $s; return $h; } sub FNV_1a_128 { my $s = shift; my $h = uint128( '144066263297769815596495629667062367629' ); my $p = uint128( '309485009821345068724781371' ); $h ^= $_, $h *= $p for unpack 'C*', $s; return $h; } sub FNV_1_128_gmpz { my $s = shift; my $h = Math::GMPz->new( '144066263297769815596495629667062367629' + ); my $p = Math::GMPz->new( '309485009821345068724781371' ); my $m = Math::GMPz->new( 1 ) << 128; $h *= $p, $h ^= $_ for unpack 'C*', $s; return $h % $m; } sub FNV_1a_128_gmpz { my $s = shift; my $h = Math::GMPz->new( '144066263297769815596495629667062367629' + ); my $p = Math::GMPz->new( '309485009821345068724781371' ); my $m = Math::GMPz->new( 1 ) << 128; $h ^= $_, $h *= $p for unpack 'C*', $s; return $h % $m; } sub FNV_1_128_gmpz2 { my $s = shift; my $h = Math::GMPz->new( '144066263297769815596495629667062367629' + ); my $p = Math::GMPz->new( '309485009821345068724781371' ); my $m = Math::GMPz->new( 1 ) << 128; $h *= $p, $h ^= $_, $h %= $m for unpack 'C*', $s; return $h; } sub FNV_1a_128_gmpz2 { my $s = shift; my $h = Math::GMPz->new( '144066263297769815596495629667062367629' + ); my $p = Math::GMPz->new( '309485009821345068724781371' ); my $m = Math::GMPz->new( 1 ) << 128; $h ^= $_, $h *= $p, $h %= $m for unpack 'C*', $s; return $h; } our $text = do{ local( @ARGV, $/ ) = $0; <> }; print length $text; cmpthese -1, { int128 => q[ my $fnv1 = uint128_to_hex( FNV_1_128( $text ) ); my $fnv1a = uint128_to_hex( FNV_1a_128( $text ) ); ], GMPz => q[ my $fnv1 = Math::GMPz::Rmpz_get_str( FNV_1_128_gmpz( $text ), + 16 ); my $fnv1a = Math::GMPz::Rmpz_get_str( FNV_1a_128_gmpz( $text ) +, 16 ); ], GMPz2 => q[ my $fnv1 = Math::GMPz::Rmpz_get_str( FNV_1_128_gmpz2( $text ) +, 16 ); my $fnv1a = Math::GMPz::Rmpz_get_str( FNV_1a_128_gmpz2( $text +), 16 ); ], }; __END__ C:\test>FNV128.pl 2455 Rate GMPz GMPz2 int128 GMPz 14.1/s -- -77% -95% GMPz2 61.4/s 334% -- -78% int128 274/s 1840% 347% --

Notice how the _gmpz2() functions are 3 1/2 times faster than the _gmpz() functions, despite that they perform 2,455 modulo 128 operations on intermediate values, versus the latter's single modulo 128 at the end. The cost of carrying that extra precision adds up fast.

But far more pernicious are calculations where not performing the modulo operation at the appropriate times actually changes the outcome of the calculation. I went through the same thing trying to use Math::BigInt to perform 64-bit math before I had a 64-integer capable Perl. You start sticking in modulo ops all over the place with the inevitable disastrous affects on both the readability of the calculation and performance. And then you start trying to back them out again one at a time and verifying the results to ensure you haven't screwed the calculation up in the process.

The performance affects are bad enough. But the affect on readability and maintainability of complex calculations is simply too high to pay. Same thing goes for using the function call interface rather than the overloaded interface. It renders simple calculations all but unreadable; and complex ones nigh impossible to construct.

That's why I asked up front about a 128-bit math package, not an infinite precision math package. And why I tried really hard to avoid this debate. (Note. Not with you, given that you've provided me with the to both packages :)


Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

Replies are listed 'Best First'.
Re^18: Module for 128-bit integer math? (updated)
by syphilis (Archbishop) on Feb 14, 2011 at 11:55 UTC
    That's why I asked up front about a 128-bit math package, not an infinite precision math package

    Understood. At no stage was it was my intention to suggest that you should use Math::GMPz instead of Math::Int128 - I was merely querying the speed of Math::Int128 (whose basic math operations seem to be about 3 times slower than need be), and using Math::GMPz for benchmark comparisons.

    And why I tried really hard to avoid this debate

    Apologies for my part in forcing that upon you - though that was never my intention. (I know very well how annoying it is to be taken to task over matters that are irrelevant to the task at hand.)

    Cheers,
    Rob
      I was merely querying the speed of Math::Int128

      It is early days for Math::Int128. Now salva's made it work, it's up to those of us using it to drive the process of making it work faster, as the need arises. Along with any enhancements we'd like. I have a few already.

      (whose basic math operations seem to be about 3 times slower than need be

      Where does the "3 times slower" come from? Looking back at your benchmark I see the functional interfaces of Int128 being 30% to 40% slower than those of GMPz. Or did I misread the numbers?


      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.
        Where does the "3 times slower" come from?

        The (unposted) code that I wrote lifted the particular M::I128 multiplication that I benchmarked from 371517 multiplications/s to 927835 multiplications/s.
        That's a bit less than a factor of "3", admittedly. The rest was just my bad memory ... or perhaps deliberate exaggeration ;-)

        Cheers,
        Rob
Re^18: Module for 128-bit integer math? (updated)
by salva (Canon) on Feb 23, 2011 at 17:08 UTC
    Let me introduce Math::GMPn, providing fixed length arithmetic on top of the GMP library: says...
    Rate GMPz4 GMPz GMPz2 GMPz3 int128 GMPn GMPz4 2.19/s -- -1% -92% -93% -98% -98% GMPz 2.21/s 1% -- -92% -93% -98% -98% GMPz2 28.1/s 1182% 1173% -- -16% -71% -76% GMPz3 33.3/s 1422% 1411% 19% -- -66% -71% int128 97.2/s 4339% 4306% 246% 192% -- -17% GMPn 117/s 5220% 5182% 315% 250% 20% --

    update: as pointed out by BrowserUk below, there was an error in the benchmark script that has been corrected.

      I haven't tried to build this yet, ((I think?)I would need to use MPIR on 64-bit), so I haven't seen the full output, but am I missing something with regard to your use of ior rather than xor?

      sub FNV_1_128_gmpn { my $s = shift; mpn_set_str(my $h, '144066263297769815596495629667062367629', 10, +128); mpn_set_str(my $p, '309485009821345068724781371', 10, 128); for (unpack 'C*', $s) { my $h2 = $h; mpn_mul($h, $h2, $p); mpn_ior_uint($h, $h, $_); } return $h; } sub FNV_1a_128_gmpn { my $s = shift; mpn_set_str(my $h, '144066263297769815596495629667062367629', 10, +128); mpn_set_str(my $p, '309485009821345068724781371', 10, 128 ); for (unpack 'C*', $s) { mpn_ior_uint($h, $h, $_); my $h2 = $h; mpn_mul($h, $h2, $p); } return $h; }

      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.
        but am I missing something with regard to your use of ior rather than xor?

        Oops, that was an error on the benchmark script. It has been corrected now (and a new version of the module released).

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://887867]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others studying the Monastery: (4)
As of 2024-04-23 23:22 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found