PerlMonks  

Re: modulus and floating point numbers

by ccn (Vicar)
on Nov 18, 2008 at 03:17 UTC ( [id://724183]=note )


in reply to modulus and floating point numbers

#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my @a = map { (rand() * 1000) / 1000 } 1 .. 1000;

cmpthese(-1, {
    'GF'  => sub {GF($_) for @a},
    'GF2' => sub {GF2($_) for @a},
    'dex' => sub {dex($_) for @a},
    'ccn' => sub {ccn($_) for @a}});

sub GF {
    my $number = abs($_[0] * 100);
    return $number - int ($number) > 0;
}

sub GF2 {
    my $number = $_[0] * 100;
    return $number - int ($number) > 0;
}

sub ccn {
    $_[0] =~ /\..../;
}

sub dex {
    my $number = $_[0];
    return (($number*100) - int($number*100) > 0);
}

__END__
       Rate  dex   GF  GF2  ccn
dex   931/s   --  -7% -19% -28%
GF   1000/s   7%   -- -13% -23%
GF2  1152/s  24%  15%   -- -11%
ccn  1297/s  39%  30%  13%   --

Replies are listed 'Best First'.
Re^2: modulus and floating point numbers
by dextius (Monk) on Nov 18, 2008 at 03:28 UTC
    You took off the addition and the parens.. I got..
          Rate  ccn  dex   GF  GF2
    ccn  296/s   -- -73% -74% -76%
    dex 1097/s 270%   --  -2% -10%
    GF  1124/s 279%   2%   --  -7%
    GF2 1213/s 309%  11%   8%   --
      I took off the addition deliberately. One can skip the addition if the input data is numeric. The situation becomes worse if the input must be validated.

      And once again, I am not sure if we can rely on the third digit after a floating point. Indeed, I will never use that sub.

        ... if we can rely on the third digit after a floating point.

        It's a sad and often forgotten fact: binary floating point is a remarkably poor choice of arithmetic for anything requiring accuracy to a given number of decimals.

        The expression ($x * 100) - int($x * 100) suffers at most one rounding error (the same one, twice, unless the optimiser steps in). Addition and subtraction are the most troublesome of the simple operations, but in this case the subtraction is exact. The int is by definition exact.

        So from a floating point arithmetic perspective, this is pretty good, really.

        The problem, of course, is that few decimal fractions have an exact binary fraction representation; most are, in fact, recurring binary fractions. Consider:

        $x = 0.54 ; print "$x: ", ($x * 100) - int($x * 100), "\n" ; # 0.54: 0
        $x = 0.56 ; print "$x: ", ($x * 100) - int($x * 100), "\n" ; # 0.56: 7.105427357601e-15
        $x = 0.58 ; print "$x: ", ($x * 100) - int($x * 100), "\n" ; # 0.58: 0.999999999999993

        None of 0.54, 0.56 and 0.58 has an exact representation in binary floating point form, but the rounding in $x * 100 happens to reverse that error in one of these three cases.

        This also shows that "stringification" is rounding to some number of significant decimal figures, which is hiding the error in the binary representation. So, the method offered by ccn, which implicitly "stringifies" the argument (if it is in number form) is, from a floating point perspective, less accurate -- but for what you want, more useful !
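
        A minimal sketch of that masking effect, assuming a perl built on ordinary IEEE-754 doubles. The helper names and the choice of %.17g are illustrative, not from the thread. It prints the same product once through default stringification and once with 17 significant digits, then applies the regex test to a computed value.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Illustrative helpers (hypothetical names, not from the thread).
sub default_form    { my ($n) = @_; return "$n"; }                  # Perl's default stringification
sub full_form       { my ($n) = @_; return sprintf('%.17g', $n); }  # enough digits to expose the stored double
sub regex_sub_penny { my ($n) = @_; return $n =~ /\..../ ? 1 : 0; } # the regex test, applied to the stringified form

for my $x (0.54, 0.56, 0.58) {
    my $scaled = $x * 100;
    print "$x: default=", default_form($scaled), " full=", full_form($scaled), "\n";
}

# A computed value whose binary tail is rounded away by stringification
# looks "clean" to the regex, although it is not exactly 0.3.
my $sum = 0.1 + 0.2;
print "0.1 + 0.2: default=", default_form($sum),
      " full=", full_form($sum),
      " regex says sub-penny=", regex_sub_penny($sum), "\n";
```

        The regex only ever sees the rounded decimal string, which is why it behaves the way a human reading the number would expect.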

        So, from an arithmetic perspective, before looking to see if you have any "sub penny" values, you need to decide what number of decimals you expect your values to be accurate to. If that's $D you could use:

        int(abs($_[0]) * 10**$D + 0.5) % 10**($D-2)
        to give you the value of any part below the 1/100-ths, in units of 1/10**$D -- noting that after the int this is all exact, integer arithmetic. If you don't happen to have a 64-bit processor about your person, then:
        use POSIX qw(fmod) ;
        fmod(int(abs($_[0]) * 10**$D + 0.5), 10**($D-2))
        will do the trick. Be aware, however, that the usual floating point (IEEE-754 Double) offers something over 15 decimal digits precision. So, if you set your $D to, say, 6 then you'll be in (or very close to) trouble if your values are 10^9 or more, and you might want to consider 10^8 to be your effective limit (depending on how much arithmetic you've done to reach the value you are testing).
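
        As a rough sketch, the two expressions can be wrapped in subs like this. The sub names are hypothetical, and $D is assumed to be at least 2 and small enough that the value times 10**$D stays within the precision discussed above.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use POSIX qw(fmod);

# Hypothetical helper: the part of $x below 1/100, in units of 1/10**$D.
# $D is the number of decimals the value is trusted to (assumed >= 2).
sub sub_penny_units {
    my ($x, $D) = @_;
    my $scaled = int(abs($x) * 10**$D + 0.5);   # round to $D decimals, held as an integer value
    return $scaled % 10**($D - 2);              # integer arithmetic from here on
}

# Same idea with fmod, for perls without 64-bit integers.
sub sub_penny_units_fmod {
    my ($x, $D) = @_;
    return fmod(int(abs($x) * 10**$D + 0.5), 10**($D - 2));
}

for my $x (0.54, 0.56, 0.58, 0.5678) {
    printf "%s: %d (%%) / %d (fmod)\n",
        $x, sub_penny_units($x, 4), sub_penny_units_fmod($x, 4);
}
```

        A non-zero result means the value has a "sub penny" part at the chosen accuracy. With $D set to 4, 0.5678 gives 78 and 0.58 gives 0.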

        You could run into range issues with the above because it needs an exact value for $x * 10**$D. So rather than round to a fixed number of decimals, it can be better to round to a number of significant decimal digits. However, that's not an easy thing to do. You could use different multipliers as $x gets bigger -- but it's getting messy. But in any case, if you want arithmetic good to, say 4 decimal places, you're limited to 10-11 digits before the decimal point.

        Now, as far as I can see, stringification is rounding to 15 significant decimal digits -- I imagine that's documented somewhere. This will mask any representation error and any errors introduced in the conversion from decimal to binary and back again. It will also mask "some" rounding errors, but it is hard to be precise about how many.

        You can, of course use sprintf to do rounding stuff for you:

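        $x = 0+sprintf('%0.6f', $x) ; # 6 decimal places
        $x = 0+sprintf('%1.5e', $x) ; # 6 significant digits

        but note that 'f' format never switches to exponent form: if the value is big enough you get a very long string of digits, and if it is small enough it rounds to zero. It is 'g' format (and Perl's own stringification of the result) that will start throwing exponent forms at you, so stay awake !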

        I haven't tested this exhaustively, but you could use: 0+sprintf('%1.13e', $x) to round to 14 significant decimal digits, which will allow for "quite a lot" of rounding errors.
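
        A sketch of how that rounding might be folded into the sub-penny test, assuming IEEE-754 doubles. The names round_sig and has_sub_penny are hypothetical, and like the suggestion above this has not been tested exhaustively.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: round $x to $n significant decimal digits via sprintf.
sub round_sig {
    my ($x, $n) = @_;
    return 0 + sprintf('%.*e', $n - 1, $x);   # '%.13e' when $n is 14
}

# Sub-penny test that rounds to 14 significant digits before comparing,
# so small rounding errors in the product do not count.
sub has_sub_penny {
    my ($x) = @_;
    my $scaled = round_sig(abs($x) * 100, 14);
    return $scaled - int($scaled) > 0 ? 1 : 0;
}

for my $x (0.54, 0.56, 0.58, 0.585) {
    print "$x: ", has_sub_penny($x), "\n";
}
```

        With this, 0.56 and 0.58 are no longer reported as having a sub-penny part, while 0.585 still is.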

        So, you might be happy with the implicit rounding provided by stringification (and assume it will always be 15 significant digits), or you might want to take charge of the problem and do the necessary arithmetic yourself. If you ever come across floating point that is not IEEE-754, you'll need to worry about its precision and range.
