Beefy Boxes and Bandwidth Generously Provided by pair Networks
Problems? Is your data what you think it is?

Re: Issue while reading decimal number from a Spreadsheet using perl

by rjt (Curate)
on Oct 21, 2019 at 08:16 UTC ( #11107749=note: print w/replies, xml ) Need Help??

in reply to Issue while reading decimal number from a Spreadsheet using perl

haukex correctly identified the issue as floating point error. While I'm sure the site that was linked does a better job of explaining this problem than I will, but even though there are a bazillion such explanations on the web already, I'm going to explain it briefly again, and then I'm going to relate it to your specific example, and then I'm finally going to show you how to avoid this class of errors.

Floating point errors (a very truncated explanation)

Your problem is not a problem with Excel or Perl, but a problem with how your computer stores floating point (non-integer) numbers. Computers store numbers in binary (base 2), while you are asking for them to be printed in decimal (base 10). For integers up to a maximum size, there is no loss of information converting between binary and decimal. Floating point numbers offer no such guarantee. They are stored as three distinct parts: a sign bit, the exponent, and the mantissa/significand, which means your number has to be converted to look like number = sign * mantissa * 2exponent. The problem is there are only so many bits available for the mantissa: 23 for single precision, or 52 for double. Those bits (b0 .. b23) are added together like so: b0/2 + b1/4 + ... Do you see now why some decimal values can't be represented exactly?

For example, -10.1 (your first example) looks like this in single precision IEEE-754 floating point format:

S Exponent Mantissa - -------- ----------------------- 1 10000010 01000011001100110011010 - 130 => 3 (1/1) + 0/2 + 1/4 + 0/8 + 1/16 + 1/32 + ... ~ 1.26250004768 +37158

The actual exponent is exponent - 127, so 130 - 127 gives you 23. This is to allow for negative exponents.

The mantissa is read from the back to the front (so the rightmost 0 corresponds to the 1/2s place, the one to the left of that is the 1/4s place, and so on). There's an implicit (1/1) added on that is not stored in the floating point number itself.

So putting all of that together, you get -1 * 23 * (0/2 + 1/4 + 0/8 + 1/16 + 1/32 + ...), and, you guessed it, that doesn't work out exactly to -10.1.

How to fix floating point1

In general, you can't; floating point binary numbers do not always convert exactly to decimal. You can increase the precision, but in practice, that just increases the count of unwanted 9s or 0s on the end of your number. You could use fixed point arithmetic if your numbers fall within a narrow enough range of exponents. You can use arbitrary precision like Math::BigFloat, but there is a massive performance penalty associated with that (not to mention the unbounded memory requirements), and you don't typically need unlimited precision anyway. So that brings me to the commonly used option, which is rounding to the required number of decimal places.

But you still need to be careful. Let's say you need two decimal places. You might think you can do printf "%.2f", $value and rely on that result. Not if you care about that last decimal place! Here's a classic example:

printf "%.2f =?= %.2f", 7.525, 6.02 * 1.25; # Output: 7.53 =?= 7.52

In other words, numbers that should be identical may be rounded differently. To get around this, you need to request one more decimal place than you need, and then look at the last digit (in decimal form) to decide whether to round up or down. Thankfully, there's a module for that:

use Math::Round; printf "%s == %s\n", nearest(0.01, 7.525), nearest(0.01, 6.02 * 1.25); # 7.53 == 7.53

When to round is also important. You typically want to round right before you display or output the number, not before, or you will lose precision. And don't round and then shove the result back into a variable Perl will treat as a number, or you won't have fixed anything!

Floating point numbers are full of other fun pitfalls, especially if you start mixing formats. Please do read the link that haukex gave you for a much more complete discussion.

1. Pun absolutely intendianed.

Replies are listed 'Best First'.
Re^2: Issue while reading decimal number from a Spreadsheet using perl
by KishKishore (Novice) on Oct 22, 2019 at 06:02 UTC

    Thanks a lot.. This helped me understand many things along with my issue

Log In?

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://11107749]
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others taking refuge in the Monastery: (5)
As of 2023-03-26 08:54 GMT
Find Nodes?
    Voting Booth?
    Which type of climate do you prefer to live in?

    Results (63 votes). Check out past polls.