Fun with Tiny Numbers

We've all seen dozens of threads where someone expected two floating-point numbers calculated in different ways to be equal, and was shocked to find that they weren't. They generally assume there's a bug in the language.

Here's something a little different.

Consider a very small number: 1.234567e-302. This is tiny, but can easily be approximated within the precision of the 64-bit type Perl uses for numbers on i686 and x86_64.

$ perl -E 'say 1.234567e-302'
1.234567e-302
[download]

Now let's make it just a little larger. Adding a digit to the end will have that effect: 1.2345678e-302.

$ perl -E 'say 1.2345678e-302'
1e-302
[download]

Wait, what?

Perhaps it's just a formatting issue.

$ perl -E 'say "help" if 1.2345678e-302 == 1e-302'
help
$ perl -E 'say "what" if 1.2345678e-302 < 1.234567e-302'
what
[download]

Okay, perhaps we're trying to produce a number Perl can't actually represent?

$ perl -MPOSIX -E 'say scalar strtod "1.2345678e-302"'
1.2345678e-302
[download]

Nope, looks like this value can be approximated by a scalar just fine, when it's produced by a C function.

I actually ran into this at work while I was fuzzing some numerical code. At the time, I was most interested in finding a workaround, so the strtod approach got me round the problem -- but the oddness intrigued me, so I decided to try and figure it out.

I quickly realised it had to be a parsing issue. After all, the problem is triggered by the number of decimal places, not the digits involved nor the size of the exponent -- even adding a trailing zero can affect the way a literal is parsed:

$ perl -E 'say 5.5000000000e-298; say 5.50000000000e-298'
5.5e-298
5e-298
[download]

I was about to jump into the Perl source code when I paused for a little googling, and found a nice article by brian_d_foy explaining how Perl parses scientific notation, which goes through the relevant functions line by line. And it's easy enough to identify where everything's going pear-shaped. It is, of course, a floating-point rounding error -- but this one is indirect. brian even identifies the code in question as a place where precision can be lost.

It turns out that perl essentially parses 5.5000000000e-298 by taking the two sides of the decimal point separately, and calculating 5 / 1e298 + 5,000,000,000 / 1e308. When we add the extra zero, Perl tries to calculate 50,000,000,000 / 1e309 instead -- but that exponent does exceed the floating-point precision, the number overflows to infinity, and we end up with 5e-298 + 0.

So this might actually count as a bug in perl for once, though it's probably a known consequence of the algorithm and I certainly don't expect anyone to scramble to patch it. Either way, it's still interesting: unlike most floating-point gotchas it's not something inherent to working with inexact numbers, since other implementations of atof can and do parse these literals as expected -- it's just Perl's that doesn't.

Bonus observation: the above results were acquired on x86_64. Anyone who tries the examples above on a 32-bit x86 platform may have found that the problem does not manifest itself there -- unless you build Perl yourself with -Doptimize=-g, in which case the problem suddenly appears. I guess 32-bit gcc is managing to perform the whole calculation in 80-bit registers, which can handle e-309 just fine, and debug compilation forces it to store intermediate values in 64-bit variables.

Comment on Fun with Tiny Numbers Select or Download Code


There's more than one way to do things
	PerlMonks