Floating point precision in literals vs calculations - java

I'm wondering why floating-point numbers in Java appear to have exact values when they are initialized as literals, but appear to be approximate when they are the result of a calculation.
For example:
double num1 = 0.3;
double num2 = 0.1 + 0.2;
System.out.println(num1);
System.out.println(num2);
Why is the result:
0.3
0.30000000000000004
and not:
0.30000000000000004
0.30000000000000004
given that there is no exact binary representation of 0.3?
I know about the BigDecimal class, but I don't quite understand this apparent inconsistency in the primitive types.

None of the three numbers can be represented exactly as a double. The reason you get different results is that the value of 0.1 + 0.2 has a different representation error than 0.3. The difference of about 5.5E-17 is enough to show up when printing the result:
double a = 0.2;
double b = 0.1;
double c = 0.3;
double d = a+b;
double e = d-c; // This is 5.551115123125783E-17

When 0.3 is converted to its representation as ones and zeroes then converted back to decimal, it rounds to 0.3.
However, when 0.1 and 0.2 are respectively converted to binary, the errors add up upon addition so as to show up when the sum is converted back to decimal.
A thorough explanation would involve demonstrating the IEEE representation of each number along with the addition and conversions. A bit involved, but I hope you got the idea.
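For the curious, a small sketch (class name is made up) can show the raw IEEE 754 bit patterns directly via Double.doubleToLongBits; the two values differ by exactly one unit in the last place:

```java
public class BitsDemo {
    public static void main(String[] args) {
        // Raw IEEE 754 bit patterns of the two doubles, shown as hex.
        System.out.println(Long.toHexString(Double.doubleToLongBits(0.3)));
        // 3fd3333333333333
        System.out.println(Long.toHexString(Double.doubleToLongBits(0.1 + 0.2)));
        // 3fd3333333333334  (one ulp above the double nearest 0.3)
    }
}
```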

The addition itself cannot produce an exact representation of 0.3, hence printing the result of 0.1 + 0.2 yields 0.30000000000000004.
On the other hand, when calling System.out.println(0.3);, the println(double) method will perform some rounding on the result: it eventually calls Double.toString(double) which mentions that the result is approximate:
How many digits must be printed for the fractional part of m or a? There must be at least one digit to represent the fractional part, and beyond that as many, but only as many, more digits as are needed to uniquely distinguish the argument value from adjacent values of type double. That is, suppose that x is the exact mathematical value represented by the decimal representation produced by this method for a finite nonzero argument d. Then d must be the double value nearest to x; or if two double values are equally close to x, then d must be one of them and the least significant bit of the significand of d must be 0.
If you use a BigDecimal the difference can be seen:
System.out.println(0.3); // 0.3
System.out.println(new BigDecimal(0.3)); // 0.299999999999999988897769753748434595763683319091796875

Related

The calculation accuracy of floating-point numbers (float, double) in Java (IEEE 754)

I tried the following calculations in jshell of JDK 11:
jshell> 0.1 + 0.2 == 0.3
$1 ==> false
I can understand this result. After all, none of 0.1, 0.2, and 0.3 can be represented exactly in binary.
But when I switched from double to float, I was surprised to find that jshell returned true:
jshell> 0.1f + 0.2f == 0.3f
$2 ==> true
This runs contrary to what I have always understood.
So I tried to make jshell calculate directly:
jshell> 0.1 + 0.2
$3 ==> 0.30000000000000004
jshell> 0.1f + 0.2f
$4 ==> 0.3
Indeed, with the float data type it seems to compute the result as exactly 0.3.
But why? If double can't accurately represent and compute 0.1 + 0.2, why can float?
If there is any error in my test, please point it out.
Thanks in advance!
Each time you perform floating-point operation, the ideal real-number-arithmetic result is rounded to the nearest value representable in the floating-point format, using whichever rounding method applies to the operation (most often round-to-nearest-ties-to-even).
Sometimes a rounding will be in the direction that cancels previous roundings. Sometimes a rounding will be in the direction that exacerbates previous roundings.
Converting the source text 0.1 to double is a floating-point operation. It produces 0.1000000000000000055511151231257827021181583404541015625, so rounding made the result bigger.
Converting 0.2 to double produces 0.200000000000000011102230246251565404236316680908203125, again bigger.
Converting 0.3 to double produces 0.299999999999999988897769753748434595763683319091796875, so rounding made the result smaller.
Adding the first two, 0.1000000000000000055511151231257827021181583404541015625 and
0.200000000000000011102230246251565404236316680908203125, produces
0.3000000000000000444089209850062616169452667236328125. Here rounding again increased the result.
Converting 0.1f to float produces 0.100000001490116119384765625.
Converting 0.2f to float produces 0.20000000298023223876953125.
Converting 0.3f to float produces 0.300000011920928955078125. In this case, it happens that, because fewer numbers are representable in float than in double, the next float value below 0.3 is farther away from 0.3 than 0.300000011920928955078125 is. So converting 0.3f to float rounds up even though converting 0.3 to double rounds down.
Adding the first two of these float values, 0.100000001490116119384765625 and 0.20000000298023223876953125, produces 0.300000011920928955078125. Since that is the same as the result of converting 0.3f, 0.1f + 0.2f == 0.3f evaluates as true.
Another thing to note is that Java’s default display of floating-point numbers produces just enough significant digits to uniquely distinguish the value within its floating-point format. This means, when Java shows “0.3” for a number, it does not mean the floating-point value is 0.3. It means the floating-point value is closer to 0.3 than any other value in that format is, so printing “0.3” is enough to identify it.
This means that when “0.3” is printed for a double, the actual double value is 0.299999999999999988897769753748434595763683319091796875, but, when “0.3” is printed for a float, the actual float value is 0.300000011920928955078125. Java is not designed to show you the true values of floating-point numbers.
Even as a float, the number isn't actually 0.3.
import java.math.BigDecimal;

public class Main {
    public static void main(String[] args) {
        float a = 0.1f;
        float b = 0.2f;
        float c = 0.3f;
        float d = a + b;
        System.out.println(new BigDecimal(c) + ", " + new BigDecimal(d));
    }
}
That prints:
0.300000011920928955078125, 0.300000011920928955078125
You've found a case where the errors are offsetting.
It would be similar to using a decimal system, and adding 1/3 + 2/3. 1/3 is approximately 0.333 and 2/3 is approximately 0.667 so when you add the two you get 1.0 even though both of them are approximate.

Does Double.toString always produce exactly the same value as the double literal?

Double.toString(0.1) produces "0.1", but 0.1 is a floating-point number.
Decimal fractions like 0.1 can't be represented exactly as binary floating-point numbers, yet Double.toString produces the exact result ("0.1"). How does it do that? Does it always produce a result that is mathematically equal to the double literal?
Assume that the literal is in double precision.
Here is the problem where I have seen this:
When using Apache POI to read an Excel file, XSSFCell.getNumericCellValue can only return a double. If I use BigDecimal.valueOf to convert it to a BigDecimal, is that always safe, and why?
Double.toString produces the exact result ("0.1"): how does it do that, and does it always produce a result mathematically equal to the double literal?
Double.toString(XXX) will always produce a numeral equal to XXX if XXX is a decimal numeral with 15 or fewer significant digits and it is within the range of the Double format.
There are two reasons for this:
The Double format (IEEE-754 binary64) has enough precision so that 15-digit decimal numerals can always be distinguished.
Double.toString does not display the exact Double value but instead produces the fewest significant digits needed to distinguish the number from nearby Double values.
For example, the literal 0.1 in source text is converted to the Double value 0.1000000000000000055511151231257827021181583404541015625. But Double.toString will not produce all those digits by default. The algorithm it uses produces “0.1” because that is enough to uniquely distinguish 0.1000000000000000055511151231257827021181583404541015625 from its two neighbors, which are 0.09999999999999999167332731531132594682276248931884765625 and 0.10000000000000001942890293094023945741355419158935546875. Both of those are farther from 0.1.
Thus, Double.toString(1.234), Double.toString(123.4e-2), and Double.toString(.0001234e4) will all produce “1.234”—a numeral whose value equals all of the original decimal numerals (before they are converted to Double), although it differs in form from some of them.
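A small sketch of this round-trip behaviour (class name is arbitrary):

```java
public class RoundTripDemo {
    public static void main(String[] args) {
        // Three literals denoting the same real number convert to the
        // same double and therefore print the same shortest numeral.
        System.out.println(Double.toString(1.234));      // 1.234
        System.out.println(Double.toString(123.4e-2));   // 1.234
        System.out.println(Double.toString(.0001234e4)); // 1.234

        // Parsing the printed string always recovers the original double.
        double x = 1.234;
        System.out.println(Double.parseDouble(Double.toString(x)) == x); // true
    }
}
```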
When using Apache POI to read an Excel file, XSSFCell.getNumericCellValue can only return a double; if I use BigDecimal.valueOf to convert it to a BigDecimal, is that always safe, and why?
If the cell value being retrieved is not representable as a Double, then XSSFCell.getNumericCellValue must change it. From memory, I think BigDecimal.valueOf will produce the exact value of the Double returned, but I cannot speak authoritatively to this. That is a separate question from how Double and Double.toString behave, so you might ask it as a separate question.
10e-5d is a double literal whose value is 10 × 10^-5 = 10^-4.
Double.toString(10e-5d) returns "1.0E-4", which is equal in value though different in form.
Well, the double type has limited precision, so if you write enough digits after the decimal point, some of them will be truncated/rounded.
For example:
System.out.println (Double.toString(0.123456789123456789))
prints
0.12345678912345678
I agree with Eric Postpischil's answer, but another explanation may help.
For each double number there is a range of real numbers that round to it under round-half-even rules. For 0.1000000000000000055511151231257827021181583404541015625, the result of rounding 0.1 to a double, the range is [0.099999999999999998612221219218554324470460414886474609375,0.100000000000000012490009027033011079765856266021728515625].
Any double literal whose real number arithmetic value is in that range has the same double value as 0.1.
Double.toString(x) returns the String representation of the real number in the range that converts to x and has the fewest decimal places. Picking any real number in that range ensures that the round trip converting a double to a String using Double.toString and then converting the String back to a double using round-half-even rules recovers the original value.
System.out.println(0.100000000000000005); prints "0.1" because 0.100000000000000005 is in the range that rounds to the same double as 0.1, and 0.1 is the real number in that range with the fewest decimal places.
This effect is rarely visible because literals other than "0.1" with real number value in the range are rare. It is more noticeable for float because of the lesser precision. System.out.println(0.100000001f); prints "0.1".
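Both effects can be checked directly (a quick sketch):

```java
public class ShortestFormDemo {
    public static void main(String[] args) {
        // This literal falls inside the range that rounds to the same
        // double as 0.1, so it prints as "0.1".
        System.out.println(0.100000000000000005);        // 0.1
        System.out.println(0.100000000000000005 == 0.1); // true

        // The effect is easier to hit with float's lesser precision.
        System.out.println(0.100000001f);                // 0.1
        System.out.println(0.100000001f == 0.1f);        // true
    }
}
```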

Exact representation of unrepresentable floating point numbers in Java [duplicate]

This question already has answers here:
Why 0.1 represented in float correctly? (I know why not in result of 2.0-1.9)
This may seem a very silly question, but I encountered something mysterious and seemingly beautiful today. I tried Googling around, and I couldn't dig anything up.
I am aware that 0.1 cannot be represented in binary. And YET, when I run the following code:
double g = 1.0;
System.out.println(g);
g = g/10;
System.out.println(g);
g = g*3;
System.out.println(g);
This produces the output:
1.0
0.1
0.30000000000000004
The first output and the third output are expected, but what is going on with the second one? Why is it, well, correct? This should be impossible, and yet, there it is.
Why is it, well, correct?
As you noted, many decimal floating point numbers cannot be represented as binary floating point numbers and vice versa.
When you write a statement like this:
double g = 0.1;
the decimal value is converted to the nearest binary floating point value. And when you then print it like this
System.out.println(g);
the formatter produces the nearest decimal floating point value according to the following rules:
How many digits must be printed for the fractional part? There must be at least one digit to represent the fractional part, and beyond that as many, but only as many, more digits as are needed to uniquely distinguish the argument value from adjacent values of type double.
(Reference: Double.toString(double) javadoc )
That means that you will often get the exact decimal representation of the decimal number that you started with.
In simple terms, the error in converting from decimal to binary is the same as the error when converting from binary to decimal. The errors "cancel out".
Now, this doesn't always happen. Often the cumulative errors in the calculation are large enough the errors in the decimal (and binary) results will be apparent in the output.
Numeric Promotion Rules
If two values have different data types, Java will automatically promote one of the values to the larger of the two data types.
If one of the values is integral and the other is floating-point, Java will automatically promote the integral value to the floating-point value's data type.
Smaller data types, namely byte, short, and char, are first promoted to int any time they're used with a Java binary arithmetic operator, even if neither of the operands is int.
After all promotion has occurred and the operands have the same data type, the resulting value will have the same data type as its promoted operands.
Let us step through the computation line by line:
double g = 1.0;
g is the float64 number representing exactly 1.0.
g = g / 10;
The right operand is converted to double, so it is 10.0 exact.
The division operation is performed at infinite precision (conceptually), and then rounded to the closest float64 number as the result.
The exact answer is clearly 0.1. But the closest float64 number to 0.1 is exactly 7205759403792794 / 2^56.
Hence g = 0.10000000000000000555111512312578...(more digits). If you want to print the full-precision exact value, look at new BigDecimal(g).
g = g * 3;
Again, the right operand is converted to 3.0 exact. We multiply 0.1000000000000000055511151231257(...) by 3 to get 0.3000000000000000166533453693773(...).
The value of g now is exactly 5404319552844596 / 2^54.
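The exact values at each step can be inspected with new BigDecimal(g), as mentioned above (a small sketch):

```java
import java.math.BigDecimal;

public class ExactStepsDemo {
    public static void main(String[] args) {
        double g = 1.0;
        g = g / 10;
        // The stored value is the double nearest 0.1, not 0.1 itself.
        System.out.println(new BigDecimal(g));
        // 0.1000000000000000055511151231257827021181583404541015625

        g = g * 3;
        System.out.println(new BigDecimal(g));
        // 0.3000000000000000444089209850062616169452667236328125
    }
}
```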

Why does adding 0.1 multiple times remain lossless?

I know the 0.1 decimal number cannot be represented exactly with a finite binary number (explanation), so double n = 0.1 will lose some precision and will not be exactly 0.1. On the other hand 0.5 can be represented exactly because it is 0.5 = 1/2 = 0.1b.
Having said that it is understandable that adding 0.1 three times will not give exactly 0.3 so the following code prints false:
double sum = 0, d = 0.1;
for (int i = 0; i < 3; i++)
sum += d;
System.out.println(sum == 0.3); // Prints false, OK
But then how is it that adding 0.1 five times will give exactly 0.5? The following code prints true:
double sum = 0, d = 0.1;
for (int i = 0; i < 5; i++)
sum += d;
System.out.println(sum == 0.5); // Prints true, WHY?
If 0.1 cannot be represented exactly, how is it that adding it 5 times gives exactly 0.5 which can be represented precisely?
The rounding error is not random; the way it is implemented, it attempts to minimise the error. This means that sometimes the error is not visible, or there is no error at all.
For example, 0.1 is not exactly 0.1, i.e. new BigDecimal("0.1") is less than new BigDecimal(0.1), but 0.5 is exactly 1.0/2.
This program shows you the true values involved.
BigDecimal _0_1 = new BigDecimal(0.1);
BigDecimal x = _0_1;
for (int i = 1; i <= 10; i++) {
    System.out.println(i + " x 0.1 is " + x + ", as double " + x.doubleValue());
    x = x.add(_0_1);
}
prints
1 x 0.1 is 0.1000000000000000055511151231257827021181583404541015625, as double 0.1
2 x 0.1 is 0.2000000000000000111022302462515654042363166809082031250, as double 0.2
3 x 0.1 is 0.3000000000000000166533453693773481063544750213623046875, as double 0.30000000000000004
4 x 0.1 is 0.4000000000000000222044604925031308084726333618164062500, as double 0.4
5 x 0.1 is 0.5000000000000000277555756156289135105907917022705078125, as double 0.5
6 x 0.1 is 0.6000000000000000333066907387546962127089500427246093750, as double 0.6000000000000001
7 x 0.1 is 0.7000000000000000388578058618804789148271083831787109375, as double 0.7000000000000001
8 x 0.1 is 0.8000000000000000444089209850062616169452667236328125000, as double 0.8
9 x 0.1 is 0.9000000000000000499600361081320443190634250640869140625, as double 0.9
10 x 0.1 is 1.0000000000000000555111512312578270211815834045410156250, as double 1.0
Note: that 0.3 is slightly off, but when you get to 0.4 the bits have to shift down one to fit into the 53-bit limit and the error is discarded. Again, an error creeps back in for 0.6 and 0.7 but for 0.8 to 1.0 the error is discarded.
Adding it 5 times should accumulate the error, not cancel it.
The reason there is an error is due to limited precision. i.e 53-bits. This means that as the number uses more bits as it get larger, bits have to be dropped off the end. This causes rounding which in this case is in your favour.
You can get the opposite effect when getting a smaller number e.g. 0.1-0.0999 => 1.0000000000000286E-4
and you see more error than before.
An example of this is the Java 6 question Why does Math.round(0.49999999999999994) return 1. In that case the loss of a bit in the calculation results in a big difference in the answer.
Barring overflow, in floating-point, x + x + x is exactly the correctly rounded (i.e. nearest) floating-point number to the real 3*x, x + x + x + x is exactly 4*x, and x + x + x + x + x is again the correctly rounded floating-point approximation for 5*x.
The first result, for x + x + x, derives from the fact that x + x is exact. x + x + x is thus the result of only one rounding.
The second result is more difficult, one demonstration of it is discussed here (and Stephen Canon alludes to another proof by case analysis on the last 3 digits of x). To summarize, either 3*x is in the same binade as 2*x or it is in the same binade as 4*x, and in each case it is possible to deduce that the error on the third addition cancels the error on the second addition (the first addition being exact, as we already said).
The third result, “x + x + x + x + x is correctly rounded”, derives from the second in the same way that the first derives from the exactness of x + x.
The second result explains why 0.1 + 0.1 + 0.1 + 0.1 is exactly the floating-point number 0.4: the rational numbers 1/10 and 4/10 get approximated the same way, with the same relative error, when converted to floating-point. These floating-point numbers have a ratio of exactly 4 between them. The first and third result show that 0.1 + 0.1 + 0.1 and 0.1 + 0.1 + 0.1 + 0.1 + 0.1 can be expected to have less error than might be inferred by naive error analysis, but, in themselves, they only relate the results to respectively 3 * 0.1 and 5 * 0.1, which can be expected to be close but not necessarily identical to 0.3 and 0.5.
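These identities are easy to confirm in Java (a quick sketch):

```java
public class RepeatedAdditionDemo {
    public static void main(String[] args) {
        double x = 0.1;
        System.out.println(x + x + x == 0.3);         // false: 3*x is not the double 0.3
        System.out.println(x + x + x + x == 0.4);     // true
        System.out.println(x + x + x + x + x == 0.5); // true
    }
}
```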
If you keep adding 0.1 after the fourth addition, you will finally observe rounding errors that make “0.1 added to itself n times” diverge from n * 0.1, and diverge even more from n/10. If you were to plot the values of “0.1 added to itself n times” as a function of n, you would observe lines of constant slope by binades (as soon as the result of the nth addition is destined to fall into a particular binade, the properties of the addition can be expected to be similar to previous additions that produced a result in the same binade). Within a same binade, the error will either grow or shrink. If you were to look at the sequence of the slopes from binade to binade, you would recognize the repeating digits of 0.1 in binary for a while. After that, absorption would start to take place and the curve would go flat.
Floating-point implementations do various clever things, including carrying a few extra (guard) bits of precision for rounding. Thus the very small error due to the inexact representation of 0.1 ends up being rounded away when the sum reaches 0.5.
Think of floating point as being a great but INEXACT way to represent numbers. Not all possible numbers are easily represented in a computer. Irrational numbers like PI. Or like SQRT(2). (Symbolic math systems can represent them, but I did say "easily".)
The floating point value may be extremely close, but not exact. It may be so close that you could navigate to Pluto and be off by millimeters. But still not exact in a mathematical sense.
Don't use floating point when you need to be exact rather than approximate. For example, accounting applications want to keep exact track of a certain number of pennies in an account. Integers are good for that because they are exact. The primary issue you need to watch for with integers is overflow.
Using BigDecimal for currency works well because the underlying representation is an integer, albeit a big one.
Recognizing that floating point numbers are inexact, they still have a great many uses. Coordinate systems for navigation or coordinates in graphics systems. Astronomical values. Scientific values. (You probably cannot know the exact mass of a baseball to within a mass of an electron anyway, so inexactness doesn't really matter.)
For counting applications (including accounting) use integer. For counting the number of people that pass through a gate, use int or long.
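A common pattern for money (a sketch; the variable names are made up) is to count the smallest unit in a long, so every value and sum is exact:

```java
public class CentsDemo {
    public static void main(String[] args) {
        // Keep money as whole cents in a long: integer arithmetic is exact.
        long priceCents = 1999; // $19.99
        long taxCents = 160;    // $1.60
        long totalCents = priceCents + taxCents;
        System.out.printf("$%d.%02d%n", totalCents / 100, totalCents % 100);
        // $21.59
    }
}
```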

Java BigDecimal precision problems

I know the following behavior is an old problem, but still I don't understand.
System.out.println(0.1 + 0.1 + 0.1);
Even when I use BigDecimal:
System.out.println(new BigDecimal(0.1).doubleValue()
+ new BigDecimal(0.1).doubleValue()
+ new BigDecimal(0.1).doubleValue());
Why is the result 0.30000000000000004 instead of 0.3?
How can I solve this?
What you actually want is
new BigDecimal("0.1")
.add(new BigDecimal("0.1"))
.add(new BigDecimal("0.1"));
The new BigDecimal(double) constructor gets all the imprecision of the double, so by the time you've said 0.1, you've already introduced the rounding error. Using the String constructor avoids the rounding error associated with going via the double.
First: never, never use the double constructor of BigDecimal. It may be the right thing in a few situations, but mostly it isn't.
If you can control your input, use the BigDecimal String constructor, as was already proposed; that way you get exactly what you want. If you already have a double (it happens, after all), don't use the double constructor; use the static valueOf method instead. That has the nice advantage of producing the canonical representation of the double, which at least mitigates the problem, and the result is usually much more intuitive.
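The difference between the three ways of constructing a BigDecimal is easy to demonstrate (a sketch):

```java
import java.math.BigDecimal;

public class ConstructorDemo {
    public static void main(String[] args) {
        // Double constructor: exposes the full representation error of 0.1.
        System.out.println(new BigDecimal(0.1));
        // 0.1000000000000000055511151231257827021181583404541015625

        // valueOf goes through Double.toString, yielding the canonical form.
        System.out.println(BigDecimal.valueOf(0.1)); // 0.1

        // String constructor: exactly the decimal you wrote.
        System.out.println(new BigDecimal("0.1"));   // 0.1
    }
}
```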
This is not a problem of Java, but rather a problem of computers generally. The core problem lies in the conversion from decimal format (human format) to binary format (computer format). Some numbers in decimal format are not representable in binary format without infinite repeating decimals.
For example, 0.3 in decimal is 0.0100110011... in binary, repeating forever. But a computer has only a limited number of "slots" (bits) to store a number, so it cannot store the whole infinite representation. It stores only the nearest value that fits, for example
0.010011001100110011001100 (truncated). But that stored number is no longer exactly 0.3; it is 0.299999999999999988897769753748434595763683319091796875 instead.
Try this:
BigDecimal sum = new BigDecimal(0.1).add(new BigDecimal(0.1)).add(new BigDecimal(0.1));
EDIT: Actually, looking over the Javadoc, this will have the same problem as the original. The constructor BigDecimal(double) will make a BigDecimal corresponding to the exact floating-point representation of 0.1, which is not exactly equal to 0.1.
This, however, gives the exact result, since the integers 1 and 10 are represented exactly and the division 1/10 has a terminating decimal expansion, so BigDecimal.divide produces exactly 0.1:
BigDecimal one = new BigDecimal(1);
BigDecimal oneTenth = one.divide(new BigDecimal(10));
BigDecimal sum = oneTenth.add(oneTenth).add(oneTenth);
The problem you have is that 0.1 is represented by a slightly higher value, e.g.
System.out.println(new BigDecimal(0.1));
prints
0.1000000000000000055511151231257827021181583404541015625
Double.toString() takes this representation error into account, so you don't see it.
Similarly 0.3 is represented by a value slightly lower than it really is.
0.299999999999999988897769753748434595763683319091796875
If you multiply the represented value of 0.1 by 3 you don't get the represented value for 0.3, you instead get something a little higher
0.3000000000000000166533453693773481063544750213623046875
This is not just a representation error but also a rounding error caused by the operations. This is more than the Double.toString() will correct and so you see the rounding error.
The moral of the story: if you use float or double, round the result appropriately.
double d = 0.1 + 0.1 + 0.1;
System.out.println(d);
double d2 = (long)(d * 1e6 + 0.5) / 1e6; // round to 6 decimal places.
System.out.println(d2);
prints
0.30000000000000004
0.3
