This question already has answers here:
Why 0.1 represented in float correctly? (I know why not in result of 2.0-1.9)
(2 answers)
Closed 6 years ago.
This may seem a very silly question, but I encountered something mysterious and seemingly beautiful today. I tried Googling around, and I couldn't dig anything up.
I am aware that 0.1 cannot be represented in binary. And YET, when I run the following code:
double g = 1.0;
System.out.println(g);
g = g/10;
System.out.println(g);
g = g*3;
System.out.println(g);
This produces the output:
1.0
0.1
0.30000000000000004
The first output and the third output are expected, but what is going on with the second one? Why is it, well, correct? This should be impossible, and yet, there it is.
Why is it, well, correct?
As you noted, many decimal floating point numbers cannot be represented as binary floating point numbers and vice versa.
When you write a statement like this:
double g = 0.1;
the decimal value is converted to the nearest binary floating point value. And when you then print it like this
System.out.println(g);
the formatter produces the nearest decimal floating point value according to the following rules:
How many digits must be printed for the fractional part? There must be at least one digit to represent the fractional part, and beyond that as many, but only as many, more digits as are needed to uniquely distinguish the argument value from adjacent values of type double.
(Reference: Double.toString(double) javadoc )
That means that you will often get the exact decimal representation of the decimal number that you started with.
In simple terms, the error in converting from decimal to binary is the same as the error when converting from binary to decimal. The errors "cancel out".
Now, this doesn't always happen. Often the cumulative errors in the calculation are large enough the errors in the decimal (and binary) results will be apparent in the output.
Numeric Promotion Rules
If two values have different data types, Java will automatically promote one of the val-
ues to the larger of the two data types.
If one of the values is integral and the other is floating-point, Java will automatically
promote the integral value to the floating-point value’s data type.
Smaller data types, namely byte, short, and char, are first promoted to int any time
they’re used with a Java binary arithmetic operator, even if neither of the operands is
int.
After all promotion has occurred and the operands have the same data type, the result-
ing value will have the same data type as its promoted operands.
Let us step through the computation line by line:
double g = 1.0;
g is the float64 number representing exactly 1.0.
g = g / 10;
The right operand is converted to double, so it is 10.0 exact.
The division operation is performed at infinite precision (conceptually), and then rounded to the closest float64 number as the result.
The exact answer is clearly 0.1. But the closest float64 number to 0.1 is exactly 7205759403792794 / 256.
Hence g = 0.10000000000000000555111512312578...(more digits). If you want to print the full-precision exact value, look at new BigDecimal(g).
g = g * 3;
Again, the right operand is converted to 3.0 exact. We multiply 0.1000000000000000055511151231257(...) by 3 to get 0.3000000000000000166533453693773(...).
The value of g now is exactly 5404319552844596 / 254.
Related
I understand that the theory of binary numbers, so operation of double numbers is not precise. However, in java, I have no idea why "(double)65 / 100" is 0.65, which is completely correct in decimal number, other than 0.6500000000004.
double a = 5;
double b = 4.35;
int c = 65;
int d = 100;
System.out.println(a - b); // 0.6500000000000004
System.out.println((double) c / d); // 0.65
Java completely messes up has its own way of handling floating-point binary to decimal conversions.
A simple program in C (compiled with gcc) gives the result:
printf("1: %.20f\n", 5.0 - 4.35); // 0.65000000000000035527
printf("2: %.20f\n", 65./100); // 0.65000000000000002220
while Java gives the result (note you only needed 17 digits to see it, but I'm trying to make it more clear):
System.out.printf("%.20f\n", 5.0 - 4.35); // 0.65000000000000040000
System.out.printf("%.20f\n", 65./100); // 0.65000000000000000000
But when using the %a format specifier, both languages printf the underlying hexadecimal (correct) value: 0x1.4ccccccccccd00000000p-1.
So, Java is performing some illegal rounding at some point in the code. The apparent issue here is that Java has a different set of rules to convert binary to decimal, from the Java specification:
The number of digits in the result for the fractional part of m or a is equal to the precision. If the precision is not specified then the default value is 6. If the precision is less than the number of digits which would appear after the decimal point in the string returned by Float.toString(float) or Double.toString(double) respectively, then the value will be rounded using the round half up algorithm. Otherwise, zeros may be appended to reach the precision. For a canonical representation of the value, use Float.toString(float) or Double.toString(double) as appropriate. (emphasis mine)
And in the toString specification:
How many digits must be printed for the fractional part of m or a? There must be at least one digit to represent the fractional part, and beyond that as many, but only as many, more digits as are needed to uniquely distinguish the argument value from adjacent values of type double. That is, suppose that x is the exact mathematical value represented by the decimal representation produced by this method for a finite nonzero argument d. Then d must be the double value nearest to x; or if two double values are equally close to x, then d must be one of them and the least significant bit of the significand of d must be 0. (emphasis mine)
So, Java does perform a different binary to decimal conversion from C, but it remains closer to the true binary value than to any other, so the spec guarantees that the binary value can be restored back by a decimal to binary conversion.
Professor William Kahan warned about some Java floating-point issues in this article:
How Java’s Floating-Point Hurts Everyone Everywhere
But this conversion behaviour seems to be IEEE-complaint.
EDIT: I have included information provided by #MarkDickinson in the comments, to report that this Java behaviour, albeit different from C, is documented, and is IEEE-compliant. This has already been explained here, here, and here.
This question already has answers here:
Rounding Errors?
(9 answers)
Is floating point math broken?
(31 answers)
Closed 6 years ago.
Today I find an interesting fact that the formula will influence the precision of the result. Please see the below code
double x = 7d
double y = 10d
println(1-x/y)
println((y-x)/y)
I wrote this code using groovy, you can just treat it as Java
The result is
1-x/y: 0.30000000000000004
(y-x)/y: 0.3
It's interesting that the two formulas which should be equal have different result.
Can anyone explain it for me?
And can I apply the second formula to anywhere applicable as a valid solution for double precision issue?
To control the precision of floating point arithmetic, you should use java.math.BigDecimal.
You can do something like this.
BigDecimal xBigdecimal = BigDecimal.valueOf(7d);
BigDecimal yBigdecimal = BigDecimal.valueOf(10d);
System.out.println(BigDecimal.valueOf(1).subtract(xBigdecimal.divide(yBigdecimal)));
Can anyone explain it for me?
The float and double primitive types in Java are floating point numbers, where the number is stored as a binary representation of a fraction and a exponent.
More specifically, a double-precision floating point value such as the double type is a 64-bit value, where:
1 bit denotes the sign (positive or negative).
11 bits for the exponent.
52 bits for the significant digits (the fractional part as a binary).
These parts are combined to produce a double representation of a value.
Check this
For a detailed description of how floating point values are handled in Java, follow Floating-Point Types, Formats, and Values of the Java Language Specification.
The byte, char, int, long types are fixed-point numbers, which are exact representions of numbers. Unlike fixed point numbers, floating point numbers will some times (safe to assume "most of the time") not be able to return an exact representation of a number. This is the reason why you end up with 0.30000000000000004 in the result of 1 - (x / y).
When requiring a value that is exact, such as 1.5 or 150.1005, you'll want to use one of the fixed-point types, which will be able to represent the number exactly.
As I've already showed in the above example, Java has a BigDecimal class which will handle very large numbers and very small numbers.
This question already has answers here:
Why converting from float to double changes the value?
(9 answers)
Closed 7 years ago.
I write the following code in java and check the values stored in the variables. when I store 1.2 in a double variable 'y' it becomes 1.200000025443 something.
Why it is not 1.200000000000 ?
public class DataTypes
{
static public void main(String[] args)
{
float a=1;
float b=1.2f;
float c=12e-1f;
float x=1.2f;
double y=x;
System.out.println("float a=1 shows a= "+a+"\nfloat b=1.2f shows b= "+b+"\nfloat c=12e-1f shows c= "+c+"\nfloat x=1.2f shows x= "+x+"\ndouble y=x shows y= "+y);
}
}
You can see the output here:
float a=1 shows a= 1.0
float b=1.2f shows b= 1.2
float c=12e-1f shows c= 1.2
float x=1.2f shows x= 1.2
double y=x shows y= 1.2000000476837158
This is a question of formatting above anything else.
Take a look at the Float.toString documentation (Float.toString is what's called to produce the decimal representations you see for the floats, and Double.toString for the double):
How many digits must be printed for the fractional part of m or a? There must be at least one digit to represent the fractional part, and beyond that as many, but only as many, more digits as are needed to uniquely distinguish the argument value from adjacent values of type float. That is, suppose that x is the exact mathematical value represented by the decimal representation produced by this method for a finite nonzero argument f. Then f must be the float value nearest to x; or, if two float values are equally close to x, then f must be one of them and the least significant bit of the significand of f must be 0.
(emphasis mine)
The situation is the same for Double.toString. But, you need more digits to "uniquely distinguish the argument value from adjacent values of type double" than you do for float (recall that double is 64-bits while float is 32), that's why you're seeing the extra digits for double and not for float.
Note that anything that can be represented by float can also be represented by double, so you're not actually losing any precision in the conversion.
Of course, not all numbers can be exactly representable by float or double, which is why you see those seemingly random extra digits in the decimal representation in the first place. See "What Every Computer Scientist Should Know About Floating-Point Arithmetic".
The reason why there's such issue is because a computer works only in discrete mathematics, because the microprocessor can only represent internally full numbers, but no decimals. Because we cannot only work with such numbers, but also with decimals, to circumvent that, decades ago very smart engineers have invented the floating point representation, normalized as IEEE754.
The IEEE754 norm that defines how floats and doubles are interpreted in memory. Basically, unlike the int which represent an exact value, the floats and doubles are a calculation from:
sign
exponent
fraction
So the issue here is that when you're storing 1.2 as a double, you actually store a binary approximation to it:
00111111100110011001100110011010
which gives you the closest representation of 1.2 that can be stored using a binary fraction, but not exactly that fraction. In decimal fraction, 12*10^-1 gives an exact value, but as a binary fraction, it cannot give an exact value.
(cf http://www.h-schmidt.net/FloatConverter/IEEE754.html as I'm too lazy to do it myself)
when I store 1.2 in a double variable 'y' it becomes 1.200000025443 something
well actually in both the float and the double versions of y, the value actually is 1.2000000476837158, but because of the smaller mantissa of the float, the value represented is truncated before the approximation, making you believe it's an exact value, whereas in the memory it's not.
Its a classic problem: your legacy code uses a floating point when it should really be using n integer. But, its to expensive to change every instance of that variable (or several) in the code. So, you need to write your own rounding function that takes a bunch of parameters to improve accuracy and convert to an integer.
So, the basic questions is, how do floating point numbers round when they are made in java? the classic example is 0.1 what is often quoted as rounding to 0.0999999999998 (or something like that). But does a floating point number always round down to the next value it can represent when given an integer in Java? Does it round down it's internal mantissa to efficiently round down its absolute value? Or does it just pick the value with the smallest error between the integer and the new float?
Also is the behavior different when calling Float.parseFloat(String) when the String is an integer like "1234567890"? And is the behavior also the same when String is a Floating point with more precision than the Float can store.
Please note that I use floating point or reference Float, I use that interchangeably with Double. Same with integer and long.
how do floating point numbers round when they are made in java?
Java truncates (rounds towards zero) when you use the construct (int) d where d has type double or float. If you need to round to the nearest integer, you can use the line below:
int a = (int) Math.round(d);
the classic example is 0.1 what is often quoted as rounding to 0.0999999999998 (or something like that).
The issue you allude to does not exist with integers, which are exactly representable as double (for those between -253 and 253). If the number you are rounding comes from previous computations that should have produced an integer but may not have because of floating-point rounding errors, then (int) Math.round(d) is likely the solution you should use. It means that you will get the correct integer as long as the cumulative error is not above 0.5.
your legacy code uses a floating point when it should really be using n integer. But, its to expensive to change every instance of that variable (or several) in the code.
If the computations producing the double d are only computations +, -, * with other integers, producing intermediate results between -253 and 253, then d automatically contains an integer (it is exact because the floating-point computations involved are exact, and it is an integer because the exact result is an integer), and you can convert it with the simpler (int) d. On the other hand, if division or non-integer operands are involved, then you should not lightly change the type of d, because it would change the results of these computations.
Also is the behavior different when calling Float.parseFloat(String) when the String is an integer like "1234567890"?
This will produce a float whose value is the nearest representable single-precision value to the rational 1234567890. This happens to be 1234567936.0f.
And is the behavior also the same when String is a Floating point with more precision than the Float can store.
Technically, “0.1” is more precision than Float can store. Also, technically, the previous example 1234567890 is also more precision than Float can store. The behavior is the same: Float.parseFloat("0.1") produces the nearest float to the rational number 0.1.
class Test{
public static void main(String[] args){
float f1=3.2f;
float f2=6.5f;
if(f1==3.2){
System.out.println("same");
}else{
System.out.println("different");
}
if(f2==6.5){
System.out.println("same");
}else{
System.out.println("different");
}
}
}
output:
different
same
Why is the output like that? I expected same as the result in first case.
The difference is that 6.5 can be represented exactly in both float and double, whereas 3.2 can't be represented exactly in either type. and the two closest approximations are different.
An equality comparison between float and double first converts the float to a double and then compares the two. So the data loss.
You shouldn't ever compare floats or doubles for equality; because you can't really guarantee that the number you assign to the float or double is exact.
This rounding error is a characteristic feature of floating-point computation.
Squeezing infinitely many real numbers into a finite number of bits
requires an approximate representation. Although there are infinitely
many integers, in most programs the result of integer computations can
be stored in 32 bits.
In contrast, given any fixed number of bits,
most calculations with real numbers will produce quantities that
cannot be exactly represented using that many bits. Therefore the
result of a floating-point calculation must often be rounded in order
to fit back into its finite representation. This rounding error is the
characteristic feature of floating-point computation.
Check What Every Computer Scientist Should Know About Floating-Point Arithmetic for more!
They're both implementations of different parts of the IEEE floating point standard. A float is 4 bytes wide, whereas a double is 8 bytes wide.
As a rule of thumb, you should probably prefer to use double in most cases, and only use float when you have a good reason to. (An example of a good reason to use float as opposed to a double is "I know I don't need that much precision and I need to store a million of them in memory.") It's also worth mentioning that it's hard to prove you don't need double precision.
Also, when comparing floating point values for equality, you'll typically want to use something like Math.abs(a-b) < EPSILON where a and b are the floating point values being compared and EPSILON is a small floating point value like 1e-5. The reason for this is that floating point values rarely encode the exact value they "should" -- rather, they usually encode a value very close -- so you have to "squint" when you determine if two values are the same.
EDIT: Everyone should read the link #Kugathasan Abimaran posted below: What Every Computer Scientist Should Know About Floating-Point Arithmetic for more!
To see what you're dealing with, you can use Float and Double's toHexString method:
class Test {
public static void main(String[] args) {
System.out.println("3.2F is: "+Float.toHexString(3.2F));
System.out.println("3.2 is: "+Double.toHexString(3.2));
System.out.println("6.5F is: "+Float.toHexString(6.5F));
System.out.println("6.5 is: "+Double.toHexString(6.5));
}
}
$ java Test
3.2F is: 0x1.99999ap1
3.2 is: 0x1.999999999999ap1
6.5F is: 0x1.ap2
6.5 is: 0x1.ap2
Generally, a number has an exact representation if it equals A * 2^B, where A and B are integers whose allowed values are set by the language specification (and double has more allowed values).
In this case,
6.5 = 13/2 = (1+10/16)*4 = (1+a/16)*2^2 == 0x1.ap2, while
3.2 = 16/5 = ( 1 + 9/16 + 9/16^2 + 9/16^3 + . . . ) * 2^1 == 0x1.999. . . p1.
But Java can only hold a finite number of digits, so it cuts the .999. . . off at some point. (You may remember from math that 0.999. . .=1. That's in base 10. In base 16, it would be 0.fff. . .=1.)
class Test {
public static void main(String[] args) {
float f1=3.2f;
float f2=6.5f;
if(f1==3.2f)
System.out.println("same");
else
System.out.println("different");
if(f2==6.5f)
System.out.println("same");
else
System.out.println("different");
}
}
Try like this and it will work. Without 'f' you are comparing a floating with other floating type and different precision which may cause unexpected result as in your case.
It is not possible to compare values of type float and double directly. Before the values can be compared, it is necessary to either convert the double to float, or convert the float to double. If one does the former comparison, the conversion will ask "Does the the float hold the best possible float representation of the double's value?" If one does the latter conversion, the question will be "Does the float hold a perfect representation of the double's value". In many contexts, the former question is the more meaningful one, but Java assumes that all comparisons between float and double are intended to ask the latter question.
I would suggest that regardless of what a language is willing to tolerate, one's coding standards should absolutely positively forbid direct comparisons between operands of type float and double. Given code like:
float f = function1();
double d = function2();
...
if (d==f) ...
it's impossible to tell what behavior is intended in cases where d represents a value which is not precisely representable in float. If the intention is that f be converted to a double, and the result of that conversion compared with d, one should write the comparison as
if (d==(double)f) ...
Although the typecast doesn't change the code's behavior, it makes clear that the code's behavior is intentional. If the intention was that the comparison indicate whether f holds the best float representation of d, it should be:
if ((float)d==f)
Note that the behavior of this is very different from what would happen without the cast. Had your original code cast the double operand of each comparison to float, then both equality tests would have passed.
In general is not a good practice to use the == operator with floating points number, due to approximation issues.
6.5 can be represented exactly in binary, whereas 3.2 can't. That's why the difference in precision doesn't matter for 6.5, so 6.5 == 6.5f.
To quickly refresh how binary numbers work:
100 -> 4
10 -> 2
1 -> 1
0.1 -> 0.5 (or 1/2)
0.01 -> 0.25 (or 1/4)
etc.
6.5 in binary: 110.1 (exact result, the rest of the digits are just zeroes)
3.2 in binary: 11.001100110011001100110011001100110011001100110011001101... (here precision matters!)
A float only has 24 bits precision (the rest is used for sign and exponent), so:
3.2f in binary: 11.0011001100110011001100 (not equal to the double precision approximation)
Basically it's the same as when you're writing 1/5 and 1/7 in decimal numbers:
1/5 = 0,2
1,7 = 0,14285714285714285714285714285714.
Float has less precision than double, bcoz float is using 32bits inwhich 1 is used for Sign, 23 precision and 8 for Exponent . Where as double uses 64 bits in which 52 are used for precision, 11 for exponent and 1for Sign....Precision is important matter.A decimal number represented as float and double can be equal or unequal depends is need of precision( i.e range of numbers after decimal point can vary). Regards S. ZAKIR