Can every float be expressed exactly as a double? - java

Can every possible value of a float variable can be represented exactly in a double variable?
In other words, for all possible values X will the following be successful:
float f1 = X;
double d = f1;
float f2 = (float)d;
if(f1 == f2)
System.out.println("Success!");
else
System.out.println("Failure!");
My suspicion is that there is no exception, or if there is it is only for an edge case (like +/- infinity or NaN).
Edit: Original wording of question was confusing (stated two ways, one which would be answered "no" the other would be answered "yes" for the same answer). I've reworded it so that it matches the question title.

Yes.
Proof by enumeration of all possible cases:
public class TestDoubleFloat {
public static void main(String[] args) {
for (long i = Integer.MIN_VALUE; i <= Integer.MAX_VALUE; i++) {
float f1 = Float.intBitsToFloat((int) i);
double d = (double) f1;
float f2 = (float) d;
if (f1 != f2) {
if (Float.isNaN(f1) && Float.isNaN(f2)) {
continue; // ok, NaN
}
fail("oops: " + f1 + " != " + f2);
}
}
}
}
finishes in 12 seconds on my machine. 32 bits are small.

In theory, there is not such a value, so "yes", every float should be representable as a double.. Converting from a float to a double should involve just tacking four bytes of 00 on the end -- they are stored using the same format, just with different sized fields.

Yes, floats are a subset of doubles. Both floats and doubles have the form (sign * a * 2^b). The difference between floats and doubles is the number of bits in a & b. Since doubles have more bits available, assigning a float value to a double effectively means inserting extra 0 bits.

As everyone has already said, "no". But that's actually a "yes" to the question itself, i.e. every float can be exactly expressed as a double. Confusing. :)

If I'm reading the language specification correctly (and as everyone else is confirming), there is no such value.
That is, each claims only to hold only IEEE 754 standard values, so casts between the two should incur no change except in memory given.
(clarification: There would be no change as long as the value was small enough to be held in a float; obviously if the value was too many bits to be held in a float to begin with, casting from double to float would result in a loss of precision.)

#KenG: This code:
float a = 0.1F
println "a=${a}"
double d = a
println "d=${d}"
fails not because 0.1f can't be exactly represented. The question was "is there a float value that cannot be represented as a double", which this code doesn't prove. Although 0.1f can't be stored exactly, the value that a is given (which isn't 0.1f exactly) can be stored as a double (which also won't be 0.1f exactly). Assuming an Intel FPU, the bit pattern for a is:
0 01111011 10011001100110011001101
and the bit pattern for d is:
0 01111111011 100110011001100110011010 (followed by lots more zeros)
which has the same sign, exponent (-4 in both cases) and the same fractional part (separated by spaces above). The difference in the output is due to the position of the second non-zero digit in the number (the first is the 1 after the point) which can only be represented with a double. The code that outputs the string format stores intermediate values in memory and is specific to floats and doubles (i.e. there is a function double-to-string and another float-to-string). If the to-string function was optimised to use the FPU stack to store the intermediate results of the to-string process, the output would be the same for float and double since the FPU uses the same, larger format (80bits) for both float and double.
There are no float values that can't be stored identically in a double, i.e. the set of float values is a sub-set of the the set of double values.

Snark: NaNs will compare differently after (or indeed before) conversion.
This does not, however, invalidate the answers already given.

I took the code you listed and decided to try it in C++ since I thought it might execute a little faster and it is significantly easier to do unsafe casting. :-D
I found out that for valid numbers, the conversion works and you get the exact bitwise representation after the cast. However, for non-numbers, e.g. 1.#QNAN0, etc., the result will use a simplified representation of the non-number rather than the exact bits of the source. For example:
**** FAILURE **** 2140188725 | 1.#QNAN0 -- 0xa0000000 0x7ffa1606
I cast an unsigned int to float then to double and back to float. The number 2140188725 (0x7F90B035) results in a NAN and converting to double and back is still a NAN but not the exact same NAN.
Here is the simple C++ code:
typedef unsigned int uint;
for (uint i = 0; i < 0xFFFFFFFF; ++i)
{
float f1 = *(float *)&i;
double d = f1;
float f2 = (float)d;
if(f1 != f2)
printf("**** FAILURE **** %u | %f -- 0x%08x 0x%08x\n", i, f1, f1, f2);
if ((i % 1000000) == 0)
printf("Iteration: %d\n", i);
}

The answer to the first question is yes, the answer to the 'in other words', however is no. If you change the test in the code to be if (!(f1 != f2)) the answer to the second question becomes yes -- it will print 'Success' for all float values.

In theory every normal single can have the exponent and mantissa padded to create a double and then remove the padding and you return to the original single.
When you go from theory to reality is when you will have problems. I dont know if you were interested in theory or implementation. If it is implementation then you can rapidly get into trouble.
IEEE is a horrible format, my understanding it was intentionally designed to be so tough that nobody could meet it and allow the market to catch up to intel (this was a while back) allowing for more competition. If that is true it failed, either way we are stuck with this dreadful spec. Something like the TI format is far superior for the real world in so many ways. I have no connection to either company or any of these formats.
Thanks to this spec there are very few if any fpus that actually meet it (in hardware or even in hardware plus the operating system), and those that do often fail on the next generation. (google: TestFloat). The problems these days tend to lie in the int to float and float to int and not single to double and double to single as you have specified above. Of course what operation is the fpu going to perform to do that conversion? Add 0? Multiply by 1? Depends on the fpu and the compiler.
The problem with IEEE related to your question above is that there is more than one way a number, not every number but many numbers can be represented. If I wanted to break your code I would start with minus zero in the hope that one of the two operations would convert it to a plus zero. Then I would try denormals. And it should fail with a signaling nan, but you called that out as a known exception.
The problem is that equal sign, here is rule number one about floating point, never use an equal sign. Equals is a bit comparison not a value comparison, if you have two values represented in different ways (plus zero and minus zero for example) the bit comparison will fail even though its the same number. Greater than and less than are done in the fpu, equals is done with the integer alu.
I realize that you probably used the equal to explain the problem and not necessarily the code you wanted to succeed or fail.

If a floating-point type is viewed as representing a precise value, then as other posters have noted, every float value is representable as a double, but only a few values of double can be represented by float. On the other hand, if one recognizes that floating-point values are approximations, one will realize the real situation is reversed. If one uses a very precise instrument to measure something which is 3.437mm, one may correctly describe is size as 3.4mm. if one uses a ruler to measure the object as 3.4mm, it would be incorrect to describe its size as 3.400mm.
Even bigger problems exist at the top of the range. There is a float value that represents: "computed value exceeded 2^127 by an unknown amount", but there's no double value that indicates such a thing. Casting an "infinity" from single to double will yield a value "computed value exceeded 2^1023 by an unknown amount" which is off by a factor of over a googol.

Related

Float-type useless?

in my java book it told me to not directly compare 2 different float(type) numbers when storing them in variables. Because it gives an approx of a number in the variable. Instead it suggested checking the absolute value of the difference and see if it equals 0. If it does they are the same. How is this helpful? What if I store 5 in variable a and 5 in variable b, how can they not be the same? And how does it help if I compare absolute value??
double a=5,b=5;
if (Math.abs(a-b)==0)
//run code
if (a==b)
//run code
I don't see at all why the above method would be more accurate? Since if 'a' is not equal to 'b' it wont matter if I use Math.abs.
I appreciate replies and thank you for your time.
I tried both methods.
Inaccuracy with comparisons using the == operator is caused by the way double values are stored in a computer's memory. We need to remember that there is an infinite number of values that must fit in limited memory space, usually 64 bits. As a result, we can't have an exact representation of most double values in our computers. They must be rounded to be saved.
Because of the rounding inaccuracy, interesting errors might occur:
double d1 = 0;
for (int i = 1; i <= 8; i++) {
d1 += 0.1;
}
double d2 = 0.1 * 8;
System.out.println(d1);
System.out.println(d2);
Both variables, d1 and d2, should equal 0.8. However, when we run the code above, we'll see the following results:
0.7999999999999999
0.8
In that case, comparing both values with the == operator would produce a wrong result. For this reason, we must use a more complex comparison algorithm.
If we want to have the best precision and control over the rounding mechanism, we can use java.math.BigDecimal class.
The recommended algorithm to compare double values in plain Java is a threshold comparison method. In this case, we need to check whether the difference between both numbers is within the specified tolerance, commonly called epsilon:
double epsilon = 0.000001d;
assertThat(Math.abs(d1 - d2) < epsilon).isTrue();
The smaller the epsilon's value, the greater the comparison accuracy. However, if we specify the tolerance value too small, we'll get the same false result as in the simple == comparison
The thing is, That statement you read in your java book just prevents you from some errors in future, that can be hardly debugged. Computers store decimals/floats as binary, thus not everything we can express as rational in decimal numbers can be expressed as rational in binary, so there's always something like 0.7 = 0.699999999998511. You may not see difference in those comparisons you use, but in real project where you may use much more variables, add and subtract from them, this difference may appear in very surprising place.
There is some classic question about floating numbers. You may see it in any language as well
Why does this code print a result of '7'?

Exact conversion from float to int

I want to convert float value to int value or throw an exception if this conversion is not exact.
I've found the following suggestion: use Math.round to convert and then use == to check whether those values are equal. If they're equal, then conversion is exact, otherwise it is not.
But I've found an example which does not work. Here's code demonstrating this example:
String s = "2147483648";
float f = Float.parseFloat(s);
System.out.printf("f=%f\n", f);
int i = Math.round(f);
System.out.printf("i=%d\n", i);
System.out.printf("(f == i)=%s\n", (f == i));
It outputs:
f=2147483648.000000
i=2147483647
(f == i)=true
I understand that 2147483648 does not fit into integer range, but I'm surprised that == returns true for those values. Is there better way to compare float and int? I guess it's possible to convert both values to strings, but that would be extremely slow for such a primitive function.
floats are rather inexact concepts. They are also mostly pointless unless you're running on at this point rather old hardware, or interacting specifically with systems and/or protocols that work in floats or have 'use a float' hardcoded in their spec. Which may be true, but if it isn't, stop using floats and start using double - unless you have a fairly large float[] there is zero memory and performance difference, floats are just less accurate.
Your algorithm cannot fail when using int vs double - all ints are perfectly representable as double.
Let's first explain your code snippet
The underlying error here is the notion of 'silent casting' and how java took some intentional liberties there.
In computer systems in general, you can only compare like with like. It's easy to put in exact terms of bits and machine code what it means to determine whether a == b is true or false if a and b are of the exact same type. It is not at all clear when a and b are different things. Same thing applies to pretty much any operator; a + b, if both are e.g. an int, is a clear and easily understood operation. But if a is a char and b is, say, a double, that's not clear at all.
Hence, in java, all binary operators that involve different types are illegal. In basis, there is no bytecode to directly compare a float and a double, for example, or to add a string to an int.
However, there is syntax sugar: When you write a == b where a and b are different types, and java determines that one of two types is 'a subset' of the other, then java will simply silently convert the 'smaller' type to the 'larger' type, so that the operation can then succeed. For example:
int x = 5;
long y = 5;
System.out.println(x == y);
This works - because java realizes that converting an int to a long value is not ever going to fail, so it doesn't bother you with explicitly specifying that you intended the code to do this. In JLS terms, this is called a widening conversion. In contrast, any attempt to convert a 'larger' type to a 'smaller' type isn't legal, you have to explicitly cast:
long x = 5;
int y = x; // does not compile
int y = (int) x; // but this does.
The point is simply this: When you write the reverse of the above (int x = 5; long y = x;), the code is identical, it's just that compiler silently injects the (long) cast for you, on the basis that no loss will occur. The same thing happens here:
int x = 5;
long y = 10;
long z = x + y;
That compiles because javac adds some syntax sugar for you, specifically, that is compiled as if it says: long z = ((long) x) + y;. The 'type' of the expression x + y there is long.
Here's the key trick: Java considers converting an int to a float, as well as an int or long to a double - a widening conversion.
As in, javac will just assume it can do that safely without any loss and therefore will not enforce that the programmer explicitly acknowledges by manually adding the cast. However, int->float, as well as long->double are not actually entirely safe.
floats can represent every integral value between -2^23 to +2^23, and doubles can represent every integral value between -2^52 to +2^52 (source). But int can represent every integral value between -2^31 to +2^31-1, and longs -2^63 to +2^63-1. That means at the edges (very large negative/positive numbers), integral values exist that are representable in ints but not in floats, or longs but not in doubles (all ints are representable in double, fortunately; int -> double conversion is entirely safe). But java doesn't 'acknowledge' this, which means silent widening conversions can nevertheless toss out data (introduce rounding) silently.
That is what happens here: (f == i) is syntax sugared into (f == ((float) i)) and the conversion from int to float introduces the rounding.
The solution
Mostly, when using doubles and floats and nevertheless wishing for exact numbers, you've already messed up. These concepts fundamentally just aren't exact and this exactness cannot be sideloaded in by attempting to account for error bands, as the errors introduced due to the rounding behaviour of float and double cannot be tracked (not easily, at any rate). You should not be using float/double as a consequence. Either find an atomary unit and represent those in terms of int/long, or use BigDecimal. (example: To write bookkeeping software, do not store finance amounts as a double. do store them as 'cents' (or satoshis or yen or pennies or whatever the atomic unit is in that currency) in long, or, use BigDecimal if you really know what you are doing).
I want an answer anyway
If you're absolutely positive that using float (or even double) here is acceptable and you still want exactness, we have a few solutions.
Option 1 is to employ the power of BigDecimal:
new BigDecimal(someDouble).intValueExact()
This works, is 100% reliable (unless float to double conversion can knock a non-exact value into an exact one somehow, I don't think that can happen), and throws. It's also very slow.
An alternative is to employ our knowledge of how the IEEE floating point standard works.
A real simple answer is simply to run your algorithm as you wrote it, but to add an additional check: If the value your int gets is below -2^23 or above +2^23 then it probably isn't correct. However, there are still a smattering of numbers below -2^23 and +2^23 that are perfectly representable in both float and int, just, no longer every number at that point. If you want an algorithm that will accept those exact numbers as well, then it gets much more complicated. My advice is not to delve into that cesspool: If you have a process where you end up with a float that is anywhere near such extremes, and you want to turn them to int but only if that is possible without loss, you've arrived at a crazy question and you need to rewire the parts that you got you there instead!
If you really need that, instead of trying to numbercrunch the float, I suggest using the BigDecimal().intValueExact() trick if you truly must have this.

Comparison of floating point numbers in Java

Lets say, I have the following:
float x= ...
float y = ...
What I like to do is just compare them whether x is greater than y or not. I am not interested in their equality.
My question is, should I take into account precision when just performing a > or a < check on floating point values? Am I correct to assume that precision is only taken into account for equality checks?
Since you already have two floats, named x and y, and if there hasn't been any casting before, you can easily use ">" and "<" for comparison. However, let's say if you had two doubles d1 and d2 with d1 > d2 and you cast them to f1 and f2, respectively, you might get f1 == f2 because of precision problems.
There is already a wheel you don't need to invent:
if (Float.compare(x, y) < 0)
// x is less than y
All float values have the same precision as each other.
It really depends on where those two floats came from. If there was roundoff earlier in their computation, then if the accumulated roundoff is large enough to exceed the difference between the "ideal" answers, you may not get the results you expect for any of the comparison values.
As a general rule, any comparison of floats must be considered fuzzy. If you must do it, you are responsible for understanding the sources of roundoff error and deciding whether you care about it and how you want to handle it if so. It's usually better to avoid comparing floats entirely unless your algorithm absolutely requires that you do so... and if you must, to make sure that a "close but not quite" comparison will not break your program. You may be able to restructure the formulas and order of computation to reduce loss of precision. Going up to double will reduce the accumulated error but is not a complete solution.
If you must compare, and the comparison is important, don't use floats. If you need absolute precision of floating-like numbers, use an infinite-precision math package like bignums or a rational-numbers implementation and accept the performance cost. Or switch to scaled integers -- which also round off, but round off in a way that makes more sense to humans.
This is a difficult question. The answer really depends on how you got those numbers.
First, you need to understand that floating point numbers ARE precise, but that they don't necessarily represent the number that you thought they did. Floating point types in typical programming language represent a finite subset of the infinite set of Real numbers. So if you have an arbitrary real number the chances are that you cannot represent it precisely using a float or double type. In fact ...
The only Real numbers that can be represented exactly as float or
double values have the form
mantissa * power(2, exponent)
where "mantissa" and "exponent" are integers in prescribed ranges.
And the corollary is that most "decimal" numbers don't have an exact float or double representation either.
So in fact, we end up with something like this:
true_value = floating_point_value + delta,
where "delta" is the error; i.e. the small (or not so small) difference between the true value and the float or double value.
Next, when you perform a calculation using floating point values, there are cases where the exact result cannot be represented as a floating point value. An obvious example is:
1.0f / 3.0f
for which the true value is 0.33333... recurring, which is not representable in any finite base 2 (or base 10!) floating point representation. Instead what happens is that a result is produced by rounding to the nearest float or double value ... introducing more error.
As you perform more and more calculations, the errors can potentially grow ... or stay stable ... depending on the sequence of operations that are performed. (There's a branch of mathematics that deals with this: Numerical Analysis.)
So back to your questions:
"Should I take into account precision when just performing a > or a < check on floating point values?"
It depends on how you got those floating point values, and what you know about the "delta" values relative to the true Real values they nominally represent.
If there are no errors (i.e. delta < the smallest representable difference for the value), then you can safely compare using ==, < or >.
If there are possible errors, then you need to take account of those errors ... if the semantic of the comparison are intended to couched in terms of the (nominal) true values. Furthermore, you need to have a credible estimate of the "delta" (the accumulated error) when you code the comparison.
In short, there is no simple (correct) answer ...
"Am I correct to assume that precision is only taken into account for equality checks? "
In fact precision is not "taken into account" in any of the comparison operators. These operators treat the operands as precise values, and compare them accordingly. It is up to your code to take account of precision issues, based on your error estimates for the preceding calculations.
If you have estimates for the deltas, then a mathematically sound < comparison would be something like this:
// Assume true_v1 = v1 +- delta_v1 ... (delta_v1 is a non-negative constant)
if (v1 + delta_v1 < v2 - delta_v2) {
// true_v1 is less than true_v2
}
and so on ...

Why is comparing floats inconsistent in Java?

class Test{
public static void main(String[] args){
float f1=3.2f;
float f2=6.5f;
if(f1==3.2){
System.out.println("same");
}else{
System.out.println("different");
}
if(f2==6.5){
System.out.println("same");
}else{
System.out.println("different");
}
}
}
output:
different
same
Why is the output like that? I expected same as the result in first case.
The difference is that 6.5 can be represented exactly in both float and double, whereas 3.2 can't be represented exactly in either type. and the two closest approximations are different.
An equality comparison between float and double first converts the float to a double and then compares the two. So the data loss.
You shouldn't ever compare floats or doubles for equality; because you can't really guarantee that the number you assign to the float or double is exact.
This rounding error is a characteristic feature of floating-point computation.
Squeezing infinitely many real numbers into a finite number of bits
requires an approximate representation. Although there are infinitely
many integers, in most programs the result of integer computations can
be stored in 32 bits.
In contrast, given any fixed number of bits,
most calculations with real numbers will produce quantities that
cannot be exactly represented using that many bits. Therefore the
result of a floating-point calculation must often be rounded in order
to fit back into its finite representation. This rounding error is the
characteristic feature of floating-point computation.
Check What Every Computer Scientist Should Know About Floating-Point Arithmetic for more!
They're both implementations of different parts of the IEEE floating point standard. A float is 4 bytes wide, whereas a double is 8 bytes wide.
As a rule of thumb, you should probably prefer to use double in most cases, and only use float when you have a good reason to. (An example of a good reason to use float as opposed to a double is "I know I don't need that much precision and I need to store a million of them in memory.") It's also worth mentioning that it's hard to prove you don't need double precision.
Also, when comparing floating point values for equality, you'll typically want to use something like Math.abs(a-b) < EPSILON where a and b are the floating point values being compared and EPSILON is a small floating point value like 1e-5. The reason for this is that floating point values rarely encode the exact value they "should" -- rather, they usually encode a value very close -- so you have to "squint" when you determine if two values are the same.
EDIT: Everyone should read the link #Kugathasan Abimaran posted below: What Every Computer Scientist Should Know About Floating-Point Arithmetic for more!
To see what you're dealing with, you can use Float and Double's toHexString method:
class Test {
public static void main(String[] args) {
System.out.println("3.2F is: "+Float.toHexString(3.2F));
System.out.println("3.2 is: "+Double.toHexString(3.2));
System.out.println("6.5F is: "+Float.toHexString(6.5F));
System.out.println("6.5 is: "+Double.toHexString(6.5));
}
}
$ java Test
3.2F is: 0x1.99999ap1
3.2 is: 0x1.999999999999ap1
6.5F is: 0x1.ap2
6.5 is: 0x1.ap2
Generally, a number has an exact representation if it equals A * 2^B, where A and B are integers whose allowed values are set by the language specification (and double has more allowed values).
In this case,
6.5 = 13/2 = (1+10/16)*4 = (1+a/16)*2^2 == 0x1.ap2, while
3.2 = 16/5 = ( 1 + 9/16 + 9/16^2 + 9/16^3 + . . . ) * 2^1 == 0x1.999. . . p1.
But Java can only hold a finite number of digits, so it cuts the .999. . . off at some point. (You may remember from math that 0.999. . .=1. That's in base 10. In base 16, it would be 0.fff. . .=1.)
class Test {
public static void main(String[] args) {
float f1=3.2f;
float f2=6.5f;
if(f1==3.2f)
System.out.println("same");
else
System.out.println("different");
if(f2==6.5f)
System.out.println("same");
else
System.out.println("different");
}
}
Try like this and it will work. Without 'f' you are comparing a floating with other floating type and different precision which may cause unexpected result as in your case.
It is not possible to compare values of type float and double directly. Before the values can be compared, it is necessary to either convert the double to float, or convert the float to double. If one does the former comparison, the conversion will ask "Does the the float hold the best possible float representation of the double's value?" If one does the latter conversion, the question will be "Does the float hold a perfect representation of the double's value". In many contexts, the former question is the more meaningful one, but Java assumes that all comparisons between float and double are intended to ask the latter question.
I would suggest that regardless of what a language is willing to tolerate, one's coding standards should absolutely positively forbid direct comparisons between operands of type float and double. Given code like:
float f = function1();
double d = function2();
...
if (d==f) ...
it's impossible to tell what behavior is intended in cases where d represents a value which is not precisely representable in float. If the intention is that f be converted to a double, and the result of that conversion compared with d, one should write the comparison as
if (d==(double)f) ...
Although the typecast doesn't change the code's behavior, it makes clear that the code's behavior is intentional. If the intention was that the comparison indicate whether f holds the best float representation of d, it should be:
if ((float)d==f)
Note that the behavior of this is very different from what would happen without the cast. Had your original code cast the double operand of each comparison to float, then both equality tests would have passed.
In general is not a good practice to use the == operator with floating points number, due to approximation issues.
6.5 can be represented exactly in binary, whereas 3.2 can't. That's why the difference in precision doesn't matter for 6.5, so 6.5 == 6.5f.
To quickly refresh how binary numbers work:
100 -> 4
10 -> 2
1 -> 1
0.1 -> 0.5 (or 1/2)
0.01 -> 0.25 (or 1/4)
etc.
6.5 in binary: 110.1 (exact result, the rest of the digits are just zeroes)
3.2 in binary: 11.001100110011001100110011001100110011001100110011001101... (here precision matters!)
A float only has 24 bits precision (the rest is used for sign and exponent), so:
3.2f in binary: 11.0011001100110011001100 (not equal to the double precision approximation)
Basically it's the same as when you're writing 1/5 and 1/7 in decimal numbers:
1/5 = 0,2
1,7 = 0,14285714285714285714285714285714.
Float has less precision than double, bcoz float is using 32bits inwhich 1 is used for Sign, 23 precision and 8 for Exponent . Where as double uses 64 bits in which 52 are used for precision, 11 for exponent and 1for Sign....Precision is important matter.A decimal number represented as float and double can be equal or unequal depends is need of precision( i.e range of numbers after decimal point can vary). Regards S. ZAKIR

Why is there a difference between the same value stored as a float and a double in Java?

I expected the following code to produce: "Both are equal", but I got "Both are NOT equal":
float a=1.3f;
double b=1.3;
if(a==b)
{
System.out.println("Both are equal");
}
else{
System.out.println("Both are NOT equal");
}
What is the reason for this?
It's because the closest float value to 1.3 isn't the same as the closest double value to 1.3. Neither value will be exactly 1.3 - that can't be represented exactly in a non-recurring binary representation.
To give a different understanding of why this happens, suppose we had two decimal floating point types - decimal5 and decimal10, where the number represents the number of significant digits. Now suppose we tried to assign the value of "a third" to both of them. You'd end up with
decimal5 oneThird = 0.33333
decimal10 oneThird = 0.3333333333
Clearly those values aren't equal. It's exactly the same thing here, just with different bases involved.
However if you restrict the values to the less-precise type, you'll find they are equal in this particular case:
double d = 1.3d;
float f = 1.3f;
System.out.println((float) d == f); // Prints true
That's not guaranteed to be the case, however. Sometimes the approximation from the decimal literal to the double representation, and then the approximation of that value to the float representation, ends up being less accurate than the straight decimal to float approximation. One example of this 1.0000001788139343 (thanks to stephentyrone for finding this example).
Somewaht more safely, you can do the comparison between doubles, but use a float literal in the original assignment:
double d = 1.3f;
float f = 1.3f;
System.out.println(d == f); // Prints true
In the latter case, it's a bit like saying:
decimal10 oneThird = 0.3333300000
However, as pointed out in the comments, you almost certainly shouldn't be comparing floating point values with ==. It's almost never the right thing to do, because of precisely this sort of thing. Usually if you want to compare two values you do it with some sort of "fuzzy" equality comparison, checking whether the two numbers are "close enough" for your purposes. See the Java Traps: double page for more information.
If you really need to check for absolute equality, that usually indicates that you should be using a different numeric format in the first place - for instance, for financial data you should probably be using BigDecimal.
A float is a single precision floating point number. A double is a double precision floating point number. More details here: http://www.concentric.net/~Ttwang/tech/javafloat.htm
Note: It is a bad idea to check exact equality for floating point numbers. Most of the time, you want to do a comparison based on a delta or tolerance value.
For example:
float a = 1.3f;
double b = 1.3;
float delta = 0.000001f;
if (Math.abs(a - b) < delta)
{
System.out.println("Close enough!");
}
else
{
System.out.println("Not very close!");
}
Some numbers can't be represented exactly in floating point (e.g. 0.01) so you might get unexpected results when you compare for equality.
Read this article.
The above article clearly illustrates with examples your scenario while using double and float types.
float a=1.3f;
double b=1.3;
At this point you have two variables containing binary approximations to the Real number 1.3. The first approximation is accurate to about 7 decimal digits, and the second one is accurate to about 15 decimal digits.
if(a==b) {
The expression a==b is evaluate in two stages. First the value of a is converted from a float to a double by padding the binary representation. The result is still only accurate to about 7 decimal digits as a representation of the Real 1.3. Next you compare the two different approximations. Since they are different, the result of a==b is false.
There are two lessons to learn:
Floating point (and double) literals are almost always approximations; e.g. actual number that corresponds to the literal 1.3f is not exactly equal to the Real number 1.3.
Every time you do a floating point computation, errors creep in. These errors tend to build up. So when you are comparing floating points / double numbers it is usually a mistake to use a simple "==", "<", etcetera. Instead you should use |a - b| < delta where delta is chosen appropriately. (And figuring out what is an appropriate delta is not always straight-forward either.)
You should have taken that course in Numerical Analysis :-)
Never check for equality between floating point numbers. Specifically, to answer your question, the number 1.3 is hard to represent as a binary floating point and the double and float representations are different.
The problem is that Java (and alas .NET as well) is inconsistent about whether a float value represents a single precise numeric quantity or a range of quantities. If a float is considered to represents an exact numeric quantity of the form Mant * 2^Exp, where Mant is an integer 0 to 2^25 and Exp is an integer), then an attempt to cast any number not of that form to float should throw an exception. If it's considered to represent "the locus of numbers for which some particular representation in the above form has been deemed likely to be the best", then a double-to-float cast would be correct even for double values not of the above form [casting the double that best represents a quantity to a float will almost always yield the float that best represents that quantity, though in some corner cases (e.g. numeric quantities in the range 8888888.500000000001 to 8888888.500000000932) the float which is chosen may be a few parts per trillion worse than the best possible float representation of the actual numeric quantity].
To use an analogy, suppose two people each have a ten-centimeter-long object and they measure it. Bob uses an expensive set of calibers and determines that his object is 3.937008" long. Joe uses a tape measure and determines that his object is 3 15/16" long. Are the objects the same size? If one converts Joe's measurement to millionths of an inch (3.937500") the measurements will appear different, but one instead converts Bob's measurement to the nearest 1/256" fraction, they will appear equal. Although the former comparison might seem more "precise", the latter is apt to be more meaningful. Joe's measurement if 3 15/16" doesn't really mean 3.937500"--it means "a distance which, using a tape measure, is indistinguishable from 3 15/16". And 3.937008" is, like the size of Joe's object, a distance which using a tape measure would be indistinguishable from 3 15/16.
Unfortunately, even though it would be more meaningful to compare the measurements using the lower precision, Java's floating-point-comparison rules assume that a float represents a single precise numeric quantity, and performs comparisons on that basis. While there are some cases where this is useful (e.g. knowing whether the particular double produced by casting some value to float and back to double would match the starting value), in general direct equality comparisons between float and double are not meaningful. Even though Java does not require it, one should always cast the operands of a floating-point equality comparison to be the same type. The semantics that result from casting the double to float before the comparison are different from those of casting the float to double, and the behavior Java picks by default (cast the float to double) is often semantically wrong.
Actually neither float nor double can store 1.3. I am not kidding. Watch this video carefully.
https://www.youtube.com/watch?v=RtHKwsXuRkk&index=50&list=PL6pxHmHF3F5JPdnEqKALRMgogwYc2szp1

Categories

Resources