I am trying the code below to convert string to float and double but getting different results.
Code:
System.out.println(Float.parseFloat("120059389"));
System.out.println(Double.parseDouble("120059389"));
Output:
1.20059392E8
1.20059389E8
Could somebody explain to me why I got different results when parsing the string as float and as double? What are the ranges for float and double?
This is because you're trying to parse a float by giving it more digits of precision than it can handle. The "ulp" (unit in last place) of a float that big is 8.0, but the "ulp" for a double that big is still reasonably small. That is, at that magnitude, the closest possible float values differ by 8, but the closest double values, with more precision, differ by far less.
System.out.println(Math.ulp(120059389f));
System.out.println(Math.ulp(120059389d));
This prints:
8.0
1.4901161193847656E-8
So the Float parser must use the closest float to the true value of 120059389, which happens to be 1.20059392E8.
The difference lies in the fact that Double and Float store numbers differently.
Float is single precision and Double is double precision, so a Double carries roughly twice as many bits of precision as a Float.
So the simple answer is that, because a Float has less storage than a Double, information is lost when a number needs more precision than a Float can hold.
Float : 32-bit Numbers
Double : 64-bit Numbers
So some information is lost in the conversion to Float, because the value is rounded to fit.
Generally...
Float stores numbers as 1 bit for the sign (-/+), 8 bits for the Exponent, and 23 bits for the fraction.
Double stores numbers as 1 bit for the sign (-/+), 11 bits for the Exponent, and 52 bits for the fraction.
Your number 120059389 = 111001001111111010111111101b needs 27 significant bits. That fits easily within a Double's 53 significant bits (52 stored plus an implicit leading 1), but not within a Float's 24 (23 stored plus an implicit leading 1), so the least significant bits are lost.
The conversion rounds the number to the nearest value representable in 24 significant bits, 120059392 = 111001001111111011000000000b, and the exponent accounts for the three trailing zero bits.
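The bit layout described above can be inspected directly. The following is only a minimal sketch, assuming a normal (not subnormal) float; the class name `FloatFields` and its helper methods are made up for illustration, while `Float.floatToIntBits` is the standard library call.

```java
public class FloatFields {
    // Unbiased binary exponent of a normal float: the stored 8-bit field minus the bias of 127.
    static int exponent(float f) {
        return ((Float.floatToIntBits(f) >>> 23) & 0xFF) - 127;
    }

    // 24-bit significand of a normal float: implicit leading 1 plus the 23 stored fraction bits.
    static int significand(float f) {
        return (1 << 23) | (Float.floatToIntBits(f) & 0x7FFFFF);
    }

    public static void main(String[] args) {
        float f = Float.parseFloat("120059389");
        System.out.println(Integer.toBinaryString(120059389));      // the 27-bit input
        System.out.println(Integer.toBinaryString(significand(f))); // the 24 bits that were kept
        System.out.println(exponent(f));                            // power of two of the leading bit
        // Shifting the significand back into place rebuilds the value the float really holds.
        System.out.println(significand(f) << (exponent(f) - 23));
    }
}
```

The last line shifts left by `exponent - 23` because the significand already carries 23 bits after its leading 1.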
The earlier links and answers give good technical answers. The layman's answer is that a float is 32 bits and a double is 64 bits. Some of those bits are used for the number and some are used for the exponent. The number you put in your code simply had too many digits for the 32 bit 'float'. The 64 bit 'double' has more bits and can be more precise with larger numbers.
The same concept holds for even larger numbers when you reach the limits of a 64 bit double and need 128 bits of precision.
Related
I am trying to convert float value to 32 bit unsigned long value and facing the problem of loss of value.
long v = (long) f;
Here, when f is 4294967295 (2^32 - 1), the conversion to long returns 4294967296 instead of 4294967295, because a float is only precise to about 7 decimal digits. I need precision to 9 decimal places. Is there any way to achieve this?
Quote from Java Puzzlers: Traps, Pitfalls, and Corner Cases book:
Floating-point operations return the floating-point value that is closest to their
exact mathematical result. Once the distance between adjacent floating-point values
is greater than 2, adding 1 to a floating-point value will have no effect,
because the half-way point between values won’t be reached. For the float type,
the least magnitude beyond which adding 1 will have no effect is 2^25, or
33,554,432; for the double type, it is 2^54, or approximately 1.8 × 10^16.
So basically, if you want to represent big numbers, float is a bad idea. Above 2^24 it can no longer represent every integer, and above 2^25 it cannot even represent every other integer. It gets worse the bigger the number gets.
The best option for you would be to use BigDecimal instead.
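As a rough illustration of the point about large integers, the sketch below checks which whole numbers survive a round trip through float and through double. The class `UnsignedRange` and its two helpers are hypothetical names, and the check is only meaningful for values well below 2^63, where the cast back to long cannot saturate.

```java
public class UnsignedRange {
    // True if v comes back unchanged after being stored in a float.
    static boolean exactAsFloat(long v) {
        return (long) (float) v == v;
    }

    // True if v comes back unchanged after being stored in a double.
    static boolean exactAsDouble(long v) {
        return (long) (double) v == v;
    }

    public static void main(String[] args) {
        long max = 4294967295L; // 2^32 - 1, the largest unsigned 32-bit value
        System.out.println((long) (float) max);  // rounded to the nearest float
        System.out.println((long) (double) max); // a double has enough bits for this value
        System.out.println(exactAsFloat(16777216L) + " " + exactAsFloat(16777217L)); // 2^24 and 2^24 + 1
    }
}
```

If the value only needs to cover the unsigned 32-bit range, keeping it in a long or a double from the start avoids the rounding; BigDecimal is the heavier option when arbitrary precision is needed.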
I am trying to convert double to float in java.
Double d = 1234567.1234;
Float f = d.floatValue();
I see that the value of f is
1234567.1
I am not trying to print a string value of float. I just wonder what is the maximum number of digits not to lose any precision when converting double to float. Can I show more than 8 significant digits in Java?
float: 32 bits (4 bytes) where 23 bits are used for the mantissa (6 to 9 decimal digits, about 7 on average). 8 bits are used for the exponent, so a float can “move” the decimal point to the right or to the left using those 8 bits. Doing so avoids storing lots of zeros in the mantissa as in 0.0000003 (3 × 10^-7) or 3000000 (3 × 10^6). There is 1 bit used as the sign bit.
double: 64 bits (8 bytes) where 52 bits are used for the mantissa (15 to 17 decimal digits, about 16 on average). 11 bits are used for the exponent and 1 bit is the sign bit.
I believe you have hit this limit, which is what causes the problem.
If you change the code to
Double d = 123456789.1234;
Float f = d.floatValue();
you will see that the float value is 1.23456792E8.
The precision of a float is about 7 decimals, but since floats are stored in binary, that's an approximation.
To illustrate the actual precision of the float value in question, try this:
double d = 1234567.1234;
float f = (float)d;
System.out.printf("%.9f%n", d);
System.out.printf("%.9f%n", Math.nextDown(f));
System.out.printf("%.9f%n", f);
System.out.printf("%.9f%n", Math.nextUp(f));
Output
1234567.123400000
1234567.000000000
1234567.125000000
1234567.250000000
As you can see, the effective decimal precision is about 1 decimal place for this number, or 8 digits, but if you ran the code with the number 9876543.9876, you get:
9876543.987600000
9876543.000000000
9876544.000000000
9876545.000000000
That's only 7 digits of precision.
This is a simple example in support of the view that there is no safe number of decimal digits.
Consider 0.1.
The closest IEEE 754 64-bit binary floating point number has exact value 0.1000000000000000055511151231257827021181583404541015625. It converts to 32-bit binary floating point as 0.100000001490116119384765625, which is considerably further from 0.1.
There can be loss of precision with even a single significant digit and single decimal place.
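The exact values quoted above can be reproduced with `BigDecimal`, whose `BigDecimal(double)` constructor converts the binary value exactly rather than going through a rounded decimal string. This is a small sketch; the class name `ExactValue` and its helper names are chosen here for illustration.

```java
import java.math.BigDecimal;

public class ExactValue {
    // Exact decimal expansion of the binary value stored in a double.
    static String exactDouble(double d) {
        return new BigDecimal(d).toPlainString();
    }

    // Exact decimal expansion of the binary value stored in a float
    // (widening a float to double never changes its value).
    static String exactFloat(float f) {
        return new BigDecimal((double) f).toPlainString();
    }

    public static void main(String[] args) {
        double d = 0.1;
        float f = (float) d;
        System.out.println(exactDouble(d));
        System.out.println(exactFloat(f));
    }
}
```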
Unless you really need the compactness of float, and have very relaxed precision requirements, it is generally better to just stick with double.
I just wonder what is the maximum number of digits not to lose any precision when converting double to float.
Maybe you don't realize it, but the concept of N digits of precision is already ambiguous. Doubtless you meant "N digits of precision in base 10". But unlike humans, our computers work in base 2.
It's not possible to convert every number from base X to base Y (with a limited number of retained digits) without loss of precision. For example, the value 1/3 is exactly representable in base 3 as "0.1", but in base 10 it has an infinite number of digits: 0.3333333333333... Likewise, numbers that are exactly representable in base 10, e.g. 0.1, can need an infinite number of digits in base 2. On the other hand, 0.5 (base 10) is exactly representable as 0.1 (base 2).
So back to
I just wonder what is the maximum number of digits not to lose any precision when converting double to float.
The answer is "it depends on the value". The commonly cited rule of thumb "float has about 6 to 7 digits decimal precision" is just an approximation. It can be much more or much less depending on the value.
When dealing with floating point, the concept of relative accuracy is more useful: stop thinking about "digits" and think in terms of relative error instead. Any number N (in range) is representable with an error of (at most) N / accuracy, where accuracy is 2 raised to the number of mantissa bits in the chosen format (e.g. 23 (+1) for float, 52 (+1) for double). So a decimal number represented as a float has a maximum approximation error of N / pow(2, 24). The error may be less, even zero, but it is never greater.
The 23+1 comes from the convention that floating point numbers are organized with the exponent chosen such that the first mantissa bit is always a 1 (whenever possible), so it doesn't need to be explicitly stored. The number of physically stored bits, e.g. 23, thus allows for one extra bit of accuracy. (There is an exceptional case where "whenever possible" does not apply, but let's ignore that here.)
TL;DR: There is no fixed number of decimal digits accuracy in float or double.
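The relative-error view can be checked numerically. The sketch below is an illustration under stated assumptions: the input is nonzero and falls in the normal float range (no overflow or subnormals), and the names `RelativeError`, `FLOAT_BOUND` and `relativeError` are invented for this example.

```java
public class RelativeError {
    // 2^-24, written as a hexadecimal floating point literal so it is exact.
    static final double FLOAT_BOUND = 0x1.0p-24;

    // Relative error introduced by narrowing a nonzero double to float.
    static double relativeError(double x) {
        float f = (float) x;
        return Math.abs(x - f) / Math.abs(x);
    }

    public static void main(String[] args) {
        double[] samples = {0.1, 0.5, 59.858139, 1234567.1234, 120059389.0};
        for (double x : samples) {
            double err = relativeError(x);
            // The error varies from value to value but stays at or below the bound.
            System.out.println(x + " -> " + err + " (within bound: " + (err <= FLOAT_BOUND) + ")");
        }
    }
}
```

Values such as 0.5 come out with zero error, which is why a fixed count of "safe" decimal digits does not describe the behaviour well.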
EDIT.
No, you cannot get any more precision from a float in Java, because a float contains only 32 bits (4 bytes). If you want more precision, then continue to use the Double.
When I assign from an int to a float I thought float allows more precision, so would not lose or change the value assigned, but what I am seeing is something quite different. What is going on here?
for(int i = 63000000; i < 63005515; i++) {
int a = i;
float f = 0;
f=a;
System.out.print(java.text.NumberFormat.getInstance().format(a) + " : " );
System.out.println(java.text.NumberFormat.getInstance().format(f));
}
some of the output :
...
63,005,504 : 63,005,504
63,005,505 : 63,005,504
63,005,506 : 63,005,504
63,005,507 : 63,005,508
63,005,508 : 63,005,508
Thanks!
A float has the same number of bits as an int: 32 bits. But it allows for a far greater range of values than int does. However, the precision is fixed at 24 bits (23 "mantissa" bits, plus 1 implied 1 bit). At a value of about 63,000,000, the gap between adjacent float values is greater than 1, so not every integer can be represented. You can verify this with the Math.ulp method, which gives the difference between 2 consecutive values.
The code
System.out.println(Math.ulp(63000000.0f));
prints
4.0
You can use double values for a far greater (yet still limited) precision:
System.out.println(Math.ulp(63000000.0));
prints
7.450580596923828E-9
However, you can just use ints here, because your values, at about 63 million, are still well below the maximum possible int value, which is about 2 billion.
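To see which float each int in the question's range lands on, a small sketch like the following can help. `IntToFloat` and `viaFloat` are names made up here; the result is returned as a long so that a float value of 2^31 is not clipped back to `Integer.MAX_VALUE` by the cast.

```java
public class IntToFloat {
    // The whole number actually stored when an int is assigned to a float.
    static long viaFloat(int i) {
        return (long) (float) i;
    }

    public static void main(String[] args) {
        // Around 63 million adjacent floats are 4 apart, so only every fourth int is exact.
        for (int i = 63005504; i <= 63005508; i++) {
            System.out.println(i + " -> " + viaFloat(i));
        }
    }
}
```

An int that sits exactly halfway between two floats, such as 63005506, goes to the neighbour whose significand is even, which matches the output shown in the question.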
A float in Java uses the IEEE 754 floating point representation. Even though it can represent values from ±1.40129846432481707e-45 to ±3.40282346638528860e+38, it has only 6 or 7 significant decimal digits.
A simple solution would be to use a double, which has about 15 significant digits and can represent every int value exactly.
However, if it is accuracy you're looking for, stay away from native floating point representations and go for classes like BigInteger and BigDecimal.
No, they are not necessarily the same value. An int and a float are each 32 bits, but in a float some of those bits are used for the sign and exponent, so fewer whole numbers within the int range can be represented exactly in a float than in an int. Depending on what your application is doing with these numbers you may not care about these differences, or you may want to look at using something like BigDecimal.
Floats don't allow more precision, floats allow wider range of numbers.
We've got 2^32 possible values for integers in range (approximately) -2 * 10^9 to 2 * 10^9. Floats are also 32bit, so the number of possible values is at most the same as for integers.
Out of these 32 bits, some are reserved for the mantissa and the rest for the exponent. The number represented by the float is then calculated (for simplicity I'll use base 10; the real format uses base 2) as mantissa * 10^exponent.
Obviously, the maximum precision is limited by the number of bits assigned to the mantissa. So some integers can be represented exactly as ints but won't fit into the mantissa, so the least significant bits are rounded off, as in your case.
Floats have a greater range of values but lower precision.
Ints have a lower range of values but higher precision.
At around 63 million, an int has a resolution of 1, while a float has a resolution of 4.
So if you are dealing with numbers of that size and don't care about +/- 4, you can use float, but if you need the last digit to be precise you need to use int.
I want to convert a longitude and latitude that I get as a string from my database. The string is correct, and when I convert it into a double, it is also correct. However, when I convert the double or the string value (I have tried both) into a float value, the last decimal gets rounded off.
The value of the string or double is 59.858139
The conversion to float gives 59.85814
I've tried everything, and this is one desperate example :)
private float ConvertToFloat(double d)
{
float f = 00.000000f;
f = (float) d;
return f;
}
You are aware that doubles have more precision than floats and that floats round off, right? This is expected behaviour. There is no sense in casting a double to a float in this case.
Here's something to get you thinking in the right direction...
Double.doubleToRawLongBits(double value);
Float.intBitsToFloat(int bits);
A double's bits don't fit into an int; they need a long. A double really is twice the size of a float, and passing the value through a string won't help here.
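A quick way to look at the raw bit patterns with the standard `Float` and `Double` bit-conversion methods is sketched below. The class `RawBits` and its two formatting helpers are invented for this example; the hexadecimal output is only a convenient way to compare the 32-bit and 64-bit encodings.

```java
public class RawBits {
    // The 32 bits of a float as 8 hexadecimal digits.
    static String floatBits(float f) {
        return String.format("%08x", Float.floatToRawIntBits(f));
    }

    // The 64 bits of a double as 16 hexadecimal digits.
    static String doubleBits(double d) {
        return String.format("%016x", Double.doubleToRawLongBits(d));
    }

    public static void main(String[] args) {
        double d = 59.858139;
        float f = (float) d;
        System.out.println(doubleBits(d)); // 64-bit pattern
        System.out.println(floatBits(f));  // 32-bit pattern, half as many bits to work with
        // Turning the int bits back into a float gives the same (already rounded) value.
        System.out.println(Float.intBitsToFloat(Float.floatToRawIntBits(f)) == f);
    }
}
```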
1. float has only 24 bits of precision, which will be insufficient to hold the number of digits in your latitude and longitude.
2. The rounding off is due to the size of the number. So use double if you require floating point, or use BigDecimal
We are starting with your decimal number 59.858139
Convert that number to binary: 111011.11011011101011101111111101011100011011000001000110100001000100...
I.e. the number is an infinite fraction in binary. It is not possible to represent it exactly. (In the same way that it is not possible to represent 1/3 exactly with decimal numbers)
Rewrite the number to some form of binary scientific notation:
10 ^ 101 * 1.1101111011011101011101111111101011100011011000001000110100001000100...
Remember that this is still in binary, so the 10 ^ 101 corresponds to 2 ^ 5 in decimal notation.
Now... A float value can store 23 bits in the mantissa. If we round it using the "round to nearest" rounding mode, we get:
10 ^ 101 * 1.11011110110111010111100
Which is equal to:
111011.110110111010111100
That is all the precision that can fit into the float data type. Now convert that back to decimal:
59.8581390380859375
Seems pretty close to 59.858139 actually... But that is just luck. What happens if we convert the second closest float value to decimal instead?
111011.110110111010111011 = 59.858135223388671875
So basically the resolution is approximately 0.000004.
So all we can really know from the float value is that the number is something like: 59.858139 ± 0.000002
It could just as well be 59.858137 or 59.858141.
Since the last digit is rather uncertain, the printing code drops it: Float.toString emits only as many digits as are needed to distinguish the value from its adjacent float values, and hence the value is printed as 59.85814.
By the way, if you (like me) are too lazy to convert between binary and decimal fractions by hand, you can use this converter. If you want to read more about the details of the floating point system, the Wikipedia page for floating point representation is a great resource.
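The hand conversion above can also be left to the JDK. The following sketch prints the exact decimal value of the float nearest to 59.858139 together with its two neighbours; `FloatNeighbours` and `exact` are illustrative names, while `Math.nextDown`, `Math.nextUp` and the `BigDecimal(double)` constructor are standard.

```java
import java.math.BigDecimal;

public class FloatNeighbours {
    // Exact decimal expansion of the binary value held by a float.
    static String exact(float f) {
        return new BigDecimal((double) f).toPlainString();
    }

    public static void main(String[] args) {
        float f = (float) 59.858139;
        System.out.println(exact(Math.nextDown(f))); // next float below
        System.out.println(exact(f));                // the float actually stored
        System.out.println(exact(Math.nextUp(f)));   // next float above
        System.out.println(Math.ulp(f));             // spacing between neighbours at this magnitude
    }
}
```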
I know about weird stuff with precision errors, but I can't fathom,
Why is (long)9223372036854665200d giving me 9223372036854665216 ?
9223372036854665200d is a constant of type double. However, 9223372036854665200 does not fit in a double without loss of precision. A double has only 53 bits of mantissa precision (52 stored bits plus an implicit leading 1), whereas the number in question needs 59 significant bits to be represented exactly.
The nearest double to 9223372036854665200d is the number whose mantissa equals 1.1111111111111111111111111111111111111111111110010100 in binary and whose exponent is 62 (decimal). This number is none other than 9223372036854665216 (call it U).
If we decrease the mantissa one notch to 1.1...0011, we get 9223372036854664192 (call it L).
The original number is between L and U and is much closer to U than it is to L, so it is rounded to U.
Finally, if you think that this truncation of the mantissa ought to result in a number that ends in a bunch of zeros, you're right. Only it happens in binary, not in decimal: U in base-16 is 0x7ffffffffffe5000 and L is 0x7ffffffffffe4c00.
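The values of L and U can be recovered programmatically. This is a minimal sketch with invented names (`LongViaDouble`, `neighbours`); it assumes the neighbouring doubles still fit in a long, which holds for this constant.

```java
public class LongViaDouble {
    // The double below d, d itself, and the double above d, each cast to long.
    static long[] neighbours(double d) {
        return new long[] {(long) Math.nextDown(d), (long) d, (long) Math.nextUp(d)};
    }

    public static void main(String[] args) {
        double d = 9223372036854665200d; // the literal is rounded to the nearest double at compile time
        long[] n = neighbours(d);
        System.out.println(Math.ulp(d)); // spacing between adjacent doubles at this magnitude
        for (long v : n) {
            // The trailing zeros are visible in hexadecimal, not in decimal.
            System.out.println(v + " = 0x" + Long.toHexString(v));
        }
    }
}
```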
Because doubles don't have that much precision. Why are you doing such a strange thing? Change the d to l.
Doubles have 53 bits of precision (52 stored plus an implicit leading bit), whereas a long uses 63 bits plus a sign bit (for integers only). The bits a double gives up are used to represent the exponent, which allows a double to represent much larger and much smaller magnitudes than a long can.
Your number is 19 digits long, whereas a double can only store roughly 16 digits of (decimal) integer data. Thus the final number ends up being rounded.
Reference: Double - Wikipedia
Because doubles have limited precision. Your constant has more significant digits than a double can keep track of, so it loses them.
You are assuming that limited precision means that it is represented in decimal so is limited to 15 or 16 digits. Actually it is represented in binary and limited to 53 bits of precision. double takes the closest representable value.
import java.math.BigDecimal;
double d = 9223372036854665200d;
System.out.println(d +" is actually\n" + new BigDecimal(d)+" so when cast to (long) is\n"+(long) d);
prints
9.2233720368546652E18 is actually
9223372036854665216 so when cast to (long) is
9223372036854665216