Locked. This question and its answers are locked because the question is off-topic but has historical significance. It is not currently accepting new answers or interactions.
How do you explain floating point inaccuracy to fresh programmers and laymen who still think computers are infinitely wise and accurate?
Do you have a favourite example or anecdote which seems to get the idea across much better than an precise, but dry, explanation?
How is this taught in Computer Science classes?
There are basically two major pitfalls people stumble in with floating-point numbers.
The problem of scale. Each FP number has an exponent which determines the overall “scale” of the number so you can represent either really small values or really larges ones, though the number of digits you can devote for that is limited. Adding two numbers of different scale will sometimes result in the smaller one being “eaten” since there is no way to fit it into the larger scale.
PS> $a = 1; $b = 0.0000000000000000000000001
PS> Write-Host a=$a b=$b
a=1 b=1E-25
PS> $a + $b
1
As an analogy for this case you could picture a large swimming pool and a teaspoon of water. Both are of very different sizes, but individually you can easily grasp how much they roughly are. Pouring the teaspoon into the swimming pool, however, will leave you still with roughly a swimming pool full of water.
(If the people learning this have trouble with exponential notation, one can also use the values 1 and 100000000000000000000 or so.)
Then there is the problem of binary vs. decimal representation. A number like 0.1 can't be represented exactly with a limited amount of binary digits. Some languages mask this, though:
PS> "{0:N50}" -f 0.1
0.10000000000000000000000000000000000000000000000000
But you can “amplify” the representation error by repeatedly adding the numbers together:
PS> $sum = 0; for ($i = 0; $i -lt 100; $i++) { $sum += 0.1 }; $sum
9,99999999999998
I can't think of a nice analogy to properly explain this, though. It's basically the same problem why you can represent 1/3 only approximately in decimal because to get the exact value you need to repeat the 3 indefinitely at the end of the decimal fraction.
Similarly, binary fractions are good for representing halves, quarters, eighths, etc. but things like a tenth will yield an infinitely repeating stream of binary digits.
Then there is another problem, though most people don't stumble into that, unless they're doing huge amounts of numerical stuff. But then, those already know about the problem. Since many floating-point numbers are merely approximations of the exact value this means that for a given approximation f of a real number r there can be infinitely many more real numbers r1, r2, ... which map to exactly the same approximation. Those numbers lie in a certain interval. Let's say that rmin is the minimum possible value of r that results in f and rmax the maximum possible value of r for which this holds, then you got an interval [rmin, rmax] where any number in that interval can be your actual number r.
Now, if you perform calculations on that number—adding, subtracting, multiplying, etc.—you lose precision. Every number is just an approximation, therefore you're actually performing calculations with intervals. The result is an interval too and the approximation error only ever gets larger, thereby widening the interval. You may get back a single number from that calculation. But that's merely one number from the interval of possible results, taking into account precision of your original operands and the precision loss due to the calculation.
That sort of thing is called Interval arithmetic and at least for me it was part of our math course at the university.
Show them that the base-10 system suffers from exactly the same problem.
Try to represent 1/3 as a decimal representation in base 10. You won't be able to do it exactly.
So if you write "0.3333", you will have a reasonably exact representation for many use cases.
But if you move that back to a fraction, you will get "3333/10000", which is not the same as "1/3".
Other fractions, such as 1/2 can easily be represented by a finite decimal representation in base-10: "0.5"
Now base-2 and base-10 suffer from essentially the same problem: both have some numbers that they can't represent exactly.
While base-10 has no problem representing 1/10 as "0.1" in base-2 you'd need an infinite representation starting with "0.000110011..".
How's this for an explantation to the layman. One way computers represent numbers is by counting discrete units. These are digital computers. For whole numbers, those without a fractional part, modern digital computers count powers of two: 1, 2, 4, 8. ,,, Place value, binary digits, blah , blah, blah. For fractions, digital computers count inverse powers of two: 1/2, 1/4, 1/8, ... The problem is that many numbers can't be represented by a sum of a finite number of those inverse powers. Using more place values (more bits) will increase the precision of the representation of those 'problem' numbers, but never get it exactly because it only has a limited number of bits. Some numbers can't be represented with an infinite number of bits.
Snooze...
OK, you want to measure the volume of water in a container, and you only have 3 measuring cups: full cup, half cup, and quarter cup. After counting the last full cup, let's say there is one third of a cup remaining. Yet you can't measure that because it doesn't exactly fill any combination of available cups. It doesn't fill the half cup, and the overflow from the quarter cup is too small to fill anything. So you have an error - the difference between 1/3 and 1/4. This error is compounded when you combine it with errors from other measurements.
In python:
>>> 1.0 / 10
0.10000000000000001
Explain how some fractions cannot be represented precisely in binary. Just like some fractions (like 1/3) cannot be represented precisely in base 10.
Another example, in C
printf (" %.20f \n", 3.6);
incredibly gives
3.60000000000000008882
Here is my simple understanding.
Problem:
The value 0.45 cannot be accurately be represented by a float and is rounded up to 0.450000018. Why is that?
Answer:
An int value of 45 is represented by the binary value 101101.
In order to make the value 0.45 it would be accurate if it you could take 45 x 10^-2 (= 45 / 10^2.)
But that’s impossible because you must use the base 2 instead of 10.
So the closest to 10^2 = 100 would be 128 = 2^7. The total number of bits you need is 9 : 6 for the value 45 (101101) + 3 bits for the value 7 (111).
Then the value 45 x 2^-7 = 0.3515625. Now you have a serious inaccuracy problem. 0.3515625 is not nearly close to 0.45.
How do we improve this inaccuracy? Well we could change the value 45 and 7 to something else.
How about 460 x 2^-10 = 0.44921875. You are now using 9 bits for 460 and 4 bits for 10. Then it’s a bit closer but still not that close. However if your initial desired value was 0.44921875 then you would get an exact match with no approximation.
So the formula for your value would be X = A x 2^B. Where A and B are integer values positive or negative.
Obviously the higher the numbers can be the higher would your accuracy become however as you know the number of bits to represent the values A and B are limited. For float you have a total number of 32. Double has 64 and Decimal has 128.
A cute piece of numerical weirdness may be observed if one converts 9999999.4999999999 to a float and back to a double. The result is reported as 10000000, even though that value is obviously closer to 9999999, and even though 9999999.499999999 correctly rounds to 9999999.
Related
I'm trying to round the cents of a value.
The rounding seems to work, but there's an exception:
double amount = 289.42;
String f= String.format("%.1f", amount);
System.out.println(new DecimalFormat("##0.00").format(Double.valueOf(f)));
This is the error: java.lang.NumberFormatException: For input string: "289,4"
Your question posits an impossibility.
Here's the thing: You cannot represent currency amounts with double. At all. Thus, 'how do I render these cents-in-a-double appropriately' isn't a sensible concept in the first place. Because cents cannot be stored in a double.
The problem lies in what double are, fundamentally. double is a numeric storage system that is defined to consist of exactly 64 bits. That's a problem right there: It's a mathematical fact that 64 bits can store at most 2^64 unique things, because, well, math. Think about it.
The problem is, There are in fact an infinite amount of numbers between 0 and 1, let alone between -infinity and +infinity which double would appear to represent. So, how do you square that circle? How does one represent one specific value chosen from an infinite amount of infinities, with only 64 bits?
The answer is simple. You don't.
doubles do not in fact store arbitrary values at all. And that is why you cannot use them to store currencies. Instead, take the number line and mark off slightly less than 2^64 specific values on it. We'll call these 'the blessed numbers'. A double can only store blessed numbers. They can't store anything else. In addition, any math you do to doubles is silently rounded to the nearest blessed value as doubles can't store anything else. So, 0.1 + 0.1? Not actually 0.2. Instead, 0.1 isn't even blessed, so that's really round-to-blessed(0.1) + round-to-blessed(0.1), so actually that's 0.0999999999975 + 0.0999999999975 = 0.2000000000018 or whatever. The blessed numbers are distributed unequally - there are a ton of blessed numbers in the 0-1 range, and as you move away from the 0, the distance between 2 blessed numbers grows larger and larger. Their distribution makes sense, but, computers count in binary, so they fall on neat boundaries in binary, not in decimal (0.1 looks neat in decimal. It's similar to 1 divided by 3, i.e. endlessly repeating, and therefore not precisely representable no matter how many bits you care to involve, in binary).
That rounding is precisely what you absolutely don't want to happen when representing currency. You don't want a cent to randomly appear or disappear and yet that is exactly what will happen if you use double to store finance info.
Hence, you're asking about how to render 'cents in a double' appropriately but in fact that question cannot possibly be answered - you can't store cents in a double, hence, it is not possible to render it properly.
Instead..
Use cents-in-int
The easiest way to do currency correctly is to first determine the accepted atomary unit for your currency, and then store those, in long or int as seems appropriate. For euros, that's eurocents. For bitcoin, that's satoshis. For yen, it's just yen. For dollars, its dollarcents. And so on.
$5.45 is best represented as the int value 545. Not as the double value 5.45, because that's not actually a value a double can represent.
Why do doubles show up as 0.1?
Because System.out.println knows that doubles are wonky and that you're highly likely to want to see 0.1 and not 0.09999999999991238 and thus it rounds inherently. That doesn't magically make it possible to use double to represent finance amounts.
I need to divide, or multiply by complex factors
Division for currency is always nasty. Imagine a cost of 100 dollars that needs to be paid by each 'partner' in a coop. The coop has 120 shares, and each partner has 40 shares, so each partner must pay precisely 1/3 of the cost.
Now what? 100 dollars does not neatly divide into threes. You can't very well charge everybody 33 dollars, 33 cents, and a third of a cent. You could charge everybody 33.33, but now the bank needs to eat 1 cent. You could also charge everybody 33.34, and the bank gets to keep the 2 cents. Or, you could get a little creative, and roll some dice to determine 'the loser'. The loser pays 33.34, the other 2 pay 33.33.
The point is: There is no inherently correct answer. Each situation has its own answer. Hence, division in general is impossible without first answering that question. There is no solving this problem unless you have code that knows how to apply the chosen 'division' algorithm. a / b cannot be used in any case (as the operation has at least 3 params: The dividend, the divisor, and the algorithm to apply to it).
For foreign exchange, 'multiply by this large decimal value' comes up a lot. You can store arbitrary precision values exactly using the java.math.BigDecimal class. However, this is not particularly suitable for storing currencies (all multiplication-by-a-factor will mean the BDs grow ever larger, they still can't divide e.g. 1 by 3 (anything that has repeating digits), and they don't solve the more fundamental issue: Any talk with other systems, such as a bank, can't deal with fractions of atomary units). Stick with BD-space math (as that is perfect, though, can throw exceptions if you divide, and grows ever slower and more complicated over time), until the system you are programming for enforces a rounding to atomary units, at which point, you round, resetting the growth. If you never need to multiply by fractions this doesn't come up, and there's no need to use BigDecimal for anything currency related.
How do I format cents-in-a-long?
String.format("€%d.%02d", cents / 100, cents % 100);
It gets very slightly more complicated for negative numbers (% returns negative values, so you need to do something about this. Math.abs can help), but not very.
cents / 100 gives you the "whole" part when you integer-divide by 100, and % 100 gives you the remainder, which precisely boils down to 'euros' and 'eurocents'.
What would be the most efficient way to grab the, say, 207th decimal place of a number? Would it just be x * Math.pow(10,207) % 10?
How's this for python
int((x*(10**n)))%10
What you want is impossible.
The only things in java that work with Math.pow and basic operators are the primitives. The only floating point primitives are float and double. These are IEEE754 floating point numbers; doubles are 64 bits and floats are 32 bits.
A simple principle applies: If you have 64 bits, then you can only represent 2^64 different numbers (it's actually a little less). So, you get about 18446744073709551616 numbers, of all numbers in existence, which actually exist as far as the computer is concerned for doubles. all other numbers do not exist.
So what happens if a mathematical operation (say, 0.1 + 0.2) ends up being a number that doesn't exist? Well, java (this is predicated by the IEEE754 standard; most languages and chips do it this way) will return you the nearest number amongst all the 18446744073709551616 numbers that do exist.
The problem with wanting the 207th digit is that obviously, given that only 18446744073709551616 numbers exist, none of those 18446744073709551616 numbers have that kind of precision. Asking for the 207th digit is therefore completely random. It says nothing about the input number... whatsoever.
Let me repeat that: There are no double values that have a significant 207th digit AT ALL.
If you want 'perfect' representation, with no rounding whatsoever, you want BigDecimal, but note that demanding perfection is tricky. Imagine in basic decimal math (computers are binary, but lets stick with decimal as we're all much more familiar with it, what with our 10 fingers and all), I ask you to only give me perfect answers, and then I ask you to divide 1 by 3.
BigDecimal won't let you do that either, so the ops you can run on BigDecimals without telling BigDecimal in what ways it is allowed to be inprecise leads to exceptions.
If you've set it all up exactly how you wanted it, and you really have a BigDecimal with a 207th digit after the comma, you can use the scale feature, or just the power-of-10 feature to get what you want.
Note BigDecimal is not primitive and therefore does not support the +, %, etc operators.
***Special Note: There is no: "This will handle all situations" answer here as the arbitrary value such as 207 could take the calculations way outside the bounds of possible precision of the variable types involved. My answer as such will only work within the bounds of variable type precision for which 207 is really not possible...
To get the specific digit an arbitrary number (like 207) of places after the decimal point... if you just multiply by factor of 10.. and then take mod 10, the answer (in java) is still a floating point type... not a single digit...
To get a specific digit an arbitrary number (n) of places after the decimal point, without converting to string:
Math.floor(x*Math.pow(10,n)) % 10;
to get 4th digit after 2.987654321
x*Math.pow(10, 4) = 29876.54321
Math.floor(29876.54321) = 29876
29876 % 10 = 6
Why do some numbers lose accuracy when stored as floating point numbers?
For example, the decimal number 9.2 can be expressed exactly as a ratio of two decimal integers (92/10), both of which can be expressed exactly in binary (0b1011100/0b1010). However, the same ratio stored as a floating point number is never exactly equal to 9.2:
32-bit "single precision" float: 9.19999980926513671875
64-bit "double precision" float: 9.199999999999999289457264239899814128875732421875
How can such an apparently simple number be "too big" to express in 64 bits of memory?
In most programming languages, floating point numbers are represented a lot like scientific notation: with an exponent and a mantissa (also called the significand). A very simple number, say 9.2, is actually this fraction:
5179139571476070 * 2 -49
Where the exponent is -49 and the mantissa is 5179139571476070. The reason it is impossible to represent some decimal numbers this way is that both the exponent and the mantissa must be integers. In other words, all floats must be an integer multiplied by an integer power of 2.
9.2 may be simply 92/10, but 10 cannot be expressed as 2n if n is limited to integer values.
Seeing the Data
First, a few functions to see the components that make a 32- and 64-bit float. Gloss over these if you only care about the output (example in Python):
def float_to_bin_parts(number, bits=64):
if bits == 32: # single precision
int_pack = 'I'
float_pack = 'f'
exponent_bits = 8
mantissa_bits = 23
exponent_bias = 127
elif bits == 64: # double precision. all python floats are this
int_pack = 'Q'
float_pack = 'd'
exponent_bits = 11
mantissa_bits = 52
exponent_bias = 1023
else:
raise ValueError, 'bits argument must be 32 or 64'
bin_iter = iter(bin(struct.unpack(int_pack, struct.pack(float_pack, number))[0])[2:].rjust(bits, '0'))
return [''.join(islice(bin_iter, x)) for x in (1, exponent_bits, mantissa_bits)]
There's a lot of complexity behind that function, and it'd be quite the tangent to explain, but if you're interested, the important resource for our purposes is the struct module.
Python's float is a 64-bit, double-precision number. In other languages such as C, C++, Java and C#, double-precision has a separate type double, which is often implemented as 64 bits.
When we call that function with our example, 9.2, here's what we get:
>>> float_to_bin_parts(9.2)
['0', '10000000010', '0010011001100110011001100110011001100110011001100110']
Interpreting the Data
You'll see I've split the return value into three components. These components are:
Sign
Exponent
Mantissa (also called Significand, or Fraction)
Sign
The sign is stored in the first component as a single bit. It's easy to explain: 0 means the float is a positive number; 1 means it's negative. Because 9.2 is positive, our sign value is 0.
Exponent
The exponent is stored in the middle component as 11 bits. In our case, 0b10000000010. In decimal, that represents the value 1026. A quirk of this component is that you must subtract a number equal to 2(# of bits) - 1 - 1 to get the true exponent; in our case, that means subtracting 0b1111111111 (decimal number 1023) to get the true exponent, 0b00000000011 (decimal number 3).
Mantissa
The mantissa is stored in the third component as 52 bits. However, there's a quirk to this component as well. To understand this quirk, consider a number in scientific notation, like this:
6.0221413x1023
The mantissa would be the 6.0221413. Recall that the mantissa in scientific notation always begins with a single non-zero digit. The same holds true for binary, except that binary only has two digits: 0 and 1. So the binary mantissa always starts with 1! When a float is stored, the 1 at the front of the binary mantissa is omitted to save space; we have to place it back at the front of our third element to get the true mantissa:
1.0010011001100110011001100110011001100110011001100110
This involves more than just a simple addition, because the bits stored in our third component actually represent the fractional part of the mantissa, to the right of the radix point.
When dealing with decimal numbers, we "move the decimal point" by multiplying or dividing by powers of 10. In binary, we can do the same thing by multiplying or dividing by powers of 2. Since our third element has 52 bits, we divide it by 252 to move it 52 places to the right:
0.0010011001100110011001100110011001100110011001100110
In decimal notation, that's the same as dividing 675539944105574 by 4503599627370496 to get 0.1499999999999999. (This is one example of a ratio that can be expressed exactly in binary, but only approximately in decimal; for more detail, see: 675539944105574 / 4503599627370496.)
Now that we've transformed the third component into a fractional number, adding 1 gives the true mantissa.
Recapping the Components
Sign (first component): 0 for positive, 1 for negative
Exponent (middle component): Subtract 2(# of bits) - 1 - 1 to get the true exponent
Mantissa (last component): Divide by 2(# of bits) and add 1 to get the true mantissa
Calculating the Number
Putting all three parts together, we're given this binary number:
1.0010011001100110011001100110011001100110011001100110 x 1011
Which we can then convert from binary to decimal:
1.1499999999999999 x 23 (inexact!)
And multiply to reveal the final representation of the number we started with (9.2) after being stored as a floating point value:
9.1999999999999993
Representing as a Fraction
9.2
Now that we've built the number, it's possible to reconstruct it into a simple fraction:
1.0010011001100110011001100110011001100110011001100110 x 1011
Shift mantissa to a whole number:
10010011001100110011001100110011001100110011001100110 x 1011-110100
Convert to decimal:
5179139571476070 x 23-52
Subtract the exponent:
5179139571476070 x 2-49
Turn negative exponent into division:
5179139571476070 / 249
Multiply exponent:
5179139571476070 / 562949953421312
Which equals:
9.1999999999999993
9.5
>>> float_to_bin_parts(9.5)
['0', '10000000010', '0011000000000000000000000000000000000000000000000000']
Already you can see the mantissa is only 4 digits followed by a whole lot of zeroes. But let's go through the paces.
Assemble the binary scientific notation:
1.0011 x 1011
Shift the decimal point:
10011 x 1011-100
Subtract the exponent:
10011 x 10-1
Binary to decimal:
19 x 2-1
Negative exponent to division:
19 / 21
Multiply exponent:
19 / 2
Equals:
9.5
Further reading
The Floating-Point Guide: What Every Programmer Should Know About Floating-Point Arithmetic, or, Why don’t my numbers add up? (floating-point-gui.de)
What Every Computer Scientist Should Know About Floating-Point Arithmetic (Goldberg 1991)
IEEE Double-precision floating-point format (Wikipedia)
Floating Point Arithmetic: Issues and Limitations (docs.python.org)
Floating Point Binary
This isn't a full answer (mhlester already covered a lot of good ground I won't duplicate), but I would like to stress how much the representation of a number depends on the base you are working in.
Consider the fraction 2/3
In good-ol' base 10, we typically write it out as something like
0.666...
0.666
0.667
When we look at those representations, we tend to associate each of them with the fraction 2/3, even though only the first representation is mathematically equal to the fraction. The second and third representations/approximations have an error on the order of 0.001, which is actually much worse than the error between 9.2 and 9.1999999999999993. In fact, the second representation isn't even rounded correctly! Nevertheless, we don't have a problem with 0.666 as an approximation of the number 2/3, so we shouldn't really have a problem with how 9.2 is approximated in most programs. (Yes, in some programs it matters.)
Number bases
So here's where number bases are crucial. If we were trying to represent 2/3 in base 3, then
(2/3)10 = 0.23
In other words, we have an exact, finite representation for the same number by switching bases! The take-away is that even though you can convert any number to any base, all rational numbers have exact finite representations in some bases but not in others.
To drive this point home, let's look at 1/2. It might surprise you that even though this perfectly simple number has an exact representation in base 10 and 2, it requires a repeating representation in base 3.
(1/2)10 = 0.510 = 0.12 = 0.1111...3
Why are floating point numbers inaccurate?
Because often-times, they are approximating rationals that cannot be represented finitely in base 2 (the digits repeat), and in general they are approximating real (possibly irrational) numbers which may not be representable in finitely many digits in any base.
While all of the other answers are good there is still one thing missing:
It is impossible to represent irrational numbers (e.g. π, sqrt(2), log(3), etc.) precisely!
And that actually is why they are called irrational. No amount of bit storage in the world would be enough to hold even one of them. Only symbolic arithmetic is able to preserve their precision.
Although if you would limit your math needs to rational numbers only the problem of precision becomes manageable. You would need to store a pair of (possibly very big) integers a and b to hold the number represented by the fraction a/b. All your arithmetic would have to be done on fractions just like in highschool math (e.g. a/b * c/d = ac/bd).
But of course you would still run into the same kind of trouble when pi, sqrt, log, sin, etc. are involved.
TL;DR
For hardware accelerated arithmetic only a limited amount of rational numbers can be represented. Every not-representable number is approximated. Some numbers (i.e. irrational) can never be represented no matter the system.
There are infinitely many real numbers (so many that you can't enumerate them), and there are infinitely many rational numbers (it is possible to enumerate them).
The floating-point representation is a finite one (like anything in a computer) so unavoidably many many many numbers are impossible to represent. In particular, 64 bits only allow you to distinguish among only 18,446,744,073,709,551,616 different values (which is nothing compared to infinity). With the standard convention, 9.2 is not one of them. Those that can are of the form m.2^e for some integers m and e.
You might come up with a different numeration system, 10 based for instance, where 9.2 would have an exact representation. But other numbers, say 1/3, would still be impossible to represent.
Also note that double-precision floating-points numbers are extremely accurate. They can represent any number in a very wide range with as much as 15 exact digits. For daily life computations, 4 or 5 digits are more than enough. You will never really need those 15, unless you want to count every millisecond of your lifetime.
Why can we not represent 9.2 in binary floating point?
Floating point numbers are (simplifying slightly) a positional numbering system with a restricted number of digits and a movable radix point.
A fraction can only be expressed exactly using a finite number of digits in a positional numbering system if the prime factors of the denominator (when the fraction is expressed in it's lowest terms) are factors of the base.
The prime factors of 10 are 5 and 2, so in base 10 we can represent any fraction of the form a/(2b5c).
On the other hand the only prime factor of 2 is 2, so in base 2 we can only represent fractions of the form a/(2b)
Why do computers use this representation?
Because it's a simple format to work with and it is sufficiently accurate for most purposes. Basically the same reason scientists use "scientific notation" and round their results to a reasonable number of digits at each step.
It would certainly be possible to define a fraction format, with (for example) a 32-bit numerator and a 32-bit denominator. It would be able to represent numbers that IEEE double precision floating point could not, but equally there would be many numbers that can be represented in double precision floating point that could not be represented in such a fixed-size fraction format.
However the big problem is that such a format is a pain to do calculations on. For two reasons.
If you want to have exactly one representation of each number then after each calculation you need to reduce the fraction to it's lowest terms. That means that for every operation you basically need to do a greatest common divisor calculation.
If after your calculation you end up with an unrepresentable result because the numerator or denominator you need to find the closest representable result. This is non-trivil.
Some Languages do offer fraction types, but usually they do it in combination with arbitary precision, this avoids needing to worry about approximating fractions but it creates it's own problem, when a number passes through a large number of calculation steps the size of the denominator and hence the storage needed for the fraction can explode.
Some languages also offer decimal floating point types, these are mainly used in scenarios where it is imporant that the results the computer gets match pre-existing rounding rules that were written with humans in mind (chiefly financial calculations). These are slightly more difficult to work with than binary floating point, but the biggest problem is that most computers don't offer hardware support for them.
I'm looking through old exam questions (currently first year of uni.) and I'm wondering if someone could explain a bit more thoroughly why the following for loop does not end when it is supposed to. Why does this happen? I understand that it skips 100.0 because of a rounding-error or something, but why?
for(double i = 0.0; i != 100; i = i +0.1){
System.out.println(i);
}
The number 0.1 cannot be exactly represented in binary, much like 1/3 cannot be exactly represented in decimal, as such you cannot guarantee that:
0.1+0.1+0.1+0.1+0.1+0.1+0.1+0.1+0.1+0.1==1
This is because in binary:
0.1=(binary)0.00011001100110011001100110011001....... forever
However a double cannot contain an infinite precision and so, just as we approximate 1/3 to 0.3333333 so must the binary representation approximate 0.1.
Expanded decimal analogy
In decimal you may find that
1/3+1/3+1/3
=0.333+0.333+0.333
=0.999
This is exactly the same problem. It should not be seen as a weakness of floating point numbers as our own decimal system has the same difficulties (but for different numbers, someone with a base-3 system would find it strange that we struggled to represent 1/3). It is however an issue to be aware of.
Demo
A live demo provided by Andrea Ligios shows these errors building up.
Computers (at least current ones) works with binary data. Moreover, there is a length limitation for computers to process in their arithmetic logic units (i.e. 32bits, 64bits etc).
Representing integers in binary form is simple on the contrary we cant say the same thing for floating points.
As shown above there is a special way of representing floating points according to IEEE-754 which is also accepted as defacto by processor producers and software guys that's why it is important for everyone to know about it.
If we look at the maximum value of a double in java (Double.MAX_VALUE) is 1.7976931348623157E308 (>10^307). only with 64 bits, huge numbers could be represented however problem is the precision.
As '==' and '!=' operators compare numbers bitwise, in your case 0.1+0.1+0.1 is not equal to 0.3 in terms of bits they are represented.
As a conclusion, to fit huge floating point numbers in a few bits clever engineers decided to sacrifice precision. If you are working on floating points you shouldn't use '==' or '!=' unless you are sure what you are doing.
As a general rule, never use double to iterate with due to rounding errors (0.1 may look nice when written in base 10, but try writing it in base 2—which is what double uses). What you should do is use a plain int variable to iterate and calculate the double from it.
for (int i = 0; i < 1000; i++)
System.out.println(i/10.0);
First of all, I'm going to explain some things about doubles. This will actually take place in base ten for ease of understanding.
Take the value one-third and try to express it in base ten. You get 0.3333333333333.... Let's say we need to round it to 4 places. We get 0.3333. Now, let's add another 1/3. We get 0.6666333333333.... which rounds to 0.6666. Let's add another 1/3. We get 0.9999, not 1.
The same thing happens with base two and one-tenth. Since you're going by 0.110 and 0.110 is a repeating binary value(like 0.1666666... in base ten), you'll have just enough error to miss one hundred when you do get there.
1/2 can be represented in base ten just fine, and 1/5 can as well. This is because the prime factors of the denominator are a subset of the factors of the base. This is not the case for one third in base ten or one tenth in base two.
It should be for(double a = 0.0; a < 100.0; a = a + 0.01)
Try and see if this works instead
This question already has answers here:
Closed 11 years ago.
Possible Duplicates:
Is JavaScript's Math broken?
Java floating point arithmetic
I have the current code
for(double j = .01; j <= .17; j+=.01){
System.out.println(j);
}
the output is:
0.01
0.02
0.03
0.04
0.05
0.060000000000000005
0.07
0.08
0.09
0.09999999999999999
0.10999999999999999
0.11999999999999998
0.12999999999999998
0.13999999999999999
0.15
0.16
0.17
Can someone explain why this is happening? How do you fix this? Besides writing a rounding function?
Floats are an approximation of the actual number in Java, due to the way they're stored. If you need exact values, use a BigDecimal instead.
They are working correctly. Some decimal values are not representable exactly in binary floating point and get rounded to the closest value. See my answer to this question for more detail. The question was asked about Perl, but the answer applies equally to Java since it's a limitation of ALL floating point representations that do not have infinite precision (i.e. all of them).
As suggested by #Kaleb Brasee go and use BigDecimal's when accuracy is a must. Here is a link to a nice explanation of tiny details related to using floating point operations in Java http://firstclassthoughts.co.uk/java/traps/java_double_traps.html
There is also a link to issues involved with using BigDecimal's. Highly recommended to read them both. It really helped me.
Enjoy, Boro.
We humans are used to think in 'base 10' when we deal with floating point numbers 'by hand' (that is, literally when writing them on paper or when entering them into a computer). Because of this, it is possible for us to write down an exact representation of, say, 17%. We just write 0.17 (or 1.7E-1 etc). Trying to represent such a trivial thing as a third can not be done exactly with that system, because we have to write 0.3333333... with an infinite number of 3s, which is impossible.
Computers dealing with floating point not only have a limited number of bits to represent the mantissa (or significand) of the number, they are also restricted to express the mantissa in the base of two. That means that most percentages (which we humans with our base 10 floating point convention always can write exactly, like for example '0.17') are impossible for the computer to store exactly. Fractions like 0%, 25%, 50%, 75% and 100% can be expressed exactly as a floating point number in a computer, because it consists of either halves (2E-1) or quarters (2E-4) which fits nicely with a digital representation of a number. Percentage values like 17% or even trivial ones (for us humans!!) like 10% or 1% are as impossible for computers to store exactly simply because those numbers are, for the binary floating point system what the 'one third' is for the human (base 10) floating point system.
But if you carefully pick your floating point values, so they always are made of a whole number of 1/2^n where n might be 10 (meaning an integer number of 1/1024), then they can always be stored exactly without errors as a floating point number. So if you try to store 17/1024 in a computer, it will go smoothly. You can actually store it without error even using the 'human base 10' decimal system (but you would go nuts by the number of actual digits you have to deal with).
This is some reason I believe why some games express angles in a unit where a whole 360 degree turn is 256 angle units. Can be expressed without loss as a floating point number between 0 and 1 (where 1 means you go a full revolution).
It's normal in double representation on the computer. You lose some bits then you will have such results. Better solution is to do this:
for(int j = 1; j <= 17; j++){
System.out.println(j/100.0);
}
This is because floating point values are inherently not the same as reals in the mathematical sense.
In a computer, there is only a fixed number of bits that can be used to represent value. This means there are a finite number of values that it can hold. But there are an infinite amount of real numbers, thus not all of them can be represented exactly. But usually the value is something close. You can find a more detailed explanation here.
That is because of the limitations of IEEE754 the binary format to get the most out of 32 bit.
As others have pointed out, only numbers that are combinations of powers of two are exactly representable in (bianary) floating point format
If you need to store arbitrary numbers with arbitrary precision, then use BigDecimal.
If the problem is just a display issue, then you can get round this in how you display the number. For example:
String.format("%.2f", n)
will format the number to 2 decimal places.