Is comparing two same "literal" float numbers for equality wrong? - java

This question is kind of language-agnostic but the code is written in Java.
We have all heard that comparing floating-point numbers for equality is generally wrong. But what if I wanted to compare two exact same literal float values (or strings representing exact same literal values converted to floats)?
I'm quite sure that the numbers will be exactly equal (well, because they must be equal in binary—how can the exact same thing result in two different binary numbers?!) but I wanted to be sure.
Case 1:
void test1() {
    float f1 = 4.7f; // the f suffix is required: a bare 4.7 is a double literal
    float f2 = 4.7f;
    System.out.println(f1 == f2);
}
Case 2:
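class Movie {
    String rating; // for some reason the type is String
}
void test2() {
    Movie movie1 = new Movie();
    Movie movie2 = new Movie();
    movie1.rating = "4.7";
    movie2.rating = "4.7";
    float f1 = Float.parseFloat(movie1.rating); // the method is parseFloat, not parse
    float f2 = Float.parseFloat(movie2.rating);
    System.out.println(f1 == f2);
}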
In both situations, the expression f1 == f2 should result in true. Am I right? Can I safely compare ratings for equality if they have the same literal float or string values?

There's a rule of thumb that you should apply to all programming rules of thumb (rule of thumbs?):
They are oversimplified, and will result in boneheaded decision making if pushed too far. If you do not fully grok the intent behind the rule of thumb, you will mess up. Perhaps the rule of thumb remains a net positive (applying it without thought will improve things more than it will make them worse), but it will cause damage, and in any case it cannot be used as an argument in a debate.
So, with that in mind, clearly, there is no point in asking the question:
"Given that the rule of thumb 'do not use == to compare floats' exists, is it ALWAYS bad?".
The answer is the extremely obvious: Duh, no. It's not ALWAYS bad, because rules of thumb pretty much by definition, if not by common sense, never ALWAYS apply.
So let's break it down then.
WHY is there a rule of thumb that you shouldn't == compare floats?
Your question suggests you already know this: It's because math on IEEE 754 floating-point types such as Java's double or float is inexact (unlike types such as Java's BigDecimal, which is exact *).
So do what you should always do when you grok why a rule of thumb exists and realize it does not apply to your scenario: Completely ignore it.
Perhaps your question boils down to: I THINK I grok the rule of thumb, but perhaps I'm missing something; aside from the 'floating point math introduces small deviations which mess up == comparison', which does not apply to this case, are there any other reasons for this rule of thumb that I am not aware of?
In which case, my answer is: As far as I know, no.
*) But BigDecimal has its own equality problems, such as: Are two BigDecimal objects that represent precisely the same mathematical number, but at a different scale, 'equal'? That depends on whether your viewpoint is that they are numbers, or objects representing an exact decimal number along with some meta properties, including how to render it and how to round things if explicitly asked to do so. For what it is worth, the equals implementation of BigDecimal, which has to make a Sophie's choice between 2 equally valid interpretations of what equality means, chooses 'I represent a number along with its scale', not just 'I represent a number': 2.0 and 2.00 are not equal according to equals, although compareTo returns 0 for them. The same Sophie's choice exists in all JPA/Hibernate stacks: Does a JPA object represent 'a row in the database' (thus equality being defined solely by the primary key value, and if not saved yet, an object cannot be equal to anything except the very same reference), or does it represent the thing that the row represents, e.g. a student, and not 'a row in the DB that represents a student', in which case the ID is the one field that does NOT matter for identity, and all the others (name, birthdate, social security number, etc) do. Equality is hard.
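To make the BigDecimal footnote concrete, here is a minimal sketch of how the two standard comparison methods treat the same number at two scales. The class and method names are made up for illustration; only BigDecimal.equals and BigDecimal.compareTo come from the JDK.

```java
import java.math.BigDecimal;

public class BigDecimalEquality {
    // equals() takes the scale into account, so 2.0 and 2.00 are different objects to it.
    static boolean equalByEquals() {
        return new BigDecimal("2.0").equals(new BigDecimal("2.00"));
    }

    // compareTo() looks at the numeric value only, so the scale does not matter.
    static boolean equalByCompareTo() {
        return new BigDecimal("2.0").compareTo(new BigDecimal("2.00")) == 0;
    }

    public static void main(String[] args) {
        System.out.println(equalByEquals());    // false
        System.out.println(equalByCompareTo()); // true
    }
}
```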

Yes. Compile time constants that are the same are evaluated consistently.
If you think about it, they must be the same, because there’s only one compiler and it converts literals to their floating point representation deterministically.
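As a small illustration of the point about deterministic conversion, the sketch below (class and method names are hypothetical) compares identical literals, identical parsed strings, and, for contrast, a float literal against a double literal with the same digits.

```java
public class SameLiteralDemo {
    // Two identical float literals convert to the same float value.
    static boolean sameLiterals() {
        float f1 = 4.7f;
        float f2 = 4.7f;
        return f1 == f2;
    }

    // Parsing the same string twice also gives the same float value.
    static boolean sameStrings() {
        return Float.parseFloat("4.7") == Float.parseFloat("4.7");
    }

    // Different types are a different story: the float is widened to double,
    // and the float and double approximations of 4.7 are not the same number.
    static boolean floatVersusDouble() {
        return 4.7f == 4.7;
    }

    public static void main(String[] args) {
        System.out.println(sameLiterals());      // true
        System.out.println(sameStrings());       // true
        System.out.println(floatVersusDouble()); // false
    }
}
```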

Yes, you can compare floats like this. The thing is that even if 4.7 isn't 4.7 when converted to a float, it will be converted consistently to the same value.
In general it is not wrong per se to compare floats like this. But for more complex math, you might want to use Math.round() or set a "sameness" difference span that the two should be within to be counted as "the same".
There is also an arbitrariness to fixed point numbers. For instance
1,000,000,001
is bigger than
1,000,000,000
Are these two numbers different? It depends on the precision you need. But for most purposes, these numbers are functionally the same.

This question is kind of language-agnostic…
Actually, there is no floating-point issue here, and the answer depends entirely on the language.
There is no floating-point issue because IEEE-754 is clear: Two floating-point datums (finite numbers, infinities, and/or NaNs) compare as equal if and only if they correspond to the same real number.
There are language issues because how literals are mapped to floating-point numbers and how source text is mapped to operations differs from language to language. For example, C 2018 6.4.4.2 5 says:
All floating constants of the same source form [footnote 77] shall convert to the same internal format with the same value.
And footnote 77 says:
1.23, 1.230, 123e-2, 123e-02, and 1.23L are all different source forms and thus need not convert to the same internal format and value.
Thus the C standard permits 1.23 == 1.230 to evaluate to false. (There are historical reasons this was permitted, leaving it as a quality-of-implementation issue.) If by “same” literal float value, you mean the exact same source text, then this problem does not occur in C; the exact same source text must produce the same floating-point value each time in a particular C implementation. However, this example teaches us to be cautious.
C also allows implementations flexibility in how floating-point operations are performed: It allows an implementation to use more than the nominal precision in evaluating expressions, and it allows using different precisions in different parts of the same expression. So 1./3. == 1./3. could evaluate to false.
Some languages, like Python, do not have a good formal specification and are largely silent about how floating-point operations are performed. It is conceivable a Python implementation could use excess precision available in processor registers to convert the source text 1.3 to a long double or similar type, then save it somewhere as a double, then convert the source text 1.3 to a long double, then retrieve the double to compare it to the long double still in registers and get a result indicating inequality.
This sort of issue does not occur in implementations I am aware of, but, when asking a question like this, asking whether a rule always holds, regardless of language, leaves the door open for possible exceptions.

Related

Why do we need numeric literals in Java?

I have simple question, why do we need to use special literals when it's already obviously what type of variable we are using.
For example, you can see that we are using double type here. And I think compiler should also see it. But if I run such code:
double no_double = 60*(1000/3600);
System.out.format("result is: %.3f",no_double);
I get the result is: 0,000.
But if I run that code:
double a_double = 60.0*(1000.0/3600.0);
System.out.format("result is: %.3f",a_double);
Then I get true result: 16,667.
So why do we need to use literals ?
Update: Java Primitive Data Types http://docs.oracle.com/javase/tutorial/java/nutsandbolts/datatypes.html
You're dividing two integers.
The result of that is another integer.
Assigning that integer to a double value later doesn't change the division.
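A short sketch of the same point, with hypothetical method names: the division is carried out in int arithmetic unless at least one operand is already a floating-point value, regardless of the type the result is assigned to.

```java
public class IntegerDivisionDemo {
    // 1000 / 3600 is int division and yields 0; the 0 is widened to 0.0 afterwards.
    static double allInts() {
        return 60 * (1000 / 3600);
    }

    // One double operand is enough to make the division a floating-point division.
    static double withDoubleLiteral() {
        return 60 * (1000.0 / 3600);
    }

    // With int variables, a cast on one operand has the same effect.
    static double withCast(int metres, int seconds) {
        return 60 * ((double) metres / seconds);
    }

    public static void main(String[] args) {
        System.out.printf("%.3f%n", allInts());
        System.out.printf("%.3f%n", withDoubleLiteral());
        System.out.printf("%.3f%n", withCast(1000, 3600));
    }
}
```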
It is not obvious, as the compiler (or JVM) cannot know if you really want floats or integers.
I'd argue that floating point math is hard, if you consider all the corner cases. Floats are imprecise by design, whereas with integers, you get exact results. If you can stick to exact results it is often better to do so and resort to floats only when explicitly needed. For example, if you need to compare the equality of two variables, with floats you have to give it some boundaries and definitions as to what you consider to be equal. With integers there is no need for this, it is self-evident.
There are several programming languages where this kind of explicit separation does not happen, possibly JavaScript and PHP being the most popular. They choose to autoconvert the datatypes on the fly. It causes some considerable overhead and additional issues in the long run, when you need to know exactly what kind of variable you have in your hands.
Still other programming languages exist that don't even have these different data types. Maybe everything there is just an object. This is one way of solving it.
This is just part of the specification of Java as a C-type language. Per the specification, if integer values aren't promoted in an expression, then the result of the calculation is an integer. The language designers could have decided to make the result of all calculations floating point numbers, but decided not to, probably because that behavior for primitive types was not familiar to C and C++ programmers, and because it makes the operations slower.

Comparing mathematical expressions

So here's my situation: I have two mathematical expressions which contain variables (x, y, z, etc). I have already compiled them to postfix using the shunting yard algorithm for execution and now I need a way to test if they're mathematically equal.
Examples:
x+5==5+x
x*2==x+x
4/(x/2)==8/x
My initial thinking is to just throw a couple of thousand different random inputs and see if the evaluation result is the same.
Problems I foresee with this approach: Precision problems, NaN-situations and possible overflows.
All calculations are done with Java's double type.
Any ideas? :)
Edit: As this is for a casual game, the solution doesn't need to be perfect, only good enough!
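The random-input idea described in the question could be sketched roughly as follows. This is only an illustration under several assumptions: expressions are modelled as single-variable lambdas rather than postfix programs, the sampling range of -100 to 100 and the tolerance are arbitrary, and all names are made up. Inputs that produce NaN or infinity on either side are skipped, and results are compared with a tolerance that is relative for large values and absolute near zero.

```java
import java.util.Random;
import java.util.function.DoubleUnaryOperator;

public class RandomEquivalence {
    // Tolerance is an assumption; tune it for the expressions you expect.
    static final double TOLERANCE = 1e-9;

    static boolean probablyEqual(DoubleUnaryOperator f, DoubleUnaryOperator g,
                                 int samples, long seed) {
        Random rnd = new Random(seed);
        int compared = 0;
        for (int i = 0; i < samples; i++) {
            double x = rnd.nextDouble() * 200.0 - 100.0; // x in [-100, 100)
            double a = f.applyAsDouble(x);
            double b = g.applyAsDouble(x);
            // Skip inputs where either side is NaN or infinite (e.g. division by zero).
            if (!Double.isFinite(a) || !Double.isFinite(b)) {
                continue;
            }
            compared++;
            // Relative for large magnitudes, absolute when both results are small.
            double scale = Math.max(1.0, Math.max(Math.abs(a), Math.abs(b)));
            if (Math.abs(a - b) > TOLERANCE * scale) {
                return false;
            }
        }
        return compared > 0; // no usable sample means no evidence either way
    }

    public static void main(String[] args) {
        System.out.println(probablyEqual(x -> x * 2, x -> x + x, 1000, 42L));
        System.out.println(probablyEqual(x -> 4 / (x / 2), x -> 8 / x, 1000, 42L));
        System.out.println(probablyEqual(x -> x * 2, x -> x * x, 1000, 42L));
    }
}
```

A passing result is only evidence, not proof: two different expressions can agree on every sampled point, which may be acceptable for a casual game.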
For the example expressions you have provided, you could transform the function to produce one polynomial divided by another, with the most significant coefficient of the divisor one, and with no common factor. This would give you a canonical form - if there was a difference the two functions would really be different. However, you would also need to represent the coefficients as arbitrary precision rationals or hit precision problems here too, and by then you will have written most of a basic computer algebra system, such as those listed at http://en.wikipedia.org/wiki/List_of_computer_algebra_systems - which does include some free systems.
According to Wikipedia on this topic:
http://en.wikipedia.org/wiki/Symbolic_computation
"There are two notions of equality for mathematical expressions. The syntactic equality is the equality of the expressions which means that they are written (or represented in a computer) in the same way. As trivial, it is rarely considered by mathematicians, but it is the only equality that is easy to test with a program. The semantic equality is when two expressions represent the same mathematical object, like in
It is known that there may not exist a algorithm that decides if two expressions representing numbers are semantically equal, if exponentials and logarithms are allowed in the expressions. Therefore (semantical) equality may be tested only on some classes of expressions such as the polynomials and the rational fractions.
To test the equality of two expressions, instead to design a specific algorithm, it is usual to put them in some canonical form or to put their difference in a normal form and to test the syntactic equality of the result."
That seems to be the best practice.
I was trying to write basically the same question when I ended up here. However I found some ideas which are not mentioned here.
First, I agree with @nelshh that in some specific cases you can find canonical forms which allow you to test equality of expressions.
I found some examples of canonical forms:
The most famous is probably the minterm canonical form in the boolean algebra which is used for instance in circuit synthesis or verification.
Polynomial expressions also admit a canonical form as a sum of monomials. This can solve your examples:
The canonical form for rational numbers is the irreducible fraction.
Your examples:
x+5==5+x: Both are already in canonical form, you just need to sort them by increasing degree.
x*2==x+x: 2*x is in its canonical form, x+x is not (because both operands of the addition have the same degree).
4/(x/2)==8/x: Both are already in canonical form (monomials of degree -1), except that the coefficient in 4/(x/2), which is 4/(1/2), is not in the canonical form for rational numbers.
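As a rough sketch of the polynomial canonical form, one could store a single-variable polynomial as a sorted map from degree to coefficient and never store zero coefficients, so that equal polynomials always produce equal maps. Everything below is an illustrative assumption (names, integer-only coefficients, one variable); handling 4/(x/2) would additionally need rational coefficients and negative degrees.

```java
import java.util.Map;
import java.util.TreeMap;

public class PolyCanonical {
    // The constant polynomial c (the empty map stands for 0).
    static TreeMap<Integer, Long> constant(long c) {
        TreeMap<Integer, Long> p = new TreeMap<>();
        if (c != 0) {
            p.put(0, c);
        }
        return p;
    }

    // The polynomial "x": coefficient 1 at degree 1.
    static TreeMap<Integer, Long> x() {
        TreeMap<Integer, Long> p = new TreeMap<>();
        p.put(1, 1L);
        return p;
    }

    // Sum: add coefficients of equal degree, dropping any that become zero.
    static TreeMap<Integer, Long> add(Map<Integer, Long> a, Map<Integer, Long> b) {
        TreeMap<Integer, Long> r = new TreeMap<>(a);
        for (Map.Entry<Integer, Long> e : b.entrySet()) {
            long sum = r.getOrDefault(e.getKey(), 0L) + e.getValue();
            if (sum == 0) {
                r.remove(e.getKey());
            } else {
                r.put(e.getKey(), sum);
            }
        }
        return r;
    }

    // Product: multiply every pair of monomials and accumulate by degree.
    static TreeMap<Integer, Long> mul(Map<Integer, Long> a, Map<Integer, Long> b) {
        TreeMap<Integer, Long> r = new TreeMap<>();
        for (Map.Entry<Integer, Long> ea : a.entrySet()) {
            for (Map.Entry<Integer, Long> eb : b.entrySet()) {
                int degree = ea.getKey() + eb.getKey();
                long coeff = r.getOrDefault(degree, 0L) + ea.getValue() * eb.getValue();
                if (coeff == 0) {
                    r.remove(degree);
                } else {
                    r.put(degree, coeff);
                }
            }
        }
        return r;
    }

    public static void main(String[] args) {
        // x*2 and x+x reduce to the same canonical map {1=2}.
        System.out.println(mul(x(), constant(2)).equals(add(x(), x())));
        // x+5 and 5+x reduce to the same canonical map {0=5, 1=1}.
        System.out.println(add(x(), constant(5)).equals(add(constant(5), x())));
    }
}
```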
If you are still interested in this, I would suggest that you experiment with a computer algebra system such as sympy for python (it probably also exists for java). However, I also think that you should remove the tags java and floating-point (it has nothing to do with how a computer stores real numbers), and add the tag computer-science.
For instance sympy is able to tell such things:
>>> Rational(3,4)*(x+y)**2
         2
3⋅(x + y)
──────────
    4
>>> Rational(3,4)*(x**2+y**2)+Rational(1,4)*2*x*y+Rational(4,8)*2*x*y
   2              2
3⋅x    3⋅x⋅y   3⋅y
──── + ───── + ────
 4       2      4
>>> expand(Rational(3,4)*(x+y)**2)==expand(Rational(3,4)*(x**2+y**2)+Rational(1,4)*2*x*y+Rational(4,8)*2*x*y)
True

Float vs Double

Is there ever a case where a comparison (equals()) between two floating point values would return false if you compare them as DOUBLE but return true if you compare them as FLOAT?
I'm writing some procedure, as part of my group project, to compare two numeric values of any given types. There're 4 types I'd have to deal with altogether : double, float, int and long. So I'd like to group double and float into one function, that is, I'd just cast any float to double and do the comparison.
Would this lead to any incorrect results?
Thanks.
If you're converting doubles to floats and the difference between them is beyond the precision of the float type, you can run into trouble.
For example, say you have the two double values:
9.876543210
9.876543211
and that the precision of a float was only six decimal digits. That would mean that both float values would be 9.87654, hence equal, even though the double values themselves are not equal.
However, if you're talking about floats being cast to doubles, then identical floats should give you identical doubles. If the floats are different, the extra precision will ensure the doubles are distinct as well.
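The two directions can be checked with a small sketch (class and method names are hypothetical). It uses the two double values from the example above, which happen to round to the same float, and then confirms that widening a pair of floats to double never changes the outcome of ==.

```java
public class NarrowingDemo {
    // As doubles, the two values from the example are distinct.
    static boolean distinctAsDouble() {
        return 9.876543210 != 9.876543211;
    }

    // Narrowed to float, both round to the same nearest float value.
    static boolean equalAsFloat() {
        return (float) 9.876543210 == (float) 9.876543211;
    }

    // Widening float to double is exact, so the comparison result is unchanged.
    static boolean wideningKeepsResult(float a, float b) {
        return (a == b) == ((double) a == (double) b);
    }

    public static void main(String[] args) {
        System.out.println(distinctAsDouble());              // true
        System.out.println(equalAsFloat());                  // true
        System.out.println(wideningKeepsResult(0.1f, 0.1f)); // true
        System.out.println(wideningKeepsResult(0.1f, 0.2f)); // true
    }
}
```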
As long as you are not mixing promoted floats and natively calculated doubles in your comparison you should be ok, but take care:
Comparing floats (or doubles) for equality is difficult - see this lengthy but excellent discussion.
Here are some highlights:
You can't use ==, because of problems with the limited precision of floating point formats
float(0.1) and double(0.1) are different values (0.100000001490116119384765625 and 0.1000000000000000055511151231257827021181583404541015625) respectively. In your case, this means that comparing two floats (by converting to double) will probably be ok, but be careful if you want to compare a float with a double.
It's common to use an epsilon, or small value, to compare with (floats a and b are considered equal if |a - b| < epsilon). In C, float.h defines FLT_EPSILON for exactly this purpose. However, this type of comparison doesn't work where a and b are both very small, or both very large.
You can address this by using a scaled-relative-to-the-sizes-of-a-and-b epsilon, but this breaks down in some cases (like comparisons to zero).
You can compare the integer representations of the floating point numbers to find out how many representable floats there are between them. This is called the ULP difference, for "Units in Last Place" difference. (Java's Float.equals() also compares integer representations, via floatToIntBits, but only for exact bit equality, not for a ULP distance.) It's generally good, but also breaks down when comparing against zero.
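The ULP-based comparison from the last highlight might look like this in Java. This is a sketch, not the article's code: the class, the method names and the maxUlps parameter are all assumptions, and only Float.floatToIntBits, Float.isNaN and Math.nextUp are JDK calls. The bit pattern is first remapped so that integer order matches float order across the sign boundary.

```java
public class UlpDistance {
    // Map the raw bits to an int that increases monotonically with the float value.
    // Negative floats have the sign bit set, so their magnitude is negated here;
    // both +0.0f and -0.0f map to 0.
    static int ordered(float f) {
        int bits = Float.floatToIntBits(f);
        return bits < 0 ? Integer.MIN_VALUE - bits : bits;
    }

    // Number of representable floats between a and b (0 means same value).
    static long ulpDistance(float a, float b) {
        return Math.abs((long) ordered(a) - (long) ordered(b));
    }

    static boolean almostEqualUlps(float a, float b, int maxUlps) {
        if (Float.isNaN(a) || Float.isNaN(b)) {
            return false; // NaN is never "almost equal" to anything
        }
        return ulpDistance(a, b) <= maxUlps;
    }

    public static void main(String[] args) {
        System.out.println(ulpDistance(1.0f, Math.nextUp(1.0f))); // 1
        System.out.println(ulpDistance(0.0f, -0.0f));             // 0
        // Near zero the measure breaks down: a tiny value is many ULPs away from 0.
        System.out.println(almostEqualUlps(0.0f, 1e-10f, 4));     // false
    }
}
```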
The article concludes:
Know what you’re doing
There is no silver bullet. You have to choose wisely.
If you are comparing against zero, then relative epsilons and ULPs based comparisons are usually meaningless. You’ll need to use an absolute epsilon, whose value might be some small multiple of FLT_EPSILON and the inputs to your calculation. Maybe.
If you are comparing against a non-zero number then relative epsilons or ULPs based comparisons are probably what you want. You’ll probably want some small multiple of FLT_EPSILON for your relative epsilon, or some small number of ULPs. An absolute epsilon could be used if you knew exactly what number you were comparing against.
If you are comparing two arbitrary numbers that could be zero or non-zero then you need the kitchen sink. Good luck and God speed.
So, to answer your question:
If you are downgrading doubles to floats, then you might lose precision, and incorrectly report two different doubles as equal (as paxdiablo points out.)
If you are upgrading identical floats to double, then the added precision won't be a problem unless you are comparing a float with a double (Say you'd got 1.234 in float, and you only had 4 decimal digits of accuracy, then the double 1.2345 MIGHT represent the same value as the float. In this case you'd probably be better to do the comparison at the precision of the float, or more generally, at the error level of the most inaccurate representation in the comparison).
If you know the number you'll be comparing with, you can follow the advice quoted above.
If you're comparing arbitrary numbers (which could be zero or non-zero), there's no way to compare them correctly in all cases - pick one comparison and know its limitations.
A couple of practical considerations (since this sounds like it's for an assignment):
The epsilon comparison mentioned by most is probably fine (but include a discussion of the limitations in the write up). If you're ever planning to compare doubles to floats, try to do it in float, but if not, try to do all comparisons in double. Even better, just use doubles everywhere.
If you want to totally ace the assignment, include a write-up of the issues when comparing floats and the rationale for why you chose any particular comparison method.
I don't understand why you're doing this at all. The == operator already caters for all possible types on both sides, with extensive rules on type coercion and widening which are already specified in the relevant language standards. All you have to do is use it.
I'm perhaps not answering the OP's question but rather responding to some more or less fuzzy advice which require clarifications.
Comparing two floating point values for equality is absolutely possible and can be done. Whether the type is single or double precision is often of less importance.
Having said that the steps leading up to the comparison itself require great care and a thorough understanding of floating-point dos and don'ts, whys and why nots.
Consider the following C statements:
result = a * b / c;
result = (a * b) / c;
result = a * (b / c);
In most naive floating-point programming they are seen as "equivalent", i.e. producing the "same" result. In the real world of floating-point they may be. Or actually, the first two are equivalent (as the second follows C evaluation rules, i.e. operators of the same priority are evaluated left to right). The third may or may not be equivalent to the first two.
Why is this?
"a * b / c" or "b / c * a" may cause the "inexact" exception i e an intermediate or the final result (or both) is (are) not exact(ly representable in floating point format). If this is the case the results will be more or less subtly different. This may or may not lead to the end results being amenable to an equality comparison. Being aware of this and single-stepping through operations one at a time - noting intermediate results - will allow the patient programmer to "beat the system" i e construct a quality floating-point comparison for practically any situation.
For everyone else, passing over the equality comparison for floating-point numbers is good, solid advice.
It's really a bit ironic because most programmers know that integer math results in predictable truncations in various situations. When it comes to floating-point almost everyone is more or less thunderstruck that results are not exact. Go figure.
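The difference between a * b / c and a * (b / c) can be shown with a deliberately extreme sketch in Java rather than C. The values are chosen, as an assumption for illustration, so that the intermediate product overflows; in everyday code the difference is usually a much subtler rounding discrepancy in the last bits.

```java
public class AssociativityDemo {
    // Evaluated left to right: (a * b) / c.
    static double leftToRight(double a, double b, double c) {
        return a * b / c;
    }

    // The division is done first: a * (b / c).
    static double divideFirst(double a, double b, double c) {
        return a * (b / c);
    }

    public static void main(String[] args) {
        double a = 1e308, b = 10.0, c = 10.0;
        // a * b overflows to Infinity before the division can bring it back.
        System.out.println(leftToRight(a, b, c)); // Infinity
        // b / c is exactly 1.0, so the result is a itself.
        System.out.println(divideFirst(a, b, c)); // 1.0E308
    }
}
```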
You should be okay to make that cast as long as the equality test involves a delta.
For example: abs((double) floatVal1 - (double) floatVal2) < .000001 should work.
Edit in response to the question change
No you would not. The above still stands.
For the comparison between float f and double d, you can calculate the difference of f and d. If abs(f-d) is less than some threshold, you can treat them as equal. The threshold could be either absolute or relative, as your application requires. There are some good solutions here. I hope this is helpful.
Would I ever get an incorrect result if I promote 2 floats to
double and do a 64bit comparison rather than a 32bit comparison?
No.
If you start with two floats, which could be float variables (float x = foo();) or float constants (1.234234234f) then you can compare them directly, of course. If you convert them to double and then compare them then the results will be identical.
This works because double is a super-set of float. That is, every value that can be stored in a float can be stored in a double. The range of the exponent and mantissa are both increased. There are billions of values that can be stored in a double but not in a float, but there are zero values that can be stored in a float but not a double.
As discussed in my float comparison article it can be tricky to do a meaningful comparison between float or double values, because rounding errors may have crept in. But converting both numbers from float to double does not change this. All of the mentions of epsilons (which are often but not always needed) are completely orthogonal to the question.
On the other hand, comparing a float to a double is madness. 1.1 (a double) is not equal to 1.1f (a float) because 1.1 cannot be exactly represented in either.
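A quick sketch of the mixed comparison (names are hypothetical): the float is silently widened to double before ==, so the two different approximations of 1.1 are compared. Narrowing the double first, or using a value that both types represent exactly, gives the result one might have expected.

```java
public class MixedComparison {
    // 1.1f is widened to double; it is not the same number as the double 1.1.
    static boolean floatVersusDouble() {
        return 1.1f == 1.1;
    }

    // Comparing at float precision: the double 1.1 narrows to the same float as 1.1f.
    static boolean bothAsFloat() {
        return 1.1f == (float) 1.1;
    }

    // 1.5 is exactly representable in both formats, so the mixed comparison holds.
    static boolean exactlyRepresentable() {
        return 1.5f == 1.5;
    }

    public static void main(String[] args) {
        System.out.println(floatVersusDouble());    // false
        System.out.println(bothAsFloat());          // true
        System.out.println(exactlyRepresentable()); // true
    }
}
```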

Why does Math.round return a long but Math.floor return a double?

Why the inconsistency?
There is no inconsistency: the methods are simply designed to follow different specifications.
long round(double a)
Returns the closest long to the argument.
double floor(double a)
Returns the largest (closest to positive infinity) double value that is less than or equal to the argument and is equal to a mathematical integer.
Compare with double ceil(double a)
double rint(double a)
Returns the double value that is closest in value to the argument and is equal to a mathematical integer
So by design round rounds to a long and rint rounds to a double. This has always been the case since JDK 1.0.
Other methods were added in JDK 1.2 (e.g. toRadians, toDegrees); others were added in 1.5 (e.g. log10, ulp, signum, etc), and yet some more were added in 1.6 (e.g. copySign, getExponent, nextUp, etc) (look for the Since: metadata in the documentation); but round and rint have both been the way they are now since the beginning.
Arguably, perhaps instead of long round and double rint, it'd be more "consistent" to name them double round and long rlong, but this is argumentative. That said, if you insist on categorically calling this an "inconsistency", then the reason may be as unsatisfying as "because it's inevitable".
Here's a quote from Effective Java 2nd Edition, Item 40: Design method signatures carefully:
When in doubt, look to the Java library APIs for guidance. While there are plenty of inconsistencies -- inevitable, given the size and scope of these libraries -- there is also a fair amount of consensus.
Distantly related questions
Why does int num = Integer.getInteger("123") throw NullPointerException?
Most awkward/misleading method in Java Base API ?
Most Astonishing Violation of the Principle of Least Astonishment
floor would have been chosen to match the standard c routine in math.h (rint, mentioned in another answer, is also present in that library, and returns a double, as in java).
but round was not a standard function in c at that time (it's not mentioned in C89 - c identifiers and standards; c99 does define round and it returns a double, as you would expect). it's normal for language designers to "borrow" ideas, so maybe it comes from some other language? fortran 77 doesn't have a function of that name and i am not sure what else would have been used back then as a reference. perhaps vb - that does have Round but, unfortunately for this theory, it returns a double (php too). interestingly, perl deliberately avoids defining round.
[update: hmmm. looks like smalltalk returns integers. i don't know enough about smalltalk to know if that is correct and/or general, and the method is called rounded, but it might be the source. smalltalk did influence java in some ways (although more conceptually than in details).]
if it's not smalltalk, then we're left with the hypothesis that someone simply chose poorly (given the implicit conversions possible in java it seems to me that returning a double would have been more useful, since then it can be used both while converting types and when doing floating point calculations).
in other words: functions common to java and c tend to be consistent with the c library standard at the time; the rest seem to be arbitrary, but this particular wrinkle may have come from smalltalk.
I agree that it is odd that Math.round(double) returns long. If large double values are cast to long (which is what Math.round implicitly does), Long.MAX_VALUE is returned. An alternative is using Math.rint() in order to avoid that. However, Math.rint() has a somewhat strange rounding behavior: ties are settled by rounding to the even integer, i.e. 4.5 is rounded down to 4.0 but 5.5 is rounded up to 6.0. Another alternative is to use Math.floor(x+0.5). But be aware that 1.5 is rounded to 2 while -1.5 is rounded to -1, not -2. Yet another alternative is to use Math.round, but only if the number is in the range between Long.MIN_VALUE and Long.MAX_VALUE. Double precision floating point values outside this range are integers anyhow.
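The behaviors listed above can be observed directly. The sketch below only calls standard java.lang.Math methods; the class name and the floorPlusHalf helper are made up for illustration.

```java
public class RoundingEdgeCases {
    // The Math.floor(x + 0.5) alternative: ties always go toward positive infinity.
    static double floorPlusHalf(double x) {
        return Math.floor(x + 0.5);
    }

    public static void main(String[] args) {
        System.out.println(Math.round(2.7));     // 3    (a long)
        System.out.println(Math.floor(2.7));     // 2.0  (a double)
        System.out.println(Math.round(1e30) == Long.MAX_VALUE); // true: out of range saturates
        System.out.println(Math.rint(4.5));      // 4.0  (tie goes to the even integer)
        System.out.println(Math.rint(5.5));      // 6.0
        System.out.println(floorPlusHalf(1.5));  // 2.0
        System.out.println(floorPlusHalf(-1.5)); // -1.0
    }
}
```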
Unfortunately, why Math.round() returns long is unknown. Somebody made that decision, and he probably never gave an interview to tell us why. My guess is, that Math.round was designed to provide a better way (i.e., with rounding) for converting doubles to longs.
Like everyone else here I also don't know the answer, but thought someone might find this useful. I noticed that if you want to round a double to an int without casting, you can use the two round implementations long round(double) and int round(float) together:
double d = something;
int i = Math.round(Math.round(d));
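One caveat with this trick, shown in a hedged sketch with hypothetical method names: the outer call resolves to round(float), so the intermediate long is converted to float, which has only 24 bits of precision. For large values the result can be off, and values outside the int range are silently clamped. Math.toIntExact, which throws ArithmeticException on overflow, is one possible alternative.

```java
public class RoundToInt {
    // The approach from the answer above: long -> float -> int.
    static int viaDoubleRound(double d) {
        return Math.round(Math.round(d));
    }

    // Alternative: keep the long exact and fail loudly if it does not fit in an int.
    static int viaToIntExact(double d) {
        return Math.toIntExact(Math.round(d));
    }

    public static void main(String[] args) {
        double d = 16777217.0; // 2^24 + 1, not representable as a float
        System.out.println(viaDoubleRound(d)); // 16777216
        System.out.println(viaToIntExact(d));  // 16777217
    }
}
```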

What's wrong with using == to compare floats in Java?

According to this java.sun page == is the equality comparison operator for floating point numbers in Java.
However, when I type this code:
if(sectionID == currentSectionID)
into my editor and run static analysis, I get: "JAVA0078 Floating point values compared with =="
What is wrong with using == to compare floating point values? What is the correct way to do it?
the correct way to test floats for 'equality' is:
if(Math.abs(sectionID - currentSectionID) < epsilon)
where epsilon is a very small number like 0.00000001, depending on the desired precision.
Floating point values can be off by a little bit, so they may not report as exactly equal. For example, setting a float to "6.1" and then printing it out again, you may get a reported value of something like "6.099999904632568359375". This is fundamental to the way floats work; therefore, you don't want to compare them using equality, but rather comparison within a range, that is, if the diff of the float to the number you want to compare it to is less than a certain absolute value.
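To see the exact value a float really holds, one option is to pass it to the BigDecimal(double) constructor, which converts the binary value without rounding. The class and method names here are just for illustration.

```java
import java.math.BigDecimal;

public class ExactFloatValue {
    // The float is widened to double exactly, and BigDecimal(double) is exact too.
    static String exact(float f) {
        return new BigDecimal(f).toString();
    }

    public static void main(String[] args) {
        System.out.println(Float.toString(6.1f)); // 6.1 (shortest text that round-trips)
        System.out.println(exact(6.1f));          // 6.099999904632568359375
    }
}
```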
This article on the Register gives a good overview of why this is the case; useful and interesting reading.
Just to give the reason behind what everyone else is saying.
The binary representation of a float is kind of annoying.
In binary, most programmers know the correlation between 1b=1d, 10b=2d, 100b=4d, 1000b=8d
Well it works the other way too.
.1b=.5d, .01b=.25d, .001b=.125d, ...
The problem is that there is no exact way to represent most decimal numbers like .1, .2, .3, etc. All you can do is approximate in binary. The system does a little fudge-rounding when the numbers print so that it displays .1 instead of .10000000000001 or .999999999999 (which are probably just as close to the stored representation as .1 is)
Edit from comment: The reason this is a problem is our expectations. We fully expect 2/3 to be fudged at some point when we convert it to decimal, either .7 or .67 or .666667.. But we don't automatically expect .1 to be rounded in the same way as 2/3--and that's exactly what's happening.
By the way, if you are curious the number it stores internally is a pure binary representation using a binary "Scientific Notation". So if you told it to store the decimal number 10.75d, it would store 1010b for the 10, and .11b for the decimal. So it would store .101011 then it saves a few bits at the end to say: Move the decimal point four places right.
(Although technically it's no longer a decimal point, it's now a binary point, but that terminology wouldn't have made things more understandable for most people who would find this answer of any use.)
What is wrong with using == to compare floating point values?
Because it's not true that 0.1 + 0.2 == 0.3
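In Java, with double arithmetic, this can be checked directly. The tolerance used in the second method is an arbitrary assumption for illustration.

```java
public class SumDemo {
    // The rounded sum of 0.1 and 0.2 is not the double nearest to 0.3.
    static boolean exactlyEqual() {
        return 0.1 + 0.2 == 0.3;
    }

    // An absolute-tolerance comparison accepts the tiny rounding difference.
    static boolean within(double a, double b, double epsilon) {
        return Math.abs(a - b) < epsilon;
    }

    public static void main(String[] args) {
        System.out.println(exactlyEqual());               // false
        System.out.println(0.1 + 0.2);                    // 0.30000000000000004
        System.out.println(within(0.1 + 0.2, 0.3, 1e-9)); // true
    }
}
```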
As of today, the quick & easy way to do it is:
if (Float.compare(sectionID, currentSectionID) == 0) {...}
Note, however, that Float.compare applies no margin of difference (no epsilon as in @Victor's answer): it returns 0 only when the two values are numerically identical, with the exceptions that it treats NaN as equal to itself and 0.0f as greater than -0.0f.
If a tolerance is needed, then
float epsilon = Float.MIN_NORMAL;
if(Math.abs(sectionID - currentSectionID) < epsilon){...}
is another option, although Float.MIN_NORMAL (about 1.18e-38) is so small that this behaves almost like an exact comparison; in practice epsilon should be chosen to match the magnitude of the values being compared.
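For reference, here is a small sketch of what Float.compare actually does at the edges, based on its documented behavior. The class and method names are hypothetical.

```java
public class FloatCompareDemo {
    // Two adjacent floats are different values: no tolerance is applied.
    static boolean adjacentFloatsEqual() {
        return Float.compare(1.0f, Math.nextUp(1.0f)) == 0;
    }

    // Unlike ==, Float.compare considers NaN equal to itself.
    static boolean nanEqualsItself() {
        return Float.compare(Float.NaN, Float.NaN) == 0;
    }

    // Unlike ==, Float.compare orders -0.0f below 0.0f.
    static boolean zerosEqual() {
        return Float.compare(0.0f, -0.0f) == 0;
    }

    public static void main(String[] args) {
        System.out.println(adjacentFloatsEqual()); // false
        System.out.println(nanEqualsItself());     // true
        System.out.println(zerosEqual());          // false
    }
}
```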
I think there is a lot of confusion around floats (and doubles), it is good to clear it up.
There is nothing inherently wrong in using floats as IDs in standard-compliant JVM [*]. If you simply set the float ID to x, do nothing with it (i.e. no arithmetics) and later test for y == x, you'll be fine. Also there is nothing wrong in using them as keys in a HashMap. What you cannot do is assume equalities like x == (x - y) + y, etc. This being said, people usually use integer types as IDs, and you can observe that most people here are put off by this code, so for practical reasons, it is better to adhere to conventions. Note that there are as many different double values as there are long values, so you gain nothing by using double. Also, generating "next available ID" can be tricky with doubles and requires some knowledge of the floating-point arithmetic. Not worth the trouble.
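Both halves of that claim can be illustrated with a short sketch. The values are picked, as an assumption for illustration, so that the rounding loss is total; the class and method names are made up.

```java
public class IdentityDemo {
    // A value that is stored and never touched compares equal to itself later.
    static boolean storedValueSurvives() {
        float id = 4.7f;
        float copy = id; // no arithmetic in between
        return copy == id;
    }

    // Mathematically x == (x - y) + y, but each operation rounds its result.
    static boolean roundTrip(double x, double y) {
        return x == (x - y) + y;
    }

    public static void main(String[] args) {
        System.out.println(storedValueSurvives()); // true
        System.out.println(roundTrip(1.0, 2.0));   // true: nothing is lost here
        // 1.0 - 1e20 rounds to -1e20, so adding 1e20 back gives 0.0, not 1.0.
        System.out.println(roundTrip(1.0, 1e20));  // false
    }
}
```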
On the other hand, relying on numerical equality of the results of two mathematically equivalent computations is risky. This is because of the rounding errors and loss of precision when converting from decimal to binary representation. This has been discussed to death on SO.
[*] When I said "standard-compliant JVM" I wanted to exclude certain brain-damaged JVM implementations. See this.
Floating point values are not reliable, due to roundoff error.
As such they should probably not be used as key values, such as sectionID. Use integers instead, or long if int doesn't contain enough possible values.
This is a problem not specific to java. Using == to compare two floats/doubles/any decimal type number can potentially cause problems because of the way they are stored.
A single-precision float (as per IEEE standard 754) has 32 bits, distributed as follows:
1 bit - Sign (0 = positive, 1 = negative)
8 bits - Exponent (a special (bias-127) representation of the x in 2^x)
23 bits - Mantissa. The actual number that is stored.
The mantissa is what causes the problem. It's kinda like scientific notation, only the number in base 2 (binary) looks like 1.110011 x 2^5 or something similar.
But in binary, the first digit is always a 1 (except for the representation of 0).
Therefore, to save a bit of memory space (pun intended), IEEE decided that the 1 should be assumed. For example, a mantissa of 1011 really is 1.1011.
This can cause some issues with comparison, especially with 0: because of the implied leading 1, 0 cannot be written in this normalized form and instead gets a special encoding (all exponent and mantissa bits zero, which yields both +0.0 and -0.0).
This is one reason == is discouraged, in addition to the floating point math issues described by other answers.
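The layout described above can be inspected directly with Float.floatToIntBits. This is only a sketch; the class FloatBits and its helper names are made up for illustration. The value 57.5f is used because it is 1.110011 x 2^5 in binary.

```java
class FloatBits {
    // Bit 31: sign.
    static int sign(float f) {
        return Float.floatToIntBits(f) >>> 31;
    }

    // Bits 30..23: exponent, stored with a bias of 127.
    static int biasedExponent(float f) {
        return (Float.floatToIntBits(f) >>> 23) & 0xFF;
    }

    // Bits 22..0: mantissa without the implied leading 1.
    static int mantissa(float f) {
        return Float.floatToIntBits(f) & 0x7FFFFF;
    }

    public static void main(String[] args) {
        float f = 57.5f; // 111001.1 in binary = 1.110011 x 2^5
        System.out.println(sign(f));                         // 0
        System.out.println(biasedExponent(f) - 127);         // 5
        System.out.println(Integer.toHexString(mantissa(f))); // 660000

        // Zero is the special case: every bit is 0.
        System.out.println(Float.floatToIntBits(0.0f));      // 0
    }
}
```

The stored mantissa 0x660000 is the bit string 110011 followed by zeros; the leading 1 of 1.110011 is not stored.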
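Java adds a wrinkle because the language runs on many different platforms. The float and double formats themselves are fixed by the language specification (IEEE 754 single and double precision), but before Java 17, code not marked strictfp was allowed to use an extended exponent range for intermediate results, so the same calculation could give slightly different results on different hardware. That makes it even more important to avoid ==.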
The proper way to compare two floats for equality (not language-specific, mind you) is as follows:
if(ABS(float1 - float2) < ACCEPTABLE_ERROR)
//they are approximately equal
where ACCEPTABLE_ERROR is #defined or some other constant equal to 0.000000001 or whatever precision is required, as Victor mentioned already.
Some languages have this functionality or this constant built in, but generally this is a good habit to be in.
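In Java, the pseudocode above might look like the following sketch. The class name ApproxEquals, the method approxEquals and the tolerance value are assumptions for illustration; the right tolerance depends on the magnitude of the values being compared.

```java
class ApproxEquals {
    // Hypothetical tolerance; pick one that suits the scale of your data.
    static final double ACCEPTABLE_ERROR = 1e-9;

    // True when a and b differ by less than the given absolute tolerance.
    static boolean approxEquals(double a, double b, double tolerance) {
        return Math.abs(a - b) < tolerance;
    }

    public static void main(String[] args) {
        double sum = 0.1 + 0.2;
        System.out.println(sum == 0.3);                                // false
        System.out.println(approxEquals(sum, 0.3, ACCEPTABLE_ERROR)); // true
    }
}
```

An absolute tolerance like this works poorly for very large or very small values, which is where relative or ULP-based comparisons come in.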
Here is a very long (but hopefully useful) discussion about this and many other floating point issues you may encounter: What Every Computer Scientist Should Know About Floating-Point Arithmetic
In addition to previous answers, you should be aware that there are strange behaviours associated with -0.0f and +0.0f (they are == but not equals) and Float.NaN (it is equals but not ==) (hope I've got that right - argh, don't do it!).
Edit: Let's check!
import static java.lang.Float.NaN;
public class Fl {
public static void main(String[] args) {
System.err.println( -0.0f == 0.0f); // true
System.err.println(new Float(-0.0f).equals(new Float(0.0f))); // false
System.err.println( NaN == NaN); // false
System.err.println(new Float( NaN).equals(new Float( NaN))); // true
}
}
Welcome to IEEE/754.
First of all, are they float or Float? If one of them is a Float, you should use the equals() method. Also, probably best to use the static Float.compare method.
You can use Float.floatToIntBits().
Float.floatToIntBits(sectionID) == Float.floatToIntBits(currentSectionID)
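Note that comparing bit patterns is not the same as ==. A small sketch (the class BitsCompare and the helper sameBits are hypothetical names) showing where the two differ:

```java
class BitsCompare {
    // Equal only if both floats have the same bit pattern
    // (floatToIntBits collapses all NaN values to one canonical pattern).
    static boolean sameBits(float a, float b) {
        return Float.floatToIntBits(a) == Float.floatToIntBits(b);
    }

    public static void main(String[] args) {
        System.out.println(sameBits(4.7f, Float.parseFloat("4.7"))); // true
        System.out.println(sameBits(Float.NaN, Float.NaN));          // true, although NaN == NaN is false
        System.out.println(sameBits(-0.0f, 0.0f));                   // false, although -0.0f == 0.0f is true
    }
}
```

This matches the behaviour of Float.equals, so it suits identity-style checks such as IDs rather than numeric comparison.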
The following automatically uses the best precision:
/**
* Compare two floats for (almost) equality. Will check whether they are
* at most 5 ULP apart.
*/
public static boolean isFloatingEqual(float v1, float v2) {
if (v1 == v2)
return true;
float absoluteDifference = Math.abs(v1 - v2);
float maxUlp = Math.max(Math.ulp(v1), Math.ulp(v2));
return absoluteDifference < 5 * maxUlp;
}
Of course, you might choose more or less than 5 ULPs (‘unit in the last place’).
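To get a feel for how large one ULP is, here is a short usage sketch. The class name UlpDemo is made up, and the method is repeated only so the example runs on its own.

```java
class UlpDemo {
    // Same check as above: equal if less than 5 ULPs apart.
    static boolean isFloatingEqual(float v1, float v2) {
        if (v1 == v2)
            return true;
        float absoluteDifference = Math.abs(v1 - v2);
        float maxUlp = Math.max(Math.ulp(v1), Math.ulp(v2));
        return absoluteDifference < 5 * maxUlp;
    }

    public static void main(String[] args) {
        // The size of one ULP grows with the magnitude of the value.
        System.out.println(Math.ulp(1.0f));    // 1.1920929E-7
        System.out.println(Math.ulp(1.0e10f)); // 1024.0

        // The next representable float above 1.0f is 1 ULP away.
        float next = Math.nextUp(1.0f);
        System.out.println(isFloatingEqual(1.0f, next)); // true

        // 8 ULPs away is outside the 5 ULP window.
        float far = 1.0f + 8 * Math.ulp(1.0f);
        System.out.println(isFloatingEqual(1.0f, far));  // false
    }
}
```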
If you’re into the Apache Commons library, the Precision class has compareTo() and equals() with both epsilon and ULP.
You may want it to be ==, but 123.4444444444443 != 123.4444444444442.
If you *have to* use floats, strictfp keyword may be useful.
http://en.wikipedia.org/wiki/strictfp
Two different calculations which produce equal real numbers do not necessarily produce equal floating point numbers. People who use == to compare the results of calculations usually end up being surprised by this, so the warning helps flag what might otherwise be a subtle and difficult to reproduce bug.
Are you dealing with outsourced code that would use floats for things named sectionID and currentSectionID? Just curious.
@Bill K: "The binary representation of a float is kind of annoying." How so? How would you do it better? There are certain numbers that cannot be represented in any base properly, because they never end. Pi is a good example. You can only approximate it. If you have a better solution, contact Intel.
As mentioned in other answers, doubles can have small deviations. And you could write your own method to compare them using an "acceptable" deviation. However ...
There is an apache class for comparing doubles: org.apache.commons.math3.util.Precision
It contains some interesting constants: EPSILON, the largest double such that 1 + EPSILON is numerically equal to 1 (an upper bound on the relative rounding error), and SAFE_MIN, the smallest value such that 1 / SAFE_MIN does not overflow.
It also provides the necessary methods to compare, test for equality, or round doubles (using ulps or absolute deviation).
As a one-line answer, I can say you should use:
Float.floatToIntBits(sectionID) == Float.floatToIntBits(currentSectionID)
To help you learn more about using the related operators correctly, I am elaborating on some cases here:
Generally, there are three ways to test strings in Java. You can use ==, .equals(), or Objects.equals().
How are they different? == tests for reference equality, meaning it finds out whether the two variables refer to the same object. On the other hand, .equals() tests whether the two strings are logically equal in value. Finally, Objects.equals() first checks both arguments for null and only then decides whether to call .equals().
Ideal operator to use
Well, this has been the subject of lots of debate because each of the three operators has its own set of strengths and weaknesses. For example, == is often the preferred option when comparing object references, but there are cases where it may seem to compare string values as well.
However, the result is misleading: Java creates the illusion that you are comparing values when in fact you are comparing references. Consider the two cases below:
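Case 1:
String a = "Test";
String b = "Test";
// true, but only because both literals refer to the same interned object
boolean sameReference = (a == b);
Case 2:
String nullString1 = null;
String nullString2 = null;
// evaluates to true
boolean bothNull = (nullString1 == nullString2);
// throws a NullPointerException
nullString1.equals(nullString2);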
So, it's much better to use each operator for the specific attribute it's designed to test. But in almost all cases, Objects.equals() is the more universal choice, which is why experienced web developers opt for it.
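A short sketch of the three comparisons side by side. The class StringCompare and the helper sameValue are hypothetical names used only for this illustration.

```java
import java.util.Objects;

class StringCompare {
    // Null-safe value comparison; delegates to Objects.equals.
    static boolean sameValue(String a, String b) {
        return Objects.equals(a, b);
    }

    public static void main(String[] args) {
        String a = "Test";
        String b = "Test";
        String c = new String("Test"); // a distinct object with the same contents

        System.out.println(a == b);      // true: both literals are the same interned object
        System.out.println(a == c);      // false: different objects
        System.out.println(a.equals(c)); // true: same characters

        System.out.println(sameValue(null, null)); // true, no exception
        System.out.println(sameValue(null, a));    // false, no exception
    }
}
```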
Here you can get more details: http://fluentthemes.com/use-compare-strings-java/
The correct way would be
java.lang.Float.compare(float1, float2)
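Keep in mind that Float.compare defines a total ordering, so it treats the special values differently from ==. A brief sketch, with FloatCompareDemo and sameByCompare as made-up names:

```java
class FloatCompareDemo {
    // True when Float.compare considers the two values equal.
    static boolean sameByCompare(float a, float b) {
        return Float.compare(a, b) == 0;
    }

    public static void main(String[] args) {
        System.out.println(sameByCompare(4.7f, 4.7f));           // true
        System.out.println(sameByCompare(Float.NaN, Float.NaN)); // true, unlike NaN == NaN
        System.out.println(sameByCompare(-0.0f, 0.0f));          // false, unlike -0.0f == 0.0f
        System.out.println(Float.compare(-0.0f, 0.0f) < 0);      // true: -0.0f sorts before 0.0f
    }
}
```

It does not add any tolerance, so results of two different calculations can still compare as unequal.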
One way to reduce rounding error is to use double rather than float. This won't make the problem go away, but it does reduce the amount of error in your program and float is almost never the best choice. IMHO.
