I get this suggestion from IntelliJ IDEA when using the @Data annotation from Lombok.
The class in question is an @Entity.
Can someone explain:
what does it do exactly (especially the part with Hibernate)
Is this method preferred over comparing every field one-by-one? If yes, why?
@Override
public boolean equals(Object o) {
    if (this == o)
        return true;
    if (o == null || Hibernate.getClass(this) != Hibernate.getClass(o))
        return false;
    MyObject that = (MyObject) o;
    return id != null && Objects.equals(id, that.id);
}
The project contains/uses Spring boot, Hibernate, Lombok.
Thank you
There's a fundamental problem at work, one inherent to JPA/Hibernate. For this example, let's say we have a db table named User, and we have a class also named User that models it.
The problem boils down to simply this:
What does the java class User represent? Does it represent 'a row in the database table "User"', or does it represent a User?
Depending on your answer, you get a wildly different requirement for the equals method, and choosing an equals implementation that doesn't match your answer leads to bugs. As far as I know, there is no actual 'standard'; people just sort of do something, and most aren't aware that this is a fundamental problem.
It represents a row in the DB
Such an interpretation would then suggest the following implementation of your equals method:
If all fields that model the primary key columns in the DB definition are equal between the two instances, then they are equal, even if the other (non-primary-key) fields are different. After all, that's how the DB determines equality, so java code should match it.
The java code should be like SQL when dealing with NULLs. That is to say, quite unlike just about every equality definition, equals method code generator (including lombok, intellij, and eclipse), and even the Objects.equals method, in this mode, null == null should be FALSE, as it is in SQL! Specifically, if any of the primary key fields have a null value, that object cannot be equal to any other, even a carbon copy of itself; to stick to java rules, it can (must, really) be equal to its own reference.
In other words:
Any 2 objects are equal if either [A] they are literally the same object (this == other), or [B] both objects' unid fields are initialized and equal. Whether you use null or 0 to track 'not written to DB yet', that value instantly disqualifies that row from being equal to any other, even another one with 100% identical values.
After all, if you make 2 separate new objects and save() them both, they would turn into 2 separate rows.
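The 'row in the DB' rules above can be sketched roughly as follows. This is only an illustration under assumptions: UserRow is a hypothetical stand-in for an entity, and because the sketch has no Hibernate dependency it uses a plain getClass() check where the IntelliJ suggestion uses Hibernate.getClass().

```java
// Hypothetical stand-in for a JPA entity; the class and field names are assumptions.
class UserRow {
    private final Long id;          // null means "not written to the DB yet"
    private final String username;

    UserRow(Long id, String username) {
        this.id = id;
        this.username = username;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;                                 // same reference: always equal
        if (o == null || getClass() != o.getClass()) return false;  // exact same type only
        UserRow that = (UserRow) o;
        // SQL-like null handling: an unsaved row equals nothing but itself.
        return id != null && id.equals(that.id);
    }

    @Override
    public int hashCode() {
        // Constant per class, so the hash stays stable when save() assigns an id.
        return getClass().hashCode();
    }

    public static void main(String[] args) {
        UserRow unsavedA = new UserRow(null, "alice");
        UserRow unsavedB = new UserRow(null, "alice");
        System.out.println(unsavedA.equals(unsavedB)); // false
        System.out.println(unsavedA.equals(unsavedA)); // true
        System.out.println(new UserRow(1L, "alice").equals(new UserRow(1L, "bob"))); // true
    }
}
```

Note that the non-key field (username) plays no part in the comparison, which matches how the DB determines row identity.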
It represents a user object
Then what happens is that the equals rules do a 180. The primary key, assuming it's a unid-style primary key and not a natural primary key, is inherently an implementation detail. Imagine that somehow in your DB you end up with 2 rows for the exact same user (presumably somebody messed up and failed to add a UNIQUE constraint on username, perhaps). In the semantic model of users on the system, users are uniquely identified by their username; therefore, equality is defined by username alone. 2 objects with identical username but different unid values are nevertheless equal.
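For contrast, a minimal sketch of the 'represents a user' interpretation, where the surrogate key is ignored and only the username counts. The class below is hypothetical and not taken from the question.

```java
import java.util.Objects;

// Hypothetical domain class: equality follows the semantic identity (username).
class User {
    private final Long unid;        // surrogate key, treated as an implementation detail
    private final String username;

    User(Long unid, String username) {
        this.unid = unid;
        this.username = username;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (o == null || getClass() != o.getClass()) return false;
        return Objects.equals(username, ((User) o).username);
    }

    @Override
    public int hashCode() {
        return Objects.hashCode(username);
    }

    public static void main(String[] args) {
        // Two rows for the same user: different unid, same username.
        System.out.println(new User(1L, "alice").equals(new User(2L, "alice"))); // true
        System.out.println(new User(1L, "alice").equals(new User(1L, "bob")));   // false
    }
}
```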
So which one do I take?
I have no idea. Fortunately, your question asked for explanation and not an answer!
What IntelliJ is telling you is to go with the first interpretation (row in the DB), and it even applies the wonky null stuff correctly, so whoever wrote the suggestion tool in intellij at least seems to understand what's going on.
For what it's worth, I think 'represents a row in the DB' is the more 'useful' interpretation (because not doing this involves invoking getters which make equality checks incredibly pricey, as it may result in hundreds of SELECT calls and a gigantic bunch of heap mem as you pull half the DB in!). However, 'an instance of class User represents a user in the system' is the more java-like interpretation and the one that most java programmers would (erroneously then, if you use intellij's suggestion here) silently presume.
I've solved this problem in my own programming endeavours by never using hibernate/JPA in the first place, and using tools like JOOQ or JDBI instead. But, the downside is that generally you end up with more code – you really do sometimes have an object, e.g. called UserRow, representing a user row, and an object e.g. called User that represents a user on-system.
Another trick could be to decide to name all your Hibernate model classes as XRow. Names are important and the best documentation around: This makes no bones about it and clues in all users of this code about how they are to interpret its semantic meaning: Row in DB. Thus, the intellij suggestion would then be your equals implementation.
NB: Lombok is java and not Hibernate specific, so it makes the 'represents a user in the system' choice. You can try to push lombok towards the 'row in DB' interpretation by telling lombok to only use the id field (stick an @EqualsAndHashCode.Include on that field), but lombok would still consider 2 null values / 2 0 values identical even though it shouldn't. This is on hibernate, as it is breaking all sorts of rules and specs.
(NB: Added due to a comment on another answer)
Why is .getClass() being invoked?
Java has sensible rules about what equals is supposed to mean. This is in the javadoc of the equals method and these rules can be relied upon (and are, by e.g. HashSet and co). The rules are:
If a.equals(b) is true, a.hashCode() == b.hashCode() must also be true.
a.equals(a) must be true.
If a.equals(b) then b.equals(a) must also be true.
If a.equals(b) and b.equals(c) then a.equals(c) must also be true.
Sensible and simple, right?
Nope. That's actually really complex.
Imagine you make a subclass of ArrayList: You decide to give lists a colour. You can have a blue list of strings and a red list of strings.
Right now the equals method of ArrayList checks whether that (the other object) is a list and, if so, compares elements. Seems sensible, right? We can see it in action:
List<String> a = new ArrayList<String>();
a.add("Hello");
List<String> b = new LinkedList<String>();
b.add("Hello");
System.out.println(a.equals(b));
This prints true.
Let's now make our coloured arraylist implementation: class ColoredList<T> extends ArrayList<T> { .. }. Surely, a red empty list is no longer equal to a blue empty list right?
Nope, you'd be breaking rules if you do that!
List<String> a = new ArrayList<String>();
List<String> b = new ColoredList<String>(Color.RED);
List<String> c = new ColoredList<String>(Color.BLUE);
System.out.println(a.equals(b));
System.out.println(a.equals(c));
System.out.println(b.equals(c));
That prints true/true/false which is invalid. The conclusion is that it is in fact impossible to make any list subclass that adds some semantically relevant information. The only subclasses that can exist are those which either actively break spec (bad idea), or whose additions have no impact on equality.
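To make the rule break concrete, here is one possible ColoredList, written only as a sketch. The class does not exist in the JDK, and the colour is a plain String here rather than a Color type; both are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical subclass from the text above.
class ColoredList<T> extends ArrayList<T> {
    private final String color;

    ColoredList(String color) {
        this.color = color;
    }

    @Override
    public boolean equals(Object o) {
        if (o instanceof ColoredList) {
            // Two coloured lists: colour and elements must both match.
            ColoredList<?> that = (ColoredList<?>) o;
            return color.equals(that.color) && super.equals(o);
        }
        // Against a plain list, fall back to element comparison to stay symmetric.
        return super.equals(o);
    }

    @Override
    public int hashCode() {
        return super.hashCode(); // element-based, so equal lists still share a hash
    }

    public static void main(String[] args) {
        List<String> a = new ArrayList<>();
        List<String> b = new ColoredList<>("RED");
        List<String> c = new ColoredList<>("BLUE");
        System.out.println(a.equals(b)); // true
        System.out.println(a.equals(c)); // true
        System.out.println(b.equals(c)); // false: b equals a, a equals c, yet b does not equal c
    }
}
```

Symmetry survives in this sketch, but transitivity does not, which is exactly the conflict described above.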
There is a different view of things which says that you ought to be able to make such classes. Again we're struggling, just like with the JPA/Hibernate case, about what equals is even supposed to mean.
A more common and far better default behaviour for your equals implementations is to simply state that any 2 objects can only be equal if they are of the exact same type: An instance of Dog cannot be equal to an instance of Animal.
The only way to accomplish this, given that the symmetry rule (if a.equals(b), then b.equals(a)) exists, is for Animal's equals to check the class of that and return false if it isn't exactly Animal. In other words:
Animal a = new Animal("Betsy");
Cow c = new Cow("Betsy");
a.equals(c); // must return false!!
The .getClass() check accomplishes this.
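A minimal sketch of such a getClass()-based equals follows. Animal and Cow are the hypothetical classes from the snippet above, and the name field is an assumption.

```java
import java.util.Objects;

class Animal {
    private final String name;

    Animal(String name) {
        this.name = name;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        // Exact-type check: a Cow is never equal to a plain Animal, in either direction.
        if (o == null || getClass() != o.getClass()) return false;
        return Objects.equals(name, ((Animal) o).name);
    }

    @Override
    public int hashCode() {
        return Objects.hashCode(name);
    }

    public static void main(String[] args) {
        Animal a = new Animal("Betsy");
        Cow c = new Cow("Betsy");
        System.out.println(a.equals(c));                   // false
        System.out.println(c.equals(a));                   // false
        System.out.println(a.equals(new Animal("Betsy"))); // true
    }
}

class Cow extends Animal {
    Cow(String name) {
        super(name);
    }
}
```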
Lombok gives you the best of both worlds. It can't perform miracles, so it won't take away the rule that at the type level you need to choose extensibility, but lombok has the canEqual system to deal with this: The equals code of Animal will ask the that code if the two things can be equal. In this mode, if you have some non-semantically-different subclass of animal (such as ArrayList, which is a subclass of AbstractList and doesn't change the semantics at all, it just adds implementation details that have no bearing on equality), it can say that it can be equal, whereas if you have one that is semantically different, such as your coloured list, it can say that none are.
In other words, going back to the coloured lists, IF ArrayList and co were written with lombok's canEqual system, this could have worked out, and you could have had these results (where a is an arraylist, b is a red list, and c is a blue list):
a.equals(b); // false, even though same items
a.equals(c); // false, same reason.
b.equals(c); // false and now it's not a conflict.
Lombok's default behaviour is that all subtypes add semantic load and therefore any X cannot be equal to any Y where Y is a subclass of X, but you can override this by writing out the canEqual method in Y. You would do that if you write a subclass that doesn't add semantic load.
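Written out by hand, the canEqual idea looks roughly like the sketch below. This is not Lombok's actual generated code, only the general pattern; Item and ColoredItem are hypothetical classes.

```java
import java.util.Objects;

class Item {
    private final String name;

    Item(String name) {
        this.name = name;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Item)) return false;
        Item that = (Item) o;
        // Ask the other side whether it is willing to be equal to us.
        if (!that.canEqual(this)) return false;
        return Objects.equals(name, that.name);
    }

    protected boolean canEqual(Object other) {
        return other instanceof Item;
    }

    @Override
    public int hashCode() {
        return Objects.hashCode(name);
    }
}

// Adds semantic load (a colour), so it refuses equality with plain Items.
class ColoredItem extends Item {
    private final String color;

    ColoredItem(String name, String color) {
        super(name);
        this.color = color;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof ColoredItem)) return false;
        ColoredItem that = (ColoredItem) o;
        if (!that.canEqual(this)) return false;
        return super.equals(o) && color.equals(that.color);
    }

    @Override
    protected boolean canEqual(Object other) {
        return other instanceof ColoredItem;
    }

    @Override
    public int hashCode() {
        return 31 * super.hashCode() + color.hashCode();
    }
}
```

With these definitions, a plain Item and a ColoredItem are unequal in both directions, while a subclass that overrides neither equals nor canEqual (for example an anonymous `new Item("x") {}`) remains equal to a plain Item with the same name.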
This isn't going to help you in the slightest with the problems above about hibernate.
Who knew something as seemingly simple as equality is hiding 2 intractably difficult philosophical treatises, huh?
For more info on canEqual, see lombok's @EqualsAndHashCode documentation.
I'm not trying to undermine ~rzwitserloot 's excellent answer, just trying to help you figure out why it uses Hibernate.getClass(this) for you instead of this.getClass().
It doesn't do it for me, but I don't have Hibernate in my project anyway.
The code is generated using velocity macros as seen here:
The IntelliJ default uses a file 'equalsHelper.vm'. I found a possible source of that file version at https://github.com/JetBrains/intellij-community/blob/master/java/java-impl/src/com/intellij/codeInsight/generation/equalsHelper.vm
It contains this:
#macro(addInstanceOfToText)
#if ($checkParameterWithInstanceof)
if(!($paramName instanceof $classname)) return false;
#else
if($paramName == null || getClass() != ${paramName}.getClass()) return false;
#end
#end
So apparently you have a different version of that file? Or you use a different template? Maybe some plugin changed it?
Two objects are not equal if they are of different class.
For 'preferred', it depends on what an 'id' is. The last line seems a little redundant. Dropping the null check and writing
return Objects.equals(id, that.id);
would not be equivalent, though, because Objects.equals(null, null) returns true, so two objects whose id is still null would compare equal. To my taste, it's clearer to write
return id != null && id.equals(that.id);
The extra layer adds nothing that I can see in the example.
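One subtlety worth checking when simplifying that last line is what happens when both ids are null. The sketch below compares the guarded and unguarded forms; the class and method names are made up for illustration.

```java
import java.util.Objects;

class NullIdCheck {
    // The form from the question: a null id is never equal to anything.
    static boolean guarded(Long id, Long otherId) {
        return id != null && Objects.equals(id, otherId);
    }

    // Without the guard, two null ids compare equal.
    static boolean unguarded(Long id, Long otherId) {
        return Objects.equals(id, otherId);
    }

    public static void main(String[] args) {
        System.out.println(guarded(null, null));   // false
        System.out.println(unguarded(null, null)); // true
        System.out.println(guarded(1L, 1L));       // true
    }
}
```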
I'm seeking some clarification to the definition of Value-based Classes. I can't imagine, how is the last bullet point (6) supposed to work together with the first one
(1) they are final and immutable (though may contain references to mutable objects)
(6) they are freely substitutable when equal, meaning that interchanging any two instances x and y that are equal according to equals() in any computation or method invocation should produce no visible change in behavior.
Optional is such a class.
Optional a = Optional.of(new ArrayList<String>());
Optional b = Optional.of(new ArrayList<String>());
assertEquals(a, b); // passes as `equals` delegated to the lists
b.get().add("a");
// now bite the last bullet
assertTrue(a.get().isEmpty()); // passes
assertTrue(b.get().isEmpty()); // throws
Am I reading it incorrectly, or would it need to get more precise?
Update
The answer by Eran makes sense (they are no more equal), but let me move the target:
...
assertEquals(a, b); // now, they are still equal
assertEquals(m(a, b), m(a, a)); // this will throw
assertEquals(a, b); // now, they are equal, too
Let's define a funny method m, which does some mutation and undoes it again:
int m(Optional<ArrayList<String>> x, Optional<ArrayList<String>> y) {
x.get().add("");
int result = x.get().size() + y.get().size();
x.get().remove(x.get().size() - 1);
return result;
}
It's a strange method, I know. But I guess it qualifies as "any computation or method invocation", doesn't it?
they are freely substitutable when equal, meaning that interchanging any two instances x and y that are equal according to equals() in any computation or method invocation should produce no visible change in behavior
Once b.get().add("a"); is executed, a is no longer equal to b, so you have no reason to expect that assertTrue(a.get().isEmpty()); and assertTrue(b.get().isEmpty()); would produce the same result.
The fact that a value based class is immutable doesn't mean you can't mutate the values stored in instances of such classes (as stated in though may contain references to mutable objects). It only means that once you create an Optional instance with Optional a = Optional.of(new ArrayList<String>()), you can't mutate a to hold a reference to a different ArrayList.
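A small sketch of that distinction: the Optional itself never changes which list it holds, but the list it refers to can still be mutated, and that changes the outcome of equals. The class name is arbitrary.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

class OptionalContents {
    public static void main(String[] args) {
        Optional<List<String>> a = Optional.of(new ArrayList<>());
        Optional<List<String>> b = Optional.of(new ArrayList<>());
        List<String> heldByB = b.get();

        System.out.println(a.equals(b));         // true: both wrap empty lists
        b.get().add("a");                        // mutates the list, not the Optional
        System.out.println(a.equals(b));         // false: the contents now differ
        System.out.println(b.get() == heldByB);  // true: b still holds the same list
    }
}
```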
You can derive the invalidity of your actions from the specification you’re referring to:
A program may produce unpredictable results if it attempts to distinguish two references to equal values of a value-based class, whether directly via reference equality or indirectly via an appeal to synchronization, identity hashing, serialization, or any other identity-sensitive mechanism. Use of such identity-sensitive operations on instances of value-based classes may have unpredictable effects and should be avoided.
(emphasis mine)
Modifying an object is an identity-sensitive operation, as it only affects the object with the specific identity represented by the reference you are using for the modification.
When you call x.get().add(""), you are performing an operation that makes it possible to recognize whether x and y represent the same instance; in other words, you are performing an identity-sensitive operation.
Still, I expect that if a future JVM truly tries to substitute value-based instances, it has to exclude instances referring to mutable objects, to ensure compatibility. If you perform an operation that produces an Optional followed by extracting the Optional, e.g. … stream.findAny().get(), it would be disastrous/unacceptable if the intermediate operation were allowed to substitute the element with another object that happened to be equal at the point of the intermediate Optional use (if the element is not itself a value type)…
I think a more interesting example is as follows:
void foo() {
List<String> list = new ArrayList<>();
Optional<List<String>> a = Optional.of(list);
Optional<List<String>> b = Optional.of(list);
bar(a, b);
}
It's clear that a.equals(b) is true. Furthermore, since Optional is final (cannot be subclassed), immutable, and both a and b refer to the same list, a.equals(b) will always be true. (Well, almost always, subject to race conditions where another thread is modifying the list while this one is comparing them.) Thus, this seems like it would be a case where it would be possible for the JVM to substitute b for a or vice-versa.
As things stand today (Java 8 and 9 and 10) we can write a == b and the result will be false. The reason is that we know that Optional is an instance of an ordinary reference type, and the way things are currently implemented, Optional.of(x) will always return a new instance, and two new instances are never == to each other.
However, the paragraph at the bottom of the value-based classes definition says:
A program may produce unpredictable results if it attempts to distinguish two references to equal values of a value-based class, whether directly via reference equality or indirectly via an appeal to synchronization, identity hashing, serialization, or any other identity-sensitive mechanism. Use of such identity-sensitive operations on instances of value-based classes may have unpredictable effects and should be avoided.
In other words, "don't do that," or at least, don't rely on the result. The reason is that tomorrow the semantics of the == operation might change. In a hypothetical future value-typed world, == might be redefined for value types to be the same as equals, and Optional might change from being a value-based class to being a value type. If this happens, then a == b will be true instead of false.
One of the main ideas about value types is that they have no notion of identity (or perhaps their identity isn't detectable to Java programs). In such a world, how could we tell whether a and b "really" are the same or different?
Suppose we were to instrument the bar method via some means (say, a debugger) such that we can inspect the attributes of the parameter values in a way that can't be done through the programming language, such as by looking at machine addresses. Even if a == b is true (remember, in a value-typed world, == is the same as equals) we might be able to ascertain that a and b reside at different addresses in memory.
Now suppose the JIT compiler compiles foo and inlines the calls to Optional.of. Seeing that there are now two hunks of code that return two results that are always equals, the compiler eliminates one of the hunks and then uses the same result wherever a or b is used. Now, in our instrumented version of bar, we might observe that the two parameter values are the same. The JIT compiler is allowed to do this because of the sixth bullet item, which allows substitution of values that are equals.
Note that we're only able to observe this difference because we're using an extra-linguistic mechanism such as a debugger. Within the Java programming language, we can't tell the difference at all, and thus this substitution can't affect the result of any Java program. This lets the JVM choose any implementation strategy it sees fit. The JVM is free to allocate a and b on the heap, on the stack, one on each, as distinct instances, or as the same instances, as long as Java programs can't tell the difference. When the JVM is granted freedom of implementation choices, it can make programs go a lot faster.
That's the point of the sixth bullet item.
When you execute the lines:
Optional a = Optional.of(new ArrayList<String>());
Optional b = Optional.of(new ArrayList<String>());
assertEquals(a, b); // passes as `equals` delegated to the lists
In assertEquals(a, b), according to the API, Optional.equals considers the two equal if:
the other object is also an Optional, and
both instances have no value present, or
the present values are "equal to" each other via equals() (in your example this equals is the one from ArrayList).
So, when you change one of the ArrayLists that the Optional instances point to, the assert fails on the third point.
Point 6 says that if a and b are equal then they can be used interchangeably: if a method expects two instances of class A and you have created instances a and b, then, as long as a and b satisfy point 6, you may pass (a, a), (b, b), or (a, b), and all three will give the same output.
In Java, if one is to check if two Strings are equal, in the sense that their values are the same, he/she needs to use the equals method. E.g. :
String foo = "foo";
String bar = "bar";
if(foo.equals(bar)) { /* do stuff */ }
And if one wants to check for reference equality he needs to use the == operator on the two strings.
if( foo == bar ) { /* do stuff */ }
So my question is: does the == operator have its use for the String class? Why would one want to compare String references?
Edit:
What I am not asking : How to compare strings ? How does the == work ? How does the equals method work?
What I am asking is what uses does the == operator have for String class in Java ? What is the justification of not overloading it, so that it does a deep comparison ?
Imagine a thread-safe BlockingQueue<String> acting as a communication channel between a producer thread and a consumer thread. It seems perfectly reasonable to use a special String to indicate termination.
// Deliberate use of `new` to make sure JVM does not re-use a cached "EOT".
private static final String EOT = new String("EOT");
...
// Signal we're done.
queue.put(EOT);
// Meanwhile at the consumer end of the queue.
String got = queue.take(); // blocks until an element is available (a BlockingQueue has no get())
if ( got == EOT ) {
// Tidy shutdown
}
Note that this would be resilient to:
queue.put("EOT");
because "EOT" != EOT even though "EOT".equals(EOT) would be true.
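Put together as something runnable, the sentinel idea might look like the sketch below. It runs on a single thread for brevity, and the class name, the consume helper, and the choice of LinkedBlockingQueue are assumptions rather than part of the original answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class SentinelDemo {
    // Deliberate use of `new` so the sentinel is a distinct object from the literal "EOT".
    static final String EOT = new String("EOT");

    // Collects elements until the sentinel reference itself is seen.
    static List<String> consume(BlockingQueue<String> queue) throws InterruptedException {
        List<String> received = new ArrayList<>();
        while (true) {
            String got = queue.take();
            if (got == EOT) {      // identity check, on purpose
                return received;
            }
            received.add(got);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        queue.put("hello");
        queue.put("EOT");   // ordinary payload that happens to spell EOT
        queue.put(EOT);     // the real terminator
        System.out.println(consume(queue)); // [hello, EOT]
    }
}
```

Had the consumer used equals instead of ==, the payload "EOT" would have ended the loop early.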
What use is there for it? Not much in normal practice but you can always write a class that operates on intern()-ed strings, which can then use == to compare them.
Why it isn't overloaded is a simpler question: because there is no operator overloading in Java. (To mess things up a bit, the + operator IS sort of overloaded for strings, which was done to make string operations slightly less cumbersome. But you can argue that's just syntactic sugar and there certainly is no operator overloading in Java on the bytecode level.)
The lack of an overloaded == operator made the use of the operator much less ambiguous, at least for reference types. (That is, until the point autoboxing/unboxing was introduced, which muddies the waters again, but that's another story.) It also allows you to have classes like IdentityHashMap that will behave the same way for every object you put into it.
Having said all that, the decision to avoid operator overloading (where possible) was a fairly arbitrary design choice.
The == operator compares the references of two objects. For example, if String x and String y refer to two different objects, then the == operator will evaluate to false. The String.equals() method, however, compares not whether they refer to the same object, but whether the values (e.g. "Hello", "World", etc.) are the same.
// A.java
String foo1 = "foo";
// B.java
String bar1 = "foo";
All String literals resolved at compile time are added to the String constant pool. So when you have two String declarations with the same literal in two different classes, two String objects will not be created; both foo1 and bar1 refer to the same String instance with the value foo. Now that you have the same String reference in two different variables, you can check whether those two strings are equal just by using ==, which is fast because all it does is compare the bit pattern, whereas the equals() method compares each character and is generally used for two different String instances with the same content.
In fact, if you look at equals() implementation in String class, the first check they do is Reference comparison using == because they might seem as different instances to you, but if they're String literals or if they're interned by someone else already, then all you have is a Single reference in two variables.
public boolean equals(Object anObject) {
if (this == anObject) {
return true;
}
// remaining code
}
Also, == is not just for Strings, it's used to compare any two bit patterns, be it primitives or references
1. The == operator compares whether the values of two variables are equal. For reference-type variables, this means checking whether both variables hold the same heap address, i.e. whether the references stored on the stack are identical.
2. The equals method (as implemented by String) checks whether the objects the two variables refer to have the same content.
String s = "string1"; // creates 1 reference and 1 object in the pool
String s1 = "string1"; // creates just 1 reference, pointing to what s is pointing to
s == s1 // true
String s2 = new String("string1"); // creates 1 new object on the heap and one reference (the literal is already in the pool)
// Since s2 is pointing to a different object:
s2 == s // false
s2 == s1 // false
Problem :
Suppose we want to check how many unique String objects are created by the application while it is running.
We can have a singleton object which stores all the String references in an array.
From the previous examples of s, s1 and s2: for s and s1, 1 object is created, and for s2, 1 more object (2 in total).
// If we use the equals method, every comparison gives true:
s.equals(s1) // true
s1.equals(s2) // true
// so equals cannot tell the objects apart. Counting the references stored in the
// singleton's array instead gives 3, which doesn't match the actual object count of 2.
We can use == to check for equality of reference: if the references are equal, we do not increment our count of unique String objects, and for every non-equal result, we increment the count.
Here:
s // count = 1
s1 == s // true, so count remains the same
s2 == s // false, so count = 1 + 1 = 2
// Using ==, we get the total number of unique String objects created
Simple answer...
Why would one want to compare String references ?
Because they want to compare String values in a very fast way.
Strings are not always interned. String constants are, but it is possible that the string was created manually on the heap. Calling intern() on a manually created string allows us to continue using reference comparison on our strings for value comparison.
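A short sketch of how intern() interacts with ==. The variable names are arbitrary.

```java
class InternDemo {
    public static void main(String[] args) {
        String literal = "hello";              // literals are interned automatically
        String built = new String("hello");    // a distinct object on the heap

        System.out.println(literal == built);           // false: different objects
        System.out.println(literal.equals(built));      // true: same characters
        System.out.println(literal == built.intern());  // true: intern() returns the pooled instance
    }
}
```

Reference comparison only doubles as value comparison when every string involved has gone through the pool, so a single non-interned string silently breaks it.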
What is the justification of not overloading it, so that it does a deep comparison ?
Because Java, as a design decision, does not have operator overloading.
Operator '==' is a reference operator always, and equals() is a value method always. In C++ you can change that, but many feel that simply obfuscates the code.
Checking references is faster than checking the entire Strings' equality.
Assume you have large Strings (URLs or DBMS queries) and have multiple references to them. To check if they are equal, you can either check character by character or check whether they both refer to the same object.
In fact, the equals method in Java first checks whether the references are the same, and only if they are not goes ahead and checks character by character.
Java is full of references, and hence you might have a case where you need to check whether two variables refer to the same String/Object, rather than each having its own copy of the same String, so that you can update the string in one place and have it reflected in all variables.
To do so, the equals method does not help, as it considers copies equal as well. You need to check whether they both refer to the same object, and hence == comes into the picture.
It seems that this was asked before and received quite a popular answer here:
Why didn't == operator string value comparison make it to Java?
The simple answer is: consistency
I guess it's just consistency, or "principle of least astonishment".
String is an object, so it would be surprising if it was treated
differently than other objects.
Although this is not the fundamental reason, one use could be to improve performance: before executing a heavy computation, "internalize" your Strings (intern()) and use only == for comparisons.
What I am asking is what uses does the == operator have for String class in Java ?
What is the justification of not overloading it, so that it does a deep comparison ?
== and equals have altogether different uses.
== confirms whether there is reference equality.
equals confirms whether the objects' contents are the same.
An example of reference equality is IdentityHashMap.
There could be a case in which only the object that inserted something into an IdentityHashMap has the right to get/remove it.
Overloading reference equality can lead to unwanted complexity for Java,
for example
if (string)
{
do deep equality
}
else
{
do reference-equality
}
/*****************************************************************/
public class IdentityHashMap<K,V> extends AbstractMap<K,V> implements Map<K,V>, Serializable, Cloneable
This class implements the Map interface with a hash table, using reference-equality in place of object-equality when comparing keys (and values). In other words, in an IdentityHashMap, two keys k1 and k2 are considered equal if and only if (k1==k2). (In normal Map implementations (like HashMap) two keys k1 and k2 are considered equal if and only if (k1==null ? k2==null : k1.equals(k2)).)
This class is not a general-purpose Map implementation! While this class implements the Map interface, it intentionally violates Map's general contract, which mandates the use of the equals method when comparing objects. This class is designed for use only in the rare cases wherein reference-equality semantics are required.
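The difference described in that documentation can be seen with two keys that are equal by content but are distinct objects. This is a small sketch with arbitrary names.

```java
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;

class IdentityMapDemo {
    public static void main(String[] args) {
        String k1 = new String("key");
        String k2 = new String("key"); // equal content, different object

        Map<String, Integer> byValue = new HashMap<>();
        byValue.put(k1, 1);
        byValue.put(k2, 2);            // replaces the first entry: k1.equals(k2)

        Map<String, Integer> byIdentity = new IdentityHashMap<>();
        byIdentity.put(k1, 1);
        byIdentity.put(k2, 2);         // separate entry: k1 != k2

        System.out.println(byValue.size());    // 1
        System.out.println(byIdentity.size()); // 2
    }
}
```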
I commonly find myself writing code like this:
private List<Foo> fooList = new ArrayList<Foo>();
public Foo findFoo(FooAttr attr) {
    for (Foo foo : fooList) {
        if (foo.getAttr().equals(attr)) {
            return foo;
        }
    }
    return null; // no match found; without a return here the method does not compile
}
However, assuming I properly guard against null input, I could also express the loop like this:
for (Foo foo : fooList) {
    if (attr.equals(foo.getAttr())) {
        return foo;
    }
}
I'm wondering if one of the above forms has a performance advantage over the other. I'm well aware of the dangers of premature optimization, but in this case, I think the code is equally legible either way, so I'm looking for a reason to prefer one form over another, so I can build my coding habits to favor that form. I think given a large enough list, even a small performance advantage could amount to a significant amount of time.
In particular, I'm wondering if the second form might be more performant because the equals() method is called repeatedly on the same object, instead of different objects? Maybe branch prediction is a factor?
I would offer 2 pieces of advice here:
Measure it
If nothing else points you in any given direction, prefer the form which makes most sense and sounds most natural when you say it out loud (or in your head!)
I think that considering branch prediction is worrying about efficiency at too low of a level. However, I find the second example of your code more readable because you put the consistent object first. Similarly, if you were comparing this to some other object that, I would put the this first.
Of course, equals is defined by the programmer so it could be asymmetric. You should make equals an equivalence relation so this shouldn't be the case. Even if you have an equivalence relation, the order could matter. Suppose that attr is a superclass of the various foo.getAttr and the first test of your equals method checks if the other object is an instance of the same class. Then attr.equals(foo.getAttr()) will pass the first check but foo.getAttr().equals(attr) will fail the first check.
However, worrying about efficiency at this level seldom has benefits.
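The order-dependent case described above, where the first check in equals is an instanceof test and the two objects are of different classes, can be sketched as follows. BaseAttr and SpecialAttr are hypothetical classes, and this equals is deliberately asymmetric to show the effect, not a recommended design.

```java
class BaseAttr {
    final int value;

    BaseAttr(int value) {
        this.value = value;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof BaseAttr)) return false; // a SpecialAttr passes this check
        return value == ((BaseAttr) o).value;
    }

    @Override
    public int hashCode() {
        return value;
    }
}

class SpecialAttr extends BaseAttr {
    SpecialAttr(int value) {
        super(value);
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof SpecialAttr)) return false; // a plain BaseAttr fails this check
        return super.equals(o);
    }

    public static void main(String[] args) {
        BaseAttr attr = new BaseAttr(1);
        BaseAttr fooAttr = new SpecialAttr(1);
        System.out.println(attr.equals(fooAttr)); // true
        System.out.println(fooAttr.equals(attr)); // false
    }
}
```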
This depends on the implementation of the equals methods. In this situation I assume that both objects are instances of the same class. So that would mean that the methods are equal. This makes no performance difference.
If both objects are of the same type, then they should perform the same. If not, then you can't really know in advance what's going to happen, but usually it will be stopped quite quickly (with an instanceof or something else).
For myself, I usually start the method with a non-null check on the given parameter and I then use the attr.equals(foo.getAttr()) since I don't have to check for null in the loop. Just a question of preference I guess.
The only thing which does affect performance is code which does nothing.
In some cases you have code which is much the same or the difference is so small it just doesn't matter. This is the case here.
Where it is useful to swap the .equals() around is when you have a known value which cannot be null (this doesn't appear to be the case here) or when the type you are using is known.
e.g.
Object o = (Integer) 123;
String s = "Hello";
o.equals(s); // the type of equals is unknown and a virtual table lookup might be required
s.equals(o); // the type of equals is known and the class is final.
The difference is so small I wouldn't worry about it.
DEVENTER (n) A decision that's very hard to make because so little depends on it, such as which way to walk around a park
-- The Deeper Meaning of Liff by Douglas Adams and John Lloyd.
The performance should be the same, but in terms of safety, it's usually best to have the left operand be something that you are sure is not null, and have your equals method deal with null values.
Take for instance:
String s1 = null;
s1.equals("abc");
"abc".equals(s1);
The two calls to equals are not equivalent as one would issue a NullPointerException (the first one), and the other would return false.
The latter form is generally preferred for comparing with string constants for exactly this reason.
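As a small sketch of the two null-safe options: put the constant on the left, or use java.util.Objects.equals when either side may be null. The class and method names here (NullSafeEqualsDemo, isAbc, sameText) are made up for the example.

```java
import java.util.Objects;

public class NullSafeEqualsDemo {

    // Constant on the left: never throws, even when s is null.
    static boolean isAbc(String s) {
        return "abc".equals(s);
    }

    // Objects.equals handles null on either side; two nulls compare equal.
    static boolean sameText(String a, String b) {
        return Objects.equals(a, b);
    }

    public static void main(String[] args) {
        String s1 = null;
        System.out.println(isAbc(s1));          // false
        System.out.println(sameText(s1, null)); // true
        try {
            s1.equals("abc");                   // null receiver
        } catch (NullPointerException e) {
            System.out.println("NullPointerException");
        }
    }
}
```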
I stumbled across the source of AtomicInteger and realized that
new AtomicInteger(0).equals(new AtomicInteger(0))
evaluates to false.
Why is this? Is it some "defensive" design choice related to concurrency issues? If so, what could go wrong if it was implemented differently?
(I do realize I could use get and == instead.)
This is partly because an AtomicInteger is not a general purpose replacement for an Integer.
The java.util.concurrent.atomic package summary states:
Atomic classes are not general purpose replacements for
java.lang.Integer and related classes. They do not define methods
such as hashCode and compareTo. (Because atomic variables are
expected to be mutated, they are poor choices for hash table keys.)
hashCode is not overridden, and neither is equals. This is in part due to a far larger rationale that is discussed in the mailing list archives, on whether AtomicInteger should extend Number or not.
One of the reasons why an AtomicXXX class is not a drop-in replacement for a primitive, and why it does not implement the Comparable interface, is that comparing two instances of an AtomicXXX class is pointless in most scenarios. If two threads can access and mutate the value of an AtomicInteger, then a comparison result may already be stale by the time you use it, because another thread may have mutated the value in the meantime. The same rationale holds for the equals method: the result of an equality test (one that depends on the value of the AtomicInteger) is only valid until a thread mutates one of the AtomicIntegers in question.
On the face of it, it seems like a simple omission, but maybe it does make some sense to just use the identity equals provided by Object.equals.
For instance:
AtomicInteger a = new AtomicInteger(0);
AtomicInteger b = new AtomicInteger(0);
assert a.equals(b);
seems reasonable, but b isn't really a: it is designed to be a mutable holder for a value and therefore can't really replace a in a program.
also:
assert a.equals(b);
assert a.hashCode() == b.hashCode();
should work, but what if b's value changes in between?
If this is the reason it's a shame it wasn't documented in the source for AtomicInteger.
As an aside: A nice feature might also have been to allow AtomicInteger to be equal to an Integer.
AtomicInteger a = new AtomicInteger(25);
if( a.equals(25) ){
// woot
}
The trouble is that, in order for equals to be symmetric in this case, Integer would have to accept AtomicInteger in its equals too.
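To see why that would be a problem, here is a sketch with a hypothetical subclass, LenientAtomicInteger, that accepts an Integer in equals. Nothing like this exists in the JDK; it is only here to show that Integer.equals will not return the favour.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class IntegerSymmetryDemo {

    // Hypothetical subclass, for illustration only: treats an Integer with the
    // same value as equal. This knowingly breaks the equals contract.
    static class LenientAtomicInteger extends AtomicInteger {
        LenientAtomicInteger(int initialValue) {
            super(initialValue);
        }

        @Override
        public boolean equals(Object o) {
            if (o instanceof Integer) {
                return get() == (Integer) o; // unboxes and compares values
            }
            return this == o;
        }

        // hashCode is left as the identity hash from Object, so it also
        // disagrees with Integer.hashCode(): one more reason not to do this.
    }

    public static void main(String[] args) {
        AtomicInteger a = new LenientAtomicInteger(25);
        Integer i = 25;
        System.out.println(a.equals(i)); // true
        System.out.println(i.equals(a)); // false: Integer.equals only accepts Integer
    }
}
```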
I would argue that because the point of an AtomicInteger is that operations can be done atomically, it would be hard to ensure that the two values are compared atomically, and because AtomicIntegers are generally counters, you'd get some odd behaviour.
So without ensuring that the equals method is synchronised you wouldn't be sure that the value of the atomic integer hasn't changed by the time equals returns. However, as the whole point of an atomic integer is not to use synchronisation, you'd end up with little benefit.
I suspect that comparing the values is a no-go since there's no way to do it atomically in a portable fashion (without locks, that is).
And if there's no atomicity then the variables could compare equal even if they never contained the same value at the same time (e.g. if a changed from 0 to 1 at exactly the same time as b changed from 1 to 0).
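A real race is hard to reproduce on demand, so the sketch below simulates one unlucky interleaving in a single thread. The class and method names are invented, and b moves from 2 to 0 (rather than 1 to 0) so that this sequential simulation never passes through a state where the two values really are equal.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class TornComparisonDemo {

    // Simulates a value-based comparison that is interrupted between its two reads.
    static boolean interleavedCompare() {
        AtomicInteger a = new AtomicInteger(0);
        AtomicInteger b = new AtomicInteger(2);

        int seenA = a.get(); // comparing thread reads a: 0

        // What another thread might do in between (simulated inline):
        a.set(1);            // state is now a=1, b=2
        b.set(0);            // state is now a=1, b=0

        int seenB = b.get(); // comparing thread reads b: 0

        // The states were (0,2), (1,2), (1,0): never equal, yet the reads match.
        return seenA == seenB;
    }

    public static void main(String[] args) {
        System.out.println(interleavedCompare()); // true
    }
}
```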
AtomicInteger inherits from Object and not Integer, and it uses standard reference equality check.
If you google you will find this discussion of this exact case.
Imagine if equals was overridden and you put it in a HashMap and then you change the value. Bad things will happen:)
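A sketch of those "bad things", using a hypothetical ValueCounter class that does what AtomicInteger deliberately does not, i.e. bases equals and hashCode on its current value:

```java
import java.util.HashMap;
import java.util.Map;

public class MutatedKeyDemo {

    // Hypothetical mutable holder with value-based equals/hashCode.
    static final class ValueCounter {
        int value;

        ValueCounter(int value) {
            this.value = value;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof ValueCounter && value == ((ValueCounter) o).value;
        }

        @Override
        public int hashCode() {
            return Integer.hashCode(value);
        }
    }

    // Returns whether the map can still find the key after the key was mutated.
    static boolean findableAfterMutation() {
        Map<ValueCounter, String> map = new HashMap<>();
        ValueCounter key = new ValueCounter(1);
        map.put(key, "one");
        key.value = 2; // the hash stored in the map no longer matches the key
        return map.containsKey(key) || map.containsKey(new ValueCounter(1));
    }

    public static void main(String[] args) {
        System.out.println(findableAfterMutation()); // false
    }
}
```

The entry is still in the map (its size is 1), but neither the mutated key nor a fresh key with the old value can reach it.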
equals is not only used for equality but also has to meet its contract with hashCode, i.e. in hash collections. The only safe approach for hash collections is for the equality of a mutable object not to be dependent on its contents, i.e. for mutable keys a HashMap behaves the same as an IdentityHashMap. This way the hashCode and whether two objects are equal do not change when the key's content changes.
So new StringBuilder().equals(new StringBuilder()) is also false.
To compare the contents of two AtomicInteger, you need ai.get() == ai2.get() or ai.intValue() == ai2.intValue()
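For example, a minimal sketch (the helper name sameValue is made up) showing identity-based equals next to an explicit value comparison:

```java
import java.util.concurrent.atomic.AtomicInteger;

public class AtomicCompareDemo {

    // Compares the values currently held; the result is only a snapshot.
    static boolean sameValue(AtomicInteger x, AtomicInteger y) {
        return x.get() == y.get();
    }

    public static void main(String[] args) {
        AtomicInteger ai = new AtomicInteger(0);
        AtomicInteger ai2 = new AtomicInteger(0);

        System.out.println(ai.equals(ai2));                      // false: identity
        System.out.println(ai.equals(ai));                       // true: same instance
        System.out.println(sameValue(ai, ai2));                  // true
        System.out.println(ai.intValue() == ai2.intValue());     // true
        System.out.println(new StringBuilder().equals(new StringBuilder())); // false
    }
}
```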
Let's say that you had a mutable key where the hashCode and equals changed based on the contents.
static class BadKey {
    int num;

    @Override
    public int hashCode() {
        return num;
    }

    @Override
    public boolean equals(Object obj) {
        return obj instanceof BadKey && num == ((BadKey) obj).num;
    }

    @Override
    public String toString() {
        return "Bad Key " + num;
    }
}

public static void main(String... args) {
    Map<BadKey, Integer> map = new LinkedHashMap<BadKey, Integer>();
    for (int i = 0; i < 10; i++) {
        BadKey bk1 = new BadKey();
        bk1.num = i;
        map.put(bk1, i);
        bk1.num = 0;
    }
    System.out.println(map);
}
prints
{Bad Key 0=0, Bad Key 0=1, Bad Key 0=2, Bad Key 0=3, Bad Key 0=4, Bad Key 0=5, Bad Key 0=6, Bad Key 0=7, Bad Key 0=8, Bad Key 0=9}
As you can see we now have 10 keys, all equal and with the same hashCode!
equals is correctly implemented: an AtomicInteger instance can only equal itself, as only that very same instance will provably store the same sequence of values over time.
Please recall that Atomic* classes act as reference types (just like java.lang.ref.*), meant to wrap an actual, "useful" value. Unlike in functional languages (see e.g. Clojure's Atom or Haskell's IORef), the distinction between references and values is rather blurry in Java (blame mutability), but it is still there.
Considering the current wrapped value of an Atomic class as the criterion for equality is quite clearly a misconception, as it would imply that new AtomicInteger(1).equals(1).
One limitation with Java is that there is no means of distinguishing a mutable-class instance which can and will be mutated from a mutable-class instance which will never be exposed to anything that might mutate it(*). References to things of the former type should only be considered equal if they refer to the same object, while references to things of the latter type should often be considered equal if they refer to objects with equivalent state. Because Java only allows one override of the virtual equals(Object) method, designers of mutable classes have to guess whether enough instances will meet the latter pattern (i.e. be held in such a way that they'll never be mutated) to justify having equals() and hashCode() behave in a fashion suitable for such usage.
In the case of something like Date, there are a lot of classes which encapsulate a reference to a Date that is never going to be modified, and which want to have their own equivalence relation incorporate the value-equivalence of the encapsulated Date. As such, it makes sense for Date to override equals and hashCode to test value equivalence. On the other hand, holding a reference to an AtomicInteger that is never going to be modified would be silly, since the whole purpose of that type centers around mutability. An AtomicInteger instance which is never going to be mutated may, for all practical purposes, simply be an Integer.
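A sketch of the Date case: Appointment is a hypothetical class that holds a private copy of a Date it never modifies, and builds its own equality on Date's value-based equals. The same pattern would be pointless around an AtomicInteger, whose equals is identity-based.

```java
import java.util.Date;
import java.util.concurrent.atomic.AtomicInteger;

public class ValueVsHolderDemo {

    // Hypothetical class encapsulating a Date that is never mutated after construction.
    static final class Appointment {
        private final Date when;

        Appointment(Date when) {
            this.when = new Date(when.getTime()); // defensive copy, never exposed
        }

        @Override
        public boolean equals(Object o) {
            // Relies on Date.equals comparing the millisecond value.
            return o instanceof Appointment && when.equals(((Appointment) o).when);
        }

        @Override
        public int hashCode() {
            return when.hashCode();
        }
    }

    public static void main(String[] args) {
        Appointment x = new Appointment(new Date(0L));
        Appointment y = new Appointment(new Date(0L));
        System.out.println(x.equals(y));                                      // true
        System.out.println(new AtomicInteger(0).equals(new AtomicInteger(0))); // false
    }
}
```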
(*) Any requirement that a particular instance never mutate is only binding as long as either (1) information about its identity hash value exists somewhere, or (2) more than one reference to the object exists somewhere in the universe. If neither condition applies to the instance referred to by Foo, replacing Foo with a reference to a clone of Foo would have no observable effect. Consequently, one would be able to mutate the instance without violating a requirement that it "never mutate" by pretending to replace Foo with a clone and mutating the "clone".