Value-based Classes confusion - java

I'm seeking some clarification of the definition of value-based classes. I can't see how the last bullet point (6) is supposed to work together with the first one:
(1) they are final and immutable (though may contain references to mutable objects)
(6) they are freely substitutable when equal, meaning that interchanging any two instances x and y that are equal according to equals() in any computation or method invocation should produce no visible change in behavior.
Optional is such a class.
Optional<ArrayList<String>> a = Optional.of(new ArrayList<String>());
Optional<ArrayList<String>> b = Optional.of(new ArrayList<String>());
assertEquals(a, b); // passes, as `equals` is delegated to the lists
b.get().add("a");
// now bite the last bullet
assertTrue(a.get().isEmpty()); // passes
assertTrue(b.get().isEmpty()); // throws
Am I reading it incorrectly, or does the definition need to be more precise?
Update
The answer by Eran makes sense (they are no longer equal), but let me move the target:
...
assertEquals(a, b); // now, they are still equal
assertEquals(m(a, b), m(a, a)); // this will throw
assertEquals(a, b); // now, they are equal, too
Let's define a funny method m, which does some mutation and undoes it again:
int m(Optional<ArrayList<String>> x, Optional<ArrayList<String>> y) {
    x.get().add("");
    int result = x.get().size() + y.get().size();
    x.get().remove(x.get().size() - 1);
    return result;
}
It's a strange method, I know. But I guess it qualifies as "any computation or method invocation", doesn't it?

they are freely substitutable when equal, meaning that interchanging any two instances x and y that are equal according to equals() in any computation or method invocation should produce no visible change in behavior
Once b.get().add("a"); is executed, a is no longer equals to b, so you have no reason to expect assertTrue(a.get().isEmpty()); and assertTrue(b.get().isEmpty()); would produce the same result.
The fact that a value based class is immutable doesn't mean you can't mutate the values stored in instances of such classes (as stated in though may contain references to mutable objects). It only means that once you create an Optional instance with Optional a = Optional.of(new ArrayList<String>()), you can't mutate a to hold a reference to a different ArrayList.
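To make that point concrete, here is a minimal sketch (the variable name is mine): the Optional itself can never be re-pointed at a different list, but the list it wraps can still be mutated through the reference it holds.
List<String> list = new ArrayList<>();
Optional<List<String>> a = Optional.of(list);
// There is no setter on Optional; a refers to this exact list for its whole lifetime.
a.get().add("x");            // mutates the wrapped list, not the Optional itself
System.out.println(a.get()); // prints [x] - same Optional, same list, new contents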

You can derive the invalidity of your actions from the specification you’re referring to:
A program may produce unpredictable results if it attempts to distinguish two references to equal values of a value-based class, whether directly via reference equality or indirectly via an appeal to synchronization, identity hashing, serialization, or any other identity-sensitive mechanism. Use of such identity-sensitive operations on instances of value-based classes may have unpredictable effects and should be avoided.
(emphasis mine)
Modifying an object is an identity-sensitive operation, as it only affects the object with the specific identity represented by the reference you are using for the modification.
When you call x.get().add(""); you are performing an operation that allows you to recognize whether x and y represent the same instance; in other words, you are performing an identity-sensitive operation.
Still, I expect that if a future JVM truly tries to substitute value-based instances, it will have to exclude instances referring to mutable objects, to ensure compatibility. If you perform an operation that produces an Optional followed by extracting the Optional, e.g. … stream.findAny().get(), it would be disastrous/unacceptable if the intermediate operation allowed the element to be substituted with another object that happened to be equal at the point of the intermediate Optional use (if the element is not itself a value type)…

I think a more interesting example is as follows:
void foo() {
    List<String> list = new ArrayList<>();
    Optional<List<String>> a = Optional.of(list);
    Optional<List<String>> b = Optional.of(list);
    bar(a, b);
}
It's clear that a.equals(b) is true. Furthermore, since Optional is final (cannot be subclassed), immutable, and both a and b refer to the same list, a.equals(b) will always be true. (Well, almost always, subject to race conditions where another thread is modifying the list while this one is comparing them.) Thus, this seems like it would be a case where it would be possible for the JVM to substitute b for a or vice-versa.
As things stand today (Java 8 and 9 and 10) we can write a == b and the result will be false. The reason is that we know that Optional is an instance of an ordinary reference type, and the way things are currently implemented, Optional.of(x) will always return a new instance, and two new instances are never == to each other.
However, the paragraph at the bottom of the value-based classes definition says:
A program may produce unpredictable results if it attempts to distinguish two references to equal values of a value-based class, whether directly via reference equality or indirectly via an appeal to synchronization, identity hashing, serialization, or any other identity-sensitive mechanism. Use of such identity-sensitive operations on instances of value-based classes may have unpredictable effects and should be avoided.
In other words, "don't do that," or at least, don't rely on the result. The reason is that tomorrow the semantics of the == operation might change. In a hypothetical future value-typed world, == might be redefined for value types to be the same as equals, and Optional might change from being a value-based class to being a value type. If this happens, then a == b will be true instead of false.
One of the main ideas about value types is that they have no notion of identity (or perhaps their identity isn't detectable to Java programs). In such a world, how could we tell whether a and b "really" are the same or different?
Suppose we were to instrument the bar method via some means (say, a debugger) such that we can inspect the attributes of the parameter values in a way that can't be done through the programming language, such as by looking at machine addresses. Even if a == b is true (remember, in a value-typed world, == is the same as equals) we might be able to ascertain that a and b reside at different addresses in memory.
Now suppose the JIT compiler compiles foo and inlines the calls to Optional.of. Seeing that there are now two hunks of code that return two results that are always equals, the compiler eliminates one of the hunks and then uses the same result wherever a or b is used. Now, in our instrumented version of bar, we might observe that the two parameter values are the same. The JIT compiler is allowed to do this because of the sixth bullet item, which allows substitution of values that are equals.
Note that we're only able to observe this difference because we're using an extra-linguistic mechanism such as a debugger. Within the Java programming language, we can't tell the difference at all, and thus this substitution can't affect the result of any Java program. This lets the JVM choose any implementation strategy it sees fit. The JVM is free to allocate a and b on the heap, on the stack, one on each, as distinct instances, or as the same instances, as long as Java programs can't tell the difference. When the JVM is granted freedom of implementation choices, it can make programs go a lot faster.
That's the point of the sixth bullet item.

When you execute the lines:
Optional<ArrayList<String>> a = Optional.of(new ArrayList<String>());
Optional<ArrayList<String>> b = Optional.of(new ArrayList<String>());
assertEquals(a, b); // passes, as `equals` is delegated to the lists
In assertEquals(a, b), according to the API, the two Optionals are equal if:
the params a and b are both Optional,
both have no value present, or
the present values are "equal to" each other via equals() (in your example this equals is the one from ArrayList).
So, when you change the ArrayList that one of the Optional instances points to, the assert will fail on the third point.

Point 6 says that if a and b are equal, then they can be used interchangeably. That is, if a method expects two instances of class A and you have created instances a and b, then, provided a and b satisfy point 6, you may pass (a, a), (b, b) or (a, b) and all three calls will give the same output.
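A hedged illustration of that reading, using Optionals of an immutable value (so the caveat about wrapped mutable objects doesn't apply): since a and b are equal and the wrapped Strings cannot change, any computation gives the same result whichever instance you pass.
Optional<String> a = Optional.of("x");
Optional<String> b = Optional.of("x");
// a.equals(b) is true, and substituting b for a is not observable:
int r1 = a.get().length() + b.get().length();
int r2 = a.get().length() + a.get().length();
assert r1 == r2; // no visible change in behavior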


Can someone please explain Intellij's default equals implementation?

I get this suggestion from IntelliJ IDEA when using the @Data annotation from Lombok.
The class in question is an @Entity.
Can someone explain:
what does it do exactly (especially the part with Hibernate)
Is this method preferred over comparing every field one-by-one? If yes, why?
@Override
public boolean equals(Object o) {
    if (this == o)
        return true;
    if (o == null || Hibernate.getClass(this) != Hibernate.getClass(o))
        return false;
    MyObject that = (MyObject) o;
    return id != null && Objects.equals(id, that.id);
}
The project contains/uses Spring boot, Hibernate, Lombok.
Thank you
There's a fundamental problem at work, one inherent to JPA/Hibernate. For this example, let's say we have a db table named User, and we have a class also named User that models it.
The problem boils down to simply this:
What does the java class User represent? Does it represent 'a row in the database table "User"', or does it represent a User?
Depending on your answer, you get wildly different requirements for the equals method, and answering this question incorrectly leads to code bugs. As far as I know, there is no actual 'standard'; people just sort of do something, and most aren't aware that this is a fundamental problem.
It represents a row in the DB
Such an interpretation would then suggest the following implementation of your equals method:
If all fields that model the primary key columns in the DB definition are equal between the two instances, then they are equal, even if the other (non-primary-key) fields are different. After all, that's how the DB determines equality, so java code should match it.
The java code should be like SQL when dealing with NULLs. That is to say, quite unlike just about every equality definition, equals-method code generator (including Lombok, IntelliJ, and Eclipse), and even the Objects.equals method: in this mode, null == null should be FALSE, as it is in SQL! Specifically, if any of the primary key fields has a null value, that object cannot be equal to any other, even a carbon copy of itself; to stick to Java's rules, it can (must, really) be equal to its own reference.
In other words:
Any 2 objects are equal if either [A] they are literally the same object (this == other), or [B] BOTH objects' unid field is initialized and equal. Whether you use null or 0 to track 'not written to DB yet', that value instantly disqualifies that row from being equal to any other, even another one with 100% identical values.
After all, if you make 2 separate new objects and save() them both, they would turn into 2 separate rows.
It represents a user object
Then what happens is that the equals rules do a 180. The primary key, assuming it's an unid-style primary key and not a natural primary key, is inherently an implementation detail. Imagine that somehow in your DB you end up with 2 rows for the exact same user (presumably somebody messed up and failed to add a UNIQUE constraint on username). In the semantic model of users on the system, users are uniquely identified by their username; therefore, equality is defined by username alone. 2 objects with identical usernames but different unid values are nevertheless equal.
So which one do I take?
I have no idea. Fortunately, your question asked for explanation and not an answer!
What IntelliJ is telling you is to go with the first interpretation (row in the DB), and it even applies the wonky null stuff correctly, so whoever wrote the suggestion tool in IntelliJ at least seems to understand what's going on.
For what it's worth, I think 'represents a row in the DB' is the more useful interpretation (because not doing this involves invoking getters, which makes equality checks incredibly pricey: it may result in hundreds of SELECT calls and a gigantic bunch of heap memory as you pull half the DB in!). However, 'an instance of class User represents a user in the system' is the more Java-like interpretation and the one that most Java programmers would (erroneously then, if you use IntelliJ's suggestion here) silently presume.
I've solved this problem in my own programming endeavours by never using hibernate/JPA in the first place, and using tools like JOOQ or JDBI instead. But, the downside is that generally you end up with more code – you really do sometimes have an object, e.g. called UserRow, representing a user row, and an object e.g. called User that represents a user on-system.
Another trick could be to decide to name all your Hibernate model classes as XRow. Names are important and the best documentation around: This makes no bones about it and clues in all users of this code about how they are to interpret its semantic meaning: Row in DB. Thus, the intellij suggestion would then be your equals implementation.
NB: Lombok is Java- and not Hibernate-specific, so it makes the 'represents a user in the system' choice. You can try to push Lombok towards the 'row in DB' interpretation by telling Lombok to only use the id field (stick an @EqualsAndHashCode.Include on that field), but Lombok would still consider 2 null values / 2 0 values identical even though it shouldn't. This is on Hibernate, as it is breaking all sorts of rules and specs.
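For illustration, a minimal, hedged sketch of pushing Lombok towards the 'row in the DB' interpretation. UserRow and its fields are hypothetical, and the caveat above still applies: Lombok will treat two null (or two 0) ids as equal, which the DB interpretation says it shouldn't.
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.Id;
import lombok.Data;
import lombok.EqualsAndHashCode;

@Entity
@Data
@EqualsAndHashCode(onlyExplicitlyIncluded = true) // only fields marked Include take part
public class UserRow {
    @Id
    @GeneratedValue
    @EqualsAndHashCode.Include // equality is defined by the primary key only
    private Long id;

    private String username; // deliberately excluded from equals/hashCode
}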
(NB: Added due to a comment on another answer)
Why is .getClass() being invoked?
Java has sensible rules about what equals is supposed to mean. This is in the javadoc of the equals method and these rules can be relied upon (and are, by e.g. HashSet and co). The rules are:
If a.equals(b) is true, a.hashCode() == b.hashCode() must also be true.
a.equals(a) must be true.
If a.equals(b) then b.equals(a) must also be true.
If a.equals(b) and b.equals(c) then a.equals(c) must also be true.
Sensible and simple, right?
Nope. That's actually really complex.
Imagine you make a subclass of ArrayList: You decide to give lists a colour. You can have a blue list of strings and a red list of strings.
Right now the equality method of ArrayList checks whether the argument is a list and, if so, compares elements. Seems sensible, right? We can see it in action:
List<String> a = new ArrayList<String>();
a.add("Hello");
List<String> b = new LinkedList<String>();
b.add("Hello");
System.out.println(a.equals(b));
This prints true.
Let's now make our coloured arraylist implementation: class ColoredList<T> extends ArrayList<T> { .. }. Surely, a red empty list is no longer equal to a blue empty list right?
Nope, you'd be breaking rules if you do that!
List<String> a = new ArrayList<String>();
List<String> b = new ColoredList<String>(Color.RED);
List<String> c = new ColoredList<String>(Color.BLUE);
System.out.println(a.equals(b));
System.out.println(a.equals(c));
System.out.println(b.equals(c));
That prints true/true/false, which is invalid (a.equals(b) and a.equals(c) are true, so by symmetry and transitivity b.equals(c) would have to be true as well). The conclusion is that it is in fact impossible to make any list subclass that adds semantically relevant information. The only subclasses that can exist are those which either actively break the spec (bad idea), or whose additions have no impact on equality.
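For reference, a hedged sketch of what such a ColoredList override might look like (Color is just java.awt.Color here for illustration); it is exactly this kind of override that produces the invalid true/true/false result above.
import java.awt.Color;
import java.util.ArrayList;

class ColoredList<T> extends ArrayList<T> {
    private final Color color;

    ColoredList(Color color) { this.color = color; }

    @Override
    public boolean equals(Object o) {
        // Plain lists still consider us equal to them (ArrayList.equals only looks at elements),
        // but we refuse to equal anything that isn't a ColoredList with the same colour:
        if (!(o instanceof ColoredList)) return false;
        return super.equals(o) && color.equals(((ColoredList<?>) o).color);
    }

    @Override
    public int hashCode() { return 31 * super.hashCode() + color.hashCode(); }
}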
There is a different view of things which says that you ought to be able to make such classes. Again we're struggling, just like with the JPA/Hibernate case, about what equals is even supposed to mean.
A more common and far better default behaviour for your equals implementations is to simply state that any 2 objects can only be equal if they are of the exact same type: An instance of Dog cannot be equal to an instance of Animal.
The only way to accomplish this, given the rule that if a.equals(b) then b.equals(a), is for Animal to check the class of the argument and return false if it isn't exactly Animal. In other words:
Animal a = new Animal("Betsy");
Cow c = new Cow("Betsy");
a.equals(c); // must return false!!
The .getClass() check accomplishes this.
Lombok gives you the best of both worlds. It can't perform miracles, so it won't take away the rule that at the type level you need to choose extensibility, but Lombok has the canEqual system to deal with this: the equals code of Animal will ask the other object whether the two things can be equal. In this mode, if you have some non-semantically-different subclass of Animal (such as ArrayList, which is a subclass of AbstractList and doesn't change the semantics at all, it just adds implementation details that have no bearing on equality), it can say that it can be equal, whereas if you have one that is semantically different, such as your coloured list, it can say that none are.
In other words, going back to the coloured lists: IF ArrayList and co were written with Lombok's canEqual system, this could have worked out, and you could have had these results (where a is an ArrayList, b is a red list, and c is a blue list):
a.equals(b); // false, even though same items
a.equals(c); // false, same reason.
b.equals(c); // false and now it's not a conflict.
Lombok's default behaviour is that all subtypes add semantic load and therefore any X cannot be equal to any Y where Y is a subclass of X, but you can override this by writing out the canEqual method in Y. You would do that if you write a subclass that doesn't add semantic load.
This isn't going to help you in the slightest with the problems above about hibernate.
Who knew something as seemingly simple as equality is hiding 2 intractably difficult philosophical treatises, huh?
For more info on canEqual, see Lombok's @EqualsAndHashCode documentation.
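To make the canEqual idea concrete, here is a hand-written, hedged sketch of the pattern (simplified compared to what Lombok actually generates), reusing the Animal/Cow names from above:
import java.util.Objects;

class Animal {
    private final String name;

    Animal(String name) { this.name = name; }

    @Override
    public boolean equals(Object o) {
        if (o == this) return true;
        if (!(o instanceof Animal)) return false;
        Animal other = (Animal) o;
        if (!other.canEqual(this)) return false; // the other side may veto cross-type equality
        return Objects.equals(name, other.name);
    }

    protected boolean canEqual(Object other) {
        return other instanceof Animal;
    }

    @Override
    public int hashCode() { return Objects.hashCode(name); }
}

// A semantically different subclass vetoes equality with plain Animals:
class Cow extends Animal {
    Cow(String name) { super(name); }

    @Override
    public boolean equals(Object o) {
        if (o == this) return true;
        if (!(o instanceof Cow)) return false;
        return ((Cow) o).canEqual(this) && super.equals(o);
    }

    @Override
    protected boolean canEqual(Object other) {
        return other instanceof Cow; // a plain Animal can never equal a Cow
    }
}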
I'm not trying to undermine rzwitserloot's excellent answer, just trying to help you figure out why it uses Hibernate.getClass(this) for you instead of this.getClass().
It doesn't do it for me, but I don't have Hibernate in my project anyway.
The code is generated using velocity macros as seen here:
The IntelliJ default uses a file 'equalsHelper.vm'. I found a possible source of that file version at https://github.com/JetBrains/intellij-community/blob/master/java/java-impl/src/com/intellij/codeInsight/generation/equalsHelper.vm
It contains this:
#macro(addInstanceOfToText)
#if ($checkParameterWithInstanceof)
if(!($paramName instanceof $classname)) return false;
#else
if($paramName == null || getClass() != ${paramName}.getClass()) return false;
#end
#end
So apparently you have a different version of that file? Or you use a different template? Maybe some plugin changed it?
Two objects are not equal if they are of different classes.
For 'preferred', it depends on what an 'id' is. The last line seems a little redundant; it could have been
return Objects.equals(id, that.id);
since the null case is handled by Objects.equals. But to my taste, it's clearer to write
return id != null && id.equals(that.id);
The extra layer adds nothing that I can see in the example.

Writing unittests for immutability

I have written a couple of classes which are designed to be immutable. I am trying to test them. I could certainly use MutabilityDetector, but I want to write something on my own. Nothing extensive, just something basic.
The idea I am trying to capture in my test cases is that on each action, the reference to the object I performed the action on should be different from the object returned by the action.
For example, let's say I have designed a class say Digit and it has a method called add. So the test case I am writing is
@Test
public void test_add() {
    Digit zero = Digit.getInstance(); // ignore why I am using getInstance here
    Digit result = zero.add(new Random().nextInt());
    assertNotEquals(zero, result); // there is no equals method overridden in Digit class
}
My assumption here is that assertNotEquals will test the references of the two objects (zero and result). If the references are different, it means that the operation performed on the zero object returned a new object rather than the old one.
Does this make sense?
My assumption here is that assertNotEquals will test the reference of two objects
Don't assume. Read the javadoc! It says:
Asserts that two objects are not equals. If they are, an AssertionError without a message is thrown. If unexpected and actual are null, they are considered equal.
In other words, this is using the standard Object::equals(Object) method of testing equality. That will only use == comparison if that is what the relevant equals(Object) method does under the hood.
To answer your question, testing for zero == result is neither a necessary nor a sufficient test for immutability.
zero plus some random integer could be zero
The fact that zero is == or not == to result does not prove that the zero object's state has not changed.
In fact, I don't think that there is a valid test for immutability in the general sense. The immutability property is about what is happening inside the abstraction boundary of your Digit class. If you treat Digit as a true black box, you cannot assume that you will be able to detect changes inside the box.
The only valid way to test for (true) immutability is to combine testing with (sure) knowledge of what is happening "inside the box"; i.e. code inspection of your Digit class and white-box testing.
Another alternative is to define what you mean by "immutability" in terms of certain externally visible attributes of Digit; e.g. what toString() returns. (But there are problems with that approach too ...)
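As a hedged sketch of that last alternative, assuming Digit.toString() reflects all of the relevant state (the method name below is mine):
@Test
public void add_doesNotChangeObservableState() {
    Digit zero = Digit.getInstance();
    String before = zero.toString();       // snapshot of the observable state
    zero.add(new Random().nextInt());      // the operation under test; result deliberately ignored
    assertEquals(before, zero.toString()); // the original object still looks the same
}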
assertEquals/assertNotEquals test for equality using the equals method of the objects; it's similar to assertTrue(expected.equals(actual)). If equals is not overridden, this falls back to comparing object references (the default Object.equals), but I wouldn't rely on that, as it is likely to break your test if you eventually implement the equals method.
If you want to check whether the objects are the same (or not), use assertSame/assertNotSame, which test reference equality, similar to assertTrue(expected == actual).
But checking for immutability is not only checking that every modifying operation returns a new instance; it is also checking that the original object is still unchanged!
One way to do this is to create a reference object (or a clone) of the original object and additionally check that the original object still equals the reference object after calling add.
@Test
public void test_add() {
    Digit zero = Digit.getInstance();
    Digit goldenMaster = zero.clone(); // if clone is implemented
    // or this
    Digit goldenMaster = Digit.getInstance();
    Digit result = zero.add(new Random().nextInt());
    assertNotSame(zero, result);      // check the references
    assertEquals(goldenMaster, zero); // check the original object is still unmodified
}
But this in turn requires you to properly implement equals.

Does equality test order affect performance in Java?

I commonly find myself writing code like this:
private List<Foo> fooList = new ArrayList<Foo>();

public Foo findFoo(FooAttr attr) {
    for (Foo foo : fooList) {
        if (foo.getAttr().equals(attr)) {
            return foo;
        }
    }
    return null; // not found
}
However, assuming I properly guard against null input, I could also express the loop like this:
for (Foo foo : fooList) {
    if (attr.equals(foo.getAttr())) {
        return foo;
    }
}
I'm wondering if one of the above forms has a performance advantage over the other. I'm well aware of the dangers of premature optimization, but in this case, I think the code is equally legible either way, so I'm looking for a reason to prefer one form over another, so I can build my coding habits to favor that form. I think given a large enough list, even a small performance advantage could amount to a significant amount of time.
In particular, I'm wondering if the second form might be more performant because the equals() method is called repeatedly on the same object, instead of different objects? Maybe branch prediction is a factor?
I would offer 2 pieces of advice here:
Measure it
If nothing else points you in any given direction, prefer the form which makes most sense and sounds most natural when you say it out loud (or in your head!)
I think that considering branch prediction is worrying about efficiency at too low a level. However, I find the second form of your code more readable, because you put the consistent object first. Similarly, if you were comparing this to some other object, I would put this first.
Of course, equals is defined by the programmer, so it could be asymmetric. You should make equals an equivalence relation, so this shouldn't be the case. Even if you have an equivalence relation, the order could matter. Suppose that attr's class is a superclass of the classes returned by the various foo.getAttr() calls, and the first test of your equals method checks whether the other object is an instance of the same class. Then attr.equals(foo.getAttr()) will pass the first check, but foo.getAttr().equals(attr) will fail the first check.
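A hedged sketch of that asymmetry, with hypothetical classes Attr and SubAttr and reading the "instance of the same class" test as an instanceof check:
class Attr {
    final int v;
    Attr(int v) { this.v = v; }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof Attr)) return false; // accepts subclasses
        return v == ((Attr) o).v;
    }

    @Override
    public int hashCode() { return v; }
}

class SubAttr extends Attr {
    SubAttr(int v) { super(v); }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof SubAttr)) return false; // rejects plain Attr
        return super.equals(o);
    }
}

// Attr attr = new Attr(1); SubAttr sub = new SubAttr(1);
// attr.equals(sub) -> true  (sub passes the instanceof Attr check)
// sub.equals(attr) -> false (attr fails the instanceof SubAttr check)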
However, worrying about efficiency at this level seldom has benefits.
This depends on the implementation of the equals methods. In this situation I assume that both objects are instances of the same class, which means both calls run the same equals implementation, so there is no performance difference.
If both objects are of the same type, then they should perform the same. If not, then you can't really know in advance what's going to happen, but usually it will be stopped quite quickly (with an instanceof or something else).
For myself, I usually start the method with a non-null check on the given parameter and I then use the attr.equals(foo.getAttr()) since I don't have to check for null in the loop. Just a question of preference I guess.
The only thing which does affect performance is code which does nothing.
In some cases you have code which is much the same, or the difference is so small it just doesn't matter. This is the case here.
Where it is useful to swap the .equals() around is when you have a known value which cannot be null (this doesn't appear to be the case here) or when the type you are using is known.
e.g.
Object o = (Integer) 123;
String s = "Hello";
o.equals(s); // the type of equals is unknown and a virtual table lookup might be required
s.equals(o); // the type of equals is known and the class is final.
The difference is so small I wouldn't worry about it.
DEVENTER (n) A decision that's very hard to make because so little depends on it, such as which way to walk around a park
-- The Deeper Meaning of Liff by Douglas Adams and John Lloyd.
The performance should be the same, but in terms of safety, it's usually best to have the left operand be something that you are sure is not null, and have your equals method deal with null values.
Take for instance:
String s1 = null;
s1.equals("abc");
"abc".equals(s1);
The two calls to equals are not equivalent, as one would throw a NullPointerException (the first one), and the other would return false.
The latter form is generally preferred for comparing with string constants for exactly this reason.

Why are two AtomicIntegers never equal?

I stumbled across the source of AtomicInteger and realized that
new AtomicInteger(0).equals(new AtomicInteger(0))
evaluates to false.
Why is this? Is it some "defensive" design choice related to concurrency issues? If so, what could go wrong if it was implemented differently?
(I do realize I could use get and == instead.)
This is partly because an AtomicInteger is not a general purpose replacement for an Integer.
The java.util.concurrent.atomic package summary states:
Atomic classes are not general purpose replacements for java.lang.Integer and related classes. They do not define methods such as hashCode and compareTo. (Because atomic variables are expected to be mutated, they are poor choices for hash table keys.)
hashCode is not overridden, and the same goes for equals. This is in part due to a far larger rationale, discussed in the mailing list archives, about whether AtomicInteger should extend Number or not.
One of the reasons why an AtomicXXX class is not a drop-in replacement for a primitive, and why it does not implement the Comparable interface, is that it is pointless to compare two instances of an AtomicXXX class in most scenarios. If two threads can access and mutate the value of an AtomicInteger, then a comparison result may already be invalid by the time you use it, because another thread may have mutated one of the values in the meantime. The same rationale holds for the equals method: the result of an equality test (which depends on the value of the AtomicInteger) is only valid until a thread mutates one of the AtomicIntegers in question.
On the face of it, it seems like a simple omission, but maybe it does make some sense to just use the identity equals provided by Object.equals.
For instance:
AtomicInteger a = new AtomicInteger(0);
AtomicInteger b = new AtomicInteger(0);
assert a.equals(b);
seems reasonable, but b isn't really a, it is designed to be a mutable holder for a value and therefore can't really replace a in a program.
also:
assert a.equals(b);
assert a.hashCode() == b.hashCode();
should work, but what if b's value changes in between?
If this is the reason it's a shame it wasn't documented in the source for AtomicInteger.
As an aside: A nice feature might also have been to allow AtomicInteger to be equal to an Integer.
AtomicInteger a = new AtomicInteger(25);
if( a.equals(25) ){
// woot
}
The trouble is that it would mean that, in order to keep equals symmetric in this case, Integer would have to accept AtomicInteger in its equals too.
I would argue that because the point of an AtomicInteger is that operations can be done atomically, it would be hard to ensure that the two values are compared atomically, and because AtomicIntegers are generally counters, you'd get some odd behaviour.
So without ensuring that the equals method is synchronised you wouldn't be sure that the value of the atomic integer hasn't changed by the time equals returns. However, as the whole point of an atomic integer is not to use synchronisation, you'd end up with little benefit.
I suspect that comparing the values is a no-go since there's no way to do it atomically in a portable fashion (without locks, that is).
And if there's no atomicity then the variables could compare equal even they never contained the same value at the same time (e.g. if a changed from 0 to 1 at exactly the same time as b changed from 1 to 0).
AtomicInteger inherits from Object and not Integer, and it uses standard reference equality check.
If you google you will find this discussion of this exact case.
Imagine if equals was overridden and you put it in a HashMap and then you change the value. Bad things will happen :)
equals is not only used for equality but also to meet its contract with hashCode, i.e. in hash collections. The only safe approach for hash collections is for a mutable object's equality not to depend on its contents; i.e. for mutable keys a HashMap behaves the same as an IdentityHashMap. This way the hashCode, and whether two objects are equal, does not change when the key's contents change.
So new StringBuilder().equals(new StringBuilder()) is also false.
To compare the contents of two AtomicInteger, you need ai.get() == ai2.get() or ai.intValue() == ai2.intValue()
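A small illustration of that (ai and ai2 as in the sentence above):
AtomicInteger ai  = new AtomicInteger(42);
AtomicInteger ai2 = new AtomicInteger(42);
System.out.println(ai.equals(ai2));        // false - Object.equals, i.e. identity
System.out.println(ai.get() == ai2.get()); // true  - compares the wrapped values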
Let's say that you had a mutable key where the hashCode and equals changed based on the contents.
static class BadKey {
    int num;

    @Override
    public int hashCode() {
        return num;
    }

    @Override
    public boolean equals(Object obj) {
        return obj instanceof BadKey && num == ((BadKey) obj).num;
    }

    @Override
    public String toString() {
        return "Bad Key " + num;
    }
}

public static void main(String... args) {
    Map<BadKey, Integer> map = new LinkedHashMap<BadKey, Integer>();
    for (int i = 0; i < 10; i++) {
        BadKey bk1 = new BadKey();
        bk1.num = i;
        map.put(bk1, i);
        bk1.num = 0;
    }
    System.out.println(map);
}
prints
{Bad Key 0=0, Bad Key 0=1, Bad Key 0=2, Bad Key 0=3, Bad Key 0=4, Bad Key 0=5, Bad Key 0=6, Bad Key 0=7, Bad Key 0=8, Bad Key 0=9}
As you can see we now have 10 keys, all equal and with the same hashCode!
equals is correctly implemented: an AtomicInteger instance can only equal itself, as only that very same instance will provably store the same sequence of values over time.
Please recall that Atomic* classes act as reference types (just like java.lang.ref.*), meant to wrap an actual, "useful" value. Unlike it is the case in functional languages (see e.g. Clojure's Atom or Haskell's IORef), the distinction between references and values is rather blurry in Java (blame mutability), but it is still there.
Considering the current wrapped value of an Atomic class as the criterion for equality is quite clearly a misconception, as it would imply that new AtomicInteger(1).equals(1).
One limitation with Java is that there is no means of distinguishing a mutable-class instance which can and will be mutated, from a mutable-class instance which will never be exposed to anything that might mutate it(*). References to things of the former type should only be considered equal if they refer to the same object, while references to things of the latter type should often be considered equal if the refer to objects with equivalent state. Because Java only allows one override of the virtual equals(object) method, designers of mutable classes have to guess whether enough instances will meet the latter pattern (i.e. be held in such a way that they'll never be mutated) to justify having equals() and hashCode() behave in a fashion suitable for such usage.
In the case of something like Date, there are a lot of classes which encapsulate a reference to a Date that is never going to be modified, and which want to have their own equivalence relation incorporate the value-equivalence of the encapsulated Date. As such, it makes sense for Date to override equals and hashCode to test value equivalence. On the other hand, holding a reference to an AtomicInteger that is never going to be modified would be silly, since the whole purpose of that type centers around mutability. An AtomicInteger instance which is never going to be mutated may, for all practical purposes, simply be an Integer.
(*) Any requirement that a particular instance never mutate is only binding as long as either (1) information about its identity hash value exists somewhere, or (2) more than one reference to the object exists somewhere in the universe. If neither condition applies to the instance referred to by Foo, replacing Foo with a reference to a clone of Foo would have no observable effect. Consequently, one would be able to mutate the instance without violating a requirement that it "never mutate" by pretending to replace Foo with a clone and mutating the "clone".

Java, Object.hashCode() result constant across all JVMs/Systems?

Is the output of Object.hashCode() required to be the same on all JVM implementations for the same Object?
For example if "test".hashCode() returns 1 on 1.4, could it potentially return 2 running on 1.6. Or what if the operating systems were different, or there was a different processor architecture between instances?
No. The output of hashCode is liable to change between JVM implementations and even between different executions of a program on the same JVM.
However, in the specific example you gave, the value of "test".hashCode() will actually be consistent because the implementation of hashCode for String objects is part of the API of String (see the Javadocs for java.lang.String and this other SO post).
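For reference, String.hashCode() is documented to be computed as s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1], so it yields the same value on any conforming JVM. A quick sketch verifying that for "test":
public class StringHashDemo {
    public static void main(String[] args) {
        String s = "test";
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i);          // the documented formula
        }
        System.out.println(h == s.hashCode()); // true on any conforming JVM
        System.out.println(s.hashCode());      // 3556498
    }
}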
From the API
The general contract of hashCode is:
Whenever it is invoked on the same object more than once during an execution of a Java application, the hashCode method must consistently return the same integer, provided no information used in equals comparisons on the object is modified. This integer need not remain consistent from one execution of an application to another execution of the same application.
If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.
It is not required that if two objects are unequal according to the equals(java.lang.Object) method, then calling the hashCode method on each of the two objects must produce distinct integer results. However, the programmer should be aware that producing distinct integer results for unequal objects may improve the performance of hashtables.
As much as is reasonably practical, the hashCode method defined by class Object does return distinct integers for distinct objects. (This is typically implemented by converting the internal address of the object into an integer, but this implementation technique is not required by the Java™ programming language.)
No, the result of hashCode() is only constant during a single execution. You should not expect the result of the function to be the same between executions, let alone between JRE versions or platforms.
First of all, the result of hashCode depends heavily on the object's type and its implementation; every class, including its subclasses, can define its own behavior. You can rely on it following the general contract as outlined in the Javadoc as well as in other answers, but the value is not required to stay the same after a VM restart, especially if it depends on the hashCode implementations of third-party classes.
When referring to the concrete implementation of the String class, you should not depend on the return value. If your program is executed in a different VM, it could potentially change.
If you refer solely to the Sun VM, it could be argued that Sun will not break - even badly programmed - existing code, so "test".hashCode() will always return exactly 3556498 for any version of the Sun VM.
If you want to deliberately shoot yourself in the foot, go ahead and depend on this. People who need to fix your code running on the "2015 Nintendo Java VM for Hairdryer" will cry out your name at night.
As noted, for many implementations the default behavior of hashCode() is to return the address of the object. Obviously this can be different each time the program is run. This is also consistent with the default behavior of equals(): two objects are equal only if they are the same object (where x and y are both non-null, x.equals(y) if and only if x == y).
For any classes where hashCode() and equals() are overridden, generally they are calculated in a deterministic way based on the values of some or all of the members. Thus, in practice it is likely that if an object in one run of the program can be said to be equal to an object in another run of the program, and the source code is the same (including such things as the source code for String.hashCode() if that is called by the hashCode() override), the hash codes will be the same.
It is not guaranteed, although it is hard to think of a reasonable real-world example.
The only truth: the hashcode is the same within a single run of the application. Another run may give other hashcodes.
When you ask for an object's hashcode, the JVM creates it using one of several algorithms and puts it in the object's header for future use.
Just look into get_next_hash function in OpenJDK.
The algorithm is configurable with the JVM argument -XX:hashCode=x,
where x is a digit:
0 – Park-Miller RNG (default)
1 – f (address, the global)
2 – constant 1
3 – sequential counter
4 – object's address in heap
5 – Xorshift (the fastest)
When the hashcode equals the address in the heap, this is sometimes awkward, because the GC can move objects to other heap cells, etc.
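As a small, hedged demonstration of the "same only within one run" point: the identity hash of a fresh object will almost certainly differ from run to run, so it should never be persisted or compared across JVMs.
public class IdentityHashDemo {
    public static void main(String[] args) {
        Object o = new Object();
        // Typically prints a different value on each run of the program:
        System.out.println(Integer.toHexString(System.identityHashCode(o)));
    }
}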
