Question: if there are two objects o1 and o2 such that o1.equals(o2), what is java's standard convention about the relationship between o1.hashCode() == o2.hashCode()?
Answer: o1.hashCode() == o2.hashCode()
I don't know why..what i thought was since o1 and o2 are different objects, they shouldn't have the same hashCode. If what the question says were o1 == o2, then I would say they have the same hashCode since both o1 and o2 point to the same object.
Can anyone point out what I did wrong?
.equals is used to evaluate the value of an object. When objects are hashed, their numerical values on a systems level are used to produce a hash code (an integer, if you will). If the objects are the exact same (e.g. in terms of attributes), they will have the same numerical sequence and therefore hash to the same value.
Equals and Hashcode always go hand in hand. If o1 == o2, then they point to the same memory and hashcodes are trivially equal. However, one you implement Equals, you always have to make sure that when Equals returns true, Hashcode must return the same value.
Here are the official docs which set out the requirements/standards.
Related
Given java Object#hashCode documentation snapshot :
As much as is reasonably practical, the hashCode method defined by
class Object does return distinct integers for distinct objects. (The
hashCode may or may not be implemented as some function of an object's
memory address at some point in time.)
How would we make this method return true without overriding the hashCode() method ?
boolean challenge(Object o1, Object o2) {
return o1 == o2 && o1.hashCode() != o2.hashCode();
}
Stated otherwise, make this method return true :
boolean makeMeReturnTrue(Object o1, Object o2) {
return o1 == o2 && System.identityHashCode(o1) != System.identityHashCode(o2);
}
You cannot.
If o1 == o2, then o1.hashCode() is certain to return the same value as o2.hashCode(), which will make the expression return false.
One of the properties of hashCode() is consistency, meaning that within the same program, the result of hashCode() on the same object should not change (and changing the return value is the only way to make the expression return true - but you can't get Object.equals to return incosistent values, so you would be forced to override hashCode to break the contract for yourself).
return o1 == o2 && o1.hashCode() != o2.hashCode();
The expression o1 == o2 tests o1 and o2 to see if they refer to the same object. Not two objects that compare equally, but the actual same object. So o1 == o2 will only be true if o1 and o2 refer to the same object.
Given that o1 and o2 must be the same object, the only way for o1.hashCode() != o2.hashCode() to be true would be if the relevant hashCode() method returned a different value each time it was called. For example, the expression could be true if hashCode() returned a random value each time it was called. That violates the expectation for hashCode(), but you could certainly write a non-conformant hashCode() method if you wanted to.
Well, its going to depend on actual implementation of hashCode() and equals() for the final Objects that get passed in. It looks like your class has an implementation where equality stands for "we contain the same data", but you seem to be hoping that the hash code implementation is based on the memory address or some other "unique to this instantiation" identifier so that something may not challenge itself.
You could always manage that identifier yourself, which would be safest (since the underlying implementation of hashCode() could change on a default version of a common data structure during a major Java version update for example), but it might be overkill.
Just download openJDK and modify the code as per your requirement. It'll work. But it'll work only in your own custom JDK that you have created by modifying OpenJDK. Use your custom JDK where you need to deploy your application.
With respect to 3 contracts mentioned below:
1) Whenever hashCode() is invoked on the same object more than once during an execution of an application, the hashCode method must consistently return the same integer, provided no information used in equals comparisons on the object is modified. This integer need not remain consistent from one execution of an application to another execution of the same application.
From this statement, i understand that, In a single execution of an application, if hashCode() is used one or more times on same object it should return same value.
2) If two objects are equal according to the equals(Object) method, then calling the hashCode() method on each of the two objects must produce the same integer result.
From this statement, i understand that, to perform the equality operation(in broad scope) in your subclass, There are at least four different degrees of equality.
(a) Reference equality(==), comparing the internal address of two reference type objects.
(b) Shallow structural equality: two objects are "equals" if all their fields are ==.
{ For example, two SingleLinkedList whose "size" fields are equal and whose "head" field point to the same SListNode.}
(c) Deep structural equality: two objects are "equals" if all their fields are "equals".
{For example, two SingleLinkedList that represent the same sequence of "items" (though the SListNodes may be different).}
(d) Logical equality. {Two examples:
(a) Two "Set" objects are "equals" if they contain the same elements, even if the underlying lists store the elements in different orders.
(b) The Fractions 1/3 and 2/6 are "equals", even though their numerators and denominators are all different.}
Based on above four categories of equality, second contract will hold good only: if(Say) equals() method returns truth value based on logical_equality between two objects then hashCode() method must also consider logical_equality amidst computation before generating the integer for every new object instead of considering internal address of a new object.
But i have a problem in understanding this third contract.
3) IT IS NOT REQUIRED that if two objects are unequal according to the equals(Object) method, then calling the hashCode() method on each of the two objects must produce distinct integer results. However, the programmer should be aware that producing distinct integer results for unequal objects may improve the performance of hash tables.
In second contract, As we are saying that hashCode() method should be accordingly[for ex: considering logical_equality before generating integer] implemented, I feel, It is not true to say that, if two objects are unequal according to equals(Object) then hashCode() method may produce same integer results as mentioned in third contract? As per the argument in second contract, hashCode() must produce distinct integer results. One just writing return 42 in hashCode() is breaking second contract!
please help me understand this point!
It would be impossible for hashCode() to always return different values for unequal objects. For example, there are 2^64 different Long values, but only 2^32 possible int values. Therefore the hashCode() method for Long has to have some repeats. In situations like this you have to try hard to ensure that your hashCode() method distributes values as evenly as possible, and is unlikely to produce repeats for the instances you are most likely to use in practice.
The second condition just says that two equal() instances must return the same hashCode() value, so this program must print true:
Long a = Long.MIN_VALUE;
Long b = Long.MIN_VALUE;
System.out.println(a.hashCode() == b.hashCode()); // a.equals(b), so must print true.
However this program also prints true:
Long c = 0L;
Long d = 4294967297L;
System.out.println(c.hashCode() == d.hashCode()); // prints true even though !c.equals(d)
hashCode() does not have to produce a distinct result. return 0; is a perfectly legal implementation of hashCode() - it ensures that two equal objects will have the same hash code. But it will ensure dismal performance when using HashMaps and HashSets.
It's preferable that hashCode() return values will be distinct (i.e., objects that are not equal should have different hash codes), but it's not required.
The second contract states what happens when equals() returns true. It does not say anything about the case when equals() returns false.
The third contract is just a reminder about that fact. It reminds you that when equals() is false for two objects, there is no connection between their hash codes. They may be same or different, as the implementation happens to make them.
The third point means that you can have many unequal objects with the same hashcode. . For example 2 string objects can have the same hashcode. The second point states that two equal objects must have the same hashcode. . return 5 is a valid hash implementation because it returns the same value for 2 equal objects.
I have a written a class like
public class HashCodeImpl{
public int hashCode(){
return 1;
}
public static void main(String[] args) {
// TODO Auto-generated method stub
HashCodeUtil h= new HashCodeUtil();
HashCodeUtil h1= new HashCodeUtil();
System.out.println(h.hashCode());
System.out.println(h1.hashCode());
System.out.println(h);
System.out.println(h1);
System.out.println(h==h1);
}
}
OutPut:
1
com.manu.test.HashCodeUtil#1
com.manu.test.HashCodeUtil#1 false
My question is: when my hashCode method is returning same value then why
System.out.println(h==h1);
is coming false?
Please explain.
Because they are two different object references. == compare the references, not the hashCode results.
To get a desired result, you may override the equals method in your class and use h1.equals(h2) to see if they're equivalent. Here you may use the result of hashCode to ease the evaluation of the equality of the objects being compared (this doesn't mean that two objects with the same hash code are equals).
But note that even if the objects have the same hashCode and are equivalent by the definition of the equals method, they are different references that occupy a different place in the heap.
As #ZouZou points out, hashCode equality does not equate to object equality. Having said that, you are not even comparing for object equality. Comparing two objects with == is a reference equality check, which you should almost never use, unless you really know what you're doing.
You're misunderstanding the purpose of the hashCode. As others have pointed out, == compares references and not hash codes. However, an overriding equals method, which compares values and not references, still wouldn't compare hash codes.
Think about it ... A hash code is an int, and therefore there are only 232 possible values for a hash code. But how many possible Strings are there? Many, many more than 232. (Since each char has 216 possible values, there are 248 possible Strings of length three, and the number just keeps growing the longer the Strings get.) Therefore, it is impossible to set up a scheme where two Strings are always equal if their hash codes are equal. The same is true of most other objects (although a class with a relatively small number of possible values could be set up with a unique hash code for each value).
The hashCode's purpose is to come up with a number that can be used for a hashMap or hashSet. We often try to come up with a function that will reduce that chance that unequal objects have unequal hash codes, in order to improve the efficiency of the map or set. But for most objects, of course, it's impossible to guarantee this.
Let's say I have two types A and B that both have a unique id field, here is how I usually implement the equals() and hashCode() methods:
#Override
public boolean equals(Object obj) {
return obj instanceof ThisType && obj.hashCode() == hashCode();
}
#Override
public int hashCode() {
return Arrays.hashCode(new Object[] { id });
}
In that case, given that A and B both have a 1-arg constructor to set their respective id field,
new A(1).equals(new A(1)) // prints true as expected,
new A(1).equals(new A(2)) // prints false as expected,
new A(1).equals(new B(1)) // prints false as expected.
But also,
new A(1).hashCode() == new B(1).hashCode() // prints true.
I wonder if it matters if two hashCodes are equal, even if the two objects aren't from the same type? Could hashCode() be used somewhere else than in equals()? If yes, to what purpose?
I thought about implementing the two methods as follow:
#Override
public boolean equals(Object obj) {
return obj != null && obj.hashCode() == hashCode();
}
#Override
public int hashCode() {
return Arrays.hashCode(new Object[] { getClass(), id });
}
Adding the class to the hashCode generation would solve this potential problem. What do you think? Is it necessary?
For objects of different classes the same hashCode() does not matter. The hashCode() only says that the objects are possibly the same. If e.g. HashSet encounters the same hashCode() it will test for equality with equals().
The rule is simple:
A.equals(B) implies B.hashcode() == A.hashcode()
B.hashcode() != A.hashcode() implies !A.equals(B)
There should be no other relations between the two. If you use hashcode() inside equals(), you should have a warning.
Hashcode is definitely not used in equals; it is used by collections based on the data structure called a hash table. It is always OK from the correctness standpoint for two hashcodes to equal each other; this is called a hash collision, it is unavoidable in the general case, and the only consequence is weaker performance.
Nothing wrong with two different objects (even of the same type) to have the equal hash code, but your second variant of equals() looks odd to me. It will work only if you can guarantee that your objects will be compared only to the objects of the same type.
Could hashCode() be used somewhere else than in equals()?
This method is supported for the benefit of hashtables such as those provided by java.util.Hashtable. from javadoc
Also
The general contract of hashCode is:
Whenever it is invoked on the same object more than once during an execution of a Java application, the hashCode method must consistently return the same integer, provided no information used in equals comparisons on the object is modified. This integer need not remain consistent from one execution of an application to another execution of the same application.
If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.
It is not required that if two objects are unequal according to the equals(java.lang.Object) method, then calling the hashCode method on each of the two objects must produce distinct integer results. However, the programmer should be aware that producing distinct integer results for unequal objects may improve the performance of hashtables.
It is not required that if two objects are unequal according to the equals(java.lang.Object) method, then calling the hashCode method on each of the two objects must produce distinct integer results. However, the programmer should be aware that producing distinct integer results for unequal objects may improve the performance of hashtables.
Not that, when A extends B, or B extends A, then your equals method is faulty, since:
a.equals(b) != b.equals(a)
if a and b happen to have the same hash code.
What is the difference between identity and equality in OOP (Object Oriented Programming)?
identity: a variable holds the
same instance as another variable.
equality: two distinct objects can
be used interchangeably. they often
have the same id.
Identity
For example:
Integer a = new Integer(1);
Integer b = a;
a is identical to b.
In Java, identity is tested with ==. For example, if( a == b ).
Equality
Integer c = new Integer(1);
Integer d = new Integer(1);
c is equal but not identical to d.
Of course, two identical variables are always equal.
In Java, equality is defined by the equals method. Keep in mind, if you implement equals you must also implement hashCode.
Identity determines whether two objects share the same memory address. Equality determines if two object contain the same state.
If two object are identical then they are also equal but just because two objects are equal dies not mean that they share the same memory address.
There is a special case for Strings but that is off topic and you'll need to ask someone else about how that works exactly ;-)
Identity means it is the same object instance while equality means the objects you compare are to different instances of an object but happen to contain the same data.
Illustration (in java)
Date a = new Date(123);
Date b = new Date(123);
System.out.println(a==b); //false
System.out.println(a.equals(b)); //true
So a and b are different instances (different allocations in memory) but on the "data" level they are equal.
For instance,
In StackOverFlow:
identity: I am Michael, you are
sevugarajan, so we are not same.
equality: if we have same reputation
scores, we are equal in some ways.
In Java and similar languages which 'leak' the abstraction of a reference of an object, you can test whether two references refer to the same object. If they refer to the same object, then the references are identical. In Java, this is the == operator.
There is also an equals method which is used to test whether two objects have the same value, for example when used as keys of a HashSet (the hash code of equal objects should also be equal). Equal objects should have the same 'value' and semantics when used by client code.
Purer object-oriented languages do not have an identity comparison, as client code generally shouldn't care whether or not two objects have the same memory address. If objects represent the same real-world entity, then that is better modelled using some ID or key value rather than identity, which then becomes part of the equals contract. Not relying on the memory address of the object to represent real-world identity simplifies caching and distributed behaviour, and suppressing == would remove a host of bugs in string comparison or some uses of boxing of primitives in Java.
Think about the words "identical" and "equivalent". If two things are identical, they have the same identity; they are same thing. If they are equivalent, one can be substituted for the other without affecting the outcome; they have the same behavior and properties.
Identity: Two references to the same object (o1 == o2).
Equality: The method o1.equals( o2 ) returns true. This doesn't necessarily mean that the two objects contain (all) the same data.
In theory it's possible to override a method equals() to return false even for identical objects. But this would break the specification of Object.equals():
The equals method implements an equivalence relation on non-null object references:
It is reflexive: for any non-null reference value x, x.equals(x) should return true.
It is symmetric: for any non-null reference values x and y, x.equals(y) should return true if and only if y.equals(x) returns true.
It is transitive: for any non-null reference values x, y, and z, if x.equals(y) returns true and y.equals(z) returns true, then x.equals(z) should return true.
It is consistent: for any non-null reference values x and y, multiple invocations of x.equals(y) consistently return true or consistently return false, provided no information used in equals comparisons on the objects is modified.
For any non-null reference value x, x.equals(null) should return false.
Identity concept is quite philosophical, that's why you shouldn't reconduce it just to references.
You can say that two identities are the same if a change to the first is reflected on the second and vice-versa. Of course this includes also sharing the same memory address but in general while identity is related to the attributes of the object, equality is used to check whenever two objects are identical, but this doesn't include identity.
The viceversa is quite obvious, if two items have the same identity they are also equal (in equality terms of being interchangeable).
For primitive types ( int , boolean , char, long , float ... )
== and != is equality test
and for Objects
== and != is identity test. [ it compares only the reference ]
equals method is used for equality test of Objects [ it can be overridden to compare specific attributes]
i found an excellent article on this
http://www.cs.cornell.edu/courses/cs211/2006sp/Lectures/L14-Comparison/L14cs211sp06.pdf
http://ocw.mit.edu/NR/rdonlyres/Electrical-Engineering-and-Computer-Science/6-170Fall-2005/D659DC53-FB1D-403C-8E35-2CAECBED266E/0/lec12.pdf
Quote
I like pigs. Dogs look up to us. Cats look down on us. Pigs treat us as equals. :D
Sir Winston Churchill
x == y is true only if there's the same object referenced by variables x and y.
x.equals(y) depends on the implementation of x.equals(), and is usually less strict that the above, as it compares the content of the object. (In Java, if x.equals(y), it must also be true that x.hashCode() == y.hashCode();)
Example:
Integer w = new Integer(3);
Integer x = new Integer(1);
Integer y = x;
Integer z = new Integer(1);
// all of these evaluate to true
y.equals(x) // it's the same object, of course the content is same
x.equals(z) // different objects, same content (`1`)
z.equals(y)
!w.equals(x); // the content is different (`3` vs `1`)
!w.equals(y);
!w.equals(z);
x == y // same object
z != x // different objects
y != z
w != x
Identical vs. Equal objects
Two objects are said to have identical states (deep equality) if the graphs representing their states are identical in every respect, including the OIDs at every level.
Two objects are said to have equal states (shallow equality) if the graphs representing their states are same, including all the corresponding atomic values. However, some corresponding internal nodes in the two graphs may have objects with different OIDs.
Example: This example illustrates the difference between the two definitions for comparing object
states for equality.
o2 = (i 2 , tuple, <a 1 :i 5 , a 2 :i 6 >)
o3 = (i 3 , tuple, <a 1 :i 4 , a 2 :i 6 >)
o4 = (i 4 , atom, 10)
o5 = (i 5 , atom, 10)
o6 = (i 6 , atom, 20)
In this example, the objects o1 and o2 have equal states (shallow equality), since their states at the atomic level are the same but the values are reached through distinct objects o 4 and o 5 .
However, the objects o1 and o3 have identical states (deep equality), even though the objects themselves are not because they have distinct OIDs. Similarly, although the states of o4 and o5 are identical, the actual objects o4 and o5 are equal but not identical, because they have distinct OIDs.
To add, identity is also known as referential check (references to objects, get it?) and equality as structural check by some authors. At the end of the day an object in memory abstract is just a map/table structure indexed at certain memory address. There can be one or many references (memory addresses) pointing to it. They all are referential-ly identical. When contents of same object is copied to another counterpart then both are structurally equal.