More on equals and hashCode - java

I understand the contract between the equals and hashCode methods. If equals is overridden, hashCode should also be. Can I override the hashCode method to always return the same value, say the int 23? Can I override the hashCode method to return a random number each time it is called?

You shouldn't override hashCode to return a random value. It should always return the same value for the same instance.
This is clearly stated in the Javadoc:
Whenever it is invoked on the same object more than once during an execution of a Java application, the hashCode method must consistently return the same integer, provided no information used in equals comparisons on the object is modified.
You shouldn't return a constant value, since it would make a very poor hashCode when used in classes such as HashMap and HashSet.
This is also mentioned in the Javadoc:
It is not required that if two objects are unequal according to the equals(java.lang.Object) method, then calling the hashCode method on each of the two objects must produce distinct integer results. However, the programmer should be aware that producing distinct integer results for unequal objects may improve the performance of hash tables.
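Both cases can be tried out with a small sketch. The class names below (UnstableKey, ConstantKey, HashCodeContractDemo) are hypothetical, and a call counter stands in for a random number so that the result is reproducible; a truly random hashCode fails in the same way, just not on every single call.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical class whose hashCode changes on every call.
// A counter is used instead of java.util.Random to keep the demo deterministic.
class UnstableKey {
    private int calls = 0;

    @Override
    public boolean equals(Object o) {
        return o == this;
    }

    @Override
    public int hashCode() {
        return ++calls; // a different value each time: violates the contract
    }
}

// Hypothetical class with a constant hashCode: legal, but every instance collides.
class ConstantKey {
    private final String name;

    ConstantKey(String name) {
        this.name = name;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof ConstantKey && ((ConstantKey) o).name.equals(name);
    }

    @Override
    public int hashCode() {
        return 23;
    }
}

class HashCodeContractDemo {
    // The set stores the key under its first hash code and looks it up under the second.
    static boolean unstableKeyIsFound() {
        Set<UnstableKey> set = new HashSet<>();
        UnstableKey key = new UnstableKey();
        set.add(key);
        return set.contains(key);
    }

    // Lookups still work with a constant hash code; they just degrade to equals() comparisons.
    static boolean constantKeyIsFound() {
        Set<ConstantKey> set = new HashSet<>();
        set.add(new ConstantKey("a"));
        set.add(new ConstantKey("b"));
        return set.size() == 2 && set.contains(new ConstantKey("b"));
    }

    public static void main(String[] args) {
        System.out.println(unstableKeyIsFound());  // false
        System.out.println(constantKeyIsFound());  // true
    }
}
```

The constant version is correct but puts every element into the same bucket, which is where the performance cost comes from.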

Related

Can you just return a field's hashCode() value in a hashCode() method?

While reviewing a large code base, I've often come across cases like this:
@Override
public int hashCode()
{
    return someFieldValue.hashCode();
}
where the programmer, instead of generating their own unique hash code for the class, simply inherits the hash code from a field value. My gut feeling (which might just as well be digestive problems) tells me that this is wrong, but I can't put my finger on it. What problems can arise, if any, with this sort of implementation?
This is fine if you want to hash your object based on a single property.
For example, in a Person class you might have an ID property that uniquely identifies a Person, so the hashCode() of Person can simply be the hash of that ID.
In addition, the hashCode() is related to the implementation of equals. If two objects are equal, they must have the same hashCode (the opposite doesn't have to be true - two non equal objects may still have the same hashCode). Therefore, if equality is determined by a single property (such as a unique ID), the hashCode method must also use only that single property.
This can be seen in the Javadoc of hashCode:
The general contract of hashCode is:
Whenever it is invoked on the same object more than once during an execution of a Java application, the hashCode method must consistently return the same integer, provided no information used in equals comparisons on the object is modified. This integer need not remain consistent from one execution of an application to another execution of the same application.
If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.
It is not required that if two objects are unequal according to the equals(java.lang.Object) method, then calling the hashCode method on each of the two objects must produce distinct integer results. However, the programmer should be aware that producing distinct integer results for unequal objects may improve the performance of hash tables.
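The single-property case described in this answer might look like the following sketch. The Person class, its id and name fields, and the choice of Long for the ID are assumptions made for illustration only.

```java
import java.util.Objects;

// Hypothetical Person whose identity is defined by a unique ID only.
class Person {
    private final Long id;     // assumed to uniquely identify a person
    private final String name; // deliberately not part of equals/hashCode

    Person(Long id, String name) {
        this.id = id;
        this.name = name;
    }

    String getName() {
        return name;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof Person)) {
            return false;
        }
        return Objects.equals(id, ((Person) o).id);
    }

    @Override
    public int hashCode() {
        // Delegates to the single field that equals() uses (0 if id is null).
        return Objects.hashCode(id);
    }
}
```

Because equals() and hashCode() look at exactly the same field, two Person objects with the same ID are equal and always share a hash code, whatever their names are.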
Technically speaking, you can return any consistent number from hashCode, even a constant value. The only requirement the contract places upon you is that equal objects must return the same hash code:
If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.
Theoretically, if all objects return, say, zero for their hashCode, the contract is formally satisfied. However, this makes hashCode completely useless.
The real question is whether you should do it or not. The answer depends on how unique the field is whose hash code you are returning. It is not uncommon to return the hashCode of an object's unique identifier as the object's hashCode. On the other hand, if a significant percentage of objects have the same value of someFieldValue, you would be better off using a different strategy for computing the hash code of your object.
hashCode() has to go with equals().
If the only property defining equality is, for example, an ID, you HAVE TO take care that your hash codes are equal when the IDs are equal.
The easiest way to accomplish this is by taking the hashCode() of your ID.
This is fine, if you really want to uniquely identify your object by this single property. Here is an article that explains what object identity really is.
As noted in the documentation of Object, your equals() and hashCode() need to be consistent with each other, so be sure to verify that they are based on the same properties.
So this means that you should ask yourself the question: do I really want the objects to be equal if only this single property is equal?
Finally, take great care when subclassing objects that have a custom equals() and hashCode() implementation. If you add properties to the identity of the object in the subclass, you will break the requirement that a.equals(b) == b.equals(a). (To see why this fails, think of a as an instance of the superclass and b as an instance of the subclass.)
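The subclassing problem can be reproduced with a short sketch. Point and ColorPoint are hypothetical classes chosen for illustration; the subclass adds a color field to the comparison.

```java
// Hypothetical superclass: identity is (x, y).
class Point {
    final int x;
    final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof Point)) {
            return false;
        }
        Point p = (Point) o;
        return x == p.x && y == p.y;
    }

    @Override
    public int hashCode() {
        return 31 * x + y;
    }
}

// Hypothetical subclass that adds a property to the identity.
class ColorPoint extends Point {
    final String color;

    ColorPoint(int x, int y, String color) {
        super(x, y);
        this.color = color;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof ColorPoint)) {
            return false; // a plain Point is never equal to a ColorPoint
        }
        return super.equals(o) && color.equals(((ColorPoint) o).color);
    }

    @Override
    public int hashCode() {
        return 31 * super.hashCode() + color.hashCode();
    }
}

class SymmetryDemo {
    public static void main(String[] args) {
        Point a = new Point(1, 2);
        ColorPoint b = new ColorPoint(1, 2, "red");
        System.out.println(a.equals(b)); // true: Point only compares x and y
        System.out.println(b.equals(a)); // false: a is not a ColorPoint
    }
}
```

Here a.equals(b) and b.equals(a) disagree, so collections that rely on equals() may behave differently depending on which object is the argument.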
Yes, technically you can do it, but someFieldValue has to be a non-primitive (an object) for that, since hashCode() cannot be called on a primitive.

Why is the value of hashCode the same when all are different String objects?

Why is the value of hashCode the same when these are all different String objects?
public class StringObj {
    public static void main(String[] args) {
        String s1 = "Jack";
        String s2 = new String("Jack");
        String s3 = new String("Jack");
        System.out.println(s1.hashCode());
        System.out.println(s2.hashCode());
        System.out.println(s3.hashCode());
    }
}
The Java documentation for Object says that if an object equals() another, it must have the same hashCode(). This makes sense, since both objects are, supposedly, representing the same thing.
From a practical standpoint, this is very important. It allows you to use a well known String to write into and read from a map instead of having to use a singleton object key.
Short answer:
The contract of hashCode and equals explicitly says that the hashCode() of two objects must be the same if equals returns true for them.
Note that equals on these strings will return true. So it is necessary that the hashCode is the same.
Long Answer
The contract of hashCode says:
The general contract of hashCode is:
Whenever it is invoked on the same object more than once during an execution of a Java application, the hashCode method must consistently return the same integer, provided no information used in equals comparisons on the object is modified. This integer need not remain consistent from one execution of an application to another execution of the same application.
If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.
It is not required that if two objects are unequal according to the equals(java.lang.Object) method, then calling the hashCode method on each of the two objects must produce distinct integer results. However, the programmer should be aware that producing distinct integer results for unequal objects may improve the performance of hash tables.
hashCode of String is computed as:
s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1]
using int arithmetic, where s[i] is the ith character of the
string, n is the length of the string, and ^ indicates
exponentiation. (The hash value of the empty string is zero.)
So the hashCode() of String uses the value of the String to compute the hash code. Many classes do that. As long as the content is the same, the hashCode will be the same, which is exactly what the hashCode contract requires of objects that are equal.
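The documented formula can be checked by hand. The sketch below uses a hypothetical helper, manualHash, that evaluates the same polynomial with Horner's rule in int arithmetic and compares it with String.hashCode() for the strings from the question.

```java
class StringHashDemo {
    // Evaluates s[0]*31^(n-1) + ... + s[n-1] with int overflow, as documented for String.hashCode().
    static int manualHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i);
        }
        return h;
    }

    public static void main(String[] args) {
        String s1 = "Jack";
        String s2 = new String("Jack");

        System.out.println(s1 == s2);                       // false: two distinct objects
        System.out.println(s1.equals(s2));                  // true: same content
        System.out.println(s1.hashCode() == s2.hashCode()); // true: hash depends on content only
        System.out.println(manualHash(s1) == s1.hashCode()); // true
        System.out.println(s1.hashCode());                  // 2300927
    }
}
```

The three strings in the question are different objects, but the hash code never looks at object identity, only at the characters.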

Different fields for equals and hashcode

I agree with this statement from the post "What issues should be considered when overriding equals and hashCode in Java?":
Use the same set of fields that you use to compute equals() to compute hashCode().
But I have some doubts:
Is it absolutely necessary to use the same fields?
If yes, what happens if I don't use the same fields?
Will it affect HashMap performance or HashMap accuracy?
The fields don't have to be the same. The requirement is that two objects that are equal must have the same hash code. Two objects that have the same hash code don't have to be equal. From the Javadoc:
Whenever it is invoked on the same object more than once during an execution of a Java application, the hashCode method must consistently return the same integer, provided no information used in equals comparisons on the object is modified. This integer need not remain consistent from one execution of an application to another execution of the same application.
If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.
It is not required that if two objects are unequal according to the equals(java.lang.Object) method, then calling the hashCode method on each of the two objects must produce distinct integer results. However, the programmer should be aware that producing distinct integer results for unequal objects may improve the performance of hash tables.
For example, you could return 1 as your hash code always, and you would obey the hash code contract, no matter what fields you used in your equals method.
Returning 1 all the time would improve the computation time of hashCode, but HashMap's performance would drop since it would have to resort to equals() more often.
Is it absolutely necessary to use the same fields?
Yes, if you don't want any surprises.
If yes, what happens if I don't use the same fields?
You might get different hash codes for objects that are equal according to equals(), which violates the equals and hashCode contract, since equal objects are required to have equal hash codes.
For example, suppose you have three fields: a, b and c. You use a and b in equals(), and all three fields in hashCode(). Then two objects whose a and b are equal but whose c differs are equal according to equals(), yet they will generally have different hash codes.
Will it affect HashMap performance or HashMap accuracy?
It's not about performance, but yes your map will not behave as expected.
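The a, b, c scenario from this answer can be reproduced with a minimal sketch. MismatchedKey and MismatchDemo are hypothetical names; equals() compares a and b only, while hashCode() also mixes in c.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical key class that breaks the contract: hashCode uses a field that equals ignores.
class MismatchedKey {
    final int a;
    final int b;
    final int c;

    MismatchedKey(int a, int b, int c) {
        this.a = a;
        this.b = b;
        this.c = c;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof MismatchedKey)) {
            return false;
        }
        MismatchedKey k = (MismatchedKey) o;
        return a == k.a && b == k.b; // c is ignored here
    }

    @Override
    public int hashCode() {
        return Objects.hash(a, b, c); // but c is used here
    }
}

class MismatchDemo {
    // Stores one key and looks it up with an "equal" key whose c differs.
    static boolean lookupSucceeds() {
        Set<MismatchedKey> set = new HashSet<>();
        set.add(new MismatchedKey(1, 2, 3));
        return set.contains(new MismatchedKey(1, 2, 4));
    }

    public static void main(String[] args) {
        MismatchedKey k1 = new MismatchedKey(1, 2, 3);
        MismatchedKey k2 = new MismatchedKey(1, 2, 4);
        System.out.println(k1.equals(k2));    // true
        System.out.println(lookupSucceeds()); // false: the set never finds the equal key
    }
}
```

This is the "map will not behave as expected" case: the lookup fails even though an equal element is in the set.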
The fields used in hashCode() can be a subset of the fields used in equals().
This still abides by the rule "whenever a.equals(b), a.hashCode() must be the same as b.hashCode()".
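A sketch of the subset case, with the hypothetical class SubsetKey: equals() compares a and b, while hashCode() uses a alone. Equal objects necessarily agree on a, so they always share a hash code.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical key class: hashCode uses a subset (a) of the fields used by equals (a, b).
class SubsetKey {
    final int a;
    final int b;

    SubsetKey(int a, int b) {
        this.a = a;
        this.b = b;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof SubsetKey)) {
            return false;
        }
        SubsetKey k = (SubsetKey) o;
        return a == k.a && b == k.b;
    }

    @Override
    public int hashCode() {
        return Integer.hashCode(a); // only a: valid, but keys sharing a will collide
    }
}

class SubsetDemo {
    static boolean lookupSucceeds() {
        Set<SubsetKey> set = new HashSet<>();
        set.add(new SubsetKey(1, 2));
        set.add(new SubsetKey(1, 3)); // same hash code, different key
        return set.size() == 2 && set.contains(new SubsetKey(1, 3));
    }

    public static void main(String[] args) {
        System.out.println(lookupSucceeds()); // true
    }
}
```

The trade-off is more collisions: every key with the same a lands in the same bucket and has to be told apart by equals().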

Correct implementation of hashcode/equals

I have a class
class Pair<T>{
private T data;
private T alternative;
}
Two pair objects would be equal if
this.data.equals(that.data) && this.alternative.equals(that.alternative) ||
this.data.equals(that.alternative) && this.alternative.equals(that.data)
I'm having difficulty correctly implementing the hashCode() part though. Any suggestions would be appreciated
You should use the hashCode of data and alternative like this:
return this.data.hashCode() + this.alternative.hashCode();
It is not the best approach, though: if you change data or alternative, the hash code of the pair will also change. Think about whether you really need to use this class as a key in a map; if not, a Long or String would be a better candidate for the key.
This should do the trick:
@Override
public int hashCode() {
    return data.hashCode() * alternative.hashCode();
}
Since you want to include both fields in equals, you need to include both fields in the hashCode method. It is acceptable for unequal objects to end up with the same hash code, but with this method objects that are equal according to your scheme will always end up with the same hash code.
Refer to the Javadoc; the general contract of hashCode is (copied from the Javadoc):
- Whenever it is invoked on the same object more than once during an
execution of a Java application, the hashCode method must
consistently return the same integer, provided no information used in
equals comparisons on the object is modified. This integer need not
remain consistent from one execution of an application to another
execution of the same application.
- If two objects are equal according to the equals(Object) method, then
calling the hashCode method on each of the two objects must produce
the same integer result.
- It is not required that if two objects are unequal according to the
equals(java.lang.Object) method, then calling the hashCode method on
each of the two objects must produce distinct integer results.
However, the programmer should be aware that producing distinct
integer results for unequal objects may improve the performance of
hashtables.
According to your implementation of equals, data and alternative are interchangeable. So you need to make sure that your hashCode implementation returns the same value if you swap data.hashCode() and alternative.hashCode(). If you are not sure, just return a constant value such as 1 (but this may cause performance issues when you put the objects into a Map).
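Putting the answers together, one possible complete version of the class is sketched below. The constructor, the final modifiers and the null handling via java.util.Objects are additions for illustration and are not part of the original question.

```java
import java.util.Objects;

// Sketch of an order-insensitive pair: (x, y) equals (y, x).
final class Pair<T> {
    private final T data;
    private final T alternative;

    Pair(T data, T alternative) {
        this.data = data;
        this.alternative = alternative;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof Pair)) {
            return false;
        }
        Pair<?> that = (Pair<?>) o;
        return (Objects.equals(data, that.data) && Objects.equals(alternative, that.alternative))
            || (Objects.equals(data, that.alternative) && Objects.equals(alternative, that.data));
    }

    @Override
    public int hashCode() {
        // Addition is commutative, so swapping the two fields cannot change the result.
        return Objects.hashCode(data) + Objects.hashCode(alternative);
    }
}
```

Any commutative combination (sum, product, XOR) satisfies the contract here. XOR returns 0 whenever both members have the same hash code, and a product returns 0 whenever either member's hash code is 0, so the sum is used in this sketch.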

Consistent hashcode for an object in java

In Java, is it possible to get a consistent hash code for an object across multiple runs of the application?
Sure. If it is a String for example, then String.hashCode() gives a consistent hashcode each time you run the application.
You only get into trouble if the hashcode incorporates something other than the values of the object's component fields; e.g. an identity hashcode. And of course, this means that the object class needs to override Object.hashCode() at some point, because that method gives you an identity hashcode.
FOLLOW UP
Judging from comments on other answers, the OP still seems to be pursuing the illusory goal of a unique hash function; i.e. some function that will map (for example) any String to a hashcode that is unique for all possible Strings.
Unfortunately this is impossible in the general case, and in this case. Furthermore, it is a simple matter to construct a proof that a String to int hash function that generates unique int values is mathematically impossible. (I won't bore you with the details ... but the basis of the proof is that there are more String values than int values.)
In fact, the only situation where such a hash function is possible is when the set of all possible values of input type has a size that is no greater than the number of possible values of the integer type. There are hash functions that will map a byte, char, short or int to a unique int, but a hash function that maps long values to unique int values is impossible.
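The impossibility argument is easy to see in practice. The sketch below (CollisionDemo is a hypothetical name) shows two distinct strings and two distinct long values that share a hash code in the standard library.

```java
class CollisionDemo {
    public static void main(String[] args) {
        // 'A'*31 + 'a' = 65*31 + 97 = 2112 and 'B'*31 + 'B' = 66*31 + 66 = 2112
        System.out.println("Aa".hashCode()); // 2112
        System.out.println("BB".hashCode()); // 2112
        System.out.println("Aa".equals("BB")); // false

        // Long.hashCode folds 64 bits into 32 by XOR-ing the two halves,
        // so 0L and -1L both end up with hash code 0.
        System.out.println(Long.hashCode(0L));  // 0
        System.out.println(Long.hashCode(-1L)); // 0
    }
}
```

Collisions like these are unavoidable whenever the input type has more possible values than int, which is why hash-based collections always fall back on equals().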
It depends on how the object's hashCode() method is implemented.
It can even be:
public int hashCode() {
    return 1;
}
No, not for objects in general. Objects with their own hashcode method will probably be consistent across runs.
Implement/override the public int hashCode() method all objects have?
You have to decide what makes two objects the same. Usually this is based on the content of one or more fields. In that case, you should base hashCode() (and equals()) on these fields.
However, I would suggest that you don't rely on the hashCode being the same between runs of the application. This is highly likely to break when you change the code, and very hard to fix when it does; e.g. if you add or remove a field which is part of the hashCode, or change the way the hashCode is calculated or anything it depends on, the hashCode will change.
What are you trying to do? This sounds like a problem where a different solution would be better.
Looking in the contract of hashCode:
Whenever it is invoked on the same object more than once during an execution of a Java application, the hashCode method must consistently return the same integer, provided no information used in equals comparisons on the object is modified. This integer need not remain consistent from one execution of an application to another execution of the same application.
If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.
It is not required that if two objects are unequal according to the equals(java.lang.Object) method, then calling the hashCode method on each of the two objects must produce distinct integer results. However, the programmer should be aware that producing distinct integer results for unequal objects may improve the performance of hashtables.
So it is not guaranteed that the hashCode is equal between executions of the application. In reality, there are quite a few hashCode implementations that return the same value across executions: String and the types used for boxing (like Integer) have a consistent return value for hashCode. Objects that only combine member hashCodes, where each member has a consistent return value, also feature this consistency. So, in practice, it is rather common for hashCode to return a value that is consistent across executions.
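The difference between value-based and identity-based hash codes can be sketched as follows. ValuePoint and ConsistencyDemo are hypothetical names; the printed constants follow from the documented algorithms of String, Integer and Objects.hash.

```java
import java.util.Objects;

// Hypothetical value class: hashCode combines only field values.
class ValuePoint {
    private final int x;
    private final int y;

    ValuePoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof ValuePoint)) {
            return false;
        }
        ValuePoint p = (ValuePoint) o;
        return x == p.x && y == p.y;
    }

    @Override
    public int hashCode() {
        return Objects.hash(x, y); // 31 * (31 * 1 + x) + y
    }
}

class ConsistencyDemo {
    public static void main(String[] args) {
        // Value-based hash codes: the same on every run.
        System.out.println("Jack".hashCode());               // 2300927
        System.out.println(Integer.valueOf(42).hashCode());  // 42
        System.out.println(new ValuePoint(1, 2).hashCode()); // 994

        // Identity-based hash code: no override, so the value is not
        // guaranteed to be the same from one run to the next.
        System.out.println(new Object().hashCode());
    }
}
```

Even for value-based classes, the stability only holds as long as the hashing algorithm and the set of fields stay unchanged, which is why persisting hash codes between runs is fragile.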
