Java - TreeSet and hashCode() - java

I have a quick question about TreeSet collections and hashCode methods. I have a TreeSet and I'm adding objects to it. Before I add an object, I check whether it already exists in the TreeSet using the contains method.
I have 2 distinct objects, each of which produces a distinct hashCode using my implementation of the hashCode method, shown below:
public int hashCode()
{
int hash = 7;
hash = hash * 31 + anAttribute.hashCode();
hash = hash * 31 + anotherAttribute.hashCode();
hash = hash * 31 + yetAnotherAttribute.hashCode();
return hash;
}
The hashCodes for a particular run are: 76126352 and 76126353 (the objects only differ by one digit in one attribute).
The contains method is returning true for these objects, even though the hashCodes are different. Any ideas why? This is really confusing and help would really be appreciated.

TreeSet does not use hashCode at all. It uses either compareTo or the Comparator you passed to the constructor. This is used by methods like contains to find objects in the set.
So the answer to your question is that your compareTo method or your Comparator is defined so that the two objects in question are considered equal.
From the javadocs:
a TreeSet instance performs all
element comparisons using its
compareTo (or compare) method, so two
elements that are deemed equal by this
method are, from the standpoint of the
set, equal.
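A minimal sketch of the behaviour described above. The Item class and its name and serial fields are hypothetical, invented only for illustration: compareTo looks at name alone, while hashCode also mixes in serial, so two elements can have different hash codes and still be the same element as far as the TreeSet is concerned.

```java
import java.util.TreeSet;

// Hypothetical element type: compareTo looks only at `name`,
// while hashCode also mixes in `serial`.
class Item implements Comparable<Item> {
    final String name;
    final int serial;

    Item(String name, int serial) {
        this.name = name;
        this.serial = serial;
    }

    @Override
    public int compareTo(Item other) {
        return name.compareTo(other.name); // `serial` is ignored here
    }

    @Override
    public int hashCode() {
        return 31 * name.hashCode() + serial; // differs when `serial` differs
    }
}

class TreeSetCompareDemo {
    // Returns what contains() reports for an element that only compares equal.
    static boolean containsLookalike() {
        TreeSet<Item> set = new TreeSet<>();
        set.add(new Item("widget", 1));
        return set.contains(new Item("widget", 2));
    }

    public static void main(String[] args) {
        Item a = new Item("widget", 1);
        Item b = new Item("widget", 2);
        System.out.println(a.hashCode() == b.hashCode()); // false
        System.out.println(containsLookalike());          // true
    }
}
```

If contains should distinguish the two objects, the comparison has to take the differing attribute into account as well.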

From the Javadoc:
If two objects are equal according to the equals(Object) method,
then calling the hashCode method on each of the two objects must
produce the same integer result.
This means that the objects you are hashing are not equal.

You need to read Joshua Bloch's "Effective Java" chapter 3. It explains the equals contract and how to properly override equals, hashCode, and compareTo.

You don't need to check whether it is contained, because add() basically does the same operation (i.e. searching for the proper position) on its way to the insertion point. If the object can't be inserted (i.e., an equal object is already contained), add() returns false.
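A small sketch of relying on the return value instead of a separate contains call. TreeSet's insertion method is add(), which returns a boolean. The helper name addTwice is made up for this example.

```java
import java.util.TreeSet;

class TreeSetAddDemo {
    // Adds the same value twice and returns both results of add().
    static boolean[] addTwice(String value) {
        TreeSet<String> set = new TreeSet<>();
        boolean first = set.add(value);  // true: the element was inserted
        boolean second = set.add(value); // false: an equal element is already present
        return new boolean[] { first, second };
    }

    public static void main(String[] args) {
        boolean[] results = addTwice("alpha");
        System.out.println(results[0] + " " + results[1]); // true false
    }
}
```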

Related

Why does Java need equals() if there is hashCode()?

If two objects return same hashCode, doesn't it mean that they are equal? Or we need equals to prevent collisions?
And can I implement equals by comparing hashCodes?
If two objects have the same hashCode then they are NOT necessarily equal. Otherwise you will have discovered the perfect hash function. But the opposite is true - if the objects are equal, then they must have the same hashCode.
hashCode and equals provide different information about objects.
Consider an analogy with people, where the hash code is the birthday.
In that scenario, you and many other people share the same birthday (same hash code), yet you are not all the same person.
Why does Java need equals() if there is hashCode()?
Java needs equals() because it is the method through which object equality is tested by examining classes, fields, and other conditions the designer considers to be part of an equality test.
The purpose of hashCode() is to provide a hash value primarily for use by hash tables; though it can also be used for other purposes. The value returned is based on an object's fields and hash codes of its composite and/or aggregate objects. The method does not take into account the class or type of object.
The relationship between equals() and hashCode() is an implication.
Two objects being equal implies that they have the same hash code.
Two objects having the same hash code does not imply that they are equal.
The latter does not hold for several reasons:
There is a chance that two distinct objects may return the same hash code. Keep in mind that a hash value folds information from a large amount of data into a smaller number.
Two objects from different classes with similar fields will most likely use the same type of hash function, and return equal hash values; yet, they are not the same.
hashCode() can be implementation-specific returning different values on different JVMs or JVM target installations.
Within the same JVM, hashCode() can be used as a cheap precursor to an equality test: compare the hash codes first and run the full equality test only if they match. This pays off only when the equality test is significantly more expensive than generating a hash code.
And can I implement equals by comparing hashCodes?
No. As mentioned, equal hash codes do not imply equal objects.
The hashCode method as stated in the Oracle Docs is a numeric representation of an object in Java. This hash code has limited possible values (represented by the values which can be stored in an int).
For a more complex class, there is a high possibility that you will find two different objects which have the same hash code value. Also, no one stops you from doing this inside any class.
class Test {
@Override
public int hashCode() {
return 0;
}
}
So, it is not recommended to implement the equals method by comparing hash codes. You should use them for comparison only if you can guarantee that each object has a unique hash code. In most cases, your only certainty is that if two objects are equal using o1.equals(o2) then o1.hashCode() == o2.hashCode().
In the equals method you can define a more complex logic for comparing two objects of the same class.
If two objects return same hashCode, doesn't it mean that they are equal?
No it doesn't mean that.
The javadocs for Object state this:
The general contract of hashCode is:
Whenever it is invoked on the same object more than once during an execution of a Java application, the hashCode method must consistently
return the same integer, provided no information used in equals
comparisons on the object is modified. ...
If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must
produce the same integer result.
It is not required that if two objects are unequal according to the equals(java.lang.Object) method, then calling the hashCode method on
each of the two objects must produce distinct integer results. ...
Note the third statement. It plainly says "No" to your question.
There is another way to look at this.
The hashCode returns an int.
There are only 2^32 distinct values that an int can take.
If a.hashCode() == b.hashCode() implies a.equals(b), then there can be only 2^32 distinct (i.e. mutually unequal) objects at any given time in a running Java application.
That last point is plainly not true. Indeed, it is demonstrably not true if you have a large enough heap to hold 2^32 instances of java.lang.Object ... in a 64-bit JVM.
And a third way is to look at some well-known examples where two different two-character strings have the same hash code.
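One such pair, as a quick sketch: "Aa" and "BB" are a commonly cited example. The arithmetic in the comment follows the formula documented for String.hashCode().

```java
class HashCollisionDemo {
    public static void main(String[] args) {
        // String.hashCode() is specified as s[0]*31^(n-1) + ... + s[n-1], so
        // 'A'*31 + 'a' = 65*31 + 97 = 2112 and 'B'*31 + 'B' = 66*31 + 66 = 2112.
        System.out.println("Aa".hashCode());   // 2112
        System.out.println("BB".hashCode());   // 2112
        System.out.println("Aa".equals("BB")); // false
    }
}
```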
Given that your assumption is incorrect, the reasoning that follows from it is also incorrect.
Java does need an equals method.
You generally cannot implement equals using just hashCode.
You may be able to use hashCode to implement a faster equals method, but only if calling hashCode twice is faster than comparing two objects. It generally isn't.
hashCodes are equal -> Objects might be equal -> further comparison is required
hashCodes are different -> Objects are not equal (if hashCode is implemented right)
That is how hash-based collections use the two methods together. First the hash codes are compared. If they match, equals must still check the fields to see whether the objects really are equal. If the hash codes differ, you can be sure that the objects are not equal.
Sometimes (very often?) you don't!
These answers are not untrue. But they don't tell the whole story.
One example would be where you are creating a load of objects of class SomeClass, and each instance that is created is given a unique ID by incrementing a static variable, nInstanceCount, or some such, in the constructor:
iD = nInstanceCount++;
Your hash function could then be
@Override
public int hashCode(){
return iD;
}
and your equals could then be
@Override
public boolean equals( Object obj ){
if( ! ( obj instanceof SomeClass )){
return false;
}
return hashCode() == obj.hashCode();
}
... under such circumstances your idea that "equals is superfluous" is effectively true: if all classes behaved like this, Java 10 (or Java 23) might say, ah, let's just get rid of silly old equals, what's the point? (NB backwards compatibility would then go out the window).
There are two essential points:
you couldn't then create more than MAXINT instances of SomeClass. Or... you could ... if you set up a system for reassigning the IDs of previously destroyed instances. IDs are typically long rather than int ... but this wouldn't work because hashCode() returns int.
none of these objects could then be "equal" to another one, since equality = identity for this particular class, as you have defined it. Often this is desirable. Often it shuts off whole avenues of possibilities...
The necessary implication of your question is, perhaps, what's the use of these two methods which, in a rather annoying way, have to "cooperate"? Frelling, in his/her answer, alluded to the crucial point: hash codes are needed for sorting into "buckets" with classes like HashMap. It's well worth reading up on this: the amount of advanced maths that has gone into designing efficient "bucket" mechanisms for classes like HashMap is quite frightening. After reading up on it you may come to have (like me) a bit of understanding and reverence about how and why you should bother implementing hashCode() with a bit of thought!

what is the disadvantage of overriding equals and not hashcode and vice versa?

I know there are lots of similar questions out there, but I have not been satisfied by the answers I have read. I tried to figure it out but I still did not get the idea.
What I know is that these two methods are important when using a set or map, especially HashSet, HashMap, or hash-based collections in general, which use a hashing mechanism to store elements.
Both methods are used to test whether two objects are equal.
For two objects A and B to be equal, they first need to have the same hash value (that is, be in the same bucket), and second, A.equals(B) must return true.
What I do not understand is why it is necessary to override both of these methods.
What happens if we do not override hashCode? Is it a must to override both? If not, what is the disadvantage of overriding equals and not hashCode, and vice versa?
Properly implementing hashCode is necessary for your object to be a key in hash-based containers. It is not necessary for anything else.
Here's why it is important for hash-based containers such as HashMap, HashSet, ConcurrentHashMap etc.
At a high level, a HashMap is an array, indexed by the hashCode of the key, whose entries are "chains" - lists of (key, value) pairs where all keys in a particular chain have the same hash code. For a refresher on hashtables, see Wikipedia.
Consider what happens if two keys a and b are equal, but have different hash codes - for example, a.hashCode() == 42 and b.hashCode() == 37. Suppose you write:
hashTable.put(a, "foo");
hashTable.get(b);
Since the keys are equal, you would like the result to be "foo", right?
However, get(b) will look into the chain corresponding to hash 37, while the pair (a, "foo") is located in the chain corresponding to hash 42, so the lookup will fail and you'll get null.
This is why it is important that equal objects have equal hash codes if you intend to use the object as a key in a hash-based container.
Note that if you use a non-hash based container, such as TreeMap, then you don't have to implement hashCode because the container doesn't use it. Instead, in case of TreeMap, you should implement compareTo - other types of containers may have their own requirements.
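The failed lookup described in this answer can be reproduced with a short sketch. BrokenKey and its forcedHash field are hypothetical and exist only to force the hash codes 42 and 37 from the example. A real class would not take its hash code as a constructor argument.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical key that deliberately breaks the contract:
// equals() compares only `name`, but hashCode() returns whatever it was given.
class BrokenKey {
    final String name;
    final int forcedHash;

    BrokenKey(String name, int forcedHash) {
        this.name = name;
        this.forcedHash = forcedHash;
    }

    @Override
    public boolean equals(Object obj) {
        return obj instanceof BrokenKey && name.equals(((BrokenKey) obj).name);
    }

    @Override
    public int hashCode() {
        return forcedHash;
    }
}

class BrokenKeyDemo {
    // Stores a value under one key and looks it up with an equal key
    // whose hash code is `lookupHash`.
    static String lookup(int storedHash, int lookupHash) {
        Map<BrokenKey, String> hashTable = new HashMap<>();
        hashTable.put(new BrokenKey("k", storedHash), "foo");
        return hashTable.get(new BrokenKey("k", lookupHash));
    }

    public static void main(String[] args) {
        System.out.println(lookup(42, 37)); // null: equal keys, different hash codes
        System.out.println(lookup(42, 42)); // foo: equal keys, equal hash codes
    }
}
```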
Yes, that is correct: when you override the equals method you have to override the hashCode method as well. The reason is that hash-based collections treat two objects as equal only if their equals method returns true and their hashCode methods return the same integer value. In a hash-based collection such as HashMap, the equality check first calls hashCode on both objects, and only if both return the same value is equals called. If the hash codes differ, the collection simply treats the objects as not equal. By default, hashCode returns a value that is unrelated to the object's fields and typically differs between instances. So if you make two objects equal under some condition by overriding equals, a hash-based collection will still not treat them as equal, because their hash codes differ. To make their hash codes equal you have to override hashCode too. Otherwise you won't be able to use the object reliably as a key in your hash map.

How to compare two java objects

I have two java objects that are instantiated from the same class.
MyClass myClass1 = new MyClass();
MyClass myClass2 = new MyClass();
I set both of their properties to the exact same values and then check whether they are the same:
if(myClass1 == myClass2){
// objects match
...
}
if(myClass1.equals(myClass2)){
// objects match
...
}
However, neither of these approaches returns true. I have checked the properties of each and they match.
How do I compare these two objects to verify that they are identical?
You need to provide your own implementation of equals() in MyClass.
@Override
public boolean equals(Object other) {
if (!(other instanceof MyClass)) {
return false;
}
MyClass that = (MyClass) other;
// Custom equality check here.
return this.field1.equals(that.field1)
&& this.field2.equals(that.field2);
}
You should also override hashCode() if there's any chance of your objects being used in a hash table. A reasonable implementation would be to combine the hash codes of the object's fields with something like:
@Override
public int hashCode() {
int hashCode = 1;
hashCode = hashCode * 37 + this.field1.hashCode();
hashCode = hashCode * 37 + this.field2.hashCode();
return hashCode;
}
See this question for more details on implementing a hash function.
You need to override equals and hashCode.
equals will compare the objects for equality according to the properties you need, and hashCode is mandatory for your objects to be used correctly in hash-based Collections and Maps.
You need to implement the equals() method in your MyClass.
The reason that == didn't work is this is checking that they refer to the same instance. Since you did new for each, each one is a different instance.
The reason that equals() didn't work is that you haven't implemented it yourself yet. Its default behavior, inherited from Object, is the same as ==.
Note that you should also implement hashCode() if you're going to implement equals(), because a lot of java.util collections expect that.
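To make the difference concrete, here is a small sketch. PlainPoint and ValuePoint are hypothetical stand-ins for MyClass: one without overrides and one with equals and hashCode based on the same fields.

```java
import java.util.Objects;

// No equals() override: inherits Object.equals, i.e. reference comparison.
class PlainPoint {
    final int x, y;
    PlainPoint(int x, int y) { this.x = x; this.y = y; }
}

// equals() and hashCode() overridden together, based on the same fields.
class ValuePoint {
    final int x, y;
    ValuePoint(int x, int y) { this.x = x; this.y = y; }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) return true;
        if (!(obj instanceof ValuePoint)) return false;
        ValuePoint that = (ValuePoint) obj;
        return x == that.x && y == that.y;
    }

    @Override
    public int hashCode() {
        return Objects.hash(x, y);
    }
}

class EqualityDemo {
    public static void main(String[] args) {
        PlainPoint p1 = new PlainPoint(1, 2), p2 = new PlainPoint(1, 2);
        ValuePoint v1 = new ValuePoint(1, 2), v2 = new ValuePoint(1, 2);
        System.out.println(p1 == p2);      // false: two separate instances
        System.out.println(p1.equals(p2)); // false: default equals is ==
        System.out.println(v1 == v2);      // false: still two instances
        System.out.println(v1.equals(v2)); // true: fields match
        System.out.println(v1.hashCode() == v2.hashCode()); // true
    }
}
```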
You have to correctly override the equals() method from class Object.
Edit: I think that my first response was misunderstood, probably because I was not too precise. So I decided to add more explanation.
Why do you have to override equals()? Well, because it is up to the developer to decide what it means for two objects to be equal. Reference equality is not enough in most cases.
For example, imagine that you have a HashMap whose keys are of type Person. Each person has a name and an address. Now, you want to find the detailed bean for a given key. The problem is that you usually are not able to create an instance with the same reference as the one in the map. What you do is create another instance of class Person. Clearly, operator == will not work here and you have to use equals().
But now we come to another problem. Let's imagine that your collection is very large and you want to execute a search. The naive implementation would compare your key object with every instance in the map using equals(). That, however, would be very expensive. And here comes hashCode(). As others pointed out, a hash code is a single number that does not have to be unique. The important requirement is that whenever equals() gives true for two objects, hashCode() must return the same value for both of them. The inverse implication does not hold, which is a good thing, because the hash code separates our keys into buckets. We have a small number of instances of class Person in a single bucket. When we execute a search, the algorithm can jump right away to the correct bucket and only then execute equals for each instance in it. The implementation of hashCode() therefore must distribute objects as evenly as possible across buckets.
There is one more point. Some collections require a proper implementation of hashCode() in the classes used as elements or keys, and not only for performance reasons. Examples are HashSet and LinkedHashSet. If those classes don't override hashCode(), the default Object hashCode() method will allow multiple objects that you might consider "meaningfully equal" to be added to your "no duplicates allowed" set.
Some of the collections that use hashCode()
HashSet
LinkedHashSet
HashMap
Have a look at these two classes from Apache Commons, which allow you to implement equals() and hashCode() easily:
EqualsBuilder
HashCodeBuilder
1) == evaluates reference equality in this case
2) I'm not too sure about equals, but why not simply override the compare method and put it inside MyClass?

Correct implementation of hashcode/equals

I have a class
class Pair<T>{
private T data;
private T alternative;
}
Two pair objects would be equal if
this.data.equals(that.data) && this.alternative.equals(that.alternative) ||
this.data.equals(that.alternative) && this.alternative.equals(that.data)
I'm having difficulty correctly implementing the hashCode() part though. Any suggestions would be appreciated
You should use the hashCode from data and alternative like this :
return this.data.hashCode() + this.alternative.hashCode();
This is not the best approach, though: if you change data or alternative, the hash code changes as well. Consider whether you really need to use this class as a key in a map, or whether a Long or String would be a better candidate.
This should do the trick:
@Override
public int hashCode() {
return data.hashCode() * alternative.hashCode();
}
Since you want to include both fields in equals, you need to include both fields in the hashCode method. It is acceptable for unequal objects to end up with the same hash code, and with this method objects that are equal under your scheme will always have the same hash code.
Refer to the Javadoc. The general contract of hashCode is (copied from the Javadoc):
- Whenever it is invoked on the same object more than once during an
execution of a Java application, the hashCode method must
consistently return the same integer, provided no information used in
equals comparisons on the object is modified. This integer need not
remain consistent from one execution of an application to another
execution of the same application.
- If two objects are equal according to the equals(Object) method, then
calling the hashCode method on each of the two objects must produce
the same integer result.
- It is not required that if two objects are unequal according to the
equals(java.lang.Object) method, then calling the hashCode method on
each of the two objects must produce distinct integer results.
However, the programmer should be aware that producing distinct
integer results for unequal objects may improve the performance of
hashtables.
From your implementation of equals, data and alternative are interchangeable. So you need to make sure that your hashCode implementation returns the same value when data.hashCode() and alternative.hashCode() swap positions. If you are not sure, just return a constant value such as 1 (but this may cause performance issues when you put the objects into a Map).
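Putting the suggestions together, a sketch of an order-independent Pair might look like the following. The constructor, the final fields, and the assumption that both fields are non-null are additions for illustration and are not part of the original class.

```java
// Sketch of the Pair from the question with an order-independent equals/hashCode.
// Assumes both fields are non-null.
class Pair<T> {
    private final T data;
    private final T alternative;

    Pair(T data, T alternative) {
        this.data = data;
        this.alternative = alternative;
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) return true;
        if (!(obj instanceof Pair)) return false;
        Pair<?> that = (Pair<?>) obj;
        return (data.equals(that.data) && alternative.equals(that.alternative))
            || (data.equals(that.alternative) && alternative.equals(that.data));
    }

    @Override
    public int hashCode() {
        // Addition is commutative, so swapping the two fields gives the same result.
        return data.hashCode() + alternative.hashCode();
    }
}

class PairDemo {
    public static void main(String[] args) {
        Pair<String> ab = new Pair<>("a", "b");
        Pair<String> ba = new Pair<>("b", "a");
        System.out.println(ab.equals(ba));                  // true
        System.out.println(ab.hashCode() == ba.hashCode()); // true
    }
}
```

Any commutative combination (sum, product, XOR) satisfies the contract here. Which one distributes best depends on the data.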

Java: overriding equals method doesn't do the trick when looking for a key of hashtable?

I have a hashtable looking like this:
Hashtable<Mapping, Integer> mappingCount = new Hashtable<Mapping, Integer>();
I want to use this code:
if (mappingCount.get(currentMapping) != null)
mappingCount.put(currentMapping, mappingCount.get(currentMapping) + 1);
else
mappingCount.put(currentMapping, 1);
In order to be able to get the value from the hashtable, for the class Mapping I did the following:
@Override
public boolean equals(Object obj) {
return ((Mapping)obj).mappingXML.equals(this.mappingXML);
}
However, this doesn't do the trick since mappingCount.get(currentMapping) always results in null. To be sure that something's not wrong, I did the following:
if (aaa.contains(currentMapping.getMappingXML()))
System.out.println("found it!");
else
aaa.add(currentMapping.getMappingXML());
where aaa is List<String> aaa = new ArrayList<String>(). Of course, found it is printed many times. What am I doing wrong?
You also need to override the hashCode() method.
From the JavaDocs:
To successfully store and retrieve
objects from a hashtable, the objects
used as keys must implement the
hashCode method and the equals method.
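The reason for this is that Hashtable uses hashCode as a preliminary test to see if two objects are equal. If the hash codes match, it then uses equals to confirm the match, since different keys can collide.
The default implementation of hashCode() typically returns a value derived from the object's identity rather than its fields, so two distinct instances usually get different hash codes even when equals says they are equal. The contract requires that two objects that are equal also have equal hash codes.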
Also look at the general contract for hashCode().
All of the recommendations to override equals and hash code correctly are spot on; Joshua Bloch tells you how to do it properly.
But an equally important requirement is that keys in maps must be immutable. If your class can change its values, then the equals and hash code can change after you add it to the map; disaster ensues.
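A short sketch of the failure mode, assuming a hypothetical MutableKey class whose equals and hashCode both depend on a mutable id field:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical mutable key: equals() and hashCode() both depend on `id`.
class MutableKey {
    int id;

    MutableKey(int id) { this.id = id; }

    @Override
    public boolean equals(Object obj) {
        return obj instanceof MutableKey && id == ((MutableKey) obj).id;
    }

    @Override
    public int hashCode() {
        return id;
    }
}

class MutableKeyDemo {
    // Inserts a key, mutates it, then tries to find the entry again.
    static String lookupAfterMutation() {
        Map<MutableKey, String> map = new HashMap<>();
        MutableKey key = new MutableKey(1);
        map.put(key, "value");
        key.id = 2;          // the entry still sits in the bucket chosen for hash 1
        return map.get(key); // searches the bucket for hash 2
    }

    public static void main(String[] args) {
        System.out.println(lookupAfterMutation()); // null
    }
}
```

The entry is still in the map, but it can no longer be found through the mutated key, nor through a fresh key carrying the old value.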
Whenever you override equals, you must override hashCode as well.
You need to override hashCode as well.
From the Object#hashCode doc:
Returns a hash code value for the
object. This method is supported for
the benefit of hashtables such as
those provided by java.util.Hashtable.
The general contract of hashCode is:
Whenever it is invoked on the same object more than once during an
execution of a Java application, the
hashCode method must consistently
return the same integer, provided no
information used in equals comparisons
on the object is modified. This
integer need not remain consistent
from one execution of an application
to another execution of the same
application.
If two objects are equal according to the equals(Object) method, then
calling the hashCode method on each of
the two objects must produce the same
integer result.
It is not required that if two objects are unequal according to the
equals(java.lang.Object) method, then
calling the hashCode method on each of
the two objects must produce distinct
integer results. However, the
programmer should be aware that
producing distinct integer results for
unequal objects may improve the
performance of hashtables.
As much as is reasonably practical,
the hashCode method defined by class
Object does return distinct integers
for distinct objects. (This is
typically implemented by converting
the internal address of the object
into an integer, but this
implementation technique is not
required by the Java™ programming
language.)
You have to implement hashCode() as well!
Example:
public class Employee{
int employeeId;
String name;
Department dept;
// other methods would be in here
@Override
public int hashCode() {
int hash = 1;
hash = hash * 17 + employeeId;
hash = hash * 31 + name.hashCode();
hash = hash * 13 + (dept == null ? 0 : dept.hashCode());
return hash;
}
}
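Applied to the Mapping class from the question, a sketch could look like this. Only the mappingXML field, the mappingCount name, and the counting logic come from the question. The constructor and the count helper are assumptions made for the example.

```java
import java.util.Hashtable;
import java.util.Objects;

// Sketch of the Mapping class; equals() and hashCode() use the same field.
class Mapping {
    private final String mappingXML;

    Mapping(String mappingXML) { this.mappingXML = mappingXML; }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) return true;
        if (!(obj instanceof Mapping)) return false; // also handles null
        return Objects.equals(mappingXML, ((Mapping) obj).mappingXML);
    }

    @Override
    public int hashCode() {
        return Objects.hashCode(mappingXML);
    }
}

class MappingCountDemo {
    // Counts how many times equal Mapping instances occur.
    static Hashtable<Mapping, Integer> count(String... xmls) {
        Hashtable<Mapping, Integer> mappingCount = new Hashtable<>();
        for (String xml : xmls) {
            Mapping currentMapping = new Mapping(xml);
            Integer previous = mappingCount.get(currentMapping);
            mappingCount.put(currentMapping, previous == null ? 1 : previous + 1);
        }
        return mappingCount;
    }

    public static void main(String[] args) {
        Hashtable<Mapping, Integer> counts = count("<a/>", "<b/>", "<a/>");
        System.out.println(counts.get(new Mapping("<a/>"))); // 2
        System.out.println(counts.get(new Mapping("<b/>"))); // 1
    }
}
```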
