Java HashSet is allowing dupes; problem with comparable? - java

I've got a class, "Accumulator", that implements the Comparable compareTo method, and I'm trying to put these objects into a HashSet.
When I add() to the HashSet, I don't see any activity in my compareTo method in the debugger, regardless of where I set my breakpoints. Additionally, when I'm done with the add()s, I see several duplicates within the Set.
What am I screwing up, here; why is it not Comparing, and therefore, allowing the dupes?
Thanks,
IVR Avenger

What am I screwing up, here?
HashSet is based on hashCode(), not on compareTo(). You may be confusing it with TreeSet. In both cases, be sure to also implement equals() in a manner that is consistent with the other method.

You need to correctly implement hashCode() and equals().
You must override hashCode and return a number based on the values in your class such that any two equal objects have the same hashcode.

HashSet uses the hashCode() and equals() methods to prevent duplicates from being added. First, it gets the hash code of the object you want to add. Then, it finds the corresponding bucket for that hash code and iterates through each object in that bucket, using the equals() method to see if any identical objects already exist in the set.
Your debugger is not breaking on compareTo() because it is never used with HashSet!
The rules are:
If two objects are equal, then their hash codes
must be equal.
But if two objects' hash codes
are equal, then this doesn't mean
the objects are equal! It could be
that the two objects just happen to have the same hash.

When hashCode return different values for 2 objects, then equal is not used. Btw, compareTo has nothing to do with hashing collections :) but sorted collections

Your objects are Comparable, and probably you've implemented equals() too, but HashSets deal with object hashes, and odds are you haven't implemented hashCode() (or your implementation of hashCode() doesn't return the same hash for two objects that are (a.equals(b) == true).

One thing which people tends to ignore which result in a huge mistake.
While defining equals method always take the parameter as object class and then conver the object to your desired class.
For eg
public bolean equals(Object aSong){
if(!(aSoneg instanceof Song)){
return false;
}
Song s=(Song) aSong;
return getTitle().equals(s.getTitle());
}
If u pass write Song aSong instead of Object aSong your equals method will never get called.
Hope this helps

HashSet uses hashCode and equals. TreeSet uses the Comparable interface. Note: if you decide to override either hashcode or equals, you should always override the other.

When ever you create an object of class Accumulator it takes new space in JVM and returns unique hashCode every time you add an object in hashSet. It does not depends upon the value of the object because you have not overridden hashCode() method hence it will call Object class hashCode() method which will return unique hashCode with every object created in your program.
Solution:
Override hashCode() and equals() method and apply your logic depending upon the properties of your class. Be sure to read equals and hashcode contract
http://www.ibm.com/developerworks/java/library/j-jtp05273/index.html

Related

Compare two objects and add them in a Set [duplicate]

This question already has answers here:
Compare two objects with .equals() and == operator
(16 answers)
Closed 9 years ago.
I have two java objects that are instantiated from the same class.
MyClass myClass1 = new MyClass();
MyClass myClass2 = new MyClass();
If I set both of their properties to the exact same values and then verify that they are the same
if(myClass1 == myClass2){
// objects match
...
}
if(myClass1.equals(myClass2)){
// objects match
...
}
However, neither of these approaches return a true value. I have checked the properties of each and they match.
How do I compare these two objects to verify that they are identical?
You need to provide your own implementation of equals() in MyClass.
#Override
public boolean equals(Object other) {
if (!(other instanceof MyClass)) {
return false;
}
MyClass that = (MyClass) other;
// Custom equality check here.
return this.field1.equals(that.field1)
&& this.field2.equals(that.field2);
}
You should also override hashCode() if there's any chance of your objects being used in a hash table. A reasonable implementation would be to combine the hash codes of the object's fields with something like:
#Override
public int hashCode() {
int hashCode = 1;
hashCode = hashCode * 37 + this.field1.hashCode();
hashCode = hashCode * 37 + this.field2.hashCode();
return hashCode;
}
See this question for more details on implementing a hash function.
You need to Override equals and hashCode.
equals will compare the objects for equality according to the properties you need and hashCode is mandatory in order for your objects to be used correctly in Collections and Maps
You need to implement the equals() method in your MyClass.
The reason that == didn't work is this is checking that they refer to the same instance. Since you did new for each, each one is a different instance.
The reason that equals() didn't work is because you didn't implement it yourself yet. I believe it's default behavior is the same thing as ==.
Note that you should also implement hashcode() if you're going to implement equals() because a lot of java.util Collections expect that.
You have to correctly override method equals() from class Object
Edit: I think that my first response was misunderstood probably because I was not too precise. So I decided to to add more explanations.
Why do you have to override equals()? Well, because this is in the domain of a developer to decide what does it mean for two objects to be equal. Reference equality is not enough for most of the cases.
For example, imagine that you have a HashMap whose keys are of type Person. Each person has name and address. Now, you want to find detailed bean using the key. The problem is, that you usually are not able to create an instance with the same reference as the one in the map. What you do is to create another instance of class Person. Clearly, operator == will not work here and you have to use equals().
But now, we come to another problem. Let's imagine that your collection is very large and you want to execute a search. The naive implementation would compare your key object with every instance in a map using equals(). That, however, would be very expansive. And here comes the hashCode(). As others pointed out, hashcode is a single number that does not have to be unique. The important requirement is that whenever equals() gives true for two objects, hashCode() must return the same value for both of them. The inverse implication does not hold, which is a good thing, because hashcode separates our keys into kind of buckets. We have a small number of instances of class Person in a single bucket. When we execute a search, the algorithm can jump right away to a correct bucket and only now execute equals for each instance. The implementation for hashCode() therefore must distribute objects as evenly as possible across buckets.
There is one more point. Some collections require a proper implementation of a hashCode() method in classes that are used as keys not only for performance reasons. The examples are: HashSet and LinkedHashSet. If they don’t override hashCode(), the default Object
hashCode() method will allow multiple objects that you might consider "meaningfully
equal" to be added to your "no duplicates allowed" set.
Some of the collections that use hashCode()
HashSet
LinkedHashSet
HashMap
Have a look at those two classes from apache commons that will allow you to implement equals() and hashCode() easily
EqualsBuilder
HashCodeBuilder
1) == evaluates reference equality in this case
2) im not too sure about the equals, but why not simply overriding the compare method and plant it inside MyClass?

Can you just return a field's hashCode() value in a hashCode() method?

While reviewing a large code base, I've often come across cases like this:
#Override
public int hashCode()
{
return someFieldValue.hashCode();
}
where the programmer, instead of generating their own unique hash code for the class, simply inherits the hash code from a field value. My gut feeling (which might just as well be digestive problems) tells me that this is wrong, but I can't put my finger on it. What problems can arise, if any, with this sort of implementation?
This is fine if you want to hash your object based on a single property.
For example, in a Person class you might have an ID property that uniquely identifies a Person, so the hashCode() of Person can simply be the hash of that ID.
In addition, the hashCode() is related to the implementation of equals. If two objects are equal, they must have the same hashCode (the opposite doesn't have to be true - two non equal objects may still have the same hashCode). Therefore, if equality is determined by a single property (such as a unique ID), the hashCode method must also use only that single property.
This can be seen in the JavaDoc of hashCode :
The general contract of hashCode is:
Whenever it is invoked on the same object more than once during an execution of a Java application, the hashCode method must consistently return the same integer, provided no information used in equals comparisons on the object is modified. This integer need not remain consistent from one execution of an application to another execution of the same application.
If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.
It is not required that if two objects are unequal according to the equals(java.lang.Object) method, then calling the hashCode method on each of the two objects must produce distinct integer results. However, the programmer should be aware that producing distinct integer results for unequal objects may improve the performance of hash tables.
Technically speaking, you can return any consistent number from hashCode, even a constant value. The only requirement the contract places upon you is that equal objects must return the same hash code:
If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.
Theoretically, if all objects return, say, zero for their hashCode, the contract is formally satisfied. However, this makes hashCode completely useless.
The real question is whether you should do it or not. The answer depends on how unique is the field the hash code of which you are returning. It is not uncommon to return the hashCode of a unique identifier of an object for the object's hashCode. On the other hand, if a significant percentage of objects have the sane value of someFieldValue, you would be better off using a different strategy for making the hash code of your object.
hashCode() has to go with equals().
If the only property defining equalness is, for example, an ID, you HAVE TO take care that your hash codes are equal when the ID is equal.
The easiest way to accomplish this is by taking the hashCode() of your ID.
This is fine, if you really want to uniquely identify your object by this single property. Here is an article that explains what object identity really is.
As noted in the documentation of Object, your equals() and hashCode() need to incorporate the same properties, be sure to verify that.
So this means that you should ask yourself the question: do I really want the objects to be equal if only this single property is equal?
Finally do take great care when subclassing objects with a custom equals() and hashcode() implementation, if you want to add properties to the identity of the object, you will break the requirement that a.equals(b) == b.equals(a) (to see why this fails thing about this as a being the super class and b being the subclass.
yes you can do it technically, you need a non-primitive somefieldValue for that.

HashMap not returning values based on key

I'm trying to use a HashMap with my class Cell as the key. However, after putting an item into the HashMap, calling contains on the item will return false.
public static void main(String args[]) {
HashMap<Cell, String> map = new HashMap<Cell, String>();
map.put(new Cell(0,0), "Bob");
System.out.println(map.containsKey(new Cell(0,0)));
System.out.println(new Cell(0,0).equals(new Cell(0,0)));
}
This prints out false and true, where it should print true and true, since according to the Map docs containsKey uses .equals(). What am I doing wrong?
This is most likely because you don't have equals() and hashCode() implemented. In Java, the rule of thumb is that if you implement one, you must implement the other. In your case, it's mandatory because HashMap makes use of them.
You created two separate objects with two separate addresses. Without these methods, the JVM has no way of knowing that the objects are "the same."
See http://docs.oracle.com/javase/7/docs/api/java/lang/Object.html#hashCode()
Consider how a HashMap is implemented. When putting, it first calculates the object hashCode() to figure out which bucket to place the object in. When it tries to retrieve an object, it again gets its hashCode(), identifies to target bucket, goes through the linked list in the bucket, calling equals() against each object. It returns if it finds a match.
In other words, when you use HashMap you need to have a correct and matching implementation of equals() and hashCode().
The default hashCode() method inherited from Object does not correctly return the same hashCode() unless the object references are the same. In your case they are not.
calling new Cell(0,0) several times produce different objects with different hash Codes. You should implement hashCode for Cell class.
You likely forgot to implement the hashcode() function for Cell, which is also required in order to use a user defined class in a HashMap. Here is a simple and generally accurate way to implement a hashcode() function:
int hashcode(){
return (field1.toString()+field2.toString()+...+fieldN.toString()).hashcode();
}
Where field1 to fieldN are the fields in your class. If the fields are primatives (ie int) just take out the toString().

How to compare two java objects [duplicate]

This question already has answers here:
Compare two objects with .equals() and == operator
(16 answers)
Closed 9 years ago.
I have two java objects that are instantiated from the same class.
MyClass myClass1 = new MyClass();
MyClass myClass2 = new MyClass();
If I set both of their properties to the exact same values and then verify that they are the same
if(myClass1 == myClass2){
// objects match
...
}
if(myClass1.equals(myClass2)){
// objects match
...
}
However, neither of these approaches return a true value. I have checked the properties of each and they match.
How do I compare these two objects to verify that they are identical?
You need to provide your own implementation of equals() in MyClass.
#Override
public boolean equals(Object other) {
if (!(other instanceof MyClass)) {
return false;
}
MyClass that = (MyClass) other;
// Custom equality check here.
return this.field1.equals(that.field1)
&& this.field2.equals(that.field2);
}
You should also override hashCode() if there's any chance of your objects being used in a hash table. A reasonable implementation would be to combine the hash codes of the object's fields with something like:
#Override
public int hashCode() {
int hashCode = 1;
hashCode = hashCode * 37 + this.field1.hashCode();
hashCode = hashCode * 37 + this.field2.hashCode();
return hashCode;
}
See this question for more details on implementing a hash function.
You need to Override equals and hashCode.
equals will compare the objects for equality according to the properties you need and hashCode is mandatory in order for your objects to be used correctly in Collections and Maps
You need to implement the equals() method in your MyClass.
The reason that == didn't work is this is checking that they refer to the same instance. Since you did new for each, each one is a different instance.
The reason that equals() didn't work is because you didn't implement it yourself yet. I believe it's default behavior is the same thing as ==.
Note that you should also implement hashcode() if you're going to implement equals() because a lot of java.util Collections expect that.
You have to correctly override method equals() from class Object
Edit: I think that my first response was misunderstood probably because I was not too precise. So I decided to to add more explanations.
Why do you have to override equals()? Well, because this is in the domain of a developer to decide what does it mean for two objects to be equal. Reference equality is not enough for most of the cases.
For example, imagine that you have a HashMap whose keys are of type Person. Each person has name and address. Now, you want to find detailed bean using the key. The problem is, that you usually are not able to create an instance with the same reference as the one in the map. What you do is to create another instance of class Person. Clearly, operator == will not work here and you have to use equals().
But now, we come to another problem. Let's imagine that your collection is very large and you want to execute a search. The naive implementation would compare your key object with every instance in a map using equals(). That, however, would be very expansive. And here comes the hashCode(). As others pointed out, hashcode is a single number that does not have to be unique. The important requirement is that whenever equals() gives true for two objects, hashCode() must return the same value for both of them. The inverse implication does not hold, which is a good thing, because hashcode separates our keys into kind of buckets. We have a small number of instances of class Person in a single bucket. When we execute a search, the algorithm can jump right away to a correct bucket and only now execute equals for each instance. The implementation for hashCode() therefore must distribute objects as evenly as possible across buckets.
There is one more point. Some collections require a proper implementation of a hashCode() method in classes that are used as keys not only for performance reasons. The examples are: HashSet and LinkedHashSet. If they don’t override hashCode(), the default Object
hashCode() method will allow multiple objects that you might consider "meaningfully
equal" to be added to your "no duplicates allowed" set.
Some of the collections that use hashCode()
HashSet
LinkedHashSet
HashMap
Have a look at those two classes from apache commons that will allow you to implement equals() and hashCode() easily
EqualsBuilder
HashCodeBuilder
1) == evaluates reference equality in this case
2) im not too sure about the equals, but why not simply overriding the compare method and plant it inside MyClass?

Contains and equals

I'm a little befuddled by some code:
for (AbstractItem item : mSetOfItems) {
if (item.equals(pPrimaryItem))
{
System.out.println("Contains? " + mSetOfItems.contains(pPrimaryItem));
}
}
How could it be possible that item.equals(pPrimaryItem) resolves as true, and mSetOfItems.contains(pPrimaryItem) resolves as false? Because that's what I'm seeing in my code.
In other words, if I iterate through my set, I can find an element equal to my test element. But if I use contains, my test elements is not reported being in the set. I'm baffled because I thought contains used equals. What could I be overlooking?
You didn't give the type of mSetOfItems, but I'm guessing that AbstractItem overrides .equals() but not .hashcode(). This is bad.
If mSetOfItems uses hashcode for lookup, which it could based on its type, you'll get the behavior you described.
Your assumption is that .contains() is implemented with iteration and .equals(). There's no list interface which guarantees that.
What is the implementation of mSetOfItems?
If it's a tree, it could be that your comparison function returns inconsistent values.
If it's a hash, it could be that your equals() returns true for objects with different hash codes, or that the object's hashCode() has changed since it was inserted into the set.
If your set is a TreeSet or some other set where you're using a custom comparator, then you could see this if the comparator was broken, either by not returning a valid sorted order or by having objects that are actually equal compare unequal. When the set internally looks up an element and uses the comparator, it would make a wrong choice and not see the element.
If your set is a HashSet, your hash function could be broken and cause two objects that are equal to have different hash code. Internally as the HashSet uses the object's hash code to figure out where to look, it might end up looking in the wrong bucket.
Alternatively, if you store objects in a Set of any sort and then modify them, you might end up breaking some internal invariant of the Set. For instance, if you store something in a HashSet and then change its value, it will be in the wrong bucket, and if you have a TreeSet and change the value it may appear in the wrong spot in sorted order.
If you are concurrently modifying the set, it's possible that you might have added the element in another thread but not had any guarantees that the operation that made that change be visible in another thread. The second thread would then not see the element even if it were added.
Check the hashcode() method of your class
If mSetOfItems is a java.util.HashTable (or similar 'Hash' Collection, Set, etc) then you must implement hashCode() as well. boolean contains(Object elem) will first try to find the passed object by calculating its hash and retrieving it in the Collection. Once contains finds something, it will then use the equals method to verify that the two objects are the same objects according to your implementation.
If not properly overridden, hashCode() will return an unpredictable int that is usually the integer representation of the internal address of the object itself. This will always be different for two distinct objects no matter the the values of their instance variables. If not overridden, contains won't be able to find it any objects...
When implementing hashCode() remind that:
Whenever it is invoked on the same object more than once during an execution of a Java application, the hashCode method must consistently return the same integer, provided no information used in equals comparisons on the object is modified. This integer need not remain consistent from one execution of an application to another execution of the same application.
If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.
It is not required that if two objects are unequal according to the equals(java.lang.Object) method, then calling the hashCode method on each of the two objects must produce distinct integer results. However, the programmer should be aware that producing distinct integer results for unequal objects may improve the performance of hashtables.
Also, make sure that you properly overridden the equals function by respecting its signature:
public boolean equals(Object obj);

Categories

Resources