Why am I getting duplicate keys in Java HashMap? [duplicate] - java

This question already has answers here:
Java 1.7 Override of hashCode() not behaving as I would expect
(2 answers)
Closed 6 years ago.
I seem to be getting duplicate keys in the standard Java HashMap. By "duplicate", I mean the keys are equal by their equals() method. Here is the problematic code:
import java.util.Map;
import java.util.HashMap;
public class User {
private String userId;
public User(String userId) {
this.userId = userId;
}
public boolean equals(User other) {
return userId.equals(other.getUserId());
}
public int hashCode() {
return userId.hashCode();
}
public String toString() {
return userId;
}
public static void main(String[] args) {
User arvo1 = new User("Arvo-Part");
User arvo2 = new User("Arvo-Part");
Map<User,Integer> map = new HashMap<User,Integer>();
map.put(arvo1,1);
map.put(arvo2,2);
System.out.println("arvo1.equals(arvo2): " + arvo1.equals(arvo2));
System.out.println("map: " + map.toString());
System.out.println("arvo1 hash: " + arvo1.hashCode());
System.out.println("arvo2 hash: " + arvo2.hashCode());
System.out.println("map.get(arvo1): " + map.get(arvo1));
System.out.println("map.get(arvo2): " + map.get(arvo2));
System.out.println("map.get(arvo2): " + map.get(arvo2));
System.out.println("map.get(arvo1): " + map.get(arvo1));
}
}
And here is the resulting output:
arvo1.equals(arvo2): true
map: {Arvo-Part=1, Arvo-Part=2}
arvo1 hash: 164585782
arvo2 hash: 164585782
map.get(arvo1): 1
map.get(arvo2): 2
map.get(arvo2): 2
map.get(arvo1): 1
As you can see, the equals() method on the two User objects is returning true and their hash codes are the same, yet they each form a distinct key in map. Furthermore, map continues to distinguish between the two User keys in the last four get() calls.
This directly contradicts the documentation:
More formally, if this map contains a mapping from a key k to a value v such that (key==null ? k==null : key.equals(k)), then this method returns v; otherwise it returns null. (There can be at most one such mapping.)
Is this a bug? Am I missing something here? I'm running Java version 1.8.0_92, which I installed via Homebrew.
EDIT: This question has been marked as a duplicate of this other question, but I'll leave this question as is because it identifies a seeming inconsistency with equals(), whereas the other question assumes the error lies with hashCode(). Hopefully the presence of this question will make this issue more easily searchable.

The issue lies in your equals() method. The signature of Object.equals() is equals(Object), but yours is equals(User), so these are two completely different methods (an overload, not an override), and HashMap calls the one with the Object parameter. You can verify this by putting an @Override annotation on your equals: it will generate a compiler error.
The equals method should be:
@Override
public boolean equals(Object other) {
if(other instanceof User){
User user = (User) other;
return userId.equals(user.userId);
}
return false;
}
As a best practice, always put @Override on the methods you override; it can save you a lot of trouble.
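To see the difference between overloading and overriding in isolation, here is a minimal sketch. The class names Overloaded and Overridden are made up for this illustration and are not part of the question's code; the only difference between them is the parameter type of equals.

```java
import java.util.HashMap;
import java.util.Map;

class OverloadVsOverrideDemo {

    // Hypothetical key type: equals(Overloaded) only OVERLOADS Object.equals(Object).
    static final class Overloaded {
        final String id;
        Overloaded(String id) { this.id = id; }
        public boolean equals(Overloaded other) { return id.equals(other.id); }
        @Override public int hashCode() { return id.hashCode(); }
    }

    // Hypothetical key type: equals(Object) really OVERRIDES Object.equals(Object).
    static final class Overridden {
        final String id;
        Overridden(String id) { this.id = id; }
        @Override public boolean equals(Object other) {
            return other instanceof Overridden && id.equals(((Overridden) other).id);
        }
        @Override public int hashCode() { return id.hashCode(); }
    }

    static int sizeWithOverload() {
        Map<Overloaded, Integer> map = new HashMap<>();
        map.put(new Overloaded("Arvo-Part"), 1);
        map.put(new Overloaded("Arvo-Part"), 2);
        return map.size(); // HashMap falls back to Object.equals, i.e. identity
    }

    static int sizeWithOverride() {
        Map<Overridden, Integer> map = new HashMap<>();
        map.put(new Overridden("Arvo-Part"), 1);
        map.put(new Overridden("Arvo-Part"), 2);
        return map.size(); // the second put replaces the value of the first
    }

    public static void main(String[] args) {
        System.out.println("overload: " + sizeWithOverload()); // 2 entries
        System.out.println("override: " + sizeWithOverride()); // 1 entry
    }
}
```

Calling a.equals(b) directly on two Overloaded references still picks the equals(Overloaded) overload at compile time, which is why the question's println reports true while the map disagrees.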

Your equals method does not override equals, and the types in the Map are erased at runtime, so the actual equals method called is equals(Object). Your equals should look more like this:
@Override
public boolean equals(Object other) {
if (!(other instanceof User))
return false;
User u = (User)other;
return userId.equals(u.userId);
}

OK, so first of all, the code doesn't compile. It calls a method that the class does not define:
other.getUserId()
But aside from that, you need to actually override the equals method and mark it with @Override. An IDE like Eclipse can also generate equals and hashCode for you, by the way.
@Override
public boolean equals(Object obj)
{
if(this == obj)
return true;
if(obj == null)
return false;
if(getClass() != obj.getClass())
return false;
User other = (User) obj;
if(userId == null)
{
if(other.userId != null)
return false;
}
else if(!userId.equals(other.userId))
return false;
return true;
}

As others answered, you had a problem with the equals method signature. According to Java equals best practice, you should implement equals like the following:
@Override
public boolean equals(Object o) {
if (this == o) return true;
if (o == null || getClass() != o.getClass()) return false;
User user = (User) o;
return userId.equals(user.userId);
}
The same applies to the hashCode() method. See Overriding equals() and hashCode() method in Java
The Second Problem
You don't have duplicates anymore, but you have a new problem: your HashMap contains only one element:
map: {Arvo-Part=2}
This is because both User objects have the same userId, "Arvo-Part" (the two literals even refer to the same interned String, although that is incidental). From the HashMap's perspective your two objects are the same key, since they have equal hash codes and are equal according to equals(). So when you put the second object into the HashMap, you overwrite the value stored for the first one.
To avoid this problem, make sure you use a unique ID for each User.
A simple demonstration on your users :
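The original demonstration is not included here. As a rough sketch of the idea, assuming a User class with the corrected equals() from above and the made-up IDs "arvo-1" and "arvo-2", it might look like this:

```java
import java.util.HashMap;
import java.util.Map;

class UniqueUserIdDemo {

    // User with a properly overridden equals(Object), as suggested in the answers.
    static final class User {
        private final String userId;
        User(String userId) { this.userId = userId; }
        @Override public boolean equals(Object o) {
            if (this == o) return true;
            if (o == null || getClass() != o.getClass()) return false;
            return userId.equals(((User) o).userId);
        }
        @Override public int hashCode() { return userId.hashCode(); }
        @Override public String toString() { return userId; }
    }

    // Two users sharing one ID collapse into a single key.
    static Map<User, Integer> sameId() {
        Map<User, Integer> map = new HashMap<>();
        map.put(new User("Arvo-Part"), 1);
        map.put(new User("Arvo-Part"), 2); // overwrites the value 1
        return map;
    }

    // Hypothetical unique IDs keep the two users apart.
    static Map<User, Integer> distinctIds() {
        Map<User, Integer> map = new HashMap<>();
        map.put(new User("arvo-1"), 1);
        map.put(new User("arvo-2"), 2);
        return map;
    }

    public static void main(String[] args) {
        System.out.println("same id:      " + sameId());      // {Arvo-Part=2}
        System.out.println("distinct ids: " + distinctIds().size() + " entries");
    }
}
```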

Related

Why do equals() implementations always start from self checking even when it's redundant? [duplicate]

This question already has answers here:
regarding using this in implementing equals for comparing objects in Java
(5 answers)
Closed 1 year ago.
I am learning to override Java's equals() method, and I can understand the correctness of many tutorials such as the following from https://www.baeldung.com/java-hashcode#handling-hash-collisions.
public class User {
private long id;
private String name;
private String email;
// standard getters/setters/constructors
@Override
public int hashCode() {
return 1;
}
@Override
public boolean equals(Object o) {
if (this == o) return true;
if (o == null) return false;
if (this.getClass() != o.getClass()) return false;
User user = (User) o;
return id == user.id
&& (name.equals(user.name)
&& email.equals(user.email));
}
// getters and setters here
}
My question is about the fact that the implementation starts with a self-check, like
if (this == o) return true;
but this line seems to be redundant. If o refers to the same object,
the last check
User user = (User) o;
return id == user.id
&& (name.equals(user.name)
&& email.equals(user.email));
will be true as well.
I have googled a lot, but cannot find any topic related to it.
Why does every implementation of equals() start with a self-check even when there is no need for it?
Is this a performance issue or something?
The first call to == is an optimization. If this and o are the exact same object (i.e. this == o returns true), there's no need to perform the subsequent operations of going over all the object's properties and comparing them one by one.
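A small experiment can make the shortcut visible. This is only a sketch: the Doc class, its selfCheck flag, and the fieldComparisons counter are invented for the illustration and would not appear in real code.

```java
class SelfCheckDemo {
    // Counts how often the field-by-field comparison actually runs.
    static int fieldComparisons = 0;

    // Hypothetical value type; the selfCheck flag exists only for this experiment.
    static final class Doc {
        final String body;
        final boolean selfCheck;
        Doc(String body, boolean selfCheck) {
            this.body = body;
            this.selfCheck = selfCheck;
        }

        @Override public boolean equals(Object o) {
            if (selfCheck && this == o) return true; // the shortcut under discussion
            if (o == null || getClass() != o.getClass()) return false;
            fieldComparisons++;                      // the (potentially expensive) part
            return body.equals(((Doc) o).body);
        }

        @Override public int hashCode() { return body.hashCode(); }
    }

    // Compares an object with itself and reports how many field comparisons ran.
    static int comparisonsForSelfEquals(boolean selfCheck) {
        fieldComparisons = 0;
        Doc d = new Doc("some long text", selfCheck);
        if (!d.equals(d)) throw new IllegalStateException("an object must equal itself");
        return fieldComparisons;
    }

    public static void main(String[] args) {
        System.out.println("with self-check:    " + comparisonsForSelfEquals(true));
        System.out.println("without self-check: " + comparisonsForSelfEquals(false));
    }
}
```

The result of equals() is the same either way; only the amount of work differs, which matters more as the number or size of the compared fields grows.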

Java Set with multiple equality criteria

I have a particular requirement where I need to dedupe a list of objects based on a combination of equality criteria.
e.g. Two Student objects are equal if:
1. firstName and id are same OR 2. lastName, class, and emailId are same
I was planning to use a Set to remove duplicates. However, there's a problem:
I can override the equals method but the hashCode method may not return same hash code for two equal objects.
@Override
public boolean equals(Object obj) {
if (this == obj)
return true;
if (obj == null)
return false;
if (getClass() != obj.getClass())
return false;
Student other = (Student) obj;
// "class" is a reserved word in Java, so the field is called clazz here
// (use equals() instead of == if clazz is not a primitive)
if ((firstName.equals(other.firstName) && id == other.id) ||
(lastName.equals(other.lastName) && clazz == other.clazz && emailId.equals(other.emailId)))
return true;
return false;
}
Now I cannot override the hashCode method in a way that returns the same hash code for two objects that are equal according to this equals method.
Is there a way to dedupe based on multiple equality criteria? I considered using a List and then using the contains method to check whether the element is already there, but this increases the complexity, as contains runs in O(n) time. I don't want to return the exact same hash code for all objects, as that just increases the time and defeats the purpose of using hash codes. I've also considered sorting the items using a custom comparator, but that again takes at least O(n log n), plus one more pass to remove the duplicates.
As of now, the best solution I have is to maintain two different sets, one for each condition, and use them to build a List, but that takes almost three times the memory. I'm looking for a faster and more memory-efficient way, as I'll be dealing with a large number of records.
You can make Student Comparable and use a TreeSet. A simple implementation of compareTo may be:
@Override
public int compareTo(Student other) {
if (this.equals(other)) {
return 0;
} else {
return (this.firstName + this.lastName + emailId + clazz + id)
.compareTo(other.firstName + other.lastName + other.emailId + other.clazz + other.id);
}
}
Or make your own Set implementation, for instance containing a List of distinct Student objects, checking for equality every time you add a student. This will have O(n) add complexity, so can't be considered a good implementation, but it is simple to write.
import java.util.AbstractSet;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

class ListSet<T> extends AbstractSet<T> {
private List<T> list = new ArrayList<T>();
@Override
public boolean add(T t) {
if (list.contains(t)) {
return false;
} else {
return list.add(t);
}
}
@Override
public Iterator<T> iterator() {
return list.iterator();
}
@Override
public int size() {
return list.size();
}
}
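For comparison, the two-sets idea mentioned in the question can be sketched as follows. This is not taken from any of the answers. It assumes a hypothetical Student with the fields named in the question (with clazz standing in for class), and it picks one possible semantics: the first student seen wins, and a later student is dropped if it matches an already kept student on either criterion. Since "equal by rule 1 OR rule 2" is not transitive, a different processing order can keep a different subset.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class StudentDedupDemo {

    // Hypothetical Student; field names follow the question, "clazz" replaces "class".
    static final class Student {
        final String firstName, lastName, clazz, emailId;
        final long id;
        Student(String firstName, long id, String lastName, String clazz, String emailId) {
            this.firstName = firstName;
            this.id = id;
            this.lastName = lastName;
            this.clazz = clazz;
            this.emailId = emailId;
        }
    }

    // One pass, O(n) expected time: each criterion gets its own set of composite keys.
    static List<Student> dedupe(List<Student> input) {
        Set<List<Object>> byNameAndId = new HashSet<>();
        Set<List<Object>> byClassAndEmail = new HashSet<>();
        List<Student> kept = new ArrayList<>();
        for (Student s : input) {
            // Lists have value-based equals/hashCode, so they work as composite keys.
            List<Object> k1 = Arrays.<Object>asList(s.firstName, s.id);
            List<Object> k2 = Arrays.<Object>asList(s.lastName, s.clazz, s.emailId);
            if (byNameAndId.contains(k1) || byClassAndEmail.contains(k2)) {
                continue; // matches a kept student on at least one criterion
            }
            byNameAndId.add(k1);
            byClassAndEmail.add(k2);
            kept.add(s);
        }
        return kept;
    }

    // Made-up sample data: the 2nd matches the 1st by rule 1, the 3rd by rule 2.
    static List<Student> sample() {
        return Arrays.asList(
            new Student("Ann", 1, "Lee", "5A", "ann@example.com"),
            new Student("Ann", 1, "Other", "6B", "other@example.com"),
            new Student("Bob", 2, "Lee", "5A", "ann@example.com"),
            new Student("Cid", 3, "Poe", "7C", "cid@example.com"));
    }

    public static void main(String[] args) {
        System.out.println("kept: " + dedupe(sample()).size()); // 2 of 4 remain
    }
}
```

This still pays the memory cost of two key sets that the question wanted to avoid, so it is a baseline to compare against rather than an answer to the memory concern.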

Checking if a value is already in a List won't work

I am writing a small program that keeps shelves in a library list. If a shelf number has already been entered, you shouldn't be able to enter it again. However, it's not working.
Here is my code in the main class:
Shelf s = new Shelf(1);
Shelf s2 = new Shelf(1);
Library l = new Library();
l.Addshelf(s);
l.Addshelf(s2);
As you can see I entered 1 in both objects as the shelf number so this code below should then run from the library class
public void Addshelf(Shelf s)
{
List li = new ArrayList();
if(li.contains(s))
{
System.out.println("already exists");
} else {
li.add(s);
}
}
The problem must be in the above method. I want to know how to check whether that shelf number already exists in the list, in which case it should print the above message: "already exists".
You'll have to override equals method in Shelf in order to get the behavior you desire.
Without overriding equals, ArrayList::contains, which calls ArrayList::indexOf, would use the default implementation of Object::equals, which compares object references.
@Override
public boolean equals (Object anObject)
{
if (this == anObject)
return true;
if (anObject instanceof Shelf) {
Shelf anotherShelf = (Shelf) anObject;
return this.getShelfNumber() == anotherShelf.getShelfNumber(); // assuming this
// is a primitive
// (if not, use equals)
}
return false;
}
If you look at the Javadoc for List at the contains method, you will see that it uses the equals() method to evaluate whether two objects are the same. So you have to override the equals method in your Shelf class.
Example:
public class Shelf
{
public int a;
public Shelf (int x)
{
this.a= x;
}
@Override
public boolean equals(Object object)
{
boolean isEqual= false;
if (object != null && object instanceof Shelf)
{
isEqual = (this.a == ((Shelf) object).a);
}
return isEqual;
}
}
Make sure that you have overridden the equals() method in Shelf.
From the Javadoc, on how contains() works:
Returns true if this list contains the specified element. More
formally, returns true if and only if this list contains at least one
element e such that (o==null ? e==null : o.equals(e)).
Try overriding methods hashCode() and equals(Object obj) in your Shelf class and then call contains.
Equals and HashCode tutorial
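One more thing seems worth checking in the posted Addshelf method: it creates a new ArrayList on every call, so the list it inspects is always empty and contains() cannot find anything, whatever equals() does. A minimal sketch that combines both fixes, assuming the shelf number is an int and using the illustrative names addShelf and shelves, could look like this:

```java
import java.util.ArrayList;
import java.util.List;

class LibraryDemo {

    static final class Shelf {
        private final int number;
        Shelf(int number) { this.number = number; }

        // Two shelves are equal when their numbers match.
        @Override public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Shelf)) return false;
            return number == ((Shelf) o).number;
        }

        // Keep hashCode consistent with equals.
        @Override public int hashCode() { return Integer.hashCode(number); }
    }

    static final class Library {
        // The list is a field, so it survives between addShelf calls.
        private final List<Shelf> shelves = new ArrayList<>();

        // Returns true if the shelf was added, false if it already existed.
        boolean addShelf(Shelf s) {
            if (shelves.contains(s)) {
                System.out.println("already exists");
                return false;
            }
            shelves.add(s);
            return true;
        }

        int size() { return shelves.size(); }
    }

    public static void main(String[] args) {
        Library l = new Library();
        l.addShelf(new Shelf(1));
        l.addShelf(new Shelf(1)); // prints "already exists"
        System.out.println("shelves: " + l.size());
    }
}
```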

Do I need to implement hashCode() and equals() methods?

If I have a map and an object as map key, are the default hash and equals methods enough?
class EventInfo{
private String name;
private Map<String, Integer> info;
}
Then I want to create a map:
Map<EventInfo, String> map = new HashMap<EventInfo, String>();
Do I have to explicitly implement hashCode() and equals()? Thanks.
Yes, you do. HashMaps work by computing the hash code of the key and using that as a base point. If the hashCode function isn't overridden (by you), then the default from Object is used, which is identity-based (typically described as derived from the memory address), and equals will be the same as ==.
If you're in Eclipse, it'll generate them for you. Click Source menu → Generate hashCode() and equals().
If you don't have Eclipse, here's some that should work. (I generated these in Eclipse, as described above.)
@Override
public int hashCode() {
final int prime = 31;
int result = 1;
result = prime * result + ((info == null) ? 0 : info.hashCode());
result = prime * result + ((name == null) ? 0 : name.hashCode());
return result;
}
@Override
public boolean equals(Object obj) {
if (this == obj) {
return true;
}
if (obj == null) {
return false;
}
if (!(obj instanceof EventInfo)) {
return false;
}
EventInfo other = (EventInfo) obj;
if (info == null) {
if (other.info != null) {
return false;
}
} else if (!info.equals(other.info)) {
return false;
}
if (name == null) {
if (other.name != null) {
return false;
}
} else if (!name.equals(other.name)) {
return false;
}
return true;
}
Yes, you need them; otherwise you won't be able to compare two EventInfo objects by their content (and your map won't work as expected).
Strictly speaking, no. The default implementations of hashCode() and equals() will produce results that ought to work. See http://docs.oracle.com/javase/6/docs/api/java/lang/Object.html#hashCode()
My understanding is that the default implementation of hashCode() works by taking the object's address in memory and converting to integer, and the default implementation of equals() returns true only if the two objects are actually the same object.
In practice, you could (and should) probably improve on both of those implementations. For example, both methods should ignore object members that aren't important. In addition, equals() might want to recursively compare references in the object.
In your particular case, you might define equals() as true if the two objects refer to the same string or the two strings are equal and the two maps are the same or they are equal. I think WChargin gave you pretty good implementations.
Depends on what you want to happen. If two different EventInfo instances with the same name and info should result in two different keys, then you don't need to implement equals and hashCode.
So
EventInfo info1 = new EventInfo();
info1.setName("myname");
info1.setInfo(null);
EventInfo info2 = new EventInfo();
info2.setName("myname");
info2.setInfo(null);
info1.equals(info2) would return false and info1.hashCode() would return a different value to info2.hashCode().
Therefore, when you are adding them to your map:
map.put(info1, "test1");
map.put(info2, "test2");
you would have two different entries.
Now, that may be the desired behaviour. For example, if your EventInfo is collecting different events, you may well want two distinct events with the same data to be two different entries.
The equals and hashCode contract also applies to a Set.
So, for example, if your event info contains mouse clicks, you may well want to end up with:
Set<EventInfo> collectedEvents = new HashSet<EventInfo>();
collectedEvents.add(info1);
collectedEvents.add(info2);
2 collected events instead of just 1...
Hope I'm making sense here...
EDIT:
If however, the above set and map should only contain a single entry, then you could use apache commons EqualsBuilder and HashCodeBuilder to simplify the implementation of equals and hashCode:
@Override
public boolean equals(Object obj) {
if (obj instanceof EventInfo) {
EventInfo other = (EventInfo) obj;
EqualsBuilder builder = new EqualsBuilder();
builder.append(name, other.name);
builder.append(info, other.info);
return builder.isEquals();
}
return false;
}
@Override
public int hashCode() {
HashCodeBuilder builder = new HashCodeBuilder();
builder.append(name);
builder.append(info);
return builder.toHashCode();
}
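If adding Apache Commons is not an option, the JDK's own java.util.Objects helpers (Objects.equals and Objects.hash, available since Java 7) can play a similar role. This is a swapped-in alternative, not what the answer above uses; the constructor and the distinctKeys method are added only to make the sketch runnable.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

class ObjectsEqualsDemo {

    static final class EventInfo {
        private final String name;
        private final Map<String, Integer> info;

        EventInfo(String name, Map<String, Integer> info) {
            this.name = name;
            this.info = info;
        }

        @Override public boolean equals(Object obj) {
            if (this == obj) return true;
            if (!(obj instanceof EventInfo)) return false;
            EventInfo other = (EventInfo) obj;
            // Objects.equals is null-safe, so no explicit null checks are needed.
            return Objects.equals(name, other.name) && Objects.equals(info, other.info);
        }

        @Override public int hashCode() {
            // Objects.hash combines the fields and tolerates nulls.
            return Objects.hash(name, info);
        }
    }

    // Two instances with the same data now collapse into a single map key.
    static int distinctKeys() {
        Map<EventInfo, String> map = new HashMap<>();
        map.put(new EventInfo("myname", null), "test1");
        map.put(new EventInfo("myname", null), "test2");
        return map.size();
    }

    public static void main(String[] args) {
        System.out.println("distinct keys: " + distinctKeys()); // 1
    }
}
```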
EDIT2:
It could also be appropriate if two EventInfo instances are considered the same, if they have the same name, for example if the name is some unique identifier (I know it's a bit far fetched with your specific object, but I'm generalising here...)

Java HashMap.containsKey() doesn't call equals()

I have a hashmap:
Map<LotWaferBean, File> hm = new HashMap<LotWaferBean, File>();
LotWaferBean lw = new LotWaferBean();
... //populate lw
if (!hm.containsKey((LotWaferBean) lw)) {
hm.put(lw, triggerFiles[l]);
}
The code for LotWaferBean:
@Override
public boolean equals(Object o) {
if (!(o instanceof LotWaferBean)) {
return false;
}
if (((LotWaferBean) o).getLotId().equals(lotId)
&& ((LotWaferBean) o).getWaferNo() == waferNo) {
return true;
}
return false;
}
In my IDE I put breakpoints in equals() but it is never executed. Why?
Try putting a breakpoint in hashCode().
If hashCode() returns the same number for two objects in a map, then equals will be called to determine whether they're really equal.
HashMap looks in the bucket for that object's hash code; only if that bucket already holds entries with the same hash code is the equals() method executed. The developer should also follow the contract between the hashCode() and equals() methods.
Only if 2 hashCodes equal, equals() will be called during loop keys.
"Only if 2 hashCodes equal, equals() will be called during loop keys."
This is the correct answer, or almost. To be precise, the equality check is performed only if two hash codes collide (being the same ensures they collide under a proper hash map implementation).
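The behaviour described in these answers can be observed with a key that counts its own equals() calls. This is only a sketch: the Key class, its fixed hash field, and the equalsCalls counter are invented for the experiment, and the exact number of calls is an implementation detail of the HashMap in use (the comments reflect OpenJDK).

```java
import java.util.HashMap;
import java.util.Map;

class EqualsCallCountDemo {
    static int equalsCalls = 0;

    // Hypothetical key whose hash code is chosen by the caller.
    static final class Key {
        final int hash;
        final String name;
        Key(int hash, String name) { this.hash = hash; this.name = name; }

        @Override public boolean equals(Object o) {
            equalsCalls++; // record every invocation
            return o instanceof Key && name.equals(((Key) o).name);
        }

        @Override public int hashCode() { return hash; }
    }

    // Stores one key, probes with another, and reports the equals() calls of the probe.
    static int callsForLookup(int storedHash, int probeHash) {
        Map<Key, String> map = new HashMap<>();
        map.put(new Key(storedHash, "k"), "v");
        equalsCalls = 0; // only count what containsKey does
        map.containsKey(new Key(probeHash, "k"));
        return equalsCalls;
    }

    public static void main(String[] args) {
        System.out.println("different hash codes: " + callsForLookup(1, 2)); // 0 on OpenJDK
        System.out.println("same hash code:       " + callsForLookup(1, 1)); // at least 1
    }
}
```

So a breakpoint in equals() that never fires usually means the probe key's hash code does not match any stored key's hash code, which points back at hashCode().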
By the way, your equals method is most likely incorrect. If LotWaferBean is subclassed, your equals method will accept the subclass instance, but will the subclass's equals accept a LotWaferBean in return?
It should rather read:
@Override
public boolean equals(Object o) {
if (o == null || o.getClass() != getClass()) { // << this is important
return false;
}
final LotWaferBean other = (LotWaferBean)o;
return other.getLotId().equals(lotId)
&& other.getWaferNo() == waferNo;
}
As Abimaran Kugathasan noted, the HashMap implementation uses hash-buckets to efficiently look up keys, and only uses equals() to compare the keys in the matching hash-bucket against the given key. It's worth noting that keys are assigned to hash-buckets when they are added to a HashMap. If you alter keys in a HashMap after adding them, in a way that would change their hash code, then they won't be in the proper hash-bucket; and trying to use a matching key to access the map will find the proper hash-bucket, but it won't contain the altered key.
class aMutableType {
private int value;
public aMutableType(int originalValue) {
this.value = originalValue;
}
public int getValue() {
return this.value;
}
public void setValue(int newValue) {
this.value = newValue;
}
@Override
public boolean equals(Object o) {
// ... all the normal tests ...
return this.value == ((aMutableType) o).value;
}
@Override
public int hashCode() {
return Integer.hashCode(this.value);
}
}
...
Map<aMutableType, Integer> aMap = new HashMap<>();
aMap.put(new aMutableType(5), 3); // puts key in bucket for hash(5)
for (aMutableType key : new HashSet<>(aMap.keySet()))
key.setValue(key.getValue()+1); // key 5 => 6
if (aMap.containsKey(new aMutableType(6)))
doSomething(); // won't get here, even though
// there's a key == 6 in the Map,
// because that key is in the hash-bucket for 5
This can result in some pretty odd-looking behavior. You can set a breakpoint just before theMap.containsKey(theKey), and see that the value of theKey matches a key in theMap, and yet the key's equals() won't be called, and containsKey() will return false.
As noted here https://stackoverflow.com/a/21601013 , there's actually a warning in the Javadoc for Map regarding the use of mutable objects as keys. Non-hash Map types won't have this particular problem, but could have other problems when keys are altered in place.
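If a key really has to change while it is in a HashMap, one common workaround is to remove the entry, change the key, and put it back, so that it is filed under its new hash code. The sketch below contrasts that with mutating in place. The Counter class is a made-up stand-in for the mutable type above, and the "not found" result for the in-place case is what OpenJDK's HashMap does here rather than something the Map contract promises.

```java
import java.util.HashMap;
import java.util.Map;

class MutableKeyDemo {

    // Hypothetical mutable key: equals and hashCode both depend on value.
    static final class Counter {
        int value;
        Counter(int value) { this.value = value; }
        @Override public boolean equals(Object o) {
            return o instanceof Counter && value == ((Counter) o).value;
        }
        @Override public int hashCode() { return Integer.hashCode(value); }
    }

    // Mutating the key while it sits in the map leaves it in the bucket for 5.
    static boolean foundAfterInPlaceMutation() {
        Map<Counter, Integer> map = new HashMap<>();
        Counter key = new Counter(5);
        map.put(key, 3);
        key.value = 6;
        return map.containsKey(new Counter(6));
    }

    // Remove first, then mutate, then re-insert under the new hash code.
    static boolean foundAfterRemoveAndReinsert() {
        Map<Counter, Integer> map = new HashMap<>();
        Counter key = new Counter(5);
        map.put(key, 3);
        Integer value = map.remove(key); // must happen BEFORE the mutation
        key.value = 6;
        map.put(key, value);
        return map.containsKey(new Counter(6));
    }

    public static void main(String[] args) {
        System.out.println("in place:          " + foundAfterInPlaceMutation());   // false on OpenJDK
        System.out.println("remove + reinsert: " + foundAfterRemoveAndReinsert()); // true
    }
}
```

Making key types immutable avoids the problem altogether and is usually the simpler choice.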
