I've been encountering some strange behavior when trying to find a key inside a java.util.HashMap, and I guess I'm missing something. The code segment is basically:
HashMap<Key, Value> data = ...
Key k1 = ...
Value v = data.get(k1);
boolean bool1 = data.containsKey(k1);
for (Key k2 : data.keySet()) {
boolean bool2 = k1.equals(k2);
boolean bool3 = k2.equals(k1);
boolean bool4 = k1.hashCode() == k2.hashCode();
break;
}
That strange for loop is there because for a specific execution I happen to know that data contains only one item at this point and it is k1, and indeed bool2, bool3 and bool4 will be evaluated to true in that execution. bool1, however, will be evaluated to false, and v will be null.
Now, this is part of a bigger program - I could not reproduce the error on a smaller sample - but still it seems to me that no matter what the rest of the program does, this behavior should never happen.
EDIT: I have manually verified that the hash code does not change between the time the object was inserted into the map and the time it was queried. I'll keep checking this avenue, but is there any other option?
This behavior could happen if the hash code of the key were changed after it was inserted into the map.
Here's an example with the behavior you described:
public class Key
{
int hashCode = 0;
@Override
public int hashCode() {
return hashCode;
}
@Override
public boolean equals(Object obj) {
if (this == obj)
return true;
if (obj == null)
return false;
if (getClass() != obj.getClass())
return false;
Key other = (Key) obj;
return hashCode == other.hashCode;
}
public static void main(String[] args) throws Exception {
HashMap<Key, Integer> data = new HashMap<Key, Integer>();
Key k1 = new Key();
data.put(k1, 1);
k1.hashCode = 1;
boolean bool1 = data.containsKey(k1);
for (Key k2 : data.keySet()) {
boolean bool2 = k1.equals(k2);
boolean bool3 = k2.equals(k1);
boolean bool4 = k1.hashCode() == k2.hashCode();
System.out.println("bool1: " + bool1);
System.out.println("bool2: " + bool2);
System.out.println("bool3: " + bool3);
System.out.println("bool4: " + bool4);
break;
}
}
}
From the API description of the Map interface:
Note: great care must be exercised if
mutable objects are used as map keys.
The behavior of a map is not specified
if the value of an object is changed
in a manner that affects equals
comparisons while the object is a key
in the map. A special case of this
prohibition is that it is not
permissible for a map to contain
itself as a key. While it is
permissible for a map to contain
itself as a value, extreme caution is
advised: the equals and hashCode
methods are no longer well defined on
such a map.
Also, there are very specific requirements on the behavior of equals() and hashCode() for types used as Map keys. Failure to follow the rules here will result in all sorts of undefined behavior.
If you're certain the hash code does not change between the time the key is inserted and the time you do the contains check, then there is something seriously wrong somewhere. Are you sure you're using a java.util.HashMap and not a subclass of some sort? Do you know what implementation of the JVM you are using?
Here's the source code for java.util.HashMap.getEntry(Object key) from Sun's 1.6.0_20 JVM:
final Entry<K,V> getEntry(Object key) {
int hash = (key == null) ? 0 : hash(key.hashCode());
for (Entry<K,V> e = table[indexFor(hash, table.length)];
e != null;
e = e.next) {
Object k;
if (e.hash == hash &&
((k = e.key) == key || (key != null && key.equals(k))))
return e;
}
return null;
}
As you can see, it retrieves the hashCode, goes to the corresponding slot in the table, then does an equals check on each element in that slot. If this is the code you're running and the hash code of the key has not changed, then it must be doing an equals check which must be failing.
The next step would be for you to give us some more code or context - the hashCode and equals methods of your Key class at a minimum.
Alternatively, I would recommend hooking up to a debugger if you can. Watch what bucket your key is hashed to, and step through the containsKey check to see where it's failing.
Is this application multi-threaded? If so, another thread could change the data between the data.containsKey(k1) call and the data.keySet() call.
If equals() returns true for two objects, then hashCode() must return the same value for both. If equals() returns false, hashCode() is not required to return different values, although distinct hash codes improve hash table performance.
For Reference:
http://www.ibm.com/developerworks/java/library/j-jtp05273.html
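To make the contract concrete, here is a minimal sketch (the class name HashContractDemo is made up for illustration). It relies on the well-known fact that the strings "Aa" and "BB" have the same hash code but are not equal, which is allowed: a HashMap still keeps them apart because it falls back to equals() inside the bucket.

```java
import java.util.HashMap;
import java.util.Map;

class HashContractDemo {
    // True when two strings share a hash code, whether or not they are equal.
    static boolean sameHash(String a, String b) {
        return a.hashCode() == b.hashCode();
    }

    // Stores two unequal keys that collide and returns the resulting map size.
    static int sizeWithCollidingKeys() {
        Map<String, Integer> map = new HashMap<>();
        map.put("Aa", 1);
        map.put("BB", 2);
        return map.size();
    }

    public static void main(String[] args) {
        System.out.println("equal:     " + "Aa".equals("BB"));
        System.out.println("same hash: " + sameHash("Aa", "BB"));
        System.out.println("map size:  " + sizeWithCollidingKeys());
    }
}
```

The reverse direction is the one that must never be violated: two keys that are equal but report different hash codes can end up in different buckets and never be compared at all.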
Perhaps the Key class looks like
Key
{
boolean equals = false ;
public boolean equals ( Object oth )
{
try
{
return ( equals ) ;
}
finally
{
equals = true ;
}
}
}
So... all is in code:
// get vector...
SignVector v = ...;
//print to console: [1058, 5, 820 in flat]
System.out.println(v);
//size: 1
System.out.println("size: " + signs.size());
//check all elements...
for (Entry<SignVector, FakeSign> entry : signs.entrySet())
{
// get key
SignVector key = entry.getKey();
//print to console: [1058, 5, 820 in flat] (YaY! it's that key! like v)
System.out.println(key);
if (key.equals(v))
{
// print: "YaY: "
System.out.println("YaY: [1058, 5, 820 in flat]"+key);
}
}
//So second test... just get it from map: null
System.out.println(signs.get(v));
Why does that return null?
The Javadoc says that map.get uses key.equals(k), so why does my loop find the matching key while map.get returns null?
Map:
private final Map<SignVector, FakeSign> signs = new HashMap<>()
Equals method from SignVector, for @home user
@Override
public boolean equals(Object obj)
{
if (this == obj)
return true;
if (obj == null)
return false;
if (getClass() != obj.getClass())
return false;
SignVector other = (SignVector) obj;
// w can't be null so I skip that
System.out.print(w.getName() + ", " + other.w.getName() + ", " + (w.getName().equals(other.w.getName()))); // this same
if (!w.getName().equals(other.w.getName()))
return false;
if (x != other.x)
return false;
if (y != other.y)
return false;
if (z != other.z)
return false;
return true;
}
But this method works well and always returns what I want; x, y and z are ints, and w is a custom object.
The javadoc is a bit misleading, but it's relying on the fact that if you implement equals, you should also implement hashcode to be consistent. As the doc states:
Many methods in Collections Framework interfaces are defined in terms
of the equals method. For example, the specification for the
containsKey(Object key) method says: "returns true if and only if this
map contains a mapping for a key k such that (key==null ? k==null :
key.equals(k))."
This specification should not be construed to imply
that invoking Map.containsKey with a non-null argument key will cause
key.equals(k) to be invoked for any key k.
Implementations are free to
implement optimizations whereby the equals invocation is avoided, for
example, by first comparing the hash codes of the two keys. (The
Object.hashCode() specification guarantees that two objects with
unequal hash codes cannot be equal.)
More generally, implementations
of the various Collections Framework interfaces are free to take
advantage of the specified behavior of underlying Object methods
wherever the implementor deems it appropriate.
Let's take a look at the underlying implementation of get for a HashMap.
public V get(Object key) {
if (key == null)
return getForNullKey();
int hash = hash(key.hashCode());
for (Entry<K,V> e = table[indexFor(hash, table.length)];
e != null;
e = e.next) {
Object k;
if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
return e.value;
}
return null;
}
You can see that it uses the hash code of the object to find the possible entries in the table and THEN it uses equals to determine which value it has to return. Since the entry is probably null, the for loop is skipped and get returns null.
Override hashCode in your SignVector class to be consistent with equals and everything should work fine.
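As a rough sketch of what a consistent pair could look like, here is a stand-in class. WorldVector and its worldName field are hypothetical (the real SignVector holds a custom object w and compares w.getName()), and Objects.hash requires Java 7 or later.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

final class WorldVector {
    private final String worldName; // stands in for w.getName() in the question
    private final int x;
    private final int y;
    private final int z;

    WorldVector(String worldName, int x, int y, int z) {
        this.worldName = Objects.requireNonNull(worldName);
        this.x = x;
        this.y = y;
        this.z = z;
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) return true;
        if (obj == null || getClass() != obj.getClass()) return false;
        WorldVector other = (WorldVector) obj;
        return x == other.x && y == other.y && z == other.z
                && worldName.equals(other.worldName);
    }

    // Built from exactly the fields that equals() compares.
    @Override
    public int hashCode() {
        return Objects.hash(worldName, x, y, z);
    }

    // Looks up an entry with a different but equal key instance.
    static String lookup() {
        Map<WorldVector, String> signs = new HashMap<>();
        signs.put(new WorldVector("flat", 1058, 5, 820), "sign");
        return signs.get(new WorldVector("flat", 1058, 5, 820));
    }

    public static void main(String[] args) {
        System.out.println(lookup());
    }
}
```

The important design point is that hashCode() uses the same fields as equals(), so any two keys that compare equal land in the same bucket.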
From the javadocs:
If this map permits null values, then a return value of null does not necessarily indicate that the map contains no mapping for the key; it's also possible that the map explicitly maps the key to null. The containsKey operation may be used to distinguish these two cases.
Unless you share with us how you built the map, we can't help you if this is the case. The code you shared should otherwise be working just fine.
http://docs.oracle.com/javase/7/docs/api/java/util/Map.html#get%28java.lang.Object%29
If I have a map and an object as map key, are the default hash and equals methods enough?
class EventInfo{
private String name;
private Map<String, Integer> info;
}
Then I want to create a map:
Map<EventInfo, String> map = new HashMap<EventInfo, String>();
Do I have to explicitly implement hashCode() and equals()? Thanks.
Yes, you do. HashMaps work by computing the hash code of the key and using that as a base point. If the hashCode function isn't overridden (by you), then the default identity-based hash code is used (typically described as derived from the memory address), and equals will be the same as ==.
If you're in Eclipse, it'll generate them for you. Click Source menu → Generate hashCode() and equals().
If you don't have Eclipse, here's some that should work. (I generated these in Eclipse, as described above.)
@Override
public int hashCode() {
final int prime = 31;
int result = 1;
result = prime * result + ((info == null) ? 0 : info.hashCode());
result = prime * result + ((name == null) ? 0 : name.hashCode());
return result;
}
@Override
public boolean equals(Object obj) {
if (this == obj) {
return true;
}
if (obj == null) {
return false;
}
if (!(obj instanceof EventInfo)) {
return false;
}
EventInfo other = (EventInfo) obj;
if (info == null) {
if (other.info != null) {
return false;
}
} else if (!info.equals(other.info)) {
return false;
}
if (name == null) {
if (other.name != null) {
return false;
}
} else if (!name.equals(other.name)) {
return false;
}
return true;
}
Yes, you need them else you won't be able to compare two EventInfo (and your map won't work).
Strictly speaking, no. The default implementations of hashCode() and equals() will produce results that ought to work. See http://docs.oracle.com/javase/6/docs/api/java/lang/Object.html#hashCode()
My understanding is that the default implementation of hashCode() works by taking the object's address in memory and converting to integer, and the default implementation of equals() returns true only if the two objects are actually the same object.
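A small sketch of what those defaults mean for map lookups (IdentityKeyDemo and PlainKey are made-up names): the very same instance is found again, but a second instance with identical field values is treated as a different key.

```java
import java.util.HashMap;
import java.util.Map;

class IdentityKeyDemo {
    // No equals()/hashCode() overrides, so Object's identity semantics apply.
    static class PlainKey {
        final String name;

        PlainKey(String name) {
            this.name = name;
        }
    }

    // Looks the value up with the same instance that was used for put().
    static String lookupWithSameInstance() {
        Map<PlainKey, String> map = new HashMap<>();
        PlainKey key = new PlainKey("myname");
        map.put(key, "value");
        return map.get(key);
    }

    // Looks the value up with a new instance carrying the same data.
    static String lookupWithLookalikeInstance() {
        Map<PlainKey, String> map = new HashMap<>();
        map.put(new PlainKey("myname"), "value");
        return map.get(new PlainKey("myname"));
    }

    public static void main(String[] args) {
        System.out.println(lookupWithSameInstance());
        System.out.println(lookupWithLookalikeInstance());
    }
}
```

Whether that is acceptable depends on whether callers always hold on to the original key instances.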
In practice, you could (and should) probably improve on both of those implementations. For example, both methods should ignore object members that aren't important. In addition, equals() might want to recursively compare references in the object.
In your particular case, you might define equals() as true if the two objects refer to the same string or the two strings are equal and the two maps are the same or they are equal. I think WChargin gave you pretty good implementations.
Depends on what you want to happen. If two different EventInfo instances with the same name and info should result in two different keys, then you don't need to implement equals and hashCode.
So
EventInfo info1 = new EventInfo();
info1.setName("myname");
info1.setInfo(null);
EventInfo info2 = new EventInfo();
info2.setName("myname");
info2.setInfo(null);
info1.equals(info2) would return false and info1.hashCode() would return a different value to info2.hashCode().
Therefore, when you are adding them to your map:
map.put(info1, "test1");
map.put(info2, "test2");
you would have two different entries.
Now, that may be the desired behaviour. For example, if your EventInfo is collecting different events, you may well want two distinct events with the same data to be two different entries.
The equals and hashCode contract is also applicable in a Set.
So for example, if your event info contains mouse clicks, it may well be desired that you would want to end up with:
Set<EventInfo> collectedEvents = new HashSet<EventInfo>();
collectedEvents.add(info1);
collectedEvents.add(info2);
2 collected events instead of just 1...
Hope I'm making sense here...
EDIT:
If however, the above set and map should only contain a single entry, then you could use apache commons EqualsBuilder and HashCodeBuilder to simplify the implementation of equals and hashCode:
@Override
public boolean equals(Object obj) {
if (obj instanceof EventInfo) {
EventInfo other = (EventInfo) obj;
EqualsBuilder builder = new EqualsBuilder();
builder.append(name, other.name);
builder.append(info, other.info);
return builder.isEquals();
}
return false;
}
@Override
public int hashCode() {
HashCodeBuilder builder = new HashCodeBuilder();
builder.append(name);
builder.append(info);
return builder.toHashCode();
}
EDIT2:
It could also be appropriate if two EventInfo instances are considered the same, if they have the same name, for example if the name is some unique identifier (I know it's a bit far fetched with your specific object, but I'm generalising here...)
I have an issue with a TreeMap that we have defined a custom key object for. The issue is that after putting a few objects into the map, and trying to retrieve with the same key used to put on the map, I get a null. I believe this is caused by the fact that we have 2 data points on the key. One value is always populated and one value is not always populated. So it seems like the issue lies with the use of compareTo and equals. Unfortunately the business requirement for how our keys determine equality needs to be implemented this way.
I think this is best illustrated with code.
public class Key implements Comparable<Key> {
private String sometimesPopulated;
private String alwaysPopulated;
public int compareTo(Key aKey){
if(this.equals(aKey)){
return 0;
}
if(StringUtils.isNotBlank(sometimesPopulated) && StringUtils.isNotBlank(aKey.getSometimesPopulated())){
return sometimesPopulated.compareTo(aKey.getSometimesPopulated());
}
if(StringUtils.isNotBlank(alwaysPopulated) && StringUtils.isNotBlank(aKey.getAlwaysPopulated())){
return alwaysPopulated.compareTo(aKey.getAlwaysPopulated());
}
return 1;
}
public boolean equals(Object aObject){
if (this == aObject) {
return true;
}
final Key aKey = (Key) aObject;
if(StringUtils.isNotBlank(sometimesPopulated) && StringUtils.isNotBlank(aKey.getSometimesPopulated())){
return sometimesPopulated.equals(aKey.getSometimesPopulated());
}
if(StringUtils.isNotBlank(alwaysPopulated) && StringUtils.isNotBlank(aKey.getAlwaysPopulated())){
return alwaysPopulated.equals(aKey.getAlwaysPopulated());
}
return false;
}
// constructor and getters omitted
}
So the issue occurs when trying to get a value off the map after putting some items on it.
Map<Key, String> map = new TreeMap<Key, String>();
Key aKey = new Key(null, "Hello");
map.put(aKey, "world");
//Put some more things on the map...
//they may have a value for sometimesPopulated or not
String value = map.get(aKey); // this = null
So why is the value null after just putting it in? I think the algorithm used by the TreeMap is sorting the map in an inconsistent manner because of the way I'm using compareTo and equals. I am open to suggestions on how to improve this code. Thanks
Your comparator violates the transitivity requirement.
Consider three objects:
Object A: sometimesPopulated="X" and alwaysPopulated="3".
Object B: sometimesPopulated="Y" and alwaysPopulated="1".
Object C: sometimesPopulated is blank and alwaysPopulated="2".
Using your comparator, A<B and B<C. Transitivity requires that A<C. However, using your comparator, A>C.
Since the comparator doesn't fulfil its contract, TreeMap is unable to do its job correctly.
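The three comparisons can be checked with a small sketch that copies the question's comparison logic. BrokenKey, matches and notBlank are illustrative names; notBlank is a plain-Java stand-in for StringUtils.isNotBlank, and matches reproduces the question's equals logic under a different name.

```java
class TransitivityDemo {
    static final class BrokenKey implements Comparable<BrokenKey> {
        final String sometimesPopulated;
        final String alwaysPopulated;

        BrokenKey(String sometimesPopulated, String alwaysPopulated) {
            this.sometimesPopulated = sometimesPopulated;
            this.alwaysPopulated = alwaysPopulated;
        }

        // Plain-Java stand-in for StringUtils.isNotBlank.
        static boolean notBlank(String s) {
            return s != null && !s.trim().isEmpty();
        }

        // Same decision logic as the question's equals().
        boolean matches(BrokenKey k) {
            if (notBlank(sometimesPopulated) && notBlank(k.sometimesPopulated)) {
                return sometimesPopulated.equals(k.sometimesPopulated);
            }
            if (notBlank(alwaysPopulated) && notBlank(k.alwaysPopulated)) {
                return alwaysPopulated.equals(k.alwaysPopulated);
            }
            return false;
        }

        // Same decision logic as the question's compareTo().
        @Override
        public int compareTo(BrokenKey k) {
            if (matches(k)) {
                return 0;
            }
            if (notBlank(sometimesPopulated) && notBlank(k.sometimesPopulated)) {
                return sometimesPopulated.compareTo(k.sometimesPopulated);
            }
            if (notBlank(alwaysPopulated) && notBlank(k.alwaysPopulated)) {
                return alwaysPopulated.compareTo(k.alwaysPopulated);
            }
            return 1;
        }
    }

    static final BrokenKey A = new BrokenKey("X", "3");
    static final BrokenKey B = new BrokenKey("Y", "1");
    static final BrokenKey C = new BrokenKey(null, "2");

    public static void main(String[] args) {
        System.out.println("A vs B: " + Integer.signum(A.compareTo(B))); // negative: A < B
        System.out.println("B vs C: " + Integer.signum(B.compareTo(C))); // negative: B < C
        System.out.println("A vs C: " + Integer.signum(A.compareTo(C))); // positive: A > C
    }
}
```

A sorted tree that places A before B and B before C may then search for A on the wrong side of C, which is how a key that was just inserted can become unreachable.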
I think the problem is that you are returning 1 from your compareTo if either of the sometimesPopulated values is blank or either of the alwaysPopulated values is blank. Remember that compareTo can be thought of as returning the result of a subtraction, and yours is not consistent: a.compareTo(b) can have the same sign as b.compareTo(a) even when a != b.
I would return -1 if the aKey sometimesPopulated is not blank and the local sometimesPopulated is blank. If they are the same then I would do the same with alwaysPopulated.
I think your logic should be something like:
public int compareTo(Key aKey){
if(this.equals(aKey)){
return 0;
}
if (StringUtils.isBlank(sometimesPopulated)) {
if (StringUtils.isNotBlank(aKey.getSometimesPopulated())) {
return -1;
}
} else if (StringUtils.isBlank(aKey.getSometimesPopulated())) {
return 1;
} else {
int result = sometimesPopulated.compareTo(aKey.getSometimesPopulated());
if (result != 0) {
return result;
}
}
// same logic with alwaysPopulated
return 0;
}
I believe the problem is that you are treating two keys with both blank fields as greater than each other which could confuse the structure.
import java.util.Map;
import java.util.TreeMap;
class Main {
public static void main(String... args) {
Map<Key, String> map = new TreeMap<Key, String>();
Key aKey = new Key(null, "Hello");
map.put(aKey, "world");
//Put some more things on the map...
//they may have a value for sometimesPopulated or not
String value = map.get(aKey); // this = "world"
System.out.println(value);
}
}
class Key implements Comparable<Key> {
private final String sometimesPopulated;
private final String alwaysPopulated;
Key(String alwaysPopulated, String sometimesPopulated) {
this.alwaysPopulated = defaultIfBlank(alwaysPopulated, "");
this.sometimesPopulated = defaultIfBlank(sometimesPopulated, "");
}
static String defaultIfBlank(String s, String defaultString) {
return s == null || s.trim().isEmpty() ? defaultString : s;
}
@Override
public int compareTo(Key o) {
int cmp = sometimesPopulated.compareTo(o.sometimesPopulated);
if (cmp == 0)
cmp = alwaysPopulated.compareTo(o.alwaysPopulated);
return cmp;
}
}
I think your equals, hashCode and compareTo methods should only use the field that is always populated. It's the only way to ensure the same object will always be found in the map regardless of if its optional field is set or not.
Second option, you could write a utility method that tries to find the value in the map, and if no value is found, tries again with the same key but with (or without) the optional field set.
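The first option might be sketched as follows. StableKey is a hypothetical name, and the sketch assumes the business rules can tolerate the optional field being ignored for identity, which the question suggests may not be the case.

```java
import java.util.Map;
import java.util.Objects;
import java.util.TreeMap;

final class StableKey implements Comparable<StableKey> {
    private final String sometimesPopulated; // carried along, but ignored for identity
    private final String alwaysPopulated;

    StableKey(String sometimesPopulated, String alwaysPopulated) {
        this.sometimesPopulated = sometimesPopulated;
        this.alwaysPopulated = Objects.requireNonNull(alwaysPopulated);
    }

    String getSometimesPopulated() {
        return sometimesPopulated;
    }

    // equals, hashCode and compareTo all use the same single field,
    // so they stay consistent with each other.
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof StableKey)) return false;
        return alwaysPopulated.equals(((StableKey) o).alwaysPopulated);
    }

    @Override
    public int hashCode() {
        return alwaysPopulated.hashCode();
    }

    @Override
    public int compareTo(StableKey o) {
        return alwaysPopulated.compareTo(o.alwaysPopulated);
    }

    // Stores two entries, then looks one up with a key whose optional field differs.
    static String roundTrip() {
        Map<StableKey, String> map = new TreeMap<>();
        map.put(new StableKey(null, "Hello"), "world");
        map.put(new StableKey("extra", "Other"), "value");
        return map.get(new StableKey("anything", "Hello"));
    }

    public static void main(String[] args) {
        System.out.println(roundTrip());
    }
}
```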
I have a hashmap:
Map<LotWaferBean, File> hm = new HashMap<LotWaferBean, File>();
LotWaferBean lw = new LotWaferBean();
... //populate lw
if (!hm.containsKey((LotWaferBean) lw)) {
hm.put(lw, triggerFiles[l]);
}
The code for LotWaferBean:
@Override
public boolean equals(Object o) {
if (!(o instanceof LotWaferBean)) {
return false;
}
if (((LotWaferBean) o).getLotId().equals(lotId)
&& ((LotWaferBean) o).getWaferNo() == waferNo) {
return true;
}
return false;
}
In my IDE I put breakpoints in equals() but it is never executed. Why?
Try putting a breakpoint in hashCode().
If the hashCode() of two objects in a map return the same number, then equals will be called to determine if they're really equal.
The HashMap looks in the bucket for that object's hash code; only if that bucket holds an entry with the same hash code is the equals() method executed. The developer should also follow the contract between the hashCode() and equals() methods.
Only if 2 hashCodes equal, equals() will be called during loop keys.
This is the correct answer, or almost. More precisely, the equality check is performed only if two hash codes collide (identical hash codes are bound to collide under a proper hash map implementation).
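This can be observed with a small counting sketch. EqualsCallDemo and CountingKey are invented for illustration, and the exact call counts reflect how the OpenJDK HashMap behaves rather than anything the Map specification promises.

```java
import java.util.HashMap;
import java.util.Map;

class EqualsCallDemo {
    static int equalsCalls = 0;

    static final class CountingKey {
        final int id;
        final int hash;

        CountingKey(int id, int hash) {
            this.id = id;
            this.hash = hash;
        }

        @Override
        public int hashCode() {
            return hash;
        }

        @Override
        public boolean equals(Object o) {
            equalsCalls++; // record every invocation
            return o instanceof CountingKey && ((CountingKey) o).id == id;
        }
    }

    // Puts 'stored' into a fresh map, then counts equals() calls made by containsKey(probe).
    static int callsDuringLookup(CountingKey stored, CountingKey probe) {
        Map<CountingKey, String> map = new HashMap<>();
        map.put(stored, "v");
        equalsCalls = 0;
        map.containsKey(probe);
        return equalsCalls;
    }

    public static void main(String[] args) {
        CountingKey same = new CountingKey(1, 10);
        System.out.println("different hash: "
                + callsDuringLookup(new CountingKey(1, 10), new CountingKey(1, 20)));
        System.out.println("same hash:      "
                + callsDuringLookup(new CountingKey(1, 10), new CountingKey(1, 10)));
        System.out.println("same instance:  " + callsDuringLookup(same, same));
    }
}
```

The third case is worth noting for the breakpoint question: when the probe is the very same object as the stored key, the reference comparison succeeds first and equals() is skipped as well.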
BTW, your equals method is most likely incorrect. If LotWaferBean is subclassed, your equals method will accept a subclass instance, but will the subclass's equals accept a LotWaferBean in return?
It should rather read:
@Override
public boolean equals(Object o) {
if (o == null || o.getClass() != getClass()) { // << this is important
return false;
}
final LotWaferBean other = (LotWaferBean)o;
return other.getLotId().equals(lotId)
&& other.getWaferNo() == waferNo;
}
As Abimaran Kugathasan noted, the HashMap implementation uses hash-buckets to efficiently look up keys, and only uses equals() to compare the keys in the matching hash-bucket against the given key. It's worth noting that keys are assigned to hash-buckets when they are added to a HashMap. If you alter keys in a HashMap after adding them, in a way that would change their hash code, then they won't be in the proper hash-bucket; and trying to use a matching key to access the map will find the proper hash-bucket, but it won't contain the altered key.
class aMutableType {
private int value;
public aMutableType(int originalValue) {
this.value = originalValue;
}
public int getValue() {
return this.value;
}
public void setValue(int newValue) {
this.value = newValue;
}
@Override
public boolean equals(Object o) {
// ... all the normal tests ...
return this.value == ((aMutableType) o).value;
}
@Override
public int hashCode() {
return Integer.hashCode(this.value);
}
}
...
Map<aMutableType, Integer> aMap = new HashMap<>();
aMap.put(new aMutableType(5), 3); // puts key in bucket for hash(5)
for (aMutableType key : new HashSet<>(aMap.keySet()))
key.setValue(key.getValue()+1); // key 5 => 6
if (aMap.containsKey(new aMutableType(6)))
doSomething(); // won't get here, even though
// there's a key == 6 in the Map,
// because that key is in the hash-bucket for 5
This can result in some pretty odd-looking behavior. You can set a breakpoint just before theMap.containsKey(theKey), and see that the value of theKey matches a key in theMap, and yet the key's equals() won't be called, and containsKey() will return false.
As noted here https://stackoverflow.com/a/21601013 , there's actually a warning in the Javadoc for Map regarding the use of mutable types for keys. Non-hash Map types won't have this particular problem, but could have other problems when keys are altered in-place.
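If a key really has to change, one possible workaround is to take the entry out of the map, modify the key while it is outside, and put it back. The sketch below uses made-up names (RekeyDemo, Counter, rekey) and is only one way to do it; immutable keys avoid the problem altogether.

```java
import java.util.HashMap;
import java.util.Map;

class RekeyDemo {
    static final class Counter {
        private int value;

        Counter(int value) {
            this.value = value;
        }

        void setValue(int newValue) {
            this.value = newValue;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Counter && ((Counter) o).value == value;
        }

        @Override
        public int hashCode() {
            return Integer.valueOf(value).hashCode();
        }
    }

    // Removes the entry, mutates the key while it is outside the map, then re-inserts it
    // so that it is filed under the bucket for its new hash code.
    static <V> void rekey(Map<Counter, V> map, Counter key, int newValue) {
        if (!map.containsKey(key)) {
            return; // nothing to move
        }
        V value = map.remove(key);
        key.setValue(newValue);
        map.put(key, value);
    }

    static boolean demo() {
        Map<Counter, Integer> map = new HashMap<>();
        Counter key = new Counter(5);
        map.put(key, 3);
        rekey(map, key, 6);
        return map.containsKey(new Counter(6)) && !map.containsKey(new Counter(5));
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```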
We are storing a String key in a HashMap that is a concatenation of three String fields and a boolean field. Problem is duplicate keys can be created if the delimiter appears in the field value.
So to get around this, based on advice in another post, I'm planning on creating a key class which will be used as the HashMap key:
class TheKey {
public final String k1;
public final String k2;
public final String k3;
public final boolean k4;
public TheKey(String k1, String k2, String k3, boolean k4) {
this.k1 = k1; this.k2 = k2; this.k3 = k3; this.k4 = k4;
}
public boolean equals(Object o) {
TheKey other = (TheKey) o;
//return true if all four fields are equal
}
public int hashCode() {
return ???;
}
}
My questions are:
What value should be returned from hashCode(). The map will hold a total of about 30 values. Of those 30, there are about 10 distinct values of k1 (some entries share the same k1 value).
To store this key class as the HashMap key, does one only need to override the equals() and hashCode() methods? Is anything else required?
Just hashCode and equals should be fine. The hashCode could look something like this:
public int hashCode() {
int hash = 17;
hash = hash * 31 + k1.hashCode();
hash = hash * 31 + k2.hashCode();
hash = hash * 31 + k3.hashCode();
hash = hash * 31 + (k4 ? 0 : 1);
return hash;
}
That's assuming none of the keys can be null, of course. Typically you could use 0 as the "logical" hash code for a null reference in the above equation. Two useful methods for compound equality/hash code which needs to deal with nulls:
public static boolean equals(Object o1, Object o2) {
if (o1 == o2) {
return true;
}
if (o1 == null || o2 == null) {
return false;
}
return o1.equals(o2);
}
public static int hashCode(Object o) {
return o == null ? 0 : o.hashCode();
}
Using the latter method in the hash algorithm at the start of this answer, you'd end up with something like:
public int hashCode() {
int hash = 17;
hash = hash * 31 + ObjectUtil.hashCode(k1);
hash = hash * 31 + ObjectUtil.hashCode(k2);
hash = hash * 31 + ObjectUtil.hashCode(k3);
hash = hash * 31 + (k4 ? 0 : 1);
return hash;
}
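On Java 7 and later, the standard library's java.util.Objects class offers null-safe helpers that play the same role as the two utility methods above. A minimal sketch, assuming that JDK version is available (CompositeKey is a made-up name standing in for TheKey):

```java
import java.util.Objects;

final class CompositeKey {
    private final String k1;
    private final String k2;
    private final String k3;
    private final boolean k4;

    CompositeKey(String k1, String k2, String k3, boolean k4) {
        this.k1 = k1;
        this.k2 = k2;
        this.k3 = k3;
        this.k4 = k4;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (o == null || getClass() != o.getClass()) return false;
        CompositeKey other = (CompositeKey) o;
        // Objects.equals treats two nulls as equal and never throws on null.
        return k4 == other.k4
                && Objects.equals(k1, other.k1)
                && Objects.equals(k2, other.k2)
                && Objects.equals(k3, other.k3);
    }

    @Override
    public int hashCode() {
        // Objects.hash combines the fields and uses 0 for null references.
        return Objects.hash(k1, k2, k3, k4);
    }

    public static void main(String[] args) {
        CompositeKey a = new CompositeKey("a,b", "c", "d", true);
        CompositeKey b = new CompositeKey("a", "b,c", "d", true);
        // Fields are kept separate, so a delimiter inside a value cannot merge two keys.
        System.out.println(a.equals(b));
    }
}
```

Because the fields are compared individually, the original delimiter problem disappears: "a,b" plus "c" is no longer confused with "a" plus "b,c".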
In Eclipse you can generate hashCode and equals by Alt-Shift-S h.
Ask Eclipse 3.5 to create the hashcode and equals methods for you :)
This is how a well-formed key class with equals and hashCode should look (generated with IntelliJ IDEA, with null checks enabled):
class TheKey {
public final String k1;
public final String k2;
public final String k3;
public final boolean k4;
public TheKey(String k1, String k2, String k3, boolean k4) {
this.k1 = k1;
this.k2 = k2;
this.k3 = k3;
this.k4 = k4;
}
@Override
public boolean equals(Object o) {
if (this == o) return true;
if (o == null || getClass() != o.getClass()) return false;
TheKey theKey = (TheKey) o;
if (k4 != theKey.k4) return false;
if (k1 != null ? !k1.equals(theKey.k1) : theKey.k1 != null) return false;
if (k2 != null ? !k2.equals(theKey.k2) : theKey.k2 != null) return false;
if (k3 != null ? !k3.equals(theKey.k3) : theKey.k3 != null) return false;
return true;
}
@Override
public int hashCode() {
int result = k1 != null ? k1.hashCode() : 0;
result = 31 * result + (k2 != null ? k2.hashCode() : 0);
result = 31 * result + (k3 != null ? k3.hashCode() : 0);
result = 31 * result + (k4 ? 1 : 0);
return result;
}
}
The implementation of your hashCode() doesn't matter much unless you make it super stupid. You could very well just return the sum of all the strings hash codes (truncated to an int) but you should make sure you fix this:
If your hash code implementation is slow, consider caching it in the instance. Depending on how long your key objects stick around and how they are used with the hash table when you get things out of it you may not want to spend longer than necessary calculating the same value over and over again. If you stick with Jon's implementation of hashCode() there is probably no need for it as String already caches its hashCode() for you.
This is however more of a general advice, since the mid 90's I've seen quite a few developers get stung on slow (and even worse, changing) hashCode() implementations.
Don't be sloppy when you create the equals() implementation. Your equals() above will be both inefficient and flawed. First of all you don't need to compare the values if the objects have different hash codes. You should also return false (and not throw a null pointer exception) if you get a null as the argument.
The rules are simple, this page will walk you through them.
Edit:
I have to ask one more thing... You say "Problem is duplicate keys can be created if the delimiter appears in the field value". Why is that?
If the format is key+delimiter+key+delimiter+key it really doesn't matter if there are one or more delimiters in the keys unless you get really unlucky with a combination of two keys and in that case you probably should have selected another delimiter (there are quite a few to choose from in unicode).
Anyway, Jon is right in his comment below... Don't do caching "until you've proven it's a good thing". It is a good practice always.
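For completeness, this is roughly what the caching idea could look like if measurements ever justified it. CachedHashKey is a hypothetical class; the approach only makes sense for immutable keys, and it treats 0 as "not computed yet", similar in spirit to what java.lang.String does.

```java
import java.util.Objects;

final class CachedHashKey {
    private final String first;
    private final String second;
    private int hash; // cached hash code; 0 means "not computed yet"

    CachedHashKey(String first, String second) {
        this.first = Objects.requireNonNull(first);
        this.second = Objects.requireNonNull(second);
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof CachedHashKey)) return false;
        CachedHashKey other = (CachedHashKey) o;
        return first.equals(other.first) && second.equals(other.second);
    }

    @Override
    public int hashCode() {
        int h = hash;
        if (h == 0) {
            // Computed at most once per instance unless the result happens to be 0.
            h = 31 * first.hashCode() + second.hashCode();
            hash = h;
        }
        return h;
    }

    public static void main(String[] args) {
        CachedHashKey key = new CachedHashKey("lot", "wafer");
        System.out.println(key.hashCode() == key.hashCode());
    }
}
```

Since the fields are final, every computation yields the same value, so the worst outcome of two threads racing on the cache is a redundant recomputation.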
Have you taken a look at the specifications of hashCode()? Perhaps this will give you a better idea of what the function should return.
I do not know if this is an option for you but apache commons library provides an implementation for MultiKeyMap
For the hashCode, you could instead use something like
k1.hashCode() ^ k2.hashCode() ^ k3.hashCode() ^ Boolean.valueOf(k4).hashCode()
XOR is entropy-preserving, and this incorporates k4's hashCode in a much better way than the previous suggestions. Just having one bit of information from k4 means that if all your composite keys have identical k1, k2, k3 and only differing k4s, your hash codes will all be identical and you'll get a degenerate HashMap.
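One property worth weighing before adopting plain XOR: it is symmetric, so swapping two field values gives the same hash code, and two identical field values cancel out to 0. The sketch below (XorHashDemo and its two-field helper methods are made up for illustration) compares it with the multiply-by-31 scheme from the earlier answer.

```java
class XorHashDemo {
    // Plain XOR of the field hash codes, as suggested above (two fields for brevity).
    static int xorHash(String k1, String k2) {
        return k1.hashCode() ^ k2.hashCode();
    }

    // The multiply-and-add scheme from the earlier answer.
    static int multiplierHash(String k1, String k2) {
        int hash = 17;
        hash = hash * 31 + k1.hashCode();
        hash = hash * 31 + k2.hashCode();
        return hash;
    }

    public static void main(String[] args) {
        System.out.println("xor swapped equal:        "
                + (xorHash("a", "b") == xorHash("b", "a")));
        System.out.println("multiplier swapped equal: "
                + (multiplierHash("a", "b") == multiplierHash("b", "a")));
        System.out.println("xor of identical fields:  " + xorHash("x", "x"));
    }
}
```

For a map with around 30 entries neither scheme is likely to matter much, but the order-sensitive version avoids these systematic collisions.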
I thought your main concern was speed (based on your original post)? Why don't you just make sure you use a separator which does not occur in your (handful of) field values? Then you can just create a String key using concatenation and do away with all this 'key-class' hocus pocus. Smells like serious over-engineering to me.