Strange results after overriding hashcode() method


I have the following situation-
I am creating a HashMap using generics. The key is of type TestHashMap, and the value is of type String.
In the main method of TestHashMap, I use three instances of TestHashMap as keys to store three different strings.
I override the hashCode() method so that it returns a different integer on each call.
Then I extract the keys in the HashMap and print out the corresponding values.
This gives me totally unexpected results: I get null for the value of each of the three key-value pairs.
Note that if, instead of returning a different integer on each call to hashCode(), I just return the same integer, everything works fine.
This has me really stymied. Any help will be much appreciated.
Here is the code. You should be able to run it if you copy it as is.
import java.util.HashMap;
import java.util.Set;
public class TestHashMap {
private static int hash = 0;
public static void main(String[] args) {
HashMap<TestHashMap,String> h = new HashMap<TestHashMap,String>();
TestHashMap thm1 = new TestHashMap();
TestHashMap thm2 = new TestHashMap();
TestHashMap thm3 = new TestHashMap();
h.put(thm1, "one");
h.put(thm2, "two");
h.put(thm3, "three");
Set<TestHashMap> keys = h.keySet();
for(TestHashMap k : keys){
System.out.println(k + " " + h.get(k));
}
}
@Override
public int hashCode(){ return hash++;}
}

When you put an object into a map, the map reads the key's hash code and places the entry into the corresponding bucket;
when you later get an object by the same key, the map reads the key's hash code again and looks in that bucket for an entry whose key is equal;
but your key gives the map a different hash code each time, so the poor map is tricked into looking in the wrong bucket.
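
One possible fix, as a minimal sketch: if the goal is just a distinct hash per key, assign the number once when the object is created so that hashCode() returns the same value on every call. The names StableKeyDemo, StableKey and id are made up for this illustration.

```java
import java.util.HashMap;
import java.util.Map;

class StableKeyDemo {
    // Hypothetical key class: the hash is fixed at construction time.
    static final class StableKey {
        private static int next = 0;
        private final int id = next++;

        @Override
        public int hashCode() {
            return id; // same value on every call for a given instance
        }
    }

    // Stores three values and reads them back through the same key objects.
    static String roundTrip() {
        Map<StableKey, String> map = new HashMap<>();
        StableKey k1 = new StableKey();
        StableKey k2 = new StableKey();
        StableKey k3 = new StableKey();
        map.put(k1, "one");
        map.put(k2, "two");
        map.put(k3, "three");
        return map.get(k1) + " " + map.get(k2) + " " + map.get(k3);
    }

    public static void main(String[] args) {
        System.out.println(roundTrip()); // one two three
    }
}
```

equals() is left as the inherited identity comparison here, which is consistent with a per-instance hash.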

Related

Can we use object as a key in hashmap in Java?

How to use an object as a key in a hashmap. If you use an object as key do you need to override equals and hashcode methods for that object?

A simple rule of thumb is to use immutable objects as keys in a HashMap,
because:
If the key were mutable, its hashCode() value or equals() result might change after insertion, and you might never be able to retrieve the entry from your HashMap.
More precisely, the class fields that are used to compute equals() and hashCode() should be immutable!
Now, suppose you create your own class:
To compare two objects of your class you will have to override equals().
To use it as a key in any hash-based data structure you will have to override hashCode() (again, keeping immutability in mind).
Remember that if two objects are equal according to equals(), then their hashCode() values must be equal as well!

hashCode() - HashMap provides put(key, value) for storing and get(key) for retrieving values. When put(key, value) stores a key-value pair, HashMap calls hashCode() on the key object to calculate a hash, which is used to find the bucket where the Entry object is stored. When get() retrieves a value, the key object is again used to calculate a hash, which is then used to find the bucket where that particular key is stored.
equals() - equals() is used to compare objects for equality. In a HashMap, keys are compared using equals(). HashMap knows how to handle hash collisions (more than one key having the same hash value and thus being assigned to the same bucket). In that case the entries are stored in a linked list within the bucket.
hashCode() helps in finding the bucket where a key is stored, while equals() helps in finding the right key, as there may be more than one key-value pair stored in a single bucket.

You can use any object in a HashMap as long as it has properly defined hashCode and equals methods - those are absolutely crucial because the hashing mechanism depends on them.

Answer to your question is yes, objects of custom classes can be used as a key in a HashMap. But in order to retrieve the value object back from the map without failure, there are certain guidelines that need to be followed.
1)Custom class should follow the contract between hashCode() and equals().
The contract states that:
If two objects are equal according to the equals(Object) method, then calling
the hashCode method on each of the two objects must produce the same integer result.
This can be done by implementing hashcode() and equals() in your custom class.
2) Make custom class immutable.
Hint: use final, remove setters, use deep copy to set fields

package com.java.demo.map;
import java.util.HashMap;
public class TestMutableKey
{
public static void main(String[] args)
{
//Create a HashMap with mutable key
HashMap<Account, String> map = new HashMap<Account, String>();
//Create key 1
Account a1 = new Account(1);
a1.setHolderName("A_ONE");
//Create key 2
Account a2 = new Account(2);
a2.setHolderName("A_TWO");
//Put mutable key and value in map
map.put(a1, a1.getHolderName());
map.put(a2, a2.getHolderName());
//Change the keys' state after they have been inserted
a1.setHolderName("Defaulter");
a2.setHolderName("Bankrupt");
//Success !! We are able to get back the values
System.out.println(map.get(a1)); //Prints A_ONE
System.out.println(map.get(a2)); //Prints A_TWO
//Try with newly created key with same account number
Account a3 = new Account(1);
a3.setHolderName("A_THREE");
//Success !! We are still able to get back the value for account number 1
System.out.println(map.get(a3)); //Prints A_ONE
}
}
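
The listing above uses an Account class that is not shown. For the printed results to hold, its equals() and hashCode() would have to depend only on the account number and ignore the holder name. A minimal sketch of such a class, assuming that design (the field names and the wrapper class AccountKeyDemo are guesses, not the original code):

```java
import java.util.HashMap;
import java.util.Map;

class AccountKeyDemo {
    // Hypothetical reconstruction: identity is the immutable account number only.
    static final class Account {
        private final int accountNumber;
        private String holderName; // mutable, deliberately excluded from equals/hashCode

        Account(int accountNumber) { this.accountNumber = accountNumber; }

        String getHolderName() { return holderName; }
        void setHolderName(String holderName) { this.holderName = holderName; }

        @Override
        public int hashCode() { return Integer.hashCode(accountNumber); }

        @Override
        public boolean equals(Object obj) {
            if (this == obj) return true;
            if (!(obj instanceof Account)) return false;
            return accountNumber == ((Account) obj).accountNumber;
        }
    }

    static String lookupAfterRename() {
        Map<Account, String> map = new HashMap<>();
        Account a1 = new Account(1);
        a1.setHolderName("A_ONE");
        map.put(a1, a1.getHolderName());
        a1.setHolderName("Defaulter");  // does not affect the hash code
        return map.get(new Account(1)); // a fresh, equal key still finds the entry
    }

    public static void main(String[] args) {
        System.out.println(lookupAfterRename()); // A_ONE
    }
}
```

If the holder name were part of hashCode(), changing it after insertion would most likely make the lookups return null.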

Yes, you should override equals() and hashCode() for the code to work properly; otherwise you won't be able to get back the value for a key you inserted into the map unless you still hold that exact key instance.
e.g.
map.put(new Object(), "value");
and when you later want to get that value:
map.get(new Object()); // This will always return null
Because each new Object() has its own identity-based hash code, the lookup will usually land in a different bucket from the one holding the value. Even if it happens to land in the same bucket, the default equals() compares references, so the keys never match and get() always returns null.
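
A small runnable sketch of the point above (the class and method names are made up for illustration): a plain Object key can only be found again through the very same reference.

```java
import java.util.HashMap;
import java.util.Map;

class IdentityKeyDemo {
    // Looking up with the same reference that was used for put() works.
    static boolean foundWithSameReference() {
        Map<Object, String> map = new HashMap<>();
        Object key = new Object();
        map.put(key, "value");
        return "value".equals(map.get(key));
    }

    // A different Object is never equal to the stored key, so get() returns null.
    static boolean foundWithNewObject() {
        Map<Object, String> map = new HashMap<>();
        map.put(new Object(), "value");
        return map.get(new Object()) != null;
    }

    public static void main(String[] args) {
        System.out.println(foundWithSameReference()); // true
        System.out.println(foundWithNewObject());     // false
    }
}
```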

Yes, we can use any object as a key in a Map in Java, but we need to override the equals() and hashCode() methods of that object's class. Please refer to the example below, in which I store objects of a Pair class as keys in a HashMap whose values are strings. I have overridden the hashCode() and equals() methods of the Pair class so that different Pair objects with the same (x, y) values are treated as one key.
import java.util.*;
import java.util.Map.Entry;
class App { // Case-sensitive
private class Pair {
private int x, y;
public Pair(int x, int y) {
this.x = x;
this.y = y;
}
@Override
public int hashCode() {
final int prime = 31;
int result = 1;
result = prime * result + x;
result = prime * result + y;
return result;
}
@Override
public boolean equals(Object obj) {
if (this == obj)
return true;
if (obj == null)
return false;
if (getClass() != obj.getClass())
return false;
Pair other = (Pair) obj;
if (x != other.x)
return false;
if (y != other.y)
return false;
return true;
}
}
public static void main(String[] args) {
App obj = new App();
obj.show();
}
private void show() {
Map<Pair, String> map = new HashMap<>();
Pair obj1 = new Pair(10, 20);
Pair obj2 = new Pair(40, 50);
Pair obj3 = new Pair(10, 20);
// obj1 and obj3 have the same values, so we want to store them as one key.
// To achieve that, we have overridden the equals() and hashCode() methods of the Pair class.
map.put(obj1, "First");
map.put(obj2, "Second");
map.put(obj3, "Third");
System.out.printf("Size of Map is :%d \n", map.size());
for (Entry<App.Pair, String> p : map.entrySet()) {
Pair pair = p.getKey();
System.out.printf("Map key-value pair is (%d,%d)->%s \n", pair.x, pair.y, p.getValue());
}
// output -
// Size of Map is :2
// Map key-value pair is (10,20)->Third
// Map key-value pair is (40,50)->Second
}
}

HashMap is not adding duplicate keys

import java.util.*;
class U {
int x;
U(int x) {
this.x = x;
}
}
public class G {
public U a = new U(22);
public U b = new U(23);
Integer y = 22;
Integer r = 23;
void a() {
Map<U, Integer> set = new HashMap<U, Integer>();
set.put(a, y);
set.put(a, r);
set.put(b, y);
System.out.print(set.size() + " ");
}
public static void main(String[] args) {
G m = new G();
m.a();
}
}
I always get confused by Maps and Lists.
I know that when a map puts a key into the collection, it calls hashCode(), and if the bucket is the same, the equals() method is called. However, I learned that duplicate keys are rejected only if the class overrides these two methods. For example, String and the wrapper classes implement their own hashCode() and equals() methods. If you don't do so, a unique hash code is used for each object and duplicate keys get stored in the collection.
But in the above example, class U does NOT implement hashCode() or equals(). However, the Map is not allowing duplicate keys.
I checked the size: it's 2.
It's supposed to be 3, because my U class implements neither hashCode() nor equals().
Please clarify this for me.
Thanks in advance

HashMap doesn't allow duplicate keys.
If you don't provide hashCode() and equals() implementations, the class inherits them from its superclass (in your case java.lang.Object). That implementation returns the same hash code every time for the same object, and its equals() returns true when an object is compared with itself:
public boolean equals(Object obj) {
return (this == obj);
}

You are using the same instance of U as a key twice:
set.put(a, y);
set.put(a, r);
Your U class does not implement hashCode() as you mention, so the default implementation is Object#hashCode which will obviously be the same since it is the same instance. Therefore the map will only contain the second entry. However, if you try the following you would end up with two separate entries:
set.put(new U(22), y);
set.put(new U(22), r);
But generally you will always want to implement equals() and hashCode() for any class used as the key of a map - otherwise you can't look up the value without having access to the exact instance it was stored as!
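
To make the difference concrete, here is a minimal sketch with two made-up variants of the U class: PlainU inherits Object's identity-based methods, while ValueU overrides equals() and hashCode() based on x.

```java
import java.util.HashMap;
import java.util.Map;

class ValueKeyDemo {
    // Hypothetical variant without overrides: every instance is a distinct key.
    static final class PlainU {
        final int x;
        PlainU(int x) { this.x = x; }
    }

    // Hypothetical variant with value-based equality.
    static final class ValueU {
        final int x;
        ValueU(int x) { this.x = x; }

        @Override
        public boolean equals(Object o) {
            return o instanceof ValueU && ((ValueU) o).x == x;
        }

        @Override
        public int hashCode() { return Integer.hashCode(x); }
    }

    static int sizeWithPlainKeys() {
        Map<PlainU, Integer> map = new HashMap<>();
        map.put(new PlainU(22), 22);
        map.put(new PlainU(22), 23); // different instance, so a second entry
        return map.size();
    }

    static int sizeWithValueKeys() {
        Map<ValueU, Integer> map = new HashMap<>();
        map.put(new ValueU(22), 22);
        map.put(new ValueU(22), 23); // equal key, so the value is replaced
        return map.size();
    }

    public static void main(String[] args) {
        System.out.println(sizeWithPlainKeys()); // 2
        System.out.println(sizeWithValueKeys()); // 1
    }
}
```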

By design hashmaps do not add duplicate keys. It will replace the value of the current item in the map with that key. See http://docs.oracle.com/javase/7/docs/api/java/util/HashMap.html#put%28K,%20V%29

If you want to add duplicate keys, try something like this:
Map<Integer, List<Integer>> map = new HashMap<Integer, List<Integer>>();
map.put(1, new ArrayList<Integer>());
map.get(1).add(1);
map.get(1).add(2);
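
The same idea can be written a little more compactly, assuming Java 8 or later, with Map.computeIfAbsent, which creates the list the first time a key is seen. The class and method names below are only for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class MultiValueDemo {
    // Appends value to the list stored under key, creating the list on first use.
    static void add(Map<Integer, List<Integer>> map, Integer key, Integer value) {
        map.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
    }

    static List<Integer> valuesForKeyOne() {
        Map<Integer, List<Integer>> map = new HashMap<>();
        add(map, 1, 1);
        add(map, 1, 2);
        return map.get(1);
    }

    public static void main(String[] args) {
        System.out.println(valuesForKeyOne()); // [1, 2]
    }
}
```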

Understanding the Implementation of HashTable in Java

I am trying to understand the implementation of Hashtable in Java. Below is my code:
Hashtable<Integer, String> hTab = new Hashtable<Integer, String>();
hTab.put(1, "A");
hTab.put(1, "B");
hTab.put(2, "C");
hTab.put(3, "D");
Iterator<Map.Entry<Integer, String>> itr = hTab.entrySet().iterator();
Entry<Integer, String> entry;
while(itr.hasNext()){
entry = itr.next();
System.out.println(entry.getValue());
}
When I run it, I get the below output:
D
C
B
Which means that there has been a collision for the Key = 1; and as per the implementation:
"Whenever a collision happens in the hashTable, a new node is created in the linkedList corresponding for the particular bucket and the EntrySet(Key, Value) pairs are stored as nodes in the list, the new value is inserted in the beginning of the list for the particular bucket". And I completely agree to this implementation.
But if this is true, then where did "A" go when I try to retrieve the entrysets from the hashTable?
Again, I tried with the below code to understand this by implementing my own HashCode and equals method. And surprisingly, this works perfect and as per the HashTable implementation. Below is my code:
public class Hash {
private int key;
public Hash(int key){
this.key = key;
}
public int hashCode(){
return key;
}
public boolean equals(Hash o){
return this.key == o.key;
}
}
public class HashTable1 {
public static void main(String[] args) {
// TODO Auto-generated method stub
Hashtable<Hash, String> hTab = new Hashtable<Hash, String>();
hTab.put(new Hash(1), "A");
hTab.put(new Hash(1), "B");
hTab.put(new Hash(2), "C");
hTab.put(new Hash(3), "D");
Iterator<Map.Entry<Hash, String>> itr = hTab.entrySet().iterator();
Entry<Hash, String> entry;
while(itr.hasNext()){
entry = itr.next();
System.out.println(entry.getValue());
}
}
}
Output :
D
C
B
A
Which is perfect. I am not able to understand this ambiguity in the behavior of HashTable in Java.
Update
@garrytan and @Brian: thanks for responding. But I still have a small doubt.
In my second code, where it works fine, I have created two objects that are new keys, and since they are two objects, a key collision does not happen in this case. I agree with your explanation. However, if in the first piece of code I use "new Integer(1)" instead of simply "1", it still doesn't work, although I am now creating two objects and they should be different. I cross-checked by writing the simple lines below:
Integer int1 = new Integer(1);
Integer int2 = new Integer(1);
System.out.println(int1 == int2);
which prints "false". That means the key collision should now have been resolved, but it still doesn't work. Why is this?

By design, a hashtable is not meant to store duplicate keys.
I think you are mixing up 'hash collision' and 'key collision'. Put simply, a hash table consists of a collection of linked lists (i.e. buckets). When you add a new key-value pair (KVP), it is distributed into a bucket by the key's hash value. A 'hash collision' happens when two keys result in the same hash (hence they get put into the same bucket). A 'key collision' happens when the two keys are also equal, in which case the new value replaces the old one.
A good hash function is one that distributes the keys evenly across the buckets, hence improving key search performance.

The second example gives the behaviour you want because your implementation of equals is incorrect.
The signature is
public boolean equals(Object o) {}
not
public boolean equals(Hash h) {}
So what you have created is a hash Collision, where two objects have the same hash code (key), but they are not equal according to the equals method (because your signature is wrong, it's still using the == operator and not your this.key == h.key code). As opposed to a key collision, where the objects both have the same hashCode and are also equals, as in your first example. If you fix the code in the second example to implement the actual equals(Object o) method you will see 'A' will again be missing from the values.

In your second example you are not overriding the original equals function because you use the following signature:
public boolean equals(Hash h) {}
Thus the original equals function with Object as a parameter is still used and as you create a new object Hash for each insert that Object is different from the other one and thus your keys for A and B are not equal.
Furthermore a HashTable is designed to have ONE value for EACH key. And keys are indeed relying on the equals functions to be compared.
About your example with two new Integers, try comparing them with .equals(). You could also override the hashCode function to generate different hashCodes or not for each object, i.e. depending on time, but that would be not a good coding principle. Objects which are the same should hash to the same code.
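
For reference, a minimal sketch of the second example with equals(Object) actually overridden (the wrapper class FixedEqualsDemo is made up; Hash is simplified from the question). With this change the two Hash(1) keys are equal, so "A" is replaced by "B" just as in the Integer example.

```java
import java.util.Hashtable;
import java.util.Map;

class FixedEqualsDemo {
    static final class Hash {
        private final int key;
        Hash(int key) { this.key = key; }

        @Override
        public int hashCode() { return key; }

        // Takes Object, so this really overrides Object.equals.
        @Override
        public boolean equals(Object o) {
            return o instanceof Hash && ((Hash) o).key == key;
        }
    }

    static Map<Hash, String> build() {
        Map<Hash, String> table = new Hashtable<>();
        table.put(new Hash(1), "A");
        table.put(new Hash(1), "B"); // equal key: replaces "A"
        table.put(new Hash(2), "C");
        table.put(new Hash(3), "D");
        return table;
    }

    static int size() { return build().size(); }

    static String valueForKeyOne() { return build().get(new Hash(1)); }

    public static void main(String[] args) {
        System.out.println(size());           // 3
        System.out.println(valueForKeyOne()); // B
    }
}
```

Adding @Override to the original equals(Hash o) would have turned the mistake into a compile error, which is a good reason to use the annotation.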

Chaining in HashMap

Code:
public static void main(String[] args) {
Map<String,String> map= new HashMap<String,String>();
map.put("a", "s");
map.put("a", "v");
System.out.println(map.get("a"));
}
Now, as per my understanding, since the key values in both the put case is the same i.e. a, collision is bound to happen, and hence chaining occurs. [Correct me if I am wrong].
Now if I want to retrieve the list of all the values mapped to key value a, how do i get it?
Right now my println prints v only.

This has nothing to do with collision or chaining: you're replacing the old value of a with a new value.
A map keeps unique keys. collision/chaining will occur in a hash data structure when two distinct keys happen to get the same hash value based on the particular hash function. Or in java, you can explicitly create an object that returns the same value for hashCode().
If you want mapping with multiple values for a key, then you'll need to use a different data structure/class.

As other people have already pointed out, there is no collision in your case.
It's simply that a HashMap accepts only unique keys.
However, there are alternatives if you want keys to be non-unique, for example Google Guava's Multimap or Apache Commons Collections' MultiValuedMap.
Example using Google lib:
import java.util.Collection;
import com.google.common.collect.ArrayListMultimap;
import com.google.common.collect.Multimap;
public class MultiMapTest {
public static void main(String... args) {
Multimap<String, String> myMultimap = ArrayListMultimap.create();
// Adding some key/value pairs
myMultimap.put("Fruits", "Banana");
myMultimap.put("Fruits", "Apple");
myMultimap.put("Fruits", "Pear");
myMultimap.put("Vegetables", "Carrot");
// Getting the size (total number of key/value pairs)
int size = myMultimap.size();
System.out.println(size); // 4
// Getting values
Collection<String> fruits = myMultimap.get("Fruits");
System.out.println(fruits); // [Banana, Apple, Pear]
Collection<String> vegetables = myMultimap.get("Vegetables");
System.out.println(vegetables); // [Carrot]
// Iterating over the entire Multimap
for(String value : myMultimap.values()) {
System.out.println(value);
}
// Removing a single value
myMultimap.remove("Fruits","Pear");
System.out.println(myMultimap.get("Fruits")); // [Banana, Apple]
// Remove all values for a key
myMultimap.removeAll("Fruits");
System.out.println(myMultimap.get("Fruits")); // [] (Empty Collection!)
}
}

See the java doc for put
Associates the specified value with the specified key in this map (optional operation). If the map previously contained a mapping for the key, the old value is replaced by the specified value. (A map m is said to contain a mapping for a key k if and only if m.containsKey(k) would return true.)
A collision happens when two different keys produce the same hash code, not when the same key is used twice.
class StringKey {
String text;
public StringKey() {
text = "";
}
public StringKey(String text) {
this.text = text;
}
public String getText() {
return text;
}
public void setText(String text) {
this.text = text;
}
@Override
public int hashCode() {
// Deliberately constant: every key hashes to the same bucket, forcing a collision.
return 0;
}
@Override
public boolean equals(Object o) {
if (o instanceof StringKey) {
return ((StringKey) o).getText().equals(this.getText());
}
return false;
}
public static void main(String[] args) {
Map<StringKey, String> map = new HashMap<StringKey, String>();
StringKey key1 = new StringKey("a");
StringKey key2 = new StringKey("b");
map.put(key1, "s");
map.put(key2, "v");
System.out.println(map.get(key1));
System.out.println(key1.hashCode() + " " + key2.hashCode() + " " + key1.equals(key2));
}
}
The output is
s
0 0 false
Now this will cause a collision, but you cannot tell that from the map's keys and values in the output.

The second put() simply overwrites what the first put() wrote. There is no chaining.

Second put replaces first put, so you will have only one value with key "a" in Hashmap.
So your map just contains
map.put("a", "v");

Now,as per my understanding, since the key values in both the put case
is the same i.e. a, collision is bound to happen, and hence chaining
occurs. [Correct me if i am wrong].
You're wrong. That's not how a Map works. Consider using a Multimap from Google's Guava library.
You can always roll your own:
Map<String, List<String>> map = new HashMap<String, List<String>>();

You will have to make your HashMap as follows
public static void main(String[] args) {
HashMap<String, ArrayList<String>> map = new HashMap<String, ArrayList<String>>();
if ( map.get("a") == null ){
map.put("a", new ArrayList<String>());
}
ArrayList<String> innerList = map.get("a");
innerList.add("s");
innerList.add("v");
map.put("a",innerList);
System.out.println(map.get("a"));
}

The hashing used by HashMap can seem opaque at first. Internally, a HashMap is essentially an array of buckets. The position of a bucket in that array is referred to here as the 'hashValue'. Because the hashValue is an array index, it has to be smaller than the length of the internal array. The HashMap's hashing algorithm converts the key's hash code into the hashValue, and that is where the map stores the Entry (key-value pair).
When an element is put into a Map, the map derives the hashValue from the key's hash code and stores the Entry in the array at that index.
No hashing algorithm can guarantee that the hashValues generated for two different keys are always different. Two keys can have the same hashValue under two conditions:
1) The keys are the same (as in your case)
2) The keys are different, but the hashValue generated for both keys is the same.
The map cannot simply replace the value of the Entry at the hashValue position in the array, because that would lose data in the second case, which is perfectly valid. This is where equals() comes into the picture. The HashMap checks for equality between the new key and the key that already exists in that index's Entry. If the keys are equal, the value is replaced; otherwise it is a collision, and the HashMap applies its collision-handling technique (chaining).
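
As a rough illustration of the hash-code-to-index step, here is a simplified sketch modeled on what OpenJDK 8 and later do (fold the high bits into the low bits, then mask to the table length). It is not the actual HashMap source, other versions may differ, and the names BucketIndexDemo and indexFor are made up.

```java
class BucketIndexDemo {
    // tableLength is assumed to be a power of two, as in HashMap's internal array.
    static int indexFor(Object key, int tableLength) {
        int h = (key == null) ? 0 : key.hashCode();
        h ^= (h >>> 16);              // mix high bits into low bits
        return h & (tableLength - 1); // keep only the bits that fit the table
    }

    public static void main(String[] args) {
        // "a".hashCode() is 97 and "b".hashCode() is 98
        System.out.println(indexFor("a", 16)); // 1
        System.out.println(indexFor("b", 16)); // 2
    }
}
```

Two different keys can map to the same index even when their hash codes differ, which is why the equals() check described above is still needed.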
Now, if you want the list of all the values that you put for a particular key, consider using a composite map
HashMap<String, List<String>>.

Both keys you put into the HashMap are the same string, so they have the same hash code and are equal according to equals(). The first value is therefore overwritten, and you end up with only one value in the HashMap.
Two keys are stored separately only when equals() reports them as different, which for your own classes means overriding equals() and hashCode() accordingly.

Further notes on when chaining actually takes place in a HashMap:
The Java implementation of HashMap will either overwrite the value for a key or chain a new entry to it, depending on the following:
1. You put an object foo as key, with hash code X, into the map.
2. You put another object bar (also as key) that has the same hash code X into the map.
3. Since the hashes are the same, the algorithm needs to put bar at the same index where foo is already stored. It then consults the equals method to determine whether it should chain bar to foo (i.e. foo's next entry becomes bar) or overwrite foo's entry:
3.1. If equals returns true, foo and bar are either the same object or semantically the same, and the existing value is overwritten rather than chained.
3.2. If equals returns false, foo and bar are treated as two distinct entities and chaining takes place. If you then print your HashMap, you will see both foo and bar.
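
The steps above can be tried out with a small sketch. The Key class here is hypothetical: its hashCode() is constant so that every key collides, and equals() compares names.

```java
import java.util.HashMap;
import java.util.Map;

class CollisionDemo {
    static final class Key {
        final String name;
        Key(String name) { this.name = name; }

        @Override
        public int hashCode() { return 42; } // every key lands in the same bucket

        @Override
        public boolean equals(Object o) {
            return o instanceof Key && ((Key) o).name.equals(name);
        }
    }

    static String describe() {
        Map<Key, String> map = new HashMap<>();
        map.put(new Key("foo"), "first");
        map.put(new Key("bar"), "second"); // same hash, not equal: stored alongside foo
        map.put(new Key("foo"), "third");  // same hash, equal: value replaced
        return map.size() + " " + map.get(new Key("foo")) + " " + map.get(new Key("bar"));
    }

    public static void main(String[] args) {
        System.out.println(describe()); // 2 third second
    }
}
```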

Problem with Maps in java

I have a Hashmap which has X number of elements
I need to move this map into another map
This is what my code looks like
Map originMap = initialize();
Map destMap = new HashMap();
int originMapSize = originMap.size();
Set<Map.Entry<K, V>> entries = originMap.entrySet();
for (Map.Entry<K, V> mapEntry : entries) {
K key = mapEntry.getKey();
V value = mapEntry.getValue();
destMap.put (key,value);
}
// Shouldnt this be equal to originMapSize ????
int destMapSize = destMap.size();
What I am observing is - originMapSize is NOT equal to the destMapSize
It seems when we put the elements in the destMap, some of the elements are being overridden
We have overridden the hashCode and equals methods, and the implementation is suspicious.
However, if the originMap allowed the elements to be added, why would the destinationMap not add a new elements and override an existing element instead ?

This could happen if the equals method was asymmetric. Suppose there are two keys a and b such that:
a.hashCode() == b.hashCode()
a.equals(b) returns false
b.equals(a) returns true
Then suppose that the HashMap implementation searches for an existing key by calling existingKey.equals(newKey) for each existing key with the same hash code as the new key.
Now suppose we originally add them in the order { a, b }.
The first key (a) obviously goes in with no problems. The second key (b) insertion ends up calling a.equals(b) - which is false, so we get two keys.
Now building the second HashMap, we may end up getting the entries in the order { b, a }.
This time we add b first, which is fine... but when we insert the second key (a) we end up calling b.equals(a), which returns true, so we overwrite the entry.
That may not be what's going on, but it could explain things - and shows the dangers of an asymmetric equals method.
EDIT: Here's a short but complete program demonstrating this situation. (The exact details of a and b may not be the same, but the asymmetry is.)
import java.util.*;
public class Test {
private final String name;
public Test(String name)
{
this.name = name;
}
public static void main(String[] args)
{
Map<Test, String> firstMap = new HashMap<Test, String>();
Test a = new Test("a");
Test b = new Test("b");
firstMap.put(b, "b");
firstMap.put(a, "a");
Map<Test, String> secondMap = new HashMap<Test, String>();
for (Map.Entry<Test, String> entry : firstMap.entrySet())
{
System.out.println("Adding " + entry.getKey().name);
secondMap.put(entry.getKey(), entry.getValue());
}
System.out.println(secondMap.size());
}
@Override public int hashCode()
{
return 0;
}
@Override public boolean equals(Object other)
{
return this.name.equals("b");
}
}
Output on my machine:
Adding a
Adding b
1
You may not get the output that way round - it depends on:
The way that equals is called (candidateKey.equals(newKey) or vice versa)
The order in which entries are returned from the set
It may even work differently on different runs.

Those values should be equal, but the problem is you are iterating over a different Map object.
for (Map.Entry mapEntry : entries)
is not the same as
for (Map.Entry mapEntry : originMap)

I suspect the order of the elements being added to the first hashmap is not the same as the order added to the second. This combined with the sketchy hashCode method is causing duplicates to be added to the first.
Try changing hashCode to always return the same value to see if your problem goes away.

Why don't you use destMap.putAll(originMap) ?

Map has a putAll method. Try something like this:
Map<String, String> destination = new HashMap<String, String>();
Map<String, String> original = new HashMap<String, String>();
destination.putAll(original);
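
A short sketch of both copying options, assuming well-behaved keys such as String (the class name CopyMapDemo is made up). Note that putAll and the copy constructor still rely on the keys' equals() and hashCode(), so they will not cure a broken key class.

```java
import java.util.HashMap;
import java.util.Map;

class CopyMapDemo {
    static boolean copiesMatch() {
        Map<String, String> original = new HashMap<>();
        original.put("a", "1");
        original.put("b", "2");

        // Option 1: copy into an existing map
        Map<String, String> viaPutAll = new HashMap<>();
        viaPutAll.putAll(original);

        // Option 2: copy constructor
        Map<String, String> viaConstructor = new HashMap<>(original);

        return viaPutAll.equals(original)
                && viaConstructor.equals(original)
                && viaPutAll.size() == original.size();
    }

    public static void main(String[] args) {
        System.out.println(copiesMatch()); // true
    }
}
```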

It depends on how the first HashMap is initialized. Also, once the number of entries exceeds the load factor (75% by default) times the current capacity, the HashMap doubles its capacity to accommodate new values. The default initial capacity is 16, so when you go past 12 entries (75% of 16) it grows to 32.
