Question
How is the HashMap method putIfAbsent able to perform a conditional put in a way that's faster than calling containsKey(x) first?
For example, if you didn't use putIfAbsent you could use:
if (!map.containsKey(x)) {
    map.put(x, someValue);
}
I had previously thought putIfAbsent was a convenience method for calling containsKey followed by put on a HashMap. But after running a benchmark, putIfAbsent is significantly faster than using containsKey followed by put. I looked at the java.util source code to try to see how this is possible, but it's a bit too cryptic for me to figure out. Does anyone know how putIfAbsent internally achieves a better running time? That's my assumption based on running a few code tests, in which my code ran 50% faster when using putIfAbsent. It seems to avoid calling get(), but how?
Example
if (!map.containsKey(x)) {
    map.put(x, someValue);
}
VS
map.putIfAbsent(x, someValue);
Java source code for HashMap.putIfAbsent
@Override
public V putIfAbsent(K key, V value) {
    return putVal(hash(key), key, value, true, true);
}
final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
               boolean evict) {
    Node<K,V>[] tab; Node<K,V> p; int n, i;
    if ((tab = table) == null || (n = tab.length) == 0)
        n = (tab = resize()).length;
    if ((p = tab[i = (n - 1) & hash]) == null)
        tab[i] = newNode(hash, key, value, null);
    else {
        Node<K,V> e; K k;
        if (p.hash == hash &&
            ((k = p.key) == key || (key != null && key.equals(k))))
            e = p;
        else if (p instanceof TreeNode)
            e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
        else {
            for (int binCount = 0; ; ++binCount) {
                if ((e = p.next) == null) {
                    p.next = newNode(hash, key, value, null);
                    if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                        treeifyBin(tab, hash);
                    break;
                }
                if (e.hash == hash &&
                    ((k = e.key) == key || (key != null && key.equals(k))))
                    break;
                p = e;
            }
        }
        if (e != null) { // existing mapping for key
            V oldValue = e.value;
            if (!onlyIfAbsent || oldValue == null)
                e.value = value;
            afterNodeAccess(e);
            return oldValue;
        }
    }
    ++modCount;
    if (++size > threshold)
        resize();
    afterNodeInsertion(evict);
    return null;
}
The HashMap implementation of putIfAbsent searches for the key just once, and if it doesn't find the key, it puts the value in the relevant bin (which was already located). That's what putVal does.
On the other hand, using map.containsKey(x) followed by map.put(x,someValue) performs two lookups for the key in the Map, which takes more time.
Note that put also calls putVal (put calls putVal(hash(key), key, value, false, true) while putIfAbsent calls putVal(hash(key), key, value, true, true)), so putIfAbsent has the same performance as calling just put, which is faster than calling both containsKey and put.
See Eran's answer... I'd like to answer it more succinctly as well. put and putIfAbsent both use the same helper method, putVal, but callers of put can't reach the onlyIfAbsent parameter that enables the put-if-absent behavior; the public method putIfAbsent exposes it. So putIfAbsent has the same underlying time complexity as the put you were already going to use in conjunction with containsKey; the containsKey call then becomes pure waste.
So the core of this is that the same internal method, putVal, is used by both put and putIfAbsent.
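The single-lookup behavior is also visible in putIfAbsent's return-value contract: one call both tests for the key and inserts if absent, returning the previous value (or null). A minimal sketch, with arbitrary map contents:

```java
import java.util.HashMap;
import java.util.Map;

public class PutIfAbsentDemo {
    public static void main(String[] args) {
        Map<String, Integer> map = new HashMap<>();

        // One lookup: the key is absent, so the mapping is inserted
        // and null is returned.
        Integer previous = map.putIfAbsent("x", 1);
        System.out.println(previous);          // null

        // One lookup: the key is present, so the value is left
        // untouched and the existing value is returned.
        previous = map.putIfAbsent("x", 99);
        System.out.println(previous);          // 1
        System.out.println(map.get("x"));      // 1

        // The two-step alternative performs two separate lookups:
        if (!map.containsKey("y")) {           // lookup #1
            map.put("y", 2);                   // lookup #2
        }
        System.out.println(map.get("y"));      // 2
    }
}
```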
Related
public static void main(String[] args) {
    TreeSet tree = new TreeSet();
    String obj = "Ranga";
    tree.add(null);
    tree.add(obj);
}
As far as I know, TreeSet depends on the natural sort order by default, so the JVM internally calls the compareTo() method.
In the above example, the comparison amounts to:
obj.compareTo(null);
So why is the result a NullPointerException?
From 1.7 onwards, null is not accepted at all by TreeSet. If you try to add it anyway, you will get a NullPointerException. Up to 1.6, null was accepted, but only as the first element.
Before Java 7:
For a non-empty TreeSet, trying to insert a null value at run time throws a NullPointerException. This is because when elements already exist in the tree, the set compares the new object against the existing ones using the compareTo() method to decide where to place it, so inserting null makes compareTo() throw a NullPointerException internally.
TreeMap Add method documentation
When you try to add null to an empty TreeSet, there is initially no element to compare against, so the add succeeds without an NPE. When you add a second element, the TreeSet uses the Comparable compareTo() method to sort the elements and place the new one, so the comparison against the null element definitely leads to an NPE.
TreeSet is backed by a TreeMap internally. Before Java 7, TreeMap's put(K,V) did not have a null check for K (the key); from Java 7 on, a null check was added to TreeMap's put(K,V) method.
Before Java 7, the TreeMap put method code had no null check:
public V put(K key, V value) {
    Entry<K,V> t = root;
    if (t == null) {
        incrementSize();
        root = new Entry<K,V>(key, value, null);
        return null;
    }
    while (true) {
        int cmp = compare(key, t.key);
        if (cmp == 0) {
            return t.setValue(value);
        } else if (cmp < 0) {
            if (t.left != null) {
                t = t.left;
            } else {
                incrementSize();
                t.left = new Entry<K,V>(key, value, t);
                fixAfterInsertion(t.left);
                return null;
            }
        } else { // cmp > 0
            if (t.right != null) {
                t = t.right;
            } else {
                incrementSize();
                t.right = new Entry<K,V>(key, value, t);
                fixAfterInsertion(t.right);
                return null;
            }
        }
    }
}
From Java 7 on, you can see the null check for the key; if it is null, put throws an NPE:
public V put(K key, V value) {
    Entry<K,V> t = root;
    if (t == null) {
        compare(key, key); // type (and possibly null) check
        root = new Entry<>(key, value, null);
        size = 1;
        modCount++;
        return null;
    }
    int cmp;
    Entry<K,V> parent;
    // split comparator and comparable paths
    Comparator<? super K> cpr = comparator;
    if (cpr != null) {
        do {
            parent = t;
            cmp = cpr.compare(key, t.key);
            if (cmp < 0)
                t = t.left;
            else if (cmp > 0)
                t = t.right;
            else
                return t.setValue(value);
        } while (t != null);
    }
    else {
        if (key == null)
            throw new NullPointerException();
        Comparable<? super K> k = (Comparable<? super K>) key;
        do {
            parent = t;
            cmp = k.compareTo(t.key);
            if (cmp < 0)
                t = t.left;
            else if (cmp > 0)
                t = t.right;
            else
                return t.setValue(value);
        } while (t != null);
    }
    Entry<K,V> e = new Entry<>(key, value, parent);
    if (cmp < 0)
        parent.left = e;
    else
        parent.right = e;
    fixAfterInsertion(e);
    size++;
    modCount++;
    return null;
}
I hope this leads you to the proper conclusion.
This is just to maintain the contract; the behavior is enforced by Comparable in the case of natural ordering.
The natural ordering for a class C is said to be consistent with equals if and only if e1.compareTo(e2) == 0 has the same boolean value as e1.equals(e2) for every e1 and e2 of class C. Note that null is not an instance of any class, and e.compareTo(null) should throw a NullPointerException even though e.equals(null) returns false.
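The asymmetry in that contract, equals(null) returning false while compareTo(null) throws, can be sketched directly; a minimal, self-contained example:

```java
public class NullCompareDemo {
    public static void main(String[] args) {
        String e = "Ranga";

        // equals(null) is specified to return false, never to throw.
        System.out.println(e.equals(null));    // false

        // compareTo(null) is specified to throw NullPointerException,
        // which is exactly what TreeSet/TreeMap hit internally when
        // comparing a new element against a stored null.
        try {
            e.compareTo(null);
        } catch (NullPointerException expected) {
            System.out.println("compareTo(null) threw NPE");
        }
    }
}
```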
In "relatively recent" Java versions (from Java 6 onward), the NullPointerException is expected to be thrown on the first add() invocation:
tree.add(null);
as the TreeSet.add() javadoc states:
Throws NullPointerException - if the specified element is null and this set uses natural ordering, or its comparator does not permit null elements.
Note that it has been specified this way since JDK 6; JDK 5, for example, does not make this point explicit.
If you use an old Java version (such as Java 5), please specify it.
Firstly, don't use raw types; instead, utilise the power of generics:
TreeSet<String> tree = new TreeSet<>();
As for your issue:
TreeSet no longer supports the addition of null.
From the doc:
public boolean add(E e)
Throws NullPointerException if the specified element is null and this set
uses natural ordering, or its comparator does not permit null
elements.
Solutions to overcome this issue:
Provide a null-safe comparator where the null elements will come first:
TreeSet<String> tree = new TreeSet<>(Comparator.nullsFirst(Comparator.naturalOrder()));
or provide a null-safe comparator where the null elements will come last:
TreeSet<String> tree = new TreeSet<>(Comparator.nullsLast(Comparator.naturalOrder()));
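To show that a null-safe comparator really does make the set accept null, here is a minimal sketch using the nullsFirst variant (element values are arbitrary):

```java
import java.util.Comparator;
import java.util.TreeSet;

public class NullSafeTreeSetDemo {
    public static void main(String[] args) {
        TreeSet<String> tree =
            new TreeSet<>(Comparator.nullsFirst(Comparator.naturalOrder()));

        tree.add("Ranga");
        tree.add(null);            // accepted: the comparator handles null
        tree.add("Anil");

        // null sorts before every non-null element.
        System.out.println(tree);  // [null, Anil, Ranga]
    }
}
```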
This question already has answers here:
What basic operations on a Map are permitted while iterating over it?
For example see the following code snippet:
Map<String, String> unsafemap = new HashMap<>();
unsafemap.put("hello", null);
unsafemap.put(null, null);
unsafemap.put("world", "hello");
unsafemap.put("foo", "hello");
unsafemap.put("bar", "hello");
unsafemap.put("john", "hello");
unsafemap.put("doe", "hello");

System.out.println("changing null values");
for (Iterator<Map.Entry<String, String>> i = unsafemap.entrySet().iterator(); i.hasNext();) {
    Map.Entry<String, String> e = i.next();
    System.out.println("key : " + e.getKey() + " value : " + e.getValue());
    if (e.getValue() == null) {
        // why is the below line not throwing ConcurrentModificationException?
        unsafemap.put(e.getKey(), "no data");
        // same result, no ConcurrentModificationException thrown
        e.setValue("no data");
    }
    // throws ConcurrentModificationException
    unsafemap.put("testKey", "testData");
}
System.out.println("---------------------------------");
for (Map.Entry<String, String> e : unsafemap.entrySet()) {
    System.out.println(e);
}
Modifying the map during iteration always results in an exception if it is not done through the iterator (e.g. iterator.remove()). So adding a new entry during iteration obviously throws the exception as expected, but why is it not thrown when only the value of an existing key/value pair is modified?
In your first case, the Entry object already exists, so the value is just modified in place via e.value = value; and the method returns; no new Entry is created and no structural modification occurs. So, no exception here.
In the second case, changes made through the entry's setValue() likewise don't structurally modify the map, so no exception there either.
From HashMap source code:
public V put(K key, V value) {
    if (key == null)
        return putForNullKey(value);
    int hash = hash(key.hashCode());
    int i = indexFor(hash, table.length);
    for (Entry<K,V> e = table[i]; e != null; e = e.next) {
        Object k;
        if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
            V oldValue = e.value;
            e.value = value;
            e.recordAccess(this);
            return oldValue;
        }
    }
    modCount++;
    addEntry(hash, key, value, i);
    return null;
}
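The distinction above, replacing an existing key's value skips the modCount increment while adding a new key bumps it and trips the fail-fast iterator, can be demonstrated directly; a minimal sketch (map contents chosen arbitrarily):

```java
import java.util.HashMap;
import java.util.Map;

public class IterationModificationDemo {
    public static void main(String[] args) {
        Map<String, String> map = new HashMap<>();
        map.put("hello", null);
        map.put("world", "hello");

        // Replacing the value of an existing key is not a structural
        // modification: no new Entry is created, modCount is unchanged.
        for (Map.Entry<String, String> e : map.entrySet()) {
            if (e.getValue() == null) {
                e.setValue("no data");   // preferred: goes through the Entry
            }
        }
        System.out.println(map.get("hello"));  // no data

        // Adding a NEW key during iteration is structural: modCount
        // changes and the iterator's next() fails fast.
        try {
            for (Map.Entry<String, String> e : map.entrySet()) {
                map.put("newKey", "boom");   // increments modCount
            }
        } catch (java.util.ConcurrentModificationException expected) {
            System.out.println("ConcurrentModificationException thrown");
        }
    }
}
```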
public V put(K key, V value) {
    Entry<K,V> t = root;
    if (t == null) {
        compare(key, key); // type (and possibly null) check
        root = new Entry<>(key, value, null);
        size = 1;
        modCount++;
        return null;
    }
    int cmp;
    ...
}

final int compare(Object k1, Object k2) {
    return comparator==null ? ((Comparable<? super K>)k1).compareTo((K)k2)
        : comparator.compare((K)k1, (K)k2);
}
After facing a bug in my application, I had to debug TreeMap's put method. My issue was in comparing objects that were put into the map. What is odd is that when I put the FIRST element into the map, its key gets compared with itself. I can't understand why it would work like that. Any insights (besides the commented "type (and possibly null) check")? Why wouldn't they just check whether the key was null? What kind of "type" check is made there, and what for?
As mentioned in the comment, https://bugs.openjdk.java.net/browse/JDK-5045147 is the issue where this was introduced. From the discussion in that issue, the original fix was the following:
BT2:SUGGESTED FIX
Doug Lea writes:
"Thanks! I have a strong sense of deja vu that I've
added this before(!) but Treemap.put should have the
following trap added."
  public V put(K key, V value) {
      Entry<K,V> t = root;
      if (t == null) {
+         if (key == null) {
+             if (comparator == null)
+                 throw new NullPointerException();
+             comparator.compare(key, key);
+         }
          incrementSize();
          root = new Entry<K,V>(key, value, null);
          return null;
      }
The intention seems to be to throw an NPE if the comparator of the TreeMap is null, or if the comparator does not accept null keys (which conforms to the API specification). It seems the fix was shortened to one line:
compare(key, key);
which is defined as:
@SuppressWarnings("unchecked")
final int compare(Object k1, Object k2) {
    return comparator==null ? ((Comparable<? super K>)k1).compareTo((K)k2)
        : comparator.compare((K)k1, (K)k2);
}
Hence this test will do both the null check and the type check, namely the cast to Comparable.
I believe this is the place where TreeMap< K,V > checks if K implements Comparable if no Comparator is supplied. You get a ClassCastException otherwise.
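Both effects of the compare(key, key) trap can be observed from user code; a minimal sketch (the class and names here are my own, not from the JDK):

```java
import java.util.TreeMap;

public class TreeMapFirstPutCheckDemo {
    // Deliberately does NOT implement Comparable.
    static class NotComparable { }

    public static void main(String[] args) {
        try {
            // With no Comparator, compare(key, key) on the very first put
            // casts the key to Comparable, so the failure is immediate...
            new TreeMap<NotComparable, String>().put(new NotComparable(), "v");
        } catch (ClassCastException expected) {
            System.out.println("ClassCastException on first put");
        }

        try {
            // ...and the same call rejects a null first key with an NPE.
            new TreeMap<String, String>().put(null, "v");
        } catch (NullPointerException expected) {
            System.out.println("NullPointerException on null key");
        }
    }
}
```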
I am looking for some better insight into the hashtable/hash-map data structure.
Going through the API, I gathered that the inner Entry class is referred to as a bucket. Please correct me if I am wrong.
Please look at the following method:
public synchronized V put(K key, V value) {
    // Make sure the value is not null
    if (value == null) {
        throw new NullPointerException();
    }

    // Makes sure the key is not already in the hashtable.
    Entry tab[] = table;
    int hash = hash(key);
    int index = (hash & 0x7FFFFFFF) % tab.length;
    for (Entry<K,V> e = tab[index] ; e != null ; e = e.next) {
        if ((e.hash == hash) && e.key.equals(key)) {
            V old = e.value;
            e.value = value;
            return old;
        }
    }

    modCount++;
    if (count >= threshold) {
        // Rehash the table if the threshold is exceeded
        rehash();

        tab = table;
        hash = hash(key);
        index = (hash & 0x7FFFFFFF) % tab.length;
    }

    // Creates the new entry.
    Entry<K,V> e = tab[index];    // <------- are we assigning null to this entry?
    tab[index] = new Entry<>(hash, key, value, e);
    count++;
    return null;
}
From the following line of code:
Entry<K,V> e = tab[index];
I assume that we are assigning null to this new entry object; please correct me here as well.
So my other question is: why are we not doing
Entry<K,V> e = null;
directly, instead of
Entry<K,V> e = tab[index];
(A debugger screenshot was attached here.)
Please share your valuable insights on this.
Entry<K,V> is an instance that can represent a link in a linked list. Note that the next member refers to the next Entry on the list.
A bucket contains a linked list of entries that were mapped to the same index.
Entry<K,V> e = tab[index] will return null only if there's no Entry stored in that index yet. Otherwise it will return the first Entry in the linked list of that bucket.
tab[index] = new Entry<>(hash, key, value, e); creates a new Entry and stores it as the first Entry in the bucket. The previous first Entry is passed to the Entry constructor, in order to become the next (second) Entry in the list.
I know that a linked list can be used to handle chained collisions in a hash map. However, the Java HashMap implementation uses an array, and I am curious how Java implements chained collision resolution. I did find this post: Collision resolution in Java HashMap. However, it is not the answer I am looking for.
Thanks a lot.
HashMap contains an array of Entry objects, and each array slot (bucket) holds a linked list of entries. A key's hashCode determines which bucket it falls into, so if there is a collision, the new entry is simply linked into the list of that same bucket (at the head of the list in the pre-Java-8 implementation below).
Look at this code :
public V put(K key, V value) {
    if (key == null)
        return putForNullKey(value);
    int hash = hash(key.hashCode());
    int i = indexFor(hash, table.length);                   // get table / bucket index
    for (Entry<K,V> e = table[i]; e != null; e = e.next) {  // walk through the list of nodes
        Object k;
        if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
            V oldValue = e.value;
            e.value = value;
            e.recordAccess(this);
            return oldValue;                                // return old value if found
        }
    }
    modCount++;
    addEntry(hash, key, value, i);                          // add new entry if not found
    return null;
}
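The chaining behavior can be observed from user code without reading the source; a minimal sketch using a key class of my own whose constant hashCode forces every instance into the same bucket:

```java
import java.util.HashMap;
import java.util.Map;

public class CollisionDemo {
    // A key whose hashCode is constant: every instance lands in the
    // same bucket, forcing the map to chain entries.
    static class CollidingKey {
        final String name;
        CollidingKey(String name) { this.name = name; }

        @Override public int hashCode() { return 42; }

        @Override public boolean equals(Object o) {
            return o instanceof CollidingKey
                && ((CollidingKey) o).name.equals(name);
        }
    }

    public static void main(String[] args) {
        Map<CollidingKey, Integer> map = new HashMap<>();
        map.put(new CollidingKey("a"), 1);
        map.put(new CollidingKey("b"), 2);  // same bucket: chained
        map.put(new CollidingKey("a"), 3);  // same bucket, equals() match: replaced

        // All three puts hit one bucket, yet lookups still work because
        // the chain is walked comparing hash and equals().
        System.out.println(map.size());                      // 2
        System.out.println(map.get(new CollidingKey("a")));  // 3
        System.out.println(map.get(new CollidingKey("b")));  // 2
    }
}
```

The strings "Aa" and "BB" are a well-known real-world collision pair (both hash to 2112), so the same effect occurs with ordinary String keys.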