Why does readValueUnderLock(e) exist in ConcurrentHashMap's get method? - java

When I read the source code of ConcurrentHashMap in JDK 1.6, I found that readValueUnderLock(e) can't be reached, because the put method has already checked the value: if the value is null, it throws NullPointerException. So I think something may be wrong, but I'm not sure what it is. I'd be grateful if someone could explain!
Some of the source code:
V get(Object key, int hash) {
    if (count != 0) { // read-volatile
        HashEntry<K,V> e = getFirst(hash);
        while (e != null) {
            if (e.hash == hash && key.equals(e.key)) {
                V v = e.value;
                if (v != null)
                    return v;
                return readValueUnderLock(e); // recheck
            }
            e = e.next;
        }
    }
    return null;
}

V readValueUnderLock(HashEntry<K,V> e) {
    lock();
    try {
        return e.value;
    } finally {
        unlock();
    }
}

public V put(K key, V value) {
    if (value == null)
        throw new NullPointerException();
    int hash = hash(key.hashCode());
    return segmentFor(hash).put(key, hash, value, false);
}

V is just a snapshot of the entry's value field. The entry may not be fully constructed yet (consider the double-checked-locking issue under the previous Java Memory Model), so that snapshot could be null. While this is an extreme edge case, the JRE has to make sure it works, so there is your readValueUnderLock.
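The defensive pattern described above can be sketched as follows. This is a minimal, hypothetical stand-in, not the JDK source: Entry plays the role of JDK 1.6's HashEntry, and the lock counter exists only to make the control flow observable.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

// Minimal sketch of the JDK 1.6 pattern: read the volatile value first,
// and only fall back to the lock if the value appears null (which, under
// the old memory model, could mean the entry was not fully published yet).
class NullRecheckSketch {
    static class Entry {
        final String key;
        volatile String value;   // analogous to HashEntry.value
        Entry(String key, String value) { this.key = key; this.value = value; }
    }

    final ReentrantLock lock = new ReentrantLock();
    final AtomicInteger lockAcquisitions = new AtomicInteger();

    String get(Entry e) {
        String v = e.value;           // lock-free fast path
        if (v != null)
            return v;
        return readValueUnderLock(e); // rare slow path: recheck under the lock
    }

    String readValueUnderLock(Entry e) {
        lock.lock();
        lockAcquisitions.incrementAndGet();
        try {
            return e.value;           // re-read after acquiring the lock
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) {
        NullRecheckSketch s = new NullRecheckSketch();
        System.out.println(s.get(new Entry("k", "x"))); // returns "x" without locking
    }
}
```

In this single-threaded sketch a non-null value never touches the lock; only the (normally impossible) null case takes the locked path.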
PS: It's better to keep up with the times. Java is evolving and Java 9 is coming in a few months. There have been some tremendous changes in its codebase. Filling your head with obsolete knowledge may not be a good idea.

Related

How does ConcurrentHashMap.get() prevent dirty read?

I am looking at the source code of ConcurrentHashMap and wondering how the get() method works without any monitor; here's the code:
public V get(Object key) {
    Node<K,V>[] tab; Node<K,V> e, p; int n, eh; K ek;
    int h = spread(key.hashCode());
    if ((tab = table) != null && (n = tab.length) > 0 &&
        (e = tabAt(tab, (n - 1) & h)) != null) {
        if ((eh = e.hash) == h) {
            if ((ek = e.key) == key || (ek != null && key.equals(ek))) // mark here for possible dirty read
                return e.val;
        }
        else if (eh < 0)
            return (p = e.find(h, key)) != null ? p.val : null;
        while ((e = e.next) != null) {
            if (e.hash == h &&
                ((ek = e.key) == key || (ek != null && key.equals(ek)))) // mark here for possible dirty read
                return e.val;
        }
    }
    return null;
}
The two lines I marked do the same thing: check whether the current Node<K,V>'s key equals the requested key, and if so return its corresponding value. But what if another thread cuts in before the return and remove()s this node from the data structure? Since the local variable e still holds a reference to the removed node, the GC will leave it alone and get() will still return the removed value, causing a dirty read.
Did I miss something?
It doesn't:
Retrieval operations (including get) generally do not block, so may overlap with update operations (including put and remove). Retrievals reflect the results of the most recently completed update operations holding upon their onset. (More formally, an update operation for a given key bears a happens-before relation with any (non-null) retrieval for that key reporting the updated value.)
This is generally not a problem, since get will never return a result that couldn't have happened if the get method acquired a lock, blocking the update operation in the other thread. You just get the result as if the get call happened before the update operation began.
So, if you don't mind whether the get happens before or after the update, you also shouldn't mind it happening during the update, because there is no observable difference between during and before. If you do want the get to appear to happen after the update, then you will need to signal from the updating thread that the update is complete; waiting to acquire a lock wouldn't achieve that anyway, because you might get the lock before the update happens (in which case you'd get the same result as if you didn't acquire the lock).
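The last point, signaling from the updating thread rather than relying on a lock, can be sketched with a CountDownLatch. The names here are illustrative, not from the question:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;

// Sketch: if a reader must observe a specific update, the writer should
// signal completion (here via CountDownLatch) rather than the reader
// relying on a lock, which it might acquire before the update happens.
class SignalAfterUpdate {
    static Integer readAfterSignal() throws InterruptedException {
        ConcurrentHashMap<String, Integer> map = new ConcurrentHashMap<>();
        CountDownLatch updated = new CountDownLatch(1);

        Thread writer = new Thread(() -> {
            map.put("answer", 42);
            updated.countDown();   // announce: update complete
        });
        writer.start();

        updated.await();           // happens-before: the put() is now visible
        Integer v = map.get("answer");
        writer.join();
        return v;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(readAfterSignal()); // prints 42
    }
}
```

The latch establishes a happens-before edge between the put and the get, which is exactly the guarantee a bare lock acquisition would not give you.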

A confusion about the source code for ConcurrentHashMap's putVal method

Here is part of the code for the putVal method:
final V putVal(K key, V value, boolean onlyIfAbsent) {
    if (key == null || value == null) throw new NullPointerException();
    int hash = spread(key.hashCode());
    int binCount = 0;
    for (Node<K,V>[] tab = table;;) {
        Node<K,V> f; int n, i, fh;
        if (tab == null || (n = tab.length) == 0)
            tab = initTable(); // lazy initialization
        // step1, tabAt(...) is CAS
        else if ((f = tabAt(tab, i = (n - 1) & hash)) == null) {
            // step2, casTabAt(...) is CAS
            if (casTabAt(tab, i, null,
                         new Node<K,V>(hash, key, value, null)))
                break; // no lock when adding to empty bin
        }
        ...
    return null;
}
Suppose there are two threads, A and B. When A executes step1 it gets true, but at the same time B also executes step1 and gets true as well, and then both A and B execute step2.
In that situation B's Node would replace A's Node; that is, A's data would be silently overwritten by B, which would be wrong.
I don't know whether this reasoning is right or wrong; can anyone help me sort it out?
Here's how casTabAt is implemented:
static final <K,V> boolean casTabAt(Node<K,V>[] tab, int i,
                                    Node<K,V> c, Node<K,V> v) {
    return U.compareAndSwapObject(tab, ((long)i << ASHIFT) + ABASE, c, v);
}
U is declared as private static final sun.misc.Unsafe U;. The methods of this class guarantee atomicity at a low level. And from this usage:
casTabAt(tab, i, null, new Node<K,V>(hash, key, value, null))
we can see that the third parameter of compareAndSwapObject is the expected value. Thanks to the atomicity guarantee, whichever of threads A and B executes compareAndSwapObject first will see null there and actually install its node; the thread that executes compareAndSwapObject next won't change the value, because the actual value is no longer null, while null was expected as the condition for making the change.
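The same only-the-first-CAS-wins behavior can be observed directly with java.util.concurrent.atomic; AtomicReference here is a public stand-in for the JDK-internal Unsafe call:

```java
import java.util.concurrent.atomic.AtomicReference;

// casTabAt compares the bin slot against an expected value (null for an
// empty bin) and installs the new node only on a match. AtomicReference's
// compareAndSet exposes the same compare-and-swap primitive publicly.
class CasWinnerDemo {
    static String race() {
        AtomicReference<String> bin = new AtomicReference<>(null);
        boolean aWins = bin.compareAndSet(null, "node-from-A"); // true: bin was empty
        boolean bWins = bin.compareAndSet(null, "node-from-B"); // false: bin is occupied now
        return aWins + " " + bWins + " " + bin.get();
    }

    public static void main(String[] args) {
        System.out.println(race()); // prints: true false node-from-A
    }
}
```

Note that in putVal the losing thread does not lose its data: the enclosing for (;;) loop retries, now finds a non-null bin, and takes the locked non-empty-bin branch instead.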

Throw exception instead of return in Java method

I'm writing a Deque class in Java, following Algorithms, Part I on Coursera. Currently my array-based Deque has this removeLast() method:
public Item removeLast() {
    if (size() == array.length / 4) {
        resize(array.length / 2);
    }
    if (head != tail) {
        Item tmp = array[--head];
        array[head] = null;
        return tmp;
    }
    throw new NoSuchElementException("Stack underflow");
}
If head == tail, the Deque is empty and, per the homework specification, I throw an exception at the end of the method instead of a return statement. This code directly expresses the invariant (head != tail).
On the other hand, the method may be rewritten like this:
public Item removeLastRewritten() {
    if (size() == array.length / 4) {
        resize(array.length / 2);
    }
    if (head == tail) {
        throw new NoSuchElementException("Stack underflow");
    }
    Item tmp = array[--head];
    array[head] = null;
    return tmp;
}
In my opinion removeLast is written more clearly, for these reasons:
It adheres to the pessimistic scenario - "always fail, unless ..." - which is a more reliable approach, especially as the method grows and becomes more complicated.
It gives a clearer link between the invariant tail != head and the subsequent if {} block.
I have the following questions:
Which is a better approach?
Is it considered appropriate/good practice to write it like removeLast?
What is considered a best practice for Java? Is there any code style about it (I couldn't find any)?
There is no wrong answer. On GrepCode you can find every flavour you propose:
Inside an if with the != operator, throwing at the end of the method:
E java.util.PriorityQueue.next()
public E next() {
    if (... != ...)
        throw new ConcurrentModificationException();
    ...
    if (...) {
        return lastRetElt;
    }
    throw new NoSuchElementException();
}
Inside an if with the == operator:
E org.fluentlenium.core.domain.FluentList.first()
public E first() {
    if (this.size() == 0) {
        throw new NoSuchElementException("Element not found");
    }
    return this.get(0);
}
The reason it looks weird is that you omit the else block of your if, which would contain whatever code follows the if block in both your methods. You can get away with that here because the thrown exception disrupts the method's flow anyway.
I think it's better not to rely on that, and instead to be nice and use if-else blocks intuitively.
public Item removeLastRewrittenAgain() {
    if (size() == array.length / 4) {
        resize(array.length / 2);
    }
    if (head != tail) { // invert the if-else block if preferred; it makes no difference
        Item tmp = array[--head];
        array[head] = null;
        return tmp;
    } else {
        throw new NoSuchElementException("Stack underflow");
    }
}
Another reason I don't like to throw exceptions before the end of a method is that I strongly believe in, and thoroughly apply, the concept of a single point of exit: I don't leave the method somewhere in the middle, which I believe makes the code harder to read for someone unfamiliar with it.
One place to return your value (or throw your exception): the very bottom of your method.
Your code becomes more readable if you explicitly mention the else, regardless of your choice.
Then there is the dilemma of
if (head != tail) {
    Item tmp = array[--head];
    array[head] = null;
    return tmp;
} else {
    throw new NoSuchElementException("Stack underflow");
}
versus
if (head == tail) {
    throw new NoSuchElementException("Stack underflow");
} else {
    Item tmp = array[--head];
    array[head] = null;
    return tmp;
}
Here I strongly prefer the second one. When I'm reading the "complicated part" of an if statement, I want to know why I'm inside that if. In the first variant, the reason for the if only becomes apparent when you reach the thrown exception.
I guess you could also solve that by writing
boolean isEmpty = head == tail;
if (!isEmpty) {
    Item tmp = array[--head];
    array[head] = null;
    return tmp;
}
throw new NoSuchElementException("Stack underflow");
but I prefer the version that throws the exception as soon as you know something is wrong.
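Whichever style you pick, the two variants are behaviorally identical, which a small harness can confirm. SimpleDeque here is a stripped-down, hypothetical stand-in for the assignment's class, just enough to compare the two removeLast styles:

```java
import java.util.NoSuchElementException;

// Stripped-down stand-in for the assignment's array-backed deque,
// just enough to compare the two removeLast styles side by side.
class SimpleDeque {
    private final Object[] array = new Object[8];
    private int head = 0;           // next free slot; tail is fixed at 0 here
    private final int tail = 0;

    void addLast(Object item) { array[head++] = item; }

    Object removeLast() {           // style 1: throw at the end
        if (head != tail) {
            Object tmp = array[--head];
            array[head] = null;     // avoid loitering
            return tmp;
        }
        throw new NoSuchElementException("Stack underflow");
    }

    Object removeLastRewritten() {  // style 2: guard clause first
        if (head == tail) {
            throw new NoSuchElementException("Stack underflow");
        }
        Object tmp = array[--head];
        array[head] = null;
        return tmp;
    }

    public static void main(String[] args) {
        SimpleDeque d = new SimpleDeque();
        d.addLast("a");
        d.addLast("b");
        System.out.println(d.removeLast());          // b
        System.out.println(d.removeLastRewritten()); // a
    }
}
```

Both methods return the same elements and throw the same exception on an empty deque; the difference is purely one of readability.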

Java Dead Code, can someone explain?

This is part of a binary tree class. Here is the find function: given a key, it finds the matching node in the tree, returning null if it isn't found. However, the marked part has been flagged as dead code. When I move the if (current == null) check to the bottom of the while loop body, it works. Why? Isn't it the same?
public class Tree {
    public Node root;

    public Node find(int key) {
        Node current = root;
        while (current.key != key) {
            if (current == null) { // dead code here, why?
                return null;
            }
            if (key < current.key) {
                current = current.leftChild;
            } else if (key > current.key) {
                current = current.rightChild;
            }
        }
        return current;
    }
}
public class Node {
    public char label;
    public boolean visited = false;
    public int key;
    public float data;
    public Node leftChild;
    public Node rightChild;
}
If current is null, execution never reaches the null check: you access current.key first, which throws a NullPointerException.
If you move if (current == null) to the bottom, current has been assigned a new value just before the check (current.leftChild or current.rightChild might be null), so it is no longer dead code.
Because
while (current.key != key) // <-- current.key would throw NPE if current was null.
In the statement before, you dereference current.key. If current == null, you get an NPE there. If it is not null, the if check is meaningless, since its condition can never be true at that point.
What you probably intended to do was move the if check to before the loop instead:
public Node find(int key) {
    if (root == null) {
        return null;
    }
    Node current = root;
    while (current != null && current.key != key) { // guard against walking off a leaf
        if (key < current.key) {
            current = current.leftChild;
        } else if (key > current.key) {
            current = current.rightChild;
        }
    }
    return current; // null if the key was not found
}
This would give you the intended behavior that you want.
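Note that the search can also miss entirely (the key is absent from a non-empty tree), so the loop condition itself needs a null guard to cover both the empty-tree and not-found cases. A runnable sketch, using a small hypothetical three-node tree:

```java
// Quick check of a null-safe find on a tiny binary search tree.
class FindDemo {
    static class Node {
        int key;
        Node leftChild, rightChild;
        Node(int key) { this.key = key; }
    }

    static Node find(Node root, int key) {
        Node current = root;
        while (current != null && current.key != key) { // covers empty tree and misses
            current = key < current.key ? current.leftChild : current.rightChild;
        }
        return current; // null when the key is absent
    }

    public static void main(String[] args) {
        Node root = new Node(10);
        root.leftChild = new Node(5);
        root.rightChild = new Node(15);
        System.out.println(find(root, 15).key); // 15
        System.out.println(find(root, 7));      // null (absent key, no NPE)
        System.out.println(find(null, 1));      // null (empty tree)
    }
}
```

With the guard inside the loop condition, the separate dead-code check becomes unnecessary and absent keys return null instead of throwing.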
while (current.key != key) {
    if (current == null) { // dead code here, why?
        return null;
    }
In your while condition you already make sure that current is not null (by evaluating current.key != key), so there's no point rechecking it with if (current == null). If current were null, you would get a NullPointerException in the while condition itself and never even reach the if.
If current.key has not already thrown a NullPointerException through the attempt to access the key member, current cannot possibly be null at the beginning of the while loop. When the test is moved to the bottom of the loop, current has been assigned a new value which the compiler recognizes as potentially null.

ConcurrentHashMap Locking at every read?

I wanted to understand how locking works in Java's ConcurrentHashMap. According to the source code here, it looks like every read locks the reading thread using that particular segment's lock. Have I got it wrong?
V readValueUnderLock(HashEntry<K,V> e) {
    lock();
    try {
        return e.value;
    } finally {
        unlock();
    }
}
Not every read is locked. Below is the documentation of the readValueUnderLock method:
Reads value field of an entry under lock. Called if value field ever
appears to be null. This is possible only if a compiler happens to
reorder a HashEntry initialization with its table assignment, which is
legal under memory model but is not known to ever occur.
A read in a ConcurrentHashMap does not synchronize on the entire map. In fact, traversal does not synchronize at all except under one condition. The internal linked-list implementation is aware of changes to the underlying collection: if it detects any such change during traversal, it synchronizes on the bucket it is traversing and then re-reads the values. This ensures that while the values retrieved are always fresh, there is minimal locking, if any.
Below is the get implementation in this class; readValueUnderLock is called only when v is null:
V get(Object key, int hash) {
    if (count != 0) { // read-volatile
        HashEntry<K,V> e = getFirst(hash);
        while (e != null) {
            if (e.hash == hash && key.equals(e.key)) {
                V v = e.value;
                if (v != null)
                    return v;
                return readValueUnderLock(e); // recheck
            }
            e = e.next;
        }
    }
    return null;
}
