CAS and Non-Blocking Counter - Java

I have been reading JCIP by Brian Goetz. He explains the implementation of a non-blocking counter using a CAS instruction. I could not understand how the increment happens using the CAS instruction. Can anyone help me understand this?
public class CasCounter {
    private SimulatedCAS value;

    public int getValue() {
        return value.get();
    }

    public int increment() {
        int v;
        do {
            v = value.get();
        } while (v != value.compareAndSwap(v, v + 1));
        return v + 1;
    }
}

value.compareAndSwap(v, v + 1) is equivalent to the following, except that the entire block is executed atomically (see compare-and-swap for details):
int old = value.val;
if (old == v) {
    value.val = v + 1;
}
return old;
Now v = value.get() reads the current value of the counter. If no other thread updates the counter in the meantime, old == v is true inside compareAndSwap, so the value is set to v + 1 (i.e. it is incremented) and old is returned. Since v == old, the loop terminates.
Now suppose another thread incremented the counter just after our v = value.get(). Then old == v is false, so compareAndSwap leaves the value unchanged and returns old, which is the other thread's updated value. Since v != old, the loop goes around again with a fresh read.

The compareAndSwap() method will perform the following operations atomically:
- determine if `value` is equal to `v`
- if so, it will set `value` to `v+1`
- it returns whatever `value` was when the method was entered (whether or not `value` was updated)
The caller can check whether value was what they expected it to be when they called compareAndSwap(). If it was, then the caller knows it has been updated. If it wasn't what was expected, the caller knows it wasn't updated and tries again, using the 'new' current value of value as the expected value (that's what the loop is doing).
This way, the caller knows that the increment operation doesn't get lost when some other thread modifies value at the same moment.
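For comparison, here is a minimal sketch of the same retry loop written against the standard java.util.concurrent.atomic.AtomicInteger instead of the book's SimulatedCAS (the class name here is just illustrative):

import java.util.concurrent.atomic.AtomicInteger;

public class AtomicCasCounter {
    private final AtomicInteger value = new AtomicInteger(0);

    public int getValue() {
        return value.get();
    }

    public int increment() {
        int v;
        do {
            v = value.get();                      // read the current value
        } while (!value.compareAndSet(v, v + 1)); // retry if another thread changed it first
        return v + 1;                             // the value this thread installed
    }
}

Note that AtomicInteger.compareAndSet returns a boolean (true if the swap happened) rather than the old value, which is why the loop condition is negated compared to the SimulatedCAS version.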

Related

A confusion about the source code for ConcurrentHashMap's putVal method

Here is part of the code for the putVal method:
final V putVal(K key, V value, boolean onlyIfAbsent) {
    if (key == null || value == null) throw new NullPointerException();
    int hash = spread(key.hashCode());
    int binCount = 0;
    for (Node<K,V>[] tab = table;;) {
        Node<K,V> f; int n, i, fh;
        if (tab == null || (n = tab.length) == 0)
            tab = initTable(); // lazy initialization
        // step1, tabAt(...) is CAS
        else if ((f = tabAt(tab, i = (n - 1) & hash)) == null) {
            // step2, casTabAt(...) is CAS
            if (casTabAt(tab, i, null,
                         new Node<K,V>(hash, key, value, null)))
                break; // no lock when adding to empty bin
        }
        ...
    }
    return null;
}
Suppose there are currently two threads, A and B. When A executes step1 it gets true, but at the same time B also executes step1 and gets true as well, so both A and B go on to execute step2.
In that situation B's Node would replace A's Node, i.e. A's data would be overwritten by B, which would be wrong.
I don't know whether this reasoning is right or wrong; can anyone help me sort it out?
Here's how casTabAt is implemented:
static final <K,V> boolean casTabAt(Node<K,V>[] tab, int i,
                                    Node<K,V> c, Node<K,V> v) {
    return U.compareAndSwapObject(tab, ((long)i << ASHIFT) + ABASE, c, v);
}
U is declared as follows: private static final sun.misc.Unsafe U;. The methods of this class guarantee atomicity at a low level. And from this usage:
casTabAt(tab, i, null, new Node<K,V>(hash, key, value, null))
we can see that the third parameter of compareAndSwapObject is the expected value. Due to the atomicity guarantee, whichever of threads A and B executes compareAndSwapObject first will see null there and will actually install its Node; the thread that executes compareAndSwapObject next won't change the value, because the actual value is no longer null, whereas null was the expected value required for the update to be made.
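To make the "only the first CAS wins" point concrete, here is a small illustrative sketch (not the ConcurrentHashMap code itself) that races two threads on one slot of an AtomicReferenceArray; exactly one compareAndSet succeeds, and the loser observes a non-null slot, just as the losing thread in putVal does:

import java.util.concurrent.atomic.AtomicReferenceArray;

public class CasBinRace {
    public static void main(String[] args) throws InterruptedException {
        AtomicReferenceArray<String> table = new AtomicReferenceArray<>(16);
        int i = 3; // some bin index

        Runnable attempt = () -> {
            String mine = Thread.currentThread().getName();
            // expected value is null: only the first thread to get here can install its value
            boolean won = table.compareAndSet(i, null, mine);
            System.out.println(mine + (won ? " installed its node" : " lost the race and must retry"));
        };

        Thread a = new Thread(attempt, "A");
        Thread b = new Thread(attempt, "B");
        a.start(); b.start();
        a.join(); b.join();

        System.out.println("slot " + i + " holds: " + table.get(i)); // whichever thread won
    }
}

Note that in putVal the losing thread's data is not lost either: its casTabAt returns false, so it does not break out of the for loop; on the next iteration it sees a non-empty bin and takes the synchronized path to add its node there.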

Finding the maximum value of a linked list recursively

I need to write a Java method called findMax within a class called Node, which has two instance variables: int value and Node next. The method takes no parameters, and must return the greatest value of a linked list. Within the context of the program, the method will always be called by the first Node of a linked list (except for the recursive calls). I was struggling to complete the method when I accidentally found a working solution:
public int findMax(){
    int max = value;
    if(next == null){
        return max;
    }
    else{
        if(max <= next.findMax()){
            max = next.value;
        }
        else return max;
    }
    return next.findMax();
}
This method properly returned the largest value of each linked list I tested it for. However, since I found this solution by trying random arrangements of code, I don't really feel like I understand what's going on here. Can anyone explain to me how/why this works? Also, if there is a more efficient solution, how would it be implemented?
You can imagine a linked list looking something like this:
val1 -> val2 -> val3 -> null
Recursion works on the principle that eventually, the input you pass into the function can be handled without recursing further. In your case, node.findMax() can be handled if the next pointer is null. That is, the max of a linked list of size 1 is simply the value (base case of the recursion), the max of any other linked list is the max of the value of that node or the max of the remaining elements.
i.e. for the node n3 with value val3, n3.findMax() simply returns val3.
For any other node n, n.findMax() returns the maximum of the node's value and n.next.findMax().
The way this looks in the example at the start is:
n1.findMax()
= Max(n1.value, n2.findMax())
= Max(val1, Max(n2.value, n3.findMax()))
= Max(val1, Max(val2, n3.value)) // Since n3.next == null
= Max(val1, Max(val2, val3))
which is simply the maximum over the whole list
Edit: Based on the discussion above, although what you said might work, there is a simpler way of writing the program:
int findMax() {
    if (this.next == null) {
        return this.value;
    } else {
        return Math.max(this.value, this.next.findMax());
    }
}
Edit 2: A breakdown of why your code works (and why it's bad):
public int findMax(){
    // This variable doesn't serve much purpose
    int max = value;
    if(next == null){
        return max;
    }
    else{
        // This if condition simply prevents us from following
        // the else block below, but the stuff inside does nothing.
        if(max <= next.findMax()){
            // max is never used again if you are here.
            max = next.value;
        }
        else return max;
    }
    // We now compute findMax() again, leading to serious inefficiency
    return next.findMax();
}
Why is this inefficient? Because each call to findMax() on a node can make two subsequent calls to findMax() on the next node, and each of those calls can generate two more, so the number of calls grows exponentially with the length of the list.
The way to fix this up is by storing the result of next.findMax() like so:
public int findMax() {
    if (next == null) {
        return value;
    }
    else {
        int maxOfRest = next.findMax();
        if (value <= maxOfRest) {
            return maxOfRest;
        }
        else return value;
    }
}
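If you want to try it out, here is a minimal, hypothetical harness; the Node constructor and the sample values are my assumptions, not part of the original exercise:

public class Node {
    int value;
    Node next;

    Node(int value, Node next) {
        this.value = value;
        this.next = next;
    }

    public int findMax() {
        if (next == null) {
            return value;
        }
        return Math.max(value, next.findMax());
    }

    public static void main(String[] args) {
        Node list = new Node(3, new Node(7, new Node(5, null))); // 3 -> 7 -> 5
        System.out.println(list.findMax()); // prints 7
    }
}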

Atomic compareAndSet but with callback?

I know that AtomicReference has compareAndSet, but I feel like what I want to do is this
private final AtomicReference<Boolean> initialized = new AtomicReference<>( false );
...
atomicRef.compareSetAndDo( false, true, () -> {
    // stuff that only happens if false
});
This would probably work too, and might be better:
atomicRef.compareAndSet( false, () -> {
    // stuff that only happens if false
    // if I die, still false.
    return true;
});
I've noticed there are some new functional constructs, but I'm not sure whether any of them are what I'm looking for.
Can any of the new constructs do this? If so, please provide an example.
Update
To simplify my problem: I'm trying to find a less error-prone way to guard code in a "do once for object" or (really) lazy-initializer fashion, and I know that some developers on my team find compareAndSet confusing.
guard code in a "do once for object"
How exactly to implement that depends on what you want other threads that attempt to execute the same thing to do in the meantime. If you just let them run past the CAS, they may observe things in an intermediate state while the one thread that succeeded is still doing its action.
or (really) lazy initializer fashion
That construct is not thread-safe if you're using it for lazy initializers, because the "is initialized" boolean may be set to true by one thread that is still executing the block, while another thread observes the true state but reads a result that hasn't been published yet.
You can use AtomicReference::updateAndGet if multiple concurrent/repeated initialization attempts are acceptable, with one object winning in the end and the others being discarded by GC. The update method should be side-effect-free.
Otherwise you should just use the double checked locking pattern with a volatile reference field.
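For reference, a minimal sketch of what double checked locking with a volatile reference field could look like (the LazyHolder and Foo names are placeholders, not from the original answer):

public class LazyHolder {
    private volatile Foo instance; // volatile is required for safe publication

    public Foo get() {
        Foo result = instance;          // first, unsynchronized read
        if (result == null) {
            synchronized (this) {
                result = instance;      // re-check under the lock
                if (result == null) {
                    result = new Foo(); // runs at most once
                    instance = result;
                }
            }
        }
        return result;
    }
}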
Of course you can always package any of these into a higher order function that returns a Runnable or Supplier which you then assign to a final field.
// == FunctionalUtils.java
/** @param mayRunMultipleTimes must be side-effect-free */
public static <T> Supplier<T> instantiateOne(Supplier<T> mayRunMultipleTimes) {
    AtomicReference<T> ref = new AtomicReference<>(null);
    return () -> {
        T val = ref.get(); // fast-path if already initialized
        if (val != null)
            return val;
        return ref.updateAndGet(v -> v == null ? mayRunMultipleTimes.get() : v);
    };
}
// == ClassWithLazyField.java
private final Supplier<Foo> lazyInstanceVal = FunctionalUtils.instantiateOne(() -> new Foo());

public Foo getFoo() {
    return lazyInstanceVal.get();
}
You can easily encapsulate various custom control-flow and locking patterns this way. Here are two of my own..
compareAndSet returns true if the update was done, and false if the actual value was not equal to the expected value.
So just use
if (ref.compareAndSet(expectedValue, newValue)) {
    ...
}
That said, I don't really understand your examples, since you're passing true and false to a method taking object references as arguments. And your second example doesn't do the same thing as the first one. If the second is what you want, I think what you're after is:
ref.getAndUpdate(value -> {
    if (value.equals(expectedValue)) {
        return someNewValue(value);
    }
    else {
        return value;
    }
});
You’re over-complicating things. Just because there are now lambda expressions, you don’t need to solve everything with lambdas:
private volatile boolean initialized;
…
if(!initialized) synchronized(this) {
    if(!initialized) {
        // stuff to be done exactly once
        initialized = true;
    }
}
The double checked locking might not have a good reputation, but for non-static properties, there are few alternatives.
If you consider multiple threads accessing it concurrently in the uninitialized state and want a guarantee that the action runs only once, and that it has completed before dependent code is executed, an Atomic… object won’t help you.
There’s only one thread that can successfully perform compareAndSet(false, true), but since failure implies that the flag already has the new value, i.e. is initialized, all other threads will proceed as if the “stuff to be done exactly once” has been done while it might still be running. The alternative would be reading the flag first and conditionally performing the stuff and the compareAndSet afterwards, but that allows multiple concurrent executions of “stuff”. This is also what happens with updateAndGet or accumulateAndGet and its provided function.
To guarantee exactly one execution before proceeding, threads must get blocked if the “stuff” is currently being executed. The code above does this. Note that once the “stuff” has been done, there will be no locking anymore, and the performance characteristics of the volatile read are the same as for the Atomic… read.
The only solution which is simpler in programming is to use a ConcurrentMap:
private final ConcurrentHashMap<String,Boolean> initialized = new ConcurrentHashMap<>();
…
initialized.computeIfAbsent("dummy", ignore -> {
    // stuff to do exactly once
    return true;
});
It might look a bit oversized, but it provides exactly the required performance characteristics. It will guard the initial computation using synchronized (or well, an implementation dependent exclusion mechanism) but perform a single read with volatile semantics on subsequent queries.
If you want a more lightweight solution, you may stay with the double checked locking shown at the beginning of this answer…
I know this is old, but I've found there is no perfect way to achieve this, more specifically this:
trying to find a less error prone way to guard code in a "do (anything) once..."
I'll add to this "while respecting a happens-before relationship", which is required for instantiating singletons in your case.
IMO the best way to achieve this is by means of a synchronized function:
public <T> T transaction(Function<NonSyncObject, T> transaction) {
    synchronized (lock) {
        return transaction.apply(nonSyncObject);
    }
}
This allows you to perform atomic "transactions" on the given object.
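A usage sketch of that idea (the Holder state and the surrounding Transactor class are placeholders of mine, not part of the original answer):

import java.util.function.Function;

class Holder {
    int counter; // some non-thread-safe state being guarded
}

class Transactor {
    private final Object lock = new Object();
    private final Holder holder = new Holder();

    public <T> T transaction(Function<Holder, T> transaction) {
        synchronized (lock) {
            return transaction.apply(holder);
        }
    }
}

// elsewhere: every read-modify-write goes through the lock
// int next = transactor.transaction(h -> ++h.counter);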
Other options are double-checked spin-locks:
for (;;) {
    T t = atomicT.get();
    T newT = new T();
    if (atomicT.compareAndSet(t, newT)) return;
}
With this one, new T() will get executed repeatedly until the value is set successfully, so it is not really a "do something once".
It only works for copy-on-write transactions, and could help with "instantiating objects once" (which in reality instantiates many, but in the end everyone references the same one) if you tweak the code.
The final option is a worse-performing version of the first one, but this one is a true happens-before AND ONCE (as opposed to the double-checked spin-lock):
public void doSomething(Runnable r) {
    while (!atomicBoolean.compareAndSet(false, true)) {}
    // Do some heavy stuff ONCE
    r.run();
    atomicBoolean.set(false);
}
The reason why the first one is the better option is that it is doing what this one does, but in a more optimized way.
As a side note, in my projects I've actually used the code below (similar to @the8472's answer), which at the time I thought was safe, and it may be:
public T get() {
    T res = ref.get();
    if (res == null) {
        res = builder.get();
        if (ref.compareAndSet(null, res))
            return res;
        else
            return ref.get();
    } else {
        return res;
    }
}
The thing about this code is that, like the copy-on-write loop, it can generate multiple instances, one for each contending thread, but only one is cached: the first one. All the other constructions eventually get GC'd.
Looking at the putIfAbsent method, I see the benefit is skipping 17 lines of code before reaching a synchronized body:
/** Implementation for put and putIfAbsent */
final V putVal(K key, V value, boolean onlyIfAbsent) {
    if (key == null || value == null) throw new NullPointerException();
    int hash = spread(key.hashCode());
    int binCount = 0;
    for (Node<K,V>[] tab = table;;) {
        Node<K,V> f; int n, i, fh;
        if (tab == null || (n = tab.length) == 0)
            tab = initTable();
        else if ((f = tabAt(tab, i = (n - 1) & hash)) == null) {
            if (casTabAt(tab, i, null,
                         new Node<K,V>(hash, key, value, null)))
                break; // no lock when adding to empty bin
        }
        else if ((fh = f.hash) == MOVED)
            tab = helpTransfer(tab, f);
        else {
            V oldVal = null;
            synchronized (f) {
                if (tabAt(tab, i) == f) {
And then the synchronized body itself is another 34 lines:
synchronized (f) {
    if (tabAt(tab, i) == f) {
        if (fh >= 0) {
            binCount = 1;
            for (Node<K,V> e = f;; ++binCount) {
                K ek;
                if (e.hash == hash &&
                    ((ek = e.key) == key ||
                     (ek != null && key.equals(ek)))) {
                    oldVal = e.val;
                    if (!onlyIfAbsent)
                        e.val = value;
                    break;
                }
                Node<K,V> pred = e;
                if ((e = e.next) == null) {
                    pred.next = new Node<K,V>(hash, key,
                                              value, null);
                    break;
                }
            }
        }
        else if (f instanceof TreeBin) {
            Node<K,V> p;
            binCount = 2;
            if ((p = ((TreeBin<K,V>)f).putTreeVal(hash, key,
                                                  value)) != null) {
                oldVal = p.val;
                if (!onlyIfAbsent)
                    p.val = value;
            }
        }
    }
}
The pro of using a ConcurrentHashMap is that it will undoubtedly work.

Unable to understand why count is not getting incremented

As part of an exercise, I am writing recursive code to count the number of nodes in a queue. The part I have added/modified (in NodeQueue.java) is here:
public class NodeQueue implements Queue
{
    static protected int count;              // for RecNodeCount method only
    protected Node beingCountedNode = head;  // for RecNodeCount method only

    // other methods..

    public int RecNodeCount()
    {
        if(beingCountedNode == null)
            return count;
        else
        {
            count++;
            beingCountedNode = beingCountedNode.getNext();
            return RecNodeCount();
        }
    }
The entire code is as here:
Queue.java: http://pastebin.com/raw.php?i=Dpkd8ynk
Node.java: http://pastebin.com/raw.php?i=Zy0KbrtJ
NodeQueue.java: http://pastebin.com/raw.php?i=j6hieiLG
SimpleQueue.java: http://pastebin.com/raw.php?i=vaTy41z4
I am unable to understand why I am getting zero even after enqueueing a few nodes in the queue. The size variable returns the correct number. I am doing more or less the same with the count variable (I think!), i.e. incrementing the required variable.
Although I believe the method will work (if beingCountedNode is set properly before the call; see @peter.petrov's answer), it is weird to use instance variables as the parameters of a function. I think the recursive function should have the signature int Count( Node node ), returning the number of nodes after (and including) the given Node.
// returns the number of nodes in the list
public int Count(){ return CountHelper( head ); }

// helper recursive function
// returns the number of nodes in the list after and including "node".
// call with head of the list to get the count of all nodes.
private int CountHelper( Node node )
{
    if( node == null )
        return 0;
    else
        return 1 + CountHelper( node.getNext() );
}
Also note that in your current example you never reset count, so if I call RecNodeCount() twice in a row, your method will tell me the count is twice what it actually is. Edit: actually, I guess it wouldn't, since beingCountedNode would be null by then, but it is still weird to do it this way.
My guess is the following. When this line is executed
protected Node beingCountedNode = head;
your head is null, so beingCountedNode is set to null. Because of that, later in your method you never enter the else clause.
Just add a few System.out.println calls in RecNodeCount() and you'll see exactly what is happening in this method.
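For example, one possible placement of such a print (the method is the one from the question, with a single line added):

public int RecNodeCount()
{
    System.out.println("count = " + count + ", beingCountedNode = " + beingCountedNode);
    if(beingCountedNode == null)
        return count;
    else
    {
        count++;
        beingCountedNode = beingCountedNode.getNext();
        return RecNodeCount();
    }
}

If head was null when beingCountedNode was initialized, the very first print will show beingCountedNode = null, which explains why the result is zero.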
Maybe this is not a direct answer to your issue, but why do you even use recursion and a static variable here? It's really easy to count nodes with a simple while loop.
public int nodeCount(Node node) {
    int result = 0;
    while(node != null) {
        node = node.getNext();
        result++;
    }
    return result;
}

compareAndSet vs incrementAndGet

Doing a course on concurrent programming.
As an example we have
final class Counter {
    private AtomicInteger value;

    public long getValue() {
        return value.get();
    }

    public long increment() {
        int v;
        do {
            v = value.get();
        } while(!value.compareAndSet(v, v+1));
        return v+1;
    }
}
Why would you use compareAndSet in this case and not incrementAndGet?
Thanks
Here is the implementation of the AtomicInteger.incrementAndGet() method from the JDK version I have on my machine:
/**
 * Atomically increments by one the current value.
 *
 * @return the updated value
 */
public final int incrementAndGet() {
    for (;;) {
        int current = get();
        int next = current + 1;
        if (compareAndSet(current, next))
            return next;
    }
}
As you can see, the implementation is very similar to yours.
PS: Why do you compute v+1 twice?
From the Java docs:

compareAndSet:
Atomically sets the value to the given updated value if the current value == the expected value.
public final boolean compareAndSet(V expect, V update)

incrementAndGet:
Atomically increments by one the current value.
public final int incrementAndGet()
Since they basically do the same thing, I can't think of a single reason to use this handwritten implementation of increment.
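With incrementAndGet the counter collapses to something like this (a sketch only; the field initialization is my addition, since the snippet in the question doesn't show one):

import java.util.concurrent.atomic.AtomicInteger;

final class Counter {
    private final AtomicInteger value = new AtomicInteger(0); // initialization assumed

    public long getValue() {
        return value.get();
    }

    public long increment() {
        return value.incrementAndGet(); // atomically adds 1 and returns the new value
    }
}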
In your case, the Counter class implements the increment in its own way, and the JDK's AtomicInteger.incrementAndGet() also implements it in its own way, but both rely on the CAS method compareAndSet(expect, update). So the two implementations are essentially equivalent; the only minor difference is the shape of the loop.
By the way, about computing v+1 twice: in while(!value.compareAndSet(v, v+1)) the v+1 is the argument passed to the CAS, i.e. the value the counter should be set to, while the v+1 in return v+1 is the value returned to the caller.
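If computing v + 1 twice bothers you, the loop can be written so it is computed only once (a sketch of the same increment method, not code from the original question):

public long increment() {
    int v, next;
    do {
        v = value.get();
        next = v + 1; // computed once, used both as the CAS argument and the return value
    } while (!value.compareAndSet(v, next));
    return next;
}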
