In the following code, I am confused about what happens when 2 threads compete for the lock on map.get(k). When thread A wins and removes the mapping, would the second thread then get a synchronized(null)? Or would both threads see it as synchronized(v), because thread B already saw v even though the first thread later removed it?
synchronized (map.get(k)) {
    map.get(k).notify();
    map.remove(k);
}
The question is similar to another question, except that here the lock object is a value in a map.
UPDATE:
Comparing the discussion in this post with the one in the above link, is it true that
synchronized (v) {
    v.notify();
    v = null;
}
would cause the 2nd thread to do synchronized(null), but for synchronized(map.get(k)) the 2nd thread would still get synchronized(v)?
UPDATE:
To answer @Holger's question, the main difference between this post and the other one is:
final V v = new V();
synchronized (map.get(k)) {
    map.get(k).notify();
    map.remove(k);
}
The second thread won't "request" a lock on the expression map.get(k); both threads request a lock on the object that map.get(k) evaluates to, before the first one starts executing its block. So the code is roughly equivalent to:
Object val = map.get(k); // the lock expression is evaluated to a reference first
synchronized (val) {
    val.notify();
}
So, when the thread that obtained the lock finishes executing, the second thread will still have a reference to the object val, even if map.get(k) no longer returns it (or returns null).
EDIT: (following many useful comments)
It seems that the lock on map.get(k) is being acquired to ensure that the processing is done only once (map.remove(k) is called after processing). While it's true that 2 threads competing for the lock on val won't run into null.notify(), the safety of this code is not guaranteed: the second thread may call synchronized(map.get(k)) only after the first one has already exited the synchronized block and removed the entry, at which point map.get(k) returns null.
To ensure that k is processed atomically, a safer approach may be needed. One way to do this is to use a ConcurrentHashMap, as below:
map.computeIfPresent(k, (key, value) -> {
    // process the value here
    // key is k
    // value is the value to which k is mapped
    return null; // return null to remove the entry after processing
});
Please note that map in the preceding example is an instance of ConcurrentHashMap. This will ensure that the value is processed once (computeIfPresent runs atomically).
To quote the ConcurrentHashMap.computeIfPresent doc comments:
If the value for the specified key is present, attempts to compute a new mapping given the key and its current mapped value. The entire method invocation is performed atomically. Some attempted update operations on this map by other threads may be blocked while computation is in progress, so the computation should be short and simple, and must not attempt to update any other mappings of this map.
What would happen is that you would lock on the value currently in the hashmap entry for key k.
Problem #1 - if the map.get(k) call returns null, then you would get an NPE.
Problem #2 - since you are not locking on map:
you are likely to get race conditions with other threads; e.g. if some other thread does a map.put(k, v) with a different v to the one you are locking, and
the map.remove(k) may result in memory anomalies leading (potentially) to corruption of the map data structure.
It is not clear what you are actually trying to achieve by synchronizing on map.get(k) (rather than map). But whatever it is, this code is not thread-safe.
Re your update: Yes, that is true ... assuming that the other thread is synchronizing on the value of the same variable v. Note that you always synchronize on an object, so when you do synchronized(v), that means "take the current value of v and synchronize on that object".
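Here is a minimal, runnable sketch of that point (class and field names are made up): the monitor is chosen by evaluating the expression once, when the synchronized block is entered, so later changes to the variable or the map don't affect a thread that already captured the reference.
public class LockOnValueDemo {
    // Hypothetical field, standing in for the value stored under some map key.
    static Object v = new Object();

    public static void main(String[] args) {
        Object captured = v;        // the lock expression is evaluated to a reference...
        synchronized (captured) {   // ...and the monitor of *that object* is acquired
            v = null;               // clearing the field changes nothing for this block
            captured.notifyAll();   // still legal: we hold this object's monitor
        }
        // A thread that only now evaluates synchronized(v) would get a
        // NullPointerException, because v currently refers to null.
    }
}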
Here's the class:
@NotThreadSafe
public class MutableInteger {
    private int value;

    public int get() { return value; }
    public void set(int value) { this.value = value; }
}
Here's the post-condition that I have come up with: The value returned by get() is equal to the value set by set(), or 0.
It is easy to see that the above post-condition will not always hold true. Take two threads A and B for instance. Suppose A sets value to 5 and B sets it to 8 after that. Doing a get() in thread A would return 8. It should have returned 5. It is a simple race condition situation.
How can I make this class thread safe? In the book Java Concurrency in Practice, the authors have guarded value and both of the methods on the same object. I fail to understand how this helps at all with the race condition. First of all, set() is not a compound action. So, why do we need to synchronise it? And even after that, the race condition does not disappear. As soon as the lock is released once a thread exits from the set() method, another thread can acquire the lock and set a new value. Doing a get() in the initial thread will return the new value, breaching the post-condition.
(I understand that the author is guarding the get() for visibility reasons, but I am not sure how it eliminates the race condition.)
First of all, set() is not a compound action. So, why do we need to synchronise it?
You're not synchronising the set() in its own right, you're synchronising both the get() and set() methods against the same object (assuming you make both these methods synchronised.)
If you didn't do this, and the value variable isn't marked as volatile, then you can't guarantee that a thread will see the correct value because of per-thread caching. (Thread a could update it to 5, then thread b could still potentially see 8 even after thread a has updated it. That's what's meant by lack of thread safety in this context.)
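For illustration, here's a minimal sketch of that idea (the class name is just a placeholder): both methods take the same lock, this, so every get() sees the most recent completed set().
public class SynchronizedInteger {
    private int value;

    public synchronized int get() { return value; }           // same lock as set()
    public synchronized void set(int value) { this.value = value; }
}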
You're correct that all reference assignments are atomic, so there's no worry about a corrupt reference in this scenario.
And even after that, the race condition does not disappear. As soon as the lock is released once a thread exits from the set() method, another thread can acquire the lock and set a new value.
New threads setting new values (or new code setting new values) isn't an issue at all in terms of thread safety - that's by design and expected. The issue is whether the results are inconsistent, or specifically whether multiple threads are able to concurrently view the object in an inconsistent state.
This is a follow-up question to my original SO question.
Thanks to the answer to that question, it looks like, according to the ConcurrentMap.computeIfPresent javadoc:
The default implementation may retry these steps when multiple threads attempt updates including potentially calling the remapping function multiple times.
My question is:
Does ConcurrentHashMap.computeIfPresent call remappingFunction multiple times only when it is shared between multiple threads, or can it also be called multiple times when created and passed from a single thread?
And if it is the latter case, why would it be called multiple times instead of once?
The general contract of the interface method ConcurrentMap.computeIfPresent allows implementations to repeat evaluations in the case of contention and that’s exactly what happens when a ConcurrentMap inherits the default method, as it would be impossible to provide atomicity atop the general ConcurrentMap interface in a default method.
However, the implementation class ConcurrentHashMap overrides this method and provides a guarantee in its documentation:
If the value for the specified key is present, attempts to compute a new mapping given the key and its current mapped value. The entire method invocation is performed atomically. Some attempted update operations on this map by other threads may be blocked while computation is in progress, so the computation should be short and simple, and must not attempt to update any other mappings of this map.
emphasis mine
So, since your question asks about ConcurrentHashMap.computeIfPresent specifically, the answer is: its argument function will never get evaluated multiple times. This differs from, e.g., ConcurrentSkipListMap.computeIfPresent, where the function may get evaluated multiple times.
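As a rough illustration (hypothetical class and field names), the following sketch counts how often the remapping function runs; with ConcurrentHashMap the count matches the number of computeIfPresent calls exactly, because each invocation is atomic and never retried.
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class ComputeIfPresentDemo {
    public static void main(String[] args) throws InterruptedException {
        ConcurrentHashMap<String, Integer> map = new ConcurrentHashMap<>();
        map.put("k", 0);
        AtomicInteger calls = new AtomicInteger();

        Runnable task = () -> {
            for (int i = 0; i < 10_000; i++) {
                map.computeIfPresent("k", (key, value) -> {
                    calls.incrementAndGet();   // count function evaluations
                    return value + 1;
                });
            }
        };

        Thread t1 = new Thread(task);
        Thread t2 = new Thread(task);
        t1.start();
        t2.start();
        t1.join();
        t2.join();

        // Expected: value = 20000 and calls = 20000, i.e. one evaluation per call.
        System.out.println("value = " + map.get("k") + ", calls = " + calls.get());
    }
}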
Does ConcurrentMap.computeIfPresent call remappingFunction multiple times when it is shared between multiple threads or can be called multiple times when created and passed from a single thread?
The documentation does not specify, but the implication is that it is contention of multiple threads to modify the mapping of the same key (not necessarily all via computeIfPresent()) that might cause the remappingFunction to be run multiple times. I would anticipate that an implementation would check whether the value presented to the remapping function is still the one associated with the key before setting the remapping result as that key's new value. If not, it would try again, computing a new remapped value from the new current value.
You can see the code here:
@Override
default V computeIfPresent(K key,
        BiFunction<? super K, ? super V, ? extends V> remappingFunction) {
    Objects.requireNonNull(remappingFunction);
    V oldValue;
    while ((oldValue = get(key)) != null) {
        V newValue = remappingFunction.apply(key, oldValue);
        if (newValue != null) {
            if (replace(key, oldValue, newValue))
                return newValue;
        } else if (remove(key, oldValue))
            return null;
    }
    return oldValue;
}
If thread 1 comes in, calls the remappingFunction and gets the value X,
then thread 2 comes and changes the value while thread 1 is waiting, and only then does thread 1 call "replace"...
then the "replace" method will return false due to the value change.
So thread 1 will loop again and call the remappingFunction once more.
This can go on and on, creating "infinite" invocations of the remappingFunction.
I use the .get(...), .put(...) and .clear() operations from multiple threads on one HashMap. .put(...) and .clear() are inside a synchronized block but .get(...) is not. I can't imagine that this will cause problems but in other code I've seen .get() is pretty much always synchronized.
Relevant code for get/put:
Object value = map.get(key);
if (value == null) {
    synchronized (map) {
        value = map.get(key); // check again, might have been changed in between
        if (value == null) {
            map.put(key, new Value(...));
        }
    }
}
and clear is just:
synchronized (map) {
    map.clear();
}
The write operations will invalidate caches because of the synchronized and the get(...) returns either null or an instance. I can't really see what could go wrong or what would improve by putting the .get(...) operation into a synchronized(map) block.
Here is one simple scenario that would produce a problem on unsynchronized get:
Thread A starts get, computes the hash bucket number, and gets pre-empted
Thread B calls clear(), so a smaller array of buckets gets allocated
Thread A wakes up, and may run into the index-out-of-bounds exception
Here is a more complex scenario:
Thread A locks the map for an update, and gets pre-empted
Thread B initiates a get operation, computes the hash bucket number, and gets pre-empted
Thread A wakes up, and continues the put, and realizes that the buckets need resizing
Thread A allocates new buckets, copies old content into them, and adds the new item
Thread B wakes up, and continues the search using the old bucket index on the new array of buckets.
At this point, B is probably not going to find the right item, because it is very likely to be in a hash bucket at a different index. That is why get needs to be synchronized as well.
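A minimal sketch of what that conclusion means for the snippet in the question (same map, key, and placeholder Value constructor as above, so treat it as illustrative): the read has to happen under the same lock as the writes.
Object value;
synchronized (map) {
    value = map.get(key);          // read under the same lock as put/clear
    if (value == null) {
        value = new Value(...);    // same placeholder constructor as above
        map.put(key, value);
    }
}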
I know that in a program that works with multiple threads it's necessary to synchronize the methods because it's possible to have problems like race conditions.
But I cannot understand why we also need to synchronize the methods that just read a shared variable.
Look at this example:
import java.util.concurrent.locks.ReentrantLock;

public class ConcurrentIntegerArray {
    // Class wrapper and fields are inferred from the methods below.
    private final ReentrantLock lock = new ReentrantLock();
    private final int[] arr;

    public ConcurrentIntegerArray(final int size) {
        arr = new int[size];
    }

    public void set(final int index, final int value) {
        lock.lock();
        try {
            arr[index] = value;
        } finally {
            lock.unlock();
        }
    }

    public int get(final int index) {
        lock.lock();
        try {
            return arr[index];
        } finally {
            lock.unlock();
        }
    }
}
They take a lock in the get and also in the set method. For the set method I understand why. For example, suppose Thread1 wants to put the number 5 at index=3, and a few milliseconds later Thread2 has to put the number 6 at index=3. Can it happen that my array still holds a 5 at index=3 instead of a 6 (if I don't synchronize the set method)? This is because Thread1 can have a context switch, so Thread2 enters the same method and puts its value, and afterwards Thread1 assigns the value 5 to the same position. So instead of a 6 I have a 5.
But I don't understand why we also need (see the example) to synchronize the get method. I'm asking this question because we only need to read from memory, not write to it. So why do we need synchronization on the get method as well? Can someone give me a very simple example?
Both methods need to be synchronized. Without synchronization on the get method, this sequence is possible:
get is called, but the old value isn't returned yet.
Another thread calls set and updates the value.
The first thread that called get now examines the returned value and sees what is by now an outdated value.
Synchronization would disallow this scenario by guaranteeing that another thread can't just call set and invalidate the get value before it even returns. It would force a thread that calls set to wait for the thread that calls get to finish.
If you do not lock in the get method, then a thread might keep a local copy of the array and never refresh it from main memory. So it's possible that a get never sees a value that was updated by a set method. The lock forces visibility.
Each thread may maintain its own copy of the value. The synchronized ensures that coherency is maintained between different threads. Without synchronized, one can never be sure whether anyone has modified it. Alternatively, one can declare the variable as volatile and it will have the same memory effects as synchronized.
The locking action also guarantees memory visibility. From the Lock doc:
All Lock implementations must enforce the same memory synchronization semantics as provided by the built-in monitor lock, [...]:
A successful lock operation has the same memory synchronization effects as a successful Lock action.
A successful unlock operation has the same memory synchronization effects as a successful Unlock action.
Without acquiring the lock, due to memory consistency errors, there's no reason a call to get needs to see the most updated value. Modern processors are very fast, access to DRAM is comparatively very slow, so processors store values they are working on in a local cache. In concurrent programming this means one thread might write to a variable in memory but a subsequent read from a different thread gets a stale value because it is read from the cache.
The locking guarantees that the value is actually read from memory and not from the cache.
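To make this concrete, here is a small, hypothetical demo (behavior is JVM-dependent and not guaranteed): without any lock or volatile, the reader thread may keep spinning on a stale cached value and never observe the writer's update.
public class StaleReadDemo {
    static int value = 0;   // shared, but not guarded by any lock and not volatile

    public static void main(String[] args) throws InterruptedException {
        Thread reader = new Thread(() -> {
            while (value == 0) {
                // may spin forever on a cached 0, never re-reading main memory
            }
            System.out.println("reader saw " + value);
        });
        reader.start();
        Thread.sleep(100);
        value = 42;         // this write may never become visible to the reader
        reader.join();
    }
}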
I have the following code in thread 1:
synchronized (queues.get(currentQueue)) { // line 1
    queues.get(currentQueue).add(networkEvent); // line 2
}
and the following in thread 2:
synchronized (queues.get(currentQueue)) {
    if (queues.get(currentQueue).size() > 10) {
        currentQueue = 1;
    }
}
Now to my question: The currentQueue variable currently has the value 0. When thread 2 changes the value of currentQueue to 1 and thread 1 waits at line 1 (because of the synchronized), does thread 1 then use the updated currentQueue value in line 2 after thread 2 has finished? (That's what I want.)
The answer to the question is that it depends. I assume there is another chunk of code that increments the currentQueue variable. This being the case, the lock is not taken on the currentQueue variable, nor on the collection of queues, but rather on one of the 10 queues (or however many you have) in the queues collection.
Hence, if both threads happen to access the same queue (say queue 5), then the answer to your question is yes. However, there is only a one-in-ten chance of that happening (one in x, where x is the number of queues in the queues collection). Therefore, if the threads access different queues, the answer is no.
The correct answer to your question is: The result is undefined.
Your monitor object is queues.get(currentQueue), but since currentQueue is a variable, your monitor is variable too; which object is actually used as the monitor is more or less random. Effectively, this code would break eventually.
A simple way to fix it would be a function like this:
protected synchronized QueueType getCurrentQueue() {
    return queues.get(currentQueue);
}
However this is still a bad way of implementing the whole thing. You should either try to eliminate the synchronization completely through the use of a concurrent Queue (like ConcurrentLinkedQueue) or work with a lock/final monitor object.
final Object queueLock = new Object();
...
synchronized (queueLock) {
    queues.get(currentQueue).add(networkEvent);
}
Note that you will have to use that locking every time you access queues or currentQueue as both define the dataset you are using.
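For completeness, a rough sketch of the first suggestion (class and field names are made up, and the element type is assumed): a concurrent queue makes add() safe without a synchronized block, and declaring currentQueue volatile makes its updates visible, though check-then-act sequences on it would still need coordination.
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;

class EventDispatcher {
    private final List<ConcurrentLinkedQueue<Object>> queues =
            List.of(new ConcurrentLinkedQueue<>(), new ConcurrentLinkedQueue<>());
    private volatile int currentQueue = 0;           // updates visible to all threads

    void dispatch(Object networkEvent) {
        queues.get(currentQueue).add(networkEvent);  // thread-safe add, no lock needed
    }

    void maybeSwitch() {
        if (queues.get(currentQueue).size() > 10) {  // note: size() is O(n) and racy
            currentQueue = 1;
        }
    }
}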
Assuming no other thread changes the value of currentQueue, yes, Thread 1 will end up using the queue pointed to by the updated value of currentQueue, since you're invoking queues.get(currentQueue) once again in the body of the synchronized block. This however doesn't mean that your synchronization is sound. You actually should synchronize on currentQueue, since it seems to be the shared key used to access the current queue.
Also remember that when you use synchronized, you're synchronizing on the object the expression evaluates to at that moment, not on the variable itself. So if the variable is reassigned to a new object, different threads may end up locking different objects and your synchronization no longer makes sense.
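As a hypothetical sketch of that last point (assuming queues holds the event queues from the question, with the element type assumed to be Object): capture the result of queues.get(currentQueue) once, so it is obvious that the lock stays with that object even if currentQueue is reassigned.
Queue<Object> q = queues.get(currentQueue);  // evaluate the lock expression once
synchronized (q) {
    q.add(networkEvent);                     // lock and add refer to the same object,
                                             // regardless of later currentQueue changes
}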