Java 8: ConcurrentHashMap.compute atomicity

I am trying to use ConcurrentHashMap.compute to implement something, and I cannot figure out whether my logic is 100% thread-safe and correct.
The Javadoc says about the compute(K key, BiFunction<? super K, ? super V, ? extends V> remappingFunction) method:
Attempts to compute a mapping for the specified key and its current
mapped value (or null if there is no current mapping). The entire
method invocation is performed atomically. Some attempted update
operations on this map by other threads may be blocked while
computation is in progress, so the computation should be short and
simple, and must not attempt to update any other mappings of this Map.
The question is: can I pass a function with side effects to this method? Is there any guarantee that the function is applied only once, even if other threads are trying to update the same key?
Later edit: as requested, this is my update function (it tries to generate an event and remove the cached updates from the map):
final Set<UniqueID> updatedIds = new HashSet<>(updatesMap.keySet());
for (final UniqueID id : updatedIds) {
    updatesMap.compute(id, (modifiedId, updates) -> {
        final Event event = createEvent(modifiedId, updates);
        threadPool.addEvent(event);
        return null; // remove the cached updates once the event has been created
    });
}
I must be sure that only one event is created for each updated ID.

Related

Is it possible to synchronize a ConcurrentHashMap update by its key?

I have tons of update operations on a ConcurrentHashMap and I need to minimize the synchronization when updating this map.
Please see the code below.
public ConcurrentHashMap<String, Integer> dataMap = new ConcurrentHashMap<>();

public void updateData(String key, int data) {
    if (dataMap.containsKey(key)) {
        // do something with previous value and new value and update it
    }
}
For example, when one thread calls updateData with key "A", every other call to updateData with key "A" should be blocked until the first thread is done; meanwhile, I want another thread calling updateData with key "B" to run concurrently.
I wonder if there is any fancy way to lock the map simply by its key.
I think you are looking for the compute method.
It takes a function that is given the key and the current value (or null if there is no value yet), and can compute the new value.
ConcurrentHashMap guarantees that only one such function runs at a time for a given key. A concurrent second call will block until the first one is done.
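As a minimal sketch of that idea under the question's own names (dataMap, updateData); the merge step here just sums the old and new values, which is an assumption, not something the question specifies:

public ConcurrentHashMap<String, Integer> dataMap = new ConcurrentHashMap<>();

public void updateData(String key, int data) {
    // compute locks only the entry for this key; updates to other keys run concurrently
    dataMap.compute(key, (k, previous) -> {
        if (previous == null) {
            return data;            // no mapping yet, store the new value
        }
        return previous + data;     // combine previous and new value (illustrative merge)
    });
}

Calls for key "A" serialize on that entry, while a concurrent call for key "B" proceeds without waiting.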

Execution of `remappingFunction` in ConcurrentHashMap.computeIfPresent

This is a follow-up question to my original SO question.
Thanks to the answer to that question, it looks like, according to the ConcurrentMap.computeIfPresent Javadoc:
The default implementation may retry these steps when multiple threads
attempt updates including potentially calling the remapping function
multiple times.
My question is:
Does ConcurrentHashMap.computeIfPresent call the remappingFunction multiple times only when it is shared between multiple threads, or can it also be called multiple times when it is created and passed from a single thread?
And if it is the latter case, why would it be called multiple times instead of once?
The general contract of the interface method ConcurrentMap.computeIfPresent allows implementations to repeat evaluations in the case of contention and that’s exactly what happens when a ConcurrentMap inherits the default method, as it would be impossible to provide atomicity atop the general ConcurrentMap interface in a default method.
However, the implementation class ConcurrentHashMap overrides this method and provides a guarantee in its documentation:
If the value for the specified key is present, attempts to compute a new mapping given the key and its current mapped value. The entire method invocation is performed atomically. Some attempted update operations on this map by other threads may be blocked while computation is in progress, so the computation should be short and simple, and must not attempt to update any other mappings of this map.
emphasis mine
So, since your question asks about ConcurrentHashMap.computeIfPresent specifically, the answer is: its argument function will never get evaluated multiple times. This differs from, e.g., ConcurrentSkipListMap.computeIfPresent, where the function may get evaluated multiple times.
Does ConcurrentMap.computeIfPresent call the remappingFunction multiple times only when it is shared between multiple threads, or can it also be called multiple times when created and passed from a single thread?
The documentation does not specify, but the implication is that it is contention of multiple threads to modify the mapping of the same key (not necessarily all via computeIfPresent()) that might cause the remappingFunction to be run multiple times. I would anticipate that an implementation would check whether the value presented to the remapping function is still the one associated with the key before setting the remapping result as that key's new value. If not, it would try again, computing a new remapped value from the new current value.
You can see the default implementation here:
@Override
default V computeIfPresent(K key,
        BiFunction<? super K, ? super V, ? extends V> remappingFunction) {
    Objects.requireNonNull(remappingFunction);
    V oldValue;
    while ((oldValue = get(key)) != null) {
        V newValue = remappingFunction.apply(key, oldValue);
        if (newValue != null) {
            if (replace(key, oldValue, newValue))
                return newValue;
        } else if (remove(key, oldValue))
            return null;
    }
    return oldValue;
}
If thread 1 comes in, calls the remappingFunction and computes a new value,
then thread 2 changes the mapping for that key while thread 1 is still working, and only then does thread 1 call replace,
then replace will return false because the current value no longer matches the one thread 1 read.
So thread 1 will loop again and call the remappingFunction once more.
Under constant contention this can go on and on and create "infinite" invocations of the remappingFunction.
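A rough sketch of how one might observe this difference (the class name RetryDemo, the thread count, and the call count are made up for illustration; whether retries actually show up depends on timing and contention):

import java.util.concurrent.*;
import java.util.concurrent.atomic.AtomicInteger;

public class RetryDemo {
    public static void main(String[] args) throws Exception {
        // ConcurrentSkipListMap uses a CAS-retry loop; ConcurrentHashMap locks the bin instead.
        ConcurrentMap<String, Integer> map = new ConcurrentSkipListMap<>();
        map.put("k", 0);
        AtomicInteger invocations = new AtomicInteger();
        int calls = 100_000;
        ExecutorService pool = Executors.newFixedThreadPool(4);
        for (int i = 0; i < calls; i++) {
            pool.submit(() -> map.computeIfPresent("k", (k, v) -> {
                invocations.incrementAndGet();   // count how often the function really runs
                return v + 1;
            }));
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
        // With ConcurrentSkipListMap the invocation count can exceed the number of calls
        // (retries under contention); with ConcurrentHashMap it stays equal to the calls.
        System.out.println("calls=" + calls + ", invocations=" + invocations.get()
                + ", value=" + map.get("k"));
    }
}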

Concurrent byte array access in Java with as few locks as possible

I'm trying to reduce the memory usage for the lock objects of segmented data. See my questions here and here. Or just assume you have a byte array and every 16 bytes can be (de)serialized into an object. Let us call this a "row" with a row length of 16 bytes. Now if you modify such a row from a writer thread and read it from multiple threads, you need locking. And if you have a byte array of 1 MB (1024*1024 bytes), this means 65536 rows and the same number of locks.
This is a bit too much, especially since I need much larger byte arrays, and I would like to reduce the number of locks to something roughly proportional to the number of threads. My idea was to create a
ConcurrentHashMap<Integer, LockHelper> concurrentMap;
where Integer is the row index, and before a thread 'enters' a row it puts a lock object in this map (I got this idea from this answer). But no matter how I think it through, I cannot find an approach that is really thread-safe:
// somewhere else where we need to write or read the row
LockHelper lock1 = new LockHelper();
LockHelper lock = concurrentMap.putIfAbsent(rowIndex, lock1);
if (lock == null)
    lock = lock1;             // putIfAbsent returns null when our lock object was inserted
lock.addWaitingThread();      // is too late
synchronized (lock) {
    try {
        // read or write the row at rowIndex, e.g. writing like
        bytes[rowIndex * 16] = 1;
        bytes[rowIndex * 16 + 1] = 2;
        // ...
    } finally {
        if (lock.noThreadsWaiting())
            concurrentMap.remove(rowIndex);
    }
}
Do you see a possibility to make this thread-safe?
I have the feeling that this will look very similar to the concurrentMap.compute construct (e.g. see this answer), or could I even utilize that method?
map.compute(rowIndex, (key, value) -> {
    if (value == null)
        value = new Object();
    synchronized (value) {
        // access row
        return value;
    }
});
map.remove(rowIndex);
Are the value and the synchronized block necessary at all, given that we already know the compute operation is atomic?
// null is forbidden, so use the key also as the value to avoid creating additional objects
ConcurrentHashMap<Integer, Integer> map = ...;
// now the row access looks really simple:
map.compute(rowIndex, (key, value) -> {
    // access row
    return key;
});
map.remove(rowIndex);
BTW: since when has this compute been in Java? Since 1.8? I cannot find it in the JavaDocs.
Update: I found a very similar question here with userIds instead of rowIndices; note that the question contains an example with several problems, like a missing final, calling lock inside the try-finally clause, and no shrinking of the map. Also, there seems to be a library, JKeyLockManager, for this purpose, but I don't think it is thread-safe.
Update 2: The solution seems to be really simple, as Nicolas Filotto pointed out how to avoid the removal:
map.compute(rowIndex, (key, value) -> {
    // access row
    return null;
});
So this is indeed much less memory-intensive, BUT the simple segment locking with synchronized is at least 50% faster in my scenario.
Are the value and the synchronized block necessary at all, given that we already know the compute operation is atomic?
I confirm that there is no need to add a synchronized block in this case, as the compute method is performed atomically, as stated in the Javadoc of ConcurrentHashMap#compute(K key, BiFunction<? super K,? super V,? extends V> remappingFunction), which was added along with BiFunction in Java 8. I quote:
Attempts to compute a mapping for the specified key and its current
mapped value (or null if there is no current mapping). The entire
method invocation is performed atomically. Some attempted update
operations on this map by other threads may be blocked while
computation is in progress, so the computation should be short and
simple, and must not attempt to update any other mappings of this Map.
What you are trying to achieve with the compute method can be fully atomic if you make your BiFunction always return null, so that the key is removed atomically too and everything is done within the single atomic call.
map.compute(
    rowIndex,
    (key, value) -> {
        // access row here
        return null;
    }
);
This way you will then fully rely on the locking mechanism of a ConcurrentHashMap to synchronize your accesses to your rows.
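As a minimal sketch of how this pattern could be wrapped up (the class RowLocker and its withRow method are made-up names for illustration, not something from the answer):

import java.util.concurrent.ConcurrentHashMap;

class RowLocker {
    private final ConcurrentHashMap<Integer, Boolean> inProgress = new ConcurrentHashMap<>();

    // Runs the row access while holding the per-key lock that compute takes internally.
    // Returning null removes the temporary entry again, so the map stays small.
    void withRow(int rowIndex, Runnable rowAccess) {
        inProgress.compute(rowIndex, (key, ignored) -> {
            rowAccess.run();    // read or write the 16 bytes of this row here
            return null;        // remove the mapping atomically, no separate remove() needed
        });
    }
}

Usage would then be something like locker.withRow(42, () -> { /* access row 42 */ });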

ConcurrentHashMap: update an existing value thread-safely

I want to use a concurrent hash map to hold some results,
ConcurrentHashMap<Long, AtomicInteger>
and add a new entry if the key does not exist, or get the value by key and increment it, like this:
if (map.containsKey(key))
    map.get(key).addAndGet(1);
else
    map.put(key, new AtomicInteger(1));
The put operation is not thread-safe. How do I solve this problem? Should the put operation be inside a synchronized block?
The put() operation itself is implemented in a thread-safe way, i.e. if you put the same key it will be synchronized internally.
The calling sequence, however, isn't, i.e. two threads could add a new key simultaneously. You could try putIfAbsent() and, if you get a return value (i.e. not null), call the get method. Thus you could change your code like this:
// this only adds a new key-value pair if there's not already one for the key
if (map.putIfAbsent(key, new AtomicInteger(1)) != null) {
    map.get(key).addAndGet(1);
}
Alternatively, if you're using Java 8, you could use the compute() method, which according to the JavaDoc is performed atomically. The function you pass would then check whether the value already exists or not. Since the whole call is synchronized, you probably wouldn't even need an AtomicInteger (depending on what else you are doing with the value).
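A minimal sketch of that compute-based approach (using a plain Integer value as suggested; the map and key names are illustrative):

ConcurrentHashMap<Long, Integer> counts = new ConcurrentHashMap<>();
long key = 42L; // example key

// The whole remapping runs atomically per key, so no AtomicInteger is needed.
counts.compute(key, (k, current) -> current == null ? 1 : current + 1);

The same increment could also be written as counts.merge(key, 1, Integer::sum).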
In Java 8 you could use ConcurrentHashMap's computeIfAbsent to provide the initial value:
map.computeIfAbsent(key, k -> new AtomicInteger(0)).addAndGet(1);
You should use ConcurrentHashMap.putIfAbsent(K key, V value) and pay attention to the return value.

Using putIfAbsent like a short circuit operator

Is it possible to use putIfAbsent or any of its equivalents like a short-circuit operator?
myConcurrentMap.putIfAbsent(key,calculatedValue)
I want that if there is already a calculatedValue, it shouldn't be calculated again.
By default, putIfAbsent would still do the calculation every time, even though it will not actually store the value again.
Java doesn't allow any form of short-circuiting save the built-in cases, sadly - all method calls result in the arguments being fully evaluated before control passes to the method. Thus you couldn't do this with "normal" syntax; you'd need to manually wrap up the calculation inside a Callable or similar, and then explicitly invoke it.
In this case I find it difficult to see how it could work anyway, though. putIfAbsent works on the basis of being an atomic, non-blocking operation. If it were to do what you want, the sequence of events would roughly be:
Check if key exists in the map (this example assumes it doesn't)
Evaluate calculatedValue (probably expensive, given the context of the question)
Put result in map
It would be impossible for this to be non-blocking if the value didn't already exist at step two - two different threads calling this method at the same time could only perform correctly if blocking happened. At this point you may as well just use synchronized blocks with the flexibility of implementation that that entails; you can definitely implement what you're after with some simple locking, something like the following:
private final Map<K, V> map = ...;

public void myAdd(K key, Callable<V> valueComputation) throws Exception {
    synchronized (map) {
        if (!map.containsKey(key)) {
            map.put(key, valueComputation.call());
        }
    }
}
You can put Future<V> objects into the map. Using putIfAbsent, only one object will be there, and the computation of the final value will be performed by calling Future.get() (e.g. via the FutureTask + Callable classes). Check out Java Concurrency in Practice for a discussion of this technique. (Example code is also in this question here on SO.)
This way, your value is computed only once, and all threads get the same value. Access to the map isn't blocked, although access to the value (through Future.get()) will block until the value is computed by one of the threads.
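A rough sketch of that technique (this is the well-known memoizer idea from Java Concurrency in Practice; the class name and the simplified error handling here are illustrative, not the book's code verbatim):

import java.util.concurrent.*;

class Memoizer<K, V> {
    private final ConcurrentMap<K, Future<V>> cache = new ConcurrentHashMap<>();

    V compute(K key, Callable<V> computation) throws Exception {
        FutureTask<V> task = new FutureTask<>(computation);
        // Only the first caller's task ends up in the map; everyone else reuses it.
        Future<V> existing = cache.putIfAbsent(key, task);
        if (existing == null) {
            existing = task;
            task.run();               // we won the race, so we run the computation
        }
        return existing.get();        // other threads block here until the value is ready
    }
}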
You could consider using a Guava ComputingMap:
ConcurrentMap<Key, Value> myConcurrentMap = new MapMaker()
    .makeComputingMap(
        new Function<Key, Value>() {
            public Value apply(Key key) {
                Value calculatedValue = calculateValue(key);
                return calculatedValue;
            }
        });
