ConcurrentHashMap update exists value thread safe - java

I want to use a concurrent hash map to hold some results:
ConcurrentHashMap<Long,AtomicInteger>
I want to add a new entry if the key does not exist, or otherwise get the value by key and increment it, like this:
if(map.containsKey(key))
map.get(key).addAndGet(1);
else
map.put(key,new AtomicInteger(1));
The put operation is not thread safe. How can I solve this problem? Should the put operation be inside a synchronized block?

The put() operation itself is implemented in a threadsafe way, i.e. if you put the same key it will be synchronized internally.
The check-then-put sequence, however, isn't atomic, i.e. two threads could both find the key missing and each put a new value, so one increment gets lost. You could try putIfAbsent(): if it returns a value (i.e. not null), a mapping already existed and you can increment it. Thus you could change your code like this:
//this only adds a new key-value pair if there's not already one for the key
if( map.putIfAbsent(key,new AtomicInteger(1)) != null ) {
map.get(key).addAndGet(1);
}
Alternatively, if you're using Java 8 you could use the compute() method, which according to the JavaDoc is performed atomically. The function you pass would then check whether the value already exists or not. Since the whole call is atomic you probably wouldn't even need to use an AtomicInteger (depends on what else you are doing with the value).
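As a rough sketch of the compute() idea, assuming plain Integer values are enough for the counters: the class and method names below (KeyCounter, increment, incrementWithMerge) are made up for illustration and do not come from the question.

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper class, not part of the original question.
class KeyCounter {
    private final ConcurrentHashMap<Long, Integer> counts = new ConcurrentHashMap<>();

    // compute() runs the lambda atomically for this key;
    // 'old' is null when the key has no mapping yet.
    int increment(long key) {
        return counts.compute(key, (k, old) -> old == null ? 1 : old + 1);
    }

    // merge() expresses the same "insert 1 or add 1" logic more compactly.
    int incrementWithMerge(long key) {
        return counts.merge(key, 1, Integer::sum);
    }

    // Returns 0 for keys that were never incremented.
    int get(long key) {
        return counts.getOrDefault(key, 0);
    }
}
```

Both variants avoid the separate check and put, so there is no window in which two threads can both see the key as missing.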

In Java 8 you could use ConcurrentHashMap's computeIfAbsent to provide the initial value. Note that it takes a mapping function, not a ready-made value:
map.computeIfAbsent(key, k -> new AtomicInteger(0)).addAndGet(1);

You should use the ConcurrentHashMap.putIfAbsent(K key, V value) and pay attention to the return value.
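A minimal sketch of what "paying attention to the return value" could look like, assuming the ConcurrentHashMap<Long,AtomicInteger> from the question. The wrapper class AtomicCounters and its method names are hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical wrapper around the map from the question.
class AtomicCounters {
    private final ConcurrentHashMap<Long, AtomicInteger> map = new ConcurrentHashMap<>();

    int increment(Long key) {
        // putIfAbsent returns null if our new counter was stored,
        // otherwise the counter that was already mapped to the key.
        AtomicInteger existing = map.putIfAbsent(key, new AtomicInteger(1));
        if (existing == null) {
            return 1; // we inserted the first count
        }
        return existing.incrementAndGet(); // reuse the returned counter, no second lookup
    }

    int get(Long key) {
        AtomicInteger value = map.get(key);
        return value == null ? 0 : value.get();
    }
}
```

Using the returned counter directly also saves the extra map.get(key) call.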

Related

Is it possible to synchronize ConcurrentHashMap update by its key?

I have tons of update operations on a ConcurrentHashMap and I need to minimize synchronization when updating this map.
Please see the code below.
public ConcurrentHashMap<String, Integer> dataMap = new ConcurrentHashMap<>();
public void updateData(String key, int data) {
if ( dataMap.containsKey(key)) {
// do something with previous value and new value and update it
}
}
For example, when one thread calls updateData with key "A", every other attempt to call updateData with key "A" should block until the first thread is done. Meanwhile, I want another thread calling updateData with key "B" to run concurrently.
I wonder if there is any fancy way to lock the hash map simply by its key.
I think you are looking for the compute method.
It takes a function that is given the key and the current value (or null if there is no value yet), and can compute the new value.
ConcurrentHashMap guarantees that only one such function runs at a time for a given key. A concurrent second call for the same key will block until the first one has finished.
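A small sketch of updateData built on compute, under the assumption that "do something with previous value and new value" means adding them. That combination rule and the class name DataStore are placeholders, not part of the question.

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical holder class for the map from the question.
class DataStore {
    final ConcurrentHashMap<String, Integer> dataMap = new ConcurrentHashMap<>();

    void updateData(String key, int data) {
        // The lambda runs atomically for this key. 'previous' is null when
        // the key is absent. Adding the values is only an assumed example
        // of "combine previous and new value".
        dataMap.compute(key, (k, previous) -> previous == null ? data : previous + data);
    }
}
```

If the update should only happen for keys that already exist, as in the containsKey check of the question, computeIfPresent would be the closer match.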

Concurrent byte array access in Java with as few locks as possible

I'm trying to reduce the memory usage for the lock objects of segmented data. See my questions here and here. Or just assume you have a byte array and every 16 bytes can (de)serialize into an object. Let us call this a "row" with row length of 16 bytes. Now if you modify such a row from a writer thread and read from multiple threads you need locking. And if you have a byte array size of 1MB (1024*1024) this means 65536 rows and the same number of locks.
This is a bit too much, especially since I need much larger byte arrays, and I would like to reduce it to something roughly proportional to the number of threads. My idea was to create a
ConcurrentHashMap<Integer, LockHelper> concurrentMap;
where Integer is the row index and before a thread 'enters' a row it puts a lock object in this map (got this idea from this answer). But no matter what I think through I cannot find an approach that is really thread-safe:
// somewhere else where we need to write or read the row
LockHelper lock1 = new LockHelper();
LockHelper lock = concurrentMap.putIfAbsent(rowIndex, lock1);
lock.addWaitingThread(); // is too late
synchronized(lock) {
try {
// read or write row at rowIndex e.g. writing like
bytes[rowIndex/16] = 1;
bytes[rowIndex/16 + 1] = 2;
// ...
} finally {
if(lock.noThreadsWaiting())
concurrentMap.remove(rowIndex);
}
}
Do you see a possibility to make this thread-safe?
I have the feeling that this will look very similar to the concurrentMap.compute construct (e.g. see this answer), or could I even utilize this method?
map.compute(rowIndex, (key, value) -> {
if(value == null)
value = new Object();
synchronized (value) {
// access row
return value;
}
});
map.remove(rowIndex);
Are the value and the 'synchronized' necessary at all, as we already know the compute operation is atomic?
// null is forbidden so use the key also as the value to avoid creating additional objects
ConcurrentHashMap<Integer, Integer> map = ...;
// now the row access looks really simple:
map.compute(rowIndex, (key, value) -> {
// access row
return key;
});
map.remove(rowIndex);
BTW: Since when do we have this compute method in Java? Since 1.8? I cannot find it in the JavaDocs.
Update: I found a very similar question here with userIds instead of rowIndices. Note that the question contains an example with several problems, like a missing final, calling lock inside the try-finally clause, and not shrinking the map. There also seems to be a library, JKeyLockManager, for this purpose, but I don't think it is thread-safe.
Update 2: The solution seems to be really simple, as Nicolas Filotto pointed out how to avoid the removal:
map.compute(rowIndex, (key, value) -> {
// access row
return null;
});
So this really is less memory intensive, BUT the simple segment locking with synchronized is at least 50% faster in my scenario.
Are the value and the synchronized necessary at all, as we already
know the compute operation is atomic?
I confirm that a synchronized block is not needed in this case, because the compute method runs atomically, as stated in the Javadoc of ConcurrentHashMap#compute(K key, BiFunction<? super K,? super V,? extends V> remappingFunction), which was added together with BiFunction in Java 8. I quote:
Attempts to compute a mapping for the specified key and its current
mapped value (or null if there is no current mapping). The entire
method invocation is performed atomically. Some attempted update
operations on this map by other threads may be blocked while
computation is in progress, so the computation should be short and
simple, and must not attempt to update any other mappings of this Map.
What you are trying to achieve with the compute method can be made fully atomic if your BiFunction always returns null: the key is then removed as part of the same atomic call, so no separate remove is needed.
map.compute(
rowIndex,
(key, value) -> {
// access row here
return null;
}
);
This way you will then fully rely on the locking mechanism of a ConcurrentHashMap to synchronize your accesses to your rows.
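To make the idea concrete, here is a sketch under a few assumptions: rows are 16 bytes long as in the question, row N starts at byte offset N * 16, and the class RowStore with its methods is invented for this example. The map never actually stores an entry, it is only used for the per-key exclusion that compute provides.

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical row store; names and layout are assumptions for illustration.
class RowStore {
    static final int ROW_LENGTH = 16;
    private final byte[] bytes;
    // Value type is irrelevant because the function always returns null.
    private final ConcurrentHashMap<Integer, Boolean> rowLocks = new ConcurrentHashMap<>();

    RowStore(int rows) {
        bytes = new byte[rows * ROW_LENGTH];
    }

    void writeRow(int rowIndex, byte[] row) {
        rowLocks.compute(rowIndex, (key, value) -> {
            // access row here: copy the 16 bytes into the backing array
            System.arraycopy(row, 0, bytes, key * ROW_LENGTH, ROW_LENGTH);
            return null; // no mapping is stored, so nothing has to be removed
        });
    }

    byte[] readRow(int rowIndex) {
        byte[] out = new byte[ROW_LENGTH];
        rowLocks.compute(rowIndex, (key, value) -> {
            System.arraycopy(bytes, key * ROW_LENGTH, out, 0, ROW_LENGTH);
            return null;
        });
        return out;
    }
}
```

As the quoted Javadoc warns, updates to other keys may also be blocked while a computation is in progress, so the row access inside the lambda should stay short.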

Is a java synchronized method entry point thread safe enough?

I have a Singleton class handling a kind of cache with different objects in a HashMap.
(The format of a key is directly linked to the type of object stored in the map - hence the map is of )
Three different actions are possible on the map : add, get, remove.
I secured the access to the map by using a public entry point method (no intense access) :
public synchronized Object doAction(String actionType, String key, Object data){
Object myObj = null;
if (actionType.equalsIgnoreCase("ADD")) {
addDataToMyMap(key,data);
} else if (actionType.equalsIgnoreCase("GET")) {
myObj = getDataFromMyMap(key);
} else if (actionType.equalsIgnoreCase("REM")) {
removeDataFromMyMap(key);
}
return myObj;
}
Notes:
The map is private. Methods addDataToMyMap(), getDataFromMyMap() and removeDataFromMyMap() are private. Only the entry point method is public and nothing else except the static getInstance() of the class itself.
Can you confirm that this is thread safe for concurrent access to the map, since there is no other way to use the map but through that method?
If it is safe for a Map, I guess this principle could be applied to any other kind of shared resource.
Many thanks in advance for your answers.
David
I would need to see the implementation of your methods, but it could be enough.
BUT I would recommend using a synchronized Map from the Java Collections API; then you wouldn't need to synchronize your method unless you're sharing some other instance.
read this: http://www.java-examples.com/get-synchronized-map-java-hashmap-example
Yes your class will be thread safe as long as the only entry point is doAction.
If your cache class has a private HashMap, its three methods are all public, synchronized and not static, and you don't have any other public instance variable, then I think your cache is thread-safe.
Better to post your code.
This is entirely safe. As long as all the threads are accessing it using a common lock, which in this case is the singleton instance itself (the monitor that a synchronized instance method locks on), it's thread-safe. (Other answers may be more performant but your implementation is safe.)
You can use Collections.synchronizedMap to synchronize access to the Map.
As it stands, it is hard to determine whether the code is thread safe. Important information missing from your example:
Are the methods public?
Are the methods synchronized?
Is the map only accessed through the methods?
I would advise you to look into synchronization to get a grasp of the problems and how to tackle them. Exploring the ConcurrentHashMap class would give further information about your problem.
You should use ConcurrentHashMap. It offers better throughput than synchronized doAction and better thread safety than Collections.synchronizedMap().
This depends on your code. As someone else stated, you can use Collections.synchronizedMap. However, this only synchronizes the individual method calls on the map. So if:
map.get(key);
map.put(key,value);
Are executed at the same time in two different threads, one will block until the other exits. However, if your critical section is larger than the single call into the map:
SomeExpensiveObject value = map.get(key);
if (value == null) {
value = new SomeExpensiveObject();
map.put(key,value);
}
Now let's assume the key is not present. The first thread executes, and gets a null value back. The scheduler yields that thread, and runs thread 2, which also gets back a null value.
It constructs the new object and puts it in the map. Then thread 1 resumes and does the same, since it still has a null value.
This is where you'd want a larger synchronization block around your critical section:
SomeExpensiveObject value = null;
synchronized (map) {
value = map.get(key);
if (value == null) {
value = new SomeExpensiveObject();
map.put(key,value);
}
}
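For comparison, here is a sketch of the same get-or-create logic using ConcurrentHashMap.computeIfAbsent (Java 8+) instead of an explicit synchronized block. This is a different technique from the one in the answer above. The class LazyCache and the constructions counter are made up, the counter only exists to show how often the factory runs.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical cache; SomeExpensiveObject stands in for the type in the answer.
class LazyCache {
    static final class SomeExpensiveObject { }

    // Counts how often the factory lambda actually ran (illustration only).
    final AtomicInteger constructions = new AtomicInteger();
    private final ConcurrentHashMap<String, SomeExpensiveObject> map = new ConcurrentHashMap<>();

    SomeExpensiveObject get(String key) {
        // The mapping function is applied at most once per absent key,
        // and the check plus insert happen as one atomic step.
        return map.computeIfAbsent(key, k -> {
            constructions.incrementAndGet();
            return new SomeExpensiveObject();
        });
    }
}
```

Threads asking for other keys are not forced through a single map-wide lock here, unlike with synchronized (map).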

Using putIfAbsent like a short circuit operator

Is it possible to use putIfAbsent or any of its equivalents like a short-circuit operator?
myConcurrentMap.putIfAbsent(key,calculatedValue)
I want calculatedValue not to be calculated again if a value is already present for the key.
By default, putIfAbsent still does the calculation every time, even though it will not actually store the value again.
Java doesn't allow any form of short-circuiting save the built-in cases, sadly - all method calls result in the arguments being fully evaluated before control passes to the method. Thus you couldn't do this with "normal" syntax; you'd need to manually wrap up the calculation inside a Callable or similar, and then explicitly invoke it.
In this case I find it difficult to see how it could work anyway, though. putIfAbsent works on the basis of being an atomic, non-blocking operation. If it were to do what you want, the sequence of events would roughly be:
Check if key exists in the map (this example assumes it doesn't)
Evaluate calculatedValue (probably expensive, given the context of the question)
Put result in map
It would be impossible for this to be non-blocking if the value didn't already exist at step two - two different threads calling this method at the same time could only perform correctly if blocking happened. At this point you may as well just use synchronized blocks with the flexibility of implementation that that entails; you can definitely implement what you're after with some simple locking, something like the following:
private final Map<K, V> map = ...;
public void myAdd(K key, Callable<V> valueComputation) throws Exception {
synchronized(map) {
if (!map.containsKey(key)) {
map.put(key, valueComputation.call());
}
}
}
You can put Future<V> objects into the map. Using putIfAbsent, only one object will be there, and computation of the final value will be performed by calling Future.get() (e.g. by FutureTask + Callable classes). Check out Java Concurrency in Practice for a discussion of this technique. (Example code is also in this question here on SO.)
This way, your value is computed only once, and all threads get same value. Access to map isn't blocked, although access to value (through Future.get()) will block until this value is computed by one of the threads.
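A condensed sketch of the Future-based approach, loosely following the memoizer pattern discussed in Java Concurrency in Practice. The class name FutureCache and the choice to wrap failures in IllegalStateException are assumptions made for this example.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.FutureTask;

// Hypothetical cache that stores Futures instead of finished values.
class FutureCache<K, V> {
    private final ConcurrentMap<K, Future<V>> map = new ConcurrentHashMap<>();

    V get(K key, Callable<V> computation) {
        Future<V> future = map.get(key);
        if (future == null) {
            FutureTask<V> task = new FutureTask<>(computation);
            future = map.putIfAbsent(key, task); // cheap: nothing computed yet
            if (future == null) {
                future = task;
                task.run(); // only the thread that won the race computes
            }
        }
        try {
            return future.get(); // other threads wait here for the result
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        }
    }
}
```

A call such as cache.get(key, () -> calculateValue(key)) then only evaluates the lambda when no Future is registered for the key yet, which gives the short-circuit behaviour asked for. Here calculateValue is the caller's own method.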
You could consider using a Guava ComputingMap:
ConcurrentMap<Key, Value> myConcurrentMap = new MapMaker()
.makeComputingMap(
new Function<Key, Value>() {
public Value apply(Key key) {
Value calculatedValue = calculateValue(key);
return calculatedValue;
}
});

How to safely modify values in Java HashMaps concurrently?

I have a block of Java code that looks something like this that I'm trying to parallelize:
value = map.get(key);
if (value == null) {
value = new Value();
map.put(key,value);
}
value.update();
I want to block any other thread from accessing the map with that particular key until after value.update() is called even if key is not in the key set. Accessing with other keys should be allowed. How could I achieve this?
Short answer is there's no safe way to do this without synchronizing the entire block. You could use java.util.concurrent.ConcurrentHashMap though, see this article for more details. The basic idea is to use ConcurrentHashMap.putIfAbsent instead of the normal put.
You cannot parallelize updates to HashMap because update can trigger resize of the underlying array including recalculation of all keys.
Use another collection, for example java.util.concurrent.ConcurrentHashMap, which is "A hash table supporting full concurrency of retrievals and adjustable expected concurrency for updates." according to the javadoc.
I wouldn't use HashMap if you need to be concerned about threading issues. Make use of the Java 5 concurrent package and look into ConcurrentHashMap.
You just described the use case for the Guava computing map. You create it with:
Map<Key, Value> map = new MapMaker().makeComputingMap(new Function<Key, Value>() {
public Value apply(Key key) {
return new Value().update();
}
});
and use it:
Value v = map.get(key);
This guarantees only one thread will call update() and other threads will block and wait until the method completes.
You probably don't actually want your value having a mutable update method on it, but that's another discussion.
private synchronized void functionName() {
value = map.get(key);
if (value == null) {
value = new Value();
map.put(key,value);
}
value.update();
}
You can learn more about synchronized methods here: Synchronized Methods
You might also want to investigate the ConcurrentHashMap class, which might suit your purposes. You can see it on the JavaDoc.
Look into ConcurrentHashMap. It has excellent performance even for single-threaded applications. It allows concurrent modification of the map from various threads without locking the whole map.
One possibility is to manage multiple locks. You can keep an array of locks and pick one based on the key's hash code. This should give you better throughput than synchronizing the whole method. You can size the array based on the number of threads that you believe will be accessing the code.
private static final int NUM_LOCKS = 16;
Object [] lockArray = new Object[NUM_LOCKS];
...
// Load array with Objects or Reentrant Locks
...
// floorMod keeps the index non-negative even for negative hash codes
Object keyLock = lockArray[Math.floorMod(key.hashCode(), NUM_LOCKS)];
synchronized(keyLock){
value = map.get(key);
if (value == null) {
value = new Value();
map.put(key,value);
}
value.update();
}
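A self-contained sketch of this lock-striping idea follows. It assumes the shared map is a ConcurrentHashMap, since a plain HashMap would not be safe when threads holding different stripe locks modify it at the same time. The Value class with its update counter and the class name StripedUpdater are placeholders.

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical example class; Value stands in for the type in the question.
class StripedUpdater {
    static final class Value {
        private int updates;            // deliberately not atomic: guarded by the stripe lock
        void update() { updates++; }
        int updates() { return updates; }
    }

    private static final int NUM_LOCKS = 16;
    private final Object[] lockArray = new Object[NUM_LOCKS];
    // Assumption: a concurrent map, so different stripes can modify it in parallel.
    private final ConcurrentHashMap<String, Value> map = new ConcurrentHashMap<>();

    StripedUpdater() {
        for (int i = 0; i < NUM_LOCKS; i++) {
            lockArray[i] = new Object();
        }
    }

    private Object lockFor(String key) {
        // floorMod avoids a negative index for negative hash codes
        return lockArray[Math.floorMod(key.hashCode(), NUM_LOCKS)];
    }

    void update(String key) {
        synchronized (lockFor(key)) {
            Value value = map.get(key);
            if (value == null) {
                value = new Value();
                map.put(key, value);
            }
            value.update();
        }
    }

    int updatesFor(String key) {
        synchronized (lockFor(key)) {
            Value value = map.get(key);
            return value == null ? 0 : value.updates();
        }
    }
}
```

Keys that share a stripe still block each other, so NUM_LOCKS trades memory against contention.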
