I have a block of Java code that looks something like this that I'm trying to parallelize:
value = map.get(key);
if (value == null) {
    value = new Value();
    map.put(key, value);
}
value.update();
I want to block any other thread from accessing the map with that particular key until after value.update() has been called, even if the key is not yet in the key set. Access with other keys should still be allowed. How could I achieve this?
The short answer is that there is no safe way to do this without synchronizing the entire block. You could, however, use java.util.concurrent.ConcurrentHashMap; see this article for more details. The basic idea is to use ConcurrentHashMap.putIfAbsent instead of a plain put.
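A minimal sketch of that idea, assuming the key and value types from the question are Key and Value. Note that this guarantees every thread ends up with the same Value instance, but it does not by itself block other threads for the whole duration of update():

ConcurrentHashMap<Key, Value> map = new ConcurrentHashMap<Key, Value>();

Value value = map.get(key);
if (value == null) {
    Value candidate = new Value();
    // atomic: only one thread's candidate is stored; every other thread gets that instance back
    Value existing = map.putIfAbsent(key, candidate);
    value = (existing == null) ? candidate : existing;
}
value.update();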
You cannot parallelize updates to a plain HashMap, because an update can trigger a resize of the underlying array, which rehashes and redistributes all existing entries.
Use another collection instead, for example java.util.concurrent.ConcurrentHashMap, which the Javadoc describes as "a hash table supporting full concurrency of retrievals and adjustable expected concurrency for updates."
I wouldn't use HashMap if you need to be concerned about threading issues. Make use of the Java 5 concurrent package and look into ConcurrentHashMap.
You just described the use case for the Guava computing map. You create it with:
Map<Key, Value> map = new MapMaker().makeComputingMap(new Function<Key, Value>() {
    public Value apply(Key key) {
        Value value = new Value();
        value.update();
        return value;
    }
});
and use it:
Value v = map.get(key);
This guarantees only one thread will call update() and other threads will block and wait until the method completes.
You probably don't actually want your value having a mutable update method on it, but that's another discussion.
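For reference, makeComputingMap was deprecated and eventually removed from MapMaker in later Guava releases; the equivalent construct there is CacheBuilder with a CacheLoader. A rough sketch of the same idea, assuming the Key and Value types from the question and that update() returns void:

LoadingCache<Key, Value> cache = CacheBuilder.newBuilder()
    .build(new CacheLoader<Key, Value>() {
        @Override
        public Value load(Key key) {
            Value value = new Value();
            value.update();
            return value;
        }
    });

Value v = cache.getUnchecked(key); // loads (and blocks other callers for this key) on first access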
private synchronized void functionName() {
    value = map.get(key);
    if (value == null) {
        value = new Value();
        map.put(key, value);
    }
    value.update();
}
You can learn more about synchronized methods here: Synchronized Methods
You might also want to investigate the ConcurrentHashMap class, which might suit your purposes; its Javadoc is a good starting point.
Look into ConcurrentHashMap. It performs well even in single-threaded use, and it allows concurrent modification of the map from multiple threads without blocking them.
One possibility is to manage multiple locks. You can keep an array of locks and pick a lock based on the key's hash code. This should give you better throughput than synchronizing the whole method. You can size the array based on the number of threads you expect to access the code.
private static final int NUM_LOCKS = 16;
Object[] lockArray = new Object[NUM_LOCKS];
...
// Load the array with plain Objects or ReentrantLocks
...
// mask off the sign bit so a negative hashCode() cannot produce a negative index
Object keyLock = lockArray[(key.hashCode() & 0x7fffffff) % NUM_LOCKS];
synchronized (keyLock) {
    value = map.get(key);
    if (value == null) {
        value = new Value();
        map.put(key, value);
    }
    value.update();
}
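If you prefer explicit locks over plain monitors, as the comment above hints, the same striping idea with java.util.concurrent.locks.ReentrantLock looks roughly like this (a sketch only, reusing map, Key and Value from the question):

private static final int NUM_LOCKS = 16;
private final ReentrantLock[] lockArray = new ReentrantLock[NUM_LOCKS];

// fill the array once, e.g. in the constructor
for (int i = 0; i < NUM_LOCKS; i++) {
    lockArray[i] = new ReentrantLock();
}

// per-key critical section
ReentrantLock keyLock = lockArray[(key.hashCode() & 0x7fffffff) % NUM_LOCKS];
keyLock.lock();
try {
    Value value = map.get(key);
    if (value == null) {
        value = new Value();
        map.put(key, value);
    }
    value.update();
} finally {
    keyLock.unlock();
}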
I have a ConcurrentMap which I need to populate from a multithreaded application. My map is shown below:
private final ConcurrentMap<String, AtomicLongMap<String>> deviceErrorHolder = Maps.newConcurrentMap();
Below is my method, which is called from a multithreaded application at a very high rate, so I need to make sure it is fast.
public void addDeviceErrorStats(String deviceName, String errorName) {
    AtomicLongMap<String> errorMap = deviceErrorHolder.get(deviceName);
    if (errorMap == null) {
        errorMap = AtomicLongMap.create();
        AtomicLongMap<String> currentErrorMap = deviceErrorHolder.putIfAbsent(deviceName, errorMap);
        if (currentErrorMap != null) {
            errorMap = currentErrorMap;
        }
    }
    errorMap.incrementAndGet(errorName);
}
For each deviceName, I will have an AtomicLongMap which will contain all the counts for different errorName.
ExceptionCounter.getInstance().addDeviceErrorStats("deviceA", "errorA");
ExceptionCounter.getInstance().addDeviceErrorStats("deviceA", "errorB");
ExceptionCounter.getInstance().addDeviceErrorStats("deviceA", "errorC");
ExceptionCounter.getInstance().addDeviceErrorStats("deviceB", "errorA");
ExceptionCounter.getInstance().addDeviceErrorStats("deviceB", "errorB");
Is my addDeviceErrorStats method thread safe? Is the way I am updating the value of my deviceErrorHolder map correct, i.e. will it be an atomic operation? Do I need to synchronize the creation of new AtomicLongMap instances, or will the ConcurrentMap take care of that for me?
I am working with Java 7.
You can create a much simpler version of this with computeIfAbsent().
AtomicLongMap<String> errorMap = deviceErrorHolder.computeIfAbsent(deviceName, a -> AtomicLongMap.create());
errorMap.incrementAndGet(errorName);
computeIfAbsent (on concurrent maps) is specifically meant to be an atomic version of your null-checking logic: if the deviceName key already has a value, it is returned; otherwise the computation is invoked atomically and its return value is both associated with the key and returned. (Note that computeIfAbsent was added to the Map interface in Java 8, so it is not available on Java 7.)
I believe your method is correct. Let's assume two threads call it concurrently for the same device.
The case where the errorMap already exists is trivial: both threads get the same instance and call incrementAndGet on it, which is atomic.
Now consider the case where the errorMap does not exist yet. Say the first thread reaches AtomicLongMap.create() and then the second thread is scheduled; that thread will also create its own local map. putIfAbsent() is atomic, so one of the threads gets null back while the other gets the map put by the first. In the latter case you throw away the map that was instantiated locally and use the returned one instead. Looks good to me.
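For clarity, here is the method from the question again with comments marking the interleaving points described above (no new logic, just annotations):

public void addDeviceErrorStats(String deviceName, String errorName) {
    AtomicLongMap<String> errorMap = deviceErrorHolder.get(deviceName);
    if (errorMap == null) {
        // both threads may get here and create their own local map
        errorMap = AtomicLongMap.create();
        // putIfAbsent is atomic: exactly one thread wins; the loser gets the winner's map back
        AtomicLongMap<String> currentErrorMap = deviceErrorHolder.putIfAbsent(deviceName, errorMap);
        if (currentErrorMap != null) {
            // this thread lost the race: discard the local map and use the shared one
            errorMap = currentErrorMap;
        }
    }
    // atomic increment on the shared AtomicLongMap
    errorMap.incrementAndGet(errorName);
}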
I'm trying to reduce the memory usage for the lock objects of segmented data. See my questions here and here. Or just assume you have a byte array in which every 16 bytes can be (de)serialized into an object; let us call this a "row", with a row length of 16 bytes. If you modify such a row from a writer thread and read it from multiple threads, you need locking, and with a byte array of 1MB (1024*1024) that means 65536 rows and the same number of locks.
That is a bit too much, especially since I need much larger byte arrays, and I would like to reduce it to something roughly proportional to the number of threads. My idea was to create a
ConcurrentHashMap<Integer, LockHelper> concurrentMap;
where the Integer is the row index, and before a thread 'enters' a row it puts a lock object into this map (I got this idea from this answer). But no matter how I think it through, I cannot find an approach that is really thread-safe:
// somewhere else where we need to write or read the row
LockHelper lock1 = new LockHelper();
// note: putIfAbsent returns null if there was no previous mapping, so 'lock' can be null here
LockHelper lock = concurrentMap.putIfAbsent(rowIndex, lock1);
lock.addWaitingThread(); // is too late
synchronized (lock) {
    try {
        // read or write the row at rowIndex, e.g. writing like
        bytes[rowIndex * 16] = 1;
        bytes[rowIndex * 16 + 1] = 2;
        // ...
    } finally {
        if (lock.noThreadsWaiting())
            concurrentMap.remove(rowIndex);
    }
}
Do you see a possibility to make this thread-safe?
I have the feeling that this will look very similar to the concurrentMap.compute construct (e.g. see this answer), or could I even utilize this method?
map.compute(rowIndex, (key, value) -> {
    if (value == null)
        value = new Object();
    synchronized (value) {
        // access row
        return value;
    }
});
map.remove(rowIndex);
Are the value and the 'synchronized' necessary at all, given that we already know the compute operation is atomic?
// null is forbidden, so use the key also as the value to avoid creating additional objects
ConcurrentHashMap<Integer, Integer> map = ...;
// now the row access looks really simple:
map.compute(rowIndex, (key, value) -> {
    // access row
    return key;
});
map.remove(rowIndex);
BTW: since when has this compute been available in Java? Since 1.8? I cannot find it in the JavaDocs.
Update: I found a very similar question here, with userIds instead of rowIndices. Note that the question contains an example with several problems, such as a missing final, calling lock inside the try-finally clause, and no shrinking of the map. There also seems to be a library, JKeyLockManager, for this purpose, but I don't think it is thread-safe.
Update 2: The solution seems to be really simple, as Nicolas Filotto pointed out how to avoid the removal:
map.compute(rowIndex, (key, value) -> {
    // access row
    return null;
});
So this is indeed much less memory-intensive, BUT the simple segment locking with synchronized is at least 50% faster in my scenario.
Are the value and the synchronized necessary at all, given that we already know the compute operation is atomic?
I confirm that the synchronized block is not needed in this case, because the compute method is performed atomically, as stated in the Javadoc of ConcurrentHashMap#compute(K key, BiFunction<? super K,? super V,? extends V> remappingFunction), which was added (along with BiFunction) in Java 8. I quote:
Attempts to compute a mapping for the specified key and its current
mapped value (or null if there is no current mapping). The entire
method invocation is performed atomically. Some attempted update
operations on this map by other threads may be blocked while
computation is in progress, so the computation should be short and
simple, and must not attempt to update any other mappings of this Map.
What you are trying to achieve with the compute method can be made fully atomic if you make your BiFunction always return null, so that the key is removed atomically too; then everything is done atomically.
map.compute(
    rowIndex,
    (key, value) -> {
        // access row here
        return null;
    }
);
This way you will then fully rely on the locking mechanism of a ConcurrentHashMap to synchronize your accesses to your rows.
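Putting that together with the question's setup (a shared byte[] with 16-byte rows), the per-row access could look roughly like this; it is a sketch only, reusing the question's ConcurrentHashMap<Integer, Integer> declaration, and the row offsets are assumptions:

private final ConcurrentHashMap<Integer, Integer> rowLocks = new ConcurrentHashMap<>();

void writeRow(final int rowIndex, final byte[] bytes) {
    rowLocks.compute(rowIndex, (key, value) -> {
        // the row is written while ConcurrentHashMap holds the internal lock for this key
        bytes[rowIndex * 16] = 1;
        bytes[rowIndex * 16 + 1] = 2;
        // returning null removes the mapping again, so the map never grows
        return null;
    });
}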
Is the code below thread/concurrency safe when multiple threads call the totalBadRecords() method from inside other methods? Both map parameters passed to this method are ConcurrentHashMaps. I want to ensure that each call updates the total properly.
If it is not safe, please explain what I have to do to ensure thread safety.
Do I need to synchronize the add/put, or is there a better way?
Do I need to synchronize the get methods in TestVO? TestVO is a simple Java bean with getter/setter methods.
Below is my sample code:
public void totalBadRecords(final Map<Integer, TestVO> sourceMap,
        final Map<String, String> logMap) {
    BigDecimal badCharges = new BigDecimal(0);
    boolean badRecordsFound = false;
    for (Entry<Integer, TestVO> e : sourceMap.entrySet()) {
        if ("Y".equals(e.getValue().getInd())) {
            badCharges = badCharges.add(e.getValue().getAmount());
            badRecordsFound = true;
        }
    }
    if (badRecordsFound)
        logMap.put("badRecordsFound:", badCharges.toPlainString());
}
That depends on how your objects are used in your whole application.
If each call to totalBadRecords takes a different sourceMap and the map (and its content) is not mutated while counting, it's thread-safe:
badCharges is a local variable; it can't be shared between threads and is thus thread-safe (no need to synchronize add).
logMap can be shared between calls to totalBadRecords: the put method of ConcurrentHashMap is already synchronized (or behaves as if it were).
if instances of TestVO are not mutated, the values from getValue() and getInd() are always coherent with one another.
the sourceMap is not mutated, so you can safely iterate over it.
Actually, in this case, you don't need a concurrent map for sourceMap. You could even make it immutable.
If the instances of TestVO and the sourceMap can change while counting, then of course you could be counting wrongly.
It depends on what you mean by thread-safe. And that boils down to what the requirements for this method are.
At the data structure level, the method will not corrupt any data structures, because the only data structures that could be shared with other threads are ConcurrentHashMap instances, and they are safe against that kind of problem.
The potential thread-safety issue is that iterating a ConcurrentHashMap is not an atomic operation. The guarantees for the iterators are such that you are not guaranteed to see all entries in the iteration if the map is updated (e.g. by another thread) while you are iterating. That means that the totalBadRecords method may not give an accurate count if some other thread modifies the map during the call. Whether this is a real thread-safety issue depends on whether or not the totalBadRecords is required to give an accurate result in that circumstance.
If you need to get an accurate count, then you have to (somehow) lock out updates to the sourceMap while making the totalBadRecords call. AFAIK, there is no way to do this using (just) the ConcurrentHashMap API, and I can't think of a way to do it that doesn't make the map a concurrency bottleneck.
In fact, if you need to calculate accurate counts, you have to use external locking for (at least) the counting operation, and all operations that could change the outcome of the counting. And even that doesn't deal with the possibility that some thread may modify one of the TestVO objects while you are counting records, and cause the TestVO to change from "good" to "bad" or vice-versa.
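If an accurate count is a hard requirement, one hedged sketch of such external locking is a single shared ReadWriteLock: writers that mutate sourceMap or a TestVO take the write lock, and the counting takes the read lock (recordId and updatedVo below are hypothetical names):

private final ReentrantReadWriteLock guard = new ReentrantReadWriteLock(); // java.util.concurrent.locks

// every code path that mutates sourceMap or a TestVO:
guard.writeLock().lock();
try {
    sourceMap.put(recordId, updatedVo);
} finally {
    guard.writeLock().unlock();
}

// totalBadRecords holds the read lock for the whole iteration, so no writer can interleave:
guard.readLock().lock();
try {
    // ... the counting loop and logMap.put from the question ...
} finally {
    guard.readLock().unlock();
}

This of course serializes writers against the counting, which is exactly the concurrency bottleneck mentioned above.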
You could use something like the following.
That would guarantee that, after a call to the totalBadRecords method, the String representing the bad charges in the logMap is accurate and you don't get lost updates. Of course a phantom read can still happen, as you do not lock the sourceMap.
private static final String BAD_RECORDS_KEY = "badRecordsFound:";

public void totalBadRecords(final ConcurrentMap<Integer, TestVO> sourceMap,
        final ConcurrentMap<String, String> logMap) {
    while (true) {
        // get the old value that is going to be replaced.
        String oldValue = logMap.get(BAD_RECORDS_KEY);

        // calculate the new value
        BigDecimal badCharges = BigDecimal.ZERO;
        for (TestVO e : sourceMap.values()) {
            if ("Y".equals(e.getInd()))
                badCharges = badCharges.add(e.getAmount());
        }
        final String newValue = badCharges.toPlainString();

        // insert into the map if there was no mapping before
        if (oldValue == null) {
            oldValue = logMap.putIfAbsent(BAD_RECORDS_KEY, newValue);
            if (oldValue == null) {
                oldValue = newValue;
            }
        }

        // replace the entry in the map
        if (logMap.replace(BAD_RECORDS_KEY, oldValue, newValue)) {
            // update succeeded -> there were no updates to the logMap while calculating the bad charges.
            break;
        }
    }
}
I have a Singleton class handling a kind of cache, with different objects stored in a HashMap.
(The format of a key is directly linked to the type of object stored in the map; hence the map is of type Map<String, Object>.)
Three different actions are possible on the map: add, get, remove.
I secured access to the map by using a public entry-point method (no intense access):
public synchronized Object doAction(String actionType, String key, Object data) {
    Object myObj = null;
    if (actionType.equalsIgnoreCase("ADD")) {
        addDataToMyMap(key, data);
    } else if (actionType.equalsIgnoreCase("GET")) {
        myObj = getDataFromMyMap(key);
    } else if (actionType.equalsIgnoreCase("REM")) {
        removeDataFromMyMap(key);
    }
    return myObj;
}
Notes:
The map is private. The methods addDataToMyMap(), getDataFromMyMap() and removeDataFromMyMap() are private. Only the entry-point method is public, and nothing else except the static getInstance() of the class itself.
Do you confirm it is thread safe for concurrent access to the map, since there is no other way to use the map but through that method?
If it is safe for a Map, I guess this principle could be applied to any other kind of shared resource.
Many thanks in advance for your answers.
David
I would need to see the implementation of your methods, but it could be enough.
BUT I would recommend using a synchronized Map from the Java Collections API; then you wouldn't need to synchronize your method, unless you're sharing some other instance.
read this: http://www.java-examples.com/get-synchronized-map-java-hashmap-example
Yes your class will be thread safe as long as the only entry point is doAction.
If your cache class has a private HashMap, your three methods are all public synchronized and not static, and you don't have any other public instance variables, then I think your cache is thread-safe.
It would be better to post your code.
This is entirely safe. As long as all the threads access it using a common lock, which in this case is the singleton instance itself (the implicit lock of the synchronized method), it's thread-safe. (Other answers may be more performant, but your implementation is safe.)
You can use Collections.synchronizedMap to synchronize access to the Map.
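For example, a minimal sketch assuming the cache maps String keys to Object values, as doAction suggests:

private final Map<String, Object> cache =
        Collections.synchronizedMap(new HashMap<String, Object>());

Note that this only makes individual calls atomic; a compound check-then-put sequence still needs an explicit synchronized block on the map.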
As it stands, it is hard to determine whether the code is thread safe. Important information missing from your example includes:
Are the methods public?
Are the methods synchronized?
Is the map only accessed through the methods?
I would advise you to look into synchronization to get a grasp of the problems and how to tackle them. Exploring the ConcurrentHashMap class would give further insight into your problem.
You should use ConcurrentHashMap. It offers better throughput than synchronized doAction and better thread safety than Collections.synchronizedMap().
This depends on your code. As someone else stated, you can use Collections.synchronizedMap. However, this only synchronizes the individual method calls on the map. So if:
map.get(key);
map.put(key,value);
Are executed at the same time in two different threads, one will block until the other exits. However, if your critical section is larger than the single call into the map:
SomeExpensiveObject value = map.get(key);
if (value == null) {
    value = new SomeExpensiveObject();
    map.put(key, value);
}
Now let's assume the key is not present. The first thread executes, and gets a null value back. The scheduler yields that thread, and runs thread 2, which also gets back a null value.
It constructs the new object and puts it in the map. Then thread 1 resumes and does the same, since it still has a null value.
This is where you'd want a larger synchronized block around your critical section:
SomeExpensiveObject value = null;
synchronized (map) {
    value = map.get(key);
    if (value == null) {
        value = new SomeExpensiveObject();
        map.put(key, value);
    }
}
Many threads may populate a HashMap, and in some cases I need to wait (block) until an object exists in the HashMap, for example:
BlockingConcurrentHashMap map = new BlockingConcurrentHashMap();
Object x = map.getAndWait(key, 1000); //(object_to_get, max_delay_ms)
Wondering if such a thing exists already, I hate re-inventing wheels.
As far as I know, there is no 'Transfer Map' available. Though the creation of one in theory isn't too difficult.
public class TransferMap<K, V> implements Map<K, V> {

    @GuardedBy("lock") // net.jcip.annotations (or javax.annotation.concurrent)
    private final HashMap<K, V> backingMap = new HashMap<K, V>();
    private final Object lock = new Object();

    public V getAndWait(Object key) {
        synchronized (lock) {
            V value = backingMap.get(key);
            while (value == null) {
                try {
                    lock.wait();                        // released while waiting, reacquired on notify
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // give up, preserving the interrupt status
                    return null;
                }
                value = backingMap.get(key);
            }
            return value;
        }
    }

    public V put(K key, V value) {
        synchronized (lock) {
            V previous = backingMap.put(key, value);
            lock.notifyAll();                           // wake every thread waiting in getAndWait
            return previous;
        }
    }

    // remaining Map methods omitted
}
There are obvious exclusions in this class (most of the Map interface is left unimplemented), not to mention the lock coarsening; needless to say it won't perform great, but you should get the idea of what is going on.
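A minimal usage sketch of the idea, with a hypothetical "job-42" key and assuming the remaining Map methods are filled in:

final TransferMap<String, String> results = new TransferMap<String, String>();

// consumer: blocks inside getAndWait until a producer puts a value for "job-42"
new Thread(new Runnable() {
    public void run() {
        String result = results.getAndWait("job-42");
        System.out.println("got: " + result);
    }
}).start();

// producer (possibly much later, from another thread): stores the value and wakes the waiter
results.put("job-42", "done");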
Blockingmap4j will suit your requirement just right.
You can find it at https://github.com/sarveswaran-m/blockingMap4j/wiki/
Since granular locks are used in the implementation, performance will not be severely degraded.
PS
This is a rather late answer on a question that is 2 years old. Since there is no way to send a private message to the author of the question, I am replying here.
Disclaimer
I am the author of the library.
An improvement on John's implementation, with an aimed notify() per key instead of a "thundering herd" wake-up, which is especially bad when nobody is waiting on an inserted key:
private final HashMap<K, Object> locks = new HashMap<K, Object>(); // one lock object per key with waiters

public V put(K key, V value) {
    V previous;
    Object lock;
    synchronized (locks) {
        previous = backingMap.put(key, value);
        lock = locks.get(key);    // non-null only if someone registered interest in this key
    }
    if (lock != null) {
        synchronized (lock) {     // must own the monitor before calling notifyAll()
            lock.notifyAll();     // wake only the threads waiting on this particular key
        }
    }
    return previous;
}

// getAndWait(key): not hard, but pretty verbose
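The elided getAndWait might look roughly like this (a sketch only: it follows the names above, declares InterruptedException, and never cleans up stale lock objects):

public V getAndWait(K key) throws InterruptedException {
    Object lock;
    synchronized (locks) {
        V value = backingMap.get(key);
        if (value != null) {
            return value;                 // value already present, nothing to wait for
        }
        lock = locks.get(key);            // register interest so put() will notify this key
        if (lock == null) {
            lock = new Object();
            locks.put(key, lock);
        }
    }
    synchronized (lock) {
        while (true) {
            synchronized (locks) {        // re-check the backing map under the map lock
                V value = backingMap.get(key);
                if (value != null) {
                    return value;
                }
            }
            lock.wait();                  // put() acquires 'lock' and notifies once a value arrives
        }
    }
}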
You can populate your Hashtable at the start with java.util.concurrent.FutureTask<ObjReturned> instances for all the tasks you need to compute. You then use a thread pool to start executing the FutureTasks. You can get your results asynchronously with ObjReturned obj = hashtable.get(key).get(), which will wait if the FutureTask in question is not done yet.
You probably don't want one single thread retrieving the results, since it might end up waiting on the task that finishes last. You could have multiple retrieval threads, or you could cycle through the keys when you wait too long for one task (there is a method FutureTask.get(waitTime, timeUnit)).
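A rough sketch of that pattern; allKeys, ObjReturned and computeObj(...) are hypothetical stand-ins for the real keys, result type and computation:

Hashtable<String, FutureTask<ObjReturned>> hashtable = new Hashtable<String, FutureTask<ObjReturned>>();
ExecutorService pool = Executors.newFixedThreadPool(4);

// populate the table up front with one task per key
for (final String key : allKeys) {
    FutureTask<ObjReturned> task = new FutureTask<ObjReturned>(new Callable<ObjReturned>() {
        public ObjReturned call() {
            return computeObj(key);   // the actual work for this key
        }
    });
    hashtable.put(key, task);
    pool.execute(task);
}

// later, possibly on another thread: blocks until the task for this key is done,
// or gives up after one second
try {
    ObjReturned obj = hashtable.get(someKey).get(1000, TimeUnit.MILLISECONDS);
} catch (InterruptedException | ExecutionException | TimeoutException e) {
    // interrupted, task failed, or still not done after the timeout
}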
I'm not sure what your question is. Do you want to wait for the value when it is not yet in the map? That is the producer-consumer pattern of BlockingQueue applied to a map, and if that is what you need, I don't know of anything similar in the JRE or anywhere else.
Google Guava's MapMaker allows you to make a computing map, that is, a Map that creates a missing value on demand using a factory of type Function<Key, Value>. If several threads ask for the same missing key at the same time, one creates the value and the rest block waiting for it. I know it's not producer-consumer, but it is what I can offer.