Updating BigDecimal concurrently within ConcurrentHashMap thread safe

Is the code below thread-safe when multiple threads call the totalBadRecords() method from inside other methods? Both map parameters of this method are ConcurrentHashMap instances. I want to ensure that each call updates the total properly.
If it is not safe, please explain what I have to do to ensure thread safety.
Do I need to synchronize the add/put, or is there a better way?
Do I need to synchronize the get methods in TestVO? TestVO is a simple Java bean with getter/setter methods.
Below is my sample code:
public void totalBadRecords(final Map<Integer, TestVO> sourceMap,
        final Map<String, String> logMap) {
    BigDecimal badCharges = new BigDecimal(0);
    boolean badRecordsFound = false;
    for (Entry<Integer, TestVO> e : sourceMap.entrySet()) {
        if ("Y".equals(e.getValue().getInd())) {
            badCharges = badCharges.add(e.getValue().getAmount());
            badRecordsFound = true;
        }
    }
    if (badRecordsFound) {
        logMap.put("badRecordsFound:", badCharges.toPlainString());
    }
}

That depends on how your objects are used in your whole application.
If each call to totalBadRecords takes a different sourceMap and the map (and its content) is not mutated while counting, it's thread-safe:
badCharges is a local variable; it can't be shared between threads and is thus thread-safe (no need to synchronize add).
logMap can be shared between calls to totalBadRecords: the put method of ConcurrentHashMap is already thread-safe (it behaves as if it were synchronized).
If instances of TestVO are not mutated, the values returned by getInd() and getAmount() are always coherent with one another.
The sourceMap is not mutated, so you can iterate over it.
Actually, in this case, you don't need a concurrent map for sourceMap. You could even make it immutable.
If the instances of TestVO and the sourceMap can change while counting, then of course you could be counting wrongly.
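As a minimal sketch of the "not mutated" case described above: ImmutableRecord is a hypothetical immutable stand-in for TestVO (the real class is not shown in the question), and BadRecordTotals is an illustrative name. The accumulator is a local variable and the records cannot change, so the method needs no synchronization.

```java
import java.math.BigDecimal;
import java.util.Map;

// Hypothetical immutable stand-in for TestVO: both fields are final, so a
// reader always sees a consistent (ind, amount) pair.
final class ImmutableRecord {
    private final String ind;
    private final BigDecimal amount;

    ImmutableRecord(String ind, BigDecimal amount) {
        this.ind = ind;
        this.amount = amount;
    }

    String getInd() { return ind; }
    BigDecimal getAmount() { return amount; }
}

final class BadRecordTotals {
    // Sums the amounts of all records flagged "Y". The running total lives
    // in a local variable, so concurrent callers never share it.
    static BigDecimal totalBadCharges(Map<Integer, ImmutableRecord> source) {
        BigDecimal total = BigDecimal.ZERO;
        for (ImmutableRecord r : source.values()) {
            if ("Y".equals(r.getInd())) {
                total = total.add(r.getAmount());
            }
        }
        return total;
    }
}
```

Passing an unmodifiable map (for example one wrapped with Collections.unmodifiableMap) makes the "nobody mutates it while counting" assumption explicit at the call site.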

It depends on what you mean by thread-safe. And that boils down to what the requirements for this method are.
At the data structure level, the method will not corrupt any data structures, because the only data structures that could be shared with other threads are ConcurrentHashMap instances, and they are safe against that kind of problem.
The potential thread-safety issue is that iterating a ConcurrentHashMap is not an atomic operation. The guarantees for the iterators are such that you are not guaranteed to see all entries in the iteration if the map is updated (e.g. by another thread) while you are iterating. That means that the totalBadRecords method may not give an accurate count if some other thread modifies the map during the call. Whether this is a real thread-safety issue depends on whether or not the totalBadRecords is required to give an accurate result in that circumstance.
If you need to get an accurate count, then you have to (somehow) lock out updates to the sourceMap while making the totalBadRecords call. AFAIK, there is no way to do this using (just) the ConcurrentHashMap API, and I can't think of a way to do it that doesn't make the map a concurrency bottleneck.
In fact, if you need to calculate accurate counts, you have to use external locking for (at least) the counting operation, and all operations that could change the outcome of the counting. And even that doesn't deal with the possibility that some thread may modify one of the TestVO objects while you are counting records, and cause the TestVO to change from "good" to "bad" or vice-versa.
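The external locking described above could look roughly like the following sketch. It is one possible arrangement, not the only one: LockedLedger and its methods are hypothetical names, and it stores immutable BigDecimal amounts instead of mutable TestVO objects to sidestep the "record changes while counting" problem. Every update takes the write lock and the totaling pass takes the read lock, so a total always reflects one consistent state, at the cost of making the lock a potential bottleneck.

```java
import java.math.BigDecimal;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical ledger: all mutations and the totaling pass share one lock.
final class LockedLedger {
    private final Map<Integer, BigDecimal> badAmounts = new HashMap<>();
    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    // Writers take the write lock, so no update can overlap a totaling pass.
    void putBadAmount(int id, BigDecimal amount) {
        lock.writeLock().lock();
        try {
            badAmounts.put(id, amount);
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Readers share the read lock; the sum is computed over a state that
    // cannot change until the lock is released.
    BigDecimal totalBadAmounts() {
        lock.readLock().lock();
        try {
            BigDecimal total = BigDecimal.ZERO;
            for (BigDecimal amount : badAmounts.values()) {
                total = total.add(amount);
            }
            return total;
        } finally {
            lock.readLock().unlock();
        }
    }
}
```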

You could use something like the following.
It guarantees that after a call to totalBadRecords, the String representing the bad charges in the logMap is accurate with respect to other writers of that entry: no updates are lost. Of course, a phantom read can still happen, as you do not lock the sourceMap.
private static final String BAD_RECORDS_KEY = "badRecordsFound:";

public void totalBadRecords(final ConcurrentMap<Integer, TestVO> sourceMap,
        final ConcurrentMap<String, String> logMap) {
    while (true) {
        // get the old value that is going to be replaced
        String oldValue = logMap.get(BAD_RECORDS_KEY);
        // calculate the new value
        BigDecimal badCharges = BigDecimal.ZERO;
        for (TestVO e : sourceMap.values()) {
            if ("Y".equals(e.getInd())) {
                badCharges = badCharges.add(e.getAmount());
            }
        }
        final String newValue = badCharges.toPlainString();
        // insert into the map if there was no mapping before
        if (oldValue == null) {
            oldValue = logMap.putIfAbsent(BAD_RECORDS_KEY, newValue);
            if (oldValue == null) {
                oldValue = newValue;
            }
        }
        // replace the entry in the map
        if (logMap.replace(BAD_RECORDS_KEY, oldValue, newValue)) {
            // update succeeded -> there were no updates to the logMap
            // while calculating the bad charges
            break;
        }
    }
}

Related

Thread safety in Set obtained from a cache

I stumbled upon the following piece of code:
public static final Map<String, Set<String>> fooCacheMap = new ConcurrentHashMap<>();
This cache is accessed from a REST controller method:
public void fooMethod(String fooId) {
    Set<String> fooSet = cacheMap.computeIfAbsent(fooId, k -> new ConcurrentSet<>());
    // operations with fooSet
}
Is ConcurrentSet really necessary when I know for sure that the set is accessed only in this method?

As you use it in a controller, multiple threads can call your method simultaneously (e.g. multiple parallel requests can invoke it).
As this method does not appear to be synchronized in any way, ConcurrentSet is probably necessary here.

Is ConcurrentSet really necessary?
Possibly, possibly not. We don't know how this code is being used.
However, assuming that it is being used in a multithreaded way (specifically: that two threads can invoke fooMethod concurrently), yes.
The atomicity in ConcurrentHashMap is only guaranteed for each invocation of computeIfAbsent. Once this completes, the lock is released, and other threads are able to invoke the method. As such, access to the return value is not atomic, and so you can get thread interference when accessing that value.
In terms of the question "do I need ConcurrentSet?": no, you can do it so that accesses to the set are atomic:
cacheMap.compute(fooId, (k, fooSet) -> {
    if (fooSet == null) fooSet = new HashSet<>();
    // Operations with fooSet
    return fooSet;
});
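A self-contained sketch of that compute-based approach, assuming the backing map is a ConcurrentHashMap (whose compute runs the remapping function atomically for the key). FooCache, addMember and memberCount are hypothetical names used for illustration. Note that reads of the set must also go through compute, otherwise they could overlap a write to the plain HashSet.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

final class FooCache {
    private final ConcurrentHashMap<String, Set<String>> cacheMap = new ConcurrentHashMap<>();

    // The remapping function runs atomically for this key, so the plain
    // HashSet is only ever touched by one thread at a time.
    void addMember(String fooId, String member) {
        cacheMap.compute(fooId, (k, fooSet) -> {
            if (fooSet == null) fooSet = new HashSet<>();
            fooSet.add(member);
            return fooSet;
        });
    }

    // Reads go through compute as well; returning the set unchanged keeps
    // the mapping, and returning null for an absent key creates none.
    int memberCount(String fooId) {
        int[] size = {0};
        cacheMap.compute(fooId, (k, fooSet) -> {
            if (fooSet != null) size[0] = fooSet.size();
            return fooSet;
        });
        return size[0];
    }
}
```

The trade-off is that all work on one key's set is serialized inside the map; keep the function passed to compute short and never let the set escape from it.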

Using a concurrent map will not by itself guarantee thread safety. Additions to the Map need to be performed in a synchronized block to ensure that two threads don't attempt to add the same key to the map. Therefore, the concurrent map is not really needed, especially because the Map itself is static and final. Furthermore, if the code modifies the Set inside the Map, which appears likely, that needs to be synchronized as well.
The correct approach for the Map is to check for the key. If it does not exist, enter a synchronized block and check for the key again. This guarantees that the key does not exist, without entering a synchronized block every time.
Set modifications should typically occur in a synchronized block as well.

How to atomically update the value of ConcurrentMap in multithreaded application?

I have a ConcurrentMap which I need to populate from a multithreaded application. My map is shown below:
private final ConcurrentMap<String, AtomicLongMap<String>> deviceErrorHolder = Maps.newConcurrentMap();
Below is my method, which is called from a multithreaded application at a very fast rate, so I need to make sure it is fast.
public void addDeviceErrorStats(String deviceName, String errorName) {
    AtomicLongMap<String> errorMap = deviceErrorHolder.get(deviceName);
    if (errorMap == null) {
        errorMap = AtomicLongMap.create();
        AtomicLongMap<String> currentErrorMap = deviceErrorHolder.putIfAbsent(deviceName, errorMap);
        if (currentErrorMap != null) {
            errorMap = currentErrorMap;
        }
    }
    errorMap.incrementAndGet(errorName);
}
For each deviceName, I will have an AtomicLongMap which will contain all the counts for different errorName.
ExceptionCounter.getInstance().addDeviceErrorStats("deviceA", "errorA");
ExceptionCounter.getInstance().addDeviceErrorStats("deviceA", "errorB");
ExceptionCounter.getInstance().addDeviceErrorStats("deviceA", "errorC");
ExceptionCounter.getInstance().addDeviceErrorStats("deviceB", "errorA");
ExceptionCounter.getInstance().addDeviceErrorStats("deviceB", "errorB");
Is my addDeviceErrorStats method thread-safe? Is the way I am updating the value in my deviceErrorHolder map correct, meaning will it be an atomic operation? Do I need to synchronize the creation of new AtomicLongMap instances, or will the ConcurrentMap take care of that for me?
I am working with Java 7.

You can create a much simpler version of this with computeIfAbsent() (available since Java 8).
AtomicLongMap<String> errorMap = deviceErrorHolder.computeIfAbsent(deviceName, a -> AtomicLongMap.create());
errorMap.incrementAndGet(errorName);
The computeIfAbsent (in concurrent maps) is especially meant to do an atomic version of what your null checking logic does. If the deviceName key has a value, it's returned, otherwise the computation is called atomically, and the return value of the computation is both associated with the key in the map as well as returned.
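For readers without Guava on the classpath, here is a JDK-only sketch of the same computeIfAbsent idea. It swaps AtomicLongMap for a nested ConcurrentHashMap of LongAdder counters, which is a substitution and not what the question uses; DeviceErrorCounter and count are hypothetical names. It requires Java 8 or later.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.LongAdder;

final class DeviceErrorCounter {
    // device name -> (error name -> counter)
    private final ConcurrentMap<String, ConcurrentMap<String, LongAdder>> deviceErrorHolder =
            new ConcurrentHashMap<>();

    // Each computeIfAbsent either returns the existing value or atomically
    // installs a new one, so no counter is ever lost to a race.
    void addDeviceErrorStats(String deviceName, String errorName) {
        deviceErrorHolder
                .computeIfAbsent(deviceName, d -> new ConcurrentHashMap<>())
                .computeIfAbsent(errorName, e -> new LongAdder())
                .increment();
    }

    // Returns 0 when the device or the error has never been recorded.
    long count(String deviceName, String errorName) {
        ConcurrentMap<String, LongAdder> errors = deviceErrorHolder.get(deviceName);
        if (errors == null) return 0L;
        LongAdder adder = errors.get(errorName);
        return adder == null ? 0L : adder.sum();
    }
}
```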

I believe your method is correct. Let's assume we have two concurrent threads calling it for the same device.
The case where the errorMap already exists is trivial, as both threads will get the same map and call incrementAndGet on it, which is atomic.
Let's now consider the case where the errorMap doesn't exist. Say the first thread gets to AtomicLongMap.create(), and then the second thread is scheduled. That thread will also create its own local map. putIfAbsent() is atomic, hence it will return null for one of the threads, while for the other it will return the map put by the first. In the latter case, you're throwing away the map that was instantiated by this thread and using the one returned instead. Looks good to me.

Can we use AtomicInteger as a local variable in a method and achieve thread safety?

public void tSafe(List<Foo> list, Properties status) {
    if (list == null) return;
    String key = "COUNT";
    AtomicInteger a = new AtomicInteger(Integer.valueOf(status.getProperty(key, "0")));
    list.parallelStream().filter(Foo::check)
        .forEach(foo -> { status.setProperty(key, String.valueOf(a.incrementAndGet())); });
}
private interface Foo {
    public boolean check();
}
Description:
In the above example, status is a shared Properties object that contains a key named COUNT. My aim is to increment the count and put it back in the properties, to count the number of checks performed. Assuming the tSafe method is called by multiple threads, do I get the correct count at the end? Note that I've used the AtomicInteger a as a local variable.

If you only have one thread, this will work. However, if you have more than one thread calling this, only some of the operations are thread-safe. This will be fine provided each thread operates on different list and status objects.
As status is a thread-safe collection, you can lock it, and provided the list is not changed in another thread, this would work.
In general, working with Strings as numbers in a thread-safe manner is very tricky to get right. You are far better off making your value thread-safe, i.e. an AtomicInteger, and never anything else.

No, this will not guarantee thread safety. Even though incrementAndGet() is itself atomic, getting a value from the Properties object and setting it back is not.
Consider the following scenario:
Thread #1 gets a value from the Properties object. For argument's sake let's say it's "100".
Thread #2 gets a value from the Properties object. Since nothing has happened, this value is still "100".
Thread #1 creates an AtomicInteger, increments it, and places "101" in the Properties object.
Thread #2 does exactly the same, and places "101" in the Properties object, instead of the 102 you expected.
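The four steps above can be replayed deterministically in a single thread, which makes the lost update easy to see without relying on scheduler timing. This is only an illustration of the interleaving; LostUpdateReplay is a hypothetical name, and the two AtomicInteger variables stand for the local counters of the two callers.

```java
import java.util.Properties;
import java.util.concurrent.atomic.AtomicInteger;

final class LostUpdateReplay {
    // Replays the interleaving step by step and returns the final COUNT.
    static String replay() {
        Properties status = new Properties();
        status.setProperty("COUNT", "100");

        // Steps 1 and 2: both callers read the same starting value.
        AtomicInteger first = new AtomicInteger(Integer.parseInt(status.getProperty("COUNT", "0")));
        AtomicInteger second = new AtomicInteger(Integer.parseInt(status.getProperty("COUNT", "0")));

        // Step 3: caller 1 increments its private counter and writes it back.
        status.setProperty("COUNT", String.valueOf(first.incrementAndGet()));
        // Step 4: caller 2 does the same with its own counter, overwriting
        // caller 1's result instead of building on it.
        status.setProperty("COUNT", String.valueOf(second.incrementAndGet()));

        return status.getProperty("COUNT");
    }
}
```

Two increments were performed, yet replay() returns "101": each AtomicInteger is atomic on its own, but the two callers never share one.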
EDIT:
On a more productive note, a better approach would be to just store the AtomicInteger in your status map and increment it in place. That way, you have a single instance and don't have to worry about races as described above. As the Properties class extends Hashtable<Object, Object>, this should technically work, although Properties really isn't intended for values that aren't Strings, and you'd be much better off with a modern thread-safe Map implementation, such as a ConcurrentHashMap:
public void tSafe(List<Foo> list, ConcurrentMap<String, AtomicInteger> status) {
    if (list == null) {
        return;
    }
    String key = "COUNT";
    status.putIfAbsent(key, new AtomicInteger(0));
    list.parallelStream()
        .filter(Foo::check)
        .forEach(foo -> { status.get(key).incrementAndGet(); });
}
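A runnable variant of the shared-counter approach, written as a sketch: Checkable is a hypothetical stand-in for the Foo interface, and the counter is looked up once with computeIfAbsent instead of on every element, which is a small variation on the code above.

```java
import java.util.List;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the Foo interface from the question.
interface Checkable {
    boolean check();
}

final class CheckCounter {
    static final String KEY = "COUNT";

    // All callers share the single AtomicInteger stored under KEY, so
    // increments from different threads are never lost.
    static void tSafe(List<Checkable> list, ConcurrentMap<String, AtomicInteger> status) {
        if (list == null) {
            return;
        }
        AtomicInteger counter = status.computeIfAbsent(KEY, k -> new AtomicInteger());
        list.parallelStream()
            .filter(Checkable::check)
            .forEach(item -> counter.incrementAndGet());
    }
}
```

Because the value in the map is the counter itself, there is no read-modify-write cycle on the map entry for threads to race on.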

Storing object reference into a volatile field

I'm using the following field:
private DateDao dateDao;
private volatile Map<String, Date> dates;

public Map<String, Date> getDates() {
    return Collections.unmodifiableMap(dates);
}

public void retrieveDates() {
    dates = dateDao.retrieveDates();
}
Where
public interface DateDao {
    // Currently returns a HashMap instance
    public Map<String, Date> retrieveDates();
}
Is it safe to publish the map of dates that way? As I understand it, a volatile field means that the reference won't be cached in CPU registers and will be read from memory every time it is accessed.
So we might still read a stale value for the state of the map, because HashMap doesn't do any synchronization.
Is it safe to do so?
UPD: For instance, assume that the DAO method is implemented in the following way:
public Map<String, Date> retrieveDates() {
    Map<String, Date> retVal = new HashMap<>();
    retVal.put("SomeString", new Date());
    // and so forth...
    return retVal;
}
As can be seen, the DAO method doesn't do any synchronization, and both HashMap and Date are mutable and not thread-safe. Now, we've created and published them as shown above. Is it guaranteed that any subsequent read of dates from another thread will observe not only the correct reference to the Map object, but also its "fresh" state?
I'm not sure whether a thread could observe some stale value (e.g. dates.get("SomeString") returns null).

I think you're asking two questions:
Given that DAO code, is it possible for your code using it to use the object reference it gets here:
dates = dateDao.retrieveDates();
before the dateDao.retrieveDates method as quoted is done adding to that object? E.g., do the memory model's statement reordering semantics allow the retrieveDates method to return the reference before the last put (etc.) is complete?
Once your code has the dates reference, is there an issue with unsynchronized access to dates in your code and also via the read-only view of it you return from getDates?
Whether your field is volatile has no bearing on either of those questions. The only thing that making your field volatile does is prevent a thread calling getDates from getting an out-of-date value for your dates field. That is:
Thread A                                         Thread B
----------                                       --------
1. Updates `dates` from dateDao.retrieveDates
2. Updates `dates` from dateDao.retrieveDates
   again
                                                 3. getDates returns read-only
                                                    view of `dates` from #1
Without volatile, the scenario above is possible (but harmless). With volatile, it isn't: Thread B will see the value of dates from #2, not #1.
But that doesn't relate to either of the questions I think you're asking.
Question 1
No, your code in retrieveDates cannot see the object reference returned by dateDao.retrieveDates before dateDao.retrieveDates is done filling in that map. The memory model allows reordering statements, but:
...compilers are allowed to reorder the instructions in either thread, when this does not affect the execution of that thread in isolation
(My emphasis.) Returning the reference to your code before dateDao.retrieveDates is done would obviously affect the execution of the thread in isolation.
Question 2
The DAO code you've shown can never modify the map it returns to you, since it doesn't keep a copy of it, so we don't need to worry about the DAO.
In your code, you haven't shown anything that modifies the contents of dates. If your code doesn't modify the contents of dates, then there's no need for synchronization, since the map is unchanging. You might want to make that a guarantee by wrapping dates in the read-only view when you get it, rather than when you return it:
dates = Collections.unmodifiableMap(dateDao.retrieveDates());
If your code does modify dates somewhere you haven't shown, then yes, there's potential for trouble because Collections.unmodifiableMap does nothing to synchronize map operations. It just creates a read-only view.
If you wanted to ensure synchronization, you'd want to wrap dates in a Collections.synchronizedMap instance:
dates = Collections.synchronizedMap(dateDao.retrieveDates());
Then all access to it in your code will be synchronized, and all access to it via the read-only view you return will also be synchronized, as they all go through the synchronized map.
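A minimal sketch of the "wrap it when you get it" suggestion, assuming the map is never modified after it has been assigned to the field. DateHolder and refresh are hypothetical names, and the defensive copy is an extra precaution that is not in the original code. The Date values themselves remain mutable, so this only protects the map's structure.

```java
import java.util.Collections;
import java.util.Date;
import java.util.HashMap;
import java.util.Map;

final class DateHolder {
    // A write to a volatile field happens-before any later read of that
    // field, so a reader that sees the new reference also sees the entries
    // that were put into the map before the assignment.
    private volatile Map<String, Date> dates = Collections.emptyMap();

    // Copy first, wrap second, publish last; nothing mutates the map after
    // the volatile write.
    void refresh(Map<String, Date> freshlyLoaded) {
        dates = Collections.unmodifiableMap(new HashMap<>(freshlyLoaded));
    }

    // The field already holds a read-only view, so it can be returned as is.
    Map<String, Date> getDates() {
        return dates;
    }
}
```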

As far as I can tell, declaring a map volatile won't synchronize access to it (i.e. readers could read the map while it is being updated by the DAO). However, it guarantees that every thread reads the most recently assigned reference to the map. What I usually do when I need synchronization and freshness is use a lock object, something similar to the following:
private DateDao dateDao;
private volatile Map<String, Date> dates;
private final Object _lock = new Object();

public Map<String, Date> getDates() {
    synchronized (_lock) {
        return Collections.unmodifiableMap(dates);
    }
}

public void retrieveDates() {
    synchronized (_lock) {
        dates = dateDao.retrieveDates();
    }
}
This provides reader/writer synchronization (but note that writers are not prioritized, i.e. if a reader is getting the map, the writers will have to wait) and "data freshness" via volatile. Moreover, this is a pretty basic approach, and there are other ways of achieving the same features (e.g. Locks and Semaphores), but most of the time this does the trick for me.

Is a java synchronized method entry point thread safe enough?

I have a singleton class handling a kind of cache with different objects in a HashMap.
(The format of a key is directly linked to the type of object stored in the map - hence the map is of )
Three different actions are possible on the map: add, get, remove.
I secured access to the map by using a single public entry point method (access is not intensive):
public synchronized Object doAction(String actionType, String key, Object data) {
    Object myObj = null;
    if (actionType.equalsIgnoreCase("ADD")) {
        addDataToMyMap(key, data);
    } else if (actionType.equalsIgnoreCase("GET")) {
        myObj = getDataFromMyMap(key);
    } else if (actionType.equalsIgnoreCase("REM")) {
        removeDataFromMyMap(key);
    }
    return myObj;
}
Notes:
The map is private. The methods addDataToMyMap(), getDataFromMyMap() and removeDataFromMyMap() are private. Only the entry point method is public, and nothing else except the static getInstance() of the class itself.
Can you confirm that this is thread-safe for concurrent access to the map, since there is no other way to use the map but through that method?
If it is safe for a Map, I guess this principle could be applied to any other kind of shared resource.
Many thanks in advance for your answers.
David

I would need to see the implementation of your methods, but it could be enough.
However, I would recommend using a synchronized Map from the Java Collections API; then you wouldn't need to synchronize your method unless you're sharing some other instance.
Read this: http://www.java-examples.com/get-synchronized-map-java-hashmap-example

Yes, your class will be thread-safe as long as the only entry point is doAction.

If your cache class has a private HashMap, its three methods are all public, synchronized and non-static, and you don't have any other public instance variable, then I think your cache is thread-safe.
It would be better to post your code.

This is entirely safe. As long as all the threads are accessing it using a common lock, which in this case is the singleton instance itself (the monitor of a synchronized instance method), then it's thread-safe. (Other answers may be more performant, but your implementation is safe.)

You can use Collections.synchronizedMap to synchronize access to the Map.

As is, it is hard to determine whether the code is thread-safe. Important information missing from your example:
Are the methods public?
Are the methods synchronized?
Is the map only accessed through the methods?
I would advise you to look into synchronization to get a grasp of the problems and how to tackle them. Exploring the ConcurrentHashMap class would give further information about your problem.

You should use ConcurrentHashMap. It offers better throughput than synchronized doAction and better thread safety than Collections.synchronizedMap().

This depends on your code. As someone else stated, you can use Collections.synchronizedMap. However, this only synchronizes the individual method calls on the map. So if:
map.get(key);
map.put(key,value);
Are executed at the same time in two different threads, one will block until the other exits. However, if your critical section is larger than the single call into the map:
SomeExpensiveObject value = map.get(key);
if (value == null) {
    value = new SomeExpensiveObject();
    map.put(key, value);
}
Now let's assume the key is not present. The first thread executes, and gets a null value back. The scheduler yields that thread, and runs thread 2, which also gets back a null value.
It constructs the new object and puts it in the map. Then thread 1 resumes and does the same, since it still has a null value.
This is where you'd want a larger synchronization block around your critical section:
SomeExpensiveObject value = null;
synchronized (map) {
    value = map.get(key);
    if (value == null) {
        value = new SomeExpensiveObject();
        map.put(key, value);
    }
}
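As an alternative to the explicit synchronized block, the same check-then-act can be expressed with ConcurrentHashMap.computeIfAbsent (Java 8 and later), which runs the creation function at most once per absent key. This is a sketch under that assumption, not part of the answer above; ExpensiveCache, getOrCreate and the construction counter are hypothetical and exist only to make the behavior observable.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

final class ExpensiveCache {
    // Placeholder for an object that is costly to build.
    static final class SomeExpensiveObject { }

    private final ConcurrentHashMap<String, SomeExpensiveObject> map = new ConcurrentHashMap<>();
    // Counts how many times the creation function actually ran.
    private final AtomicInteger constructions = new AtomicInteger();

    // Lookup and creation happen as one atomic step for the key, so two
    // threads can never both see "absent" and both construct a value.
    SomeExpensiveObject getOrCreate(String key) {
        return map.computeIfAbsent(key, k -> {
            constructions.incrementAndGet();
            return new SomeExpensiveObject();
        });
    }

    int constructionCount() {
        return constructions.get();
    }
}
```

Other threads asking for the same key block while the creation function runs, so it should be kept short and must not touch other mappings of the same map.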
