Confusion on AtomicReference getAndUpdate

I'm a bit confused about how AtomicReference getAndUpdate guarantees atomicity. Consider the following examples
example 1
AtomicReference<Set<String>> set = new AtomicReference<>(new HashSet<>());
set.getAndUpdate(current -> {
Set<String> updated = new HashSet<>();
updated.add("test");
return updated;
});
example 2
AtomicReference<Set<String>> set = new AtomicReference<>(new HashSet<>());
set.getAndUpdate(current -> {
current.add("test");
return current;
});
In example 2, the set is modified inside the getAndUpdate callback. If multiple threads call this code at the same time, will they see the modified state, or does getAndUpdate prevent this by cloning the original set before passing it to the callback, so that a modification made in one thread is not seen by other threads? If example 2 does not guarantee atomicity, why does getAndUpdate allow us to write this code?
Example 1 guarantees atomicity, since the modification happens on a new set. But how does it differ from the code below?
AtomicReference<Set<String>> set = new AtomicReference<>(new HashSet<>());
Set<String> updated = new HashSet<>();
updated.add("test");
set.set(updated);

In example 2, the set is modified inside the getAndUpdate callback. If multiple threads call this code at the same time, will they see the modified state, or does getAndUpdate prevent this by cloning the original set before passing it to the callback, so that a modification made in one thread is not seen by other threads?
There's nothing preventing the modification to the set from being seen by multiple threads. Only the change to the reference is atomic, not changes to whatever the reference refers to.
If example 2 does not guarantee atomicity, why would getAndUpdate allow us to write this code?
Because it can't stop you. The compiler isn't smart enough to know that example 2 is broken code.
To minimize the risk of accidentally / unsafely modifying shared state, make sure the things stored in an atomic reference are immutable, or at least unmodifiable:
AtomicReference<Set<String>> set = new AtomicReference<>(Collections.emptySet());
set.getAndUpdate(current -> {
Set<String> updated = new HashSet<>();
updated.add("test");
return Collections.unmodifiableSet(updated);
});
// or if you're on Java 9+
set.getAndUpdate(current -> {
return Set.of("test");
});
Example 1 guarantees atomicity, since the modification happens on a new set. But how does it differ from [AtomicReference.set()]?
Your Example 1 uses AtomicReference.getAndUpdate which returns the previous value, and lets you generate the new value based on the previous value. If you don't need to know what the previous value was, you can just call AtomicReference.set.
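To illustrate generating the new value from the previous one, here is a minimal sketch of a copy-on-write update. It assumes the stored sets are always unmodifiable, so the callback never mutates the set it receives. The class name CopyOnWriteSetDemo and its helper methods are made up for this illustration.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.atomic.AtomicReference;

class CopyOnWriteSetDemo {

    /** Atomically swaps in a new unmodifiable set: the old elements plus value. */
    static void add(AtomicReference<Set<String>> ref, String value) {
        ref.getAndUpdate(current -> {
            // Copy first; 'current' itself is never modified. The function may be
            // re-applied if another thread wins the race, so it has no side effects.
            Set<String> updated = new HashSet<>(current);
            updated.add(value);
            return Collections.unmodifiableSet(updated);
        });
    }

    /** Hypothetical driver: each thread adds perThread distinct values; returns the final size. */
    static int runConcurrentAdds(int threads, int perThread) {
        AtomicReference<Set<String>> ref = new AtomicReference<>(Collections.emptySet());
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int id = t;
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    add(ref, id + "-" + i);
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return ref.get().size();
    }

    public static void main(String[] args) {
        System.out.println(runConcurrentAdds(4, 250)); // prints 1000: no update is lost
    }
}
```

Each update copies the whole set, so this pattern suits small sets that are read far more often than they are written.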

Related

Thread safety in Set obtained from a cache

I stumbled upon the following piece of code:
public static final Map<String, Set<String>> cacheMap = new ConcurrentHashMap<>();
this cache is accessed from rest controller method:
public void fooMethod(String fooId) {
Set<String> fooSet = cacheMap.computeIfAbsent(fooId, k -> new ConcurrentSet<>());
//operations with fooSet
}
Is ConcurrentSet really necessary when I know for sure that the set is accessed only in this method?

Because you use it in a controller, multiple threads can call your method simultaneously (e.g. multiple parallel requests can call your method).
Since this method does not appear to be synchronized in any way, ConcurrentSet is probably necessary here.
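As a rough sketch of the concurrent-set approach using only the JDK: ConcurrentSet is not a JDK class (presumably it comes from a third-party library), so this example substitutes ConcurrentHashMap.newKeySet(). The class FooCache and its methods are hypothetical names for illustration.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class FooCache {
    private final Map<String, Set<String>> cacheMap = new ConcurrentHashMap<>();

    /** Creates the per-key set on first use, then adds to it; both steps are thread-safe. */
    void record(String fooId, String value) {
        cacheMap.computeIfAbsent(fooId, k -> ConcurrentHashMap.newKeySet()).add(value);
    }

    int sizeOf(String fooId) {
        Set<String> set = cacheMap.get(fooId);
        return set == null ? 0 : set.size();
    }

    /** Hypothetical driver: several threads record distinct values under one key. */
    static int runConcurrentRecords(int threads, int perThread) {
        FooCache cache = new FooCache();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int id = t;
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    cache.record("foo", id + "-" + i);
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return cache.sizeOf("foo");
    }
}
```

Individual add calls are safe here, but a sequence of several operations on the same set is still not atomic as a whole.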

Is ConcurrentSet really necessary?
Possibly, possibly not. We don't know how this code is being used.
However, assuming that it is being used in a multithreaded way (specifically: that two threads can invoke fooMethod concurrently), yes.
The atomicity in ConcurrentHashMap is only guaranteed for each invocation of computeIfAbsent. Once this completes, the lock is released, and other threads are able to invoke the method. As such, access to the return value is not atomic, and so you can get thread interference when accessing that value.
In terms of the question "do I need ConcurrentSet?": no, you can arrange things so that accesses to the set are atomic:
cacheMap.compute(fooId, (k, fooSet) -> {
if (fooSet == null) fooSet = new HashSet<>();
// Operations with fooSet
return fooSet;
});
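A fuller sketch of the compute idea follows, under the assumption that every access to a set, reads included, goes through the map's compute-style methods so that it runs atomically for that key. The class ComputeGuardedSets and its methods are illustrative names only.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

class ComputeGuardedSets {
    private final Map<String, Set<String>> cacheMap = new ConcurrentHashMap<>();

    /** Mutates the plain HashSet only inside compute, which runs atomically per key. */
    void add(String fooId, String value) {
        cacheMap.compute(fooId, (k, fooSet) -> {
            if (fooSet == null) fooSet = new HashSet<>();
            fooSet.add(value);
            return fooSet;
        });
    }

    /** Reads also go through the map so they cannot overlap with a writer. */
    int size(String fooId) {
        AtomicInteger size = new AtomicInteger();
        cacheMap.computeIfPresent(fooId, (k, fooSet) -> {
            size.set(fooSet.size());
            return fooSet;
        });
        return size.get();
    }

    /** Hypothetical driver: several threads add distinct values under one key. */
    static int runConcurrentAdds(int threads, int perThread) {
        ComputeGuardedSets sets = new ComputeGuardedSets();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int id = t;
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    sets.add("foo", id + "-" + i);
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return sets.size("foo");
    }
}
```

The trade-off is that the set reference must never escape the lambda; once a caller holds the HashSet directly, the protection is gone.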

Using a concurrent map will not guarantee thread safety. Additions to the Map need to be performed in a synchronized block to ensure that two threads don't attempt to add the same key to the map. Therefore, the concurrent map is not really needed, especially because the Map itself is static and final. Furthermore, if the code modifies the Set inside the Map, which appears likely, that needs to be synchronized as well.
The correct approach to the Map is to check for the key first. If it does not exist, enter a synchronized block and check for the key again. This guarantees that the key does not exist without entering a synchronized block every time.
Set modifications should typically occur in a synchronized block as well.

Storing object reference into a volatile field

I'm using the following field:
private DateDao dateDao;
private volatile Map<String, Date> dates;
public Map<String, Date> getDates() {
return Collections.unmodifiableMap(dates);
}
public void retrieveDates() {
dates = dateDao.retrieveDates();
}
Where
public interface DateDao {
//Currently returns HashMap instance
public Map<String, Date> retrieveDates();
}
Is it safe to publish the map of dates that way? I mean, a volatile field means that the reference won't be cached in CPU registers and will be read from memory every time it is accessed.
But we might still read a stale value for the state of the map, because HashMap doesn't do any synchronization.
Is it safe to do so?
UPD: For instance, assume that the DAO method is implemented in the following way:
public Map<String, Date> retrieveDates() {
Map<String, Date> retVal = new HashMap<>();
retVal.put("SomeString", new Date());
// and so forth...
return retVal;
}
As can be seen, the DAO method doesn't do any synchronization, and both HashMap and Date are mutable and not thread-safe. Now, we've created and published them as shown above. Is it guaranteed that any subsequent read of dates from another thread will observe not only the correct reference to the Map object, but also its "fresh" state?
I'm not sure whether a thread could observe some stale value (e.g. dates.get("SomeString") returning null).

I think you're asking two questions:
1. Given that DAO code, is it possible for your code to use the object reference it gets here:
dates = dateDao.retrieveDates();
before the dateDao.retrieveDates method as quoted is done adding to that object? E.g., do the memory model's statement-reordering semantics allow the retrieveDates method to return the reference before the last put (etc.) is complete?
2. Once your code has the dates reference, is there an issue with unsynchronized access to dates in your code and also via the read-only view of it you return from getDates?
Whether your field is volatile has no bearing on either of those questions. The only thing that making your field volatile does is prevent a thread calling getDates from getting an out-of-date value for your dates field. That is:
Thread A                                        Thread B
----------                                      --------
1. Updates `dates` from dateDao.retrieveDates
2. Updates `dates` from "        " again
                                                3. getDates returns read-only
                                                   view of `dates` from #1
Without volatile, the scenario above is possible (but harmless). With volatile, it isn't: Thread B will see the value of dates from #2, not #1.
But that doesn't relate to either of the questions I think you're asking.
Question 1
No, your code in retrieveDates cannot see the object reference returned by dateDao.retrieveDates before dateDao.retrieveDates is done filling in that map. The memory model allows reordering statements, but:
...compilers are allowed to reorder the instructions in either thread, when this does not affect the execution of that thread in isolation
(My emphasis.) Returning the reference to your code before dateDao.retrieveDates has finished would obviously affect the execution of the thread in isolation.
Question 2
The DAO code you've shown can never modify the map it returns to you, since it doesn't keep a copy of it, so we don't need to worry about the DAO.
In your code, you haven't shown anything that modifies the contents of dates. If your code doesn't modify the contents of dates, then there's no need for synchronization, since the map is unchanging. You might want to make that a guarantee by wrapping dates in the read-only view when you get it, rather than when you return it:
dates = Collections.unmodifiableMap(dateDao.retrieveDates());
If your code does modify dates somewhere you haven't shown, then yes, there's potential for trouble because Collections.unmodifiableMap does nothing to synchronize map operations. It just creates a read-only view.
If you wanted to ensure synchronization, you'd want to wrap dates in a Collections.synchronizedMap instance:
dates = Collections.synchronizedMap(dateDao.retrieveDates());
Then all access to it in your code will be synchronized, and all access to it via the read-only view you return will also be synchronized, as they all go through the synchronized map.
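The read-only-snapshot variant could be sketched roughly as follows. The Supplier stands in for DateDao, and the defensive copy is an extra precaution that is not in the original code; both are assumptions made for illustration. Because a write to a volatile field happens-before later reads of that field, a reader that sees the new reference also sees the map contents written before the assignment.

```java
import java.util.Collections;
import java.util.Date;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

class DateCache {
    private final Supplier<Map<String, Date>> dao; // hypothetical stand-in for DateDao
    private volatile Map<String, Date> dates = Collections.emptyMap();

    DateCache(Supplier<Map<String, Date>> dao) {
        this.dao = dao;
    }

    /** Returns the current snapshot; it is already read-only, so no wrapping is needed here. */
    Map<String, Date> getDates() {
        return dates;
    }

    void retrieveDates() {
        // Build the map completely, then publish it with a single volatile write.
        // The copy means a DAO that kept a reference could not change our snapshot.
        dates = Collections.unmodifiableMap(new HashMap<>(dao.get()));
    }

    /** Hypothetical check: the published view has the data and rejects modification. */
    static boolean demo() {
        DateCache cache = new DateCache(() -> {
            Map<String, Date> retVal = new HashMap<>();
            retVal.put("SomeString", new Date(0L));
            return retVal;
        });
        cache.retrieveDates();
        Map<String, Date> view = cache.getDates();
        boolean readOnly;
        try {
            view.put("Other", new Date(0L));
            readOnly = false;
        } catch (UnsupportedOperationException expected) {
            readOnly = true;
        }
        return readOnly && view.containsKey("SomeString");
    }
}
```

Note that the Date values are still mutable objects; the sketch only protects the map structure, not the objects stored in it.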

As far as I can tell, declaring a map volatile won't synchronize its access (i.e. readers could read the map while it is being updated by the dao). However, it guarantees that the map lives in shared memory, so every thread will see the same values in it at every given time. What I usually do when I need synchronization and freshness is use a lock object, something similar to the following:
private DateDao dateDao;
private volatile Map<String, Date> dates;
private final Object _lock = new Object();
public Map<String, Date> getDates() {
synchronized(_lock) {
return Collections.unmodifiableMap(dates);
}
}
public void retrieveDates() {
synchronized(_lock) {
dates = dateDao.retrieveDates();
}
}
This provides readers/writers synchronization (but note that writers are not prioritized, i.e. if a reader is getting the map the writers will have to wait) and 'data freshness' via volatile. Moreover, this is a pretty basic approach, and there are other ways of achieving the same features (e.g. Locks and Semaphores), but most of the time this does the trick for me.

Java visibility: final static non-threadsafe collection changes after construction

I found the following code snippet in luaj and started to wonder whether changes made to the Map after it has been constructed might not be visible to other threads, since there is no synchronization in place.
I know that since the Map is declared final, its initialized values are visible to other threads after construction, but what about changes that happen after that?
Some might also realize that this class is so far from thread-safe that calling coerce in a multi-threaded environment might even cause an infinite loop in the HashMap, but my question is not about that.
public class CoerceJavaToLua {
static final Map COERCIONS = new HashMap(); // this map is visible to all threads after construction, since it's final
public static LuaValue coerce(Object paramObject) {
...;
if (localCoercion == null) {
localCoercion = ...;
COERCIONS.put(localClass, localCoercion); // visible?
}
return ...;
}
...
}

You're correct that changes to the Map may not be visible to other threads. Every method that accesses COERCIONS (both reading and writing) should be synchronized on the same object. Alternatively, if you never need sequences of accesses to be atomic, you could use a synchronized collection.
(BTW, why are you using raw types?)

This code is actually bad and may cause many problems (probably not an infinite loop, which is more common with TreeMap; with HashMap you're more likely to get silent data loss due to overwrites, or some random exception). And you're right: it's not guaranteed that changes made in one thread will be visible to another.
Here the problem may not look very big, as this Map is used for caching, so silent overwrites or a visibility lag don't lead to real problems (just two distinct instances of a coercion will be used for the same class, which is probably OK in this case). However, it's still possible that such code will break your program. If you like, you can submit a patch to the LuaJ team.

Two options:
// Synchronized (since Java 1.2)
static final Map COERCIONS = Collections.synchronizedMap(new HashMap());
// Concurrent (since Java 5)
static final Map COERCIONS = new ConcurrentHashMap();
They each have their pros and cons.
ConcurrentHashMap's pro is that there is no map-wide locking. Its con is that bulk operations are not atomic: e.g. an Iterator in one thread and a call to putAll in another can let the iterator see only some of the values added.
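With the ConcurrentHashMap option, the check-then-put in coerce could also be collapsed into a single computeIfAbsent call, which applies the function at most once per key. The sketch below is an assumption about how that might look, not LuaJ's code: it uses a String as a stand-in for the real coercion object, and CoercionCache is a made-up class.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

class CoercionCache {
    private final Map<Class<?>, String> coercions = new ConcurrentHashMap<>();
    private final AtomicInteger builds = new AtomicInteger(); // counts factory invocations

    /** Looks up the coercion for the object's class, building it once if it is missing. */
    String coerce(Object paramObject) {
        return coercions.computeIfAbsent(paramObject.getClass(), localClass -> {
            builds.incrementAndGet();
            return "coercion-for-" + localClass.getSimpleName(); // placeholder value
        });
    }

    /** Hypothetical driver: many threads coerce the same class; returns how often it was built. */
    static int buildsAfterConcurrentUse(int threads) {
        CoercionCache cache = new CoercionCache();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < 1000; i++) {
                    cache.coerce("some string");
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return cache.builds.get();
    }
}
```

This addresses both the visibility question and the duplicate-coercion issue, at the cost of briefly blocking other updates to the same bin while the value is computed.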

Mutating instance or local object variables in Lambda java 8

I know that for concurrency reasons I cannot update the value of a local variable in a lambda in Java 8. So this is illegal:
double d = 0;
orders.forEach( (o) -> {
d+= o.getTotal();
});
But what about updating an instance variable or changing the state of a local object? For example, in a Swing application I have a button and a label declared as instance variables, and when I click the button I want to hide the label:
jButton1.addActionListener(( e) -> {
jLabel.setVisible(false);
});
I get no compiler errors and it works fine, but is it right to change the state of an object in a lambda? Will I have concurrency problems or something bad in the future?
Here is another example. Imagine that the following code is in the doGet method of a servlet.
Will I have some problem here? If the answer is yes, why?
String key = request.getParameter("key");
Map<String, String> resultMap = new HashMap<>();
Map<String, String> map = new HashMap<>();
//Load map
map.forEach((k, v) -> {
if (k.equals(key)) {
resultMap.put(k, v);
}
});
response.getWriter().print(resultMap);
What I want to know is: When is it right to mutate the state of an object instance in a lambda?

Your assumptions are incorrect.
You can only access effectively final local variables in lambdas (you can read them, not reassign them), because lambdas are syntactic sugar* over anonymous inner classes.
*They are actually more than only syntactic sugar, but that is not relevant here.
And in anonymous inner classes you can only access effectively final local variables, hence the same holds for lambdas.
You can do anything you want with lambdas as long as the compiler allows it. Now on to the behaviour part:
If you modify state that depends on other state, in a parallel setting, then you are in trouble.
If you modify state that depends on other state, in a linear setting, then everything is fine.
If you modify state that does not depend on anything else, then everything is fine as well.
Some examples:
class MutableNonSafeInt {
private int i = 0;
public void increase() {
i++;
}
public int get() {
return i;
}
}
MutableNonSafeInt integer = new MutableNonSafeInt();
IntStream.range(0, 1000000)
.forEach(i -> integer.increase());
System.out.println(integer.get());
This will print 1000000 as expected no matter what happens, even though it depends on the previous state.
Now let's parallelize the stream:
MutableNonSafeInt integer = new MutableNonSafeInt();
IntStream.range(0, 1000000)
.parallel()
.forEach(i -> integer.increase());
System.out.println(integer.get());
Now it prints different integers, like 199205 or 249165, because i++ is not atomic and, without synchronization, threads do not always see the changes that other threads have made.
But say that we now get rid of our dummy class and use the AtomicInteger, which is thread-safe, we get the following:
AtomicInteger integer = new AtomicInteger(0);
IntStream.range(0, 1000000)
.parallel()
.forEach(i -> integer.getAndIncrement());
System.out.println(integer.get());
Now it correctly prints 1000000 again.
Synchronization is costly however, and we have lost nearly all benefits of parallelization here.
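One way to avoid shared mutable state altogether is to express the computation as a reduction, so each thread works on its own partial result. The sketch below applies that to the order total from the question; the Order class is a hypothetical stand-in, since only getTotal() appears in the original.

```java
import java.util.Arrays;
import java.util.List;

class OrderTotals {

    /** Hypothetical order type; only getTotal() is taken from the question. */
    static final class Order {
        private final double total;

        Order(double total) {
            this.total = total;
        }

        double getTotal() {
            return total;
        }
    }

    /** Sums the totals without any variable being written from inside a lambda. */
    static double sum(List<Order> orders) {
        return orders.parallelStream()
                .mapToDouble(Order::getTotal)
                .sum();
    }

    static double demo() {
        return sum(Arrays.asList(new Order(10.0), new Order(20.5), new Order(5.25)));
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 35.75
    }
}
```

The same code works with stream() instead of parallelStream(); the reduction form is what makes the result independent of how the work is split.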

In general: yes, you may get concurrency problems, but only the ones you already had. Lambdafying it won't make code non-threadsafe where it was before, or vice versa. In the example you give, your code is (probably) threadsafe because an ActionListener is only ever called on the event-dispatching thread. Provided you have observed the Swing single-threaded rule, no other thread ever accesses jLabel, and if so there can be no thread interference on it. But that question is orthogonal to the use of lambdas.

In case 'forEach' is distributed to different threads/cores, you might have concurrency issues. Consider using atomics or concurrent structures (like ConcurrentHashMap).

Updating BigDecimal concurrently within ConcurrentHashMap thread safe

Is the code below thread-safe when multiple threads call the totalBadRecords() method from inside another method? Both map parameters passed to this method are ConcurrentHashMap instances. I want to ensure that each call updates the total properly.
If it is not safe, please explain what I have to do to ensure thread safety.
Do I need to synchronize the add/put, or is there a better way?
Do I need to synchronize the get methods in TestVO? TestVO is a simple Java bean with getter/setter methods.
Below is my sample code:
public void totalBadRecords(final Map<Integer, TestVO> sourceMap,
final Map<String, String> logMap) {
BigDecimal badCharges = new BigDecimal(0);
boolean badRecordsFound = false;
for (Entry<Integer, TestVO> e : sourceMap.entrySet()) {
// Braces added so the flag is set only when a bad record is actually found
if ("Y".equals(e.getValue().getInd())) {
badCharges = badCharges.add(e.getValue()
.getAmount());
badRecordsFound = true;
}
}
if (badRecordsFound)
logMap.put("badRecordsFound:", badCharges.toPlainString());
}

That depends on how your objects are used in your whole application.
If each call to totalBadRecords takes a different sourceMap and the map (and its content) is not mutated while counting, it's thread-safe:
badCharges is a local variable; it can't be shared between threads, and is thus thread-safe (no need to synchronize add).
logMap can be shared between calls to totalBadRecords: the put method of ConcurrentHashMap is already synchronized (or behaves as if it were).
If instances of TestVO are not mutated, the values from getValue() and getInd() are always coherent with one another.
The sourceMap is not mutated, so you can iterate over it.
Actually, in this case, you don't need a concurrent map for sourceMap. You could even make it immutable.
If the instances of TestVO and the sourceMap can change while counting, then of course you could be counting wrongly.

It depends on what you mean by thread-safe. And that boils down to what the requirements for this method are.
At the data structure level, the method will not corrupt any data structures, because the only data structures that could be shared with other threads are ConcurrentHashMap instances, and they are safe against that kind of problem.
The potential thread-safety issue is that iterating a ConcurrentHashMap is not an atomic operation. The guarantees for the iterators are such that you are not guaranteed to see all entries in the iteration if the map is updated (e.g. by another thread) while you are iterating. That means that the totalBadRecords method may not give an accurate count if some other thread modifies the map during the call. Whether this is a real thread-safety issue depends on whether or not the totalBadRecords is required to give an accurate result in that circumstance.
If you need to get an accurate count, then you have to (somehow) lock out updates to the sourceMap while making the totalBadRecords call. AFAIK, there is no way to do this using (just) the ConcurrentHashMap API, and I can't think of a way to do it that doesn't make the map a concurrency bottleneck.
In fact, if you need to calculate accurate counts, you have to use external locking for (at least) the counting operation, and all operations that could change the outcome of the counting. And even that doesn't deal with the possibility that some thread may modify one of the TestVO objects while you are counting records, and cause the TestVO to change from "good" to "bad" or vice-versa.
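The external-locking idea might look roughly like the sketch below, which guards a plain HashMap with a ReentrantReadWriteLock so that counting never overlaps with an update. It is a simplification under stated assumptions: LockedLedger is a made-up class, and it stores immutable BigDecimal amounts rather than mutable TestVO objects, which sidesteps the problem of records changing during the count.

```java
import java.math.BigDecimal;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class LockedLedger {
    private final ReadWriteLock lock = new ReentrantReadWriteLock();
    private final Map<Integer, BigDecimal> badAmounts = new HashMap<>(); // guarded by lock

    /** Every operation that could change the total takes the write lock. */
    void putBadRecord(int id, BigDecimal amount) {
        lock.writeLock().lock();
        try {
            badAmounts.put(id, amount);
        } finally {
            lock.writeLock().unlock();
        }
    }

    /** Counting takes the read lock, so it sees a consistent view of the map. */
    BigDecimal totalBadCharges() {
        lock.readLock().lock();
        try {
            BigDecimal total = BigDecimal.ZERO;
            for (BigDecimal amount : badAmounts.values()) {
                total = total.add(amount);
            }
            return total;
        } finally {
            lock.readLock().unlock();
        }
    }

    static String demo() {
        LockedLedger ledger = new LockedLedger();
        ledger.putBadRecord(1, new BigDecimal("10.50"));
        ledger.putBadRecord(2, new BigDecimal("4.25"));
        return ledger.totalBadCharges().toPlainString(); // "14.75"
    }
}
```

Several readers can count at once, but every writer blocks while a count is running, which is the bottleneck the answer warns about.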

You could use something like the following.
That would guarantee that, after a call to the totalBadRecords method, the String representing the bad charges in the logMap is accurate and no updates are lost. Of course, a phantom read can still happen, since you do not lock the sourceMap.
private static final String BAD_RECORDS_KEY = "badRecordsFound:";
public void totalBadRecords(final ConcurrentMap<Integer, TestVO> sourceMap,
final ConcurrentMap<String, String> logMap) {
while (true) {
// get the old value that is going to be replaced.
String oldValue = logMap.get(BAD_RECORDS_KEY);
// calculate new value
BigDecimal badCharges = BigDecimal.ZERO;
for (TestVO e : sourceMap.values()) {
if ("Y".equals(e.getInd()))
badCharges = badCharges.add(e.getAmount());
}
final String newValue = badCharges.toPlainString();
// insert into map if there was no mapping before
if (oldValue == null) {
oldValue = logMap.putIfAbsent(BAD_RECORDS_KEY, newValue);
if (oldValue == null) {
oldValue = newValue;
}
}
// replace the entry in the map
if (logMap.replace(BAD_RECORDS_KEY, oldValue, newValue)) {
// update succeeded -> there were no updates to the logMap while calculating the bad charges.
break;
}
}
}
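If the goal were a running total that several threads add to, rather than a recomputed snapshot, the map's merge method offers an atomic read-modify-write without a hand-written retry loop. This is a different technique from the loop above, sketched under the assumption that the map stores BigDecimal values instead of Strings; RunningTotal and its methods are hypothetical names.

```java
import java.math.BigDecimal;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class RunningTotal {
    static final String BAD_RECORDS_KEY = "badRecordsFound:";

    /** Atomically adds amount to the stored total, or stores it if no total exists yet. */
    static void addCharge(ConcurrentMap<String, BigDecimal> totals, BigDecimal amount) {
        totals.merge(BAD_RECORDS_KEY, amount, BigDecimal::add);
    }

    /** Hypothetical driver: each thread adds 0.01 perThread times; returns the final total. */
    static String runConcurrentAdds(int threads, int perThread) {
        ConcurrentMap<String, BigDecimal> totals = new ConcurrentHashMap<>();
        BigDecimal cent = new BigDecimal("0.01");
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    addCharge(totals, cent);
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return totals.get(BAD_RECORDS_KEY).toPlainString();
    }

    public static void main(String[] args) {
        System.out.println(runConcurrentAdds(4, 250)); // prints 10.00
    }
}
```

BigDecimal is immutable, so each merge replaces the stored value with a new object and no update is lost.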
