ConcurrentHashMap with synchronized - java

I am maintaining some legacy code and found an implementation using the synchronized keyword on a ConcurrentHashMap. It seems unnecessary to me:
public class MyClass {
    private final Map<MyObj, Map<String, List<String>>> conMap = new ConcurrentHashMap<>();
    //...
    //adding new record into conMap:
    private void addToMap(MyObj id, String name, String value) {
        conMap.putIfAbsent(id, new ConcurrentHashMap<>());
        Map<String, List<String>> subMap = conMap.get(id);
        synchronized (subMap) { // <-- is it necessary?
            subMap.putIfAbsent(name, new ArrayList<>());
            subMap.get(name).add(value);
        }
    }
    //...
    public void doSomething(MyObj id) {
        List<Map<String, List<String>>> mapsList = new LinkedList<>();
        for (MyObj objId : conMap.keySet()) {
            if (objId.key1.equals(id.key1)) {
                mapsList.add(conMap.get(objId));
            }
        }
        for (Map<String, List<String>> map : mapsList) {
            synchronized (map) { // <-- is it necessary?
                if (timeout <= 0) {
                    log(map.size());
                    for (List<String> value : map.values()) {
                        log(id, value);
                    }
                } else {
                    int sum = 0;
                    for (Map.Entry<String, List<String>> val : map.entrySet()) {
                        sum += val.getValue().size();
                    }
                    log(sum);
                    map.wait(timeout);
                }
            }
        }
        //...
    }
}
So, is it reasonable to use the synchronized keyword on an object that is already concurrent? Or are these two different things?

In this case:
synchronized (subMap) { // <-- is it necessary?
    subMap.putIfAbsent(name, new ArrayList<>());
    subMap.get(name).add(value);
}
the synchronized block is necessary. Without it, you could have two threads simultaneously updating the same ArrayList instance. Since ArrayList is not thread-safe, the addToMap method would not be thread-safe either.
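To illustrate that first point, here is a minimal, hypothetical sketch (not from the question) of what can happen when several threads append to a plain ArrayList with no lock:

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class UnsafeListDemo {
    public static void main(String[] args) throws InterruptedException {
        List<String> list = new ArrayList<>();            // not thread-safe
        ExecutorService pool = Executors.newFixedThreadPool(2);
        for (int i = 0; i < 10_000; i++) {
            pool.submit(() -> { list.add("value"); });    // concurrent, unsynchronized adds
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        // Typically prints less than 10000 (lost updates); an
        // ArrayIndexOutOfBoundsException during the adds is also possible.
        System.out.println(list.size());
    }
}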
In this case:
synchronized(map){ // <-- is it necessary?
    if (/*condition*/) {
        log(map.size());
        for (List<String> value : map.values()) {
            log(id, value);
        }
    } else {
        int sum = 0;
        for (Map.Entry<String, List<String>> val : map.entrySet()) {
            sum += val.getValue().size();
        }
        log(sum);
        map.wait(timeout);
    }
}
the synchronized block is necessary.
In the if branch, the log method (or something called from it) will probably call ArrayList::toString, which will iterate each ArrayList. Without synchronization at the submap level, there could be a simultaneous add by another thread (e.g. an addToMap call). That means there are memory hazards, and a ConcurrentModificationException is possible in the toString() call.
In the else branch, the size() call reads a size field in each ArrayList in the submap. Without synchronization at the submap level, there could be a simultaneous add on one of those lists, which could cause size() to return a stale value. In addition, you are not guaranteed to see map entries added to a submap while you are iterating over it. If either of those things happens, the sum could be inaccurate. (Whether that is really an issue depends on the requirements for this method: inaccurate counts might be acceptable.)

ConcurrentHashMap makes each individual method call thread-safe by itself, so that concurrent calls cannot break the map's internal data structure.
A synchronized block synchronizes two or more consecutive method calls, so that no other thread can modify the data structure between the calls (and possibly break the consistency of the data with regard to the application logic).
Note that the synchronized block only works if all access to the map is performed from synchronized blocks using the same monitor object.
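To make the distinction concrete, here is a small illustrative sketch (a hypothetical counter, not the question's code) contrasting individually thread-safe calls with a compound check-then-act:

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class CompoundActions {
    private final Map<String, Integer> counts = new ConcurrentHashMap<>();

    // Racy: each call is thread-safe on its own, but another thread can run
    // between get() and put(), so increments can be lost.
    void incrementRacy(String key) {
        Integer current = counts.get(key);
        counts.put(key, current == null ? 1 : current + 1);
    }

    // Safe: one monitor guards the whole compound action. This only works if
    // every other access to the map uses the same monitor.
    void incrementLocked(String key) {
        synchronized (counts) {
            Integer current = counts.get(key);
            counts.put(key, current == null ? 1 : current + 1);
        }
    }

    // Often better: let the map perform the compound action atomically.
    void incrementAtomic(String key) {
        counts.merge(key, 1, Integer::sum);
    }
}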

It is sort of necessary, as multiple threads may try to append to the same ArrayList at the same time. The synchronized block protects against that, since ArrayList itself is not synchronized.
Since Java 8 we have computeIfAbsent, which means the put-followed-by-get calls can be simplified. I would write it like this, with no explicit synchronization required:
conMap.computeIfAbsent(id, k -> new ConcurrentHashMap<>())
      .computeIfAbsent(name, k -> new CopyOnWriteArrayList<>()) // or other thread-safe list
      .add(value);

Other answers don't adequately address this bit...
for (Map<String, List<String>> map : mapsList) {
    synchronized (map) { // <-- is it necessary?
        if (/*condition*/) {
            ...iterate over map...
        } else {
            ...iterate over map...
        }
    }
}
Is it necessary? Hard to tell.
What is /*condition*/ ? Does synchronizing on map prevent some other thread A from changing the value of /*condition*/ after thread B has tested it, but before or while thread B is performing either of the two branches? If so, then the synchronized block could be very important.
How about those iterations? Does synchronizing on map prevent some other thread A from changing the contents of the map while thread B is iterating? If so, then the synchronized block could be very important.

Related

Java: get+clear atomic for map

I would like to implement the following logic:
-the following structure is to be used
//Map<String, CopyOnWriteArrayList> keeping the pending updates
//grouped by the id of the updated object
final Map<String, List<Update>> updatesPerId = new ConcurrentHashMap<>();
-n producers will add updates to updatesPerId map (for the same id, 2 updates can be added at the same time)
-one TimerThread will run from time to time and has to process the received updates. Something like:
final Map<String, List<Update>> toBeProcessed = new HashMap<>(updatesPerId);
updatesPerId.clear();
// iterate over toBeProcessed and process them
Is there any way to make this logic thread safe without synchronizing the adding logic from producers and the logic from timerThread(consumer)? I am thinking about an atomic clear+get but it seems that ConcurrentMap does not provide something like that.
Also, I have to mention that updates should be kept by updated object id so I cannot replace the map with a queue or something else.
Any ideas?
Thanks!
You can leverage the fact that ConcurrentHashMap.compute executes atomically.
You can put into the updatesPerId like so:
updatesPerId.compute(k, (key, list) -> {
    if (list == null) list = new ArrayList<>();
    // ... add to the list
    // Return a non-null list, so the key/value pair is stored in the map.
    return list;
});
Note that this is not the same as calling computeIfAbsent and then adding to the returned list; that would not be atomic.
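For contrast, this is the non-atomic variant being warned about (a sketch; update is a placeholder for whatever is being added):

// Not atomic as a whole: computeIfAbsent is atomic, but the add() runs outside
// the map's internal lock, so it can race with a consumer that removes or
// replaces the list for this key.
updatesPerId.computeIfAbsent(k, key -> new ArrayList<>()).add(update);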
Then in your thread to remove things:
for (String key : updatesPerId.keySet()) {
    updatesPerId.compute(key, (k, list) -> {
        // ... Process the contents of the list.
        // Returning null removes the key/value pair from the map.
        return null;
    });
}
So, adding to the list for a key (or processing all the values for that key) might block if you happen to be adding and processing the same key at the same time; otherwise, it will not block.
Edit: as pointed out by @StuartMarks, it might be better to simply get everything out of the map first, and then process it later, in order to avoid blocking other threads trying to add:
Map<String, List<Update>> newMap = new HashMap<>();
for (String key : updatesPerId.keySet()) {
    newMap.put(key, updatesPerId.remove(key));
}
// ... Process entries in newMap.
I'd suggest using LinkedBlockingQueue instead of CopyOnWriteArrayList as the map value. With COWAL, adds get successively more expensive, so adding N elements results in N^2 performance. LBQ addition is O(1). Also, LBQ has drainTo which can be used effectively here. You could do this:
final Map<String, BlockingQueue<Update>> updatesPerId = new ConcurrentHashMap<>();
Producer:
updatesPerId.computeIfAbsent(id, k -> new LinkedBlockingQueue<>()).add(update);
Consumer:
updatesPerId.forEach((id, queue) -> {
    List<Update> updates = new ArrayList<>();
    queue.drainTo(updates);
    processUpdates(id, updates);
});
This is somewhat different from what you had suggested. This technique processes the updates for each id, but lets producers continue to add updates to the map while this is going on. This leaves map entries and queues in the map for each id. If the ids end up getting reused a lot, the number of map entries will plateau at a high-water mark.
If new ids are continually coming in, and old ids becoming disused, the map will grow continually, which probably isn't what you want. If this is the case you could use the technique in Andy Turner's answer.
If the consumer really needs to snapshot and clear the entire map, I think you have to use locking, which you wanted to avoid.
Is there any way to make this logic thread safe without synchronizing the adding logic from producers and the logic from timerThread(consumer)?
In short, no - depending on what you mean by "synchronizing".
The easiest way is to wrap your Map into a class of your own.
class UpdateManager {
    Map<String, List<Update>> updates = new HashMap<>();

    public void add(Update update) {
        synchronized (updates) {
            updates.computeIfAbsent(update.getKey(), k -> new ArrayList<>()).add(update);
        }
    }

    public Map<String, List<Update>> getUpdatesAndClear() {
        synchronized (updates) {
            Map<String, List<Update>> copy = new HashMap<>(updates);
            updates.clear();
            return copy;
        }
    }
}
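A hypothetical usage sketch for this wrapper (process is a placeholder, not part of the question):

UpdateManager manager = new UpdateManager();

// Producers, from any thread:
manager.add(update);

// Timer thread, periodically: take the current batch and work on it outside the lock.
Map<String, List<Update>> batch = manager.getUpdatesAndClear();
batch.forEach((id, updates) -> process(id, updates)); // process() is a placeholder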

How to avoid HashMap "ConcurrentModificationException" while manipulating `values()` and `put()` in concurrent threads?

Code:
I have a HashMap
private Map<K, V> map = new HashMap<>();
One method will put K-V pair into it by calling put(K,V).
The other method wants to extract a set of random elements from its values:
int size = map.size(); // size > 0
V[] value_array = map.values().toArray(new V[size]);
Random rand = new Random();
int start = rand.nextInt(size); int end = rand.nextInt(size);
// return value_array[start .. end - 1]
The two methods are called in two different concurrent threads.
Error:
I got a ConcurrentModificationException error:
at java.util.HashMap$HashIterator.nextEntry(Unknown Source)
at java.util.HashMap$ValueIterator.next(Unknown Source)
at java.util.AbstractCollection.toArray(Unknown Source)
It seems that the toArray() call in one thread is iterating over the HashMap while a put() in the other thread modifies it.
Question: How can I avoid a ConcurrentModificationException while using HashMap.values().toArray() and HashMap.put() in concurrent threads?
Avoiding values().toArray() in the second method altogether would also be fine.
You need to provide some level of synchronization so that the call to put is blocked while the toArray call is executing and vice versa. There are three simple approaches:
Wrap your calls to put and toArray in synchronized blocks that synchronize on the same lock object (which might be the map itself or some other object).
Turn your map into a synchronized map using Collections.synchronizedMap()
private Map<K, V> map = Collections.synchronizedMap(new HashMap<>());
Use a ConcurrentHashMap instead of a HashMap.
EDIT: One caveat with Collections.synchronizedMap is that the protection only covers individual calls on the map and its views. A single call such as values().toArray() is synchronized on the map's mutex, but iterating over the values() view yourself (for example with a for-each loop) is not: the javadoc tells you to manually synchronize on the map while iterating any of its collection views. A ConcurrentHashMap does not have that requirement and can be used here. From the docs for ConcurrentHashMap.values():
The view's iterator is a "weakly consistent" iterator that will never throw ConcurrentModificationException, and guarantees to traverse elements as they existed upon construction of the iterator, and may (but is not guaranteed to) reflect any modifications subsequent to construction.
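Assuming the third approach (ConcurrentHashMap), here is a minimal sketch (with hypothetical String keys and values) of a snapshot that cannot throw ConcurrentModificationException:

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class SnapshotDemo {
    private final Map<String, String> map = new ConcurrentHashMap<>();

    public void put(String key, String value) {
        map.put(key, value);                      // safe to run concurrently with snapshot()
    }

    public String[] snapshot() {
        // The values() view of a ConcurrentHashMap is weakly consistent, so this
        // never throws ConcurrentModificationException even during concurrent puts;
        // it may or may not include entries added while the copy is being made.
        return map.values().toArray(new String[0]);
    }
}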
I would use a ConcurrentHashMap instead of a HashMap and protect it from concurrent reading and modification by different threads. See the implementation below. With it, thread 1 and thread 2 cannot read and write at the same time: while thread 1 is extracting the values from the map into an array, any other thread that invokes storeInMap(K, V) will block on the map's monitor until the first thread is done.
Note: I do not use synchronized methods here. I do not completely rule them out, but I would use them with caution: a synchronized method is just syntactic sugar for acquiring the lock on 'this' and holding it for the duration of the method, which can hurt throughput.
private final Map<K, V> map = new ConcurrentHashMap<K, V>();

// thread 1
@SuppressWarnings("unchecked")
public V[] pickRandom() {
    V[] value_array;
    synchronized (map) {
        int size = map.size(); // size > 0
        // Generic arrays cannot be created directly, hence the unchecked cast.
        value_array = map.values().toArray((V[]) new Object[size]);
    }
    Random rand = new Random();
    int start = rand.nextInt(value_array.length);
    int end = rand.nextInt(value_array.length);
    // return value_array[start .. end - 1]
    return Arrays.copyOfRange(value_array, Math.min(start, end), Math.max(start, end));
}

// thread 2
public void storeInMap(K key, V value) {
    synchronized (map) {
        map.put(key, value);
    }
}

Java: concurrent modification exception

I am getting an error in the for(Entry...) loop: after calling dfs(), it throws a ConcurrentModificationException. I don't know why it is happening even though visitedOrder is not related to the for-each loop. How can this be fixed?
public TreeMap<Integer, Integer> DFS()
{
    TreeMap<Integer, Integer> stack = new TreeMap<Integer, Integer>();
    TreeMap<Integer, Integer> visitedOrder = stack;
    for (int i = 1; i < graph[0].length - 1; i++)
    {
        stack.put(i, 0);
    }
    for (Entry<Integer, Integer> vertex : stack.entrySet())
    {
        if (vertex.getValue() == 0)
            dfs(vertex.getKey(), visitedOrder);
    }
    System.out.println(visitedOrder.values());
    return visitedOrder;
}
public void dfs(int vertex, TreeMap<Integer, Integer> visited)
{
    visited.put(vertex, order++);
    int currVertex = vertex;
    for (int i = vertex; i < graph[0].length - 1; i++)
    {
        if (graph[vertex][i+1] == 1)
        {
            dfs(++currVertex, visited);
            break;
        }
        currVertex++;
    }
}
Here is the Javadoc for "Class ConcurrentModificationException":
http://docs.oracle.com/javase/1.5.0/docs/api/java/util/ConcurrentModificationException.html
This exception may be thrown by methods that have detected concurrent
modification of an object when such modification is not permissible.
For example, it is not generally permissible for one thread to modify
a Collection while another thread is iterating over it. In general,
the results of the iteration are undefined under these circumstances.
Some Iterator implementations (including those of all the general
purpose collection implementations provided by the JRE) may choose to
throw this exception if this behavior is detected. Iterators that do
this are known as fail-fast iterators, as they fail quickly and
cleanly, rather than risking arbitrary, non-deterministic behavior at
an undetermined time in the future.
Note that this exception does not always indicate that an object has
been concurrently modified by a different thread. If a single thread
issues a sequence of method invocations that violates the contract of
an object, the object may throw this exception. For example, if a
thread modifies a collection directly while it is iterating over the
collection with a fail-fast iterator, the iterator will throw this
exception.
As it happens, that's precisely what you're doing: modifying the very structure you're using in your "foreach" loop.
WORKAROUND:
If you believe your design is correct, then substitute a simple for loop: for (int i=0; i < myContainer.size(); i++) ...
I don't know why it is happening even though visitedOrder is not related to the for-each loop.
You are trying to modify the TreeMap while you are iterating over it.
You are just assigning a reference in this line, so it is the same TreeMap under a different name:
TreeMap<Integer, Integer> stack = new TreeMap<Integer, Integer>();
TreeMap<Integer, Integer> visitedOrder = stack;
There is just one TreeMap instance that is created when you do a new TreeMap<Integer, Integer>(). The stack variable refers to this instance; the visitedOrder variable also refers the same instance. And when you call dfs(int vertex, TreeMap<Integer, Integer> visited), the visited parameter also refers to the same TreeMap instance.
Now you're iterating over the entry set of this TreeMap instance in the for(Entry<Integer,... loop. While iterating, you call the dfs(int, TreeMap<Interge, Integer>) method and within this method, you invoke a put on the TreeMap instance and that modifies the instance; hence the ConcurrentModificationException.
From the code you've provided, my understanding is that you are trying to convert a graph array to a TreeMap by doing a DFS. You are iterating over the TreeMap referenced by stack and trying to populate visitedOrder. To resolve the exception, just point the visitedOrder variable to a new TreeMap<Integer, Integer>() instance.
Note that the fix I've suggested is aimed at fixing the exception while keeping your code flow and logic unchanged, as I only have a limited picture of your solution.
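Concretely, the suggested change is just the second line below (whether the visited check in the loop still behaves as intended depends on the rest of the solution, as noted above):

TreeMap<Integer, Integer> stack = new TreeMap<Integer, Integer>();
TreeMap<Integer, Integer> visitedOrder = new TreeMap<Integer, Integer>(); // separate instance, no longer an alias of stack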

Why synchronize on SynchronizedMap or SynchronizedCollections?

I am referring to the question asked here and using the author's code example. My question is: why does the author use synchronized (synchronizedMap)? Is it really necessary? synchronizedMap already makes sure that no two threads perform a read/put operation on the map at the same time, so why do we need to synchronize on that map itself?
I would really appreciate an explanation.
public class MyClass {
    private static Map<String, List<String>> synchronizedMap =
            Collections.synchronizedMap(new HashMap<String, List<String>>());

    public void doWork(String key) {
        List<String> values = null;
        while ((values = synchronizedMap.remove(key)) != null) {
            //do something with values
        }
    }

    public static void addToMap(String key, String value) {
        synchronized (synchronizedMap) {
            if (synchronizedMap.containsKey(key)) {
                synchronizedMap.get(key).add(value);
            }
            else {
                List<String> valuesList = new ArrayList<String>();
                valuesList.add(value);
                synchronizedMap.put(key, valuesList);
            }
        }
    }
}
why do we need to synchronize on that synchronizedMap itself?
You may need to synchronize on an already synchronized collection because you are performing two operations on the collection -- in your example, a containsKey() and then a put(). You are trying to protect against race conditions in the code that is calling the collection. In addition, in this case, the synchronized block also protects the ArrayList values so that multiple threads can add their values to these unsynchronized collections.
If you look at the code you linked to, they first check for the existence of the key and then put a value into the map if the key did not exist. You need to protect against 2 threads checking for a key's existence and then both of them putting into the map. The race is which one will put first and which one will overwrite the previous put.
The synchronized collection protects itself from multiple threads corrupting the map itself. It does not protect against logic race conditions around multiple calls to the map.
synchronized (synchronizedMap) {
    // test for a key in the map
    if (synchronizedMap.containsKey(key)) {
        synchronizedMap.get(key).add(value);
    } else {
        List<String> valuesList = new ArrayList<String>();
        valuesList.add(value);
        // store a value into the map
        synchronizedMap.put(key, valuesList);
    }
}
This is one of the reasons why the ConcurrentMap interface has putIfAbsent(K key, V value). That does not require two separate operations, so you may not need to synchronize around it.
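As an illustrative sketch (not code from the linked question): putIfAbsent removes the check-then-act on the map itself, but skipping the outer lock is only safe if the value is also thread-safe, hence the CopyOnWriteArrayList here:

import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.CopyOnWriteArrayList;

public class PutIfAbsentExample {
    private final ConcurrentMap<String, List<String>> map = new ConcurrentHashMap<>();

    public void addToMap(String key, String value) {
        // One atomic "create the entry if missing" call replaces containsKey() + put().
        map.putIfAbsent(key, new CopyOnWriteArrayList<String>());
        // No lock around add() only because the list itself is thread-safe.
        map.get(key).add(value);
    }
}

(computeIfAbsent, shown earlier on this page, additionally avoids allocating a new list when the key already exists.)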
Btw, I would rewrite the original containsKey()/put() block above to be:
synchronized (synchronizedMap) {
    // test for a key in the map
    List<String> valuesList = synchronizedMap.get(key);
    if (valuesList == null) {
        valuesList = new ArrayList<String>();
        // store a value into the map
        synchronizedMap.put(key, valuesList);
    }
    valuesList.add(value);
}
Lastly, if most of the operations on the map need to be in a synchronized block anyway, you might as well not pay for the synchronizedMap and just use a HashMap always inside of synchronized blocks.
It is not just about updating the synchronizedMap's values; it is about the sequence of operations affecting the map. There are two operations happening on the map inside the same method.
If you don't synchronize the block/method, there may be a case where Thread1 executes the first part and Thread2 executes the second part, and your business operation may produce weird results (even though the individual updates to the map are synchronized).

Is it thread-safe to iterate a HashMap object concurrently?

If multiple threads concurrently iterate a HashMap object, without modifying it, is there a chance for race conditions?
No race, if you can guarantee that no other thread would modify this HashMap while it is being iterated.
Nope, that is perfectly fine. As long as all reads are synchronized with all writes, and all writes are synchronized with each other, there is no harm in concurrent reads; so if there are no writes at all, then all concurrent access is safe.
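One common way to make the "no writes at all" case explicit is to build the map in a single thread and then publish it read-only; a minimal sketch (hypothetical contents):

import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

public class ReadOnlySharing {
    static Map<String, Integer> buildShared() {
        // Built by one thread, before any reader starts.
        Map<String, Integer> tmp = new HashMap<>();
        tmp.put("a", 1);
        tmp.put("b", 2);
        // Wrap it so nobody can write; as long as the result is safely published
        // (e.g. assigned to a final field), any number of threads may iterate it concurrently.
        return Collections.unmodifiableMap(tmp);
    }
}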
It will be all right. But if any thread adds or removes an item, this may throw a ConcurrentModificationException in any other thread that is iterating the HashMap (any fail-fast collection, in fact).
If you are going to iterate over a Map repeatedly, you may find it marginally faster to iterate over an array copy.
private final HashMap<String, String> properties = new HashMap<String, String>();
private volatile Map.Entry<String, String>[] propertyEntries = null;

private void updatePropertyEntries() {
    propertyEntries = properties.entrySet().toArray(new Map.Entry[properties.size()]);
}

{
    // no objects created
    for (Map.Entry<String, String> entry : propertyEntries) {
    }
}
BTW: You can have one thread modify/replace the propertyEntries while iterating in many threads with this pattern.
