One concurrent collection inside another: is it thread safe - java

[Question]: Is it thread safe to use ConcurrentHashMap<Object, ConcurrentHashMap<Object, Object>> or not?
[Optional to answer]: Also, what about other concurrent map types? And what about concurrent collections?
P.S. I'm asking only about java.util.concurrent package.
Specific Usage Context:
//we have
ConcurrentHashMap<Object, ConcurrentHashMap<Object, Object>> map = new ConcurrentHashMap<Object, ConcurrentHashMap<Object, Object>>();
//each line can be executed separately and concurrently
ConcurrentHashMap<Object, Object> subMap = new ConcurrentHashMap<Object, Object>();
map.put(key, subMap);
map.remove(key);
map.get(key);
map.get(key).put(key, ref);
map.get(key).remove(key);
Maybe my solution lies in Guava's HashBasedTable?

You can't define thread safety without the specific context in which you plan to use your collections.
The concurrent collections you have named are thread-safe on their own in the sense that their internal invariants will not be broken by concurrent access; however that's just one bullet point on the thread safety checklist.
If you perform anything more than a single operation on your structure, which must be atomic as a whole, then you will not get thread safety just by using these classes. You will have to resort to classic locking, or some quite elaborate, and usually unmotivated, lock-free updating scheme.
Using the examples from your question, consider the following.
Thread 1 executes
map.get(mapKey).put(key, value);
At the same time, Thread 2 executes
map.remove(mapKey);
What is the outcome? Thread 1 may be putting something into a map which has already been removed, or it may even get a null result from get, in which case the chained put throws a NullPointerException. In most cases more coordination will be needed for correctness.
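One possible way to add that coordination, sketched under the assumption that Java 8+ is available and that every write to the nested structure goes through the outer map's atomic compute method. The class NestedMaps and its method names are hypothetical, not part of the question.

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper: all writes go through the outer map, so a put into
// an inner map cannot interleave with removal of that same inner map.
class NestedMaps {
    private final ConcurrentHashMap<String, ConcurrentHashMap<String, String>> map =
            new ConcurrentHashMap<>();

    // Creates the inner map if needed and stores the value. compute() runs
    // atomically with respect to other updates of the same outer key.
    void put(String mapKey, String key, String value) {
        map.compute(mapKey, (k, inner) -> {
            if (inner == null) {
                inner = new ConcurrentHashMap<>();
            }
            inner.put(key, value);
            return inner;
        });
    }

    // Removes the whole inner map stored under this outer key.
    void removeAll(String mapKey) {
        map.remove(mapKey);
    }

    // Null-safe read: returns null if the outer or the inner key is missing.
    String get(String mapKey, String key) {
        ConcurrentHashMap<String, String> inner = map.get(mapKey);
        return inner == null ? null : inner.get(key);
    }
}
```

With this shape, Thread 1's put either happens before the removal (and is removed with the inner map) or after it (and lands in a fresh inner map); it never writes into a map that is no longer reachable.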

Concurrent collections allow multiple threads to perform add/remove operations on a collection at the same time. Taken as a whole, though, no, your nested usage is not thread safe.
For more detail, please read:
What's the difference between ConcurrentHashMap and Collections.synchronizedMap(Map)?
Is ConcurrentHashMap totally safe?

The concurrent collections are thread safe for reads and for updates. Unlike the plain java.util collections, their iterators do not throw ConcurrentModificationException when another thread modifies the collection during iteration; an iterator may simply miss some of those concurrent updates.

This is what the Javadoc of ConcurrentHashMap says:
However, even though all operations are thread-safe, retrieval operations do not entail locking, and there is not any support for locking the entire table in a way that prevents all access
So, they ARE thread-safe in terms of modifying it.
UPDATE
The same Javadoc (http://docs.oracle.com/javase/7/docs/api/java/util/concurrent/ConcurrentHashMap.html) says:
Retrieval operations (including get) generally do not block, so may overlap with update operations (including put and remove). Retrievals reflect the results of the most recently completed update operations holding upon their onset. For aggregate operations such as putAll and clear, concurrent retrievals may reflect insertion or removal of only some entries. Similarly, Iterators and Enumerations return elements reflecting the state of the hash table at some point at or since the creation of the iterator/enumeration. They do not throw ConcurrentModificationException. However, iterators are designed to be used by only one thread at a time.

In general the classes which are part of java.util.concurrent provide additional performance at the (potential) penalty of additional coding complexity.
The issue that I see with nesting ConcurrentMap instances is managing the population of the outer map with values at given keys. If all the keys are known upfront and the values are placed in the map in some sort of initialization phase, there are no issues (but you also likely would not need the outer map to be a ConcurrentMap). If you need to be able to insert new maps into the outer map as you go, the work becomes a bit more complicated. When creating a new map to insert into the outer map, you would need to use the putIfAbsent method[1] and pay attention to the returned value to determine which instance to add data to.
[1] - http://docs.oracle.com/javase/7/docs/api/java/util/concurrent/ConcurrentMap.html#putIfAbsent(K,%20V)
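A minimal sketch of the putIfAbsent pattern described above. The class LazyNestedMap and the method innerFor are hypothetical names chosen for illustration; only ConcurrentMap.putIfAbsent comes from the answer.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical wrapper that creates inner maps on demand.
class LazyNestedMap {
    private final ConcurrentMap<String, ConcurrentMap<String, Integer>> outer =
            new ConcurrentHashMap<>();

    // Returns the inner map for outerKey, creating it if no thread has yet.
    ConcurrentMap<String, Integer> innerFor(String outerKey) {
        ConcurrentMap<String, Integer> inner = outer.get(outerKey);
        if (inner == null) {
            ConcurrentMap<String, Integer> fresh = new ConcurrentHashMap<>();
            // putIfAbsent returns the map another thread stored first,
            // or null if our fresh map was the one that got stored.
            ConcurrentMap<String, Integer> existing = outer.putIfAbsent(outerKey, fresh);
            inner = (existing != null) ? existing : fresh;
        }
        return inner;
    }

    void put(String outerKey, String innerKey, int value) {
        innerFor(outerKey).put(innerKey, value);
    }
}
```

On Java 8 and later, outer.computeIfAbsent(outerKey, k -> new ConcurrentHashMap<>()) expresses the same idea in one call. Neither variant protects against an inner map being removed from the outer map while a thread still holds a reference to it.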

Related

Is it safe to replace all the occurrences of Hashtable with ConcurrentHashMap?

Our legacy multi-threaded application uses Hashtable in a lot of places. Is it safe to replace the Hashtable instances with ConcurrentHashMap instances for performance gain? Will there be any side effect?
Is it safe to replace the Hashtable instances with ConcurrentHashMap instances for performance gain?
In most cases it should be safe and yield better performance. The effort on changing depends on whether you used the Map interface or Hashtable directly.
Will there be any side effect?
There might be side effects if your application expects to immediately be able to access elements that were put into the map by another thread.
From the JavaDoc on ConcurrentHashMap:
Retrieval operations (including get) generally do not block, so may overlap
with update operations (including put and remove). Retrievals reflect the
results of the most recently completed update operations holding upon their onset.
Edit: to clarify "immediately", consider that thread 1 adds element A to the map and, while that write is executing, thread 2 tries to check whether A exists in the map. With Hashtable, thread 2 would be blocked until after the write, so the check would return true. With ConcurrentHashMap it may return false, since thread 2 would not be blocked and the write operation is not yet completed (thus thread 2 would see an outdated version of the bucket).
Depending on the size of your Hashtable objects you might get some performance gains by switching to ConcurrentHashMap.
In Java 7 and earlier, ConcurrentHashMap is broken into segments, which allows the table to be only partially locked; since Java 8 it locks individual bins instead. Either way, you can get more accesses per second than with a Hashtable, which locks the entire table on every operation.
The tables themselves are both thread safe and both implement the Map interface, so replacement should be relatively easy.

Update two ConcurrentHashMaps atomically

Map<String,Integer> m1 = new ConcurrentHashMap<>();
Map<String,Integer> m2 = new ConcurrentHashMap<>();
public void update(int i) {
    m1.put("key", i);
    m2.put("key", i);
}
In the dummy code above, the updates to m1 & m2 are not atomic. If I try to synchronize this block, FindBugs complains "Synchronization performed on util.concurrent instance".
Is there a recommended way to do this other than not using concurrent collections and doing all synchronization explicitly?
As an aside I also don't know the exact implications of wrapping concurrent collections in explicit synchronization.
If the intention is that for a key both maps always retrieve the same value, the design of using two different concurrent maps is flawed. Even if you synchronize the write, reading threads can still access the two maps while you're writing. That's probably what the FindBugs rule tries to catch.
There are two ways to go about this: either use explicit synchronization (that is, two regular maps with synchronized reads and writes), or use just one concurrent map and put in a value object that holds both ints.
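The second option could look roughly like the sketch below. IntPair and PairedValues are hypothetical names; the point is only that a single put publishes both values together.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical immutable value object holding what m1 and m2 stored separately.
final class IntPair {
    final int first;
    final int second;

    IntPair(int first, int second) {
        this.first = first;
        this.second = second;
    }
}

class PairedValues {
    private final Map<String, IntPair> m = new ConcurrentHashMap<>();

    // One put replaces both ints at once, so a reader sees either the old
    // pair or the new pair, never a mix of the two.
    public void update(int i) {
        m.put("key", new IntPair(i, i));
    }

    // Returns the current pair, or null if update() has not run yet.
    public IntPair read() {
        return m.get("key");
    }
}
```

Because IntPair is immutable, readers need no further synchronization after they obtain a reference from the map.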
A ConcurrentMap is not a synchronized map. The latter does indeed synchronize on itself, so synchronizing on the Map will have an effect regarding other Map operations.
In contrast, a ConcurrentMap is designed to allow concurrent updates of different keys and does not synchronize on itself, so if other code synchronizes on the Map instance, it has no effect. It wouldn’t help to use a different mutex; as long as other threads perform operations on the same maps without using this mutex, they won’t respect your intended atomicity. On the other hand, if all threads used the mutex for all accesses to the maps, you wouldn’t need a ConcurrentMap anymore.
But as Jarrod Roberson noted, this is likely an X-Y problem: if you update two maps atomically without also having an atomic read of both maps, no thread would notice whether the update happened atomically or not. The question is what you actually want to achieve: either you don’t really need the atomicity, or you shouldn’t use ConcurrentMaps.

ConcurrentHashMap operations are thread safe

The Javadoc for ConcurrentHashMap says:
even though all operations are thread-safe
What is the meaning when we say all operations of ConcurrentHashMap are thread safe?
EDIT:
What I mean to ask is this: suppose there is a put() operation. According to the above statement, put() in CHM is thread safe. What does this mean?
From Wikipedia:
A piece of code is thread-safe if it only manipulates shared data structures in a manner that guarantees safe execution by multiple threads at the same time.
To answer your expanded question, if multiple threads were to execute put(), the effect would be that the last one to run would set the value for that key in the map. All of the puts would happen in some sequence, but they would not interfere with each other. How might they interfere without a concurrency guarantee? Well, put() returns the previous value, or null if no value had previously been associated with the key. If two puts happened at the same time on a non-concurrent map, they could both get the same return value from the put.
This sequence is possible without a concurrency guarantee:
Thread1: map.put("key1", "value1") => null
then
Thread2: map.put("key1", "value2") => "value1"
Thread3: map.put("key1", "value3") => "value1"
If Thread3 got in just after Thread2, it might see "value1" rather than "value2", even though that's not what it replaces. This won't happen in a concurrent map.
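The guarantee can be observed with a small experiment, sketched here with hypothetical names (PutReturnValues, run, allDistinct). Several threads put different values under the same key; on a ConcurrentHashMap each previous value is handed back to exactly one put, so no return value shows up twice.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;

class PutReturnValues {
    // Runs `threads` concurrent puts on one key and collects what each put returned.
    static List<String> run(int threads) {
        ConcurrentHashMap<String, String> map = new ConcurrentHashMap<>();
        List<String> returned = Collections.synchronizedList(new ArrayList<>());
        CountDownLatch start = new CountDownLatch(1);
        List<Thread> workers = new ArrayList<>();
        for (int i = 0; i < threads; i++) {
            String value = "value" + i;
            Thread t = new Thread(() -> {
                try {
                    start.await(); // line the threads up so the puts overlap
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                returned.add(map.put("key1", value)); // previous value, or null
            });
            workers.add(t);
            t.start();
        }
        start.countDown();
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return returned;
    }

    // True if no previous value (including null) was returned to two puts.
    static boolean allDistinct(List<String> returned) {
        return new HashSet<>(returned).size() == returned.size();
    }
}
```

Which thread receives null and which value ends up in the map varies from run to run; only the absence of duplicates is guaranteed.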
What thread safety means is that you are permitted to share a ConcurrentHashMap object across multiple threads, and to access/modify that object concurrently without external locking.
Thread-safety means that an object can be used simultaneously by multiple threads while still operating correctly. In the specific case of ConcurrentHashMap, these characteristics are guaranteed:
Iterators produced by the map never throw ConcurrentModificationException, and they'll iterate in an order that's fixed when they're created. They may or may not reflect any modifications made while the map is being accessed. Ordinary HashMap iterators will throw exceptions if the map is modified while a thread is iterating over it.
Insertion and removal operations are thread-safe. Ordinary HashMaps might get into an inconsistent internal state if multiple threads tried to insert or remove items simultaneously, especially if modifications required a rehash.
It means that if two threads concurrently try to perform operations on the ConcurrentHashMap, you are guaranteed that the operations will not leave the data structure in an inconsistent state.
That's not something non-concurrent data structures guarantee.
It means that all the operations that add objects to or delete objects from your hash map are thread safe, and so is retrieval. Retrievals do not block, however, so a get that overlaps with a put still in progress may not see the new object. Once the put has completed, the object is visible to every thread that retrieves it from the map afterwards.

Is the keySet iterator of ConcurrentHashMap thread safe?

I am just trying to explore what thread safe means.
Below is my understanding:
To me it looks like it means allowing multiple threads to access a collection at the same time, irrespective of its synchronization. For example, any method without the synchronized keyword on it is thread safe, meaning multiple threads can access it.
It is up to the developer to add more logic (synchronization) to this method to maintain data integrity while multiple threads are accessing it. That is separate from thread safety.
If my statement above is false, just read the Javadoc below for ConcurrentHashMap:
keySet: The view's iterator is a "weakly consistent" iterator that will never throw
ConcurrentModificationException, and guarantees to traverse elements as they existed upon construction of the iterator, and may (but is not guaranteed to) reflect any modifications subsequent to construction.
The above statement says the keySet iterator will not guarantee data integrity while multiple threads are modifying the collection.
Could you please answer: is the keySet iterator of ConcurrentHashMap thread safe?
And is my understanding of thread safety correct?
keySet: The view's iterator is a "weakly consistent" iterator that will never throw ConcurrentModificationException, and guarantees to traverse elements as they existed upon construction of the iterator, and may (but is not guaranteed to) reflect any modifications subsequent to construction
This itself explains, that KeySet iterator of ConcurrentHashMap is threadsafe.
The general idea behind the java.util.concurrent package is to provide a set of data structures that offer thread-safe access without strong consistency. This way these objects achieve higher concurrency than properly locked objects.
Being thread safe means that, even without any explicit synchronization, you never corrupt the objects. In Hashtable and HashMap some methods are potential problems for multi-threaded access, such as the remove method, which first checks that the element exists and then removes it. These kinds of methods are implemented as atomic operations in ConcurrentHashMap, so you do not need to be afraid that you will lose some data.
However, it does not mean that this class is automatically locked for each operation. High-level operations such as putAll and iterators are not synchronized. The class does not provide strong consistency. The order and timing of your operations are guaranteed not to corrupt the object, but are not guaranteed to generate accurate results.
For example, if you print the object concurrently with a call to putAll, you might see partially populated output. Using an iterator concurrently with new insertions also might not reflect all insertions, as you quoted.
This is different from being thread safe. Even though the results might surprise you, you are assured that nothing is lost or accidentally overwritten, elements are added to and removed from your object without any problem. If this behaviour is sufficient for your requirements you are advised to use java.util.concurrent classes. If you need more consistency, then you need to use synchronized classes from java.util or use synchronization yourself.
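The iterator behaviour can be seen even in a single thread, which keeps the sketch below deterministic. WeakIteration and its methods are hypothetical names; the initial capacity of 64 is an arbitrary choice that avoids a resize during the loop.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class WeakIteration {
    // Inserts into a ConcurrentHashMap while iterating over its keySet.
    // The weakly consistent iterator tolerates this; returns the final size.
    static int insertWhileIterating() {
        Map<Integer, String> map = new ConcurrentHashMap<>(64);
        for (int i = 0; i < 10; i++) {
            map.put(i, "v" + i);
        }
        for (Integer key : map.keySet()) {
            // New keys stay in the range 100..109, whether or not the
            // iterator happens to visit the freshly inserted entries.
            map.put(key % 10 + 100, "added");
        }
        return map.size();
    }

    // The same loop on a plain HashMap: its fail-fast iterator throws.
    static boolean hashMapThrows() {
        Map<Integer, String> map = new HashMap<>();
        for (int i = 0; i < 10; i++) {
            map.put(i, "v" + i);
        }
        try {
            for (Integer key : map.keySet()) {
                map.put(key % 10 + 100, "added");
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }
}
```

Whether the ConcurrentHashMap iterator also visits the entries added during the loop is not specified, which is exactly the "may (but is not guaranteed to)" wording of the Javadoc.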
By your definition the Set returned by ConcurrentHashMap.keySet() is thread safe.
However, it may act in very strange ways, as pointed out in the quote you included.
As a Set, entries may appear and/or disappear at random. I.e. if you call contains twice on the same object, the two results may differ.
As an Iterable you could begin two iterations of its underlying objects in two different threads and discover that the two iterations enumerate different entries.
Furthermore, contains and iteration may not match either.
This activity will not occur, however, if you somehow lock the underlying Map from modification while you have hold of your Set but the need to do that does not imply that the structure is not thread safe.

Does a Collection which is Thread Safe have to be Synchronized?

I want to use a collection in a single-threaded environment only, and I am using a HashMap that is synchronized.
However, I am still unsure whether it is thread safe to have it synchronized or not.
If you're only using a single thread, you don't need a thread-safe collection - HashMap should be fine.
You should be very careful to work out your requirements:
If you're really using a single thread, stick with HashMap (or consider LinkedHashMap)
If you're sharing the map, you need to work out what kind of safety you want:
If the map is fully populated before it's used by multiple threads, which just read, then HashMap is still fine.
Collections.synchronizedMap will only synchronize each individual operation; it still isn't safe to iterate in one thread and modify the map in another thread without synchronization.
ConcurrentHashMap is a more "thoroughly" thread-safe approach, and one I'd generally prefer over synchronizedMap. It allows for modification during iteration, but doesn't guarantee where such modifications will be seen while iterating. Also note that while HashMap allows null keys and values, ConcurrentHashMap doesn't.
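To illustrate the synchronizedMap point: individual calls are locked by the wrapper, but an iteration spans many calls, so the caller has to hold the map's lock for the whole loop. A minimal sketch, with SynchronizedIteration as a hypothetical class name:

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

class SynchronizedIteration {
    private final Map<String, Integer> map =
            Collections.synchronizedMap(new HashMap<>());

    // A single call: the wrapper synchronizes it for us.
    void put(String key, int value) {
        map.put(key, value);
    }

    // Iteration: we must synchronize on the wrapper map ourselves,
    // otherwise a concurrent put could break the traversal.
    int sum() {
        int total = 0;
        synchronized (map) {
            for (int v : map.values()) {
                total += v;
            }
        }
        return total;
    }
}
```

With ConcurrentHashMap the synchronized block would be unnecessary, at the cost of a sum that may or may not include entries added while the loop runs.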
For your needs, use ConcurrentHashMap. It allows concurrent modification of the Map from several threads without the need to block them. Collections.synchronizedMap(map) creates a blocking Map, which will degrade performance but ensures consistency.
The standard Java HashMap is not synchronized.
If you are in a single threaded environment you don't need to worry about synchronization.
The commonly used Collection classes, such as java.util.ArrayList, are not synchronized. However, if there's a chance that two threads could be altering a collection concurrently, you can generate a synchronized collection from it using the synchronizedCollection() method. Similar to the read-only methods, the thread-safe APIs let you generate a synchronized collection, list, set, or map. For instance, this is how you can generate a synchronized map from a HashMap:
Map map = Collections.synchronizedMap(new HashMap());
map.put(...
As it's a single-threaded environment, you can safely use HashMap.
