Java Docs for the ConcurrentHashMap says,
even though all operations are thread-safe
What is the meaning when we say all operations of ConcurrentHashMap are thread safe?
EDIT:
what i mean to ask is that suppose there is put() operation. then according to above statement put() in CHM is thread safe. What does this mean?
From Wikipedia:
A piece of code is thread-safe if it only manipulates shared data structures in a manner that guarantees safe execution by multiple threads at the same time.
To answer your expanded question, if multiple threads were to execute put() the effect would be that the last one to run would set the value for that key in the map. All of the puts would happen in some sequence, but they would not interfere with each other. How might they interfere without a concurrency guarantee? Well, put() returns null if no value had previously been associated with the mapping or the previous value. If two puts happened on a non-concurrent map they can both get the same return value from the put.
This sequence is possible without concurrency:
Thread1: map.put("key1", "value1") => null
then
Thread2: map.put("key2", "value2") => "value1"
Thread3: map.put("key3", "value3") => "value1"
If Thread3 got in just after Thread2, it might see "value1" rather than "value2", even though that's not what it replaces. This won't happen in a concurrent map.
What thread safety means is that you are permitted to share a ConcurrentHashMap object across multiple threads, and to access/modify that object concurrently without external locking.
Thread-safety means that an object can be used simultaneously by multiple threads while still operating correctly. In the specific case of ConcurrentHashMap, these characteristics are guaranteed:
Iterators produced by the map never throw ConcurrentModificationException, and they'll iterate in an order that's fixed when they're created. They may or may not reflect any modifications made while the map is being accessed. Ordinary HashMap iterators will throw exceptions if modified while a thread is iterating over them.
Insertion and removal operations are thread-safe. Ordinary HashMaps might get into an inconsistent internal state if multiple threads tried to insert or remove items simultaneously, especially if modifications required a rehash.
that if two threads will concurrently try to do operations on the ConcurrentHashMap you are guaranteed that the operations will not leave the data structure in an inconsistent state.
That's not something other non concurrent data structure guarantee.
It means that all the operations you do to add/delete objects into your hash map is thread safe, but retrieving is not thread safe. Means that when you added a object in a perfect thread safe environment, after that moment that object should be visible to all the thread who are retrieving object from this MAP. But this thing is not guaranteed here.
Related
I have two thread that shares a common HashMap, one thread will always insert an objects to the Map and the second thread will remove objects from the HashMap.
my question is if this is the only logic of the two thread should I "protect" the Map with synchronize or ConcurrentHashMap, could I have a race condition?
if yes can you please explain what is the risk of not protecting the Map.
thanks
could I have a race condition
Yes, without synchronization or ConcurrentHashMap.
It is clearly stated in the Javadoc of HashMap:
Note that this implementation is not synchronized. If multiple threads access a hash map concurrently, and at least one of the threads modifies the map structurally, it must be synchronized externally.
You have two threads modifying the map structurally, so you need to synchronize if you use HashMap.
please explain what is the risk of not protecting the Map
Undefined behavior, ranging from doing completely the wrong thing (the very best kind of undefined behavior, because you know it needs fixing), down to seeming to work, until you change your JVM version and it mysteriously stops working (the very worst kind of undefined behavior).
In the doc, it says
If multiple threads access a hash map concurrently, and at least one
of the threads modifies the map structurally, it must be synchronized
externally. (A structural modification is any operation that adds or
deletes one or more mappings; merely changing the value associated
with a key that an instance already contains is not a structural
modification.)
It seems indicating that changing the value associated with a key that an instance already contains does not need external synchronization. But I think it's not thread safe. right?
For thread visibility purposes yes, you'll need external syncing if you have two threads that communicate using the map. But unsynchronized structural changes have a chance of corrupting the map completely (imagine when 2 threads put a new mapping and both start to rehash the map), whereas changing a mapped value will have less dramatic effects.
Even with only one thread doing structural modifications it's problematic if the backing array is grown/rehashed. Other threads using the same array (or the old one, if the array is grown) can encounter lost updates (thread puts value in the old array instead of new array), disappearing mappings (thread puts value in the array, while another thread is rehashing the same array, value gets put in the wrong bucket) and so on.
So when is it safe to not synchronize? Almost never. A safe situation would be a pre-built map with threads only accessing "their" entries, like
thread1: map.get("A");
thread2: map.put("B", "1"); // Assume "B" was in the map already
thread3: map.get("C");
No problems because no structural changes and threads are not sharing keys. As soon as you start sharing keys between threads, you can get race conditions and visibility issues. If you introduce structural changes, those visibility issues can result in data loss in the map.
Our legacy multi-threaded application has a lots of usage of Hashtable. Is it safe to replace the Hashtable instances with ConcurrentHashmap instances for performance gain? Will there be any side effect?
Is it safe to replace the Hashtable instances with ConcurrentHashmap instances for performance gain?
In most cases it should be safe and yield better performance. The effort on changing depends on whether you used the Map interface or Hashtable directly.
Will there be any side effect?
There might be side effects if your application expects to immediately be able to access elements that were put into the map by another thread.
From the JavaDoc on ConcurrentHashMap:
Retrieval operations (including get) generally do not block, so may overlap
with update operations (including put and remove). Retrievals reflect the
results of the most recently completed update operations holding upon their onset.
Edit: to clarify on "immediately" consider thread 1 adds element A to the map and while that write is executed thread 2 tries to whether A exists in the map. With Hashtable thread 2 would be blocked until after the write so the check would return true but when using ConcurrentHashMap it would return false since thread 2 would not be blocked and the write operation is not yet completed (thus thread 2 would see an outdated version of the bucket).
Depending on the size of your Hashtable objects you might get some performance gains by switching to ConcurrentHashmap.
ConcurrentHashmap is broken into segments which allow for the table to be only partially locked. This means that you can get more accesses per second than a Hashtable, which requires that you lock the entire table.
The tables themselves are both thread safe and both implement the Map interface, so replacement should be relatively easy.
[Question]: Is it thread safe to use ConcurrentHashMap<Object, ConcurrentHashMap<Object, Object>> or not.
[Optional to answer]: Also what about another concurrent maps types? And what about concurrent collections?
P.S. I'm asking only about java.util.concurrent package.
Specific Usage Context:
//we have
ConcurrentHashMap<Object, ConcurrentHashMap<Object, Object>> map = new ConcurrentHashMap<Object, ConcurrentHashMap<Object, Object>>();
//each string can be executed separately and concurently
ConcurrentHashMap<Object, Object> subMap = new ConcurrentHashMap<Object, Object>()
map.put(key, subMap);
map.remove(key);
map.get(key);
map.get(key).put(key, ref);
map.get(key).remove(key);
Maybe my solution lays around Guava HashBasedTable?
You can't define thread safety without the specific context in which you plan to use your collections.
The concurrent collections you have named are thread-safe on their own in the sense that their internal invariants will not be broken by concurrent access; however that's just one bullet point on the thread safety checklist.
If you perform anything more than a single operation on your structure, which must be atomic as a whole, then you will not get thread safety just by using these classes. You will have to resort to classic locking, or some quite elaborate, and usually unmotivated, lock-free updating scheme.
Using the examples from your question, consider the following.
Thread 1 executes
map.get(mapKey).put(key, value);
At the same time, Thread 2 executes
map.remove(mapKey);
What is the outcome? Thread 1 may be putting something to a map which has already been removed, or it may even get a null result from get. In most cases more coordination will be needed for correctness.
Concurrent Collections means multiple thread could perform add/remove operation on collection same time, No it is not thread safe
More Detail:
for further please read
What's the difference between ConcurrentHashMap and Collections.synchronizedMap(Map)?
Is ConcurrentHashMap totally safe?
The concurrent collections are thread safe for reads; but you must expect ConcurrentModificationException in case of competing concurrent updates or when modifying a Collection while another thread is iterating over it.
this is what the javadoc of ConcurrentHashMap says:
However, even though all operations are thread-safe, retrieval operations do not entail locking, and there is not any support for locking the entire table in a way that prevents all access
So, they ARE thread-safe in terms of modifying it.
UPDATE
same javadoc http://docs.oracle.com/javase/7/docs/api/java/util/concurrent/ConcurrentHashMap.html says:
Retrieval operations (including get) generally do not block, so may overlap with update operations (including put and remove). Retrievals reflect the results of the most recently completed update operations holding upon their onset. For aggregate operations such as putAll and clear, concurrent retrievals may reflect insertion or removal of only some entries. Similarly, Iterators and Enumerations return elements reflecting the state of the hash table at some point at or since the creation of the iterator/enumeration. They do not throw ConcurrentModificationException. However, iterators are designed to be used by only one thread at a time.
In general the classes which are part of java.util.concurrent provide additional performance at the (potential) penalty of additional coding complexity.
The issue that I see with nesting ConcurrentMap instances is managing the populating the outer map with values at given keys. If all the keys are known upfront and values placed in the map in some sort of initialization phase, there are no issues (but you also likely would not need to have the outer map be a ConcurrentMap). If you need to be able to insert new maps into the outer map as you go, the work becomes a bit more complicated. When creating a new map to insert into the outer map, you would need to use the putIfAbsentmethod[1] and pay attention to the returned value to determine what instance to add data to.
[1] - http://docs.oracle.com/javase/7/docs/api/java/util/concurrent/ConcurrentMap.html#putIfAbsent(K,%20V)
I need to use a HashMap of the form <String, ArrayList<String>> that is going to be
accessed by several different threads. From what I've managed to understand, ConcurrentHashMap is the preferred method. But will there be any problem with the fact that the value of the map is an ArrayList? Do I have to define the value as a synchronized ArrayList or something like that?
yes, there can be a problem. The ConcurrentHashMap will be thread safe for accesses into the Map, but the Lists served out need to be thread-safe, if multiple threads can operate on the same List instances concurrently.
So use a thread-safe list if that is true.
Edit -- now that i think about it, the rabbit-hole goes further. You have your Map, you have your List, and you have the objects in the list. Anything multiple threads can modify should be thread safe. So if many threads can modify the Map, Lists, and Objects in the Lists, then all of those should have thread-safety guards. If only the Map and List instances can be modified concurrently, only they need thread safety. If multiple threads can read everything, but not modify, then you don't need any thread safety (I think, someone will correct me if this is wrong)
ConcurrentHashMap guarantees atomicity in its mutating methods e.g. putIfAbsent, computeIfAbsent, computeIfPresent so if all the modification is done through these methods no problems will arise.
But at the same time, multiple threads can read the same map entry (concurrent reads are allowed). therefore multiple threads can access and modify an unsafe collection (map value)