ConcurrentHashMap needed with ReadWriteLock?

I have a Map which is read by multiple threads but which is (from time to time) cleared and rebuilt by another thread.
I have surrounded all the access to this map with
readWriteLock.readLock().lock();
try {
    // ... access myMap here ...
} finally {
    readWriteLock.readLock().unlock();
}
... or the writeLock() equivalents, depending on the type of access.
My question is... will the ReadWriteLock ensure that the updates to myMap are visible to the other threads (since they must wait until after unlock() is called by the writing thread)? Or do I also need to make myMap a concurrent map, like ConcurrentHashMap?
I will probably do that, just to be safe, but I'd like to understand better.

Yes, this should be fine even without a thread-aware map. The Javadoc for ReadWriteLock explicitly says:
All ReadWriteLock implementations must guarantee that the memory synchronization effects of writeLock operations (as specified in the Lock interface) also hold with respect to the associated readLock. That is, a thread successfully acquiring the read lock will see all updates made upon previous release of the write lock.
(Of course, by using a reader/writer lock at all you depend on the map supporting concurrent lookups from different threads. One could imagine a clever data structure that tries to save time overall by mutating some internal cached state during a lookup, but standard collections such as HashMap do not do that.)
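For completeness, the writer side the question alludes to (the occasional clear-and-rebuild) would follow the same pattern. A minimal sketch, where myMap and the rebuild logic are placeholders from the question:

readWriteLock.writeLock().lock();
try {
    myMap.clear();
    // ... rebuild myMap here ...
} finally {
    readWriteLock.writeLock().unlock();
}

Because no thread can hold the read lock while the write lock is held, readers never observe the map in a half-rebuilt state.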

Related

Update two ConcurrentHashMaps atomically

Map<String, Integer> m1 = new ConcurrentHashMap<>();
Map<String, Integer> m2 = new ConcurrentHashMap<>();

public void update(int i) {
    m1.put("key", i);
    m2.put("key", i);
}
In the dummy code above, the updates to m1 and m2 are not atomic. If I try to synchronize this block, FindBugs complains "Synchronization performed on util.concurrent instance".
Is there a recommended way to do this other than not using concurrent collections and doing all the synchronization explicitly?
As an aside, I also don't know the exact implications of wrapping concurrent collections in explicit synchronization.
If the intention is that both maps always return the same value for a given key, then the design of using two different concurrent maps is flawed. Even if you synchronize the writes, reading threads can still access the two maps while you're writing. That's probably what the FindBugs rule tries to catch.
There are two ways to go about this: either use explicit synchronization (that is, two regular maps with synchronized reads and writes), or use just one concurrent map and store a value object holding both ints, as sketched below.
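A minimal sketch of the single-map approach (the PairHolder and Pair class names and the map field are hypothetical, not from the question):

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

final class PairHolder {
    // Immutable value object holding both ints.
    static final class Pair {
        final int first;
        final int second;
        Pair(int first, int second) { this.first = first; this.second = second; }
    }

    private final ConcurrentMap<String, Pair> map = new ConcurrentHashMap<>();

    public void update(int i) {
        // A single put replaces both values at once, so readers can never
        // observe a state where only one of the two ints has been updated.
        map.put("key", new Pair(i, i));
    }

    public Pair read(String key) {
        return map.get(key);   // both ints are always consistent with each other
    }
}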
A ConcurrentMap is not a synchronized map. The latter does indeed synchronize on itself, so synchronizing on the Map will have an effect with respect to other Map operations.
In contrast, a ConcurrentMap is designed to allow concurrent updates of different keys and does not synchronize on itself, so if other code synchronizes on the Map instance, it has no effect. It wouldn't help to use a different mutex either; as long as other threads perform operations on the same maps without using this mutex, they won't respect your intended atomicity. On the other hand, if all threads used the mutex for all accesses to the maps, you wouldn't need a ConcurrentMap anymore.
But as Jarrod Roberson noted, this is likely an X-Y problem: when you update two maps atomically without also having an atomic read of both maps, no thread can tell whether the update happened atomically or not. The question is what you actually want to achieve: either you don't really need the atomicity, or you shouldn't use ConcurrentMaps.

What are the not thread-Safe cases when using HashMap in Java?

In the API documents, we can see:
If multiple threads access a hash map concurrently, and at least one of the threads modifies the map structurally, it must be synchronized externally. (A structural modification is any operation that adds or deletes one or more mappings; merely changing the value associated with a key that an instance already contains is not a structural modification.)
I'm wondering whether the put method needs to be synchronized? It mentions only structural modifications. Can you give some unsafe cases for HashMap? Also, when I look at the source code of Hashtable, the get method is synchronized as well; why aren't only the write operations synchronized?
There is a general rule of thumb:
If you have more than one thread accessing a collection and at least one thread modifies the collection at some point, you need to synchronize all accesses to the collection.
If you think about it, it's very clear: if a collection is modified while another thread reads from it (e.g. iterates over it), the read and write operations can interfere with each other (the read seeing a partial write, e.g. an entry created but its value not yet set, or an entry not properly linked yet).
Exempt from this are collections that one thread creates and modifies, then hands off to "the world", but never modifies after publishing their reference.
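A common way to get that safe hand-off is to build the map during construction and publish it through a final field; a sketch under that assumption (the Config class and field names are made up for illustration):

import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

final class Config {
    // Populated once in the constructor and never modified afterwards.
    // The final field guarantees that any thread that sees a Config
    // instance also sees the fully populated map.
    private final Map<String, String> settings;

    Config(Map<String, String> initial) {
        this.settings = Collections.unmodifiableMap(new HashMap<>(initial));
    }

    String get(String key) {
        return settings.get(key);
    }
}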
why not only the write operations be synchronized?
If the reads are not synchronized as well, you might encounter visibility issues. Not only that, but it is also possible to completely trash the object if it performs structural changes!
The JVM specification gives a few guarantees regarding when modifications to a memory location made by one thread will be visible to other threads. One such guarantee is that modifications made by a thread prior to releasing a lock are visible to threads that subsequently acquire the same lock. That's why you need to synchronize the read operations as well, even in the absence of concurrent structural modifications to the object.
Note that releasing/acquiring locks is not the only way to guarantee visibility of memory modifications, but it's the easiest. Others include the order of starting threads, class initialization, and reads/writes of volatile fields... more sophisticated stuff (and possibly more scalable in a highly concurrent environment, due to a reduced level of contention).
If you don't use any of those other techniques to ensure visibility, then simply locking only on write operations is wrong code. You might or might not encounter visibility issues though -- there's no guarantee that the JVM will fail, but it's possible, so... wrong code.
I'd suggest you read the book "Java Concurrency in Practice", one of the best texts on the subject I've ever read, after the JVM spec itself. Obviously, the book is way easier (still far from trivial!) and more fun to read than the spec...
One example would be:
Thread 1:
Iterator<YourType> it = yourMapInstance.values().iterator();
while (it.hasNext()) {
    it.next().doSth();
    Thread.sleep(1000);
}
Thread 2:
for (int i = 0; i < 10; i++) {
    if (Math.random() < 0.5) {
        yourMapInstance.clear();
        Thread.sleep(500);
    }
}
Now, if both threads are executed concurrently, at some point there may be a situation where the iterator still holds a value while the other thread has already deleted everything from the map. In this case, synchronization is required.
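One conventional fix for the example above is to wrap the map with Collections.synchronizedMap and hold the wrapper's monitor while iterating, as its Javadoc requires; a sketch (safeMap and the String key type are assumptions made here):

Map<String, YourType> safeMap = Collections.synchronizedMap(yourMapInstance);

// Thread 1: iteration must happen while holding the wrapper's monitor.
synchronized (safeMap) {
    for (YourType value : safeMap.values()) {
        value.doSth();
    }
}

// Thread 2: single operations such as clear() are synchronized by the wrapper itself,
// as long as every thread goes through safeMap rather than the raw map.
safeMap.clear();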

advantages of java's ConcurrentHashMap for a get-only map?

Consider these two situations:
a map which you are going to populate once at the beginning and which will then be accessed from many different threads.
a map which you are going to use as a cache that will be accessed from many different threads. You would like to avoid computing the result to be stored in the map unless it is missing, so the get-compute-store block will be synchronized. (And the map will not otherwise be used.)
In either of these cases, does ConcurrentHashMap offer you anything additional in terms of thread safety above an ordinary HashMap?
In the first case, it should not matter in practice, but there is no guarantee that modifications written to a regular HashMap will ever be seen by other threads. So if one thread initially creates and populates the map, and that thread never synchronizes with your other threads, then those threads may never see the initial values put into the map.
The above situation is unlikely in practice, and it would only take a single synchronization event or happens-before guarantee between the threads (a read/write of a volatile variable, for instance) to ensure even theoretical correctness.
In the second case, there is a concern, since access to a HashMap that modifies it structurally (adding a value) requires synchronization. Furthermore, you need some kind of synchronization to establish a happens-before relationship / shared visibility with the other threads, or there is no guarantee that the other threads will see the new values you put in. ConcurrentHashMap offers these guarantees and will not break when one thread modifies it structurally.
There is no difference in thread safety, no. For scenario #2 there is a difference in performance and a small difference in timing guarantees.
There will be no synchronization for your scenario #2, so threads that want to use the cache don't have to queue up and wait for others to finish. However, in order to get that benefit you don't have hard happens-before relationships at the synchronization boundaries, so it's possible two threads will compute the same cached value more or less at the same time. This is generally harmless as long as the computation is repeatable.
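A sketch of that non-blocking cache pattern (the Cache class, Value type, and expensiveCompute are hypothetical names, not from the question):

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

final class Cache {
    private final ConcurrentMap<String, Value> cache = new ConcurrentHashMap<>();

    Value get(String key) {
        Value cached = cache.get(key);
        if (cached == null) {
            Value computed = expensiveCompute(key);            // may run in several threads at once
            Value existing = cache.putIfAbsent(key, computed);
            cached = (existing != null) ? existing : computed; // the first stored result wins
        }
        return cached;
    }

    private Value expensiveCompute(String key) {
        return new Value();   // placeholder for the real computation
    }

    static final class Value { }
}

On Java 8+, cache.computeIfAbsent(key, this::expensiveCompute) gives the stronger guarantee that only one thread computes a given key, at the cost of briefly blocking other threads asking for that same key while the computation runs.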
(There is also the slight difference that ConcurrentHashMap does not allow null to be used as a key or value.)

Do Guava MapMaker and JDK ConcurrentMap use read/write locks?

As I understand it, both maps are designed to work in a multithreaded environment, but I am interested in what guarantees they provide (e.g. availability, consistency).
I believe they don't use read locks (relying on volatile fields to ensure that reads see writes from other threads) and are internally broken down into some number of segments (based on the expected concurrency level) that entries are distributed among, each of which uses its own write lock separate from the others. That way reads never block and writes only block if they happen to need to write to the same segment at the same time. I'm no expert on it though.
As far as guarantees, I'm not sure what you're asking. ConcurrentMap specifies a memory consistency guarantee:
Memory consistency effects: As with other concurrent collections, actions in a thread prior to placing an object into a ConcurrentMap as a key or value happen-before actions subsequent to the access or removal of that object from the ConcurrentMap in another thread.

Can ConcurrentHashMap guarantee true thread safety and concurrency at the same time?

We know that ConcurrentHashMap can provide concurrent access to multiple threads to boost performance, and inside this class segments are synchronized (am I right?). The question is: can this design guarantee thread safety? Say we have 30+ threads accessing and changing an object mapped by the same key in a ConcurrentHashMap instance; my guess is they still have to line up for that, don't they?
From my recollection, the book "Java Concurrency in Practice" says that ConcurrentHashMap provides concurrent reading and a decent level of concurrent writing. In the aforementioned scenario, and if my guess is correct, the performance won't be better than using the Collections synchronized wrapper API?
Thanks for clarifying,
John
You will still have to synchronize any access to the object being modified, and as you suspect all access to the same key will still have contention. The performance improvement comes in access to different keys, which is of course the more typical case.
All a ConcurrentMap can give you with respect to concurrency is that modifications to the map itself are done atomically, and that any writes happen-before any reads (this is important, as it provides safe publication of any reference retrieved from the map).
Safe-publishing means that any (mutable) object retrieved from the map will be seen with all writes to it before it was placed in the map. It won't help for publishing modifications that are made after retrieving it though.
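To illustrate that safe-publication guarantee, a sketch (PublicationDemo, Settings, sharedMap and the field are names made up for this example):

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

final class PublicationDemo {
    static final class Settings {
        int timeoutMillis;                       // plain, non-volatile field
    }

    private final ConcurrentMap<String, Settings> sharedMap = new ConcurrentHashMap<>();

    // Run by thread A
    void publish() {
        Settings s = new Settings();
        s.timeoutMillis = 500;                   // ordinary write, no locking
        sharedMap.put("settings", s);            // the put happens-before any later get of s
    }

    // Run by thread B
    int read() {
        Settings s = sharedMap.get("settings");
        // If the get returned the object, all writes made before the put are visible,
        // so this reads 500, never a stale 0. Writes thread A makes to s *after*
        // the put are not covered by this guarantee.
        return (s != null) ? s.timeoutMillis : -1;
    }
}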
However, concurrency and thread-safety is generally hard to reason about and make correct if you have mutable objects that are being modified by multiple parties. Usually you have to lock in order to get it right. A better approach is often to use immutable objects in conjunction with the ConcurrentMap conditional putIfAbsent/replace methods and linearize your algorithm that way. This lock-free style tends to be easier to reason about.
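For example, a retry loop over immutable Integer values using the conditional methods might look like this (the Counters class and increment method are hypothetical names):

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

final class Counters {
    private final ConcurrentMap<String, Integer> counts = new ConcurrentHashMap<>();

    void increment(String key) {
        for (;;) {
            Integer current = counts.get(key);
            if (current == null) {
                if (counts.putIfAbsent(key, 1) == null) {
                    return;                           // we created the entry
                }
            } else if (counts.replace(key, current, current + 1)) {
                return;                               // nobody changed it in between
            }
            // Another thread won the race; loop and retry with the fresh value.
        }
    }
}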
Question is, can this design guarantee the thread safety?
It guarantees the thread safety of the map; i.e. that access and updates on the map have a well defined and orderly behaviour in the presence of multiple threads performing updates simultaneously.
It does not guarantee thread safety of the key or value objects, and it does not provide any form of higher-level synchronization.
Say we have 30+ threads accessing and changing an object mapped by the same key in a ConcurrentHashMap instance; my guess is they still have to line up for that, don't they?
If you have multiple threads trying to use the same key, then their operations will inevitably be serialized to some degree. That is unavoidable.
In fact, from briefly looking at the source code, it looks like ConcurrentHashMap falls back to using conventional locks if there is too much contention for a particular segment of the map. And if you have multiple threads trying to access AND update the same key simultaneously, that will trigger locking.
First, remember that a thread-safe tool doesn't guarantee thread-safe usage of it in and of itself.
The if (!map.containsKey(k)) map.put(k, v); construct, as opposed to putIfAbsent, for example, is not thread safe,
and each value access/modification still has to be made thread safe independently.
Reads are concurrent, even for the same key, so performance will be better for typical applications.
