What's wrong with the code below?
private Map<Integer, Record> records = new ConcurrentHashMap<Integer, Record>();
Record rec = records.get(id);
if (rec == null) {
    rec = new Record(id);
    records.put(id, rec);
}
return rec;
Is the above code not thread-safe? Why should I use putIfAbsent in this case?
Locking is applied only for updates. In case of retrievals, it
allows full concurrency. What does this statement mean?
It's not thread safe.
If there was another thread, then in the time between records.get and records.put the other thread might have put the record as well.
Read only operations (i.e. ones that do not modify a structure) can be done by multiple threads at the same time. For example, 1000 threads can safely read the value of an int. However, those 1000 threads cannot update the value of the int without some sort of locking operation.
I know that this may sound like a very unlikely event, but remember that a 1 in a million event happens 1000 times per second at 1GHz.
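The point about reads versus updates can be sketched with a shared counter. This is only an illustration: the class and method names are made up, and AtomicInteger stands in for "some sort of locking operation". With a plain int and counter++, some increments could be lost, because the read, add and write steps are not atomic.

```java
import java.util.concurrent.atomic.AtomicInteger;

class SharedCounterSketch {
    // Starts `threads` threads that each increment one shared counter `perThread` times.
    static int countWithAtomic(int threads, int perThread) {
        AtomicInteger counter = new AtomicInteger();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    counter.incrementAndGet(); // atomic read-modify-write
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join(); // wait until every increment has been applied
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.get();
    }
}
```

Calling SharedCounterSketch.countWithAtomic(8, 1000) returns 8000 on every run, which is exactly the guarantee an unsynchronized int cannot give.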
This is thread safe:
private Map<Integer, Record> records = new ConcurrentHashMap<Integer, Record>();
// presumably records is a member and the code below is in a function
records.putIfAbsent(id, new Record(id));
Record rec = records.get(id);
return rec;
Note that this might create a Record and never use it.
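If Java 8 or later is available, one way to avoid building a Record that is never used is computeIfAbsent, which only runs the constructor when the key has no mapping yet. The sketch below is an assumption-laden illustration: the nested Record class and its CREATED counter are stand-ins invented here, not the question's actual class.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

class LazyRecordSketch {
    // Hypothetical stand-in for the question's Record class.
    static final class Record {
        static final AtomicInteger CREATED = new AtomicInteger(); // illustration only
        final int id;

        Record(int id) {
            this.id = id;
            CREATED.incrementAndGet(); // count how many instances get built
        }
    }

    private final ConcurrentMap<Integer, Record> records = new ConcurrentHashMap<>();

    Record getOrCreate(int id) {
        // The constructor is invoked only if no mapping exists for id.
        return records.computeIfAbsent(id, Record::new);
    }
}
```

Two calls to getOrCreate(7) return the same instance, and only one Record is constructed for that id.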
It may or may not be thread-safe, depending on how you want it to behave.
By the end of the code, records will safely have a Record for id. However, it's possible that two threads will both create and put a Record in, such that there are two (or more, if more threads do it) Records in existence. That might be fine, and it might not be -- it really depends on your application.
One of the dangers of code that is not thread-safe (for instance, if you use a normal HashMap without synchronization) is that threads can read partially-created or partially-updated objects written by other threads; in other words, things can go really haywire. This will not happen in your code, because ConcurrentHashMap will ensure memory is kept up-to-date between threads, and in that sense it is thread-safe.
One thing you can do is to use putIfAbsent, which will atomically put a key-value pair into the map, but only if there's nothing at that key already:
Record rec = records.get(id);
if (rec == null) {
    records.putIfAbsent(id, new Record(id));
    rec = records.get(id);
}
In this approach, you might create a second Record object, but if so, it'll not get inserted and will immediately be available for garbage collection. By the end of the snippet:
records will contain a Record for the given id
only one Record will have ever been put into records for that id (whether put there by this thread or another)
rec will point to that record
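A small variation on the snippet above uses the return value of putIfAbsent instead of a second get. putIfAbsent returns the value that was already mapped, or null if the new value was inserted. The class names here are hypothetical stand-ins for illustration.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class RecordRegistrySketch {
    // Hypothetical stand-in for the question's Record class.
    static final class Record {
        final int id;

        Record(int id) {
            this.id = id;
        }
    }

    private final ConcurrentMap<Integer, Record> records = new ConcurrentHashMap<>();

    Record getOrCreate(int id) {
        Record rec = records.get(id);
        if (rec == null) {
            Record candidate = new Record(id);
            // Returns the previous value, or null if candidate was inserted.
            Record existing = records.putIfAbsent(id, candidate);
            rec = (existing != null) ? existing : candidate;
        }
        return rec;
    }
}
```

Whichever thread wins the race, every caller ends up holding the one Record that is actually stored in the map.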
There have been many discussions on this topic, e.g. here:
What's the difference between ConcurrentHashMap and Collections.synchronizedMap(Map)?
But I haven't found an answer to my specific use-case.
In general, you cannot assume that a HashMap is thread-safe. If you write to the same key from different threads at the same time, all hell could break loose. But what if I know that all my threads will have unique keys?
Is this code thread-safe or do I need to add blocking mechanism (or use concurrent map)?
Map<Integer, String> myMap = new HashMap<>();
for (int i = 1; i < 6; i++) {
    final int key = i; // a lambda can only capture effectively final variables
    new Thread(() -> {
        myMap.put(key, Integer.toString(key));
    }).start();
}
The answer is simple: HashMap makes absolutely no thread-safety guarantees at all.
In fact it's explicitly documented that it's not thread-safe:
If multiple threads access a hash map concurrently, and at least one of the threads modifies the map structurally, it must be synchronized externally.
So accessing one from multiple threads without any kind of synchronization is a recipe for disaster.
I have seen cases where each thread used a different key and it still caused issues (such as iterations happening at the same time and resulting in infinite loops).
Just think of re-hashing: when the threshold is reached, the internal bucket-array needs to be resized. That's a somewhat lengthy operation (compared to a single put). During that time all manner of weird things can happen if another thread tries to put as well (and maybe even triggers a second re-hashing!).
Additionally, there's no reliable way for you to prove that your specific use case is safe, since all tests you could run could just "accidentally" work. In other words: you can never depend on this working, even if you think you covered it with unit tests.
And since not everyone is convinced, you can easily test it yourself with this code:
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class HashMapDemonstration {
    public static void main(String[] args) throws InterruptedException {
        int threadCount = 10;
        int valuesPerThread = 1000;
        Map<Integer, Integer> map = new HashMap<>();
        List<Thread> threads = new ArrayList<>(threadCount);
        for (int i = 0; i < threadCount; i++) {
            Thread thread = new Thread(new MyUpdater(map, i*valuesPerThread, (i+1)*valuesPerThread - 1));
            thread.start();
            threads.add(thread);
        }
        for (Thread thread : threads) {
            thread.join();
        }
        System.out.printf("%d threads with %d values per thread with a %s produced %d entries, should be %d%n",
                threadCount, valuesPerThread, map.getClass().getName(), map.size(), threadCount * valuesPerThread);
    }
}

class MyUpdater implements Runnable {
    private final Map<Integer, Integer> map;
    private final int startValue;
    private final int endValue;

    MyUpdater(Map<Integer, Integer> map, int startValue, int endValue) {
        this.map = map;
        this.startValue = startValue;
        this.endValue = endValue;
        System.out.printf("Creating updater for values %d to %d%n", startValue, endValue);
    }

    @Override
    public void run() {
        for (int i = startValue; i <= endValue; i++) {
            map.put(i, i);
        }
    }
}
This is exactly the type of program OP mentioned: Each thread will only ever write to keys that no other thread ever touches. And still, the resulting Map will not contain all entries:
Creating updater for values 0 to 999
Creating updater for values 1000 to 1999
Creating updater for values 2000 to 2999
Creating updater for values 3000 to 3999
Creating updater for values 4000 to 4999
Creating updater for values 5000 to 5999
Creating updater for values 6000 to 6999
Creating updater for values 7000 to 7999
Creating updater for values 8000 to 8999
Creating updater for values 9000 to 9999
10 threads with 1000 values per thread with a java.util.HashMap produced 9968 entries, should be 10000
Note that the actual number of entries in the final Map will vary for each run. It even sometimes prints 10000 (because it's not thread-safe!).
Note that this failure mode (losing entries) is definitely not the only possible one: basically anything could happen.
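For comparison, here is a compact variation of the same experiment in which the map is passed in, so it can be run with a ConcurrentHashMap. The class and method names are made up for this sketch; the disjoint key ranges mirror the demonstration above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class ConcurrentMapDemonstration {
    // Fills `map` from threadCount threads, each writing its own disjoint key range,
    // and returns the final number of entries.
    static int fill(Map<Integer, Integer> map, int threadCount, int valuesPerThread) {
        List<Thread> threads = new ArrayList<>(threadCount);
        for (int t = 0; t < threadCount; t++) {
            final int start = t * valuesPerThread;
            Thread thread = new Thread(() -> {
                for (int i = start; i < start + valuesPerThread; i++) {
                    map.put(i, i);
                }
            });
            thread.start();
            threads.add(thread);
        }
        for (Thread thread : threads) {
            try {
                thread.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return map.size();
    }

    public static void main(String[] args) {
        // With a thread-safe map no entries are lost.
        System.out.println(fill(new ConcurrentHashMap<>(), 10, 1000)); // prints 10000
    }
}
```

Passing a plain HashMap to fill instead reproduces the unpredictable behaviour described above.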
I would like to respond specifically to this phrase:
But what if I know that all my threads will have unique keys?
You are making an assumption about the implementation of the map. The implementation is subject to change. If the implementation is documented not to be thread-safe, you must take into account the Java Memory Model (JMM), which guarantees almost nothing about the visibility of memory between threads unless you synchronize.
This is making a lot of assumptions and few guarantees. You should not rely on these assumptions, even if it happens to work on your machine, in a specific use-case, at a specific time.
In short: if an implementation that is not thread-safe is used in multiple threads, you MUST surround it with constructs that ensure thread-safety. Always.
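One simple way to "surround" a plain HashMap is to route every access through a single lock, as in the minimal sketch below. The wrapper class and its method names are invented for illustration; Collections.synchronizedMap(new HashMap<>()) gives a similar effect without writing the wrapper by hand.

```java
import java.util.HashMap;
import java.util.Map;

class GuardedMapSketch {
    private final Map<Integer, String> map = new HashMap<>();
    private final Object lock = new Object();

    void put(int key, String value) {
        synchronized (lock) { // writers exclude each other and all readers
            map.put(key, value);
        }
    }

    String get(int key) {
        synchronized (lock) { // reads take the same lock, so they see completed writes
            return map.get(key);
        }
    }

    int size() {
        synchronized (lock) {
            return map.size();
        }
    }
}
```

The important design point is that reads go through the lock too; guarding only the writes would still leave readers exposed to a half-finished resize.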
However, just for the fun of it, let's describe what can go wrong in your particular case, where each thread only uses a unique key.
When adding or removing a key, even a unique one, there are cases when a hash map needs to reorganise internally. The first is a hash collision [1], in which a linked list of key-value entries must be updated. The second is when the map decides to resize its internal entry table, which overhauls the internal structure, including the linked lists just mentioned.
Because of the JMM, there is largely no guarantee about what another thread sees of the reorganisation. That means behaviour is undefined if another thread happens to be in the middle of a get(key) when the reorganisation happens. If another thread is concurrently doing a put(key,value), you could end up with two threads trying to resize the map at the same time. Frankly, I do not even want to think about the mayhem that can cause!
[1] Multiple keys can have the same hash code. Because the map does not have limitless storage, hash codes are also often wrapped around the size of the internal table of entries, like (hashCode % sizeOfTable), which can lead to a situation where different hash codes use the same "entry".
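The footnote's (hashCode % sizeOfTable) idea can be made concrete with a tiny model. This is a simplification for illustration only: java.util.HashMap computes the slot differently (by bit-masking a mixed hash), and the class and method names below are made up.

```java
class BucketIndexSketch {
    // Simplified model of the footnote: wrap a hash code onto a table slot.
    static int bucketIndex(Object key, int tableSize) {
        return Math.floorMod(key.hashCode(), tableSize); // floorMod keeps the result non-negative
    }

    public static void main(String[] args) {
        // Integer.hashCode() is the int value itself, so 1 and 17 have different
        // hash codes, yet both land in slot 1 of a 16-slot table.
        System.out.println(bucketIndex(1, 16));  // prints 1
        System.out.println(bucketIndex(17, 16)); // prints 1
    }
}
```

So two threads using the distinct keys 1 and 17 would, in this model, still be modifying the same bucket.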
I have a ConcurrentMap which I need to populate from a multithreaded application. My map is shown below:
private final ConcurrentMap<String, AtomicLongMap<String>> deviceErrorHolder = Maps.newConcurrentMap();
Below is my method, which is called from a multithreaded application at a very high rate, so I need to make sure it is fast.
public void addDeviceErrorStats(String deviceName, String errorName) {
    AtomicLongMap<String> errorMap = deviceErrorHolder.get(deviceName);
    if (errorMap == null) {
        errorMap = AtomicLongMap.create();
        AtomicLongMap<String> currentErrorMap = deviceErrorHolder.putIfAbsent(deviceName, errorMap);
        if (currentErrorMap != null) {
            errorMap = currentErrorMap;
        }
    }
    errorMap.incrementAndGet(errorName);
}
For each deviceName, I will have an AtomicLongMap which will contain all the counts for different errorName.
ExceptionCounter.getInstance().addDeviceErrorStats("deviceA", "errorA");
ExceptionCounter.getInstance().addDeviceErrorStats("deviceA", "errorB");
ExceptionCounter.getInstance().addDeviceErrorStats("deviceA", "errorC");
ExceptionCounter.getInstance().addDeviceErrorStats("deviceB", "errorA");
ExceptionCounter.getInstance().addDeviceErrorStats("deviceB", "errorB");
Is my addDeviceErrorStats method thread-safe? And is the way I am updating the value of my deviceErrorHolder map correct? Meaning, will it be an atomic operation? Do I need to synchronize the creation of new AtomicLongMap instances, or will the ConcurrentMap take care of that for me?
I am working with Java7.
If you can move to Java 8 or later, you can write a much simpler version of this with computeIfAbsent().
AtomicLongMap<String> errorMap = deviceErrorHolder.computeIfAbsent(deviceName, a -> AtomicLongMap.create());
errorMap.incrementAndGet(errorName);
The computeIfAbsent (in concurrent maps) is especially meant to do an atomic version of what your null checking logic does. If the deviceName key has a value, it's returned, otherwise the computation is called atomically, and the return value of the computation is both associated with the key in the map as well as returned.
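The same pattern can be tried out with JDK classes only. In the sketch below a ConcurrentMap of AtomicLong counters stands in for Guava's AtomicLongMap; that substitution, the class name and the count helper are assumptions made for illustration, not part of the original code. It needs Java 8 or later.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicLong;

class DeviceErrorCounterSketch {
    // JDK-only stand-in for ConcurrentMap<String, AtomicLongMap<String>>.
    private final ConcurrentMap<String, ConcurrentMap<String, AtomicLong>> deviceErrorHolder =
            new ConcurrentHashMap<>();

    void addDeviceErrorStats(String deviceName, String errorName) {
        deviceErrorHolder
                .computeIfAbsent(deviceName, d -> new ConcurrentHashMap<>()) // per-device map, created once
                .computeIfAbsent(errorName, e -> new AtomicLong())           // per-error counter, created once
                .incrementAndGet();
    }

    // Hypothetical read helper: returns 0 when nothing has been recorded.
    long count(String deviceName, String errorName) {
        ConcurrentMap<String, AtomicLong> errors = deviceErrorHolder.get(deviceName);
        if (errors == null) {
            return 0L;
        }
        AtomicLong counter = errors.get(errorName);
        return counter == null ? 0L : counter.get();
    }
}
```

Because both lookups and the increment are atomic on their own, no extra synchronization is needed around addDeviceErrorStats.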
I believe your method is correct. Let's assume we have two concurrent threads calling it for the same device.
The case where the errorMap already existed is trivial, as both threads will get the same map and call incrementAndGet on it, which is atomic.
Let's now consider the case where errorMap didn't exist. Say the first thread gets to AtomicLongMap.create(), and then the second thread is scheduled. That thread will also create its own local map. putIfAbsent() is atomic, hence one of the threads will get null back, while the other will get the map put by the first. In the latter case, you're throwing away the map that was instantiated by this thread, and using the one returned instead. Looks good to me.
The someParameters hashmap is loaded from a .csv file every twenty minutes or so by one thread and set by the setParameters method.
It is very frequently read by multiple threads calling getParameters to perform a lookup translation of one value into a corresponding value.
Is the code unsafe and/or the "wrong" way to achieve this (particularly in terms of performance)? I know about ConcurrentHashMap but am trying to get a more fundamental understanding of concurrency, rather than using classes that are inherently thread-safe.
One potential risk I see is that the object reference someParameters could be reset whilst another thread is reading the copy, so the other thread might not have the latest values (which wouldn't matter to me).
public class ConfigObject {
    private static HashMap<String, String> someParameters = new HashMap<String, String>();

    public HashMap<String, String> getParameters(){
        return new HashMap<String, String>(someParameters);
        //to some thread which will only ever iterate or get
    }

    public void setParameters(HashMap<String, String> newParameters){
        //could be called by any thread at any time
        someParameters = newParameters;
    }
}
There are two problems here:
Visibility: the updated someParameters reference might not be visible to other threads. To fix this, mark someParameters as volatile.
Performance: a new HashMap is created on every call to the get method. To fix that, use the utility method Collections.unmodifiableMap(), which just wraps the original map and disallows put/remove.
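Putting both fixes together could look roughly like the sketch below. It is one possible shape under stated assumptions, not the only one: the class name is changed to mark it as a sketch, and the defensive copy inside setParameters is an extra precaution added here (cheap, since the setter runs only every twenty minutes or so) so that later changes to the caller's map cannot reach the readers.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

class ConfigObjectSketch {
    // volatile: a newly assigned map becomes visible to all reader threads.
    private static volatile Map<String, String> someParameters =
            Collections.unmodifiableMap(new HashMap<String, String>());

    public Map<String, String> getParameters() {
        return someParameters; // read-only view, no copy per call
    }

    public void setParameters(Map<String, String> newParameters) {
        // Copy once, wrap as read-only, then publish with a single volatile write.
        someParameters = Collections.unmodifiableMap(new HashMap<String, String>(newParameters));
    }
}
```

Readers that already hold the old map keep working on a consistent, if slightly stale, set of values, which the question says is acceptable.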
If I understand your problem correctly, you need to change/replace many parameters at once (atomically). Unfortunately, ConcurrentHashMap doesn't support atomic bulk inserts/updates.
To achieve this, you should use a shared ReadWriteLock. Its advantage compared to Collections.synchronized... is that reads can be performed concurrently: if readLock is held by one thread, readLock().lock() called from another thread will not block.
// ReadWriteLock is an interface; ReentrantReadWriteLock is the standard implementation
ReadWriteLock lock = new ReentrantReadWriteLock();

// on write:
lock.writeLock().lock();
try {
    // write/update operation,
    // e. g. clear map and write new values
} finally {
    lock.writeLock().unlock();
}

// on read:
lock.readLock().lock();
try {
    // read operation
} finally {
    lock.readLock().unlock();
}
I have a ConcurrentHashMap which is asynchronously updated to mirror the data in a database. I am attempting to sort an array based on this data which works fine most of the time but if the data updates while sorting then things can get messy.
I have thought of copying the map and then sorting with the copied map but due to the frequency I need to sort and the size of the map this is not a possibility.
I'm not sure I understood your requirements perfectly, so I'll deal with two separate cases.
Let's say your async "update" operation requires you to update 2 keys in your map.
Scenario 1 is : it's OK if a "sort" operation occurs while only 1 of the two updates is visible.
Scenario 2 is : you need the 2 updates to be visible simultaneously or not at all (which is called atomic behavior).
Case 1 : you do not need atomic bulk updates
In this case, ConcurrentHashMap is OK as is, seeing iterators are guaranteed not to fail upon modification of the map. From ConcurrentHashMap documentation (emphasis mine) :
Similarly, Iterators and Enumerations return elements reflecting the state of the hash table at some point at or since the creation of the iterator/enumeration. They do not throw ConcurrentModificationException. However, iterators are designed to be used by only one thread at a time.
So you are guaranteed that you can iterate through the map even while it is being modified, without the iteration failing because of concurrent modifications. But (see the emphasis) you are NOT guaranteed that all modifications made concurrently to the map are immediately visible, nor, if only some of them are visible, which ones or in which order.
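The iterator guarantee can be observed even in a single thread, which keeps the experiment deterministic. The sketch below modifies the map in the middle of a for-each loop; the class and method names are made up. A ConcurrentHashMap tolerates this, whereas a plain HashMap in current OpenJDK implementations fails fast with a ConcurrentModificationException (its documentation describes that check as best-effort).

```java
import java.util.ConcurrentModificationException;
import java.util.Map;

class IterationSketch {
    // Adds a key while iterating; returns true if the iteration finished without an exception.
    static boolean iterateWhileModifying(Map<String, Integer> map) {
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        try {
            for (String key : map.keySet()) {
                // The first pass inserts a new key; later passes only overwrite it.
                map.put("added-during-iteration", key.length());
            }
            return true;
        } catch (ConcurrentModificationException e) {
            return false;
        }
    }
}
```

Whether the loop over the ConcurrentHashMap also visits the newly added key is deliberately left unspecified, which is the visibility caveat described above.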
Case 2 : you need bulk updates to be atomic
Furthermore, with ConcurrentHashMap you do not have any guarantee that bulk operations (putAll) will behave atomically :
For aggregate operations such as putAll and clear, concurrent retrievals may reflect insertion or removal of only some entries.
So I see two solutions for this case, each of which entails locking.
Solution 1 : building a copy
Building a "frozen" copy can help you only if this copy is built during a phase where all other updates are locked, because the copying of your map implies iterating through it, and our hypothesis is that iteration is not safe if we have concurrent modification.
This could look like :
ConcurrentMap<String, String> map = new ConcurrentHashMap<String, String>();
AtomicReference<Map<String, String>> frozenCopy = new AtomicReference<Map<String, String>>(map);

public void sortOperation() {
    sortUsingFrozenCopy();
}

public void updateOperation() {
    synchronized (map) { // Exclusive access to the map instance
        updateMap();
        Map<String, String> newCopy = new HashMap<String, String>();
        newCopy.putAll(map); // You build the copy. This is safe thanks to the exclusive access.
        frozenCopy.set(newCopy); // And you update the reference to the copy
    }
}
This solution could be refined...
Seeing as your 2 operations (map reads and map writes) are totally asynchronous, one can assume that your read operations cannot know (and should not care) whether the previous write operation occurred 0.1 sec before or will occur 0.1 sec after.
So having your read operations depend on a "frozen copy" of the map that is actually updated once every 1 (or 2, or 5, or 10) seconds (or update events) instead of each time may be a possibility for your case.
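The reading side of Solution 1 is not spelled out above. A minimal sketch of what sortUsingFrozenCopy might look like follows; its signature, the publish helper and the choice to sort keys by their mapped value are all assumptions made for illustration. The key point is that the reference is read once, so the whole sort sees a single consistent snapshot.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

class FrozenCopySortSketch {
    private final AtomicReference<Map<String, String>> frozenCopy =
            new AtomicReference<Map<String, String>>(new HashMap<String, String>());

    // Hypothetical hook for the updater: publish a fresh copy built under its lock.
    void publish(Map<String, String> newCopy) {
        frozenCopy.set(new HashMap<String, String>(newCopy));
    }

    // Returns the keys ordered by their mapped value, using one snapshot throughout.
    List<String> sortUsingFrozenCopy(List<String> keys) {
        final Map<String, String> snapshot = frozenCopy.get(); // read the reference once
        List<String> sorted = new ArrayList<String>(keys);
        sorted.sort(Comparator.comparing((String key) -> snapshot.getOrDefault(key, "")));
        return sorted;
    }
}
```

An update that lands in the middle of a sort only swaps the reference; the sort in progress keeps using the snapshot it started with.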
Solution 2 : lock the map for updates
Locking the Map without copying it is a solution. You'd want a ReadWriteLock (or StampedLock in Java 8) so as to have multiple sorts possible, and a mutual exclusion of read and write operations.
Solution 2 is actually easy to implement. You'd have something like
ReadWriteLock lock = new ReentrantReadWriteLock();

public void sortOperation() {
    lock.readLock().lock();
    // read lock granted, which prevents the writeLock from being granted
    try {
        sort(); // This is safe, nobody can write
    } finally {
        lock.readLock().unlock();
    }
}

public void updateOperation() {
    lock.writeLock().lock();
    // Write lock granted, no other writeLock (except to myself) can be granted
    // nor any readLock
    try {
        updateMap(); // Nobody is reading, that's OK.
    } finally {
        lock.writeLock().unlock();
    }
}
With a ReadWriteLock, multiple reads can occur simultaneously, or a single write, but not multiple writes nor reads and writes.
You'd have to consider the possibility of using a fair variant of the lock, so that you are sure that every read and write process will eventually have a chance of being executed, depending on your usage pattern.
(NB : if you use Locking/synchronized, your Map may not need to be concurrent, as write and read operations will be exclusive, but this is another topic).
private static Map<Integer, String> map = null;

public static String getString(int parameter){
    if(map == null){
        map = new HashMap<Integer, String>();
        //map gets filled here...
    }
    return map.get(parameter);
}
Is that code unsafe as far as multithreading goes?
As mentioned, it's definitely not safe. If the contents of the map are not based on the parameter in getString(), then you would be better served by initializing the map as a static initializer as follows:
private static final Map<Integer, String> MAP = new HashMap<Integer,String>();
static {
// Populate map here
}
The above code gets called once, when the class is loaded. It's completely thread-safe (although future modifications to the map are not).
Are you trying to lazy load it for performance reasons? If so, this is much safer:
private static Map<Integer, String> map = null;

public synchronized static String getString(int parameter){
    if(map == null){
        map = new HashMap<Integer, String>();
        //map gets filled here...
    }
    return map.get(parameter);
}
Using the synchronized keyword will make sure that only a single thread can execute the method at any one time, and that changes to the map reference are always propagated.
If you're asking this question, I recommend reading "Java Concurrency in Practice".
Race condition? Possibly.
If map is null, and two threads check if (map == null) at the same time, each would allocate a separate map. This may or may not be a problem, depending mainly on whether map is invariant. Even if the map is invariant, the cost of populating the map may also become an issue.
Memory leak? No.
The garbage collector will do its job correctly regardless of the race condition.
You do run the risk of initializing map twice in a multi-threaded scenario.
In a managed language, the garbage collector will eventually dispose of the no-longer-referenced instance. In an unmanaged language, you will never free the memory allocated for the overwritten map.
Either way, initialization should be properly protected so that multiple threads do not run initialization code at the same time.
One reason: The first thread could be in the middle of initializing the HashMap, while a second thread comes along, sees that map is not null, and merrily tries to use the partially-initialized data structure.
It is unsafe in a multithreaded scenario because of a race condition.
But do you really need lazy initialization for the map? If the map is going to be used anyway, it seems you could just initialize it eagerly.
The above code isn't thread-safe; as others have mentioned, your map can be initialized twice. You may be tempted to try and fix the above code by adding some synchronization; this is known as "double-checked locking". Here is an article that describes the problems with this approach, as well as some potential fixes.
The simplest solution is to make the field a static field in a separate class:
class HelperSingleton {
    static Helper singleton = new Helper();
}
It can also be fixed using the volatile keyword, as described in Bill Pugh's article.
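For reference, one common shape of the volatile variant, applied to the map from the question, is sketched below. It is not taken from the article mentioned above; the class name, the sample contents and the INIT_COUNT counter are assumptions added for illustration, and the idiom relies on the Java 5+ memory model.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

class LazyLookupSketch {
    static final AtomicInteger INIT_COUNT = new AtomicInteger(); // illustration only

    // volatile is what makes double-checked locking correct here.
    private static volatile Map<Integer, String> map;

    static String getString(int parameter) {
        Map<Integer, String> local = map; // one volatile read on the fast path
        if (local == null) {
            synchronized (LazyLookupSketch.class) {
                local = map; // re-check under the lock
                if (local == null) {
                    local = new HashMap<Integer, String>();
                    local.put(1, "one"); // hypothetical contents
                    local.put(2, "two");
                    INIT_COUNT.incrementAndGet();
                    map = local; // publish only after the map is fully populated
                }
            }
        }
        return local.get(parameter);
    }
}
```

The map is filled through a local variable and assigned to the field last, so no thread can ever observe a non-null but half-populated map.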
No, this code is not safe for use by multiple threads.
There is a race condition in the initialization of the map. For example, multiple threads could initialize the map simultaneously and clobber each others' writes.
There are no memory barriers to ensure that modifications made by a thread are visible to other threads. For example, each thread could use its own copy of the map because they never "see" the values written by another thread.
There is no atomicity to ensure that invariants are preserved as the map is accessed concurrently. For example, a thread that's performing a get() operation could get into an infinite loop because another thread rehashed the buckets during a simultaneous put() operation.
If you are using Java 6, use ConcurrentHashMap
ConcurrentHashMap JavaDoc