Can anyone explain to me how the parameter map will be affected in the following code if two threads access it at the same time. Is the map exposed to thread safety issues because it is not inside the synchronized block?
public void fun(String type, String name, Map<String, Object> parameters) {
parameters.put(Constants.PARM_TYPE, type);
parameters.put(Constants.PARM_NAME, name);
try {
synchronized (launcher) {
launcher.launch(type, bool, parameters);
}
} catch (Exception e) {
logger.error("AHHHHH, the world has ended!",e);
}
}
I have looked at the following but I'm still questioning it: Synchronized and the scope of visibility
If your parameters instances are separate (as you mentioned in your last comment), then there is no problem with this code.
The method parameters - besides Map parameters - are just 2 Strings, so there are no synchronisation issues regarding them.
As for putting synchronized on the method versus on launcher: these lock on different objects. A synchronized method locks on this, whereas the block locks on launcher. Since you want to protect the 'launcher', you have to "build the fence" as close to it as you can, so synchronizing on launcher is OK.
There is another technique, which is to create an Object lockObject = new Object() and synchronize on that object. For this purpose I think it's overkill, but you can do that.
Imagine if you had a shared Map.
private Map<String, Object> map = new HashMap<String,Object>();
that is being updated by many threads as displayed in your example.
new Thread(new Runnable(){
public void run(){
fun("a","b", map);
}
}).start();
new Thread(new Runnable(){
public void run(){
fun("a","b", map);
}
}).start();
Each thread may update the map at the same time, which could lead to "A Beautiful Race Condition".
If multiple threads have a handle to the same parameters instance and they call this method (which modifies the map) with a non-thread-safe map implementation, all kinds of bad things can/will happen (e.g. map corruption which may/may not manifest itself as exceptions like NullPointerException).
Assuming multiple threads are calling the method fun(), note how a map works: if you insert the same key multiple times, the value for that key is overwritten each time. But this might not be the only problem. There could be race conditions and corruption issues too. If you want a data structure that is thread-safe by itself, I assume a Hashtable will get your job done.
If more than one thread executes that code concurrently, passing the same object as the parameter map, then you will have a race condition.
This will definitely cause thread safety issues unless you:
use the right Map implementation, based on your requirements and the Map implementation concurrent behavior (ConcurrentHashMap for instance, but this depends a lot on the actual requirements for your app)
or write thread safe code yourself (probably using synchronization primitives like 'synchronized').
IMPORTANT: Please note that just moving the lines of code that modify the map into the synchronized block won't necessarily remove the race condition. You also have to consider which other threads in your app may try to modify the map and which object they use to synchronize their access to it. The code in the function synchronizes on 'launcher'. Any other thread that modifies the map without synchronization, or while synchronizing on an object other than 'launcher', will cause a race condition.
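To make the "same lock everywhere" point concrete, here is a minimal sketch. The SharedParameters class and its method names are made up for illustration and are not part of the question's code. Every read and write of the shared HashMap goes through one private lock object, so no thread can reach the map while holding a different lock or none at all.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical wrapper: every access to the shared map takes the same lock.
class SharedParameters {
    private final Object lock = new Object();
    private final Map<String, Object> parameters = new HashMap<>();

    void put(String key, Object value) {
        synchronized (lock) {
            parameters.put(key, value);
        }
    }

    // Returns a copy so callers never touch the shared map without the lock.
    Map<String, Object> snapshot() {
        synchronized (lock) {
            return new HashMap<>(parameters);
        }
    }

    // Starts `threads` writers, each adding `perThread` distinct keys,
    // waits for them, and returns the final number of entries.
    static int demo(int threads, int perThread) {
        SharedParameters shared = new SharedParameters();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int id = t;
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    shared.put("key-" + id + "-" + i, i);
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return shared.snapshot().size();
    }

    public static void main(String[] args) {
        System.out.println(demo(4, 1000)); // prints 4000
    }
}
```

With the synchronized blocks removed, the same demo can lose entries or corrupt the HashMap, which is the race the answers above describe.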
I stumbled upon the following piece of code:
public static final Map<String, Set<String>> cacheMap = new ConcurrentHashMap<>();
this cache is accessed from rest controller method:
public void fooMethod(String fooId) {
Set<String> fooSet = cacheMap.computeIfAbsent(fooId, k -> new ConcurrentSet<>());
//operations with fooSet
}
Is ConcurrentSet really necessary? when I know for sure that the set is accessed only in this method?
As you use it in the controller, multiple threads can call your method simultaneously (e.g. multiple parallel requests can call your method).
As this method does not appear to be synchronized in any way, ConcurrentSet is probably necessary here.
Is ConcurrentSet really necessary?
Possibly, possibly not. We don't know how this code is being used.
However, assuming that it is being used in a multithreaded way (specifically: that two threads can invoke fooMethod concurrently), yes.
The atomicity in ConcurrentHashMap is only guaranteed for each invocation of computeIfAbsent. Once this completes, the lock is released, and other threads are able to invoke the method. As such, access to the return value is not atomic, and so you can get thread interference when accessing that value.
As for the question "do I need ConcurrentSet?": no, you can instead make the accesses to the set atomic:
cacheMap.compute(fooId, (k, fooSet) -> {
if (fooSet == null) fooSet = new HashSet<>();
// Operations with fooSet
return fooSet;
});
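Here is a sketch of the two options side by side. ConcurrentSet in the question is not a JDK class, so this assumes the JDK's ConcurrentHashMap.newKeySet() as the thread-safe set. The FooCache class, its method names and the demo are hypothetical. Real code should pick one option per map: a plain HashSet stored in the map is only safe if every access to it goes through compute.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class FooCache {
    private final Map<String, Set<String>> cacheMap = new ConcurrentHashMap<>();

    // Option 1: the set itself is thread-safe, so it may be modified
    // after computeIfAbsent has returned.
    void addWithConcurrentSet(String fooId, String value) {
        cacheMap.computeIfAbsent(fooId, k -> ConcurrentHashMap.newKeySet()).add(value);
    }

    // Option 2: a plain HashSet that is only touched inside compute,
    // which ConcurrentHashMap runs atomically for that key.
    void addInsideCompute(String fooId, String value) {
        cacheMap.compute(fooId, (k, fooSet) -> {
            if (fooSet == null) fooSet = new HashSet<>();
            fooSet.add(value);
            return fooSet;
        });
    }

    // Four threads add 500 distinct values each under the same key.
    // Returns the size of that key's set once all threads have finished.
    static int demo(boolean insideCompute) {
        FooCache cache = new FooCache();
        Thread[] workers = new Thread[4];
        for (int t = 0; t < workers.length; t++) {
            final int id = t;
            workers[t] = new Thread(() -> {
                for (int i = 0; i < 500; i++) {
                    String value = id + ":" + i;
                    if (insideCompute) {
                        cache.addInsideCompute("foo", value);
                    } else {
                        cache.addWithConcurrentSet("foo", value);
                    }
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return cache.cacheMap.get("foo").size();
    }

    public static void main(String[] args) {
        System.out.println(demo(true) + " " + demo(false));
    }
}
```

Option 2 serializes all work on a key inside the map's own locking, so the function passed to compute should stay short.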
A concurrent map alone does not guarantee thread safety for everything done with its contents. ConcurrentHashMap.computeIfAbsent does add a missing key atomically, so two threads cannot both insert a value for the same key, and no extra synchronized block is needed for that step. However, if the code modifies the Set inside the Map, which appears likely, those modifications need to be thread-safe as well.
With a plain HashMap, every check for the key and every insertion would have to happen inside a synchronized block. Checking for the key outside the block first and then again inside it is not safe for a HashMap, because the unsynchronized read can race with a write.
Set modifications should typically occur in a synchronized block as well, unless the Set is itself a thread-safe implementation.
public void fooAndBar() {
HashMap<Foo, Bar> fooBarMap = new HashMap<>();
CompletionService<Void> completionService = new ExecutorCompletionService<>(exec);
for(int i=0; i<10; i++) {
final int id = i; // copy needed: an anonymous class can only capture effectively final locals
completionService.submit(new Callable<Void>() {
@Override
public Void call() throws Exception {
fooBarMap.put(new Foo(id), new Bar(id));
return null;
}
});
}
}
Is it safe to modify the HashMap inside the Callable?
Should the hashmap be final (or maybe volatile) and if so, why?
Should I use a structure other than HashMap, something like ConcurrentHashMap or SynchronizedMap and why?
I'm trying to grasp java concepts so please bear with me
Is it safe to modify the HashMap inside the Callable?
No. If you are using a threadpool I assume you are planning to have more of those callables running in parallel. Any time an object with mutable state is accessed from more than one thread, that's thread-unsafe. If you write to a thread-unsafe hashmap from two threads simultaneously, its internal structure will be corrupted. If you read from a thread-unsafe hashmap while another thread is writing to it simultaneously, your reading thread will read garbage. This is a very well known and extensively studied situation known as a Race Condition, a description of which would be totally beyond the scope of this answer. For more information, read about Race Condition on Wikipedia or on another question answered back in 2008: Stackoverflow - What is a Race Condition.
Should the hashmap be final (or maybe volatile) and if so, why?
For your purposes it does not need to be final, but it is always a good practice to make final anything that can be made final.
It does not need to be volatile because:
if you were to make it volatile, you would be making the reference to it volatile, but the reference never changes, it is its contents that change, and volatile has nothing to do with those.
the threadpool makes sure that call() will be executed after fooBarMap = new HashMap<>(): actions taken in a thread before it submits a task happen-before the task starts running. (If you are wondering why such a thing could ever be a concern, search for "memory barrier".)
Should I use a structure other than HashMap, something like ConcurrentHashMap or SynchronizedMap and why?
Definitely. Because, as I wrote earlier, any time an object with mutable state is accessed from more than one thread, that's thread-unsafe. And ConcurrentHashMap, Collections.synchronizedMap, synchronized, etc. exist precisely for taking care of thread-unsafe situations.
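As a rough sketch of that advice, here is the question's loop with a ConcurrentHashMap. Foo and Bar are not shown in the question, so Integer and String stand in for them. The FooAndBarDemo class name, the pool size and the timeout are likewise assumptions.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class FooAndBarDemo {
    // Fills a map from `tasks` pool tasks and returns it once all have finished.
    static Map<Integer, String> fooAndBar(int tasks) {
        // final is optional here; it only prevents reassigning the reference.
        final Map<Integer, String> fooBarMap = new ConcurrentHashMap<>();
        ExecutorService exec = Executors.newFixedThreadPool(4);
        try {
            for (int i = 0; i < tasks; i++) {
                final int id = i; // lambdas can only capture effectively final locals
                exec.execute(() -> {
                    fooBarMap.put(id, "bar-" + id); // safe: ConcurrentHashMap handles concurrent puts
                });
            }
        } finally {
            exec.shutdown(); // accept no new tasks, let submitted ones finish
        }
        try {
            if (!exec.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return fooBarMap;
    }

    public static void main(String[] args) {
        System.out.println(fooAndBar(10).size()); // prints 10
    }
}
```

Waiting for termination before reading the map also ensures that the calling thread sees every entry the tasks wrote.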
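The HashMap can be final even though you are modifying it multiple times (from within a for loop).
final only prevents the variable from being reassigned. Calling put on the map is still allowed, so making it final will not cause an error.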
The someParameters hashmap is loaded from a .csv file every twenty minutes or so by one thread and set by the setParameters method.
It is very frequently read by multiple threads calling getParameters: to perform a lookup translation of one value into a corresponding value.
Is the code unsafe and/or the "wrong" way to achieve this (particularly in terms of performance)? I know about ConcurrentHashMap but am trying to get a more fundamental understanding of concurrency, rather than using classes that are inherently thread-safe.
One potential risk I see is that the object reference someParameters could be reset whilst another thread is reading the copy, so the other thread might not have the latest values (which wouldn't matter to me).
public class ConfigObject {
private static HashMap<String, String> someParameters = new HashMap<String, String>();
public HashMap<String, String> getParameters(){
return new HashMap<String, String>(someParameters);
//to some thread which will only ever iterate or get
}
public void setParameters(HashMap<String, String> newParameters){
//could be called by any thread at any time
someParameters = newParameters;
}
}
There are two problems here:
Visibility: after an update, the new someParameters reference might not be visible to other threads. To fix this, mark someParameters as volatile.
Performance: the get method creates a new HashMap on every call. To fix that, use the utility method Collections.unmodifiableMap(), which just wraps the original map and disallows the put/remove methods.
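Put together, the two fixes might look like the sketch below. The class is named ConfigSnapshot here to keep it apart from the question's ConfigObject. The defensive copy in setParameters is an extra assumption on my part: it costs one copy per reload (every twenty minutes) instead of one per read, and it stops the loader thread from changing a map that readers already hold.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

class ConfigSnapshot {
    // volatile: a reader that runs after setParameters sees the new reference
    // and the fully populated map behind it.
    private static volatile Map<String, String> someParameters = Collections.emptyMap();

    static Map<String, String> getParameters() {
        // No copy needed: the published map is unmodifiable and never changes.
        return someParameters;
    }

    static void setParameters(Map<String, String> newParameters) {
        // Copy once per reload, then publish the read-only view with a single write.
        someParameters = Collections.unmodifiableMap(new HashMap<>(newParameters));
    }

    public static void main(String[] args) {
        Map<String, String> loaded = new HashMap<>();
        loaded.put("GBP", "Pound sterling"); // placeholder data
        setParameters(loaded);
        System.out.println(getParameters().get("GBP")); // prints Pound sterling
    }
}
```

A reader that fetched the old map just before a reload keeps using the old values until its next call to getParameters, which the question says is acceptable.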
If I understand your problem correctly, you need to change/replace many parameters at once (atomically). Unfortunately, ConcurrentHashMap doesn't support atomic bulk inserts/updates.
To achieve this, you should use a shared ReadWriteLock. The advantage compared to Collections.synchronized... is that reads can proceed concurrently: if the read lock is held by one thread, readLock().lock() called from another thread will not block.
ReadWriteLock lock = new ReentrantReadWriteLock(); // ReadWriteLock is an interface, so instantiate an implementation
// on write:
lock.writeLock().lock();
try {
// write/update operation,
// e. g. clear map and write new values
} finally {
lock.writeLock().unlock();
}
// on read:
lock.readLock().lock();
try {
// read operation
} finally {
lock.readLock().unlock();
}
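Applied to the question's configuration map, the lock pattern above could be wrapped up roughly like this. LockedConfig, replaceAll and lookup are hypothetical names. Readers only hold the read lock for a single lookup, and the bulk replacement is atomic with respect to those lookups.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class LockedConfig {
    private final ReadWriteLock lock = new ReentrantReadWriteLock();
    private final Map<String, String> parameters = new HashMap<>();

    // Replaces the whole content in one step; readers wait while it runs.
    void replaceAll(Map<String, String> newParameters) {
        lock.writeLock().lock();
        try {
            parameters.clear();
            parameters.putAll(newParameters);
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Many threads can hold the read lock at once, so lookups do not block each other.
    String lookup(String key) {
        lock.readLock().lock();
        try {
            return parameters.get(key);
        } finally {
            lock.readLock().unlock();
        }
    }

    public static void main(String[] args) {
        LockedConfig config = new LockedConfig();
        Map<String, String> loaded = new HashMap<>();
        loaded.put("GBP", "Pound sterling"); // placeholder data
        config.replaceAll(loaded);
        System.out.println(config.lookup("GBP")); // prints Pound sterling
    }
}
```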
private static Map<Integer, String> map = null;
public static String getString(int parameter){
if(map == null){
map = new HashMap<Integer, String>();
//map gets filled here...
}
return map.get(parameter);
}
Is that code unsafe as multithreading goes?
As mentioned, it's definitely not safe. If the contents of the map are not based on the parameter in getString(), then you would be better served by initializing the map as a static initializer as follows:
private static final Map<Integer, String> MAP = new HashMap<Integer,String>();
static {
// Populate map here
}
The above code gets called once, when the class is loaded. It's completely thread safe (although future modifications to the map are not).
Are you trying to lazy load it for performance reasons? If so, this is much safer:
private static Map<Integer, String> map = null;
public synchronized static String getString(int parameter){
if(map == null){
map = new HashMap<Integer, String>();
//map gets filled here...
}
return map.get(parameter);
}
Using the synchronized keyword will make sure that only a single thread can execute the method at any one time, and that changes to the map reference are always propagated.
If you're asking this question, I recommend reading "Java Concurrency in Practice".
Race condition? Possibly.
If map is null, and two threads check if (map == null) at the same time, each would allocate a separate map. This may or may not be a problem, depending mainly on whether map is invariant. Even if the map is invariant, the cost of populating the map may also become an issue.
Memory leak? No.
The garbage collector will do its job correctly regardless of the race condition.
You do run the risk of initializing map twice in a multi-threaded scenario.
In a managed language, the garbage collector will eventually dispose of the no-longer-referenced instance. In an unmanaged language, you will never free the memory allocated for the overwritten map.
Either way, initialization should be properly protected so that multiple threads do not run initialization code at the same time.
One reason: The first thread could be in the middle of initializing the HashMap, while a second thread comes along, sees that map is not null, and merrily tries to use the partially-initialized data structure.
It is unsafe in multithreading case due to race condition.
But do you really need the lazy initialization for the map? If the map is going to be used anyway, it seems you could just initialize it eagerly.
The above code isn't thread-safe; as others have mentioned, your map can be initialized twice. You may be tempted to fix it by checking the field outside a synchronized block and then again inside it. This is known as "double-checked locking". Here is an article that describes the problems with this approach, as well as some potential fixes.
The simplest solution is to make the field a static field in a separate class:
class HelperSingleton {
static Helper singleton = new Helper();
}
It can also be fixed using the volatile keyword, as described in Bill Pugh's article.
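Here is a sketch of the separate-class idea applied to the question's map. The names LazyLookup, Holder and load, and the placeholder entries, are assumptions. The JVM initializes Holder the first time getString touches Holder.MAP, and class initialization is guaranteed to run once and be visible to all threads, so no explicit locking is needed.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

class LazyLookup {
    // Not initialized until the first access to Holder.MAP.
    private static final class Holder {
        static final Map<Integer, String> MAP = load();
    }

    private static Map<Integer, String> load() {
        Map<Integer, String> m = new HashMap<>();
        m.put(1, "one"); // placeholder data; the real map gets filled here
        m.put(2, "two");
        // Read-only view, so later modifications cannot reintroduce a race.
        return Collections.unmodifiableMap(m);
    }

    static String getString(int parameter) {
        return Holder.MAP.get(parameter);
    }

    public static void main(String[] args) {
        System.out.println(getString(1)); // prints one
    }
}
```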
No, this code is not safe for use by multiple threads.
There is a race condition in the initialization of the map. For example, multiple threads could initialize the map simultaneously and clobber each others' writes.
There are no memory barriers to ensure that modifications made by a thread are visible to other threads. For example, each thread could use its own copy of the map because they never "see" the values written by another thread.
There is no atomicity to ensure that invariants are preserved as the map is accessed concurrently. For example, a thread that's performing a get() operation could get into an infinite loop because another thread rehashed the buckets during a simultaneous put() operation.
If you are using Java 5 or later, use ConcurrentHashMap
ConcurrentHashMap JavaDoc
I recently saw a piece of code which used a ThreadLocal object and kept a ConcurrentHashMap within it.
Is there any logic/benefit in this, or is it redundant?
If the only reference to the concurrent hashmap resides in the ThreadLocal, the hashmap is obviously only referenced from a single thread. In such case I would say it is completely redundant.
However, it's not hard to imagine someone "sharing" the thread-locally stored hashmap with other threads:
ThreadLocal<ConcurrentHashMap<String, String>> tl = ...
// ...
final ConcurrentHashMap<String, String> props = tl.get();
EventQueue.invokeLater(new Runnable() {
public void run() {
props.put(key.getText(), val.getText());
}
});
Either he used ThreadLocal wrongly, or ConcurrentHashMap wrongly. The likelihood that the combination makes sense is close to 0.
In addition to what @aioobe said, consider the case of InheritableThreadLocal, in which the value of local is passed from a thread to each child thread that it creates.
And as @pst says, there is nothing to prevent the same value being used in different (non-inheritable) ThreadLocals.
In short, you have to do a thorough analysis of the thread locals, the way that they are initialized and the way that they are used before you can safely conclude that they don't need to be threadsafe.
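For contrast, here is a small sketch of the case where thread confinement really does hold. It assumes a hypothetical PerThreadProps class whose ThreadLocal creates a fresh map per thread and never hands the map to another thread. Under those assumptions a plain HashMap is enough, and the check shows that two threads receive different instances.

```java
import java.util.HashMap;
import java.util.Map;

class PerThreadProps {
    // Each thread that calls PROPS.get() lazily receives its own HashMap.
    private static final ThreadLocal<Map<String, String>> PROPS =
            ThreadLocal.withInitial(HashMap::new);

    // Returns true if the calling thread and a second thread got different maps.
    static boolean mapsAreDistinct() {
        Map<String, String> mine = PROPS.get();
        final Object[] other = new Object[1];
        Thread t = new Thread(() -> other[0] = PROPS.get());
        t.start();
        try {
            t.join(); // join makes the write to other[0] visible to this thread
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return other[0] != null && mine != other[0];
    }

    public static void main(String[] args) {
        System.out.println(mapsAreDistinct()); // prints true
    }
}
```

As soon as a map obtained this way is captured by a Runnable for another thread, as in the invokeLater example above, the confinement is gone and the map has to be thread-safe again.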