I've read about double-checked locking and its drawbacks. My question is whether it can be considered safe when the map is a synchronizedMap.
Here is my code:
public class EntityUtils
{
private static final Map<String, Map<String, String>> searchMap = Collections.synchronizedMap(new HashMap<String, Map<String, String>>());
private static Map<String, String> getSearchablePathMap(String key)
{
Map<String, String> pathMap = searchMap.get(key);
if(pathMap != null) return pathMap;
synchronized(searchMap)
{
// double check locking (safe for synchronizedMap?)
pathMap = searchMap.get(key);
if(pathMap != null) return pathMap;
pathMap = new HashMap<>();
pathMap.put(..., ...);
...
// heavy map population operations
...
pathMap = Collections.unmodifiableMap(pathMap);
searchMap.put(key, pathMap);
}
return pathMap;
}
}
Suggestions or improvements are welcome.
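It's safe, but it doesn't have any advantages in this case. It is safe because the map returned by Collections.synchronizedMap locks on itself, so your synchronized(searchMap) block uses the same lock as get() and put().
Double-checked locking is used to avoid costly synchronization on the most frequent code path, but in your case get() is synchronized as well, so you actually have two synchronized sections instead of one.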
java.util.concurrent.ConcurrentHashMap has a putIfAbsent method that might help.
There is a slight obstacle in your scenario in that if the key is absent then your code both creates and populates a new map. There is an overhead to using putIfAbsent on the "not absent" execution path if the value to be set in the absent case has a significant construction cost, but if the construction cost is cheap or can be amortised over many calls then this might be useful.
Using a method from the standard library is very likely to be safer than rolling your own unless you have particular expertise in Java concurrency.
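When the value is expensive to build, ConcurrentHashMap.computeIfAbsent (Java 8+) may fit better than putIfAbsent, because the mapping function only runs when the key is missing. The following is a minimal sketch, not a drop-in replacement: EntityUtilsSketch and buildPathMap are hypothetical names, and buildPathMap stands in for the heavy population step, which the question does not show.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class EntityUtilsSketch {
    private static final Map<String, Map<String, String>> SEARCH_MAP = new ConcurrentHashMap<>();

    // The mapping function is applied at most once per key; hits do not run it at all.
    static Map<String, String> getSearchablePathMap(String key) {
        return SEARCH_MAP.computeIfAbsent(key, EntityUtilsSketch::buildPathMap);
    }

    // Hypothetical placeholder for the heavy map population in the question.
    static Map<String, String> buildPathMap(String key) {
        Map<String, String> pathMap = new HashMap<>();
        pathMap.put("key", key);
        return Collections.unmodifiableMap(pathMap);
    }

    public static void main(String[] args) {
        Map<String, String> first = getSearchablePathMap("user");
        Map<String, String> second = getSearchablePathMap("user");
        System.out.println(first == second); // both calls return the cached instance
    }
}
```

One caveat: while the function runs, some other updates to the same map may block, and the function must not modify the map itself, so this suits population code that does not call back into the cache.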
I'm trying to understand how ConcurrentHashMap works. I found an example, but I'm not able to understand it. Here's its code:
Map<String, Object> myData = new HashMap<String, Object>();
myData.put("A", 1);
myData.put("B", 2);
for (String key : myData.keySet()) {
myData.remove(key);
}
This throws a ConcurrentModificationException at runtime.
However, this code using ConcurrentHashMap will work correctly:
Map<String, Object> myData = new ConcurrentHashMap<String, Object>();
myData.put("A", 1);
myData.put("B", 2);
for (String key : myData.keySet()) {
myData.remove(key);
}
Can someone explain why ConcurrentHashMap allows removing keys during iteration while HashMap throws an exception? Thanks
That's just one of the features of ConcurrentHashMap. To quote from the docs:
Similarly, Iterators, Spliterators and Enumerations return elements
reflecting the state of the hash table at some point at or since the
creation of the iterator/enumeration. They do not throw
ConcurrentModificationException.
ConcurrentHashMap does not really do this to support your use case, however. It's done to allow iteration in one thread to happen concurrently with modifications made in other threads.
If this is your only reason for using ConcurrentHashMap, then you should probably reconsider, because it is a lot more expensive than HashMap. You are better off just making a copy of the key set before using it like this:
Map<String, Object> myData = new HashMap<String, Object>();
myData.put("A", 1);
myData.put("B", 2);
for(String key: myData.keySet().toArray(new String[0]))
myData.remove(key);
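As an alternative to copying the key set, a plain HashMap also allows removal through the iterator itself, or through removeIf on the key set (Java 8+). This is a different technique from the copy shown above, sketched here with hypothetical class and method names:

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

final class RemoveDuringIteration {
    // Removing through the iterator keeps its modification count in sync,
    // so no ConcurrentModificationException is thrown.
    static void removeAllKeys(Map<String, Object> map) {
        Iterator<String> it = map.keySet().iterator();
        while (it.hasNext()) {
            it.next();
            it.remove();
        }
    }

    // removeIf on the key set view removes the matching entries from the map.
    static void removeKeysStartingWith(Map<String, Object> map, String prefix) {
        map.keySet().removeIf(key -> key.startsWith(prefix));
    }

    public static void main(String[] args) {
        Map<String, Object> myData = new HashMap<>();
        myData.put("A", 1);
        myData.put("B", 2);
        removeAllKeys(myData);
        System.out.println(myData.isEmpty());
    }
}
```

Neither variant makes the map safe for use from several threads; they only avoid the exception in single-threaded iteration.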
Given a ConcurrentHashMap<String, String> that contains the following entries:
this.map = new ConcurrentHashMap<>();
this.map.put("user", "user description");
this.map.put("session", "session description");
this.map.put("test", "test description");
The map is accessed by multiple threads.
How can I remove all keys except "session" atomically?
Will this code work as I expect, without race conditions? Is the forEach method atomic? Is there any other way to achieve this atomically?
map.forEach((key, value) -> {
if (!key.equals("session")) {
map.remove(key);
}
});
Your forEach() will not happen atomically, since each remove() call is synchronized independently.
You could try doing this by creating a new map and replacing the old one with it:
ConcurrentMap<String, String> newMap = new ConcurrentHashMap<>();
newMap.put("session", this.map.get("session"));
this.map = newMap;
Threads viewing this.map before the switch will have the old view, but the atomic "removal" can be thought of as taking place when you assign the new map. The only other issue is when another thread modifies the value associated with "session" in the original map between the 2nd and 3rd lines (or if that key isn't even present), but as you said in your comment that never happens.
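A minimal sketch of the swap approach follows, under two assumptions that are not in the question: the map lives in a field of a hypothetical SessionRegistry class, and that field is declared volatile so that readers in other threads see the new map after the assignment.

```java
import java.util.concurrent.ConcurrentHashMap;

final class SessionRegistry {
    // volatile (an assumption of this sketch) makes the swapped reference visible to other threads.
    private volatile ConcurrentHashMap<String, String> map = new ConcurrentHashMap<>();

    void put(String key, String value) {
        map.put(key, value);
    }

    String get(String key) {
        return map.get(key);
    }

    // Builds a replacement holding only "session" and publishes it with a single reference write.
    void retainOnlySession() {
        ConcurrentHashMap<String, String> next = new ConcurrentHashMap<>();
        String session = map.get("session");
        if (session != null) { // ConcurrentHashMap rejects null values
            next.put("session", session);
        }
        map = next;
    }

    public static void main(String[] args) {
        SessionRegistry registry = new SessionRegistry();
        registry.put("user", "user description");
        registry.put("session", "session description");
        registry.put("test", "test description");
        registry.retainOnlySession();
        System.out.println(registry.get("user") + " / " + registry.get("session"));
    }
}
```

Writes that land in the old map between the copy and the assignment are lost, so this only suits cases where that window is acceptable.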
We have a single thread that regularly updates a Map. And then we have multiple other threads that read this map.
This is how the update thread executes
private Map<String, SecondMap> firstMap = new ConcurrentHashMap<>();
private void refresh() //This method is called every X seconds by one thread only
{
List<SecondMap> newData = getLatestData();
final List<String> newEntries = new ArrayList<>();
for(SecondMap map : newData) {
newEntries.add(map.getName());
firstMap.put(map.getName(), map);
}
final Set<String> cachedEntries = firstMap.keySet();
for (final String cachedEntry : cachedEntries) {
if (!newEntries.contains(cachedEntry)) {
firstMap.remove(cachedEntry);
}
}
}
public Map<String, SecondMap> getFirstMap()//Other threads call this
{
return firstMap;
}
The SecondMap class looks like this
class SecondMap {
Map<String, SomeClass> data; //Not necessarily a concurrent hashmap
public Map<String, SomeClass> getData() {
return data;
}
}
Below is the simplified version of how reader threads access
public void getValue() {
Map<String, SecondMap> firstMap = getFirstMap();
SecondMap secondMap = firstMap.get("SomeKey");
secondMap.getData().get("AnotherKey");// This returns null
}
We are seeing that when other threads iterate over the firstMap they receive, they sometimes get null values for some keys in the SecondMap. We don't see any null values for keys in firstMap, only for keys in the second map. One thing we can rule out is getLatestData returning such data: it reads these entries from a database, and there can never be null values in the database in the first place. We also see that this happens only occasionally. We are probably not handling the multi-threaded situation properly, but I am looking for an explanation of why this can happen.
Assuming the Map<String, SomeClass> data; inside the SecondMap class is a HashMap, you can get a null value for a key in two scenarios.
1. If the key maps to a null value. Example "Something" -> null.
2. If the key is not in the map in the first place.
So, without knowing much about where the data comes from: if one of the maps returned by getLatestData() doesn't contain the key "SomeKey" at all, the lookup will return null.
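To tell the two scenarios apart while debugging, containsKey can be checked alongside get. A small sketch with a hypothetical helper (NullLookupDemo and describe are illustration names, not part of the code in the question):

```java
import java.util.HashMap;
import java.util.Map;

final class NullLookupDemo {
    // Distinguishes "key mapped to null" from "key not in the map".
    static String describe(Map<String, String> map, String key) {
        if (map.get(key) != null) {
            return "present";
        }
        return map.containsKey(key) ? "mapped to null" : "absent";
    }

    public static void main(String[] args) {
        Map<String, String> data = new HashMap<>();
        data.put("Something", null);
        data.put("AnotherKey", "value");
        System.out.println(describe(data, "Something"));
        System.out.println(describe(data, "Missing"));
        System.out.println(describe(data, "AnotherKey"));
    }
}
```

This check is only meaningful on a map that is not being modified concurrently; on an unsynchronized HashMap that another thread is writing to, any result is unreliable.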
Also, since there's not enough information about how that Map<String, SomeClass> data is updated, or whether it's mutable or immutable, you may have issues there. If that map is immutable and SecondMap is immutable, then it's probably OK. But if you are modifying it from multiple threads, you should make it a ConcurrentHashMap, and if you replace the reference with a new Map<String, SomeClass> from different threads inside SecondMap, you should also make that reference volatile.
class SecondMap {
volatile Map<String, SomeClass> data; //Not necessarily a concurrent hashmap
public Map<String, SomeClass> getData() {
return data;
}
}
If you'd like to understand in depth when to use the volatile keyword and all the intricacies of data races, there's a section about it in this online course: https://www.udemy.com/java-multithreading-concurrency-performance-optimization/?couponCode=CONCURRENCY
I have not seen any resource that explains and demonstrates it better. Unfortunately, there are many articles online that explain it incorrectly.
I hope from the little information in the question I was able to point you to some directions that might help. Please share more information if nothing of that works, or if something does work, please let me know, I'm curious to know what it was :)
I have this code with a shared hash map initialized in a static block. I don't expose the map, and it's used read-only (get and containsKey).
I want to make sure this is thread-safe.
public class MyClass {
private static final Map<String, MyObject> myMap;
static {
myMap = new MyLoader().load();
}
public MyClass() {
if (containsKey(someKey)) {
// do something
}
myMap.get(something);
}
static boolean containsKey(String key) {
// do some other stuff
return myMap.containsKey(key);
}
}
Assuming that new MyLoader().load() returns a map that is completely initialized with all data and which is never modified after that, then it is safe for all threads to retrieve data from this map concurrently. The Javadoc for HashMap says: "If multiple threads access a hash map concurrently, and at least one of the threads modifies the map structurally, it must be synchronized externally." Therefore, if no thread is modifying the map, then it doesn't have to be synchronized.
As a safety measure, your load() method should enforce immutability:
public Map<String, MyObject> load() {
Map<String, MyObject> mymap = new HashMap<>();
mymap.put(...);
...
return Collections.unmodifiableMap(mymap);
}
This way, you don't have to worry that some thread in some code you're unfamiliar with might inadvertently modify the map. It won't be able to.
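To illustrate what "won't be able to" means in practice, here is a small sketch with a hypothetical LoaderSketch class and made-up entries. It shows that a write through the unmodifiable view is rejected with UnsupportedOperationException.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

final class LoaderSketch {
    // The backing HashMap never escapes this method, so nothing can modify it later.
    static Map<String, Integer> load() {
        Map<String, Integer> backing = new HashMap<>();
        backing.put("a", 1);
        return Collections.unmodifiableMap(backing);
    }

    // Returns true if the map refuses the write.
    static boolean rejectsWrites(Map<String, Integer> map) {
        try {
            map.put("b", 2);
            return false;
        } catch (UnsupportedOperationException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        Map<String, Integer> map = load();
        System.out.println(rejectsWrites(map));
        System.out.println(map.get("a"));
    }
}
```

Collections.unmodifiableMap returns a view, so the guarantee holds only as long as no code keeps a reference to the backing map and changes it.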
Is it possible for a HashMap to keep its original key/value pair when a duplicate key is entered?
For example, let's say I have something like this:
Map<String, String> map = new HashMap<String, String>();
map.put("username","password1");
map.put("username","password2");
I want the original key/value pair (username, password1) to be kept and not be overwritten by (username, password2).
Is this possible? If not, how can I eliminate duplicate entries from being put into the map?
As mentioned, you can use putIfAbsent if you use Java 8.
If you are on an older Java version you can use a ConcurrentHashMap instead, which has a putIfAbsent method.
Of course, you get the additional overhead of thread safety, but if you are not writing an extremely performance sensitive application it should not be a concern.
If you are not on Java 8, you have some options.
The most straightforward is to repeat this verbose check wherever you need it:
Object existingValue = map.get(key);
if(existingValue == null){
map.put(key,newValue);
}
You could have a utility method to do this for you
public <T,V> void addToMapIfAbsent(Map<T,V> map, T key, V value){
V oldValue = map.get(key);
if(oldValue == null){
map.put(key,value);
}
}
Or extend a flavor of Map and add it there.
public class MyMap<T,V> extends HashMap<T,V>{
public void putIfNotExist(T key, V value){
V oldValue = get(key);
if(oldValue == null){
put(key,value);
}
}
}
Which allows you to create a Map thusly
Map<String,String> map = new MyMap<>();
EDIT: Although, to get to the MyMap method, of course, you'll need to have the map variable declared as that type. So anywhere you need that, you'll have to take an instance of MyMap instead of Map.
https://docs.oracle.com/javase/8/docs/api/java/util/HashMap.html#putIfAbsent-K-V-
If you are using Java 8, you can use putIfAbsent.
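A short sketch of putIfAbsent applied to the example from the question (PutIfAbsentDemo is a hypothetical class name): the second call leaves the existing mapping alone and returns the value that is already stored.

```java
import java.util.HashMap;
import java.util.Map;

final class PutIfAbsentDemo {
    static Map<String, String> buildMap() {
        Map<String, String> map = new HashMap<>();
        map.putIfAbsent("username", "password1"); // key absent: stores the value, returns null
        map.putIfAbsent("username", "password2"); // key present: map unchanged, returns "password1"
        return map;
    }

    public static void main(String[] args) {
        System.out.println(buildMap().get("username")); // prints "password1"
    }
}
```

Note that putIfAbsent treats a key mapped to null as absent and will overwrite it, which matches the null checks in the pre-Java 8 versions above.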