Inconsistent responses when using ConcurrentHashMap in a multi-threaded environment

We have a single thread that regularly updates a Map. And then we have multiple other threads that read this map.
This is how the update thread executes
private Map<String, SecondMap> firstMap = new ConcurrentHashMap<>();
private void refresh() //This method is called every X seconds by one thread only
{
List<SecondMap> newData = getLatestData();
final List<String> newEntries = new ArrayList<>();
for(SecondMap map : newData) {
newEntries.add(map.getName());
firstMap.put(map.getName(), map);
}
final Set<String> cachedEntries = firstMap.keySet();
for (final String cachedEntry : cachedEntries) {
if (!newEntries.contains(cachedEntry)) {
firstMap.remove(cachedEntry);
}
}
}
public Map<String, SecondMap> getFirstMap()//Other threads call this
{
return firstMap;
}
The SecondMap class looks like this
class SecondMap {
Map<String, SomeClass> data; //Not necessarily a concurrent hashmap
public Map<String, SomeClass> getData() {
return data;
}
}
Below is the simplified version of how reader threads access
public void getValue() {
Map<String, SecondMap> firstMap = getFirstMap();
SecondMap secondMap = firstMap.get("SomeKey");
secondMap.getData().get("AnotherKey");// This returns null
}
We are seeing that in other threads, when they iterate over the received firstMap, they sometimes get null values for some keys in the SecondMap. We never see null values for keys in firstMap itself, only for keys in the inner map returned by SecondMap.getData(). One thing we can rule out is getLatestData returning such data: it reads these entries from a database, and there can never be null values in the database in the first place. This also happens only occasionally. We are probably not handling the multi-threaded situation properly, but I am looking for an explanation of why this can happen.

Assuming the Map<String, SomeClass> data; inside the SecondMap class is a HashMap, you can get a null value for a key in two scenarios.
1. If the key maps to a null value. Example "Something" -> null.
2. If the key is not in the map in the first place.
So, without knowing much about where the data is coming from: if the data map of one of the SecondMap objects returned by getLatestData() doesn't contain the key being looked up ("AnotherKey" in your example) at all, get() will return null.
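To tell the two scenarios apart while debugging, containsKey can be checked whenever get returns null. The following is only a minimal sketch: the class name NullLookupDemo and the describe helper are made up for illustration, and String stands in for SomeClass.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of the question's code.
class NullLookupDemo {

    /** Explains what a lookup of the given key found in the map. */
    static String describe(Map<String, String> map, String key) {
        String value = map.get(key);
        if (value != null) {
            return "present";
        }
        // get() returned null: containsKey distinguishes the two scenarios.
        return map.containsKey(key) ? "mapped to null" : "absent";
    }

    public static void main(String[] args) {
        Map<String, String> data = new HashMap<>();
        data.put("Something", null); // scenario 1: key explicitly mapped to null
        data.put("Other", "value");

        System.out.println(describe(data, "Something"));
        System.out.println(describe(data, "AnotherKey")); // scenario 2: key never inserted
        System.out.println(describe(data, "Other"));
    }
}
```

Logging the result of such a check in the reader threads should show whether the key is missing from the inner map or really mapped to null.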
Also, since there's not enough information about how that Map<String, SomeClass> data; is updated, or whether it's mutable or immutable, you may have issues there. If that map is immutable and SecondMap itself is immutable, then it's probably OK. But if you are modifying it from multiple threads, you should make it a ConcurrentHashMap, and if you replace the reference with a new Map<String, SomeClass> from different threads, you should also make that reference inside SecondMap volatile.
class SecondMap {
volatile Map<String, SomeClass> data; //Not necessarily a concurrent hashmap
public Map<String, SomeClass> getData() {
return data;
}
}
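If you can go the immutable route instead, a sketch could look like the one below. This is an assumption about how SecondMap might be restructured, not your actual class: the name ImmutableSecondMap is hypothetical and String stands in for SomeClass. The fields are final and the map is copied and wrapped in the constructor, so nothing can change it after the object is handed to firstMap.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical immutable variant of SecondMap; String stands in for SomeClass.
final class ImmutableSecondMap {
    private final String name;
    private final Map<String, String> data;

    ImmutableSecondMap(String name, Map<String, String> source) {
        this.name = name;
        // Defensive copy first, so later changes to 'source' are not visible here,
        // then an unmodifiable wrapper so callers cannot mutate the copy.
        this.data = Collections.unmodifiableMap(new HashMap<>(source));
    }

    String getName() {
        return name;
    }

    Map<String, String> getData() {
        return data;
    }
}
```

With this shape, the refresh thread builds a complete new object for every entry and replaces the old one with put, so readers see either the old fully populated instance or the new one, never one that is still being filled.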
If you'd like to understand in depth when to use the volatile keyword and the intricacies of data races, there's a section about it in this online course: https://www.udemy.com/java-multithreading-concurrency-performance-optimization/?couponCode=CONCURRENCY
I have not seen any resource that explains and demonstrates it better. Unfortunately, many articles online explain it incorrectly.
I hope that, from the little information in the question, I was able to point you in some directions that might help. Please share more information if none of that works, or if something does work, please let me know. I'm curious to know what it was :)

Related

Confused on the difference between ConcurrentHashMap and HashMap behavior in this example

I'm trying to understand how ConcurrentHashMap works. I found an example, but I'm not able to understand it. Here's its code:
Map<String, Object> myData = new HashMap<String, Object>();
myData.put("A", 1);
myData.put("B", 2);
for (String key : myData.keySet()) {
myData.remove(key);
}
This will throw an exception ConcurrentModificationException at runtime.
However, this code using ConcurrentHashMap will work correctly:
Map<String, Object> myData = new ConcurrentHashMap<String, Object>();
myData.put("A", 1);
myData.put("B", 2);
for (String key : myData.keySet()) {
myData.remove(key);
}
Can someone explain to me why ConcurrentHashMap allows keys to be removed during iteration while HashMap throws an exception? Thanks
That's just one of the features of ConcurrentHashMap. To quote from the docs:
Similarly, Iterators, Spliterators and Enumerations return elements
reflecting the state of the hash table at some point at or since the
creation of the iterator/enumeration. They do not throw
ConcurrentModificationException.
ConcurrentHashMap does not really do this to support your use case, however. It's done to allow iteration in one thread to happen concurrently with modifications made in other threads.
If this is your only reason for using ConcurrentHashMap, then you should probably reconsider, because it carries more overhead than a plain HashMap when no concurrency is involved. You are better off just making a copy of the key set before using it like this:
Map<String, Object> myData = new HashMap<String, Object>();
myData.put("A", 1);
myData.put("B", 2);
for(String key: myData.keySet().toArray(new String[0]))
myData.remove(key);
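For comparison, here is a small sketch that puts the failing loop next to two ways of removing entries from a plain HashMap safely. The class and method names are made up for this example. Besides the snapshot copy shown above, keySet().removeIf (available since Java 8) removes through the map's own iterator, so it doesn't trip the modification check either.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class; names are illustrative only.
class RemoveWhileIteratingDemo {

    static Map<String, Object> sample() {
        Map<String, Object> myData = new HashMap<>();
        myData.put("A", 1);
        myData.put("B", 2);
        return myData;
    }

    /** Returns true if removing inside a for-each over keySet() fails. */
    static boolean directRemovalThrows() {
        Map<String, Object> myData = sample();
        try {
            for (String key : myData.keySet()) {
                myData.remove(key); // structural change behind the iterator's back
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    /** Iterates over a snapshot of the keys; returns the remaining size. */
    static int removeViaSnapshot() {
        Map<String, Object> myData = sample();
        for (String key : new ArrayList<>(myData.keySet())) {
            myData.remove(key);
        }
        return myData.size();
    }

    /** Removes through the key-set view itself; returns the remaining size. */
    static int removeViaRemoveIf() {
        Map<String, Object> myData = sample();
        myData.keySet().removeIf(key -> true); // replace the predicate with a real filter
        return myData.size();
    }
}
```

Both safe variants leave the map empty here; the removeIf version avoids the extra copy.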

ConcurrentHashMap atomic operation to remove all entries except one

Given a ConcurrentHashMap<String, String> that contains the following entries:
this.map = new ConcurrentHashMap<>();
this.map.put("user", "user description");
this.map.put("session", "session description");
this.map.put("test", "test description");
The map is accessed by multiple threads.
How can I remove all keys except session in an atomic way?
Will this code work as I expect, without race conditions? Is the forEach method atomic? Is there any other way to achieve this atomically?
map.forEach((key, value) -> {
if (!key.equals("session")) {
map.remove(key);
}
});
Your forEach() will not happen atomically, since each remove() call is synchronized independently.
You could try doing this by creating a new map and replacing the old one with it:
ConcurrentMap<String, String> newMap = new ConcurrentHashMap<>();
newMap.put("session", this.map.get("session"));
this.map = newMap;
Threads viewing this.map before the switch will have the old view, but the atomic "removal" can be thought of as taking place when you assign the new map. The only other issue is when another thread modifies the value associated with "session" in the original map between the 2nd and 3rd lines (or if that key isn't even present), but as you said in your comment that never happens.
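A slightly fuller sketch of the swap idea follows, under a few assumptions of mine: the class name SessionStore and its methods are hypothetical, the field is declared volatile so that other threads are guaranteed to see the new reference, and a missing key is skipped because ConcurrentHashMap does not accept null values.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical holder class illustrating the "build a new map, then swap" idea.
class SessionStore {
    // volatile: a reader that loads this field after the swap sees the new map.
    private volatile ConcurrentMap<String, String> map = new ConcurrentHashMap<>();

    void put(String key, String value) {
        map.put(key, value);
    }

    String get(String key) {
        return map.get(key);
    }

    int size() {
        return map.size();
    }

    /** Replaces the current map with one that holds only the given key, if present. */
    void retainOnly(String keyToKeep) {
        ConcurrentMap<String, String> fresh = new ConcurrentHashMap<>();
        String value = map.get(keyToKeep);
        if (value != null) { // ConcurrentHashMap rejects null values
            fresh.put(keyToKeep, value);
        }
        map = fresh; // a single reference write: readers see the old or the new map
    }
}
```

Note that anything another thread writes to the old map between the get and the assignment is lost, which is the same caveat as above.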

Does code in methods always get executed?

I have implemented an abstract method of the superclass. Does the code in the method always get executed, or is there some kind of cache that knows the code will never change?
I want to know if there are performance issues with my code. Is it better to create the map as a member variable and then return it in the method?
@Override
protected Map<String, Function<Information, String>> getDefinitionMap() {
final Map<String, Function<Information, String>> map = new LinkedHashMap<>();
map.put("Name", t -> t.getName());
map.put("ID", t -> t.getId());
return map;
}
Each time the method getDefinitionMap() is called, a new LinkedHashMap instance is created. There is no "implicit caching".
You can avoid that, if you create the map once, store it in a member variable and return this. You may want to make it unmodifiable so that it cannot be changed by callers. (see java.util.Collections.unmodifiableMap)
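A sketch of that suggestion might look like this. The Information class below is a hypothetical stand-in with just the two getters used in the question, and Definitions stands in for your subclass (without the abstract superclass, to keep the example self-contained).

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical stand-in for the Information type from the question.
class Information {
    private final String name;
    private final String id;

    Information(String name, String id) {
        this.name = name;
        this.id = id;
    }

    String getName() { return name; }
    String getId() { return id; }
}

// Hypothetical stand-in for the subclass that implements getDefinitionMap().
class Definitions {
    // Built once when the class is initialized, then shared by all calls.
    private static final Map<String, Function<Information, String>> DEFINITION_MAP =
            buildDefinitionMap();

    private static Map<String, Function<Information, String>> buildDefinitionMap() {
        Map<String, Function<Information, String>> map = new LinkedHashMap<>();
        map.put("Name", Information::getName);
        map.put("ID", Information::getId);
        // Read-only view, so callers cannot alter the shared instance.
        return Collections.unmodifiableMap(map);
    }

    Map<String, Function<Information, String>> getDefinitionMap() {
        return DEFINITION_MAP; // no new allocation per call
    }
}
```

Whether this is worth it depends on how often the method is called; for a method invoked only occasionally, creating a small map each time is unlikely to be measurable.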

Java hashmap readonly thread safety

I have this code that has a shared hash map initialized in a static block. I don't expose the hashmap, and it's used read-only (get and containsKey).
I wanted to make sure if this is thread-safe.
public class MyClass {
private static final Map<String, MyObject> myMap;
static {
myMap = new MyLoader().load();
}
public MyClass() {
if (containsKey(someKey)) {
// do something
}
myMap.get(something);
}
static boolean containsKey(String key) {
// do some other stuff
return myMap.containsKey(key);
}
}
Assuming that new MyLoader().load() returns a map that is completely initialized with all data and which is never modified after that, then it is safe for all threads to retrieve data from this map concurrently. The Javadoc for HashMap says: "If multiple threads access a hash map concurrently, and at least one of the threads modifies the map structurally, it must be synchronized externally." Therefore, if no thread is modifying the map, then it doesn't have to be synchronized.
As a safety measure, your load() method should enforce immutability:
public Map<String, MyObject> load() {
Map<String, MyObject> mymap = new HashMap<>();
mymap.put(...);
...
return Collections.unmodifiableMap(mymap);
}
This way, you don't have to worry that some thread in some code you're unfamiliar with might inadvertently modify the map. It won't be able to.

How to pass in initialized HashMap as param?

How can I pass in a new HashMap in the most canonical (simplest, most concise) form?
// 1. ? (of course this doesn't work)
passMyHashMap(new HashMap<String, String>().put("key", "val"));
// 2. ? (of course this doesn't work)
passMyHashMap(new HashMap<String, String>(){"key", "val"});
void passMyHashMap(HashMap<?, ?> hm) {
// do stuff with my hashMap
}
Create it, initialize it, then pass it:
Map<String,String> myMap = new HashMap<String,String>();
myMap.put("key", "val");
passMyHashMap(myMap);
You could use the "double curly" style that David Wallace mentions in a comment, I suppose:
passMyHashMap(new HashMap<String,String>(){{
put("x", "y");
put("a", "b");
}});
This essentially derives a new class from HashMap and sets up values in the initializer block. I don't particularly care for it (hence originally not mentioning it), but it doesn't really cause any problems per se, it's more of a style preference (it does spit out an extra .class file, although in most cases that's not a big deal). You could compress it all to one line if you'd like, but readability will suffer.
You can't call put and pass the HashMap into the method at the same time, because the put method doesn't return the HashMap. It returns the old value from the old mapping, if it existed.
You must create the map, populate it separately, then pass it in. It's more readable that way anyway.
HashMap<String, String> map = new HashMap<>();
map.put("key", "val");
passMyHashMap(map);
HashMap<K,V>.put
public V put(K key, V value)
Associates the specified value with the specified key in this map. If
the map previously contained a mapping for the key, the old value is
replaced.
Returns the previous value associated with key, or null if there was
no mapping for key. (A null return can also indicate that the map
previously associated null with key.)
As you can see, it does not return the type HashMap<?, ?>
You can't do that. What you can do is create a factory that allows you to do so.
public class MapFactory{
public static Map<String, String> put(final Map<String, String> map, final String key, final String value){
map.put(key, value);
return map;
}
}
passMyHashMap(MapFactory.put(new HashMap<String, String>(),"key", "value"));
That said, I can't imagine a situation that would need such an implementation, and I don't much like it. I would recommend that you create your map, put the values in, and only then pass it to your method.
Map<String, String> map = new HashMap<String, String>();
map.put("key","value");
passMyHashMap(map);
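If you are on Java 9 or later, one more option is to build the contents with Map.of and copy them into a HashMap in the same expression. A minimal sketch, assuming a passMyHashMap like the one in the question (here it returns the size only so the result can be checked); the class name MapLiteralDemo is made up:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class; passMyHashMap mirrors the method from the question.
class MapLiteralDemo {

    static int passMyHashMap(HashMap<?, ?> hm) {
        // do stuff with the map; returning the size just for demonstration
        return hm.size();
    }

    static HashMap<String, String> initialized() {
        // Map.of returns an immutable map, so copy it into a mutable HashMap.
        return new HashMap<String, String>(Map.of("key", "val"));
    }

    public static void main(String[] args) {
        System.out.println(passMyHashMap(initialized()));
        // Or inline, without the helper:
        System.out.println(passMyHashMap(new HashMap<String, String>(Map.of("x", "y", "a", "b"))));
    }
}
```

Keep in mind that Map.of rejects null keys and values and duplicate keys, and its overloads only go up to ten pairs; beyond that, Map.ofEntries is the equivalent.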
