I have this code that has a shared hash map initialized in a static block. I don't expose the map, and it's used read-only (get and containsKey).
I want to confirm that this is thread-safe.
public class MyClass {
private static final Map<String, MyObject> myMap;
static {
myMap = new MyLoader().load();
}
public MyClass() {
if (containsKey(someKey)) {
// do something
}
myMap.get(something);
}
static boolean containsKey(String key) {
// do some other stuff
return myMap.containsKey(key);
}
}
Assuming that new MyLoader().load() returns a map that is completely initialized with all data and is never modified after that, it is safe for all threads to retrieve data from this map concurrently. The Javadoc for HashMap says: "If multiple threads access a hash map concurrently, and at least one of the threads modifies the map structurally, it must be synchronized externally." Therefore, if no thread modifies the map, it doesn't have to be synchronized. The static initializer also publishes the map safely: class initialization completes before any thread can use MyClass, so every thread sees the fully populated map.
As a safety measure, your load() method should enforce immutability:
public Map<String, MyObject> load() {
Map<String, MyObject> mymap = new HashMap<>();
mymap.put(...);
...
return Collections.unmodifiableMap(mymap);
}
This way, you don't have to worry that some thread in some code you're unfamiliar with might inadvertently modify the map. It won't be able to.
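To make that concrete, here is a minimal sketch of the pattern. The class name ReadOnlyRegistry, the Integer values and the rejectsWrites() helper are made up for illustration and are not from the question; the point is only that the map is populated once in the static initializer, wrapped, and then rejects any later write.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical example class; not the asker's MyClass.
final class ReadOnlyRegistry {
    // Populated once during class initialization, then never modified.
    private static final Map<String, Integer> LOOKUP;

    static {
        Map<String, Integer> m = new HashMap<>();
        m.put("a", 1);
        m.put("b", 2);
        // The mutable HashMap never escapes; only the read-only wrapper is kept.
        LOOKUP = Collections.unmodifiableMap(m);
    }

    static Integer get(String key) {
        return LOOKUP.get(key);
    }

    static boolean containsKey(String key) {
        return LOOKUP.containsKey(key);
    }

    // Present only to show that writes through the wrapper are rejected.
    static boolean rejectsWrites() {
        try {
            LOOKUP.put("c", 3);
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }
}
```

Because the only reference to the underlying HashMap is the local variable in the static block, nothing can modify the map after the class has been initialized.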
Related
Given a ConcurrentHashMap<String, String> that contains the following entries:
this.map = new ConcurrentHashMap<>();
this.map.put("user", "user description");
this.map.put("session", "session description");
this.map.put("test", "test description");
The map is accessed by multiple threads.
How can I remove all keys except session atomically?
Will this code work as I expect, without race conditions? Is forEach method atomic? Is there any other way to achieve this atomically?
map.forEach((key, value) -> {
if (!key.equals("session")) {
map.remove(key);
}
});
Your forEach() will not happen atomically: each remove() call is atomic on its own, but the traversal as a whole is not, so other threads can observe the map with only some of the keys removed.
You could try doing this by creating a new map and replacing the old one with it:
ConcurrentMap<String, String> newMap = new ConcurrentHashMap<>();
newMap.put("session", this.map.get("session"));
this.map = newMap;
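Threads that read this.map before the switch keep seeing the old map, but the atomic "removal" can be thought of as taking place when you assign the new map (the field should be volatile so that other threads are guaranteed to see the new reference). The only other issue arises if another thread modifies the value associated with "session" in the original map between the 2nd and 3rd lines (or if that key isn't present, in which case put() throws a NullPointerException because ConcurrentHashMap does not accept null values), but as you said in your comment, that never happens.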
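A rough sketch of the swap approach, wrapped in a small class so it can be run on its own. DescriptionStore and retainOnly() are names I made up; the sketch assumes, as in the question, that nobody writes to the old map while the replacement is being built, because such writes would be lost.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical holder class for illustration.
final class DescriptionStore {
    // volatile so that readers are guaranteed to see the newly assigned map.
    private volatile ConcurrentMap<String, String> map = new ConcurrentHashMap<>();

    void put(String key, String value) {
        map.put(key, value);
    }

    String get(String key) {
        return map.get(key);
    }

    int size() {
        return map.size();
    }

    // Replaces the whole map with one that keeps only the given key.
    // Readers see either the old map or the new one, never a half-pruned map.
    void retainOnly(String keyToKeep) {
        ConcurrentMap<String, String> old = map;
        ConcurrentMap<String, String> replacement = new ConcurrentHashMap<>();
        String value = old.get(keyToKeep);
        if (value != null) { // ConcurrentHashMap rejects null values
            replacement.put(keyToKeep, value);
        }
        map = replacement; // single volatile write: this is the "atomic removal"
    }
}
```

If concurrent writers do exist, this is not enough on its own; you would need a lock shared by writers and retainOnly().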
We have a single thread that regularly updates a Map. And then we have multiple other threads that read this map.
This is how the update thread executes
private Map<String, SecondMap> firstMap = new ConcurrentHashMap<>();
private void refresh() //This method is called every X seconds by one thread only
{
List<SecondMap> newData = getLatestData();
final List<String> newEntries = new ArrayList<>();
for(SecondMap map : newData) {
newEntries.add(map.getName());
firstMap.put(map.getName(), map);
}
final Set<String> cachedEntries = firstMap.keySet();
for (final String cachedEntry : cachedEntries) {
if (!newEntries.contains(cachedEntry)) {
firstMap.remove(cachedEntry);
}
}
}
public Map<String, SecondMap> getFirstMap()//Other threads call this
{
return firstMap;
}
The SecondMap class looks like this
class SecondMap {
Map<String, SomeClass> data; //Not necessarily a concurrent hashmap
public Map<String, SomeClass> getData() {
return data;
}
}
Below is a simplified version of how the reader threads access it
public void getValue() {
Map<String, SecondMap> firstMap = getFirstMap();
SecondMap secondMap = firstMap.get("SomeKey");
secondMap.getData().get("AnotherKey");// This returns null
}
We are seeing that in other threads, when they iterate over the received firstMap, sometimes they get null values for some keys in the SecondMap. We don't see any null values for keys in the firstMap, but we do see null values for keys in the second map. One thing we can rule out is that the method getLatestData returns such data. It reads from a database and returns these entries, and there can never be null values in the database in the first place. We also see that this happens only occasionally. We are probably not handling the multi-threaded situation properly, but I am looking for an explanation of why this can happen.
Assuming the Map<String, SomeClass> data; inside the SecondMap class is a HashMap, you can get a null value for a key in two scenarios.
1. If the key maps to a null value. Example "Something" -> null.
2. If the key is not in the map in the first place.
So, without knowing much about where the data is coming from: if one of the maps returned by getLatestData() doesn't contain the requested key at all, get() returns null.
Also, since there's not enough information about how that Map<String, SomeClass> data is updated, or whether it's mutable or immutable, you may have issues there. If that map is immutable and the SecondMap is immutable, then it's most probably OK. But if you are modifying it from multiple threads, you should make it a ConcurrentHashMap, and if you update the reference to a new Map<String, SomeClass> data from different threads inside the SecondMap, you should also make that reference volatile.
class SecondMap {
volatile Map<String, SomeClass> data; //Not necessarily a concurrent hashmap
public Map<String, SomeClass> getData() {
return data;
}
}
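Another option, if the inner maps don't need to be changed in place, is to never mutate anything the readers can see: build a complete immutable snapshot on each refresh and publish it with one volatile write. This is only a sketch under that assumption; SnapshotCache, refresh(...) taking the data as a parameter, and the String value type are stand-ins I made up, not the code from the question.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical replacement for the firstMap/SecondMap pair, for illustration only.
final class SnapshotCache {
    // Readers always get a complete, immutable snapshot through this reference.
    private volatile Map<String, Map<String, String>> snapshot = Collections.emptyMap();

    // Called by the single updater thread with freshly loaded data.
    void refresh(Map<String, Map<String, String>> latest) {
        Map<String, Map<String, String>> copy = new HashMap<>();
        for (Map.Entry<String, Map<String, String>> e : latest.entrySet()) {
            // Defensive copy so later changes to the input cannot leak in.
            Map<String, String> inner = new HashMap<String, String>(e.getValue());
            copy.put(e.getKey(), Collections.unmodifiableMap(inner));
        }
        // A single volatile write publishes the outer map and all inner maps.
        snapshot = Collections.unmodifiableMap(copy);
    }

    // Called by reader threads; the returned map never changes afterwards.
    Map<String, Map<String, String>> current() {
        return snapshot;
    }
}
```

A reader that calls current() once and then does all its lookups on the returned map works on a consistent view, even if a refresh happens in the middle.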
If you'd like to understand in depth when to use the volatile keyword and all the intricacies of data races, there's a section about it in this online course: https://www.udemy.com/java-multithreading-concurrency-performance-optimization/?couponCode=CONCURRENCY
I have not seen any resource that explains and demonstrates it better. Unfortunately, there are many articles online that explain it incorrectly.
I hope from the little information in the question I was able to point you to some directions that might help. Please share more information if nothing of that works, or if something does work, please let me know, I'm curious to know what it was :)
On pages 65 and 66 of Java Concurrency in Practice, Brian Goetz lists the following code:
@ThreadSafe
public class DelegatingVehicleTracker {
private final ConcurrentMap<String, Point> locations;
private final Map<String, Point> unmodifiableMap;
public DelegatingVehicleTracker(Map<String, Point> points) {
locations = new ConcurrentHashMap<String, Point>(points);
unmodifiableMap = Collections.unmodifiableMap(locations);
}
public Map<String, Point> getLocations() {
return unmodifiableMap;
}
public Point getLocation(String id) {
return locations.get(id);
}
public void setLocation(String id, int x, int y) {
if (locations.replace(id, new Point(x, y)) == null)
throw new IllegalArgumentException("invalid vehicle name: " + id);
}
// Alternate version of getLocations (Listing 4.8)
public Map<String, Point> getLocationsAsStatic() {
return Collections.unmodifiableMap(
new HashMap<String, Point>(locations));
}
}
About this class Goetz writes:
"...the delegating version [the code above] returns an unmodifiable but
'live' view of the vehicle locations. This means that if thread A calls
getLocations() and thread B later modifies the location of some of the
points, those changes are reflected in the map returned to thread A."
In what sense would Thread A's unmodifiableMap be "live"? I do not see how changes made by Thread B via calls to setLocation() would be reflected in Thread A's unmodifiableMap. This would seem the case only if Thread A constructed a new instance of DelegatingVehicleTracker. But were Thread A to hold a reference to this class, I do not see how this is possible.
Goetz goes on to say that getLocationsAsStatic() could be called were an "unchanging view of the fleet required." I am confused. It seems to me that precisely the opposite is the case, that a call to getLocationsAsStatic() would indeed return the "live" view, and a call to getLocations(), were the class not constructed anew, would return the static, unchanging view of the fleet of cars.
What am I missing here in this example?
Any thoughts or perspectives are appreciated!
I think your confusion is due to misunderstanding of Collections.unmodifiableMap. Direct mutation of the map returned by Collections.unmodifiableMap is not allowed, however, mutating the backing map is totally fine (as long as the backing map allows mutation). For example:
Map<String,String> map = new HashMap<>();
Map<String, String> unmodifiableMap = Collections.unmodifiableMap(map);
map.put("key","value");
for (String key : unmodifiableMap.keySet()) {
System.out.println(key); // prints key
}
So, unmodifiableMap in the DelegatingVehicleTracker example is backed by a mutable, thread-safe map, locations. setLocation mutates locations atomically, and hence the changes are visible to threads holding references to the unmodifiableMap, while those threads cannot mutate the map through that reference.
Readers don't have access to locations so mutating it will be done through DelegatingVehicleTracker only and hence the name delegation.
In what sense would Thread A's unmodifiableMap be "live"? I do not see how changes made by Thread B via calls to setLocation() would be reflected in Thread A's unmodifiableMap
This is because getLocations() returns an unmodifiable wrapped map of the actual mutable map.
public DelegatingVehicleTracker(Map<String, Point> points) {
locations = new ConcurrentHashMap<String, Point>(points);
unmodifiableMap = Collections.unmodifiableMap(locations);
}
...
public Map<String, Point> getLocations() {
return unmodifiableMap;
}
So any changes later will be automatically reflected in the original returned map since they both eventually point to the same internal Map object.
Goetz goes on to say that getLocationsAsStatic() could be called were an "unchanging view of the fleet required"
This code
public Map<String, Point> getLocationsAsStatic() {
return Collections.unmodifiableMap(
new HashMap<String, Point>(locations));
}
is static in the sense that future changes to locations are not reflected, since it returns a new map with a copy of all the current key-value pairs.
getLocations() will return a read-only map which will reflect updates after getLocations() is called.
getLocationsAsStatic(), on the other hand, returns a read-only snapshot of the locations map as it was when getLocationsAsStatic() was called. The copy is shallow, but that is sufficient here because the Point values are immutable.
To illustrate:
Map<String, Point> locs = new HashMap<>();
locs.put("A", new Point(1, 1)); // a map with point A(1,1) in it
DelegatingVehicleTracker tracker = new DelegatingVehicleTracker(locs);
Map<String, Point> snapshot = tracker.getLocationsAsStatic();
Map<String, Point> live = tracker.getLocations();
tracker.setLocation("A", 2, 2); // move A to (2,2)
snapshot.get("A"); // will read A(1,1)
live.get("A"); // will read A(2,2)
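If you want to try this without the book's classes, the following stripped-down sketch shows the same two behaviours. LiveVersusSnapshot is a made-up class, and plain String values stand in for the immutable Point, so this is an illustration of the idea rather than Goetz's listing.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical miniature of the tracker; String positions replace Point.
final class LiveVersusSnapshot {
    private final ConcurrentMap<String, String> locations = new ConcurrentHashMap<>();
    private final Map<String, String> liveView = Collections.unmodifiableMap(locations);

    void set(String id, String position) {
        locations.put(id, position);
    }

    // Read-only wrapper around the backing map: later updates show through.
    Map<String, String> live() {
        return liveView;
    }

    // Read-only copy of the entries as they are at the time of the call.
    Map<String, String> snapshot() {
        return Collections.unmodifiableMap(new HashMap<String, String>(locations));
    }
}
```

Calling set("A", ...) after obtaining both maps changes what live().get("A") returns, while the map returned earlier by snapshot() keeps the old value.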
Is there any way to declare a HashMap or Hashtable as static but not final?
I want to be able to update it, and therefore I don't want it to be final.
If not, what other way can I create a static dictionary?
You can do this, but most likely you don't need to make it non-final.
When you make a reference final, only the reference itself cannot be changed; the object it refers to can still be modified.
e.g.
static final Map<String, String> map = ...
map.put("Hello", "world"); // is okay
map = new HashMap<>(); // not okay
BTW it is generally not good practice to have global/static collections. You should limit access to such a collection as much as possible and ensure it is thread safe unless you know this is not required. e.g. instead of making the collection public, you can do
private static final Map<String, String> map = ...
public static synchronized void put(String key, String value) {
map.put(key, value);
}
public static synchronized String get(String key) {
return map.get(key);
}
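As an alternative to synchronized accessors, a ConcurrentHashMap gives you thread-safe put and get without a class-wide lock. This is just a sketch; the class name Dictionary is made up, and note that ConcurrentHashMap does not accept null keys or values.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical static dictionary for illustration.
final class Dictionary {
    // The reference is final, yet the map's contents can still be updated.
    private static final ConcurrentMap<String, String> ENTRIES = new ConcurrentHashMap<>();

    private Dictionary() {
        // no instances; all access goes through the static methods
    }

    static void put(String key, String value) {
        ENTRIES.put(key, value);
    }

    static String get(String key) {
        return ENTRIES.get(key);
    }
}
```

The map stays private, so callers can only go through put and get, which keeps access as limited as in the synchronized version.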
I've read about double-checked locking and its drawbacks, but I'm asking if using a synchronizedMap can be considered safe.
Here is my code:
public class EntityUtils
{
private static final Map<String, Map<String, String>> searchMap = Collections.synchronizedMap(new HashMap<String, Map<String, String>>());
private static Map<String, String> getSearchablePathMap(String key)
{
Map<String, String> pathMap = searchMap.get(key);
if(pathMap != null) return pathMap;
synchronized(searchMap)
{
// double check locking (safe for synchronizedMap?)
pathMap = searchMap.get(key);
if(pathMap != null) return pathMap;
pathMap = new HashMap<>();
pathMap.put(..., ...);
...
// heavy map population operations
...
pathMap = Collections.unmodifiableMap(pathMap);
searchMap.put(key, pathMap);
}
return pathMap;
}
}
Suggestions or improvements are welcome.
It's safe, but it doesn't have any advantages in this case.
Double-checked locking is used to avoid costly synchronization on the most frequent code path, but in your case get() is synchronized as well, therefore you actually have two synchronized blocks instead of one.
java.util.concurrent.ConcurrentHashMap has a putIfAbsent method that might help.
There is a slight obstacle in your scenario in that if the key is absent then your code both creates and populates a new map. There is an overhead to using putIfAbsent on the "not absent" execution path if the value to be set in the absent case has a significant construction cost, but if the construction cost is cheap or can be amortised over many calls then this might be useful.
Using a method from the standard library is very likely to be safer than rolling your own unless you have particular expertise in Java concurrency.
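If you are on Java 8 or later, ConcurrentHashMap.computeIfAbsent (a different method from the putIfAbsent mentioned above) sidesteps the construction-cost problem, because the value is only built when the key is actually absent. A minimal sketch, where SearchPaths, build() and the BUILDS counter are made up for illustration and build() is a stand-in for the heavy population step:

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical variant of EntityUtils for illustration.
final class SearchPaths {
    private static final ConcurrentMap<String, Map<String, String>> CACHE = new ConcurrentHashMap<>();
    // Counts how often the expensive build actually runs (demonstration only).
    static final AtomicInteger BUILDS = new AtomicInteger();

    static Map<String, String> pathsFor(String key) {
        // For ConcurrentHashMap the mapping function is applied at most once per absent key.
        return CACHE.computeIfAbsent(key, SearchPaths::build);
    }

    // Placeholder for the heavy map population; must not touch CACHE itself.
    private static Map<String, String> build(String key) {
        BUILDS.incrementAndGet();
        Map<String, String> paths = new HashMap<>();
        paths.put(key + ".name", "path/to/" + key);
        return Collections.unmodifiableMap(paths);
    }
}
```

One trade-off to be aware of: while the mapping function runs, other updates that hit the same part of the map may block, so the function should stay reasonably short and must not modify the map.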