Java concurrent access to field, trick to not use volatile

Preface: I know that in most cases using a volatile field won't yield any measurable performance penalty, but this question is more theoretical and targeted towards a design with extremely high concurrency support.
I've got a field that is a List<Something> which is filled after construction. To save some performance I would like to convert the List into a read-only Map. Doing so at any point requires at least a volatile Map field to make the changes visible to all threads.
I was thinking of doing the following:
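Map<Object, Something> map;
public Something get(Object key){
    if(map==null){
        // Map is an interface, so build the temporary map as a HashMap
        Map<Object, Something> temp = new HashMap<>();
        for(Something value : super.getList()){
            temp.put(value.getKey(),value);
        }
        map = temp;
    }
    return map.get(key);
}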
This could cause multiple threads to generate the map even if they enter the get block in a serialized way. This would be no big issue if threads worked on different but identical instances of the map. What worries me more is:
Is it possible that one thread assigns the new temp map to the map field, and then a second thread sees that map!=null and therefore accesses the map field without generating a new one, but to my surprise finds that the map is empty, because the put operations were not yet pushed to some shared memory area?
Answers to comments:
The threads only modify the temporary map; after that it is read-only.
I must convert a List to a Map because of some special JAXB setup which doesn't make it feasible to have a Map to begin with.

Is it possible that one thread assigns the new temp map to the map field, and then a second thread sees that map!=null and therefore accesses the map field without generating a new one, but to my surprise finds that the map is empty, because the put operations were not yet pushed to some shared memory area?
Yes, this is absolutely possible; for example, an optimizing compiler could actually completely get rid of the local temp variable, and just use the map field the whole time, provided it restored map to null in the case of an exception.
Similarly, a thread could also see a non-null, non-empty map that is nonetheless not fully populated. And unless your Map class is carefully designed to allow simultaneous reads and writes (or uses synchronized to avoid the issue), you could also get bizarre behavior if one thread is calling its get method while another is calling its put.

Can you create your Map in the ctor and declare it final? Provided you don't leak the map so others can modify it, that should suffice to make your get() safely sharable by multiple threads.
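A minimal sketch of the final-field approach suggested above, assuming the list is already available when the object is constructed (which the question says may not hold in the JAXB setup). Item and ItemIndex are hypothetical names used only for illustration. The guarantee relies on the reference to the object not escaping before the constructor finishes.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical element type standing in for the question's "Something".
final class Item {
    private final String key;

    Item(String key) {
        this.key = key;
    }

    String getKey() {
        return key;
    }
}

final class ItemIndex {
    // final field: once the constructor has finished, any thread that obtains
    // a reference to this ItemIndex sees the fully populated map.
    private final Map<String, Item> byKey;

    ItemIndex(List<Item> items) {
        Map<String, Item> temp = new HashMap<>();
        for (Item item : items) {
            temp.put(item.getKey(), item);
        }
        // Wrap the map so that it cannot be modified after construction.
        this.byKey = Collections.unmodifiableMap(temp);
    }

    Item get(String key) {
        return byKey.get(key);
    }
}
```

Because the map is never written after construction, get() needs neither volatile nor synchronized.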

If you are really in doubt whether another thread could read a "half-completed" map
(I don't think so, but never say never ;-), you may try this.
Here map is either null or complete:
static class MyMap extends HashMap {
MyMap (List pList) {
for(Object value : pList){
put(value.getKey(), value);
}
}
}
MyMap map;
public Object get(Object key){
if(map==null){
map = new MyMap (super.getList());
}
return map.get(key);
}
Or does anyone see a newly introduced problem?

In addition to the visibility concerns previously mentioned, there is another problem with the original code, viz. it can throw a NullPointerException here:
return this.map.get(key)
Which is counter-intuitive, but that is what you can expect from incorrectly synchronized code.
Sample code to prevent this:
Map temp;
if ((temp = this.map) == null)
{
temp = new ImmutableMap(getList());
this.map = temp;
}
return temp.get(key);
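The fragment above depends on an ImmutableMap type that is not shown. A self-contained sketch of the same idea follows, under the assumption that the lazily built object keeps its data in a final field, so that it is safe to publish even through a non-volatile field. FrozenIndex and LazyIndexHolder are hypothetical names, and the source list is assumed not to change once lookups start.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Immutable lookup structure: the map is populated inside the constructor
// and stored in a final field, so readers never observe it half-built.
final class FrozenIndex {
    private final Map<String, Integer> positions;

    FrozenIndex(List<String> values) {
        Map<String, Integer> temp = new HashMap<>();
        for (int i = 0; i < values.size(); i++) {
            temp.put(values.get(i), i);
        }
        this.positions = temp;
    }

    Integer positionOf(String value) {
        return positions.get(value);
    }
}

final class LazyIndexHolder {
    private final List<String> values;
    private FrozenIndex index; // deliberately not volatile

    LazyIndexHolder(List<String> values) {
        this.values = values;
    }

    Integer positionOf(String value) {
        // Read the shared field exactly once; a second read could return null.
        FrozenIndex local = index;
        if (local == null) {
            // Several threads may build their own copy; all copies are equivalent.
            local = new FrozenIndex(values);
            index = local;
        }
        return local.positionOf(value);
    }
}
```

The price of skipping volatile here is that the index may be built more than once, and that the scheme stops being safe as soon as FrozenIndex gains a non-final, mutable field.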

Related

What is the difference between map.put and creating a new map?

I'm reading the source code of Sentinel. I found that when the map needs a new entry, the code creates a new HashMap to replace the old one rather than calling map.put directly, like this:
public class NodeSelectorSlot extends AbstractLinkedProcessorSlot<Object> {
private volatile Map<String, DefaultNode> map = new HashMap<String, DefaultNode>(10);
@Override
public void entry(Context context, ResourceWrapper resourceWrapper, Object obj, int count, boolean prioritized, Object... args)
throws Throwable {
DefaultNode node = map.get(context.getName());
if (node == null) {
synchronized (this) {
node = map.get(context.getName());
if (node == null) {
node = new DefaultNode(resourceWrapper, null);
// create a new hashmap
HashMap<String, DefaultNode> cacheMap = new HashMap<String, DefaultNode>(map.size());
cacheMap.putAll(map);
cacheMap.put(context.getName(), node);
map = cacheMap;
((DefaultNode) context.getLastNode()).addChild(node);
}
}
}
context.setCurNode(node);
fireEntry(context, resourceWrapper, node, count, prioritized, args);
}
...
}
What's the difference between them?

The code you are looking at is fetching a Node from the map, creating and adding a new Node if one is not present.
Clearly, this operation needs to be thread-safe. The simple ways to implement this would be:
Lock the map and perform get and put operations while holding the lock.
Use a ConcurrentHashMap which has operations for doing this kind of thing atomically; e.g. computeIfAbsent.
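As a rough illustration of the second option in the list above, here is a sketch that uses ConcurrentHashMap.computeIfAbsent. NodeRegistry and its nested Node class are hypothetical stand-ins, not Sentinel's actual types.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

final class NodeRegistry {
    // Hypothetical stand-in for the DefaultNode type in the question.
    static final class Node {
        final String name;

        Node(String name) {
            this.name = name;
        }
    }

    private final ConcurrentMap<String, Node> nodes = new ConcurrentHashMap<>();

    // Returns the node for the given context name, creating it atomically
    // if it does not exist yet. At most one Node is created per key.
    Node nodeFor(String contextName) {
        return nodes.computeIfAbsent(contextName, Node::new);
    }
}
```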
The authors of this code have chosen a different approach. They are using so-called Double Checked Locking (DCL) to avoid doing the initial get while holding a lock. That is what this code does:
DefaultNode node = map.get(context.getName());
if (node == null) {
synchronized (this) {
node = map.get(context.getName());
...
The authors have decided that when they then need to add a new entry to the map they need to do it by replacing the entire map with a new one. On the face of it, that seems unnecessary. The map updates are being performed while holding the lock and the volatile adds a happens before that seems to ensure that the initial map.get call sees any recent writes to the HashMap.
But that reasoning is INCORRECT. The problem is that there is a small time window between fetching the map reference and the get call completing. During that time window, a simultaneous put operation may be updating the HashMap data structures. This is harmful because those changes could cause the get to read stale data (because there is no happens before relationship from the put writes to the get reads). Even worse, the put could trigger reconstruction of a hash chain or even an expansion of the hash array. The resulting behavior is (at least) outside of the HashMap spec, since HashMap is not defined to be thread-safe.
The authors' solution is to create a new HashMap with the existing entries and the new one, then update map with a single assignment. I haven't done a formal analysis, but I think that this approach is thread-safe.
In short, the reason that the code creates a new HashMap is to make the DCL approach thread-safe.
And if you ignore the thread-safety aspect, this approach is functionally equivalent to a simple put.
Finally, we need to consider whether the authors' approach is going to give optimal performance. The answer will depend on whether the number of cache entries stabilizes, and whether it is relatively small. One observation is that the cost of adding N entries to the cache is O(N^2) !! (Assuming that entries are never removed, as appears to be the case.)

This is the so-called copy-on-write technique, which is intended to ensure thread safety. When read operations greatly outnumber write operations, it can be more efficient than mechanisms like ConcurrentHashMap.
Ref: https://github.com/alibaba/Sentinel/issues/1733
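The copy-on-write pattern discussed in these answers can be reduced to a small generic sketch. CopyOnWriteCache is a hypothetical class written for illustration; it is not Sentinel's code, and it assumes that entries are only ever added.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

final class CopyOnWriteCache<K, V> {
    // Readers only ever see a map that is no longer being modified.
    private volatile Map<K, V> map = new HashMap<>();

    V getOrCreate(K key, Function<K, V> factory) {
        V value = map.get(key); // lock-free fast path
        if (value == null) {
            synchronized (this) {
                value = map.get(key); // re-check while holding the lock
                if (value == null) {
                    value = factory.apply(key);
                    // Copy the current map, add the entry, then publish the
                    // finished copy with a single volatile write.
                    Map<K, V> copy = new HashMap<>(map);
                    copy.put(key, value);
                    map = copy;
                }
            }
        }
        return value;
    }
}
```

Each insertion copies the whole map, so this only pays off when the number of keys stays small and stabilizes quickly.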

Is it possible to iterate a ConcurrentHashMap without creating new objects?

After profiling my Android game, I noticed an unusual number of ConcurrentHashMap objects generated during a simple iteration that I call throughout the main game loop. The code is as follows:
public void checkIfStillNeedsToShowUI() {
for (Map.Entry<String, GameUI> gameUIEntry : listOfUIObjects.entrySet()) {
if(!gameUIEntry.getValue().isShowing()){//ignore what not showing
continue;
}
final GameUI tmpGameUI = (gameUIEntry.getValue());
if(!tmpGameUI.hasReasonForShowing()){
continue;
}
if(tmpGameUI.reasonForShowing.checkReason()){
tmpGameUI.setShowing(true);
} else {
tmpGameUI.setShowing(false);
}
}
}
and the results are as follows
Is this normal, or am I doing something wrong? I know that using the enhanced for loop results in an object being created in order to iterate, but I currently don't know another way to iterate a hash map that would give me the desired results.

They are not instances of ConcurrentHashMap; they are instances of MapEntry.
If you mean the MapEntry instances, the answer is no. The JDK creates new instances of those objects while you iterate over the entries of a ConcurrentHashMap, and that is unavoidable when you iterate the entry set. You can see this in the next method of the EntryIterator class inside ConcurrentHashMap. The reason is that, to manage concurrency, the JDK stores objects of type Node, and those objects are not meant to be exported, as the documentation in the source code says:
Key-value entry. This class is never exported out as a user-mutable Map.Entry (i.e., one supporting setValue; see MapEntry below), but can be used for read-only traversals used in bulk tasks. Subclasses of Node with a negative hash field are special, and contain null keys and values (but are never exported). Otherwise, keys and vals are never null.
So the EntryIterator class inside ConcurrentHashMap converts these objects into MapEntry instances in its next method. If you only use your map in a single-threaded application, you can use HashMap instead.
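Since the loop in the question only uses the values, one thing that might be worth trying is iterating values() instead of entrySet(). In the OpenJDK implementation, as far as I can tell, the value iterator hands out the stored values directly and does not wrap them in MapEntry objects, but this is an implementation detail and the Android runtime may behave differently, so profile before relying on it. Widget and UiChecker are hypothetical stand-ins for the question's classes.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical, simplified stand-in for the question's GameUI class.
final class Widget {
    private volatile boolean showing;
    private final boolean stillNeeded;

    Widget(boolean showing, boolean stillNeeded) {
        this.showing = showing;
        this.stillNeeded = stillNeeded;
    }

    boolean isShowing() {
        return showing;
    }

    void setShowing(boolean showing) {
        this.showing = showing;
    }

    boolean checkReason() {
        return stillNeeded;
    }
}

final class UiChecker {
    private final Map<String, Widget> widgets = new ConcurrentHashMap<>();

    void register(String name, Widget widget) {
        widgets.put(name, widget);
    }

    void checkIfStillNeedsToShowUI() {
        // Iterate over the values only; the keys are not needed here.
        for (Widget widget : widgets.values()) {
            if (!widget.isShowing()) {
                continue; // ignore what is not showing
            }
            widget.setShowing(widget.checkReason());
        }
    }
}
```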

Implementing per-key or striped locking in a Map - best approach?

I came across this dilemma at work and wanted to see if there is a better solution... it feels like there should be an easier, cleaner answer.
Goal: Concurrently access a map with locks at the key level, not at the entire map level, to ensure atomicity while impacting performance as little as possible.
I have a Map which needs to be concurrent. *(Added) The map will be filled with an unknown number of entries over time. I have multiple readers and a single writer. The writer does a "check-then-put" and the reader does a simple get(). I need these to be atomic... but only at the key level. So for example, if the reader is checking for Key X, and the writer is writing to Key Y, I don't care if I miss the write to Key Y. If the reader/writer is working on the same key, however, I need that to be atomic.
The easiest solution is to lock the whole map. But this seems like it would impact performance, since there are about 10,000 keys that will end up in the map. (If that doesn't seem like it would hurt performance because the size of the Map is relatively small, let's pretend the Map has many more keys, for argument's sake.)
As far as I know, ConcurrentHashMap will not guarantee the "per-key" atomic behavior I need.
The next solution that came to mind was to have an array of lock objects. You would index into that array of lock Objects based on a hash of the original key. This would still have some contention since you have fewer locks than you have keys in the original map. I'm aware that ConcurrentHashMap does a similar thing under the hood (striping) to provide concurrency (but not atomicity).
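The lock-array idea described above could be sketched roughly as follows. StripedLockMap and putIfMissing are hypothetical names. The sketch keeps the data in a ConcurrentHashMap so that operations under different stripes cannot corrupt the underlying table, and uses the stripe lock only to make the check-then-put atomic with respect to reads of the same key.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

final class StripedLockMap<K, V> {
    private final Object[] locks;
    private final ConcurrentMap<K, V> map = new ConcurrentHashMap<>();

    StripedLockMap(int stripes) {
        locks = new Object[stripes];
        for (int i = 0; i < stripes; i++) {
            locks[i] = new Object();
        }
    }

    // Keys with the same stripe index share a lock, so unrelated keys can
    // still contend occasionally.
    private Object lockFor(Object key) {
        // floorMod keeps the index non-negative for negative hash codes.
        return locks[Math.floorMod(key.hashCode(), locks.length)];
    }

    V get(K key) {
        synchronized (lockFor(key)) {
            return map.get(key);
        }
    }

    // Check-then-put: stores the value only if the key is absent.
    // Returns true if the value was stored.
    boolean putIfMissing(K key, V value) {
        synchronized (lockFor(key)) {
            if (map.containsKey(key)) {
                return false;
            }
            map.put(key, value);
            return true;
        }
    }
}
```

For a plain check-then-put like this one, ConcurrentMap.putIfAbsent already gives per-key atomicity without any extra locks; the stripes become useful when the guarded section does more work than a single map call.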
Is there an easier way to perform this type of per-key or striped locking?
Thanks.

This concern can come up when value generation is a time-consuming process. You don't want to lock the whole map and find a missing value, and keep the map locked while you generate the value. You could release the map during generation, but then you could have two simultaneous misses and generations.
Instead of directly storing the value with the key, store it inside a reference object:
public class Ref<T>
{
private T value;
public T getValue()
{
return value;
}
public void setValue(T value)
{
this.value = value;
}
}
So if you originally had a map of Map<String, MyThing>, you instead use Map<String, Ref<MyThing>>. Don't bother with a concurrent implementation, just use HashMap or LinkedHashMap or whatever.
Now you can lock the map to find or create a reference holder, and then release the map. Following that, you can lock the reference to find or create the value object:
String key; // key you're looking up
Map<String, Ref<MyThing>> map; // the map
// Find the reference container, create it if necessary
Ref<MyThing> ref;
synchronized(map)
{
ref = map.get(key);
if (ref == null)
{
ref = new Ref<MyThing>();
map.put(key, ref);
}
}
// Map is released at this point
// Now get the value, creating if necessary
MyThing result;
synchronized(ref)
{
result = ref.getValue();
if (result == null)
{
result = generateMyThing();
ref.setValue(result);
}
}
// result == your existing or new object

How do I create a thread-safe write-once read-many Map in Java?

I have a Java class with private static maps used to store information during the execution of the application. I would only ever put a key/value once into the Map but the map value may be read many times.
So the way I have it now, the code does a get and checks for null. If null then I gather the data I need and put it into the map. Subsequent calls by the client code would be guaranteed to get the value from the map. The client would not need to do null checks.
The reason for this is that getting the data to put in the map could be expensive so I only want to do this once per key.
Is there any pattern for this? I can't seem to find anything out there that discusses this situation.
TIA
Here's a totally non-thread-safe example:
public class TestWorm {
private static Map<String, Object> map = new HashMap<String, Object>(32);
public Object getValue(String key) {
if (map.get(key) != null) {
return map.get(key);
}
// do some process to get Object
Object o = new Object();
map.put(key, o);
return o;
}
}

Your best bet is ConcurrentHashMap and its putIfAbsent method.
Your implementation is not thread-safe. To make it thread-safe, declare the field final and change the implementation class to ConcurrentHashMap. That is enough if you don't mind values occasionally being computed and stored several times. This would be rare: it happens only when two threads enter getValue simultaneously and the corresponding value has not been computed yet. This is usually a good trade-off, because the most common case is that the value is already in the cache, and in that case you do not need any extra synchronization to retrieve it.
If you want to make sure that there is at most one value present in your application for a given key, you can further extend your implementation by using putIfAbsent instead of put.
Another way to implement this would be to use the Guava library: http://docs.guava-libraries.googlecode.com/git/javadoc/com/google/common/collect/MapMaker.html
Yet another way to do that would be to use computeIfAbsent of ConcurrentHashMapV8 (http://gee.cs.oswego.edu/cgi-bin/viewcvs.cgi/jsr166/src/jsr166e/ConcurrentHashMapV8.java?view=markup), which became part of the standard ConcurrentHashMap in Java 8.
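The putIfAbsent variant described in this answer might look roughly like the sketch below. WriteOnceCache is a hypothetical name, and new Object() stands in for the expensive computation from the question. The value may still be computed more than once under contention, but only the first stored value is ever returned.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

final class WriteOnceCache {
    private static final ConcurrentMap<String, Object> MAP = new ConcurrentHashMap<>(32);

    Object getValue(String key) {
        Object existing = MAP.get(key);
        if (existing != null) {
            return existing; // common case: no locking, no extra work
        }
        // Placeholder for the expensive computation from the question.
        Object created = new Object();
        // putIfAbsent returns the previously stored value, or null if ours won.
        Object raced = MAP.putIfAbsent(key, created);
        return raced != null ? raced : created;
    }
}
```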

Check out ConcurrentHashMap.putIfAbsent(), which provides what you need for storing and retrieving values in a thread-safe way.

This is a typical caching pattern.
For example, the Spring Framework has offered a good caching API since version 3.1.
There are also specialized frameworks such as Ehcache.

Java concurrency with a Map of Lists

I have a java class that is accessed by a lot of threads at once and want to make sure it is thread safe. The class has one private field, which is a Map of Strings to Lists of Strings. I've implemented the Map as a ConcurrentHashMap to ensure gets and puts are thread safe:
public class ListStore {
private Map<String, List<String>> innerListStore;
public ListStore() {
innerListStore = new ConcurrentHashMap<String, List<String>>();
}
...
}
So given that gets and puts to the Map are thread safe, my concern is with the lists that are stored in the Map. For instance, consider the following method that checks if a given entry exists in a given list in the store (I've omitted error checking for brevity):
public boolean listEntryExists(String listName, String listEntry) {
List<String> listToSearch = innerListStore.get(listName);
for (String entryName : listToSearch) {
if(entryName.equals(listEntry)) {
return true;
}
}
return false;
}
It would seem that I need to synchronize the entire contents of this method because if another method changed the contents of the list at innerListStore.get(listName) while this method is iterating over it, a ConcurrentModificationException would be thrown.
Is that correct and if so, do I synchronize on innerListStore or would synchronizing on the local listToSearch variable work?
UPDATE: Thanks for the responses. It sounds like I can synchronize on the list itself. For more information, here is the add() method, which can be running at the same time the listEntryExists() method is running in another thread:
public void add(String listName, String entryName) {
List<String> addTo = innerListStore.get(listName);
if (addTo == null) {
addTo = Collections.synchronizedList(new ArrayList<String>());
List<String> added = innerListStore.putIfAbsent(listName, addTo);
if (added != null) {
addTo = added;
}
}
addTo.add(entryName);
}
If this is the only method that modifies the underlying lists stored in the map and no public methods return references to the map or entries in the map, can I synchronize iteration on the lists themselves and is this implementation of add() sufficient?

You can synchronize on listToSearch ("synchronized(listToSearch) {...}"). Make sure that there is no race condition creating the lists (use innerListStore.putIfAbsent to create them).

You could synchronize on just listToSearch; there's no reason to lock the entire map any time anyone is using just one entry.
Just remember, though, that you need to synchronize on the list everywhere it is modified! Synchronizing the iteration doesn't automagically block other people from doing an add() or whatnot if you passed out references to the unsynchronized list.
It would be safest to just store synchronized lists in the Map and then lock on them when you iterate, and also document, when you return a reference to the list, that the user must synchronize on it if they iterate. Synchronization is pretty cheap in modern JVMs when no actual contention is happening. Of course, if you never let a reference to one of the lists escape your class, you can handle it internally with a finer comb.
Alternately you can use a threadsafe list such as CopyOnWriteArrayList that uses snapshot iterators. What kind of point in time consistency you need is a design decision we can't make for you. The javadoc also includes a helpful discussion of performance characteristics.
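A sketch of the "store synchronized lists and lock on them while iterating" advice, assuming no reference to the lists ever leaves the class. SafeListStore is a hypothetical rewrite of the question's ListStore. The Collections.synchronizedList documentation requires callers to synchronize on the returned list manually while iterating, which is what listEntryExists does.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

final class SafeListStore {
    private final ConcurrentMap<String, List<String>> store = new ConcurrentHashMap<>();

    void add(String listName, String entryName) {
        // Atomically create the synchronized list on first use, then append.
        store.computeIfAbsent(listName,
                name -> Collections.synchronizedList(new ArrayList<>()))
             .add(entryName);
    }

    boolean listEntryExists(String listName, String listEntry) {
        List<String> listToSearch = store.get(listName);
        if (listToSearch == null) {
            return false;
        }
        // Hold the list's own monitor for the whole iteration, so a
        // concurrent add() cannot modify the list while it is traversed.
        synchronized (listToSearch) {
            for (String entryName : listToSearch) {
                if (entryName.equals(listEntry)) {
                    return true;
                }
            }
        }
        return false;
    }
}
```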

It would seem that I need to synchronize the entire contents of this method because if another method changed the contents of the list at innerListStore.get(listName) while this method is iterating over it, a ConcurrentModificationException would be thrown.
Are other threads accessing the List itself, or only through operations exposed by ListStore?
Will operations invoked by other threads result in the contents of a List stored in the Map being changed? Or will entries only be added/removed from the Map?
You would only need to synchronize access to the List stored within the Map if different threads can result in changes to the same List instances. If the threads are only allowed to add/remove List instances from the Map (i.e. change the structure of the Map), then synchronization is not necessary.

If the lists stored in the map are of a type that doesn't throw CME (CopyOnWriteArrayList, for example), you can iterate at will.
This can introduce some races, though, if you're not careful.

If the Map is already thread-safe, then I think synchronizing on listToSearch should work. I'm not 100% sure, but I think it should work:
synchronized(listToSearch)
{
}

You could use another abstraction from Guava
Note that this will synchronize on the whole map, so it might be not that useful for you.

As you haven't provided any client for the map of lists apart from the boolean listEntryExists(String listName, String listEntry) method, I wonder why you are storing lists at all? This structure seems to be more naturally a Map<String, Set<String>> and the listEntryExists should use the contains method (available on List as well, but O(n) to the size of the list):
public boolean listEntryExists(String name, String entry) {
    Set<String> set = map.get(name);
    return (set == null) ? false : set.contains(entry);
}
Now, the contains call can encapsulate whatever internal concurrency protocol you want it to.
For the add you can either use a synchronized wrapper (simple, but maybe slow) or if writes are infrequent compared to reads, utilise ConcurrentMap.replace to implement your own copy-on-write strategy. For instance, using Guava ImmutableSet:
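public boolean add(String name, String entry) {
    while (true) {
        Set<String> set = map.get(name);
        if (set == null) {
            // putIfAbsent returns null when no mapping existed, i.e. our set was stored
            if (map.putIfAbsent(name, ImmutableSet.of(entry)) == null)
                return true;
            continue; // another thread created the set first, so retry
        }
        if (set.contains(entry))
            return false; // no need to change, already exists
        Set<String> newSet = ImmutableSet.copyOf(Iterables.concat(set, ImmutableSet.of(entry)));
        // replace succeeds only if the mapping is still the set we read
        if (map.replace(name, set, newSet))
            return true;
    }
}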
This is now an entirely thread-safe lock-free structure, where concurrent readers and writers will not block each other (modulo the lock-freeness of the underlying ConcurrentMap implementation). This implementation does have an O(n) cost in its write, where your original implementation was O(n) in the read. Again, if you are read-mostly rather than write-mostly, this could be a big win.
