Is it possible to iterate a ConcurrentHashMap without creating new objects?

Is it possible to iterate a ConcurrentHashMap without creating new objects? - java

After profiling my android game, I notice an unusual amount of ConcurrentHashmaps generated during a simple iteration process that I call throughout the main game loop. The code is as follows
public void checkIfStillNeedsToShowUI() {
for (Map.Entry<String, GameUI> gameUIEntry : listOfUIObjects.entrySet()) {
if(!gameUIEntry.getValue().isShowing()){//ignore what not showing
continue;
}
final GameUI tmpGameUI = (gameUIEntry.getValue());
if(!tmpGameUI.hasReasonForShowing()){
continue;
}
if(tmpGameUI.reasonForShowing.checkReason()){
tmpGameUI.setShowing(true);
} else {
tmpGameUI.setShowing(false);
}
}
}
and the results are as follows
Is this normal? or am I doing something wrong? I know that using the generic/enhanced for loop type results in an object being created in order to access it but I currently don't know another way to iterate a hashmap that would give me desired results.

They are not instances of ConcurrentHashMap, they are instances of MapEntry.
If you mean MapEntry instances, the answer is no JDK create new instances of those objects during iteration of ConcurrentHashMap and it is inevitable when you are using ConcurrentHashMap you can see that in the next method in EntityIterator class inside ConcurrentHashMap. The problem is that to mange concurrency JDK store Objects of type Node and those objects are not considered to be exported as mentioned in Documentation in the source code:
Key-value entry. This class is never exported out as a user-mutable Map.Entry (i.e., one supporting setValue; see MapEntry below), but can be used for read-only traversals used in bulk tasks. Subclasses of Node with a negative hash field are special, and contain null keys and values (but are never exported). Otherwise, keys and vals are never null.
So EntryIterator class inside ConcurrentHashMap class converts this objects into MapEntry inside next method. If you are only using your map in single thread application you can use HashMap instead.

Related

What is the different between map.put and creating a new map?

i'm reading the source code of sentinel, i find when the map need adding a entry, it create a new hashmap replacing the old rather than using map.put directly. like this:
public class NodeSelectorSlot extends AbstractLinkedProcessorSlot<Object> {
private volatile Map<String, DefaultNode> map = new HashMap<String, DefaultNode>(10);
#Override
public void entry(Context context, ResourceWrapper resourceWrapper, Object obj, int count, boolean prioritized, Object... args)
throws Throwable {
DefaultNode node = map.get(context.getName());
if (node == null) {
synchronized (this) {
node = map.get(context.getName());
if (node == null) {
node = new DefaultNode(resourceWrapper, null);
// create a new hashmap
HashMap<String, DefaultNode> cacheMap = new HashMap<String, DefaultNode>(map.size());
cacheMap.putAll(map);
cacheMap.put(context.getName(), node);
map = cacheMap;
((DefaultNode) context.getLastNode()).addChild(node);
}
}
}
context.setCurNode(node);
fireEntry(context, resourceWrapper, node, count, prioritized, args);
}
...
}
what's the different between them?

The code you are looking is fetching a Node from the map, creating and adding a new Node if one is not present.
Clearly, this operation needs to be thread-safe. The simple ways to implement this would be:
Lock the map and perform get and put operations while holding the lock.
Use a ConcurrentHashMap which has operations for doing this kind of thing atomically; e.g. computeIfAbsent.
The authors of this code have chosen a different approach. They are using so-called Double Checked Locking (DCL) to avoid doing the initial get while holding a lock. That is what this code does:
DefaultNode node = map.get(context.getName());
if (node == null) {
synchronized (this) {
node = map.get(context.getName());
...
The authors have decided that when they then need to add a new entry to the map they need to do it by replacing the entire map with a new one. On the face of it, that seems unnecessary. The map updates are being performed while holding the lock and the volatile adds a happens before that seems to ensure that the initial map.get call sees any recent writes to the HashMap.
But that reasoning is INCORRECT. The problem is that there is a small time window between fetching the map reference and the get call completing. During that time window, a simultaneous put operation may be updating the HashMap data structures. This is harmful because those changes could cause the get to read stale data (because there is no happens before relationship from the put writes to the get reads). Even worse, the put could trigger reconstruction of a hash chain or even an expansion of the hash array. The resulting behavior is (at least) outside of the HashMap spec, since HashMap is not defined to be thread-safe.
The authors' solution is to create a new HashMap with the existing entries and the new one, then update map with a single assignment. I haven't done a formal analysis, but I think that this approach is thread-safe.
In short, the reason that the code creates a new HashMap is to make the DCL approach thread-safe.
And if you ignore the thread-safety aspect, this approach is functionality equivalent to a simple put.
Finally, we need to consider whether the authors' approach is going to give optimal performance. The answer will depend on whether the number of cache entries stabilizes, and whether it is relatively small. One observation is that the cost of adding N entries to the cache is O(N^2) !! (Assuming that entries are never removed, as appears to be the case.)

It is so-called copy-on-write, which is intended to ensure thread-safe. When read operations are a lot more than write operations, it is more efficient than mechanisms like ConcurrentHashMap.
Ref: https://github.com/alibaba/Sentinel/issues/1733

How do I create a thread-safe write-once read-many Map in Java?

I have a Java class with private static maps used to store information during the execution of the application. I would only ever put a key/value once into the Map but the map value may be read many times.
So the way I have it now, the code does a get and checks for null. If null then I gather the data I need and put it into the map. Subsequent calls by the client code would be guaranteed to get the value from the map. The client would not need to do null checks.
The reason for this is that getting the data to put in the map could be expensive so I only want to do this once per key.
Is there any pattern for this? I can't seem to find anything out there that discusses this situation.
TIA
Here's a totally non-thread-safe example:
public class TestWorm {
private static Map<String, Object> map = new HashMap<String, Object>(32);
public Object getValue(String key) {
if (map.get(key) != null) {
return map.get(key);
}
// do some process to get Object
Object o = new Object();
map.put(key, o);
return o;
}
}

Your best bet is ConcurrentHashMap and it's putIfAbsent method.
Your implementation is not thread safe. To make it thread safe, declare field final, change implementation class to ConcurrentHashMap and that is enough, if you don't care if sometimes values will be computed and stored several times (this would be rare: in case two thread simultaneously enter get and corresponding value is not yet computed. This is usually good trade off, as usually the most common case is when you have something in the cache. And in this case you do not use any extra synchronization to retrieve existing value.)
If you want to make sure that there is at most one value present in your application for the given key you can further extend your implementation by using putIfAbsent, instead of put.
Another way to implement this would be to use guava library: http://docs.guava-libraries.googlecode.com/git/javadoc/com/google/common/collect/MapMaker.html
Yet another way to do that would be to use computeIfAbsent of ConcurrentHashMapV8 (http://gee.cs.oswego.edu/cgi-bin/viewcvs.cgi/jsr166/src/jsr166e/ConcurrentHashMapV8.java?view=markup) which will someday appear in Java 8.

checkout ConcurrentHashMap.putIfAbsent() which will provide all you need for storing and retrieving synchronously.

This is a typical caching pattern.
For example the Spring Framework offers a good API since 3.1
But there are specialized Frameworks there like Ehcache, too.

Java concurrent access to field, trick to not use volatile

Preface: I'm know that in most cases using a volatile field won't yield any measurable performance penalty, but this question is more theoretical and targeted towards a design with an extremly high corrency support.
I've got a field that is a List<Something> which is filled after constrution. To save some performance I would like to convert the List into a read only Map. Doing so at any point requires at least a volatile Map field so make changes visible for all threads.
I was thinking of doing the following:
Map map;
public void get(Object key){
if(map==null){
Map temp = new Map();
for(Object value : super.getList()){
temp.put(value.getKey(),value);
}
map = temp;
}
return map.get(key);
}
This could cause multiple threads to generate the map even if they enter the get block in a serialized way. This would be no big issue, if threads work on different identical instances of the map. What worries me more is:
Is it possible that one thread assigns the new temp map to the map field, and then a second thread sees that map!=null and therefore accesses the map field without generating a new one, but to my suprise finds that the map is empty, because the put operations where not yet pushed to some shared memory area?
Answers to comments:
The threads only modify the temporary map after that it is read only.
I must convert a List to a Map because of some speical JAXB setup which doesn't make it feasable to have a Map to begin with.

Is it possible that one thread assigns the new temp map to the map field, and then a second thread sees that map!=null and therefore accesses the map field without generating a new one, but to my suprise finds that the map is empty, because the put operations where not yet pushed to some shared memory area?
Yes, this is absolutely possible; for example, an optimizing compiler could actually completely get rid of the local temp variable, and just use the map field the whole time, provided it restored map to null in the case of an exception.
Similarly, a thread could also see a non-null, non-empty map that is nonetheless not fully populated. And unless your Map class is carefully designed to allow simultaneous reads and writes (or uses synchronized to avoid the issue), you could also get bizarre behavior if one thread is calling its get method while another is calling its put.

Can you create your Map in the ctor and declare it final? Provided you don't leak the map so others can modify it, that should suffice to make your get() safely sharable by multiple threads.

When you really in doubt whether an other thread could read an "half completed" map
(I don't think so, but never say never ;-), you may try this.
map is null or complete
static class MyMap extends HashMap {
MyMap (List pList) {
for(Object value : pList){
put(value.getKey(), value);
}
}
}
MyMap map;
public Object get(Object key){
if(map==null){
map = new MyMap (super.getList());
}
return map.get(key);
}
Or does someone see a new introduced problem ?

In addition to the visibility concerns previously mentioned, there is another problem with the original code, viz. it can throw a NullPointerException here:
return this.map.get(key)
Which is counter-intuitive, but that is what you can expect from incorrectly synchronized code.
Sample code to prevent this:
Map temp;
if ((temp = this.map) == null)
{
temp = new ImmutableMap(getList());
this.map = temp;
}
return temp.get(key);

modifying a ConcurrentHashMap and Synchronized ArrayList in same method

I have a collection of objects that is modified by one thread and read by another (more specifically the EDT). I needed a solution that gave me fast look up and also fast indexing (by order inserted), so I'm using a ConcurrentHashMap with an accompanying ArrayList of the keys, so if want to index an entry, I can index the List for the key and then use the returned key to get the value from the hash map. So I have a wrapper class that makes sure when and entry is added, the mapping is added to the hash map and the key is added to the list at the same time, similarly for removal.
I'm posting an example of the code in question:
private List<K> keys = Collections.synchronizedList(new ArrayList<K>(INITIAL_CAPACITY));
private ConcurrentMap<K, T> entries = new ConcurrentHashMap<K, T>(INITIAL_CAPACITY, .75f);
public synchronized T getEntryAt(int index){
return entries.get(keys.get(index));
}
**public synchronized void addOrReplaceEntry(K key, T value){
T result = entries.get(key);
if(result == null){
entries.putIfAbsent(key, value);
keys.add(key);
}
else{
entries.replace(key, result);
}
}**
public syncrhonized T removeEntry(K key, T value){
keys.remove(key);
entries.remove(key, value);
}
public synchronized int getSize(){
return keys.size();
}
my question is: am I losing all the benefits of using the ConcurrentHashMap (over syncrhonized hashmap) by operating on it in synchronized methods? I have to synchronize the methods to safely modify/read from the ArrayList of keys (CopyOnWriteArrayList is not an option because a lot of modification happens...) Also, if you know of a better way to do this, that would be appreciated...

Yes, using a Concurrent collection and a Synchronized collection in only synchronized blocks is a waste. You wont get the benefits of ConcurrentHashMap because only one thread will be accesing it at a time.
You could have a look at this implementation of a concurrent linked hashmap, I havnt use it so can't attest to it's features.
One thing to consider would be to switching from synchronized blocks to a ReadWriteLock to improve concurrent read only performance.
I'm not really sure of the utility of proving a remove at index method, perhaps you could give some more details about the problem you are trying to solve?

It seems that you only care about finding values by index. If so, dump the Map and just use a List. Why do you need the Map?

Mixing synchronized and concurrent collections the way you have done it is not recommended. Any reason why you are maintaining two copies of the stuff you are interested in? You can easily get a list of all the keys from the map anytime rather than maintaining a separate list.

Why not store the values in the list and in the map the key -> index mapping?
so for getEntry you only need on lookup (in the list which should be anyway faster than a map) and for remove you do not have to travers the whole list. Syhnronization happens so.

You can get all access to the List keys onto the event queue using EventQueue.invokeLater. This will get rid of the synchronization. With all the synching you were not running much in parallel anyway. Also it means the getSize method will give the same answer for the duration of an event.
If you stick with synchronization instead of using invokeLater, at least get the entries hash table out of the synch block. Either way, you get more parallel processing. Of course, entries can now become out-of-synch with keys. The only down side is sometimes a key will come up with a null entry. With such a dynamic table this is unlikely to matter much.
Using the suggestion made by chrisichris to put the values in the list will solve this problem if it is one. In fact, this puts a nice wall between keys and entries; they are now used in completely separate ways. (If your only need for entries is to provide values to the JTable, you can get rid of it.) But entries (if still needed) should reference the entries, not contain an index; maintaining indexes there would be a hopeless task. And always remember that keys and entries are snapshots of "reality" (for lack of a better word) taken at different times.

Java concurrency with a Map of Lists

I have a java class that is accessed by a lot of threads at once and want to make sure it is thread safe. The class has one private field, which is a Map of Strings to Lists of Strings. I've implemented the Map as a ConcurrentHashMap to ensure gets and puts are thread safe:
public class ListStore {
private Map<String, List<String>> innerListStore;
public ListStore() {
innerListStore = new ConcurrentHashMap<String, List<String>>();
}
...
}
So given that gets and puts to the Map are thread safe, my concern is with the lists that are stored in the Map. For instance, consider the following method that checks if a given entry exists in a given list in the store (I've omitted error checking for brevity):
public boolean listEntryExists(String listName, String listEntry) {
List<String> listToSearch = innerListStore.get(listName);
for (String entryName : listToSearch) {
if(entryName.equals(listEntry)) {
return true;
}
}
return false;
}
It would seem that I need to synchronize the entire contents of this method because if another method changed the contents of the list at innerListStore.get(listName) while this method is iterating over it, a ConcurrentModificationException would be thrown.
Is that correct and if so, do I synchronize on innerListStore or would synchronizing on the local listToSearch variable work?
UPDATE: Thanks for the responses. It sounds like I can synchronize on the list itself. For more information, here is the add() method, which can be running at the same time the listEntryExists() method is running in another thread:
public void add(String listName, String entryName) {
List<String> addTo = innerListStore.get(listName);
if (addTo == null) {
addTo = Collections.synchronizedList(new ArrayList<String>());
List<String> added = innerListStore.putIfAbsent(listName, addTo);
if (added != null) {
addTo = added;
}
}
addTo.add(entryName);
}
If this is the only method that modifies the underlying lists stored in the map and no public methods return references to the map or entries in the map, can I synchronize iteration on the lists themselves and is this implementation of add() sufficient?

You can synchronize on listToSearch ("synchronized(listToSearch) {...}"). Make sure that there is no race condition creating the lists (use innerListStore.putIfAbsent to create them).

You could synchronize on just listToSearch, there's no reason to lock the entire map any time anyone is using just one entry.
Just remember though, that you need to synchronize on the list everywhere it is modified! Synchronizing the iterator doesn't automagically block other people from doing an add() or whatnot if you passed out to them references to the unsynchronized list.
It would be safest to just store synchronized lists in the Map and then lock on them when you iterate, and also document when you return a reference to the list that the user must sycnhronize on it if they iterate. Synchronization is pretty cheap in modern JVMs when no actual contention is happening. Of course if you never let a reference to one of the lists escape your class, you can handle it internally with a finer comb.
Alternately you can use a threadsafe list such as CopyOnWriteArrayList that uses snapshot iterators. What kind of point in time consistency you need is a design decision we can't make for you. The javadoc also includes a helpful discussion of performance characteristics.

It would seem that I need to synchronize the entire contents of this method because if another method changed the contents of the list at innerListStore.get(listName) while this method is iterating over it, a ConcurrentModificationException would be thrown.
Are other threads accessing the List itself, or only though operations exposed by ListStore?
Will operations invoked by other threads result in the contents of the a List stored in the Map being changed? Or will entries only be added/removed from the Map?
You would only need to synchronize access to the List stored within the Map if different threads can result in changes to the same List instances. If the threads are only allowed to add/remove List instances from the Map (i.e. change the structure of the Map), then synchronization is not necessary.

if the lists stored in the map are of the type that don't throw CME (CopyOnWriteArrayList for example) you can iterate at will
this can introduce some races though if you're not careful

If the Map is already thread safe, then I think syncronizing the listToSearch should work. Im not 100% but I think it should work
synchronized(listToSearch)
{
}

You could use another abstraction from Guava
Note that this will synchronize on the whole map, so it might be not that useful for you.

As you haven't provided any client for the map of lists apart from the boolean listEntryExists(String listName, String listEntry) method, I wonder why you are storing lists at all? This structure seems to be more naturally a Map<String, Set<String>> and the listEntryExists should use the contains method (available on List as well, but O(n) to the size of the list):
public boolean listEntryExists(String name, String entry) {
SetString> set = map.get(name);
return (set == null) ? false : set.contains(entry;
}
Now, the contains call can encapsulate whatever internal concurrency protocol you want it to.
For the add you can either use a synchronized wrapper (simple, but maybe slow) or if writes are infrequent compared to reads, utilise ConcurrentMap.replace to implement your own copy-on-write strategy. For instance, using Guava ImmutableSet:
public boolean add(String name, String entry) {
while(true) {
SetString> set = map.get(name);
if (set == null) {
if (map.putIfAbsent(name, ImmutableSet.of(entry))
return true
continue;
}
if (set.contains(entry)
return false; // no need to change, already exists
Set<String> newSet = ImmutableSet.copyOf(Iterables.concat(set, ImmutableSet.of(entry))
if (map.replace(name, set, newSet)
return true;
}
}
This is now an entirely thread-safe lock-free structure, where concurrent readers and writers will not block each other (modulo the lock-freeness of the underlying ConcurrentMap implementation). This implementation does have an O(n) in its write, where your original implementation was O9n) in the read. Again if you are read-mostly rather than write-mostly this could be a big win.

Develop Reference

Java is a programming language and computing platform first released by Sun Microsystems in 1995.