I'm new to Java8 and working on a problem where multiple threads (~10) are writing values to a Concurrent Hash Map. I have another dedicated thread which reads all the values present in Concurrent Hash Map and returns them (every 30 seconds). Is iterating over result of values() method the recommended way of fetching results without getting Concurrent Modification Exception?
Note: I am perfectly fine with getting stale data
I went over the official docs which says:
Retrieval operations generally do not block, so may overlap with update operations . Retrievals reflect the results of the most recently completed update operations holding upon their onset. For aggregate operations such as putAll and clear, concurrent retrievals may reflect insertion or removal of only some entries. Similarly, Iterators, Spliterators and Enumerations return elements reflecting the state of the hash table at some point at or since the creation of the iterator/enumeration. They do not throw ConcurrentModificationException.
However doc of values() method says:
Returns a Collection view of the values contained in this map
Is the below code thread safe?
for (String name: myMap.values()) {
System.out.println("name": + name);
}
Is iterating over result of values() method the recommended way of fetching results without getting a ConcurrentModificationException?
Yes. It is the recommended way, and you won't get a ConcurrentModificationException.
As the package level javadoc states:
Most concurrent Collection implementations (including most Queues) also differ from the usual java.util conventions in that their Iterators and Spliterators provide weakly consistent rather than fast-fail traversal:
they may proceed concurrently with other operations
they will never throw ConcurrentModificationException
they are guaranteed to traverse elements as they existed upon construction exactly once, and may (but are not guaranteed to) reflect any modifications subsequent to construction.
Is the below code thread safe?
for (String name: myMap.values()) {
System.out.println("name": + name);
}
Yes ... with some qualifications.
Thread safety really means that the code works according to its specified behavior in a multi-threaded application. The problem is that you haven't said clearly what you expect the code to actually do.
What we can say is the following:
The iteration will see values as per the previously stated guarantees.
The memory model guarantees mean that there shouldn't be any nasty behavior with stale values ... unless you mutate value objects after putting them into the map. (If you do that, then the object's methods need to be implemented to cope with that; e.g. they may need to be synchronized. This is moot for String values, since they are immutable.)
Related
I am using a Collection (a HashMap used indirectly by the JPA, it so happens), but apparently randomly the code throws a ConcurrentModificationException. What is causing it and how do I fix this problem? By using some synchronization, perhaps?
Here is the full stack-trace:
Exception in thread "pool-1-thread-1" java.util.ConcurrentModificationException
at java.util.HashMap$HashIterator.nextEntry(Unknown Source)
at java.util.HashMap$ValueIterator.next(Unknown Source)
at org.hibernate.collection.AbstractPersistentCollection$IteratorProxy.next(AbstractPersistentCollection.java:555)
at org.hibernate.engine.Cascade.cascadeCollectionElements(Cascade.java:296)
at org.hibernate.engine.Cascade.cascadeCollection(Cascade.java:242)
at org.hibernate.engine.Cascade.cascadeAssociation(Cascade.java:219)
at org.hibernate.engine.Cascade.cascadeProperty(Cascade.java:169)
at org.hibernate.engine.Cascade.cascade(Cascade.java:130)
This is not a synchronization problem. This will occur if the underlying collection that is being iterated over is modified by anything other than the Iterator itself.
Iterator it = map.entrySet().iterator();
while (it.hasNext()) {
Entry item = it.next();
map.remove(item.getKey());
}
This will throw a ConcurrentModificationException when the it.hasNext() is called the second time.
The correct approach would be
Iterator it = map.entrySet().iterator();
while (it.hasNext()) {
Entry item = it.next();
it.remove();
}
Assuming this iterator supports the remove() operation.
Try using a ConcurrentHashMap instead of a plain HashMap
Modification of a Collection while iterating through that Collection using an Iterator is not permitted by most of the Collection classes. The Java library calls an attempt to modify a Collection while iterating through it a "concurrent modification". That unfortunately suggests the only possible cause is simultaneous modification by multiple threads, but that is not so. Using only one thread it is possible to create an iterator for the Collection (using Collection.iterator(), or an enhanced for loop), start iterating (using Iterator.next(), or equivalently entering the body of the enhanced for loop), modify the Collection, then continue iterating.
To help programmers, some implementations of those Collection classes attempt to detect erroneous concurrent modification, and throw a ConcurrentModificationException if they detect it. However, it is in general not possible and practical to guarantee detection of all concurrent modifications. So erroneous use of the Collection does not always result in a thrown ConcurrentModificationException.
The documentation of ConcurrentModificationException says:
This exception may be thrown by methods that have detected concurrent modification of an object when such modification is not permissible...
Note that this exception does not always indicate that an object has been concurrently modified by a different thread. If a single thread issues a sequence of method invocations that violates the contract of an object, the object may throw this exception...
Note that fail-fast behavior cannot be guaranteed as it is, generally speaking, impossible to make any hard guarantees in the presence of unsynchronized concurrent modification. Fail-fast operations throw ConcurrentModificationException on a best-effort basis.
Note that
the exception may be throw, not must be thrown
different threads are not required
throwing the exception cannot be guaranteed
throwing the exception is on a best-effort basis
throwing the exception happens when the concurrent modification is detected, not when it is caused
The documentation of the HashSet, HashMap, TreeSet and ArrayList classes says this:
The iterators returned [directly or indirectly from this class] are fail-fast: if the [collection] is modified at any time after the iterator is created, in any way except through the iterator's own remove method, the Iterator throws a ConcurrentModificationException. Thus, in the face of concurrent modification, the iterator fails quickly and cleanly, rather than risking arbitrary, non-deterministic behavior at an undetermined time in the future.
Note that the fail-fast behavior of an iterator cannot be guaranteed as it is, generally speaking, impossible to make any hard guarantees in the presence of unsynchronized concurrent modification. Fail-fast iterators throw ConcurrentModificationException on a best-effort basis. Therefore, it would be wrong to write a program that depended on this exception for its correctness: the fail-fast behavior of iterators should be used only to detect bugs.
Note again that the behaviour "cannot be guaranteed" and is only "on a best-effort basis".
The documentation of several methods of the Map interface say this:
Non-concurrent implementations should override this method and, on a best-effort basis, throw a ConcurrentModificationException if it is detected that the mapping function modifies this map during computation. Concurrent implementations should override this method and, on a best-effort basis, throw an IllegalStateException if it is detected that the mapping function modifies this map during computation and as a result computation would never complete.
Note again that only a "best-effort basis" is required for detection, and a ConcurrentModificationException is explicitly suggested only for the non concurrent (non thread-safe) classes.
Debugging ConcurrentModificationException
So, when you see a stack-trace due to a ConcurrentModificationException, you can not immediately assume that the cause is unsafe multi-threaded access to a Collection. You must examine the stack-trace to determine which class of Collection threw the exception (a method of the class will have directly or indirectly thrown it), and for which Collection object. Then you must examine from where that object can be modified.
The most common cause is modification of the Collection within an enhanced for loop over the Collection. Just because you do not see an Iterator object in your source code does not mean there is no Iterator there! Fortunately, one of the statements of the faulty for loop will usually be in the stack-trace, so tracking down the error is usually easy.
A trickier case is when your code passes around references to the Collection object. Note that unmodifiable views of collections (such as produced by Collections.unmodifiableList()) retain a reference to the modifiable collection, so iteration over an "unmodifiable" collection can throw the exception (the modification has been done elsewhere). Other views of your Collection, such as sub lists, Map entry sets and Map key sets also retain references to the original (modifiable) Collection. This can be a problem even for a thread-safe Collection, such as CopyOnWriteList; do not assume that thread-safe (concurrent) collections can never throw the exception.
Which operations can modify a Collection can be unexpected in some cases. For example, LinkedHashMap.get() modifies its collection.
The hardest cases are when the exception is due to concurrent modification by multiple threads.
Programming to prevent concurrent modification errors
When possible, confine all references to a Collection object, so its is easier to prevent concurrent modifications. Make the Collection a private object or a local variable, and do not return references to the Collection or its iterators from methods. It is then much easier to examine all the places where the Collection can be modified. If the Collection is to be used by multiple threads, it is then practical to ensure that the threads access the Collection only with appropriate synchonization and locking.
In Java 8, you can use lambda expression:
map.keySet().removeIf(key -> key condition);
removeIf is a convenient default method in Collection which uses Iterator internally to iterate over the elements of the calling collection.
The extraction of the removal condition is expressed by allowing the caller to provide a Predicate<? super E>.
"I'll perform the iteration for you and test your Predicate on each one of the elements in the collection. If an element causes the test method of the Predicate to return true, I'll remove it."
It sounds less like a Java synchronization issue and more like a database locking problem.
I don't know if adding a version to all your persistent classes will sort it out, but that's one way that Hibernate can provide exclusive access to rows in a table.
Could be that isolation level needs to be higher. If you allow "dirty reads", maybe you need to bump up to serializable.
Note that the selected answer cannot be applied to your context directly before some modification, if you are trying to remove some entries from the map while iterating the map just like me.
I just give my working example here for newbies to save their time:
HashMap<Character,Integer> map=new HashMap();
//adding some entries to the map
...
int threshold;
//initialize the threshold
...
Iterator it=map.entrySet().iterator();
while(it.hasNext()){
Map.Entry<Character,Integer> item=(Map.Entry<Character,Integer>)it.next();
//it.remove() will delete the item from the map
if((Integer)item.getValue()<threshold){
it.remove();
}
Try either CopyOnWriteArrayList or CopyOnWriteArraySet depending on what you are trying to do.
I ran into this exception when try to remove x last items from list.
myList.subList(lastIndex, myList.size()).clear(); was the only solution that worked for me.
HashTable is a thread-safe collection but does initializing it with an ArrayList (which is not thread-safe) as value endanger the whole thread-safety aspect?
Hashtable <Employee, ArrayList<Car>> carDealership = new Hashtable<>();
Further on, I am planning to wrap every action of ArrayLists in a synchronized block to prevent any race-conditions when operating with any methods.
Yet I haven't declared the ArrayLists in the HashTable as synchronized lists, this being achieved with the following code
Collections.synchronizedList(new ArrayList<>())
This will happen when I will be adding ArrayLists to the HashTable obviously.
How can I be sure that the ArrayLists in the HashTable are thread-safe?
Is it enough to pass a thread-safe ArrayList to the put() method of the hashTable and I'm good to go? (and not even worry about the constructor of the HashTable?) Therefore the put() method of the HashTable doesn't even recognize if I am passing a thread-safe/unsafe parameter?
Note: Thread-safety is a requirement. Otherwise I wouldn't have opted for this implementation.
The only way to ensure that the values in the Hashtable or ConcurrentHashMap are thread-safe is to wrap it in a way that prevents anyone from adding something that you don't control. Never expose the Map itself or any of the Lists contained in it to other parts of your code. Provide methods to get snapshot-copies if you need them, provide methods to add values to the lists, but make sure the class wrapping the map is the one that will create all lists that can ever get added to it. Iteration over the "live" lists in you map will require external synchronisation (as metioned in the JavaDocs of synchronizedList).
Both Hashtable and ConcurrentHashMap are thread-safe in that concurrent operations will not leave them in an invalid state. This means e.g. that if you invoke put from two threads with the same key, one of them will return the value the other inserted as the "old" value. But of course you can't tell which will be the first and which will be second in advance without some external synchronization.
The implementation is quite different, though: Hashtable and the synchronized Map returned by Collections.synchronizedMap(new HashMap()); are similar in that they basically add synchronized modifiers to most methods. This can be inefficient if you have lots of threads (i.e. high contention for the locks) that mostly read, but only occasionally modify the map. ConcurrentHashMap provides more fine grained locking:
Retrieval operations (including get) generally do not block
which can yield significantly better performance, depending on your use case. I also provides a richer API with powerful search- and bulk-modification-operations.
Yes, using ArrayList in this case is not thread safe. You can always get the object from the table and operate on it.
CopyOnWriteArrayList is a good substitue for it.
But you still have the case, when one thread takes (saves in a variable) the collection, and the other thread replaces with another one.
If you are not going to replace the lists inside the table, then this is not a problem.
I have a ConcurrentHashMap<String, Object> concurrentMap;
I need to return String[] with keys of the map.
Is the following code:
public String[] listKeys() {
return (String[]) concurrentMap.keySet().toArray();
}
thread safe?
While the ConcurrentHashMap is a thread-safe class, the Iterator that is used on the keys is NOT CERTAIN to be in sync with any subsequent HashMap changes, once created...
From the spec:
public Set<K> keySet()
Returns a Set view of the keys contained in this map......
...........................
The view's iterator is a "weakly consistent" iterator that will
never throw ConcurrentModificationException, and guarantees to
traverse elements as they existed upon construction of the iterator,
and may (but is not guaranteed to) reflect any modifications
subsequent to construction.
Yes and No. Threas-safe is only fuzzily defined as soon as you extend to scope.
Generally, concurrent collections implement all their methods in ways that allow concurrent access by multiple threads, or if they can't, provide mechanisms to serialize such accesses (e.g. synchronization) transparently. Thus, they are safe in the sense they ensure they preserve a valid internal structure and method calls give valid results.
The fuzziness starts if you look at the details, e.g. toArray() will return you some kind of snapshot of the collections contents. There is no guarantee that by the time the method returns the contents will not have already been changed. So while the call is thread safe, the result will not fulfill the usual invariants (e.g. the array contents may not be the same as the collections).
If you need consistency over the scope of mupltiple calls to a concurrent collection, you need to provide mechanisms within the code calling the methods to ensure the required consistency.
I have a query regarding ConcurrentHashMap.
ConcurrentHashMap is a map for concurrent access. ConcurrentHashMap implements ConcurrentMap which extends Map.
a) ConcurrentHashMap implements the methods defined in ConcurrentMap (like putifAbsent etc) which are atomic.
b) But, how about the methods in the Map interface which ConcurrentMap extends?
How are they now atomic? Have they been reimplemented by ConcurrentHashMap
If I have a reference of type ConcurrentHashMap and call a method from
the Map Interface(e.g put) or any other method, is that method an atomic method?
ConcurrentHashMap does not extend HashMap. They are both implementations of a hash table, but ConcurrentHashMap has very different internals to HashMap in order to provide concurrency.
If you provide a ConcurrentHashMap to a method that accepts Map, then that will work, and you will get the concurrent behaviour you expect. Map is simply an interface that describes a set of methods, ConcurrentHashMap implements that interface with concurrent behaviour.
There is a difference between 'concurrent' and 'atomic'. Concurrent means that multiple operations can happen at the same time and the Map (or whatever data structure we are talking about) will ALWAYS BE IN A VALID STATE. This means that you can have multiple threads calling put(), get(), remove(), etc on this map and there will never be any errors (if you try this with a regular HashMap you WILL get errors as it isn't designed to handle concurrency).
Atomic means that an action that takes multiple steps appears to take a single step to other threads - as fair as they are aware it has completely finished or hasn't even started yet. For ConcurrentHashMap, putIfAbsent() is one such example. From the javadoc,
If the specified key is not already associated with a value, associate it with the given value. This is equivalent to:
if (!map.containsKey(key)) {
return map.put(key, value);
else
return map.get(key);
except that the action is performed atomically.
If you tried the above code with a ConcurrentHashMap, you wouldn't get any errors (since it is Concurrent), but there is a good change that other threads would interleave with the main thread and your entry would get overwritten or removed. ConcurrentMap specifies the atomic putIfAbsent() method to ensure that implementations can perform those steps atomically, without interference from other threads.
Map is just an interface. Therefore the ConcurrentHashMap has an atomic implementation of those methods.
The putIfAbsent method is a convenient way in a concurrent environment to execute an atomic if not contains then put that you cannot do from the Map interface even though the Map is actually of type ConcurrentHashMap.
The implementation of these methods like put() and remove() are getting the lock on the final Sync object, concurrent retrieval will always give the most recent data on the map.
In case of putAll() or clear(), which operates on whole Map, concurrent read may reflect insertion and removal of only some entries.
Below Two links will help you to understand:
http://javarevisited.blogspot.in/2013/02/concurrenthashmap-in-java-example-tutorial-working.html
http://www.javamex.com/tutorials/synchronization_concurrency_8_hashmap2.shtml
I'm looking for a structure which hashes keys without requiring a value. When queried, it should return true if the key is found and false otherwise. I'm looking for something similar to
Hashtable<MyClass, Boolean>
except insertion requires only a key and queries only ever return true or false, never null.
You need Java's HashSet (Java 8).
The description from the official documentation is:
This class implements the Set interface, backed by a hash table
(actually a HashMap instance). It makes no guarantees as to the
iteration order of the set; in particular, it does not guarantee that
the order will remain constant over time. This class permits the null
element.
This class offers constant time performance for the basic operations
(add, remove, contains and size), assuming the hash function disperses
the elements properly among the buckets. Iterating over this set
requires time proportional to the sum of the HashSet instance's size
(the number of elements) plus the "capacity" of the backing HashMap
instance (the number of buckets). Thus, it's very important not to set
the initial capacity too high (or the load factor too low) if
iteration performance is important.
Note that this implementation is not synchronized. If multiple threads
access a hash set concurrently, and at least one of the threads
modifies the set, it must be synchronized externally. This is
typically accomplished by synchronizing on some object that naturally
encapsulates the set. If no such object exists, the set should be
"wrapped" using the Collections.synchronizedSet method. This is best
done at creation time, to prevent accidental unsynchronized access to
the set:
Set s = Collections.synchronizedSet(new HashSet(...));
The iterators returned by this class's iterator method are fail-fast:
if the set is modified at any time after the iterator is created, in
any way except through the iterator's own remove method, the Iterator
throws a ConcurrentModificationException. Thus, in the face of
concurrent modification, the iterator fails quickly and cleanly,
rather than risking arbitrary, non-deterministic behavior at an
undetermined time in the future.
Note that the fail-fast behavior of an iterator cannot be guaranteed
as it is, generally speaking, impossible to make any hard guarantees
in the presence of unsynchronized concurrent modification. Fail-fast
iterators throw ConcurrentModificationException on a best-effort
basis. Therefore, it would be wrong to write a program that depended
on this exception for its correctness: the fail-fast behavior of
iterators should be used only to detect bugs.
This class is a member of the Java Collections Framework.
java.util.HashSet? Using contains() for your lookup.
See also the static methods Collections#newSetFromMap that creates a set based on the given map implementation. This is eg handy to create a weak hash set.
Java Set is designed to remove duplicates and hopefully the HashMap must be using Java Set internally for managing key as keys can never have duplicates, So you should be considering set for your requirement.