This question already has answers here:
What is difference between Collection.stream().forEach() and Collection.forEach()?
(5 answers)
Closed 8 years ago.
It looks like I can call list.forEach(a -> a.stuff()) directly on my collection, instead of list.stream().forEach(a -> a.stuff()). When would I use one over the other (parallelStream() aside..)?
There are a few differences:
Iterable.forEach guarantees processing in iteration order, if it's defined for the Iterable. (Iteration order is generally well-defined for Lists.) Stream.forEach does not; one must use Stream.forEachOrdered instead.
Iterable.forEach may permit side effects on the underlying data structure. Although many collections' iterators will throw ConcurrentModificationException if the collection is modified during iteration, some collections' iterators explicitly permit it. See CopyOnWriteArrayList, for example. By contrast, stream operations in general must not interfere with the stream source.
If the Iterable is a synchronized wrapper collection, for example, from Collections.synchronizedList(), a call to forEach on it will hold its lock during the entire iteration. This will prevent other threads from modifying the collection during the iteration, ensuring that the iteration sees a consistent view of the collection, and preventing ConcurrentModificationException. (This will also prevent other threads from reading the collection during the iteration.) This is not the case for streams. There is nothing to prevent the collection from being modified during the stream operation, and if modification does occur, the result is undefined.
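As a rough illustration of the second point, the sketch below modifies a list from inside forEach. The class and method names are made up for this example; the behavior shown is what I would expect from the OpenJDK implementations of CopyOnWriteArrayList (which iterates over a snapshot) and ArrayList (which checks for modification).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

class ForEachSideEffects {
    // Hypothetical helper: appends to the list from inside forEach and returns the final size.
    static int appendWhileIterating(List<Integer> list) {
        list.forEach(x -> list.add(x * 10));
        return list.size();
    }

    public static void main(String[] args) {
        // CopyOnWriteArrayList iterates over a snapshot, so the appends are tolerated.
        List<Integer> cow = new CopyOnWriteArrayList<>(Arrays.asList(1, 2, 3));
        System.out.println("CopyOnWriteArrayList size: " + appendWhileIterating(cow));

        // ArrayList detects the modification and throws.
        try {
            appendWhileIterating(new ArrayList<>(Arrays.asList(1, 2, 3)));
        } catch (ConcurrentModificationException e) {
            System.out.println("ArrayList threw " + e.getClass().getSimpleName());
        }
    }
}
```

Doing the same thing inside a stream operation on either list would count as interference with the stream source, so it should be avoided there regardless of the collection type.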
Related
I am using a Collection (a HashMap used indirectly by the JPA, it so happens), but apparently randomly the code throws a ConcurrentModificationException. What is causing it and how do I fix this problem? By using some synchronization, perhaps?
Here is the full stack-trace:
Exception in thread "pool-1-thread-1" java.util.ConcurrentModificationException
at java.util.HashMap$HashIterator.nextEntry(Unknown Source)
at java.util.HashMap$ValueIterator.next(Unknown Source)
at org.hibernate.collection.AbstractPersistentCollection$IteratorProxy.next(AbstractPersistentCollection.java:555)
at org.hibernate.engine.Cascade.cascadeCollectionElements(Cascade.java:296)
at org.hibernate.engine.Cascade.cascadeCollection(Cascade.java:242)
at org.hibernate.engine.Cascade.cascadeAssociation(Cascade.java:219)
at org.hibernate.engine.Cascade.cascadeProperty(Cascade.java:169)
at org.hibernate.engine.Cascade.cascade(Cascade.java:130)
This is not a synchronization problem. This will occur if the underlying collection that is being iterated over is modified by anything other than the Iterator itself.
Iterator it = map.entrySet().iterator();
while (it.hasNext()) {
Entry item = it.next();
map.remove(item.getKey());
}
This will throw a ConcurrentModificationException when it.next() is called the second time (assuming the map holds more than one entry); for a HashMap the check happens in next(), not in hasNext().
The correct approach would be
Iterator it = map.entrySet().iterator();
while (it.hasNext()) {
Entry item = it.next();
it.remove();
}
Assuming this iterator supports the remove() operation.
Try using a ConcurrentHashMap instead of a plain HashMap.
Modification of a Collection while iterating through that Collection using an Iterator is not permitted by most of the Collection classes. The Java library calls an attempt to modify a Collection while iterating through it a "concurrent modification". That unfortunately suggests the only possible cause is simultaneous modification by multiple threads, but that is not so. Using only one thread it is possible to create an iterator for the Collection (using Collection.iterator(), or an enhanced for loop), start iterating (using Iterator.next(), or equivalently entering the body of the enhanced for loop), modify the Collection, then continue iterating.
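A minimal single-threaded sketch of that sequence, using an ArrayList and an enhanced for loop. The class and method names are invented for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.List;

class SingleThreadCme {
    // Returns true if iterating and modifying in the same thread raised the exception.
    static boolean throwsCme() {
        List<String> names = new ArrayList<>(Arrays.asList("a", "b", "c"));
        try {
            for (String name : names) {   // the loop creates an Iterator implicitly
                names.add(name + "!");    // modifies the list directly, not via the Iterator
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("threw: " + throwsCme());
    }
}
```

No second thread is involved anywhere; the exception comes from the iterator noticing that the list changed behind its back.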
To help programmers, some implementations of those Collection classes attempt to detect erroneous concurrent modification, and throw a ConcurrentModificationException if they detect it. However, it is in general neither possible nor practical to guarantee detection of all concurrent modifications. So erroneous use of the Collection does not always result in a thrown ConcurrentModificationException.
The documentation of ConcurrentModificationException says:
This exception may be thrown by methods that have detected concurrent modification of an object when such modification is not permissible...
Note that this exception does not always indicate that an object has been concurrently modified by a different thread. If a single thread issues a sequence of method invocations that violates the contract of an object, the object may throw this exception...
Note that fail-fast behavior cannot be guaranteed as it is, generally speaking, impossible to make any hard guarantees in the presence of unsynchronized concurrent modification. Fail-fast operations throw ConcurrentModificationException on a best-effort basis.
Note that
the exception may be thrown, not must be thrown
different threads are not required
throwing the exception cannot be guaranteed
throwing the exception is on a best-effort basis
throwing the exception happens when the concurrent modification is detected, not when it is caused
The documentation of the HashSet, HashMap, TreeSet and ArrayList classes says this:
The iterators returned [directly or indirectly from this class] are fail-fast: if the [collection] is modified at any time after the iterator is created, in any way except through the iterator's own remove method, the Iterator throws a ConcurrentModificationException. Thus, in the face of concurrent modification, the iterator fails quickly and cleanly, rather than risking arbitrary, non-deterministic behavior at an undetermined time in the future.
Note that the fail-fast behavior of an iterator cannot be guaranteed as it is, generally speaking, impossible to make any hard guarantees in the presence of unsynchronized concurrent modification. Fail-fast iterators throw ConcurrentModificationException on a best-effort basis. Therefore, it would be wrong to write a program that depended on this exception for its correctness: the fail-fast behavior of iterators should be used only to detect bugs.
Note again that the behaviour "cannot be guaranteed" and is only "on a best-effort basis".
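To make "best-effort" concrete, here is a sketch of a well-known case where an erroneous modification goes undetected. It relies on how the OpenJDK ArrayList iterator happens to be implemented (an assumption, not a documented guarantee), and the names are made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class UndetectedModification {
    // Removes the second-to-last element during iteration and records what the loop visited.
    static List<String> visitedWhileRemoving() {
        List<String> list = new ArrayList<>(Arrays.asList("a", "b", "c"));
        List<String> visited = new ArrayList<>();
        for (String s : list) {
            visited.add(s);
            if (s.equals("b")) {
                list.remove(s);   // erroneous, but hasNext() now returns false and the loop just ends
            }
        }
        return visited;
    }

    public static void main(String[] args) {
        System.out.println(visitedWhileRemoving());
    }
}
```

In this sketch no exception is thrown and the last element is silently skipped, which is arguably worse than failing: the bug is still there, it just is not reported.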
The documentation of several methods of the Map interface says this:
Non-concurrent implementations should override this method and, on a best-effort basis, throw a ConcurrentModificationException if it is detected that the mapping function modifies this map during computation. Concurrent implementations should override this method and, on a best-effort basis, throw an IllegalStateException if it is detected that the mapping function modifies this map during computation and as a result computation would never complete.
Note again that only a "best-effort basis" is required for detection, and a ConcurrentModificationException is explicitly suggested only for the non concurrent (non thread-safe) classes.
Debugging ConcurrentModificationException
So, when you see a stack-trace due to a ConcurrentModificationException, you cannot immediately assume that the cause is unsafe multi-threaded access to a Collection. You must examine the stack-trace to determine which class of Collection threw the exception (a method of the class will have directly or indirectly thrown it), and for which Collection object. Then you must examine from where that object can be modified.
The most common cause is modification of the Collection within an enhanced for loop over the Collection. Just because you do not see an Iterator object in your source code does not mean there is no Iterator there! Fortunately, one of the statements of the faulty for loop will usually be in the stack-trace, so tracking down the error is usually easy.
A trickier case is when your code passes around references to the Collection object. Note that unmodifiable views of collections (such as those produced by Collections.unmodifiableList()) retain a reference to the modifiable collection, so iteration over an "unmodifiable" collection can throw the exception (the modification has been done elsewhere). Other views of your Collection, such as sublists, Map entry sets and Map key sets, also retain references to the original (modifiable) Collection. This can be a problem even for a thread-safe Collection, such as CopyOnWriteArrayList; do not assume that thread-safe (concurrent) collections can never throw the exception.
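A small sketch of the unmodifiable-view case, with invented names: the loop only ever touches the view, yet the exception is thrown because the backing list changes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.ConcurrentModificationException;
import java.util.List;

class UnmodifiableViewCme {
    // Returns true if iterating the read-only view failed after the backing list changed.
    static boolean viewThrows() {
        List<Integer> backing = new ArrayList<>(Arrays.asList(1, 2, 3));
        List<Integer> view = Collections.unmodifiableList(backing);
        try {
            for (Integer i : view) {
                backing.add(i);   // the modification goes through the backing list
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("view threw: " + viewThrows());
    }
}
```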
Which operations can modify a Collection can be unexpected in some cases. For example, LinkedHashMap.get() is a structural modification when the map was constructed in access order.
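The LinkedHashMap case can be reproduced with the three-argument constructor, whose last parameter selects access order. This is only a sketch; the class and method names are hypothetical.

```java
import java.util.ConcurrentModificationException;
import java.util.LinkedHashMap;
import java.util.Map;

class AccessOrderGet {
    // Returns true if calling get() while iterating the key set raised the exception.
    static boolean getThrows(boolean accessOrder) {
        Map<String, Integer> map = new LinkedHashMap<>(16, 0.75f, accessOrder);
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        try {
            for (String key : map.keySet()) {
                map.get(key);   // in access order this moves the entry to the end
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("insertion order: " + getThrows(false));
        System.out.println("access order:    " + getThrows(true));
    }
}
```

With the default insertion order, get() is a plain read and the loop completes normally.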
The hardest cases are when the exception is due to concurrent modification by multiple threads.
Programming to prevent concurrent modification errors
When possible, confine all references to a Collection object, so it is easier to prevent concurrent modifications. Make the Collection a private object or a local variable, and do not return references to the Collection or its iterators from methods. It is then much easier to examine all the places where the Collection can be modified. If the Collection is to be used by multiple threads, it is then practical to ensure that the threads access the Collection only with appropriate synchronization and locking.
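One way to apply this advice, sketched with a hypothetical Roster class: the list stays private, every access goes through synchronized methods, and callers only ever receive copies.

```java
import java.util.ArrayList;
import java.util.List;

class Roster {
    // The list never escapes this class, so every modification is visible in this file.
    private final List<String> members = new ArrayList<>();

    synchronized void add(String name) {
        members.add(name);
    }

    // Callers get an independent copy, never the list itself or one of its iterators.
    synchronized List<String> snapshot() {
        return new ArrayList<>(members);
    }

    public static void main(String[] args) {
        Roster roster = new Roster();
        roster.add("a");
        roster.add("b");
        // Iterating the copy while modifying the roster is safe.
        for (String name : roster.snapshot()) {
            roster.add(name + "2");
        }
        System.out.println(roster.snapshot());
    }
}
```

The cost is one copy per snapshot() call, which may or may not be acceptable depending on the size of the collection and how often it is read.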
In Java 8, you can use a lambda expression:
map.keySet().removeIf(key -> key condition);
removeIf is a convenient default method in Collection which uses Iterator internally to iterate over the elements of the calling collection.
The extraction of the removal condition is expressed by allowing the caller to provide a Predicate<? super E>.
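A small runnable sketch of removeIf on map views. The map contents, the "tmp" prefix and the threshold are invented for illustration.

```java
import java.util.HashMap;
import java.util.Map;

class RemoveIfDemo {
    // Removes keys starting with "tmp", then entries whose value is below the threshold.
    static Map<String, Integer> prune(Map<String, Integer> map, int threshold) {
        map.keySet().removeIf(key -> key.startsWith("tmp"));
        map.entrySet().removeIf(entry -> entry.getValue() < threshold);
        return map;
    }

    public static void main(String[] args) {
        Map<String, Integer> map = new HashMap<>();
        map.put("tmp1", 50);
        map.put("low", 1);
        map.put("high", 99);
        System.out.println(prune(map, 10));
    }
}
```

Because keySet() and entrySet() are views, removing from them removes the corresponding mappings from the map itself.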
"I'll perform the iteration for you and test your Predicate on each one of the elements in the collection. If an element causes the test method of the Predicate to return true, I'll remove it."
It sounds less like a Java synchronization issue and more like a database locking problem.
I don't know if adding a version to all your persistent classes will sort it out, but that's one way that Hibernate can provide exclusive access to rows in a table.
Could be that isolation level needs to be higher. If you allow "dirty reads", maybe you need to bump up to serializable.
Note that the selected answer cannot be applied directly if, like me, you are trying to remove only some entries from the map while iterating over it; it needs a small modification.
Here is my working example, to save newcomers some time:
Map<Character, Integer> map = new HashMap<>();
// adding some entries to the map
...
int threshold;
// initialize the threshold
...
Iterator<Map.Entry<Character, Integer>> it = map.entrySet().iterator();
while (it.hasNext()) {
    Map.Entry<Character, Integer> item = it.next();
    // it.remove() deletes the current entry from the map
    if (item.getValue() < threshold) {
        it.remove();
    }
}
Try either CopyOnWriteArrayList or CopyOnWriteArraySet depending on what you are trying to do.
I ran into this exception when trying to remove the last x items from a list.
myList.subList(lastIndex, myList.size()).clear(); was the only solution that worked for me.
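For reference, a self-contained version of that one-liner. The helper name removeLast and the sample data are assumptions made for this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class RemoveLastItems {
    // Removes the last `count` elements in place by clearing a sublist view.
    static void removeLast(List<?> list, int count) {
        int from = Math.max(0, list.size() - count);
        list.subList(from, list.size()).clear();
    }

    public static void main(String[] args) {
        List<Integer> numbers = new ArrayList<>(Arrays.asList(1, 2, 3, 4, 5));
        removeLast(numbers, 2);
        System.out.println(numbers);   // prints [1, 2, 3]
    }
}
```

This works because clear() on the sublist view removes the range from the backing list in one call, with no iterator left open across the modification.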
I'm new to Java 8 and working on a problem where multiple threads (~10) are writing values to a ConcurrentHashMap. I have another dedicated thread which reads all the values present in the ConcurrentHashMap and returns them (every 30 seconds). Is iterating over the result of the values() method the recommended way of fetching results without getting a ConcurrentModificationException?
Note: I am perfectly fine with getting stale data
I went over the official docs which says:
Retrieval operations generally do not block, so may overlap with update operations. Retrievals reflect the results of the most recently completed update operations holding upon their onset. For aggregate operations such as putAll and clear, concurrent retrievals may reflect insertion or removal of only some entries. Similarly, Iterators, Spliterators and Enumerations return elements reflecting the state of the hash table at some point at or since the creation of the iterator/enumeration. They do not throw ConcurrentModificationException.
However doc of values() method says:
Returns a Collection view of the values contained in this map
Is the below code thread safe?
for (String name : myMap.values()) {
    System.out.println("name: " + name);
}
Is iterating over result of values() method the recommended way of fetching results without getting a ConcurrentModificationException?
Yes. It is the recommended way, and you won't get a ConcurrentModificationException.
As the package level javadoc states:
Most concurrent Collection implementations (including most Queues) also differ from the usual java.util conventions in that their Iterators and Spliterators provide weakly consistent rather than fast-fail traversal:
they may proceed concurrently with other operations
they will never throw ConcurrentModificationException
they are guaranteed to traverse elements as they existed upon construction exactly once, and may (but are not guaranteed to) reflect any modifications subsequent to construction.
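The difference from a fail-fast iterator can be seen even in a single thread. The sketch below (names invented) writes to the map while iterating over values(); with a ConcurrentHashMap the loop completes, while the same code on a HashMap throws.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class WeaklyConsistentIteration {
    // Fills the map with five entries, then adds one more key while iterating over the values.
    static int countWhileWriting(Map<Integer, String> map) {
        for (int i = 0; i < 5; i++) {
            map.put(i, "v" + i);
        }
        int seen = 0;
        for (String value : map.values()) {
            map.put(-1, "late");   // a write during iteration
            seen++;
        }
        return seen;
    }

    public static void main(String[] args) {
        // The late entry may or may not be visited, so the count is 5 or 6.
        System.out.println("ConcurrentHashMap saw " + countWhileWriting(new ConcurrentHashMap<>()));
        try {
            countWhileWriting(new HashMap<>());
        } catch (ConcurrentModificationException e) {
            System.out.println("HashMap threw " + e.getClass().getSimpleName());
        }
    }
}
```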
Is the below code thread safe?
for (String name : myMap.values()) {
    System.out.println("name: " + name);
}
Yes ... with some qualifications.
Thread safety really means that the code works according to its specified behavior in a multi-threaded application. The problem is that you haven't said clearly what you expect the code to actually do.
What we can say is the following:
The iteration will see values as per the previously stated guarantees.
The memory model guarantees mean that there shouldn't be any nasty behavior with stale values ... unless you mutate value objects after putting them into the map. (If you do that, then the object's methods need to be implemented to cope with that; e.g. they may need to be synchronized. This is moot for String values, since they are immutable.)
I'm reading the official Java documentation on wrapper implementations, the static methods in Collections used to get a synchronized collection, for example: List<Type> list = Collections.synchronizedList(new ArrayList<Type>());
...
The thing that I did not understand is the following (I quote from the Java doc):
A collection created in this fashion is every bit as thread-safe as a normally synchronized collection, such as a Vector.
In the face of concurrent access, it is imperative that the user manually synchronize on the returned collection when iterating over it. The reason is that iteration is accomplished via multiple calls into the collection, which must be composed into a single atomic operation...
How can it be every bit as thread-safe and still need manual synchronization when iterating?
It is thread safe in the sense that each of its individual methods is thread safe, but if you perform compound actions on the collection, then your code is at risk of concurrency issues.
ex:
List<String> synchronizedList = Collections.synchronizedList(someList);
synchronizedList.add(whatever); // this is thread safe
The individual method add() is thread safe, but if I perform the following:
List<String> synchronizedList = Collections.synchronizedList(someList);
if(!synchronizedList.contains(whatever))
synchronizedList.add(whatever); // this is not thread safe
the if-then-add operation is not thread safe because some other thread might have added whatever to the list after contains() check.
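A sketch of how the compound action can be made atomic: hold the wrapper's own lock around both calls. Synchronizing on the returned list is what the Collections.synchronizedList documentation prescribes for iteration, and the same lock covers this case; the helper name addIfAbsent is made up.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class AtomicCheckThenAdd {
    // Performs contains() and add() under the wrapper's lock so no other thread can interleave.
    static boolean addIfAbsent(List<String> synchronizedList, String item) {
        synchronized (synchronizedList) {
            if (!synchronizedList.contains(item)) {
                return synchronizedList.add(item);
            }
            return false;
        }
    }

    public static void main(String[] args) {
        List<String> list = Collections.synchronizedList(new ArrayList<>());
        System.out.println(addIfAbsent(list, "x"));   // true: the item was added
        System.out.println(addIfAbsent(list, "x"));   // false: the item was already present
    }
}
```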
There is no contradiction here: collections returned from synchronizedXyz suffer the same shortcoming as synchronized collections available to you directly, namely the need to manually synchronize on iterating the collection.
The problem of external iteration cannot be solved by a better class design, because iterating a collection inherently requires making multiple calls to its methods (see this Q&A for detailed explanation).
Note that starting with Java 1.8 you can iterate without additional synchronization by using the forEach method of your synchronized collection. This is thread-safe, and comes with additional benefits; see this Q&A for details.
The reason this is different from iterating externally is that forEach implementation inside the collection takes care of synchronizing the iteration for you.
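The two styles side by side, as a minimal sketch with invented method names: external iteration needs an explicit synchronized block, while forEach on the wrapper takes the lock itself.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class SynchronizedIteration {
    // External iteration: the caller must hold the wrapper's lock for the whole loop.
    static int sumExternal(List<Integer> syncList) {
        int sum = 0;
        synchronized (syncList) {
            for (int value : syncList) {
                sum += value;
            }
        }
        return sum;
    }

    // Internal iteration: the wrapper's forEach holds the lock for the duration of the call.
    static int sumInternal(List<Integer> syncList) {
        int[] sum = {0};
        syncList.forEach(value -> sum[0] += value);
        return sum[0];
    }

    public static void main(String[] args) {
        List<Integer> list = Collections.synchronizedList(new ArrayList<>(Arrays.asList(1, 2, 3)));
        System.out.println(sumExternal(list) + " " + sumInternal(list));   // prints 6 6
    }
}
```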
This question already has answers here:
What is difference between Collection.stream().forEach() and Collection.forEach()?
(5 answers)
Closed 7 years ago.
This is an example:
code A:
files.forEach(f -> {
//TODO
});
and code B does it this way:
files.stream().forEach(f -> { });
What is the difference between both, with stream() and no stream()?
Practically speaking, they are mostly the same, but there is a small semantic difference.
Code A is defined by Iterable.forEach, whereas code B is defined by Stream.forEach. The definition of Stream.forEach allows for the elements to be processed in any order -- even for sequential streams. (For parallel streams, Stream.forEach will very likely process elements out-of-order.)
Iterable.forEach gets an Iterator from the source and calls forEachRemaining() on it. As far as I can see, all current (JDK 8) implementations of Stream.forEach on the collections classes will create a Spliterator built from one of the source's Iterators, and will then call forEachRemaining on that Iterator -- just like Iterable.forEach does. So they do the same thing, though the streams version has some extra setup overhead.
However, in the future, it's possible that the streams implementation could change so that this is no longer the case.
(If you want to guarantee ordering of processing streams elements, use forEachOrdered() instead.)
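A short sketch of forEachOrdered on a parallel stream, where the difference matters most. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class OrderedForEach {
    // Copies the source into a new list through a parallel stream, preserving encounter order.
    static List<Integer> copyInOrder(List<Integer> source) {
        List<Integer> out = Collections.synchronizedList(new ArrayList<>());
        source.parallelStream().forEachOrdered(out::add);
        return out;
    }

    public static void main(String[] args) {
        List<Integer> source = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8);
        System.out.println(copyInOrder(source));   // same order as the source
    }
}
```

Replacing forEachOrdered with forEach here would still copy every element, but the resulting order would no longer be guaranteed.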
There is no difference in terms of semantics, though the direct implementation without stream is probably slightly more efficient.
A stream is a sequence of elements that operations are performed on; unlike a collection, it does not store the elements itself. Any Collection can be exposed as a stream. The operations you perform on a stream can be either
Intermediate operations (map, skip, concat, distinct, filter, sorted, limit, peek, ...), which produce another java.util.stream.Stream. Intermediate operations are lazy: they are executed only when a terminal operation runs.
Or terminal operations (forEach, max, count, anyMatch, findFirst, reduce, collect, sum, findAny), which produce a result that is not a stream.
Basically it is similar to pipeline as in Unix.
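The laziness of intermediate operations can be observed with a small sketch like the one below. The log list and the names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

class LazyPipeline {
    // Records when the filter and the terminal operation actually run.
    static List<String> run() {
        List<String> log = new ArrayList<>();
        Stream<Integer> pipeline = Stream.of(1, 2, 3).filter(x -> {
            log.add("filter " + x);
            return x % 2 == 1;
        });
        log.add("pipeline built");   // nothing has been filtered at this point
        pipeline.forEach(x -> log.add("forEach " + x));
        return log;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

The first log entry is "pipeline built": the filter lambda only starts running once the terminal forEach pulls elements through the pipeline.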
Both approaches use the terminal operation Iterable.forEach, but the version with .stream() also unnecessarily creates a Stream object representing the List. While there is no difference, it is suboptimal.
This question already has answers here:
Iterating through a Collection, avoiding ConcurrentModificationException when removing objects in a loop
(31 answers)
Closed 6 years ago.
The following mockup code ends in a ConcurrentModificationException, which happens (as I understand it) because I am iterating over a set that I am modifying.
Set<String> data = new HashSet<String>();
data.add("a=1");
data.add("b=2");
data.add("c=3");
data.add("d=4");
for (String s : data) {
data.remove(s);
}
But why exactly does this happen? Please help clarify.
You're violating the iterator's contract. From the ConcurrentModificationException javadoc,
If a single thread issues a sequence of method invocations that violates the contract of an object, the object may throw this exception. For example, if a thread modifies a collection directly while it is iterating over the collection with a fail-fast iterator, the iterator will throw this exception.
The exception is being thrown simply because you are modifying the collection (by calling data.remove(s)) while iterating over it. Java Collections generally have the requirement that they cannot be modified while iterating over their values.
From the official documentation:
it is not generally permissible for one thread to modify a Collection while another thread is iterating over it. In general, the results of the iteration are undefined under these circumstances. Some Iterator implementations (including those of all the general purpose collection implementations provided by the JRE) may choose to throw this exception if this behavior is detected. Iterators that do this are known as fail-fast iterators, as they fail quickly and cleanly, rather that risking arbitrary, non-deterministic behavior at an undetermined time in the future.
You have to use an Iterator to remove elements from a Set while iterating over it.
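A sketch of that approach using the data from the question. Which elements to remove (here, those starting with "a" or "c") is an arbitrary choice made for the example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;

class SafeSetRemoval {
    // Removes selected elements through the Iterator, which keeps the iteration valid.
    static Set<String> removeSome() {
        Set<String> data = new HashSet<>(Arrays.asList("a=1", "b=2", "c=3", "d=4"));
        Iterator<String> it = data.iterator();
        while (it.hasNext()) {
            String s = it.next();
            if (s.startsWith("a") || s.startsWith("c")) {
                it.remove();   // allowed: the modification goes through the Iterator itself
            }
        }
        return data;
    }

    public static void main(String[] args) {
        System.out.println(removeSome());
    }
}
```

If every element should go, data.clear() is simpler, and for a condition-based removal data.removeIf(...) does the same job in one call.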
It's because the compiler actually inserts an Iterator and then uses a traditional for loop to iterate over the elements. If you modify the Collection on which the iterator is based, this would lead to undefined behavior. To prevent this, a ConcurrentModificationException is thrown.
See also here:
Item 7. Do not modify the list during iteration. While for-each syntax does not provide direct access to the iterator used by the equivalent basic for loop, the list can be modified by directly calling other methods on the list. Doing so can lead to indeterminate program behavior. In particular, if the compiler-inserted call to iterator() returns a fail-fast iterator, a java.util.ConcurrentModificationException runtime exception may be thrown. But this is only done on a best-effort basis, and cannot be relied upon except as a means of detecting a bug when the exception does get thrown.
or the section on the for each loop in the Language Specification.
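To show where the hidden Iterator sits, here is the loop from the question written out roughly the way the enhanced for loop is expanded. This is a sketch of the equivalent form, not the compiler's literal output, and the names are invented.

```java
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;

class DesugaredForEach {
    // Same logic as `for (String s : data) data.remove(s);` with the Iterator made explicit.
    static boolean explicitLoopThrows() {
        Set<String> data = new HashSet<>(Arrays.asList("a=1", "b=2", "c=3", "d=4"));
        try {
            for (Iterator<String> it = data.iterator(); it.hasNext(); ) {
                String s = it.next();   // the second call detects the earlier removal
                data.remove(s);         // bypasses the Iterator
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("threw: " + explicitLoopThrows());
    }
}
```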