What is the difference between .foreach and .stream().foreach? [duplicate] - java

This question already has answers here:
What is difference between Collection.stream().forEach() and Collection.forEach()?
(5 answers)
Closed 7 years ago.
This is an example:
code A:
files.forEach(f -> {
//TODO
});
and code B does it this way:
files.stream().forEach(f -> { });
What is the difference between both, with stream() and no stream()?

Practically speaking, they are mostly the same, but there is a small semantic difference.
Code A is defined by Iterable.forEach, whereas code B is defined by Stream.forEach. The definition of Stream.forEach allows for the elements to be processed in any order -- even for sequential streams. (For parallel streams, Stream.forEach will very likely process elements out-of-order.)
Iterable.forEach gets an Iterator from the source and calls forEachRemaining() on it. As far as I can see, all current (JDK 8) implementations of Stream.forEach on the collections classes will create a Spliterator built from one of the source's Iterators, and will then call forEachRemaining on that Iterator -- just like Iterable.forEach does. So they do the same thing, though the streams version has some extra setup overhead.
However, in the future, it's possible that the streams implementation could change so that this is no longer the case.
(If you want to guarantee ordering of processing streams elements, use forEachOrdered() instead.)
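To make the ordering point concrete, here is a minimal sketch comparing the three calls on a parallel stream. The class and method names are made up for illustration; only forEach, forEachOrdered and parallelStream come from the JDK.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class ForEachOrderDemo {
    // Iterable.forEach: runs on the calling thread, in iteration order.
    static List<Integer> viaIterable(List<Integer> source) {
        List<Integer> seen = new ArrayList<>();
        source.forEach(seen::add);
        return seen;
    }

    // Stream.forEachOrdered: encounter order is respected even for a parallel stream.
    static List<Integer> viaForEachOrdered(List<Integer> source) {
        List<Integer> seen = Collections.synchronizedList(new ArrayList<>());
        source.parallelStream().forEachOrdered(seen::add);
        return new ArrayList<>(seen);
    }

    // Stream.forEach on a parallel stream: every element is visited, but the order is unspecified.
    static List<Integer> viaParallelForEach(List<Integer> source) {
        List<Integer> seen = Collections.synchronizedList(new ArrayList<>());
        source.parallelStream().forEach(seen::add);
        return new ArrayList<>(seen);
    }

    public static void main(String[] args) {
        List<Integer> source = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8);
        System.out.println("Iterable.forEach:        " + viaIterable(source));
        System.out.println("parallel forEachOrdered: " + viaForEachOrdered(source));
        System.out.println("parallel forEach:        " + viaParallelForEach(source));
    }
}
```

The first two lines of output always match the source order; the third may or may not, depending on the run.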

There is no difference in terms of semantics, though the direct implementation without stream is probably slightly more efficient.

A stream is a sequence of elements on which operations can be performed; unlike a collection, it is not a data structure that stores its elements. Any Collection can be exposed as a stream. The operations you perform on a stream can be either
Intermediate operations (map, skip, concat, distinct, filter, sorted, limit, peek, ...), which produce another java.util.stream.Stream. Intermediate operations are lazy: they are executed only once a terminal operation is invoked.
Or terminal operations (forEach, max, count, anyMatch, findFirst, reduce, collect, sum, findAny, ...), which produce a result that is not a stream.
Basically, it is similar to a pipeline in Unix.
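The laziness of intermediate operations can be observed with a counter. This is only a sketch: LazyPipelineDemo and mapCallsBeforeAndAfter are hypothetical names, and the counter exists purely to show when map actually runs.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class LazyPipelineDemo {
    // Returns {map calls before the terminal operation, map calls after it}.
    static int[] mapCallsBeforeAndAfter(List<String> words) {
        AtomicInteger mapCalls = new AtomicInteger();

        // Only intermediate operations so far: nothing has been traversed yet.
        Stream<String> pipeline = words.stream()
                .filter(w -> w.length() > 2)
                .map(w -> {
                    mapCalls.incrementAndGet();
                    return w.toUpperCase();
                });
        int before = mapCalls.get();

        // The terminal operation triggers the traversal.
        pipeline.collect(Collectors.toList());
        int after = mapCalls.get();

        return new int[] {before, after};
    }

    public static void main(String[] args) {
        int[] calls = mapCallsBeforeAndAfter(Arrays.asList("a", "bcd", "efgh"));
        System.out.println("before terminal op: " + calls[0] + ", after: " + calls[1]);
    }
}
```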

Code A calls Iterable.forEach directly, while code B calls the terminal operation Stream.forEach; the version with .stream() therefore also creates a Stream object representing the List, which is unnecessary here. While the result is the same in practice, it is suboptimal.

Related

Terminal operations on streams cannot be chained?

I am confused by the statement that a stream can have only one terminal operation and that terminal operations cannot be chained. We can write something like this, right?
Stream1.map().collect().forEach()
Isn't this chaining collect and forEach, which are both terminal operations? I don't get that part.
The above works fine
It works because, assuming you meant collect(Collectors.toList()), forEach is a List operation here, not a Stream operation. Perhaps this is the source of your confusion: there is a forEach method on Stream as well, but that's not what you're using.
Even if it weren't, nothing stops you from creating another stream from something that can be streamed, you just can't use the same stream you created in the first place.
Stream has forEach, and List has forEach (by extending Iterable). Different methods, but with the same name and purpose. Only the former is a terminal operation.
One practical difference is that the Stream version can be called on a parallel stream, and in that case, the order is not guaranteed. It might appear "random". The version from Iterable always happens on the same, calling thread. The order is guaranteed to match that of an Iterator.
Your example is terminally collecting the stream's items into a List, then calling forEach on that List.
That example is bad style because the intermediate List is useless. It's creating a List for something you could have done directly on the Stream.
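A small sketch of the two styles side by side, assuming a list of strings and an upper-casing map step (both invented for illustration). The first method ends the stream with collect and then calls List.forEach; the second stays on the stream.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

final class CollectThenForEachDemo {
    // collect(...) ends the stream; the following forEach is List.forEach (from Iterable).
    static List<String> viaIntermediateList(List<String> names) {
        List<String> out = new ArrayList<>();
        names.stream()
             .map(String::toUpperCase)
             .collect(Collectors.toList()) // terminal operation, returns a List
             .forEach(out::add);           // not a stream operation
        return out;
    }

    // Same effect without the throwaway List: one pipeline, one terminal operation.
    static List<String> direct(List<String> names) {
        List<String> out = new ArrayList<>();
        names.stream()
             .map(String::toUpperCase)
             .forEachOrdered(out::add);
        return out;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("ann", "bob");
        System.out.println(viaIntermediateList(names));
        System.out.println(direct(names));
    }
}
```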

Java Unordered() function

In Java 8, when I do:
Type 1
list.stream().parallel().map(/**/).unordered().filter(/**/).collect(/**/);
Type 2
list.stream().parallel().unordered().map(/**/).filter(/**/).collect(/**/);
As both streams are parallel, I understand that the elements passing through each operation (filter, map, etc.) will be processed in parallel, but that the operations themselves will be applied sequentially, in the order defined.
Questions
1. In Type 1, I call unordered() after the map() operation. So, does the map() operation try to handle ordering, because it comes before unordered()?
2. In Type 2, ordering is not maintained across the map and filter operations, right? Is my understanding correct?
There are 3 Stream state-modifying methods:
sequential()
Returns an equivalent stream that is sequential. May return itself, either because the stream was already sequential, or because the underlying stream state was modified to be sequential.
parallel()
Returns an equivalent stream that is parallel. May return itself, either because the stream was already parallel, or because the underlying stream state was modified to be parallel.
unordered()
Returns an equivalent stream that is unordered. May return itself, either because the stream was already unordered, or because the underlying stream state was modified to be unordered.
As you can see, all three may modify the underlying stream state, which means that their position in the chain of stream methods doesn't matter.
Your two examples are the same. So would these be:
list.stream().parallel().map(/**/).filter(/**/).unordered().collect(/**/);
list.stream().map(/**/).filter(/**/).unordered().parallel().collect(/**/);
list.stream().unordered().map(/**/).parallel().filter(/**/).collect(/**/);
list.stream().unordered().parallel().map(/**/).filter(/**/).collect(/**/);
You should read the javadoc of unordered() to learn more about ordering of streams.
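One way to check the "position doesn't matter" claim for parallel() and sequential() is to ask the pipeline via isParallel(), which reports how a terminal operation would execute. This is a sketch with invented class and method names; the results reflect the behaviour of the JDK stream implementation, where the setting applies to the whole pipeline and the last call wins.

```java
import java.util.Arrays;
import java.util.List;

final class StreamModeDemo {
    // parallel() called in the middle of the chain still makes the whole pipeline parallel.
    static boolean parallelCalledMidPipeline(List<Integer> list) {
        return list.stream()
                   .map(x -> x + 1)
                   .parallel()
                   .filter(x -> x > 0)
                   .isParallel();
    }

    // A later sequential() overrides an earlier parallel() for the whole pipeline.
    static boolean sequentialCalledLast(List<Integer> list) {
        return list.stream()
                   .parallel()
                   .map(x -> x + 1)
                   .sequential()
                   .isParallel();
    }

    public static void main(String[] args) {
        List<Integer> list = Arrays.asList(1, 2, 3);
        System.out.println(parallelCalledMidPipeline(list)); // true
        System.out.println(sequentialCalledLast(list));      // false
    }
}
```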
The effect of .unordered() is to only remove constraints on the stream that it must remain ordered. So any intermediate operations in the pipeline are unaffected by an ordering constraint. In the example provided, assuming the internal operations of each are not stateful operations, .unordered() has no effect.
Here are some helpful quotes from the docs:
Stream Operations:
Traversal of the pipeline source does not begin until the terminal
operation of the pipeline is executed.
So all of the effects of intermediate operations are consolidated and operate on an optimized representation of the input data. This means that, regardless of the order of the intermediate operations, they affect the operation of the entire pipeline the same way. This is true for both parallel and sequential streams.
Ordering:
However, if the source has no defined encounter order, then any
permutation of the values [2, 4, 6] would be a valid result.
This is related to your question about T1 (maintaining ordering). In the pipelines you have, this quote means there is nothing that will maintain order.

Java 8 forEach use cases

Let's say you have a collection with some strings and you want to return the first two characters of each string (or some other manipulation...).
In Java 8 for this case you can use either the map or the forEach methods on the stream() which you get from the collection (maybe something else but that is not important right now).
Personally I would use the map primarily because I associate forEach with mutating the collection and I want to avoid this. I also created a really small test regarding the performance but could not see any improvements when using forEach (I perfectly understand that small tests cannot give reliable results but still).
So what are the use-cases where one should choose forEach?
map is the better choice for this, because you're not trying to do anything with the strings yet, just map them to different strings.
forEach is designed to be the "final operation." As such, it doesn't return anything, and is all about mutating some state -- though not necessarily that of the original collection. For instance, you might use it to write elements to a file, having used other constructs (including map) to get those elements.
forEach terminates the stream and is executed for the side effect of the supplied Consumer. It does not necessarily mutate the stream elements.
map maps each stream element to a different value/object using a provided Function. A Stream<R> is returned, on which more steps can act.
The forEach terminal operation might be useful in several cases: when you want to collect into some older class for which you don't have a proper collector, or when you don't want to collect at all but send your data somewhere outside (write to a database, print to an OutputStream, etc.). There are many cases where the best way is to use both map (as an intermediate operation) and forEach (as a terminal operation).
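As a rough sketch of the use case from the question (first two characters of each string): map does the transformation, and a forEach-style terminal operation is only used when the results are pushed to some outside sink. The StringBuilder below merely stands in for a file or OutputStream, and all names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

final class MapVersusForEachDemo {
    // Hypothetical helper: first two characters, or the whole string if it is shorter.
    static String prefix(String s) {
        return s.length() <= 2 ? s : s.substring(0, 2);
    }

    // Transforming values: map plus collect, no side effects.
    static List<String> prefixes(List<String> words) {
        return words.stream()
                    .map(MapVersusForEachDemo::prefix)
                    .collect(Collectors.toList());
    }

    // Sending values elsewhere: map does the work, the terminal operation performs the side effect.
    // forEachOrdered is used so the output order is guaranteed; forEach would do if order is irrelevant.
    static String writeLines(List<String> words) {
        StringBuilder sink = new StringBuilder(); // stands in for a file, database or OutputStream
        words.stream()
             .map(MapVersusForEachDemo::prefix)
             .forEachOrdered(p -> sink.append(p).append('\n'));
        return sink.toString();
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("hello", "ab", "x");
        System.out.println(prefixes(words));
        System.out.print(writeLines(words));
    }
}
```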

Does a sequential stream in Java 8 use the combiner parameter on calling collect?

If I call collect on a sequential stream (eg. from calling Collection.stream()) then will it use the combiner parameter I pass to collect? I presume not but I see nothing in the documentation. If I'm correct, then it seems unfortunate to have to supply something that I know will not be used (if I know it is a sequential stream).
Keep in mind to develop against interface specifications -- not against the implementation. The implementation might change with the next Java version, whereas the specification should remain stable.
The specification does not differentiate between sequential and parallel streams. For that reason, you should assume that the combiner might be used. Actually, there are good examples showing that combiners for sequential streams can improve performance. For example, the following reduce operation concatenates a list of strings. Executing the code without a combiner has quadratic complexity. A smart execution with a combiner can reduce the runtime by orders of magnitude.
List<String> tokens = ...;
String result = tokens.stream().reduce("", String::concat, String::concat);
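For the collect case the question asks about, the same advice applies: supply a combiner that is correct, and the code works whether or not a given implementation calls it. Below is a minimal sketch using the three-argument Stream.collect with a StringBuilder; the class name and the boolean parameter are invented for the example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;

final class CollectCombinerDemo {
    // The same supplier, accumulator and combiner serve sequential and parallel execution.
    static String concat(List<String> tokens, boolean parallel) {
        Stream<String> stream = parallel ? tokens.parallelStream() : tokens.stream();
        return stream.collect(
                StringBuilder::new,     // supplier: a fresh container
                StringBuilder::append,  // accumulator: add one element to a container
                StringBuilder::append)  // combiner: merge two partial containers
            .toString();
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("a", "b", "c");
        System.out.println(concat(tokens, false)); // abc
        System.out.println(concat(tokens, true));  // abc
    }
}
```

Because the source list has a defined encounter order, the parallel run produces the same string as the sequential one.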

Difference between iterable.forEach() and iterable.stream().forEach() [duplicate]

This question already has answers here:
What is difference between Collection.stream().forEach() and Collection.forEach()?
(5 answers)
Closed 8 years ago.
It looks like I can call list.forEach(a -> a.stuff()) directly on my collection, instead of list.stream().forEach(a -> a.stuff()). When would I use one over the other (parallelStream() aside..)?
There are a few differences:
Iterable.forEach guarantees processing in iteration order, if it's defined for the Iterable. (Iteration order is generally well-defined for Lists.) Stream.forEach does not; one must use Stream.forEachOrdered instead.
Iterable.forEach may permit side effects on the underlying data structure. Although many collections' iterators will throw ConcurrentModificationException if the collection is modified during iteration, some collections' iterators explicitly permit it. See CopyOnWriteArrayList, for example. By contrast, stream operations in general must not interfere with the stream source.
If the Iterable is a synchronized wrapper collection, for example, from Collections.synchronizedList(), a call to forEach on it will hold its lock during the entire iteration. This will prevent other threads from modifying the collection during the iteration, ensuring that the iteration sees a consistent view of the collection, and preventing ConcurrentModificationException. (This will also prevent other threads from reading the collection during the iteration.) This is not the case for streams. There is nothing to prevent the collection from being modified during the stream operation, and if modification does occur, the result is undefined.
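The side-effect difference can be sketched with two list types. The helper names are hypothetical; the behaviour shown is that of ArrayList (fail-fast) and CopyOnWriteArrayList (iterates over a snapshot) in the JDK.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

final class ForEachModificationDemo {
    // Appends x * 10 for every element visited by Iterable.forEach.
    static List<Integer> appendWhileIterating(List<Integer> list) {
        list.forEach(x -> list.add(x * 10));
        return list;
    }

    // Reports whether modifying the list inside forEach is rejected.
    static boolean throwsOnModification(List<Integer> list) {
        try {
            appendWhileIterating(list);
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // CopyOnWriteArrayList iterates over a snapshot, so adding during forEach is permitted.
        List<Integer> cow = new CopyOnWriteArrayList<>(Arrays.asList(1, 2, 3));
        System.out.println(appendWhileIterating(cow)); // [1, 2, 3, 10, 20, 30]

        // ArrayList is fail-fast: the same code throws ConcurrentModificationException.
        List<Integer> plain = new ArrayList<>(Arrays.asList(1, 2, 3));
        System.out.println(throwsOnModification(plain)); // true
    }
}
```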
