Side effects of lambda expression in java 8 - java

Please consider the below code snippet.
List<String> list = new ArrayList<String>();
list.add("A");
list.add("B");
list.add("C");
List<String> copyList = new ArrayList<String>();
Consumer<String> consumer = s->copyList.add(s);
list.stream().forEach(consumer);
Since we are using a lambda expression, functional programming (pure functions) says it should only compute an output from its input.
But in this example the lambda adds elements to a list that is neither its input nor declared inside the lambda's scope.
Is this good practice, or can it lead to unwanted side effects?

forEach would be useless if it didn't produce side-effects, since it has no return value. Hence, whenever you use forEach you should be expecting side-effects to take place. Therefore there's nothing wrong with your example.
A Consumer<String> can print the String, or insert it into some database, or write it into some output file, or store it in some Collection (as in your example), etc...
From the Stream Javadoc:
A stream pipeline consists of a source (which might be an array, a collection, a generator function, an I/O channel, etc), zero or more intermediate operations (which transform a stream into another stream, such as Stream.filter(Predicate)), and a terminal operation (which produces a result or side-effect, such as Stream.count() or Stream.forEach(Consumer)).
Besides, if you look at the Javadoc of Consumer, you'll see that it's expected to have side-effects:
java.util.function.Consumer
Represents an operation that accepts a single input argument and returns no result. Unlike most other functional interfaces, Consumer is expected to operate via side-effects.
I guess this means Java Streams and functional interfaces were not designed to be used only for "purely" functional programming.
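As a small illustration of Consumers working purely through side effects, here is a minimal sketch. The class and method names (ConsumerSideEffectDemo, copyViaConsumers) are made up for this example; only Consumer, andThen and forEach come from the JDK.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

class ConsumerSideEffectDemo {

    // Copies every element of source into a new list, and appends it to log,
    // purely through Consumer side effects (hypothetical helper for illustration).
    static List<String> copyViaConsumers(List<String> source, StringBuilder log) {
        List<String> copy = new ArrayList<>();
        Consumer<String> store = copy::add;    // side effect: mutates copy
        Consumer<String> record = log::append; // side effect: mutates log
        source.forEach(store.andThen(record));
        return copy;
    }

    public static void main(String[] args) {
        StringBuilder log = new StringBuilder();
        List<String> copy = copyViaConsumers(Arrays.asList("A", "B", "C"), log);
        System.out.println(copy + " " + log); // prints [A, B, C] ABC
    }
}
```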

For a forEach or even a stream().forEach it is pretty straightforward. Your example works fine.
Be aware, though, that doing this with other stream methods can produce surprises. For example, the following code prints absolutely nothing.
List<String> lst = Arrays.asList("a", "b", "c");
lst.stream().map(
s -> {
System.out.println(s);
return "-";
});
In this case, the stream acts more like a builder which prepares a process but does not execute it yet. It's only when a collect, count or find... method is called that the lambda is executed.
An easy way to spot this is to look at the return type of the map method, which is again a Stream.
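To make the laziness visible, the following sketch records which elements the map lambda sees, with and without a terminal operation. LazyMapDemo and seenByMap are hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class LazyMapDemo {

    // Returns the elements seen by the map lambda; 'terminate' decides
    // whether a terminal operation is invoked on the pipeline.
    static List<String> seenByMap(boolean terminate) {
        List<String> seen = new ArrayList<>();
        Stream<String> pipeline = Arrays.asList("a", "b", "c").stream()
                .map(s -> {
                    seen.add(s); // recorded only when the pipeline is traversed
                    return "-";
                });
        if (terminate) {
            pipeline.collect(Collectors.toList()); // terminal operation runs the lambda
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(seenByMap(false)); // prints []
        System.out.println(seenByMap(true));  // prints [a, b, c]
    }
}
```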
Having said that, I think for your specific example, there are easier alternatives.
List<String> list = Arrays.asList("A", "B", "C");
// this is the base pattern for transforming one list to another.
// the map applies a transformation.
List<String> copyList1 = list.stream().map(e -> e).collect(Collectors.toList());
// if it's a 1-to-1 mapping, then you don't really need the map.
List<String> copyList2 = list.stream().collect(Collectors.toList());
// in essence, you could of course just clone the list without streaming.
List<String> copyList3 = new ArrayList<>(list);

Related

Java stream, filter, and then do something with resulting collection

Goal
final List<T> listOfThings = ...;
listOfThings.stream()
.filter(...) // returns a Stream<T>
.then(filteredListOfThings -> {
// How do I get here so I can work on the newly filtered collection
// in a fluent way w/out collecting the result to a variable?
// For example, if I need to process the elements but don't
// care about them in their current form outside this chain.
});
Problem
In English, given a list of something, I'd like to stream the list, filter it, and then operate on the entire filtered result. I can accomplish this with optional but it's not clean IMO:
final List<T> listOfThings = ...;
Optional
.of(listOfThings.stream()
.filter(...) // returns a Stream<T>
.collect(Collectors.toList()))
.map(filteredListOfThings -> {
// I'm here, now, but would like to not have to wrap it in an Optional<T>
});
It'd be cool if there was a then or similar method on a Stream<T> which returns Stream<T> to allow for further chaining, which allows me to work with the entire set of results within the lambda without declaring an outside variable.
Don't make it more complicated than it needs to be.
Assign the result of the collect to a variable, then operate on that variable:
List<T> filteredListOfThings = ... .collect(toList());
// Now use filteredListOfThings.
filteredListOfThings will always have a value, even if it's the empty list, so there's no point in using Optional.
And there's not much syntactic difference between filteredListOfThings being a lambda parameter and it being an explicit variable; but you have more flexibility in what you can do whilst processing it (returning from the methods, throwing checked exceptions etc).
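If a single fluent expression is still preferred, one option not mentioned in the answer above is Collectors.collectingAndThen, which passes the collected list to a finishing function. This is only a sketch; the class name, method name and the length-based filter are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class CollectingAndThenDemo {

    // Filters the list, then hands the whole filtered list to a finishing
    // function, all in one expression and without an intermediate variable.
    static String longWordsSummary(List<String> words) {
        return words.stream()
                .filter(w -> w.length() > 3)
                .collect(Collectors.collectingAndThen(
                        Collectors.toList(),
                        filtered -> filtered.size() + " long words: " + filtered));
    }

    public static void main(String[] args) {
        System.out.println(longWordsSummary(
                Arrays.asList("hello", "how", "are", "you", "today")));
        // prints 2 long words: [hello, today]
    }
}
```

The trade-off described above still applies: inside the finishing lambda you cannot return from the enclosing method or throw checked exceptions.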
I'd like to stream the list, filter it, and then operate on the entire filtered result.
Note that the stream can be infinite as well ;)
So getting the infinite list of results is not a good idea.
Basically, streams are lazy, and applying intermediate operations to a stream without a terminal operation does nothing.
For example, the following code prints nothing:
Stream<String> stream = Stream.of("hello","how","are", "you").filter(this::startsWithH);
private boolean startsWithH(String elem) {
System.out.println("Filtering element " + elem);
return elem.startsWith("h");
}
Now, when you do apply a terminal operation, it will still work element-by-element usually:
Example of execution:
Stream<String> stream = Stream.of("hello","how","are", "you")
.filter(this::startsWithH)
.map(String::toUpperCase);
stream.collect(toList());
This example yields the following execution chain:
filter("hello")
map("hello")
filter("how")
map("how")
filter("are") <--- filtered out, no call to map will be done
filter("you") <--- filtered out, no call to map will be done
But if so, you can't really operate on the "whole" stream in the example provided in this question (OK, there are stateful operations that must work on the whole stream, like sorted, but that's an entirely different story).
In other words, if you want to get the data as a collection you should, well, collect the data. It won't be a stream anymore.
For this, you should use .collect(). And if you do have an infinite stream, don't forget to call limit beforehand ;)
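A minimal sketch of bounding an infinite stream before collecting it; the class and method names are invented for this example.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class InfiniteStreamDemo {

    // Collects the first 'count' even numbers from an infinite stream.
    static List<Integer> firstEvenNumbers(int count) {
        return Stream.iterate(1, n -> n + 1)  // infinite: 1, 2, 3, ...
                .filter(n -> n % 2 == 0)
                .limit(count)                 // bound the stream before collecting
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(firstEvenNumbers(5)); // prints [2, 4, 6, 8, 10]
    }
}
```

Without the limit call, collect would never return, because the source never runs out of elements.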

Terminal operation to evaluate intermediate operation

Let's say I have a list of strings and I want to use those strings as input to a fluent builder.
List<String> scripts;
//initialize list
ScriptRunnerBuilder scriptRunnerBuilder = new ScriptRunnerBuilder();
BiFunction<String,ScriptRunnerBuilder,ScriptRunnerBuilder> addScript =
(script,builder) -> builder.addScript(script);
scriptRunnerBuilder = scripts.stream().map(script ->
addScript.apply(script,scriptRunnerBuilder)).......
scriptRunnerBuilder.build();
Which terminal operation can I use so that the addScript function gets called for all elements in the list?
The issue is that ScriptRunnerBuilder is immutable: ScriptRunnerBuilder.addScript() returns a new ScriptRunnerBuilder object rather than modifying the existing one, so I can't just use a forEach.
My intention is to carry the result of each addScript() call and use it as input for the next element in the stream.
In its simplest form, this should work:
// create your builder
ScriptRunnerBuilder builder = new ScriptRunnerBuilder();
// add all scripts
scripts.forEach(script -> builder.addScript(script));
// build results
builder.build();
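Because builder aggregates all data, and you have created it outside the forEach lambda, you can access it directly. This leads to less code and the same result. Note that this only works if addScript modifies the builder in place; with an immutable builder the returned objects would be discarded.
Or as @Holger suggested: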
scripts.forEach(builder::addScript);
Use forEach instead of map, and don't assign the result of the stream anymore:
scripts.forEach(script -> addScript.apply(script,scriptRunnerBuilder));
I could use reduce operation but that is unnecessary as we are not combining results
Combining is exactly what you are doing.
You combine all scripts from the List<String> into a ScriptRunnerBuilder, don't you?
I agree that @Beri's solution without a stream is probably the simplest. But there is also a way with the reduce(identity, accumulator, combiner) method, where you don't need to create the ScriptRunnerBuilder beforehand:
ScriptRunnerBuilder builder = scripts.stream()
.reduce(new ScriptRunnerBuilder(), ScriptRunnerBuilder::addScript, (b1, b2) -> b1);
See more: Why is a combiner needed for reduce method that converts type in java 8
Update: To avoid relying on the fact that the combiner is not invoked for a sequential stream, and to make this work with a parallel one, you have to implement a real combiner.
If you can add an overloaded method addScript(ScriptRunnerBuilder otherBuilder), then the reduce will look like:
.reduce(new ScriptRunnerBuilder(), ScriptRunnerBuilder::addScript,
ScriptRunnerBuilder::addScript)
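The reduce with a real combiner can be sketched end to end as follows. The ScriptRunnerBuilder class here is a hypothetical stand-in written for this example (the real class is not shown in the question); it is assumed to be immutable, and build() is assumed to simply return the collected scripts.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for the immutable builder described in the question.
final class ScriptRunnerBuilder {
    private final List<String> scripts;

    ScriptRunnerBuilder() { this(Collections.emptyList()); }

    private ScriptRunnerBuilder(List<String> scripts) { this.scripts = scripts; }

    // Returns a new builder with one more script; this instance is unchanged.
    ScriptRunnerBuilder addScript(String script) {
        List<String> next = new ArrayList<>(scripts);
        next.add(script);
        return new ScriptRunnerBuilder(next);
    }

    // Overload used as the combiner: appends the other builder's scripts.
    ScriptRunnerBuilder addScript(ScriptRunnerBuilder other) {
        List<String> next = new ArrayList<>(scripts);
        next.addAll(other.scripts);
        return new ScriptRunnerBuilder(next);
    }

    // Assumed for illustration: "building" just exposes the scripts.
    List<String> build() { return Collections.unmodifiableList(scripts); }
}

class ReduceBuilderDemo {

    static List<String> buildAll(List<String> scripts, boolean parallel) {
        return (parallel ? scripts.parallelStream() : scripts.stream())
                .reduce(new ScriptRunnerBuilder(),
                        ScriptRunnerBuilder::addScript,  // accumulator: (builder, script)
                        ScriptRunnerBuilder::addScript)  // combiner: (builder, builder)
                .build();
    }

    public static void main(String[] args) {
        List<String> scripts = Arrays.asList("a.sql", "b.sql", "c.sql");
        System.out.println(buildAll(scripts, false)); // prints [a.sql, b.sql, c.sql]
        System.out.println(buildAll(scripts, true));  // prints [a.sql, b.sql, c.sql]
    }
}
```

Because the identity is an empty builder and the combiner is associative, the sequential and parallel runs produce the same result.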

Considering list elements that are added after filtered stream creation

Given the following code:
List<String> strList = new ArrayList<>(Arrays.asList("Java","Python","Php"));
Stream<String> jFilter = strList.stream().filter(str -> str.startsWith("J"));
strList.add("JavaScript"); // element added after filter creation
strList.add("JQuery"); // element added after filter creation
System.out.println(Arrays.toString(jFilter.toArray()));
which outputs:
[Java, JavaScript, JQuery]
Why do JavaScript and JQuery appear in the filtered result even though they were added after creating the filtered stream?
Short Answer
You're assuming after this point:
Stream<String> jFilter = strList.stream().filter(str -> str.startsWith("J"));
That a new stream containing only the elements starting with "J" is returned, i.e. only Java. However, this is not the case;
streams are lazy i.e. they don't perform any logic unless told otherwise by a terminal operation.
The actual execution of the stream pipeline starts on the toArray() call and since the list was modified before the terminal toArray() operation commenced the result will be [Java, JavaScript, JQuery].
Longer Answer
Here's the part of the documentation that mentions this:
For well-behaved stream sources, the source can be modified before
the terminal operation commences and those modifications will be
reflected in the covered elements. For example, consider the following
code:
List<String> l = new ArrayList(Arrays.asList("one", "two"));
Stream<String> sl = l.stream();
l.add("three");
String s = sl.collect(joining(" "));
First a list is created consisting of two strings: "one"; and "two". Then a stream is created
from that list. Next the list is modified by adding a third string:
"three". Finally the elements of the stream are collected and joined
together. Since the list was modified before the terminal collect
operation commenced the result will be a string of "one two three".
All the streams returned from JDK collections, and most other JDK
classes, are well-behaved in this manner;
Until the statement
System.out.println(Arrays.toString(jFilter.toArray()));
runs, the stream doesn't do anything. A terminal operation (toArray in the example) is required for the stream to be traversed and your intermediate operations (filter in this case) to be executed.
In this case, what you can do is, for example, capture the size of the list before adding other elements:
int maxSize = strList.size();
Stream<String> jFilter = strList.stream().limit(maxSize)
.filter(str -> str.startsWith("J"));
where limit(maxSize) will not allow more than the initial elements to go through the pipeline.
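Put together as a runnable sketch (the class and method names are made up for this example), the limit approach looks like this:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;

class SnapshotLimitDemo {

    // Filters only the elements that were in the list when the stream was created.
    static String[] filterInitialElements() {
        List<String> strList = new ArrayList<>(Arrays.asList("Java", "Python", "Php"));
        int maxSize = strList.size(); // remember the size before later additions
        Stream<String> jFilter = strList.stream()
                .limit(maxSize)
                .filter(str -> str.startsWith("J"));
        strList.add("JavaScript"); // added after the pipeline was built
        strList.add("JQuery");     // added after the pipeline was built
        return jFilter.toArray(String[]::new);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(filterInitialElements())); // prints [Java]
    }
}
```

This relies on the new elements being appended at the end of the list; elements inserted earlier in the list would still be seen by the pipeline.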
That's because the stream is not evaluated when you create it. Streams are lazy, so nothing is executed until you call a terminal operation on the stream.
Look at a modification of your code and its output. The filtering actually takes place when you call the terminal operation.
public static void main(String []args){
List<String> strList = new ArrayList<>();
strList.add("Java");
strList.add("Python");
strList.add("Php");
Stream<String> strStream = strList.stream();
Stream<String> jFilter = strStream.filter(str -> {
System.out.println("Filtering" + str);
return str.startsWith("J");
});
System.out.println("After Stream creation");
strList.add("JavaScript"); // element added after filter creation
strList.add("JQuery"); // element added after filter creation
System.out.println(Arrays.toString(jFilter.toArray()));
}
Output:
After Stream creation
FilteringJava
FilteringPython
FilteringPhp
FilteringJavaScript
FilteringJQuery
[Java, JavaScript, JQuery]
As explained in the official documentation at https://docs.oracle.com/javase/8/docs/api/java/util/stream/package-summary.html, streams have no storage, so they are more like iterators than collections, and they are evaluated lazily.
So nothing really happens with respect to the stream until you invoke the terminal operation toArray().
This was @Hadi J's comment, but according to the rules it should be an answer.
Streams are lazy: the pipeline is executed only when you call a terminal operation.
The toArray method is the terminal operation, and it works on the full content of your list at that moment. To get a predictable result, do not save the stream in a temporary variable, as that leads to misleading results. Better code is:
String[] arr = strList.stream().filter(str -> str.startsWith("J")).toArray(String[]::new);

Should I use shared mutable variable update in Java 8 Streams

I am iterating over the list below and adding its elements to another, shared mutable list via Java 8 streams.
List<String> list1 = Arrays.asList("A1","A2","A3","A4","A5","A6","A7","A8","B1","B2","B3");
List<String> list2 = new ArrayList<>();
Consumer<String> c = t -> list2.add(t.startsWith("A") ? t : "EMPTY");
list1.stream().forEach(c);
list1.parallelStream().forEach(c);
list1.forEach(c);
What is the difference between the three iterations above, and which one should we use? Are there any considerations?
Regardless of whether you use parallel or sequential Stream, you shouldn't use forEach when your goal is to generate a List. Use map with collect:
List<String> list2 =
list1.stream()
.map(item -> item.startsWith("A") ? item : "EMPTY")
.collect(Collectors.toList());
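As a runnable version of the map-and-collect approach, using the list from the question (the class and method names are hypothetical):

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class MapCollectDemo {

    // Builds the result list from the stream itself instead of mutating a shared list.
    static List<String> keepAOrEmpty(List<String> input) {
        return input.stream()
                .map(item -> item.startsWith("A") ? item : "EMPTY")
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> list1 = Arrays.asList(
                "A1", "A2", "A3", "A4", "A5", "A6", "A7", "A8", "B1", "B2", "B3");
        System.out.println(keepAOrEmpty(list1));
        // prints [A1, A2, A3, A4, A5, A6, A7, A8, EMPTY, EMPTY, EMPTY]
    }
}
```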
Functionally speaking, for simple cases they are almost the same, but in general there are some hidden differences.
Let's start by quoting the Javadoc of forEach for Iterable use cases, which states:
performs the given action for each element of the Iterable until all
elements have been processed or the action throws an exception.
and also we can iterate over a collection and perform a given action on each element – by just passing a class that implements the Consumer interface
void forEach(Consumer<? super T> action)
https://docs.oracle.com/javase/8/docs/api/java/lang/Iterable.html#forEach-java.util.function.Consumer-
The order of Stream.forEach is not guaranteed (for parallel streams it is explicitly nondeterministic), while Iterable.forEach is always executed in the iteration order of the Iterable, if one is specified.
If Iterable.forEach is iterating over a synchronized collection, Iterable.forEach takes the collection's lock once and holds it across all the calls to the action method. The Stream.forEach call uses the collection's spliterator, which does not lock.
The action specified in Stream.forEach is required to be non-interfering while Iterable.forEach is allowed to set values in the underlying ArrayList without problems.
In Java, Iterators returned by Collection classes, e.g. ArrayList, HashSet, Vector, etc., are fail fast. This means that if you try to add() or remove() from the underlying data structure while iterating it, you get a ConcurrentModificationException.
https://docs.oracle.com/javase/8/docs/api/java/util/ArrayList.html#fail-fast
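The fail-fast behaviour can be reproduced with a few lines. This is only an illustrative sketch with invented names; it adds to an ArrayList while iterating over it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.List;

class FailFastDemo {

    // Returns true if modifying the list during iteration was detected.
    static boolean addWhileIterating() {
        List<String> list = new ArrayList<>(Arrays.asList("A", "B", "C"));
        try {
            for (String s : list) {
                list.add(s + "!"); // structural modification during iteration
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true; // the iterator noticed the change on its next step
        }
    }

    public static void main(String[] args) {
        System.out.println(addWhileIterating()); // prints true
    }
}
```

As the ArrayList documentation notes, this detection is best-effort and should be used only to find bugs, not as a control-flow mechanism.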
More Info:
What is the difference between .foreach and .stream().foreach?
What is difference between Collection.stream().forEach() and Collection.forEach()?
When working with streams, you should write your code in a way that if you switch to parallel streams, it does not produce the wrong results.
Imagine if in your code you were doing reading and writing on the same shared memory (list2) and you distribute your process into several threads (using parallel streams). Then you are DOOMED. Therefore you have several options.
Make your shared memory (list2) thread-safe, for example by using an AtomicReference whose update function returns a new list instead of mutating the current one:
AtomicReference<List<String>> listSafe = new AtomicReference<>(new ArrayList<>());
listSafe.getAndUpdate(strings -> {
List<String> updated = new ArrayList<>(strings); // copy, since the update may be retried
updated.add("newvalue");
return updated;
});
Or you can go with the purely functional approach (code with no side effects), like @Eran's solution.
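The two options can be compared in a short sketch. Here a synchronized list stands in for the thread-safe shared state (an assumption made for this example, not the approach shown above), and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

class ParallelCollectDemo {

    // Side-effect-free: collect keeps the encounter order even in parallel.
    static List<String> viaCollect(List<String> input) {
        return input.parallelStream()
                .map(t -> t.startsWith("A") ? t : "EMPTY")
                .collect(Collectors.toList());
    }

    // Shared mutable state made thread-safe: no element is lost,
    // but the order of the result is not guaranteed.
    static List<String> viaSynchronizedList(List<String> input) {
        List<String> out = Collections.synchronizedList(new ArrayList<>());
        input.parallelStream().forEach(t -> out.add(t.startsWith("A") ? t : "EMPTY"));
        return out;
    }

    public static void main(String[] args) {
        List<String> list1 = Arrays.asList("A1", "A2", "B1", "A3", "B2");
        System.out.println(viaCollect(list1)); // prints [A1, A2, EMPTY, A3, EMPTY]
        System.out.println(viaSynchronizedList(list1).size()); // prints 5
    }
}
```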

Collection to stream to a new collection

I'm looking for the most pain free way to filter a collection. I'm thinking something like
Collection<?> foo = existingCollection.stream().filter( ... ). ...
But I'm not sure how best to go from the filter to returning or populating another collection. Most examples seem to be like "and here you can print". Possibly there's a constructor or output method that I'm missing.
There’s a reason why most examples avoid storing the result into a Collection. It’s not the recommended way of programming. You already have a Collection, the one providing the source data, and collections are of no use on their own. You want to perform certain operations on it, so the ideal case is to perform the operation using the stream and skip storing the data in an intermediate Collection. This is what most examples try to suggest.
Of course, there are a lot of existing APIs working with Collections and there always will be. So the Stream API offers different ways to handle the demand for a Collection.
Get an unmodifiable List implementation containing all elements (JDK 16):
List<T> results = l.stream().filter(…).toList();
Get an arbitrary List implementation holding the result:
List<T> results = l.stream().filter(…).collect(Collectors.toList());
Get an unmodifiable List forbidding null like List.of(…) (JDK 10):
List<T> results = l.stream().filter(…).collect(Collectors.toUnmodifiableList());
Get an arbitrary Set implementation holding the result:
Set<T> results = l.stream().filter(…).collect(Collectors.toSet());
Get a specific Collection:
ArrayList<T> results =
l.stream().filter(…).collect(Collectors.toCollection(ArrayList::new));
Add to an existing Collection:
l.stream().filter(…).forEach(existing::add);
Create an array:
String[] array=l.stream().filter(…).toArray(String[]::new);
Use the array to create a list with a specific behavior (mutable, fixed size):
List<String> al=Arrays.asList(l.stream().filter(…).toArray(String[]::new));
Allow a parallel capable stream to add to temporary local lists and join them afterward:
List<T> results
= l.stream().filter(…).collect(ArrayList::new, List::add, List::addAll);
(Note: this is closely related to how Collectors.toList() is currently implemented, but that’s an implementation detail, i.e. there is no guarantee that future implementations of the toList() collectors will still return an ArrayList)
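Two of the variants above, written out as a runnable sketch restricted to Java 8 APIs. The class and method names, the TreeSet target and the non-empty filter are choices made for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.TreeSet;
import java.util.stream.Collectors;

class CollectTargetsDemo {

    // Collects into a specific Collection type chosen by the caller (here a TreeSet).
    static TreeSet<String> sortedUnique(Collection<String> src) {
        return src.stream()
                .filter(s -> !s.isEmpty())
                .collect(Collectors.toCollection(TreeSet::new));
    }

    // Three-argument collect: supplier, accumulator, combiner.
    static List<String> nonEmpty(Collection<String> src) {
        return src.stream()
                .filter(s -> !s.isEmpty())
                .collect(ArrayList::new, List::add, List::addAll);
    }

    public static void main(String[] args) {
        List<String> src = Arrays.asList("pear", "", "apple", "pear");
        System.out.println(sortedUnique(src)); // prints [apple, pear]
        System.out.println(nonEmpty(src));     // prints [pear, apple, pear]
    }
}
```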
An example from java.util.stream's documentation:
List<String> results =
stream.filter(s -> pattern.matcher(s).matches())
.collect(Collectors.toList());
Collectors has a toCollection() method, I'd suggest looking this way.
As an example that is more in line with Java 8 style of functional programming:
Collection<String> a = Collections.emptyList();
List<String> result = a.stream().
filter(s -> s.length() > 0).
collect(Collectors.toList());
You would possibly want to use toList or toSet or toMap methods from Collectors class.
However to get more control the toCollection method can be used. Here is a simple example:
Collection<String> c1 = new ArrayList<>();
c1.add("aa");
c1.add("ab");
c1.add("ca");
Collection<String> c2 = c1.stream().filter(s -> s.startsWith("a")).collect(Collectors.toCollection(ArrayList::new));
Collection<String> c3 = c1.stream().filter(s -> s.startsWith("a")).collect(Collectors.toList());
c2.forEach(System.out::println); // prints-> aa ab
c3.forEach(System.out::println); // prints-> aa ab
