Passing a collection using a reduce (3 parameters) function - streams java 8 - java

I am trying to compute each new value as the product of the previous two values using Java 8's Stream API. I want to call a function that will return an array/list/collection. I am creating a List and adding 1 and 2 to it.
Let's say the list name is result.
public static void main(String[] args) {
    List<Integer> result = new ArrayList<Integer>();
    result.add(1);
    result.add(2);
    int n = 5; // n can be anything, choosing 5 for this example
    res(n, result);
    // print result, which should be [1, 2, 2, 4, 8]
}

public static List<Integer> res(int n, List<Integer> result) {
    result.stream()
        .limit(n)
        .reduce(identity, (base, index) -> base); // incomplete -- this is where I'm stuck
    // return result;
}
Now the issue is trying to pass result into the stream so it keeps updating the list with the new values. According to the Java tutorials, this is possible, albeit inefficient.
"If your reduce operation involves adding elements to a collection, then every time your accumulator function processes an element, it creates a new collection that includes the element, which is inefficient."
Do I need to use the optional third parameter, BinaryOperator combiner, to combine the list and the result?
<U> U reduce(U identity,
BiFunction<U,? super T,U> accumulator,
BinaryOperator<U> combiner)
In short: I want to pass a list with two values and have the function multiply the first two values (1, 2) and append the product to the list, then multiply the last two values (2, 2) and append that, and so on, until the stream hits the limit.

It looks like you're trying to implement a recurrence relation. The reduce method applies some function to a bunch of pre-existing values in the stream. You can't use reduce and take an intermediate result from the reducer function and "feed it back" into the stream, which is what you need to do in order to implement a recurrence relation.
The way to implement a recurrence relation using streams is to use one of the streams factory methods Stream.generate or Stream.iterate. The iterate factory seems to suggest the most obvious approach. The state that needs to be kept for each application of the recurrence function requires two ints in your example, so unfortunately we have to create an object to hold these for us:
static class IntPair {
    final int a, b;
    IntPair(int a_, int b_) {
        a = a_; b = b_;
    }
}
Using this state object you can create a stream that implements the recurrence that you want:
Stream.iterate(new IntPair(1, 2), p -> new IntPair(p.b, p.a * p.b))
Once you have such a stream, it's a simple matter to collect the values into a list:
List<Integer> output =
    Stream.iterate(new IntPair(1, 2), p -> new IntPair(p.b, p.a * p.b))
        .limit(5)
        .map(pair -> pair.a)
        .collect(Collectors.toList());
System.out.println(output);
[1, 2, 2, 4, 8]
As an aside, you can use the same technique to generate the Fibonacci sequence. All you do is provide a different starting value and iteration function:
Stream.iterate(new IntPair(0, 1), p -> new IntPair(p.b, p.a + p.b))
You could also implement a similar recurrence relation using Stream.generate. This will also require a helper class. The helper class implements Supplier of the result value but it also needs to maintain state. It thus needs to be mutable, which is kind of gross in my book. The iteration function also needs to be baked into the generator object. This makes it less flexible than the IntPair object, which can be used for creating arbitrary recurrences.
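For concreteness, here is a minimal sketch of that generate-based approach (hypothetical helper class; single-threaded use only, since the supplier is mutable):
static class RecurrenceSupplier implements Supplier<Integer> {
    int a = 1, b = 2; // mutable state: the last two values

    @Override
    public Integer get() {
        int result = a;
        int next = a * b; // the iteration function is baked in here
        a = b;
        b = next;
        return result;
    }
}

List<Integer> output = Stream.generate(new RecurrenceSupplier())
    .limit(5)
    .collect(Collectors.toList()); // [1, 2, 2, 4, 8]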

Just for completeness, here is a solution which does not need an additional class.
List<Integer> output = Stream.iterate(
        (ToIntFunction<IntBinaryOperator>) f -> f.applyAsInt(1, 2),
        prev -> f -> prev.applyAsInt((a, b) -> f.applyAsInt(b, a * b)))
    .limit(9)
    .map(pair -> pair.applyAsInt((a, b) -> a))
    .collect(Collectors.toList());
This is a functional approach which doesn't need intermediate value storage. However, since Java is not a functional programming language and doesn't have optimizations for such a recursive function definition, this is not recommended for larger streams.
Since for this example a larger stream would overflow numerically anyway and the calculation is cheap, this approach works. But for other use cases you will surely prefer a storage object when solving such a problem with plain Java (as in Stuart Marks' answer).


Java stream min/max of separate variables

How can I determine both the min and max of different attributes of objects in a stream?
I've seen answers on how to get the min and max of the same variable. I've also seen answers on how to get the min or max using a particular object attribute (e.g. maxByAttribute()). But how do I get both the min of all the "x" attributes and the max of all the "y" attributes of objects in a stream?
Let's say I have a Java Stream<Span> with each object having a Span.getStart() and Span.getEnd() returning type long. (The units are irrelevant; it could be time or planks on a floor.) I want to get the minimum start and the maximum end, e.g. to represent the minimum span covering all the spans. Of course, I could create a loop and manually update mins and maxes, but is there a concise and efficient functional approach using Java streams?
Note that I don't want to create intermediate spans! If you want to create some intermediate Pair<Long> instance that would work, but for my purposes the Span type is special and I can't create more of them. I just want to find the minimum start and maximum end.
Bonus for also showing whether this is possible using the new Java 12 teeing(), but for my purposes the solution must work in Java 8+.
Assuming that all data is valid (end > start), you can create a LongSummaryStatistics object containing such information as the min/max values, the average, etc., by using summaryStatistics() as a terminal operation.
List<Span> spans = // initializing the source
LongSummaryStatistics stat = spans.stream()
    .flatMapToLong(span -> LongStream.of(span.getStart(), span.getEnd()))
    .summaryStatistics();
long minStart = stat.getMin();
long maxEnd = stat.getMax();
Note that if the stream source is empty (you can check by invoking stat.getCount(), which gives the number of consumed elements), the min and max attributes of the LongSummaryStatistics object will hold their default values, which are the maximum and minimum long values respectively.
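If an empty source is possible, a small guard avoids misreading those sentinel defaults (a sketch, reusing stat from above):
if (stat.getCount() == 0) {
    // nothing was consumed: getMin()/getMax() still hold Long.MAX_VALUE / Long.MIN_VALUE
    throw new NoSuchElementException("no spans to summarize");
}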
Here is how it could be done using collect(), picking the max and min values manually:
long[] minMax = spans.stream()
    .collect(() -> new long[]{Long.MAX_VALUE, Long.MIN_VALUE}, // seed with sentinels, not zeros
        (long[] arr, Span span) -> { // consuming the next value
            arr[0] = Math.min(arr[0], span.getStart());
            arr[1] = Math.max(arr[1], span.getEnd());
        },
        (long[] left, long[] right) -> { // merging partial results produced in different threads
            left[0] = Math.min(left[0], right[0]);
            left[1] = Math.max(left[1], right[1]);
        });
In order to utilize Collectors.teeing(), you need to define two collectors and a merger function. Every element from the stream is consumed by both collectors at the same time, and when they are done, the merger function takes their intermediate results and produces the final result.
In the example below, the result is an Optional of a map entry. If there were no elements in the stream, the resulting optional would be empty as well.
List<Span> spans = List.of(new Span(1, 3), new Span(3, 6), new Span(7, 9));
Optional<Map.Entry<Long, Long>> minMaxSpan = spans.stream()
    .collect(Collectors.teeing(
        Collectors.minBy(Comparator.comparingLong(Span::getStart)),
        Collectors.maxBy(Comparator.comparingLong(Span::getEnd)), // note: compare by end for the max
        (Optional<Span> min, Optional<Span> max) -> min.isPresent()
            ? Optional.of(Map.entry(min.get().getStart(), max.get().getEnd()))
            : Optional.empty()));
minMaxSpan.ifPresent(System.out::println);
Output
1=9
As an alternative data-carrier, you can use a Java 16 record:
public record MinMax(long start, long end) {}
Getters in the form start() and end() will be generated by the compiler.
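For instance, the teeing merger shown above could produce that record instead of a map entry (a sketch reusing the Span accessors from the earlier example):
Optional<MinMax> minMax = spans.stream()
    .collect(Collectors.teeing(
        Collectors.minBy(Comparator.comparingLong(Span::getStart)),
        Collectors.maxBy(Comparator.comparingLong(Span::getEnd)),
        (min, max) -> min.flatMap(lo -> max.map(hi -> new MinMax(lo.getStart(), hi.getEnd())))));
minMax.ifPresent(System.out::println); // MinMax[start=1, end=9]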
I am afraid that pre-Java 12 you need to operate on the given Stream twice.
Given a class Span
@Getter
@AllArgsConstructor
@ToString
static class Span {
    int start;
    int end;
}
and a list of spans
List<Span> spanList = List.of(new Span(1,2),new Span(3,4),new Span(5,1));
you could do something like this for Java 8:
Optional<Integer> minimumStart = spanList.stream().map(Span::getStart).min(Integer::compareTo);
Optional<Integer> maximumEnd = spanList.stream().map(Span::getEnd).max(Integer::compareTo);
For Java 12+ as you already noticed you can use the built-in teeing collector like:
HashMap<String, Integer> result = spanList.stream().collect(
    Collectors.teeing(
        Collectors.minBy(Comparator.comparing(Span::getStart)),
        Collectors.maxBy(Comparator.comparing(Span::getEnd)),
        (min, max) -> {
            HashMap<String, Integer> map = new HashMap<>();
            map.put("minimum start", min.get().getStart());
            map.put("maximum end", max.get().getEnd());
            return map;
        }));
System.out.println(result);
Here is a Collectors.teeing solution using a record as the Span class.
record Span(long getStart, long getEnd) {
}
List<Span> spans = List.of(new Span(10,20), new Span(30,40));
The Collectors in teeing are built upon each other. In this case:
- mapping - to get the longs out of the Span class
- maxBy, minBy - each takes a comparator to get the max or min value as appropriate; both return Optionals, so get must be used
- merge operation - to merge the results of the teed collectors; the final results are placed in a long array
long[] result = spans.stream()
    .collect(Collectors.teeing(
        Collectors.mapping(Span::getStart, Collectors.minBy(Long::compareTo)),
        Collectors.mapping(Span::getEnd, Collectors.maxBy(Long::compareTo)),
        (a, b) -> new long[] { a.get(), b.get() }));
System.out.println(Arrays.toString(result));
prints
[10, 40]
You can also use collectingAndThen to put them in an array after getting the values from the summary statistics.
long[] results = spans.stream()
    .flatMap(span -> Stream.of(span.getStart(), span.getEnd()))
    .collect(Collectors.collectingAndThen(
        Collectors.summarizingLong(Long::longValue),
        stats -> new long[] { stats.getMin(), stats.getMax() }));
System.out.println(Arrays.toString(results));
prints
[10, 40]

given an infinite sequence break it into intervals, and return a new infinite sequence with the average of each interval

I have to calculate the average of an infinite sequence using the Stream API.
Input:
Stream<Double> s = a,b,c,d ...
int interval = 3
Expected Result:
Stream<Double> result = avg(a,b,c), avg(d,e,f), ....
The result can also be an Iterator, or any other type, as long as it maintains the structure of an infinite list.
Of course, what I've written is pseudocode and doesn't run.
There is a @Beta API termed mapWithIndex within Guava that could help here, with a certain assumption:
static Stream<Double> stepAverage(Stream<Double> stream, int step) {
    return Streams.mapWithIndex(stream, (from, index) -> Map.entry(index, from))
        .collect(Collectors.groupingBy(e -> e.getKey() / step, TreeMap::new,
            Collectors.averagingDouble(Map.Entry::getValue)))
        .values().stream();
}
The assumption that it brings in is detailed clearly in the documentation (emphasis mine):
The resulting stream is efficiently splittable if and only if stream
was efficiently splittable and its underlying spliterator reported
Spliterator.SUBSIZED. This is generally the case if the underlying
stream comes from a data structure supporting efficient indexed random
access, typically an array or list.
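For illustration, invoking it on a finite source (values assumed for the example):
// averages of (1.0, 2.0, 3.0) and (4.0, 5.0, 6.0)
stepAverage(Stream.of(1.0, 2.0, 3.0, 4.0, 5.0, 6.0), 3)
    .forEach(System.out::println); // prints 2.0 then 5.0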
This should work fine using vanilla Java.
I'm using Stream#mapMulti (Java 16+) and a List external to the stream to aggregate the doubles.
As you can see, I also used DoubleSummaryStatistics to compute the average.
I could have used traditional looping and summing, then dividing, but I found this way more explicit.
Update:
I changed the collection used from Set to List, as a Set could cause unexpected behaviour.
int step = 3;
List<Double> list = new ArrayList<>();
Stream<Double> averagesStream =
    infiniteStream.mapMulti((Double aDouble, Consumer<Double> doubleConsumer) -> {
        list.add(aDouble);
        if (list.size() == step) {
            DoubleSummaryStatistics doubleSummaryStatistics = new DoubleSummaryStatistics();
            list.forEach(doubleSummaryStatistics::accept);
            list.clear();
            doubleConsumer.accept(doubleSummaryStatistics.getAverage());
        }
    });
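To see it run, one could feed it a generated infinite source (names assumed from the snippet above):
Stream<Double> infiniteStream = Stream.iterate(1.0, d -> d + 1); // 1.0, 2.0, 3.0, ...
// with step = 3 and averagesStream built as above:
averagesStream.limit(3).forEach(System.out::println); // prints 2.0, 5.0, 8.0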

what is the difference between a stateful and a stateless lambda expression?

According to the OCP book one must avoid stateful operations otherwise known as stateful lambda expression. The definition provided in the book is 'a stateful lambda expression is one whose result depends on any state that might change during the execution of a pipeline.'
They provide an example where a parallel stream is used to add a fixed collection of numbers to a synchronized ArrayList using the .map() function.
The order in the ArrayList is completely random, and this should make one see that a stateful lambda expression produces unpredictable results at runtime. That's why it's strongly recommended to avoid stateful operations when using parallel streams, so as to remove any potential data side effects.
They don't show a stateless lambda expression that provides a solution to the same problem (adding numbers to a synchronized ArrayList), and I still don't get what the problem is with using a map function to populate an empty synchronized ArrayList with data... What exactly is the state that might change during the execution of a pipeline? Are they referring to the ArrayList itself? Like when another thread decides to add other data to the ArrayList while the parallel stream is still in the process of adding the numbers, thus altering the eventual result?
Maybe someone can provide me with a better example that shows what a stateful lambda expression is and why it should be avoided. That would be very much appreciated.
Thank you
The first problem is this:
List<Integer> list = new ArrayList<>();
List<Integer> result = Stream.of(1, 2, 3, 4, 5, 6)
    .parallel()
    .map(x -> {
        list.add(x);
        return x;
    })
    .collect(Collectors.toList());
System.out.println(list);
You have no idea what the result will be here, since you are adding elements to a non-thread-safe collection ArrayList.
But even if you do:
List<Integer> list = Collections.synchronizedList(new ArrayList<>());
And perform the same operation, the list has no predictable order. Multiple threads add to this synchronized collection. By using the synchronized collection you guarantee that all elements are added (as opposed to the plain ArrayList), but the order in which they will be present is unknown.
Notice that list has no ordering guarantees whatsoever; this is called processing order. The result, on the other hand, is guaranteed to be [1, 2, 3, 4, 5, 6] for this particular example.
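A quick way to see both orders side by side (a sketch):
List<Integer> processing = Collections.synchronizedList(new ArrayList<>());
List<Integer> encounter = Stream.of(1, 2, 3, 4, 5, 6)
    .parallel()
    .peek(processing::add)          // processing order: non-deterministic
    .collect(Collectors.toList());  // encounter order: preserved
System.out.println(processing); // e.g. [4, 5, 6, 3, 2, 1], varies per run
System.out.println(encounter);  // always [1, 2, 3, 4, 5, 6]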
Depending on the problem, you usually can get rid of the stateful operations; for your example returning the synchronized List would be:
Stream.of(1, 2, 3, 4, 5, 6)
.filter(x -> x > 2) // for example a filter is present
.collect(Collectors.collectingAndThen(Collectors.toList(),
Collections::synchronizedList));
To try to give an example, let's consider the following Consumer (note: the usefulness of such a function is not the matter here):
public static class StatefulConsumer implements IntConsumer {
    private static final Integer ARBITRARY_THRESHOLD = 10;
    private boolean flag = false;
    private final List<Integer> list = new ArrayList<>();

    @Override
    public void accept(int value) {
        if (flag) { // exit condition
            return;
        }
        if (value >= ARBITRARY_THRESHOLD) {
            flag = true;
        }
        list.add(value);
    }
}
It's a consumer that will add items to a List (let's not consider how to get back the list nor the thread safety) and has a flag (to represent the statefulness).
The logic behind this would be that once the threshold has been reached, the consumer should stop adding items.
What your book was trying to say was that because there is no guaranteed order in which the function will have to consume the elements of the Stream, the output is non-deterministic.
Thus, they advise you to only use stateless functions, meaning they will always produce the same result with the same input.
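For this particular consumer, the intent can be expressed statelessly on a sequential stream, e.g. with Java 9's takeWhile (a sketch; note it excludes the threshold element itself, unlike the consumer above):
List<Integer> taken = IntStream.of(3, 7, 12, 5, 20)
    .takeWhile(value -> value < 10) // stop at the first value reaching the threshold
    .boxed()
    .collect(Collectors.toList()); // [3, 7]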
Here is an example where a stateful operation returns a different result each time:
public static void main(String[] args) {
    Set<Integer> seen = new HashSet<>();
    IntStream stream = IntStream.of(1, 2, 3, 1, 2, 3);

    // Stateful lambda expression
    IntUnaryOperator mapUniqueLambda = (int i) -> {
        if (!seen.contains(i)) {
            seen.add(i);
            return i;
        } else {
            return 0;
        }
    };

    int sum = stream.parallel()
        .map(mapUniqueLambda)
        .peek(i -> System.out.println("Stream member: " + i))
        .sum();
    System.out.println("Sum: " + sum);
}
In my case when I ran the code I got the following output:
Stream member: 1
Stream member: 0
Stream member: 2
Stream member: 3
Stream member: 1
Stream member: 2
Sum: 9
Why did I get 9 as the sum if I'm inserting into a HashSet?
The answer: different threads took different parts of the IntStream. For example, the duplicate values 1 & 2 ended up on different threads, so both copies passed the !seen.contains(i) check before the other thread's insert was visible.
A stateful lambda expression is one whose result depends on any state that might change during the execution of a pipeline. On the
other hand, a stateless lambda expression is one whose result does
not depend on any state that might change during the execution of a
pipeline.
Source: OCP: Oracle Certified Professional Java SE 8 Programmer II Study Guide: Exam 1Z0-809 by Jeanne Boyarsky and Scott Selikoff
List<Integer> data = Collections.synchronizedList(new ArrayList<>());
Arrays.asList(1, 2, 3, 4, 5, 6, 7).parallelStream()
    .map(i -> {
        data.add(i);
        return i;
    }) // AVOID STATEFUL LAMBDA EXPRESSIONS!
    .forEachOrdered(i -> System.out.print(i + " "));
System.out.println();
for (int e : data) {
    System.out.print(e + " ");
}
Possible Output:
1 2 3 4 5 6 7
1 7 5 2 3 4 6
It is strongly recommended that you avoid stateful operations when using parallel streams, so as to remove any potential data side effects. In fact, they should generally be avoided in serial streams wherever possible, since they prevent your streams from taking advantage of parallelization.
A stateful lambda expression is one whose result depends on any state that might change during the execution of a stream pipeline.
Let's understand this with an example here:
List<Integer> list = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15);
List<Integer> result = new ArrayList<>();
list.parallelStream().map(s -> {
    synchronized (result) {
        if (result.size() < 10) {
            result.add(s);
        }
    }
    return s;
}).forEach(e -> {});
System.out.println(result);
When you run this code five times, the output could be different every time. The reason is that the lambda inside map updates the result list as a side effect: which ten elements end up in result depends on how the runtime splits the work into sub-streams and which of them happen to be processed first, and that can change on every run of this parallel stream.
For better understanding of parallel stream:
Parallel computing involves dividing a problem into subproblems, solving those problems simultaneously (in parallel, with each subproblem running in a separate thread), and then combining the results of the solutions to the subproblems. When a stream executes in parallel, the Java runtime partitions the streams into multiple substreams. Aggregate operations iterate over and process these substreams in parallel and then combine the results.
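For comparison, the same "first ten elements" intent expressed without shared state is fully deterministic (a sketch):
List<Integer> firstTen = list.stream()
    .limit(10) // respects encounter order, no shared mutable state
    .collect(Collectors.toList());
System.out.println(firstTen); // always [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]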
Hope this helps!!!

Java8 Associate random point from stream with player object from other stream

So this is one that's really left me puzzled. Let's say I have a Player object, with Point p containing an x and y value:
class Player {
    void movePlayer(Point p) {
        ...
    }
}
If I have a bunch of static points (certainly more than players) that I need to randomly, yet uniquely, map to each player's movePlayer function, how would I do so? This process does not need to be done quickly, but often and randomly each time. To add a layer of complication, my points are generated by both varying x and y values. As of now I am doing the following (which crashed my JVM):
public List<Stream<Point>> generatePointStream() {
    Random random = new Random();
    List<Stream<Point>> points = new ArrayList<Stream<Point>>();
    points.add(random.ints(2384, 2413).distinct().mapToObj(x -> new Point(x, 3072)));
    points.add(random.ints(3072, 3084).distinct().mapToObj(y -> new Point(2413, y)));
    ....
    points.add(random.ints(2386, 2415).distinct().mapToObj(x -> new Point(x, 3135)));
    Collections.shuffle(points);
    return points;
}
Note that before I used only one stream with the Stream.concat method, but that threw errors and looked pretty ugly, leading me to my current predicament. And to assign them to all Player objects in the List<Player> players:
players.stream().forEach(p -> p.movePlayer(generatePointStream().stream()
    .flatMap(t -> t)
    .findAny().orElse(new Point(2376, 9487))));
Now this almost worked when I used some ridiculous abstraction, Stream<Stream<Point>>, except it only used points from the first Stream<Point>.
Am I completely missing the point of streams here? I just liked the idea of not creating explicit Point objects I wouldn't use anyway.
Well, you can define a method returning a Stream of Points like
public Stream<Point> allValues() {
    return Stream.of(
        IntStream.range(2384, 2413).mapToObj(x -> new Point(x, 3072)),
        IntStream.range(3072, 3084).mapToObj(y -> new Point(2413, y)),
        //...
        IntStream.range(2386, 2415).mapToObj(x -> new Point(x, 3135))
    ).flatMap(Function.identity());
}
which contains all valid points, though not materialized, due to the lazy nature of the Stream. Then, create a method to pick random elements like:
public List<Point> getRandomPoints(int num) {
    long count = allValues().count();
    assert count > num;
    return new Random().longs(0, count)
        .distinct()
        .limit(num)
        .mapToObj(i -> allValues().skip(i).findFirst().get())
        .collect(Collectors.toList());
}
In a perfect world, this would already have all the laziness you wish, including creating only the desired number of Point instances.
However, there are several implementation details which might make this even worse than just collecting into a list.
One is special to the flatMap operation, see “Why filter() after flatMap() is “not completely” lazy in Java streams?”. Not only are substreams processed eagerly, also Stream properties that could allow internal optimizations are not evaluated. In this regard, a concat based Stream is more efficient.
public Stream<Point> allValues() {
    return Stream.concat(
        Stream.concat(
            IntStream.range(2384, 2413).mapToObj(x -> new Point(x, 3072)),
            IntStream.range(3072, 3084).mapToObj(y -> new Point(2413, y))
        ),
        //...
        IntStream.range(2386, 2415).mapToObj(x -> new Point(x, 3135))
    );
}
There is a warning regarding creating too deeply concatenated streams, but if you are in control of the creation, like here, you can take care to create a balanced tree, e.g. for eight streams a through h:
Stream.concat(
    Stream.concat(
        Stream.concat(a, b),
        Stream.concat(c, d)
    ),
    Stream.concat(
        Stream.concat(e, f),
        Stream.concat(g, h)
    )
)
However, even though such a Stream allows to calculate the size without processing elements, this won’t happen before Java 9. In Java 8, count() will always iterate over all elements, which implies having already instantiated as much Point instances as when collecting all elements into a List after the count() operation.
Even worse, skip is not propagated to the Stream’s source, so when saying stream.map(…).skip(n).findFirst(), the mapping function is evaluated up to n+1 times instead of only once. Of course, this renders the entire idea of the getRandomPoints method using this as lazy construct useless. Due to the encapsulation and the nested streams we have here, we can’t even move the skip operation before the map.
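That cost is easy to observe with a counter (a sketch):
AtomicInteger mapCalls = new AtomicInteger();
IntStream.range(0, 100)
    .map(i -> { mapCalls.incrementAndGet(); return i; })
    .skip(9)
    .findFirst();
System.out.println(mapCalls.get()); // 10: the mapper also ran for every skipped element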
Note that temporary instances still might be handled more efficiently than collecting into a list, where all instances of them exist at the same time, but it's hard to predict due to the much larger number we have here. So if the instance creation really is a concern, we can solve this specific case due to the fact that the two int values making up a point can be encapsulated in a primitive long value:
public LongStream allValuesAsLong() {
    return LongStream.concat(
        LongStream.concat(
            LongStream.range(2384, 2413).map(x -> x << 32 | 3072),
            LongStream.range(3072, 3084).map(y -> 2413L << 32 | y)
        ),
        //...
        LongStream.range(2386, 2415).map(x -> x << 32 | 3135)
    );
}

public List<Point> getRandomPoints(int num) {
    long count = allValuesAsLong().count();
    assert count > num;
    return new Random().longs(0, count)
        .distinct()
        .limit(num)
        .mapToObj(i -> allValuesAsLong().skip(i)
            .mapToObj(l -> new Point((int) (l >>> 32), (int) (l & (1L << 32) - 1)))
            .findFirst().get())
        .collect(Collectors.toList());
}
This will indeed only create num instances of Point.
You should do something like:
final int PLAYERS_COUNT = 6;
List<Point> points = generatePointStream()
    .stream()
    .limit(PLAYERS_COUNT)
    .map(s -> s.findAny().get())
    .collect(Collectors.toList());
This outputs
2403, 3135
2413, 3076
2393, 3072
2431, 3118
2386, 3134
2368, 3113

Can you split a stream into two streams?

I have a data set represented by a Java 8 stream:
Stream<T> stream = ...;
I can see how to filter it to get a random subset - for example
Random r = new Random();
PrimitiveIterator.OfInt coin = r.ints(0, 2).iterator();
Stream<T> heads = stream.filter((x) -> (coin.nextInt() == 0));
I can also see how I could reduce this stream to get, for example, two lists representing two random halves of the data set, and then turn those back into streams.
But, is there a direct way to generate two streams from the initial one? Something like
(heads, tails) = stream.[some kind of split based on filter]
Thanks for any insight.
A collector can be used for this.
For two categories, use the Collectors.partitioningBy() factory.
This will create a Map<Boolean, List<T>> and put items in one or the other list based on a Predicate.
Note: Since the stream needs to be consumed whole, this can't work on infinite streams. And because the stream is consumed anyway, this method simply puts them in Lists instead of making a new stream-with-memory. You can always stream those lists if you require streams as output.
Also, no need for the iterator, not even in the heads-only example you provided.
Binary splitting looks like this:
Random r = new Random();
Map<Boolean, List<String>> groups = stream
.collect(Collectors.partitioningBy(x -> r.nextBoolean()));
System.out.println(groups.get(false).size());
System.out.println(groups.get(true).size());
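If you do need streams again afterwards, the partitioned lists can simply be re-streamed:
Stream<String> heads = groups.get(true).stream();
Stream<String> tails = groups.get(false).stream();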
For more categories, use a Collectors.groupingBy() factory.
Map<Object, List<String>> groups = stream
.collect(Collectors.groupingBy(x -> r.nextInt(3)));
System.out.println(groups.get(0).size());
System.out.println(groups.get(1).size());
System.out.println(groups.get(2).size());
In case the streams are not a Stream<T>, but one of the primitive streams like IntStream, then this .collect(Collector) method is not available. You'll have to do it the manual way without a collector factory. Its implementation looks like this:
IntStream intStream = IntStream.iterate(0, i -> i + 1).limit(100000).parallel();
IntPredicate predicate = ignored -> r.nextBoolean();
Map<Boolean, List<Integer>> groups = intStream.collect(
    () -> Map.of(false, new ArrayList<>(100000),
                 true,  new ArrayList<>(100000)),
    (map, value) -> map.get(predicate.test(value)).add(value),
    (map1, map2) -> {
        map1.get(false).addAll(map2.get(false));
        map1.get(true).addAll(map2.get(true));
    });
In this example I initialize the ArrayLists with the full size of the initial collection (if this is known at all). This prevents resize events even in the worst-case scenario, but can potentially gobble up 2NT space (N = initial number of elements, T = number of threads). To trade-off space for speed, you can leave it out or use your best educated guess, like the expected highest number of elements in one partition (typically just over N/2 for a balanced split).
I hope I don't offend anyone by using a Java 9 method. For the Java 8 version, look at the edit history.
I stumbled across this question myself and I feel that a forked stream has some use cases that could prove valid. I wrote the code below as a consumer so that it does not do anything, but you could apply it to functions and anything else you might come across.
class PredicateSplitterConsumer<T> implements Consumer<T> {
    private Predicate<T> predicate;
    private Consumer<T> positiveConsumer;
    private Consumer<T> negativeConsumer;

    public PredicateSplitterConsumer(Predicate<T> predicate, Consumer<T> positive, Consumer<T> negative) {
        this.predicate = predicate;
        this.positiveConsumer = positive;
        this.negativeConsumer = negative;
    }

    @Override
    public void accept(T t) {
        if (predicate.test(t)) {
            positiveConsumer.accept(t);
        } else {
            negativeConsumer.accept(t);
        }
    }
}
Now your code implementation could be something like this:
personsArray.forEach(
    new PredicateSplitterConsumer<>(
        person -> person.getDateOfBirth().isPresent(),
        person -> System.out.println(person.getName()),
        person -> System.out.println(person.getName() + " does not have Date of birth")));
Unfortunately, what you ask for is directly frowned upon in the JavaDoc of Stream:
A stream should be operated on (invoking an intermediate or terminal
stream operation) only once. This rules out, for example, "forked"
streams, where the same source feeds two or more pipelines, or
multiple traversals of the same stream.
You can work around this using peek or other methods should you truly desire that type of behaviour. In this case, what you should do is instead of trying to back two streams from the same original Stream source with a forking filter, you would duplicate your stream and filter each of the duplicates appropriately.
However, you may wish to reconsider if a Stream is the appropriate structure for your use case.
You can get two Streams out of one since Java 12 with teeing. Counting heads and tails in 100 coin flips:
Random r = new Random();
PrimitiveIterator.OfInt coin = r.ints(0, 2).iterator();

List<Long> list = Stream.iterate(0, i -> coin.nextInt())
    .limit(100)
    .collect(teeing(
        filtering(i -> i == 1, counting()),
        filtering(i -> i == 0, counting()),
        (heads, tails) -> List.of(heads, tails)));
System.err.println("heads:" + list.get(0) + " tails:" + list.get(1));
gives e.g.: heads:51 tails:49
Not exactly. You can't get two Streams out of one; this doesn't make sense -- how would you iterate over one without needing to generate the other at the same time? A stream can only be operated over once.
However, if you want to dump them into a list or something, you could do
stream.forEach((x) -> ((x == 0) ? heads : tails).add(x));
This is against the general mechanism of Stream. Say you could split stream S0 into Sa and Sb as you wanted. Performing any terminal operation, say count(), on Sa will necessarily "consume" all elements in S0, therefore Sb would lose its data source.
Previously, Stream had a tee() method, I think, which duplicated a stream into two. It's removed now.
Stream has a peek() method though; you might be able to use it to achieve your requirements.
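For instance, peek can side-collect one partition while the pipeline produces the other (a sketch; it relies on a side effect, so sequential use is assumed):
List<Integer> zeros = new ArrayList<>();
List<Integer> ones = stream
    .peek(x -> { if (x == 0) zeros.add(x); }) // side channel for one half
    .filter(x -> x == 1)
    .collect(Collectors.toList());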
Not exactly, but you may be able to accomplish what you need by invoking Collectors.groupingBy(). You create a new Collection, and can then instantiate streams on that new collection.
This was the least bad answer I could come up with.
import org.apache.commons.lang3.tuple.ImmutablePair;
import org.apache.commons.lang3.tuple.Pair;

public class Test {

    public static <T, L, R> Pair<L, R> splitStream(Stream<T> inputStream, Predicate<T> predicate,
            Function<Stream<T>, L> trueStreamProcessor, Function<Stream<T>, R> falseStreamProcessor) {
        Map<Boolean, List<T>> partitioned = inputStream.collect(Collectors.partitioningBy(predicate));
        L trueResult = trueStreamProcessor.apply(partitioned.get(Boolean.TRUE).stream());
        R falseResult = falseStreamProcessor.apply(partitioned.get(Boolean.FALSE).stream());
        return new ImmutablePair<L, R>(trueResult, falseResult);
    }

    public static void main(String[] args) {
        Stream<Integer> stream = Stream.iterate(0, n -> n + 1).limit(10);
        Pair<List<Integer>, String> results = splitStream(stream,
            n -> n > 5,
            s -> s.filter(n -> n % 2 == 0).collect(Collectors.toList()),
            s -> s.map(n -> n.toString()).collect(Collectors.joining("|")));
        System.out.println(results);
    }
}
This takes a stream of integers and splits them at 5. For those greater than 5 it filters only even numbers and puts them in a list. For the rest it joins them with |.
outputs:
([6, 8],0|1|2|3|4|5)
It's not ideal, as it collects everything into intermediary collections, breaking the stream (and it has too many arguments!).
I stumbled across this question while looking for a way to filter certain elements out of a stream and log them as errors. So I did not really need to split the stream so much as attach a premature terminating action to a predicate with unobtrusive syntax. This is what I came up with:
public class MyProcess {

    /* Return a Predicate that performs a bail-out action on non-matching items. */
    private static <T> Predicate<T> withAltAction(Predicate<T> pred, Consumer<T> altAction) {
        return x -> {
            if (pred.test(x)) {
                return true;
            }
            altAction.accept(x);
            return false;
        };
    }

    /* Example usage in a non-trivial pipeline */
    public void processItems(Stream<Item> stream) {
        stream.filter(Objects::nonNull)
            .peek(this::logItem)
            .map(Item::getSubItems)
            .filter(withAltAction(SubItem::isValid, i -> logError(i, "Invalid")))
            .peek(this::logSubItem)
            .filter(withAltAction(i -> i.size() > 10, i -> logError(i, "Too large")))
            .map(SubItem::toDisplayItem)
            .forEach(this::display);
    }
}
Shorter version that uses Lombok
import java.util.function.Consumer;
import java.util.function.Predicate;

import lombok.AccessLevel;
import lombok.RequiredArgsConstructor;
import lombok.experimental.FieldDefaults;

/**
 * Forks a Stream using a Predicate into positive and negative outcomes.
 */
@RequiredArgsConstructor
@FieldDefaults(makeFinal = true, level = AccessLevel.PROTECTED)
public class StreamForkerUtil<T> implements Consumer<T> {

    Predicate<T> predicate;
    Consumer<T> positiveConsumer;
    Consumer<T> negativeConsumer;

    @Override
    public void accept(T t) {
        (predicate.test(t) ? positiveConsumer : negativeConsumer).accept(t);
    }
}
How about:
Supplier<Stream<Integer>> randomIntsStreamSupplier =
() -> (new Random()).ints(0, 2).boxed();
Stream<Integer> tails =
randomIntsStreamSupplier.get().filter(x->x.equals(0));
Stream<Integer> heads =
randomIntsStreamSupplier.get().filter(x->x.equals(1));
