How can I check whether a Stream is empty and throw an exception if it is, as a non-terminal operation?
Basically, I'm looking for something equivalent to the code below, but without materializing the stream in-between. In particular, the check should not occur before the stream is actually consumed by a terminal operation.
public Stream<Thing> getFilteredThings() {
    Stream<Thing> stream = getThings().stream()
        .filter(Thing::isFoo)
        .filter(Thing::isBar);
    return nonEmptyStream(stream, () -> {
        throw new RuntimeException("No foo bar things available");
    });
}
private static <T> Stream<T> nonEmptyStream(Stream<T> stream, Supplier<T> defaultValue) {
    // Materializes the whole stream, which is what the question wants to avoid
    List<T> list = stream.collect(Collectors.toList());
    if (list.isEmpty()) list.add(defaultValue.get());
    return list.stream();
}
This may be sufficient in many cases
stream.findAny().isPresent()
The other answers and comments are correct in that to examine the contents of a stream, one must add a terminal operation, thereby "consuming" the stream. However, one can do this and turn the result back into a stream, without buffering up the entire contents of the stream. Here are a couple examples:
static <T> Stream<T> throwIfEmpty(Stream<T> stream) {
Iterator<T> iterator = stream.iterator();
if (iterator.hasNext()) {
return StreamSupport.stream(Spliterators.spliteratorUnknownSize(iterator, 0), false);
} else {
throw new NoSuchElementException("empty stream");
}
}
static <T> Stream<T> defaultIfEmpty(Stream<T> stream, Supplier<T> supplier) {
Iterator<T> iterator = stream.iterator();
if (iterator.hasNext()) {
return StreamSupport.stream(Spliterators.spliteratorUnknownSize(iterator, 0), false);
} else {
return Stream.of(supplier.get());
}
}
Basically turn the stream into an Iterator in order to call hasNext() on it, and if true, turn the Iterator back into a Stream. This is inefficient in that all subsequent operations on the stream will go through the Iterator's hasNext() and next() methods, which also implies that the stream is effectively processed sequentially (even if it's later turned parallel). However, this does allow you to test the stream without buffering up all of its elements.
There is probably a way to do this using a Spliterator instead of an Iterator. This potentially allows the returned stream to have the same characteristics as the input stream, including running in parallel.
If you can live with limited parallel capabilities, the following solution will work:
private static <T> Stream<T> nonEmptyStream(
Stream<T> stream, Supplier<RuntimeException> e) {
Spliterator<T> it=stream.spliterator();
return StreamSupport.stream(new Spliterator<T>() {
boolean seen;
public boolean tryAdvance(Consumer<? super T> action) {
boolean r=it.tryAdvance(action);
if(!seen && !r) throw e.get();
seen=true;
return r;
}
public Spliterator<T> trySplit() { return null; }
public long estimateSize() { return it.estimateSize(); }
public int characteristics() { return it.characteristics(); }
}, false);
}
Here is some example code using it:
List<String> l=Arrays.asList("hello", "world");
nonEmptyStream(l.stream(), ()->new RuntimeException("No strings available"))
.forEach(System.out::println);
nonEmptyStream(l.stream().filter(s->s.startsWith("x")),
()->new RuntimeException("No strings available"))
.forEach(System.out::println);
The problem with (efficient) parallel execution is that supporting splitting of the Spliterator requires a thread-safe way to record whether any of the fragments has seen a value. The last fragment to finish tryAdvance would then have to realize that it is the last one, and that no fragment could advance, in order to throw the appropriate exception. So I didn’t add support for splitting here.
You must perform a terminal operation on the Stream in order for any of the filters to be applied. Therefore you can't know if it will be empty until you consume it.
Best you can do is terminate the Stream with a findAny() terminal operation, which will stop when it finds any element, but if there are none, it will have to iterate over all the input list to find that out.
This would only help you if the input list has many elements, and one of the first few passes the filters, since only a small subset of the list would have to be consumed before you know the Stream is not empty.
Of course you'll still have to create a new Stream in order to produce the output list.
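As a rough sketch of the approach described above, assuming the source is a collection that can be streamed twice: probe with a short-circuiting findAny(), then build a fresh stream for the real work. The class and method names (FindAnyCheck, filteredOrThrow) are made up for illustration. Note that the check runs eagerly when the method is called, not when the returned stream is consumed.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Stream;

class FindAnyCheck {
    // Hypothetical helper: probes the source once, then streams it again.
    static Stream<String> filteredOrThrow(List<String> source, Predicate<String> filter) {
        // findAny() stops at the first match; it scans everything only if nothing matches
        boolean anyPresent = source.stream().filter(filter).findAny().isPresent();
        if (!anyPresent) {
            throw new IllegalStateException("No matching elements");
        }
        // The probe stream is consumed, so a second stream is created for the caller
        return source.stream().filter(filter);
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("hello", "world");
        filteredOrThrow(words, s -> s.startsWith("w")).forEach(System.out::println); // prints "world"
    }
}
```

The filter is evaluated twice for the leading elements, so this only pays off when the filter is cheap and free of side effects.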
I think it should be enough to map each element to a boolean.
In code this is:
boolean notEmpty = anyCollection.stream()
    .filter(p -> someFilter(p)) // Add my filter
    .map(p -> Boolean.TRUE) // For each element after the filter, map to TRUE
    .findAny() // Get any TRUE
    .orElse(Boolean.FALSE); // If there is no match, return FALSE
Following Stuart's idea, this could be done with a Spliterator like this:
static <T> Stream<T> defaultIfEmpty(Stream<T> stream, Stream<T> defaultStream) {
final Spliterator<T> spliterator = stream.spliterator();
final AtomicReference<T> reference = new AtomicReference<>();
if (spliterator.tryAdvance(reference::set)) {
return Stream.concat(Stream.of(reference.get()), StreamSupport.stream(spliterator, stream.isParallel()));
} else {
return defaultStream;
}
}
I think this works with parallel Streams, as the stream.spliterator() operation will terminate the stream and then rebuild it as required.
In my use case I needed a default Stream rather than a default value. That's quite easy to change if this is not what you need.
I would simply use:
stream.count()>0
The simplest solution I could find that avoids iterators records, inside the terminal operation itself, whether any element was seen:
public void processFilteredThings() {
    AtomicBoolean found = new AtomicBoolean(false);
    // forEach is terminal and returns void, so there is no Stream to hand back
    getThings().stream()
        .filter(Thing::isFoo)
        .filter(Thing::isBar)
        .forEach(x -> {
            found.set(true);
            // do useful things
        });
    if (!found.get()) {
        throw new RuntimeException("No foo bar things available");
    }
}
Feel free to suggest improvements..
I have a class where optionally a Comparator can be specified.
Since the Comparator is optional, I have to evaluate its presence and execute the same stream code, either with sorted() or without:
if(comparator != null) {
[...].stream().map()[...].sorted(comparator)[...];
} else {
[...].stream().map()[...];
}
Question:
Is there a more elegant way to do this without the code duplication?
Note:
A default Comparator is not an option, I just want to keep the original order of the values I am streaming.
Besides, the elements are already mapped at the point of sorting, so I can not somehow reference the root list of the stream, as I do not have the original elements anymore.
You can do something like this:
Stream<Something> stream = [...].stream().map()[...]; // preliminary processing
if(comparator != null) {
stream = stream.sorted(comparator); // optional sorting
}
stream... // resumed processing, which ends in some terminal operation (such as collect)
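A self-contained sketch of the same pattern, with String elements and toUpperCase standing in for the elided pipeline steps (both are placeholders, not part of the original question):

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class OptionalSort {
    // comparator may be null, meaning "keep the encounter order"
    static List<String> upperCased(List<String> input, Comparator<String> comparator) {
        Stream<String> stream = input.stream().map(String::toUpperCase); // preliminary processing
        if (comparator != null) {
            stream = stream.sorted(comparator); // optional sorting
        }
        return stream.collect(Collectors.toList()); // resumed processing
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("b", "c", "a");
        System.out.println(upperCased(input, null));                      // [B, C, A]
        System.out.println(upperCased(input, Comparator.naturalOrder())); // [A, B, C]
    }
}
```

Reassigning the variable works because sorted() returns a Stream of the same element type, and only the last stream in the chain is ever consumed.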
Another way would be to use Optional:
Stream<Whatever> stream = [...].stream().map()[...];
List<WhateverElse> result = Optional.ofNullable(comparator)
.map(stream::sorted)
.orElse(stream)
.[...] // <-- go on with the stream pipeline
.collect(Collectors.toList());
You could define a comparator of your type (I used E as a placeholder here) that will not change the order:
Comparator<E> NO_SORTING = (one, other) -> 0;
If the comparator field is an Optional of Comparator, you can then use
.sorted(comparator.orElse(NO_SORTING))
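A small sketch of this idea, assuming String elements and a trim step as a placeholder for the real mapping. It relies on the documented guarantee that sorted() is stable for ordered streams, so a comparator that always returns 0 leaves the encounter order alone. The stream is still buffered and passed through the sort, so this is not free.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

class NoSortingDemo {
    // Treats all elements as equal; a stable sort then keeps the original order
    static final Comparator<String> NO_SORTING = (one, other) -> 0;

    static List<String> process(List<String> input, Optional<Comparator<String>> comparator) {
        return input.stream()
                .map(String::trim) // placeholder for the real mapping step
                .sorted(comparator.orElse(NO_SORTING))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList(" b", "c ", "a");
        System.out.println(process(input, Optional.empty()));                      // [b, c, a]
        System.out.println(process(input, Optional.of(Comparator.naturalOrder()))); // [a, b, c]
    }
}
```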
If you don't mind using the third-party library StreamEx:
StreamEx.of(source).[...].chain(s -> comparator == null ? s : s.sorted(comparator)).[...];
You can accomplish this using an auxiliary function.
static <T, R> R applyFunction(T obj, Function<T, R> f) {
return f.apply(obj);
}
and
applyFunction([...].stream().map()[...],
stream -> comparator == null ? stream : stream.sorted(comparator))
[...];
You don't need to know intermediate stream type.
Suppose I have a Queue<String> and I want to empty the current contents of the queue and do something with each element. Using a loop I could do something like:
while (true) {
String element = queue.poll();
if (element == null) {
return;
}
System.out.println(element);
}
This feels a bit ugly. Could I do this better with streams?
Note that there may be other threads accessing the queue at the same time, so relying on the size of the queue to know how many items to poll would be error prone.
Since you asked about “without blocking”, it seems you are referring to a BlockingQueue. In that case, it’s recommended to avoid repeatedly calling poll().
Instead, transfer all pending elements to a local collection in one go, then process them:
List<String> tmp = new ArrayList<>();
queue.drainTo(tmp);
tmp.forEach(System.out::println);
You may also avoid synchronizing on System.out (implicitly) multiple times:
List<String> tmp=new ArrayList<>();
queue.drainTo(tmp);
System.out.println(tmp.stream().collect(Collectors.joining(System.lineSeparator())));
or
List<String> tmp=new ArrayList<>();
queue.drainTo(tmp);
System.out.println(String.join(System.lineSeparator(), tmp));
(though that doesn’t bear a stream operation)…
You don't need to use streams to make the code less ugly (a stream solution would probably be more ugly).
String s = null;
while((s = queue.poll()) != null)
System.out.println(s);
The best answer I can come up with at present is the following:
StreamEx.generate(() -> queue.poll())
.takeWhile(Objects::nonNull)
.forEach(System.out::println);
but that uses a library (StreamEx). Can we do this with vanilla Java?
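With Java 9 or later, one possible vanilla sketch uses the three-argument Stream.iterate, which yields an ordered stream and stops as soon as poll() returns null. The class and method names (DrainWithStream, drain) are made up here. The shorter Stream.generate(queue::poll).takeWhile(Objects::nonNull) mirrors the StreamEx version, but generate is documented as producing an unordered stream, for which takeWhile makes weaker guarantees.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Objects;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.stream.Stream;

class DrainWithStream {
    // Polls until the queue reports empty (null); never relies on queue.size()
    static <T> List<T> drain(Queue<T> queue) {
        List<T> drained = new ArrayList<>();
        // seed = first poll, continue while non-null, next element = another poll
        Stream.iterate(queue.poll(), Objects::nonNull, previous -> queue.poll())
              .forEach(drained::add);
        return drained;
    }

    public static void main(String[] args) {
        Queue<String> queue = new ConcurrentLinkedQueue<>(Arrays.asList("a", "b", "c"));
        System.out.println(drain(queue));    // [a, b, c]
        System.out.println(queue.isEmpty()); // true
    }
}
```

The first poll() happens when the stream is built rather than when it is consumed, so the stream should be consumed right away, as it is here.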
As per my understanding, if you want to use stream then you can't modify the same collection on which you are doing operations.
Below code might help you:
queue.stream().forEach(e -> {
System.out.println(e);
});
queue.clear();
I don’t think you want the complication, but if I am wrong, you may do:
Spliterator<String> spliterator = new Spliterators.AbstractSpliterator<String>(queue.size(), 0) {
    @Override
    public boolean tryAdvance(Consumer<? super String> action) {
        String element = queue.poll();
        if (element == null) {
            return false;
        } else {
            action.accept(element);
            return true;
        }
    }
};
StreamSupport.stream(spliterator, false).forEach(System.out::println);
You may look into the docs of Spliterators and StreamSupport for numerous possible refinements.
How can I convert multiple Streams into one Stream? For example, I have 3 IntStreams and I want to combine them into one Stream of int arrays.
In the Javadoc, most Stream operations take one stream as input, and the concat doesn't answer my use case.
Here's what I had in mind
Stream 1: 1, 2, 3
Stream 2: 4, 5, 6
Combined Stream ex1: [1,4],[2,5],[3,6]
Combined Stream ex2: 1+4,2+5,3+6
Combined Stream ex3: new MyObject(1,4), new MyObject(2,5), new MyObject(3,6)
In functional terms, the problem comes down to zipping a list of streams and applying a custom zipper to each group of elements.
There is no facility to do that directly with the Stream API. We can use 3rd party libraries, like the protonpack library, that provides a zip method to do that. Considering the data:
List<Stream<Integer>> streams = Arrays.asList(Stream.of(1,2,3), Stream.of(4,5,6));
you can have
Stream<Integer> stream = StreamUtils.zip(streams, l -> l.stream().mapToInt(i -> i).sum());
// the Stream is now "1+4,2+5,3+6"
or
Stream<Integer[]> stream = StreamUtils.zip(streams, l -> l.toArray(new Integer[l.size()]));
// the Stream is now "[1,4][2,5][3,6]"
The mapper takes the list of elements to zip and returns the zipped value. In the first example, it sums the value together, while it returns an array in the second.
Sadly, there is nothing native to the Stream that does this for you. An unfortunate shortcoming to the API.
That said, you could do this by taking out an Iterator on each of the streams, similar to:
public static <T,U,R> Stream<R> zipStreams (Stream<T> a, Stream<U> b, BiFunction<T,U,R> zipFunc) {
    Iterator<T> itA = a.iterator();
    Iterator<U> itB = b.iterator();
    Iterator<R> itRet = new Iterator<R>() {
        @Override
        public boolean hasNext() {
            return itA.hasNext() && itB.hasNext();
        }
        @Override
        public R next() {
            return zipFunc.apply(itA.next(), itB.next());
        }
    };
    Iterable<R> ret = () -> itRet;
    return StreamSupport.stream(ret.spliterator(), a.isParallel() || b.isParallel());
}
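For the IntStream case from the question, the same iterator technique might look like the sketch below. It pairs two IntStreams into a Stream of int arrays and stops at the shorter input. IntZip and zip are illustrative names, and the result is deliberately sequential.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.PrimitiveIterator;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.stream.IntStream;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

class IntZip {
    static Stream<int[]> zip(IntStream a, IntStream b) {
        PrimitiveIterator.OfInt itA = a.iterator();
        PrimitiveIterator.OfInt itB = b.iterator();
        Iterator<int[]> pairs = new Iterator<int[]>() {
            @Override
            public boolean hasNext() {
                return itA.hasNext() && itB.hasNext(); // stop at the shorter stream
            }
            @Override
            public int[] next() {
                return new int[] { itA.nextInt(), itB.nextInt() };
            }
        };
        // Size is unknown up front; ORDERED keeps the pairing order visible downstream
        return StreamSupport.stream(
                Spliterators.spliteratorUnknownSize(pairs, Spliterator.ORDERED), false);
    }

    public static void main(String[] args) {
        zip(IntStream.of(1, 2, 3), IntStream.of(4, 5, 6))
                .map(pair -> Arrays.toString(pair))
                .forEach(System.out::println); // prints [1, 4] then [2, 5] then [3, 6]
    }
}
```

Mapping each pair afterwards covers the other variants from the question, for example .map(p -> p[0] + p[1]) for the sums.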
I need to convert Stream<Optional<Integer>> to Optional<Stream<Integer>>.
The output Optional<Stream<Integer>> should be empty when at least one value of the Stream<Optional<Integer>> is empty.
Do you know any functional way to solve the problem? I tried to use collect method, but without success.
Well, the tricky thing here is that if you're just given a Stream, you can only use it once.
To be stateless and avoid redundant copying, one way is to just catch NoSuchElementException:
static <T> Optional<Stream<T>> invert(Stream<Optional<T>> stream) {
try {
return Optional.of(
stream.map(Optional::get)
.collect(Collectors.toList())
.stream());
} catch (NoSuchElementException e) {
return Optional.empty();
}
}
A simple inversion would be:
static <T> Optional<Stream<T>> invert(Stream<Optional<T>> stream) {
return Optional.of(stream.map(Optional::get));
}
But to find out if it contains an empty element, you need to actually traverse it which also consumes it.
If you're given the source of the stream, you can traverse it without collecting it:
static <T> Optional<Stream<T>> invert(
Supplier<Stream<Optional<T>>> supplier) {
// taking advantage of short-circuiting here
// instead of allMatch(Optional::isPresent)
return supplier.get().anyMatch(o -> !o.isPresent()) ?
Optional.empty() : Optional.of(supplier.get().map(Optional::get));
}
List<Optional<Integer>> myInts =
Arrays.asList(Optional.of(1), Optional.of(2), Optional.of(3));
Optional<Stream<Integer>> inverted = invert(myInts::stream);
That's probably a more interesting approach. (But it's prone to a race condition because the stream() is taken twice. If some other thread adds an empty element in between and gets away with it, we have a problem.)
Although this has already been answered, to add to the list: with Java 9 introducing Optional.stream, this should be achievable as:
// initialized stream of optional
Stream<Optional<Integer>> so = Stream.empty();
// mapped stream of T
Stream<Integer> s = so.flatMap(Optional::stream);
// constructing optional from the stream
Optional<Stream<Integer>> os = Optional.of(s);
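Note that flatMap(Optional::stream) drops empty elements instead of making the whole result empty. If the result must be empty as soon as one element is missing, a buffering sketch like the following is one option. It collects into a list first, so it gives up laziness, and the name InvertOptionals is made up for illustration.

```java
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class InvertOptionals {
    static <T> Optional<Stream<T>> invert(Stream<Optional<T>> stream) {
        // Buffer once so the elements can be inspected and then streamed again
        List<Optional<T>> buffered = stream.collect(Collectors.toList());
        if (buffered.stream().anyMatch(o -> !o.isPresent())) {
            return Optional.empty(); // at least one element was missing
        }
        return Optional.of(buffered.stream().map(Optional::get));
    }

    public static void main(String[] args) {
        System.out.println(invert(Stream.of(Optional.of(1), Optional.of(2))).isPresent());            // true
        System.out.println(invert(Stream.of(Optional.of(1), Optional.<Integer>empty())).isPresent()); // false
    }
}
```

An empty input stream yields a present Optional holding an empty Stream here; whether that is the right edge-case behaviour depends on the caller.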
Similar to Radiodef's answer, though this one avoids the exception handling and the intermediate list.
private static <T> Optional<Stream<T>> invertOptional(Stream<Optional<T>> input) {
return input.map(integer -> integer.map(Stream::of))
.collect(Collectors.reducing((l, r) -> l.flatMap(lv -> r.map(rv -> Stream.concat(lv, rv)))))
.orElse(Optional.empty());
}
The way this works is that it maps to a Stream of Optional Streams of T. Optional.map is used here, so each of the Optional<Stream<T>> items in the resulting stream is either a Stream of one element or an empty Optional.
Then it collects these streams by reducing them together. The l.flatMap returns an empty Optional if l is empty or if r.map returns an empty Optional. If r.map isn't empty, it calls Stream.concat, which combines the left and right stream values.
The whole collect reduction produces an Optional<Optional<Stream<T>>>, so we narrow that down with .orElse(Optional.empty()).
Note: The code is tested and appears to work. The unspecified edge case of an empty input Stream is treated as an empty Optional, but this can easily be changed.
final Stream<Optional<Integer>> streamOfInts = Stream.of(Optional.of(1), Optional.of(2), Optional.of(3), Optional.of(4), Optional.of(5));
// false - list of Optional.empty(); true -> list of Optional.of(Integer)
final Map<Boolean, List<Optional<Integer>>> collect = streamOfInts.collect(Collectors.partitioningBy(Optional::isPresent));
final Function<List<Optional<Integer>>, Stream<Integer>> mapToStream = list->list.stream().filter(o->o.isPresent()).map(o->o.get());
Optional<Stream<Integer>> result = Optional
.of(Optional.of(collect.get(false)).filter(list->list.size()>0).orElse(collect.get(true)))
.filter(list->list.size()>0)
.filter(list->list.get(0).isPresent())
.map(mapToStream)
.map(Optional::of)
.orElse(Optional.empty());
I have a data set represented by a Java 8 stream:
Stream<T> stream = ...;
I can see how to filter it to get a random subset - for example
Random r = new Random();
PrimitiveIterator.OfInt coin = r.ints(0, 2).iterator();
Stream<T> heads = stream.filter((x) -> (coin.nextInt() == 0));
I can also see how I could reduce this stream to get, for example, two lists representing two random halves of the data set, and then turn those back into streams.
But, is there a direct way to generate two streams from the initial one? Something like
(heads, tails) = stream.[some kind of split based on filter]
Thanks for any insight.
A collector can be used for this.
For two categories, use Collectors.partitioningBy() factory.
This will create a Map<Boolean, List>, and put items in one or the other list based on a Predicate.
Note: Since the stream needs to be consumed whole, this can't work on infinite streams. And because the stream is consumed anyway, this method simply puts them in Lists instead of making a new stream-with-memory. You can always stream those lists if you require streams as output.
Also, no need for the iterator, not even in the heads-only example you provided.
Binary splitting looks like this:
Random r = new Random();
Map<Boolean, List<String>> groups = stream
.collect(Collectors.partitioningBy(x -> r.nextBoolean()));
System.out.println(groups.get(false).size());
System.out.println(groups.get(true).size());
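To make the result easy to check, here is a self-contained variant that swaps the random coin for a deterministic parity predicate (an assumption made only for this example) and then streams the two partitions again:

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class PartitionDemo {
    // true -> even numbers, false -> odd numbers
    static Map<Boolean, List<Integer>> splitByParity(Stream<Integer> stream) {
        return stream.collect(Collectors.partitioningBy(n -> n % 2 == 0));
    }

    public static void main(String[] args) {
        Map<Boolean, List<Integer>> groups = splitByParity(Stream.of(1, 2, 3, 4, 5));
        // Each list can be turned back into an independent stream
        Stream<Integer> evens = groups.get(true).stream();
        Stream<Integer> odds = groups.get(false).stream();
        System.out.println(evens.map(n -> n * 10).collect(Collectors.toList())); // [20, 40]
        System.out.println(odds.count());                                        // 3
    }
}
```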
For more categories, use a Collectors.groupingBy() factory.
Map<Object, List<String>> groups = stream
.collect(Collectors.groupingBy(x -> r.nextInt(3)));
System.out.println(groups.get(0).size());
System.out.println(groups.get(1).size());
System.out.println(groups.get(2).size());
If the stream is not a Stream but one of the primitive streams, such as IntStream, the .collect(Collectors) method is not available. You'll have to do it the manual way, without a collector factory. Its implementation looks like this:
[Example 2.0 since 2020-04-16]
IntStream intStream = IntStream.iterate(0, i -> i + 1).limit(100000).parallel();
IntPredicate predicate = ignored -> r.nextBoolean();
Map<Boolean, List<Integer>> groups = intStream.collect(
() -> Map.of(false, new ArrayList<>(100000),
true , new ArrayList<>(100000)),
(map, value) -> map.get(predicate.test(value)).add(value),
(map1, map2) -> {
map1.get(false).addAll(map2.get(false));
map1.get(true ).addAll(map2.get(true ));
});
In this example I initialize the ArrayLists with the full size of the initial collection (if this is known at all). This prevents resize events even in the worst-case scenario, but can potentially gobble up 2NT space (N = initial number of elements, T = number of threads). To trade-off space for speed, you can leave it out or use your best educated guess, like the expected highest number of elements in one partition (typically just over N/2 for a balanced split).
I hope I don't offend anyone by using a Java 9 method. For the Java 8 version, look at the edit history.
I stumbled across this question myself, and I feel that a forked stream has some valid use cases. I wrote the code below as a Consumer, so it does not return anything, but you could apply the same idea to functions and anything else you might come across.
class PredicateSplitterConsumer<T> implements Consumer<T>
{
private Predicate<T> predicate;
private Consumer<T> positiveConsumer;
private Consumer<T> negativeConsumer;
public PredicateSplitterConsumer(Predicate<T> predicate, Consumer<T> positive, Consumer<T> negative)
{
this.predicate = predicate;
this.positiveConsumer = positive;
this.negativeConsumer = negative;
}
@Override
public void accept(T t)
{
if (predicate.test(t))
{
positiveConsumer.accept(t);
}
else
{
negativeConsumer.accept(t);
}
}
}
Now your code implementation could be something like this:
personsArray.forEach(
new PredicateSplitterConsumer<>(
person -> person.getDateOfBirth().isPresent(),
person -> System.out.println(person.getName()),
person -> System.out.println(person.getName() + " does not have Date of birth")));
Unfortunately, what you ask for is directly frowned upon in the JavaDoc of Stream:
A stream should be operated on (invoking an intermediate or terminal
stream operation) only once. This rules out, for example, "forked"
streams, where the same source feeds two or more pipelines, or
multiple traversals of the same stream.
You can work around this using peek or other methods should you truly desire that type of behaviour. In this case, what you should do is instead of trying to back two streams from the same original Stream source with a forking filter, you would duplicate your stream and filter each of the duplicates appropriately.
However, you may wish to reconsider if a Stream is the appropriate structure for your use case.
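One way to read "duplicate your stream" is to keep a Supplier that can recreate the stream from its source and filter each copy separately. The sketch below assumes a re-streamable source and a deterministic predicate; with a random coin flip, each copy would flip independently, and an element could land in both halves or in neither. The names DuplicatedStreams and select are illustrative.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;
import java.util.function.Supplier;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class DuplicatedStreams {
    // Each call to source.get() must return a fresh, unconsumed stream
    static <T> List<T> select(Supplier<Stream<T>> source, Predicate<T> predicate) {
        return source.get().filter(predicate).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> data = Arrays.asList(1, 2, 3, 4, 5, 6);
        Supplier<Stream<Integer>> source = data::stream;
        Predicate<Integer> isSmall = n -> n <= 3;
        System.out.println(select(source, isSmall));          // [1, 2, 3]
        System.out.println(select(source, isSmall.negate())); // [4, 5, 6]
    }
}
```

The source is traversed twice, which is the price for never operating on the same Stream instance more than once.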
You can get two Streams out of one since Java 12 with teeing. For example, counting heads and tails in 100 coin flips:
// assumes: import static java.util.stream.Collectors.*;
Random r = new Random();
PrimitiveIterator.OfInt coin = r.ints(0, 2).iterator();
// generate() instead of iterate(0, ...) so the first flip is random too
List<Long> list = Stream.generate(coin::nextInt)
    .limit(100).collect(teeing(
        filtering(i -> i == 1, counting()),
        filtering(i -> i == 0, counting()),
        (heads, tails) -> List.of(heads, tails)));
System.err.println("heads:" + list.get(0) + " tails:" + list.get(1));
This prints, for example: heads:51 tails:49
Not exactly. You can't get two Streams out of one; this doesn't make sense -- how would you iterate over one without needing to generate the other at the same time? A stream can only be operated over once.
However, if you want to dump them into a list or something, you could do
stream.forEach((x) -> ((x == 0) ? heads : tails).add(x));
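Spelled out as a runnable sketch, with the heads and tails lists declared explicitly and 0 counted as heads, as in the question. This is for sequential streams only, since ArrayList is not thread-safe. HeadsTails and split are illustrative names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;

class HeadsTails {
    // Returns [heads, tails]; an element equal to 0 counts as heads
    static List<List<Integer>> split(Stream<Integer> flips) {
        List<Integer> heads = new ArrayList<>();
        List<Integer> tails = new ArrayList<>();
        flips.forEach(x -> (x == 0 ? heads : tails).add(x));
        return Arrays.asList(heads, tails);
    }

    public static void main(String[] args) {
        List<List<Integer>> result = split(Stream.of(0, 1, 1, 0, 1));
        System.out.println("heads: " + result.get(0)); // heads: [0, 0]
        System.out.println("tails: " + result.get(1)); // tails: [1, 1, 1]
    }
}
```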
This is against the general mechanism of Stream. Say you can split Stream S0 to Sa and Sb like you wanted. Performing any terminal operation, say count(), on Sa will necessarily "consume" all elements in S0. Therefore Sb lost its data source.
Previously, Stream had a tee() method, I think, which duplicated a stream into two. It has since been removed.
Stream has a peek() method though, you might be able to use it to achieve your requirements.
Not exactly, but you may be able to accomplish what you need by invoking Collectors.groupingBy(). You create a new Collection and can then instantiate streams on that new collection.
This was the least bad answer I could come up with.
import org.apache.commons.lang3.tuple.ImmutablePair;
import org.apache.commons.lang3.tuple.Pair;
public class Test {
public static <T, L, R> Pair<L, R> splitStream(Stream<T> inputStream, Predicate<T> predicate,
Function<Stream<T>, L> trueStreamProcessor, Function<Stream<T>, R> falseStreamProcessor) {
Map<Boolean, List<T>> partitioned = inputStream.collect(Collectors.partitioningBy(predicate));
L trueResult = trueStreamProcessor.apply(partitioned.get(Boolean.TRUE).stream());
R falseResult = falseStreamProcessor.apply(partitioned.get(Boolean.FALSE).stream());
return new ImmutablePair<L, R>(trueResult, falseResult);
}
public static void main(String[] args) {
Stream<Integer> stream = Stream.iterate(0, n -> n + 1).limit(10);
Pair<List<Integer>, String> results = splitStream(stream,
n -> n > 5,
s -> s.filter(n -> n % 2 == 0).collect(Collectors.toList()),
s -> s.map(n -> n.toString()).collect(Collectors.joining("|")));
System.out.println(results);
}
}
This takes a stream of integers and splits them at 5. For those greater than 5 it filters only even numbers and puts them in a list. For the rest it joins them with |.
outputs:
([6, 8],0|1|2|3|4|5)
It's not ideal, as it collects everything into intermediary collections, breaking the stream (and it has too many arguments!).
I stumbled across this question while looking for a way to filter certain elements out of a stream and log them as errors. So I did not really need to split the stream so much as attach a premature terminating action to a predicate with unobtrusive syntax. This is what I came up with:
public class MyProcess {
/* Return a Predicate that performs a bail-out action on non-matching items. */
private static <T> Predicate<T> withAltAction(Predicate<T> pred, Consumer<T> altAction) {
return x -> {
if (pred.test(x)) {
return true;
}
altAction.accept(x);
return false;
};
}
/* Example usage in non-trivial pipeline */
public void processItems(Stream<Item> stream) {
stream.filter(Objects::nonNull)
.peek(this::logItem)
.map(Item::getSubItems)
.filter(withAltAction(SubItem::isValid,
i -> logError(i, "Invalid")))
.peek(this::logSubItem)
.filter(withAltAction(i -> i.size() > 10,
i -> logError(i, "Too large")))
.map(SubItem::toDisplayItem)
.forEach(this::display);
}
}
Shorter version that uses Lombok
import java.util.function.Consumer;
import java.util.function.Predicate;
import lombok.AccessLevel;
import lombok.RequiredArgsConstructor;
import lombok.experimental.FieldDefaults;
/**
 * Forks a Stream using a Predicate into positive and negative outcomes.
 */
@RequiredArgsConstructor
@FieldDefaults(makeFinal = true, level = AccessLevel.PROTECTED)
public class StreamForkerUtil<T> implements Consumer<T> {
Predicate<T> predicate;
Consumer<T> positiveConsumer;
Consumer<T> negativeConsumer;
@Override
public void accept(T t) {
(predicate.test(t) ? positiveConsumer : negativeConsumer).accept(t);
}
}
How about:
Supplier<Stream<Integer>> randomIntsStreamSupplier =
() -> (new Random()).ints(0, 2).boxed();
Stream<Integer> tails =
randomIntsStreamSupplier.get().filter(x->x.equals(0));
Stream<Integer> heads =
randomIntsStreamSupplier.get().filter(x->x.equals(1));