Stream mysteriously consumed twice - java

The following code ends up with a java.lang.IllegalStateException: stream has already been operated upon or closed.
public static void main(String[] args) {
Stream.concat(Stream.of("FOOBAR"),
reverse(StreamSupport.stream(new File("FOO/BAR").toPath().spliterator(), true)
.map(Path::toString)));
}
static <T> Stream<T> reverse(Stream<T> stream) {
return stream.reduce(Stream.empty(),
(Stream<T> a, T b) -> Stream.concat(Stream.of(b), a),
(a, b) -> Stream.concat(b, a));
}
The obvious solution is to generate a non-parallel stream with StreamSupport.stream(…, false), but I can’t see why it can’t run in parallel.

Stream.empty() is not a constant. This method returns a new stream instance on each invocation that will get consumed like any other stream, e.g. when you pass it into Stream.concat.
Therefore, Stream.empty() is not suitable as an identity value for reduce, as the identity value may get passed as input to the reduction function an arbitrary, intentionally unspecified number of times. It’s an implementation detail that it happens to be used only a single time for sequential reduction and potentially multiple times for parallel reduction.
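For illustration (my addition, not from the original answer), the same failure can be reproduced by consuming one Stream.empty() instance twice:

Stream<Object> empty = Stream.empty(); // a fresh stream instance, not a reusable constant
empty.count();  // first terminal operation consumes the stream
empty.count();  // throws IllegalStateException: stream has already been operated upon or closed

In the parallel reduction, the single identity instance is handed to Stream.concat more than once, which triggers exactly this exception.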
You can use
static <T> Stream<T> reverse(Stream<T> stream) {
return stream.map(Stream::of)
.reduce((a, b) -> Stream.concat(b, a))
.orElseGet(Stream::empty);
}
instead.
However, I only provide the solution as an academic exercise. As soon as the stream gets large, it leads to an excessive number of concat calls, and the note in the documentation applies:
Use caution when constructing streams from repeated concatenation. Accessing an element of a deeply concatenated stream can result in deep call chains, or even StackOverflowError.
Generally, the resulting underlying data structure will be far more expensive than a flat list, when using the Stream API this way.
You can use something like
Stream<String> s = Stream.concat(Stream.of("FOOBAR"),
reverse(new File("FOO/BAR").toPath()).map(Path::toString));
static Stream<Path> reverse(Path p) {
ArrayDeque<Path> d = new ArrayDeque<>();
p.forEach(d::addFirst);
return d.stream();
}
or
static Stream<Path> reverse(Path p) {
Stream.Builder<Path> b = Stream.builder();
for(; p != null; p = p.getParent()) b.add(p.getFileName());
return b.build();
}
With Java 9+ you can use a stream that truly has no additional storage (which does not necessarily imply that it will be more efficient):
static Stream<Path> reverse(Path p) {
return Stream.iterate(p, Objects::nonNull, Path::getParent).map(Path::getFileName);
}

Related

How to set a value to variable based on multiple conditions using Java Streams API?

I couldn't wrap my head around writing the below condition using Java Streams. Let's assume that I have a list of elements from the periodic table. I have to write a method that returns a String by checking whether the list has Silicon or Radium or both. If it has only Silicon, the method has to return Silicon. If it has only Radium, the method has to return Radium. If it has both, the method has to return Both. If none of them are present, the method returns "" (default value).
Currently, the code that I've written is below.
String resolve(List<Element> elements) {
AtomicReference<String> value = new AtomicReference<>("");
elements.stream()
.map(Element::getName)
.forEach(name -> {
if (name.equalsIgnoreCase("RADIUM")) {
if (value.get().equals("")) {
value.set("RADIUM");
} else {
value.set("BOTH");
}
} else if (name.equalsIgnoreCase("SILICON")) {
if (value.get().equals("")) {
value.set("SILICON");
} else {
value.set("BOTH");
}
}
});
return value.get();
}
I understand the code looks messy and more imperative than functional, but I don't know how to write it in a better manner using streams. I've also considered going through the list a couple of times to filter the elements Silicon and Radium and finalizing based on that, but it doesn't seem efficient to go through a list twice.
NOTE : I also understand that this could be written in an imperative manner rather than complicating with streams and atomic variables. I just want to know how to write the same logic using streams.
Please share your suggestions on better ways to achieve the same goal using Java Streams.
It could be done with the Stream API in a single statement, without multiline lambdas, nested conditions, or an impure function that changes state outside the lambda.
My approach is to introduce an enum whose constants correspond to all possible outcomes: EMPTY, SILICON, RADIUM, BOTH.
All the return values apart from the empty string can be obtained by invoking the method name() inherited from java.lang.Enum. Only to cover the case of the empty string, I've added a getName() method.
Note that since Java 16 enums can be declared locally inside a method.
The logic of the stream pipeline is the following:
the stream of elements turns into a stream of strings;
it gets filtered and transformed into a stream of enum constants;
reduction is done on the enum constants;
the optional of an enum constant turns into an optional of string.
Implementation can look like this:
public static String resolve(List<Element> elements) {
return elements.stream()
.map(Element::getName)
.map(String::toUpperCase)
.filter(str -> str.equals("SILICON") || str.equals("RADIUM"))
.map(Elements::valueOf)
.reduce((result, next) -> result == Elements.BOTH || result != next ? Elements.BOTH : next)
.map(Elements::getName)
.orElse("");
}
enum
enum Elements {EMPTY, SILICON, RADIUM, BOTH;
String getName() {
return this == EMPTY ? "" : name(); // note: name() is declared final in java.lang.Enum and can't be overridden
}
}
main
public static void main(String[] args) {
System.out.println(resolve(List.of(new Element("Silicon"), new Element("Lithium"))));
System.out.println(resolve(List.of(new Element("Silicon"), new Element("Radium"))));
System.out.println(resolve(List.of(new Element("Ferrum"), new Element("Oxygen"), new Element("Aurum")))
.isEmpty() + " - no target elements"); // output is an empty string
}
output
SILICON
BOTH
true - no target elements
Note:
Although with streams you can produce the result in O(n) time, an iterative approach might be better for this task. Think about it this way: if you have a list of 10,000 elements and it starts with "SILICON" and "RADIUM", you could break the loop right away and return "BOTH" (see the sketch after these notes).
Stateful operations in streams have to be avoided according to the documentation; to understand why the javadoc warns against stateful lambdas, you might take a look at this question. If you want to play around with AtomicReference it's totally fine, just keep in mind that this approach is not considered good practice.
I guess if I had implemented such a method with streams, the overall logic would be the same as above, but without an enum. Since only a single object is needed, it's a reduction, so I would apply reduce() on a stream of strings and extract the reduction logic with all its conditions into a separate method. Normally, lambdas should be well-readable one-liners.
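As a minimal sketch of the early-exit iterative approach mentioned in the first note above (my illustration; it assumes the asker's Element class with a getName() method):

static String resolve(List<Element> elements) {
    boolean hasSilicon = false;
    boolean hasRadium = false;
    for (Element element : elements) {
        String name = element.getName();
        if (name.equalsIgnoreCase("SILICON")) hasSilicon = true;
        else if (name.equalsIgnoreCase("RADIUM")) hasRadium = true;
        if (hasSilicon && hasRadium) return "BOTH"; // stop as soon as both are found
    }
    return hasSilicon ? "SILICON" : hasRadium ? "RADIUM" : "";
}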
Collect the strings to a unique set. Then check containment in constant time.
Set<String> names = elements.stream().map(Element::getName).map(String::toLowerCase).collect(toSet());
boolean hasSilicon = names.contains("silicon");
boolean hasRadium = names.contains("radium");
String result = "";
if (hasSilicon && hasRadium) {
result = "BOTH";
} else if (hasSilicon) {
result = "SILICON";
} else if (hasRadium) {
result = "RADIUM";
}
return result;
I have used a predicate in filter for Radium and Silicon, and using the resulting set I print the result.
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;
public class Test {
public static void main(String[] args) {
List<Element> elements = new ArrayList<>(); // populate with the Element instances to check
Set<String> stringSet = elements.stream().map(e -> e.getName())
.filter(string -> (string.equals("Radium") || string.equals("Silicon")))
.collect(Collectors.toSet());
if(stringSet.size()==2){
System.out.println("both");
}else if(stringSet.size()==1){
System.out.println(stringSet);
}else{
System.out.println(" ");
}
}
}
You could save a few lines if you use regex, but I doubt if it is better than the other answers:
String resolve(List<Element> elements) {
String result = elements.stream()
.map(Element::getName)
.map(String::toUpperCase)
.filter(str -> str.matches("RADIUM|SILICON"))
.distinct()
.sorted()
.collect(Collectors.joining());
return result.matches("RADIUMSILICON") ? "BOTH" : result;
}

Java Stream `generate()` how to "include" the first "excluded" element

Assume this usage scenario for a Java stream, where data is added from a data source. The data source can be a list of values, like in the example below, or a paginated REST API. It doesn't matter at the moment.
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Stream;
public class Main {
public static void main(String[] args) {
final List<Boolean> dataSource = List.of(true, true, true, false, false, false, false);
final AtomicInteger index = new AtomicInteger();
Stream
.generate(() -> {
boolean value = dataSource.get(index.getAndIncrement());
System.out.format("--> Executed expensive operation to retrieve data: %b\n", value);
return value;
})
.takeWhile(value -> value == true)
.forEach(data -> System.out.printf("--> Using: %b\n", data));
}
}
If you run this code your output will be
--> Executed expensive operation to retrieve data: true
--> Using: true
--> Executed expensive operation to retrieve data: true
--> Using: true
--> Executed expensive operation to retrieve data: true
--> Using: true
--> Executed expensive operation to retrieve data: false
As you can see the last element, the one that evaluated to false, did not get added to the stream, as expected.
Now assume that the generate() method loads pages of data from a REST API. In that case the true/false value on page N indicates whether page N + 1 exists, something like a has_more field. Now, I want the last page returned by the API to be added to the stream, but I do not want to perform another expensive operation to read an empty page, because I already know that there are no more pages.
What is the most idiomatic way to do this using the Java Stream API? Every workaround I can think of requires a call to the API to be executed.
UPDATE
In addition to the approaches listed in Inclusive takeWhile() for Streams there is another ugly way to achieve this.
public static void main(String[] args) {
final List<Boolean> dataSource = List.of(true, true, true, false, false, false, false);
final AtomicInteger index = new AtomicInteger();
final AtomicBoolean hasMore = new AtomicBoolean(true);
Stream
.generate(() -> {
if (!hasMore.get()) {
return null;
}
boolean value = dataSource.get(index.getAndIncrement());
hasMore.set(value);
System.out.format("--> Executed expensive operation to retrieve data: %b\n", value);
return value;
})
.takeWhile(Objects::nonNull)
.forEach(data -> System.out.printf("--> Using: %b\n", data));
}
You are using the wrong tool for your job. As is already noticeable in your code example, the Supplier passed to Stream.generate has to go to great lengths to maintain the index it needs for fetching pages.
What makes matters worse is that Stream.generate creates an unordered Stream:
Returns an infinite sequential unordered stream where each element is generated by the provided Supplier.
This is suitable for generating constant streams, streams of random elements, etc.
You’re not returning constant or random values nor anything else that would be independent of the order.
This has a significant impact on the semantics of takeWhile:
Otherwise returns, if this stream is unordered, a stream consisting of a subset of elements taken from this stream that match the given predicate.
This makes sense if you think about it. If there is at least one element rejected by the predicate, it could be encountered at an arbitrary position for an unordered stream, so an arbitrary subset of elements encountered before it, including the empty set, would be a valid prefix.
But since there is no “before” or “after” for an unordered stream, even elements produced by the generator after the rejected one could be included by the result.
In practice, you are unlikely to encounter such effects for a sequential stream, but it doesn’t change the fact that Stream.generate(…) .takeWhile(…) is semantically wrong for your task.
From your example code, I conclude that pages do not contain their own number nor a "getNext" method, so we have to maintain the number and the "hasNext" state for creating a stream.
Assuming an example setup like
class Page {
private String data;
private boolean hasNext;
public Page(String data, boolean hasNext) {
this.data = data;
this.hasNext = hasNext;
}
public String getData() {
return data;
}
public boolean hasNext() {
return hasNext;
}
}
private static String[] SAMPLE_PAGES = { "foo", "bar", "baz" };
public static Page getPage(int index) {
Objects.checkIndex(index, SAMPLE_PAGES.length);
return new Page(SAMPLE_PAGES[index], index + 1 < SAMPLE_PAGES.length);
}
You can implement a correct stream like
Stream.iterate(Map.entry(0, getPage(0)), Objects::nonNull,
e -> e.getValue().hasNext()? Map.entry(e.getKey()+1, getPage(e.getKey()+1)): null)
.map(Map.Entry::getValue)
.forEach(page -> System.out.println(page.getData()));
Note that Stream.iterate creates an ordered stream:
Returns a sequential ordered Stream produced by iterative application of the given next function to an initial element,
conditioned on satisfying the given hasNext predicate.
Of course, things would be much easier if the page knew its own number, e.g.
Stream.iterate(getPage(0), Objects::nonNull,
p -> p.hasNext()? getPage(p.getPageNumber()+1): null)
.forEach(page -> System.out.println(page.getData()));
or if there was a method to get from an existing Page to the next Page, e.g.
Stream.iterate(getPage(0), Objects::nonNull, p -> p.hasNext()? p.getNextPage(): null)
.forEach(page -> System.out.println(page.getData()));

takeWhile() working differently with flatmap

I am creating snippets with takeWhile to explore its possibilities. When used in conjunction with flatMap, the behaviour is not in line with the expectation. Please find the code snippet below.
String[][] strArray = {{"Sample1", "Sample2"}, {"Sample3", "Sample4", "Sample5"}};
Arrays.stream(strArray)
.flatMap(indStream -> Arrays.stream(indStream))
.takeWhile(ele -> !ele.equalsIgnoreCase("Sample4"))
.forEach(ele -> System.out.println(ele));
Actual Output:
Sample1
Sample2
Sample3
Sample5
ExpectedOutput:
Sample1
Sample2
Sample3
The reason for the expectation is that takeWhile should keep taking elements only until the condition inside turns false. I have also added print statements inside flatMap for debugging. The inner streams are created just twice, which is in line with the expectation.
However, this works just fine without flatmap in the chain.
String[] strArraySingle = {"Sample3", "Sample4", "Sample5"};
Arrays.stream(strArraySingle)
.takeWhile(ele -> !ele.equalsIgnoreCase("Sample4"))
.forEach(ele -> System.out.println(ele));
Actual Output:
Sample3
Here the actual output matches with the expected output.
Disclaimer: These snippets are just for code practice and do not serve any valid use cases.
Update:
Bug JDK-8193856: fix will be available as part of JDK 10.
The change will be to correct WhileOps' Sink::accept. The current implementation:
@Override
public void accept(T t) {
if (take = predicate.test(t)) {
downstream.accept(t);
}
}
Changed Implementation:
@Override
public void accept(T t) {
if (take && (take = predicate.test(t))) {
downstream.accept(t);
}
}
This is a bug in JDK 9 - from issue #8193856:
takeWhile is incorrectly assuming that an upstream operation supports and honors cancellation, which unfortunately is not the case for flatMap.
Explanation
If the stream is ordered, takeWhile should show the expected behavior. This is not entirely the case in your code because you use forEach, which waives order. If you care about it, which you do in this example, you should use forEachOrdered instead. Funny thing: That doesn't change anything. 🤔
So maybe the stream isn't ordered in the first place? (In that case the behavior is ok.) If you create a temporary variable for the stream created from strArray and check whether it is ordered by executing the expression ((StatefulOp) stream).isOrdered(); at the breakpoint, you will find that it is indeed ordered:
String[][] strArray = {{"Sample1", "Sample2"}, {"Sample3", "Sample4", "Sample5"}};
Stream<String> stream = Arrays.stream(strArray)
.flatMap(indStream -> Arrays.stream(indStream))
.takeWhile(ele -> !ele.equalsIgnoreCase("Sample4"));
// breakpoint here
System.out.println(stream);
That means that this is very likely an implementation error.
Into The Code
As others have suspected, I now also think that this might be connected to flatMap being eager. More precisely, both problems might have the same root cause.
Looking into the source of WhileOps, we can see these methods:
@Override
public void accept(T t) {
if (take = predicate.test(t)) {
downstream.accept(t);
}
}
@Override
public boolean cancellationRequested() {
return !take || downstream.cancellationRequested();
}
This code is used by takeWhile to check for a given stream element t whether the predicate is fulfilled:
If so, it passes the element on to the downstream operation, in this case System.out::println.
If not, it sets take to false, so when it is asked next time whether the pipeline should be canceled (i.e. it is done), it returns true.
This covers the takeWhile operation. The other thing you need to know is that forEachOrdered leads to the terminal operation executing the method ReferencePipeline::forEachWithCancel:
#Override
final boolean forEachWithCancel(Spliterator<P_OUT> spliterator, Sink<P_OUT> sink) {
boolean cancelled;
do { } while (
!(cancelled = sink.cancellationRequested())
&& spliterator.tryAdvance(sink));
return cancelled;
}
All this does is:
check whether pipeline was canceled
if not, advance the sink by one element
stop if this was the last element
Looks promising, right?
Without flatMap
In the "good case" (without flatMap; your second example) forEachWithCancel directly operates on the WhileOp as sink and you can see how this plays out:
ReferencePipeline::forEachWithCancel does its loop:
WhileOps::accept is given each stream element
WhileOps::cancellationRequested is queried after each element
at some point "Sample4" fails the predicate and the stream is canceled
Yay!
With flatMap
In the "bad case" (with flatMap; your first example), forEachWithCancel operates on the flatMap operation, though, , which simply calls forEachRemaining on the ArraySpliterator for {"Sample3", "Sample4", "Sample5"}, which does this:
if ((a = array).length >= (hi = fence) &&
(i = index) >= 0 && i < (index = hi)) {
do { action.accept((T)a[i]); } while (++i < hi);
}
Ignoring all that hi and fence stuff, which is only used if the array processing is split for a parallel stream, this is a simple for loop, which passes each element to the takeWhile operation, but never checks whether it is cancelled. It will hence eagerly plow through all elements in that "substream" before stopping, likely even through the rest of the stream.
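To make that eager traversal visible (my own experiment, not part of the original answer), one can add a peek inside the flatMap; the exact output depends on the JDK version:

String[][] strArray = {{"Sample1", "Sample2"}, {"Sample3", "Sample4", "Sample5"}};
Arrays.stream(strArray)
    .flatMap(inner -> Arrays.stream(inner)
        .peek(ele -> System.out.println("fetched: " + ele))) // what the inner spliterator pushes
    .takeWhile(ele -> !ele.equalsIgnoreCase("Sample4"))
    .forEach(ele -> System.out.println("used: " + ele));
// on a JDK affected by the bug, "fetched: Sample5" and "used: Sample5" still appear,
// because the inner array is traversed in one go without any cancellation check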
This is a bug no matter how I look at it - and thank you Holger for your comments. I did not want to put this answer in here (seriously!), but none of the answers clearly states that this is a bug.
People are saying that this has to do with ordered/unordered, and this is not true, as this will report true 3 times:
Stream<String[]> s1 = Arrays.stream(strArray);
System.out.println(s1.spliterator().hasCharacteristics(Spliterator.ORDERED));
Stream<String> s2 = Arrays.stream(strArray)
.flatMap(indStream -> Arrays.stream(indStream));
System.out.println(s2.spliterator().hasCharacteristics(Spliterator.ORDERED));
Stream<String> s3 = Arrays.stream(strArray)
.flatMap(indStream -> Arrays.stream(indStream))
.takeWhile(ele -> !ele.equalsIgnoreCase("Sample4"));
System.out.println(s3.spliterator().hasCharacteristics(Spliterator.ORDERED));
It's very interesting also that if you change it to:
String[][] strArray = {
{ "Sample1", "Sample2" },
{ "Sample3", "Sample5", "Sample4" }, // Sample4 is the last one here
{ "Sample7", "Sample8" }
};
then Sample7 and Sample8 will not be part of the output, otherwise they will. It seems that flatMap ignores a cancel flag that would be introduced by takeWhile.
If you look at the documentation for takeWhile:
if this stream is ordered, [returns] a stream consisting of the
longest prefix of elements taken from this stream that match the given
predicate.
if this stream is unordered, [returns] a stream consisting of a subset
of elements taken from this stream that match the given predicate.
Your stream is coincidentally ordered, but takeWhile doesn't know that it is. As such, it is returning the 2nd condition, the subset. Your takeWhile is just acting like a filter.
If you add a call to sorted before takeWhile, you'll see the result you expect:
Arrays.stream(strArray)
.flatMap(indStream -> Arrays.stream(indStream))
.sorted()
.takeWhile(ele -> !ele.equalsIgnoreCase("Sample4"))
.forEach(ele -> System.out.println(ele));
The reason for that is that flatMap is also an intermediate operation, and it is combined here with takeWhile, (one of) the stateful short-circuiting intermediate operations.
The behavior of flatMap, as pointed out by Holger in this answer, is certainly a reference one shouldn't miss in order to understand the unexpected output for such short-circuiting operations.
Your expected result can be achieved by splitting these two intermediate operations with a terminal operation, so that an ordered stream is used deterministically afterwards, for example:
List<String> sampleList = Arrays.stream(strArray).flatMap(Arrays::stream).collect(Collectors.toList());
sampleList.stream().takeWhile(ele -> !ele.equalsIgnoreCase("Sample4"))
.forEach(System.out::println);
Also, there seems to be a related bug, JDK-8075939, already registered to track this behavior.
Edit: This can be tracked further at JDK-8193856, which has been accepted as a bug.

Java 8 streams - stackoverflow exception

Running the following code sample ends with:
"Exception in thread "main" java.lang.StackOverflowError"
import java.util.stream.IntStream;
import java.util.stream.Stream;
public class TestStream {
public static void main(String[] args) {
Stream<String> reducedStream = IntStream.range(0, 15000)
.mapToObj(Abc::new)
.reduce(
Stream.of("Test")
, (str , abc) -> abc.process(str)
, (a , b) -> {throw new IllegalStateException();}
);
System.out.println(reducedStream.findFirst().get());
}
private static class Abc {
public Abc(int id) {
}
public Stream<String> process(Stream<String> batch) {
return batch.map(this::doNothing);
}
private String doNothing(String test) {
return test;
}
}
}
What exactly is causing that issue? Which part of this code is recursive and why?
Your code isn't recursively looping. You can test with smaller numbers for the IntStream range (e.g. 1 or 100). In your case it's the actual stack size limit that causes the problem. As pointed out in some of the comments, it's the way the streams are processed.
Each invocation on the stream creates a new wrapper stream around the original one. The findFirst() method asks the previous stream for elements, which in turn asks the previous stream for elements, and so on, because the streams are no real containers but only pointers to the elements of the result.
The wrapper explosion happens in the reduce method's accumulator (str, abc) -> abc.process(str). The implementation of the method creates a new stream wrapper around the result (str) of the previous operation, feeding it into the next iteration and creating a new wrapper around result(result(str)). So the accumulation mechanism is one of a wrapper (recursion) and not of an appender (iteration). Creating a new stream from the actual (flattened) result, and not from a reference to the potential result, would therefore stop the explosion, i.e.
public Stream<String> process(Stream<String> batch) {
return Stream.of(batch.map(this::doNothing).collect(Collectors.joining()));
}
This method is just an example, as your original code doesn't make any sense because it does nothing, and neither does this example; it's just an illustration. It basically flattens the elements of the stream returned by the map method into a single string and creates a new stream from this concrete string and not from a stream itself; that's the difference to your original code.
You could tune the stack size using the -Xss parameter, which defines the size of the stack per thread. The default value depends on the platform; see also the question 'What is the maximum depth of the java call stack?'. But take care when increasing it: this setting applies to all threads.
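For example, a quick way to try a larger stack when launching the sample (the 4m value is an arbitrary choice):

java -Xss4m TestStream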

How to process chunks of a file with java.util.stream

To get familiar with the stream API, I tried to code a quite simple pattern.
Problem: I have a text file containing non-nested blocks of text. All blocks are identified by start/end patterns (e.g. <start> and <stop>). The content of a block isn't syntactically distinguishable from the noise between the blocks. Therefore it is impossible to work with simple (stateless) lambdas.
I was just able to implement something ugly like:
Files.lines(path).collect(new MySequentialParseAndProsessEachLineCollector<>());
To be honest, this is not what I want.
I'm looking for a mapper, something like:
Files.lines(path).map(MyMapAllLinesOfBlockToBuckets()).parallelStream().collect(new MyProcessOneBucketCollector<>());
The question 'is there a good way to extract chunks of data from a java 8 stream' seems to contain a skeleton of a solution. Unfortunately, I'm too stupid to translate that to my problem. ;-)
Any hints?
Here is a solution which can be used for converting a Stream<String>, each element representing a line, to a Stream<List<String>>, each element representing a chunk found using a specified delimiter:
public class ChunkSpliterator implements Spliterator<List<String>> {
private final Spliterator<String> source;
private final Predicate<String> start, end;
private final Consumer<String> getChunk;
private List<String> current;
ChunkSpliterator(Spliterator<String> lineSpliterator,
Predicate<String> chunkStart, Predicate<String> chunkEnd) {
source=lineSpliterator;
start=chunkStart;
end=chunkEnd;
getChunk=s -> {
if(current!=null) current.add(s);
else if(start.test(s)) current=new ArrayList<>();
};
}
public boolean tryAdvance(Consumer<? super List<String>> action) {
while(current==null || current.isEmpty()
|| !end.test(current.get(current.size()-1)))
if(!source.tryAdvance(getChunk)) return false;
current.remove(current.size()-1);
action.accept(current);
current=null;
return true;
}
public Spliterator<List<String>> trySplit() {
return null;
}
public long estimateSize() {
return Long.MAX_VALUE;
}
public int characteristics() {
return ORDERED|NONNULL;
}
public static Stream<List<String>> toChunks(Stream<String> lines,
Predicate<String> chunkStart, Predicate<String> chunkEnd,
boolean parallel) {
return StreamSupport.stream(
new ChunkSpliterator(lines.spliterator(), chunkStart, chunkEnd),
parallel);
}
}
The lines matching the predicates are not included in the chunk; it would be easy to change this behavior, if desired.
It can be used like this:
ChunkSpliterator.toChunks( Files.lines(Paths.get(myFile)),
Pattern.compile("^<start>$").asPredicate(),
Pattern.compile("^<stop>$").asPredicate(),
true )
.collect(new MyProcessOneBucketCollector<>())
The patterns are specified as ^word$ to require the entire line to consist of the word only; without these anchors, lines containing the pattern can start and end a chunk. The nature of the source stream does not allow parallelism when creating the chunks, so when chaining with an immediate collection operation the parallelism for the entire operation is rather limited. It depends on the MyProcessOneBucketCollector whether there can be any parallelism at all.
If your final result does not depend on the order of occurrences of the buckets in the source file, it is strongly recommended that either your collector reports itself to be UNORDERED or you insert an unordered() in the stream’s method chains before the collect.
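For a quick test without a custom collector (my illustration; the file name is a placeholder), the resulting chunks can simply be gathered into a list:

List<List<String>> chunks = ChunkSpliterator.toChunks(
        Files.lines(Paths.get("blocks.txt")),
        Pattern.compile("^<start>$").asPredicate(),
        Pattern.compile("^<stop>$").asPredicate(),
        false)                               // sequential is enough for a quick check
    .collect(Collectors.toList());
chunks.forEach(System.out::println);         // each element is the list of lines of one block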

Categories

Resources