Java 8 streams - StackOverflowError

Running the following code sample ends with:
"Exception in thread "main" java.lang.StackOverflowError"
import java.util.stream.IntStream;
import java.util.stream.Stream;

public class TestStream {
    public static void main(String[] args) {
        Stream<String> reducedStream = IntStream.range(0, 15000)
            .mapToObj(Abc::new)
            .reduce(
                Stream.of("Test"),
                (str, abc) -> abc.process(str),
                (a, b) -> { throw new IllegalStateException(); }
            );
        System.out.println(reducedStream.findFirst().get());
    }

    private static class Abc {
        public Abc(int id) {
        }

        public Stream<String> process(Stream<String> batch) {
            return batch.map(this::doNothing);
        }

        private String doNothing(String test) {
            return test;
        }
    }
}
What exactly is causing that issue? Which part of this code is recursive and why?

Your code isn't recursively looping. You can verify that with smaller numbers for the IntStream range (e.g. 1 or 100). In your case it's the actual stack size limit that causes the problem. As pointed out in some of the comments, it's the way the streams are processed.
Each invocation on the stream creates a new wrapper stream around the original one. The findFirst() method asks the previous stream for elements, which in turn asks its predecessor for elements, and so on, because streams are not real containers but only pointers to the elements of the result.
The wrapper explosion happens in the reduce method's accumulator, (str, abc) -> abc.process(str). The implementation of that method creates a new stream wrapper on the result (str) of the previous operation, which feeds into the next iteration, creating a new wrapper on result(result(str)). So the accumulation mechanism is that of a wrapper (recursion) and not of an appender (iteration). Creating a new stream from the actual (flattened) result, rather than from a reference to the potential result, stops the explosion, i.e.
public Stream<String> process(Stream<String> batch) {
    // requires import java.util.stream.Collectors
    return Stream.of(batch.map(this::doNothing).collect(Collectors.joining()));
}
This method is just an example; like your original code it does nothing useful, it's only an illustration. It flattens the elements of the stream returned by the map method into a single string and creates a new stream on this concrete string, not on a stream itself. That's the difference from your original code.
You could tune the stack size using the -Xss parameter, which defines the size of the stack per thread. The default value depends on the platform; see also the question 'What is the maximum depth of the java call stack?'. But take care when increasing it: the setting applies to all threads.
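For example, to run the sample above with a larger per-thread stack (the 16m value is only an illustration, not a recommendation):

java -Xss16m TestStream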


How to set a value to variable based on multiple conditions using Java Streams API?

I couldn't wrap my head around writing the below condition using Java Streams. Let's assume that I have a list of elements from the periodic table. I have to write a method that returns a String by checking whether the list has Silicon, Radium, or both. If it has only Silicon, the method has to return Silicon. If it has only Radium, it has to return Radium. If it has both, it has to return Both. If neither is present, it returns "" (the default value).
Currently, the code that I've written is below.
String resolve(List<Element> elements) {
    AtomicReference<String> value = new AtomicReference<>("");
    elements.stream()
            .map(Element::getName)
            .forEach(name -> {
                if (name.equalsIgnoreCase("RADIUM")) {
                    if (value.get().equals("")) {
                        value.set("RADIUM");
                    } else {
                        value.set("BOTH");
                    }
                } else if (name.equalsIgnoreCase("SILICON")) {
                    if (value.get().equals("")) {
                        value.set("SILICON");
                    } else {
                        value.set("BOTH");
                    }
                }
            });
    return value.get();
}
I understand the code looks messy and more imperative than functional, but I don't know how to write it in a better manner using streams. I've also considered going through the list twice to filter for Silicon and Radium and deciding based on that, but iterating the list twice doesn't seem efficient.
NOTE : I also understand that this could be written in an imperative manner rather than complicating with streams and atomic variables. I just want to know how to write the same logic using streams.
Please share your suggestions on better ways to achieve the same goal using Java Streams.
It can be done with the Stream API in a single statement, without multiline lambdas, nested conditions, or an impure function that changes state outside the lambda.
My approach is to introduce an enum whose constants EMPTY, SILICON, RADIUM, BOTH correspond to all possible outcomes.
All return values apart from the empty string can be obtained by invoking the method name() inherited from java.lang.Enum. Only to cover the case of the empty string have I added a getName() method.
Note that since Java 16 enums can be declared locally inside a method.
The logic of the stream pipeline is the following:
the stream of elements turns into a stream of strings;
it gets filtered and transformed into a stream of enum constants;
reduction is done on the enum members;
the optional of enum turns into an optional of string.
Implementation can look like this:
public static String resolve(List<Element> elements) {
    return elements.stream()
            .map(Element::getName)
            .map(String::toUpperCase)
            .filter(str -> str.equals("SILICON") || str.equals("RADIUM"))
            .map(Elements::valueOf)
            .reduce((result, next) -> result == Elements.BOTH || result != next ? Elements.BOTH : next)
            .map(Elements::getName)
            .orElse("");
}
enum
enum Elements {
    EMPTY, SILICON, RADIUM, BOTH;

    String getName() {
        return this == EMPTY ? "" : name(); // name() is declared final in java.lang.Enum and can't be overridden
    }
}
main
public static void main(String[] args) {
    System.out.println(resolve(List.of(new Element("Silicon"), new Element("Lithium"))));
    System.out.println(resolve(List.of(new Element("Silicon"), new Element("Radium"))));
    System.out.println(resolve(List.of(new Element("Ferrum"), new Element("Oxygen"), new Element("Aurum")))
            .isEmpty() + " - no target elements"); // the result is an empty string
}
output
SILICON
BOTH
true - no target elements
Note:
Although with streams you can produce the result in O(n) time, an iterative approach might be better for this task. Think about it this way: if the list has 10,000 elements and starts with "SILICON" and "RADIUM", a plain loop could break and return "BOTH" right away.
Stateful operations in streams should be avoided according to the documentation; to understand why the javadoc warns against stateful streams you might take a look at this question. If you want to play around with AtomicReference that's totally fine, just keep in mind that this approach is not considered good practice.
If I had to implement such a method with streams, the overall logic would be the same as above, but without utilizing an enum. Since only a single object is needed it's a reduction, so I'd apply reduce() on a stream of strings and extract the reduction logic, with all its conditions, into a separate method. Normally, lambdas should be well-readable one-liners.
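A sketch of that enum-free variant could look like the following (the combine() helper is illustrative, not taken from the answer above):

public static String resolve(List<Element> elements) {
    return elements.stream()
            .map(Element::getName)
            .map(String::toUpperCase)
            .filter(str -> str.equals("SILICON") || str.equals("RADIUM"))
            .reduce((result, next) -> combine(result, next))
            .orElse("");
}

// illustrative reduction logic: once both target names have been seen, stay at "BOTH"
private static String combine(String result, String next) {
    return result.equals("BOTH") || !result.equals(next) ? "BOTH" : next;
}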
Collect the strings to a unique set. Then check containment in constant time.
Set<String> names = elements.stream()
        .map(Element::getName)
        .map(String::toLowerCase)
        .collect(Collectors.toSet());
boolean hasSilicon = names.contains("silicon");
boolean hasRadium = names.contains("radium");
String result = "";
if (hasSilicon && hasRadium) {
    result = "BOTH";
} else if (hasSilicon) {
    result = "SILICON";
} else if (hasRadium) {
    result = "RADIUM";
}
return result;
I have used a predicate in filter() for Radium and Silicon, and I print the result based on the resulting set.
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

public class Test {
    public static void main(String[] args) {
        List<Element> elements = new ArrayList<>(); // populate with your data
        Set<String> stringSet = elements.stream()
                .map(e -> e.getName())
                .filter(string -> string.equals("Radium") || string.equals("Silicon"))
                .collect(Collectors.toSet());
        if (stringSet.size() == 2) {
            System.out.println("both");
        } else if (stringSet.size() == 1) {
            System.out.println(stringSet);
        } else {
            System.out.println("");
        }
    }
}
You could save a few lines if you use regex, but I doubt it is better than the other answers:
String resolve(List<Element> elements) {
    String result = elements.stream()
            .map(Element::getName)
            .map(String::toUpperCase)
            .filter(str -> str.matches("RADIUM|SILICON"))
            .distinct() // guard against duplicate occurrences of the same element
            .sorted()
            .collect(Collectors.joining());
    return result.matches("RADIUMSILICON") ? "BOTH" : result;
}

Stream mysteriously consumed twice

The following code ends up with a java.lang.IllegalStateException: stream has already been operated upon or closed.
public static void main(String[] args) {
    Stream.concat(Stream.of("FOOBAR"),
        reverse(StreamSupport.stream(new File("FOO/BAR").toPath().spliterator(), true)
            .map(Path::toString)));
}

static <T> Stream<T> reverse(Stream<T> stream) {
    return stream.reduce(Stream.empty(),
        (Stream<T> a, T b) -> Stream.concat(Stream.of(b), a),
        (a, b) -> Stream.concat(b, a));
}
The obvious solution is to generate a non-parallel stream with StreamSupport.stream(…, false), but I can't see why it can't run in parallel.
Stream.empty() is not a constant. This method returns a new stream instance on each invocation, and that instance gets consumed like any other stream, e.g. when you pass it into Stream.concat.
Therefore, Stream.empty() is not suitable as the identity value for reduce, as the identity value may get passed as input to the reduction function an arbitrary, intentionally unspecified number of times. It's an implementation detail that it happens to be used only a single time for sequential reduction and potentially multiple times for parallel reduction.
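A minimal sketch of the underlying issue, using nothing beyond java.util.stream: a stream instance can be consumed only once.

Stream<String> identity = Stream.empty();
Stream.concat(identity, Stream.of("a")).forEach(System.out::println); // consumes identity
Stream.concat(identity, Stream.of("b")).forEach(System.out::println); // IllegalStateException: stream has already been operated upon or closed

During a parallel reduction the same identity instance is handed to several threads, so it gets consumed more than once, which is exactly what produces the exception in the question.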
You can use
static <T> Stream<T> reverse(Stream<T> stream) {
    return stream.map(Stream::of)
                 .reduce((a, b) -> Stream.concat(b, a))
                 .orElseGet(Stream::empty);
}
instead.
However, I only provide the solution as an academic exercise. As soon as the stream gets large, it leads to an excessive number of concat calls and the note in the documentation applies:
Use caution when constructing streams from repeated concatenation. Accessing an element of a deeply concatenated stream can result in deep call chains, or even StackOverflowError.
Generally, the resulting underlying data structure will be far more expensive than a flat list, when using the Stream API this way.
You can use something like
Stream<String> s = Stream.concat(Stream.of("FOOBAR"),
    reverse(new File("FOO/BAR").toPath()).map(Path::toString));

static Stream<Path> reverse(Path p) {
    ArrayDeque<Path> d = new ArrayDeque<>();
    p.forEach(d::addFirst);
    return d.stream();
}
or
static Stream<Path> reverse(Path p) {
    Stream.Builder<Path> b = Stream.builder();
    for (; p != null; p = p.getParent()) b.add(p.getFileName());
    return b.build();
}
With Java 9+ you can use a stream that truly has no additional storage (which does not necessarily imply that it will be more efficient):
static Stream<Path> reverse(Path p) {
    return Stream.iterate(p, Objects::nonNull, Path::getParent).map(Path::getFileName);
}

Java Stream `generate()` how to "include" the first "excluded" element

Assume this usage scenario for a Java stream, where data is added from a data source. The data source can be a list of values, as in the example below, or a paginated REST API. It doesn't matter at the moment.
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Stream;

public class Main {
    public static void main(String[] args) {
        final List<Boolean> dataSource = List.of(true, true, true, false, false, false, false);
        final AtomicInteger index = new AtomicInteger();
        Stream
            .generate(() -> {
                boolean value = dataSource.get(index.getAndIncrement());
                System.out.format("--> Executed expensive operation to retrieve data: %b\n", value);
                return value;
            })
            .takeWhile(value -> value == true)
            .forEach(data -> System.out.printf("--> Using: %b\n", data));
    }
}
If you run this code your output will be
--> Executed expensive operation to retrieve data: true
--> Using: true
--> Executed expensive operation to retrieve data: true
--> Using: true
--> Executed expensive operation to retrieve data: true
--> Using: true
--> Executed expensive operation to retrieve data: false
As you can see the last element, the one that evaluated to false, did not get added to the stream, as expected.
Now assume that the generate() method loads pages of data from a REST API. In that case the true/false value on page N indicates whether page N + 1 exists, something like a has_more field. I want the last page returned by the API to be added to the stream, but I do not want to perform another expensive operation to read an empty page, because I already know that there are no more pages.
What is the most idiomatic way to do this using the Java Stream API? Every workaround I can think of requires a call to the API to be executed.
UPDATE
In addition to the approaches listed in Inclusive takeWhile() for Streams there is another ugly way to achieve this.
public static void main(String[] args) {
    final List<Boolean> dataSource = List.of(true, true, true, false, false, false, false);
    final AtomicInteger index = new AtomicInteger();
    final AtomicBoolean hasMore = new AtomicBoolean(true);
    Stream
        .generate(() -> {
            if (!hasMore.get()) {
                return null;
            }
            boolean value = dataSource.get(index.getAndIncrement());
            hasMore.set(value);
            System.out.format("--> Executed expensive operation to retrieve data: %b\n", value);
            return value;
        })
        .takeWhile(Objects::nonNull)
        .forEach(data -> System.out.printf("--> Using: %b\n", data));
}
You are using the wrong tool for your job. As already noticeable in your code example, the Supplier passed to Stream.generate has to go to great lengths to maintain the index it needs for fetching pages.
What makes matters worse, is that Stream.generate creates an unordered Stream:
Returns an infinite sequential unordered stream where each element is generated by the provided Supplier.
This is suitable for generating constant streams, streams of random elements, etc.
You’re not returning constant or random values nor anything else that would be independent of the order.
This has a significant impact on the semantics of takeWhile:
Otherwise returns, if this stream is unordered, a stream consisting of a subset of elements taken from this stream that match the given predicate.
This makes sense if you think about it. If there is at least one element rejected by the predicate, it could be encountered at an arbitrary position for an unordered stream, so an arbitrary subset of elements encountered before it, including the empty set, would be a valid prefix.
But since there is no "before" or "after" for an unordered stream, even elements produced by the generator after the rejected one could be included in the result.
In practice, you are unlikely to encounter such effects for a sequential stream, but it doesn’t change the fact that Stream.generate(…) .takeWhile(…) is semantically wrong for your task.
From your example code, I conclude that pages contain neither their own number nor a "getNext" method, so we have to maintain the number and the "hasNext" state ourselves for creating a stream.
Assuming an example setup like
class Page {
    private String data;
    private boolean hasNext;

    public Page(String data, boolean hasNext) {
        this.data = data;
        this.hasNext = hasNext;
    }

    public String getData() {
        return data;
    }

    public boolean hasNext() {
        return hasNext;
    }
}

private static String[] SAMPLE_PAGES = { "foo", "bar", "baz" };

public static Page getPage(int index) {
    Objects.checkIndex(index, SAMPLE_PAGES.length);
    return new Page(SAMPLE_PAGES[index], index + 1 < SAMPLE_PAGES.length);
}
You can implement a correct stream like
Stream.iterate(Map.entry(0, getPage(0)), Objects::nonNull,
        e -> e.getValue().hasNext() ? Map.entry(e.getKey() + 1, getPage(e.getKey() + 1)) : null)
    .map(Map.Entry::getValue)
    .forEach(page -> System.out.println(page.getData()));
Note that Stream.iterate creates an ordered stream:
Returns a sequential ordered Stream produced by iterative application of the given next function to an initial element, conditioned on satisfying the given hasNext predicate.
Of course, things would be much easier if the page knew its own number, e.g.
Stream.iterate(getPage(0), Objects::nonNull,
        p -> p.hasNext() ? getPage(p.getPageNumber() + 1) : null)
    .forEach(page -> System.out.println(page.getData()));
or if there was a method to get from an existing Page to the next Page, e.g.
Stream.iterate(getPage(0), Objects::nonNull, p -> p.hasNext() ? p.getNextPage() : null)
    .forEach(page -> System.out.println(page.getData()));

Why are functions as first-class citizens so important? [closed]

Java 8 provides a bunch of functional interfaces that we can implement using lambda expressions, which allow functions to be treated as first-class citizens (passed as arguments, returned from a method, etc.).
Example:
Stream.of("Hello", "World").forEach(str->System.out.println(str));
Why are functions as first-class citizens so important? Any example to demonstrate this power?
The idea is to be able to pass behavior as a parameter. This is useful, for example, in implementing the Strategy pattern.
Streams API is a perfect example of how passing behavior as a parameter is useful:
people.stream()
      .map(Person::name)
      .map(name -> new GraveStone(name, Rock.GRANITE))
      .collect(Collectors.toSet());
Also it allows programmers to think in terms of functional programming instead of object-oriented programming, which is convenient for a lot of tasks, but is quite a broad thing to cover in an answer.
I think the second part of the question has been addressed well. But I want to try to answer the first question.
By definition there is more that a first-class citizen function can do. A first-class citizen function can:
be named by variables
be passed as arguments
be returned as the result of another function
participate as a member data type in a data structure (e.g., an array or list)
These are the privileges of being "first-class."
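A small sketch showing all four privileges with java.util.function (the names are illustrative):

import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

public class FirstClass {
    // returned as the result of another function
    static Function<Integer, Integer> adder(int n) {
        return x -> x + n;
    }

    public static void main(String[] args) {
        // named by a variable
        Function<Integer, Integer> twice = x -> x * 2;
        // member of a data structure
        List<Function<Integer, Integer>> pipeline = Arrays.asList(twice, adder(1));
        // passed as an argument (here, to andThen)
        System.out.println(pipeline.get(0).andThen(pipeline.get(1)).apply(20)); // prints 41
    }
}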
It's a matter of expressiveness. You don't have to, but in many practical cases it will make your code more readable and concise. For instance, take your code:
public class Foo {
    public static void main(String[] args) {
        Stream.of("Hello", "World").forEach(str -> System.out.println(str));
    }
}
And compare it to the most concise Java 7 implementation I could come up with:
interface Procedure<T> {
    void call(T arg);
}

class Util {
    static <T> void forEach(Procedure<T> proc, T... elements) {
        for (T el : elements) {
            proc.call(el);
        }
    }
}

public class Foo {
    public static void main(String[] args) {
        Util.forEach(
            new Procedure<String>() {
                public void call(String str) { System.out.println(str); }
            },
            "Hello", "World"
        );
    }
}
The result is the same, with quite a few more lines :) Also note that to support Procedure instances with a different number of arguments, you would have needed one interface each, or (more practically) to pass all the arguments as a single Parameters object. Closures would have been made in a similar way, by adding some fields to the Procedure implementation. That's a lot of boilerplate.
In fact, things like first-class "functors" and (non-mutable) closures have been around for a long time using anonymous classes, but they required a significant implementation effort. Lambdas just make things easier to read and write (at least, in most cases).
Here's a short program that shows (arguably) the primary differentiating factor.
public static void main(String[] args) {
    // 20 and 30 included so the input matches the output shown below
    List<Integer> input = Arrays.asList(10, 12, 13, 15, 17, 19, 20, 30);
    System.out.println("--------------------------------------------");
    System.out.println("list-based filtering:");
    List<Integer> list = pickEvensViaLists(input);
    for (int i = 0; i < 2; ++i)
        System.out.println(list.get(i));
    System.out.println("--------------------------------------------");
    System.out.println("stream-based filtering:");
    pickEvensViaStreams(input).limit(2).forEach((x) -> System.out.println(x));
}

private static List<Integer> pickEvensViaLists(List<Integer> input) {
    List<Integer> list = new ArrayList<Integer>(input);
    for (Iterator<Integer> iter = list.iterator(); iter.hasNext(); ) {
        int curr = iter.next();
        System.out.println("processing list element " + curr);
        if (curr % 2 != 0)
            iter.remove();
    }
    return list;
}

private static Stream<Integer> pickEvensViaStreams(List<Integer> input) {
    Stream<Integer> inputStream = input.stream();
    Stream<Integer> filtered = inputStream.filter((curr) -> {
        System.out.println("processing stream element " + curr);
        return curr % 2 == 0;
    });
    return filtered;
}
This program takes an input list and prints the first two even numbers from it. It does so twice: the first time using lists with hand-written loops, the second time using streams with lambda expressions.
There are some differences in the amount of code one has to write in either approach, but that is not (in my mind) the main point. The difference is in how things are evaluated:
In the list-based approach the code of pickEvensViaLists() iterates over the entire list. It removes all odd values from the list and only then returns to main(). The list it returns to main() will therefore contain four values: 10, 12, 20, 30, and main() prints just the first two.
In the stream-based approach the code of pickEvensViaStreams() does not actually iterate over anything. It returns a stream whose elements can be computed off of the input stream, but it has not yet computed any of them. Only when main() starts iterating (via forEach()) are the elements of the returned stream computed, one by one. As main() only cares about the first two elements, only two elements of the returned stream are actually computed. In other words: with streams you get lazy evaluation: streams are iterated only as much as needed.
To see that, let's examine the output of this program:
--------------------------------------------
list-based filtering:
processing list element 10
processing list element 12
processing list element 13
processing list element 15
processing list element 17
processing list element 19
processing list element 20
processing list element 30
10
12
--------------------------------------------
stream-based filtering:
processing stream element 10
10
processing stream element 12
12
With lists, the entire input was iterated over (hence the eight "processing list element" messages). With streams, only two elements were actually extracted from the input, resulting in only two "processing stream element" messages.

Terminate Iterable.forEach early [duplicate]

This question already has answers here: Limit a stream by a predicate (19 answers)
I have a set and a method:
private static Set<String> set = ...;

public static String method() {
    final String[] returnVal = new String[1];
    set.forEach((String str) -> {
        returnVal[0] += str;
        // if something: goto mark
    });
    // mark
    return returnVal[0];
}
Can I terminate the forEach inside the lambda (with or without using exceptions)?
Should I use an anonymous class?
I could do this:
set.forEach((String str) -> {
    if (someConditions()) {
        returnVal[0] += str;
    }
});
but it wastes time.
An implementation using Stream.reduce:
return set.parallelStream().reduce((output, next) -> {
    return someConditions() ? next : output;
}).get(); // should check for an empty set beforehand
I am looking for the fastest solution, so exceptions and a 'real' for-each loop are acceptable if they are fast enough.
I'm reluctant to answer this even though I'm not entirely sure what you're attempting to accomplish, but the simple answer is no, you can't terminate a forEach when it's halfway through processing elements.
The official Javadoc states that it is a terminal operation that applies against all elements in the stream.
Performs an action for each element of this stream.
This is a terminal operation.
If you want to gather the results into a single result, you want to use reduction instead.
Be sure to consider what a stream is doing. It acts on all elements contained in it; if it's filtered along the way, each step in the chain can be said to act on all elements in its stream, even if that's a subset of the original.
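For the concatenation in the question, such a reduction could look like this (a sketch, assuming someConditions() decides whether an element participates):

return set.stream()
        .filter(str -> someConditions())
        .reduce("", String::concat);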
In case you were curious as to why simply putting a return wouldn't have any effect, here's the default implementation of Iterable.forEach.
default void forEach(Consumer<? super T> action) {
    Objects.requireNonNull(action);
    for (T t : this) {
        action.accept(t);
    }
}
The consumer is explicitly passed in, and this is done independently of the actual iteration going on. I imagine you could throw an exception, but that would be tacky when more elegant solutions likely exist.
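One such solution (a sketch, not part of the original answer) is a short-circuiting terminal operation such as findFirst, which stops pulling elements as soon as a match is found:

Optional<String> first = set.stream()
        .filter(str -> someConditions())
        .findFirst(); // short-circuits: no further elements are consumed after the first match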
