Storing a reference to a Stream


I have a class which manages a Stream:
class MyStreamManager {
private Stream<Object> currentStream = null;
boolean hasMoreData() {
//code here to assert currentStream is null
final Optional<Stream<Object>> maybeAStream = somethingWhichMightProvideAStream.getNextStream();
currentStream = maybeAStream.orElse(null);
return currentStream != null;
}
@MustBeClosed
Stream<Object> getCurrentStream() { return currentStream; }
void finish() {
currentStream.close();
currentStream = null;
}
}
Which is used in the following style:
while (myStreamManager.hasMoreData()) {
try {
myStreamManager.getCurrentStream().map(...).filter(...); //etc
} finally {
myStreamManager.finish();
}
}
Is storing a reference to a Stream like this bad practice? While this works, it definitely doesn't feel right, and ErrorProne is flagging it (hence the @MustBeClosed annotation).
MyStreamManager is a Spring @Bean but is only used by one thread (this is running in a batch).
I can think of two different approaches which are probably better:
instantiate MyStreamManager and wrap it in a try-with-resources, delegating the close() call to the Stream
use the Spliterators class to create a Spliterator that delegates to many Streams?

I don't think that it's as much the fact you're storing a Stream per se that makes this feel awkward, but rather that you've got sequential coupling.
You have to call hasMoreData; then getCurrentStream(); then finish(). If you're only using the class in a limited number of places, you will probably be able to get it right in all of those; but every place you use it is a new opportunity to use it incorrectly.
I would say that your manager class is actually just making things harder for yourself.
for (Optional<Stream<Object>> opt = somethingWhichMightProvideAStream.getNextStream();
opt.isPresent();
opt = somethingWhichMightProvideAStream.getNextStream()) {
try (Stream<Object> stream = opt.get()) { // try-with-resources auto-closes the stream
stream.map(...).filter(...); //etc
}
}
or:
Optional<Stream<Object>> opt;
while ((opt = somethingWhichMightProvideAStream.getNextStream()).isPresent()) {
try (Stream<Object> stream = opt.get()) {
stream.map(...).filter(...); //etc
}
}
The loop declarations in either case are not especially pretty; but this is way shorter (roughly as long as the while/try/finally loop you already have), and harder to use wrong, I think.
(Admittedly, you've still got sequential coupling here: you have to remember to close the stream returned in the optional. Sigh.)
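One way to shrink that remaining coupling is to hide the loop and the close behind a single helper that takes the per-stream work as a callback. The following is only a sketch: forEachStream and the StreamLoop class are hypothetical names, and two in-memory streams stand in for somethingWhichMightProvideAStream.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Optional;
import java.util.function.Consumer;
import java.util.function.Supplier;
import java.util.stream.Stream;

public class StreamLoop {

    // Hypothetical helper: pulls streams until the source returns an empty Optional,
    // hands each one to the action, and always closes it afterwards.
    static <T> void forEachStream(Supplier<Optional<Stream<T>>> source,
                                  Consumer<Stream<T>> action) {
        Optional<Stream<T>> opt;
        while ((opt = source.get()).isPresent()) {
            try (Stream<T> stream = opt.get()) {
                action.accept(stream);
            }
        }
    }

    // Demo: records both the processed elements and the close calls in one log.
    static List<String> demo() {
        List<String> log = new ArrayList<>();
        Iterator<Stream<String>> pending = Arrays.asList(
                Stream.of("a", "b").onClose(() -> log.add("closed 1")),
                Stream.of("c").onClose(() -> log.add("closed 2"))).iterator();

        StreamLoop.<String>forEachStream(
                () -> pending.hasNext() ? Optional.of(pending.next()) : Optional.empty(),
                stream -> stream.map(String::toUpperCase).forEach(log::add));
        return log;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

With this shape, callers never see the Optional or the close call, so there is no order of calls left for them to get wrong.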

Mixing imperative (while loop, try-finally) and declarative (streams) code together doesn't seem right.
If all of these operations are synchronous, I guess it could be done in one pipeline (without MyStreamManager at all).
Consider moving some of this logic into the object behind somethingWhichMightProvideAStream, because mixing an imperative iterator pattern with the Stream API is not idiomatic. For example, it could return a List (or, even better, a Stream) of Streams instead of an Optional.
Think twice if you really need to close this stream. From documentation:
Streams have a BaseStream.close() method and implement AutoCloseable, but nearly all stream instances do not actually need to be closed after use. Generally, only streams whose source is an IO channel (such as those returned by Files.lines(Path, Charset)) will require closing.
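If the provider did return a Stream of Streams, flatMap would take care of closing for you: its documentation states that each mapped stream is closed after its contents have been placed into the outer stream. A minimal sketch, where the source method is a hypothetical stand-in for a closeable stream and the close log exists only to make the behaviour visible:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class FlatMapClosesInnerStreams {

    // Hypothetical stand-in for a closeable source; records its name when closed.
    static Stream<String> source(String name, List<String> closedLog) {
        return Stream.of(name + "-1", name + "-2").onClose(() -> closedLog.add(name));
    }

    // One pipeline over all inner streams; flatMap closes each of them once consumed.
    static List<String> drain(List<String> closedLog) {
        return Stream.of("a", "b")
                .flatMap(name -> source(name, closedLog))
                .map(String::toUpperCase)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> closedLog = new ArrayList<>();
        System.out.println(drain(closedLog));
        System.out.println(closedLog);
    }
}
```

No while loop or try/finally is needed in the calling code with this approach.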

Related

What could be a better way to achieve Checked Exception handling in mapping functions passed to Java 8 Streams

I have a simple scenario which I am trying to code without being clumsy and without writing unreadable multiline lambdas.
public class StreamTest {
public static void main(String[] args) {
List<String> list = Arrays.asList("hellow", "world");
Stream<String> stream = list.stream().map(StreamTest::exceptionThrowingMappingFunction);
}
public static String exceptionThrowingMappingFunction(String s) throws Exception {
if (s.equals("world")) {
throw new Exception("world is doomed");
}
return s + " exists";
}
}
What I would like to have are the following options:
Fail the whole stream if the exception is thrown
Skip the value and continue with the rest of the stream if exception occurs
I know about popular ways of dealing with this, like throwing a RuntimeException in a custom FunctionalInterface or just handling the exception inline.
But is there some way, where I can extend Streams and just write a stream like StreamWithExceptionHandling extends Stream. Which also accepts an ExceptionHandler and just implements the above behaviour?
Thanks for taking your time to read this one.

Try writing a sample solution and posting it to Code Review. Your problem might be a good fit.
Lambdas are useful for one liners. For the rest: don't feel bad about just defining a class or a method.
For option 2, map the value into a result object that contains operation status and return value and then filter by status. You'll avoid introducing non-standard behaviour to the streams API.
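A rough sketch of that result-object idea, using the mapping function from the question. ThrowingFunction, Result and attempt are hypothetical names made up for this example, not part of any library:

```java
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

public class SkipFailures {

    // Hypothetical functional interface whose method may throw a checked exception.
    @FunctionalInterface
    interface ThrowingFunction<T, R> {
        R apply(T t) throws Exception;
    }

    // Hypothetical result holder: error is null when the operation succeeded.
    static final class Result<R> {
        final R value;
        final Exception error;

        private Result(R value, Exception error) {
            this.value = value;
            this.error = error;
        }

        boolean isSuccess() {
            return error == null;
        }
    }

    // Adapts a throwing function into one that captures the outcome instead of throwing.
    static <T, R> Function<T, Result<R>> attempt(ThrowingFunction<T, R> f) {
        return t -> {
            try {
                return new Result<>(f.apply(t), null);
            } catch (Exception e) {
                return new Result<>(null, e);
            }
        };
    }

    // The mapping function from the question.
    static String exceptionThrowingMappingFunction(String s) throws Exception {
        if (s.equals("world")) {
            throw new Exception("world is doomed");
        }
        return s + " exists";
    }

    static List<String> successfulOnly(List<String> input) {
        Function<String, Result<String>> safe =
                attempt(SkipFailures::exceptionThrowingMappingFunction);
        return input.stream()
                .map(safe)
                .filter(Result::isSuccess) // option 2: drop the elements that failed
                .map(r -> r.value)
                .collect(Collectors.toList());
    }
}
```

For the input from the question, successfulOnly returns a list containing only "hellow exists". The failed results still carry their exception, so they could be logged before the filter step instead of being dropped silently.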

You can use CompletionStages to help out in this scenario. They have a good interface for handling exceptional flows.
So, convert your streamed value to an already completed CompletableFuture, as a map step in the stream, then map it again to CompletionStage.thenApply, which returns a new CompletionStage that holds any exceptions for you. You can then filter the unwanted exceptional completion stages out of the stream, or include other processing steps if you want (like logging the exception, for example).
And of course you can map the value back out of a CompletionStage into the actual completed value easily enough.
It’s one way to do it at least, without trying to write your own streams interface.
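The steps above might look roughly like this. One assumption of this sketch: thenApply takes a plain Function, so the checked exception from the question's mapping function is rethrown as a CompletionException inside a small wrapper (the unchecked method here is a made-up helper).

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.stream.Collectors;

public class CompletionStageSkip {

    // The mapping function from the question.
    static String exceptionThrowingMappingFunction(String s) throws Exception {
        if (s.equals("world")) {
            throw new Exception("world is doomed");
        }
        return s + " exists";
    }

    // Hypothetical wrapper: converts the checked exception into an unchecked one.
    static String unchecked(String s) {
        try {
            return exceptionThrowingMappingFunction(s);
        } catch (Exception e) {
            throw new CompletionException(e);
        }
    }

    static List<String> successfulOnly(List<String> input) {
        return input.stream()
                .map(CompletableFuture::completedFuture)
                // the future returned by thenApply holds the exception instead of throwing it here
                .map(f -> f.thenApply(CompletionStageSkip::unchecked))
                .filter(f -> !f.isCompletedExceptionally())
                .map(CompletableFuture::join)
                .collect(Collectors.toList());
    }
}
```

Because every future is already completed, nothing runs asynchronously here; the futures are used purely as containers for a value or an exception.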

What is the use case for null(Input/Output)Stream API in Java?

With Java 11, I could initialize an InputStream as:
InputStream inputStream = InputStream.nullInputStream();
But I am unable to understand a potential use case of InputStream.nullInputStream or a similar API for OutputStream
i.e. OutputStream.nullOutputStream.
From the API Javadocs, I could figure out that it
Returns a new InputStream that reads no bytes. The returned stream is
initially open. The stream is closed by calling the close() method.
Subsequent calls to close() have no effect. While the stream is open,
the available(), read(), read(byte[]), ...
skip(long), and transferTo() methods all behave as if end of stream
has been reached.
I went through the detailed release notes further which states:
There are various times where I would like to use methods that require
as a parameter a target OutputStream/Writer for sending output, but
would like to execute those methods silently for their other effects.
This corresponds to the ability in Unix to redirect command output to
/dev/null, or in DOS to append command output to NUL.
Yet I fail to understand what are those methods in the statement as stated as .... execute those methods silently for their other effects. (blame my lack of hands-on with the APIs)
Can someone help me understand what is the usefulness of having such an input or output stream with a help of an example if possible?
Edit: One similar implementation I found on browsing further is apache-commons' NullInputStream, which justifies the testing use case much better.

Sometimes you want to have a parameter of InputStream type, but also to be able to choose not to feed your code with any data. In tests it's probably easier to mock it but in production you may choose to bind null input instead of scattering your code with ifs and flags.
compare:
class ComposableReprinter {
void reprint(InputStream is) throws IOException {
System.out.println(is.read());
}
void bla() {
reprint(InputStream.nullInputStream());
}
}
with this:
class ControllableReprinter {
void reprint(InputStream is, boolean for_real) throws IOException {
if (for_real) {
System.out.println(is.read());
}
}
void bla() {
reprint(new ByteArrayInputStream(new byte[0]), false); // BufferedInputStream has no no-arg constructor
}
}
or this:
class NullableReprinter {
void reprint(InputStream is) throws IOException {
if (is != null) {
System.out.println(is.read());
}
}
void bla() {
reprint(null);
}
}
It makes more sense with output IMHO. Input is probably more for consistency.
This approach is called Null Object: https://en.wikipedia.org/wiki/Null_object_pattern
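As a small self-contained illustration of the Null Object idea (a sketch only, requires Java 11 or later; countBytes is a hypothetical consumer), the reading code below has no special case for "no input" and still behaves sensibly when handed a null input stream:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

public class NullInputDemo {

    // Hypothetical consumer: counts bytes until end of stream, with no null checks or flags.
    static long countBytes(InputStream in) {
        try {
            long count = 0;
            while (in.read() != -1) {
                count++;
            }
            return count;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(countBytes(new ByteArrayInputStream(new byte[] {1, 2, 3})));
        System.out.println(countBytes(InputStream.nullInputStream()));
    }
}
```

The null stream reports end of stream on the first read, so the loop simply never runs.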

I see it as a safer (1) and more expressive (2) alternative to initialising a stream variable with null.
No worries about NPEs.
[Output|Input]Stream is an abstraction. In order to return a null/empty/mock stream, you had to deviate from the core concept down to a specific implementation.

I think nullOutputStream is very easy and clear: just to discard output (similar to > /dev/null) and/or for testing (no need to invent an OutputStream).
An (obviously basic) example:
PrintStream out = ... // an easy way to either print it to System.out or just discard all prints, setting it basically to a PrintStream around the nullOutputStream
out.println("yeah... or not");
exporter.exportTo(out); // discard or real export?
Regarding nullInputStream it's probably more for testing (I don't like mocks) and APIs requiring an input stream or (this now being more probable) delivering an input stream which does not contain any data, or you can't deliver and where null is not a viable option:
importer.importDocument("name", /* input stream... */);
InputStream inputStream = content.getInputStream(); // better having no data to read than getting a null
When you test that importer, you can just use a nullInputStream there, again instead of inventing your own InputStream or instead of using a mock. Other use cases here rather look like a workaround or misuse of the API ;-)
Regarding the return of an InputStream: that rather makes sense. If you haven't any data you may want to return that nullInputStream instead of null so that callers do not have to deal with null and can just read as they would if there was data.
Finally, these are just convenience methods to make our lives easier without adding another dependency ;-) and as others already stated (comments/answers), it's basically an implementation of the null object pattern.
Using the null*Stream might also have the benefit that tests are executed faster... if you stream real data (of course... depending on size, etc.) you may just slow down your tests unnecessarily and we all want tests to complete fast, right? (some will put in mocks here... well...)
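To make "executing a method silently for its other effects" concrete, here is a sketch (Java 11 or later) in which an export is run only to obtain a checksum while the bytes themselves are discarded. The exportTo method is a hypothetical exporter; CheckedOutputStream and CRC32 come from java.util.zip.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;
import java.util.zip.CheckedOutputStream;

public class NullOutputDemo {

    // Hypothetical exporter: it only knows how to write to some OutputStream.
    static void exportTo(OutputStream out) throws IOException {
        out.write("yeah... or not".getBytes(StandardCharsets.UTF_8));
    }

    // Runs the export for its side effect (the checksum) and throws the bytes away.
    static long checksumOfExport() {
        try (CheckedOutputStream out =
                     new CheckedOutputStream(OutputStream.nullOutputStream(), new CRC32())) {
            exportTo(out);
            return out.getChecksum().getValue();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(Long.toHexString(checksumOfExport()));
    }
}
```

Without nullOutputStream you would need a throwaway ByteArrayOutputStream (which keeps all the data in memory) or a hand-written no-op OutputStream subclass.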

Java 8 streams and varargs

According to Effective Java 2nd Ed, when you want to write a method signature that allows for varargs but still enforces that you have one element minimum at compile-time you should write the method signature this way:
public void something(String required, String ... additional) {
//... do what you want to do
}
If I want to stream all these elements, I've been doing something like this:
public void something(String required, String ... additional) {
Stream<String> allParams =
Stream.concat(Stream.of(required), Stream.of(additional));
//... do what you want to do
}
This feels really inelegant and wasteful, especially because I'm creating a stream of 1 and concatenating it with another. Is there a cleaner way to do this?

Here is a way for doing it without creating two Streams, although you might not like it.
Stream.Builder<String> builder = Stream.<String>builder().add(required);
for (String s : additional) {
builder.add(s);
}
Stream<String> allParams = builder.build();

There is nothing wrong with the composed streams. These objects are lightweight as they only refer to the source data but don’t copy data like array contents. The cost of such lightweight object might only be relevant if the actual payload is very small as well. Such scenarios can be handled with specialized, semantically equivalent overloads:
public void something(String required, String ... additional) {
somethingImpl(Stream.concat(Stream.of(required), Stream.of(additional)));
}
public void something(String required) {
somethingImpl(Stream.of(required));
}
public void something(String required, String second) {
somethingImpl(Stream.of(required, second));
}
private void somethingImpl(Stream<String> allParams) {
//... do what you want to do
}
so in the case of only one argument you’re not only saving Stream instances but also the varargs array (similar to Stream.of’s overload). This is a common pattern, see for example the EnumSet.of overloads.
However, in a lot of cases even these simple overloads are not necessary and might be considered premature optimization (libraries like the JRE offer them as it’s otherwise impossible for an application developer to add them if ever needed). If something is part of an application rather than a library you shouldn’t add them unless a profiler tells you that there’s a bottleneck caused by that parameter processing.

If you're willing to use Guava, you may use Lists.asList(required, additional).stream(). The method was created to ease that varargs-with-a-required-minimum idiom.
A side note, I consider the library really useful, but of course it's not a good idea to add it just because of that. Check the docs and see if it could be of more use to you.

Unfortunately, Java can be quite verbose. But another option to alleviate that is to simply use static imports. In my opinion, it does not make your code less clear since every method is stream-related.
Stream<String> allParams =
concat(of(required), of(additional));

Third-party extensions to the Stream API like my StreamEx or jOOλ provide methods like append or prepend which allow you to do this in a cleaner way:
// Using StreamEx
Stream<String> allParams = StreamEx.of(required).append(additional);
// Using jOOL
Stream<String> allParams = Seq.of(required).append(additional);

Why is Files.lines (and similar Streams) not automatically closed?

The javadoc for Stream states:
Streams have a BaseStream.close() method and implement AutoCloseable, but nearly all stream instances do not actually need to be closed after use. Generally, only streams whose source is an IO channel (such as those returned by Files.lines(Path, Charset)) will require closing. Most streams are backed by collections, arrays, or generating functions, which require no special resource management. (If a stream does require closing, it can be declared as a resource in a try-with-resources statement.)
Therefore, the vast majority of the time one can use Streams in a one-liner, like collection.stream().forEach(System.out::println); but for Files.lines and other resource-backed streams, one must use a try-with-resources statement or else leak resources.
This strikes me as error-prone and unnecessary. As Streams can only be iterated once, it seems to me that there is no situation where the output of Files.lines should not be closed as soon as it has been iterated, and therefore the implementation should simply call close implicitly at the end of any terminal operation. Am I mistaken?

Yes, this was a deliberate decision. We considered both alternatives.
The operating design principle here is "whoever acquires the resource should release the resource". Files don't auto-close when you read to EOF; we expect files to be closed explicitly by whoever opened them. Streams that are backed by IO resources are the same.
Fortunately, the language provides a mechanism for automating this for you: try-with-resources. Because Stream implements AutoCloseable, you can do:
try (Stream<String> s = Files.lines(...)) {
s.forEach(...);
}
The argument that "it would be really convenient to auto-close so I could write it as a one-liner" is nice, but would mostly be the tail wagging the dog. If you opened a file or other resource, you should also be prepared to close it. Effective and consistent resource management trumps "I want to write this in one line", and we chose not to distort the design just to preserve the one-line-ness.
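To see this behaviour in isolation, here is a minimal sketch that uses an in-memory stream with an onClose handler as a stand-in for an IO-backed stream (the class and method names are made up for illustration). The terminal operation alone does not trigger the handler; leaving the try-with-resources block does.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.stream.Stream;

public class CloseIsExplicit {

    // Returns { closed right after the terminal operation, closed after the try block }.
    static boolean[] demo() {
        AtomicBoolean closed = new AtomicBoolean(false);
        boolean[] result = new boolean[2];
        try (Stream<String> s = Stream.of("a", "b").onClose(() -> closed.set(true))) {
            s.forEach(x -> { });      // terminal operation, reads to the end
            result[0] = closed.get(); // still open at this point
        }
        result[1] = closed.get();     // closed by try-with-resources
        return result;
    }

    public static void main(String[] args) {
        boolean[] r = demo();
        System.out.println(r[0] + " " + r[1]);
    }
}
```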

I have a more specific example in addition to @BrianGoetz's answer. Don't forget that the Stream has escape-hatch methods like iterator(). Suppose you are doing this:
Iterator<String> iterator = Files.lines(path).iterator();
After that you may call hasNext() and next() several times, then just abandon this iterator: Iterator interface perfectly supports such use. There's no way to explicitly close the Iterator, the only object you can close here is the Stream. So this way it would work perfectly fine:
try(Stream<String> stream = Files.lines(path)) {
Iterator<String> iterator = stream.iterator();
// use iterator in any way you want and abandon it at any moment
} // file is correctly closed here.

In addition, if you want a one-liner, you can just do this:
Files.readAllLines(source).stream().forEach(...);
You can use it if you are sure that you need entire file and the file is small. Because it isn't a lazy read.

If you're lazy like me and don't mind the "if an exception is raised, it will leave the file handle open" you could wrap the stream in an autoclosing stream, something like this (there may be other ways):
static Stream<String> allLinesCloseAtEnd(String filename) throws IOException {
Stream<String> lines = Files.lines(Paths.get(filename));
Iterator<String> linesIter = lines.iterator();
Iterator<String> it = new Iterator<String>() {
@Override
public boolean hasNext() {
if (!linesIter.hasNext()) {
lines.close(); // auto-close when the end is reached
return false;
}
return true;
}
@Override
public String next() {
return linesIter.next();
}
};
// the lines of a file are ordered and non-null, but not necessarily distinct
return StreamSupport.stream(Spliterators.spliteratorUnknownSize(it, Spliterator.ORDERED | Spliterator.NONNULL), false);
}
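One of those other ways, as a rough sketch: flatMap is documented to close each mapped stream after its contents have been consumed, so opening the file inside a flatMap lets the pipeline close it for you. The linesAutoClosed helper is a hypothetical name, and the demo writes a temporary file only so that the example is self-contained.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class LinesViaFlatMap {

    // Hypothetical helper: the file is opened inside flatMap, and flatMap closes
    // the mapped stream once its contents have been consumed.
    static Stream<String> linesAutoClosed(Path path) {
        return Stream.of(path).flatMap(p -> {
            try {
                return Files.lines(p);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        });
    }

    // Demo: writes three lines to a temp file and reads them back in a one-liner.
    static List<String> demo() {
        try {
            Path tmp = Files.createTempFile("lines", ".txt");
            try {
                Files.write(tmp, Arrays.asList("one", "two", "three"));
                return linesAutoClosed(tmp)
                        .filter(s -> s.startsWith("t"))
                        .collect(Collectors.toList());
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The file is only opened when a terminal operation actually runs, so a stream returned by linesAutoClosed that is never consumed does not hold a file handle.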

Java 8 Streams and try with resources

I thought that the stream API was here to make the code easier to read.
I found something quite annoying. The Stream interface extends the java.lang.AutoCloseable interface.
So if you want to correctly close your streams, you have to use try with resources.
Listing 1. Not very nice, streams are not closed.
public void noTryWithResource() {
Set<Integer> photos = new HashSet<Integer>(Arrays.asList(1, 2, 3));
@SuppressWarnings("resource") List<ImageView> collect = photos.stream()
.map(photo -> new ImageView(new Image(String.valueOf(photo))))
.collect(Collectors.<ImageView>toList());
}
Listing 2. With 2 nested try
public void tryWithResource() {
Set<Integer> photos = new HashSet<Integer>(Arrays.asList(1, 2, 3));
try (Stream<Integer> stream = photos.stream()) {
try (Stream<ImageView> map = stream
.map(photo -> new ImageView(new Image(String.valueOf(photo)))))
{
List<ImageView> collect = map.collect(Collectors.<ImageView>toList());
}
}
}
Listing 3. As map returns a stream, both the stream() and the map() functions have to be closed.
public void tryWithResource2() {
Set<Integer> photos = new HashSet<Integer>(Arrays.asList(1, 2, 3));
try (Stream<Integer> stream = photos.stream(); Stream<ImageView> map = stream.map(photo -> new ImageView(new Image(String.valueOf(photo)))))
{
List<ImageView> collect = map.collect(Collectors.<ImageView>toList());
}
}
The example I give does not make any sense. I replaced the Path to jpg images with Integer, for the sake of the example. Don't let these details distract you.
What is the best way to deal with these auto-closeable streams?
I have to say I'm not satisfied with any of the 3 options I showed.
What do you think? Are there yet other more elegant solutions?

You're using @SuppressWarnings("resource") which presumably suppresses a warning about an unclosed resource. This isn't one of the warnings emitted by javac. Web searches seem to indicate that Eclipse issues warnings if an AutoCloseable is left unclosed.
This is a reasonable warning according to the Java 7 specification that introduced AutoCloseable:
A resource that must be closed when it is no longer needed.
However, the Java 8 specification for AutoCloseable was relaxed to remove the "must be closed" clause. It now says, in part,
An object that may hold resources ... until it is closed.
It is possible, and in fact common, for a base class to implement AutoCloseable even though not all of its subclasses or instances will hold releasable resources. For code that must operate in complete generality, or when it is known that the AutoCloseable instance requires resource release, it is recommended to use try-with-resources constructions. However, when using facilities such as Stream that support both I/O-based and non-I/O-based forms, try-with-resources blocks are in general unnecessary when using non-I/O-based forms.
This issue was discussed extensively within the Lambda expert group; this message summarizes the decision. Among other things it mentions changes to the AutoCloseable specification (cited above) and the BaseStream specification (cited by other answers). It also mentions the possible need to adjust the Eclipse code inspector for the changed semantics, presumably not to emit warnings unconditionally for AutoCloseable objects. Apparently this message didn't get to the Eclipse folks or they haven't changed it yet.
In summary, if Eclipse warnings are leading you into thinking that you need to close all AutoCloseable objects, that's incorrect. Only certain specific AutoCloseable objects need to be closed. Eclipse needs to be fixed (if it hasn't already) not to emit warnings for all AutoCloseable objects.

You only need to close Streams if the stream needs to do any cleanup of itself, usually I/O. Your example uses a HashSet so it doesn't need to be closed.
from the Stream javadoc:
Generally, only streams whose source is an IO channel (such as those returned by Files.lines(Path, Charset)) will require closing. Most streams are backed by collections, arrays, or generating functions, which require no special resource management.
So in your example this should work without issue
List<ImageView> collect = photos.stream()
.map(photo -> ...)
.collect(toList());
EDIT
Even if you need to clean up resources, you should be able to use just one try-with-resource. Let's pretend you are reading a file where each line in the file is a path to an image:
try(Stream<String> lines = Files.lines(file)){
List<ImageView> collect = lines
.map(line -> new ImageView(ImageIO.read(new File(line))))
.collect(toList());
}

“Closeable” means “can be closed”, not “must be closed”.
That was true in the past, e.g. see ByteArrayOutputStream:
Closing a ByteArrayOutputStream has no effect.
And that is true now for Streams where the documentation makes clear:
Streams have a BaseStream.close() method and implement AutoCloseable, but nearly all stream instances do not actually need to be closed after use. Generally, only streams whose source is an IO channel (such as those returned by Files.lines(Path, Charset)) will require closing.
So if an audit tool generates false warnings, it’s a problem of the audit tool, not of the API.
Note that even if you want to add resource management, there is no need to nest try statements. While the following is sufficient:
final Path p = Paths.get(System.getProperty("java.home"), "COPYRIGHT");
try(Stream<String> stream=Files.lines(p, StandardCharsets.ISO_8859_1)) {
System.out.println(stream.filter(s->s.contains("Oracle")).count());
}
you may also add the secondary Stream to the resource management without an additional try:
final Path p = Paths.get(System.getProperty("java.home"), "COPYRIGHT");
try(Stream<String> stream=Files.lines(p, StandardCharsets.ISO_8859_1);
Stream<String> filtered=stream.filter(s->s.contains("Oracle"))) {
System.out.println(filtered.count());
}

It is possible to create a utility method that reliably closes streams with a try-with-resource-statement.
It is a bit like a try-finally that is an expression (something that is the case in e.g. Scala).
/**
* Applies a function to a resource and closes it afterwards.
* @param sup Supplier of the resource that should be closed
* @param op operation that should be performed on the resource before it is closed
* @return The result of calling op.apply on the resource
*/
private static <A extends AutoCloseable, B> B applyAndClose(Callable<A> sup, Function<A, B> op) {
try (A res = sup.call()) {
return op.apply(res);
} catch (RuntimeException exc) {
throw exc;
} catch (Exception exc) {
throw new RuntimeException("Wrapped in applyAndClose", exc);
}
}
(Since resources that need to be closed often also throw exceptions when they are allocated non-runtime exceptions are wrapped in runtime exceptions, avoiding the need for a separate method that does that.)
With this method the example from the question looks like this:
Set<Integer> photos = new HashSet<Integer>(Arrays.asList(1, 2, 3));
List<ImageView> collect = applyAndClose(photos::stream, s -> s
.map(photo -> new ImageView(new Image(String.valueOf(photo))))
.collect(Collectors.toList()));
This is useful in situations when closing the stream is required, such as when using Files.lines. It also helps when you have to do a "double close", as in your example in Listing 3.
This answer is an adaptation of an old answer to a similar question.
