How to make a Stream from a DirectoryStream


When reading the API for DirectoryStream I miss a lot of functions. First of all, it suggests using a for loop to go from stream to List. I also missed the fact that a DirectoryStream is not a Stream.
How can I make a Stream<Path> from a DirectoryStream in Java 8?

While it is possible to convert a DirectoryStream into a Stream using its spliterator method, there is no reason to do so. Just create a Stream<Path> in the first place.
E.g., instead of calling Files.newDirectoryStream(Path) just call Files.list(Path).
The overload of newDirectoryStream that accepts an additional Filter may be replaced by Files.list(Path).filter(Predicate), and there are additional operations like Files.find and Files.walk that return a Stream<Path>. However, I did not find a direct replacement for the overload that takes a “glob pattern”. That seems to be the only case where translating a DirectoryStream into a Stream might be useful (I prefer using regular expressions anyway)…
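If glob matching is wanted without going through DirectoryStream, one option is to combine Files.list with a PathMatcher obtained from the file system. The following is only a sketch of that idea; the class name GlobListing and the method listMatching are made up for illustration, and the matcher is applied to the file name only, which is an assumption about what the glob is meant to match.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.PathMatcher;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class GlobListing {
    // Returns the names of the entries in dir whose file name matches the glob, sorted by name.
    public static List<String> listMatching(Path dir, String glob) throws IOException {
        PathMatcher matcher = dir.getFileSystem().getPathMatcher("glob:" + glob);
        // Files.list keeps the directory open, so the stream is closed via try-with-resources.
        try (Stream<Path> entries = Files.list(dir)) {
            return entries
                    .map(Path::getFileName)      // match against the name, not the full path
                    .filter(matcher::matches)
                    .map(Path::toString)
                    .sorted()
                    .collect(Collectors.toList());
        }
    }
}
```

A call such as GlobListing.listMatching(dir, "*.txt") would then return only the .txt entries of dir.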

DirectoryStream is not a Stream (it has existed since Java 7, before the Streams API was introduced in Java 8), but it implements the Iterable<Path> interface, so you could write:
try (DirectoryStream<Path> ds = ...) {
Stream<Path> s = StreamSupport.stream(ds.spliterator(), false);
}

DirectoryStream has a method that returns a spliterator. So just do:
Stream<Path> stream = StreamSupport.stream(myDirectoryStream.spliterator(), false);
You might want to see this question, which is basically what your problem reduces to: How to create a Stream from an Iterable.
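One thing the spliterator approach leaves open is who closes the DirectoryStream. A possible way to handle it, sketched here under the assumption that the caller will close the returned Stream, is to register the close through onClose. The names DirectoryStreams and toStream are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

public class DirectoryStreams {
    // Wraps a glob-filtered DirectoryStream in a Stream<Path>.
    // Closing the returned Stream also closes the underlying DirectoryStream.
    public static Stream<Path> toStream(Path dir, String glob) throws IOException {
        DirectoryStream<Path> ds = Files.newDirectoryStream(dir, glob);
        return StreamSupport.stream(ds.spliterator(), false)
                .onClose(() -> {
                    try {
                        ds.close();
                    } catch (IOException e) {
                        // Runnable cannot throw checked exceptions, so wrap it
                        throw new UncheckedIOException(e);
                    }
                });
    }
}
```

The caller would use it as try (Stream<Path> s = DirectoryStreams.toStream(dir, "*.txt")) { ... } so that the directory handle is released at the end of the block.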

Related

How to fix "Intermediate Stream methods should not be left unused" on Sonarqube

I found this bug on Sonarqube:
private String getMacAdressByPorts(Set<Integer> ports) {
ports.stream().sorted(); // sonar list show "Refactor the code so this stream pipeline is used"
return ports.toString();
} //getMacAdressByPorts
I have been searching for a long time on the Internet, but it was no use. Please help or try to give some ideas how to achieve this.

The sorted() method has no effect on the Set you pass in; actually, it's a non-terminal operation, so it isn't even executed. If you want to sort your ports, you need something like
return ports.stream().sorted().map(String::valueOf).collect(Collectors.joining(","));
EDIT:
as @Slaw correctly points out, to get the same format you had before (i.e. [item1, item2, item3]), you also need to add the square brackets to the joining collector, i.e. Collectors.joining(", ", "[", "]"). I left those out for simplicity.
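For reference, a small self-contained sketch of the bracketed variant. Collectors.joining only accepts CharSequence elements, so the Integer ports are mapped to String first. The class name SortedPorts and the method format are made up for this example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.stream.Collectors;

public class SortedPorts {
    // Sorts the ports numerically and renders them like List.toString() would.
    public static String format(Set<Integer> ports) {
        return ports.stream()
                .sorted()
                .map(String::valueOf)                          // joining needs CharSequence elements
                .collect(Collectors.joining(", ", "[", "]"));  // delimiter, prefix, suffix
    }

    public static void main(String[] args) {
        Set<Integer> ports = new HashSet<>(Arrays.asList(8080, 22, 443));
        System.out.println(format(ports)); // prints [22, 443, 8080]
    }
}
```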

From the Sonar Source documentation about this warning (emphasis of mine):
Stream operations are divided into intermediate and terminal operations, and are combined to form stream pipelines. After the terminal operation is performed, the stream pipeline is considered consumed, and cannot be used again. Such a reuse will yield unexpected results.
Official JavaDoc for stream() gives more details on sorted() (emphasis of mine):
Returns a stream consisting of the elements of this stream, sorted according to natural order. If the elements of this stream are not Comparable, a java.lang.ClassCastException may be thrown when the terminal operation is executed. [...] This is a stateful intermediate operation.
This implies that only using sorted() will yield no result. From the Oracle Stream package documentation (still emphasis of mine):
Stream operations are divided into intermediate and terminal operations, and are combined to form stream pipelines. A stream pipeline consists of a source (such as a Collection, an array, a generator function, or an I/O channel); followed by zero or more intermediate operations such as Stream.filter or Stream.map; and a terminal operation such as Stream.forEach or Stream.reduce.
Intermediate operations return a new stream. They are always lazy; executing an intermediate operation such as filter() does not actually perform any filtering, but instead creates a new stream that, when traversed, contains the elements of the initial stream that match the given predicate. Traversal of the pipeline source does not begin until the terminal operation of the pipeline is executed.
sorted() returns another Stream, not a sorted list. To solve your Sonar issue (and maybe your code issue, for that matter), you have to call a terminal operation in order to run all the intermediate operations. You can find a (non-exhaustive, I think) list of terminal operations on CodeJava, for instance.
In your case, the solution might look like:
private String getMacAdressByPorts(Set<Integer> ports) {
/* Ports > Stream > Sort (intermediate operation) >
 * Collector (terminal operation) > List > String
 * Note that you need a static import of toList()
 * from java.util.stream.Collectors, otherwise it won't compile
 */
return ports.stream().sorted().collect(toList()).toString();
}
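The laziness described in the quoted documentation can be observed directly. The sketch below (class and method names are invented for the demonstration) records every element the filter sees and compares the record before and after the terminal operation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class LazyDemo {
    // Returns {elements seen before the terminal operation, elements seen after it}.
    public static int[] run() {
        List<Integer> seen = new ArrayList<>();
        Stream<Integer> pipeline = Stream.of(3, 1, 2)
                .filter(x -> { seen.add(x); return true; }) // side effect only to make evaluation visible
                .sorted();
        int before = seen.size();                  // still 0: nothing has been traversed yet
        List<Integer> sorted = pipeline.collect(Collectors.toList());
        int after = seen.size();                   // 3: the terminal operation pulled every element
        return new int[] { before, after, sorted.get(0) };
    }
}
```

Building the pipeline alone touches no element; only collect triggers the traversal, which is why a lone sorted() call has no effect.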

I finally solved the problem using the code below.
private String getMacAdressByPorts(Set<Integer> ports) {
return ports.stream().sorted().collect(Collectors.toList()).toString();
}

How to use Streams api peek() function and make it work?

According to this question, peek() is lazy, which means we have to activate it somehow. To get it to print something to the console, I tried this:
Stream<String> ss = Stream.of("Hi","Hello","Halo","Hacker News");
ss.parallel().peek(System.out::println);
System.out.println("lol"); // I wrote this line to print sth out to terminal to wake peek method up
But that doesn't work and the output is :
lol
So how can I make the peek function actually work?
If there is no way to do that, what's the point of using peek?

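You have to use a terminal operation on a stream for it to execute (peek is not terminal; it is an intermediate operation that returns a new Stream), e.g. count() (on Java 9 and later, count() may skip executing the pipeline when it can compute the size directly from the source, so peek may print nothing there):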
Stream<String> ss = Stream.of("Hi","Hello","Halo","Hacker News");
ss.parallel().peek(System.out::println).count();
Or replace peek with forEach (which is terminal):
ss.parallel().forEach(System.out::println);

peek() takes a Consumer as its parameter, which means you could potentially mutate the state of each incoming element. However, the Java documentation says that peek exists mainly to support debugging. It is an intermediate operation and requires a terminal operation such as forEach:
stream().peek(Consumer).forEach(Consumer);
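As an illustration of the debugging use, here is a sketch where peek records each element on its way to a terminal collect. The names PeekDemo and upperCased and the log parameter are assumptions made for this example. collect is used rather than count() because, since Java 9, count() is allowed to skip the pipeline when the size is known from the source.

```java
import java.util.List;
import java.util.stream.Collectors;

public class PeekDemo {
    // Upper-cases the words; every element that flows through the pipeline is also added to log.
    public static List<String> upperCased(List<String> words, List<String> log) {
        return words.stream()
                .peek(log::add)               // debugging hook: observes elements, does not change them
                .map(String::toUpperCase)
                .collect(Collectors.toList()); // terminal operation: this is what makes peek run
    }
}
```

Without the final collect, the log would stay empty, just like the console in the question.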

Why is Files.lines (and similar Streams) not automatically closed?

The javadoc for Stream states:
Streams have a BaseStream.close() method and implement AutoCloseable, but nearly all stream instances do not actually need to be closed after use. Generally, only streams whose source is an IO channel (such as those returned by Files.lines(Path, Charset)) will require closing. Most streams are backed by collections, arrays, or generating functions, which require no special resource management. (If a stream does require closing, it can be declared as a resource in a try-with-resources statement.)
Therefore, the vast majority of the time one can use Streams in a one-liner, like collection.stream().forEach(System.out::println); but for Files.lines and other resource-backed streams, one must use a try-with-resources statement or else leak resources.
This strikes me as error-prone and unnecessary. As Streams can only be iterated once, it seems to me that there is no situation where the output of Files.lines should not be closed as soon as it has been iterated, and therefore the implementation should simply call close implicitly at the end of any terminal operation. Am I mistaken?

Yes, this was a deliberate decision. We considered both alternatives.
The operating design principle here is "whoever acquires the resource should release the resource". Files don't auto-close when you read to EOF; we expect files to be closed explicitly by whoever opened them. Streams that are backed by IO resources are the same.
Fortunately, the language provides a mechanism for automating this for you: try-with-resources. Because Stream implements AutoCloseable, you can do:
try (Stream<String> s = Files.lines(...)) {
s.forEach(...);
}
The argument that "it would be really convenient to auto-close so I could write it as a one-liner" is nice, but would mostly be the tail wagging the dog. If you opened a file or other resource, you should also be prepared to close it. Effective and consistent resource management trumps "I want to write this in one line", and we chose not to distort the design just to preserve the one-line-ness.
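A complete version of that pattern might look like the following sketch. The class LineCount, the method countNonBlank, and the choice of UTF-8 are assumptions for the example, not part of the answer above.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

public class LineCount {
    // Counts the non-blank lines of a file; the file is closed when the try block exits,
    // whether the pipeline completes normally or throws.
    public static long countNonBlank(Path file) throws IOException {
        try (Stream<String> lines = Files.lines(file, StandardCharsets.UTF_8)) {
            return lines.filter(line -> !line.trim().isEmpty()).count();
        }
    }
}
```

The method that opens the file is also the one that closes it, which matches the "whoever acquires the resource should release the resource" principle.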

I have a more specific example in addition to @BrianGoetz's answer. Don't forget that Stream has escape-hatch methods like iterator(). Suppose you are doing this:
Iterator<String> iterator = Files.lines(path).iterator();
After that you may call hasNext() and next() several times, then just abandon this iterator: Iterator interface perfectly supports such use. There's no way to explicitly close the Iterator, the only object you can close here is the Stream. So this way it would work perfectly fine:
try(Stream<String> stream = Files.lines(path)) {
Iterator<String> iterator = stream.iterator();
// use iterator in any way you want and abandon it at any moment
} // file is correctly closed here.

In addition, if you want a one-liner, you can just do this:
Files.readAllLines(source).stream().forEach(...);
Use it only if you are sure that you need the entire file and the file is small, because readAllLines is not lazy: it reads every line into memory first.

If you're lazy like me and don't mind the "if an exception is raised, it will leave the file handle open" you could wrap the stream in an autoclosing stream, something like this (there may be other ways):
static Stream<String> allLinesCloseAtEnd(String filename) throws IOException {
Stream<String> lines = Files.lines(Paths.get(filename));
Iterator<String> linesIter = lines.iterator();
Iterator<String> it = new Iterator<String>() {
@Override
public boolean hasNext() {
if (!linesIter.hasNext()) {
lines.close(); // auto-close when the end is reached
return false;
}
return true;
}
@Override
public String next() {
return linesIter.next();
}
};
// ORDERED rather than DISTINCT: lines keep their file order and may repeat
return StreamSupport.stream(Spliterators.spliteratorUnknownSize(it, Spliterator.ORDERED), false);
}

In Java 8 Stream API, what's the difference between a DirectoryStream<Path> and Stream<Path>?

I want to return a stream of paths (these are files located in a certain directory). My initial approach was this:
DirectoryStream<Path> getFiles(Path dir) throws IOException {
return Files.newDirectoryStream(dir);
}
... but, I would like to know the difference between the above snippet and this second one:
Stream<Path> getFiles(Path dir) throws IOException {
Spliterator<Path> spl = Files.newDirectoryStream(dir).spliterator();
return StreamSupport.stream(spl, false);
}
Both DirectoryStream and Stream are sub-interfaces of AutoCloseable, but beyond that, they seem to be designed for different purposes.
To be more precise, my question is this:
What are the conceptual and functionality-based differences between DirectoryStream and Stream interfaces in Java-8?

What are the conceptual and functionality-based differences between
DirectoryStream and Stream interfaces in Java-8?
The Java Stream API is a general-purpose API designed to provide a lazy, functional/declarative style of coding over any stream of objects. It is not specific to one domain and has mechanisms to filter, transform, and aggregate the data coming from the stream.
DirectoryStream, by contrast, is specifically designed for loading, filtering, and iterating over file system directories through an easy-to-use API.
The Stream API has clear-cut common functions and corresponding SAM (Single Abstract Method) interfaces that ease coding for almost any use case.
DirectoryStream only has the convenience functions and interfaces needed to load, filter, and iterate over directories.
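To make the difference concrete, here is a sketch that lists the same directory both ways. The class and method names are invented for the comparison. Both variables are closed with try-with-resources because both hold an open directory handle.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class ListingComparison {
    // Java 7 style: DirectoryStream is an Iterable, so it is consumed with a for-each loop.
    public static List<String> namesViaDirectoryStream(Path dir) throws IOException {
        List<String> names = new ArrayList<>();
        try (DirectoryStream<Path> ds = Files.newDirectoryStream(dir)) {
            for (Path entry : ds) {
                names.add(entry.getFileName().toString());
            }
        }
        Collections.sort(names); // iteration order of a directory is unspecified
        return names;
    }

    // Java 8 style: Files.list returns a Stream<Path>, so map/sorted/collect are available.
    public static List<String> namesViaStream(Path dir) throws IOException {
        try (Stream<Path> entries = Files.list(dir)) {
            return entries
                    .map(p -> p.getFileName().toString())
                    .sorted()
                    .collect(Collectors.toList());
        }
    }
}
```

Both methods return the same names; the second simply gets transformation and aggregation operations from the Stream API instead of hand-written loops.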

Java 8 Streams and try with resources

I thought that the stream API was here to make the code easier to read.
I found something quite annoying. The Stream interface extends the java.lang.AutoCloseable interface.
So if you want to correctly close your streams, you have to use try with resources.
Listing 1. Not very nice, streams are not closed.
public void noTryWithResource() {
Set<Integer> photos = new HashSet<Integer>(Arrays.asList(1, 2, 3));
@SuppressWarnings("resource") List<ImageView> collect = photos.stream()
.map(photo -> new ImageView(new Image(String.valueOf(photo))))
.collect(Collectors.<ImageView>toList());
}
Listing 2. With 2 nested try
public void tryWithResource() {
Set<Integer> photos = new HashSet<Integer>(Arrays.asList(1, 2, 3));
try (Stream<Integer> stream = photos.stream()) {
try (Stream<ImageView> map = stream
.map(photo -> new ImageView(new Image(String.valueOf(photo)))))
{
List<ImageView> collect = map.collect(Collectors.<ImageView>toList());
}
}
}
Listing 3. As map returns a stream, both the stream() and the map() functions have to be closed.
public void tryWithResource2() {
Set<Integer> photos = new HashSet<Integer>(Arrays.asList(1, 2, 3));
try (Stream<Integer> stream = photos.stream(); Stream<ImageView> map = stream.map(photo -> new ImageView(new Image(String.valueOf(photo)))))
{
List<ImageView> collect = map.collect(Collectors.<ImageView>toList());
}
}
The example I give does not make any sense: I replaced the Path to JPG images with Integer for the sake of the example. But don't let these details distract you.
What is the best way to deal with these auto-closable streams?
I have to say I'm not satisfied with any of the 3 options I showed.
What do you think? Are there yet other more elegant solutions?

You're using @SuppressWarnings("resource"), which presumably suppresses a warning about an unclosed resource. This isn't one of the warnings emitted by javac. Web searches seem to indicate that Eclipse issues warnings if an AutoCloseable is left unclosed.
This is a reasonable warning according to the Java 7 specification that introduced AutoCloseable:
A resource that must be closed when it is no longer needed.
However, the Java 8 specification for AutoCloseable was relaxed to remove the "must be closed" clause. It now says, in part,
An object that may hold resources ... until it is closed.
It is possible, and in fact common, for a base class to implement AutoCloseable even though not all of its subclasses or instances will hold releasable resources. For code that must operate in complete generality, or when it is known that the AutoCloseable instance requires resource release, it is recommended to use try-with-resources constructions. However, when using facilities such as Stream that support both I/O-based and non-I/O-based forms, try-with-resources blocks are in general unnecessary when using non-I/O-based forms.
This issue was discussed extensively within the Lambda expert group; this message summarizes the decision. Among other things it mentions changes to the AutoCloseable specification (cited above) and the BaseStream specification (cited by other answers). It also mentions the possible need to adjust the Eclipse code inspector for the changed semantics, presumably not to emit warnings unconditionally for AutoCloseable objects. Apparently this message didn't get to the Eclipse folks or they haven't changed it yet.
In summary, if Eclipse warnings are leading you into thinking that you need to close all AutoCloseable objects, that's incorrect. Only certain specific AutoCloseable objects need to be closed. Eclipse needs to be fixed (if it hasn't already) not to emit warnings for all AutoCloseable objects.

You only need to close a Stream if the stream needs to do some cleanup of its own, usually I/O. Your example uses a HashSet, so it doesn't need to be closed.
from the Stream javadoc:
Generally, only streams whose source is an IO channel (such as those returned by Files.lines(Path, Charset)) will require closing. Most streams are backed by collections, arrays, or generating functions, which require no special resource management.
So in your example this should work without issue
List<ImageView> collect = photos.stream()
.map(photo -> ...)
.collect(toList());
EDIT
Even if you need to clean up resources, you should be able to use just one try-with-resource. Let's pretend you are reading a file where each line in the file is a path to an image:
try (Stream<String> lines = Files.lines(file)) {
List<ImageView> collect = lines
// ImageIO.read throws a checked IOException, so real code has to handle it inside the lambda
.map(line -> new ImageView(ImageIO.read(new File(line))))
.collect(toList());
}

“Closeable” means “can be closed”, not “must be closed”.
That was true in the past, e.g. see ByteArrayOutputStream:
Closing a ByteArrayOutputStream has no effect.
And that is true now for Streams where the documentation makes clear:
Streams have a BaseStream.close() method and implement AutoCloseable, but nearly all stream instances do not actually need to be closed after use. Generally, only streams whose source is an IO channel (such as those returned by Files.lines(Path, Charset)) will require closing.
So if an audit tool generates false warnings, it’s a problem of the audit tool, not of the API.
Note that even if you want to add resource management, there is no need to nest try statements. While the following is sufficient:
final Path p = Paths.get(System.getProperty("java.home"), "COPYRIGHT");
try(Stream<String> stream=Files.lines(p, StandardCharsets.ISO_8859_1)) {
System.out.println(stream.filter(s->s.contains("Oracle")).count());
}
you may also add the secondary Stream to the resource management without an additional try:
final Path p = Paths.get(System.getProperty("java.home"), "COPYRIGHT");
try(Stream<String> stream=Files.lines(p, StandardCharsets.ISO_8859_1);
Stream<String> filtered=stream.filter(s->s.contains("Oracle"))) {
System.out.println(filtered.count());
}
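The reason a single try is enough can be checked with a close handler: in the reference implementation (OpenJDK), closing any stream of a pipeline runs the handlers registered on its source. The following sketch uses made-up names (CloseDemo, closedViaDerived) and an in-memory stream, so no file is needed.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.stream.Stream;

public class CloseDemo {
    // Returns true if closing only the derived stream also ran the source's close handler.
    public static boolean closedViaDerived() {
        AtomicBoolean closed = new AtomicBoolean(false);
        Stream<String> source = Stream.of("a", "b").onClose(() -> closed.set(true));
        // Only the filtered stream is managed by try-with-resources.
        try (Stream<String> derived = source.filter(s -> !s.isEmpty())) {
            derived.count();
        }
        return closed.get();
    }
}
```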

It is possible to create a utility method that reliably closes streams with a try-with-resource-statement.
It is a bit like a try-finally that is an expression (something that is the case in e.g. Scala).
/**
 * Applies a function to a resource and closes it afterwards.
 * @param sup Supplier of the resource that should be closed
 * @param op operation that should be performed on the resource before it is closed
 * @return The result of calling op.apply on the resource
 */
private static <A extends AutoCloseable, B> B applyAndClose(Callable<A> sup, Function<A, B> op) {
try (A res = sup.call()) {
return op.apply(res);
} catch (RuntimeException exc) {
throw exc;
} catch (Exception exc) {
throw new RuntimeException("Wrapped in applyAndClose", exc);
}
}
(Resources that need to be closed often also throw exceptions when they are allocated, so checked exceptions are wrapped in runtime exceptions; this avoids the need for a separate method that does that.)
With this method the example from the question looks like this:
Set<Integer> photos = new HashSet<Integer>(Arrays.asList(1, 2, 3));
List<ImageView> collect = applyAndClose(photos::stream, s -> s
.map(photo -> new ImageView(new Image(String.valueOf(photo))))
.collect(Collectors.toList()));
This is useful in situations when closing the stream is required, such as when using Files.lines. It also helps when you have to do a "double close", as in your example in Listing 3.
This answer is an adaptation of an old answer to a similar question.
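For the Files.lines case, a usage sketch could look like this. The helper is repeated so the example compiles on its own; the class ApplyAndCloseDemo, the method firstLines, and the UTF-8 charset are assumptions added for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class ApplyAndCloseDemo {
    // Same helper as in the answer: apply op to the resource, then close it.
    public static <A extends AutoCloseable, B> B applyAndClose(Callable<A> sup, Function<A, B> op) {
        try (A res = sup.call()) {
            return op.apply(res);
        } catch (RuntimeException exc) {
            throw exc;
        } catch (Exception exc) {
            throw new RuntimeException("Wrapped in applyAndClose", exc);
        }
    }

    // Reads at most n lines; the file is closed before the list is returned.
    public static List<String> firstLines(Path file, int n) {
        Callable<Stream<String>> open = () -> Files.lines(file, StandardCharsets.UTF_8);
        return applyAndClose(open, s -> s.limit(n).collect(Collectors.toList()));
    }
}
```

Because the result is collected inside op, nothing lazy escapes the helper after the stream has been closed.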
