When are readers/writers/streams identified as being open? - java

I am creating an abstract binding class for a Reader and Writer where the user doesn't have to reference each one individually.
Example: I have a FileStream which inside of it houses both a FileReader and FileWriter.
The question I have refers to optimizing the class. I know I can't have two streams opened simultaneously due to concurrency, however I need to initialize them somewhere without having data leaks all over the place.
Are streams/readers/writers classified as being open, as soon as you initialize them, or are the 'pipes' only opened once the first read/write begins? I'm looking at the JavaDoc and don't see anything here about when the streams actually open up...
For those who do not understand what I am asking (ignoring try-catch blocks):
// does my reader become OPEN here?
BufferedReader br = new BufferedReader(new FileReader("foobar.txt"));
// or here, now that I have performed the first operation.
br.readLine();

They are open as soon as you construct them. There is no 'open' operation, so they are already open.
Discussion:
new FileInputStream(...) and new FileOutputStream(...) open the file in the constructor, which is why those constructors can throw FileNotFoundException (a kind of IOException). Most other input or output streams in a typical stack are decorators (FilterInputStream/FilterOutputStream subclasses, or readers and writers) with a FileInputStream/FileOutputStream or similar as their delegate. The underlying stream is created first in any such stack, so it is already open before the decorator streams are constructed, and therefore they are already open too.
ByteArrayInput/OutputStreams and StringReader/Writer don't need opening at all.
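One way to check this for yourself is to point the constructor at a file that does not exist. The sketch below is only an illustration (the class name, method name and file name are made up for the example): if the file were opened lazily, nothing could fail until the first read, but the failure already shows up while the reader chain is being built.

```java
import java.io.BufferedReader;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

class OpenOnConstruct {
    // Returns true if building the reader chain already fails for the given file,
    // i.e. the file is opened by the constructor and not by the first read.
    static boolean failsAtConstruction(Path file) throws IOException {
        try (BufferedReader br = new BufferedReader(new FileReader(file.toFile()))) {
            return false; // construction succeeded, so the file is open right now
        } catch (FileNotFoundException e) {
            return true;  // thrown by new FileReader(...), before any read call
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("open-demo");
        Path missing = dir.resolve("does-not-exist.txt"); // hypothetical file name
        System.out.println(failsAtConstruction(missing));
        Files.delete(dir);
    }
}
```

For the missing file this should report true without readLine() ever being called.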

Alternative solution: forget about re-inventing the wheel.
Java has a class that is specifically designed to allow for reading and writing to the same file: java.io.RandomAccessFile
So, if you have to wrap around... Use that class, instead of combining two other things that were never intended to be combined!
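A minimal sketch of what that could look like, assuming plain ASCII text and a made-up helper name (writeThenRead). It writes a line, seeks back to the start, and reads the line again through the same handle, so there is only one thing to close.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Files;
import java.nio.file.Path;

class RandomAccessDemo {
    // Writes one line, rewinds, and reads it back through a single file handle.
    static String writeThenRead(Path file, String text) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "rw")) {
            raf.setLength(0);            // start from an empty file
            raf.writeBytes(text + "\n"); // writes the low 8 bits of each char (ASCII only)
            raf.seek(0);                 // move the file pointer back to the beginning
            return raf.readLine();       // line terminator is not included in the result
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("raf-demo", ".txt");
        System.out.println(writeThenRead(file, "hello"));
        Files.delete(file);
    }
}
```

Note that RandomAccessFile works on bytes, so for non-ASCII text you would have to handle the character encoding yourself.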

Related

Resource leak: 'sc' is never closed while taking user input with for loop [duplicate]

I'm watching some java tutorials and wondering about the resource leak warning when using Scanner.
I know I can close it, but the person in the video is not getting a warning even though he is using the exact same code, why is that?
Scanner input= new Scanner(System.in);
System.out.print("Enter a line of text: ");
String line = input.nextLine();
System.out.println("You entered: " + line);
//input.close();
The actual problem
It's complicated. The general rule is that whoever creates a resource must also safely close it, and because your code creates a Scanner, the IDE is telling you: hey, you should close that.
The problem is, closing that scanner is wrong here: the scanner wraps around System.in, which is NOT a resource you made, but scanner.close() would close the underlying stream (System.in itself), and you don't want that. It's not your responsibility and in fact it actively harms things: now you can never read from System.in again.
The problem is, the IDE can't really know this. The underlying problem is that System.in is an extremely badly designed API in many ways, but [A] it's 30 years old, and back then it was a lot harder to know that (we know it now because of hindsight), and [B] Oracle hasn't gotten around to making a second version of the System.in/out/err API yet, and it's not high on the agenda.
This leaves IDEs in trouble: It's relatively easy to set up some patterns and rules for using resources such that you never have any problems with this 'filters you created that wrap around resources you did not create', but you can't use them with sysin/err/out without writing a little framework, and that's a bit much to ask for newbies. It's also not a good idea presumably to tell those taking their first steps in java coding to first go download some third party library that cleans up sysin/out/err interaction a bit.
Thus, we're in limbo. IDEs should NOT warn about this, but it's hard for them to detect that this is an exotic scenario where you have a resource you made that you nevertheless don't need to close, and in fact should not close.
You can turn off the setting for 'unclosed resources', which no doubt the video tutorial did, but it is a useful warning. Just.. hampered by this silly old API that makes no sense anymore.
Some in-depth explanation
There are resources. These are things that implement AutoCloseable, and there are very many. Let's focus on those that represent I/O things: the top-level types in the hierarchy are Writer, Reader, InputStream, and OutputStream (let's call them all WRIOs). They're all AutoCloseable, and most IDEs (incorrectly?) complain about unclosed resources for all of them. However, that's oversimplifying things.
You can split the world of all WRIOs into:
Actual resources (they directly represent an underlying OS-based concept that leads to starvation of some resource, a.k.a. 'a leak', if you do not close them). new FileInputStream, socket.getInputStream, etc.: these represent actual resources.
Dummy resources - they act like a resource but don't actually represent a resource that you can starve out that isn't already fixed by the garbage collector. new ByteArrayInputStream, turning StringBuilders into Readers, etc.
filters - these wrap around a resource and modify it 'in transit'. Such filters do not themselves capture any starvable resource. If you close() them, they also invoke close on the thing they wrap. Scanner is a filter.
The rules on closing them boil down to:
Actual resources - must be closed safely by whoever made them. IDE warnings are warranted if you fail to do this. Note that you did not make System.in, so this doesn't apply there.
Dummy resources - you can close them, but you don't have to. If the IDE warns on them, toss a try-with-resources around it, annoying but not too hard to work around.
Filters - tricky.
The problem with filters
If it's a filter provided to you with the intent that you close it:
BufferedReader br = Files.newBufferedReader(somePath);
then failure to close br is a resource leak; IDE warnings are warranted.
If it's a filter you made, wrapping around a WRIO you also made:
InputStream raw = socket.getInputStream();
BufferedReader br = new BufferedReader(new InputStreamReader(raw, StandardCharsets.UTF_8));
(This is one real resource, wrapped by a filter WRIO (InputStreamReader), which is in turn wrapped by another filter WRIO.) Here the resource leak is all about raw: as long as raw is safely closed, failing to close br is not a resource leak. It might be a bug (with a buffering output filter such as BufferedWriter, closing the raw stream without first closing or flushing the filter means the bytes still in the buffer are never written out), but not a resource leak. An IDE warning about failure to close br is wrong, but not too harmful, as you can just put try-with-resources around it, which in passing also guarantees that the 'failure to flush the buffering filter' bug cannot happen anymore.
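To see that closing the outermost filter is enough, here is a small sketch. TrackingStream is a made-up test double standing in for the socket stream (it is really an in-memory stream that just records whether close() was called); the rest mirrors the two-filter chain above.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

class CloseChainDemo {
    // Hypothetical stand-in for a real resource: remembers whether close() was called.
    static class TrackingStream extends ByteArrayInputStream {
        boolean closed;
        TrackingStream(byte[] data) { super(data); }
        @Override public void close() throws IOException {
            closed = true;
            super.close();
        }
    }

    // Returns true if closing only the outer BufferedReader also closed the raw stream.
    static boolean closingFilterClosesRaw() throws IOException {
        TrackingStream raw = new TrackingStream("hello\n".getBytes(StandardCharsets.UTF_8));
        try (BufferedReader br =
                 new BufferedReader(new InputStreamReader(raw, StandardCharsets.UTF_8))) {
            br.readLine();
        } // br.close() runs here and propagates down the chain
        return raw.closed;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(closingFilterClosesRaw());
    }
}
```

This is also why the same try-with-resources habit becomes harmful in the problem case described next: the close call reaches a resource you do not own.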
Then, there is the problem case:
Making a filter WRIO that wraps around a resource you did not make and do not have the responsibility to close: You should actively NOT be closing these filter WRIOs, as that will end up closing the underlying resource and you did not want that.
Here an IDE warning is actively bad and annoying, but it is very hard for an IDE to realize this.
The design solution
Normally, you fix this by never getting in that scenario. For example, System.in should have better API; this API would look like:
try (Scanner s = System.newStandardIn()) {
    // use scanner here
}
and have the property that closing s does not close System.in itself (it would do mostly nothing; set a boolean flag to throw exceptions if any further read calls are done, or possibly even do literally nothing). Now the IDE warning is at best overzealous, but heeding its advice and safely closing your scanner is now no longer actively introducing bugs in your code.
Unfortunately, that nice API doesn't exist (yet?). Thus we're stuck with this annoying scenario where a useful IDE warning system actively misleads you because of bad API design. If you really want to, you could write it:
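// needs java.io.FilterInputStream, java.io.InputStream and java.util.Scanner
public static Scanner newStandardIn() {
    // Scanner is final and cannot be subclassed, so shield System.in
    // with a filter whose close() does nothing
    InputStream shielded = new FilterInputStream(System.in) {
        @Override public void close() {}
    };
    Scanner s = new Scanner(shielded);
    // hey, while we're here, let's fix
    // another annoying wart!
    s.useDelimiter("\r?\n");
    return s;
}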
Now you can heed those warnings by following its advice:
public static void main(String[] args) {
    String name;
    int age;
    try (Scanner s = newStandardIn()) {
        System.out.print("What is your name: ");
        // use next() to read entire lines -
        // that useDelimiter fix made this possible
        name = s.next();
        System.out.print("What is your age: ");
        age = s.nextInt();
    }
    // use name and age here
}
No IDE warning, and no bugs.

InputStream and OutputStream - How to differentiate ambiguity

It seems to me that InputStream and OutputStream are ambiguous names for I/O.
InputStream can be thought of as "to input into a stream", and OutputStream can be thought of as "get output of a stream".
After all, we read from an "input" stream, but shouldn't you be reading from an "output"?
What was the rationale behind choosing these two names and what is a good way to remember Input/Output stream without confusing one for the other?
The streams are named not for how you use them inside your code but for what they accomplish. An InputStream accomplishes reading input from somewhere outside your program (the console, a file, etc.), whereas an OutputStream accomplishes writing an output to somewhere else (again, console, file, etc.). Your Java code is only the intermediary in this scenario: In order to make use of the input, you have to read it from the stream, and in order to produce an output, you first have to write something to the stream.
The naming is confusing only because every stream has two ends: something goes in at one end and comes out at the other. From your code's side, you read from an InputStream and write to an OutputStream. All you have to remember is that they are named for the more important task they do: moving data between your code and something outside it.
Think of your program/code as the Actor.
When the Actor wants to read something in, it obtains a handle to an InputStream, because that is the stream that will provide the input. Hence you read from it.
When the Actor wants to write something out, it obtains a handle to an OutputStream and starts writing to that handle, which does the rest. Likewise, you write to it.
I visualize my code as the classic stick-figure Actor, and InputStream and OutputStream as the entities it interacts with.
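Seen from the code's side, the naming can be sketched like this. The example is only an illustration with made-up names, uses in-memory streams in place of a file or the console, and assumes ASCII text: the program takes its input from the InputStream and delivers its output to the OutputStream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

class InOutDemo {
    // The program sits in the middle: input arrives via 'in', output leaves via 'out'.
    static void upperCase(InputStream in, OutputStream out) throws IOException {
        int b;
        while ((b = in.read()) != -1) {          // read FROM the input stream
            out.write(Character.toUpperCase(b)); // write TO the output stream (ASCII only)
        }
    }

    // Convenience wrapper using in-memory streams as the "outside world".
    static String upperCase(String text) throws IOException {
        InputStream in = new ByteArrayInputStream(text.getBytes(StandardCharsets.US_ASCII));
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        upperCase(in, out);
        return new String(out.toByteArray(), StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) throws IOException {
        System.out.println(upperCase("input goes in, output comes out"));
    }
}
```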

Why doesn't `loadFont` close input stream? Should I close it?

Looking at the documentation of Font#loadFont I came across this remark:
This method does not close the input stream.
Unfortunately, this is not explained or expanded upon. So my question is:
What are possible reasons the API won't close the input stream? Is it likely you would like to re-use the stream?
I mostly use this method like this:
Font.loadFont(getClass().getResourceAsStream("path/to/font"), 13.0);
to make sure the font is available for my application, so I never re-use the input stream, and I can't really think of a reason I'd want to.
Should I close the input stream myself? Should I expect any problems if I'm not closing the input stream?
In the past I've had problems with a font loaded this way, where some labels configured with this font started showing squares, while others (on the same scene!) kept working fine. Could this be related to not closing the input stream?
The documentation for an API involving scarce or external resources (such as file descriptors or streams) should make it clear whose responsibility it is to clean up (in this case, close the stream). This is sometimes referred to as "ownership".
In this case the documentation states that the loadFont method does not take ownership of the stream. Therefore it still belongs to you: It is your responsibility to close the stream.
The try-with-resources statement is the best way to do this.
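A sketch of that pattern without a JavaFX dependency: consumeWithoutClosing is a hypothetical stand-in for a call like Font.loadFont(in, 13.0), and TrackingStream is a made-up helper that records whether close() was called. The caller opens the stream, hands it to the non-owning method, and try-with-resources closes it afterwards.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

class OwnershipDemo {
    // Made-up helper: an in-memory stream that remembers whether it was closed.
    static class TrackingStream extends ByteArrayInputStream {
        boolean closed;
        TrackingStream(byte[] data) { super(data); }
        @Override public void close() throws IOException {
            closed = true;
            super.close();
        }
    }

    // Hypothetical stand-in for an API that reads the stream but leaves it open.
    static int consumeWithoutClosing(InputStream in) throws IOException {
        return in.readAllBytes().length; // readAllBytes needs Java 9 or later
    }

    // Returns true if the stream was still open after the call and closed after the block.
    static boolean callerClosesStream() throws IOException {
        TrackingStream stream = new TrackingStream(new byte[] {1, 2, 3});
        boolean openAfterCall = false;
        try (InputStream in = stream) {
            consumeWithoutClosing(in);
            openAfterCall = !stream.closed; // the callee did not take ownership
        }                                   // the owner closes it here
        return openAfterCall && stream.closed;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(callerClosesStream());
    }
}
```

With JavaFX on the classpath, the body of the try block would be the loadFont call itself, with the stream from getResourceAsStream as the resource.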

Read from a BufferedReader more than once in Java

I have the following piece of code in Java:
HttpURLConnection con = (HttpURLConnection)new URL(url).openConnection();
con.connect();
InputStream stream = con.getInputStream();
BufferedReader file = new BufferedReader(new InputStreamReader(stream));
At this point, I read the file from start to end while searching for something:
while (true)
{
    String line = file.readLine();
    if (line == null)
        break;
    // Search for something...
}
Now I want to search for something else within the file, without opening another URL connection.
For reasons unrelated to this question, I wish to avoid searching for both things "in a single file-sweep".
Questions:
Can I rewind the file with reset?
If yes, should I apply it on the InputStream object, on the BufferedReader object or on both?
If no, then should I simply close the file and reopen it?
If yes, should I apply it on the InputStream object, on the BufferedReader object or on both?
If no, how else can I sweep the file again, without reading through the URL connection again?
You can rewind the file with reset(), provided that you have mark()'ed the position you want to rewind to. These methods should be invoked on the decorator, i.e. BufferedReader.
However, you may want to reconsider your design, as you can easily read the whole file into some data structure (even a list of strings, or some stream backed by a string) and use the data multiple times.
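A minimal sketch of the read-once approach, with made-up names. The StringReader in main is only a stand-in for new InputStreamReader(con.getInputStream()), so the example runs without a network connection; this assumes the content fits comfortably in memory.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.List;
import java.util.stream.Collectors;

class ReadOnceSearchTwice {
    // Reads the source exactly once and keeps every line in memory.
    static List<String> readLines(Reader source) throws IOException {
        try (BufferedReader br = new BufferedReader(source)) {
            return br.lines().collect(Collectors.toList());
        }
    }

    // Each search is a separate sweep over the cached lines, not over the connection.
    static long countContaining(List<String> lines, String needle) {
        return lines.stream().filter(line -> line.contains(needle)).count();
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the reader obtained from the URL connection
        Reader source = new StringReader("alpha\nbeta\nalpha beta\n");
        List<String> lines = readLines(source);
        System.out.println(countContaining(lines, "alpha")); // first sweep
        System.out.println(countContaining(lines, "beta"));  // second sweep
    }
}
```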
Use the following methods:
mark
skip
reset
You can do this only if markSupported() returns true. Note that many readers and streams do not implement this themselves; BufferedReader and BufferedInputStream do, by keeping up to the read-ahead limit you pass to mark() in memory. So always call markSupported(), and keep in mind that it returns false for streams that do not support this feature.
This really can happen for URL-based streams: think about how you would reset a stream that originates from a remote server. It would require the client side to cache all the content that has already been downloaded.
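A small sketch of the mark/reset sequence on a BufferedReader, with made-up names and a StringReader standing in for the real source. The readAheadLimit argument is the number of characters that may be read after mark() while still being able to reset(); reading past it may make reset() fail, so rewinding a whole file this way means buffering the whole file.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

class MarkResetDemo {
    // Reads the first line, rewinds to the mark, and reads the same line again.
    static String[] readFirstLineTwice(Reader source, int readAheadLimit) throws IOException {
        try (BufferedReader br = new BufferedReader(source)) {
            if (!br.markSupported()) {
                throw new IOException("mark/reset not supported by this reader");
            }
            br.mark(readAheadLimit);      // remember the current position
            String first = br.readLine();
            br.reset();                   // jump back to the marked position
            String again = br.readLine();
            return new String[] {first, again};
        }
    }

    public static void main(String[] args) throws IOException {
        String[] lines = readFirstLineTwice(new StringReader("one\ntwo\n"), 1024);
        System.out.println(lines[0] + " / " + lines[1]);
    }
}
```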
I usually end up using something like InputStreamSource to make re-reading convenient. When I'm dealing with connections, I find it useful to use an in-memory or on-disk spooling strategy for re-reading. Use a threshold for choosing storage location, "tee" into the spool on first read, and re-read from the spool on subsequent reads.
Edit: I also found Guava's ByteSource and CharSource, which serve the same purpose.

Can I close/reopen InputStream to mimic mark/reset for input streams that do not support mark?

I'm trying to read java.io.InputStream multiple times starting from the top of the stream.
Obviously for streams that return true to markSupported() I can try and use mark(availableBytes) and then reset() to read stream again from the top.
Most streams do not support mark, and those that do (e.g. java.io.BufferedInputStream) copy data into a temporary byte array, which is not nice in terms of memory consumption, etc.
If my method receives java.io.InputStream as a parameter can I close it and then somehow reopen it to reset same original stream to the top so I can read it again?
I couldn't find any way to do this trick apart from writing the original InputStream into memory (yuck!) or a temporary file and then opening a new InputStream on those temporary locations if I need to read the stream from the top again.
You can close it, but the only way to reopen the same stream to the same data without creating an explicit copy of the data somewhere is to determine what concrete type of InputStream you are dealing with (easy), what that stream was initialized to point at (may be easy, hard, or impossible depending upon the stream type and its interface), and then adding code to instantiate a new instance of the concrete stream type with the original source input (not difficult, but also not very maintainable and easily breakable if someone creates a custom InputStream implementation that you don't know how to handle).
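For completeness, the explicit-copy route the question mentions can be sketched as below. This is not a way around the copy, only a compact form of it: the names are made up, and it assumes the data fits in memory (for large inputs, a temporary file would play the role of the byte array).

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.function.Supplier;

class RereadableCopy {
    // Drains the original stream once and hands out fresh streams over the copy.
    static Supplier<InputStream> buffer(InputStream original) throws IOException {
        byte[] data = original.readAllBytes(); // Java 9 or later; does not close 'original'
        return () -> new ByteArrayInputStream(data);
    }

    public static void main(String[] args) throws IOException {
        InputStream original =
            new ByteArrayInputStream("some data".getBytes(StandardCharsets.UTF_8));
        Supplier<InputStream> reopen = buffer(original);
        // Every call to get() starts again from the top of the data
        System.out.println(new String(reopen.get().readAllBytes(), StandardCharsets.UTF_8));
        System.out.println(new String(reopen.get().readAllBytes(), StandardCharsets.UTF_8));
    }
}
```

The caller that created the original stream still owns it and remains responsible for closing it.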
