When reading stream, why not stop iterating when 0 bytes read - java

The standard idiom when reading from a stream is to check for EOF (-1):
while((bytesRead = inputStream.read(buffer)) != -1)
This seems pretty standard - I checked source for popular libraries like Apache Commons and it seems to be the defacto standard.
Why do we not stop at 0 as well? Wouldn't > -1 be better? Why do whatever work is in the loop when we didn't read anything?

Basically because it would be pointless. Look at the documentation:
If the length of b is zero, then no bytes are read and 0 is returned; otherwise, there is an attempt to read at least one byte. If no byte is available because the stream is at the end of the file, the value -1 is returned; otherwise, at least one byte is read and stored into b.
So unless you're passing in an empty buffer (which is basically a bug in pretty much all cases; I personally wish the method would throw an exception in that case) the return value will never be 0. It will block for at least one byte to be read (in which case the return value will be 1 or more), or for the end of the stream to be reached (in which case the return value will be -1).

Related

Reading n bytes atomically without blocking

I just asked a question about why my thread shut down wasn't working. It ended up being due to readLine() blocking my thread before the shutdown flag could be recognised. This was easy to fix by checking ready() before calling readLine().
However, I'm now using a DataInputStream to do the following in series:
int x = reader.readInt();
int y = reader.readInt();
byte[] z = new byte[y]
reader.readFully(z);
I know I could implement my own buffering which would check the running file flag while loading up the buffer. But I know this would be tedious. Instead, I could let the data be buffered within the InputStream class, and wait until I have my n bytes read, before executing a non-blocking read - as I know how much I need to read.
4 bytes for the first integer
4 bytes for the second integer y
and y bytes for the z byte array.
Instead of using ready() to check if there is a line in the buffer, is there some equivalent ready(int bytesNeeded)?
The available() method returns the amount of bytes in the InputStreams internal buffer.
So, one can do something like:
while (reader.available() < 4) checkIfShutdown();
reader.readInt();
You can use InputStream.available() to get an estimate of the amount of bytes that can be read. Quoting the Javadoc:
Returns an estimate of the number of bytes that can be read (or skipped over) from this input stream without blocking, which may be 0, or 0 when end of stream is detected. The read might be on the same thread or another thread. A single read or skip of this many bytes will not block, but may read or skip fewer bytes.
In other words, if available() returns n, you know you can safely call read(n) without blocking. Note that, as the Javadoc states, the value returned is an estimate. For example, InflaterInputStream.available() will always return 1 if EOF isn't reached. Check the documentation of the InputStream subclass you will be using to ensure it meets your needs.
You are going to need to implement your own equivalent of BufferedInputStream. Either as a sole owner of an InputStream and a thread (possibly borrowed from a pool) to block in. Alternatively, implement with NIO.

Channel.read returns -2

I'm using Java NIO to do socket operations. In the past when working with streams, the read call (where you are reading into a byte array, or in this case a ByteBuffer) returns the number of bytes read from the stream, or -1 if the stream was closed. So you basically can do
while(channel.read(buffer) != -1){
//do stuff
}
However, I noticed that I was killing my servers. When I added some logging statements, I noticed that the read() call was returning -2 at the end of the stream. According to the documentation:
Returns: The number of bytes read, possibly zero, or -1 if the channel
has reached end-of-stream
Has anybody experienced this before? I changed my code to loop on a value >0, but I wanted to make sure I understand what's going on.

java inputstream

What is an InputStream's available() method is supposed to return when the end of the stream is reached?
The documentation doesn't specify the behavior.
..end of the stream is reached
Don't use available() for detecting end of stream! Instead look to the int returned by InputStream.read(), which:
If no byte is available because the end of the stream has been reached, the value -1 is returned.
The JavaDoc does tell you in the Returns section -
an estimate of the number of bytes that can be read (or skipped over) from this input stream without blocking or 0 when it reaches the end of the input stream.
(from InputStream JavaDoc)
Theoretically if end of stream is reached there are not bytes to read and available returns 0. But be careful with it. Not all streams provide real implementation of this method. InputStream itself always returns 0.
If you need non-blocking functionality, i.e. reading from stream without being blocked on read use NIO instead.
From the Java 7 documentation:
"an estimate of the number of bytes that can be read (or skipped over) from this input stream without blocking or 0 when it reaches the end of the input stream."
So, I would say it should return 0 in this case. That also seems the most intuitive behaviour to me.
Returns an estimate of the number of bytes that can be read (or skipped over) from this input stream without blocking by the next invocation of a method for this input stream. The next invocation might be the same thread or another thread. A single read or skip of this many bytes will not block, but may read or skip fewer bytes.
The available method for class InputStream always returns 0.
http://download.oracle.com/javase/6/docs/api/java/io/InputStream.html#available%28%29

Can calling available() for a BufferedInputStream lead me astray in this case?

I am reading in arbitrary size file in blocks of 1021 bytes, with a block size of <= 1021 bytes for the final block of the file. At the moment, I am doing this using a BufferedInputStream which is wrapped around a FileInputStream and code that looks (roughly) like the following (where reader is the BufferedInputStream and this is operating in a loop):
int availableData = reader.available();
int datalen = (availableData >= 1021)
? 1021
: availableData;
reader.read(bufferArray, 0, datalen);
However, from reading the API docs, I note that available() only gives an "estimate" of the available size, before the call would 'block'. Printing out the value of availableData each iteration seems to give the expected values - starting with the file size and slowly getting less until it is <= 1021. Given that this is a local file, am I wrong to expect this to be a correct value - is there a situation where available() would give an incorrect answer?
EDIT: Sorry, additional information. The BufferedInputStream is wrapped around a FileInputStream. From the source code for a FIS, I think I'm safe to rely on available() as a measure of how much data is left in the case of a local file. Am I right?
The question is pointless. Those four lines of code are entirely equivalent to this:
reader.read(buffer, 0, 1021);
without the timing-window problem you have introduced between the available() call and the read. Note that this code is still incorrect as you are ignoring the return value, which can be -1 at EOS, or else anything between 1 and 1021 inclusive.
It doesn't give the estimated size, it gives the remaining bytes that can be read. It's not an estimate with BufferedInputStream.
Returns the number of bytes that can
be read from this input stream without
blocking.
You should pass available() directly into the read() call if you want to avoid blocking, but remember to return if the return value is 0 or -1. available() might throw an exception on buffer types that don't support the operation.

What 0 returned by InputStream.read(byte[]) means? How to handle this?

What 0 (number of bytes read) returned by InputStream.read(byte[]) and InputStream.read(byte[], int, int) means? How to handle this situation?
To be clear, I mean read(byte[] b) or read(byte[] b, int off, int len) methods which return number of bytes read.
The only situation in which a InputStream may return 0 from a call to read(byte[]) is when the byte[] passed in has a length of 0:
byte[] buf = new byte[0];
int read = in.read(buf); // read will contain 0
As specified by this part of the JavaDoc:
If the length of b is zero, then no bytes are read and 0 is returned
My guess: you used available() to see how big the buffer should be and it returned 0. Note that this is a misuse of available(). The JavaDoc explicitly states that:
It is never correct to use the return value of this method to allocate a buffer intended to hold all data in this stream.
Take a look at the implementation of javax.sound.AudioInputStream#read(byte[] b, int off, int len) ... yuck. They completely violated the standard java.io.InputStream semantics and return a read size of 0 if you request fewer than a whole frame of data.
So unfortunately; the common advice (and api spec) should preclude having to deal with return of zero when len > 0 but even for JDK provided classes you can't universally rely on this to be true for InputStreams of arbitrary types.
Again, yuck.
According to Java API Doc:
http://java.sun.com/j2se/1.4.2/docs/api/java/io/InputStream.html#read(byte[])
It only can happen if the byte[] you passed has zero items (new byte[0]).
In other situations it must return at least one byte. Or -1 if EOF reached. Or an exception.
Of course: it depends of the actual implementation of the InputStream you are using!!! (it could be a wrong one)
I observed the same behavior (reading 0 bytes) when I build a swing console output window and made a reader-thread for stdout and stderr via the following code:
this.pi = new PipedInputStream();
po = new PipedOutputStream((PipedInputStream)pi);
System.setOut(new PrintStream(po, true));
When the 'main' swing application exits, and my console window is still open I read 0 from this.pi.read().
The read data was put on the console window resulting in a race condition some how, just ignoring the result and not updating the console window solved the issue.

Categories

Resources