Channel.read returns -2 - java

I'm using Java NIO to do socket operations. In the past when working with streams, the read call (where you are reading into a byte array, or in this case a ByteBuffer) returns the number of bytes read from the stream, or -1 if the stream was closed. So you basically can do
while(channel.read(buffer) != -1){
//do stuff
}
However, I noticed that I was killing my servers. When I added some logging statements, I noticed that the read() call was returning -2 at the end of the stream. According to the documentation:
Returns: The number of bytes read, possibly zero, or -1 if the channel
has reached end-of-stream
Has anybody experienced this before? I changed my code to loop on a value >0, but I wanted to make sure I understand what's going on.

Related

Reading n bytes atomically without blocking

I just asked a question about why my thread shut down wasn't working. It ended up being due to readLine() blocking my thread before the shutdown flag could be recognised. This was easy to fix by checking ready() before calling readLine().
However, I'm now using a DataInputStream to do the following in series:
int x = reader.readInt();
int y = reader.readInt();
byte[] z = new byte[y]
reader.readFully(z);
I know I could implement my own buffering which would check the running file flag while loading up the buffer. But I know this would be tedious. Instead, I could let the data be buffered within the InputStream class, and wait until I have my n bytes read, before executing a non-blocking read - as I know how much I need to read.
4 bytes for the first integer
4 bytes for the second integer y
and y bytes for the z byte array.
Instead of using ready() to check if there is a line in the buffer, is there some equivalent ready(int bytesNeeded)?
The available() method returns the amount of bytes in the InputStreams internal buffer.
So, one can do something like:
while (reader.available() < 4) checkIfShutdown();
reader.readInt();
You can use InputStream.available() to get an estimate of the amount of bytes that can be read. Quoting the Javadoc:
Returns an estimate of the number of bytes that can be read (or skipped over) from this input stream without blocking, which may be 0, or 0 when end of stream is detected. The read might be on the same thread or another thread. A single read or skip of this many bytes will not block, but may read or skip fewer bytes.
In other words, if available() returns n, you know you can safely call read(n) without blocking. Note that, as the Javadoc states, the value returned is an estimate. For example, InflaterInputStream.available() will always return 1 if EOF isn't reached. Check the documentation of the InputStream subclass you will be using to ensure it meets your needs.
You are going to need to implement your own equivalent of BufferedInputStream. Either as a sole owner of an InputStream and a thread (possibly borrowed from a pool) to block in. Alternatively, implement with NIO.

Exactly what read/block guarantees does DataInputStream provide following available()

I've read the java docs and a number of related questions but am unsure if the following is guaranteed to work:
I have a DataInputStream on a dedicated thread that continually reads small amounts of data, of known byte-size, from a very active connection. I'd like to alert the user when the stream becomes inactive (i.e. network goes down) so I've implemented the following:
...
streamState = waitOnStreamForState(stream, 4);
int i = stream.readInt();
...
private static int
waitOnStreamForState(DataInputStream stream, int nBytes) throws IOException {
return waitOnStream(stream, nBytes, STREAM_ACTIVITY_THRESHOLD, STREAM_POLL_INTERVAL)
? STREAM_STATE_ACTIVE
: STREAM_STATE_INACTIVE;
private static boolean
waitOnStream(DataInputStream stream, int nBytes, long timeout, long pollInterval) throws IOException {
int timeWaitingForAvailable = 0;
while( stream.available() < nBytes ){
if( timeWaitingForAvailable >= timeout && timeout > 0 ){
return false;
}
try{
Thread.sleep(pollInterval);
}catch( InterruptedException e ){
Thread.currentThread().interrupt();
return (stream.available() >= nBytes);
}
timeWaitingForAvailable += pollInterval;
}
return true;
}
The docs for available() explain:
Returns an estimate of the number of bytes that can be read (or skipped over) from this input stream without blocking by the next caller of a method for this input stream. The next caller might be the same thread or another thread. A single read or skip of this many bytes will not block, but may read or skip fewer bytes.
Does this mean it's possible the next read (inside readInt()) might only, for instance, read 2 bytes, and the subsequent read to finish retrieving the Integer could block? I realize readInt() is a method of the stream 'called next' but I presume it has to loop on a read call until it gets 4 bytes and the docs don't mention subsequent calls. In the above example is it possible that the readInt() call could still block even if waitOnStreamForState(stream, 4) returns STREAM_STATE_ACTIVE?
(and yes, I realize my timeout mechanism is not exact)
Does this mean it's possible the next read (inside readInt()) might only, for instance, read 2 bytes, and the subsequent read to finish retrieving the Integer could block?
That's what it says. However at least the next read() won't block.
I realize readInt() is a method of the stream 'called next' but I presume it has to loop on a read call until it gets 4 bytes and the docs don't mention subsequent calls. In the above example is it possible that the readInt() call could still block even if waitOnStreamForState(stream, 4) returns STREAM_STATE_ACTIVE?
That's what it says.
For example, consider SSL. You can tell that there is data available, but you can't tell how much without actually decrpyting it, so a JSSE implementation is free to:
always return 0 from available() (this is what it used to do)
always return 1 if the underlying socket's input stream has available() > 0, otherwise zero
return the underlying socket input stream's available() value and rely on this wording to get it out of trouble if the actual plaintext data is less. (However the correct value might still be zero, if the cipher data consisted entirely of handshake messages or alerts.)
However you don't need any of this. All you need is a read timeout, set via Socket.setSoTimeout(), and a catch for SocketTimeoutException. There are few if any correct uses of available(): fewer and fewer as time goes on, it seems to me. You should certainly not waste time calling sleep().

When reading stream, why not stop iterating when 0 bytes read

The standard idiom when reading from a stream is to check for EOF (-1):
while((bytesRead = inputStream.read(buffer)) != -1)
This seems pretty standard - I checked source for popular libraries like Apache Commons and it seems to be the defacto standard.
Why do we not stop at 0 as well? Wouldn't > -1 be better? Why do whatever work is in the loop when we didn't read anything?
Basically because it would be pointless. Look at the documentation:
If the length of b is zero, then no bytes are read and 0 is returned; otherwise, there is an attempt to read at least one byte. If no byte is available because the stream is at the end of the file, the value -1 is returned; otherwise, at least one byte is read and stored into b.
So unless you're passing in an empty buffer (which is basically a bug in pretty much all cases; I personally wish the method would throw an exception in that case) the return value will never be 0. It will block for at least one byte to be read (in which case the return value will be 1 or more), or for the end of the stream to be reached (in which case the return value will be -1).

java inputstream

What is an InputStream's available() method is supposed to return when the end of the stream is reached?
The documentation doesn't specify the behavior.
..end of the stream is reached
Don't use available() for detecting end of stream! Instead look to the int returned by InputStream.read(), which:
If no byte is available because the end of the stream has been reached, the value -1 is returned.
The JavaDoc does tell you in the Returns section -
an estimate of the number of bytes that can be read (or skipped over) from this input stream without blocking or 0 when it reaches the end of the input stream.
(from InputStream JavaDoc)
Theoretically if end of stream is reached there are not bytes to read and available returns 0. But be careful with it. Not all streams provide real implementation of this method. InputStream itself always returns 0.
If you need non-blocking functionality, i.e. reading from stream without being blocked on read use NIO instead.
From the Java 7 documentation:
"an estimate of the number of bytes that can be read (or skipped over) from this input stream without blocking or 0 when it reaches the end of the input stream."
So, I would say it should return 0 in this case. That also seems the most intuitive behaviour to me.
Returns an estimate of the number of bytes that can be read (or skipped over) from this input stream without blocking by the next invocation of a method for this input stream. The next invocation might be the same thread or another thread. A single read or skip of this many bytes will not block, but may read or skip fewer bytes.
The available method for class InputStream always returns 0.
http://download.oracle.com/javase/6/docs/api/java/io/InputStream.html#available%28%29

BufferedReader ready method in a while loop to determine EOF?

I have a large file (English Wikipedia articles only database as XML files). I am reading one character at a time using BufferedReader. The pseudocode is:
file = BufferedReader...
while (file.ready())
character = file.read()
Is this actually valid? Or will ready just return false when it is waiting for the HDD to return data and not actually when the EOF has been reached? I tried to use if (file.read() == -1) but seemed to run into an infinite loop that I literally could not find.
I am just wondering if it is reading the whole file as my statistics say 444,380 Wikipedia pages have been read but I thought there were many more articles.
The Reader.ready() method is not intended to be used to test for end of file. Rather, it is a way to test whether calling read() will block.
The correct way to detect that you have reached EOF is to examine the result of a read call.
For example, if you are reading a character at a time, the read() method returns an int which will either be a valid character or -1 if you've reached the end-of-file. Thus, your code should look like this:
int character;
while ((character = file.read()) != -1) {
...
}
This is not guaranteed to read the whole input. ready() just tells you if the underlying stream has some content ready. If it is abstracting over a network socket or file, for example, it could mean that there isn't any buffered data available yet.

Categories

Resources