InputStream via ReadableByteChannel does not read to end

I have existing code that uses InputStreams, and I want to read from them faster, so I now read through a ReadableByteChannel instead.
Reading is indeed much faster with this code:
public static String readAll(InputStream is, String charset, int size) throws IOException{
try(ByteArrayOutputStream bos = new ByteArrayOutputStream()){
java.nio.ByteBuffer buffer = java.nio.ByteBuffer.allocate(size);
try(ReadableByteChannel channel = Channels.newChannel(is)){
int bytesRead = 0;
do{
bytesRead = channel.read(buffer);
bos.write(buffer.array(), 0, bytesRead);
buffer.clear();
}
while(bytesRead >= size);
}
catch(Exception ex){
ex.printStackTrace();
}
String ans = bos.toString(charset);
return ans;
}
}
The problem is that it does not always read to the end. Reading a file works fine, but when I read from a network socket (for example, to request a web page manually), it sometimes stops partway through.
What can I do to read to the end?
I don't want to use something like this:
StringBuilder result = new StringBuilder();
while(true){
int ans = is.read();
if(ans == -1) break;
result.append((char)ans);
}
return result.toString();
because this implementation is slow.
I hope you can help me with my problem. Maybe I have a mistake in my code.

This causes the problem:
... } while (bytesRead >= size);
A read from a socket may return as soon as at least one byte has been read (or even with zero bytes, in non-blocking mode). So if the OS socket buffer does not yet hold enough bytes, the condition ends the loop even though the full content has not been read. If size is the expected length of the data, keep a running total (total += bytesRead) and leave the loop when total reaches size, or when you reach end of stream, of course.

Your copy loop is completely wrong. There's no reason why bytesRead should ever be >= size, and it misbehaves at end of stream. It should be something like this:
while ((bytesRead = channel.read(buffer)) > 0)
{
bos.write(buffer.array(), 0, bytesRead);
buffer.clear();
}
with suitable adjustments for limiting the transfer to size bytes, which are non-trivial.
But layering all this over an existing InputStream cannot possibly be 'much faster' than using the InputStream directly, unless that is due to the premature termination, or unless your idea of using an InputStream is what you posted, which is horrifically slow. Try that with a BufferedInputStream.
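For reference, here is a minimal sketch of the asker's readAll with the loop condition changed to stop only at end of stream. The class name ChannelReadAll is made up for illustration, and the third parameter is treated purely as a chunk size (assumed to be greater than zero), not as the expected length:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;

// Hypothetical helper class; only the loop condition differs from the question's code.
final class ChannelReadAll {
    static String readAll(InputStream is, String charset, int bufferSize) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        ByteBuffer buffer = ByteBuffer.allocate(bufferSize);
        try (ReadableByteChannel channel = Channels.newChannel(is)) {
            // A short read is normal on a socket; only -1 signals end of stream.
            while (channel.read(buffer) != -1) {
                // position() is the number of bytes placed in the buffer by this read.
                bos.write(buffer.array(), 0, buffer.position());
                buffer.clear();
            }
        }
        // Decode once at the end so multi-byte characters split across reads stay intact.
        return bos.toString(charset);
    }
}
```

Note that this still returns only when the peer closes the connection. For a keep-alive HTTP connection you would have to stop based on Content-Length or chunked encoding instead.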

Related

difference between input.read and input.read(array, offset, length)

I'm trying to understand how inputstreams work. The following block of code is one of the many ways to read data from a text file:-
File file = new File("./src/test.txt");
InputStream input = new BufferedInputStream (new FileInputStream(file));
int data = 0;
while (data != -1) // -1 means we reached the end of the file
{
data = input.read(); // the next byte as an int (so 'a' is 97, 'b' is 98), or -1 at end of file
System.out.println(data + " " + (char) data); // prints the number, a space, then the character
}
input.close();
Now, to use input.read(byte[], offset, length), I have this code, which I got from here:
File file = new File("./src/test.txt");
InputStream input = new BufferedInputStream (new FileInputStream(file));
int totalBytesRead = 0, bytesRemaining, bytesRead;
byte[] result = new byte[ ( int ) file.length()];
while ( totalBytesRead < result.length )
{
bytesRemaining = result.length - totalBytesRead;
bytesRead = input.read ( result, totalBytesRead, bytesRemaining );
if ( bytesRead > 0 )
totalBytesRead = totalBytesRead + bytesRead;
//printing integer version of bytes read
for (int i = 0; i < bytesRead; i++)
System.out.print(result[i] + " ");
System.out.println();
//printing character version of bytes read
for (int i = 0; i < bytesRead; i++)
System.out.print((char)result[i]);
}
input.close();
I'm assuming that based on the name BYTESREAD, this read method is returning the number of bytes read. In the documentation, it says that the function will try to read as many as possible. So there might be a reason why it wouldn't.
My first question is: What are these reasons?
I could replace that entire while loop with one line of code: input.read(result, 0, result.length)
I'm sure the creator of the article thought about this. It's not about the output because I get the same output in both cases. So there has to be a reason. At least one. What is it?

The documentation of read(byte[], int, int) says that it:
Reads up to len bytes of data.
An attempt is made to read as many as len bytes
A smaller number may be read.
Since we are working with files that sit right there on our hard disk, it seems reasonable to expect that the attempt will read the whole file, but input.read(result, 0, result.length) is not guaranteed to read the whole file (the documentation says so nowhere). Relying on undocumented behavior is a source of bugs when that behavior changes.
For instance, the file stream may be implemented differently in other JVMs, some OS may impose a limit on the number of bytes that you may read at once, the file may be located on the network, or you may later use that piece of code with another stream implementation that doesn't behave that way.
Alternatively, if you are reading the whole file into an array, perhaps you could use DataInputStream.readFully.
About the loop with read(), it reads a single byte each time. That reduces performance if you are reading a big chunk of data, since each call to read() will perform several tests (has the stream ended? etc) and may ask the OS for one byte. Since you already know that you want file.length() bytes, there is no reason for not using the other more efficient forms.
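As a rough illustration of the readFully suggestion, the sketch below reads exactly length bytes and fails with an EOFException if the stream ends early. ReadFullyDemo and readExactly are names invented for this example:

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper; readFully does the "loop until the array is full" work internally.
final class ReadFullyDemo {
    static byte[] readExactly(InputStream in, int length) throws IOException {
        byte[] result = new byte[length];
        // Keeps reading until result is full; throws EOFException if the stream ends first.
        new DataInputStream(in).readFully(result);
        return result;
    }
}
```

This behaves the same whether the underlying stream hands over everything in one read or only a few bytes at a time, which is exactly the guarantee a single read(result, 0, result.length) call lacks.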

Imagine you are reading from a network socket, not from a file. In this case you don't have any information about the total number of bytes in the stream. You would allocate a buffer of fixed size and read from the stream in a loop. During one iteration of the loop you can't expect BUFFERSIZE bytes to be available in the stream, so you fill the buffer as much as possible and iterate again until the buffer is full. This can be useful if you have data blocks of fixed size, for example serialized objects.
ArrayList<MyObject> list = new ArrayList<MyObject>();
try {
InputStream input = socket.getInputStream();
byte[] buffer = new byte[1024];
int bytesRead;
int off = 0;
int len = 1024;
while(true) {
bytesRead = input.read(buffer, off, len);
if(bytesRead == len) {
list.add(createMyObject(buffer));
// reset variables
off = 0;
len = 1024;
continue;
}
if(bytesRead == -1) break;
// buffer is not full, adjust size
off += bytesRead;
len -= bytesRead;
}
} catch(IOException io) {
// stream was closed
}
P.S. The code is not tested and is only meant to show how this function can be useful.

You specify the amount of bytes to read because you might not want to read the entire file at once or maybe you couldn't or might not want to create a buffer as large as the file.

Why does setting SO_TIMEOUT cause final read of SocketInputStream to return immediately?

I'm working on a test harness that writes bytes over a socket to a server and reads the response. I had a problem where the last read of the Socket's InputStream would pause for 20 seconds. I fixed that, but don't understand why it worked.
The following method is given a java.net.SocketInputStream. The call to read(byte[], int, int) was pausing for 20 seconds on the final read, the one that returns -1, indicating end-of-stream.
private String getResponse(InputStream in) throws IOException {
StringBuffer buffer = new StringBuffer();
ByteArrayOutputStream bout = new ByteArrayOutputStream();
byte[] data = new byte[1024];
int bytesRead = 0;
while (bytesRead >= 0) {
bytesRead = in.read(data, 0, 1024); // PAUSED HERE ON LAST READ
if (bytesRead > 0) {
bout.write(data, 0, bytesRead);
}
buffer.append(new String(data));
}
return buffer.toString();
}
I was able to make the pause go away by setting SO_TIMEOUT on the socket. It doesn't seem to matter what I set it to. Even with socket.setSoTimeout(60000), the problem read in the method above returns immediately at end-of-stream.
What's happening here? Why does setting SO_TIMEOUT, even to a high value, cause the final read on the SocketInputStream to return immediately?

This sounds implausible. Setting a socket timeout shouldn't have that effect.
I think the most likely explanation is that you changed something else, and that is what has fixed the pauses. (If I was to guess, it would be that the server is now closing the socket where it wasn't doing that before.)
If this doesn't help, you will need to provide an SSCCE that other people can run to observe the effect. And tell us what platform you are using.

Does Java´s BufferedReader leaves bytes in its internal buffer after a readline() call?

I'm having a problem in my server: after I send a file of X bytes, I send a string saying that this file is finished and another file is coming, like this:
FILE: a SIZE: Y\r\n
send Y bytes
FILE a FINISHED\r\n
FILE b SIZE: Z\r\n
send Z bytes
FILE b FINISHED\r\n
FILES FINISHED\r\n
My client does not receive this properly.
I use readLine() to get the command lines after reading Y or Z bytes from the socket.
With one file it works fine; with multiple files it rarely works (yes, I don't know how it worked once or twice).
Here is some of the code I use to transfer the binary data:
public static void readInputStreamToFile(InputStream is, FileOutputStream fout,
long size, int bufferSize) throws Exception
{
byte[] buffer = new byte[bufferSize];
long curRead = 0;
long totalRead = 0;
long sizeToRead = size;
while(totalRead < sizeToRead)
{
if(totalRead + buffer.length <= sizeToRead)
{
curRead = is.read(buffer);
}
else
{
curRead = is.read(buffer, 0, (int)(sizeToRead - totalRead));
}
totalRead = totalRead + curRead;
fout.write(buffer, 0, (int) curRead);
}
}
public static void writeFileInputStreamToOutputStream(FileInputStream in, OutputStream out, int bufferSize) throws Exception
{
byte[] buffer = new byte[bufferSize];
int count = 0;
while((count = in.read(buffer)) != -1)
{
out.write(buffer, 0, count);
}
}
For the record, I was able to solve it by replacing readLine() with this code:
ByteArrayOutputStream ba = new ByteArrayOutputStream();
int ch;
while(true)
{
ch = is.read();
if(ch == -1)
throw new IOException("Connection closed");
if(ch == 13)
{
ch = is.read();
if(ch == 10)
return new String(ba.toByteArray(), "ISO-8859-1");
else
ba.write(13);
}
ba.write(ch);
}
PS: "is" is my input stream from the socket: socket.getInputStream();
I still don't know whether this is the best implementation; I'm trying to figure that out.

There are no readLine() calls in the code here, but to answer your question: yes, calling BufferedReader.readLine() might very well leave data in its internal buffer. It's buffering the input.
If you wrap one of your InputStreams in a BufferedReader, you can't expect sane behavior if you read from the BufferedReader and then later read from the InputStream directly.
You could read bytes from your InputStream and parse out a text line from that by looking for a pair of \r\n bytes. When you get a line saying "FILE: a SIZE: Y\r\n", you go on as usual, except that the buffer you used to parse lines might contain the first few bytes of your file, so write those bytes out first.
Or you use the idea of FTP and use one TCP stream for commands and one TCP stream for the actual transfer, reading from the command stream with a BufferedReader.readLine(), and reading the data as you already do with an InputStream.
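One way to sketch the single-stream variant is to send every read, text and binary alike, through the same buffered stream, so that nothing gets stranded in a second buffer. The class and method names below are invented for illustration, and the line format is assumed to be the CRLF-terminated ISO-8859-1 text shown in the question:

```java
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Hypothetical reader for a protocol that mixes CRLF-terminated lines and raw payloads.
final class MixedProtocolReader {
    // Reads one line terminated by \r\n and returns it without the terminator.
    static String readLine(DataInputStream in) throws IOException {
        ByteArrayOutputStream line = new ByteArrayOutputStream();
        int prev = -1;
        while (true) {
            int ch = in.read();
            if (ch == -1) {
                throw new EOFException("stream ended inside a line");
            }
            if (prev == '\r' && ch == '\n') {
                byte[] bytes = line.toByteArray();
                // The '\r' was already stored, so drop the last byte.
                return new String(bytes, 0, bytes.length - 1, StandardCharsets.ISO_8859_1);
            }
            line.write(ch);
            prev = ch;
        }
    }

    // Reads exactly size bytes of binary payload from the same stream.
    static byte[] readPayload(DataInputStream in, int size) throws IOException {
        byte[] payload = new byte[size];
        in.readFully(payload);
        return payload;
    }
}
```

The stream would typically be built once as new DataInputStream(new BufferedInputStream(socket.getInputStream())) and reused for the whole connection, so the single-byte reads in readLine stay cheap.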

Yes, the main point of a BufferedReader is to buffer the data. It is reading input from its underlying Reader in bigger chunks to avoid having multiple small reads.
That it has a readLine() method is just a nice bonus which is made easily possible by the buffering.
You may want to use a DataInputStream (on top of a BufferedInputStream) and its readLine() method, if you really have to mix text and binary data over the same connection - read the data from the same DataInputStream. (But take care about the encoding here.)

Call flush() on the OutputStream after you've written data that you want to be certain has been sent. So essentially at the end of each file call flush().

I guess you must flush your output stream in order to make sure any buffered bytes are properly sent down the stream. Closing the stream will equally have this process run.
The Javadocs for flush say:
Flushes this output stream and forces any buffered output bytes to be written out. The general contract of flush is that calling it is an indication that, if any bytes previously written have been buffered by the implementation of the output stream, such bytes should immediately be written to their intended destination.

Java -- How to read an unknown number of bytes from an inputStream (socket/socketServer)?

Looking to read in some bytes over a socket using an inputStream. The bytes sent by the server may be of variable quantity, and the client doesn't know in advance the length of the byte array. How may this be accomplished?
byte b[];
sock.getInputStream().read(b);
This causes a 'might not be initialized' error from NetBeans. Help.

You need to expand the buffer as needed, by reading in chunks of bytes, 1024 at a time, as in this example code I wrote some time ago:
byte[] resultBuff = new byte[0];
byte[] buff = new byte[1024];
int k = -1;
while((k = sock.getInputStream().read(buff, 0, buff.length)) > -1) {
byte[] tbuff = new byte[resultBuff.length + k]; // temp buffer size = bytes already read + bytes last read
System.arraycopy(resultBuff, 0, tbuff, 0, resultBuff.length); // copy previous bytes
System.arraycopy(buff, 0, tbuff, resultBuff.length, k); // copy current lot
resultBuff = tbuff; // call the temp buffer as your result buff
}
System.out.println(resultBuff.length + " bytes read.");
return resultBuff;

Assuming the sender closes the stream at the end of the data:
ByteArrayOutputStream baos = new ByteArrayOutputStream();
byte[] buf = new byte[4096];
while(true) {
int n = is.read(buf);
if( n < 0 ) break;
baos.write(buf,0,n);
}
byte data[] = baos.toByteArray();

Read an int, which is the size of the next segment of data being received. Create a buffer with that size, or use a roomy pre-existing buffer. Read into the buffer, making sure it is limited to the aforeread size. Rinse and repeat :)
If you really don't know the size in advance as you said, read into an expanding ByteArrayOutputStream as the other answers have mentioned. However, the size method really is the most reliable.
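A minimal sketch of the size-prefix approach, assuming both ends agree on a 4-byte big-endian length followed by the payload. LengthPrefixed, writeMessage, readMessage and the maxLength guard are all made up for this example:

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical framing helpers: each message is an int length followed by that many bytes.
final class LengthPrefixed {
    static void writeMessage(OutputStream out, byte[] payload) throws IOException {
        DataOutputStream data = new DataOutputStream(out);
        data.writeInt(payload.length); // 4-byte big-endian length prefix
        data.write(payload);
        data.flush();
    }

    static byte[] readMessage(InputStream in, int maxLength) throws IOException {
        DataInputStream data = new DataInputStream(in);
        int length = data.readInt();
        // Reject nonsense lengths before allocating; maxLength is an assumed sanity limit.
        if (length < 0 || length > maxLength) {
            throw new IOException("invalid message length: " + length);
        }
        byte[] payload = new byte[length];
        data.readFully(payload); // blocks until the whole message has arrived
        return payload;
    }
}
```

Because the reader always knows how many bytes belong to the current message, several messages can share one connection without the sender having to close the socket.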

Without re-inventing the wheel, using Apache Commons:
IOUtils.toByteArray(inputStream);
For example, complete code with error handling:
public static byte[] readInputStreamToByteArray(InputStream inputStream) {
if (inputStream == null) {
// normally, the caller should check for null after getting the InputStream object from a resource
throw new FileProcessingException("Cannot read from InputStream that is NULL. The resource requested by the caller may not exist or was not looked up correctly.");
}
try {
return IOUtils.toByteArray(inputStream);
} catch (IOException e) {
throw new FileProcessingException("Error reading input stream.", e);
} finally {
closeStream(inputStream);
}
}
private static void closeStream(Closeable closeable) {
try {
if (closeable != null) {
closeable.close();
}
} catch (Exception e) {
throw new FileProcessingException("IO Error closing a stream.", e);
}
}
Where FileProcessingException is your app-specific meaningful RT exception that will travel uninterrupted to your proper handler w/o polluting the code in between.

The simple answer is:
byte b[] = new byte[BIG_ENOUGH];
int nosRead = sock.getInputStream().read(b);
where BIG_ENOUGH is big enough.
But in general there is a big problem with this. A single read call is not guaranteed to return all that the other end has written.
If the nosRead value is BIG_ENOUGH, your application has no way of knowing for sure if there are more bytes to come; the other end may have sent exactly BIG_ENOUGH bytes ... or more than BIG_ENOUGH bytes. In the former case, your application will block (for ever) if you try to read. In the latter case, your application has to do (at least) another read to get the rest of the data.
If the nosRead value is less than BIG_ENOUGH, your application still doesn't know. It might have received everything there is, part of the data may have been delayed (due to network packet fragmentation, network packet loss, network partition, etc), or the other end might have blocked or crashed part way through sending the data.
The best answer is that EITHER your application needs to know beforehand how many bytes to expect, OR the application protocol needs to somehow tell the application how many bytes to expect or when all bytes have been sent.
Possible approaches are:
the application protocol uses fixed message sizes (not applicable to your example)
the application protocol message sizes are specified in message headers
the application protocol uses end-of-message markers
the application protocol is not message based, and the other end closes the connection to say that is the end.
Without one of these strategies, your application is left to guess, and is liable to get it wrong occasionally.
Then you use multiple read calls and (maybe) multiple buffers.

Stream all Input data into Output stream. Here is working example:
InputStream inputStream = null;
byte[] tempStorage = new byte[1024];//try to read 1Kb at time
int bLength;
try{
ByteArrayOutputStream outputByteArrayStream = new ByteArrayOutputStream();
if (fileName.startsWith("http"))
inputStream = new URL(fileName).openStream();
else
inputStream = new FileInputStream(fileName);
while ((bLength = inputStream.read(tempStorage)) != -1) {
outputByteArrayStream.write(tempStorage, 0, bLength);
}
outputByteArrayStream.flush();
//Here is the byte array at the end
byte[] finalByteArray = outputByteArrayStream.toByteArray();
outputByteArrayStream.close();
inputStream.close();
}catch(Exception e){
e.printStackTrace();
if (inputStream != null) inputStream.close();
}

Either:
Have the sender close the socket after transferring the bytes. Then at the receiver just keep reading until EOS.
Have the sender prefix a length word as per Chris's suggestion, then read that many bytes.
Use a self-describing protocol such as XML, Serialization, ...

Use BufferedInputStream, and use the available() method which returns the size of bytes available for reading, and then construct a byte[] with that size. Problem solved. :)
BufferedInputStream buf = new BufferedInputStream(is);
int size = buf.available();
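A caveat on this approach: the InputStream documentation describes available() as an estimate of the bytes that can be read without blocking, not the total that will eventually arrive. The toy stream below is purely hypothetical (it is not how sockets are implemented), but it imitates data arriving in chunks and shows available() reporting less than the full length:

```java
import java.io.InputStream;

final class AvailableDemo {
    // Hypothetical stream that pretends its data arrives in fixed-size chunks.
    static final class ChunkedStream extends InputStream {
        private final byte[] data;
        private final int chunk;
        private int pos = 0;

        ChunkedStream(byte[] data, int chunk) {
            this.data = data;
            this.chunk = chunk;
        }

        @Override
        public int read() {
            return pos < data.length ? data[pos++] & 0xFF : -1;
        }

        @Override
        public int available() {
            // Report only the rest of the current chunk, never the whole remainder.
            int untilChunkEnd = chunk - (pos % chunk);
            return Math.min(untilChunkEnd, data.length - pos);
        }
    }
}
```

With 10 bytes of data and a chunk size of 4, the first available() call reports 4, so an array sized from it would miss the remaining 6 bytes. Reading in a loop until read returns -1 does not have this problem.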

Here is a simpler example using ByteArrayOutputStream...
socketInputStream = socket.getInputStream();
int expectedDataLength = 128; //todo - set accordingly/experiment. Does not have to be precise value.
ByteArrayOutputStream baos = new ByteArrayOutputStream(expectedDataLength);
byte[] chunk = new byte[expectedDataLength];
int numBytesJustRead;
while((numBytesJustRead = socketInputStream.read(chunk)) != -1) {
baos.write(chunk, 0, numBytesJustRead);
}
return baos.toString("UTF-8");
However, if the server does not close the connection (so read never returns -1), you will need to detect the end of the data some other way - e.g., maybe the returned content always ends with a certain marker (e.g., ""), or you could possibly solve it using socket.setSoTimeout(). (Mentioning this as it seems to be a common problem.)

This is both a late answer and self-advertising, but anyone checking out this question may want to take a look here:
https://github.com/GregoryConrad/SmartSocket

This question is 7 years old, but I had a similar problem while making a system compatible with both NIO and OIO (client and server might each be whatever they want, OIO or NIO).
This was quite the challenge because of the blocking InputStreams.
I found a way that makes it possible, and I want to post it to help people with similar problems.
Reading a byte array of dynamic size is done here with a DataInputStream, which can simply be wrapped around the socket's InputStream. Also, I do not want to introduce a specific communication protocol (like first sending the number of bytes that will be sent), because I want to keep this as vanilla as possible. First off, I have a simple utility Buffer class, which looks like this:
import java.util.ArrayList;
import java.util.List;
public class Buffer {
private byte[] core;
private int capacity;
public Buffer(int size){
this.capacity = size;
clear();
}
public List<Byte> list() {
final List<Byte> result = new ArrayList<>();
for(byte b : core) {
result.add(b);
}
return result;
}
public void reallocate(int capacity) {
this.capacity = capacity;
}
public void teardown() {
this.core = null;
}
public void clear() {
core = new byte[capacity];
}
public byte[] array() {
return core;
}
}
This class exists only because of the awkward way byte <=> Byte autoboxing works in Java with this List. It is not really needed in this example, but I did not want to leave anything out of the explanation.
Next up are the two simple core methods. In them, a StringBuilder is used as a "callback": it is filled with the data that was read, and the number of bytes read is returned. This could of course be done differently.
private int readNext(StringBuilder stringBuilder, Buffer buffer) throws IOException {
// Attempt to read up to the buffers size
int read = in.read(buffer.array());
// If EOF is reached (-1 read)
// we disconnect, because the
// other end disconnected.
if(read == -1) {
disconnect();
return -1;
}
// Add the read byte[] as
// a String to the stringBuilder.
stringBuilder.append(new String(buffer.array()).trim());
buffer.clear();
return read;
}
private Optional<String> readBlocking() throws IOException {
final Buffer buffer = new Buffer(256);
final StringBuilder stringBuilder = new StringBuilder();
// This call blocks. Therefor
// if we continue past this point
// we WILL have some sort of
// result. This might be -1, which
// means, EOF (disconnect.)
if(readNext(stringBuilder, buffer) == -1) {
return Optional.empty();
}
while(in.available() > 0) {
buffer.reallocate(in.available());
if(readNext(stringBuilder, buffer) == -1) {
return Optional.empty();
}
}
buffer.teardown();
return Optional.of(stringBuilder.toString());
}
The first method, readNext, fills the buffer with bytes from the DataInputStream and returns the number of bytes read.
In the second method, readBlocking, I rely on the blocking nature of the read so that I need not worry about producer-consumer problems. readBlocking simply blocks until new bytes are received. Before we call this blocking method, we allocate a buffer size. Note that I call reallocate after the first read (inside the while loop). This is not needed; you can safely delete this line and the code will still work. I did it because of the particulars of my problem.
The two things I did not explain in more detail are:
1. in (the DataInputStream, and the only short variable name here, sorry for that)
2. disconnect (your disconnect routine)
All in all, you can now use it this way:
// in has to be a field, or a parameter of the readBlocking method
DataInputStream in = new DataInputStream(socket.getInputStream());
final Optional<String> rawDataOptional = readBlocking();
rawDataOptional.ifPresent(string -> threadPool.execute(() -> handle(string)));
This gives you a way of reading byte arrays of any size over a socket (or any InputStream, really). Hope this helps!

java servlet serving a file over HTTP connection

I have the following code (the server is Tomcat on Linux).
// Send the local file over the current HTTP connection
FileInputStream fin = new FileInputStream(sendFile);
int readBlockSize;
int totalBytes=0;
while ((readBlockSize=fin.available())>0) {
byte[] buffer = new byte[readBlockSize];
fin.read(buffer, 0, readBlockSize);
outStream.write(buffer, 0, readBlockSize);
totalBytes+=readBlockSize;
}
With some files of type 3gp, when I attach the debugger, the line
outStream.write(buffer, 0, readBlockSize);
breaks out of the while loop with the following error:
ApplicationFilterChain.internalDoFilter(ServletRequest, ServletResponse) line:299
And the file is not served.
Any clues?
Thanks
A.K.

You can't guarantee that InputStream.read(byte[], int, int) will actually read the desired number of bytes: it may read less. Even your call to available() will not provide that guarantee. You should use the return value from fin.read to find out how many bytes were actually read and only write that many to the output.
I would guess that the problem you see could be related to this. If the block read is less than the available size then your buffer will be partially filled and that will cause problems when you write too many bytes to the output.
Also, don't allocate a new array every time through the loop! That will result in a huge number of needless memory allocations that will slow your code down, and will potentially cause an OutOfMemoryError if available() returns a large number.
Try this:
int size;
int totalBytes = 0;
byte[] buffer = new byte[BUFFER_SIZE];
while ((size = fin.read(buffer, 0, BUFFER_SIZE)) != -1) {
outStream.write(buffer, 0, size);
totalBytes += size;
}
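Wrapped up as a self-contained method, the same loop might look like the sketch below. StreamCopy and the 8192-byte BUFFER_SIZE are assumptions for this example, not part of the original servlet:

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper wrapping the copy loop above.
final class StreamCopy {
    private static final int BUFFER_SIZE = 8192; // assumed value; any moderate size works

    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[BUFFER_SIZE]; // allocated once and reused
        long totalBytes = 0;
        int size;
        // read may return fewer bytes than requested; -1 means end of stream.
        while ((size = in.read(buffer, 0, buffer.length)) != -1) {
            out.write(buffer, 0, size); // write only what was actually read
            totalBytes += size;
        }
        return totalBytes;
    }
}
```

In the servlet it could be called as StreamCopy.copy(fin, outStream), ideally with fin opened in a try-with-resources block so the file is closed even if the client disconnects mid-transfer.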

Avoiding these types of problems is why I start with Commons IO. If that's an option, your code would be as follows.
FileInputStream fin = new FileInputStream(sendFile);
int totalBytes = IOUtils.copy(fin, outStream);
No need reinventing the wheel.

It is possible that the read() call returns fewer bytes than you requested. This means you need to pass the return value of read() as the length argument to write():
int bytesRead = fin.read(buffer, 0, readBlockSize);
outStream.write(buffer, 0, bytesRead);
Apart from this, it is better to pre-allocate a buffer and reuse it (your code could try to allocate a 2 GB buffer if your file is large :-)):
byte[] buffer = new byte[4096]; // define a constant for this max length
while ((readBlockSize = fin.available()) > 0) {
if (4096 < readBlockSize) {
readBlockSize = 4096;
}
int bytesRead = fin.read(buffer, 0, readBlockSize);
outStream.write(buffer, 0, bytesRead);
}
