Reading stream over TCP on a SocketChannel with undefined number of Bytes

I am trying to read a stream from a SocketChannel without knowing the number of bytes in advance.
The alternative I thought of is to store ByteBuffers of a predefined size in a list, which lets me afterwards allocate a new ByteBuffer of the received size and put the result inside.
The problem is that I am in blocking mode and cannot find a valid condition to leave the loop around the read method. Here is the code:
public static final Charset charsetUTF8 = Charset.forName("UTF-8");
public static final int BUFFER_SIZE = 1024;
public static String getUnbounded(String st, SocketAddress address) throws IOException {
SocketChannel sc = SocketChannel.open(address);
sc.write(charsetUTF8.encode(st));
List<ByteBuffer> listBuffers = new ArrayList<>();
ByteBuffer buff = ByteBuffer.allocate(BUFFER_SIZE);
while( sc.read(buff) > -1){
if(buff.remaining() == 0){
listBuffers.add(buff);
buff.clear();
}
}
listBuffers.add(buff);
ByteBuffer finalBuffer = ByteBuffer.allocate(BUFFER_SIZE * listBuffers.size());
for(ByteBuffer tempBuff: listBuffers){
finalBuffer.put(tempBuff);
tempBuff.clear();
}
finalBuffer.flip();
return charsetUTF8.decode(finalBuffer).toString();
}
Any idea on how to solve this?

You can't just clear() the byte buffer. You need to allocate a new one; otherwise the same buffer is being added to listBuffers repeatedly.
ByteBuffer buff = ByteBuffer.allocate(BUFFER_SIZE);
while( sc.read(buff) > -1){
if(buff.remaining() == 0){
listBuffers.add(buff);
buff = ByteBuffer.allocate(BUFFER_SIZE);
}
}
if (buff.position() > 0) {
listBuffers.add(buff);
}
Since the last buffer might not (probably will not) be full, you should calculate the finalBuffer size taking this into account.
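As a rough illustration of sizing the final buffer by the bytes actually received, here is a minimal sketch. The class name ChannelReader and the method readUntilEos are made up for this example, and it takes a ReadableByteChannel (which SocketChannel implements) so that it can be tried without a network connection. It still relies on the peer signalling end of stream.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the original question or answer.
class ChannelReader {
    static final int BUFFER_SIZE = 1024;

    // Reads until end of stream and returns a buffer holding exactly the bytes received.
    static ByteBuffer readUntilEos(ReadableByteChannel channel) throws IOException {
        List<ByteBuffer> buffers = new ArrayList<>();
        ByteBuffer buff = ByteBuffer.allocate(BUFFER_SIZE);
        int total = 0;
        int n;
        while ((n = channel.read(buff)) > -1) {
            total += n;
            if (!buff.hasRemaining()) {
                buffers.add(buff);
                // Allocate a fresh buffer; never reuse one that is already in the list.
                buff = ByteBuffer.allocate(BUFFER_SIZE);
            }
        }
        if (buff.position() > 0) {
            buffers.add(buff); // last buffer, usually only partly filled
        }
        // Size the result by the bytes received, not by BUFFER_SIZE * buffers.size().
        ByteBuffer result = ByteBuffer.allocate(total);
        for (ByteBuffer b : buffers) {
            b.flip(); // switch to reading so put() copies only the filled part
            result.put(b);
        }
        result.flip();
        return result;
    }

    public static void main(String[] args) throws IOException {
        byte[] payload = new byte[2500]; // stand-in for data arriving on a socket
        ReadableByteChannel channel = Channels.newChannel(new ByteArrayInputStream(payload));
        System.out.println(readUntilEos(channel).remaining() + " bytes read");
    }
}
```

Counting the bytes while reading avoids a second pass over the list just to compute the size.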

The number of bytes in an HTTP response stream is not 'undefined'. See the RFC. It is defined by either:
EOS in the case of a connection which is closed (HTTP 1.0 or Connection: close),
The Content-Length header, or
The result of decoding the chunked-encoding format.
It is essential that it be defined in one of these ways, and maybe there are others, so that HTTP persistent connections can work, where there may be another response following this one.
I would like to know why you are implementing this at all, when the HttpURLConnection class already exists, along with various third-party HTTP clients, which already implement all this correctly, and many other things besides.

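The solution was that, to get out of the loop, I had to call:
sc.shutdownOutput();
This closes the writing side of the connection without closing the reading side. The peer then sees end of stream on its side; once it finishes responding and closes the connection, sc.read(buff) returns -1.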

Related

DataInputStream read not blocking

First off, forgive me if I am mistaken about how blocking works. To my understanding, blocking pauses the thread until data is ready; for example, when reading user input the program waits until the user hits return.
My problem is that instead of waiting for data to become available, it reads bytes with the value 0. Is there a way to block until data becomes available?
The method readBytes is called in a loop.
public byte[] readBytes(){
try{
//read the head int that will be 4 bytes telling the number of bytes that follow containing data
byte[] rawLen = new byte[4];
socketReader.read(rawLen);
ByteBuffer bb = ByteBuffer.wrap(rawLen);
int len = bb.getInt();
byte[] data = new byte[len];
if (len > 0) {
socketReader.readFully(data);
}
return data;
} catch (Exception e){
e.printStackTrace();
logError("Failed to read data: " + socket.toString());
return null;
}
}

If read() returned -1, the peer has disconnected. You aren't handling that case. If you detect end of stream you must close the connection and stop reading. At present you have no way of doing so. You need to reconsider your method signature.
You should use readInt() instead of those four lines of code that read the length. At present you are assuming you have read four bytes without actually checking. readInt() will check for you.
This way you will also never get out of sync with the sender, which at present is a serious risk.
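A minimal sketch of what that could look like, assuming the same framing as in the question (a 4-byte length followed by the payload). The class name FramedReader and the choice to return null at end of stream are assumptions made for this example; a different signature, such as returning an Optional or throwing, would work as well.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;

// Hypothetical helper, not part of the original answer.
class FramedReader {
    // Returns the next length-prefixed message, or null once the peer has closed the stream.
    // Note: readInt() also throws EOFException if the stream ends after 1 to 3 header bytes;
    // this sketch treats that the same as a clean close.
    static byte[] readMessage(DataInputStream in) throws IOException {
        int len;
        try {
            len = in.readInt(); // blocks until all four bytes have arrived
        } catch (EOFException endOfStream) {
            return null; // caller should close the connection and stop reading
        }
        if (len < 0) {
            throw new IOException("Negative message length: " + len);
        }
        byte[] data = new byte[len];
        in.readFully(data); // blocks until len bytes are read, throws EOFException if truncated
        return data;
    }

    public static void main(String[] args) throws IOException {
        // Build two framed messages in memory as a stand-in for a socket.
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bos);
        out.writeInt(3);
        out.write(new byte[] {1, 2, 3});
        out.writeInt(0);
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(bos.toByteArray()));
        byte[] message;
        while ((message = readMessage(in)) != null) {
            System.out.println("message of " + message.length + " bytes");
        }
    }
}
```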

How can I make sure I received whole file through socket stream?

OK, so I'm making a Java program that has a server and a client, and I'm sending a ZIP file from the server to the client. I have the sending side working, almost. But on the receiving side I've found some inconsistency: my code isn't always getting the full archive. I'm guessing it's terminating before the BufferedReader has the full thing. Here's the code for the client:
public void run(String[] args) {
try {
clientSocket = new Socket("jacob-custom-pc", 4444);
out = new PrintWriter(clientSocket.getOutputStream(), true);
in = new BufferedInputStream(clientSocket.getInputStream());
BufferedReader inRead = new BufferedReader(new InputStreamReader(in));
int size = 0;
while(true) {
if(in.available() > 0) {
byte[] array = new byte[in.available()];
in.read(array);
System.out.println(array.length);
System.out.println("recieved file!");
FileOutputStream fileOut = new FileOutputStream("out.zip");
fileOut.write(array);
fileOut.close();
break;
}
}
}
} catch(IOException e) {
e.printStackTrace();
System.exit(-1);
}
}
So how can I be sure the full archive is there before it writes the file?

On the sending side, write the file size before you start writing the file. On the reading side, read the file size so you know how many bytes to expect. Then call read until you have received everything you expect. With network sockets it may take more than one call to read to get everything that was sent. This is especially true as your data gets larger.

HTTP sends a Content-Length: x header line, giving the length in bytes. This is elegant; a read may throw a timeout exception if the connection is broken.

You are using a TCP socket. The ZIP file is probably larger than the network MTU, so it will be split up into multiple packets and reassembled at the other side. Still, something like this might happen:
client connects
server starts sending. The ZIP file is bigger than the MTU and therefore split up into multiple packets.
client busy-waits in the while (true) until it gets the first packets.
client notices that data has arrived (in.available() > 0)
client reads all available data, writes it to the file and exits
the last packets arrive
So as you can see: Unless the client machine is crazily slow and the network is crazily fast and has a huge MTU, your code simply won't receive the entire file by design. That's how you built it.
A different approach: Prefix the data with the length.
Socket clientSocket = new Socket("jacob-custom-pc", 4444);
DataInputStream dataReader = new DataInputStream(clientSocket.getInputStream());
FileOutputStream out = new FileOutputStream("out.zip");
// The sender writes the file length as a long before the file contents.
long size = dataReader.readLong();
long chunks = size / 1024;
int lastChunk = (int)(size - (chunks * 1024));
byte[] buf = new byte[1024];
for (long i = 0; i < chunks; i++) {
    // readFully blocks until the whole buffer is filled; a plain read may return fewer bytes.
    dataReader.readFully(buf);
    out.write(buf);
}
dataReader.readFully(buf, 0, lastChunk);
out.write(buf, 0, lastChunk);
out.close();
And the server uses DataOutputStream to send the size of the file before the actual file. I didn't test this, but it should work.
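For completeness, here is a sketch of what the sending side described above might look like, together with a small receiver so the round trip can be tried in memory. The class and method names (LengthPrefixedTransfer, send, receive) are invented for this example, and like the code above it assumes an 8-byte length written with writeLong.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper, not part of the original answer.
class LengthPrefixedTransfer {
    // Sender: write the length first, then the content.
    static void send(byte[] content, OutputStream rawOut) throws IOException {
        DataOutputStream out = new DataOutputStream(rawOut);
        out.writeLong(content.length);
        out.write(content);
        out.flush();
    }

    // Receiver: read the length, then copy exactly that many bytes to dest.
    static long receive(InputStream rawIn, OutputStream dest) throws IOException {
        DataInputStream in = new DataInputStream(rawIn);
        long size = in.readLong();
        byte[] buf = new byte[1024];
        long remaining = size;
        while (remaining > 0) {
            // Never ask for more than is left, so bytes after this file stay in the stream.
            int n = in.read(buf, 0, (int) Math.min(buf.length, remaining));
            if (n == -1) {
                throw new EOFException("Stream ended with " + remaining + " bytes missing");
            }
            dest.write(buf, 0, n);
            remaining -= n;
        }
        return size;
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream wire = new ByteArrayOutputStream(); // stand-in for the socket
        send(new byte[5000], wire);
        ByteArrayOutputStream file = new ByteArrayOutputStream(); // stand-in for out.zip
        long size = receive(new ByteArrayInputStream(wire.toByteArray()), file);
        System.out.println("announced " + size + ", received " + file.size());
    }
}
```

Because the receiver counts down the remaining bytes, the same connection could carry further data after the file.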

How can I make sure I received whole file through socket stream?
By fixing your code. You are using InputStream.available() as a test for end of stream. That's not what it's for. Change your copy loop to this, which is also a whole lot simpler:
while ((count = in.read(buffer)) > 0)
{
out.write(buffer, 0, count);
}
Use with any buffer size greater than zero, typically 8192.

A result of 0 from in.available() just tells you that there is no data that in.read() could consume without blocking (waiting) at the moment; it does not mean the end of the stream. More data may arrive at your PC at any time, with the next TCP/IP packet. Normally, you never use in.available(). in.read() is all you need to read the stream entirely. The pattern for reading input streams is
byte[] buf = new byte[8192];
int size;
while ((size = in.read(buf)) != -1)
process(buf, size);
// end of stream has been reached
This way you will read the stream entirely, until its end.
Update: If you want to read multiple files, chunk your stream into "packets" and prefix each one with an integer size. You then read until size bytes have been received instead of until in.read returns -1.
Update 2: In any case, never use in.available() to mark the boundaries between chunks of data. If you do that, you assume there is a time delay between incoming pieces of data. You can only rely on that in real-time systems, and Windows, Java and TCP/IP are all layers that are incompatible with real-time.

Is Socket.getInputStream().read(byte[]) guaranteed to not block after at least some data is read?

The JavaDoc for the class InputStream says the following:
Reads up to len bytes of data from the input stream into an array of
bytes. An attempt is made to read as many as len bytes, but a smaller
number may be read. The number of bytes actually read is returned as
an integer. This method blocks until input data is available, end of
file is detected, or an exception is thrown.
This corresponds to my experience as well. See for instance the example code below:
Client:
Socket socket = new Socket("localhost", PORT);
OutputStream out = socket.getOutputStream();
byte[] b = { 0, 0 };
Thread.sleep(5000);
out.write(b);
Thread.sleep(5000);
out.write(b);
Server:
ServerSocket server = new ServerSocket(PORT);
Socket socket = server.accept();
InputStream in = socket.getInputStream();
byte[] buffer = new byte[4];
System.out.println(in.read(buffer));
System.out.println(in.read(buffer));
Output:
2 // Two bytes read five seconds after Client is started.
2 // Two bytes read ten seconds after Client is started.
The first call to read(buffer) blocks until input data is available. However the method returns after two bytes are read, even though there is still room in the byte buffer, which corresponds with the JavaDoc stating that 'An attempt is made to read as many as len bytes, but a smaller number may be read'. However, is it guaranteed that the method will not block once at least one byte of data is read when the input stream comes from a socket?
The reason I ask is that I saw the following code in the small Java web server NanoHTTPD, and I wondered if an HTTP request smaller than 8k bytes (which most requests are) could potentially make the thread block indefinitely unless there is a guarantee that it won't block once some data is read.
InputStream is = mySocket.getInputStream();
// Read the first 8192 bytes. The full header should fit in here.
byte[] buf = new byte[8192];
int rlen = is.read(buf, 0, bufsize);
Edit:
Let me try to illustrate once more with a relatively similar code example. EJP says that the method blocks until either EOS is signalled or at least one byte of data has arrived, in which case it reads however many bytes of data have arrived, without blocking again, and returns that number, which corresponds to the JavaDoc for method read(byte[], int, int) in the class InputStream. However, if one actually looks at the source code it is clear that the method indeed blocks until the buffer is full. I've tested it by using the same Client as above and copying the InputStream-code to a static method in my server example.
public static void main(String[] args) throws Exception {
ServerSocket server = new ServerSocket(PORT);
Socket socket = server.accept();
InputStream in = socket.getInputStream();
byte[] buffer = new byte[4];
System.out.println(read(in, buffer, 0, buffer.length));
}
public static int read(InputStream in, byte b[], int off, int len) throws IOException {
if (b == null) {
throw new NullPointerException();
}
else if (off < 0 || len < 0 || len > b.length - off) {
throw new IndexOutOfBoundsException();
}
else if (len == 0) {
return 0;
}
int c = in.read();
if (c == -1) {
return -1;
}
b[off] = (byte)c;
int i = 1;
try {
for (; i < len; i++) {
c = in.read();
if (c == -1) {
break;
}
b[off + i] = (byte)c;
}
}
catch (IOException ee) {
}
return i;
}
This code will have as its output:
4 // Four bytes read ten seconds after Client is started.
Now clearly there is data available after 5 seconds, however the method still blocks trying to fill the entire buffer. This doesn't seem to be the case with the input stream that Socket.getInputStream() returns, but is it guaranteed that it will never block once data is available, like the JavaDoc says but not like the source code shows?

However, is it guaranteed that the method will not block once at least one byte of data is read when the input stream comes from a socket?
I don't think this question means anything. The method blocks until either EOS is signalled or at least one byte of data has arrived, in which case it reads however many bytes of data have arrived, without blocking again, and returns that number.
I saw the following code in the small Java web server NanoHTTPD
The code is wrong. It makes the invalid assumption that the entire header will be delivered in the first read. I would expect to see a loop here, that loops until a blank line is detected.
I wondered if an HTTP request smaller than 8k bytes (which most requests are) could potentially make the thread block indefinitely unless there is a guarantee that it won't block once some data is read.
Again I don't think this means anything. The method will block until at least one byte has arrived, or EOS. Period.
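The loop the answer asks for, one that keeps reading until a blank line is detected, could be sketched roughly as follows. This is not NanoHTTPD's code: the class name HeaderReader, the maxBytes limit and the use of ISO-8859-1 for decoding are all assumptions made for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not taken from NanoHTTPD.
class HeaderReader {
    private static final byte[] TERMINATOR = {'\r', '\n', '\r', '\n'};

    // Reads up to and including the blank line (CRLF CRLF) that ends an HTTP header block.
    // Pass a BufferedInputStream and keep using that same stream for the body,
    // since reading a raw socket stream one byte at a time is slow.
    static String readHeader(InputStream in, int maxBytes) throws IOException {
        ByteArrayOutputStream header = new ByteArrayOutputStream();
        int matched = 0; // how many terminator bytes have been matched so far
        int c;
        while ((c = in.read()) != -1) {
            header.write(c);
            if (c == TERMINATOR[matched]) {
                matched++;
                if (matched == TERMINATOR.length) {
                    return new String(header.toByteArray(), StandardCharsets.ISO_8859_1);
                }
            } else {
                // A stray '\r' may itself be the start of the terminator.
                matched = (c == '\r') ? 1 : 0;
            }
            if (header.size() >= maxBytes) {
                throw new IOException("Header larger than " + maxBytes + " bytes");
            }
        }
        throw new EOFException("Stream ended before the blank line");
    }

    public static void main(String[] args) throws IOException {
        byte[] request = "GET / HTTP/1.1\r\nHost: example\r\n\r\nbody".getBytes(StandardCharsets.ISO_8859_1);
        System.out.print(readHeader(new ByteArrayInputStream(request), 8192));
    }
}
```

However many reads the header is spread across, the loop only returns once the blank line has arrived, and it never consumes bytes that belong to the body.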

Unknown buffer size to be read from a DataInputStream in java

I have the following statement:
DataInputStream is = new DataInputStream(process.getInputStream());
I would like to print the contents of this input stream but I don't know the size of this stream. How should I read this stream and print it?

It is common to all streams that the length is not known in advance. With a standard InputStream, the usual solution is to simply call read until -1 is returned.
But I assume that you have wrapped a standard InputStream in a DataInputStream for a good reason: to parse binary data. (Note: Scanner is for textual data only.)
The JavaDoc for DataInputStream shows that this class has two different ways to indicate EOF: each method either returns -1 or throws an EOFException. A rule of thumb is:
Every method that is inherited from InputStream uses the "return -1" convention,
Every method NOT inherited from InputStream throws the EOFException.
If you use readShort, for example, read until an exception is thrown; if you use read(), do so until -1 is returned.
Tip: Be very careful in the beginning and look up each method you use from DataInputStream, since a rule of thumb can break.
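The two conventions can be seen side by side in a small sketch. The class EofConventions and its method names are made up for this example, and an in-memory stream stands in for the process output.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;

// Hypothetical demo class, not part of the original answer.
class EofConventions {
    // read() comes from InputStream: end of stream is signalled by returning -1.
    static int countBytes(DataInputStream in) throws IOException {
        int count = 0;
        while (in.read() != -1) {
            count++;
        }
        return count;
    }

    // readShort() comes from DataInput: end of stream is signalled by an EOFException.
    // A trailing odd byte also ends in EOFException and is silently dropped here.
    static int countShorts(DataInputStream in) throws IOException {
        int count = 0;
        try {
            while (true) {
                in.readShort();
                count++;
            }
        } catch (EOFException endOfStream) {
            return count;
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = {0, 1, 0, 2, 0, 3};
        System.out.println(countBytes(new DataInputStream(new ByteArrayInputStream(data))) + " bytes");
        System.out.println(countShorts(new DataInputStream(new ByteArrayInputStream(data))) + " shorts");
    }
}
```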

Call is.read(byte[]) repeatedly, passing a pre-allocated buffer (you can keep reusing the same buffer). The function will return the number of bytes actually read, or -1 at the end of the stream (in which case, stop):
byte[] buf = new byte[8192];
int nread;
while ((nread = is.read(buf)) >= 0) {
// process the first `nread` bytes of `buf`
}

byte[] buffer = new byte[100];
int numberRead = 0;
do{
numberRead = is.read(buffer);
if (numberRead != -1){
// do work here
}
}while (numberRead != -1);
Keep reading into a fixed-size buffer in a loop. A return value smaller than the buffer size does not mean you have reached the end of the stream, since read may return fewer bytes than requested; only a return value of -1 signals the end of the stream, in which case there is no data in the buffer.
DataInputStream.read

DataInputStream is something obsolete. I recommend you to use Scanner instead.
Scanner sc = new Scanner (process.getInputStream());
while (sc.hasNextXxx()) {
System.out.println(sc.nextXxx());
}

Java -- How to read an unknown number of bytes from an inputStream (socket/socketServer)?

Looking to read in some bytes over a socket using an inputStream. The bytes sent by the server may be of variable quantity, and the client doesn't know in advance the length of the byte array. How may this be accomplished?
byte b[];
sock.getInputStream().read(b);
This causes a 'might not be initialized' error from NetBeans. Help.

You need to expand the buffer as needed, by reading in chunks of bytes, 1024 at a time as in this example code I wrote some time ago
byte[] resultBuff = new byte[0];
byte[] buff = new byte[1024];
int k = -1;
while((k = sock.getInputStream().read(buff, 0, buff.length)) > -1) {
byte[] tbuff = new byte[resultBuff.length + k]; // temp buffer size = bytes already read + bytes last read
System.arraycopy(resultBuff, 0, tbuff, 0, resultBuff.length); // copy previous bytes
System.arraycopy(buff, 0, tbuff, resultBuff.length, k); // copy current lot
resultBuff = tbuff; // call the temp buffer as your result buff
}
System.out.println(resultBuff.length + " bytes read.");
return resultBuff;

Assuming the sender closes the stream at the end of the data:
ByteArrayOutputStream baos = new ByteArrayOutputStream();
byte[] buf = new byte[4096];
while(true) {
int n = is.read(buf);
if( n < 0 ) break;
baos.write(buf,0,n);
}
byte data[] = baos.toByteArray();

Read an int, which is the size of the next segment of data being received. Create a buffer with that size, or use a roomy pre-existing buffer. Read into the buffer, making sure it is limited to the aforeread size. Rinse and repeat :)
If you really don't know the size in advance as you said, read into an expanding ByteArrayOutputStream as the other answers have mentioned. However, the size method really is the most reliable.

Without re-inventing the wheel, using Apache Commons:
IOUtils.toByteArray(inputStream);
For example, complete code with error handling:
public static byte[] readInputStreamToByteArray(InputStream inputStream) {
if (inputStream == null) {
// normally, the caller should check for null after getting the InputStream object from a resource
throw new FileProcessingException("Cannot read from InputStream that is NULL. The resource requested by the caller may not exist or was not looked up correctly.");
}
try {
return IOUtils.toByteArray(inputStream);
} catch (IOException e) {
throw new FileProcessingException("Error reading input stream.", e);
} finally {
closeStream(inputStream);
}
}
private static void closeStream(Closeable closeable) {
try {
if (closeable != null) {
closeable.close();
}
} catch (Exception e) {
throw new FileProcessingException("IO Error closing a stream.", e);
}
}
Where FileProcessingException is your app-specific, meaningful runtime exception that will travel uninterrupted to your proper handler without polluting the code in between.
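If adding a dependency is not an option, Java 9 and later ship InputStream.readAllBytes(), which covers the same case in the JDK itself. A minimal sketch follows; the wrapper class ReadAll is made up for this example. Like IOUtils.toByteArray, it only returns once the sender has closed the stream.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical wrapper, shown only to illustrate the JDK call.
class ReadAll {
    // Reads everything up to end of stream and closes the stream afterwards (requires Java 9+).
    static byte[] readAll(InputStream inputStream) throws IOException {
        try (InputStream in = inputStream) {
            return in.readAllBytes();
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = readAll(new ByteArrayInputStream(new byte[] {1, 2, 3}));
        System.out.println(data.length + " bytes read");
    }
}
```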

The simple answer is:
byte b[] = new byte[BIG_ENOUGH];
int nosRead = sock.getInputStream().read(b);
where BIG_ENOUGH is big enough.
But in general there is a big problem with this. A single read call is not guaranteed to return all that the other end has written.
If the nosRead value is BIG_ENOUGH, your application has no way of knowing for sure if there are more bytes to come; the other end may have sent exactly BIG_ENOUGH bytes ... or more than BIG_ENOUGH bytes. In the former case, your application will block (forever) if you try to read. In the latter case, your application has to do (at least) another read to get the rest of the data.
If the nosRead value is less than BIG_ENOUGH, your application still doesn't know. It might have received everything there is, part of the data may have been delayed (due to network packet fragmentation, network packet loss, network partition, etc), or the other end might have blocked or crashed part way through sending the data.
The best answer is that EITHER your application needs to know beforehand how many bytes to expect, OR the application protocol needs to somehow tell the application how many bytes to expect or when all bytes have been sent.
Possible approaches are:
the application protocol uses fixed message sizes (not applicable to your example)
the application protocol message sizes are specified in message headers
the application protocol uses end-of-message markers
the application protocol is not message based, and the other end closes the connection to say that is the end.
Without one of these strategies, your application is left to guess, and is liable to get it wrong occasionally.
Then you use multiple read calls and (maybe) multiple buffers.
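As one illustration of the end-of-message marker approach from the list above, here is a minimal sketch that reads up to a delimiter byte. The class name DelimitedReader and the choice of a newline as marker are assumptions for this example; a real protocol has to guarantee that the marker never occurs inside a message (or escape it).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the original answer.
class DelimitedReader {
    // Returns the bytes up to (not including) the delimiter,
    // or null if the stream ended cleanly between messages.
    static byte[] readUntil(InputStream in, int delimiter) throws IOException {
        ByteArrayOutputStream message = new ByteArrayOutputStream();
        int c;
        while ((c = in.read()) != -1) {
            if (c == delimiter) {
                return message.toByteArray();
            }
            message.write(c);
        }
        if (message.size() == 0) {
            return null; // peer closed the connection after a complete message
        }
        throw new EOFException("Stream ended inside a message");
    }

    public static void main(String[] args) throws IOException {
        // In-memory stand-in for a socket stream; wrap a real one in a BufferedInputStream.
        InputStream in = new ByteArrayInputStream("hello\nworld\n".getBytes(StandardCharsets.UTF_8));
        byte[] message;
        while ((message = readUntil(in, '\n')) != null) {
            System.out.println(new String(message, StandardCharsets.UTF_8));
        }
    }
}
```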

Stream all input data into an output stream. Here is a working example:
InputStream inputStream = null;
byte[] tempStorage = new byte[1024];//try to read 1Kb at time
int bLength;
try{
ByteArrayOutputStream outputByteArrayStream = new ByteArrayOutputStream();
if (fileName.startsWith("http"))
inputStream = new URL(fileName).openStream();
else
inputStream = new FileInputStream(fileName);
while ((bLength = inputStream.read(tempStorage)) != -1) {
outputByteArrayStream.write(tempStorage, 0, bLength);
}
outputByteArrayStream.flush();
//Here is the byte array at the end
byte[] finalByteArray = outputByteArrayStream.toByteArray();
outputByteArrayStream.close();
inputStream.close();
}catch(Exception e){
e.printStackTrace();
if (inputStream != null) inputStream.close();
}

Either:
Have the sender close the socket after transferring the bytes. Then at the receiver just keep reading until EOS.
Have the sender prefix a length word as per Chris's suggestion, then read that many bytes.
Use a self-describing protocol such as XML, Serialization, ...

Use BufferedInputStream, and use the available() method which returns the size of bytes available for reading, and then construct a byte[] with that size. Problem solved. :)
BufferedInputStream buf = new BufferedInputStream(is);
int size = buf.available();

Here is a simpler example using ByteArrayOutputStream...
socketInputStream = socket.getInputStream();
int expectedDataLength = 128; //todo - set accordingly/experiment. Does not have to be precise value.
ByteArrayOutputStream baos = new ByteArrayOutputStream(expectedDataLength);
byte[] chunk = new byte[expectedDataLength];
int numBytesJustRead;
while((numBytesJustRead = socketInputStream.read(chunk)) != -1) {
baos.write(chunk, 0, numBytesJustRead);
}
return baos.toString("UTF-8");
However, if the server does not return a -1, you will need to detect the end of the data some other way - e.g., maybe the returned content always ends with a certain marker (e.g., ""), or you could possibly solve it using socket.setSoTimeout(). (Mentioning this as it seems to be a common problem.)

This is both a late answer and self-advertising, but anyone checking out this question may want to take a look here:
https://github.com/GregoryConrad/SmartSocket

This question is 7 years old, but I had a similar problem while making a NIO- and OIO-compatible system (client and server might be whatever they want, OIO or NIO).
This was quite the challenge, because of the blocking InputStreams.
I found a way that makes it possible, and I want to post it to help people with similar problems.
Reading a byte array of dynamic size is done here with the DataInputStream, which can simply be wrapped around the socket's InputStream. Also, I do not want to introduce a specific communication protocol (like first sending the number of bytes that will be sent), because I want to make this as vanilla as possible. First off, I have a simple utility Buffer class, which looks like this:
import java.util.ArrayList;
import java.util.List;
public class Buffer {
private byte[] core;
private int capacity;
public Buffer(int size){
this.capacity = size;
clear();
}
public List<Byte> list() {
final List<Byte> result = new ArrayList<>();
for(byte b : core) {
result.add(b);
}
return result;
}
public void reallocate(int capacity) {
this.capacity = capacity;
}
public void teardown() {
this.core = null;
}
public void clear() {
core = new byte[capacity];
}
public byte[] array() {
return core;
}
}
This class only exists because of the clumsy way byte <=> Byte autoboxing in Java works with this List. It is not really needed in this example, but I did not want to leave anything out of this explanation.
Next up are the two simple core methods. In those, a StringBuilder is used as a "callback": it is filled with the result that has been read, and the number of bytes read is returned. This could of course be done differently.
private int readNext(StringBuilder stringBuilder, Buffer buffer) throws IOException {
// Attempt to read up to the buffers size
int read = in.read(buffer.array());
// If EOF is reached (-1 read)
// we disconnect, because the
// other end disconnected.
if(read == -1) {
disconnect();
return -1;
}
// Add the read byte[] as
// a String to the stringBuilder.
stringBuilder.append(new String(buffer.array()).trim());
buffer.clear();
return read;
}
private Optional<String> readBlocking() throws IOException {
final Buffer buffer = new Buffer(256);
final StringBuilder stringBuilder = new StringBuilder();
// This call blocks. Therefor
// if we continue past this point
// we WILL have some sort of
// result. This might be -1, which
// means, EOF (disconnect.)
if(readNext(stringBuilder, buffer) == -1) {
return Optional.empty();
}
while(in.available() > 0) {
buffer.reallocate(in.available());
if(readNext(stringBuilder, buffer) == -1) {
return Optional.empty();
}
}
buffer.teardown();
return Optional.of(stringBuilder.toString());
}
The first method, readNext, fills the buffer with bytes from the DataInputStream and returns the number of bytes read this way.
In the second method, readBlocking, I use the blocking nature of the read so that I do not have to worry about consumer-producer problems. readBlocking simply blocks until a new byte array is received. Before we call this blocking method, we allocate a buffer size. Note that I called reallocate after the first read (inside the while loop). This is not needed; you can safely delete this line and the code will still work. I did it because of the uniqueness of my problem.
The two things I did not explain in more detail are:
1. in (the DataInputStream and the only short variable here, sorry for that)
2. disconnect (your disconnect routine)
All in all, you can now use it this way:
// in has to be a field, or a parameter of the readBlocking method
DataInputStream in = new DataInputStream(socket.getInputStream());
final Optional<String> rawDataOptional = readBlocking();
rawDataOptional.ifPresent(string -> threadPool.execute(() -> handle(string)));
This will provide you with a way of reading byte arrays of any shape or form over a socket (or any InputStream, really). Hope this helps!
