Sending a file as chunks from a Java client to a server - java

I wrote a Java client which takes the buffer_size for the byte array as a command-line argument and declares a byte array into which a file will be read and then sent to the server in chunks. The client sends the buffer_size to the Java server before it starts reading the file, so that the server can also declare a byte array of the same size to receive the file chunks. So the mechanism looks something like this:
Client Side:
while ((count = fileReader.read(bytes)) > 0) {
    toServer.write(bytes, 0, count);
}
Server Side:
while ((count = fromClient.read(bytes)) > 0) {
    // process the received file content
}
This works for me, but the behavior of the server reading the chunks changes in a seemingly random manner. That is, if the file to be read by the client is 3000 bytes and the buffer_size is 8192 bytes (the server will also have a buffer_size of 8192), the server sometimes reads the whole chunk from the client in a single read() operation, and sometimes the chunk is divided into two parts and read with two read() operations (as two reads of 1500 bytes, for example). I don't understand what exactly happens here. Can we implement this in such a way that the server doesn't divide the chunk being sent by the client?
When it is tested by running both client and server on the local machine, the server reads the whole content sent by a single write() on the client side with just one read() operation. The behavior only changes when the client and server are on different machines.

1500 is probably the MTU of the link. Data is split into packets of that size as it is transferred. When implementing a server, you must simply be prepared for the data to be diced into arbitrarily-sized packets.
The abstraction implemented by TCP is a stream of bytes, not a series of write() calls. To force a specific packet size, you would have to use UDP (and deal with packet loss, as UDP does not guarantee delivery).
To fix this, simply read in a loop until the whole buffer has been read, or the length of the file has been reached. The client can simply send the length of the file at the beginning of the protocol. There is no need for the client and server to use the same buffer size.
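For illustration, here is a minimal sketch of that approach, assuming the usual DataOutputStream/DataInputStream wrappers around the socket streams. The method names sendFile/receiveFile, the 8 KB buffers, and the variable names are placeholders, not part of the original question:

import java.io.*;
import java.net.Socket;

// Client side: send the file length first, then the content in chunks.
static void sendFile(Socket socket, File file) throws IOException {
    DataOutputStream toServer = new DataOutputStream(socket.getOutputStream());
    toServer.writeLong(file.length());                   // length prefix
    byte[] buffer = new byte[8192];
    int count;
    try (FileInputStream fileIn = new FileInputStream(file)) {
        while ((count = fileIn.read(buffer)) > 0) {
            toServer.write(buffer, 0, count);             // write exactly what was read
        }
    }
    toServer.flush();
}

// Server side: read the length, then loop until that many bytes have arrived.
static void receiveFile(Socket clientSocket) throws IOException {
    DataInputStream fromClient = new DataInputStream(clientSocket.getInputStream());
    long remaining = fromClient.readLong();               // length sent by the client
    byte[] buffer = new byte[8192];
    while (remaining > 0) {
        int count = fromClient.read(buffer, 0, (int) Math.min(buffer.length, remaining));
        if (count == -1) {
            throw new EOFException("stream ended before the whole file arrived");
        }
        // process buffer[0..count) here
        remaining -= count;
    }
}

Note that the server's buffer size is independent of the client's; read() simply returns whatever has arrived so far, up to the requested length.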

Related

Data packet, when sent over my SSL Socket, becomes extremely large

I have recently tried my hand at socket programming by creating an SSL socket used to stream data live from a server, with no success of course. When I analyze the data packets with Wireshark, I see that the size of the request data has been magnified many times over in the packets, and hence the request reaches the server in fragments, whereas the actual JSON request is a handful of bytes and should reach the server in a single shot.
Any help would be appreciated.
Wrap a BufferedOutputStream around the SSLSocket's output stream, and don't flush it until you really have to, which is usually not until you're about to read the reply. Otherwise you can be sending one byte at a time to the SSLSocket, which becomes one SSL message per byte and can expand the data by more than 40x.
However:
the request reaches the server in fragments
That can happen any time. The server has to be able to cope with receiving data as badly fragmented as one byte at a time.
whereas the actual JSON request is a handful of bytes and should reach the server in a single shot.
There is no such guarantee in TCP.
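For example, a minimal sketch of the buffering approach; sslSocket is assumed to be an already-connected SSLSocket, requestJson a String holding the request, and the 8 KB buffer size is an arbitrary choice:

// needs java.io.BufferedOutputStream, java.nio.charset.StandardCharsets, javax.net.ssl.SSLSocket
// Buffer the writes so the request goes out as one SSL record, not one record per small write.
BufferedOutputStream out = new BufferedOutputStream(sslSocket.getOutputStream(), 8192);
out.write(requestJson.getBytes(StandardCharsets.UTF_8));
out.flush();   // flush only when you are done writing and are about to read the reply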

Java understanding I/O streams

I/O streams are the concept in Java programming that I misunderstand the most.
Suppose we get an input stream from a socket connection:
DataInputStream in = new DataInputStream(clientSocket.getInputStream());
When I get data from a remote server, which of these describes things correctly?
1. Data is stored in the in variable. When extra data comes from the server, it is appended to in, increasing its size. We can then read data from in like this:
byte[] messageByte = new byte[1000];
boolean end = false;
String dataString = "";
int bytesRead;
while (!end)
{
    bytesRead = in.read(messageByte);
    dataString += new String(messageByte, 0, bytesRead);
    if (dataString.length() == 100)
    {
        end = true;
    }
}
2. in is only a link to the source of data and doesn't contain the data itself. When we call in.read(messageByte), are 1000 bytes copied from the socket into messageByte?
Alternatively, instead of a socket, say we have a stream connected to a file on the HDD. When we call in.read(messageByte), do we read 1000 bytes from the HDD?
Which approach is right? I tend to think it's #2, but if so, where is the data stored in the socket case? Does the remote server wait while we read 1000 bytes and then send more data? Or is data from the server stored in some buffer in the operating system?
Data is stored in the in variable.
No.
When extra data comes from the server, it is appended to in, increasing its size. We can then read data from in with the loop shown above.
No. See below.
in is only a link to the source of data and doesn't contain the data itself.
Correct.
When we call in.read(messageByte), are 1000 bytes copied from the socket into messageByte?
No. It blocks until:
at least one byte has been transferred, or
end of stream has occurred, or
an exception has been thrown,
whichever occurs first. See the Javadoc.
(Alternatively, instead of a socket we can have a stream connected to a file on the HDD, and when we call in.read(messageByte) we read 1000 bytes from the HDD. Yes?)
No. Same as above.
Which approach is right?
Neither of them. The correct way to read from an input stream is to loop until you have all the data you're expecting, or EOS or an exception occurs. You can't rely on read() filling the buffer. If you need that, use DataInputStream.readFully().
I tend to think it's #2.
That doesn't make sense. You don't have the choice. (1) and (2) aren't programming paradigms, they are questions about how the stream actually works. The question of how to write the code is distinct from that.
Where is the data stored in the socket case?
Some of it is in the socket receive buffer in the kernel. Most of it hasn't arrived yet. None of it is 'in the socket'.
Does the remote server wait while we read 1000 bytes and then send more data?
No. The server sends through its socket send buffer into your socket receive buffer. Your reads and the server's writes are very decoupled from each other.
Or is data from the server stored in some buffer in the operating system?
Yes, the socket receive buffer.
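As a concrete illustration of the readFully() point: if you know you need exactly 1000 bytes (the size is just an example, and in is the DataInputStream from the question), a minimal sketch would be:

byte[] messageByte = new byte[1000];
in.readFully(messageByte);   // blocks until all 1000 bytes have arrived,
                             // or throws EOFException if the stream ends first
String dataString = new String(messageByte, java.nio.charset.StandardCharsets.UTF_8);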
It depends on the type of stream. Where the data is stored varies from stream to stream. Some have internal storage, some read from other sources, and some do both.
A FileInputStream reads from the file on disk when you request it to. The data is on disk; it's not in the stream.
A socket's InputStream reads from the operating system's buffers. When packets arrive, the operating system automatically reads them and buffers up a small amount of data (say, 64KB). Reading from the stream drains that OS buffer. If the buffer is empty because no packets have arrived, your read call blocks. If you don't drain the buffer fast enough and it fills up, no more data can be accepted: with TCP the sender is throttled by flow control, and with UDP incoming packets are simply dropped, until you free up some space.
A ByteArrayOutputStream has an internal byte[] array. When you write to the stream it stores your writes in that array. In this case the stream does have internal storage.
A BufferedInputStream is tied to another input stream. When you read from a BufferedInputStream it will typically request a big chunk of data from the underlying stream and store it in a buffer. Subsequent read requests you issue are then satisfied with data from that buffer rather than by performing additional I/O on the underlying stream. The goal is to minimize the number of individual read requests the underlying stream receives by issuing a smaller number of bulk reads. In this case the stream has a mixed strategy of some internal storage and some external reads.
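To make the BufferedInputStream case concrete, here is a small sketch; the file name is just an example:

// needs java.io.*
// One 8 KB bulk read against the file satisfies many small read() calls by the application.
try (InputStream in = new BufferedInputStream(new FileInputStream("data.bin"), 8192)) {
    int b;
    while ((b = in.read()) != -1) {   // single-byte reads from the buffer, but few actual disk reads
        // process b
    }
}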

Reliable write to Java SocketChannel

I have a question regarding Java's SocketChannel.
Say I have a socket channel opened in blocking mode; after calling the write(ByteBuffer) method, I get an integer describing how many bytes were written. The javadoc says:
"Returns: The number of bytes written, possibly zero"
But what exactly does this mean? Does it mean that this number of bytes has really been delivered to the client (i.e. the sender has received a TCP ACK making evident how many bytes have been received), or does it mean that this number of bytes has been written to the TCP stack (so that some bytes might still be waiting, e.g. in the network card's buffer)?
does this mean that the number of bytes really has been delivered to the client
No. It simply means the number of bytes delivered to the local network stack.
The only way to be sure that data has been delivered to the remote application is if you receive an application level acknowledgment for the data.
The paragraph that confuses you is for non-blocking I/O.
For non-blocking operation, your initial call may indeed not write anything at the time of the call.
Unless otherwise specified, a write operation will return only after writing all of the r requested bytes. Some types of channels, depending upon their state, may write only some of the bytes or possibly none at all. A socket channel in non-blocking mode, for example, cannot write any more bytes than are free in the socket's output buffer.
As you can see, for blocking I/O all bytes will be written (or an exception thrown in the middle of the write).
Note that there is no guarantee of when the bytes will appear on the receiving side; that is entirely up to the underlying socket protocol.
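In non-blocking mode, where write() really can return having written only part of the buffer (or nothing), the usual pattern is to keep writing until the buffer is drained. A rough sketch, assuming channel is a SocketChannel and buffer a ByteBuffer already flipped for writing; a real selector-based server would register OP_WRITE interest instead of spinning like this:

while (buffer.hasRemaining()) {
    int written = channel.write(buffer);   // may be 0 if the socket send buffer is full
    if (written == 0) {
        // in a real server: register OP_WRITE with the Selector and resume the write later
    }
}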

Validating data received in a non blocking server

I am building a non-blocking server using Java's NIO package, and I have a couple of questions about validating received data.
I have noticed that when I call read on a socket channel, it will attempt to fill the byte buffer it is reading into (as per the documentation). When I send 10 bytes to the server from the client, the server will read those ten bytes into the byte buffer, the rest of the bytes in the byte buffer will stay at zero, and the number returned from the read operation will be the size of my byte buffer even though the client only wrote 10 bytes.
What I am trying to figure out is whether there is a way to get just the number of bytes the client sent when the server reads from a socket channel (in the above case, 10 instead of 1024).
If that doesn't work, I know I can separate the actual received data from this 'excess' data in the byte buffer by using delimiters in conjunction with my 'instruction set headers' and whatnot, but it seems like this should exist, so I have to wonder if I am just missing something obvious, or if there is some low-level reason why this can't be done.
Thanks :)
You probably forgot to call the notorious flip() on your buffer.
buffer.clear();
channel.read(buffer);
buffer.flip();
// now you can read from the buffer
buffer.get...
I need to change my signature to nio.sucks
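For what it's worth, a minimal sketch of how the byte count actually comes back from a SocketChannel read; channel and the 1024-byte capacity are placeholders:

ByteBuffer buffer = ByteBuffer.allocate(1024);
int bytesRead = channel.read(buffer);   // number of bytes actually read, e.g. 10 (or -1 at end of stream)
buffer.flip();                          // position = 0, limit = bytesRead
int received = buffer.remaining();      // also the number of bytes received by this read
byte[] data = new byte[received];
buffer.get(data);                       // copy out just the received bytes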

read and write method for large data in socket communication does not work reliably

I have created a socket program for server-client communication.
I am reading data using read(byte[]) of DataInputStream, and writing data using write(byte[]) of DataOutputStream.
Whenever I send a small amount of data my program works fine.
But if I send data of 20000 characters and send it 10 times, I receive the data perfectly 8 of those times but not the other 2.
So can I reliably send and receive data using read and write in socket programming?
My guess is that you're issuing a single call to read() and assuming it will return all the data you asked for. Streams don't generally work that way. It will block until some data is available, but it won't wait until it's got enough data to fill the array.
Generally this means looping round. For instance:
byte[] data = new byte[expectedSize];
int totalRead = 0;
while (totalRead < expectedSize)
{
    int read = stream.read(data, totalRead, expectedSize - totalRead);
    if (read == -1)
    {
        throw new IOException("Not enough data in stream");
    }
    totalRead += read;
}
If you don't know how many bytes you're expecting in the first place, you may well want to still loop round, but this time until read() returns -1. Use a buffer (e.g. 8K) to read into, and write into a ByteArrayOutputStream. When you've finished reading, you can then get the data out of the ByteArrayOutputStream as a byte array.
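For that unknown-length case, a short sketch of the buffer-and-accumulate approach; stream is the socket's input stream, and the 8 KB chunk size is arbitrary:

ByteArrayOutputStream collected = new ByteArrayOutputStream();
byte[] chunk = new byte[8192];
int read;
while ((read = stream.read(chunk)) != -1) {   // -1 means the peer has closed the connection
    collected.write(chunk, 0, read);
}
byte[] data = collected.toByteArray();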
Absolutely -- TCP sockets are a reliable network protocol, provided the API is used properly.
You really need to check the number of bytes you receive on each read() call.
Sockets will arbitrarily decide you have enough data and pass it back on the read() call -- the amount can depend on many factors (buffer size, memory availability, network response time, etc.), most of which are unpredictable. For smaller buffers you normally get as many bytes as you asked for, but for larger buffer sizes read() will often return less data than you asked for -- you need to check the number of bytes read and repeat the read() call for the remaining bytes.
It is also possible that something in your network infrastructure (a router, firewall, etc.) is misconfigured and truncating large packets.
Your problem may be that in the server thread you must call flush() on the output stream, so that any buffered data is actually sent to the other end of the connection.
