Java TCP-Sockets transmit files larger than 4gb [duplicate] - java

I am trying to transfer a file that is larger than 4 GB using the Java Sockets API. I am already reading it via InputStreams and writing it via OutputStreams. However, analyzing the transmitted packets in Wireshark, I see that the sequence number of the TCP packets is incremented by the byte length of each packet, which seems to be 1440 bytes.
As a result, when I try to send a file larger than 4 GB, the range of TCP's sequence number field is exceeded, which leads to lots of error packets but no error in Java.
My code for transmission currently looks like this:
DataOutputStream fileTransmissionStream = new DataOutputStream(transmissionSocket.getOutputStream());
FileInputStream fis = new FileInputStream(toBeSent);
int totalFileSize = fis.available();
fileTransmissionStream.writeInt(totalFileSize);
while (totalFileSize >0){
if(totalFileSize >= FileTransmissionManagementService.splittedTransmissionSize){
sendBytes = new byte[FileTransmissionManagementService.splittedTransmissionSize];
fis.read(sendBytes);
totalFileSize -= FileTransmissionManagementService.splittedTransmissionSize;
} else {
sendBytes = new byte[totalFileSize];
fis.read(sendBytes);
totalFileSize = 0;
}
byte[] encryptedBytes = DataEncryptor.encrypt(sendBytes);
/*byte[] bytesx = ByteBuffer.allocate(4).putInt(encryptedBytes.length).array();
fileTransmissionStream.write(bytesx,0,4);*/
fileTransmissionStream.writeInt(encryptedBytes.length);
fileTransmissionStream.write(encryptedBytes, 0, encryptedBytes.length);
What exactly have I done wrong in this situation, or is it not possible to transmit files greater than 4gb via one Socket?

TCP can handle infinitely long data streams. There is no problem with the sequence number wrapping around. As it is initially random, that can happen almost immediately, regardless of the length of the stream. The problems are in your code:
DataOutputStream fileTransmissionStream = new DataOutputStream(transmissionSocket.getOutputStream());
FileInputStream fis = new FileInputStream(toBeSent);
int totalFileSize = fis.available();
Classic misuse of available(). Have a look at the Javadoc and see what it's really for. This is also where your basic problem lies, as values > 2G don't fit into an int, so there is a truncation. You should be using File.length(), and storing it into a long.
fileTransmissionStream.writeInt(totalFileSize);
while (totalFileSize >0){
if(totalFileSize >= FileTransmissionManagementService.splittedTransmissionSize){
sendBytes = new byte[FileTransmissionManagementService.splittedTransmissionSize];
fis.read(sendBytes);
Here you are ignoring the result of read(). It isn't guaranteed to fill the buffer: that's why it returns a value. See, again, the Javadoc.
totalFileSize -= FileTransmissionManagementService.splittedTransmissionSize;
} else {
sendBytes = new byte[totalFileSize];
Here you are assuming the file size fits into an int, and assuming the bytes fit into memory.
fis.read(sendBytes);
See above re read().
totalFileSize = 0;
}
byte[] encryptedBytes = DataEncryptor.encrypt(sendBytes);
/*byte[] bytesx = ByteBuffer.allocate(4).putInt(encryptedBytes.length).array();
fileTransmissionStream.write(bytesx,0,4);*/
We're not interested in your commented-out code.
fileTransmissionStream.writeInt(encryptedBytes.length);
fileTransmissionStream.write(encryptedBytes, 0, encryptedBytes.length);
You don't need all this crud. Use a CipherOutputStream to take care of the encryption, or better still SSL, and use the following copy loop:
byte[] buffer = new byte[8192]; // or much more if you like, but there are diminishing returns
int count;
while ((count = in.read(buffer)) > 0)
{
out.write(buffer, 0, count);
}
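As a rough sketch of the CipherOutputStream suggestion: the copy loop stays as it is and the stream wrapper does the encryption. The class name, the AES/CTR/NoPadding transformation, and the way key and IV are passed in are assumptions for illustration. The original DataEncryptor is not shown, so this is not a drop-in replacement for it.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.security.GeneralSecurityException;
import javax.crypto.Cipher;
import javax.crypto.CipherInputStream;
import javax.crypto.CipherOutputStream;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical helper, not part of the original code.
final class EncryptedCopy {

    // The copy loop from the answer: it never assumes read() fills the buffer.
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0; // long, so totals above 2 GB do not overflow
        int count;
        while ((count = in.read(buffer)) > 0) {
            out.write(buffer, 0, count);
            total += count;
        }
        return total;
    }

    // Assumption: AES in CTR mode with a caller-supplied 16-byte key and 16-byte IV.
    static Cipher newCipher(int mode, byte[] key, byte[] iv) throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance("AES/CTR/NoPadding");
        cipher.init(mode, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
        return cipher;
    }

    // Encrypts everything read from 'plain' while writing it to 'sink'.
    static long encryptTo(InputStream plain, OutputStream sink, byte[] key, byte[] iv)
            throws IOException, GeneralSecurityException {
        // Closing the CipherOutputStream flushes the cipher's final bytes and closes 'sink'.
        try (OutputStream out = new CipherOutputStream(sink, newCipher(Cipher.ENCRYPT_MODE, key, iv))) {
            return copy(plain, out);
        }
    }

    // Decrypts everything read from 'encrypted' while writing it to 'sink'.
    static long decryptTo(InputStream encrypted, OutputStream sink, byte[] key, byte[] iv)
            throws IOException, GeneralSecurityException {
        try (InputStream in = new CipherInputStream(encrypted, newCipher(Cipher.DECRYPT_MODE, key, iv))) {
            return copy(in, sink);
        }
    }
}
```

In a real transfer the sink would be the socket's output stream, and key and IV handling would need far more care than shown here, which is one reason SSL is the better option.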

It seems that your protocol for the transmission is:
Send total file length in an int.
For each bunch of bytes read,
Send number of encrypted bytes ahead in an int,
Send the encrypted bytes themselves.
The basic problem, beyond the misinterpretations of the documentation that were pointed out in @EJP's answer, is with this very protocol.
You assume that the file length can be sent in an int. This means the length sent cannot be more than Integer.MAX_VALUE, which limits you to files of at most 2 GB (remember that Java integers are signed).
If you take a look at the Files.size() method, which is a method for getting the actual file size in bytes, you'll see that it returns long. A long will accommodate files larger than 2GB, and larger than 4GB. So in fact, your protocol should at the very least be defined to start with a long rather than an int field.
The size problem really has nothing at all to do with the TCP packets.
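To make the point about the length field concrete, here is a minimal sketch of a sender and a receiver that use a long header. The class and method names are made up for illustration, encryption is left out, and the receiver simply trusts the announced length.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of the original question's code.
final class LongLengthTransfer {

    // Sender: 8-byte length header, then the raw file bytes.
    static void send(Path file, OutputStream socketOut) throws IOException {
        DataOutputStream out = new DataOutputStream(socketOut);
        out.writeLong(Files.size(file)); // long, so files above 2 GB are representable
        Files.copy(file, out);           // streams the file without loading it into memory
        out.flush();
    }

    // Receiver: reads the header, then exactly that many bytes.
    static long receive(InputStream socketIn, OutputStream fileOut) throws IOException {
        DataInputStream in = new DataInputStream(socketIn);
        long remaining = in.readLong();
        long total = remaining;
        byte[] buffer = new byte[8192];
        while (remaining > 0) {
            // Never ask for more than what is left of this file.
            int want = (int) Math.min(buffer.length, remaining);
            int count = in.read(buffer, 0, want);
            if (count < 0) {
                throw new EOFException("stream ended with " + remaining + " bytes missing");
            }
            fileOut.write(buffer, 0, count);
            remaining -= count;
        }
        return total;
    }
}
```

Counting down with a long lets the receiver stop at the end of this file without closing the socket, so the same connection could carry further files.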

Related

Java - download a file through network with a buffer

I want to read from a network stream and write the bytes directly to a file.
But every time I run the program, only a few bytes are actually written to the file.
Java:
InputStream in = uc.getInputStream();
int clength=uc.getContentLength();
byte[] barr = new byte[clength];
int offset=0;
int totalwritten=0;
int i;
int wrote=0;
OutputStream out = new FileOutputStream("file.xlsx");
while(in.available()!=0) {
wrote=in.read(barr, offset, clength-offset);
out.write(barr, offset, wrote);
offset+=wrote;
totalwritten+=wrote;
}
System.out.println("Written: "+totalwritten+" of "+clength);
out.flush();
That's because available() doesn't do what you think it does. Read its API documentation. You should simply read until the number of bytes read, returned by read(), is -1. Or even simpler, use Files.copy():
Files.copy(in, new File("file.xlsx").toPath());
Using a buffer that has the size of the input stream also pretty much defeats the purpose of using a buffer, which is to only have a few bytes in memory.
If you want to reimplement copy(), the general pattern is the following:
byte[] buffer = new byte[4096]; // number of bytes in memory
int numberOfBytesRead;
while ((numberOfBytesRead = in.read(buffer)) >= 0) {
out.write(buffer, 0, numberOfBytesRead);
}
You're using .available() wrong. From Java documentation:
available() returns an estimate of the number of bytes that can be read
(or skipped over) from this input stream without blocking by the next
invocation of a method for this input stream
That means that the first time your stream is slower than your file writing speed (very soon in all probability) the while ends.
You should either prepare a thread that waits for the input until it has read all the expected content length (with a sizable timeout, of course) or just block your program in the wait, if user interaction is not a big deal.
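A small sketch of how the Files.copy() advice could be applied to the download in the question. The class and method names are hypothetical. REPLACE_EXISTING is added because Files.copy(InputStream, Path) otherwise fails when the target file already exists, and the expected length is assumed to come from URLConnection.getContentLengthLong(), which returns -1 when the server does not announce one.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not part of the original question's code.
final class StreamToFile {

    // Copies the whole stream to 'target' and returns the number of bytes written.
    static long save(InputStream in, Path target) throws IOException {
        // Files.copy reads until end of stream; available() is never consulted.
        return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
    }

    // True when the stream delivered what the server announced.
    // A negative expectedLength means the length was unknown.
    static boolean isComplete(long written, long expectedLength) {
        return expectedLength < 0 || written == expectedLength;
    }
}
```

With the question's variables this would be called roughly as save(uc.getInputStream(), Paths.get("file.xlsx")), followed by a check against the announced length.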

Why doesn't InputStream fill the array fully?

Dude, I'm using following code to read up a large file(2MB or more) and do some business with data.
I have to read 128Byte for each data read call.
At the first I used this code(no problem,works good).
InputStream is;//= something...
int read=-1;
byte[] buff=new byte[128];
while(true){
for(int idx=0;idx<128;idx++){
read=is.read(); if(read==-1){return;}//end of stream
buff[idx]=(byte)read;
}
process_data(buff);
}
Then I tried this code, and problems appeared (errors and weird results, sometimes):
InputStream is;//= something...
int read=-1;
byte[] buff=new byte[128];
while(true){
//ERROR! java doesn't read 128 bytes while it's available
if((read=is.read(buff,0,128))==128){process_data(buff);}else{return;}
}
The above code doesn't work all the time. I'm sure that enough data is available, but read() sometimes returns 127, 125, or 123. What is the problem?
I also found DataInputStream#readFully(byte[]), which works too, but I'm wondering why the second solution doesn't fill the array while the data is available.
Thanks buddy.
Consulting the javadoc for FileInputStream (I'm assuming since you're reading from file):
Reads up to len bytes of data from this input stream into an array of bytes. If len is not zero, the method blocks until some input is available; otherwise, no bytes are read and 0 is returned.
The key here is that the method only blocks until some data is available. The returned value tells you how many bytes were actually read. Reading fewer than 128 bytes could be due to a slow drive or to implementation-defined behavior.
For a proper read sequence, you should check that read() does not equal -1 (End of stream) and write to a buffer until the correct amount of data has been read.
Example of a proper implementation of your code:
InputStream is; // = something...
int read;
int read_total;
byte[] buf = new byte[128];
// Infinite loop
while(true){
read_total = 0;
// Repeatedly perform reads until break or end of stream, offsetting at last read position in array
while((read = is.read(buf, read_total, buf.length - read_total)) != -1){
// Gets the amount read and adds it to a read_total variable.
read_total = read_total + read;
// Break if read_total is buffer length (128)
if(read_total == buf.length){
break;
}
}
if(read_total != buf.length){
// Incomplete read before 128 bytes: end of stream was reached, so stop
break;
}else{
process_data(buf);
}
}
Edit:
Don't try to use available() as an indicator of data availability (sounds weird I know), again the javadoc:
Returns an estimate of the number of remaining bytes that can be read (or skipped over) from this input stream without blocking by the next invocation of a method for this input stream. Returns 0 when the file position is beyond EOF. The next invocation might be the same thread or another thread. A single read or skip of this many bytes will not block, but may read or skip fewer bytes.
In some cases, a non-blocking read (or skip) may appear to be blocked when it is merely slow, for example when reading large files over slow networks.
The key there is estimate, don't work with estimates.
Since the accepted answer was provided a new option has become available. Starting with Java 9, the InputStream class has two methods named readNBytes that eliminate the need for the programmer to write a read loop, for example your method could look like
public static void some_method(String path) throws IOException {
InputStream is = new FileInputStream(path);
byte[] buff = new byte[128];
while (true) {
int numRead = is.readNBytes(buff, 0, buff.length);
if (numRead == 0) {
break;
}
// The last read before end-of-stream may read fewer than 128 bytes.
process_data(buff, numRead);
}
}
or the slightly simpler
public static void some_method(String path) throws IOException {
InputStream is = new FileInputStream(path);
while (true) {
byte[] buff = is.readNBytes(128);
if (buff.length == 0) {
break;
}
// The last read before end-of-stream may read fewer than 128 bytes.
process_data(buff);
}
}
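The question also mentions DataInputStream#readFully, which works on older Java versions too. A minimal sketch of a 128-byte record reader built on it follows. The class and method names are hypothetical, and dropping a trailing partial record is an assumption about what the caller wants.

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the original question's code.
final class FixedRecordReader {
    static final int RECORD_SIZE = 128;

    // Returns every complete 128-byte record; a trailing partial record is dropped.
    static List<byte[]> readRecords(InputStream source) throws IOException {
        DataInputStream in = new DataInputStream(source);
        List<byte[]> records = new ArrayList<>();
        while (true) {
            byte[] record = new byte[RECORD_SIZE];
            try {
                // Blocks until all 128 bytes have arrived or the stream ends.
                in.readFully(record);
            } catch (EOFException endOfStream) {
                return records;
            }
            records.add(record);
        }
    }
}
```

For a large input it would be better to process each record as it arrives instead of collecting them in a list. The list is only there to keep the sketch easy to check.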

Limit the content available from a Java NIO Channel (File or Socket)

I'm pretty new to NIO and wanted to implement some feature with it, instead of typical Streams (which can do all sort of things).
What I'm not sure I can get is reading from a file into a buffer and limiting the content that I will transfer. Let's say from position 100 to 200 (even if file length is 1000). It also would be nice to do on network sockets.
I know that NIO keeps things basic to leverage OS capabilities that's why I'm not sure it can be done.
I was thinking that a tricky way to do it would be a 'LimitedReadChannel' that, when it should return less than the available buffer size, reads into another, smaller byte buffer and then transfers the data to the original one (1). But that seems more tricky than necessary. I also don't want to use anything related to streams because it would defeat the purpose of using NIO.
(1) So far....
LimitedChannel.read(buffer) {
if (buffer.available?? > contentLeft) {
delegateChannel.read(smallerBuffer);
// transfer from smallerBuffer to buffer
} else {
delegateChannel.read(buffer);
}
}
I've found that Buffers let you query the current limit or set a new one. So the wrapper channel (the one that limits the effective number of bytes read) could modify the buffer limit to avoid reading more...
Something like:
// LimitedChannel.java
// private int bytesLeft; // remaining amount of bytes to read
public int read(ByteBuffer buffer) {
if (bytesLeft <= 0) {
return -1;
}
int oldLimit = buffer.limit();
if (bytesLeft < buffer.remaining()) {
// ensure I'm not reading more than allowed
buffer.limit(buffer.position() + bytesLeft);
}
int bytesRead = delegateChannel.read(buffer);
if (bytesRead > 0) {
// a return value of -1 (end of stream) must not increase bytesLeft
bytesLeft -= bytesRead;
}
buffer.limit(oldLimit);
return bytesRead;
}
Anyway not sure if this already exists somewhere. It's difficult to find documentation about this use case...
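For what it's worth, the limit-adjusting idea can be fleshed out into a complete wrapper. This is only a sketch under assumptions: the class name and constructor are invented here, the quota is tracked as a long, and the buffer's limit is restored in a finally block so that a failed read does not leave the caller's buffer altered.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.ReadableByteChannel;

// Hypothetical wrapper; name and constructor are illustrative.
final class LimitedReadChannel implements ReadableByteChannel {
    private final ReadableByteChannel delegate;
    private long bytesLeft; // remaining number of bytes this channel may hand out

    LimitedReadChannel(ReadableByteChannel delegate, long limit) {
        this.delegate = delegate;
        this.bytesLeft = limit;
    }

    @Override
    public int read(ByteBuffer buffer) throws IOException {
        if (bytesLeft <= 0) {
            return -1; // report end of stream once the quota is used up
        }
        int oldLimit = buffer.limit();
        if (bytesLeft < buffer.remaining()) {
            // Shrink the writable window so the delegate cannot read too much.
            buffer.limit(buffer.position() + (int) bytesLeft);
        }
        try {
            int bytesRead = delegate.read(buffer);
            if (bytesRead > 0) {
                bytesLeft -= bytesRead;
            }
            return bytesRead;
        } finally {
            buffer.limit(oldLimit); // always give the caller its buffer back unchanged
        }
    }

    @Override
    public boolean isOpen() {
        return delegate.isOpen();
    }

    @Override
    public void close() throws IOException {
        delegate.close();
    }
}
```

For the file case, positioning a FileChannel with position(100) and wrapping it with a limit of 100 would expose bytes 100 to 199. A socket channel has no position, so there the wrapper can only cap how much is read from the current point onward.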

java nio socketChannel read always return same data

In client side, read code:
byte[] bytes = new byte[50]; //TODO should reuse buffer, for test only
ByteBuffer dst = ByteBuffer.wrap(bytes);
int ret = 0;
int readBytes = 0;
boolean fail = false;
try {
while ((ret = socketChannel.read(dst)) > 0) {
readBytes += ret;
System.out.println("read " + ret + " bytes from socket " + dst);
if (!dst.hasRemaining()) {
break;
}
}
int pos = dst.position();
byte[] data = new byte[pos];
dst.flip();
dst.get(data);
System.out.println("read data: " + StringUtil.toHexString(data));
} catch (Exception e) {
fail = true;
handler.onException(e);
}
The problem is that socketChannel.read() always returns a positive value. I checked the returned buffer, and the data is duplicated N times, as if the low-level socket buffer's position is not moving forward. Any idea?
If the server only returned 48 bytes, your code must have blocked in the read() method trying to get the 49th and 50th bytes. So either your '50' is wrong or you will have to restructure your code to read and process whatever you get as you get it rather than trying to fill buffers first. And this can't possibly be the code where you think you always got the same data. The explanation for that would be failure to compact the buffer after the get, if you reuse the same buffer for the next read, which you should do, but your posted code doesn't do.
1 : This might not be a bug !
[assuming that there is readable data in the buffer]...
You would expect a -1 at the end of the stream... See http://docs.oracle.com/javase/1.4.2/docs/api/java/nio/channels/SocketChannel.html#read%28java.nio.ByteBuffer%29
If you are continually receiving a positive value from the read() call, then you will need to determine why data is being read continually.
Of course, the mystery herein ultimately lies in the source data (i.e. the SocketChannel which you are reading data from).
2: Explanation of your possible problems
If your socket channel is fed from a REAL file, which is finite, then your file is really big, and the read() operation will eventually return -1.
If, on the other hand, your socket channel is listening to a source of data which you EXPECT to be finite (i.e. a serialized object stream, for example), I would double check the source --- maybe your finite stream is simply producing more and more data... and you are correctly consuming it.
3: Finally some advice
A trick for debugging this type of error is playing with the ByteBuffer input to your read method: the nice thing about java.nio's ByteBuffers is that, since they are more object-oriented than the older byte[] writers, you can get very fine-grained debugging of their operations.
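Since the first answer brings up compacting a reused buffer, here is a minimal sketch of the read, flip, drain, compact cycle. It assumes a blocking channel and a made-up wire format of big-endian 4-byte integers. The class name is hypothetical, and the 6-byte buffer is deliberately awkward so that leftover bytes occur.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.ReadableByteChannel;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the original question's code.
final class ReusedBufferReader {

    // Reads 4-byte integers until end of stream, reusing one small buffer.
    // Assumes a blocking channel; trailing bytes that do not form a full value are dropped.
    static List<Integer> readInts(ReadableByteChannel channel) throws IOException {
        ByteBuffer buffer = ByteBuffer.allocate(6); // deliberately not a multiple of 4
        List<Integer> values = new ArrayList<>();
        while (channel.read(buffer) != -1) {
            buffer.flip();                   // switch from filling to draining
            while (buffer.remaining() >= 4) {
                values.add(buffer.getInt()); // consume one complete value
            }
            buffer.compact();                // keep leftover bytes, switch back to filling
        }
        return values;
    }
}
```

Without the compact() call, leftover bytes of a partially received value would be lost or the same bytes would be seen again on the next pass, which is the kind of symptom described in the question.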

Reading bytes from a java socket: getting ArrayIndexOutOfBounds

I am having this very strange problem: I have a small program that reads bytes off a socket.
Whenever I am debugging, the program runs fine, but every time I run it normally I get an ArrayIndexOutOfBoundsException. What gives? Am I reading too fast for the socket? Am I missing something?
here is the main():
public static void main(String[] args){
TParser p = new TParser();
p.init();
p.readPacket();
p.sendResponse();
p.readPacket();
p.sendResponse();
p.shutdown();
}
The method init is where i create the Sockets for reading and writing;
The next method (readPacket) is where problems start to arise; i read the entire buffer to a private byte array so i can manipulate the data freely; for instance, depending on some bytes on the data i set some properties:
public void readPacket(){
System.out.println("readPacket");
readInternalPacket();
setPacketInfo();
}
private void readInternalPacket(){
System.out.println("readInternalPacket");
try {
int available=dataIN.available();
packet= new byte[available];
dataIN.read(packet,0,available);
dataPacketSize=available;
}
catch (Exception e) {
e.printStackTrace();
}
}
private void setPacketInfo() {
System.out.println("setPacketInfo");
System.out.println("packetLen: " +dataPacketSize);
byte[] pkt= new byte[2];
pkt[0]= packet[0];
pkt[1]= packet[1];
String type= toHex(pkt);
System.out.println("packet type: "+type);
if(type.equalsIgnoreCase("000F")){
recordCount=0;
packetIterator=0;
packetType=Constants.PacketType.ACKPacket;
readIMEI();
validateDevice();
}
}
The line where it breaks is the line
pkt[1]= packet[1]; (setPacketInfo)
meaning it only has 1 byte at that time... but how can that be, if it runs perfectly when I debug it? Is there some sanity check I must do on the socket? (dataIN is of type DataInputStream)
Should I put the methods on separate threads? I've gone over this over and over, and even replaced my memory modules (when I started having weird ideas about this).
...please help me.
I don't know the surrounding code, especially the class of dataIN, but I think your code does this:
int available=dataIN.available(); does not wait for data at all, just returns that there are 0 bytes available
so your array is of size 0 and you then do:
pkt[0]= packet[0]; pkt[1]= packet[1]; which is out of bounds.
I would recommend that you at least loop until available() returns the 2 you expect, but I cannot be sure that this is the correct (*) or right (**) way to do it because I don't know dataIN's class implementation.
Notes: (*) it is not correct if it is possible for available() to report the 2 bytes separately. (**) it is not the right way to do it if dataIN itself provides methods that wait.
Can it be that reading the data from the socket is an asynchronous process and the setPacketInfo() is called before your packet[] is completely filled? If this is the case, it's possible it runs great when debugging, but terrible when it really uses sockets on different machines.
You can add some code to the setPacketInfo() method to check the length of the packet[] variable.
byte[] pkt= new byte[packet.length];
for(int x = 0; x < packet.length; x++)
{
pkt[x]= packet[x];
}
not really sure though why you even copy the packet[] variable into pkt[]?
You are using a packet-oriented protocol on a stream-oriented layer without transmitting the real packet length. Because of fragmentation, the size of the received data can be smaller than the packet you sent.
Therefore I strongly recommend sending the data packet size before sending the actual packet. On the receiver side you could use a DataInputStream and use a blocking read for detecting an incoming packet:
private void readInternalPacket() {
System.out.println("readInternalPacket");
try {
int packetSize = dataIN.readInt();
packet = new byte[packetSize];
// readFully blocks until the whole packet has arrived; read() may return fewer bytes
dataIN.readFully(packet, 0, packetSize);
dataPacketSize = packetSize;
} catch (Exception e) {
e.printStackTrace();
}
}
Of course you have to modify the sender side as well, sending the packet size before the packet.
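A minimal sketch of both sides of such a length-prefixed protocol, assuming a 4-byte size header written with writeInt. The class name, method names, and the MAX_PACKET_SIZE bound are hypothetical. The bound is only there so that a corrupted header cannot trigger a huge allocation.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical helper, not part of the original question's code.
final class PacketFraming {
    // Assumed upper bound; pick one that fits the actual protocol.
    static final int MAX_PACKET_SIZE = 1 << 20;

    // Sender side: size first, then the payload.
    static void writePacket(DataOutputStream out, byte[] payload) throws IOException {
        out.writeInt(payload.length);
        out.write(payload);
        out.flush();
    }

    // Receiver side: blocks until one complete packet has arrived.
    static byte[] readPacket(DataInputStream in) throws IOException {
        int size = in.readInt();
        if (size < 0 || size > MAX_PACKET_SIZE) {
            throw new IOException("implausible packet size: " + size);
        }
        byte[] packet = new byte[size];
        in.readFully(packet); // throws EOFException if the stream ends early
        return packet;
    }
}
```

Because readInt and readFully both block until their bytes are there, the receiver behaves the same whether it runs in a debugger or at full speed.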
To add to the answer from @eznme: you need to read from your underlying stream until there is no more pending data. This may require one or more reads. Note that end of stream is indicated by read() returning -1, not by the available method returning 0. I would recommend using Apache IOUtils to 'copy' the input stream to a ByteArrayOutputStream, then getting the byte[] array from that.
In your setPacketInfo method you should do a check on your data buffer length before getting your protocol header bytes:
byte[] pkt= new byte[2];
if((packet != null) && (packet.length >= 2)) {
pkt[0]= packet[0];
pkt[1]= packet[1];
// ...
}
That will get rid of the out of bound exceptions you are getting when you read zero-length data buffers from your protocol.
You should never rely on dataIN.available(). Also, dataIN.read(packet,0,available) returns an integer that says how many bytes you actually received. That's not always the same value that available() reported, and it can also be less than the size of the buffer.
This is how you should read:
byte[] packet = new byte[1024];
dataPacketSize = dataIN.read(packet,0,packet.length);
You should also wrap your DataInputStream in a BufferedInputStream, and take care of the case where you get less than 2 bytes, so that you don't try to process bytes that you haven't received.
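One way to handle the "fewer than 2 bytes" case is to let DataInputStream wait for the header instead of indexing into a possibly short array. A sketch under assumptions: the class and method names are hypothetical, and String.format stands in for the question's toHex helper, whose implementation is not shown.

```java
import java.io.BufferedInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper, not part of the original question's code.
final class PacketHeader {

    // Wraps the raw socket stream once; reuse the returned stream for all reads.
    static DataInputStream wrap(InputStream socketIn) {
        return new DataInputStream(new BufferedInputStream(socketIn));
    }

    // Returns the 2-byte packet type as four hex digits, for example "000F",
    // or null if the stream ended before two bytes arrived.
    static String readType(DataInputStream in) throws IOException {
        try {
            int type = in.readUnsignedShort(); // blocks until both bytes are available
            return String.format("%04X", type);
        } catch (EOFException streamEnded) {
            return null;
        }
    }
}
```

The caller would then compare the result with "000F" as in the question, treating null as a closed connection.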
