Handling input-stream overflow (zero-window) in Java

We have a system with two applications. One of them is a legacy application, for which we can't make any code changes. This application sends messages to the second application, which is written in Java. In our Java code, we have set the input stream buffer size to 1 MB as follows:
Socket eventSocket = new Socket();
eventSocket.setSendBufferSize(1024 * 1024);
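Note that setSendBufferSize sizes the socket's send buffer. If the intent was to enlarge the buffer behind the input stream on the receiving side (an assumption here), the corresponding call is setReceiveBufferSize, and for sizes above 64 KB it should be made before the socket connects so that TCP window scaling can be negotiated. A minimal sketch, where host and port are placeholders:
Socket eventSocket = new Socket();                        // not yet connected
eventSocket.setReceiveBufferSize(1024 * 1024);            // 1 MB receive buffer (assumed intent)
eventSocket.connect(new InetSocketAddress(host, port));   // connect after sizing the buffer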
Now, the legacy application sends messages of variable size. Most of the messages are smaller than 1 MB, but sometimes it sends messages as large as 8 MB. Most of the time these messages are read successfully by the Java application, but in some cases the following read operation returns -1:
read = stream.read(b, off, len - off); ( here stream is an InputStream object)
As per the Java API documentation, the InputStream read method returns -1 if there is no more data because the end of the stream has been reached.
But this is erroneous behavior. We did a packet capture with Wireshark to verify the exact messages exchanged between these two applications and found that the Java application sent a zero-window message a few seconds before the input stream read method returned -1. At the time this Java API method returned -1, the Java application was sending a ZeroWindowProbeAck message to the legacy application.
How should we handle this issue?
As per https://wiki.wireshark.org/TCP%20ZeroWindow, zero window has the following definition:
What does TCP Zero Window mean?
Zero Window is something to investigate.
TCP Zero Window is when the Window size in a machine remains at zero for a specified amount of time.
This means that a client is not able to receive further information at the moment, and the TCP transmission is halted until it can process the information in its receive buffer.
TCP Window size is the amount of information that a machine can receive during a TCP session and still be able to process the data. Think of it like a TCP receive buffer. When a machine initiates a TCP connection to a server, it will let the server know how much data it can receive by the Window Size.
In many Windows machines, this value is around 64512 bytes. As the TCP session is initiated and the server begins sending data, the client will decrement its Window Size as this buffer fills. At the same time, the client is processing the data in the buffer, and is emptying it, making room for more data. Through TCP ACK frames, the client informs the server of how much room is in this buffer. If the TCP Window Size goes down to 0, the client will not be able to receive any more data until it processes and opens the buffer up again. In this case, Protocol Expert will alert a "Zero Window" in Expert View.
Troubleshooting a Zero Window
For one reason or another, the machine alerting the Zero Window will not receive any more data from the host. It could be that the machine is running too many processes at that moment, and its processor is maxed. Or it could be that there is an error in the TCP receiver, like a Windows registry misconfiguration. Try to determine what the client was doing when the TCP Zero Window happened.
Source: flukenetworks.com

Handling input-stream overflow (zero window) in Java
There is no such thing as 'input-stream overflow' in Java, and you can't handle zero window in Java either, except by reading from the network more quickly. Your title already doesn't make sense.
We did a packet capture with Wireshark to verify the exact messages exchanged between these two applications and found that the Java application sent a zero-window message a few seconds before the input stream read method returned -1.
Neither Java nor the application sends those messages. The operating system does.
The input stream of a socket returns -1 if and only if a FIN has been received from the peer, and that in turn occurs if and only if the peer has closed the connection or exited (Unix). It doesn't have anything to do with TCP windowing.
At the time this Java API method returned -1, the Java application was sending a ZeroWindowProbeAck message to the legacy application.
No it wasn't. The operating system was, and it wasn't 'at the time', it was 'a few seconds before', according to your own words. At the time this Java method returned -1, it had just received a FIN from the peer. Have a look at your sniff log. There is no problem here to explain.
As per [whatever], zero window has the following definition
Wireshark does not get to define TCP. TCP is defined in IETF RFCs. You can't cite non-normative sources as definitions.
TCP Zero Window is when the Window size in a machine remains at zero for a specified amount of time.
For any amount of time.
This means that a client is not able to receive further information at the moment, and the TCP transmission is halted until it can process the information in its receive buffer.
It means that the peer is not able to receive. It has nothing to do with the client or the server specifically.
TCP Window size is the amount of information that a machine can receive during a TCP session
No it isn't. It is the amount of data the receiver is currently able to receive. It is therefore also the amount of data the sender is presently allowed to send. It has nothing to do with the session whatsoever.
and still be able to process the data.
Irrelevant.
Think of it like a TCP receive buffer.
It is a TCP receive buffer.
When a machine initiates a TCP connection to a server, it will let the server know how much data it can receive by the Window Size.
Correct. And vice versa. Continuously, not just at the start of the session.
In many Windows machines, this value is around 64512 bytes. As the TCP session is initiated and the server begins sending data, the client will decrement its Window Size as this buffer fills.
It has nothing to do with clients and servers. It operates in both directions.
At the same time, the client is processing the data in the buffer, and is emptying it, making room for more data. Through TCP ACK frames,
Segments
the client informs the server of how much room is in this buffer.
The receiver informs the sender.
If the TCP Window Size goes down to 0, the client
The peer
will not be able to receive any more data until it processes and opens the buffer up again. In this case, Protocol Expert will alert a "Zero Window" in Expert View.
For one reason or another, the machine alerting the Zero Window will not receive any more data from the host.
For one reason only. Its socket receive buffer is full. Period.
It could be that the machine is running too many processes at that moment
Rubbish.
Or it could be that there is an error in the TCP receiver, like a Windows registry misconfiguration.
Rubbish. The receiver is reading more slowly than the sender is sending. Period. It is a normal condition that arises frequently during any TCP session.
Try to determine what the client was doing when the TCP Zero Window happened.
That's easy. Not reading from the network.
Your source is drivel, and your problem is imaginary.

We have created a solution where we wait for some time after this overflow problem occurs, to let the input stream clear. We have changed the code as follows:
int execRetries = 0;
while (true) {
    read = stream.read(b, off, len - off);
    if (read == -1) {
        if (execRetries++ < MAX_EXEC_RETRIES_AFTER_IS_OVERFLOW) {
            try {
                Log.error("Inputstream buffer overflow occurred. Retry no: " + execRetries);
                Thread.sleep(WAIT_TIME_AFTER_IS_OVERFLOW);
            } catch (InterruptedException e) {
                Log.error(e.getMessage(), e);
            }
        } else {
            throw new Exception("End of file on input stream");
        }
    } else if (execRetries != 0) {
        Log.info("Inputstream buffer overflow problem resolved after retry no: " + execRetries);
        execRetries = 0;
    }
    .....
}
The solution has been deployed to a test server. We are waiting to verify whether it works.
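For contrast, here is a minimal sketch (not the poster's code) of a read loop that treats -1 as what the answer above says it is, a peer close, rather than a transient overflow. It assumes the expected message length len is known, uses only java.io (InputStream, EOFException, IOException), and DataInputStream.readFully provides essentially the same behaviour out of the box:
// Sketch: read exactly len bytes into b, or fail fast when the peer closes.
static void readFully(InputStream stream, byte[] b, int len) throws IOException {
    int off = 0;
    while (off < len) {
        int read = stream.read(b, off, len - off);
        if (read == -1) {
            // The peer sent a FIN; no amount of waiting will produce more data.
            throw new EOFException("Peer closed connection after " + off + " of " + len + " bytes");
        }
        off += read;
    }
}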

Related

detecting TCP/IP packet loss

I have TCP communication via a socket, with code like:
public void openConnection() throws Exception
{
    socket = new Socket();
    InetAddress iNet = InetAddress.getByName("server");
    InetSocketAddress sock = new InetSocketAddress(iNet, Integer.parseInt(port));
    socket.connect(sock, 0);
    out = new PrintWriter(socket.getOutputStream(), true);
    in = new BufferedReader(new InputStreamReader(socket.getInputStream()));
}
and a send method like:
synchronized void send(String message)
{
    try
    {
        out.println(message);
    }
    catch (Exception e)
    {
        throw new RuntimeException(this.getClass() + ": Error Sending Message: "
                + message, e);
    }
}
And I read on the web that TCP/IP does not guarantee delivery of a packet; it retries, but if the network is too busy, the packet may be dropped ([link]).
Packets can be dropped when transferring data between systems for two key reasons:
Heavy network utilization and resulting congestion
Faulty network hardware or connectors
TCP is designed to be able to react when packets are dropped on a network. When a packet is successfully delivered to its destination, the destination system sends an acknowledgement message back to the source system. If this acknowledgement is not received within a certain interval, that may be either because the destination system never received the packet or because the packet containing the acknowledgement was itself lost. In either case, if the acknowledgement is not received by the source system in the given time, the source system assumes that the destination system never received the message and retransmits it. It is easy to see that if the performance of the network is poor, packets are lost in the first place, and the increased load from these retransmit messages is only increasing the load on the network further, meaning that more packets will be lost. This behaviour can result in very quickly creating a critical situation on the network.
Is there any way I can detect whether the packet was received successfully by the destination? I am not sure that out.println(message); will throw any exception, as this is a non-blocking call; it will put the message in a buffer and return, letting TCP/IP do its work.
Any help?
TCP is designed to be able to react when packets are dropped on a network.
As your quote says, TCP is designed to react automatically to the events you mention in this text. As such, you don't have anything to do at this level, since this will be handled by the TCP implementation you're using (e.g. in the OS).
TCP has some features that will do some of the work for you, but you are right to wonder about their limitations (many people think of TCP as a guaranteed delivery protocol, without context).
There is an interesting discussion on the Linux Kernel Mailing List ("Re: Client receives TCP packets but does not ACK") about this.
In your use case, practically, this means that you should treat your TCP connection as a stream of data in each direction (the classic mistake is to assume that if you send n bytes from one end, you'll read n bytes in a single buffer read on the other end), and handle possible exceptions.
Handling java.io.IOException properly (in particular its subclasses in java.net) will cover error cases at the level you describe: if you get one, have a retry strategy (depending on what the application and its user are meant to do). Rely on timeouts too (don't set a socket to block forever).
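A minimal sketch of such a timeout setup; host, port and the timeout values are placeholders:
// Sketch: bound both the connect and subsequent reads so a dead or silent peer
// is detected within a known time instead of blocking indefinitely.
Socket socket = new Socket();
socket.connect(new InetSocketAddress(host, port), 10_000);   // 10 s connect timeout
socket.setSoTimeout(30_000);                                 // 30 s read timeout
try {
    int firstByte = socket.getInputStream().read();
    // ... handle the data ...
} catch (SocketTimeoutException e) {
    // No data arrived within 30 s: apply whatever retry strategy the application needs.
}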
Application protocols may also be designed to send their own acknowledgement when receiving commands or requests.
This is a matter of assigning responsibilities to different layers. The TCP stack implementation will handle the packet loss problems you mention, and throw an error/exception if it can't fix it by itself. Its responsibility is the communication with the remote TCP stack. Since in most cases you want your application to talk to a remote application, there needs to be an extra acknowledgement on top of that. In general, the application protocol needs to be designed to handle these cases. (You can go a number of layers up in some cases, depending on which entity is meant to take responsibility to handle the requests/commands.)
TCP/IP does not drop the packet. The congestion control algorithms inside the TCP implementation take care of retransmission. Assuming that there is a steady stream of data being sent, the receiver will acknowledge which sequence numbers it received back to the sender. The sender can use the acknowledgements to infer which packets need to be resent. The sender holds packets until they have been acknowledged.
As an application, unless the TCP implementation provides mechanisms to receive notification of congestion, the best it can do is establish a timeout for when the transaction can complete. If the timeout occurs before the transaction completes, the application can declare the network to be too congested for the application to succeed.
Code what you need. If you need acknowledgements, implement them. If you want the sender to know the recipient got the information, then have the recipient send some kind of acknowledgement.
From an application standpoint, TCP provides a bi-directional byte stream. You can communicate whatever information you want over that by simply specifying streams of bytes that convey the information you need to communicate.
Don't try to make TCP do anything else. Don't try to "teach TCP" your protocol.
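Following "if you need acknowledgements, implement them", here is a minimal sketch layered on the question's socket, out and in fields from openConnection(). It assumes a hypothetical protocol in which the peer answers each message with a line containing "ACK"; the real peer would have to be written to do that:
// Sketch: send one message and wait (bounded by a read timeout) for an application-level ACK.
// Assumes socket, out and in are the fields initialised in openConnection() above.
synchronized boolean sendWithAck(String message) throws IOException {
    socket.setSoTimeout(5_000);      // don't wait forever for the ACK
    out.println(message);
    String reply = in.readLine();    // blocks until a full line arrives or the timeout fires
    return "ACK".equals(reply);      // hypothetical protocol: peer echoes "ACK" per message
}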

Why does java.nio.SocketChannel not send data (JDiameter)?

I created a simple Diameter client and server (Link to sources). The client must send 10000 CCR messages, but in Wireshark I see only ~300 CCR messages being sent. The other messages raise timeouts on the client. I run the server and client on different computers with Windows 7. I found the line in the JDiameter sources where JDiameter sends the CCR (line 280), and I think that when the socket's send buffer is full the CCR is not sent. I added this code before line 280:
while(bytes.hasRemaining())
Now the client sends ~9900 CCRs, but very slowly. I tested the client against another Diameter server written in C++; the client (on JDiameter, without my changes) sends ~7000 CCRs, but that server is hosted on Debian.
I don't know how to solve this problem; thanks for any help.
If the sender's send returns zero, it means the sender's socket send buffer is full, which in turn means the receiver's socket receive buffer is full, which in turn means that the receiver is reading slower than the sender is sending.
So speed up the receiver.
NB In non-blocking mode, merely looping around the write() call while it returns zero is not adequate. If write() returns zero you must:
Deregister the channel for OP_READ and register it for OP_WRITE
Return to the select loop.
When OP_WRITE fires, do the write again. This time, if it doesn't return zero, deregister OP_WRITE, and (probably, according to your requirements) register OP_READ.
Note that keeping the channel registered for OP_WRITE all the time isn't correct either. A socket channel is almost always writable, meaning there is almost always space in the socket send buffer. What you're interested in is the transition between not-writable and writable.
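A minimal sketch of that OP_WRITE dance, assuming a non-blocking SocketChannel already registered with a Selector (java.nio.ByteBuffer, java.nio.channels.*); the method names and the pending ByteBuffer are illustrative, not JDiameter's actual code:
// Try to write; if the socket send buffer is full, switch interest to OP_WRITE.
void sendOrWaitForSpace(SelectionKey key, ByteBuffer pending) throws IOException {
    SocketChannel channel = (SocketChannel) key.channel();
    channel.write(pending);
    if (pending.hasRemaining()) {
        key.interestOps(SelectionKey.OP_WRITE);   // wake us up when there is room again
    }
}

// In the select loop, when key.isWritable() fires:
void onWritable(SelectionKey key, ByteBuffer pending) throws IOException {
    SocketChannel channel = (SocketChannel) key.channel();
    channel.write(pending);
    if (!pending.hasRemaining()) {
        key.interestOps(SelectionKey.OP_READ);    // everything flushed; go back to reading
    }
}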

UDP packets waiting and then arriving together

I have a simple Java program which acts as a server, listening for UDP packets. I then have a client which sends UDP packets over 3g.
Something I've noticed is occasionally the following appears to occur: I send one packet and seconds later it is still not received. I then send another packet and suddenly they both arrive.
I was wondering if it was possible that some sort of system is in place to wait for a certain amount of data instead of sending an undersized packet. In my application, I only send around 2-3 bytes of data per packet - although the UDP header and what not will bulk the message up a bit.
The aim of my application is to get these few bytes of data from A to B as fast as possible. Huge emphasis on speed. Is it all just coincidence? I suppose I could increase the packet size, but it just seems like the transfer time will increase, and 3g isn't exactly perfect.
Since the comments are getting rather lengthy, it might be better to turn them into an answer altogether.
If your app is not receiving data until a certain quantity is retrieved, then chances are, there is some sort of buffering going on behind the scenes. A good example (not saying this applies to you directly) is that if you or the underlying libraries are using InputStream.readLine() or InputStream.read(bytes), then it will block until it receives a newline or bytes number of bytes before returning. Judging by the fact that your program seems to retrieve all of the data when a certain threshold is reached, it sounds like this is the case.
A good way to debug this is to use Wireshark. Wireshark doesn't care about your program: it's analyzing the raw packets that are sent to and from your computer, and can tell you whether the issue is on the sender or the receiver.
If you use Wireshark and see that the data from the first send is arriving on your physical machine well before the second, then the issue lies with your receiving end. If you're seeing that the first packet arrives at the same time as the second packet, then the issue lies with the sender. Without seeing the code, it's hard to say what you're doing and what, specifically, is causing the data to only show up after receiving more than 2-3 bytes, but until then, this behavior describes exactly what you're seeing.
There are several probable causes of this:
Cellular data networks are not "always-on". Depending on the underlying technology, there can be a substantial delay between when a first packet is sent and when IP connectivity is actually established. This will be most noticeable after IP networking has been idle for some time.
Your receiver may not be correctly checking the socket for readability. Regardless of what high-level APIs you may be using, underneath there needs to be a call to select() to check whether the socket is readable. When a datagram arrives, select() should unblock and signal that the socket descriptor is readable. Alternatively, but less efficiently, you could set the socket to non-blocking and poll it with a read. Polling wastes CPU time when there is no data and delays detection of arrival for up to the polling interval, but can be useful if for some reason you can't spare a thread to wait on select(). (A Java sketch of the select()-based approach follows after this list.)
I said above that select() should signal readability on a watched socket when data arrives, but this behavior can be modified by the socket's "receive low-water mark". The default value is usually 1, meaning any data will signal readability. But if SO_RCVLOWAT is set higher (via setsockopt() or a higher-level equivalent), then readability will not be signaled until more than the specified amount of data has arrived. You can check the value with getsockopt() or whatever API is equivalent in your environment.
Item 1 would cause the first datagram to actually be delayed, but only when the IP network has been idle for a while and not once it comes up active. Items 2 and 3 would only make it appear to your program that the first datagram was delayed: a packet sniffer at the receiver would show the first datagram arriving on time.
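On the Java side, the equivalent of waiting on select() in point 2 is a Selector. A minimal receiving sketch, with the port number as a placeholder:
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.DatagramChannel;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;

public class UdpReceiver {
    public static void main(String[] args) throws Exception {
        DatagramChannel channel = DatagramChannel.open();
        channel.bind(new InetSocketAddress(9999));        // placeholder port
        channel.configureBlocking(false);

        Selector selector = Selector.open();
        channel.register(selector, SelectionKey.OP_READ);

        ByteBuffer buf = ByteBuffer.allocate(1500);
        while (true) {
            selector.select();                            // blocks until a datagram is readable
            for (SelectionKey key : selector.selectedKeys()) {
                if (key.isReadable()) {
                    buf.clear();
                    // receive() returns the sender's address and fills buf with exactly one datagram.
                    InetSocketAddress from = (InetSocketAddress) ((DatagramChannel) key.channel()).receive(buf);
                    buf.flip();
                    System.out.println("Got " + buf.remaining() + " bytes from " + from);
                }
            }
            selector.selectedKeys().clear();              // clear handled keys before the next select()
        }
    }
}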

Socket output stream with DataOutputStream - what are the guarantees?

Looking at the code:
private static void send(final Socket socket, final String data) throws IOException {
    final OutputStream os = socket.getOutputStream();
    final DataOutputStream dos = new DataOutputStream(os);
    dos.writeUTF(data);
    dos.flush();
}
can I be sure that calling this method either throws an IOException (meaning I'd better close the socket) or, if no exception is thrown, the data I send is guaranteed to be fully sent? Are there any cases where, when I read the data on the other endpoint, the string I get is incomplete and there is no exception?
There is a big difference between sent and received. You can send data from the application successfully; however, it then passes to
the OS on your machine
the network adapter
the switch(es) on the network
the network adapter on the remote machine
the OS on the remote machine
the application buffer on the remote machine
whatever the application does with it.
Any of these stages can fail and your sender will be none the wiser.
If you want to know the application has received and processed the data successfully, it must send you back a message saying this has happened. When you receive this, then you know it was received.
Yes, several things may happen. First of all, keep in mind write returns really quickly, so don't think much error checking (has all my data been ACKed ?) is performed.
Door number 1
You write and flush your data. TCP tries as hard as it can to deliver it, which means it might perform retransmits and such. Of course, your send doesn't get stuck for such a long period (in some cases TCP tries for 5-10 minutes before it nukes the connection). Thus, you will never know whether the other side actually got your message. You will get an error message on the next operation on the socket.
Door number 2
You write and flush your data. Because of MTU nastiness and because the string is long, it is sent in multiple packets. So your peer reads some of it and presents it to the user before getting it all.
So imagine you send: "Hello darkness my old friend, I've come to talk with you again". The other side might get "Hello darkness m". However, if it performs subsequent reads, it will get the whole thing. So the far side's TCP has actually received everything and ACKed everything, but the user application has failed to read the data in order to take it out of TCP's hands.
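Since the sender in the question uses writeUTF, the "door number 2" case is straightforward to handle on the receiving side: writeUTF prefixes the string with its encoded length, and readUTF keeps reading until that many bytes have arrived, so the reader never sees "Hello darkness m" as a complete message. A minimal receiving counterpart of the send method above:
// Sketch: readUTF first reads the 2-byte length written by writeUTF, then blocks
// until the whole string has arrived, or throws EOFException if the peer closes first.
private static String receive(final Socket socket) throws IOException {
    final DataInputStream dis = new DataInputStream(socket.getInputStream());
    return dis.readUTF();
}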

Faster detection of a broken socket in Java/Android

Background
My application gathers data from the phone and sends it to a remote server.
The data is first stored in memory (or on file when it's big enough) and every X seconds or so the application flushes that data and sends it to the server.
It's mission-critical that every single piece of data is sent successfully; I'd rather send the data twice than not at all.
Problem
As a test I set up the app to send data with a timestamp every 5 seconds; this means that every 5 seconds a new line appears on the server.
If I kill the server I expect the lines to stop, they should now be written to memory instead.
When I enable the server again I should be able to confirm that no events are missing.
The problem, however, is that when I kill the server it takes about 20 seconds for IO operations to start failing, meaning that during those 20 seconds the app happily sends the events and removes them from memory, but they never reach the server and are lost forever.
I need a way to make certain that the data actually reaches the server.
This is possibly one of the more basic TCP questions but nonetheless, I haven't found any solution to it.
Stuff I've tried
Setting Socket.setTcpNoDelay(true)
Removing all buffered writers and just using OutputStream directly
Flushing the stream after every send
Additional info
I cannot change how the server responds, meaning I can't tell the server to acknowledge the data (beyond the mechanics of TCP, that is); the server will just silently accept the data without sending anything back.
Snippet of code
Initialization of the class:
socket = new Socket(host, port);
socket.setTcpNoDelay(true);
Where data is sent:
while (!dataList.isEmpty()) {
    String data = dataList.removeFirst();
    inMemoryCount -= data.length();
    try {
        OutputStream os = socket.getOutputStream();
        os.write(data.getBytes());
        os.flush();
    }
    catch (IOException e) {
        inMemoryCount += data.length();
        dataList.addFirst(data);
        socket = null;
        return false;
    }
}
return true;
Update 1
I'll say this again, I cannot change the way the server behaves.
It receives data over TCP and UDP and does not send any data back to confirm receipt. This is a fact, and sure, in a perfect world the server would acknowledge the data, but that will simply not happen.
Update 2
The solution posted by Fraggle works perfectly (closing the socket and waiting for the input stream to be closed).
This however comes with a new set of problems.
Since I'm on a phone I have to assume that the user cannot send an infinite amount of bytes and I would like to keep all data traffic to a minimum if possible.
I'm not worried by the overhead of opening a new socket, those few bytes will not make a difference. What I am worried about however is that every time I connect to the server I have to send a short string identifying who I am.
The string itself is not that long (around 30 characters) but that adds up if I close and open the socket too often.
One solution is only to "flush" the data every X bytes, the problem is I have to choose X wisely; if too big there will be too much duplicate data sent if the socket goes down and if it's too small the overhead is too big.
Final update
My final solution is to "flush" the socket by closing it every X bytes and if all didn't got well those X bytes will be sent again.
This will possibly create some duplicate events on the server but that can be filtered there.
Dan's solution is the one I'd suggest right after reading your question; he's got my up-vote.
Now can I suggest working around the problem? I don't know if this is possible with your setup, but one way of dealing with badly designed software (this is your server, sorry) is to wrap it, or in fancy-design-pattern-talk provide a facade, or in plain-talk put a proxy in front of your pain-in-the-behind server. Design meaningful ack-based protocol, have the proxy keep enough data samples in memory to be able to detect and tolerate broken connections, etc. etc. In short, have the phone app connect to a proxy residing somewhere on a "server-grade" machine using "good" protocol, then have the proxy connect to the server process using the "bad" protocol. The client is responsible for generating data. The proxy is responsible for dealing with the server.
Just another idea.
Edit 0:
You might find this one entertaining: The ultimate SO_LINGER page, or: why is my tcp not reliable.
The bad news: You can't detect a failed connection except by trying to send or receive data on that connection.
The good news: As you say, it's OK if you send duplicate data. So your solution is not to worry about detecting failure in less than the 20 seconds it now takes. Instead, simply keep a circular buffer containing the last 30 or 60 seconds' worth of data. Each time you detect a failure and then reconnect, you can start the session by resending that saved data.
(This could get to be problematic if the server repeatedly cycles up and down in less than a minute; but if it's doing that, you have other problems to deal with.)
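A minimal sketch of that idea (java.util.Deque/ArrayDeque plus java.io), keeping the last few messages so they can be replayed after a reconnect; the field names and the choice of 12 messages (roughly a minute at one message every 5 seconds) are illustrative:
// Sketch: remember recently sent messages and replay them after a reconnect.
// Duplicates are possible by design; the server side is expected to filter them.
private static final int KEEP = 12;                        // ~60 s of data at one message per 5 s
private final Deque<String> recentlySent = new ArrayDeque<>();

private synchronized void remember(String data) {
    recentlySent.addLast(data);
    if (recentlySent.size() > KEEP) {
        recentlySent.removeFirst();
    }
}

private synchronized void replayAfterReconnect(OutputStream os) throws IOException {
    for (String data : recentlySent) {                     // oldest first
        os.write(data.getBytes());
    }
    os.flush();
}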
See the accepted answer here: Java Sockets and Dropped Connections
socket.shutdownOutput();
wait for inputStream.read() to return -1, indicating the peer has also shut down its socket (see the sketch below)
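A minimal sketch of those two steps, to be called after each batch has been written; the read timeout set beforehand keeps it from hanging if the peer never closes, and the method name is illustrative:
// Sketch: half-close our side, then wait for the peer's FIN before treating the batch
// as delivered. Returns false if anything goes wrong, in which case the caller keeps
// the batch and resends it on the next connection.
private boolean closeAndConfirm(Socket socket) {
    try {
        socket.setSoTimeout(10_000);       // don't wait forever for the peer to close
        socket.shutdownOutput();           // sends FIN; data already written is still delivered
        InputStream in = socket.getInputStream();
        while (in.read() != -1) {
            // Discard anything the peer still sends; -1 means it has closed its side too.
        }
        return true;
    } catch (IOException e) {
        return false;
    } finally {
        try { socket.close(); } catch (IOException ignored) { }
    }
}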
Won't work: server cannot be modified
Can't your server acknowledge every message it receives with another packet? The client won't remove the messages that the server did not acknowledge yet.
This will have performance implications. To avoid slowing down you can keep on sending messages before an acknowledgement is received, and acknowledge several messages in one return message.
If you send a message every 5 seconds, and disconnection is not detected by the network stack for 30 seconds, you'll have to store just 6 messages. If 6 sent messages are not acknowledged, you can consider the connection to be down. (I suppose that logic of reconnection and backlog sending is already implemented in your app.)
What about sending UDP datagrams on a separate UDP socket while making the remote host respond to each, and then when the remote host doesn't respond, you kill the TCP connection? It detects a link breakage quickly enough :)
Use HTTP POST instead of a raw socket connection; then you can send a response to each post. On the client side you only remove the data from memory if the response indicates success.
Sure, it's more overhead, but it gives you what you want 100% of the time.
