ObjectInputStream.readObject() hangs forever during the process of socket communication

ObjectInputStream.readObject() hangs forever during the process of socket communication - java

I have encountered a problem of socket communication on linux system, the communication process is like below: client send a message to ask the server to do a compute task, and wait for the result message from server after the task completes.
But the client would hangs up to wait for the result message if the task costs a long time such as about 40 minutes even though from the server side, the result message has been written to the socket to respond to the client, but it could normally receive the result message if the task costs little time, such as one minute. Additionally, this problem only happens on customer environment, the communication process behaves normally in our testing environment.
I have suspected the cause to this problem is the default timeout value of socket is different between customer environment and testing environment, but the follow values are identical on these two environment, and both Client and server.
getSoTimeout:0
getReceiveBufferSize:43690
getSendBufferSize:8192
getSoLinger:-1
getTrafficClass:0
getKeepAlive:false
getTcpNoDelay:false
the codes on CLient are like:
Message msg = null;
ObjectInputStream in = client.getClient().getInputStream();
//if no message readObject() will hang here
while ( true ) {
try {
Object recObject = in.readObject();
System.out.println("Client received msg.");
msg = (Message)recObject;
return msg;
}catch (Exception e) {
e.printStackTrace();
return null;
}
}
the codes on server are like,
ObjectOutputStream socketOutStream = getSocketOutputStream();
try {
MessageJobComplete msgJobComplete = new MessageJobComplete(reportFile, outputFile );
socketOutStream.writeObject(msgJobComplete);
}catch(Exception e) {
e.printStackTrace();
}
in order to solve this problem, i have added the flush and reset method, but the problem still exists:
ObjectOutputStream socketOutStream = getSocketOutputStream();
try {
MessageJobComplete msgJobComplete = new MessageJobComplete(reportFile, outputFile );
socketOutStream.flush();
logger.debug("AbstractJob#reply to the socket");
socketOutStream.writeObject(msgJobComplete);
socketOutStream.reset();
socketOutStream.flush();
logger.debug("AbstractJob#after Flush Reply");
}catch(Exception e) {
e.printStackTrace();
logger.error("Exception when sending MessageJobComplete."+e.getMessage());
}
so do anyone knows what the next steps i should do to solve this problem.
I guess the cause is the environment setting, but I do not know what the environment factors would affect the socket communication?
And the socket using the Tcp/Ip protocal to communicate, the problem is related with the long time task, so what values about tcp would affect the timeout of socket communication?
After my analysis about the logs, i found after the message are written to the socket, there were no exceptions are thrown/caught. But always after 15 minutes, there are exceptions in the objectInputStream.readObject() codes snippet of Server Side which is used to accept the request from client. However, socket.getSoTimeout value is 0, so it is very strange that the a Timed out Exception was thrown.
{2012-01-09 17:44:13,908} ERROR java.net.SocketException: Connection timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:146)
at sun.security.ssl.InputRecord.readFully(InputRecord.java:312)
at sun.security.ssl.InputRecord.read(InputRecord.java:350)
at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:809)
at sun.security.ssl.SSLSocketImpl.readDataRecord(SSLSocketImpl.java:766)
at sun.security.ssl.AppInputStream.read(AppInputStream.java:94)
at sun.security.ssl.AppInputStream.read(AppInputStream.java:69)
at java.io.ObjectInputStream$PeekInputStream.peek(ObjectInputStream.java:2265)
at java.io.ObjectInputStream$BlockDataInputStream.peek(ObjectInputStream.java:2558)
at java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream.java:2568)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1314)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:368)
so why the Connection Timed out exceptions are thrown?

This problem is solved. using the tcpdump to capture the messages flows. I have found that while in the application level, ObjectOutputStream.writeObject() method was invoked, in the tcp level, many times [TCP ReTransmission] were found.
So, I concluded that the connection is possibly be dead, although using the netstat -an command the tcp connection state still was ESTABLISHED.
So I wrote a testing application to periodically sent Testing messages as the heart-beating messages from the Server. Then this problem disappeared.

The read() methods of java.io.InputStream are blocking calls., which means they wait "forever" if they are called when there is no data in the stream to read.
This is completely expected behaviour and as per the published contract in javadoc if the server does not respond.
If you want a non-blocking read, use the java.nio.* classes.

Related

How do I handle ServerSocketChannel.accept() IOException: too many open files in NIO?

I'm having a problem with one of my servers, on Friday morning I got the following IOException:
11/Sep/2015 01:51:39,524 [ERROR] [Thread-1] - ServerRunnable: IOException:
java.io.IOException: Too many open files
at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method) ~[?:1.7.0_75]
at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:241) ~[?:1.7.0_75]
at com.watersprint.deviceapi.server.ServerRunnable.acceptConnection(ServerRunnable.java:162) [rsrc:./:?]
at com.watersprint.deviceapi.server.ServerRunnable.run(ServerRunnable.java:121) [rsrc:./:?]
at java.lang.Thread.run(Thread.java:745) [?:1.7.0_75]
Row 162 of the ServerRunnable class is in the method below, it's the ssc.accept() call.
private void acceptConnection(Selector selector, SelectionKey key) {
try {
ServerSocketChannel ssc = (ServerSocketChannel) key.channel();
SocketChannel sc = ssc.accept();
socketConnectionCount++;
/*
* Test to force device error, for debugging purposes
*/
if (brokenSocket
&& (socketConnectionCount % brokenSocketNumber == 0)) {
sc.close();
} else {
sc.configureBlocking(false);
log.debug("*************************************************");
log.debug("Selector Thread: Client accepted from "
+ sc.getRemoteAddress());
SelectionKey newKey = sc.register(selector,
SelectionKey.OP_READ);
ClientStateMachine clientState = new ClientStateMachine();
clientState.setIpAddress(sc.getRemoteAddress().toString());
clientState.attachSelector(selector);
clientState.attachSocketChannel(sc);
newKey.attach(clientState);
}
} catch (ClosedChannelException e) {
log.error("ClosedChannelException: ", e);
ClientStateMachine clientState = (ClientStateMachine)key.attachment();
database.insertFailedCommunication(clientState.getDeviceId(),
clientState.getIpAddress(),
clientState.getReceivedString(), e.toString());
key.cancel();
} catch (IOException e) {
log.error("IOException: ", e);
}
}
How should I handle this? reading up on the error it appears to be a setting in the Linux OS that limits the number of open files a process can have.
Judging from that, and this question here, it appears that I am not closing sockets correctly (The server is currently serving around 50 clients). Is this a situation where I need a timer to monitor open sockets and time them out after an extended period?
I have some cases where a client can connect and then not send any data once the connection is established. I thought I had handled those cases properly.
It's my understanding that a non-blocking NIO server has very long timeouts, is it possible that if I've missed cases like this they might accumulate and result in this error?
This server has been running for three months without any issues.
After I go through my code and check for badly handled / missing cases, what's the best way to handle this particular error? Are there other things I should consider that might contribute to this?
Also, (Maybe this should be another question) I have log4j2 configured to send emails for log levels of error and higher, yet I didn't get an email for this error. Are there any reasons why that might be? It usually works, the error was logged to the log file as expected, but I never got an email about it. I should have gotten plenty as the error occurred every time a connection was established.

You fix your socket leaks. When you get EOS, or any IOException other than SocketTimeoutException, on a socket you must close it. In the case of SocketChannels, that means closing the channel. Merely cancelling the key, or ignoring the issue and hoping it will go away, isn't sufficient. The connection has already gone away.
The fact that you find it necessary to count broken socket connections, and catch ClosedChannelException, already indicates major logic problems in your application. You shouldn't need this. And cancelling the key of a closed channel doesn't provide any kind of a solution.
It's my understanding that a non-blocking NIO server has very long timeouts
The only timeout a non-blocking NIO server has is the timeout you specify to select(). All the timeouts built-in to the TCP stack are unaffected by whether you are using NIO or non-blocking mode.

OutputStream does not throw IOException although LAN-cable is not plugged

I established a TCP Connection with Java. The Server sends an "ALIVE"-Message to the Client every second to detect a broken connection.
but when I plug of the LAN-cable the IOException is thrown exactly after 23seconds, and not immediately after the send attempt.
My code to send looks like that
// out is an OutputStream
try {
out.write(String.format("%s\r\n", encryptedCommand).getBytes(Charset.forName("UTF-8")));
} catch (IOException e) {
fireConnectionClosed();
}
I'm pretty sure that this was working already, but now it doesn't any more

When a TCP packet fails to get through, the TCP stack resends the packet a few times. This allows the connection to stay up even if an occasional packet is lost. That prevents the problem of connections dropping whenever there's the least bit of congestion on the network. You can test this by unplugging the network for 5 or 10 seconds, then plugging it back in - the connection should be fine.
However, this feature also means that the TCP connection won't be recognized as having been lost until after a few retries.

ObjectInputStream's readObject() freezes after Client Socket connection is killed

I have following Socket server's code that reads stream from connected Socket.
try
{
ObjectInputStream in = new ObjectInputStream(client.getInputStream());
int count = 10;
while(count>0)
{
String msg = in.readObject().toString(); //Stucks here if this client is lost.
System.out.println("Client Says : "+msg);
count--;
}
in.close();
client.close();
}
catch(Exception ex)
{
ex.printStackTrace();
}
And I have a Client program, that connects with this server, sends some string every second for 10 times, and server reads from the socket for 10 times and prints the message, but if in between I kill the Client program, the Server freezes in between instead of throwing any exception or anything.
How can I detect this freeze condition? and make this loop iterate infinitely and print whatever client sends until connection is active and stable?

The problem is that the server side of the socket has no way of knowing that the client connection closed because the client code terminates without calling .close() on the client side of the socket, and therefore never sends the TCP FIN signal.
One possible way of fixing this would be to create a new Watcher thread that just periodically inspects the socket to see if it is still active. The problem with that approach is that the isConnected() on the Socket will not work for the same reason stated above so the only real way to inspect the connection is to attempt to write to it. However, this may cause random garbage to be sent to a potentially listening client.
Other options would be to implement some type of keep-alive protocol that the client should agree to (i.e., send keep-alive bits every so often so the Watcher has something to look for). You could also just move to the java.nio approach, which I believe does a better job at dealing with these conditions.
This thread is old, but provides more detail: http://www.velocityreviews.com/forums/t541628-sockets-checking-for-dropped-connections-and-close.html.

Reliable TCP/IP in Java

Is there a way to have reliable communications (the sender get informed that the message it sent is already received by the receiver) using Java TCP/IP library in java.net.*? I understand that one of the advantages of TCP over UDP is its reliability. Yet, I couldn't get that assurance in the experiment below:
I created two classes:
1) echo server => always sending back the data it received.
2) client => periodically send "Hello world" message to the echo server.
They were run on different computers (and worked perfectly). During the middle of the execution, I disconnected the network (unplugged the LAN cable). After disconnected, the server still keep waiting for a data until a few seconds passed (it eventually raised an exception). Similarly, the client also keep sending a data until a few seconds passed (an exception is raised).
The problem is, objectOutputStream.writeObject(message) doesn't guarantee the delivery status of the message (I expect it to block the thread, keep resending the data until delivered). Or at least I get informed, which messages are missing.
Server Code:
import java.net.*;
import java.io.*;
import java.io.Serializable;
public class SimpleServer {
public static void main(String args[]) {
try {
ServerSocket serverSocket = new ServerSocket(2002);
Socket socket = new Socket();
socket = serverSocket.accept();
InputStream inputStream = socket.getInputStream();
ObjectInputStream objectInputStream = new ObjectInputStream(
inputStream);
while (true) {
try {
String message = (String) objectInputStream.readObject();
System.out.println(message);
Thread.sleep(1000);
} catch (Exception ex) {
ex.printStackTrace();
}
}
} catch (Exception ex) {
ex.printStackTrace();
}
}
}
Client code:
import java.net.*;
import java.io.*;
public class SimpleClient {
public static void main(String args[]) {
try {
String serverIpAddress = "localhost"; //change this
Socket socket = new Socket(serverIpAddress, 2002);
OutputStream outputStream = socket.getOutputStream();
ObjectOutputStream objectOutputStream = new ObjectOutputStream(
outputStream);
while (true) {
String message = "Hello world!";
objectOutputStream.writeObject(message);
System.out.println(message);
Thread.sleep(1000);
}
} catch (Exception e) {
e.printStackTrace();
}
}
}

If you need to know which messages have arrived in the peer application, the peer application has to send acknowledgements.

If you want this level of guarantees it sounds like you really want JMS. This can ensure not only that messages have been delivered but also have been processed correctly. i.e. there is no point having very reliable delivery if it can be discarded due to a bug.
You can monitor which messages are waiting and which consumers are falling behind. Watch a producer to see what messages it is sending, and have messages saved when it is down and are available when it restarts. i.e. reliable delivery even if the consumer is restarted.

TCP is always reliable. You don't need confirmations. However, to check that a client is up, you might also want to use a UDP stream with confirmations. Like a PING? PONG! system. Might also be TCP settings you can adjust.

Your base assumption (and understanding of TCP) here is wrong. If you unplug and then re-plug, the message most likely will not be lost.
It boils down on how long to you want the sender to wait. One hour, one day? If you'd make the timeout one day, you would unplug for two days and still say "does not work".
So the guaranteed delivery is that "either data is delivered - or you get informed". In the second case you need to solve it on application level.

You could consider using the SO_KEEPALIVE socket option which will cause the connection to be closed if no data is transmitted over the socket for 2 hours. However, obviously in many cases this doesn't offer the level of control typically needed by applications.
A second problem is that some TCP/IP stack implementations are poor and can leave your server with dangling open connections in the event of a network outage.
Therefore, I'd advise adding application level heartbeating between your client and server to ensure that both parties are still alive. This also offers the advantage of severing the connection if, for example a 3rd party client remains alive but becomes unresponsive and hence stops sending heartbeats.

Java Sockets and Dropped Connections

What's the most appropriate way to detect if a socket has been dropped or not? Or whether a packet did actually get sent?
I have a library for sending Apple Push Notifications to iPhones through the Apple gatways (available on GitHub). Clients need to open a socket and send a binary representation of each message; but unfortunately Apple doesn't return any acknowledgement whatsoever. The connection can be reused to send multiple messages as well. I'm using the simple Java Socket connections. The relevant code is:
Socket socket = socket(); // returns an reused open socket, or a new one
socket.getOutputStream().write(m.marshall());
socket.getOutputStream().flush();
logger.debug("Message \"{}\" sent", m);
In some cases, if a connection is dropped while a message is sent or right before; Socket.getOutputStream().write() finishes successfully though. I expect it's due to the TCP window isn't exhausted yet.
Is there a way that I can tell for sure whether a packet actually got in the network or not? I experimented with the following two solutions:
Insert an additional socket.getInputStream().read() operation with a 250ms timeout. This forces a read operation that fails when the connection was dropped, but hangs otherwise for 250ms.
set the TCP sending buffer size (e.g. Socket.setSendBufferSize()) to the message binary size.
Both of the methods work, but they significantly degrade the quality of the service; throughput goes from a 100 messages/second to about 10 messages/second at most.
Any suggestions?
UPDATE:
Challenged by multiple answers questioning the possibility of the described. I constructed "unit" tests of the behavior I'm describing. Check out the unit cases at Gist 273786.
Both unit tests have two threads, a server and a client. The server closes while the client is sending data without an IOException thrown anyway. Here is the main method:
public static void main(String[] args) throws Throwable {
final int PORT = 8005;
final int FIRST_BUF_SIZE = 5;
final Throwable[] errors = new Throwable[1];
final Semaphore serverClosing = new Semaphore(0);
final Semaphore messageFlushed = new Semaphore(0);
class ServerThread extends Thread {
public void run() {
try {
ServerSocket ssocket = new ServerSocket(PORT);
Socket socket = ssocket.accept();
InputStream s = socket.getInputStream();
s.read(new byte[FIRST_BUF_SIZE]);
messageFlushed.acquire();
socket.close();
ssocket.close();
System.out.println("Closed socket");
serverClosing.release();
} catch (Throwable e) {
errors[0] = e;
}
}
}
class ClientThread extends Thread {
public void run() {
try {
Socket socket = new Socket("localhost", PORT);
OutputStream st = socket.getOutputStream();
st.write(new byte[FIRST_BUF_SIZE]);
st.flush();
messageFlushed.release();
serverClosing.acquire(1);
System.out.println("writing new packets");
// sending more packets while server already
// closed connection
st.write(32);
st.flush();
st.close();
System.out.println("Sent");
} catch (Throwable e) {
errors[0] = e;
}
}
}
Thread thread1 = new ServerThread();
Thread thread2 = new ClientThread();
thread1.start();
thread2.start();
thread1.join();
thread2.join();
if (errors[0] != null)
throw errors[0];
System.out.println("Run without any errors");
}
[Incidentally, I also have a concurrency testing library, that makes the setup a bit better and clearer. Checkout the sample at gist as well].
When run I get the following output:
Closed socket
writing new packets
Finished writing
Run without any errors

This not be of much help to you, but technically both of your proposed solutions are incorrect. OutputStream.flush() and whatever else API calls you can think of are not going to do what you need.
The only portable and reliable way to determine if a packet has been received by the peer is to wait for a confirmation from the peer. This confirmation can either be an actual response, or a graceful socket shutdown. End of story - there really is no other way, and this not Java specific - it is fundamental network programming.
If this is not a persistent connection - that is, if you just send something and then close the connection - the way you do it is you catch all IOExceptions (any of them indicate an error) and you perform a graceful socket shutdown:
1. socket.shutdownOutput();
2. wait for inputStream.read() to return -1, indicating the peer has also shutdown its socket

After much trouble with dropped connections, I moved my code to use the enhanced format, which pretty much means you change your package to look like this:
This way Apple will not drop a connection if an error happens, but will write a feedback code to the socket.

If you're sending information using the TCP/IP protocol to apple you have to be receiving acknowledgements. However you stated:
Apple doesn't return any
acknowledgement whatsoever
What do you mean by this? TCP/IP guarantees delivery therefore receiver MUST acknowledge receipt. It does not guarantee when the delivery will take place, however.
If you send notification to Apple and you break your connection before receiving the ACK there is no way to tell whether you were successful or not so you simply must send it again. If pushing the same information twice is a problem or not handled properly by the device then there is a problem. The solution is to fix the device handling of the duplicate push notification: there's nothing you can do on the pushing side.
#Comment Clarification/Question
Ok. The first part of what you understand is your answer to the second part. Only the packets that have received ACKS have been sent and received properly. I'm sure we could think of some very complicated scheme of keeping track of each individual packet ourselves, but TCP is suppose to abstract this layer away and handle it for you. On your end you simply have to deal with the multitude of failures that could occur (in Java if any of these occur an exception is raised). If there is no exception the data you just tried to send is sent guaranteed by the TCP/IP protocol.
Is there a situation where data is seemingly "sent" but not guaranteed to be received where no exception is raised? The answer should be no.
#Examples
Nice examples, this clarifies things quite a bit. I would have thought an error would be thrown. In the example posted an error is thrown on the second write, but not the first. This is interesting behavior... and I wasn't able to find much information explaining why it behaves like this. It does however explain why we must develop our own application level protocols to verify delivery.
Looks like you are correct that without a protocol for confirmation their is no guarantee the Apple device will receive the notification. Apple also only queue's the last message. Looking a little bit at the service I was able to determine this service is more for convenience for the customer, but cannot be used to guarantee service and must be combined with other methods. I read this from the following source.
http://blog.boxedice.com/2009/07/10/how-to-build-an-apple-push-notification-provider-server-tutorial/
Seems like the answer is no on whether or not you can tell for sure. You may be able to use a packet sniffer like Wireshark to tell if it was sent, but this still won't guarantee it was received and sent to the device due to the nature of the service.

Develop Reference

Java is a programming language and computing platform first released by Sun Microsystems in 1995.