Java app throws ClosedByInterruptException immediately when opening a socket, cause?

I have a Java app that holds open many connections to an address, probably in the ballpark of 2,000 at once, with hardly any activity; they are mostly open for monitoring purposes and pass a few bytes every now and then. When new connections need to be opened, it automatically opens them and adds them to its pool. Sometimes, though, for an unknown reason, the application receives a ClosedByInterruptException immediately during/after creating the socket to the remote address. To the best of my knowledge, this only occurs on the client side as a result of an interrupt signal to the thread. I have checked and rechecked the source code surrounding the problem area and it seems OK. I was hoping I could get someone's expertise as to whether there could be an alternate cause besides the source code. For instance, is there a system-level reason that causes this? A hardware cause? Something at the server or router level? I would consider my network knowledge amateur, but are 2,000 connections too many for a router, or not?
INFO [08 Sep 2011 23:11:45,982]: Reconnecting id 20831
ERROR [08 Sep 2011 23:11:45,990]: IOException while creating plain socket channel
java.nio.channels.ClosedByInterruptException
at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:184)
at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:518)
at com.*.createSocketChannelPlain(MyTask.java:441)
at com.*._executeTask(MyTask.java:176)
at com.*.executeTask(MyTask.java:90)
at com.*.ThreadPool$WorkerThread.run(ThreadPool.java:55)
ERROR [08 Sep 2011 23:11:45,990]: Could not open socket
WARN [08 Sep 2011 23:11:45,990]: WorkerThread_24 received interrupted exception in ThreadPool
java.lang.InterruptedException
at java.lang.Object.wait(Native Method)
at java.lang.Object.wait(Object.java:485)
at com.*.TaskQueue.getTask(TaskQueue.java:39)
at com.*.ThreadPool$WorkerThread.run(ThreadPool.java:48)
Update: I would like to offer all I can to help others contribute to a diagnosis. Here is the actual function where the exception occurs; the only difference is the marker I added at line 441.
private SocketChannel createSocketChannelPlain() throws TaskFailedException {
    SocketChannel socketChannel = null;
    try {
        // Create a non-blocking socket channel to use to communicate for imap connection
        socketChannel = SocketChannel.open();
        socketChannel.configureBlocking(false);
        try { socketChannel.socket().setSoLinger(true, 0); } catch (Exception e) {}
        try { socketChannel.socket().setKeepAlive(true); } catch (Exception e) {}
        /*Line 441*/ socketChannel.connect(new InetSocketAddress(_HOSTNAME, _PORT));
        //System.out.println("Started connection");

        // Complete connection
        while (!socketChannel.finishConnect()) {
            // do something until connect completed
            try {
                //do what you want to do before sleeping
                Thread.sleep(500); //sleep for 500 ms
                //do what you want to do after sleeping
            } catch (InterruptedException ie) {
                //If this thread was interrupted by another thread
                try { socketChannel.close(); } catch (Exception e) {}
                finally { socketChannel = null; }
                break;
            }
        }
        //System.out.println("Finished connecting");
        return socketChannel;
    } catch (IOException e) {
        logger.error("IOException while creating plain socket channel to gmail", e);
        try { socketChannel.close(); } catch (Exception e1) {}
        finally { socketChannel = null; }
        //throw new TaskFailedException("IOException occurred in createSocketChannel");
    }
    return socketChannel;
}
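
For reference, ClosedByInterruptException is what the JDK throws when the calling thread is interrupted while inside an interruptible channel operation such as connect(); the InterruptedException logged by the ThreadPool right afterwards points in the same direction. Here is a minimal, self-contained sketch that reproduces the exception (the target is a documentation-only address that should never answer, chosen just so the connect stays in flight long enough to be interrupted):

import java.net.InetSocketAddress;
import java.nio.channels.SocketChannel;

public class InterruptDuringConnect {
    public static void main(String[] args) throws Exception {
        Thread worker = new Thread(() -> {
            try (SocketChannel ch = SocketChannel.open()) {
                // Blocking connect to an address that never responds; if this
                // thread is interrupted while inside connect(), the channel is
                // closed and ClosedByInterruptException is thrown.
                ch.connect(new InetSocketAddress("198.51.100.1", 9999));
            } catch (Exception e) {
                e.printStackTrace(); // expect java.nio.channels.ClosedByInterruptException
            }
        });
        worker.start();
        Thread.sleep(100);   // give the worker time to enter connect()
        worker.interrupt();  // interrupt it while it is blocked there
        worker.join();
    }
}

The same applies to a non-blocking connect(): if the worker thread's interrupt flag is already set when it enters the call, the channel is closed on entry and the method fails with the same exception, which fits with the thread-pool interruption visible in the log.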

What OS are you running this on? I don't know about Windows, but on Linux (and presumably other Unix-like OSes), you can run out of file handles by having large numbers of sockets. You can work around this by doing ulimit -n 8192 or similar before running the Java app. Alternatively, edit /etc/security/limits.conf and set nofile. All of that said, ClosedByInterruptException would be an odd way to notice this.
If the above isn't the issue, the next thing I'd try would be to run tcpdump (if we're talking about a GUI-less machine) or Wireshark (if we aren't) and capture the traffic your program's generating, looking for weird things happening at the time that connection starts.
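If you want to rule out descriptor exhaustion from inside the JVM itself, HotSpot/OpenJDK on Unix-like systems exposes the counts through the com.sun.management extension; a quick diagnostic sketch (not portable to Windows, offered only as an assumption about your JVM):

import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

public class FdCheck {
    public static void main(String[] args) {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            com.sun.management.UnixOperatingSystemMXBean unixOs =
                    (com.sun.management.UnixOperatingSystemMXBean) os;
            // Compare current usage against the per-process limit (ulimit -n).
            System.out.println("open fds: " + unixOs.getOpenFileDescriptorCount()
                    + " / max: " + unixOs.getMaxFileDescriptorCount());
        } else {
            System.out.println("Not a Unix JVM; check handles with OS tools instead.");
        }
    }
}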

Related

Is there a proper way to detect and handle a suddenly disconnected server

I'm using this code to detect whether the server is not connected:
private boolean isServerListening() {
    try {
        s = new Socket("localhost", PORT);
        return true;
    } catch (IOException e) {
        System.out.println(e);
        return false;
    }
}
and this Thread to handle a suddenly disconnected server:
Thread checkServer = new Thread(() -> {
    while (true) {
        if (isServerListening() == false) {
            JOptionPane.showMessageDialog(null, "Server is disconnected!");
            System.exit(0);
        }
    }
});
The problem is:
I think the method takes too much time (about 4 seconds) to execute and return. So is there a proper way?
No matter whether the server is connected or not, this Thread still shows the JOptionPane and terminates my program. Am I wrong at some point?
There is no general solution that fits all. Basically, there are different types of "lost connection":
Your computer disconnects, so it knows immediately that the connection is closed.
The other side disconnects; it may happen that this signal never reaches your computer, so it will still think that it is connected.
The physical connection breaks, and neither side can inform the other.
The Socket class has the methods isConnected() and isClosed(), which you should use.
The only way to check a connection for certain is by sending a message and receiving an answer. Even then it might take up to 60 seconds (by default) until your computer notices the lost connection.
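On the first point (the ~4 second delay): the Socket(String, int) constructor connects with the platform's default timeout. If the check only needs to answer "is something listening right now?", a bounded probe like this sketch (reusing the question's localhost/PORT assumption) keeps the worst case short and also closes the probe socket so repeated checks don't leak descriptors:

private boolean isServerListening() {
    // Probe with an explicit connect timeout instead of the platform default.
    try (Socket probe = new Socket()) {
        probe.connect(new InetSocketAddress("localhost", PORT), 1000); // 1 second cap
        return true;
    } catch (IOException e) {
        return false;
    }
}

Note this only tells you that something accepted the connection, which is exactly the limitation described above; it does not prove the server is still healthy.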

How do I handle ServerSocketChannel.accept() IOException: too many open files in NIO?

I'm having a problem with one of my servers; on Friday morning I got the following IOException:
11/Sep/2015 01:51:39,524 [ERROR] [Thread-1] - ServerRunnable: IOException:
java.io.IOException: Too many open files
at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method) ~[?:1.7.0_75]
at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:241) ~[?:1.7.0_75]
at com.watersprint.deviceapi.server.ServerRunnable.acceptConnection(ServerRunnable.java:162) [rsrc:./:?]
at com.watersprint.deviceapi.server.ServerRunnable.run(ServerRunnable.java:121) [rsrc:./:?]
at java.lang.Thread.run(Thread.java:745) [?:1.7.0_75]
Line 162 of the ServerRunnable class is in the method below; it's the ssc.accept() call.
private void acceptConnection(Selector selector, SelectionKey key) {
    try {
        ServerSocketChannel ssc = (ServerSocketChannel) key.channel();
        SocketChannel sc = ssc.accept();
        socketConnectionCount++;
        /*
         * Test to force device error, for debugging purposes
         */
        if (brokenSocket
                && (socketConnectionCount % brokenSocketNumber == 0)) {
            sc.close();
        } else {
            sc.configureBlocking(false);
            log.debug("*************************************************");
            log.debug("Selector Thread: Client accepted from "
                    + sc.getRemoteAddress());
            SelectionKey newKey = sc.register(selector,
                    SelectionKey.OP_READ);
            ClientStateMachine clientState = new ClientStateMachine();
            clientState.setIpAddress(sc.getRemoteAddress().toString());
            clientState.attachSelector(selector);
            clientState.attachSocketChannel(sc);
            newKey.attach(clientState);
        }
    } catch (ClosedChannelException e) {
        log.error("ClosedChannelException: ", e);
        ClientStateMachine clientState = (ClientStateMachine) key.attachment();
        database.insertFailedCommunication(clientState.getDeviceId(),
                clientState.getIpAddress(),
                clientState.getReceivedString(), e.toString());
        key.cancel();
    } catch (IOException e) {
        log.error("IOException: ", e);
    }
}
How should I handle this? Reading up on the error, it appears to come from a Linux OS setting that limits the number of open files a process can have.
Judging from that, and this question here, it appears that I am not closing sockets correctly (the server is currently serving around 50 clients). Is this a situation where I need a timer to monitor open sockets and time them out after an extended period?
I have some cases where a client can connect and then not send any data once the connection is established. I thought I had handled those cases properly.
It's my understanding that a non-blocking NIO server has very long timeouts, is it possible that if I've missed cases like this they might accumulate and result in this error?
This server has been running for three months without any issues.
After I go through my code and check for badly handled / missing cases, what's the best way to handle this particular error? Are there other things I should consider that might contribute to this?
Also (maybe this should be another question): I have log4j2 configured to send emails for log levels of error and higher, yet I didn't get an email for this error. Are there any reasons why that might be? It usually works; the error was logged to the log file as expected, but I never got an email about it. I should have gotten plenty, since the error occurred every time a connection was established.
You fix your socket leaks. When you get EOS, or any IOException other than SocketTimeoutException, on a socket you must close it. In the case of SocketChannels, that means closing the channel. Merely cancelling the key, or ignoring the issue and hoping it will go away, isn't sufficient. The connection has already gone away.
The fact that you find it necessary to count broken socket connections, and catch ClosedChannelException, already indicates major logic problems in your application. You shouldn't need this. And cancelling the key of a closed channel doesn't provide any kind of a solution.
It's my understanding that a non-blocking NIO server has very long timeouts
The only timeout a non-blocking NIO server has is the timeout you specify to select(). All the timeouts built-in to the TCP stack are unaffected by whether you are using NIO or non-blocking mode.
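As a concrete illustration of "when you get EOS or an IOException, close the channel", a read handler in the same selector loop might look roughly like this (the method name, buffer size, and log calls are illustrative, not taken from the question's code):

private void readConnection(SelectionKey key) {
    SocketChannel sc = (SocketChannel) key.channel();
    ByteBuffer buf = ByteBuffer.allocate(4096);
    try {
        int n = sc.read(buf);
        if (n == -1) {
            // End of stream: the peer closed. Closing the channel also
            // cancels its key; cancelling the key alone is not enough.
            sc.close();
            return;
        }
        buf.flip();
        // ... hand the bytes to the protocol/state machine here ...
    } catch (IOException e) {
        // Any IOException on read means the connection is already gone.
        log.error("IOException on read, closing channel", e);
        try { sc.close(); } catch (IOException ignored) {}
    }
}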

'ServerSocket.accept()' Loop Without SocketTimeoutException (Java) (Alternative Solution)

Explanation
I'm revisiting the project I used to teach myself Java.
In this project I want to be able to stop the server from accepting new clients and then perform a few 'cleanup' operations before exiting the JVM.
In that project I used the following style for a client accept/handle loop:
//Exit loop by changing running to false and waiting up to 2 seconds
ServerSocket serverSocket = new ServerSocket(123);
serverSocket.setSoTimeout(2000);
Socket client;
while (running) { // 'running' is a private static boolean
    try {
        client = serverSocket.accept();
        createComms(client); //Handles Connection in New Thread
    } catch (IOException ex) {
        //Do Nothing
    }
}
With this approach, a SocketTimeoutException will be thrown every 2 seconds if no clients are connecting, and I don't like relying on exceptions for normal operation unless it's necessary.
I've been experimenting with the following style to try and minimise relying on Exceptions for normal operation:
//Exit loop by calling serverSocket.close()
ServerSocket serverSocket = new ServerSocket(123);
Socket client;
try {
    while ((client = serverSocket.accept()) != null) {
        createComms(client); //Handles Connection in New Thread
    }
} catch (IOException ex) {
    //Do Nothing
}
In this case my intention is that an Exception will only be thrown when I call serverSocket.close() or if something goes wrong.
Question
Is there any significant difference in the two approaches, or are they both viable solutions?
I'm totally self-taught, so I have no idea if I've re-invented the wheel for no reason or if I've come up with something good.
I've been lurking on SO for a while, this is the first time I've not been able to find what I need already.
Please feel free to suggest completely different approaches =3
The problem with the second approach is that the server will die if an exception occurs in the while loop.
The first approach is better, though you might want to add logging of the exceptions using Log4j.
while (running) {
    try {
        client = serverSocket.accept();
        createComms(client);
    } catch (IOException ex) {
        // Log errors
        LOG.warn(ex, ex);
    }
}
Non-blocking IO is what you're looking for. Instead of blocking until a SocketChannel (non-blocking alternative to Socket) is returned, it'll return null if there is currently no connection to accept.
This will allow you to remove the timeout, since nothing will be blocking.
You could also register a Selector, which informs you when there is a connection to accept or when there is data to read. I have a small example of that here, as well as a non-blocking ServerSocket that doesn't use a selector.
EDIT: In case something goes wrong with my link, here is the example of non-blocking IO, without a selector, accepting a connection:
class Server {
    public static void main(String[] args) throws Exception {
        ServerSocketChannel ssc = ServerSocketChannel.open();
        ssc.bind(new InetSocketAddress(123)); // bind before accepting, otherwise accept() throws NotYetBoundException
        ssc.configureBlocking(false);
        while (true) {
            SocketChannel sc = ssc.accept();
            if (sc != null) {
                //handle channel
            }
        }
    }
}
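Since the answer also mentions registering a Selector, here is a rough sketch of the selector-driven accept loop (port 123 and the running flag are borrowed from the question's examples; this is not a drop-in implementation):

ServerSocketChannel ssc = ServerSocketChannel.open();
ssc.bind(new InetSocketAddress(123));
ssc.configureBlocking(false);
Selector selector = Selector.open();
ssc.register(selector, SelectionKey.OP_ACCEPT);

while (running) {
    selector.select(2000);                 // blocks until an event or a 2 s timeout
    Iterator<SelectionKey> it = selector.selectedKeys().iterator();
    while (it.hasNext()) {
        SelectionKey key = it.next();
        it.remove();
        if (key.isAcceptable()) {
            SocketChannel sc = ((ServerSocketChannel) key.channel()).accept();
            if (sc != null) {
                sc.configureBlocking(false);
                // hand the channel off, or register it for OP_READ here
            }
        }
    }
}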
The second approach is better (for the reasons you mentioned: relying on exceptions in normal program flow is not good practice), although your code suggests that serverSocket.accept() can return null, which it cannot. The method can throw all kinds of exceptions, though (see the API docs). You might want to catch those exceptions: a server should not go down without a very good reason.
I have been using the second approach with good success, but added some more code to make it more stable/reliable: see my take on it here (unit tests here). One of the 'cleanup' tasks to consider is to give the threads that are handling client communications some time to finish, or to properly inform the client that the connection will be closed. This prevents situations where the client is not sure whether the server completed an important task before the connection was suddenly lost/closed.
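One way to implement that cleanup step, assuming the client handlers run on an ExecutorService (the question's createComms() may start plain threads, so this is an assumption; awaitTermination also throws InterruptedException, which the enclosing method needs to handle or declare), is a sketch along these lines:

// Stop accepting: the blocked accept() throws, which ends the loop.
serverSocket.close();

// Then give in-flight client handlers time to finish before exiting the JVM.
clientPool.shutdown();                        // stop accepting new tasks
if (!clientPool.awaitTermination(10, TimeUnit.SECONDS)) {
    clientPool.shutdownNow();                 // interrupt any stragglers
}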

Start and Stop Tomcat from java code

Based on code I saw on Stack Overflow and other pages on the Internet, I've created a method to stop and start Tomcat at the moment I run a certain process in my system, because I need to free memory in my OS. I use System.gc(), but it's still not enough to free memory. This is the code:
Global declaration:
private String server = "localhost";
Method to stop-start tomcat:
public void tomcat(){
    try {
        Socket s = new Socket(server, 8005);
        if (s.isConnected()) {
            PrintWriter print = new PrintWriter(s.getOutputStream(), true);
            print.println("SHUTDOWN"); /*Command to stop tomcat according to the line "<Server port="8005" shutdown="SHUTDOWN">" in catalina_home/conf/server.xml*/
            print.close();
            s.close();
        }
        Runtime.getRuntime().exec(System.getProperty("catalina.home") + "\\bin\\startup.bat"); /*Instruction to run tomcat after it gets stopped*/
    } catch (Exception ex) {
        ex.printStackTrace();
    }
}
The code line to start Tomcat works perfectly, but not the instructions to stop it: when I instantiate the socket, it gives me the following message: Connection refused: connect.
How can I solve this? or, is there another way to stop tomcat?
Thanks in advance.
public static void shut_server(String lien)
{
    try {
        Process child = Runtime.getRuntime().exec(lien + "/shutdown.sh");
        System.out.println("Server reached");
    } catch (IOException ex) {
        Logger.getLogger(Installation.class.getName()).log(Level.SEVERE, null, ex);
        System.out.println("startup error");
    }
}
lien = the path to your Tomcat bin directory
for example - /home/zakaria/Téléchargements/apache-tomcat-8.0.21/bin
I had a similar issue. I was getting the "Connection refused: connect" error message on creating the socket.
However, my use case is different from the one posted by Vlad. When the Tomcat server is starting up, my app checks the availability of some resources, and if they are not available, it needs to shut down the server.
I added a 30 seconds sleep just before the line creating socket:
try {
Thread.sleep(30000);
}
catch (Exception excp) {}
Socket socket = new Socket("localhost", port);
and it started working.
I think when Tomcat is starting up it needs some time to make the shutdown port ready to work.
The choice of 30 seconds is arbitrary; it could probably be shorter.
FYI, my Tomcat is running as a Windows service.
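Instead of an arbitrary fixed sleep, one alternative (a sketch only; it assumes the enclosing method declares IOException and InterruptedException) is to retry connecting for a bounded period until the shutdown port is actually listening:

Socket socket = null;
for (int attempt = 0; attempt < 30 && socket == null; attempt++) {
    try {
        socket = new Socket("localhost", port);  // succeeds once the shutdown port is up
    } catch (IOException notYetUp) {
        Thread.sleep(1000);                      // wait a second and try again
    }
}
if (socket == null) {
    throw new IOException("Shutdown port did not become available in 30 seconds");
}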

ObjectInputStream.readObject() hangs forever during the process of socket communication

I have encountered a problem with socket communication on a Linux system. The communication process is as follows: the client sends a message asking the server to do a compute task, and waits for the result message from the server after the task completes.
But the client hangs waiting for the result message if the task takes a long time, for example about 40 minutes, even though on the server side the result message has been written to the socket in response to the client. It receives the result message normally if the task takes little time, such as one minute. Additionally, this problem only happens in the customer environment; the communication process behaves normally in our testing environment.
I suspected the cause of this problem is that the default socket timeout value differs between the customer environment and the testing environment, but the following values are identical in the two environments, on both client and server.
getSoTimeout:0
getReceiveBufferSize:43690
getSendBufferSize:8192
getSoLinger:-1
getTrafficClass:0
getKeepAlive:false
getTcpNoDelay:false
The code on the client looks like this:
Message msg = null;
ObjectInputStream in = client.getClient().getInputStream();
//if no message readObject() will hang here
while ( true ) {
    try {
        Object recObject = in.readObject();
        System.out.println("Client received msg.");
        msg = (Message) recObject;
        return msg;
    } catch (Exception e) {
        e.printStackTrace();
        return null;
    }
}
The code on the server looks like this:
ObjectOutputStream socketOutStream = getSocketOutputStream();
try {
    MessageJobComplete msgJobComplete = new MessageJobComplete(reportFile, outputFile);
    socketOutStream.writeObject(msgJobComplete);
} catch (Exception e) {
    e.printStackTrace();
}
In order to solve this problem, I added flush and reset calls, but the problem still exists:
ObjectOutputStream socketOutStream = getSocketOutputStream();
try {
    MessageJobComplete msgJobComplete = new MessageJobComplete(reportFile, outputFile);
    socketOutStream.flush();
    logger.debug("AbstractJob#reply to the socket");
    socketOutStream.writeObject(msgJobComplete);
    socketOutStream.reset();
    socketOutStream.flush();
    logger.debug("AbstractJob#after Flush Reply");
} catch (Exception e) {
    e.printStackTrace();
    logger.error("Exception when sending MessageJobComplete." + e.getMessage());
}
So does anyone know what next steps I should take to solve this problem?
I guess the cause is an environment setting, but I do not know which environment factors would affect the socket communication.
The socket uses the TCP/IP protocol to communicate, and the problem is related to long-running tasks, so which TCP values would affect the timeout of the socket communication?
After analyzing the logs, I found that after the message was written to the socket, no exceptions were thrown or caught. But consistently after 15 minutes, exceptions appeared in the objectInputStream.readObject() code on the server side, which is used to accept requests from the client. However, the socket.getSoTimeout value is 0, so it is very strange that a timed-out exception was thrown.
{2012-01-09 17:44:13,908} ERROR java.net.SocketException: Connection timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:146)
at sun.security.ssl.InputRecord.readFully(InputRecord.java:312)
at sun.security.ssl.InputRecord.read(InputRecord.java:350)
at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:809)
at sun.security.ssl.SSLSocketImpl.readDataRecord(SSLSocketImpl.java:766)
at sun.security.ssl.AppInputStream.read(AppInputStream.java:94)
at sun.security.ssl.AppInputStream.read(AppInputStream.java:69)
at java.io.ObjectInputStream$PeekInputStream.peek(ObjectInputStream.java:2265)
at java.io.ObjectInputStream$BlockDataInputStream.peek(ObjectInputStream.java:2558)
at java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream.java:2568)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1314)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:368)
So why are the "Connection timed out" exceptions thrown?
This problem is solved. Using tcpdump to capture the message flows, I found that while at the application level the ObjectOutputStream.writeObject() method was invoked, at the TCP level many [TCP Retransmission] segments were seen.
So I concluded that the connection was probably dead, even though according to the netstat -an command the TCP connection state was still ESTABLISHED.
So I wrote a test application that periodically sends test messages as heartbeat messages from the server. Then this problem disappeared.
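A heartbeat along those lines can be as simple as a scheduled task that periodically writes a small marker object on the same stream. A sketch, where Heartbeat is a hypothetical Serializable class and the synchronized block exists so heartbeats don't interleave with real replies:

ScheduledExecutorService heartbeats = Executors.newSingleThreadScheduledExecutor();
heartbeats.scheduleAtFixedRate(() -> {
    try {
        synchronized (socketOutStream) {      // don't interleave with real messages
            socketOutStream.writeObject(new Heartbeat());
            socketOutStream.flush();
        }
    } catch (IOException e) {
        // A failed write surfaces a dead connection long before the
        // multi-minute TCP retransmission timeout gives up.
        logger.error("Heartbeat failed, connection is probably dead", e);
    }
}, 1, 1, TimeUnit.MINUTES);

The client side then has to recognise and discard Heartbeat instances when it reads objects.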
The read() methods of java.io.InputStream are blocking calls, which means they wait "forever" if they are called when there is no data in the stream to read.
That is completely expected behaviour, and per the published contract in the Javadoc, it is what happens if the server does not respond.
If you want a non-blocking read, use the java.nio.* classes.
