In my software I need to send messages between client and server through an ObjectOutputStream.
The core of the sender method is the following:
....
try {
    objWriter.writeUnshared(bean);
    objWriter.flush();
} catch (Exception e) {
    ....
}
...
Running my application on Windows XP, when the network cable is removed, writeUnshared throws an exception.
Now I'm trying to run my application on Ubuntu 12.10, and the method doesn't throw anything when I remove the cable!
Any hints?
Whether and when you get the exception depends on:
how large the socket send buffer is at your end
how large the socket receive buffer is at the peer
how much unacknowledged data you have already written
how long it is since you wrote that, and
the internal timers of your TCP stack.
The only part of that you can control from Java is your own socket send buffer. It is therefore entirely unpredictable when, and whether, the exception will be delivered, so you must not write your application to depend on a specific behaviour.
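Since the send buffer is the one knob you do have, here is a minimal sketch of requesting a small buffer before connecting (the host and port are placeholders). Note that setSendBufferSize() is only a hint to the OS, and a smaller buffer merely tends to make failures surface sooner; it does not make detection deterministic:

import java.net.InetSocketAddress;
import java.net.Socket;

Socket socket = new Socket();
// Request a small send buffer BEFORE connecting; the OS may round it up or down.
socket.setSendBufferSize(8 * 1024);
socket.connect(new InetSocketAddress("server.example.com", 9000));
// With less room for unacknowledged data, a dead link tends to turn into
// an IOException on write() somewhat earlier.
System.out.println("Effective send buffer: " + socket.getSendBufferSize());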
Yes, but following the methods called by writeUnshared and flush, I see that write(OutputStream.class:106) is called, and this method must generate an exception if the stream is closed. So I use that information to check whether my channel is active. The problem is that on Ubuntu the channel seems to be open even after I remove the cable.
Related
I am using java.net.DatagramSocket to send UDP packets to a statsd server from a Google App Engine servlet. This generally works; however, we periodically see the following exception:
IOException - Socket is closed: Unknown socket_descriptor..
When these IOExceptions occur, calling DatagramSocket.isClosed() returns false.
This issue happens frequently enough that it is concerning, and although I've put some workarounds in place (allocate a new socket and use a DeferredTask queue to retry), it would be good to understand the underlying reason for these errors.
The Google docs mention, "Sockets may be reclaimed after 2 minutes of inactivity; any socket operation keeps the socket alive for a further 2 minutes." It is unclear to me how this plays into UDP datagrams; however, one suspicion I have is that this is related to the GAE instance lifecycle in some way.
My code (sanitized and extracted) looks like:
DatagramSocket _socket;

void init() {
    _socket = new DatagramSocket();
}

void send() {
    DatagramPacket packet = new DatagramPacket(<BYTES>, <LENGTH>, <HOST>, <PORT>);
    _socket.send(packet);
}
Appreciate any feedback on this!
The approach taken to work around this issue was simply to manage a single static DatagramSocket instance with a couple of helper methods, getSocket() and releaseSocket(): sockets that throw IOExceptions are released through the release method, and a new one is allocated on the next access through the get method. Not shown in this code is the retry logic for the failed socket.send(). Under load testing, this seems to work reliably.
try {
    DatagramPacket packet = new DatagramPacket(<BYTES>, <LENGTH>, <HOST>, <PORT>);
    getSocket().send(packet);
} catch (IOException ioe) {
    releaseSocket();
}
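For reference, a minimal sketch of what that getSocket()/releaseSocket() pair could look like; the method names come from the description above, while the synchronization and lazy re-allocation are my assumptions:

import java.net.DatagramSocket;
import java.net.SocketException;

private static DatagramSocket _socket;

// Lazily (re)allocate the shared socket.
static synchronized DatagramSocket getSocket() throws SocketException {
    if (_socket == null) {
        _socket = new DatagramSocket();
    }
    return _socket;
}

// Drop a socket that threw an IOException; the next getSocket()
// call allocates a fresh one.
static synchronized void releaseSocket() {
    if (_socket != null) {
        _socket.close();
        _socket = null;
    }
}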
I have the following socket server code that reads a stream from a connected socket.
try {
    ObjectInputStream in = new ObjectInputStream(client.getInputStream());
    int count = 10;
    while (count > 0) {
        String msg = in.readObject().toString(); // gets stuck here if this client is lost
        System.out.println("Client Says : " + msg);
        count--;
    }
    in.close();
    client.close();
} catch (Exception ex) {
    ex.printStackTrace();
}
And I have a client program that connects to this server and sends a string every second, ten times, while the server reads from the socket ten times and prints each message. But if I kill the client program partway through, the server freezes instead of throwing an exception or anything.
How can I detect this freeze condition, and make the loop iterate indefinitely, printing whatever the client sends for as long as the connection is active and stable?
The problem is that the server side of the socket has no way of knowing that the client connection closed, because the client code terminated without calling .close() on its side of the socket and therefore never sent the TCP FIN.
One possible way of fixing this would be to create a watcher thread that periodically inspects the socket to see if it is still active. The problem with that approach is that isConnected() on the Socket will not work, for the same reason stated above, so the only real way to inspect the connection is to attempt to write to it. However, this may send random garbage to a client that is potentially still listening.
Another option is to implement some type of keep-alive protocol that the client agrees to (i.e., the client sends keep-alive bits every so often so the watcher has something to look for); a sketch of that idea follows. You could also move to the java.nio approach, which I believe does a better job of dealing with these conditions.
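Here is one hedged sketch of the read-timeout variant: if the client agrees to send something (a heartbeat or a real message) at least every N seconds, the server can use Socket.setSoTimeout() so a silent connection eventually fails the read instead of blocking forever. The 30-second interval is an assumption:

import java.io.ObjectInputStream;
import java.net.Socket;
import java.net.SocketTimeoutException;

void readLoop(Socket client) throws Exception {
    client.setSoTimeout(30000); // read() throws after 30 s of silence
    ObjectInputStream in = new ObjectInputStream(client.getInputStream());
    try {
        while (true) {
            Object msg = in.readObject(); // heartbeat or real message
            System.out.println("Client says: " + msg);
        }
    } catch (SocketTimeoutException dead) {
        System.out.println("No data for 30 s - assuming the connection is dead");
    } finally {
        client.close();
    }
}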
This thread is old, but provides more detail: http://www.velocityreviews.com/forums/t541628-sockets-checking-for-dropped-connections-and-close.html.
I have encountered a socket communication problem on a Linux system. The communication process is as follows: the client sends a message asking the server to do a compute task, and waits for the result message from the server after the task completes.
But the client hangs waiting for the result message if the task takes a long time, say about 40 minutes, even though on the server side the result message has been written to the socket in response. It receives the result message normally if the task takes little time, such as one minute. Additionally, this problem only happens in the customer environment; communication behaves normally in our testing environment.
I suspected the cause was a difference in the socket's default timeout value between the customer and testing environments, but the following values are identical in both environments, on both client and server:
getSoTimeout:0
getReceiveBufferSize:43690
getSendBufferSize:8192
getSoLinger:-1
getTrafficClass:0
getKeepAlive:false
getTcpNoDelay:false
The code on the client looks like:
Message msg = null;
ObjectInputStream in = client.getClient().getInputStream();
// if there is no message, readObject() will hang here
while (true) {
    try {
        Object recObject = in.readObject();
        System.out.println("Client received msg.");
        msg = (Message) recObject;
        return msg;
    } catch (Exception e) {
        e.printStackTrace();
        return null;
    }
}
The code on the server looks like:
ObjectOutputStream socketOutStream = getSocketOutputStream();
try {
    MessageJobComplete msgJobComplete = new MessageJobComplete(reportFile, outputFile);
    socketOutStream.writeObject(msgJobComplete);
} catch (Exception e) {
    e.printStackTrace();
}
In order to solve this problem, I added flush() and reset() calls, but the problem still exists:
ObjectOutputStream socketOutStream = getSocketOutputStream();
try {
    MessageJobComplete msgJobComplete = new MessageJobComplete(reportFile, outputFile);
    socketOutStream.flush();
    logger.debug("AbstractJob#reply to the socket");
    socketOutStream.writeObject(msgJobComplete);
    socketOutStream.reset();
    socketOutStream.flush();
    logger.debug("AbstractJob#after Flush Reply");
} catch (Exception e) {
    e.printStackTrace();
    logger.error("Exception when sending MessageJobComplete." + e.getMessage());
}
Does anyone know what steps I should take next to solve this problem?
I guess the cause is an environment setting, but I do not know which environment factors would affect socket communication.
The sockets communicate over the TCP/IP protocol, and the problem is related to the long-running task, so which TCP values would affect socket communication timeouts?
After analyzing the logs, I found that after the message was written to the socket, no exceptions were thrown or caught. But always after 15 minutes, there are exceptions in the ObjectInputStream.readObject() code snippet on the server side, which is used to accept requests from the client. However, the socket's getSoTimeout value is 0, so it is very strange that a timed-out exception was thrown.
{2012-01-09 17:44:13,908} ERROR java.net.SocketException: Connection timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:146)
at sun.security.ssl.InputRecord.readFully(InputRecord.java:312)
at sun.security.ssl.InputRecord.read(InputRecord.java:350)
at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:809)
at sun.security.ssl.SSLSocketImpl.readDataRecord(SSLSocketImpl.java:766)
at sun.security.ssl.AppInputStream.read(AppInputStream.java:94)
at sun.security.ssl.AppInputStream.read(AppInputStream.java:69)
at java.io.ObjectInputStream$PeekInputStream.peek(ObjectInputStream.java:2265)
at java.io.ObjectInputStream$BlockDataInputStream.peek(ObjectInputStream.java:2558)
at java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream.java:2568)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1314)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:368)
So why are the Connection timed out exceptions thrown?
This problem is solved. Using tcpdump to capture the message flows, I found that while the ObjectOutputStream.writeObject() method was being invoked at the application level, many [TCP Retransmission] segments appeared at the TCP level.
So I concluded that the connection was probably dead, even though the netstat -an command still showed the TCP connection state as ESTABLISHED.
So I wrote a test application in which the server periodically sent test messages as heartbeats. Then this problem disappeared.
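For anyone hitting the same thing, a rough sketch of that kind of heartbeat; the 60-second interval and the Heartbeat marker class are my assumptions. The point is that a periodic write makes a dead connection fail fast with an IOException instead of silently retransmitting:

import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

class Heartbeat implements Serializable { } // hypothetical marker message

void startHeartbeat(final ObjectOutputStream out) {
    ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
    timer.scheduleAtFixedRate(new Runnable() {
        public void run() {
            try {
                synchronized (out) { // don't interleave with result writes
                    out.writeObject(new Heartbeat());
                    out.flush();
                }
            } catch (Exception e) {
                // a dead connection surfaces here, long before the task finishes
                System.err.println("Connection looks dead: " + e);
            }
        }
    }, 60, 60, TimeUnit.SECONDS);
}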
The read() methods of java.io.InputStream are blocking calls, which means they wait "forever" if they are called when there is no data in the stream to read.
If the server does not respond, this is completely expected behaviour, per the published contract in the Javadoc.
If you want a non-blocking read, use the java.nio.* classes.
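For completeness, a bare-bones sketch of a non-blocking read with java.nio; the host and port are placeholders:

import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SocketChannel;

SocketChannel channel = SocketChannel.open(new InetSocketAddress("host", 1234));
channel.configureBlocking(false); // reads no longer block

ByteBuffer buf = ByteBuffer.allocate(4096);
int n = channel.read(buf); // returns 0 immediately if no data is available,
                           // -1 if the peer closed the connection cleanly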
As the topic suggests, I have a server and some clients.
The server accepts client connections concurrently (no queueing of socket connections), but I have a troubling issue that I do not know how to get around!
If I force a client to throw an IOException, the server detects it and terminates the client thread correctly (verified in Task Manager on Windows and System Monitor on Ubuntu). But if I emulate I/O that "hangs", e.g. with Thread.sleep(60*1000); or
private static Object lock = new Object();

synchronized (lock) {
    while (true) {
        try {
            lock.wait();
        } catch (InterruptedException e) {
            /* Foo */
        }
    }
}
then all subsequent I/O operations (connection and data transfer) seem to block or wait until the "hanging" client is terminated. The application makes use of an ExecutorService, so if the "hanging" client does not complete its operations within the suggested time limit, the task times out, the client is forced to exit, and the "blocked" I/Os resume. But I wonder why the server doesn't accept any connections or perform any I/O operations while a client "hangs"?
NOTE: The client threading takes place in the server's main loop like this:
while (true) {
    accept client connection;
    submit client task;
    // via an ExecutorService, in the form
    // spService.submit(new Callable<Tuple<String[], BigDecimal[]>>() {
    //     ... code ... }}).get(taskTimeout, taskTimeUnit);
    check task result & perform cleanup if result is null;
    otherwise continue;
}
The Problem:
This very likely means that your server accepts client connections concurrently but handles those connections synchronously. That means that even if a million clients connect successfully, if any one of them takes a long time (or hangs) at any given moment, it will hold up the others.
The Test:
To verify this, I would vary the amount of time a client takes by adding Thread.sleep(1000) statements in your clients.
Expected Result:
I believe you will see that even a single Thread.sleep(1000) statement in one client delays all other connecting clients by 1000 ms.
I think I have found the source of my problems!
I do use a one-thread-per-client model, but I run my tests locally, i.e. on the same machine, which means all of them have the same IP! So each client is assigned the same IP as the server! I guess this leaves the server and clients differing only in port number, but since each client is mapped to a different local port for each server connection, the server shouldn't block. I have confirmed that each client and the server use different I/O streams (compared references), and I wrap the sockets' <Input/Output>Streams in BufferedReaders and PrintWriters, but still, when a client hangs, all other clients hang too (so maybe the I/O channels are indeed the same?). I will test this on another machine and check the results back with you! :)
EDIT: Confirmed the erratic behaviour. It seems that even with remote clients, if one hangs, the other clients hang too! :/
I don't know why, but I am determined to fix this. It's just pretty weird, since I am quite sure I use one thread per client (the I/O streams differ, the client sockets differ, the IPs seem not to be a problem, and I even map each client in the server to a local port of my choice ...).
Maybe I'll switch to NIO if I don't find a solution soon enough.
SOLUTION: Solved the problem! It turned out that the ExecutorService had to be run in a separate thread; otherwise, if a client's I/O blocked, all I/Os would block! That's strange, given that I tried both Executors.newFixedThreadPool(<nThreads>); and Executors.newCachedThreadPool(); and the client actions (i.e. the I/Os) should take place in a new thread for each client.
In any case, I wrapped the calls in a method so each client instance uses a final ExecutorService baseWorker = Executors.newSingleThreadExecutor(); and I explicitly create a new Thread each time using <Thread instance>.start(); so each one runs in the background :)
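In case it helps someone else, here is my reconstruction of what the fix amounts to. The key point is that Future.get(taskTimeout, ...) must not run on the thread that calls accept(), or one slow client stalls every new connection; handleClient() is a hypothetical stand-in for the per-client work:

import java.net.ServerSocket;
import java.net.Socket;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

final ExecutorService workers = Executors.newCachedThreadPool();
ServerSocket server = new ServerSocket(9000);
while (true) {
    final Socket client = server.accept(); // the accept thread never waits on a task
    new Thread(new Runnable() {
        public void run() {
            Future<Object> task = workers.submit(new Callable<Object>() {
                public Object call() throws Exception {
                    return handleClient(client); // hypothetical per-client work
                }
            });
            try {
                task.get(30, TimeUnit.SECONDS); // time-limited wait, off the accept thread
            } catch (Exception e) {
                task.cancel(true); // hung client: give up and clean up
                try { client.close(); } catch (Exception ignored) { }
            }
        }
    }).start();
}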
What's the most appropriate way to detect if a socket has been dropped or not? Or whether a packet did actually get sent?
I have a library for sending Apple Push Notifications to iPhones through the Apple gateways (available on GitHub). Clients need to open a socket and send a binary representation of each message; but unfortunately Apple doesn't return any acknowledgement whatsoever. The connection can be reused to send multiple messages as well. I'm using plain Java Socket connections. The relevant code is:
Socket socket = socket(); // returns a reused open socket, or a new one
socket.getOutputStream().write(m.marshall());
socket.getOutputStream().flush();
logger.debug("Message \"{}\" sent", m);
In some cases, if the connection is dropped while a message is being sent, or right before, Socket.getOutputStream().write() still finishes successfully. I expect this is because the TCP window isn't exhausted yet.
Is there a way that I can tell for sure whether a packet actually got onto the network or not? I experimented with the following two solutions:
Insert an additional socket.getInputStream().read() operation with a 250 ms timeout. This forces a read operation that fails when the connection was dropped, but otherwise hangs for 250 ms.
Set the TCP send buffer size (e.g. via Socket.setSendBufferSize()) to the message's binary size.
Both methods work, but they significantly degrade the quality of service: throughput drops from 100 messages/second to about 10 messages/second at most.
Any suggestions?
UPDATE:
Challenged by multiple answers questioning whether the described behaviour is possible, I constructed "unit" tests of it. Check out the test cases at Gist 273786.
Both tests have two threads, a server and a client. The server closes the connection while the client is sending data, and no IOException is thrown. Here is the main method:
public static void main(String[] args) throws Throwable {
    final int PORT = 8005;
    final int FIRST_BUF_SIZE = 5;
    final Throwable[] errors = new Throwable[1];
    final Semaphore serverClosing = new Semaphore(0);
    final Semaphore messageFlushed = new Semaphore(0);

    class ServerThread extends Thread {
        public void run() {
            try {
                ServerSocket ssocket = new ServerSocket(PORT);
                Socket socket = ssocket.accept();
                InputStream s = socket.getInputStream();
                s.read(new byte[FIRST_BUF_SIZE]);
                messageFlushed.acquire();
                socket.close();
                ssocket.close();
                System.out.println("Closed socket");
                serverClosing.release();
            } catch (Throwable e) {
                errors[0] = e;
            }
        }
    }

    class ClientThread extends Thread {
        public void run() {
            try {
                Socket socket = new Socket("localhost", PORT);
                OutputStream st = socket.getOutputStream();
                st.write(new byte[FIRST_BUF_SIZE]);
                st.flush();
                messageFlushed.release();
                serverClosing.acquire(1);
                System.out.println("writing new packets");
                // sending more packets while the server has already
                // closed the connection
                st.write(32);
                st.flush();
                st.close();
                System.out.println("Sent");
            } catch (Throwable e) {
                errors[0] = e;
            }
        }
    }

    Thread thread1 = new ServerThread();
    Thread thread2 = new ClientThread();
    thread1.start();
    thread2.start();
    thread1.join();
    thread2.join();

    if (errors[0] != null)
        throw errors[0];
    System.out.println("Run without any errors");
}
[Incidentally, I also have a concurrency testing library that makes the setup a bit better and clearer. Check out the sample at the gist as well.]
When run I get the following output:
Closed socket
writing new packets
Finished writing
Run without any errors
This may not be of much help to you, but technically both of your proposed solutions are incorrect. OutputStream.flush() and whatever other API calls you can think of are not going to do what you need.
The only portable and reliable way to determine if a packet has been received by the peer is to wait for a confirmation from the peer. This confirmation can either be an actual response or a graceful socket shutdown. End of story: there really is no other way, and this is not Java-specific; it is fundamental network programming.
If this is not a persistent connection, that is, if you just send something and then close the connection, the way you do it is to catch all IOExceptions (any of them indicates an error) and perform a graceful socket shutdown:
1. socket.shutdownOutput();
2. wait for inputStream.read() to return -1, indicating the peer has also shut down its socket
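In code, that graceful shutdown looks roughly like this (a sketch, not specific to APNs):

import java.io.IOException;
import java.io.InputStream;
import java.net.Socket;

void sendAndConfirm(Socket socket, byte[] message) throws IOException {
    socket.getOutputStream().write(message);
    socket.getOutputStream().flush();
    socket.shutdownOutput(); // step 1: half-close; sends FIN, reading is still allowed
    InputStream in = socket.getInputStream();
    while (in.read() != -1) {
        // step 2: drain until read() returns -1, i.e. the peer shut down too
    }
    socket.close(); // at this point the peer has taken all the data
}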
After much trouble with dropped connections, I moved my code to use the enhanced notification format, which pretty much means you change your packet to the enhanced layout (sketched below).
This way Apple will not drop the connection if an error happens, but will write a feedback code to the socket.
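For reference, here is a sketch of building that enhanced-format frame as I understand the legacy APNs binary protocol: a command byte of 1, a 4-byte identifier, a 4-byte expiry, then the length-prefixed device token and payload. Treat the details as assumptions and check Apple's documentation:

import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

byte[] enhancedFrame(int identifier, int expiryEpochSeconds,
                     byte[] deviceToken, byte[] payloadUtf8) throws IOException {
    ByteArrayOutputStream buf = new ByteArrayOutputStream();
    DataOutputStream out = new DataOutputStream(buf);
    out.writeByte(1);                   // command 1 = enhanced notification
    out.writeInt(identifier);           // echoed back in error responses
    out.writeInt(expiryEpochSeconds);   // when Apple may discard the message
    out.writeShort(deviceToken.length); // normally 32
    out.write(deviceToken);
    out.writeShort(payloadUtf8.length);
    out.write(payloadUtf8);
    return buf.toByteArray();
}

As far as I know, the error response Apple writes back is a 6-byte frame: a command byte of 8, a one-byte status code, and the 4-byte identifier of the failed notification.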
If you're sending information to Apple using the TCP/IP protocol, you have to be receiving acknowledgements. However, you stated:
"Apple doesn't return any acknowledgement whatsoever"
What do you mean by this? TCP/IP guarantees delivery, therefore the receiver MUST acknowledge receipt. It does not guarantee when the delivery will take place, however.
If you send a notification to Apple and break your connection before receiving the ACK, there is no way to tell whether you were successful or not, so you simply must send it again. If pushing the same information twice is a problem, or is not handled properly by the device, then there is a problem. The solution is to fix the device's handling of duplicate push notifications: there's nothing you can do on the pushing side.
#Comment Clarification/Question
OK. The first part of what you understand is the answer to the second part. Only the packets that have received ACKs have been sent and received properly. I'm sure we could think of some very complicated scheme for keeping track of each individual packet ourselves, but TCP is supposed to abstract this layer away and handle it for you. On your end you simply have to deal with the multitude of failures that could occur (in Java, if any of these occur, an exception is raised). If there is no exception, the data you just tried to send is guaranteed to be sent by the TCP/IP protocol.
Is there a situation where data is seemingly "sent" but not guaranteed to be received, with no exception raised? The answer should be no.
#Examples
Nice examples; this clarifies things quite a bit. I would have thought an error would be thrown. In the example posted, an error is thrown on the second write, but not the first. This is interesting behavior, and I wasn't able to find much information explaining why it behaves like this. It does, however, explain why we must develop our own application-level protocols to verify delivery.
It looks like you are correct that without a protocol for confirmation, there is no guarantee the Apple device will receive the notification. Apple also only queues the last message. Looking a little at the service, I was able to determine that it is more of a convenience for the customer; it cannot be used to guarantee delivery and must be combined with other methods. I read this from the following source:
http://blog.boxedice.com/2009/07/10/how-to-build-an-apple-push-notification-provider-server-tutorial/
It seems the answer is no as to whether you can tell for sure. You may be able to use a packet sniffer like Wireshark to tell if the packet was sent, but this still won't guarantee it was received and delivered to the device, due to the nature of the service.