I've got a client-server tiered architecture with the client making RPC-like requests to the server. I'm using Tomcat to host the servlets, and the Apache HttpClient to make requests to it.
My code goes something like this:
private static final HttpConnectionManager CONN_MGR = new MultiThreadedHttpConnectionManager();
final GetMethod get = new GetMethod();
final HttpClient httpClient = new HttpClient(CONN_MGR);
get.getParams().setCookiePolicy(CookiePolicy.IGNORE_COOKIES);
get.getParams().setParameter(HttpMethodParams.USER_AGENT, USER_AGENT);
get.setQueryString(encodedParams);
int responseCode;
try {
responseCode = httpClient.executeMethod(get);
} catch (final IOException e) {
...
}
if (responseCode != 200)
throw new Exception(...);
String responseHTML;
try {
responseHTML = get.getResponseBodyAsString(100*1024*1024);
} catch (final IOException e) {
...
}
return responseHTML;
It works great in a lightly-loaded environment, but when I'm making hundreds of requests per second I start to see this -
Caused by: java.net.BindException: Address already in use
at java.net.PlainSocketImpl.socketBind(Native Method)
at java.net.AbstractPlainSocketImpl.bind(AbstractPlainSocketImpl.java:336)
at java.net.Socket.bind(Socket.java:588)
at java.net.Socket.<init>(Socket.java:387)
at java.net.Socket.<init>(Socket.java:263)
at org.apache.commons.httpclient.protocol.DefaultProtocolSocketFactory.createSocket(DefaultProtocolSocketFactory.java:80)
at org.apache.commons.httpclient.protocol.DefaultProtocolSocketFactory.createSocket(DefaultProtocolSocketFactory.java:122)
at org.apache.commons.httpclient.HttpConnection.open(HttpConnection.java:707)
at org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(HttpMethodDirector.java:387)
at org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:171)
at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397)
at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:323)
Any thoughts on how to fix this? I'm guessing it's something to do with the client trying to reuse the ephemeral client ports, but why is this happening / how can I fix it?
Thanks!
A very good discussion of the problem you are running into can be found here. On the Tomcat side, by default it will use the SO_REUSEADDR option, which allows the server to reuse sockets that are in TIME_WAIT. Additionally, the Apache HttpClient will use keep-alives by default and attempt to reuse connections.
Your problem seems to be caused by not calling releaseConnection on the GetMethod once you're done with it. This is required for the connection to be reused. Otherwise, the connection will remain open until the garbage collector comes and closes it, or the server disconnects the keep-alive. In both cases, it won't be returned to the pool.
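A minimal sketch of the fix (in Commons HttpClient 3.x, releaseConnection() lives on the method object, not the client), wrapping the existing calls in try/finally:

final GetMethod get = new GetMethod();
get.setQueryString(encodedParams);
try {
    final int responseCode = httpClient.executeMethod(get);
    if (responseCode != 200) {
        throw new IOException("Unexpected response code: " + responseCode);
    }
    return get.getResponseBodyAsString(100 * 1024 * 1024);
} finally {
    get.releaseConnection(); // returns the connection to the MultiThreadedHttpConnectionManager
}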
With hundreds of connections a second, and without knowing how long your connections take to open, do their thing, close, and get recycled, I suspect this is just a problem you're going to have. One thing you can do is catch the BindException in your try block, do whatever you need to do in the bind-unsuccessful case, and wrap the whole call in a while loop that depends on a flag indicating whether the call succeeded. Off the top of my head:
boolean hasBound = false;
while (!hasBound) {
    try {
        responseCode = httpClient.executeMethod(get);
        hasBound = true; // only flag success once the call has returned
    } catch (final BindException e) {
        // do anything you want in the bind-unsuccessful case, then loop and retry
    } catch (final IOException e) {
        ...
    }
}
Update, with a question: what are the maximum total and per-host connection counts allowed by your MultiThreadedHttpConnectionManager? In your code, that'd be:
CONN_MGR.getParams().getDefaultMaxConnectionsPerHost();
CONN_MGR.getParams().getMaxTotalConnections();
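If those are still at the HttpClient 3.x defaults (2 connections per host, 20 total), the connection manager itself will throttle you long before hundreds of requests per second. They can be raised; a sketch with illustrative values:

CONN_MGR.getParams().setDefaultMaxConnectionsPerHost(50); // default is 2
CONN_MGR.getParams().setMaxTotalConnections(200);         // default is 20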
You've fired more requests than there are TCP/IP ports available to open. I don't do HttpClient, so I can't go into detail about it, but in theory there are three solutions to this particular problem:
Hardware based: add another NIC (network interface card).
Software based: close connections directly after use and/or increase the connection timeout.
Platform based: increase the number of TCP/IP ports that are allowed to be opened. This may be OS-specific and/or NIC-driver-specific. The absolute maximum is 65535, of which several may already be reserved/in use (e.g. port 80).
So it turns out the problem was that one of the other HttpClient instances accidentally wasn't using the MultiThreadedHttpConnectionManager I instantiated, so I effectively had no rate limiting at all. Fixing that fixed the exception.
Thanks for all the suggestions, though!
Even if you invoke HttpClientUtils.closeQuietly(client), when your code reads content from the HttpResponse entity (e.g. InputStream contentStream = response.getEntity().getContent()), you must close that InputStream as well; only then is the HttpClient connection closed properly.
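A sketch of that cleanup with HttpClient 4.3+ (variable names are illustrative, and client is assumed to be a CloseableHttpClient); try-with-resources closes both the content stream and the response:

try (CloseableHttpResponse response = client.execute(new HttpGet(url))) {
    try (InputStream contentStream = response.getEntity().getContent()) {
        // ... read the stream fully ...
    } // closing the stream is what releases the connection back to the pool
}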
I'm having a problem with one of my servers. On Friday morning I got the following IOException:
11/Sep/2015 01:51:39,524 [ERROR] [Thread-1] - ServerRunnable: IOException:
java.io.IOException: Too many open files
at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method) ~[?:1.7.0_75]
at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:241) ~[?:1.7.0_75]
at com.watersprint.deviceapi.server.ServerRunnable.acceptConnection(ServerRunnable.java:162) [rsrc:./:?]
at com.watersprint.deviceapi.server.ServerRunnable.run(ServerRunnable.java:121) [rsrc:./:?]
at java.lang.Thread.run(Thread.java:745) [?:1.7.0_75]
Line 162 of the ServerRunnable class is in the method below; it's the ssc.accept() call.
private void acceptConnection(Selector selector, SelectionKey key) {
try {
ServerSocketChannel ssc = (ServerSocketChannel) key.channel();
SocketChannel sc = ssc.accept();
socketConnectionCount++;
/*
* Test to force device error, for debugging purposes
*/
if (brokenSocket
&& (socketConnectionCount % brokenSocketNumber == 0)) {
sc.close();
} else {
sc.configureBlocking(false);
log.debug("*************************************************");
log.debug("Selector Thread: Client accepted from "
+ sc.getRemoteAddress());
SelectionKey newKey = sc.register(selector,
SelectionKey.OP_READ);
ClientStateMachine clientState = new ClientStateMachine();
clientState.setIpAddress(sc.getRemoteAddress().toString());
clientState.attachSelector(selector);
clientState.attachSocketChannel(sc);
newKey.attach(clientState);
}
} catch (ClosedChannelException e) {
log.error("ClosedChannelException: ", e);
ClientStateMachine clientState = (ClientStateMachine)key.attachment();
database.insertFailedCommunication(clientState.getDeviceId(),
clientState.getIpAddress(),
clientState.getReceivedString(), e.toString());
key.cancel();
} catch (IOException e) {
log.error("IOException: ", e);
}
}
How should I handle this? Reading up on the error, it appears to be caused by a Linux OS setting that limits the number of open files a process can have.
Judging from that, and from this question here, it appears that I am not closing sockets correctly (the server is currently serving around 50 clients). Is this a situation where I need a timer to monitor open sockets and time them out after an extended period?
I have some cases where a client can connect and then not send any data once the connection is established. I thought I had handled those cases properly.
It's my understanding that a non-blocking NIO server has very long timeouts, is it possible that if I've missed cases like this they might accumulate and result in this error?
This server has been running for three months without any issues.
After I go through my code and check for badly handled / missing cases, what's the best way to handle this particular error? Are there other things I should consider that might contribute to this?
Also (maybe this should be another question), I have log4j2 configured to send emails for log levels of error and higher, yet I didn't get an email for this error. Is there any reason why that might be? It usually works; the error was logged to the log file as expected, but I never got an email about it. I should have gotten plenty, as the error occurred every time a connection was established.
You fix your socket leaks. When you get EOS, or any IOException other than SocketTimeoutException, on a socket you must close it. In the case of SocketChannels, that means closing the channel. Merely cancelling the key, or ignoring the issue and hoping it will go away, isn't sufficient. The connection has already gone away.
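A minimal sketch of a read handler that does this (the handler name and buffer size are illustrative, not from the question's code):

private void readFromClient(SelectionKey key) {
    SocketChannel channel = (SocketChannel) key.channel();
    ByteBuffer buf = ByteBuffer.allocate(4096);
    try {
        if (channel.read(buf) == -1) { // EOS: the peer has closed the connection
            key.cancel();
            channel.close();           // closing the channel is what frees the file descriptor
            return;
        }
        // ... process buf ...
    } catch (IOException e) {
        key.cancel();
        try {
            channel.close();
        } catch (IOException ignored) {
        }
    }
}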
The fact that you find it necessary to count broken socket connections, and catch ClosedChannelException, already indicates major logic problems in your application. You shouldn't need this. And cancelling the key of a closed channel doesn't provide any kind of a solution.
It's my understanding that a non-blocking NIO server has very long timeouts
The only timeout a non-blocking NIO server has is the timeout you specify to select(). All the timeouts built-in to the TCP stack are unaffected by whether you are using NIO or non-blocking mode.
I am using java.net.DatagramSocket to send UDP packets to a statsd server from a Google App Engine servlet. This generally works; however, we periodically see the following exception:
IOException - Socket is closed: Unknown socket_descriptor..
When these IOExceptions occur, calling DatagramSocket.isClosed() returns false.
This issue happens frequently enough that it is concerning, and although I've put some workarounds in place (allocate a new socket and use a DeferredTask queue to retry), it would be good to understand the underlying reason for these errors.
The Google docs mention, "Sockets may be reclaimed after 2 minutes of inactivity; any socket operation keeps the socket alive for a further 2 minutes." It is unclear to me how this would play into UDP datagrams; however, one suspicion I have is that this is related to GAE instance lifecycle in some way.
My code (sanitized and extracted) looks like:
DatagramSocket _socket;
void init() {
_socket = new DatagramSocket();
}
void send() {
DatagramPacket packet = new DatagramPacket(<BYTES>, <LENGTH>, <HOST>, <PORT>);
_socket.send(packet);
}
Appreciate any feedback on this!
The approach taken to work around this issue was simply to manage a single static DatagramSocket instance with a couple of helper methods: releaseSocket() discards a socket that has thrown an IOException, and getSocket() allocates a new one on the next access. Not shown in this code is the retry logic for the failed socket.send(). Under load testing, this seems to work reliably.
try {
DatagramPacket packet = new DatagramPacket(<BYTES>, <LENGTH>, <HOST>, <PORT>);
getSocket().send(packet);
} catch (IOException ioe) {
releaseSocket();
}
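A minimal sketch of that helper pair (the synchronization and lazy allocation are assumptions about the unshown code, not the original implementation):

private static DatagramSocket socket;

// Lazily allocate (or re-allocate) the shared socket.
static synchronized DatagramSocket getSocket() throws SocketException {
    if (socket == null) {
        socket = new DatagramSocket();
    }
    return socket;
}

// Discard a socket that threw an IOException, so the next getSocket() call
// allocates a fresh one.
static synchronized void releaseSocket() {
    if (socket != null) {
        socket.close();
        socket = null;
    }
}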
What's the connection timeout of a socket created with a connecting constructor?
In Java SE 6, the following constructors for Socket will connect the socket right away instead of you having to call connect on it after construction:
Socket(InetAddress address, int port)
Socket(InetAddress host, int port, boolean stream)
Socket(InetAddress address, int port, InetAddress localAddr, int localPort)
Socket(String host, int port)
Socket(String host, int port, boolean stream)
Socket(String host, int port, InetAddress localAddr, int localPort)
While it's nice and convenient and all that the Java SE folks created 500 ways of constructing a socket so you can just browse the list of 500 to find the one that sort of does what you want (instead of calling new Socket() followed by Socket#connect()), none of the docs of these constructors says what the connection timeout is or whether/how they call connect(SocketAddress endpoint, int timeout).
Perhaps the stuff in the constructor docs talking about createSocketImpl implies something about the timeout, or some docs somewhere else say it?
Anyone know what the actual connection timeout for any of these constructors is?
Background: Okay, assuming the spec is really ambiguous (I thought Java was portable?), I'm trying to figure out why a customer's code freezes at seemingly random times. I have some code that calls an open source library, which calls one of these constructors. I want to know whether calling one of these constructors would have made the timeout infinite or very long. I don't know what version of the JDK the customer is using, so it would be nice if the spec stated the timeout somewhere. I guess I can probably get the JDK version from my customer, but it will probably be the closed-source JDK. In that case, could I reverse-engineer the code in their version of the SE library to find out? Is it hard? Will I go to jail?
The Java spec is bogus here. It doesn't say what the timeout is for any of those constructors, so an implementation could set the timeout to 0.000000000001 nanoseconds and still be correct. Furthermore, a non-finite timeout isn't even respected by VM implementations (as seen here), so it looks like the spec doesn't even matter, because no one followed it.
Conclusion: you have to read the closed-source binary of the customer's JVM (probably illegal, but you have to do what you have to do), and also the OS socket documentation.
Even though the Java docs say the timeout is infinite, what that actually means is that the JVM will not impose any timeout on the connect operation; the OS is still free to impose timeout settings on any socket operation.
Thus the actual timeout will be dependent on your OS's TCP/IP layer settings.
A good programming practice is to set timeouts for all socket operations, preferably configurable via a configuration file. The advantage of making it configurable is that, depending on the network load of the deployment environment, the timeout can be tweaked without rebuilding/retesting/re-releasing the whole software.
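A sketch of that practice (the method name and parameters are illustrative): create the socket unconnected, then connect with an explicit timeout rather than relying on a connecting constructor.

void openWithTimeouts(String host, int port, int connectTimeoutMs, int readTimeoutMs) throws IOException {
    Socket socket = new Socket();           // unconnected: no implicit connect in the constructor
    try {
        socket.connect(new InetSocketAddress(host, port), connectTimeoutMs); // fails after connectTimeoutMs
        socket.setSoTimeout(readTimeoutMs); // bound each subsequent read() as well
        // ... use the socket ...
    } finally {
        socket.close();
    }
}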
Looking at the code of Socket in OpenJDK 6-b14, you can see that these constructors call connect(socketAddress, 0), which means an infinite timeout value.
According to the sources (I'm looking at 1.5_13 here, but there shouldn't be a difference), the different Socket constructors all call Socket(SocketAddress, SocketAddress, boolean) which is defined as:
private Socket(SocketAddress address, SocketAddress localAddr,
boolean stream) throws IOException {
setImpl();
// backward compatibility
if (address == null)
throw new NullPointerException();
try {
createImpl(stream);
if (localAddr == null)
localAddr = new InetSocketAddress(0);
bind(localAddr);
if (address != null)
connect(address);
} catch (IOException e) {
close();
throw e;
}
}
connect(SocketAddress) is defined as
public void connect(SocketAddress endpoint) throws IOException {
connect(endpoint, 0);
}
Hence, infinite timeout (as #Keppil already stated).
The Socket class has existed since Java 1.0, but at that time it was only possible to create sockets that connected immediately, and it was not possible to specify a connect timeout. Since Java 1.4, it has been possible to create unconnected sockets and then specify a timeout using the connect method. I assume that someone simply forgot to clarify the documentation of the "old" constructors, specifying that these still operate without an explicit timeout.
The documentation of the connect methods with a timeout parameter reads that "a timeout of zero is interpreted as an infinite timeout". This is actually incorrect as well, since it only means that no timeout is imposed by the Java VM. Even with a timeout of 0, the connect operation may still time out in the operating system's TCP/IP stack.
It's platform-dependent but it's around a minute. The Javadoc for connect() is incorrect in stating that it is infinite. Note also that the connect() timeout parameter can only be used to decrease the default, not increase it.
I have the following 3 lines of code:
ServerSocket listeningSocket = new ServerSocket(earPort);
Socket serverSideSocket = listeningSocket.accept();
BufferedReader in = new BufferedReader(new InputStreamReader(serverSideSocket.getInputStream()));
The compiler complains about all three lines, and its complaint is the same for each: unreported exception java.io.IOException. In more detail, these exceptions are thrown by new ServerSocket, accept(), and getInputStream().
I know I need to use try ... catch .... But for that I need to know what these exceptions mean in each particular case (how I should interpret them), and when they happen. I mean not in general, but in these three particular cases.
You don't know IN PARTICULAR, because IOException is a "generic" exception that can technically have many causes. It means an unexpected input/output issue happened, but obviously it has different causes on a local hard disk than on the Internet.
In general, all three lines revolve around sockets, so the causes are related to network issues. Possibilities are:
No network at all, not even localhost (would be a serious technical issue).
Port already in use, when a port number is given (new ServerSocket(earPort)).
Network issues - for example, someone tripped over the cable. Bad connection quality, a DDoS attack, etc. can also be causes.
Port exhaustion - no client side port available for a new connection.
The failures basically cluster around these lines, and the same can happen whenever you actually do something with the streams.
In this case you have two possible main causes:
First line: the socket is already in use (the program was started twice, or another program uses the same port). This normally isn't fixable unless the user does something.
Later lines: generic runtime errors. These can happen during normal operation.
The simplest way is to declare your calling method to throw IOException, but then you need to clean up allocated resources in finally clauses before you leave the method:
public void doSession() throws IOException
{
final ServerSocket listeningSocket = new ServerSocket(earPort);
try
{
final Socket serverSideSocket = listeningSocket.accept();
try
{
final BufferedReader in =
new BufferedReader(
new InputStreamReader(
serverSideSocket.getInputStream()
)
);
}
finally
{
serverSideSocket.close();
}
}
finally
{
listeningSocket.close();
}
}
In general it doesn't matter exactly what caused the initial IOException because there's little your app can do to correct the situation.
However, as a general answer to your question of "what to do" You have a few options.
Try Again - May work if the problem was intermittent. Remember to supply a break condition in case it doesn't (see the sketch after this list).
Try Something Else - Load the resource from a different location or via a different method.
Give Up - Throw/rethrow the exception and/or abort the action, or perhaps the entire program. You may want to provide a user-friendly message at this point... ;-) If your program requires the input to function, then not having the input leaves you little choice but not to function.
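A sketch of the "try again" option with a break condition (loadResource() is a hypothetical operation, and the attempt count and backoff are arbitrary choices):

String loadWithRetry() throws IOException, InterruptedException {
    IOException last = null;
    for (int attempt = 1; attempt <= 3; attempt++) { // break condition: give up after 3 attempts
        try {
            return loadResource();                   // hypothetical method that may throw IOException
        } catch (IOException e) {
            last = e;                                // remember the failure and retry
            Thread.sleep(1000L * attempt);           // simple linear backoff between attempts
        }
    }
    throw last;                                      // give up: rethrow the last failure
}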
What's the most appropriate way to detect if a socket has been dropped or not? Or whether a packet did actually get sent?
I have a library for sending Apple Push Notifications to iPhones through the Apple gateways (available on GitHub). Clients need to open a socket and send a binary representation of each message; but unfortunately Apple doesn't return any acknowledgement whatsoever. The connection can be reused to send multiple messages as well. I'm using simple Java Socket connections. The relevant code is:
Socket socket = socket(); // returns a reused open socket, or a new one
socket.getOutputStream().write(m.marshall());
socket.getOutputStream().flush();
logger.debug("Message \"{}\" sent", m);
In some cases, if a connection is dropped while a message is being sent, or right before, Socket.getOutputStream().write() finishes successfully anyway. I expect this is because the TCP window isn't exhausted yet.
Is there a way that I can tell for sure whether a packet actually got in the network or not? I experimented with the following two solutions:
Insert an additional socket.getInputStream().read() operation with a 250ms timeout. This forces a read operation that fails when the connection was dropped, but hangs otherwise for 250ms.
set the TCP sending buffer size (e.g. Socket.setSendBufferSize()) to the message binary size.
Both of the methods work, but they significantly degrade the quality of the service; throughput goes from a 100 messages/second to about 10 messages/second at most.
Any suggestions?
UPDATE:
Challenged by multiple answers questioning whether what I described is possible, I constructed "unit" tests of the behavior. Check out the unit cases at Gist 273786.
Both unit tests have two threads, a server and a client. The server closes the connection while the client is sending data, and no IOException is thrown. Here is the main method:
public static void main(String[] args) throws Throwable {
final int PORT = 8005;
final int FIRST_BUF_SIZE = 5;
final Throwable[] errors = new Throwable[1];
final Semaphore serverClosing = new Semaphore(0);
final Semaphore messageFlushed = new Semaphore(0);
class ServerThread extends Thread {
public void run() {
try {
ServerSocket ssocket = new ServerSocket(PORT);
Socket socket = ssocket.accept();
InputStream s = socket.getInputStream();
s.read(new byte[FIRST_BUF_SIZE]);
messageFlushed.acquire();
socket.close();
ssocket.close();
System.out.println("Closed socket");
serverClosing.release();
} catch (Throwable e) {
errors[0] = e;
}
}
}
class ClientThread extends Thread {
public void run() {
try {
Socket socket = new Socket("localhost", PORT);
OutputStream st = socket.getOutputStream();
st.write(new byte[FIRST_BUF_SIZE]);
st.flush();
messageFlushed.release();
serverClosing.acquire(1);
System.out.println("writing new packets");
// sending more packets while server already
// closed connection
st.write(32);
st.flush();
st.close();
System.out.println("Sent");
} catch (Throwable e) {
errors[0] = e;
}
}
}
Thread thread1 = new ServerThread();
Thread thread2 = new ClientThread();
thread1.start();
thread2.start();
thread1.join();
thread2.join();
if (errors[0] != null)
throw errors[0];
System.out.println("Run without any errors");
}
[Incidentally, I also have a concurrency testing library that makes the setup a bit better and clearer. Check out the sample at the gist as well.]
When run I get the following output:
Closed socket
writing new packets
Finished writing
Run without any errors
This may not be of much help to you, but technically both of your proposed solutions are incorrect. OutputStream.flush() and whatever other API calls you can think of are not going to do what you need.
The only portable and reliable way to determine if a packet has been received by the peer is to wait for a confirmation from the peer. This confirmation can either be an actual response or a graceful socket shutdown. End of story - there really is no other way, and this is not Java specific - it is fundamental network programming.
If this is not a persistent connection - that is, if you just send something and then close the connection - the way you do it is you catch all IOExceptions (any of them indicate an error) and you perform a graceful socket shutdown:
1. socket.shutdownOutput();
2. wait for inputStream.read() to return -1, indicating the peer has also shutdown its socket
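A short sketch of that shutdown sequence (assuming socket is the connected Socket and the buffer size is arbitrary):

socket.shutdownOutput();                 // step 1: half-close, telling the peer we're done sending
InputStream in = socket.getInputStream();
byte[] buf = new byte[512];
while (in.read(buf) != -1) {
    // step 2: drain until read() returns -1, i.e. the peer has shut down as well
}
socket.close();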
After much trouble with dropped connections, I moved my code to use the enhanced notification format, which adds an identifier and an expiry to each packet (the original answer illustrated the packet layout with an image, omitted here).
This way Apple will not just silently drop the connection when an error happens, but will write an error-response code back to the socket.
If you're sending information using the TCP/IP protocol to apple you have to be receiving acknowledgements. However you stated:
Apple doesn't return any acknowledgement whatsoever
What do you mean by this? TCP/IP guarantees delivery, therefore the receiver MUST acknowledge receipt. It does not guarantee when the delivery will take place, however.
If you send a notification to Apple and you break your connection before receiving the ACK, there is no way to tell whether you were successful or not, so you simply must send it again. If pushing the same information twice is a problem, or isn't handled properly by the device, then the solution is to fix the device's handling of the duplicate push notification; there's nothing you can do on the pushing side.
#Comment Clarification/Question
Ok. The first part of what you understand is your answer to the second part. Only the packets that have received ACKs have been sent and received properly. I'm sure we could think of some very complicated scheme for keeping track of each individual packet ourselves, but TCP is supposed to abstract this layer away and handle it for you. On your end you simply have to deal with the multitude of failures that could occur (in Java, if any of these occur, an exception is raised). If there is no exception, delivery of the data you just tried to send is guaranteed by the TCP/IP protocol.
Is there a situation where data is seemingly "sent" but not guaranteed to be received where no exception is raised? The answer should be no.
#Examples
Nice examples; this clarifies things quite a bit. I would have thought an error would be thrown. In the example posted, an error is thrown on the second write, but not the first. This is interesting behavior... and I wasn't able to find much information explaining why it behaves like this. It does, however, explain why we must develop our own application-level protocols to verify delivery.
Looks like you are correct: without a protocol for confirmation, there is no guarantee that the Apple device will receive the notification. Apple also only queues the last message. Looking a little at the service, I was able to determine that it is more of a convenience for the customer; it cannot be used to guarantee delivery and must be combined with other methods. I read this from the following source.
http://blog.boxedice.com/2009/07/10/how-to-build-an-apple-push-notification-provider-server-tutorial/
It seems the answer is no: you cannot tell for sure. You may be able to use a packet sniffer like Wireshark to tell whether a packet was sent, but this still won't guarantee it was received and delivered to the device, due to the nature of the service.