Apache HttpComponents BasicHttpClientConnectionManager - java

I recently switched from java.net to org.apache.http.client and have set up a CloseableHttpClient with the HttpClientBuilder. As connection manager I am using the BasicHttpClientConnectionManager.
Now I have the problem that I very often get a timeout exception when I send an HTTP request. It seems that the connection manager keeps connections open in order to reuse them, but if the system is idle for a few minutes the connection times out, and the next request I make fails with a timeout. Repeating the same request one more time then usually works without any problem.
Is there a way to configure the BasicHttpClientConnectionManager in order to not reuse its connections and create a new connection each time?

There are several ways of dealing with the problem:
Evict idle connections once no longer needed. The code below effectively disables connection persistence by closing out persistent connections after each HTTP exchange.
BasicHttpClientConnectionManager cm = new BasicHttpClientConnectionManager();
CloseableHttpClient httpclient = HttpClients.custom().setConnectionManager(cm).build();
...
try (CloseableHttpResponse response = httpclient.execute(new HttpGet("/"))) {
System.out.println(response.getStatusLine());
EntityUtils.consume(response.getEntity());
}
cm.closeIdleConnections(0, TimeUnit.MILLISECONDS);
Limit connection keep-alive time to something relatively small-ish
BasicHttpClientConnectionManager cm = new BasicHttpClientConnectionManager();
CloseableHttpClient httpclient = HttpClients.custom()
.setConnectionManager(cm)
.setKeepAliveStrategy((response, context) -> 1000)
.build();
try (CloseableHttpResponse response = httpclient.execute(new HttpGet("/"))) {
System.out.println(response.getStatusLine());
EntityUtils.consume(response.getEntity());
}
(Recommended) Use pooling connection manager and set connection total time to live to a finite value. There are no benefits to using the basic connection manager compared to the pooling one unless your code is expected to run in an EJB container.
CloseableHttpClient httpclient = HttpClients.custom()
.setConnectionTimeToLive(5, TimeUnit.SECONDS)
.build();
try (CloseableHttpResponse response = httpclient.execute(new HttpGet("/"))) {
System.out.println(response.getStatusLine());
EntityUtils.consume(response.getEntity());
}
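To make the interplay between the keep-alive duration (option 2) and the total time to live (option 3) more concrete, here is a minimal JDK-only sketch. It is not HttpClient code: the class and method names are hypothetical, and it assumes the two limits combine as "whichever deadline comes first", with a non-positive value meaning "no limit".

```java
// Hypothetical illustration, not part of Apache HttpClient.
// Models the point in time after which a persistent connection should no longer be reused.
final class ConnectionExpirySketch {

    /**
     * @param createdAt  time the connection was opened, in ms
     * @param timeToLive total time to live in ms; non-positive means unlimited (assumption)
     * @param releasedAt time the connection was last handed back to the manager, in ms
     * @param keepAlive  keep-alive duration in ms; non-positive means unlimited (assumption)
     * @return the absolute time in ms at which the connection expires
     */
    static long expiryMillis(long createdAt, long timeToLive, long releasedAt, long keepAlive) {
        long ttlDeadline = timeToLive > 0 ? createdAt + timeToLive : Long.MAX_VALUE;
        long keepAliveDeadline = keepAlive > 0 ? releasedAt + keepAlive : Long.MAX_VALUE;
        // The earlier of the two deadlines wins.
        return Math.min(ttlDeadline, keepAliveDeadline);
    }

    /** A connection is reusable only strictly before its expiry time. */
    static boolean isReusable(long now, long expiry) {
        return now < expiry;
    }
}
```

With a 5 second time to live and a 1 second keep-alive, a connection created at t=0 and released at t=1000 would expire at t=2000 under these assumptions; released at t=4500 it would expire at t=5000 because the time to live is reached first.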

Related

Apache Async HTTP send requests with new Proxy on each Request

I have to check thousands of proxy servers continuously.
To speed it up, I am thinking to create a batch of size N(say 50) and send requests to them concurrently. Each proxy server has a unique IP/Port and username/password authentication.
Since I am checking proxies, I will configure the request to use the given Proxy and send a request to the target site and measure the response.
Here is an example to use proxy with auth from the Apache client docs:
public static void main(String[] args)throws Exception {
CredentialsProvider credsProvider = new BasicCredentialsProvider();
credsProvider.setCredentials(
new AuthScope("localhost", 8889),
new UsernamePasswordCredentials("squid", "nopassword"));
CloseableHttpAsyncClient httpclient = HttpAsyncClients.custom()
.setDefaultCredentialsProvider(credsProvider)
.build();
try {
httpclient.start();
HttpHost proxy = new HttpHost("localhost", 8889);
RequestConfig config = RequestConfig.custom()
.setProxy(proxy)
.build();
HttpGet httpget = new HttpGet("https://httpbin.org/");
httpget.setConfig(config);
Future<HttpResponse> future = httpclient.execute(httpget, null);
HttpResponse response = future.get();
System.out.println("Response: " + response.getStatusLine());
System.out.println("Shutting down");
} finally {
httpclient.close();
}
}
As you can see, if you are using an authenticated proxy, you need to provide the credentials in the Client itself.
This means that if I am checking 50 proxy servers concurrently, I have to create a new client for each of them. The requests would then not really be concurrent, and I might as well use a multi-threaded solution.
The issue is that with multithreading I will put excessive load on the server, as most of the threads will block on I/O. Concurrent non-blocking I/O is much better suited to this type of challenge.
How can I check multiple authenticated proxy servers concurrently if I have to create a client for each of them?

How to get persistent HttpConnection with Apache HttpClient?

In my test application I execute consecutive HttpGet requests to the same host with Apache HttpClient but upon each next request it turns out that the previous HttpConnection is closed and the new HttpConnection is created.
I use the same instance of HttpClient and don't close responses. From each entity I get InputStream, read from it with Scanner and then close the Scanner. I have tested KeepAliveStrategy, it returns true. The time between requests doesn't exceed keepAlive or connectionTimeToLive durations.
Can anyone tell me what could be the reason for such behavior?
Updated
I have found the solution. In order to keep the HttpConnection alive it is necessary to set an HttpClientConnectionManager when building the HttpClient. I have used BasicHttpClientConnectionManager.
ConnectionKeepAliveStrategy keepAliveStrat = new DefaultConnectionKeepAliveStrategy() {
@Override
public long getKeepAliveDuration(HttpResponse response, HttpContext context)
{
long keepAlive = super.getKeepAliveDuration(response, context);
if (keepAlive == -1)
keepAlive = 120000;
return keepAlive;
}
};
HttpClientConnectionManager connectionManager = new BasicHttpClientConnectionManager();
try (CloseableHttpClient httpClient = HttpClients.custom()
.setConnectionManager(connectionManager) // without this setting connection is not kept alive
.setDefaultCookieStore(store)
.setKeepAliveStrategy(keepAliveStrat)
.setConnectionTimeToLive(120, TimeUnit.SECONDS)
.setUserAgent(USER_AGENT)
.build())
{
HttpClientContext context = new HttpClientContext();
RequestConfig config = RequestConfig.custom()
.setCookieSpec(CookieSpecs.DEFAULT)
.setSocketTimeout(10000)
.setConnectTimeout(10000)
.build();
context.setRequestConfig(config);
HttpGet httpGet = new HttpGet(uri);
CloseableHttpResponse response = httpClient.execute(httpGet, context);
HttpConnection conn = context.getConnection();
HttpEntity entity = response.getEntity();
try (Scanner in = new Scanner(entity.getContent(), ENC))
{
// do something
}
System.out.println("open=" + conn.isOpen()); // now open=true
HttpGet httpGet2 = new HttpGet(uri2); // on the same host with other path
// and so on
}
Updated 2
In general, checking connections with conn.isOpen() is not a proper way to check the connection's state because: "Internally HTTP connection managers work with instances of ManagedHttpClientConnection acting as a proxy for a real connection that manages connection state and controls execution of I/O operations. If a managed connection is released or get explicitly closed by its consumer the underlying connection gets detached from its proxy and is returned back to the manager. Even though the service consumer still holds a reference to the proxy instance, it is no longer able to execute any I/O operations or change the state of the real connection either intentionally or unintentionally." (HttpClient Tutorial)
As @oleg pointed out, the proper way to trace connections is to use the logger.
First of all, make sure the remote server you're working with supports keep-alive connections. Simply check whether the server returns the header Connection: Keep-Alive or Connection: close in each response. In the close case there is nothing you can do about it. You can use this online tool to perform the check.
Next, you need to implement the ConnectionKeepAliveStrategy as described in paragraph 2.6 of this manual. Note that the existing DefaultConnectionKeepAliveStrategy has been available since HttpClient version 4.0, so your HttpClient can be constructed as follows:
HttpClient client = HttpClients.custom()
.setKeepAliveStrategy(DefaultConnectionKeepAliveStrategy.INSTANCE)
.build();
That will ensure your HttpClient instance reuses the same connection via the keep-alive mechanism, provided the server supports it.
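As a rough illustration of what a keep-alive strategy computes, the JDK-only sketch below reads the timeout element of a Keep-Alive response header (for example "timeout=5, max=100") and converts it to milliseconds. The class and method names are hypothetical and this is not the library's implementation; the fallback parameter plays the same role as the 120000 ms default in the question's custom strategy.

```java
// Hypothetical illustration, not part of Apache HttpClient.
final class KeepAliveHeaderSketch {

    /**
     * @param keepAliveHeader value of the Keep-Alive header, or null if the server sent none
     * @param fallbackMillis  duration to use when the header carries no usable timeout
     * @return keep-alive duration in milliseconds
     */
    static long keepAliveMillis(String keepAliveHeader, long fallbackMillis) {
        if (keepAliveHeader != null) {
            for (String element : keepAliveHeader.split(",")) {
                String[] pair = element.trim().split("=", 2);
                if (pair.length == 2 && pair[0].trim().equalsIgnoreCase("timeout")) {
                    try {
                        // The header states the timeout in seconds.
                        return Long.parseLong(pair[1].trim()) * 1000L;
                    } catch (NumberFormatException ignored) {
                        // Malformed value: keep looking, then use the fallback.
                    }
                }
            }
        }
        return fallbackMillis;
    }
}
```

A call such as keepAliveMillis("timeout=5, max=100", 120000) yields 5000, while a missing header yields the fallback.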
Your application must close response objects in order to ensure proper de-allocation of the underlying connections. Upon response closure HttpClient keeps valid connections alive and returns them to the connection manager (connection pool).
I suspect your code simply leaks connections, so every request ends up with a newly created connection while all previous connections keep piling up in memory.
From the example at HttpClient website:
// In order to ensure correct deallocation of system resources
// the user MUST call CloseableHttpResponse#close() from a finally clause.
// Please note that if response content is not fully consumed the underlying
// connection cannot be safely re-used and will be shut down and discarded
// by the connection manager.
So, as @oleg said, you need to close the HttpResponse before checking the connection status.

Proper usage of Apache HttpClient and when to close it.

I am using HttpClient within a servlet to make calls to a resource, which I return as the servlet's response after some manipulation.
My HttpClient uses PoolingHttpClientConnectionManager.
I create the client like so:
private CloseableHttpClient getConfiguredHttpClient(){
return HttpClientBuilder
.create()
.setDefaultRequestConfig(config)
.setConnectionReuseStrategy(NoConnectionReuseStrategy.INSTANCE)
.setConnectionManagerShared(true)
.setConnectionManager(connManager)
.build();
}
I use this client in a try-with-resources statement within the servlet's service method, so it is closed automatically. To stop the connection manager from being closed, I set setConnectionManagerShared to true.
I have seen other code samples that do not close the HttpClient. Should I not be closing this resource?
Thanks
For httpcomponents version 4.5.x:
I found that you really need to close the resource as shown in the documentation: https://hc.apache.org/httpcomponents-client-4.5.x/quickstart.html
CloseableHttpClient httpclient = HttpClients.createDefault();
HttpGet httpGet = new HttpGet("http://targethost/homepage");
CloseableHttpResponse response1 = httpclient.execute(httpGet);
try {
System.out.println(response1.getStatusLine());
HttpEntity entity1 = response1.getEntity();
EntityUtils.consume(entity1);
} finally {
response1.close();
}
For other versions of httpcomponents, see other answers.
For older versions of httpcomponents (http://hc.apache.org/httpcomponents-client-4.2.x/quickstart.html):
You do not need to explicitly close the HttpClient, however, (you may be doing this already but worth noting) you should ensure that connections are released after method execution.
Edit: The ClientConnectionManager within the HttpClient is going to be responsible for maintaining the state of connections.
GetMethod httpget = new GetMethod("http://www.url.com/");
try {
httpclient.executeMethod(httpget);
Reader reader = new InputStreamReader(httpget.getResponseBodyAsStream(), httpget.getResponseCharSet());
// consume the response entity and do something awesome
} finally {
httpget.releaseConnection();
}

HttpClientError: The target server failed to respond

I am trying to call a server with HttpClient, using PoolingClientConnectionManager and setting the maximum number of connections for individual hosts.
//Code that initializes my connection manager and HTTP client
HttpParams httpParam = httpclient.getParams();
HttpConnectionParams.setSoTimeout(httpParam, SOCKET_TIMEOUT);
HttpConnectionParams.setConnectionTimeout(httpParam, CONN_TIMEOUT);
httpclient.setParams(httpParam);
//Run a thread which closes Expired connections
new ConnectionManager(connManager).start();
//Code that executes my request
HttpPost httpPost = new HttpPost(url);
HttpEntity httpEntity = new StringEntity(request, "UTF-8");
httpPost.setEntity(httpEntity);
Header acceptEncoding = new BasicHeader("Accept-Encoding", "gzip,deflate");
httpPost.setHeader(acceptEncoding);
if(contenttype != null && !contenttype.equals("")){
Header contentType = new BasicHeader("Content-Type", contenttype);
httpPost.setHeader(contentType);
}
InputStream inputStream = null;
LOG.info(dataSource + URL + url + REQUEST + request);
HttpResponse response = httpclient.execute(httpPost);
That is, we are using connection pooling for HTTP persistence.
We are getting this error sporadically:
The target server failed to respond
org.apache.http.NoHttpResponseException: The target server failed to respond
at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:95)
at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:62)
at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:254)
at org.apache.http.impl.AbstractHttpClientConnection.receiveResponseHeader(AbstractHttpClientConnection.java:289)
at org.apache.http.impl.conn.DefaultClientConnection.receiveResponseHeader(DefaultClientConnection.java:252)
at org.apache.http.impl.conn.ManagedClientConnectionImpl.receiveResponseHeader(ManagedClientConnectionImpl.java:191)
at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:300)
at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:127)
at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:517)
at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906)
Does anyone know how to resolve this?
We are shutting down idle connections as well.
Can someone please help?
http://hc.apache.org/httpcomponents-client-ga/tutorial/html/connmgmt.html
Probably, it is a bug in the HttpClient.
If you are using the HttpClient 4.4, please try to upgrade to 4.4.1.
If you want more information, please look at this link.
If you can't upgrade, the following links might be helpful.
http://www.nuxeo.com/blog/using-httpclient-properly-avoid-closewait-tcp-connections/
Good luck!
I recently faced a similar issue while using HttpClient 5.
After enabling the HttpClient logs, I found that the issue was due to stale connections.
Adding the lines below solved the issue. The setting makes the connection manager validate connections that have been kept inactive in the pool, and may have become stale, before they are reused.
PoolingHttpClientConnectionManager connectionManager = new PoolingHttpClientConnectionManager();
connectionManager.setValidateAfterInactivity(timeinmilliseconds);
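The decision behind such a setting can be sketched in a few lines of JDK-only code. This is an illustration of the idea, not the library's implementation: the names are hypothetical, and it assumes that a non-positive period disables validation.

```java
// Hypothetical illustration, not part of Apache HttpClient.
final class StaleCheckSketch {

    /**
     * Decides whether a pooled connection should be checked for staleness before reuse.
     *
     * @param lastUsedAt              time the connection was last returned to the pool, in ms
     * @param now                     current time, in ms
     * @param validateAfterInactivity inactivity period in ms after which validation is required;
     *                                non-positive disables validation (assumption)
     */
    static boolean needsValidation(long lastUsedAt, long now, long validateAfterInactivity) {
        if (validateAfterInactivity <= 0) {
            return false;
        }
        return now - lastUsedAt >= validateAfterInactivity;
    }
}
```

A small period catches more stale connections at the cost of an extra check on more requests; a large period skips the check for connections that were used recently.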

Disable or delay timeout in Apache Httpclient request

I have a REST webservice with some methods.
I'm sending requests to the service with Apache HttpClient 4.
When I call one of its larger, slower methods, the client throws a NoHttpResponseException.
After googling, I discovered that the server is dropping the connection to my client app.
So, I tried to disable the timeout this way:
DefaultHttpClient httpclient = null;
HttpParams params = new BasicHttpParams();
HttpConnectionParams.setConnectionTimeout(params, 0);
HttpConnectionParams.setSoTimeout(params, 0);
HttpConnectionParams.setStaleCheckingEnabled(params, true);
httpclient = new DefaultHttpClient(params);
httpclient.execute(httpRequest, httpContext);
But it failed. The request dies in 15 seconds (possible default timeout?)
Does anyone know the best way to do this?
I would suggest that you return data to the client before the timeout can occur. This may just be a few bytes that say "working" to the client. By trickling the data out, you should be able to keep the client alive.
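A minimal server-side sketch of this trickling idea, using only the JDK, might look like the following. All names are hypothetical, and it assumes the client tolerates leading whitespace before the real payload (as a JSON parser typically would): while the slow work runs on another thread, a single space is written and flushed at each interval.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical illustration of trickling bytes while a slow operation is in progress.
final class TrickleResponseSketch {

    /**
     * Runs slowWork on a worker thread and writes one space to out every
     * intervalMillis until the work finishes, then writes the real body.
     */
    static void writeWithHeartbeat(OutputStream out, Callable<byte[]> slowWork, long intervalMillis)
            throws IOException, InterruptedException, ExecutionException {
        ExecutorService worker = Executors.newSingleThreadExecutor();
        try {
            Future<byte[]> result = worker.submit(slowWork);
            while (true) {
                try {
                    // Wait at most one interval for the result.
                    byte[] body = result.get(intervalMillis, TimeUnit.MILLISECONDS);
                    out.write(body);
                    out.flush();
                    return;
                } catch (TimeoutException stillWorking) {
                    // Not done yet: send a heartbeat byte so the connection is not idle.
                    out.write(' ');
                    out.flush();
                }
            }
        } finally {
            worker.shutdownNow();
        }
    }
}
```

In a servlet, out would be the response output stream and slowWork the long-running operation. Note that once the first byte has been sent the status line is already committed, so errors in slowWork would have to be reported inside the body.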
