How to handle Cookies with Apache HttpClient 4.3 - java

I need to implement a series of HTTP requests in Java and decided to use Apache's HttpClient in version 4.3 (the most current one).
The problem is that all these requests use a cookie for session management, and I cannot find a way to access that cookie and pass it from request to request. My curl commands look something like this:
# Login
curl -c cookies -d "UserName=username&Password=password" "https://example.com/Login"
# Upload a file
curl -b cookies -F fileUpload=@IMG_0013.JPG "https://example.com/File"
# Get results of server processing file
curl -b cookies "https://example.com/File/1234/Content"
They work perfectly. However, with HttpClient it does not seem to work. What I tried was:
URI serverAddress = new URI("https://example.com/");
URI loginUri = UriBuilder.fromUri(serverAddress).segment("Login").queryParam("UserName", "username")
        .queryParam("Password", "password").build();
RequestConfig globalConfig = RequestConfig.custom().setCookieSpec(CookieSpecs.BEST_MATCH).build();
CookieStore cookieStore = new BasicCookieStore();
HttpClientContext context = HttpClientContext.create();
context.setCookieStore(cookieStore);
CloseableHttpClient httpClient = HttpClients.custom()
        .setDefaultRequestConfig(globalConfig)
        .setDefaultCookieStore(cookieStore)
        .build();
HttpGet httpGet = new HttpGet(loginUri);
CloseableHttpResponse loginResponse = httpClient.execute(httpGet,context);
System.out.println(context.getCookieStore().getCookies());
The output of the last line is always an empty list. I think it should contain my cookie, am I right?
Can someone give me a small example of how to handle the cookie using Apache HttpClient 4.3?
Thanks

Your code looks OK to me (other than not releasing resources, but I presume exception handling was omitted for brevity). The cookie store may be empty because the target server violates the active cookie policy (BEST_MATCH in your case), in which case the cookies it sends are rejected as invalid. You can find out whether that is the case (along with other useful contextual details) by turning on context / wire logging as described in the HttpClient logging documentation.
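To make the rejection scenario concrete, here is a minimal sketch that reproduces the "empty cookie store" symptom without any network access. It deliberately uses the JDK's java.net.CookieManager as a stand-in rather than HttpClient's BEST_MATCH spec, so the exact validation rules differ. The class name CookieRejectionDemo, the cookie name, and the domains are made up for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.CookieManager;
import java.net.CookiePolicy;
import java.net.URI;
import java.util.Collections;
import java.util.List;
import java.util.Map;

public class CookieRejectionDemo {

    /**
     * Feeds a single Set-Cookie header, as if received from requestUrl, to a
     * fresh CookieManager and returns how many cookies ended up in its store.
     * NOTE: this is the JDK cookie policy, not HttpClient's BEST_MATCH.
     */
    static int storedCookieCount(String requestUrl, String setCookieHeader) {
        // A null store means "use the default in-memory store".
        CookieManager manager = new CookieManager(null, CookiePolicy.ACCEPT_ORIGINAL_SERVER);
        Map<String, List<String>> responseHeaders =
                Collections.singletonMap("Set-Cookie", Collections.singletonList(setCookieHeader));
        try {
            manager.put(URI.create(requestUrl), responseHeaders);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return manager.getCookieStore().getCookies().size();
    }

    public static void main(String[] args) {
        // Cookie without a Domain attribute: it is bound to the origin host and accepted.
        System.out.println(storedCookieCount(
                "https://example.com/Login", "session=abc; Path=/"));

        // Cookie claiming a domain that does not match the origin host: the
        // policy rejects it, so the store stays empty.
        System.out.println(storedCookieCount(
                "https://example.com/Login", "session=abc; Domain=.other.org; Path=/"));
    }
}
```

If the second case matches what the real server sends, wire logging should show a Set-Cookie header in the response even though the store stays empty.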

Related

HTTP method PATCH - use with ip and not hostname

I ran into the HttpURLConnection error "Invalid HTTP method: PATCH" and got a suggestion here, but the X-HTTP-Method-Override workaround did not work for me. So I tried:
CloseableHttpClient httpClient = HttpClients.createDefault();
HttpPatch httpPatch = new HttpPatch(new URI("http://example.com"));
CloseableHttpResponse response = httpClient.execute(httpPatch);
This is where I am stuck. My request is an HTTPS request to the URL https://192.168.1.1/foo/bar. CloseableHttpClient validates the host against the hostname on the certificate, but I do not know the hostname for the IP 192.168.1.1, and I do not want a DNS lookup to happen (not even against the known hosts).
Is there any feasible way to perform a PATCH request in my case?

apache commons httpclient 4.23 form login problems different session cookies used in different requests

I have a protected resource which requires me to log in. I'm using the commons client with the following code block.
HttpClient httpClient = new HttpClient();
httpClient.getParams().setParameter("http.protocol.cookie-policy", CookiePolicy.BROWSER_COMPATIBILITY);
httpClient.getParams().setParameter("http.protocol.single-cookie-header", Boolean.TRUE);
PostMethod postMethod = new PostMethod("/admin/adminlogon.do");
postMethod.setRequestEntity(new StringRequestEntity("action=logon&adminUser=admin&adminPassword=password",
"application/x-www-form-urlencoded",
"UTF-8"));
postMethod.addParameter("action","logon");
postMethod.addParameter("adminUser","admin");
postMethod.addParameter("adminPassword","password");
httpClient.executeMethod(postMethod);
String response2 = postMethod.getResponseBodyAsString();
Above is where I basically log in. This works fine; I'm getting a nice little JSESSIONID cookie back.
GetMethod get = new GetMethod("/admin/api.do?action=getSomeJson");
httpClient.executeMethod(get);
When I check the logic on the server for the second request, I notice that a different JSESSIONID is being used, so the GET fails because it is not logged in. I was under the impression that the httpClient managed the cookies and sent the same cookie back. When I log into my app normally through the UI, I see the same cookie in each request, just not in this test code.
String s = get.getResponseBodyAsString();
get.releaseConnection();
Do I need to do something with the httpClient to ensure it uses the same cookies from the first POST request when it does its GET request?
Thanks in advance.
Your assumption regarding HTTP client cookie behavior is correct.
In your case you are not using the same httpClient instance. To fix it, create the httpClient only once (for example in a @PostConstruct method):
httpClient = new DefaultHttpClient(); // or new HttpClient();
Then perform your calls using that same client instance. The client takes the cookie from a response, stores it in its cookie store, and sends it with the next request.
[Added after the comment]
The following code works for me:
httpClient = new DefaultHttpClient();
// Create a local instance of cookie store
cookieStore = new BasicCookieStore();
// Set the store
httpClient.setCookieStore(cookieStore);
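The store-and-replay behavior described in this answer can be sketched without any third-party jars. The example below swaps in the JDK's java.net.CookieManager for HttpClient's cookie store purely as an illustration; the class name CookieReplayDemo, the host example.com, and the session value are assumptions, not part of the original code.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.CookieManager;
import java.net.URI;
import java.util.Collections;
import java.util.List;
import java.util.Map;

public class CookieReplayDemo {

    /** Records a Set-Cookie header as if it had arrived in a response from url. */
    static void recordResponse(CookieManager manager, String url, String setCookie) {
        Map<String, List<String>> responseHeaders =
                Collections.singletonMap("Set-Cookie", Collections.singletonList(setCookie));
        try {
            manager.put(URI.create(url), responseHeaders);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns the Cookie header values the manager would attach to a request for url. */
    static List<String> cookieHeaderFor(CookieManager manager, String url) {
        try {
            Map<String, List<String>> requestHeaders =
                    manager.get(URI.create(url), Collections.<String, List<String>>emptyMap());
            List<String> cookies = requestHeaders.get("Cookie");
            return cookies == null ? Collections.<String>emptyList() : cookies;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // One shared instance: the login cookie is available to the follow-up request.
        CookieManager shared = new CookieManager();
        recordResponse(shared, "http://example.com/admin/adminlogon.do", "JSESSIONID=abc123; Path=/");
        System.out.println(cookieHeaderFor(shared, "http://example.com/admin/api.do"));

        // A second, fresh instance knows nothing about the login and sends no cookie,
        // which is the situation the server would see as a brand-new session.
        System.out.println(cookieHeaderFor(new CookieManager(), "http://example.com/admin/api.do"));
    }
}
```

The same reasoning applies to HttpClient: whatever holds the cookies has to be the same object for the login request and every request that follows it.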

Can't authenticate with DefaultHttpClient

I want to add an authentication header to my request. I'm using DefaultHttpClient from Apache HttpClient 4.0.
I found that it's done this way:
URI uri = new URI("http://www.bla.bla/folder/");
String host = uri.getHost();
int port = uri.getPort();
httpClient.getCredentialsProvider().setCredentials(
new AuthScope(host, port, AuthScope.ANY_SCHEME),
new UsernamePasswordCredentials("myuser", "mypassword")
);
This code is executed, and in the debugger I can see that some credentials variables of the httpClient are set at the time of the request. But when I inspect the web traffic with Charles, there's no authentication header.
Content of vars:
host: www.bla.bla
port: -1
Btw. I enabled Charles as a proxy to see the headers of the request, with:
HttpHost proxy = new HttpHost("127.0.0.1", 8888, "http");
httpParameters.setParameter(ConnRoutePNames.DEFAULT_PROXY, proxy);
I think that should not alter my headers (that would make no sense for a web proxy). Anyway, if I disable the proxy it also doesn't work (I can't see the headers then, but I suppose it fails for the same reason).
I also tried using a request interceptor as described in Softhinker.com's post here: How can I send HTTP Basic Authentication headers in Android?
And I get exactly the same request, without an authentication header.
What am I doing wrong?
Thanks in advance.
I got it working setting the header "manually" in the request.
request.setHeader(new BasicHeader("Authorization", authstring));
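The answer does not show how authstring is built. For HTTP Basic authentication the value is conventionally the word "Basic", a space, and the Base64 encoding of "user:password". A minimal sketch follows, assuming Java 8 or later for java.util.Base64 (older Android code would need a different Base64 class); the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class BasicAuthHeader {

    /** Builds the value of an "Authorization" header for HTTP Basic authentication. */
    static String basicAuthValue(String user, String password) {
        String credentials = user + ":" + password;
        String encoded = Base64.getEncoder()
                .encodeToString(credentials.getBytes(StandardCharsets.UTF_8));
        return "Basic " + encoded;
    }

    public static void main(String[] args) {
        // Credentials taken from the question; the result would be passed as authstring.
        String authstring = basicAuthValue("myuser", "mypassword");
        System.out.println("Authorization: " + authstring);
    }
}
```

Setting the header by hand sends the credentials on the very first request. The credentials provider in the question is normally consulted only after the server answers with a 401 challenge, which may be why no header showed up in Charles.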

Java HttpClient seems to be caching content

I'm building a simple web scraper and I need to fetch the same page a few hundred times. There's an attribute in the page that is dynamic and should change on each request. I've built a multithreaded, HttpClient-based class to process the requests, and I'm using an ExecutorService to create a thread pool and run the threads. The problem is that the dynamic attribute sometimes doesn't change between requests, and I end up getting the same value on three or four subsequent threads. I've read a lot about HttpClient and I really can't find where this problem comes from. Could it be something to do with caching?
Update: here is the code executed in each thread:
HttpContext localContext = new BasicHttpContext();
HttpParams params = new BasicHttpParams();
HttpProtocolParams.setVersion(params, HttpVersion.HTTP_1_1);
HttpProtocolParams.setContentCharset(params,
        HTTP.DEFAULT_CONTENT_CHARSET);
HttpProtocolParams.setUseExpectContinue(params, true);
ClientConnectionManager connman = new ThreadSafeClientConnManager();
DefaultHttpClient httpclient = new DefaultHttpClient(connman, params);
HttpHost proxy = new HttpHost(inc_proxy, Integer.valueOf(inc_port));
httpclient.getParams().setParameter(ConnRoutePNames.DEFAULT_PROXY,
        proxy);
HttpGet httpGet = new HttpGet(url);
httpGet.setHeader("User-Agent",
        "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)");
String iden = null;
int timeoutConnection = 10000;
HttpConnectionParams.setConnectionTimeout(httpGet.getParams(),
        timeoutConnection);
try {
    HttpResponse response = httpclient.execute(httpGet, localContext);
    HttpEntity entity = response.getEntity();
    if (entity != null) {
        InputStream instream = entity.getContent();
        String result = convertStreamToString(instream);
        // System.out.printf("Resultado\n %s",result +"\n");
        instream.close();
        iden = StringUtils
                .substringBetween(result,
                        "<input name=\"iden\" value=\"",
                        "\" type=\"hidden\"/>");
        System.out.printf("IDEN:%s\n", iden);
        EntityUtils.consume(entity);
    }
} catch (ClientProtocolException e) {
    // TODO Auto-generated catch block
    System.out.println("Excepção CP");
} catch (IOException e) {
    // TODO Auto-generated catch block
    System.out.println("Excepção IO");
}
HttpClient does not cache by default (as long as you use only the DefaultHttpClient class). It caches only if you use CachingHttpClient, a decorator for the HttpClient interface that enables caching:
HttpClient client = new CachingHttpClient(new DefaultHttpClient(), cacheConfiguration);
It then analyzes the If-Modified-Since and If-None-Match headers to decide whether to send the request to the remote server or return the result from its cache.
I suspect that your issue is caused by a proxy server sitting between your application and the remote server.
You can test this easily with curl. First execute a number of requests that bypass the proxy:
#!/bin/bash
for i in {1..50}
do
    echo "*** Performing request number $i"
    curl -D - http://yourserveraddress.com -o "$i" -s
done
Then run diff across the downloaded files; all of them should differ in the attribute you mentioned. Next, add the -x/--proxy <host[:port]> option to curl, run the script again, and compare the files once more. If some responses are now identical to others, then you can be sure that this is a proxy server issue.
Generally speaking, in order to test whether or not HTTP requests are being made over the wire, you can use a "sniffing" tool that analyzes network traffic, for example:
Fiddler ( http://fiddler2.com/fiddler2/ ) - I would start with this
Wireshark ( http://www.wireshark.org/ ) - more low level
I highly doubt HttpClient is performing caching of any sort (this would imply it needs to store pages in memory or on disk - not one of its capabilities).
While this is not an answer, it's a point to ponder: is it possible that the server (or some proxy in between) is returning cached content? If you are performing many requests (simultaneously or nearly simultaneously) for the same content, the server may return cached content because it has decided that the information has not "expired" yet. In fact, the HTTP protocol provides caching directives for exactly this purpose. Here is a site that provides a high-level overview of the different HTTP caching mechanisms:
http://betterexplained.com/articles/how-to-optimize-your-site-with-http-caching/
I hope this gives you a starting point. If you have already considered these avenues then that's great.
You could try appending some unique dummy parameter to the URL on every request to try to defeat any URL-based caching (in the server, or somewhere along the way). It won't work if caching isn't the problem, or if the server is smart enough to reject requests with unknown parameters, or if the server is caching but only based on parameters it cares about, or if your chosen parameter name collides with a parameter the site actually uses.
If this is the URL you're using
http://www.example.org/index.html
try using
http://www.example.org/index.html?dummy=1
Set dummy to a different value for each request.
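A minimal sketch of that suggestion, assuming a per-process counter is unique enough and that the URL carries no fragment part. The parameter name dummy comes from the answer above; the class and method names are made up.

```java
import java.util.concurrent.atomic.AtomicLong;

public class CacheBuster {

    // Thread-safe counter so concurrent scraper threads never reuse a value.
    private static final AtomicLong COUNTER = new AtomicLong();

    /** Appends dummy=<value>, using '&' if the URL already has a query string. */
    static String withDummyParam(String url, long value) {
        char separator = url.contains("?") ? '&' : '?';
        return url + separator + "dummy=" + value;
    }

    /** Returns the URL with a dummy value that differs on every call. */
    static String nextUrl(String url) {
        return withDummyParam(url, COUNTER.incrementAndGet());
    }

    public static void main(String[] args) {
        String base = "http://www.example.org/index.html";
        System.out.println(nextUrl(base));
        System.out.println(nextUrl(base));
    }
}
```

Each thread would call nextUrl(url) and pass the result to new HttpGet(...) instead of the bare URL. The caveats listed in the answer still apply.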

Does Apache HttpClient add the Cookies set by the java.net.CookieHandler to Request?

My simple Apache HttpClient (4.0.1) client application makes an HttpGet request to a server URL in the main() method and prints the response. On startup, the application registers an implementation of java.net.CookieHandler in a static block.
On checking the cookies received on the server side, I found that the cookies are not being received by the Server when the HttpClient makes the GET request.
On the other hand, when I replaced the Apache HttpClient with a plain java.net.URL(HTTP_URL).openStream(), the cookies were set by the CookieHandler on the Request and were received by the Server.
Is it that CookieHandler does not work with Apache HttpClient?
Code:
Client.java
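static {
    CookieHandler.setDefault(new CookieHandler() {
        @Override
        public Map<String, List<String>> get(URI uri, Map<String, List<String>> requestHeaders) {
            return Collections.singletonMap("Cookie",
                    Collections.singletonList(COOKIE_STRING));
        }

        @Override
        public void put(URI uri, Map<String, List<String>> responseHeaders) {
            // Cookies sent back by the server are ignored in this example.
        }
    });
}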
Using HttpClient (does not put cookies on request)
HttpClient client = new DefaultHttpClient();
HttpGet get = new HttpGet(HTTP_URL);
client.execute(get);
Using java.net.URL (sets the cookies on request)
URL url = new URL(HTTP_URL);
InputStream is = url.openStream();
Is it that CookieHandler does not work with Apache HttpClient?
That is correct.
The Apache HttpClient codebase uses its own cookie and cookie store representations and mechanisms. See the HTTP state management section of the HttpClient tutorial. (It is pretty sketchy, but if you look at the javadocs for the relevant classes, you should be able to figure out how to use it.)
(If you are using an older version of Apache HttpClient, beware that the APIs have changed significantly.)
