Java HTTP GET times out but works fine in browser - java

NOTE: This is not a duplicate question. I'm aware of concrete problems that looked identical to mine and were solved by adding some data to the header requests. This is not the case, I've tried all the solutions and none works. Tried: this question and this one, nothing seems to work.
I'm trying to read contents of a website using Java. The url is URL to fetch. There's no authentication involved, and no forms are filled before. I can open that url in a cookie-free session and it still works with browser. I can even fetch it with Selenium, but HttpClient keeps refusing to load the URL.
The problem has nothing to do with certificates, right now I'm using a working "allow-all" certificate manager, so that's not the issue.
I've inspected my browser sent headers, nothing special:
Host: www.hipercor.es
User-Agent: Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:76.0) Gecko/20100101 Firefox/76.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Language: es-ES,es;q=0.8,en-US;q=0.5,en;q=0.3
Accept-Encoding: gzip, deflate, br
DNT: 1
Connection: keep-alive
Upgrade-Insecure-Requests: 1
As I said, I've already tried configuring the user agent to "fake" being Firefox.
Just to give some background: I'm building an enhanced version of crawler4j. My idea is to build a web scraper, and I found this issue while testing random shops that I know are currently being crawled in business environments by other scrapers.
Note that HEAD requests also fail.
The errors are either
java.net.SocketException: Connection reset
or
java.net.SocketTimeoutException: Read timed out
Please note that the browser loads the page perfectly, and so does Selenium (although it is slow as hell in that case).
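For comparison, the browser request shown above can be approximated from plain Java roughly as follows. This is only a sketch, assuming Java 11+ and the JDK's built-in java.net.http.HttpClient (not the Apache HttpClient that crawler4j uses); the class name BrowserLikeGet and the timeout values are made up for illustration. Host and Connection are left out because this client treats them as restricted headers and manages them itself, and Accept-Encoding is left out because this client does not decompress responses automatically. It does not by itself explain the reset or the timeout, but it gives a small reproduction to compare against the browser.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

class BrowserLikeGet {

    // Builds a GET request that mirrors the Firefox headers listed in the question.
    static HttpRequest buildRequest(String url) {
        return HttpRequest.newBuilder(URI.create(url))
                .timeout(Duration.ofSeconds(30)) // hypothetical per-request timeout
                .header("User-Agent",
                        "Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:76.0) Gecko/20100101 Firefox/76.0")
                .header("Accept",
                        "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8")
                .header("Accept-Language", "es-ES,es;q=0.8,en-US;q=0.5,en;q=0.3")
                .header("DNT", "1")
                .header("Upgrade-Insecure-Requests", "1")
                .GET()
                .build();
    }

    // Sends the request and returns the HTTP status code.
    static int fetchStatus(String url) throws Exception {
        HttpClient client = HttpClient.newBuilder()
                .connectTimeout(Duration.ofSeconds(10)) // hypothetical connect timeout
                .followRedirects(HttpClient.Redirect.NORMAL)
                .build();
        HttpResponse<String> response =
                client.send(buildRequest(url), HttpResponse.BodyHandlers.ofString());
        return response.statusCode();
    }
}
```

Calling BrowserLikeGet.fetchStatus(url) with the failing URL should show whether a second, independent client hits the same SocketException or SocketTimeoutException.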

Related

Android to PHP session without cookies

So far, I have been able to use the HttpURLConnection class in Java to make an app that can GET the form of my PHP website, put in the proper login details (username, password) and POST them back. I have double-checked this with the response codes and am getting 200 for both GET and POST.
I'm now having an issue accessing the page that a successful login should redirect to. It is my understanding that after a POST or GET, the connection is terminated once the response code is requested. My attempts to get the response cookies while logging in produce a "null" cookie.
The PHP site I am accessing does not seem to have any response cookies after a login when using "inspect element" in Chrome. Regardless of this, I have tried accessing the cookies in all sorts of ways with no luck. The request cookie header is there when I go to the website.
Am I missing something and the cookies are actually there? Or is it possible that the site does not use cookies to maintain a session? If that's the case, how would I access the page I want after logging in on my Android app?
Response Headers
Cache-Control:no-store, no-cache, must-revalidate, post-check=0, pre-check=0
Connection:Keep-Alive
Content-Encoding:gzip
Content-Length:23030
Content-Type:text/html; charset=utf-8
Date:Mon, 10 Aug 2015 23:03:26 GMT
Expires:Thu, 19 Nov 1981 08:52:00 GMT
Keep-Alive:timeout=15, max=100
Pragma:no-cache
Server:Apache/2.2.22 (Debian)
Vary:Accept-Encoding
X-Powered-By:PHP/5.4.4-14+deb7u11
Request Header
Accept:text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Encoding:gzip, deflate
Accept-Language:en-US,en;q=0.8
Cache-Control:max-age=0
Connection:keep-alive
Content-Length:52
Content-Type:application/x-www-form-urlencoded
Cookie:__utma=83554121.1278939357.1435860313.1435944069.1438202297.3; __utmc=83554121; __utmz=83554121.1438202297.3.3.utmcsr=google|utmccn=(organic)|utmcmd=organic|utmctr=(not%20provided); _ga=GA1.2.1278939357.1435860313; PHPSESSID=4q03j4ihb7trnm1pvvofc9f3f5
Host:WEBSITE
Origin:WEBSITE
Referer:WEBSITE
Upgrade-Insecure-Requests:1
User-Agent:Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/44.0.2403.130 Safari/537.36
Just an update on what I have figured out so far.
I logged in to my website on my browser on my PC. I took this PHPSESSID and used it in my app and I can access the page that the login page redirects to just fine.
My new question is: Why can't I assign myself a random PHPSESSID and go back to it? In other words, if I give myself PHPSESSID=123456 for example and use that when posting my login details, why does using that specific PHPSESSID still bring me back to the login page instead of the redirected one?
I'm currently trying to read the response header "Set-Cookie" but am having trouble doing so. In addition, even putting in the wrong user and/or password gives a response code of 200. Suggestions on how to check whether logging in was successful?
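Reading the session id from the server's Set-Cookie response header can be sketched as below. This is a minimal sketch using only java.net classes; the class name SessionCookies and its method names are hypothetical, and it assumes the server sends the PHPSESSID cookie on one of the responses the app actually receives (possibly the first GET rather than the login POST).

```java
import java.net.CookieHandler;
import java.net.CookieManager;
import java.net.CookiePolicy;
import java.net.HttpCookie;
import java.util.List;
import java.util.Optional;

class SessionCookies {

    // Option 1: install a process-wide cookie store once at startup.
    // HttpURLConnection then stores Set-Cookie values and sends them back automatically.
    static void enableCookieStore() {
        CookieHandler.setDefault(new CookieManager(null, CookiePolicy.ACCEPT_ALL));
    }

    // Option 2: pick a single cookie out of the raw Set-Cookie header values.
    // Pass in conn.getHeaderFields().get("Set-Cookie"), which may be null.
    static Optional<String> findCookie(List<String> setCookieHeaders, String name) {
        if (setCookieHeaders == null) {
            return Optional.empty();
        }
        for (String header : setCookieHeaders) {
            try {
                for (HttpCookie cookie : HttpCookie.parse(header)) {
                    if (cookie.getName().equalsIgnoreCase(name)) {
                        return Optional.of(cookie.getValue());
                    }
                }
            } catch (IllegalArgumentException malformed) {
                // Skip header values that are not valid cookies.
            }
        }
        return Optional.empty();
    }
}
```

With option 2, the value found would be sent back on later requests with conn.setRequestProperty("Cookie", "PHPSESSID=" + value) before connecting.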

Basic http authentication from browser to server via socket

I am trying to write a simple java server that receives HTTP GET requests from web browser and sends back some data. All the communication is done via sockets.
I am able to process the requests and now I am trying to implement a simple authentication with BASIC AUTH so some requests will be handled only if correct credentials are provided in the request header. For sake of simplicity, I am using only http protocol (not https). I am not sure how to access the header and read the credentials on my server, though:
The server runs on localhost, port 9000 and this is the sample URL that I am trying to process:
http://user:password@localhost:9000/files/text?tid=file3
I open the socket and read everything:
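InputStream input = clientSocket.getInputStream();
// Reading line by line with a BufferedReader
java.io.BufferedReader in = new java.io.BufferedReader(new java.io.InputStreamReader(input));
String line;
// readLine() returns null at end of stream, so check for that before testing for the blank line
while ((line = in.readLine()) != null && !line.isEmpty()) {
    System.out.println(line);
}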
This is what I get:
GET /files/text?tid=file3 HTTP/1.1
Host: localhost:9000
Connection: keep-alive
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Ubuntu Chromium/41.0.2272.76 Chrome/41.0.2272.76 Safari/537.36
Accept-Encoding: gzip, deflate, sdch
Accept-Language: en-US,en;q=0.8
There is no trace of the auth credentials I put in the URL so I am not sure how to parse the request. Could you tell me what am I missing here?
I know that this example is very simple and there are more clever ways to implement this but I am curious how to access these credentials in this simple model case.
Use a client like curl:
$ curl -v -u user:password "http://localhost:9000/files/text?tid=file3"
Since HTTP is stateless, sending the Authorization header with the request is enough. That's what curl does. It is not necessary to wait for the server to return a 401.
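On the server side, the credentials then show up as one more header line of the form "Authorization: Basic <base64 of user:password>" among the lines the loop already prints. A minimal sketch of decoding it, and of the 401 challenge a server would typically send when the header is missing, might look like this; the class name BasicAuthHeader, its method names, and the realm value are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class BasicAuthHeader {

    // Returns {user, password}, or null if the line is not a usable Basic Authorization header.
    static String[] parse(String headerLine) {
        String name = "Authorization:";
        if (headerLine == null || !headerLine.regionMatches(true, 0, name, 0, name.length())) {
            return null;
        }
        String value = headerLine.substring(name.length()).trim();
        String scheme = "Basic ";
        if (!value.regionMatches(true, 0, scheme, 0, scheme.length())) {
            return null;
        }
        try {
            byte[] raw = Base64.getDecoder().decode(value.substring(scheme.length()).trim());
            String decoded = new String(raw, StandardCharsets.UTF_8);
            int colon = decoded.indexOf(':'); // the password may itself contain ':'
            if (colon < 0) {
                return null;
            }
            return new String[] { decoded.substring(0, colon), decoded.substring(colon + 1) };
        } catch (IllegalArgumentException notBase64) {
            return null;
        }
    }

    // Response head to send when credentials are missing or wrong.
    static String challenge(String realm) {
        return "HTTP/1.1 401 Unauthorized\r\n"
                + "WWW-Authenticate: Basic realm=\"" + realm + "\"\r\n"
                + "Content-Length: 0\r\n"
                + "\r\n";
    }
}
```

In the read loop, each line could be passed to BasicAuthHeader.parse(line); if no line yields credentials, writing BasicAuthHeader.challenge("files") to the socket asks the client to authenticate.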

How to check http methods supported by the tomcat

In the sample below, I can see that the HTTP method GET is used for the request. Now I want to know which HTTP methods my Tomcat server supports. Please help me find this out.
GET / HTTP/1.0
Connection: Keep-Alive
User-Agent: Mozilla/4.6 (X11; I; Linux 2.2.6-15apmac ppc)
Host: zink.demon.co.uk:1126
Accept: image/gif, */*
There is no way to know this unless you ask the server which methods it supports.
Usually this is done with the OPTIONS HTTP method, but not all web servers support it. Also, it applies to a specific URL, not the whole server.
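Asking a server via OPTIONS can be sketched in Java as follows. This is a minimal sketch using HttpURLConnection; the class name AllowedMethods, its method names, and the 5-second timeouts are hypothetical, and a server may omit the Allow header or not handle OPTIONS at all, in which case the result is an empty list.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

class AllowedMethods {

    // Splits an Allow header value such as "GET, HEAD, POST" into method names.
    static List<String> parseAllow(String allowHeader) {
        List<String> methods = new ArrayList<>();
        if (allowHeader == null) {
            return methods;
        }
        for (String part : allowHeader.split(",")) {
            String method = part.trim();
            if (!method.isEmpty()) {
                methods.add(method.toUpperCase(java.util.Locale.ROOT));
            }
        }
        return methods;
    }

    // Sends an OPTIONS request for one URL and returns the methods listed in Allow.
    static List<String> query(String url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) URI.create(url).toURL().openConnection();
        try {
            conn.setRequestMethod("OPTIONS");
            conn.setConnectTimeout(5000);
            conn.setReadTimeout(5000);
            conn.getResponseCode(); // forces the request to be sent
            return parseAllow(conn.getHeaderField("Allow"));
        } finally {
            conn.disconnect();
        }
    }
}
```

Because the answer applies per URL, AllowedMethods.query would need to be called for each path of interest rather than once for the whole server.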

Webscarab refusing connections with all browsers

Hi, I am using WebScarab as a proxy to see the conversations between my browser and the server. I haven't used WebScarab in a while. It was OK when I used it before, but now all connections are refused, with all browsers and ports. It shows an exception like the one below.
WebScarab encountered an error trying to retrieve
GET http://www.gooogle.com:80/ HTTP/1.1
Host: www.gooogle.com
User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux i686; rv:13.0) Gecko/20100101 Firefox/13.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip, deflate
Proxy-Connection: keep-alive
Cache-Control: max-age=0
The error was :
Connection refused
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:327)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:193)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:180)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:384)
at java.net.Socket.connect(Socket.java:546)
at org.owasp.webscarab.httpclient.URLFetcher.connect(URLFetcher.java:368)
at org.owasp.webscarab.httpclient.URLFetcher.fetchResponse(URLFetcher.java:229)
at org.owasp.webscarab.plugin.proxy.CookieTracker$Plugin.fetchResponse(CookieTracker.java:130)
at org.owasp.webscarab.plugin.proxy.BrowserCache$Plugin.fetchResponse(BrowserCache.java:101)
at org.owasp.webscarab.plugin.proxy.RevealHidden$Plugin.fetchResponse(RevealHidden.java:100)
at org.owasp.webscarab.plugin.proxy.BeanShell$Plugin.fetchResponse(BeanShell.java:229)
at org.owasp.webscarab.plugin.proxy.ManualEdit$Plugin.fetchResponse(ManualEdit.java:243)
at org.owasp.webscarab.plugin.proxy.ConnectionHandler.run(ConnectionHandler.java:233)
at java.lang.Thread.run(Thread.java:636)
This is the error I get from Firefox now. I tried changing the listener ports and browsers, but to no avail. I still get the same exception. Can someone please help?
The most likely problem is that there is a proxy configured in WebScarab itself, that WebScarab cannot reach. To check this, go to Tools -> Proxies, and make sure that there is no proxy configured (unless you need an upstream proxy to reach the sites normally, in which case make sure that that is properly configured.)
Connection between javafx2.2 webengine and webscarab fails
it worked for me.

Session Cookies and IE 8

I recently built a simple web-app deployed over Tomcat. The app uses pretty standard session based security where a user who has logged in is given a session.
Sessions work fine in Firefox and Chrome, but require the use of jsessionid in the URL for IE (tested 7 & 8), set to medium privacy. In IE 8, I tried to override cookie handling, setting "Allow all 3rd party cookies" and "Allow all session cookies"- no dice. However, when I run Tomcat on my local machine, IE accepts the cookie, and sessions work just fine.
And now, for the HTTP headers.
From Chrome, a logged in user gets a session
GET http://devl:8080/testing/ HTTP/1.1
Host: devl:8080
Connection: keep-alive
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US) AppleWebKit/532.5 (KHTML, like Gecko) Chrome/4.1.249.1036 Safari/532.5
Accept: application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
Accept-Encoding: gzip,deflate,sdch
Accept-Language: en-US,en;q=0.8
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.3
HTTP/1.1 200 OK
Server: Apache-Coyote/1.1
P3P: CP="NON CURa ADMa DEVa TAIa OUR BUS IND UNI COM NAV INT STA"
Set-Cookie: JSESSIONID=9280023BCE2046F32B13C89130CBC397; Path=/testing
Content-Type: text/html;charset=UTF-8
Content-Language: en-US
Content-Length: 2450
Date: Fri, 26 Mar 2010 14:14:40 GMT
GET http://devl:8080/testing/logout HTTP/1.1
Host: devl:8080
Connection: keep-alive
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US) AppleWebKit/532.5 (KHTML, like Gecko) Chrome/4.1.249.1036 Safari/532.5
Referer: http://devl:8080/testing/
Accept: application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
Accept-Encoding: gzip,deflate,sdch
Accept-Language: en-US,en;q=0.8
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.3
Cookie: JSESSIONID=9280023BCE2046F32B13C89130CBC397
...
From IE 8, with standard medium level security and privacy-
GET http://devl:8080/testing/ HTTP/1.1
Accept: application/x-ms-application, image/jpeg, application/xaml+xml, image/gif, image/pjpeg, application/x-ms-xbap, */*
Accept-Language: en-US
User-Agent: Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; Win64; x64; Trident/4.0; .NET CLR 2.0.50727; SLCC2; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; MDDC; Tablet PC 2.0)
UA-CPU: AMD64
Accept-Encoding: gzip, deflate
Host: devl:8080
Connection: Keep-Alive
HTTP/1.1 200 OK
Server: Apache-Coyote/1.1
P3P: CP="NON CURa ADMa DEVa TAIa OUR BUS IND UNI COM NAV INT STA"
Set-Cookie: JSESSIONID=192999F922D6E9C868314452726764BA; Path=/testing
Content-Type: text/html;charset=UTF-8
Content-Language: en-US
Content-Length: 2450
Date: Fri, 26 Mar 2010 14:32:34 GMT
GET http://devl:8080/testing/logout HTTP/1.1
Accept: application/x-ms-application, image/jpeg, application/xaml+xml, image/gif, image/pjpeg, application/x-ms-xbap, */*
Referer: http://devl:8080/testing/;jsessionid=6371A83EFE39A46997544F9146AA5CEA
Accept-Language: en-US
User-Agent: Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; Win64; x64; Trident/4.0; .NET CLR 2.0.50727; SLCC2; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; MDDC; Tablet PC 2.0)
UA-CPU: AMD64
Accept-Encoding: gzip, deflate
Connection: Keep-Alive
Host: devl:8080
...
I thought it might be P3P, but on adding a compact policy, nothing changes. This is the standard Tomcat session, so I'm really surprised I haven't been able to find other people with the same problem so far. Anyone have any ideas?
EDIT 4/3/2010 -
Sorry if I didn't make this clear- I've tried from multiple other instances of IE - co-workers down the hall, etc.
EDIT 4/3/2010 -
I've also tried turning on prompting for all cookies, but I don't get a prompt. Setting the domain in the "Set-Cookie" header using Fiddler didn't make a difference, either.
I ran into this exact problem, dug around for a while, and found this:
http://forums.iis.net/p/1147938/1879164.aspx
which says that domain names that have underscores in them cause problems with Windows Server, Tomcat and IE.
Not sure if this fixes your problem (and at this point, you probably don't care), but maybe the next person who comes along can gain some value from it.
Problem: IE8 refused to accept cookies on a site I had built, but Firefox and IE7 worked just fine and had done so for ages - this was stable code.
Solution (for me): My server is in a different time zone to the client machine. The STUPID, IDIOTIC IE8 tries to be clever and refuses to accept cookies (stored in the local client machine) with a 20 minute life. My PHP code was straight from the text book, thus:
setcookie($name,$value,time()+1200);
But it works fine if I change it to, for example -
setcookie($name,$value,time()+120000);
This still leaves me with the problem of making the cookie die after 20 minutes, but at least my users can now use my website with IE8. I pass on this information in case it may help someone else.
Have you checked that the server time is correct?
I have had similar problems recently with IE not accepting cookies properly. After a lot of head scratching it turned out to be because the time difference between the server and client machines was so big that IE refused to accept the cookie. This was in Apache however.
Try using the standard HTTP port (80). I've read about issues with port numbers in URLs regarding privacy/security in IE more than once but can't seem to find relevant links at this time.
I agree with Lexicore - the cookie protocol from the web server looks right, so there's something with IE. It would be easier to figure out how to address the issue if we understood better why IE is rejecting the cookie. Alternatively, ask a friend to hit the site for you in IE to help confirm it's a server issue, not a browser instance issue.
Here are some things to check to help debug with IE and cookies - unfortunately, there's a mess of options to check. Sorry if some of these items seem basic - I just don't want to make any assumptions. I'm following along in IE 8.0 for this.
First, browse to the target site (http://devl:8080/testing/) in IE. Then:
Confirm what zone IE classifies 'http://devl:8080/testing/' as. (This could explain why it works with Tomcat on your local machine.) The zone is displayed in the bottom bar of the browser and it most likely says "Internet". If it instead says "Local intranet", "Trusted Site", or "Restricted Site", this may be part of the problem and you should update your question or figure out why it isn't classified as Internet.
Double-click on the zone indicator in the bottom bar (presumably "Internet") to open the Security dialog. Is the Security Level for Internet set to Medium-high? If it isn't, this could be part of the problem and you should probably reset it back to match your users.
Select the "Internet" zone and then click the "Custom level ..." button to open the Security Settings dialog. Confirm the "Userdata persistence" option is set to "Enable". The "Userdata persistence" option is in the bottom 1/4 of the list of options in the "Miscellaneous" section (near the bottom of the section just above the next section "Scripting").
Click OK on each dialog to close both of them.
On the menubar (enable it if it is not enabled), click "Tools" > "Internet Options". Select the "Privacy" tab. I know you mentioned you tried some things here, but those changes may not affect your site if your site is not in the Internet zone or if your site is in the "Per Site Privacy Actions" exception list, so it's best to just confirm.
Is the privacy setting in the Privacy tab set to Medium? If not, you may want to reset to default.
Click the "Sites" button to open the Per Site Privacy Actions dialog. Is your devl site listed? If so, remove it. Click OK to dismiss the dialog. Alternatively, you could force your devl site to always Allow cookies.
Click the "Advanced" button. Is "Override automatic cookie handling" checked? If so, you might want to uncheck it to match your users. Alternatively, try checking it and checking "Always allow session cookies."
Click OK on each dialog to close both of them.
Confirm the browser is still at the target site ('http://devl:8080/testing/'). Click "View" > "Webpage Privacy Policy..." to view the Privacy Report dialog. Does the list include "http://devl:8080/testing/"? Does the Cookie column indicate "Accepted" for "http://devl:8080/testing/"?
Select "http://devl:8080/testing/" from the list. Click Summary to see the Privacy Policy. If you set one for your site, you should see it here. Otherwise, you should get a message that a privacy policy was not found. Look at the bottom of the dialog to see how the site is set to use cookies (compare, always allow, or never allow).
Hope this helps or gives you some ideas to pursue.
Ref:
http://blogs.msdn.com/ieinternals/archive/2009/08/20/WinINET-IE-Cookie-Internals-FAQ.aspx
http://www.practicalmachinist.com/vb/general/how-manage-cookies-internet-explorer-181641/
http://support.microsoft.com/kb/283185
This forum concerning P3P seems relevant.
Also have you considered setting your domain and expiration date for the session cookie?
This clearly has nothing to do with Tomcat, since the cookie is being set, just not accepted by IE. This must be a security issue in IE, then.
Maybe this MS article would help to tune it.
What security zone is the devl site part of? IE handles cookies and lots of other security settings differently depending on the zone (and how the zone is configured).
Try setting the devl site to explicitly be part of the Trusted Sites, for example, and see what happens.
Zones:
Internet
Local Intranet
Trusted Sites
Restricted Sites
Also, does the cookie have to be restricted to the /testing path? Try setting it for / and see if that makes a difference.
I would try using the fully qualified hostname of the server. MSIE treats hostnames without a domain as being in the "Local intranet" zone and handles security differently.
Specifically, instead of:
http://devl:8080/testing/
Try using something like:
http://devl.mydomain.com:8080/testing/
It seems from what you're saying that you've only seen this issue in IE and only using computers in your office. Is there any sort of "security suite" installed by IT on all office computers, and if so, can you temporarily disable it? Oftentimes, these types of applications hook into IE and muck with its HTTP stack. If you do have software like that installed, do you have a "clean" installation or non-company computer you can test with?
The time on our servers was off by 14 minutes (and in the correct time zone, EST).
Once we set the time on the server to the correct time, cookies started working again.
Ed
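Several of the answers above come down to the server clock disagreeing with the client clock. One way to check this is to compare the Date response header with the local time. The following is a minimal sketch, assuming the header uses the usual RFC 1123 format seen in the captures above; the class name ClockSkew and its method name are hypothetical.

```java
import java.time.Duration;
import java.time.Instant;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;

class ClockSkew {

    // Returns how far the local clock is ahead of the server's Date header
    // (negative if the local clock is behind the server).
    static Duration between(String dateHeader, Instant localNow) {
        Instant serverTime = ZonedDateTime
                .parse(dateHeader, DateTimeFormatter.RFC_1123_DATE_TIME)
                .toInstant();
        return Duration.between(serverTime, localNow);
    }
}
```

For a live check, the header value would come from the response (for example conn.getHeaderField("Date")) and localNow from Instant.now(); a skew close to or larger than the cookie lifetime would be consistent with the behaviour described in the answers.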
