Please consider the following code snippet:
URL url = new URL("https://wfs.geodienste.ch/av/deu?&LAYERS=LCSFC,SOLI,SOSFC,LOCPOS,HADR,LNNA,OSNR,RESF,OSBP,MBSF&FORMAT=image%2Fjpeg&DPI=100&SERVICE=WMS&VERSION=1.1.1&REQUEST=GetMap&STYLES=&SRS=EPSG%3A2056&BBOX=2637158.3069220297,1236087.7425729688,2639935,1237632&WIDTH=2463&HEIGHT=1369");
HttpURLConnection uc = (HttpURLConnection) url.openConnection();
InputStream is = uc.getInputStream();
For the given URL I get a 401 Exception:
java.io.IOException: Server returned HTTP response code: 401 for URL: https://wfs.geodienste.ch/av/deu?&LAYERS=LCSFC,SOLI,SOSFC,LOCPOS,HADR,LNNA,OSNR,RESF,OSBP,MBSF&FORMAT=image%2Fjpeg&DPI=100&SERVICE=WMS&VERSION=1.1.1&REQUEST=GetMap&STYLES=&SRS=EPSG%3A2056&BBOX=2637158.3069220297,1236087.7425729688,2639935,1237632&WIDTH=2463&HEIGHT=1369
at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1839)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1440)
at sun.net.www.protocol.https.HttpsURLConnectionImpl.getInputStream(HttpsURLConnectionImpl.java:254)
at HTTPTestMain.doit(HTTPTestMain.java:43)
at HTTPTestMain.main(HTTPTestMain.java:28)
Now I'd expect the URL to ask for some credentials when using it with a browser. Surprisingly there are no credentials needed and the response in e.g. Firefox is 200.
Just out of curiosity, I added the following line of code:
uc.setRequestProperty("Authorization", "Basic ");
Still the same 401 response.
Can you tell me what's needed to get the Response right with Java?
Kind regards
Klib
401 means "Unauthorized", so something is probably wrong with your credentials. You could use an Authenticator:
Authenticator.setDefault(new Authenticator(){
@Override
protected PasswordAuthentication getPasswordAuthentication(){
return new PasswordAuthentication(login, password.toCharArray());
}
});
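The snippet above uses login and password without defining them. Here is a minimal self-contained sketch of the same idea; the class name DefaultCredentials and the method install are made up for illustration, and the credentials in main are placeholders. Note that the Authenticator is only consulted when the server actually sends an authentication challenge, so it may not help if the server rejects the request for another reason.

```java
import java.net.Authenticator;
import java.net.PasswordAuthentication;

class DefaultCredentials {
    // Registers a JVM-wide Authenticator that answers every
    // authentication challenge with the given login and password.
    static void install(final String login, final String password) {
        Authenticator.setDefault(new Authenticator() {
            @Override
            protected PasswordAuthentication getPasswordAuthentication() {
                return new PasswordAuthentication(login, password.toCharArray());
            }
        });
    }

    public static void main(String[] args) {
        // Placeholder credentials, for illustration only.
        install("demo-user", "demo-password");
    }
}
```

Call install(...) once, before opening the connection.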
I was also able to access that page from a browser without any password.
It is possible that they are treating requests from a web browser differently from programmatic requests; i.e. from an automated scraper. For example, they may be looking at the "User-Agent" header in your requests.
But if you try to reverse engineer this to evade possible restrictions, they may decide to block you using other mechanisms.
I think you need to contact the site's support: https://geodienste.ch/support. Ask them how to deal with this. Find out if you need an account, and how to get one.
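Before contacting support, it may help to look at what the server actually sends back with the 401. The following is a minimal sketch, not a definitive diagnostic tool: the class name ResponseProbe and its methods are made up for illustration, and it assumes the error body is UTF-8 text. It prints the status, the WWW-Authenticate challenge (if any) and the error body, which sometimes explains the refusal.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.StandardCharsets;

class ResponseProbe {
    // Reads a stream fully as UTF-8 text (assumed encoding); null yields "".
    static String readAll(InputStream in) throws IOException {
        if (in == null) return "";
        try (InputStream s = in) {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = s.read(chunk)) != -1) buf.write(chunk, 0, n);
            return new String(buf.toByteArray(), StandardCharsets.UTF_8);
        }
    }

    // Prints the status line, the challenge header and the error body.
    static void probe(String address) throws IOException {
        HttpURLConnection uc =
            (HttpURLConnection) URI.create(address).toURL().openConnection();
        // getResponseCode() returns 4xx codes instead of throwing,
        // unlike getInputStream().
        int status = uc.getResponseCode();
        System.out.println(status + " " + uc.getResponseMessage());
        System.out.println("WWW-Authenticate: " + uc.getHeaderField("WWW-Authenticate"));
        if (status >= 400) System.out.println(readAll(uc.getErrorStream()));
        uc.disconnect();
    }
}
```

Calling ResponseProbe.probe(...) with the URL from the question should show whether the server sends a challenge at all.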
I'm writing a script to update several queries that we use in our project every time we deploy a sprint.
I'm trying to replicate a request that works when I test it in Fiddler, in the following way:
System.setProperty("sun.net.http.allowRestrictedHeaders", "true");
String host = 'redmine.our-domain.com';
String url = 'http://redmine.our-domain.com/queries/4088';
String REDMINE_SESSION_COOKIE = "_redmine_session=BAh7DkkiDHVzZXJfaWQGOgZFRmkvSSIKY3RpbWUGOwBGbCsHmouFWkkiCmF0aW1lBjsARmwrByk211tJIg9zZXNzaW9uX2lkBjsARkkiJTMzZWJkNmI1MzA4MzZkNmMxNGYwNjY1OWQxMDZjZmU3BjsAVEkiEF9jc3JmX3Rva2VuBjsARkkiMVB3bDlCb0F5NFFCbTd3dmdGWGx0VjdEL05WYjhVRGExdFluQmNMbnFZTHM9BjsARkkiCnF1ZXJ5BjsARnsHOgdpZGkC%2BA86D3Byb2plY3RfaWRpAssBSSIWaXNzdWVzX2luZGV4X3NvcnQGOwBGSSIMaWQ6ZGVzYwY7AEZJIg1wZXJfcGFnZQY7AEZpaUkiFWZqbGVzX2luWGV4X3NvcnQGOwBGSSINZm2sZW5hbWUGOwBG--5c961485290b3c98f38de934b939d25cc01e092f"
String data = "_method=put&authenticity_token=Pwl9BoAy4QBm7wvgFXlsV7D%2FNVb8UDa2tYnBcLnqYLs%3D&query%5Bname%5D=Current+sprint+1.75-test+API+0+0+1&query%5Bvisibility%5D=2query%5Bgroup_by%5D=category&f%5B%5D=status_id&op%5Bstatus_id%5D=o&f%5B%5D6=fixed_version_id&v%5Bfixed_version_id%5D%5B%5D=6030&c%5B%5D=tracker&c%5B%5D=status&c%5B%5D=priority&c%5B%5D=subject&c%5B%5D=assigned_to&c%5B%5D=fixed_version&c%5B%5D=start_date&c%5B%5D=due_date&c%5B%5D=estimated_hours&c%5B%5D=done_ratio&c%5B%5D=parent";
byte[] body = data.getBytes("UTF-8");
HttpURLConnection http = (HttpURLConnection) new URL(url).openConnection();
http.setRequestMethod('POST');
http.setRequestProperty('Cookie', REDMINE_SESSION_COOKIE);
http.setRequestProperty('Content-Type', 'application/x-www-form-urlencoded');
http.setRequestProperty('Host', host);
http.setRequestProperty('Content-Length', "${body.length}");
http.setDoOutput(true);
http.getOutputStream().write(body);
Both the authenticity_token in data and the session cookie shown here are fakes; in the real code I copy-paste the values from the Fiddler request.
I'm adding the Host and Content-Length headers because Fiddler always adds them.
Fiddler returns a 302 status, which is correct, because Redmine redirects the page.
With the code above I receive a 422 status (Unprocessable Entity) with this message in the body:
Invalid form authenticity token
I've spent 3 days trying to figure out what I'm doing wrong to clone the request. Any clue?
You should use Redmine's API to achieve your goal, rather than sending HTML form data to the controller.
Redmine's login form also creates hidden form fields, which you can see by inspecting the page in your browser (usually F12).
One such hidden field is the authenticity token, and a new one is generated every time the form is rendered.
Fiddler probably works because it is performing basic authentication, as described here:
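To illustrate the point about the hidden field, here is a rough sketch of pulling the token out of freshly fetched form HTML. The class name AuthenticityToken is hypothetical, and the regular expression assumes the field is rendered as an input tag with the name attribute before the value attribute; check the real markup first, since it may differ.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AuthenticityToken {
    // Assumed markup: <input ... name="authenticity_token" value="..." ...>
    private static final Pattern HIDDEN_FIELD =
        Pattern.compile("name=\"authenticity_token\"[^>]*?value=\"([^\"]*)\"");

    // Returns the token found in the form HTML, or null if there is none.
    static String extract(String html) {
        Matcher m = HIDDEN_FIELD.matcher(html);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Made-up form fragment, for illustration only.
        String html = "<input type=\"hidden\" name=\"authenticity_token\" value=\"abc123=\" />";
        System.out.println(extract(html));
    }
}
```

The token presumably has to come from a page fetched with the same session cookie that is sent with the later request, and it needs URL-encoding (for example with URLEncoder.encode(token, "UTF-8")) before it goes into the form body.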
http://www.redmine.org/projects/redmine/wiki/Rest_api#Authentication
So in your code, remove the part that tries to mimic the form data and use basic authentication instead, like this:
System.setProperty("sun.net.http.allowRestrictedHeaders", "true");
String host = 'redmine.our-domain.com';
String url = 'http://redmine.our-domain.com/queries/4088';
String auth = Base64.getEncoder().encodeToString((username+":"+password).getBytes(StandardCharsets.UTF_8)); // java.util.Base64 requires Java 8 or later
HttpURLConnection http = (HttpURLConnection) new URL(url).openConnection();
http.setRequestProperty("Authorization", "Basic "+auth);
http.setRequestMethod('POST');
http.setRequestProperty('Cookie', REDMINE_SESSION_COOKIE);
http.setRequestProperty('Content-Type', 'application/x-www-form-urlencoded');
http.setRequestProperty('Host', host);
http.setRequestProperty('Content-Length', "${body.length}");
http.setDoOutput(true);
http.getOutputStream().write(body);
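The snippet above leaves username and password undefined. As a small sketch of just the header construction in plain Java (the class name BasicAuth is made up, and the credentials are placeholders):

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class BasicAuth {
    // Builds the value for an "Authorization" header using the Basic scheme:
    // "Basic " followed by base64("username:password").
    static String header(String username, String password) {
        String pair = username + ":" + password;
        return "Basic " + Base64.getEncoder()
            .encodeToString(pair.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        // Placeholder credentials, for illustration only.
        System.out.println(header("user", "pass")); // prints "Basic dXNlcjpwYXNz"
    }
}
```

The result would be passed as http.setRequestProperty("Authorization", BasicAuth.header(username, password)). Basic credentials are only base64-encoded, not encrypted, so this is best used over https.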
I am facing this HTTP 403 Forbidden response from a https REST service when I am trying to call it from my java code. Can you kindly let me know if I am missing something here?
Please note that the server returns the expected data when I trigger the request from any browser / SOAPUI/ Chrome Postman clients.
2 peer certificates are used - as shown in the ssl info from soapui after the request is sent.
The code snippet is attached. [The headers I set in the code are taken from the request header I found from the successful requests]
HttpsURLConnection connection = (HttpsURLConnection)new URL("https://server address").openConnection();
connection.setRequestProperty("Authorization", "Basic " + authStringEnc);
connection.setRequestProperty("Content-Type", "application/json");
connection.setRequestProperty("Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8");
connection.setRequestProperty("Accept-Encoding","gzip, deflate, br");
connection.setRequestProperty("User-Agent", "Apache-HttpClient/4.1.1");
connection.addRequestProperty("Connection", "Keep-Alive");
connection.addRequestProperty("Cache-Control","no-cache");
connection.setRequestMethod("GET");
connection.connect();
System.out.println("Response Code : " + connection.getResponseCode()+" "+connection.getResponseMessage());
Response Code : 403 Forbidden
Can you check whether the server URL is in a format that Java accepts?
Sometimes Java needs escape characters to recognize strings correctly.
The question "What are all the escape characters?" can help you check whether you are using any of those characters. Also check that the conversion in the function is done properly.
Also, if you have a more complex URL, consider using java.net.URL.
Finally, check the User-Agent header; see "Setting user agent of a java URLConnection".
Thanks for your response. The issue was the session cookie that has to be used for the connection. We are able to connect and get the response back with response code HTTP 200 once the cookie with JSESSIONID is passed as a header. Thanks again.
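For anyone landing here with the same problem, this is a rough sketch of carrying a session cookie from one response to the next request by hand. The class name SessionCookies is made up, and the sketch deliberately ignores cookie attributes such as Path, Domain and expiry, so treat it as an illustration rather than a complete cookie implementation.

```java
import java.net.HttpCookie;
import java.util.ArrayList;
import java.util.List;

class SessionCookies {
    // Turns the Set-Cookie header values of one response into a single
    // Cookie header value ("name=value; name2=value2") for the next request.
    static String toCookieHeader(List<String> setCookieValues) {
        List<String> pairs = new ArrayList<>();
        if (setCookieValues != null) {
            for (String raw : setCookieValues) {
                for (HttpCookie c : HttpCookie.parse(raw)) {
                    pairs.add(c.getName() + "=" + c.getValue());
                }
            }
        }
        return String.join("; ", pairs);
    }

    public static void main(String[] args) {
        // Made-up header value, for illustration only.
        List<String> headers = new ArrayList<>();
        headers.add("JSESSIONID=abc123; Path=/; HttpOnly");
        System.out.println(toCookieHeader(headers));
    }
}
```

In practice the input would come from something like first.getHeaderFields().get("Set-Cookie") on the login response, and the result would be set with connection.setRequestProperty("Cookie", ...) before connecting.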
I am tasked with checking whether some URLs are working correctly, I'm using Java to make HTTP get request to get the response code.
So what I did was this.
URL u = new URL("some URL");
HttpURLConnection huc = (HttpURLConnection) u.openConnection();
huc.setRequestMethod("GET");
huc.connect();
int code = huc.getResponseCode();
System.out.println(code + " " + huc.getURL());
The Problem: Some sites require you to login to access the page, but the page doesn't return a 401 code, but 200. Note that the web page doesn't show up until a username and password are provided. It asks for authentication in a pop up window.
So how do I catch these kind of links?
Also, how can I identify if a webpage shows a login page like http://www.example.com/login/? Is it sufficient to just check the URL for the word “login”?
There's no universal way to deal with this. You have to know how the site you're checking does authentication: a 401? A separate login page? Multi-factor auth (e.g. using an RSA token)? Checking for the substring "login" in the URL will handle some cases, but it is not enough as a general solution.
For example, a 401 will only happen when using basic authentication (or when trying to access protected resources directly). There are a lot of other ways to do auth.
John sums up the issue quite well in his comment:
If you have to deal with pages that roll their own custom authentication, then it follows that you probably have to write your own custom code to accommodate them. Depending on how the relevant sites work, you might be able to bypass authentication by sending an appropriate cookie in your request, as if you had already authenticated, or by some similar means
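As a rough starting point for the checker, here is a sketch that turns off automatic redirect following so that a redirect to a login page stays visible, and then classifies the response. Everything here is a heuristic in the spirit of the answers above: the class LinkCheck, the Result names and the "login" substring test are assumptions, and sites with custom authentication will still need custom handling.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.util.Locale;

class LinkCheck {
    enum Result { OK, NEEDS_AUTH, REDIRECTS_TO_LOGIN, REDIRECT, ERROR }

    // Heuristic classification from the status code and the Location header.
    static Result classify(int status, String location) {
        if (status == 401 || status == 407) return Result.NEEDS_AUTH;
        if (status >= 300 && status < 400) {
            boolean login = location != null
                && location.toLowerCase(Locale.ROOT).contains("login");
            return login ? Result.REDIRECTS_TO_LOGIN : Result.REDIRECT;
        }
        return (status >= 200 && status < 300) ? Result.OK : Result.ERROR;
    }

    // Performs the request without following redirects and classifies it.
    static Result check(String address) throws IOException {
        HttpURLConnection huc =
            (HttpURLConnection) URI.create(address).toURL().openConnection();
        huc.setInstanceFollowRedirects(false); // keep the redirect visible
        huc.setRequestMethod("GET");
        try {
            return classify(huc.getResponseCode(), huc.getHeaderField("Location"));
        } finally {
            huc.disconnect();
        }
    }
}
```

A page that answers 200 and renders its own login form will still come back as OK here; catching that would mean inspecting the returned HTML for that particular site.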
I am trying to retrieve the HTML text from a list of pages returned by Google. Most of them work fine, but URLs such as https://www.google.com/patents/US6034687 always give a 401 error, see below:
Server returned HTTP response code: 401 for URL: https://www.google.com/patents/US6034687
I am using Java, and I did look up this error code. It seems to be authentication related, but this kind of URL can be accessed from any browser without being asked to log in. So I am confused: why does only this kind of URL not work for me?
Here is my code for retrieving the HTML:
URL u=new URL(url);
StringBuilder html =new StringBuilder();
HttpURLConnection conn = (HttpURLConnection) u.openConnection();
conn.setRequestMethod("GET");
conn.setRequestProperty("Accept", "text/html");
BufferedReader br;
try {
br = new BufferedReader(new InputStreamReader((conn.getInputStream())));
String out="";
while ((out= br.readLine()) != null) {
// System.out.println(out);
html.append(out+"\n");
}
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
Any idea?
thanks
Try sending a User-Agent header in the request. That 401 status is misleading. Some servers do not allow requests from non-browser clients.
conn.setRequestProperty("User-Agent", "Mozilla/5.0 (Windows NT 5.2; rv:21.0) Gecko/20100101 Firefox/21.0");
BTW, when you do openConnection() for an https scheme, the return value is HttpsURLConnection, which extends HttpURLConnection.
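Putting the suggestion together with the code from the question, a minimal sketch might look like this. The class name HtmlFetcher is made up, and decoding the response as UTF-8 is an assumption; a real client would look at the charset in the Content-Type header.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.StandardCharsets;

class HtmlFetcher {
    static final String USER_AGENT =
        "Mozilla/5.0 (Windows NT 5.2; rv:21.0) Gecko/20100101 Firefox/21.0";

    // Creates the connection and sets the headers; nothing is sent yet.
    static HttpURLConnection open(String address) throws IOException {
        HttpURLConnection conn =
            (HttpURLConnection) URI.create(address).toURL().openConnection();
        conn.setRequestMethod("GET");
        conn.setRequestProperty("Accept", "text/html");
        conn.setRequestProperty("User-Agent", USER_AGENT);
        return conn;
    }

    // Sends the request and returns the body as text (UTF-8 assumed).
    static String fetch(String address) throws IOException {
        HttpURLConnection conn = open(address);
        StringBuilder html = new StringBuilder();
        try (BufferedReader br = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = br.readLine()) != null) {
                html.append(line).append('\n');
            }
        }
        return html.toString();
    }
}
```

The try-with-resources block closes the reader even when reading fails, which the original loop did not do.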
The request requires user authentication. The response MUST include a WWW-Authenticate header field containing a challenge applicable to the requested resource. The client MAY repeat the request with a suitable Authorization header field. If the request already included Authorization credentials, then the 401 response indicates that authorization has been refused for those credentials. If the 401 response contains the same challenge as the prior response, and the user agent has already attempted authentication at least once, then the user SHOULD be presented the entity that was given in the response, since that entity might include relevant diagnostic information. HTTP access authentication is explained in "HTTP Authentication: Basic and Digest Access Authentication".
My code goes like this:
URL url;
URLConnection uc;
StringBuilder parsedContentFromUrl = new StringBuilder();
String urlString="http://www.example.com/content/w2e4dhy3kxya1v0d/";
System.out.println("Getting content for URl : " + urlString);
url = new URL(urlString);
uc = url.openConnection();
uc.connect();
uc.getInputStream();
BufferedInputStream in = new BufferedInputStream(uc.getInputStream());
int ch;
while ((ch = in.read()) != -1) {
parsedContentFromUrl.append((char) ch);
}
System.out.println(parsedContentFromUrl);
However, when I access the URL through a browser there is no problem, but when I try to access it through a Java program, it throws an exception:
java.io.IOException: Server returned HTTP response code: 403 for URL
What is the solution?
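Set the User-Agent header on the connection right after url.openConnection() and before uc.connect(), since request properties cannot be changed once the connection is open: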
uc = url.openConnection();
uc.addRequestProperty("User-Agent",
"Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)");
That said, from the site owner's point of view it is a reasonable idea to allow only certain types of user agents; this helps keep a website safe and its bandwidth usage low.
There are known bad user agents that you might want to block from your server if you don't want people leeching your content and bandwidth. But the user agent can be spoofed, as you can see in my example above.
403 means Forbidden. From the HTTP/1.1 specification (RFC 2616):
10.4.4 403 Forbidden
The server understood the request, but is refusing to fulfill it. Authorization will not help and the request SHOULD NOT be repeated. If the request method was not HEAD and the server wishes to make public why the request has not been fulfilled, it SHOULD describe the reason for the refusal in the entity. If the server does not wish to make this information available to the client, the status code 404 (Not Found) can be used instead.
You need to contact the owner of the site to make sure the permissions are set properly.
EDIT: I see your problem. I ran the URL through Fiddler and noticed that I am getting a 407, which is described below. This should help you go in the right direction.
10.4.8 407 Proxy Authentication Required
This code is similar to 401 (Unauthorized), but indicates that the client must first authenticate itself with the proxy. The proxy MUST return a Proxy-Authenticate header field (section 14.33) containing a challenge applicable to the proxy for the requested resource. The client MAY repeat the request with a suitable Proxy-Authorization header field (section 14.34). HTTP access authentication is explained in "HTTP Authentication: Basic and Digest Access Authentication".
Also see this relevant question.
java.io.IOException: Server returned HTTP response code: 403 for URL
If the browser can access the page and your code cannot, then there's something different between the browser request and your request. You can look at the browser request, using, say, Firebug, to see what the differences are. Some things I can think of are:
The site sets a cookie (maybe during login). You may be able to handle this in code, but you will have to explicitly add support for passing the cookie. This is most likely.
The site filters based on user agents. You can set the user agent. This is not as likely.
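For the cookie case, one option is to let the JDK keep and resend cookies automatically instead of copying headers by hand. This is a minimal sketch under assumptions: the class name CookieSetup is made up, and ACCEPT_ALL is chosen only to keep the example simple; a stricter policy may be more appropriate.

```java
import java.net.CookieHandler;
import java.net.CookieManager;
import java.net.CookiePolicy;

class CookieSetup {
    // Installs a JVM-wide, in-memory cookie store. URLConnection-based
    // HTTP requests made afterwards store received cookies and send
    // them back on later requests to matching URLs.
    static CookieManager enable() {
        CookieManager manager = new CookieManager(null, CookiePolicy.ACCEPT_ALL);
        CookieHandler.setDefault(manager);
        return manager;
    }

    public static void main(String[] args) {
        enable();
        System.out.println(CookieHandler.getDefault() != null);
    }
}
```

Call enable() once before the first request, for example before the request that performs the login.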