I know that there are a lot of questions regarding this subject, but I still don't get it.
I want to get the current URL from my browser in my program. What do I need to succeed?
Does the connection have to be made over an HTTP connection? Would a proxy server help me more, since I have to filter those URLs? Please help me, I am so confused.
request.getRequestURL();
will get you the URL of the current request from the HttpServletRequest. More documentation can be found at https://docs.oracle.com/javaee/6/api/javax/servlet/http/HttpServletRequest.html
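As a small illustration of the answer above: getRequestURL() returns a StringBuffer that does not include the query string, and getQueryString() returns null when there is none. The sketch below joins the two. The class and method names (RequestUrls, fullUrl) are made up for this example, and the helper takes plain values so that it compiles without the servlet API on the classpath.

```java
// Hypothetical helper, not part of the servlet API.
final class RequestUrls {
    private RequestUrls() {}

    // requestUrl:  the value of request.getRequestURL()
    // queryString: the value of request.getQueryString(), may be null
    static String fullUrl(StringBuffer requestUrl, String queryString) {
        if (queryString == null || queryString.isEmpty()) {
            return requestUrl.toString();
        }
        return requestUrl.toString() + "?" + queryString;
    }
}
```

Inside a servlet this would be called as RequestUrls.fullUrl(request.getRequestURL(), request.getQueryString()).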
I wrote a Java program like the one I saw here:
How to read the https page content using java?
but for some sites the code does not work.
I got the error: Server returned HTTP response code: 403 for URL: https://research.investors.com/stock-quotes/nyse-sailpoint-tech-holdings-sail.htm
It works for
url = "https://maven.apache.org/guides/mini/guide-repository-ssl.html";
Can someone help me?
HTTP status 403 stands for "Forbidden"; most likely investors.com checks your request headers and denies access to the resource.
Try modifying the request headers to send a User-Agent that the site might accept.
403 Forbidden
The request contained valid data and was understood by the server, but the server is refusing action. This may be due to the user not having the necessary permissions for a resource or needing an account of some sort, or attempting a prohibited action (e.g. creating a duplicate record where only one is allowed). This code is also typically used if the request provided authentication by answering the WWW-Authenticate header field challenge, but the server did not accept that authentication. The request should not be repeated.
So the website you want to scrape has probably just restricted requests like yours (I mean requests that were not made from a browser).
But you can try Selenium.
OK, I solved it.
I used con.setRequestProperty to set "User-Agent", "Accept", "Content-Type", and "Accept-Language".
Thank you.
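For later readers, the fix described above (setting request headers with setRequestProperty) might look roughly like this. It is only a sketch: the class name BrowserLikeRequest and the header values are assumptions chosen for illustration, and a given site may expect different ones. Content-Type is left out here because a plain GET has no request body.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

final class BrowserLikeRequest {
    private BrowserLikeRequest() {}

    // Opens (but does not yet connect) a connection that carries browser-like headers.
    // The header values below are illustrative assumptions, not required values.
    static HttpURLConnection open(String address) throws IOException {
        HttpURLConnection con = (HttpURLConnection) URI.create(address).toURL().openConnection();
        con.setRequestMethod("GET");
        con.setRequestProperty("User-Agent", "Mozilla/5.0 (X11; Linux x86_64) ExampleClient/1.0");
        con.setRequestProperty("Accept", "text/html,application/xhtml+xml;q=0.9,*/*;q=0.8");
        con.setRequestProperty("Accept-Language", "en-US,en;q=0.5");
        return con;
    }
}
```

The request is only sent once getResponseCode() or getInputStream() is called on the returned connection.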
java.io.IOException: Server returned HTTP response code: 500 for URL: https://***/fiwebservice/services/FIUsbWebService
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1459)
at sun.net.www.protocol.https.HttpsURLConnectionImpl.getInputStream(HttpsURLConnectionImpl.java:234)
at com.abcde.testClient.TestClientTry.main(TestClientTry.java:109)
I have replaced the URL with *** for security purposes, as it is confidential.
Why is there an error when I call a SOAP web service from Eclipse?
Please help me with this.
It seems that there is an error in com.abcde.testClient.TestClientTry. Could you provide the logs and the source of the file?
HTTP 500 can mean many things. In Spring Security I think it can mean that you didn't have the appropriate authentication to reach the resource. Without knowing much about your server side it's hard to say what the problem is or how to solve it.
What kind of technology do you have on the server?
HTTP status code 500 usually means that the web server code crashes.
If HttpURLConnection#getResponseCode() reports an error, you need to read from HttpURLConnection#getErrorStream() instead of getInputStream() (check the status code first to decide which one to use). In other words, the error stream may contain information about the problem.
If the host had blocked you, you would have received a 4xx status code, most likely 401 or 403.
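A minimal sketch of reading the error stream on a failing status, assuming an already configured HttpURLConnection. The class name ErrorBodyReader is made up for this example, and UTF-8 is assumed for the body. With a SOAP service, the body of a 500 response often contains a fault message that explains what went wrong on the server.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.nio.charset.StandardCharsets;

final class ErrorBodyReader {
    private ErrorBodyReader() {}

    // Returns the response body, reading getErrorStream() for 4xx/5xx statuses
    // because getInputStream() throws an IOException for those.
    static String readBody(HttpURLConnection con) throws IOException {
        int status = con.getResponseCode();
        InputStream in = status >= 400 ? con.getErrorStream() : con.getInputStream();
        if (in == null) {
            return ""; // the server sent no body
        }
        try (InputStream body = in) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            int n;
            while ((n = body.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            // Assumes UTF-8; a real client would honor the Content-Type charset.
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        }
    }
}
```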
I'm using this. I changed my base URL and database name, but when I try to sign up, I get the following error.
Any ideas? It would also be great if you could guide me towards what the base URL for CouchDB should be.
My current URL: public static final String BASE_URL ="http://10.0.2.2:5984/_utils/database.html?colourity";
That exception basically means that you're trying to speak a protocol to the server that it doesn't handle. For example, if you're trying to connect to a SOCKS4 proxy but the server is an HTTP server, it will return that response.
Basically, I'd try to debug a bit further. Use Log.d() to see what you are sending to the server and what it answers, and work out why the information you're sending is not correct.
With the HttpServletRequest object we have getRequestURL(), which shows the resource requested, but in my case I would like to know where the request comes from:
I also tried getRemoteAddr() and getLocalAddr(), which print my local IP (I am running GlassFish and a small web server that talks to GlassFish locally).
But the IP doesn't give the full referer, which in my case should be
http://my.domain.com/wiki/aPage
From my IP I can resolve http://my.domain.com, but not the full URL.
Does this mean I also need to send "wiki/aPage" in the request, or is there a better way?
Thanks.
You can read the Referer header of the request and get the value by using request.getHeader("Referer");
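One caveat worth sketching: getHeader returns null when the client did not send the header, and browsers may omit Referer. The helper below wraps that case. RefererLookup is a made-up name, and it accepts any function that behaves like HttpServletRequest::getHeader so that it compiles without the servlet API on the classpath.

```java
import java.util.Optional;
import java.util.function.Function;

final class RefererLookup {
    private RefererLookup() {}

    // headerGetter is expected to behave like HttpServletRequest::getHeader,
    // i.e. to return null when the header is absent.
    static Optional<String> referer(Function<String, String> headerGetter) {
        String value = headerGetter.apply("Referer");
        if (value == null || value.isEmpty()) {
            return Optional.empty();
        }
        return Optional.of(value);
    }
}
```

In a servlet, this could be used as RefererLookup.referer(request::getHeader).orElse("(no referer)").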
I was expecting this code to return a 404, however it produces the output :
"Response code is 200"
Would it be possible to learn how to differentiate between existent and non-existent web pages . . . thanks so much,
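try
{
    // create the HttpURLConnection
    URL url = new URL("http://www.thisurldoesnotexist");
    HttpURLConnection connection = (HttpURLConnection) url.openConnection();
    // getResponseCode() connects and sends the request if that has not happened yet
    System.out.println("Response code is " + connection.getResponseCode());
}
catch (IOException e)
{
    // e.g. java.net.UnknownHostException when the host name does not resolve
    e.printStackTrace();
}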
EDIT: I see you call openConnection() but not connect() - could that be the problem? I would expect getResponseCode() to actually make the request if it hasn't already, but it's worth just trying that...
That suggests you've possibly got some DNS resolver which redirects to a "helper" (spam) page, or something like that.
The easiest way to see exactly what's going on here is to use Wireshark - have that up and capturing traffic (HTTP-only, to make life easier) and then run your code. You should be able to see what's going on that way.
Note that I wouldn't have expected a 404, because that would involve being able to find a web server to talk to in the first place. If you're trying to go to a host which doesn't exist, there shouldn't be an HTTP response at all. I'd expect connect() to throw an exception.
Try adding a "connection.connect();" or look at the contents returned...
It could be a DNS issue, i.e. your DNS lookup is being sent to a parking page. For example, freedns does this.
You could:
Resolve the IP from the host of the page
Try to connect to port 80 on the resolved IP using plain sockets
This is a bit low level, however, and will add complexity, since you will need to make a simple GET request through the socket and then validate the response so you're sure that it's actually an HTTP server running on port 80.
NMap might be able to help you here.
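The two steps above (resolve the host, then try a plain socket connection) could be sketched as follows. HostProbe and its method names are made up for this example, and it stops short of sending a GET request. Note that if your resolver rewrites failed lookups, resolve() will return an address even for names that do not really exist.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.net.UnknownHostException;

final class HostProbe {
    private HostProbe() {}

    // Returns the resolved IP address, or null if the name does not resolve.
    static String resolve(String host) {
        try {
            return InetAddress.getByName(host).getHostAddress();
        } catch (UnknownHostException e) {
            return null;
        }
    }

    // Returns true if a TCP connection to host:port succeeds within timeoutMillis.
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            return false;
        }
    }
}
```

For the question's URL, that would be HostProbe.resolve("www.thisurldoesnotexist") followed by HostProbe.canConnect(ip, 80, 3000) on the returned address.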
Ideally you should be getting this error:
java.net.UnknownHostException: www.thisurldoesnotexist
But it looks like your URL is resolved by your DNS provider.
For instance, on my company's network, running your code with the URI "http://profile/" displays the employee profile.
Please also check the etc/hosts file, if you are on Windows, to see whether any settings have been changed.
Like @spgennard, I think this is most likely a DNS issue. One of the following is probably true:
The URL you have chosen is owned by a DNS speculator.
The URL you have chosen is "parked" by a DNS provider.
Your ISP is messing with your DNS results to send your browser to some search page.
It is also possible that you are accessing the web via a proxy, and the proxy is doing something strange.
The way to diagnose this is to look at the other information in the HTTP responses you are getting, particularly the response body.
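A rough sketch of that kind of diagnosis, assuming an HttpURLConnection that has been created but not yet read. ResponseInspector and describe are made-up names. Calling con.setInstanceFollowRedirects(false) before passing the connection in lets you see a redirect to a parking or search page instead of silently following it.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.nio.charset.StandardCharsets;

final class ResponseInspector {
    private ResponseInspector() {}

    // Summarizes the status, the Location and Server headers, and the first
    // maxBodyBytes bytes of the body (decoded as UTF-8, which is an assumption).
    static String describe(HttpURLConnection con, int maxBodyBytes) throws IOException {
        int status = con.getResponseCode();
        StringBuilder sb = new StringBuilder();
        sb.append("status=").append(status).append('\n');
        sb.append("location=").append(con.getHeaderField("Location")).append('\n');
        sb.append("server=").append(con.getHeaderField("Server")).append('\n');
        InputStream in = status >= 400 ? con.getErrorStream() : con.getInputStream();
        if (in != null) {
            try (InputStream body = in) {
                byte[] buf = new byte[maxBodyBytes];
                int total = 0;
                int n;
                while (total < buf.length
                        && (n = body.read(buf, total, buf.length - total)) != -1) {
                    total += n;
                }
                sb.append("body=").append(new String(buf, 0, total, StandardCharsets.UTF_8));
            }
        }
        return sb.toString();
    }
}
```

A body full of ads or search results, or a Location header pointing at an unrelated domain, would support the DNS or proxy explanations above.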