Check if url exists JAVA - java

I'm trying to check if the url entered by the user actually exists.
Below is what I have tried.
public static Boolean checkURLExists(String urlName)
{
    Boolean urlCheck=false;
    try{
        URL url = new URL(urlName);
        HttpURLConnection.setFollowRedirects(false);
        HttpURLConnection huc = (HttpURLConnection) url.openConnection();
        huc.setRequestMethod("GET");
        int responseCode = huc.getResponseCode();
        String responseMessage = huc.getResponseMessage();
        char a=String.valueOf(Math.abs((long)huc.getResponseCode())).charAt(0);
        if ((a == '2' || a == '3')&& (responseMessage.equalsIgnoreCase("ok")||responseMessage.equalsIgnoreCase("found")||responseMessage.equalsIgnoreCase("redirect"))) {
            System.out.println("GOOD "+responseCode+" - "+a);
            urlCheck=true;
        } else {
            System.out.println("BAD "+responseCode+" - "+a);
        }
    }catch(Exception e){
        e.printStackTrace();
    }
    return urlCheck;
}
The issue with the above code is that it reports http://www.gmail.com, http://www.yahoo.co.in, etc. as invalid URLs with response code 301 and response message "Moved Permanently", even though they actually redirect to another URL. Is there any way to detect whether a URL, when entered in a browser, will open a page?
Thank you.

Well the normal behavior of a web browser when it sees a 301 response is to follow the redirect. But you seem to have told your test code NOT to do that. If you want your code to behave (more) like a browser would, change this
HttpURLConnection.setFollowRedirects(false);
to this
HttpURLConnection.setFollowRedirects(true);
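A minimal sketch of the same check that classifies by numeric status code rather than by reason phrase, assuming that any 2xx or 3xx status should count as "exists". This matters because HttpURLConnection does not follow a redirect that switches protocol (for example http to https) even when following is enabled, so a 301 can still come back. The class and method names are illustrative and not from the original code.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

final class UrlExistenceCheck {

    // Assumption: a 2xx (success) or 3xx (redirect) status means a browser would land somewhere.
    static boolean isSuccessOrRedirect(int status) {
        return status >= 200 && status < 400;
    }

    // Returns true for a 2xx/3xx answer; false for 4xx/5xx or any failure to connect.
    static boolean urlExists(String urlName) {
        HttpURLConnection huc = null;
        try {
            huc = (HttpURLConnection) new URL(urlName).openConnection();
            huc.setInstanceFollowRedirects(true); // per connection, no JVM-wide side effect
            huc.setRequestMethod("HEAD");         // headers suffice; fall back to GET if a server rejects HEAD
            huc.setConnectTimeout(5000);
            huc.setReadTimeout(5000);
            return isSuccessOrRedirect(huc.getResponseCode());
        } catch (IOException | ClassCastException e) {
            // malformed URL, unknown host, timeout, or a non-HTTP scheme
            return false;
        } finally {
            if (huc != null) {
                huc.disconnect();
            }
        }
    }
}
```

Calling `UrlExistenceCheck.urlExists(urlName)` in place of `checkURLExists(urlName)` would then accept a 301 response without depending on the exact wording of the response message.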

Related

Java Facebook server to server get code URL parameter

I've been working on this for a couple of weeks now...
Basically, what I am trying to do is log in to Facebook (authenticate, accept the permissions, etc.), parse the returned "code" URL query param, and use that "code" param to get the FB user access token...
This is the FB_OAuthURL (for this example that is the name of the variable):
https://www.facebook.com/dialog/oauth?client_id=<APP_ID>&redirect_uri=http%3A%2F%2Flocalhost%2Fconnect%2Flogin_success.html&scope=public_profile%2Cpublish_actions%2Cuser_about_me%2Cuser_actions.books%2Cuser_actions.fitness%2Cuser_actions.music%2Cuser_actions.news%2Cuser_actions.video%2Cuser_birthday%2Cuser_education_history%2Cuser_events%2Cuser_games_activity%2Cuser_hometown%2Cuser_religion_politics%2Cuser_status%2Cuser_tagged_places%2Cuser_work_history%2Crsvp_event%2Cuser_relationships%2Cuser_relationship_details%2Cuser_location%2Cuser_likes%2Cuser_posts&state=<RANDOM_NUMBER>
The following is the method that I am using
public static String getFinalRedirectedUrl(String url) {
    HttpURLConnection connection;
    String finalUrl = url;
    try {
        do {
            connection = (HttpURLConnection) new URL(finalUrl).openConnection();
            connection.setInstanceFollowRedirects(false);
            connection.setUseCaches(false);
            connection.setRequestMethod("GET");
            connection.connect();
            // read the status once and reuse it in the range check
            int responseCode = connection.getResponseCode();
            if (responseCode >= 300 && responseCode < 400) {
                String redirectedUrl = connection.getHeaderField("Location");
                if (null == redirectedUrl)
                    break;
                finalUrl = redirectedUrl;
                System.out.println("redirected url: " + finalUrl);
            } else
                break;
        } while (connection.getResponseCode() != HttpURLConnection.HTTP_OK);
        connection.disconnect();
    } catch (Exception e) {
        e.printStackTrace();
    }
    return finalUrl;
}
However, the result of this (let's call it FB_REDIRCTURL) is the following:
https://www.facebook.com/login.php?skip_api_login=1&api_key=<APP_ID>&signed_next=1&next=https%3A%2F%2Fwww.facebook.com%2Fv2.5%2Fdialog%2Foauth%3Fredirect_uri%3Dhttp%253A%252F%252Flocalhost%252Fconnect%252Flogin_success.html%26state%3D-<RANDOM_NUMBER>%26scope%3Dpublic_profile%252Cpublish_actions%252Cuser_about_me%252Cuser_actions.books%252Cuser_actions.fitness%252Cuser_actions.music%252Cuser_actions.news%252Cuser_actions.video%252Cuser_birthday%252Cuser_education_history%252Cuser_events%252Cuser_games_activity%252Cuser_hometown%252Cuser_religion_politics%252Cuser_status%252Cuser_tagged_places%252Cuser_work_history%252Crsvp_event%252Cuser_relationships%252Cuser_relationship_details%252Cuser_location%252Cuser_likes%252Cuser_posts%26client_id%3D<APP_ID>%26ret%3Dlogin&cancel_url=http%3A%2F%2Flocalhost%2Fconnect%2Flogin_success.html%3Ferror%3Daccess_denied%26error_code%3D200%26error_description%3DPermissions%2Berror%26error_reason%3Duser_denied%26state%3D-4486902649550591089%23_%3D_&display=page
My two questions are:
If I copy/paste this URL into a browser, it redirects me and I get the "code" param, but that only happens when I copy and paste manually. How do I get the method to move forward and eventually retrieve the HTTP response that I am looking for?
The FB_REDIRCTURL says that there was an error within the URL params; however, as I stated, it still works when I copy and paste the URL into a browser. Any ideas why that is?
Thanks everyone -- I really appreciate the help
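For the parsing half of the question, here is a minimal sketch of pulling the "code" parameter out of a redirect URL once one has been obtained. It does not perform the login itself: the login.php URL above looks like an HTML page that waits for a user to sign in, so following Location headers alone would not be expected to produce a code. The error fields in FB_REDIRCTURL sit inside the cancel_url parameter, which appears to describe where a user who declines would be sent, so they need not indicate a failure. QueryParams is a hypothetical helper name.

```java
import java.net.URI;
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.Optional;

final class QueryParams {

    // Returns the decoded value of the first query parameter called `name`, if present.
    static Optional<String> get(String url, String name) {
        String query = URI.create(url).getRawQuery(); // raw, so encoded '&' and '=' stay intact
        if (query == null) {
            return Optional.empty();
        }
        for (String pair : query.split("&")) {
            int eq = pair.indexOf('=');
            String key = eq < 0 ? pair : pair.substring(0, eq);
            if (decode(key).equals(name)) {
                return Optional.of(eq < 0 ? "" : decode(pair.substring(eq + 1)));
            }
        }
        return Optional.empty();
    }

    private static String decode(String s) {
        return URLDecoder.decode(s, StandardCharsets.UTF_8);
    }
}
```

Given the final redirect to login_success.html, `QueryParams.get(redirectUrl, "code")` would yield the value to exchange for the access token.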

How do I authenticate with facebook serverside without using the JS library

On my site, a user is allowed to sign in with Facebook. When doing that, I ask for permission
to post to the user's feed. This works like a charm.
When signed in, a user is allowed to write a review, and when saving the review the user is asked whether to post it to their Facebook feed. Since the post to Facebook should be done after the review is saved in my local DB, I understand that I need to perform an authentication server-side, and then, once I have a token, I'm able to POST to e.g.
http://graph.facebook.com/10XXXX40308/feed
with
message : "This works"
I have been trying to implement the facebook web login as described here:
The steps are:
1. Perform a request against
https://graph.facebook.com/oauth/authorize?client_id=MY_API_KEY&
redirect_uri=http://www.facebook.com/connect/login_success.html&
scope=publish_stream
2. Facebook will redirect you to
http://www.facebook.com/connect/login_success.html?
code=MY_VERIFICATION_CODE
3. Request
https://graph.facebook.com/oauth/access_token?client_id=MY_API_KEY&
redirect_uri=http://www.facebook.com/connect/login_success.html&
client_secret=MY_APP_SECRET&code=MY_VERIFICATION_CODE
Facebook will respond with access_token=MY_ACCESS_TOKEN
When doing 1. in a browser the application behaves accordingly. I get a redirect back from facebook with the MY_VERIFICATION_CODE:
So I try to do it in code like this:
String url = "https://graph.facebook.com/oauth/authorize?client_id="+clientId+"&scope=publish_stream&redirect_uri=http://www.facebook.com/connect/login_success.html";
URL obj = new URL(url);
conn = (HttpURLConnection) obj.openConnection();
conn.setReadTimeout(5000);
conn.setRequestMethod("GET");
System.out.println("Request URL ... " + url);
boolean redirect = false;
// normally, 3xx is redirect
int status = conn.getResponseCode();
if (status != HttpURLConnection.HTTP_OK) {
if (status == HttpURLConnection.HTTP_MOVED_TEMP
|| status == HttpURLConnection.HTTP_MOVED_PERM
|| status == HttpURLConnection.HTTP_SEE_OTHER)
redirect = true;
}
System.out.println("Response Code ... " + status);
if (redirect) {
// get redirect url from "location" header field
String newUrl = conn.getHeaderField("Location");
// get the cookie if need, for login
String cookies = conn.getHeaderField("Set-Cookie");
// open the new connnection again
conn = (HttpURLConnection) new URL(newUrl).openConnection();
conn.setRequestProperty("Cookie", cookies);
System.out.println("Redirect to URL : " + newUrl);
}
BufferedReader in = new BufferedReader(
new InputStreamReader(conn.getInputStream()));
String inputLine;
StringBuffer html = new StringBuffer();
while ((inputLine = in.readLine()) != null) {
html.append(inputLine);
}
in.close();
System.out.println("URL Content... \n" + html.toString());
System.out.println("Done");
But what happens is that instead of getting the 302 back I get a 200 back and the login
page in code:
It seems that I have missed a step or do not understand the flow.
What I'm trying to accomplish is to implement a similar call like to janrain's:
https://rpxnow.com/api/v2/facebook/stream.publish
where you are allowed to do this.
Thank you!
I guess RTFM is in order here. The user needs to authenticate, so what I'm really trying to do here is bypass the authentication process. This is of course not allowed. So
how do you solve this?
When the user authenticates, I need to save the access token and the expiry time so that
I can add them to the request later on.
I think that is the only way... Correct me if I'm wrong.
So in the authentication process I create a regexp:
Pattern pattern = Pattern.compile("access_token=([a-zA-Z0-9]+)&expires=([0-9]+)");
Then in the callback from facebook I extract the token with the regexp:
String accessToken = "";
String expires = "";
Matcher matcher = pattern.matcher(token.trim());
if(matcher.matches()) {
accessToken = matcher.group(1);
expires = matcher.group(2);
} else {
return new JSONObject()
.put("error", "OathBean: accessToken is null");
}
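The pattern above only matches when the token is purely alphanumeric and the fields arrive in exactly that order. A more tolerant sketch, assuming the callback body is a form-encoded list of key=value pairs as in the example, splits the body into a map instead. FormBody is a hypothetical helper name, not part of the original code.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

final class FormBody {

    // Splits "a=1&b=2" into a map, URL-decoding keys and values.
    // If a key repeats, the later value overwrites the earlier one.
    static Map<String, String> parse(String body) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (String pair : body.trim().split("&")) {
            if (pair.isEmpty()) {
                continue;
            }
            int eq = pair.indexOf('=');
            String key = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            fields.put(URLDecoder.decode(key, StandardCharsets.UTF_8),
                       URLDecoder.decode(value, StandardCharsets.UTF_8));
        }
        return fields;
    }
}
```

With this, `FormBody.parse(token).get("access_token")` returns null when the field is missing, which can drive the same error branch as the failed regexp match.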
I then call facebook to get the values for the user and return all the values so that I can work with them:
return new JSONObject()
.put("facebookId", facebookId)
.put("firstName", firstName)
.put("lastName", lastName)
.put("email", email)
.put("photo", photo)
.put("accessToken", accessToken)
.put("expires", expires);
Later on, when the user wants to post a review to Facebook, I populate the request and post the review.
Map<String, String> requestData = new HashMap<String, String>();
requestData.put("link",url(newReview));
requestData.put("description", "reviewText");
requestData.put("access_token", credential.getPassword());
String query = createQuery(requestData);
JSONObject result = null;
try {
URL url = new URL("https://graph.facebook.com/"+identifier+"/feed");
HttpURLConnection conn = (HttpURLConnection) url.openConnection();
conn.setRequestProperty("Content-Type", "application/x-www-form-urlencoded");
conn.setRequestMethod("POST");
conn.setDoOutput(true);
conn.connect();
OutputStreamWriter osw = new OutputStreamWriter(conn.getOutputStream(), "UTF-8");
osw.write(query);
osw.close();
result = new JSONObject(IOUtils.toString(conn.getInputStream()));
}
catch (Exception e) {
log.error("Could not call graph feed to publish for id: "+identifier, e);
}
if(result != null) {
boolean success = StringUtils.isNotBlank(result.getString("id"));
entityManager.persist(new FBPublishEvent(currentUser, newReview, success, result.toString()));
}
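The createQuery helper is not shown above. A minimal sketch of what such a helper might look like, assuming it is meant to produce an application/x-www-form-urlencoded body from the map; the class name FormQuery and the implementation are assumptions, not the original code.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.StringJoiner;

final class FormQuery {

    // Joins the entries as key=value pairs separated by '&', URL-encoding both sides as UTF-8.
    static String createQuery(Map<String, String> data) {
        StringJoiner query = new StringJoiner("&");
        for (Map.Entry<String, String> e : data.entrySet()) {
            query.add(URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8)
                    + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8));
        }
        return query.toString();
    }
}
```

Encoding every value matters here because the link and the access token can both contain characters such as '&', '=' or '|' that would otherwise corrupt the body.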
If you have a better solution, please share =)

Sending JSON POST to Django App

I have a bit of Java code that attempts to send POST data to a Django application. However, the view is simply never called. If I paste the same URL the Java code hits into my browser, the Django view is called. I have no idea what I am missing, but something must be wrong with the Java write.
This is the Java function doing the write:
public void executeWrite(String requestUrl, JsonObject jsonObject)
{
DataInputStream input = null;
try
{
URL url;
HttpURLConnection urlConn;
DataOutputStream printout;
System.out.println(requestUrl);
// URL of CGI-Bin script.
url = new URL (requestUrl);
// URL connection channel.
urlConn = (HttpURLConnection)url.openConnection();
// Let the run-time system (RTS) know that we want input.
urlConn.setDoInput (true);
// Let the RTS know that we want to do output.
urlConn.setDoOutput (true);
// No caching, we want the real thing.
urlConn.setUseCaches (false);
// Specify the content type.
urlConn.setRequestMethod("POST");
urlConn.setRequestProperty("content-type","application/json; charset=utf-8");
OutputStreamWriter wr = new OutputStreamWriter(urlConn.getOutputStream());
wr.write(jsonObject.toString());
wr.flush();
wr.close();
}
catch(Exception ex)
{
ex.printStackTrace();
}
}
Now the requestURL passed into the function directly corresponds to the one for the Django view. The requestURL is:
http://127.0.0.1:8000/events/rest/33456/create
This is the Django Urlconfig:
(r'^events/rest/(?P<key>\d+)/create', 'events.views.restCreateEvent'),
Finally this is the view that never gets called by the Java code
#csrf_exempt
def restCreateEvent(request, key):
#doesn't really matter what is in here it never runs
So, what am I doing wrong that the POST request is never received by the Django server? I've spent about 2 hours trying to figure it out and I can't find any issues with the Java code. Clearly something is wrong though.
Make sure your view is csrf exempt since you are not sending the appropriate CSRF token from the Java request.
I think the CSRF thing was actually the issue. Once I added that, I changed the Java code slightly and it worked. I am still not sure what the subtle Java error was; here is the working Java code.
public void executeWrite(String requestUrl, JsonObject jsonObject)
{
InputStreamReader input = null;
try
{
URL url;
HttpURLConnection urlConn;
DataOutputStream printout;
System.out.println(requestUrl);
// URL of CGI-Bin script.
url = new URL (requestUrl);
// URL connection channel.
urlConn = (HttpURLConnection)url.openConnection();
// Let the run-time system (RTS) know that we want input.
urlConn.setDoInput (true);
// Let the RTS know that we want to do output.
urlConn.setDoOutput (true);
// No caching, we want the real thing.
urlConn.setUseCaches (false);
// Specify the content type.
urlConn.setRequestMethod("POST");
urlConn.setRequestProperty("content-type","application/json; charset=utf-8");
OutputStreamWriter wr = new OutputStreamWriter(urlConn.getOutputStream());
wr.write(jsonObject.toString());
wr.flush();
wr.close();
input = new InputStreamReader (urlConn.getInputStream ());
String response = UserInterface.read(new BufferedReader(input));
if(response.length() > 0)
{
System.out.println("Response:" + response);
}
input.close();
}
catch(IOException ex)
{
ex.printStackTrace();
}
}
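One plausible explanation for the difference between the two versions: in its default buffered mode, HttpURLConnection typically does not transmit the request until the response is asked for through getResponseCode() or getInputStream(). The first version never reads the response, so the POST may never have left the client. The sketch below makes that step explicit and also reads the error stream, which is where a 403 page from a CSRF rejection would show up. JsonPoster and Result are hypothetical names, and the body is passed as a plain string to keep the example free of a JSON library.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

final class JsonPoster {

    // Status code plus response body (the error body for 4xx/5xx answers).
    static final class Result {
        final int status;
        final String body;

        Result(int status, String body) {
            this.status = status;
            this.body = body;
        }
    }

    static Result postJson(String requestUrl, String json) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(requestUrl).openConnection();
        try {
            conn.setRequestMethod("POST");
            conn.setDoOutput(true);
            conn.setRequestProperty("Content-Type", "application/json; charset=utf-8");
            try (OutputStream out = conn.getOutputStream()) {
                out.write(json.getBytes(StandardCharsets.UTF_8));
            }
            // Asking for the status completes the exchange with the server.
            int status = conn.getResponseCode();
            // getInputStream() throws on 4xx/5xx, so read the error stream in that case.
            InputStream stream = status >= 400 ? conn.getErrorStream() : conn.getInputStream();
            if (stream == null) {
                return new Result(status, "");
            }
            try (InputStream in = stream) {
                return new Result(status, new String(in.readAllBytes(), StandardCharsets.UTF_8));
            }
        } finally {
            conn.disconnect();
        }
    }
}
```

Printing `result.status` and `result.body` after a call such as `JsonPoster.postJson(requestUrl, jsonObject.toString())` should make a CSRF rejection or a redirect visible instead of failing silently.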
As I recall, the URL needs to end with a trailing slash, i.e. "http://127.0.0.1:8000/events/rest/33456/create/", when using the POST method.

Java function to detect valid webpage

I am trying to write a Java program that will load pages pointed to by valid links and report other links as broken. My problem is that the Java URL will download the appropriate page if the url is valid, and the search-engine results for the url if the url is invalid.
Is there a Java function that detects if the url resolves to a legitimate page . . . thanks very much,
Joel
HttpURLConnection#getResponseCode will give you an HTTP status code
You can get the HTTP response code for a URL like so:
public static int getResponseCode(URL url) throws IOException {
URLConnection conn = url.openConnection();
if (!(conn instanceof HttpURLConnection)) {
throw new IllegalArgumentException("not an HTTP url: " + url);
}
HttpURLConnection httpConn = (HttpURLConnection) conn;
return httpConn.getResponseCode();
}
Now the question is, what do you consider a "valid" webpage? For me, if a URL parses correctly, its protocol is "http" (or https), and its response code is in the 200 block or is 302 (Found/Redirect) or 304 (Not Modified), then it's valid:
public boolean isValidHttpResponseCode(int code) {
return ((code / 100) == 2) || (code == 302) || (code == 304);
}

Quickest way to get content type

I need to check the content type (whether it's image, audio or video) of a URL which has been entered by the user. I have code like this:
URL url = new URL(urlname);
URLConnection connection = url.openConnection();
connection.connect();
String contentType = connection.getContentType();
I'm getting the content type, but the problem is that it seems necessary to download the whole file to check its content type, so it takes too long when the file is quite big. I need to use it in a Google App Engine application, so requests are limited to 30 seconds.
Is there any other way to get the content type of a URL without downloading the file (so it could be done more quickly)?
Thanks to DaveHowes' answer and some googling about how to send a HEAD request, I got it working this way:
URL url = new URL(urlname);
HttpURLConnection connection = (HttpURLConnection) url.openConnection();
connection.setRequestMethod("HEAD");
connection.connect();
String contentType = connection.getContentType();
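Once the header value is available, deciding between image, audio and video comes down to the part before the slash. A minimal sketch, assuming anything else (including a missing header, for which getContentType() returns null) should be reported as "other"; MediaKind is a hypothetical helper name.

```java
import java.util.Locale;

final class MediaKind {

    // Maps a Content-Type value such as "image/png; charset=binary" to
    // "image", "audio", "video", or "other".
    static String of(String contentType) {
        if (contentType == null) {
            return "other"; // the server sent no Content-Type header
        }
        String type = contentType.trim().toLowerCase(Locale.ROOT);
        int slash = type.indexOf('/');
        String top = slash < 0 ? type : type.substring(0, slash);
        switch (top) {
            case "image":
            case "audio":
            case "video":
                return top;
            default:
                return "other";
        }
    }
}
```

For example, `MediaKind.of(connection.getContentType())` can be called directly on the result of the HEAD request.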
If the "other" end supports it, could you use the HEAD HTTP method?
Be aware of redirects; I faced the same problem with my remote content check.
Here is my fix:
/**
 * Http HEAD Method to get URL content type
 *
 * @param urlString
 * @return content type
 * @throws IOException
 */
public static String getContentType(String urlString) throws IOException{
    URL url = new URL(urlString);
    HttpURLConnection connection = (HttpURLConnection) url.openConnection();
    connection.setRequestMethod("HEAD");
    if (isRedirect(connection.getResponseCode())) {
        String newUrl = connection.getHeaderField("Location"); // get redirect url from "location" header field
        logger.warn("Original request URL: '{}' redirected to: '{}'", urlString, newUrl);
        return getContentType(newUrl);
    }
    String contentType = connection.getContentType();
    return contentType;
}
/**
 * Check status code for redirects
 *
 * @param statusCode
 * @return true if matched redirect group
 */
protected static boolean isRedirect(int statusCode) {
    if (statusCode != HttpURLConnection.HTTP_OK) {
        if (statusCode == HttpURLConnection.HTTP_MOVED_TEMP
            || statusCode == HttpURLConnection.HTTP_MOVED_PERM
            || statusCode == HttpURLConnection.HTTP_SEE_OTHER) {
            return true;
        }
    }
    return false;
}
You could also add a counter such as maxRedirectCount to avoid an infinite redirect loop, but that is not covered here. This is just for inspiration.
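A minimal sketch of such a redirect limit, with the HTTP call abstracted behind a function so the loop logic stands on its own. The locationOf hook is hypothetical: given a URL, it should return the Location header value when the response is a redirect, or null otherwise. RedirectResolver and its parameter names are assumptions as well.

```java
import java.net.URI;
import java.util.function.Function;

final class RedirectResolver {

    // Follows Location values until locationOf returns null (no further redirect).
    // Throws if more than maxRedirects redirects would be needed.
    static String resolve(String startUrl, int maxRedirects, Function<String, String> locationOf) {
        String current = startUrl;
        for (int hops = 0; hops <= maxRedirects; hops++) {
            String location = locationOf.apply(current);
            if (location == null) {
                return current; // final, non-redirecting URL
            }
            // Location may be relative; resolve it against the URL that produced it.
            current = URI.create(current).resolve(location).toString();
        }
        throw new IllegalStateException("more than " + maxRedirects + " redirects from " + startUrl);
    }
}
```

With HttpURLConnection, locationOf would issue a HEAD request with setInstanceFollowRedirects(false) and return getHeaderField("Location") when isRedirect(...) is true; the content type is then read from the URL that resolve returns.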
I faced a similar task where I needed to check the content type of a URL, and I handled it with Retrofit. First you have to define an endpoint that is called with the URL you want to check:
@GET
suspend fun getContentType(@Url url: String): Response<Unit>
Then you call it like this to get the content type header:
api.getContentType(url).headers()["content-type"]
