I wanted to know if there are any standard APIs in Java to validate a given URL.
I want to check both that the URL string is well formed, i.e. the given protocol is valid, and that a connection can be established.
I tried using HttpURLConnection, providing the URL and connecting to it. The first part of my requirement seems to be fulfilled but when I try to perform HttpURLConnection.connect(), 'java.net.ConnectException: Connection refused' exception is thrown.
Can this be because of proxy settings? I tried setting the system properties for the proxy, but without success.
Let me know what I am doing wrong.
For the benefit of the community, since this thread is top on Google when searching for
"url validator java"
Catching exceptions is expensive, and should be avoided when possible. If you just want to verify your String is a valid URL, you can use the UrlValidator class from the Apache Commons Validator project.
For example:
String[] schemes = {"http","https"}; // DEFAULT schemes = "http", "https", "ftp"
UrlValidator urlValidator = new UrlValidator(schemes);
if (urlValidator.isValid("ftp://foo.bar.com/")) {
System.out.println("URL is valid");
} else {
System.out.println("URL is invalid");
}
The java.net.URL class is in fact not at all a good way of validating URLs. MalformedURLException is not thrown for all malformed URLs during construction. Catching IOException on java.net.URL#openConnection().connect() does not validate the URL either; it only tells whether or not the connection can be established.
Consider this piece of code:
try {
new URL("http://.com");
new URL("http://com.");
new URL("http:// ");
new URL("ftp://::::@example.com");
} catch (MalformedURLException malformedURLException) {
malformedURLException.printStackTrace();
}
...which does not throw any exceptions.
I recommend using a validation API implemented with a context-free grammar or, for very simplified validation, just regular expressions. However, I need someone to suggest a superior or standard API for this; I only recently started searching for one myself.
Note
It has been suggested that URL#toURI(), in combination with handling java.net.URISyntaxException, can facilitate validation of URLs. However, this method only catches one of the very simple cases above.
The conclusion is that there is no standard Java URL parser to validate URLs.
You need to create both a URL object and a URLConnection object. The following code will test both the format of the URL and whether a connection can be established:
try {
URL url = new URL("http://www.yoursite.com/");
URLConnection conn = url.openConnection();
conn.connect();
} catch (MalformedURLException e) {
// the URL is not in a valid form
} catch (IOException e) {
// the connection couldn't be established
}
Using only standard API, pass the string to a URL object then convert it to a URI object. This will accurately determine the validity of the URL according to the RFC2396 standard.
Example:
public boolean isValidURL(String url) {
try {
new URL(url).toURI();
} catch (MalformedURLException | URISyntaxException e) {
return false;
}
return true;
}
Use android.webkit.URLUtil on Android:
URLUtil.isValidUrl(URL_STRING);
Note: It only checks the initial scheme of the URL, not that the entire URL is valid.
There is a way to perform URL validation in strict accordance to standards in Java without resorting to third-party libraries:
boolean isValidURL(String url) {
try {
new URI(url).parseServerAuthority();
return true;
} catch (URISyntaxException e) {
return false;
}
}
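The constructor of URI checks that url is a syntactically valid URI reference, and the call to parseServerAuthority() checks that the authority component, if one is present, is a valid server-based authority (user info, host and port). Note that a URI with no authority at all, such as a URN, still passes, so also check getScheme() and getHost() if you need an absolute URL with a host.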
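Since parseServerAuthority() only inspects an authority when one is present, it may help to check the scheme and host explicitly as well. The following is a minimal sketch under that assumption; the class name HttpUrlCheck and the method isHttpUrl are made up for illustration, and limiting the check to http/https is my choice rather than anything from the answer above.

```java
import java.net.URI;
import java.net.URISyntaxException;

final class HttpUrlCheck {

    // Hypothetical helper: accepts only absolute http/https URIs that name a host.
    static boolean isHttpUrl(String candidate) {
        if (candidate == null) {
            return false;
        }
        try {
            // Syntax check, then a strict parse of the authority (if there is one).
            URI uri = new URI(candidate).parseServerAuthority();
            String scheme = uri.getScheme();
            boolean web = "http".equalsIgnoreCase(scheme) || "https".equalsIgnoreCase(scheme);
            // getHost() is null for URNs, relative references and unparsable hosts.
            return web && uri.getHost() != null;
        } catch (URISyntaxException e) {
            return false;
        }
    }
}
```

With this sketch, a string such as urn:isbn:0451450523 or www.example.com is rejected because it has no http/https scheme, even though the URI constructor accepts both.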
It is worth pointing out that the URL object handles both validation and connection. As a result, only protocols for which a handler has been provided in sun.net.www.protocol (file, ftp, gopher, http, https, jar, mailto, netdoc) are accepted. For instance, try to create a new URL with the ldap protocol:
new URL("ldap://myhost:389")
You will get a java.net.MalformedURLException: unknown protocol: ldap.
You would need to implement your own handler and register it through URL.setURLStreamHandlerFactory(). That is overkill if you just want to validate the URL syntax; a regexp seems to be a simpler solution.
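If only the syntax matters, java.net.URI might be another option besides a regexp, because it parses the string without looking up a protocol handler. A small sketch follows; the class name SyntaxOnlyCheck and the method describe are hypothetical names used only for this illustration.

```java
import java.net.URI;
import java.net.URISyntaxException;

final class SyntaxOnlyCheck {

    // Hypothetical helper: summarises the parsed parts, or reports why parsing failed.
    static String describe(String candidate) {
        try {
            URI uri = new URI(candidate); // no handler lookup happens here
            return uri.getScheme() + " " + uri.getHost() + " " + uri.getPort();
        } catch (URISyntaxException e) {
            return "invalid: " + e.getReason();
        }
    }
}
```

Calling describe("ldap://myhost:389") yields the scheme, host and port of the ldap address, whereas a string with a space in the host part is reported as invalid.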
Are you sure you're using the correct proxy as system properties?
Also, if you are using Java 1.5 or later, you can pass a java.net.Proxy instance to the openConnection() method. This is more elegant, in my opinion:
//Proxy instance, proxy ip = 10.0.0.1 with port 8080
Proxy proxy = new Proxy(Proxy.Type.HTTP, new InetSocketAddress("10.0.0.1", 8080));
conn = new URL(urlString).openConnection(proxy);
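For completeness, here is how the snippet above could be wrapped into a helper. This is only a sketch: the class name ProxiedConnection, the method open and the 5-second timeouts are assumptions of mine. On the system-property route, keep in mind that http.proxyHost/http.proxyPort apply to http URLs only, while https URLs read https.proxyHost/https.proxyPort, which may be one reason the properties appear to have no effect.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.URI;

final class ProxiedConnection {

    // Hypothetical helper: prepares (but does not open) an HTTP connection through a proxy.
    static HttpURLConnection open(String url, String proxyHost, int proxyPort) {
        try {
            Proxy proxy = new Proxy(Proxy.Type.HTTP, new InetSocketAddress(proxyHost, proxyPort));
            HttpURLConnection conn =
                    (HttpURLConnection) URI.create(url).toURL().openConnection(proxy);
            conn.setConnectTimeout(5_000); // assumed values; tune for your network
            conn.setReadTimeout(5_000);
            return conn; // nothing is sent until connect() or getResponseCode() is called
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A call such as ProxiedConnection.open(urlString, "10.0.0.1", 8080).getResponseCode() would then perform the actual request through the proxy.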
I think the best response is from the user @b1nary.atr0phy. However, I recommend combining the method from b1nary.atr0phy's response with a regex to cover all the possible cases.
public static final URL validateURL(String url, Logger logger) {
URL u = null;
try {
Pattern regex = Pattern.compile("(?i)^(?:(?:https?|ftp)://)(?:\\S+(?::\\S*)?@)?(?:(?!(?:10|127)(?:\\.\\d{1,3}){3})(?!(?:169\\.254|192\\.168)(?:\\.\\d{1,3}){2})(?!172\\.(?:1[6-9]|2\\d|3[0-1])(?:\\.\\d{1,3}){2})(?:[1-9]\\d?|1\\d\\d|2[01]\\d|22[0-3])(?:\\.(?:1?\\d{1,2}|2[0-4]\\d|25[0-5])){2}(?:\\.(?:[1-9]\\d?|1\\d\\d|2[0-4]\\d|25[0-4]))|(?:(?:[a-z\\u00a1-\\uffff0-9]-*)*[a-z\\u00a1-\\uffff0-9]+)(?:\\.(?:[a-z\\u00a1-\\uffff0-9]-*)*[a-z\\u00a1-\\uffff0-9]+)*(?:\\.(?:[a-z\\u00a1-\\uffff]{2,}))\\.?)(?::\\d{2,5})?(?:[/?#]\\S*)?$");
Matcher matcher = regex.matcher(url);
if(!matcher.find()) {
throw new URISyntaxException(url, "The URL is not well formed.");
}
u = new URL(url);
u.toURI();
} catch (MalformedURLException e) {
logger.error("The URL is not well formed.");
} catch (URISyntaxException e) {
logger.error("The URL is not well formed.");
}
return u;
}
This is what I use to validate CDN URLs (they must start with https, but that's easy to customise). It also does not allow IP addresses.
public static final boolean validateURL(String url) {
// The https:// prefix is matched literally; only the host characters belong in the bracketed class.
var regex = Pattern.compile("^https://(www\\.)?[-a-zA-Z0-9@:%._+~#=]{2,256}\\.[a-z]{2,6}\\b([-a-zA-Z0-9@:%_+.~#?&/=]*)");
var matcher = regex.matcher(url);
return matcher.find();
}
Thanks. Opening the URL connection by passing the Proxy as suggested by NickDK works fine.
//Proxy instance, proxy ip = 10.0.0.1 with port 8080
Proxy proxy = new Proxy(Proxy.Type.HTTP, new InetSocketAddress("10.0.0.1", 8080));
conn = new URL(urlString).openConnection(proxy);
The system properties, however, don't work, as I mentioned earlier.
Thanks again.
Regards,
Keya
I need to get host from this url
android-app://com.google.android.googlequicksearchbox?Pub_id={siteID}
java.net.URL and java.net.URI can't handle it.
The problem is in { and } characters which are not valid for URI. Looks like a placeholder that wasn't resolved correctly when creating a URI.
You can use String.replaceAll() to get rid of these two characters:
String value = "android-app://com.google.android.googlequicksearchbox?Pub_id={siteID}";
URI uri = URI.create(value.replaceAll("[{}]", ""));
System.out.println(uri.getHost()); // com.google.android.googlequicksearchbox
You see, eventually I need path, scheme and query.
I've just found a super fast library for parsing such URLs: https://github.com/anthonynsimon/jurl
It's also very flexible.
You can try the following code
String url = "android-app://com.google.android.googlequicksearchbox?Pub_id={siteID}";
url = url.replace("{", "").replace("}","");
URI u;
try {
u = new URI(url);
System.out.println(u.getHost());
} catch (URISyntaxException e) {
e.printStackTrace();
}
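If the placeholder has to survive parsing, for example because the query is needed later, the braces could be percent-encoded instead of removed. The sketch below assumes that approach; the class name PlaceholderUri and the method parseKeepingBraces are hypothetical.

```java
import java.net.URI;

final class PlaceholderUri {

    // Hypothetical helper: percent-encodes '{' and '}' so that java.net.URI accepts the string.
    static URI parseKeepingBraces(String raw) {
        return URI.create(raw.replace("{", "%7B").replace("}", "%7D"));
    }
}
```

After parsing, getScheme() and getHost() behave as in the answers above, and getQuery() returns the decoded query, so the {siteID} placeholder is still visible there; getRawQuery() returns the encoded form.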
I have to take an input URI from the customer.
The base URL is https://mydomain.com/router/
So the final URL will be https://mydomain.com/router/<URI-from-customer>
I have to validate the final URL using:
java.net.URL url = new java.net.URL(finalURL);
But I was not able to test the failure case. I tried all kinds of special characters but could not get a MalformedURLException.
The following is the URI I tried:
维也纳恩斯特哈佩尔球场 &&& '";'><()!~`5%=-_}{}[]\||| ? ?&& / ?
Still there is no error.
So I want to know which characters are invalid enough to make URL construction fail.
One way to trigger that Exception is with something like
String finalURL = "Like this";
try {
java.net.URL url = new java.net.URL(finalURL);
} catch (MalformedURLException e) {
e.printStackTrace();
}
Output is (as requested)
java.net.MalformedURLException: no protocol: Like this
at java.net.URL.<init>(URL.java:586)
at java.net.URL.<init>(URL.java:483)
at java.net.URL.<init>(URL.java:432)
at com.stackoverflow.Main.main(Main.java:15)
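The java.net.URL constructor appears to check little more than the protocol, which would explain why the special characters in the question get through. If the goal is to reject illegal characters in the customer-supplied part, parsing the final string with java.net.URI may be closer to what is wanted. The sketch below assumes that; the class name RouterUrlCheck and the method isAcceptable are made up, and the base URL is the one from the question.

```java
import java.net.URI;
import java.net.URISyntaxException;

final class RouterUrlCheck {

    static final String BASE = "https://mydomain.com/router/";

    // Hypothetical helper: true if base + customer part parses as a URI.
    static boolean isAcceptable(String customerPart) {
        try {
            new URI(BASE + customerPart); // throws on spaces, '<', '>' and similar characters
            return true;
        } catch (URISyntaxException e) {
            return false;
        }
    }
}
```

For example, isAcceptable("orders/42") passes, while isAcceptable("a b") and isAcceptable("<x>") fail because a space and angle brackets are not legal in a URI path.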
I have a text field to acquire location information (String type) from User. It could be file directory based (e.g. C:\directory) or Web url (e.g. http://localhost:8008/resouces). The system will read some predetermined metadata file from the location.
Given the input string, how can I effectively detect whether the location is file-based or a web URL?
So far I have tried.
URL url = new URL(location); // throws MalformedURLException if it is a file path
url.getProtocol().equalsIgnoreCase("http");
File file = new File(location); // does not throw if it is a URL
file.exists(); // returns false if it is a URL
I am still struggling to find a best way to tackle both scenarios. :-(
Basically I would not prefer to explicitly check the path using the prefix such as http:// or https://
Is there an elegant and proper way of doing this?
You can check if the location starts with http:// or https://:
String s = location.trim().toLowerCase();
boolean isWeb = s.startsWith("http://") || s.startsWith("https://");
Or you can use the URI class instead of URL; unlike URL, URI does not throw MalformedURLException:
URI u = new URI(location);
boolean isWeb = "http".equalsIgnoreCase(u.getScheme())
|| "https".equalsIgnoreCase(u.getScheme());
However, new URI() may throw URISyntaxException, for example if location contains a backslash. The best way would be either to use the prefix check (my first suggestion) or to create a URL and catch MalformedURLException; if it is thrown, you know the string cannot be a valid web URL.
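One way to put the URI variant to work without tripping over backslashes is to treat a parse failure as "not a web URL". This is only a sketch under that assumption; LocationKind, Kind and classify are hypothetical names, and anything that is not http/https is simply assumed to be a file location.

```java
import java.net.URI;
import java.net.URISyntaxException;

final class LocationKind {

    enum Kind { WEB, FILE }

    // Hypothetical helper: WEB for http/https URIs, FILE for everything else.
    static Kind classify(String location) {
        try {
            String scheme = new URI(location.trim()).getScheme();
            if ("http".equalsIgnoreCase(scheme) || "https".equalsIgnoreCase(scheme)) {
                return Kind.WEB;
            }
        } catch (URISyntaxException e) {
            // Typically a Windows path with backslashes; fall through to FILE.
        }
        return Kind.FILE;
    }
}
```

The caller would still want to confirm a FILE result with File.exists(), as the question already does.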
If you're open to the use of a try/catch scenario being "elegant", here is a way that is more specific:
try {
processURL(new URL(location));
}
catch (MalformedURLException ex){
File file = new File(location);
if (file.exists()) {
processFile(file);
}
else {
throw new PersonalException("Can't find the file");
}
}
This way, you're getting the automatic URL syntax checking and, that failing, the check for file existence.
you can try:
static public boolean isValidURL(String urlStr) {
try {
URI uri = new URI(urlStr);
return uri.getScheme().equals("http") || uri.getScheme().equals("https");
}
catch (Exception e) {
return false;
}
}
Note that this will return false for any other reason that invalidates the URL, or for a non-http/https URL. A malformed URL is not necessarily an actual file name, and a well-formed file name can refer to a non-existent file, so use it in conjunction with your file existence check.
public boolean urlIsFile(String input) {
if (input.startsWith("file:")) return true;
try { return new File(input).exists(); } catch (Exception e) {return false;}
}
This is the best method because it is hassle-free and will always return true if you have a file reference. Other solutions do not, and cannot, cover the plethora of protocol schemes available, such as ftp, sftp, scp, or any future protocol implementations. So this one serves all uses and purposes, with the caveat that the file must exist if the input doesn't begin with the file protocol.
If you consider the logic of the function in light of its name, you should understand that returning false for a non-existent direct path lookup is not a bug; it is simply the fact.
How can I check whether a protocol is present in a URL and, if it is not, prepend it?
Is there any class to achieve this in Java?
E.g.: String URL = "www.google.com"
I need to get http://www.google.com
Just use String.startsWith("http://") to check this.
public String ensure_has_protocol(final String a_url)
{
if (!a_url.startsWith("http://"))
{
return "http://" + a_url;
}
return a_url;
}
EDIT:
An alternative would use a java.net.URL instance, whose constructor would throw a java.net.MalformedURLException if the URL did not contain a (legal) protocol (or was invalid for any other reason):
public URL make_url(final String a_url) throws MalformedURLException
{
try
{
return new URL(a_url);
}
catch (final MalformedURLException e)
{
}
return new URL("http://" + a_url);
}
You can use URL.toString() to obtain the string representation of the URL. This is an improvement on the startsWith() approach, as it guarantees that the returned URL is valid.
Let's say you have String url = "www.google.com". String class methods would be enough for the goal of checking protocol identifiers. For example, url.startsWith("https://") would check whether a specific string starts with the given protocol name.
However, are these controls enough for validation?
I think they aren't enough. First of all, you should define a list of valid protocol identifiers, e.g. a String array like {"http", "ftp", "https", ...}. Then you can split your input String on "://" with a regex and test whether the part before it belongs to the list of valid protocol identifiers. Domain name validation is beyond this question; you can and should handle it with different techniques as well.
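The allow-list idea could look roughly like the following. It is a sketch, not a complete validator: the class name SchemeFixer, its methods, the contents of ALLOWED and the choice of http:// as the default prefix are all assumptions made for illustration.

```java
import java.util.Locale;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class SchemeFixer {

    // Assumed allow-list; extend as needed.
    private static final Set<String> ALLOWED = Set.of("http", "https", "ftp");

    // A scheme is a letter followed by letters, digits, '+', '.' or '-', then "://".
    private static final Pattern SCHEME = Pattern.compile("^([A-Za-z][A-Za-z0-9+.-]*)://");

    // Returns the lower-cased scheme, or null if the string has none.
    static String schemeOf(String url) {
        Matcher m = SCHEME.matcher(url.trim());
        return m.find() ? m.group(1).toLowerCase(Locale.ROOT) : null;
    }

    static boolean hasAllowedScheme(String url) {
        String scheme = schemeOf(url);
        return scheme != null && ALLOWED.contains(scheme);
    }

    // Prepends "http://" only when no scheme is present at all.
    static String ensureScheme(String url) {
        String trimmed = url.trim();
        return schemeOf(trimmed) != null ? trimmed : "http://" + trimmed;
    }
}
```

Anchoring the pattern at the start of the string means that a "://" appearing later, for instance inside a query parameter, is not mistaken for the protocol separator.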
Just for completeness, I would do something like the following:
import com.google.common.base.Strings;
private static boolean isUrlHttps(String url){
if(Strings.isNullOrEmpty(url))
return false;
return url.toLowerCase().startsWith("https://");
}
I have a cluster of two IBM WebSphere Application Server nodes on two different physical machines. Can anybody help me with Java code to check whether my server instances are running, or to detect when one of the servers is not up and running?
To do it quickly and portably, you can check for a page if it's served by the server.
For example you can:
boolean isAlive = true;
try {
URL hp = new URL("http://yourserver/TestPage.html");
URLConnection hpCon = hp.openConnection();
hpCon.connect(); // openConnection() alone does not contact the server
// add more checks...
} catch (Exception e) {
isAlive = false;
}
This fairly unsophisticated method will work with every HTTP server.
Hope below is what you want...
URL url = new URL( "http://google.com/" );
HttpURLConnection httpConn = (HttpURLConnection)url.openConnection();
httpConn.setInstanceFollowRedirects( false );
httpConn.setRequestMethod( "HEAD" );
httpConn.connect();
System.out.println( "google.com : " + httpConn.getResponseCode());
or for failure:
URL url = new URL( "http://google.com:666/" );
HttpURLConnection httpConn = (HttpURLConnection)url.openConnection();
httpConn.setInstanceFollowRedirects( false );
httpConn.setRequestMethod( "HEAD" );
try{
httpConn.connect();
System.out.println( "google.com : " + httpConn.getResponseCode());
}catch(java.net.ConnectException e){
System.out.println( "google.com:666 is down ");
}
Good Luck!!!
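The two snippets above can be folded into one helper that also sets timeouts, so that a hung server does not block the caller. This is a sketch only: the class name ServerProbe, the method isAlive, the 3-second timeouts and the rule "2xx or 3xx means alive" are my assumptions, not part of the answers above.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

final class ServerProbe {

    // Hypothetical helper: true if a HEAD request gets a 2xx or 3xx answer in time.
    static boolean isAlive(String url) {
        HttpURLConnection conn = null;
        try {
            conn = (HttpURLConnection) URI.create(url).toURL().openConnection();
            conn.setRequestMethod("HEAD");
            conn.setConnectTimeout(3_000); // assumed values
            conn.setReadTimeout(3_000);
            int code = conn.getResponseCode(); // this is where the request is actually sent
            return code >= 200 && code < 400;
        } catch (IOException | IllegalArgumentException | ClassCastException e) {
            // Unreachable server, malformed address, or a non-HTTP protocol.
            return false;
        } finally {
            if (conn != null) {
                conn.disconnect();
            }
        }
    }
}
```

It could be called once per cluster member, for example isAlive("http://yourserver/TestPage.html"), with the host names of the two nodes substituted.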
I think what you may be looking for is the use of the WebSphere Thin Administrative Client, which exposes Java APIs and provides access to the WAS MBeans that allow you to query the status of your servers/applications (along with many other management and monitoring tasks).
First, you'll want to obtain a connection to WAS (the AdminClient) as follows:
Properties clientProps = new Properties();
clientProps.setProperty(AdminClient.CONNECTOR_TYPE, AdminClient.CONNECTOR_TYPE_SOAP);
clientProps.setProperty(AdminClient.CONNECTOR_HOST, dmgrHostname);
clientProps.setProperty(AdminClient.CONNECTOR_PORT, dmgrSoapConnectorPort);
if (dmgrIsSecure) {
clientProps.setProperty(AdminClient.CONNECTOR_SECURITY_ENABLED, "true");
clientProps.setProperty(AdminClient.USERNAME, wasUsername);
clientProps.setProperty(AdminClient.PASSWORD, wasUserPassword);
}
AdminClient adminClient = AdminClientFactory.createAdminClient(clientProps);
Next, you'll want to query for the relevant MBeans, and then perform the relevant operations. In your case, you may be interested in the ClusterMgr and/or J2EEApplication MBeans. Here is an example that queries for the Cluster's state:
AdminClient adminClient = getAdminClient(target);
ObjectName clusterMgr =
(ObjectName)adminClient.queryNames(
ObjectName.getInstance("WebSphere:*,type=ClusterMgr"), null).iterator().next();
String state = (String) adminClient.invoke(clusterMgr, "getClusterState",
new Object[] {clusterName}, new String[] {String.class.getName()});
You can invoke further operations as desired, such as querying the individual cluster member's status.
Also, in addition to querying, you can also register notifications so that your program can be notified when certain events happen, such as a change in state of clusters, servers, or applications.
We get the specialized connection for the URL we want using openConnection(). It returns a subclass of the abstract class URLConnection that depends on the URL's protocol, for example an HttpURLConnection. The connect() method then opens the communication link.
private static final String SERVER = "http://testserver:9086";
try {
URLConnection hpCon = new URL(SERVER).openConnection();
hpCon.connect();
} catch (Exception e) {
// Anything you want to do when there is a failure
}