Extract file name from Java URL (file: and http/https protocol)? - java

I have various urls like this:
String a = "file:./bla/file.txt"; // Valid, see [RFC 3986][1], path-rootless definition
String b = "file:.file.txt"; // Valid, see [RFC 3986][1], path-rootless definition
String c = "file:./file.txt"; // Valid, see [RFC 3986][1], path-rootless definition
String d = "file:///file.txt";
String e = "file:///folder/file.txt";
String f = "http://example.com/file.txt";
String g = "https://example.com/file.txt";
These are all valid URLs, and I can convert them to a URL in Java without errors:
URL url = new URL(...);
I want to extract the filename from each of the examples above, so I'm left with just:
file.txt
I have tried the following, but this doesn't work for example b above (which is a valid URL):
b.substring(b.lastIndexOf('/') + 1); // Returns file:.file.txt
I can probably write some custom code to check for slashes; I'm just wondering if there is a better, more robust way to do it?

The URI class properly parses the parts of a URI. For most URLs, you want the path of the URI. A URI whose scheme-specific part does not begin with a slash (such as file:.file.txt or file:./bla/file.txt) is opaque: its parts are not parsed and getPath() returns null, so you'll have to rely on the entire scheme-specific part:
URI uri = new URI(b);
String path = uri.getPath();
if (path == null) {
path = uri.getSchemeSpecificPart();
}
String filename = path.substring(path.lastIndexOf('/') + 1);
The above should work for all of your URLs.
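As a rough sketch, the approach above can be wrapped in a small helper. The class and method names (UriFileNames, fileName) are invented for illustration, and the helper assumes the input parses as a URI. Note that for file:.file.txt the result keeps the leading dot, because the whole scheme-specific part is .file.txt.

```java
import java.net.URI;

final class UriFileNames {
    /** Returns the text after the last '/' of the URI's path (hypothetical helper). */
    static String fileName(String location) {
        // URI.create throws IllegalArgumentException if the string is not a valid URI.
        URI uri = URI.create(location);
        // getPath() is null for opaque URIs such as "file:./bla/file.txt".
        String path = uri.getPath();
        if (path == null) {
            path = uri.getSchemeSpecificPart();
        }
        return path.substring(path.lastIndexOf('/') + 1);
    }

    public static void main(String[] args) {
        String[] samples = {
            "file:./bla/file.txt",
            "file:.file.txt",
            "file:///folder/file.txt",
            "https://example.com/file.txt"
        };
        for (String sample : samples) {
            System.out.println(sample + " -> " + fileName(sample));
        }
    }
}
```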

Related

Remove Everything Before Third Forward Slash

I have the following strings:
http://somedomain.com/dir/sub/folder/file.txt
OR
https://10.0.0.1/dir/sub/folder/another_folder/file.txt
I want to remove everything before the third forward slash (remove the domain) and still keep the third forward slash.
Expected results:
/dir/sub/folder/file.txt
OR
/dir/sub/folder/another_folder/file.txt
Uri uri = Uri.parse("https://graph.facebook.com/me/home?limit=25&since=1374196005");
String protocol = uri.getScheme();
String server = uri.getAuthority();
String path = uri.getPath();
Set<String> args = uri.getQueryParameterNames();
String limit = uri.getQueryParameter("limit");
I think you need the path value.
You can use the URL class; it is what you are looking for.
The URL class provides several methods that let you query URL objects. You can get the protocol, authority, host name, port number, path, query, filename, and reference from a URL using these accessor methods.
Use this:
URL aURL = new URL("https://10.0.0.1/dir/sub/folder/another_folder/file.txt");
String path = aURL.getPath();
Output result
path = /dir/sub/folder/another_folder/file.txt
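The same idea can be sketched with java.net.URI, which avoids the checked exception thrown by the URL constructor. The class and method names here (PathOnly, pathOf) are made up for the example, and the input is assumed to be a syntactically valid URL.

```java
import java.net.URI;

final class PathOnly {
    /** Returns everything from the third slash up to (not including) any query or fragment. */
    static String pathOf(String url) {
        // getPath() returns the decoded path component; the scheme and authority are dropped.
        return URI.create(url).getPath();
    }

    public static void main(String[] args) {
        System.out.println(pathOf("http://somedomain.com/dir/sub/folder/file.txt"));
        System.out.println(pathOf("https://10.0.0.1/dir/sub/folder/another_folder/file.txt"));
    }
}
```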

java.net.MalformedURLException: no protocol on URL based on a string modified with URLEncoder

So I was attempting to use this String in a URL :-
http://site-test.com/Meetings/IC/DownloadDocument?meetingId=c21c905c-8359-4bd6-b864-844709e05754&itemId=a4b724d1-282e-4b36-9d16-d619a807ba67&file=\\s604132shvw140\Test-Documents\c21c905c-8359-4bd6-b864-844709e05754_attachments\7e89c3cb-ce53-4a04-a9ee-1a584e157987\myDoc.pdf
In this code: -
String fileToDownloadLocation = //The above string
URL fileToDownload = new URL(fileToDownloadLocation);
HttpGet httpget = new HttpGet(fileToDownload.toURI());
But at this point I get the error: -
java.net.URISyntaxException: Illegal character in query at index 169:Blahblahblah
I realised with a bit of googling this was due to the characters in the URL (guessing the &), so I then added in some code so it now looks like so: -
String fileToDownloadLocation = //The above string
fileToDownloadLocation = URLEncoder.encode(fileToDownloadLocation, "UTF-8");
URL fileToDownload = new URL(fileToDownloadLocation);
HttpGet httpget = new HttpGet(fileToDownload.toURI());
However, when I try and run this I get an error when I try and create the URL, the error then reads: -
java.net.MalformedURLException: no protocol: http%3A%2F%2Fsite-test.testsite.com%2FMeetings%2FIC%2FDownloadDocument%3FmeetingId%3Dc21c905c-8359-4bd6-b864-844709e05754%26itemId%3Da4b724d1-282e-4b36-9d16-d619a807ba67%26file%3D%5C%5Cs604132shvw140%5CTest-Documents%5Cc21c905c-8359-4bd6-b864-844709e05754_attachments%5C7e89c3cb-ce53-4a04-a9ee-1a584e157987%myDoc.pdf
It looks like I can't do the encoding until after I've created the URL, or else it replaces slashes and other characters that it shouldn't, but I can't see how I can create the URL with the string and then format it so it's suitable for use. I'm not particularly familiar with all this and was hoping someone could point out what I'm missing to get string A into a suitably formatted URL with the correct characters replaced.
Any suggestions greatly appreciated!
You need to encode your parameters' values before concatenating them to the URL.
The backslash \ is a special character which has to be escaped as %5C.
Escaping example:
String paramValue = "param\\with\\backslash";
String yourURLStr = "http://host.com?param=" + java.net.URLEncoder.encode(paramValue, "UTF-8");
java.net.URL url = new java.net.URL(yourURLStr);
The result is http://host.com?param=param%5Cwith%5Cbackslash which is properly formatted url string.
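Applied to a URL with several parameters, the same rule might look like the sketch below: each value is encoded on its own and only then joined with ? and &. The class, method and host names are placeholders, and the Charset overload of URLEncoder.encode assumes Java 10 or newer.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

final class QueryEncoding {
    /** Encodes a single parameter value (Charset overload: Java 10+). */
    static String enc(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    /** Builds the download URI; only the values are encoded, never the separators. */
    static URI downloadUri(String base, String meetingId, String itemId, String file) {
        return URI.create(base
                + "?meetingId=" + enc(meetingId)
                + "&itemId=" + enc(itemId)
                + "&file=" + enc(file));
    }

    public static void main(String[] args) {
        // The host and the short ids are illustrative values only.
        System.out.println(downloadUri("http://host.com/DownloadDocument",
                "m1", "i1", "\\\\server\\Test-Documents\\myDoc.pdf"));
    }
}
```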
I had the same problem; I read the URL from a properties file:
String configFile = System.getenv("system.Environment");
if (configFile == null || "".equalsIgnoreCase(configFile.trim())) {
configFile = "dev.properties";
}
// Load properties
Properties properties = new Properties();
properties.load(getClass().getResourceAsStream("/" + configFile));
//read url from file
apiUrl = properties.getProperty("url").trim();
URL url = new URL(apiUrl);
//throw exception here
URLConnection conn = url.openConnection();
dev.properties
url = "https://myDevServer.com/dev/api/gate"
it should be
dev.properties
url = https://myDevServer.com/dev/api/gate
without "" and my problem is solved.
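The effect is easy to reproduce: java.util.Properties does not treat quotation marks specially, so they become part of the value. The sketch below loads the two variants from strings instead of a file; the class name and the stripQuotes helper are hypothetical, and stripping quotes in code is only a defensive fallback to fixing the file itself.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

final class QuotedProperty {
    /** Parses properties text and returns the raw value stored under "url". */
    static String readUrl(String propertiesText) {
        Properties properties = new Properties();
        try {
            properties.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return properties.getProperty("url");
    }

    /** Removes one pair of surrounding double quotes, if present (hypothetical helper). */
    static String stripQuotes(String value) {
        if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
            return value.substring(1, value.length() - 1);
        }
        return value;
    }

    public static void main(String[] args) {
        String quoted = readUrl("url = \"https://myDevServer.com/dev/api/gate\"");
        System.out.println(quoted);              // still wrapped in quotes
        System.out.println(stripQuotes(quoted)); // usable as a URL string
    }
}
```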
According to the Oracle documentation for MalformedURLException:
Thrown to indicate that a malformed URL has occurred. Either no legal protocol could be found in a specification string or the string could not be parsed.
So the exception means that the string itself could not be parsed as a URL.
You want to use URI templates. Look carefully at the README of this project: URLEncoder.encode() does NOT work for URIs.
Let us take your original URL:
http://site-test.test.com/Meetings/IC/DownloadDocument?meetingId=c21c905c-8359-4bd6-b864-844709e05754&itemId=a4b724d1-282e-4b36-9d16-d619a807ba67&file=\\s604132shvw140\Test-Documents\c21c905c-8359-4bd6-b864-844709e05754_attachments\7e89c3cb-ce53-4a04-a9ee-1a584e157987\myDoc.pdf
and convert it to a URI template with three variables (on multiple lines for clarity):
http://site-test.test.com/Meetings/IC/DownloadDocument
?meetingId={meetingID}&itemId={itemID}&file={file}
Now let us build a variable map with these three variables using the library mentioned in the link:
final VariableMap vars = VariableMap.newBuilder()
    .addScalarValue("meetingID", "c21c905c-8359-4bd6-b864-844709e05754")
    .addScalarValue("itemID", "a4b724d1-282e-4b36-9d16-d619a807ba67")
    .addScalarValue("file", "\\\\s604132shvw140\\Test-Documents"
        + "\\c21c905c-8359-4bd6-b864-844709e05754_attachments"
        + "\\7e89c3cb-ce53-4a04-a9ee-1a584e157987\\myDoc.pdf")
    .build();
final URITemplate template
    = new URITemplate("http://site-test.test.com/Meetings/IC/DownloadDocument"
        + "?meetingId={meetingID}&itemId={itemID}&file={file}");
// Generate URL as a String
final String theURL = template.expand(vars);
This is GUARANTEED to return a fully functional URL!
Thanks to Erhun's answer I finally realised that my JSON mapper was returning the quotation marks around my data too! I needed to use "asText()" instead of "toString()"
It's not an uncommon issue - one's brain doesn't see anything wrong with the correct data, surrounded by quotes!
discoveryJson.path("some_endpoint").toString();
"https://what.the.com/heck"
discoveryJson.path("some_endpoint").asText();
https://what.the.com/heck
This code worked for me
public static void main(String[] args) {
try {
java.net.URL url = new java.net.URL("http://path");
System.out.println("Instantiated new URL: " + url);
}
catch (java.net.MalformedURLException e) {
e.printStackTrace();
}
}
Instantiated new URL: http://path
Very simple fix
String encodedURL = UriUtils.encodePath(request.getUrl(), "UTF-8");
Works; no extra functionality needed.

How to make non-relative JavaFx 'Media'

Whenever I try to convert a File to a JavaFx Media, it tries to make the path relative, which I do not want. I'm using a Mac.
This is my code:
static String AUDIO_URL_TO_TEST = "file://Users/Mike/Desktop/calb.mp3";
basicTime.getAudioOutput().setSource(new File(AUDIO_URL_TO_TEST));
I've tried almost everything for AUDIO_URL_TO_TEST, such as:
static String AUDIO_URL_TO_TEST = "file:///Users/Mike/Desktop/calb.mp3";
static String AUDIO_URL_TO_TEST = "file:/c:/Users/Mike/Desktop/calb.mp3";
static String AUDIO_URL_TO_TEST = "/Users/Mike/Desktop/calb.mp3";
static String AUDIO_URL_TO_TEST = "~/Users/Mike/Desktop/calb.mp3";
This is the code that setSource() calls:
Media m = new Media(source.getAbsoluteFile().toURI().toURL().toString());
player = new MediaPlayer(m);
Media ends up as something like this: /path/to/eclipse/directory/file://Users/Mike/Desktop/Calb.mp3, trying to make it relative.
I've tried things other than source.getAbsoluteFile().toURI().toURL().toString(), with just as little luck.
A side question: Why does the Media class only accept strings? That seems like a horrible design. Strings were meant to contain text, not reference files.
The API doc of Media says:
The Media class represents a media resource. It is instantiated from
the string form of a source URI. ...
So the constructor converts the String to a URI. But since none of the example paths in your question is a valid URI, Media treated them as relative paths. For more info, please refer to the File, URI and file protocol documentation. A valid URI can be built like this:
File f = new File("C:/Users/Mike/Desktop/Calb.mp3");
Media m = new Media(f.toURI().toString());
Alternatively,
URI uri = new URI("file:///C:/Users/Mike/Desktop/Calb.mp3");
// or, in short:
// URI uri = new URI("file:/C:/Users/Mike/Desktop/Calb.mp3");
Media m = new Media(uri.toString());
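For the Mac path in the question, a minimal sketch of the File-based route is shown below. It only builds the string that would be handed to the Media constructor, so it runs without JavaFX on the classpath; the class and method names are invented for the example. The key point is that File receives a plain filesystem path, not a string that already starts with file:.

```java
import java.io.File;

final class MediaSource {
    /** Builds the string form of a file: URI from a plain filesystem path (hypothetical helper). */
    static String toSourceUri(String filesystemPath) {
        // getAbsoluteFile() resolves relative paths against the working directory;
        // toURI() then produces an absolute file: URI with special characters escaped.
        return new File(filesystemPath).getAbsoluteFile().toURI().toString();
    }

    public static void main(String[] args) {
        // Pass the result to new Media(...) in a JavaFX application.
        System.out.println(toSourceUri("/Users/Mike/Desktop/calb.mp3"));
    }
}
```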

Java: String manipulation. Fetch last subpath in a URL

Let's say I have a URL http://example.com/files/public_files/test.zip and I want to extract the last subpath, i.e. test.zip. How would I do this?
I am from Python so I am still new to Java and learning. In Python you could do something like this:
>>> x = "http://example.com/files/public_files/test.zip"
>>> x.split("/")[-1]
'test.zip'
There are many ways. I prefer:
String url = "http://example.com/files/public_files/test.zip";
String fileName = url.substring(url.lastIndexOf("/") + 1);
Using a String class method is one way to go. But given that you have a URL, you can use java.net.URL.getFile():
String url = "http://example.com/files/public_files/test.zip";
String filePart = new URL(url).getFile();
The above code will get you the complete path. To get the file name, you can make use of Apache Commons IO's FilenameUtils.getName():
String url = "http://example.com/files/public_files/test.zip";
String fileName = FilenameUtils.getName(url);
Well, if you don't want to refer to a third-party library for this task, the String class is still an option. I've just given another way.
you can use the following:
String url = "http://example.com/files/public_files/test.zip";
String arr[] = url.split("/");
String name = arr[arr.length - 1];
The most similar to the Python syntax is:
String url = "http://example.com/files/public_files/test.zip";
String [] tokens = url.split("/");
String file = tokens[tokens.length-1];
Java lacks the convenient [-n] nth to last selector that Python has. If you wanted to do it all in one line, you'd have to do something gross like this:
String file = url.split("/")[url.split("/").length-1];
I don't recommend the latter.
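If the Python-style negative index is used often, a tiny helper keeps the split to a single call. This is only a sketch: the class and method names are made up, and the limit of -1 is a deliberate choice so that trailing empty strings are kept, as Python's split does (Java's one-argument split drops them).

```java
import java.util.regex.Pattern;

final class FromEnd {
    /** Returns the n-th token counted from the end, so n = 1 is the last token. */
    static String fromEnd(String text, String delimiter, int n) {
        // Pattern.quote treats the delimiter literally; -1 keeps trailing empty tokens.
        String[] tokens = text.split(Pattern.quote(delimiter), -1);
        return tokens[tokens.length - n];
    }

    public static void main(String[] args) {
        String url = "http://example.com/files/public_files/test.zip";
        System.out.println(fromEnd(url, "/", 1)); // last token
        System.out.println(fromEnd(url, "/", 2)); // second to last token
    }
}
```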

Java - Convert String to valid URI object

I am trying to get a java.net.URI object from a String. The string has some characters which will need to be replaced by their percentage escape sequences. But when I use URLEncoder to encode the String with UTF-8 encoding, even the / are replaced with their escape sequences.
How can I get a valid encoded URL from a String object?
http://www.google.com?q=a b gives http%3A%2F%2Fwww.google.com... whereas I want the output to be http://www.google.com?q=a%20b
Can someone please tell me how to achieve this.
I am trying to do this in an Android app. So I have access to a limited number of libraries.
You might try: org.apache.commons.httpclient.util.URIUtil.encodeQuery in Apache commons-httpclient project
Like this (see URIUtil):
URIUtil.encodeQuery("http://www.google.com?q=a b")
will become:
http://www.google.com?q=a%20b
You can of course do it yourself, but URI parsing can get pretty messy...
Android has always had the Uri class as part of the SDK:
http://developer.android.com/reference/android/net/Uri.html
You can simply do something like:
String requestURL = String.format("http://www.example.com/?a=%s&b=%s", Uri.encode("foo bar"), Uri.encode("100% fubar'd"));
I'm going to add one suggestion here aimed at Android users. You can do this which avoids having to get any external libraries. Also, all the search/replace characters solutions suggested in some of the answers above are perilous and should be avoided.
Give this a try:
String urlStr = "http://abc.dev.domain.com/0007AC/ads/800x480 15sec h.264.mp4";
URL url = new URL(urlStr);
URI uri = new URI(url.getProtocol(), url.getUserInfo(), url.getHost(), url.getPort(), url.getPath(), url.getQuery(), url.getRef());
url = uri.toURL();
You can see that in this particular URL, I need to have those spaces encoded so that I can use it for a request.
This takes advantage of a couple features available to you in Android classes. First, the URL class can break a url into its proper components so there is no need for you to do any string search/replace work. Secondly, this approach takes advantage of the URI class feature of properly escaping components when you construct a URI via components rather than from a single string.
The beauty of this approach is that you can take any valid url string and have it work without needing any special knowledge of it yourself.
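Wrapped into a helper, the approach above might look like the following sketch. The class and method names are invented, and the checked exceptions are rethrown as IllegalArgumentException purely for convenience. One caveat that follows from the URI javadoc: the multi-argument constructors always quote the percent character, so a URL that already contains escapes such as %20 would be escaped a second time.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;

final class UrlEscaper {
    /** Splits the string with URL, then lets URI quote each component. */
    @SuppressWarnings("deprecation") // new URL(String) is deprecated in recent JDKs but still available
    static URI escape(String urlStr) {
        try {
            URL url = new URL(urlStr);
            return new URI(url.getProtocol(), url.getUserInfo(), url.getHost(),
                    url.getPort(), url.getPath(), url.getQuery(), url.getRef());
        } catch (MalformedURLException | URISyntaxException e) {
            throw new IllegalArgumentException("Cannot convert to URI: " + urlStr, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(escape("http://abc.dev.domain.com/0007AC/ads/800x480 15sec h.264.mp4"));
    }
}
```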
Even if this is an old post with an already accepted answer, I post my alternative answer because it works well for the present issue and it seems nobody mentioned this method.
With the java.net.URI library:
URI uri = URI.create(URLString);
And if you want a URL-formatted string corresponding to it:
String validURLString = uri.toASCIIString();
Unlike many other methods (e.g. java.net.URLEncoder), this one encodes only non-ASCII characters (like ç, é...).
In the above example, if URLString is the following String:
"http://www.domain.com/façon+word"
the resulting validURLString will be:
"http://www.domain.com/fa%C3%A7on+word"
which is a well-formatted URL.
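A small runnable sketch of this method follows; the class and method names are placeholders. It also illustrates a limitation: URI.create rejects characters that are illegal in a URI, such as a space, so this route only helps when non-ASCII characters are the sole problem.

```java
import java.net.URI;

final class AsciiUrls {
    /** Percent-encodes non-ASCII characters as UTF-8 and leaves everything else alone. */
    static String toAscii(String url) {
        // URI.create throws IllegalArgumentException for illegal characters such as spaces.
        return URI.create(url).toASCIIString();
    }

    public static void main(String[] args) {
        // \u00e7 is the letter c with cedilla, written as an escape to keep the source ASCII.
        System.out.println(toAscii("http://www.domain.com/fa\u00e7on+word"));
    }
}
```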
If you don't like libraries, how about this?
Note that you should not use this function on the whole URL, instead you should use this on the components...e.g. just the "a b" component, as you build up the URL - otherwise the computer won't know what characters are supposed to have a special meaning and which ones are supposed to have a literal meaning.
/** Converts a string into something you can safely insert into a URL. */
public static String encodeURIcomponent(String s)
{
    StringBuilder o = new StringBuilder();
    // Work on UTF-8 bytes so that every escaped value fits in two hex digits.
    for (byte b : s.getBytes(java.nio.charset.StandardCharsets.UTF_8)) {
        int ch = b & 0xFF;
        if (isUnsafe(ch)) {
            o.append('%');
            o.append(toHex(ch / 16));
            o.append(toHex(ch % 16));
        }
        else o.append((char) ch);
    }
    return o.toString();
}
private static char toHex(int ch)
{
    return (char)(ch < 10 ? '0' + ch : 'A' + ch - 10);
}
private static boolean isUnsafe(int ch)
{
    // Bytes of 128 and above belong to multi-byte UTF-8 sequences.
    if (ch >= 128)
        return true;
    return " %$&+,/:;=?#<>".indexOf(ch) >= 0;
}
You can use the multi-argument constructors of the URI class. From the URI javadoc:
The multi-argument constructors quote illegal characters as required by the components in which they appear. The percent character ('%') is always quoted by these constructors. Any other characters are preserved.
So if you use
URI uri = new URI("http", "www.google.com?q=a b", null);
Then you get http:www.google.com?q=a%20b which isn't quite right, but it's a little closer.
If you know that your string will not have URL fragments (e.g. http://example.com/page#anchor), then you can use the following code to get what you want:
String s = "http://www.google.com?q=a b";
String[] parts = s.split(":",2);
URI uri = new URI(parts[0], parts[1], null);
To be safe, you should scan the string for # characters, but this should get you started.
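One way to handle the # case is to cut the fragment off first and pass it as the third constructor argument, as in the sketch below. The class and method names are hypothetical, and the code assumes the first # in the string really starts the fragment.

```java
import java.net.URI;
import java.net.URISyntaxException;

final class SchemeSplit {
    /** Splits off the fragment and the scheme, then lets URI quote the rest. */
    static URI toUri(String s) {
        String fragment = null;
        int hash = s.indexOf('#');
        if (hash >= 0) {
            fragment = s.substring(hash + 1);
            s = s.substring(0, hash);
        }
        String[] parts = s.split(":", 2);
        if (parts.length < 2) {
            throw new IllegalArgumentException("No scheme found in: " + s);
        }
        try {
            // Arguments are scheme, scheme-specific part, fragment.
            return new URI(parts[0], parts[1], fragment);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toUri("http://www.google.com?q=a b"));
        System.out.println(toUri("http://example.com/my page#top anchor"));
    }
}
```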
I had similar problems for one of my projects to create a URI object from a string. I couldn't find any clean solution either. Here's what I came up with :
public static URI encodeURL(String url) throws MalformedURLException, URISyntaxException
{
    URL urlLink = new URL(url);
    // Reuse the original scheme instead of hard-coding "http", so https URLs stay https.
    return new URI(urlLink.getProtocol(), urlLink.getHost(), urlLink.getPath(), urlLink.getQuery(), urlLink.getRef());
}
You can use the following URI constructor instead to specify a port if needed:
URI uri = new URI(scheme, userInfo, host, port, path, query, fragment);
Well I tried using
String converted = URLDecoder.decode("toconvert","UTF-8");
I hope this is what you were actually looking for?
The java.net blog had a class the other day that might have done what you want (but it is down right now so I cannot check).
This code here could probably be modified to do what you want:
http://svn.apache.org/repos/asf/incubator/shindig/trunk/java/common/src/main/java/org/apache/shindig/common/uri/UriBuilder.java
Here is the one I was thinking of from java.net: https://urlencodedquerystring.dev.java.net/
Or perhaps you could use this class:
http://developer.android.com/reference/java/net/URLEncoder.html
Which is present in Android since API level 1.
Annoyingly however, it treats spaces specially (replacing them with + instead of %20). To get round this we simply use this fragment:
URLEncoder.encode(value, "UTF-8").replace("+", "%20");
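As a self-contained sketch of that fragment (class and method names are made up): the replacement is safe because URLEncoder turns a literal plus sign in the input into %2B, so every + left in its output stands for a space. As with the other component encoders, this is meant for individual values, not for a whole URL.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

final class PercentEncoding {
    /** Form-encodes a single value, then rewrites '+' (space) as %20. */
    static String encodeComponent(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8").replace("+", "%20");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is a charset every Java platform must support.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("http://www.google.com?q=" + encodeComponent("a b"));
    }
}
```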
I ended up using the httpclient-4.3.6:
import org.apache.http.client.utils.URIBuilder;
public static void main (String [] args) {
URIBuilder uri = new URIBuilder();
uri.setScheme("http")
.setHost("www.example.com")
.setPath("/somepage.php")
.setParameter("username", "Hello Günter")
.setParameter("p1", "parameter 1");
System.out.println(uri.toString());
}
Output will be:
http://www.example.com/somepage.php?username=Hello+G%C3%BCnter&p1=parameter+1
