I was playing around with non-stringy types for an application loader I've been developing. Because of a typo, I forgot to include the protocol part of a specific URI. I expected the Java test to fail due to an invalid URI; however, this statement seems to work:
URI uri = URI.create("contacts.addresses.genericAddress");
As far as I know, there's no standard for using a dot as a scheme part, and I thought the scheme part was always required?
Does anyone know why?
I'll add my comment as an answer because I think it's correct:
From the Java URI documentation: "specified by the grammar in RFC 2396, Appendix A", and Appendix A allows a URI to be a relative path, with no host name or scheme. So "this.and.that" might just be a file name like "this.html" (dots are valid in a file name, i.e., they are pchars in a path segment).
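A quick way to check this is to ask the parsed URI what it thinks it is. The sketch below is only an illustration; the class name SchemelessUriDemo and the describe helper are made up for this example.

```java
import java.net.URI;

public class SchemelessUriDemo {
    // Hypothetical helper: summarizes how java.net.URI classified the string.
    static String describe(String text) {
        URI uri = URI.create(text);
        return "scheme=" + uri.getScheme()
                + " absolute=" + uri.isAbsolute()
                + " opaque=" + uri.isOpaque()
                + " path=" + uri.getPath();
    }

    public static void main(String[] args) {
        // No colon in the string, so there is no scheme: it parses as a relative path.
        System.out.println(describe("contacts.addresses.genericAddress"));
        System.out.println(describe("this.html"));
    }
}
```

Both strings come back with a null scheme and the whole input as the path, which is why URI.create does not throw.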
What is a hierarchical URI strictly speaking?
Somewhere I saw a definition saying that a hierarchical URI must have a scheme and a path.
If hierarchical = not opaque, then a hierarchical URI should also have a scheme.
Can there be a hierarchical URI without a scheme (for example, a relative URI)?
Yes, that is a hierarchical URI that is scheme and host relative (but path absolute). (On the assumption that the omitted scheme is a hierarchical one like "http://" or "file://")
See section 1.2.3 of RFC 3986 (https://www.ietf.org/rfc/rfc3986.txt) for more info, including the canonical definition of a hierarchical URI.
A relative reference (Section 4.2) refers to a resource by describing
the difference within a hierarchical name space between the reference
context and the target URI.
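To make the "difference between the reference context and the target URI" concrete, here is a small sketch using java.net.URI. The base URL and the class and method names are assumptions chosen for illustration only.

```java
import java.net.URI;

public class RelativeReferenceDemo {
    // Resolves a relative reference against a base (context) URI.
    static String resolve(String base, String reference) {
        return URI.create(base).resolve(URI.create(reference)).toString();
    }

    // In java.net.URI terms, "hierarchical" simply means "not opaque".
    static boolean isHierarchical(String text) {
        return !URI.create(text).isOpaque();
    }

    public static void main(String[] args) {
        String base = "http://example.com/docs/guide/index.html"; // hypothetical base
        // Path-absolute reference: scheme and host are taken from the base.
        System.out.println(resolve(base, "/images/pic.png"));
        // Path-relative reference: also inherits the directory of the base.
        System.out.println(resolve(base, "collections/designfaq.html#28"));
        System.out.println(isHierarchical("/images/pic.png"));       // no scheme, still hierarchical
        System.out.println(isHierarchical("mailto:user@example.com")); // opaque
    }
}
```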
Why does URI allow missing protocol (while URL does not)?
In Wikipedia, the scheme (and even the path) seem to be obligatory components of a URI:
The URI generic syntax consists of a hierarchical sequence of five
components:[8]
URI = scheme:[//authority]path[?query][#fragment]
Or does a missing protocol default to something (like http)? I found nothing like this in the docs.
new URI("my.html"); // 1
new URI("xabc:my.html"); // 2
new URL("my.html"); // 3
new URL("xabc:my.html"); // 4
Concerning the "obligatory" path: OK, there are opaque URIs. But why is a missing protocol allowed? (It should be present even for an opaque URI, which is required to be absolute.)
I could understand that a relative URL/URI doesn't require a protocol (<img src="/images/pic.png">), but URL throws a run-time java.net.MalformedURLException: no protocol in this case as well (while URI doesn't).
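For anyone who wants to reproduce the four cases, here is a minimal sketch. The class and helper names are made up, and the URL(String) constructor is deprecated since Java 20, hence the suppressed warning.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;

public class UriVersusUrlDemo {
    // Hypothetical helper: reports whether java.net.URI accepts the string.
    static String tryUri(String text) {
        try {
            URI uri = new URI(text);
            return "ok, scheme=" + uri.getScheme();
        } catch (URISyntaxException e) {
            return "URISyntaxException: " + e.getMessage();
        }
    }

    // Hypothetical helper: reports whether java.net.URL accepts the string.
    @SuppressWarnings("deprecation") // URL(String) is deprecated since Java 20
    static String tryUrl(String text) {
        try {
            new URL(text);
            return "ok";
        } catch (MalformedURLException e) {
            return "MalformedURLException: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(tryUri("my.html"));      // 1: accepted, no scheme
        System.out.println(tryUri("xabc:my.html")); // 2: accepted, unknown scheme is fine for URI
        System.out.println(tryUrl("my.html"));      // 3: rejected, no protocol
        System.out.println(tryUrl("xabc:my.html")); // 4: rejected, no handler for xabc
    }
}
```

The difference seems to come down to URL needing a protocol handler it can actually use, while URI only parses syntax.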
Your relative path must be wrong,
Java's URI supports an empty scheme for relative URIs:
relative URI, that is, a URI that does not specify a scheme. Some examples of hierarchical URIs are:
docs/guide/collections/designfaq.html#28
Scheme is optional:
[scheme:]scheme-specific-part[#fragment]
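URL is similar, but only when the relative spec is resolved against a context URL (new URL("/guidelines.txt") on its own throws MalformedURLException: no protocol), e.g.:
URL url = new URL(new URL("http://example.com/"), "/guidelines.txt");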
In our platform, we use a certain format for paths. The Android app receives those paths to load some data or do something.
I want to do all the data handling using a content provider: I give the path and get data. A simple transaction.
When I read into content providers, the documentation and all the tutorials out there always use "content://" at the beginning. However, I want to use our own start of the path which is usually "is-://". Can something like this work?
No, this is how the system categorizes the URI as belonging to a content provider.
It's like replacing file:// with something else.
After referring to the developer.google site:
A content URI is a URI that identifies data in a provider. Content URIs include the symbolic name of the entire provider (its authority) and a name that points to a table (a path). When you call a client method to access a table in a provider, the content URI for the table is one of the arguments.
From this, I believe you can't set it on your own, as it includes the symbolic name.
Also, why do you want to change it?
The construct new URL(new URL(new URL("http://localhost:4567"), "abc"), "def") produces (imho incorrectly) this url: http://localhost:4567/def
While the construct new URL(new URL(new URL("http://localhost:4567"), "abc/"), "def") produces the correct (wanted by me) url: http://localhost:4567/abc/def
The difference is a trailing slash in abc constructor argument.
Is this intended behavior, or is it a bug that should be fixed in the URL class?
After all the idea is not to worry about slashes when you use some helper class for URL construction.
Quoting javadoc of new URL(URL context, String spec):
Otherwise, the path is treated as a relative path and is appended to the context path, as described in RFC2396.
See section 5 "Relative URI References" of the RFC2396 spec, specifically section 5.2 "Resolving Relative References to Absolute Form", item 6a:
All but the last segment of the base URI's path component is copied to the buffer. In other words, any characters after the last (right-most) slash character, if any, are excluded.
Explanation
On a web page, the "Base URI" is the page address, e.g. http://example.com/path/to/page.html. A relative link, e.g. <a href="page2.html">, must be interpreted as a sibling to the base URI, so page.html is removed, and page2.html is added, resulting in http://example.com/path/to/page2.html, as intended.
The Java URL class implements this logic, and that is why you get what you see, and it is entirely the way it is supposed to work.
It is by design, i.e. not a bug.
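If you don't want to think about slashes, one option is a tiny helper that forces the base to look like a directory before resolving. This is just a sketch: the class name and the append helper are hypothetical, not part of the JDK, and the URL constructors are deprecated since Java 20.

```java
import java.net.MalformedURLException;
import java.net.URL;

public class TrailingSlashDemo {
    // Plain RFC 2396 resolution, as done by new URL(URL context, String spec).
    @SuppressWarnings("deprecation") // URL constructors are deprecated since Java 20
    static String resolve(String base, String spec) {
        try {
            return new URL(new URL(base), spec).toString();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Hypothetical helper: treat the base as a directory by adding a
    // trailing slash when it is missing, then resolve the segment.
    static String append(String base, String segment) {
        String directoryBase = base.endsWith("/") ? base : base + "/";
        return resolve(directoryBase, segment);
    }

    public static void main(String[] args) {
        // Last segment "abc" is dropped, as the RFC requires.
        System.out.println(resolve("http://localhost:4567/abc", "def"));
        // Trailing slash makes "abc/" a directory, so it is kept.
        System.out.println(resolve("http://localhost:4567/abc/", "def"));
        // The helper gives the "wanted" result either way.
        System.out.println(append("http://localhost:4567/abc", "def"));
    }
}
```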
Is there a clean and spec-conformant way to define a custom URL scheme that acts as an adapter on the resource returned by another URL?
I have already defined a custom URL protocol which returns a decrypted representation of a local file. So, for instance, in my code,
decrypted-file:///path/to/file
transparently decrypts the file you would get from file:///path/to/file. However, this only works for local files. No fun! I am hoping that the URL specification allows a clean way that I could generalize this by defining a new URL scheme as a kind of adapter on existing URLs.
For example, could I instead define a custom URL scheme decrypted: that could be used as an adapter that prefixes another absolute URL that retrieved a resource? Then I could just do
decrypted:file:///path/to/file
or decrypted:http://server/path/to/file or decrypted:ftp://server/path/to/file or whatever. This would make my decrypted: protocol composable with all existing URL schemes that do file retrieval.
Java does something similar with the jar: URL scheme but from my reading of RFC 3986 it seems like this Java technology violates the URL spec. The embedded URL is not properly byte-encoded, so any /, ?, or # delimiters in the embedded URL should officially be treated as segment delimiters in the embedding URL (even if that's not what JarURLConnection does). I want to stay within the specs.
Is there a nice and correct way to do this? Or is the only option to byte-encode the entire embedded URL (i.e., decrypted:file%3A%2F%2F%2Fpath%2Fto%2Ffile, which is not so nice)?
Is what I'm suggesting (URL adapters) done anywhere else? Or is there a deeper reason why this is misguided?
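For reference, the fully encoded form mentioned above can be produced with the JDK's URLEncoder. This is only a sketch: the wrap and unwrap helpers are hypothetical, and URLEncoder implements HTML form encoding (for example, a space becomes "+"), which is close to but not identical to RFC 3986 percent-encoding.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class EmbeddedUrlEncodingDemo {
    private static final String PREFIX = "decrypted:"; // scheme name taken from the question

    // Hypothetical helper: embeds a URL by encoding all of its delimiters.
    static String wrap(String innerUrl) {
        return PREFIX + URLEncoder.encode(innerUrl, StandardCharsets.UTF_8);
    }

    // Hypothetical helper: recovers the embedded URL.
    static String unwrap(String wrapped) {
        if (!wrapped.startsWith(PREFIX)) {
            throw new IllegalArgumentException("not a decrypted: URL: " + wrapped);
        }
        return URLDecoder.decode(wrapped.substring(PREFIX.length()), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String wrapped = wrap("file:///path/to/file");
        System.out.println(wrapped);
        System.out.println(unwrap(wrapped));
    }
}
```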
There's no built-in adaptor in Cocoa, but writing your own using NSURLProtocol is pretty straightforward for most uses. Given an arbitrary URL, encoding it like so seems simplest:
myscheme:<originalurl>
For example:
myscheme:http://example.com/path
At its simplest, NSURL only actually cares if the string you pass in is a valid URI, which the above is. Yes, there is then extra URL support layered on top, based around RFC 1808 etc. but that's not essential.
All that's required to be a valid URI is a colon to indicate the scheme, and no invalid characters (basically, ASCII without spaces).
You can then use the -resourceSpecifier method to retrieve the original URL and work with that.
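Since the question also mentions Java, the rough counterpart there would be URI.getSchemeSpecificPart(). The sketch below uses java.net.URI rather than Cocoa; the class and helper names are made up for illustration.

```java
import java.net.URI;

public class AdapterSchemeDemo {
    // Hypothetical helper: returns everything between the outer scheme's
    // colon and the fragment, i.e. the embedded URL.
    static String innerOf(String wrapped) {
        return URI.create(wrapped).getSchemeSpecificPart();
    }

    // The fragment, if any, belongs to the outer URI, not the embedded one.
    static String outerFragmentOf(String wrapped) {
        return URI.create(wrapped).getFragment();
    }

    public static void main(String[] args) {
        System.out.println(innerOf("myscheme:http://example.com/path"));
        // Caveat: a '#' in the embedded URL is parsed as the outer fragment.
        System.out.println(innerOf("myscheme:http://example.com/path#frag"));
        System.out.println(outerFragmentOf("myscheme:http://example.com/path#frag"));
    }
}
```

Note the fragment caveat: a "#" inside the embedded URL is split off by the outer parser, so it has to be reattached or encoded if the inner URL needs it.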