Encoding query parameters in URL using Java with valid charset - java

I am trying to understand the differences between the charsets available for encoding and decoding text, and why the choice matters.
I have a scenario where I want to call a REST API. The API has a base URL, for example https://myrestapiurl.com. To perform a GET request, the URL is formed by appending the id of the entity I want to fetch, like: https://myrestapiurl.com('id')
The id has no restrictions on which characters it may contain.
I have encountered the id باقی ریسورس, so before calling the REST API I need to encode it. Using Java's URLEncoder, I tried the following:
String s ="باقی ریسورس";
String encodedID = URLEncoder.encode(s, StandardCharsets.UTF_8.name());
Using the encodedID, I make a request with Postman. The request fails with 404 or 400 when I use other charsets. It only succeeds when I encode using ISO_8859_1, as follows:
String encodedID = URLEncoder.encode(s, StandardCharsets.ISO_8859_1.name());
String URL = "https://myrestapiurl.com('" + encodedID + "')";
This works fine, both from code and from Postman. My questions are:
How can I decide which charset to use before encoding? Or should I have fallbacks, that is, if the request fails with UTF_8, retry with UTF_16, and so on? That is very inefficient: if the entity does not actually exist, all of these retries are wasted overhead.
Also, when I visit https://www.w3schools.com/tags/ref_urlencode.ASP and enter the text to be encoded, it produces a valid encoded string with ISO_8859_1. How does it manage to do so?
How can this be done in Java without extra libraries such as Apache? We are not allowed to add dependencies.
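For comparison, here is a minimal sketch of how the JDK's own URLEncoder and URLDecoder behave for this id under the two charsets, with no extra libraries. The class and method names are made up for illustration, and the id is written with Unicode escapes so the source file encoding does not matter. ISO-8859-1 has no mapping for Arabic-script letters, so encoding with it cannot preserve them, while UTF-8 round-trips. Which form a particular server accepts is not something this sketch can settle.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical helper, for illustration only.
class CharsetRoundTrip {

    // Percent-encodes the value using the named charset.
    static String encode(String value, String charset) {
        try {
            return URLEncoder.encode(value, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Encodes and then decodes with the same charset.
    static String roundTrip(String value, String charset) {
        try {
            return URLDecoder.decode(encode(value, charset), charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // The id from the question, written as Unicode escapes.
        String id = "\u0628\u0627\u0642\u06CC \u0631\u06CC\u0633\u0648\u0631\u0633";
        System.out.println("UTF-8:      " + encode(id, "UTF-8"));
        System.out.println("ISO-8859-1: " + encode(id, "ISO-8859-1"));
        System.out.println("UTF-8 round-trips:      " + id.equals(roundTrip(id, "UTF-8")));
        System.out.println("ISO-8859-1 round-trips: " + id.equals(roundTrip(id, "ISO-8859-1")));
    }
}
```

If the ISO-8859-1 output is what the server matches on, the original characters are already lost at that point, so it is worth checking what the server actually stores for this id.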

Related

How to URL-encode the whole xml value of a query param using Spring's rest template?

I am working on a Spring Boot application.
I need to make a request to an external service that is old and ill-conceived. The request takes the form of an HTTP GET (or POST) call, but the payload, an XML document, needs to be passed as a query parameter. For example,
GET http://ill-service.com/plain.cgi?XML_DATA=<request attribute="attributeValue"><content contentAttribute="plain"/></request>
Of course, the value of the query param XML_DATA needs to be URL-encoded, and normally the RestTemplate of Spring Boot handles that well, following RFC 3986 (see http://www.ietf.org/rfc/rfc3986.txt).
Except that, as allowed by this RFC, the '/' and '=' characters are left as-is in the param value, giving me the following query:
GET http://ill-service.com/plain.cgi?XML_DATA=%3Crequest%20attribute=%22attributeValue%22%3E%3Ccontent%20contentAttribute=%22plain%22/%3E%3C/request%3E
In a perfect world, this would be good, but do you remember when I said that the service I am trying to call is ill-conceived? It needs the full content of XML_DATA to be URL-encoded. In other words, it needs the following query:
GET http://ill-service.com/plain.cgi?XML_DATA=%3Crequest%20attribute%3D%22attributeValue%22%3E%3Ccontent%20contentAttribute%3D%22plain%22%2F%3E%3C%2Frequest%3E%0A
I am quite lost on how to instruct the RestTemplate or the UriComponentsBuilder I am using to do so. Any help would be greatly appreciated.
You can probably use Spring's UriUtils class.
Use java.net.URLEncoder to encode your XML payload first and then append the encoded payload.
Following the suggestion of Vasif, and some information about UriComponentsBuilder, I found the following solution:
String xmlContent = "<request attribute=\"attributeValue\"><content contentAttribute=\"plain\"/></request>";
URI uri = UriComponentsBuilder.fromHttpUrl("http://ill-service.com/plain.cgi")
// Set the query param to a fully encoded value, instead of letting the builder encode it as a query value
.queryParam("XML_DATA", UriUtils.encode(xmlContent, "UTF-8"))
// build(true) tells the builder that the URI is already encoded
.build(true).toUri();
String responseStr = restTemplate.getForObject(uri, String.class);
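The java.net.URLEncoder suggestion above can be sketched without Spring as follows. The class and method names are hypothetical. URLEncoder targets form encoding and writes a space as '+', so this sketch swaps '+' for %20 afterwards to match the query the service expects; that swap is an assumption about what the service wants, based on the expected query shown in the question.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper, for illustration only.
class FullParamEncoder {

    // Percent-encodes a query parameter value, including '=' and '/'.
    static String encodeValue(String value) {
        try {
            // A literal '+' in the input is already %2B after encode(),
            // so every remaining '+' stands for a space and can become %20.
            return URLEncoder.encode(value, "UTF-8").replace("+", "%20");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always available", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<request attribute=\"attributeValue\">"
                + "<content contentAttribute=\"plain\"/></request>";
        System.out.println("http://ill-service.com/plain.cgi?XML_DATA=" + encodeValue(xml));
    }
}
```

The resulting string is already encoded, so it should be handed to the HTTP client as a java.net.URI (or with build(true) as above) to avoid a second round of encoding.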

Documenting JSON in URL not possible

In my REST API it should be possible to retrieve data that lies inside a bounding box. Because the bounding box has four coordinates, I want to design the GET requests in such a way that they accept the bounding box as JSON. Therefore I need to be able to send and document JSON strings as URL parameters.
The test itself works, but I cannot document these requests with Spring REST Docs (1.0.0.RC1). I reproduced the problem with a simpler method. See below:
@Test public void ping_username() throws Exception
{
String query = "name={\"user\":\"Müller\"}";
String encodedQuery = URLEncoder.encode(query, "UTF-8");
mockMvc.perform(get(URI.create("/ping?" + encodedQuery)))
.andExpect(status().isOk())
.andDo(document("ping_username"));
}
When I remove .andDo(document("ping_username")) the test passes.
Stacktrace:
java.lang.IllegalArgumentException: Illegal character in query at index 32: http://localhost:8080/ping?name={"user":"Müller"}
at java.net.URI.create(URI.java:852)
at org.springframework.restdocs.mockmvc.MockMvcOperationRequestFactory.createOperationRequest(MockMvcOperationRequestFactory.java:79)
at org.springframework.restdocs.mockmvc.RestDocumentationResultHandler.handle(RestDocumentationResultHandler.java:93)
at org.springframework.test.web.servlet.MockMvc$1.andDo(MockMvc.java:158)
at application.rest.RestApiTest.ping_username(RestApiTest.java:65)
After I received the suggestion to encode the URL I tried it, but the problem remains.
The String which is used to create the URI in my test is now /ping?name%3D%7B%22user%22%3A%22M%C3%BCller%22%7D.
I checked the class MockMvcOperationRequestFactory which appears in the stacktrace, and in line 79 the following code is executed:
URI.create(getRequestUri(mockRequest)
+ (StringUtils.hasText(queryString) ? "?" + queryString : ""))
The problem here is that an unencoded String is used (in my case http://localhost:8080/ping?name={"user":"Müller"}), so the creation of the URI fails.
Remark:
Andy Wilkinson's answer is the solution for the problem. Although I think that David Sinfield is right and JSONs should be avoided in the URL to keep it simple. For my bounding box I will use a comma separated string, as it is used in WMS 1.1: BBOX=x1,y1,x2,y2
You haven't mentioned the version of Spring REST Docs that you're using, but I would guess that the problem is with URIUtil. I can't tell for certain as I can't see where URIUtil is from.
Anyway, using the JDK's URLEncoder works for me with Spring REST Docs 1.0.0.RC1:
String query = "name={\"user\":\"Müller\"}";
String encodedQuery = URLEncoder.encode(query, "UTF-8");
mockMvc.perform(get(URI.create("/baz?" + encodedQuery)))
.andExpect(status().isOk())
.andDo(document("ping_username"));
You can then use URLDecoder.decode on the server side to get the original JSON:
URLDecoder.decode(request.getQueryString(), "UTF-8")
The problem is that URIs have to be encoded as ASCII. ü is not an ASCII character, so it must be percent-escaped in the URL.
If you are using Tomcat, you can set URIEncoding="UTF-8" on the Connector element in server.xml so that percent-escaped bytes in the request URI are decoded as UTF-8. With UTF-8, ü is escaped as %C3%BC: ü is the Unicode code point U+00FC, and its UTF-8 encoding is the two bytes C3 BC.
Edit: It seems that I have missed the exact point of the error, but it is still the same error. Curly braces are invalid in a URI. Only the following characters are acceptable according to RFC 3986:
ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~:/?#[]@!$&'()*+,;=%
So these must be escaped too.
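As a rough illustration of the escaping described above, the sketch below encodes only the parameter value with the JDK's URLEncoder, so that '=' still separates the name from the value, and checks that java.net.URI accepts the result. The class and method names are hypothetical, and ü is written as \u00FC so the source file encoding does not matter.

```java
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URLEncoder;

// Hypothetical helper, for illustration only.
class JsonQueryParam {

    // Percent-encodes a single parameter value as UTF-8.
    static String encodeValue(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always available", e);
        }
    }

    // Builds /ping?name=<encoded json>; only the value is encoded.
    static URI pingUri(String json) {
        return URI.create("/ping?name=" + encodeValue(json));
    }

    // Reports whether java.net.URI accepts the string as-is.
    static boolean isValidUri(String candidate) {
        try {
            URI.create(candidate);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String json = "{\"user\":\"M\u00FCller\"}";
        System.out.println(isValidUri("/ping?name=" + json)); // false: '{' and '"' are illegal
        System.out.println(pingUri(json).getRawQuery());      // name=%7B%22user%22%3A%22M%C3%BCller%22%7D
        System.out.println(pingUri(json).getQuery());         // decoded form of the same query
    }
}
```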

getParameter special characters

I'm trying to read a URL parameter in Java EE.
I have this kind of URL:
http://MySite/MySite.jsp?page=recherche&msg=toto
First I tried request.getParameter("msg").toString();
It works well, but if I search for "c++", getParameter() returns "c" instead of "c++", and I understand why.
So I tried something else. I get the current URL and parse it to extract the value of the message:
String msg[]= request.getQueryString().split("msg=");
message=msg[1].toString();
This now works for the search "c++", but now I can't search for accented characters. What can I do?
EDIT 1
I encode the message in the url
String urlString=Utils.encodeUrl(request.getParameter("msg"));
So for the URL http://MySite/MySite.jsp?page=recherche&msg=c++
I get this encoded URL: http://MySite/MySite.jsp?page=recherche&msg=c%2B%2B
And when I need it, I decode the message from the URL:
String decodedUrl = URLDecoder.decode(url, "ISO-8859-1");
Thanks everybody
Anything you send via the GET method goes as part of the URL, which must be URL-encoded to be valid if it contains any reserved characters. So any such character needs to be encoded before sending.
In order to send c++, you would have to send c%2B%2B. That would be interpreted properly on the server side.
Here is a reference you can check:
http://www.blooberry.com/indexdot/html/topics/urlencoding.htm
Now the question is, how and where do you generate your URL? According to the language, you will need to use the proper method to encode your strings.
if I try to search "c++" , the method "getParameter()" returns "c" and not "c++"
Query parameters are treated as application/x-www-form-urlencoded, so a + character in the URL means a space character in the parameter value. If you want to send a + character then it needs to be encoded in the URL as %2B:
http://MySite/MySite.jsp?page=recherche&msg=c%2B%2B
The same applies to accented characters, they need to be escaped as the bytes of their UTF-8 representation, so été would need to be:
msg=%C3%A9t%C3%A9
(é being Unicode character U+00E9, which is C3 A9 in UTF-8).
In short, it's not the fault of this code, it's the fault of whatever component is responsible for constructing the URL on the client side.
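The behaviour described above can be checked with the JDK's URLEncoder and URLDecoder, which implement application/x-www-form-urlencoded. This is a minimal sketch with hypothetical class and method names; the accented word is written with Unicode escapes so the source file encoding does not matter.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical helper, for illustration only.
class PlusAndAccents {

    // What the client should do before putting a value into the URL.
    static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always available", e);
        }
    }

    // What form decoding does to a received parameter value.
    static String decode(String value) {
        try {
            return URLDecoder.decode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always available", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(encode("c++"));                // c%2B%2B
        System.out.println(encode("\u00E9t\u00E9"));      // %C3%A9t%C3%A9
        System.out.println("[" + decode("c++") + "]");    // 'c' followed by two spaces
        System.out.println("[" + decode("c%2B%2B") + "]"); // [c++]
    }
}
```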
Call your URL with
msg=c%2B%2B
A + in a URL query string means a space, so it needs to be escaped.
You need to escape special characters when passing them as URL parameters. Since + means a space and & separates one parameter from the next, neither can appear unescaped in a parameter value.
See this other S.O. question.
You may want to use the Apache HTTP client library to help you with the URL encoding/decoding. The URIUtil class has what you need.
Something like this should work:
String rawParam = request.getParameter("msg");
String msgParam = URIUtil.decode(rawParam);
Your example indicates that the data is not being properly encoded on the client side. See this JavaScript question.

Get parameter + replace with space when processing in action class

I encrypt a text "good-bye, friend" using BasicTextEncryptor. So the encrypt value looks like below,
3qe80L1ap+cR2zRU9csFwOffw5NtWTueLRYgSXyjctI=
Then I email the user a URL that carries the above value as a token.
The user then copies the URL below and presses Enter:
http://localhost:8080/token=3qe80L1ap+cR2zRU9csFwOffw5NtWTueLRYgSXyjctI=
But when I access the parameter in my Struts 2 application through the action method, it gives me the encrypted parameter as below:
3qe80L1ap cR2zRU9csFwOffw5NtWTueLRYgSXyjctI=
The + is replaced by " ". So when I decrypt it, it gives me EncryptionOperationNotPossibleException.
Does Struts decode the + to " ", treating + as an encoded character sent by the browser? In that case, is it OK to replace the space with + before I proceed with decryption?
A better way would be to "URL encode" the string before appending it to actual URL.
URLEncoder.encode("3qe80L1ap+cR2zRU9csFwOffw5NtWTueLRYgSXyjctI=", "ISO-8859-1");
This would make sure the token is correctly decoded.
To answer your question, Struts does not play any role in decoding the URL parameter. It's a core function of the application server to decode URL parameters, so every HTTP parameter is decoded before it reaches the application code.
Whatever the server has decoded is what is available to the application (i.e. Struts in your case).
Now to explain why the + is not reaching your struts.
java.net.URLDecoder.decode("3qe80L1ap+cR2zRU9csFwOffw5NtWTueLRYgSXyjctI=");
returns 3qe80L1ap cR2zRU9csFwOffw5NtWTueLRYgSXyjctI=
which shows that the + is URL-decoded into a space.
So, reiterating, every HTTP parameter (querystring or form POST) is subjected to decoding before reaching the application code.
When you URL encode your string, + is encoded as %2B and your struts application will receive the correct decoded string.
You should not put the raw Base64-encoded string there; encode it with URLEncoder first, like the following:
URLEncoder.encode("3qe80L1ap+cR2zRU9csFwOffw5NtWTueLRYgSXyjctI=", "UTF-8")
That way you can put it in the link.
Consider using a so-called URL-safe variant of Base64. The most common variant, described in RFC 4648, uses - and _ instead of + and / respectively, and allows the padding characters (=) to be omitted.
Most implementations of Base64 support this URL safe variant too, though if yours doesn't, it's easy enough to do manually.
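As one example, the JDK itself ships the RFC 4648 URL-safe alphabet in java.util.Base64 (Java 8 and later). The sketch below converts the token from the question to the URL-safe form and back. The class and method names are hypothetical, and the conversion back to standard Base64 assumes the decrypting side (BasicTextEncryptor here) only accepts the standard alphabet.

```java
import java.util.Base64;

// Hypothetical helper, for illustration only.
class UrlSafeToken {

    // Re-encodes a standard Base64 string with the URL-safe alphabet and no padding.
    static String toUrlSafe(String standardBase64) {
        byte[] raw = Base64.getDecoder().decode(standardBase64);
        return Base64.getUrlEncoder().withoutPadding().encodeToString(raw);
    }

    // Converts a URL-safe token back to standard Base64 with padding.
    static String toStandard(String urlSafeBase64) {
        byte[] raw = Base64.getUrlDecoder().decode(urlSafeBase64);
        return Base64.getEncoder().encodeToString(raw);
    }

    public static void main(String[] args) {
        String token = "3qe80L1ap+cR2zRU9csFwOffw5NtWTueLRYgSXyjctI=";
        String urlSafe = toUrlSafe(token);
        System.out.println(urlSafe);             // safe to place in a link as-is
        System.out.println(toStandard(urlSafe)); // back to the original token
    }
}
```

Because the URL-safe form contains no +, / or =, it survives form decoding on the server unchanged.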
URLs cannot contain spaces. URL encoding normally replaces a space with a + sign.
The server therefore decodes a + sign back to a space. See the URLEncoder docs or read Java URL encoding of query string parameters.

java request.getQueryString() value different between chrome and ie browser

I have a request. In the browser address bar I enter:
http://localhost:8888/cmens-tops-outwear/t-b-f-a-c-s-fLoose-p-g-e-i-o.htm?'"--></style></script><script>netsparker(0x0000E1)</script>=
On Tomcat 6.0.35 I have set URIEncoding="UTF-8".
I call request.getQueryString() in the servlet.
In Chrome, I get
'%22--%3E%3C/style%3E%3C/script%3E%3Cscript%3Enetsparker(0x0000E1)%3C/script%3E=
In IE, I get
'"--></style></script><script>netsparker(0x0000E1)</script>=
Why?
Additional
I want to use the value of request.getQueryString() to create a URI:
URI uri = URI.create(url);
In IE this fails:
java.net.URISyntaxException: Illegal character in query at index 36: /cmens/t-b-f-a-c-s-f-p-g-e-i-o.htm?'"--></style></script><script>netsparker(0x0000E1)</script>
at java.net.URI$Parser.fail(URI.java:2809)
at java.net.URI$Parser.checkChars(URI.java:2982)
at java.net.URI$Parser.parseHierarchical(URI.java:3072)
at java.net.URI$Parser.parse(URI.java:3024)
at java.net.URI.<init>(URI.java:578)
at java.net.URI.create(URI.java:840)
How can I determine whether the query string has already been encoded?
The HttpServletRequest#getQueryString() is per definition undecoded. See also the javadoc (emphasis mine):
Returns:
a String containing the query string or null if the URL contains no query string. The value is not decoded by the container.
Basically, you need to URL-decode it yourself if you'd like to parse it manually instead of using getParameterXxx() methods for some reason (which implicitly decodes the parameters!).
String decodedQueryString = URLDecoder.decode(request.getQueryString(), "UTF-8");
As to why Chrome sends it encoded while IE does not, that's because Chrome is doing a better job of handling HTTP requests the safe/proper way. This is beyond your control. Just always URL-decode the query string yourself if you intend to parse it manually for some reason. The URIEncoding="UTF-8" configuration only affects the getParameterXxx() methods during GET requests.
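If you do parse the raw query string by hand, one way to do it is sketched below with hypothetical class and method names. It splits on & and = first and decodes each piece afterwards, because decoding the whole string up front would turn an encoded %26 or %3D inside a value into a real separator. It assumes UTF-8 and keeps only the last value of a repeated parameter.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, for illustration only.
class RawQueryParser {

    // Parses an undecoded query string such as the one from getQueryString().
    static Map<String, String> parse(String rawQuery) {
        Map<String, String> params = new LinkedHashMap<>();
        if (rawQuery == null || rawQuery.isEmpty()) {
            return params;
        }
        for (String pair : rawQuery.split("&")) {
            if (pair.isEmpty()) {
                continue; // tolerate "a=1&&b=2"
            }
            int eq = pair.indexOf('=');
            String name = eq >= 0 ? pair.substring(0, eq) : pair;
            String value = eq >= 0 ? pair.substring(eq + 1) : "";
            params.put(decode(name), decode(value));
        }
        return params;
    }

    // Decodes one name or value; throws IllegalArgumentException on a malformed % escape.
    static String decode(String piece) {
        try {
            return URLDecoder.decode(piece, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always available", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(parse("page=recherche&msg=c%2B%2B")); // {page=recherche, msg=c++}
    }
}
```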
The Chrome version is URLEncoded while the IE string is decoded.
Use this tool to compare the URLEncoded and decoded versions: http://meyerweb.com/eric/tools/dencoder/
Chrome sends the query string URL-encoded, but IE sends the raw characters.
For example: " is %22 in URL encoding.
< is %3C
and > is %3E
Chrome is doing it the "right way" but IE just can't do as all the others.
You can find complete list of URL characters here: http://www.w3schools.com/tags/ref_urlencode.asp
Chrome sends the url encoded. Try decoding the query string using
URLDecoder.decode(queryString, "UTF-8");
As stated by the javadoc, the query string is not decoded by the container:
returns a String containing the query string or null if the URL contains no query string. The value is not decoded by the container.
javadoc
