Tomcat and URLS with percent signs in them - java

I am finding that if I pass access tomcat with a URL with a percent sign in it (e.g.
http://tester:8080/blah-1.6.0-SNAPSHOT/blah/getLoginURL/http%2F
Then tomcat seems to block the request and returns a blank response. If I remove the %gt above the request works as expected. Is there anyway to prevent this behaviour?
Edit: I thought I was using URL encoding - the above URL also causes the same failure

Correctly encode the URL:
http://tester:8080/blah-1.6.0-SNAPSHOT/blah/getLoginURL/veryr%25gt
Difference here ---------------------------------------------^^^
In a URI, the % character is special: It introduces an encoded entity. To actually put a % in a URI, you must use %25 (which is the encoded entity %). This is called URI-encoding, although it's frequently called "percent encoding".
(Complete speculation) If the %gt was meant to be a >, that would be %3E. URI encoding is a different thing from HTML character entities.

Related

getParameter special characters

I'm trying to get an url parameter in jee.
So I have this kind of url :
http://MySite/MySite.jsp?page=recherche&msg=toto
First i tried with : request.getParameter("msg").toString();
it works well but if I try to search "c++" , the method "getParameter()" returns "c" and not "c++" and i understand.
So I tried another thing. I get the current URL and parse it to get the value of the message :
String msg[]= request.getQueryString().split("msg=");
message=msg[1].toString();
It works now for the research "c++" but now I can't search accent. What can I do ?
EDIT 1
I encode the message in the url
String urlString=Utils.encodeUrl(request.getParameter("msg"));
so for the URL : http://MySite/MySite.jsp?page=recherche&msg=c++
i have this encoded URL : http://MySite/MySite.jsp?page=recherche&msg=c%2B%2B
And when i need it, i decode the message of the URL
String decodedUrl = URLDecoder.decode(url, "ISO-8859-1");
Thanks everybody
Anything you send via "get" method goes as part of the url, which needs to be urlencoded to be valid in case it contains at least one of the reserved characters. So, any character will need to be encoded before sending.
In order to send c++, you would have to send c%2B%2B. That would be interpreted properly at the server side.
Here some reference you can check:
http://www.blooberry.com/indexdot/html/topics/urlencoding.htm
Now the question is, how and where do you generate your URL? According to the language, you will need to use the proper method to encode your strings.
if I try to search "c++" , the method "getParameter()" returns "c" and not "c++"
Query parameters are treated as application/x-www-form-urlencoded, so a + character in the URL means a space character in the parameter value. If you want to send a + character then it needs to be encoded in the URL as %2B:
http://MySite/MySite.jsp?page=recherche&msg=c%2B%2B
The same applies to accented characters, they need to be escaped as the bytes of their UTF-8 representation, so été would need to be:
msg=%C3%A9t%C3%A9
(é being Unicode character U+00E9, which is C3 A9 in UTF-8).
In short, it's not the fault of this code, it's the fault of whatever component is responsible for constructing the URL on the client side.
Call your URL with
msg=c%2B%2B
+ in a URL mean 'space'. It needs to be escaped.
You need to escape special characters when passing them as URL parameters. Since + means space and & means and another parameter, these cannot be used as parameter values.
See this other S.O. question.
You may want to use the Apache HTTP client library to help you with the URL encoding/decoding. The URIUtil class has what you need.
Something like this should work:
String rawParam = request.getParameter("msg");
String msgParam = URIUtil.decode(rawParam);
Your example indicates that the data is not being properly encoded on the client side. See this JavaScript question.

Get parameter + replace with space when processing in action class

I encrypt a text "good-bye, friend" using BasicTextEncryptor. So the encrypt value looks like below,
3qe80L1ap+cR2zRU9csFwOffw5NtWTueLRYgSXyjctI=
Then I email a URL to the user where the above parameter as a token.
Then the user copies the below URL and presses enter,
http://localhost:8080/token=3qe80L1ap+cR2zRU9csFwOffw5NtWTueLRYgSXyjctI=
But when I access the parameter in Struts 2 application through the action method it gives me the encrypt parameter as below,
3qe80L1ap cR2zRU9csFwOffw5NtWTueLRYgSXyjctI=
The + is replaced by " ". So when I decrypt it, it gives me EncryptionOperationNotPossibleException.
Does struts decode the + to " " assuming browser + is a encode character? In that case it ok before I proceed with decrypt, I replace the space with + ?
A better way would be to "URL encode" the string before appending it to actual URL.
URLEncoder.encode("3qe80L1ap+cR2zRU9csFwOffw5NtWTueLRYgSXyjctI=", "ISO-8859-1");
This would make sure the token is correctly decoded.
To, answer your question, struts does not have any role in decoding the URL parameter. Its the core functionality of the application server to decode the URL parameter. So every HTTP parameter is subjected to decoding before reaching the application code.
Whatever is decoded by the server is available by to the application (i.e. Struts in your case. )
Now to explain why the + is not reaching your struts.
java.net.URLDecoder.decode("3qe80L1ap+cR2zRU9csFwOffw5NtWTueLRYgSXyjctI="));
it returns 3qe80L1ap cR2zRU9csFwOffw5NtWTueLRYgSXyjctI=
which means that + is not getting URL Decoded.
So, reiterating, every HTTP parameter (querystring or form POST) is subjected to decoding before reaching the application code.
When you URL encode your string, + is encoded as %2B and your struts application will receive the correct decoded string.
You'll need to not put the base64 encoded string there, but encode it using the UrlEncoder, like the following:
URLEncoder.encode("3qe80L1ap+cR2zRU9csFwOffw5NtWTueLRYgSXyjctI=", "UTF-8")
That way you can put it in the link.
Consider using a so called URL safe variant of Base64. The most common variant, described in RFC 4648, uses - and _ instead of + and / respectively, and omits padding characters (=).
Most implementations of Base64 support this URL safe variant too, though if yours doesn't, it's easy enough to do manually.
URLs cannot contain spaces. URL encoding normally replaces a space with a + sign.
Thus the server decodes normally + sign to the space. See URLEncoder docs or read Java URL encoding of query string parameters.

How to read the parameter from the url contains '#' instead of '?'

How to read access token from this url? When I try to read this by request.getParameter() method it returns null value
http://localhost:8080/FirstPick/views/common/home.faces#access_token=xxxxxxxxxx&expires_in=4015
Besides this,
Content after the hash (#) is only be used on the client side. If you require that information on the server, you can use a different separator with query-string using '?', or you can submit it via Ajax after the page has loaded by reading it on the client with JavaScript.
The part of the URI after the hash(#) is never sent to the server, reason is that the hash identifier was originally designed to point at references within the given web page and not to new resources on the server.
Thanks
You cant read any parameter from querystring like this..
It must contain '?'.
Only the string that appears after '?' is called 'QueryString'.
and from 'QueryString' you can get the value.
And you menstion the url here, is not contain '?' so it doesnt have 'QueryString'.
and You cant use request.getParameter() method.
On client side (i.e. from JavaScript) you can check window.location.hash to get hash. On server side, general answer is 'it is impossible' since hash is not sent in request to server.

UTF-8 encoding of GET parameters in JSF

I have a search form in JSF that is implemented using a RichFaces 4 autocomplete component and the following JSF 2 page and Java bean. I use Tomcat 6 & 7 to run the application.
...
<h:commandButton value="#{msg.search}" styleClass="search-btn" action="#{autoCompletBean.doSearch}" />
...
In the AutoCompleteBean
public String doSearch() {
//some logic here
return "/path/to/page/with/multiple_results?query=" + searchQuery + "&faces-redirect=true";
}
This works well as long as everything withing the "searchQuery" String is in Latin-1, it does not work if is outside of Latin-1.
For instance a search for "bodø" will be automatically encoded as "bod%F8". However a search for "Kra Ðong" will not work since it is unable to encode "Ð".
I have now tried several different approaches to solve this, but none of them works.
I have tried encoding the searchQuery my self using URLEncode, but this only leads to double encoding since % is encoded as %25.
I have tried using java.net.URI to get the encoding, but gives the same result as URLEncode.
I have tried turning on UTF-8 in Tomcat using URIEncoding="UTF-8" in the Connector but this only worsens that problem since then non-ascii characters does not work at all.
So to my questions:
Can I change the way JSF 2 encodes the GET parameters?
If I cannot change the way JSF 2 encodes the GET parameters, can I turn of the encoding and do it manually?
Am I doing something where strange here? This seems like something that should be supported out-of-the-box, but I cannot find any others with the same problem.
I think you've hit a corner case bug in JSF. The query string is URL-encoded by ExternalContext#encodeRedirectURL() which uses the response character encoding as obtained by ExternalContext#getResponseCharacterEncoding(). However, while JSF by default uses UTF-8 as response character encoding, this is only been set if the view is actually to be rendered, not when the response is to be redirected, so the response character encoding still returns the platform default of ISO-8859-1 which causes your characters to be URL-encoded using this wrong encoding.
I've reported this as issue 2440. In the meanwhile your best bet is to explicitly set the response character encoding yourself beforehand.
FacesContext.getCurrentInstance().getExternalContext().setResponseCharacterEncoding("UTF-8");
Note that this still requires that the container itself uses the same character encoding to decode the request URL, so you certainly need to set URIEncoding="UTF-8" in Tomcat's configuration. This won't mess up the characters anymore as they will be really UTF-8 now.
The only character encoding accepted for HTTP URLs and headers is US-ASCII, you need to URL encode these characters to send them back to the application. Simplest way to do this in java would be:
public String doSearch() {
//some logic here
String encodedSearchQuery = java.net.URLEncoder.encode( searchQuery, "UTF-8" );
return "/path/to/page/with/multiple_results?query=" + encodedSearchQuery + "&faces-redirect=true";
}
And then it should work for any character that you use.

encoding problem in servlet

I have a servlet which receive some parameter from the client ,then do some job.
And the parameter from the client is Chinese,so I often got some invalid characters in the servet.
For exmaple:
If I enter
http://localhost:8080/Servlet?q=中文&type=test
Then in the servlet,the parameter of 'type' is correct(test),however the parameter of 'q' is not correctly encoding,they become invalid characters that can not parsed.
However if I enter the adderss bar again,the url will changed to :
http://localhost:8080/Servlet?q=%D6%D0%CE%C4&type=test
Now my servlet will get the right parameter of 'q'.
What is the problem?
UPDATE
BTW,it words well when I send the form with post.
WHen I send them in the ajax,for example:
url="http://..q='中文',
xmlhttp.open("POST",url,true);
Then the server side also get the invalid characters.
It seems that just when the Chinese character are encoded like %xx,the server side can get the right result.
That's to say http://.../q=中文 does not work,
http://.../q=%D6%D0%CE%C4 work.
But why "http://www.google.com.hk/search?hl=zh-CN&newwindow=1&safe=strict&q=%E4%B8%AD%E6%96%87&btnG=Google+%E6%90%9C%E7%B4%A2&aq=f&aqi=&aql=&oq=&gs_rfai=" work?
Ensure that the encoding of the page with the form itself is also UTF-8 and ensure that the browser is instructed to read the page as UTF-8. Assuming that it's JSP, just put this in very top of the page to achieve that:
<%# page pageEncoding="UTF-8" %>
Then, to process GET query string as UTF-8, ensure that the servletcontainer in question is configured to do so. It's unclear which one you're using, so here's a Tomcat example: set the URIEncoding attribute of the <Connector> element in /conf/server.xml to UTF-8.
<Connector URIEncoding="UTF-8">
For the case that you'd like to use POST, then you need to ensure that the HttpServletRequest is instructed to parse the POST request body using UTF-8.
request.setCharacterEncoding("UTF-8");
Call this before you access the first parameter. A Filter is the best place for this.
See also:
Unicode - How to get the characters right?
Using non-ASCII characters as GET parameters (i.e. in URLs) is generally problematic. RFC 3986 recommends using UTF-8 and then percent encoding, but that's AFAIK not an official standard. And what you are using in the case where it works isn't UTF-8!
It would probably be safest to switch to POST requests.
I believe that the problem is on sending side. As I understood from your description if you are writing the URL in browser you get "correctly" encoded request. This job is done by browser: it knows to convert unicode characters to sequence of codes like %xx.
So, try to check how do you send the request. It should be encoded on sending.
Other possibility is to use POST method instead of GET.
Do read this article on URL encoding format "www.blooberry.com/indexdot/html/topics/urlencoding.htm".
If you want, you could convert characters to hex or Base64 and put them in the parameters of the URL.
I think it's better to put them in the body (Post) then the URL (Get).

Categories

Resources