Java -> Apache Commons StringEscapeUtils -> escapeJavaScript - java

For a very simple ajax name lookup, I'm sending an id from the client webpage to the server (Tomcat 5.5, Java 5), looking it up in a database and returning a string, which is assigned to a javascript variable back in the client (and then displayed).
The javascript code that receives the value is pretty standard:
//client code - javascript
xmlHttp.onreadystatechange = function() {
    if (xmlHttp.readyState == 4) {
        var result = xmlHttp.responseText;
        alert(result);
        ...
    }
    ...
}
To return the string, I originally had this in the server:
//server code - java
myString = "...";
out.write(myString.getBytes("UTF-8"));
Which worked perfectly, if unsafe. Later, I replaced it with:
import org.apache.commons.lang.StringEscapeUtils;
...
myString = "...";
out.write(StringEscapeUtils.escapeJavaScript(myString).getBytes("UTF-8"));
But while safer, the resulting string can't be properly displayed if it contains special chars like "ñ".
For instance, using:
escapeJavaScript("años").getBytes("UTF-8");
sends:
a\u00F1os
to the client.
The question: is there a simple way to parse the resulting string in Javascript or is there an alternate escape function I can use in java that would prevent this issue?

The following works in every browser I've tried:
javascript:alert("a\u00F1os");
Perhaps your string is being escaped twice by mistake.

Actually, now that I read it over, I think I actually don't need to escape the string I'm sending back at all... That is, StringEscapeUtils.escapeJavaScript would be useful if the resulting value was printed in the page, like:
//javascript code with inline struts
var myJavascriptString = "<%=myJavaString%>";
Or am I missing something and there would still be a valid reason to do the escape in the original case? (when it is returned as a series of bytes back to an ajax onreadystatechange handler and assigned to a js variable)
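To make the difference concrete, here is a minimal sketch of what this kind of escaping does to a string. It is not the Commons Lang implementation; the class and method names are hypothetical, and it only handles quotes, backslashes, and characters outside printable ASCII. For "años" it produces the text a\u00F1os, which a JavaScript parser turns back into "ñ" only when the text is placed inside a string literal in script source. When the same text arrives as responseText, no parser ever sees it, so the backslash sequence is displayed as is.

```java
final class JsEscapeSketch {
    // Hypothetical helper: escapes a value for use inside a JavaScript string literal.
    static String escape(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '"' || c == '\'' || c == '\\') {
                // Quotes and backslashes get a leading backslash.
                out.append('\\').append(c);
            } else if (c < 0x20 || c > 0x7E) {
                // Control and non-ASCII characters become a four-digit hex escape.
                out.append(String.format("\\u%04X", (int) c));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The accented letter is replaced by a six-character escape sequence.
        System.out.println(escape("a\u00F1os"));
    }
}
```

Under this reading, escaping makes sense for the inline case (a value printed into a script block) and is unnecessary for a plain-text ajax response.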

Related

Request parameter is modified in my servlet

I send a request to a servlet as a URL with data, but the servlet appears to modify the data before I read it from the request. How can I make sure the data I pass in the URL reaches the servlet unchanged?
Example: when I pass the data to the servlet
http://localhost/helloservlet/servlet/ppd.abcd.build.coupons.CouponValueFormatterServlet?dsn=frd_abc_abcde&lang=ENG&val=PRCTXT|12345 &ABCDEFG
and read it in the servlet with String abc = request.getParameter("val"), the value is truncated to "PRCTXT|12345", but it is supposed to be "PRCTXT|12345 &ABCDEFG". Please help me with this.
The servlet interprets each & in the URL as the start of a new parameter. So when it sees &ABCDEFG, it thinks you are sending a new parameter called ABCDEFG with no value.
There are two things to fix. First, when you want to send a literal &, use %26 instead. The code that divides up the parameters will skip it, and it will be converted back to a real & in the parameter's value.
Second is to replace spaces with +. Spaces in URLs work sometimes but can be problematic.
So your actual request URL should look like this:
http://localhost/helloservlet/servlet/ppd.abcd.build.coupons.CouponValueFormatterServlet?dsn=frd_abc_abcde&lang=ENG&val=PRCTXT|12345+%26ABCDEFG
If you're building these parameters in javascript, you can use encodeURIComponent() to fix all problem characters for you. So you could do something like this:
var userInput = *get some input here*
var addr = 'http://www.example.com?param1=' + encodeURIComponent(userInput);
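If the URL is built on the Java side instead, the standard library offers a comparable step. The following is a small sketch, assuming UTF-8 and a hypothetical class name; it uses java.net.URLEncoder, which encodes a space as + and & as %26, as suggested above. Note that it also encodes | as %7C, which the URL above leaves as is.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

final class QueryValueEncodingSketch {
    // Encodes a single parameter value (not a whole URL) for use in a query string.
    static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    // Reverses encode(); roughly what the container does before getParameter() returns.
    static String decode(String encoded) {
        try {
            return URLDecoder.decode(encoded, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    public static void main(String[] args) {
        String encoded = encode("PRCTXT|12345 &ABCDEFG");
        System.out.println(encoded);         // PRCTXT%7C12345+%26ABCDEFG
        System.out.println(decode(encoded)); // PRCTXT|12345 &ABCDEFG
    }
}
```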

How to replace a query string in an Apache Velocity template?

In my web application I'm trying to prevent users from inserting JavaScript in the freeText parameter when they're running a search.
To do this, I've written code in the header Velocity file to check whether the query string contains a parameter called freeText, and if so, use the replace method to replace the characters within the parameter value. However, when you load the page, it still displays the original query string - I'm unsure on how to replace the original query string with my new one which has the replaced characters.
This is my code:
#set($freeTextParameter = "$request.getParameter('freeText')")
freeTextParameter: $freeTextParameter
#if($freeTextParameter)
##Do the replacement:
#set($replacedQueryString = "$freeTextParameter.replace('confirm','replaced')")
replacedQueryString after doing the replace: $replacedQueryString
The query string now: $request.getQueryString()
The freeText parameter now: $request.getParameter('freeText')
#end
In the code above, the replacedQueryString variable has changed as expected (ie the replacement has been carried out as expected), but the $request.getQueryString() and $request.getParameter('freeText') are still the same as before, as if the replacement had never happened.
Seeing as there is a request.getParameter method which works fine for getting the parameters, I assumed there would be a request.setParameter method to do the same thing in reverse, but there isn't.
The Java String is an immutable object, which means that the replace() method will return an altered string, without changing the original one.
Since the parameters map given by the HttpServletRequest object cannot be modified, this approach doesn't work well if your templates rely on $request.getParameter('freeText').
Instead, if you use VelocityTools, you can rely on $params.freeText in your templates. You can then tune your WEB-INF/tools.xml file to make this parameters map alterable:
<?xml version="1.0"?>
<tools>
<toolbox scope="request">
<tool key="params" readOnly="false"/>
...
</toolbox>
...
</tools>
(Version 2.0+ of the tools is required).
Then, in your header, you can do:
#set($params.freeText = $params.freeText.replace('confirm','replaced'))
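The immutability point made in this answer can be seen in plain Java. This is only an illustrative sketch with a hypothetical class name and sample value: replace() hands back a new string, and the original (like the value held by the request) stays untouched unless the result is stored somewhere the template actually reads.

```java
final class ImmutableStringSketch {
    // Returns a new string; the argument itself is never modified.
    static String sanitize(String freeText) {
        return freeText.replace("confirm", "replaced");
    }

    public static void main(String[] args) {
        String original = "confirm(1)";
        String replaced = sanitize(original);
        System.out.println(original); // confirm(1)
        System.out.println(replaced); // replaced(1)
    }
}
```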
I managed to fix the issue myself. It turned out that there was another file (which gets called on every page) in which the "$!request.getParameter('freeText')" variable is used. I have updated that file so that it uses the new $!replacedQueryString variable (i.e. the one with the JavaScript stripped out) instead of the existing "$!request.getParameter('freeText')" variable. This now prevents the JavaScript from being executed on every page.
So, this is the final working code in the header Velocity file:
#set($freeTextParameter = "$!m.request.httpRequest.getParameter('freeText')")
#if($freeTextParameter)
#set($replacedQueryString = "$freeTextParameter.replace('confirm','').replace('<','').replace('>','').replace('(','').replace(')','').replace(';','').replace('/','').replace('\"','').replace('&','').replace('+','').replace('script','').replace('prompt','').replace('*','').replace('.','')")
#end

How to send special characters in request to servlets

I am using a jQuery ajax command, which has the following data:
$.ajax({
type:"POST",
...
data:"e=f_s&es="+JSON.stringify(email)+"&fr="+str
...
})
Where (email) can contain special character, for example it can be a string:
!#$%'&+-/=?^`*{|}~ch!#$%'/=?*^`{|}@mail.com
The reason why I allow such characters, is based on the following question.
The problem is that at some point on the server (a Java EE application) the request gets messed up: the special characters break the boundaries between the request parameters. For example, it treats '/ as a parameter.
I think I need to escape characters? (if yes how?)
What should I do to be able to send such a string from javascript to java ?
Use encodeURIComponent:
encodeURIComponent("!#$%'&+-/=?^`*{|}~ch!#$%'/=?*^`{|}@mail.com")
returning:
"!%23%24%25'%26%2B-%2F%3D%3F%5E%60*%7B%7C%7D~ch!%23%24%25'%2F%3D%3F*%5E%60%7B%7C%7D%40mail.com"

401 response when do a POST using scribe oauth java [duplicate]

I am trying to use Twitter OAuth and my POST requests are failing with a 401 (Invalid OAuth Request) error.
For example, if I want to post a new status update, I am sending a HTTP POST request to https://twitter.com/statuses/update.json with the following parameters -
status=Testing&oauth_version=1.0&oauth_token=xxx&
oauth_nonce=xxx&oauth_timestamp=xxx&oauth_signature=xxx&
oauth_consumer_key=xxx&in_reply_to=xxx&oauth_signature_method=HMAC-SHA1
My GET requests are all working fine. I can see on the mailing lists that a lot of people have had identical problems but I could not find a solution anywhere.
I am using the oauth.py Python library.
I just finished implementing twitter OAuth API from scratch using Java. Get and post requests work OK. You can use this page http://www.hueniverse.com/hueniverse/2008/10/beginners-gui-1.html to check signature and HTTP headers. Just enter your keys and tokens and check output. It seems twitter works exactly as described on this post. Be careful with spaces and UTF-8 symbols, for example Java encodes space as "+" but OAuth requires %20
Make sure your app access type is read & write.
On your app settings page (ex. http://twitter.com/apps/edit/12345) there's a radio button field like this:
Default Access type: Read & Write / Read-only
If you check 'Read-only' then status update API will return 401.
I second the answer by Jrgns. I had exactly the same issue. When reading the example Twitter provides, it's actually clear. However, their pseudo code is misleading. In Python this worked for me:
import urllib  # Python 2; in Python 3 the same function is urllib.parse.quote
def encodekeyval(key, val):
    key = urllib.quote(key, '')
    val = urllib.quote(val, '')
    return urllib.quote(key + '=' + val, '')
def signature_base_string(urlstr, oauthdata):
    sigstr = 'POST&' + urllib.quote(urlstr, '') + '&'
    # retrieve "post" data as dictionary of name value pairs
    pdata = oauthdata.getpdata()
    # need to sort parameters
    pstr = '%26'.join([encodekeyval(key, pdata[key]) for key in sorted(pdata.keys())])
    return sigstr + pstr
I had the same issues, until I realised that the parameters need to be encoded twice for the base string. My GET requests all worked fine, but my POSTs, particularly status updates, failed. On a hunch I tried a POST without spaces in the status parameter, and it worked.
In PHP:
function encode($input) {
return str_replace('+', ' ', str_replace('%7E', '~', rawurlencode($input)));
}
$query = array();
foreach($parameters as $name => $value) {
$query[] = encode($name) . '=' .encode($value);
}
$base = encode(strtoupper($method)) . '&' .encode($norm_url) . '&' .
encode(implode('&', $query));
Notice the encode function around the names and values of the parameters, and then around the whole query string. A Space should end up as %2520, not just %20.
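Since the question mentions Java, the same double encoding can be sketched with the standard library. This is an illustrative sketch, not scribe's or Twitter's code: the class and method names are hypothetical, and it only builds the base string (no HMAC signing). It adapts URLEncoder output to OAuth-style percent-encoding by turning + into %20, * into %2A, and %7E back into ~.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

final class OAuthBaseStringSketch {
    // Percent-encodes one value the way OAuth 1.0 expects (space as %20, not +).
    static String percentEncode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8")
                    .replace("+", "%20")
                    .replace("*", "%2A")
                    .replace("%7E", "~");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    // Builds METHOD&url&params, where names and values are encoded once on
    // their own and then again as part of the joined parameter string.
    static String baseString(String method, String url, Map<String, String> params) {
        Map<String, String> sorted = new TreeMap<String, String>();
        for (Map.Entry<String, String> e : params.entrySet()) {
            sorted.put(percentEncode(e.getKey()), percentEncode(e.getValue()));
        }
        StringBuilder query = new StringBuilder();
        for (Map.Entry<String, String> e : sorted.entrySet()) {
            if (query.length() > 0) {
                query.append('&');
            }
            query.append(e.getKey()).append('=').append(e.getValue());
        }
        return percentEncode(method.toUpperCase(Locale.ROOT)) + "&"
                + percentEncode(url) + "&"
                + percentEncode(query.toString());
    }
}
```

With a status of "Hello World", the space appears as %2520 in the resulting base string, which matches the observation above.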
I found a solution that works for me. You must add the following parameters to the request header, and it should look like the following (C# code). Do not use the & sign; instead, separate the parameters with commas (,). You must also add the word "OAuth" at the beginning.
httpWebRequest.Headers[System.Net.HttpRequestHeader.Authorization] = "OAuth oauth_consumer_key=\"hAnZFaPKxXnJqdfLhDikdw\", oauth_nonce=\"4729687\", oauth_signature_method=\"HMAC-SHA1\", oauth_timestamp=\"1284821989\", oauth_token=\"17596307-KH9iUzqTxaoa5576VjILkERgUxcqExRyXkfb8AsXy\", oauth_version=\"1.0\", oauth_signature=\"p8f5WTObefG1N9%2b8AlBji1pg18A%3d\"";
and other parameters like 'status' should be written in the body of the request.
Most likely, the signature is invalid. You must follow the OAuth spec on how to generate the signature (normalized parameters, URL encoding, and consumerSecret&oauthSecret). More on this later...

How do I correctly decode unicode parameters passed to a servlet

Suppose I have:
<a href="http://www.yahoo.com/" target="_yahoo"
title="Yahoo!™" onclick="return gateway(this);">Yahoo!</a>
<script type="text/javascript">
function gateway(lnk) {
window.open(SERVLET +
'?external_link=' + encodeURIComponent(lnk.href) +
'&external_target=' + encodeURIComponent(lnk.target) +
'&external_title=' + encodeURIComponent(lnk.title));
return false;
}
</script>
I have confirmed external_title gets encoded as Yahoo!%E2%84%A2 and passed to SERVLET. If in SERVLET I do:
Writer writer = response.getWriter();
writer.write(request.getParameter("external_title"));
I get Yahoo!â„¢ in the browser. If I manually switch the browser character encoding to UTF-8, it changes to Yahoo!™ (which is what I want).
So I figured the encoding I was sending to the browser was wrong (it was Content-type: text/html; charset=ISO-8859-1). I changed SERVLET to:
response.setContentType("text/html; charset=utf-8");
Writer writer = response.getWriter();
writer.write(request.getParameter("external_title"));
Now the browser character encoding is UTF-8, but it outputs Yahoo!⢠and I can't get the browser to render the correct character at all.
My question is: is there some combination of Content-type and/or new String(request.getParameter("external_title").getBytes(), "UTF-8"); and/or something else that will result in Yahoo!™ appearing in the SERVLET output?
You are nearly there. encodeURIComponent correctly encodes to UTF-8, which is what you should always use in a URL today.
The problem is that the submitted query string is getting mutilated on the way into your server-side script, because getParameter() uses ISO-8859-1 instead of UTF-8. This stems from Ancient Times before the web settled on UTF-8 for URI/IRI, but it's rather pathetic that the Servlet spec hasn't been updated to match reality, or at least provide a reliable, supported option for it.
(There is request.setCharacterEncoding in Servlet 2.3, but it doesn't affect query string parsing, and if a single parameter has been read before, possibly by some other framework element, it won't work at all.)
So you need to futz around with container-specific methods to get proper UTF-8, often involving stuff in server.xml. This totally sucks for distributing web apps that should work anywhere. For Tomcat see https://cwiki.apache.org/confluence/display/TOMCAT/Character+Encoding and also What's the difference between "URIEncoding" of Tomcat, Encoding Filter and request.setCharacterEncoding.
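The mutilation described above can be reproduced without a servlet container. The following is a minimal sketch with hypothetical names: it decodes the UTF-8 bytes of "™" (E2 84 A2) as ISO-8859-1, which yields three unrelated characters, and then shows that the damage is reversible because ISO-8859-1 maps every byte to exactly one char.

```java
import java.nio.charset.StandardCharsets;

final class MisdecodeSketch {
    // Simulates a container that reads UTF-8 bytes as if they were ISO-8859-1.
    static String misdecode(String text) {
        return new String(text.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);
    }

    // Recovers the original bytes and decodes them as UTF-8.
    static String repair(String garbled) {
        return new String(garbled.getBytes(StandardCharsets.ISO_8859_1), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String title = "Yahoo!\u2122";
        String garbled = misdecode(title);
        System.out.println(garbled.length());             // 9: one char became three
        System.out.println(repair(garbled).equals(title)); // true
    }
}
```

The repair step only works when the wrong decoding really was ISO-8859-1; if the container is configured for UTF-8, applying it would corrupt correct data.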
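I had the same problem and solved it by parsing request.getQueryString() myself. Split the raw string on '&' first and only then decode each piece with URLDecoder, so that an encoded %26 inside a value is not mistaken for a separator:
String[] parameters = request.getQueryString().split("&");
for (int i = 0; i < parameters.length; i++) {
    // each entry becomes a decoded "name=value" pair
    parameters[i] = URLDecoder.decode(parameters[i], "UTF-8");
}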
There is way to do it in java (no fiddling with server.xml)
Does not work:
protected static final String CHARSET_FOR_URL_ENCODING = "UTF-8";
String uname = request.getParameter("name");
System.out.println(uname);
// ÏηγÏÏÏÏη
uname = request.getQueryString();
System.out.println(uname);
// name=%CF%84%CE%B7%CE%B3%CF%81%CF%84%CF%83%CF%82%CE%B7
uname = URLDecoder.decode(request.getParameter("name"),
CHARSET_FOR_URL_ENCODING);
System.out.println(uname);
// ÏηγÏÏÏÏη // !!!!!!!!!!!!!!!!!!!!!!!!!!!
uname = URLDecoder.decode(
"name=%CF%84%CE%B7%CE%B3%CF%81%CF%84%CF%83%CF%82%CE%B7",
CHARSET_FOR_URL_ENCODING);
System.out.println("query string decoded : " + uname);
// query string decoded : name=τηγρτσςη
uname = URLDecoder.decode(new String(request.getParameter("name")
.getBytes()), CHARSET_FOR_URL_ENCODING);
System.out.println(uname);
// ÏηγÏÏÏÏη // !!!!!!!!!!!!!!!!!!!!!!!!!!!
Works :
final String name = URLDecoder
.decode(new String(request.getParameter("name").getBytes(
"iso-8859-1")), CHARSET_FOR_URL_ENCODING);
System.out.println(name);
// τηγρτσςη
This worked, but it will break if the default encoding != UTF-8. Try this instead (omit the call to decode(); it's not needed):
final String name = new String(request.getParameter("name").getBytes("iso-8859-1"),
CHARSET_FOR_URL_ENCODING);
As I said above if the server.xml is messed with as in :
<Connector connectionTimeout="20000" port="8080" protocol="HTTP/1.1"
redirectPort="8443" URIEncoding="UTF-8"/>
(notice the URIEncoding="UTF-8") the code above will break (cause the getBytes("iso-8859-1") should read getBytes("UTF-8")). So for a bullet proof solution you have to get the value of the URIEncoding attribute. This unfortunately seems to be container specific - even worse container version specific. For tomcat 7 you'd need something like :
import javax.management.AttributeNotFoundException;
import javax.management.InstanceNotFoundException;
import javax.management.MBeanException;
import javax.management.MBeanServer;
import javax.management.MBeanServerFactory;
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;
import javax.management.ReflectionException;
import javax.servlet.http.HttpServlet;
import org.apache.catalina.Server;
import org.apache.catalina.Service;
import org.apache.catalina.connector.Connector;
public class Controller extends HttpServlet {
    // ...
    static String CHARSET_FOR_URI_ENCODING; // the `URIEncoding` attribute
    static {
        MBeanServer mBeanServer = MBeanServerFactory.findMBeanServer(null).get(0);
        ObjectName name = null;
        try {
            name = new ObjectName("Catalina", "type", "Server");
        } catch (MalformedObjectNameException e1) {
            e1.printStackTrace();
        }
        Server server = null;
        try {
            server = (Server) mBeanServer.getAttribute(name, "managedResource");
        } catch (AttributeNotFoundException | InstanceNotFoundException
                | MBeanException | ReflectionException e) {
            e.printStackTrace();
        }
        Service[] services = server.findServices();
        for (Service service : services) {
            for (Connector connector : service.findConnectors()) {
                System.out.println(connector);
                String uriEncoding = connector.getURIEncoding();
                System.out.println("URIEncoding : " + uriEncoding);
                boolean use = connector.getUseBodyEncodingForURI();
                // TODO : if(use && connector.get uri enc...)
                CHARSET_FOR_URI_ENCODING = uriEncoding;
                // ProtocolHandler protocolHandler = connector
                //         .getProtocolHandler();
                // if (protocolHandler instanceof Http11Protocol
                //         || protocolHandler instanceof Http11AprProtocol
                //         || protocolHandler instanceof Http11NioProtocol) {
                //     int serverPort = connector.getPort();
                //     System.out.println("HTTP Port: " + connector.getPort());
                // }
            }
        }
    }
}
And still you need to tweak this for multiple connectors (check the commented out parts). Then you would use something like :
new String(parameter.getBytes(CHARSET_FOR_URI_ENCODING), CHARSET_FOR_URL_ENCODING);
Still this may fail (IIUC) if parameter = request.getParameter("name"); decoded with CHARSET_FOR_URI_ENCODING was corrupted so the bytes I get with getBytes() were not the original ones (that's why "iso-8859-1" is used by default - it will preserve the bytes). You can get rid of it all by manually parsing the query string in the lines of:
URLDecoder.decode(request.getQueryString().split("=")[1],
CHARSET_FOR_URL_ENCODING);
I am still looking for the place in the docs where it is mentioned that request.getParameter("name") does call URLDecoder.decode() instead of returning the %CF%84%CE%B7%CE%B3%CF%81%CF%84%CF%83%CF%82%CE%B7 string ? A link in the source would be much appreciated.
Also how can I pass as the parameter's value the string, say, %CE ? => see comment : parameter=%25CE
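The manual parsing mentioned above can be sketched a little more generally. This is a minimal example under stated assumptions: the class name is hypothetical, the raw string is what request.getQueryString() would return, UTF-8 is assumed, and a repeated parameter name simply keeps its last value. It splits on & and = before decoding, so an encoded %26 or %3D inside a value survives, and %25CE comes out as the literal text %CE.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.LinkedHashMap;
import java.util.Map;

final class QueryStringSketch {
    // Parses a raw (still percent-encoded) query string into decoded name/value pairs.
    static Map<String, String> parse(String rawQuery) {
        Map<String, String> params = new LinkedHashMap<String, String>();
        if (rawQuery == null || rawQuery.isEmpty()) {
            return params;
        }
        for (String pair : rawQuery.split("&")) {
            if (pair.isEmpty()) {
                continue;
            }
            int eq = pair.indexOf('=');
            // A pair without '=' is treated as a name with an empty value.
            String name = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            params.put(decode(name), decode(value));
        }
        return params;
    }

    private static String decode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }
}
```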
I suspect that the data mutilation happens in the request, i.e. the declared encoding of the request does not match the one that is actually used for the data.
What does request.getCharacterEncoding() return?
I don't really know how JavaScript handles encodings or how to make it use a specific one.
You need to make sure that encodings are used correctly at all stages - do NOT try to "fix" the data by using new String() and getBytes() at a point where it has already been encoded incorrectly.
Edit: It may help to have the origin page (the one with the Javascript) also encoded in UTF-8 and declared as such in its Content-Type. Then I believe Javascript may default to using UTF-8 for its request - but this is not definite knowledge, just guesswork.
You could always use javascript to manipulate the text further.
<div id="test">a</div>
<script>
var a = document.getElementById('test');
alert(a.innerHTML);
a.innerHTML = decodeURI("Yahoo!%E2%84%A2");
alert(a.innerHTML);
</script>
I think I can get the following to work:
encodeURIComponent(escape(lnk.title))
That gives me %25u2122 (for ™) or %25AE (for ®), which will decode to %u2122 and %AE respectively in the servlet.
I should then be able to turn %u2122 into '\u2122' and %AE into '\u00AE' relatively easily using (char) (base-10 integer value of %uXXXX or %XX) in a match and replace loop using regular expressions.
i.e. - match /%u([0-9a-f]{4})/i, extract the matching subexpression, convert it to base-10, turn it into a char and append it to the output, then do the same with /%([0-9a-f]{2})/i
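The match-and-replace loop described above could look roughly like this in the servlet. It is a sketch with hypothetical names, and it uses one combined pattern in a single pass rather than two separate passes, so that a decoded character is never re-scanned.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class PercentUUnescapeSketch {
    // Matches either %uXXXX (four hex digits) or %XX (two hex digits).
    private static final Pattern ESCAPE =
            Pattern.compile("%u([0-9a-fA-F]{4})|%([0-9a-fA-F]{2})");

    // Turns the output of JavaScript's escape() back into characters.
    static String unescape(String s) {
        Matcher m = ESCAPE.matcher(s);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String hex = m.group(1) != null ? m.group(1) : m.group(2);
            char c = (char) Integer.parseInt(hex, 16);
            // quoteReplacement guards against '$' or a backslash in the decoded char.
            m.appendReplacement(out, Matcher.quoteReplacement(String.valueOf(c)));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```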
There is a bug in certain versions of Jetty that makes it parse higher-numbered UTF-8 characters incorrectly. If your server accepts Arabic letters correctly but not emoji, that's a sign you have a version with this problem, since Arabic is not in ISO-8859-1 but is in the lower range of UTF-8 characters ("lower" meaning Java will represent it in a single char).
I updated from version 7.2.0.v20101020 to version 7.5.4.v20111024 and this fixed the problem; I can now use the getParameter(String) method instead of having to parse it myself.
If you're really curious, you can dig into your version of org.eclipse.jetty.util.Utf8StringBuilder.append(byte) and see whether it correctly adds multiple chars to the string when the utf-8 code is high enough or if, as in 7.2.0, it simply casts an int to a char and appends.
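The difference between the two behaviours can be illustrated in isolation. This is not Jetty's code, only a sketch with hypothetical names: casting a code point to char keeps just the low 16 bits, while Character.toChars produces the surrogate pair that characters above U+FFFF need.

```java
final class SurrogateSketch {
    // Buggy approach: the cast silently truncates code points above U+FFFF.
    static String naive(int codePoint) {
        return String.valueOf((char) codePoint);
    }

    // Correct approach: yields one char, or two (a surrogate pair) when needed.
    static String correct(int codePoint) {
        return new String(Character.toChars(codePoint));
    }

    public static void main(String[] args) {
        int arabicSheen = 0x0634; // fits in a single char
        int emoji = 0x1F600;      // needs two chars
        System.out.println(naive(arabicSheen).equals(correct(arabicSheen))); // true
        System.out.println(correct(emoji).length());                         // 2
        System.out.println(naive(emoji).equals(correct(emoji)));             // false
    }
}
```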
Thanks to everyone; I learned a lot about how the default character set is used for encoding and decoding in Tomcat and Jetty.
I solved my problem with the following method, which uses Google Guava:
String str = URLDecoder.decode(request.getQueryString(), StandardCharsets.UTF_8.name());
final Map<String, String> map = Splitter.on('&').trimResults().withKeyValueSeparator("=").split(str);
System.out.println(map);
System.out.println(map.get("aung"));
System.out.println(map.get("aa"));
