application url containing '#' throwing an exception

I have an application URL that I send to end users by email.
The URL contains a 'username' field, which can contain the '#' character.
For example, a link sent to the end user:
http://localhost:8080/my-app/someaction/activateuser/abc#def.com/somedata/
Whenever the user clicks the link above, it throws the following exception:
java.lang.IllegalArgumentException
Input string 'abc#def.com' is not valid; the character '#' at position 4 is not valid.
at org.apache.tapestry5.internal.services.URLEncoderImpl.decode(URLEncoderImpl.java:144)
at $URLEncoder_137022607d9.decode($URLEncoder_137022607d9.java)
at org.apache.tapestry5.internal.services.ContextPathEncoderImpl.decodePath(ContextPathEncoderImpl.java:92)
at $ContextPathEncoder_137022607cd.decodePath($ContextPathEncoder_137022607cd.java)
at org.apache.tapestry5.internal.services.ComponentEventLinkEncoderImpl.checkIfPage(ComponentEventLinkEncoderImpl.java:328)
at org.apache.tapestry5.internal.services.ComponentEventLinkEncoderImpl.decodePageRenderRequest(ComponentEventLinkEncoderImpl.java:307)
at org.apache.tapestry5.internal.services.linktransform.LinkTransformerInterceptor.decodePageRenderRequest(LinkTransformerInterceptor.java:68)
at $ComponentEventLinkEncoder_137022607c1.decodePageRenderRequest($ComponentEventLinkEncoder_137022607c1.java)
at org.apache.tapestry5.internal.services.PageRenderDispatcher.dispatch(PageRenderDispatcher.java:41)
at $Dispatcher_137022607c2.dispatch($Dispatcher_137022607c2.java)
at $Dispatcher_137022607bd.dispatch($Dispatcher_137022607bd.java)
at org.apache.tapestry5.services.TapestryModule$RequestHandlerTerminator.service(TapestryModule.java:321)
at org.apache.tapestry5.internal.services.RequestErrorFilter.service(RequestErrorFilter.java:26)
Is there any way to handle such a scenario, for example by encoding/decoding the URLs?

You cannot have an unencoded # in the path of a URL, because it is a reserved character: it marks the start of the fragment (the relevant RFC is RFC 3986).
You can use the URLEncoder class to encode the value into an acceptable form (# becomes %23).
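A minimal sketch of that suggestion, using only the JDK. The class and method names are made up for illustration. Note that this shows generic percent-encoding only; as the next answer points out, Tapestry applies its own URL encoding, so this alone may not be enough there.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Tapestry or the original application.
final class HashEncodingSketch {
    // Percent-encodes a single value before it is placed into a URL.
    static String encodeValue(String raw) {
        try {
            return URLEncoder.encode(raw, StandardCharsets.UTF_8.name());
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    public static void main(String[] args) {
        // '#' is replaced by its percent-encoded form %23
        System.out.println(encodeValue("abc#def.com"));
    }
}
```

Only the value is encoded, never the whole URL; otherwise the separators of the URL itself would be escaped too.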

As MiniBill has already answered, that can't work, and as Howard has added, Tapestry has its own encoder for URLs. This means that the easiest way for you to get a URL in the format that Tapestry can read is to have Tapestry create it, and then pass it to the component that sends your emails:
@Inject
private LinkSource linkSource;

@OnEvent(...)
void sendActivationEmail() {
    final Link activationLink = this.createUserActivationLink(email, otherStuff);
    this.activationEmailSender.sendWithActivationLink(email, activationLink);
}

private Link createUserActivationLink(String email, String otherStuff) {
    return linkSource.createPageRenderLink(
        "someaction/activateuser", false, email, otherStuff);
}

I was able to solve the problem by encoding my string to Base64 and decoding it on the Tapestry (Java) side. My strings contained UTF-8 encoded characters.
I modified the Base64 encoder from this answer: https://stackoverflow.com/a/40392850/5339857
function b64EncodeUnicode(str) {
return btoa(encodeURIComponent(str).replace(/%([0-9A-F]{2})/g, function(match, p1) {
return String.fromCharCode('0x' + p1);
})).replace(/\=+$/, '');
}
(I just added the .replace at the end, to remove the padding = characters that Tapestry doesn't like.)
On the Java side the decoding was a breeze (this example handles an Ajax click from JavaScript, where the Base64 encoding happens):
@OnEvent(value = "clickAjax")
Object clickAjax(String parameter) {
    // Decode explicitly as UTF-8 instead of relying on the platform default charset
    somePageProperty = new String(java.util.Base64.getDecoder().decode(parameter),
        java.nio.charset.StandardCharsets.UTF_8);
    return this;
}
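For reference, here is a small JDK-only sketch of the same round trip, to show that the standard decoder accepts input whose = padding was stripped. The class and method names are hypothetical. One caveat I am assuming rather than have tested in Tapestry: the standard Base64 alphabet can still produce + and /, which may need further handling in a URL.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper names, for illustration only.
final class Base64ParamSketch {
    // Mirrors the JavaScript side: UTF-8 bytes, standard alphabet, padding removed.
    static String encode(String text) {
        return Base64.getEncoder().withoutPadding()
                .encodeToString(text.getBytes(StandardCharsets.UTF_8));
    }

    // The basic decoder does not require the trailing '=' padding.
    static String decode(String parameter) {
        return new String(Base64.getDecoder().decode(parameter), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String encoded = encode("abc#def.com");
        System.out.println(encoded);
        System.out.println(decode(encoded));
    }
}
```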

Related

REST API JAVA Empty Path Parameters

I am new to REST APIs.
I am developing a REST API.
In the following API the parameter I take is cloud-id.
This is the API call:
@GET
@Path("{cloud-id}")
@Produces("application/json")
public Object Getall(@PathParam("cloud-id") String cloudID) {
    if (cloudID != null) {
        // return some details
    } else {
        // return something else
    }
}
Happy path:
http://example.com/sampleCloudID
This also works fine; it gives a 404 as expected:
http://example.com/(sampleCloudID)
But when I give the URI as
http://example.com/{sampleCloudID}
I get this error:
You specified too few path parameters in the request.
If the input I receive is {sampleCloudID}, I expect the service to return a 404, but I am unable to reach my resource when the path variable is wrapped in {}.
Why do curly braces give me an error while parentheses give a 404 as expected?

If you need to send special characters as part of the URL, you need to percent-encode them.
Try using http://example.com/%7BsampleCloudID%7D
This should let your resource method receive the {} characters.
This Wikipedia article should give you details.
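As a rough illustration with java.net.URI from the JDK: the single-string constructor rejects raw curly braces but accepts parentheses, while the multi-argument constructor quotes illegal characters for you. The class and method names are made up, and whether your particular client or server behaves the same way is an assumption.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper, for illustration only.
final class BraceEncodingSketch {
    // True if the string parses as a URI without any further encoding.
    static boolean isValidRawUri(String uri) {
        try {
            new URI(uri);
            return true;
        } catch (URISyntaxException e) {
            return false;
        }
    }

    // Builds an http URI; the multi-argument constructor quotes illegal path characters.
    static String buildUri(String host, String pathSegment) {
        try {
            return new URI("http", host, "/" + pathSegment, null).toASCIIString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValidRawUri("http://example.com/(sampleCloudID)"));
        System.out.println(isValidRawUri("http://example.com/{sampleCloudID}"));
        System.out.println(buildUri("example.com", "{sampleCloudID}"));
    }
}
```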

RFC 1738 states that certain characters are unsafe in URLs:
Unsafe:
Characters can be unsafe for a number of reasons. [...] Other
characters are unsafe because gateways and other transport agents are
known to sometimes modify such characters. These characters are "{",
"}", "|", "\", "^", "~", "[", "]", and "`".
All unsafe characters must always be encoded within a URL. For
example, the character "#" must be encoded within URLs even in systems
that do not normally deal with fragment or anchor identifiers, so that
if the URL is copied into another system that does use them, it will
not be necessary to change the URL encoding.
source: https://meta.stackexchange.com/questions/79057/curly-brackets-in-urls?utm_medium=organic&utm_source=google_rich_qa&utm_campaign=google_rich_qa

You could write the code this way, so that you don't get an error when the type of your path variable changes:
@GetMapping(value = "/your-path/{cloud-id}", produces = "application/json")
public Object Getall(@PathVariable(value = "cloud-id", required = false) String cloudID) {
    // do your stuff
}

Jersey: Decoding String @PathParam which contains space in it

public Response getCustomerByName(
        @PathParam("customerName") String customerName)
Problem:
I am passing customerName as: stack overflow (encoded in the URL as stack%20overflow). I want to receive it as a decoded string (stack overflow, without %20) in my Java code.
What I tried:
This works perfectly fine, but I feel it is not a generic way of doing it.
URLDecoder.decode(customerName, "UTF-8");
A more generic solution is required:
I want to make similar changes in the rest of the APIs as well, so using URLDecoder in each API is a burden. Is there any common practice I can follow to impose this decoding at the application level? (@PathParam is already decoded when I receive the request)

It should be decoded automatically; you don't need explicit decoding with URLDecoder.decode(customerName, "UTF-8");
As mentioned in the javadoc of PathParam:
The value is URL decoded unless this is disabled using the Encoded annotation.
I just verified the code below and it works as per the javadoc (on a WebLogic server):
@GET
@Produces(value = { "text/plain" })
@Path("{customerName}")
public Response getCustomerByName(@PathParam("customerName") String customerName) {
    System.out.println(customerName);
    return Response.ok().entity(customerName).type("text/plain").build();
}
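One more reason not to call URLDecoder on a value that has already been decoded: a second pass can silently change the data or fail outright. A small JDK-only sketch (the class and method names are hypothetical):

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical helper, for illustration only.
final class DoubleDecodeSketch {
    static String decodeOnce(String value) {
        try {
            return URLDecoder.decode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    public static void main(String[] args) {
        // One pass is what the framework already does for you.
        System.out.println(decodeOnce("stack%20overflow"));
        // A second pass turns a literal '+' into a space.
        String once = decodeOnce("a%2Bb");
        System.out.println(once + " -> " + decodeOnce(once));
        // A second pass on a literal '%' is rejected.
        try {
            decodeOnce(decodeOnce("100%25"));
        } catch (IllegalArgumentException e) {
            System.out.println("second decode failed: " + e.getMessage());
        }
    }
}
```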

How to encode decode between js and java?

I am working on a project that uses JavaScript as the front end and Java as the back end. My problem is that I want to pass some strings to my back-end resources using Restangular calls. If the parameter I pass contains a space, I get a 500 (Server Error) before the request reaches the back-end resource.
Let's take an example:
In JavaScript, the Restangular call:
var myRestCall = Restangular.all('myRoot/myMethod/' + myLocalPath + '/' + folderName);
myRestCall.getList().then(function(listPSTFolders) {
    // my stuff
});
In the Java resource:
@ApiOperation(value = "My Method",
        notes = "Returns My Method list",
        responseContainer = "List",
        response = List.class)
@Path("/myMethod/{myLocalPath}/{folderName}")
@GET
@Transactional
@Timed
public List myMethod(@PathParam("myLocalPath") String myLocalPath, @PathParam("folderName") String sFolderName) {
    // my stuff
}
In my example the myLocalPath parameter can contain spaces and special characters, since it can be any path:
C:\MY DRIVE\My Path One\My Path
D:\My favorite
To pass this to the back-end class from the Restangular call, I currently replace all spaces with some character. It works for me, but I don't think it is a good way to encode spaces and special characters, because the replacement character can itself be part of an existing path; replacing it back on the back end might then change the path.
EDIT: If I pass the parameter as a JSON object:
var parameterJsonPath = {};
parameterJsonPath={"myLocalPath": pathValue};
var myRestCall= Restangular.all('myRoot/myMethod/'+parameterJsonPath+'/'+folderName);
this does not make any sense, as I get: ../myRoot/myMethod/%5Bobject%20Object%5D Failed to load resource: the server responded with a status of 500 (Server Error)
And calling JSON.stringify(parameterJsonPath) just passes it as a string, which is of no use to me.
So, all in all, is there a good way to encode the special characters of a string in JS, so that on the back end I can decode the string using the same scheme that was used for encoding?

Java decoding encoded LDAP filters before transmitting? Preventing LDAP injections

I am currently properly escaping my filters, either using the Spring LDAP Filter classes, or by going through LdapEncoder.filterEncode().
At the same time, I am using WireShark to capture packets being exchanged between my local machine and the LDAP server.
And I seem to have a problem. Even if I properly escape values (which I have confirmed through debugging), they come out unescaped through the network. I have also confirmed (through debugging) that the value stays encoded all the way until it enters javax.naming.InitialContext.
Here is an example (note that I am using Spring LDAP 1.3.0, and that these happen on both Oracle JDK 6u45 and Oracle JDK 7u45).
In my own code, on the service layer, the call being made is:
String lMailAddress = (String) ldapTemplate.searchForObject("", new EqualsFilter(ldapUserSearchFilterAttribute, principal).encode(), new ContextMapper() {
    @Override
    public Object mapFromContext(Object ctx) {
        DirContextAdapter lContext = (DirContextAdapter) ctx;
        return lContext.getStringAttribute("mail");
    }});
At this point, I can confirm that the String returned by the encode() method on the filter is "(sAMAccountName=boi\2a)"
The last point I can debug the code is the following one (starts at line 229 of org.springframework.ldap.core.LdapTemplate):
SearchExecutor se = new SearchExecutor() {
public NamingEnumeration executeSearch(DirContext ctx) throws javax.naming.NamingException {
return ctx.search(base, filter, controls);
}
};
When executeSearch() is later invoked, I can also verify that the filter String contains "(sAMAccountName=boi\2a)".
I cannot debug any further, since I do not have the source code of javax.naming.* or com.sun.jndi.ldap.* (since com.sun.jndi.ldap.LdapCtx is being invoked).
However, as soon as the call returns from executeSearch(), WireShark informs me that an LDAP packet containing a searchRequest with the filter "(sAMAccountName=boi*)" has been transmitted (the * is no longer escaped).
I have used similar encoding with different methods of LdapTemplate, and those yielded the result I was expecting (I saw the encoded filter being transmitted in WireShark), but I cannot explain why, in the case I just described, the value gets decoded before being transmitted.
Please help me understand the situation. Hopefully, I am the one who does not properly understand the LDAP protocol here.
Thanks.
Disclaimer: I have posted the same question to Spring LDAP forums.
TL/DR: Why is com.sun.jndi.ldap.LdapCtx decoding LDAP encoded filters (like \2a to *) before transmitting them to the LDAP server?
Update: Tried and observed the same behavior with IBM's J9 JDK7.

Although I'm not familiar with Spring LDAP, it doesn't sound like there's necessarily a reason to be concerned. LDAP filters aren't transmitted as clear text, but rather in a binary encoding, and there is no need for escaping in this mechanism (nor would it be correct to do so).
Let's take "(sAMAccountName=boi*)" as an example. As written, this filter is a substring filter with a subInitial component of "boi". As you point out, if you want it to be an equality filter rather than a substring filter, then the string representation would have to be "(sAMAccountName=boi\2a)". However, the binary encodings for these filters don't use any escaping, but instead use an ASN.1 BER type to differentiate between substring and equality filters.
If you want "(sAMAccountName=boi*)" as a substring filter, then the encoded representation would be:
a417040e73414d4163636f756e744e616d6530058003626f69
On the other hand, if you want "(sAMAccountName=boi\2a)" as an equality filter, the encoding would be:
a316040e73414d4163636f756e744e616d650404626f692a
The full explanation of the encoding isn't something I want to get into, but the "a4" at the beginning of the first one indicates that it's a substring filter, whereas the "a3" at the beginning of the second indicates that it's an equality filter.
You should be able to verify the actual bytes sent in WireShark. It may well be that WireShark doesn't properly escape the filter when generating the string representation, but that would be an issue with WireShark itself. The directory server only gets the binary representation, and it's hard to believe that an LDAP server would misinterpret that.
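To make the two hex strings above less opaque, here is a small sketch that rebuilds them by hand as tag-length-value triples. It is not how JNDI or any LDAP library does it internally; the class and method names are invented, and it only handles short-form lengths (under 128 bytes), which is enough for these examples.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hand-rolled BER sketch for illustration; real code should use an LDAP library.
final class LdapFilterBerSketch {
    // Encodes tag + short-form length + concatenated contents.
    static byte[] tlv(int tag, byte[]... parts) {
        ByteArrayOutputStream body = new ByteArrayOutputStream();
        for (byte[] part : parts) {
            body.write(part, 0, part.length);
        }
        if (body.size() > 127) {
            throw new IllegalArgumentException("only short-form lengths are supported");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(tag);
        out.write(body.size());
        out.write(body.toByteArray(), 0, body.size());
        return out.toByteArray();
    }

    static byte[] octets(String s) {
        return tlv(0x04, s.getBytes(StandardCharsets.UTF_8)); // OCTET STRING
    }

    // 0xA3 = equalityMatch: attribute and value, with no escaping of '*'.
    static byte[] equality(String attribute, String value) {
        return tlv(0xA3, octets(attribute), octets(value));
    }

    // 0xA4 = substrings: attribute plus a sequence holding one "initial" (0x80) piece.
    static byte[] substringInitial(String attribute, String initial) {
        byte[] piece = tlv(0x80, initial.getBytes(StandardCharsets.UTF_8));
        return tlv(0xA4, octets(attribute), tlv(0x30, piece));
    }

    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(hex(substringInitial("sAMAccountName", "boi")));
        System.out.println(hex(equality("sAMAccountName", "boi*")));
    }
}
```

The literal * in the equality filter goes on the wire as the plain byte 2a; the filter type is carried by the leading tag, not by escaping.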

OWASP suggests encoding strings for searches:
public static final String escapeLDAPSearchFilter(String filter) {
    StringBuffer sb = new StringBuffer(); // If using JDK >= 1.5 consider using StringBuilder
    for (int i = 0; i < filter.length(); i++) {
        char curChar = filter.charAt(i);
        switch (curChar) {
            case '\\':
                sb.append("\\5c");
                break;
            case '*':
                sb.append("\\2a");
                break;
            case '(':
                sb.append("\\28");
                break;
            case ')':
                sb.append("\\29");
                break;
            case '\u0000':
                sb.append("\\00");
                break;
            default:
                sb.append(curChar);
        }
    }
    return sb.toString();
}
DN strings are escaped differently. See the link below.
https://www.owasp.org/index.php/Preventing_LDAP_Injection_in_Java

The best way is to use the parameterized filter search method, so that the parameters are properly encoded.
See https://docs.oracle.com/javase/jndi/tutorial/ldap/search/search.html
// Perform the search
NamingEnumeration answer = ctx.search("ou=NewHires",
"(&(mySpecialKey={0}) (cn=*{1}))", // Filter expression
new Object[]{key, name}, // Filter arguments
null); // Default search controls

How do I correctly decode unicode parameters passed to a servlet

Suppose I have:
<a href="http://www.yahoo.com/" target="_yahoo"
title="Yahoo!™" onclick="return gateway(this);">Yahoo!</a>
<script type="text/javascript">
function gateway(lnk) {
window.open(SERVLET +
'?external_link=' + encodeURIComponent(lnk.href) +
'&external_target=' + encodeURIComponent(lnk.target) +
'&external_title=' + encodeURIComponent(lnk.title));
return false;
}
</script>
I have confirmed external_title gets encoded as Yahoo!%E2%84%A2 and passed to SERVLET. If in SERVLET I do:
Writer writer = response.getWriter();
writer.write(request.getParameter("external_title"));
I get Yahoo!â„¢ in the browser. If I manually switch the browser character encoding to UTF-8, it changes to Yahoo!™ (which is what I want).
So I figured the encoding I was sending to the browser was wrong (it was Content-type: text/html; charset=ISO-8859-1). I changed SERVLET to:
response.setContentType("text/html; charset=utf-8");
Writer writer = response.getWriter();
writer.write(request.getParameter("external_title"));
Now the browser character encoding is UTF-8, but it outputs Yahoo!â¢ and I can't get the browser to render the correct character at all.
My question is: is there some combination of Content-type and/or new String(request.getParameter("external_title").getBytes(), "UTF-8"); and/or something else that will result in Yahoo!™ appearing in the SERVLET output?

You are nearly there. encodeURIComponent correctly encodes to UTF-8, which is what you should always use in a URL today.
The problem is that the submitted query string is getting mutilated on the way into your server-side script, because getParameter() uses ISO-8859-1 instead of UTF-8. This stems from Ancient Times before the web settled on UTF-8 for URI/IRI, but it's rather pathetic that the Servlet spec hasn't been updated to match reality, or at least provide a reliable, supported option for it.
(There is request.setCharacterEncoding in Servlet 2.3, but it doesn't affect query string parsing, and if a single parameter has been read before, possibly by some other framework element, it won't work at all.)
So you need to futz around with container-specific methods to get proper UTF-8, often involving stuff in server.xml. This totally sucks for distributing web apps that should work anywhere. For Tomcat see https://cwiki.apache.org/confluence/display/TOMCAT/Character+Encoding and also What's the difference between "URIEncoding" of Tomcat, Encoding Filter and request.setCharacterEncoding.
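To see the mutilation in isolation, here is a small JDK-only sketch (class and method names are made up). It decodes the UTF-8 bytes of the trademark sign as ISO-8859-1, which is what the container is assumed to do by default, and then shows that the original bytes can still be recovered, because ISO-8859-1 maps every byte to exactly one char.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only.
final class MojibakeSketch {
    // What happens when UTF-8 bytes are read as ISO-8859-1.
    static String misdecode(String original) {
        return new String(original.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);
    }

    // Reverses the mistake; only valid if the value really was decoded as ISO-8859-1.
    static String repair(String garbled) {
        return new String(garbled.getBytes(StandardCharsets.ISO_8859_1), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String title = "Yahoo!\u2122"; // U+2122 is the trademark sign, E2 84 A2 in UTF-8
        String garbled = misdecode(title);
        System.out.println(garbled.length());              // three chars where one was expected
        System.out.println(repair(garbled).equals(title)); // the original comes back
    }
}
```

Treat repair() as a stopgap: it breaks as soon as the container is configured to decode as UTF-8, so fixing the container configuration is the more robust route.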

I had the same problem and solved it by decoding request.getQueryString() with URLDecoder and then extracting my parameters.
String[] parameters = URLDecoder.decode(request.getQueryString(), "UTF-8")
        .split("&");
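One caveat with decoding the whole query string first: if a value contains an encoded & or = (%26, %3D), splitting afterwards will cut it in the wrong place. A possible variant that splits first and decodes each piece separately is sketched below; the class and method names are mine, and repeated parameter names simply keep the last value.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, for illustration only.
final class QueryStringSketch {
    static Map<String, String> parse(String rawQuery) {
        Map<String, String> params = new LinkedHashMap<>();
        if (rawQuery == null || rawQuery.isEmpty()) {
            return params;
        }
        // Split on the raw separators first, then decode each key and value.
        for (String pair : rawQuery.split("&")) {
            if (pair.isEmpty()) {
                continue;
            }
            int eq = pair.indexOf('=');
            String key = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            params.put(decode(key), decode(value));
        }
        return params;
    }

    private static String decode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    public static void main(String[] args) {
        System.out.println(parse("external_title=Yahoo!%E2%84%A2&q=a%26b%3Dc"));
    }
}
```

In a servlet you would pass request.getQueryString() to parse().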

There is a way to do it in Java (no fiddling with server.xml).
Does not work:
protected static final String CHARSET_FOR_URL_ENCODING = "UTF-8";
String uname = request.getParameter("name");
System.out.println(uname);
// ÏÎ·Î³ÏÏÏÏÎ·
uname = request.getQueryString();
System.out.println(uname);
// name=%CF%84%CE%B7%CE%B3%CF%81%CF%84%CF%83%CF%82%CE%B7
uname = URLDecoder.decode(request.getParameter("name"),
CHARSET_FOR_URL_ENCODING);
System.out.println(uname);
// ÏÎ·Î³ÏÏÏÏÎ· // !!!!!!!!!!!!!!!!!!!!!!!!!!!
uname = URLDecoder.decode(
"name=%CF%84%CE%B7%CE%B3%CF%81%CF%84%CF%83%CF%82%CE%B7",
CHARSET_FOR_URL_ENCODING);
System.out.println("query string decoded : " + uname);
// query string decoded : name=τηγρτσςη
uname = URLDecoder.decode(new String(request.getParameter("name")
.getBytes()), CHARSET_FOR_URL_ENCODING);
System.out.println(uname);
// ÏÎ·Î³ÏÏÏÏÎ· // !!!!!!!!!!!!!!!!!!!!!!!!!!!
Works :
final String name = URLDecoder
.decode(new String(request.getParameter("name").getBytes(
"iso-8859-1")), CHARSET_FOR_URL_ENCODING);
System.out.println(name);
// τηγρτσςη
This worked, but it will break if the default encoding != UTF-8. Try this instead (omit the call to decode(); it's not needed):
final String name = new String(request.getParameter("name").getBytes("iso-8859-1"),
CHARSET_FOR_URL_ENCODING);
As I said above if the server.xml is messed with as in :
<Connector connectionTimeout="20000" port="8080" protocol="HTTP/1.1"
redirectPort="8443" URIEncoding="UTF-8"/>
(notice the URIEncoding="UTF-8") the code above will break (cause the getBytes("iso-8859-1") should read getBytes("UTF-8")). So for a bullet proof solution you have to get the value of the URIEncoding attribute. This unfortunately seems to be container specific - even worse container version specific. For tomcat 7 you'd need something like :
import javax.management.AttributeNotFoundException;
import javax.management.InstanceNotFoundException;
import javax.management.MBeanException;
import javax.management.MBeanServer;
import javax.management.MBeanServerFactory;
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;
import javax.management.ReflectionException;
import org.apache.catalina.Server;
import org.apache.catalina.Service;
import org.apache.catalina.connector.Connector;
public class Controller extends HttpServlet {
    // ...
    static String CHARSET_FOR_URI_ENCODING; // the `URIEncoding` attribute
    static {
        MBeanServer mBeanServer = MBeanServerFactory.findMBeanServer(null).get(
            0);
        ObjectName name = null;
        try {
            name = new ObjectName("Catalina", "type", "Server");
        } catch (MalformedObjectNameException e1) {
            e1.printStackTrace();
        }
        Server server = null;
        try {
            server = (Server) mBeanServer.getAttribute(name, "managedResource");
        } catch (AttributeNotFoundException | InstanceNotFoundException
                | MBeanException | ReflectionException e) {
            e.printStackTrace();
        }
        Service[] services = server.findServices();
        for (Service service : services) {
            for (Connector connector : service.findConnectors()) {
                System.out.println(connector);
                String uriEncoding = connector.getURIEncoding();
                System.out.println("URIEncoding : " + uriEncoding);
                boolean use = connector.getUseBodyEncodingForURI();
                // TODO : if(use && connector.get uri enc...)
                CHARSET_FOR_URI_ENCODING = uriEncoding;
                // ProtocolHandler protocolHandler = connector
                // .getProtocolHandler();
                // if (protocolHandler instanceof Http11Protocol
                // || protocolHandler instanceof Http11AprProtocol
                // || protocolHandler instanceof Http11NioProtocol) {
                // int serverPort = connector.getPort();
                // System.out.println("HTTP Port: " + connector.getPort());
                // }
            }
        }
    }
}
And still you need to tweak this for multiple connectors (check the commented out parts). Then you would use something like :
new String(parameter.getBytes(CHARSET_FOR_URI_ENCODING), CHARSET_FOR_URL_ENCODING);
Still this may fail (IIUC) if parameter = request.getParameter("name"); decoded with CHARSET_FOR_URI_ENCODING was corrupted so the bytes I get with getBytes() were not the original ones (that's why "iso-8859-1" is used by default - it will preserve the bytes). You can get rid of it all by manually parsing the query string in the lines of:
URLDecoder.decode(request.getQueryString().split("=")[1],
CHARSET_FOR_URL_ENCODING);
I am still looking for the place in the docs where it is mentioned that request.getParameter("name") does call URLDecoder.decode() instead of returning the %CF%84%CE%B7%CE%B3%CF%81%CF%84%CF%83%CF%82%CE%B7 string ? A link in the source would be much appreciated.
Also how can I pass as the parameter's value the string, say, %CE ? => see comment : parameter=%25CE

I suspect that the data mutilation happens in the request, i.e. the declared encoding of the request does not match the one that is actually used for the data.
What does request.getCharacterEncoding() return?
I don't really know how JavaScript handles encodings or how to make it use a specific one.
You need to make sure that encodings are used correctly at all stages - do NOT try to "fix" the data by using new String() and getBytes() at a point where it has already been decoded incorrectly.
Edit: It may help to have the origin page (the one with the Javascript) also encoded in UTF-8 and declared as such in its Content-Type. Then I believe Javascript may default to using UTF-8 for its request - but this is not definite knowledge, just guesswork.

You could always use javascript to manipulate the text further.
<div id="test">a</div>
<script>
var a = document.getElementById('test');
alert(a.innerHTML);
a.innerHTML = decodeURI("Yahoo!%E2%84%A2");
alert(a.innerHTML);
</script>

I think I can get the following to work:
encodeURIComponent(escape(lnk.title))
That gives me %25u2122 (for ™) or %25AE (for ®), which will decode to %u2122 and %AE respectively in the servlet.
I should then be able to turn %u2122 into '\u2122' and %AE into '\u00AE' relatively easily using (char) (base-10 integer value of %uXXXX or %XX) in a match and replace loop using regular expressions.
i.e. - match /%u([0-9a-f]{4})/i, extract the matching subexpression, convert it to base-10, turn it into a char and append it to the output, then do the same with /%([0-9a-f]{2})/i
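A rough sketch of that match-and-replace loop in Java, using a single pattern with both alternatives so that the text is scanned only once. The class and method names are invented, and it assumes the input is the already URL-decoded output of JavaScript's escape().

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only.
final class UnescapeSketch {
    // %uXXXX (four hex digits) or %XX (two hex digits), case-insensitive.
    private static final Pattern ESCAPE =
            Pattern.compile("%u([0-9a-fA-F]{4})|%([0-9a-fA-F]{2})");

    static String unescape(String input) {
        Matcher m = ESCAPE.matcher(input);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String hex = m.group(1) != null ? m.group(1) : m.group(2);
            char decoded = (char) Integer.parseInt(hex, 16);
            // quoteReplacement keeps '$' and '\' in the decoded char from being treated specially
            m.appendReplacement(out, Matcher.quoteReplacement(String.valueOf(decoded)));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(unescape("Yahoo!%u2122").equals("Yahoo!\u2122"));
        System.out.println(unescape("%AE").equals("\u00AE"));
    }
}
```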

There is a bug in certain versions of Jetty that makes it parse higher-numbered UTF-8 characters incorrectly. If your server accepts Arabic letters correctly but not emoji, that's a sign you have a version with this problem: Arabic is not in ISO-8859-1, but it is in the lower range of Unicode ("lower" meaning Java represents each character in a single char).
I updated from version 7.2.0.v20101020 to version 7.5.4.v20111024 and this fixed the problem; I can now use the getParameter(String) method instead of having to parse it myself.
If you're really curious, you can dig into your version of org.eclipse.jetty.util.Utf8StringBuilder.append(byte) and see whether it correctly adds multiple chars to the string when the utf-8 code is high enough or if, as in 7.2.0, it simply casts an int to a char and appends.
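The difference between the two behaviours can be reproduced without Jetty. This is only an illustration of the int-to-char cast described above, not Jetty's actual code; the class and method names are made up.

```java
// Hypothetical helper, for illustration only.
final class CodePointAppendSketch {
    // The buggy approach: the cast keeps only the low 16 bits of the code point.
    static String appendByCast(int codePoint) {
        return new StringBuilder().append((char) codePoint).toString();
    }

    // The correct approach: code points above U+FFFF become a surrogate pair.
    static String appendCodePoint(int codePoint) {
        return new StringBuilder().appendCodePoint(codePoint).toString();
    }

    public static void main(String[] args) {
        int arabicAin = 0x0639;  // fits in a single char
        int emoji = 0x1F600;     // needs two chars in Java
        System.out.println(appendByCast(arabicAin).equals(appendCodePoint(arabicAin)));
        System.out.println(appendByCast(emoji).equals(appendCodePoint(emoji)));
        System.out.println(appendCodePoint(emoji).length());
    }
}
```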

Thanks, everyone; I learned a lot about how Tomcat and Jetty handle the default character set when encoding and decoding.
I use this approach, with Google Guava, to solve my problem:
String str = URLDecoder.decode(request.getQueryString(), StandardCharsets.UTF_8.name());
final Map<String, String> map = Splitter.on('&').trimResults().withKeyValueSeparator("=").split(str);
System.out.println(map);
System.out.println(map.get("aung"));
System.out.println(map.get("aa"));
