Getting unescaped servlet path from a HttpRequest object - java

I have a custom proxy servlet that has to deal with URLs that contain special characters (e.g. ; , . /) in their path. This is because it is a RESTful application that has ugly path parameters by design. (Please don't comment on that; the design is not mine.)
My client (actually wget, because browsers tend to display the URL unescaped) sends a request to this URL:
http://localhost:8080/MyApplication/proxy/foo/ugly%3Apart%2Fcomes%3Bhere/children
//note: %2F = '/', %3A = ':', %3B = ';'
In my servlet (mapped to /proxy/*), when I try to forward the GET request, I am unable to reconstruct it because HttpServletRequest.getPathInfo() hands me the URL already unescaped:
http://localhost:8080/MyApplication/proxy/foo/ugly:part/comes;here/children
As a result, the information about which / and ; characters were originally escaped is lost. That makes a difference for me: for example, an unescaped ; turns my URL into a so-called matrix URL (see http://www.w3.org/DesignIssues/MatrixURIs.html), and unescaped slashes shift all the REST path parameters.
Actually I found this issue on a Glassfish server, so I'm not sure if different application servers treat this differently or not. I found only this in the Servlet API:
getPathInfo() Returns any extra path information associated with the
URL the client sent when it made this request.
How can I get the original request URL exactly as the client sent it, with its escapes intact?

Have a look at HttpServletRequest's getRequestURI() and getRequestURL() methods.
If you need to remove context and servlet mappings, look at getContextPath() and getServletPath().
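As a minimal sketch of that suggestion: the raw, still-encoded extra path can be obtained by cutting the context path and servlet path off the front of the value returned by getRequestURI(). The class and method names below are hypothetical, and the sketch assumes the servlet mapping itself contains no escaped characters (getServletPath() is decoded by the container, so the prefix would not match otherwise) and that no ;jsessionid parameter has been appended.

```java
/** Sketch: recover the still-encoded extra path from the raw request URI. */
final class RawPathInfo {

    /**
     * @param requestUri  what getRequestURI() returned (not decoded by the container)
     * @param contextPath what getContextPath() returned, e.g. "/MyApplication"
     * @param servletPath what getServletPath() returned, e.g. "/proxy"
     * @return the part of the URI after the context and servlet paths, escapes intact
     */
    static String rawPathInfo(String requestUri, String contextPath, String servletPath) {
        String prefix = contextPath + servletPath;
        if (!requestUri.startsWith(prefix)) {
            // Happens e.g. when the mapping contains escaped characters.
            throw new IllegalArgumentException("URI does not start with " + prefix);
        }
        return requestUri.substring(prefix.length());
    }

    public static void main(String[] args) {
        System.out.println(rawPathInfo(
                "/MyApplication/proxy/foo/ugly%3Apart%2Fcomes%3Bhere/children",
                "/MyApplication", "/proxy"));
    }
}
```

Inside the servlet, the three arguments would come from request.getRequestURI(), request.getContextPath() and request.getServletPath().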

Related

URL Path Parameter Encoding

I'm struggling with path param encoding with retrofit:
http://localhost:8080/nuxeo/api/v1 is my base url.
I have this call:
@GET("path/{documentPath}")
Call<Document> fetchDocumentByPath(@Path("documentPath") String docPath);
As param, I'm setting the following: default-domain/blabla
I run the query against my tomcat app and I get this answer
Response{protocol=http/1.1, code=400, message=Bad Request, url=http://localhost:8080/nuxeo/api/v1/path/default-domain%2Fblabla}
Even if I set encoded = true to say "don't encode my parameter, it's already encoded", it still gets encoded.
Moreover, in retrofit, this test retrofit2.RequestBuilderTest#getWithEncodedPathParam doesn't work if we put Request request = buildRequest(Example.class, "po/ng"); with the following assertion: assertThat(request.url().toString()).isEqualTo("http://example.com/foo/bar/po/ng/");
Tomcat has tightened its URL validation for security reasons: http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2007-0450.
So I'd like to send '/' directly in my path parameter without encoding it in %2F. How can I achieve it?
Thank you!
Since parent-2.0.0-beta4, the parameter of the @Path annotation works properly.
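When the value is passed with encoded = true, the caller becomes responsible for escaping it. A minimal sketch, assuming the goal is to escape each segment on its own while leaving the '/' separators untouched; the class and method names are hypothetical, and URLEncoder is a form encoder, so this only approximates path-segment encoding:

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

/** Sketch: percent-encode every path segment but keep the '/' separators. */
final class PathSegments {

    static String encodeKeepingSlashes(String path) {
        // limit -1 keeps empty trailing segments, so "a/b/" stays "a/b/"
        String[] segments = path.split("/", -1);
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < segments.length; i++) {
            if (i > 0) {
                out.append('/');
            }
            // URLEncoder writes a space as '+'; a path needs %20 instead.
            // A literal '+' in the input was already turned into %2B.
            out.append(URLEncoder.encode(segments[i], StandardCharsets.UTF_8)
                    .replace("+", "%20"));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(encodeKeepingSlashes("default-domain/my doc"));
    }
}
```

The result would then be handed to the call whose parameter is declared with encoded = true, so that it is not escaped a second time.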

Why is the forward slash mandatory when forwarding from servlet to JSP?

When using a request dispatcher in a servlet to forward to a JSP, why must the JSP path start with a forward slash, as in the following:
getServletContext().getRequestDispatcher("/foo.jsp").forward(request, response);
If I use it without the forward slash I get an exception in Tomcat.
But when I use the request dispatcher to forward to servlets, I can omit the forward slash. The code fragment below works fine provided there is a servlet mapped to the URL pattern:
getServletContext().getRequestDispatcher("bar").forward(request, response);
I know that the / means the root of the web app, but why is it required only for JSPs and not for servlets? Servlets also belong to a particular web app.
You are mistaken in believing that the forward slash always means the root of the web app. It can sometimes mean that, but not always.
The forward slash '/' character is used in JSPs within links, e.g.:
<a href="/foo/stuff.htm">Stuff page</a>
and in servlets by the RequestDispatcher.forward() and HttpServletResponse.sendRedirect() methods for URL redirections.
It only has an effect when applied at the beginning of your redirection URL.
Here are the rules for how it is interpreted behind the scenes by your application server:
First of all: please be aware that the path part of the redirection address is always CASE SENSITIVE (host names are not). See my comments in the example code below for an illustration of what will work and what will fail.
If the redirection commences with 'http://', the ABSOLUTE path as specified, will be used in the redirection.
Otherwise your redirection URL will be applied as a relative URL.
If the redirection URL commences with a forward slash character '/', your application server is instructed to construct a URL RELATIVE to the web container!
For example: relative to localhost:8080
So the command...
response.sendRedirect("/foo/stuff.htm")
from inside a servlet, or
<a href="/foo/stuff.htm">Stuff page</a>
from inside a JSP, will take you to
localhost:8080/foo/stuff.htm.
The ABSENCE of a forward slash at the beginning of your redirection URL (together with the absence of a protocol signature) instructs the app server to construct its URL relative to the ORIGINAL REQUESTED URL! That is, the URL typed into the browser by a user on the client side.
It is important to be aware that this constructed URL is neither relative to the domain nor relative to the web container!
Once again: the URL constructed by the application server will be relative to the original URL requested by the client!
So for example: if a client provides the URL
http://www.example.com/level1/level2/stuff.htm
then the command...
response.sendRedirect("foo/stuff.htm")
from within a servlet, or
<a href="foo/stuff.htm">Stuff page</a>
from within a JSP, will redirect you to
http://www.example.com/level1/level2/foo/stuff.htm
// WILL NOT WORK! Reason: Case sensitivity.
response.sendRedirect("/teluskolearnings/login.jsp");
// WILL WORK! Reason: Case sensitivity.
response.sendRedirect("/TeluskoLearnings/login.jsp");
// Will redirect to localhost:8080/login.jsp as the forward slash tells app
// server to build the url RELATIVE TO THE APP SERVER.
// So you will be pointed to 'http://localhost:8080/login.jsp'.
// This is not what we want.
response.sendRedirect("/login.jsp");
// Will redirect to localhost:8080/TeluskoLearnings/login.jsp
// as the ABSENCE of forward slash tells app server to build the url
// RELATIVE TO THE URL!
// So you will be pointed to
// 'http://localhost:8080/TeluskoLearnings/login.jsp'.
// This IS what we want.
response.sendRedirect("login.jsp");
// Will redirect to localhost:8080/TeluskoLearnings/foo/login.jsp
// (you can see the redirection in the address bar, even if you get a
// 404 - page not found) as the ABSENCE of forward slash (at the start) tells
// app server to build the URL RELATIVE TO THE REQUESTED URL!
// This also means that if the user entered
// 'http://localhost:8080/TeluskoLearnings/level1/level2/stuff'
// he will be pointed to
// 'http://localhost:8080/TeluskoLearnings/level1/level2/foo/login.jsp'
// (provided of course, that "/level1/level2/stuff" is captured inside the
// urlPatterns parameter of the @WebServlet() annotation).
response.sendRedirect("foo/login.jsp");
SEE: https://www.safaribooksonline.com/library/view/head-first-servlets/9780596516680/ch04s27.html
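The resolution rules described above can be reproduced outside a container with java.net.URI.resolve(), which applies standard relative-reference resolution. This is only a sketch to illustrate the rules, not what any particular application server runs internally; the class and method names are hypothetical.

```java
import java.net.URI;

/** Sketch: how a redirect location resolves against the originally requested URL. */
final class RedirectResolution {

    static String resolve(String requestedUrl, String location) {
        // Absolute locations come back unchanged; "/x" resolves against the
        // host, and "x" resolves against the directory of the requested URL.
        return URI.create(requestedUrl).resolve(location).toString();
    }

    public static void main(String[] args) {
        String requested = "http://www.example.com/level1/level2/stuff.htm";
        System.out.println(resolve(requested, "/foo/stuff.htm"));
        System.out.println(resolve(requested, "foo/stuff.htm"));
        System.out.println(resolve(requested, "http://localhost:8080/login.jsp"));
    }
}
```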
All servlet classes live in the same place in the container, but compiled JSPs go to a different place.
SERVLET LOCATION AFTER DEPLOYMENT
You will find all the servlets here
apache-tomcat-7.0.55\webapps\Test\WEB-INF\classes\com\it\servlet
MainController.class
ABC.class
JSP LOCATION AFTER DEPLOYMENT
apache-tomcat-7.0.55\work\Catalina\localhost\Test\org\apache\jsp
apiTester_jsp.class

How to correctly redirect a URL which does NOT start with HTTP or HTTPS protocols?

I am overriding the HttpServletResponseWrapper.sendRedirect() method. Normally, a redirect URL starts with http or https, but we do encounter some URLs like this:
//www.google.com.
This URL works when you assign it to window.location in JavaScript. However, it fails when we try to redirect to it, because it is always treated as a relative path.
Do you know how to correctly redirect a URL like this?
You could rely on the following.
http://docs.oracle.com/javaee/1.2.1/api/javax/servlet/ServletRequest.html#getScheme()
This will tell you whether it's http or https.
For handling relative URLs (i.e. ones which don't specify a scheme or host), you need to copy the missing parts from the ServletRequest that triggered the processing. The Java class URL has a helper constructor for that:
// getRequestURL() returns a StringBuffer, so convert it to a String first
URL requestURL = new URL( request.getRequestURL().toString() );
URL redirectURL = new URL( requestURL, "//www.google.com" );
Referring to this question, link without protocol would use the current protocol by default.
Thus you may simply use the protocol of current page.
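A minimal sketch of that last idea, assuming the scheme string comes from request.getScheme(); the class and method names are hypothetical. It only handles the scheme-relative "//host/..." form and leaves every other target untouched.

```java
/** Sketch: give a scheme-relative URL ("//host/path") the scheme of the current request. */
final class SchemeRelativeRedirect {

    /**
     * @param scheme the scheme of the current request, e.g. "http" or "https"
     * @param target the redirect target as passed to sendRedirect()
     */
    static String toAbsolute(String scheme, String target) {
        if (target.startsWith("//")) {
            // Scheme-relative reference: reuse the current request's scheme.
            return scheme + ":" + target;
        }
        return target; // absolute or path-relative targets are left alone
    }

    public static void main(String[] args) {
        System.out.println(toAbsolute("https", "//www.google.com"));
    }
}
```

In an overridden sendRedirect(), the converted value would be passed on to super.sendRedirect().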

What's the difference between getRequestURI and getPathInfo methods in HttpServletRequest?

I'm making a simple, very lightweight front-controller. I need to match request paths to different handlers (actions) in order to choose the correct one.
On my local machine HttpServletRequest.getPathInfo() and HttpServletRequest.getRequestURI() return the same results. But I'm not sure what they will return in the production environment.
So, what's the difference between these methods, and which one should I choose?
I will put a small comparison table here (just to have it somewhere):
Servlet is mapped as /test%3F/* and the application is deployed under /app.
http://30thh.loc:8480/app/test%3F/a%3F+b;jsessionid=S%3F+ID?p+1=c+d&p+2=e+f#a
Method                    URL-Decoded   Result
------------------------------------------------------------------------------------------
getContextPath()          no            /app
getLocalAddr()                          127.0.0.1
getLocalName()                          30thh.loc
getLocalPort()                          8480
getMethod()                             GET
getPathInfo()             yes           /a?+b
getProtocol()                           HTTP/1.1
getQueryString()          no            p+1=c+d&p+2=e+f
getRequestedSessionId()   no            S%3F+ID
getRequestURI()           no            /app/test%3F/a%3F+b;jsessionid=S+ID
getRequestURL()           no            http://30thh.loc:8480/app/test%3F/a%3F+b;jsessionid=S+ID
getScheme()                             http
getServerName()                         30thh.loc
getServerPort()                         8480
getServletPath()          yes           /test?
getParameterNames()       yes           [p 2, p 1]
getParameter("p 1")       yes           c d
In the example above the server is running on the localhost:8480 and the name 30thh.loc was put into OS hosts file.
Comments
"+" is handled as space only in the query string
Anchor "#a" is not transferred to the server. Only the browser can work with it.
If the url-pattern in the servlet mapping does not end with * (for example /test or *.jsp), getPathInfo() returns null.
If Spring MVC is used
Method getPathInfo() returns null.
Method getServletPath() returns the part between the context path and the session ID. In the example above the value would be /test?/a?+b
Be careful with URL-encoded parts of @RequestMapping and @RequestParam in Spring. It is buggy (current version 3.2.4) and usually does not work as expected.
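The decoded versus raw distinction in the table, and the remark that "+" means a space only in the query string, can be observed with plain JDK classes. The sketch below uses java.net.URI and java.net.URLDecoder rather than the servlet API, so it illustrates the decoding rules and not any container's behaviour; the class and method names are hypothetical.

```java
import java.net.URI;
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

/** Sketch: raw vs. decoded views of a path, and '+' handling in query values. */
final class RawVersusDecoded {

    static String rawPath(String url) {
        return URI.create(url).getRawPath();   // escapes left intact
    }

    static String decodedPath(String url) {
        return URI.create(url).getPath();      // %XX decoded, '+' left alone
    }

    static String decodeQueryValue(String value) {
        // Form decoding: here '+' does become a space.
        return URLDecoder.decode(value, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String url = "http://30thh.loc:8480/app/test%3F/a%3F+b?p+1=c+d";
        System.out.println(rawPath(url));          // /app/test%3F/a%3F+b
        System.out.println(decodedPath(url));      // /app/test?/a?+b
        System.out.println(decodeQueryValue("c+d")); // c d
    }
}
```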
getPathInfo() gives the extra path information after the URI used to access your servlet, whereas getRequestURI() gives the complete URI.
I would have thought they would be different, given a Servlet must be configured with its own URI pattern in the first place; I don't think I've ever served a Servlet from root (/).
For example if Servlet 'Foo' is mapped to URI '/foo' then I would have thought the URI:
/foo/path/to/resource
Would result in:
RequestURI = /foo/path/to/resource
and
PathInfo = /path/to/resource
Let's break down the full URL that a client would type into their address bar to reach your servlet:
http://www.example.com:80/awesome-application/path/to/servlet/path/info?a=1&b=2#boo
The parts are:
1. scheme: http
2. hostname: www.example.com
3. port: 80
4. context path: awesome-application
5. servlet path: path/to/servlet
6. path info: path/info
7. query: a=1&b=2
8. fragment: boo
The request URI (returned by getRequestURI) corresponds to parts 4, 5 and 6.
(incidentally, even though you're not asking for this, the method getRequestURL would give you parts 1, 2, 3, 4, 5 and 6).
Now:
part 4 (the context path) is used to select your particular application out of many other applications that may be running in the server
part 5 (the servlet path) is used to select a particular servlet out of many other servlets that may be bundled in your application's WAR
part 6 (the path info) is interpreted by your servlet's logic (e.g. it may point to some resource controlled by your servlet).
part 7 (the query) is also made available to your servlet using getQueryString
part 8 (the fragment) is not even sent to the server and is relevant and known only to the client
The following always holds (except for URL encoding differences):
requestURI = contextPath + servletPath + pathInfo
The following example from the Servlet 3.0 specification is very helpful:
(The example was posted as an image and is not reproduced here.)
Consider the following servlet conf:
<servlet>
<servlet-name>NewServlet</servlet-name>
<servlet-class>NewServlet</servlet-class>
</servlet>
<servlet-mapping>
<servlet-name>NewServlet</servlet-name>
<url-pattern>/NewServlet/*</url-pattern>
</servlet-mapping>
Now, when I hit the URL http://localhost:8084/JSPTemp1/NewServlet/jhi, it will invoke NewServlet as it is mapped with the pattern described above.
Here:
getRequestURI() = /JSPTemp1/NewServlet/jhi
getPathInfo() = /jhi
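How a prefix mapping such as /NewServlet/* divides the path into servlet path and path info can be sketched in a few lines. This is a simplified illustration with hypothetical names; it covers only exact and prefix ("/*") patterns, ignores extension mappings such as *.jsp, and expects the path with the context path already removed.

```java
/** Sketch: split a context-relative path according to an exact or prefix url-pattern. */
final class PrefixMapping {

    /** Returns {servletPath, pathInfo}, or null when the pattern does not match. */
    static String[] match(String urlPattern, String pathWithinContext) {
        if (!urlPattern.endsWith("/*")) {
            // Exact mapping: there is no extra path, so pathInfo is null.
            return urlPattern.equals(pathWithinContext)
                    ? new String[] {pathWithinContext, null}
                    : null;
        }
        String prefix = urlPattern.substring(0, urlPattern.length() - 2);
        if (pathWithinContext.equals(prefix)) {
            return new String[] {prefix, null};
        }
        if (pathWithinContext.startsWith(prefix + "/")) {
            // Everything after the matched prefix becomes the path info.
            return new String[] {prefix, pathWithinContext.substring(prefix.length())};
        }
        return null;
    }

    public static void main(String[] args) {
        String[] parts = match("/NewServlet/*", "/NewServlet/jhi");
        System.out.println("servletPath = " + parts[0]);
        System.out.println("pathInfo    = " + parts[1]);
    }
}
```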
From the API documentation:
getPathInfo() returns a String, decoded by the web container, specifying extra path information that comes after the servlet path but before the query string in the request URL; or null if the URL does not have any extra path information.
getRequestURI() returns a String containing the part of the URL from the protocol name up to the query string.

Using hash symbol in forward URL with RequestDispatcher

I'm trying to forward a request to another URL which includes the hash symbol ('#'):
request.getRequestDispatcher("/some/path.jsp#somehash").forward(request, response);
Tomcat, however, tells me that "the requested resource is not available". If I remove the hash from the URL, everything works fine. Are hashes not allowed or am I not treating them right?
The # symbol is a browser thing, not a server thing. When you type a URL with a # into the browser, the browser doesn't send that part to the server. It sends the URL without it, then jumps to the named anchor when it gets the page back.
When you ask the container to get that URL for you, it doesn't treat the # any differently to any other URL - it has no special meaning for it, so it looks for a JSP page called /some/path.jsp#somehash, which of course doesn't exist.
You'll need to keep that jump-to-anchor logic on the client somehow. Perhaps you could put some javascript on the resulting page to scroll to that point in the document.
URL fragments are purely client side. RequestDispatcher#forward() is entirely server side, so the URL given to the forward is never sent to the client. You can, however, redirect to the given URL using HttpServletResponse#sendRedirect(). The URL fragment will then be sent to the client and reflected in the browser address bar as well. Redirecting has the disadvantage that the current request is discarded and a brand new one is created. If that's not affordable, then you'll indeed have to look for a solution in JavaScript.
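One way to act on this is to separate the fragment from the path before dispatching. The sketch below is a plain string helper with hypothetical names: the path part is what would be passed to getRequestDispatcher(), while the fragment would have to reach the client some other way, for example through a redirect or a script on the resulting page.

```java
/** Sketch: split a dispatch target into the server-side path and the client-side fragment. */
final class ForwardTarget {

    /** The part the container can resolve, e.g. "/some/path.jsp". */
    static String pathOnly(String target) {
        int hash = target.indexOf('#');
        return hash < 0 ? target : target.substring(0, hash);
    }

    /** The fragment without '#', or null when the target has none. */
    static String fragment(String target) {
        int hash = target.indexOf('#');
        return hash < 0 ? null : target.substring(hash + 1);
    }

    public static void main(String[] args) {
        String target = "/some/path.jsp#somehash";
        System.out.println(pathOnly(target));
        System.out.println(fragment(target));
    }
}
```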
