redirecting between java servlets from url containing # - java

Hey,
Maybe the title is not the best choice, but I really don't know how to better describe the problem.
The thing is, when you point your browser to a URL that contains a #, such as
http://anydomain.com/test/elsem/1234#dogeatdog
and for some reason (i.e. some business logic) you want to redirect to another page,
http://anydomain.com/test/els/1234
the #dogeatdog gets appended to the new URL.
I found this behavior while developing wicket app, but just now I tested it with simple pure java servlet. Can someone explain it to me?
Here is the code just in case I'm doing something wrong:
private void process(HttpServletRequest req, HttpServletResponse res)
{
    res.setContentType("text/plain");
    try
    {
        HttpSession session = req.getSession();
        Object as = session.getAttribute("as");
        if (as == null)
        {
            log.info("redirecting");
            session.setAttribute("as", 1);
            res.sendRedirect("/test/");
        }
        else
        {
            log.info("writing");
            PrintWriter out = res.getWriter();
            out.write("after redirect " + as);
            out.flush();
        }
    }
    catch (IOException e)
    {
        e.printStackTrace();
    }
}

Hash fragments (#a_hash_fragment) never leave the browser; they are not part of the HTTP request.
What the web server gets in this case is GET /test/elsem/1234, and it responds with a 3xx redirect code and the new URL /test/els/1234, which your browser picks up and appends #dogeatdog to. Makes sense now?
UPDATE: Thanks to Zack, here's a W3C document that exactly explains how this (should) work:
http://www.w3.org/Protocols/HTTP/Fragment/draft-bos-http-redirect-00.txt
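To make the two halves of that exchange concrete, here is a minimal sketch using only java.net.URI. It is an emulation of what a browser does, not browser or container code; the class and method names (FragmentRedirectDemo, withoutFragment, followRedirect) are made up for illustration. The inheritance rule it applies is, as far as I know, the one later written into RFC 7231, section 7.1.2: a Location value without a fragment inherits the fragment of the URL the user originally asked for.

```java
import java.net.URI;

final class FragmentRedirectDemo {

    /** What goes over the wire: the typed URL minus "#..." (hypothetical helper). */
    static String withoutFragment(String url) {
        int hash = url.indexOf('#');
        return hash < 0 ? url : url.substring(0, hash);
    }

    /** Emulates the browser side of a redirect (hypothetical helper). */
    static String followRedirect(String typedUrl, String location) {
        URI typed = URI.create(typedUrl);
        // The Location header is resolved against the URL that was actually requested.
        URI target = URI.create(withoutFragment(typedUrl)).resolve(location);
        // If the Location value carries no fragment of its own, the old one is re-attached.
        if (target.getRawFragment() == null && typed.getRawFragment() != null) {
            target = URI.create(target + "#" + typed.getRawFragment());
        }
        return target.toString();
    }

    public static void main(String[] args) {
        String typed = "http://anydomain.com/test/elsem/1234#dogeatdog";
        System.out.println("server sees:   " + withoutFragment(typed));
        System.out.println("browser shows: " + followRedirect(typed, "/test/els/1234"));
    }
}
```

Note that a Location value with its own fragment wins over the inherited one, so a servlet that really wants to drop #dogeatdog would have to redirect to a URL that carries a different fragment.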

From the sendRedirect Javadoc:
Sends a temporary redirect response to the client using the specified
redirect location URL. This method can accept relative URLs; the
servlet container must convert the relative URL to an absolute URL
before sending the response to the client. If the location is relative
without a leading '/' the container interprets it as relative to the
current request URI. If the location is relative with a leading '/'
the container interprets it as relative to the servlet container root.
Because of repetitive use of "relative" in the Javadoc, I suspect the new URL is using what it can from the old URL and then building from there...
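The two "relative" cases in that Javadoc can be tried out with java.net.URI, which follows the same general resolution rules. This is only an approximation of what a container does when it builds the Location header, and the class and method names (RelativeRedirectDemo, toAbsolute) are invented for the example. The fragment plays no part here, because the container never receives it.

```java
import java.net.URI;

final class RelativeRedirectDemo {

    /** Approximates the absolute URL a container would send back (hypothetical helper). */
    static String toAbsolute(String requestUrl, String location) {
        return URI.create(requestUrl).resolve(location).toString();
    }

    public static void main(String[] args) {
        String request = "http://anydomain.com/test/elsem/1234";
        // Leading '/': resolved from the root, the current path is ignored.
        System.out.println(toAbsolute(request, "/test/"));
        // No leading '/': resolved against the "directory" of the current request URI.
        System.out.println(toAbsolute(request, "els/1234"));
    }
}
```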
From the little I've read, forwarding should be used instead of redirecting where possible.
See this for a good explanation of forward versus redirect.
See this for straight-forward examples of forwarding requests to Servlets or JSPs.
Of course, with forwarding, the original URL will remain intact so that may not be what you're looking for...
EDIT
With information from milan, I found some more information regarding URL fragments (the stuff after "#" - I didn't know that was their official name until corresponding with milan).
There's another SOF post that has some good information concerning this and possibly the best answer: URL Fragment and 302 redirects
I have "+1'd" milan for giving good direction on this...

Related

How to change the URL that contains "#!" to "d/" in Java?

OK, here is my problem. I am building an Ajax web app, and to make it visible to the Google spider I need to use URLs that contain the hashbang "#!". For example, my URLs could look like this:
mydomain.com/#!getCustomer
mydomain.com/#!getOrder
....
These URLs look pretty ugly, and besides, Google AdWords does not allow # in a URL, so I can't advertise my URLs on Google.
Thus, I want the system to change "#!" to "d/" every time a user goes to one of the above links, so that users will see these:
mydomain.com/d/getCustomer
mydomain.com/d/getOrder
....
Note: even though the URL no longer contains "#!", the system should still let the Google spider index my website.
So, I use a servlet Filter to do that:
public void doFilter(ServletRequest request, ServletResponse response,
        FilterChain chain) throws IOException, ServletException {
    HttpServletRequest httpRequest = (HttpServletRequest) request;
    String fullURLQueryString = getFullURL(httpRequest);
    System.out.println(fullURLQueryString); // test url
    if ((fullURLQueryString != null) && (fullURLQueryString.contains("#!"))) {
        fullURLQueryString = fullURLQueryString.replace("#!", "d/");
        request.getRequestDispatcher(fullURLQueryString).forward(request, response);
    }
}
However, the system does not see the "#!" part when capturing fullURLQueryString, so System.out.println(fullURLQueryString); only prints mydomain.com and completely ignores the #!getCustomer or #!getOrder part.
Did I do anything wrong here?
Can you fix it?
You do not have to use #! if your web application doesn't use client-side generated content. If your URLs do not currently contain #, this functionality is of no interest to you.
In the typical scenario where this is useful, the user goes to a page such as http://example.com/#page1.
The browser requests http://example.com/ (notice #page1 is not in the request). After the page is loaded, client side JavaScript examines the part after # and downloads additional content.
Google bots do not support JavaScript and cannot download any additional content. For them, every page http://example.com/#page1, http://example.com/#page2 ... looks the same.
To fix this, #! syntax was introduced. You can learn more about it here.
You cannot do this server-side because browsers never send the URL fragment (the # and everything after it) to the server. You can do this replacement only with client-side JavaScript:
if (location.href.match(/\/#!/)) {
    location.replace(location.href.replace(/\/#!/, '/d/'));
}
You should be constructing your URLs with the URL class (not a String).
Here is the official Java 8 Documentation:
http://docs.oracle.com/javase/8/docs/api/java/net/URL.html
The problem is with your getFullURL() method. Instead you should use getRequestURL().

Redirection from Servlet/Filter does not work

I have a problem with redirection - it simply does not work for valid paths.
Right now I use page forwarding in the Servlet, but I need redirection in a filter.
All the pages reside in the 'pages' folder and have a .jspx extension
I've tried the following (this path works with forwarding):
httpResponse.sendRedirect("/pages/login.jspx");
The browser URL becomes http://[localhost]/pages/login.jspx and it shows Tomcat's 404 page, because the context path (in my case '/hotel') is missing from the URL. So, if I add it:
httpResponse.sendRedirect("/hotel/pages/login.jspx");
the redirect does not happen, the browser URL does not change, and I'm shown the browser's 404 page ("This program cannot display the webpage").
What am I doing wrong?
The filter which is used to test this has the following mapping:
@WebFilter(filterName = "SecurityFilter", urlPatterns = "/*")
The redirected URL is indeed relative to the initially requested URL. To dynamically prepend the context path it's recommended to use HttpServletRequest#getContextPath() instead of hardcoding it, because the context path value can be changed externally by server-specific configuration.
As to your concrete problem, I'm not sure if I understand "browser's 404 page" properly, perhaps you mean the browser-default error page which can occur when the server is unreachable or when the request has been redirected in an infinite loop (that should however have been made clear in the actual message of the browser default error page, at least Chrome and Firefox do that).
Given that your filter is mapped on /*, it's actually redirecting in an infinite loop because the request URL of the login page in turn also matches the URL pattern of the filter.
You'd need either to put the filter on a more specific URL pattern which does not cover the login page, e.g. on /secured/* with all restricted pages moved in there (or map it on /pages/* and put the login page outside of it), or to fix your filter as follows:
String loginURL = request.getContextPath() + "/pages/login.jspx";
if (needsToRedirect && !request.getRequestURI().equals(loginURL)) {
    response.sendRedirect(loginURL);
}
else {
    chain.doFilter(request, response);
}
1 - Have you got logging or some other observable event in your servlet code that confirms it's definitely running?
2 - Redirects can fail if you write any actual response content prior to the redirect - have you anything doing that?
3 - Another option, set up a page in the root directory, even a "hello.html" static page, and see if you can redirect to that using either of "/hello.html" and "hello.html".
Just some ideas I would use in my own debug approach, hope something helps!

In Java, how do I download a page that was redirected?

I am making a web crawler, and some pages redirect to others. How do I get the page that the original page redirected to?
On some sites, like xtema.com.br, I can get the redirect URL using the HttpURLConnection class with the getHeaderField("Location") method, but on others, like visa.com.br, the redirection is done with JavaScript or some other way, and this method returns null.
Is there a way to always get the page and the URL resulting from the redirection? The original page without the redirection is not important.
Thanks, and sorry for bad english.
EDIT: Using httpConn.setInstanceFollowRedirects(true) to follow the redirections and returning the URL with httpConn.getURL() worked, but I have two issues.
1: httpConn.getURL() only returns the actual URL of the redirected page if I call httpConn.getDate() first. If I don't, it returns the original URL from before the redirections.
2: Some sites, like visa.com.br, answer 200, but if I open them in a web browser, I see another page.
E.g.: my program - visa.com.br - answer 200 (no redirections)
web browser - visa.com.br/go/principal.aspx - HTML code different from the version that I get in my program
Use HttpURLConnection, it follows redirects by default.
In case you want to see the redirected URL, you'll have to do:
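import java.net.HttpURLConnection;
import java.net.URL;
/* startUrl is a placeholder for the page you are crawling */
HttpURLConnection httpConn = (HttpURLConnection) new URL(startUrl).openConnection();
httpConn.setInstanceFollowRedirects(false);
int responseCode = httpConn.getResponseCode();
int hops = 0;
while ((responseCode / 100) == 3 && hops++ < 10) { /* codes 3XX are redirections; cap the hops to avoid loops */
    String newLocationHeader = httpConn.getHeaderField("Location");
    if (newLocationHeader == null) break; /* e.g. 304 Not Modified carries no Location */
    /* open a new connection for newLocationHeader, resolving relative values against the current URL */
    URL next = new URL(httpConn.getURL(), newLocationHeader);
    httpConn.disconnect();
    httpConn = (HttpURLConnection) next.openConnection();
    httpConn.setInstanceFollowRedirects(false);
    responseCode = httpConn.getResponseCode();
}
/* httpConn now holds the first non-redirect response; httpConn.getURL() is the final URL */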
You can't easily detect a JavaScript redirection, and HTTP redirection is handled by default by HttpURLConnection. What you can do is search the page contents for several keywords:
the meta refresh tag
document.location=, window.location= and both with .href=
But this does not guarantee anything. People might be calling javascript functions from external js files and you will pretty much need to fetch resources and parse javascript, which you aren't willing to do, I guess.
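As a rough illustration of the keyword search for the meta refresh case, here is a small regex-based sketch. It assumes the common attribute order (http-equiv before content) and an unquoted URL inside the content attribute; the class name MetaRefreshSniffer and the method findMetaRefreshTarget are made up for the example. A real crawler would be better served by an HTML parser, and JavaScript assignments to document.location or window.location would need a similar, even less reliable, pattern.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class MetaRefreshSniffer {

    // Matches e.g. <meta http-equiv="refresh" content="0; url=/landing.html">
    // Assumption: http-equiv comes before content, and the URL is not quoted.
    private static final Pattern META_REFRESH = Pattern.compile(
            "<meta[^>]*http-equiv\\s*=\\s*[\"']?refresh[\"']?[^>]*"
                    + "content\\s*=\\s*[\"']\\s*\\d+\\s*;\\s*url\\s*=\\s*([^\"'>\\s]+)",
            Pattern.CASE_INSENSITIVE);

    /** Returns the refresh target if the page declares one, otherwise an empty Optional. */
    static Optional<String> findMetaRefreshTarget(String html) {
        Matcher m = META_REFRESH.matcher(html);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        String html = "<html><head>"
                + "<meta http-equiv=\"refresh\" content=\"0; url=/landing.html\">"
                + "</head></html>";
        System.out.println(findMetaRefreshTarget(html).orElse("(no meta refresh found)"));
    }
}
```

The returned value may be relative, so it would still need to be resolved against the URL of the page it was found on before being fetched.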
I ended up using Apache's HTTP client. Just another option.

Redirect to servlet fails

I have a servlet named EditPhotos which, believe it or not, is used for editing the photos associated with a certain item on a web design I am developing. The URL path to edit a photo is [[SITEROOT]]/EditPhotos/[[ITEMNAME]].
When you go to this path (GET), the page loads fine. You can then click on a 'delete' link that POSTs to the same page, telling it to delete the photo. The servlet receives this delete command properly and successfully deletes the photo. It then sends a redirect back to the first page (GET).
For some reason, this redirect fails. I don't know how or why, but using the HTTPFox plugin for firefox, I see that the POST request receives 0 bytes in response and has the code NS_BINDING_ABORTED.
The code I am using to send the redirect, is the same code I have used throughout the website to send redirects:
response.sendRedirect(Constants.SITE_ROOT + "EditPhotos/" + itemURL);
I have checked the final URL that the redirect sends, and it is definitely correct, but the browser never receives the redirect. Why?
Read the server logs. Do you see IllegalStateException: response already committed with the sendRedirect() call in the trace?
If so, then the redirect failed because the response headers have already been sent. Ensure that you aren't touching the HttpServletResponse at all before calling sendRedirect(). A redirect basically consists of a Location response header with the new URL as its value.
If not, then you're probably handling the request using JavaScript, which in turn failed to handle the new location.
If neither is the case or you still cannot figure it out, then we'd be interested in the smallest possible copy'n'pasteable code snippet which reproduces exactly this problem. Update your question to include it.
Update: as per the comments, the culprit is indeed in JavaScript. A redirect on an XMLHttpRequest POST isn't going to work. Are you using homegrown XMLHttpRequest functions or a library around it such as jQuery? If jQuery, please read this question carefully. It boils down to this: you need to return a specific response and then let JS/jQuery set the new window.location itself.
Turns out that it was the JavaScript I was using to send the POST that was the problem.
I originally had this:
Delete
And everything got fixed when I changed it to this:
Delete
The deletePhoto function is:
function deletePhoto(photoID) {
    doPost(document.URL, {'action':'delete', 'id':photoID});
}
function doPost(path, params) {
    var form = document.createElement("form");
    form.setAttribute("method", "POST");
    form.setAttribute("action", path);
    for(var key in params) {
        var hiddenField = document.createElement("input");
        hiddenField.setAttribute("type", "hidden");
        hiddenField.setAttribute("name", key);
        hiddenField.setAttribute("value", params[key]);
        form.appendChild(hiddenField);
    }
    document.body.appendChild(form);
    form.submit();
}

Using hash symbol in forward URL with RequestDispatcher

I'm trying to forward a request to another URL which includes the hash symbol ('#'):
request.getRequestDispatcher("/some/path.jsp#somehash").forward(request, response);
Tomcat, however, tells me that "the requested resource is not available". If I remove the hash from the URL, everything works fine. Are hashes not allowed or am I not treating them right?
The # symbol is a browser thing, not a server thing. When you type a URL with a # into the browser, the browser doesn't send that part to the server. It sends the URL without it, then jumps to the named anchor when it gets the page back.
When you ask the container to get that URL for you, it doesn't treat the # any differently to any other URL - it has no special meaning for it, so it looks for a JSP page called /some/path.jsp#somehash, which of course doesn't exist.
You'll need to keep that jump-to-anchor logic on the client somehow. Perhaps you could put some javascript on the resulting page to scroll to that point in the document.
URL fragments are purely client side. RequestDispatcher#forward() is entirely server side: the URL you pass to forward() is never sent to the client. You can however redirect to the given URL using HttpServletResponse#sendRedirect(). The URL fragment will then be sent to the client and reflected in the browser address bar as well. Redirecting has the disadvantage, however, that the current request is discarded and a brand new one is created. If that's not affordable, then you'll indeed have to look in the JavaScript corner for the solution.
