Mask urls in JSP - java

Related to this question:
URL characters replacement in JSP with UrlRewrite
I want to have masked URLs in this JSP Java EE web project.
For example if I had this:
http://mysite.com/products.jsp?id=42&name=Programming_Book
I would like to turn that URL into something more User/Google friendly like:
http://mysite.com/product-Programming-Book
I've been fighting with UrlRewrite, forwarding and RequestDispatcher to accomplish what I want, but I'm kind of lost. I should probably have a filter for all HTTP requests that reformats them and forwards to the page.
Can anyone give some directions? Tips?
Thanks a lot.
UPDATE: Servlets did it. Thanks Yuval for your orientation.
I had been using UrlRewrite; as you can see in the first sentence of the question, I also asked a question about that. But I couldn't manage to get UrlRewrite to work the way I wanted. Servlets did the job.

You could use a URLRewrite filter. It's like how mod_rewrite is for Apache's HTTP web server.
http://tuckey.org/urlrewrite/
"Redirect one url
<rule>
    <from>^/some/old/page\.html$</from>
    <to type="redirect">/very/new/page.html</to>
</rule>
Tiny/Friendly url
<rule>
    <from>^/zebra$</from>
    <to type="redirect">/big/ugly/url/1,23,56,23132.html</to>
</rule>
"

It's been a while since I mucked about with JSPs, but if memory serves you can add URL patterns to your web.xml (or one of those XML config files) and have the servlet engine automatically route the request to a valid URL with your choice of parameters. I can look up the details if you like.
In your case, map http://mysite.com/product-Programming-Book to the URL
http://mysite.com/products.jsp?id=42&name=Programming_Book and the user no longer sees the real URL. Also, you can use this more user-friendly URL within your application, as a logical name for that page.
Yuval =8-)

Generally you're fronting your application with Apache. If so, look into using Apache's mod_rewrite. http://httpd.apache.org/docs/1.3/mod/mod_rewrite.html

For one thing I'd recommend you deal with this within your application, and not rely on external rewrites, say via Apache mod_rewrite (unless you have determined this is the fastest way to do so.)
But a few things first:
I would not convert this:
http://mysite.com/products.jsp?id=42&name=Programming_Book
Into this:
http://mysite.com/product-Programming-Book
See, if I go only by your book example, I don't see what is wrong with the former URL. After all, it works for Amazon. And there is no such thing as Google-friendly URLs (only user-friendly ones). You have to consider why you want to do that type of rewriting, and how. For example, in your rewrite option, where is the id?
That is, you have to define a logical rule that defines
the unique pages you want to show, and
the unique combination of parameters that can identify each page.
For example, using your book case. Let's say you can identify any book using the following rules:
by ISBN
by Author Name, Title and, if applicable, version (if version is missing, assume latest)
if ISBN is included with Author Name, Title and/or edition, ignore all except ISBN. That is, treat it as the former (or more precisely, ignore all other book identification parameters when ISBN is present.)
With a ?parametrized url scheme, then you'd have the following possibilities:
http://yoursite/products?isbn=123465
http://yoursite/products?author=johndoe&title="the cookbook" << this assumes the latest edition, or 1 if first.
http://yoursite/products?author=johndoe&title="the cookbook"&edition=3
http://yoursite/products?title="the cookbook"&author=johndoe
http://yoursite/products?edition=3&title="the cookbook"&author=johndoe
....
and so on for all combinations. So before you look for a technical implementation, you have to think very carefully how you will do it. You'd have to create a syntax and a hierarchy of parameters (say, author will always come before title, and title will always come before edition).
So you'll end up with the following (using the same example as John Doe the author, with his book being in the 3rd edition):
http://yoursite/product/isbn/12345
http://yoursite/product/author/johndoe/the%20cookbook << see the %20 for encoding spaces (not a good idea, but something to take into account)
http://yoursite/product/author/johndoe/the%20cookbook/3
Any other combination should either generate an error or smartly figure out how to rewrite to the canonical versions and send an HTTP 3xx to the client with the appropriate URL target.
Once you have ironed those details out, you can ask yourself if the effort is worth it or necessary.
So if you find that you do need it, the easiest and cheapest DIY way is to write a filter that parses the URL, breaks the parameters down, creates a ?parametrized URL string for a JSP page, gets its RequestDispatcher and forwards to it.
You do not want to do URL rewrites that redirect, because each one costs an HTTP 303/307 round trip between your server and your client. Or at least you want to keep that to a minimum.
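The parsing step of such a filter can be sketched in plain Java. This is only an illustration under assumptions: the class name FriendlyUrlMapper, the /products.jsp target and the exact path layout are hypothetical, taken from the examples above rather than from any real project. The input is assumed to be the raw, still-encoded path (for instance the request URI with the context path removed).

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Optional;

// Hypothetical helper: maps the friendly paths from the example above
// to the internal parametrized URL a filter would forward to.
final class FriendlyUrlMapper {

    // Assumed internal target page (the question uses products.jsp).
    private static final String TARGET = "/products.jsp";

    private FriendlyUrlMapper() {}

    /** Returns the internal URL for a friendly path, or empty if the path is not recognized. */
    static Optional<String> toInternalUrl(String path) {
        // "/product/isbn/12345" splits into ["", "product", "isbn", "12345"]
        String[] parts = path.split("/");
        if (parts.length < 4 || !parts[0].isEmpty() || !"product".equals(parts[1])) {
            return Optional.empty();
        }
        if ("isbn".equals(parts[2]) && parts.length == 4) {
            return Optional.of(TARGET + "?isbn=" + encode(parts[3]));
        }
        // /product/author/<author>/<title>[/<edition>]
        if ("author".equals(parts[2]) && (parts.length == 5 || parts.length == 6)) {
            String url = TARGET + "?author=" + encode(parts[3]) + "&title=" + encode(parts[4]);
            if (parts.length == 6) {
                url += "&edition=" + encode(parts[5]);
            }
            return Optional.of(url);
        }
        return Optional.empty();
    }

    // Decodes a path segment (%20 becomes a space) and re-encodes it as a query value.
    // Simplification: URLDecoder also treats '+' as a space, which is not strictly
    // correct for path segments but is acceptable for a sketch.
    private static String encode(String segment) {
        String decoded = URLDecoder.decode(segment, StandardCharsets.UTF_8);
        return URLEncoder.encode(decoded, StandardCharsets.UTF_8);
    }
}
```

Inside the filter, a non-empty result would be passed to request.getRequestDispatcher(target).forward(request, response), so the client never sees the internal URL and no redirect is involved. An empty result would simply continue down the filter chain.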

Related

How to get the parameter from the URL after '/'

https://localhost:8080/invoice_creation/1/
Here /invoice_creation is my resource location and /1 is my invoice number, which identifies a record in my database. I want to parse the value 1 alone in my Java servlet when I send this URL from Postman. I've used the request.getParameter() method, but it doesn't help. Please help me parse the value 1 in my Java servlet.
getParameter is for the x and y in for example: http://localhost:8080/invoice_creation/1?x=5&y=hello.
A webserver listens on port 8080, receives the 'location' that the visitor wishes to visit (such as /invoice_creation/1?x=6&y=hello), and needs to then route this by finding the 'handler' that is supposed to deal with this. It then calls that handler.
Java has a ton of web frameworks, and most cooked up their own way of routing.
The 'original' Java web servers use the servlet API. It's a crappy API that you shouldn't be using; it's a pain to use. But, given that you tagged this question with servlets, you are, evidently, using it. I suggest you look around; perhaps Jersey/JAX-RS is nicer, especially if you're attempting to set up a REST API.
At any rate, if you insist on using the servlet API, you've somehow set up routing such that /invoice_creation/_anything goes here_ ends up at some servlet and now you want the 'anything goes here' part. Exactly how to do that depends on a few factors, but usually it's one of these:
req.getPathInfo(). Depending on how the servlet is mapped in your servlet container, this returns either /1 or /invoice_creation/1. You'll have to 'extract' the 1 from this, using either regular expressions (java.util.regex.Pattern and friends) or basic string manipulation such as str.substring().
req.getPathTranslated().
req.getRequestURI() - this definitely returns the whole thing (including /invoice_creation/).
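As a rough sketch of the 'extract' step, assuming the path looks like the example in the question: the class and method names below are made up for illustration, and the pattern accepts both shapes that getPathInfo() might return.

```java
import java.util.OptionalLong;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for pulling the invoice number out of the path.
final class InvoicePath {

    // Optional "/invoice_creation" prefix, then "/<digits>", then an optional trailing slash.
    private static final Pattern ID = Pattern.compile("(?:/invoice_creation)?/(\\d+)/?");

    private InvoicePath() {}

    /** Returns the invoice number, or empty if the path does not contain one. */
    static OptionalLong invoiceId(String pathInfo) {
        if (pathInfo == null) {          // getPathInfo() returns null when there is no extra path
            return OptionalLong.empty();
        }
        Matcher m = ID.matcher(pathInfo);
        if (!m.matches()) {
            return OptionalLong.empty();
        }
        try {
            return OptionalLong.of(Long.parseLong(m.group(1)));
        } catch (NumberFormatException e) {
            return OptionalLong.empty(); // more digits than fit in a long
        }
    }
}
```

In a servlet this would be called as InvoicePath.invoiceId(req.getPathInfo()), answering with a 404 or 400 when the result is empty.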

How can I make custom urls like mywebsite.com/movie_title rather than mywebsite.com/movie?id=9282

I am building a website using java and google app engine. I need to create urls like www.mywebsite.com/the_dark_knight_rises
when a user goes to that url info is pulled from my database (mysql) about the movie The Dark Knight Rises and is displayed
The problem I have is
1) I don't know how to go on about making these dynamic urls, since my database contains 500k movies so i don't think i can make them manually
2) how do I pull info about that record. I got an idea to take the /the_dark_knight_rises part of the url, replace the underscores with spaces and try to search my database like select * from table where title like 'the dark knight rises', but I am not sure if this is the best solution or how to get that part of the url
Any help/directions are welcome
1) Use getPathInfo() or getRequestURI() to extract path from current url
2) Remove underscores from path and construct movie title
3) Make database call
4) Show the movie if it exists, or show a 404 if not.
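Step 2 above might look like the following minimal sketch. MovieSlug and toTitle are hypothetical names, and lower-casing the result is an assumption about how titles would be compared.

```java
import java.util.Locale;

// Hypothetical helper: turns "/the_dark_knight_rises" into "the dark knight rises".
final class MovieSlug {

    private MovieSlug() {}

    static String toTitle(String path) {
        String slug = path.startsWith("/") ? path.substring(1) : path;
        if (slug.endsWith("/")) {
            slug = slug.substring(0, slug.length() - 1);  // tolerate a trailing slash
        }
        return slug.replace('_', ' ').trim().toLowerCase(Locale.ROOT);
    }
}
```

For step 3, the resulting title is best bound as a parameter of a java.sql.PreparedStatement (for example with a WHERE title = ? clause) rather than concatenated into the SQL string. Titles are not guaranteed to be unique, and punctuation is lost in the slug, so storing a dedicated slug column next to each movie may turn out to be more robust.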
If your intent is just to do this for SEO or readability purposes, you could always serve the urls as www.website.com/movie_name_in_text?id=abc123
This allows you to do a get by id on the database, but surfaces textual information for people and bots.
You should also probably look into using a Web MVC framework, such as Spring, thundr or similar. They can handle this type of data binding automatically, rather than requiring you to program to the low level servlet api.
Disclaimer, I'm a maintainer of thundr

Passing Variable from one website page to another website page without using QueryString

This question is related to a previous question
Passing Variable from page to page using ASP.NET (C#) without using QueryString
The difference in my case is that the request is coming from a different website (in java) to my website (in asp.net). I do not want the variable to appear in url.
Any suggestions !!
To explain my scenario: we are making a web page (plugin) which can be called from any other website. To authenticate the request, I am looking for a mechanism by which the other website passes an id and auth-key to my page, which I can then use to authenticate the request. I do not want these variables to be visible.
A POST operation would work. The variable would still be part of the request, but it would not be readily visible to the user. I say "readily" visible because it won't be part of the requested URL, but it would be visible if they were to use a tool like Firebug. Short of sharing a database or some other form of "out-of-band" communication, I'm not sure it can be done any other way...
Well, as chris mentioned, doing a POST is the best way to achieve this. Otherwise you can look at using JavaScript to achieve the same. It's pretty easy to do with JS libraries.
Some of them that come to my mind are
a) Jquery
b) YUI
c) EXT (now Sencha i guess)
But I would definitely recommend jquery.
With jquery you have apis to do post operations. here is more on how to achieve the same.
http://api.jquery.com/jQuery.post/
Hope that helps.
I don't think it can be done without a query string. I know sessions won't work because sessions cannot be shared between Java, ASP, ASP.NET, PHP etc., at least not natively. If you have a database where you store the sessions, you can always use a session id in a query string and thereby simulate cross-language sessions.

Java website protection solutions (especially XSS)

I'm developing a web application, and facing some security problems.
In my app users can send messages and see other's (a bulletin board like app). I'm validating all the form fields that users can send to my app.
There are some very easy fields, like "nick name", which can be 6-10 alphabetical characters, or message sending time, which is sent to the users as a string, and then (when users ask for messages that are "younger" or "older" than a date) I parse this with SimpleDateFormat (I'm developing in Java, but my question is not related only to Java).
The big problem is the message field. I can't restrict it to only alphabetical characters (upper or lowercase), because I have to deal with some often-used characters like ",',/,{,} etc... (users would not be satisfied if the system didn't allow them to use these)
According to this http://ha.ckers.org/xss.html, there are a lot of ways people can "hack" my site. But I'm wondering, is there any way I can do to prevent that? Not all, because there is no 100% protection, but I'd like a solution that can protect my site.
I'm using servlets on the server side, and jQuery, on the client side. My app is "full" AJAX, so users open 1 JSP, then all the data is downloaded and rendered by jQuery using JSON. (yeah, I know it's not "users-without-javascript" friendly, but it's 2010, right? :-) )
I know front end validation is not enough. I'd like to use 3 layer validation:
- 1. front end, javascript validate the data, then send to the server
- 2. server side, the same validation, if there is anything, that shouldn't be there (because of client side javascript), I BAN the user
- 3. if there is anything that I wasn't able to catch earlier, the rendering process handle and render appropriately
Is there any "out of the box" solution, especially for java? Or other solution that I can use?
To minimize XSS attacks, the important thing is to encode any field data before putting it back on the page, for example changing > to &gt; and so on. Encoded this way, malicious code is not executed when it is added to the page.
I think you are doing lot of right things by white listing the data you expect for different fields. Beyond that for fields which can allow other characters which can be problematic encoding would fix the issue for you.
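A minimal sketch of that encoding step, assuming the value ends up in HTML text or inside a quoted attribute (HtmlEscaper is a made-up name; an established library such as the OWASP Java Encoder is likely the safer choice in practice, and values placed inside JavaScript or URLs need different encoding):

```java
// Hypothetical helper: escapes the five characters that matter in HTML text
// and quoted attribute values.
final class HtmlEscaper {

    private HtmlEscaper() {}

    static String escape(String input) {
        StringBuilder out = new StringBuilder(input.length() + 16);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&#39;");  break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }
}
```

Since the page here is rendered by jQuery from JSON, the same idea applies on the client: inserting messages with .text() rather than .html() keeps the browser from interpreting them as markup.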
Further since you are using Ajax it gives you some protection as people cannot override values in URL parameters etc.
Look at the AntiSamy library. It allows you to define rulesets for your application, then run your user input through AntiSamy to clean it per your rules.
The easiest way is to do a simple replacement for the following
< with &lt;
> with &gt;
' with \'
That will solve most database vulnerabilities.

expand Tiny url in java

I want to write code in Java that takes a URL and identifies whether it is a tiny URL or not. If yes, it should then identify whether the URL is malicious or not. If it is not malicious, print the URL...
Please, can anybody help me?
You can use HttpClient to detect whether the URL is redirected to another location. After that it's a simple case of:
if (!isMalicious(redirectTargetURL))
{
    System.out.println(redirectTargetURL);
}
The isMalicious(...) implementation is left as an exercise for the reader.
If you trust google to implement isMalicious(...) then they have done so with their Safe Browsing API.
So 2 main things you want:
Identify if it's a tinyurl
Identify if the URL is malicious
The answer to part 1 is easy. Just check if the URL belongs to the domain 'tinyurl.com'. Should be straightforward to either test raw URL string, or the host part returned by the getHost() method of a java.net.URL object.
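The host check for part 1 could be sketched like this. TinyUrlCheck is a made-up name, and java.net.URI is used here instead of java.net.URL purely as a matter of preference, since it also exposes getHost(). Treating subdomains such as www.tinyurl.com as matches is an assumption.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Locale;

// Hypothetical helper: checks whether a URL points at the tinyurl.com domain.
final class TinyUrlCheck {

    private TinyUrlCheck() {}

    static boolean isTinyUrl(String url) {
        try {
            String host = new URI(url).getHost();
            if (host == null) {
                return false;  // e.g. no scheme, so no host could be parsed
            }
            host = host.toLowerCase(Locale.ROOT);
            // Exact match or a real subdomain; "nottinyurl.com" must not match.
            return host.equals("tinyurl.com") || host.endsWith(".tinyurl.com");
        } catch (URISyntaxException e) {
            return false;      // not a syntactically valid URI
        }
    }
}
```

Comparing the parsed host rather than searching the raw string avoids false positives such as https://tinyurl.com.evil.example/, where tinyurl.com is only a prefix of a different domain.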
Part 2 is more difficult to code up from scratch...
First you will need your code to figure out where the tinyurl redirects to.
The next bit really depends on how you want to define 'malicious'. Detecting deceptive URLs will require a bit of work (e.g. finding the difference between something like www.stackoverflow.com and www.stack0verf10w.com), or comparing the target URL with a malicious URL list (there are sites that publish them). There's also checking for multiple redirects and popups, and the list of criteria could go on and on.
