How to check if a sessionId is valid in a servlet (Java)

I am maintaining a map of session IDs and HttpSession objects in my web app. I use an HttpSessionListener to populate the map and to remove sessions from it. When my web server crashes and comes back up, I need a way to check whether the session ID being submitted is valid. Obviously, when the app comes back online the map is empty, so all session IDs are invalid, but I would still like some way to check the incoming session ID, if that is possible at all.
Thanks

You can keep those in a DB table; that way they will not be lost in a crash. And by the way, servers serialize sessions to disk if you want them to.

One suggestion is to serialize your Map to disk and reload it when the web server starts up. The strategy for backing your Map up to disk depends on your requirements: you could write to disk each time you add to or update the map, or at regular intervals. It is fairly easy to write your session data to disk; this is often done to achieve redundancy and in load-balancing scenarios.
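A minimal sketch of that write-to-disk idea (the file name and the map's value type are illustrative assumptions; everything stored must be Serializable):

    import java.io.*;
    import java.util.HashMap;
    import java.util.Map;

    public class SessionMapStore {
        private static final File STORE = new File("sessions.ser"); // illustrative path

        // Write the whole map to disk; call on each update or on a timer.
        public static void save(Map<String, Object> map) throws IOException {
            try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(STORE))) {
                out.writeObject(new HashMap<>(map)); // snapshot copy; HashMap is Serializable
            }
        }

        // Reload at startup; returns an empty map if no snapshot exists yet.
        @SuppressWarnings("unchecked")
        public static Map<String, Object> load() throws IOException, ClassNotFoundException {
            if (!STORE.exists()) return new HashMap<>();
            try (ObjectInputStream in = new ObjectInputStream(new FileInputStream(STORE))) {
                return (Map<String, Object>) in.readObject();
            }
        }
    }

Note that HttpSession instances themselves are not serializable, so what you would persist is the session IDs plus your own serializable per-session state, not the container's HttpSession objects.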

Why on earth are you maintaining the sessions yourself? The servlet container is supposed to do that for you. I haven't seen your code, maybe you have a good reason for doing what you're doing, but I get the feeling that you just don't understand the servlet API, and are trying to re-implement it yourself.
Also, you might be interested in this: Tomcat has a feature where it persists the sessions, which means that the session state survives a server restart. (Turn this off when updating the application, since there might be mismatches between different versions of classes.)
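For the original question, note that the servlet API itself can tell you whether an incoming session ID is still valid; a minimal sketch:

    import java.io.IOException;
    import javax.servlet.ServletException;
    import javax.servlet.http.*;

    public class SessionCheckServlet extends HttpServlet {
        @Override
        protected void doGet(HttpServletRequest req, HttpServletResponse resp)
                throws ServletException, IOException {
            // True only if the client submitted a session ID that maps to a live
            // session in this container; after a restart without session
            // persistence this will be false.
            boolean valid = req.isRequestedSessionIdValid();

            // getSession(false) returns null instead of creating a new session.
            HttpSession session = req.getSession(false);

            resp.getWriter().println("session valid: " + (valid && session != null));
        }
    }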

Related

In GAE, is there a way to force an instance to serve only 1 session?

I am building a rich app on GAE using Canoo's RIA Suite. This package splits Java Swing components into server-side and client-side parts. On the server, it looks like a 'desktop' Java application. The client keeps its own map between these halves. When GAE starts a new instance, the client-side parts don't know about it -- if the next request they send is routed to the wrong instance, bad things happen.
I figure I could get around this problem if I did one of two things:
Forced a GAE instance to serve exactly one HTTP session.
Directed each HTTP request to a specific GAE instance.
My question is, in the GAE environment, can either of these be done?
Neither of these two options will solve your problem, because an App Engine instance can die and be replaced at any moment.
If you can save the state of your server-side "half" in a datastore, you can load it when a request hits the "wrong" instance, but that's probably not a very efficient solution.
You may be better off using a Compute Engine instance.
I agree that neither of those two options will work for you. The implication of your current design is that you are storing state in memory on an instance, which will not work on GAE (or any autoscaling distributed system). You should put any state into a distributed data store, whether that is memcache (which is volatile), the datastore, or Cloud SQL.
GAE/J has built-in support for Java sessions; the session state is persisted in the datastore across requests so that it is valid on any instance. For this to work, everything stored in your session needs to be serializable.
You can enable this by following these instructions.
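If I remember correctly, the switch lives in appengine-web.xml (a hedged sketch of the legacy GAE/J setting; verify against the current GAE docs, and the application ID is a placeholder):

    <!-- appengine-web.xml: enables datastore-backed HTTP sessions -->
    <appengine-web-app xmlns="http://appspot.com/ns/1.0">
      <application>your-app-id</application>
      <sessions-enabled>true</sessions-enabled>
    </appengine-web-app>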
Otherwise, you can manage persisting server state yourself in the datastore, accelerated by memcache, and link it to a 'session' with a cookie. If you go down this road, make sure you understand the implications of eventual consistency in the GAE datastore.

Java Session Implementation

I am developing a multiplayer online game. I have the following issue:
When the user's connection with the server breaks, he needs to reconnect. At the first connection, during registration, the registration module produces a special ResponseDispatcher which holds the reference to the connection Channel. But if the user logs out, this Channel becomes invalid. Even though I can detect the problem and clean up resources, I have to store references to the registration module and connection module in the game module in order to renew the Channel when the user authorises and reconnects. This creates a lot of interdependencies among modules, and it is getting really hard to maintain.
What I need is something like an HttpSession in Servlet Container, so that I can get the references to my channel and session resources from all modules of my server.
How is HttpSession implemented in a servlet container? Is it a global hash map storing all JSESSIONID values, from which the container determines which attribute map to return? If it is a global symbol table, will it hurt performance (even though lookup in a HashMap is O(1), there may be concurrent session modifications, so it probably has to be synchronized)?
PS: maybe some recommendations of design patterns for this case would also help.
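For reference, the kind of registry being asked about might look like this minimal sketch (not any container's actual implementation; a concurrent map avoids a single global lock):

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    // Hypothetical session registry keyed by session ID (like JSESSIONID).
    public class SessionRegistry {
        private final Map<String, Map<String, Object>> sessions = new ConcurrentHashMap<>();

        public Map<String, Object> get(String sessionId) {
            return sessions.get(sessionId); // O(1), no global synchronization needed
        }

        public void put(String sessionId, Map<String, Object> attributes) {
            sessions.put(sessionId, attributes);
        }

        public void invalidate(String sessionId) {
            sessions.remove(sessionId);
        }
    }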
I would recommend trying Shiro
Shiro can handle Session Management outside of a servlet container.
You may want to back Shiro with EhCache to provide proper caching and, if required, session persistence (and load balancing, etc...)
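A minimal sketch of Shiro's native (non-servlet) sessions; the attribute name and value are illustrative:

    import org.apache.shiro.SecurityUtils;
    import org.apache.shiro.mgt.DefaultSecurityManager;
    import org.apache.shiro.session.Session;
    import org.apache.shiro.subject.Subject;

    public class ShiroSessionDemo {
        public static void main(String[] args) {
            // Native session management, no servlet container required.
            SecurityUtils.setSecurityManager(new DefaultSecurityManager());

            Subject subject = SecurityUtils.getSubject();
            Session session = subject.getSession(); // created on demand

            // Store channel/session resources where all server modules can reach them.
            session.setAttribute("channel", "myChannelRef"); // illustrative value
            System.out.println(session.getAttribute("channel"));
        }
    }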
Have a look at the Facade Pattern
I'm not exactly sure what the question is. Why do you "have to store the reference to the registration module and connection module"?
Anyway, there are two immediately sensible solutions to your problem.
1) Make the registration module and connection module singletons. Whether this is useful depends entirely on what functionality these modules provide.
2) Make the registration module and connection module persistent entities: save them to a datastore, complete with the necessary references, and retrieve and rebuild them on reconnect.
I'm not quite sure why rolling your own session implementation is something you want to do. What happens when the session times out?
Your design seems somewhat flawed. The player should not "still be online" if his connection goes down; that is a contradiction in terms, since by definition you cannot be online if you are not connected to the network. You also cannot know whether the disconnect was intentional, or whether the player can reconnect in a timely fashion, so you should assume the worst. More importantly, from a pure design standpoint, getting killed by the game because your internet connection is rubbish is probably not something you want players to deal with. If persisting data is such a costly affair, you should re-examine your datastore options. Also, what happens in the scenario where the server crashes while the player is offline?

Combining Java Spring/threads and database access for time-critical web applications

I'm developing a Spring MVC web app, and I would like to store the actions of my users (what they click on, etc.) in a database for offline analysis. Let's say an action is a tuple (long userId, long actionId, Date timestamp). I'm not specifically interested in user actions; I just take this as an example.
I expect a lot of actions by a lot of (different) users per minute (even per second), so processing time is crucial.
In my current implementation, I've defined a datasource with a connection pool to store the actions in a database. I call a service from the request method of a controller, and this service calls a DAO which saves the action to the database.
This implementation is not efficient, because the response to the user has to wait until the call from the controller has gone all the way down to the database and back. I was therefore thinking of wrapping this "action saving" in a thread, so that the response to the user is faster; the thread does not need to finish for the response to be sent.
I've no experience in these massive, concurrent and time-critical applications. So any feedback/comments would be very helpful.
Now my questions are:
How would you design such a system?
Would you implement a service and then wrap it in a thread called on every action?
What should I use?
I checked Spring Batch and its JobLauncher, but I'm not sure it is the right thing for me.
What happens when there are concurrent accesses at the controller, service, DAO and datasource levels?
In more general terms, what are the best practices for designing such applications?
Thank you for your help!
Keep a singleton object at the application level and update it with every user action.
This singleton should hold a map of the recorded actions, which is flushed to the database as a Spring Batch job whenever it reaches a threshold, say 10,000 entries.
Also, periodically clean it up to the last record that was processed; you could even re-initialize the singleton weekly or monthly. Remember that this can lead to inconsistent updates if your app is deployed to multiple JVMs, since each JVM has its own copy of the singleton. You should also prevent the singleton from being cloned (for example by throwing CloneNotSupportedException).
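A minimal sketch of that buffering idea (the threshold, the Action tuple and the DAO call are illustrative assumptions; note that unflushed actions are lost if the JVM crashes):

    import java.util.ArrayList;
    import java.util.Date;
    import java.util.List;
    import java.util.Queue;
    import java.util.concurrent.ConcurrentLinkedQueue;
    import java.util.concurrent.atomic.AtomicInteger;

    public final class ActionBuffer {
        // The tuple from the question: (userId, actionId, timestamp).
        public static final class Action {
            final long userId;
            final long actionId;
            final Date timestamp;
            Action(long userId, long actionId, Date timestamp) {
                this.userId = userId;
                this.actionId = actionId;
                this.timestamp = timestamp;
            }
        }

        private static final ActionBuffer INSTANCE = new ActionBuffer();
        private static final int THRESHOLD = 10_000; // illustrative flush threshold

        private final Queue<Action> buffer = new ConcurrentLinkedQueue<>();
        private final AtomicInteger size = new AtomicInteger();

        private ActionBuffer() {}

        public static ActionBuffer getInstance() { return INSTANCE; }

        // Cheap and non-blocking; safe to call from request threads.
        public void record(long userId, long actionId) {
            buffer.add(new Action(userId, actionId, new Date()));
            if (size.incrementAndGet() >= THRESHOLD) {
                flush(); // best-effort: concurrent callers may occasionally both flush
            }
        }

        private void flush() {
            List<Action> batch = new ArrayList<>();
            Action a;
            while ((a = buffer.poll()) != null) {
                batch.add(a);
            }
            size.set(0);
            // actionDao.saveBatch(batch); // hypothetical batch insert (e.g. Spring Batch or plain JDBC)
        }
    }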
Here's what I did for that:
Used AspectJ to mark all the user actions I wanted to collect.
Then I sent them to Log4j with an asynchronous DB appender...
This lets you turn it on or off with the Log4j logging level.
Works perfectly.
If you are interested in the actions your users take, you should be able to figure that out from the HTTP requests they send, so you might be better off logging the incoming requests in an Apache webserver that forwards to your application server. Putting a cluster of web servers in front of application servers is a typical practice (they're good for serving static content) and they are usually logging requests anyway. That way the logging will be fast, your application will not have to deal with it, and the biggest work will be writing a script to slurp the logs into a database where you can do analysis.
Typically it is considered bad form to spawn your own threads in a Java EE application.
A better approach would be to write to a local queue via JMS and then have a separate component, e.g. a message-driven bean (pretty easy with EJB or Spring), persist it to the database.
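A minimal sketch of that JMS hand-off using Spring (the queue name, the DAO call and the wiring are illustrative assumptions):

    import javax.jms.JMSException;
    import javax.jms.Message;
    import javax.jms.MessageListener;
    import javax.jms.ObjectMessage;
    import org.springframework.jms.core.JmsTemplate;

    public class ActionPublisher {
        private final JmsTemplate jmsTemplate; // configured elsewhere with a pooled ConnectionFactory

        public ActionPublisher(JmsTemplate jmsTemplate) {
            this.jmsTemplate = jmsTemplate;
        }

        // Called from the controller/service; returns as soon as the message is queued.
        public void publish(java.io.Serializable action) {
            jmsTemplate.convertAndSend("user.actions", action); // queue name is an assumption
        }
    }

    // Consumer side: persists asynchronously, decoupled from the request thread.
    class ActionConsumer implements MessageListener {
        @Override
        public void onMessage(Message message) {
            try {
                Object action = ((ObjectMessage) message).getObject();
                // actionDao.save(action); // hypothetical DAO call
            } catch (JMSException e) {
                throw new RuntimeException(e);
            }
        }
    }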
Another approach would be to just write to a log file and then have a process read the log file and write to the database once a day or whenever.
The things to consider are:
How up-to-date do you need the information to be?
How critical is the information, can you lose some?
How reliable does the order need to be?
All of these will factor into how many threads you have processing your queue/log file, whether you need a persistent JMS queue and whether you should have the processing occur on a remote system to your main container.
Hope this answers your questions.

What are the requirements for a web application to work in a clustered environment?

I need to check whether an existing web application is ready to be deployed in a clustered environment.
Cluster:
Several Linux boxes. The flow is controlled by a load balancer that uses a simple round-robin algorithm with sticky sessions.
Application
A (hopefully) stateless Java web application that retrieves content from a back office and formats it appropriately.
I have access to the source code. What should I check in the code to be sure that it will run in the cluster?
Check that nothing that holds application state is cached in memory or on the file system.
...Something else?
If you're using EJBs (which is recommended if you access a DB), then here is a list of restrictions:
http://java.sun.com/blueprints/qanda/ejb_tier/restrictions.html
I guess similar restrictions apply to the web application.
The easiest way to check the application is to start by running it on two servers with the same data, so that at startup both are in the same state. Let's assume that for a user to complete an operation, the browser makes two consecutive HTTP requests to your web app. What you need to do is hit web server 1 with the first call and web server 2 with the second, then try it the other way around, then with both requests going to the same web server. If you get the same result each time, your application is very likely ready to cluster. (It doesn't mean the app IS ready to cluster, as there might be object state etc. that it keeps in memory and that is not easy to spot from the front end, but it gives you a higher probability that IT MIGHT BE ok to run in a cluster.)
If it's truly "stateless", there would be no problem: you could make any request to any server at any time and everything would just work. Most things aren't quite that easy, so any state either has to be streamed to and from the page as it moves between client and server, or be stored on the back end, with some token passed back and forth to retrieve it from whatever shared data store you are using.
If the application uses the HttpSession, then anything retrieved from the session, if modified, needs to be set back into the session with session.setAttribute(key, value). Setting the attribute acts as a signal that whatever is stored in the session needs to be replicated to the redundant servers. Make sure anything stored in the session implements, and actually is, Serializable. Some servers will allow you to store non-serializable objects (I'm looking at you, WebLogic), but will then throw an exception when they try to replicate them. I've had many a coworker complain that having to set things back into the session should be redundant, and perhaps it should, but this is just the way things work.
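A minimal sketch of that get-modify-set pattern (the attribute name and list contents are illustrative; everything stored must be Serializable):

    import java.io.Serializable;
    import java.util.ArrayList;
    import java.util.List;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpSession;

    public class CartUpdater {
        @SuppressWarnings("unchecked")
        public void addItem(HttpServletRequest request, Serializable item) {
            HttpSession session = request.getSession();

            // ArrayList is Serializable, so the container can replicate it.
            List<Serializable> cart = (List<Serializable>) session.getAttribute("cart");
            if (cart == null) {
                cart = new ArrayList<>();
            }
            cart.add(item);

            // Setting the attribute back signals the container to replicate
            // the updated value to the other nodes in the cluster.
            session.setAttribute("cart", cart);
        }
    }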
Having state is not a big problem if it is handled properly. In any case, all applications have state; even when serving a somewhat static file, the file content associated with a URL is part of the state.
The problem is how this state is propagated and shared.
State inside the user session is a no-brainer: use a session replication mechanism (slower, but no session loss on node crash) or a session-sticky load balancer, and your problem is solved.
All other shared state is indeed a problem. In particular, even cache state must be shared and perfectly coherent, otherwise a refresh of the same page could randomly produce different results depending on which web server, and thus which cache, you hit.
You can still cache data using a shared cache (like Ehcache), or fall back to session stickiness.
I guess it is pretty difficult to be sure that the application will indeed work in a clustered environment, because a singleton in some obscure service, a static member somewhere, anything can potentially produce strange results. You can validate the general architecture for sure, but you'll need to try it for real and perform some validation tests before going into production.

How to identify website visitor when cookies disabled and URL rewriting is not allowed?

In a Java web application, the servlet container creates a unique jsessionid that is passed as a cookie to the client browser to keep track of the client's subsequent requests after the first one. But when cookies are disabled and URL rewriting is not allowed due to security policy, my understanding is that the container will create a new session object for every request from the same client. Is this correct? And does it mean a lot of wasted server memory (excessive memory allocation for each session object that is never going to be used again, and excessive garbage collection)?
One solution in such a scenario is to use the client's IP address and user-agent string to uniquely identify the user, and store that in a database. Is this a correct solution?
The above scenario is fairly common with search engine bots, which typically make thousands of frequent requests when they visit a site.
Any other thoughts on crafting proper solution for this problem for a Java based web application?
Yes, in that situation sessions will be created every time. These do cost memory and will eventually need to be GC'ed.
If you don't need to track users, you can always opt to disable the creation of sessions. In JSP this is a bit tricky, since a page normally always creates a session, but there is a directive to turn this off: <%@ page session="false" %>.
You can, however, write a filter and servlet request wrapper that prevent sessions from being created.
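A minimal sketch of such a filter (a deliberate simplification: strictly speaking, getSession() is specified to create a session, so returning null bends the contract; map the filter only to URL patterns that must never have sessions):

    import java.io.IOException;
    import javax.servlet.*;
    import javax.servlet.http.*;

    // Prevents downstream code (including JSPs) from creating sessions.
    public class NoSessionFilter implements Filter {
        @Override
        public void init(FilterConfig config) {}

        @Override
        public void doFilter(ServletRequest req, ServletResponse res, FilterChain chain)
                throws IOException, ServletException {
            HttpServletRequest wrapped = new HttpServletRequestWrapper((HttpServletRequest) req) {
                @Override
                public HttpSession getSession() {
                    return null; // never create a session
                }

                @Override
                public HttpSession getSession(boolean create) {
                    return null; // not even when explicitly asked
                }
            };
            chain.doFilter(wrapped, res);
        }

        @Override
        public void destroy() {}
    }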
In JSF there is a very unfortunate bug in the much-used Mojarra 2.04 implementation that makes it more or less impossible to do this, but luckily Mojarra 2.1.0 has fixed it.
In case you really do need to track users, a form of fingerprinting could be used. This is always approximate, though, and I don't think you should ever use it for a login. IP + user agent is a form of fingerprinting, but because of proxies, and large organizations installing the exact same browser on all their workstations, it is quite unreliable. It's okay for usage statistics, but totally unsuited for logins.
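A minimal sketch of such a fingerprint (the hash choice and header handling are illustrative; remember this identifies a network/browser combination, not a person):

    import java.nio.charset.StandardCharsets;
    import java.security.MessageDigest;
    import java.security.NoSuchAlgorithmException;
    import javax.servlet.http.HttpServletRequest;

    public class Fingerprint {
        // Approximate visitor key: many distinct users can share one value
        // (proxies, corporate networks with identical browsers).
        public static String of(HttpServletRequest req) {
            String raw = req.getRemoteAddr() + "|" + req.getHeader("User-Agent");
            try {
                MessageDigest md = MessageDigest.getInstance("SHA-256");
                byte[] hash = md.digest(raw.getBytes(StandardCharsets.UTF_8));
                StringBuilder hex = new StringBuilder();
                for (byte b : hash) {
                    hex.append(String.format("%02x", b));
                }
                return hex.toString();
            } catch (NoSuchAlgorithmException e) {
                throw new IllegalStateException(e); // SHA-256 is always available
            }
        }
    }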
Alternatives are using HTTPS/SSL, as this protocol has a built-in kind of "session ID", or using DOM or Flash storage, which not everyone who disables cookies also disables.
