How to manage sessions in a distributed application - java

I have a Java web application which is deployed on two VMs, and NLB (Network Load Balancing) is set up for these VMs. My application uses sessions. I am confused about how the user session is managed across both VMs. For example: if I make a request that goes to VM1 and creates a user session, and my next request goes to VM2 and wants to access the session data, how would it find the session that was created on VM1?
Please help me clear up this confusion.

There are several solutions:
configure the load balancer to be sticky: i.e. requests belonging to the same session always go to the same VM. The advantage is that this solution is simple. The disadvantage is that if one VM fails, half of the users lose their session
configure the servers to use persistent sessions. If sessions are saved to and loaded from a central database, then both VMs will see the same data in the session. You might still want sticky sessions to avoid concurrent access to the same session
configure the servers as a cluster, and distribute/replicate the sessions across all the nodes of the cluster
avoid using sessions, and just use a signed cookie to identify the user (and possibly carry a little additional information). A JSON Web Token could be a good solution. Get everything else from the database when you need it. This ensures scalability and failover and, IMO, often makes things simpler on the server rather than more complicated (see the sketch below).
You'll have to look in your server's documentation to see what it supports, or use a third-party solution.
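To make the last option concrete, here is a minimal sketch of a signed identification cookie using only the JDK; the cookie name, payload format and secret handling are illustrative assumptions, and a real deployment would more likely use a JWT library:

```java
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
import javax.servlet.http.Cookie;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.Base64;

// Sketch: identify the user with a tamper-evident cookie instead of a server-side session.
// SECRET must be the same on every VM behind the load balancer (assumption: in reality it
// would be loaded from configuration, not hard-coded).
public class SignedCookies {

    private static final byte[] SECRET = "change-me".getBytes(StandardCharsets.UTF_8);

    // Cookie value layout: payload + "." + base64url(HMAC-SHA256(payload))
    public static Cookie create(String userId) throws Exception {
        String value = userId + "."
                + Base64.getUrlEncoder().withoutPadding().encodeToString(sign(userId));
        Cookie cookie = new Cookie("AUTH", value);
        cookie.setHttpOnly(true);
        return cookie;
    }

    // Returns the user id if the signature checks out on this VM, null otherwise.
    public static String verify(Cookie cookie) throws Exception {
        String value = cookie.getValue();
        int dot = value.lastIndexOf('.');
        if (dot < 0) {
            return null;
        }
        String payload = value.substring(0, dot);
        byte[] givenSignature = Base64.getUrlDecoder().decode(value.substring(dot + 1));
        return MessageDigest.isEqual(sign(payload), givenSignature) ? payload : null;
    }

    private static byte[] sign(String payload) throws Exception {
        Mac mac = Mac.getInstance("HmacSHA256");
        mac.init(new SecretKeySpec(SECRET, "HmacSHA256"));
        return mac.doFinal(payload.getBytes(StandardCharsets.UTF_8));
    }
}
```

Since every VM derives the signature from the same secret, any of them can validate the cookie without a shared session store.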

We can use a distributed Redis store for the sessions; that would solve this problem.
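For example, assuming the application can take a Spring Session dependency (spring-session-data-redis plus the Lettuce client) and a Redis instance reachable from both VMs, a minimal configuration sketch could look like this:

```java
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.data.redis.connection.RedisConnectionFactory;
import org.springframework.data.redis.connection.lettuce.LettuceConnectionFactory;
import org.springframework.session.data.redis.config.annotation.web.http.EnableRedisHttpSession;

// HttpSession attributes are transparently read from and written to Redis,
// so whichever VM handles the request sees the same session data.
@Configuration
@EnableRedisHttpSession
public class SessionConfig {

    @Bean
    public RedisConnectionFactory connectionFactory() {
        // Assumption: a shared Redis instance at this host/port.
        return new LettuceConnectionFactory("redis-host", 6379);
    }
}
```

In a plain (non-Spring-Boot) web application you would also register Spring Session's filter, for example by extending AbstractHttpSessionApplicationInitializer.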

Related

Liferay 6 persistent sessions with very large table

I've implemented a Liferay 6.2 cluster (with Tomcat 7.x) and configured persistent sessions in the Tomcat configuration.
Everything is working fine, but I've noticed that the table containing the sessions is very large.
Almost 46 GB of space for ~2000 persisted sessions.
Is there any way to reduce the size of the data saved into the session?
I see there is a Liferay property:
session.shared.attributes=COMPANY_,LIFERAY_SHARED_,org.apache.struts.action.LOCALE,PORTLET_RENDER_PARAMETERS_,PUBLIC_RENDER_PARAMETERS_POOL_,USER_
but I don't know whether it is relevant or not.
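Independently of the Liferay properties, it can help to find out which attributes are actually inflating the persisted sessions. A minimal diagnostic sketch using the standard Servlet API (the System.out logging is illustrative; plug in your logger of choice, and register the class in web.xml as a listener):

```java
import javax.servlet.http.HttpSessionAttributeListener;
import javax.servlet.http.HttpSessionBindingEvent;
import java.io.ByteArrayOutputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Logs the serialized size of every attribute added to a session, to spot the ones
// responsible for the large persisted-session table.
public class SessionSizeListener implements HttpSessionAttributeListener {

    @Override
    public void attributeAdded(HttpSessionBindingEvent event) {
        logSize(event.getName(), event.getValue());
    }

    @Override
    public void attributeReplaced(HttpSessionBindingEvent event) {
        // getValue() holds the OLD value here, so re-read the new one from the session.
        logSize(event.getName(), event.getSession().getAttribute(event.getName()));
    }

    @Override
    public void attributeRemoved(HttpSessionBindingEvent event) {
        // nothing to measure
    }

    private void logSize(String name, Object value) {
        if (!(value instanceof Serializable)) {
            return;
        }
        try (ByteArrayOutputStream bytes = new ByteArrayOutputStream();
             ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
            out.flush();
            System.out.println("session attribute '" + name + "' ~ " + bytes.size() + " bytes");
        } catch (Exception e) {
            System.out.println("session attribute '" + name + "' could not be serialized: " + e);
        }
    }
}
```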
As Liferay says, session replication is not recommended.
https://web.liferay.com/es/community/wiki/-/wiki/Main/Clustering
Install an http load balancer and make sure your load balancer is set to sticky session mode. It is not recommended to use session replication for clustering.
Why? Because it does not scale. In 99% of cases you can use a load balancer with session affinity. It is not the best way to scale a cluster, but it is better than session replication.
The best way would be to implement session management with JWT (JSON Web Token) or a similar mechanism, without Java sessions, so that no node knows anything about sessions; that is the only way to scale linearly.

Global Variables not updated due to load balancing

There are two servers running a web service and the servers are load balanced using HAProxy.
The web service does a POST and updates a global variable with a certain value, and another application running on the same servers reads this global value and does some processing.
My issue is that when I set the value of the global variable using the web service, only one server gets updated because of the load balancer. As a result, when the application reads the global value and the request goes to the second server, it won't see it, because the global variable was not updated on that server.
Can someone tell me how to handle a situation like this? Both the web service and the application are Java based.
First, relying on global data is probably a bad idea. A better solution would be to persist the data in some common store, such as a database or a (shared) Redis cache. Second, you could - if necessary - use sticky sessions on the load balancer so that subsequent requests always return to the same web server, as long as it is available. The latter qualification is one of the reasons why a shared cache or database solution should be preferred - the server may go down for maintenance or some other issue during the user's session.
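As an illustration of the shared-store option, here is a minimal sketch using the Jedis client; the host, key name and the idea of a single "flag" value are assumptions, and a production setup would use a connection pool:

```java
import redis.clients.jedis.Jedis;

// Replaces the in-process global variable with a value shared by both servers,
// so it no longer matters which server the load balancer picks.
public class SharedFlag {

    private static final String KEY = "webservice:flag";   // hypothetical key name

    // Called by the web service on whichever server receives the POST.
    public static void set(String value) {
        try (Jedis jedis = new Jedis("redis-host", 6379)) { // assumption: shared Redis instance
            jedis.set(KEY, value);
        }
    }

    // Called by the other application on either server.
    public static String get() {
        try (Jedis jedis = new Jedis("redis-host", 6379)) {
            return jedis.get(KEY);
        }
    }
}
```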

How to load existing http sessions from an existing server to another server?

I would like to know if the Servlet specification provides a way to load HTTP sessions into my web application.
The idea is simple: every time a new HTTP client connects, a new session is created... and I will send this session and its values to a database (for the time being, this step is easy to do).
If this "master server" dies, another machine will take over its IP address, so HTTP clients will now send their requests to this new machine (let's call it the "slave server").
Here I would like my slave server to retrieve the sessions from the old server... but I don't know which method from the Servlet specification can "add" a session! Is there a way to do it?
PS: it's for a university project, so I cannot use already existing modules like Tomcat's mod_jk for this homemade load balancer.
EDIT:
I think a lot of people think I am crazy not to use already existing tools. It's a university project, and I have to build it with my bare hands in order to show my professors the low-level mechanisms I have used. I already know it would be crazy to use what I am doing in production; when this project is finished, it will be thrown in the trash.
For the moment, I haven't found a "standard way" to do it with the Servlet specification, but maybe I can do it with the Manager and Session interfaces from Tomcat's native classes... How can I get the instances of those interfaces?
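For reference: the Servlet specification indeed has no call to re-create a session with a given id, but a filter can approximate the behaviour by persisting attributes after each request and copying them into a freshly created session when an unknown id shows up. A rough sketch, where the abstract load/save hooks stand in for whatever database access you write yourself:

```java
import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpSession;
import java.io.IOException;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Sketch only: register it in web.xml (or with @WebFilter) and implement the two
// hooks against your database. Not production-grade failover.
public abstract class SessionFailoverFilter implements Filter {

    protected abstract Map<String, Object> load(String sessionId);

    protected abstract void save(String sessionId, Map<String, Object> attributes);

    @Override
    public void doFilter(ServletRequest req, ServletResponse res, FilterChain chain)
            throws IOException, ServletException {
        HttpServletRequest request = (HttpServletRequest) req;
        String requestedId = request.getRequestedSessionId();

        // The client sent an id this server doesn't know: the session was created on the
        // machine that died, so restore its attributes into a brand-new session.
        if (requestedId != null && !request.isRequestedSessionIdValid()) {
            HttpSession fresh = request.getSession(true);
            for (Map.Entry<String, Object> e : load(requestedId).entrySet()) {
                fresh.setAttribute(e.getKey(), e.getValue());
            }
        }

        chain.doFilter(req, res);

        // Persist the (possibly updated) attributes after every request.
        HttpSession session = request.getSession(false);
        if (session != null) {
            Map<String, Object> attrs = new HashMap<>();
            for (String name : Collections.list(session.getAttributeNames())) {
                attrs.put(name, session.getAttribute(name));
            }
            save(session.getId(), attrs);
        }
    }

    @Override
    public void init(FilterConfig filterConfig) {
    }

    @Override
    public void destroy() {
    }
}
```

The restored session naturally gets a new id, so after the first request on the slave server its attributes are saved again under that id.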
This isn't exactly a new idea and is called session replication. There are a couple of ways to do this. The easiest ones imho are (in ascending order of preference):
Jetty's Session clustering with a database
Tomcat's Session clustering. I personally prefer the BackupManager, which makes sure that a session lives on 2 servers in a cluster at any given point in time and forwards clients accordingly. This reduces the network traffic for session replication to a bare minimum.
Session replication with a distributed cache like Hazelcast or Ehcache. There are plugins for both Jetty and Tomcat to do this. Since a cache is most often used anyway, this is the best solution for me. What I tend to do is put 2 round-robin balanced Varnish servers in front of such a cluster, which serve the dual purpose of load balancing the cluster and serving static content from an in-memory cache.
As for your university project, I'd turn in an embedded Jetty with automatic session replication which connects to the other servers via broadcast using Hazelcast. Useful, not overcomplicated (iirc, you need to implement 2 relatively simple interfaces), yet powerful. Put a Varnish in front of your test machines and you should be good to go.
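To illustrate the distributed-cache idea itself (the Jetty/Tomcat plugins handle the servlet integration for you), here is a minimal Hazelcast sketch; instances started with the same configuration discover each other and share the map, so each server sees entries written by the others:

```java
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;

import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

public class DistributedSessionMapDemo {
    public static void main(String[] args) {
        // Each web server embeds one instance; members discover each other
        // (multicast by default, configurable for real deployments).
        HazelcastInstance hz = Hazelcast.newHazelcastInstance();

        // Cluster-wide map: session id -> serializable attributes.
        Map<String, HashMap<String, Serializable>> sessions = hz.getMap("sessions");

        HashMap<String, Serializable> attributes = new HashMap<>();
        attributes.put("cartSize", 3);
        sessions.put("some-session-id", attributes);

        System.out.println("visible on every member: " + sessions.get("some-session-id"));
        Hazelcast.shutdownAll();
    }
}
```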
This feature is supported out of the box by all major Java EE application server vendors, so you shouldn't implement anything by yourself. As Markus wrote, it is referred to as session replication or session persistence. You can take a look at WebSphere Liberty, which is available for free for development. It supports this out of the box, without the need to implement anything. You just need to:
install Liberty ("Download just the Liberty profile runtime")
configure session replication ("Configuring session persistence for the Liberty profile")
install and configure IBM HTTP Server for load balancing ("Configuring a web server plug-in for the Liberty profile")

Minimizing Customer Interruption During Deployment

How do you deploy new code into production so that the customer experience is not interrupted?
Let's assume an e-commerce website that's in a load balanced environment, where no session state is shared. Tomcat is the application server.
The only ideas I can come up with are (1) use JavaRebel, but I have no experience with it and don't know what could go wrong (maybe a session object mismatch, for example if you remove a member from a class); (2) have real-time monitoring of where your users are in the shopping experience and prevent new items from being added to the cart; wait until all existing shoppers have completed their orders or have expired; then turn off the server and deploy the new code.
Any ideas? Or is this a situation where it's critical to share session data among the web servers with something like Terracotta? Then again, how do you deploy new code to the web servers when a member of an object in the session has been removed or added?
Common approaches I've seen:
if there's not much state then store it client-side, in a cookie - this also has the advantage of not requiring IP affinity so requests can be distributed more evenly
shared storage (DB, Terracotta, etc.) - obviously this can become a bottleneck
deploy at a quiet time and hope nobody notices
remove server from pool for new sessions, then monitor the logs and wait for requests to die off
A common approach is to take one server at a time out of the load balancing pool, deploy the new code to it, and then put it back into the pool.

Scalable http session management (java, linux)

Is there a best-practice for scalable http session management?
Problem space:
Shopping cart kind of use case. User shops around the site, eventually checking out; session must be preserved.
Multiple data centers
Multiple web servers in each data center
Java, linux
I know there are tons of ways of doing this, and I can always come up with my own specific solution, but I was wondering whether Stack Overflow's wisdom of the crowd can help me focus on best practices
In general there seem to be a few approaches:
Don't keep sessions; always run stateless, religiously [doesn't work for me...]
Use J2EE, EJB and the rest of that gang
Use a database to store sessions. I suppose there are tools to make that easier so I don't have to craft it all by myself
Use memcached for storing sessions (or some other kind of intermediate, semi-persistent storage)
Use a key-value DB, "more persistent" than memcached
Use "client-side sessions", meaning all session info lives in hidden form fields and is passed back and forth between client and server. Nothing is stored on the server.
Any suggestions?
Thanks
I would go with some standard distributed cache solution.
It could be provided by your application server, could be memcached, could be Terracotta.
Probably doesn't matter too much which one you choose, as long as you are using something sufficiently popular (so you know most of the bugs are already hunted down).
As for your other ideas:
Don't keep sessions - as you said, not possible
Client-side sessions - too insecure - suppose someone hacks the cookie to put discounted prices in the shopping cart
Use a database - databases are usually the hardest bottleneck to solve; don't put any more load there than you absolutely have to.
Those are my 2 cents :)
Regarding multiple data centers - you will want to have some affinity of the session to the data center it started in. I don't think there are any distributed cache solutions that can work across different data centers.
You seem to have missed vanilla replicated HTTP sessions from your list. Any servlet container worth its salt supports replication of sessions across the cluster. As long as the items you put into the session aren't huge, and are serializable, it's very easy to make it work.
http://tomcat.apache.org/tomcat-6.0-doc/cluster-howto.html
edit: It seems, however, that Tomcat session replication doesn't scale well to large clusters. For that, I would suggest using JBoss+Tomcat, which offers "buddy replication":
http://www.jboss.org/community/wiki/BuddyReplicationandSessionData
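To make the serializability requirement concrete, here is a small sketch of a session attribute that replicates cleanly (the class and field names are illustrative):

```java
import java.io.Serializable;
import java.math.BigDecimal;

// Everything reachable from a replicated session attribute must be Serializable,
// otherwise the container cannot ship it to the other nodes.
public class CartItem implements Serializable {

    private static final long serialVersionUID = 1L;

    private final String sku;
    private final int quantity;
    private final BigDecimal unitPrice;

    public CartItem(String sku, int quantity, BigDecimal unitPrice) {
        this.sku = sku;
        this.quantity = quantity;
        this.unitPrice = unitPrice;
    }

    public String getSku() { return sku; }
    public int getQuantity() { return quantity; }
    public BigDecimal getUnitPrice() { return unitPrice; }
}
```

Stored with request.getSession().setAttribute("cartItem", item), such an object can be serialized by the container and shipped to the other nodes.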
I personally haven't managed such clusters, but when I took a J2EE course at university the lecturer said to store sessions in a database and not to try to cache them. (You can't meaningfully cache dynamic pages anyway.) HTTP sessions are client-side by definition, as the session id is a cookie. If the client refuses to store cookies (e.g. he's paranoid about tracking), then he can't have a session.
You can get this id by calling HttpSession.getId().
Of course database is a bottleneck, so you'll end up with two clusters: an application server cluster and a database cluster.
As far as I know, both stateful session beans and regular servlet HTTP sessions exist only in memory, with no load balancing built in.
Btw, I wouldn't store an e-mail address or username in a hidden field, but maybe the contents of the cart aren't that sensitive.
I would rather move away from storing user application state in an HTTP session, but that requires a different way of thinking about how the application works and a RESTful stateless architecture. This normally involves dropping support for earlier browsers that do not support MVW architectures on the client side.
The shopping cart isn't user application state, it is application state, which means it should be stored in a database and managed as such. There can be an association table linking the user to one or many shopping carts, assuming the sharing of carts is possible (see the sketch below).
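A hedged sketch of that idea with JPA (the entity and field names are assumptions, not an existing schema):

```java
import javax.persistence.ElementCollection;
import javax.persistence.Embeddable;
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.Id;
import java.util.ArrayList;
import java.util.List;

// The cart is application state stored in the database, not in the HTTP session,
// so any node in any data center can load it by user, with no server affinity needed.
@Entity
public class ShoppingCart {

    @Id
    @GeneratedValue
    private Long id;

    // Association to the owning user; a separate join table would allow shared carts.
    private Long ownerUserId;

    @ElementCollection
    private List<CartLine> lines = new ArrayList<>();

    @Embeddable
    public static class CartLine {
        private String sku;
        private int quantity;
    }

    // getters/setters omitted for brevity
}
```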
Your biggest hurdle will likely be how to authenticate the user on every request if it is stateless. BASIC auth is the simplest approach that does not involve sessions; FORM auth will require sessions regardless. A JASPIC implementation (like HTTP headers or OAuth) can move your authentication concerns elsewhere, in which case a cookie can be used to manage your authentication token (like FORM auth), or an HTTP header as with SiteMinder, or client-side certificates with Apache.
The more expensive databases like DB2 have High Availability and Disaster Recovery features that work across multiple data centers. Note that this is not meant for load balancing the database, since there would be a large impact from network traffic.
