Closing an HTTP Session for Writing in Java / Tomcat

When working on an ASP.NET application, I discovered that placing something in the session cache, or really, accessing variables in the session cache, caused my Ajax queries to stop being asynchronous. I learned that this was because the session basically blocks - if I fire two Ajax requests from my browser at the same time, and the first one takes a bit to return, the session is locked in the first request until that request is completed, at which point my second Ajax request starts working.
In PHP I gather that there is an option to close the session for writing (and / or open it in a read-only way) so that session variable access is non blocking and things stay asynchronous.
I'm building an application that will be Java, probably running on Tomcat (though I could change to some other container if I needed) and I am not able to find out whether Java has the same issue (session variable reads block) or has the same remedy (early close, read only mode). Has anyone encountered that issue before?

In Tomcat, HttpSession is implemented in org.apache.catalina.session.StandardSession (source here).
If you look at the source, you will see that calls to HttpSession.getAttribute(String) and HttpSession.setAttribute(String, Object) are pretty much channelled to a ConcurrentHashMap without any additional synchronization.
This means that these calls derive the contract of ConcurrentHashMap. Quoting its Javadoc:
retrieval operations do not entail locking, and there is not any support for locking the entire table in a way that prevents all access. <..> Retrieval operations (including get) generally do not block, so may overlap with update operations (including put and remove)
The table is internally partitioned to try to permit the indicated number of concurrent updates without contention. Because placement in hash tables is essentially random, the actual concurrency will vary.
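To make the point about unsynchronized access concrete, here is a minimal sketch that uses a plain ConcurrentHashMap as a stand-in for the session's attribute map. The class and method names (AttributeMapDemo, fillConcurrently) are made up for illustration and are not part of Tomcat; the only thing borrowed from the answer above is the idea that attribute reads and writes go straight to a ConcurrentHashMap.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class AttributeMapDemo {

    // Several threads write and read their own keys with no external lock.
    // Returns the number of entries that ended up in the map.
    static int fillConcurrently(int threads, int keysPerThread) {
        // Stand-in for a session's attribute map (assumption, not Tomcat code).
        Map<String, Object> attributes = new ConcurrentHashMap<>();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int id = t;
            workers[t] = new Thread(() -> {
                for (int k = 0; k < keysPerThread; k++) {
                    String key = "t" + id + "-k" + k;
                    attributes.put(key, k); // plays the role of setAttribute
                    attributes.get(key);    // plays the role of getAttribute
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return attributes.size();
    }

    public static void main(String[] args) {
        System.out.println(fillConcurrently(4, 1000)); // prints 4000
    }
}
```

No write is lost even though nothing is locked by the caller, which is the behaviour the Javadoc quote describes.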

The blocking appears to come from thread synchronization of access to the HttpSession, as described in this SO answer.
So the second request should be blocked only while the first one is working with the HttpSession (or if the first request holds some shared lock of your own for a long time, but that has nothing to do with Tomcat).
Since this synchronization is required by the Servlet spec, you shouldn't try to circumvent it. Instead, design your app to minimize the time it spends reading from or writing to the HttpSession.
Additionally, as noted above, blocking may occur if you have an additional lock that makes several requests execute sequentially. Take several thread dumps of Tomcat after you have sent the second request and see whether there is any such lock that the second request is waiting for.
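A thread dump is normally taken from outside the JVM, but the same information is available in-process through the standard java.lang.management API. The sketch below is only an illustration of what to look for: it sets up two threads named "request-1" and "request-2" (hypothetical names) fighting over one monitor and then lists every BLOCKED thread together with the owner of the lock it is waiting for.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;

final class BlockedThreadReport {

    // Lists every thread that is currently blocked on a monitor.
    static List<String> blockedThreads() {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        List<String> report = new ArrayList<>();
        for (ThreadInfo info : bean.dumpAllThreads(false, false)) {
            if (info != null && info.getThreadState() == Thread.State.BLOCKED) {
                report.add(info.getThreadName() + " waits for lock held by "
                        + info.getLockOwnerName());
            }
        }
        return report;
    }

    // Simulates one request holding a lock while a second one waits for it.
    static List<String> demo() {
        Object lock = new Object();
        CountDownLatch holderHasLock = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        Thread holder = new Thread(() -> {
            synchronized (lock) {
                holderHasLock.countDown();
                try {
                    release.await(); // keep the lock until the dump is taken
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }, "request-1");
        Thread waiter = new Thread(() -> {
            synchronized (lock) {
                // nothing to do; the point is waiting for the lock
            }
        }, "request-2");
        try {
            holder.start();
            holderHasLock.await();
            waiter.start();
            while (waiter.getState() != Thread.State.BLOCKED) {
                Thread.sleep(5);
            }
            List<String> report = blockedThreads();
            release.countDown();
            holder.join();
            waiter.join();
            return report;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

In a real Tomcat the thread names would be the connector's worker threads, and the lock owner's stack trace in the dump shows where the long-held lock was taken.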

Related

HttpServletRequest.getSession(true) thread safe?

I see a lot of questions concerning whether the setAttribute and getAttribute methods on HttpSession are atomic. They are not. But, is the actual call of request.getSession(true) atomic per client?
For example, if you had a servlet filter and a client issued two simultaneous calls that simultaneously reached the line request.getSession(true), would the same session object be returned? I assume such a thing would be container specific? Or are you guaranteed a synchronized getSession call per requesting client?

No, it is not.
Servlet Spec says...
2.3.3.4 Thread Safety
Other than the startAsync and complete methods, implementations of the request and response objects are not guaranteed to be thread safe. This means that they should either only be used within the scope of the request handling thread or the application must ensure that access to the request and response objects are thread safe.
If a thread created by the application uses the container-managed objects, such as the request or response object, those objects must be accessed only within the object’s life cycle as defined in sections 3.10 and 5.6. Be aware that other than the startAsync, and complete methods, the request and response objects are not thread safe. If those objects were accessed in the multiple threads, the access should be synchronized or be done through a wrapper to add the thread safety, for instance, synchronizing the call of the methods to access the request attribute, or using a local output stream for the response object within a thread.
And to your question?
Is it possible that two concurrent calls to getSession return different HttpSession objects even though they come from the same client?
The answer is yes:
it will return two session objects,
two Set-Cookie headers will be sent to the client,
the latest Set-Cookie might override the first one.

Not sure what you are really concerned here:
For example if you had a servlet filter, and a client issues two
simultaneous calls which simultaneously reach the line:
request.getSession(true)
would the same session object be returned?
It depends on what you mean by same session object, i.e. whether you mean s1 == s2 or s1.equals(s2). I can't find anything stating that the object must be the same (==), but even if they are different objects, they can still refer to the same logical session. Imagine these session objects as database clients: they are not the data, but they all view the same data, i.e. they read and write to a common place.
Now, to answer your question, we must decide if the client issued the second request before reading any other response from the same server: a session must be tracked with a piece of input (either in the URL or in the HTTP headers, in the form of a cookie), so we have the following scenarios:
Client makes request #1, gets a session, and sends the session ID back to the server in two simultaneous requests #2 and #3: they will share the session
Client makes request #1 and #2 almost at the same time, without any previous request to the same application. Since no input is provided to the server (no session ID) two new sessions are created, even if the clients don't hit the getSession() line at the same moment. Depending on the client application, this may be a bug or not.
So this is not a problem with threads at all. It just depends on the input supplied by the client. Same session ID, same session returned. Different (or no) session ID, different sessions.
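The rule "same session ID, same session; no ID, new session" can be sketched with a toy registry. SessionRegistry, resolve and attributes are hypothetical names invented for this illustration; a real container does considerably more (timeouts, cookies, ID validation), so treat this purely as a model of the two scenarios above.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

final class SessionRegistry {

    // Session ID -> attribute map of that session.
    private final Map<String, Map<String, Object>> sessions = new ConcurrentHashMap<>();

    // Returns the supplied ID if it names a known session,
    // otherwise creates a new session and returns its fresh ID.
    String resolve(String requestedId) {
        if (requestedId != null && sessions.containsKey(requestedId)) {
            return requestedId;
        }
        String id = UUID.randomUUID().toString();
        sessions.put(id, new ConcurrentHashMap<>());
        return id;
    }

    Map<String, Object> attributes(String id) {
        return sessions.get(id);
    }

    public static void main(String[] args) {
        SessionRegistry registry = new SessionRegistry();
        String first = registry.resolve(null);   // request without a session ID
        String second = registry.resolve(null);  // another one: a second session
        String again = registry.resolve(first);  // request carrying the first ID
        System.out.println(first.equals(second)); // false
        System.out.println(first.equals(again));  // true
    }
}
```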
Just for the sake of correctness, a logical client (a single program, like Firefox) can even make N requests in N separate threads on a N+ cores machine, but the network is usually shared. Assuming it has a multihomed machine, and each NIC is connected to a separate network, you'll need your servlet container to listen to multiple IP addresses and have N processors (or cores). This is just to say that there's no need to have two simultaneous calls, though it's perfectly possible that requests from the same client are processed in parallel and thus reach the same line at the same moment.

Nothing in the API suggests that the method is synchronized, though I don't know what is going on inside the method. Doing getSession(true) returns a new session if one doesn't exist. The check for an existing session and the creation of a new one is your critical section. If that is reached simultaneously by your threads, then two sessions will have been created and returned to your two different calling entities (filters will be applied before and, possibly, after your servlets so I don't see how you can do this, but for the question's sake let's assume it can happen). If a Session object already exists, then only that one will be returned.

Portlets, HttpSession and Thread-Safety

Our portlets keep state in the HttpSession, which is shared by all request processing threads for the same session.
The portlet spec (JSR-168) writes:
PLT.5.2.4.3 Multithreading Issues During Request Handling
The portlet container handles concurrent requests to the same portlet by concurrent
execution of the request handling methods on different threads. Portlet developers must
design their portlets to handle concurrent execution from multiple threads from within the
processAction and render methods at any particular time.
I wonder how I am supposed to achieve that? Sure, I can use synchronization to achieve mutual exclusion during both processAction and render, but I don't see how I can enforce atomicity of request processing as a whole. In particular, I worry about the following scenario:
Thread 1 executes processAction, loading data into the session for later rendering
Thread 2 executes processAction, discarding that data from the session
Thread 1 executes render, reading the data to render from the session, and throws a NullPointerException because the prepared data is no longer there ...
How is that scenario usually prevented? In particular, when using the JBoss portlet bridge to adapt JSF to a Portlet environment?

I'd say that if there are two portlets operating on the same data, especially one reading it while the other deletes it, there's most likely a serious flaw in the design.
You might then want to store the data per portlet/thread, i.e. if portlet1 reads some data you should write lock it until reading is finished and put it into the session using a unique key.
If it is legal to delete data that should be rendered, then you should account for that and check again during render.
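As a rough sketch of "check again during render": read the attribute once into a local variable and handle the case where another request has already removed it. A plain Map stands in for the session here, and the attribute name "preparedData" is made up for the example.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class RenderGuard {

    // Renders the prepared data, or a fallback if it has been discarded.
    static String render(Map<String, Object> session) {
        // Read exactly once; a second read could observe a different state.
        Object prepared = session.get("preparedData");
        if (prepared == null) {
            return "No data to display";
        }
        return "Rendering " + prepared;
    }

    public static void main(String[] args) {
        Map<String, Object> session = new ConcurrentHashMap<>();
        session.put("preparedData", "report");
        System.out.println(render(session)); // prints "Rendering report"
        session.remove("preparedData");
        System.out.println(render(session)); // prints "No data to display"
    }
}
```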

Should volatile be used for attributes of domain model classes in Java web apps?

Here's my thinking:
Even though a HTTP request cycle is essentially handled by a 'single thread', each time a HTTP request is processed for that same session it is likely to be processed by a different thread from the thread pool.
Without the volatile keyword being used on a domain model object, whose lifecycle extends across multiple HTTP requests for the same session, then, according to my understanding, isn't it possible that the attribute could be thread local cached (an optimization by the compiler) in the thread that serviced the first HTTP request? If the second HTTP request is serviced by another thread then that second thread may not see the changes in that attribute that were made by the first thread.
Does this spell "Danger Will Robinson"? Or am I missing a vital plot point about the use (or not) of the volatile keyword?

I think you are forgetting that the threads handling the HTTP request first need to retrieve the instance of the domain model object from the HttpSession provided by your application server. The thread handling request 2 in the scenario you describe does not already have an instance of this domain model - it has to retrieve it from the session implementation at the start of handling each and every request.
I think it is completely reasonable to assume that the session-handling implementation in your application server is handling session data in such a way that memory model visibility issues are avoided. Apache Tomcat's default (non-clustered) HttpSession implementation, for example, stores the session attributes in a ConcurrentHashMap.
Adding volatile seems completely unnecessary to me. I have never seen this done for domain model objects handled by HTTP requests in a Servlet environment in any project I have worked in.
This would be a different story if thread-1 and thread-2 held references to the same object instance simultaneously while processing two different requests, and you were concerned about changes made in one thread being visible to the other while both are still processing, but this does not sound like what you are asking about.

Yes, if you are sharing an object between different threads, you may have race conditions. Without a happens before relationship, writes made by one thread may not be seen by a read in another thread.
Doing a volatile write in one thread and doing a volatile read of the same field in another thread establishes a happens before relationship between the two threads, and ensures visibility of the write.
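A small sketch of that happens-before edge, with made-up names (VolatileHandoff, payload, ready): the writer stores a plain field and then sets a volatile flag; a reader that sees the flag set is guaranteed to also see the earlier plain write.

```java
final class VolatileHandoff {

    private int payload;            // plain field, written before the flag
    private volatile boolean ready; // volatile flag that publishes the payload

    static int transfer() {
        VolatileHandoff box = new VolatileHandoff();
        int[] seen = new int[1];
        Thread reader = new Thread(() -> {
            while (!box.ready) {    // volatile read
                Thread.yield();
            }
            seen[0] = box.payload;  // ordered after the volatile read
        });
        reader.start();
        box.payload = 42;           // plain write
        box.ready = true;           // volatile write, publishes payload
        try {
            reader.join();          // join makes seen[0] visible to this thread
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(transfer()); // prints 42
    }
}
```

Note that this only covers visibility of a single hand-off; it does nothing for compound updates, which is one reason volatile alone is rarely the whole answer.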
This is a complicated problem; simply using a volatile keyword is probably not a good solution.

I think your understanding of it is correct. Given your description I would say it should be used. If it's something more than a primitive type, I would rather synchronize.
Good information on volatile:
http://www.javamex.com/tutorials/synchronization_volatile_when.shtml

If you have a mutable object in session, that is trouble. But usually the solution is not to guard individual fields; rather the entire object should be swapped.
Say you have the user object in the session. Most requests simply retrieve it, read it and display it.
There is a request that can modify user information. It would be a really bad idea to retrieve the user object and modify it in place. It's better to create a completely new user object and insert it into the session.
In that case, fields in User don't need any protection; thread safety is guaranteed by session setAttribute() - getAttribute()
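A minimal sketch of the swap approach, assuming an immutable User with two illustrative fields and a plain Map standing in for the session (none of these names come from a real API): the modifying request builds a new object and replaces the attribute, so readers holding the old instance never see a half-updated user.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Immutable value object: all fields final, no setters.
final class User {
    private final String name;
    private final String email;

    User(String name, String email) {
        this.name = name;
        this.email = email;
    }

    String name() { return name; }
    String email() { return email; }

    // Returns a modified copy instead of mutating this instance.
    User withEmail(String newEmail) {
        return new User(name, newEmail);
    }
}

final class UserSwapDemo {

    // Replaces the whole "user" attribute with an updated copy.
    static User changeEmail(Map<String, Object> session, String newEmail) {
        User current = (User) session.get("user");
        User updated = current.withEmail(newEmail);
        session.put("user", updated); // stand-in for setAttribute
        return updated;
    }

    public static void main(String[] args) {
        Map<String, Object> session = new ConcurrentHashMap<>();
        User original = new User("alice", "old@example.com");
        session.put("user", original);
        changeEmail(session, "new@example.com");
        System.out.println(original.email());                    // old@example.com
        System.out.println(((User) session.get("user")).email()); // new@example.com
    }
}
```

Two concurrent modifications can still overwrite each other (last write wins); the swap only guarantees that every reader sees a complete object.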

If you have concurrency issues, just adding 'volatile' probably won't help you.
As for keeping the object as an attribute of Session, I'd recommend keeping just the object's ID and using it to retrieve a 'live' instance when you need it (if you use Hibernate, successive retrieves will return the same object, so this shouldn't cause performance problems). Encapsulate all modification logic for this specific object in a single façade, and do the concurrency control there, using database locking.
Or, if you really, really, really want to use memory-based locking, and are really sure that you'll never have two instances of the application running in a cluster, make sure that your façade logic is synchronized at the right level. If your synchronization is too fine grained (low-level operations, such as volatile variables), it probably won't be enough to make your code thread-safe. For example, java.util.Hashtable is fully synchronized, but it doesn't mean anything if you have logic like this:
01 if (!hashtable.containsKey(key)) {
02 hashtable.put(key, calculate(key));
03 }
If two threads, say, t1 and t2, hit this block at the same time, t1 may execute line 01, then t2 may also execute 01, and then 02, and t1 then will execute 02, overwriting what t2 had done. The operations containsKey() and put() are atomic individually, but what should be atomic is the whole block.
Sometimes recalculating a value doesn't matter, but sometimes it does, and it will break.
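One way to make the whole check-then-put block atomic in memory is ConcurrentHashMap.computeIfAbsent, which runs the mapping function at most once per key. This swaps the Hashtable from the snippet above for a ConcurrentHashMap; calculate is the same placeholder function as above, given a trivial body here, and the counter exists only to show how often it runs.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

final class AtomicPutIfMissing {

    // Counts how many times calculate() actually ran.
    static final AtomicInteger calculations = new AtomicInteger();

    // Placeholder for an expensive computation.
    static String calculate(String key) {
        calculations.incrementAndGet();
        return key.toUpperCase();
    }

    // Many threads ask for the same missing key at once.
    // Returns how many times the value was calculated.
    static int raceOnSameKey(int threads) {
        calculations.set(0);
        ConcurrentHashMap<String, String> cache = new ConcurrentHashMap<>();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            // Check and insert happen as one atomic step per key.
            workers[i] = new Thread(
                    () -> cache.computeIfAbsent("k", AtomicPutIfMissing::calculate));
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return calculations.get();
    }

    public static void main(String[] args) {
        System.out.println(raceOnSameKey(8)); // prints 1
    }
}
```

This only helps within a single JVM, so the caveat about clustered deployments still applies.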
When it comes to concurrency, there's no magic. I mean, seam some crappy frameworks try to sell you the idea that they solve this problem for you. They don't. Even if it works 99% of the time, it will break spectacularly when you go to production and start to get heavy traffic. Or (much, much) worse, it will silently generate wrong results.
Concurrency is one of the most complex problems in programming, and the only way to handle it is to avoid it. The whole functional programming trend is not about dealing with concurrency; it is about avoiding it altogether.

It turns out that volatile was not needed in the end. The problem that "appeared" to be fixed with volatile was actually a very subtle timing sensitive bug that was fixed in a much more elegant and proper way ;)
So sbrigdes was correct when he said "simply using a volatile keyword is probably not a good solution."

Using a JMS Session from different threads

From the javadoc for Session it states:
A Session object is a single-threaded context for producing and consuming messages.
So I understand that you shouldn't use a Session object from two different threads at the same time. What I'm unclear on is if you could use the Session object (or children such as a Queue) from a different thread than the one it created.
In the case I'm working on, I'm considering putting my Session objects into a pool of available sessions that any thread could borrow from, use, and return to the pool when it is finished with it.
Is this kosher?
(Using ActiveMQ BTW, if that impacts the answer at all.)

I think the footnote from section 4.4 in the JMS 1.1 spec sheds some light:
There are no restrictions on the number of threads that can use a Session object or those it creates. The restriction is that the resources of a Session should not be used concurrently by multiple threads. It is up to the user to insure that this concurrency restriction is met. The simplest way to do this is to use one thread. In the case of asynchronous delivery, use one thread for setup in stopped mode and then start asynchronous delivery. In more complex cases the user must provide explicit synchronization.
By my reading of the spec what you want to do is OK, provided you correctly manage concurrency.
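Under that reading, the pool from the question could look roughly like the sketch below. It deliberately uses no JMS classes: BorrowPool is a hypothetical generic helper, and the strings in the example merely stand in for Session objects. A resource is removed from the queue while in use, so no two threads can hold it at the same time, and the queue hand-off also makes one thread's work visible to the next borrower.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Function;

final class BorrowPool<T> {

    // Resources that are currently not borrowed by any thread.
    private final BlockingQueue<T> idle;

    BorrowPool(Collection<T> resources) {
        this.idle = new LinkedBlockingQueue<>(resources);
    }

    // Borrows a resource (blocking if none is idle), runs the work,
    // and always returns the resource to the pool afterwards.
    <R> R withResource(Function<T, R> work) {
        T resource;
        try {
            resource = idle.take();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        try {
            return work.apply(resource);
        } finally {
            idle.add(resource);
        }
    }

    int idleCount() {
        return idle.size();
    }

    public static void main(String[] args) {
        BorrowPool<String> pool = new BorrowPool<>(Arrays.asList("s1", "s2"));
        String used = pool.withResource(session -> "sent via " + session);
        System.out.println(used.startsWith("sent via s")); // prints true
        System.out.println(pool.idleCount());              // prints 2
    }
}
```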

Sadly the JMS docs are often not written as clearly or precisely as we might like :o(
But reading the spec I'm now pretty convinced you really shouldn't access the session from other threads, even if you guarantee there's no concurrent access. The bit of the javadoc that swung it for me was:
Once a connection has been started, any session with a registered message listener(s) is dedicated to the thread of control that delivers messages to it. It is erroneous for client code to use this session or any of its constituent objects from another thread of control. The only exception to this is the use of the session or connection close method.
Note the clear use of 'thread of control' and the singling out of 'close()' as the only exception.
They seem to be saying that even if you're using asynchronous message consumption (i.e. setMessageListener) - which means you get called back on another thread created by JMS to receive messages - you're never allowed to touch the session or related objects again from any other thread, because the session is now 'dedicated' to the JMS delivery thread. For example, I assume this means you couldn't even call message.acknowledge() from another thread.
Having said that, I only just noticed that we haven't been obeying this constraint, and have yet to notice any ill effects (using SonicMQ). But of course if you don't obey the standard, all bets are off, so I guess we need to obey the 1-thread 1-session rule to stay safe.

Handling requests using threads

I am writing an application using JSP and JDBC, where I have a table named "COMMENT_DATA" in which users can post their comments. If more than one user writes and posts comments at the same time, I plan to use threads, so I will be synchronizing the method that inserts the data into the database. How do I then handle the other requests, i.e. how do I queue them and later take them back so they can write to the database?

Exactly. Each HTTP request already runs in its own thread. Keep in mind that the web container creates only one servlet instance during the application's lifetime and that the servlet code is shared among all requests. This implies that any class-level or static variables are shared among all requests. If you have such a variable, it is not thread-safe. You need to declare request-specific variables as local variables inside the method.
As to JDBC: just write solid code and everything should go well. Using a connection pool only improves connecting performance (which is really worth the effort, believe me: connecting to the DB is a fairly expensive task which can take 200 ms or more, while reusing a connection from the pool costs almost nothing). It doesn't change anything about the thread safety of the code you write, though; that is still in your hands. To get a clear picture of how to do basic JDBC coding the right way, you may find this article useful.
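The difference between shared fields and method-local variables can be sketched without the Servlet API. CommentHandler below is a hypothetical stand-in for a servlet: one instance is shared by all request threads, its only field is a thread-safe counter, and everything request-specific lives in local variables.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class CommentHandler {

    // Shared by every thread that calls handle(), so it must be thread-safe.
    private final AtomicInteger handled = new AtomicInteger();

    String handle(String author, String text) {
        // Local variables belong to the calling thread only.
        String trimmed = text.trim();
        int sequence = handled.incrementAndGet();
        return sequence + ":" + author + ":" + trimmed;
    }

    int handledCount() {
        return handled.get();
    }

    // Simulates concurrent requests hitting the single shared instance.
    static int simulate(int requests) {
        CommentHandler handler = new CommentHandler();
        Thread[] workers = new Thread[requests];
        for (int i = 0; i < requests; i++) {
            final int id = i;
            workers[i] = new Thread(() -> handler.handle("user" + id, " hello "));
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return handler.handledCount();
    }

    public static void main(String[] args) {
        System.out.println(simulate(8)); // prints 8
    }
}
```

Had handled been a plain int incremented with ++, some of the concurrent increments could be lost.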

As above, the servlet container will handle the threading of the requests for you, i.e. for each different user that connects to the server a new thread will be created without you knowing.
So all you have to do is ensure your jdbc code is thread safe and you should be fine. The database will do all of the necessary locking for you :-)
Karl

I'm not sure why you need to worry about this. The servlet container will handle the threading (say, via a threadpool). The database will handle multiple connections, so if you're not modifying shared state across different threads in the application, you shouldn't have to worry about this.
