I have a question about ThreadLocal usage in managing sessions. It is this:
With a ThreadLocal, only the thread that sets a value on the ThreadLocal can read or modify that value. But there might be many threads running to complete a single user request.
So what about all the other threads that participate in completing one user request?
They won't get access to modify the session object, because each thread that touches the ThreadLocal gets its own copy; the other threads that run as part of the same request might not be able to write their data to the session object they actually intended to update.
I mean, if thread-1 and thread-2 both participate in completing a user request, and thread-1 happens to be the one that sets the ThreadLocal while thread-2 executes other code as part of the request, then after thread-2 finishes it can't update the session data, because only thread-1 sees the value stored in the ThreadLocal.
So how is this issue resolved?
Do we make sure that only one thread participates in completing a single user's request? Or
how can we be sure that the thread that set the ThreadLocal is the only one that comes back to update the session data associated with that request?
Well first, real-time systems (RTS) have to be distinguished from high-performance systems. In an RTS, time is split into frames, and each piece of software has one or more frames allocated to it. This makes it very predictable which task the system is performing at a given time, and also which other tasks are running in parallel. The design itself thus helps avoid or limit concurrency management.
Second, high-performance systems (real-time or not) don't necessarily rely on multithreading, for the very same reason: concurrency management, and blocking structures in particular, are bottlenecks. And multithreading is not the only solution for parallel processing. Forks, grid computing, and so forth work well and offer more scalability, stability, robustness, etc.
So, single-threaded environments are generally based on an async design. You split your work (i.e. request processing) into small non-waiting/non-blocking steps. Anything that waits, blocks, or computes aggressively for a long time is sent "somewhere else" and simply notifies the request processor when complete. In this way one thread can service many requests during the time a single request is being processed.
In such cases (and some others), you can't use thread-local binding. However, you can use local scope to share a reference across your processing:
Map<String, String> context = new LinkedHashMap<>();
context.put("id", "42");
taskQueue.add(() -> context.put("state", "action1"));
taskQueue.add(() -> System.out.printf("%-10s > state=%s%n", context.get("id"), context.get("state")));
(assuming taskQueue is a Queue<Runnable>)
You can also save request context by:
generating a "Unique Request Identifier" (java.util.UUID is a suitable class for this)
using a request context storage (this can be a simple ConcurrentMap) or more advanced storage such as a key-value cache, key-value store, or document store
attaching the "Unique Request Identifier" to all request-bound actions
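The steps above can be sketched like this, using a plain ConcurrentMap as the storage (the class and method names are hypothetical):

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical sketch of a request-context store keyed by a unique request id.
// Any thread that knows the id can read or update the context, so it does not
// matter which thread created it.
public class RequestContextStore {
    private static final ConcurrentMap<String, Map<String, Object>> CONTEXTS =
            new ConcurrentHashMap<>();

    // Create a context for a new request and return its identifier.
    public static String open() {
        String requestId = UUID.randomUUID().toString();
        CONTEXTS.put(requestId, new ConcurrentHashMap<>());
        return requestId;
    }

    // Any thread working on the request can look up its context by id.
    public static Map<String, Object> context(String requestId) {
        return CONTEXTS.get(requestId);
    }

    // Remove the context once the response has been sent.
    public static void close(String requestId) {
        CONTEXTS.remove(requestId);
    }
}
```

The identifier, not the thread, is what binds the actions to the request, which is exactly what thread-local binding cannot give you in an async design.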
When working on an ASP.NET application, I discovered that placing something in the session cache, or really, accessing variables in the session cache, caused my Ajax queries to stop being asynchronous. I learned that this was because the session basically blocks - if I fire two Ajax requests from my browser at the same time, and the first one takes a bit to return, the session is locked in the first request until that request is completed, at which point my second Ajax request starts working.
In PHP I gather that there is an option to close the session for writing (and/or open it in a read-only way) so that session variable access is non-blocking and things stay asynchronous.
I'm building an application that will be Java, probably running on Tomcat (though I could change to some other container if I needed) and I am not able to find out whether Java has the same issue (session variable reads block) or has the same remedy (early close, read only mode). Has anyone encountered that issue before?
In Tomcat, HttpSession is implemented in org.apache.catalina.session.StandardSession (source here).
If you look at the source, you will see that calls to HttpSession.getAttribute(String) and HttpSession.setAttribute(String, Object) are pretty much channelled to a ConcurrentHashMap without any additional synchronization.
This means that these calls follow the contract of ConcurrentHashMap. Quoting its Javadoc:
retrieval operations do not entail locking, and there is not any support for locking the entire table in a way that prevents all access. <..> Retrieval operations (including get) generally do not block, so may overlap with update operations (including put and remove)
The table is internally partitioned to try to permit the indicated number of concurrent updates without contention. Because placement in hash tables is essentially random, the actual concurrency will vary.
It looks like the blocking takes place because of thread synchronization of access to the HttpSession, as described in this SO answer.
So the 2nd request should be blocked only while the 1st one is working with the HttpSession (or if you have some shared lock which is held for a long time by the 1st request, but that has nothing to do with Tomcat).
Since this synchronization is required by the Servlet spec, you shouldn't try to violate it. Instead, structure your app so it minimizes the time it needs to read from or write to the HttpSession.
Additionally, as I wrote above, blocking may occur if you have an additional lock which makes several requests execute sequentially. Try taking several thread dumps of Tomcat after you have sent the 2nd request, and see whether there is such a lock that the 2nd request is waiting for.
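A minimal sketch of "minimize the time spent in the session": read what you need in one short window, do the long work on local variables, and write back once at the end. A plain Map stands in for the HttpSession attributes here so the sketch is self-contained; with a real HttpSession you would call getAttribute/setAttribute instead. All names are hypothetical:

```java
import java.util.Map;

// Sketch: touch the session briefly at the edges, never during the slow work.
public class SessionAccess {
    public static String handle(Map<String, Object> sessionAttributes) {
        // Short read window: copy what we need out of the session.
        String userId = (String) sessionAttributes.get("userId");
        // Long-running work touches only local variables.
        String result = expensiveComputation(userId);
        // Short write window: store the result back once.
        sessionAttributes.put("lastResult", result);
        return result;
    }

    // Stand-in for whatever slow processing the request actually does.
    private static String expensiveComputation(String userId) {
        return "result-for-" + userId;
    }
}
```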
Is it possible, that a session-scoped backing bean is accessed by multiple threads at the same time?
The servlet spec says it is possible:
Multiple servlets executing request threads may have active access to the same session object at the same time. The container must ensure that manipulation of internal data structures representing the session attributes is performed in a thread safe manner. The Developer has the responsibility for thread safe access to the attribute objects themselves. This will protect the attribute collection inside the HttpSession object from concurrent access, eliminating the opportunity for an application to cause that collection to become corrupted.
However I could not make the server (JBoss) use different threads for the same session. When I opened multiple tabs and started a long running request in one tab, and then started a request in another tab, the second tab had to wait for a response until the action started in the first tab was completed.
I also verified this by blocking the thread with a breakpoint in the backing bean. It was not possible to do anything in other tabs of the same session until I resumed the thread.
Despite this we have some strange exceptions in the production log and so far the only possible explanation we have is, that multiple threads concurrently access the same session-scoped backing bean.
Yes, a Servlet session is thread safe. But if you are putting mutable objects into the session, the application has to take care of the synchronization itself.
In your case: if your bean is mutable, i.e. has state, then yes, it has to be thread safe.
As for your test case, it depends on the browser you are using. Most browsers support up to 6 parallel connections per server, but I'm not sure whether they use parallel connections for requests that carry the same cookies.
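As a sketch of what "the application should take care of the synchronization" can look like for a mutable session attribute, assuming a simple counter object stored in the session (the class name is hypothetical):

```java
// The container keeps the attribute *collection* thread safe, but a mutable
// attribute object needs its own synchronization. Synchronizing on the object
// itself ensures concurrent requests in the same session don't lose updates.
public class VisitCounter {
    private int count;

    public synchronized int increment() {
        return ++count;
    }

    public synchronized int current() {
        return count;
    }
}
```

A plain `private int count` with unsynchronized increments could lose updates when two requests in the same session run on different threads.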
We are working on an application container that uses reference counting as the mechanism to keep track of the requests received and responses sent. The reference count is used in allowing the graceful shutdown of the container, i.e. if (refCount == 0) shutdown;
The reference count is incremented for every Request and also for a pending Response. The reference count is decremented only once an application has accepted a request and also only once the application has sent a valid response.
So here's my question, Is reference counting a good design decision in this scenario, say as compared to keeping a RequestContext which is only closed when the Application/Container has sent a response?
Since the software is implemented in Java I was looking at other options in Java and came across this article, http://weblogs.java.net/blog/2006/05/04/understanding-weak-references which made me think that trying to leverage the ReferenceQueue could be another approach for doing this.
It is actually a really neat way of doing it. You could additionally use a ThreadLocal (if your request-response pipeline is going to be handled by a single thread).
Basically, when you receive the request, set a ThreadLocal holding a WeakReference to your request object (or some attribute of the request, such as the user id). Then you can get() the object anywhere within your processing pipeline.
If you are using a thread pool of workers to handle requests, make sure you clear the WeakReference from your thread's ThreadLocal so that no more references to that object exist. If you are spawning a new thread for each request, you don't even need to do that: when the thread dies, the object will automatically be enqueued on the ReferenceQueue (since no live reference points to it any more).
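The pattern above might be sketched like this (class and method names are hypothetical; note the remove() call, which is the part that matters with pooled worker threads):

```java
import java.lang.ref.WeakReference;

// Hypothetical per-thread request holder backed by a WeakReference.
public class RequestHolder {
    private static final ThreadLocal<WeakReference<Object>> CURRENT =
            new ThreadLocal<>();

    // Call at the start of request handling.
    public static void set(Object request) {
        CURRENT.set(new WeakReference<>(request));
    }

    // Usable anywhere in the pipeline on the same thread;
    // may return null if the request object has been collected.
    public static Object get() {
        WeakReference<Object> ref = CURRENT.get();
        return ref == null ? null : ref.get();
    }

    // Essential with a thread pool: clear before returning the thread,
    // so no stale reference leaks into the next request.
    public static void clear() {
        CURRENT.remove();
    }
}
```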
Note that you will pay for that counter with a performance hit. Effectively you are requiring a memory barrier for each and every request, if and only if you are using concurrent threads to service the requests. (A memory barrier instruction can cost on the order of ~200 instructions.)
Per your question, it further seems you don't even want a counter but rather a binary flag that indicates whether there are any active requests, e.g. a requestsInProgress flag. The idea being that you 'gracefully shut down' when the flag value is false.
If your container primarily exposes network endpoints, e.g. REST/HTTP, then I strongly suggest you consider NIO and employ a single-threaded dispatch mechanism to linearize the request/response at the container periphery. (You can queue these and fan out to N processing threads using the concurrent queues in java.util.concurrent.)
[NIO subsystem] <-{poll}-[Selector(accept/read)/dispatch thread] => [Q:producer/consumer pattern 1:N]
[NIO subsystem] <-{poll}-[Selector(write)/responder thread] <= [Q:producer/consumer N:1]
Benefit?
If you use the same thread for dispatch and responder then no memory barriers are involved -- the thread will be pinned to a core and your flag will be exclusive to its cache line:
e.g.
after dispatch queues request:
increment req_in_progress
after responder dequeues response:
decrement req_in_progress
There will still be a need for shared-memory synchronization on shutdown, but that is far better than incurring that cost per each and every request, as you only pay for it when you actually need it.
If performance is not an issue at all, then why not just use an AtomicInteger for the counter and put it in the global context?
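That simpler AtomicInteger variant could look like this (names are hypothetical): increment when a request arrives, decrement once its response has been sent, and allow shutdown only when the count has dropped to zero.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Minimal sketch of the AtomicInteger approach for tracking in-flight requests.
public class InFlightRequests {
    private final AtomicInteger inFlight = new AtomicInteger();

    public void requestReceived() {
        inFlight.incrementAndGet();
    }

    public void responseSent() {
        inFlight.decrementAndGet();
    }

    // Graceful shutdown is allowed only when nothing is in flight.
    public boolean canShutDown() {
        return inFlight.get() == 0;
    }
}
```

An instance of this placed in the global context gives every worker thread a safe, lock-free counter at the cost of the per-request memory barrier discussed above.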
Our portlets keep state in the HttpSession, which is shared by all request processing threads for the same session.
The portlet spec (JSR-168) writes:
PLT.5.2.4.3 Multithreading Issues During Request Handling
The portlet container handles concurrent requests to the same portlet by concurrent execution of the request handling methods on different threads. Portlet developers must design their portlets to handle concurrent execution from multiple threads from within the processAction and render methods at any particular time.
I wonder how I am supposed to achieve that? Sure, I can use synchronization to achieve mutual exclusion during both processAction and render, but I don't see how I can enforce atomicity of request processing as a whole. In particular, I worry about the following scenario:
Thread 1 executes processAction, loading data into the session for later rendering
Thread 2 executes processAction, discarding that data from the session
Thread 1 executes render, reading the data to render from the session, and throws a NullPointerException because the prepared data is no longer there ...
How is that scenario usually prevented? In particular, when using the JBoss portlet bridge to adapt JSF to a Portlet environment?
I'd say that if there are two portlets operating on the same data, especially one reading it while the other deletes it, there's most likely a serious flaw in the design.
You might then want to store the data per portlet/thread, i.e. if portlet1 reads some data, you should write-lock it until reading is finished, and put it into the session under a unique key.
If it is legal to delete data that should be rendered, then you should account for that and check again during render.
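"Check again during render" can be as simple as treating missing session data as a normal case instead of assuming the data prepared in processAction survived. A Map stands in for the portlet session attributes so the sketch is self-contained; all names are hypothetical:

```java
import java.util.Map;

// Sketch of a defensive render step: never assume processAction's data
// is still present, since a concurrent request may have discarded it.
public class DefensiveRenderer {
    public static String render(Map<String, Object> sessionAttributes) {
        Object prepared = sessionAttributes.get("preparedData");
        if (prepared == null) {
            // Data was removed by a concurrent request:
            // recover gracefully instead of throwing a NullPointerException.
            return "data-unavailable";
        }
        return "rendered:" + prepared;
    }
}
```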
I have a web service that has ~1k request threads running simultaneously on average. These threads access data from a cache (currently on ehcache.) When the entries in the cache expire, the thread that hits the expired entry tries getting the new value from the DB, while the other threads also trying to hit this entry block, i.e. I use the BlockingEhCache decorator. Instead of having the other threads waiting on the "fetching thread," I would like the other threads to use the "stale" value corresponding to the "missed" key. Is there any 3rd party developed ehcache decorators for this purpose? Do you know of any other caching solutions that have this behavior? Other suggestions?
I don't know EHCache good enough to give specific recommendations for it to solve your problem, so I'll outline what I would do, without EHCache.
Let's assume all the threads are accessing this cache through a service interface, called FooService, with a service bean called SimpleFooService. The service will have the methods required to get the needed (cached) data. This way you're hiding the fact that it's cached from the frontend (HTTP request objects).
Instead of simply storing the data to be cached in a property of the service, we'll make a special object for it. Let's call it FooCacheManager. It will store the cache in a property (let's say it's of type Map) and have getters to access it. It will also have a special method called reload(), which loads the data from the DB (by calling a service method, or through the DAO) and replaces the content of the cache.
The trick here is as follows:
Declare the cache property in FooCacheManager as an AtomicReference (introduced in Java 1.5). This guarantees thread safety when you read it and also when you assign to it: your read/write actions will never collide or observe a half-written value.
The reload() will first load the data into a temporary map, and then, when it's finished, assign the new map to the property saved in FooCacheManager. Since the property is an AtomicReference, the assignment is atomic, so it's basically swapping the map in an instant without any need for locking.
TTL implementation - Have FooCacheManager implement the Quartz Job interface, making it effectively a Quartz job. In the job's execute method, have it run reload(). In the Spring XML, define this job to run every xx minutes (your TTL), which can also be defined in a property file if you use PropertyPlaceholderConfigurer.
This method is effective since the reading threads:
Don't block on reads
Don't call isExpired() on every read (which at your load would be ~1k calls per second)
Also, the writing thread doesn't block while writing the data.
If this wasn't clear, I can add example code.
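Taking up that offer, here is a minimal sketch of the idea (names and the loadFromDb() stub are hypothetical; the Quartz/Spring wiring is omitted):

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

// Readers always see a complete, fully-built map; reload() builds a fresh
// map off to the side and swaps it in with a single atomic assignment.
public class FooCacheManager {
    private final AtomicReference<Map<String, String>> cache =
            new AtomicReference<>(Collections.emptyMap());

    // Readers never block and never see a half-written map.
    public String get(String key) {
        return cache.get().get(key);
    }

    // Called by the scheduled job every TTL interval.
    public void reload() {
        Map<String, String> fresh = loadFromDb(); // build fully off to the side
        cache.set(fresh);                          // then swap atomically
    }

    // Stand-in for the service/DAO call that loads the real data.
    protected Map<String, String> loadFromDb() {
        Map<String, String> data = new LinkedHashMap<>();
        data.put("answer", "42");
        return Collections.unmodifiableMap(data);
    }
}
```

Readers that grabbed the old map before the swap simply keep reading consistent (if slightly stale) data until their next get(), which is exactly the "serve stale rather than block" behavior the question asks for.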
Since ehcache removes stale data, a different approach is to refresh data with a probability that increases as the expiration time approaches, and is 0 while expiration is still "sufficiently" far away.
So if thread 1 needs some data element, it might refresh it even though the data is not old yet.
In the meantime, if thread 2 needs the same data, it can use the existing value (while the refresh has not finished yet). It is possible thread 2 will try to do a refresh too.
If you are working with references (the updater thread loads the object and then simply swaps the reference in the cache), then no separate synchronization is required for the get and set operations on the cache.
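The probability curve can be anything that ramps from 0 to 1 as expiry approaches; here is a sketch using a linear ramp over the last part of the TTL (the class name and the window parameter are arbitrary choices for illustration):

```java
import java.util.concurrent.ThreadLocalRandom;

// Sketch of probabilistic early refresh: the closer an entry is to expiring,
// the more likely a reader is to refresh it ahead of time.
public class EarlyRefreshPolicy {
    private final long ttlMillis;
    private final long refreshWindowMillis;

    public EarlyRefreshPolicy(long ttlMillis, long refreshWindowMillis) {
        this.ttlMillis = ttlMillis;
        this.refreshWindowMillis = refreshWindowMillis;
    }

    // Probability of refreshing an entry loaded ageMillis ago:
    // 0 until the refresh window starts, ramping linearly to 1 at expiry.
    public double refreshProbability(long ageMillis) {
        long remaining = ttlMillis - ageMillis;
        if (remaining >= refreshWindowMillis) {
            return 0.0;
        }
        if (remaining <= 0) {
            return 1.0;
        }
        return 1.0 - (double) remaining / refreshWindowMillis;
    }

    public boolean shouldRefresh(long ageMillis) {
        return ThreadLocalRandom.current().nextDouble() < refreshProbability(ageMillis);
    }
}
```

Spreading the refreshes out probabilistically like this avoids the stampede where every thread hits the expired entry at the same instant.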