is reference counting good design - java

We are working on an application container that uses reference counting as the mechanism to keep track of the requests received and responses sent. The reference count is used to allow graceful shutdown of the container, i.e. if (refCount == 0) shutdown;
The reference count is incremented for every Request and also for a pending Response. The reference count is decremented only once an application has accepted a request, and again only once the application has sent a valid response.
So here's my question: is reference counting a good design decision in this scenario, say compared to keeping a RequestContext that is only closed when the application/container has sent a response?
Since the software is implemented in Java, I was looking at other options and came across this article, http://weblogs.java.net/blog/2006/05/04/understanding-weak-references, which made me think that leveraging a ReferenceQueue could be another approach for doing this.

It is actually a really neat way of doing it. You should additionally use a ThreadLocal (if your request-response pipeline is going to be handled by a single thread).
Basically, when you receive the request, set a ThreadLocal holding a WeakReference to your request object (or some attribute of the request, such as the user id). Then you can get() the object anywhere within your processing pipeline.
If you are using a thread pool of workers to handle the requests, make sure you remove the WeakReference from your thread's ThreadLocal so that no more references to that object exist. If you are spawning a new thread for each request, you don't even need to do that: when the thread dies, the object will automatically be enqueued on the ReferenceQueue (since no live reference points to it).
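A minimal sketch of that idea, assuming one worker thread per request and using Object as a stand-in for the container's request type:
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;

final class RequestTracker {
    // the GC enqueues a reference here once the request object becomes unreachable
    static final ReferenceQueue<Object> QUEUE = new ReferenceQueue<>();
    // per-thread handle to the request currently being processed
    static final ThreadLocal<WeakReference<Object>> CURRENT = new ThreadLocal<>();

    static void begin(Object request) {
        CURRENT.set(new WeakReference<>(request, QUEUE));
    }

    static Object current() {
        WeakReference<Object> ref = CURRENT.get();
        return ref == null ? null : ref.get();   // may be null if already collected
    }

    static void end() {
        CURRENT.remove();   // important when the thread comes from a pool
    }
}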

Note that you will pay for that counter with a performance hit. Effectively you are requiring a memory barrier for each and every request if (and only if) you are using concurrent threads to service the requests. (A memory barrier instruction typically costs as much as executing ~200 ordinary instructions.)
Per your question, it further seems you don't even want a counter but rather a binary flag that indicates if there are any active requests e.g. a requestsInProgress flag. The idea being that you 'gracefully shutdown' when the flag value is false.
If your container primarily exposes network endpoints e.g. REST/HTTP, then I strongly suggest you consider NIO and employ a single-threaded dispatch mechanism to linearize the request/response traffic at the container periphery. (You can queue these and fan out to N processing threads using concurrent queues from java.util.concurrent.)
[NIO subsystem] <-{poll}-[Selector(accept/read)/dispatch thread] => [Q:producer/consumer pattern 1:N]
[NIO subsystem] <-{poll}-[Selector(write)/responder thread] <= [Q:producer/consumer N:1]
Benefit?
If you use the same thread for dispatch and responder then no memory barriers are involved -- the thread will be pinned to a core and your flag will be exclusive to its cache line:
e.g.
after dispatch queues request:
increment req_in_progress
after responder dequeues response:
decrement req_in_progress
There will still be a need for shared-memory synchronization on shutdown, but that is far better than incurring the cost on each and every request, as you only pay for it when you actually need it.
If performance is not an issue at all, then why not just use an AtomicInteger for the counter and put it in the global context?
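A minimal sketch of that simpler counter, with hypothetical hook methods the container would call around each request:
import java.util.concurrent.atomic.AtomicInteger;

final class InFlightRequests {
    private final AtomicInteger count = new AtomicInteger();

    void onRequestAccepted() { count.incrementAndGet(); }   // container calls this per accepted request
    void onResponseSent()    { count.decrementAndGet(); }   // container calls this per valid response

    // called after new requests have been rejected; waits for in-flight work to drain
    void awaitQuiescence() throws InterruptedException {
        while (count.get() > 0) {
            Thread.sleep(10);
        }
    }
}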

Related

ThreadLocal usage for managing session data

I have a doubt about ThreadLocal usage in managing sessions. It is:
With a ThreadLocal, whichever thread creates the thread-local object has access to that session object, and only that thread can modify the session object's data. But there might be many threads running to complete a single user request.
So what about all the other threads that participate in completing one user request?
They won't get access to modify the session object, because whichever thread creates the ThreadLocal object gets its own session object, so the other threads that run to complete a single user request might not be able to write their data to the session object they actually intended to.
I mean, if thread-1 and thread-2 participate in completing a user request, and thread-1 happens to create the ThreadLocal object while thread-2 executes other code as part of the request, then after thread-2 completes it can't update the session data, because only thread-1 has access to the session, since it created the ThreadLocal object.
So how is this issue resolved?
Are we making sure that only one thread participates in completing a single user's request? Or
how can we be sure that whichever thread creates the ThreadLocal object is the only one that updates the session data associated with that request?
Well, first, RTS (real-time systems) have to be distinguished from high-performance systems. In an RTS, time is split into frames and each piece of software has one or more frames allocated to it. That makes it very predictable which task the system is performing at a given time and also which other tasks are running in parallel. The design thus helps avoid or limit concurrency management.
Second, high-performance systems (whether RTS or not) don't necessarily rely on multithreading, for the exact same reason as RTS: concurrency management, and blocking structures more precisely, are bottlenecks. And multithreading is not the only solution for parallel processing. Forks, grid computing and so forth work great and offer more scalability, stability, robustness, etc.
So, single-threaded environments are generally based on an async design. You split your work (i.e. request processing) into small non-waiting/non-blocking pieces. All waiting/blocking work (or even aggressively long computation) is sent "somewhere" and just notifies the request processor when complete. In this case one thread can handle many requests while a single request is being processed (not sure I'm being clear at this point...).
In such (and some other) cases, you can't use thread-local binding. However, you can use "local scope" to share a reference across your processing:
Map<String, String> context = new LinkedHashMap<>();
context.put("id", "42");
taskQueue.add(() -> context.put("state", "action1"));
taskQueue.add(() -> System.out.printf("%-10s > state=%s%n", context.get("id"), context.get("state")));
(assuming taskQueue is a Queue<Runnable> that your request processor drains)
You can also save request context (see the sketch after this list) by:
generating a "Unique Request Identifier" (java.util.UUID is a suitable class)
using a request context storage (this can be a simple ConcurrentMap) or more advanced storage such as a key-value cache, key-value store or document store
attaching the "Unique Request Identifier" to all request-bound actions

Sending objects back and forth between threads in java?

I have multiple client handler threads; these threads need to pass a received object to a server queue, and the server queue will pass another type of object back to the sending thread. The server queue is started and keeps running when the server starts. I am not sure which threading mechanism to use so that a client handler thread is notified when an object is sent back. I don't intend to use sockets or write to a file.
If you want to do actual message passing, take a look at SynchronousQueue. Each thread will have a reference to the queue and will wait until another thread passes a reference through the queue.
This would be thread-safe and address your requirements.
Though if you are simply looking to have threads read and write a shared variable, you can use normalocity's suggestion, though its thread-safety depends on how you access it (via synchronized or volatile).
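A minimal sketch of that kind of hand-off between two threads, using Strings as stand-ins for the request and response objects:
import java.util.concurrent.SynchronousQueue;

public class HandOffExample {
    public static void main(String[] args) throws InterruptedException {
        SynchronousQueue<String> requests  = new SynchronousQueue<>();
        SynchronousQueue<String> responses = new SynchronousQueue<>();

        // the "server" thread: takes a request, hands back a response
        Thread server = new Thread(() -> {
            try {
                String request = requests.take();        // blocks until a handler offers a request
                responses.put("processed: " + request);  // blocks until the handler takes the result
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        server.start();

        // the client-handler side: hand the request over, then wait for the result
        requests.put("hello");
        System.out.println(responses.take());
    }
}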
As far as making objects accessible in Java goes, there's no difference between multi-threaded and single-threaded code. You just follow the scope rules (public, private, protected), and that's it. Multiple threads all run within the same process, so there aren't any special thread-only scope rules to know about.
For example, define a method where you pass the object in, and make that method accessible from the other thread. The object you want to pass around simply needs to be accessible from the other thread's scope.
As far as thread-safety, you can synchronize your writes, and for the most part, that will take care of things. Thread safety can get a bit hairy the more complicated your code, but I think this will get you started.
One method for processing objects, and producing result objects is to have a shared array or LinkedList that acts as a queue of objects, containing the objects to be processed, and the resulting objects from that processing. It's hard to go into much more detail than that without more specifics on what exactly you're trying to do, but most shared access to objects between threads comes down to either inter-thread method calls, or some shared collection/queue of objects.
Unless you are absolutely certain that it will always be only a single object at a time, use some sort of Queue.
If you are certain that it will always be only a single object at a time, use some sort of Queue anyway. :-)
Use a concurrent queue from java.util.concurrent.
why? Almost guaranteed to provide better general performance than anything hand-rolled.
recommendation: use a bounded queue and you will get back-pressure for free.
note: the depth of queue determines your general latency characteristics: shallower queues will have lower latencies at the cost of reduced bandwidth.
Use Future semantics
why? Futures provide a proven and standard means of getting an asynchronous result.
recommendation: create a simple Request class and expose a method #getFutureResponse() (see the sketch after these notes). The implementation of this method can use a variety of signaling strategies, such as a Lock, a flag (using Atomic/CAS), etc.
note: use of timeout semantics in Future will allow you to link server behavior to your server SLA e.g. #getFutureResponse(sla_timeout_ms).
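A minimal sketch of both recommendations combined, assuming the response is delivered through a CompletableFuture and a bounded queue sits between the client-handler threads and the server (class and method names are illustrative):
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Future;

final class Request {
    final String payload;
    private final CompletableFuture<String> response = new CompletableFuture<>();

    Request(String payload) { this.payload = payload; }

    Future<String> getFutureResponse() { return response; }           // handler thread waits on this
    void complete(String result)       { response.complete(result); } // server thread calls this
}

class Server implements Runnable {
    // bounded queue: a full queue gives you back-pressure on the producers for free
    static final BlockingQueue<Request> QUEUE = new ArrayBlockingQueue<>(64);

    @Override public void run() {
        try {
            while (!Thread.currentThread().isInterrupted()) {
                Request request = QUEUE.take();
                request.complete("processed: " + request.payload);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
A client-handler thread would then call Server.QUEUE.put(request) and block on request.getFutureResponse().get(slaTimeoutMs, TimeUnit.MILLISECONDS) to tie its wait to the SLA.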
A book tip for if you want to dive a bit more into communication between threads (or processes, or systems): Pattern-Oriented Software Architecture Volume 2: Patterns for Concurrent and Networked Objects
Just use simple dependency injection.
class MyFirstThread extends Thread {
    public void setData(Object o) { /* ... */ }
}
class MySecondThread extends Thread {
    MyFirstThread callback;
    MySecondThread(MyFirstThread callback) { this.callback = callback; }
}
MyFirstThread t1 = new MyFirstThread();
MySecondThread t2 = new MySecondThread(t1);
t1.start();
t2.start();
You can now do callback.setData(...) in your second thread.
I find this to be the safest way. Other solutions involve using volatile or some kind of shared object, which I think is overkill.
You may also want to use BlockingQueue and pass both of those to each thread. If you plan to have more than one thread then it is probably a better solution.

Should volatile be used for attributes of domain model classes in Java web apps?

Here's my thinking:
Even though an HTTP request cycle is essentially handled by a 'single thread', each time an HTTP request is processed for that same session it is likely to be processed by a different thread from the thread pool.
If the volatile keyword is not used on a domain model object whose lifecycle extends across multiple HTTP requests for the same session, then, according to my understanding, isn't it possible that the attribute could be cached thread-locally (an optimization by the compiler) in the thread that serviced the first HTTP request? If the second HTTP request is serviced by another thread, that second thread may not see the changes to that attribute made by the first thread.
Does this spell "Danger Will Robinson"? Or am I missing a vital plot point about the use (or not) of the volatile keyword?
I think you are forgetting that the threads handling the HTTP request first need to retrieve the instance of the domain model object from the HttpSession provided by your application server. The thread handling request 2 in the scenario you describe does not already have an instance of this domain model - it has to retrieve it from the session implementation at the start of handling each and every request.
I think it is completely reasonable to assume that the session-handling implementation in your application server is handling session data in such a way that memory model visibility issues are avoided. Apache Tomcat's default (non-clustered) HttpSession implementation, for example, stores the session attributes in a ConcurrentHashMap.
Adding volatile seems completely unnecessary to me. I have never seen this done for domain model objects handled by HTTP requests in a Servlet environment in any project I have worked in.
This would be a different story if thread-1 and thread-2 had references to the same object instance simultaneously while processing two different requests, and you were concerned about changes made in one thread being visible to the other while each is processing its request, but this does not sound like what you are asking about.
Yes, if you are sharing an object between different threads, you may have race conditions. Without a happens-before relationship, writes made by one thread may not be seen by a read in another thread.
Doing a volatile write in one thread and a volatile read of the same field in another thread establishes a happens-before relationship between the two threads, and ensures visibility of the write.
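A minimal sketch of that guarantee, where a volatile flag publishes a plain write made by one thread to another:
class Publication {
    private int data;                 // plain field
    private volatile boolean ready;   // volatile flag establishing happens-before

    void writer() {          // runs on thread A
        data = 42;           // 1. plain write
        ready = true;        // 2. volatile write publishes everything written before it
    }

    void reader() {          // runs on thread B
        if (ready) {                  // volatile read
            System.out.println(data); // guaranteed to see 42
        }
    }
}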
This is a complicated problem, simply using a volatile keyword is probably not a good solution.
I think your understanding of it is correct. Given your description I would say it should be used. If it's something more than a primitive type I would rather synchronize.
Good information on volatile:
http://www.javamex.com/tutorials/synchronization_volatile_when.shtml
If you have a mutable object in session, that is trouble. But usually the solution is not to guard individual fields; rather the entire object should be swapped.
Say you have the user object in the session. Most requests simply retrieve it, read it and display it.
There is a request that can modify user information. It would be a really bad idea to retrieve the user object and modify it in place. It's better to create a completely new user object and insert it into the session.
In that case, the fields in User don't need any protection; thread safety is guaranteed by the session's setAttribute()/getAttribute().
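A minimal sketch of that swap, assuming an immutable-style User with a copy method (class and method names are illustrative):
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpSession;

final class User {
    final String name;
    final String email;
    User(String name, String email) { this.name = name; this.email = email; }
    User withEmail(String newEmail) { return new User(name, newEmail); } // copy, don't mutate
}

class ProfileUpdate {
    void changeEmail(HttpServletRequest request, String newEmail) {
        HttpSession session = request.getSession();
        User current = (User) session.getAttribute("user");
        User updated = current.withEmail(newEmail);
        session.setAttribute("user", updated);   // swap the whole object; no field-level guarding needed
    }
}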
If you have concurrency issues, just adding 'volatile' probably won't help you.
As for keeping the object as an attribute of the Session, I'd recommend keeping just the object's ID and using it to retrieve a 'live' instance whenever you need it (if you use Hibernate, successive retrievals will return the same object, so this shouldn't cause performance problems). Encapsulate all modification logic for this specific object into a single façade, and control concurrency there, using database locking.
Or, if you really, really, really want to use memory-based locking, and are really sure that you'll never have two instances of the application running in a cluster, make sure that your façade logic is synchronized at the right level. If your synchronization is too fine-grained (low-level operations, such as volatile variables), it probably won't be enough to make your code thread-safe. For example, java.util.Hashtable is fully synchronized, but that doesn't mean anything if you have logic like this:
01 if (!hashtable.containsKey(key)) {
02 hashtable.put(key, calculate(key));
03 }
If two threads, say, t1 and t2, hit this block at the same time, t1 may execute line 01, then t2 may also execute 01, and then 02, and t1 then will execute 02, overwriting what t2 had done. The operations containsKey() and put() are atomic individually, but what should be atomic is the whole block.
Sometimes recalculating a value doesn't matter, but sometimes it does, and it will break.
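A minimal sketch of making that whole check-then-put block atomic, using ConcurrentHashMap's computeIfAbsent (the class and the stand-in calculate() are illustrative):
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class Cache {
    private final ConcurrentMap<String, String> cache = new ConcurrentHashMap<>();

    String get(String key) {
        // the check and the computation happen as one atomic operation,
        // so calculate(key) wins exactly once even when two threads race on the same key
        return cache.computeIfAbsent(key, this::calculate);
    }

    private String calculate(String key) { return key.toUpperCase(); } // stand-in for the expensive computation
}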
When it comes to concurrency, there's no magic. I mean, it seems some crappy frameworks try to sell you the idea that they solve this problem for you. They don't. Even if it works 99% of the time, it will break spectacularly when you go to production and start to get heavy traffic. Or (much, much) worse, it will silently generate wrong results.
Concurrency is one of the most complex problems in programming. And the only way to handle it is to avoid it. The whole functional programming trend is not about dealing with concurrency; it's about avoiding it altogether.
It turns out that volatile was not needed in the end. The problem that "appeared" to be fixed with volatile was actually a very subtle timing sensitive bug that was fixed in a much more elegant and proper way ;)
So sbrigdes was correct when he said "simply using a volatile keyword is probably not a good solution."

Using a JMS Session from different threads

From the javadoc for Session it states:
A Session object is a single-threaded context for producing and consuming messages.
So I understand that you shouldn't use a Session object from two different threads at the same time. What I'm unclear on is whether you could use the Session object (or children such as a Queue) from a different thread than the one that created it.
In the case I'm working on, I'm considering putting my Session objects into a pool of available sessions that any thread could borrow from, use, and return to the pool when it is finished with it.
Is this kosher?
(Using ActiveMQ BTW, if that impacts the answer at all.)
I think the footnote from section 4.4 in the JMS 1.1 spec sheds some light:
There are no restrictions on the number of threads that can use a Session object or those it creates. The restriction is that the resources of a Session should not be used concurrently by multiple threads. It is up to the user to insure that this concurrency restriction is met. The simplest way to do this is to use one thread. In the case of asynchronous delivery, use one thread for setup in stopped mode and then start asynchronous delivery. In more complex cases the user must provide explicit synchronization.
By my reading of the spec what you want to do is OK, provided you correctly manage concurrency.
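For what it's worth, a minimal sketch of such a pool under that reading of the spec, assuming each borrower has exclusive use of the Session until it hands it back (class and method names are illustrative):
import javax.jms.Connection;
import javax.jms.JMSException;
import javax.jms.Session;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

class SessionPool {
    private final BlockingQueue<Session> pool;

    SessionPool(Connection connection, int size) throws JMSException {
        pool = new ArrayBlockingQueue<>(size);
        for (int i = 0; i < size; i++) {
            pool.add(connection.createSession(false, Session.AUTO_ACKNOWLEDGE));
        }
    }

    Session borrow() throws InterruptedException { return pool.take(); } // blocks until a session is free
    void giveBack(Session session)               { pool.offer(session); } // caller must not touch it afterwards
}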
Sadly the JMS docs are often not written as clearly or precisely as we might like :o(
But reading the spec I'm now pretty convinced you really shouldn't access the session from other threads, even if you guarantee there's no concurrent access. The bit of the javadoc that swung it for me was:
Once a connection has been started, any session with a registered message listener(s) is dedicated to the thread of control that delivers messages to it. It is erroneous for client code to use this session or any of its constituent objects from another thread of control. The only exception to this is the use of the session or connection close method.
Note the clear use of 'thread of control' and the singling out of 'close()' as the only exception.
They seem to be saying that even if you're using asynchronous message consumption (i.e. setMessageListener) - which means you get called back on another thread created by JMS to receive messages - you're never allowed to touch the session or related objects again from any other thread, because the session is now 'dedicated' to the JMS delivery thread. For example, I assume this means you couldn't even call message.acknowledge() from another thread.
Having said that, I only just noticed that we haven't been obeying this constraint, and have yet to notice any ill effects (using SonicMQ). But of course if you don't obey the standard, all bets are off, so I guess we need to obey the 1-thread 1-session rule to stay safe.

threadlocal variables in a servlet

Are ThreadLocal variables global to all the requests made to the servlet that owns the variables?
I am using Resin for the server.
Thanks for the answer.
I think I can make myself more clear.
The specific case:
I want to:
initialize a static variable when the request starts executing.
be able to query the value of that variable, in a thread-safe way, from methods called later on behalf of the servlet, until the request finishes executing.
Short answer: Yes.
A bit longer one: This is how Spring does its magic. See RequestContextHolder (via DocJar).
Caution is needed though - you have to know when to invalidate the ThreadLocal, how to defer to other threads and how (not) to get tangled with a non-threadlocal context.
Or you could just use Spring...
I think they are global to all requests made with that specific thread only. Other threads get other copies of the thread-local data. This is the key point of thread-local storage:
http://en.wikipedia.org/wiki/Thread-local_storage#Java.
Unless you check the appropriate option in the servlet's config, the servlet container will use your servlet from multiple threads to handle requests in parallel. So effectively you would have separate data for each thread that is serving clients.
If your WebApplication isn't distributed (runs on multiple Java Virtual Machines), you can use the ServletContext object to store shared data across requests and threads (be sure to do proper locking then).
Like Adiel says, the proper way to do this is probably to use the request context (i.e. HttpServletRequest), not to create a ThreadLocal. While it's certainly possible to use a ThreadLocal here, you have to be careful to clean up your thread if you do that, since otherwise the next request that gets the thread will see the value associated with the previous request. (When the first request is done with the thread, the thread will go back into the pool and so the next request will see it.) No reason to have to manage that kind of thing when the request context exists for precisely this purpose.
Using ThreadLocal to store request-scoped information has the potential to break if you use Servlet 3.0 suspendable requests (or Jetty Continuations).
With those APIs, multiple threads may process a single request.
ThreadLocal variables are always defined to be accessed globally, since the point is to transparently pass around information that can be accessed anywhere in the system. The value of the variable is bound to the thread on which it is set, so even though the variable is global, it can have different values depending on the thread from which it is accessed.
A simple example would be to assign a user identity string to a thread in a thread local variable when the request is received in the servlet. Anywhere along the processing chain of that request (assuming it is on the same thread in the same VM), the identity can be retrieved by accessing this global variable. It would also be important to remove this value when the request is processed, since the thread will be put back in a thread pool.
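A minimal sketch of that pattern as a servlet filter (the UserContext holder and the use of getRemoteUser() as the identity are assumptions for illustration):
import java.io.IOException;
import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.http.HttpServletRequest;

final class UserContext {
    private static final ThreadLocal<String> CURRENT_USER = new ThreadLocal<>();
    static void set(String userId) { CURRENT_USER.set(userId); }
    static String get()            { return CURRENT_USER.get(); }
    static void clear()            { CURRENT_USER.remove(); }
}

class UserContextFilter implements Filter {
    @Override public void init(FilterConfig config) { }
    @Override public void destroy() { }

    @Override
    public void doFilter(ServletRequest req, ServletResponse res, FilterChain chain)
            throws IOException, ServletException {
        UserContext.set(((HttpServletRequest) req).getRemoteUser());
        try {
            chain.doFilter(req, res);   // anything downstream on this thread can call UserContext.get()
        } finally {
            UserContext.clear();        // critical: this thread goes back into the container's pool
        }
    }
}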
