Apologies for the long question.
I'm fairly new to Spring and don't understand the inner working fully yet.
So, my current Java project has Spring 4.x code written way back in 2015 that uses a ThreadLocal variable to store some user permission data.
The flow starts as a REST call in a REST controller which then calls the backend code and checks for user permissions from the DB.
There is a @Repository class that has a static instance of ThreadLocal where this user permission is stored. The ThreadLocal variable is updated by the calling thread.
So, if the thread finds data in the ThreadLocal instance already present for it, it just reads that data from the ThreadLocal variable and works away. If not, it goes to DB tables and fetches new permission data and also updates the ThreadLocal variable.
So my understanding is that ThreadLocal variable was used as these user permissions are needed multiple times within the same REST Call. So the idea was for a given REST request since the thread is the same, it needn't fetch user permissions from DB and instead can refer to its entry in the ThreadLocal variable within the same REST request.
Now, this seems to work fine in Spring 4.3.29.RELEASE as every REST call was being serviced by a different thread. (I printed Thread IDs to confirm.)
Spring 4.x ThreadStack up to Controller method call:
com.xxx.myRESTController.getDoc(MyRESTController.java),
org.springframework.web.context.request.async.WebAsyncManager$5.run(WebAsyncManager.java:332),
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511),
java.util.concurrent.FutureTask.run(FutureTask.java:266),
java.lang.Thread.run(Thread.java:748)]
However, when I upgraded to Spring 5.2.15.RELEASE this breaks when calling different REST endpoints that try to fetch user permissions from the backend.
On printing the Stacktrace in the backend, I see there is a ThreadPoolExecutor being used in Spring 5.x.
Spring 5.x ThreadStack:
com.xxx.myRESTController.getDoc(MyRESTController.java),
org.springframework.web.context.request.async.WebAsyncManager.lambda$startCallableProcessing$4(WebAsyncManager.java:337),
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511),
java.util.concurrent.FutureTask.run(FutureTask.java:266),
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149),
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624),
java.lang.Thread.run(Thread.java:748)]
So in Spring 5.x, it looks like the same thread is being put back in the ThreadPool and later gets called for multiple different REST calls.
When this thread looks up the ThreadLocal instance, it finds stale data stored by it for an earlier unrelated REST call. So quite a few of my test cases fail due to stale data permissions being read by it.
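To make the symptom concrete, here is a minimal sketch in plain Java (no Spring) that reproduces it. The class and method names are made up for illustration, and the string concatenation is only a placeholder for the real DB lookup.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical stand-in for the repository described above.
class StalePermissionDemo {
    // Static per-thread cache, as in the @Repository class.
    private static final ThreadLocal<String> CACHE = new ThreadLocal<>();

    // Returns the cached permissions if present, otherwise "loads" them.
    static String permissionsFor(String user) {
        String cached = CACHE.get();
        if (cached != null) {
            return cached; // may belong to an earlier request on this thread
        }
        String loaded = "permissions-of-" + user; // placeholder for the DB lookup
        CACHE.set(loaded);
        return loaded;
    }

    // Runs two "requests" for different users on one pooled thread.
    static List<String> run() {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Callable<String> first = () -> permissionsFor("alice");
            Callable<String> second = () -> permissionsFor("bob");
            List<String> results = new ArrayList<>();
            results.add(pool.submit(first).get());
            results.add(pool.submit(second).get());
            return results;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Because the pool reuses its single thread, the second call gets the first user's permissions back, which matches what my failing tests show.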
I read that calling ThreadLocal's remove() clears the calling thread's entry from the variable (which wasn't implemented at the time).
I wanted to do this in a generic way so that all REST calls call the remove() before the REST Response is sent back.
Now, in order to clear the ThreadLocal entry, I tried
writing an Interceptor by implementing HandlerInterceptor but this didn't work.
I also wrote another Interceptor extending HandlerInterceptorAdapter and calling ThreadLocal's remove() in its afterCompletion().
I then tried implementing ServletRequestListener and called the ThreadLocal's remove() from its requestDestroyed() method.
In addition, I implemented a Filter and called remove() in doFilter() method.
All 4 of these implementations failed because, when I printed the Thread IDs in their methods, they were exactly the same as each other but different from the Thread ID printed in the RestController method.
So, the Thread calling the REST endpoint is a different thread from those being called by the above 4 classes. So the remove() call in the above classes never clears anything from ThreadLocal variable.
Can someone please provide some pointers on how to clear the ThreadLocal entry for a given thread in a generic way in Spring?
As you noticed, both the HandlerInterceptor and the ServletRequestListener are executed in the original servlet container thread, where the request is received. Since you are doing asynchronous processing, you need a CallableProcessingInterceptor.
Its preProcess and postProcess methods are executed on the thread where asynchronous processing will take place.
Therefore you need something like this:
WebAsyncUtils.getAsyncManager(request)
        .registerCallableInterceptor("some_unique_key", new CallableProcessingInterceptor() {
            @Override
            public <T> void postProcess(NativeWebRequest request, Callable<T> task,
                    Object concurrentResult) throws Exception {
                // remove the ThreadLocal
            }
        });
in a method that has access to the ServletRequest and executes in the original servlet container thread, e.g. in a HandlerInterceptor#preHandle method.
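The reason the cleanup has to run on the worker thread can be sketched without Spring. The following is only an illustration under my own assumptions: `WorkerCleanupDemo`, `PERMISSIONS` and `withCleanup` are hypothetical names, and the wrapper plays the role that postProcess plays in the snippet above.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class WorkerCleanupDemo {
    // Hypothetical per-thread permission cache.
    static final ThreadLocal<String> PERMISSIONS = new ThreadLocal<>();

    // Wraps a task so the ThreadLocal is cleared on the thread that ran the task.
    static <T> Callable<T> withCleanup(Callable<T> task) {
        return () -> {
            try {
                return task.call();
            } finally {
                PERMISSIONS.remove(); // executes on the worker thread
            }
        };
    }

    // Returns what a second task sees on the same worker thread after cleanup.
    static String run() {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Callable<String> first = () -> {
                PERMISSIONS.set("admin");
                return PERMISSIONS.get();
            };
            pool.submit(withCleanup(first)).get();
            Callable<String> second = () -> String.valueOf(PERMISSIONS.get());
            return pool.submit(second).get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Calling remove() from the submitting thread instead would leave the worker's entry untouched, which is what you observed with the interceptor, listener and filter.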
Remark: Instead of registering your own ThreadLocal, you can use Spring's RequestAttributes. Use the static method:
RequestContextHolder.currentRequestAttributes()
to retrieve the current instance. Under the hood a ThreadLocal is used, but Spring takes care of setting it and removing it on every thread where the processing of your request takes place (asynchronous processing included).
We are using @Async for multithreading. Up until each multithreaded method, I can see values for RequestContextHolder.getRequestAttributes().
But when I debug inside the method, I get NULL for the request attributes.
Any thoughts?
To get around this issue we created a ContextAwareRunnable Object that was pre-populated with the current requestHolder, securityContextHolder, etc, so that all spawned threads would be able to execute as if it were running in the main thread.
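A minimal sketch of that idea in plain Java, assuming a single hypothetical `CONTEXT` ThreadLocal in place of the real request and security holders (our actual class captured several of them):

```java
class ContextAwareRunnableDemo {
    // Hypothetical per-thread context; stands in for the request and security holders.
    static final ThreadLocal<String> CONTEXT = new ThreadLocal<>();

    static final class ContextAwareRunnable implements Runnable {
        private final String captured;
        private final Runnable delegate;

        ContextAwareRunnable(Runnable delegate) {
            this.captured = CONTEXT.get(); // captured on the thread that creates the task
            this.delegate = delegate;
        }

        @Override
        public void run() {
            String previous = CONTEXT.get();
            CONTEXT.set(captured); // make the captured context visible on the worker
            try {
                delegate.run();
            } finally {
                // restore whatever the worker thread had before
                if (previous == null) {
                    CONTEXT.remove();
                } else {
                    CONTEXT.set(previous);
                }
            }
        }
    }

    // Sets a context on the calling thread and reports what a spawned thread sees.
    static String run() {
        CONTEXT.set("request-42");
        String[] seen = new String[1];
        Thread worker = new Thread(new ContextAwareRunnable(() -> seen[0] = CONTEXT.get()));
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            CONTEXT.remove();
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

The important detail is that the context is read in the constructor (on the calling thread) and only applied in run() (on the worker thread).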
By default, a ThreadLocal variable is used as the holder for request attributes. That means that only the single thread which handles the entire HTTP request is able to access the request attributes. In contrast, @Async methods are processed by threads from a separate thread pool, so they can't access the attributes.
However, there is also an InheritableThreadLocal variable which can be used as the request attributes holder instead of the default one. You can enable it by setting the threadContextInheritable property to true in DispatcherServlet or RequestContextFilter.
Take a look at implementation of RequestContextHolder for more details.
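The difference between the two holder types can be seen in plain Java, independent of Spring. This is just a sketch with hypothetical names (`InheritableDemo`, `PLAIN`, `INHERITED`):

```java
class InheritableDemo {
    static final ThreadLocal<String> PLAIN = new ThreadLocal<>();
    static final ThreadLocal<String> INHERITED = new InheritableThreadLocal<>();

    // Returns {value of PLAIN, value of INHERITED} as seen by a child thread.
    static String[] run() {
        PLAIN.set("plain");
        INHERITED.set("inherited");
        String[] seen = new String[2];
        // The child is created after the values were set, so it copies INHERITED only.
        Thread child = new Thread(() -> {
            seen[0] = PLAIN.get();
            seen[1] = INHERITED.get();
        });
        child.start();
        try {
            child.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            PLAIN.remove();
            INHERITED.remove();
        }
        return seen;
    }

    public static void main(String[] args) {
        String[] seen = run();
        System.out.println(seen[0] + " / " + seen[1]);
    }
}
```

Keep in mind that an InheritableThreadLocal value is copied when the child thread is created. Threads that already sit in a pool were created earlier, so they will not pick up values set afterwards.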
EDITED SHORT VERSION OF THE POST:
Haven't had enough views, so I'm summarizing the question:
My architecture is completely stateless and async, the front-end makes a petition to a REST API and then long-polls for the response. This Rest API queues petitions into a messaging queue, and each petition is dequeued and processed by the Back-end.
I want this Back-end to follow the "traditional" Spring @Service interface and ServiceImpl approach; however, this is kind of hard because of how I'm doing it.
One Thread dequeues the petition (Producer), spawns a new Thread (Consumer), and then the whole petition is processed within that thread, which later sends the result back to a "responses pool" where it gets polled. That petition might need to use several @Services and merge the responses from each, maybe even the same @Service twice.
How would you do it? For more information, check the description below!
ORIGINAL LONG POST:
I have a large application with 3 layers like this:
Front-end (Spring-MVC): Views and Controllers, "Model" are async requests to REST API in Middleware to queue the petition first and then long-polling for an answer
Middleware (Spring-MVC): The rest API. Two main functions: receives a petition from front-end and queues it, receives an answer from Backend and stores it on responses cache until retrieved by front-end
Back-End (Spring Standalone App): Producer/Consumer pattern, ONE Producer dequeues petition and creates a Prototype Consumer for each petition. The consumer implements InitializingBean, so it goes something like this: It is initialized, many Autowired fields are initialized and then afterPropertiesSet is executed and many fields which depends on the petition are set.
I also have a Repository Layer of HibernateDaos, which does all the querying to the database.
I'm missing a properly built Service Layer and that's what this question is all about.
Let me give a little more context. What I have right now is a single HUGE service with 221 functions (the Consumer's file is very long), and one petition may need to invoke several of these functions. The result of each is merged into a List of DTOs, which is later received by the front-end.
I want to split this one and only service into several, each logically matching its corresponding Repository. However, I've faced the following problems:
Keep this in mind:
One petition has many Actions, one action is a call to a function of a Service.
Each Consumer is a single and unique Thread.
Every time a Consumer Thread starts, a transaction is started, and right before returning it is committed, unless it was rolled back.
I need all the services of that petition to be executed in the same thread and transaction.
When the consumer runs afterPropertiesSet, several fields specific to that request are initialized by some parameters which are always sent.
With a good Service Layer I want to accomplish the following:
I don't want to have to initialize all these parameters for each service of the petition. I want them to be global to the petition/Thread; however, I don't want to have to pass them as parameters to all 221 functions.
I want to lazily initialize the new services, only if needed, and when it is initialized, I want to set all the parameters I mentioned above. Each service needs to be a Prototype to the petition, however I feel like is dumb initializing it twice if needed within the same petition (2 actions for the same service on one petition), i.e. I want it to behave like a "Request" scope, however, it is not a request since it is not Web Based, it is a new Thread initialized by the Producer when de-queuing a petition.
I was thinking of having a prototype ServicesFactory per Consumer, which is initialized with all the parameters in the Consumer's afterPropertiesSet. Inside this ServicesFactory all possible Services are declared as class fields, and when a specific service is requested, if its field is null it is initialized and all fields are set; if not null, the same instance is returned. The problem with this approach is that I'm losing Dependency Injection on all the Services. I've been reading about ServiceFactoryBean, thinking maybe this is the way to go, but I really can't get the hang of it. The fact that it needs all the parameters of the Consumer, and that it needs to be a unique ServiceFactoryBean per Consumer, is really confusing.
Any thoughts?
Thanks
Based on the description, I don't think this is a good case for using the prototype scope; the ideal scope here seems to be thread scope.
As a solution, the simplest would be to make all services singleton. Then the consumer reads the petition from the inbound queue and starts processing.
One of the services, which is also a singleton, gets injected into all the services that need it; let's call it PetitionScopedService.
This service internally uses a ThreadLocal, which is a thread-scoped holder for a variable of type PetitionContext. PetitionContext in turn contains all the information that is global to that petition.
All the consumer needs to do is to set the initial values of the petition context, and any caller of PetitionScopedService on the same thread will be able to read those values in a transparent way. Here is some sample code:
public class PetitionContext {
    // ... just a POJO, getters and setters etc.
}

@Service
public class PetitionScopedService {
    // holds one PetitionContext per thread
    private final ThreadLocal<PetitionContext> context = new ThreadLocal<PetitionContext>();

    public void doSomethingPetitionSpecific() {
        // ... uses the petition context ...
    }
}

@Service
public class SomeOtherService {
    @Autowired
    private PetitionScopedService petitionService;

    // ... use petition service that is a singleton with thread scoped internal state, effectively thread scoped ...
}
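For a version that can be run on its own, here is a sketch without Spring. It uses constructor injection instead of @Autowired, and the method names `begin`, `end`, `current` and `describe`, as well as the `petitionId` field, are assumptions of mine rather than anything from the question.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class PetitionContext {
    private final String petitionId; // hypothetical petition-wide parameter
    PetitionContext(String petitionId) { this.petitionId = petitionId; }
    String getPetitionId() { return petitionId; }
}

class PetitionScopedService {
    private final ThreadLocal<PetitionContext> context = new ThreadLocal<>();

    void begin(PetitionContext petitionContext) { context.set(petitionContext); }
    void end() { context.remove(); }

    PetitionContext current() {
        PetitionContext current = context.get();
        if (current == null) {
            throw new IllegalStateException("No petition bound to this thread");
        }
        return current;
    }
}

class SomeOtherService {
    private final PetitionScopedService petitionService;
    SomeOtherService(PetitionScopedService petitionService) {
        this.petitionService = petitionService;
    }
    // Reads petition-wide data without receiving it as a parameter.
    String describe() { return "handling " + petitionService.current().getPetitionId(); }
}

class PetitionDemo {
    // Two consumer threads share the same singleton services but see their own context.
    static Map<String, String> run() {
        PetitionScopedService scoped = new PetitionScopedService();
        SomeOtherService other = new SomeOtherService(scoped);
        Map<String, String> seen = new ConcurrentHashMap<>();
        Thread[] consumers = new Thread[2];
        for (int i = 0; i < consumers.length; i++) {
            String id = "petition-" + i;
            consumers[i] = new Thread(() -> {
                scoped.begin(new PetitionContext(id));
                try {
                    seen.put(id, other.describe());
                } finally {
                    scoped.end(); // always clear, in case the thread is reused
                }
            });
            consumers[i].start();
        }
        for (Thread consumer : consumers) {
            try {
                consumer.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return seen;
    }
}
```

The consumer calls begin() once after dequeuing the petition and end() in a finally block, so every service call in between reads the same context.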
Points 2 and 3 need more reorganizing. I would check "Spring Integration" for both the "Middleware" and the "Back-End (Spring Standalone App): Producer/Consumer pattern"; Spring Integration was made to solve exactly these 2 points, and it supports publish/subscribe if you are doing 2 or more actions at the same time. The other point is why you are using REST in the "Middleware": are these "Middleware" services used by another app besides your front end? In that case you can integrate this part into your Spring-MVC front-end app using "content negotiation". Otherwise, if you are going to use "Spring Integration", you will find multiple ways to communicate.
I have a class which acts as a simple crawler and I want to invoke this class within a servlet.
My idea is to get a URL from the user; the URL will be passed to the servlet, the servlet will pass the URL to the class, and the class will start crawling. I want my servlet to create only one instance of this class. The data retrieved by the crawler will be added to the DB directly by the class.
I want to control the behavior of the class like running/halting/stopping from servlet
(For this I think I can create a simple XML file shared between the servlet and the class; if the servlet changes the status code, the class should respond to the status change.)
But I have some doubts about how to control the behavior of the class, such as commanding it to run/halt/stop. Since my class is not multithreaded, I have no idea what will happen to the invoked class after calling it from the servlet, and since this class needs to read from the network, I'll obviously have some gap/freezing phase while it runs.
How can I solve the problem of concurrency in this situation? Or, in other words, will I have any concurrency issues at all?
regards.
It depends on the Servlet container you are using. Some containers spawn a new Thread per user request (almost always this is the desired behavior), so you should definitely design for concurrency.
You can make the Servlet class implement SingleThreadModel; then in the service method you can directly call the crawler class code, as only one thread will enter service at a time.
This implies only one URL can be processed at a given time, which is probably not what you want. So instead of that, don't implement SingleThreadModel and create a singleton executor service in the init method:
ExecutorService ex = Executors.newFixedThreadPool(20); //Only 20 tasks at a given time
Then, in the service method create a new CrawlingTask (Runnable) with the URL specified in the request, then submit the task to the executor.
That way you could also shut it down:
ex.shutdown();
As ExecutorService is thread-safe, you don't have to worry about concurrency when enqueuing tasks.
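Put together, the executor-based approach might look like the sketch below. It is plain Java without the servlet API; `CrawlerManager`, `crawl` and `shutdownAndWait` are hypothetical names, and the crawl body is only a placeholder that records the URL.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Would be created once (e.g. in the servlet's init method) and shared by all requests.
class CrawlerManager {
    private final ExecutorService executor = Executors.newFixedThreadPool(20);
    private final Set<String> crawled = ConcurrentHashMap.newKeySet();

    // Called from the service method: enqueue and return immediately.
    Future<?> submit(String url) {
        Runnable task = () -> crawl(url);
        return executor.submit(task);
    }

    // Placeholder for the real crawling and DB code.
    private void crawl(String url) {
        crawled.add(url);
    }

    // Stops accepting new tasks and waits for the queued ones to finish.
    boolean shutdownAndWait(long timeoutSeconds) {
        executor.shutdown();
        try {
            return executor.awaitTermination(timeoutSeconds, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    int crawledCount() {
        return crawled.size();
    }

    // Submits two URLs, shuts down, and reports how many were processed.
    static int demo() {
        CrawlerManager manager = new CrawlerManager();
        manager.submit("http://example.com/a");
        manager.submit("http://example.com/b");
        manager.shutdownAndWait(5);
        return manager.crawledCount();
    }
}
```

The Future returned by submit can be kept if you later want to cancel an individual crawl rather than shutting the whole executor down.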
First, understand the difference between a Class and a Thread. A class is just code, a thread is where the code is executed. You don't stop/halt a class, you stop or halt a thread that is executing code in a class.
I would suggest you start reading up on Java concurrency programming, since what you are describing is very much about multithreading and thread synchronization.
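As a small illustration of stopping a thread rather than a class, here is a sketch that uses interruption as the stop signal. `StoppableCrawler` is a made-up name, and the loop body is a placeholder for fetching pages.

```java
import java.util.concurrent.atomic.AtomicLong;

class StoppableCrawler implements Runnable {
    private final AtomicLong pagesVisited = new AtomicLong();

    @Override
    public void run() {
        // Keep working until someone interrupts the thread running this code.
        while (!Thread.currentThread().isInterrupted()) {
            pagesVisited.incrementAndGet(); // placeholder for fetching one page
        }
    }

    long pagesVisited() {
        return pagesVisited.get();
    }

    // Starts the crawler on its own thread, asks it to stop, and reports whether it did.
    static boolean demo() {
        StoppableCrawler crawler = new StoppableCrawler();
        Thread worker = new Thread(crawler, "crawler");
        worker.start();
        worker.interrupt(); // the "stop" command goes to the thread, not the class
        try {
            worker.join(5000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !worker.isAlive();
    }
}
```

If the real loop blocks on network reads or sleeps, the interruption typically surfaces as an exception there, and the code should treat that as the signal to exit.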
I have a web service that has ~1k request threads running simultaneously on average. These threads access data from a cache (currently ehcache). When the entries in the cache expire, the thread that hits the expired entry tries getting the new value from the DB, while the other threads also trying to hit this entry block, i.e. I use the BlockingEhCache decorator. Instead of having the other threads wait on the "fetching thread," I would like the other threads to use the "stale" value corresponding to the "missed" key. Are there any third-party ehcache decorators for this purpose? Do you know of any other caching solutions that have this behavior? Other suggestions?
I don't know EHCache well enough to give specific recommendations for solving your problem with it, so I'll outline what I would do without EHCache.
Let's assume all the threads are accessing this cache using a Service interface, called FooService, and a service bean called SimpleFooService. The service will have the methods required to get the data needed (which is also cached). This way you're hiding the fact that it's cached from the frontend (HTTP request objects).
Instead of simply storing the data to be cached in a property in the service, we'll make a special object for it. Let's call it FooCacheManager. It will store the cache in a property in FooCacheManager (let's say it's of type Map). It will have getters to get the cache. It will also have a special method called reload(), which will load the data from the DB (by calling a service method to get the data, or through the DAO), and replace the content of the cache (saved in a property).
The trick here is as follows:
Declare the cache property in FooCacheManager as an AtomicReference (a class introduced in Java 1.5). This guarantees thread safety when you read from it and assign to it. Your read/write actions will never collide, and you will never read a half-written value.
The reload() will first load the data into a temporary map, and then when it's finished it will assign the new map to the property saved in FooCacheManager. Since the property is an AtomicReference, the assignment is atomic, thus it's basically swapping the map in an instant without any need for locking.
TTL implementation - Have FooCacheManager implement Quartz's Job interface, making it effectively a Quartz job. In the execute method of the job, have it run reload(). In the Spring XML, define this job to run every xx minutes (your TTL), which can also be defined in a property file if you use PropertyPlaceholderConfigurer.
This method is effective since the reading threads:
Don't block for read
Don't call isExpired() on every read, which is 1k / second.
Also the writing thread doesn't block when writing the data.
If this wasn't clear, I can add example code.
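Here is a rough sketch of the idea. To keep it self-contained I swapped Quartz for a plain ScheduledExecutorService and replaced the DAO with a hypothetical `Supplier` passed to the constructor; everything apart from the FooCacheManager and reload() names is an assumption.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

class FooCacheManager {
    private final AtomicReference<Map<String, String>> cache =
            new AtomicReference<>(Collections.<String, String>emptyMap());
    private final Supplier<Map<String, String>> loader; // stands in for the DAO call

    FooCacheManager(Supplier<Map<String, String>> loader) {
        this.loader = loader;
        reload(); // initial load
    }

    // Readers never block; they see either the old map or the new one.
    String get(String key) {
        return cache.get().get(key);
    }

    // Build the new map off to the side, then swap the reference in one step.
    void reload() {
        Map<String, String> fresh = new HashMap<>(loader.get());
        cache.set(Collections.unmodifiableMap(fresh));
    }

    // Periodic reload; the period plays the role of the TTL.
    // Note: if reload() throws, scheduleAtFixedRate stops further runs.
    ScheduledExecutorService scheduleReload(long period, TimeUnit unit) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(this::reload, period, period, unit);
        return scheduler;
    }

    // Returns the value before and after a manual reload.
    static String[] demo() {
        AtomicInteger version = new AtomicInteger();
        FooCacheManager manager = new FooCacheManager(
                () -> Collections.singletonMap("k", "v" + version.incrementAndGet()));
        String before = manager.get("k");
        manager.reload();
        String after = manager.get("k");
        return new String[] {before, after};
    }
}
```

While reload() is running, readers keep getting the old (stale) map, which is the behavior asked for in the question.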
Since ehcache removes stale data, a different approach can be to refresh data with a probability that increases as expiration time approaches, and is 0 if expiration time is "sufficiently" far.
So, if thread 1 needs some data element, it might refresh it, even though data is not old yet.
In the meantime, thread 2 needs same data, it might use the existing data (while refresh thread has not finished yet). It is possible thread 2 might try to do a refresh too.
If you are working with references (the updater thread loads the object and then simply changes the reference in the cache), then no separate synchronization is required for get and set operations on the cache.
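One possible shape for such a refresh probability is sketched below. The linear ramp and the names `EarlyRefreshPolicy`, `ttlMillis` and `windowMillis` are my own assumptions; any function that is 0 far from expiry and grows towards 1 would fit the description.

```java
import java.util.concurrent.ThreadLocalRandom;

class EarlyRefreshPolicy {
    private final long ttlMillis;    // age at which the entry expires
    private final long windowMillis; // how long before expiry refreshes may start

    EarlyRefreshPolicy(long ttlMillis, long windowMillis) {
        if (windowMillis <= 0 || windowMillis > ttlMillis) {
            throw new IllegalArgumentException("window must be in (0, ttl]");
        }
        this.ttlMillis = ttlMillis;
        this.windowMillis = windowMillis;
    }

    // 0 while expiry is "sufficiently" far, then rises linearly to 1 at expiry.
    double refreshProbability(long ageMillis) {
        long windowStart = ttlMillis - windowMillis;
        if (ageMillis <= windowStart) {
            return 0.0;
        }
        if (ageMillis >= ttlMillis) {
            return 1.0;
        }
        return (double) (ageMillis - windowStart) / windowMillis;
    }

    // Each reading thread rolls the dice independently.
    boolean shouldRefresh(long ageMillis) {
        return ThreadLocalRandom.current().nextDouble() < refreshProbability(ageMillis);
    }
}
```

A thread for which shouldRefresh returns true reloads the entry and swaps the reference; all other threads keep using the existing value in the meantime.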
Are ThreadLocal variables global to all the requests made to the servlet that owns the variables?
I am using resin for the server.
Thanks for the answer.
I think I can make myself clearer.
The specific Case:
I want to:
initialize a static variable when the request starts the execution.
be able to query the value of the variable in further executions of methods called from the servlet, in a thread-safe way, until the request finishes executing
Short answer: Yes.
A bit longer one: This is how Spring does its magic. See RequestContextHolder (via DocJar).
Caution is needed though - you have to know when to invalidate the ThreadLocal, how to defer to other threads and how (not) to get tangled with a non-threadlocal context.
Or you could just use Spring...
I think they are global to all requests made with that specific thread only. Other threads get other copies of the thread-local data. This is the key point of thread-local storage:
http://en.wikipedia.org/wiki/Thread-local_storage#Java.
Unless you check the appropriate option in the servlets config, the servlet container will use your servlet with multiple threads to handle requests in parallel. So effectively you would have separate data for each thread that's up serving clients.
If your WebApplication isn't distributed (runs on multiple Java Virtual Machines), you can use the ServletContext object to store shared data across requests and threads (be sure to do proper locking then).
Like Adiel says, the proper way to do this is probably to use the request context (i.e. HttpServletRequest), not to create a ThreadLocal. While it's certainly possible to use a ThreadLocal here, you have to be careful to clean up your thread if you do that, since otherwise the next request that gets the thread will see the value associated with the previous request. (When the first request is done with the thread, the thread will go back into the pool and so the next request will see it.) No reason to have to manage that kind of thing when the request context exists for precisely this purpose.
Using ThreadLocal to store request-scoped information has the potential to break if you use Servlet 3.0 suspendable requests (or Jetty Continuations).
With those APIs, multiple threads process a single request.
ThreadLocal variables are always defined to be accessed globally, since the point is to transparently pass information around a system so that it can be accessed anywhere. The value of the variable is bound to the thread on which it is set, so even though the variable is global, it can have different values depending on the thread from which it is accessed.
A simple example would be to assign a user identity string to a thread in a thread local variable when the request is received in the servlet. Anywhere along the processing chain of that request (assuming it is on the same thread in the same VM), the identity can be retrieved by accessing this global variable. It would also be important to remove this value when the request is processed, since the thread will be put back in a thread pool.
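That example could look roughly like the following sketch in plain Java. The names `UserIdentityDemo`, `USER`, `handleRequest` and `deepInTheCallChain` are hypothetical, and two plain threads stand in for the container's request threads.

```java
class UserIdentityDemo {
    // Globally accessible variable whose value is bound to the current thread.
    private static final ThreadLocal<String> USER = new ThreadLocal<>();

    // Entry point of a request: bind the identity, process, always unbind.
    static String handleRequest(String user) {
        USER.set(user);
        try {
            return deepInTheCallChain();
        } finally {
            USER.remove(); // the thread may go back to a pool after this
        }
    }

    // Some method far from the servlet that needs the identity without a parameter.
    private static String deepInTheCallChain() {
        return "hello " + USER.get();
    }

    // Reports whether a value is still bound after a request has completed.
    static boolean leftoverAfterRequest() {
        handleRequest("carol");
        return USER.get() != null;
    }

    // Two threads use the same static variable but see their own values.
    static String[] demo() {
        String[] results = new String[2];
        Thread first = new Thread(() -> results[0] = handleRequest("alice"));
        Thread second = new Thread(() -> results[1] = handleRequest("bob"));
        first.start();
        second.start();
        try {
            first.join();
            second.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return results;
    }
}
```

The try/finally around the processing is the part that is easy to forget, and it only works if the whole request stays on one thread.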