Spring Batch: how to dynamically increase and decrease threads using ThreadPoolTaskExecutor

I'm fairly new to Spring Batch, so this may be a lack of understanding on my part. I want to understand how to dynamically increase and decrease the number of threads while my job is running, using ThreadPoolTaskExecutor in conjunction with ThreadPoolExecutor. I've tried subclassing both ThreadPoolTaskExecutor and ThreadPoolExecutor to gain access to beforeExecute() and afterExecute(), which would let me terminate threads when the corePoolSize is decreased, using an approach listed on this site.
What I don't seem to understand is that when I override initializeExecutor(), which returns an ExecutorService, the (private) threadPoolExecutor field in the parent class (ThreadPoolTaskExecutor) is not set. Only the private ExecutorService executor field (from the ExecutorConfigurationSupport class) is set.
Since threadPoolExecutor is not a protected member, I cannot access it. Without it being set, I end up with a "ThreadPoolExecutor not initialized" error from the Spring Framework when I run the job.
public class MyThreadPoolTaskExecutor extends ThreadPoolTaskExecutor
{
@Override
protected ExecutorService initializeExecutor(ThreadFactory tf, RejectedExecutionHandler reh)
{
BlockingQueue <Runnable> queue = createQueue(Integer.MAX_VALUE);
MyThreadPoolExecutor tp_executor = new MyThreadPoolExecutor(this.getCorePoolSize(), this.getMaxPoolSize(), this.getKeepAliveSeconds(), TimeUnit.SECONDS, queue, tf, reh);
// if you look at the parent class(ThreadPoolTaskExecutor) it performs this call next.
// this.threadPoolExecutor = executor;
// that is a private member with no ability to set via any methods.
return tp_executor;
}
}
public class MyThreadPoolExecutor extends ThreadPoolExecutor
{
public MyThreadPoolExecutor(int corePoolSize, int maxPoolSize, long keepAliveTimeout, TimeUnit timeunit, BlockingQueue<Runnable> workQueue, ThreadFactory tf, RejectedExecutionHandler reh)
{
super(corePoolSize, maxPoolSize, keepAliveTimeout, timeunit, workQueue, tf, reh);
}
protected void beforeExecute (final Thread thread, final Runnable job)
{
...
}
}
Can someone explain what I am missing in my approach?

I assume you want to use one number of threads in one job step and a different number in another. A simple way to achieve that is to declare two separate executors, each sized to the number of threads its step needs, and to let idle threads time out so that none are kept around when they are no longer necessary. Then inject the first executor into one step and the second into the other. Note that with the default unbounded queue a ThreadPoolExecutor only grows beyond corePoolSize once the queue is full, so a zero corePoolSize would leave a single worker thread; the core size is therefore set to the full thread count here. The keep-alive must also be greater than zero, because allowCoreThreadTimeOut(true) is rejected with a zero keep-alive.
@Configuration
public class Conf {
@Bean
public TaskExecutor executorA(@Value("${first.number.of.threads}") int numberOfThreads) {
return executor(numberOfThreads);
}
@Bean
public TaskExecutor executorB(@Value("${second.number.of.threads}") int numberOfThreads) {
return executor(numberOfThreads);
}
private TaskExecutor executor(int numberOfThreads) {
ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
executor.setCorePoolSize(numberOfThreads);
executor.setMaxPoolSize(numberOfThreads);
// release idle threads (core threads included) once the keep-alive elapses
executor.setAllowCoreThreadTimeOut(true);
executor.setKeepAliveSeconds(1);
return executor;
}
}
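If the thread count really has to change while a job is running, subclassing may not be needed: java.util.concurrent.ThreadPoolExecutor allows its core and maximum pool sizes to be changed on a live pool, and Spring's ThreadPoolTaskExecutor documents setCorePoolSize and setMaxPoolSize as modifiable at runtime. The following is a minimal JDK-only sketch of that idea; the class and method names (ResizablePoolDemo, newPool, resize) are made up for illustration and are not part of Spring or Spring Batch.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class ResizablePoolDemo {

    /** Builds a fixed-size pool whose idle threads are released after one second. */
    static ThreadPoolExecutor newPool(int threads) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                threads, threads, 1, TimeUnit.SECONDS, new LinkedBlockingQueue<>());
        pool.allowCoreThreadTimeOut(true);
        return pool;
    }

    /** Changes core and max together, ordering the calls so that core <= max always holds. */
    static void resize(ThreadPoolExecutor pool, int threads) {
        if (threads > pool.getMaximumPoolSize()) {
            pool.setMaximumPoolSize(threads); // growing: raise the ceiling first
            pool.setCorePoolSize(threads);
        } else {
            pool.setCorePoolSize(threads);    // shrinking: lower the floor first
            pool.setMaximumPoolSize(threads);
        }
    }

    public static void main(String[] args) {
        ThreadPoolExecutor pool = newPool(2);
        resize(pool, 8);
        System.out.println(pool.getCorePoolSize() + "/" + pool.getMaximumPoolSize()); // 8/8
        resize(pool, 3);
        System.out.println(pool.getCorePoolSize() + "/" + pool.getMaximumPoolSize()); // 3/3
        pool.shutdown();
    }
}
```

When the size is lowered, surplus threads are not killed mid-task; they exit once they become idle, which is usually what a running batch step wants.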

Related

How to get details of Spring Boot @Async method's rejected task?

In my application I'm using an @Async method that calls a REST service, and based on the REST service result I update the MyJob status in the DB.
@Async("thatOneTaskExecutor")
public void myAsyncTask(MyJob job) {
// get job details from the job and call rest service
// update the job with the result from rest service and save updated MyJob to DB
}
I'm using Spring's ThreadPoolTaskExecutor. Below is a snippet from my AsyncConfiguration class where I declare this task executor.
private ThreadPoolTaskExecutor createExecutor(String name, int core, int max, int queue) {
ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
executor.setCorePoolSize(core);
executor.setMaxPoolSize(max);
executor.setQueueCapacity(queue);
executor.setThreadNamePrefix(name);
executor.setTaskDecorator(new MdcAwareTaskDecorator());
executor.initialize();
return executor;
}
@Bean(name = "thatOneTaskExecutor")
public Executor taskExecutor() {
String prefix = "thatOneTask-";
int corePoolSize = 12;
int maxPoolSize = 20;
int queueSize = 1000;
ThreadPoolTaskExecutor executor = createExecutor(prefix, corePoolSize, maxPoolSize, queueSize);
executor.setRejectedExecutionHandler(new RejectedExecutionHandlerImpl());
return executor;
}
As you can see, I have configured a RejectedExecutionHandler for my executor.
According to the documentation, this method is called when the queue is full:
* Method that may be invoked by a {@link ThreadPoolExecutor} when
* {@link ThreadPoolExecutor#execute execute} cannot accept a
* task. This may occur when no more threads or queue slots are
* available because their bounds would be exceeded, or upon
* shutdown of the Executor.
public class RejectedExecutionHandlerImpl implements RejectedExecutionHandler {
@Override
public void rejectedExecution(Runnable r, ThreadPoolExecutor executor) {
log.error("Task Rejected because of max queue size");
// How to get info about that particular job, for which Task executor rejected this task??
}
}
The rejected execution handler is working fine for me. Now, inside this rejectedExecution method, I want to access the MyJob (the parameter of my async method) for which the async task was rejected. I want to update that particular rejected job with a status so that I can later run a cron job and process the rejected jobs. Inside rejectedExecution I only have the Runnable and the ThreadPoolExecutor; how can I get information about MyJob here?
My application's Spring boot version is 2.2.2.RELEASE
You could consider using the TaskExecutor directly instead of the @Async annotation: have the MyJob class implement Runnable and perform the required async operation inside its run() method.
The Runnable r can then be cast back to MyJob in the handler's rejectedExecution method, so you can retrieve the job's information from there.
public class MyJob implements Runnable {
.......
@Override
public void run() {
// get job details from the job and call rest service
// update the job with the result from rest service and save updated MyJob to DB
}
}
@Autowired
TaskExecutor taskExecutor;
public void myAsyncTask(MyJob job) {
taskExecutor.execute(job);
}
public class RejectedExecutionHandlerImpl implements RejectedExecutionHandler {
@Override
public void rejectedExecution(Runnable r, ThreadPoolExecutor executor) {
log.error("Task Rejected because of max queue size");
if (r instanceof MyJob) {
MyJob failedJob = (MyJob) r; // job info
}
}
}
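To see the cast working end to end, here is a small JDK-only sketch. It assumes a hypothetical Job class standing in for MyJob and a hypothetical RecordingHandler that collects rejected ids instead of updating a database; the pool is deliberately tiny (one thread, one queue slot) so that rejections are easy to trigger. One caveat worth checking in a real setup: if the executor wraps tasks (for example through a TaskDecorator, as in the question), the handler may receive the wrapper rather than the job itself.

```java
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionHandler;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class RejectedJobDemo {

    /** Hypothetical stand-in for MyJob: a Runnable that carries its own id. */
    static final class Job implements Runnable {
        final int id;
        final CountDownLatch gate;

        Job(int id, CountDownLatch gate) {
            this.id = id;
            this.gate = gate;
        }

        @Override
        public void run() {
            try {
                gate.await(); // keep the worker busy until the demo releases it
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    /** Records the id of every rejected Job instead of throwing. */
    static final class RecordingHandler implements RejectedExecutionHandler {
        final List<Integer> rejectedIds = new CopyOnWriteArrayList<>();

        @Override
        public void rejectedExecution(Runnable r, ThreadPoolExecutor executor) {
            if (r instanceof Job) {
                rejectedIds.add(((Job) r).id);
            }
        }
    }

    /** Submits four jobs to a pool with one thread and one queue slot; returns the rejected ids. */
    static List<Integer> runDemo() {
        CountDownLatch gate = new CountDownLatch(1);
        RecordingHandler handler = new RecordingHandler();
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0, TimeUnit.SECONDS, new ArrayBlockingQueue<>(1), handler);
        for (int id = 1; id <= 4; id++) {
            pool.execute(new Job(id, gate)); // job 1 runs, job 2 is queued, jobs 3 and 4 are rejected
        }
        gate.countDown();
        pool.shutdown();
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return handler.rejectedIds;
    }

    public static void main(String[] args) {
        System.out.println("rejected job ids: " + runDemo()); // rejected job ids: [3, 4]
    }
}
```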

Mixing Java `ExecutorService` with Spring `TaskExecutor`

I have a piece of Java code that I previously used without Spring that looks like this:
// `Callable` instead of `Runnable` because we need to throw exceptions
public class MyTask implements Callable<Void> {
@Override
public Void call() throws Exception { ... }
}
public class MyTasksRunner {
private final ExecutorService executorService;
...
public void run() throws Exception {
List<MyTask> tasks = ...;
var futures = executorService.invokeAll(tasks);
for (var future : futures) {
// Rethrow any exceptions happened in the threads.
future.get();
}
}
}
Now I'm merging this code into a larger Spring Boot application that has async enabled. It configures a TaskExecutor, which doesn't have the same interface as ExecutorService. A TaskExecutor can only run Runnables, not Callables.
I can probably have a TaskExecutor bean for async Spring, and another ExecutorService bean for the MyTasksRunner code at the same time. But I wonder what options I have if I want to merge those:
Can I tell Spring to use an ExecutorService for its async stuff?
Can I convert my Callable code to use Runnables instead, while still being able to propagate exceptions from the tasks?
I also thought about just making MyTask a Spring component and annotating it with @Async, but I don't really like that this ties the MyTask* code to Spring.
Yes, you can convert your Callable task to a Runnable, since you don't expect a return value. There is one condition: run() cannot throw checked exceptions, although you can still throw runtime exceptions.
Also, yes, you can define an Executor bean as below to inject an ExecutorService:
@Bean
public Executor taskExecutor() {
ExecutorService executor = Executors.newFixedThreadPool(2);
return executor;
}
If you don't define an Executor bean, Spring falls back to a SimpleAsyncTaskExecutor (recent Spring Boot versions typically auto-configure a ThreadPoolTaskExecutor instead).
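There is also a way to keep the Callable code and its checked exceptions while only depending on the plain Executor interface: java.util.concurrent.FutureTask is itself a Runnable, so a Callable can be wrapped in one, handed to any Executor (Spring's TaskExecutor extends that interface), and queried afterwards with get(). The sketch below assumes illustrative names (CallableOverExecutor, runAll, sampleTasks, failureMessageOrNull) that do not come from the question. Spring's AsyncTaskExecutor, which ThreadPoolTaskExecutor implements, also offers submit(Callable) if depending on that interface is acceptable.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.FutureTask;

class CallableOverExecutor {

    /** Runs every task on the given Executor and rethrows the first failure, if any. */
    static void runAll(Executor executor, List<Callable<Void>> tasks) throws Exception {
        List<FutureTask<Void>> futures = new ArrayList<>();
        for (Callable<Void> task : tasks) {
            FutureTask<Void> future = new FutureTask<>(task); // a FutureTask is a Runnable
            futures.add(future);
            executor.execute(future);
        }
        for (FutureTask<Void> future : futures) {
            try {
                future.get();
            } catch (ExecutionException e) {
                Throwable cause = e.getCause();
                if (cause instanceof Exception) {
                    throw (Exception) cause; // surface the task's own (possibly checked) exception
                }
                throw e;
            }
        }
    }

    /** Two illustrative tasks: one succeeds, one throws a checked exception. */
    static List<Callable<Void>> sampleTasks() {
        List<Callable<Void>> tasks = new ArrayList<>();
        tasks.add(() -> null);
        tasks.add(() -> {
            throw new IOException("boom");
        });
        return tasks;
    }

    /** Convenience wrapper for callers that only want to know what went wrong. */
    static String failureMessageOrNull(Executor executor, List<Callable<Void>> tasks) {
        try {
            runAll(executor, tasks);
            return null;
        } catch (Exception e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        System.out.println("failure: " + failureMessageOrNull(pool, sampleTasks())); // failure: boom
        pool.shutdown();
    }
}
```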

How to get the queue size of the executor in real time

Suppose I have this Application.java:
@SpringBootApplication
public class Application {
public static void main(String[] args){
SpringApplication.run(Application.class, args);
}
@Bean
public Executor asyncExecutor() {
ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
executor.setCorePoolSize(50);
executor.setMaxPoolSize(100);
executor.setQueueCapacity(5000);
executor.setThreadNamePrefix("sm-async-");
executor.setWaitForTasksToCompleteOnShutdown(true);
executor.initialize();
return executor;
}
}
My goal is to raise an alert when the current queue size of the async executor reaches 80% of its capacity or gets close to the limit. I think we can get the value from executor.getThreadPoolExecutor().getQueue().size(). I'm currently stuck on how to achieve that.
@Controller
public class QueueMonitorController {
@Autowired
private Executor executor;
@RequestMapping(value = "/queuesize", method = RequestMethod.GET)
public int queueSize() {
ThreadPoolExecutor tpe = (ThreadPoolExecutor)executor;
return tpe.getQueue().size();
}
}
If you can provide the bean as a ThreadPoolExecutor, then you don't even need the cast. The internal implementation of size() in LinkedBlockingQueue (which ThreadPoolExecutor uses) is AtomicInteger.get().
So there's no need to get creative and build your own mechanisms, it's all built-in. Based on Spring 4.2, but shouldn't depend on the version too much.
So the root goal is to monitor the queue, and send an alert when queue is 80% full. This should not go into your code which is responsible for making sure that your business logic works. You shouldn't make hacks there to account for lack of resources. If the idea is that you should throttle users when the queue is packed, there are far better ways to handle those.
Since the idea is to do "light monitoring", i.e. there's no attempt to handle a case when queue is 80% full, a polling solution would be lightweight enough. Considering that the executor can be easily injected to a separate Controller, it won't even mess up your "real" code.
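As a rough illustration of the polling idea, the sketch below computes how full a ThreadPoolExecutor's queue is from size() and remainingCapacity(), which works for any bounded BlockingQueue. The class and method names (QueueFillMonitor, fillRatio, demoRatio) are invented for this example; in the Spring setup from the question, the ThreadPoolExecutor would presumably come from getThreadPoolExecutor() on the ThreadPoolTaskExecutor bean.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class QueueFillMonitor {

    /** Returns the fraction of the queue that is currently occupied, between 0.0 and 1.0. */
    static double fillRatio(ThreadPoolExecutor pool) {
        BlockingQueue<Runnable> queue = pool.getQueue();
        long size = queue.size();
        long capacity = size + queue.remainingCapacity();
        return capacity == 0 ? 0.0 : (double) size / capacity;
    }

    /** True when the queue is at or above the given fraction, e.g. 0.8 for 80%. */
    static boolean isAtOrAbove(ThreadPoolExecutor pool, double threshold) {
        return fillRatio(pool) >= threshold;
    }

    /** Fills 8 of 10 queue slots behind one blocked worker and reports the ratio. */
    static double demoRatio() {
        CountDownLatch gate = new CountDownLatch(1);
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0, TimeUnit.SECONDS, new LinkedBlockingQueue<>(10));
        pool.execute(() -> {
            try {
                gate.await(); // occupies the only worker so later tasks stay queued
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        for (int i = 0; i < 8; i++) {
            pool.execute(() -> { });
        }
        double ratio = fillRatio(pool);
        gate.countDown();
        pool.shutdown();
        return ratio;
    }

    public static void main(String[] args) {
        System.out.println("queue fill ratio: " + demoRatio());
    }
}
```

A scheduled task or a monitoring endpoint could call isAtOrAbove(pool, 0.8) every few seconds and raise the alert; since size() is only a snapshot, the value is approximate under load, which is acceptable for light monitoring.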
ThreadPoolTaskExecutor does not expose its queue directly. However, you are free to extend ThreadPoolTaskExecutor and create a CustomThreadPoolTaskExecutor that overrides createQueue.
public class CustomThreadPoolTaskExecutor extends ThreadPoolTaskExecutor{
private BlockingQueue<Runnable> queue;
@Override
protected BlockingQueue<Runnable> createQueue(int queueCapacity) {
queue = super.createQueue(queueCapacity);
return queue;
}
public BlockingQueue<Runnable> getQueue(){
return queue;
}
}
Now you can create asyncExecutor like below :
@Bean
public Executor asyncExecutor() {
ThreadPoolTaskExecutor executor = new CustomThreadPoolTaskExecutor();
//set other properties
executor.initialize();
return executor;
}
Your CustomThreadPoolTaskExecutor has a public getQueue method that you can use to get the queue.
I'm not sure where your ThreadPoolTaskExecutor type comes from, but in plain Java you can cast to ThreadPoolExecutor and get the queue and its size as follows:
ThreadPoolExecutor executorService = (ThreadPoolExecutor) Executors.newCachedThreadPool();
executorService.getQueue().size();
To do this in real-time as you're asking for is not so easy. You'll need to decorate the methods of BlockingQueue so that you can add code to execute immediately when the content of the queue changes.
You can then provide your queue to Spring's ThreadPoolTaskExecutor like this:
ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor() {
@Override
protected BlockingQueue<Runnable> createQueue(int queueCapacity) {
// create and return your instance of blocking queue here
}
};
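One possible shape for such a decorated queue is sketched below: a LinkedBlockingQueue subclass that overrides offer(), the method ThreadPoolExecutor uses to enqueue work, and invokes a callback whenever an accepted task leaves the queue at or above a chosen size. AlertingQueue, alertSize and onAlert are hypothetical names for this illustration; an instance could be returned from the createQueue override above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.function.IntConsumer;

class AlertingQueueDemo {

    /** Bounded queue that reports its size whenever an accepted task leaves it at or above alertSize. */
    static final class AlertingQueue extends LinkedBlockingQueue<Runnable> {
        private final int alertSize;
        private final IntConsumer onAlert;

        AlertingQueue(int capacity, int alertSize, IntConsumer onAlert) {
            super(capacity);
            this.alertSize = alertSize;
            this.onAlert = onAlert;
        }

        @Override
        public boolean offer(Runnable task) {
            boolean accepted = super.offer(task);
            if (accepted) {
                int size = size();
                if (size >= alertSize) {
                    onAlert.accept(size); // runs on the submitting thread, so keep it cheap
                }
            }
            return accepted;
        }
    }

    private static void await(CountDownLatch gate) {
        try {
            gate.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Queues nine tasks behind one blocked worker; alerts fire once the queue holds 8 of 10. */
    static List<Integer> runDemo() {
        List<Integer> alerts = new ArrayList<>();
        CountDownLatch gate = new CountDownLatch(1);
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0, TimeUnit.SECONDS, new AlertingQueue(10, 8, alerts::add));
        pool.execute(() -> await(gate)); // taken directly by the worker, never queued
        for (int i = 0; i < 9; i++) {
            pool.execute(() -> { });
        }
        gate.countDown();
        pool.shutdown();
        return alerts;
    }

    public static void main(String[] args) {
        System.out.println("alerts at sizes: " + runDemo()); // alerts at sizes: [8, 9]
    }
}
```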

Middle ground between ExecutorService.shutDown() and ExecutorService.shutDownNow()

I currently have an ExecutorService where I want the following:
No new tasks should be accepted.
Current tasks should still keep on executing.
All currently queued up tasks should be returned.
The issue I am facing is that shutdown() will not return the submitted tasks that have not started executing (it lets all submitted tasks run to completion), while shutdownNow() will abruptly try to stop the tasks that are already running.
How should I work around this?
You should be able to accomplish this by creating your own class that extends ThreadPoolExecutor and providing your own shutdown method.
Below is an example for how this could be done, you might want to adjust it to your needs (providing more constructor methods for example)
public class MyExecutor extends ThreadPoolExecutor {
private final BlockingQueue<Runnable> queue;
public MyExecutor(int corePoolSize,
int maximumPoolSize,
long keepAliveTime,
TimeUnit unit,
BlockingQueue<Runnable> workQueue) {
super(corePoolSize, maximumPoolSize, keepAliveTime, unit, workQueue);
this.queue = workQueue;
}
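public List<Runnable> shutdownSpecial() {
// stop accepting new tasks first, so nothing can be queued after the drain
this.shutdown();
// running tasks keep going; whatever is still queued is handed back to the caller
ArrayList<Runnable> queued = new ArrayList<>();
queue.drainTo(queued);
return queued;
}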
}
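The behaviour can be checked with a small self-contained run. In this sketch (DrainingExecutorDemo, DrainingExecutor and shutdownAndDrain are illustrative names, not from the answer above), one task is held in progress while three more wait in the queue; after the special shutdown the three queued tasks come back to the caller and only the running one completes. Note that tasks submitted through submit() would come back wrapped as FutureTask instances, so the example uses execute().

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class DrainingExecutorDemo {

    /** Pool with a shutdown variant that returns queued tasks but lets running ones finish. */
    static final class DrainingExecutor extends ThreadPoolExecutor {
        DrainingExecutor(int threads) {
            super(threads, threads, 0, TimeUnit.SECONDS, new LinkedBlockingQueue<>());
        }

        List<Runnable> shutdownAndDrain() {
            shutdown();                      // reject new tasks from here on
            List<Runnable> pending = new ArrayList<>();
            getQueue().drainTo(pending);     // take back everything not yet started
            return pending;
        }
    }

    private static void await(CountDownLatch gate) {
        try {
            gate.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Returns {number of tasks handed back, number of tasks that actually ran}. */
    static int[] runDemo() {
        CountDownLatch gate = new CountDownLatch(1);
        AtomicInteger completed = new AtomicInteger();
        DrainingExecutor pool = new DrainingExecutor(1);
        pool.execute(() -> {
            await(gate);                     // stays "currently executing" during the shutdown
            completed.incrementAndGet();
        });
        for (int i = 0; i < 3; i++) {
            pool.execute(completed::incrementAndGet); // these wait in the queue
        }
        List<Runnable> pending = pool.shutdownAndDrain();
        gate.countDown();
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new int[] { pending.size(), completed.get() };
    }

    public static void main(String[] args) {
        int[] result = runDemo();
        System.out.println("returned=" + result[0] + " completed=" + result[1]); // returned=3 completed=1
    }
}
```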
You can do something like the following:
As you know, the thread-pool implementations of ExecutorService use a BlockingQueue (bounded or unbounded) to hold your jobs (Runnables).
So, considering your requirement, you just have to do something like this (use drainTo() instead of clear() if you need the queued jobs back):
yourWorkqueue.clear();
And after that, you can call
executor.shutdown();
AFAIK, the currently executing jobs are no longer in the queue, so the threads in the pool will keep working on the jobs in hand while you clear the job queue.

How to set priority to scheduled task?

I've got 3 tasks. 1st - adds new data. 2nd - backups. 3rd - deletes old data. They work every 10 minutes. How it should be:
1st task
2nd task
3rd task
What I've got:
2nd task
1st task
3rd task
How can I set priority to tasks?
If you use a Java ThreadPoolExecutor, you can provide your own task queue for it to use internally; you should not interact with the queue directly.
You can use a PriorityBlockingQueue constructed with a custom Comparator that decides which task goes first.
You can combine the @Scheduled annotation with a custom executor, as explained in the docs:
@Configuration
@EnableScheduling
public class AppConfig implements SchedulingConfigurer {
@Override
public void configureTasks(ScheduledTaskRegistrar taskRegistrar) {
taskRegistrar.setScheduler(taskExecutor());
}
@Bean(destroyMethod="shutdown")
public Executor taskExecutor() {
return new ThreadPoolExecutor(1, 2, 10, TimeUnit.SECONDS, new PriorityBlockingQueue<Runnable>(20, new Comparator<Runnable>() {
@Override
public int compare(Runnable o1, Runnable o2) {
// every queued task is assumed to be a Runnable2
return ((Runnable2) o1).getPriority().compareTo(((Runnable2) o2).getPriority());
}
}));
}
}
This assumes a Runnable2 class that implements Runnable and carries an assigned priority. Note that ScheduledTaskRegistrar.setScheduler expects a TaskScheduler or a ScheduledExecutorService, so a plain ThreadPoolExecutor like this one may need to be wrapped or adapted before Spring accepts it.
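Leaving the Spring scheduling wiring aside, the ordering effect of a PriorityBlockingQueue can be seen with the JDK alone. The sketch below uses a hypothetical PrioritizedTask (playing the role of Runnable2) with natural ordering by priority, and a single worker that is held busy while three tasks are queued out of order. It relies on execute() rather than submit(), because submit() wraps tasks in FutureTask objects that are not comparable.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.PriorityBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class PriorityPoolDemo {

    /** Hypothetical task type: lower priority value means it runs earlier. */
    static final class PrioritizedTask implements Runnable, Comparable<PrioritizedTask> {
        final int priority;
        final String name;
        final List<String> log;

        PrioritizedTask(int priority, String name, List<String> log) {
            this.priority = priority;
            this.name = name;
            this.log = log;
        }

        @Override
        public void run() {
            log.add(name);
        }

        @Override
        public int compareTo(PrioritizedTask other) {
            return Integer.compare(priority, other.priority);
        }
    }

    /** Queues delete, add, backup (in that order) and returns the order in which they ran. */
    static List<String> runDemo() {
        List<String> log = new CopyOnWriteArrayList<>();
        CountDownLatch gate = new CountDownLatch(1);
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0, TimeUnit.SECONDS, new PriorityBlockingQueue<>());
        // The first task goes straight to the new worker thread and never enters the queue.
        pool.execute(() -> {
            try {
                gate.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        pool.execute(new PrioritizedTask(3, "delete", log));
        pool.execute(new PrioritizedTask(1, "add", log));
        pool.execute(new PrioritizedTask(2, "backup", log));
        gate.countDown();
        pool.shutdown();
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(runDemo()); // [add, backup, delete]
    }
}
```

The queue only reorders tasks that are waiting at the same time; if a worker is free when a task arrives, that task starts immediately regardless of its priority.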
You may want to look into converting these tasks to a Spring Batch job. This would provide more powerful features such as transactions and better error handling.
