Problem statement:
I have 1,000 tasks and need to process them via ThreadPoolTaskExecutor. ThreadPoolTaskExecutor has corePoolSize = 5, maxPoolSize = 10 and queueCapacity = 1000.
Now from the main method, I am executing the following code
CountDownLatch latch = new CountDownLatch(5);
Collection<Future<?>> futures = new LinkedList<Future<?>>();
for (Map.Entry<String, Boolean> entry : map.entrySet()) {
    FutureTask task = new FutureTask(new CustomTask(entry));
    executor.execute(task);
}
log.info("ACTIVE COUNT : " + executor.getActiveCount());
log.info("SIZE of the QUEUE : " + executor.getThreadPoolExecutor().getQueue().size());
log.info("LATCH WAIT : " + latch.getCount());
latch.await();
.....
@Override
public Object call() throws Exception {
    latch.countDown();
    // some logic
    return entry;
}
Now, the map has 1,000 entries in it, and I want to process all 1,000 tasks in the queue and then print these log lines. What's happening here is that the executor creates corePoolSize threads (which is equal to the CountDownLatch count) and executes that many tasks right away. Once this number is hit, it starts filling up the queue (which is totally fine and desired). However, these queued tasks are processed ONLY AFTER the main thread reaches its end; only then do they start executing. This is not what I want. I want the Executor to start picking up items from the queue as soon as threads become free after processing batch-1.
But in my case, once batch-1 is processed, the next task is picked up only when the main thread ends (which I do not want).
Does anyone have a solution for how this can be achieved (processing the queue as soon as a thread is available)?
P.S.: I understand that latch.await() waits for the threads to complete their execution, but I am looking for behavior where it waits for all the threads to finish (which is happening) and for the queue to be empty (my expectation).
Thank You
If you are going to do it this way, you need to initialize the latch with the number of tasks that you are going to submit; i.e., 1,000. Also, you should decrement the latch at the end of each task, not at its start (as your code currently seems to be doing).
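For illustration, here is a minimal, self-contained sketch of that approach (the small Map.of literal stands in for the real 1,000-entry map, and the task body is inlined instead of the question's CustomTask); the latch is sized to the number of tasks and counted down in a finally block so it still fires if a task throws:
import java.util.Map;
import java.util.concurrent.*;

public class LatchPerTaskExample {
    public static void main(String[] args) throws InterruptedException {
        ExecutorService executor = Executors.newFixedThreadPool(5);
        Map<String, Boolean> map = Map.of("a", true, "b", false); // stand-in for the real 1,000 entries

        CountDownLatch latch = new CountDownLatch(map.size());    // one count per task
        for (Map.Entry<String, Boolean> entry : map.entrySet()) {
            executor.submit(() -> {
                try {
                    // process the entry here
                    return entry;
                } finally {
                    latch.countDown();                            // decrement at the END of the task
                }
            });
        }

        latch.await(); // returns only once every task has finished
        System.out.println("All tasks finished, queue is empty");
        executor.shutdown();
    }
}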
But you don't need a latch or a counter or anything to implement this. Instead, if you are using a Java SE ExecutorService directly, just do this:
public static void main(String[] args) {
    // Submit lots of tasks
    executorService.shutdown();
    try {
        // Waits until all tasks in the queue have completed
        executorService.awaitTermination(1_000_000, TimeUnit.SECONDS);
    } catch (InterruptedException ex) {
        // OK ... will end now
    }
}
And if you are using the SpringFramework specific ThreadPoolTaskExecutor class:
public static void main(String[] args) {
    // Submit lots of tasks
    executor.setAwaitTerminationSeconds(1_000_000);
    executor.setWaitForTasksToCompleteOnShutdown(true);
    executor.shutdown();
}
I am new to concurrency and was trying to use ExecutorService-based concurrency in a do-while loop, but I always run into a RejectedExecutionException.
Here is my sample code:
do {
    Future<Void> future = executor.submit(new Callable<Void>() {
        @Override
        public Void call() throws Exception {
            // action
            return null;
        }
    });
    futures.add(future);

    executor.shutdown();
    for (Future<Void> f : futures) {
        try {
            f.get();
        } catch (InterruptedException e) {
            throw new IOException(e);
        }
    }
} while (true);
But this seems incorrect. I think I am calling shutdown in the wrong place. Can anyone please help me use an ExecutorService in a do-while loop correctly? Thanks.
ExecutorService.shutdown() stops the ExecutorService from accepting any more jobs. It should be called when you're done submitting jobs.
Also, Future.get() is a blocking method: it blocks the current thread, and the next iteration of the loop will not continue until the future (on which get is called) completes. This happens in every iteration, which makes the code effectively non-parallel.
You can use a CountDownLatch to wait for all the jobs to return.
Following is the correct code.
final List<Object> results = Collections.synchronizedList(new ArrayList<Object>());
final CountDownLatch latch = new CountDownLatch(10); // suppose you'll have 10 futures
do {
    Future<Void> future = executor.submit(new Callable<Void>() {
        @Override
        public Void call() throws Exception {
            // action
            results.add(result);  // record some result first
            latch.countDown();    // then decrease the latch count
            return null;
        }
    });
    futures.add(future);
} while (true);
executor.shutdown();
latch.await(); // This will block till latch.countDown() has been called 10 times.
// Now results has all the outputs, do what you want with them.
Also if you're working with Java 8 then you can take a look at this answer https://stackoverflow.com/a/36261808/5343269
You're right, the shutdown method is not being called at the correct time. The ExecutorService will not accept tasks after shutdown is called (unless you implement your own version that does).
You should call shutdown after you've already submitted all tasks to the executor, so in this case, somewhere after the do-while loop.
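As a rough sketch of that structure (a minimal illustration; the fixed pool size and the remaining counter are placeholders that are not in the original code): submit inside the loop, shut down once the loop has finished, then wait on the futures:
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.*;

public class SubmitThenShutdown {
    public static void main(String[] args) {
        ExecutorService executor = Executors.newFixedThreadPool(4);
        List<Future<?>> futures = new ArrayList<>();

        int remaining = 10;                          // stand-in for the real loop condition
        do {
            futures.add(executor.submit(() -> {
                // action (the task body goes here)
            }));
            remaining--;
        } while (remaining > 0);                     // a real exit condition instead of while (true)

        executor.shutdown();                         // called only after all tasks were submitted
        for (Future<?> f : futures) {
            try {
                f.get();                             // block until this task completes
            } catch (InterruptedException | ExecutionException e) {
                e.printStackTrace();                 // handle or wrap as appropriate
            }
        }
    }
}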
From ThreadPoolExecutor documentation:
Rejected tasks
New tasks submitted in method execute(Runnable) will be rejected when the Executor has been shut down, and also when the Executor uses finite bounds for both maximum threads and work queue capacity, and is saturated.
In either case, the execute method invokes the RejectedExecutionHandler.rejectedExecution(Runnable, ThreadPoolExecutor) method of its RejectedExecutionHandler.
From your code, it is evident that you call shutdown() inside the loop, so on the next iteration you submit tasks to an executor that has already been shut down.
On a different note, refer to this related SE question for right way of shutting down ExecutorService:
ExecutorService's shutdown() doesn't wait until all threads will be finished
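For completeness, the orderly-shutdown idiom (essentially the pattern shown in the ExecutorService Javadoc; the 60-second timeouts here are arbitrary) looks roughly like this:
import java.util.concurrent.ExecutorService;
import java.util.concurrent.TimeUnit;

class PoolShutdown {
    static void shutdownAndAwaitTermination(ExecutorService pool) {
        pool.shutdown();                                           // stop accepting new tasks
        try {
            if (!pool.awaitTermination(60, TimeUnit.SECONDS)) {    // wait for running tasks
                pool.shutdownNow();                                // cancel anything still running
                if (!pool.awaitTermination(60, TimeUnit.SECONDS)) {
                    System.err.println("Pool did not terminate");
                }
            }
        } catch (InterruptedException ie) {
            pool.shutdownNow();
            Thread.currentThread().interrupt();                    // preserve the interrupt status
        }
    }
}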
I found an unexpected deadlock while running tasks in a ThreadPoolExecutor.
The idea is a main task that launches a secondary task that changes a flag.
The main task halts until the secondary task updates the flag.
If corePoolSize >= 2, the main task completes as expected.
If corePoolSize < 2, it seems that the secondary task is enqueued but never launched.
Using a SynchronousQueue instead, the main task completes even for corePoolSize = 0.
I'd like to know:
What is the cause of the deadlock? It does not seem obvious from the documentation.
Why does using a SynchronousQueue instead of a LinkedBlockingQueue prevent the deadlock?
Is corePoolSize = 2 a safe value to prevent this kind of deadlock?
import java.util.concurrent.*;

class ExecutorDeadlock {
    /*------ FIELDS -------------*/
    volatile boolean halted = true; // volatile so the change made by secondaryTask is visible to primaryTask
    ExecutorService executor;

    Runnable secondaryTask = new Runnable() {
        public void run() {
            System.out.println("secondaryTask started");
            halted = false;
            System.out.println("secondaryTask completed");
        }
    };

    Runnable primaryTask = new Runnable() {
        public void run() {
            System.out.println("primaryTask started");
            executor.execute(secondaryTask);
            while (halted) {
                try {
                    Thread.sleep(500);
                } catch (Throwable e) {
                    e.printStackTrace();
                }
            }
            System.out.println("primaryTask completed");
        }
    };

    /*-------- EXECUTE -----------*/
    void execute() {
        executor.execute(primaryTask);
    }

    /*-------- CTOR -----------*/
    ExecutorDeadlock(int corePoolSize, BlockingQueue<Runnable> workQueue) {
        this.executor = new ThreadPoolExecutor(corePoolSize, 4, 0L, TimeUnit.MILLISECONDS, workQueue);
    }

    /*-------- TEST -----------*/
    public static void main(String[] args) {
        new ExecutorDeadlock(2, new LinkedBlockingQueue<>()).execute();
        //new ExecutorDeadlock(1, new LinkedBlockingQueue<>()).execute();
        //new ExecutorDeadlock(0, new SynchronousQueue<>()).execute();
    }
}
How do you expect this to work with a thread count < 2, when:
You have only 1 executor thread
The first task adds the secondary task to the executor queue and WAITS for it to start
Tasks are fetched from the queue by the executor service when there are free threads in the pool. In your case (< 2) the executor thread is never released by the first task.
There is no deadlock issue here.
EDIT:
OK, I've dug up some info, and this is what I have found out. First of all, some info from the ThreadPoolExecutor documentation:
Any BlockingQueue may be used to transfer and hold submitted tasks.
The use of this queue interacts with pool sizing:
If fewer than corePoolSize threads are running, the Executor always prefers adding a new thread rather than queuing.
If corePoolSize or more threads are running, the Executor always prefers queuing a request rather than adding a new thread.
If a request cannot be queued, a new thread is created unless this would exceed maximumPoolSize, in which case, the task will be rejected.
OK, and now as for the queues' offer methods.
SynchronousQueue:
Inserts the specified element into this queue, if another thread is waiting to receive it.
LinkedBlockingQueue (the single-argument offer, which execute uses, does not wait for space):
Inserts the specified element at the tail of this queue if it is possible to do so immediately without exceeding the queue's capacity, returning true upon success and false if this queue is full.
The return value of the offer method determines whether a new task will be queued or run in a new thread.
The LinkedBlockingQueue enqueues the new task because it has enough capacity, so the task is queued and no new thread is spawned. A SynchronousQueue, however, will not enqueue the task, because no other thread is waiting to receive it (offer returns false since the task is not enqueued), and that is why a new executor thread is spawned.
If you read the Javadocs for ThreadPoolExecutor, LinkedBlockingQueue and SynchronousQueue, and check the implementation of the execute method, you will reach the same conclusion.
So you were wrong, there is an explanation in the documentation :)
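To see the difference concretely, here is a small self-contained sketch (my own illustration, not from the original answer) that occupies the single core thread, submits a second task, and prints the resulting pool and queue sizes for each queue type:
import java.util.Arrays;
import java.util.concurrent.*;

public class QueueChoiceDemo {
    public static void main(String[] args) throws InterruptedException {
        for (BlockingQueue<Runnable> queue : Arrays.<BlockingQueue<Runnable>>asList(
                new LinkedBlockingQueue<>(), new SynchronousQueue<>())) {
            // corePoolSize = 1, maximumPoolSize = 4, same shape as in the question
            ThreadPoolExecutor pool = new ThreadPoolExecutor(1, 4, 0L, TimeUnit.MILLISECONDS, queue);
            pool.execute(() -> sleep(1000));   // occupies the single core thread
            pool.execute(() -> sleep(1000));   // LinkedBlockingQueue: queued; SynchronousQueue: offer fails, new thread
            Thread.sleep(200);
            System.out.println(queue.getClass().getSimpleName()
                    + " -> poolSize=" + pool.getPoolSize()
                    + ", queued=" + pool.getQueue().size());
            pool.shutdown();
        }
    }

    static void sleep(long ms) {
        try { Thread.sleep(ms); } catch (InterruptedException ignored) { }
    }
}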
Here is my task. I have a static queue of jobs in a class and a static method that adds jobs to the queue. I need n threads that poll from the queue and perform the pulled job, and the n threads should poll simultaneously at an interval, i.e. all 3 should poll every 5 seconds and look for jobs.
I have this:
public class Handler {

    private static final Queue<Job> queue = new LinkedList<>();

    public static void initialize(int maxThreads) { // maxThreads == 3
        ScheduledExecutorService executorService = Executors.newScheduledThreadPool(maxThreads);
        executorService.scheduleWithFixedDelay(new Runnable() {
            @Override
            public void run() {
                Job job = null;
                synchronized (queue) {
                    if (queue.size() > 0) {
                        job = queue.poll();
                    }
                }
                if (job != null) {
                    Log.log("start job");
                    doJob(job);
                    Log.log("end job");
                }
            }
        }, 15, 5, TimeUnit.SECONDS);
    }
}
I get this output when I add 4 tasks:
startjob
endjob
startjob
endjob
startjob
endjob
startjob
endjob
It is obvious that these threads perform the jobs serially, whereas I need them to be done 3 at a time. What am I doing wrong? Thanks!
From the documentation:
If any execution of this task takes longer than its period, then subsequent executions may start late, but will not concurrently execute.
So you must schedule three independent tasks to have them run concurrently. Also note that the scheduled executor service is a fixed thread pool, which is not flexible enough for many use cases. A good idiom is to use the scheduled service just to submit tasks to a regular executor service, which may be configured as a resizable thread pool.
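A rough sketch of that idiom (an illustration only: the ScheduledHandler name, the doJob stub, and the pool sizes are mine; Job is the question's type): a single scheduling thread drains the queue every 5 seconds and hands the jobs to a worker pool that runs up to 3 of them concurrently.
import java.util.*;
import java.util.concurrent.*;

public class ScheduledHandler {

    private static final Queue<Job> queue = new LinkedList<>();

    // one thread does the scheduling; the actual work runs on a separate pool
    private static final ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
    private static final ExecutorService workers = Executors.newFixedThreadPool(3);

    public static void initialize() {
        scheduler.scheduleWithFixedDelay(() -> {
            // drain whatever is queued right now and hand each job to the worker pool
            List<Job> batch = new ArrayList<>();
            synchronized (queue) {
                while (!queue.isEmpty()) {
                    batch.add(queue.poll());
                }
            }
            for (Job job : batch) {
                workers.submit(() -> doJob(job));   // up to 3 jobs run concurrently
            }
        }, 15, 5, TimeUnit.SECONDS);
    }

    private static void doJob(Job job) {
        // perform the job
    }
}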
You are running the ScheduledExecutorService with a fixed delay, which means that your jobs will run one after another. Use a fixed thread pool and submit 3 tasks at a time. Here is an explanation with examples
If you declare that Job implements Runnable, then your code simplifies dramatically:
First declare the Executor somewhere globally accessible:
public static final ExecutorService executor = Executors.newFixedThreadPool(MAX_THREADS);
Then add a job like this:
executor.submit(new Job());
You are done.
I am looking for a way to execute batches of tasks in Java. The idea is to have an ExecutorService based on a thread pool that will allow me to spread a set of Callable tasks among different threads from a main thread. This class should provide a waitForCompletion method that puts the main thread to sleep until all tasks are executed. Then the main thread should be awakened, perform some operations and resubmit a set of tasks.
This process will be repeated numerous times, so I would like to avoid using ExecutorService.shutdown, as that would require creating multiple instances of ExecutorService.
Currently I have implemented it in the following way, using an AtomicInteger and a Lock/Condition:
public class BatchThreadPoolExecutor extends ThreadPoolExecutor {

    private final AtomicInteger mActiveCount;
    private final Lock mLock;
    private final Condition mCondition;

    public <C extends Callable<V>, V> Map<C, Future<V>> submitBatch(Collection<C> batch) {
        ...
        for (C task : batch) {
            submit(task);
            mActiveCount.incrementAndGet();
        }
    }

    @Override
    protected void afterExecute(Runnable r, Throwable t) {
        super.afterExecute(r, t);
        mLock.lock();
        if (mActiveCount.decrementAndGet() == 0) {
            mCondition.signalAll();
        }
        mLock.unlock();
    }

    public void awaitBatchCompletion() throws InterruptedException {
        ...
        // Lock and wait until there is no active task
        mLock.lock();
        while (mActiveCount.get() > 0) {
            try {
                mCondition.await();
            } catch (InterruptedException e) {
                mLock.unlock();
                throw e;
            }
        }
        mLock.unlock();
    }
}
Please note that I will not necessarily submit all the tasks of the batch at once; therefore a CountDownLatch does not seem to be an option.
Is this a valid way to do it? Is there a more efficient/elegant way to implement that?
Thanks
I think the ExecutorService itself will be able to meet your requirements.
Call invokeAll(...) with your tasks; it blocks until every task has completed and returns the corresponding Futures, so once you can iterate through all the Futures, all tasks are finished.
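A minimal sketch of that approach (the String results and the pool size are placeholders):
import java.util.List;
import java.util.concurrent.*;

public class InvokeAllBatch {
    public static void main(String[] args) throws Exception {
        ExecutorService executor = Executors.newFixedThreadPool(4);

        List<Callable<String>> batch = List.of(
                () -> "result-1",
                () -> "result-2");

        // invokeAll blocks until every task in the batch has finished (or failed)
        List<Future<String>> futures = executor.invokeAll(batch);
        for (Future<String> f : futures) {
            System.out.println(f.get());   // all tasks are done, so get() does not block
        }

        // the executor can be reused for the next batch; shut it down when fully finished
        executor.shutdown();
    }
}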
As the other answers point out, there doesn't seem to be any part of your use case that requires a custom ExecutorService.
It seems to me that all you need to do is submit a batch, wait for them all to finish while ignoring interrupts on the main thread, then submit another batch perhaps based on the results of the first batch. I believe this is just a matter of:
ExecutorService service = ...;
Collection<Future<?>> futures = new HashSet<>();
for (Callable<?> callable : tasks) {
    Future<?> future = service.submit(callable);
    futures.add(future);
}
for (Future<?> future : futures) {
    try {
        future.get();
    } catch (InterruptedException | ExecutionException e) {
        // Figure out if the interruption or failure means we should stop.
    }
}
// Use the results of futures to figure out a new batch of tasks.
// Repeat the process with the same ExecutorService.
I agree with #ckuetbach that the default Java Executors should provide you with all of the functionality you need to execute a "batch" of jobs.
If I were you I would just submit a bunch of jobs, wait for them to finish with the ExecutorService.awaitTermination() and then just start up a new ExecutorService. Doing this to save on "thread creations" is premature optimization unless you are doing this 100s of times a second or something.
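A sketch of that per-batch approach, with a hypothetical createBatch() supplier standing in for however the batches are actually produced:
import java.util.List;
import java.util.concurrent.*;

public class NewPoolPerBatch {
    static final int NUM_THREADS = 4;

    public static void main(String[] args) throws InterruptedException {
        for (int batchNo = 0; batchNo < 3; batchNo++) {
            ExecutorService executor = Executors.newFixedThreadPool(NUM_THREADS);
            for (Runnable job : createBatch(batchNo)) {       // hypothetical batch supplier
                executor.submit(job);
            }
            executor.shutdown();                              // no more jobs for this batch
            executor.awaitTermination(1, TimeUnit.HOURS);     // wait for the whole batch to finish
            // inspect results here and decide what the next batch looks like
        }
    }

    static List<Runnable> createBatch(int batchNo) {
        return List.of(() -> System.out.println("job in batch " + batchNo));
    }
}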
If you really are stuck on using the same ExecutorService for each of the batches then you can allocate a ThreadPoolExecutor yourself, and be in a loop looking at ThreadPoolExecutor.getActiveCount(). Something like:
BlockingQueue<Runnable> jobQueue = new LinkedBlockingQueue<Runnable>();
ThreadPoolExecutor executor = new ThreadPoolExecutor(NUM_THREADS, NUM_THREADS,
        0L, TimeUnit.MILLISECONDS, jobQueue);
// submit your batch of jobs ...
// need to wait a bit for the jobs to start
Thread.sleep(100);
// keep waiting while there are still active tasks or queued jobs
while (executor.getActiveCount() > 0 || jobQueue.size() > 0) {
    // to slow the spin
    Thread.sleep(1000);
}
// continue on to submit the next batch
I have a queue of tasks in Java. This queue is in a table in the DB.
I need to:
Have only 1 thread per task
Have no more than N threads running at the same time. This is because the threads have DB interaction and I don't want to have a bunch of DB connections open.
I think I could do something like:
final Semaphore semaphore = new Semaphore(N);

while (isOnJob) {
    List<JobTask> tasks = getJobTasks();
    if (!tasks.isEmpty()) {
        final CountDownLatch cdl = new CountDownLatch(tasks.size());
        for (final JobTask task : tasks) {
            Thread tr = new Thread(new Runnable() {
                @Override
                public void run() {
                    try {
                        semaphore.acquire();
                        try {
                            task.doWork();
                        } finally {
                            semaphore.release();
                        }
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    } finally {
                        cdl.countDown();
                    }
                }
            });
            tr.start();
        }
        cdl.await();
    }
}
I know that the ExecutorService class exists, but I'm not sure if I can use it for this.
So, do you think that this is the best way to do this? Or could you clarify how ExecutorService works so I can use it to solve this?
final solution:
I think the best solution is something like:
while (isOnJob) {
    ExecutorService executor = Executors.newFixedThreadPool(N);
    List<JobTask> tasks = getJobTasks();
    if (!tasks.isEmpty()) {
        for (final JobTask task : tasks) {
            executor.submit(new Runnable() {
                @Override
                public void run() {
                    task.doWork();
                }
            });
        }
    }
    executor.shutdown();
    executor.awaitTermination(Long.MAX_VALUE, TimeUnit.HOURS);
}
Thanks a lot for the answers. BTW, I am using a connection pool, but the queries to the DB are very heavy and I don't want to have an uncontrolled number of tasks running at the same time.
You can indeed use an ExecutorService. For instance, create a new fixed thread pool using the newFixedThreadPool method. This way, besides caching threads, you also guarantee that no more than n threads are running at the same time.
Something along these lines:
private static final ExecutorService executor = Executors.newFixedThreadPool(N);

// ...

while (isOnJob) {
    List<JobTask> tasks = getJobTasks();
    if (!tasks.isEmpty()) {
        List<Future<?>> futures = new ArrayList<Future<?>>();
        for (final JobTask task : tasks) {
            Future<?> future = executor.submit(new Runnable() {
                @Override
                public void run() {
                    task.doWork();
                }
            });
            futures.add(future);
        }
        // you no longer need to use await
        for (Future<?> fut : futures) {
            try {
                fut.get();
            } catch (InterruptedException | ExecutionException e) {
                // decide how to handle a failed or interrupted task
            }
        }
    }
}
Note that you no longer need to use the latch, as get will wait for the computation to complete, if necessary.
I agree with JG that ExecutorService is the way to go... but I think you're both making it more complicated than it needs to be.
Rather than creating a large number of threads (1 per task) why not just create a fixed sized thread pool (with Executors.newFixedThreadPool(N)) and submit all the tasks to it? No need for a semaphore or anything like that - just submit the jobs to the thread pool as you get them, and the thread pool will handle them with up to N threads at a time.
If you aren't going to use more than N threads at a time, why would you want to create them?
Use a ThreadPoolExecutor instance with an unbounded queue and a fixed maximum number of threads, e.g. Executors.newFixedThreadPool(N). This will accept a large number of tasks but will only execute N of them concurrently.
If you choose a bounded queue instead (with a capacity of N), the Executor will reject the execution of tasks once the queue is full and all threads are busy (how exactly this happens depends on the policy you can configure when working with ThreadPoolExecutor directly instead of using the Executors factory; see RejectedExecutionHandler).
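For illustration (not part of the original answer), here is a sketch of a bounded queue combined with an explicit rejection policy; CallerRunsPolicy makes the submitting thread execute the task itself when the pool is saturated, which throttles submission instead of throwing:
import java.util.concurrent.*;

public class BoundedQueuePolicyExample {
    public static void main(String[] args) {
        int n = 4;
        ThreadPoolExecutor executor = new ThreadPoolExecutor(
                n, n,                                           // fixed number of threads
                0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<Runnable>(n),            // bounded work queue
                new ThreadPoolExecutor.CallerRunsPolicy());     // instead of the default AbortPolicy

        for (int i = 0; i < 50; i++) {
            final int id = i;
            executor.execute(() ->
                    System.out.println("task " + id + " on " + Thread.currentThread().getName()));
        }
        executor.shutdown();
    }
}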
If you need "real" congestion control, you should set up a bounded BlockingQueue with a capacity of N. Fetch the tasks you want done from the database and put them into the queue; if it is full, the calling thread will block. In another thread (perhaps also started using the Executor API) you take tasks from the BlockingQueue and submit them to the Executor. If the BlockingQueue is empty, the calling thread will also block. To signal that you're done, use a "special" object (e.g. a singleton which marks the last/final item in the queue).
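A rough sketch of that producer/consumer shape, with a hypothetical fetchTasksFromDb() supplier and a poison-pill object marking the final item:
import java.util.List;
import java.util.concurrent.*;

public class BoundedHandoffExample {
    private static final Runnable POISON_PILL = () -> { };            // marks the final item

    public static void main(String[] args) throws InterruptedException {
        int n = 4;
        BlockingQueue<Runnable> handoff = new ArrayBlockingQueue<>(n); // bounded: producer blocks when full
        ExecutorService workers = Executors.newFixedThreadPool(n);

        Thread producer = new Thread(() -> {
            try {
                for (Runnable task : fetchTasksFromDb()) {             // hypothetical DB fetch
                    handoff.put(task);                                 // blocks if the queue is full
                }
                handoff.put(POISON_PILL);                              // signal that we're done
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();

        while (true) {
            Runnable task = handoff.take();                            // blocks if the queue is empty
            if (task == POISON_PILL) {
                break;
            }
            workers.submit(task);
        }
        workers.shutdown();
    }

    private static List<Runnable> fetchTasksFromDb() {
        return List.of(() -> System.out.println("task from DB"));
    }
}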
Achieving good performance also depends on the kind of work that needs to be done in the threads. If your DB is the bottleneck in processing, I would start paying attention to how your threads access the DB. Using a connection pool is probably in order. This might help you to achieve more throughput, since worker threads can re-use DB connections from the pool.