I am looking for a way to execute batches of tasks in Java. The idea is to have an ExecutorService based on a thread pool that lets me spread a set of Callables across different threads from a main thread. This class should provide a waitForCompletion method that puts the main thread to sleep until all tasks have executed. The main thread should then be woken up, perform some operations, and resubmit a new set of tasks.
This process will be repeated numerous times, so I would like to avoid ExecutorService.shutdown, as using it would require creating a new ExecutorService instance for each batch.
Currently I have implemented it in the following way, using an AtomicInteger and a Lock/Condition:
public class BatchThreadPoolExecutor extends ThreadPoolExecutor {
private final AtomicInteger mActiveCount;
private final Lock mLock;
private final Condition mCondition;
public <C extends Callable<V>, V> Map<C, Future<V>> submitBatch(Collection<C> batch){
...
for(C task : batch){
submit(task);
mActiveCount.incrementAndGet();
}
}
@Override
protected void afterExecute(Runnable r, Throwable t) {
super.afterExecute(r, t);
mLock.lock();
if (mActiveCount.decrementAndGet() == 0) {
mCondition.signalAll();
}
mLock.unlock();
}
public void awaitBatchCompletion() throws InterruptedException {
...
// Lock and wait until there is no active task
mLock.lock();
while (mActiveCount.get() > 0) {
try {
mCondition.await();
} catch (InterruptedException e) {
mLock.unlock();
throw e;
}
}
mLock.unlock();
}
}
Please note that I will not necessarily submit all the tasks of a batch at once, therefore CountDownLatch does not seem to be an option.
Is this a valid way to do it? Is there a more efficient/elegant way to implement that?
Thanks
I think the ExecutorService itself can already do what you need.
Call invokeAll([...]) with your tasks and iterate over the returned Futures. invokeAll blocks until every task has completed, so by the time you can iterate over the Futures, all tasks are finished.
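The invokeAll suggestion above can be sketched as follows. This is a minimal illustration, not the asker's code: the class name InvokeAllBatches, the pool size of 4, and the squaring tasks are assumptions made up for the example. It runs two batches on the same pool and only shuts the pool down after the last one.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical demo class; the name and the squaring tasks are illustrative only.
final class InvokeAllBatches {

    // Runs two batches on one pool and returns the sum of all task results.
    static int runTwoBatches() throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            int total = 0;
            for (int batch = 0; batch < 2; batch++) {
                List<Callable<Integer>> tasks = new ArrayList<>();
                for (int i = 1; i <= 3; i++) {
                    final int n = i;
                    tasks.add(() -> n * n);
                }
                // invokeAll blocks until every task in this batch has completed.
                for (Future<Integer> future : pool.invokeAll(tasks)) {
                    total += future.get(); // already done, so this does not wait
                }
            }
            return total;
        } finally {
            pool.shutdown(); // called once, after the last batch
        }
    }
}
```

Note that invokeAll needs the whole batch up front, so it may not fit if tasks of one batch are submitted at different times.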
As the other answers point out, there doesn't seem to be any part of your use case that requires a custom ExecutorService.
It seems to me that all you need to do is submit a batch, wait for them all to finish while ignoring interrupts on the main thread, then submit another batch perhaps based on the results of the first batch. I believe this is just a matter of:
ExecutorService service = ...;
Collection<Future<?>> futures = new HashSet<Future<?>>();
for (Callable<?> callable : tasks) {
    Future<?> future = service.submit(callable);
    futures.add(future);
}
for (Future<?> future : futures) {
    try {
        future.get();
    } catch (InterruptedException e) {
        // Figure out if the interruption means we should stop.
    } catch (ExecutionException e) {
        // The task itself threw; e.getCause() holds the original exception.
    }
}
// Use the results of futures to figure out a new batch of tasks.
// Repeat the process with the same ExecutorService.
I agree with @ckuetbach that the default Java Executors should provide you with all of the functionality you need to execute a "batch" of jobs.
If I were you I would just submit a bunch of jobs, call shutdown(), wait for them to finish with ExecutorService.awaitTermination(), and then start up a new ExecutorService for the next batch. Reusing a single ExecutorService just to save on "thread creations" is premature optimization unless you are doing this hundreds of times a second or something.
If you really are stuck on using the same ExecutorService for each of the batches then you can allocate a ThreadPoolExecutor yourself, and be in a loop looking at ThreadPoolExecutor.getActiveCount(). Something like:
BlockingQueue<Runnable> jobQueue = new LinkedBlockingQueue<Runnable>();
ThreadPoolExecutor executor = new ThreadPoolExecutor(NUM_THREADS, NUM_THREADS,
    0L, TimeUnit.MILLISECONDS, jobQueue);
// submit your batch of jobs ...
// need to wait a bit for the jobs to start
Thread.sleep(100);
// keep waiting while any job is still running or still queued
// (getActiveCount() is only an approximation)
while (executor.getActiveCount() > 0 || jobQueue.size() > 0) {
    // to slow the spin
    Thread.sleep(1000);
}
// continue on to submit the next batch
I am new to concurrency and I was trying to use an ExecutorService inside a do-while loop, but I always run into a RejectedExecutionException.
Here is my sample code:
do {
Future<Void> future = executor.submit(new Callable<Void>() {
@Override
public Void call() throws Exception {
// action
return null;
}
});
futures.add(future);
executor.shutdown();
for (Future<Void> future : futures) {
try {
future.get();
}
catch (InterruptedException e) {
throw new IOException(e);
}
}
}
while (true);
But this seems incorrect. I think I am calling shutdown in the wrong place. Can anyone please help me use ExecutorService in a do-while loop correctly? Thanks.
ExecutorService.shutdown() stops the ExecutorService from accepting any more jobs. It should be called when you're done submitting jobs.
Also, Future.get() is a blocking method: it blocks the current thread, so the next iteration of the loop will not start until the future on which get() is called has completed. This happens in every iteration, which makes the code effectively sequential.
You can use a CountDownLatch to wait for all the jobs to return.
The following is a corrected version.
final int taskCount = 10; // suppose you'll have 10 tasks
final List<Object> results = Collections.synchronizedList(new ArrayList<Object>());
final CountDownLatch latch = new CountDownLatch(taskCount);
int submitted = 0;
do {
    Future<Void> future = executor.submit(new Callable<Void>() {
        @Override
        public Void call() throws Exception {
            try {
                // action
                results.add(result); // some result
            } finally {
                // count down last, and even on failure, so await() cannot hang
                latch.countDown();
            }
            return null;
        }
    });
    futures.add(future);
    submitted++;
} while (submitted < taskCount); // the loop must end, or shutdown() is never reached
executor.shutdown();
latch.await(); // This will block until latch.countDown() has been called taskCount times.
// Now results has all the outputs, do what you want with them.
Also if you're working with Java 8 then you can take a look at this answer https://stackoverflow.com/a/36261808/5343269
You're right, the shutdown method is not being called at the correct time. The ExecutorService will not accept tasks after shutdown is called (unless you implement your own version that does).
You should call shutdown after you've already submitted all tasks to the executor, so in this case, somewhere after the do-while loop.
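As a rough sketch of that ordering: submit inside the loop, call shutdown() once after it, and only then block on the futures. The class name ShutdownAfterLoop, the pool size, and the trivial tasks that return their own index are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical demo class, not the asker's actual code.
final class ShutdownAfterLoop {

    // Submits taskCount tasks in a do-while loop and returns how many completed.
    static int submitThenShutdown(int taskCount) throws InterruptedException, ExecutionException {
        ExecutorService executor = Executors.newFixedThreadPool(4);
        List<Future<Integer>> futures = new ArrayList<>();
        int i = 0;
        do {
            final int id = i;
            Callable<Integer> task = () -> id; // stand-in for the real action
            futures.add(executor.submit(task)); // submit only; do not block here
            i++;
        } while (i < taskCount);

        // No more submissions from here on; already queued tasks still run.
        executor.shutdown();

        int completed = 0;
        for (Future<Integer> future : futures) {
            future.get(); // blocks until that particular task is done
            completed++;
        }
        return completed;
    }
}
```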
From ThreadPoolExecutor documentation:
Rejected tasks
New tasks submitted in method execute(Runnable) will be rejected when the Executor has been shut down, and also when the Executor uses finite bounds for both maximum threads and work queue capacity, and is saturated.
In either case, the execute method invokes the RejectedExecutionHandler.rejectedExecution(Runnable, ThreadPoolExecutor) method of its RejectedExecutionHandler
From your code, it is evident that you call shutdown() inside the loop, right after the first submit, so every task submitted in a later iteration is rejected.
On a different note, refer to this related SE question for right way of shutting down ExecutorService:
ExecutorService's shutdown() doesn't wait until all threads will be finished
I am trying to figure out a way to handle exceptions in a multi-threaded setting. I would like to execute certain tasks in parallel, each of which might throw an exception that I need to react to (basically, by putting the failed task back into an execution queue). However, it seems the only way to actually get the exception from the thread is to create a Future and call its get() method, which essentially turns the calls into synchronous calls.
Maybe some code will illustrate the point:
ExecutorService executor = Executors.newFixedThreadPool(nThreads);
Task task = taskQueue.poll(); // let's assume that task implements Runnable
try {
executor.execute(task);
}
catch(Exception ex) {
// record the failed task, so that it can be re-added to the queue
}
However, in this case all tasks are launched, but the exceptions don't seem to get caught in this catch block here.
An alternative would be to use a Future instead of a thread and retrieve its result:
try {
Future<?> future = executor.submit(task);
future.get();
}
...
In this case, the exceptions are caught alright in the catch block, but at the price of having to wait until this operation is finished. So, the tasks are executed sequentially and not in parallel, as desired.
What am I missing? How can I catch each task's exceptions and react to them?
You could trigger all your tasks within one loop and check/await/retry in another:
Map<Future<?>, Task> futures = new HashMap<Future<?>, Task>();
while (!taskQueue.isEmpty()) {
    Task task = taskQueue.poll();
    Future<?> future = executor.submit(task);
    futures.put(future, task);
}
for (Map.Entry<Future<?>, Task> entry : futures.entrySet()) {
    try {
        entry.getKey().get();
    }
    catch (ExecutionException ex) {
        // record the failed task, so that it can be re-added to the queue
        // you should add a retry counter because you want to prevent endless loops
        taskQueue.add(entry.getValue());
    }
    catch (InterruptedException ex) {
        // thread interrupted: restore the interrupt flag and exit
        Thread.currentThread().interrupt();
        return;
    }
}
HTH, Mark
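The retry counter mentioned in the comments above could look roughly like this. It is only a sketch under assumptions: the class name RetryingBatch, the method runWithRetries, and the use of Callable<Void> instead of the asker's Task type are all invented for the example. Each round submits every pending task in parallel, waits for all of them, and carries only the failed ones into the next round, up to a fixed number of attempts.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;

// Hypothetical helper; names and the Callable<Void> task type are assumptions.
final class RetryingBatch {

    // Returns the tasks that still failed after maxAttempts rounds.
    static List<Callable<Void>> runWithRetries(ExecutorService executor,
                                               List<Callable<Void>> tasks,
                                               int maxAttempts) throws InterruptedException {
        List<Callable<Void>> pending = new ArrayList<>(tasks);
        for (int attempt = 1; attempt <= maxAttempts && !pending.isEmpty(); attempt++) {
            // Submit everything first so the tasks of one round run in parallel.
            Map<Future<Void>, Callable<Void>> futures = new LinkedHashMap<>();
            for (Callable<Void> task : pending) {
                futures.put(executor.submit(task), task);
            }
            // Then wait for each one and remember which tasks threw.
            List<Callable<Void>> failed = new ArrayList<>();
            for (Map.Entry<Future<Void>, Callable<Void>> entry : futures.entrySet()) {
                try {
                    entry.getKey().get();
                } catch (ExecutionException ex) {
                    failed.add(entry.getValue()); // retried in the next round
                }
            }
            pending = failed;
        }
        return pending;
    }
}
```

The caller keeps ownership of the executor here, so the same pool can be reused for later batches.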
I have one "Runnable" thread which starts a few "Callable" threads, and I want to display the results when all of those threads have finished their jobs.
What is the best way to do it?
My code is as follows
Connector.java (Starting Runnable Thread)
public class Connector {
private static void anyFileConnector() {
// Starting searching Thread
ExecutorService executor = Executors.newFixedThreadPool(100);
executor.submit(traverse, executor);
//HERE I WANT MY ALL SEARCH RESULTS/OUTPUT : CURRENTLY IT IS STARTING OTHER THREADS AND NOT SHOWING ME ANY RESULTS BECAUSE NONE OF THEM WAS FINISHED.(IN CONSOLE, I WAS ABLE TO SEE RESULTS FROM ALL THE THREADS
setSearchResult(traverse.getResult());
executor.shutdown();
}
}
Traverse.java (Runnable Thread)
I am using ExecutorCompletionService to handle it...but it didn't create any difference.
:(
public class Traverse implements Runnable {
public void run() {
ExecutorService executor = Executors.newFixedThreadPool(100);
ExecutorCompletionService<List<ResultBean>> taskCompletionService =
new ExecutorCompletionService<List<ResultBean>>(executor);
try (DirectoryStream<Path> stream = Files
.newDirectoryStream(dir)) {
Search newSearch = new Search();
taskCompletionService.submit(newSearch);
}
list.addAll(taskCompletionService.take().get());
}
}
Search.java (Callable Thread)
public class Search implements Callable<List<ResultBean>> {
public List<ResultBean> call() {
synchronized (Search.class) {
// It will return results
return this.search();
}
}
}
Go for CyclicBarrier and you will be able to achieve this.
A CyclicBarrier can run a barrier action as soon as all the threads have reached the barrier, i.e. are done with their work; this is where you can print the end result.
Check this link for how CyclicBarrier works: http://javarevisited.blogspot.com/2012/07/cyclicbarrier-example-java-5-concurrency-tutorial.html
Easy: submitting each Callable returns a Future object, which you can use to wait for and get the result by calling Future.get(), which blocks until the result is available. So your problem reduces to a for loop that calls get() on the future of each Callable.
After that, just aggregate the results to return to the client.
The submit method of the executor service returns a Future object for each task, so you can collect them in a list. What you can do in your case is call the isDone() method of these Future objects in a while loop.
Whenever a task has completed, this method returns true for its Future. You can then call get() on it to obtain the value returned by that task. This way you can collect all the task results without having to wait for any particular task to complete (since your first task could have the longest completion time).
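As an alternative to polling isDone() in a loop, the JDK's ExecutorCompletionService (which the question already uses) hands back futures in completion order. The sketch below swaps that in for the polling loop; the class name CompletionOrder and the two toy tasks are assumptions for illustration. With these sleeps the fast task will typically be returned first, although that depends on timing.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical demo class; the tasks are placeholders for real searches.
final class CompletionOrder {

    // Collects results as tasks finish rather than in submission order.
    static List<String> collectAsCompleted() throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        CompletionService<String> completion = new ExecutorCompletionService<>(pool);
        try {
            completion.submit(() -> { Thread.sleep(300); return "slow"; });
            completion.submit(() -> "fast");

            List<String> results = new ArrayList<>();
            // One take() per submitted task; take() blocks until some task is done.
            for (int i = 0; i < 2; i++) {
                results.add(completion.take().get());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }
}
```

The key point is to call take() once per submitted task; a single take(), as in the question, only waits for the first task to finish.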
I'm trying to find a less clunky solution to a Java concurrency problem.
The gist of the problem is that I need a shutdown call to block while there are still worker threads active, but the crucial aspect is that the worker tasks are each spawned and completed asynchronously so the hold and release must be done by different threads. I need them to somehow send a signal to the shutdown thread once their work has completed. Just to make things more interesting, the worker threads cannot block each other so I'm unsure about the application of a Semaphore in this particular instance.
I have a solution which I think safely does the job, but my unfamiliarity with the Java concurrency utils leads me to think that there might be a much easier or more elegant pattern. Any help in this regard would be greatly appreciated.
Here's what I have so far, fairly sparse except for the comments:
final private ReentrantReadWriteLock shutdownLock = new ReentrantReadWriteLock();
volatile private int activeWorkerThreads;
private boolean isShutdown;
private void workerTask()
{
try
{
// Point A: Worker tasks mustn't block each other.
shutdownLock.readLock().lock();
// Point B: I only want worker tasks to continue if the shutdown signal
// hasn't already been received.
if (isShutdown)
return;
activeWorkerThreads ++;
// Point C: This async method call returns immediately, soon after which
// we release our lock. The shutdown thread may then acquire the write lock
// but we want it to continue blocking until all of the asynchronous tasks
// have completed.
executeAsynchronously(new Runnable()
{
@Override
final public void run()
{
try
{
// Do stuff.
}
finally
{
// Point D: Release of shutdown thread loop, if there are no other
// active worker tasks.
activeWorkerThreads --;
}
}
});
}
finally
{
shutdownLock.readLock().unlock();
}
}
final public void shutdown()
{
try
{
// Point E: Shutdown thread must block while any worker threads
// have breached Point A.
shutdownLock.writeLock().lock();
isShutdown = true;
// Point F: Is there a better way to wait for this signal?
while (activeWorkerThreads > 0)
;
// Do shutdown operation.
}
finally
{
shutdownLock.writeLock().unlock();
}
}
Thanks in advance for any help!
Russ
Declaring activeWorkerThreads as volatile doesn't make activeWorkerThreads++ safe, as ++ is just shorthand for,
activeWorkerThreads = activeWorkerThreads + 1;
which isn't atomic. Use AtomicInteger instead.
Does executeAsynchronously() send jobs to an ExecutorService? If so, you can just use the awaitTermination method, so your shutdown hook will be,
executor.shutdown();
executor.awaitTermination(1, TimeUnit.MINUTES);
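Both points can be seen together in a small sketch: an AtomicInteger counts completed tasks without lost updates, and shutdown() followed by awaitTermination() replaces the busy wait. The class name AtomicCounterShutdown, the pool size, and the one-minute timeout are assumptions for illustration.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; each task only bumps a counter.
final class AtomicCounterShutdown {

    // Runs taskCount trivial tasks and returns how many were counted as completed.
    static int countCompleted(int taskCount) throws InterruptedException {
        ExecutorService executor = Executors.newFixedThreadPool(4);
        AtomicInteger completed = new AtomicInteger();
        for (int i = 0; i < taskCount; i++) {
            // incrementAndGet is an atomic read-modify-write, unlike ++ on a volatile int.
            executor.execute(completed::incrementAndGet);
        }
        executor.shutdown(); // stop accepting work; queued tasks still run
        if (!executor.awaitTermination(1, TimeUnit.MINUTES)) {
            throw new IllegalStateException("tasks did not finish within the timeout");
        }
        return completed.get();
    }
}
```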
You can use a semaphore in this scenario and avoid a busy wait in the shutdown() call. Think of it as a set of tickets that are handed out to workers to indicate that they are in flight. If the shutdown() method can acquire all of the tickets, then it knows that it has drained all workers and there is no activity. Because acquire() is a blocking call, shutdown() won't spin. I've used this approach for a distributed master-worker library, and it's easy to extend it to handle timeouts and retries.
Executor executor = // ...
final int permits = // ...
final Semaphore semaphore = new Semaphore(permits);
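void schedule(final Runnable task) throws InterruptedException {
    semaphore.acquire(); // take a ticket; blocks if all permits are in use
    try {
        executor.execute(new Runnable() {
            @Override public void run() {
                try {
                    task.run();
                } finally {
                    semaphore.release(); // hand the ticket back when the task ends
                }
            }
        });
    } catch (RejectedExecutionException e) {
        semaphore.release(); // task never started, so return its ticket
        throw e;
    }
}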
void shutDown() {
semaphore.acquireUninterruptibly(permits);
// do stuff
}
ExecutorService should be a preferred solution as sbridges mentioned.
As an alternative, if the number of worker threads is fixed, then you can use CountDownLatch:
final CountDownLatch latch = new CountDownLatch(numberOfWorkers);
Pass the latch to every worker thread and call latch.countDown() when task is done.
Call latch.await() from the main thread to wait for all tasks to complete.
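A minimal sketch of the latch approach, assuming a fixed number of workers known up front. The class name LatchDemo and the counter that stands in for real work are illustrative only.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; the "work" is just incrementing a counter.
final class LatchDemo {

    // Starts numberOfWorkers threads and returns once all of them have finished.
    static int runWorkers(int numberOfWorkers) throws InterruptedException {
        CountDownLatch latch = new CountDownLatch(numberOfWorkers);
        AtomicInteger done = new AtomicInteger();
        for (int i = 0; i < numberOfWorkers; i++) {
            new Thread(() -> {
                try {
                    done.incrementAndGet(); // stand-in for the real work
                } finally {
                    latch.countDown(); // always count down, even if the work throws
                }
            }).start();
        }
        latch.await(); // blocks until the count reaches zero
        return done.get();
    }
}
```

A latch cannot be reset, so a new one is needed for every round of workers.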
Whoa nelly. Never do this:
// Point F: Is there a better way to wait for this signal?
while (activeWorkerThreads > 0)
;
You're spinning and consuming CPU. Use a proper notification:
First: synchronize on an object, then check activeWorkerThreads, and wait() on the object if it's still > 0:
synchronized (mutexObject) {
while (activeWorkerThreads > 0) {
mutexObject.wait();
}
}
Second: Have the workers notify() the object after they decrement the activeWorkerThreads count. You must synchronize on the object before calling notify.
synchronized (mutexObject) {
activeWorkerThreads--;
mutexObject.notify();
}
Third: Seeing as you are (after implementing 1 & 2) synchronizing on an object whenever you touch activeWorkerThreads, use it as protection; there is no need for the variable to be volatile.
Then: the same object you use as a mutex for controlling access to activeWorkerThreads could also be used to control access to isShutdown. Example:
synchronized (mutexObject) {
if (isShutdown) {
return;
}
}
This won't cause workers to block each other except for immeasurably small amounts of time (which you likely do not avoid by using a read-write lock anyway).
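Put together, the steps above might look like the following sketch. The class WorkerTracker and its method names are assumptions, not the asker's code, and notifyAll() is used instead of notify() so that the sketch stays correct if more than one thread ever waits.

```java
// Hypothetical helper combining the steps above; all names are illustrative.
final class WorkerTracker {
    private final Object mutex = new Object();
    private int activeWorkers;  // guarded by mutex, so no volatile needed
    private boolean shutdown;   // guarded by mutex

    // Called before starting a worker task; returns false once shutdown has begun.
    boolean tryStart() {
        synchronized (mutex) {
            if (shutdown) {
                return false;
            }
            activeWorkers++;
            return true;
        }
    }

    // Called from the worker's finally block when its task has ended.
    void finish() {
        synchronized (mutex) {
            activeWorkers--;
            mutex.notifyAll(); // wake the shutdown thread so it re-checks the count
        }
    }

    // Blocks without spinning until every started worker has finished.
    void shutdownAndWait() throws InterruptedException {
        synchronized (mutex) {
            shutdown = true;
            while (activeWorkers > 0) {
                mutex.wait(); // releases the mutex while waiting
            }
        }
    }
}
```

Workers hold the monitor only for a counter update, so they do not block each other for any meaningful time.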
This is more like a comment to sbridges answer, but it was a bit too long to submit as a comment.
Anyways, just 1 comment.
When you shut down the executor, submitting a new task to it will result in an unchecked RejectedExecutionException if you use the default implementations (like Executors.newSingleThreadExecutor()). So in your case you probably want to use the following code.
code:
new ThreadPoolExecutor(1,
1,
1,
TimeUnit.HOURS,
new LinkedBlockingQueue<Runnable>(),
new ThreadPoolExecutor.DiscardPolicy());
This way, the tasks that were submitted to the executor after shutdown() was called, are simply ignored. The parameter above (1,1... etc) should produce an executor that basically is a single-thread executor, but doesn't throw the runtime exception.
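A small sketch of that behaviour, using the same constructor arguments as above. The class name DiscardAfterShutdown and the flag used to detect whether the task ran are assumptions for the example.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical demo class showing DiscardPolicy after shutdown().
final class DiscardAfterShutdown {

    // Returns whether a task handed over after shutdown() was executed.
    static boolean submitAfterShutdown() {
        ThreadPoolExecutor executor = new ThreadPoolExecutor(
                1, 1, 1, TimeUnit.HOURS,
                new LinkedBlockingQueue<Runnable>(),
                new ThreadPoolExecutor.DiscardPolicy());
        AtomicBoolean ran = new AtomicBoolean(false);
        executor.shutdown();
        // With DiscardPolicy this is dropped silently instead of throwing
        // RejectedExecutionException.
        executor.execute(() -> ran.set(true));
        return ran.get();
    }
}
```

One caveat: if submit() is used instead of execute(), the Future of a discarded task never completes, so calling get() on it would block forever.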
I have a queue of tasks in Java. This queue is stored in a table in the DB.
I need:
1 thread per task only
No more than N threads running at the same time. This is because the threads interact with the DB and I don't want to have a bunch of DB connections open.
I think I could do something like:
final Semaphore semaphore = new Semaphore(N);
while (isOnJob) {
List<JobTask> tasks = getJobTasks();
if (!tasks.isEmpty()) {
final CountDownLatch cdl = new CountDownLatch(tasks.size());
for (final JobTask task : tasks) {
Thread tr = new Thread(new Runnable() {
@Override
public void run() {
semaphore.acquire();
task.doWork();
semaphore.release();
cdl.countDown();
}
});
}
cdl.await();
}
}
I know that an ExecutorService class exists, but I'm not sure if I can use it for this.
So, do you think that this is the best way to do this? Or could you clarify how the ExecutorService could be used to solve this?
final solution:
I think the best solution is something like:
while (isOnJob) {
ExecutorService executor = Executors.newFixedThreadPool(N);
List<JobTask> tasks = getJobTasks();
if (!tasks.isEmpty()) {
for (final JobTask task : tasks) {
executor.submit(new Runnable() {
@Override
public void run() {
task.doWork();
}
});
}
}
executor.shutdown();
executor.awaitTermination(Long.MAX_VALUE, TimeUnit.HOURS);
}
Thanks a lot for the answers. BTW I am using a connection pool, but the queries to the DB are very heavy and I don't want to have an uncontrolled number of tasks running at the same time.
You can indeed use an ExecutorService. For instance, create a new fixed thread pool using the newFixedThreadPool method. This way, besides caching threads, you also guarantee that no more than n threads are running at the same time.
Something along these lines:
private static final ExecutorService executor = Executors.newFixedThreadPool(N);
// ...
while (isOnJob) {
List<JobTask> tasks = getJobTasks();
if (!tasks.isEmpty()) {
List<Future<?>> futures = new ArrayList<Future<?>>();
for (final JobTask task : tasks) {
Future<?> future = executor.submit(new Runnable() {
@Override
public void run() {
task.doWork();
}
});
futures.add(future);
}
// you no longer need to use await
for (Future<?> fut : futures) {
fut.get();
}
}
}
Note that you no longer need to use the latch, as get will wait for the computation to complete, if necessary.
I agree with JG that ExecutorService is the way to go... but I think you're both making it more complicated than it needs to be.
Rather than creating a large number of threads (1 per task) why not just create a fixed sized thread pool (with Executors.newFixedThreadPool(N)) and submit all the tasks to it? No need for a semaphore or anything like that - just submit the jobs to the thread pool as you get them, and the thread pool will handle them with up to N threads at a time.
If you aren't going to use more than N threads at a time, why would you want to create them?
Use a ThreadPoolExecutor instance with an unbounded queue and a fixed maximum number of threads, e.g. Executors.newFixedThreadPool(N). This will accept a large number of tasks but will only execute N of them concurrently.
If you choose a bounded queue instead (with a capacity of N), the Executor will reject tasks once all threads are busy and the queue is full (how exactly depends on the policy you can configure when working with ThreadPoolExecutor directly, instead of using the Executors factory; see RejectedExecutionHandler).
If you need "real" congestion control you should set up a bounded BlockingQueue with a capacity of N. Fetch the tasks you want done from the database and put them into the queue; if it's full, the calling thread will block. In another thread (perhaps also started using the Executor API) you take tasks from the BlockingQueue and submit them to the Executor. If the BlockingQueue is empty, that thread will also block. To signal that you're done, use a "special" object (e.g. a singleton which marks the last/final item in the queue).
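The bounded hand-off queue with a "special" end marker can be sketched as below. This is only an illustration under assumptions: the class BoundedHandoff, the POISON sentinel, and the use of a plain fixed thread pool as the worker pool are invented for the example, and the in-memory job list stands in for fetching from the database.

```java
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; names and structure are illustrative only.
final class BoundedHandoff {
    // Sentinel ("poison pill") marking the end of the queue; compared by identity.
    private static final Runnable POISON = () -> { };

    // Feeds jobs through a bounded queue to a worker pool; returns how many were handed off.
    static int process(List<Runnable> jobs, int capacity) throws InterruptedException {
        BlockingQueue<Runnable> queue = new ArrayBlockingQueue<>(capacity);
        ExecutorService workers = Executors.newFixedThreadPool(capacity);
        AtomicInteger handedOff = new AtomicInteger();

        // Dispatcher thread: take() blocks while the queue is empty.
        Thread dispatcher = new Thread(() -> {
            try {
                for (Runnable job = queue.take(); job != POISON; job = queue.take()) {
                    workers.execute(job);
                    handedOff.incrementAndGet();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        dispatcher.start();

        for (Runnable job : jobs) {
            queue.put(job); // blocks while the queue is full
        }
        queue.put(POISON); // signal that no more jobs will follow
        dispatcher.join();
        workers.shutdown();
        workers.awaitTermination(1, TimeUnit.MINUTES);
        return handedOff.get();
    }
}
```

Note that newFixedThreadPool has its own unbounded internal queue, so in this sketch the bounded queue only throttles the producer relative to the dispatcher; bounding the pool's queue as well would be needed for end-to-end back-pressure.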
Achieving good performance also depends on the kind of work that needs to be done in the threads. If your DB is the bottleneck in processing, I would start by paying attention to how your threads access the DB. Using a connection pool is probably in order. This might help you achieve more throughput, since worker threads can reuse DB connections from the pool.