There is a generic queue of tasks to which new tasks get added. I want to write code that creates more work by adding tasks to the queue. The task that added the work will then wait for all of those tasks to complete by polling the queue.
What would be the best way to implement this in Java? I was thinking of simple threads: implement the Runnable interface, run in an infinite loop, sleep in between, and wake up to see if there is any progress. If progress is still happening, keep looping; if the work has completed, break out of the loop. Is there a better, more efficient way to implement this?
How do the tasks complete?
The tasks are submitted to a Queue. The Queue is polled by an executor and it runs the tasks.
What do I want to do?
Poll that queue to see if the task has completed or is still executing.
What you're describing may be a rough sketch of a work queue. You could enqueue tasks for asynchronous processing, wait for a notification of completion, and then terminate. This works, but there are newer concurrency tools available. I recommend reading the Java Concurrency Lesson.
The newer concurrency model lets you separate concurrency concerns from threads by means of tasks (Runnable and Callable) and the ExecutorService. Rather than working directly with threads and building your own thread pool, let the Executor do the heavy lifting for you.
...
ExecutorService executor = Executors.newSingleThreadExecutor();
....
You may hand tasks, in the form of Runnables and Callables, to the ExecutorService and receive in return Future objects which may be used to monitor the task's progress.
Future<String> f = executor.submit(new Foo());
....
class Foo implements Callable<String> {
@Override
public String call() throws Exception {
return "Bar";
}
}
You may use an ExecutorCompletionService to monitor the completion of tasks for you:
CompletionService<String> cs = new ExecutorCompletionService<String>(executor);
Future<String> f = cs.submit(new Foo());
... // Let's say you've added TASK_COUNT tasks
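for (int i = 0; i < TASK_COUNT; i++) {
    try {
        String str = cs.take().get(); // blocks until the next task completes
        if (str != null) {
            System.out.println(str); // Handle the result of the Callable
        }
    } catch (ExecutionException e) {
        // The task itself threw; e.getCause() holds the original exception
    } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // restore the flag and stop waiting
        break;
    }
}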
Now that you've received a result per Callable, you can clean up your tasks using the Future f object you received earlier from cs.submit(new Foo()), by invoking
f.cancel(true)
on each task. And finally, don't forget to clean up your executor with
executor.shutdown();
There is a lot more to concurrency than this, but I believe that the above illustrates a means to meet your needs. I'd recommend reading the JavaDoc as well.
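The fragments above can be put together into something runnable. The following is only a minimal sketch: the class name CompletionServiceSketch, the task(int) helper and the pool size of 4 are assumptions made for illustration, not part of the original answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class CompletionServiceSketch {

    // Hypothetical task: returns a label for the given id.
    static Callable<String> task(int id) {
        return () -> "result-" + id;
    }

    // Submits taskCount tasks and collects their results in completion order.
    static List<String> runAll(int taskCount) throws InterruptedException, ExecutionException {
        ExecutorService executor = Executors.newFixedThreadPool(4);
        try {
            CompletionService<String> cs = new ExecutorCompletionService<>(executor);
            for (int i = 0; i < taskCount; i++) {
                cs.submit(task(i));
            }
            List<String> results = new ArrayList<>();
            for (int i = 0; i < taskCount; i++) {
                // take() blocks until some task has finished; get() returns its result.
                results.add(cs.take().get());
            }
            return results;
        } finally {
            executor.shutdown(); // always release the pool's threads
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runAll(5));
    }
}
```

Because take() hands back futures in the order the tasks finish, the result list is not necessarily in submission order.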
Use java.util.concurrent.Future and a java.util.concurrent.CompletionService.
You can use the Fork/Join framework, available since Java 7.
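As a rough illustration of how Fork/Join fits the "a task creates more tasks and waits for them" pattern, here is a small sketch. SumTask, its threshold and the summing workload are made up for the example; only ForkJoinPool and RecursiveTask come from the JDK.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

class ForkJoinSketch {

    // Hypothetical task: sums the integers in [from, to), splitting itself into subtasks.
    static class SumTask extends RecursiveTask<Long> {
        private static final int THRESHOLD = 1_000;
        private final int from;
        private final int to;

        SumTask(int from, int to) {
            this.from = from;
            this.to = to;
        }

        @Override
        protected Long compute() {
            if (to - from <= THRESHOLD) {
                long sum = 0;
                for (int i = from; i < to; i++) {
                    sum += i;
                }
                return sum;
            }
            int mid = from + (to - from) / 2;
            SumTask left = new SumTask(from, mid);
            SumTask right = new SumTask(mid, to);
            left.fork();                     // queue more work from inside a running task
            long rightSum = right.compute(); // do the other half on this thread
            return left.join() + rightSum;   // wait for the forked half to finish
        }
    }

    static long sum(int from, int to) {
        ForkJoinPool pool = new ForkJoinPool();
        try {
            // invoke() returns only after the root task and all of its subtasks are done.
            return pool.invoke(new SumTask(from, to));
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(sum(0, 100_000));
    }
}
```

No polling is needed here: join() and invoke() do the waiting.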
Related
I am trying to execute multiple threads in scala and for a simple test I run this code:
Executors.newFixedThreadPool(20).execute( new Runnable {
override def run(): Unit = {
println("Thread Started!")
}
})
As far as I could understand, it would create 20 threads and call the
print function, but this is not what's happening. It creates only one
thread, executes the print and hangs.
Can someone explain this phenomenon to me?
The reason it hangs is that you don't shut down the ExecutorService. In Java (sorry, not familiar with Scala):
ExecutorService executor = Executors.newFixedThreadPool(20); // or 1.
executor.execute(() -> System.out.println("..."));
executor.shutdown();
As to why you only see the message once: you create 20 threads, and give just one of them work. Threads won't do anything if you don't give them anything to do.
I think you assumed that this code would execute the runnable on each thread in the pool. That's simply not the case.
If you want to actually do this 20 times in different threads, you need to a) submit 20 runnables; b) synchronise the runnables in order that they actually need to run on separate threads:
CountDownLatch latch = new CountDownLatch(1);
ExecutorService executor = Executors.newFixedThreadPool(20);
for (int i = 0; i < 20; ++i) {
executor.execute(() -> {
latch.await(); // exception handling omitted for clarity.
System.out.println("...");
});
}
latch.countDown();
executor.shutdown();
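The latch here holds every task back until all of them have been submitted, so they all proceed together. Without it, the trivial work of one task could easily be finished before the next one is even submitted, so the tasks would not overlap. (A ThreadPoolExecutor starts a new thread for each submission until the core pool size is reached, even if other threads are idle, so each of these 20 tasks gets its own thread either way.)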
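For reference, a version of the latch example with the exception handling filled in might look like the sketch below. The class name StartGateSketch and the idea of recording thread names in a set are assumptions added here so that the effect can be observed.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class StartGateSketch {

    // Submits one task per pool thread and reports how many distinct threads ran them.
    static int distinctThreadsUsed(int poolSize) throws InterruptedException {
        CountDownLatch startGate = new CountDownLatch(1);
        Set<String> threadNames = ConcurrentHashMap.newKeySet();
        ExecutorService executor = Executors.newFixedThreadPool(poolSize);

        for (int i = 0; i < poolSize; i++) {
            executor.execute(() -> {
                try {
                    startGate.await(); // every task parks here until the gate opens
                    threadNames.add(Thread.currentThread().getName());
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // preserve the interrupt status
                }
            });
        }

        startGate.countDown(); // open the gate: all waiting tasks proceed together
        executor.shutdown();   // without this the pool threads keep the JVM alive
        executor.awaitTermination(10, TimeUnit.SECONDS);
        return threadNames.size();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("threads used: " + distinctThreadsUsed(20));
    }
}
```

Since each task blocks on the gate until all of them have been submitted, every task occupies its own pool thread.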
I have created ExecutorService like:
private static final java.util.concurrent.ExecutorService EXECUTOR_SERVICE = new java.util.concurrent.ThreadPoolExecutor(
10, // core thread pool size
5, // maximum thread pool size
1, // time to wait before resizing pool
java.util.concurrent.TimeUnit.MINUTES,
new java.util.concurrent.ArrayBlockingQueue<Runnable>(MAX_THREADS, true),
new java.util.concurrent.ThreadPoolExecutor.CallerRunsPolicy());
and added tasks to it with the code below:
EXECUTOR_SERVICE.submit(thread);
Now I want to know when all of the tasks in EXECUTOR_SERVICE have finished, so that I can run some dependent tasks.
Kindly suggest a way to achieve this.
After calling executor.shutdown(), you could use:
try {
executor.awaitTermination(1, TimeUnit.SECONDS);
} catch (InterruptedException e) {
// Report the interruptedException
}
Use CountDownLatch. I have used this in the past with great success.
A synchronization aid that allows one or more threads to wait until a set of operations being performed in other threads completes.
The Javadoc link has a great example.
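A minimal sketch of the CountDownLatch approach, assuming the number of tasks is known before they are submitted. The class name CompletionLatchSketch, the pool size and the counter that stands in for real work are illustrative only.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

class CompletionLatchSketch {

    // Runs taskCount tasks and returns once every one of them has counted down.
    static int runAndWait(int taskCount) throws InterruptedException {
        ExecutorService executor = Executors.newFixedThreadPool(4);
        CountDownLatch done = new CountDownLatch(taskCount);
        AtomicInteger completed = new AtomicInteger();

        for (int i = 0; i < taskCount; i++) {
            executor.submit(() -> {
                try {
                    completed.incrementAndGet(); // stand-in for the real work
                } finally {
                    done.countDown();            // count down even if the work throws
                }
            });
        }

        done.await();        // blocks until the count reaches zero
        // Dependent work can start here; the executor is still usable at this point.
        executor.shutdown();
        return completed.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("completed: " + runAndWait(10));
    }
}
```

Unlike awaitTermination, this does not require shutting the executor down before waiting.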
As per the Javadoc, the signature of the submit method is <T> Future<T> submit(Callable<T> task), and it is described as follows:
Submits a value-returning task for execution and returns a Future representing the pending results of the task. The Future's get method will return the task's result upon successful completion.
If you would like to immediately block waiting for a task, you can use constructions of the form result = exec.submit(aCallable).get();
Note: The Executors class includes a set of methods that can convert some other common closure-like objects, for example, PrivilegedAction to Callable form so they can be submitted.
It returns a Future representing pending completion of the task.
Without modifying your submitted tasks, you are left to either query the internal state of ThreadPoolExecutor, subclass ThreadPoolExecutor to track task completion according to your requirements, or collect all of the returned Futures from task submission and wait on them each in turn until they are all done.
In no particular order:
Option 1: Query the state of ThreadPoolExecutor:
You can use ThreadPoolExecutor.getActiveCount() if you keep your reference typed to ThreadPoolExecutor instead of ExecutorService.
From ThreadPoolExecutor source:
/**
* Returns the approximate number of threads that are actively executing tasks.
* Returns:
* the number of threads
**/
public int getActiveCount() {
final ReentrantLock mainLock = this.mainLock;
mainLock.lock();
try {
int n = 0;
for (Worker w : workers)
if (w.isLocked())
++n;
return n;
} finally {
mainLock.unlock();
}
}
The word "approximate" in the JavaDoc should concern you, however: under concurrent execution the value is not guaranteed to be accurate. Looking at the code, though, it does take a lock, and assuming it is not queried in another thread before all of your tasks have been added, it appears to be sufficient to test for task completeness.
A drawback here is that you are left to monitor the value continuously in a check / sleep loop.
Option 2: Subclass ThreadPoolExecutor:
Another solution (or perhaps a complementary solution) is to subclass ThreadPoolExecutor and override the afterExecute method in order to keep track of completed executions and take appropriate action. You could design your subclass such that it will call a callback once X tasks have been completed, or the number of remaining tasks drops to 0 (some concurrency concerns there since this could trigger before all tasks have been added) etc.
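A sketch of what such a subclass could look like, under the assumption that the number of tasks is known up front. CountingExecutor, awaitExpectedTasks() and demo() are hypothetical names invented for this example; afterExecute is the real ThreadPoolExecutor hook.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical executor that lets a caller wait until a known number of tasks has run.
class CountingExecutor extends ThreadPoolExecutor {

    private final CountDownLatch remaining;

    CountingExecutor(int threads, int expectedTasks) {
        super(threads, threads, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<Runnable>());
        this.remaining = new CountDownLatch(expectedTasks);
    }

    @Override
    protected void afterExecute(Runnable r, Throwable t) {
        super.afterExecute(r, t);
        remaining.countDown(); // invoked by the worker thread after each task finishes
    }

    void awaitExpectedTasks() throws InterruptedException {
        remaining.await();
    }

    // Small demonstration: run 8 trivial tasks and wait for all of them.
    static int demo() throws InterruptedException {
        CountingExecutor executor = new CountingExecutor(4, 8);
        AtomicInteger ran = new AtomicInteger();
        for (int i = 0; i < 8; i++) {
            executor.execute(() -> ran.incrementAndGet());
        }
        executor.awaitExpectedTasks();
        executor.shutdown();
        return ran.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("tasks run: " + demo());
    }
}
```

Fixing the expected count in the constructor sidesteps the "fires before all tasks were added" concern mentioned above, at the cost of having to know the count in advance.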
Option 3: Collect task Futures (probably the best option):
Each submission to the ExecutorService returns a Future which can be collected in a list. A loop could then run through and wait on each future in turn until all tasks are complete.
E.g.
List<Future<?>> futures = new ArrayList<Future<?>>();
futures.add(executorService.submit(myTask1));
futures.add(executorService.submit(myTask2));
for (Future<?> future : futures) {
    // TODO time limit, exception handling, etc etc.
    future.get(); // blocks until this task has completed
}
When using the ExecutorService returned by Executors.newSingleThreadExecutor(), how do I interrupt it?
In order to do this, you need to submit() a task to an ExecutorService, rather than calling execute(). When you do this, a Future is returned that can be used to manipulate the scheduled task. In particular, you can call cancel(true) on the associated Future to interrupt a task that is currently executing (or skip execution altogether if the task hasn't started running yet).
By the way, the object returned by Executors.newSingleThreadExecutor() is actually an ExecutorService.
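To make the cancel(true) behaviour concrete, here is a small sketch. The class name CancelSketch, the two latches and the 60-second sleep standing in for long-running work are assumptions for illustration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

class CancelSketch {

    // Returns {cancel() result, whether the task observed the interrupt}.
    static boolean[] cancelRunningTask() throws InterruptedException {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch interrupted = new CountDownLatch(1);

        Future<?> future = executor.submit(() -> {
            started.countDown();
            try {
                Thread.sleep(60_000);        // stand-in for long-running, interruptible work
            } catch (InterruptedException e) {
                interrupted.countDown();     // the task noticed the interrupt and exits
            }
        });

        started.await();                     // make sure the task is actually running
        boolean cancelled = future.cancel(true); // true = interrupt the running thread
        boolean sawInterrupt = interrupted.await(5, TimeUnit.SECONDS);

        executor.shutdown();                 // the executor itself was still usable until now
        return new boolean[] { cancelled, sawInterrupt };
    }

    public static void main(String[] args) throws InterruptedException {
        boolean[] result = cancelRunningTask();
        System.out.println("cancelled=" + result[0] + ", interrupted=" + result[1]);
    }
}
```

Only the one task is interrupted; the executor can keep accepting other tasks afterwards.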
Another way to interrupt the executor's internally managed thread(s) is to call the shutdownNow(..) method on your ExecutorService. Note, however, that as opposed to @erickson's solution, this will result in the whole ThreadPoolExecutor becoming unfit for further use.
I find this approach particularly useful in cases where the ExecutorService is no longer needed and keeping tabs on the Future instances is otherwise unnecessary (a prime example of this being the exit(..) method of your application).
Relevant information from the ExecutorService#shutdownNow(..) javadocs:
Attempts to stop all actively executing tasks, halts the processing of
waiting tasks, and returns a list of the tasks that were awaiting
execution.
There are no guarantees beyond best-effort attempts to stop processing
actively executing tasks. For example, typical implementations will
cancel via Thread.interrupt, so any task that fails to respond to
interrupts may never terminate.
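A short sketch of shutdownNow() in action, assuming a single-threaded executor with one running and two queued tasks. ShutdownNowSketch and the sleeping task are illustrative stand-ins.

```java
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class ShutdownNowSketch {

    // Returns the number of tasks that never started, or -1 if the pool did not terminate.
    static int stopEverything() throws InterruptedException {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        CountDownLatch started = new CountDownLatch(1);

        Runnable sleeper = () -> {
            started.countDown();
            try {
                Thread.sleep(60_000); // stand-in for long-running, interruptible work
            } catch (InterruptedException e) {
                // Responding to the interrupt by returning lets the pool terminate.
            }
        };

        executor.execute(sleeper); // runs on the single worker thread
        executor.execute(sleeper); // waits in the queue
        executor.execute(sleeper); // waits in the queue
        started.await();           // the first task is now running

        // Interrupts the running task and hands back the tasks that were still queued.
        List<Runnable> neverStarted = executor.shutdownNow();
        boolean terminated = executor.awaitTermination(5, TimeUnit.SECONDS);
        return terminated ? neverStarted.size() : -1;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("tasks that never started: " + stopEverything());
    }
}
```

A task that ignores interrupts would keep running here, which is the "best-effort" caveat from the Javadoc quoted above.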
One approach is to inject a custom ThreadFactory into the ExecutorService. Inside the thread factory you have a handle on each thread it creates, so you can schedule a task that interrupts that thread later.
Demo code for the overridden newThread method in the ThreadFactory:
ThreadFactory customThreadfactory = new ThreadFactory() {
public Thread newThread(Runnable runnable) {
final Thread thread = new Thread(runnable);
if (namePrefix != null) {
thread.setName(namePrefix + "-" + count.getAndIncrement());
}
if (daemon != null) {
thread.setDaemon(daemon);
}
if (priority != null) {
thread.setPriority(priority);
}
scheduledExecutorService.schedule(new Callable<String>() {
public String call() throws Exception {
System.out.println("Executed!");
thread.interrupt();
return "Called!";
}
}, 5, TimeUnit.SECONDS);
return thread;
}
};
Then you can use the code below to construct your ExecutorService instance:
ExecutorService executorService = Executors.newFixedThreadPool(3,
customThreadfactory);
Then, 5 seconds after each thread is created, an interrupt signal will be sent to that thread in the ExecutorService.
I'm preparing an application where a single producer generates several million tasks, which will then be processed by a configurable number of consumers. Communication from producer to consumer is (probably) going to be queue-based.
From the thread that runs the producer/generates the tasks, what method can I use to wait for completion of all tasks? I'd rather not resort to any periodic polling to see if my task queue is empty. In any case, the task queue being empty isn't actually a guarantee that the last tasks have completed. Those tasks can be relatively long-running, so it's quite possible that the queue is empty while the consumer threads are still happily processing.
Rgds, Maarten
You might want to have a look at the java.util.concurrent package.
ExecutorService
Executors
Future
The executor framework already provides the means to execute tasks via a thread pool. The Future abstraction allows you to wait for the completion of tasks.
Putting both together allows you to coordinate the executions easily, decoupling tasks, activities (threads) and results.
Example:
ExecutorService executorService = Executors.newFixedThreadPool(16);
List<Callable<Void>> tasks = new ArrayList<>();
//TODO: fill tasks;
//dispatch
List<Future<Void>> results = executorService.invokeAll(tasks);
//Wait until all tasks have completed
for(Future<Void> result: results){
result.get();
}
Edit: Alternative Version using CountDownLatch
ExecutorService executorService = Executors.newFixedThreadPool(16);
final CountDownLatch latch;
List<Callable<Void>> tasks = new ArrayList<>();
//TODO: fill tasks;
latch = new CountDownLatch(tasks.size());
//dispatch
executorService.invokeAll(tasks);
//Wait until all tasks have completed
latch.await();
And inside your tasks:
Callable<Void> task = new Callable<Void>()
{
@Override
public Void call() throws Exception
{
// TODO: do your stuff
latch.countDown(); //<---- important part
return null;
}
};
You want to know when every task completes. I would have another queue of completed-task reports (one object/message per task). When the number of reports reaches the number of tasks you created, they have all completed. Each report can also carry any errors and timing information for the task.
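One way the report-queue idea could look in code is sketched below. The Report class, its fields and the simulated failure on every fifth task are all invented for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;

class CompletionReportSketch {

    // Hypothetical report: one is produced for every finished task.
    static final class Report {
        final int taskId;
        final long elapsedNanos;
        final Throwable error; // null when the task succeeded

        Report(int taskId, long elapsedNanos, Throwable error) {
            this.taskId = taskId;
            this.elapsedNanos = elapsedNanos;
            this.error = error;
        }
    }

    static List<Report> runAndCollect(int taskCount) throws InterruptedException {
        ExecutorService consumers = Executors.newFixedThreadPool(4);
        BlockingQueue<Report> reports = new LinkedBlockingQueue<>();

        for (int i = 0; i < taskCount; i++) {
            final int id = i;
            consumers.execute(() -> {
                long start = System.nanoTime();
                Throwable error = null;
                try {
                    if (id % 5 == 4) { // simulate an occasional failing task
                        throw new IllegalStateException("simulated failure in task " + id);
                    }
                } catch (Throwable t) {
                    error = t;
                } finally {
                    // Exactly one report per task, whether it succeeded or failed.
                    reports.add(new Report(id, System.nanoTime() - start, error));
                }
            });
        }

        // The producer blocks on take() instead of polling; it is done after taskCount reports.
        List<Report> collected = new ArrayList<>();
        while (collected.size() < taskCount) {
            collected.add(reports.take());
        }
        consumers.shutdown();
        return collected;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("reports received: " + runAndCollect(10).size());
    }
}
```

This is close to what ExecutorCompletionService does internally with a queue of finished Futures, so that class may save you from writing it by hand.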
You could have each consumer check to see if the queue is empty when they dequeue, and, if it is, pulse a condvar (or a Monitor, since I believe that's what Java has) on which the main thread is waiting.
Having the threads check a global boolean variable (marked as volatile) is a way to let the threads know that they should stop.
You can call the join() method on each thread, so that your main thread does not continue until all of the threads are done. This way you can also find out whether all the threads have finished or not.
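The last two suggestions (a volatile stop flag and join()) can be combined in a few lines. This is only a sketch: StopFlagSketch, the busy loop and the 50 ms delay are assumptions for illustration.

```java
import java.util.concurrent.atomic.AtomicLong;

class StopFlagSketch {

    // volatile makes a write by one thread visible to reads by the others.
    private static volatile boolean stopRequested = false;

    // Returns the number of loop iterations, or -1 if the worker failed to stop in time.
    static long runWorkerUntilStopped() throws InterruptedException {
        stopRequested = false;
        AtomicLong iterations = new AtomicLong();

        Thread worker = new Thread(() -> {
            while (!stopRequested) {          // re-reads the flag on every pass
                iterations.incrementAndGet(); // stand-in for one unit of work
            }
        });
        worker.start();

        Thread.sleep(50);      // let the worker run for a moment
        stopRequested = true;  // ask the worker to stop
        worker.join(5_000);    // wait (up to 5 s) for the worker to finish
        return worker.isAlive() ? -1 : iterations.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("iterations before stop: " + runWorkerUntilStopped());
    }
}
```

join() only works when you manage the Thread objects yourself; with an ExecutorService you would wait on Futures or a latch instead.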
I can't use shutdown() and awaitTermination() because it is possible new tasks will be added to the ThreadPoolExecutor while it is waiting.
So I'm looking for a way to wait until the ThreadPoolExecutor has emptied its queue and finished all of its tasks without stopping new tasks from being added before that point.
If it makes any difference, this is for Android.
Thanks
Update: Many weeks later after revisiting this, I discovered that a modified CountDownLatch worked better for me in this case. I'll keep the answer marked because it applies more to what I asked.
If you are interested in knowing when a certain task completes, or a certain batch of tasks, you may use ExecutorService.submit(Runnable). Invoking this method returns a Future object which may be placed into a Collection which your main thread will then iterate over calling Future.get() for each one. This will cause your main thread to halt execution until the ExecutorService has processed all of the Runnable tasks.
Collection<Future<?>> futures = new LinkedList<Future<?>>();
futures.add(executorService.submit(myRunnable));
for (Future<?> future:futures) {
future.get();
}
My scenario is a web crawler that fetches some information from a web site and then processes it. A ThreadPoolExecutor is used to speed up the process, because many pages can be loaded at the same time. New tasks are created from within existing tasks, because the crawler follows the hyperlinks in each page. The problem is the same: the main thread does not know when all the tasks are completed so that it can start to process the results. I use a simple way to determine this. It is not very elegant, but it works in my case:
while (executor.getTaskCount()!=executor.getCompletedTaskCount()){
System.err.println("count="+executor.getTaskCount()+","+executor.getCompletedTaskCount());
Thread.sleep(5000);
}
executor.shutdown();
executor.awaitTermination(60, TimeUnit.SECONDS);
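If the sleep loop bothers you, one alternative for tasks that spawn further tasks is to count pending tasks yourself and signal a latch when the count drops to zero. This is a different technique from the polling shown above, and the sketch makes several assumptions: CrawlSketch is a hypothetical class, "depth" stands in for following hyperlinks, and there is a single root task.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

class CrawlSketch {

    private final ExecutorService executor = Executors.newFixedThreadPool(4);
    private final AtomicInteger pending = new AtomicInteger(); // submitted but not yet finished
    private final AtomicInteger visited = new AtomicInteger();
    private final CountDownLatch allDone = new CountDownLatch(1);

    // Hypothetical "page": each one links to two more pages until depth reaches 0.
    private void submit(int depth) {
        pending.incrementAndGet();       // count the task before it is queued
        executor.execute(() -> {
            try {
                visited.incrementAndGet();
                if (depth > 0) {
                    submit(depth - 1);   // children are counted before the parent finishes,
                    submit(depth - 1);   // so pending cannot reach zero too early
                }
            } finally {
                if (pending.decrementAndGet() == 0) {
                    allDone.countDown(); // the last task to finish wakes the waiter
                }
            }
        });
    }

    // Crawls a binary "link tree" of the given depth and returns the number of pages visited.
    static int crawl(int depth) throws InterruptedException {
        CrawlSketch crawler = new CrawlSketch();
        crawler.submit(depth);           // single root task
        crawler.allDone.await();         // no polling: blocks until every task has finished
        crawler.executor.shutdown();
        return crawler.visited.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("pages visited: " + crawl(3));
    }
}
```

With several root tasks submitted from the main thread, the counter could touch zero between two submissions, so in that case you would want to increment it for all roots before submitting any of them.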
Maybe you are looking for a CompletionService to manage batches of task, see also this answer.
(This is an attempt to reproduce Thilo's earlier, deleted answer with my own adjustments.)
I think you may need to clarify your question since there is an implicit infinite condition... at some point you have to decide to shut down your executor, and at that point it won't accept any more tasks. Your question seems to imply that you want to wait until you know that no further tasks will be submitted, which you can only know in your own application code.
The following answer will allow you to smoothly transition to a new TPE (for whatever reason), completing all the currently-submitted tasks, and not rejecting new tasks to the new TPE. It might answer your question. @Thilo's might also.
Assuming you have defined somewhere a visible TPE in use as such:
AtomicReference<ThreadPoolExecutor> publiclyAvailableTPE = ...;
You can then write the TPE swap routine as such. It could also be written using a synchronized method, but I think this is simpler:
void rotateTPE() throws InterruptedException
{
    ThreadPoolExecutor newTPE = createNewTPE();
    // atomic swap with publicly-visible TPE
    ThreadPoolExecutor oldTPE = publiclyAvailableTPE.getAndSet(newTPE);
    oldTPE.shutdown();
    // and if you want this method to block awaiting completion of old tasks in
    // the previously visible TPE (awaitTermination requires a timeout)
    oldTPE.awaitTermination(Long.MAX_VALUE, TimeUnit.NANOSECONDS);
}
Alternatively, if you really do want to kill the thread pool, then your submitter side will need to cope with rejected tasks at some point, and you could use null for the new TPE:
void killTPE() throws InterruptedException
{
    ThreadPoolExecutor oldTPE = publiclyAvailableTPE.getAndSet(null);
    oldTPE.shutdown();
    // and if you want this method to block awaiting completion of old tasks in
    // the previously visible TPE (awaitTermination requires a timeout)
    oldTPE.awaitTermination(Long.MAX_VALUE, TimeUnit.NANOSECONDS);
}
This could cause upstream problems, since the caller would need to know what to do with a null.
You could also swap out with a dummy TPE that simply rejected every new execution, but that's equivalent to what happens if you call shutdown() on the TPE.
If you don't want to use shutdown(), use one of the approaches below:
Iterate through all the Futures returned by submit() on the ExecutorService and check their status with the blocking get() call on each Future, as suggested by Tim Bender
Use invokeAll on the ExecutorService
Use a CountDownLatch
Use ForkJoinPool, or newWorkStealingPool from Executors (since Java 8)
invokeAll() on the ExecutorService achieves the same purpose as a CountDownLatch.
Related SE question:
How to wait for a number of threads to complete?
You could call the waitTillDone() on Runner class:
Runner runner = Runner.runner(10);
runner.runIn(2, SECONDS, runnable);
runner.run(runnable); // each of this runnables could submit more tasks
runner.waitTillDone(); // blocks until all tasks are finished (or failed)
// and now reuse it
runner.runIn(500, MILLISECONDS, callable);
runner.waitTillDone();
runner.shutdown();
To use it add this gradle/maven dependency to your project: 'com.github.matejtymes:javafixes:1.0'
For more details look here: https://github.com/MatejTymes/JavaFixes or here: http://matejtymes.blogspot.com/2016/04/executor-that-notifies-you-when-task.html
Try using the queue size and the active task count, as shown below:
while (executor.getThreadPoolExecutor().getActiveCount() != 0 || !executor.getThreadPoolExecutor().getQueue().isEmpty()){
try {
    Thread.sleep(500);
} catch (InterruptedException e) {
    Thread.currentThread().interrupt(); // restore the interrupt flag and stop polling
    break;
}
}