Use case: tasks are generated in one thread, need to be distributed for computation to many threads and finally the generating task shall reap the results and mark the tasks as done.
I found the class ExecutorCompletionService which fits the use case nearly perfectly --- except that I see no good solution for non-idle waiting. Let me explain.
In principle my code would look like
while (true) {
MyTask t = generateNextTask();
if (t!=null) {
completionService.submit(t);
}
MyTask finished;
while (null!=(finished=compService.poll())) {
retireTaks(finished);
}
}
Both, generateNextTask() and completionService.poll() may return null if there are currently no new tasks available and if currently no task has returned from the CompletionService respectively.
In these cases, the loop degenerates into an ugly idle-wait. I could poll() with a timeout or add a Thread.sleep() for the double-null case, but I consider this a bad workaround, because it nevertheless wastes CPU and is not as responsive as possible, due to the wait.
Suppose I replace generateNextTask() by a poll() on a BlockingQueue, is there good way to poll the queue as well as the CompletionService in parallel to be woken up for work on whichever end something becomes available?
Actually this reminds me of Selector. Is something like it available for queues?
You should use CompletionService.take() to wait until the next task completes and retrieve its Future. poll() is the non-blocking version, returning null if no task is currently completed.
Also, your code seems to be inefficient, because you produce and consume tasks one at a time, instead of allowing multiple tasks to be processed in parallel. Consider having a different thread for task generation and for task results consumption.
-- Edit --
I think that given the constraints you mention in your comments, you can't achieve all your requirements.
Requiring the main thread to be producer and consumer, and disallowing any busy loop or timed loop, you can't avoid the scenario where a blocking wait for a task completion takes too long and no other task gets processed in the meanwhile.
Since you "can replace generateNextTask() by a poll() on a BlockingQueue", I assume incoming tasks can be put in a queue by some other thread, and the problem is, you cannot execute take() on 2 queues simultaneously. The solution is to simply put both incoming and finished tasks in the same queue. To differentiate, wrap them in objects of different types, and then check that type in the loop after take().
This solution works, but we can go further. You said you don't want to use 2 threads for handling tasks - then you can use zero threads. Let wrappers implement Runnable and, instead of checking of the type, you just call take().run(). This way your thread become a single-threaded Executor. But we already have an Executor (CompletionService), can we use it? The problem is, handling of incoming and finished tasks should be done serially, not in parallel. So we need SerialExecutor described in api/java/util/concurrent/Executor, which accepts Runnables and executes them serially, but on another executor. This way no thread is wasted.
And finally, you mentioned Selector as possible solution. I must say, it is an outdated approach. Learn dataflow and actor computing. Nice introduction is here. Look at Dataflow4java project of mine, it has MultiPortActorTest.java example, where class Accum does what you need, with all the boilerplate with wrapper Runnables and serial executors hidden in the supporting library.
What you need is a ListenableFuture from Guava. ListenableFutureExplained
Related
Could please somebody tell me a real life example where it's convenient to use this factory method rather than others?
newSingleThreadExecutor
public static ExecutorService newSingleThreadExecutor()
Creates an Executor that uses a single worker thread operating off an
unbounded queue. (Note however that if this single thread terminates
due to a failure during execution prior to shutdown, a new one will
take its place if needed to execute subsequent tasks.) Tasks are
guaranteed to execute sequentially, and no more than one task will be
active at any given time. Unlike the otherwise equivalent
newFixedThreadPool(1) the returned executor is guaranteed not to be
reconfigurable to use additional threads.
Thanks in advance.
Could please somebody tell me a real life example where it's convenient to use [the newSingleThreadExecutor() factory method] rather than others?
I assume you are asking about when you use a single-threaded thread-pool as opposed to a fixed or cached thread pool.
I use a single threaded executor when I have many tasks to run but I only want one thread to do it. This is the same as using a fixed thread pool of 1 of course. Often this is because we don't need them to run in parallel, they are background tasks, and we don't want to take too many system resources (CPU, memory, IO). I want to deal with the various tasks as Callable or Runnable objects so an ExecutorService is optimal but all I need is a single thread to run them.
For example, I have a number of timer tasks that I spring inject. I have two kinds of tasks and my "short-run" tasks run in a single thread pool. There is only one thread that executes them all even though there are a couple of hundred in my system. They do routine tasks such as checking for disk space, cleaning up logs, dumping statistics, etc.. For the tasks that are time critical, I run in a cached thread pool.
Another example is that we have a series of partner integration tasks. They don't take very long and they run rather infrequently and we don't want them to compete with other system threads so they run in a single threaded executor.
A third example is that we have a finite state machine where each of the state mutators takes the job from one state to another and is registered as a Runnable in a single thread-pool. Even though we have hundreds of mutators, only one task is valid at any one point in time so it makes no sense to allocate more than one thread for the task.
Apart from the reasons already mentioned, you would want to use a single threaded executor when you want ordering guarantees, i.e you need to make sure that whatever tasks are being submitted will always happen in the order they were submitted.
The difference between Executors.newSingleThreadExecutor() and Executors.newFixedThreadPool(1) is small but can be helpful when designing a library API. If you expose the returned ExecutorService to users of your library and the library works correctly only when the executor uses a single thread (tasks are not thread safe), it is preferable to use Executors.newSingleThreadExecutor(). Otherwise the user of your library could break it by doing this:
ExecutorService e = myLibrary.getBackgroundTaskExecutor();
((ThreadPoolExecutor)e).setCorePoolSize(10);
, which is not possible for Executors.newSingleThreadExecutor().
It is helpful when you need a lightweight service which only makes it convenient to defer task execution, and you want to ensure only one thread is used for the job.
I wanna use the jersey-client for creating asynchronous rest-requests, the function delivers me Futures, so i can, in my understanding, invoke get, and if the request is finished it will return something.
So i am thinking, i could store the Futures in a map and look into them from time to time by one thread. Or maybe i should create a new thread everytime someone sending an asynchronous request. There is also a requirement that it shouldn't last forever (a timeout).
What do you think?
I often use a List<Future<Void>> to store the futures. As get() blocks, I just cycle through them rather than poll them.
There is also a requirement that it should last forever (a timeout).
I assume you mean its shouldn't last forever. This requires support in the library you are using to make the requests. If they can be interrupted you can cancel(true) the future either in your waiting thread or another ScheduledExecutorService. If they can't be interrupts you may have to stop() the thread but only as a last resort.
The javadoc says:
A Future represents the result of an asynchronous computation. Methods
are provided to check if the computation is complete, to wait for its
completion, and to retrieve the result of the computation. The result
can only be retrieved using method get when the computation has
completed, blocking if necessary until it is ready.
Therefore it is up to you to choose which strategy to adopt: it mostly depends on what you want to do with those requests.
You could place those Futures in any iterable structure before going through them. Block on each get may be a strategy if you can handle each result pretty fast and do need to check while waiting if other futures are already returned.
I am creating a http proxy server in java. I have a class named Handler which is responsible for processing the requests and responses coming and going from web browser and to web server respectively. I have also another class named Copy which copies the inputStream object to outputStream object . Both these classes implement Runnable interface. I would like to use the concept of Thread pooling in my design, however i don't know how to go about that! Any hint or idea would be highly appreciated.
I suggest you look at Executor and ExecutorService. They add a lot of good stuff to make it easier to use Thread pools.
...
#Azad provided some good information and links. You should also buy and read the book Java Concurrency in Practice. (often abbreviated as JCiP) Note to stackoverflow big-wigs - how about some revenue link to Amazon???
Below is my brief summary of how to use and take advantage of ExecutorService with thread pools. Let's say you want 8 threads in the pool.
You can create one using the full featured constructors of ThreadPoolExecutor, e.g.
ExecutorService service = new ThreadPoolExecutor(8,8, more args here...);
or you can use the simpler but less customizable Executors factories, e.g.
ExecutorService service = Executors.newFixedThreadPool(8);
One advantage you immediately get is the ability to shutdown() or shutdownNow() the thread pool, and to check this status via isShutdown() or isTerminated().
If you don't care much about the Runnable you wish to run, or they are very well written, self-contained, never fail or log any errors appropriately, etc... you can call
execute(Runnable r);
If you do care about either the result (say, it calculates pi or downloads an image from a webpage) and/or you care if there was an Exception, you should use one of the submit methods that returns a Future. That allows you, at some time in the future, check if the task isDone() and to retrieve the result via get(). If there was an Exception, get() will throw it (wrapped in an ExecutionException). Note - even of your Future doesn't "return" anything (it is of type Void) it may still be good practice to call get() (ignoring the void result) to test for an Exception.
However, this checking the Future is a bit of chicken and egg problem. The whole point of a thread pool is to submit tasks without blocking. But Future.get() blocks, and Future.isDone() begs the questions of which thread is calling it, and what it does if it isn't done - do you sleep() and block?
If you are submitting a known chunk of related of tasks simultaneously, e.g., you are performing some big mathematical calculation like a matrix multiply that can be done in parallel, and there is no particular advantage to obtaining partial results, you can call invokeAll(). The calling thread will then block until all the tasks are complete, when you can call Future.get() on all the Futures.
What if the tasks are more disjointed, or you really want to use the partial results? Use ExecutorCompletionService, which wraps an ExecutorService. As tasks get completed, they are added to a queue. This makes it easy for a single thread to poll and remove events from the queue. JCiP has a great example of an web page app that downloads all the images in parallel, and renders them as soon as they become available for responsiveness.
I hope below will help you:,
class Executor
An object that executes submitted Runnable tasks. This interface provides a way of decoupling task submission from the mechanics of how each task will be run, including details of thread use, scheduling, etc. An Executor is normally used instead of explicitly creating threads. For example, rather than invoking new Thread(new(RunnableTask())).start() for each of a set of tasks, you might use:
Executor executor = anExecutor;
executor.execute(new RunnableTask1());
executor.execute(new RunnableTask2());
...
class ScheduledThreadPoolExecutor
A ThreadPoolExecutor that can additionally schedule commands to run after a given delay, or to execute periodically. This class is preferable to Timer when multiple worker threads are needed, or when the additional flexibility or capabilities of ThreadPoolExecutor (which this class extends) are required.
Delayed tasks execute no sooner than they are enabled, but without any real-time guarantees about when, after they are enabled, they will commence. Tasks scheduled for exactly the same execution time are enabled in first-in-first-out (FIFO) order of submission.
and
Interface ExecutorService
An Executor that provides methods to manage termination and methods that can produce a Future for tracking progress of one or more asynchronous tasks.
An ExecutorService can be shut down, which will cause it to stop accepting new tasks. After being shut down, the executor will eventually terminate, at which point no tasks are actively executing, no tasks are awaiting execution, and no new tasks can be submitted.
Edited:
you can find example to use Executor and ExecutorService herehereand here Question will be useful for you.
I am trying to Tune a thread which does the following:
A thread pool with just 1 thread [CorePoolSize =0, maxPoolSize = 1]
The Queue used is a ArrayBlockingQueue
Quesize = 20
BackGround:
The thread tries to read a request and perform an operation on it.
HOWEVER, eventually the requests have increased so much that the thread is always busy and consume 1 CPU which makes it a resource hog.
What I want to do it , instead sample the requests at intervals and process them . Other requests can be safely ignored.
What I would have to do is put a sleep in "operation" function so that for each task the thread sleeps for sometime and releases the CPU.
Quesiton:
However , I was wondering if there is a way to use a queue which basically itself sleeps for sometime before it reads the next element. This would be ideal since sleeping a task in the middle of execution and keeping the execution incomplete just doesn't sound the best to me.
Please let me know if you have any other suggestions as well for the tasks
Thanks.
Edit:
I have added a follow-up question here
corrected the maxpool size to be 1 [written in a haste] .. thanks tim for pointing it out.
No, you can't make the thread sleep while it's in the pool. If there's a task in the queue, it will be executed.
Pausing within a queued task is the only way to force the thread to be idle in spite of queued tasks. Now, the "sleep" doesn't have to be in the same task as the "work"—you could queue a separate rest task after each real task, which might make for a cleaner implementation. More importantly, if the work is a Callable that returns a result, separating into two tasks will allow you to obtain the result as soon as possible.
As a refinement, rather than sleeping for a fixed interval between every task, you could "throttle" execution to a specified rate. This would allow you to avoid waiting unnecessarily between tasks, yet avoid executing too many tasks within a specified time interval. You can read another answer of mine for a simple way to implement this with a DelayQueue.
You could subclass ThreadPool and override beforeExecute to sleep for some time:
#Overrides
protected void beforeExecute(Thread t,
Runnable r){
try{
Thread.sleep( millis); // will sleep the correct thread, see JavaDoc
}
catch (InterruptedException e){}
}
But see AngerClown's comment about artificially slowing down the queue probably not being a good idea.
This might not work for you, but you could try setting the executor's thread priority to low.
Essentially, create the ThreadPoolExecutor with a custom ThreadFactory. Have the ThreadFactory.newThread() method return Threads with a priority of Thread.MIN_PRIORITY. This will cause the executor service you use to only be scheduled if there is an available core to run it.
The implication: On a system that strictly uses time slicing, you will only be given a time slice to execute if there is no other Thread in the entire program with a greater priority asking to be scheduled. Depending on how busy your application really is, you might get scheduled every once in awhile, or you might not be scheduled at all.
The reason the thread is consuming 100% CPU is because it is given more work than it can process. Adding a delay between tasks is not going to fix this problem. It is just make things worse.
Instead you should look at WHY your tasks are consuming so much CPU e.g. with a profiler and change them so that consume less CPU until you find that your thread can keep up and it no longer consumes 100% cpu.
I have a series of concurrent tasks to run. If any one of them fails, I want to interrupt them all and await termination. But assuming none of them fail, I want to wait for all of them to finish.
ExecutorCompletionService seems like almost what I want here, but there doesn't appear to be a way to tell if all of my tasks are done, except by keeping a separate count of the number of tasks. (Note that both of the examples of in the Javadoc for ExecutorCompletionService keep track of the count "n" of the tasks, and use that to determine if the service is finished.)
Am I overlooking something, or do I really have to write this code myself?
Yes, you do need to keep track if you're using an ExecutorCompletionService. Typically, you would call get() on the futures to see if an error occurred. Without iterating over the tasks, how else could you tell that one failed?
If your series of tasks is of a known size, then you should use the second example in the javadoc.
However, if you don't know the number of tasks which you will submit to the CompletionService, then you have a sort of Producer-Consumer problem. One thread is producing tasks and placing them in the ECS, another would be consuming the task futures via take(). A shared Semaphore could be used, allowing the Producer to call release() and the Consumer to call acquire(). Completion semantics would depend on your application, but a volatile or atomic boolean on the producer to indicate that it is done would suffice.
I suggest a Semaphore over wait/notify with poll() because there is a non-deterministic delay between the time a task is produced and the time that task's future is available for consumption. Therefore the consumer and producer needs to be just slightly smarter.