I'm writing an application in Java which uses ExecutorService for running multiple threads.
I wish to submit multiple tasks (thousands at a time) to the Executor as Callables and, when they are done, retrieve their results. My approach is this: each time I call submit(), I get a Future, which I store in an ArrayList. Later I pass the List to a thread that keeps iterating over it, calling future.get() with a timeout to see whether the task has completed. Is this the right approach, or is it too inefficient?
EDIT --- More info ---
Another problem is that each Callable takes a different amount of processing time. So if I simply take the first element out of the List and call get() on it, it will block while the results of others may become available, and the program will not know. That is why I need to keep iterating with timeouts.
thanks in advance
Is this the right approach, or is it too inefficient?
This is not the correct approach per se. You are needlessly iterating over ArrayList checking for task completion.
The fix is simple: use a CompletionService. You can easily wrap your existing executor in it. From the JavaDocs:
Producers submit tasks for execution. Consumers take completed tasks
and process their results in the order they complete.
In essence, CompletionService provides a way to get the result back simply by calling take(). It is a blocking function and the caller will block until the results are available.
Note that the call to Future.get will block until the answer is available. So you don't need another thread to iterate over the array list.
See here https://blogs.oracle.com/CoreJavaTechTips/entry/get_netbeans_6
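A minimal sketch of the idea, assuming a fixed pool of four threads and a placeholder task that squares its index (the class name `CompletionServiceSketch` and the method `runAll` are made up for illustration). It wraps an existing executor in an ExecutorCompletionService and calls take() once per submitted task, so results arrive in completion order rather than submission order:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class CompletionServiceSketch {

    // Submits taskCount tasks and collects their results in completion order.
    static List<Integer> runAll(int taskCount) throws InterruptedException, ExecutionException {
        ExecutorService executor = Executors.newFixedThreadPool(4); // assumed pool size
        // Wrap the existing executor; no separate polling thread is needed.
        CompletionService<Integer> completionService = new ExecutorCompletionService<>(executor);
        try {
            for (int i = 0; i < taskCount; i++) {
                final int n = i;
                completionService.submit(() -> n * n); // placeholder work
            }
            List<Integer> results = new ArrayList<>(taskCount);
            for (int i = 0; i < taskCount; i++) {
                // take() blocks until some task has finished, whichever it is;
                // get() on the returned Future then returns without waiting.
                results.add(completionService.take().get());
            }
            return results;
        } finally {
            executor.shutdown();
        }
    }
}
```

The caller has to know how many tasks it submitted, because the CompletionService does not report when it is "empty".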
Related
I'm looking for a way to synchronize multiple asynchronous operations. I'd like to use a BlockingQueue with a size equal to the number of my operations, but how can I wait until the queue is full?
I'm looking for something like a reversed BlockingQueue.
I need to gather the results of each thread at the end.
The AsyncHandler is fixed; there is already a thread executor underneath, so I cannot start new threads.
// 3 times
makeAsync(new AsyncHandler() {
    onSuccess() {
        ..
        queue.put(result);
    }
    onFailure() {
        ..
    }
});
// Blocking till the queue is full
List<Results> results = queue.takeAll();
Bonus question: I need a way to end the wait when one of my requests fails.
I've never had need to do this sort of thing, but you might have some luck using a CountDownLatch or CyclicBarrier from your various threads.
What you describe with
//Blocking till the Queue is full
List<Results> results = queue.takeAll();
does not differ semantically from “take as many items as the queue’s capacity”. If you know the capacity you can achieve this by:
// preferably a constant which you also use to construct the bounded queue
int capacity;
…
List<Results> results = new ArrayList<>(capacity);
queue.drainTo(results, capacity);
while(results.size()<capacity)
    queue.drainTo(results, capacity-results.size());
This will loop until it has received as many items as the capacity, which is, as said, the same as waiting for the queue to become full (have a size equal to its capacity) and then taking all items. Note that drainTo() itself does not block: it returns immediately when the queue is empty, so the loop spins until items arrive. The only difference is that the event of the queue becoming full is not guaranteed to happen, e.g. if you intend your async operations to offer items until the queue is full, it does not work this way.
If you don’t know the capacity, you are out of luck. There is not even a guarantee that an arbitrary BlockingQueue is bounded; that is, it might have unlimited capacity.
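Since drainTo() returns immediately on an empty queue, a variant that really blocks can call take() for the first item and only then drain whatever else is ready. The following is a sketch under assumptions: the names `TakeUntilFull`, `takeN` and `demo` are hypothetical, and three plain threads stand in for the asynchronous operations.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

class TakeUntilFull {

    // Blocks until exactly `count` items have been received from the queue.
    static <T> List<T> takeN(BlockingQueue<T> queue, int count) throws InterruptedException {
        List<T> results = new ArrayList<>(count);
        while (results.size() < count) {
            results.add(queue.take()); // blocks while the queue is empty
            // Non-blocking: picks up any further items that are already available.
            queue.drainTo(results, count - results.size());
        }
        return results;
    }

    // Three producer threads stand in for the async callbacks (an assumption).
    static List<String> demo() throws InterruptedException {
        final int capacity = 3;
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(capacity);
        for (int i = 0; i < capacity; i++) {
            final int id = i;
            new Thread(() -> {
                try {
                    queue.put("result-" + id);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }).start();
        }
        return takeN(queue, capacity);
    }
}
```

This waits without spinning, but like the loop above it never returns if fewer than `count` items are ever produced, so a failing request still needs its own signal.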
On the other hand, if the asynchronous operations are able to detect when they have finished, they could simply collect the items in a list locally and put the entire list into a BlockingQueue<List<Results>> as a single item once they are done. Then your code waiting for it needs only a single take to get the entire list.
If you're using Java 8, do the following:
With each call to makeAsync, create a CompletableFuture<Result> instance and make it available to the AsyncHandler, and have the caller keep a reference too, say in a list.
When an async task completes normally, have it call complete(result) on its CompletableFuture instance.
When an async task completes with an error, have it call completeExceptionally(exception) on its CompletableFuture instance.
After initiating all the asynchronous tasks, have the caller call CompletableFuture.allOf(cfArray).join(). Unfortunately this takes an array, not a list, so you have to convert. The join() call will throw an exception if any one of the tasks completed with an error. Otherwise, you can collect the results from the individual CompletableFuture instances by calling their get() methods.
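The steps above can be sketched as follows. The `AsyncHandler` interface and `makeAsync` method here are hypothetical stand-ins for the question's API (their real signatures are not shown in the thread), and the `failingId` parameter exists only to simulate a failed request:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

class AllOfSketch {

    // Hypothetical callback interface standing in for the question's AsyncHandler.
    interface AsyncHandler {
        void onSuccess(String result);
        void onFailure(Exception error);
    }

    // Hypothetical stand-in for makeAsync: runs the "request" on another thread.
    static void makeAsync(int id, boolean fail, AsyncHandler handler) {
        new Thread(() -> {
            if (fail) {
                handler.onFailure(new IllegalStateException("request " + id + " failed"));
            } else {
                handler.onSuccess("result-" + id);
            }
        }).start();
    }

    // Starts `count` requests; pass failingId = -1 for no simulated failure.
    static List<String> gather(int count, int failingId) {
        List<CompletableFuture<String>> futures = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            CompletableFuture<String> cf = new CompletableFuture<>();
            futures.add(cf); // the caller keeps a reference
            makeAsync(i, i == failingId, new AsyncHandler() {
                @Override public void onSuccess(String result) { cf.complete(result); }
                @Override public void onFailure(Exception error) { cf.completeExceptionally(error); }
            });
        }
        // allOf takes an array; join() throws CompletionException if any task failed.
        CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();
        List<String> results = new ArrayList<>();
        for (CompletableFuture<String> cf : futures) {
            results.add(cf.join()); // already complete, so this does not wait
        }
        return results;
    }
}
```

One thing to keep in mind: the future returned by allOf completes only once every input future has completed, so a failure is reported after the remaining tasks have finished, not at the moment it happens.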
If you don't have Java 8, you'll have to roll your own mechanism. Initialize a CountDownLatch to the number of async tasks you're going to fire off. Have each async task store its result (or an exception, or some other means of indicating failure) into a thread-safe data structure and then decrement the latch (countDown()). Have the caller wait for the latch to reach zero and then collect the results and errors. This isn't terribly difficult, but you have to determine a means for storing valid results as well as recording whether an error occurred, and also maintain a count manually.
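A rough sketch of that hand-rolled mechanism, assuming plain threads as stand-ins for the async tasks and a ConcurrentLinkedQueue plus an AtomicReference as the thread-safe storage (one possible choice among several; the class and method names are made up):

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicReference;

class LatchSketch {

    static Queue<String> gather(int taskCount) throws Exception {
        CountDownLatch latch = new CountDownLatch(taskCount);
        Queue<String> results = new ConcurrentLinkedQueue<>();      // thread-safe result store
        AtomicReference<Exception> firstError = new AtomicReference<>();

        for (int i = 0; i < taskCount; i++) {
            final int id = i;
            new Thread(() -> {
                try {
                    results.add("result-" + id);        // placeholder for the real work
                } catch (RuntimeException e) {
                    firstError.compareAndSet(null, e);  // remember only the first failure
                } finally {
                    latch.countDown();                  // always count down, even on failure
                }
            }).start();
        }

        latch.await(); // blocks until every task has counted down
        if (firstError.get() != null) {
            throw firstError.get();
        }
        return results;
    }
}
```

Counting down in a finally block matters: a task that fails without counting down would leave the caller waiting forever.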
If you can modify makeAsync(), then it's as simple as counting down a CountDownLatch each time you put an element in the queue and having the main thread wait on that latch.
If, unfortunately, you cannot modify makeAsync(), then simply wrap the queue, give it a CountDownLatch, and override the add() method to count down this latch. The main method just waits for the latch to reach zero.
Having said the above, your program structure does not seem well organized.
Here's what I need:
A Master task will create a bunch of Worker tasks.
Once each worker finishes the job, it needs to report back to the master.
As soon as the master receives a predefined number of responses, it will save these results. This is needed because inserting the results one by one takes much more time than inserting a bunch of them at once, and waiting for all the results might result in an OutOfMemoryError.
I've looked into each worker calling a method on the master and synchronizing this with wait() and notify() and also using ThreadPoolExecutor and the afterExecute(..) method for getting the result from the workers, but I'm still not sure what is the best way to achieve what I need.
Edit: I should also mention that this is a java app.
Use a BlockingQueue where the master waits (queue.take()) for a worker to place a result (queue.put()).
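As an illustration, here is a sketch in which the master takes results off a shared queue and "saves" them in batches. The pool size, the Integer results, and the in-memory list that stands in for the bulk insert are all assumptions made for the example:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;

class BatchingMaster {

    // Returns the batches that would have been saved, for demonstration only.
    static List<List<Integer>> run(int workerCount, int batchSize) throws InterruptedException {
        BlockingQueue<Integer> queue = new LinkedBlockingQueue<>();
        ExecutorService workers = Executors.newFixedThreadPool(4);
        for (int i = 0; i < workerCount; i++) {
            final int id = i;
            workers.execute(() -> {
                try {
                    queue.put(id); // worker reports its result to the master
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }

        List<List<Integer>> savedBatches = new ArrayList<>();
        List<Integer> batch = new ArrayList<>(batchSize);
        for (int received = 0; received < workerCount; received++) {
            batch.add(queue.take()); // master blocks until a worker has reported
            if (batch.size() == batchSize) {
                savedBatches.add(batch); // stand-in for one bulk insert
                batch = new ArrayList<>(batchSize);
            }
        }
        if (!batch.isEmpty()) {
            savedBatches.add(batch); // flush the final partial batch
        }
        workers.shutdown();
        return savedBatches;
    }
}
```

If memory is the concern, a bounded queue (for example an ArrayBlockingQueue) would additionally make workers wait in put() when the master falls behind.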
Use case: tasks are generated in one thread, need to be distributed for computation to many threads and finally the generating task shall reap the results and mark the tasks as done.
I found the class ExecutorCompletionService which fits the use case nearly perfectly --- except that I see no good solution for non-idle waiting. Let me explain.
In principle my code would look like
while (true) {
    MyTask t = generateNextTask();
    if (t != null) {
        completionService.submit(t);
    }
    MyTask finished;
    while (null != (finished = completionService.poll())) {
        retireTask(finished);
    }
}
Both, generateNextTask() and completionService.poll() may return null if there are currently no new tasks available and if currently no task has returned from the CompletionService respectively.
In these cases, the loop degenerates into an ugly idle-wait. I could poll() with a timeout or add a Thread.sleep() for the double-null case, but I consider this a bad workaround, because it nevertheless wastes CPU and is not as responsive as possible, due to the wait.
Suppose I replace generateNextTask() by a poll() on a BlockingQueue. Is there a good way to poll the queue as well as the CompletionService in parallel, so as to be woken up for work on whichever end something becomes available?
Actually this reminds me of Selector. Is something like it available for queues?
You should use CompletionService.take() to wait until the next task completes and retrieve its Future. poll() is the non-blocking version, returning null if no task is currently completed.
Also, your code seems to be inefficient, because you produce and consume tasks one at a time, instead of allowing multiple tasks to be processed in parallel. Consider having a different thread for task generation and for task results consumption.
-- Edit --
I think that given the constraints you mention in your comments, you can't achieve all your requirements.
Requiring the main thread to be producer and consumer, and disallowing any busy loop or timed loop, you can't avoid the scenario where a blocking wait for a task completion takes too long and no other task gets processed in the meanwhile.
Since you "can replace generateNextTask() by a poll() on a BlockingQueue", I assume incoming tasks can be put in a queue by some other thread, and the problem is, you cannot execute take() on 2 queues simultaneously. The solution is to simply put both incoming and finished tasks in the same queue. To differentiate, wrap them in objects of different types, and then check that type in the loop after take().
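A sketch of the single-queue idea, with hypothetical wrapper types `Incoming` and `Finished` and a trivial "double the number" task standing in for real work. One thread blocks in take() and is woken by whichever kind of event arrives first:

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;

class SingleQueueLoop {

    // Hypothetical wrappers used to tell the two kinds of events apart.
    interface Event {}
    static final class Incoming implements Event {
        final int task;
        Incoming(int task) { this.task = task; }
    }
    static final class Finished implements Event {
        final int result;
        Finished(int result) { this.result = result; }
    }

    // Processes taskCount tasks and returns the sum of their results.
    static int run(int taskCount) throws InterruptedException {
        BlockingQueue<Event> events = new LinkedBlockingQueue<>();
        ExecutorService pool = Executors.newFixedThreadPool(4);

        // Some other thread feeds incoming tasks into the shared queue.
        new Thread(() -> {
            for (int i = 1; i <= taskCount; i++) {
                events.add(new Incoming(i));
            }
        }).start();

        int retired = 0;
        int sum = 0;
        while (retired < taskCount) {
            Event e = events.take(); // single blocking wait for both event kinds
            if (e instanceof Incoming) {
                int task = ((Incoming) e).task;
                // The worker posts its result back onto the same queue.
                pool.execute(() -> events.add(new Finished(task * 2)));
            } else {
                sum += ((Finished) e).result; // "retire" the finished task
                retired++;
            }
        }
        pool.shutdown();
        return sum;
    }
}
```

Here the loop ends after a known number of tasks; an open-ended producer would need some other termination signal, such as a dedicated "stop" event type.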
This solution works, but we can go further. You said you don't want to use 2 threads for handling tasks - then you can use zero threads. Let the wrappers implement Runnable and, instead of checking the type, just call take().run(). This way your thread becomes a single-threaded Executor. But we already have an Executor (CompletionService), can we use it? The problem is that handling of incoming and finished tasks should be done serially, not in parallel. So we need the SerialExecutor described in api/java/util/concurrent/Executor, which accepts Runnables and executes them serially, but on another executor. This way no thread is wasted.
And finally, you mentioned Selector as a possible solution. I must say, it is an outdated approach. Learn dataflow and actor computing. A nice introduction is here. Look at my Dataflow4java project; it has a MultiPortActorTest.java example, where the class Accum does what you need, with all the boilerplate of wrapper Runnables and serial executors hidden in the supporting library.
What you need is a ListenableFuture from Guava. ListenableFutureExplained
I want to use the jersey-client for creating asynchronous REST requests. The function gives me Futures, so, as I understand it, I can invoke get(), and once the request is finished it will return something.
So I am thinking I could store the Futures in a map and have one thread look into them from time to time. Or maybe I should create a new thread every time someone sends an asynchronous request. There is also a requirement that it shouldn't last forever (a timeout).
What do you think?
I often use a List<Future<Void>> to store the futures. As get() blocks, I just cycle through them rather than poll them.
There is also a requirement that it should last forever (a timeout).
I assume you mean it shouldn't last forever. This requires support in the library you are using to make the requests. If they can be interrupted, you can cancel(true) the future either in your waiting thread or from another ScheduledExecutorService. If they can't be interrupted, you may have to stop() the thread, but only as a last resort.
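A small sketch of the cancel-on-timeout pattern from the waiting thread. The sleeping task is a placeholder for a request that responds to interruption, and the 200 ms limit is an arbitrary value chosen for the example:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class TimeoutCancelSketch {

    // Returns true if the task had to be cancelled because it ran too long.
    static boolean cancelAfterTimeout() throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        Future<String> future = executor.submit(() -> {
            Thread.sleep(10_000); // placeholder for a slow, interruptible request
            return "done";
        });
        try {
            future.get(200, TimeUnit.MILLISECONDS); // wait at most 200 ms
            return false;                            // finished in time
        } catch (TimeoutException e) {
            future.cancel(true);                     // interrupts the running task
            return future.isCancelled();
        } finally {
            executor.shutdownNow();
        }
    }
}
```

cancel(true) only delivers an interrupt; whether the request actually stops depends on the underlying code reacting to it, which is the library-support caveat mentioned above.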
The javadoc says:
A Future represents the result of an asynchronous computation. Methods
are provided to check if the computation is complete, to wait for its
completion, and to retrieve the result of the computation. The result
can only be retrieved using method get when the computation has
completed, blocking if necessary until it is ready.
Therefore it is up to you to choose which strategy to adopt: it mostly depends on what you want to do with those requests.
You could place those Futures in any iterable structure before going through them. Blocking on each get() may be a reasonable strategy if you can handle each result quickly and do not need to check, while waiting, whether other futures have already returned.
I have a series of concurrent tasks to run. If any one of them fails, I want to interrupt them all and await termination. But assuming none of them fail, I want to wait for all of them to finish.
ExecutorCompletionService seems like almost what I want here, but there doesn't appear to be a way to tell if all of my tasks are done, except by keeping a separate count of the number of tasks. (Note that both of the examples in the Javadoc for ExecutorCompletionService keep track of the count "n" of the tasks, and use that to determine if the service is finished.)
Am I overlooking something, or do I really have to write this code myself?
Yes, you do need to keep track if you're using an ExecutorCompletionService. Typically, you would call get() on the futures to see if an error occurred. Without iterating over the tasks, how else could you tell that one failed?
If your series of tasks is of a known size, then you should use the second example in the javadoc.
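For the known-size case, a sketch along these lines (written for this answer, not copied from the javadoc; `FailFastSketch` and `runAll` are made-up names) takes exactly as many futures as were submitted and cancels whatever is still outstanding as soon as one task fails:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;

class FailFastSketch {

    // Runs all tasks; if any task throws, the rest are cancelled and the
    // failure is rethrown to the caller as an ExecutionException.
    static <T> List<T> runAll(ExecutorService executor, List<Callable<T>> tasks)
            throws InterruptedException, ExecutionException {
        CompletionService<T> ecs = new ExecutorCompletionService<>(executor);
        List<Future<T>> futures = new ArrayList<>(tasks.size());
        List<T> results = new ArrayList<>(tasks.size());
        try {
            for (Callable<T> task : tasks) {
                futures.add(ecs.submit(task));
            }
            // The task count is known, so take() exactly that many times.
            for (int i = 0; i < tasks.size(); i++) {
                results.add(ecs.take().get()); // get() throws if that task failed
            }
            return results;
        } finally {
            // Interrupts tasks that are still running; no effect on completed ones.
            for (Future<T> f : futures) {
                f.cancel(true);
            }
        }
    }
}
```

Cancelling sends interrupts but does not wait for the tasks to stop; awaiting termination would still require executor.shutdown() followed by awaitTermination() on the caller's side.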
However, if you don't know the number of tasks which you will submit to the CompletionService, then you have a sort of Producer-Consumer problem. One thread is producing tasks and placing them in the ECS, another would be consuming the task futures via take(). A shared Semaphore could be used, allowing the Producer to call release() and the Consumer to call acquire(). Completion semantics would depend on your application, but a volatile or atomic boolean on the producer to indicate that it is done would suffice.
I suggest a Semaphore over wait/notify with poll() because there is a non-deterministic delay between the time a task is produced and the time that task's future is available for consumption. Therefore the consumer and producer need to be just slightly smarter.