Does completableFuture in Java 8 scale to multiple cores? - java

Lets say I have a single thread that calls bunch of methods that return completablefuture and say I add all of them to a list and in the end I do completablefutures.allof(list_size).join(). Now does the futures in the list can scale to multiple cores? other words are the futures scheduled into multiple cores to take advantage of parallelism?

CompletableFuture represents a task which is associated with some Executor. If you did not specify executor explicitly (for example, you used CompletableFuture.supplyAsync(Supplier) instead of CompletableFuture.supplyAsync(Supplier, Executor)), then common ForkJoinPool is used as executor. This pool could be obtained via ForkJoinPool.commonPool() and by default it creates as many threads as many hardware threads your system has (usually number of cores, double it if your cores support hyperthreading). So in general, yes, if you use all defaults, then multiple cores will be used for your completable futures.

CompletableFuture itself is not scheduled to a thread (or core). Tasks are. To achieve parallelism, you need to create multiple tasks. If your methods which return CompletableFuture submit tasks like
return CompletableFuture.supplyAsync(this::calculate);
then multiple tasks are started.
If they just create CompletableFuture like
return new CompletableFuture();
then no tasks are started and no parallelism present.
CompletableFuture objects created by CompletableFuture{handle, thenCombine, thenCompose, thenApply} are not connected to parallel tasks, so parallelism is not increased.
CompletableFuture objects created by CompletableFuture{handleAsync, thenCombineAsync, thenComposeAsync, thenApplyAsync} are connected to parallel tasks, but these tasks are executed strictly after the task corresponding to the this CompletableFuture object, so cannot increase parallelism.

Having a bunch of CompletableFutures doesn't tell you anything about how they will be completed.
There are two kind of completions:
Explicit, through cancel, complete, completeExceptionally, obtrudeException and obtrudeValue on an instance, or by obtainng a future with the completedFuture static method
Implicit, through executing a provided function, whether it returns normally or exceptionally, or through the completion of a previous future
For instance:
exceptionally completes normally without running the provided function if the previous future completes normally
Every other chaining method, except for handle and whenComplete and their *Async variations, complete exceptionally without running the provided function if the previous future, or any of the previous futures in the combining (*Both*, *Combine* and *Either*) methods, complete exceptionally
Otherwise, the future completes when the provided function runs and completes either normally or exceptionally
If the futures you have were created without a function or they're not chained to another future, or in other words, if they don't have a function associated, then they will only complete explicitly, and as such it makes no sense to say if this kind of completable future runs, much less if they may use multiple threads.
On the other hand, if the futures have a function, it depends on how they were created:
If they're all independent and use the ForkJoinPool.commonPool() (or a cached thread pool or similar) as the executor, then they will probably run in parallel, possibly using as many active threads as the number of cores
If they all have a dependency on each other (except for one) or if the executor is single-threaded, then they'll run one at a time
Anything in between is valid, such as:
some futures may depend on each other, or on some other internal future you have no knowledge of
some futures may have been created with e.g. a fixed thread pool executor where you'll see a limited degree of concurrently running tasks
Invoking join does not tell a future to start running, it just waits for it to complete.
So, to finally answer your question:
If the future has a function associated, then it may already be running, it may or may not run its function depending on how it was chained and the completion of the previous future, and it may never run if it doesn't have a function or if it was completed before it had a chance to run its function
The futures that are already running or that will run do so:
On the provided executor when chained with the *Async methods or when created with the *Async static methods that take an executor
On the ForkJoinPool.commonPool() when chained with the *Async methods or when created with the *Async static methods that don't take an executor
On the same thread as where the future they depend on is completed when chained without the *Async methods in case the future is not yet complete
On the current thread if the future they depend on is already completed when chained without the *Async methods
In my opinion, the explicit completion methods should have been segregated to a e.g. CompletionSource interface and have a e.g. CompletableFutureSource class that implements it and provides a future, much like .NET's relation between a TaskCompletionSource and its Task.
As things are now, most probably you can tamper with the completable futures you have by completing them in a way not originally intended. For this reason, you should not use a CompletableFuture after you expose it publicly; it's your API user's CompletableFuture from then on.

Related

Ensure unique tasks in ExectorService

I have a scenario wherein same Tasks get assigned multiple times to an ExecutorService. I want to avoid that, Is there a way to do it?
I have Tasks with a String constructor.
Task task1 = new Task ("A");
than I execute this task
executor.execute (task1);
Then I create another task with same string.
Task task2 = new Task ("A");
Lets say I cannot avoid this from happening.
Now I execute this task.
executor.execute (task2).
I want only one of these tasks to be executed, since both tasks are similar in nature.
How?
At first, I would have implemented a queue interface and passed it to the executor service. My implementation would be using a hashset to hold the memory, next to a regular collection to hold the tasks as the queue requires. Adding to my queue therefore would have involved checking the hashset first. Maybe a linkedhashset rolling off eldest entries...
However, the executorservice submit() doesn't fully rely on the queue. See the javadoc of ThreadPoolExecutor. The queue could refuse the task, but the executor could spawn a thread instead and still accept the task. Besides, that implies you could intervene in the executor's construction anyway.
So, presuming control on the executor's class, it seems you must extend an executorservice and have the knowledge there instead of a in a custom queue. You only need to override submit() and throw the rejection exception.
Of course, this reveals the rejection to the submitter. You should deal with that gracefully. You cannot hide this fact because you cannot return a Future if there was no submission. Unless you wire up the old Future to the new one (your knowledge hashmap contains Futures). This may cause difficulties though since every Future has a 'done' state and is cancel-able... Your own Future would be delegating to the original Futures. I think task rejection is simpler.

Queries on Java Future and RejectionHandler

I had some queries regarding Future usage. Please go through below example before addressing my queries.
http://javarevisited.blogspot.in/2015/01/how-to-use-future-and-futuretask-in-Java.html
The main purpose of using thread pools & Executors is to execute task asynchronously without blocking main thread. But once you use Future, it is blocking calling thread. Do we have to create separate new thread/thread pool to analyse the results of Callable tasks? OR is there any other good solution?
Since Future call is blocking the caller, is it worth to use this feature? If I want to analyse the result of a task, I can have synchronous call and check the result of the call without Future.
What is the best way to handle Rejected tasks with usage of RejectionHandler? If a task is rejected, is it good practice to submit the task to another Thread or ThreadPool Or submit the same task to current ThreadPoolExecutor again?
Please correct me if my thought process is wrong about this feature.
Your question is about performing an action when an asynchronous action has been done. Futures on the other hand are good if you have an unrelated activity which you can perform while the asynchronous action is running. Then you may regularly poll the action represented by the Future via isDone() and do something else if not or call the blocking get() if you have no more unrelated work for your current thread.
If you want to schedule an on-completion action without blocking the current thread, you may instead use CompletableFuture which offers such functionality.
CompletableFuture is the solution for queries 1 and 2 as suggested by #Holger
I want to update about RejectedExecutionHandler mechanism regarding query 3.
Java provides four types of Rejection Handler policies as per javadocs.
In the default ThreadPoolExecutor.AbortPolicy, the handler throws a runtime RejectedExecutionException upon rejection.
In ThreadPoolExecutor.CallerRunsPolicy, the thread that invokes execute itself runs the task. This provides a simple feedback control mechanism that will slow down the rate that new tasks are submitted.
In ThreadPoolExecutor.DiscardPolicy, a task that cannot be executed is simply dropped.
In ThreadPoolExecutor.DiscardOldestPolicy, if the executor is not shut down, the task at the head of the work queue is dropped, and then execution is retried (which can fail again, causing this to be repeated.)
CallerRunsPolicy: If you have more tasks in task queue, using this policy will degrade the performance. You have to be careful since reject tasks will be executed by main thread itself. If Running the rejected task is critical for your application and you have limited task queue, you can use this policy.
DiscardPolicy: If discarding a non-critical event does not bother you, then you can use this policy.
DiscardOldestPolicy: Discard the oldest job and try to resume the last one
If none of them suits your need, you can implement your own RejectionHandler.

Examples of when it is convenient to use Executors.newSingleThreadExecutor()

Could please somebody tell me a real life example where it's convenient to use this factory method rather than others?
newSingleThreadExecutor
public static ExecutorService newSingleThreadExecutor()
Creates an Executor that uses a single worker thread operating off an
unbounded queue. (Note however that if this single thread terminates
due to a failure during execution prior to shutdown, a new one will
take its place if needed to execute subsequent tasks.) Tasks are
guaranteed to execute sequentially, and no more than one task will be
active at any given time. Unlike the otherwise equivalent
newFixedThreadPool(1) the returned executor is guaranteed not to be
reconfigurable to use additional threads.
Thanks in advance.
Could please somebody tell me a real life example where it's convenient to use [the newSingleThreadExecutor() factory method] rather than others?
I assume you are asking about when you use a single-threaded thread-pool as opposed to a fixed or cached thread pool.
I use a single threaded executor when I have many tasks to run but I only want one thread to do it. This is the same as using a fixed thread pool of 1 of course. Often this is because we don't need them to run in parallel, they are background tasks, and we don't want to take too many system resources (CPU, memory, IO). I want to deal with the various tasks as Callable or Runnable objects so an ExecutorService is optimal but all I need is a single thread to run them.
For example, I have a number of timer tasks that I spring inject. I have two kinds of tasks and my "short-run" tasks run in a single thread pool. There is only one thread that executes them all even though there are a couple of hundred in my system. They do routine tasks such as checking for disk space, cleaning up logs, dumping statistics, etc.. For the tasks that are time critical, I run in a cached thread pool.
Another example is that we have a series of partner integration tasks. They don't take very long and they run rather infrequently and we don't want them to compete with other system threads so they run in a single threaded executor.
A third example is that we have a finite state machine where each of the state mutators takes the job from one state to another and is registered as a Runnable in a single thread-pool. Even though we have hundreds of mutators, only one task is valid at any one point in time so it makes no sense to allocate more than one thread for the task.
Apart from the reasons already mentioned, you would want to use a single threaded executor when you want ordering guarantees, i.e you need to make sure that whatever tasks are being submitted will always happen in the order they were submitted.
The difference between Executors.newSingleThreadExecutor() and Executors.newFixedThreadPool(1) is small but can be helpful when designing a library API. If you expose the returned ExecutorService to users of your library and the library works correctly only when the executor uses a single thread (tasks are not thread safe), it is preferable to use Executors.newSingleThreadExecutor(). Otherwise the user of your library could break it by doing this:
ExecutorService e = myLibrary.getBackgroundTaskExecutor();
((ThreadPoolExecutor)e).setCorePoolSize(10);
, which is not possible for Executors.newSingleThreadExecutor().
It is helpful when you need a lightweight service which only makes it convenient to defer task execution, and you want to ensure only one thread is used for the job.

ForkJoinPool seems to waste a thread

I'm comparing two variations on a test program. Both are operating with a 4-thread ForkJoinPool on a machine with four cores.
In 'mode 1', I use the pool very much like an executor service. I toss a pile of tasks into ExecutorService.invokeAll. I get better performance than from an ordinary fixed thread executor service (even though there are calls to Lucene, that do some I/O, in there).
There is no divide-and-conquer here. Literally, I do
ExecutorService es = new ForkJoinPool(4);
es.invokeAll(collection_of_Callables);
In 'mode 2', I submit a single task to the pool, and in that task call ForkJoinTask.invokeAll to submit the subtasks. So, I have an object that inherits from RecursiveAction, and it is submitted to the pool. In the compute method of that class, I call the invokeAll on a collection of objects from a different class that also inherits from RecursiveAction. For testing purposes, I submit only one-at-a-time of the first objects. What I naively expected to see what all four threads busy, as the thread calling invokeAll would grab one of the subtasks for itself instead of just sitting and blocking. I can think of some reasons why it might not work that way.
Watching in VisualVM, in mode 2, one thread is pretty nearly always waiting. What I expect to see is the thread calling invokeAll immediately going to work on one of the invoked tasks rather than just sitting still. This is certainly better than the deadlocks that would result from trying this scheme with an ordinary thread pool, but still, what up? Is it holding one thread back in case something else gets submitted? And, if so, why not the same problem in mode 1?
So far I've been running this using the jsr166 jar added to java 1.6's boot class path.
ForkJoinTask.invokeAll is forking all tasks, but the first in the list. The first task it runs itself. Then it joins other tasks. It's thread is not released in any way to the pool. So you what you see, it it's thread blocking on other tasks to be complete.
The classic use of invokeAll for a Fork Join pool is to fork one task and compute another (in that executing thread). The thread that does not fork will join after it computes. The work stealing comes in with both tasks computing. When each task computes it is expected to fork it's own subtasks (until some threshold is met).
I am not sure what invokeAll is being called for your RecursiveAction.compute() but if it is the invokeAll which takes two RecursiveAction it will fork one, compute the other and wait for the forked task to finish.
This is different then a plain executor service because each task of an ExecutorService is simply a Runnable on a queue. There is no need for two tasks of an ExecutorService to know the outcome of another. That is the primary use case of a FJ Pool.

How to use Thread Pool concept in Java?

I am creating a http proxy server in java. I have a class named Handler which is responsible for processing the requests and responses coming and going from web browser and to web server respectively. I have also another class named Copy which copies the inputStream object to outputStream object . Both these classes implement Runnable interface. I would like to use the concept of Thread pooling in my design, however i don't know how to go about that! Any hint or idea would be highly appreciated.
I suggest you look at Executor and ExecutorService. They add a lot of good stuff to make it easier to use Thread pools.
...
#Azad provided some good information and links. You should also buy and read the book Java Concurrency in Practice. (often abbreviated as JCiP) Note to stackoverflow big-wigs - how about some revenue link to Amazon???
Below is my brief summary of how to use and take advantage of ExecutorService with thread pools. Let's say you want 8 threads in the pool.
You can create one using the full featured constructors of ThreadPoolExecutor, e.g.
ExecutorService service = new ThreadPoolExecutor(8,8, more args here...);
or you can use the simpler but less customizable Executors factories, e.g.
ExecutorService service = Executors.newFixedThreadPool(8);
One advantage you immediately get is the ability to shutdown() or shutdownNow() the thread pool, and to check this status via isShutdown() or isTerminated().
If you don't care much about the Runnable you wish to run, or they are very well written, self-contained, never fail or log any errors appropriately, etc... you can call
execute(Runnable r);
If you do care about either the result (say, it calculates pi or downloads an image from a webpage) and/or you care if there was an Exception, you should use one of the submit methods that returns a Future. That allows you, at some time in the future, check if the task isDone() and to retrieve the result via get(). If there was an Exception, get() will throw it (wrapped in an ExecutionException). Note - even of your Future doesn't "return" anything (it is of type Void) it may still be good practice to call get() (ignoring the void result) to test for an Exception.
However, this checking the Future is a bit of chicken and egg problem. The whole point of a thread pool is to submit tasks without blocking. But Future.get() blocks, and Future.isDone() begs the questions of which thread is calling it, and what it does if it isn't done - do you sleep() and block?
If you are submitting a known chunk of related of tasks simultaneously, e.g., you are performing some big mathematical calculation like a matrix multiply that can be done in parallel, and there is no particular advantage to obtaining partial results, you can call invokeAll(). The calling thread will then block until all the tasks are complete, when you can call Future.get() on all the Futures.
What if the tasks are more disjointed, or you really want to use the partial results? Use ExecutorCompletionService, which wraps an ExecutorService. As tasks get completed, they are added to a queue. This makes it easy for a single thread to poll and remove events from the queue. JCiP has a great example of an web page app that downloads all the images in parallel, and renders them as soon as they become available for responsiveness.
I hope below will help you:,
class Executor
An object that executes submitted Runnable tasks. This interface provides a way of decoupling task submission from the mechanics of how each task will be run, including details of thread use, scheduling, etc. An Executor is normally used instead of explicitly creating threads. For example, rather than invoking new Thread(new(RunnableTask())).start() for each of a set of tasks, you might use:
Executor executor = anExecutor;
executor.execute(new RunnableTask1());
executor.execute(new RunnableTask2());
...
class ScheduledThreadPoolExecutor
A ThreadPoolExecutor that can additionally schedule commands to run after a given delay, or to execute periodically. This class is preferable to Timer when multiple worker threads are needed, or when the additional flexibility or capabilities of ThreadPoolExecutor (which this class extends) are required.
Delayed tasks execute no sooner than they are enabled, but without any real-time guarantees about when, after they are enabled, they will commence. Tasks scheduled for exactly the same execution time are enabled in first-in-first-out (FIFO) order of submission.
and
Interface ExecutorService
An Executor that provides methods to manage termination and methods that can produce a Future for tracking progress of one or more asynchronous tasks.
An ExecutorService can be shut down, which will cause it to stop accepting new tasks. After being shut down, the executor will eventually terminate, at which point no tasks are actively executing, no tasks are awaiting execution, and no new tasks can be submitted.
Edited:
you can find example to use Executor and ExecutorService herehereand here Question will be useful for you.

Categories

Resources