Java's Fork/Join vs ExecutorService - when to use which?

I just finished reading this post: What's the advantage of a Java-5 ThreadPoolExecutor over a Java-7 ForkJoinPool? and felt that the answer was not direct enough.
Can you explain, in simple language and with examples, the trade-offs between Java 7's Fork-Join framework and the older solutions?
I also read Google's #1 hit on the topic, Java Tip: When to use ForkJoinPool vs ExecutorService from javaworld.com, but the article doesn't answer the "when" in its title; it mostly talks about API differences ...

Fork-join lets you easily execute divide-and-conquer jobs, which you would otherwise have to implement manually to run them on an ExecutorService. In practice, ExecutorService is usually used to process many independent requests (a.k.a. transactions) concurrently, and fork-join when you want to accelerate one coherent job.
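A hedged sketch of such a divide-and-conquer job (the class name, threshold and array-summing task are invented for illustration):

import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Sums a long[] by recursively splitting it in half.
class SumTask extends RecursiveTask<Long> {
    private static final int THRESHOLD = 1_000; // arbitrary cut-off for the demo
    private final long[] data;
    private final int from, to;

    SumTask(long[] data, int from, int to) {
        this.data = data; this.from = from; this.to = to;
    }

    @Override
    protected Long compute() {
        if (to - from <= THRESHOLD) {             // small enough: compute directly
            long sum = 0;
            for (int i = from; i < to; i++) sum += data[i];
            return sum;
        }
        int mid = (from + to) >>> 1;
        SumTask left = new SumTask(data, from, mid);
        left.fork();                                        // run left half asynchronously
        long right = new SumTask(data, mid, to).compute();  // right half on this thread
        return right + left.join();                         // wait for the forked half
    }
}

// Usage: long total = new ForkJoinPool().invoke(new SumTask(array, 0, array.length));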

Fork-join is particularly good for recursive problems, where a task involves running subtasks and then processing their results. (This is typically called "divide and conquer" ... but that doesn't reveal the essential characteristics.)
If you try to solve a recursive problem like this using conventional threading (e.g. via an ExecutorService) you end up with threads tied up waiting for other threads to deliver results to them.
On the other hand, if the problem doesn't have those characteristics, there is no real benefit from using fork-join.
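For contrast, a minimal sketch (the helper name is made up) of the same kind of recursive job done naively on a fixed ExecutorService, where each task blocks on its children's Futures; with deep enough recursion every pool thread can end up waiting and none is left to run the pending subtasks:

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;

// Naive recursive decomposition on a plain thread pool (illustrative only).
class NaiveRecursiveSum {
    static long sum(ExecutorService pool, long[] data, int from, int to) throws Exception {
        if (to - from <= 1_000) {
            long s = 0;
            for (int i = from; i < to; i++) s += data[i];
            return s;
        }
        int mid = (from + to) >>> 1;
        Future<Long> left  = pool.submit(() -> sum(pool, data, from, mid));
        Future<Long> right = pool.submit(() -> sum(pool, data, mid, to));
        return left.get() + right.get();   // this thread is tied up until both children finish
    }
}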
References:
Java Tutorials: Fork/Join.
Java Tip: When to use ForkJoinPool vs ExecutorService.

Java 8 provides one more API in Executors
static ExecutorService newWorkStealingPool()
Creates a work-stealing thread pool using all available processors as its target parallelism level.
With the addition of this API, Executors provides several different kinds of ExecutorService options.
Depending on your requirements, you can choose one of them, or you can look at ThreadPoolExecutor, which gives finer control over the bounded task queue size and the RejectedExecutionHandler mechanism.
static ExecutorService newFixedThreadPool(int nThreads)
Creates a thread pool that reuses a fixed number of threads operating off a shared unbounded queue.
static ScheduledExecutorService newScheduledThreadPool(int corePoolSize)
Creates a thread pool that can schedule commands to run after a given delay, or to execute periodically.
static ExecutorService newCachedThreadPool(ThreadFactory threadFactory)
Creates a thread pool that creates new threads as needed, but will reuse previously constructed threads when they are available, and uses the provided ThreadFactory to create new threads when needed.
static ExecutorService newWorkStealingPool(int parallelism)
Creates a thread pool that maintains enough threads to support the given parallelism level, and may use multiple queues to reduce contention.
Each of these APIs targets a particular need of your application. Which one to use depends on your use case.
e.g.
If you want to process all submitted tasks in order of arrival, just use newFixedThreadPool(1)
If you want to optimize performance of big computation of recursive tasks, use ForkJoinPool or newWorkStealingPool
If you want to execute some tasks periodically or at a certain time in the future, use newScheduledThreadPool
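A hedged sketch of those three cases (the class name and task bodies are placeholders):

import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class ExecutorsChoices {
    public static void main(String[] args) throws Exception {
        // 1) Process submitted tasks strictly in arrival order
        ExecutorService ordered = Executors.newFixedThreadPool(1);
        ordered.submit(() -> System.out.println("first"));
        ordered.submit(() -> System.out.println("second"));

        // 2) CPU-bound bulk computation that benefits from work stealing
        ExecutorService stealing = Executors.newWorkStealingPool();
        List<Future<Integer>> results =
                stealing.invokeAll(Arrays.<Callable<Integer>>asList(() -> 1 + 1, () -> 2 + 2));

        // 3) Periodic or delayed execution
        ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(1);
        scheduler.schedule(() -> System.out.println("later"), 5, TimeUnit.SECONDS);

        ordered.shutdown();
        stealing.shutdown();
        scheduler.shutdown();
    }
}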
Have a look at one more nice article by Peter Lawrey on ExecutorService use cases.
Related SE question:
java Fork/Join pool, ExecutorService and CountDownLatch

Brian Goetz describes the situation best: https://www.ibm.com/developerworks/library/j-jtp11137/index.html
Using conventional thread pools to implement fork-join is also challenging because fork-join tasks spend much of their lives waiting for other tasks. This behavior is a recipe for thread starvation deadlock, unless the parameters are carefully chosen to bound the number of tasks created or the pool itself is unbounded. Conventional thread pools are designed for tasks that are independent of each other and are also designed with potentially blocking, coarse-grained tasks in mind — fork-join solutions produce neither.
I recommend reading the whole post, as it has a good example of why you'd want to use a fork-join pool. It was written before ForkJoinPool became official, so the coInvoke() method he refers to became invokeAll().

The Fork-Join framework is an extension of the Executor framework that specifically addresses the 'waiting' problem in recursive multi-threaded programs. In fact, the Fork-Join framework classes all extend existing classes of the Executor framework.
Two characteristics are central to the Fork-Join framework:
Work stealing (an idle thread steals work from a thread whose queue holds more tasks than it can currently process)
The ability to recursively decompose tasks and collect the results. (Apparently this requirement arose together with the very notion of parallel processing, but it lacked a solid implementation framework in Java until Java 7.)
If the parallel-processing need is strictly recursive, there is no choice but to go with Fork-Join; otherwise either the executor or the Fork-Join framework will do, though Fork-Join can be said to utilize resources better because its idle threads 'steal' tasks from busier threads.
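A hedged sketch of those two characteristics together, using a result-less RecursiveAction (the task and threshold are invented):

import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;

// Recursively decomposes the work; idle workers can steal the forked halves.
class ScaleAction extends RecursiveAction {
    private final double[] values;
    private final int from, to;

    ScaleAction(double[] values, int from, int to) {
        this.values = values; this.from = from; this.to = to;
    }

    @Override
    protected void compute() {
        if (to - from <= 10_000) {                         // arbitrary threshold
            for (int i = from; i < to; i++) values[i] *= 2.0;
        } else {
            int mid = (from + to) >>> 1;
            invokeAll(new ScaleAction(values, from, mid),  // forks both halves; an idle
                      new ScaleAction(values, mid, to));   // worker may steal one of them
        }
    }
}

// Usage: ForkJoinPool.commonPool().invoke(new ScaleAction(data, 0, data.length));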

Fork-Join is an implementation of ExecutorService. The main difference is that this implementation gives each worker its own deque: tasks are inserted at one side but can be withdrawn from either side. If you create a new ForkJoinPool(), it looks up the number of available CPUs and creates that many worker threads, then distributes the load evenly across them. But if one thread is working slowly while the others are fast, the fast threads pick up tasks from the slow thread's deque, from the back. The stages below illustrate the stealing:
Stage 1 (initially):
W1 -> 5,4,3,2,1
W2 -> 10,9,8,7,6
Stage 2:
W1 -> 5,4
W2 -> 10,9,8,7,
Stage 3:
W1 -> 10,5,4
W2 -> 9,8,7,
An ExecutorService, by contrast, creates the requested number of threads and uses a blocking queue to store all the remaining waiting tasks. If you use a cached executor service, it creates a new thread for each job when no idle thread is available, and there is no waiting queue.

Related

What if I do not fork my task and simply process in parallel using ForkJoinPool?

I have been reading about concurrency and parallelism in JDK 7, i.e. ThreadPoolExecutor, ExecutorService and ForkJoinPool. I have already used ThreadPoolExecutor (to create controlled threads) and ExecutorService (for fixed threads), but I am now thinking of making my processing parallel using ForkJoinPool.
My initial question is: what if I just submit my tasks to a ForkJoinPool and do not actually fork further (as opposed to the suggested approach of using RecursiveTask or RecursiveAction and forking in compute() below a given threshold)?
My assumption is that processing in parallel with a given number of threads, e.g. 4, would be faster than processing the tasks concurrently. How much performance will I gain from concurrency vs. parallelism? I need your suggestions on this.
I will be extending my logic on ForkJoinPool in phases, as the program logic is complex.
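For reference, a hedged sketch of the "no forking" case: ForkJoinPool implements ExecutorService, so independent tasks can be submitted directly without RecursiveTask/RecursiveAction, in which case the work-stealing machinery has little to do and behaviour is closer to an ordinary fixed pool (class name and task body are placeholders):

import java.util.concurrent.ExecutorService;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.Future;

public class PlainSubmitToForkJoin {
    public static void main(String[] args) throws Exception {
        ExecutorService pool = new ForkJoinPool(4);   // explicit parallelism of 4
        Future<Integer> f = pool.submit(() -> {
            // some independent, non-forking unit of work (placeholder)
            return 6 * 7;
        });
        System.out.println(f.get());
        pool.shutdown();
    }
}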

CompletableFuture supplyAsync(supplier) vs supplyAsync(supplier, executor)

The Java docs say that CompletableFuture#supplyAsync(Supplier<U> supplier) runs the task in the ForkJoinPool#commonPool(), whereas CompletableFuture#supplyAsync(supplier, executor) runs it in the given executor.
I'm trying to figure out which one to use. So my questions are:
What is the ForkJoinPool#commonPool()?
When should I use supplyAsync(supplier) vs supplyAsync(supplier, executor)?
ForkJoinPool#commonPool() is the common pool of threads that the Java API provides. If you have ever used the Stream API, its parallel operations are also executed in this thread pool.
The advantage of using the common thread pool is that the Java API manages it for you, from creation to destruction. The disadvantage is that many classes share the use of this pool.
If you use an executor, it is like owning a private pool, so nothing fights with you over its use. You have to create the executor yourself and pass it into CompletableFuture. Note, however, that the actual performance still depends on what is being done in the threads and on your hardware.
Generally, I find it fine to use the common thread pool for computationally intensive work, while an executor is better for things that have to wait (like IO). Sleeping in a common-pool thread is like using a cubicle in a public washroom to play games on your mobile phone: someone else could be waiting for the cubicle.
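A hedged sketch contrasting the two overloads (the blocking helper is a stand-in for real IO; names are made up):

import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class SupplyAsyncDemo {
    public static void main(String[] args) {
        // Runs on ForkJoinPool.commonPool(): fine for short, CPU-bound work.
        CompletableFuture<Integer> onCommon = CompletableFuture.supplyAsync(() -> 6 * 7);

        // Runs on a private executor: better for work that blocks (e.g. IO),
        // so the shared common pool is not tied up.
        ExecutorService ioPool = Executors.newFixedThreadPool(8);
        CompletableFuture<String> onPrivate =
                CompletableFuture.supplyAsync(() -> fetchRemoteValue(), ioPool);

        System.out.println(onCommon.join() + " / " + onPrivate.join());
        ioPool.shutdown();
    }

    // Placeholder for some blocking call (e.g. an HTTP request).
    private static String fetchRemoteValue() {
        return "response";
    }
}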

Situational use: Run tasks in ForkJoinPool vs. new Thread

While trying to understand a few of the new features provided in Java 8, I came upon the following situation:
I wish to implement asynchronous method calls in my application (in JavaFX). The idea is to use a separate thread for everything related to the GUI so that background tasks won't block or delay the visible output of my application.
For the background tasks, I thought about either using a pool of threads or simply running them in the main thread of the application for now. Then I came upon the ForkJoinPool, used in the standard way via the CompletableFuture class when doing something like this:
CompletableFuture.runAsync(task);
where task is a Runnable.
In most tutorials and in the Javadoc, the ForkJoinPool is described as "a pool which contains threads waiting for tasks to run". Also, the ForkJoinPool is usually sized to the number of cores of the user's machine, or double that if hyper-threading is supported.
What advantages does the ForkJoinPool give me over the traditional Thread when I want to run a task asynchronously?
ForkJoinPool is not comparable to a Thread; it is comparable to other thread pools. Creating new threads directly in code is often bad and can lead to OutOfMemoryErrors because it is uncontrolled. Depending on your use case you might need a ForkJoinPool or a different pool, but make sure you use one. And by the way, all CompletableFuture async methods have overloads that allow you to pass your own thread pool.
Some benefits of ForkJoinPool:
It is already initialised for you and shut down on JVM shutdown, so you don't need to worry about its lifecycle
Its size can be controlled through a VM parameter
It's perfect for compute-bound tasks, where work stealing can be beneficial, although stealing can also become overhead depending on the case.
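A hedged sketch of the last two points: the common pool's parallelism can be set with a JVM property, and the async methods have overloads that take your own pool (class name is made up):

import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Common-pool parallelism can be tuned at startup, e.g.:
//   java -Djava.util.concurrent.ForkJoinPool.common.parallelism=8 MyApp
public class RunAsyncChoices {
    public static void main(String[] args) {
        // Default overload: runs on ForkJoinPool.commonPool()
        CompletableFuture<Void> a =
                CompletableFuture.runAsync(() -> System.out.println("on common pool"));

        // Executor overload: runs on a pool you own and shut down yourself
        ExecutorService ownPool = Executors.newFixedThreadPool(2);
        CompletableFuture<Void> b =
                CompletableFuture.runAsync(() -> System.out.println("on private pool"), ownPool);

        CompletableFuture.allOf(a, b).join();
        ownPool.shutdown();
    }
}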

Examples of when it is convenient to use Executors.newSingleThreadExecutor()

Could somebody please tell me a real-life example where it's convenient to use this factory method rather than the others?
newSingleThreadExecutor
public static ExecutorService newSingleThreadExecutor()
Creates an Executor that uses a single worker thread operating off an
unbounded queue. (Note however that if this single thread terminates
due to a failure during execution prior to shutdown, a new one will
take its place if needed to execute subsequent tasks.) Tasks are
guaranteed to execute sequentially, and no more than one task will be
active at any given time. Unlike the otherwise equivalent
newFixedThreadPool(1) the returned executor is guaranteed not to be
reconfigurable to use additional threads.
Thanks in advance.
Could please somebody tell me a real life example where it's convenient to use [the newSingleThreadExecutor() factory method] rather than others?
I assume you are asking about when you use a single-threaded thread-pool as opposed to a fixed or cached thread pool.
I use a single threaded executor when I have many tasks to run but I only want one thread to do it. This is the same as using a fixed thread pool of 1 of course. Often this is because we don't need them to run in parallel, they are background tasks, and we don't want to take too many system resources (CPU, memory, IO). I want to deal with the various tasks as Callable or Runnable objects so an ExecutorService is optimal but all I need is a single thread to run them.
For example, I have a number of timer tasks that I Spring-inject. I have two kinds of tasks, and my "short-run" tasks run in a single thread pool. There is only one thread that executes them all even though there are a couple of hundred in my system. They do routine work such as checking for disk space, cleaning up logs, dumping statistics, etc. For tasks that are time critical, I run them in a cached thread pool.
Another example is that we have a series of partner integration tasks. They don't take very long and they run rather infrequently and we don't want them to compete with other system threads so they run in a single threaded executor.
A third example is that we have a finite state machine where each of the state mutators takes the job from one state to another and is registered as a Runnable in a single thread-pool. Even though we have hundreds of mutators, only one task is valid at any one point in time so it makes no sense to allocate more than one thread for the task.
Apart from the reasons already mentioned, you would want to use a single-threaded executor when you need ordering guarantees, i.e. you need to make sure that tasks always execute in the order they were submitted.
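A minimal sketch of that ordering guarantee (the executor name is made up); the five tasks always print in submission order because only one worker thread exists:

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class SingleThreadOrdering {
    public static void main(String[] args) {
        ExecutorService logWriter = Executors.newSingleThreadExecutor();
        for (int i = 0; i < 5; i++) {
            final int id = i;
            logWriter.submit(() -> System.out.println("task " + id)); // prints 0..4 in order
        }
        logWriter.shutdown();
    }
}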
The difference between Executors.newSingleThreadExecutor() and Executors.newFixedThreadPool(1) is small but can be helpful when designing a library API. If you expose the returned ExecutorService to users of your library and the library works correctly only when the executor uses a single thread (tasks are not thread safe), it is preferable to use Executors.newSingleThreadExecutor(). Otherwise the user of your library could break it by doing this:
ExecutorService e = myLibrary.getBackgroundTaskExecutor();
((ThreadPoolExecutor)e).setCorePoolSize(10);
Reconfiguring the pool this way is not possible with the executor returned by Executors.newSingleThreadExecutor().
It is helpful when you need a lightweight service that simply makes it convenient to defer task execution, and you want to ensure that only one thread is used for the job.

What's the advantage of a Java-5 ThreadPoolExecutor over a Java-7 ForkJoinPool?

Java 5 has introduced support for asynchronous task execution by a thread pool in the form of the Executor framework, whose heart is the thread pool implemented by java.util.concurrent.ThreadPoolExecutor. Java 7 has added an alternative thread pool in the form of java.util.concurrent.ForkJoinPool.
Looking at their respective API, ForkJoinPool provides a superset of ThreadPoolExecutor's functionality in standard scenarios (though strictly speaking ThreadPoolExecutor offers more opportunities for tuning than ForkJoinPool). Adding to this the observation that
fork/join tasks seem to be faster (possibly due to the work-stealing scheduler) and definitely need fewer threads (due to the non-blocking join operation), one might get the impression that ThreadPoolExecutor has been superseded by ForkJoinPool.
But is this really correct? All the material I have read seems to sum up to a rather vague distinction between the two types of thread pools:
ForkJoinPool is for many, dependent, task-generated, short, hardly ever blocking (i.e. compute-intensive) tasks
ThreadPoolExecutor is for few, independent, externally-generated, long, sometimes blocking tasks
Is this distinction correct at all? Can we say anything more specific about this?
ThreadPoolExecutor (TP) and ForkJoinPool (FJ) are targeted at different use cases. The main difference is the number of queues employed by the two executors, which determines what type of problem is better suited to each.
The FJ executor has n (a.k.a. the parallelism level) separate concurrent queues (deques), while the TP executor has only one concurrent queue (these queues/deques may be custom implementations not following the JDK Collections API). As a result, in scenarios where a large number of (usually relatively short-running) tasks is generated, the FJ executor performs better, since the independent queues minimize concurrent operations and infrequent steals help with load balancing. In TP, because of the single queue, there are concurrent operations every time work is dequeued; this acts as a relative bottleneck and limits performance.
In contrast, if there are relatively few long-running tasks, the single queue in TP is no longer a performance bottleneck. However, the n independent queues and relatively frequent work-stealing attempts now become a bottleneck in FJ, since there can be many futile attempts to steal work, all of which add overhead.
In addition, the work-stealing algorithm in FJ assumes that (older) tasks stolen from the deque will produce enough parallel tasks to reduce the number of steals. E.g. in quicksort or mergesort where older tasks equate to larger arrays, these tasks will generate more tasks and keep the queue non-empty and reduce the number of overall steals. If this is not the case in a given application then the frequent steal attempts again become a bottleneck. This is also noted in the javadoc for ForkJoinPool:
this class provides status check methods (for example getStealCount())
that are intended to aid in developing, tuning, and monitoring
fork/join applications.
Recommended reading: http://gee.cs.oswego.edu/dl/jsr166/dist/docs/
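A small, hedged monitoring sketch using the status methods mentioned above (class name is made up):

import java.util.concurrent.ForkJoinPool;

public class PoolStats {
    public static void main(String[] args) {
        ForkJoinPool pool = ForkJoinPool.commonPool();
        System.out.println("parallelism:        " + pool.getParallelism());
        System.out.println("steal count:        " + pool.getStealCount());
        System.out.println("queued tasks:       " + pool.getQueuedTaskCount());
        System.out.println("queued submissions: " + pool.getQueuedSubmissionCount());
    }
}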
From the docs for ForkJoinPool:
A ForkJoinPool differs from other kinds of ExecutorService mainly by
virtue of employing work-stealing: all threads in the pool attempt to
find and execute tasks submitted to the pool and/or created by other
active tasks (eventually blocking waiting for work if none exist).
This enables efficient processing when most tasks spawn other subtasks
(as do most ForkJoinTasks), as well as when many small tasks are
submitted to the pool from external clients. Especially when setting
asyncMode to true in constructors, ForkJoinPools may also be
appropriate for use with event-style tasks that are never joined.
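A hedged sketch of that asyncMode constructor, for event-style tasks that are executed but never joined (class name and task body are placeholders):

import java.util.concurrent.ForkJoinPool;

public class AsyncModePool {
    public static void main(String[] args) {
        ForkJoinPool eventPool = new ForkJoinPool(
                Runtime.getRuntime().availableProcessors(),
                ForkJoinPool.defaultForkJoinWorkerThreadFactory,
                null,       // no custom UncaughtExceptionHandler
                true);      // asyncMode: FIFO scheduling of locally queued tasks
        eventPool.execute(() -> System.out.println("event handled"));
        eventPool.shutdown();
    }
}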
The fork/join framework is useful for parallel execution, while the executor service allows for concurrent execution, and there is a difference. See this and this.
The fork/join framework also allows for work stealing (through its use of a deque).
This article is a good read.
AFAIK, ForkJoinPool works best if you have a large piece of work and you want it broken up automatically. ThreadPoolExecutor is a better choice if you already know how you want the work broken up. For this reason I tend to use the latter, because I have determined how I want the work divided. As such it's not for everyone.
It's worth noting that when it comes to relatively random pieces of business logic, a ThreadPoolExecutor will do everything you need, so why make it more complicated than necessary?
Let's compare the differences in constructors:
ThreadPoolExecutor
ThreadPoolExecutor(int corePoolSize,
int maximumPoolSize,
long keepAliveTime,
TimeUnit unit,
BlockingQueue<Runnable> workQueue,
ThreadFactory threadFactory,
RejectedExecutionHandler handler)
ForkJoinPool
ForkJoinPool(int parallelism,
ForkJoinPool.ForkJoinWorkerThreadFactory factory,
Thread.UncaughtExceptionHandler handler,
boolean asyncMode)
The only advantage I have seen in ForkJoinPool: the work-stealing mechanism used by idle threads.
Java 8 introduced one more API in Executors, newWorkStealingPool, to create a work-stealing pool. You don't have to create RecursiveTask or RecursiveAction, but you still get to use a ForkJoinPool (it is what newWorkStealingPool returns).
public static ExecutorService newWorkStealingPool()
Creates a work-stealing thread pool using all available processors as its target parallelism level.
Advantages of ThreadPoolExecutor over ForkJoinPool:
You can control task queue size in ThreadPoolExecutor unlike in ForkJoinPool.
You can enforce a rejection policy when you run out of capacity, unlike in ForkJoinPool.
I like these two features of ThreadPoolExecutor, which help keep the system healthy; a sketch of both follows below.
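A hedged sketch of both features (the class name, pool sizes, queue capacity and the chosen policy are arbitrary):

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class BoundedPool {
    public static void main(String[] args) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                4, 8,                                       // core and max pool size
                60, TimeUnit.SECONDS,                       // keep-alive for idle threads
                new ArrayBlockingQueue<>(100),              // bounded task queue
                new ThreadPoolExecutor.CallerRunsPolicy()); // rejection policy: push back on the submitter

        pool.execute(() -> System.out.println("task ran"));
        pool.shutdown();
    }
}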
EDIT:
Have a look at this article for use cases of various types of Executor Service thread pools and evaluation of ForkJoin Pool features.
