I have an Executor thread pool with both core size and maximum size set to 40, and each thread uses an HTTP connection from the Apache HTTP client's PoolingClientConnectionManager, configured with 40 connections per host route. I can see that when the load is lower, the performance also drops... can you please help me out?
Use something like the following:
ExecutorService service = Executors.newCachedThreadPool();
service.execute(someRunnable);
In this case, you needn't worry about the number of threads that are running.
The system will automatically scale the number of threads up and down for you.
Here is the documentation:
public static ExecutorService newCachedThreadPool()
Creates a thread pool that creates new threads as needed, but will
reuse previously constructed threads when they are available. These
pools will typically improve the performance of programs that execute
many short-lived asynchronous tasks. Calls to execute will reuse
previously constructed threads if available. If no existing thread is
available, a new thread will be created and added to the pool. Threads
that have not been used for sixty seconds are terminated and removed
from the cache. Thus, a pool that remains idle for long enough will
not consume any resources. Note that pools with similar properties but
different details (for example, timeout parameters) may be created
using ThreadPoolExecutor constructors.
Returns: the newly created thread pool
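The doc's last sentence can be put into practice directly; here is a minimal sketch (assuming, purely for illustration, that you want the same cached-pool behaviour but a shorter 30-second idle timeout):
// All of these classes live in java.util.concurrent.
ExecutorService service = new ThreadPoolExecutor(
        0, Integer.MAX_VALUE,               // no core threads, effectively unbounded maximum
        30L, TimeUnit.SECONDS,              // idle threads are reclaimed after 30s instead of 60s
        new SynchronousQueue<Runnable>());  // hand-off queue, same as newCachedThreadPool()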
Hope it helps.
It all depends on how many cores you have in the host that's running your JVM and how much CPU you expect each thread to utilise.
Assuming an 8-core host where nothing else is running aside from your JVM, and every task you execute is purely CPU bound, you'd probably want a thread pool size of 7 (leaving one core free for JVM activities like garbage collection and for other OS tasks).
However, systems are never that predictable, so it's probably better to do some profiling under expected loads and see what works best.
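As a rough starting point for that purely CPU-bound case (the exact number is an assumption to validate with that profiling):
// Leave one core free for GC and other JVM/OS work; never go below one thread.
int cores = Runtime.getRuntime().availableProcessors();
ExecutorService cpuBoundPool = Executors.newFixedThreadPool(Math.max(1, cores - 1));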
I have used ThreadPoolExecutor for years, and one of the main reasons is that it is designed to process many requests faster, thanks to parallelism and 'ready-to-go' threads (there are other reasons, though).
Now I'm puzzling over a detail of its inner design that has been well known for a long time.
Here is a snippet from the Java 8 ThreadPoolExecutor:
public void execute(Runnable command) {
    ...
    /*
     * Proceed in 3 steps:
     *
     * 1. If fewer than corePoolSize threads are running, try to
     * start a new thread with the given command as its first
     * task. The call to addWorker atomically checks runState and
     * workerCount, and so prevents false alarms that would add
     * threads when it shouldn't, by returning false.
     */
    ...
    int c = ctl.get();
    if (workerCountOf(c) < corePoolSize) {
        if (addWorker(command, true))
            return;
        c = ctl.get();
    }
    ...
I'm interested in this very first step because, in most cases, you do not want the thread pool executor to store 'unprocessed requests' in its internal queue; it is better to leave them in the external input (a Kafka topic, a JMS queue, etc.). So I usually design my performance/parallelism-oriented executor with zero internal capacity and a 'caller runs' rejection policy. You choose some sanely large number of parallel threads and enable the core pool timeout, so that the size of that value doesn't scare anyone ;). Since I don't use an internal queue and I want tasks to start being processed as early as possible, it effectively becomes a 'fixed thread pool executor'. Thus, in most cases, I'm inside this 'first step' of the method logic.
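A minimal sketch of that kind of executor (the thread count and timeout here are illustrative placeholders, not recommendations):
int parallelism = 64;  // the "sane big amount" - tune for your workload
ThreadPoolExecutor executor = new ThreadPoolExecutor(
        parallelism, parallelism,                    // core == max: effectively a fixed-size pool
        60L, TimeUnit.SECONDS,                       // core pool timeout (enabled below)
        new SynchronousQueue<Runnable>(),            // zero internal capacity
        new ThreadPoolExecutor.CallerRunsPolicy());  // caller runs the task when all workers are busy
executor.allowCoreThreadTimeOut(true);               // idle core threads may time out and die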
Here is the question: is it really the case that it will not 'reuse' existing threads but will create a new one each time it is 'under core size' (the most common case)? Wouldn't it be better to 'add a new core thread only if all others are busy' rather than 'whenever we get a chance to stall for a while on yet another thread creation'? Am I missing anything?
The doc describes the relationship between the corePoolSize, maxPoolSize, and the task queue, and what happens when a task is submitted.
...but will create a new one [thread] each time it is 'under core size'...
Yes. From the doc:
When a new task is submitted in method execute(Runnable), and fewer
than corePoolSize threads are running, a new thread is created to
handle the request, even if other worker threads are idle.
Wouldn't it be better to add a new core thread only if all others are busy...
Since you don't want to use the internal queue, this seems reasonable. So set corePoolSize and maxPoolSize to be the same; once the ramp-up of creating the threads is complete, there won't be any more thread creation.
However, using CallerRunsPolicy would seem to hurt performance if the external queue grows faster than it can be processed.
Here is the question: is it really the case that it will not 'reuse' existing threads but will create a new one each time it is 'under core size' (the most common case)?
Yes that is how the code is documented and written.
Am I missing anything?
Yes, I think you are missing the whole point of "core" threads. Core threads are defined in the Executors docs as:
... threads to keep in the pool, even if they are idle.
That's the definition. Thread startup is a non-trivial process, so if you have 10 core threads in a pool, the first 10 requests to the pool each start a thread until all of the core threads are live. This spreads the startup load across the first X requests. It is not about getting the tasks done; it is about initializing the TPE and spreading the thread-creation load out. You could call prestartAllCoreThreads() if you don't want this behavior.
The whole purpose of the core threads is to have threads already started and running, available to work on tasks immediately. If we had to start a thread each time we needed one, there would be unnecessary resource-allocation time and thread start/stop overhead taking compute and OS resources. If you don't want the core threads then you can let them time out and pay for the startup time.
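For instance, a small sketch of prestarting (it assumes, as is the case in current OpenJDK, that newFixedThreadPool is backed by a ThreadPoolExecutor, so the cast works):
ThreadPoolExecutor pool = (ThreadPoolExecutor) Executors.newFixedThreadPool(10);
System.out.println(pool.getPoolSize());   // 0 - core threads have not been started yet
pool.prestartAllCoreThreads();            // pay the whole startup cost up front
System.out.println(pool.getPoolSize());   // 10 - all core threads are now live and idle
pool.shutdown();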
I have used ThreadPoolExecutor for years, and one of the main reasons is that it is designed to process many requests faster...
TPE is not necessarily "faster". We use it because manually managing and communicating with a number of threads is hard and easy to get wrong. That's why the TPE code is so powerful. It is the OS threads that give us parallelism.
I don't use an internal queue and I want tasks to start being processed as early as possible,
The entire point of a threaded program is to maximize throughput. If you run 100 threads on a 4-core system and the tasks are CPU intensive, you are going to pay for the increased context switching, and the overall time to process a large number of requests is going to increase. Your application is also most likely competing for resources on the server with other programs, and you don't want it to slow to a crawl if 100s of jobs try to run in a thread pool at the same time.
The whole point of limiting your core threads (i.e. not making them a "sane big amount") is that there is an optimal number of concurrent threads that will maximize the overall throughput of your application. It can be really hard to find the optimal core thread size but experimentation, if possible, would help.
It depends highly on the degree of CPU versus IO in a task. If the tasks are making remote RPC calls to a slow service then it might make sense to have a large number of core threads in your pool. If they are predominantly CPU tasks, however, you are going to want to be closer to the number of CPU/cores and then queue the rest of the tasks. Again it is all about overall throughput.
To reuse threads, one needs some way to hand a task over to an existing thread.
This pushed me towards a SynchronousQueue and a zero core pool size.
// SECONDS is a static import of java.util.concurrent.TimeUnit.SECONDS;
// BasicThreadFactory comes from Apache Commons Lang (org.apache.commons.lang3.concurrent).
return new ThreadPoolExecutor(0, maxThreadsCount,
        10L, SECONDS,
        new SynchronousQueue<Runnable>(),
        new BasicThreadFactory.Builder().namingPattern("processor-%d").build());
This really reduced the 500 - 1500 ms latency 'peaks' on my 'main flow'.
But this works only for a zero-sized queue; for a non-zero-sized queue the question is still open.
When I have this code in an application:
Executors.newFixedThreadPool(4);
but I never use this thread pool. Will the idle threads consume CPU time? If so - why?
No, these threads are created lazily, or 'on-demand'. As stated in the documentation (emphasis mine):
On-demand construction
By default, even core threads are initially created and started only when new tasks arrive
Java provides methods to override this default and allow for eager creation, namely prestartCoreThread and prestartAllCoreThreads.
Once threads are actually created, idle ones (generally) won't take CPU time as there is no reason for them to be scheduled on a core when they have no work to do.
They will still hold on to some memory however, for their stack and whatnot.
The javadoc states:
Creates a thread pool that reuses a fixed number of threads operating off a shared unbounded queue. At any point, at most nThreads threads will be active processing tasks.
This might lead one to assume that we don't know exactly. But as the other answer clearly shows, we can know, and the implementation is actually fully lazy. Thus:
ExecutorService service = Executors.newFixedThreadPool(4);
does not even cause a lot of setup. The implementation would be free to wait for any
service.submit(...
to happen.
On the other hand, that thread pool could (theoretically) be created immediately, and 4 OS threads could be created along with it. If the latter were the case, all those threads would be idling, so they should not consume any CPU resources.
But of course, as Elliott is correctly pointing out, that initial step of creating the pool and (potentially) creating 1 to 4 threads requires CPU activity.
And of course: threads are an OS resource. So even when they "only exist" and do nothing, they have that "cost". Again, it might depend on the OS whether that could ever turn into a problem (like reaching some "maximum number of existing threads" limit).
Finally: as this got me curious, I had a look into the current Java 8 source code (from newFixedThreadPool() and ThreadPoolExecutor() down into DefaultThreadFactory). If I am not mistaken, those constructors only prepare for thread creation.
So: the "current" implementation is "fully" lazy; and if you really only call newFixedThreadPool() without ever using the resulting ExecutorService ... nothing happens (in terms of new threads being created).
How can I use an ExecutorService so that a central thread pool is created once at the application level, with its pool size set according to the number of threads the CPU can support at that time, and so that the different functionalities of the application take threads from this central pool as per their requirements?
As of Java 8, I suggest you use ForkJoinPool.commonPool(). This is the only global thread pool that base Java provides.
Before Java 8, you either keep your own thread pool(s) or use your framework's shareable thread pool(s).
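A small usage sketch of the common pool (note that CompletableFuture only runs tasks on the common pool when its parallelism is greater than one; otherwise it falls back to spawning a thread per task):
// The common pool is shared by the whole JVM and sized from availableProcessors() by default.
System.out.println(ForkJoinPool.commonPool().getParallelism());

// Many JDK APIs use it implicitly, e.g. CompletableFuture.supplyAsync(...) without
// an explicit executor, and parallel streams.
CompletableFuture<String> who = CompletableFuture.supplyAsync(
        () -> "running on " + Thread.currentThread().getName());
System.out.println(who.join());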
Below is my view:
a central Thread Pool?
Maybe; this is essentially the Singleton pattern from design patterns, and I think it can solve your problem.
set according to the number of threads available with the CPU?
There is no single exact size for the thread pool. In practice, the size depends on the type of tasks executed by the pool. For example, it can be Runtime.getRuntime().availableProcessors() + 1 if the tasks are CPU intensive, or Runtime.getRuntime().availableProcessors() * 2 if the tasks are I/O intensive. But these are only basic principles; you should determine the appropriate size by testing your application, guided by tools such as Little's law.
My suggestion:
In practice, I seldom submit all tasks to only one central thread pool. It is usually better to group your tasks by type and submit each group to its own thread pool; this makes it convenient to monitor and tune the pools later.
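A hedged sketch of such a setup (the class name and sizes are illustrative, following the rules of thumb above):
public final class AppThreadPools {
    private static final int CORES = Runtime.getRuntime().availableProcessors();

    // CPU-intensive work: roughly one thread per core.
    public static final ExecutorService CPU_POOL =
            Executors.newFixedThreadPool(CORES + 1);

    // I/O-intensive work: more threads, since most of them will be blocked waiting on I/O.
    public static final ExecutorService IO_POOL =
            Executors.newFixedThreadPool(CORES * 2);

    private AppThreadPools() {}
}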
Hope this helps.
How many threads are initially created in the thread pool when using Executors.newCachedThreadPool()? The Javadoc doesn't specify any number. Is there a guaranteed number that we will always get initially, like 10 or something? Docs below:
newCachedThreadPool
public static ExecutorService newCachedThreadPool()
Creates a thread pool that creates new threads
as needed, but will reuse previously constructed threads when they are
available. These pools will typically improve the performance of
programs that execute many short-lived asynchronous tasks. Calls to
execute will reuse previously constructed threads if available. If no
existing thread is available, a new thread will be created and added
to the pool. Threads that have not been used for sixty seconds are
terminated and removed from the cache. Thus, a pool that remains idle
for long enough will not consume any resources. Note that pools with
similar properties but different details (for example, timeout
parameters) may be created using ThreadPoolExecutor constructors.
Returns: the newly created thread pool
The answer is 0.
You can see in the source code that right after the ThreadPoolExecutor is created, no workers have been spawned.
I would like to know which to use CachedThreadPool or FixedThreadPool in this particular scenario.
When the user logs into the app, a list of about 10 addresses is obtained. I need to do the following:
Convert each address into latitude and longitude, for which I am calling a Google API
Obtain the distance between the fetched latitude/longitude and the user's current location, also with the help of a Google API
So, I have created a class GetDistance which implements Runnable. In this class I am first calling the Google API and parsing the response to get the respective latitude and longitude, and then calling and parsing the result of another Google API to get the driving distance.
private void getDistanceOfAllAddresses(List<Items> itemsList) {
    ExecutorService exService = Executors.newCachedThreadPool(); //Executors.newFixedThreadPool(3);
    for (int i = 0; i < itemsList.size(); i++) {
        exService.submit(new GetDistance(i, usersCurrentLocation));
    }
    exService.shutdown();
}
I have tried both CachedThreadPool and FixedThreadPool, and the time taken is almost the same. I am in favour of CachedThreadPool as it is recommended for small, short-lived operations, but I have some concerns. Let's assume CachedThreadPool creates 10 threads (the worst case) to complete the process (10 items): will that be an issue if my app is running on lower-end devices, given that the number of threads created also affects the device's RAM?
I want to know your thoughts and opinions on this. Which is better to use?
Go with newCachedThreadPool; it is a better fit for this situation because your tasks are small and I/O (network) bound. That means you should create more threads than the number of processor cores (usually 1.5x to 2x) to get optimum throughput, but here newCachedThreadPool will manage that itself. So newCachedThreadPool will have less overhead compared to newFixedThreadPool and will help in your situation.
If you had CPU intensive tasks then newFixedThreadPool could have been a better choice.
Update
A list of about 10 addresses will be obtained.
If you always need only about 10 addresses, then it doesn't matter; go with newCachedThreadPool. But if you think the number of addresses can grow, then use newFixedThreadPool with the number of threads <= 1.5x to 2x the number of cores available.
From Java docs:
newFixedThreadPool
Creates a thread pool that reuses a fixed number of threads operating off a shared unbounded queue. At any point, at most nThreads threads will be active processing tasks. If additional tasks are submitted when all threads are active, they will wait in the queue until a thread is available. If any thread terminates due to a failure during execution prior to shutdown, a new one will take its place if needed to execute subsequent tasks. The threads in the pool will exist until it is explicitly shutdown.
newCachedThreadPool
Creates a thread pool that creates new threads as needed, but will reuse previously constructed threads when they are available. These pools will typically improve the performance of programs that execute many short-lived asynchronous tasks. Calls to execute will reuse previously constructed threads if available. If no existing thread is available, a new thread will be created and added to the pool. Threads that have not been used for sixty seconds are terminated and removed from the cache. Thus, a pool that remains idle for long enough will not consume any resources. Note that pools with similar properties but different details (for example, timeout parameters) may be created using ThreadPoolExecutor constructors.
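If the address list can grow, a minimal sketch of the capped fixed pool suggested above might look like this (it reuses the question's GetDistance, Items and usersCurrentLocation; the 2x-cores cap is a rule-of-thumb assumption to tune):
private void getDistanceOfAllAddresses(List<Items> itemsList) {
    // Cap the pool so a long address list never creates an unbounded number of threads,
    // which matters on lower-end devices with limited RAM.
    int cores = Runtime.getRuntime().availableProcessors();
    int poolSize = Math.min(Math.max(1, itemsList.size()), cores * 2);
    ExecutorService exService = Executors.newFixedThreadPool(poolSize);
    for (int i = 0; i < itemsList.size(); i++) {
        exService.submit(new GetDistance(i, usersCurrentLocation));
    }
    exService.shutdown(); // already-submitted tasks still run to completion
}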