I have three instances of a class that implements the Runnable interface. I am creating my executor as shown below:
executor = Executors.newScheduledThreadPool(2); // <--- talking about this part
executor.scheduleAtFixedRate(unassignedRunnable, 0, refreshTime, TimeUnit.SECONDS);
executor.scheduleAtFixedRate(assignedToMeRunnable, 2, refreshTime, TimeUnit.SECONDS);
executor.scheduleAtFixedRate(createTicketsFromFile, 3, refreshTime * 2, TimeUnit.SECONDS);
My question is: does it make any difference if I change the thread pool count from 2 to 1 or 3? I tried and gained almost nothing. Can anyone explain the real use of the thread pool count? Maybe my tasks are lightweight?
You need to understand that no matter how many threads you create, how many of them actually run at the same time is limited by the number of available cores. As per the documentation, the pool size is "the number of threads to keep in the pool, even if they are idle."
Can you tell me what is the real use of thread pool count
executor = Executors.newScheduledThreadPool(2);
The line above creates a thread pool with 2 core threads, but that does not mean both will be doing work all the time. At the same time, a single thread can be reused to run other tasks that were submitted to the pool.
So it is better to understand your requirements before picking the total number of threads to create. (I usually choose the number based on the number of available cores.)
corePoolSize is the number of threads in the pool. An available thread picks up an eligible task and runs it. If no thread is available, a task that is due will not run until one frees up, because all threads are busy. In your case, your tasks may simply be very short-lived. To see the effect, create a pool with a core pool size of one, submit a long-running task, then submit a light task and observe the behavior. Then increase the core pool size to 2 and compare.
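The experiment suggested above can be sketched roughly as follows. This is a minimal illustration, not the asker's code: the class name PoolSizeDemo, the method lightTaskWaitMillis and the 300 ms sleep are all made up for the demo.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

class PoolSizeDemo {

    // Returns how many milliseconds a light task waited before it started,
    // while a 300 ms task was already running in a pool of the given size.
    static long lightTaskWaitMillis(int poolSize) {
        ScheduledExecutorService executor = Executors.newScheduledThreadPool(poolSize);
        CountDownLatch longTaskStarted = new CountDownLatch(1);
        CountDownLatch lightTaskStarted = new CountDownLatch(1);
        try {
            // Long-running task: occupies one pool thread for about 300 ms.
            executor.schedule(() -> {
                longTaskStarted.countDown();
                try {
                    Thread.sleep(300);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }, 0, TimeUnit.MILLISECONDS);
            longTaskStarted.await();

            // Light task: submitted only once the long task is known to be running.
            long submittedAt = System.nanoTime();
            executor.schedule(() -> {
                lightTaskStarted.countDown();
            }, 0, TimeUnit.MILLISECONDS);
            lightTaskStarted.await();
            return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - submittedAt);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("pool size 1: light task waited ~" + lightTaskWaitMillis(1) + " ms");
        System.out.println("pool size 2: light task waited ~" + lightTaskWaitMillis(2) + " ms");
    }
}
```

With one thread the light task has to wait for the long task to finish; with two threads it should start almost immediately. With very short tasks, as in the question, the difference is too small to notice.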
It depends on the number of CPU cores of the machine on which you are running your application. If you have more CPU cores, multiple threads can run in parallel, and the performance of the overall system can be improved if your application is not IO bound.
A CPU bound application will benefit from more cores and threads.
If you have a 4 core CPU, you can configure the value as 4. If your machine has a single CPU core, there won't be any benefit in changing the pool size to 4.
Related SE questions:
Java: How to scale threads according to cpu cores?
Is multithreading faster than single thread?
I have a service which schedules async tasks for the user using ScheduledExecutorService. Each user will trigger the service to schedule two tasks. (The 1st task schedules the 2nd task with a fixed delay, such as a 10 second interval.)
Pseudocode illustration:
task1Future = threadPoolTaskScheduler.schedule(task1);
for(int i = 0; i< 10000; ++i) {
task2Future = threadPoolTaskScheduler.schedule(task2);
task2Future.get(); // Takes long time
Thread.sleep(10);
}
task1Future.get();
Suppose I have a potential of 10000 users using the service at the same time, we can have two kinds of ScheduledExecutorService configuration for my service:
A single ScheduledExecutorService for all the users.
Create a ScheduledExecutorService for each user.
What I can think about the first method:
Pros:
Easy to control the number of threads in the thread pool.
Avoid creating new threads for scheduled tasks.
Cons:
Always keeping a number of threads available could waste computer resources.
May cause the service to hang for lack of available threads. (For example, set the thread pool size to 10, and then there are 100 people using the service at the same time; then after entering the 1st task and it tries to schedule the 2nd task, then finding out there is no thread available for scheduling the 2nd task)
What I can think about the second method
Pros:
Avoids always keeping many threads available when the number of users is small.
Can always provide threads for a large number of simultaneous users.
Cons:
Creating new threads creates overheads.
Don't know how to control the maximum number of threads for the service. May cause the machine to run out of RAM.
Any ideas about which way is better?
Single ScheduledExecutorService drives many tasks
The entire point of a ScheduledExecutorService is to maintain a collection of tasks to be executed after a certain amount of time elapses.
So given the scenario you describe, you need only a single ScheduledExecutorService object. Submit your 10,000 tasks to that one object. Each task will be executed approximately when its designated delay elapses. Simple, and easy.
Thread pool size
The real issue is deciding how many threads to assign to the ScheduledExecutorService.
Threads, as currently implemented in the OpenJDK project, are mapped directly to host OS threads. This makes them relatively heavyweight in terms of CPU and memory usage. In other words, currently Java threads are “expensive”.
There is no simple easy answer to calculating thread pool size. The optimal number is the least amount of threads that can keep up with the workload without over-burdening the host machine’s limited number of cores and limited memory. If you search Stack Overflow, you’ll find many discussions on the topic of deciding how many threads to use in a pool.
Project Loom
Also keep tabs on the progress of Project Loom and its promise to bring virtual threads to Java. That technology has the potential to radically alter the calculus of deciding thread pool size. Virtual threads will be more efficient with CPU and with memory. In other words, virtual threads will be quite “cheap”, “inexpensive”.
How executor service works
You said:
entering the 1st task and it tries to schedule the 2nd task, then finding out there is no thread available for scheduling the 2nd task
That is not how the scheduled executor service (SES) works.
If a task being currently executed by a SES needs to schedule itself or some other task to later execution, that submitted task is added to the queue maintained internally by the SES. There is no need to have a thread immediately available. Nothing happens immediately except that queue addition. Later, when the added task’s specified delay has elapsed, the SES looks for an available thread in its thread-pool to execute that task that was queued a while back in time.
You seem to feel a need to manage the time of each task’s execution on certain threads. But that is the job of the scheduled executor service. The SES tracks the tasks submitted for execution, notices when their specified delay elapses, and schedules their execution on a thread from its managed pool of threads. You don’t need to manage any of that. Your only challenge is to assign an appropriate number of threads to the pool.
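To illustrate the point about the internal queue, here is a minimal sketch that uses a scheduled executor service with only one thread. The class name RescheduleDemo and the event strings are made up for the demo; task1 schedules task2 from inside its own run, and no second thread is needed for that.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

class RescheduleDemo {

    // Runs task1 on a one-thread SES; task1 schedules task2 while it is still running.
    // Returns the order in which the events were recorded.
    static List<String> runOnSingleThread() {
        ScheduledExecutorService ses = Executors.newSingleThreadScheduledExecutor();
        List<String> events = new CopyOnWriteArrayList<>();
        CountDownLatch secondTaskDone = new CountDownLatch(1);
        try {
            ses.schedule(() -> {
                events.add("task1 start");
                // This call only adds task2 to the SES queue; it returns immediately
                // even though the pool's single thread is busy running task1.
                ses.schedule(() -> {
                    events.add("task2 run");
                    secondTaskDone.countDown();
                }, 50, TimeUnit.MILLISECONDS);
                events.add("task1 end");
            }, 0, TimeUnit.MILLISECONDS);

            secondTaskDone.await(5, TimeUnit.SECONDS);
            return new ArrayList<>(events);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } finally {
            ses.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println(runOnSingleThread());
    }
}
```

Note that this sketch never blocks inside task1. If task1 called get() on task2's future on a one-thread pool, it would be holding the only thread that task2 needs, which is a different situation from merely scheduling.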
Multiple executor services
You commented:
why don't use multiple ScheduledExecutorService instances
Because in your scenario, there is no benefit. Your Question implies that you have many tasks all similar with none being prioritized. In such a case, just use one executor service. One scheduled executor service with 12 threads will get the same amount of work accomplished as 3 services with 4 threads each.
As for excess threads, they are not a burden. Any thread without a task to execute uses virtually no CPU time. A pool may or may not choose to close some unused threads after a while. But such a policy is up to the implementation of the thread pool of the executor service, and is transparent to us as calling programmers.
If the scenario were different, where some of the tasks block for long periods of time, or where you need to prioritize certain tasks, then you may want to segregate those into a separate executor service.
In today's Java (before Project Loom with virtual threads), when code in a thread blocks, that thread sits there doing nothing but waiting to unblock. Blocking means your code is performing an operation that awaits a response. For example, making network calls to a socket or web service blocks, writing to storage blocks, and accessing an external database blocks. Ideally, you would not write code that blocks for long periods of time. But sometimes you must.
In such a case where some tasks run long, or conversely you have some tasks that must be prioritized for fast execution, then yes, use multiple executor services.
For example, say you have a 16-core machine with not much else running except your Java app. You might have one executor service with a thread pool size of 4 maximum for long-running tasks, one executor service with a thread pool with a size of 7 maximum for many run-of-the-mill tasks, and a third executor service with a thread pool maximum size of 2 for very few tasks that run short but must run quickly. (The numbers here are arbitrary examples, not a recommendation.)
Other approaches
As commented, there are other frameworks for managing concurrency. The ScheduledExecutorService discussed here is general purpose.
For example, Swing, JavaFX, Spring, and Jakarta EE each have their own concurrency management. Consider using those where appropriate to your particular project.
On application, i'm using
Executor executionContext = Executors.newFixedThreadPool(10);
for fixed the 10 threads.
I don't want to fix the number of thread, I want thread will be dynamic. The number of threads it requires, it will process.
How it can be applied?
The 10 in your question is the maximum size of the thread pool, not the number of threads running at any given moment. You say that you want:
The number of threads it requires, it will process.
Then your code will largely work as you want. The pool creates threads as tasks arrive until it reaches that limit of 10; further tasks wait in the pool's queue until a thread is free. If you build a ThreadPoolExecutor yourself with a bounded queue, you can also choose what happens when both the pool and the queue are full by setting a "rejection policy".
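If a pool that grows and shrinks with demand is what is wanted, one possible sketch is a ThreadPoolExecutor with zero core threads and a SynchronousQueue. Executors.newCachedThreadPool() is the ready-made variant of this with no upper bound. The cap of 50, the 30 second keep-alive and the class name ElasticPoolDemo are arbitrary choices for illustration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class ElasticPoolDemo {

    // Returns the pool size observed while `concurrentTasks` tasks are running at once.
    // Assumes concurrentTasks is at most 50, the arbitrary cap chosen below.
    static int poolSizeUnderLoad(int concurrentTasks) {
        // 0 core threads, up to 50 threads, idle threads retire after 30 seconds.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                0, 50, 30L, TimeUnit.SECONDS, new SynchronousQueue<>());
        CountDownLatch allStarted = new CountDownLatch(concurrentTasks);
        CountDownLatch release = new CountDownLatch(1);
        try {
            for (int i = 0; i < concurrentTasks; i++) {
                pool.execute(() -> {
                    allStarted.countDown();
                    try {
                        release.await(); // keep the thread busy until released
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            allStarted.await();
            return pool.getPoolSize();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } finally {
            release.countDown();
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("3 concurrent tasks -> pool size " + poolSizeUnderLoad(3));
        System.out.println("8 concurrent tasks -> pool size " + poolSizeUnderLoad(8));
    }
}
```

The pool size follows the number of tasks that are running at the same time, up to the cap, instead of being fixed in advance.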
I used to use ThreadPoolExecutors for years and one of the main reasons - it is designed to 'faster' process many requests because of parallelism and 'ready-to-go' threads (there are other though).
Now I'm stuck rethinking an internal design detail that I thought I knew well.
Here is a snippet from the Java 8 ThreadPoolExecutor:
public void execute(Runnable command) {
...
/*
* Proceed in 3 steps:
*
* 1. If fewer than corePoolSize threads are running, try to
* start a new thread with the given command as its first
* task. The call to addWorker atomically checks runState and
* workerCount, and so prevents false alarms that would add
* threads when it shouldn't, by returning false.
*/
...
int c = ctl.get();
if (workerCountOf(c) < corePoolSize) {
if (addWorker(command, true))
return;
c = ctl.get();
}
...
I'm interested in this very first step, because in most cases you do not want the thread pool executor to store 'unprocessed requests' in its internal queue; it is better to leave them in an external input Kafka topic / JMS queue etc. So I usually design my performance / parallelism oriented executor to have zero internal capacity and a 'caller runs' rejection policy. You choose some sane, large number of parallel threads plus a core pool timeout, so as not to scare others by showing how big the value is ;). I don't use internal queue and I want tasks to start to be processed the earlier the better, thus it has become a 'fixed thread pool executor'. As a result, in most cases I'm in this 'first step' of the method logic.
Here is the question: is this really the case that it will not 'reuse' existing threads but will create new one each time it is 'under core size' (most cases)? Would it be not better to 'add new core thread only if all others are busy' and not 'when we have a chance to suck for a while on another thread creation'? Am I missing anything?
The doc describes the relationship between the corePoolSize, maxPoolSize, and the task queue, and what happens when a task is submitted.
...but will create new one [thread] each time it is 'under core size...'
Yes. From the doc:
When a new task is submitted in method execute(Runnable), and fewer
than corePoolSize threads are running, a new thread is created to
handle the request, even if other worker threads are idle.
Would it be not better to add new core thread only if all others are busy...
Since you don't want to use the internal queue this seems reasonable. So set the corePoolSize and maxPoolSize to be the same. Once the ramp up of creating the threads is complete there won't be any more creation.
However, using CallerRunsPolicy would seem to hurt performance if the external queue grows faster than can be processed.
Here is the question: is this really the case that it will not 'reuse' existing threads but will create new one each time it is 'under core size' (most cases)?
Yes that is how the code is documented and written.
Am I missing anything?
Yes, I think you are missing the whole point of "core" threads. Core threads are defined in the Executors docs as:
... threads to keep in the pool, even if they are idle.
That's the definition. Thread startup is a non trivial process and so if you have 10 core threads in a pool, the first 10 requests to the pool each start a thread until all of the core threads are live. This spreads the startup load across the first X requests. This is not about getting the tasks done, this is about initializing the TPE and spreading the thread creation load out. You could call prestartAllCoreThreads() if you don't want this behavior.
The whole purpose of the core threads is to have threads already started and running available to work on tasks immediately. If we had to start a thread each time we needed one, there would be unnecessary resource allocation time and thread start/stop overhead taking compute and OS resources. If you don't want the core threads then you can let them timeout and pay for the startup time.
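The startup behavior described above can be observed with a small sketch. The class name CoreThreadStartupDemo and the core size of 4 are arbitrary. Two tasks are run one after the other, so the first thread is already idle when the second task arrives, yet a second thread is still started.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class CoreThreadStartupDemo {

    // Returns {pool size after two sequential tasks, pool size after prestarting}.
    static int[] observePoolSizes() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                4, 4, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
        try {
            // Each get() waits for the task to finish before the next one is submitted.
            pool.submit(() -> { }).get();
            pool.submit(() -> { }).get();
            int afterTwoTasks = pool.getPoolSize();

            // Starts the remaining core threads right away instead of on demand.
            pool.prestartAllCoreThreads();
            int afterPrestart = pool.getPoolSize();

            return new int[] { afterTwoTasks, afterPrestart };
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("task failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int[] sizes = observePoolSizes();
        System.out.println("after two sequential tasks: " + sizes[0]);
        System.out.println("after prestartAllCoreThreads(): " + sizes[1]);
    }
}
```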
I used to use ThreadPoolExecutors for years and one of the main reasons - it is designed to 'faster' process many requests because of parallelism and 'ready-to-go' threads (there are other though).
TPE is not necessarily "faster". We use it because to manually manage and communicate with a number of threads is hard and easy to get wrong. That's why the TPE code is so powerful. It is the OS threads that give us parallelism.
I don't use internal queue and I want tasks to start to be processed the earlier the better,
The entire point of a threaded program is to maximize throughput. If you run 100 threads on a 4 core system and the tasks are CPU intensive, you are going to pay for the increased context switching, and the overall time to process a large number of requests is going to increase. Your application is also most likely competing for resources on the server with other programs, and you don't want to cause it to slow to a crawl if 100s of jobs try to run in a thread pool at the same time.
The whole point of limiting your core threads (i.e. not making them a "sane big amount") is that there is an optimal number of concurrent threads that will maximize the overall throughput of your application. It can be really hard to find the optimal core thread size but experimentation, if possible, would help.
It depends highly on the degree of CPU versus IO in a task. If the tasks are making remote RPC calls to a slow service then it might make sense to have a large number of core threads in your pool. If they are predominantly CPU tasks, however, you are going to want to be closer to the number of CPU/cores and then queue the rest of the tasks. Again it is all about overall throughput.
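As a rough starting point for that experimentation, a commonly cited rule of thumb scales the core count by the ratio of waiting time to compute time. The sketch below is not from this answer: the class PoolSizing, the method suggestedThreads and the sample timings are made up, and the result is only an initial guess to be tuned by measurement.

```java
class PoolSizing {

    // Rule of thumb: threads = cores * (1 + waitTime / computeTime).
    // waitMillis is the time a task spends blocked (IO, remote calls),
    // computeMillis is the time it spends on the CPU.
    static int suggestedThreads(int cores, double waitMillis, double computeMillis) {
        if (cores < 1 || waitMillis < 0 || computeMillis <= 0) {
            throw new IllegalArgumentException("cores >= 1, wait >= 0 and compute > 0 required");
        }
        long threads = Math.round(cores * (1 + waitMillis / computeMillis));
        return (int) Math.max(1, threads);
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        // Purely CPU bound: stay close to the number of cores.
        System.out.println("CPU bound: " + suggestedThreads(cores, 0, 10));
        // Mostly waiting on a slow remote service: many more threads can be useful.
        System.out.println("IO heavy:  " + suggestedThreads(cores, 90, 10));
    }
}
```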
To reuse threads, one needs to somehow hand the task over to an existing thread.
This pushed me towards a synchronous queue and a zero core pool size.
return new ThreadPoolExecutor(0, maxThreadsCount,
10L, SECONDS,
new SynchronousQueue<Runnable>(),
new BasicThreadFactory.Builder().namingPattern("processor-%d").build());
This has really reduced the number of 'peaks' of 500 - 1500 ms in my 'main flow'.
But this will work only for a zero-sized queue. For a non-zero-sized queue the question is still open.
I have a loop which iterates 10 times, and each iteration executes some interrelated methods. Is there any solution that can run each iteration of that loop in parallel?
1st core should execute the program for i=0,1,2,3
2nd core should execute the program for i=4,5
3rd core should execute the program for i=6,7
4th core should execute the program for i=8,9
Assume I have 4 cores
I have a loop which iterates 10 times, and each iteration executes some interrelated methods. Is there any solution that can run each iteration of that loop in parallel?
First thing to realize is that you don't have specific control over task to core mapping unless you utilize 3rd party libraries. In general you don't want to do this anyway -- just let the JVM be smart about it.
To utilize all of the cores on your hardware, typically you change the number of threads in a fixed thread pool. You can also use a cached thread pool, which will increase the number of threads depending on the number of tasks submitted to the ExecutorService. On modern JVMs and operating systems thrashing is less of a worry, so this works well unless you are talking about 1000s of tasks.
But typically I set a fixed number of threads in the pool. If the tasks are 100% CPU bound then I may set the number of threads to match the number of virtual processors the JVM has. Typically with log output, network transactions, and other IO you might need to increase the number of threads to utilize all of the processors. I'd run your program a couple of times with different values.
In your case you seem to want to specifically run jobs 0,1,2,3 in one thread while 4,5 run in another thread in parallel (hopefully). To accomplish this you should submit 4 tasks to your ExecutorService and have each task run the iteration numbers that you want.
Maybe something like:
executorService.submit(new MyTask(new int[] { 0, 1, 2, 3}));
executorService.submit(new MyTask(new int[] { 4, 5 }));
If the number to thread mapping is arbitrary then I'd just submit 10 tasks to the ExecutorService and it will run the tasks in parallel in a FIFO manner.
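A fuller sketch of the grouped submission might look like the following. MyTask is not defined in this answer, so the sketch uses a Callable per group instead; the process method is a hypothetical stand-in for the per-iteration work, and the class name LoopPartitionDemo is made up.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class LoopPartitionDemo {

    // Hypothetical stand-in for the "interrelated methods" run for iteration i.
    static int process(int i) {
        return i * i;
    }

    // Runs each group of loop indices as one task and returns one result per group.
    static List<Integer> runPartitions(int[][] partitions) {
        ExecutorService executorService = Executors.newFixedThreadPool(partitions.length);
        try {
            List<Callable<Integer>> tasks = new ArrayList<>();
            for (int[] indices : partitions) {
                tasks.add(() -> {
                    int sum = 0;
                    for (int i : indices) {
                        sum += process(i);
                    }
                    return sum;
                });
            }
            // invokeAll waits for all tasks and returns futures in submission order.
            List<Integer> results = new ArrayList<>();
            for (Future<Integer> future : executorService.invokeAll(tasks)) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("task failed", e);
        } finally {
            executorService.shutdown();
        }
    }

    public static void main(String[] args) {
        int[][] partitions = { { 0, 1, 2, 3 }, { 4, 5 }, { 6, 7 }, { 8, 9 } };
        System.out.println(runPartitions(partitions));
    }
}
```

Which core each group ends up on is still decided by the JVM and the operating system; the sketch only controls which iterations share a task.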
Look into Executors and ExecutorService. You can create a thread pool and throw any number of Runnables at it. For instance, if you want to have one thread per core, you can do something like:
ExecutorService pool =
Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());
Then you can just do this:
pool.execute(myRunnable);
There is a very good example in the API docs: http://docs.oracle.com/javase/7/docs/api/java/util/concurrent/ExecutorService.html
I am using a ThreadPoolExecutor with a corePoolSize = maxPoolSize = queueSize = 15
Every incoming request spawns 7 tasks, to be executed with this thread pool.
Even though each of the individual tasks, on being scheduled, take less than 3 seconds, the overall request takes much longer.
I suspected that the system is falling short of threads and tasks being queued.
I logged the following information for every incoming request.
getActiveCount()
getLargestPoolSize()
getPoolSize()
getQueue().size()
I notice that the system is not falling short of threads.
getPoolSize and getLargestPoolSize values are constantly at 15 - This is as expected.
getQueue().size() is always 0 - so no tasks are getting queued.
getActiveCount() values are always between 1-2.
Why aren't the rest of the threads in the pool working ?
Is "getActiveCount()-Returns the approximate number of threads that are actively executing tasks." the right API to use ?
As @Thomas suggests, the pool is creating threads as required so if you only give the pool 1-2 tasks to do at once, it will only have 1-2 threads active. You need to feed more tasks to it at once if you want it to be busier.
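A small sketch can reproduce the numbers from the question. The class name ActiveCountDemo and the latch-based tasks are made up; the pool is prestarted so that all 15 threads exist, and only two tasks are in flight.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class ActiveCountDemo {

    // Returns {getActiveCount(), getPoolSize()} observed while busyTasks tasks are running.
    static int[] observe(int poolSize, int busyTasks) {
        if (busyTasks > poolSize) {
            throw new IllegalArgumentException("demo assumes busyTasks <= poolSize");
        }
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                poolSize, poolSize, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(poolSize));
        CountDownLatch allStarted = new CountDownLatch(busyTasks);
        CountDownLatch release = new CountDownLatch(1);
        try {
            pool.prestartAllCoreThreads(); // all threads exist up front, as in the question
            for (int i = 0; i < busyTasks; i++) {
                pool.execute(() -> {
                    allStarted.countDown();
                    try {
                        release.await(); // stay busy until the measurement is taken
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            allStarted.await();
            // getActiveCount() is documented as approximate, so let the pool settle briefly.
            for (int i = 0; i < 200 && pool.getActiveCount() != busyTasks; i++) {
                Thread.sleep(10);
            }
            return new int[] { pool.getActiveCount(), pool.getPoolSize() };
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } finally {
            release.countDown();
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int[] observed = observe(15, 2);
        System.out.println("active=" + observed[0] + ", poolSize=" + observed[1]);
    }
}
```

The pool size stays at 15 while the active count matches the number of tasks actually running, so a low active count points at too few tasks in flight rather than too few threads.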
I don't know Java's thread pool that well but generally you'd only create as many threads as your machine has cores (or hardware threads) available. If you are running on a dual core machine 2 active threads are a sensible value.
Have you tried to run this in debug mode in eclipse? It usually shows all the threads allocated as well as their statuses.
I suspect that in your case the following scenario takes place: you have a thread pool, and the pool has allocated a number of threads to process the tasks you are submitting to it. But once a particular thread in the pool has completed its task, it does not terminate. Instead it switches its state to WAITING (or TIMED_WAITING if the pool uses a keep-alive timeout) and waits until the next task is handed to it for processing.
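The idle state can be checked with a short sketch. The class name IdleThreadStateDemo is made up, and the polling loop simply gives the worker a moment to go back to waiting for work. A fixed pool has no keep-alive timeout, so its idle worker is expected to show WAITING.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class IdleThreadStateDemo {

    // Returns the state of a pool thread after it has finished its only task.
    static Thread.State idleWorkerState() {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        try {
            // The task just reports which pool thread ran it.
            Callable<Thread> whoAmI = () -> Thread.currentThread();
            Thread worker = pool.submit(whoAmI).get();

            // Give the worker up to 2 seconds to return to waiting for the next task.
            long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(2);
            while (worker.getState() != Thread.State.WAITING && System.nanoTime() < deadline) {
                Thread.sleep(10);
            }
            return worker.getState();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("task failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("idle worker state: " + idleWorkerState());
    }
}
```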