Allowing core thread timeout with ScheduledThreadPoolExecutor

Could you explain why the ScheduledThreadPoolExecutor Javadoc says this:
Additionally, it is almost never a good idea to set corePoolSize to
zero or use allowCoreThreadTimeOut because this may leave the pool without
threads to handle tasks once they become eligible to run.
I've tried to analyze how new threads are created in this thread pool when a new task has to be executed, and I think the problem described in the Javadoc shouldn't happen.

The thread pool tries to keep the number of worker threads equal to corePoolSize, so that cached threads can be reused efficiently. Allowing core threads to time out works against that purpose: new tasks will still be executed, but worker threads will be created and destroyed repeatedly.
If you set allowCoreThreadTimeOut = true, a worker thread that finds no task in the queue before the keep-alive time elapses is destroyed, even when the number of worker threads is at or below corePoolSize. If you submit a new task after that, the thread pool has to create a new thread.
If you set allowCoreThreadTimeOut = false, a worker thread that finds no task in the queue is not destroyed while the number of worker threads is at or below corePoolSize; it keeps waiting for a new task.
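The difference can be observed with a plain ThreadPoolExecutor. This is only a minimal sketch: the class name CoreTimeoutDemo, the 100 ms keep-alive and the 1 second idle period are assumptions chosen for illustration, and it needs Java 8 or later for the lambda.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class CoreTimeoutDemo {
    /** Returns the pool size observed after the pool has idled past its keep-alive time. */
    static int poolSizeAfterIdle(boolean allowCoreTimeout) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                2, 2, 100, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
        pool.allowCoreThreadTimeOut(allowCoreTimeout);
        try {
            pool.submit(() -> { }).get(); // one task, so exactly one core thread is created
            Thread.sleep(1000);           // stay idle well past the 100 ms keep-alive
            return pool.getPoolSize();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("core timeout allowed:  " + poolSizeAfterIdle(true));
        System.out.println("core timeout disabled: " + poolSizeAfterIdle(false));
    }
}
```

With the timeout allowed the idle worker is gone and the next task has to pay for a new thread; with it disabled the worker is still parked on the queue.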

My guess is that the answer is stale Javadoc, for the most part. As you note, ensurePrestart ensures that as long as corePoolSize > 0, the number of core pool threads is nonzero after the call. This has been the case since https://github.com/openjdk/jdk/commit/2d19ea519b17529a083a62eb219da532693bbef3, but notably that commit did not update the Javadoc on ScheduledThreadPoolExecutor.
However, the details aren't quite so simple either. Rather than only worrying about new task submissions and task reschedule on completion, you also need to worry about core pool threads idling out because all scheduled tasks are too far in the future to trigger before the pool timeout.
Not sure if I'm reading the JRE code correctly, but it looks like in such a case, the pool will:
1. Start a worker thread (due to ensurePrestart)
2. Regardless of thread type (core or not), the thread is eligible for timeout because allowCoreThreadTimeOut is true
3. Worker thread polls DelayedWorkQueue for the next task, with timeout
4. poll returns null (times out), because the next scheduled task is beyond the pool timeout
5. The core thread will terminate because it thinks there is nothing to do
6. ThreadPoolExecutor.processWorkerExit will run on worker termination. It will check the queue, notice that it's nonempty, and thus require a minimum of at least one thread
7. If the thread being terminated is the last thread, it will notice that the minimum is not met and immediately start a new (non-core) worker
8. Repeat from step 1
So, the pool will work as you intend, but probably won't be in an ideal state either (you really want a single core thread polling without timeout here, not threads constantly spinning up, polling with timeout, timing out, then starting a new thread to replace itself). In that sense, step 6 is really what prevents the case mentioned in Javadoc (by the time tasks are eligible, all pool threads have timed out), but it does so imperfectly because of the unnecessary thread creation/destruction loop.
This weirdness is really because DelayedWorkQueue semantically breaks the BlockingQueue contract. All else equal, you'd assume that size() > 0 implies that a subsequent poll(...) will successfully retrieve an element and not time out, but DelayedWorkQueue allows those elements to be held back for some time (even though they're already visible via size() and isEmpty()).
NOTE: Seems like this code changed in Java 8, with the addition of this condition to ThreadPoolExecutor.getTask(). This will keep that last worker thread alive and avoid the thread create/destroy loop, but it will busy-poll the work queue for work instead.
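The "pool still works" part of this is easy to check from the outside. The sketch below is not a look into the JDK internals; it only schedules a task further in the future than the keep-alive time and reports whether it ran. The class name, the 50 ms keep-alive and the 500 ms delay are assumptions for illustration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class DelayedTaskWithCoreTimeout {
    /** Schedules a task beyond the keep-alive time and reports whether it still ran. */
    static boolean delayedTaskRuns() {
        ScheduledThreadPoolExecutor pool = new ScheduledThreadPoolExecutor(1);
        pool.setKeepAliveTime(50, TimeUnit.MILLISECONDS); // must be > 0 before allowing timeout
        pool.allowCoreThreadTimeOut(true);
        CountDownLatch ran = new CountDownLatch(1);
        try {
            // The delay (500 ms) is ten times the keep-alive, so the worker's
            // timed poll expires several times before the task becomes eligible.
            pool.schedule(() -> ran.countDown(), 500, TimeUnit.MILLISECONDS);
            return ran.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("delayed task ran: " + delayedTaskRuns());
    }
}
```

The task runs either way; what differs between JDK versions is how much work the pool does while waiting, which this sketch does not measure.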

Related

Will thread in thread pool (ThreadPoolTaskExecutor) be used for other tasks when in sleep?

I have a thread pool created from ThreadPoolTaskExecutor:
threadPoolTaskExecutor.setCorePoolSize(10);
This ThreadPoolTaskExecutor will execute runnables, each of which calls Thread.sleep(6000); during its execution.
Suppose I execute the 1st task and, while it is sleeping in Thread.sleep(6000), I execute the 2nd task. Is it possible that the 2nd task will use the 1st task's thread, or interrupt it, since that thread is asleep?

No, not automatically. If a normal Java Thread is sleeping, then it's sleeping.
Also, Threads are really cheap nowadays, I would not bother with details like that.
Working around the sleep so that the 'sleeping' Thread does some other work is really hard to manage properly and easily leads to loads of unforeseen problems.
In your situation, what you COULD improve is that the pool size is not fixed but dynamic, releasing (destroying) threads when they haven't been used for some time (say: a minute).
If more threads are needed, the pool will create new Threads, up to its maximum size. (Executors.newCachedThreadPool() behaves this way, with an effectively unbounded maximum.)
That's especially helpful on a server, because 99% of threads on there are just idle (not even 'actively' sleeping).
Here are some details that might interest you: https://www.baeldung.com/thread-pool-java-and-guava
Especially check out ForkJoinPool: https://www.baeldung.com/java-fork-join, because this comes very close to a very similar problem.
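To see that a sleeping worker is simply occupied, here is a small sketch using a plain JDK cached pool rather than Spring's ThreadPoolTaskExecutor (which wraps a ThreadPoolExecutor). The class and method names are made up for the example, the sleep is shortened to 200 ms, and the extra latch is only there to guarantee that all tasks overlap.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class SleepingTasksDemo {
    /** Runs the given number of overlapping sleeping tasks and counts the threads used. */
    static int distinctThreadsUsed(int tasks) {
        ExecutorService pool = Executors.newCachedThreadPool();
        Set<Thread> threads = ConcurrentHashMap.newKeySet();
        CountDownLatch allStarted = new CountDownLatch(tasks);
        CountDownLatch done = new CountDownLatch(tasks);
        try {
            for (int i = 0; i < tasks; i++) {
                pool.execute(() -> {
                    threads.add(Thread.currentThread());
                    allStarted.countDown();
                    try {
                        allStarted.await(); // every task is running before any of them sleeps
                        Thread.sleep(200);  // stands in for the question's Thread.sleep(6000)
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                    done.countDown();
                });
            }
            done.await();
            return threads.size();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("threads used for 3 sleeping tasks: " + distinctThreadsUsed(3));
    }
}
```

Each overlapping task gets its own thread; none of them borrows or interrupts a thread that is asleep inside another task.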

What happens when no thread is free in the thread pool and we submit a task to the pool?

Will the thread creating method wait for a thread to get free?
Can I reduce the number of threads generated using thread pooling?

If you use a cached thread pool, the pool will create more threads. However this will only be the maximum needed at any one time and might be far less than the number of tasks you submit.
If you use a fixed size thread pool, it will never create more than a fixed number of threads, even if you give it more tasks than it can run at once. It will queue any tasks which are waiting.
Will the thread creating method wait for a thread to get free?
While you could create a queue which did this, this is not the default behaviour. A more common solution is to make the caller execute the task if this is required.
Can I reduce the number of threads generated using thread pooling?
Thread pooling is likely to produce far fewer threads than tasks, especially if you limit the number of threads.
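"Make the caller execute the task" corresponds to ThreadPoolExecutor.CallerRunsPolicy in the JDK. A minimal sketch, assuming a deliberately tiny pool (one thread, one queue slot) so that the third submission has nowhere to go; the class and method names are hypothetical.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

class CallerRunsDemo {
    /** Returns true if the overflow task was executed by the submitting thread itself. */
    static boolean overflowRunsInCaller() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0, TimeUnit.SECONDS,
                new ArrayBlockingQueue<>(1),
                new ThreadPoolExecutor.CallerRunsPolicy());
        CountDownLatch release = new CountDownLatch(1);
        AtomicReference<Thread> overflowThread = new AtomicReference<>();
        Runnable blocker = () -> {
            try {
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        try {
            pool.execute(blocker); // occupies the only worker thread
            pool.execute(blocker); // fills the single queue slot
            // Pool and queue are both full, so the policy runs this task right here.
            pool.execute(() -> overflowThread.set(Thread.currentThread()));
            return overflowThread.get() == Thread.currentThread();
        } finally {
            release.countDown();
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("overflow ran in caller: " + overflowRunsInCaller());
    }
}
```

Because the submitting thread is busy running the task, it cannot submit more work in the meantime, which acts as a simple form of back-pressure.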

Will the thread creating method wait for a thread to get free?
That contradicts your title. You'd normally submit a task and the pool would pass that task to a worker thread when one is available. So you'd not create a thread but submit a task. Whether you wait for the task to be executed or just trigger asynchronous execution (which in most cases would be the default) depends on your system and requirements.
Can I reduce the number of threads generated using thread pooling?
Thread pooling is often used to reduce the number of threads created, i.e. instead of having a thread per task you have a defined (maximum) number of worker threads and thus if #tasks > max threads in pool you'll reduce the number of threads needed.

From the ThreadPoolExecutor documentation:
A ThreadPoolExecutor will automatically adjust the pool size (see
getPoolSize()) according to the bounds set by corePoolSize (see
getCorePoolSize()) and maximumPoolSize (see getMaximumPoolSize()).
When a new task is submitted in method execute(java.lang.Runnable),
and fewer than corePoolSize threads are running, a new thread is
created to handle the request, even if other worker threads are idle.
If there are more than corePoolSize but less than maximumPoolSize
threads running, a new thread will be created only if the queue is
full. By setting corePoolSize and maximumPoolSize the same, you create
a fixed-size thread pool. By setting maximumPoolSize to an essentially
unbounded value such as Integer.MAX_VALUE, you allow the pool to
accommodate an arbitrary number of concurrent tasks. Most typically,
core and maximum pool sizes are set only upon construction, but they
may also be changed dynamically using setCorePoolSize(int) and
setMaximumPoolSize(int).
Basically, you can set two sizes: the 'core' size and the 'max' size. When tasks are submitted, if there are fewer than 'core' threads, a new thread will be created to execute that task. Once 'core' threads exist, new tasks are placed in the queue and picked up by whichever existing thread becomes free. Only when the queue is full will more threads be created, up to 'max' size. Once 'max' threads are running and the queue is full, no more threads will be created and further tasks are rejected.
In the general case, there is no 'right' way that thread pools work. Any given implementation could be used: a fixed size thread pool that always has X threads, or a thread pool that always grows up to a maximum limit, etc.

ThreadPoolExecutor's submit method throws RejectedExecutionException if the task cannot be scheduled for execution.
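The growth rule from the quoted documentation, and the rejection at the end, can be traced with a very small pool. This is a sketch under assumed settings (core 1, max 2, a queue with one slot, tasks that block until released); PoolGrowthDemo and trace() are names invented for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class PoolGrowthDemo {
    /** Submits four blocking tasks; records "poolSize/queueSize" or "rejected" after each. */
    static List<String> trace() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 2, 1, TimeUnit.SECONDS, new ArrayBlockingQueue<>(1));
        CountDownLatch release = new CountDownLatch(1);
        Runnable blocker = () -> {
            try {
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        List<String> trace = new ArrayList<>();
        try {
            for (int i = 0; i < 4; i++) {
                try {
                    pool.execute(blocker);
                    trace.add(pool.getPoolSize() + "/" + pool.getQueue().size());
                } catch (RejectedExecutionException e) {
                    trace.add("rejected"); // default handler is AbortPolicy
                }
            }
            return trace;
        } finally {
            release.countDown();
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(trace()); // prints [1/0, 1/1, 2/1, rejected]
    }
}
```

The second task is queued rather than given a new thread; the second thread only appears once the queue is full, and the fourth task is rejected because both the queue and the pool are at their limits.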

Java Thread Pool Executor Monitoring

The ThreadPoolExecutor class in the Java SE 6 docs has the following method:
public int getActiveCount()
Returns the approximate number of threads
that are actively executing tasks.
What is the meaning of approximate and actively executing here?
Is there any guarantee that, if before, during and after the call to getActiveCount()
N threads have been allotted from the pool for task execution, and
None of these N threads are available for further task assignment,
the integer returned by getActiveCount() will be exactly N?
If getActiveCount() does not provide this guarantee, is there any other way to obtain this information in a more precise manner?
Prior SO Questions:
I have looked at Thread Pool Executor Monitoring Requirement and How to tell if there is an available thread in a thread pool in java, but they do not answer my queries.

It is approximate because the number could change during the calculation; you're multi-threading. Before the calculation completes, a different number of threads could be active (a thread that was inactive when checked is now active).
When you say "particular time instance" ... that doesn't really mean anything. The calculation isn't instantaneous. The number you get back is the best possible answer given the fluid/dynamic nature of the pool.
If by chance the calculation starts and completes while none of the threads in the pool change state, then yes that number is "exact" but only until a thread in the pool changes state, which means it might only be "exact" for 1ms (or less).

I think you're possibly confusing things by introducing a notion of "rejoining the pool" that doesn't really exist in the implementation of ThreadPoolExecutor.
Each worker thread sits continually waiting for a task (it effectively sits on top of a blocking queue). Each time a task comes in to its queue, that worker is "locked", then any pre-task housekeeping is run, then the actual task is run, then post-task housekeeping, then the worker is "unlocked".
getActiveCount() gives you the number of threads in the "locked" state: notice that this means that they could actually be conducting 'housekeeping' at the precise moment of calling getActiveCount(), but that to be counted as 'active', there must be a task actually involved, either about to be, currently, or having just been executed.
Whether that equates with your notion of "rejoining the pool" I'm not sure-- as I say, you seem to be inventing a notion that strictly speaking doesn't exist from the point of view of ThreadPoolExecutor.
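When nothing changes state during the call, the count is exact. A small sketch of that situation, assuming two tasks that are held inside run() by a latch while the count is read; the class name and the pool size of 4 are arbitrary choices for the example.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class ActiveCountDemo {
    /** Returns {active count while two tasks are held, active count after termination}. */
    static int[] activeCounts() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                4, 4, 0, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
        CountDownLatch started = new CountDownLatch(2);
        CountDownLatch release = new CountDownLatch(1);
        Runnable task = () -> {
            started.countDown();
            try {
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        try {
            pool.execute(task);
            pool.execute(task);
            started.await();                       // both tasks are now inside run()
            int whileBusy = pool.getActiveCount(); // no worker can change state right now
            release.countDown();
            pool.shutdown();
            pool.awaitTermination(10, TimeUnit.SECONDS);
            int afterDrain = pool.getActiveCount();
            return new int[] { whileBusy, afterDrain };
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        int[] counts = activeCounts();
        System.out.println("while busy: " + counts[0] + ", after drain: " + counts[1]);
    }
}
```

In a live system tasks start and finish while the count is being taken, so the same call can only be read as a snapshot.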

Very few threads in the thread pool of ThreadPoolExecutor actively executing tasks

I am using a ThreadPoolExecutor with a corePoolSize = maxPoolSize = queueSize = 15
Every incoming request spawns 7 tasks, to be executed with this thread pool.
Even though each of the individual tasks, on being scheduled, takes less than 3 seconds, the overall request takes much longer.
I suspected that the system is falling short of threads and tasks being queued.
I logged the following information for every incoming request.
getActiveCount()
getLargestPoolSize()
getPoolSize()
getQueue().size()
I notice that the system is not falling short of threads.
getPoolSize and getLargestPoolSize values are constantly at 15 - This is as expected.
getQueue().size() is always 0 - so no tasks are getting queued.
getActiveCount() values are always between 1-2.
Why aren't the rest of the threads in the pool working ?
Is "getActiveCount()-Returns the approximate number of threads that are actively executing tasks." the right API to use ?

As @Thomas suggests, the pool is creating threads as required so if you only give the pool 1-2 tasks to do at once, it will only have 1-2 threads active. You need to feed more tasks to it at once if you want it to be busier.

I don't know Java's thread pool that well but generally you'd only create as many threads as your machine has cores (or hardware threads) available. If you are running on a dual core machine 2 active threads are a sensible value.

Have you tried to run this in debug mode in eclipse? It usually shows all the threads allocated as well as their statuses.
I suspect that in your case the following scenario takes place: you have a thread pool, and it has allocated a number of threads to process the tasks you are submitting to it. But once a thread in the pool has completed its task, it does not terminate. Instead it switches its state to WAITING (or TIMED_WAITING if a keep-alive timeout applies) and waits until the next task is handed to it for processing.
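Which state an idle worker ends up in can be checked without a debugger. The sketch below assumes a fixed pool with one thread (so no keep-alive timeout applies to it); IdleWorkerStateDemo is a made-up name, and the short polling loop only gives the worker time to go back to the queue.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class IdleWorkerStateDemo {
    /** Returns the state of a fixed-pool worker thread once it has gone idle. */
    static Thread.State idleWorkerState() {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        try {
            // Run one task just to learn which thread the pool created.
            Callable<Thread> whoAmI = Thread::currentThread;
            Thread worker = pool.submit(whoAmI).get();
            // Give the worker a moment to return to the work queue and park there.
            for (int i = 0; i < 100 && worker.getState() != Thread.State.WAITING; i++) {
                Thread.sleep(20);
            }
            return worker.getState();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("idle worker state: " + idleWorkerState());
    }
}
```

The thread is still alive and still counted by getPoolSize(), but it is not counted by getActiveCount() because it is not running a task.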

ThreadPoolExecutor - ArrayBlockingQueue ... to wait before it removes an element from the Queue

I am trying to tune a thread pool set up as follows:
A thread pool with just 1 thread [corePoolSize = 0, maxPoolSize = 1]
The queue used is an ArrayBlockingQueue
Queue size = 20
Background:
The thread tries to read a request and perform an operation on it.
HOWEVER, the number of requests has eventually increased so much that the thread is always busy and consumes one whole CPU, which makes it a resource hog.
What I want to do instead is sample the requests at intervals and process only those. Other requests can be safely ignored.
What I would have to do is put a sleep in the "operation" function, so that for each task the thread sleeps for some time and releases the CPU.
Question:
However, I was wondering if there is a way to use a queue which itself sleeps for some time before it hands over the next element. This would be ideal, since sleeping in the middle of a task and leaving its execution incomplete just doesn't sound like the best approach to me.
Please let me know if you have any other suggestions for these tasks as well.
Thanks.
Edit:
I have added a follow-up question here
corrected the maxpool size to be 1 [written in a haste] .. thanks tim for pointing it out.

No, you can't make the thread sleep while it's in the pool. If there's a task in the queue, it will be executed.
Pausing within a queued task is the only way to force the thread to be idle in spite of queued tasks. Now, the "sleep" doesn't have to be in the same task as the "work"—you could queue a separate rest task after each real task, which might make for a cleaner implementation. More importantly, if the work is a Callable that returns a result, separating into two tasks will allow you to obtain the result as soon as possible.
As a refinement, rather than sleeping for a fixed interval between every task, you could "throttle" execution to a specified rate. This would allow you to avoid waiting unnecessarily between tasks, yet avoid executing too many tasks within a specified time interval. You can read another answer of mine for a simple way to implement this with a DelayQueue.
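For the "sample at intervals, ignore the rest" requirement from the question there is also a simpler alternative to a DelayQueue, shown here as a rough sketch and not as the approach from the linked answer: keep only the most recent request and let a scheduled task pick it up periodically. LatestRequestSampler and its methods are hypothetical names introduced for this example.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Consumer;

class LatestRequestSampler<T> {
    private final AtomicReference<T> latest = new AtomicReference<>();
    private final Consumer<T> handler;

    LatestRequestSampler(Consumer<T> handler) {
        this.handler = handler;
    }

    /** Called by producers; overwrites any request that has not been sampled yet. */
    void offer(T request) {
        latest.set(request);
    }

    /** Handles the most recent unsampled request, if any; returns true if one was handled. */
    boolean sampleOnce() {
        T request = latest.getAndSet(null);
        if (request == null) {
            return false;
        }
        handler.accept(request);
        return true;
    }

    /** Starts sampling on one background thread; the caller shuts the scheduler down. */
    ScheduledExecutorService start(long period, TimeUnit unit) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleWithFixedDelay(this::sampleOnce, period, period, unit);
        return scheduler;
    }
}
```

Requests that arrive between two samples are overwritten and never processed, and the worker thread is idle between samples instead of sleeping inside a task.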

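You could subclass ThreadPoolExecutor and override beforeExecute to sleep for some time:
@Override
protected void beforeExecute(Thread t, Runnable r) {
    try {
        // beforeExecute runs on worker thread t, so this sleeps the correct thread (see Javadoc)
        Thread.sleep(millis); // millis: the pause you want before each task
    } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // keep the interrupt visible instead of swallowing it
    }
    super.beforeExecute(t, r);
}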
But see AngerClown's comment about artificially slowing down the queue probably not being a good idea.
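For completeness, a self-contained version of that idea might look like the sketch below. ThrottledExecutor, its single-thread configuration and the pauseMillis field are assumptions made for the example; only beforeExecute itself comes from ThreadPoolExecutor.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class ThrottledExecutor extends ThreadPoolExecutor {
    private final long pauseMillis;

    ThrottledExecutor(long pauseMillis) {
        super(1, 1, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
        this.pauseMillis = pauseMillis;
    }

    @Override
    protected void beforeExecute(Thread t, Runnable r) {
        try {
            Thread.sleep(pauseMillis); // runs on the worker thread t, before r starts
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        super.beforeExecute(t, r);
    }

    /** Runs one no-op task and returns the submit-to-completion time in milliseconds. */
    static long elapsedMillisForOneTask(long pauseMillis) {
        ThrottledExecutor pool = new ThrottledExecutor(pauseMillis);
        long start = System.nanoTime();
        try {
            pool.submit(() -> { }).get();
            return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("one task took about " + elapsedMillisForOneTask(100) + " ms");
    }
}
```

Every task pays the pause, including the first one after a long idle period, which is one reason a rate-based throttle is usually the better fit.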

This might not work for you, but you could try setting the executor's thread priority to low.
Essentially, create the ThreadPoolExecutor with a custom ThreadFactory. Have the ThreadFactory.newThread() method return Threads with a priority of Thread.MIN_PRIORITY. The executor's threads will then tend to be scheduled only when there is an available core to run them, although thread priority is only a hint to the scheduler.
The implication: On a system that strictly uses time slicing, you will only be given a time slice to execute if there is no other Thread in the entire program with a greater priority asking to be scheduled. Depending on how busy your application really is, you might get scheduled every once in awhile, or you might not be scheduled at all.
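A minimal sketch of such a factory, assuming it simply wraps Executors.defaultThreadFactory(); the class and method names are invented for the example. How much effect the priority has depends on the JVM and the operating system.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;

class LowPriorityThreads {
    /** Wraps the default factory and lowers the priority of every thread it creates. */
    static ThreadFactory lowPriorityFactory() {
        ThreadFactory defaults = Executors.defaultThreadFactory();
        return runnable -> {
            Thread t = defaults.newThread(runnable);
            t.setPriority(Thread.MIN_PRIORITY);
            return t;
        };
    }

    /** Returns the priority seen from inside a task run by a pool using that factory. */
    static int observedWorkerPriority() {
        ExecutorService pool = Executors.newFixedThreadPool(1, lowPriorityFactory());
        try {
            Callable<Integer> probe = () -> Thread.currentThread().getPriority();
            return pool.submit(probe).get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("worker priority: " + observedWorkerPriority());
    }
}
```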

The reason the thread is consuming 100% CPU is that it is given more work than it can process. Adding a delay between tasks is not going to fix this problem. It will just make things worse.
Instead you should look at WHY your tasks are consuming so much CPU, e.g. with a profiler, and change them so that they consume less CPU, until you find that your thread can keep up and no longer consumes 100% CPU.
