How accurate is the task scheduling of a ScheduledThreadPoolExecutor - java

I was reading ScheduledThreadPoolExecutor JavaDoc and came across the following thing:
Delayed tasks execute no sooner than they are enabled, but without any
real-time guarantees about when, after they are enabled, they will
commence. Tasks scheduled for exactly the same execution time are
enabled in first-in-first-out (FIFO) order of submission.
So, if I write something like this:
ScheduledExecutorService ses = Executors.newScheduledThreadPool(4); //uses ScheduledThreadPoolExecutor internally
Callable<Integer> c;
//initialize c
ses.schedule(c, 10, TimeUnit.SECONDS);
there's no any guarantees that the execution of the callable will start in 10 seconds after the scheduling? As far as I got, the specification allows it to execute even in hour after scheduleing (without any real-time guarantees, as stated in the documentation).
How does it work in practice? Should I excepct some really long delay?

Your understanding is correct. The Executor is not claiming to be a real-time system with any sort of timing guarantees. The only thing it will guarantee is that it doesn't run tasks too early.
In practice, the timing of well-tuned Executors are very accurate. Typically they start within 10ms after the scheduled time from my experience. The only time you will see scheduling get pushed back very far is if your Executor is lacking the appropriate resources to run it's workload. So this is more of a tuning issue.
Realistically, if you give your Executor enough resources to work with, the timing will be quite accurate.
Some things that you don't want to do with an Executor is use the scheduling as part of a rate-based calculation. For example, if you schedule a task to run every 1 second and you use that to compute <somemetric> per second without factoring in what time the task is actually running at.
Another thing to be mindful of is the cost of context switching. If you schedule multiple tasks to run every 1ms, the Executor will not be able to keep up with running your task and context switching everyone 1ms.

Related

Single ScheduledExecutorService instance vs Multiple ScheduledExecutorService instances

I have a service which schedules async tasks using ScheduledExecutorService for the user. Each user will trigger the service to schedule two tasks. (The 1st Task schedule the 2nd task with a fixed delay, such as 10 seconds interval)
pseudocode code illustration:
task1Future = threadPoolTaskScheduler.schedule(task1);
for(int i = 0; i< 10000; ++i) {
task2Future = threadPoolTaskScheduler.schedule(task2);
task2Future.get(); // Takes long time
Thread.sleep(10);
}
task1.Future.get();
Suppose I have a potential of 10000 users using the service at the same time, we can have two kinds of ScheduledExecutorService configuration for my service:
A single ScheduledExecutorService for all the users.
Create a ScheduledExecutorService for each user.
What I can think about the first method:
Pros:
Easy to control the number of threads in the thread pool.
Avoid creating new threads for scheduled tasks.
Cons:
Always keeping multiple number of threads available could waste computer resources.
May cause the hang of the service because of lacking available threads. (For example, set the thread pool size to 10, and then there is a 100 person using the service the same time, then after entering the 1st task and it tries to schedule the 2nd task, then finding out there is no thread available for scheduling the 2nd task)
What I can think about the second method
Pros:
Avoiding always keep many threads available when the number of user is small.
Can always provide threads for a large number of simultaneously usage.
Cons:
Creating new threads creates overheads.
Don't know how to control the number of maximum threads for the service. May cause the RAM out of space.
Any ideas about which way is better?
Single ScheduledExecutorService drives many tasks
The entire point of a ScheduledExecutorService is to maintain a collection of tasks to be executed after a certain amount of time elapses.
So given the scenario you describe, you need only a single ScheduledExecutorService object. Submit your 10,000 tasks to that one object. Each task will be executed approximately when its designated delay elapses. Simple, and easy.
Thread pool size
The real issue is deciding how many threads to assign to the ScheduledExecutorService.
Threads, as currently implemented in the OpenJDK project, are mapped directly to host OS threads. This makes them relatively heavyweight in terms of CPU and memory usage. In other words, currently Java threads are “expensive”.
There is no simple easy answer to calculating thread pool size. The optimal number is the least amount of threads that can keep up with the workload without over-burdening the host machine’s limited number of cores and limited memory. If you search Stack Overflow, you’ll find many discussions on the topic of deciding how many threads to use in a pool.
Project Loom
And keep tabs with the progress of Project Loom and its promise to bring virtual threads to Java. That technology has the potential to radically alter the calculus of deciding thread pool size. Virtual threads will be more efficient with CPU and with memory. In other words, virtual threads will be quite “cheap”, “inexpensive”.
How executor service works
You said:
entering the 1st task and it tries to schedule the 2nd task, then finding out there is no thread available for scheduling the 2nd task
That is not how the scheduled executor service (SES) works.
If a task being currently executed by a SES needs to schedule itself or some other task to later execution, that submitted task is added to the queue maintained internally by the SES. There is no need to have a thread immediately available. Nothing happens immediately except that queue addition. Later, when the added task’s specified delay has elapsed, the SES looks for an available thread in its thread-pool to execute that task that was queued a while back in time.
You seem to feel a need to manage the time of each task’s execution on certain threads. But that is the job of the scheduled executor service. The SES tracks the tasks submitted for execution, notices when their specified delay elapses, and schedules their execution on a thread from its managed pool of threads. You don’t need to manage any of that. Your only challenge is to assign an appropriate number of threads to the pool.
Multiple executor services
You commented:
why don't use multiple ScheduledExecutorService instances
Because in your scenario, there is no benefit. Your Question implies that you have many tasks all similar with none being prioritized. In such a case, just use one executor service. One scheduled executor service with 12 threads will get the same amount of work accomplished as 3 services with 4 threads each.
As for excess threads, they are not a burden. Any thread without a task to execute uses virtually no CPU time. A pool may or may not choose to close some unused threads after a while. But such a policy is up to the implementation of the thread pool of the executor service, and is transparent to us as calling programmers.
If the scenario were different, where some of the tasks block for long periods of time, or where you need to prioritize certain tasks, then you may want to segregate those into a separate executor service.
In today's Java (before Project Loom with virtual threads), when code in a thread blocks, that thread sits there doing nothing but waiting to unblock. Blocking means your code is performing an operation that awaits a response. For example, making network calls to a socket or web service blocks, writing to storage blocks, and accessing an external database blocks. Ideally, you would not write code that blocks for long periods of time. But sometimes you must.
In such a case where some tasks run long, or conversely you have some tasks that must be prioritized for fast execution, then yes, use multiple executor services.
For example, say you have a 16-core machine with not much else running except your Java app. You might have one executor service with a thread pool size of 4 maximum for long-running tasks, one executor service with a thread pool with a size of 7 maximum for many run-of-the-mill tasks, and a third executor service with a thread pool maximum size of 2 for very few tasks that run short but must run quickly. (The numbers here are arbitrary examples, not a recommendation.)
Other approaches
As commented, there are other frameworks for managing concurrency. The ScheduledExecutorService discussed here is general purpose.
For example, Swing, JavaFX, Spring, and Jakarta EE each have their own concurrency management. Consider using those where approriate to your particular project.

Short but frequent jobs: HandlerThread or ThreadPoolExecutor?

First of all, I could not determine what the title should be, so if it's not specific enough, the question itself will be.
We have an application that uses a foreground service and stays alive forever, and in this service, there are frequent database access jobs, network access jobs and some more, that needs to run on background threads. One job itself consumes a small amount of time, but the jobs themselves are frequent. Obviously, they need to run on worker threads, so I'm here to ask which design we should follow.
HandlerThread is a structure that creates a singular thread and uses a queue to execute tasks but always loops and waits for messages which consumes power, while ThreadPoolExecutor creates multiple threads for each job and deletes threads when the jobs are done, but because of too many threads there could be leaks, or out-of-memory even. The job count may be 5, or it may be 20, depending on how the user acts in a certain way. And, between 2 jobs, there can be a 5 second gap, or a day gap, totally depending on user. But, remember, the application stays alive forever and waits for these jobs to execute.
So, for this specific occasion, which one is better to use? A thread pool executor or a handler thread? Any advice is appreciated, thanks.
Caveat: I do not do Android work, so I am no expert there. My opinions here are based a quick reading of Android documentation.
tl;dr
➥ Use Executors rather than HandlerThread.
The Executors framework is more modern, flexible, and powerful than the legacy Thread facility used by HandlerThread. Everything you can do in HandlerThread you can do better with executors.
Differences
One big difference between HandlerThread and ThreadPoolExecutor is that the first comes from Android while the second comes from Java. So if you'll be doing other work with Java, you might not want to get in the habit of using HandlerThread.
Another big difference is age. The android.os.HandlerThread class inherits from java.lang.Thread, and dates back to the original Android API level 1. While nice for its time, the Thread facility in Java is limited in its design. That facility was supplanted by the more modern, flexible, and powerful Executors framework in later Java.
Executors
Your Question is not clear about whether these are recurring jobs or sporadically scheduled. Either can be handled with Executors.
For jobs that run once at a specific time, and for recurring scheduled jobs, use a ScheduledExecutorService. You can schedule a job to run once at a certain time by specifying a delay, a span of time to wait until execution. For repeated jobs, you can specify an amount to wait, then run, then wait, then run, and so on. I'll not address this further, as you seem to be talking about sporadic immediate jobs rather than scheduled or repeating jobs. If interested, search Stack Overflow as ScheduledExecutorService has been covered many times already on Stack Overflow.
Single thread pool
HandlerThread is a structure that creates a singular thread
If you want to recreate that single thread behavior, use a thread pool consisting of only a single thread.
ExecutorService es = Executors.newSingleThreadExecutor() ;
Make your tasks. Implement either Runnable or Callable using (a) a class implementing either interface, (b) without defining a class, via lambda syntax or conventional syntax.
Conventional syntax.
Runnable sayHelloJob = new Runnable()
{
#Override
public void run ( )
{
System.out.println( "Hello. " + Instant.now() );
}
};
Lambda syntax.
Runnable sayBonjourJob = ( ) -> System.out.println( "Bonjour. " + Instant.now() );
Submit as many of these jobs to the executor service as you wish.
es.submit( sayHelloJob ) ;
es.submit( sayBonjourJob ) ;
Notice that the submit method returns a Future. With that Future object, you can check if the computation is complete, wait for its completion, or retrieve the result of the computation. Or you may choose to ignore the Future object as seen in the code above.
Fixed thread pool
If you want multiple thread behavior, just create your executor with a different kind of thread pool.
A fixed thread pool has a maximum number of threads servicing a single queue of submitted jobs (Runnable or Callable objects). The threads continue to live, and are replaced as needed in case of failure.
ExecutorService es = Executors.newFixedThreadPool​( 3 ) ; // Specify number of threads.
The rest of the code remains the same. That is the beauty of using the ExecutorService interface: You can change the implementation of the executor service to get difference behavior while not breaking your code that calls upon that executor service.
Cached thread pool
Your needs may be better service by a cached thread pool. Rather than immediately creating and maintaining a certain number of threads as the fixed thread pool does, this pool creates threads only as needed, up to a maximum. When a thread is done, and resting for over a minute, the thread is terminated. As the Javadoc notes, this is ideal for “many short-lived asynchronous tasks” such as yours. But notice that there is no upper limit of threads that may be running simultaneously. If the nature of your app is such that you may see often spikes of many jobs arriving simultaneously, you may want to use a different implementation other than cached thread pool.
ExecutorService es = Executors.newCachedThreadPool() ;
Managing executors and threads
but because of too many threads there could be leaks, or out-of-memory even
It is the job of you the programmer and your sysadmin to not overburden the production server. You need to monitor performance in production. The managagement is easy enough to perform, as you control the number of threads available in the thread pool backing your executor service.
We have an application that uses a foreground service and stays alive forever
Of course your app does eventually come to end, being shutdown. When that happens, be sure to shutdown your executor and its backing thread pool. Otherwise the threads may survive, and continue indefinitely. Be sure to use the life cycle hooks of your app’s execution environment to detect and react to the app shutting down.
The job count may be 5, or it may be 20, depending on how the user acts in a certain way.
Jobs submitted to an executor service are buffered up until they can be scheduled on a thread for execution. So you may have a thread pool of, for example, 3 threads and 20 waiting jobs. No problem. The waiting jobs will be eventually executed when their time comes.
You may want to prioritize certain jobs, to be done ahead of lower priority jobs. One easy way to do this is to have two executor services. Each executor has its own backing thread pool. One executor is for the fewer but higher-priority jobs, while the other executor is for the many lower-priority jobs.
Remember that threads in a thread pool doing no work, on stand-by, have virtually no overhead in Java for either CPU or memory. So there is no downside to having a special higher-priority executor service sitting around and waiting for eventual jobs to arrive. The only concern is that your total number of all background threads and their workload not overwhelm your machine. Also, the implementation of the thread pool may well shut down unused threads after a period of disuse.
Don't really think its a question of the number of threads you are running, more how you want them run. If you want them run one at at time (i.e. you only want to execute on database query at a time) then use a HandlerThread. If you want multi-threading / a pool of threads, then use and Executor.
In my experience, leaks are really more down to how you have coded your threads, not really the chosen implementation.
Personally, I'd use a HandlerThread, here's a nice article on implementing them and how to avoid memory leaks ... Using HandlerThread in Android

Executing some code after a specific amount of time

I need to execute an action after a specific amount of time (for example 30 minutes after the app started up, if the app is still up).
What are my options and will it necessary means there's going to be one thread "lost" waiting for the 30 minutes to pass by?
Ideally, at program startup, I'd like to do something like the following (simplified on purpose) and then don't have to think about it anymore:
doIfStillUp( 30, new Runnable() {
....
});
So how should I go about implementing doIfStillUp(...)?
Should I use a TimerTask? The Executor framework?
Most importantly (it's for understanding purpose): does this mean there's going to be one thread lost idling for basically nothing during 30 minutes?
If there's going to be one thread "doing nothing", is this an issue? What if there are 10 000 threads (I'm being facetious here) "doing nothing"?
Note that I'm trying to understand the "big picture", not to solve a particular problem.
The Executor framework is a reasonable choice.
There's a schedule method that just takes a runnable and a delay time.
schedule(Runnable command,
long delay,
TimeUnit unit)
That's pretty straightforward. There won't necessarily be a thread blocked waiting on your task. You could use a ScheduledThreadPoolExecutor, as linked above that keeps X threads ready to run scheduled tasks.
You can imagine a data structure that holds the time at which a task should be run. A single thread can watch or set up these delays and can potentially watch thousands of them in a single thread. When the first time expires it'll run the task. Potentially using its own thread, potentially using 1 of X in the thread pool. When a new task is added or an existing task is finished it'll wait for the earliest time to arrive and then start the whole process again.
You should use a Timer. Its javadoc answers all your questions.
One thread is used for every timer, but the timer executes several tasks, sequentially. The timer tasks should be very short. If they aren't, consider using several timers.
Of course, the timer thread will be idle if it doesn't have any task to execute. An idle thread doesn't consume anything (or nearly anything), so I wouldn't worry about it. Anyway, you don't have many choices. 10000 threads doing nothing would of course be an issue, but that would mean that you instantiated one timer per task, which is wrong.
You can schedule task on java.util.Timer. For all timer tasks single timer thread will be created by java.util.Timer.
The builtin java timer is the straight away solution: http://download.oracle.com/javase/1,5.0/docs/api/java/util/Timer.html#schedule(java.util.TimerTask, long)

ThreadPoolExecutor - ArrayBlockingQueue ... to wait before it removes an element form the Queue

I am trying to Tune a thread which does the following:
A thread pool with just 1 thread [CorePoolSize =0, maxPoolSize = 1]
The Queue used is a ArrayBlockingQueue
Quesize = 20
BackGround:
The thread tries to read a request and perform an operation on it.
HOWEVER, eventually the requests have increased so much that the thread is always busy and consume 1 CPU which makes it a resource hog.
What I want to do it , instead sample the requests at intervals and process them . Other requests can be safely ignored.
What I would have to do is put a sleep in "operation" function so that for each task the thread sleeps for sometime and releases the CPU.
Quesiton:
However , I was wondering if there is a way to use a queue which basically itself sleeps for sometime before it reads the next element. This would be ideal since sleeping a task in the middle of execution and keeping the execution incomplete just doesn't sound the best to me.
Please let me know if you have any other suggestions as well for the tasks
Thanks.
Edit:
I have added a follow-up question here
corrected the maxpool size to be 1 [written in a haste] .. thanks tim for pointing it out.
No, you can't make the thread sleep while it's in the pool. If there's a task in the queue, it will be executed.
Pausing within a queued task is the only way to force the thread to be idle in spite of queued tasks. Now, the "sleep" doesn't have to be in the same task as the "work"—you could queue a separate rest task after each real task, which might make for a cleaner implementation. More importantly, if the work is a Callable that returns a result, separating into two tasks will allow you to obtain the result as soon as possible.
As a refinement, rather than sleeping for a fixed interval between every task, you could "throttle" execution to a specified rate. This would allow you to avoid waiting unnecessarily between tasks, yet avoid executing too many tasks within a specified time interval. You can read another answer of mine for a simple way to implement this with a DelayQueue.
You could subclass ThreadPool and override beforeExecute to sleep for some time:
#Overrides
protected void beforeExecute(Thread t,
Runnable r){
try{
Thread.sleep( millis); // will sleep the correct thread, see JavaDoc
}
catch (InterruptedException e){}
}
But see AngerClown's comment about artificially slowing down the queue probably not being a good idea.
This might not work for you, but you could try setting the executor's thread priority to low.
Essentially, create the ThreadPoolExecutor with a custom ThreadFactory. Have the ThreadFactory.newThread() method return Threads with a priority of Thread.MIN_PRIORITY. This will cause the executor service you use to only be scheduled if there is an available core to run it.
The implication: On a system that strictly uses time slicing, you will only be given a time slice to execute if there is no other Thread in the entire program with a greater priority asking to be scheduled. Depending on how busy your application really is, you might get scheduled every once in awhile, or you might not be scheduled at all.
The reason the thread is consuming 100% CPU is because it is given more work than it can process. Adding a delay between tasks is not going to fix this problem. It is just make things worse.
Instead you should look at WHY your tasks are consuming so much CPU e.g. with a profiler and change them so that consume less CPU until you find that your thread can keep up and it no longer consumes 100% cpu.

What is the best approach to schedule Callables which may time out in a polling situation?

I have several Callables which query for some JMX Beans, so each one may time out. I want to poll for values lets say every second. The most naive approach would be to start each in a separate thread, but I want to minimize the number of threads. Which options do I have to do it in a better way?
My interpretation is that you have a bunch of Callable objects which need to be polled at some interval. The trouble if you use a thread pool is that the pool will become contaminated with the slowest members, and your faster ones will be starved.
It sounds like you have control over the scheduling, so you might consider an exponential backoff approach. That is, after Callable X has run (and perhaps timed out), you wait 2 seconds instead of 1 second before rescheduling it. If it still fails, go to 4s, then 8s, etc. If you use a ScheduledThreadPoolExecutor, it comes with a built-in way to do this, allowing you to schedule your executions after a set delay.
If you set a constant timeout, this strategy will reduce your pool's susceptibility to monopolization by the slow ones. It is very difficult to get rid of this problem completely. Using a separate thread per queried object is really the only way to make sure you don't get starvation, and that can be very resource-intensive, as you say.
Another strategy is to bucket your pool into a fast one and a slow one. If an object is timing out (say more than N times), you move it to the slow pool. This keeps your fast pool fast, and while the slow ones all get in each others' way, at least they don't clog up the fast pool. If they have good statistics for a while, you can promote them to the fast pool again.
As soon as you submit a Callable you receive a Future - a handle to the future result. You can decide to wait for its completion for a given amount of time:
Future<String> future = executorService.submit(callable);
try {
future.get(1, TimeUnit.SECONDS);
} catch ( TimeoutException e ) {
future.cancel(true);
} catch ...
Calling get with a timeout allows you to receive an exception if the task has not been completed. This does not distinguish between not started tasks and started but not completed. On the other hand cancel will take a boolean parameter mayStopIfRunning so you can choose to e.g. only cancel tasks not yet scheduled.
i agree with robbotic...implementing a 'cachedThreadPool' will solve your problem as it will restrict the number of threads to the optimum level at the same time has timeouts which will free your un-utilized resources

Categories

Resources