ScheduledExecutorService multiple threads in parallel - java

I'm interested in using ScheduledExecutorService to spawn multiple threads for tasks when the previous task has not yet finished. For example, I need to process a file every 0.5s. The first task starts processing a file; after 0.5s, if the first thread is not finished, a second thread is spawned and starts processing a second file, and so on. This can be done with something like this:
ScheduledExecutorService executor = Executors.newScheduledThreadPool(4);
while (!executor.isShutdown()) {
executor.execute(task);
try {
Thread.sleep(500);
} catch (InterruptedException e) {
// handle
}
}
Now my question: why can't I do this with executor.scheduleAtFixedRate?
What I get is that if the first task takes longer than the period, the second task starts as soon as the first has finished, but no new thread is started even though the executor has a pool of threads. executor.scheduleWithFixedDelay is clear: it keeps the same gap between the end of one execution and the start of the next, however long each execution takes. So I have probably misunderstood the purpose of ScheduledExecutorService.
Maybe I should look at another kind of executor? Or just use the code I posted here? Any thoughts?

I've solved the problem by launching a nested anonymous runnable in each scheduled execution:
final ScheduledExecutorService service = Executors.newScheduledThreadPool(POOL_SIZE);
final Runnable command = new SlowRunnable();
service.scheduleAtFixedRate(
new Runnable() {
@Override
public void run() {
service.execute(command);
}
}, 0, 1, TimeUnit.SECONDS);
In this example, one thread executes a fast instruction at every interval, so it will surely have finished by the time the next interval expires. The remaining POOL_SIZE-1 threads execute SlowRunnable's run() in parallel, which may take longer than a single interval.
Please note that while I like this solution because it minimizes the code and reuses the same ScheduledExecutorService, the pool must be sized correctly and the approach may not be usable in every context: if SlowRunnable is so slow that up to POOL_SIZE jobs are executing together, there will be no thread left to run the scheduled task on time.
Also, if you set the interval to 1 TimeUnit.NANOSECONDS, even the execution of the main runnable will probably become too slow to keep up.
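One way around the sizing caveat above is to keep the scheduler and the workers in separate pools, so slow jobs can never starve the scheduled hand-off. The following is only a sketch under that assumption: the class name, the single-thread "ticker", the cached worker pool, and the latch-based stand-in for SlowRunnable are all illustrative choices, not part of the original answer.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

class SeparatePoolDispatch {
    // Returns true if `overlap` runs of the slow task were in flight at the same time.
    static boolean runsOverlap(int overlap, long periodMillis) {
        ScheduledExecutorService ticker = Executors.newSingleThreadScheduledExecutor();
        ExecutorService workers = Executors.newCachedThreadPool();
        CountDownLatch allStarted = new CountDownLatch(overlap);

        // Hypothetical stand-in for SlowRunnable: stays busy until enough other runs have started.
        Runnable slowTask = () -> {
            allStarted.countDown();
            try {
                allStarted.await(5, TimeUnit.SECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };

        // The scheduled job only hands work off, so it finishes well within the period.
        ticker.scheduleAtFixedRate(() -> workers.execute(slowTask),
                0, periodMillis, TimeUnit.MILLISECONDS);
        try {
            return allStarted.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            ticker.shutdownNow();
            workers.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("overlapping runs: " + runsOverlap(3, 100));
    }
}
```

A cached pool grows without bound, so in a real application you would probably cap it (for example with a fixed pool) and decide what should happen when all workers are busy.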

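scheduleAtFixedRate on its own will not do what you're looking for. It runs the task on a thread from the pool at the given interval, but according to the ScheduledExecutorService Javadoc, if an execution takes longer than its period, subsequent executions may start late but will not execute concurrently. A larger pool therefore only helps when several different tasks are scheduled; if you're running out of threads for those, adjust the pool size constraints as detailed in the ThreadPoolExecutor docs.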

Related

How not to start ScheduledExecutorService task if previous one is not finished

My problem is that we have to give it a fixed schedule time to make it start the task. Let's say I give it 10 seconds and my task has an average finish time of 10-15 seconds. After some time, the threads waiting in the queue cause huge memory consumption. If I use synchronized for the method, the above problem will occur. If I don't use synchronized, then I am wasting resources (CPU), because I don't need to run the task if the previous run has not finished. So I thought of a solution where the task calls itself recursively, but I believe recursive threads will add more memory problems... What should I do? In short, I just want to be able to call the task again when it has finished, not at a fixed time.
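public void myScheduledTask() {
    doJob(); // use a CountDownLatch to control waiting if necessary
    try {
        TimeUnit.SECONDS.sleep(x);
    } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return;
    }
    // option 1: start a new thread that calls this method again
    new Thread(() -> myScheduledTask()).start();
    // option 2 (instead of option 1): let an executor make the next call
    // executor.execute(() -> myScheduledTask());
}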
The option that sounds like what you're trying to accomplish:
ScheduledExecutorService executor = Executors.newScheduledThreadPool(count);
ScheduledFuture<?> future = executor.scheduleWithFixedDelay(
task,
delay,
delay,
TimeUnit.MILLISECONDS
);
This would start your task after delay milliseconds and then run it again delay milliseconds after each previous completion. count should be the number of threads you want to use; 1 is acceptable. This also lets you stop the task using the future.
The problems with your example: a) You are sleeping on an executor thread. Don't do this; let the executor handle the delay. If you were using a thread pool of 1, the executor couldn't do any other work while you're waiting. b) Starting a new thread takes control away from the executor. Just use the executor, and you keep some control over the execution.
If you really wanted to stick with the form you have:
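As a rough illustration of the fixed-delay call and of stopping the task through its future, here is a small self-contained sketch. The class name, the latch used to count runs, and the 5-second safety timeout are assumptions made for the demo only.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

class FixedDelayDemo {
    // Lets the task run `runs` times with a fixed delay between completions, then cancels it.
    static boolean runThenCancel(int runs, long delayMillis) {
        ScheduledExecutorService executor = Executors.newScheduledThreadPool(1);
        CountDownLatch done = new CountDownLatch(runs);
        // The "task" here just counts down; a real task would do the job instead.
        ScheduledFuture<?> future = executor.scheduleWithFixedDelay(
                done::countDown, delayMillis, delayMillis, TimeUnit.MILLISECONDS);
        try {
            boolean finished = done.await(5, TimeUnit.SECONDS);
            future.cancel(false); // stop further runs without interrupting a running one
            return finished && future.isCancelled();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("ran and cancelled: " + runThenCancel(3, 100));
    }
}
```

Because the next run is only scheduled after the previous one completes, runs never pile up in the queue, however long a single run takes.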
class RecurringTask implements Runnable{
@Override
public void run(){
doJob();
executor.schedule(this, delay, TimeUnit.MILLISECONDS);
}
}
Now you will be creating Futures that you never use, so it will be harder to control the execution of the task.
Create a static member in your task class: a Lock.
In doJob, skip the job if the lock is already acquired:
if (lock.tryLock()) {
try {
// do the job
} finally {
lock.unlock();
}
} else {
// log the fact you skipped the job
return;
}
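For completeness, the tryLock idea can be put into a runnable sketch like the one below. It assumes a ReentrantLock as the static Lock; the class and method names, and the latches used to hold the first job open, are hypothetical and exist only to show a second attempt being skipped.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantLock;

class SkipIfBusy {
    private static final ReentrantLock LOCK = new ReentrantLock();

    // Runs the job only if no other thread is currently running one; returns whether it ran.
    static boolean runIfIdle(Runnable job) {
        if (!LOCK.tryLock()) {
            return false; // another run is in progress, skip this one
        }
        try {
            job.run();
            return true;
        } finally {
            LOCK.unlock();
        }
    }

    // Holds a job open on another thread and reports whether a second attempt was skipped.
    static boolean secondAttemptSkipped() {
        CountDownLatch running = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        Thread first = new Thread(() -> runIfIdle(() -> {
            running.countDown();
            awaitQuietly(release);
        }));
        first.start();
        awaitQuietly(running);              // the first job now holds the lock
        boolean ran = runIfIdle(() -> { }); // attempted from the calling thread
        release.countDown();
        try {
            first.join();                   // wait until the lock is free again
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !ran;
    }

    private static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println("second attempt skipped: " + secondAttemptSkipped());
    }
}
```

Note that ReentrantLock is reentrant, so the skip only applies across threads; a second call from the thread that already holds the lock would succeed.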

Timer schedule vs scheduleAtFixedRate?

public class MyTimerTask extends TimerTask{
@Override
public void run() {
int i = 0;
try {
Thread.sleep(100000);
} catch (InterruptedException e) {
e.printStackTrace();
}
System.out.println("Run Me ~" + ++i);
System.out.println("Test");
}
}
Case 1 :-
TimerTask task = new MyTimerTask();
Timer timer = new Timer();
timer.schedule(task, 1000,6000); // line 1
System.out.println("End"); // here is the debug point.
My expectation of the schedule() method (as per my understanding of the Javadoc, where each execution is scheduled once the previous task execution has completed) is that two threads should be created after line 1: one for the timer, which spawns another thread for the tasks. Once the first task thread dies, another will be created, and so on. But at the debug point I see just one thread, corresponding to the Timer. Why is there no thread for the tasks, which implement Runnable?
Case 2 :-
TimerTask task = new MyTimerTask();
Timer timer = new Timer();
timer.scheduleAtFixedRate(task, 1000,6000); // line 1
System.out.println("End"); // here is the debug point.
My expectation of the scheduleAtFixedRate() method (as per my understanding of the Javadoc, where each execution is scheduled relative to the scheduled execution time of the initial execution) is that around 17 threads should be created after line 1 (don't pay much attention to the number 17; it can be more or less, but it should be greater than 2): one for the timer, which should spawn 16 other threads, one for each task. While the first task sleeps for 100 seconds, the Timer should create another thread for the next task, and similarly for the other tasks. But at the debug point I see just one thread, corresponding to the Timer. Here, too, I can see the tasks being executed sequentially. Why not 17 threads?
UPDATE: As per the scheduleAtFixedRate Javadoc, each execution is scheduled relative to the scheduled execution time of the initial execution. If an execution is delayed for any reason (such as garbage collection or other background activity), two or more executions will occur in rapid succession to "catch up." What does that mean? It gives me the impression that if the second task is due before the first task has completed, the timer will create a new thread for the due task. Isn't that so?
Timer uses the Active Object pattern under the hood, so there is only ever a single thread being used, and scheduling a new task on the timer adds that task to the thread's task queue.
The timer thread keeps track of all the tasks in its queue and sleeps until the next task is scheduled. Then it wakes up and executes the task itself by invoking task.run() directly, meaning that it does not spawn another thread to execute the code.
This also means that if you schedule two tasks to execute at the same time then, true to the Active Object pattern, they will be executed sequentially (one after another) on the same thread of control. This means the second task will execute after its scheduled time (but probably not by much).
Now, to unequivocally answer your question, here is the scheduling logic from Timer.class that schedules the next time that the task should be run again (from lines 262-272 here):
// set when the next task should be launched
if (task.fixedRate) {
// task is scheduled at fixed rate
task.when = task.when + task.period;
} else {
// task is scheduled at fixed delay
task.when = System.currentTimeMillis()
+ task.period;
}
// insert this task into queue
insertTask(task);
task.fixedRate is set to true if you use one of the timer.scheduleAtFixedRate() methods and is set to false if you use one of the timer.schedule() methods.
task.when is the "time" (ticks) that the task was scheduled to run.
task.period is the interval you passed to the timer.schedule*() method.
So, from the code we can see that if you use a fixed rate then a repeating task will be scheduled to run relative to when it was first started. If you don't use a fixed rate, then it is scheduled to run relative to when it was last run (which will drift relative to a fixed rate, unless your task is never delayed and takes less than one tick to execute).
This also means that if a task falls behind and it is on a fixed rate, then Timer will keep rescheduling the task for immediate execution until it catches up to the total number of times it should have run over a given period.
So if you have a task, say a ping() that you schedule to run at a fixed rate every 10ms and there is temporary blocking in the ping() method to where it takes 20ms to execute, then the Timer will call ping() again immediately after the previous call finished, and it will keep doing so until the given rate is achieved.
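To make the catch-up behaviour concrete, here is a small simulation of that rescheduling rule with a made-up clock. It is a simplified model, not the real Timer: it assumes a single repeating task, that the first run is due at time 0, and that "current time" in the fixed-delay branch is the moment the run starts. The class and method names are invented for this sketch.

```java
import java.util.Arrays;

class TimerRescheduleModel {
    // Returns the simulated start time of each run on a single timer thread.
    static long[] startTimes(boolean fixedRate, long period, long[] durations) {
        long[] starts = new long[durations.length];
        long when = 0; // scheduled time of the next run (task.when in the excerpt)
        long now = 0;  // simulated clock
        for (int i = 0; i < durations.length; i++) {
            // The single thread cannot start a run early, nor while it is still busy.
            long start = Math.max(now, when);
            starts[i] = start;
            // Same rule as the excerpt: previous schedule + period, or current time + period.
            when = fixedRate ? when + period : start + period;
            now = start + durations[i];
        }
        return starts;
    }

    public static void main(String[] args) {
        long[] durations = {20, 1, 1, 1, 1}; // the first ping() blocks for 20ms
        System.out.println("fixed rate:  " + Arrays.toString(startTimes(true, 10, durations)));
        System.out.println("fixed delay: " + Arrays.toString(startTimes(false, 10, durations)));
    }
}
```

With a 10ms period and a first run that takes 20ms, the fixed-rate schedule starts runs at 0, 20, 21, 30, 40 (one back-to-back run to catch up, then back on the original grid), while the fixed-delay schedule starts them at 0, 20, 30, 40, 50 and stays shifted.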
The javadoc for Timer says
Corresponding to each Timer object is a single background thread that
is used to execute all of the timer's tasks, sequentially.
Basically it holds a queue of tasks to which it adds when you schedule them. It uses one thread to iterate over the queue and execute the tasks.
The Timer class creates one thread per Timer instance, and this thread runs all tasks scheduled with Timer#schedule or Timer#scheduleAtFixedRate.
So, as you observed, the timer creates only one thread.
If a task's start time arrives before the preceding task has finished, the following task waits until the preceding task has finished.
So Timer never creates another thread, even when the preceding task hasn't finished and the start time of the following task has come.
So my advice is:
If you want tasks to start on time whether or not a preceding task has finished, use ScheduledThreadPoolExecutor instead of Timer.
Even if you don't need that, ScheduledThreadPoolExecutor is preferable to Timer because, for one thing, if a task run by a Timer throws a RuntimeException or Error, the timer thread dies and none of the tasks scheduled on that Timer will run again.
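To illustrate the executor side of that last point, the sketch below schedules a task that always throws and then a healthy task on the same pool. The names and timings are invented for the demo; the behaviour it relies on is that an exception only suppresses further runs of the task that threw it, and is reported through that task's future.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class FailureIsolation {
    // Returns true if a healthy task still runs after another task on the same pool threw.
    static boolean healthyTaskSurvives() {
        ScheduledExecutorService executor = Executors.newScheduledThreadPool(2);
        CountDownLatch healthyRuns = new CountDownLatch(3);
        try {
            ScheduledFuture<?> failing = executor.scheduleAtFixedRate(
                    () -> { throw new IllegalStateException("boom"); },
                    0, 10, TimeUnit.MILLISECONDS);
            boolean failed = false;
            try {
                failing.get(5, TimeUnit.SECONDS);
            } catch (ExecutionException e) {
                failed = true; // the exception surfaces here; the failing task is not rerun
            }
            // Scheduled only after the failure, so these runs provably happen afterwards.
            executor.scheduleAtFixedRate(healthyRuns::countDown, 0, 10, TimeUnit.MILLISECONDS);
            return failed && healthyRuns.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException | TimeoutException e) {
            return false;
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("healthy task survives: " + healthyTaskSurvives());
    }
}
```

If you want a periodic task to keep running despite failures, catch the exception inside its run() method instead of letting it escape.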
schedule will not execute the missed runs if the start time is in the past.
scheduleAtFixedRate will execute the missed runs if the start time is in the past. For the missed runs, the start time is calculated from the last run's end time. Once the missed runs have all been executed, the start time of each new regular run is calculated from the last run's start time.
BR Sanchez

How to implement a fixed rate poller with ScheduledExecutorService?

Given the following class:
public class Poller implements Runnable {
public static final int CORE_POOL_SIZE = 4;
public boolean running;
public ScheduledExecutorService ses;
public void startPolling() {
this.ses = Executors.newScheduledThreadPool(CORE_POOL_SIZE);
this.ses.scheduleAtFixedRate(this, 0, 1, TimeUnit.SECONDS);
}
public void run() {
running = true;
// ... Do something ...
running = false;
}
}
The ScheduledExecutorService has a core thread pool size of 4 but will more than one poller thread ever be created? Since this is passed into scheduleAtFixedRate does that mean there will only ever be one thread - or does something more complex happen behind the scenes?
And 2 bonus questions:-
Should running be static?
Is CORE_POOL_SIZE redundant?
The ScheduledExecutorService has a core thread pool size of 4 but will more than one poller thread ever be created?
It depends - if you run your program long enough, it will probably create 4 threads. If you quit after running your scheduled task only once or twice, you might only see 2 or 3 threads.
Why does it matter?
One way to monitor thread creation is to provide your own ThreadFactory:
this.ses = Executors.newScheduledThreadPool(CORE_POOL_SIZE, new ThreadFactory() {
@Override
public Thread newThread(Runnable r) {
System.out.println("Creating thread");
return new Thread(r);
}
});
Should running be static?
It depends on what you want to achieve... Since you are not really using it in your example, it is hard to say. You might need to make it static if, for example, you have several instances of Poller and you want them not to run concurrently.
Whether it is static or not, if you use it as a flag, you should make it volatile to ensure visibility.
Is CORE_POOL_SIZE redundant?
Not sure what you mean. It is a mandatory parameter, so you need to provide a value. If you know for sure that no two executions will run concurrently, you could have only one thread. That will also prevent concurrent execution (so if one scheduled task needs to start but another is already running, the new one will be delayed).
scheduleAtFixedRate (Runnable, long initialDelay, long period, TimeUnit timeunit)
This method schedules a task to be executed periodically. The task is executed the first time after the initialDelay, and then recurringly every time the period expires.
If any execution of the given task throws an exception, the task is no longer executed. If no exceptions are thrown, the task will continue to be executed until the ScheduledExecutorService is shut down.
If a task takes longer to execute than the period between its scheduled executions, the next execution will start after the current execution finishes. The scheduled task will not be executed by more than one thread at a time.
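That last rule can be checked with a short experiment. The sketch below (names, sleep time, and period are arbitrary choices for the demo) schedules one slow task at a fixed rate on a pool of 4 threads and records how many executions were ever active at the same time.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class NoOverlapDemo {
    // Runs a slow periodic task on a 4-thread pool and returns the highest
    // number of simultaneous executions that was observed.
    static int maxConcurrentRuns(int runs) {
        ScheduledExecutorService ses = Executors.newScheduledThreadPool(4);
        AtomicInteger active = new AtomicInteger();
        AtomicInteger maxActive = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(runs);
        Runnable slow = () -> {
            int now = active.incrementAndGet();
            maxActive.accumulateAndGet(now, Math::max);
            try {
                Thread.sleep(30); // three times the period
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                active.decrementAndGet();
                done.countDown();
            }
        };
        ses.scheduleAtFixedRate(slow, 0, 10, TimeUnit.MILLISECONDS);
        try {
            done.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            ses.shutdownNow();
        }
        return maxActive.get();
    }

    public static void main(String[] args) {
        System.out.println("max concurrent runs: " + maxConcurrentRuns(5));
    }
}
```

The extra pool threads only come into play when several different tasks are scheduled on the same executor.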
Why do you put your executor service in the Runnable class?
You should keep your ScheduledExecutorService separate, as a singleton, rather than as a field of the Runnable class.
Remember that a ScheduledExecutorService is a thread container, so when you write this
this.ses = Executors.newScheduledThreadPool(CORE_POOL_SIZE);
it creates a pool that can hold up to that many threads. Then, when you write this
this.ses.scheduleAtFixedRate(this, 0, 1, TimeUnit.SECONDS);
the ScheduledExecutorService picks an idle thread to run this class every 1 second. If you put a sleep in the run method that is longer than the period passed to the scheduler, it won't start another run on another thread until the first run has finished. So if you want multiple threads to run this Poller at the same time, create multiple Poller instances and pass them to the ScheduledExecutorService.
CORE_POOL_SIZE is not redundant in my view; it is good to have it as a constant whose value is taken from a configuration file.
Should running be static?
It depends on what you need. If you intend to create multiple instances of Poller, then you shouldn't.

How to schedule a group of threads (tasks) at variable times?

I have to launch 10 tasks asynchronously at variable times throughout the day, until a certain hour the next day. The closer I get to that time the next day, the more often I have to repeat these 10 tasks.
My question is: how should I manage this? What executors should I use? What is the best way to manage the memory?
I thought of using an Executors.newScheduledThreadPool that could start a thread pool process with the 10 tasks at variable times. The problem requires me to launch a new set of tasks even though the previous group of tasks has not finished (so I would probably trigger a new thread pool each time).
I am also thinking of using a sort of process registry to manage the different processes that have been launched. When a process is no longer in use, the registry can stop it.
And each time the tasks are done, I thought of flushing the runnables and stopping the thread pool. Is that a good solution overall?
The problem that may arise is saturating the memory with thread pools. Maybe put a time limit on each thread pool?
I guess you need one dispatching thread inside a plain non-scheduling pool and another pool for workers, something like this:
ExecutorService ex = Executors.newFixedThreadPool(1);
final ExecutorService workersPool = Executors.newCachedThreadPool();
ex.submit(new Runnable() {
public void run() {
try {
while (true) {
// determine if it's time to start workers
if (timeToStartWorkers()) {
workersPool.submit(new Worker(...));
workersPool.submit(new Worker(...));
...
}
// sleep till next time
Thread.sleep(timeTillNextCheck);
}
} catch (InterruptedException e) {
// handle exception
}
}
});
No need to recreate thread pools.
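A variation on the same idea, sketched here under a few assumptions, replaces the sleeping loop with a dispatcher that reschedules itself with a different delay each round. The class name, the array of delays, and the counting "worker" are placeholders; in the real application the delays would come from whatever rule makes the rounds more frequent as the deadline approaches.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class VariableTimeDispatcher {
    // Launches one group of tasks after each delay in turn and returns how many tasks ran.
    static int launchGroups(long[] delaysMillis, int tasksPerGroup) {
        ScheduledExecutorService dispatcher = Executors.newSingleThreadScheduledExecutor();
        ExecutorService workers = Executors.newCachedThreadPool();
        AtomicInteger executed = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(delaysMillis.length * tasksPerGroup);

        // Each round submits a group to the shared worker pool, then schedules the next round.
        class Round implements Runnable {
            final int index;
            Round(int index) { this.index = index; }
            @Override
            public void run() {
                for (int i = 0; i < tasksPerGroup; i++) {
                    workers.submit(() -> { // stand-in for the real worker
                        executed.incrementAndGet();
                        done.countDown();
                    });
                }
                if (index + 1 < delaysMillis.length) {
                    dispatcher.schedule(new Round(index + 1),
                            delaysMillis[index + 1], TimeUnit.MILLISECONDS);
                }
            }
        }

        if (delaysMillis.length > 0) {
            dispatcher.schedule(new Round(0), delaysMillis[0], TimeUnit.MILLISECONDS);
        }
        try {
            done.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            dispatcher.shutdown();
            workers.shutdown();
        }
        return executed.get();
    }

    public static void main(String[] args) {
        System.out.println("tasks executed: " + launchGroups(new long[]{50, 200, 100}, 10));
    }
}
```

A new round starts whether or not the previous group has finished, and both pools are created once and reused for the whole run.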

500 Worker Threads, what kind of thread pool?

I am wondering if this is the best way to do this. I have about 500 threads that run indefinitely, but each calls Thread.sleep for a minute after finishing one cycle of processing.
ExecutorService es = Executors.newFixedThreadPool(list.size()+1);
for (int i = 0; i < list.size(); i++) {
es.execute(coreAppVector.elementAt(i)); // coreAppVector is a Vector of objects that extend Thread
}
The code that is executing is really simple and basically just this
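class aThread extends Thread {
    @Override
    public void run() {
        try {
            while (true) {
                Thread.sleep(ONE_MINUTE);
                // Lots of computation every minute
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // stop when interrupted
        }
    }
}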
I do need a separate thread for each running task, so changing the architecture isn't an option. I tried making my threadPool size equal to Runtime.getRuntime().availableProcessors() which attempted to run all 500 threads, but only let 8 (4xhyperthreading) of them execute. The other threads wouldn't surrender and let other threads have their turn. I tried putting in a wait() and notify(), but still no luck. If anyone has a simple example or some tips, I would be grateful!
Well, the design is arguably flawed. The threads implement Genetic Programming (GP), a type of learning algorithm. Each thread analyzes advanced trends and makes predictions. If the thread ever completes, the learning is lost. That said, I was hoping that sleep() would allow me to share some of the resources while one thread isn't "learning".
So the actual requirement is: how can I schedule tasks that maintain state and run every 2 minutes, but control how many execute at one time?
If your threads are not terminating, this is the fault of the code within the thread, not the thread pool. For more detailed help you will need to post the code that is being executed.
Also, why do you put each Thread to sleep when it is done; wouldn't it be better just to let it complete?
Additionally, I think you are misusing the thread pool by having a number of threads equal to the number of tasks you wish to execute. The point of a thread pool is to put a constraint on the number of resources used; this approach is no better than not using a thread pool at all.
Finally, you don't need to pass instances of Thread to your ExecutorService, just instances of Runnable. ExecutorService maintains its own pool of threads which loop indefinitely, pulling work off of an internal queue (the work being the Runnables you submit).
Why not use a ScheduledExecutorService to schedule each task to run once per minute, instead of leaving all these threads idle for a full minute?
ScheduledExecutorService workers =
Executors.newScheduledThreadPool(Runtime.getRuntime().availableProcessors());
for (Runnable task : list) {
workers.scheduleWithFixedDelay(task, 0, 1, TimeUnit.MINUTES);
}
What do you mean by, "changing the architecture isn't an option"? If you mean that you can't modify your task at all (specifically, the tasks have to loop, instead of running once, and the call to Thread.sleep() can't be removed), then "good performance isn't an option," either.
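On the worry that state is lost when a thread completes: the state can live in the Runnable object rather than in a thread. The sketch below assumes a hypothetical Learner task whose "learning" is just a counter, and uses millisecond delays so that it finishes quickly; the executor reuses the same Learner instance for every run, and the pool size caps how many run at once.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class BoundedStatefulTasks {
    // Keeps its state between runs because the same instance is rescheduled each time.
    static class Learner implements Runnable {
        private int generation; // survives from one run to the next
        private final AtomicInteger active;
        private final AtomicInteger peak;
        private final CountDownLatch remaining;

        Learner(AtomicInteger active, AtomicInteger peak, CountDownLatch remaining) {
            this.active = active;
            this.peak = peak;
            this.remaining = remaining;
        }

        @Override
        public void run() {
            peak.accumulateAndGet(active.incrementAndGet(), Math::max);
            generation++; // stand-in for one cycle of learning
            active.decrementAndGet();
            remaining.countDown();
        }
    }

    // Schedules `tasks` learners on `poolSize` threads; returns the peak number running at once.
    static int peakConcurrency(int tasks, int poolSize, int totalRuns) {
        ScheduledExecutorService workers = Executors.newScheduledThreadPool(poolSize);
        AtomicInteger active = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        CountDownLatch remaining = new CountDownLatch(totalRuns);
        for (int i = 0; i < tasks; i++) {
            workers.scheduleWithFixedDelay(
                    new Learner(active, peak, remaining), 0, 5, TimeUnit.MILLISECONDS);
        }
        try {
            remaining.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            workers.shutdownNow();
        }
        return peak.get();
    }

    public static void main(String[] args) {
        System.out.println("peak concurrency: " + peakConcurrency(20, 4, 60));
    }
}
```

The peak can never exceed the pool size, so the pool size is the knob for "how many execute at one time".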
I'm not sure your code is semantically correct in how it's using a thread pool. ExecutorService creates and manages threads internally; a client should just supply an instance of Runnable, whose run() method will be executed in the context of one of the pooled threads. You can check my example. Also note that each running thread takes ~10Mb of system memory for the stack, and on Linux the mapping of Java to native threads is 1-to-1.
Instead of putting a thread to sleep, you should let it return and use a ThreadPoolExecutor to execute work posted every minute to your work queue.
To answer your question, what type of thread pool?
I posted my comments, but this really should address your issue. You have a computation that can take 2 seconds to complete. You have many tasks (500) that you want to be completed as fast as possible. The fastest possible throughput you can achieve, assuming there is no IO and/or network traffic, is with Runtime.getRuntime().availableProcessors() number of threads.
If you increase your number to 500 threads, then each task will be executing on its own thread, but the OS will schedule a thread out every so often to give its processor to another thread. That's 125 context switches at any given point. Each context switch will increase the amount of time for each task to run.
The big picture here is that adding more threads does NOT equal greater throughput when you are way over the number of processors.
Edit: A quick update. You don't need to sleep here. When you execute the 500 tasks with 8 processors, each task will complete in its 2 seconds and finish, and the thread it was running on will then take the next task and complete that one.
8 Threads is the max that your system can handle, any more and you are slowing yourself down with context switching.
Look at this article http://www.informit.com/articles/article.aspx?p=1339471&seqNum=4 It will give you an overview of how to do it.
This should do what you desire, but not what you asked for :-) You have to take out the Thread.sleep()
ScheduledRunnable.java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
public class ScheduledRunnable
{
public static void main(final String[] args)
{
final int numTasks = 10;
final ScheduledExecutorService ses = Executors.newScheduledThreadPool(Runtime.getRuntime().availableProcessors());
for (int i = 0; i < numTasks; i++)
{
ses.scheduleAtFixedRate(new MyRunnable(i), 0, 10, TimeUnit.SECONDS);
}
}
private static class MyRunnable implements Runnable
{
private int id;
private int numRuns;
private MyRunnable(final int id)
{
this.id = id;
this.numRuns = 0;
}
@Override
public void run()
{
this.numRuns += 1;
System.out.format("%d - %d\n", this.id, this.numRuns);
}
}
}
This schedules the Runnables every 10 SECONDS to show the behavior.
If you really need to wait a fixed amount of time AFTER processing is complete, you might need to play around with which .scheduleXXX method you use. I think scheduleWithFixedDelay is the one that waits N after each execution ends, while scheduleAtFixedRate aims to start a run every N regardless of how long the execution takes.
I do need a separate thread for each running task, so changing the architecture isn't an option.
If that is true (for example, because each task makes a call to an external blocking function), then create separate threads for them and start them. You can't use a thread pool with a limited number of threads, as a blocking function in one of the threads will prevent any other runnable from being run on it, and you don't gain much by creating a thread pool with one thread per task.
I tried making my threadPool size equal to Runtime.getRuntime().availableProcessors() which attempted to run all 500 threads, but only let 8 (4xhyperthreading) of them execute.
When you pass the Thread objects you are creating to thread pool, it only sees that they implement Runnable. Therefore it will run each Runnable to completion. Any loop which stops the run() method returning will not allow the next enqueued task to run; eg:
public static void main (String...args) {
ExecutorService executor = Executors.newFixedThreadPool(2);
for (int i = 0; i < 10; ++i) {
final int task = i;
executor.execute(new Runnable () {
private long lastRunTime = 0;
@Override
public void run () {
for (int iteration = 0; iteration < 4; )
{
if (System.currentTimeMillis() - this.lastRunTime > TIME_OUT)
{
// do your work here
++iteration;
System.out.printf("Task {%d} iteration {%d} thread {%s}.\n", task, iteration, Thread.currentThread());
this.lastRunTime = System.currentTimeMillis();
}
else
{
Thread.yield(); // otherwise, let other threads run
}
}
}
});
}
executor.shutdown();
}
prints out:
Task {0} iteration {1} thread {Thread[pool-1-thread-1,5,main]}.
Task {1} iteration {1} thread {Thread[pool-1-thread-2,5,main]}.
Task {0} iteration {2} thread {Thread[pool-1-thread-1,5,main]}.
Task {1} iteration {2} thread {Thread[pool-1-thread-2,5,main]}.
Task {0} iteration {3} thread {Thread[pool-1-thread-1,5,main]}.
Task {1} iteration {3} thread {Thread[pool-1-thread-2,5,main]}.
Task {0} iteration {4} thread {Thread[pool-1-thread-1,5,main]}.
Task {2} iteration {1} thread {Thread[pool-1-thread-1,5,main]}.
Task {1} iteration {4} thread {Thread[pool-1-thread-2,5,main]}.
Task {3} iteration {1} thread {Thread[pool-1-thread-2,5,main]}.
Task {2} iteration {2} thread {Thread[pool-1-thread-1,5,main]}.
Task {3} iteration {2} thread {Thread[pool-1-thread-2,5,main]}.
Task {2} iteration {3} thread {Thread[pool-1-thread-1,5,main]}.
Task {3} iteration {3} thread {Thread[pool-1-thread-2,5,main]}.
Task {2} iteration {4} thread {Thread[pool-1-thread-1,5,main]}.
...
showing that the first (thread pool size) tasks run to completion before the next tasks get scheduled.
What you need to do is create tasks which run for a while, then let other tasks run. Quite how you structure these depends on what you want to achieve:
whether you want all the tasks to run at the same time, then all wait for a minute, then all run at the same time again, or whether the tasks are not synchronised with each other
whether you really wanted each task to run at a one-minute interval
whether your tasks are potentially blocking or not, and so really require separate threads
what behaviour is expected if a task blocks longer than the expected window for running
what behaviour is expected if a task blocks longer than the repeat rate (blocks for more than one minute)
Depending on the answers to these, some combination of ScheduledExecutorService, semaphores or mutexes can be used to co-ordinate the tasks. The simplest case is the non-blocking, non-synchronous tasks, in which case use a ScheduledExecutorService directly to run your runnables once every minute.
Can you rewrite your project to use an agent-based concurrency framework, like Akka?
You can certainly find some improvement in throughput by reducing the number of threads to what the system can realistically handle. Are you open to changing the design of the thread a bit? It'll unburden the scheduler to put the sleeping ones in a queue instead of actually having hundreds of sleeping threads.
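class RepeatingWorker implements Runnable {
    private final ExecutorService executor;
    private long lastRan; // time of the last run, in epoch milliseconds
    // constructor takes your executor
    RepeatingWorker(ExecutorService executor) {
        this.executor = executor;
    }
    @Override
    public void run() {
        try {
            long now = System.currentTimeMillis();
            if (now > lastRan + ONE_MINUTE) {
                // do job
                lastRan = now;
            }
        } finally {
            executor.submit(this); // re-queue whether or not the job ran
        }
    }
}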
This preserves your core semantic of 'job repeats indefinitely, but waits at least one minute between executions', but now you can tune the thread pool to something the machine can handle, and the jobs that aren't working sit in a queue instead of loitering about in the scheduler as sleeping threads. There is some busy-wait behavior if nobody's actually doing anything, but I am assuming from your post that the entire purpose of the application is to run these threads and it's currently railing your processors. You may need to tune around that if room has to be made for other things :)
You need a semaphore.
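class AThread extends Thread {
    private final Semaphore sem;
    AThread(Semaphore sem) {
        this.sem = sem;
    }
    @Override
    public void run() {
        try {
            while (true) {
                Thread.sleep(ONE_MINUTE);
                sem.acquire();
                try {
                    // Lots of computation every minute
                } finally {
                    sem.release();
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // exit the loop when interrupted
        }
    }
}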
When instantiating the AThreads you need to pass the same semaphore instance:
Semaphore sem = new Semaphore(MAX_AVAILABLE, true);
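A quick way to see the shared semaphore doing its job is a bounded experiment like the one below. It is only a sketch: each thread does a single short cycle instead of looping forever, and the class name, counters, and sleep time are made up for the demo.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

class SemaphoreLimit {
    // Starts `threads` workers that each do one cycle; returns the peak number computing at once.
    static int peakConcurrency(int threads, int permits) {
        Semaphore sem = new Semaphore(permits, true); // one instance shared by all workers
        AtomicInteger active = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        List<Thread> workers = new ArrayList<>();
        for (int i = 0; i < threads; i++) {
            Thread t = new Thread(() -> {
                try {
                    sem.acquire();
                    try {
                        peak.accumulateAndGet(active.incrementAndGet(), Math::max);
                        Thread.sleep(5); // stand-in for the computation
                    } finally {
                        active.decrementAndGet();
                        sem.release();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            workers.add(t);
            t.start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return peak.get();
    }

    public static void main(String[] args) {
        System.out.println("peak concurrency: " + peakConcurrency(10, 3));
    }
}
```

All threads still exist (and still cost memory), but no more than `permits` of them compute at the same time.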
Edit: Could whoever voted this down please explain why? Is there something wrong with my solution?
