I have to launch 10 tasks asynchronously at variable times throughout the day, until a certain hour the next day. The closer I get to that hour, the more often I have to repeat these 10 tasks.
My question is: how should I manage this? Which executors should I use? What is the best way to manage memory?
I thought of using Executors.newScheduledThreadPool to start a thread pool with the 10 tasks at variable times. The problem requires me to launch a new set of tasks even if the previous group has not finished (so I would probably trigger a new thread pool each time).
I am also thinking of using some sort of process registry to keep track of the processes that have been launched. When a process is no longer in use, the registry can stop it.
Each time the tasks are done, I thought of flushing the runnables and stopping the thread pool. Is that a good solution overall?
The problem that may arise is memory filling up with thread pools. Maybe I should put a time limit on each thread pool?
I guess you need one dispatching thread in a plain, non-scheduling pool and a second pool for the workers, something like this:
ExecutorService ex = Executors.newFixedThreadPool(1);
final ExecutorService workersPool = Executors.newCachedThreadPool();
ex.submit(new Runnable() {
    public void run() {
        try {
            while (true) {
                // determine if it's time to start workers
                if (timeToStartWorkers()) {
                    workersPool.submit(new Worker(...));
                    workersPool.submit(new Worker(...));
                    ...
                }
                // sleep till next time
                Thread.sleep(timeTillNextCheck);
            }
        } catch (InterruptedException e) {
            // interrupted (e.g. by ex.shutdownNow()): restore the flag and stop dispatching
            Thread.currentThread().interrupt();
        }
    }
});
No need to recreate thread pools.
I have a scenario that involves inserting millions of records into the back end, and I am currently using the executor framework to load them. I will explain my problem in simpler terms.
In the case below, I have 10 runnables and three threads to execute them. Assume each runnable performs an insert that takes a while to complete. As I understand it, if all the threads are busy, the remaining tasks go to the queue, and once a thread completes its task it fetches the next one from the queue and runs it.
So in this case, the SampleRunnable objects 4 to 10 will be created and will wait in the queue.
Problem: since I need to load millions of tasks, I cannot put all the records in the queue, as that can lead to memory issues. So my question is: instead of queuing all the tasks, is it possible to make the main thread wait until one of the executor's worker threads becomes available?
I tried the following approaches as workarounds, instead of queuing this many tasks:
Approach 1: Used an ArrayBlockingQueue for the executor and gave it a size of 5 (for example).
In this case, when the 9th task arrives, the executor throws RejectedExecutionException; in the catch clause I sleep for 1 minute and recursively try the same submission again. The task gets picked up on one of the retries once a thread is available.
Approach 2: Used shutdown and awaitTermination, i.e. once the task count reaches 5, I call shutdown and awaitTermination. Inside the awaitTermination 'if' block (executor.awaitTermination(60000,TimeUnit.SECONDS)), I instantiate the thread pool again.
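import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class SampleMain {
    public static void main(String[] args) {
        // three worker threads; tasks beyond that wait in an unbounded queue
        ExecutorService executor = Executors.newFixedThreadPool(3);
        for (int i = 0; i < 10; i++) {
            executor.execute(new SampleRunnable(i));
        }
        executor.shutdown();
    }
}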
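It sounds like the problem is that you want to throttle the main thread so that it does not get ahead of the workers. If that's the case, then consider explicitly constructing a ThreadPoolExecutor instance instead of calling Executors.newFixedThreadPool().
That class has several constructors, and all of them let you supply your own blocking queue. An ArrayBlockingQueue with a limited size caps how many tasks can be waiting. Note that a full queue does not block the submitter by itself: the executor hands the task to its RejectedExecutionHandler, and the default handler throws RejectedExecutionException. To make the main thread slow down instead, also pass a handler such as ThreadPoolExecutor.CallerRunsPolicy, which runs the rejected task on the submitting thread, so the main thread cannot submit more work until that task completes.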
final int work_queue_size = 30;
BlockingQueue<Runnable> work_queue = new ArrayBlockingQueue<>(work_queue_size);
ExecutorService executor = new ThreadPoolExecutor(..., work_queue);
for (int i = 0; i < 10; i++) {
    executor.execute(new SampleRunnable(i));
}
...
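As a minimal, self-contained sketch of the bounded-queue idea: the pool size (3), queue capacity (5), the sleeping task body and the class name BoundedQueueDemo are all illustrative assumptions, not values from the question. It passes ThreadPoolExecutor.CallerRunsPolicy so that, when the queue is full, the submitting thread runs the task itself instead of getting a RejectedExecutionException.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class BoundedQueueDemo {

    // Submits taskCount short tasks and returns how many of them completed.
    static int runAll(int taskCount) {
        AtomicInteger completed = new AtomicInteger();
        ThreadPoolExecutor executor = new ThreadPoolExecutor(
                3, 3,                          // fixed pool of three workers (assumed)
                0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(5),   // at most five tasks waiting (assumed)
                new ThreadPoolExecutor.CallerRunsPolicy()); // full queue: caller runs the task

        for (int i = 0; i < taskCount; i++) {
            executor.execute(() -> {
                try {
                    Thread.sleep(10);          // stands in for the slow insert
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                completed.incrementAndGet();
            });
        }

        executor.shutdown();
        try {
            executor.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return completed.get();
    }

    public static void main(String[] args) {
        System.out.println("completed: " + runAll(50));
    }
}
```

With this policy the number of pending Runnable objects never exceeds the queue capacity, because the producer is busy running a task whenever the queue is full.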
I am using ForkJoinPool.commonPool().execute(runnable) as a handy way to spawn a thread in many places across my application. But for one particular invocation it takes a long time (more than 10 seconds) before the code in the runnable starts running on a thread. What could be the reason for that? How can I avoid it?
EDIT: As per @ben's answer, avoiding long-running tasks in the thread pool seems to be the solution. Creating a new thread manually instead of using the common ForkJoinPool solved my problem.
So after some quick testing I found the issue. Look at the following example code:
List<Runnable> runnables = new ArrayList<Runnable>();
for (int i = 0; i < 20; ++i)
{
    runnables.add(() -> {
        System.out.println("Runnable start");
        try
        {
            Thread.sleep(10000);
        }
        catch (InterruptedException e)
        {
        }
        System.out.println("Runnable end");
    });
}
for (Runnable run : runnables)
{
    //ForkJoinPool.commonPool().execute(run);
    //new Thread(run).start();
}
Uncomment one of the two lines.
We create a number of runnables that print a message, sit idle for 10s and print a message again. Quite simple.
When using a Thread for each of them, all runnables print Runnable start, 10s pass, and all runnables print Runnable end.
When using the commonPool(), only some of them print Runnable start; 10s pass, they print Runnable end, and then another batch prints Runnable start, and so on until they are all finished.
This is simply because the number of cores on your system determines how many threads the common pool holds (by default its parallelism is the number of available processors minus one). When all of them are busy, new tasks are not executed until a thread is freed up.
So, moral of the story: only use a thread pool when you know how it works internally and that is what you want it to do.
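One way to keep blocking work off the common pool, sketched under the assumption that a cached thread pool (which creates threads on demand) is acceptable for your workload. The class name BlockingTasksDemo and the latch-based stand-in for a long blocking call are hypothetical; the method reports whether all tasks were running at the same time.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.TimeUnit;

public class BlockingTasksDemo {

    // Returns true if all taskCount blocking tasks were running simultaneously.
    static boolean allRunConcurrently(int taskCount) {
        ExecutorService pool = Executors.newCachedThreadPool(); // grows as needed
        CountDownLatch started = new CountDownLatch(taskCount);
        CountDownLatch release = new CountDownLatch(1);
        try {
            for (int i = 0; i < taskCount; i++) {
                pool.execute(() -> {
                    started.countDown();
                    try {
                        release.await();       // stands in for a long blocking call
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            // Succeeds only if every task got its own thread while the others were blocked.
            return started.await(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            release.countDown();               // let the blocked tasks finish
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("common pool parallelism: "
                + ForkJoinPool.getCommonPoolParallelism());
        System.out.println("all 20 ran concurrently: " + allRunConcurrently(20));
    }
}
```

The trade-off is that a cached pool puts no upper bound on the number of threads, so it only suits cases where the number of simultaneous blocking tasks is known to be modest.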
My problem is that we have to give it a fixed schedule interval to start the task. Let's say I give it 10 seconds and my task takes 10-15 seconds on average to finish. After some time, the threads waiting in the queue cause huge memory consumption. If I make the method synchronized, the above problem occurs. If I don't, then I am wasting resources (CPU), because I don't need to run the task while the previous run has not finished. So I thought of having the task call itself recursively, but I believe recursive threads will add more memory problems... What should I do? In short, I just want to start the task again once it has finished, not at a fixed time.
public void myScheduledTask() {
    doJob(); // use a CountDownLatch to control waiting if necessary
    TimeUnit.SECONDS.sleep(x);
    new Thread(new Runnable() { public void run() { myScheduledTask(); } }).start();
    // or
    executor.execute(/* a Runnable that calls myScheduledTask() */);
}
The option that sounds like what you're trying to accomplish:
ScheduledExecutorService executor = Executors.newScheduledThreadPool(count);
ScheduledFuture<?> future = executor.scheduleWithFixedDelay(
task,
delay,
delay,
TimeUnit.MILLISECONDS
);
This starts your task and then runs it again delay milliseconds after each previous run completes. count should be the number of threads you want to use; 1 is acceptable. It also lets you stop the task using the future.
The problems with your example: a) You are sleeping on an executor thread. Don't do this; let the executor handle the delay. If you were using a thread pool of 1, the executor couldn't do any other work while you're waiting. b) Starting a new thread takes control away from the executor. Just use the executor; then you have some control over the execution.
If you really want to stick with the form you have:
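For illustration, a small runnable sketch of scheduleWithFixedDelay. The class name FixedDelayDemo, the 50 ms delay, the 20 ms simulated job and the choice to stop after three runs are assumptions made for the example.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class FixedDelayDemo {

    // Lets the task run at least three times, cancels it, and returns the run count.
    static int runAtLeastThreeTimes() {
        ScheduledExecutorService executor = Executors.newScheduledThreadPool(1);
        AtomicInteger runs = new AtomicInteger();
        CountDownLatch threeRuns = new CountDownLatch(3);

        Runnable task = () -> {
            try {
                Thread.sleep(20);              // stands in for doJob()
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            runs.incrementAndGet();
            threeRuns.countDown();
        };

        // The next run starts 50 ms after the previous run has finished,
        // so runs never overlap and never pile up in the queue.
        ScheduledFuture<?> future =
                executor.scheduleWithFixedDelay(task, 0, 50, TimeUnit.MILLISECONDS);
        try {
            threeRuns.await(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        future.cancel(false);                  // stop the recurring task via its future
        executor.shutdown();
        return runs.get();
    }

    public static void main(String[] args) {
        System.out.println("runs: " + runAtLeastThreeTimes());
    }
}
```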
class RecurringTask implements Runnable {
    @Override
    public void run() {
        doJob();
        // re-schedule this task only after the current run has finished
        executor.schedule(this, delay, TimeUnit.MILLISECONDS);
    }
}
Now you will be creating Futures that you never use, so it will be harder to control the execution of the task.
Create a static member in your task class: a Lock.
In doJob, skip the job if the lock is already acquired:
if (lock.tryLock()) {
try {
// do the job
} finally {
lock.unlock();
}
} else {
// log the fact you skipped the job
return;
}
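A runnable sketch of the tryLock approach above. The class name SkippingJob, the boolean return value of doJob and the demo method are hypothetical additions so the skip can be observed; the demo holds the lock on a second thread and checks that a concurrent call is skipped.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

public class SkippingJob {
    // One lock shared by all runs of the job.
    private static final Lock lock = new ReentrantLock();

    // Runs body unless another thread is already running a job.
    // Returns true if the body ran, false if it was skipped.
    static boolean doJob(Runnable body) {
        if (lock.tryLock()) {
            try {
                body.run();
                return true;
            } finally {
                lock.unlock();
            }
        }
        return false; // a previous run is still in progress
    }

    // Returns true if the first call ran and the overlapping second call was skipped.
    static boolean demo() {
        CountDownLatch inside = new CountDownLatch(1);
        CountDownLatch finish = new CountDownLatch(1);
        AtomicBoolean firstRan = new AtomicBoolean();

        Thread slow = new Thread(() -> firstRan.set(doJob(() -> {
            inside.countDown();
            awaitQuietly(finish);              // simulate a long-running job
        })));
        slow.start();
        awaitQuietly(inside);                  // wait until the slow job holds the lock

        boolean secondRan = doJob(() -> { });  // attempted while the first is running
        finish.countDown();
        try {
            slow.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return firstRan.get() && !secondRan;
    }

    private static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println("overlapping run skipped: " + demo());
    }
}
```

Note that ReentrantLock is reentrant, so a second call from the same thread that already holds the lock would not be skipped; the skip only applies across threads.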
I'm looking for a Java Executor that allows me to specify throttling/throughput/pacing limitations, for example, no more than say 100 tasks can be processed in a second -- if more tasks get submitted they should get queued and executed later. The main purpose of this is to avoid running into limits when hitting foreign APIs or servers.
I'm wondering whether either base Java (which I doubt, because I checked) or somewhere else reliable (e.g. Apache Commons) provides this, or if I have to write my own. Preferably something lightweight. I don't mind writing it myself, but if there's a "standard" version out there somewhere I'd at least like to look at it first.
Take a look at Guava's RateLimiter:
A rate limiter. Conceptually, a rate limiter distributes permits at a
configurable rate. Each acquire() blocks if necessary until a permit
is available, and then takes it. Once acquired, permits need not be
released. Rate limiters are often used to restrict the rate at which
some physical or logical resource is accessed. This is in contrast to
Semaphore which restricts the number of concurrent accesses instead of
the rate (note though that concurrency and rate are closely related,
e.g. see Little's Law).
It's thread-safe, but still @Beta. Might be worth a try anyway.
You would have to wrap each call to the Executor with the rate limiter. For a cleaner solution, you could create some kind of wrapper for the ExecutorService.
From the javadoc:
final RateLimiter rateLimiter = RateLimiter.create(2.0); // rate is "2 permits per second"
void submitTasks(List<Runnable> tasks, Executor executor) {
for (Runnable task : tasks) {
rateLimiter.acquire(); // may wait
executor.execute(task);
}
}
The Java Executor doesn't offer such a limitation, only a limit on the number of threads, which is not what you are looking for.
In general, the Executor is the wrong place to limit such actions anyway; the limit should be applied at the point where the thread calls the outside server. You can do this, for example, with a limiting Semaphore that threads acquire before they send their requests.
Calling thread:
public void run() {
// ...
requestLimiter.acquire();
connection.send();
// ...
}
At the same time, you schedule a (single) secondary thread that periodically (say, every 60 seconds) releases the acquired permits:
public void run() {
// ...
requestLimiter.drainPermits(); // make sure not more than max are released by draining the Semaphore empty
requestLimiter.release(MAX_NUM_REQUESTS);
// ...
}
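Putting the two halves together, a possible sketch of such a limiter. The class name RequestLimiter and the methods tryAcquire, refill and startRefilling are hypothetical names chosen for this example; only Semaphore and ScheduledExecutorService calls from the JDK are used.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

public class RequestLimiter {
    private final int maxPerPeriod;
    private final Semaphore permits;

    public RequestLimiter(int maxPerPeriod) {
        this.maxPerPeriod = maxPerPeriod;
        this.permits = new Semaphore(maxPerPeriod);
    }

    // Called by a worker right before it contacts the outside server; blocks when the quota is used up.
    public void acquire() throws InterruptedException {
        permits.acquire();
    }

    // Non-blocking variant: returns false when the quota for this period is used up.
    public boolean tryAcquire() {
        return permits.tryAcquire();
    }

    // Resets the quota: drain leftovers first so the total never exceeds maxPerPeriod.
    public void refill() {
        permits.drainPermits();
        permits.release(maxPerPeriod);
    }

    // Starts the single secondary thread that refills the permits once per period.
    public ScheduledExecutorService startRefilling(long period, TimeUnit unit) {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        timer.scheduleAtFixedRate(this::refill, period, period, unit);
        return timer; // the caller is responsible for shutting this down
    }

    public static void main(String[] args) {
        RequestLimiter limiter = new RequestLimiter(3);
        for (int i = 1; i <= 4; i++) {
            System.out.println("request " + i + " allowed: " + limiter.tryAcquire());
        }
        limiter.refill();
        System.out.println("after refill allowed: " + limiter.tryAcquire());
    }
}
```

This limits requests per period rather than smoothing them: all permits of a period can be consumed in a burst right after a refill.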
no more than say 100 tasks can be processed in a second -- if more
tasks get submitted they should get queued and executed later
You need to look into Executors.newFixedThreadPool(int nThreads). This lets you limit the number of tasks that can execute simultaneously. If you submit more tasks than there are threads, the extra ones are queued and executed later.
ExecutorService threadPool = Executors.newFixedThreadPool(100);
Future<?> result1 = threadPool.submit(runnable1);
Future<?> result2 = threadPool.submit(runnable2);
Future<SomeClass> result3 = threadPool.submit(callable1);
...
The snippet above shows how you would work with an ExecutorService that allows no more than 100 tasks to execute simultaneously.
Update:
After going over the comments, here is what I have come up with (kind of crude). How about manually keeping track of the tasks that are to be executed? You could store them in an ArrayList first and then submit them to the Executor based on how many tasks have already been executed in the last second.
So, let's say 200 tasks have been submitted into our ArrayList. We can iterate and add 100 to the Executor. When a second passes, we can add more tasks based on how many have completed in the Executor, and so on.
Depending on the scenario, and as suggested in one of the previous responses, the basic functionalities of a ThreadPoolExecutor may do the trick.
But if the threadpool is shared by multiple clients and you want to throttle, to restrict the usage of each one of them, making sure that one client won't use all the threads, then a BoundedExecutor will do the work.
More details can be found in the following example:
http://jcip.net/listings/BoundedExecutor.java
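For readers who cannot follow the link, here is a sketch in the spirit of that idea: a Semaphore caps how many tasks one client may have in flight on a shared executor. This is not a copy of the linked listing; the class name BoundedExecutorSketch and the maxConcurrency helper are hypothetical and exist only for this example.

```java
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class BoundedExecutorSketch {
    private final Executor exec;
    private final Semaphore semaphore;

    public BoundedExecutorSketch(Executor exec, int bound) {
        this.exec = exec;
        this.semaphore = new Semaphore(bound);
    }

    // Blocks the submitter while this client already has bound tasks in flight.
    public void submitTask(Runnable command) throws InterruptedException {
        semaphore.acquire();
        try {
            exec.execute(() -> {
                try {
                    command.run();
                } finally {
                    semaphore.release();       // free the slot when the task ends
                }
            });
        } catch (RejectedExecutionException e) {
            semaphore.release();               // task never started: give the slot back
        }
    }

    // Demo helper: runs taskCount tasks and returns the highest concurrency observed.
    static int maxConcurrency(int bound, int taskCount) {
        ExecutorService pool = Executors.newCachedThreadPool();
        BoundedExecutorSketch bounded = new BoundedExecutorSketch(pool, bound);
        AtomicInteger active = new AtomicInteger();
        AtomicInteger max = new AtomicInteger();
        try {
            for (int i = 0; i < taskCount; i++) {
                bounded.submitTask(() -> {
                    max.accumulateAndGet(active.incrementAndGet(), Math::max);
                    try {
                        Thread.sleep(20);
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                    active.decrementAndGet();
                });
            }
            pool.shutdown();
            pool.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return max.get();
    }

    public static void main(String[] args) {
        System.out.println("max concurrent tasks: " + maxConcurrency(2, 10));
    }
}
```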
Personally, I found this scenario quite interesting. In my case, I wanted to stress that the interesting side to throttle is the consuming side, as in classical producer/consumer concurrency theory. That's the opposite of some of the answers suggested before. That is, we don't want to block the submitting thread, but to block the consuming threads based on a rate (tasks/second) policy. So, even if there are tasks ready in the queue, the executing/consuming threads may block, waiting to meet the throttle policy.
That said, I think a good candidate would be Executors.newScheduledThreadPool(int corePoolSize). You would need a simple queue in front of the executor (a simple LinkedBlockingQueue would suit), and then schedule a periodic task to pick actual tasks from the queue (ScheduledExecutorService.scheduleAtFixedRate). So it is not a straightforward solution, but it should perform well enough if you try to throttle the consumers as discussed before.
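A rough sketch of that queue-plus-ticker arrangement, assuming one task may be started per tick. The class name PacedExecutor, the shutdown method and the timing helper are hypothetical; submitters never block, and a periodic task moves one queued task per period onto the pool.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executor;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class PacedExecutor implements Executor {
    private final BlockingQueue<Runnable> pending = new LinkedBlockingQueue<>();
    private final ScheduledExecutorService scheduler;

    // Starts at most one queued task per period (e.g. 10 ms for about 100 tasks/second).
    public PacedExecutor(int threads, long period, TimeUnit unit) {
        scheduler = Executors.newScheduledThreadPool(threads);
        scheduler.scheduleAtFixedRate(this::startNext, 0, period, unit);
    }

    @Override
    public void execute(Runnable task) {
        pending.add(task);                     // unbounded queue: never blocks the submitter
    }

    // Ticker body: hand one queued task to the pool so a slow task cannot delay the next tick.
    private void startNext() {
        Runnable task = pending.poll();
        if (task != null) {
            scheduler.execute(task);
        }
    }

    public void shutdown() {
        scheduler.shutdownNow();               // queued tasks that were never started are dropped
    }

    // Demo helper: milliseconds needed to run taskCount tasks, or -1 on timeout.
    static long timeToRun(int taskCount, long periodMillis) {
        long start = System.nanoTime();
        PacedExecutor paced = new PacedExecutor(2, periodMillis, TimeUnit.MILLISECONDS);
        CountDownLatch done = new CountDownLatch(taskCount);
        for (int i = 0; i < taskCount; i++) {
            paced.execute(done::countDown);
        }
        boolean finished = false;
        try {
            finished = done.await(30, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        paced.shutdown();
        return finished ? TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start) : -1;
    }

    public static void main(String[] args) {
        System.out.println("5 tasks at 20 ms pacing took ms: " + timeToRun(5, 20));
    }
}
```

Because the queue is unbounded, this trades memory for never blocking producers; a bounded queue would reintroduce back-pressure on the submitting side.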
You can also limit it inside the Runnable:
public static Runnable throttle(Runnable realRunner, long delay) {
    Runnable throttleRunner = new Runnable() {
        // whether a call is already waiting to run
        private boolean _isWaiting = false;
        // target time to run realRunner
        private long _timeToRun;
        // specified delay time to wait
        private long _delay = delay;
        // Runnable that has the real task to run
        private Runnable _realRunner = realRunner;
        @Override
        public void run() {
            // current time
            long now;
            synchronized (this) {
                // another thread is waiting, skip
                if (_isWaiting) return;
                now = System.currentTimeMillis();
                // update time to run
                // do not update it each time since
                // you do not want to postpone it unlimited
                _timeToRun = now + _delay;
                // set waiting status
                _isWaiting = true;
            }
            try {
                Thread.sleep(_timeToRun - now);
            } catch (InterruptedException e) {
                e.printStackTrace();
            } finally {
                // clear waiting status before run, under the same lock that guards the check
                synchronized (this) {
                    _isWaiting = false;
                }
                // do the real task
                _realRunner.run();
            }
        }
    };
    return throttleRunner;
}
Taken from JAVA Thread Debounce and Throttle
I'm interested in using ScheduledExecutorService to spawn multiple threads for tasks if the previous task has not yet finished. For example, I need to process a file every 0.5s. The first task starts processing a file; after 0.5s, if the first thread is not finished, a second thread is spawned and starts processing a second file, and so on. This can be done with something like this:
ScheduledExecutorService executor = Executors.newScheduledThreadPool(4);
while (!executor.isShutdown()) {
    executor.execute(task);
    try {
        Thread.sleep(500);
    } catch (InterruptedException e) {
        // handle
    }
}
Now my question: why can't I do it with executor.scheduleAtFixedRate?
What I get is that if the first task takes longer, the second task is started as soon as the first has finished, but no new thread is started even though the executor has a pool of threads. executor.scheduleWithFixedDelay is clear: it executes tasks with the same time span between them, no matter how long each task takes to complete. So I have probably misunderstood the purpose of ScheduledExecutorService.
Maybe I should look at another kind of executor? Or just use code which I posted here? Any thoughts?
I've solved the problem by launching a nested anonymous runnable in each scheduled execution:
final ScheduledExecutorService service = Executors.newScheduledThreadPool(POOL_SIZE);
final Runnable command = new SlowRunnable();
service.scheduleAtFixedRate(
    new Runnable() {
        @Override
        public void run() {
            // fast hand-off: the slow work runs on another pool thread
            service.execute(command);
        }
    }, 0, 1, TimeUnit.SECONDS);
With this example, one thread executes a fast instruction at every interval, so it will surely be finished when the next interval expires. The remaining POOL_SIZE-1 threads execute the SlowRunnable's run() in parallel, which may take longer than a single interval.
Please note that while I like this solution because it minimizes the code and reuses the same ScheduledExecutorService, the pool must be sized correctly, and it may not be usable in every context: if the SlowRunnable is so slow that up to POOL_SIZE jobs are executing at once, there will be no thread left to run the scheduled task on time.
Also, if you set the interval to 1 TimeUnit.NANOSECONDS, the execution of the main runnable will probably become too slow as well.
One of the scheduleAtFixedRate methods is close to what you're looking for, with one caveat: it starts the task at the given interval only as long as the previous execution has finished. Per the Javadoc, if an execution takes longer than its period, subsequent executions may start late but will not execute concurrently. To get overlapping runs, let the scheduled task only hand the real work off to a pool. If you're running out of threads to do the processing, adjust the pool size constraints as detailed in the ThreadPoolExecutor docs.
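A sketch of the hand-off pattern with two separate executors, so that slow work can never starve the ticker. The class name OverlappingSchedule, the cached worker pool and the concurrency counters are assumptions made for this example; the helper reports the highest number of runs that overlapped.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class OverlappingSchedule {

    // Starts a slow job every periodMillis and returns the highest number of jobs seen running at once.
    static int maxOverlap(long periodMillis, long workMillis, int runs) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        ExecutorService workers = Executors.newCachedThreadPool();
        AtomicInteger active = new AtomicInteger();
        AtomicInteger max = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(runs);

        Runnable slowWork = () -> {
            max.accumulateAndGet(active.incrementAndGet(), Math::max);
            try {
                Thread.sleep(workMillis);      // stands in for processing a file
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            active.decrementAndGet();
            done.countDown();
        };

        // The scheduled task returns immediately, so every tick fires on time.
        scheduler.scheduleAtFixedRate(
                () -> workers.execute(slowWork), 0, periodMillis, TimeUnit.MILLISECONDS);
        try {
            done.await(30, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        scheduler.shutdownNow();
        workers.shutdownNow();
        return max.get();
    }

    public static void main(String[] args) {
        System.out.println("max overlapping runs: " + maxOverlap(50, 400, 3));
    }
}
```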