I've implemented Java code that executes incoming tasks (as Runnables) on n threads, choosing a thread by hashCode modulo nThreads. Ideally, the work should spread uniformly among those threads.
Specifically, we have a dispatchId string for each Task.
Here is the Java code snippet:
int nThreads = Runtime.getRuntime().availableProcessors(); // Number of threads
Worker[] workers = new Worker[nThreads]; // Those threads, Worker is just a thread class that can run incoming tasks
...
Worker getWorker(String dispatchId) { // Get a thread for this Task
    return workers[(dispatchId.hashCode() & Integer.MAX_VALUE) % nThreads];
}
Important: In most cases a dispatchId is:
String dispatchId = "SomePrefix" + counter.next();
But I have a concern that taking the modulo by nThreads is not a good choice, because the divisor should be a prime number for a more uniform distribution of dispatchId keys.
Are there any other options on how to spread the work better?
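For illustration, one option I can think of is to apply a supplemental bit-spreading hash before the modulo, similar in spirit to what java.util.HashMap does internally, though I'm not sure it addresses the underlying problem:
Worker getWorker(String dispatchId) {
    int h = dispatchId.hashCode();
    h ^= (h >>> 16); // mix the high bits into the low bits before the modulo
    return workers[(h & Integer.MAX_VALUE) % nThreads];
}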
Update 1:
Each Worker has a queue:
Queue<RunnableWrapper> tasks = new ConcurrentLinkedQueue<>();
The worker gets tasks from it and executes them. Tasks can be added to this queue from other threads.
Update 2:
Tasks with the same dispatchId can arrive multiple times, so we need to find their thread by dispatchId.
Most importantly, each Worker thread must process its incoming tasks sequentially. Hence the Queue data structure in Update 1 above.
Update 3:
Also, some threads can be busy while others are free. Thus, we need to somehow decouple the queues from the threads, but maintain FIFO order of task execution for the same dispatchId.
Solution:
I've implemented Ben Manes' idea (his answer below); the code can be found here.
It sounds like you need FIFO ordering per dispatch id, so the ideal would be to have dispatch queues as the abstraction. That would explain your concern about hashing not providing a uniform distribution, as some dispatch queues may be more active than others and unfairly balanced among the workers. By separating the queue from the worker, you retain FIFO semantics and evenly spread out the work.
An inactive library that provides this abstraction is HawtDispatch. It is Java 6 compatible.
A very simple Java 8 approach is to use CompletableFuture as a queuing mechanism, ConcurrentHashMap for registration, and an Executor (e.g. ForkJoinPool) for computing. See EventDispatcher for an implementation of this idea, where registration is explicit. If your dispatchers are more dynamic, then you may need to periodically prune the map. The basic idea is as follows.
ConcurrentMap<String, CompletableFuture<Void>> dispatchQueues = ...
public CompletableFuture<Void> dispatch(String queueName, Runnable task) {
    return dispatchQueues.compute(queueName, (k, queue) -> {
        return (queue == null)
                ? CompletableFuture.runAsync(task)
                : queue.thenRunAsync(task);
    });
}
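For illustration, a self-contained sketch of how this could be used (the class and main method here are hypothetical); tasks submitted under the same key run in FIFO order, while different keys may run concurrently:
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class DispatchExample {
    static final ConcurrentMap<String, CompletableFuture<Void>> dispatchQueues =
            new ConcurrentHashMap<>();

    static CompletableFuture<Void> dispatch(String queueName, Runnable task) {
        return dispatchQueues.compute(queueName, (k, queue) ->
                (queue == null) ? CompletableFuture.runAsync(task)
                                : queue.thenRunAsync(task));
    }

    public static void main(String[] args) {
        dispatch("SomePrefix1", () -> System.out.println("first task for key 1"));
        CompletableFuture<Void> f1 =
                dispatch("SomePrefix1", () -> System.out.println("second task for key 1"));
        CompletableFuture<Void> f2 =
                dispatch("SomePrefix2", () -> System.out.println("task for key 2"));
        CompletableFuture.allOf(f1, f2).join(); // wait before the JVM exits
    }
}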
Update (JDK7)
A backport of the above idea to Guava would look something like this:
ListeningExecutorService executor = ...
Striped<Lock> locks = Striped.lock(256);
ConcurrentMap<String, ListenableFuture<?>> dispatchQueues = ...
public ListenableFuture<?> dispatch(String queueName, final Runnable task) {
    Lock lock = locks.get(queueName);
    lock.lock();
    try {
        ListenableFuture<?> future = dispatchQueues.get(queueName);
        if (future == null) {
            future = executor.submit(task);
        } else {
            final SettableFuture<Void> next = SettableFuture.create();
            future.addListener(new Runnable() {
                @Override
                public void run() {
                    try {
                        task.run();
                    } finally {
                        next.set(null);
                    }
                }
            }, executor);
            future = next;
        }
        dispatchQueues.put(queueName, future);
        return future;
    } finally {
        lock.unlock();
    }
}
Related
I am trying to do some blocking operations (say HTTP requests) in a scheduled, non-blocking manner. Let's say I have 10 requests and one request takes 3 seconds, but I don't want to wait 3 seconds between requests; I'd rather wait 1 second and send the next one. After all executions are finished, I'd like to gather all the results in a list and return it to the user.
Below is a prototype of my scenario (Thread.sleep is used as the blocking operation instead of an HTTP request).
public static List<Integer> getResults(List<Integer> inputs) throws InterruptedException, ExecutionException {
    List<Integer> results = new LinkedList<Integer>();
    Queue<Callable<Integer>> tasks = new LinkedList<Callable<Integer>>();
    List<Future<Integer>> futures = new LinkedList<Future<Integer>>();
    for (Integer input : inputs) {
        Callable<Integer> task = new Callable<Integer>() {
            public Integer call() throws InterruptedException {
                Thread.sleep(3000);
                return input + 1000;
            }
        };
        tasks.add(task);
    }
    ExecutorService es = Executors.newCachedThreadPool();
    ScheduledExecutorService ses = Executors.newScheduledThreadPool(1);
    ses.scheduleAtFixedRate(new Runnable() {
        @Override
        public void run() {
            Callable<Integer> task = tasks.poll();
            if (task == null) {
                ses.shutdown();
                es.shutdown();
                return;
            }
            futures.add(es.submit(task));
        }
    }, 0, 1000, TimeUnit.MILLISECONDS);
    while (true) {
        if (futures.size() == inputs.size()) {
            for (Future<Integer> future : futures) {
                Integer result = future.get();
                results.add(result);
            }
            return results;
        }
    }
}

public static void main(String[] args) throws InterruptedException, ExecutionException {
    List<Integer> results = getResults(new LinkedList<Integer>(Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)));
    System.out.println(Arrays.toString(results.toArray()));
}
I wait in a while loop until all tasks return a proper result, but the loop never reaches the exit condition and spins forever. Whenever I add an I/O operation like a logger, or even a breakpoint, the loop exits and everything works fine.
I am relatively new to Java concurrency and am trying to understand what is happening and whether this is the correct approach. I guess the I/O operation triggers something in the thread scheduler that makes it re-check the collections' sizes.
You need to synchronize your threads. You have two different threads (the main thread and the executor service thread) accessing the futures list, and since LinkedList is not synchronized, these two threads may see different values of futures.
while (true) {
    synchronized (futures) {
        if (futures.size() == inputs.size()) {
            ...
        }
    }
}
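Note that, strictly speaking, the writer side needs the same treatment: the scheduled task mutates the list from the executor thread, so it should hold the same lock. A minimal sketch:
// Inside the scheduled Runnable, guard the write with the same lock:
synchronized (futures) {
    futures.add(es.submit(task));
}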
This happens because threads in Java use the CPU cache to improve performance, so each thread may see a different value of a variable until they are synchronized.
This SO question has more information on this.
Also from this answer:
It's all about memory. Threads communicate through shared memory, but when there are multiple CPUs in a system, all trying to access the same memory system, then the memory system becomes a bottleneck. Therefore, the CPUs in a typical multi-CPU computer are allowed to delay, re-order, and cache memory operations in order to speed things up.
That works great when threads are not interacting with one another, but it causes problems when they actually do want to interact: If thread A stores a value into an ordinary variable, Java makes no guarantee about when (or even if) thread B will see the value change.
In order to overcome that problem when it's important, Java gives you certain means of synchronizing threads. That is, getting the threads to agree on the state of the program's memory. The volatile keyword and the synchronized keyword are two means of establishing synchronization between threads.
And finally, the futures list does not appear to update in your code because the main thread is continuously occupied by the infinite while loop. Doing any I/O operation in the loop gives the CPU enough breathing space to update its local cache.
An infinite busy-wait loop is generally a bad idea because it is very resource-intensive. Adding a small delay before the next iteration makes it a little better (though still inefficient).
For my use case, I need an executor that can execute tasks based on priority. A simple way to achieve this is to use a thread pool with a PriorityBlockingQueue and override newTaskFor() to return custom future tasks that are comparable by task priority.
// Define priorities
public enum Priority {
    HIGH, MEDIUM, LOW, VERYLOW;
}
A priority task
// A Callable task that has a priority. Concrete implementations will implement
// call() to do the actual work and getPriority() to return the priority.
public abstract class PriorityTask<V> implements Callable<V> {
    public abstract Priority getPriority();
}
Actual executor implementation
public class PriorityTaskThreadPoolExecutor<V> {

    int _poolSize;
    private PriorityBlockingQueue<Runnable> _poolQueue =
            new PriorityBlockingQueue<Runnable>(500);
    private ThreadPoolExecutor _pool;

    public PriorityTaskThreadPoolExecutor(int poolSize) {
        _poolSize = poolSize;
        _pool = new ThreadPoolExecutor(_poolSize, _poolSize, 5, TimeUnit.MINUTES,
                _poolQueue) {
            // Override newTaskFor() to wrap a PriorityTask
            // with a PriorityFutureTaskWrapper.
            @Override
            protected <V> RunnableFuture<V> newTaskFor(Callable<V> c) {
                return new PriorityFutureTaskWrapper<V>((PriorityTask<V>) c);
            }
        };
        _pool.allowCoreThreadTimeOut(true);
    }

    public Future<V> submit(PriorityTask<V> task) {
        return _pool.submit(task);
    }
}
// A future task that wraps the priority task, to be used in the queue
class PriorityFutureTaskWrapper<V> extends FutureTask<V>
        implements Comparable<PriorityFutureTaskWrapper<V>> {

    PriorityTask<V> _priorityTask;

    public PriorityFutureTaskWrapper(PriorityTask<V> priorityTask) {
        super(priorityTask);
        _priorityTask = priorityTask;
    }

    public PriorityTask<V> getPriorityTask() {
        return _priorityTask;
    }

    @Override
    public int compareTo(PriorityFutureTaskWrapper<V> o) {
        return _priorityTask.getPriority().ordinal() -
                o.getPriorityTask().getPriority().ordinal();
    }
}
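For illustration, a hypothetical usage of the executor above (the pool size and task body are arbitrary):
PriorityTaskThreadPoolExecutor<String> pool = new PriorityTaskThreadPoolExecutor<>(2);
Future<String> result = pool.submit(new PriorityTask<String>() {
    @Override
    public Priority getPriority() {
        return Priority.HIGH; // ordered ahead of queued LOW/VERYLOW tasks
    }

    @Override
    public String call() {
        return "done";
    }
});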
The problem with this is that, in my use case, low-priority tasks may starve forever. I want to avoid this. I couldn't find a clean way to do this using the executors/pools available in Java, so I'm thinking of writing my own executor. I have two different approaches.
1) A custom thread pool with a PriorityBlockingQueue. A separate thread will check the age of the tasks in the queue; older tasks will be removed and re-added with escalated priority.
2) My use case will have only a limited number of priorities, say 1-4. I will have 4 different queues, one per priority. The threads in the custom pool, instead of blocking on a single queue, will scan the queues in the following order when picking up the next task.
40% threads - Q1, Q2, Q3, Q4
30% threads - Q2, Q1, Q3, Q4
20% threads - Q3, Q1, Q2, Q4
10% threads - Q4, Q1, Q2, Q3
Scanning will be done by a thread when it is notified of a new addition to a queue, or when its current task completes. At other times, threads will be waiting. Still, scanning will be a little less efficient than blocking on a single queue.
Approach 2 is more suitable for my use case.
Has anyone tried either of these approaches, or a different approach, for a similar use case? Any thoughts/suggestions?
There is no easy way to change the priority of an already inserted element in a PriorityQueue; this is already discussed here.
Your second option should be simple to implement: just process more tasks from the high-priority queues than from the low-priority ones.
You can also consider having a different ThreadPool for each priority, with the number of threads in each pool depending on the priority of its tasks.
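A rough sketch of that last suggestion, reusing the Priority and PriorityTask types from the question (the pool sizes here are arbitrary):
import java.util.EnumMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class PerPriorityExecutor {
    private final EnumMap<Priority, ExecutorService> pools =
            new EnumMap<>(Priority.class);

    public PerPriorityExecutor() {
        pools.put(Priority.HIGH,    Executors.newFixedThreadPool(4));
        pools.put(Priority.MEDIUM,  Executors.newFixedThreadPool(3));
        pools.put(Priority.LOW,     Executors.newFixedThreadPool(2));
        pools.put(Priority.VERYLOW, Executors.newFixedThreadPool(1));
    }

    public <V> Future<V> submit(PriorityTask<V> task) {
        // Low-priority tasks cannot starve: each priority owns dedicated threads.
        return pools.get(task.getPriority()).submit(task);
    }
}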
I've been frustrated for some time with the default behavior of ThreadPoolExecutor which backs the ExecutorService thread-pools that so many of us use. To quote from the Javadocs:
If there are more than corePoolSize but less than maximumPoolSize threads running, a new thread will be created only if the queue is full.
What this means is that if you define a thread pool with the following code, it will never start the 2nd thread because the LinkedBlockingQueue is unbounded.
ExecutorService threadPool =
        new ThreadPoolExecutor(1 /*core*/, 50 /*max*/, 60 /*timeout*/,
                TimeUnit.SECONDS, new LinkedBlockingQueue<Runnable>(/* unlimited queue */));
Only if you have a bounded queue and the queue is full are any threads above the core number started. I suspect a large number of junior Java multithreaded programmers are unaware of this ThreadPoolExecutor behavior.
Now I have a specific use case where this is not optimal, and I'm looking for ways to work around it without writing my own TPE class.
My requirements are for a web service that is making call-backs to a possibly unreliable 3rd party.
I don't want to make the call-back synchronously with the web-request, so I want to use a thread-pool.
I typically get a couple of these a minute, so I don't want a newFixedThreadPool(...) with a large number of mostly dormant threads.
Every so often I get a burst of this traffic and I want to scale up the number of threads to some max value (let's say 50).
I need to make a best attempt to perform all callbacks, so I want to queue up any additional ones above 50. I don't want to overwhelm the rest of my web server by using a newCachedThreadPool().
How can I work around this limitation in ThreadPoolExecutor where the queue needs to be bounded and full before more threads will be started? How can I get it to start more threads before queuing tasks?
Edit:
@Flavio makes a good point about using ThreadPoolExecutor.allowCoreThreadTimeOut(true) to have the core threads time out and exit. I considered that, but I still wanted the core-threads feature: I did not want the number of threads in the pool to drop below the core size if possible.
How can I work around this limitation in ThreadPoolExecutor where the queue needs to be bounded and full before more threads will be started?
I believe I have finally found a somewhat elegant (maybe a little hacky) solution to this limitation with ThreadPoolExecutor. It involves extending LinkedBlockingQueue to have it return false for queue.offer(...) when there are already some tasks queued. If the current threads are not keeping up with the queued tasks, the TPE will add additional threads. If the pool is already at max threads, then the RejectedExecutionHandler will be called, which does the put(...) into the queue.
It certainly is strange to write a queue where offer(...) can return false and put() never blocks, so that's the hack part. But this works well with the TPE's usage of the queue, so I don't see any problem with doing this.
Here's the code:
// extend LinkedBlockingQueue to force offer() to return false conditionally
BlockingQueue<Runnable> queue = new LinkedBlockingQueue<Runnable>() {
    private static final long serialVersionUID = -6903933921423432194L;
    @Override
    public boolean offer(Runnable e) {
        // Offer it to the queue if there are 0 items already queued, else
        // return false so the TPE will add another thread. If we return false
        // and max threads have been reached then the RejectedExecutionHandler
        // will be called which will do the put into the queue.
        if (size() == 0) {
            return super.offer(e);
        } else {
            return false;
        }
    }
};
ThreadPoolExecutor threadPool = new ThreadPoolExecutor(1 /*core*/, 50 /*max*/,
        60 /*secs*/, TimeUnit.SECONDS, queue);
threadPool.setRejectedExecutionHandler(new RejectedExecutionHandler() {
    @Override
    public void rejectedExecution(Runnable r, ThreadPoolExecutor executor) {
        try {
            // This does the actual put into the queue. Once the max threads
            // have been reached, the tasks will then queue up.
            executor.getQueue().put(r);
            // we do this after the put() to stop race conditions
            if (executor.isShutdown()) {
                throw new RejectedExecutionException(
                        "Task " + r + " rejected from " + executor);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return;
        }
    }
});
With this mechanism, when I submit tasks to the queue, the ThreadPoolExecutor will:
Scale the number of threads up to the core size initially (here 1).
Offer the task to the queue. If the queue is empty, it will be queued and handled by the existing threads.
If the queue already has 1 or more elements, offer(...) will return false.
If false is returned, scale up the number of threads in the pool until it reaches the max number (here 50).
If at the max, it calls the RejectedExecutionHandler.
The RejectedExecutionHandler then puts the task into the queue, to be processed by the first available thread in FIFO order.
Although in my example code above the queue is unbounded, you could also define it as a bounded queue. For example, if you give the LinkedBlockingQueue a capacity of 1000, it will:
scale the threads up to the max;
then queue up until it is full with 1000 tasks;
then block the caller until space becomes available in the queue.
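For example, the same overridden queue with a capacity might look like this (the capacity value is illustrative):
BlockingQueue<Runnable> queue = new LinkedBlockingQueue<Runnable>(1000) {
    @Override
    public boolean offer(Runnable e) {
        // Same trick as above, but on a bounded queue: once all 1000 slots
        // are taken, the handler's put(r) will block the caller.
        return size() == 0 && super.offer(e);
    }
};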
Also, if you need to use offer(...) in the RejectedExecutionHandler, you could use the offer(E, long, TimeUnit) method with Long.MAX_VALUE as the timeout instead.
Warning:
If you expect tasks to be added to the executor after it has been shut down, then you may want to be smarter about throwing RejectedExecutionException out of our custom RejectedExecutionHandler when the executor service has been shut down. Thanks to @RaduToader for pointing this out.
Edit:
Another tweak to this answer could be to ask the TPE whether there are idle threads, and only enqueue the item if so. You would have to make a real class for the queue and add a setThreadPoolExecutor(tpe) method to it.
Then your offer(...) method might look something like:
Check whether tpe.getPoolSize() == tpe.getMaximumPoolSize(), in which case just call super.offer(...).
Otherwise, if tpe.getPoolSize() > tpe.getActiveCount(), call super.offer(...), since there seem to be idle threads.
Otherwise, return false to fork another thread.
Maybe this:
int poolSize = tpe.getPoolSize();
int maximumPoolSize = tpe.getMaximumPoolSize();
if (poolSize >= maximumPoolSize || poolSize > tpe.getActiveCount()) {
    return super.offer(e);
} else {
    return false;
}
Note that the get methods on the TPE are expensive, since they access volatile fields or (in the case of getActiveCount()) lock the TPE and walk the thread list. Also, there are race conditions here that may cause a task to be enqueued improperly, or another thread to be forked when there was an idle thread.
Set core size and max size to the same value, and allow core threads to be removed from the pool with allowCoreThreadTimeOut(true).
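A quick sketch of that suggestion (the numbers are illustrative):
// core == max, so up to 50 threads are started before anything is queued,
// and allowCoreThreadTimeOut lets the pool shrink back down when idle.
ThreadPoolExecutor threadPool = new ThreadPoolExecutor(
        50, 50, 60, TimeUnit.SECONDS, new LinkedBlockingQueue<Runnable>());
threadPool.allowCoreThreadTimeOut(true);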
I've already got two other answers on this question, but I suspect this one is the best.
It's based on the technique of the currently accepted answer, namely:
Override the queue's offer() method to (sometimes) return false,
which causes the ThreadPoolExecutor to either spawn a new thread or reject the task, and
set the RejectedExecutionHandler to actually queue the task on rejection.
The problem is deciding when offer() should return false. The currently accepted answer returns false when the queue has a couple of tasks on it, but as I've pointed out in my comment there, this causes undesirable effects. Alternatively, if you always return false, you'll keep spawning new threads even when you have threads waiting on the queue.
The solution is to use Java 7's LinkedTransferQueue and have offer() call tryTransfer(). When there is a waiting consumer thread, the task will just get passed to that thread. Otherwise, offer() will return false and the ThreadPoolExecutor will spawn a new thread.
BlockingQueue<Runnable> queue = new LinkedTransferQueue<Runnable>() {
    @Override
    public boolean offer(Runnable e) {
        return tryTransfer(e);
    }
};
ThreadPoolExecutor threadPool = new ThreadPoolExecutor(1, 50, 60, TimeUnit.SECONDS, queue);
threadPool.setRejectedExecutionHandler(new RejectedExecutionHandler() {
    @Override
    public void rejectedExecution(Runnable r, ThreadPoolExecutor executor) {
        try {
            executor.getQueue().put(r);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
});
Note: I now prefer and recommend my other answer.
Here's a version that feels much more straightforward to me: increase the corePoolSize (up to the limit of maximumPoolSize) whenever a new task is executed, then decrease the corePoolSize (down to the limit of the user-specified "core pool size") whenever a task completes.
To put it another way, keep track of the number of running or enqueued tasks, and ensure that the corePoolSize is equal to the number of tasks as long as it is between the user specified "core pool size" and the maximumPoolSize.
public class GrowBeforeQueueThreadPoolExecutor extends ThreadPoolExecutor {
    private int userSpecifiedCorePoolSize;
    private int taskCount;

    public GrowBeforeQueueThreadPoolExecutor(int corePoolSize, int maximumPoolSize, long keepAliveTime, TimeUnit unit, BlockingQueue<Runnable> workQueue) {
        super(corePoolSize, maximumPoolSize, keepAliveTime, unit, workQueue);
        userSpecifiedCorePoolSize = corePoolSize;
    }

    @Override
    public void execute(Runnable runnable) {
        synchronized (this) {
            taskCount++;
            setCorePoolSizeToTaskCountWithinBounds();
        }
        super.execute(runnable);
    }

    @Override
    protected void afterExecute(Runnable runnable, Throwable throwable) {
        super.afterExecute(runnable, throwable);
        synchronized (this) {
            taskCount--;
            setCorePoolSizeToTaskCountWithinBounds();
        }
    }

    private void setCorePoolSizeToTaskCountWithinBounds() {
        int threads = taskCount;
        if (threads < userSpecifiedCorePoolSize) threads = userSpecifiedCorePoolSize;
        if (threads > getMaximumPoolSize()) threads = getMaximumPoolSize();
        setCorePoolSize(threads);
    }
}
As written, the class doesn't support changing the user-specified corePoolSize or maximumPoolSize after construction, and doesn't support manipulating the work queue directly or via remove() or purge().
We have a subclass of ThreadPoolExecutor that takes an additional creationThreshold and overrides execute.
public void execute(Runnable command) {
    super.execute(command);
    final int poolSize = getPoolSize();
    if (poolSize < getMaximumPoolSize()) {
        if (getQueue().size() > creationThreshold) {
            synchronized (this) {
                setCorePoolSize(poolSize + 1);
                setCorePoolSize(poolSize);
            }
        }
    }
}
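For context, a minimal sketch of the enclosing subclass that the method above assumes (the class name and constructor are illustrative):
public class GrowThresholdThreadPoolExecutor extends ThreadPoolExecutor {
    private final int creationThreshold;

    public GrowThresholdThreadPoolExecutor(int core, int max, int creationThreshold,
                                           BlockingQueue<Runnable> workQueue) {
        super(core, max, 60, TimeUnit.SECONDS, workQueue);
        this.creationThreshold = creationThreshold;
    }

    // ... execute(Runnable) override as shown above ...
}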
Maybe that helps too, but yours looks more artsy, of course…
The recommended answer resolves only one (1) of the issues with the JDK thread pool:
1) JDK thread pools are biased towards queuing. Instead of spawning a new thread, they will queue the task; only if the queue reaches its limit will the thread pool spawn a new thread.
2) Thread retirement does not happen when the load lightens. For example, if we have a burst of jobs hitting the pool that causes it to go to max, followed by a light load of at most 2 tasks at a time, the pool will use all its threads to service the light load, preventing thread retirement (only 2 threads would be needed…).
Unhappy with the behavior above, I went ahead and implemented a pool to overcome these deficiencies.
To resolve 2), using LIFO scheduling does the trick. This idea was presented by Ben Maurer at the ACM Applicative 2015 conference:
Systems @ Facebook scale
So a new implementation was born:
LifoThreadPoolExecutorSQP
So far this implementation has improved async execution performance for ZEL.
The implementation is spin-capable to reduce context-switch overhead, yielding superior performance for certain use cases.
Hope it helps...
PS: The JDK Fork Join Pool implements ExecutorService and works as a "normal" thread pool. The implementation is performant and uses LIFO thread scheduling; however, there is no control over the internal queue size or the retirement timeout, and, most importantly, tasks cannot be interrupted when cancelling them.
Note: I now prefer and recommend my other answer.
I have another proposal, following the original idea of changing the queue to return false. In this one, all tasks can enter the queue, but whenever a task is enqueued after execute(), we follow it with a sentinel no-op task that the queue rejects, causing a new thread to spawn, which will execute the no-op immediately followed by something from the queue.
Because worker threads may be polling the LinkedBlockingQueue for a new task, it's possible for a task to get enqueued even when there's an available thread. To avoid spawning new threads even when there are threads available, we need to keep track of how many threads are waiting for new tasks on the queue, and only spawn a new thread when there are more tasks on the queue than waiting threads.
final Runnable SENTINEL_NO_OP = new Runnable() { public void run() { } };

final AtomicInteger waitingThreads = new AtomicInteger(0);

BlockingQueue<Runnable> queue = new LinkedBlockingQueue<Runnable>() {
    @Override
    public boolean offer(Runnable e) {
        // offer returning false will cause the executor to spawn a new thread
        if (e == SENTINEL_NO_OP) return size() <= waitingThreads.get();
        else return super.offer(e);
    }

    @Override
    public Runnable poll(long timeout, TimeUnit unit) throws InterruptedException {
        try {
            waitingThreads.incrementAndGet();
            return super.poll(timeout, unit);
        } finally {
            waitingThreads.decrementAndGet();
        }
    }

    @Override
    public Runnable take() throws InterruptedException {
        try {
            waitingThreads.incrementAndGet();
            return super.take();
        } finally {
            waitingThreads.decrementAndGet();
        }
    }
};

ThreadPoolExecutor threadPool = new ThreadPoolExecutor(1, 50, 60, TimeUnit.SECONDS, queue) {
    @Override
    public void execute(Runnable command) {
        super.execute(command);
        if (getQueue().size() > waitingThreads.get()) super.execute(SENTINEL_NO_OP);
    }
};

threadPool.setRejectedExecutionHandler(new RejectedExecutionHandler() {
    @Override
    public void rejectedExecution(Runnable r, ThreadPoolExecutor executor) {
        if (r == SENTINEL_NO_OP) return;
        else throw new RejectedExecutionException();
    }
});
The best solution that I can think of is to extend ThreadPoolExecutor.
ThreadPoolExecutor offers a few hook methods: beforeExecute and afterExecute. In your extension you could maintain a bounded queue to feed in tasks and a second unbounded queue to handle overflow. When someone calls submit, you could attempt to place the request into the bounded queue; if you're met with an exception, you just stick the task in your overflow queue. You could then utilize the afterExecute hook to see whether there is anything in the overflow queue after finishing a task. This way, the executor will take care of the stuff in its bounded queue first, and automatically pull from the unbounded queue as time permits.
It seems like more work than your solution, but at least it doesn't involve giving queues unexpected behaviors. I also imagine there's a better way to check the status of the queue and threads than relying on exceptions, which are fairly slow to throw.
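A rough sketch of that idea (the class and field names are mine, and it glosses over some races, e.g. a task parked in the overflow queue while the pool goes briefly idle):
import java.util.Queue;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class OverflowThreadPoolExecutor extends ThreadPoolExecutor {
    private final Queue<Runnable> overflow = new ConcurrentLinkedQueue<>();

    public OverflowThreadPoolExecutor(int core, int max) {
        // Bounded work queue, so the pool fills up (and rejects) under load;
        // assumes core >= 1.
        super(core, max, 60, TimeUnit.SECONDS, new ArrayBlockingQueue<Runnable>(core));
    }

    @Override
    public void execute(Runnable task) {
        try {
            super.execute(task);
        } catch (RejectedExecutionException e) {
            overflow.add(task); // park the task until a worker frees up
        }
    }

    @Override
    protected void afterExecute(Runnable r, Throwable t) {
        super.afterExecute(r, t);
        Runnable next = overflow.poll();
        if (next != null) {
            execute(next); // drain the overflow as capacity becomes available
        }
    }
}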
Note: for the JDK ThreadPoolExecutor, when you have a bounded queue, you only create new threads when offer returns false. You may obtain something useful with CallerRunsPolicy, which creates a bit of back-pressure and directly calls run() in the caller thread.
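For example (the queue capacity is arbitrary):
// Once the queue is full and all 50 threads are busy, CallerRunsPolicy runs
// the task on the submitting thread itself, creating back-pressure.
ThreadPoolExecutor pool = new ThreadPoolExecutor(1, 50, 60, TimeUnit.SECONDS,
        new ArrayBlockingQueue<Runnable>(100),
        new ThreadPoolExecutor.CallerRunsPolicy());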
I need tasks to be executed from threads created by the pool, with an unbounded queue for scheduling, while the number of threads within the pool may grow or shrink between corePoolSize and maximumPoolSize, so...
I ended up doing a full copy-paste of ThreadPoolExecutor and changing the execute method a bit, because unfortunately this could not be done by extension (it calls private methods).
I didn't want to spawn new threads immediately when a new request arrives while all threads are busy (because I generally have short-lived tasks). I've added a threshold, but feel free to change it to your needs (maybe for mostly-IO work it is better to remove this threshold):
private final AtomicInteger activeWorkers = new AtomicInteger(0);
private volatile double threshold = 0.7d;

protected void beforeExecute(Thread t, Runnable r) {
    activeWorkers.incrementAndGet();
}

protected void afterExecute(Runnable r, Throwable t) {
    activeWorkers.decrementAndGet();
}

public void execute(Runnable command) {
    if (command == null)
        throw new NullPointerException();
    int c = ctl.get();
    if (workerCountOf(c) < corePoolSize) {
        if (addWorker(command, true))
            return;
        c = ctl.get();
    }
    if (isRunning(c) && this.workQueue.offer(command)) {
        int recheck = this.ctl.get();
        if (!isRunning(recheck) && this.remove(command)) {
            this.reject(command);
        } else if (workerCountOf(recheck) == 0) {
            this.addWorker((Runnable) null, false);
        }
        //>>change start
        else if (workerCountOf(recheck) < maximumPoolSize //
                && (activeWorkers.get() > workerCountOf(recheck) * threshold
                        || workQueue.size() > workerCountOf(recheck) * threshold)) {
            this.addWorker((Runnable) null, false);
        }
        //<<change end
    } else if (!this.addWorker(command, false)) {
        this.reject(command);
    }
}
Below is a solution using two thread pools, both with core and max pool size the same. The second pool is used when the first pool is busy.
import java.util.concurrent.Future;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class MyExecutor {

    ThreadPoolExecutor tex1, tex2;

    public MyExecutor() {
        tex1 = new ThreadPoolExecutor(15, 15, 5, TimeUnit.SECONDS, new LinkedBlockingQueue<>());
        tex1.allowCoreThreadTimeOut(true);
        tex2 = new ThreadPoolExecutor(45, 45, 100, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
        tex2.allowCoreThreadTimeOut(true);
    }

    public Future<?> submit(Runnable task) {
        ThreadPoolExecutor ex = tex1;
        int excessTasks1 = tex1.getQueue().size() + tex1.getActiveCount() - tex1.getCorePoolSize();
        if (excessTasks1 >= 0) {
            int excessTasks2 = tex2.getQueue().size() + tex2.getActiveCount() - tex2.getCorePoolSize();
            if (excessTasks2 <= 0 || excessTasks2 / (double) tex2.getCorePoolSize() < excessTasks1 / (double) tex1.getCorePoolSize()) {
                ex = tex2;
            }
        }
        return ex.submit(task);
    }
}
There are a huge number of tasks.
Each task belongs to a single group. The requirement is that each group of tasks should be executed serially, just as if executed by a single thread, while the throughput should be maximized in a multi-core (or multi-CPU) environment. Note: there is also a huge number of groups, proportional to the number of tasks.
The naive solution is to use a ThreadPoolExecutor and synchronize (or lock). However, threads would block each other and the throughput would not be maximized.
Any better ideas? Or does a third-party library exist that satisfies the requirement?
A simple approach would be to "concatenate" all of a group's tasks into one super task, thus making the sub-tasks run serially. But this will probably cause delays in other groups, which will not start until some other group completely finishes and frees some space in the thread pool.
As an alternative, consider chaining a group's tasks. The following code illustrates it:
public class MultiSerialExecutor {
    private final ExecutorService executor;

    public MultiSerialExecutor(int maxNumThreads) {
        executor = Executors.newFixedThreadPool(maxNumThreads);
    }

    public void addTaskSequence(List<Runnable> tasks) {
        executor.execute(new TaskChain(tasks));
    }

    private void shutdown() {
        executor.shutdown();
    }

    private class TaskChain implements Runnable {
        private List<Runnable> seq;
        private int ind;

        public TaskChain(List<Runnable> seq) {
            this.seq = seq;
        }

        @Override
        public void run() {
            seq.get(ind++).run(); // NOTE: No special error handling
            if (ind < seq.size())
                executor.execute(this);
        }
    }
}
The advantage is that no extra resources (threads/queues) are used, and that the granularity of tasks is better than in the naive approach. The disadvantage is that all of a group's tasks must be known in advance.
--edit--
To make this solution generic and complete, you may want to decide on error handling (i.e. whether a chain continues even if an error occurs), and it would also be a good idea to implement ExecutorService and delegate all calls to the underlying executor.
I would suggest using task queues:
For every group of tasks, create a queue and insert all of the group's tasks into it.
Now all your queues can be executed in parallel, while the tasks inside one queue are executed serially.
A quick Google search suggests that the Java API has no task/thread queues by itself. However, there are many tutorials available on coding one. Feel free to list good tutorials/implementations if you know some.
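A minimal sketch of that idea, closely modeled on the SerialExecutor example in the java.util.concurrent.Executor Javadoc (one instance per task group, all sharing one underlying pool):
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.Executor;

// One SerialExecutor per task group; all of them delegate to a shared pool.
class SerialExecutor implements Executor {
    private final Queue<Runnable> tasks = new ArrayDeque<>();
    private final Executor pool;
    private Runnable active;

    SerialExecutor(Executor pool) {
        this.pool = pool;
    }

    @Override
    public synchronized void execute(final Runnable r) {
        tasks.offer(new Runnable() {
            public void run() {
                try {
                    r.run();
                } finally {
                    scheduleNext(); // keep the group's tasks strictly serial
                }
            }
        });
        if (active == null) {
            scheduleNext();
        }
    }

    private synchronized void scheduleNext() {
        if ((active = tasks.poll()) != null) {
            pool.execute(active);
        }
    }
}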
I mostly agree with Dave's answer, but if you need to slice CPU time across all "groups", i.e. all task groups should progress in parallel, you might find this kind of construct useful (using removal as a "lock"; this worked fine in my case, although I imagine it tends to use more memory):
class TaskAllocator {
    // childQueuePerTaskGroup() is assumed to build one sub-queue per task group.
    private final ConcurrentLinkedQueue<Queue<Runnable>> entireWork
            = childQueuePerTaskGroup();

    public Queue<Runnable> lockTaskGroup() {
        return entireWork.poll();
    }

    public void release(Queue<Runnable> taskGroup) {
        entireWork.offer(taskGroup);
    }
}
and
class DoWork implements Runnable {
    private final TaskAllocator allocator;

    public DoWork(TaskAllocator allocator) {
        this.allocator = allocator;
    }

    public void run() {
        for (;;) {
            Queue<Runnable> taskGroup = allocator.lockTaskGroup();
            if (taskGroup == null) {
                // No more work
                return;
            }

            Runnable work = taskGroup.poll();
            if (work == null) {
                // This group is done
                continue;
            }

            // Do work, but never forget to release the group to
            // the allocator.
            try {
                work.run();
            } finally {
                allocator.release(taskGroup);
            }
        } // for
    }
}
You can then use the optimal number of threads to run the DoWork task. It's kind of a round-robin load balancer.
You can even do something more sophisticated by using this instead of a simple queue in TaskAllocator (task groups with more tasks remaining tend to get executed first):
ConcurrentSkipListSet<MyQueue<Runnable>> sophisticatedQueue =
        new ConcurrentSkipListSet<>(new SophisticatedComparator());
where SophisticatedComparator is
class SophisticatedComparator implements Comparator<MyQueue<Runnable>> {
    public int compare(MyQueue<Runnable> o1, MyQueue<Runnable> o2) {
        int diff = o2.size() - o1.size();
        if (diff == 0) {
            // This is crucial. You must assign unique ids to your
            // sub-queues and break the equality if they happen to have the same size.
            // Otherwise your queues will disappear...
            return o1.id - o2.id;
        }
        return diff;
    }
}
Actors are another solution to this type of problem.
Scala has actors, and so does Java, provided by Akka.
I had a problem similar to yours, and I used an ExecutorCompletionService, which works with an Executor to complete collections of tasks.
Here is an extract from the java.util.concurrent API docs (Java 7):
Suppose you have a set of solvers for a certain problem, each returning a value of some type Result, and would like to run them concurrently, processing the results of each of them that return a non-null value, in some method use(Result r). You could write this as:
void solve(Executor e, Collection<Callable<Result>> solvers)
        throws InterruptedException, ExecutionException {
    CompletionService<Result> ecs = new ExecutorCompletionService<Result>(e);
    for (Callable<Result> s : solvers)
        ecs.submit(s);
    int n = solvers.size();
    for (int i = 0; i < n; ++i) {
        Result r = ecs.take().get();
        if (r != null)
            use(r);
    }
}
So, in your scenario, every task will be a single Callable<Result>, and tasks will be grouped in a Collection<Callable<Result>>.
Reference:
http://docs.oracle.com/javase/7/docs/api/java/util/concurrent/ExecutorCompletionService.html
I have a queue of tasks in Java; the queue is stored in a table in the DB.
I need:
Only 1 thread per task.
No more than N threads running at the same time. This is because the threads have DB interaction and I don't want a bunch of DB connections opened.
I think I could do something like:
final Semaphore semaphore = new Semaphore(N);
while (isOnJob) {
    List<JobTask> tasks = getJobTasks();
    if (!tasks.isEmpty()) {
        final CountDownLatch cdl = new CountDownLatch(tasks.size());
        for (final JobTask task : tasks) {
            Thread tr = new Thread(new Runnable() {
                @Override
                public void run() {
                    try {
                        semaphore.acquire();
                        task.doWork();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    } finally {
                        semaphore.release();
                        cdl.countDown();
                    }
                }
            });
            tr.start();
        }
        cdl.await();
    }
}
I know that the ExecutorService class exists, but I'm not sure whether I can use it for this.
So, do you think this is the best way to do it? Or could you clarify how ExecutorService works, so I can use it to solve this?
final solution:
I think the best solution is something like:
while (isOnJob) {
    ExecutorService executor = Executors.newFixedThreadPool(N);
    List<JobTask> tasks = getJobTasks();
    if (!tasks.isEmpty()) {
        for (final JobTask task : tasks) {
            executor.submit(new Runnable() {
                @Override
                public void run() {
                    task.doWork();
                }
            });
        }
    }
    executor.shutdown();
    executor.awaitTermination(Long.MAX_VALUE, TimeUnit.HOURS);
}
Thanks a lot for the answers. BTW, I am using a connection pool, but the queries to the DB are very heavy and I don't want an uncontrolled number of tasks running at the same time.
You can indeed use an ExecutorService. For instance, create a new fixed thread pool using the newFixedThreadPool method. This way, besides reusing threads, you also guarantee that no more than N threads run at the same time.
Something along these lines:
private static final ExecutorService executor = Executors.newFixedThreadPool(N);
// ...

while (isOnJob) {
    List<JobTask> tasks = getJobTasks();
    if (!tasks.isEmpty()) {
        List<Future<?>> futures = new ArrayList<Future<?>>();
        for (final JobTask task : tasks) {
            Future<?> future = executor.submit(new Runnable() {
                @Override
                public void run() {
                    task.doWork();
                }
            });
            futures.add(future);
        }
        // you no longer need to use await
        for (Future<?> fut : futures) {
            fut.get();
        }
    }
}
Note that you no longer need the latch, as get() will wait for the computation to complete if necessary.
I agree with JG that ExecutorService is the way to go... but I think you're both making it more complicated than it needs to be.
Rather than creating a large number of threads (one per task), why not just create a fixed-size thread pool (with Executors.newFixedThreadPool(N)) and submit all the tasks to it? No need for a semaphore or anything like that: just submit the jobs to the thread pool as you get them, and the thread pool will handle them with up to N threads at a time.
If you aren't going to use more than N threads at a time, why would you want to create them?
Use a ThreadPoolExecutor instance with an unbounded queue and a fixed maximum number of threads, e.g. Executors.newFixedThreadPool(N). This will accept a large number of tasks but will only execute N of them concurrently.
If you choose a bounded queue instead (with a capacity of N), the Executor will reject the execution of the task (how exactly depends on the policy, which you can configure when working with ThreadPoolExecutor directly instead of using the Executors factory; see RejectedExecutionHandler).
If you need "real" congestion control, you should set up a bounded BlockingQueue with a capacity of N. Fetch the tasks you want done from the database and put them into the queue; if it's full, the calling thread will block. In another thread (perhaps also started using the Executor API), take tasks from the BlockingQueue and submit them to the Executor. If the BlockingQueue is empty, the consuming thread will block as well. To signal that you're done, use a "special" object (e.g. a singleton that marks the last/final item in the queue).
Achieving good performance also depends on the kind of work that needs to be done in the threads. If your DB is the bottleneck, I would start paying attention to how your threads access it. Using a connection pool is probably in order; it might help you achieve more throughput, since worker threads can re-use DB connections from the pool.