Configuring threadpool size in a service

I am writing a service which takes two urls urlA and urlB to fetch two integers a and b. The service returns the sum of a and b.
In its most simple form the service works like this:
public Integer getSumFromUrls(String urlA, String urlB) {
Integer a = fetchFromUrl(urlA);
Integer b = fetchFromUrl(urlB);
return a + b;
}
Here fetchFromUrl is a synchronous operation, so it blocks the processing thread until the value is available. To make things more efficient, I would rather use an ExecutorService to schedule the two fetches and return when both results are available. Here is the changed code (ignore the syntactic nuances):
public Integer getSumFromUrls(String urlA, String urlB) {
Future<Integer> aFuture = Executors.newSingleThreadScheduledExecutor().submit(new Callable<Integer>() {
public Integer call() {
return fetchFromUrl(urlA);
}
});
Future<Integer> bFuture = Executors.newSingleThreadScheduledExecutor().submit(new Callable<Integer>() {
public Integer call() {
return fetchFromUrl(urlB);
}
});
Integer a = aFuture.get();
Integer b = bFuture.get();
return a + b;
}
Here, I have created single-thread executors to execute the requests concurrently.
Since this code would be running in the context of a web service, I should probably not create the single-thread executors locally inside the function, but rather use N-sized thread pools shared across requests.
My questions here are:
Is the above understanding (italicised part) correct?
If yes, how should I choose the optimum size of the thread pool. Should it be a function of the thread pool size of my service container, or request throughput or both etc?
Is there a better way of optimising this scenario so that service threads are not blocked on doing IO most of the time.
Note: The details provided in this question are not the completely real scenarios but are representative of the same set of complexities required to answer the question.

If getSumFromUrls is executed every time a new request comes in, it creates new thread pools on each call and submits the tasks to them. If 1000 requests hit the service at the same moment, at least 1000 thread pools are created, and with them thousands of threads. Creating thousands of threads at once will be a problem for your application. As a rule of thumb, the number of active threads should be roughly equal to the number of available CPU cores, but that depends on the use case: if the tasks are CPU-intensive, keep the thread count close to the core count; if they are IO-intensive, you can use more threads. More threads mean more context switches, which have their own cost and may degrade application performance.
Is the above understanding (italicised part) correct?
-> Yes.
If yes, how should I choose the optimum size of the thread pool. Should it be a function of the thread pool size of my service container, or request throughput or both etc?
-> As mentioned above, it depends on the type of task you are running. You should use a common (shared) thread pool to execute those tasks.
Is there a better way of optimizing this scenario so that service threads are not blocked on doing IO most of the time?
-> You should benchmark different thread pool sizes. Note that the operating system automatically assigns the CPU to another thread while a thread is blocked on an IO operation and does not need the CPU.
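The shared-pool advice above could look roughly like the following sketch. The class name SumService, the pool size of 16, and the stubbed fetchFromUrl (which just parses its argument) are assumptions made for illustration; the real size should come from benchmarking. It assumes Java 8 or later for CompletableFuture.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class SumService {
    // One pool for the whole service, created once and reused by every request.
    // The size (16) is a placeholder, not a recommendation.
    private static final ExecutorService FETCH_POOL = Executors.newFixedThreadPool(16);

    // Stand-in for the blocking fetch in the question; here it only parses the "URL".
    static Integer fetchFromUrl(String url) {
        return Integer.valueOf(url);
    }

    // Starts both fetches on the shared pool and combines them without blocking the caller.
    static CompletableFuture<Integer> getSumFromUrlsAsync(String urlA, String urlB) {
        CompletableFuture<Integer> a =
                CompletableFuture.supplyAsync(() -> fetchFromUrl(urlA), FETCH_POOL);
        CompletableFuture<Integer> b =
                CompletableFuture.supplyAsync(() -> fetchFromUrl(urlB), FETCH_POOL);
        return a.thenCombine(b, Integer::sum);
    }

    // Blocking variant with the same shape as the method in the question.
    static Integer getSumFromUrls(String urlA, String urlB) {
        return getSumFromUrlsAsync(urlA, urlB).join();
    }

    // Call once when the service stops so the pool's threads can exit.
    static void shutdown() {
        FETCH_POOL.shutdown();
    }
}
```

If the service container supports asynchronous responses, returning the CompletableFuture from getSumFromUrlsAsync avoids parking a request thread in join().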

Related

How to convert Java Threads to Kotlin coroutines?

I have this "ugly" Java code I need to convert to Kotlin idiomatic coroutines and I cant quite figure out how.
Thread[] pool=new Thread[2*Runtime.getRuntime().availableProcessors()];
for (int i=0;i<pool.length;i++)
pool[i]=new Thread(){
public void run() {
int y; while((y=yCt.getAndIncrement())<out.length) putLine(y,out[y]);
}
};
for (Thread t:pool) t.start();
for (Thread t:pool) t.join();
I think it is possible to implement using runBlocking but how do I deal with availableProcessors count?

I'll make some assumptions here:
putLine() is a CPU-intensive operation, not an IO operation. I assume this because it is executed on 2 * CPU cores threads, a sizing usually used for CPU-intensive tasks.
We just need to execute putLine() for each item in out. From the above code it is not clear whether, for example, yCt is initially 0.
out isn't huge, e.g. millions of items.
You aren't looking for 1:1 the same code in Kotlin, but rather its equivalent.
Then the solution is really very easy:
coroutineScope {
out.forEachIndexed { index, item ->
launch(Dispatchers.Default) { putLine(index, item) }
}
}
A few words of explanation:
The Dispatchers.Default coroutine dispatcher is intended specifically for CPU-bound work, and its number of threads depends on the number of CPU cores. We don't need to create our own threads, because coroutines provide a suitable thread pool.
We don't handle a queue of tasks manually, because coroutines are lightweight and we can instead just schedule a separate coroutine for each item. They will be queued automatically.
coroutineScope() waits for its children, so we don't need to wait manually for all asynchronous tasks. Any code placed below coroutineScope() will be executed when all tasks have finished.
There are some differences in behavior between the Java/threads and Kotlin/coroutines code:
Dispatchers.Default by default has a number of threads equal to the number of CPU cores (with a minimum of two), not 2 * CPU cores.
In the coroutines solution, if any task fails, the whole operation throws an exception. In the original code, errors are ignored and the application continues in an inconsistent state.
In the coroutines solution, the thread pool is shared with other components of the application. This may or may not be the desired behavior.

CachedThreadPool vs FixedThreadPool

I would like to know which to use CachedThreadPool or FixedThreadPool in this particular scenario.
When the user logs into the app, a list of about 10 addresses is obtained. I need to do the following:
Convert the address into latitude and longitude for which I am calling a Google API
Obtain distance between the above fetched latitude and longitude with user's current location also with the help of a Google API
So, I have created a class GetDistance which implements Runnable. In this class I am first calling the Google API and parsing the response to get respective latitude and longitude and then calling and parsing result of another Google API to get driving distance.
private void getDistanceOfAllAddresses(List<Items> itemsList) {
ExecutorService exService = Executors.newCachedThreadPool(); //Executors.newFixedThreadPool(3);
for(int i =0; i<itemsList.size(); i++) {
exService.submit(new GetDistance(i,usersCurrentLocation));
}
exService.shutdown();
}
I have tried both CachedThreadPool and FixedThreadPool, and the time taken is almost the same. I am in favour of CachedThreadPool as it is recommended for small operations, but I have some concerns. Let's assume CachedThreadPool creates 10 threads (worst case) to complete the process (10 items). Will that be an issue if my app is running on lower-end devices, given that the number of threads created also affects the device's RAM usage?
I want to know your thoughts and opinions on this. Which is better to use?

Go with newCachedThreadPool; it is a better fit for this situation because your tasks are small and I/O (network) bound. For I/O-bound work you generally want more threads than processor cores (usually 1.5x to 2x) to get optimum throughput, and here newCachedThreadPool will manage the thread count itself. So newCachedThreadPool should have less overhead than newFixedThreadPool in your situation.
If you had CPU-intensive tasks, then newFixedThreadPool could have been a better choice.
Update
A list of addresses will be obtained about 10 addresses.
If you always need only 10 addresses, then it doesn't matter; go with newCachedThreadPool. But if you think the number of addresses can increase, then use newFixedThreadPool with at most 1.5x to 2x as many threads as there are cores available.
From the Java docs:
newFixedThreadPool
Creates a thread pool that reuses a fixed number of threads operating off a shared unbounded queue. At any point, at most nThreads threads will be active processing tasks. If additional tasks are submitted when all threads are active, they will wait in the queue until a thread is available. If any thread terminates due to a failure during execution prior to shutdown, a new one will take its place if needed to execute subsequent tasks. The threads in the pool will exist until it is explicitly shutdown.
newCachedThreadPool
Creates a thread pool that creates new threads as needed, but will reuse previously constructed threads when they are available. These pools will typically improve the performance of programs that execute many short-lived asynchronous tasks. Calls to execute will reuse previously constructed threads if available. If no existing thread is available, a new thread will be created and added to the pool. Threads that have not been used for sixty seconds are terminated and removed from the cache. Thus, a pool that remains idle for long enough will not consume any resources. Note that pools with similar properties but different details (for example, timeout parameters) may be created using ThreadPoolExecutor constructors.
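The sizing suggestion in the update above (a fixed pool capped at roughly 2x the core count) might be sketched as follows. DistanceLookup and lookUpDistance are hypothetical names; lookUpDistance stands in for the two network calls and simply returns the address length so the example runs offline.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class DistanceLookup {
    // Hypothetical stand-in for the geocoding and distance API calls in the question.
    static double lookUpDistance(String address) {
        return address.length();
    }

    static List<Double> distancesFor(List<String> addresses) {
        List<Double> results = new ArrayList<>();
        if (addresses.isEmpty()) {
            return results; // newFixedThreadPool rejects a size of 0
        }
        // Cap the pool at 2x the core count, and never create more threads than tasks.
        int cores = Runtime.getRuntime().availableProcessors();
        int poolSize = Math.min(addresses.size(), 2 * cores);
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<Callable<Double>> tasks = new ArrayList<>();
            for (String address : addresses) {
                tasks.add(() -> lookUpDistance(address));
            }
            // invokeAll blocks until every task has finished and keeps the input order.
            for (Future<Double> future : pool.invokeAll(tasks)) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for lookups", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a lookup failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

Because distancesFor blocks until all lookups are done, it would need to be called off the UI thread in an app.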

What the difference between ExecutorService's execute and thread.run in running threads concurrently in Java?

I'm new to concurrent programming in Java and have come across the following scenarios, where I'm confused about which approach to use when.
Scenario 1: In the following code I was trying to run threads by calling .start() on GPSService class which is a Runnable implementation.
int clientNumber = 0;
ServerSocket listener = new ServerSocket(port);
while (true) {
new GPSService(listener.accept(), clientNumber++, serverUrl).start();
}
Scenario 2: In the following code I was trying to run threads by using ExecutorService class as shown
int clientNumber = 0;
ServerSocket listener = new ServerSocket(port);
while(true) {
ExecutorService executor = Executors.newSingleThreadExecutor();
executor.execute(new GPSService(listener.accept(), clientNumber++, serverUrl));
executor.shutdown();
while (!executor.awaitTermination(1, TimeUnit.SECONDS)) {
// Threads are still running
System.out.println("Thread is still running");
}
// All threads are completed
System.out.println("\nThread completed it's execution and terminated successfully\n");
}
My questions are:
Which is the best practice for starting a thread in concurrent programming?
What problems will I run into if I use the first or the second approach?
Note: I've been facing an issue with the first scenario, where the program hangs every few days. Is that issue related to, or expected with, the first method?
Any good/helpful answer will be appreciated :) Thank you

There are no big differences between the two scenarios you posted, apart from the management of thread termination in Scenario 2; in both you create a new thread for each incoming request. If you want to use a thread pool, my advice is not to create one for every request but to create one per server and reuse its threads. Something like:
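public class YourClass {
    // created once, in an init method or constructor, and shared by all connections
    ExecutorService executor = Executors....; // choose from newCachedThreadPool() or newFixedThreadPool(int nThreads) or some custom option

    // hypothetical method wrapping the accept loop
    void serve(int port, String serverUrl) throws IOException {
        int clientNumber = 0;
        ServerSocket listener = new ServerSocket(port);
        while (true) {
            // each accepted connection becomes a task on the shared pool
            executor.execute(new GPSService(listener.accept(), clientNumber++, serverUrl));
        }
    }
}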
This will allow you to use a thread pool and to control how many threads your server uses. If you want to use an Executor, this is the preferred way to go.
With a server pool you need to decide how many threads there are in the pool. You have different choices: you can start either with a fixed number of threads or with a pool that tries to use a non-busy thread and creates a new one if all threads are busy (newCachedThreadPool()). The number of threads to allocate depends on many factors, such as the number of concurrent requests and their duration. The longer your server-side code takes, the more additional threads you need. If your server-side code is very fast, chances are high that the pool can recycle threads already allocated (since the requests do not all arrive at exactly the same instant).
Say, for example, that you have 10 requests during one second and each request lasts 0.2 seconds. If the requests arrive at 0, 0.1, 0.2, 0.3, 0.4, 0.5, ... seconds (for example 23/06/2015 7:16:00:00, 23/06/2015 7:16:00:01, 23/06/2015 7:16:00:02), you need only three threads, since the request arriving at 0.3 can be handled by the thread that served the first request (the one at 0), and so on (the request at 0.4 can reuse the thread used for the request that arrived at 0.1). Ten requests are managed by three threads.
I recommend that you read Java Concurrency in Practice, if you have not already (Task Execution is chapter 6); it is an excellent book on how to build concurrent applications in Java.
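The arithmetic in the example above can be checked with a small calculation. This is only a sketch: PeakConcurrency and its methods are made-up names, and times are in whole milliseconds to avoid floating-point rounding. It assumes a request occupies a thread for the half-open interval [arrival, arrival + duration). Under that assumption a thread freed at exactly 0.2 s can take the request arriving at 0.2 s, so the peak is two; as soon as requests run even slightly longer than 0.2 s, the peak becomes the three threads quoted in the answer.

```java
class PeakConcurrency {
    // Returns the largest number of requests in flight at the same time.
    // Each request occupies a thread for [arrival, arrival + duration).
    static int peak(long[] arrivalsMillis, long durationMillis) {
        int peak = 0;
        // The number of requests in flight can only rise at an arrival,
        // so it is enough to count at each arrival instant.
        for (long t : arrivalsMillis) {
            int inFlight = 0;
            for (long other : arrivalsMillis) {
                if (other <= t && t < other + durationMillis) {
                    inFlight++;
                }
            }
            peak = Math.max(peak, inFlight);
        }
        return peak;
    }

    // Arrival times 0, 100, 200, ... ms, i.e. one request every 0.1 s.
    static long[] everyHundredMillis(int count) {
        long[] arrivals = new long[count];
        for (int i = 0; i < count; i++) {
            arrivals[i] = i * 100L;
        }
        return arrivals;
    }
}
```

For example, PeakConcurrency.peak(PeakConcurrency.everyHundredMillis(10), 200) gives the number of threads that ten such requests need at once.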

From the Oracle documentation for Executors:
public static ExecutorService newCachedThreadPool()
Creates a thread pool that creates new threads as needed, but will reuse previously constructed threads when they are available. These pools will typically improve the performance of programs that execute many short-lived asynchronous tasks.
Calls to execute will reuse previously constructed threads if available. If no existing thread is available, a new thread will be created and added to the pool. Threads that have not been used for sixty seconds are terminated and removed from the cache.
Thus, a pool that remains idle for long enough will not consume any resources. Note that pools with similar properties but different details (for example, timeout parameters) may be created using ThreadPoolExecutor constructors.
public static ExecutorService newFixedThreadPool(int nThreads)
Creates a thread pool that reuses a fixed number of threads operating off a shared unbounded queue. At any point, at most nThreads threads will be active processing tasks. If additional tasks are submitted when all threads are active, they will wait in the queue until a thread is available.
If any thread terminates due to a failure during execution prior to shutdown, a new one will take its place if needed to execute subsequent tasks. The threads in the pool will exist until it is explicitly shutdown.
@Giovanni is saying that you don't have to provide the number of threads to newCachedThreadPool(), unlike newFixedThreadPool(), where you have to pass the maximum number of threads in the pool.
But between these two, newFixedThreadPool() is preferred. newCachedThreadPool() may cause a leak, and you may reach the maximum number of available threads due to its unbounded nature. Some people consider it evil.
Have a look at related SE question:
Why is an ExecutorService created via newCachedThreadPool evil?
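The documentation quoted above notes that pools with similar properties can be built with the ThreadPoolExecutor constructors. One possible middle ground, sketched below, keeps the cached pool's behavior (no idle core threads, 60-second timeout, direct hand-off) but caps the thread count. BoundedCachedPool and runTasks are illustrative names, and the choice of CallerRunsPolicy for overflow is an assumption, not something the answers above prescribe.

```java
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class BoundedCachedPool {
    // Like newCachedThreadPool(), but with an upper bound on the number of threads.
    static ThreadPoolExecutor create(int maxThreads) {
        return new ThreadPoolExecutor(
                0, maxThreads,                    // no core threads, at most maxThreads
                60L, TimeUnit.SECONDS,            // idle threads are reclaimed after 60 s
                new SynchronousQueue<Runnable>(), // hand tasks directly to a thread
                // When all threads are busy, run the task in the submitting thread,
                // which slows the submitter down instead of creating more threads.
                new ThreadPoolExecutor.CallerRunsPolicy());
    }

    // Demo helper: runs taskCount short tasks and returns how many completed.
    static int runTasks(ThreadPoolExecutor pool, int taskCount) {
        AtomicInteger completed = new AtomicInteger();
        for (int i = 0; i < taskCount; i++) {
            pool.execute(() -> {
                try {
                    Thread.sleep(10); // simulate a short piece of work
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                completed.incrementAndGet();
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return completed.get();
    }
}
```

After a run, pool.getLargestPoolSize() reports the highest number of threads the pool ever held, which stays at or below maxThreads.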

What is the use of a Thread pool in Java?

What is the use of a Thread pool? Is there a good real world example?

A thread pool is a group of threads initially created that waits for jobs and executes them. The idea is to have the threads always existing, so that we won't have to pay overhead time for creating them every time. They are appropriate when we know there's a stream of jobs to process, even though there could be some time when there are no jobs.
Thread Pools from the Java Tutorials has a good overview:
Using worker threads minimizes the overhead due to thread creation. Thread objects use a significant amount of memory, and in a large-scale application, allocating and deallocating many thread objects creates a significant memory management overhead.

A thread pool is a pool of already created worker threads that are ready to do work. It creates threads and manages them. Instead of creating a thread and discarding it once the task is done, a thread pool reuses threads as workers.
Why?
Because creating a thread is a time-consuming process that delays request processing. The number of clients is also limited by how many threads the JVM allows, which is obviously a finite number.
Create a fixed-size thread pool using the Executor framework -
Java 5 introduced a full-featured built-in thread pool framework, commonly known as the Executor framework.
Creating a fixed-size thread pool with the Java 5 Executor framework is pretty easy because of the static factory methods provided by the Executors class. All you need to do is define the task you want to execute concurrently and then submit that task to an ExecutorService.
From there, the thread pool takes care of how to execute the task; it can be executed by any free worker thread.
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ThreadPoolExample {
    public static void main(String[] args) {
        ExecutorService service = Executors.newFixedThreadPool(10); // create 10 worker threads in the thread pool
        for (int i = 0; i < 100; i++) {
            service.submit(new Task(i)); // submit the task to be executed
        }
        service.shutdown(); // stop accepting new tasks; queued tasks still run
    }
}

final class Task implements Runnable {
    private final int taskId;

    public Task(int id) {
        this.taskId = id;
    }

    @Override
    public void run() {
        System.out.println("Task ID : " + this.taskId + " performed by "
                + Thread.currentThread().getName());
    }
}
Output:
Task ID : 0 performed by pool-1-thread-1
Task ID : 3 performed by pool-1-thread-4
Task ID : 2 performed by pool-1-thread-3
Task ID : 1 performed by pool-1-thread-2
Task ID : 5 performed by pool-1-thread-6
Task ID : 4 performed by pool-1-thread-5
*Output may vary from system to system

A simple Google search will result in a wealth of information regarding Java thread pools and thread pools in general.
Here are some helpful links:
http://docs.oracle.com/javase/tutorial/essential/concurrency/pools.html
http://en.wikipedia.org/wiki/Thread_pool_pattern

Thread pools are most useful in a client-server kind of situation, where the number and timing of client requests cannot be determined or predicted.
In this scenario, creating a new thread each time a client request is made has two disadvantages:
1) Run time latency for thread creation:
Creation of a thread requires some time, thus the actual job does not start as soon as the request comes in. The client may notice a slight delay.
This criterion is crucial in interactive systems, where the client expects an immediate action.
2) Uncontrolled use of System Resources:
Threads consume system resources (memory etc.), so the system may run out of resources if there is an unexpected surge of client requests.
Thread pools address the above concerns by:
1) Creating specified number of threads on server start-up instead of creating them during the run-time.
2) Limiting the number of threads that are running at any given time.
Note: The above is applicable for Thread Pools of Fixed Sizes.

You can think of threads as the actual workers and of thread pools as groups of workers.
You may create multiple groups for various reasons, such as priority or purpose.
So, while one pool may be for general-purpose tasks like background schedules or email broadcasting, there might be a separate transaction processing pool to process multiple transactions simultaneously. With a single executor service, you would not want transactional jobs to be delayed until non-critical activities, such as broadcasting confirmation emails or database maintenance, have completed.
You can segregate them into pools and maintain them independently.
That's a very simplistic answer without getting into technical jargon.

There are already great answers here, but let's understand it with an example:
Problem without thread pool: Consider a web server application where each HTTP request is handled by a separate thread. If the application simply creates a new thread for every new HTTP request, and the system receives more requests than it can handle immediately, the application will suddenly stop responding to all requests when the overhead of all those threads exceed the capacity of the system.
Solution with thread pool: With a limit on the number of the threads that can be created, the application will not be servicing HTTP requests as quickly as they come in, but it will be servicing them as quickly as the system can sustain.
For more details on the overhead of all those threads, see: Why is creating a Thread said to be expensive?

Computing map: computing value ahead of time

I have a computing map (with soft values) that I am using to cache the results of an expensive computation.
Now I have a situation where I know that a particular key is likely to be looked up within the next few seconds. That key is also more expensive to compute than most.
I would like to compute the value in advance, in a minimum-priority thread, so that when the value is eventually requested it will already be cached, improving the response time.
What is a good way to do this such that:
I have control over the thread (specifically its priority) in which the computation is performed.
Duplicate work is avoided, i.e. the computation is only done once. If the computation task is already running then the calling thread waits for that task instead of computing the value again (FutureTask implements this. With Guava's computing maps this is true if you only call get but not if you mix it with calls to put.)
The "compute value in advance" method is asynchronous and idempotent. If a computation is already in progress it should return immediately without waiting for that computation to finish.
Avoid priority inversion, e.g. if a high-priority thread requests the value while a medium-priority thread is doing something unrelated but the computation task is queued on a low-priority thread, the high-priority thread must not be starved. Maybe this could be achieved by temporarily boosting the priority of the computing thread(s) and/or running the computation on the calling thread.
How could this be coordinated between all the threads involved?
Additional info
The computations in my application are image filtering operations, which means they are all CPU-bound. These operations include affine transforms (ranging from 50µs to 1ms) and convolutions (up to 10ms.) Of course the effectiveness of varying thread priorities depends on the ability of the OS to preempt the larger tasks.

You can arrange for "once only" execution of the background computation by using a Future with the ComputedMap. The Future represents the task that computes the value. The future is created by the ComputedMap and at the same time, passed to an ExecutorService for background execution. The executor can be configured with your own ThreadFactory implementation that creates low priority threads, e.g.
class LowPriorityThreadFactory implements ThreadFactory
{
    public Thread newThread(Runnable r) {
        Thread t = new Thread(r);
        t.setPriority(Thread.MIN_PRIORITY); // lowest priority the JVM offers
        return t;
    }
}
When the value is needed, your high-priority thread then fetches the future from the map, and calls the get() method to retrieve the result, waiting for it to be computed if necessary. To avoid priority inversion you add some additional code to the task:
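class HandlePriorityInversionTask extends FutureTask<ResultType>
{
    Integer priority; // non-null once a priority has been requested
    Integer originalPriority;
    Thread thread;

    HandlePriorityInversionTask(Callable<ResultType> callable) {
        super(callable); // FutureTask has no no-argument constructor
    }

    @Override
    public ResultType get() throws InterruptedException, ExecutionException {
        // raise the worker to the caller's priority while the caller waits
        if (!isDone())
            setPriority(Thread.currentThread().getPriority());
        return super.get();
    }

    @Override
    public void run() {
        synchronized (this) {
            thread = Thread.currentThread();
            originalPriority = thread.getPriority();
            if (priority != null) setPriority(priority);
        }
        super.run();
    }

    @Override
    protected synchronized void done() {
        // restore the worker's priority once the task has finished
        if (originalPriority != null) setPriority(originalPriority);
        thread = null;
    }

    synchronized void setPriority(int priority) {
        this.priority = Integer.valueOf(priority);
        if (thread != null)
            thread.setPriority(priority);
    }
}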
This takes care of raising the priority of the task to the priority of the thread calling get() if the task has not completed, and returns the priority to the original when the task completes, normally or otherwise. (To keep it brief, the code doesn't check if the priority is indeed greater, but that's easy to add.)
When the high-priority task calls get(), the future may not yet have begun executing. You might be tempted to avoid this by setting a large upper bound on the number of threads used by the executor service, but this may be a bad idea, since each thread could be running at high priority, consuming as much CPU as it can before the OS switches it out. The pool should probably be the same size as the number of hardware threads, e.g. size the pool to Runtime.getRuntime().availableProcessors(). If the task has not started executing, then rather than wait for the executor to schedule it (which is a form of priority inversion, since your high-priority thread is waiting for the low-priority threads to complete), you may choose to cancel it from the current executor and re-submit it on an executor running only high-priority threads.
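Putting the two sizing and priority suggestions together, a low-priority executor sized to the number of hardware threads might be wired up as in this sketch. LowPriorityPool, the thread name "prefetch-worker", and the use of daemon threads are assumptions for illustration only.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;

class LowPriorityPool {
    // Builds a fixed pool with one minimum-priority thread per available processor.
    static ExecutorService create() {
        ThreadFactory lowPriority = r -> {
            Thread t = new Thread(r, "prefetch-worker");
            t.setPriority(Thread.MIN_PRIORITY);
            t.setDaemon(true); // assumption: prefetch work should not keep the JVM alive
            return t;
        };
        int size = Runtime.getRuntime().availableProcessors();
        return Executors.newFixedThreadPool(size, lowPriority);
    }

    // Demo helper: reports the priority a task actually runs at inside the pool.
    static int priorityOfWorker(ExecutorService pool) {
        try {
            return pool.submit(() -> Thread.currentThread().getPriority()).get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("task failed", e.getCause());
        }
    }
}
```

How much effect the priority has in practice depends on the operating system's scheduler, as the question itself notes.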

One common way of coordinating this type of situation is to have a map whose values are FutureTask objects. So, stealing as an example some code I wrote from a web server of mine, the essential idea is that for a given parameter, we see if there is already a FutureTask (meaning that the calculation with that parameter has already been scheduled), and if so we wait for it. In this example, we otherwise schedule the lookup, but that could be done elsewhere with a separate call if that was desirable:
private final ConcurrentMap<WordLookupJob, Future<CharSequence>> cache = ...
private Future<CharSequence> getOrScheduleLookup(final WordLookupJob word) {
Future<CharSequence> f = cache.get(word);
if (f == null) {
Callable<CharSequence> ex = new Callable<CharSequence>() {
public CharSequence call() throws Exception {
return doCalculation(word);
}
};
Future<CharSequence> ft = executor.submit(ex);
f = cache.putIfAbsent(word, ft);
if (f != null) {
// somebody slipped in with the same word -- cancel the
// lookup we've just started and return the previous one
ft.cancel(true);
} else {
f = ft;
}
}
return f;
}
In terms of thread priorities: I wonder if this will achieve what you think it will? I don't quite understand your point about raising the priority of the lookup above the waiting thread: if the thread is waiting, then it's waiting, whatever the relative priorities of other threads... (You might want to have a look at some articles I've written on thread priorities and thread scheduling, but to cut a long story short, I'm not sure that changing the priority will necessarily buy you what you're expecting.)
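As a variation on the map-of-futures idea, ConcurrentMap.computeIfAbsent (Java 8 or later) can schedule the computation only when the key is new, so no duplicate task has to be cancelled. The sketch below uses hypothetical names (PrefetchingCache, prefetch, compute) and a trivial computation; it does not cover soft values or thread priorities.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

class PrefetchingCache {
    private final ConcurrentMap<String, CompletableFuture<Integer>> cache =
            new ConcurrentHashMap<>();
    // Pool size of 2 is a placeholder for this example.
    private final ExecutorService executor = Executors.newFixedThreadPool(2);
    // Counts how often the expensive computation actually ran (for demonstration).
    final AtomicInteger computations = new AtomicInteger();

    // Hypothetical stand-in for the expensive computation.
    private Integer compute(String key) {
        computations.incrementAndGet();
        return key.length();
    }

    // Schedules the computation at most once per key and returns immediately.
    private CompletableFuture<Integer> schedule(String key) {
        return cache.computeIfAbsent(key,
                k -> CompletableFuture.supplyAsync(() -> compute(k), executor));
    }

    // "Compute in advance": asynchronous and idempotent.
    void prefetch(String key) {
        schedule(key);
    }

    // Waits for a computation that is already scheduled, or schedules it now.
    Integer get(String key) {
        return schedule(key).join();
    }

    void shutdown() {
        executor.shutdown();
    }
}
```

Calling prefetch twice for the same key followed by get runs compute only once, because ConcurrentHashMap applies the mapping function at most once per absent key.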

I suspect that you are heading down the wrong path by focusing on thread priorities. Usually the data that a cache holds is expensive to compute because of I/O (out-of-memory data) rather than CPU-bound logic. If you're prefetching to anticipate a user's future action, such as looking at unread emails, then it indicates to me that your work is likely I/O bound. This means that as long as thread starvation does not occur (which schedulers prevent), playing games with thread priority won't offer much of a performance improvement.
If the cost is an I/O call, then the background thread is blocked waiting for the data to arrive, and processing that data should be fairly cheap (e.g. deserialization). As a change in thread priority won't offer much of a speed-up, performing the work asynchronously on a background thread pool should be sufficient. If the cache miss penalty is too high, then using multiple layers of caching tends to help further reduce the user-perceived latency.

As an alternative to thread priorities, you could perform a low-priority task only if no high-priority tasks are in progress. Here's a simple way to do that:
AtomicInteger highPriorityCount = new AtomicInteger();
void highPriorityTask() {
highPriorityCount.incrementAndGet();
try {
highPriorityImpl();
} finally {
highPriorityCount.decrementAndGet();
}
}
void lowPriorityTask() {
if (highPriorityCount.get() == 0) {
lowPriorityImpl();
}
}
In your use case, both Impl() methods would call get() on the computing map, highPriorityImpl() in the same thread and lowPriorityImpl() in a different thread.
You could write a more sophisticated version that defers low-priority tasks until the high-priority tasks complete and limits the number of concurrent low-priority tasks.
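One possible shape for the more sophisticated version is sketched below. PriorityGate, runHigh and submitLow are made-up names. The sketch defers low-priority tasks while any high-priority task is in flight and releases them when the last one finishes; the number of concurrent low-priority tasks is limited by whatever Executor is passed in (for instance a small fixed pool), which is an assumption about how the limit would be enforced. A low-priority task that has already started is not interrupted.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.concurrent.Executor;

class PriorityGate {
    private final Object lock = new Object();
    private int highInFlight = 0;                          // guarded by lock
    private final Deque<Runnable> deferred = new ArrayDeque<>(); // guarded by lock
    private final Executor lowExecutor; // its size limits concurrent low-priority work

    PriorityGate(Executor lowExecutor) {
        this.lowExecutor = lowExecutor;
    }

    // Runs a high-priority task on the calling thread.
    void runHigh(Runnable task) {
        synchronized (lock) {
            highInFlight++;
        }
        try {
            task.run();
        } finally {
            List<Runnable> toRun = new ArrayList<>();
            synchronized (lock) {
                // When the last high-priority task ends, release everything deferred.
                if (--highInFlight == 0) {
                    toRun.addAll(deferred);
                    deferred.clear();
                }
            }
            // Hand off outside the lock so a slow executor cannot block other callers.
            for (Runnable r : toRun) {
                lowExecutor.execute(r);
            }
        }
    }

    // Runs a low-priority task now, or defers it while high-priority work is active.
    void submitLow(Runnable task) {
        synchronized (lock) {
            if (highInFlight > 0) {
                deferred.add(task);
                return;
            }
        }
        lowExecutor.execute(task);
    }
}
```

Passing Runnable::run as the executor runs low-priority tasks inline, which is convenient for trying the gate out; a real service would more likely pass a bounded pool.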
