Java threadpool oversubscription


I have a mainly CPU-intensive operation that runs on a thread pool.
However, the operation spends some of its time waiting for external events, which do not occur uniformly in time.
As far as I know, Java has no thread pool implementation that automatically sizes its number of threads based on observed task throughput (as Microsoft's CLR 4 does). Is there at least a way to manually tell the thread pool to increase its size when a blocking operation starts and to decrease it when the operation ends?
For example, with 8 cores the pool size is 8.
If the operation is 100% CPU bound, just use this fixed pool.
If there is some blocking operation, one should be able to do this:
pool.increase();
waitForSpecialKeyPress();
pool.decrease();
Here is how it is being done in Microsoft's C++ Async library: Use Oversubscription to Offset Latency

You could extend ThreadPoolExecutor and add your own increase() and decrease() methods, each of which simply calls setMaximumPoolSize(getMaximumPoolSize() +/- 1).
Make sure to synchronize the methods so that concurrent calls don't mess up the pool size by accident.
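A minimal sketch of that idea follows. The class name ResizablePool and the choice of an unbounded LinkedBlockingQueue are assumptions, not part of the answer above. One detail worth knowing: with an unbounded queue, ThreadPoolExecutor only starts threads up to the core size, so this sketch adjusts the core size together with the maximum. The order of the two setter calls matters because newer JDKs reject a core size larger than the maximum.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical class name; a fixed-size pool whose size can be bumped around blocking calls.
class ResizablePool extends ThreadPoolExecutor {

    ResizablePool(int threads) {
        super(threads, threads, 0L, TimeUnit.MILLISECONDS,
              new LinkedBlockingQueue<Runnable>());
    }

    // Call just before a blocking operation starts.
    synchronized void increase() {
        int n = getMaximumPoolSize() + 1;
        setMaximumPoolSize(n); // raise the maximum first so core never exceeds it
        setCorePoolSize(n);    // with an unbounded queue, only the core size adds threads
    }

    // Call when the blocking operation has finished.
    synchronized void decrease() {
        int n = getCorePoolSize() - 1;
        if (n < 1) {
            return;            // never shrink below one thread
        }
        setCorePoolSize(n);    // lower core first so it never exceeds the maximum
        setMaximumPoolSize(n);
    }
}
```

In a task you would call pool.increase() before the blocking call and pool.decrease() in a finally block, so the pool size is restored even if the wait throws.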

Java 7's ForkJoinPool has a ManagedBlocker, which can be used to keep the pool informed about blocked threads, so that it can schedule more threads if necessary.
EDIT: I forgot to mention, the classes are also available for Java 6 as jsr166y.
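As a rough illustration of ManagedBlocker, here is a sketch that wraps an arbitrary blocking call. BlockingCall and run are hypothetical names, and the code assumes Java 8 or later for Supplier and lambdas. ForkJoinPool.managedBlock is the real entry point; when called from a ForkJoinPool worker it lets the pool compensate for the blocked thread, and when called from any other thread it simply runs the blocker.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.function.Supplier;

// Hypothetical helper: runs a blocking call while telling the pool about it.
final class BlockingCall<T> implements ForkJoinPool.ManagedBlocker {
    private final Supplier<T> call;
    private T result;
    private boolean done;

    private BlockingCall(Supplier<T> call) {
        this.call = call;
    }

    @Override
    public boolean block() {
        if (!done) {
            result = call.get(); // the potentially blocking work
            done = true;
        }
        return true;             // no further blocking is necessary
    }

    @Override
    public boolean isReleasable() {
        return done;
    }

    static <T> T run(Supplier<T> call) {
        BlockingCall<T> blocker = new BlockingCall<>(call);
        try {
            ForkJoinPool.managedBlock(blocker);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while blocked", e);
        }
        return blocker.result;
    }
}
```

Inside a ForkJoinPool task, the question's waitForSpecialKeyPress() would then be invoked through BlockingCall.run(...) instead of being called directly.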

Related

ForkJoin vs other thread pools in Java?

Java ForkJoin pools have been compared to the other "classic" thread pool implementations in Java many times. The question I have is slightly different though:
Can I use a single, shared ForkJoin pool for an application that has BOTH types of thread usage: long-running, socket-handling, transactional threads AND short-running tasks (CompletableFuture)? Or do I have to go through the pain of maintaining 2 separate pools, one for each type of need?
... In other words, is there a significant (performance?) penalty if ForkJoin is used in places where other Java thread pool implementations suffice?

According to the documentation, it depends:
(3) Unless the ForkJoinPool.ManagedBlocker API is used, or the number of possibly blocked tasks is known to be less than the pool's ForkJoinPool.getParallelism() level, the pool cannot guarantee that enough threads will be available to ensure progress or good performance.
The parallelism is coupled to the number of available CPU cores. So given enough CPU cores and not too many blocking I/O tasks, you could use the commonPool. That does not mean you should, though. For one thing, ForkJoinPool is explicitly not designed for long-running (blocking) tasks. For another thing, you probably want to do something with long-running (blocking) tasks during shutdown.
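If you go with two pools, the split can stay fairly small. The sketch below is only one possible arrangement: SplitPools, slowLookup, and the pool size of 4 are made up for illustration. The blocking step is given its own executor through CompletableFuture.supplyAsync(supplier, executor), while the short follow-up stage uses the default asynchronous executor (normally the common pool).

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class SplitPools {

    static String demo() {
        // Dedicated pool for blocking work; the size 4 is an arbitrary example value.
        ExecutorService blockingPool = Executors.newFixedThreadPool(4);
        try {
            CompletableFuture<String> future = CompletableFuture
                    .supplyAsync(SplitPools::slowLookup, blockingPool) // blocking step
                    .thenApplyAsync(String::toUpperCase);              // short CPU-bound step
            return future.join();
        } finally {
            blockingPool.shutdown(); // the owner of the pool decides how to shut it down
        }
    }

    // Stand-in for a socket read or other blocking call.
    static String slowLookup() {
        try {
            Thread.sleep(50);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "reply";
    }
}
```

Keeping the blocking pool separate also gives you one obvious place to handle those tasks during shutdown.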

Optimization of Thread Pool Executor-java

I am replacing legacy Thread usage with ThreadPoolExecutor.
I have created executor as below:
pool = new ThreadPoolExecutor(coreSize, size, 0L, TimeUnit.MILLISECONDS,
        new LinkedBlockingQueue<Runnable>(coreSize),
        new CustomThreadFactory(name),
        new CustomRejectionExecutionHandler());
pool.prestartAllCoreThreads();
Here the core size is maxpoolsize/5. I pre-start all the core threads, roughly 160 of them, when the application starts up.
In the legacy design we were creating and starting around 670 threads.
But the point is that even after replacing the legacy design with the Executor, we are not getting much better results.
To measure memory usage we use the top command.
To measure time we have placed log statements that record System.currentTimeMillis().
Please tell me how to optimize this design. Thanks.

But the point is even after using Executor and creating and replacing legacy design we are not getting much better results.
I am assuming that you are looking at the overall throughput from your application and you are not seeing a better performance as opposed to running each task in its own thread -- i.e. not with a pool?
This sounds like you were not being blocked because of context switching. Maybe your application is IO bound or otherwise waiting on some other system resource. 670 threads sounds like a lot and you would have been using a lot of thread stack memory but otherwise it may not have been holding back the performance of your application.
Typically we use the ExecutorService classes not necessarily because they are faster than raw threads but because the code is easier to manage. The concurrent classes take a lot of locking, queueing, etc. out of your hands.
A couple of code comments:
I'm not sure you want the LinkedBlockingQueue to be limited by core-size. Those are two different numbers. core-size is the minimum number of threads in the pool. The size of the BlockingQueue is how many jobs can be queued up waiting for a free thread.
As an aside, the ThreadPoolExecutor will never allocate a thread beyond the core thread count unless the BlockingQueue is full. In your case, the next thread is forked only when all of the core threads are busy and the queue already holds core-size queued tasks.
I've never had to use pool.prestartAllCoreThreads();. The core threads will be started once tasks are submitted to the pool so I don't think it buys you much -- at least not with a long running application.
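The queue-before-threads behaviour described in these comments can be observed with a small experiment. This is just a sketch: QueueThenGrow is a made-up name, and the numbers (core 1, maximum 3, queue capacity 2) are chosen only to keep the demo short.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class QueueThenGrow {

    // Returns the pool size after three submissions and after the fourth.
    static int[] poolSizes() {
        CountDownLatch release = new CountDownLatch(1);
        Runnable blocked = () -> {
            try {
                release.await(); // keep the worker busy until the demo is over
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        // core = 1, max = 3, queue capacity = 2 (demo values)
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 3, 0L, TimeUnit.MILLISECONDS, new ArrayBlockingQueue<Runnable>(2));
        try {
            pool.execute(blocked); // starts the single core thread
            pool.execute(blocked); // queued
            pool.execute(blocked); // queued, the queue is now full
            int beforeOverflow = pool.getPoolSize();
            pool.execute(blocked); // the queue refuses it, so a second thread is started
            int afterOverflow = pool.getPoolSize();
            return new int[] {beforeOverflow, afterOverflow};
        } finally {
            release.countDown();
            pool.shutdown();
        }
    }
}
```

The pool stays at one thread while the queue still has room and grows to two only once the queue is full, which is why a queue capacity equal to the core size delays the creation of extra threads.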
For time we have placed loggers of System.currentTime in millis to check the usage.
Be careful with this. Too many log statements could affect the performance of your application more than re-architecting it does. But I assume you added the logging after you didn't see a performance improvement.

The executor merely wraps the creation/usage of Threads, so it's not doing anything magical.
It sounds like you have a bottleneck elsewhere. Are you locking on a single object? Do you have a single single-threaded resource that every thread hits? In such a case you wouldn't see any change in behaviour.
Is your process CPU-bound? If so, your threads should (very roughly speaking) match the number of processing cores available. Note that each thread you create consumes memory for its stack, and if you're memory bound, then creating multiple threads won't help here.
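For the CPU-bound case, a pool sized to the core count might look like the sketch below. CpuSizedPool and sumOfSquares are hypothetical names, and the workload is only a placeholder for real CPU-bound work; Runtime.availableProcessors() is the standard way to ask the JVM how many processors it can use.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class CpuSizedPool {

    // Splits a CPU-bound sum across one worker thread per available core.
    static long sumOfSquares(int n) {
        int threads = Runtime.getRuntime().availableProcessors();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Long>> parts = new ArrayList<>();
            for (int t = 0; t < threads; t++) {
                final int start = t;
                Callable<Long> part = () -> {
                    long sum = 0;
                    // Worker t handles t+1, t+1+threads, t+1+2*threads, ...
                    for (long i = start + 1; i <= n; i += threads) {
                        sum += i * i;
                    }
                    return sum;
                };
                parts.add(pool.submit(part));
            }
            long total = 0;
            for (Future<Long> f : parts) {
                total += f.get();
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

If the bottleneck is a shared lock or a single-threaded resource rather than the CPU, this sizing will not help, which is the point made above.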

Java - Managing Size of Thread Pool (Increasing mostly)

I'm trying to use a thread pool in Java, but the number of threads is not known in advance, so I'm trying to find a solution. Two questions came up:
I'm looking for a way to increase the size of the thread pool over time, but I haven't come up with anything yet. Any suggestions? Some say Executors.newCachedThreadPool() should work, but the documentation of the method says it's for short-lived threads.
What if I set the size of the thread pool to a big number like 50 or 100? Does that work fine?

You can use Executors.newCachedThreadPool for longer-lived tasks too, but if you have long-running tasks and they're added constantly and more frequently than existing tasks complete, the number of threads will spin out of control. In that case it might be a better idea to use a (larger) fixed-size thread pool and let further tasks wait in the queue for a free thread.
This will only mean you'll (probably) have lots of live threads that are sleeping (idle) most of the time. Basically the things to consider are:
How many threads your system can handle (i.e. how many threads can be created in total; on Windows machines this can be fewer than 1000, while on Linux you can get tens of thousands of threads, and even more with some tweaking of the system configuration).
Each thread consumes at least the stack size of a single thread in terms of memory (on Linux this can be something like 1-8 MB per thread by default; again, it can be tweaked with ulimit and the JVM's -Xss parameter).
At least with NPTL, there should be minimal or almost zero context-switching penalty for sleeping threads, so excess threads aren't "heavy" in terms of CPU usage.
That being said, it'd probably be best to use the ThreadPoolExecutor's constructors directly to get the kind of pooling you want.
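As one possible configuration (a sketch, not the only sensible choice), the pool below starts threads on demand up to a fixed cap, queues everything beyond that, and lets idle threads exit. ElasticPools and newBoundedElasticPool are hypothetical names. Note that allowCoreThreadTimeOut(true) requires a keep-alive time greater than zero.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class ElasticPools {

    // Grows on demand up to maxThreads, queues the overflow, and lets idle threads exit.
    static ThreadPoolExecutor newBoundedElasticPool(int maxThreads, long idleSeconds) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                maxThreads, maxThreads,               // core == max: threads are added before queueing
                idleSeconds, TimeUnit.SECONDS,        // must be > 0 for the call below
                new LinkedBlockingQueue<Runnable>()); // unbounded queue for the overflow
        pool.allowCoreThreadTimeOut(true);            // idle threads are released after idleSeconds
        return pool;
    }
}
```

For example, newBoundedElasticPool(50, 60) never runs more than 50 threads, yet an idle application drops back to zero threads after a minute. The unbounded queue is itself a trade-off, since a producer that outpaces the pool will fill memory with queued tasks instead of threads.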

Executors.newCachedThreadPool() allows you to create threads on demand. I think you can start by using this. I cannot see where it's stated that it's for short-lived threads, but I bet the reason is that, since you are re-using available threads, having short tasks allows you to keep the number of simultaneously active threads quite low.
Unless you've got too many threads running (you can check this using JVisualVM or JConsole), I would suggest sticking with that solution, especially because the number of expected threads is undefined. Then analyze the VM and tune your pool accordingly.
For question 2: were you referring to using something like Executors.newFixedThreadPool(int)? If yes, remember that going above the number of threads you defined when you created the pool will make tasks wait, unlike newCachedThreadPool, in which new threads are created dynamically.

Why is an ExecutorService created via newCachedThreadPool evil?

Paul Tyma presentation has this line:
Executors.newCacheThreadPool evil, die die die
Why is it evil?
I will hazard a guess: is it because the number of threads will grow in an unbounded fashion? Thus a server that has been slashdotted would probably die if the JVM's max thread count was reached?

(This is Paul)
The intent of the slide was (apart from having facetious wording) that, as you mention, that thread pool grows without bound, creating new threads.
A thread pool inherently represents a queue and transfer point of work within a system. That is, something is feeding it work to do (and it may be feeding work elsewhere too). If a thread pool starts to grow, it's because it cannot keep up with demand.
In general, that's fine as computer resources are finite and that queue is built to handle bursts of work. However, that thread pool doesn't give you control over being able to push the bottleneck forward.
For example, in a server scenario, a few threads might be accepting on sockets and handing a thread pool the clients for processing. If that thread pool starts to grow out of control - the system should stop accepting new clients (in fact, the "acceptor" threads then often hop into the thread-pool temporarily to help process clients).
The effect is similar if you use a fixed thread pool with an unbounded input queue. Anytime you consider the scenario of the queue filling out of control - you realize the problem.
IIRC, Matt Welsh's seminal SEDA servers (which are asynchronous) created thread pools which modified their size according to server characteristics.
The idea of no longer accepting new clients sounds bad until you realize the alternative is a crippled system which is processing no clients. (Again, with the understanding that computers are finite: even an optimally tuned system has a limit.)
Incidentally, JVMs limit threads to 16k (usually) or 32k threads depending on the JVM. But if you are CPU bound, that limit isn't very relevant - starting yet another thread on a CPU-bound system is counterproductive.
I've happily run systems at 4 or 5 thousand threads. But nearing the 16k limit things tend to bog down (this limit is JVM-enforced; we had many more threads in Linux C++) even when not CPU bound.

The problem with Executors.newCachedThreadPool() is that the executor will create and start as many threads as necessary to execute the tasks submitted to it. While this is mitigated by the fact that idle threads are released (the thresholds are configurable), this can indeed lead to severe resource starvation, or even crash the JVM (or some badly designed OS).

There are a couple of issues with it. Unbounded growth in terms of threads is an obvious issue: if you have CPU-bound tasks, then allowing many more threads than there are available CPUs to run them is simply going to create scheduler overhead, with your threads context switching all over the place and none actually progressing much. If your tasks are IO bound, though, things get more subtle. Knowing how to size pools of threads that are waiting on network or file IO is much more difficult, and depends a lot on the latencies of those IO events. Higher latencies mean you need (and can support) more threads.
The cached thread pool continues adding new threads as long as the rate of task production outstrips the rate of execution. There are a couple of small barriers to this (such as locks that serialise new thread id creation), but this unbounded growth can lead to out-of-memory errors.
The other big problem with the cached thread pool is that it can be slow for the task producer thread. The pool is configured with a SynchronousQueue for tasks to be offered to. This queue implementation basically has zero size and only works when there is a matching consumer for a producer (one thread is polling while another is offering). The actual implementation was significantly improved in Java 6, but it is still comparatively slow for the producer, particularly when the offer fails (as the producer is then responsible for creating a new thread to add to the pool). Often it is better for the producer thread to simply drop the task on an actual queue and continue.
The problem is, no one has a pool with a small core set of threads which, when they are all busy, creates new threads up to some max and then enqueues subsequent tasks. Fixed thread pools seem to promise this, but they only start adding more threads when the underlying queue rejects more tasks (it is full). A LinkedBlockingQueue never gets full, so these pools never grow beyond the core size. An ArrayBlockingQueue has a capacity, but as the pool only grows when that capacity is reached, this doesn't mitigate the production rate until it is already a big problem. Currently the solution requires using a good rejected execution policy such as caller-runs, but it needs some care.
Developers see the cached thread pool and blindly use it without really thinking through the consequences.
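To make the caller-runs idea concrete, here is a small sketch with a deliberately tiny pool (one thread, one queue slot) so that the overflow is easy to trigger. CallerRunsDemo is a made-up name; ThreadPoolExecutor.CallerRunsPolicy is the standard handler. Once the thread and the queue are both occupied, the next task runs on the submitting thread, which slows the producer down instead of adding threads.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

final class CallerRunsDemo {

    // Returns true if the overflow task ran on the submitting thread.
    static boolean overflowRunsOnCaller() {
        CountDownLatch release = new CountDownLatch(1);
        Runnable blocked = () -> {
            try {
                release.await(); // keep the worker busy until the demo is over
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<Runnable>(1),        // bounded queue
                new ThreadPoolExecutor.CallerRunsPolicy()); // overflow runs on the caller
        AtomicReference<Thread> ranOn = new AtomicReference<>();
        try {
            pool.execute(blocked); // occupies the only worker
            pool.execute(blocked); // fills the queue
            pool.execute(() -> ranOn.set(Thread.currentThread())); // rejected, so the caller runs it
            return ranOn.get() == Thread.currentThread();
        } finally {
            release.countDown();
            pool.shutdown();
        }
    }
}
```

The care it needs: while the caller is busy running a task it submits nothing else, so if the caller is, say, a socket acceptor, new connections wait in the meantime.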

Threads usage in Application

Can anyone tell me how many threads I can use in an application? I mean, is thread usage bounded by some fixed number?
Can I use the same thread more than once?
For example:
public Thread td;
td = new Thread(this);
td.start();
Can I use the above thread in a different class or method of the same application?
Please help.

Is thread usage bounded by some fixed number?
There is no fixed limit on the number of threads, but it is limited by the memory available to the program.
Can I use the same thread more than once?
A pooled thread can be reused for any number of tasks. Check java.util.concurrent.Executor for using thread pools.

You may have to read up on thread concepts in depth. Threads are not like reusable chunks. They come with a lot of issues to address, such as race conditions. You really need to know what you're doing before using them.

A thread can be started only once. To run another thread you have to create another instance.
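This is easy to check. In the sketch below (RestartDemo is a made-up name), the second call to start() on the same Thread object is refused with an IllegalThreadStateException, even though the thread has already finished.

```java
final class RestartDemo {

    // Returns true if the second start() on the same Thread object is refused.
    static boolean secondStartFails() {
        Thread t = new Thread(() -> { });
        t.start();
        try {
            t.join(); // wait until the thread has terminated
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        try {
            t.start(); // not legal: a Thread can only be started once
            return false;
        } catch (IllegalThreadStateException e) {
            return true;
        }
    }
}
```

To run the same work again, create a new Thread with the same Runnable, or hand the Runnable to an executor, which reuses its worker threads for you.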

The JVM doesn't enforce a maximum number of threads, but other factors such as OS support and available resources can impose one. Check the following question, which is along similar lines, for the maximum number of threads allowed:
Java very limited on max number of threads?
As for whether you can use a Thread multiple times:
You should have a look at ThreadPoolExecutor, which does pooling of threads.

Thread limits in the OS and in Java's implementation of threads can vary. In all cases, threads consume memory even when they're not doing anything, since the OS allocates a stack for each instance. In a 32-bit Windows app, the max number of threads per process is usually 2048 because the default stack size is 1 MB, so 1 MB x 2048 = 2 GB. However, Java.exe on Windows has a 256 KB stack size, so perhaps it can go higher.
However it is not usually necessary or desirable to spawn so many threads. Something like a web server or such like would probably use a thread pool and put sensible bounds on the maximum number of threads it allows at one time.
Only apps which have to deal with a large number of simultaneous actions (e.g. an IRC server) should consider spawning thousands of threads, and even then I question whether it's a good idea. With load balancing etc. the load can be farmed out over several PCs, which is a good thing from a maintenance point of view anyway.
