How to configure a ThreadPool for parallel webservice requests?

How to configure a ThreadPool for parallel webservice requests? - java

I want to send webservice calls to a webservice in parallel. There should be a maximum of 20 parallel requests waiting for a webservice response. Any other requests should wait for them to finish.
If a single user sends a request to me, this usually leads to sending 5 parallel requests to the target server.
So I could serve a maximum of 20/5 = 4 users at a time instantly. Others would have to wait, which is fine. Or being rejected on really high load.
Question: which thread pool should I use for this, and how should I configure it?
private ExecutorService asyncExecutor() {
ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
executor.setCorePoolSize(20);
executor.setMaxPoolSize(20);
executor.setQueueCapacity(100);
executor.initialize();
//return executor.getThreadPoolExecutor(); //I could as well use this. Which is better??
return Executors.newFixedThreadPool(20, executor);
}
I read the above as follows: the main pool can send 20 parallel requests to the webserver (thereby serve 4 users instantly in parralel). If all 20 threads are busy and a 5th user comes in, the requests are queued.
Question:
It the configuration correct?
Why do I have to set the nthreads=20 in Executors.newFixedThreadPool() additionally? What is the difference in setting the poolSize?
Is there a performance overhead of setting the corePoolSize=20? I mean, then the pool will never thrink. But I could probably not set it lower (eg to 5), because new threads are only created if the queue is exhausted. Right?

I think that
executor.setMaxPoolSize(20)
can be used to override initially set # of threads via constructor, it seems redundant in this case. Same applies to setCorePoolSize. It's just for dynamic control.
As per size of the pool, in general you may consider Little's Law

When calling Executors.newFixedThreadPool(20, executor); you are passing previously set up executor as a ThreadFactory, because ThreadPoolTaskExecutor implements ThreadFactory interface.
The result is an ExecutorService that starts with 20 threads ready, maximum number of threads 20 and an unbounded LinkedBlockingQueue for tasks waiting to be processed (you can see it here https://docs.oracle.com/javase/7/docs/api/java/util/concurrent/Executors.html#newFixedThreadPool(int,%20java.util.concurrent.ThreadFactory)).
So as to your first and second question- you can omit Executors.newFixedThreadPool() call and just return ThreadPoolTaskExecutor or call getThreadPoolExecutor() on it if you want to work with jdk class ThreadPoolExecutor.
As to your thrid question- there will be a penalty at a startup because jvm will have to create and run 20 threads but I guess thats negligible. One thing that will give you setting corePoolSize to something lower than maximumPoolSize is that unsed threads will be killed by jvm which can reduce the memory footprint of your application.
As to when new threads are created in ExecutorService- threads in ExecutorService are created when there are tasks in the queue and maximumPoolSize is not reached. If maximumPoolSize in executor is reached and queue is full- executor will throw RejectedExecutionException for every new task submitted to it, untill there is room again for a new ones. You can change this behaviour by implementing your own RejectedExecutionHandler.

Related

Java ExecutorService that dynamically scale the number of threads

I have a list of work-unit and I want to process them in parallel. Unit work is 8-15 seconds each, fully computational time, no I/O blocking. What I want to achieve is to have an ExecutorService that:
has zero threads instantiated when there is no work to do
can dynamically scale up to 20 thread if needed
allow me to add all work-units at once (without blocking the submission)
Something like:
Queue<WorkResult> queue = new ConcurrentLinkedDeque<>();
ExecutorService service = ....
for(WorkUnit unit : list) {
service.submit(() -> {
.. do some work ..
queue.offer(result);
);
}
while(queue.peek() != null) {
... process results while they arrive ...
}
What I tried with no success is:
Using a newCachedThreadPool() creates too many threads
Then I've used its internal call new ThreadPoolExecutor(0, 20, 60L, SECONDS, new SynchronousQueue<>()), but then I noticed that submit() is blocking due to the synchronous queue
So I've used new LinkedBlockingQueue(), just to find out that the ThreadPoolExecutor spawns only one thread
I'm sure there is official implementation to handle this very basic use-case of concurrency.
Can someone advice?

Create the ThreadPoolExecutor using a LinkedBlockingQueue and 20 as corePoolSize (first argument in the constructor):
new ThreadPoolExecutor(20, 20, 60L, SECONDS, new LinkedBlockingQueue<>());
If you use the LinkedBlockingQueue without a predefined capacity, the Pool:
Won't ever check maxPoolSize.
Won't create more threads than corePoolSize's specified number.
In your case, only one thread will be executed. And you're lucky to get one, since you set it to 0 and previous versions of Java (<1.6) wouldn't create any if the corePoolSize was set to 0 (how dare they?).
Further versions do create a new thread even if the corePoolSize is 0, which seems like ... a fix that is ... a bug that... changes ... a logical behaviour?.
Thread Pool Executor
Using an unbounded queue (for example a LinkedBlockingQueue without a
predefined capacity) will cause new tasks to wait in the queue when
all corePoolSize threads are busy. Thus, no more than corePoolSize
threads will ever be created. (And the value of the maximumPoolSize
therefore doesn't have any effect.)
About scaling down
In order to achieve removing all threads if there's no work to do, you will have to close the coreThreads specifically (they don't terminate by default). To achieve this, set allowCoreThreadTimeOut(true) before starting the Pool.
Be aware of setting a correct keep-alive timeout: for example, if a new task is received on average at every 6 seconds, setting the keep-alive time to 5 seconds could lead to unnecessary erase+create operations(oh dear thread, you just had to wait one second!). Set this timeout based on the task reception income speed.
allowCoreThreadTimeOut
Sets the policy governing whether core threads may time out and
terminate if no tasks arrive within the keep-alive time, being
replaced if needed when new tasks arrive. When false, core threads are
never terminated due to lack of incoming tasks. When true, the same
keep-alive policy applying to non-core threads applies also to core
threads. To avoid continual thread replacement, the keep-alive time
must be greater than zero when setting true. This method should in
general be called before the pool is actively used.
TL/DR
Unbounded LinkedBloquingQueue as task queue.
corePoolSize replacing maxPoolSize's meaning.
allowCoreThreadTimeOut(true) in order to allow the Pool to scale down using a timeout based mechanism that also affects coreThreads.
keep-alive value set to something logical based on the task reception latency.
This fresh mix will lead to an ExecutorService that 99,99999% percent of the time won't block the submitter (for this to happen, the number of tasks queued should be 2.147.483.647), and that efficiently scales the number of threads in base of the work load, fluctuating (in both directions) between { 0 <--> corePoolSize } concurrent threads.
As a suggestion, the queue's size should be monitorized, as the non-blocking behaviour has a price: the probability of getting OOM exceptions if it keeps growing without control, until INTEGER.MAX_VALUE is met (f.e: if the threads are deadlocked for an entire day while the submitters keep inserting tasks). Even if the task's size in memory could be small, 2.147.483.647 objects with its corresponding link wrappers, etc... is a lot of extra load.

The simplest way would be to use the method
public static ScheduledExecutorService newScheduledThreadPool(int corePoolSize)
Of class Executors. This gives you a simple out-of-the-box solution. The pool that you get will expand and shrink as per need. You can further configure it with methods dealing with core threads timeout etc. ScheduledExecutorService is an extension of ExecutorService class and is the only one that is out of the box may dynamically expand and shrink.

why is newCachedThreadPool good for short-lived asynchronous tasks? [duplicate]

newCachedThreadPool() versus newFixedThreadPool()
When should I use one or the other? Which strategy is better in terms of resource utilization?

I think the docs explain the difference and usage of these two functions pretty well:
newFixedThreadPool
Creates a thread pool that reuses a
fixed number of threads operating off
a shared unbounded queue. At any
point, at most nThreads threads will
be active processing tasks. If
additional tasks are submitted when
all threads are active, they will wait
in the queue until a thread is
available. If any thread terminates
due to a failure during execution
prior to shutdown, a new one will take
its place if needed to execute
subsequent tasks. The threads in the
pool will exist until it is explicitly
shutdown.
newCachedThreadPool
Creates a thread pool that creates new
threads as needed, but will reuse
previously constructed threads when
they are available. These pools will
typically improve the performance of
programs that execute many short-lived
asynchronous tasks. Calls to execute
will reuse previously constructed
threads if available. If no existing
thread is available, a new thread will
be created and added to the pool.
Threads that have not been used for
sixty seconds are terminated and
removed from the cache. Thus, a pool
that remains idle for long enough will
not consume any resources. Note that
pools with similar properties but
different details (for example,
timeout parameters) may be created
using ThreadPoolExecutor constructors.
In terms of resources, the newFixedThreadPool will keep all the threads running until they are explicitly terminated. In the newCachedThreadPool Threads that have not been used for sixty seconds are terminated and removed from the cache.
Given this, the resource consumption will depend very much in the situation. For instance, If you have a huge number of long running tasks I would suggest the FixedThreadPool. As for the CachedThreadPool, the docs say that "These pools will typically improve the performance of programs that execute many short-lived asynchronous tasks".

Just to complete the other answers, I would like to quote Effective Java, 2nd Edition, by Joshua Bloch, chapter 10, Item 68 :
"Choosing the executor service for a particular application can be tricky. If you’re writing a small program, or a lightly loaded server, using Executors.new- CachedThreadPool is generally a good choice, as it demands no configuration and generally “does the right thing.” But a cached thread pool is not a good choice for a heavily loaded production server!
In a cached thread pool, submitted tasks are not queued but immediately handed off to a thread for execution. If no threads are available, a new one is created. If a server is so heavily loaded that all of its CPUs are fully utilized, and more tasks arrive, more threads will be created, which will only make matters worse.
Therefore, in a heavily loaded production server, you are much better off using Executors.newFixedThreadPool, which gives you a pool with a fixed number of threads, or using the ThreadPoolExecutor class directly, for maximum control."

If you see the code in the grepcode, you will see, they are calling ThreadPoolExecutor. internally and setting their properties. You can create your one to have a better control of your requirement.
public static ExecutorService newFixedThreadPool(int nThreads) {
return new ThreadPoolExecutor(nThreads, nThreads,0L, TimeUnit.MILLISECONDS,
new LinkedBlockingQueue<Runnable>());
}
public static ExecutorService newCachedThreadPool() {
return new ThreadPoolExecutor(0, Integer.MAX_VALUE,
60L, TimeUnit.SECONDS,
new SynchronousQueue<Runnable>());
}

The ThreadPoolExecutor class is the base implementation for the executors that are returned from many of the Executors factory methods. So let's approach Fixed and Cached thread pools from ThreadPoolExecutor's perspective.
ThreadPoolExecutor
The main constructor of this class looks like this:
public ThreadPoolExecutor(
int corePoolSize,
int maximumPoolSize,
long keepAliveTime,
TimeUnit unit,
BlockingQueue<Runnable> workQueue,
ThreadFactory threadFactory,
RejectedExecutionHandler handler
)
Core Pool Size
The corePoolSize determines the minimum size of the target thread pool. The implementation would maintain a pool of that size even if there are no tasks to execute.
Maximum Pool Size
The maximumPoolSize is the maximum number of threads that can be active at once.
After the thread pool grows and becomes bigger than the corePoolSize threshold, the executor can terminate idle threads and reach to the corePoolSize again.
If allowCoreThreadTimeOut is true, then the executor can even terminate core pool threads if they were idle more than keepAliveTime threshold.
So the bottom line is if threads remain idle more than keepAliveTime threshold, they may get terminated since there is no demand for them.
Queuing
What happens when a new task comes in and all core threads are occupied? The new tasks will be queued inside that BlockingQueue<Runnable> instance. When a thread becomes free, one of those queued tasks can be processed.
There are different implementations of the BlockingQueue interface in Java, so we can implement different queuing approaches like:
Bounded Queue: New tasks would be queued inside a bounded task queue.
Unbounded Queue: New tasks would be queued inside an unbounded task queue. So this queue can grow as much as the heap size allows.
Synchronous Handoff: We can also use the SynchronousQueue to queue the new tasks. In that case, when queuing a new task, another thread must already be waiting for that task.
Work Submission
Here's how the ThreadPoolExecutor executes a new task:
If fewer than corePoolSize threads are running, tries to start a
new thread with the given task as its first job.
Otherwise, it tries to enqueue the new task using the
BlockingQueue#offer method. The offer method won't block if the queue is full and immediately returns false.
If it fails to queue the new task (i.e. offer returns false), then it tries to add a new thread to the thread pool with this task as its first job.
If it fails to add the new thread, then the executor is either shut down or saturated. Either way, the new task would be rejected using the provided RejectedExecutionHandler.
The main difference between the fixed and cached thread pools boils down to these three factors:
Core Pool Size
Maximum Pool Size
Queuing
+-----------+-----------+-------------------+---------------------------------+
| Pool Type | Core Size | Maximum Size | Queuing Strategy |
+-----------+-----------+-------------------+---------------------------------+
| Fixed | n (fixed) | n (fixed) | Unbounded `LinkedBlockingQueue` |
+-----------+-----------+-------------------+---------------------------------+
| Cached | 0 | Integer.MAX_VALUE | `SynchronousQueue` |
+-----------+-----------+-------------------+---------------------------------+
Fixed Thread Pool
Here's how the Excutors.newFixedThreadPool(n) works:
public static ExecutorService newFixedThreadPool(int nThreads) {
return new ThreadPoolExecutor(nThreads, nThreads,
0L, TimeUnit.MILLISECONDS,
new LinkedBlockingQueue<Runnable>());
}
As you can see:
The thread pool size is fixed.
If there is high demand, it won't grow.
If threads are idle for quite some time, it won't shrink.
Suppose all those threads are occupied with some long-running tasks and the arrival rate is still pretty high. Since the executor is using an unbounded queue, it may consume a huge part of the heap. Being unfortunate enough, we may experience an OutOfMemoryError.
When should I use one or the other? Which strategy is better in terms of resource utilization?
A fixed-size thread pool seems to be a good candidate when we're going to limit the number of concurrent tasks for resource management purposes.
For example, if we're going to use an executor to handle web server requests, a fixed executor can handle the request bursts more reasonably.
For even better resource management, it's highly recommended to create a custom ThreadPoolExecutor with a bounded BlockingQueue<T> implementation coupled with reasonable RejectedExecutionHandler.
Cached Thread Pool
Here's how the Executors.newCachedThreadPool() works:
public static ExecutorService newCachedThreadPool() {
return new ThreadPoolExecutor(0, Integer.MAX_VALUE,
60L, TimeUnit.SECONDS,
new SynchronousQueue<Runnable>());
}
As you can see:
The thread pool can grow from zero threads to Integer.MAX_VALUE. Practically, the thread pool is unbounded.
If any thread is idle for more than 1 minute, it may get terminated. So the pool can shrink if threads remain too much idle.
If all allocated threads are occupied while a new task comes in, then it creates a new thread, as offering a new task to a SynchronousQueue always fails when there is no one on the other end to accept it!
When should I use one or the other? Which strategy is better in terms of resource utilization?
Use it when you have a lot of predictable short-running tasks.

If you are not worried about an unbounded queue of Callable/Runnable tasks, you can use one of them. As suggested by bruno, I too prefer newFixedThreadPool to newCachedThreadPool over these two.
But ThreadPoolExecutor provides more flexible features compared to either newFixedThreadPool or newCachedThreadPool
ThreadPoolExecutor(int corePoolSize, int maximumPoolSize, long keepAliveTime,
TimeUnit unit, BlockingQueue<Runnable> workQueue, ThreadFactory threadFactory,
RejectedExecutionHandler handler)
Advantages:
You have full control of BlockingQueue size. It's not un-bounded, unlike the earlier two options. I won't get an out of memory error due to a huge pile-up of pending Callable/Runnable tasks when there is unexpected turbulence in the system.
You can implement custom Rejection handling policy OR use one of the policies:
In the default ThreadPoolExecutor.AbortPolicy, the handler throws a runtime RejectedExecutionException upon rejection.
In ThreadPoolExecutor.CallerRunsPolicy, the thread that invokes execute itself runs the task. This provides a simple feedback control mechanism that will slow down the rate that new tasks are submitted.
In ThreadPoolExecutor.DiscardPolicy, a task that cannot be executed is simply dropped.
In ThreadPoolExecutor.DiscardOldestPolicy, if the executor is not shut down, the task at the head of the work queue is dropped, and then execution is retried (which can fail again, causing this to be repeated.)
You can implement a custom Thread factory for the below use cases:
To set a more descriptive thread name
To set thread daemon status
To set thread priority

That’s right, Executors.newCachedThreadPool() isn't a great choice for server code that's servicing multiple clients and concurrent requests.
Why? There are basically two (related) problems with it:
It's unbounded, which means that you're opening the door for anyone to cripple your JVM by simply injecting more work into the service (DoS attack). Threads consume a non-negligible amount of memory and also increase memory consumption based on their work-in-progress, so it's quite easy to topple a server this way (unless you have other circuit-breakers in place).
The unbounded problem is exacerbated by the fact that the Executor is fronted by a SynchronousQueue which means there's a direct handoff between the task-giver and the thread pool. Each new task will create a new thread if all existing threads are busy. This is generally a bad strategy for server code. When the CPU gets saturated, existing tasks take longer to finish. Yet more tasks are being submitted and more threads created, so tasks take longer and longer to complete. When the CPU is saturated, more threads is definitely not what the server needs.
Here are my recommendations:
Use a fixed-size thread pool Executors.newFixedThreadPool or a ThreadPoolExecutor. with a set maximum number of threads;

You must use newCachedThreadPool only when you have short-lived asynchronous tasks as stated in Javadoc, if you submit tasks which takes longer time to process, you will end up creating too many threads. You may hit 100% CPU if you submit long running tasks at faster rate to newCachedThreadPool (http://rashcoder.com/be-careful-while-using-executors-newcachedthreadpool/).

I do some quick tests and have the following findings:
1) if using SynchronousQueue:
After the threads reach the maximum size, any new work will be rejected with the exception like below.
Exception in thread "main" java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.FutureTask#3fee733d rejected from java.util.concurrent.ThreadPoolExecutor#5acf9800[Running, pool size = 3, active threads = 3, queued tasks = 0, completed tasks = 0]
at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2047)
2) if using LinkedBlockingQueue:
The threads never increase from minimum size to maximum size, meaning the thread pool is fixed size as the minimum size.

Problems with pool size of Spring's ThreadPoolTaskExecutor

I'm doing some load tests agains my Spring application and now I'm a little bit confused about the configuration of the ThreadPoolTaskExecutor.
The documentation of the internally used ThreadPoolExecutor describes the corePoolSize as "the number of threads to keep in the pool, even if they are idle, [...]" and maximumPoolSize as "the maximum number of threads to allow in the pool".
That obviously means that the maximumPoolSize limits the number of thread in the pool. But instead the limit seems the be set by the corePoolSize. Actually I configured just the corePoolSize with 100 an let the maximumPoolSize unconfigured (that means the default value is used: Integer.MAX_VALUE = 2147483647).
When I run the load test I can see (by reviewing the logs), that the executed worker thread are numbered from worker-1 to worker-100. So in this case the thread pool size is limited by corePoolSize. Even if I set maximumPoolSize to 200 or 300, the result is exactly the same.
Why the value of maximumPoolSize has no affect in my case?
#Bean
public TaskExecutor taskExecutor() {
ThreadPoolTaskExecutor taskExecutor = new ThreadPoolTaskExecutor();
taskExecutor.setCorePoolSize(100);
taskExecutor.setThreadNamePrefix("worker-");
return taskExecutor;
}
SOLUTION
I've found the solution in the documentation: "If there are more than corePoolSize but less than maximumPoolSize threads running, a new thread will be created only if the queue is full". The default queue size is Integer.MAX_VALUE. If I limit the queue, everything works fine.

I have done some testing on ThreadPoolTaskExecutor and there is three things that you have to understand:
corePoolSize
queueCapacity
maxPoolSize
When you start the process there is no threads in the pool.
Each time a task comes one new executor thread will be created to handle this new load as long the corePoolSize is not reached.
When the corePoolSize is reached the next task will be shift to the queue and wait for a free executor thread.
If the load is too high and queueCapacity is full, the new executor threads will be created unless the maxPoolSize is reached. These additional threads will expire as soon as the queue is empty.
If the corePoolSize is exhausted, queueCapacity is full and maxPoolSize is also reached then the new submitteds tasks will be rejected and called will get an exception.
You have not mentioned the queueCapacity of your configuration so it might be set to highest integer number and thus maxPoolSize is never getting triggered. Try with small corePoolSize and queueCapacity and you will observe the desired result.

If you have 100 threads in a pool and you are executing CPU bound code on 4 physical CPU cores, most of your core threads are idle in the pool waiting to be re-used. That is probably why you don't see more than worker-100.
You didn't show us code you are executing in workers, therefore I assume it is not I/O bound. If it would be I/O bound code and 100 of your core threads would be occupied by waiting for blocking I/O operations to finish, ThreadPoolExecutor would need to create additional workers.
Try it with corePoolSize lower than number of cores on your machine to confirm. Another option is to put Thread.sleep(1000) into your worker code and observe how your workers count will be raising.
EDIT:
You suggested to use SimpleAsyncTaskExecutor in comment. Notice this section of Spring Framework docs:
SimpleAsyncTaskExecutor This implementation does not reuse any
threads, rather it starts up a new thread for each invocation.
However, it does support a concurrency limit which will block any
invocations that are over the limit until a slot has been freed up. If
you are looking for true pooling, see the discussions of
SimpleThreadPoolTaskExecutor and ThreadPoolTaskExecutor below.
So with SimpleAsyncTaskExecutor you don't have pooling at all and a lot of resources (CPU cycles included) are wasted on creation and deletion of Thread objects, which may be quite expensive operation.
So SimpleAsyncTaskExecutor executor type does more harm than good to your load testing. If you want to have more workers, use more machines. It's naive to use only one machine if you want to have accurate load testing.

What the difference between ExecutorService's execute and thread.run in running threads concurrently in Java?

I'm new to this concurrent programming in java and came up with following scenarios where I'm getting confusion which to use when.
Scenario 1: In the following code I was trying to run threads by calling .start() on GPSService class which is a Runnable implementation.
int clientNumber = 0;
ServerSocket listener = new ServerSocket(port);
while (true) {
new GPSService(listener.accept(), clientNumber++, serverUrl).start();
}
Scenario 2: In the following code I was trying to run threads by using ExecutorService class as shown
int clientNumber = 0;
ServerSocket listener = new ServerSocket(port);
while(true) {
ExecutorService executor = Executors.newSingleThreadExecutor();
executor.execute(new GPSService(listener.accept(), client++, serverUrl));
executor.shutdown();
while (!executor.awaitTermination(1, TimeUnit.SECONDS)) {
// Threads are still running
System.out.println("Thread is still running");
}
// All threads are completed
System.out.println("\nThread completed it's execution and terminated successfully\n");
}
My Questions are
Which is the best practice to invoke a thread in concurrent programming?
What will be result(troubles) I'll end up with when I use first or second?
Note: I've been facing an issue with the first scenario where the program is getting hanged after every few days. So, is that issue related/expected when I use first method.?
Any good/helpful answer will be appreciated :) Thank you

There are no big differences in the two scenario you posted, except from managing thread termination in Scenario2; you always create a new thread for each incoming request. If you want to use ThreadPool my advice is not to create one for every request but to create one for each server and reuse threads. Something like:
public class YourClass {
//in init method or constructor
ExecutorService executor = Executors....;// choose from newCachedThreadPool() or newFixedThreadPool(int nThreads) or some custom option
int clientNumber = 0;
ServerSocket listener = new ServerSocket(port);
while(true) {
executor.execute(new GPSService(listener.accept(), client++, serverUrl));
}
This will allow you to use a thread pool and to control how many threads to use for your server. If you want to use a Executor this is the preferred way to go.
With a server pool you need to decide how many threads there are in the pool; you have different choices but you can start or with a fixed number or threads or with a pool that tries to use a non busy thread and if all threads are busy it creates a new one (newCachedThreadPool()). The number of threads to allocate depends form many factors: the number of concurrents requests and it durations. The more your server side code takes time the more you need for additional thread. If your server side code is very faster there are very high chances that the pool can recycle threads already allocated (since the requests do not come all in the same exact instant).
Say for example that you have 10 request during a second and each request lasts 0.2 seconds; if the request arrive at 0, 0.1, 0.2, 0.3, 0.4, 0.5, .. part of the second (for example 23/06/2015 7:16:00:00, 23/06/2015 7:16:00:01, 23/06/2015 7:16:00:02) you need only three threads since the request coming at 0.3 can be performed by the thread that server the first request (the one at 0), and so on (the request at time 0.4 can reuse thread used for the request that came at 0.1). Ten requests managed by three threads.
I recommend you (if you did not it already) to read Java Concurrency in practice (Task Execution is chapter 6); which is an excellent book on how to build concurrent application in Java.

From oracle documentation from Executors
public static ExecutorService newCachedThreadPool()
Creates a thread pool that creates new threads as needed, but will reuse previously constructed threads when they are available. These pools will typically improve the performance of programs that execute many short-lived asynchronous tasks.
Calls to execute will reuse previously constructed threads if available. If no existing thread is available, a new thread will be created and added to the pool. Threads that have not been used for sixty seconds are terminated and removed from the cache.
Thus, a pool that remains idle for long enough will not consume any resources. Note that pools with similar properties but different details (for example, timeout parameters) may be created using ThreadPoolExecutor constructors.
public static ExecutorService newFixedThreadPool(int nThreads)
Creates a thread pool that reuses a fixed number of threads operating off a shared unbounded queue. At any point, at most nThreads threads will be active processing tasks. If additional tasks are submitted when all threads are active, they will wait in the queue until a thread is available.
If any thread terminates due to a failure during execution prior to shutdown, a new one will take its place if needed to execute subsequent tasks. The threads in the pool will exist until it is explicitly shutdown.
#Giovanni is saying that you don' have to provide number of threads to newCachedThreadPool unlike newFixedThreadPool(), where you have to pass maximum cap on number of threads in ThreadPool.
But between these two, newFixedThreadPool() is preferred. newCachedThread Pool may cause leak and you may reach maximum number of available threads due to unbounded nature. Some people consider it as an evil.
Have a look at related SE question:
Why is an ExecutorService created via newCachedThreadPool evil?

ThreadPoolExecutor policy

I'm trying to use a ThreadPoolExecutor to schedule tasks, but running into some problems with its policies. Here's its stated behavior:
If fewer than corePoolSize threads are running, the Executor always prefers adding a new thread rather than queuing.
If corePoolSize or more threads are running, the Executor always prefers queuing a request rather than adding a new thread.
If a request cannot be queued, a new thread is created unless this would exceed maximumPoolSize, in which case, the task will be rejected.
The behavior I want is this:
same as above
If more than corePoolSize but less than maximumPoolSize threads are running, prefers adding a new thread over queuing, and using an idle thread over adding a new thread.
same as above
Basically I don't want any tasks to be rejected; I want them to be queued in an unbounded queue. But I do want to have up to maximumPoolSize threads. If I use an unbounded queue, it never generates threads after it hits coreSize. If I use a bounded queue, it rejects tasks. Is there any way around this?
What I'm thinking about now is running the ThreadPoolExecutor on a SynchronousQueue, but not feeding tasks directly to it - instead feeding them to a separate unbounded LinkedBlockingQueue. Then another thread feeds from the LinkedBlockingQueue into the Executor, and if one gets rejected, it simply tries again until it is not rejected. This seems like a pain and a bit of a hack, though - is there a cleaner way to do this?

It probably isn't necessary to micro-manage the thread pool as being requested.
A cached thread pool will re-use idle threads while also allowing potentially unlimited concurrent threads. This of course could lead to runaway performance degrading from context switching overhead during bursty periods.
Executors.newCachedThreadPool();
A better option is to place a limit on the total number of threads while discarding the notion of ensuring idle threads are used first. The configuration changes would be:
corePoolSize = maximumPoolSize = N;
allowCoreThreadTimeOut(true);
setKeepAliveTime(aReasonableTimeDuration, TimeUnit.SECONDS);
Reasoning over this scenario, if the executor has less than corePoolSize threads, than it must not be very busy. If the system is not very busy, then there is little harm in spinning up a new thread. Doing this will cause your ThreadPoolExecutor to always create a new worker if it is under the maximum number of workers allowed. Only when the maximum number of workers are "running" will workers waiting idly for tasks be given tasks. If a worker waits aReasonableTimeDuration without a task, then it is allowed to terminate. Using reasonable limits for the pool size (after all, there are only so many CPUs) and a reasonably large timeout (to keep threads from needlessly terminating), the desired benefits will likely be seen.
The final option is hackish. Basically, the ThreadPoolExecutor internally uses BlockingQueue.offer to determine if the queue has capacity. A custom implementation of BlockingQueue could always reject the offer attempt. When the ThreadPoolExecutor fails to offer a task to the queue, it will try to make a new worker. If a new worker can not be created, a RejectedExecutionHandler would be called. At that point, a custom RejectedExecutionHandler could force a put into the custom BlockingQueue.
/** Hackish BlockingQueue Implementation tightly coupled to ThreadPoolexecutor implementation details. */
class ThreadPoolHackyBlockingQueue<T> implements BlockingQueue<T>, RejectedExecutionHandler {
BlockingQueue<T> delegate;
public boolean offer(T item) {
return false;
}
public void rejectedExecution(Runnable r, ThreadPoolExecutor executor) {
delegate.put(r);
}
//.... delegate methods
}

Your use case is common, completely legit and unfortunately more difficult than one would expect. For background info you can read this discussion and find a pointer to a solution (also mentioned in the thread) here. Shay's solution works fine.
Generally I'd be a bit wary of unbounded queues; it's usually better to have explicit incoming flow control that degrades gracefully and regulates the ratio of current/remaining work to not overwhelm either producer or consumer.

Just set corePoolsize = maximumPoolSize and use an unbounded queue?
In your list of points, 1 excludes 2, since corePoolSize will always be less or equal than maximumPoolSize.
Edit
There is still something incompatible between what you want and what TPE will offer you.
If you have an unbounded queue, maximumPoolSize is ignored so, as you observed, no more than corePoolSize threads will ever be created and used.
So, again, if you take corePoolsize = maximumPoolSize with an unbounded queue, you have what you want, no?

Would you be looking for something more like a cached thread pool?
http://download.oracle.com/javase/1.5.0/docs/api/java/util/concurrent/Executors.html#newCachedThreadPool()

Develop Reference

Java is a programming language and computing platform first released by Sun Microsystems in 1995.