Goal
I want to understand how to handle two thread pools simultaneously in Java.
Consider a client-server system in which clients send blocking I/O requests to the server (for example, a file server). There is a single ThreadPoolExecutor instance running on the server. Some types of client requests take much longer to process than others; call these high I/O intensity requests. These high I/O intensity requests hog all the threads and bring down the entire application.
I want to solve this problem with two separate ThreadPoolExecutor instances.
I create two ThreadPoolExecutor instances, one for high I/O intensity requests and another for low I/O intensity requests. Through an offline workload-profiling procedure I build a lookup table that classifies requests; when a request arrives, I first look up its class in the table and then hand it over to the corresponding thread pool.
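For illustration, here is a minimal sketch of that dispatch idea; the class names, pool sizes, and lookup-table contents below are hypothetical placeholders:

import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class RequestDispatcher {
    private final ExecutorService highIoPool = Executors.newFixedThreadPool(4);
    private final ExecutorService lowIoPool  = Executors.newFixedThreadPool(8);
    // Populated by the offline workload-profiling step.
    private final Map<String, Boolean> isHighIntensity;

    public RequestDispatcher(Map<String, Boolean> lookupTable) {
        this.isHighIntensity = lookupTable;
    }

    public void dispatch(String requestType, Runnable handler) {
        // Unknown request types default to the low-intensity pool here.
        if (isHighIntensity.getOrDefault(requestType, false)) {
            highIoPool.submit(handler);
        } else {
            lowIoPool.submit(handler);
        }
    }
}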
The real problem:
How do I share processors equally between these two thread pools? Will this be handled by the JVM itself, or do I have to handle it myself at the application level?
Should I make use of a cluster and run an instance of ThreadPoolExecutor on another machine to handle the high I/O intensity requests?
Kindly give me proper design suggestions.
Generally it is up to the operating system's CPU scheduler to distribute time between threads. A thread pool has nothing to do with thread scheduling; it manages thread reuse and synchronization between its threads.
The only advantage of creating 2 pools instead of 1 is that each pool can use a ThreadFactory different from the standard Executors.defaultThreadFactory(). You can give a different priority to your demanding clients; priority is information the OS scheduler uses when dividing CPU time between threads. But those clients would suffer even more if you make them less important, or vice versa ;)
Maybe you could instead do something like tuning a thread's priority when someone uses too many resources.
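As a sketch of that ThreadFactory idea, assuming arbitrary pool size and priority values (priorities are only hints to the OS scheduler, not guarantees):

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;

public class PriorityPools {
    public static void main(String[] args) {
        // Give the pool for latency-sensitive (low-intensity) requests
        // a higher thread priority than the default NORM_PRIORITY.
        ThreadFactory lowIoFactory = runnable -> {
            Thread t = new Thread(runnable, "low-io-worker");
            t.setPriority(Thread.NORM_PRIORITY + 2); // still <= MAX_PRIORITY
            return t;
        };
        ExecutorService lowIoPool = Executors.newFixedThreadPool(8, lowIoFactory);
        lowIoPool.submit(() -> System.out.println(
                Thread.currentThread().getName() + " priority="
                + Thread.currentThread().getPriority()));
        lowIoPool.shutdown();
    }
}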
Here is a reference on how Microsoft uses priorities to tune threads' CPU consumption.
No, the JVM cannot balance the ThreadPoolExecutors for you. You may implement an external watcher thread that monitors your threads and applies the appropriate policy to them (priority, exception handling, and so on).
Take a look at this example:
http://tutorials.jenkov.com/java-multithreaded-servers/thread-pooled-server.html
Related
I work with a Java webapp that runs with Apache Tomcat. The max threads for the Tomcat thread pool is 800, and the minSpareThreads is 25. While it runs, it usually sits at around 400 running threads at a given time.
Let's say I have a computationally expensive, non-blocking task to do in my Tomcat app, and I use the ForkJoinPool.commonPool to solve the task more efficiently.
Because my Apache Tomcat app already has a large thread pool in it, does the Tomcat thread pool reduce the performance gains I would get from using a ForkJoinPool (or any thread pool, for that matter) in my Tomcat app? Could the performance costs of running the Tomcat thread pool alongside a ForkJoinPool negate the performance gains of using a ForkJoinPool, because now there are going to be way more threads than there are CPUs?
Is adding any sort of additional thread pool to an Apache Tomcat app bad for the performance of the entire application?
It's hard to give a general answer to this question, because it depends so much on the specific workload. There's no substitute for testing and profiling your own application. Here are some things to think about, however.
Running a CPU-bound task in a separate thread pool isn't guaranteed to give you any performance benefit at all. There are two main reasons it might be beneficial:
When one thread submits a task to an executor to run on a separate thread, it can then continue on to do other work concurrently.
If the task can be broken down into multiple sub-tasks that are each run on separate threads, you can take advantage of parallel processing on multiple CPU cores to get it done faster.
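As a sketch of the first point, a request thread can submit the task and keep working until the result is actually needed; expensiveComputation() and doOtherWork() below are hypothetical stand-ins:

import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ForkJoinPool;

public class OffloadExample {
    public static void main(String[] args) {
        // Submit the expensive task and keep working on this thread.
        CompletableFuture<Long> result = CompletableFuture.supplyAsync(
                OffloadExample::expensiveComputation, ForkJoinPool.commonPool());

        doOtherWork(); // runs concurrently with the computation

        // Block only when the result is actually needed.
        System.out.println("result = " + result.join());
    }

    static long expensiveComputation() {
        long sum = 0;
        for (long i = 0; i < 100_000_000L; i++) sum += i;
        return sum;
    }

    static void doOtherWork() {
        System.out.println("doing other work while the task runs...");
    }
}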
The costs of having more threads are:
Memory allocated to each thread.
Latency caused by context switching when the OS reallocates CPU time from one thread to another.
The defaults for the Tomcat request thread pool size are based on the common situation where threads spend a lot of time blocking on I/O: reading the request over the network, making database queries and updates, and writing the response back to the client. This means these threads can't make use of all available CPU time, so it is beneficial to have far more threads than CPU cores: while some threads are blocked, others that need CPU time can run.
So, a big question is what is invoking these tasks: is it a request thread? If so, what does that request thread do while the task is in progress? Is it making blocking I/O calls? Is it just waiting for the task to complete?
If most of your requests are invoking one of these CPU-intensive tasks and then blocked waiting for it to complete, and if these tasks are not split up to run in parallel on multiple cores, then you might not get any benefit from running tasks in a separate thread pool. It might be better to avoid the overhead of context switching and run these tasks on the request thread. If most of the requests handled by your service run this type of task, then you might want to reduce the number of request threads in the Tomcat thread pool, because your actual concurrency will be limited by the available CPU time. You could end up with a large number of waiting threads and high response latency. These requests might then time out on the client side, wasting a lot of server resources on requests that ultimately fail.
When implementing a server, we can delegate each client request to one thread. I read that the problem with this approach is that each thread will have its own stack, and this would be very "expensive". An alternative approach is to make the server single-threaded and handle all client requests on this one server thread, with I/O issued as non-blocking requests. My doubt is: if one server thread is running multiple client requests simultaneously, won't the server code need an instruction pointer, a set of local variables, and a function call stack for each client request, and won't this again be "expensive" as before? How are we really saving?
I read that the problem with this approach is that each thread will have its own stack, and this would be very "expensive".
Depends on how tight your system resources are. The typical JVM stack space allocated per thread defaults to 1 MB on many current architectures, although this can be tuned with the -Xss command-line argument. How much system memory your JVM has at its disposal and how many threads you need determine whether you want to pay the high price of writing the server single-threaded.
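For illustration, a stack size can also be requested per thread through the four-argument Thread constructor; the JVM treats the value only as a hint and may round or ignore it depending on the platform, and the 256 KiB figure below is arbitrary:

public class SmallStackThread {
    public static void main(String[] args) {
        // Request a 256 KiB stack for this thread; the JVM may round
        // or ignore the value depending on the platform.
        Thread worker = new Thread(
                null,                           // default thread group
                () -> System.out.println("hi"), // the task
                "small-stack-worker",
                256 * 1024);                    // requested stack size in bytes
        worker.start();
    }
}

The process-wide equivalent is the -Xss argument mentioned above, e.g. java -Xss256k MyServer.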
My doubt is: if one server thread is running multiple client requests simultaneously, won't the server code need an instruction pointer, a set of local variables, and a function call stack for each client request, and won't this again be "expensive" as before?
It will certainly need to store per-request context information on the heap, but I suspect it would take a lot less than 1 MB worth of information to hold the variables necessary to service an incoming connection.
Like most things, what we are really competing against when we look to optimize a program, whether to reduce memory or other system resource use, is code complexity. It is harder to get right and harder to maintain.
Although threaded programs can be highly complex, isolating a request handler in a single thread can make the code extremely simple unless it needs to coordinate with other requests somehow. Writing a high performance single threaded server would be much more complex than the threaded version in most cases. Of course, there would also be limits on the performance given that you can't make use of multiple processors.
Using non-blocking I/O, a single I/O thread can handle many connections. The I/O thread will get a notification when:
a client wants to connect
the write buffer of a connection's socket has space again, after being full in the previous round
the read buffer of a connection's socket has data available for reading
So the thread uses event multiplexing to serve the connections concurrently, using a selector. A thread waits for a set of selection keys from the selector; each selection key carries the state of the event you registered for, and you can attach user data like a 'session' to the selection key.
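A minimal single-threaded sketch of such a selector loop (the port number and buffer size below are arbitrary):

import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.*;
import java.util.Iterator;

public class SelectorServer {
    public static void main(String[] args) throws Exception {
        Selector selector = Selector.open();
        ServerSocketChannel server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress(9000));
        server.configureBlocking(false);
        server.register(selector, SelectionKey.OP_ACCEPT);

        while (true) {
            selector.select(); // blocks until at least one event is ready
            Iterator<SelectionKey> it = selector.selectedKeys().iterator();
            while (it.hasNext()) {
                SelectionKey key = it.next();
                it.remove();
                if (key.isAcceptable()) {          // a client wants to connect
                    SocketChannel client = server.accept();
                    client.configureBlocking(false);
                    // Attach per-connection "session" state to the key.
                    client.register(selector, SelectionKey.OP_READ,
                            ByteBuffer.allocate(1024));
                } else if (key.isReadable()) {     // data available for reading
                    SocketChannel client = (SocketChannel) key.channel();
                    ByteBuffer buf = (ByteBuffer) key.attachment();
                    if (client.read(buf) == -1) {  // client closed the connection
                        key.cancel();
                        client.close();
                    }
                }
            }
        }
    }
}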
A very typical design pattern used here is the reactor pattern.
But often you want to prevent blocking the I/O thread with longer-running requests, so you offload the work to a different pool of threads. Then the reactor changes into the proactor pattern.
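A small sketch of that offload; onRequestDecoded is a hypothetical callback invoked by the I/O thread once a full request has been read:

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class Offload {
    private final ExecutorService workers = Executors.newFixedThreadPool(
            Runtime.getRuntime().availableProcessors());

    // Called from the I/O thread; hand the work off and return to
    // select() immediately, never blocking the I/O thread.
    void onRequestDecoded(Runnable request) {
        workers.submit(request);
    }
}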
And often you want to scale the number of I/O threads. So you can have a bunch of I/O threads in parallel.
But the total number of threads in your application should remain limited.
It all depends on what you want. Above are techniques I frequently used while working for Hazelcast.
I would not start to write all this logic from scratch. If you want to make use of networking, I would have a look at Netty. It takes care of most of the heavy lifting and has all kinds of optimizations built in.
I'm not 100% sure that a thread that doesn't write to its stack will actually consume 1 MB of physical memory. In Linux the (shared) zero page is used for the allocation, so no actual page frame (physical memory) is allocated unless the thread's stack is actually written to; that write triggers a copy-on-write which performs the actual allocation of a page frame. Apart from saving memory, this also avoids wasting memory bandwidth on zeroing out the stack. Memory consumption of a thread is one thing; context switching is another problem: if you have many more threads than cores, context switching can become a real performance problem.
I am trying to understand the core principles of non-blocking programming (and frameworks like Project Reactor). The main idea is to have a thread pool with a fixed number of threads (executors) and tasks which are executed there. We should not have any blocked threads. In "user code" we just submit something to execute and leave a callback (what to do with the result). Our "user" thread is not blocked, right? But what if my task depends on some JDBC query? My task will issue this query and then be blocked waiting for the result, right? So, that thread is blocked.
But we avoid thread creation (which is expensive). Is that the core benefit of this style?
If my thread pool consists of 2 threads and both are blocked waiting for something, other tasks will not be executed, right? How do I avoid that? Create more than 2 threads?
Threads are relatively costly system resources. For example, each thread needs memory for the call stack. How much this is depends on the operating system, but typically it's something like 1 or 2 MB. This means it's not a good idea to start thousands of threads - you'd waste 1 or 2 GB memory just on the call stacks of 1000 threads.
So, to do things more efficiently you want to limit the number of threads, for example using a thread pool to handle work. The thread pool makes it possible to manage the number of threads that are being used.
However, imagine that you'd have a thread pool with 10 threads, and then 10 requests come in. Each of your threads will be reserved to handle a request. While they are busy, you can't handle request #11 because there is no thread free. When you are using blocking I/O, then, even though all your 10 threads are doing nothing (waiting for I/O to complete), request #11 cannot be handled...
When you use non-blocking I/O, threads never need to wait for I/O - so when the handling of request #3 is suspended because it needs the result of an I/O operation, the thread that was handling it can temporarily switch to handling other requests.
So, with non-blocking I/O, you never have waiting threads and you are using system resources more efficiently.
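As a concrete sketch of that difference, using the JDK 11+ HttpClient (the URL below is a placeholder): sendAsync returns immediately, and the callback runs when the response arrives, so no pool thread sits blocked during the I/O.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.concurrent.CompletableFuture;

public class NonBlockingCall {
    public static void main(String[] args) {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(
                URI.create("https://example.com/")).build();

        // Returns immediately; the callback fires when the response arrives.
        CompletableFuture<Void> done = client
                .sendAsync(request, HttpResponse.BodyHandlers.ofString())
                .thenAccept(resp -> System.out.println("status " + resp.statusCode()));

        System.out.println("request sent, thread is free for other work");
        done.join(); // demo only: keep main alive until the response arrives
    }
}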
This will only work if you are using non-blocking I/O from the front to the back of your system. If at the back-end you are using JDBC, which is a blocking API, then you'll lose the full benefit of non-blocking I/O.
Therefore, if you have a database at the back-end, this works best if you have a DB which supports non-blocking I/O. Some NoSQL databases like MongoDB support this, and for some relational databases there are special drivers / APIs available that support this. You won't be using JDBC in that case, because JDBC is an inherently blocking API.
Oracle is working on a new API for relational databases, tentatively called ADBA, which will allow you to do non-blocking / async I/O with relational databases, but it's not ready yet.
Project Reactor is an implementation of the Reactive Streams specification. An overview of the specification can be found in the Reactive Manifesto. It's not just creating a set of threads and letting them do their jobs; it's the framework or the runtime (in this case Project Reactor) that organizes your code in such a way that it presumably behaves as non-blocking. Also, the whole system has to be implemented in this fashion, otherwise you won't benefit from Reactive Streams.
If my thread pool consists of 2 threads and both are blocked waiting for something, other tasks will not be executed, right? How do I avoid that? Create more than 2 threads?
The answer to this is yes and no. The framework may or may not create threads. Since the code is interleaved among the threads, and since non-blocking systems are event-driven down to the low-level operations (e.g., libuv I/O), it's not necessary for a thread to wait for the completion of an I/O operation. Meanwhile, the thread can be executing something meaningful. When the task completes, a notification is delivered and the dependent code can be executed by any available thread. The goal of such a system is to utilize the CPU to the fullest with limited resources (threads).
Taken from http://www.reactive-streams.org.
The main goal of Reactive Streams is to govern the exchange of stream data across an asynchronous boundary—think passing elements on to another thread or thread-pool—while ensuring that the receiving side is not forced to buffer arbitrary amounts of data. In other words, back pressure is an integral part of this model in order to allow the queues which mediate between threads to be bounded. The benefits of asynchronous processing would be negated if the communication of back pressure were synchronous (see also the Reactive Manifesto), therefore care has to be taken to mandate fully non-blocking and asynchronous behavior of all aspects of a Reactive Streams implementation.
It's the Reactor framework that enforces this and helps you build a completely non-blocking system from the ground up.
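A small Reactor sketch of that idea (it assumes the reactor-core dependency; blockingQuery() is a hypothetical stand-in for a blocking JDBC call):

import reactor.core.publisher.Mono;
import reactor.core.scheduler.Schedulers;

public class ReactorExample {
    public static void main(String[] args) throws InterruptedException {
        Mono.fromCallable(ReactorExample::blockingQuery)
            // Isolate blocking work on a scheduler meant for it, so
            // event-loop threads are never blocked.
            .subscribeOn(Schedulers.boundedElastic())
            .subscribe(result -> System.out.println("got: " + result));

        System.out.println("subscribed, main thread not blocked");
        Thread.sleep(500); // demo only: give the async pipeline time to finish
    }

    static String blockingQuery() throws InterruptedException {
        Thread.sleep(200); // simulate a blocking JDBC query
        return "row";
    }
}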
When a single user is accessing an application, multiple threads can be used, and they can run in parallel if multiple cores are present. If only one processor exists, then threads will run one after another.
When multiple users are accessing an application, how are the threads handled?
I can speak from a Java perspective, so your question becomes: "when multiple users are accessing an application, how are the threads handled?"
The answer is that it all depends on how you programmed it. If you are using a web/app container, it provides a thread pool mechanism where you can have more than one thread to serve user requests. Each user initiates one request, which in turn is handled by one thread, so if there are 10 simultaneous users there will be 10 threads handling the 10 requests simultaneously. Nowadays we also have non-blocking I/O, where request processing can be offloaded to other threads, allowing fewer than 10 threads to handle 10 users.
Now if you want to know exactly how thread scheduling is done across CPU cores, it again depends on the OS. One thing is common, though: the thread is the basic unit of allocation to a CPU. Start with green threads, and you will understand it better.
The incorrect assumption is
If only one processor exists, then threads will run one after another.
How threads are being executed is up to the runtime environment.
With Java, the specification defines that certain parts of your code will not cause synchronisation with other threads and thus will not cause (potential) rescheduling of threads.
In general, the OS is in charge of scheduling units of execution. In former days such entities were mostly processes; now there may be processes and threads (some OSes schedule only at the thread level). For simplicity, let's assume the OS deals with threads only.
The OS may allow a thread to run until it reaches a point where it can't continue, e.g. waiting for an I/O operation to complete. This is good for that thread, as it can use the CPU to the maximum, but bad for all the other threads that want to get some CPU cycles of their own. (In general, there will always be more threads than available CPUs, so the problem is independent of the number of CPUs.) To improve interactive behaviour, an OS might use time slices that allow a thread to run for a certain time. After the time slice expires, the thread is forcibly removed from the CPU and the OS selects a new thread to run (which could even be the one just interrupted).
This allows each thread to make some progress (at the cost of some scheduling overhead). This way, even on a single-processor system, threads may seem to run in parallel.
So for the OS it does not matter at all whether a set of threads results from a single user (or even from a single call to a web application) or has been created by a number of users and web calls.
You need to understand the thread scheduler.
In fact, on a single core, the CPU divides its time among multiple threads (the execution is not exactly sequential). On multiple cores, two (or more) threads can run simultaneously.
Read the thread article on Wikipedia.
I recommend Tanenbaum's OS book.
Tomcat uses Java's multithreading support to serve HTTP requests.
To serve an HTTP request, Tomcat takes a thread from the thread pool. The pool is maintained for efficiency, as thread creation is expensive.
Refer to the Java concurrency documentation to read more: https://docs.oracle.com/javase/tutorial/essential/concurrency/
Please see the Tomcat thread pool configuration for more information: https://tomcat.apache.org/tomcat-8.0-doc/config/executor.html
There are two points to answering your question: thread scheduling and thread communication.
Thread scheduling implementation is specific to the operating system. The programmer has no control in this regard, except for setting a thread's priority.
Thread communication is driven by the program/programmer.
Assume that you have multiple processors and multiple threads. Multiple threads can run in parallel on multiple processors. But how the data is shared and accessed is specific to the program.
You can run your threads in parallel, or you can wait for threads to complete execution before proceeding further (join, invokeAll, CountDownLatch, etc.). The programmer has full control over the thread life cycle.
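For example, a minimal sketch of "wait for all tasks before proceeding" using invokeAll, which blocks until every submitted task has finished (the tasks here are trivial placeholders):

import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class WaitForAll {
    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(3);
        List<Callable<Integer>> tasks = List.of(() -> 1, () -> 2, () -> 3);

        // Blocks until all three tasks are done.
        List<Future<Integer>> results = pool.invokeAll(tasks);
        for (Future<Integer> f : results) {
            System.out.println(f.get());
        }
        pool.shutdown();
    }
}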
There is no difference whether you have one user or several. Threads work according to the logic of your program. The processor runs every thread for a certain amount of time and then moves on to the next one. The time slice is very short, so if there are not too many threads (or other processes) working, the user won't notice it. If the processor uses a 20 ms slice and there are 1000 threads, then every thread will have to wait about 20 seconds for its next turn. Fortunately, current processors, even with just one core, often have two logical processing units (hyper-threading) which can be used for parallel threads.
In "classic" implementations, all web requests arriving to the same port are first serviced by the same single thread. However as soon as request is received (Socket.accept returns), almost all servers would immediately fork or reuse another thread to complete the request. Some specialized single user servers and also some advanced next generation servers like Netty may not.
The simple (and common) approach would be to pick or reuse a new thread for the whole duration of the single web request (GET, POST, etc). After the request has been served, the thread likely will be reused for another request that may belong to the same or different user.
However it is fully possible to write the custom code for the server that binds and then reuses particular thread to the web request of the logged in user, or IP address. This may be difficult to scale. I think standard simple servers like Tomcat typically do not do this.
I just want to ask a rookie question: how do I set an appropriate number of threads for my thread pool on the server side?
Are there any general rules or formulas I can follow?
What are the issues I have to consider? For example, the number of network requests per second, the number of CPU cores, the CPU and memory usage rate in my application, the hardware I use on my server, etc.
Well, basically the size of the pool should be set to the maximum number of commands that can execute concurrently on your configuration: if you have 4 cores (without hyper-threading), then you can set it to 4; with hyper-threading, you can set it to 8.
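Rather than hard-coding that number, you can query it at runtime; availableProcessors() reports logical processors, so with hyper-threading a 4-core machine already returns 8:

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class PoolSizing {
    public static void main(String[] args) {
        // Logical processor count: 8 on a 4-core hyper-threaded machine.
        int cores = Runtime.getRuntime().availableProcessors();
        ExecutorService cpuBoundPool = Executors.newFixedThreadPool(cores);
        System.out.println("pool size = " + cores);
        cpuBoundPool.shutdown();
    }
}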
There are, however, questions like: what is the expected behaviour of the application if it wants to get a thread from the pool but the pool is empty (say you had 8 threads in the pool, every single one of them is working on a video-encoding job for the next 10 minutes, and you get a new request in your manager thread)?
You should consider, however, that it is NOT guaranteed that all your threads will be running at every moment, even if your application handles threading perfectly, as other applications are running on your computer at the same time (your OS, for example), and they need CPU as well.
On the other hand, it is also a big question what the threads in your pool actually do. You provided no information about what this thread pool is used for: is it used in your own app, or do you want to configure an open-source/commercial app, etc.? Creating and managing threads has serious costs (scheduling, context switching, etc.), which may only be worth it if your threads stay alive long enough (i.e., you can provide them enough work).
For further details, a good starting point on this subject could be Google, with the following keywords: "scheduling, concurrency, threads, java executor service, hyperthreading".