I am trying to understand the core principles of non-blocking programming (and frameworks like project reactor). The main idea is to have "thread pool" with determined number of threads (executors) and tasks which are executed there. We should not have any blocked threads. In "user code" we just run something to execute and leave callback (what to do with the result). Out "user" thread is not blocked, right. But what if my task depends on some jdbc query. My task will request this query and then will be blocked waiting for the result, right? So, this thread is blocked.
But we avoid thread creating (which is expensive). Is it the core benefit of this style?
If my thread pool consists of 2 executors and both are blocked waiting for something, other tasks will not be executed, right? How to avoid it? Create more than 2 threads?
Threads are relatively costly system resources. For example, each thread needs memory for the call stack. How much this is depends on the operating system, but typically it's something like 1 or 2 MB. This means it's not a good idea to start thousands of threads - you'd waste 1 or 2 GB memory just on the call stacks of 1000 threads.
So, to do things more efficiently you want to limit the number of threads, for example using a thread pool to handle work. The thread pool makes it possible to manage the number of threads that are being used.
However, imagine that you'd have a thread pool with 10 threads, and then 10 requests come in. Each of your threads will be reserved to handle a request. While they are busy, you can't handle request #11 because there is no thread free. When you are using blocking I/O, then, even though all your 10 threads are doing nothing (waiting for I/O to complete), request #11 cannot be handled...
When you use non-blocking I/O, threads will never need to wait for I/O - so when the handling request #3 is suspended because it needs the result of an I/O operation, the thread that was handling it can temporarily switch to handling other requests.
So, with non-blocking I/O, you never have waiting threads and you are using system resources more efficiently.
This will only work if you are using non-blocking I/O from the front to the back of your system. If at the back-end you are using JDBC, which is a blocking API, then you'll loose the full benefit of non-blocking I/O.
Therefore, if you have a database at the back-end, this works best if you have a DB which supports non-blocking I/O. Some NoSQL databases like MongoDB support this, and for some relational databases there are special drivers / APIs available that support this. You won't be using JDBC in that case, because JDBC is an inherently blocking API.
Oracle is working on a new API for relational databases tentatively called
ADBA which will allow you to do non-blocking / async I/O with relational databases but it's not ready yet.
Project Reactor is an implementation of Reactive Streams specification. The specification overview can be found at ReactiveManifest. It's not just creating a set of threads and letting them do their jobs, It's the framework or the runtime (in this case ProjectReactor) that will organize your code in such a way that it'll presumably behave as nonblocking. Also, the whole system implementation has to be in this fashion otherwise you won't be benefited from the reactive streams.
If my thread pool consists of 2 executors and both are blocked waiting for something, other tasks will not be executed, right? How to avoid it? Create more than 2 threads?
The answer to this will be yes, and no. The framework may are may not create threads. Since the code will be interleaved among the threads, Since the non-blocking system are event-driven including the low-level operations (ex, libuv I/O), It's not necessary for a thread to wait for the completion of an I/O operation. Meanwhile, the thread will be executing something meaningful. The completion of the task will be notified and the dependent code can be executed by any of the available thread. The goal of such a system is to utilize the CPU to the fullest with limited resources(threads).
Taken from http://www.reactive-streams.org.
The main goal of Reactive Streams is to govern the exchange of stream data across an asynchronous boundary—think passing elements on to another thread or thread-pool—while ensuring that the receiving side is not forced to buffer arbitrary amounts of data. In other words, back pressure is an integral part of this model in order to allow the queues which mediate between threads to be bounded. The benefits of asynchronous processing would be negated if the communication of back pressure were synchronous (see also the Reactive Manifesto), therefore care has to be taken to mandate fully non-blocking and asynchronous behavior of all aspects of a Reactive Streams implementation.
It's the Reactor framework that enforces and help you in building a completely non-blocking system from the ground up.
Related
I have been thinking about why JDBC is only blocking operation and why I can't set some listener to the hypothetical event handler onResultSetArrived(ResultSet rs). Why I have to block single one thread per each JDBC query.
After a while I've dive into Java Sockets (I suppose JDBC is build on top of them) and realised that there also isn't any event handling. Only option to provide non-blocking read is through the available() method but this is very inefficient as it has to be checked periodically in the loop.
As far as I'm aware, interruption is fundamental thing in PC. It goes down from the hardware up to the operating system. In the Java it can be implemented into event driven approach in read value from Socket.
Now, my question is am I missing something and there exists some workaround or current architecture in Java really is one thread per one blocking operation? And if yes isn't it inefficient?
In Java, you can have many threads. A thread is doing its stuff until it is blocked somewhere (typically, on a mutex or a I/O operation). Of course, this does not block other threads.
The fundamental scenario of multithreaded applications is that you use multiple threads when waiting for a blocked thread would introduce too much waiting. Definition of "too much" here depends entirely on you, but in general, this is how you achieve beter performance through better utilization of resources.
There are some limitations in how threads in Java work, however. Most, if not all of them are when the thread is blocked somewhere "outside" of Java such as in OS call or external (native) library. Theoretically, if native code blocks a thread, Java can not do anything about it. Normally, this should not be a problem unless the native code has a bug.
So in the case of a blocking JDBC response, you would create a new thread which would do other work while first thread is waiting for database to complete. Alternatively, you could make a thread just for doing JDBC. You could make it exactly like you want (with listeners etc.) except for limitations imposed by OS. So it's possible, but it's probably not provided out-of-the-box by JDBC drivers. There is a lot of infrastructure already in core Java which you might find useful (thread pools, workers, synchronized collections). But as with any multithreading, you need to be very careful with accessing data from different threads simultaneously.
Since Java 7, there is also support for non-blocking I/O (NIO). This is almost exactly what you are describing. I/O is offloaded to OS, so your operations return immediately and you get a callback when the operation is finished. However, not all libraries support NIO. For my work, I have never had a reason to use it, because I could always implement the same stuff with my threads at least as good.
If the question is whether the "current architecture in Java really is one thread per one blocking operation" and by "blocking operation" you mean "database operation" then the answer is no. Most database drivers available for Java currently are jdbc-based and do work that way. But there are usable alternatives (https://spring.io/blog/2016/11/28/going-reactive-with-spring-data) and more on the way (
https://blogs.oracle.com/java/jdbc-next:-a-new-asynchronous-api-for-connecting-to-a-database , https://dzone.com/articles/spring-5-webflux-and-jdbc-to-block-or-not-to-block). For how this works see How is ReactiveMongo implemented so that it is considered non-blocking?
For jdbc there are also ways to wrap the jdbc calls (Wrapping blocking I/O in project reactor , Spring webflux and reading from database ) and projects pursuing this approach (https://dzone.com/articles/myth-asynchronous-jdbc)
What I understood from Vert.x documentation (and a little bit of coding in it) is that Vert.x is single threaded and executes events in the event pool. It doesn't wait for I/O or any network operation(s) rather than giving time to another event (which was not before in any Java multi-threaded framework).
But I couldn't understand following:
How single thread is better than multi-threaded? What if there are millions of incoming HTTP requests? Won't it be slower than other multi-threaded frameworks?
Verticles depend on CPU cores. As many CPU cores you have, you can have that many verticles running in parallel. How come a language that works on a virtual machine can make use of CPU as needed? As far as I know, the Java VM (JVM) is an application that uses just another OS process for (here my understanding is less about OS and JVM hence my question might be naive).
If a single threaded, non-blocking concept is so effective then why can't we have the same non-blocking concept in a multi-threaded environemnt? Won't it be faster? Or again, is it because CPU can execute one thread at a time?
What I understood from Vert.x documentation (and a little bit of coding in it) is that Vert.x is single threaded and executes events in the event pool.
It is event-driven, callback-based. It isn't single-threaded:
Instead of a single event loop, each Vertx instance maintains several event loops. By default we choose the number based on the number of available cores on the machine, but this can be overridden.
It doesn't wait for I/O or any network operation(s)
It uses non-blocking or asynchronous I/O, it isn't clear which. Use of the Reactor pattern suggests non-blocking, but it may not be.
rather than giving time to another event (which was not before in any Java multi-threaded framework).
This is meaningless.
How single thread is better than multi-threaded?
It isn't.
What if there are millions of incoming HTTP requests? Won't it be slower than other multi-threaded frameworks?
Yes.
Verticles depend on CPU cores. As many CPU cores you have, you can have that many verticles running in parallel. How come a language that works on a virtual machine can make use of CPU as needed? As far as I know, the Java VM (JVM) is an application that uses just another OS process for (here my understanding is less about OS and JVM hence my question might be naive).
It uses a thread per core, as per the quotation above, or whatever you choose by overriding that.
If a single threaded, non-blocking concept is so effective then why can't we have the same non-blocking concept in a multi-threaded environemnt?
You can.
Won't it be faster?
Yes.
Or again, is it because CPU can execute one thread at a time?
A multi-core CPU can execute more than one thread at a time. I don't know what 'it' in 'is it because' refers to.
First of all, Vertx isn't single threaded by any means. It just doesn't spawn more threads that it needs.
Second, and this is not related to Vertx at all, JVM maps threads to native OS threads.
Third, we can have non-blocking behavior in multithreaded environment. It's not one thread per CPU, but one thread per core.
But then the question is: "what are those threads doing?". Because usually, to be useful, they need other resources. Network, DB, filesystem, memory. And here it becomes tricky. When you're single threaded, you don't have race conditions. The only one accessing the memory at any point of time is you. But if you're multi threaded, you need to concern yourself with mutexes, or any other way to keep you data consistent.
Q:
How single thread is better than multi-threaded? What if there are millions of incoming HTTP requests? Won't it be slower than other multi-threaded frameworks?
A:
Vert.x isn't a single threaded framework, it does make sure that a "verticle" which is something you deploy within you application and register with vert.x is mostly single threaded.
The reason for this is that concurrency with multiple threads over complicates concurrency with locks synchronisation and other concept that need to be taken care of with multi threaded communication.
While verticles are single threaded the do use something called an event loop which is the true power behind this paradigm called the reactor pattern or multi reactor pattern in Vert.x's case. Multiple verticles can be registered within one application, communication between these verticles run through an eventbus which empowers verticles to use an event based transfer protocol internally but this can also be distributed using some other technology to manage the clustering.
event loops handle events coming in on one thread but everything is async so computation gets handled by the loop and when it's done a signal notifies that a result can be used.
So all computation is either callback based or uses something like Reactive.X / fibers / coroutines / channels and the lot.
Due to the simpler communication model for concurrency and other nice features of Vert.x it can actually be faster than a lot of the Blocking and pure multi threaded models out there.
the numbers
Q:
If a single threaded, non-blocking concept is so effective then why can't we have the same non-blocking concept in a multi-threaded environemnt? Won't it be faster? Or again, is it because CPU can execute one thread at a time?
A:
Like a said with the first question it's not really single threaded. Actually when you know something is blocking you'll have to register computation with a method called executeBlocking which wil make it run multithreaded on an ExecutorService managed by Vert.x
The reason why Vert.x's model is mostly faster is also here because event loops make better use of cpu computation features and constraints. This is mostly powered by the Netty project.
the overhead of multi threading with it's locks and syncs imposes to much strain to outdo Vert.x with it's multi reactor pattern.
This is going to be the most basic and even may be stupid question here. When we talk about using multi threading for better resource utilization. For example, an application reads and processes files from the local file system. Lets say that reading of file from disk takes 5 seconds and processing it takes 2 seconds.
In above scenario, we say that using two threads one to read and other to process will save time. Because even when one thread is processing first file, other thread in parallel can start reading second file.
Question: Is this because of the way CPUs are designed. As in there is a different processing unit and different read/write unit so these two threads can work in parallel on even a single core machine as they are actually handled by different modules? Or this needs multiple core.
Sorry for being stupid. :)
On a single processor, multithreading is achieved through time slicing. One thread will do some work then it will switch to the other thread.
When a thread is waiting on some I/O, such as a file read, it will give up it's CPU time-slice prematurely allowing another thread to make use of the CPU.
The result is overall improved throughput compared to a single thread even on a single core.
Key for below:
= Doing work on CPU
- I/O
_ Idle
Single thread:
====--====--====--====--
Two threads:
====--__====--__====--__
____====--__====--__====
So you can see how more can get done in the same time as the CPU is kept busy where it would have been kept waiting before. The storage device is also being used more.
In theory yes. Single core has same parallelism. One thread waiting for read from file (I/O Wait), another thread is process file that already read before. First thread actually can not running state until I/O operations is completed. Rougly not use cpu resource at this state. Second thread consume CPU resource and complete task. Indeed, multi core CPU has better performance.
To start with, there is a difference between concurrency and parallelism. Theoretically, a single core machine does not support parallelism.
About the question on performance improvement as a result of concurrency (using threads), it is very implementation dependent. Take for instance, Android or Swing. Both of them have a main thread (or the UI thread). Doing large calculation on the main thread will block the UI and make in unresponsive. So from a layman perspective that would be a bad performance.
In your case(I am assuming there is no UI Thread) where you will benefit from delegating your processing to another thread depends on a lot of factors, specially the implementation of your threads. e.g. Synchronized threads would not be as good as the unsynchronized ones. Your problem statement reminds me of classic consumer producer problem. So use of threads should not really be the best thing for your work as you need synchronized threads. IMO It's better to do all the reading and processing in a single thread.
Multithreading will also have a context switching cost. It is not as big as Process's context switching, but it's still there. See this link.
[EDIT] You should preferably be using BlockingQueue for such producer consumer scenario.
Goal
I want to understand how to handle two thread pools simultaneously in java?
Consider a client server system in which clients are sending blocking I/O requests to the server (for example a file server ). There is a single ThreadPoolExecutor instance running on the server. Some types of client’s requests take much longer to process than other requests. These requests are called high I/O intensity requests. These high I/O intensity requests hog all threads and bring down entire application.
I want to solve this problem by two separate ThreadPoolExecutor.
I create two ThreadPoolExecutor instances ,one for high I/o intensity requests and another for low I/o intensity requests, and through offline workload procedure I create a lookup table to classify requests and when a request arrive I first search its class in the lookup table so that I can handover it to its corresponding thread pool.
Real Problem.
How to share processors equally to these two thread pools. Will this task be handled by JVM itself or I have to handle it by myself on application level ?
Should I make use of cluster and use another machine that run an instance of ThreadPoolExecutor to handle high I/O intensity requests?
Kindly give me proper design suggestions.
Generally is up to the system CPU scheduler how to distribute time between threads. Thread pool has nothing to do with thread scheduling. It can manage some threads reusing or synchronization between them.
The only advantage of creating 2 pools instead of 1 is that one pool can use ThreadFactory different than standard Executors.defaultThreadFactory(). You can give different priority for your demanding clients. Prority is a information that scheBut they would suffer even more then if you make them less important or vice versa ;)
Maybe you could rather do something like tuning it's priority when someone uses too much resources.
Here is some reference how does Microsoft uses priorities to tune threads CPU consumption.
No the JVM cannot route the ThreadPoolExecutor for you. You may implement a external watcher thread that monitor your threads and apply the appropriate policy to them (priority, exception handling and so on).
Take a look at this example:
http://tutorials.jenkov.com/java-multithreaded-servers/thread-pooled-server.html
I was going through the twitter finagle library which is an asynchronous service framework in scala and I have some question regarding asynchronous libraries in general.
So as I understand, the advantage of an synchronous library using a callback is that the application thread gets free and the library calls the callback as soon as the request is completed over the network. And in general the application threads might not have a 1:1 mapping with the library thread.
The service call in the library thread is blocking right?
If that's the case then we are just making the blocking call in some other thread. This makes the application thread free but some other thread is doing the same work. Can't we just increase the number of application threads to have that advantage?
It's possible that I mis-understand how the asynchronous libraries are implemented in Java/Scala or JVM in general. Can anyone help me understand how does this work?
Async approach is useful abstraction: your CPU-intensive thread offloads long-running IO operation to dedicated (maybe, belonging to a library) thread. When IO is done, some other thread will receive IO result.
Using blocking approach, you'll miss CPU ticks for your threads which are doing blocking IO call. And adding some more threads to ensure there's always free thread to do some CPU work means wasting OS/JVM resources for re-scheduling.
Blocking IO is used because it's simpler to program (no need to synchronize caller and callback).
Actually, IO is only one possible use-case where async style is useful. In general, whenever you feel that task at hand will benefit from splitting it into several activities, which can be run concurrently and would communicate with each other, this is the case for async style. Examples not connected to IO:
GUI: GUI event loop thread passes user input to background threads, and they do necessary processing;
Utilizing modern multi-core CPUs: if your task can be split in several subtasks, you can run these in separate threads, utilizing all available cores. Naturally, you'll need to gather results of subtasks, and you'll need async style here.