Understanding NodeJS & Non-Blocking IO

So, I've recently been infected with the Node virus, which is spreading through the programming world very fast.
I am fascinated by its "Non-Blocking IO" approach and have tried out a couple of programs myself.
However, I fail to understand certain concepts at the moment.
I need answers in layman's terms (I'm coming from a Java background).
1. Multithreading & Non-Blocking IO.
Let's consider a practical scenario. Say, we have a website where users can register. Below would be the code.
..
..
// Read HTTP Parameters
// Do some Database work
// Do some file work
// Return a confirmation message
..
..
In a traditional programming language, the above happens in a sequential way. And, if there are multiple requests for registration, the web server creates a new thread and the rest is history. Of course, programmers can create threads of their own to work on Line 2 and Line 3 simultaneously.
In Node, as I understand, Lines 2 & 3 will be run in parallel while the rest of the program gets executed and the Interpreter polls the lines 2 & 3 every 'x' ms.
Now, my question is, if Node is a single threaded language, what does the job of lines 2 & 3 while the rest of the program is being executed?
2. Scalability
I recently read that LinkedIn has adopted Node as a back-end for their mobile apps and has seen massive improvements.
Can anyone explain how it has made such a difference?
3. Adoption in other programming languages
If people claim that Node makes a lot of difference when it comes to performance, why haven't other programming languages adopted this Non-Blocking IO paradigm?
I'm sure I'm missing something. If you can explain this and point me to some links, that would be helpful.
Thanks.

A similar question was asked and probably contains all the info you're looking for: How the single threaded non blocking IO model works in Node.js
But I'll briefly cover your 3 parts:
1.
Lines 2 and 3 in a very simple form could look like:
db.query(..., function(query_data) { ... });
fs.readFile('/path/to/file', function(err, file_data) { ... });
Now function(query_data) and function(err, file_data) are callbacks. The functions db.query and fs.readFile send the actual I/O requests, but the callbacks allow the processing of the data from the database or the file to be delayed until the responses are received. Node doesn't really "poll lines 2 and 3". The callbacks are added to an event loop and associated with some file descriptors for their respective I/O events. The event loop then polls the file descriptors to see if they are ready to perform I/O. If they are, it executes the callback functions with the I/O data.
I think the phrase "Everything runs in parallel except your code" sums it up well. For example, something like "Read HTTP parameters" would execute sequentially, but I/O functions like in lines 2 and 3 are associated with callbacks that are added to the event loop and execute later. So basically the whole point is it doesn't have to wait for I/O.
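Since the question comes from a Java background, here is a minimal Java sketch of the same idea: I/O requests are started and return immediately, and their callbacks run later, one at a time, on the single loop thread. This is not how Node is implemented; the class EventLoopSketch, the method names and the simulated delays are all made up for illustration, and a thread pool stands in for the operating system doing the I/O.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Consumer;

class EventLoopSketch {
    // Callbacks whose I/O has finished wait here until the loop runs them.
    private final BlockingQueue<Runnable> ready = new LinkedBlockingQueue<>();
    // Stand-in for the OS / I/O subsystem doing the slow work elsewhere (assumption).
    private final ExecutorService io = Executors.newCachedThreadPool();
    // Number of started requests whose callbacks have not run yet (loop thread only).
    private int pending = 0;

    // Start a simulated I/O request and return immediately.
    void startIo(String result, long delayMs, Consumer<String> callback) {
        pending++;
        io.execute(() -> {
            try {
                Thread.sleep(delayMs); // pretend to wait for the disk or database
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            ready.add(() -> callback.accept(result));
        });
    }

    // The loop: run finished callbacks one at a time, on the calling thread only.
    void run() {
        try {
            while (pending > 0) {
                ready.take().run();
                pending--;
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            io.shutdown();
        }
    }

    // The registration example from the question, as a list of what happened in order.
    static List<String> registerUser() {
        List<String> log = new ArrayList<>();
        EventLoopSketch loop = new EventLoopSketch();
        log.add("read HTTP parameters");
        loop.startIo("db row", 200, data -> log.add("db callback: " + data));
        loop.startIo("file bytes", 50, data -> log.add("file callback: " + data));
        log.add("both requests started"); // reached without waiting for either result
        loop.run();
        log.add("return confirmation");
        return log;
    }

    public static void main(String[] args) {
        registerUser().forEach(System.out::println);
    }
}
```

Note that "both requests started" is logged before either callback runs, and the callbacks themselves never run concurrently with each other. With the delays chosen here, the file callback should normally run before the database callback.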
2.
Because of the things explained in 1., Node scales well for I/O intensive requests and allows many users to be connected simultaneously. It is single threaded, so it doesn't necessarily scale well for CPU intensive tasks.
3.
This paradigm has been used with JavaScript because JavaScript has support for callbacks, event loops and closures that make this easy. This isn't necessarily true in other languages.
I might be a little off, but this is the gist of what's happening.

Q1. "what does the job of lines 2 & 3 while the rest of the program is being executed?"
Answer: "Nothing". Lines 2 and 3 each start their respective jobs, but those jobs cannot be done immediately because (for example) the disk sectors required are not loaded yet. So the operating system issues a call to the disk to go get those sectors, then "nothing happens" (node goes on with its next task) until the disk subsystem (later) issues an interrupt to report they're ready, at which point node returns control to the callbacks of lines #2 and #3.
Q2. Single-threaded non-blocking I/O dedicates almost no resources to each incoming connection (just some housekeeping data about the connected socket). It's very memory efficient. Traditional web servers "fork" a whole new process to handle each new connection. That means making a humongous copy of every bit of code and data variables needed, and time-slicing the CPU to deal with it all. That's massively wasteful of resources. Thus, if your load is a lot of idle connections waiting for stuff, as was theirs, node makes loads more sense.
Q3. Almost every programming language already has non-blocking I/O if you want to use it. Node is not a programming language, it's a web server that runs JavaScript and uses non-blocking I/O (e.g. I personally wrote my own identical thing 10 years ago in Perl, as did Google (in C) when they started, and I'm sure loads of other people have similar web servers too). The non-blocking I/O is not the hard part; getting the programmer to understand how to use it is the tricky bit. JavaScript happens to work well for that, because those programmers are already familiar with event programming.

Even though node.js has been around for a few years, its performance model is still a bit mysterious.
I recently started a blog and decided that the node.js model would be a good first topic since I wanted to understand it better myself and it would be helpful to others to share what I learned. Here are a couple of articles I wrote that explain the high level concepts and some tradeoffs:
Blocking vs. Non-Blocking I/O – What’s going on?
Understanding node.js Performance

Related

Reactive Programming vs Thread Based Programming

I am new to this concept and want to have a great understanding of this topic.
To make my point clear I want to take an analogy.
Let's take the scenario of Node.js, which is single-threaded and provides fast IO operations using an event loop. That makes sense, since it is single-threaded and is not blocked by any task.
While studying reactive programming in Java using Reactor, I came across a situation where the main thread is blocked when an object subscribes and some delay event takes place.
Then I came across subscribeOn(Schedulers.boundedElastic()) and many more pipeline operators like it.
I get that they are trying to make it asynchronous by moving those subscribers to other threads.
But if it works like this, then why is it called asynchronous? Is it not thread-based programming?
If we are trying to achieve the async behaviour of Node.js, then in my view it should happen on a single thread.
Summary of my question:
I don't get why reactive programming is called asynchronous or functional programming, for two reasons:
1. The main thread is blocked.
2. We can manage the threads and run work in another pool. We can also define a Runnable service / Callable.

First of all, you can't compare asynchronous with functional programming. It's like comparing a rock with a banana. They are two separate things.
Functional programming is compared to other styles of programming, like object-oriented programming or procedural programming.
Reactor is a Java library, and Java is an object-oriented programming language with functional features.
Asynchronous I will explain with what Wikipedia says:
Asynchrony, in computer programming, refers to the occurrence of events independent of the main program flow and ways to deal with such events.
So basically how to handle stuff "around" your application, that is not a part of the main flow of your program.
Compare that with blocking; Wikipedia again:
A process that is blocked is one that is waiting for some event, such as a resource becoming available or the completion of an I/O operation.
A traditional servlet application works by assigning one thread per request.
So every time a request comes in, a thread is assigned, and this thread follows the request along until the request returns. If something blocks during this request, for instance reading a file from the operating system or making a request to another service, the assigned thread will block and wait until the reading of the file is completed, or the request has returned, etc.
Reactive works with subscribers and producers and makes heavy use of the observer pattern. This means that as soon as something would have to wait, Reactor can take that thread and use it for something else. And when the wait is over, any thread can pick up where it left off. This keeps every thread in use and as close to fully utilized as possible.
Everything processed in Reactor is done by an event loop. The event loop is a single-threaded loop that just processes events as quickly as possible. Schedulers schedule things to be processed on the event loop, and after they are processed a scheduler picks up the result and carries on.
If you just run Reactor, you get a default scheduler that will schedule things for you completely automatically.
But let's say you have something blocking. Well, then you will stop the event loop, and everything needs to wait for that thing to finish.
When you run a fully reactive application, you usually get one event loop per core during startup. So let's say you have 4 cores: you get 4 event loops, and if you block one, then for the duration of that blockage your application runs 25% slower.
25% slower is a lot!
Well, sometimes you have something blocking that you can't avoid. For instance, an old database that doesn't have a non-blocking driver. Or you need to read files from the operating system in a blocking manner. What do you do then?
Well, the Reactor team built in a fallback: if you use subscribeOn in combination with its own elastic thread pool, then you get the old servlet behaviour back for that single subscriber on, say, a specific endpoint.
This makes sure that you can run fully reactive stuff side by side with old legacy blocking things, so that some requests use the old servlet behaviour while other requests are fully non-blocking.
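The fallback described above can be imitated with nothing but the JDK, which may make the idea easier to see. The following is only an analogy, not Reactor's API or implementation: a single-threaded executor plays the role of the event loop, a cached thread pool plays the role of the elastic pool, and blockingLegacyCall is a hypothetical stand-in for a blocking driver.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class OffloadBlockingSketch {
    // Hypothetical legacy call that blocks whichever thread runs it.
    static String blockingLegacyCall() {
        try {
            Thread.sleep(100); // simulated blocking I/O
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "legacy result from " + Thread.currentThread().getName();
    }

    static String[] demo() {
        // Stand-in for one event loop thread (assumption, not Reactor's own loop).
        ExecutorService eventLoop =
                Executors.newSingleThreadExecutor(r -> new Thread(r, "event-loop"));
        // Stand-in for an elastic pool reserved for blocking work.
        ExecutorService blockingPool =
                Executors.newCachedThreadPool(r -> new Thread(r, "blocking-worker"));
        try {
            // The blocking call is started on the separate pool...
            CompletableFuture<String> legacy =
                    CompletableFuture.supplyAsync(OffloadBlockingSketch::blockingLegacyCall, blockingPool);
            // ...so the event loop thread stays free for non-blocking work in the meantime.
            CompletableFuture<String> fast =
                    CompletableFuture.supplyAsync(
                            () -> "fast result from " + Thread.currentThread().getName(), eventLoop);
            return new String[] { fast.join(), legacy.join() };
        } finally {
            eventLoop.shutdown();
            blockingPool.shutdown();
        }
    }

    public static void main(String[] args) {
        for (String line : demo()) {
            System.out.println(line);
        }
    }
}
```

The point of the design is isolation: the blocking call still occupies a thread, as in the servlet model, but it is a thread from a pool set aside for that purpose rather than the event loop itself.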
Your question is not very clear, so I am giving you a very unclear answer. I suggest you read the Reactor documentation and try out all their examples, as most of this information comes from there.

Project Loom: what makes the performance better when using virtual threads?

To give some context here, I have been following Project Loom for some time now. I have read The state of Loom. I have done asynchronous programming.
Asynchronous programming (provided by Java NIO) returns the thread to the thread pool when the task waits, and it goes to great lengths to not block threads. This gives a large performance gain: we can now handle many more requests, as they are not directly bound by the number of OS threads. But what we lose here is the context. The same task is now NOT associated with just one thread. All the context is lost once we dissociate tasks from threads. Exception traces do not provide very useful information and debugging is difficult.
In comes Project Loom with virtual threads that become the single unit of concurrency. And now you can perform a single task on a single virtual thread.
It's all fine until now, but the article goes on to state, with Project Loom:
A simple, synchronous web server will be able to handle many more requests without requiring more hardware.
I don't understand how we get performance benefits with Project Loom over asynchronous APIs. The asynchronous APIs make sure to not keep any thread idle. So, what does Project Loom do to make it more efficient and performant than asynchronous APIs?
EDIT
Let me rephrase the question. Let's say we have an HTTP server that takes in requests and does some CRUD operations with a backing persistent database. Say this HTTP server handles a lot of requests: 100K RPM. There are two ways of implementing this:
The HTTP server has a dedicated pool of threads. When a request comes in, a thread carries the task up until it reaches the DB, wherein the task has to wait for the response from DB. At this point, the thread is returned to the thread pool and goes on to do the other tasks. When DB responds, it is again handled by some thread from the thread pool and it returns an HTTP response.
The HTTP server just spawns virtual threads for every request. If there is an IO, the virtual thread just waits for the task to complete. And then returns the HTTP Response. Basically, there is no pooling business going on for the virtual threads.
Given that the hardware and the throughput remain the same, would any one solution fare better than the other in terms of response times or handling more throughput?
My guess is that there would not be any difference w.r.t performance.

We don't get a benefit over asynchronous APIs. What we potentially will get is performance similar to asynchronous code, but with synchronous code.

The answer by @talex puts it crisply. Adding further to it:
Loom is more about a native concurrency abstraction, which additionally helps one write asynchronous code. Given that it's a VM-level abstraction, rather than just a code-level one (like what we have been doing until now with CompletableFuture etc.), it lets one implement asynchronous behavior with reduced boilerplate.
With Loom, a more powerful abstraction is the savior. We have seen repeatedly how an abstraction with syntactic sugar lets one write programs effectively, whether it was functional interfaces in JDK 8 or for-comprehensions in Scala.
With Loom, there isn't a need to chain multiple CompletableFutures (to save on resources); one can write the code synchronously. With each blocking operation encountered (ReentrantLock, I/O, JDBC calls), the virtual thread gets parked. And because these are lightweight threads, the context switch is far cheaper than with kernel threads.
When blocked, the actual carrier-thread (that was running the run-body of the virtual thread), gets engaged for executing some other virtual-thread's run. So effectively, the carrier-thread is not sitting idle but executing some other work. And comes back to continue the execution of the original virtual-thread whenever unparked. Just like how a thread-pool would work. But here, you have a single carrier-thread in a way executing the body of multiple virtual-threads, switching from one to another when blocked.
We get the same behavior (and hence performance) as manually written asynchronous code, while avoiding the boilerplate needed to do the same thing.
Consider the case of a web-framework, where there is a separate thread-pool to handle i/o and the other for execution of http requests. For simple HTTP requests, one might serve the request from the http-pool thread itself. But if there are any blocking (or) high CPU operations, we let this activity happen on a separate thread asynchronously.
This thread would collect the information from an incoming request, spawn a CompletableFuture, and chain it with a pipeline (read from the database as one stage, followed by computation on the result, followed by another stage to write back to the database, web service calls, etc.). Each one is a stage, and the resultant CompletableFuture is returned to the web framework.
When the resultant future is complete, the web-framework uses the results to be relayed back to the client. This is how Play-Framework and others, have been dealing with it. Providing an isolation between the http thread handling pool, and the execution of each request. But if we dive deeper in this, why is it that we do this?
One core reason is to use the resources effectively, particularly around blocking calls. Hence we chain with thenApply etc. so that no thread is blocked on any activity, and we do more with fewer threads.
This works great, but it is quite verbose. Debugging is indeed painful, and if one of the intermediary stages results in an exception, the control flow goes haywire, resulting in further code to handle it.
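To make the verbosity point concrete, here is a small sketch of the same three-stage request written both ways. The stage methods (readFromDatabase, compute, writeBack) are hypothetical placeholders that return immediately; in a real service they would be I/O calls, and the asynchronous version would more likely use thenCompose with stages that themselves return futures.

```java
import java.util.concurrent.CompletableFuture;

class PipelineSketch {
    // Hypothetical stages, standing in for real database and service calls.
    static String readFromDatabase(int userId) { return "user-" + userId; }
    static String compute(String row) { return row.toUpperCase(); }
    static String writeBack(String value) { return "saved:" + value; }

    // Asynchronous style: each stage is chained onto the completion of the previous one,
    // and failures are handled in a separate stage at the end of the chain.
    static CompletableFuture<String> handleAsync(int userId) {
        return CompletableFuture.supplyAsync(() -> readFromDatabase(userId))
                .thenApply(PipelineSketch::compute)
                .thenApply(PipelineSketch::writeBack)
                .exceptionally(ex -> "failed: " + ex.getMessage());
    }

    // Synchronous style: the same steps as plain sequential code with an ordinary try/catch.
    static String handleSync(int userId) {
        try {
            return writeBack(compute(readFromDatabase(userId)));
        } catch (RuntimeException ex) {
            return "failed: " + ex.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(handleAsync(42).join());
        System.out.println(handleSync(42));
    }
}
```

Both versions compute the same value. The argument in this answer is that with virtual threads the second form can be used without tying up an OS thread while it waits.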
With Loom, we write synchronous code, and let someone else decide what to do when blocked. Rather than sleep and do nothing.

The http server has a dedicated pool of threads ....
How big of a pool? (Number of CPUs)*N + C? With N>1 one can fall back to anti-scaling, as lock contention extends latency, whereas N=1 can under-utilize available bandwidth. There is a good analysis here.
The http server just spawns...
That would be a very naive implementation of this concept. A more realistic one would strive for collecting from a dynamic pool which kept one real thread for every blocked system call + one for every real CPU. At least that is what the folks behind Go came up with.
The crux is to keep the {handlers, callbacks, completions, virtual threads, goroutines : all PEAs in a pod} from fighting over internal resources; thus they do not lean on system-based blocking mechanisms until absolutely necessary. This falls under the banner of lock avoidance, and might be accomplished with various queuing strategies (see libdispatch), etc. Note that this leaves the PEA divorced from the underlying system thread, because they are internally multiplexed between them. This is your concern about divorcing the concepts. In practice, you pass around your favourite language's abstraction of a context pointer.
As 1 indicates, there are tangible results that can be directly linked to this approach, and a few intangibles. Locking is easy: you just make one big lock around your transactions and you are good to go. That doesn't scale, but fine-grained locking is hard. Hard to get working, hard to choose the fineness of the grain. When to use { locks, CVs, semaphores, barriers, ... } is obvious in textbook examples; a little less so in deeply nested logic. Lock avoidance makes that, for the most part, go away, and be limited to contended leaf components like malloc().
I maintain some skepticism, as the research typically shows a poorly scaled system, which is transformed into a lock avoidance model, then shown to be better. I have yet to see one which unleashes some experienced developers to analyze the synchronization behavior of the system, transform it for scalability, then measure the result. But even if that were a win, experienced developers are a rare(ish) and expensive commodity; the heart of scalability is really financial.

How is it possible to write event based single threaded programs?

My knowledge of threads is very limited. I happen to be the guy who can write multi-threaded programs but just by copy-pasting and finding answers to my questions on the internet. But I've finally decided to learn a bit about concurrency and bought the book "Java Concurrency in Practice". After reading a couple of pages, I'm confident that I'll learn a great deal from this book.
Maybe I'm being a little impatient but I cannot resist the temptation of asking this question. It made me create an account on Stack Overflow. I'm not sure I'll be able to correctly phrase the question so I'll try to explain my question using an example.
If I had to write an (extremely unprofessionally coded) peer-to-peer chat client in, say, Java, I'd initiate a socket connection between the clients and keep it alive, because messages can arrive at any time. The solution I can imagine would open a socket connection in a new thread and run a while loop continuously to keep the thread alive, as the thread dies as soon as run returns. For some reason, I cannot imagine a similar chat client in a single-threaded program. How can you keep "waiting" until a message arrives if all you have is a single thread? Won't that block the execution of the entire program?
To solve such a problem, what's the alternative to a continuous while loop?

How can you keep "waiting" until a message arrives if all you have is a single thread.
One possibility is to have the "parallelism" happen "outside" of your application. Imagine a waiter in a restaurant. Just one guy. He walks from one customer to the next, and writes up the orders. From time to time, he walks over to the counter, puts in the orders, and picks up whatever stuff the chef left for him. Just one guy, walking around, doing "single task" work. But in the end, the overall system still has multiple actors (the guests, the waiter, the chef, the guy behind the bar preparing the beverages). So, the waiter could be seen as "single threaded", but in the end, the overall system "restaurant" isn't.
Some IT architectures "mimic" that, for example around the idea of "non blocking" IO. That is how node.js works. It is single threaded by nature, but does async IO (see here for details). And you can do similar things with Java, too.
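In Java, one way to "do similar things" is java.nio with a Selector: a single thread registers many channels and sleeps in select() until any of them has data. The sketch below is an assumption-laden toy, not a chat client: in-process Pipe channels stand in for the peer sockets (a real client would register SocketChannel instances instead), and the class name and message texts are made up.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

class SelectorSketch {
    // Waits on several channels at once using only the calling thread.
    static List<String> receive(int peers) {
        List<String> messages = new ArrayList<>();
        try (Selector selector = Selector.open()) {
            List<Pipe> pipes = new ArrayList<>();
            for (int i = 0; i < peers; i++) {
                Pipe pipe = Pipe.open(); // stand-in for a connection to one peer
                pipe.source().configureBlocking(false);
                pipe.source().register(selector, SelectionKey.OP_READ);
                pipes.add(pipe);
            }
            // Simulate every peer sending one short message.
            for (int i = 0; i < peers; i++) {
                byte[] bytes = ("hello from peer " + i).getBytes(StandardCharsets.UTF_8);
                pipes.get(i).sink().write(ByteBuffer.wrap(bytes));
            }
            ByteBuffer buffer = ByteBuffer.allocate(256);
            while (messages.size() < peers) {
                selector.select(); // sleeps until at least one channel is readable
                Iterator<SelectionKey> keys = selector.selectedKeys().iterator();
                while (keys.hasNext()) {
                    SelectionKey key = keys.next();
                    keys.remove();
                    buffer.clear();
                    int n = ((Pipe.SourceChannel) key.channel()).read(buffer);
                    if (n > 0) {
                        messages.add(new String(buffer.array(), 0, n, StandardCharsets.UTF_8));
                    }
                }
            }
            for (Pipe pipe : pipes) {
                pipe.sink().close();
                pipe.source().close();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return messages;
    }

    public static void main(String[] args) {
        receive(3).forEach(System.out::println);
    }
}
```

There is still a while loop, but it does not spin: the thread is parked inside select() and is woken by the operating system only when there is something to read, which is the waiter picking up whatever is ready at the counter. The order in which the messages are collected is not guaranteed.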
On the other hand, when you learn about concurrency, you still want to learn about the "real" multi threading, what it means, and how you would write code to "use" that concept.

Java simple Analytics/Event Stream Processing with front end

My application takes a lot of measurements of its internal processes. For example, I time certain methods, I time external webservice calls, and I also have variables which have a changing value, and processes which have a 'state' (e.g. PAUSED, WAITING etc).
The application uses 100 to 200 threads, and each bit of data would be associated with a particular thread.
I am looking for some software that I can channel all this information into that would produce useful metrics and graphs of the data (ideally in real time or close to real time), let me set thresholds to trigger warnings, would allow me to filter the data by thread or thread group, etc etc.
The application is performing time critical tasks so the software/api would need to be very fast and never block.
The application is written in Java, and ideally the software/API would be in Java as well. I think what I'm looking for is called Event Stream Processing, but I'm really not sure what language to use to describe it.
All I've found so far are Esper and ERMA. Can anyone give me a recommendation? I'm the only one working on this project so I'm hoping for something that is pretty easy to set up and use, and has a workable front end.

In the end I found Graphite, which was pretty close to being exactly what I wanted. It's not the simplest thing to set up and configure, but I got it working in the end.
http://graphite.wikidot.com/
In my case I send data directly from my application to StatsD (via UDP), which collects the data and does some pre-processing before it ends up in the Whisper back end. There is a simple example of a Java interface here: https://github.com/etsy/statsd/commit/2253223f3c19d2149d65ec5bc802198ff93da4cb
Alternatively you could send your data directly to graphite, example here http://neopatel.blogspot.co.uk/2011/04/logging-to-graphite-monitoring-tool.html
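For a rough idea of what sending a timing to StatsD over UDP looks like from Java, here is a minimal sketch. It is not the linked example: the class, the metric name and the loopback demo are made up for illustration, and only the plain-text line format name:value|ms is taken from StatsD. A real client would normally reuse one socket and point at the StatsD host, whose default port is 8125.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.nio.charset.StandardCharsets;

class StatsdSketch {
    // StatsD timing line: <metric name>:<value>|ms
    static String timingLine(String metric, long millis) {
        return metric + ":" + millis + "|ms";
    }

    // Fire-and-forget UDP send; the sender does not wait for any reply.
    static void send(String line, InetAddress host, int port) {
        byte[] payload = line.getBytes(StandardCharsets.UTF_8);
        // A new socket per call keeps the sketch short; reuse one in real code.
        try (DatagramSocket socket = new DatagramSocket()) {
            socket.send(new DatagramPacket(payload, payload.length, host, port));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Demo only: a local socket plays the role of the StatsD daemon and echoes what it got.
    static String loopbackDemo(String line) {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (DatagramSocket receiver = new DatagramSocket(0, loopback)) {
            receiver.setSoTimeout(2000);
            send(line, loopback, receiver.getLocalPort());
            byte[] buffer = new byte[512];
            DatagramPacket packet = new DatagramPacket(buffer, buffer.length);
            receiver.receive(packet);
            return new String(packet.getData(), 0, packet.getLength(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // "app.webservice.call" is a hypothetical metric name.
        System.out.println(loopbackDemo(timingLine("app.webservice.call", 42)));
    }
}
```

UDP suits the "very fast and never block" requirement reasonably well because the application does not wait for an acknowledgement, at the cost that packets can be silently dropped.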

What is event driven io (context: Apache MINA, JBoss Netty)?

I want to understand what event-driven IO is. I hear it is different from the traditional blocking request/response model. Is there an example that explains this? And how does it contribute to an increase in performance?
Examples will be highly appreciated.

I'm guessing since it's been 4 months you've got your answers. Regardless here goes...
Netty
http://www.jboss.org/netty
Mina
http://mina.apache.org/
C10K
http://www.kegel.com/c10k.html
To understand part of the problem that evented IO is trying to solve, take a look at the C10K link above. Scalability is one of the main benefits of evented IO.
A traditional web server will handle a request and then return a response (synchronous/blocking). Each request would typically require its own thread.
An event-driven web server will handle a request, then create an event (asynchronous/non-blocking IO), and then return the response. Multiple requests are shared by a single thread/process.
Evented IO should be able to handle more requests per thread than a typical web server. You might not speed up your web application with evented IO, but it should handle large numbers of connections a lot more easily than a traditional web server. This means requiring fewer machines for scaling.
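The "more requests per thread" claim can be illustrated with a toy, using a scheduled executor as a stand-in for an event-driven server. Everything here is an assumption for the sake of the illustration: the "requests" are just timers that simulate waiting on I/O, and the class name is made up. Netty and MINA work with real sockets and are considerably more involved.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class ManyRequestsOneThread {
    // Returns { requests completed, distinct threads that ran a handler }.
    static int[] serve(int requests, long waitMs) {
        ScheduledExecutorService loop = Executors.newSingleThreadScheduledExecutor();
        AtomicInteger completed = new AtomicInteger();
        Set<String> threads = ConcurrentHashMap.newKeySet();
        CountDownLatch done = new CountDownLatch(requests);
        for (int i = 0; i < requests; i++) {
            // Register an event for when this request's simulated I/O wait is over.
            // Nothing blocks here; all the waits overlap.
            loop.schedule(() -> {
                threads.add(Thread.currentThread().getName());
                completed.incrementAndGet();
                done.countDown();
            }, waitMs, TimeUnit.MILLISECONDS);
        }
        try {
            done.await(30, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            loop.shutdown();
        }
        return new int[] { completed.get(), threads.size() };
    }

    public static void main(String[] args) {
        int[] result = serve(1000, 100);
        System.out.println(result[0] + " requests handled by " + result[1] + " thread(s)");
    }
}
```

A thread-per-request design would need 1000 threads to overlap the same 1000 waits, or would take 1000 times 100 ms if it handled them one after another on a single blocked thread.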
Though I would argue that an evented IO architecture will force you to develop your web application to handle smaller chunks of data, much like a Google Mail type application that uses a lot of AJAX calls to poll for data on the server and then does small updates in the browser. This itself has many benefits that will help speed up AND improve scaling on your server.
Netty and Mina provide plenty of example code.

This is a very old question, but I assume this might help somebody else understand event-driven programming:
The following analogy might help you understand event-driven I/O programming by drawing a parallel to the waiting line at a doctor's reception desk.
Blocking I/O is like standing in a queue where the receptionist asks the guy in front of you to fill in a form and waits until he finishes. You have to wait for your turn until the guy finishes his form; this is blocking.
If a single guy takes 3 minutes to fill in the form, the 10th guy has to wait about 30 minutes. To reduce the 10th guy's wait time, the solution would be to increase the number of receptionists, which is costly. This is what happens in traditional web servers. If you request some user info, subsequent requests by other users have to wait until the current operation, fetching from the database, is completed. This increases the "time to response" of the 10th request, and the wait grows linearly with each additional user in the queue. To avoid this, traditional web servers create a thread (equivalent to increasing the number of receptionists) for every single request, i.e., basically a copy of the server for each request, which is costly in terms of CPU consumption since every request needs an operating system thread. To scale up the app, you would have to throw lots of computation power at it.
Event Driven: The other approach to improving the queue's "response time" is the event-driven approach, where the guys in the queue are handed the form, asked to fill it in and come back on completion. Hence the receptionist can always take requests. This is exactly what JavaScript has been doing since its inception. In the browser, JavaScript responds to user click events, scrolls, swipes, database fetches and so on. This is inherently possible in JavaScript because it treats functions as first-class objects: they can be passed as parameters to other functions (called callbacks) and called on completion of a particular task. This is exactly what node.js does on the server. You can find more info about event driven programming and blocking I/O, in the context of node, here
