Output result of heavy calculation in Vert.x with back-pressure - java

The application in question handles requests from clients which then require a lot of calculation on the server side. This calculation is done piece-by-piece, so if the client is slow to read, the calculation should not progress (it should respond to back-pressure).
The calculation is now represented as a Supplier<Buffer>, in which the get() call might take a long time and needs to be called multiple times until it responds with null (no more data). The get() should be called in a separate thread-pool (which is shared with other requests), and should only be called if the client is really able to accept the data.
My current code is:
ReadStream<Buffer> readStream = new MyComplicatedReadStream(supplier, executor)
.exceptionHandler(request::fail)
.endHandler(x -> request.response().end());
Pump.pump(readStream, request.response()).start();
I've made a custom implementation of ReadStream to do this, which sort-of works, but is long, clunky and has synchronization issues.
Instead of fixing that, I wonder if there is an idiomatic way in Vert.x / Rx to implement or instantiate a MyComplicatedReadStream. That is, given a Supplier<Buffer> and an ExecutorService, get a ReadStream<Buffer> which executes get() with the given executor and doesn't generate data while it is paused.

I have almost no experience with Vert.x, but I do have some experience with RxJava, so there might be a better way to do this. From the RxJava perspective, you can use the generate method to create 'cold' flowables, which only generate items on demand. I believe that in this case, when the stream is paused, no additional calls to supplier.get() will be made, as there is no 'demand'.
I'm using Kotlin syntax here, but I think you can derive the Java version easily.
Flowable.generate<Buffer> { emitter ->
val nextValue = supplier.get()
if (nextValue == null) {
emitter.onComplete()
} else {
emitter.onNext(nextValue)
}
}.subscribeOn(Schedulers.from(executor)) // this will make the above callback run in the given executor
Since it seems that the supplier is holding some state, you may in some cases want to generate a 'new supplier' for each consumer, in which case you can use the overload of the generate method that allows specifying another callback to get an instance of the state (supplier in your case). http://reactivex.io/RxJava/2.x/javadoc/io/reactivex/Flowable.html#generate-java.util.concurrent.Callable-io.reactivex.functions.BiConsumer-
It looks like you can then convert the flowable to a read stream:
ReadStream<Buffer> readStream = FlowableHelper.toReadStream(flowable);
based on the docs here: https://vertx.tk/docs/vertx-rx/java2/#_read_stream_support

Related

How is Apache NIO HttpAsyncClient performing non-blocking HTTP Client

How is Apache NIO HttpAsyncClient able to wait for a remote response without blocking any thread? Does it have a way to setup a callback with the OS (I doubt so?). Otherwise does it perform some sort of polling?
EDIT - THIS ANSWER IS WRONG. PLEASE IGNORE AS IT IS INCORRECT.
You did not specify a version, so I can not point you to source code. But to answer your question, the way that Apache does it is by returning a Future<T>.
Take a look at this link -- https://hc.apache.org/httpcomponents-asyncclient-4.1.x/current/httpasyncclient/apidocs/org/apache/http/nio/client/HttpAsyncClient.html
Notice how the link says nio in the package. That stands for "non-blocking IO". And 9 times out of 10, that is done by doing some work with a new thread.
This operates almost exactly like a CompletableFuture<T> from your first question. Long story short, the library kicks off the process in a new thread (just like CompletableFuture<T>), stores that thread into the Future<T>, then allows you to use that Future<T> to manage that newly created thread containing your non-blocking task. By doing this, you get to decide exactly when and where the code blocks, potentially giving you the chance to make some significant performance optimizations.
To be more explicit, let's give a pseudocode example. Let's say I have a method attached to an endpoint. Whenever the endpoint is hit, the method is executed. The method takes in a single parameter --- userID. I then use that userID to perform 2 operations --- fetch the user's personal info, and fetch the user's suggested content. I need both pieces, and neither request needs to wait for the other to finish before starting. So, what I do is something like the following.
public StoreFrontPage visitStorePage(int userID)
{
final Future<UserInfo> userInfoFuture = this.fetchUserInfo(userID);
final Future<PageSuggestion> recommendedContentFuture = this.fetchRecommendedContent(userID);
final UserInfo userInfo = userInfoFuture.get();
final PageSuggestion recommendedContent = recommendedContentFuture.get();
return new StoreFrontPage(userInfo, recommendedContent);
}
When I call this.fetchUserInfo(userID), my code creates a new thread and starts fetching user info on that new thread, but lets my main thread continue and kick off this.fetchRecommendedContent(userID) in the meantime. The 2 fetches are occurring in parallel.
However, I need both results in order to create my StoreFrontPage. So, when I decided that I cannot continue any further until I have the results from both fetches, I call Future::get on each of my fetches. What this method does is merge the new thread back into my original one. In short, it says "wait for that one thread you created to finish doing what it was doing, then output the result as a return value".
And to more explicitly answer your question, no, this tool does not require you to do anything involving callbacks or polling. All it does is give you a Future<T> and lets you decide when you need to block the thread to wait on that Future<T> to finish.
EDIT - THIS ANSWER IS WRONG. PLEASE IGNORE AS IT IS INCORRECT.

equivalent of an ExecutorService in D language?

The D documentation is a bit difficult to understand, how do I achieve the following Java code in D?
ExecutorService service = Executors.newFixedThreadPool(num_threads);
for (File f : files) {
service.execute(() -> process(f));
}
service.shutdown();
try {
service.awaitTermination(24, TimeUnit.HOURS);
} catch (InterruptedException e) {
e.printStackTrace();
}
Would I use std.parallelism or std.concurrency or is this functionality not available in the standard library.
The example you posted is best represented by std.parallelism. You can use the parallel helper function from it: when used in a foreach, it automatically executes the body of the loop in a thread pool with a thread count (worker size) of totalCPUs - 1. You can change this default by setting defaultPoolThreads = x; before running any parallel code (best done at the start of your main) or by using a custom taskPool.
basically then your code would translate to this:
foreach (f; files.parallel) {
process(f); // or just paste what should be done with f in here if it matters
}
std.parallelism is the high-level implementation of multithreading. If you want to just have a task pool you can create a new TaskPool() (with number of workers as optional argument) and then do the same as above using service.parallel(files).
Alternatively you could queue lots of tasks using
foreach (f; files) {
service.put(task!process(f)); // put expects a Task, created here with task!process
}
service.finish(true); // true = blocking
// you could also do false here in a while true loop with sleeps to implement a timeout
which would then allow you to implement a timeout.
Though I would recommend using parallel because it handles the code above for you + gives each thread a storage to access the local stack so you can use it just the same as a normal non-parallel foreach loop.
A side-note/explanation on the documentation:
The std.concurrency is also very useful, though not what you would use with your example. In it there is a spawn function which is spawning a new thread with the powerful messaging API. With the messaging API (send and receive) you can implement thread-safe value passing between threads without using sockets, files or other workarounds.
When you have a task (thread with messaging API) and call receive in it it will wait until the passed timeout is done or another thread calls the send function on the task. For example you could have a file loading queue task which always waits using receive and when e.g. the UI puts a file into the loading queue (just by calling send once or more) it can work on these files and send them back to the UI task which receives using a timeout in the main loop.
std.concurrency also has a FiberScheduler which can be used to do thread style programming in a single thread. For example if you have a UI which does drawing and input handling and all sorts of things it can then in the main loop on every tick call the FiberScheduler and all the currently running tasks will continue where they last stopped (by calling yield). This is useful when you have like an image generator which takes long to generate, but you don't want to block the UI for too long so you call yield() every iteration or so to halt the execution of the generator and do one step of the main loop.
When fibers aren't running they can even be passed around threads so you can have a thread pool from std.parallelism and a custom FiberScheduler implementation and do load balancing which could be useful in a web server for example.
If you want to create Fibers without a FiberScheduler and call them raw (and check their finish states and remove them from any custom scheduler implementation) you can inherit the Fiber class from core.thread, which works exactly the same as a Thread, you just need to call Fiber.yield() every time you wait or think you are in a CPU intensive section.
Though because most APIs aren't made for Fibers they will block and make Fibers seem kind of useless, so you definitely want to use some API which uses Fibers there. For example vibe.d has lots of fiber based functions, but a custom std.concurrency implementation so you need to look out for that.
But just to come back to your question, a TaskPool or in your particular case the parallel function is what you need.
https://dlang.org/phobos/std_parallelism.html#.parallel
https://dlang.org/phobos/std_parallelism.html#.TaskPool.parallel

Difference between Futures(Guava)/CompletableFuture and Observable(RxJava) [duplicate]

I would like to know the difference between
CompletableFuture,Future and Observable RxJava.
What I know is all are asynchronous but
Future.get() blocks the thread
CompletableFuture gives the callback methods
RxJava Observable --- similar to CompletableFuture with other benefits(not sure)
For example: if client needs to make multiple service calls and when we use Futures (Java) Future.get() will be executed sequentially...would like to know how its better in RxJava..
And the documentation http://reactivex.io/intro.html says
It is difficult to use Futures to optimally compose conditional asynchronous execution flows (or impossible, since latencies of each request vary at runtime). This can be done, of course, but it quickly becomes complicated (and thus error-prone) or it prematurely blocks on Future.get(), which eliminates the benefit of asynchronous execution.
Really interested to know how RxJava solves this problem. I found it difficult to understand from the documentation.
Futures
Futures were introduced in Java 5 (2004). They're basically placeholders for a result of an operation that hasn't finished yet. Once the operation finishes, the Future will contain that result. For example, an operation can be a Runnable or Callable instance that is submitted to an ExecutorService. The submitter of the operation can use the Future object to check whether the operation isDone(), or wait for it to finish using the blocking get() method.
Example:
/**
* A task that sleeps for a second, then returns 1
**/
public static class MyCallable implements Callable<Integer> {
@Override
public Integer call() throws Exception {
Thread.sleep(1000);
return 1;
}
}
public static void main(String[] args) throws Exception{
ExecutorService exec = Executors.newSingleThreadExecutor();
Future<Integer> f = exec.submit(new MyCallable());
System.out.println(f.isDone()); //False
System.out.println(f.get()); //Waits until the task is done, then prints 1
}
CompletableFutures
CompletableFutures were introduced in Java 8 (2014). They are in fact an evolution of regular Futures, inspired by Google's Listenable Futures, part of the Guava library. They are Futures that also allow you to string tasks together in a chain. You can use them to tell some worker thread to "go do some task X, and when you're done, go do this other thing using the result of X". Using CompletableFutures, you can do something with the result of the operation without actually blocking a thread to wait for the result. Here's a simple example:
/**
* A supplier that sleeps for a second, and then returns one
**/
public static class MySupplier implements Supplier<Integer> {
@Override
public Integer get() {
try {
Thread.sleep(1000);
} catch (InterruptedException e) {
//Do nothing
}
return 1;
}
}
/**
* A (pure) function that adds one to a given Integer
**/
public static class PlusOne implements Function<Integer, Integer> {
@Override
public Integer apply(Integer x) {
return x + 1;
}
}
public static void main(String[] args) throws Exception {
ExecutorService exec = Executors.newSingleThreadExecutor();
CompletableFuture<Integer> f = CompletableFuture.supplyAsync(new MySupplier(), exec);
System.out.println(f.isDone()); // False
CompletableFuture<Integer> f2 = f.thenApply(new PlusOne());
System.out.println(f2.get()); // Waits until the "calculation" is done, then prints 2
}
RxJava
RxJava is a whole library for reactive programming created at Netflix. At a glance, it will appear to be similar to Java 8's streams. It is, except it's much more powerful.
Similarly to Futures, RxJava can be used to string together a bunch of synchronous or asynchronous actions to create a processing pipeline. Unlike Futures, which are single-use, RxJava works on streams of zero or more items. Including never-ending streams with an infinite number of items. It's also much more flexible and powerful thanks to an unbelievably rich set of operators.
Unlike Java 8's streams, RxJava also has a backpressure mechanism, which allows it to handle cases in which different parts of your processing pipeline operate in different threads, at different rates.
The downside of RxJava is that despite the solid documentation, it is a challenging library to learn due to the paradigm shift involved. Rx code can also be a nightmare to debug, especially if multiple threads are involved, and even worse - if backpressure is needed.
If you want to get into it, there's a whole page of various tutorials on the official website, plus the official documentation and Javadoc. You can also take a look at some of the videos such as this one which gives a brief intro into Rx and also talks about the differences between Rx and Futures.
Bonus: Java 9 Reactive Streams
Java 9's Reactive Streams aka Flow API are a set of Interfaces implemented by various reactive streams libraries such as RxJava 2, Akka Streams, and Vertx. They allow these reactive libraries to interconnect, while preserving the all important back-pressure.
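To make the back-pressure contract of the Flow API a little more concrete, here is a minimal sketch using only the JDK. It is an illustration under simplifying assumptions, not how RxJava or Vert.x implement it: the names SupplierPublisherDemo and fromSupplier are made up for this example, everything runs on the caller's thread, and the subscription is not thread-safe. The point it shows is that the supplier is only called while the subscriber has outstanding demand.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.Flow;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Hypothetical demo class; the names here are not part of any library. */
final class SupplierPublisherDemo {

    /** Calls supplier.get() once per requested item; a null result means "no more data". */
    static <T> Flow.Publisher<T> fromSupplier(Supplier<T> supplier) {
        return subscriber -> subscriber.onSubscribe(new Flow.Subscription() {
            private long demand;       // items requested but not yet delivered
            private boolean draining;  // true while the loop below runs (handles re-entrant request calls)
            private boolean finished;  // set on completion, error, or cancel

            @Override
            public void request(long n) {
                if (finished) {
                    return;
                }
                if (n <= 0) {
                    finished = true;
                    subscriber.onError(new IllegalArgumentException("request must be positive"));
                    return;
                }
                // Add the new demand, capping at Long.MAX_VALUE instead of overflowing.
                demand = (Long.MAX_VALUE - demand < n) ? Long.MAX_VALUE : demand + n;
                if (draining) {
                    return; // the loop that is already running will pick up the new demand
                }
                draining = true;
                while (demand > 0 && !finished) {
                    T next = supplier.get();
                    if (next == null) {
                        finished = true;
                        subscriber.onComplete();
                    } else {
                        demand--;
                        subscriber.onNext(next);
                    }
                }
                draining = false;
            }

            @Override
            public void cancel() {
                finished = true;
            }
        });
    }

    /** Returns "callsBeforeDemand,callsAfterRequest(2),callsAtEnd,receivedItems". */
    static String run() {
        Iterator<String> chunks = List.of("a", "b", "c").iterator();
        AtomicInteger calls = new AtomicInteger();
        Supplier<String> supplier = () -> {
            calls.incrementAndGet();
            return chunks.hasNext() ? chunks.next() : null;
        };
        List<String> received = new ArrayList<>();
        Flow.Subscription[] holder = new Flow.Subscription[1];
        fromSupplier(supplier).subscribe(new Flow.Subscriber<String>() {
            @Override public void onSubscribe(Flow.Subscription s) { holder[0] = s; }
            @Override public void onNext(String item) { received.add(item); }
            @Override public void onError(Throwable t) { received.add("<error>"); }
            @Override public void onComplete() { received.add("<done>"); }
        });
        int before = calls.get();   // nothing requested yet
        holder[0].request(2);       // demand for two items
        int afterTwo = calls.get();
        holder[0].request(5);       // more demand than there is data left
        return before + "," + afterTwo + "," + calls.get() + "," + received;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

A production implementation would also have to hand the supplier call off to an executor and make request and cancel safe to call from several threads, which is the kind of work the reactive libraries do for you.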
I have been working with RxJava since 0.9, am now at 1.3.2, and will soon migrate to 2.x. I use it in a private project that I have been working on for 8 years.
I wouldn't program without this library at all anymore. In the beginning I was skeptical, but it requires a completely different state of mind. It is quite difficult in the beginning. I sometimes was looking at the marble diagrams for hours.. lol
It is just a matter of practice and really getting to know the flow (aka the contract between observables and observers). Once you get there, you'll hate to do it any other way.
For me there is not really a downside to this library.
Use case:
I have a monitor view that contains 9 gauges (cpu, mem, network, etc...). When starting up, the view subscribes itself to a system monitor class that returns an observable (interval) that contains all the data for the 9 gauges.
It will push each second a new result to the view (so not polling !!!).
That observable uses a flatmap to simultaneously (async!) fetch data from 9 different sources and zips the result into a new model your view will get on the onNext().
How the hell you gonna do that with futures, completables etc ... Good luck ! :)
RxJava solves many programming issues for me and, in a way, makes things a lot easier...
Advantages:
Stateless !!! (important thing to mention, most important maybe)
Thread management out of the box
Build sequences that have their own lifecycle
Everything is an observable, so chaining is easy
Less code to write
Single jar on classpath (very lightweight)
Highly concurrent
No callback hell anymore
Subscriber based (tight contract between consumer and producer)
Backpressure strategies (circuit-breaker-like)
Splendid error handling and recovering
Very nice documentation (marbles <3)
Complete control
Many more ...
Disadvantages:
- Hard to test
Java's Future is a placeholder for something that will be completed in the future, with a blocking API. You have to either block on get() or poll its isDone() method periodically to check whether the task has finished. You can certainly write your own asynchronous code to manage the polling logic, but that incurs more boilerplate code and debugging overhead.
Java's CompletableFuture was inspired by Scala's Future. It carries an internal callback: once the future completes, the callback is triggered and the downstream operation is executed. That's why it has the thenApply method, which performs a further operation on the value wrapped in the CompletableFuture.
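As a small illustration of that callback style applied to the "multiple service calls" case from the question, the sketch below starts two independent calls and combines their results with thenCombine, without blocking on either one in between. The class CombineDemo and the two fetch methods are hypothetical stand-ins for real remote calls.

```java
import java.util.concurrent.CompletableFuture;

/** Hypothetical demo class: two independent async calls combined without blocking in between. */
final class CombineDemo {

    // Stand-in for a remote call that returns a user's name.
    static CompletableFuture<String> fetchUserName(int userId) {
        return CompletableFuture.supplyAsync(() -> "user-" + userId);
    }

    // Stand-in for a second, unrelated remote call.
    static CompletableFuture<Integer> fetchOrderCount(int userId) {
        return CompletableFuture.supplyAsync(() -> userId * 2);
    }

    // Both calls are started right away; the combining function runs once both have completed.
    static CompletableFuture<String> summary(int userId) {
        return fetchUserName(userId)
                .thenCombine(fetchOrderCount(userId),
                        (name, orders) -> name + " has " + orders + " orders");
    }

    public static void main(String[] args) {
        // join() blocks only here, at the very end, to print the result.
        System.out.println(summary(21).join()); // prints "user-21 has 42 orders"
    }
}
```

With plain Futures, the same flow would need a get() call on each future, each of which blocks the calling thread.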
RxJava's Observable is an enhanced version of CompletableFuture. It allows you to handle backpressure. With the thenApply method mentioned above (and even with its sibling thenApplyAsync), this situation might happen: the downstream method wants to call an external service that is sometimes unavailable. In this case, the CompletableFuture will fail completely and you will have to handle the error yourself. Observable, however, allows you to handle the backpressure and continue the execution once the external service becomes available.
In addition, there is a similar interface of Observable: Flowable. They are designed for different purposes. Usually Flowable is dedicated to handle the cold and non-timed operations, while Observable is dedicated to handle the executions requiring instant responses. See the official documents here: https://github.com/ReactiveX/RxJava#backpressure
All three interfaces serve to transfer values from producer to consumer. Consumers can be of 2 kinds:
synchronous: consumer makes blocking call which returns when the value is ready
asynchronous: when the value is ready, a callback method of the consumer is called
Also, communication interfaces differ in other ways:
able to transfer a single value or multiple values
if multiple values, backpressure can be supported or not
As a result:
Future transfers a single value using a synchronous interface
CompletableFuture transfers a single value using both synchronous and asynchronous interfaces
Rx transfers multiple values using an asynchronous interface with backpressure
Also, all these communication facilities support transferring exceptions. This is not always the case. For example, BlockingQueue does not.
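Here is a brief sketch of what "transferring exceptions" looks like for CompletableFuture, through both its synchronous and its asynchronous interface. The class and method names are made up for the example, and the failing task is just a stand-in for a producer that throws.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;

/** Hypothetical demo class: an exception thrown by the producer reaches the consumer. */
final class ExceptionTransferDemo {

    // Stand-in for a producer that fails instead of producing a value.
    static CompletableFuture<Integer> failing() {
        return CompletableFuture.<Integer>supplyAsync(() -> {
            throw new IllegalStateException("service unavailable");
        });
    }

    // Synchronous interface: the blocking call rethrows, with the original exception as the cause.
    static String viaBlockingCall() {
        try {
            failing().join();
            return "no error";
        } catch (CompletionException e) {
            return e.getCause().getMessage();
        }
    }

    // Asynchronous interface: a callback receives the error and supplies a fallback value.
    static int viaCallback() {
        return failing().exceptionally(error -> -1).join();
    }

    public static void main(String[] args) {
        System.out.println(viaBlockingCall()); // prints "service unavailable"
        System.out.println(viaCallback());     // prints "-1"
    }
}
```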
The main advantage of CompletableFuture over normal Future is that CompletableFuture takes advantage of the extremely powerful stream API and gives you callback handlers to chain your tasks, which is absolutely absent if you use normal Future. That along with providing asynchronous architecture, CompletableFuture is the way to go for handling computation heavy map-reduce tasks, without worrying much about application performance.

When does the AsyncRestTemplate send request?

Today I did some experiments on AsyncRestTemplate. Below is a piece of sample code:
ListenableFuture<ResponseEntity<MyObject[]>> result
= asyncRestTemplate.getForEntity(uri, MyObject[].class);
List<MyObject> objects = Arrays.asList(result.get().getBody());
To my surprise, the request was not sent to uri in first line (i.e. after calling getForEntity) but sent after result.get() is called.
Isn't it a synchronous way of doing stuff?
The whole idea of doing async request is that either you do not want to wait for the async task to start/complete OR you want the main thread to do some other task before asking for the result from the Future instance. Internally, the AsyncRestTemplate prepares an AsyncRequest and calls executeAsync method.
AsyncClientHttpRequest request = createAsyncRequest(url, method);
if (requestCallback != null) {
requestCallback.doWithRequest(request);
}
ListenableFuture<ClientHttpResponse> responseFuture = request.executeAsync();
There are two different implementations - HttpComponentsAsyncClientHttpRequest ( which uses high performant async support provided in Apache http component library ) and SimpleBufferingAsyncClientHttpRequest (which uses facilities provided by J2SE classes). In case of HttpComponentsAsyncClientHttpRequest, internally it has a thread factory (which is not spring managed AFAIK) whereas in SimpleBufferingAsyncClientHttpRequest, there is a provision of Spring managed AsyncListenableTaskExecutor. The whole point is that in all cases there is some ExecutorService of some kind to be able to run the tasks asynchronously. Of course as is natural with these thread pools, the actual starting time of task is indeterminate and depends upon lots of factor like load, available CPU etc. and should not be relied upon.
When you call future.get() you're essentially turning an asynchronous operation into a synchronous one by waiting for the result.
It doesn't matter when the actual request is performed, the important thing is that since it's asynchronous, you don't need to worry about it unless/until you need the result.
The advantage is obvious when you need to perform other work before processing the result, or when you're not waiting for a result at all.
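A rough sketch of that advantage, using a plain JDK ExecutorService and Future as stand-ins for AsyncRestTemplate and its ListenableFuture (the class name OverlapDemo and the "remote-body" value are made up): the request is handed off first, the caller does unrelated work, and only then blocks for the result.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical demo class: do other work between submitting a task and asking for its result. */
final class OverlapDemo {

    static String run() {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            // Stand-in for the remote request: it runs on the pool thread, whenever the pool gets to it.
            Future<String> response = pool.submit(() -> "remote-body");

            // Unrelated work on the calling thread while the request is in flight.
            String local = "local-" + (6 * 7);

            // Only now does the calling thread wait for the response.
            return local + "/" + response.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints "local-42/remote-body"
    }
}
```

If get() were called immediately after submit, the code would behave like a synchronous call, which is what the snippet in the question effectively does.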

Vertx3 - Return a value from a JDBC connection? (sql db).

I have an interface with a single method that returns "config" Object.
I want to utilize this interface in Android and Vertx3 environments.
Config retrieveConfig(String authCreds);
I'm trying to implement this in a vertx program, utilizing the JDBC client from it, but I'm running into issues.
jdbcClient.getConnection(sqlConnResult -> {
// checks for success
sqlConnResult.result().query(selectStatement, rows -> {
// get result here, want to return it as answer to interface.
// seems this is a "void" method in this scope.
});
});
Is this interface even possible with Vertx async code?
In Async programming you can't really return your value to the caller, because this would then be a blocking call - one of the main things async programming seeks to remove. This is why in Vertx a lot of methods return this or void.
Various paradigms exist as alternatives:
Vert.x makes extensive use of the Handler<T> interface, whose handle(T result) method will be executed with the result.
Vert.x 3 also has support for the Rx Observable. This will allow you to return an Observable<T> which will emit the result to subscribers once the async task has completed.
Also, there is always an option to return Future<T> which, once the async task has completed will contain the result. Although Vert.x doesn't really use this very much.
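To illustrate the first and last of these options with the retrieveConfig method from the question, here is a JDK-only sketch. It is not the Vert.x API: Consumer<Config> stands in for a Vert.x Handler, CompletableFuture stands in for a future type, and Config, CallbackConfigSource, FutureConfigSource and adapt are names made up for this example.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Consumer;

/** Hypothetical demo class: the same lookup exposed in callback style and in future style. */
final class AsyncConfigDemo {

    /** Minimal stand-in for the "config" object from the question. */
    static final class Config {
        final String owner;

        Config(String owner) {
            this.owner = owner;
        }
    }

    /** Callback style: nothing is returned; the handler is invoked once the result is ready. */
    interface CallbackConfigSource {
        void retrieveConfig(String authCreds, Consumer<Config> handler);
    }

    /** Future style: the caller immediately gets a placeholder for the eventual result. */
    interface FutureConfigSource {
        CompletableFuture<Config> retrieveConfig(String authCreds);
    }

    /** Adapts a callback-style source to the future style by completing the future in the callback. */
    static FutureConfigSource adapt(CallbackConfigSource source) {
        return authCreds -> {
            CompletableFuture<Config> future = new CompletableFuture<>();
            source.retrieveConfig(authCreds, future::complete);
            return future;
        };
    }

    static String demo() {
        // Fake source that answers immediately; a real one would call back from an async JDBC query.
        CallbackConfigSource fake =
                (creds, handler) -> handler.accept(new Config("config-for-" + creds));
        return adapt(fake).retrieveConfig("alice").join().owner;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "config-for-alice"
    }
}
```

Neither variant can keep the original signature Config retrieveConfig(String authCreds) without blocking somewhere, which is why the interface itself has to change.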
So you're probably going to find it difficult to have a common interface like this for blocking and non-blocking api. Vertx offers nice and easy ways to run blocking code but I don't think that is a good solution in your case.
Personally, I would have a look at RxJava. There is support for Rx on Android, and it has been well adopted in Vertx 3, with almost every API call having an Rx equivalent.
Moving from:
Config retrieveConfig(String authCreds);
to
Observable<Config> retrieveConfig(String authCreds);
would give you the ability to have a common interface and for it to work on both Android & Vert.x. It would also give the added benefit of not having to stray into callback hell which Rx seeks to avoid.
Hope this helps,
