We have a set of Java applications that were originally written using normal synchronous methods but have largely been converted to asynchronous Vert.x (the regular API, not Rx) wherever it makes sense. We're having some trouble at the boundaries between sync and async code, especially when we have a method that must be synchronous (reasoning explained below) and we want to invoke an async method from it.
There are many similar questions asked previously on Stack Overflow, but practically all of them are in a C# context and the answers do not appear to apply.
Among other things we are using Geotools and Apache Shiro. Both provide customization through extension using APIs they have defined that are strictly synchronous. As a specific example, our custom authorization realm for Shiro needs to access our user data store, for which we have created an async DAO API. The Shiro method we have to write is called doGetAuthorizationInfo; it is expected to return an AuthorizationInfo. But there does not appear to be a reliable way to access the authorization data from the other side of the async DAO API.
In the specific case that the thread was not created by Vert.x, using a CompletableFuture is a workable solution: the synchronous doGetAuthorizationInfo would push the async work over to a Vert.x thread and then block the current thread in CompletableFuture.get() until the result becomes available.
Unfortunately the Shiro (or Geotools, or whatever) method may be invoked on a Vert.x thread. In that case it is extremely bad to block the current thread: if it's the event loop thread then we're breaking the Golden Rule, while if it's a worker thread (say, via Vertx.executeBlocking) then blocking it will prevent the worker from picking up anything more from its queue - meaning the blocking will be permanent.
Is there a "standard" solution to this problem? It seems to me that it will crop up anytime Vert.x is being used under an extensible synchronous library. Is this just a situation that people avoid?
EDIT
... with a bit more detail. Here is a snippet from org.apache.shiro.realm.AuthorizingRealm:
/**
* Retrieves the AuthorizationInfo for the given principals from the underlying data store. When returning
* an instance from this method, you might want to consider using an instance of
* {@link org.apache.shiro.authz.SimpleAuthorizationInfo SimpleAuthorizationInfo}, as it is suitable in most cases.
*
* @param principals the primary identifying principals of the AuthorizationInfo that should be retrieved.
* @return the AuthorizationInfo associated with this principals.
* @see org.apache.shiro.authz.SimpleAuthorizationInfo
*/
protected abstract AuthorizationInfo doGetAuthorizationInfo(PrincipalCollection principals);
Our data access layer has methods like this:
void loadUserAccount(String id, Handler<AsyncResult<UserAccount>> handler);
How can we invoke the latter from the former? If we knew doGetAuthorizationInfo was being invoked in a non-Vert.x thread, then we could do something like this:
@Override
protected AuthorizationInfo doGetAuthorizationInfo(PrincipalCollection principals) {
    CompletableFuture<UserAccount> completable = new CompletableFuture<>();
    vertx.<UserAccount>executeBlocking(vertxFuture -> {
        loadUserAccount((String) principals.getPrimaryPrincipal(), vertxFuture);
    }, res -> {
        if (res.failed()) {
            completable.completeExceptionally(res.cause());
        } else {
            completable.complete(res.result());
        }
    });
    // Block until the Vert.x worker thread provides its result.
    // get() throws checked exceptions that this Shiro method cannot declare, so wrap them.
    UserAccount ua;
    try {
        ua = completable.get();
    } catch (InterruptedException | ExecutionException e) {
        throw new AuthorizationException(e);
    }
    // Build up authorization info from the user account
    return new SimpleAuthorizationInfo(/* etc...*/);
}
But if doGetAuthorizationInfo is called in a Vert.x thread then things are completely different. The trick above will block an event loop thread, so that's a no-go. Or if it's a worker thread then the executeBlocking call will put the loadUserAccount task onto the queue for that same worker (I believe), so the subsequent completable.get() will block permanently.
I bet you know the answer already, but are wishing it wasn't so -- If a call to GeoTools or Shiro will need to block waiting for a response from something, then you shouldn't be making that call on a Vert.x thread.
You should create an ExecutorService backed by a thread pool and use it to execute those calls, arranging for each submitted task to send a Vert.x message when it's done.
You may have some flexibility in the size of the chunks you move into the thread pool. Instead of tightly wrapping those calls, you can move something larger higher up the call stack. You will probably make this decision based on how much code you will have to change. Since making a method asynchronous usually implies changing all the synchronous methods in its call stack anyway (that's the unfortunate fundamental problem with this kind of async model), you will probably want to do it high on the stack.
You will probably end up with an adapter layer that provides Vert.x APIs for a variety of synchronous services.
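As a rough illustration of such an adapter, here is a minimal sketch that deliberately leaves Vert.x out so it stays self-contained. The names BlockingServiceAdapter, callAsync and slowLookup are hypothetical, and a CompletableFuture stands in for "send a Vert.x message when it's done"; in a real application the completion callback is where you would hand the result back to Vert.x.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;

// Hypothetical adapter: runs blocking library calls on a dedicated pool,
// so that no event-loop or Vert.x worker thread ever waits on them.
class BlockingServiceAdapter implements AutoCloseable {
    private final ExecutorService pool;

    BlockingServiceAdapter(int threads) {
        this.pool = Executors.newFixedThreadPool(threads);
    }

    // Submits the blocking call and returns immediately; the future completes
    // on a pool thread with either the result or the exception that was thrown.
    <T> CompletableFuture<T> callAsync(Supplier<T> blockingCall) {
        return CompletableFuture.supplyAsync(blockingCall, pool);
    }

    @Override
    public void close() {
        pool.shutdown();
    }

    // Stand-in for a synchronous library call (for example a Shiro check).
    static String slowLookup(String user) {
        try {
            Thread.sleep(50);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "roles-for-" + user;
    }

    public static void main(String[] args) {
        try (BlockingServiceAdapter adapter = new BlockingServiceAdapter(4)) {
            adapter.callAsync(() -> slowLookup("alice"))
                   // Assumption: this callback is where a Vert.x app would notify the caller.
                   .thenAccept(roles -> System.out.println("got " + roles))
                   .join(); // only the demo's main thread blocks here
        }
    }
}
```

The pool size is the knob that bounds how many blocking calls can be in flight at once; the value 4 above is arbitrary.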
So if AsyncContext::complete closes the response and I need to write the response within the asynchronous context, how do I implement a multi-step response in which some steps are blocking with non-blocking sections in-between them?
You seem to be operating under a misapprehension about the nature of an AsyncContext and the semantics of ServletRequest::startAsync. This method (re)initializes an AsyncContext for the request and associated response, creating one first if necessary, and associates it with the request / response pair. This puts the request into asynchronous mode, which, at its core, means nothing more than that the container will not consider request processing complete until the provided context's complete() method is invoked.
In particular, creating an async context does not create any threads or assign the associated request to a different thread, and the methods of an AsyncContext run on the thread that invokes them (though that's kinda a technicality for AsyncContext::start). The context is primarily an object for whatever asynchronous code you provide to use for interacting with the container, which it otherwise could not safely do. To actually perform processing on some other thread, you need to arrange for that thread to exist, and for the work to be assigned to it. AsyncContext::start is a convenient way to do that, but not the only way.
With respect specifically to "how do I implement a multi-step response in which some steps are blocking with non-blocking sections in-between them?", the basic answer is "however you want". The AsyncContext neither hinders nor particularly helps you because it's about communication with the container, not about workflow. In particular, I see no need or special use for nested AsyncContexts.
I think you're describing a processing pipeline with certain, limited parallelization. You might implement that, say, by running the overall workflow -- all the "blocking" steps, I guess -- in a thread launched via AsyncContext::start, and dispatching the other work to a thread pool, in whatever units make sense. Do be aware, however, that the request and response objects are not thread-safe. Ideally, then, the primary thread will extract all the needed data from the request, and perform all needed writes to the response.
Alternatively, maybe you use the regular request processing thread for the main workflow, dispatch pieces of work to a thread pool as appropriate, and skip the AsyncContext bit altogether. It is not necessary in any absolute sense to use an AsyncContext to perform asynchronous computations in a web application -- its purpose and the processing models it is designed to support are rather a lot more specific.
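To make the shape of that workflow concrete, here is a minimal sketch with no servlet API in it at all. The class PipelineSketch and its handle method are hypothetical: a Map stands in for the request and a StringBuilder for the response, which, like the real objects, is touched by the coordinating thread only.

```java
import java.util.Map;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class PipelineSketch {
    // 'params' plays the role of the request; the returned String plays the response body.
    static String handle(Map<String, String> params) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        StringBuilder response = new StringBuilder(); // used by the coordinating thread only
        try {
            // Step 1: the coordinating thread reads everything it needs from the request.
            String id = params.get("id");

            // Step 2: independent pieces of work go to the pool and run concurrently.
            Future<String> profile = pool.submit(() -> "profile(" + id + ")");
            Future<String> orders = pool.submit(() -> "orders(" + id + ")");

            // Step 3: the coordinating thread waits for the pieces and does all the writing.
            response.append(profile.get()).append(", ").append(orders.get());
            return response.toString();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(handle(Map.of("id", "42"))); // prints "profile(42), orders(42)"
    }
}
```

Whether the coordinating thread is the container's request thread or one started via AsyncContext::start does not change this structure.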
I am facing the following situation, for which, to my surprise, I couldn't find much documentation:
There is a service which only provides a REST call for item details, fetching them 1 by 1.
There are 1k+ items in total.
For responsiveness reasons, I would like to persist this data on my end, and not fetch it lazily.
In order for my API key to not be locked, I would like to limit my calls to X calls / second.
I could not find any support for this in the Feign documentation.
Does anybody know if there is one? Or do you have any suggestions on how to go about this implementation?
There is no built-in throttling capability in Feign; that is delegated to the underlying Client implementation. With that said, you can define your own client extending from one of the provided ones: Apache Http, OkHttp, and Ribbon.
One solution is to extend the Client to use a ScheduledThreadPoolExecutor as outlined in this answer.
Apache HttpClient: Limit total calls per second
To use this with the provided ApacheHttpClient in Feign, you could extend it, providing your own implementation of the execute method to use the executor.
public class ThrottledHttpClient extends ApacheHttpClient {
    // create a pool with one thread, you'll control the flow later.
    private final ScheduledExecutorService throttledQueue = Executors.newScheduledThreadPool(1);

    @Override
    public Response execute(Request request, Request.Options options) throws IOException {
        // use the executor; schedule(...) accepts a Callable, so the Response comes back through the future
        ScheduledFuture<Response> future = throttledQueue.schedule(() -> super.execute(request, options), ....);
        try {
            return future.get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IOException(e);
        }
    }
}
Set the appropriate thread pool size, delay and fixed wait to achieve the throughput you desire.
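If a scheduled executor feels heavy, the same "X calls / second" limit can be sketched with plain slot bookkeeping. This is only an illustration under stated assumptions: SimpleThrottle and its methods are hypothetical names, not part of Feign or Apache HttpClient, and the "request" is a stand-in lambda.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.TimeUnit;

// Hypothetical helper: lets at most one call start per interval.
class SimpleThrottle {
    private final long intervalNanos;
    private long nextSlot = System.nanoTime();

    SimpleThrottle(int callsPerSecond) {
        this.intervalNanos = TimeUnit.SECONDS.toNanos(1) / callsPerSecond;
    }

    // Reserves the next free time slot, sleeps until it is due, then runs the task.
    <T> T call(Callable<T> task) throws Exception {
        long slot;
        synchronized (this) {
            long now = System.nanoTime();
            slot = (nextSlot - now > 0) ? nextSlot : now;
            nextSlot = slot + intervalNanos;
        }
        long waitNanos = slot - System.nanoTime();
        if (waitNanos > 0) {
            TimeUnit.NANOSECONDS.sleep(waitNanos);
        }
        return task.call();
    }

    // Demo: issues 'calls' fake requests one after another and reports the elapsed time.
    static long elapsedMillis(int callsPerSecond, int calls) {
        SimpleThrottle throttle = new SimpleThrottle(callsPerSecond);
        long start = System.nanoTime();
        try {
            for (int i = 0; i < calls; i++) {
                final int item = i;
                throttle.call(() -> "item-" + item); // stand-in for one REST call
            }
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    public static void main(String[] args) {
        // 5 calls at 10 calls/second: the last call starts roughly 400 ms after the first.
        System.out.println(elapsedMillis(10, 5) + " ms");
    }
}
```

In a custom Feign client, the body of the overridden execute method could presumably be wrapped in throttle.call(...), translating any non-IOException failure into an IOException.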
I would like to know the difference between CompletableFuture, Future and RxJava's Observable.
What I know is that all of them are asynchronous, but:
Future.get() blocks the thread
CompletableFuture gives the callback methods
RxJava Observable --- similar to CompletableFuture with other benefits (not sure)
For example: if a client needs to make multiple service calls and we use Futures (Java), the Future.get() calls will be executed sequentially... I would like to know how it's better in RxJava.
And the documentation http://reactivex.io/intro.html says
It is difficult to use Futures to optimally compose conditional asynchronous execution flows (or impossible, since latencies of each request vary at runtime). This can be done, of course, but it quickly becomes complicated (and thus error-prone) or it prematurely blocks on Future.get(), which eliminates the benefit of asynchronous execution.
Really interested to know how RxJava solves this problem. I found it difficult to understand from the documentation.
Futures
Futures were introduced in Java 5 (2004). They're basically placeholders for a result of an operation that hasn't finished yet. Once the operation finishes, the Future will contain that result. For example, an operation can be a Runnable or Callable instance that is submitted to an ExecutorService. The submitter of the operation can use the Future object to check whether the operation isDone(), or wait for it to finish using the blocking get() method.
Example:
/**
 * A task that sleeps for a second, then returns 1
 **/
public static class MyCallable implements Callable<Integer> {
    @Override
    public Integer call() throws Exception {
        Thread.sleep(1000);
        return 1;
    }
}

public static void main(String[] args) throws Exception {
    ExecutorService exec = Executors.newSingleThreadExecutor();
    Future<Integer> f = exec.submit(new MyCallable());
    System.out.println(f.isDone()); // false: the task is still sleeping
    System.out.println(f.get()); // Waits until the task is done, then prints 1
    exec.shutdown(); // otherwise the executor's thread keeps the JVM alive
}
CompletableFutures
CompletableFutures were introduced in Java 8 (2014). They are in fact an evolution of regular Futures, inspired by Google's Listenable Futures, part of the Guava library. They are Futures that also allow you to string tasks together in a chain. You can use them to tell some worker thread to "go do some task X, and when you're done, go do this other thing using the result of X". Using CompletableFutures, you can do something with the result of the operation without actually blocking a thread to wait for the result. Here's a simple example:
/**
 * A supplier that sleeps for a second, and then returns one
 **/
public static class MySupplier implements Supplier<Integer> {
    @Override
    public Integer get() {
        try {
            Thread.sleep(1000);
        } catch (InterruptedException e) {
            // Do nothing
        }
        return 1;
    }
}

/**
 * A (pure) function that adds one to a given Integer
 **/
public static class PlusOne implements Function<Integer, Integer> {
    @Override
    public Integer apply(Integer x) {
        return x + 1;
    }
}

public static void main(String[] args) throws Exception {
    ExecutorService exec = Executors.newSingleThreadExecutor();
    CompletableFuture<Integer> f = CompletableFuture.supplyAsync(new MySupplier(), exec);
    System.out.println(f.isDone()); // false: the supplier is still sleeping
    CompletableFuture<Integer> f2 = f.thenApply(new PlusOne());
    System.out.println(f2.get()); // Waits until the "calculation" is done, then prints 2
    exec.shutdown(); // otherwise the executor's thread keeps the JVM alive
}
RxJava
RxJava is a whole library for reactive programming created at Netflix. At a glance, it will appear to be similar to Java 8's streams. It is, except it's much more powerful.
Similarly to Futures, RxJava can be used to string together a bunch of synchronous or asynchronous actions to create a processing pipeline. Unlike Futures, which are single-use, RxJava works on streams of zero or more items, including never-ending streams with an infinite number of items. It's also much more flexible and powerful thanks to an unbelievably rich set of operators.
Unlike Java 8's streams, RxJava also has a backpressure mechanism, which allows it to handle cases in which different parts of your processing pipeline operate in different threads, at different rates.
The downside of RxJava is that despite the solid documentation, it is a challenging library to learn due to the paradigm shift involved. Rx code can also be a nightmare to debug, especially if multiple threads are involved, and even worse - if backpressure is needed.
If you want to get into it, there's a whole page of various tutorials on the official website, plus the official documentation and Javadoc. You can also take a look at some of the videos such as this one which gives a brief intro into Rx and also talks about the differences between Rx and Futures.
Bonus: Java 9 Reactive Streams
Java 9's Reactive Streams, aka the Flow API, are a set of interfaces that mirror the Reactive Streams specification supported by various reactive libraries such as RxJava 2, Akka Streams, and Vert.x (directly or through adapters). They allow these reactive libraries to interconnect, while preserving the all-important back-pressure.
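For a feel of what back-pressure looks like at the interface level, here is a small sketch using only the JDK's java.util.concurrent.Flow and SubmissionPublisher (Java 9+). The class names FlowBackpressureSketch and OneAtATime are made up for this illustration; the subscriber asks for exactly one item at a time, which is the signal that keeps a fast producer from overrunning it.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Flow;
import java.util.concurrent.SubmissionPublisher;
import java.util.concurrent.TimeUnit;

class FlowBackpressureSketch {
    // Subscriber that requests one item at a time; request(1) is the back-pressure signal.
    static class OneAtATime implements Flow.Subscriber<Integer> {
        final List<Integer> received = new CopyOnWriteArrayList<>();
        final CountDownLatch done = new CountDownLatch(1);
        private Flow.Subscription subscription;

        @Override
        public void onSubscribe(Flow.Subscription s) {
            subscription = s;
            s.request(1); // ask for the first item
        }

        @Override
        public void onNext(Integer item) {
            received.add(item);
            subscription.request(1); // ready for the next one
        }

        @Override
        public void onError(Throwable t) {
            done.countDown();
        }

        @Override
        public void onComplete() {
            done.countDown();
        }
    }

    static List<Integer> run(int count) {
        OneAtATime subscriber = new OneAtATime();
        try (SubmissionPublisher<Integer> publisher = new SubmissionPublisher<>()) {
            publisher.subscribe(subscriber);
            for (int i = 1; i <= count; i++) {
                publisher.submit(i); // blocks if the subscriber's buffer is full
            }
        } // close() signals onComplete once the buffered items have been delivered
        try {
            subscriber.done.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return subscriber.received;
    }

    public static void main(String[] args) {
        System.out.println(run(5)); // prints [1, 2, 3, 4, 5]
    }
}
```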
I have been working with RxJava since 0.9, am now at 1.3.2, and will soon be migrating to 2.x. I use it in a private project that I have been working on for 8 years.
I wouldn't program without this library at all anymore. In the beginning I was skeptical, but it requires a completely different state of mind. Quite difficult in the beginning. I sometimes was looking at the marbles for hours.. lol
It is just a matter of practice and really getting to know the flow (aka contract of observables and observer), once you get there, you'll hate to do it otherwise.
For me there is not really a downside to that library.
Use case:
I have a monitor view that contains 9 gauges (cpu, mem, network, etc...). When starting up the view, the view subscribes itself to a system monitor class that returns an observable (interval) that contains all the data for the 9 meters.
It will push each second a new result to the view (so not polling !!!).
That observable uses a flatmap to simultaneously (async!) fetch data from 9 different sources and zips the result into a new model your view will get on the onNext().
How the hell you gonna do that with futures, completables etc ... Good luck ! :)
RxJava solves many issues in programming for me and makes it, in a way, a lot easier...
Advantages:
Stateless !!! (important thing to mention, most important maybe)
Thread management out of the box
Build sequences that have their own lifecycle
Everything is an observable, so chaining is easy
Less code to write
Single jar on classpath (very lightweight)
Highly concurrent
No callback hell anymore
Subscriber based (tight contract between consumer and producer)
Backpressure strategies (circuit-breaker-like)
Splendid error handling and recovering
Very nice documentation (marbles <3)
Complete control
Many more ...
Disadvantages:
- Hard to test
Java's Future is a placeholder to hold something that will be completed in the future, with a blocking API. You'll have to use its isDone() method to poll it periodically to check if that task is finished. Certainly you can implement your own asynchronous code to manage the polling logic. However, it incurs more boilerplate code and debugging overhead.
Java's CompletableFuture is inspired by Scala's Future. It carries an internal callback method. Once it is finished, the callback method will be triggered and tell the thread that the downstream operation should be executed. That's why it has the thenApply method to do further operations on the object wrapped in the CompletableFuture.
RxJava's Observable is an enhanced version of CompletableFuture. It allows you to handle the backpressure. In the thenApply method (and even with its sibling thenApplyAsync) we mentioned above, this situation might happen: the downstream method wants to call an external service that might become unavailable sometimes. In this case, the CompletableFuture will fail completely and you will have to handle the error by yourself. However, Observable allows you to handle the backpressure and continue the execution once the external service becomes available.
In addition, there is a type similar to Observable: Flowable. They are designed for different purposes. Usually Flowable is dedicated to handling cold and non-timed operations, while Observable is dedicated to handling executions requiring instant responses. See the official documents here: https://github.com/ReactiveX/RxJava#backpressure
All three interfaces serve to transfer values from producer to consumer. Consumers can be of 2 kinds:
synchronous: consumer makes blocking call which returns when the value is ready
asynchronous: when the value is ready, a callback method of the consumer is called
Also, communication interfaces differ in other ways:
able to transfer a single value or multiple values
if multiple values, backpressure can be supported or not
As a result:
Future transfers a single value using a synchronous interface
CompletableFuture transfers a single value using both synchronous and asynchronous interfaces
Rx transfers multiple values using an asynchronous interface with backpressure
Also, all these communication facilities support transferring exceptions. This is not always the case. For example, BlockingQueue does not.
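The distinction above can be seen directly on a CompletableFuture, which serves both kinds of consumer and also carries exceptions across. The following is a minimal sketch; the class ConsumeBothWays and its two helper methods are hypothetical names chosen for this illustration.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.atomic.AtomicReference;

class ConsumeBothWays {
    // Synchronous consumer: get() blocks until the value (or the exception) arrives.
    static String consumeSync(CompletableFuture<String> f) {
        try {
            return "value: " + f.get();
        } catch (ExecutionException e) {
            return "failed: " + e.getCause().getMessage();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "interrupted";
        }
    }

    // Asynchronous consumer: the callback runs once the future completes.
    static void consumeAsync(CompletableFuture<String> f, AtomicReference<String> sink) {
        f.whenComplete((value, error) ->
                sink.set(error == null ? "value: " + value : "failed: " + error.getMessage()));
    }

    public static void main(String[] args) {
        CompletableFuture<String> ok = new CompletableFuture<>();
        AtomicReference<String> seen = new AtomicReference<>();
        consumeAsync(ok, seen);  // consumer registers its callback first
        ok.complete("hello");    // producer transfers a value; the callback runs right here
        System.out.println(seen.get()); // prints "value: hello"

        CompletableFuture<String> bad = new CompletableFuture<>();
        bad.completeExceptionally(new IllegalStateException("boom")); // producer transfers an exception
        System.out.println(consumeSync(bad)); // prints "failed: boom"
    }
}
```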
The main advantage of CompletableFuture over a normal Future is that CompletableFuture gives you a fluent, stream-like API with callback handlers to chain your tasks, which is entirely absent if you use a normal Future. Together with its support for asynchronous execution, this makes CompletableFuture the way to go for handling computation-heavy map-reduce tasks without worrying much about application performance.
Today I did some experiments on AsyncRestTemplate. Below is a piece of sample code:
ListenableFuture<ResponseEntity<MyObject[]>> result
    = asyncRestTemplate.getForEntity(uri, MyObject[].class);
List<MyObject> objects = Arrays.asList(result.get().getBody());
To my surprise, the request was not sent to uri in first line (i.e. after calling getForEntity) but sent after result.get() is called.
Isn't it a synchronous way of doing stuff?
The whole idea of doing an async request is that either you do not want to wait for the async task to start/complete OR you want the main thread to do some other task before asking for the result from the Future instance. Internally, the AsyncRestTemplate prepares an AsyncRequest and calls the executeAsync method.
AsyncClientHttpRequest request = createAsyncRequest(url, method);
if (requestCallback != null) {
requestCallback.doWithRequest(request);
}
ListenableFuture<ClientHttpResponse> responseFuture = request.executeAsync();
There are two different implementations: HttpComponentsAsyncClientHttpRequest (which uses the high-performance async support provided by the Apache HTTP components library) and SimpleBufferingAsyncClientHttpRequest (which uses facilities provided by J2SE classes). In the case of HttpComponentsAsyncClientHttpRequest, it internally has a thread factory (which is not Spring-managed AFAIK), whereas in SimpleBufferingAsyncClientHttpRequest there is provision for a Spring-managed AsyncListenableTaskExecutor. The whole point is that in all cases there is an ExecutorService of some kind to be able to run the tasks asynchronously. Of course, as is natural with these thread pools, the actual starting time of a task is indeterminate, depends upon many factors like load, available CPU etc., and should not be relied upon.
When you call future.get() you're essentially turning an asynchronous operation into a synchronous one by waiting for the result.
It doesn't matter when the actual request is performed; the important thing is that since it's asynchronous, you don't need to worry about it unless/until you need the result.
The advantage is obvious when you need to perform other work before processing the result, or when you're not waiting for a result at all.
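A small sketch of that first case, using a plain CompletableFuture as a stand-in for the ListenableFuture returned by AsyncRestTemplate (the class OverlapSketch and the methods slowRequest and prepareLocally are hypothetical): the request is started, unrelated work is done while it is in flight, and the thread blocks only at the point where the result is actually needed.

```java
import java.util.concurrent.CompletableFuture;

class OverlapSketch {
    // Stand-in for a remote call that takes a while to answer.
    static String slowRequest() {
        try {
            Thread.sleep(200);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "response";
    }

    // Stand-in for work that does not depend on the response.
    static String prepareLocally() {
        return "local";
    }

    static String fetchAndCombine() {
        // Start the request; this returns immediately.
        CompletableFuture<String> pending = CompletableFuture.supplyAsync(OverlapSketch::slowRequest);
        // Do other work while the request is in flight.
        String local = prepareLocally();
        // Block only now, when the result is really needed.
        return local + ":" + pending.join();
    }

    public static void main(String[] args) {
        System.out.println(fetchAndCombine()); // prints "local:response"
    }
}
```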
I'm looking for the Java/Akka equivalent of Python's yield from or gevent's monkey patch.
Update
There has been some confusion in the comments about what this question is asking, so let me restate it:
If I have a future, how do I wait for the future to complete without blocking the thread AND without returning to the caller until the future is complete?
Let's say we have a method that blocks:
public Object foo() {
    Object result = someBlockingCall();
    return doSomethingWithResult(result);
}
To make this asynchronous, we would replace someBlockingCall() with someAsyncCall() and pass it a callback:
public void foo() {
    someAsyncCall(new Handler() {
        public void onSuccess(Object result) {
            Object message = doSomethingWithResult(result);
            callerRef.tell(message, ActorRef.noSender());
        }
    });
}
The call to foo() now returns before the result is ready, so the caller no longer gets the result. We have to get the result back to the caller by passing a message. To convert synchronous code to asynchronous Akka code, a redesign of the caller is required.
What I'd like is async code that looks like synchronous code like Python's Gevent.
I want to write:
public Object foo() {
Future future = someAsyncCall();
// save stack frame, go back to the event loop and relinquish CPU
// so other events can use the thread,
// and come back when future is complete and restore stack frame
return yield(future);
}
This would allow me to make my synchronous code asynchronous without a redesign.
Is this possible?
Note:
The Play framework seems to fake this with async() and AsyncResult. But this won't work in general since I have to write the code that handles the AsyncResult which would look like the above callback handler.
I think trying to get back a more straightforward sync design, although an efficient one, is actually a good intention and a good idea (see for example here).
Quasar has facilities to obtain sync/blocking APIs that are still highly efficient from async APIs (see this blog post), which looks exactly what you're looking for.
The fundamental problem is not that the sync/blocking style itself is bad (actually async and sync are dual styles and can be transformed into one another, see for example here), but rather that blocking Java's heavyweight threads is not efficient. It is not an abstraction problem but an implementation problem, so instead of giving up the easier thread abstraction only because the implementation is inefficient, I agree it is better for the future of your code to try and look for more efficient thread implementations.
As Roland hinted, Quasar adds lightweight threads or fibers to the JVM, so you can get the same performance of async frameworks without giving up the thread abstraction and regular imperative control flow constructs (sequence, loops etc.) available in the language.
It also unifies JVM/JDK's threads and its fibers under a common strand interface, so they can interoperate seamlessly, and provides a port of java.util.concurrent to this unified concept.
On top of strands (either fibers or regular threads) Quasar also offers fully-fledged Erlang-style actors, blocking Go-like channels and dataflow programming, so you can choose the concurrent programming paradigm that best suits your skills and needs without being forced into one.
It also provides bindings for popular and standard technologies (as part of the Comsat project), so you can preserve your code assets because the porting effort will be minimal (if any). For the same reason you can also opt out easily, should you choose to.
Currently Quasar has bindings for Java 7 and 8, Clojure under the Pulsar project and soon JetBrains' Kotlin. Being based on JVM bytecode instrumentation, Quasar can really work with any JVM language if an integration module is present, and it offers tools to build additional ones.
The answer to your question is “no”, and that is very much by design. Writing a method to be asynchronous means returning the Future as its result; the method itself will not perform the computation but will arrange for the result to be provided later. You can then pass this Future to the right place where it is further used, for example by transforming it using one of the many combinators (like map, recover, etc.).
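In plain Java, the closest analogue is a method that returns a CompletableFuture, with thenApply playing roughly the role of map and exceptionally that of recover. The sketch below is only an illustration of that style; someAsyncCall and foo are hypothetical stand-ins for the methods in the question, not Akka API.

```java
import java.util.concurrent.CompletableFuture;

class ReturnTheFuture {
    // Hypothetical async call: the parsing happens later, on another thread.
    static CompletableFuture<Integer> someAsyncCall(String input) {
        return CompletableFuture.supplyAsync(() -> Integer.parseInt(input));
    }

    // Asynchronous foo(): returns at once, handing the caller a future instead of a value.
    static CompletableFuture<String> foo(String input) {
        return someAsyncCall(input)
                .thenApply(n -> "result=" + (n * 2)) // transform the eventual value (like map)
                .exceptionally(err -> "fallback");   // replace a failure with a value (like recover)
    }

    public static void main(String[] args) {
        // join() is used here only so the demo can print; real callers would keep composing.
        System.out.println(foo("21").join());   // prints "result=42"
        System.out.println(foo("oops").join()); // prints "fallback"
    }
}
```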
Awaiting a strict result for the Future will have to block the current thread of execution, no matter which technology you use. With plain JVM threads you will block a real thread from the operating system, with Quasar you will block your Fiber, with Akka you will block your Actor (*); blocking means blocking, no way around that.
(*) In an Actor you would get the result via a message at a later point, and until that point you will have to switch the behavior such that new incoming messages are stashed, rejected or dropped, depending on your use-case.