What is the difference between thenApply and thenApplyAsync of Java CompletableFuture?

Suppose I have the following code:
CompletableFuture<Integer> future
= CompletableFuture.supplyAsync( () -> 0);
thenApply case:
future.thenApply( x -> x + 1 )
.thenApply( x -> x + 1 )
.thenAccept( x -> System.out.println(x));
Here the output will be 2. Now in case of thenApplyAsync:
future.thenApplyAsync( x -> x + 1 ) // first step
.thenApplyAsync( x -> x + 1 ) // second step
.thenAccept( x -> System.out.println(x)); // third step
I read in this blog that each thenApplyAsync is executed in a separate thread and 'at the same time' (meaning that a following thenApplyAsync starts before the preceding thenApplyAsync finishes). If so, what is the input argument value of the second step if the first step has not finished?
Where will the result of the first step go if it is not taken by the second step?
Which step's result will the third step take?
If the second step has to wait for the result of the first step, then what is the point of Async?
Here x -> x + 1 is just to illustrate the point; what I want to know is what happens with very long computations.

The difference has to do with the Executor that is responsible for running the code. Each operator on CompletableFuture generally has 3 versions.
thenApply(fn) - runs fn on a thread determined by the CompletableFuture on which it is called, so you generally cannot know where it will be executed. It might execute immediately if the result is already available.
thenApplyAsync(fn) - runs fn on an environment-defined executor regardless of circumstances. For CompletableFuture this will generally be ForkJoinPool.commonPool().
thenApplyAsync(fn,exec) - runs fn on exec.
In the end the result is the same, but the scheduling behavior depends on the choice of method.
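As a minimal sketch of the three variants listed above: the class name, method names and the thread name "my-pool-thread" are made up for illustration. All three calls produce the same value; only the explicit-executor variant lets you name the thread that does the work.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class ThenApplyVariants {
    // Hypothetical thread name, chosen only so the explicit executor is easy to spot.
    static final String POOL_THREAD = "my-pool-thread";

    // thenApply(fn): the thread that runs fn is not under our control.
    static int plain() {
        return CompletableFuture.supplyAsync(() -> 0).thenApply(x -> x + 1).join();
    }

    // thenApplyAsync(fn): fn is handed to the default asynchronous executor.
    static int async() {
        return CompletableFuture.supplyAsync(() -> 0).thenApplyAsync(x -> x + 1).join();
    }

    // thenApplyAsync(fn, exec): fn runs on exec; we return the name of the thread that ran it.
    static String threadOfExplicitExecutor() {
        ExecutorService exec = Executors.newSingleThreadExecutor(r -> new Thread(r, POOL_THREAD));
        try {
            return CompletableFuture.supplyAsync(() -> 0)
                    .thenApplyAsync(x -> Thread.currentThread().getName(), exec)
                    .join();
        } finally {
            exec.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(plain());
        System.out.println(async());
        System.out.println(threadOfExplicitExecutor());
    }
}
```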

You're misquoting the article's examples, and so you're applying the article's conclusion incorrectly. I see two questions in your question:
What is the correct usage of .then___()
In both examples you quoted, which are not in the article, the second function has to wait for the first function to complete. Whenever you call a.then___(b -> ...), input b is the result of a and has to wait for a to complete, regardless of whether you use the methods named Async or not. The article's conclusion does not apply because you misquoted it.
The example in the article is actually
CompletableFuture<String> receiver = CompletableFuture.supplyAsync(this::findReceiver);
receiver.thenApplyAsync(this::sendMsg);
receiver.thenApplyAsync(this::sendMsg);
Notice that both thenApplyAsync calls are applied to receiver, not chained in the same statement. This means both functions can start once receiver completes, in an unspecified order. (Any assumption of order is implementation dependent.)
To put it more clearly:
a.thenApply(b).thenApply(c); means the order is a finishes then b starts, b finishes, then c starts.
a.thenApplyAsync(b).thenApplyAsync(c); will behave exactly the same as above as far as the ordering between a b c is concerned.
a.thenApply(b); a.thenApply(c); means a finishes, then b or c can start, in any order. b and c don't have to wait for each other.
a.thenApplyAsync(b); a.thenApplyAsync(c); works the same way, as far as the order is concerned.
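The four cases above can be checked with a small sketch. The class and method names are hypothetical. In the chained form each stage logs when it runs, and the log always comes out in dependency order; in the branched form both dependents receive a's result and do not see each other's.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;

final class StageOrdering {
    // Chained: each stage consumes the previous stage's result, so the order is fixed
    // even though every stage uses the Async variant.
    static List<String> chained() {
        List<String> log = new CopyOnWriteArrayList<>();
        CompletableFuture.supplyAsync(() -> { log.add("a"); return 0; })
                .thenApplyAsync(x -> { log.add("b"); return x + 1; })
                .thenApplyAsync(x -> { log.add("c"); return x + 1; })
                .join();
        return log;
    }

    // Branched: b and c both depend only on a, so each one receives a's result (0).
    // Their relative order is unspecified, which is why only the values are returned.
    static int[] branched() {
        CompletableFuture<Integer> a = CompletableFuture.supplyAsync(() -> 0);
        CompletableFuture<Integer> b = a.thenApplyAsync(x -> x + 1);
        CompletableFuture<Integer> c = a.thenApplyAsync(x -> x + 10);
        return new int[] { b.join(), c.join() };
    }

    public static void main(String[] args) {
        System.out.println(chained());
        int[] results = branched();
        System.out.println(results[0] + " " + results[1]);
    }
}
```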
You should understand the above before reading the below. The above concerns asynchronous programming, without it you won't be able to use the APIs correctly. The below concerns thread management, with which you can optimize your program and avoid performance pitfalls. But you can't optimize your program without writing it correctly.
As titled: Difference between thenApply and thenApplyAsync of Java CompletableFuture?
I must point out that the people who wrote the JSR must have confused the technical term "Asynchronous Programming", and picked the names that are now confusing newcomers and veterans alike. To start, there is nothing in thenApplyAsync that is more asynchronous than thenApply from the contract of these methods.
The difference between the two has to do with which thread the function runs on. The function supplied to thenApply may run on any of the threads that
call complete
call thenApply on the same instance
while the 2 overloads of thenApplyAsync either
use a default Executor (a.k.a. thread pool), or
use a supplied Executor
The takeaway is that for thenApply, the runtime promises to eventually run your function on some thread which you do not control. If you want control of threads, use the Async variants.
If your function is lightweight, it doesn't matter which thread runs your function.
If your function is heavy CPU bound, you do not want to leave it to the runtime. If the runtime picks the network thread to run your function, the network thread can't spend time to handle network requests, causing network requests to wait longer in the queue and your server to become unresponsive. In that case you want to use thenApplyAsync with your own thread pool.
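To make the thread choice concrete, here is a sketch under stated assumptions: the thread names "io-thread" and "cpu-pool" are invented, with "io-thread" standing in for a network thread that completes the future. It shows the two thenApply cases described above, plus thenApplyAsync with your own pool.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class WhichThread {
    // Registered BEFORE completion: the function runs on the thread that calls complete().
    static String registeredBeforeCompletion() {
        CompletableFuture<String> source = new CompletableFuture<>();
        CompletableFuture<String> ranOn = source.thenApply(v -> Thread.currentThread().getName());
        new Thread(() -> source.complete("done"), "io-thread").start();
        return ranOn.join();
    }

    // Registered AFTER completion: the function runs right away on the thread calling thenApply.
    static boolean registeredAfterCompletionRunsOnCaller() {
        CompletableFuture<String> source = CompletableFuture.completedFuture("done");
        String ranOn = source.thenApply(v -> Thread.currentThread().getName()).join();
        return ranOn.equals(Thread.currentThread().getName());
    }

    // Async variant with our own pool: the completing thread only hands the work over.
    static String asyncWithOwnPool() {
        ExecutorService cpuPool = Executors.newSingleThreadExecutor(r -> new Thread(r, "cpu-pool"));
        try {
            CompletableFuture<String> source = new CompletableFuture<>();
            CompletableFuture<String> ranOn =
                    source.thenApplyAsync(v -> Thread.currentThread().getName(), cpuPool);
            new Thread(() -> source.complete("done"), "io-thread").start();
            return ranOn.join();
        } finally {
            cpuPool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(registeredBeforeCompletion());
        System.out.println(registeredAfterCompletionRunsOnCaller());
        System.out.println(asyncWithOwnPool());
    }
}
```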
Fun fact: Asynchrony != threads
thenApply/thenApplyAsync, and their counterparts thenCompose/thenComposeAsync, handle/handleAsync, thenAccept/thenAcceptAsync, are all asynchronous! The asynchronous nature of these functions has to do with the fact that an asynchronous operation eventually calls complete or completeExceptionally. The idea came from JavaScript, which is indeed asynchronous but isn't multi-threaded.

This is what the documentation says about CompletableFuture's thenApplyAsync:
Returns a new CompletionStage that, when this stage completes
normally, is executed using this stage's default asynchronous
execution facility, with this stage's result as the argument to the
supplied function.
So, thenApplyAsync has to wait for the previous thenApplyAsync's result:
In your case you first do the synchronous work and then the asynchronous one. So it does not matter that the second one is asynchronous, because it is started only after the synchronous work has finished.
Let's switch it up. In some cases "async result: 2" will be printed first and in some cases "sync result: 2" will be printed first. Here it makes a difference because both call 1 and 2 can run asynchronously, call 1 on a separate thread and call 2 on some other thread, which might be the main thread.
CompletableFuture<Integer> future
= CompletableFuture.supplyAsync(() -> 0);
future.thenApplyAsync(x -> x + 1) // call 1
.thenApplyAsync(x -> x + 1)
.thenAccept(x -> System.out.println("async result: " + x));
future.thenApply(x -> x + 1) // call 2
.thenApply(x -> x + 1)
.thenAccept(x -> System.out.println("sync result:" + x));

The second step (i.e. computation) will always be executed after the first step.
If the second step has to wait for the result of the first step then what is the point of Async?
Async means in this case that you are guaranteed that the method will return quickly and the computation will be executed in a different thread.
When calling thenApply (without async), then you have no such guarantee. In this case the computation may be executed synchronously i.e. in the same thread that calls thenApply if the CompletableFuture is already completed by the time the method is called. But the computation may also be executed asynchronously by the thread that completes the future or some other thread that calls a method on the same CompletableFuture. This answer: https://stackoverflow.com/a/46062939/1235217 explained in detail what thenApply does and does not guarantee.
So when should you use thenApply and when thenApplyAsync? I use the following rule of thumb:
non-async: only if the task is very small and non-blocking, because in this case we don't care which of the possible threads executes it
async (often with an explicit executor as parameter): for all other tasks

In both thenApplyAsync and thenApply the Function<? super T, ? extends U> fn passed to these methods will be called asynchronously and will not block the thread that registered it.
The difference has to do with which thread will be responsible for calling Function#apply(T t):
Consider an AsyncHttpClient call as below. Notice the thread names printed below; I hope they give you clarity on the difference:
// running in the main method
// public static void main(String[] args) ....
CompletableFuture<Response> future =
asyncHttpClient.prepareGet(uri).execute().toCompletableFuture();
log.info("Current Thread " + Thread.currentThread().getName());
//Prints "Current Thread main"
thenApply will use the same thread that completed the future (provided the future is not yet complete when thenApply is called).
// Will use the dispatcher threads from the asyncHttpClient to call `Function#apply`.
// The thread that completed the future will be blocked while `Function#apply(T t)` executes.
future.thenApply(myResult -> {
log.info("Applier Thread " + Thread.currentThread().getName());
return myResult;
});
//Prints: "Applier Thread httpclient-dispatch-8"
thenApplyAsync will use a thread from the Executor pool.
// Will use the threads from the common pool to call `Function#apply`.
// The thread that completed the future WON'T be blocked while `Function#apply(T t)` executes.
future.thenApplyAsync(myResult -> {
log.info("Applier Thread " + Thread.currentThread().getName());
return myResult;
});
//Prints: "Applier Thread ForkJoinPool.commonPool-worker-7"
future.get() will block the main thread.
// If called, `.get()` blocks the main thread until the CompletableFuture is completed.
future.get();
Conclusion
The Async suffix in the method thenApplyAsync means that the thread completing the future will not be blocked by the execution of the Function#apply(T t) method.
Whether to use thenApplyAsync or thenApply depends on whether you want to block the thread completing the future.

Related

How are the extractors in Reactor blocking (in terms of performance)?

I've noticed that it's not recommended to use extractors such as Mono.toFuture() and Flux.collectList(), since they will block the flow.
I'm not sure in which way they are 'blocking'. In the code below, I know Flux.collectList() will wait for all the items to finish. Does a certain thread keep waiting, or does the last thread to finish simply perform the .collectList() step?
It has been mentioned that Mono.toFuture() will block too. Does it return a 'future' immediately (with the future becoming usable once onNext() or onComplete() occurs), or does it not return until onNext() or onComplete() has occurred?
var m = Flux.range(0, 100)
.parallel()
.runOn(Schedulers.boundedElastic())
.map(i -> Mono.fromFuture(
Mono.just(i).map(n -> {
try {
var s = (long) (Math.random() * 100);
Thread.sleep(s);
System.out.println(Thread.currentThread() + "after " + s + "ms awaking: " + n);
} catch (InterruptedException e) {
e.printStackTrace();
}
return n;
}).toFuture())
)
.doOnNext(o -> System.out.println(Thread.currentThread() + "before sequential"))
.sequential();
var mm = Flux.merge(m)
.doOnNext(o -> System.out.println(Thread.currentThread() + "before collecting"))
.collectList()
.doOnNext(o -> System.out.println(Thread.currentThread() + "before map"))
.map(list -> list.stream().map(i -> i).collect(Collectors.toList()))
.publishOn(Schedulers.single())
.toFuture();

Your assumptions aren't quite correct.
Mono.toFuture() isn't blocking at all - it simply returns a CompletableFuture, which you can either block on (if you call its get() method) or use asynchronously (if you use any of its chaining methods, like thenApply(), thenCompose() etc.). You break out of the reactor context and so forfeit things like backpressure, but you don't immediately have to block.
It's possible you're thinking of (very) old versions of reactor where I believe there was a toFuture() variant that returned a Future, rather than a CompletableFuture - and while that wasn't blocking either, it put you in a context where you had to then block, as Future has no async component. So while the method call itself wasn't blocking, that was then the only choice you had.
Contrary to popular belief Flux.collectList() also isn't blocking - it specifically returns a Mono<List<T>>, that is a non-blocking publisher that will emit a single element, which is a list of everything that's in that flux. You can call block() on this publisher of course, and that operation would be blocking - but calling collectList() by itself is no more blocking than any other operator.
That being said, it certainly can cause problems. Due to the nature of what it's doing (collecting all elements from a flux into a single list in memory), it may not be ideal:
You might have to wait a long time for the list to be emitted, with no feedback on how many elements it contains, or if it's being populated at all;
You might run out of memory if the number of elements, or size of elements in the flux is particularly large;
You can't output any intermediate state as elements are added, so you forfeit things like streaming JSON support.
That doesn't make it blocking, however; it just means there's a different set of potential issues you need to weigh up before deciding whether it's an operator that's worth using in your particular scenario.

Replicate deferred/async launch policies from C++ in Java

In C++ you can start a thread with a deferred or asynchronous launch policy. Is there a way to replicate this functionality in Java?
auto T1 = std::async(std::launch::deferred, doSomething);
auto T2 = std::async(std::launch::async, doSomething);
Descriptions of each--
Asynchronous:
If the async flag is set, then async executes the callable object f on a new thread of execution (with all thread-locals initialized) except that if the function f returns a value or throws an exception, it is stored in the shared state accessible through the std::future that async returns to the caller.
Deferred:
If the deferred flag is set, then async converts f and args... the same way as by std::thread constructor, but does not spawn a new thread of execution. Instead, lazy evaluation is performed: the first call to a non-timed wait function on the std::future that async returned to the caller will cause the copy of f to be invoked (as an rvalue) with the copies of args... (also passed as rvalues) in the current thread (which does not have to be the thread that originally called std::async). The result or exception is placed in the shared state associated with the future and only then it is made ready. All further accesses to the same std::future will return the result immediately.
See the documentation for details.

Future
First of all, we have to observe that std::async is a tool to execute a given task and return a std::future object that holds the result of the computation once it's available.
For example we can call result.get() to block and wait for the result to arrive. Also, when the computation encountered an exception, it will be stored and rethrown to us as soon as we call result.get().
Java provides similar classes, the interface is Future and the most relevant implementation is CompletableFuture.
std::future#get translates roughly to Future#get. Even the exceptional behavior is very similar. While C++ rethrows the exception upon calling get, Java will throw an ExecutionException which has the original exception set as its cause.
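A small sketch of that exceptional behavior, with a hypothetical failing task: get() reports the failure as a checked ExecutionException, and the CompletableFuture-specific join() reports it as an unchecked CompletionException. In both cases the original exception is available as the cause.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.ExecutionException;

final class FutureExceptions {
    // Hypothetical task that always fails.
    static CompletableFuture<Integer> failing() {
        return CompletableFuture.<Integer>supplyAsync(() -> {
            throw new IllegalStateException("boom");
        });
    }

    // get() wraps the original exception in a checked ExecutionException.
    static Throwable causeSeenByGet() {
        try {
            failing().get();
            return null;
        } catch (ExecutionException e) {
            return e.getCause();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return e;
        }
    }

    // join() wraps the original exception in an unchecked CompletionException.
    static Throwable causeSeenByJoin() {
        try {
            failing().join();
            return null;
        } catch (CompletionException e) {
            return e.getCause();
        }
    }

    public static void main(String[] args) {
        System.out.println(causeSeenByGet());
        System.out.println(causeSeenByJoin());
    }
}
```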
How to obtain a Future?
In C++ you create your future object using std::async. In Java you could use one of the many static helper methods in CompletableFuture. In your case, the most relevant are
CompletableFuture#runAsync, if the task does not return any result and
CompletableFuture#supplyAsync, if the task will return a result upon completion
So in order to create a future that just prints Hello World!, you could for example do
CompletableFuture<Void> task = CompletableFuture.runAsync(() -> System.out.println("Hello World!"));
/*...*/
task.get();
Java not only has lambdas but also method references. Let's say you have a method that computes a heavy math task:
class MyMath {
static int compute() {
// Very heavy, duh
return (int) Math.pow(2, 5);
}
}
Then you could create a future that returns the result once it's available as
CompletableFuture<Integer> task = CompletableFuture.supplyAsync(MyMath::compute);
/*...*/
Integer result = task.get();
async vs deferred
In C++, you have the option to specify a launch policy which dictates the threading behavior for the task. Let us put the memory promises C++ makes aside, because in Java you do not have that much control over memory.
The difference is that async will immediately schedule creation of a thread and execute the task in that thread. The result will be available at some point and is computed while you continue working in your main task. The exact details, such as whether it is a new thread or a cached thread, depend on the implementation and are not specified.
deferred behaves completely differently. Basically nothing happens when you call std::async: no extra thread is created and the task is not computed yet. The result will not be made available in the meantime at all. However, as soon as you call get, the task is computed in your current thread and the result is returned, basically as if you had called the method directly yourself, without any async utilities at all.
std::launch::async in Java
That said, let's focus on how to translate this behavior to Java. Let's start with async.
This is the simple one, as it is basically the default and intended behavior offered in CompletableFuture. So you just do runAsync or supplyAsync, depending on whether your method returns a result or not. Let me show the previous examples again:
// without result
CompletableFuture<Void> task = CompletableFuture.runAsync(() -> System.out.println("Hello World!"));
/*...*/ // the task is computed in the meantime in a different thread
task.get();
// with result
CompletableFuture<Integer> task = CompletableFuture.supplyAsync(MyMath::compute);
/*...*/
Integer result = task.get();
Note that there are also overloads of the methods that accept an Executor, which can be used if you have your own thread pool and want CompletableFuture to use that instead of its own.
std::launch::deferred in Java
I experimented a lot trying to mimic this behavior with CompletableFuture, but it does not seem to be possible without creating your own implementation (please correct me if I am wrong though). No matter what, it either executes directly upon creation or not at all.
So I would just propose to use the underlying task interface that you gave to CompletableFuture, for example Runnable or Supplier, directly. In our case, we might also use IntSupplier to avoid the autoboxing.
Here are the two code examples again, but this time with deferred behavior:
// without result
Runnable task = () -> System.out.println("Hello World!");
/*...*/ // the task is not computed in the meantime, no threads involved
task.run(); // the task is computed now
// with result
IntSupplier task = MyMath::compute;
/*...*/
int result = task.getAsInt();
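If you still want a Future-shaped object with deferred semantics, one possible sketch builds on java.util.concurrent.FutureTask, which does nothing until its run() method is called. DeferredTask below is a hypothetical helper, not a JDK class: its untimed get() runs the task on the calling thread the first time and returns the stored result afterwards. The timed get is deliberately left alone, loosely mirroring the C++ rule that only a non-timed wait triggers the computation.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.FutureTask;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper, not part of the JDK: a Future that computes lazily on the first get().
final class DeferredTask<T> extends FutureTask<T> {
    DeferredTask(Callable<T> callable) {
        super(callable);
    }

    @Override
    public T get() throws InterruptedException, ExecutionException {
        run();              // does nothing if the task already ran or is running on another thread
        return super.get(); // returns the stored result or rethrows the stored failure
    }
}

final class DeferredDemo {
    // Returns {calls before get, first result, second result, total calls}.
    static int[] demo() {
        AtomicInteger calls = new AtomicInteger();
        DeferredTask<Integer> task = new DeferredTask<>(() -> {
            calls.incrementAndGet();
            return (int) Math.pow(2, 5);
        });
        int before = calls.get();    // still 0: creating the task computes nothing
        try {
            int first = task.get();  // computed now, on the calling thread
            int second = task.get(); // served from the stored result
            return new int[] { before, first, second, calls.get() };
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(demo()));
    }
}
```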
Modern multithreading in Java
As a final note I would like to give you a better idea how multithreading is typically used in Java nowadays. The provided facilities are much richer than what C++ offers by default.
Ideally you should design your system in a way that you do not have to care about such little threading details. You create an automatically managed dynamic thread pool using Executors and then launch your initial task against that (or use the default executor service provided by CompletableFuture). After that, you just set up an operation pipeline on the future object, similar to the Stream API, and then just wait on the final future object.
For example, let us suppose you have a list of file names List<String> fileNames and you want to
read the file
validate its content, skip it if its invalid
compress the file
upload the file to some web server
check the response status code
and count how many were invalid, unsuccessful, and successful. Suppose you have some methods like
class FileUploader {
static byte[] readFile(String name) { /*...*/ }
static byte[] requireValid(byte[] content) throws IllegalStateException { /*...*/ }
static byte[] compressContent(byte[] content) { /*...*/ }
static int uploadContent(byte[] content) { /*...*/ }
}
then we can do so easily by
AtomicInteger successfull = new AtomicInteger();
AtomicInteger notSuccessfull = new AtomicInteger();
AtomicInteger invalid = new AtomicInteger();
// Setup the pipeline
List<CompletableFuture<Void>> tasks = fileNames.stream()
.map(name -> CompletableFuture
.completedFuture(name)
.thenApplyAsync(FileUploader::readFile)
.thenApplyAsync(FileUploader::requireValid)
.thenApplyAsync(FileUploader::compressContent)
.thenApplyAsync(FileUploader::uploadContent)
.<Void>handleAsync((statusCode, exception) -> {
AtomicInteger counter;
if (exception == null) {
counter = statusCode == 200 ? successfull : notSuccessfull;
} else {
counter = invalid;
}
counter.incrementAndGet();
return null; // handleAsync takes a BiFunction, so the lambda must return a value
})
).collect(Collectors.toList());
// Wait until all tasks are done
tasks.forEach(CompletableFuture::join);
// Print the results
System.out.printf("Successfull %d, not successfull %d, invalid %d%n", successfull.get(), notSuccessfull.get(), invalid.get());
The main benefit of this is that it can keep the available hardware busy and achieve high throughput. All tasks are executed dynamically and independently, managed by an automatic pool of threads, and you just wait until everything is done.

For asynchronous launch of a thread, in modern Java prefer the use of a high-level java.util.concurrent.ExecutorService.
One way to obtain an ExecutorService is through java.util.concurrent.Executors. Different behaviors are available for ExecutorServices; the Executors class provides methods for some common cases.
Once you have an ExecutorService, you can submit Runnables and Callables to it.
Future<MyReturnValue> myFuture = myExecutorService.submit(myTask);
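A self-contained sketch of that one-liner, assuming a fixed pool of two threads and a trivial Callable. The class name, pool size and task are placeholders, not anything prescribed by the API.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class SubmitDemo {
    static String submitAndWait() {
        // One of the common cases offered by Executors: a pool with a fixed number of threads.
        ExecutorService myExecutorService = Executors.newFixedThreadPool(2);
        try {
            Callable<String> myTask = () -> "hello future";
            Future<String> myFuture = myExecutorService.submit(myTask); // starts asynchronously
            return myFuture.get();                                      // blocks until the result is ready
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            myExecutorService.shutdown(); // let the pool threads exit so the JVM can terminate
        }
    }

    public static void main(String[] args) {
        System.out.println(submitAndWait());
    }
}
```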

If I understood you correctly, maybe something like this:
private static CompletableFuture<Void> deferred(Runnable run) {
CompletableFuture<Void> future = new CompletableFuture<>();
future.thenRun(run);
return future;
}
private static CompletableFuture<Void> async(Runnable run) {
return CompletableFuture.runAsync(run);
}
And then using them like:
public static void main(String[] args) throws Exception {
CompletableFuture<Void> def = deferred(() -> System.out.println("run"));
def.complete(null);
System.out.println(def.join());
CompletableFuture<Void> async = async(() -> System.out.println("run async"));
async.join();
}

To get something like a deferred thread, you might try running a thread at a reduced priority.
First, in Java it's often idiomatic to make a task using a Runnable first. You can also use the Callable<T> interface, which allows the thread to return a value (Runnable can't).
public class MyTask implements Runnable {
@Override
public void run() {
System.out.println( "hello thread." );
}
}
Then just create a thread. In Java threads normally wrap the task they execute.
MyTask myTask = new MyTask();
Thread t = new Thread( myTask );
t.setPriority( Thread.currentThread().getPriority()-1 );
t.start();
This should not run until there is a core available to do so, which means it shouldn't run until the current thread is blocked or has run out of things to do. However, you're at the mercy of the OS scheduler here, so the specific behavior is not guaranteed. Most OSes will guarantee that all threads run eventually, so if the current thread takes a long time without blocking, the OS will start the new thread executing anyway.
setPriority() can throw a security exception if you're not allowed to set the priority of a thread (uncommon but possible). So just be aware of that minor inconvenience.
For an async task with a Future I would use an executor service. The helper methods in the class Executors are a convenient way to do this.
First make your task as before.
public class MyCallable implements Callable<String> {
@Override
public String call() {
return "hello future thread.";
}
}
Then use an executor service to run it:
MyCallable myCallable = new MyCallable();
ExecutorService es = Executors.newCachedThreadPool();
Future<String> f = es.submit( myCallable );
You can use the Future object to query the thread, determine its running status and get the value it returns. You will need to shutdown the executor to stop all of its threads before exiting the JVM.
es.shutdown();
I've tried to write this code as simply as possible, without the use of lambdas or clever use of generics. The above should show you what those lambdas are actually implementing. However it's usually considered better to be a bit more sophisticated when writing code (and a bit less verbose) so you should investigate other syntax once you feel you understand the above.

What is the best way to apply more than one CompletableFuture on result of another CompletableFuture?

Let's make up an example:
We have four methods:
CompletableFuture<Void> loadAndApply(SomeObject someObject);
CompletableFuture<SomeData> loadData();
A processA(SomeData data);
B processB(SomeData data);
loadAndApply combines all other methods. loadData gets data for a long time. Then we set someObject.A to result of running processA(data) and set someObject.B to result of running processB(data)
We can't apply both processA and processB at the same time because processA can only be run on swingExecutor and processB can be run only on backgroundExecutor.
So my question here is: can we somehow chain all these methods in some good looking way?
Currently I launch them like this:
CompletableFuture<Void> loadAndApply(SomeObject someObject) {
return loadData()
.thenApplyAsync(data -> { someObject.setA(processA(data)); return data; }, swingExecutor)
.thenAcceptAsync(data -> someObject.setB(processB(data)), backgroundExecutor);
}
Is there a better-looking way than thenApplyAsync, which here does not actually transform the given object and just returns it for the next future?

You can do this by using CompletionStage.thenCompose(Function) in combination with CompletableFuture.allOf(CompletableFuture...). The generic signature of the Function used by thenCompose is: Function<? super T, ? extends CompletionStage<U>>.
public CompletableFuture<Void> loadAndApply(SomeObject object) {
return loadData().thenCompose(data ->
CompletableFuture.allOf(
CompletableFuture.runAsync(() -> object.setA(processA(data)), swingExecutor),
CompletableFuture.runAsync(() -> object.setB(processB(data)), backgroundExecutor)
) // End of "allOf"
); // End of "thenCompose"
} // End of "loadAndApply"
This has an added benefit. In the code you are currently using, the thenAcceptAsync stage has to wait for the thenApplyAsync stage to complete before it can execute. With the code above, both setA and setB will run concurrently in their respective executors.
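For experimentation, here is a runnable sketch of the same shape. Everything in it is a stand-in: SomeObject, loadData, processA and processB are reduced to strings, and the two executors are plain single-thread pools named "swing-like" and "background" rather than the asker's real swingExecutor and backgroundExecutor.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class LoadAndApplyDemo {
    // Hypothetical stand-in for the question's SomeObject.
    static final class SomeObject {
        volatile String a;
        volatile String b;
    }

    // Daemon threads so the demo JVM can exit without an explicit shutdown.
    static Thread daemon(Runnable r, String name) {
        Thread t = new Thread(r, name);
        t.setDaemon(true);
        return t;
    }

    static final ExecutorService swingExecutor =
            Executors.newSingleThreadExecutor(r -> daemon(r, "swing-like"));
    static final ExecutorService backgroundExecutor =
            Executors.newSingleThreadExecutor(r -> daemon(r, "background"));

    static CompletableFuture<String> loadData() {
        return CompletableFuture.supplyAsync(() -> "data");
    }

    static String processA(String data) { return "A(" + data + ")"; }
    static String processB(String data) { return "B(" + data + ")"; }

    // Once the data is loaded, both setters are started on their own executors;
    // the returned future completes when both have finished.
    static CompletableFuture<Void> loadAndApply(SomeObject object) {
        return loadData().thenCompose(data -> CompletableFuture.allOf(
                CompletableFuture.runAsync(() -> object.a = processA(data), swingExecutor),
                CompletableFuture.runAsync(() -> object.b = processB(data), backgroundExecutor)));
    }

    public static void main(String[] args) {
        SomeObject object = new SomeObject();
        loadAndApply(object).join();
        System.out.println(object.a + " " + object.b);
    }
}
```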
For the sake of convenience, here's Javadoc for allOf:
Returns a new CompletableFuture that is completed when all of the
given CompletableFutures complete. If any of the given
CompletableFutures complete exceptionally, then the returned
CompletableFuture also does so, with a CompletionException holding
this exception as its cause. Otherwise, the results, if any, of the
given CompletableFutures are not reflected in the returned
CompletableFuture, but may be obtained by inspecting them
individually. If no CompletableFutures are provided, returns a
CompletableFuture completed with the value null.
Among the applications of this method is to await completion of a set
of independent CompletableFutures before continuing a program, as in:
CompletableFuture.allOf(c1, c2, c3).join();.
...and the Javadoc for thenCompose:
Returns a new CompletionStage that is completed with the same value as
the CompletionStage returned by the given function.
When this stage completes normally, the given function is invoked with
this stage's result as the argument, returning another
CompletionStage. When that stage completes normally, the
CompletionStage returned by this method is completed with the same
value.
To ensure progress, the supplied function must arrange eventual
completion of its result.
This method is analogous to Optional.flatMap and Stream.flatMap.
See the CompletionStage documentation for rules covering exceptional
completion.
Note: CompletableFuture, which implements CompletionStage, overrides thenCompose but makes the return type more specific (returns CompletableFuture rather than CompletionStage).

CompletableFuture takes more time - Java 8

I have two snippets of code which are technically the same, but the second one takes 1 second longer than the first one. The first one executes in 6 seconds and the second in 7.
Double yearlyEarnings = employmentService.getYearlyEarningForUserWithEmployer(userId, emp.getId());
CompletableFuture<Double> earlyEarningsInHomeCountryCF = currencyConvCF.thenApplyAsync(currencyConv -> {
return currencyConv * yearlyEarnings;
});
The above one takes 6s and the next takes 7s
CompletableFuture<Double> earlyEarningsInHomeCountryCF = currencyConvCF.thenApplyAsync(currencyConv -> {
Double yearlyEarnings = employmentService.getYearlyEarningForUserWithEmployer(userId, emp.getId());
return currencyConv * yearlyEarnings;
});
Please explain why the second code consistently takes 1s more (extra time) as compared to the first one
Below is the signature of the method getYearlyEarningForUserWithEmployer. Just sharing, but it should not have any effect.
Double getYearlyEarningForUserWithEmployer(long userId, long employerId);
Here is the link to code

Your question is horribly incomplete, but from what we can guess, it’s entirely plausible that the second variant takes longer, if we assume that currencyConvCF represents an asynchronous operation which might be running concurrently while your code fragments are executed and you’re talking about the overall time it takes to complete all operations, including the one represented by the CompletableFuture returned by thenApplyAsync (earlyEarningsInHomeCountryCF).
In the first variant you are invoking getYearlyEarningForUserWithEmployer while the operation represented by currencyConvCF might still be running concurrently. The multiplication will happen when both operations have completed.
In the second variant, the getYearlyEarningForUserWithEmployer invocation is part of the operation passed to currencyConvCF.thenApplyAsync, thus it will not start before the operation represented by currencyConvCF has been completed, so no operation will run concurrently. If we assume that getYearlyEarningForUserWithEmployer takes a significant time, say one second, and has no internal dependencies to the other operation, it’s not surprising when the overall operation takes longer in that variant.
It seems, what you actually want to do is something like:
CompletableFuture<Double> earlyEarningsInHomeCountryCF = currencyConvCF.thenCombineAsync(
CompletableFuture.supplyAsync(
() -> employmentService.getYearlyEarningForUserWithEmployer(userId, emp.getId())),
(currencyConv, yearlyEarnings) -> currencyConv * yearlyEarnings);
so getYearlyEarningForUserWithEmployer is not executed sequentially in the initiating thread but both source operations can run asynchronously before the final multiplication applies.
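A self-contained sketch of combining two independent operations: the two lookup methods, their 200 ms delays and their return values are invented placeholders for the asker's services. Both suppliers start right away, and the multiplication runs once both have results.

```java
import java.util.concurrent.CompletableFuture;

final class CombineDemo {
    // Simulates a slow call; the duration is an arbitrary placeholder.
    static void pause(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Hypothetical stand-ins for the two slow, independent lookups.
    static double currencyConversion() { pause(200); return 1.5; }
    static double yearlyEarnings() { pause(200); return 1000.0; }

    static double combined() {
        // Both operations are submitted before either result is needed.
        CompletableFuture<Double> conv = CompletableFuture.supplyAsync(CombineDemo::currencyConversion);
        CompletableFuture<Double> earnings = CompletableFuture.supplyAsync(CombineDemo::yearlyEarnings);
        // The combining function runs only after both sources have completed.
        return conv.thenCombine(earnings, (c, y) -> c * y).join();
    }

    public static void main(String[] args) {
        System.out.println(combined());
    }
}
```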
However, when you are invoking get right afterwards in the initiating thread, like in your linked code on GitHub, that asynchronous processing of the second operation has no benefit. Instead of waiting for the completion, your initiating thread can just perform the independent operation itself, as the first code variant of your question already does, and you will likely be even faster when not spawning an asynchronous operation for something as simple as a single multiplication, i.e. use instead:
CompletableFuture<Double> currencyConvCF = /* a true asynchronous operation */;
return employmentService.getYearlyEarningForUserWithEmployer(userId, emp.getId())
* currencyConvCF.join();

What Holger said makes sense, but it does not address the problem I posted. I do agree that the question was not written in the best way.
The problem was that the order in which the futures were written caused a consistent increase in run time.
Ideally, the order of the futures should not matter as long as the code is written in a correct reactive fashion.
The cause was the default ForkJoinPool: unless an Executor is passed explicitly, Java runs the asynchronous CompletableFuture stages on ForkJoinPool.commonPool(). If I run all the CompletableFutures on a custom pool, I get almost the same time, irrespective of the order in which the future statements were written.
I still need to find out what the limitations of ForkJoinPool are and why my custom pool of 20 threads performs better.
I'll update my answer when I find the actual reason.
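For reference, this is roughly what "running the futures with a custom pool" looks like. It is a sketch under assumptions, not the code from my project: blockingCall and the 200 ms sleep are hypothetical placeholders for the real service calls. One possible (unconfirmed) explanation for the difference is that the common pool's default parallelism is tied to the number of available processors, so blocking tasks may queue up behind each other there, while a pool of 20 threads can run 20 of them at once.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ForkJoinPool;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

final class CustomPoolSketch {

    // Hypothetical blocking task: sleeps to imitate a remote call, then returns its id.
    static int blockingCall(int id) {
        try {
            Thread.sleep(200);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return id;
    }

    // Submits `count` blocking tasks to the given pool, waits for all of them, and sums the results.
    static int runAll(int count, ExecutorService pool) {
        List<CompletableFuture<Integer>> futures = IntStream.range(0, count)
                .mapToObj(i -> CompletableFuture.supplyAsync(() -> blockingCall(i), pool))
                .collect(Collectors.toList());
        return futures.stream().mapToInt(CompletableFuture::join).sum();
    }

    public static void main(String[] args) {
        System.out.println("common pool parallelism: " + ForkJoinPool.getCommonPoolParallelism());
        ExecutorService pool = Executors.newFixedThreadPool(20);
        try {
            long start = System.nanoTime();
            int sum = runAll(20, pool);
            System.out.printf("sum=%d in ~%d ms%n", sum, (System.nanoTime() - start) / 1_000_000);
        } finally {
            pool.shutdown(); // fixed-pool threads are non-daemon; shut down so the JVM can exit
        }
    }
}
```

The key point is the second argument to supplyAsync: every stage that should avoid the common pool needs the executor passed explicitly, including the ...Async variants of the chaining methods.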

RxJava .subscribeOn(Schedulers.newThread()) questions

I am on plain JDK 8. I have this simple RxJava example:
Observable
.from(Arrays.asList("one", "two", "three"))
.doOnNext(word -> System.out.printf("%s uses thread %s%n", word, Thread.currentThread().getName()))
//.subscribeOn(Schedulers.newThread())
.subscribe(word -> System.out.println(word));
and it prints out the words line by line, intertwined with information about the thread, which is 'main' for all next calls, as expected.
However, when I uncomment the subscribeOn(Schedulers.newThread()) call, nothing is printed at all. Why isn't it working? I would have expected it to start a new thread for each onNext() call and the doOnNext() to print that thread's name. Right now, I see nothing, also for the other schedulers.
When I add a call to Thread.sleep(10000L) at the end of my main, I can see the output, which would suggest the threads used by RxJava are all daemons. Is this the case? Can this be changed somehow by using a custom ThreadFactory or a similar concept, without having to implement a custom Scheduler?
With the mentioned change, the thread name is always RxNewThreadScheduler-1, whereas the documentation for newThread says "Scheduler that creates a new {@link Thread} for each unit of work". Isn't it supposed to create a new thread for each of the emissions?

As Vladimir mentioned, RxJava standard schedulers run work on daemon threads which terminate in your example because the main thread quits. I'd like to emphasise that they don't schedule each value on a new thread, but they schedule the stream of values for each individual subscriber on a newly created thread. Subscribing a second time would give you "RxNewThreadScheduler-2".
You don't really need to change the default schedulers, but just wrap your own Executor-based scheduler with Schedulers.from() and supply that as a parameter where needed:
ThreadPoolExecutor exec = new ThreadPoolExecutor(
0, 64, 2, TimeUnit.SECONDS, new LinkedBlockingQueue<>());
exec.allowCoreThreadTimeOut(true);
Scheduler s = Schedulers.from(exec);
Observable
.from(Arrays.asList("one", "two", "three"))
.doOnNext(word -> System.out.printf("%s uses thread %s%n", word,
Thread.currentThread().getName()))
.subscribeOn(s)
.subscribe(word -> System.out.println(word));
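If you also want control over the thread names and the daemon flag, which is what the question asks about, you can hand the executor a ThreadFactory. The following is a plain-JDK sketch without any RxJava dependency; the class name, the factory helper and the "my-rx-worker" prefix are made up for illustration. The resulting executor would then be wrapped with Schedulers.from(exec) exactly as above.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class NamedThreadFactorySketch {

    // Builds a factory whose threads get a numbered name and an explicit daemon flag.
    static ThreadFactory factory(String prefix, boolean daemon) {
        AtomicInteger counter = new AtomicInteger();
        return runnable -> {
            Thread t = new Thread(runnable, prefix + "-" + counter.incrementAndGet());
            t.setDaemon(daemon);
            return t;
        };
    }

    public static void main(String[] args) throws Exception {
        // Non-daemon workers keep the JVM alive until the executor is shut down.
        ExecutorService exec = Executors.newCachedThreadPool(factory("my-rx-worker", false));
        String info = exec.submit(() -> {
            Thread current = Thread.currentThread();
            return current.getName() + " daemon=" + current.isDaemon();
        }).get();
        System.out.println(info);
        exec.shutdown();
        exec.awaitTermination(1, TimeUnit.SECONDS);
    }
}
```

With non-daemon threads the program no longer exits before the values are printed, but you become responsible for shutting the executor down (or letting idle threads time out, as allowCoreThreadTimeOut does in the snippet above).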
I've got a series of blog posts about RxJava schedulers which should help you implement a "more permanent" variant.

Contrary to newcomers' belief, reactive streams are not inherently concurrent but are inherently asynchronous. They are also inherently sequential, and concurrency must be configured within the stream. Put simply, reactive streams are naturally sequential at their ends but can be concurrent at their core.
The secret sauce is using the flatMap() operator within the stream. This operator takes an Observable<T> input from the source stream and internally re-emits it as an Observable<Observable<T>> stream, subscribing to all of the inner instances at once. As long as the flatMap() internal stream is executed in a multi-threaded context, it will concurrently execute the provided Function<T, R> that applies your logic and, finally, re-emit the results on the original stream as its own emissions.
This sounds very complicated (and it is quite a bit at first glance) but simple examples with explanations help to understand the concept.
Find more details from a similar question here and articles on RxJava2 Schedulers and Concurrency with code sample and detailed explanations on how to use Schedulers sequentially and concurrently.
Hope this helps,
Softjake

public class MainClass {
public static void main(String[] args) {
Scheduler scheduler = Schedulers.from(Executors.newFixedThreadPool(10, Executors.defaultThreadFactory()));
Observable.interval(1,TimeUnit.SECONDS)
.doOnNext(word -> System.out.printf("%s uses thread %s%n", word,
Thread.currentThread().getName()))
.subscribeOn(scheduler)
.observeOn(Schedulers.io())
.doOnNext(word -> System.out.printf("%s uses thread %s%n", word,
Thread.currentThread().getName()))
.subscribe();
}
}
