I have a method on some repository class that returns a CompletableFuture. The code that completes these futures uses a third party library which blocks. I intend to have a separate bounded Executor which this repository class will use to make these blocking calls.
Here is an example:
public class PersonRepository {
private Executor executor = ...
public CompletableFuture<Void> create(Person person) {...}
public CompletableFuture<Boolean> delete(Person person) {...}
}
The rest of my application will compose these futures and do some other things with the results. When these other functions are supplied to thenAccept, thenCompose, whenComplete, etc., I don't want them to run on the repository's Executor.
Another example:
public CompletableFuture<?> replacePerson(Person person) {
final PersonRepository repo = getPersonRepository();
return repo.delete(person)
.thenAccept(existed -> {
if (existed) System.out.println("Person deleted");
else System.out.println("Person did not exist");
})
.thenCompose(unused -> {
System.out.println("Creating person");
return repo.create(person);
})
.whenComplete((unused, ex) -> {
if (ex != null) System.out.println("Creating person");
repo.close();
});
}
The JavaDoc states:
Actions supplied for dependent completions of non-async methods may be performed by the thread that completes the current CompletableFuture, or by any other caller of a completion method.
Side question: Why is there an or here? In what case is there another caller of a completion method that does not complete the current future?
Main question: If I want all the println to be executed by a different Executor than the one used by the repository, which methods do I need to make async and provide the executor manually?
Obviously the thenAccept needs to be changed to thenAcceptAsync but I'm not sure about that point onwards.
Alternative question: Which thread completes the returned future from thenCompose?
My guess is that it will be whatever thread completes the future returned from the function argument. In other words I would need to also change whenComplete to whenCompleteAsync.
Perhaps I am overcomplicating things, but this feels like it could get quite tricky. I need to pay a lot of attention to where all these futures come from. Also, from a design point of view, if I return a future, how do I prevent callers from using my executor? It feels like it breaks encapsulation. I know that all the transformation functions in Scala take an implicit ExecutionContext, which seems to solve all these problems.
Side Question: If you assign the intermediate CompletionStage to a variable and call a method on it, it gets executed on the same thread.
Main Question: Only the first one, so change thenAccept to thenAcceptAsync -- all the following ones will execute their steps on the thread that is used for the accept.
Alternative Question: the thread that completed the future from thenCompose is the same one as was used for the compose.
You should think of the CompletionStages as steps, that are executed in rapid succession on the same thread (by just applying the functions in order), unless you specifically want the step to be executed on a different thread, using async. All next steps are done on that new thread then.
In your current setup the steps would be executed like this:
Thread-1: delete, accept, compose, complete
With the first accept async, it becomes:
Thread-1: delete
Thread-2: accept, compose, complete
As for your last question, about the same thread being used by your callers if they add additional steps: I don't think there is much you can do about it, aside from returning a normal Future instead of a CompletableFuture.
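A conservative sketch, for anyone who would rather not depend on which thread happens to complete the previous stage: pass the application executor explicitly to every callback that must stay off the repository pool. The class name, the two single-thread executors ("repo" and "app"), and the delete()/create() stand-ins are assumptions made for illustration; they are not the asker's real repository.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class ExplicitExecutorSketch {
    // Hypothetical stand-ins: "repo" plays the repository's bounded pool,
    // "app" the executor the rest of the application wants its callbacks on.
    static final ExecutorService REPO = Executors.newSingleThreadExecutor(r -> daemon(r, "repo"));
    static final ExecutorService APP = Executors.newSingleThreadExecutor(r -> daemon(r, "app"));

    static Thread daemon(Runnable r, String name) {
        Thread t = new Thread(r, name);
        t.setDaemon(true); // lets the JVM exit without an explicit shutdown
        return t;
    }

    // Stand-ins for repo.delete(person) and repo.create(person).
    static CompletableFuture<Boolean> delete() {
        return CompletableFuture.supplyAsync(() -> true, REPO);
    }

    static CompletableFuture<Void> create() {
        return CompletableFuture.runAsync(() -> { }, REPO);
    }

    // Returns the names of the threads that ran the two callbacks.
    static String[] replacePerson() {
        String[] ranOn = new String[2];
        delete()
            .thenAcceptAsync(existed -> ranOn[0] = Thread.currentThread().getName(), APP)
            .thenCompose(unused -> create())
            .whenCompleteAsync((unused, ex) -> ranOn[1] = Thread.currentThread().getName(), APP)
            .join();
        return ranOn;
    }

    public static void main(String[] args) {
        String[] ranOn = replacePerson();
        System.out.println("thenAcceptAsync ran on " + ranOn[0]);
        System.out.println("whenCompleteAsync ran on " + ranOn[1]);
    }
}
```

Because both callbacks name their executor, they run on the "app" thread regardless of which thread completed the stage before them.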
Just from my empirical observations while playing around with it, the thread that executes these non-async methods will depend on which happens first, thenCompose itself or the task behind the Future.
If thenCompose completes first (which, in your case, is almost certain), then the method will run on the same thread that is executing the Future task.
If the task behind your Future completes first, then the method will run immediately on the calling thread (i.e. no executor at all).
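The two cases can be reproduced deterministically with a manually completed future. This is only a sketch; the class and thread names are made up for the demonstration.

```java
import java.util.concurrent.CompletableFuture;

class CompletionThreadSketch {
    // Dependent chained BEFORE completion: it runs on the thread that calls complete().
    static String chainedFirst() {
        CompletableFuture<String> source = new CompletableFuture<>();
        CompletableFuture<String> ranOn = source.thenApply(v -> Thread.currentThread().getName());
        new Thread(() -> source.complete("done"), "completer").start();
        return ranOn.join(); // blocks until the completer thread has run the dependent
    }

    // Source ALREADY complete when the dependent is chained: it runs on the chaining thread.
    static String completedFirst() {
        CompletableFuture<String> source = CompletableFuture.completedFuture("done");
        return source.thenApply(v -> Thread.currentThread().getName()).join();
    }

    public static void main(String[] args) {
        System.out.println("chained first:   " + chainedFirst());
        System.out.println("completed first: " + completedFirst());
    }
}
```

The first method reports the "completer" thread, the second reports whichever thread called it.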
Related
I'm trying to gather specific data on how long a task waits between being submitted and actually being executed. The idea is to be able to closely monitor the existing threadpool and tasks that are submitted for execution.
Let's assume I have an ExecutorService with a fixedThreadPool.
I'm also using a composition of completableFutures to perform a set of tasks asynchronously.
I'd like to be able to track within my logs the exact time a certain task had to wait in the queue before being taken for execution.
The way I see it I need two things:
A way to label CompletableFuture (or the Supplier functions passed to CompletableFuture.supplyAsync())
This I could potentially do by providing a wrapper method for the Supplier as mentioned here https://stackoverflow.com/a/57888886 and overwrite the CompletableFuture.supplyAsync() method so it will internally log which named supplier was provided
A way to monitor the time between the submission and execution of a specific Runnable to the threadpool executor.
This I can achieve by extending the ThreadPoolExecutor and providing some custom logging in the beforeExecute() and execute() methods
What I'm kind of 'stuck' on now is linking both of them together. The beforeExecute() method override gives me the thread and the runnable - but the thread in itself doesn't tell me much yet, and the runnable isn't named in any way so I can't really know which exact task is taken for execution. Of course I can add additional logs in the task implementations themselves and then assume that they will be right next to the log from beforeExecute(). The problem still remains for execute() itself, since that one is called internally after using the CompletableFuture composition.
So how can I properly link the information from the executor service, with some labelling of the exact tasks provided as CompletableFutures as in the example below?
List<Foo> results = createResults();
results.forEach(r -> CompletableFuture.completedFuture(r)
.thenCompose(result -> addSomething(result, something)
.thenCombine(addSomethingElse(result, somethingElse), (r1, r2) -> result)
.thenCompose(r -> doSomething(result).thenCompose(this::setSomething))
.thenApply(v -> result))));
...
// At some point join() is called to actually wait for execution and completion
listOfFutures.join()
And each of the functions called within return a CompletableFuture<Foo> created by:
private CompletableFuture<Foo> setSomething(Foo foo) {
return CompletableFuture.supplyAsync(() -> {
foo.description = "Setting something";
return foo;
}, myExecutorService);
}
So even by wrapping the Supplier<T> to have it labeled, how am I able to link this with the tracking within the execute() and beforeExecute() method of the ThreadPoolExecutor when that one operates on Runnables instead of Suppliers?
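One possible way around the linking problem, offered as a sketch rather than a definitive answer: let the labeled Supplier wrapper itself record the wait, so nothing has to be matched up inside the ThreadPoolExecutor. The timed() helper, the WAITS map, and the task names are hypothetical. The timestamp is taken when the supplier is wrapped, which assumes the wrapping happens right at the supplyAsync() call.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;

class QueueWaitSketch {
    // Label -> microseconds between wrapping (submission) and start of execution.
    static final Map<String, Long> WAITS = new ConcurrentHashMap<>();

    // Hypothetical helper: remembers when it was created and records the delay
    // once a pool thread actually starts running the task.
    static <T> Supplier<T> timed(String label, Supplier<T> task) {
        final long submittedAt = System.nanoTime();
        return () -> {
            WAITS.put(label, (System.nanoTime() - submittedAt) / 1_000);
            return task.get();
        };
    }

    static String slowTask() {
        try {
            Thread.sleep(100);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "a";
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        try {
            // The first task keeps the only worker busy, so the second has to queue.
            CompletableFuture<String> slow =
                    CompletableFuture.supplyAsync(timed("slow", QueueWaitSketch::slowTask), pool);
            CompletableFuture<String> queued =
                    CompletableFuture.supplyAsync(timed("queued", () -> "b"), pool);
            slow.join();
            queued.join();
        } finally {
            pool.shutdown();
        }
        WAITS.forEach((label, micros) -> System.out.println(label + " waited " + micros + " us"));
    }
}
```

The wrapper only sees tasks you wrap yourself. Internal completion steps that CompletableFuture runs without going through the executor are not measured.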
I have the following CompletableFuture with supplyAsync that uses a client to call an external http endpoint.
CompletableFuture<T> future = CompletableFuture.supplyAsync(() -> {
**call to external client**
})
If I were to call future.get twice, with a different timeout for each call, would that mean the external client is called twice as well? I am not sure how the future behaves in this case.
Thanks
The documentation is sort of telling you already:
Waits if necessary for this future to complete, and then returns its result
which implies that if the future is already completed, you will get its result without waiting, since it is already known. "known" here means that your call was already executed and the result is already "saved", thus no need to execute it twice.
You can sneak a peek into the implementation too and see that your call to the external client is a Runnable under the hood (besides other things). CompletableFuture::supplyAsync will schedule that Runnable and your two threads will poll for the result (this is a very simplified explanation).
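A quick way to check this yourself, as a sketch: count how often the supplier runs. callExternalClient() here is a hypothetical stand-in for the real HTTP call, and the timeouts are arbitrary.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class GetTwiceSketch {
    static final AtomicInteger CALLS = new AtomicInteger();

    // Hypothetical stand-in for the external HTTP call.
    static String callExternalClient() {
        CALLS.incrementAndGet();
        return "response";
    }

    // get() with a timeout throws checked exceptions; wrap them to keep the demo short.
    static String getWithin(CompletableFuture<String> future, long seconds) {
        try {
            return future.get(seconds, TimeUnit.SECONDS);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Calls get() twice with different timeouts and reports how often the supplier ran.
    static int supplierRunsAfterTwoGets() {
        CALLS.set(0);
        CompletableFuture<String> future = CompletableFuture.supplyAsync(GetTwiceSketch::callExternalClient);
        getWithin(future, 5);
        getWithin(future, 1);
        return CALLS.get();
    }

    public static void main(String[] args) {
        System.out.println("supplier ran " + supplierRunsAfterTwoGets() + " time(s)");
    }
}
```

The supplier is scheduled once by supplyAsync; each get() only waits for, or reads, the stored result.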
I'm studying CompletableFuture API and there is an example:
CompletableFuture.completedFuture(url)
.thenComposeAsync(this::readPage, executor)
.thenApply(this::getImageURLs)
.thenApply(this::saveFoundImages)
.....
I have a question: if I call the thenComposeAsync(...) method first, will the other methods in the chain execute in the executor that I passed as a parameter, or should I call the other methods using their async variants to get asynchronous execution in a particular executor?
OK, so there are 3 types of methods in CompletableFuture. For example:
thenApply()
thenApplyAsync(Function) (without an Executor)
thenApplyAsync(Function, Executor) (with an Executor)
The last one means that this action will execute in the Executor that you pass to it and it is the most obvious one.
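The second one means that the action is executed in the common pool, ForkJoinPool.commonPool() (unless that pool does not support a parallelism level of at least two, in which case a new thread is created for each task).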
The first one is far more interesting. The documentation makes it sound like it's easy, via:
Actions supplied for dependent completions of non-async methods may be performed by the thread that completes the current CompletableFuture, or by any other caller of a completion method
And you need to start breaking this into smaller pieces. What you need to understand is that there are threads that complete a certain CompletableFuture, there are threads that execute some actions on it, and there are threads that chain certain dependent actions. Potentially, these are all different threads. And this is where it all begins:
If the dependent action is already chained, the thread that will call complete is going to be the thread that executes this action.
If the future is already completed, then the thread that chains the action will execute it.
Since there is no fixed ordering between the steps above, it is impossible to say with 100% certainty in which thread your thenApply will execute. That action can be executed in any of:
the thread that calls complete/completeExceptionally
the thread that does the chaining of thenApply
the thread that calls join/get
Any of the above is a possibility. If you really want, I made a rather interesting test here, proving some of the things above.
I am not trying to pick on the other answer, but he made a rather interesting point that I was very confused about in the beginning too:
In your example: After .thenComposeAsync also all the following chained futures will get completed by executor.
We can easily prove that this is not correct:
CompletableFuture<String> future1 = CompletableFuture.completedFuture("a");
CompletableFuture<String> future2 = future1.thenApplyAsync(x -> "b" + x, Executors.newCachedThreadPool());
LockSupport.parkNanos(TimeUnit.SECONDS.toNanos(1));
CompletableFuture<String> future3 = future2.thenApply(x -> {
System.out.println(Thread.currentThread().getName());
return x + "c";
});
future3.join();
What you are going to see if you run this, is that main actually executes thenApply, not a thread from the pool.
I have a set of classes which encapsulate a unit of work on Google Sheets. After the class's execute method is called, they pass a request to a service, bundled with a callback which the service should call on task completion. (As the tasks are non-critical and repeated frequently, the service just logs errors and does not call the class back if its request fails).
Stripped down, the tasks look like this:
public void execute() {
//preparatory stuff, then...
Request r = new Request(this::callback);
service.execute(r);
}
public void callback(Result result) {
...
}
The call to the service is synchronous, but within the service the Request is queued, executed asynchronously, and the callback is invoked on a new thread. Some of the tasks involve several service invocations: the callback methods may themselves create a Request with a second callback method and invoke the service again. I want that to be invisible to client code.
My problem now is that I would like to run the tasks asynchronously from client code and then execute an arbitrary handler after they are done. One way to do this would be to give the class a callback in the execute() method for it to call once execution is complete. But I'd really rather be able to do this inline in code, this sort of thing:
CompletableFuture.supplyAsync(() -> (new Task()).execute()).whenComplete((result, error) -> {});
The problem with that is that the completion of the execute() method does not signal the end of the task, as the task is still awaiting its callback. The other thing is that the callback might never arrive. I can't figure out how I should go about calling the task such that I can run it asynchronously and inline like in the code above, with whenComplete() invoked when the Task class explicitly decides it is finished. I'd also need a timeout, as the task's callback may not be invoked.
Any ideas? Note that I control the service invoked by the tasks, so I can change how that works if necessary, but I'd probably rather not.
I'd spend some time looking around in java.util.concurrent. Can't you just use an ExecutorService for a lot of this? You can submit() Callable code and get a future back, you can submit a list of Callables and give a timeout, and you can call shutdown() and then awaitTermination() to wait for the processing to stop. You can get these notification callbacks by just submitting a Callable that is constructed with the callback interface and invokes it when it feels like it's done.
Failing this, you might look at actors. Your concurrency pattern would likely be very easy in the actor model.
Going to answer my own question here: I just altered the task execute methods to return a CompletableFuture<TaskResult> with TaskResult containing the desired information. The task stores the CompletableFuture internally and calls complete() as needed in later callbacks. Not sure why I had trouble seeing this solution.
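For readers landing here, a minimal sketch of that approach. The serviceExecute() stand-in, the String result, and the 200 ms timeout are assumptions for illustration; the real Task and Request classes are not shown in the question. orTimeout() requires Java 9 or later.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Consumer;

class CallbackTaskSketch {
    // Hypothetical stand-in for the service: invokes the callback on a new thread,
    // or never invokes it when `respond` is false (a failed request).
    static void serviceExecute(Consumer<String> callback, boolean respond) {
        if (respond) {
            new Thread(() -> callback.accept("sheet updated")).start();
        }
    }

    // The task hands out a future and completes it from its callback.
    static CompletableFuture<String> execute(boolean respond) {
        CompletableFuture<String> done = new CompletableFuture<>();
        serviceExecute(done::complete, respond);
        // Fails the future with a TimeoutException if no callback arrives in time.
        return done.orTimeout(200, TimeUnit.MILLISECONDS);
    }

    // True if a task whose callback never arrives ends with a TimeoutException.
    static boolean timesOut() {
        try {
            execute(false).join();
            return false;
        } catch (CompletionException e) {
            return e.getCause() instanceof TimeoutException;
        }
    }

    public static void main(String[] args) {
        execute(true)
            .whenComplete((result, error) -> System.out.println("result: " + result))
            .join();
        System.out.println("timed out without callback: " + timesOut());
    }
}
```

Client code can then chain whenComplete() inline, and the handler fires either when the task calls complete() or when the timeout expires.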
I am having some trouble understanding CompletableFuture. I do not understand the get() method. Please correct me if I am wrong, but it says
"Waits if necessary for this future to complete, and then returns its result." So if I do not return any result, I do not have to call the get method?
Please see below. Even if I do not call the get() method, it still does its job. So my understanding is that get() makes sense if the future returns something; otherwise get() is not necessary for a future that does not return anything.
//Since it does not return anything, the two statements below do the same thing.
CompletableFuture.runAsync(() -> doSomething());
CompletableFuture.runAsync(() -> doSomething()).get();
private void doSomething() {
//do something
}
The main purpose of get() is to wait for the task to complete, then return the result.
If your task is a Runnable rather than a Supplier, the future's result type is Void, so as you have pointed out, there is no point in checking the result. For such tasks you call get() only to wait for them to complete.
The main advantages of CompletableFuture are the methods that allow you to handle exceptions and further process the data. It also has methods to wait for all tasks, or any single task, to complete from a set of CompletableFuture tasks. So it's much easier to work with in a multithreaded environment. The get() method works the same as in the Future interface, though.
UPDATE:
If you don't need them to complete before moving on in your application, you don't have to call the get() method at all. But it would be wise to keep a reference to them, and to wait for them to complete or cancel them before exiting the program. At some point in the program you will probably want to check whether they have completed.
But if you want them to complete before going further, then you can use CompletableFuture.allOf().
In some cases it's also wise to add a timeout to their execution, so that you will not have a hanging thread in your application. A hanging thread may be dangerous, especially in a mobile environment.
So it all depends on your business case.
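As a rough illustration of the allOf() and timeout advice above (a sketch with made-up names, three trivial tasks, and an arbitrary two-second limit):

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class WaitForAllSketch {
    // Starts three fire-and-forget tasks, waits for all of them, and
    // returns how many actually ran.
    static int runAndWait() {
        AtomicInteger finished = new AtomicInteger();
        CompletableFuture<?>[] tasks = new CompletableFuture<?>[3];
        for (int i = 0; i < tasks.length; i++) {
            tasks[i] = CompletableFuture.runAsync(finished::incrementAndGet);
        }
        try {
            // Wait for every task, but give up after two seconds instead of hanging.
            CompletableFuture.allOf(tasks).get(2, TimeUnit.SECONDS);
        } catch (Exception e) {
            // TimeoutException, ExecutionException or InterruptedException
            throw new IllegalStateException("tasks did not finish in time", e);
        }
        return finished.get();
    }

    public static void main(String[] args) {
        System.out.println("finished tasks: " + runAndWait());
    }
}
```

Here get() is called only to wait; the Void result itself is never used.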