Wrapping blocking client with Mono and execute sequence - java

I am trying to make a reactive application which needs to execute ssh commands.
Currently, there is an SSH client (based on sshd mina) which is blocking (maybe there is a way to use it in a non-blocking way, but I don't know it). My idea is to create a wrapper around this blocking client, so that I can transform the blocking calls into Mono as in the code below.
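import java.time.Duration;
import reactor.core.publisher.Mono;
import reactor.core.scheduler.Schedulers;
// SshClient, SshCommand and SshResponse come from the existing blocking client.
public class SshReactiveClient implements AutoCloseable {
    private final SshClient sshClient;
    public SshReactiveClient(final SshClient sshClient) {
        this.sshClient = sshClient;
    }
    // Runs the blocking open() call on the boundedElastic scheduler.
    public Mono<SshResponse> open(final Duration timeout) {
        return Mono.fromCallable(() -> sshClient.open(timeout))
            .subscribeOn(Schedulers.boundedElastic());
    }
    // Runs the blocking execCommand() call on the boundedElastic scheduler.
    public Mono<SshResponse> execCommand(final SshCommand command, final Duration timeout) {
        return Mono.fromCallable(() -> sshClient.execCommand(command, timeout))
            .subscribeOn(Schedulers.boundedElastic());
    }
    @Override
    public void close() throws Exception {
        sshClient.close();
    }
}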
First, is it a good idea to proceed like that or not? What would be better?
The second point is how to write the code so that I can execute a sequence of ssh command using the responses from the previous commands to execute the next one?

Your understanding is correct: blocking or synchronous code has to be wrapped and run on a separate Scheduler, as you do with boundedElastic(). A client with a native asynchronous interface would be better still, if one is available.
To execute commands in sequence, build a flow with the reactive API:
execCommand(command1)
    .flatMap(res ->
        execCommand(getCommand2(res))
    )
    .flatMap(res ->
        execCommand(getCommand3(res))
    )
There are many other options depending on your requirements. For example, if you need the results of both command1 and command2 to execute command3, you can nest the second flatMap one level down so that res1 stays in scope:
execCommand(command1)
    .flatMap(res1 ->
        execCommand(getCommand2(res1))
            .flatMap(res2 ->
                execCommand(getCommand3(res1, res2))
            )
    )
As an alternative, you could apply the builder pattern to the Reactor flow to collect responses in a sequential flow (see "Builder pattern with Reactor mono").
You could also execute command1 and command2 in parallel and use the responses from both:
Mono.zip(execCommand(command1), execCommand(command2))
    .flatMap(tuple ->
        execCommand(getCommand3(tuple.getT1(), tuple.getT2()))
    )
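For readers more used to the JDK's own CompletableFuture, the three shapes above (chained, nested, and zipped) can be sketched without Reactor. This is only an analogy under stated assumptions: thenCompose stands in for flatMap and thenCombine for Mono.zip, the class SshSequenceSketch and its string "commands" are hypothetical, and unlike a Mono a CompletableFuture starts running as soon as it is created rather than on subscription.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical stand-in for the wrapped SSH client; commands and responses are plain strings.
final class SshSequenceSketch {
    // Dedicated daemon threads for the blocking calls (plays the role of boundedElastic here).
    private final ExecutorService blockingPool = Executors.newFixedThreadPool(2, runnable -> {
        Thread thread = new Thread(runnable, "ssh-blocking");
        thread.setDaemon(true);
        return thread;
    });

    // Pretend blocking call: echoes the command wrapped in out(...).
    private String execBlocking(String command) {
        return "out(" + command + ")";
    }

    // Wraps the blocking call so that it runs on the dedicated pool.
    CompletableFuture<String> execCommand(String command) {
        return CompletableFuture.supplyAsync(() -> execBlocking(command), blockingPool);
    }

    // Chained: each command is built from the previous response only.
    CompletableFuture<String> chained() {
        return execCommand("c1")
            .thenCompose(res1 -> execCommand("c2:" + res1))
            .thenCompose(res2 -> execCommand("c3:" + res2));
    }

    // Nested: the third command can see both earlier responses.
    CompletableFuture<String> nested() {
        return execCommand("c1")
            .thenCompose(res1 -> execCommand("c2:" + res1)
                .thenCompose(res2 -> execCommand("c3:" + res1 + "+" + res2)));
    }

    // Zipped: c1 and c2 run independently, then both responses feed c3.
    CompletableFuture<String> zipped() {
        return execCommand("c1")
            .thenCombine(execCommand("c2"), (res1, res2) -> "c3:" + res1 + "+" + res2)
            .thenCompose(this::execCommand);
    }

    void shutdown() {
        blockingPool.shutdown();
    }
}
```

Calling join() on any of the returned futures blocks the caller, so in real code the result would normally be consumed with a further thenApply or thenAccept instead.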

Related

Infinite never failing hot flux

I'm using WebFlux for handling my HTTP requests. As a side effect of the processing, I want to add a record to the database, but I do not want to hold up processing of the user request to do so.
Somewhere in the main application flow:
@GetMapping
Flux<Data> getHandler() {
    return doStuff().doOnNext(data -> dataStore.store(data));
}
In different class I have
class DataStore {
    private static final Logger LOGGER = LoggerFactory.getLogger(DataStore.class);
    private DataRepository repository;
    private Scheduler scheduler;
    private Sinks.Many<Data> sink;
    public DataStore(DataRepository repository, Scheduler scheduler) {
        this.repository = repository;
        this.scheduler = scheduler; // will be boundedElastic in production
        this.sink = Sinks.many().replay().limit(1000); // buffer size
        // build hot flux
        this.sink.asFlux()
            .map(data -> repository.save(data))
            // retry strategy for random issues with DB connection
            .retryWhen(Retry.backoff(maxRetry, backoffDuration)
                .doBeforeRetry(signal -> LOGGER.warn("Retrying to save, attempt {}", signal.totalRetries())))
            // give up on saving this item, drop it, try with another one, reset backoff strategy in the meantime
            .onErrorContinue(Exceptions::isRetryExhausted, (e, o) -> LOGGER.error("Dropping data"))
            .subscribeOn(scheduler, true)
            .subscribe(
                data -> LOGGER.info("Data {} saved.", data),
                error -> LOGGER.error("Fatal error. Terminating store flux.", error)
            );
    }
    public void store(Data data) {
        sink.tryEmitNext(data);
    }
}
But when writing tests for it, I noticed that once the backoff reaches its limit, the flux just stops instead of dropping the data and continuing.
@BeforeEach
public void setup() {
    repository = mock(DataRepository.class);
    dataStore = new DataStore(repository, Schedulers.immediate()); // maxRetry = 4, backoffDuration = Duration.ofMillis(1)
}
@Test
public void test() throws Exception {
    // given
    when(repository.save(any()))
        .thenThrow(new RuntimeException("fail")) // normal store
        .thenThrow(new RuntimeException("fail")) // first retry
        .thenThrow(new RuntimeException("fail")) // second retry
        .thenThrow(new RuntimeException("fail")) // third retry
        .thenThrow(new RuntimeException("fail")) // fourth retry -> should drop data("One")
        .thenAnswer(invocation -> invocation.getArgument(0)) // store data("Two")
        .thenAnswer(invocation -> invocation.getArgument(0)); // store data("Three")
    // when
    dataStore.store(data("One")); // fails 5 times: the initial attempt plus 4 retries
    dataStore.store(data("Two")); // successful store
    dataStore.store(data("Three")); // successful store
    // then
    Thread.sleep(2000); // overkill sleep
    verify(repository, times(7)).save(any()); // assertion fails: data Two and Three were not saved
}
When running this test my assertion fails and in the logs I can see only
Retrying to save, attempt 0
Retrying to save, attempt 1
Retrying to save, attempt 2
Retrying to save, attempt 3
Dropping data
And there is no info of successful processing of data Two and Three.
I do not want to retry indefinitely, because I assume that DB connection may fail from time to time and I do not want to have buffer overflow.
I know that I can achieve a similar flow without a flux (using a queue, etc.), but the built-in retry with backoff is very tempting.
How can I drop the error from the flux, given that onErrorContinue does not seem to be working?
General note - the code in the above question isn't used in a reactive context, and therefore this answer suggests options that would be completely wrong if using WebFlux or a similar reactive environment.
Firstly, note that onErrorContinue() is almost certainly not what you want - not just in this situation, but generally. It's a specialist operator that almost certainly doesn't quite do what you think it does.
Usually, I'd balk at this line:
.map(data -> repository.save(data))
...as it implies your repository isn't reactive, so you're blocking in a reactive chain - a complete no-no. In this case because you're using it purely for the convenience of the retry semantics it's not going to cause issues, but bear in mind most people used to seeing reactive code will get scared when they see stuff like this!
If you're able to use a reactive repository, that would be better, and then you'd expect to see something like this:
.flatMap(data -> repository.save(data))
...implying that the save() method is returning a non-blocking Mono rather than a plain value after blocking. The norm with retrying would then be to retry on the inner publisher, resuming on an empty publisher if retries are exhausted:
.flatMap(data -> repository.save(data)
    .retryWhen(Retry.backoff(maxRetry, backoffDuration)
        .doBeforeRetry(signal -> LOGGER.warn("Retrying to save, attempt {}", signal.totalRetries())))
    .onErrorResume(Exceptions::isRetryExhausted, e -> Mono.empty())
)
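If you're not able or willing to use a reactive repository, then in this case you can still achieve the above by wrapping the call as Mono.fromCallable(() -> repository.save(data)). Note that Mono.just(repository.save(data)) would not do the job: it calls save() eagerly, before the inner Mono exists, so a failure is thrown outside the inner publisher and its retryWhen never sees it. Again, wrapping a blocking call like this is a bit of a code smell, and completely forbidden in a standard reactive chain, as you're making something "look" reactive when it's not.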

Non-blocking streaming of data between two Quarkus services (Vert.x with Mutiny in Java)

Update!
I have fixed minor bugs in the sample code after solving some of the problems that were irrelevant to the main question which is still about non-blocking streaming between services.
Background info:
I'm porting a Spring WebFlux service under Quarkus. The service runs long searches on multiple huge data sets and returns the partial results in a Flux (text/event-stream) as they become available.
Problem:
Right now I'm trying to use Mutiny Multi with Vert.x under Quarkus but cannot figure out how the consumer service could receive this stream without blocking.
In all the examples I found, either the consumer is a JS front-end page, or the producer's content type is application/json, which seems to block until the Multi completes and then sends everything over in one JSON object (which makes no sense in my application).
Questions:
How to receive text/event-stream with the Mutiny-flavoured Vert.x WebClient?
If the problem is that the WebClient cannot receive continuous streams: what is the standard way to stream data between two Quarkus services?
Here is a simplified example
Test entity
public class SearchResult implements Serializable {
private String content;
public SearchResult(String content) {
this.content = content;
}
//.. toString, getters and setters
}
Producer 1. simple infinite stream -> hangs
@GET
@Path("/search")
@Produces(MediaType.SERVER_SENT_EVENTS)
@SseElementType(MediaType.APPLICATION_JSON)
public Multi<SearchResult> getResults() {
    return Multi.createFrom().ticks().every(Duration.ofSeconds(2))
        .onItem().transform(n -> new SearchResult(n.toString()));
}
Producer 2. with Vertx Paths infinite stream -> hangs
@Route(path = "/routed", methods = HttpMethod.GET)
public Multi<SearchResult> getSrStreamRouted(RoutingContext context) {
    log.info("routed run");
    return ReactiveRoutes.asEventStream(Multi.createFrom().ticks().every(Duration.ofSeconds(2))
        .onItem().transform(n -> new SearchResult(n.toString())));
}
Producer 3. simple finite stream -> blocks until completion
@GET
@Path("/search")
@Produces(MediaType.SERVER_SENT_EVENTS)
@SseElementType(MediaType.APPLICATION_JSON)
public Multi<SearchResult> getResults() {
    return Multi.createFrom().ticks().every(Duration.ofSeconds(2))
        .transform().byTakingFirstItems(5)
        .onItem().transform(n -> new SearchResult(n.toString()));
}
Consumer:
I tried multiple different solutions on both the producer and consumer sides, but in every case the stream either blocks until it is complete or, for infinite streams, hangs indefinitely without transferring any data. I get the same results with httpie. Here is the latest iteration:
WebClientOptions webClientOptions = new WebClientOptions().setDefaultHost("localhost").setDefaultPort(8182);
WebClient client = WebClient.create(vertx, webClientOptions);
client.get("/string")
.send()
.onFailure().invoke(resp -> log.error("error: " + resp))
.onItem().invoke(resp -> log.info("result: " + resp.statusCode()))
.toMulti()
.subscribe().with(r -> log.info(String.format("Subscribe: code:%d body:%s",r.statusCode(), r.bodyAsString())));
The Vert.x Web Client does not work with SSE without additional configuration.
From https://vertx.io/docs/vertx-web-client/java/:
Responses are fully buffered, use BodyCodec.pipe to pipe the response to a write stream
It waits until the response completes. You can either use the raw Vert.x HTTP Client or use the pipe codec. Examples are given on https://vertx.io/docs/vertx-web-client/java/#_decoding_responses.
Alternatively, you can use an SSE client such as in:
https://github.com/quarkusio/quarkus-quickstarts/blob/master/kafka-quickstart/src/test/java/org/acme/kafka/PriceResourceTest.java#L27-L34

Java Stream vs Flux fromIterable

I have a list of usernames and want to fetch user details from a remote service without blocking the main thread. I'm using Spring's reactive client, WebClient. For each response I get a Mono, then subscribe to it and print the result.
private Mono<User> getUser(String username) {
    return webClient
        .get()
        .uri(uri + "/users/" + username)
        .retrieve()
        .bodyToMono(User.class)
        .doOnError(e ->
            logger.error("Error on retrieving user details {}", username));
}
I have implemented the task in two ways:
Using Java stream
usernameList.stream()
.map(this::getUser)
.forEach(mono ->
mono.subscribe(System.out::println));
Using Flux.fromIterable:
Flux.fromIterable(usernameList)
.map(this::getUser)
.subscribe(mono ->
mono.subscribe(System.out::println));
It seems the main thread is not blocked in either case.
What is the difference between Java Stream and Flux.fromIterable in this situation? If both are doing the same thing, which one is recommended to use?
There are no huge differences between the two variants. The Flux.fromIterable variant might give you more options and control over concurrency, retries, etc. - but not really in this case, because calling subscribe here defeats the purpose.
Your question is missing some background about the type of application you're building and in which context these calls are made. If you're building a web application and this is called during request processing, or a batch application - opinions might vary.
In general, I think applications should stay away from calling subscribe because it disconnects the processing of that pipeline from the rest of the application: if an exception happens, you might not be able to report it because the resource needed to send that error message might be gone at that point. Or maybe the application is shutting down and you have no way to make it wait for the completion of that task.
If you're building an application that wants to kick off some work whose result is not useful to the current operation (i.e. it doesn't matter whether that work completes during the lifetime of the current operation), then subscribe might be an option.
In that case, I'd try and group all operations in a single Mono<Void> operation and then trigger that work:
Mono<Void> logUsers = Flux.fromIterable(usernameList)
    .flatMap(name -> getUser(name)) // flatMap, not map, so that doOnNext sees User values rather than Mono<User>
    .doOnNext(user -> System.out.println(user)) // assuming this is non I/O work
    .then();
logUsers.subscribe(...);
If you're concerned about consuming server threads in a web application, then it's really different - you might want to get the result of that operation to write something to the HTTP response. By calling subscribe, both tasks are now disconnected and the HTTP response might be long gone by the time that work is done (and you'll get an error while writing to the response).
In that case, you should chain the operations with Reactor operators.

Method rxExecuteBlocking consuming all results - ssh client

I'm trying to write a really simple SSH client with Vert.x. As I don't have a non-blocking SSH library under the hood, I have to handle everything in rxExecuteBlocking. It works great when I run all the logic in one big block of code, as follows:
public Single<String> exec() {
return vertx.rxExecuteBlocking(f -> {
String result = "";
// connect()
// exec()
// close()
f.complete(result);
}, false);
}
// hostnames :: Observable<String>
hostnames()
.filter()
.flatMapSingle(this::exec)
.moreCalls()
.subscribe(); // OK
I'd rather have connect(), exec() and close() separated and call them like this:
hostnames()
.filter()
.flatMapSingle(this::connect)
.moreCalls()
.flatMapSingle(this::exec)
.moreCalls()
.flatMapSingle(this::close)
.subscribe();
But when running more than one piece of blocking code
public Single<Connection> connect() {
return vertx.rxExecuteBlocking(f -> {
// connect
}, false);
}
public Single<Connection> exec() {
return vertx.rxExecuteBlocking(f -> {
// exec
}, false);
}
the chain stops at flatMapSingle(this::connect), consumes all results from filter() first (making all the connections) and only then continues down the chain. This behavior consumes a lot of resources, as all connections are held in memory at once (it reminds me of reduce() or collect()).
The desired behavior is for each event to continue through the whole chain and release its resources, without the chain stalling at connect.
Is there any way to do this?
Thanks in advance.
I would suggest trying the overloaded flatMap, which takes as an extra argument the maximum number of concurrently subscribed observables at that pipeline stage. Given that the worker thread pool has 20 threads by default, you could give a fraction of the pool to each flatMap call, e.g. 5 to each.
hostnames()
// ...some filtering
.flatMap(hostname -> this.connect(hostname).toObservable(), 5)
// ...more operators
.flatMap(connection -> this.exec(connection).toObservable(), 5)
// ...more operators
.flatMap(connection -> this.close(connection).toObservable(), 5)
.subscribe();
This ensures that the whole thread pool is never used by a single stage at the same moment.
Some tuning of the concurrency limits may be needed. For example, if connect is faster than exec, allow fewer concurrently subscribed observables for connect and more for exec, so that results of connect do not pile up in a buffer before exec.

How to write async actions in play framework 2.5?

I wrote the following code to send an email as a non-blocking action.
It does not work for more than one request.
CompletableFuture.supplyAsync(() ->
EmailService.sendVerificationMail(appUser , mailString)).
thenApply(i -> ok("Got result: " + i));
As play.Promise is deprecated in Play 2.5 (Java), my previous code is no longer supported. Please suggest a proper way to make my action non-blocking.
If the function EmailService.sendVerificationMail is blocking, CompletableFuture only makes it non-blocking for the calling thread. It still blocks another thread (by default, one from the common ForkJoinPool).
This is not a problem if only a few email tasks are running. But if there are too many email tasks (say 100 or more), they will "dominate" the pool. This causes a "convoy effect": other tasks have to wait much longer to start, which can badly damage server performance.
If you have a lot of concurrent email tasks, you can create your own pool to handle them instead of using the common pool. A plain thread pool is a better fit here than a fork-join pool because it does not use work-stealing.
Or you can find the asynchronous APIs of EmailService, or implement them on your own if possible.
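The dedicated-pool option can be sketched with nothing but the JDK. This is a minimal illustration, not Play-specific code: the class BlockingMailer, its pool size of 4, and the simulated sendBlocking method are all assumptions standing in for the real EmailService.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical stand-in for EmailService; the real class is not shown in the question.
final class BlockingMailer {
    // Small fixed pool reserved for blocking mail calls, so they cannot
    // occupy the threads of ForkJoinPool.commonPool().
    private final ExecutorService mailPool = Executors.newFixedThreadPool(4, runnable -> {
        Thread thread = new Thread(runnable, "mail-worker");
        thread.setDaemon(true);
        return thread;
    });

    // Simulated blocking send; reports which thread did the work.
    private String sendBlocking(String recipient) {
        try {
            Thread.sleep(50);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return recipient + " via " + Thread.currentThread().getName();
    }

    // The second argument to supplyAsync selects the dedicated pool instead of the common pool.
    CompletableFuture<String> sendAsync(String recipient) {
        return CompletableFuture.supplyAsync(() -> sendBlocking(recipient), mailPool);
    }

    void shutdown() {
        mailPool.shutdown();
    }
}
```

With a fixed pool, a burst of mail tasks queues up behind the four workers rather than starving unrelated work; the pool size is a tuning knob, not a recommendation.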
To answer the other question, now Play 2.5 uses CompletionStage for the default promise. It should work if you just use CompletionStage.
Some example code here. Note the use of CompletionStage in the return type.
public CompletionStage<Result> testAction() {
return CompletableFuture
.supplyAsync(() -> EmailService.sendVerificationMail(appUser, mailString), EmailService.getExecutor())
.thenApply(i -> ok("Got result: " + i));
}
For more details, you may check the Java Migration Guide on Play's site.
import java.util.concurrent.CompletableFuture;
public static CompletableFuture<Result> asynchronousProcessTask() {
    final CompletableFuture<Boolean> promise = CompletableFuture
        .supplyAsync(() -> Locate365Util.doTask());
    // Build the response from the completed value i, not from the future itself.
    return promise.thenApplyAsync(
        (final Boolean i) -> ok("The result of promise: " + i));
}
Note: the doTask() method must return a boolean value.
