I want to have two kinds of connection retry with different delays (I use Redis running on Kubernetes and the Jedis client).
Is it possible to do it like this, using retryWhen twice in the same chain? If not, how should it be done? My unit tests show that it does not work correctly (I think only the first retryWhen is called, not the second). Where is the bug?
public Observable<String> getMessagesFromChannel(String channel) {
return PublishSubject.<String>fromPublisher(
subscriber ->
config.executor().execute(() ->
bindRedisSubscriptionToPublisher(channel,subscriber)))
.retryWhen(t -> t.take(config.numberOfReconnectWhenNetworkIsBreak()).delay(1, TimeUnit.SECONDS))
.doOnError(ex ->
log.trace("Trying to subscribe " + config.numberOfReconnect() + " times without success"))
.retryWhen(
ts -> ts.doOnEach(
ex -> {
log.warn(
"Redis instance down. Error while subscribing, trying to reconnect in {} ms",
config.reconnectTimeAfterRedisInstanceDown().toMillis(),
ex.getValue());
})
.delay(config.reconnectTimeAfterRedisInstanceDown().toMillis(), MILLISECONDS, config.scheduler()));
}
I'm using WebFlux to handle my HTTP requests. As a side effect of the processing I want to add a record to the database, but I do not want to hold up processing of the user's request to do that.
Somewhere in the main application flow:
@GetMapping
Flux<Data> getHandler(){
return doStuff().doOnNext(data -> dataStore.store(data));
}
In different class I have
class DataStore {
private static final Logger LOGGER = LoggerFactory.getLogger(DataStore.class);
private DataRepository repository;
private Scheduler scheduler;
private Sinks.Many<Data> sink;
public DataStore(DataRepository repository, Scheduler scheduler) {
this.repository = repository;
this.scheduler = scheduler; //will be boundedElastic in production
this.sink = Sinks.many().replay().limit(1000); //buffer size
//build hot flux
this.sink.asFlux()
.map(data -> repository.save(data))
// retry strategy for random issues with DB connection
.retryWhen(Retry.backoff(maxRetry, backoffDuration)
.doBeforeRetry(signal -> LOGGER.warn("Retrying to save, attempt {}", signal.totalRetries())))
// give up on saving this item, drop it, try with another one, reset backoff strategy in the meantime
.onErrorContinue(Exceptions::isRetryExhausted, (e, o) -> LOGGER.error("Dropping data"))
.subscribeOn(scheduler, true)
.subscribe(
data-> LOGGER.info("Data {} saved.", data),
error -> LOGGER.error("Fatal error. Terminating store flux.", error)
);
}
public void store(Data data) {
sink.tryEmitNext(data);
}
}
But when writing tests for it I noticed that if the backoff reaches its limit, the flux just stops instead of dropping the data and continuing.
@BeforeEach
public void setup() {
repository = mock(DataRepository.class);
dataStore = new DataStore(repository, Schedulers.immediate()); //maxRetry = 4, backoffDuration = Duration.ofMillis(1)
}
@Test
public void test() throws Exception {
//given
when(repository.save(any()))
.thenThrow(new RuntimeException("fail")) // normal store
.thenThrow(new RuntimeException("fail")) // first retry
.thenThrow(new RuntimeException("fail")) // second retry
.thenThrow(new RuntimeException("fail")) // third retry
.thenThrow(new RuntimeException("fail")) // fourth retry -> should drop data("One")
.thenAnswer(invocation -> invocation.getArgument(0)) //store data("Two")
.thenAnswer(invocation -> invocation.getArgument(0));//store data("Three")
//when
dataStore.store(data("One")); //exhaust 5 retries
dataStore.store(data("Two")); //successful store
dataStore.store(data("Three")); //successful store
//then
Thread.sleep(2000); //overkill sleep
verify(repository, times(7)).save(any()); //assertion fails: data Two and Three were not saved.
}
When running this test my assertion fails and in the logs I can see only
Retrying to save, attempt 0
Retrying to save, attempt 1
Retrying to save, attempt 2
Retrying to save, attempt 3
Dropping data
And there is no info of successful processing of data Two and Three.
I do not want to retry indefinitely, because I assume that the DB connection may fail from time to time and I do not want a buffer overflow.
I know that I can achieve a similar flow without a flux (using a queue etc.), but the built-in retry with backoff is very tempting.
How can I drop the error from the flux, given that onErrorContinue does not seem to be working?
General note - the code in the above question isn't used in a reactive context, and therefore this answer suggests options that would be completely wrong if using WebFlux or a similar reactive environment.
Firstly, note that onErrorContinue() is almost certainly not what you want - not just in this situation, but generally. It's a specialist operator that almost certainly doesn't quite do what you think it does.
Usually, I'd balk at this line:
.map(data -> repository.save(data))
...as it implies your repository isn't reactive, so you're blocking in a reactive chain - a complete no-no. In this case because you're using it purely for the convenience of the retry semantics it's not going to cause issues, but bear in mind most people used to seeing reactive code will get scared when they see stuff like this!
If you're able to use a reactive repository, that would be better, and then you'd expect to see something like this:
.flatMap(data -> repository.save(data))
...implying that the save() method is returning a non-blocking Mono rather than a plain value after blocking. The norm with retrying would then be to retry on the inner publisher, resuming on an empty publisher if retries are exhausted:
.flatMap(data -> repository.save(data)
.retryWhen(Retry.backoff(maxRetry, backoffDuration)
.doBeforeRetry(signal -> LOGGER.warn("Retrying to save, attempt {}", signal.totalRetries())))
.onErrorResume(Exceptions::isRetryExhausted, e -> Mono.empty())
)
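If you're not able or willing to use a reactive repository, then in this case you can still achieve the above by wrapping the call as Mono.fromCallable(() -> repository.save(data)), which defers the blocking call until subscription so that each retry invokes save() again. (Mono.just(repository.save(data)) would run save() eagerly, before the inner Mono exists, so the inner retry would never see the failure.) But again, that's a bit of a code smell, and completely forbidden in a standard reactive chain, as you're making something "look" reactive when it's not.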
I have a situation in which I want to implement an API retry mechanism.
Let's say I have an API that calls a third-party API whose normal response time is under 2 seconds, but sometimes we get an error such as "Service Not Available", "Gateway Timeout", etc.
So I went online to see if there is a library to handle these things and I found https://jodah.net/failsafe/
Purpose of using the library:
If I don't get the result within 5 seconds, I want to cancel the execution of the current call and try one more time.
For that, I can see the library has timeout and retry policies.
First I am trying the timeout.
Timeout<Object> timeout = Timeout.of(Duration.ofMillis(1000)).withCancel(true)
.onFailure(e -> logger.error("Connection attempt timed out {} {} ",new DateTime(), e.getFailure()))
.onSuccess(e -> logger.info("Execution completed on time"));
try {
logger.info("TIme: {}", new DateTime());
result = Failsafe.with(timeout).get(() -> restTemplate.postForEntity(messageSendServiceUrl, request, String.class));
} catch (TimeoutExceededException | HttpClientErrorException e) {
logger.info("TIme: {}", new DateTime());
logger.error("Timeout exception", e);
} catch (Exception e) {
logger.error("Exception", e);
}
But when I measure the time, I get a delay of about 20 seconds between calling the API and receiving TimeoutExceededException, when it should be 1 second since the duration is Duration.ofMillis(1000). Below you can see a difference of 21 seconds.
TIme: 2020-06-11T10:00:17.964+05:30
Connection attempt timed out 2020-06-11T10:00:39.037+05:30 {}
Can you please let me know what I am doing wrong here?
Second is the retry policy.
RetryPolicy<Object> retryPolicy = new RetryPolicy<>()
.handle(HttpClientErrorException.class, TimeoutExceededException.class, Exception.class)
.withDelay(Duration.ofSeconds(1))
.withMaxRetries(3);
What I want is this: once a TimeoutExceededException occurs (after, let's say, 3 seconds), the request is fired again after a delay of 1 second, with a maximum of 3 retries.
I am using it as
result = Failsafe.with(retryPolicy,timeout).get(() -> restTemplate.postForEntity(messageSendServiceUrl, request, String.class));
get is a blocking, synchronous operation that runs on the calling thread, so there is virtually no way for Failsafe to stop it early. Timeout is best used in conjunction with an asynchronous operation, usually indicated by the *Async methods. Make sure you read https://jodah.net/failsafe/schedulers/ because the default scheduler has some implications and is usually a poor choice for IO-bound operations.
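To illustrate why the timeout only takes effect when the work runs on another thread, here is a minimal sketch that uses plain java.util.concurrent rather than Failsafe. The class and method names (TimeoutRetrySketch, callWithTimeoutAndRetry) are made up for this example, and the simulated call in main stands in for the restTemplate request. Note that cancel(true) only interrupts the worker; a blocking HTTP call that ignores interruption will keep running in the background until its own socket timeout fires.

```java
import java.time.Duration;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class TimeoutRetrySketch {

    /**
     * Runs the task on the executor and waits at most `timeout` per attempt.
     * After a timeout or failure it sleeps for `delay` and tries again,
     * making up to `maxRetries` additional attempts.
     */
    static <T> T callWithTimeoutAndRetry(ExecutorService executor, Callable<T> task,
                                         Duration timeout, Duration delay, int maxRetries) {
        if (maxRetries < 0) {
            throw new IllegalArgumentException("maxRetries must be >= 0");
        }
        RuntimeException last = null;
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            Future<T> future = executor.submit(task);
            try {
                // The calling thread only waits here; the work runs on a pool thread.
                return future.get(timeout.toMillis(), TimeUnit.MILLISECONDS);
            } catch (TimeoutException e) {
                // Interrupts the worker; effective only if the task responds to interruption.
                future.cancel(true);
                last = new RuntimeException("attempt " + attempt + " timed out", e);
            } catch (ExecutionException e) {
                last = new RuntimeException("attempt " + attempt + " failed", e.getCause());
            } catch (InterruptedException e) {
                future.cancel(true);
                Thread.currentThread().interrupt();
                throw new RuntimeException("interrupted while waiting", e);
            }
            if (attempt < maxRetries) {
                try {
                    Thread.sleep(delay.toMillis());
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new RuntimeException("interrupted during retry delay", e);
                }
            }
        }
        throw last; // every attempt timed out or failed
    }

    public static void main(String[] args) {
        ExecutorService executor = Executors.newCachedThreadPool();
        try {
            String result = callWithTimeoutAndRetry(executor, () -> {
                Thread.sleep(50); // simulated remote call
                return "response";
            }, Duration.ofSeconds(1), Duration.ofSeconds(1), 3);
            System.out.println(result);
        } finally {
            executor.shutdownNow();
        }
    }
}
```

With Failsafe itself, the equivalent move would be to switch from get to one of the *Async methods so that the timeout policy is not tied to the calling thread.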
I ran this:
Mono<Void> mono = Mono.empty();
System.out.println("mono.block: " + mono.block());
and it produces:
mono.block: null
as expected. In other words, calling block will return immediately if the Mono already completed.
Another example, resembling the real-world scenario. I have a source flux, e.g.:
Flux<Integer> ints = Flux.range(0, 2);
I make a connectable flux that I will use to allow multiple subscribers:
ConnectableFlux<Integer> publish = ints.publish();
For this example, let's say there's a single real-work subscriber:
publish
.doOnComplete(() -> System.out.println("publish completed"))
.subscribe();
and another subscriber that just produces the element count:
Mono<Long> countMono = publish
.doOnComplete(() -> System.out.println("countMono completed"))
.count();
countMono.subscribe();
I connect the connectable flux and print the element count:
publish.connect();
System.out.println("block");
long count = countMono.block();
System.out.println("count: " + count);
This prints:
publish completed
countMono completed
block
In other words, both subscribers subscribe successfully and complete, but then countMono.block() blocks indefinitely.
Why is that and how do I make this work? My end goal is to get the count of the elements.
You can get this to work by using autoConnect or refCount instead of manually calling connect().
For example:
Flux<Integer> ints = Flux.range(0, 2);
Flux<Integer> publish = ints.publish()
.autoConnect(2); // new
publish
.doOnComplete(() -> System.out.println("publish completed"))
.subscribe();
Mono<Long> countMono = publish
.doOnComplete(() -> System.out.println("countMono completed"))
.count();
// countMono.subscribe();
long count = countMono.block();
System.out.println("count: " + count);
Why does your example not work?
Here is what I think is happening in your example... but this is based on my limited knowledge, and I'm not 100% sure it is correct.
1. .publish() turns the upstream source into a hot stream.
2. You then subscribe twice (but these don't start the flow yet, since the connectable flux is not connected to the upstream yet).
3. .connect() subscribes to the upstream, and starts the flow.
4. The upstream, and the two subscriptions that were registered before connect(), complete (since this is all happening in the main thread).
5. At this point the ConnectableFlux is no longer connected to the upstream, because the upstream has completed. (The Reactor docs are light on details on what happens to a ConnectableFlux when new subscriptions arrive after the upstream source completes, so this is what I'm not 100% certain about.)
6. block() creates a new subscription.
7. But since the ConnectableFlux is no longer connected, no data is flowing.
8. If you were to call connect() again (from another thread, since the main thread is blocked), data would flow again, and the block() would complete. However, this would be a new sequence (not the original sequence that completed in step 4).
Why does my example work?
Only two subscriptions are created (instead of 3 in your example), one from a .subscribe() call, and one from .block(). The ConnectableFlux auto connects after 2 subscriptions, and therefore the block() subscription completes. Both subscriptions share the same upstream sequence.
I have an Observable (which obtains data from network).
The problem is that observable can be fast or slow depending on network conditions.
I show a progress widget while the observable is executing, and hide it when the observable completes. When the network is fast, the progress widget flickers (appears and disappears). I want to set the minimum execution time of the observable to 1 second. How can I do that?
The "delay" operator is not an option because it would add the delay even when the network is slow.
You can use Observable.zip() for that. Given
Observable<Response> network = ...
One can do
Observable<Integer> readyNotification = Observable.just(42).delay(1, TimeUnit.SECONDS);
Observable delayedNetwork = network.zipWith(readyNotification,
(response, notUsed) -> response);
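The idea behind the zip approach is not specific to RxJava: pair the real work with a timer and emit only when both have finished. As a rough illustration, here is the same idea sketched with plain CompletableFuture (Java 9+ for delayedExecutor). This is a swapped-in technique, not the RxJava code above, and the names MinDurationSketch and withMinimumDuration are made up for the example.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.TimeUnit;

class MinDurationSketch {

    /**
     * Returns a future that yields the result of `work`, but not before
     * `minMillis` milliseconds have passed since this method was called.
     */
    static <T> CompletableFuture<T> withMinimumDuration(CompletableFuture<T> work, long minMillis) {
        // Timer future: completes with no value once the delay has elapsed.
        Executor delayed = CompletableFuture.delayedExecutor(minMillis, TimeUnit.MILLISECONDS);
        CompletableFuture<Void> timer = CompletableFuture.runAsync(() -> { }, delayed);
        // Like zipWith: wait for both, keep only the result of the real work.
        return work.thenCombine(timer, (result, ignored) -> result);
    }

    public static void main(String[] args) {
        CompletableFuture<String> fastNetwork = CompletableFuture.supplyAsync(() -> "response");
        long start = System.nanoTime();
        String value = withMinimumDuration(fastNetwork, 1000).join();
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println(value + " after " + elapsedMs + " ms");
    }
}
```

If the work takes longer than the minimum, the timer has already fired by the time the work finishes, so no extra delay is added.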
Use Observable.concatEager()
It allows you to force one stream to complete after another (concat operator), but also kick off the network request immediately without having to wait for the first argument observable to complete (concatEager):
Observable<Response> responseObservable = ...;
Observable<Response> responseWithMinDelay = Observable.concatEager(
Observable.timer(1, TimeUnit.SECONDS).ignoreElements(),
responseObservable
).cast(Response.class);
It looked like Observable.zip would be a reasonable approach, and it seemed to work well until there was an error emitted; then it didn't wait for the expected time.
This seemed to work well for me:
Observable.mergeDelayError(
useCase.execute(), // can return Unit or throw error
Observable.timer(1, TimeUnit.SECONDS)
)
.reduce { _, _ -> Unit }
.doOnError { /* will wait at least 1 second */ }
.subscribe { /* will wait at least 1 second */ }
I wrote the following code to send an email as a non-blocking action.
It does not work for more than one request.
CompletableFuture.supplyAsync(() ->
EmailService.sendVerificationMail(appUser , mailString)).
thenApply(i -> ok("Got result: " + i));
play.Promise is deprecated in Play 2.5 (Java), so my previous code is no longer supported. Please suggest a proper way to make my action non-blocking.
If the function EmailService.sendVerificationMail is blocking, CompletableFuture only makes it non-blocking for the calling thread. It still blocks another thread (by default, one from the common ForkJoinPool).
This is not a problem if only a few email tasks are running. But if there are too many email tasks (say 100 or more), they will "dominate" the pool. This causes a "convoy effect", where other tasks have to wait much longer to start, and it can badly damage server performance.
If you have a lot of concurrent email tasks, you can create your own pool to handle them instead of using the common pool. A regular thread pool is a better fit than a fork-join pool here, because blocking tasks gain nothing from work-stealing.
Or you can find the asynchronous APIs of EmailService, or implement them on your own if possible.
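As a rough sketch of the dedicated-pool option: the blocking send is handed to a fixed-size pool reserved for email work, so it cannot starve the common pool. Everything here is illustrative: EmailPoolSketch, EMAIL_POOL, sendBlocking and sendAsync are hypothetical names, the pool size of 4 is arbitrary, and sendBlocking merely simulates EmailService.sendVerificationMail.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class EmailPoolSketch {

    // Dedicated pool for blocking email work; daemon threads so the JVM can exit.
    private static final ExecutorService EMAIL_POOL = Executors.newFixedThreadPool(4, runnable -> {
        Thread thread = new Thread(runnable, "email-worker");
        thread.setDaemon(true);
        return thread;
    });

    // Stand-in for a blocking call such as EmailService.sendVerificationMail.
    static boolean sendBlocking(String recipient) {
        try {
            Thread.sleep(50); // simulated SMTP round trip
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    // The blocking call runs on EMAIL_POOL, not on the caller or the common pool.
    static CompletableFuture<String> sendAsync(String recipient) {
        return CompletableFuture
                .supplyAsync(() -> sendBlocking(recipient), EMAIL_POOL)
                .thenApply(sent -> "Got result: " + sent);
    }

    public static void main(String[] args) {
        System.out.println(sendAsync("user@example.com").join());
    }
}
```

When all four workers are busy, further tasks wait in the pool's queue, which bounds the number of threads tied up by email sending.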
To answer the other question: Play 2.5 now uses CompletionStage as its default promise type, so it should work if you just use CompletionStage.
Here is some example code. Note the use of CompletionStage in the return type.
public CompletionStage<Result> testAction() {
return CompletableFuture
.supplyAsync(() -> EmailService.sendVerificationMail(appUser, mailString), EmailService.getExecutor())
.thenApply(i -> ok("Got result: " + i));
}
For more details, you may check the Java Migration Guide on Play's site.
import java.util.concurrent.CompletableFuture;
public static CompletableFuture<Result> asynchronousProcessTask() {
final CompletableFuture<Boolean> promise = CompletableFuture
.supplyAsync(() -> Locate365Util.doTask());
return promise.thenApplyAsync(
(final Boolean i) -> ok("The result of promise: " + i)); // print the result value, not the future object
}
Note: the doTask() method must return a boolean value.