Reactor - how to retry on hot flux without dropping elements? - java

I have an infinite hot flux of data. I want to perform an operation on each element in the stream; each operation returns a Mono that completes (one way or another) after some finite time.
There is the possibility of an error being thrown from these operations. If so, I want to resubscribe to the hot flux without missing anything, retrying elements that were in the middle of being processed when the error was thrown (i.e. anything that did not complete successfully).
What do I do here? I can tolerate repeated operations on the same elements, but not losing elements entirely from the stream.
I've attempted to use a ReplayProcessor to handle this, but I can't see a way of making it work without either repeating a lot of operations that might well have succeeded (using a very conservative timeout) or losing elements because new elements overwrite old ones in the buffer (as below).
Test case:
@Test
public void fluxTest() {
    List<String> strings = new ArrayList<>();
    strings.add("one");
    strings.add("two");
    strings.add("three");
    strings.add("four");
    ConnectableFlux<String> flux = Flux.fromIterable(strings).publish();
    // Goes boom after three uses of its method, otherwise
    // returns a Mono completing after a little time
    DangerousClass dangerousClass = new DangerousClass(3);
    ReplayProcessor<String> replay = ReplayProcessor.create(3);
    flux.subscribe(replay);
    replay.flatMap(dangerousClass::doThis)
            .retry(1)
            .doOnNext(s -> LOG.info("Completed {}", s))
            .subscribe();
    flux.connect();
    flux.blockLast();
}
public class DangerousClass {
    Logger LOG = LoggerFactory.getLogger(DangerousClass.class);
    private int boomCount;
    private AtomicInteger count;
    public DangerousClass(int boomCount) {
        this.boomCount = boomCount;
        this.count = new AtomicInteger(0);
    }
    public Mono<String> doThis(String s) {
        return Mono.fromSupplier(() -> {
            LOG.info("doing dangerous {}", s);
            if (count.getAndIncrement() == boomCount) {
                LOG.error("Throwing exception from {}", s);
                throw new RuntimeException("Boom!");
            }
            return s;
        }).delayElement(Duration.ofMillis(600));
    }
}
This prints:
doing dangerous one
doing dangerous two
doing dangerous three
doing dangerous four
Throwing exception from four
doing dangerous two
doing dangerous three
doing dangerous four
Completed four
Completed two
Completed three
The element "one" is never completed.

The error (at least in the above example) can only occur in the flatMap(dangerousClass::doThis) call - so resubscribing to the root Flux and replaying elements when this one flatMap() call has failed seems a bit odd, and (probably) isn't what you want to do.
Instead, I'd recommend ditching the ReplayProcessor and just calling retry on the inner flatMap() call instead, so you end up with something like:
ConnectableFlux<String> flux = Flux.range(1, 10).map(n -> "Entry " + n).publish();
DangerousClass dangerousClass = new DangerousClass(3);
flux.flatMap(x -> dangerousClass.doThis(x).retry(1))
        .doOnNext(s -> System.out.println("Completed " + s))
        .subscribe();
flux.connect();
This will give you something like the following, with all entries completed and only the failed entry retried:
doing dangerous Entry 1
doing dangerous Entry 2
doing dangerous Entry 3
doing dangerous Entry 4
Throwing exception from Entry 4
doing dangerous Entry 4
Completed Entry 2
Completed Entry 1
Completed Entry 3
Completed Entry 4
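The difference between retrying the whole stream and retrying only the failed element can be modelled without Reactor. The following is a minimal plain-JDK sketch, not Reactor code: the class `PerElementRetry` and its helpers `callWithRetry`, `processAll` and `failingOnCall` are hypothetical names introduced for illustration, and the processing is sequential and blocking rather than asynchronous. `failingOnCall` mimics DangerousClass by failing on one specific call.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

final class PerElementRetry {

    /** Applies op to one element, retrying that element only (maxRetries must be >= 0). */
    static <T, R> R callWithRetry(T element, Function<T, R> op, int maxRetries) {
        RuntimeException last = null;
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            try {
                return op.apply(element);
            } catch (RuntimeException e) {
                last = e; // remember the failure and try the same element again
            }
        }
        throw last; // retries exhausted for this element
    }

    /** Processes every element once; a failure never causes earlier elements to be replayed. */
    static List<String> processAll(List<String> source, Function<String, String> op, int maxRetries) {
        List<String> completed = new ArrayList<>();
        for (String s : source) {
            completed.add(callWithRetry(s, op, maxRetries));
        }
        return completed;
    }

    /** Hypothetical stand-in for DangerousClass: throws on the call whose 0-based index is boomCount. */
    static Function<String, String> failingOnCall(int boomCount, AtomicInteger calls) {
        return s -> {
            if (calls.getAndIncrement() == boomCount) {
                throw new RuntimeException("Boom!");
            }
            return s;
        };
    }

    public static void main(String[] args) {
        AtomicInteger calls = new AtomicInteger();
        List<String> done = processAll(List.of("one", "two", "three", "four"),
                failingOnCall(3, calls), 1);
        System.out.println(done + " after " + calls.get() + " calls");
    }
}
```

Because the retry wraps the single call rather than the source, no element is dropped and no successful element is repeated, which is the same property the inner `retry(1)` gives in the Reactor version.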

Related

Project Reactor timeout

I am working on a project reactor workshop and am stuck with the following task:
/**
 * TODO 5
 * <p>
 * For each item call received in colors flux call the {@link #simulateRemoteCall} operation.
 * Timeout in case the {@link #simulateRemoteCall} does not return within 400 ms, but retry twice
 * If still no response then provide "default" as a return value
 */
The problem I can't wrap my head around is that the Flux never actually throws the TimeoutException! I can observe this in the console log:
16:05:09.759 [main] INFO Part04HandlingErrors - Received red delaying for 300
16:05:09.781 [main] INFO Part04HandlingErrors - Received black delaying for 500
16:05:09.782 [main] INFO Part04HandlingErrors - Received tan delaying for 300
I tried changing the order of the statements, but it didn't seem to change the behaviour. Note: I also tried the overloaded variant of timeout() that accepts a fallback to switch to if no element is emitted in time.
public Flux<String> timeOutWithRetry(Flux<String> colors) {
    return colors
            .timeout(Duration.ofMillis(400))
            //.timeout(Duration.ofMillis(400), Mono.just("default"))
            .retry(2)
            .flatMap(this::simulateRemoteCall)
            .onErrorReturn(TimeoutException.class, "default");
}
Can someone clear up why the timeout doesn't occur? I suspect that the mechanism is somehow not "bound" to the method invoked by flatMap.
For completeness: The helper method:
public Mono<String> simulateRemoteCall(String input) {
    int delay = input.length() * 100;
    return Mono.just(input)
            .doOnNext(s -> log.info("Received {} delaying for {} ", s, delay))
            .map(i -> "processed " + i)
            .delayElement(Duration.of(delay, ChronoUnit.MILLIS));
}
More completeness, this is the test I am given to verify the functionality:
@Test
public void timeOutWithRetry() {
    Flux<String> colors = Flux.just("red", "black", "tan");
    Flux<String> results = workshop.timeOutWithRetry(colors);
    StepVerifier.create(results).expectNext("processed red", "default", "processed tan").verifyComplete();
}
The answer of Martin Tarjányi is correct, but you also asked why in your code
return colors
        .timeout(Duration.ofMillis(400))
        //.timeout(Duration.ofMillis(400), Mono.just("default"))
        .retry(2)
        .flatMap(this::simulateRemoteCall)
        .onErrorReturn(TimeoutException.class, "default");
no timeout occurs.
The reason is that when the elements of the colors flux are immediately available, invoking .timeout(Duration.ofMillis(400)) has no effect: timeout only propagates a TimeoutException if no item is emitted within the given duration of 400ms, and that is not the case in this example.
As a consequence, the element is emitted and retry(2) has no effect either. Next, you invoke simulateRemoteCall on the emitted element, which takes some time but does not return an error. The result of your code is (apart from timing differences) the same as if you simply applied a map to the given flux:
public Flux<String> timeOutWithRetry(Flux<String> colors) {
    return colors.map(s -> "processed " + s);
}
If you want to see a timeout on invocation of simulateRemoteCall then you must add the timeout method after this invocation.
Instead of using flatMap you could also use concatMap. The difference is whether the order should be preserved or not, i.e. whether the default values may occur out of order or not.
Using concatMap the answer looks as follows:
public Flux<String> timeOutWithRetry(Flux<String> colors) {
    return colors.concatMap(
            color -> simulateRemoteCall(color)
                    .timeout(Duration.ofMillis(400))
                    .retry(2)
                    .onErrorReturn("default"));
}
You're right in that it's the order and place of the statements that is incorrect.
Since you want to retry/timeout/error-handle the remote call, you should put these operators on the Mono of the remote call instead of the Flux.
Timeout on Flux observes the time elapsed between subsequent elements. However, when you use flatMap you get concurrency out of the box, and the delay between elements will be practically zero (assuming the colors Flux is sourced from an in-memory list). So this operator should not be put directly on the Flux to achieve your goal.
Retry on Flux means it resubscribes to the source in case of error, which depending on the source could result in re-processing already processed elements. Instead, you want to retry failed elements only, so it also should be put on Mono.
public Flux<String> timeOutWithRetry(Flux<String> colors) {
    return colors.flatMap(color -> simulateRemoteCall(color)
            .timeout(Duration.ofMillis(400))
            .retry(2)
            .onErrorReturn("default"));
}
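The placement rule from both answers (timeout, retry and fallback belong on each individual call, not on the sequence) can also be sketched with plain JDK futures. This is a blocking model under stated assumptions, not Reactor code: the class `TimeoutPerCall` and the method `callWithTimeout` are hypothetical, and `simulateRemoteCall` here is a CompletableFuture stand-in that keeps the "100 ms per character" delay from the question.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class TimeoutPerCall {

    /** Stand-in for the remote call: completes after input.length() * 100 ms. */
    static CompletableFuture<String> simulateRemoteCall(String input) {
        long delayMs = input.length() * 100L;
        return CompletableFuture.supplyAsync(
                () -> "processed " + input,
                CompletableFuture.delayedExecutor(delayMs, TimeUnit.MILLISECONDS));
    }

    /** Waits at most timeoutMs per attempt, makes up to 1 + retries attempts, then falls back. */
    static String callWithTimeout(String input, long timeoutMs, int retries) {
        for (int attempt = 0; attempt <= retries; attempt++) {
            try {
                // The timeout applies to this one call, not to the gap between source elements.
                return simulateRemoteCall(input).get(timeoutMs, TimeUnit.MILLISECONDS);
            } catch (TimeoutException e) {
                // This attempt was too slow: start a fresh call for the same element.
            } catch (ExecutionException e) {
                break; // the call itself failed: use the fallback
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return "default";
    }

    public static void main(String[] args) {
        for (String color : new String[] {"red", "black", "tan"}) {
            System.out.println(callWithTimeout(color, 400, 2));
        }
    }
}
```

Unlike the Reactor operators, a timed-out attempt in this sketch is abandoned rather than cancelled, so it only illustrates where the timeout and retry are attached.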

Limit for `onErrorContinue(...)` in Flux?

I have a (possibly infinite) Flux source that is supposed to first store each message (e.g. into a database) and then asynchronously forward the messages (e.g. using Spring WebClient).
The forward(s) in case of failure are supposed to log an error, without completing the source Flux.
However, I realized that the forward(s) within the flow (flatMap(...)) block execution of the source Flux after exactly 256 messages that cause exceptions (e.g. reactor.retry.RetryExhaustedException).
Representative example that fails in the assert since only 256 messages are processed:
@Test
@SneakyThrows
public void sourceBlockAfter256Exceptions() {
    int numberOfRequests = 500;
    Set<Integer> sink = new HashSet<>();
    Flux
            .fromStream(IntStream.range(0, numberOfRequests).boxed())
            .map(sink::add)
            .flatMap(i -> Mono
                    // normally the forwards are contained here e.g. by means of Mono.when(...).thenReturn(...).retryWhen(...):
                    .error(new Exception("any"))
            )
            .onErrorContinue((throwable, o) -> log.error("Error", throwable))
            .subscribe();
    Thread.sleep(3000);
    Assertions.assertEquals(numberOfRequests, sink.size());
}
Doing the forward within subscribe(...) doesn't block the source Flux, but that is not a solution, since I don't want to risk losing messages.
Questions:
What has happened here? (probably related to some state stored in just one bit)
How can I do this correctly?
EDIT:
Following the discussion below, I've constructed an example that uses FluxMessageChannel (which, to my understanding, is made for infinite streams and is definitely not expected to block after 256 errors) and shows exactly the same behaviour:
@Test
@SneakyThrows
public void maxConnectionWithChannelTest() {
    int numberOfRequests = 500;
    Set<Integer> sink = new HashSet<>();
    FluxMessageChannel fluxMessageChannel = MessageChannels.flux().get();
    fluxMessageChannel.subscribeTo(
            Flux
                    .fromStream(IntStream
                            .range(0, numberOfRequests).boxed()
                            .map(i -> MessageBuilder.withPayload(i).build())
                    )
                    .map(Message::getPayload)
                    .map(sink::add)
                    .flatMap(i -> Mono.error(new Exception("whatever")))
    );
    Flux
            .from(fluxMessageChannel)
            .subscribe();
    Thread.sleep(3000);
    Assert.assertEquals(numberOfRequests, sink.size());
}
EDIT:
I just raised an issue in the reactor core project: https://github.com/reactor/reactor-core/issues/2011

How to limit Single.Zip parallelism?

I am making several HTTP requests, waiting for all of them to complete, and then calculating the result from the information in all the responses (and several other sources).
Currently I am doing it like this:
Single.zip(observables, { array -> array })
Where observables is just an array of observables, each of them doing an async operation.
But I have a limit on how many operations I can run concurrently. There should be no more than n operations at the same time (n is ideally 5, but 1 is acceptable too).
Unfortunately, zip seems to start all the operations without waiting for any of them to complete. Is there a way to limit this behavior?
Maybe you could use a combination of the window() operator and zip()?
Something like this:
public static void main(String[] args) {
    Flowable<Integer>[] flowables = new Flowable[] {
            Flowable.just(1), Flowable.just(2), Flowable.just(3), Flowable.just(4), Flowable.just(5),
            Flowable.just(6), Flowable.just(7), Flowable.just(8), Flowable.just(9)
    };
    Flowable.fromArray(flowables)
            .window(5)
            .flatMap(f -> Flowable.zip(f, objects -> Arrays.stream(objects).map(Object::toString).collect(joining("")))
                    .flatMapSingle(Single::just))
            .subscribe(s -> System.out.println("received: " + s));
    Flowable.timer(10, SECONDS) // Just to block the main thread for a while
            .blockingSubscribe();
}
The window() operator splits the flowables into a Flowable of Flowables, each of which emits at most 5 elements (the number of concurrent operations you want).
In this example, zip() just concatenates the given integers.
It will print:
received: 12345
received: 6789
I hope this helps.
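The underlying requirement (run many operations, never more than n at a time, and wait for all results) can also be modelled with a bounded thread pool. The following is a plain-JDK sketch, not RxJava code, and it swaps the window()/zip() technique for an ExecutorService: the class `BoundedParallel`, the method `runAll` and the `fakeRequest` helper (a hypothetical stand-in for an HTTP request) are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

final class BoundedParallel {

    /** Runs all tasks with at most maxConcurrent in flight; results keep the task order. */
    static <T> List<T> runAll(List<Callable<T>> tasks, int maxConcurrent) {
        ExecutorService pool = Executors.newFixedThreadPool(maxConcurrent);
        try {
            List<T> results = new ArrayList<>();
            // invokeAll blocks until every task has finished and returns futures in task order.
            for (Future<T> future : pool.invokeAll(tasks)) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a task failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    /** Hypothetical stand-in for a request: sleeps briefly and records the peak concurrency. */
    static Callable<Integer> fakeRequest(int id, AtomicInteger inFlight, AtomicInteger peak) {
        return () -> {
            int now = inFlight.incrementAndGet();
            peak.accumulateAndGet(now, Math::max);
            try {
                Thread.sleep(20);
                return id;
            } finally {
                inFlight.decrementAndGet();
            }
        };
    }

    public static void main(String[] args) {
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        List<Callable<Integer>> tasks = new ArrayList<>();
        for (int i = 1; i <= 9; i++) {
            tasks.add(fakeRequest(i, inFlight, peak));
        }
        System.out.println(runAll(tasks, 5) + ", peak concurrency " + peak.get());
    }
}
```

One difference from the windowed approach: a fixed pool starts the next task as soon as any slot frees up, whereas window(5) waits for a whole group of five to finish before the next group is zipped.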

Limiting rate of requests with Reactor

I'm using Project Reactor to load data from a web service over REST. This is done in parallel with multiple threads. I'm starting to hit rate limits on the web service, so I would like to send at most 10 requests per second to avoid these errors. How would I do that using Reactor?
Using zipWith(Mono.delayMillis(100))? Or is there some better way?
Thank you
You can use delayElements instead of the whole zipWith.
You could use Flux.delayElements to process a batch of 10 requests every second. Be aware, though, that if processing takes longer than 1s, the next batch still starts in parallel and is processed together with the previous one (and potentially many earlier ones)!
That's why I propose another solution in which a batch of 10 requests is still started every second but, if its processing takes longer than 1s, the next batch fails (see the overflow IllegalStateException). You could handle that failure so that overall processing continues, but I won't show that here to keep the example simple; onErrorResume is useful for handling the overflow IllegalStateException.
The code below does a GET on https://www.google.com/ at a rate of 10 requests per second. You will need additional changes to support the case where your server cannot process all 10 requests within 1s; for example, you could skip sending requests while those from the previous second are still being processed.
@Test
void parallelHttpRequests() {
    // this is just for limiting the test running period otherwise you don't need it
    int COUNT = 2;
    // use whatever (blocking) http client you desire;
    // when using e.g. WebClient (Spring, non blocking client)
    // the example will change slightly, as it no longer needs
    // subscribeOn(Schedulers.elastic())
    RestTemplate client = new RestTemplate();
    // exit, lock, condition are provided to allow one to run
    // all this code in a @Test, otherwise they won't be needed
    var exit = new AtomicBoolean(false);
    var lock = new ReentrantLock();
    var condition = lock.newCondition();
    MessageFormat message = new MessageFormat("#batch: {0}, #req: {1}, resultLength: {2}");
    Flux.interval(Duration.ofSeconds(1L))
            .take(COUNT) // this is just for limiting the test running period otherwise you don't need it
            .doOnNext(batch -> debug("#batch", batch)) // just for debugging
            .flatMap(batch -> Flux.range(1, 10) // 10 requests per 1 second
                    .flatMap(i -> Mono.fromSupplier(() ->
                            client.getForEntity("https://www.google.com/", String.class).getBody()) // your request goes here (1 of 10)
                            .map(s -> message.format(new Object[]{batch, i, s.length()})) // here the request's result will be the output of message.format(...)
                            .doOnSubscribe(s -> debug("doOnSubscribe: #batch = " + batch + ", i = " + i)) // just for debugging
                            .subscribeOn(Schedulers.elastic()) // one I/O thread per request
                    )
            )
            // consider using onErrorResume to handle overflow IllegalStateException
            .subscribe(
                    s -> debug("received", s), // do something with the above request's result
                    e -> {
                        // pay special attention to overflow IllegalStateException
                        debug("error", e.getMessage());
                        signalAll(exit, condition, lock);
                    },
                    () -> {
                        debug("done");
                        signalAll(exit, condition, lock);
                    }
            );
    await(exit, condition, lock);
}
// you won't need the "await" and "signalAll" methods below which
// I created only to be easier for one to run this in a test class
private void await(AtomicBoolean exit, Condition condition, Lock lock) {
    lock.lock();
    while (!exit.get()) {
        try {
            condition.await();
        } catch (InterruptedException e) {
            // maybe spurious wakeup
            e.printStackTrace();
        }
    }
    lock.unlock();
    debug("exit");
}
private void signalAll(AtomicBoolean exit, Condition condition, Lock lock) {
    exit.set(true);
    try {
        lock.lock();
        condition.signalAll();
    } finally {
        lock.unlock();
    }
}
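The "at most 10 requests per second" requirement itself reduces to spacing request starts at least 100 ms apart. The following is a minimal plain-JDK sketch of that arithmetic, separate from the Reactor answer above and not a replacement for it: the class `FixedRateLimiter` and its methods `reserve` and `acquire` are hypothetical names. `reserve` takes the current time as a parameter so the spacing logic can be checked without sleeping.

```java
import java.util.concurrent.TimeUnit;

final class FixedRateLimiter {
    private final long intervalNanos;
    // Earliest time at which the next request may start; MIN_VALUE means "no request yet".
    private long nextFreeSlot = Long.MIN_VALUE;

    FixedRateLimiter(int permitsPerSecond) {
        this.intervalNanos = TimeUnit.SECONDS.toNanos(1) / permitsPerSecond;
    }

    /** Reserves the next start slot and returns how many nanoseconds the caller must wait. */
    synchronized long reserve(long nowNanos) {
        long start = Math.max(nowNanos, nextFreeSlot);
        nextFreeSlot = start + intervalNanos;
        return start - nowNanos;
    }

    /** Blocks the calling thread until its reserved slot arrives. */
    void acquire() throws InterruptedException {
        long waitNanos = reserve(System.nanoTime());
        if (waitNanos > 0) {
            TimeUnit.NANOSECONDS.sleep(waitNanos);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        FixedRateLimiter limiter = new FixedRateLimiter(10); // 10 per second -> 100 ms spacing
        for (int i = 1; i <= 5; i++) {
            limiter.acquire();
            System.out.println("request " + i + " started");
        }
    }
}
```

Each worker thread would call `acquire()` just before issuing its request. Note that this limits the start rate only; like the interval-based code above, it does not cap how many slow requests are in flight at once.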

Generate infinite sequence of Natural numbers using RxJava

I am trying to write a simple program using RxJava to generate an infinite sequence of natural numbers. So far I have found two ways to generate a sequence of numbers: Observable.timer() and Observable.interval(). I am not sure whether these functions are the right way to approach this problem. I was expecting a simple function like the one we have in Java 8 for generating infinite natural numbers.
IntStream.iterate(1, value -> value +1).forEach(System.out::println);
I tried using IntStream with Observable, but that does not work correctly: it sends the infinite stream of numbers only to the first subscriber. How can I correctly generate an infinite sequence of natural numbers?
import rx.Observable;
import rx.functions.Action1;
import java.util.stream.IntStream;
public class NaturalNumbers {
    public static void main(String[] args) {
        Observable<Integer> naturalNumbers = Observable.<Integer>create(subscriber -> {
            IntStream stream = IntStream.iterate(1, val -> val + 1);
            stream.forEach(naturalNumber -> subscriber.onNext(naturalNumber));
        });
        Action1<Integer> first = naturalNumber -> System.out.println("First got " + naturalNumber);
        Action1<Integer> second = naturalNumber -> System.out.println("Second got " + naturalNumber);
        Action1<Integer> third = naturalNumber -> System.out.println("Third got " + naturalNumber);
        naturalNumbers.subscribe(first);
        naturalNumbers.subscribe(second);
        naturalNumbers.subscribe(third);
    }
}
The problem is that on naturalNumbers.subscribe(first); the OnSubscribe you implemented is called, and it runs a forEach over an infinite stream on the calling thread. That call never returns, so the second and third subscriptions are never reached and your program never terminates.
One way to deal with this is to subscribe asynchronously, each subscription on a different thread. To make the results easy to see, I had to introduce a sleep into the Stream processing:
Observable<Integer> naturalNumbers = Observable.<Integer>create(subscriber -> {
    IntStream stream = IntStream.iterate(1, i -> i + 1);
    stream.peek(i -> {
        try {
            // Added to visibly see printing
            Thread.sleep(50);
        } catch (InterruptedException e) {
        }
    }).forEach(subscriber::onNext);
});
final Subscription subscribe1 = naturalNumbers
        .subscribeOn(Schedulers.newThread())
        .subscribe(first);
final Subscription subscribe2 = naturalNumbers
        .subscribeOn(Schedulers.newThread())
        .subscribe(second);
final Subscription subscribe3 = naturalNumbers
        .subscribeOn(Schedulers.newThread())
        .subscribe(third);
Thread.sleep(1000);
System.out.println("Unsubscribing");
subscribe1.unsubscribe();
subscribe2.unsubscribe();
subscribe3.unsubscribe();
Thread.sleep(1000);
System.out.println("Stopping");
Observable.Generate is exactly the operator to solve this class of problem reactively. I also assume this is a pedagogical example, since using an iterable for this is probably better anyway.
Your code produces the whole stream on the subscriber's thread. Since it is an infinite stream the subscribe call will never complete. Aside from that obvious problem, unsubscribing is also going to be problematic since you aren't checking for it in your loop.
You want to use a scheduler to solve this problem - certainly do not use subscribeOn since that would burden all observers. Schedule the delivery of each number to onNext - and as a last step in each scheduled action, schedule the next one.
Essentially this is what Observable.generate gives you - each iteration is scheduled on the provided scheduler (which defaults to one that introduces concurrency if you don't specify it). Scheduler operations can be cancelled and avoid thread starvation.
Rx.NET solves it like this (actually there is an async/await model that's better, but not available in Java afaik):
static IObservable<int> Range(int start, int count, IScheduler scheduler)
{
    return Observable.Create<int>(observer =>
    {
        return scheduler.Schedule(0, (i, self) =>
        {
            if (i < count)
            {
                Console.WriteLine("Iteration {0}", i);
                observer.OnNext(start + i);
                self(i + 1);
            }
            else
            {
                observer.OnCompleted();
            }
        });
    });
}
Two things to note here:
The call to Schedule returns a subscription handle that is passed back to the observer
The Schedule is recursive - the self parameter is a reference to the scheduler used to call the next iteration. This allows for unsubscription to cancel the operation.
Not sure how this looks in RxJava, but the idea should be the same. Again, Observable.generate will probably be simpler for you as it was designed to take care of this scenario.
When creating infinite sequences, care should be taken to:
subscribe and observe on different threads; otherwise you will only serve a single subscriber
stop generating values as soon as the subscription terminates; otherwise runaway loops will eat your CPU
The first issue is solved by using subscribeOn(), observeOn() and various schedulers.
The second issue is best solved by using the library-provided methods Observable.generate() or Observable.fromIterable(). They do the proper checking.
Check this:
Observable<Integer> naturalNumbers =
        Observable.<Integer, Integer>generate(() -> 1, (s, g) -> {
            logger.info("generating {}", s);
            g.onNext(s);
            return s + 1;
        }).subscribeOn(Schedulers.newThread());
Disposable sub1 = naturalNumbers
        .subscribe(v -> logger.info("1 got {}", v));
Disposable sub2 = naturalNumbers
        .subscribe(v -> logger.info("2 got {}", v));
Disposable sub3 = naturalNumbers
        .subscribe(v -> logger.info("3 got {}", v));
Thread.sleep(100);
logger.info("unsubscribing...");
sub1.dispose();
sub2.dispose();
sub3.dispose();
Thread.sleep(1000);
logger.info("done");
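The two rules from the answers (give each subscriber its own thread, and check for cancellation on every iteration) can be shown without any Rx library. The following is a hand-rolled plain-JDK sketch, not RxJava's generate(): the class `CancellableNaturals` and its methods `emitUntilCancelled` and `subscribe` are hypothetical names introduced for illustration.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.LongConsumer;

final class CancellableNaturals {

    /** Emits 1, 2, 3, ... and re-checks the flag before each value; returns how many were emitted. */
    static long emitUntilCancelled(AtomicBoolean cancelled, LongConsumer onNext) {
        long n = 0;
        while (!cancelled.get()) { // the check that the Observable.create loop in the question lacks
            n++;
            onNext.accept(n);
        }
        return n;
    }

    /** Starts an independent sequence on its own daemon thread and returns a cancel handle. */
    static Runnable subscribe(LongConsumer onNext) {
        AtomicBoolean cancelled = new AtomicBoolean(false);
        Thread producer = new Thread(() -> emitUntilCancelled(cancelled, onNext), "natural-numbers");
        producer.setDaemon(true);
        producer.start();
        return () -> cancelled.set(true);
    }

    public static void main(String[] args) throws InterruptedException {
        Runnable cancelFirst = subscribe(n -> { if (n <= 3) System.out.println("First got " + n); });
        Runnable cancelSecond = subscribe(n -> { if (n <= 3) System.out.println("Second got " + n); });
        Thread.sleep(100);
        cancelFirst.run();
        cancelSecond.run();
        System.out.println("Stopping");
    }
}
```

Because `subscribe` returns immediately, the second subscriber is served as well, and setting the flag ends each producer loop instead of leaving it spinning.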
