Deal efficiently with a lot of discarded objects - java

I have the following reactive stream where logs is Flux<String>:
logs.bufferTimeout(50, Duration.ofSeconds(20))
.doOnNext(logs -> sendLogs(logs))
.doOnDiscard(String.class, log -> sendLogs(List.of(log)));
The purpose of the stream is to send logs to another microservice for long-term storage. As you can see, the logs are buffered and sent in batches of 50. The intent is to limit the number of requests made to the second microservice.
For reasons that are not relevant to this question, the logs stream can be canceled at any time. Whenever this happens, I want all log messages that are still in the buffer to also be sent to the second microservice. The doOnDiscard operator I've added does just that, but perhaps not in the most performant/safe way. Let's say that there are 49 logs in the buffer at the time of cancellation. The lambda in doOnDiscard will then be called 49 times (once for each discarded log message), and this will result in 49 requests to the second microservice in a short amount of time.
You can reproduce this behavior by executing the following code:
public static void main(String[] args) throws InterruptedException {
Disposable disposable = Flux.interval(Duration.ofSeconds(1))
.bufferTimeout(10, Duration.ofSeconds(5))
.doOnNext(System.out::println)
.doOnDiscard(Long.class, i -> System.out.println("Discarded: " + i))
.subscribe();
Thread.sleep(8000);
disposable.dispose();
Thread.sleep(2000);
}
Is there any way to buffer the discarded elements and process them together? I've tried the following:
public static void main(String[] args) throws InterruptedException {
List<Long> discardedElements = new ArrayList<>();
Disposable disposable = Flux.interval(Duration.ofSeconds(1))
.bufferTimeout(10, Duration.ofSeconds(5))
.doOnNext(System.out::println)
.doOnDiscard(Long.class, discardedElements::add)
.doFinally(signal -> System.out.println("Discarded elements: " + discardedElements))
.subscribe();
Thread.sleep(8000);
disposable.dispose();
Thread.sleep(2000);
}
It does what I want, but I feel like using a variable outside of the stream for temporary storage is not the "cleanest" solution. Is there a better one? Is the pattern of using an external variable a "recommended" one?
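For comparison, here is a minimal plain-Java sketch of the behaviour being asked for: full batches are sent as they fill up, and whatever is left at cancellation goes out as one batch instead of one call per element. It does not use Reactor at all; CancelAwareBatcher and its sender callback (a stand-in for sendLogs) are hypothetical names, and the timeout part of bufferTimeout is deliberately left out. The point is only that the leftover state lives inside one object per stream rather than in a variable shared across subscriptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Plain-Java sketch: buffer items, send full batches, flush the remainder once on cancel. */
final class CancelAwareBatcher<T> {
    private final int batchSize;
    private final Consumer<List<T>> sender; // hypothetical stand-in for sendLogs(...)
    private final List<T> buffer = new ArrayList<>();
    private boolean cancelled;

    CancelAwareBatcher(int batchSize, Consumer<List<T>> sender) {
        this.batchSize = batchSize;
        this.sender = sender;
    }

    /** Adds one item; sends a batch as soon as batchSize items have accumulated. */
    synchronized void add(T item) {
        if (cancelled) {
            return; // items arriving after cancellation are ignored in this sketch
        }
        buffer.add(item);
        if (buffer.size() == batchSize) {
            flush();
        }
    }

    /** Cancels the batcher and sends whatever is still buffered as ONE batch. */
    synchronized void cancel() {
        if (!cancelled) {
            cancelled = true;
            flush();
        }
    }

    private void flush() {
        if (!buffer.isEmpty()) {
            sender.accept(List.copyOf(buffer));
            buffer.clear();
        }
    }

    /** Demo: 23 items with batch size 10, then cancel; returns the size of each batch sent. */
    static List<Integer> demo() {
        List<Integer> sentBatchSizes = new ArrayList<>();
        CancelAwareBatcher<Long> batcher =
                new CancelAwareBatcher<>(10, batch -> sentBatchSizes.add(batch.size()));
        for (long i = 0; i < 23; i++) {
            batcher.add(i);
        }
        batcher.cancel();
        return sentBatchSizes;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [10, 10, 3]
    }
}
```

With 23 items, the remainder of 3 is sent in a single call on cancel, which is the effect the doOnDiscard plus doFinally combination above is trying to achieve.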

Related

Limit for `onErrorContinue(...)` in Flux?

I have a (possibly infinite) Flux source that is supposed to first store each message (e.g. into a database) and then asynchronously forward the messages (e.g. using Spring WebClient).
The forward(s) in case of failure are supposed to log an error, without completing the source Flux.
However, I realized that the forward(s) within the flow (flatMap(...)) block execution of the source Flux after exactly 256 messages that cause exceptions (e.g. reactor.retry.RetryExhaustedException).
Representative example that fails in the assert since only 256 messages are processed:
@Test
@SneakyThrows
public void sourceBlockAfter256Exceptions() {
int numberOfRequests = 500;
Set<Integer> sink = new HashSet<>();
Flux
.fromStream(IntStream.range(0, numberOfRequests).boxed())
.map(sink::add)
.flatMap(i -> Mono
// normally the forwards are contained here e.g. by means of Mono.when(...).thenReturn(...).retryWhen(...):
.error(new Exception("any"))
)
.onErrorContinue((throwable, o) -> log.error("Error", throwable))
.subscribe();
Thread.sleep(3000);
Assertions.assertEquals(numberOfRequests, sink.size());
}
Doing the forward within subscribe(...) doesn't block the source Flux, but that's certainly no solution, since I can't afford to lose messages.
Questions:
What has happened here? (probably related to some state stored in just one byte)
How can I do this correctly?
EDIT:
Following the discussion below, I've constructed an example that uses FluxMessageChannel (which, to my understanding, is made for infinite streams and is definitely not expected to block after 256 errors) and shows exactly the same behaviour:
@Test
@SneakyThrows
public void maxConnectionWithChannelTest() {
int numberOfRequests = 500;
Set<Integer> sink = new HashSet<>();
FluxMessageChannel fluxMessageChannel = MessageChannels.flux().get();
fluxMessageChannel.subscribeTo(
Flux
.fromStream(IntStream
.range(0, numberOfRequests).boxed()
.map(i -> MessageBuilder.withPayload(i).build())
)
.map(Message::getPayload)
.map(sink::add)
.flatMap(i -> Mono.error(new Exception("whatever")))
);
Flux
.from(fluxMessageChannel)
.subscribe();
Thread.sleep(3000);
Assert.assertEquals(numberOfRequests, sink.size());
}
EDIT:
I just raised an issue in the reactor core project: https://github.com/reactor/reactor-core/issues/2011
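One observation that may help frame the question: 256 is also the default concurrency of Reactor's flatMap (Queues.SMALL_BUFFER_SIZE). The sketch below is only a plain-Java analogy, not Reactor code and not a confirmed diagnosis. It assumes a pool of 256 concurrency slots and shows how processing stalls at exactly 256 if a failed task never hands its slot back, and how it runs to completion if it does. SlotLeakDemo and startedTasks are hypothetical names.

```java
import java.util.concurrent.Semaphore;

/** Analogy only: a bounded pool of slots stalls when failed tasks never return their slot. */
final class SlotLeakDemo {
    /**
     * Tries to start `tasks` always-failing tasks against `slots` concurrency slots.
     * Returns how many tasks were actually started.
     */
    static int startedTasks(int tasks, int slots, boolean releaseOnFailure) {
        Semaphore permits = new Semaphore(slots);
        int started = 0;
        for (int i = 0; i < tasks; i++) {
            if (!permits.tryAcquire()) {
                break; // no free slot left, so nothing more is pulled from the source
            }
            started++;
            try {
                throw new IllegalStateException("any"); // every task fails, as in the test above
            } catch (IllegalStateException e) {
                if (releaseOnFailure) {
                    permits.release(); // slot handed back, the next task can start
                }
                // otherwise the slot is leaked (assumed failure mode for this analogy)
            }
        }
        return started;
    }

    public static void main(String[] args) {
        System.out.println(startedTasks(500, 256, false)); // prints 256
        System.out.println(startedTasks(500, 256, true));  // prints 500
    }
}
```

If this analogy holds, the count in the test would depend on the flatMap concurrency rather than on the number of messages, which is something the linked issue can confirm or refute.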

Reactor - how to retry on hot flux without dropping elements?

I have an infinite hot flux of data. I am about to engage in carrying out an operation on each element in the stream, each of which returns a Mono which will complete (one way or another) after some finite time.
There is the possibility of an error being thrown from these operations. If so, I want to resubscribe to the hot flux without missing anything, retrying elements that were in the middle of being processed when the error was thrown (i.e. anything that did not complete successfully).
What do I do here? I can tolerate repeated operations on the same elements, but not losing elements entirely from the stream.
I've attempted to use a ReplayProcessor to handle this, but I can't see a way of making it work without repeating a lot of operations that might well have succeeded (using a very conservative timeout), or losing elements due to new elements overriding old ones in the buffer (as below).
Test case:
@Test
public void fluxTest() {
List<String> strings = new ArrayList<>();
strings.add("one");
strings.add("two");
strings.add("three");
strings.add("four");
ConnectableFlux<String> flux = Flux.fromIterable(strings).publish();
//Goes boom after three uses of its method, otherwise
//returns a mono. completing after a little time
DangerousClass dangerousClass = new DangerousClass(3);
ReplayProcessor<String> replay = ReplayProcessor.create(3);
flux.subscribe(replay);
replay.flatMap(dangerousClass::doThis)
.retry(1)
.doOnNext(s -> LOG.info("Completed {}", s))
.subscribe();
flux.connect();
flux.blockLast();
}
public class DangerousClass {
Logger LOG = LoggerFactory.getLogger(DangerousClass.class);
private int boomCount;
private AtomicInteger count;
public DangerousClass(int boomCount) {
this.boomCount = boomCount;
this.count = new AtomicInteger(0);
}
public Mono<String> doThis(String s) {
return Mono.fromSupplier(() -> {
LOG.info("doing dangerous {}", s);
if (count.getAndIncrement() == boomCount) {
LOG.error("Throwing exception from {}", s);
throw new RuntimeException("Boom!");
}
return s;
}).delayElement(Duration.ofMillis(600));
}
}
This prints:
doing dangerous one
doing dangerous two
doing dangerous three
doing dangerous four
Throwing exception from four
doing dangerous two
doing dangerous three
doing dangerous four
Completed four
Completed two
Completed three
One is never completed.
The error (at least in the above example) can only occur in the flatMap(dangerousClass::doThis) call - so resubscribing to the root Flux and replaying elements when this one flatMap() call has failed seems a bit odd, and (probably) isn't what you want to do.
Instead, I'd recommend ditching the ReplayProcessor and just calling retry on the inner flatMap() call instead, so you end up with something like:
ConnectableFlux<String> flux = Flux.range(1, 10).map(n -> "Entry " + n).publish();
DangerousClass dangerousClass = new DangerousClass(3);
flux.flatMap(x -> dangerousClass.doThis(x).retry(1))
.doOnNext(s -> System.out.println("Completed " + s))
.subscribe();
flux.connect();
This will give you something like the following, with all entries completed and only the failed entry retried:
doing dangerous Entry 1
doing dangerous Entry 2
doing dangerous Entry 3
doing dangerous Entry 4
Throwing exception from Entry 4
doing dangerous Entry 4
Completed Entry 2
Completed Entry 1
Completed Entry 3
Completed Entry 4
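The difference between retrying the whole stream and retrying a single element can also be shown without Reactor. The following plain-Java sketch is sequential (unlike flatMap, which runs the inner Monos concurrently), and PerElementRetry and processAll are hypothetical names. The operation fails on its fourth call overall, mirroring DangerousClass(3), and only the failing element is attempted again.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Plain-Java sketch: retry only the element that failed, never the ones already done. */
final class PerElementRetry {
    /** Applies op to each input, retrying a failed element up to maxRetries times. */
    static <T, R> List<R> processAll(List<T> inputs, Function<T, R> op, int maxRetries) {
        List<R> results = new ArrayList<>();
        for (T input : inputs) {
            int attempt = 0;
            while (true) {
                try {
                    results.add(op.apply(input));
                    break; // this element is done; earlier elements are never replayed
                } catch (RuntimeException e) {
                    if (attempt++ >= maxRetries) {
                        throw e; // retries exhausted for this element
                    }
                }
            }
        }
        return results;
    }

    /** Demo: the 4th call overall fails once, mirroring DangerousClass(3). */
    static String demo() {
        AtomicInteger calls = new AtomicInteger();
        List<String> done = processAll(
                List.of("one", "two", "three", "four"),
                s -> {
                    if (calls.getAndIncrement() == 3) {
                        throw new RuntimeException("Boom!");
                    }
                    return s;
                },
                1);
        return done + " after " + calls.get() + " calls";
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [one, two, three, four] after 5 calls
    }
}
```

Four elements complete with five calls in total, so nothing is lost and nothing that already succeeded is repeated.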

How to limit Single.Zip parallelism?

I am making several HTTP requests, waiting for all of them to complete, and then calculating the result from the information in all the requests (and several other sources).
Currently I am doing it like this:
Single.zip(observables, { array -> array })
Where observables is just an array of observables, each of them doing an async operation.
But I have a limit on how many operations I can do concurrently. There should be no more than n operations at the same time. (n being ideally 5 but 1 is accepted too)
Unfortunately Zip seems to start all the operations without waiting for any of them to complete. Is there a way to limit this behavior?
Maybe you could use a combination of window() operator and zip()?
Something like that:
public static void main(String[] args) {
Flowable<Integer>[] flowables = new Flowable[] {
Flowable.just(1), Flowable.just(2), Flowable.just(3), Flowable.just(4), Flowable.just(5),
Flowable.just(6), Flowable.just(7), Flowable.just(8), Flowable.just(9)
};
Flowable.fromArray(flowables)
.window(5)
.flatMap(f -> Flowable.zip(f, objects -> Arrays.stream(objects).map(Object::toString).collect(joining("")))
.flatMapSingle(Single::just))
.subscribe(s -> System.out.println("received: " + s));
Flowable.timer(10, SECONDS) // Just to block the main thread for a while
.blockingSubscribe();
}
The window() operator splits the flowables into a Flowable of Flowables, each of which emits at most 5 elements (this can be the number of operations you want).
In this example, zip() just concatenates the given integers.
It will print:
received: 12345
received: 6789
I hope this helps.
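The requirement itself (run everything, wait for all results, never more than n in flight) can be sketched with the JDK alone. This is not RxJava code; BoundedZip and runAll are hypothetical names, and a fixed thread pool of size n is used as the concurrency limit. The demo records the peak number of tasks running at once so the limit can be checked.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

/** Plain-Java sketch: wait for all results while capping how many tasks run at once. */
final class BoundedZip {
    /** Runs all tasks with at most maxConcurrent in flight; results keep the input order. */
    static <T> List<T> runAll(List<Callable<T>> tasks, int maxConcurrent) {
        ExecutorService pool = Executors.newFixedThreadPool(maxConcurrent);
        try {
            List<T> results = new ArrayList<>();
            for (Future<T> future : pool.invokeAll(tasks)) { // blocks until every task is done
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    /** Demo: 9 slow tasks, limit 5; returns {peak concurrency observed, number of results}. */
    static int[] demo() {
        AtomicInteger running = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        List<Callable<Integer>> tasks = new ArrayList<>();
        for (int i = 1; i <= 9; i++) {
            int id = i;
            tasks.add(() -> {
                peak.accumulateAndGet(running.incrementAndGet(), Math::max);
                Thread.sleep(50); // stands in for an HTTP request
                running.decrementAndGet();
                return id;
            });
        }
        List<Integer> results = runAll(tasks, 5);
        return new int[] {peak.get(), results.size()};
    }

    public static void main(String[] args) {
        int[] outcome = demo();
        System.out.println("peak=" + outcome[0] + ", results=" + outcome[1]);
    }
}
```

Unlike fixed windows of 5, a pool starts the sixth task as soon as any of the first five finishes, so a single slow request does not hold back the rest of its group.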

RxJava polling + manual refresh

I have a list I want to refresh every minute.
For example the user list here : https://github.com/android10/Android-CleanArchitecture/blob/master/domain/src/main/java/com/fernandocejas/android10/sample/domain/interactor/GetUserList.java
I add a periodical refresh using repeatWhen :
public Observable<List<User>> buildUseCaseObservable(Void unused) {
return this.userRepository
.users()
.repeatWhen(new Function<Observable<Object>, ObservableSource<?>>() {
@Override
public ObservableSource<?> apply(Observable<Object> objectObservable) throws Exception {
return objectObservable.delay(1, TimeUnit.MINUTES);
}
});
}
It works fine this way, calling onNext every minute.
But if I want to refresh this list immediately (because of a user action or a notification), I don't know how to do that.
Should I cancel/dispose the observable and restart a new one ?
Thanks
From your code I understand that the users list is generated and emitted upon subscription.
Here are some solutions I can think of, instead of unsubscribing and resubscribing upon the event to which you want to react immediately:
Instead of using the repeatWhen operator, use the interval creation operator combined with the flatMap to invoke the subscription to a new Observable every minute and use the merge operator to add reaction to the other event in which you are interested. Something like this:
@Test
public void intervalObservableAndImmediateReaction() throws InterruptedException {
Observable<String> obs = Observable.interval(1, TimeUnit.SECONDS)
.cast(Object.class)
.mergeWith(
Observable.just("mockedUserClick")
.delay(500, TimeUnit.MILLISECONDS))
.flatMap(
timeOrClick -> Observable.just("Generated upon subscription")
);
obs.subscribe(System.out::println);
Thread.sleep(3000); //to see the prints before ending the test
}
or adjusted to your needs (but the principle is the same):
Observable.interval(1, TimeUnit.MINUTES)
.mergeWith(RxView.clicks(buttonView))
.flatMap(timeOrClick -> this.userRepository.users());
You can use the flatMap operator as before while keeping your current working implementation and without merging with an interval: just keep your working code and, in another area of the programme, chain it to the RxBinding of your choosing:
RxView.touches(yourViewVariable)
.flatMap(motionEvent -> this.userRepository.users())
.subscribe(theObserver);
Note that in this solution the subscription is done independently to the two observables. You'll probably be better off if you use different observers, or manage a subject or something on that line. A small test I ran showed one subscriber handled subscribing to 2 different observables with no problem (in Rxjava1 - didn't check in Rxjava2 yet), but it feels iffy to me.
If you aren't concerned with adjusting the refresh time after one of the other observables emits data you can do something like the following:
// Specific example of a user manually requesting
val request = Observable.create<String> { emitter ->
refresh.setOnClickListener {
emitter.onNext("Click Request")
}
}
.observeOn(Schedulers.io())
.flatMap {
userRepository.users()
}
// Refresh based off of your original work, could use something like interval as well
val interval = userRepository.users()
.subscribeOn(Schedulers.io())
.repeatWhen { objectObservable ->
objectObservable.delay(1, TimeUnit.MINUTES)
}
// Combine them so that both emissions are received you can even add on another source
Observable.merge(request,interval)
.observeOn(AndroidSchedulers.mainThread())
.subscribe({
contents.text = it.toString()
}, {
contents.text = it.toString()
},{
println(contents.text)
})
Then you don't have to dispose and resubscribe every time
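The idea shared by the answers above, one refresh action fed by both a timer and a manual trigger, can be sketched with the JDK alone. This is not RxJava code; Refresher and refreshNow are hypothetical names, and the Runnable stands in for loading and publishing the user list. Both triggers run on the same single worker thread, so a manual refresh never overlaps with a periodic one.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Plain-Java sketch: one refresh action, triggered periodically and on demand. */
final class Refresher implements AutoCloseable {
    private final ScheduledExecutorService executor =
            Executors.newSingleThreadScheduledExecutor();
    private final Runnable refresh; // hypothetical stand-in for reloading the user list

    Refresher(Runnable refresh, long period, TimeUnit unit) {
        this.refresh = refresh;
        // Periodic trigger; the first run happens after one full period.
        // Note: if refresh throws, scheduleAtFixedRate suppresses later runs.
        executor.scheduleAtFixedRate(refresh, period, period, unit);
    }

    /** Manual trigger: runs the same refresh as soon as the worker thread is free. */
    void refreshNow() {
        executor.execute(refresh);
    }

    @Override
    public void close() {
        executor.shutdownNow();
    }

    /** Demo: with an hourly timer, two manual triggers produce exactly two refreshes. */
    static int demo() {
        AtomicInteger refreshes = new AtomicInteger();
        CountDownLatch twoRefreshes = new CountDownLatch(2);
        try (Refresher refresher = new Refresher(() -> {
            refreshes.incrementAndGet();
            twoRefreshes.countDown();
        }, 1, TimeUnit.HOURS)) {
            refresher.refreshNow();
            refresher.refreshNow();
            twoRefreshes.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return refreshes.get();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 2
    }
}
```

As in the merge-based answers, the periodic schedule is not reset by a manual refresh; it keeps ticking at its original rate.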

Limiting rate of requests with Reactor

I'm using project reactor to load data from a web service using rest. This is done in parallel with multiple threads. I'm starting to hit rate limits on the web service, so I would like to send at most 10 requests per second to avoid getting these errors. How would I do that using reactor?
Using zipWith(Mono.delayMillis(100))? Or is there some better way?
Thank you
You can use delayElements instead of the whole zipwith.
One could use Flux.delayElements to process a batch of 10 requests every second; be aware, though, that if processing takes longer than 1s, the next batch will still be started in parallel and will therefore be processed together with the previous one (and potentially many earlier ones)!
That's why I propose another solution in which a batch of 10 requests is still processed every second but, if its processing takes longer than 1s, the next batch will fail (see overflow IllegalStateException). One could handle that failure so that the overall processing continues, but I won't show that here because I want to keep the example simple; onErrorResume is useful for handling the overflow IllegalStateException.
The code below will do a GET on https://www.google.com/ at a rate of 10 requests per second. You'll have to make additional changes to support the situation where your server cannot process all 10 requests within 1s; you could simply skip sending requests while those from the previous second are still being processed by your server.
@Test
void parallelHttpRequests() {
// this is just for limiting the test running period otherwise you don't need it
int COUNT = 2;
// use whatever (blocking) http client you desire;
// when using e.g. WebClient (Spring, non blocking client)
// the example will slightly change for no longer use
// subscribeOn(Schedulers.elastic())
RestTemplate client = new RestTemplate();
// exit, lock, condition are provided to allow one to run
// all this code in a @Test, otherwise they won't be needed
var exit = new AtomicBoolean(false);
var lock = new ReentrantLock();
var condition = lock.newCondition();
MessageFormat message = new MessageFormat("#batch: {0}, #req: {1}, resultLength: {2}");
Flux.interval(Duration.ofSeconds(1L))
.take(COUNT) // this is just for limiting the test running period otherwise you don't need it
.doOnNext(batch -> debug("#batch", batch)) // just for debugging
.flatMap(batch -> Flux.range(1, 10) // 10 requests per 1 second
.flatMap(i -> Mono.fromSupplier(() ->
client.getForEntity("https://www.google.com/", String.class).getBody()) // your request goes here (1 of 10)
.map(s -> message.format(new Object[]{batch, i, s.length()})) // here the request's result will be the output of message.format(...)
.doOnSubscribe(s -> debug("doOnSubscribe: #batch = " + batch + ", i = " + i)) // just for debugging
.subscribeOn(Schedulers.elastic()) // one I/O thread per request
)
)
// consider using onErrorResume to handle overflow IllegalStateException
.subscribe(
s -> debug("received", s), // do something with the above request's result
e -> {
// pay special attention to overflow IllegalStateException
debug("error", e.getMessage());
signalAll(exit, condition, lock);
},
() -> {
debug("done");
signalAll(exit, condition, lock);
}
);
await(exit, condition, lock);
}
// you won't need the "await" and "signalAll" methods below which
// I created only to be easier for one to run this in a test class
private void await(AtomicBoolean exit, Condition condition, Lock lock) {
lock.lock();
while (!exit.get()) {
try {
condition.await();
} catch (InterruptedException e) {
// maybe spurious wakeup
e.printStackTrace();
}
}
lock.unlock();
debug("exit");
}
private void signalAll(AtomicBoolean exit, Condition condition, Lock lock) {
exit.set(true);
try {
lock.lock();
condition.signalAll();
} finally {
lock.unlock();
}
}
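As a point of comparison for the pacing itself, here is a minimal JDK-only sketch that spaces request starts evenly (100 ms apart for 10 per second) instead of firing 10 at once every second. It is not Reactor code; PacedRequests, startDelayMillis and runPaced are hypothetical names, and the pool size of 4 is an arbitrary assumption. Like delayElements, this limits how often requests start, not how many are in flight when the server is slow.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Plain-Java sketch: start at most perSecond tasks per second by spacing their start times. */
final class PacedRequests {
    /** Start delay of the index-th task (0-based) when at most perSecond may start per second. */
    static long startDelayMillis(int index, int perSecond) {
        return index * 1000L / perSecond;
    }

    /** Schedules every task at its paced start time and waits for all of them to finish. */
    static void runPaced(List<Runnable> tasks, int perSecond) {
        ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(4);
        try {
            List<ScheduledFuture<?>> futures = new ArrayList<>();
            for (int i = 0; i < tasks.size(); i++) {
                futures.add(scheduler.schedule(
                        tasks.get(i), startDelayMillis(i, perSecond), TimeUnit.MILLISECONDS));
            }
            for (ScheduledFuture<?> future : futures) {
                future.get(); // rethrows a task failure as ExecutionException
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            scheduler.shutdown();
        }
    }

    /** Demo: three trivial tasks paced at 100 per second; returns how many ran. */
    static int demo() {
        AtomicInteger sent = new AtomicInteger();
        List<Runnable> tasks = new ArrayList<>();
        for (int i = 0; i < 3; i++) {
            tasks.add(sent::incrementAndGet); // stands in for one HTTP request
        }
        runPaced(tasks, 100);
        return sent.get();
    }

    public static void main(String[] args) {
        System.out.println(startDelayMillis(1, 10)); // prints 100
        System.out.println(demo());                  // prints 3
    }
}
```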
