Automatic rate adjustment in Reactor - java

TL;DR;
Is there a way to automatically adjust delay between elements in Project Reactor based on downstream health?
More details
I have an application that reads records from Kafka topic, sends an HTTP request for each one of them and writes the result to another Kafka topic. Reading and writing from/to Kafka is fast and easy, but the third party HTTP service is easily overwhelmed, so I use delayElements() with a value from a property file, which means that this value does not change during application runtime. Here's a code sample:
kafkaReceiver.receiveAutoAck()
.concatMap(identity())
.delayElements(ofMillis(delayElement))
.flatMap(message -> recordProcessingFunction.process(message.value()), messageRate)
.onErrorContinue(handleError())
.map(this::getSenderRecord)
.flatMap(kafkaSender::send)
However, the third party service might perform differently overtime and I'd like to be able to adjust this delay accordingly. Let's say, if I see that over 5% of requests fail over 10 second period, I would increase the delay. If it gets lower than 5% for over 10 sec, then I would reduce the delay again.
Is there an existing mechanism for that in Reactor? I can think of some creative solutions from my side, but was wondering if they (or someone else) already implemented that.

I don't think there is backpressure provided by any HTTP client, including netty. One option is to switch to RSocket, but if you are calling a third-party service, that may not be an option, I guess. You could tune a rate that works during most of the day and send the errored out message to another topic using doOnError or similar. Another receiver can process those messages with even higher delays, put the message back on the same topic with a retry count if it errors out again so that you can finally stop processing them.

If you are looking for delaying elements depends on the elements processing speed, you could use delayUntil.
Flux.range(1, 100)
.doOnNext(i -> System.out.println("Kafka Receive :: " + i))
.delayUntil(i -> Mono.fromSupplier(() -> i)
.map(k -> {
// msg processing
return k * 2;
})
.delayElement(Duration.ofSeconds(1)) // msg processing simulation
.doOnNext(k -> System.out.println("Kafka send :: " + k)))
.subscribe();

You can add a retry with exponential backoff. Somethign like this:
influx()
.flatMap(x -> Mono.just(x)
.map(data -> apiCall(data))
.retryWhen(
Retry.backoff(Integet.MAX_VALUE, Duration.ofSeconds(30))
.filter(err -> err instanceof RuntimeException)
.doBeforeRetry(
s -> log.warn("Retrying for err {}", s.failure().getMessage()))
.onRetryExhaustedThrow((spec, sig) -> new RuntimeException("ex")))
.onErrorResume(err -> Mono.empty()),
concurrency_val,
prefetch_val)
This will retry the failed request Integet.MAX_VALUE times with minimum time of 30s between each retry. The subsequent retries are actually offset by a configurable jitter factor (default value = 0.5) causing the duration to increase between successive retries.
The documentation on Retry.backoff says that:
A RetryBackoffSpec preconfigured for exponential backoff strategy with jitter, given a maximum number of retry attempts and a minimum Duration for the backoff.
Also, since the whole operation is mapped in flatMap, you can vary the default concurrency and prefetch values for it in order to account for the maximum number of requests that can fail at any given time while the whole pipeline waits for the RetryBackOffSpec to complete successfully.
Worst case scenario, your concurrency_val number of requests have failed and waiting for 30+ seconds for the retry to happen. The whole operation might halt down (still waiting for success from downstream) which may not be desirable if the downstream system don't recover in time. Better to replace backOff limit from Integer.MAX_VALUE to something managable beyond which it would just log the error and proceed with next event.

Related

SpringBoot with REACTOR kafka : increase message consumption thorughput on a 2CPUs pod

Small question regarding a SpringBoot 3 app with reactor kafka please.
I have a small reactive kafka consumer app, which consumes messages from kafka and processes the message.
The app is consuming one topic the-topic which has three partitions.
The app is dockerized, and for resource consumption limit reason, the app can only use 2CPUs (please bear with me on that one). And to make things more difficult, I am allowed to only have one unique instance of this app running.
The app is very straightforward:
<dependencies>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-webflux</artifactId>
</dependency>
<dependency>
<groupId>io.projectreactor.kafka</groupId>
<artifactId>reactor-kafka</artifactId>
</dependency>
</dependencies>
#Configuration
public class MyKafkaConfiguration {
#Bean
public KafkaReceiver<String, String> reactiveKafkaConsumerTemplate(KafkaProperties kafkaProperties) {
kafkaProperties.setBootstrapServers(List.of("my-kafka.com:9092"));
kafkaProperties.getConsumer().setGroupId("should-i-do-something-here");
final ReceiverOptions<String, String> basicReceiverOptions = ReceiverOptions.create(kafkaProperties.buildConsumerProperties());
basicReceiverOptions.subscription(Collections.singletonList("the-topic"));
return new DefaultKafkaReceiver<>(ConsumerFactory.INSTANCE, basicReceiverOptions);
}
}
#Service
public class MyConsumer implements CommandLineRunner {
#Autowired
private KafkaReceiver<String, String> kafkaReceiver;
#Override
public void run(String... args) {
myConsumer().subscribe();
}
public Flux<String> myConsumer() {
return kafkaReceiver.receive()
.flatMap(oneMessage -> consume(oneMessage))
.doOnNext(abc -> System.out.println("successfully consumed {}={}" + abc))
.doOnError(throwable -> System.out.println("something bad happened while consuming : {}" + throwable.getMessage()));
}
private Mono<String> consume(ConsumerRecord<String, String> oneMessage) {
// this first line is a heavy in memory computation which transforms the incoming message to a data to be saved.
// it is very intensive computation, but has been tested NON BLOCKING by different tools, and takes 1 second :D
String transformedStringCPUIntensiveNonButNonBLocking = transformDataNonBlockingWithIntensiveOperation(oneMessage);
//then, just saved the correct transformed data into any REACTIVE repository :)
return myReactiveRepository.save(transformedStringCPUIntensiveNonButNonBLocking);
}
}
If I understand project reactor correctly, and due to my resource limitation, I will have at most 2 reactor cores.
The consume method here has been tested non-blocking, but takes one second to deal with the message.
Therefore, will I only be able to consume 2 messages per second? (hopefully not)
The messages can be consumed in any order, I wish to just maximize the throughput with this single app.
May I ask how could I maximize parallelism / throughput on this app with those constraints please?
Thank you
If you want to process messages from a Flux publisher in a parallel manner you have to use flatMap operator, because map operator operates in a synchronous way by requesting items by 1.
When you are using the flatMap operator you can rely on Reactor and let him control the concurrency or you can specify the desired concurrency via concurrency parameter (i.e. flatMap(it -> consume(), YOUR_CONCURRENCY_VALUE)
If your consume() method is not a publisher:
You can wrap it in a Mono by using Mono.fromCallable() and publish it on a scheduler that is designed of blocking tasks:
.publishOn(Schedulers.boundedElastic())
But better rewrite all consumer code to the reactive types otherwise you lose the benefits of using the reactor.
We could apply Little's Law to calculate required concurrency for handling required throughput.
workers >= throughput x latency, in our case workers is a number of messages processed in parallel
For example, to handle 100 messages per sec with 60 sec latency we would need to process 100 x 60 = 6000 concurrently. In "traditional" blocking app we would need the same number of threads. In reactive app the same workload could be handled by several threads only and, as result, much less memory. Even if a message takes 30-60 sec to process, thread will not be blocked because all IO operations are async. To scale processing you need to decrease the latency or increase concurrency.
In our case we need to process 6000 in parallel. With 3 partitions you could have 3 consumers processing 2000 messages in parallel each.
By default, flatMap is processing Queues.SMALL_BUFFER_SIZE = 256 messages in parallel but you can make this configurable.
kafkaReceiver.receive()
.flatMap(oneMessage -> consume(oneMessage), concurrency)
It's really hard to say how many messages one app can handle and you would need to run a load test to understand max throughput. Try to maximize this number to understand your limits looking at metrics. In case app could not handle such load, you would need to increase number of partitions and deploy more consumers.
Ultimately, your goal is to process more messages than you produce. If producers send 2 messages per second then you don't need high concurrency (2 * 60 = 120).
There are other variables to consider - message size, throughput of the downstream system, limits of other components. For example, WebClient/Netty has a default limit of 500 concurrent connections. Sometimes you even need to "slow down" consumers to don't overload downstream services.

Delaying Kafka Streams consuming

I'm trying to use Kafka Streams (i.e. not a simple Kafka Consumer) to read from a retry topic with events that have previously failed to process. I wish to consume from the retry topic, and if processing still fails (for example, if an external system is down), I wish to put the event back on the retry topic. Thus I don't want to keep consuming immediately, but instead wait a while before consuming, in order to not flood the systems with messages that are temporarily unprocessable.
Simplified, the code currently does this, and I wish to add a delay to it.
fun createTopology(topic: String): Topology {
val streamsBuilder = StreamsBuilder()
streamsBuilder.stream<String, ArchivalData>(topic, Consumed.with(Serdes.String(), ArchivalDataSerde()))
.peek { key, msg -> logger.info("Received event for key $key : $msg") }
.map { key, msg -> enrich(msg) }
.foreach { key, enrichedMsg -> archive(enrichedMsg) }
return streamsBuilder.build()
}
I have tried to use Window Delay to set this up, but have not managed to get it to work. I could of course do a sleep inside a peek, but that would leave a thread hanging and does not sound like a very clean solution.
The exact details of how the delay would work is not terribly important to my use case. For example, all of these would work fine:
All events on the topic in the past x seconds are all consumed at once. After it begins / finishes to consume, the stream waits x seconds before consuming again
Every event is processed x seconds after being put on the topic
The stream consumes messages with a delay of x seconds between every event
I would be very grateful if someone could provide a few lines of Kotlin or Java code that would accomplish any of the above.
You cannot really pause reading from the input topic using Kafka Streams—the only way to "delay" would be to call a "sleep", but as you mentioned, that blocks the whole thread and is not a good solution.
However, what you can do is to use a stateful processor, e.g., process() (with attached state store) instead of foreach(). If the retry fails, you don't put the record back into the input topic, but you put it into the store and also register a punctuation with desired retry delay. If the punctuation fires, you retry and if the retry succeeds, you delete the entry from the store and cancel the punctuation; otherwise, you wait until the punctuation fires again.

How to limit the number of Mono waiting for completion?

In Project Reactor I have a set number of requests to be made in the format of Mono. In order to not overwhelm the target service, I need to limit the number of requests to at most X per second and more than that ensure that I have a maximum of N pending requests waiting for completion.
The time limiting part was easy tranforming a list of Monos in a Flux and using the function Flux#delayElement but I can't figure the second requisite.
How can I limit the number of Mono waiting for completion?

Project Reactor: How to delay emission of (throttle) each element?

Consider the following Flux
Flux.range(1, 5)
.parallel(10)
.runOn(Schedulers.parallel())
.map(i -> "https://www.google.com")
.flatMap(uri -> Mono.fromCallable(new HttpGetTask(httpClient, uri)))
HttpGetTask is a Callable whose actual implementation is irrelevant in this case, it makes a HTTP GET call to the given URI and returns the content if successful.
Now, I'd like to slow down the emission by introducing an artificial delay, such that up to 10 threads are started simultaneously, but each one doesn't complete as soon as HttpGetTask is done. For example, say no thread must finish before 3 seconds. How do I achieve that?
If the requirement is really "not less than 3s" you could add a delay of 3 seconds to the Mono inside the flatMap by using Mono.fromCallable(...).delayElement(Duration.ofSeconds(3)).

Reactive event processing with retrying in Java/Groovy

I would like to implement a microservice which after receive a request (via message queue) will try to execute it via REST/SOAP calls to the external services. On success the reply should be sent back via MQ, but on failure the request should be rescheduled for the execution later (using some custom algorithm like 10 seconds, 1 minute, 10 minutes, timeout - give up). After specified amount of time the failure message should be sent back to the requester.
It should run on Java 8 and/or Groovy. Event persistence is not required.
First I though about Executor and Runnable/Future together with ScheduledExecutorService.scheduleWithFixedDelay, but it looks to much low level for me. The second idea was actors with Akka and Scheduler (for rescheduling), but I'm sure there could be some other approaches.
Question. What technique would you use for reactive event processing with an ability to reschedule them on failure?
"Event" is quite fuzzy term, but most of definitions I met was talking about one of techniques of Inversion of Control. This one was characterized with fact, that you don't care WHEN and BY WHOM some piece of code will be called, but ON WHAT CONDITION. That means that you invert (or more precisely "lose") control over execution flow.
Now, you want event-driven processing (so you don't want to handle WHEN and BY WHOM), yet you want to specify TIMED (so strictly connected to WHEN) behaviour on failure. This is some kind of paradox to me.
I'd say you would do better, if you'd use callbacks for reactive programming, and on failure you'd just start new thread that will sleep for 10 seconds and re-run callback.
In the end I have found the library async-retry which was written just for this purpose. It allows to asynchronously retry the execution in a very customizable way. Internally it leverages ScheduledExecutorService and CompletableFuture (or ListenableScheduledFuture from Guava when Java 7 has to be used).
Sample usage (from the project web page):
ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
RetryExecutor executor = new AsyncRetryExecutor(scheduler).
retryOn(SocketException.class).
withExponentialBackoff(500, 2). //500ms times 2 after each retry
withMaxDelay(10_000). //10 seconds
withUniformJitter(). //add between +/- 100 ms randomly
withMaxRetries(20);
final CompletableFuture<Socket> future = executor.getWithRetry(() ->
new Socket("localhost", 8080)
);
future.thenAccept(socket ->
System.out.println("Connected! " + socket)
);

Categories

Resources