How to wait for future inside Kafka Stream map()? - java

I am implementing Spring Boot application in Java, using Spring Cloud Stream with Kafka Streams binder.
I need to implement blocking operation inside of KStream map method like so:
public Consumer<KStream<?, ?>> sink() {
return input -> input
.mapValues(value -> methodReturningCompletableFuture(value).get())
.foreach((key, value) -> otherMethod(key, value));
}
completableFuture.get() throws exceptions (InterruptedException, ExecutionException)
How to handle these exceptions so that the chained method doesn't get executed and the Kafka message is not acknowledged? I cannot afford message loss, sending it to a dead letter topic is not an option.
Is there a better way of blocking inside map()?

You can try the branching feature in Kafka Streams to control the execution of the chained methods. For example, here is a pseudo-code that you can try.
You can possibly use this as a starting point and adapt this to your particular use case.
final Map<String, ? extends KStream<?, String>> branches =
input.split()
.branch((k, v) -> {
try {
methodReturningCompletableFuture(v).get();
return true;
}
catch (Exception e) {
return false;
}
}, Branched.as("good-records"))
.defaultBranch();
final KStream<?, String> kStream = branches.get("good-records");
kStream.foreach((key, value) -> otherMethod(key, value));
The idea here is that you will only send the records that didn't throw an exception to the named branch good-records, everything else goes into a default branch which we simply ignore in this pseudo-code. Then you invoke additional chained methods (as this foreach call shows) only for those "good" records.
This does not solve the problem of not acknowledging the message after an exception is thrown. That seems to be a bit challenging. However, I am curious about that use case. When an exception happens and you handle it, why don't you want to ack the message? The requirements seem a bit rigid if a DLT is ruled out. The ideal solution here would be to introduce some retries and, once the retries are exhausted, send the record to a DLT, which lets the Kafka Streams consumer acknowledge the message. Then the application moves on to the next offset.
The call methodReturningCompletableFuture(value).get() blocks until the future completes, assuming that methodReturningCompletableFuture() returns a Future object. The no-argument get() has no default timeout, so use get(timeout, unit) if you want the wait to be bounded. Therefore, that is already a good approach to wait inside the KStream map operation. I don't think anything else is necessary to make it wait further.
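To make the exception handling around get() concrete, here is a minimal JDK-only sketch of a bounded wait. The class name BlockingAwait, the helper await and the choice of IllegalStateException are illustrative assumptions, not part of the question's code or of any Kafka API; how the stream should react to the rethrown exception is left to the application.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class BlockingAwait {

    // Waits at most timeoutMillis for the future and rethrows every failure as an
    // unchecked exception, so a lambda such as the one passed to mapValues() can
    // call it without declaring checked exceptions.
    static <T> T await(CompletableFuture<T> future, long timeoutMillis) {
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            // Restore the interrupt flag so the calling thread can still see it.
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting", e);
        } catch (ExecutionException e) {
            // The asynchronous task threw; its own exception is the cause.
            throw new IllegalStateException("Async call failed", e.getCause());
        } catch (TimeoutException e) {
            throw new IllegalStateException("No result within " + timeoutMillis + " ms", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(await(CompletableFuture.completedFuture("ok"), 100));
    }
}
```

With a helper like this, the lambda in the question could read value -> await(methodReturningCompletableFuture(value), 5000), where 5000 is an arbitrary example timeout.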

Related

Sending messages from the database to Kafka with the guaranty of not losing them and keeping the order

I need to send data from a database to Kafka. No data may be lost, and the messages must stay in exactly the order in which they are fetched from the database. After the messages are sent, I need to remove them from the database. Once completed, this task is repeated again and again (it is scheduled via @Scheduler).
I have come to the conclusion that guaranteeing no message loss and strict ordering requires the following: before sending a new message, I need to make sure that the previous one was successfully delivered to the Kafka broker (acks=all, min.insync.replicas=2). If a message is not delivered to the broker, there is no point in sending the next one. As a result, the solution turns out to be synchronous. Here is my code example:
public List<String> sendMessages(String topicName, List<Object> data) {
List<String> successIds = new ArrayList<>();
for (Object value : data) {
ListenableFuture<SendResult<String, Object>> listenableFuture = kafkaTemplate.send(topicName, value.getSiebelId(), value);
try {
listenableFuture.get(3, TimeUnit.SECONDS);
} catch (Exception e) {
log.warn("todo");
break;
}
successIds.add(value.getId());
}
return successIds;
}
successIds contains the id of messages which are successfully delivered to the broker. Next, I use them to delete the corresponding data in the database. If during the operation of sending messages from the List<Object> data some message was not delivered to the broker for some reason, then we end the iteration earlier and delete exactly what managed to get into successIds. In the next iteration, we will start with those messages that were not included in the successIds, because they have not been removed from the database.
This solution gives up asynchrony, which certainly reduces performance. I have already tested it and it works very slowly. I am new to Kafka, so I would like an expert opinion. Is this solution optimal?
The following solution gives me much better throughput (about 100 times faster on my test data) compared with the listenableFuture.get approach posted in the question. Here I add a callback to the listenableFuture, and in its onSuccess method I put the successfully sent ids into the list. After iterating over the list, I call flush() on kafkaTemplate.
@Override
public List<String> sendMessages(String topicName, List<T> data) {
List<String> successIds = new ArrayList<>();
data.forEach(value ->
kafkaTemplate.send(topicName, value.getSiebelId(), value)
.addCallback(new ListenableFutureCallback<>() {
@Override
public void onSuccess(SendResult<String, Object> result) {
successIds.add(value.getId());
}
@Override
public void onFailure(Throwable exception) {
log.warn("todo");
}
}));
kafkaTemplate.flush();
return successIds;
}
The successIds produced by this solution might, however, differ from those in the question. Say I have 5 messages to send. If the 3rd message is not delivered (say, due to a network problem), the 4th and 5th will still be sent to the broker and might be delivered (if the network problem is fixed). So successIds={1,2,4,5}. Later on, the 3rd message will be sent in a new list iteration and therefore might be delivered after the 5th message. So this solution delivers faster and guarantees that no message is lost, but it does not give a 100% guarantee of keeping the order. This is not ideal, but maybe I will try to use it as a compromise.
In the same situation, the solution with listenableFuture.get will not even send the 4th and 5th messages, and ends up with successIds={1,2}. The undelivered 3rd, 4th and 5th messages will be sent in the proper order in a new list iteration.
I cannot properly explain why I gain so much throughput with the presented solution. I guess kafkaTemplate.flush() lets the sends proceed asynchronously, while listenableFuture.get makes each request wait for its response before the next one is sent.
P.S. It is interesting to note that if I use the same code but remove the kafkaTemplate.flush() line and instead initialize the kafkaTemplate bean with autoFlush=true, it is slow again.
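The difference between the two outcomes described above can be reproduced without a broker. The following is a JDK-only simulation, not Spring Kafka code: fakeSend is a hypothetical stand-in for kafkaTemplate.send() in which delivery of id 3 always fails, and waiting on every pending future stands in for kafkaTemplate.flush(). The list is synchronized because, with a real producer, callbacks may run on a different thread than the one that sends.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

final class SendStrategies {

    // Hypothetical stand-in for kafkaTemplate.send(): delivery of id 3 fails.
    static CompletableFuture<Void> fakeSend(int id) {
        if (id == 3) {
            return CompletableFuture.failedFuture(new RuntimeException("network problem"));
        }
        return CompletableFuture.completedFuture(null);
    }

    // Blocking variant: wait for each send and stop at the first failure,
    // so later ids are never sent in this pass.
    static List<Integer> sendBlocking(List<Integer> ids,
                                      Function<Integer, CompletableFuture<Void>> send) {
        List<Integer> successIds = new ArrayList<>();
        for (Integer id : ids) {
            try {
                send.apply(id).get(3, TimeUnit.SECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            } catch (Exception e) {
                break;
            }
            successIds.add(id);
        }
        return successIds;
    }

    // Callback variant: start every send first, record successes in callbacks,
    // then wait for all of them (the role flush() plays in the answer).
    static List<Integer> sendWithCallbacks(List<Integer> ids,
                                           Function<Integer, CompletableFuture<Void>> send) {
        List<Integer> successIds = Collections.synchronizedList(new ArrayList<>());
        List<CompletableFuture<Void>> pending = new ArrayList<>();
        for (Integer id : ids) {
            pending.add(send.apply(id).thenRun(() -> successIds.add(id)));
        }
        for (CompletableFuture<Void> done : pending) {
            try {
                done.join();
            } catch (CompletionException e) {
                // Failed send: its id simply stays out of successIds.
            }
        }
        return new ArrayList<>(successIds);
    }

    public static void main(String[] args) {
        List<Integer> ids = List.of(1, 2, 3, 4, 5);
        System.out.println(sendBlocking(ids, SendStrategies::fakeSend));      // [1, 2]
        System.out.println(sendWithCallbacks(ids, SendStrategies::fakeSend)); // [1, 2, 4, 5]
    }
}
```

The simulation only shows which ids end up in successIds. It says nothing about throughput or about the order in which a real broker would receive the records.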

Consumer Thread and SQS - Preventing from submitting null data to the blocking queue and finding a better way to call the consumer threads at runtime

I have a standard SQS and I need to read data from it using multiple consumer threads.
To achieve the same, I have written the following method ( in Java):
public void consume() {
ExecutorService exService = Executors.newFixedThreadPool(3);
while(true) {
exService.execute(()->{
try {
/* the following code submits a null data to the blocking queue when the queue(sqs) is empty*/
OrderQ order = queueMessagingTemplate.receiveAndConvert(StringConstants.QUEUE_NAME
, OrderQ.class);
if(order!=null)
repstiory.saveOrder(order);
log.debug("Received from queue:"+ order);
}catch(Exception exc) {
log.error("couldn't read msg from queue");
throw new CouldNotReadMessageFromSQSException("couldn't read msg from queue");
}
});
}
}
As of now, I have two approaches for calling the above method and starting to consume messages from the queue. They are listed below along with their issues:
Approach-1:
Create a REST API and call this method. This approach won't work in production, and when the queue is empty, null will keep getting populated in the blocking queue of the thread pool. So, this is clearly not a good approach. Question: how can I ensure that only non-null data is submitted to the blocking queue?
Approach-2:
Call the consume method from a CommandLineRunner. The issue is that the queue most probably won't have data as soon as the application starts, so this runs into the same problem as described in the first approach.
Problem-1: What would be a better solution to handle the null data problem?
Problem-2: What would be a better way to call the consume method, considering the production environment?
Kindly suggest solutions and best practices to achieve the same.

Infinite never failing hot flux

I'm using WebFlux for handling my HTTP requests. As a side effect of the processing I want to add a record to the database, but I do not want to hold up processing of the user request to do that.
Somewhere in main application flow.
@GetMapping
Flux<Data> getHandler(){
return doStuff().doOnNext(data -> dataStore.store(data));
}
In different class I have
class DataStore {
private static final Logger LOGGER = LoggerFactory.getLogger(DataStore.class);
private DataRepository repository;
private Scheduler scheduler;
private Sinks.Many<Data> sink;
public DataStore(DataRepository repository, Scheduler scheduler) {
this.repository = repository;
this.scheduler = scheduler; //will be boundedElastic in production
this.sink = Sinks.many().replay().limit(1000); //buffer size
//build hot flux
this.sink.asFlux()
.map(data -> repository.save(data))
// retry strategy for random issues with DB connection
.retryWhen(Retry.backoff(maxRetry, backoffDuration)
.doBeforeRetry(signal -> LOGGER.warn("Retrying to save, attempt {}", signal.totalRetries())))
// give up on saving this item, drop it, try with another one, reset backoff strategy in the meantime
.onErrorContinue(Exceptions::isRetryExhausted, (e, o) -> LOGGER.error("Dropping data"))
.subscribeOn(scheduler, true)
.subscribe(
data-> LOGGER.info("Data {} saved.", data),
error -> LOGGER.error("Fatal error. Terminating store flux.", error)
);
}
public void store(Data data) {
sink.tryEmitNext(data);
}
But when writing tests for it I noticed that if the backoff reaches its limit, the flux just stops instead of dropping the data and continuing.
@BeforeEach
public void setup() {
repository = mock(DataRepository.class);
dataStore = new DataStore(repository, Schedulers.immediate()); //maxRetry = 4, backoffDuration = Duration.ofMillis(1)
}
@Test
public void test() throws Exception {
//given
when(repository.save(any()))
.thenThrow(new RuntimeException("fail")) // normal store
.thenThrow(new RuntimeException("fail")) // first retry
.thenThrow(new RuntimeException("fail")) // second retry
.thenThrow(new RuntimeException("fail")) // third retry
.thenThrow(new RuntimeException("fail")) // fourth retry -> should drop data("One")
.thenAnswer(invocation -> invocation.getArgument(0)) //store data("Two")
.thenAnswer(invocation -> invocation.getArgument(0));//store data("Three")
//when
dataStore.store(data("One")); //exhaust 5 retries
dataStore.store(data("Two")); //successful store
dataStore.store(data("Three")); //successful store
//then
Thread.sleep(2000); //overkill sleep
verify(repository, times(7)).save(any()); //assertion fails. data two and three was not saved.
}
When running this test my assertion fails and in the logs I can see only
Retrying to save, attempt 0
Retrying to save, attempt 1
Retrying to save, attempt 2
Retrying to save, attempt 3
Dropping data
And there is no info of successful processing of data Two and Three.
I do not want to retry indefinitely, because I assume that DB connection may fail from time to time and I do not want to have buffer overflow.
I know that I can achieve a similar flow without a flux (using a queue etc.), but the built-in retry with backoff is very tempting.
How can I drop the error from the flux, given that onErrorContinue does not seem to be working?
General note - the code in the above question isn't used in a reactive context, and therefore this answer suggests options that would be completely wrong if using Webflux or a similar reactive environment.
Firstly, note that onErrorContinue() is almost certainly not what you want - not just in this situation, but generally. It's a specialist operator that almost certainly doesn't quite do what you think it does.
Usually, I'd balk at this line:
.map(data -> repository.save(data))
...as it implies your repository isn't reactive, so you're blocking in a reactive chain - a complete no-no. In this case because you're using it purely for the convenience of the retry semantics it's not going to cause issues, but bear in mind most people used to seeing reactive code will get scared when they see stuff like this!
If you're able to use a reactive repository, that would be better, and then you'd expect to see something like this:
.flatMap(data -> repository.save(data))
...implying that the save() method is returning a non-blocking Mono rather than a plain value after blocking. The norm with retrying would then be to retry on the inner publisher, resuming on an empty publisher if retries are exhausted:
.flatMap(data -> repository.save(data)
.retryWhen(Retry.backoff(maxRetry, backoffDuration)
.doBeforeRetry(signal -> LOGGER.warn("Retrying to save, attempt {}", signal.totalRetries())))
.onErrorResume(Exceptions::isRetryExhausted, e -> Mono.empty())
)
If you're not able or willing to use a reactive repository, then in this case you can still achieve the above by wrapping the call as Mono.fromCallable(() -> repository.save(data)). Note that Mono.just(repository.save(data)) would not work here, because it calls save() eagerly, before the Mono exists, so the inner retryWhen never sees the failure. Again, that's a bit of a code smell, and completely forbidden in a standard reactive chain, as you're making something "look" reactive when it's not.
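The point of retrying on the inner publisher is that the retry budget belongs to a single item, so exhausting it drops that item without ending the whole stream. As a rough illustration only, here is the same per-item scope written as a plain Java loop. It does not use Reactor, it leaves out the backoff delay, and the names PerItemRetry and saveAll are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

final class PerItemRetry {

    // Calls save up to (1 + maxRetry) times for each item. An item whose attempts
    // are all used up is dropped, and processing continues with the next item.
    static <T> List<T> saveAll(List<T> items, UnaryOperator<T> save, int maxRetry) {
        List<T> saved = new ArrayList<>();
        for (T item : items) {
            for (int attempt = 0; attempt <= maxRetry; attempt++) {
                try {
                    saved.add(save.apply(item));
                    break; // success: stop retrying this item
                } catch (RuntimeException e) {
                    if (attempt == maxRetry) {
                        System.err.println("Dropping " + item);
                    }
                    // A real implementation would wait for a backoff interval here.
                }
            }
        }
        return saved;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Mirrors the test in the question: the first five calls fail.
        List<String> saved = saveAll(List.of("One", "Two", "Three"), s -> {
            if (++calls[0] <= 5) {
                throw new RuntimeException("fail");
            }
            return s;
        }, 4);
        System.out.println(saved + " after " + calls[0] + " calls");
    }
}
```

Run against the scenario from the question's test, "One" uses up its five attempts and is dropped, while "Two" and "Three" are saved on their first attempt, for seven calls in total.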

Conditional logic on a Reactor Flux

I am a Reactor newbie. I am trying to develop the following application logic:
Read messages from a Kafka topic source.
Transform the messages.
Write a subset of the transformed messages to a new Kafka topic target.
Explicitly acknowledge the reading operation for all the messages originally read from topic source.
The only solution I found is to rewrite the above business logic as it follows.
Read messages from a Kafka topic source.
Transform the messages.
Immediately acknowledge the messages that will not be written to topic target.
Filter out all of the above messages.
Write the rest of the transformed messages to the new Kafka topic target.
Explicitly acknowledge the reading operation for these messages
The code implementing the second logic is the following:
receiver.receive()
.flatMap(this::processMessage)
.map(this::acknowledgeMessagesNotToWriteInKafka)
.filter(this::isMessageToWriteInKafka)
.as(this::sendToKafka)
.doOnNext(r -> r.correlationMetadata().acknowledge());
Clearly, receiver type is KafkaReceiver, and method sendToKafka uses a KafkaSender. One of the things I don't like is that I am using a map to acknowledge some messages.
Is there any better solution to implement the original logic?
This is not exactly your four business logic steps, but I think it's a little bit closer to what you want.
You could acknowledge the "discarded" messages that won't be written in .doOnDiscard after .filter...
receiver.receive()
.flatMap(this::processMessage)
.filter(this::isMessageToWriteInKafka)
.doOnDiscard(ReceiverRecord.class, record -> record.receiverOffset().acknowledge())
.as(this::sendToKafka)
.doOnNext(r -> r.correlationMetadata().acknowledge());
Note: you'll need to use the proper object type that was discarded. I don't know what type of object the Publisher returned from processMessage emits, but I assume you can get the ReceiverRecord or ReceiverOffset from it in order to acknowledge it.
Alternatively, you could combine filter/doOnDiscard into a single .handle operator...
receiver.receive()
.flatMap(this::processMessage)
.handle((m, sink) -> {
if (isMessageToWriteInKafka(m)) {
sink.next(m);
} else {
m.getReceiverRecord().getReceiverOffset().acknowledge();
}
})
.as(this::sendToKafka)
.doOnNext(r -> r.correlationMetadata().acknowledge());

Assert Kafka send worked

I'm writing an application with Spring Boot so to write to Kafka I do:
@Autowired
private KafkaTemplate<String, String> kafkaTemplate;
and then inside my method:
kafkaTemplate.send(topic, data)
But I feel like I'm just relying on this to work. How can I know whether it has worked? If it's asynchronous, is it good practice to return a 200 code and hope it worked? I'm confused. If Kafka isn't available, won't this fail? Shouldn't I be prompted to catch an exception?
Along with what @mjuarez has mentioned, you can try playing with two Kafka producer properties. One is ProducerConfig.ACKS_CONFIG, which lets you set the level of acknowledgement that you think is safe for your use case. This knob has three possible values. From the Kafka docs:
acks=0: Producer doesn't care about acknowledgement from server, and considers it as sent.
acks=1: This will mean the leader will write the record to its local log but will respond without awaiting full acknowledgement from all followers.
acks=all: This means the leader will wait for the full set of in-sync replicas to acknowledge the record.
The other property is ProducerConfig.RETRIES_CONFIG. Setting a value greater than zero will cause the client to resend any record whose send fails with a potentially transient error.
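As a small sketch of how these two settings might be put together, the snippet below builds a plain java.util.Properties object using the string keys behind the constants ("acks" for ProducerConfig.ACKS_CONFIG and "retries" for ProducerConfig.RETRIES_CONFIG). The class and method names are hypothetical, and the retry count of 3 is an arbitrary example value rather than a recommendation.

```java
import java.util.Properties;

final class ProducerSettings {

    // Builds producer properties that favour durability over latency.
    static Properties reliableProducerProps(String bootstrapServers) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("acks", "all");  // wait for the full set of in-sync replicas
        props.setProperty("retries", "3"); // resend on potentially transient errors (example value)
        return props;
    }

    public static void main(String[] args) {
        System.out.println(reliableProducerProps("localhost:9092"));
    }
}
```

A Properties object like this can be passed to the KafkaProducer constructor, together with the key and value serializer settings that the producer also requires.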
Yes, if Kafka is not available, that .send() call will fail, but if you send it async, no one will be notified. You can specify a callback that you want to be executed when the future finally finishes. Full interface spec here: https://kafka.apache.org/20/javadoc/org/apache/kafka/clients/producer/Callback.html
From the official Kafka javadoc here: https://kafka.apache.org/20/javadoc/index.html?org/apache/kafka/clients/producer/KafkaProducer.html
Fully non-blocking usage can make use of the Callback parameter to
provide a callback that will be invoked when the request is complete.
ProducerRecord<byte[],byte[]> record = new ProducerRecord<byte[],byte[]>("the-topic", key, value);
producer.send(record,
new Callback() {
public void onCompletion(RecordMetadata metadata, Exception e) {
if(e != null) {
e.printStackTrace();
} else {
System.out.println("The offset of the record we just sent is: " + metadata.offset());
}
}
});
You can run the command below while sending messages to Kafka:
bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic topic-name
While the above command is running, run your code; if the messages are sent successfully, they will be printed on the console.
Furthermore, as with a connection to any other resource, if the connection cannot be established, any operation on it will raise an exception.
