Exactly once Producer and Consumption - Apache Kafka and SpringBoot - java

I am working on a microservices project in which Microservice A runs a process in several steps. At the completion of each step, Microservice A sends a message to a Kafka topic. Microservice B then consumes the message from that topic and sends an email notifying the successful completion of the step. I need exactly-once semantics for this. I am using KafkaTemplate.send in Microservice A and @KafkaListener to read the message in Microservice B. My question is whether the KafkaTemplate producer and the @KafkaListener consumer are idempotent and, if not, how I can make them idempotent.
Regards,
I am autowiring the KafkaTemplate using the following code:
@Autowired
public EventProducer(NewTopic topic, KafkaTemplate<String, Event> kafkaTemplate) {
    this.kafkaTemplate = kafkaTemplate;
}

Exactly-once semantics in Kafka apply to consume->process->produce operations within the same application. Even then, only the entire consume->process->produce sequence is "exactly once"; the consume->process part is at least once. Consumption on its own is always at least once (or at most once), including in your scenario (for B).
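Because consumption in B is at least once, one common way to avoid duplicate emails is to make the handler itself idempotent, for example by deduplicating on an event id. The following is a minimal sketch under stated assumptions, not part of Spring Kafka: IdempotentEventHandler, the event id, and the in-memory set are all hypothetical, and a real service would need a durable store shared by all instances of B. It would be called from the @KafkaListener method. On the producer side, the Kafka producer setting enable.idempotence=true prevents duplicates caused by producer retries, but it does not help if A itself sends the same event twice.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

// Hypothetical helper for Microservice B: runs the side effect (sending the
// email) only once per event id, even if Kafka redelivers the record.
class IdempotentEventHandler {
    // In-memory set for illustration only (assumption); a real service would
    // use a durable store, e.g. a table with a unique key on the event id.
    private final Set<String> processedIds = ConcurrentHashMap.newKeySet();
    private final Consumer<String> sideEffect;

    IdempotentEventHandler(Consumer<String> sideEffect) {
        this.sideEffect = sideEffect;
    }

    // Returns true if the event was processed, false if it was a duplicate.
    boolean handle(String eventId, String payload) {
        if (!processedIds.add(eventId)) {
            return false; // id already recorded: skip the side effect
        }
        try {
            sideEffect.accept(payload);
            return true;
        } catch (RuntimeException e) {
            processedIds.remove(eventId); // allow a retry after a failure
            throw e;
        }
    }

    public static void main(String[] args) {
        IdempotentEventHandler handler =
                new IdempotentEventHandler(p -> System.out.println("email sent for: " + p));
        System.out.println(handler.handle("step-1", "Step 1 completed")); // true
        System.out.println(handler.handle("step-1", "Step 1 completed")); // false (redelivery)
    }
}
```

Even with a durable store, a crash between recording the id and sending the email leads to either a lost or a duplicated email, depending on which of the two happens first, so this narrows the window rather than giving a strict exactly-once guarantee.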

Related

Consume 2 topics but process only one topic which is enabled

For our application we write the same messages to 2 different Kafka brokers with 2 different topics. On the consumer side I have to consume both topics but process only the one that is enabled in the Spring Boot properties file.
How can I do this using Spring Boot and @KafkaListener?
You can configure this simply with:
@KafkaListener(topics = { "${spring.kafka.topic1}", "${spring.kafka.topic2}" })
Now, if you want to consume both but process one and discard the other, you can use keys in the messages and set the key accordingly on both topics.
Ref:
Java consumer API : ConsumerRecord#key()
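As an alternative to keys, the decision can also be made on the topic name, which is available on each record through ConsumerRecord#topic(). The sketch below shows only that decision in plain Java. EnabledTopicFilter and the topic names are hypothetical; the enabled topic is assumed to come from a property in the Spring Boot configuration.

```java
// Hypothetical filter: decides whether a record should be processed, based on
// the topic it came from and the single topic enabled in configuration.
class EnabledTopicFilter {
    private final String enabledTopic;

    EnabledTopicFilter(String enabledTopic) {
        this.enabledTopic = enabledTopic;
    }

    // recordTopic would be ConsumerRecord#topic() inside the listener method.
    boolean shouldProcess(String recordTopic) {
        return enabledTopic.equals(recordTopic);
    }

    public static void main(String[] args) {
        EnabledTopicFilter filter = new EnabledTopicFilter("topic-a"); // assumed name
        System.out.println(filter.shouldProcess("topic-a")); // true: process
        System.out.println(filter.shouldProcess("topic-b")); // false: discard
    }
}
```

Inside the listener, records for which shouldProcess returns false would simply be skipped, so their offsets still advance.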

Spring cloud kafka binder query

We have a requirement where we consume messages from one topic, enrich them, and then publish the enriched messages to another topic. Below are the steps:
Consumer - consumes the message
Enrichment - enriches the consumed message
Producer - publishes the enriched message to the other topic
I am using the Spring Cloud Kafka binder and things were working fine. Suddenly we observed that the producer was sending duplicate messages to the topic, so we made the producer idempotent. We have set autoCommitOffset to false for better control. Below is what we are doing in the method:
@StreamListener("INPUT")
@SendTo("OUTPUT")
public String consumer(Message<?> message) {
    String inputMessage = message.getPayload().toString();
    String enrichMessage = inputMessage; // placeholder: apply the enrichment to inputMessage here
    return enrichMessage;
}
We observed that if ack.acknowledge() fails due to some issue, the message is still sent to the outbound channel. How can we handle the entire consumer/producer flow as one transaction, so that if the acknowledgment fails the message is not sent to the topic?
I have also set the transaction properties below:
spring.cloud.stream.kafka.binder.transaction.transactionIdPrefix=TX-
spring.cloud.stream.kafka.binder.transaction.producer.configuration.ack=all
spring.cloud.stream.kafka.binder.transaction.producer.configuration.retries=1
spring.cloud.stream.kafka.bindings.input.consumer.autoCommitOffset=true
spring.cloud.stream.kafka.bindings.input.consumer.enableDlq=true
spring.cloud.stream.kafka.bindings.input.consumer.dlqName=error.topic
spring.cloud.stream.kafka.bindings.input.consumer.autoCommitOnError=true
If there is any example available that would be really helpful.
Cheers
You need to make the binder transactional. See the documentation
https://docs.spring.io/spring-cloud-stream-binder-kafka/docs/3.1.4/reference/html/spring-cloud-stream-binder-kafka.html#_kafka_binder_properties
spring.cloud.stream.kafka.binder.transaction.transactionIdPrefix
Enables transactions in the binder. See transaction.id in the Kafka documentation and Transactions in the spring-kafka documentation. When transactions are enabled, individual producer properties are ignored and all producers use the spring.cloud.stream.kafka.binder.transaction.producer.* properties.
Default null (no transactions)
Note that consumers on the output topic must be configured with isolation.level=read_committed to avoid receiving rolled-back records.
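As a hedged illustration of the last note, a downstream consumer of the output topic could be configured along these lines. The binding name input is a placeholder, and which of the two lines applies depends on whether that consumer uses the Kafka binder or plain Spring Boot with spring-kafka.

```properties
# Downstream consumer that is itself a Spring Cloud Stream Kafka binding
# ("input" is a placeholder binding name)
spring.cloud.stream.kafka.bindings.input.consumer.configuration.isolation.level=read_committed

# Downstream consumer that uses plain Spring Boot with spring-kafka
spring.kafka.consumer.isolation-level=read_committed
```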

Kafka partition blocked when infra problem in a single instance of application

I have a problem with a microservice that runs on Kubernetes with many pods.
I use the manual commit strategy, so I have to acknowledge (or not) every message.
All instances of the application belong to the same Kafka consumer group, and the topic has at least 20 partitions divided among the pods. When consuming a message, the listener calls an external component (such as a REST API via WebClient or RestTemplate, or a Kafka producer for a different topic). The Kafka consumer looks like this:
@KafkaListener(topics = "topic")
@Trace
public void listen(@Payload Object message, Acknowledgment acknowledgment) {
    try {
        api.call(message);
        acknowledgment.acknowledge();
    } catch (InfraException e) {
        // Do not commit; ask for redelivery after a 1000 ms pause
        acknowledgment.nack(1000);
    }
}
But sometimes this external component has infrastructure problems and is not available. The problem usually happens when, for some reason, a single pod has connectivity issues. Because the message is not acknowledged, it continues to be consumed, which is good. The problem is that the message keeps being delivered to the same problematic instance of the application and is never redirected to another 'healthy' consumer. Because the consumer is still able to fetch messages from Kafka and send heartbeats, Kafka never considers it a problematic consumer, even after a rebalance.
Is there a strategy, or something we can do in the configuration, to solve this problem or prevent the partition from being blocked?
Thank you for your attention so far.

How to pause/start/stop Kafka Producer / Kafka Template

I'm using a Spring Boot app with Kafka integration and I want to implement an endpoint to stop and start Kafka from publishing messages.
The messages are triggered asynchronously by other endpoints.
The KafkaTemplate<String, String> and ProducerFactory<String, String> beans do not have any stop or pause actions.
My goal is to be able to simulate connection failures and make sure those messages are stored in the fallback mechanism that I have in place.
Any ideas?
The KafkaTemplate doesn't have those callbacks because it is a passive component that only does something when we call it.
To simulate a connection failure, I would suggest implementing a custom ProducerFactory that produces a KafkaProducer with a mocked or overridden Future<RecordMetadata> send(ProducerRecord<K, V> record, Callback callback);. In that ProducerFactory you can implement appropriate lifecycle callbacks and react to the state in the send() implementation.
The org.apache.kafka.clients.producer.MockProducer may have something for you to reuse or borrow. For example, see its close() or fenceProducer().
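The idea of a send() that reacts to a switchable state can be sketched in plain Java. This is only an illustration of the pattern, not the Kafka or Spring Kafka API: SwitchableSender, simulateOutage, and the String-based delegate are hypothetical stand-ins for the overridden send(ProducerRecord, Callback), and the endpoint mentioned in the question would toggle the flag.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Function;

// Hypothetical stand-in for a producer whose send() can be made to fail at
// runtime, so that the caller's fallback path can be exercised.
class SwitchableSender {
    private final AtomicBoolean outage = new AtomicBoolean(false);
    private final Function<String, CompletableFuture<String>> delegate;

    SwitchableSender(Function<String, CompletableFuture<String>> delegate) {
        this.delegate = delegate;
    }

    // Would be called from the start/stop endpoint.
    void simulateOutage(boolean on) {
        outage.set(on);
    }

    CompletableFuture<String> send(String message) {
        if (outage.get()) {
            // Fail the future instead of forwarding to the real send.
            CompletableFuture<String> failed = new CompletableFuture<>();
            failed.completeExceptionally(
                    new IllegalStateException("simulated connection failure"));
            return failed;
        }
        return delegate.apply(message);
    }

    public static void main(String[] args) {
        List<String> fallbackStore = new ArrayList<>();
        SwitchableSender sender =
                new SwitchableSender(m -> CompletableFuture.completedFuture("sent:" + m));

        sender.simulateOutage(true);
        sender.send("order-1").exceptionally(ex -> {
            fallbackStore.add("order-1"); // caller's fallback mechanism
            return null;
        });

        sender.simulateOutage(false);
        System.out.println(sender.send("order-2").join()); // sent:order-2
        System.out.println(fallbackStore);                 // [order-1]
    }
}
```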

Spring Cloud Stream with RabbitMQ binder, how to apply @Transactional?

I have a Spring Cloud Stream application that receives events from RabbitMQ using the Rabbit Binder. My application can be summarized as this:
@Transactional
@StreamListener(MySink.SINK_NAME)
public void processEvents(Flux<Event> events) {
    // Transform events and store them in MongoDB using
    // spring-boot-data-mongodb-reactive
    ...
}
The problem is that @Transactional doesn't seem to work with Spring Cloud Stream (or at least that's my impression): if there's an exception when writing to MongoDB, the event seems to have already been acked to RabbitMQ and the operation is not retried.
Given that I want to achieve basically the same functionality as when using @Transactional around a function with spring-amqp:
Do I have to manually ACK the messages to RabbitMQ when using Spring Cloud Stream with the Rabbit Binder?
If so, how can I achieve this?
There are several issues here.
Transactions are not required for acknowledging messages.
Reactor-based @StreamListener methods are invoked exactly once, just to set up the Flux, so @Transactional on that method is meaningless. Messages then flow through the flux, so anything pertaining to individual messages has to be done within the context of the flux.
Spring transactions are bound to the thread, and Reactor is non-blocking; the message will be acked at the first handoff.
Yes, you would need to use manual acks, presumably based on the result of the MongoDB store operation. You would probably need to use Flux<Message<Event>> so that you have access to the channel and delivery tag headers.
