I have an app using Spring Boot 2.5.2.
We have a Kafka consumer, and it runs in multiple instances of the application.
Here is the Consumer Config
#Bean("KafkaListenerContainerFactory")
public ConcurrentKafkaListenerContainerFactory<String, String> kafkaListenerContainerFactory() {
ConcurrentKafkaListenerContainerFactory<String, String> factory = new ConcurrentKafkaListenerContainerFactory<>();
factory.setConsumerFactory(consumerFactory());
factory.setBatchListener(isBatchConsumer);
factory.getContainerProperties().setAckMode(ContainerProperties.AckMode.MANUAL_IMMEDIATE);
factory.getContainerProperties().setConsumerTaskExecutor(messageProcessorExecutor());
return factory;
}
@Bean
public ConsumerFactory<String, String> consumerFactory() {
    Map<String, Object> props = new HashMap<>();
    props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, messagingAddress);
    props.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, maxBatch);
    props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, false);
    props.put(ConsumerConfig.ISOLATION_LEVEL_CONFIG, IsolationLevel.READ_COMMITTED.toString().toLowerCase());
    props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
    props.put(ConsumerConfig.SESSION_TIMEOUT_MS_CONFIG, sessionTimeoutKafkaConfig);
    props.put(ConsumerConfig.PARTITION_ASSIGNMENT_STRATEGY_CONFIG, CooperativeStickyAssignor.class.getName());
    return new DefaultKafkaConsumerFactory<>(props);
}
Here is my Consumer
@KafkaListener(topics = "${topic.mesage}", groupId = "#{'${groupid.ms}'}", properties = {
        "key.deserializer=org.apache.kafka.common.serialization.StringDeserializer",
        "value.deserializer=org.apache.kafka.common.serialization.StringDeserializer"},
        concurrency = "${messaging.consumer.concurrent.thread}", containerFactory = "KafkaListenerContainerFactory")
public void consumerListener(String data, Acknowledgment acknowledgment,
        @Header(KafkaHeaders.RECEIVED_PARTITION_ID) Integer partitions,
        @Header(KafkaHeaders.OFFSET) Long offsets) {
    logger.info("MessagingConsumer partitions {} offsets {}", partitions, offsets);
    acknowledgment.acknowledge();
    logger.info("MessagingConsumer acknowledge: " + data);
...
When I redeploy my application, a problem occurs: I found that a message was consumed twice.
In the first instance, two logs show that the message was acknowledged ("MessagingConsumer partitions 29 offsets 21204" and "MessagingConsumer acknowledge:"). But after some time, the same message was consumed again in the second instance, with the same partition and offset. Between the logs of the two instances, I found some "partitions revoked:" and "partitions assigned:" entries.
I cannot understand why the message is still consumed twice if I have acknowledged it successfully.
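One way to narrow this down (just a diagnostic sketch, not a fix; the logger and imports are assumed) is to register a ConsumerAwareRebalanceListener inside the kafkaListenerContainerFactory() bean above and log the committed offsets at the moment partitions are revoked and assigned:
factory.getContainerProperties().setConsumerRebalanceListener(new ConsumerAwareRebalanceListener() {
    @Override
    public void onPartitionsRevokedAfterCommit(Consumer<?, ?> consumer, Collection<TopicPartition> partitions) {
        // What the group has committed at the moment this instance gives the partitions up.
        logger.info("partitions revoked: {} committed: {}", partitions, consumer.committed(new HashSet<>(partitions)));
    }
    @Override
    public void onPartitionsAssigned(Consumer<?, ?> consumer, Collection<TopicPartition> partitions) {
        // Where the new owner will resume from.
        logger.info("partitions assigned: {} committed: {}", partitions, consumer.committed(new HashSet<>(partitions)));
    }
});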
Could anyone help me?
Related
In my Spring Boot application I have a Kafka consumer class which reads messages whenever they are available in the topic. I want to limit the consumer to consuming one message every 2 hours: after reading a message, the consumer should pause for 2 hours and then consume the next one.
This is my consumer config method:
@Bean
public Map<String, Object> scnConsumerConfigs() {
    Map<String, Object> propsMap = new HashMap<>();
    // common props
    logger.info("KM Dataloader :: Kafka Brokers for Software topic: {}", bootstrapServersscn);
    propsMap.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServersscn);
    propsMap.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, false);
    propsMap.put(ConsumerConfig.SESSION_TIMEOUT_MS_CONFIG, "15000");
    propsMap.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
    propsMap.put(ConsumerConfig.AUTO_COMMIT_INTERVAL_MS_CONFIG, "100");
    propsMap.put(ConsumerConfig.MAX_POLL_INTERVAL_MS_CONFIG, 7200000);
    // ssl props
    propsMap.put("security.protocol", mpaasSecurityProtocol);
    propsMap.put(SslConfigs.SSL_TRUSTSTORE_LOCATION_CONFIG, truststorePath);
    propsMap.put(SslConfigs.SSL_TRUSTSTORE_PASSWORD_CONFIG, truststorePassword);
    propsMap.put(SslConfigs.SSL_KEYSTORE_LOCATION_CONFIG, keystorePath);
    propsMap.put(SslConfigs.SSL_KEYSTORE_PASSWORD_CONFIG, keystorePassword);
    return propsMap;
}
Then I create this container method where I set up the rest of the Kafka configuration:
ConcurrentKafkaListenerContainerFactory<String, Object> factory = new ConcurrentKafkaListenerContainerFactory<>();
LOGGER.info("Setting concurrency to {} for {}", config.getConcurrency(), topicName);
factory.setConcurrency(config.getConcurrency());
factory.setConsumerFactory(cFactory);
factory.setRetryTemplate(retryTemplate);
factory.getContainerProperties().setIdleBetweenPolls(7200000);
return factory;
Using this code the partitions are rebalanced every 2 hours, but it is not reading messages at all.
My Kafka consumer method:
@Bean
public KmKafkaListener softwareKafkaListener(KmSoftwareService softwareService) {
    return new KmKafkaListener(softwareService) {
        @KafkaListener(topics = SOFTWARE_TOPIC, containerFactory = "softwareMessageContainer", groupId = SOFTWARE_CONSUMER_GROUP)
        public void onscnMessageforSA20(@Payload ConsumerRecord<String, Object> record)
                throws InterruptedException {
            this.onMessage(record);
        }
    };
}
Try adding the @KafkaListener annotated method directly in KmKafkaListener so that Spring Kafka will take care of calling it.
public class KmKafkaListener {
    @KafkaListener(topics = SOFTWARE_TOPIC, containerFactory = "softwareMessageContainer", groupId = SOFTWARE_CONSUMER_GROUP)
    public void onscnMessageforSA20(@Payload ConsumerRecord<String, Object> record)
            throws InterruptedException {
        this.onMessage(record);
    }
}
and initialize the bean this way:
@Bean
public KmKafkaListener softwareKafkaListener(KmSoftwareService softwareService) {
    return new KmKafkaListener(softwareService);
}
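Put together, the class might look roughly like this (a sketch only; the constructor is inferred from the new KmKafkaListener(softwareService) call above, and onMessage(...) is the existing processing method from the question, not shown here):
public class KmKafkaListener {

    private final KmSoftwareService softwareService;

    public KmKafkaListener(KmSoftwareService softwareService) {
        this.softwareService = softwareService;
    }

    @KafkaListener(topics = SOFTWARE_TOPIC, containerFactory = "softwareMessageContainer", groupId = SOFTWARE_CONSUMER_GROUP)
    public void onscnMessageforSA20(@Payload ConsumerRecord<String, Object> record) throws InterruptedException {
        // Delegate to the existing onMessage(...) processing logic defined elsewhere in this class.
        this.onMessage(record);
    }
}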
We see that some messages are lost when consuming from a Kafka topic, especially during a restart of the service, when using the default properties:
@Bean
public ConsumerFactory<String, String> consumerFactory() {
    // Creating a Map of string-object pairs
    Map<String, Object> config = new HashMap<>();
    // Adding the Configuration
    config.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "127.0.0.1:9092");
    config.put(ConsumerConfig.GROUP_ID_CONFIG, "group_id");
    config.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
    config.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
    return new DefaultKafkaConsumerFactory<>(config);
}

// Creating a Listener
@Bean
public ConcurrentKafkaListenerContainerFactory<String, String> concurrentKafkaListenerContainerFactory() {
    ConcurrentKafkaListenerContainerFactory<String, String> factory = new ConcurrentKafkaListenerContainerFactory<>();
    factory.setConsumerFactory(consumerFactory());
    return factory;
}
From the documentation, the default value for ackMode is BATCH, which is described as:
Commit the offset when all the records returned by the poll() have been processed
How does Spring know that all the messages are processed in a simple example like the one below? And does it mean that, when we restart the service, the offsets are committed even though we haven't processed the messages, leading to message loss?
@KafkaListener(topics = "topicName", groupId = "foo")
public void listenGroupFoo(String message) {
    System.out.println("Received Message in group foo: " + message);
}
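To make the commit point explicit instead of relying on the BATCH default, the ack mode can be set on the container factory. A minimal sketch (the factory bean name is invented; it reuses the consumerFactory() shown above) that commits after each record once the listener has returned from it:
@Bean
public ConcurrentKafkaListenerContainerFactory<String, String> recordAckContainerFactory() {
    ConcurrentKafkaListenerContainerFactory<String, String> factory = new ConcurrentKafkaListenerContainerFactory<>();
    factory.setConsumerFactory(consumerFactory());
    // RECORD: commit the offset after the listener returns for each record,
    // instead of after the whole batch returned by poll() has been processed (BATCH, the default).
    factory.getContainerProperties().setAckMode(ContainerProperties.AckMode.RECORD);
    return factory;
}
A listener that should use it would then reference it with containerFactory = "recordAckContainerFactory".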
I am facing a strange problem in Kafka: all messages from the topic are being replayed after the consumer application restarts. Can anyone help me see what I am doing wrong here?
Here is my configuration:
spring.kafka.consumer.auto-offset-reset=earliest
spring.kafka.enable.auto.commit= false
My Producer configuration:
producerconfigs.put(ENABLE_IDEMPOTENCE_CONFIG, "true");
producerconfigs.put(ACKS_CONFIG, "all");
producerconfigs.put(ProducerConfig.CLIENT_ID_CONFIG, "client.id");
producerconfigs.put(RETRIES_CONFIG, Integer.toString(Integer.MAX_VALUE));
producerconfigs.put(MAX_IN_FLIGHT_REQUESTS_PER_CONNECTION, 5);
producerconfigs.put(TRANSACTIONAL_ID_CONFIG, "V1-"+ UUID.randomUUID().toString());
Consumer configuration:
consumerconfigs.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, false);
consumerconfigs.put(SESSION_TIMEOUT_MS_CONFIG, "10000");
consumerconfigs.put("isolation.level", "read_committed");
Consumer code:
@KafkaListener(topics = "TOPIC-1", groupId = "TOPIC-GRP", containerFactory = "kaListenerContainerFactory", concurrency = "10", autoStartup = "true")
public String processMesage(@Payload String message,
        @Header(value = KafkaHeaders.CORRELATION_ID, required = false) String correlationId,
        @Header(value = KafkaHeaders.OFFSET, required = false) String offset) throws JsonProcessingException { //business logic goes here }
Container Code
@Bean
public ConcurrentKafkaListenerContainerFactory<String, String> kafkaListenerContainerFactory() {
    ConcurrentKafkaListenerContainerFactory<String, String> factory = new ConcurrentKafkaListenerContainerFactory<>();
    factory.setConsumerFactory(consumerFactoryString());
    factory.setBatchListener(true);
    return factory;
}
consumer config
Map<String, Object> getConsumerProperties() {
    Map<String, Object> config = new HashMap<>();
    config.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, environment.getProperty("spring.kafka.consumer.bootstrap-servers"));
    config.put(ConsumerConfig.GROUP_ID_CONFIG, environment.getProperty("spring.kafka.consumer.group-id"));
    config.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, environment.getProperty("spring.kafka.consumer.auto-offset-reset"));
    config.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, environment.getProperty("spring.kafka.enable.auto.commit"));
    config.put(KEY_DESERIALIZER_CLASS_CONFIG, environment.getProperty("spring.kafka.consumer.key-deserializer"));
    config.put(VALUE_DESERIALIZER_CLASS_CONFIG, environment.getProperty("spring.kafka.consumer.value-deserializer"));
    config.put("isolation.level", "read_committed");
    return config;
}
application.properties
spring.kafka.consumer.bootstrap-servers=${spring.embedded.kafka.brokers}
spring.kafka.consumer.group-id=consumer-group
spring.kafka.consumer.auto-offset-reset=earliest
spring.kafka.consumer.key-deserializer=org.apache.kafka.common.serialization.StringDeserializer
spring.kafka.consumer.value-deserializer=org.apache.kafka.common.serialization.StringDeserializer
spring.kafka.enable.auto.commit= false
I am not sure if someone has already answered the question, but it seems the config is causing the issue.
With auto commit set to false (and no offsets being committed any other way), the consumer group never records which offsets it has already processed, and auto-offset-reset=earliest means that, when there is no committed offset, the consumer always reads from the beginning.
spring.kafka.enable.auto.commit= false
spring.kafka.consumer.auto-offset-reset=earliest
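In other words, with enable-auto-commit=false the offsets need to be committed some other way for a restart to resume where it left off. A rough sketch based on the factory and listener from the question (only the ack-related parts change; the batch-listener setting is dropped here to keep per-record acknowledgment simple, and the bean and method names follow the question):
@Bean
public ConcurrentKafkaListenerContainerFactory<String, String> kaListenerContainerFactory() {
    ConcurrentKafkaListenerContainerFactory<String, String> factory = new ConcurrentKafkaListenerContainerFactory<>();
    factory.setConsumerFactory(consumerFactoryString());
    // Let the listener acknowledge each record so the group has committed offsets;
    // auto-offset-reset=earliest then only applies when the group has no committed offset yet.
    factory.getContainerProperties().setAckMode(ContainerProperties.AckMode.MANUAL_IMMEDIATE);
    return factory;
}

@KafkaListener(topics = "TOPIC-1", groupId = "TOPIC-GRP", containerFactory = "kaListenerContainerFactory")
public void processMesage(String message, Acknowledgment acknowledgment) {
    // business logic goes here
    acknowledgment.acknowledge();
}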
I have an application for logging transactions to HBase tables. I am consuming transaction messages from Kafka, but my Kafka listener logs the following lines and cannot persist to the HBase table. I am using spring-kafka 2.1.7.RELEASE for the consumer. How can I resolve this issue? My Kafka consumer implementation looks like this:
@KafkaListener(topics = "${kafka.consumer.topic}")
public void receive(ConsumerRecord<String, String> consumerRecord) throws IOException {
    if (StringUtils.hasText(consumerRecord.value())) {
        // Some business logic
    }
}
Kafka Listener Config
@Configuration
@EnableKafka
public class KafkaListenerConfig {

    @Autowired
    private KafkaListenerProperties kafkaListenerProperties;

    @Bean
    public ConcurrentKafkaListenerContainerFactory<String, String> kafkaListenerContainerFactory() {
        ConcurrentKafkaListenerContainerFactory<String, String> factory = new ConcurrentKafkaListenerContainerFactory<>();
        // factory.setConcurrency(1);
        factory.setConsumerFactory(consumerFactory());
        return factory;
    }

    @Bean
    public DefaultKafkaConsumerFactory consumerFactory() {
        return new DefaultKafkaConsumerFactory<>(consumerProps(), stringKeyDeserializer(), workUnitJsonValueDeserializer());
    }

    @Bean
    public Map<String, Object> consumerProps() {
        Map<String, Object> props = new HashMap<>();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, kafkaListenerProperties.getBootstrap());
        props.put(ConsumerConfig.GROUP_ID_CONFIG, kafkaListenerProperties.getGroup());
        props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, true);
        props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "latest");
        // props.put(ConsumerConfig.AUTO_COMMIT_INTERVAL_MS_CONFIG, "100");
        props.put(ConsumerConfig.SESSION_TIMEOUT_MS_CONFIG, "15000");
        return props;
    }

    @Bean
    public Deserializer stringKeyDeserializer() {
        return new StringDeserializer();
    }

    @Bean
    public Deserializer workUnitJsonValueDeserializer() {
        return new StringDeserializer();
    }
}
The Kafka listener / HBase logging looks like the following:
INFO org.apache.hadoop.hbase.client.AsyncProcess [org.springframework.kafka.KafkaListenerEndpointContainer#0-0-C-1] #106, waiting for some tasks to finish. Expected max=0, tasksInProgress=2 hasError=false, tableName=FraudRequest
INFO org.apache.hadoop.hbase.client.AsyncProcess [org.springframework.kafka.KafkaListenerEndpointContainer#0-0-C-1] Left over 2 task(s) are processed on server(s): [XXXXX.kfs.local,60020,1552673278405]
INFO org.apache.hadoop.hbase.client.AsyncProcess [org.springframework.kafka.KafkaListenerEndpointContainer#0-0-C-1] Regions against which left over task(s) are processed: [FraudRequest,15543323,1554891506303.d3bb6fef4ab349e93729d14d13f730bc.]
INFO org.apache.hadoop.hbase.client.AsyncProcess [org.springframework.kafka.KafkaListenerEndpointContainer#0-0-C-1] #107, waiting for some tasks to finish. Expected max=0, tasksInProgress=2 hasError=false, tableName=FraudRequest
INFO org.apache.hadoop.hbase.client.AsyncProcess [org.springframework.kafka.KafkaListenerEndpointContainer#0-0-C-1] Left over 2 task(s) are processed on server(s): [XXXXX.kfs.local,60020,1552673278405]
INFO org.apache.hadoop.hbase.client.AsyncProcess [org.springframework.kafka.KafkaListenerEndpointContainer#0-0-C-1] Regions against which left over task(s) are processed: [FraudRequest,15543323,1554891506303.d3bb6fef4ab349e93729d14d13f730bc.]
I have a couple of questions regarding the behaviour of spring-kafka in certain scenarios. Any answers or pointers would be great.
Background: I am building a Kafka consumer which talks to external APIs and sends an acknowledgment back. My config looks like this:
@Bean
public Map<String, Object> consumerConfigs() {
    Map<String, Object> props = new HashMap<>();
    props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, brokerServers());
    props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
    props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, JsonDeserializer.class);
    props.put(ConsumerConfig.GROUP_ID_CONFIG, this.configuration.getString("kafka-generic.consumer.group.id"));
    props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
    props.put(ConsumerConfig.SESSION_TIMEOUT_MS_CONFIG, "5000000");
    props.put(ConsumerConfig.REQUEST_TIMEOUT_MS_CONFIG, "6000000");
    return props;
}
@Bean
public RetryTemplate retryTemplate() {
    final ExponentialRandomBackOffPolicy backOffPolicy = new ExponentialRandomBackOffPolicy();
    backOffPolicy.setInitialInterval(this.configuration.getLong("retry-exp-backoff-init-interval"));
    final SimpleRetryPolicy retryPolicy = new SimpleRetryPolicy();
    retryPolicy.setMaxAttempts(this.configuration.getInt("retry-max-attempts"));
    final RetryTemplate retryTemplate = new RetryTemplate();
    retryTemplate.setBackOffPolicy(backOffPolicy);
    retryTemplate.setRetryPolicy(retryPolicy);
    return retryTemplate;
}
@Bean
public ConcurrentKafkaListenerContainerFactory<String, Event> retryKafkaListenerContainerFactory() {
    ConcurrentKafkaListenerContainerFactory<String, Event> factory = new ConcurrentKafkaListenerContainerFactory<>();
    factory.setConsumerFactory(consumerFactory());
    factory.setConcurrency(this.configuration.getInt("kafka-concurrency"));
    factory.setRetryTemplate(retryTemplate());
    factory.getContainerProperties().setIdleEventInterval(this.configuration.getLong("kafka-rtm-idle-time"));
    //factory.getContainerProperties().setAckOnError(false);
    factory.getContainerProperties().setErrorHandler(kafkaConsumerErrorHandler);
    factory.getContainerProperties().setAckMode(AbstractMessageListenerContainer.AckMode.MANUAL_IMMEDIATE);
    return factory;
}
Let's say the number of partitions I have is 4. My partition distribution for the KafkaListeners is:
@KafkaListener(topicPartitions = @TopicPartition(topic = "topic", partitions = {"0", "1"}),
        containerFactory = "retryKafkaListenerContainerFactory")
public void receive(Event event, Acknowledgment acknowledgment) throws Exception {
    serviceInvoker.callService(event);
    acknowledgment.acknowledge();
}
@KafkaListener(topicPartitions = @TopicPartition(topic = "topic", partitions = {"2", "3"}),
        containerFactory = "retryKafkaListenerContainerFactory")
public void receive1(Event event, Acknowledgment acknowledgment) throws Exception {
    serviceInvoker.callService(event);
    acknowledgment.acknowledge();
}
Now my questions are:
1. Let's say I have 2 machines where I deployed this code (with the same consumer group id). If I understood properly, when an event arrives on a partition, the corresponding KafkaListener on one of the machines will receive it, but the other machine's KafkaListener won't. Is that right?
My error handler is:
@Named
public class KafkaConsumerErrorHandler implements ErrorHandler {

    @Inject
    private KafkaListenerEndpointRegistry kafkaListenerEndpointRegistry;

    @Override
    public void handle(Exception e, ConsumerRecord<?, ?> consumerRecord) {
        System.out.println("Shutting down all the containers");
        kafkaListenerEndpointRegistry.stop();
    }
}
2. Let's talk about a scenario where a consumer's KafkaListener is invoked and calls serviceInvoker.callService(event); but the service is down. According to the retryKafkaListenerContainerFactory it retries 3 times and then fails, so the error handler is called and stops the kafkaListenerEndpointRegistry. Will this shut down all other consumers or machines in the same consumer group, or just this consumer or machine?
3. Let's talk about the scenario in 2. Is there any configuration we need to change to let Kafka know to hold off the acknowledgement for that long?
4. My Kafka producer produces a message every 10 minutes. Do I need to configure that 10-minute interval anywhere in my consumer code, or is the consumer agnostic of it?
5. In my KafkaListener annotations I hardcoded the topic name and partitions. Can I change them at run time?
Any help is really appreciated. Thanks in advance. :)
1. Correct; only one of them will get it.
2. It will only stop the local containers - Spring doesn't know anything about your other instances.
3. Since you have ackOnError=false, the offset won't be committed.
4. The consumer does not need to know how often messages are published.
5. You can't change them at runtime, but you can use property placeholders ${...} or SpEL expressions #{...} to set them up during application initialization.
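For point 5, a small sketch of what that could look like (the property names consumer.topic and consumer.partitions are invented for the example):
@KafkaListener(topicPartitions = @TopicPartition(topic = "${consumer.topic}",
        partitions = "#{'${consumer.partitions}'.split(',')}"),
        containerFactory = "retryKafkaListenerContainerFactory")
public void receive(Event event, Acknowledgment acknowledgment) throws Exception {
    serviceInvoker.callService(event);
    acknowledgment.acknowledge();
}
with, for example, these entries in application.properties:
consumer.topic=topic
consumer.partitions=0,1
The values are resolved once, when the listener containers are created during application startup, not on every poll.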