Spring Kafka ChainedKafkaTransactionManager doesn't synchronize with JPA Spring-data transaction - java

I have read a ton of Gary Russell's answers and posts, but haven't found an actual solution for the common use case of synchronizing the sequence below:
receive from topic A => save to DB via Spring Data => send to topic B
As I understand it, there is no guarantee of fully atomic processing in this case and I need to deal with message deduplication on the consumer side, but the main issue is that the ChainedKafkaTransactionManager doesn't synchronize with the JpaTransactionManager (see @KafkaListener below).
Kafka config:
@Production
@EnableKafka
@Configuration
@EnableTransactionManagement
public class KafkaConfig {
private static final Logger log = LoggerFactory.getLogger(KafkaConfig.class);
@Bean
public ConsumerFactory<String, byte[]> commonConsumerFactory(@Value("${kafka.broker}") String bootstrapServer) {
Map<String, Object> props = new HashMap<>();
props.put(BOOTSTRAP_SERVERS_CONFIG, bootstrapServer);
props.put(AUTO_OFFSET_RESET_CONFIG, "earliest");
props.put(SESSION_TIMEOUT_MS_CONFIG, 10000);
props.put(ENABLE_AUTO_COMMIT_CONFIG, false);
props.put(MAX_POLL_RECORDS_CONFIG, 10);
props.put(MAX_POLL_INTERVAL_MS_CONFIG, 17000);
props.put(FETCH_MIN_BYTES_CONFIG, 1048576);
props.put(FETCH_MAX_WAIT_MS_CONFIG, 1000);
props.put(ISOLATION_LEVEL_CONFIG, "read_committed");
props.put(KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
props.put(VALUE_DESERIALIZER_CLASS_CONFIG, ByteArrayDeserializer.class);
return new DefaultKafkaConsumerFactory<>(props);
}
@Bean
public ConcurrentKafkaListenerContainerFactory<String, byte[]> kafkaListenerContainerFactory(
#Qualifier("commonConsumerFactory") ConsumerFactory<String, byte[]> consumerFactory,
#Qualifier("chainedKafkaTM") ChainedKafkaTransactionManager chainedKafkaTM,
#Qualifier("kafkaTemplate") KafkaTemplate<String, byte[]> kafkaTemplate,
#Value("${kafka.concurrency:#{T(java.lang.Runtime).getRuntime().availableProcessors()}}") Integer concurrency
) {
ConcurrentKafkaListenerContainerFactory<String, byte[]> factory = new ConcurrentKafkaListenerContainerFactory<>();
factory.getContainerProperties().setMissingTopicsFatal(false);
factory.getContainerProperties().setTransactionManager(chainedKafkaTM);
factory.setConsumerFactory(consumerFactory);
factory.setBatchListener(true);
var arbp = new DefaultAfterRollbackProcessor<String, byte[]>(new FixedBackOff(1000L, 3));
arbp.setCommitRecovered(true);
arbp.setKafkaTemplate(kafkaTemplate);
factory.setAfterRollbackProcessor(arbp);
factory.setConcurrency(concurrency);
factory.afterPropertiesSet();
return factory;
}
@Bean
public ProducerFactory<String, byte[]> producerFactory(@Value("${kafka.broker}") String bootstrapServer) {
Map<String, Object> configProps = new HashMap<>();
configProps.put(BOOTSTRAP_SERVERS_CONFIG, bootstrapServer);
configProps.put(BATCH_SIZE_CONFIG, 16384);
configProps.put(ENABLE_IDEMPOTENCE_CONFIG, true);
configProps.put(KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
configProps.put(VALUE_SERIALIZER_CLASS_CONFIG, ByteArraySerializer.class);
var kafkaProducerFactory = new DefaultKafkaProducerFactory<String, byte[]>(configProps);
kafkaProducerFactory.setTransactionIdPrefix("kafka-tx-");
return kafkaProducerFactory;
}
@Bean
public KafkaTemplate<String, byte[]> kafkaTemplate(@Qualifier("producerFactory") ProducerFactory<String, byte[]> producerFactory) {
return new KafkaTemplate<>(producerFactory);
}
@Bean
public KafkaTransactionManager kafkaTransactionManager(@Qualifier("producerFactory") ProducerFactory<String, byte[]> producerFactory) {
KafkaTransactionManager ktm = new KafkaTransactionManager<>(producerFactory);
ktm.setTransactionSynchronization(SYNCHRONIZATION_ON_ACTUAL_TRANSACTION);
return ktm;
}
@Bean
public ChainedKafkaTransactionManager chainedKafkaTM(JpaTransactionManager jpaTransactionManager,
KafkaTransactionManager kafkaTransactionManager) {
return new ChainedKafkaTransactionManager(kafkaTransactionManager, jpaTransactionManager);
}
@Bean(name = "transactionManager")
public JpaTransactionManager transactionManager(EntityManagerFactory em) {
return new JpaTransactionManager(em);
}
}
Kafka listener:
@KafkaListener(groupId = "${group.id}", idIsGroup = false, topics = "${topic.name.import}")
public void consume(List<byte[]> records, @Header(KafkaHeaders.OFFSET) Long offset) {
for (byte[] record : records) {
// causes an infinite rollback cycle (perhaps due to the batch listener)
if (true)
throw new RuntimeException("foo");
// Spring Data persistence with @Transactional("chainedKafkaTM"), since Spring Data can't determine the TM among transactionManager, chainedKafkaTM and kafkaTransactionManager
var result = storageService.persist(record);
kafkaTemplate.send(result);
}
}
Spring-kafka version: 2.3.3
Spring-boot version: 2.2.1
What is the proper way to implement such a use case?
The Spring Kafka documentation is limited to small/specific examples.
P.S. When I use @Transactional(transactionManager = "chainedKafkaTM", rollbackFor = Exception.class) on the @KafkaListener method, I face an endless cyclic rollback, even though FixedBackOff(1000L, 3L) is set.
EDIT: I'm aiming for the maximum affordable synchronization between listener, producer and database, with a configurable number of retries.
EDIT: The code snippets above have been edited to reflect the advised configuration. Using the ARBP doesn't solve the infinite rollback cycle for me, since the first statement's predicate is always false (SeekUtils.doSeeks):
DefaultAfterRollbackProcessor
...
@Override
public void process(List<ConsumerRecord<K, V>> records, Consumer<K, V> consumer, Exception exception,
boolean recoverable) {
if (SeekUtils.doSeeks(((List) records), consumer, exception, recoverable,
getSkipPredicate((List) records, exception), LOGGER)
&& isCommitRecovered() && this.kafkaTemplate != null && this.kafkaTemplate.isTransactional()) {
ConsumerRecord<K, V> skipped = records.get(0);
this.kafkaTemplate.sendOffsetsToTransaction(
Collections.singletonMap(new TopicPartition(skipped.topic(), skipped.partition()),
new OffsetAndMetadata(skipped.offset() + 1)));
}
}
It is worth noting that there is no active transaction in the Kafka consumer method (TransactionSynchronizationManager.isActualTransactionActive() returns false).
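For reference, a minimal sketch (not part of the original question) of how to check this from inside the listener, assuming an SLF4J logger named log and the import of org.springframework.transaction.support.TransactionSynchronizationManager:
// logs whether the container started a transaction for this listener invocation
log.info("transaction active: {}, name: {}",
        TransactionSynchronizationManager.isActualTransactionActive(),
        TransactionSynchronizationManager.getCurrentTransactionName());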

What makes you think it's not synchronized? You really don't need @Transactional since the container will start both transactions.
You shouldn't use a SeekToCurrentErrorHandler with transactions because that occurs within the transaction. Configure the after rollback processor instead. The default ARBP uses a FixedBackOff(0L, 9) (10 attempts).
This works fine for me; and stops after 4 delivery attempts:
@SpringBootApplication
public class So58804826Application {
public static void main(String[] args) {
SpringApplication.run(So58804826Application.class, args);
}
@Bean
public JpaTransactionManager transactionManager() {
return new JpaTransactionManager();
}
@Bean
public ChainedKafkaTransactionManager<?, ?> chainedTxM(JpaTransactionManager jpa,
KafkaTransactionManager<?, ?> kafka) {
kafka.setTransactionSynchronization(SYNCHRONIZATION_ON_ACTUAL_TRANSACTION);
return new ChainedKafkaTransactionManager<>(kafka, jpa);
}
@Autowired
private Saver saver;
@KafkaListener(id = "so58804826", topics = "so58804826")
public void listen(String in) {
System.out.println("Storing: " + in);
this.saver.save(in);
}
@Bean
public NewTopic topic() {
return TopicBuilder.name("so58804826")
.partitions(1)
.replicas(1)
.build();
}
@Bean
public ApplicationRunner runner(KafkaTemplate<String, String> template) {
return args -> {
// template.executeInTransaction(t -> t.send("so58804826", "foo"));
};
}
}
@Component
class ContainerFactoryConfigurer {
ContainerFactoryConfigurer(ConcurrentKafkaListenerContainerFactory<?, ?> factory,
ChainedKafkaTransactionManager<?, ?> tm) {
factory.getContainerProperties().setTransactionManager(tm);
factory.setAfterRollbackProcessor(new DefaultAfterRollbackProcessor<>(new FixedBackOff(1000L, 3)));
}
}
@Component
class Saver {
@Autowired
private MyEntityRepo repo;
private final AtomicInteger ids = new AtomicInteger();
@Transactional("chainedTxM")
public void save(String in) {
this.repo.save(new MyEntity(in, this.ids.incrementAndGet()));
throw new RuntimeException("foo");
}
}
I see "Participating in existing transaction" from both TxMs.
and with @Transactional("transactionManager"), I just get it from the JPA TM, as one would expect.
EDIT
There is no concept of "recovery" for a batch listener - the framework has no idea which record in the batch needs to be skipped. In 2.3, we added a new feature for batch listeners when using MANUAL ack modes.
See Committing Offsets.
Starting with version 2.3, the Acknowledgment interface has two additional methods nack(long sleep) and nack(int index, long sleep). The first one is used with a record listener, the second with a batch listener. Calling the wrong method for your listener type will throw an IllegalStateException.
When using a batch listener, you can specify the index within the batch where the failure occurred. When nack() is called, offsets will be committed for records before the index and seeks are performed on the partitions for the failed and discarded records so that they will be redelivered on the next poll(). This is an improvement over the SeekToCurrentBatchErrorHandler, which can only seek the entire batch for redelivery.
However, the failed record will still be replayed indefinitely.
You could keep track of the record that keeps failing and nack index + 1 to skip over it.
However, since your JPA tx has rolled back, this won't work for you.
With batch listeners you must handle problems with batches in your listener code (see the sketch below).
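For illustration only (not from the original answer), here is a minimal sketch of a batch listener that uses Acknowledgment.nack(index, sleep). It assumes a non-transactional container factory configured as a batch listener with MANUAL_IMMEDIATE ack mode; the topic name and processing logic are placeholders:
import java.util.List;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.kafka.support.Acknowledgment;
import org.springframework.stereotype.Component;

@Component
class BatchNackListener {

    @KafkaListener(id = "batchNackExample", topics = "topicA")
    public void listen(List<ConsumerRecord<String, byte[]>> records, Acknowledgment ack) {
        for (int i = 0; i < records.size(); i++) {
            try {
                process(records.get(i)); // placeholder for the per-record work
            }
            catch (Exception ex) {
                // commits offsets for the records before index i and seeks back to i,
                // so the failed record and the rest of the batch are redelivered after 1 second
                ack.nack(i, 1000);
                return;
            }
        }
        ack.acknowledge(); // the whole batch succeeded
    }

    private void process(ConsumerRecord<String, byte[]> record) {
        // hypothetical business logic
    }
}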

Related

Kafka: ErrorHandler for Serializer [duplicate]

I'm setting up an application consuming Kafka messages.
I followed the Spring docs about Deserialization Error Handling in order to catch deserialization exceptions. I've tried the failedDeserializationFunction method.
This is my Consumer Configuration Class
@Bean
public Map<String, Object> consumerConfigs() {
Map<String, Object> consumerProps = new HashMap<>();
consumerProps.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
consumerProps.put(ConsumerConfig.GROUP_ID_CONFIG, groupId);
consumerProps.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, offsetReset);
consumerProps.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, autoCommit);
/* Error Handling */
consumerProps.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, ErrorHandlingDeserializer2.class);
consumerProps.put(ErrorHandlingDeserializer2.VALUE_DESERIALIZER_CLASS, JsonDeserializer.class.getName());
consumerProps.put(ErrorHandlingDeserializer2.VALUE_FUNCTION, FailedNTCMessageBodyProvider.class);
return consumerProps;
}
@Bean
public ConsumerFactory<String, NTCMessageBody> consumerFactory() {
return new DefaultKafkaConsumerFactory<>(consumerConfigs(), new StringDeserializer(),
new JsonDeserializer<>(NTCMessageBody.class));
}
@Bean
public ConcurrentKafkaListenerContainerFactory<String, NTCMessageBody> kafkaListenerContainerFactory() {
ConcurrentKafkaListenerContainerFactory<String, NTCMessageBody> factory = new ConcurrentKafkaListenerContainerFactory<>();
factory.setConsumerFactory(consumerFactory());
return factory;
}
This is the BiFunction Provider
public class FailedNTCMessageBodyProvider implements BiFunction<byte[], Headers, NTCMessageBody> {
@Override
public NTCMessageBody apply(byte[] t, Headers u) {
return new NTCBadMessageBody(t);
}
}
public class NTCBadMessageBody extends NTCMessageBody{
private final byte[] failedDecode;
public NTCBadMessageBody(byte[] failedDecode) {
this.failedDecode = failedDecode;
}
public byte[] getFailedDecode() {
return this.failedDecode;
}
}
When I send just one corrupted message to the topic, I get this error (in a loop):
org.apache.kafka.common.errors.SerializationException: Error deserializing key/value
My understanding was that the ErrorHandlingDeserializer2 should delegate to the NTCBadMessageBody type and continue consumption. I also saw (in debug mode) that it never goes into the constructor of the NTCBadMessageBody class.
Use ErrorHandlingDeserializer.
When a deserializer fails to deserialize a message, Spring has no way to handle the problem because it occurs before the poll() returns. To solve this problem, version 2.2 introduced the ErrorHandlingDeserializer. This deserializer delegates to a real deserializer (key or value). If the delegate fails to deserialize the record content, the ErrorHandlingDeserializer returns a DeserializationException instead, containing the cause and raw bytes. When using a record-level MessageListener, if either the key or value contains a DeserializationException, the container’s ErrorHandler is called with the failed ConsumerRecord. When using a BatchMessageListener, the failed record is passed to the application along with the remaining records in the batch, so it is the responsibility of the application listener to check whether the key or value in a particular record is a DeserializationException.
You can use the DefaultKafkaConsumerFactory constructor that takes key and value Deserializer objects and wire in appropriate ErrorHandlingDeserializer instances configured with the proper delegates. Alternatively, you can use consumer configuration properties, which the ErrorHandlingDeserializer uses to instantiate the delegates. The property names are ErrorHandlingDeserializer.KEY_DESERIALIZER_CLASS and ErrorHandlingDeserializer.VALUE_DESERIALIZER_CLASS; the property value can be a class or class name. The sketch below shows the constructor-based option; the full properties-based configuration follows.
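For illustration (an assumption-based sketch, not part of the original answer), the constructor-based wiring could look roughly like this, reusing the KafkaEvent type and the servers/groupId fields from the configuration below:
@Bean
public ConsumerFactory<String, KafkaEvent> consumerFactory() {
    Map<String, Object> config = new HashMap<>();
    config.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, servers);
    config.put(ConsumerConfig.GROUP_ID_CONFIG, groupId);
    // wrap the real deserializers programmatically instead of using the *_CLASS properties
    return new DefaultKafkaConsumerFactory<>(config,
            new ErrorHandlingDeserializer<>(new StringDeserializer()),
            new ErrorHandlingDeserializer<>(new JsonDeserializer<>(KafkaEvent.class)));
}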
package com.mypackage.app.config;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.TimeoutException;
import com.mypacakage.app.model.kafka.message.KafkaEvent;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.kafka.annotation.EnableKafka;
import org.springframework.kafka.config.ConcurrentKafkaListenerContainerFactory;
import org.springframework.kafka.core.ConsumerFactory;
import org.springframework.kafka.core.DefaultKafkaConsumerFactory;
import org.springframework.kafka.listener.ListenerExecutionFailedException;
import org.springframework.kafka.support.serializer.ErrorHandlingDeserializer;
import org.springframework.kafka.support.serializer.JsonDeserializer;
import org.springframework.retry.policy.SimpleRetryPolicy;
import org.springframework.retry.support.RetryTemplate;
import lombok.extern.slf4j.Slf4j;
@EnableKafka
@Configuration
@Slf4j
public class KafkaConsumerConfig {
#Value("${kafka.bootstrap-servers}")
private String servers;
#Value("${listener.group-id}")
private String groupId;
@Bean
public ConcurrentKafkaListenerContainerFactory<String, KafkaEvent> ListenerFactory() {
ConcurrentKafkaListenerContainerFactory<String, KafkaEvent> factory = new ConcurrentKafkaListenerContainerFactory<>();
factory.setConsumerFactory(consumerFactory());
factory.setRetryTemplate(retryTemplate());
factory.setErrorHandler(((exception, data) -> {
/*
* here you can do your custom handling; I am just logging it, the same as the default
* error handler does. If you just want to log, you need not configure the error
* handler here. The default handler does it for you. Generally, you will
* persist the failed records to a DB for tracking the failed records.
*/
log.error("Error in process with Exception {} and the record is {}", exception, data);
}));
return factory;
}
@Bean
public ConsumerFactory<String, KafkaEvent> consumerFactory() {
Map<String, Object> config = new HashMap<>();
config.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, servers);
config.put(ConsumerConfig.GROUP_ID_CONFIG, groupId);
config.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, ErrorHandlingDeserializer.class);
config.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, ErrorHandlingDeserializer.class);
config.put(ErrorHandlingDeserializer.KEY_DESERIALIZER_CLASS, StringDeserializer.class);
config.put(ErrorHandlingDeserializer.VALUE_DESERIALIZER_CLASS, JsonDeserializer.class.getName());
config.put(JsonDeserializer.VALUE_DEFAULT_TYPE,
"com.mypackage.app.model.kafka.message.KafkaEvent");
config.put(JsonDeserializer.TRUSTED_PACKAGES, "com.mypackage.app");
return new DefaultKafkaConsumerFactory<>(config);
}
private RetryTemplate retryTemplate() {
RetryTemplate retryTemplate = new RetryTemplate();
/*
* here retry policy is used to set the number of attempts to retry and what
* exceptions you wanted to try and what you don't want to retry.
*/
retryTemplate.setRetryPolicy(retryPolicy());
return retryTemplate;
}
private SimpleRetryPolicy retryPolicy() {
Map<Class<? extends Throwable>, Boolean> exceptionMap = new HashMap<>();
// the boolean value in the map determines whether exception should be retried
exceptionMap.put(IllegalArgumentException.class, false);
exceptionMap.put(TimeoutException.class, true);
exceptionMap.put(ListenerExecutionFailedException.class, true);
return new SimpleRetryPolicy(3, exceptionMap, true);
}
}
ErrorHandlingDeserializer
When a deserializer fails to deserialize a message, Spring has no way to handle the problem because it occurs before the poll() returns. To solve this problem, version 2.2 introduced the ErrorHandlingDeserializer. This deserializer delegates to a real deserializer (key or value). If the delegate fails to deserialize the record content, the ErrorHandlingDeserializer returns a DeserializationException instead, containing the cause and raw bytes. When using a record-level MessageListener, if either the key or value contains a DeserializationException, the container’s ErrorHandler is called with the failed ConsumerRecord. When using a BatchMessageListener, the failed record is passed to the application along with the remaining records in the batch, so it is the responsibility of the application listener to check whether the key or value in a particular record is a DeserializationException.
So, according to your code, you are using a record-level MessageListener; just add an ErrorHandler to the container.
Handling Exceptions
If your error handler implements this interface you can, for example, adjust the offsets accordingly. For example, to reset the offset to replay the failed message, you could do something like the following; note however, these are simplistic implementations and you would probably want more checking in the error handler.
@Bean
public ConsumerAwareListenerErrorHandler listen3ErrorHandler() {
return (m, e, c) -> {
this.listen3Exception = e;
MessageHeaders headers = m.getHeaders();
c.seek(new org.apache.kafka.common.TopicPartition(
headers.get(KafkaHeaders.RECEIVED_TOPIC, String.class),
headers.get(KafkaHeaders.RECEIVED_PARTITION_ID, Integer.class)),
headers.get(KafkaHeaders.OFFSET, Long.class));
return null;
};
}
Or you can provide a custom implementation, as in this example:
@Bean
public ConcurrentKafkaListenerContainerFactory<String, GenericRecord>
kafkaListenerContainerFactory() {
ConcurrentKafkaListenerContainerFactory<String, GenericRecord> factory
= new ConcurrentKafkaListenerContainerFactory<>();
factory.setConsumerFactory(consumerFactory());
factory.getContainerProperties().setErrorHandler(new ErrorHandler() {
@Override
public void handle(Exception thrownException, List<ConsumerRecord<?, ?>> records, Consumer<?, ?> consumer, MessageListenerContainer container) {
String s = thrownException.getMessage().split("Error deserializing key/value for partition ")[1].split(". If needed, please seek past the record to continue consumption.")[0];
String topics = s.split("-")[0];
int offset = Integer.valueOf(s.split("offset ")[1]);
int partition = Integer.valueOf(s.split("-")[1].split(" at")[0]);
TopicPartition topicPartition = new TopicPartition(topics, partition);
//log.info("Skipping " + topic + "-" + partition + " offset " + offset);
consumer.seek(topicPartition, offset + 1);
System.out.println("OKKKKK");
}
@Override
public void handle(Exception e, ConsumerRecord<?, ?> consumerRecord) {
}
@Override
public void handle(Exception e, ConsumerRecord<?, ?> consumerRecord, Consumer<?,?> consumer) {
String s = e.getMessage().split("Error deserializing key/value for partition ")[1].split(". If needed, please seek past the record to continue consumption.")[0];
String topics = s.split("-")[0];
int offset = Integer.valueOf(s.split("offset ")[1]);
int partition = Integer.valueOf(s.split("-")[1].split(" at")[0]);
TopicPartition topicPartition = new TopicPartition(topics, partition);
//log.info("Skipping " + topic + "-" + partition + " offset " + offset);
consumer.seek(topicPartition, offset + 1);
System.out.println("OKKKKK");
}
});
return factory;
}
The above answer may have a problem if the topic name contains a character like '-', so I have modified the same logic with a regex.
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import org.apache.kafka.clients.consumer.Consumer;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.errors.SerializationException;
import org.springframework.kafka.listener.ErrorHandler;
import org.springframework.kafka.listener.MessageListenerContainer;
import lombok.extern.slf4j.Slf4j;
@Slf4j
public class KafkaErrHandler implements ErrorHandler {
/**
* Method prevents serialization error freeze
*
* @param e
* @param consumer
*/
private void seekSerializeException(Exception e, Consumer<?, ?> consumer) {
String p = ".*partition (.*) at offset ([0-9]*).*";
Pattern r = Pattern.compile(p);
Matcher m = r.matcher(e.getMessage());
if (m.find()) {
int idx = m.group(1).lastIndexOf("-");
String topics = m.group(1).substring(0, idx);
int partition = Integer.parseInt(m.group(1).substring(idx));
int offset = Integer.parseInt(m.group(2));
TopicPartition topicPartition = new TopicPartition(topics, partition);
consumer.seek(topicPartition, (offset + 1));
log.info("Skipped message with offset {} from partition {}", offset, partition);
}
}
@Override
public void handle(Exception e, ConsumerRecord<?, ?> record, Consumer<?, ?> consumer) {
log.error("Error in process with Exception {} and the record is {}", e, record);
if (e instanceof SerializationException)
seekSerializeException(e, consumer);
}
@Override
public void handle(Exception e, List<ConsumerRecord<?, ?>> records, Consumer<?, ?> consumer,
MessageListenerContainer container) {
log.error("Error in process with Exception {} and the records are {}", e, records);
if (e instanceof SerializationException)
seekSerializeException(e, consumer);
}
@Override
public void handle(Exception e, ConsumerRecord<?, ?> record) {
log.error("Error in process with Exception {} and the record is {}", e, record);
}
}
Finally, use the error handler in the config.
@Bean
public ConcurrentKafkaListenerContainerFactory<String, GenericType> macdStatusListenerFactory() {
ConcurrentKafkaListenerContainerFactory<String, GenericType> factory = new ConcurrentKafkaListenerContainerFactory<>();
factory.setConsumerFactory(macdStatusConsumerFactory());
factory.setRetryTemplate(retryTemplate());
factory.setErrorHandler(new KafkaErrHandler());
return factory;
}
However, parsing the error string to get the partition, topic and offset is not recommended. If anyone has a better solution, please post it here.
In my factory I've added a CommonErrorHandler:
factory.setCommonErrorHandler(new KafkaMessageErrorHandler());
and KafkaMessageErrorHandler is created as follows:
class KafkaMessageErrorHandler implements CommonErrorHandler {
@Override
public void handleRecord(Exception thrownException, ConsumerRecord<?, ?> record, Consumer<?, ?> consumer, MessageListenerContainer container) {
manageException(thrownException, consumer);
}
@Override
public void handleOtherException(Exception thrownException, Consumer<?, ?> consumer, MessageListenerContainer container, boolean batchListener) {
manageException(thrownException, consumer);
}
private void manageException(Exception ex, Consumer<?, ?> consumer) {
log.error("Error polling message: " + ex.getMessage());
if (ex instanceof RecordDeserializationException) {
RecordDeserializationException rde = (RecordDeserializationException) ex;
consumer.seek(rde.topicPartition(), rde.offset() + 1L);
consumer.commitSync();
} else {
log.error("Exception not handled");
}
}
}

Control is not going inside the @KafkaListener method and no message is printed in the Java console

I have created a messaging component which will be called by other services for consuming and sending messages from Kafka. The producer part is working fine, but I am not sure what is wrong with the consumer listener part below: it does not print messages, and in debug mode control never goes inside the @KafkaListener method. However, the GUI-based Kafka manager app shows that the offset got committed, even though it is a manual offset commit.
Here is my message listener class code. I have checked that the topic and group id are set and fetched properly.
@Component
public class SpringKafkaMessageListner {
public CountDownLatch latch = new CountDownLatch(1);
@KafkaListener(topics = "#{consumerFactory.getConfigurationProperties().get(\"topic-name\")}",
groupId = "#{consumerFactory.getConfigurationProperties().get(\"group.id\")}",
containerFactory = "springKafkaListenerContainerFactory")
public void listen(ConsumerRecord<?, ?> consumerRecord, Acknowledgment ack) {
System.out.println("listening...");
System.out.println("Received Message in group : "
+ " and message: " + consumerRecord.value());
System.out.println("current offsetId : " + consumerRecord.offset());
ack.acknowledge();
latch.countDown();
}
}
Consumer config class-
@Configuration
@EnableKafka
public class KafkaConsumerBeanConfig<T> {
@Autowired
@Lazy
private KafkaConsumerConfigDTO kafkaConsumerConfigDTO;
@Bean
public ConsumerFactory<Object, T> consumerFactory() {
return new DefaultKafkaConsumerFactory<>(kafkaConsumerConfigDTO.getConfigs());
}
//for spring kafka with manual offset commit
@Bean
public KafkaListenerContainerFactory<ConcurrentMessageListenerContainer<Object, T>>
springKafkaListenerContainerFactory() {
ConcurrentKafkaListenerContainerFactory<Object, T> factory =
new ConcurrentKafkaListenerContainerFactory<>();
factory.setConsumerFactory(consumerFactory());
//manual commit
factory.getContainerProperties().setAckMode(ContainerProperties.AckMode.MANUAL_IMMEDIATE);
return factory;
}
@Bean
SpringKafkaMessageListner consumerListner(){
return new SpringKafkaMessageListner();
}
}
The code snippet below is the consumer interface implementation, which exposes the subscribe() method; all other bean creation is done through the ConfigurableApplicationContext.
public class SpringKafkaConsumer<T> implements Consumer<T> {
public SpringKafkaConsumer(ConsumerConfig<T> consumerConfig,
ConfigurableApplicationContext context) {
this.consumerConfig = consumerConfig;
this.context = context;
this.consumerFactory = context.getBean("consumerFactory", ConsumerFactory.class);
this.springKafkaContainer = context.getBean("springKafkaListenerContainerFactory",
ConcurrentKafkaListenerContainerFactory.class);
}
// here it is just simple code to initialize the SpringKafkaMessageListner class and invoke the listening part
@Override
public void subscribe() {
consumerListner = context.getBean("consumerListner", SpringKafkaMessageListner.class);
try {
consumerListner.latch.await(30, TimeUnit.SECONDS);
} catch (InterruptedException e) {
e.printStackTrace();
}
}
}
Test class with my local Docker Kafka setup:
@RunWith(SpringRunner.class)
@DirtiesContext
@ContextConfiguration(classes = QueueManagerSpringConfig.class)
public class SpringKafkaTest extends AbstractJUnit4SpringContextTests {
@Autowired
private QueueManager queueManager;
private Consumer<KafkaMessage> consumer;
// test method
@Test
public void testSubscribeWithLocalBroker() {
String topicName = "topic1";
String brokerServer = "127.0.0.1:9092";
String groupId = "grp1";
Map<String, String> additionalProp = new HashMap<>();
additionalProp.put(KafkaConsumerConfig.GROUP_ID, groupId);
additionalProp.put(KafkaConsumerConfig.AUTO_COMMIT, "false");
additionalProp.put(KafkaConsumerConfig.AUTO_COMMIT_INTERVAL, "100");
ConsumerConfig<KafkaMessage> consumerConfig =
new ConsumerConfig.Builder<>(topicName, new KafkaSuccessMessageHandler(new
KafkaMessageSerializerTest()),
new KafkaMessageDeserializerTest())
.additionalProperties(additionalProp)
.enableSpringKafka(true)
.offsetPositionStrategy(new EarliestPositionStrategy())
.build();
consumer = queueManager.getConsumer(consumerConfig);
System.out.println("start subscriber");
// calling the subscribe method of the consumer, which will invoke the Kafka listener
consumer.subscribe();
}
@Configuration
public class QueueManagerSpringConfig {
@Bean
public QueueManager queueManager() {
Map<String, String> kafkaProperties = new HashMap<>();
kafkaProperties.put(KafkaPropertyNamespace.NS_PREFIX +
KafkaPropertyNamespace.BOOTSTRAP_SERVERS,
"127.0.0.1:9092");
return QueueManagerFactory.getInstance(new KafkaPropertyNamespace(kafkaProperties));
}
}

Better way of error handling in Kafka Consumer

I have a Spring Boot app configured with spring-kafka where I want to handle all sorts of errors that can happen while listening to a topic. If any message is missed / cannot be consumed because of either a deserialization or any other exception, there should be 2 retries, after which the message should be logged to an error file. I have two approaches that can be followed:
First approach (using SeekToCurrentErrorHandler with DeadLetterPublishingRecoverer):
@Autowired
KafkaTemplate<String,Object> template;
@Bean(name = "kafkaSourceProvider")
public ConcurrentKafkaListenerContainerFactory<K, V> consumerFactory() {
Map<String, Object> config = appProperties.getSource()
.getProperties();
ConcurrentKafkaListenerContainerFactory<K, V> factory =
new ConcurrentKafkaListenerContainerFactory<>();
factory.setConsumerFactory(new DefaultKafkaConsumerFactory<>(config));
DeadLetterPublishingRecoverer recoverer = new DeadLetterPublishingRecoverer(template,
(r, e) -> {
if (e instanceof FooException) {
return new TopicPartition(r.topic() + ".DLT", r.partition());
}
// the destination resolver must return a TopicPartition for every record, not only for FooException
return new TopicPartition(r.topic() + ".DLT", r.partition());
});
ErrorHandler errorHandler = new SeekToCurrentErrorHandler(recoverer, new FixedBackOff(0L, 2L));
factory.setErrorHandler(errorHandler);
return factory;
}
But for this we require an additional topic (a new .DLT topic), and then we can log it to a file.
@Bean
public KafkaAdmin admin() {
Map<String, Object> configs = new HashMap<>();
configs.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG,
StringUtils.arrayToCommaDelimitedString(kafkaEmbedded().getBrokerAddresses()));
return new KafkaAdmin(configs);
}
@KafkaListener(topics = MY_TOPIC + ".DLT", groupId = MY_ID)
public void listenDlt(ConsumerRecord<String, SomeClassName> consumerRecord,
@Header(KafkaHeaders.DLT_EXCEPTION_STACKTRACE) String exceptionStackTrace) {
logger.error(exceptionStackTrace);
}
Second approach (using a custom SeekToCurrentErrorHandler):
@Bean
public ConcurrentKafkaListenerContainerFactory<K, V> consumerFactory() {
Map<String, Object> config = appProperties.getSource()
.getProperties();
ConcurrentKafkaListenerContainerFactory<K, V> factory = new ConcurrentKafkaListenerContainerFactory<>();
factory.setConsumerFactory(new DefaultKafkaConsumerFactory<>(config));
factory.setErrorHandler(new CustomSeekToCurrentErrorHandler());
factory.setRetryTemplate(retryTemplate());
return factory;
}
private RetryTemplate retryTemplate() {
RetryTemplate retryTemplate = new RetryTemplate();
retryTemplate.setBackOffPolicy(backOffPolicy());
retryTemplate.setRetryPolicy(aSimpleReturnPolicy);
return retryTemplate;
}
public class CustomSeekToCurrentErrorHandler extends SeekToCurrentErrorHandler {
private static final int MAX_RETRY_ATTEMPTS = 2;
CustomSeekToCurrentErrorHandler() {
super(MAX_RETRY_ATTEMPTS);
}
@Override
public void handle(Exception exception, List<ConsumerRecord<?, ?>> records, Consumer<?, ?> consumer, MessageListenerContainer container) {
try {
if (!records.isEmpty()) {
log.warn("Exception: {} occurred with message: {}", exception, exception.getMessage());
super.handle(exception, records, consumer, container);
}
} catch (SerializationException e) {
log.warn("Exception: {} occurred with message: {}", e, e.getMessage());
}
}
}
Can anyone provide suggestions on the standard way to implement this kind of feature? In the first approach we see the overhead of creating .DLT topics and an additional @KafkaListener. In the second approach, we can directly log our consumer record exception.
With the first approach, it is not necessary to use a DeadLetterPublishingRecoverer, you can use any ConsumerRecordRecoverer that you want; in fact the default recoverer simply logs the failed message.
/**
* Construct an instance with the default recoverer which simply logs the record after
* the backOff returns STOP for a topic/partition/offset.
* @param backOff the {@link BackOff}.
* @since 2.3
*/
public SeekToCurrentErrorHandler(BackOff backOff) {
this(null, backOff);
}
And, in the FailedRecordTracker...
if (recoverer == null) {
this.recoverer = (rec, thr) -> {
...
logger.error(thr, "Backoff "
+ (failedRecord == null
? "none"
: failedRecord.getBackOffExecution())
+ " exhausted for " + ListenerUtils.recordToString(rec));
};
}
Backoff (and a limit to retries) was added to the error handler after adding retry in the listener adapter, so it's "newer" (and preferred).
Also, using in-memory retry can cause issues with rebalancing if long BackOffs are employed.
Finally, only the SeekToCurrentErrorHandler can deal with deserialization problems (via the ErrorHandlingDeserializer).
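Putting this together for the question's requirement (2 retries, then log the failed record), a minimal sketch might look like the following; the logger (an assumed SLF4J log field) and the FixedBackOff values are assumptions, not part of the original answer (SeekToCurrentErrorHandler is from org.springframework.kafka.listener, FixedBackOff from org.springframework.util.backoff):
@Bean
public SeekToCurrentErrorHandler loggingErrorHandler() {
    // the recoverer runs once the back-off is exhausted (1 initial attempt + 2 retries)
    return new SeekToCurrentErrorHandler(
            (record, exception) -> log.error("Retries exhausted for {}", record, exception),
            new FixedBackOff(0L, 2L));
}
The handler is then registered with factory.setErrorHandler(loggingErrorHandler()), replacing the RetryTemplate-based approach from the question.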
EDIT
Use the ErrorHandlingDeserializer together with a SeekToCurrentErrorHandler. Deserialization exceptions are considered fatal and the recoverer is called immediately.
See the documentation.
Here is a simple Spring Boot application that demonstrates it:
@SpringBootApplication
public class So63236346Application {
private static final Logger log = LoggerFactory.getLogger(So63236346Application.class);
public static void main(String[] args) {
SpringApplication.run(So63236346Application.class, args);
}
@Bean
public NewTopic topic() {
return TopicBuilder.name("so63236346").partitions(1).replicas(1).build();
}
@Bean
ErrorHandler errorHandler() {
return new SeekToCurrentErrorHandler((rec, ex) -> log.error(ListenerUtils.recordToString(rec, true) + "\n"
+ ex.getMessage()));
}
@KafkaListener(id = "so63236346", topics = "so63236346")
public void listen(String in) {
System.out.println(in);
}
@Bean
public ApplicationRunner runner(KafkaTemplate<String, String> template) {
return args -> {
template.send("so63236346", "{\"field\":\"value1\"}");
template.send("so63236346", "junk");
template.send("so63236346", "{\"field\":\"value2\"}");
};
}
}
package com.example.demo;
public class Thing {
private String field;
public Thing() {
}
public Thing(String field) {
this.field = field;
}
public String getField() {
return this.field;
}
public void setField(String field) {
this.field = field;
}
@Override
public String toString() {
return "Thing [field=" + this.field + "]";
}
}
spring.kafka.consumer.auto-offset-reset=earliest
spring.kafka.consumer.value-deserializer=org.springframework.kafka.support.serializer.ErrorHandlingDeserializer
spring.kafka.consumer.properties.spring.deserializer.value.delegate.class=org.springframework.kafka.support.serializer.JsonDeserializer
spring.kafka.consumer.properties.spring.json.value.default.type=com.example.demo.Thing
Result
Thing [field=value1]
2020-08-10 14:30:14.780 ERROR 78857 --- [o63236346-0-C-1] com.example.demo.So63236346Application : so63236346-0#7
Listener failed; nested exception is org.springframework.kafka.support.serializer.DeserializationException: failed to deserialize; nested exception is org.apache.kafka.common.errors.SerializationException: Can't deserialize data [[106, 117, 110, 107]] from topic [so63236346]
2020-08-10 14:30:14.782 INFO 78857 --- [o63236346-0-C-1] o.a.k.clients.consumer.KafkaConsumer : [Consumer clientId=consumer-so63236346-1, groupId=so63236346] Seeking to offset 8 for partition so63236346-0
Thing [field=value2]
The expectation was to log any exception that we might get at the container level as well as the listener level.
Without retrying, the following is the way I have done error handling:
If we encounter any exception at the container level, we should be able to log the message payload with the error description, seek past that offset to skip it, and go on to receive the next offset. Although this is done here only for DeserializationException, the remaining exceptions also need to be seeked past and their offsets skipped.
@Component
public class KafkaContainerErrorHandler implements ErrorHandler {
private static final Logger logger = LoggerFactory.getLogger(KafkaContainerErrorHandler.class);
@Override
public void handle(Exception thrownException, List<ConsumerRecord<?, ?>> records, Consumer<?, ?> consumer, MessageListenerContainer container) {
String s = thrownException.getMessage().split("Error deserializing key/value for partition ")[1].split(". If needed, please seek past the record to continue consumption.")[0];
// modify below logic according to your topic nomenclature
String topics = s.substring(0, s.lastIndexOf('-'));
int offset = Integer.parseInt(s.split("offset ")[1]);
int partition = Integer.parseInt(s.substring(s.lastIndexOf('-') + 1).split(" at")[0]);
logger.error("...");
TopicPartition topicPartition = new TopicPartition(topics, partition);
logger.info("Skipping {} - {} offset {}", topics, partition, offset);
consumer.seek(topicPartition, offset + 1);
}
@Override
public void handle(Exception e, ConsumerRecord<?, ?> consumerRecord) {
}
}
factory.setErrorHandler(kafkaContainerErrorHandler);
If we get any exception at the @KafkaListener level, then I configure my listener with my custom error handler and log the exception with the message, as can be seen below:
#Bean("customErrorHandler")
public KafkaListenerErrorHandler listenerErrorHandler() {
return (m, e) -> {
logger.error(...);
return m;
};
}
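For completeness, a sketch of how a listener references such a handler by bean name via the errorHandler attribute of @KafkaListener; the topic name and the process() call are placeholders:
@KafkaListener(topics = "my-topic", errorHandler = "customErrorHandler")
public void listen(ConsumerRecord<String, String> record) {
    // exceptions thrown here are routed to the customErrorHandler bean
    process(record); // hypothetical business logic
}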

Spring RabbitMQ - start the Spring Boot server even when there is no connection to the broker

I am using Spring Boot with RabbitMQ. Everything is working: processing messages works, and after losing the connection it automatically tries to reconnect. However,
I have only one problem:
When the Rabbit server is switched off (no possibility to establish a connection) and I try to launch the Spring Boot server, it can't start. I can't check right now (no access to the machine) what the exact content of the exception is, but it was about a problem with the instantiation of beans. Can you help me?
@Configuration
public class RabbitConfig{
private String queueName = "myQueue";
private String echangeName = "myExchange";
@Bean
public FanoutExchange exchange(RabbitAdmin rabbitAdmin) {
FanoutExchange exch = new
FanoutExchange(echangeName);
rabbitAdmin.declareExchange(exch);
return exch;
}
@Bean
public Queue queue(FanoutExchange exchange, RabbitAdmin rabbitAdmin) {
HashMap<String, Object> args = new HashMap<String, Object>();
args.put("x-message-ttl", 20);
args.put("x-dead-letter-exchange", "dlx_exchange_name");
Queue queue = new Queue(queueName, true, false, false, args);
rabbitAdmin.declareQueue(queue);
rabbitAdmin.declareBinding(BindingBuilder.bind(queue).to(exchange));
return queue;
}
}
Edit
I must add an edit, because I was not aware that this is important here.
In my case the last argument is not null, it is a HashMap (this is important for me). I edited my code above.
Moreover, I don't understand your answer exactly. Could you be more precise?
To make sure that I was sufficiently clear: I would like to take advantage of automatic reconnection (this is currently working). Additionally, if the Rabbit broker is shut down while the Spring Boot server is starting, the application should still start and cyclically try to reconnect (at the moment the application doesn't start).
Edit2
@Configuration
public class RabbitConfig{
private String queueName = "myQueue";
private String echangeName = "myExchange";
@Bean
public FanoutExchange exchange(RabbitAdmin rabbitAdmin) {
FanoutExchange exch = new
FanoutExchange(echangeName);
//rabbitAdmin.declareExchange(exch);
return exch;
}
@Bean
public Queue queue(FanoutExchange exchange, RabbitAdmin rabbitAdmin) {
HashMap<String, Object> args = new HashMap<String, Object>();
args.put("x-message-ttl", 20);
args.put("x-dead-letter-exchange", "dlx_exchange_name");
Queue queue = new Queue(queueName, true, false, false, args);
//rabbitAdmin.declareQueue(queue);
//rabbitAdmin.declareBinding(BindingBuilder.bind(queue).to(exchange));
return queue;
}
// EDIT 3: now we have to create the Binding bean ourselves
@Autowired
Queue queue; // inject bean by name
@Autowired
FanoutExchange exchange;
@Bean
public Binding binding() {
return BindingBuilder.bind(queue).to(exchange);
}
}
That's correct. You are trying to register broker entities manually:
rabbitAdmin.declareExchange(exch);
...
rabbitAdmin.declareQueue(queue);
rabbitAdmin.declareBinding(BindingBuilder.bind(queue).to(exchange));
You should rely here on the built-in auto-declaration mechanism in the Framework.
In other words: you're good to declare those beans (including the Binding, BTW), but you must not call the rabbitAdmin.declare*() methods at all - at least not from the bean definition phase. A sketch of that style follows.
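For illustration, a minimal sketch of the declaration-only style, reusing the names and arguments from the question. The RabbitAdmin (auto-configured by Spring Boot, or declared as a bean otherwise) auto-declares these entities when a connection becomes available, so the application can start even while the broker is down:
import java.util.HashMap;
import java.util.Map;

import org.springframework.amqp.core.Binding;
import org.springframework.amqp.core.BindingBuilder;
import org.springframework.amqp.core.FanoutExchange;
import org.springframework.amqp.core.Queue;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class RabbitConfig {

    private final String queueName = "myQueue";
    private final String exchangeName = "myExchange";

    @Bean
    public FanoutExchange exchange() {
        return new FanoutExchange(exchangeName);
    }

    @Bean
    public Queue queue() {
        Map<String, Object> args = new HashMap<>();
        args.put("x-message-ttl", 20);
        args.put("x-dead-letter-exchange", "dlx_exchange_name");
        return new Queue(queueName, true, false, false, args);
    }

    @Bean
    public Binding binding(Queue queue, FanoutExchange exchange) {
        // no rabbitAdmin.declare*() calls here; declaration happens on the first successful connection
        return BindingBuilder.bind(queue).to(exchange);
    }
}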

Spring Batch: job instances run sequentially when using annotations

I have a simple annotation-based configuration for a Spring Batch job, as follows:
@Configuration
@EnableBatchProcessing
public abstract class AbstractFileLoader<T> {
private static final String FILE_PATTERN = "*.dat";
@Bean
@StepScope
@Value("#{stepExecutionContext['fileName']}")
public FlatFileItemReader<T> reader(String file) {
FlatFileItemReader<T> reader = new FlatFileItemReader<T>();
String path = file.substring(file.indexOf(":") + 1, file.length());
FileSystemResource resource = new FileSystemResource(path);
reader.setResource(resource);
DefaultLineMapper<T> lineMapper = new DefaultLineMapper<T>();
lineMapper.setFieldSetMapper(getFieldSetMapper());
DelimitedLineTokenizer tokenizer = new DelimitedLineTokenizer(",");
tokenizer.setNames(getColumnNames());
lineMapper.setLineTokenizer(tokenizer);
reader.setLineMapper(lineMapper);
reader.setLinesToSkip(1);
return reader;
}
@Bean
public ItemProcessor<T, T> processor() {
// TODO add transformations here
return null;
}
//Exception when using JobScope for the writer
@Bean
public ItemWriter<T> writer() {
ListItemWriter<T> writer = new ListItemWriter<T>();
return writer;
}
@Bean
public Job loaderJob(JobBuilderFactory jobs, Step s1,
JobExecutionListener listener) {
return jobs.get(getLoaderName()).incrementer(new RunIdIncrementer())
.listener(listener).start(s1).build();
}
@Bean
public Step readStep(StepBuilderFactory stepBuilderFactory,
ItemReader<T> reader, ItemWriter<T> writer,
ItemProcessor<T, T> processor, TaskExecutor taskExecutor,
ResourcePatternResolver resolver) {
final Step readerStep = stepBuilderFactory
.get(getLoaderName() + " ReadStep:slave").<T, T> chunk(25254)
.reader(reader).processor(processor).writer(writer)
.taskExecutor(taskExecutor).throttleLimit(16).build();
final Step partitionedStep = stepBuilderFactory
.get(getLoaderName() + " ReadStep:master")
.partitioner(readerStep)
.partitioner(getLoaderName() + " ReadStep:slave",
partitioner(resolver)).taskExecutor(taskExecutor)
.build();
return partitionedStep;
}
@Bean
public TaskExecutor taskExecutor() {
return new SimpleAsyncTaskExecutor();
}
@Bean
public Partitioner partitioner(
ResourcePatternResolver resourcePatternResolver) {
MultiResourcePartitioner partitioner = new MultiResourcePartitioner();
Resource[] resources;
try {
resources = resourcePatternResolver.getResources("file:"
+ getFilesPath() + FILE_PATTERN);
} catch (IOException e) {
throw new RuntimeException(
"I/O problems when resolving the input file pattern.", e);
}
partitioner.setResources(resources);
return partitioner;
}
@Bean
public JobExecutionListener listener(ItemWriter<T> writer) {
/* org.springframework.batch.core.scope.StepScope scope; */
return new JobCompletionNotificationListener<T>(writer);
}
public abstract FieldSetMapper<T> getFieldSetMapper();
public abstract String getFilesPath();
public abstract String getLoaderName();
public abstract String[] getColumnNames();
}
When I run the same job with two different sets of job parameters, both instances run sequentially instead of in parallel. I have a SimpleAsyncTaskExecutor bean configured, which I assumed would cause the jobs to be triggered asynchronously.
Do I need to add any more configuration to this class to have the job instances execute in parallel?
You have to configure the jobLauncher that you're using to launch jobs to use your TaskExecutor (or a separate pool). The simplest way is to override the bean:
@Bean
public JobLauncher jobLauncher(JobRepository jobRepository) {
SimpleJobLauncher jobLauncher = new SimpleJobLauncher();
jobLauncher.setTaskExecutor(taskExecutor());
jobLauncher.setJobRepository(jobRepository);
return jobLauncher;
}
Don't be confused by the warning that will be logged saying that a synchronous task executor will be used. This is due to an extra instance that is created because of the rather awkward way Spring Batch configures the beans it provides in SimpleBatchConfiguration (long story short, if you want to get rid of the warning you'll need to provide a BatchConfigurer bean and specify how 4 other beans are to be created, even if you want to change just one).
Note that it being the same job is irrelevant here. The problem is that by default the job launcher will launch the job on the same thread.
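For illustration (an assumption-based sketch, not part of the original answer), once the launcher is asynchronous, two instances of the same job started with different parameters run concurrently; the parameter key and file names are hypothetical:
import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobParameters;
import org.springframework.batch.core.JobParametersBuilder;
import org.springframework.batch.core.launch.JobLauncher;

// both run() calls return immediately because the launcher uses the async TaskExecutor,
// so the two job instances execute in parallel
public void launchBoth(JobLauncher jobLauncher, Job loaderJob) throws Exception {
    JobParameters firstRun = new JobParametersBuilder()
            .addString("inputFile", "file1.dat")
            .toJobParameters();
    JobParameters secondRun = new JobParametersBuilder()
            .addString("inputFile", "file2.dat")
            .toJobParameters();
    jobLauncher.run(loaderJob, firstRun);
    jobLauncher.run(loaderJob, secondRun);
}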
