Seek method behavior in spring kafka consumer 1.2.x - java

I do not want to commit offsets for those messages for which processing fails and I want them to be re-delivered again for processing. I am using spring-kafka 1.2.x and implemented ConsumerSeekAware in my listener.
#Component
public class Listener implements ConsumerSeekAware {
private static Logger logger = LoggerFactory.getLogger(Listener.class);
private final ThreadLocal<ConsumerSeekCallback> seekCallBack = new ThreadLocal<>();
#KafkaListener(topics = "my-topic", containerFactory = "kafkaManualAckListenerContainerFactory")
public void listen1(ConsumerRecord<String, String> consumerRecord) throws MyCustomException {
logger.info("received: key - " + consumerRecord.key() + " value - " + consumerRecord.value());
// Below code is just to show the issue.Not acknowledging so I can get the same msg again.
boolean should_commit = false;
if(should_commit) {
ack.acknowledge();
}
else {
this.seekCallBack.get().seek(consumerRecord.topic(), consumerRecord.partition() , consumerRecord.offset());
}
}
#Override
public void registerSeekCallback(ConsumerSeekCallback callback) {
logger.info("registerSeekCallback called..");
this.seekCallBack.set(callback);
}
#Override
public void onPartitionsAssigned(Map<TopicPartition, Long> assignments, ConsumerSeekCallback callback) {
logger.info("onPartitionsAssigned called..");
}
#Override
public void onIdleContainer(Map<TopicPartition, Long> assignments, ConsumerSeekCallback callback) {
logger.info("onIdleContainer called..");
}
}
#########Contaianer config (auto.commit is false in consumer)
factory.getContainerProperties().setAckOnError(false);
factory.getContainerProperties().setAckMode(AbstractMessageListenerContainer.AckMode.MANUAL_IMMEDIATE);
The problem I am facing is if I have 10 messages in different partitions for a topic so I am getting all of them one by one and after getting all the messages I keep on getting the last message for any partition. I also tried SeekToCurrentErrorHandler which is implemented in version 2.0.x and that works perfectly. but I can not upgrade my kafka version. If I restart the container I get all the messages again which is fine but I don't want to stop the container when processing of a message fails.
So my question is it even possible to get the same (Exactly same without any need of stopping the container) behavior same as SeekToCurrentErrorHandler in spring-kafka 1.2.x ?

Related

Pause message consumption from main kafka stream and start from other kafka topic

I am using #StreamListener (spring cloud stream) to consume messages from a topic (input-channel), do some processing and save into some cache or database.
My requirement is, if DB goes down while processing the consumed message, I want to pause the main consumer(input-channel), and start consuming from another TOPIC (INPUT56-CHANNEL), and as soon as It consume all the message (doesn't have many) from INPUT56-CHANNEL, I want to resume the main consumer (input-channel) again.
Can that be achieved??
#StreamListener is deprecated; you should convert to the functional programming model instead.
Here is an example using that model (but the same techniques apply to the deprecated listeners).
spring.cloud.function.definition=input1;input2
spring.cloud.stream.bindings.input1-in-0.group=grp1
spring.cloud.stream.bindings.input2-in-0.consumer.auto-startup=false
spring.cloud.stream.bindings.input2-in-0.group=grp2
spring.cloud.stream.kafka.bindings.input2-in-0.consumer.idle-event-interval=5000
#SpringBootApplication
public class So69726610Application {
public static void main(String[] args) {
SpringApplication.run(So69726610Application.class, args);
}
boolean dbIsDown = true;
#Autowired
BindingsLifecycleController controller;
TaskExecutor exec = new SimpleAsyncTaskExecutor();
#Bean
public Consumer<String> input1() {
return str -> {
System.out.println(str);
if (this.dbIsDown) {
this.controller.changeState("input1-in-0", State.PAUSED);
this.controller.changeState("input2-in-0", State.STARTED);
throw new RuntimeException("Paused");
}
};
}
#Bean
public Consumer<String> input2() {
return System.out::println;
}
#EventListener
public void idle(ListenerContainerIdleEvent event) {
System.out.println(event);
// assumes concurrency = 1 (default)
if (event.getListenerId().contains("input2-in-0")) {
this.controller.changeState("input1-in-0", State.RESUMED);
this.exec.execute(() -> this.controller.changeState("input2-in-0", State.STOPPED));
}
}
}

Retry max 3 times when consuming batches in Spring Cloud Stream Kafka Binder

I am consuming batches in kafka, where retry is not supported in spring cloud stream kafka binder with batch mode, there is an option given that You can configure a SeekToCurrentBatchErrorHandler (using a ListenerContainerCustomizer) to achieve similar functionality to retry in the binder.
I tried the same, but with SeekToCurrentBatchErrorHandler, but it's retrying more than the time set which is 3 times.
How can I do that?
I would like to retry the whole batch.
How can I send the whole batch to dlq topic? like for record listener I used to match deliveryAttempt(retry) to 3 then send to DLQ topic, check in listener.
I have checked this link, which is exactly my issue but an example would be great help, with this library spring-cloud-stream-kafka-binder, can I achieve that. Please explain with an example, I am new to this.
Currently I have below code.
#Configuration
public class ConsumerConfig {
#Bean
public ListenerContainerCustomizer<AbstractMessageListenerContainer<?, ?>> customizer() {
return (container, dest, group) -> {
container.getContainerProperties().setAckOnError(false);
SeekToCurrentBatchErrorHandler seekToCurrentBatchErrorHandler
= new SeekToCurrentBatchErrorHandler();
seekToCurrentBatchErrorHandler.setBackOff(new FixedBackOff(0L, 2L));
container.setBatchErrorHandler(seekToCurrentBatchErrorHandler);
//container.setBatchErrorHandler(new BatchLoggingErrorHandler());
};
}
}
Listerner:
#StreamListener(ActivityChannel.INPUT_CHANNEL)
public void handleActivity(List<Message<Event>> messages,
#Header(name = KafkaHeaders.ACKNOWLEDGMENT) Acknowledgment
acknowledgment,
#Header(name = "deliveryAttempt", defaultValue = "1") int
deliveryAttempt) {
try {
log.info("Received activity message with message length {}", messages.size());
nodeConfigActivityBatchProcessor.processNodeConfigActivity(messages);
acknowledgment.acknowledge();
log.debug("Processed activity message {} successfully!!", messages.size());
} catch (MessagePublishException e) {
if (deliveryAttempt == 3) {
log.error(
String.format("Exception occurred, sending the message=%s to DLQ due to: ",
"message"),
e);
publisher.publishToDlq(EventType.UPDATE_FAILED, "message", e.getMessage());
} else {
throw e;
}
}
}
After seeing #Gary's response added the ListenerContainerCustomizer #Bean with RetryingBatchErrorHandler, but not able to import the class. attaching screenshots.
not able to import RetryingBatchErrorHandler
my spring cloud dependencies
Use a RetryingBatchErrorHandler to send the whole batch to the DLT
https://docs.spring.io/spring-kafka/docs/current/reference/html/#retrying-batch-eh
Use a RecoveringBatchErrorHandler where you can throw a BatchListenerFailedException to tell it which record in the batch failed.
https://docs.spring.io/spring-kafka/docs/current/reference/html/#recovering-batch-eh
In both cases provide a DeadLetterPublishingRecoverer to the error handler; disable DLTs in the binder.
EDIT
Here's an example; it uses the newer functional style rather than the deprecated #StreamListener, but the same concepts apply (but you should consider moving to the functional style).
#SpringBootApplication
public class So69175145Application {
public static void main(String[] args) {
SpringApplication.run(So69175145Application.class, args);
}
#Bean
ListenerContainerCustomizer<AbstractMessageListenerContainer<?, ?>> customizer(
KafkaTemplate<byte[], byte[]> template) {
return (container, dest, group) -> {
container.setBatchErrorHandler(new RetryingBatchErrorHandler(new FixedBackOff(5000L, 2L),
new DeadLetterPublishingRecoverer(template,
(rec, ex) -> new TopicPartition("errors." + dest + "." + group, rec.partition()))));
};
}
/*
* DLT topic won't be auto-provisioned since enableDlq is false
*/
#Bean
public NewTopic topic() {
return TopicBuilder.name("errors.so69175145.grp").partitions(1).replicas(1).build();
}
/*
* Functional equivalent of #StreamListener
*/
#Bean
public Consumer<List<String>> input() {
return list -> {
System.out.println(list);
throw new RuntimeException("test");
};
}
/*
* Not needed here - just to show we sent them to the DLT
*/
#KafkaListener(id = "so69175145", topics = "errors.so69175145.grp")
public void listen(String in) {
System.out.println("From DLT: " + in);
}
}
spring.cloud.stream.bindings.input-in-0.destination=so69175145
spring.cloud.stream.bindings.input-in-0.group=grp
spring.cloud.stream.bindings.input-in-0.content-type=text/plain
spring.cloud.stream.bindings.input-in-0.consumer.batch-mode=true
# for DLT listener
spring.kafka.consumer.auto-offset-reset=earliest
[foo]
2021-09-14 09:55:32.838ERROR...
...
[foo]
2021-09-14 09:55:37.873ERROR...
...
[foo]
2021-09-14 09:55:42.886ERROR...
...
From DLT: foo

Spring Boot: Kafka health indicator

I have something like below which works well, but I would prefer checking health without sending any message, (not only checking socket connection). I know Kafka has something like KafkaHealthIndicator out of the box, does someone have experience or example using it ?
public class KafkaHealthIndicator implements HealthIndicator {
private final Logger log = LoggerFactory.getLogger(KafkaHealthIndicator.class);
private KafkaTemplate<String, String> kafka;
public KafkaHealthIndicator(KafkaTemplate<String, String> kafka) {
this.kafka = kafka;
}
#Override
public Health health() {
try {
kafka.send("kafka-health-indicator", "❥").get(100, TimeUnit.MILLISECONDS);
} catch (InterruptedException | ExecutionException | TimeoutException e) {
return Health.down(e).build();
}
return Health.up().build();
}
}
In order to trip health indicator, retrieve data from one of the future objects otherwise indicator is UP even when Kafka is down!!!
When Kafka is not connected future.get() throws an exception which in turn set this indicator down.
#Configuration
public class KafkaConfig {
#Autowired
private KafkaAdmin kafkaAdmin;
#Bean
public AdminClient kafkaAdminClient() {
return AdminClient.create(kafkaAdmin.getConfigurationProperties());
}
#Bean
public HealthIndicator kafkaHealthIndicator(AdminClient kafkaAdminClient) {
final DescribeClusterOptions options = new DescribeClusterOptions()
.timeoutMs(1000);
return new AbstractHealthIndicator() {
#Override
protected void doHealthCheck(Health.Builder builder) throws Exception {
DescribeClusterResult clusterDescription = kafkaAdminClient.describeCluster(options);
// In order to trip health indicator DOWN retrieve data from one of
// future objects otherwise indicator is UP even when Kafka is down!!!
// When Kafka is not connected future.get() throws an exception which
// in turn sets the indicator DOWN.
clusterDescription.clusterId().get();
// or clusterDescription.nodes().get().size()
// or clusterDescription.controller().get();
builder.up().build();
// Alternatively directly use data from future in health detail.
builder.up()
.withDetail("clusterId", clusterDescription.clusterId().get())
.withDetail("nodeCount", clusterDescription.nodes().get().size())
.build();
}
};
}
}
Use the AdminClient API to check the health of the cluster via describing the cluster and/or the topic(s) you'll be interacting with, and verifying those topics have the required number of insync replicas, for example
Kafka has something like KafkaHealthIndicator out of the box
It doesn't. Spring's Kafka integration might

What are the major issues while using websocket to get stream of data?

I'm doing it first time. Where am going to read stream of data using websocket.
Here is my code snippet
RsvpApplication
#SpringBootApplication
public class RsvpApplication {
private static final String MEETUP_RSVPS_ENDPOINT = "ws://stream.myapi.com/2/rsvps";
public static void main(String[] args) {
SpringApplication.run(RsvpApplication.class, args);
}
#Bean
public ApplicationRunner initializeConnection(
RsvpsWebSocketHandler rsvpsWebSocketHandler) {
return args -> {
System.out.println("initializeConnection");
WebSocketClient rsvpsSocketClient = new StandardWebSocketClient();
rsvpsSocketClient.doHandshake(
rsvpsWebSocketHandler, MEETUP_RSVPS_ENDPOINT);
};
}
}
RsvpsWebSocketHandler
#Component
class RsvpsWebSocketHandler extends AbstractWebSocketHandler {
private static final Logger logger =
Logger.getLogger(RsvpsWebSocketHandler.class.getName());
private final RsvpsKafkaProducer rsvpsKafkaProducer;
public RsvpsWebSocketHandler(RsvpsKafkaProducer rsvpsKafkaProducer) {
this.rsvpsKafkaProducer = rsvpsKafkaProducer;
}
#Override
public void handleMessage(WebSocketSession session,
WebSocketMessage<?> message) {
logger.log(Level.INFO, "New RSVP:\n {0}", message.getPayload());
System.out.println("handleMessage");
rsvpsKafkaProducer.sendRsvpMessage(message);
}
}
RsvpsKafkaProducer
#Component
#EnableBinding(Source.class)
public class RsvpsKafkaProducer {
private static final int SENDING_MESSAGE_TIMEOUT_MS = 10000;
private final Source source;
public RsvpsKafkaProducer(Source source) {
this.source = source;
}
public void sendRsvpMessage(WebSocketMessage<?> message) {
System.out.println("sendRsvpMessage");
source.output()
.send(MessageBuilder.withPayload(message.getPayload())
.build(),
SENDING_MESSAGE_TIMEOUT_MS);
}
}
As far I know and read about websocket is that, It needs one time connection and stream of data will be flowing continuously until either party (client or server) stops.
I'm building it first time, so trying to cover major scenarios which can come acroos while dealing with 10000+ messages per minute. Total kafka brokers are two with enough space.
What can be done, if connection gets lost and again start consuming messages from webscoket once connected back where it was left in last failure and push messages into further Kafka broker ?
What can be done to put on hold websocket to keep pushing messages in broker if it has reached to threshold limit of not processed messages (in broker) ?
What can be done, When broker reached to its threshold, run a separate process to check available space in broker to push more messages and give indication to resume pushing messages in kafka broker ?
Please share other issues, which needs to be considered while setting up this thing ?

Better way of error handling in Kafka Consumer

I have a Springboot app configured with spring-kafka where I want to handle all sorts of error that can happen while listening to a topic. If any message is missed / not able to be consumed because of either Deserialization or any other Exception, there will be 2 retries and after which the message should be logged to an error file. I have two approaches that can be followed :-
First Approach( Using SeekToCurrentErrorHandler with DeadLetterPublishingRecoverer):-
#Autowired
KafkaTemplate<String,Object> template;
#Bean(name = "kafkaSourceProvider")
public ConcurrentKafkaListenerContainerFactory<K, V> consumerFactory() {
Map<String, Object> config = appProperties.getSource()
.getProperties();
ConcurrentKafkaListenerContainerFactory<K, V> factory =
new ConcurrentKafkaListenerContainerFactory<>();
factory.setConsumerFactory(new DefaultKafkaConsumerFactory<>(config));
DeadLetterPublishingRecoverer recoverer = new DeadLetterPublishingRecoverer(template,
(r, e) -> {
if (e instanceof FooException) {
return new TopicPartition(r.topic() + ".DLT", r.partition());
}
});
ErrorHandler errorHandler = new SeekToCurrentErrorHandler(recoverer, new FixedBackOff(0L, 2L));
factory.setErrorHandler(errorHandler);
return factory;
}
But for this we require addition topic(a new .DLT topic) and then we can log it to a file.
#Bean
public KafkaAdmin admin() {
Map<String, Object> configs = new HashMap<>();
configs.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG,
StringUtils.arrayToCommaDelimitedString(kafkaEmbedded().getBrokerAddresses()));
return new KafkaAdmin(configs);
}
#KafkaListener( topics = MY_TOPIC + ".DLT", groupId = MY_ID)
public void listenDlt(ConsumerRecord<String, SomeClassName> consumerRecord,
#Header(KafkaHeaders.DLT_EXCEPTION_STACKTRACE) String exceptionStackTrace) {
logger.error(exceptionStackTrace);
}
Approach 2 ( Using custom SeekToCurrentErrorHandler) :-
#Bean
public ConcurrentKafkaListenerContainerFactory<K, V> consumerFactory() {
Map<String, Object> config = appProperties.getSource()
.getProperties();
ConcurrentKafkaListenerContainerFactory<K, V> factory = new ConcurrentKafkaListenerContainerFactory<>();
factory.setConsumerFactory(new DefaultKafkaConsumerFactory<>(config));
factory.setErrorHandler(new CustomSeekToCurrentErrorHandler());
factory.setRetryTemplate(retryTemplate());
return factory;
}
private RetryTemplate retryTemplate() {
RetryTemplate retryTemplate = new RetryTemplate();
retryTemplate.setBackOffPolicy(backOffPolicy());
retryTemplate.setRetryPolicy(aSimpleReturnPolicy);
}
public class CustomSeekToCurrentErrorHandler extends SeekToCurrentErrorHandler {
private static final int MAX_RETRY_ATTEMPTS = 2;
CustomSeekToCurrentErrorHandler() {
super(MAX_RETRY_ATTEMPTS);
}
#Override
public void handle(Exception exception, List<ConsumerRecord<?, ?>> records, Consumer<?, ?> consumer, MessageListenerContainer container) {
try {
if (!records.isEmpty()) {
log.warn("Exception: {} occurred with message: {}", exception, exception.getMessage());
super.handle(exception, records, consumer, container);
}
} catch (SerializationException e) {
log.warn("Exception: {} occurred with message: {}", e, e.getMessage());
}
}
}
Can anyone provide their suggestions on what's the standard way to implement this kind of feature. In first approach we do see an overhead of creation of .DLT topics and an additional #KafkaListener. In second approach, we can directly log our consumer record exception.
With the first approach, it is not necessary to use a DeadLetterPublishingRecoverer, you can use any ConsumerRecordRecoverer that you want; in fact the default recoverer simply logs the failed message.
/**
* Construct an instance with the default recoverer which simply logs the record after
* the backOff returns STOP for a topic/partition/offset.
* #param backOff the {#link BackOff}.
* #since 2.3
*/
public SeekToCurrentErrorHandler(BackOff backOff) {
this(null, backOff);
}
And, in the FailedRecordTracker...
if (recoverer == null) {
this.recoverer = (rec, thr) -> {
...
logger.error(thr, "Backoff "
+ (failedRecord == null
? "none"
: failedRecord.getBackOffExecution())
+ " exhausted for " + ListenerUtils.recordToString(rec));
};
}
Backoff (and a limit to retries) was added to the error handler after adding retry in the listener adapter, so it's "newer" (and preferred).
Also, using in-memory retry can cause issues with rebalancing if long BackOffs are employed.
Finally, only the SeekToCurrentErrorHandler can deal with deserialization problems (via the ErrorHandlingDeserializer).
EDIT
Use the ErrorHandlingDeserializer together with a SeekToCurrentErrorHandler. Deserialization exceptions are considered fatal and the recoverer is called immediately.
See the documentation.
Here is a simple Spring Boot application that demonstrates it:
public class So63236346Application {
private static final Logger log = LoggerFactory.getLogger(So63236346Application.class);
public static void main(String[] args) {
SpringApplication.run(So63236346Application.class, args);
}
#Bean
public NewTopic topic() {
return TopicBuilder.name("so63236346").partitions(1).replicas(1).build();
}
#Bean
ErrorHandler errorHandler() {
return new SeekToCurrentErrorHandler((rec, ex) -> log.error(ListenerUtils.recordToString(rec, true) + "\n"
+ ex.getMessage()));
}
#KafkaListener(id = "so63236346", topics = "so63236346")
public void listen(String in) {
System.out.println(in);
}
#Bean
public ApplicationRunner runner(KafkaTemplate<String, String> template) {
return args -> {
template.send("so63236346", "{\"field\":\"value1\"}");
template.send("so63236346", "junk");
template.send("so63236346", "{\"field\":\"value2\"}");
};
}
}
package com.example.demo;
public class Thing {
private String field;
public Thing() {
}
public Thing(String field) {
this.field = field;
}
public String getField() {
return this.field;
}
public void setField(String field) {
this.field = field;
}
#Override
public String toString() {
return "Thing [field=" + this.field + "]";
}
}
spring.kafka.consumer.auto-offset-reset=earliest
spring.kafka.consumer.value-deserializer=org.springframework.kafka.support.serializer.ErrorHandlingDeserializer
spring.kafka.consumer.properties.spring.deserializer.value.delegate.class=org.springframework.kafka.support.serializer.JsonDeserializer
spring.kafka.consumer.properties.spring.json.value.default.type=com.example.demo.Thing
Result
Thing [field=value1]
2020-08-10 14:30:14.780 ERROR 78857 --- [o63236346-0-C-1] com.example.demo.So63236346Application : so63236346-0#7
Listener failed; nested exception is org.springframework.kafka.support.serializer.DeserializationException: failed to deserialize; nested exception is org.apache.kafka.common.errors.SerializationException: Can't deserialize data [[106, 117, 110, 107]] from topic [so63236346]
2020-08-10 14:30:14.782 INFO 78857 --- [o63236346-0-C-1] o.a.k.clients.consumer.KafkaConsumer : [Consumer clientId=consumer-so63236346-1, groupId=so63236346] Seeking to offset 8 for partition so63236346-0
Thing [field=value2]
The expectation was to log any exception that we might get at the container level as well as the listener level.
Without retrying, following is the way I have done error handling:-
If we encounter any exception at the container level, we should be able to log the message payload with the error description and seek that offset and skip it and go ahead receiving the next offset. Though it is done only for DeserializationException, the rest of the exceptions also needs to be seek and offsets needs to be skipped for them.
#Component
public class KafkaContainerErrorHandler implements ErrorHandler {
private static final Logger logger = LoggerFactory.getLogger(KafkaContainerErrorHandler.class);
#Override
public void handle(Exception thrownException, List<ConsumerRecord<?, ?>> records, Consumer<?, ?> consumer, MessageListenerContainer container) {
String s = thrownException.getMessage().split("Error deserializing key/value for partition ")[1].split(". If needed, please seek past the record to continue consumption.")[0];
// modify below logic according to your topic nomenclature
String topics = s.substring(0, s.lastIndexOf('-'));
int offset = Integer.parseInt(s.split("offset ")[1]);
int partition = Integer.parseInt(s.substring(s.lastIndexOf('-') + 1).split(" at")[0]);
logger.error("...")
TopicPartition topicPartition = new TopicPartition(topics, partition);
logger.info("Skipping {} - {} offset {}", topics, partition, offset);
consumer.seek(topicPartition, offset + 1);
}
#Override
public void handle(Exception e, ConsumerRecord<?, ?> consumerRecord) {
}
}
factory.setErrorHandler(kafkaContainerErrorHandler);
If we get any exception at the #KafkaListener level, then I am configuring my listener with my custom error handler and logging the exception with the message as can be seen below:-
#Bean("customErrorHandler")
public KafkaListenerErrorHandler listenerErrorHandler() {
return (m, e) -> {
logger.error(...);
return m;
};
}

Categories

Resources