I have something like below which works well, but I would prefer checking health without sending any message (not just checking the socket connection). I know Kafka has something like a KafkaHealthIndicator out of the box; does anyone have experience with or an example of using it?
public class KafkaHealthIndicator implements HealthIndicator {

    private final Logger log = LoggerFactory.getLogger(KafkaHealthIndicator.class);

    private final KafkaTemplate<String, String> kafka;

    public KafkaHealthIndicator(KafkaTemplate<String, String> kafka) {
        this.kafka = kafka;
    }

    @Override
    public Health health() {
        try {
            kafka.send("kafka-health-indicator", "❥").get(100, TimeUnit.MILLISECONDS);
        } catch (InterruptedException | ExecutionException | TimeoutException e) {
            return Health.down(e).build();
        }
        return Health.up().build();
    }
}
In order to trip the health indicator, retrieve data from one of the future objects; otherwise the indicator stays UP even when Kafka is down.
When Kafka is not connected, future.get() throws an exception, which in turn sets the indicator DOWN.
@Configuration
public class KafkaConfig {

    @Autowired
    private KafkaAdmin kafkaAdmin;

    @Bean
    public AdminClient kafkaAdminClient() {
        return AdminClient.create(kafkaAdmin.getConfigurationProperties());
    }

    @Bean
    public HealthIndicator kafkaHealthIndicator(AdminClient kafkaAdminClient) {
        final DescribeClusterOptions options = new DescribeClusterOptions()
                .timeoutMs(1000);
        return new AbstractHealthIndicator() {

            @Override
            protected void doHealthCheck(Health.Builder builder) throws Exception {
                DescribeClusterResult clusterDescription = kafkaAdminClient.describeCluster(options);
                // In order to trip the health indicator DOWN, retrieve data from one of the
                // future objects; otherwise the indicator is UP even when Kafka is down!
                // When Kafka is not connected, future.get() throws an exception, which
                // in turn sets the indicator DOWN.
                clusterDescription.clusterId().get();
                // or clusterDescription.nodes().get().size()
                // or clusterDescription.controller().get();
                builder.up();
                // Alternatively, use the data from the futures directly in the health detail:
                // builder.up()
                //         .withDetail("clusterId", clusterDescription.clusterId().get())
                //         .withDetail("nodeCount", clusterDescription.nodes().get().size());
            }
        };
    }
}
Use the AdminClient API to check the health of the cluster by describing the cluster and/or the topic(s) you'll be interacting with, and by verifying that those topics have the required number of in-sync replicas, for example.
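For instance, a minimal sketch of such a topic-level check, in the same style as the indicator above (the topic name my-topic and the minimum of 2 in-sync replicas are assumptions for illustration):

@Bean
public HealthIndicator kafkaTopicHealthIndicator(AdminClient kafkaAdminClient) {
    return new AbstractHealthIndicator() {

        @Override
        protected void doHealthCheck(Health.Builder builder) throws Exception {
            // describeTopics() fails fast when the cluster is unreachable
            TopicDescription topic = kafkaAdminClient
                    .describeTopics(Collections.singletonList("my-topic"))
                    .all()
                    .get(1, TimeUnit.SECONDS)
                    .get("my-topic");
            // require at least 2 in-sync replicas on every partition (assumed threshold)
            boolean enoughReplicas = topic.partitions().stream()
                    .allMatch(p -> p.isr().size() >= 2);
            if (enoughReplicas) {
                builder.up().withDetail("topic", topic.name());
            }
            else {
                builder.down().withDetail("reason", "under-replicated partitions on " + topic.name());
            }
        }
    };
}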
"Kafka has something like KafkaHealthIndicator out of the box"
It doesn't. Spring's Kafka integration might.
I am trying to configure my Spring AMQP ListenerContainer to allow for a certain type of retry flow that's backwards compatible with a custom rabbit client previously used in the project I'm working on.
The protocol works as follows:
A message is received on a channel.
If processing fails, the message is nacked with the requeue flag set to false
A copy of the message with additional/updated headers (a retry counter) is published to the same queue
The headers are used for filtering incoming messages, but that's not important here.
I would like the behaviour to happen on an opt-in basis, so that more standardised Spring retry flows can be used in cases where compatibility with the old client isn't a concern, and the listeners should be able to work without requiring manual acking.
I have implemented a working solution, which I'll get back to below. Where I'm struggling is to publish the new message after signalling to the container that it should nack the current message, because I can't really find any good hooks after the nack or before the next message.
Reading the documentation, it feels like I'm looking for something analogous to the behaviour of RepublishMessageRecoverer used as the final step of a retry interceptor. The main difference in my case is that I need to republish immediately on failure, not as a final recovery step. I tried to look at the implementation of RepublishMessageRecoverer, but the many layers of indirection made it hard for me to understand where the republishing is triggered, and whether a nack happens before that.
My working implementation looks as follows. Note that I'm using a ThrowsAdvice, but I think an error handler could also be used with nearly identical logic.
/*
 * MyConfig.class, configuring the container factory
 */
@Configuration
public class MyConfig {

    @Bean
    // NB: bean name is important, overwrites the autoconfigured bean
    public SimpleRabbitListenerContainerFactory rabbitListenerContainerFactory(
            ConnectionFactory connectionFactory,
            Jackson2JsonMessageConverter messageConverter,
            RabbitTemplate rabbitTemplate
    ) {
        SimpleRabbitListenerContainerFactory factory = new SimpleRabbitListenerContainerFactory();
        factory.setConnectionFactory(connectionFactory);
        factory.setMessageConverter(messageConverter);
        // AOP
        var a1 = new CustomHeaderInspectionAdvice();
        var a2 = new MyThrowsAdvice(rabbitTemplate);
        Advice[] adviceChain = {a1, a2};
        factory.setAdviceChain(adviceChain);
        return factory;
    }
}
/*
 * MyThrowsAdvice.class, hooking into the exception flow from the listener
 */
public class MyThrowsAdvice implements ThrowsAdvice {

    private static final Logger logger = LoggerFactory.getLogger(MyThrowsAdvice.class);

    private final AmqpTemplate amqpTemplate;

    public MyThrowsAdvice(AmqpTemplate amqpTemplate) {
        this.amqpTemplate = amqpTemplate;
    }

    public void afterThrowing(Method method, Object[] args, Object target, ListenerExecutionFailedException ex) {
        var message = message(args);
        var cause = ex.getCause();
        // opt-in to the old protocol by throwing an instance of BusinessException in business logic
        if (cause instanceof BusinessException) {
            /*
             * NB: Since we want to trigger execution after the current method fails
             * with an exception, we need to schedule it in another thread and delay
             * execution until the nack has happened.
             */
            new Thread(() -> {
                try {
                    Thread.sleep(1000L);
                    var messageProperties = message.getMessageProperties();
                    var count = getCount(messageProperties);
                    messageProperties.setHeader("xb-count", count + 1);
                    var routingKey = messageProperties.getReceivedRoutingKey();
                    var exchange = messageProperties.getReceivedExchange();
                    amqpTemplate.send(exchange, routingKey, message);
                    logger.info("Sent!");
                } catch (InterruptedException e) {
                    logger.error("Sleep interrupted", e);
                }
            }).start();
            // NB: Produce the desired nack.
            throw new AmqpRejectAndDontRequeueException("Business logic exception, message will be re-queued with updated headers", cause);
        }
    }

    private static long getCount(MessageProperties messageProperties) {
        try {
            Long count = messageProperties.getHeader("xb-count");
            return count == null ? 0 : count;
        } catch (Exception e) {
            return 0;
        }
    }

    private static Message message(Object[] args) {
        try {
            return (Message) args[1];
        } catch (Exception e) {
            logger.info("Unexpected argument type", e);
            throw new AmqpRejectAndDontRequeueException(e);
        }
    }
}
Now, as you can imagine, I'm not particularly pleased with the indeterminism of scheduling a new thread with a delay.
So my question is simply: is there any way I could produce a deterministic solution to my problem using the hooks provided by the ListenerContainer?
Your current solution risks message loss, since you are publishing on a different thread after a delay. If the server crashes during that delay, the message is lost.
It would be better to publish immediately to another queue with a TTL and a dead-letter configuration that republishes the expired message back to the original queue.
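A minimal sketch of that topology, assuming hypothetical queue names and a 5-second delay; expired messages are dead-lettered back to the original queue via the default exchange:

@Bean
public Queue retryQueue() {
    return QueueBuilder.durable("my-queue.retry")
            .withArgument("x-message-ttl", 5000)            // delay before redelivery
            .withArgument("x-dead-letter-exchange", "")     // default exchange
            .withArgument("x-dead-letter-routing-key", "my-queue") // back to the original queue
            .build();
}

The advice would then send the updated copy to my-queue.retry instead of republishing to the original queue from a sleeping thread.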
Using the RepublishMessageRecoverer with retries disabled (maxAttempts=1) should do what you need.
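As a rough sketch (the exchange and routing key are placeholders), the interceptor would be built like this and put in the factory's advice chain:

@Bean
public RetryOperationsInterceptor retryInterceptor(RabbitTemplate rabbitTemplate) {
    return RetryInterceptorBuilder.stateless()
            .maxAttempts(1) // no in-memory retries; recover on the first failure
            .recoverer(new RepublishMessageRecoverer(rabbitTemplate, "some.exchange", "some.routing.key"))
            .build();
}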
I am consuming batches in Kafka, where retry is not supported by the Spring Cloud Stream Kafka binder in batch mode. The documentation offers an option: you can configure a SeekToCurrentBatchErrorHandler (using a ListenerContainerCustomizer) to achieve functionality similar to retry in the binder.
I tried that, but with SeekToCurrentBatchErrorHandler it retries more than the configured 3 times.
How can I do that? I would like to retry the whole batch.
Also, how can I send the whole batch to a DLQ topic? For a record listener I used to match deliveryAttempt (retry) against 3 and then send to the DLQ topic; see the check in the listener below.
I have checked this link, which describes exactly my issue, but an example would be a great help. Can I achieve that with the spring-cloud-stream-kafka-binder library? Please explain with an example; I am new to this.
Currently I have the below code.
@Configuration
public class ConsumerConfig {

    @Bean
    public ListenerContainerCustomizer<AbstractMessageListenerContainer<?, ?>> customizer() {
        return (container, dest, group) -> {
            container.getContainerProperties().setAckOnError(false);
            SeekToCurrentBatchErrorHandler seekToCurrentBatchErrorHandler
                    = new SeekToCurrentBatchErrorHandler();
            seekToCurrentBatchErrorHandler.setBackOff(new FixedBackOff(0L, 2L));
            container.setBatchErrorHandler(seekToCurrentBatchErrorHandler);
            //container.setBatchErrorHandler(new BatchLoggingErrorHandler());
        };
    }
}
Listener:
@StreamListener(ActivityChannel.INPUT_CHANNEL)
public void handleActivity(List<Message<Event>> messages,
        @Header(name = KafkaHeaders.ACKNOWLEDGMENT) Acknowledgment acknowledgment,
        @Header(name = "deliveryAttempt", defaultValue = "1") int deliveryAttempt) {
    try {
        log.info("Received activity message with message length {}", messages.size());
        nodeConfigActivityBatchProcessor.processNodeConfigActivity(messages);
        acknowledgment.acknowledge();
        log.debug("Processed activity message {} successfully!!", messages.size());
    } catch (MessagePublishException e) {
        if (deliveryAttempt == 3) {
            log.error(
                    String.format("Exception occurred, sending the message=%s to DLQ due to: ",
                            "message"),
                    e);
            publisher.publishToDlq(EventType.UPDATE_FAILED, "message", e.getMessage());
        } else {
            throw e;
        }
    }
}
After seeing @Gary's response, I added the ListenerContainerCustomizer @Bean with RetryingBatchErrorHandler, but I am not able to import the class. Attaching screenshots.
[screenshot: not able to import RetryingBatchErrorHandler]
[screenshot: my Spring Cloud dependencies]
Use a RetryingBatchErrorHandler to send the whole batch to the DLT.
https://docs.spring.io/spring-kafka/docs/current/reference/html/#retrying-batch-eh
Use a RecoveringBatchErrorHandler where you can throw a BatchListenerFailedException to tell it which record in the batch failed.
https://docs.spring.io/spring-kafka/docs/current/reference/html/#recovering-batch-eh
In both cases provide a DeadLetterPublishingRecoverer to the error handler; disable DLTs in the binder.
EDIT
Here's an example; it uses the newer functional style rather than the deprecated @StreamListener, but the same concepts apply (you should consider moving to the functional style).
@SpringBootApplication
public class So69175145Application {

    public static void main(String[] args) {
        SpringApplication.run(So69175145Application.class, args);
    }

    @Bean
    ListenerContainerCustomizer<AbstractMessageListenerContainer<?, ?>> customizer(
            KafkaTemplate<byte[], byte[]> template) {
        return (container, dest, group) -> {
            container.setBatchErrorHandler(new RetryingBatchErrorHandler(new FixedBackOff(5000L, 2L),
                    new DeadLetterPublishingRecoverer(template,
                            (rec, ex) -> new TopicPartition("errors." + dest + "." + group, rec.partition()))));
        };
    }

    /*
     * DLT topic won't be auto-provisioned since enableDlq is false
     */
    @Bean
    public NewTopic topic() {
        return TopicBuilder.name("errors.so69175145.grp").partitions(1).replicas(1).build();
    }

    /*
     * Functional equivalent of @StreamListener
     */
    @Bean
    public Consumer<List<String>> input() {
        return list -> {
            System.out.println(list);
            throw new RuntimeException("test");
        };
    }

    /*
     * Not needed here - just to show we sent the records to the DLT
     */
    @KafkaListener(id = "so69175145", topics = "errors.so69175145.grp")
    public void listen(String in) {
        System.out.println("From DLT: " + in);
    }
}
spring.cloud.stream.bindings.input-in-0.destination=so69175145
spring.cloud.stream.bindings.input-in-0.group=grp
spring.cloud.stream.bindings.input-in-0.content-type=text/plain
spring.cloud.stream.bindings.input-in-0.consumer.batch-mode=true
# for DLT listener
spring.kafka.consumer.auto-offset-reset=earliest
[foo]
2021-09-14 09:55:32.838 ERROR ...
...
[foo]
2021-09-14 09:55:37.873 ERROR ...
...
[foo]
2021-09-14 09:55:42.886 ERROR ...
...
From DLT: foo
We are using Spring Cloud Stream 2.0 with Kafka as the message broker.
We've implemented a circuit breaker which stops the application context for cases where the target system (a DB or a 3rd-party API) is unavailable, as suggested here: Stop Spring Cloud Stream @StreamListener from listening when target system is down
Now, in Spring Cloud Stream 2.0 there is a way to manage the lifecycle of a binding using the actuator: Binding visualization and control
Is it possible to control the binding lifecycle from code, meaning: in case the target server is down, pause the binding, and when it's up again, resume it?
Sorry, I misread your question.
You can auto-wire the BindingsEndpoint but, unfortunately, its State enum is private, so you can't call changeState() programmatically.
I have opened an issue for this.
EDIT
You can do it with reflection, but it's a bit ugly...
@SpringBootApplication
@EnableBinding(Sink.class)
public class So53476384Application {

    public static void main(String[] args) {
        SpringApplication.run(So53476384Application.class, args);
    }

    @Autowired
    BindingsEndpoint binding;

    @Bean
    public ApplicationRunner runner() {
        return args -> {
            Class<?> clazz = ClassUtils.forName("org.springframework.cloud.stream.endpoint.BindingsEndpoint$State",
                    So53476384Application.class.getClassLoader());
            ReflectionUtils.doWithMethods(BindingsEndpoint.class, method -> {
                try {
                    method.invoke(this.binding, "input", clazz.getEnumConstants()[2]); // PAUSED
                }
                catch (InvocationTargetException e) {
                    e.printStackTrace();
                }
            }, method -> method.getName().equals("changeState"));
        };
    }

    @StreamListener(Sink.INPUT)
    public void listen(String in) {
    }
}
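For completeness, the same state change can be made over HTTP through the actuator endpoint described in Binding visualization and control, which avoids the reflection (host, port and binding name are placeholders):

curl -d '{"state":"PAUSED"}' -H "Content-Type: application/json" -X POST http://localhost:8080/actuator/bindings/input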
I do not want to commit offsets for messages whose processing fails; I want them to be re-delivered for processing. I am using spring-kafka 1.2.x and have implemented ConsumerSeekAware in my listener.
@Component
public class Listener implements ConsumerSeekAware {

    private static Logger logger = LoggerFactory.getLogger(Listener.class);

    private final ThreadLocal<ConsumerSeekCallback> seekCallBack = new ThreadLocal<>();

    @KafkaListener(topics = "my-topic", containerFactory = "kafkaManualAckListenerContainerFactory")
    public void listen1(ConsumerRecord<String, String> consumerRecord, Acknowledgment ack) throws MyCustomException {
        logger.info("received: key - " + consumerRecord.key() + " value - " + consumerRecord.value());
        // Below code is just to show the issue. Not acknowledging so I can get the same msg again.
        boolean shouldCommit = false;
        if (shouldCommit) {
            ack.acknowledge();
        }
        else {
            this.seekCallBack.get().seek(consumerRecord.topic(), consumerRecord.partition(), consumerRecord.offset());
        }
    }

    @Override
    public void registerSeekCallback(ConsumerSeekCallback callback) {
        logger.info("registerSeekCallback called..");
        this.seekCallBack.set(callback);
    }

    @Override
    public void onPartitionsAssigned(Map<TopicPartition, Long> assignments, ConsumerSeekCallback callback) {
        logger.info("onPartitionsAssigned called..");
    }

    @Override
    public void onIdleContainer(Map<TopicPartition, Long> assignments, ConsumerSeekCallback callback) {
        logger.info("onIdleContainer called..");
    }
}
######### Container config (auto.commit is false in the consumer)
factory.getContainerProperties().setAckOnError(false);
factory.getContainerProperties().setAckMode(AbstractMessageListenerContainer.AckMode.MANUAL_IMMEDIATE);
The problem I am facing: if I have 10 messages in different partitions of a topic, I get all of them one by one, but after receiving all the messages I keep getting the last message for each partition. I also tried SeekToCurrentErrorHandler, which was introduced in version 2.0.x, and it works perfectly, but I cannot upgrade my Kafka version. If I restart the container I get all the messages again, which is fine, but I don't want to stop the container when processing of a message fails.
So my question is: is it even possible to get exactly the same behaviour as SeekToCurrentErrorHandler (without needing to stop the container) in spring-kafka 1.2.x?
I'm just starting to learn Spring Cloud Stream and Data Flow, and I want to ask about one use case that is important to me. I created an example processor, Multiplier, which takes a message and resends it 5 times to the output.
@EnableBinding(Processor.class)
public class MultiplierProcessor {

    @Autowired
    private Source source;

    private int repeats = 5;

    @Transactional
    @StreamListener(Processor.INPUT)
    public void handle(String payload) {
        for (int i = 0; i < repeats; i++) {
            if (i == 4) {
                throw new RuntimeException("EXCEPTION");
            }
            source.output().send(new GenericMessage<>(payload));
        }
    }
}
As you can see, the processor crashes before the 5th send. Why? Because it can (programs throw exceptions). In this case I wanted to practice fault prevention in Spring Cloud Stream.
What I would like to achieve is to have the input message backed up in a DLQ and the 4 messages that were sent before to be rolled back and not consumed by the next operand (just like in a normal JMS transaction). I already tried to define the following properties in my processor project, but without success.
spring.cloud.stream.bindings.output.producer.autoBindDlq=true
spring.cloud.stream.bindings.output.producer.republishToDlq=true
spring.cloud.stream.bindings.output.producer.transacted=true
spring.cloud.stream.bindings.input.consumer.autoBindDlq=true
Could you tell me if this is possible, and what am I doing wrong? I would be overwhelmingly thankful for some examples.
You have several issues with your configuration:
missing .rabbit in the rabbit-specific properties
you need a group name and a durable subscription to use autoBindDlq
autoBindDlq doesn't apply on the output side
The consumer has to be transacted so that the producer sends are performed in the same transaction.
I just tested this with 1.0.2.RELEASE:
spring.cloud.stream.bindings.output.destination=so8400out
spring.cloud.stream.rabbit.bindings.output.producer.transacted=true
spring.cloud.stream.bindings.input.destination=so8400in
spring.cloud.stream.bindings.input.group=so8400
spring.cloud.stream.rabbit.bindings.input.consumer.durableSubscription=true
spring.cloud.stream.rabbit.bindings.input.consumer.autoBindDlq=true
spring.cloud.stream.rabbit.bindings.input.consumer.transacted=true
and it worked as expected.
EDIT
Actually, no, the published messages were not rolled back. Investigating...
EDIT2
OK; it does work, but you can't use republishToDlq, because when that is enabled, the binder publishes the failed message to the DLQ and the transaction is committed.
When that is false, the exception is thrown to the container, the transaction is rolled back, and RabbitMQ moves the failed message to the DLQ.
Note, however, that retry is enabled by default (3 attempts) so, if your processor succeeds during retry, you will get duplicates in your output.
For this to work as you want, you need to disable retry by setting the max attempts to 1 (and don't use republishToDlq).
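That is a single consumer property; assuming the binding is named input as above:

spring.cloud.stream.bindings.input.consumer.maxAttempts=1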
EDIT3
OK, if you want more control over the publishing of the errors, this will work, when the fix for this JIRA is applied to Spring AMQP...
@SpringBootApplication
@EnableBinding({ Processor.class, So39018400Application.Errors.class })
public class So39018400Application {

    public static void main(String[] args) {
        SpringApplication.run(So39018400Application.class, args);
    }

    @Bean
    public Foo foo() {
        return new Foo();
    }

    public interface Errors {

        @Output("errors")
        MessageChannel errorChannel();

    }

    private static class Foo {

        @Autowired
        Source source;

        @Autowired
        Errors errors;

        @StreamListener(Processor.INPUT)
        public void handle(Message<byte[]> in) {
            try {
                source.output().send(new GenericMessage<>("foo"));
                source.output().send(new GenericMessage<>("foo"));
                throw new RuntimeException("foo");
            }
            catch (RuntimeException e) {
                errors.errorChannel().send(MessageBuilder.fromMessage(in)
                        .setHeader("foo", "bar") // add whatever you want, stack trace etc.
                        .build());
                throw e;
            }
        }
    }
}
with properties:
spring.cloud.stream.bindings.output.destination=so8400out
spring.cloud.stream.bindings.errors.destination=so8400errors
spring.cloud.stream.rabbit.bindings.errors.producer.transacted=false
spring.cloud.stream.rabbit.bindings.output.producer.transacted=true
spring.cloud.stream.bindings.input.destination=so8400in
spring.cloud.stream.bindings.input.group=so8400
spring.cloud.stream.rabbit.bindings.input.consumer.transacted=true
spring.cloud.stream.rabbit.bindings.input.consumer.requeue-rejected=false
spring.cloud.stream.bindings.input.consumer.max-attempts=1