I want to do the following: when a message fails and lands in my dead letter queue, I want to wait 5 minutes and then republish the same message to my queue.
Today, using Spring Cloud Stream and RabbitMQ, I wrote the following code based on this documentation:
@Component
public class HandlerDlq {

    private static final Logger LOGGER = LoggerFactory.getLogger(HandlerDlq.class);
    private static final String X_RETRIES_HEADER = "x-retries";
    private static final String X_DELAY_HEADER = "x-delay";
    private static final int NUMBER_OF_RETRIES = 3;
    private static final int DELAY_MS = 300000;

    private RabbitTemplate rabbitTemplate;

    public HandlerDlq(RabbitTemplate rabbitTemplate) {
        this.rabbitTemplate = rabbitTemplate;
    }

    @RabbitListener(queues = MessageInputProcessor.DLQ)
    public void rePublish(Message failedMessage) {
        Map<String, Object> headers = failedMessage.getMessageProperties().getHeaders();
        Integer retriesHeader = (Integer) headers.get(X_RETRIES_HEADER);
        if (retriesHeader == null) {
            retriesHeader = 0;
        }
        if (retriesHeader > NUMBER_OF_RETRIES) {
            // Retries exhausted: park the message and acknowledge the delivery
            LOGGER.warn("Message {} added to failed messages queue", failedMessage);
            this.rabbitTemplate.send(MessageInputProcessor.FAILED, failedMessage);
            throw new ImmediateAcknowledgeAmqpException("Message failed after " + NUMBER_OF_RETRIES + " attempts");
        }
        // Count this attempt before republishing
        retriesHeader++;
        headers.put(X_RETRIES_HEADER, retriesHeader);
        headers.put(X_DELAY_HEADER, DELAY_MS * retriesHeader);
        LOGGER.warn("Retrying message, {} attempts", retriesHeader);
        this.rabbitTemplate.send(MessageInputProcessor.DELAY_EXCHANGE, MessageInputProcessor.INPUT_DESTINATION, failedMessage);
    }

    @Bean
    public DirectExchange delayExchange() {
        DirectExchange exchange = new DirectExchange(MessageInputProcessor.DELAY_EXCHANGE);
        exchange.setDelayed(true); // requires the RabbitMQ delayed message exchange plugin
        return exchange;
    }

    @Bean
    public Binding bindOriginalToDelay() {
        return BindingBuilder.bind(new Queue(MessageInputProcessor.INPUT_DESTINATION)).to(delayExchange()).with(MessageInputProcessor.INPUT_DESTINATION);
    }

    @Bean
    public Queue parkingLot() {
        return new Queue(MessageInputProcessor.FAILED);
    }
}
My MessageInputProcessor interface:
public interface MessageInputProcessor {
    String INPUT = "myInput";
    String INPUT_DESTINATION = "myInput.group";
    String DLQ = INPUT_DESTINATION + ".dlq"; //from application.properties file
    String FAILED = INPUT + "-failed";

    SubscribableChannel storageManagerInput();
    SubscribableChannel storageManagerFailed();
}
And my properties file:
#dlx/dlq setup - retry dead letter 5 minutes later (300000ms later)
With this code, I can read from the dead letter queue and capture the header, but I can't put the message back on my queue (the line LOGGER.warn("Retrying message, {} attempts", retriesHeader); only runs once, even if I set a very short delay).
My guess is that the method bindOriginalToDelay is binding the exchange to a new queue rather than to mine. However, I couldn't find a way to bind my existing queue there instead of creating a new one. I'm not even sure this is the cause.
I've also tried sending to MessageInputProcessor.INPUT instead of MessageInputProcessor.INPUT_DESTINATION, but it didn't work as expected.
Also, unfortunately, I can't upgrade the Spring framework because of other dependencies in the project.
Could you help me put the failed message back on my queue after some time? I'd really rather not put a Thread.sleep there.
With that configuration, myInput.group is bound to the delayed (topic) exchange myInput with routing key #.
You should probably remove spring.cloud.stream.rabbit.bindings.myInput.consumer.delayedExchange=true because you don't need the main exchange to be delayed.
It will also be bound to your explicit delayed exchange, with key myInput.group.
Everything looks correct to me; you should see the same (single) queue bound to two exchanges:
The myInput.group.dlq is bound to DLX with key myInput.group
You should set a longer TTL and examine the message in the DLQ to see if something stands out.
I just copied your code with a 5-second delay and it worked fine for me (after turning off the delay on the main exchange).
Retrying message, 4 attempts
added to failed messages queue
Perhaps you thought it was not working because you have a delay on the main exchange too?
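The retry bookkeeping discussed in this thread can be tried out without a broker. The following is only a minimal sketch, assuming the counter is incremented before the headers are written; the class `RetryDecision` and its method `next` are hypothetical names, while the header names and limits are taken from the question.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: mirrors the header bookkeeping of HandlerDlq without Spring or RabbitMQ.
final class RetryDecision {
    static final String X_RETRIES_HEADER = "x-retries";
    static final String X_DELAY_HEADER = "x-delay";
    static final int NUMBER_OF_RETRIES = 3;
    static final int DELAY_MS = 300_000;

    /**
     * Updates the headers for the next attempt.
     * Returns true if the message should be republished to the delayed exchange,
     * false if it should be sent to the failed-messages queue instead.
     */
    static boolean next(Map<String, Object> headers) {
        Integer retries = (Integer) headers.get(X_RETRIES_HEADER);
        if (retries == null) {
            retries = 0;
        }
        if (retries > NUMBER_OF_RETRIES) {
            return false; // retries exhausted: park the message
        }
        retries++;
        headers.put(X_RETRIES_HEADER, retries);
        headers.put(X_DELAY_HEADER, DELAY_MS * retries); // delay grows with each attempt
        return true;
    }

    public static void main(String[] args) {
        Map<String, Object> headers = new HashMap<>();
        while (next(headers)) {
            System.out.println("Retrying message, " + headers.get(X_RETRIES_HEADER)
                    + " attempts, delay " + headers.get(X_DELAY_HEADER) + " ms");
        }
        System.out.println("added to failed messages queue");
    }
}
```

With the `>` comparison from the question, the counter reaches 4 before the message is parked, which matches the "Retrying message, 4 attempts" line in the answer's log.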
I am trying to configure my Spring AMQP ListenerContainer to allow for a certain type of retry flow that's backwards compatible with a custom rabbit client previously used in the project I'm working on.
The protocol works as follows:
1. A message is received on a channel.
2. If processing fails, the message is nacked with the requeue flag set to false.
3. A copy of the message with additional/updated headers (a retry counter) is published to the same queue.
The headers are used for filtering incoming messages, but that's not important here.
I would like the behaviour to happen on an opt-in basis, so that more standardised Spring retry flows can be used in cases where compatibility with the old client isn't a concern, and the listeners should be able to work without requiring manual acking.
I have implemented a working solution, which I'll get back to below. Where I'm struggling is to publish the new message after signalling to the container that it should nack the current message, because I can't really find any good hooks after the nack or before the next message.
Reading the documentation, it feels like I'm looking for something analogous to the behaviour of RepublishMessageRecoverer used as the final step of a retry interceptor. The main difference in my case is that I need to republish immediately on failure, not as a final recovery step. I tried to look at the implementation of RepublishMessageRecoverer, but the many layers of indirection made it hard for me to understand where the republishing is triggered, and whether a nack precedes it.
My working implementation looks as follows. Note that I'm using an AfterThrowsAdvice, but I think an error handler could also be used with nearly identical logic.
MyConfig.class, configuring the container factory
@Configuration
public class MyConfig {

    // NB: bean name is important, overwrites autoconfigured bean
    @Bean
    public SimpleRabbitListenerContainerFactory rabbitListenerContainerFactory(
            ConnectionFactory connectionFactory,
            Jackson2JsonMessageConverter messageConverter,
            RabbitTemplate rabbitTemplate
    ) {
        SimpleRabbitListenerContainerFactory factory = new SimpleRabbitListenerContainerFactory();
        factory.setConnectionFactory(connectionFactory);
        factory.setMessageConverter(messageConverter);

        // AOP
        var a1 = new CustomHeaderInspectionAdvice();
        var a2 = new MyThrowsAdvice(rabbitTemplate);
        Advice[] adviceChain = {a1, a2};
        factory.setAdviceChain(adviceChain);

        return factory;
    }
}
MyThrowsAdvice.class, hooking into the exception flow from the listener
public class MyThrowsAdvice implements ThrowsAdvice {

    private static final Logger logger = LoggerFactory.getLogger(MyThrowsAdvice.class);

    private final AmqpTemplate amqpTemplate;

    public MyThrowsAdvice(AmqpTemplate amqpTemplate) {
        this.amqpTemplate = amqpTemplate;
    }

    public void afterThrowing(Method method, Object[] args, Object target, ListenerExecutionFailedException ex) {
        var message = message(args);
        var cause = ex.getCause();

        // opt-in to old protocol by throwing an instance of BusinessException in business logic
        if (cause instanceof BusinessException) {
            /*
             * NB: Since we want to trigger execution after the current method fails
             * with an exception we need to schedule it in another thread and delay
             * execution until the nack has happened.
             */
            new Thread(() -> {
                try {
                    // Placeholder delay: the actual value is not shown in the original post
                    Thread.sleep(1000);
                    var messageProperties = message.getMessageProperties();
                    var count = getCount(messageProperties);
                    messageProperties.setHeader("xb-count", count + 1);
                    var routingKey = messageProperties.getReceivedRoutingKey();
                    var exchange = messageProperties.getReceivedExchange();
                    amqpTemplate.send(exchange, routingKey, message);
                } catch (InterruptedException e) {
                    logger.error("Sleep interrupted", e);
                }
            }).start();

            // NB: Produce the desired nack.
            throw new AmqpRejectAndDontRequeueException("Business logic exception, message will be re-queued with updated headers", cause);
        }
    }

    // Reads the retry counter header, defaulting to 0 when it is absent or unreadable
    private static long getCount(MessageProperties messageProperties) {
        try {
            Long c = messageProperties.getHeader("xb-count");
            return c == null ? 0 : c;
        } catch (Exception e) {
            return 0;
        }
    }

    // Extracts the message from the advised method's arguments
    private static Message message(Object[] args) {
        try {
            return (Message) args[1];
        } catch (Exception e) {
            logger.info("Bad cast parse", e);
            throw new AmqpRejectAndDontRequeueException(e);
        }
    }
}
Now, as you can imagine, I'm not particularly pleased with the nondeterminism of scheduling a new thread with a delay.
So my question is simply: is there any way I could produce a deterministic solution to my problem using the provided hooks of the ListenerContainer?
Your current solution risks message loss because you are publishing on a different thread after a delay. If the server crashes during that delay, the message is lost.
It would be better to publish immediately to another queue with a TTL and a dead-letter configuration that routes the expired message back to the original queue.
Using the RepublishMessageRecoverer with the retry interceptor's maxAttempts set to 1 should do what you need.
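The TTL and dead-letter setup suggested above comes down to three queue arguments on the intermediate queue. Below is a minimal sketch of building them; the class `RetryQueueArguments` and the queue name used in `main` are hypothetical, while the argument keys are the standard RabbitMQ ones. The sketch assumes expired messages should be routed through the default exchange (the empty string), which delivers to the queue whose name equals the routing key.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: builds the arguments for a "wait" queue whose expired
// messages are dead-lettered back to the original queue.
final class RetryQueueArguments {

    static Map<String, Object> forRetryQueue(String originalQueue, long ttlMs) {
        Map<String, Object> arguments = new HashMap<>();
        // Messages expire after ttlMs milliseconds in the wait queue
        arguments.put("x-message-ttl", ttlMs);
        // Expired messages are republished via the default exchange ...
        arguments.put("x-dead-letter-exchange", "");
        // ... using the original queue's name as the routing key
        arguments.put("x-dead-letter-routing-key", originalQueue);
        return arguments;
    }

    public static void main(String[] args) {
        // "orders" is a made-up queue name used only for illustration
        System.out.println(forRetryQueue("orders", 5_000L));
    }
}
```

A map like this would typically be passed as the arguments of the queue declaration, for example through the arguments parameter of Spring AMQP's `Queue` constructor.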
I am trying to calculate the rate of incoming events per minute from a Kafka topic based on event time. I am using TumblingEventTimeWindows of 1 minute for this. The code snippet is given below.
I have observed that if I am not receiving any event for a particular window, e.g. from 2.34 to 2.35, then the previous window of 2.33 to 2.34 does not get closed. I understand the risk of losing data for the window of 2.33 to 2.34 (may happen due to system failure, bigger Kafka lag, etc.), but I cannot wait indefinitely. I need to close this window after waiting for a certain period of time, and subsequent windows can continue after the system recovers. How can I achieve this?
StreamExecutionEnvironment executionEnvironment = StreamExecutionEnvironment.getExecutionEnvironment();
org.apache.flink.api.common.time.Time.of(10, TimeUnit.SECONDS)
Properties properties = new Properties();
properties.setProperty("bootstrap.servers", "localhost:9092");
properties.setProperty("group.id", "AllEventCountConsumerGroup");
FlinkKafkaConsumer<String> kafkaConsumer = new FlinkKafkaConsumer<>("event_input_topic", new SimpleStringSchema(), properties);
DataStreamSource<String> kafkaDataStream = environment.addSource(kafkaConsumer);
.flatMap(new EventFlatter())
.withTimestampAssigner((SerializableTimestampAssigner<Entity>) (element, recordTimestamp) -> element.getTimestamp()))
.assignTimestampsAndWatermarks(new EntityWatermarkStrategy())
.keyBy((KeySelector<Entity, String>) Entity::getTenant)
.aggregate(new EventCountAggregator())
private static class EntityWatermarkStrategy implements WatermarkStrategy<Entity> {
    @Override
    public WatermarkGenerator<Entity> createWatermarkGenerator(WatermarkGeneratorSupplier.Context context) {
        return new EntityWatermarkGenerator();
    }
}

private static class EntityWatermarkGenerator implements WatermarkGenerator<Entity> {
    private long maxTimestamp;
    public EntityWatermarkGenerator() {
        this.maxTimestamp = Long.MIN_VALUE + 1;
    }
    @Override
    public void onEvent(Entity event, long eventTimestamp, WatermarkOutput output) {
        maxTimestamp = Math.max(maxTimestamp, eventTimestamp);
    }
    @Override
    public void onPeriodicEmit(WatermarkOutput output) {
        output.emitWatermark(new Watermark(maxTimestamp + 2));
    }
}
Also, I tried adding some custom triggers, but it didn't help. I am using Apache Flink 1.11.
Can somebody suggest what I am doing wrong?
When I push more data with a newer timestamp (say t+1) to the topic, the results for the earlier time frame (t) are emitted, but the same issue then occurs for the t+1 data.
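The behaviour described in the question follows from how event-time windows fire: a window can only close once the watermark reaches the window's last timestamp, and the generator in the question only advances the watermark when events arrive. The sketch below reproduces that arithmetic in plain Java. `TumblingWindowMath` and its methods are hypothetical names for illustration, not Flink API, and the sketch assumes windows aligned to the epoch with no offset.

```java
// Hypothetical helper: reproduces the arithmetic of a tumbling event-time window.
final class TumblingWindowMath {

    /** Start (inclusive) of the window of the given size that contains the timestamp. */
    static long windowStart(long timestampMs, long sizeMs) {
        return timestampMs - Math.floorMod(timestampMs, sizeMs);
    }

    /** End (exclusive) of that window. */
    static long windowEnd(long timestampMs, long sizeMs) {
        return windowStart(timestampMs, sizeMs) + sizeMs;
    }

    /** A window may fire once the watermark has reached its last timestamp (end - 1). */
    static boolean canFire(long watermarkMs, long windowEndMs) {
        return watermarkMs >= windowEndMs - 1;
    }

    public static void main(String[] args) {
        long minute = 60_000L;
        long lastEvent = 2 * minute + 45_000L;    // last event seen, 45 s into the third minute
        long end = windowEnd(lastEvent, minute);  // end of the window holding that event
        long watermark = lastEvent + 2;           // generator from the question: maxTimestamp + 2
        System.out.println("can fire with no further events: " + canFire(watermark, end));

        long nextEvent = 3 * minute + 5_000L;     // an event in the following minute arrives
        System.out.println("can fire after a later event: " + canFire(nextEvent + 2, end));
    }
}
```

As long as no event with a timestamp at or beyond the window end arrives, the watermark stays behind the window end and the window remains open.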
One reason why withIdleness() isn't helping in your case is that you are calling assignTimestampsAndWatermarks on the datastream after it has been emitted by the kafka source, rather than calling it on the FlinkKafkaConsumer itself. If you were to do the latter, then the FlinkKafkaConsumer would be able to assign timestamps and watermarks on a per-partition basis, and would consider idleness at the granularity of each individual kafka partition. See Watermark Strategies and the Kafka Connector for more info.
To make this work, however, you'll need to use a deserializer other than a SimpleStringSchema (such as a KafkaDeserializationSchema) that is able to create individual stream records, with timestamps. See https://stackoverflow.com/a/62072265/2000823 for an example of how to implement a
Keep in mind, however, that withIdleness() will not advance the watermark if all partitions are idle. What it will do is to prevent idle partitions from holding back the watermark, which may advance if there are events from other partitions.
See the idle partitions documentation for an approach to solving your problem.
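One possible approach, shown here only as a sketch, is a watermark that falls back to wall-clock progress once no events have arrived for a configurable timeout. The class `IdleAwareWatermark` is hypothetical and not part of Flink; in a real job this logic would live in a `WatermarkGenerator`, with `onEvent` recording events and `onPeriodicEmit` emitting the computed value. It assumes event timestamps roughly track wall-clock time, and any events that later arrive with older timestamps would be treated as late.

```java
// Hypothetical sketch: advances the watermark with the wall clock when the source is idle.
final class IdleAwareWatermark {
    private final long idleTimeoutMs;
    private long maxEventTimestampMs = Long.MIN_VALUE;
    private long lastEventWallClockMs;

    IdleAwareWatermark(long idleTimeoutMs) {
        this.idleTimeoutMs = idleTimeoutMs;
    }

    /** Records an event together with the wall-clock time at which it was seen. */
    void onEvent(long eventTimestampMs, long nowMs) {
        maxEventTimestampMs = Math.max(maxEventTimestampMs, eventTimestampMs);
        lastEventWallClockMs = nowMs;
    }

    /** Watermark to emit at wall-clock time nowMs. */
    long currentWatermark(long nowMs) {
        if (maxEventTimestampMs == Long.MIN_VALUE) {
            return Long.MIN_VALUE; // nothing seen yet
        }
        long idleForMs = nowMs - lastEventWallClockMs;
        if (idleForMs <= idleTimeoutMs) {
            return maxEventTimestampMs; // normal case: driven by events only
        }
        // Idle: move event time forward by the wall-clock time spent beyond the timeout
        return maxEventTimestampMs + (idleForMs - idleTimeoutMs);
    }

    public static void main(String[] args) {
        IdleAwareWatermark watermark = new IdleAwareWatermark(10_000L);
        watermark.onEvent(165_000L, 1_000L);
        System.out.println("shortly after the event: " + watermark.currentWatermark(5_000L));
        System.out.println("after 30 s of silence:   " + watermark.currentWatermark(31_000L));
    }
}
```

Passing the clock in as a parameter keeps the sketch deterministic; a real generator would read the current time itself on each periodic emit.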
Using the Flink 1.11+ WatermarkStrategy API should help you avoid pumping dummy data. What you need is to generate a watermark periodically, at the end of each minute. Here is how we did it:
Create a flinkKafkaConsumer with CustomKafkaSerializer:
FlinkKafkaConsumer otherConsumer = new FlinkKafkaConsumer(
topics, new CustomKafkaSerializer(apacheFlinkEnvironmentLoader), props);
For how to create CustomKafkaSerializer, see: Two questions about Flink deserializing
Now use watermark Strategy for this flinkKafkaConsumer:
FlinkKafkaConsumer<Tuple3<String,String,String>> flinkKafkaConsumer = apacheKafkaConfig.getOtherConsumer();
flinkKafkaConsumer.assignTimestampsAndWatermarks(new ApacheFlinkWaterMarkStrategy(envConfig.getOutOfOrderDurationSeconds()).
This is what the watermark strategy looks like:
public class ApacheFlinkWaterMarkStrategy implements WatermarkStrategy<Tuple3<String, String, String>> {
    private long outOfOrderDuration;

    public ApacheFlinkWaterMarkStrategy(long outOfOrderDuration) {
        this.outOfOrderDuration = outOfOrderDuration;
    }

    @Override
    public TimestampAssigner<Tuple3<String, String, String>> createTimestampAssigner(TimestampAssignerSupplier.Context context) {
        return new ApacheFlinkTimeForEvent();
    }

    @Override
    public WatermarkGenerator<Tuple3<String, String, String>> createWatermarkGenerator(WatermarkGeneratorSupplier.Context context) {
        return new ApacheFlinkWaterMarkGenerator(this.outOfOrderDuration);
    }
}
This is how we get event time from payload:
public class ApacheFlinkTimeForEvent implements SerializableTimestampAssigner<Tuple3<String,String,String>> {
public static final Logger logger = LoggerFactory.getLogger(ApacheFlinkTimeForEvent.class);
private static final FhirContext fhirContext = FhirContext.forR4();
public long extractTimestamp(Tuple3<String,String,String> o, long l) {
//get timestamp from payload
This is how we generate watermarks periodically, so that the watermark is updated every minute in each partition regardless of whether data arrives:
public class ApacheFlinkWaterMarkGenerator implements WatermarkGenerator<Tuple3<String,String,String>> {
    public static final Logger logger = LoggerFactory.getLogger(ApacheFlinkWaterMarkGenerator.class);
    private long outOfOrderGenerator;
    private long maxEventTimeStamp;

    public ApacheFlinkWaterMarkGenerator(long outOfOrderGenerator) {
        this.outOfOrderGenerator = outOfOrderGenerator;
    }

    @Override
    public void onEvent(Tuple3<String, String, String> stringStringStringTuple3, long l, WatermarkOutput watermarkOutput) {
        maxEventTimeStamp = Math.max(maxEventTimeStamp, l);
        Watermark eventWatermark = new Watermark(maxEventTimeStamp);
        watermarkOutput.emitWatermark(eventWatermark);
        logger.info("Current Watermark emitted from event is {}", eventWatermark.getFormattedTimestamp());
    }

    @Override
    public void onPeriodicEmit(WatermarkOutput watermarkOutput) {
        // Driven by the wall clock, so the watermark advances even when no events arrive
        long currentUtcTime = Instant.now().toEpochMilli();
        Watermark periodicWaterMark = new Watermark(currentUtcTime - outOfOrderGenerator);
        watermarkOutput.emitWatermark(periodicWaterMark);
        logger.info("Current Watermark emitted periodically is {}", periodicWaterMark.getFormattedTimestamp());
    }
}
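Also, the interval for periodic watermark emission has to be set up front, on the execution environment (the value is a long, in milliseconds):
streamExecutionEnvironment.getConfig().setAutoWatermarkInterval(intervalMs); // intervalMs: placeholder for the interval in milliseconds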
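This is how we attach the custom watermark strategy and timestamp assigner to the flinkKafkaConsumer (outOfOrderSeconds and idlePartitionSeconds are placeholders for your own values):
flinkKafkaConsumer.assignTimestampsAndWatermarks(
        new ApacheFlinkWaterMarkStrategy(outOfOrderSeconds)
                .withIdleness(Duration.ofSeconds(idlePartitionSeconds)));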
I'm doing this for the first time: I'm going to read a stream of data over a WebSocket.
Here is my code snippet:
public class RsvpApplication {
private static final String MEETUP_RSVPS_ENDPOINT = "ws://stream.myapi.com/2/rsvps";
public static void main(String[] args) {
SpringApplication.run(RsvpApplication.class, args);
public ApplicationRunner initializeConnection(
RsvpsWebSocketHandler rsvpsWebSocketHandler) {
return args -> {
WebSocketClient rsvpsSocketClient = new StandardWebSocketClient();
rsvpsWebSocketHandler, MEETUP_RSVPS_ENDPOINT);
class RsvpsWebSocketHandler extends AbstractWebSocketHandler {
private static final Logger logger =
private final RsvpsKafkaProducer rsvpsKafkaProducer;
public RsvpsWebSocketHandler(RsvpsKafkaProducer rsvpsKafkaProducer) {
this.rsvpsKafkaProducer = rsvpsKafkaProducer;
public void handleMessage(WebSocketSession session,
WebSocketMessage<?> message) {
logger.log(Level.INFO, "New RSVP:\n {0}", message.getPayload());
public class RsvpsKafkaProducer {
private static final int SENDING_MESSAGE_TIMEOUT_MS = 10000;
private final Source source;
public RsvpsKafkaProducer(Source source) {
this.source = source;
public void sendRsvpMessage(WebSocketMessage<?> message) {
As far as I know from reading about WebSocket, it needs a one-time connection, and data then flows continuously until either party (client or server) stops.
I'm building this for the first time, so I'm trying to cover the major scenarios that can come up when dealing with 10,000+ messages per minute. There are two Kafka brokers in total, with enough space.
What can be done if the connection is lost, so that once reconnected the application resumes consuming messages from the WebSocket where it left off at the last failure and pushes them on to the Kafka broker?
What can be done to pause the WebSocket from pushing messages to the broker once the broker has reached a threshold of unprocessed messages?
When the broker has reached its threshold, what can be done to run a separate process that checks the available space on the broker and signals when pushing messages to the Kafka broker can resume?
Please share any other issues that need to be considered when setting this up.
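The threshold questions above amount to putting a bounded buffer between the WebSocket handler and the Kafka producer. The following is only a sketch of that idea in plain Java: the class `BackpressureBuffer` and its methods are hypothetical, and what to do when the buffer is full (drop, close the WebSocket session, or persist elsewhere) is left to the caller.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical sketch: a bounded buffer between the WebSocket handler and the Kafka sender.
final class BackpressureBuffer {
    private final BlockingQueue<String> pending;

    BackpressureBuffer(int capacity) {
        this.pending = new ArrayBlockingQueue<>(capacity);
    }

    /** Called by the WebSocket handler; returns false when the threshold is reached. */
    boolean tryAccept(String message) {
        return pending.offer(message); // non-blocking: fails fast instead of growing without bound
    }

    /** Called by the sender; returns null when there is nothing to send. */
    String nextToSend() {
        return pending.poll();
    }

    /** Number of messages accepted but not yet sent. */
    int backlog() {
        return pending.size();
    }

    public static void main(String[] args) {
        BackpressureBuffer buffer = new BackpressureBuffer(2);
        System.out.println(buffer.tryAccept("rsvp-1"));
        System.out.println(buffer.tryAccept("rsvp-2"));
        System.out.println(buffer.tryAccept("rsvp-3")); // buffer full: caller must pause or drop
        buffer.nextToSend();                            // sender drains one message
        System.out.println(buffer.tryAccept("rsvp-3")); // room again: pushing can resume
    }
}
```

Note that a plain WebSocket stream has no replay, so resuming "where it left off" after a disconnect depends on the remote API offering some form of offset or since-parameter.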
I am trying to build a simple cloud stream application with kafka binding. Let me describe the set up.
1. I have a producer producing to topic topic_1.
2. There's a stream binder, binding topic_1 after some processing into topic_2.
public String handleIncomingMsgs(String s) {
logger.info(s); // prints all the messages
return s;
When the producer produces messages, the StreamListener handleIncomingMsgs receives all of them.
After receiving them, it should forward the messages to another channel.
public class LogMsg {
public void handle(String board) {
logger.info("Received payload: " + board); //prints every alternate messages
Here is my binder
public interface ViewsStreams {
    String INPUT = "input";
    String OUTPUT_1 = "output_1";
    String OP_USERS = "output_2";

    SubscribableChannel job_board_views();
    MessageChannel outboundJobBoards();
    MessageChannel outboundUsers();
}
I am new to these technologies and unable to figure out what is going wrong here. Can someone please help?
Your guess is correct; you have two consumers on the OUTPUT_2 channel - the listener and the binding which sends out the message.
They each get alternate messages.
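The alternating delivery can be reproduced with a small simulation. Spring Integration's DirectChannel distributes messages round-robin among its subscribers by default, so with two subscribers each one sees every other message. The class `RoundRobinChannel` below is a hypothetical stand-in for such a channel, not the framework's implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical stand-in for a channel that load-balances round-robin across its subscribers.
final class RoundRobinChannel {
    private final List<Consumer<String>> subscribers = new ArrayList<>();
    private int next = 0;

    void subscribe(Consumer<String> subscriber) {
        subscribers.add(subscriber);
    }

    /** Delivers the message to exactly one subscriber, rotating on each send. */
    void send(String message) {
        if (subscribers.isEmpty()) {
            throw new IllegalStateException("no subscribers");
        }
        subscribers.get(next).accept(message);
        next = (next + 1) % subscribers.size();
    }

    public static void main(String[] args) {
        RoundRobinChannel channel = new RoundRobinChannel();
        channel.subscribe(m -> System.out.println("listener got " + m));
        channel.subscribe(m -> System.out.println("binding sent " + m + " to the topic"));
        for (int i = 1; i <= 4; i++) {
            channel.send("msg-" + i);
        }
    }
}
```

If every message should reach the listener, the usual remedy is to have it consume from a channel on which it is the only subscriber.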
I'm working on a Spring 5 project and have some rather specific requirements for JUnit. Spring 5 now supports parallel test execution with JUnit, and that works very well: I'm now running my hundreds of tests with method-level parallelism. However, I recently set up my automatic mailing system, which works like a charm, and that is where things become problematic: I run a class that sends all my mails in order to test them, so they are sent concurrently. When I tried sending several emails at a time instead of just one, I got a strange SSL handshake error, which I attribute to the fact that concurrent mail sending is not supported by most mail clients.
So my question is: how can I run all my test classes with parallel method execution, except for that batch email-sending class?
Maybe I should consider a mail queue to avoid this kind of problem in production? Does anyone have an idea?
By the way, in case you wonder, I'm currently using the Gmail client to send mail because I haven't configured our live mail sending yet; that will be done using a dedicated 1and1.fr SMTP email client.
Thanks for your patience!
For those interested in the solution, here is how I solved it:
I created a new singleton class to handle the queue:
public class EmailQueueHandler {
    /** private Constructor */
    private EmailQueueHandler() {}

    /** Holder */
    private static class EmailQueueHandlerHolder {
        /** unique instance non preinitialized */
        private final static EmailQueueHandler INSTANCE = new EmailQueueHandler();
    }

    /** access point for unique instantiation of the singleton **/
    public static EmailQueueHandler getInstance() {
        return EmailQueueHandlerHolder.INSTANCE;
    }

    private List<EmailPreparator> queue = new ArrayList<>();

    public void queue(EmailPreparator email) {
        queue.add(email);
    }

    // Returns the pending emails and replaces the queue with a fresh, empty list
    public List<EmailPreparator> getQueue() {
        List<EmailPreparator> preparators = queue;
        queue = new ArrayList<>();
        return preparators;
    }

    // This method is used to make this handler thread safe
    private synchronized void waitForQueueHandlerToBeAvailable(){}
}
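Swapping the list in getQueue is only safe if adding and draining cannot interleave, and an empty synchronized method does not by itself guarantee that. As an alternative sketch (not the code used in this answer), both operations can be synchronized on the same object. The class `DrainableQueue` is a hypothetical, generic stand-in for EmailQueueHandler.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, generic stand-in for EmailQueueHandler with add and drain guarded by one lock.
final class DrainableQueue<T> {
    private List<T> items = new ArrayList<>();

    /** Adds an item; safe to call from several threads at once. */
    synchronized void add(T item) {
        items.add(item);
    }

    /** Atomically returns everything queued so far and starts a fresh list. */
    synchronized List<T> drain() {
        List<T> drained = items;
        items = new ArrayList<>();
        return drained;
    }

    public static void main(String[] args) throws InterruptedException {
        DrainableQueue<Integer> queue = new DrainableQueue<>();
        Thread[] producers = new Thread[4];
        for (int t = 0; t < producers.length; t++) {
            producers[t] = new Thread(() -> {
                for (int i = 0; i < 1000; i++) {
                    queue.add(i);
                }
            });
            producers[t].start();
        }
        for (Thread producer : producers) {
            producer.join(); // wait until every producer has finished
        }
        System.out.println("drained " + queue.drain().size() + " items");
    }
}
```

Because both methods lock the same instance, an item is either part of the drained batch or of the next one, never lost between the two.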
I then created a scheduled task using the @Scheduled annotation in my Scheduler bean, in which I handle any mail sending failure.
@Scheduled(fixedRate = 30 * SECOND)
public void sendMailsInQueue()
List<EmailPreparator> queue = emailQueueHandler.getQueue();
int mailsSent = queue.size();
int mailsFailed = 0;
for(EmailPreparator preparator : queue)
try {
// And finally send the mail
// If mail sending is not activated, mail sending function will throw an exception,
// Therefore we have to catch it and only throw it back if the email was really supposed to be sent
catch(Exception e)
mailsSent --;
// If we are not in test Env
mailsFailed ++;
// This will log the error into the database and eventually
// print it to the console if in LOCAL env
new Error()
else if(SpringConfiguration.SEND_MAIL_ANYWAY_IN_TEST_ENV || preparator.isForceSend())
mailsFailed ++;
throw new EmailException(e);
log.info("CRON Task - " + mailsSent + " were successfully sent ");
if(mailsFailed > 0)
log.warn("CRON Task - But " + mailsFailed + " could not be sent");
I then call this queue-emptying method at the end of each unit test, in my @After-annotated method, to make sure it runs before I verify the resulting mail. This way I'm made aware of any mail sending failure, even if it occurs in the PROD env, and of any mail content creation failure when testing.
public void downUp() throws Exception
logger.debug("After Test");
RequestHolder requestHolder = securityContextBuilder.getSecurityContextHolder().getRequestHolder();
// We check mails sending if some were sent
for(ExtResults results : requestHolder.getExtResults())
ExtSenderVerificationResolver resolver =
new ExtSenderVerificationResolver(
// Some code
protected void proceedMailQueueManuallyIfNotAlreadySent()
mailQueueProceeded = true;