SQS Message always stays inflight

I have the following code that retrieves messages from an SQS queue. I am using an AmazonSQSBufferedAsyncClient to retrieve messages from the queue. A fixed-delay single-threaded executor wakes up every 5 minutes and calls receiveMessage. Long polling is enabled on the queue.
@Service
public class AmazonQueueService
        implements QueueService<String> {
    @Autowired
    private AmazonSQSBufferedAsyncClient sqsAsyncClient;
    @Value("${aws.sqs.queueUrl}")
    private String queueUrl;
    @Override
    public List<Message<String>> receiveMessage() {
        ReceiveMessageRequest receiveMessageRequest = new ReceiveMessageRequest(queueUrl);
        ReceiveMessageResult result = sqsAsyncClient.receiveMessage(receiveMessageRequest);
        LOG.debug("Size=" + result.getMessages().size());
        return Lists.transform(result.getMessages(), ......);
    }
    .....
}
The problem is that when I check the AWS console, the message is always in flight, but it is never received by the application (Size is always printed as 0). It looks like the AmazonSQSBufferedAsyncClient is reading the message from the queue but not returning it from the receiveMessage call.
Any ideas?

Finally figured this out. The problem shows up with the combination of the queue's visibility timeout (2 minutes) and the scheduled executor's delay (5 minutes).
Increasing the visibility timeout to 15 minutes solved the problem.
My theory: the AmazonSQSBufferedAsyncClient retrieves the message and keeps it in its buffer, waiting for a receiveMessage call. Because the executor delay is 5 minutes, the message's visibility timeout expires before receiveMessage is called, and the message is returned to the queue. It also looks like the message is then picked up from the queue again almost immediately, and for whatever reason a call to receiveMessage does not receive it. Increasing the timeout gives the receiveMessage call a chance to happen before the timeout expires, which I guess is why it solved the problem.
Any other possible explanation?
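The timing in this theory can be written down as a small sketch. It is a toy model only and does not talk to SQS: VisibilityWindow and stillOwnedAtPoll are hypothetical names, and it assumes the buffered client fetched the message at time zero and the application polls once, after the executor delay.

```java
import java.time.Duration;

/** Toy model of the timing theory above; hypothetical names, no AWS calls. */
final class VisibilityWindow {

    /**
     * Assumes the message was prefetched at time zero. Returns true if the
     * visibility timeout has not yet expired when the application polls,
     * i.e. the buffered copy is still the only visible owner of the message.
     */
    static boolean stillOwnedAtPoll(Duration visibilityTimeout, Duration pollDelay) {
        return pollDelay.compareTo(visibilityTimeout) < 0;
    }

    public static void main(String[] args) {
        Duration pollDelay = Duration.ofMinutes(5);
        // 2-minute timeout: expires before the 5-minute poll
        System.out.println(stillOwnedAtPoll(Duration.ofMinutes(2), pollDelay));  // false
        // 15-minute timeout: still valid at the 5-minute poll
        System.out.println(stillOwnedAtPoll(Duration.ofMinutes(15), pollDelay)); // true
    }
}
```

Under this model, any visibility timeout longer than the executor delay would behave like the 15-minute setting.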

For me, the problem was that my messages were being read by a Lambda function before I could even see them in the console. As a result, I was not able to poll for the messages from the SQS console.

You have to delete the message from the queue when you're done with it. If you don't, it's going to stay in-flight until it times out and then go right back to the queue. It is designed this way so you will never lose messages. If your program crashes before it finishes handling the message and deleting it, the message will go right back to the queue.
From the basic Java example (SampleDriver.java):
QMessage message = messages.get(0);
System.out.println("\nMessage received");
System.out.println(" message id: " + message.getId());
System.out.println(" receipt handle: " + message.getReceiptHandle());
System.out.println(" message content: " + message.getContent());
testQueue.deleteMessage(message.getReceiptHandle()); // <===== here

In my case, I was using short polling, and a receive call sometimes returned more than one message while I was processing only one. I was therefore losing track of the others. Since those messages were never processed and their visibility timeout had not yet been reached, they appeared among the in-flight messages.

Related

SQS ReceiveMessageRequest not receiving all the messages in the queue (messages are under 10)

I'm trying to retrieve messages from a queue. I understand that ReceiveMessageRequest has a limit of 10 messages, but when I tried it I was able to receive only 2 out of the 3 messages in the queue. I read many threads which said that adding setMaxNumberOfMessages(10) and increasing WaitTimeSeconds would fix it (before adding these I received only one message out of 3), but it didn't help.
FYI: I'm using a standard queue and all the messages were definitely there in the queue at the time of receive message request so it shouldn't have been a polling issue.
My implementation:
List<Message> messages;
ReceiveMessageRequest receiveMessageRequest = new ReceiveMessageRequest().withQueueUrl(queueUrl)
.withWaitTimeSeconds(10)
.withMaxNumberOfMessages(10);
messages = sqsConfig.getSQSClient().receiveMessage(receiveMessageRequest).getMessages();

I know it may not make a ton of sense, but in multiple programming languages (certainly Java and Python) I've had to loop to get everything from a queue. In Java it's something like this (using the V2 APIs):
// note that "done" needs to be set somewhere else to stop the loop
while (!done) {
    ReceiveMessageRequest receiveMessageRequest = ReceiveMessageRequest.builder()
            .queueUrl(queueUrl)
            .waitTimeSeconds(20)
            .build();
    List<Message> messages = sqsClient.receiveMessage(receiveMessageRequest).messages();
    for (Message nextMessage : messages) {
        // do something with the message
        DeleteMessageRequest deleteMessageRequest = DeleteMessageRequest
                .builder()
                .queueUrl(queueUrl)
                .receiptHandle(nextMessage.receiptHandle())
                .build();
        sqsClient.deleteMessage(deleteMessageRequest);
    }
}
From what I've read it is because there are ultimately many servers behind SQS and a single call doesn't hit all of them - it takes multiple calls. This is not a busy loop as the waitTimeSeconds does block if there is nothing to do.
Try something like this to read SQS. It's not quite as elegant but it does work.
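One possible way to set "done" in the loop above is to stop after a few consecutive empty responses. This is a minimal sketch using only the standard library, not SDK code: QueueDrainer and drain are hypothetical names, the receive supplier stands in for a call such as sqsClient.receiveMessage(...).messages(), and the handler is where processing and deleteMessage would go.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Supplier;

/** Hypothetical helper: polls until several receives in a row come back empty. */
final class QueueDrainer {

    /**
     * Calls receive repeatedly and passes every message to handler.
     * Stops after maxEmptyReceives consecutive empty batches and
     * returns the number of messages handled.
     */
    static <T> int drain(Supplier<List<T>> receive, Consumer<T> handler, int maxEmptyReceives) {
        int handled = 0;
        int emptyInARow = 0;
        while (emptyInARow < maxEmptyReceives) {
            List<T> batch = receive.get();
            if (batch.isEmpty()) {
                emptyInARow++;      // one more empty response in a row
                continue;
            }
            emptyInARow = 0;        // a non-empty batch resets the counter
            for (T message : batch) {
                handler.accept(message);
                handled++;
            }
        }
        return handled;
    }

    public static void main(String[] args) {
        // Fake receiver: a partial batch, an empty response, then the last message.
        Deque<List<String>> responses = new ArrayDeque<>(Arrays.asList(
                Arrays.asList("a", "b"), Collections.<String>emptyList(), Arrays.asList("c")));
        Supplier<List<String>> fakeReceive =
                () -> responses.isEmpty() ? Collections.<String>emptyList() : responses.poll();
        int handled = drain(fakeReceive, m -> System.out.println("handled " + m), 2);
        System.out.println(handled + " messages handled");
    }
}
```

Requiring more than one empty response before stopping is a design choice here: a single empty or partial response does not necessarily mean the queue is empty, as the question above found.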

Good way to check if Kafka Consumer doesn't have any records to return and is empty in java?

I'm using Apache KafkaConsumer. I want to check if the consumer has any messages to return without polling. If I poll the consumer and there aren't any messages, then I get the message "Attempt to heartbeat failed since the group is rebalancing" in an infinite loop until the timeout expires, even though I have a records.isEmpty() clause. This is a snippet of my code:
ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(10));
if (records.isEmpty()) {
    log.info("No More Records");
    consumer.close();
}
else {
    records.iterator().forEachRemaining(record -> log.info("RECORD: " + record));
}
This works fine until records are empty. Once it is empty, it logs "Attempt to heartbeat failed since the group is rebalancing" many times, logs "No More Records" once, and then continues to log the heartbeat error. What can I do to combat this and how can I elegantly check (without any heartbeat messages) that there are no more records to poll?
Edit: I asked another question and the full code and context is on this link: How to get messages from Kafka Consumer one by one in java?
Thanks in advance!

From a comment: "Since I have a UI and want to receive a message one by one by clicking the "receive" button, there might be a case when there are no more messages to be polled."
In that case you need to create a new KafkaConsumer every time someone clicks on the "receive" button and then close it afterwards.
If you want to use the same KafkaConsumer for the lifetime of your client, you need to let the broker know that it is still alive by calling the poll method regularly (older clients send heartbeats from inside poll; newer clients heartbeat from a background thread but still expect poll to be called within max.poll.interval.ms). Otherwise, as you have already experienced, the broker thinks your KafkaConsumer is dead and will initiate a rebalance. As there is no other active consumer available, this rebalancing will not stop.

How to know when consumer.poll() is called while using @KafkaListener annotation?

I understand that I cannot control when to poll if I use @KafkaListener, and I read from this answer that
the next poll() is performed after the last message from the previous poll has been processed by the listener.
So I'm wondering how to know when each poll() is executed? Or equivalently, how long does it take to process all messages received in each poll() call?
I am asking because my program got "Offset commit failed ... The request timed out" exceptions, and I would like to tune my consumer config, i.e. max.poll.interval.ms and max.poll.records, but I need to know the current performance first.
Here is part of my @KafkaListener method if it helps:
@KafkaListener(id = "dataListener", topics = "${spring.kafka.topic}", containerFactory = "kafkaListenerContainerFactory")
public void listen(@Payload(required = false) ConsumerRecord payload, @Header(KafkaHeaders.RECEIVED_PARTITION_ID) String partition,
        @Header(KafkaHeaders.OFFSET) Long offset, @Header(KafkaHeaders.RECEIVED_MESSAGE_KEY) String messageKey) {
    // processing messages
}

You can see polling activity by turning on DEBUG logging. The following is logged after each poll():
this.logger.debug(() -> "Received: " + records.count() + " records");

Akka application hang

We use akka to stress test one of our systems by sending JMS messages in parallel to our system entry points.
At a very high level, there is a Boss actor and a number of Worker actors that do the job.
When Boss actor is created in its constructor it also creates the Worker actors and puts them in a map:
workers = endPoints.stream().collect(Collectors.toMap(e -> e, e -> newWorker(e, ...)));
Then when the Boss receives the StartTesting message, it just iterates over its workers and sends a PerformWork message to each one. Upon receiving the PerformWork message, each worker goes to the database to get the messages to send out and starts sending them to its associated end point. Nothing out of the ordinary here:
logger.info("Number of worker actors: " + workers.values().size());
PerformWork performWork = new PerformWork();
workers.values().forEach(w -> {
logger.info("Sending PerformWork message to " + w);
w.tell(performWork, getSelf());
});
When running we can see in the logs the following:
... Number of worker actors: 43
... Sending a PerformWork message to Actor[akka://.../WORKER-1
... Sending a PerformWork message to Actor[akka://.../WORKER-2
.
... WORKER-1: received message of type PerformWork
... Sending a PerformWork message to Actor[akka://.../WORKER-30
.
The number of "Sending a PerformWork" log entries never equals the number of workers (43 in this case). It is usually between 20 and 30, but not always the same. The number of "WORKER-x: received message of type PerformWork" entries is usually smaller than the number of messages sent. The worker actors that actually received the PerformWork message do what they are supposed to do without any problem.
However, we never see the remaining Sending ... or received ... entries in the logs for the rest of the workers, and obviously the end points associated with them never receive any message.
So my questions would be:
What am I doing wrong? Maybe my approach is too naive. I am not very experienced with akka and have never used it beyond building testing tools.
What would cause the loop above that sends PerformWork to never complete?
What would cause the delivery of PerformWork above to fail?
All this processing is happening inside the same JVM. I am open to any other suggestion that will help me understand what is going on and address the issue.
The testing tool is written in Java. I added the Scala tag thinking that Scala developers would be more familiar with akka, given Actor framework is part of the language.
Thank you in advance.
UPDATE
I worked out what was causing the issue and was able to fix it; however, I still don't have an explanation for why it was happening, and it would be good to understand it.
So before the fix the Worker code looked like below:
@Override
public void onReceive(final Object message) throws InterruptedException {
    if (message instanceof PerformWork) {
        // here is the code using jdbcTemplate to get messages
        // from database and send them to the system end point
        getSender().tell(new WorkDone(), getSelf());
    } else {
        logger.info("Not prepared to handle message of type " + message.getClass());
        unhandled(message);
    }
}
And I changed it to start sending messages with a delay of two seconds:
@Override
public void onReceive(final Object message) throws InterruptedException {
    if (message instanceof PerformWork) {
        new Timer().schedule(
            new TimerTask() {
                @Override
                public void run() {
                    proceedWithSendingMessages();
                }
            },
            2000
        );
    } else {
        logger.info("Not prepared to handle message of type " + message.getClass());
        unhandled(message);
    }
}
private void proceedWithSendingMessages() {
    // here is the code using jdbcTemplate to get messages
    // from database and send them to the system end point
    getSender().tell(new WorkDone(), getSelf());
}
After the above change everything started working but the question I have is why the issue existed in the first place. It seems to me that what I experienced is against the asynchronous processing concept.
Again thank you in advance for your inputs.
UPDATE 2
In a normal run each worker runs for 24 hours. Remember it is a stress/load test, so no WorkDone message is sent back to the Boss in less than this time. My akka application hung less than one second after starting. As I said, all the workers that received the PerformWork message were running normally and kept sending messages for 24 hours.
When receiving a WorkDone message, the Boss verifies whether all Workers are done and, if so, shuts down akka. I am attaching the code here as it was requested, but I don't think it has anything to do with my problem.
} else if (message instanceof WorkDone) {
    // completed below is a set
    completed.add(((WorkDone) message).getSystem());
    if (completed.size() == workers.size()) {
        shutDownAkka();
    }
}
I realized that it was the Worker starting work immediately after receiving a PerformWork message that caused my issue: when I changed the database so that there was nothing to send, querying the empty database made everything work OK, although it finished straight away.
I suspected something odd was being caused by the Spring JdbcTemplate bean, which was a singleton and, being thread safe, might lock things when multiple workers try to use it at the same time. I changed it to prototype scope but the problem stayed the same. I even made it new JdbcTemplate(singletonDatasource), with no luck. On the other hand, I thought it could not be JdbcTemplate, as 99% of the Java world is using it.
At this point I added the delay in the worker code, as posted in my first UPDATE, thinking it was important that all workers receive the PerformWork notification. Once I did this everything started working with no problems. However, I would be interested to know what caused this.

Issue with readLine() method

I am building a chat system where I need to wait for the user's input (sender) and display the reply message (from the receiver) at the same time.
So I am using a while loop for receiving and sending the messages:
while((text = inFromUser.readLine()) != null) //Msg from Sender
{
while((data_from_server=inFromServer.readLine()) != null) //Msg from receiver
{
System.out.println("Displaying Output=" + data_from_server);
System.out.println(data_from_server);
}
System.out.println("Getting Input=" + text);
outToserver.writeBytes(text + "\n");
}
My problem is that the client may send input again and again, whereas the receiver may or may not send a reply. But with my logic, it always expects input from the receiver, and vice versa. Please suggest how to fix this problem.

You're going to need more than one thread. Think about it - you have to wait until the user enters some data, and when that happens, display it immediately. You also have to wait until the server gives you some data, and display that immediately.
You can't wait for both at once; if you did, nothing would be displayed until both the user and the server had entered a line. You can't wait for one, then the other; if you did, the client couldn't read what they wrote until the server sent a message, or vice versa.
You need to wait for both at the same time, but running side by side. You want to perform an action as soon as either of them returns something. This means you need to run a second thread. One thread waits for the user, and one thread waits for the server.
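A minimal sketch of that second thread, using only the standard library. LinePump and start are hypothetical names, not part of the question's code: the helper starts a thread that forwards every line from one reader to a callback. In the question's client it could wrap inFromServer, while the main thread keeps the inFromUser loop that writes to outToserver.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.function.Consumer;

/** Hypothetical helper: reads lines on its own thread so the caller never blocks on them. */
final class LinePump {

    /** Starts a daemon thread that passes each line from source to sink until end of stream. */
    static Thread start(String name, BufferedReader source, Consumer<String> sink) {
        Thread thread = new Thread(() -> {
            try {
                String line;
                while ((line = source.readLine()) != null) {
                    sink.accept(line);
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }, name);
        thread.setDaemon(true); // do not keep the JVM alive just for the reader
        thread.start();
        return thread;
    }

    public static void main(String[] args) throws InterruptedException {
        // A StringReader stands in for the socket's input stream in this demo.
        BufferedReader fromServer = new BufferedReader(new StringReader("hello\nbye\n"));
        Thread reader = start("server-reader", fromServer,
                line -> System.out.println("Server: " + line));
        reader.join(); // the demo waits; a real client would go on reading user input here
    }
}
```

With the server side handled this way, the nested inner while loop from the question is no longer needed, and the user loop can send each line as soon as it is typed.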
