Sometimes due to some external problems, I need to requeue a message by basic.reject with requeue = true.
But I don't need to consume it immediately because it will possibly fail again in a short time. If I continuously requeue it, this may result in infinite loop and requeue.
So I need to consume it later, say one minute later,
And I need to know how many times the messages has been requeue so that I can stop requeue it but only reject it to declare it fails to consume.
PS: I am using Java client.
There are multiple solutions to point 1.
First one is the one chosen by Celery (a Python producer/consumer library that can use RabbitMQ as broker). Inside your message, add a timestamp at which the task should be executed. When your consumer gets the message, do not ack it and check its timestamp. As soon as the timestamp is reached, the worker can execute the task. (Note that the worker can continue working on other tasks instead of waiting)
This technique has some drawbacks. You have to increase the QoS per channel to an arbitrary value. And if your worker is already working on a long running task, the delayed task wont be executed until the first task has finished.
A second technique is RabbitMQ-only and is much more elegant. It takes advantage of dead-letter exchanges and Messages TTL. You create a new queue which isn't consumed by anybody. This queue has a dead-letter exchange that will forward the messages to the consumer queue. When you want to defer a message, ack it (or reject it without requeue) from the consumer queue and copy the message into the dead-lettered queue with a TTL equal to the delay you want (say one minute later). At (roughly) the end of TTL, the defered message will magically land in the consumer queue again, ready to be consumed. RabbitMQ team has also made the Delayed Message Plugin (this plugin is marked as experimental yet fairly stable and potential suitable for production use as long as the user is aware of its limitations and has serious limitations in term of scalability and reliability in case of failover, so you might decide whether you really want to use it in production, or if you prefer to stick to the manual way, limited to one TTL per queue).
Point 2. just requires putting a counter in your message and handling this inside your app. You can choose to put this counter in a header or directly in the body.
Related
imagine you have some task structure of
Task1
Task2: 1 million separate independent Subtask[i] that can run concurrently
Task3: must run once after ALL Task2 subtasks have completed
And all of Task1, Subtask[i] and Task3 are represented by MQ messages.
How can this be solved on an ActiveMQ? Especially the triggering of a Task3 message once all subtasks are complete.
I know, it's not a queueing problem, it's a fork-join problem. Lets say the environment dictates you must use an ActiveMQ for it.
Using ActiveMQ features, dynamic queues and consumers, stuff like that, is allowed. Using external counters, like a database row representing Task2's progress, is not allowed.
Hidden in this fork-join problem is a state management and observability challenge. Since the database is ruled out, you have to rely on something in-memory or on-queue.
Create a unique id for the task run -- something short, but with enough space to not collide like an airplane locator code-- ie. 34FDSX
Send all messages for the task to a queue://TASK.34FDSX.DATA
Send a control message to queue://TASK.34FDSX.CONTROL that contains the task id and expected total # of messages (including each messageId would be helpful too)
When consumers from queue://TASK.34FDSX.DATA complete their work, they should send a 'done' message to queue://TASK.34FDSX.DONE queue with their messageId or some identifier.
The consumers for the .CONTROL queue and the .DONE queue should be the same process and can track the expected and total completed tasks. Once everything is completed, he can fire the event to trigger Task #3.
This approach provides everything as 'online', and you can also timeout the .CONTROL and .DONE reader if too much time passes before the task completes.
Queue deletion can be done using ActiveMQ destination GC, or as a clean-up step in the .CONTROL/.DONE reader during the occurances when everything completes successfully.
Advantages:
No infinite blocking consumers
No infinite open transactions
State of the TASK is online and observable via the presence of queues and queue metrics-- queue size, enqueue count, dequeue count
The entire solution can be multi-threaded and the only requirement is that for a given task the .CONTROL/.DONE listener is the same consumer, but multiple tasks can have individual .CONTROL/.DONE listeners to scale.
The question here is a bit vague so my answer will have to be a bit vague as well.
Each of the million independent subtasks for "Task 2" can be represented by a single message. All these messages can be in the same queue. You can spin up as many consumers as you want and process all these messages (i.e. perform all the subtasks). Just ensure that these consumers either use client-acknowledge mode or a transacted session so that the message is not removed from the queue until they are done processing the message. Once there are no more messages in the queue then you know "Task 2" is done.
To detect when the queue is empty you can have a "special" consumer on the queue that periodically opens a transacted session and tries to consume a message from the queue. If the consumer receives a message then you can rollback the transacted session to put the message back on the queue and you know that the queue is not empty (i.e. "Task 2" is not done). If the consumer doesn't receive a message then you know the queue is empty and you can send another message indicating this. You could launch this special consumer as part of "Task 2" after all the messages for the subtasks have been sent to avoid detecting an empty queue prematurely.
To be clear, this is a simple solution. You could certainly add more complexity depending on your requirements, but your question just outlined the basic problem so it's unclear what other requirements you have (if any).
I had a question on how rabbitmq works with batching acknowledgements. I understand that the Prefetch value is the max number of messages that will get queued before reaching its limit. However, I wasn't sure if the ack's manage themselves or if I have to manage this in code.
Which method is correct?
Send each basicAck with multiple set to true
or
wait until 10 acks were supposed to be sent out and send only the last one and AMQP will automatically send all previous in queue. (with multiple set to true)
TL;DR multiple = true is faster in some cases but requires a lot more careful book keeping and batch like requirements
The consumer gets messages that have a monotonic-ly growing id specific to that consumer. The id is a 64 bit number (it actually might be an unsigned 32 bit but since Java doesn't have that its a long) called the delivery tag. The prefetch is the most messages a consumer will receive that are unacked.
When you ack the highest delivery tag with multiple true it will acknowledge all the unacked messages with a lower delivery tag (smaller number) that the consumer has outstanding. Obviously if you have high prefetch this is faster than acking each message.
Now RabbitMQ knows the consumer received the messages (the unacked ones) but it doesn't know if all those messages have been correctly consumed. So it is on the burden of you the developer to make sure all the previous messages have been consumed. The consumer will deliver the messages in order (I believe internally the client uses a BlockingQueue) but depending on the library/client used downstream the messages might not be.
Thus this really only works well when you are batching the messages together in a single go (e.g. transaction or sending a group of messages off to some other system) or buffering reliably. Often this is done with a blocking queue and then periodically draining the queue to send a group of messages to a downstream system.
On the other hand if you are streaming each message in real time then you can't really do this (ie multiple = false).
There is also the case of one of the message being bad in the group (e.g. drained from internal queue... not rabbit queue) and you won't to nack that bad one. If that is the case you can't use multiple = true either.
Finally if you wait for a certain amount messages (instead of say time) more than the prefetch you will wait indefinitely.... not a good idea. You need to wait on time and number of messages must be <= prefetch.
As you can see its fairly nontrivial to correctly use multiple = true.
First one correction regarding Prefetch value is the max number of messages that will get queued before reaching its limit. - this is not what prefetch value is; prefetch value is the number of UN-ACKed messages that consumer "gets" from the queue. So they are kind of assigned to the consumer but remain in the queue until they are acknowledged. Quote from here, when prefetch is 1
This tells RabbitMQ not to give more than one message to a worker at a
time. Or, in other words, don't dispatch a new message to a worker
until it has processed and acknowledged the previous one.
And for your question:
I wasn't sure if the ack's manage themselves or if I have to manage
this in code.
You can set the auto ack flag to true and then you could say that the ack's manage themselves
I need a blocking queue that has a size of 1, and every time put is applied it removes the last value and adds the next one. The consumers would be a thread pool in which each thread needs to read the message as it gets put on the queue and decide what to do with it, but they shouldn't be able to take from the queue since all of them need to read from it.
I was considering just taking and putting every time the producer sends out a new message, but having only peek in the run method of the consumers will result in them constantly peeking, won't it? Ideally the message will disappear as soon as the peeking stops, but I don't want to use a timed poll as it's not guaranteed that every consumer will peek the message in time.
My other option at the moment is to iterate over the collection of consumers and call a public method on them with the message, but I really don't want to do that since the system relies on real time updates, and a large collection will take a while to iterate through completely if I'm going through each method call on the stack.
After some consideration, I think you're best off, with each consumer having its own queue and the producer putting its messages on all queues.
If there are few consumers, then putting the messages on those few queues will not take too long (except when the producer blocks because a consumer can't keep up).
If there are many consumers this situation will be highly preferable over a situation where many consumers are in contention with each other.
At the very least this would be a good measure to compare alternate solutions against.
I have a requirement to read all messages in my Amazon SQS queue in 1 read and then sort it based on created timestamp and do business logic on it.
To make sure all the SQS hosts are checked for messages, I enabled long polling. The way I did that was to set the default wait time for the queue as 10 seconds. (Any value more than 0 will enable long polling).
However when I tried to read the queue, it still did not give me all the messages and I had to do multiple reads to get all the messages. I even enabled long polling through code per receive request, still did not work. Below is the code I am using.
AmazonSQSClient sqsClient = new AmazonSQSClient(new ClasspathPropertiesFileCredentialsProvider());
sqsClient.setEndpoint("sqs.us-west-1.amazonaws.com");
String queueUrl = "https://sqs.us-west-1.amazonaws.com/12345/queueName";
ReceiveMessageRequest receiveRequest = new ReceiveMessageRequest().withQueueUrl(queueUrl).withMaxNumberOfMessages(10).withWaitTimeSeconds(20);
List<Message> messages = sqsClient.receiveMessage(receiveRequest).getMessages();
I have 3 messages in the queue and each time I run the code I get a different result, sometimes I get all 3 messages, sometimes just 1. The visibility timeout I set as 2 seconds, just to eliminate the messages becoming invisible as the reason for not seeing them in the read.
This is the expected behavior for short polling. Long polling is supposed to eliminate multiple polls. Is there anything I am doing wrong here?
Thanks
Long polling is supposed to eliminate multiple polls
No, long polling is supposed to eliminate a large number of empty polls and false empty responses when messsages are actually available. A long poll in SQS won't sit and wait for the maximum amount of wait time just looking for more things to return, or keep searching once it's found something. A long poll in SQS only waits long enough to find something:
“Long polling allows the Amazon SQS service to wait until a message is available in the queue before sending a response. So unless the connection times out, the response to the ReceiveMessage request will contain at least one of the available messages (if any) and up to the maximum number requested in the ReceiveMessage call.”
— http://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/sqs-long-polling.html (emphasis added)
So, the “something” that SQS finds and returns may be all of the messages (up to your max), or a subset of the messages, because, as has been mentioned, SQS is a distributed system. There was likely an architectural decision to be made between "return as quickly as possible once we've found something" and "search the entire system for everything possible up to the maximum number of message the client will accept" ... and, given those alternatives, it seems reasonable that most applications would prefer the faster response of "give me whatever you can, as quickly as you can."
You don't know that you've actually drained a queue until you get back an empty response from a long poll.
As pointed out by Michael - sqlbot, SQS does not guarantee returning all (or the requested number of) messages even in case of Long Polling. Long Polling just ensures that you do not get false empty responses - i.e. your read requests do not return any messages even when there are messages in the queue.
I had done some experiments around this and found that the number of messages returned in the response approaches the number of the messages requested as you increase the number of messages in the queue. Typically, with 1000+ messages in the queue, in my experiments, I could see that it returned 10 messages (which is by the way the max that can be returned for a read request) everytime. In fact this behavior was observed for Short Polling as well. Even with 100+ messages, the number of messages returned was not 10 all the time, although a good percentage of those requests returned 10 messages back. Obviously, this is not guaranteed, but that is what you would typically see.
I had documented the findings from my experiments in one of my blogs - posting a link to the same below in case you would like to see more details of the experiment.
http://pragmaticnotes.com/2017/11/20/amazon-sqs-long-polling-versus-short-polling/
Because SQS is, on the back-end, a distributed system, there is no guarantee that any particular request will be able to return the maximum number of messages that are being polled for.
You just have to keep calling, till you are confident enough that you have as many items as you would expect, or that the queue has been emptied.
Set the execution time out to a value greater than 0. I have set execution timeout to 2 seconds and it is now returning all 9 messages available in the queue.
I want to somehow delay messages for the whole message group.
The thing is that all messages belonging to each message group must be processed in the same order they were posted, sequentially. If one of the messages cannot be consumed - we want to delay it and also delay the remaining ones in the same message group. I do not want to block the consumer - it should be free to process messages from other groups.
How to do that?
I can't say JMS has anything nice built in support for this stuff. Everything is easier with single "stand alone" messages, but there is one thing you could try.
Do a delayed delivery for those messages (in that group).
// Send to same queue once again, but delay 60 sec
if( isGroupMarkedForRedelivery(message.getStringProperty("JMSXGroupID"))){
message.setLongProperty("_HQ_SCHED_DELIVERY", System.currentTimeMillis() + 60000);
producer.send(message); // producer sends to process queue (again).
}
Note that if you need them in the same order, then you should probably not use concurrency in sending and/or receiving. You could of course add more logic to adapt to your situation.
You probably need to make sure isGroupMarkedForRedelivery returns false for a specific group after less amount of time than the "delay".