I have a java program that puts down into the queue on the other side of the queue, i do have 10-15 consumers; any ONE of which should read the message and process it. If any of the 10-15 consumers get free they pick up the next message from the queue.
Basically, a Consumer can pickup the message from the queue whenever it is free, and only one consumer must pick it up. (without any synchronization blocks or so).
Also on the sender's end can i pause sending the messages into the queue if the queue sizes becomes full(or reaches a certain threshold)?
I am really new to the JMS API. Apologies if this is a newbie question .
Thanks!!
I have to Send messages into a queue and i have 20 threads running as
consumers, who can pick up the data from the queue(once they are
free). so when each thread gets free it goes to the queue checks if
the data is there it picks up and so on.. is this doable?
Yes, it's doable - that's the standard process of doing it with JMS queues. Another alternative would be topics, but with topics, every listener would have to process the same message, not just one, so queues are what you want. Although usually you don't have threads as consumers (I'm not even sure what that means), but message-driven beans. You might consider using them. MDBs run in their own thread anyway.
Related
imagine you have some task structure of
Task1
Task2: 1 million separate independent Subtask[i] that can run concurrently
Task3: must run once after ALL Task2 subtasks have completed
And all of Task1, Subtask[i] and Task3 are represented by MQ messages.
How can this be solved on an ActiveMQ? Especially the triggering of a Task3 message once all subtasks are complete.
I know, it's not a queueing problem, it's a fork-join problem. Lets say the environment dictates you must use an ActiveMQ for it.
Using ActiveMQ features, dynamic queues and consumers, stuff like that, is allowed. Using external counters, like a database row representing Task2's progress, is not allowed.
Hidden in this fork-join problem is a state management and observability challenge. Since the database is ruled out, you have to rely on something in-memory or on-queue.
Create a unique id for the task run -- something short, but with enough space to not collide like an airplane locator code-- ie. 34FDSX
Send all messages for the task to a queue://TASK.34FDSX.DATA
Send a control message to queue://TASK.34FDSX.CONTROL that contains the task id and expected total # of messages (including each messageId would be helpful too)
When consumers from queue://TASK.34FDSX.DATA complete their work, they should send a 'done' message to queue://TASK.34FDSX.DONE queue with their messageId or some identifier.
The consumers for the .CONTROL queue and the .DONE queue should be the same process and can track the expected and total completed tasks. Once everything is completed, he can fire the event to trigger Task #3.
This approach provides everything as 'online', and you can also timeout the .CONTROL and .DONE reader if too much time passes before the task completes.
Queue deletion can be done using ActiveMQ destination GC, or as a clean-up step in the .CONTROL/.DONE reader during the occurances when everything completes successfully.
Advantages:
No infinite blocking consumers
No infinite open transactions
State of the TASK is online and observable via the presence of queues and queue metrics-- queue size, enqueue count, dequeue count
The entire solution can be multi-threaded and the only requirement is that for a given task the .CONTROL/.DONE listener is the same consumer, but multiple tasks can have individual .CONTROL/.DONE listeners to scale.
The question here is a bit vague so my answer will have to be a bit vague as well.
Each of the million independent subtasks for "Task 2" can be represented by a single message. All these messages can be in the same queue. You can spin up as many consumers as you want and process all these messages (i.e. perform all the subtasks). Just ensure that these consumers either use client-acknowledge mode or a transacted session so that the message is not removed from the queue until they are done processing the message. Once there are no more messages in the queue then you know "Task 2" is done.
To detect when the queue is empty you can have a "special" consumer on the queue that periodically opens a transacted session and tries to consume a message from the queue. If the consumer receives a message then you can rollback the transacted session to put the message back on the queue and you know that the queue is not empty (i.e. "Task 2" is not done). If the consumer doesn't receive a message then you know the queue is empty and you can send another message indicating this. You could launch this special consumer as part of "Task 2" after all the messages for the subtasks have been sent to avoid detecting an empty queue prematurely.
To be clear, this is a simple solution. You could certainly add more complexity depending on your requirements, but your question just outlined the basic problem so it's unclear what other requirements you have (if any).
I'm building an application using RabbitMQ/Spring/Spring AMQP and am having trouble handling the way I've laid out my queues.
Essentially I have one queue that every consumer listens to, with each message basically saying "this queue is ready to be processed by a single consumer". The consumer will then listen to the queue indicated in the message, consume all the messages in that queue, and finally delete it when done.
These short lived queues are all created on the fly as data comes in to be processed and cannot be consumed by multiple consumers (whichever gets the message in the 'ready' queue).
I'm having trouble gracefully handling the consumers in this situation. Right now I just create a new DirectMessageListenerContainer each time a consumer gets a message from the 'ready' queue and then stop it once it has gotten all the messages it needs. It seems like this solution isn't ideal. Is there any better way to handle a situation like this with Spring AMQP/RabbitMQ?
You can add/remove queues to/from existing container(s) at runtime; it is more efficient with the direct container (see Choosing a container).
The MessageProperties has the consumerQueue property to tell you which queue the message came from.
Sometimes due to some external problems, I need to requeue a message by basic.reject with requeue = true.
But I don't need to consume it immediately because it will possibly fail again in a short time. If I continuously requeue it, this may result in infinite loop and requeue.
So I need to consume it later, say one minute later,
And I need to know how many times the messages has been requeue so that I can stop requeue it but only reject it to declare it fails to consume.
PS: I am using Java client.
There are multiple solutions to point 1.
First one is the one chosen by Celery (a Python producer/consumer library that can use RabbitMQ as broker). Inside your message, add a timestamp at which the task should be executed. When your consumer gets the message, do not ack it and check its timestamp. As soon as the timestamp is reached, the worker can execute the task. (Note that the worker can continue working on other tasks instead of waiting)
This technique has some drawbacks. You have to increase the QoS per channel to an arbitrary value. And if your worker is already working on a long running task, the delayed task wont be executed until the first task has finished.
A second technique is RabbitMQ-only and is much more elegant. It takes advantage of dead-letter exchanges and Messages TTL. You create a new queue which isn't consumed by anybody. This queue has a dead-letter exchange that will forward the messages to the consumer queue. When you want to defer a message, ack it (or reject it without requeue) from the consumer queue and copy the message into the dead-lettered queue with a TTL equal to the delay you want (say one minute later). At (roughly) the end of TTL, the defered message will magically land in the consumer queue again, ready to be consumed. RabbitMQ team has also made the Delayed Message Plugin (this plugin is marked as experimental yet fairly stable and potential suitable for production use as long as the user is aware of its limitations and has serious limitations in term of scalability and reliability in case of failover, so you might decide whether you really want to use it in production, or if you prefer to stick to the manual way, limited to one TTL per queue).
Point 2. just requires putting a counter in your message and handling this inside your app. You can choose to put this counter in a header or directly in the body.
I have a FIFO SQS queue, with visibility time of 30 seconds.
The requirement is to read messages as Quickly as possible and clear the queue.
I have code in JAVA in a fashion shown below ( this is just a representation of idea only, not complete code ):
//keep getting messages from FIFO and process them ASAP
while(true)
{
List<Message> messages =
sqsclient.receiveMessage(receiveMessageRequest).getMessages();
//my logic/code here to process these messages and delete them ASAP
}
In the while loop as soon as the messages are received, they are processed and removed from the queue.
But, many times the receiveMessageRequest does not give me messages (returns zero messages).
Also, the messages limitation is only 10 at a time during receive from SQS, which is already an issue, but due to these zero receives, the queues are piling up.
I have no clue why this is happening. The documentation exactly is not clear on this part (or Am I missing in terms of the configuration of the queue?)
Please help!
Note:
1. My FIFO Queue always has messages in this scenario, so there is no case of Queue having zero messages and receive request returning zero
2. The processing and delete times are also Less than the visibility timeout.
Thanks.
Update:
I have started running multiple consumers for processing the FIFO queue. Clearly, one consumer is not coping up with the inflow of messages. I shall update in few days how multiple consumers are performing. Thanks
You have to first make sure that all messages you received are deleted within VisibilityTimeout. If you are using DeleteMessageBatch for deletion make sure that all 10 messages are deleted.
Also, how did you queue messages when you enqueue them?
Order of messages are guaranteed only in a single message group.
This also means that if you set the same group id to all messages, you are limited to a single consumer so that order of messages are preserved for sure. Even if use multiple consumers, all messages that belong to a same group becomes invisible to other consumers until visibility timeout expires.
I'm currently adding JMS support to a application-server-like framework. The JMS will be implemented by HornetQ (stand-alone broker, hornetq jars on the servers classpath) but there is neither JBoss nor spring nor anything else that would provide MDBs.
The next step is to add a message listener to a xa queue that would allow for parallel processing of incoming messages. Some messages would init long running tasks, so the basic idea is to spawn worker threads from the onMessage method.
On my long journey through the internet I came across this discussion, where one of participants mentioned, that he would not do that but use an extra internal queue for the task: the (single threaded) message listener then would simply grab the messages from the inbound queue and create new messages for an internal queue, where at the other end of that internal queue some worker threads fight for the incoming messages. Inbound messages then would be acknowledge once they're "copied" to the internal queue (which is ok for me).
Unfortunatly they don't say why it would be better to not spawn worker threads from the onMessage method - maybe, because the listener would block if all threads from the pool are busy. So I'm looking for pros and cons for the designs decisions:
Start worker threads from the onMessage method of the message listener
Use an internal queue to "send messages to the worker threads"
Transaction limits aside, whether or not to have multiple threads (or processes) reading from a queue simply comes down to whether or not the message order is important. Obviously if the order is important, then a single thread naturally maintains that order, while multiple threads will provide no such guarentee.
What you will normally find, is that order is important but across a subset of all the messages. In this scenario, if a single thread isn't performant, you need to get those messages off the queue and re-queued in as short a possible time because to preserve the order you'll have to use a single thread reading from the initial queue - hence the use of one or more internal queues. The problem this incurs is that the transaction will be closed before the messages are fully processed and so you need some sort of temporary storage to ensure messages don't get dropped if the process were to fall over before the processing had taken place.
If, as your question suggests, you're not too worried about dropping messages then the java.util.concurrent.BlockingQueue sounds like what you need for the internal queues with a single thread servicing each.