Is Spring-AMQP re-queue message count JVM based? - java

I was poking around the RabbitMQ documentation, and it seems that RabbitMQ does not track a message redelivery count. If I were to manually ACK/NACK messages, I would need to either keep the retry count in memory (say, by using the correlationId as the unique key in a map), or set my own header on the message and redeliver it (thus putting it at the end of the queue).
However, this is a case that Spring handles. Specifically, I am referring to RetryInterceptorBuilder.stateful().maxAttempts(x). Is this count specific to a JVM, though, or is it manipulating the message somehow?
For example, I have a web-app deployed to 2 servers, with maxAttempts set to 5. Is it possible that the total redelivery count will be anywhere from 5-9, depending on the order in which it is redelivered and reprocessed among the 2 servers?

Rabbit/AMQP does not allow modification of the message when requeueing based on rejection.
The state (based on messageId) is maintained in a RetryContextCache; the default is a MapRetryContextCache. This is not really suitable for a "cluster" because, as you say, the attempts may be up to ((maxAttempts - 1) * n + 1); it will also cause a memory leak (state left behind on some servers). You can configure a SoftReferenceMapRetryContextCache in the RetryTemplate (the RetryOperations in the builder), but that only addresses the memory leak, not the per-server counting.
You would need to use a custom RetryContextCache with some persistent shared store (e.g. Redis).
I generally recommend using stateless recovery in this scenario: the retries are done entirely in the container and do not involve RabbitMQ at all (until retries are exhausted, at which point the message is discarded or routed to the DLX/DLQ, depending on broker configuration).
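For reference, a minimal sketch of the stateless approach with Spring AMQP (the attempt count, back-off values, and bean wiring are illustrative assumptions, not part of the answer):

import org.springframework.amqp.rabbit.config.RetryInterceptorBuilder;
import org.springframework.amqp.rabbit.retry.RejectAndDontRequeueRecoverer;
import org.springframework.context.annotation.Bean;
import org.springframework.retry.interceptor.RetryOperationsInterceptor;

@Bean
public RetryOperationsInterceptor retryInterceptor() {
    return RetryInterceptorBuilder.stateless()
            .maxAttempts(5)                                   // all attempts happen inside this JVM
            .backOffOptions(1000, 2.0, 10000)                 // initial interval, multiplier, max interval (ms)
            .recoverer(new RejectAndDontRequeueRecoverer())   // when exhausted: reject without requeue, so the broker can DLX it
            .build();
}

The interceptor is then added to the listener container's advice chain (for example with container.setAdviceChain(retryInterceptor())); because the retries never leave the container, the count cannot be split across servers.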
If you don't care about message order (and I presume you don't, given that you have competing consumers), an interesting technique is to reject the message, send it to a DLQ with an expiry set and, when the DLQ message expires, route it back to the tail of the original queue (rather than the head). In that case, the x-death header can be examined to determine how many times it has been retried.
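A minimal sketch of that topology with the RabbitMQ Java client; the queue names, the 30-second delay, and the attempt limit are illustrative, and an open Channel named channel plus a received Delivery named delivery are assumed:

// Work queue: rejected messages are dead-lettered to the retry queue.
Map<String, Object> workArgs = new HashMap<>();
workArgs.put("x-dead-letter-exchange", "");                 // default exchange
workArgs.put("x-dead-letter-routing-key", "work.retry");
channel.queueDeclare("work", true, false, false, workArgs);

// Retry queue: no consumers; expired messages are dead-lettered back to the tail of "work".
Map<String, Object> retryArgs = new HashMap<>();
retryArgs.put("x-message-ttl", 30000);                      // retry delay in ms
retryArgs.put("x-dead-letter-exchange", "");
retryArgs.put("x-dead-letter-routing-key", "work");
channel.queueDeclare("work.retry", true, false, false, retryArgs);

// In the consumer, the x-death header (a list of tables) carries the count of prior deaths.
Map<String, Object> headers = delivery.getProperties().getHeaders();
List<Map<String, Object>> xDeath = headers == null
        ? null : (List<Map<String, Object>>) headers.get("x-death");
long attempts = (xDeath == null || xDeath.isEmpty()) ? 0 : (Long) xDeath.get(0).get("count");
if (attempts >= 5) {
    channel.basicAck(delivery.getEnvelope().getDeliveryTag(), false);    // give up (or park it elsewhere)
} else {
    channel.basicReject(delivery.getEnvelope().getDeliveryTag(), false); // off to work.retry
}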
This answer and this one have more details.

Related

Is RabbitMQ or Kafka message queue a 1:1 messaging system?

As mentioned in the answer,
A message queue is a one-way pipe: one process writes to the queue, and another reads the data in the order it was written
A SysV message queue is one example
So, my understanding is,
one message queue is used by two processes, where one process (the producer) inserts an item into the queue and another process (the consumer) consumes the item from the queue
1) Is a RabbitMQ or Kafka message queue a 1:1 messaging system, used by only two processes, where one process writes and the other process reads?
2) After the consumer consumes the item, does the item get deleted? If not, why do we need a queue data structure? Why not just shared memory?
Kafka is not strictly a 1:1 messaging system. Multiple producers can write into a topic and multiple consumers can read from it. Moreover, in Kafka, multiple consumers can be assigned to the same or to different consumer groups. Every message is consumed by only one consumer from each consumer group (load balancing), and every consumer group receives a copy of every message (provided, of course, that it is subscribed to the corresponding topics and no messages are lost). A good description of this process can be found in this article: Scalability of Kafka Messaging using Consumer Groups.
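A minimal sketch of the consumer-group behaviour with the Kafka Java client (broker address, topic, and group id are illustrative): every consumer started with group.id "billing" shares the topic's partitions with the others in that group, while a consumer started with a different group.id gets its own copy of every message.

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class GroupConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "billing");   // consumers sharing this id load-balance the partitions
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("orders"));
            while (true) {
                for (ConsumerRecord<String, String> record : consumer.poll(Duration.ofMillis(500))) {
                    System.out.printf("partition=%d offset=%d value=%s%n",
                            record.partition(), record.offset(), record.value());
                }
            }
        }
    }
}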
In Kafka, all messages are persisted on disk and stored until compaction reaps them, or retention.ms passes, or the log size is exceeded. That's a very high-level view and there are a lot of nuances here. For example: messages are stored in segments, and every segment contains multiple messages. When the retention period passes for a message, it is not removed from the segment at that moment; instead, Kafka waits until all messages in that segment have expired and then deletes the whole segment at once. Also, retention can kick in before the log exceeds its maximum size, or vice versa: the log can exceed the size limit even before the retention period passes. And so on. Just read the docs and pay attention to the sections about the "log cleaner" and "retention".
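For illustration, a sketch of creating a topic whose retention settings drive the behaviour described above, using the Kafka AdminClient (topic name, partition count, and all values are illustrative assumptions):

import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;

public class CreateTopicWithRetention {
    public static void main(String[] args) throws Exception {
        Map<String, String> configs = new HashMap<>();
        configs.put("retention.ms", "604800000");      // keep messages ~7 days...
        configs.put("retention.bytes", "1073741824");  // ...or until the partition log exceeds ~1 GiB
        configs.put("segment.ms", "86400000");         // roll a new segment roughly daily; only fully expired segments are deleted
        configs.put("cleanup.policy", "delete");       // time/size-based deletion ("compact" enables log compaction instead)

        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        try (AdminClient admin = AdminClient.create(props)) {
            admin.createTopics(Collections.singletonList(
                    new NewTopic("events", 3, (short) 1).configs(configs))).all().get();
        }
    }
}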
After a Kafka consumer reads a message, that message is neither compacted nor expired, so it is not removed from the log and stays there. This also means that every message can be re-read by a consumer if needed (until it is deleted completely). That can be useful if some of your consumers went offline for some reason and were not able to process the messages as they came in, and it also allows interesting features like transaction replays. Persistence is one of Kafka's features.
Shared memory? Strictly speaking, shared memory only works between processes on the same host, and there is absolutely no way to have "shared memory" when your app runs on multiple hosts. However, there are in-memory brokers: Redis, for example, can be used as a message broker, and it is entirely in-memory. If such a broker restarts for some reason, though, you lose everything. Speaking of Redis: it has two persistence configurations specifically to handle restarts.
I am not sure about RabbitMQ, but by default it probably deletes messages after the consumer acknowledges them, so it is closer to a 1:1 mental model. However, RabbitMQ supports disk persistence as well.

RabbitMQ Batch Ack

I had a question on how rabbitmq works with batching acknowledgements. I understand that the Prefetch value is the max number of messages that will get queued before reaching its limit. However, I wasn't sure if the ack's manage themselves or if I have to manage this in code.
Which method is correct?
Send each basicAck with multiple set to true
or
wait until 10 acks are due, then send only the last one (with multiple set to true), and AMQP will automatically acknowledge all the previous ones in the queue.
TL;DR: multiple = true is faster in some cases, but it requires much more careful bookkeeping and batch-like processing.
The consumer gets messages that carry a monotonically growing id specific to that consumer, called the delivery tag. The id is a 64-bit number (it might actually be an unsigned 32-bit value, but since Java doesn't have unsigned types it is a long). The prefetch is the maximum number of unacked messages a consumer will have outstanding.
When you ack the highest delivery tag with multiple set to true, it acknowledges all the unacked messages with a lower delivery tag (a smaller number) that the consumer has outstanding. Obviously, if you have a high prefetch, this is faster than acking each message individually.
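For example (a minimal sketch, assuming an open Channel on which delivery tags 1 through 10 are currently unacked):

// Acknowledges delivery tag 10 and, because multiple = true, every lower unacked tag (1..9) as well.
channel.basicAck(10L, true);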
Now RabbitMQ knows the consumer received the messages (the unacked ones), but it doesn't know whether all of those messages have been correctly consumed. So the burden is on you, the developer, to make sure all the previous messages really have been processed. The consumer will deliver the messages in order (I believe the client uses a BlockingQueue internally), but depending on the library/client used, downstream the messages might not stay in order.
Thus this really only works well when you are batching the messages together in a single go (e.g. transaction or sending a group of messages off to some other system) or buffering reliably. Often this is done with a blocking queue and then periodically draining the queue to send a group of messages to a downstream system.
On the other hand, if you are streaming each message in real time, then you can't really do this (i.e. you have to use multiple = false).
There is also the case where one of the messages in the group is bad (e.g. one drained from the internal queue, not the Rabbit queue) and you want to nack just that bad one. If that is the case, you can't use multiple = true either.
Finally, if you wait for a certain number of messages (instead of, say, a certain amount of time) and that number is greater than the prefetch, you will wait indefinitely... not a good idea. You need to bound the wait by time, and the number of messages you wait for must be <= the prefetch.
As you can see, it's fairly nontrivial to use multiple = true correctly.
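Putting those constraints together, here is a rough sketch of the buffer-and-drain pattern described above, using the RabbitMQ Java client. The queue name, batch size, wait time, and the sendBatch call are illustrative assumptions, and the batch size is deliberately kept below the prefetch:

import com.rabbitmq.client.Channel;
import com.rabbitmq.client.Connection;
import com.rabbitmq.client.ConnectionFactory;
import com.rabbitmq.client.Delivery;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

public class BatchAckConsumer {

    private static final int PREFETCH = 100;
    private static final int BATCH_SIZE = 10;     // must stay <= PREFETCH or the drain below could wait forever
    private static final long MAX_WAIT_MS = 500;  // time bound so a slow trickle of messages still gets acked

    public static void main(String[] args) throws Exception {
        Connection connection = new ConnectionFactory().newConnection();
        Channel channel = connection.createChannel();
        channel.basicQos(PREFETCH);

        BlockingQueue<Delivery> buffer = new LinkedBlockingQueue<>();
        channel.basicConsume("work-queue", false,
                (consumerTag, delivery) -> buffer.offer(delivery),
                consumerTag -> { });

        while (true) {
            List<Delivery> batch = new ArrayList<>(BATCH_SIZE);
            long deadline = System.currentTimeMillis() + MAX_WAIT_MS;
            while (batch.size() < BATCH_SIZE) {                     // bounded by count...
                long remaining = deadline - System.currentTimeMillis();
                if (remaining <= 0) break;                          // ...and by time
                Delivery d = buffer.poll(remaining, TimeUnit.MILLISECONDS);
                if (d == null) break;
                batch.add(d);
            }
            if (batch.isEmpty()) continue;

            sendBatch(batch);  // hypothetical: transactional write / bulk send to a downstream system

            // Ack only the highest delivery tag; multiple = true covers the rest of the batch.
            long lastTag = batch.get(batch.size() - 1).getEnvelope().getDeliveryTag();
            channel.basicAck(lastTag, true);
        }
    }

    private static void sendBatch(List<Delivery> batch) {
        // placeholder for the downstream call; if it fails you would nack rather than ack
    }
}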
First, one correction regarding "Prefetch value is the max number of messages that will get queued before reaching its limit" - this is not what the prefetch value is; the prefetch value is the number of un-acked messages that the consumer "gets" from the queue. They are, in a sense, assigned to the consumer, but they remain in the queue until they are acknowledged. Quoting from here, when the prefetch is 1:
This tells RabbitMQ not to give more than one message to a worker at a time. Or, in other words, don't dispatch a new message to a worker until it has processed and acknowledged the previous one.
And for your question:
I wasn't sure if the ack's manage themselves or if I have to manage this in code.
You can set the auto-ack flag to true, and then you could say that the acks manage themselves.
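For example, with the plain Java client (queue name and handler are illustrative, and an open Channel is assumed), passing autoAck = true means the broker considers a message acknowledged as soon as it is delivered, so no basicAck calls are needed:

channel.basicConsume("work-queue", true,                        // autoAck = true
        (consumerTag, delivery) -> handle(delivery.getBody()),  // handle(...) is a hypothetical handler
        consumerTag -> { });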

How to disable RabbitMQ prefetch count with SimpleMessageListenerContainer

RabbitMQ offers the ability to optionally set a prefetch count.
Using spring-amqp's SimpleMessageListenerContainer, I've noticed the prefetch count is always set. I cannot set the prefetch count to 0, because SimpleMessageListenerContainer sets it to at least txSize which must be greater than zero (even when there are no transactions involved). So is there a way to disable the prefetch count, i.e. make it unlimited?
Here is the relevant code from spring-amqp:
SimpleMessageListenerContainer.createBlockingQueueConsumer() does this:
int actualPrefetchCount = prefetchCount > txSize ? prefetchCount : txSize;
and BlockingQueueConsumer.start() does this:
if (!acknowledgeMode.isAutoAck()) {
    // Set basicQos before calling basicConsume (otherwise if we are not acking the broker
    // will send blocks of 100 messages)
    try {
        channel.basicQos(prefetchCount);
    }
What is the reasoning behind always calling basicQos() in Spring's BlockingQueueConsumer? Isn't there any use case for disabling the prefetch count (except auto-ack, obviously)?
The RabbitMQ documentation discusses the overhead of setting the prefetch count with channel (global) scope. It is not explicitly mentioned whether setting it with consumer scope has any overhead compared to not setting it at all. If I'm not mistaken, Spring always sets it with consumer scope. Does it indeed have no overhead at all? Still, it seems strange not to have the option of not setting it.
Thanks
Since the current implementation hands off to an internal queue (due to the way earlier RabbitMQ clients worked), not setting the qos would cause an OOM condition if the consumer can't keep up.
In fact, with earlier versions of Spring AMQP this would happen with auto-ack, so the internal queue is bounded according to the prefetch size to stop the broker from sending messages under that condition.
In 2.0, we are planning a new container implementation that avoids this queue since the rabbit client no longer has the issues that required it. We can consider supporting qos=0 then.
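In the meantime the prefetch can only be tuned, not disabled; a minimal sketch (the connection factory, queue name, and the value 250 are illustrative):

SimpleMessageListenerContainer container = new SimpleMessageListenerContainer(connectionFactory);
container.setQueueNames("work-queue");
container.setAcknowledgeMode(AcknowledgeMode.AUTO);   // container acks after the listener returns
container.setPrefetchCount(250);                      // becomes the basicQos value; must currently be >= 1
container.setMessageListener((MessageListener) message ->
        System.out.println(new String(message.getBody())));
container.start();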

EJBException: Failed to acquire the pool semaphore

I'm occasionally getting the following EJB exception across several different message driven beans:
javax.ejb.EJBException: Failed to acquire the pool semaphore, strictTimeout=10000
This behavior closely corresponds to when a particular database is having issues and thereby increases the amount of time spent in the MDB's onMessage function. The messages are being delivered by an ActiveMQ broker (version 5.4.2). The prefetch on the MDBs is 2000 (20 Sessions x 100 Messages per session).
My question is a general one. What exactly is happening here? I know that a message which has been delivered to the server running the MDB will time out after 10 seconds if there is no instance in the bean pool to handle it, but how has that message been delivered to the server in the first place? My assumption up to this point was that the MDB requests messages from the broker (in the quantity of the prefetch) only when it no longer has any messages to process. Are they simply waiting in that server-side "bucket" for too long?
Has anyone else run into this? Suggestions for tuning prefetch/semaphore timeout?
EDIT: Forgot to mention I'm using JBoss AS 5.1.0
After doing some research I've found a satisfactory explanation for this EJBException.
MessageDrivenBeans have an instance pool. When a batch of JMS messages is delivered to an MDB (in the quantity of the prefetch), each message is assigned an instance from this pool and is delivered to that instance via the onMessage function.
A little about how the pool works: in JBoss 5.1.0, pooled beans such as MDBs and SessionBeans are configured by default through JBoss AOP, specifically a file in the deploy directory titled "ejb3-interceptors-aop.xml". This file creates interceptor bindings and default annotations for any class matching its domain. In the case of the Message Driven Bean domain, this includes, among other things, an org.jboss.ejb3.annotation.Pool annotation:
<annotation expr="class(*) AND !class(@org.jboss.ejb3.annotation.Pool)">
    @org.jboss.ejb3.annotation.Pool (value="StrictMaxPool", maxSize=15, timeout=10000)
</annotation>
The parameters of that annotation are described here.
Herein lies the rub. If the message prefetch exceeds the maxSize of this pool (which it usually will for high-throughput messaging applications), you will necessarily have messages waiting for an MDB instance. If the time from message delivery to calling onMessage exceeds the pool timeout for any message, an EJBException will be thrown. This may not be an issue for the first few iterations of the message distribution, but if you have a large prefetch and a long average onMessage time, the messages towards the end of the batch will begin to fail.
Some quick algebra reveals that this will occur, roughly speaking, when
timeout < (prefetch x onMessageTime) / maxSize
This assumes that messages are distributed instantaneously and that each onMessage call takes the same amount of time, but it should give you a rough estimate of whether you're way out of bounds.
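To make that concrete with illustrative numbers (not measurements from the question): with a prefetch of 2000, an average onMessage time of 2 seconds, and the default maxSize of 15, (prefetch x onMessageTime) / maxSize = (2000 x 2000 ms) / 15 ≈ 267,000 ms, which dwarfs the default 10,000 ms timeout, so messages towards the end of the batch would be expected to fail.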
The solution to this problem is more subjective. Simply increasing the timeout is a naive option, because it will mask the fact that messages are sitting on your application server instead of your queue. Given that onMessage time is somewhat fixed, decreasing the prefetch is most likely a good option as is increasing the pool size, if resources allow. In tuning this I decreased timeout in addition to decreasing prefetch substantially and increasing maxSize to keep messages on the queue for longer while maintaining my alert indicator for when onMessage times are higher than normal.
What jpredham says is correct. Also, please check whether strictMaximumSize is set to true, which could lead to https://issues.jboss.org/browse/JBAS-1599.

Message Driven Bean and message consumption order

I have an MDB that gets subscribed to a topic which sends messages whose content is eventually persisted to a DB.
I know MDBs are pooled, and therefore the container is able to handle more than one incoming message in parallel. In my case, the order in which those messages are consumed (and then persisted) is important. I don't want the MDB instance pool to consume and then persist messages in a different order than the one in which they were published to the JMS topic.
Can this be an issue? If so, is there a way of telling the container to follow strict incoming order when consuming messages?
Copied from there:
To ensure that receipt order matches the order in which the client sent the message, you must do the following:
Set max-beans-in-free-pool to 1 for the MDB. This ensures that the MDB is the sole consumer of the message.
If your MDBs are deployed on a cluster, deploy them to a single node in the cluster, [...].
To ensure message ordering in the event of transaction rollback and recovery, configure a custom connection factory with MessagesMaximum set to 1, and ensure that no redelivery delay is configured. For more information see [...].
You should be able to limit the size of the MDB pool to 1, thus ensuring that the messages are processed in the correct order.
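If you are on JBoss (as in the previous question), here is a minimal sketch of limiting the pool to a single instance via the @Pool annotation; on WebLogic the equivalent knob is max-beans-in-free-pool in weblogic-ejb-jar.xml. The destination and values are illustrative assumptions:

import javax.ejb.ActivationConfigProperty;
import javax.ejb.MessageDriven;
import javax.jms.Message;
import javax.jms.MessageListener;
import org.jboss.ejb3.annotation.Pool;

@MessageDriven(activationConfig = {
        @ActivationConfigProperty(propertyName = "destinationType", propertyValue = "javax.jms.Topic"),
        @ActivationConfigProperty(propertyName = "destination", propertyValue = "topic/orders")  // hypothetical topic
})
@Pool(value = "StrictMaxPool", maxSize = 1, timeout = 10000)  // one instance => messages are handled one at a time
public class OrderedMdb implements MessageListener {
    @Override
    public void onMessage(Message message) {
        // process and persist; ordering is preserved because there is no parallelism
    }
}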
Of course, if you still want some parallelism in there, then you have a couple of options, but that really depends on the data.
If certain messages have something in common and the order of processing only matters within that group of messages that share a common value, then you may need to have multiple topics, or use Queues and a threadpool.
If on the other hand certain parts of the logic associated with the arrival of a message can take place in parallel, while other bits cannot, then you will need to split the logic up into the parallel-ok and parallel-not-ok parts, and process those bits accordingly.
