Request-Reply through a Queue with Hazelcast - java

I wonder if I can do request-reply with this:
1 hazelcast instance/member (central point)
1 application with a hazelcast-client sending requests through a queue
1 application with a hazelcast-client waiting for requests on that queue
The first application also receives the response on another queue, posted by the second application.
Is it a good way to proceed? Or do you think of a better solution?
Thanks!

The last couple of days I also worked on an "SOA-like" solution using Hazelcast queues to communicate between different processes on different machines.
My main goals were to have
"one to one-of-many" communication with a guaranteed reply from the chosen one-of-the-many
"one to one" communication one way
"one to one" communication with an answer within a certain time
To make a long story short, I dropped this approach today for the following reasons:
lots of complicated code with executor services, callables, runnables, InterruptedExceptions, shutdown handling, Hazelcast transactions, etc.
dangling messages in the "one to one" communication when the receiver has a shorter lifetime than the sender
losing messages if I kill certain cluster member(s) at just the right time
all cluster members must be able to deserialize the message, because it could be stored anywhere; therefore the messages can't be "specific" to certain clients and services.
I switched over to a much simpler approach:
all "services" register themselves in a MultiMap ("service registry") using the Hazelcast cluster member UUID as the key. Each entry contains some meta information like service identifier, load factor, start time, host, pid, etc.
clients pick the UUID of one of the entries in that MultiMap and use a DistributedTask (distributed executor service) targeted at that specific cluster member to invoke the service and optionally get a reply (in time)
only the service client and the service itself must have the specific DistributedTask implementation in their classpath; all other cluster members are not bothered
clients can easily detect dead entries in the service registry themselves: if they can't see a cluster member with the specific UUID (hazelcastInstance.getCluster().getMembers()), the service probably died unexpectedly. Clients can then pick "alive" entries, prefer entries with a lower load factor, do retries in case of idempotent services, etc.
The programming gets very easy and powerful using the second approach (e.g. timeouts or cancellation of tasks), with much less code to maintain; a minimal sketch follows.
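A minimal sketch of this registry-plus-executor approach, assuming Hazelcast 3.x APIs (MultiMap, IExecutorService); the EchoTask and the metadata format are hypothetical, not from the original system:

```java
import com.hazelcast.core.*;
import java.io.Serializable;
import java.util.concurrent.*;

public class ServiceRegistryExample {
    // Hypothetical task; only the client and the targeted member need it on their classpath.
    static class EchoTask implements Callable<String>, Serializable {
        private final String input;
        EchoTask(String input) { this.input = input; }
        @Override public String call() { return "echo: " + input; }
    }

    public static void main(String[] args) throws Exception {
        HazelcastInstance hz = Hazelcast.newHazelcastInstance();

        // Service side: register this member under its UUID with some metadata.
        String myUuid = hz.getCluster().getLocalMember().getUuid();
        MultiMap<String, String> registry = hz.getMultiMap("service-registry");
        registry.put(myUuid, "serviceId=echo;load=0.1;host=...");

        // Client side: pick a registry entry whose member is still alive.
        for (Member member : hz.getCluster().getMembers()) {
            if (registry.containsKey(member.getUuid())) {
                IExecutorService executor = hz.getExecutorService("service-executor");
                Future<String> reply = executor.submitToMember(new EchoTask("hi"), member);
                System.out.println(reply.get(5, TimeUnit.SECONDS)); // reply "in time"
                break;
            }
        }
    }
}
```

Because only the client and the targeted member need EchoTask on their classpath, the other cluster members are never asked to deserialize it.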
Hope this helps!

In the past we built an SOA system that uses Hazelcast queues as a bus. Here are some of the highlights.
a. Each service has an incoming queue; the service name is simply the name of the queue. You can have as many service providers as you wish, and you can scale up and down. All these service providers need to do is poll this queue and process the arriving requests.
b. Since the system is fully asynchronous, there is a call id on both the request and the response to correlate them.
c. Each client sends a request to the queue of the service that it wants to call. The request carries all the parameters for the service, the name of the queue to send the response to, and a call id. The queue name can simply be the address of the client; this way each client has its own unique queue.
d. Upon receiving a request, a service provider processes it and sends the response to the answer queue.
e. Each client also continuously polls its input queue to receive the answers for the requests that it sent.
The major drawback of this design is that queues are not as scalable as maps, so the system as a whole is not very scalable. However, it could still process about 5K requests per second. A sketch of the correlation flow follows.
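A hedged sketch of that correlation pattern over Hazelcast queues; the queue name "invoice-service" and the Request/Response classes are illustrative, not from the original system:

```java
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import java.io.Serializable;
import java.util.UUID;
import java.util.concurrent.TimeUnit;

public class QueueBusExample {
    static class Request implements Serializable {
        final String callId, replyQueue, payload;
        Request(String callId, String replyQueue, String payload) {
            this.callId = callId; this.replyQueue = replyQueue; this.payload = payload;
        }
    }
    static class Response implements Serializable {
        final String callId, payload;
        Response(String callId, String payload) { this.callId = callId; this.payload = payload; }
    }

    public static void main(String[] args) throws InterruptedException {
        HazelcastInstance hz = Hazelcast.newHazelcastInstance();

        // Client: enqueue the request on the service's queue, then poll its own reply queue.
        String clientQueue = "client-" + UUID.randomUUID();
        String callId = UUID.randomUUID().toString();
        hz.<Request>getQueue("invoice-service").offer(new Request(callId, clientQueue, "get invoice 42"));

        // Service provider (normally a separate process): poll, process, reply.
        Request req = hz.<Request>getQueue("invoice-service").poll(5, TimeUnit.SECONDS);
        hz.<Response>getQueue(req.replyQueue).offer(new Response(req.callId, "invoice 42 data"));

        // Client: match the response to the request via the call id.
        Response resp = hz.<Response>getQueue(clientQueue).poll(5, TimeUnit.SECONDS);
        System.out.println(resp.callId.equals(callId) + " -> " + resp.payload);
    }
}
```

In a real system the client and the service provider are separate processes; they are collapsed into one main method here only to keep the sketch self-contained.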

I made a test for myself and validated that it works well, with certain limitations.
The architecture is Producer-Hazelcast_node-Consumer(s)
Using two Hazelcast queues, one for requests and one for responses, I could measure a round trip of under 1 ms.
Load balancing works fine if I put several consumers on the Request queue.
If I add another node and connect the clients to each node, then the round trip is above 15 ms. This is due to replication between the two Hazelcast nodes. If I kill a node, the clients continue to work, so failover works, at the cost of latency.

Can't you use the correlation id to perform request-reply on a single queue in Hazelcast? That's the id that should uniquely identify a conversation between two providers/consumers of a queue.

What is the purpose of this setup, @unludo? I am just curious.

Related

Request Aggregator / Middle-tier design pattern for costly requests

I'm working on a program that will have multiple threads requiring information from a web-service that can handle requests such as:
"Give me [Var1, Var2, Var3] for [Object1, Object2, ... Object20]"
and the resulting reply will give me, in this case, a 20-node XML document (one node per object), each node with 3 sub-nodes (one per var).
My challenge is that each request made of this web-service costs the organization money and, whether it be for 1 var for 1 object or 20 vars for 20 objects, the cost is the same.
So, that being the case, I'm looking for an architecture that will:
Create a request on each thread as data is required
Have a middle-tier "aggregator" that gets all the requests
Once X number of requests have been aggregated (or a time limit has been reached), the middle tier performs a single request of the web-service
Middle-tier receives reply from web-service
Middle-tier routes information back to waiting objects
Currently, my thoughts are to use a library such as NetMQ with my middle-tier as a server and each thread as a poller, but I'm getting stuck on the actual implementation and, before going too far down the rabbit-hole, am hoping there's already a design pattern / library out there that does this substantially more efficiently than I'm conceiving of.
Please understand that I'm a noob, and, so, ANY help / guidance would be really greatly appreciated!!
Thanks!!!
Overview
From the architectural point of view, you just sketched out a good approach for the problem (a code sketch follows the steps below):
Insert a proxy between the requesting applications and the remote web service
In the proxy, put the requests in the request queue, until at least one of the following events occurs
The request queue reaches a given length
The oldest request in the request queue reaches a certain age
Group all requests in the request queue into one single request, removing duplicate objects or attributes
Send this request to the remote web service
Move the requests into the (waiting for) response queue
Wait for the response until one of the following occurs
the oldest request in the response queue reaches a certain age (time out)
a response arrives
Get the response (if applicable) and map it to the corresponding requests in the response queue
Answer all requests in the response queue that have an answer
Send a timeout error for all requests older than the timeout limit
Remove all answered requests from the response queue
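A minimal, illustrative sketch of such a proxy in plain Java (Java 16+ for the record type; all names and the batching limits are hypothetical assumptions): requesting threads hand in a future, a single dispatcher drains a batch by size or age, makes one remote call, and fans the answers back out.

```java
import java.util.*;
import java.util.concurrent.*;

public class AggregatingProxy {
    static final int MAX_BATCH = 20;
    static final long MAX_AGE_MS = 200, TIMEOUT_MS = 5000;

    record Pending(String objectId, CompletableFuture<String> reply) {}

    private final BlockingQueue<Pending> requestQueue = new LinkedBlockingQueue<>();

    // Called by the requesting threads; returns a future that completes with the answer.
    public CompletableFuture<String> request(String objectId) {
        Pending p = new Pending(objectId, new CompletableFuture<>());
        p.reply().orTimeout(TIMEOUT_MS, TimeUnit.MILLISECONDS); // time out waiting requests
        requestQueue.add(p);
        return p.reply();
    }

    // Single dispatcher thread: drain a batch, call the remote service once, fan out answers.
    public void runDispatcher() throws InterruptedException {
        while (true) {
            List<Pending> batch = new ArrayList<>();
            batch.add(requestQueue.take());                 // wait for the oldest request
            long deadline = System.currentTimeMillis() + MAX_AGE_MS;
            while (batch.size() < MAX_BATCH && System.currentTimeMillis() < deadline) {
                Pending next = requestQueue.poll(deadline - System.currentTimeMillis(),
                        TimeUnit.MILLISECONDS);
                if (next == null) break;                    // batch aged out
                batch.add(next);
            }
            Map<String, String> answers = callRemoteService(batch); // one request per batch
            for (Pending p : batch) p.reply().complete(answers.get(p.objectId()));
        }
    }

    private Map<String, String> callRemoteService(List<Pending> batch) {
        // Placeholder for the single (costly) web-service call; deduplication would go here.
        Map<String, String> result = new HashMap<>();
        for (Pending p : batch) result.put(p.objectId(), "vars for " + p.objectId());
        return result;
    }
}
```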
Technology
You probably won't find an off-the-shelf product or a framework that exactly matches your requirements. But there are several frameworks / architectural patterns that you can use to build a solution.
C#: RX and LINQ
If you want to use C#, you could use Reactive Extensions to get the timing and the grouping right.
You could then use LINQ to select the attributes from the requests to build the response and to select the requests in the response queue that either match to a certain part of a response or that timed out.
Scala/Java: Akka
You could model the solution as an actor system, using several actors:
An actor as the gateway for the requests
An actor holding the request queue
An actor sending the request to the remote web service and getting the response back
An actor holding the response queue
An actor sending out the responses or the timeouts
An actor system makes it easy to deal with concurrency and to separate the concerns in a testable way.
When using Scala, you could use its "monadic" collection API (filter, map, flatMap) to do basically the same as with LINQ in the C# approach.
The actor approach really shines when you want to test the individual elements. It is very easy to test each actor individually, without having to mock the whole workflow.
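For illustration, a minimal sketch of the gateway actor in Akka's classic Java API (keeping to this document's language); the Request message type and the actor wiring are assumptions, not from the question:

```java
import akka.actor.AbstractActor;
import akka.actor.ActorRef;
import akka.actor.Props;

// Gateway actor: accepts incoming requests and forwards them to the request-queue actor.
public class RequestGateway extends AbstractActor {
    public static final class Request {
        public final String objectId;
        public Request(String objectId) { this.objectId = objectId; }
    }

    private final ActorRef requestQueue; // the actor holding the request queue

    private RequestGateway(ActorRef requestQueue) { this.requestQueue = requestQueue; }

    public static Props props(ActorRef requestQueue) {
        return Props.create(RequestGateway.class, () -> new RequestGateway(requestQueue));
    }

    @Override
    public Receive createReceive() {
        return receiveBuilder()
                // forward keeps the original sender, so the response actor can reply later
                .match(Request.class, r -> requestQueue.forward(r, getContext()))
                .build();
    }
}
```

Each of the other actors (request queue, remote caller, response queue, responder) would be equally small, which is what makes them easy to test in isolation.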
Erlang/Elixir: Actor System
This is similar to the Akka approach, just with a different (functional!) language. Erlang / Elixir has a lot of support for distributed actor systems, so when you need an ultra stable or scalable solution, you should look into this one.
NetMQ / ZeroMQ
This is probably too low-level and provides too little infrastructure on its own. If you use an actor system, you could try to bring in NetMQ / ZeroMQ as the transport layer.
Your idea of using a queue looks good to me.
This is one possible solution to your problem and I'm sure there are countless other solutions that can do what you need.
Have a "publish queue" (PQ) and a "consume queue" (CQ)
Clients subscribe to CQ and MT subscribes to PQ
Clients publish the requests to PQ
MT listens to PQ, aggregates requests and dispatches them to the farm in a thread
Once the results are back, this thread separates the results into req/res pairs
It then publishes the req/res pairs to the CQ
Each client picks the correct message and processes it
Long(er) version:
Have your "middle tier" to listen to a queue (to which, the clients publish messages) and aggregate the requests until N number of requests have come through or X amount of time has passed.
One you are ready, offload the aggregated request to a thread to call your farm and get the results. A bigger problem will most likely arise when you need to communicate this back to the clients.
For that, you probably need another queue that all your clients subscribe to and once your result batch is ready (say 20 responses in XML) from the farm, the thread that called the farm will separate the XML results into their corresponding request/response pair and publish to this queue. Each client will need to pick up the correct request/response pair from the queue and process it.
This will not be a webservice in the traditional sense since the wait times can be prohibitively long and you don't want to maintain a connection which is why I suggest the queue.
You can also have your consumer queue to be topic based, meaning you only publish the req/res pairs to the consumer that asked for it and don't broadcast it (so the client doesn't have to "pick the correct req/res". It will be taken care of based on the topic name). Almost all queues support this.

Concurrent Synchronous Request-Reply with JMS/ActiveMQ - Patterns/Libraries?

I have a web-app where when the user submits a request, we send a JMS message to a remote service and then wait for the reply. (There are also async requests, and we have various niceties set up for message replay, etc, so we'd prefer to stick with JMS instead of, say, HTTP)
In How should I implement request response with JMS?, ActiveMQ seems to discourage the idea of either temporary queues per request or temporary consumers with selectors on the JMSCorrelationID, due to the overhead involved in spinning them up.
However, if I use pooled consumers for the replies, how do I dispatch from the reply consumer back to the original requesting thread?
I could certainly write my own thread-safe callback-registration/dispatch, but I hate writing code I suspect has already been written by someone who knows better than I do.
That ActiveMQ page recommends Lingo, which hasn't been updated since 2006, and Camel Spring Remoting, which has been hellbanned by my team for its many gotcha bugs.
Is there a better solution, in the form of a library implementing this pattern, or in the form of a different pattern for simulating synchronous request-reply over JMS?
Related SO question:
Is it a good practice to use JMS Temporary Queue for synchronous use?, which suggests that spinning up a consumer with a selector on the JMSCorrelationID is actually low-overhead, which contradicts what the ActiveMQ documentation says. Who's right?
In a past project we had a similar situation, where a sync WS request was handled with a pair of async req/res JMS messages. We were using the JBoss JMS implementation at the time, and temporary destinations were a big overhead.
We ended up writing a thread-safe dispatcher, leaving the WS waiting until the JMS response came in. We used the CorrelationID to map the response back to the request.
That solution was all home-grown, but I've since come across a nice blocking map implementation that solves the problem of matching a response to a request.
BlockingMap
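As an illustration of the same idea (not the linked BlockingMap itself), a correlation dispatcher can be sketched with a ConcurrentHashMap of futures; all names here are hypothetical:

```java
import java.util.concurrent.*;

// Minimal sketch: map a JMSCorrelationID to a future the requesting thread blocks on.
public class CorrelationDispatcher<V> {
    private final ConcurrentMap<String, CompletableFuture<V>> pending = new ConcurrentHashMap<>();

    // Requesting thread: register before sending, then block until the reply arrives.
    public V sendAndWait(String correlationId, Runnable send, long timeoutMs)
            throws InterruptedException, ExecutionException, TimeoutException {
        CompletableFuture<V> future = new CompletableFuture<>();
        pending.put(correlationId, future);
        try {
            send.run(); // e.g. producer.send(requestMessage)
            return future.get(timeoutMs, TimeUnit.MILLISECONDS);
        } finally {
            pending.remove(correlationId); // clean up on success, timeout, or failure
        }
    }

    // Reply-consumer thread: called from the shared MessageListener with the reply's id.
    public void dispatch(String correlationId, V reply) {
        CompletableFuture<V> future = pending.remove(correlationId);
        if (future != null) future.complete(reply);
        // else: late or unknown reply; log and drop
    }
}
```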
If your solution is clustered, you need to take care that response messages are dispatched to the right node in the cluster. I don't know ActiveMQ, but I remember JBoss Messaging having some glitches under the hood in its clusterable destinations.
I would still think about using Camel and letting it handle the threading, perhaps without spring-remoting but just with raw ProducerTemplates.
Camel has some nice documentation about the topic and works very well with ActiveMQ.
http://camel.apache.org/jms#JMS-RequestreplyoverJMS
For your question about spinning up a selector-based consumer and the overhead: what the ActiveMQ docs actually state is that it requires a round trip to the ActiveMQ broker, which might be on the other side of the globe or on a high-latency network. The overhead in this case is the TCP/IP round-trip time to the AMQ broker. I would consider this as an option; I have used it multiple times with success.
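For reference, request/reply over JMS with a raw Camel ProducerTemplate can look roughly like this (endpoint URI and queue name are illustrative; this assumes the camel-jms/activemq component is on the classpath so the "activemq" URI scheme resolves):

```java
import org.apache.camel.CamelContext;
import org.apache.camel.ProducerTemplate;
import org.apache.camel.impl.DefaultCamelContext;

public class CamelRequestReply {
    public static void main(String[] args) throws Exception {
        CamelContext context = new DefaultCamelContext();
        context.start();

        ProducerTemplate template = context.createProducerTemplate();
        // requestBody(...) is an InOut exchange: Camel sets JMSReplyTo under the hood
        // and correlates the reply for you, with the threading handled internally.
        String reply = template.requestBody(
                "activemq:queue:service.in?requestTimeout=5000", "my request", String.class);
        System.out.println(reply);

        context.stop();
    }
}
```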
A colleague suggested a potential solution: one response queue/consumer per webapp thread, where we set the return address to the response queue owned by that particular thread. Since these threads are typically long-lived (and are re-used for subsequent web requests), we only have to suffer the overhead when the thread is spawned by the pool.
That said, this whole exercise is making me rethink JMS vs HTTP... :)
I have always used the CorrelationID for request/response and never suffered any performance issues. I can't imagine why that would be a performance issue at all; it should be super fast for any messaging system to implement, and it is quite an important feature to implement well.
http://www.eaipatterns.com/RequestReplyJmsExample.html shows the two mainstream solutions, using a reply-to queue or a CorrelationID.
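A hedged sketch of the CorrelationID variant in plain JMS (javax.jms; connection setup omitted for brevity, and the queues are passed in as assumptions):

```java
import javax.jms.*;
import java.util.UUID;

public class JmsRequestReply {
    // Performs one synchronous request/reply using a selector on JMSCorrelationID.
    public static Message requestReply(Session session, Queue requestQueue, Queue replyQueue,
                                       String body, long timeoutMs) throws JMSException {
        String correlationId = UUID.randomUUID().toString();

        TextMessage request = session.createTextMessage(body);
        request.setJMSCorrelationID(correlationId);
        request.setJMSReplyTo(replyQueue);

        MessageProducer producer = session.createProducer(requestQueue);
        producer.send(request);
        producer.close();

        // Selector-based consumer: only receives the reply matching our correlation id.
        String selector = "JMSCorrelationID = '" + correlationId + "'";
        MessageConsumer consumer = session.createConsumer(replyQueue, selector);
        try {
            return consumer.receive(timeoutMs); // null on timeout
        } finally {
            consumer.close();
        }
    }
}
```

Creating and closing the consumer per request is exactly the broker round trip the ActiveMQ docs warn about; whether it matters depends on your network latency to the broker.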
It's an old one, but I've landed here searching for something else and actually do have some insights (hopefully will be helpful to someone).
We have implemented a very similar use-case with Hazelcast as the chassis for the cluster's internode communication. The essence is 2 data sets: 1 distributed map for responses, and 1 "local" map of response awaiters (on each node in the cluster).
each request (getting its own thread from Jetty) creates an entry in the map of local awaiters; the entry obviously has the correlation UID as its key, plus an object that serves as a semaphore
the request is then dispatched to the remote side (REST/JMS) and the original thread starts waiting on the semaphore; the UID must be part of the request
the remote side returns the response and writes it into the responses map under the correlated UID
the responses map is listened to; if the UID of a newly arriving response is found in the map of local awaiters, its semaphore is notified, and the original request's thread is released, picks up the response from the responses map, and returns it to the client
This is a general description; I can update the answer with a few optimizations we made, in case there is any interest.
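To make that concrete, here is a hedged sketch of the awaiter mechanics, assuming Hazelcast 3.x (IMap, EntryAddedListener); the dispatch call and all names are placeholders, not the original implementation:

```java
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.core.IMap;
import com.hazelcast.map.listener.EntryAddedListener;
import java.util.UUID;
import java.util.concurrent.*;

public class AwaiterExample {
    private final HazelcastInstance hz = Hazelcast.newHazelcastInstance();
    private final IMap<String, String> responses = hz.getMap("responses");
    // local awaiters: correlation UID -> latch the request thread blocks on
    private final ConcurrentMap<String, CountDownLatch> awaiters = new ConcurrentHashMap<>();

    public AwaiterExample() {
        // listen for newly added responses; includeValue=false, we read the map ourselves
        responses.addEntryListener((EntryAddedListener<String, String>) event -> {
            CountDownLatch latch = awaiters.get(event.getKey());
            if (latch != null) latch.countDown(); // wake the local awaiter, if any
        }, false);
    }

    // Request thread: register an awaiter, dispatch, block until the response arrives.
    public String call(String payload, long timeoutMs) throws InterruptedException {
        String uid = UUID.randomUUID().toString();
        CountDownLatch latch = new CountDownLatch(1);
        awaiters.put(uid, latch);
        try {
            dispatchToRemote(uid, payload); // hypothetical REST/JMS dispatch carrying the UID
            if (!latch.await(timeoutMs, TimeUnit.MILLISECONDS)) return null; // timed out
            return responses.remove(uid);
        } finally {
            awaiters.remove(uid);
        }
    }

    private void dispatchToRemote(String uid, String payload) { /* placeholder */ }
}
```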

concurrent consumers yet ensure order

I have a JMS Queue that is populated at a very high rate ( > 100,000/sec ).
It can happen that there are multiple messages pertaining to the same entity every second as well (several updates to the entity, with each update as a different message).
On the other end, I have one consumer that processes these messages and sends them to other applications.
Now, the whole setup is slowing down since the consumer is not able to cope with the rate of incoming messages.
Since, there is an SLA on the rate at which consumer processes messages, I have been toying with the idea of having multiple consumers acting in parallel to speed up the process.
So, what I'm thinking of doing is:
Multiple consumers acting independently on the queue.
Each consumer is free to grab any message.
After grabbing a message, make sure it's the latest version of the entity. For this part, I can check with the application that processes this entity.
If it's not the latest, bump the version up and try again.
I have been looking up the Integration patterns, JMS docs so far without success.
I would welcome ideas to tackle this problem in a more elegant way along with any known APIs, patterns in Java world.
ActiveMQ solves this problem with a concept called "Message Groups". While it's not part of the JMS standard, several JMS-related products work similarly. The basic idea is that you assign each message to a "group" which indicates messages that are related and have to be processed in order. Then you set it up so that each group is delivered only to one consumer. Thus you get load balancing between groups but guarantee in-order delivery within a group.
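With ActiveMQ, joining a message to a group is just a string property on the message; a minimal sketch (queue wiring omitted, the entity id used as the group key):

```java
import javax.jms.JMSException;
import javax.jms.MessageProducer;
import javax.jms.Session;
import javax.jms.TextMessage;

public class MessageGroupsExample {
    // Sends an entity update; all messages sharing a JMSXGroupID go to one consumer, in order.
    static void sendUpdate(Session session, MessageProducer producer,
                           String entityId, String body) throws JMSException {
        TextMessage message = session.createTextMessage(body);
        message.setStringProperty("JMSXGroupID", entityId); // group == entity
        producer.send(message);
    }
}
```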
Most EIP frameworks and ESBs have customizable resequencers. If the number of entities is not too large, you can have a queue per entity and resequence at the beginning.
For those interested in a way to solve this:
Use the Recipient List EAI pattern
As the question is about JMS, we can take a look at an example from the Apache Camel website.
This approach differs from other patterns like CBR and Selective Consumer in that the consumer is not aware of which messages it should process.
Let me put this in a real-world example:
We have an Order Management System (OMS) which sends Orders off to be processed by the ERP. Each Order then goes through 6 steps, and each of those steps publishes an event on the Order_queue announcing the Order's new status. Nothing special here.
The OMS consumes the events from that queue, but MUST process the events of each Order in the very same sequence in which they were published. The rate of messages published per minute is much greater than the consumers' throughput, hence the delay increases over time.
The solution requirements:
Consume in parallel, with as many consumers as needed to keep the queue size reasonable.
Guarantee that events for each Order are processed in the same publish order.
The implementation:
On the OMS side
The OMS process responsible for sending Orders to the ERP determines the consumer that will process all events of a certain Order, and sends the Recipient name along with the Order.
How does this process know what the Recipient should be? Well, you can use different approaches, but we used a very simple one: round robin.
On the ERP
As it keeps the Recipient's name for each Order, it simply sets up each message to be delivered to the desired Recipient.
On the OMS Consumer
We've deployed 4 instances, each one using a different Recipient name and processing messages concurrently.
One could say that we created another bottleneck: the database. But it is not true, since there is no concurrency on the order line.
One drawback is that the OMS process which sends the Orders to the ERP must keep track of how many Recipients are working. A sketch of the routing follows.
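A hedged sketch of the recipient routing with plain JMS selectors; the property name, queue, and instance names are illustrative, and the original system may wire this differently:

```java
import javax.jms.*;
import java.util.concurrent.atomic.AtomicLong;

public class RecipientListExample {
    private final String[] recipients = {"oms-1", "oms-2", "oms-3", "oms-4"};
    private final AtomicLong counter = new AtomicLong();

    // Sender side: round-robin the initial Recipient assignment; in reality the chosen
    // recipient would be stored with the Order so every later event of it reuses the same one.
    String recipientFor(String orderId) {
        return recipients[(int) (counter.getAndIncrement() % recipients.length)];
    }

    void publishEvent(Session session, MessageProducer producer,
                      String recipient, String body) throws JMSException {
        TextMessage event = session.createTextMessage(body);
        event.setStringProperty("recipient", recipient); // routing property
        producer.send(event);
    }

    // Consumer side: each of the 4 instances only sees its own messages.
    MessageConsumer consumerFor(Session session, Queue orderQueue, String myName)
            throws JMSException {
        return session.createConsumer(orderQueue, "recipient = '" + myName + "'");
    }
}
```

Since all events of an Order carry the same recipient name, each instance processes its Orders' events in publish order while the four instances run in parallel.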

Would a JMS Topic suffice in this situation? Or should I look elsewhere?

There is one controlling entity and several 'worker' entities. The controlling entity requests certain data from the worker entities, which they fetch and return in their own manner.
Since the controlling entity can be agnostic about the worker entities (and the worker entities can be added/removed at any point), putting a JMS provider between them sounds like a good idea. That's the assumption, at least.
Since it is a one-to-many relation (controller -> workers), a JMS Topic would be the right solution. But since the controlling entity depends on the return values of the workers, request/reply functionality would be nice as well (somewhere I read about the TopicRequestor, but I cannot seem to find a working example). Request/reply is typically Queue functionality.
As an attempt to use topics in a request/reply sort of way, I created two JMS topics: request and response. The controller publishes to the request topic and is subscribed to the response topic. Every worker is subscribed to the request topic and publishes to the response topic. To match requests and responses, the controller subscribes, for each request, to the response topic with a filter (using a session id as the value). The messages workers publish to the response topic carry that session id.
Now this does not feel like a solution (rather, it uses JMS as a hammer and treats the problem (and then some) as a nail). Is JMS a solution at all in this situation? Or are there other solutions I'm overlooking?
Your approach sort of makes sense to me. I think a messaging system could work, but using topics is wrong. Take a look at the wiki page for Enterprise Service Bus. It's a little more complicated than you need, but the basic idea for your use case is that you have a worker that is capable of reading from one queue, doing some processing, and adding the processed data to another queue.
The problem with a topic is that all workers will get the message at the same time and they will all work on it independently. It sounds like you only want one worker at a time working on each request. I think you have it as a topic so different types of workers can also listen to the same queue and only respond to certain requests. For that, you are better off just creating a new queue for each type of work. You could potentially have them in pairs, so you have a work_a_request queue and work_a_response queue. Or if your controller is capable of figuring out the type of response from the data, they can all write to a single response queue.
If you haven't chosen an Message Queue vendor yet, I would recommend RabbitMQ as it's easy to set-up, easy to add new queues (especially dynamically) and has really good spring support (although most major messaging systems have spring support and you may not even be using spring).
I'm also not sure what you are accomplishing with the filters. If you ensure the messages to the workers contain all the information needed to do the work, and the response messages back contain all the information your controller needs to finish the processing, I don't think you need them.
I would simply use two JMS queues.
The first one is the one that all of the requests go on. The workers will listen to the queue and process them in their own time, in their own way.
Once complete, they will bundle the request with the response and put that on another queue for the final process to handle. This way there's no need for the submitting process to retain the requests; they just travel along with the entire procedure. A final process will listen to the second queue and handle the request/response pairs appropriately.
If there's no need for the message to be reliable, or if there's no need for the actual processes to span JVMs or machines, then this can all be done with a single process and standard java threading (such as BlockingQueues and ExecutorServices).
If there's a need to accumulate related responses, then you'll need to capture whatever grouping data is necessary and have the Queue 2 listening process accumulate results. Or you can persist the results in a database.
For example, if you know your working set has five elements, you can queue up the requests with that information (1 of 5, 2 of 5, etc.). As each one finishes, the final process can update the database, counting elements. When it sees all of the pieces have been completed (in any order), it marks the result as complete. Later you would have some audit process scan for incomplete jobs that have not finished within some time (perhaps one of the messages erred out), so you can handle them better. Or the original processors can write the request to a separate "this one went bad" queue for mitigation and resubmission.
If you use JMS with transactions, then if one of the processors fails, the transaction rolls back and the message is retained on the queue for processing by one of the surviving processors; that's another advantage of JMS.
The trick with this kind of processing is to try to push the state along with the message, or externalize it and send references to the state, thus making each component effectively stateless. This aids scaling and reliability, since any component can fail (short of catastrophic JMS failure, naturally) and processing just picks up where it left off once you get the problem resolved and the components restarted.
If you're in a request/response mode (such as a servlet needing to respond), you can use Servlet 3.0 async servlets to easily put things on hold, or you can put a local object in an internal map, keyed by something such as the Session ID, and then Object.wait() on that object. Then your Queue 2 listener will get the response, finalize the processing, and use the Session ID (sent with the message and retained throughout the pipeline) to look up the object being waited on; it can then simply Object.notify() it to tell the servlet to continue.
Yes, this parks a thread in the servlet container while waiting; that's why the new async stuff is better, but you work with the hand you're dealt. You can also add a timeout to the Object.wait(); if it times out, the processing took too long and you can gracefully alert the client.
This basically frees you from filters and such, and from reply queues, etc. It's pretty simple to set it all up; a sketch of the wait/notify part follows.
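A minimal sketch of that map-plus-wait/notify bridge (all names hypothetical; a real servlet would also handle interruption and cleanup more carefully):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class WaitNotifyBridge {
    private final Map<String, Object> locks = new ConcurrentHashMap<>();
    private final Map<String, String> results = new ConcurrentHashMap<>();

    // Servlet thread: park until the Queue 2 listener delivers the result (or we time out).
    public String awaitResult(String sessionId, long timeoutMs) throws InterruptedException {
        Object lock = locks.computeIfAbsent(sessionId, id -> new Object());
        synchronized (lock) {
            if (!results.containsKey(sessionId)) {
                lock.wait(timeoutMs); // timeout falls through to the final check below
            }
        }
        locks.remove(sessionId);
        return results.remove(sessionId); // null means the processing took too long
    }

    // Queue 2 listener thread: store the result and wake the waiting servlet thread.
    public void deliver(String sessionId, String result) {
        results.put(sessionId, result);
        Object lock = locks.get(sessionId);
        if (lock != null) {
            synchronized (lock) { lock.notify(); }
        }
    }
}
```

Storing the result before notifying (and re-checking the results map before waiting) is what keeps the delivery race-free even when the response arrives before the servlet thread starts waiting.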
Well, the actual answer should depend on whether your worker entities are external parties, whether they are physically located outside your network, how long a worker entity is expected to take to finish its work, etc. But the problem you are trying to solve is one-to-many communication. You added the JMS protocol to your system either because you want all entities to be able to talk JMS, or because you want asynchrony; the former reason does not make much sense, and if it is the latter, you could choose another communication style, such as one-way web service calls.
You can use the latest Java concurrency APIs to make multi-threaded, asynchronous one-way web service calls to the different worker entities.

Is MQ publish/subscribe domain-specific interface generally faster than point-to-point?

I'm working on an existing application that uses a transport layer with point-to-point MQ communication.
For each account in a given list, we need to retrieve some information.
Currently we have something like this to communicate with MQ:
responseObject getInfo(requestObject request) {
    // code to send the request message to MQ
    sendToQueue(request);
    // code to retrieve the corresponding reply from MQ (blocks until it arrives)
    return receiveFromQueue();
}
As you can see we wait until it finishes completely before proceeding to the next account.
Due to performance issues we need to rework it.
There are 2 possible scenarios that I can think of at the moment.
1) Within the application, create a bunch of threads that would execute the transport adapter for each account, then get the data from each task. I prefer this method, but some team members argue that the transport layer is a better place for such a change and that we should place the extra load on MQ instead of on our application.
2) Rework transport layer to use publish/subscribe model.
Ideally I want something like this:
void send(requestObject request) {
    // code to send the request message to MQ (fire and forget)
    sendToQueue(request);
}

responseObject receive() {
    // code to retrieve the next response message from MQ
    return receiveFromQueue();
}
Then I would just send the requests in a loop, and later retrieve the data in a loop. The idea is that while the first request is being processed by the back-end system, we don't have to wait for its response but can instead send the next request.
My question: is this going to be a lot faster than the current sequential retrieval?
The question title frames this as a choice between P2P and pub/sub but the question body frames it as a choice between threaded and pipelined processing. These are two completely different things.
Either code snippet provided could just as easily use P2P or pub/sub to put and get messages. The decision should not be based on speed but rather whether the interface in question requires a single message to be delivered to multiple receivers. If the answer is no then you probably want to stick with point-to-point, regardless of your application's threading model.
And, incidentally, the answer to the question posed in the title is "no." When you use the point-to-point model, your messages resolve immediately to a destination or transmit queue and WebSphere MQ routes them from there. With pub/sub, your message is handed off to an internal broker process that resolves zero to many possible destinations. Only after this step does the published message get put on a queue where, for the remainder of its journey, it is handled like any other point-to-point message. Although pub/sub is not normally noticeably slower than point-to-point, the code path is longer, and therefore, all other things being equal, it will add a bit more latency.
The other part of the question is about parallelism. You proposed either spinning up many threads or breaking the app up so that requests and replies are handled separately. A third option is to have multiple application instances running. You can combine any or all of these in your design. For example, you can spin up multiple request threads and multiple reply threads and then have application instances processing against multiple queue managers.
The key to this question is whether the messages have affinity to each other, ordering dependencies, or affinity to the application instance or thread which created them. For example, if I am responding to an HTTP request with a request/reply, then the thread attached to the HTTP session probably needs to be the one to receive the reply. But if the reply is truly asynchronous and all I need to do is update a database with the response data, then having separate request and reply threads is helpful.
In either case, the ability to dynamically spin up or down the number of instances is helpful in managing peak workloads. If this is accomplished with threading alone then your performance scalability is bound to the upper limit of a single server. If this is accomplished by spinning up new application instances on the same or different server/QMgr then you get both scalability and workload balancing.
Please see the following article for more thoughts on these subjects: Mission:Messaging: Migration, failover, and scaling in a WebSphere MQ cluster
Also, go to the WebSphere MQ SupportPacs page and look for the Performance SupportPac for your platform and WMQ version. These are the ones with names beginning with MP**. These will show you the performance characteristics as the number of connected application instances varies.
It doesn't sound like you're thinking about this the right way. Regardless of the model you use (point-to-point or publish/subscribe), if your performance is bounded by a slow back-end system, neither will help speed up the process. If, however, you could theoretically issue more than one request at a time against the back-end system and expect to see a speed up, then you still don't really care if you do point-to-point or publish/subscribe. What you really care about is synchronous vs. asynchronous.
Your current approach for retrieving the data is clearly synchronous: you send the request message, and wait for the corresponding response message. You could do your communication asynchronously if you simply sent all the request messages in a row (perhaps in a loop) in one method, and then had a separate method (preferably on a different thread) monitoring the incoming topic for responses. This would ensure that your code would no longer block on individual requests. (This roughly corresponds to option 2, though without pub/sub.)
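As a sketch of that asynchronous shape in plain javax.jms (names illustrative; note that the listener needs its own Session, since JMS sessions are single-threaded):

```java
import javax.jms.*;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class AsyncAccountInfo {
    private final Map<String, Message> responses = new ConcurrentHashMap<>();

    // Fire all requests without waiting; each carries the account id for correlation.
    void sendAll(Session session, Destination requestDest, List<String> accounts)
            throws JMSException {
        MessageProducer producer = session.createProducer(requestDest);
        for (String account : accounts) {
            TextMessage request = session.createTextMessage("getInfo");
            request.setJMSCorrelationID(account);
            producer.send(request);
        }
    }

    // Separate consumer (on its own session/thread): collect replies as they trickle in.
    void listen(Session listenerSession, Destination replyDest) throws JMSException {
        MessageConsumer consumer = listenerSession.createConsumer(replyDest);
        consumer.setMessageListener(reply -> {
            try {
                responses.put(reply.getJMSCorrelationID(), reply);
            } catch (JMSException e) {
                throw new RuntimeException(e);
            }
        });
    }
}
```

The speedup then comes entirely from the back end being able to work on several requests at once, not from the choice of P2P versus pub/sub.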
I think option 1 could get pretty unwieldy, depending on how many requests you actually have to make, though it, too, could be implemented without switching to a pub/sub channel.
The reworked approach will use fewer threads. Whether that makes the application faster depends on whether the overhead of managing a lot of threads is currently slowing you down. If you have fewer than 1000 threads (a very, very rough order-of-magnitude estimate!), I would guess it probably isn't. If you have more than that, it might well be.
