Since a channel is not thread-safe, I can either synchronize the channel instance before publishing, or create a channel each time I need one and close it afterwards.
But in my opinion neither option performs well, due to the cost of locking or of creating and destroying channels.
So how should I publish messages to RabbitMQ with high TPS? Are there any good practices for this?
So, first things first: a channel is not a connection. In RabbitMQ, channels are the same thing as application sessions, whereas a connection represents an underlying TCP session with the server. This answer explains some of that nicely.
TCP sessions are expensive to create, so they tend to have a lifetime outside the lifetime of any particular worker thread. Channels are extremely cheap to create - all the server does is assign you an integer for your channel identifier and you have a new channel.
Most operations on RabbitMQ close a channel when they fail. This is done because there is no practical consequence of doing so. If they closed the underlying connection instead, that would cause a lot of problems for the app.
Design Guidance
Pooling would be appropriate for connections, if you really have a lot of processing going on. A discussion on how to do this is really beyond what I can provide in a short answer.
Pooling is absolutely not appropriate for channels. A channel is a lightweight construct that is designed to have a transient lifetime. If it lasts longer than one or two publishes, great. But you should expect that every time you try an operation, there is a possibility it will fail and close the channel. This does not close the underlying connection, but a new channel will have to be reestablished to do anything with the broker.
Consumer lifetimes are tied to channels. When the channel closes, the attached consumer is closed too. Design your consumer objects (worker threads) to be able to get a connection and create a new channel when this happens, and then re-subscribe themselves.
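As a rough illustration of that re-subscribe pattern, here is a minimal, untested sketch against the RabbitMQ Java client; the queue name and the bare-bones recovery logic are assumptions for illustration, not something prescribed by RabbitMQ.

```java
import com.rabbitmq.client.AMQP;
import com.rabbitmq.client.Channel;
import com.rabbitmq.client.Connection;
import com.rabbitmq.client.DefaultConsumer;
import com.rabbitmq.client.Envelope;
import com.rabbitmq.client.ShutdownSignalException;

import java.io.IOException;

public class ResubscribingWorker {
    private final Connection connection; // long-lived, shared TCP connection
    private final String queue;          // hypothetical queue name

    public ResubscribingWorker(Connection connection, String queue) {
        this.connection = connection;
        this.queue = queue;
    }

    public void subscribe() throws IOException {
        final Channel channel = connection.createChannel(); // cheap, per-consumer channel
        channel.basicConsume(queue, false, new DefaultConsumer(channel) {
            @Override
            public void handleDelivery(String consumerTag, Envelope envelope,
                                       AMQP.BasicProperties properties, byte[] body) throws IOException {
                // ... process the message ...
                channel.basicAck(envelope.getDeliveryTag(), false);
            }

            @Override
            public void handleShutdownSignal(String consumerTag, ShutdownSignalException sig) {
                // The channel went away; if the connection itself is still alive,
                // open a fresh channel and re-subscribe.
                if (!sig.isHardError()) {
                    try {
                        subscribe();
                    } catch (IOException e) {
                        // log and retry with a backoff in real code
                    }
                }
            }
        });
    }
}
```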
Avoid sharing a channel across threads. One thread = one channel.
While I don't have any particular experience with the Java client, I don't believe locks should be necessary, and I would certainly hope that the implementation doesn't do something screwy to make Channels anything other than lightweight.
If you're writing your own protocol implementation library (not necessary, but also not a bad idea if you need fine-grained control), allocate one thread to manage each connection. Don't do reads and writes to the TCP socket in parallel, or you'll break the protocol.
Regarding the Java client, I think you can assume that channel operations (reads and writes, etc.) are NOT thread-safe, which is why you want to stick to one thread/one channel paradigm. I think you can assume that creating channels (a connection operation) is thread-safe.
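To make the one-thread/one-channel point concrete, here is a minimal sketch with the RabbitMQ Java client; the host, exchange and routing-key names are placeholders, and the error handling is deliberately skeletal.

```java
import com.rabbitmq.client.Channel;
import com.rabbitmq.client.Connection;
import com.rabbitmq.client.ConnectionFactory;
import com.rabbitmq.client.MessageProperties;

public class Publisher {
    public static void main(String[] args) throws Exception {
        ConnectionFactory factory = new ConnectionFactory();
        factory.setHost("localhost");                         // placeholder broker address
        Connection connection = factory.newConnection();      // one expensive TCP connection

        Runnable publisherTask = () -> {
            try {
                Channel channel = connection.createChannel(); // one cheap channel per thread
                for (int i = 0; i < 1000; i++) {
                    channel.basicPublish("my-exchange", "my.routing.key",
                            MessageProperties.PERSISTENT_TEXT_PLAIN,
                            ("message " + i).getBytes());
                }
                channel.close();
            } catch (Exception e) {
                // a failed operation closes the channel; in real code, recreate it and retry
            }
        };

        Thread t1 = new Thread(publisherTask);
        Thread t2 = new Thread(publisherTask);                // the second thread gets its own channel
        t1.start(); t2.start();
        t1.join(); t2.join();
        connection.close();
    }
}
```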
You should use a pool.
For instance, use Apache's Generic Object Pool and provide an implementation for opening, closing and validating channels. When you need to publish a message, you borrow a channel from the pool, use it, and return it.
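A hedged sketch of that idea with Apache Commons Pool 2 is below; the exchange and routing-key names are placeholders, and whether pooling channels is worthwhile at all is debated in the answer above.

```java
import com.rabbitmq.client.Channel;
import com.rabbitmq.client.Connection;

import org.apache.commons.pool2.BasePooledObjectFactory;
import org.apache.commons.pool2.PooledObject;
import org.apache.commons.pool2.impl.DefaultPooledObject;
import org.apache.commons.pool2.impl.GenericObjectPool;

public class ChannelPool {
    private final GenericObjectPool<Channel> pool;

    public ChannelPool(final Connection connection) {
        pool = new GenericObjectPool<>(new BasePooledObjectFactory<Channel>() {
            @Override
            public Channel create() throws Exception {
                return connection.createChannel();           // opening
            }

            @Override
            public PooledObject<Channel> wrap(Channel channel) {
                return new DefaultPooledObject<>(channel);
            }

            @Override
            public boolean validateObject(PooledObject<Channel> p) {
                return p.getObject().isOpen();                // checking
            }

            @Override
            public void destroyObject(PooledObject<Channel> p) throws Exception {
                p.getObject().close();                        // closing
            }
        });
        pool.setTestOnBorrow(true);                           // drop channels the broker has closed
    }

    public void publish(byte[] body) throws Exception {
        Channel channel = pool.borrowObject();                // borrow
        try {
            channel.basicPublish("my-exchange", "my.routing.key", null, body);
        } finally {
            pool.returnObject(channel);                       // return
        }
    }
}
```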
Related
I'm trying to understand how to deal with threads within a Java client that connects to HornetQ. I'm not getting a specific error but fail to understand how I'm expected to deal with threads in the first place (with respect to the HornetQ client and specifically MessageHandler.onMessage() -- threads in general are no problem to me).
In case this is relevant: I'm using 'org.hornetq:hornetq-server:2.4.7.Final' to run the server embedded into my application. I don't intend this to make a difference. In my situation, that's just more convenient from an ops perspective than running a standalone server process.
What I did so far:
create an embedded server: new EmbeddedHornetQ(),
.setConfiguration()
create a server locator: HornetQClient.createServerLocator(false, new TransportConfiguration(InVMConnectorFactory.class.getName()))
create a session factory: serverLocator.createSessionFactory()
Now it seems obvious to me that I can create a session using hornetqClientSessionFactory.createSession(), create a producer and consumer for that session, and deal with messages within a single thread using .send() and .receive().
But I also discovered consumer.setMessageHandler(), and this tells me that I didn't understand threading in the client at all. I tried to use it, but then the consumer calls messageHandler.onMessage() in two threads that are distinct from the one that created the session. This seems to match my impression from looking at the code -- the HornetQ client uses a thread pool to dispatch messages.
This leaves me confused. The javadocs say that the session is a "single-thread object", and the code agrees -- no obvious synchronization going on there. But with onMessage() being called in multiple threads, message.acknowledge() is also called in multiple threads, and that one just delegates to the session.
How is this supposed to work? How would a scenario look in which MessageHandler does NOT access the session from multiple threads?
Going further, how would I send follow-up messages from within onMessage()? I'm using HornetQ for a persistent "to-do" work queue, so sending follow-up messages is a typical use case for me. But again, within onMessage(), I'm in the wrong thread for accessing the session.
Note that I would be okay with staying away from MessageHandler and just using send() / receive() in a way that allows me to control threading. But I'm convinced that I don't understand the whole situation at all, and that combined with multi-threading is just asking for trouble.
I can answer part of your question, although I hope you've already fixed the issue by now.
From the HornetQ documentation on ClientConsumer (emphasis mine):
A ClientConsumer receives messages from HornetQ queues.
Messages can be consumed synchronously by using the receive() methods which will block until a message is received (or a timeout expires) or asynchronously by setting a MessageHandler.
These 2 types of consumption are exclusive: a ClientConsumer with a MessageHandler set will throw HornetQException if its receive() methods are called.
So you have two choices on handling message reception:
Synchronize the reception yourself
Do not provide a MessageHandler to HornetQ
In your own consumer thread, invoke .receive() or .receive(long timeout) at your leisure
Retrieve the (optional) ClientMessage object returned by the call
Pro: Using the Session you hopefully carry alongside the Consumer, you can forward the message as you see fit (see the sketch after this list)
Con: All this message handling will be sequential
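For the synchronous option, a minimal sketch of such a consumer thread might look like the following; the queue and address names are made up, and the structure is only one plausible way to wire it up with the HornetQ core client API.

```java
import org.hornetq.api.core.client.ClientConsumer;
import org.hornetq.api.core.client.ClientMessage;
import org.hornetq.api.core.client.ClientProducer;
import org.hornetq.api.core.client.ClientSession;
import org.hornetq.api.core.client.ClientSessionFactory;

public class SyncWorker implements Runnable {
    private final ClientSessionFactory sessionFactory;

    public SyncWorker(ClientSessionFactory sessionFactory) {
        this.sessionFactory = sessionFactory;
    }

    @Override
    public void run() {
        try {
            // The session, consumer and producer all live on this one thread.
            ClientSession session = sessionFactory.createSession();
            ClientConsumer consumer = session.createConsumer("todo.queue");
            ClientProducer producer = session.createProducer("followup.address");
            session.start();

            while (!Thread.currentThread().isInterrupted()) {
                ClientMessage message = consumer.receive(1000);   // blocks up to 1 second
                if (message == null) {
                    continue;                                     // timeout, loop again
                }
                // ... process the message, then send a follow-up from the SAME thread/session
                ClientMessage followUp = session.createMessage(true);
                followUp.getBodyBuffer().writeString("follow-up work");
                producer.send(followUp);
                message.acknowledge();
            }
            session.close();
        } catch (Exception e) {
            // handle / log
        }
    }
}
```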
Delegate Thread synchronization to HornetQ
Do not invoke .receive() on a Consumer
Provide a MessageHandler implementation of onMessage(ClientMessage)
Pro: All the message handling will be concurrent and fast, hassle-free
Con: I do not think it is possible to retrieve the Session from this object, as it is not exposed by the interface.
Untested workaround: In my application (which is in-vm like yours), I exposed the underlying, thread-safe QueueConnection as a static variable available application-wide. From your MessageListener, you may invoke QueueSession jmsSession = jmsConnection.createQueueSession(false, Session.AUTO_ACKNOWLEDGE); on it to obtain a new Session and send your messages from it... This is probably alright as far as I can see because the Session object is not really re-created. I also did this because Sessions had a tendency to become stale.
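A short sketch of that workaround using the JMS API is below; the static connection holder and the queue name are assumptions drawn from the description above, and as noted it is untested.

```java
import javax.jms.Message;
import javax.jms.MessageListener;
import javax.jms.Queue;
import javax.jms.QueueConnection;
import javax.jms.QueueSender;
import javax.jms.QueueSession;
import javax.jms.Session;

public class FollowUpListener implements MessageListener {
    // Exposed application-wide, as described above; connections are thread-safe.
    public static QueueConnection jmsConnection;
    public static Queue followUpQueue;

    @Override
    public void onMessage(Message message) {
        try {
            // ... handle the incoming message ...

            // Create a fresh session on THIS thread to send the follow-up.
            QueueSession jmsSession =
                    jmsConnection.createQueueSession(false, Session.AUTO_ACKNOWLEDGE);
            try {
                QueueSender sender = jmsSession.createSender(followUpQueue);
                sender.send(jmsSession.createTextMessage("follow-up work"));
            } finally {
                jmsSession.close();
            }
        } catch (Exception e) {
            // handle / log
        }
    }
}
```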
I don't think you should want to be so much in control of your message-handling threads, especially transient threads that merely forward messages. HornetQ has built-in thread pools, as you guessed, and reuses these threads efficiently.
Also, as you know, you don't need to be in a single thread to access an object (like a Queue), so it doesn't matter if the Queue is accessed through multiple threads, or even through multiple Sessions. You need only make sure a Session is only accessed by one thread, and this is by design with MessageHandler.
I have a simple client-server application using sockets for the communication. One possibility is to close the socket every time the client has sent something to the server.
But my idea is to keep the connection always open, i.e. if a client contacts the server the connection should be put into a queue (e.g. LinkedBlockingQueue) and kept open, this would increase the performance.
How can I check in the server if there is new data available in a socket in the queue? The only thing I can imagine is to constantly iterate over the whole queue and check every socket if it has new data. But this would be very inefficient because if I have several threads working on the queue, the queue gets blocked when one thread is scanning over it.
Or is there a possibility to register a callback function on the socket, so that the socket informs the threads that data is ready?
But my idea is to keep the connection always open, i.e. if a client contacts the server the connection should be put into a queue (e.g. LinkedBlockingQueue) and kept open, this would increase the performance.
Keeping connections open will improve performance, though there are scaling issues: an open socket uses kernel resources. (I wouldn't use a queue though ...)
How can I check in the server if there is new data available in a socket in the queue?
If you have a number of sockets to different clients, and you want to process data in (roughly) the order that it arrives, there are two common techniques:
Create a thread per socket, and have each thread simply do a read. This will (naturally) block the thread until data becomes available.
Use the NIO channel selector mechanism (see Selector) which allows you to find out which of a group of I/O channels is ready for a read or write.
Thread per socket tends to be resource hungry (thread stacks), and does not scale well at all if you have multiple threads that are active simultaneously. (Too many context switches, too much load on the thread scheduler.)
By contrast, selectors map onto native syscalls provided by the host operating system, and thus they are efficient and responsive ... if used intelligently.
(You could also obtain non-blocking channels for the sockets, and poll them round-robin fashion. But that isn't going to be either efficient or responsive.)
As you can see, none of these ideas work with a queue. Either you have a number of threads each dealing with one socket, or you have one thread dealing with an array or (array) list of sockets. The queue abstraction is not designed for indexing or iterating.
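For reference, here is a bare-bones sketch of the selector approach (the second technique above); the port number is made up, and real code would need proper per-connection buffer management.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.Iterator;

public class SelectorServer {
    public static void main(String[] args) throws IOException {
        Selector selector = Selector.open();
        ServerSocketChannel server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress(9000));           // placeholder port
        server.configureBlocking(false);
        server.register(selector, SelectionKey.OP_ACCEPT);

        ByteBuffer buffer = ByteBuffer.allocate(4096);
        while (true) {
            selector.select();                               // blocks until some channel is ready
            Iterator<SelectionKey> keys = selector.selectedKeys().iterator();
            while (keys.hasNext()) {
                SelectionKey key = keys.next();
                keys.remove();
                if (key.isAcceptable()) {
                    SocketChannel client = server.accept();  // new client connection
                    client.configureBlocking(false);
                    client.register(selector, SelectionKey.OP_READ);
                } else if (key.isReadable()) {
                    SocketChannel client = (SocketChannel) key.channel();
                    buffer.clear();
                    int n = client.read(buffer);
                    if (n == -1) {
                        key.cancel();                        // client closed the connection
                        client.close();
                    } else {
                        // ... hand the data to a worker, or process it inline ...
                    }
                }
            }
        }
    }
}
```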
Or is there a possibility to register a callback function on the socket, so that the socket informs the threads that data is ready?
See @Lolo's answer.
A practical solution would be to use NIO2 AsynchronousSocketChannels to perform asynchronous read operations with a callback that you can specify as a CompletionHandler.
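A minimal sketch of that NIO2 approach is below; the port number is arbitrary and the handlers only hint at where real processing would go.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousServerSocketChannel;
import java.nio.channels.AsynchronousSocketChannel;
import java.nio.channels.CompletionHandler;

public class AsyncServer {
    public static void main(String[] args) throws IOException, InterruptedException {
        final AsynchronousServerSocketChannel server =
                AsynchronousServerSocketChannel.open().bind(new InetSocketAddress(9000));

        server.accept(null, new CompletionHandler<AsynchronousSocketChannel, Void>() {
            @Override
            public void completed(AsynchronousSocketChannel client, Void att) {
                server.accept(null, this);                   // keep accepting further clients
                final ByteBuffer buffer = ByteBuffer.allocate(4096);
                client.read(buffer, null, new CompletionHandler<Integer, Void>() {
                    @Override
                    public void completed(Integer bytesRead, Void att) {
                        // called back when data is ready -- no polling, no thread per socket
                        // ... process the buffer, then issue the next read ...
                    }

                    @Override
                    public void failed(Throwable exc, Void att) {
                        // handle / log
                    }
                });
            }

            @Override
            public void failed(Throwable exc, Void att) {
                // handle / log
            }
        });

        Thread.currentThread().join();                       // keep the JVM alive for the demo
    }
}
```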
Got some quick questions on using a non-blocking selector - answering any of the below would be greatly appreciated:
I have just been learning about the selector (Java NIO model) and want to clarify some facts. I picked up most of the key points from tutorials as well as the SO question:
Java NIO: Selectors
so:
From the question linked above: a server socket channel is registered to listen for accept; when accepted, a channel is registered for read, and once read it is registered for write. How does it then go back into listening for an accept? I mean, there appears to be no repeated call to register an accept listener.
Is it just that once a ServerSocketChannel is registered the selector will always be listening for an accept? If so doesn't this mean that the selector will always be listening for a read/write on the channels also? How do you stop that - is it as simple as closing the channels?
What happens if a channel is registered twice? I mean - in the above example, a socket is registered for write every time after a message is read - basically looks like an echo server. I guess the example assumes that the connection will then be closed after writing.
What happens if you want to maintain an open connection and at different intervals write to the socket - must you register an interestOp each time? Is it harmful to register the same channel twice?
Multithreading:
What is the best design now for writing a multithreaded server - as the above non blocking approach assumes that one thread can handle writing and reading. Should you create a new thread if the read/write operation is expected to take a while? Are there any other points to consider?
I understand that multithreading is for blocking IO, but I have seen some frameworks that use a thread pool along with a non-blocking model and was wondering why.
a Server socket channel is registered to listen for accept, when accepted a channel is registered for read and once read it is registered for write. How does it then go back into listening for an accept?
It doesn't have to 'go back'. It's listening. It's registered. Nothing has changed.
What happens if a channel is registered twice?
The interestOps are updated and the previous key attachment is lost.
I mean - in the above example, a socket is registered for write every time after a message is read.
It's a very poor example. Incorrect technique. You should just write. If write() returns zero, then you should (a) register the channel for OP_WRITE; (b) retry the write when the channel becomes writable; and (c) deregister the channel for OP_WRITE once the write has completed.
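A minimal sketch of that (a)/(b)/(c) sequence, assuming the usual selector loop around it; the helper class and its method split are just for illustration.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.SocketChannel;

class WriteHelper {
    /** Just write; fall back to OP_WRITE only if the socket buffer is full. */
    static void write(SelectionKey key, ByteBuffer data) throws IOException {
        SocketChannel channel = (SocketChannel) key.channel();
        channel.write(data);
        if (data.hasRemaining()) {
            // (a) couldn't write it all: register the channel for OP_WRITE
            key.interestOps(key.interestOps() | SelectionKey.OP_WRITE);
            key.attach(data);                                  // remember the unfinished buffer
        }
    }

    /** Called when select() reports the channel writable. */
    static void onWritable(SelectionKey key) throws IOException {
        SocketChannel channel = (SocketChannel) key.channel();
        ByteBuffer data = (ByteBuffer) key.attachment();
        channel.write(data);                                   // (b) retry the write
        if (!data.hasRemaining()) {
            // (c) write complete: deregister OP_WRITE so select() doesn't spin
            key.interestOps(key.interestOps() & ~SelectionKey.OP_WRITE);
            key.attach(null);
        }
    }
}
```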
I guess this assumes that the connection will then be closed after writing.
I don't see why. It doesn't follow. In any case it's a poor example, not to be imitated. You would be better off imitating answers here, rather than unanswered questions.
What happens if you want to maintain an open connection and at different intervals write to the socket - must you register an interestOp each time? Is it harmful to register the same channel twice?
See above. Keep the channel open. Its interestOps won't change unless you change them; and you should handle writing as above, not as per the poor example you cited.
Multithreading: What is the best design now for writing a multithreaded server - as the above non blocking approach assumes that one thread can handle writing and reading.
If you want multi-threading, use blocking I/O, most simply via java.net. If you want non-blocking I/O you don't need multi-threading. I don't understand why you're asking about both in the same question.
Should you create a new thread if the read/write operation is expected to take a while?
In the most general case of blocking I/O you have two threads per connection: one for reading and one for writing.
I'm writing a process which must connect (and keep alive) to several (hundreds) remote peers and manage messaging / control over them.
I made two versions of this software: the first with the classic "thread-per-connection" model, the second using standard Java NIO and selectors (to reduce thread allocation, but it has problems). Then, looking around, I found that Netty can boost performance a lot in most cases, and I started a third version using it. My goal is to keep resource usage quite low while keeping it fast.
Once I had written the pipeline factory with custom events and dynamic handler switching, I got stuck on the most superficial part: its allocation.
All the examples I read use a single client with single connection, so I got the doubt: I set up a ChannelFactory and a PipelineFactory, so every (new ClientBootstrap(factory)).connect(address) makes a new channel with a new pipeline. Is it possible to make a shared pipeline and defer business logic to a thread-pool?
If so, how?
Using standard Java NIO I managed to use two small thread pools (threads < remote peers) by taking advantage of selectors; I had, however, trouble recycling the channels I was listening on for writing.
Communication should happen through a single channel which can receive timed messages from the remote peer or make a 3-way control (command-ack-ok).
On the other hand: once the event has reached the last handler, what happens? Is that where I extract it, or can I extract a message from any point?
You should only have one bootstrap (i.e one ChannelFactory and one PipeLineFactory). Pipelines, or even individual channel handlers, may be shared, but they are usually created unique per channel.
You can have an ExecutionHandler in your pipeline to transfer execution from the IO worker threads to a thread pool.
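With Netty 3.x that typically looks something like the sketch below; the pool sizes and the string codec are placeholders, and the important detail is that the ExecutionHandler instance is created once and shared by every pipeline.

```java
import java.util.concurrent.Executor;

import org.jboss.netty.channel.ChannelHandlerContext;
import org.jboss.netty.channel.ChannelPipeline;
import org.jboss.netty.channel.ChannelPipelineFactory;
import org.jboss.netty.channel.Channels;
import org.jboss.netty.channel.MessageEvent;
import org.jboss.netty.channel.SimpleChannelUpstreamHandler;
import org.jboss.netty.handler.codec.string.StringDecoder;
import org.jboss.netty.handler.codec.string.StringEncoder;
import org.jboss.netty.handler.execution.ExecutionHandler;
import org.jboss.netty.handler.execution.OrderedMemoryAwareThreadPoolExecutor;

public class MyPipelineFactory implements ChannelPipelineFactory {
    // One shared ExecutionHandler (and business thread pool) reused by EVERY channel's pipeline.
    private static final Executor BUSINESS_POOL =
            new OrderedMemoryAwareThreadPoolExecutor(16, 1048576, 1048576);
    private static final ExecutionHandler EXECUTION_HANDLER =
            new ExecutionHandler(BUSINESS_POOL);

    @Override
    public ChannelPipeline getPipeline() throws Exception {
        ChannelPipeline pipeline = Channels.pipeline();
        pipeline.addLast("decoder", new StringDecoder());
        pipeline.addLast("encoder", new StringEncoder());
        // Handlers added after this point run on the business pool, not the IO worker threads.
        pipeline.addLast("executor", EXECUTION_HANDLER);
        pipeline.addLast("business", new SimpleChannelUpstreamHandler() {
            @Override
            public void messageReceived(ChannelHandlerContext ctx, MessageEvent e) throws Exception {
                // free to block or do heavy work here
            }
        });
        return pipeline;
    }
}
```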
But why don't you read the exhaustive documentation at http://netty.io/wiki/ ? You'll find answers to every one of your questions there.
A little help please.
I am designing a stateless server that will have the following functionality:
Client submits a job to the server.
Client is blocked while the server tries to perform the job.
The server will spawn one or multiple threads to perform the job.
The job either finishes, times out or fails.
The appropriate response (based on the outcome) is created, the client is unblocked and the response is handed off to the client.
Here is what I have thought of so far.
Client submits a job to the server.
The server assigns an ID to the job, places the job on a Queue and then places the Client on an another queue (where it will be blocked).
Have a thread pool that will execute the job, fetch the result and appropriately create the response.
Based on ID, pick the client out of the queue (thereby unblocking it), give it the response and send it off.
Steps 1, 3 and 4 seem quite straightforward; however, any ideas about how to put the client in a queue and then block it? Also, any pointers that would help me design this puppy would be appreciated.
Cheers
Why do you need to block the client? Seems like it would be easier to return (almost) immediately (after performing initial validation, if any) and give client a unique ID for a given job. Client would then be able to either poll using said ID or, perhaps, provide a callback.
Blocking means you're holding on to a socket which obviously limits the upper number of clients you can serve simultaneously. If that's not a concern for your scenario and you absolutely need to block (perhaps you have no control over client code and can't make them poll?), there's little sense in spawning threads to perform the job unless you can actually separate it into parallel tasks. The only "queue" in that case would be the one held by common thread pool. The workflow would basically be:
1. Create a thread pool (such as ThreadPoolExecutor)
2. For each client request:
   2.1. If you have any parts of the job that you can execute in parallel, delegate them to the pool.
   2.2. And / or do them in the current thread.
   2.3. Wait until pooled job parts complete (if applicable).
   2.4. Return results to client.
3. Shutdown the thread pool.
No IDs are needed per se; though you may need to use some sort of latch for 2.1 / 2.3 above.
Timeouts may be a tad tricky. If you need them to be more or less precise, you'll have to keep your main thread (the one that received the client request) free from work, have it signal the submitted job parts (by flipping a flag) when the timeout is reached, and return immediately. You'll have to check said flag periodically and terminate your execution once it's flipped; the pool will then reclaim the thread.
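A rough sketch of that per-request workflow is below. It leans on invokeAll with a timeout instead of the hand-rolled flag described above, and the task bodies and pool size are placeholders.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class JobServer {
    private final ExecutorService pool = Executors.newFixedThreadPool(8);    // step 1

    /** Handles one client request on the thread that received it. */
    public String handleRequest() throws InterruptedException {
        List<Callable<String>> parts = Arrays.asList(                        // parallelizable parts
                () -> "part one result",
                () -> "part two result");

        // steps 2.1 + 2.3: submit the parts and wait at most 5 seconds;
        // invokeAll cancels anything still running when the timeout expires.
        List<Future<String>> results = pool.invokeAll(parts, 5, TimeUnit.SECONDS);

        StringBuilder response = new StringBuilder();
        for (Future<String> f : results) {
            if (f.isCancelled()) {
                return "TIMEOUT";                                            // job timed out
            }
            try {
                response.append(f.get()).append('\n');
            } catch (Exception e) {
                return "FAILED";                                             // job failed
            }
        }
        return response.toString();                                          // step 2.4
    }

    public void shutdown() {
        pool.shutdown();                                                     // step 3
    }
}
```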
How are you communicating to the client?
I recommend you create an object to represent each job which holds job parameters and the socket (or other communication mechanism) to reach the client. The thread pool will then send the response to unblock the client at the end of job processing.
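For example, a minimal shape for such a job object might be the following; the field names and the plain-text response protocol are made up for illustration.

```java
import java.io.IOException;
import java.io.PrintWriter;
import java.net.Socket;

/** One queued job: the parameters plus the way back to the blocked client. */
public class Job implements Runnable {
    private final String parameters;   // whatever the client submitted
    private final Socket clientSocket; // held open while the client blocks

    public Job(String parameters, Socket clientSocket) {
        this.parameters = parameters;
        this.clientSocket = clientSocket;
    }

    @Override
    public void run() {
        String result;
        try {
            result = process(parameters);          // do the actual work
        } catch (Exception e) {
            result = "FAILED";
        }
        try (PrintWriter out = new PrintWriter(clientSocket.getOutputStream(), true)) {
            out.println(result);                   // unblocks the waiting client
        } catch (IOException e) {
            // client probably went away; log and move on
        }
    }

    private String process(String parameters) {
        return "OK";                               // placeholder for the real processing
    }
}
```

The thread pool simply executes these Job instances; writing (and closing) the stream at the end is what finally unblocks the waiting client.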
The timeouts will be somewhat tricky and will have hidden gotchas, but the basic design seems straightforward: write a class that takes a Socket in the constructor. On socket.accept() we just instantiate a new socket-processing object. With great foresight and planning on scalability (or if this is a bench-test experiment), the socket-processing class just goes off to the data-processing stuff, and when it returns you have some sort of boolean or numeric for the state or something (handy place for null, btw), and it either writes the success to the OutputStream from the socket or informs the client of a timeout, or whatever your business needs are.
If you have to have a scalable, effective design for long-running heavy haulers, go directly to NIO ... hand-coded one-off solutions like the one I describe probably won't scale well, but they provide a fundamental conceptual basis for a code-correct NIO design.
( sorry folks, I think directly in code - design patterns are then applied to the code after it is working. What does not hold up gets reworked then, not before )