Do we need to create individual channels for each thread or use the same channel for all threads? Also the same question about connection. Do we need to use different connections for each thread or a single connection?
What is the difference when we use one channel across all threads and individual channels for each thread?
Connection:
According to the java doc (https://www.rabbitmq.com/releases/rabbitmq-java-client/v3.6.5/rabbitmq-java-client-javadoc-3.6.5/):
Current implementations are thread-safe for code at the client API level, and in fact thread-safe internally except for code within RPC calls.
Channel:
According to the doc (https://www.rabbitmq.com/api-guide.html):
Channel instances must not be shared between threads. Applications should prefer using a Channel per thread instead of sharing the same Channel across multiple threads. While some operations on channels are safe to invoke concurrently, some are not and will result in incorrect frame interleaving on the wire. Sharing channels between threads will also interfere with Publisher Confirms.
How to use TBufferedTransport of TThreadedSelectorServer in java?
in Python client:
self.tsocket= TSocket.TSocket(self.host, self.port)
self.transport = TTransport.TBufferedTransport(self.tsocket)
protocol = TBinaryProtocol(self.transport)
client = Handler.Client(protocol)
self.transport.open()
in Java Server
TNonblockingServerSocket serverTransport = new TNonblockingServerSocket(port);
TProcessor tprocessor = new ExecutionService.Processor<ExecutionService.Iface>(handler);
TThreadedSelectorServer.Args tArgs = new TThreadedSelectorServer.Args(serverTransport);
tArgs.processor(tprocessor);
tArgs.protocolFactory(new TBinaryProtocol.Factory());
this.server = new TThreadedSelectorServer(tArgs);
The Python client uses TBufferedTransport, while the Java server expects TFramedTransport. This causes an exception:
AbstractNonblockingServer$FrameBuffer Read an invalid frame size of -2147418111. Are you using TFramedTransport on the client side?
For some reason, the client cannot be modified, so I want to modify the Java server to use TBufferedTransport.
How to use TBufferedTransport of TThreadedSelectorServer in java?
thanks!!!
The TThreadedSelectorServer requires TFramedTransport (reference):
A Half-Sync/Half-Async server with a separate pool of threads to handle non-blocking I/O. Accepts are handled on a single thread, and a configurable number of nonblocking selector threads manage reading and writing of client connections. ... Like TNonblockingServer, it relies on the use of TFramedTransport.
This applies for the other non-blocking server classes deriving from TNonblockingServer (reference):
A nonblocking TServer implementation. This allows for fairness amongst all connected clients in terms of invocations. This server is inherently single-threaded. If you want a limited thread pool coupled with invocation-fairness, see THsHaServer. To use this server, you MUST use a TFramedTransport at the outermost transport, otherwise this server will be unable to determine when a whole method call has been read off the wire. Clients must also use TFramedTransport.
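The number in the error message from the question is itself a hint. A minimal sketch, assuming the client sends an unframed message in strict TBinaryProtocol encoding, whose first four bytes are the version marker 0x80010000 combined with the message type CALL (1). The class and method names are made up for illustration and are not part of Thrift. A framed server reads those same four bytes as a big-endian frame length:

```java
import java.nio.ByteBuffer;

// Hypothetical helper, not part of Thrift: shows how a framed server
// misreads the first bytes of an unframed binary-protocol message.
class FrameSizeDemo {
    // Strict TBinaryProtocol starts a message with (VERSION_1 | messageType).
    static final int VERSION_1 = 0x80010000;
    static final int CALL = 1;

    // The first four bytes an unframed client puts on the wire (big-endian).
    static byte[] unframedHeader() {
        return ByteBuffer.allocate(4).putInt(VERSION_1 | CALL).array();
    }

    // What a framed transport does with them: treat them as a frame length.
    static int asFrameSize(byte[] firstFourBytes) {
        return ByteBuffer.wrap(firstFourBytes).getInt();
    }

    public static void main(String[] args) {
        System.out.println(asFrameSize(unframedHeader())); // prints -2147418111
    }
}
```

So the "invalid frame size" is most likely just the start of a valid but unframed request, which matches the hint in the exception text.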
If you cannot use TFramedTransport on the client side, you therefore have to use a blocking server, i.e. TThreadPoolServer (reference):
Server which uses Java's built in ThreadPool management to spawn off a worker pool that deals with client connections in blocking way.
Your code would then look like this:
TServerSocket serverTransport = new TServerSocket(9090);
TThreadPoolServer.Args tArgs = new TThreadPoolServer.Args(serverTransport);
tArgs.processor(processor);
tArgs.protocolFactory(new TBinaryProtocol.Factory());
TThreadPoolServer server = new TThreadPoolServer(tArgs);
To detail the differences between the blocking and the non-blocking servers (for general reference, apologies if the difference is already clear to you): Blocking means that when data is read from a socket, no other operation can be done while reading. So when the data arrives partially, the current thread waits until the remaining data arrives. So when a blocking server only has a single thread, only one client can be handled at a time. The time spent waiting for further data from a client cannot be used to serve other clients.
To support multiple clients, multiple threads can be added (as done for TThreadPoolServer). Each thread can only handle one client at a time as before, so the number of clients that can be served simultaneously is limited by the number of threads. You could of course spawn many threads, but this does not scale well: The threads used by the Java ThreadPool which backs the TThreadPoolServer are system-level threads, so they come with some resource overhead for creation and switching between threads. So creating a large number of threads to serve a large number of clients means more time is spent on OS bookkeeping of the tasks.
Non-blocking servers (deriving from TNonblockingServer) are meant to solve this problem by using the time spent waiting for data from one client to read data from other clients. This way a single thread can handle multiple clients, reading from whichever client currently has available data. A non-blocking server can of course also have multiple threads, each handling multiple clients. This way the number of threads does not have to scale with the number of clients. Instead, the number of threads can be chosen proportionally to the number of CPU cores, and then each thread running on a core can read as much data as the I/O bandwidth and CPU speed allow. For this reason, a non-blocking server scales better with a high number of clients.
Therefore, if you have to handle a large number of clients simultaneously, using TNonblockingServer would be preferable and it would be better to find a way to switch the client to use the TFramedTransport. If your use-case is handling only a limited number of clients, then using TThreadPoolServer without modifying the client should be fine, even if each client produces a lot of data.
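The idea of one thread serving many clients can be sketched with plain java.nio, independent of Thrift. This is a minimal illustration, not how Thrift implements its servers: the class name, the echo behaviour and the loopback address are all assumptions made for the example. A single thread waits on a Selector and handles whichever connection currently has data:

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.Iterator;

// Hypothetical single-threaded echo loop; illustrative only.
class SelectorEchoSketch implements Runnable, AutoCloseable {
    private final Selector selector;
    private final ServerSocketChannel server;

    SelectorEchoSketch() throws IOException {
        selector = Selector.open();
        server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress("127.0.0.1", 0)); // port 0 = any free port
        server.configureBlocking(false);
        server.register(selector, SelectionKey.OP_ACCEPT);
    }

    int port() throws IOException {
        return ((InetSocketAddress) server.getLocalAddress()).getPort();
    }

    @Override
    public void run() {
        ByteBuffer buf = ByteBuffer.allocate(1024);
        try {
            while (selector.isOpen()) {
                selector.select(); // sleeps until some channel is ready
                if (!selector.isOpen()) break;
                Iterator<SelectionKey> it = selector.selectedKeys().iterator();
                while (it.hasNext()) {
                    SelectionKey key = it.next();
                    it.remove();
                    if (!key.isValid()) continue;
                    if (key.isAcceptable()) {
                        SocketChannel client = server.accept();
                        if (client != null) {
                            client.configureBlocking(false);
                            client.register(selector, SelectionKey.OP_READ);
                        }
                    } else if (key.isReadable()) {
                        SocketChannel client = (SocketChannel) key.channel();
                        buf.clear();
                        if (client.read(buf) < 0) { // peer closed the connection
                            client.close();
                            continue;
                        }
                        buf.flip();
                        // Sketch only: this spins if the socket buffer is full;
                        // a real server would register OP_WRITE instead.
                        while (buf.hasRemaining()) client.write(buf);
                    }
                }
            }
        } catch (IOException | IllegalStateException e) {
            // Closed selector or channel: treat as shutdown in this sketch.
        }
    }

    @Override
    public void close() throws IOException {
        selector.close(); // also wakes up a blocked select()
        server.close();
    }
}
```

Note that this loop reads whatever bytes happen to be available. Without a length prefix it cannot tell where one request ends, which is the reason the Thrift non-blocking servers insist on TFramedTransport.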
Since a channel is not thread safe, I can either synchronize on the channel instance before publishing, or create a channel each time I need one and close it afterwards.
But in my opinion neither of them performs well, due to the cost of locking or of creating and destroying channels.
So how should I publish messages to RabbitMQ at high TPS? Is there any good practice for this?
So, first things first. A channel is not a connection. In RabbitMQ, Channels are the same thing as application sessions, whereas a connection represents an underlying TCP session with the server. This answer explains some of that nicely.
TCP sessions are expensive to create, so they tend to have a lifetime outside the lifetime of any particular worker thread. Channels are extremely cheap to create - all the server does is assign you an integer for your channel identifier and you have a new channel.
Most operations on RabbitMQ close a channel when they fail. This is done because there is no practical consequence of doing so. If they closed the underlying connection instead, that would cause a lot of problems for the app.
Design Guidance
Pooling would be appropriate for connections, if you really have a lot of processing going on. A discussion on how to do this is really beyond what I can provide in a short answer.
Pooling is absolutely not appropriate for channels. A channel is a lightweight construct that is designed to have a transient lifetime. If it lasts longer than one or two publishes, great. But you should expect that every time you try an operation, there is a possibility it will fail and close the channel. This does not close the underlying connection, but a new channel will have to be reestablished to do anything with the broker.
Consumer lifetimes are tied to channels. When the channel closes, the attached consumer is closed too. Design your consumer objects (worker threads) to be able to get a connection and create a new channel when this happens, and then re-subscribe themselves.
Avoid sharing a channel across threads. One thread = one channel.
While I don't have any particular experience with the Java client, I don't believe locks should be necessary, and I would certainly hope that the implementation doesn't do something screwy to make Channels anything other than lightweight.
If you're writing your own protocol implementation library (not necessary, but also not a bad idea if you need fine-grained control), allocate one thread to manage each connection. Don't do reads and writes to the TCP socket in parallel, or you'll break the protocol.
Regarding the Java client, I think you can assume that channel operations (reads and writes, etc.) are NOT thread-safe, which is why you want to stick to one thread/one channel paradigm. I think you can assume that creating channels (a connection operation) is thread-safe.
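The "one thread = one channel" rule can be sketched without locks by giving every thread its own lazily created instance. This is a minimal sketch using only the JDK. PerThread is a made-up helper, not part of the RabbitMQ client. In real code the factory would presumably be something like connection::createChannel on a shared Connection, which is an assumption here since the sketch avoids depending on the client library:

```java
import java.util.concurrent.Callable;

// Hypothetical helper: hands each thread its own lazily created resource,
// so no two threads ever touch the same instance.
class PerThread<T> {
    private final ThreadLocal<T> slot;

    // The factory runs at most once per thread (again after reset()).
    PerThread(Callable<T> factory) {
        this.slot = ThreadLocal.withInitial(() -> {
            try {
                return factory.call();
            } catch (Exception e) {
                throw new IllegalStateException("could not create per-thread resource", e);
            }
        });
    }

    // Returns the calling thread's own instance.
    T get() {
        return slot.get();
    }

    // Forget the calling thread's instance, e.g. after a failed operation
    // closed the channel; the next get() creates a fresh one.
    void reset() {
        slot.remove();
    }
}
```

A worker would call get() before each publish and reset() when it notices that its channel has been closed. Closing the old channel and cleaning up when threads end is left out of this sketch.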
You should use a pool.
For instance, use Apache's Generic Object Pool and provide an implementation for opening, closing and checking connections. When you need to publish a message, you borrow a channel from the pool, use it, and return it.
I have a socketChannel (java.nio.channels.SocketChannel) listening for reading requests (from multiple clients). It stores each request in a Request Queue.
Also socketChannel.configureBlocking(false)
Then I want the multiple threads to take one request at a time from the Request Queue and write to the socketChannel
I have read the following from a documentation.
Socket channels are safe for use by multiple concurrent threads. They
support concurrent reading and writing, though at most one thread may
be reading and at most one thread may be writing at any given time.
Since only one thread can write at a time, what can I do in the case of multiple writes?
You can use your own lock synchronized or ReentrantLock, or queue the messages and have one thread do the actual writes.
The problem with writes is that you can only atomically write one byte at a time. If you write more than one byte, you might send some, but not all, of the data, in which case another thread can attempt to write its message and you get a corrupted message.
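The locking option can be sketched as follows. This is a minimal illustration, assuming each message is available as a complete byte array. LockedWriter is a made-up name, and the channel is typed as WritableByteChannel (which SocketChannel implements) so that the sketch does not need a network connection. The lock is held until the whole message has been written, so partial writes from different threads cannot interleave:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.WritableByteChannel;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical wrapper: serializes whole-message writes to one channel.
class LockedWriter {
    private final WritableByteChannel channel; // e.g. a SocketChannel
    private final ReentrantLock lock = new ReentrantLock();

    LockedWriter(WritableByteChannel channel) {
        this.channel = channel;
    }

    void writeMessage(byte[] message) throws IOException {
        ByteBuffer buf = ByteBuffer.wrap(message);
        lock.lock();
        try {
            // write() may accept only part of the buffer, so keep going
            // while still holding the lock. On a non-blocking channel this
            // loop busy-waits; a real server would wait for OP_WRITE.
            while (buf.hasRemaining()) {
                channel.write(buf);
            }
        } finally {
            lock.unlock();
        }
    }
}
```

The alternative from the answer, a queue drained by a single writer thread, avoids holding a lock during I/O at the cost of one extra thread per channel.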
I have a java.nio.channels.SocketChannel listening for reading requests (from multiple clients).
No you don't. You might have a ServerSocketChannel that listens for connections from multiple clients, but once you have an accepted SocketChannel, it is only connected to one client. All you can get from it is sequential requests from that client.
It stores each request in a Request Queue.
I don't see any need for that.
Also socketChannel.configureBlocking(false)
Then I want the multiple threads to take one request at a time from the Request Queue and write to the socketChannel
Why not just compute the reply as soon as you read it and write it directly back?
I have read the following from a documentation.
Socket channels are safe for use by multiple concurrent threads. They support concurrent reading and writing, though at most one thread may be reading and at most one thread may be writing at any given time.
Since only one thread can write at a time, what can I do in the case of multiple writes?
What multiple writes? You only have one client request per channel. You only need to write one response per request. You should not read, let alone process, a new request until you've written the prior response.
Do the instances of Netty's channel handlers (such as SimpleChannelInboundHandler, ChannelInboundHandlerAdapter, etc.) share the same thread and stack, or does each have its own thread and stack? I ask because I instantiate numerous channel handlers and I need them to communicate with each other, and I must decide between using thread communication or non-threaded communication.
Thank you for your answer
As a general rule, if your handler has state, then it's one handler per channel (pipeline). Otherwise, annotate your handler with @ChannelHandler.Sharable and use the same instance for each channel.
The answer: it depends
I assume that you must be building a server, some of what I say might not apply.
According to https://netty.io/4.0/api/io/netty/channel/ChannelHandler.html, one of the factors that determines which thread your channel handler runs on is how you add it to the pipeline. Netty allows the capability to use a single instance of your handler for all pipelines (i.e. Connections to your server), in which case you must accommodate for different threads.
In contrast, if you use a channel initializer to add handlers to the pipeline, then no, you do not need to communicate between threads because each connection uses a different instance of the handler.
This also assumes that you are using more than one worker thread to run your channel handlers. You must also account for which channel your handlers are communicating with: if you don't store a state variable using a handler added by a channel initializer, then you must accommodate inter-thread communication.
Your best bet is to debug the current thread in each handler before making a decision. Netty's threading behavior and how it interacts with your program depend heavily on your implementation and on where you are moving your data.
Are netty channels (or java NIO channels in general) FIFO? or I need to implement FIFO by myself using sequence numbers?
Thanks
NIO maintains a read and write lock internally; however they are implemented using a synchronized block in NIO.
There is NO guarantee that Thread B will obtain the lock after Thread A when using synchronized. It is entirely possible that Thread C could obtain write lock before Thread B.
See the following on lock release: Synchronized release order
If you NEED guaranteed FIFO across many threads then you need to create a ReentrantLock with fair=true and require all of your threads to obtain that lock first.
http://docs.oracle.com/javase/6/docs/api/java/util/concurrent/locks/ReentrantLock.html
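A minimal sketch of the fair-lock approach, assuming every writer thread goes through the same gate object before touching the channel. FairWriteGate is a made-up name for illustration. Passing true to the ReentrantLock constructor selects the fair policy, under which the longest-waiting thread is favored when the lock is contended:

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical gate: all writers must pass through it, in arrival order.
class FairWriteGate {
    // true = fair ordering policy
    private final ReentrantLock lock = new ReentrantLock(true);

    // Runs the given write while holding the fair lock.
    void inOrder(Runnable write) {
        lock.lock();
        try {
            write.run();
        } finally {
            lock.unlock();
        }
    }

    boolean isFair() {
        return lock.isFair();
    }
}
```

This only orders threads by when they started waiting for the lock. It says nothing about the order in which the messages were produced, so sequence numbers may still be needed if that matters.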
Maintaining order is not the responsibility of the transport layer. So if you send message A and message B through the same channel (same socket), the order in which they arrive on the server side is uncertain.
Messages arrive in the order they were sent if these conditions are met:
use the TCP protocol
message A and message B are sent in one system call
If you need ordering, your application should enforce it.