Java NIO - using Selectors - java

Got some quick questions on using a non blocking selector - answering any of the below would be greatly appreciated:
I have just been learning about the selector (java NIO model) and want to clarify some facts. Picked up most of the key points from tutorials as well as the SO question:
Java NIO: Selectors
so:
from the question linked above, a Server socket channel is registered to listen for accept, when accepted a channel is registered for read and once read it is registered for write. How does it then go back into listening for an accept? I mean there appears to be no repeated call to a register accept listener.
Is it just that once a ServerSocketChannel is registered the selector will always be listening for an accept? If so doesn't this mean that the selector will always be listening for a read/write on the channels also? How do you stop that - is it as simple as closing the channels?
What happens if a channel is registered twice? I mean - in the above example, a socket is registered for write every time after a message is read - basically looks like an echo server. I guess the example assumes that the connection will then be closed after writing.
What happens if you want to maintain an open connection and at different intervals write to the socket - must you register an interestOp each time? Is it harmful to register the same channel twice?
Multithreading:
What is the best design now for writing a multithreaded server - as the above non blocking approach assumes that one thread can handle writing and reading. Should you create a new thread if the read/write operation is expected to take a while? Are there any other points to consider?
Understand that multithreaded is for blocking IO but have seen some frameworks that use a thread pool along with a Non blocking model and was wondering why.

a Server socket channel is registered to listen for accept, when accepted a channel is registered for read and once read it is registered for write. How does it then go back into listening for an accept?
It doesn't have to 'go back'. It's listening. It's registered. Nothing has changed.
What happens if a channel is registered twice?
The interestOps are updated and the previous key attachment is lost.
I mean - in the above example, a socket is registered for write every time after a message is read.
It's a very poor example. Incorrect technique. You should just write. If write() returns zero, then you should (a) register the channel for OP_WRITE; (b) retry the write when the channel becomes writable; and (c) deregister the channel for OP_WRITE once the write has completed.
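The write-first technique described in steps (a) to (c) can be sketched roughly as follows. This is a minimal sketch under assumptions, not code from the question or from the linked example: the names WriteHelper and writeOrRegister are hypothetical, the channel is assumed to be in non-blocking mode and already registered with a selector, and the caller is assumed to keep the pending buffer and call the method again when key.isWritable() is reported.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.SocketChannel;

// Hypothetical helper, for illustration only.
final class WriteHelper {
    /**
     * Writes as much of 'pending' as the socket accepts right now.
     * Returns true once the buffer is fully written, false if the send
     * buffer filled up and the caller must retry when the key is writable.
     */
    static boolean writeOrRegister(SelectionKey key, ByteBuffer pending) throws IOException {
        SocketChannel channel = (SocketChannel) key.channel();
        while (pending.hasRemaining()) {
            if (channel.write(pending) == 0) {
                // (a) Send buffer is full: ask the selector to report writability.
                key.interestOps(key.interestOps() | SelectionKey.OP_WRITE);
                return false; // (b) caller retries with the same buffer later
            }
        }
        // (c) Write finished: drop OP_WRITE so select() does not spin.
        key.interestOps(key.interestOps() & ~SelectionKey.OP_WRITE);
        return true;
    }
}
```

In the select loop, a key for which isWritable() returns true would simply call writeOrRegister again with the buffer that was left over. Other interest bits such as OP_READ are left untouched, which is why the channel never needs to be registered a second time.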
I guess this assumes that the connection will then be closed after writing.
I don't see why. It doesn't follow. In any case it's a poor example, not to be imitated. You would be better off imitating answers here, rather than unanswered questions.
What happens if you want to maintain an open connection and at different intervals write to the socket - must you register an interestOp each time? Is it harmful to register the same channel twice?
See above. Keep the channel open. Its interestOps won't change unless you change them; and you should handle writing as above, not as per the poor example you cited.
Multithreading: What is the best design now for writing a multithreaded server - as the above non blocking approach assumes that one thread can handle writing and reading.
If you want multi-threading, use blocking I/O, most simply via java.net. If you want non-blocking I/O you don't need multi-threading. I don't understand why you're asking about both in the same question.
Should you create a new thread if the read/write operation is expected to take a while?
In the most general case of blocking I/O you have two threads per connection: one for reading and one for writing.

Related

How to publish messages to rabbitmq with high tps, multithreading

Since a channel is not thread safe, I can either synchronize on the channel instance before publishing, or create a channel each time I need one and close it afterwards.
But in my opinion neither approach performs well, due to the cost of locking or of creating and destroying channels.
So how should I publish messages to RabbitMQ at high TPS? Is there any good practice for this?
So, first thing first. A channel is not a connection. In RabbitMQ, Channels are the same thing as application sessions, whereas a connection represents an underlying TCP session with the server. This answer explains some of that nicely.
TCP sessions are expensive to create, so they tend to have a lifetime outside the lifetime of any particular worker thread. Channels are extremely cheap to create - all the server does is assign you an integer for your channel identifier and you have a new channel.
Most operations on RabbitMQ close a channel when they fail. This is done because there is no practical consequence of doing so. If they closed the underlying connection instead, that would cause a lot of problems for the app.
Design Guidance
Pooling would be appropriate for connections, if you really have a lot of processing going on. A discussion on how to do this is really beyond what I can provide in a short answer.
Pooling is absolutely not appropriate for channels. A channel is a lightweight construct that is designed to have a transient lifetime. If it lasts longer than one or two publishes, great. But you should expect that every time you try an operation, there is a possibility it will fail and close the channel. This does not close the underlying connection, but a new channel will have to be reestablished to do anything with the broker.
Consumer lifetimes are tied to channels. When the channel closes, the attached consumer is closed too. Design your consumer objects (worker threads) to be able to get a connection and create a new channel when this happens, and then re-subscribe themselves.
Avoid sharing a channel across threads. One thread = one channel.
While I don't have any particular experience with the Java client, I don't believe locks should be necessary, and I would certainly hope that the implementation doesn't do something screwy to make Channels anything other than lightweight.
If you're writing your own protocol implementation library (not necessary, but also not a bad idea if you need fine-grained control), allocate one thread to manage each connection. Don't do reads and writes to the TCP socket in parallel, or you'll break the protocol.
Regarding the Java client, I think you can assume that channel operations (reads and writes, etc.) are NOT thread-safe, which is why you want to stick to one thread/one channel paradigm. I think you can assume that creating channels (a connection operation) is thread-safe.
You should use a pool.
For instance, use Apache's Generic Object Pool and provide an implementation for opening, closing and checking connections. When you need to publish a message, you borrow a channel from the pool, use it, and return it.

Java: Managing more connections than there are threads, using a queue

For an exercise, we are to implement a server that has a thread that listens for connections, accepts them and throws the socket into a BlockingQueue. A set of worker threads in a pool then goes through the queue and processes the requests coming in through the sockets.
Each client connects to the server, sends a large number of requests (waiting for the response before sending the next request) and eventually disconnects when done.
My current approach is to have each worker thread waiting on the queue, getting a socket, then processing one request, and finally putting the (still open) socket back into the queue before handling another request, potentially from a different client. There are many more clients than there are worker threads, so many connections queue up.
The problem with this approach: A thread will be blocked by a client even if the client doesn't send anything. Possible pseudo-solutions, all not satisfactory:
Call available() on the inputStream and put the connection back into the queue if it returns 0. The problem: It's impossible to detect if the client is still connected.
As above but use socket.isClosed() or socket.isConnected() to figure out if the client is still connected. The problem: Neither method detects a client hangup, as described nicely by EJP in Java socket API: How to tell if a connection has been closed?
Probe if the client is still there by reading from or writing to it. The problem: Reading blocks (i.e. back to the original situation where an inactive client blocks the queue) and writing actually sends something to the client, making the tests fail.
Is there a way to solve this problem? I.e. is it possible to distinguish a disconnected client from a passive client without blocking or sending something?
Short answer: no. For a longer answer, refer to the one by EJP.
Which is why you probably shouldn't put the socket back on the queue at all, but rather handle all the requests from the socket, then close it. Passing the connection to different worker threads to handle requests separately won't give you any advantage.
If you have badly behaving clients you can use a read timeout on the socket, so reading will block only until the timeout occurs. Then you can close that socket, because your server doesn't have time to cater to clients that don't behave nicely.
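A read timeout of this kind can be sketched as below. This is only an illustration under assumptions: the class TimedRead, the Result enum and the poll method are hypothetical names, the timeout is assumed to be greater than zero (zero means "wait forever" for setSoTimeout), and the buffer is assumed to be non-empty.

```java
import java.io.IOException;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Hypothetical helper, for illustration only.
final class TimedRead {
    enum Result { DATA, IDLE, CLOSED }

    /**
     * Attempts one blocking read that gives up after timeoutMillis.
     * DATA   = some bytes were read into buffer
     * IDLE   = nothing arrived in time; the socket is still usable
     * CLOSED = the peer closed its end (read returned -1)
     */
    static Result poll(Socket socket, int timeoutMillis, byte[] buffer) throws IOException {
        socket.setSoTimeout(timeoutMillis);
        try {
            int n = socket.getInputStream().read(buffer);
            return n == -1 ? Result.CLOSED : Result.DATA;
        } catch (SocketTimeoutException e) {
            // The timeout expired; the connection itself remains valid.
            return Result.IDLE;
        }
    }
}
```

A server following the advice above would close the socket on IDLE as well as on CLOSED; a more lenient one could count consecutive IDLE results before giving up on the client.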
Is there a way to solve this problem? I.e. is it possible to distinguish a disconnected client from a passive client without blocking or sending something?
Not really when using blocking IO.
You could look into the non-blocking (NIO) package, which deals with things a little differently.
In essence you have a socket which can be registered with a "selector". If you register sockets for "is data ready to be read" you can then determine which sockets to read from without having to poll individually.
Same sort of thing for writing.
Here is a tutorial on writing NIO servers
Turns out the problem is solvable with a few tricks. After long discussions with several people, I combined their ideas to get the job done in reasonable time:
After creating the socket, configure it such that a blocking read will only block for a certain time, say 100ms: socket.setSoTimeout(100);
Additionally, record the timestamp of the last successful read of each connection, e.g. with System.currentTimeMillis()
In principle (see below for exception to this principle), run available() on the connection before reading. If this returns 0, put the connection back into the queue since there is nothing to read.
Exception to the above principle in which case available() is not used: If the timestamp is too old (say, more than 1 second), use read() to actually block on the connection. This will not take longer than the SoTimeout that you set above for the socket. If you get a SocketTimeoutException, put the connection back into the queue. If you read -1, throw the connection away since it was closed by the remote end.
With this strategy, most read attempts terminate immediately, either returning some data or nothing because they were skipped since there was nothing available(). If the other end closed its connection, we will detect this within one second since the timestamp of the last successful read is too old. In this case, we perform an actual read that will return -1, and we can then close the socket ourselves (isClosed() only reports a close made on our own side). And in the case where the socket is still open but the queue is so long that we have more than a second of delay, it takes us an additional 100ms to find out that the connection is still there but not ready.
EDIT: An enhancement of this is to change "last successful read" to "last blocking read" and also update the timestamp when getting a SocketTimeoutException.
No, the only way to discern an inactive client from a client that didn't shut down their socket properly is to send a ping or something to check if they're still there.
The possible solutions I can see are:
Kick clients that haven't sent anything for a while. You would have to keep track of how long they've been quiet for, and once they reach a limit you assume they've disconnected.
Ping the client to see if they're still there. I know you asked for a way to do this without sending anything, but if this is really a problem, i.e. you can't use the above solution, this is probably the best way to do it depending on the specifics (since it's an exercise you might have to imagine the specifics).
A mix of both; actually this is probably better. Keep track of how long they've been quiet for, and after a bit send them a ping to see if they are still alive.

From classic multithreaded to java.nio asynchronous/non-blocking server

I'm the main developer of an online game.
Players use a specific client software that connects to the game server with TCP/IP (TCP, not UDP)
At the moment, the architecture of the server is a classic multithreaded server with one thread per connection.
But in peak hours, when there are often 300 or 400 connected people, the server is getting more and more laggy.
I was wondering whether performance would be better if I switched to a java.nio.* asynchronous I/O model with a few threads managing many connections.
Finding example code on the web that covers the basics of such a server architecture is very easy. However, after hours of googling, I didn't find the answers to some more advanced questions:
1 - The protocol is text-based, not binary-based. The clients and the server exchange lines of text encoded in UTF-8. A single line of text represents a single command, and each line is properly terminated by \n or \r\n.
For the classic multithreaded server, I have that kind of code :
public Connection (Socket sock) {
    this.in = new BufferedReader(new InputStreamReader(sock.getInputStream(), "UTF-8"));
    this.out = new BufferedWriter(new OutputStreamWriter(sock.getOutputStream(), "UTF-8"));
    new Thread(this).start();
}
And then in run, data are read line by line with readLine.
In the doc, I found a utility class Channels that can create a Reader out of a SocketChannel. But it is said that the produced Reader won't work if the Channel is in non-blocking mode, which contradicts the fact that non-blocking mode is mandatory to use the highly performant channel selection API I'm willing to use. So, I suspect that it isn't the right solution for what I would like to do.
The first question is therefore the following: if I can't use that, how do I efficiently and properly take care of breaking lines and converting native java strings from/to UTF-8 encoded data in the nio API, with buffers and channels?
Do I have to play with get/put or inside the wrapped byte array by hand? How do I go from a ByteBuffer to strings encoded in UTF-8? I admit I don't understand very well how to use the classes in the charset package and how they work to do that.
2 - In the asynchronous/non-blocking I/O world, what about the handling of consecutive reads/writes that by nature have to be executed sequentially, one after the other?
For example, the login procedure, which is typically challenge-response-based: the server sends a question (a particular computation), the client sends the response, and then the server checks the response given by the client.
The answer is, I think, certainly not to make a single task to send to worker threads for the whole login process, as it is quite long, with the risk to freeze worker threads for too much time (Imagine that scenario: 10 pool threads, 10 players try to connect at the same time; tasks related to players already online are delayed until one thread is again ready).
3 - What happens if two different threads simultaneously call Channel.write(ByteBuffer) on the same Channel?
Might the client receive mixed up lines? For example if a thread sends "aaaaa" and another sends "bbbbb", could the client receive "aaabbbbbaa", or am I assured that everything is sent in a consistent order? Am I allowed to modify the buffer used right after the call returned?
Or asked differently, do I need additional synchronization to avoid this sort of situation?
If I need additional synchronization, how do I know when to release locks and so on, once the write finishes?
I'm afraid that the answer isn't as simple as registering for OP_WRITE in the selector. By trying that, I noticed that I get the write-ready event all the time and always for all clients, exiting Selector.select early mostly for nothing, since there are only 3 or 4 messages to send per second per client, while the selection loop is performed hundreds of times per second. So, potentially, active waiting in perspective, which is very bad.
4 - Can multiple threads call Selector.select on the same selector simultaneously without any concurrency problems such as missing an event, scheduling it twice, etc?
5 - In fact, is nio as good as it is said to be? Would it be interesting to stay with the classic multithreaded model, but instead of creating a thread per connection, use fewer threads and loop over the connections to look for data availability using InputStream.available()? Is that idea stupid and/or inefficient?
1) Yes. I think that you need to write your own nonblocking readLine method. Note also that a nonblocking read may be signaled when there are several lines in the buffer, or when there is an incomplete line:
Example: (first read)
USER foo
PASS
(second read)
bar
You will need to store (see 2) the data that was not consumed, until enough information is ready to process it.
//channel was selected for OP_READ
read data from channel
prepend data from previous read
split complete lines
save incomplete line
execute commands
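The pseudocode above could be fleshed out along the following lines. This is a sketch under assumptions rather than a definitive implementation: LineAssembler and feed are hypothetical names, one instance is assumed per connection, and lines are assumed to end in \n or \r\n as stated in the question. Splitting is done on raw bytes before decoding, which is safe for UTF-8 because the byte 0x0A never occurs inside a multi-byte sequence.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical per-connection helper, for illustration only.
final class LineAssembler {
    // Bytes of the current, still incomplete line (kept between reads).
    private final ByteArrayOutputStream partial = new ByteArrayOutputStream();

    /** Consumes the readable bytes of buf and returns the lines they complete. */
    List<String> feed(ByteBuffer buf) {
        List<String> lines = new ArrayList<>();
        while (buf.hasRemaining()) {
            byte b = buf.get();
            if (b == '\n') {
                byte[] raw = partial.toByteArray();
                int len = raw.length;
                if (len > 0 && raw[len - 1] == '\r') {
                    len--; // accept \r\n as well as \n
                }
                // Decode only complete lines, so split characters are never decoded.
                lines.add(new String(raw, 0, len, StandardCharsets.UTF_8));
                partial.reset();
            } else {
                partial.write(b); // save the incomplete line
            }
        }
        return lines;
    }
}
```

After each channel.read(buffer) the caller would flip() the buffer, pass it to feed, execute the returned commands and then clear() the buffer for the next read.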
2) You will need to keep the state of each client.
Map<SocketChannel,State> clients = new HashMap<SocketChannel,State>();
when a channel is connected, put a fresh state into the map
clients.put(channel,new State());
Or store the current state as the attached object of the SelectionKey.
Then, when executing each command, update the state. You may write it as a monolithic method, or do something more fancy such as polymorphic implementations of State, where each state knows how to deal with some commands (e.g. LoginState expects USER and PASS, then you change the state into a new AuthorizedState).
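The polymorphic variant might look roughly like this. It is a sketch under assumptions: the names LoginStates, State, AwaitUser, AwaitPass and Authorized are hypothetical, as are the "OK"/"ERR"/"ECHO" reply strings, and the password check is a placeholder that accepts any non-empty password.

```java
// Hypothetical state classes, for illustration only.
final class LoginStates {
    /** One protocol step: consume a line, append any reply, return the next state. */
    interface State {
        State onLine(String line, StringBuilder reply);
    }

    /** Initial state: only USER is accepted. */
    static final class AwaitUser implements State {
        @Override
        public State onLine(String line, StringBuilder reply) {
            if (line.startsWith("USER ")) {
                return new AwaitPass(line.substring(5));
            }
            reply.append("ERR expected USER\n");
            return this;
        }
    }

    /** USER was seen: only PASS is accepted. */
    static final class AwaitPass implements State {
        private final String user;
        AwaitPass(String user) { this.user = user; }

        @Override
        public State onLine(String line, StringBuilder reply) {
            // Placeholder check: a real server would verify the credentials here.
            if (line.startsWith("PASS ") && line.length() > 5) {
                reply.append("OK ").append(user).append('\n');
                return new Authorized(user);
            }
            reply.append("ERR login failed\n");
            return new AwaitUser();
        }
    }

    /** Logged in: placeholder command handling that echoes the line. */
    static final class Authorized implements State {
        final String user;
        Authorized(String user) { this.user = user; }

        @Override
        public State onLine(String line, StringBuilder reply) {
            reply.append("ECHO ").append(line).append('\n');
            return this;
        }
    }
}
```

With the attachment approach, the current state could be stored with key.attach(new LoginStates.AwaitUser()) when the client is accepted, read back with (LoginStates.State) key.attachment() for each line, and replaced with key.attach(next) after each transition. No worker thread is held while the server waits for the client's next line.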
3) I don't recall using NIO with many asynchronous writers per channel, but the documentation says it is thread safe (I won't elaborate, since I have no proof of this). About OP_WRITE, note that it signals when the write buffer is not full. In other words, as said here: OP_WRITE is almost always ready, i.e. except when the socket send buffer is full, so you will just cause your Selector.select() method to spin mindlessly.
4) Yes. Selector.select() performs a blocking selection operation.
5) I think that the most difficult part is switching from a thread-per-client architecture, to a different design where reads and writes are decoupled from processing. Once you have done that, it is easier to work with channels than working your own way with blocking streams.

Event driven server in Java

I am trying to write an event driven HTTP web server. Because I will be using only one thread, the events have to be queued up and handled asynchronously (I am also using Java NIO). However, I am stuck at the very first step. I have opened a ServerSocketChannel, but I am not sure how to get a new SocketChannel connection when a request comes in. Is there an operating system queue that I can access through Java? (I am not sure, as Java is OS independent.) I do not want to use any blocking calls.
If I am proceeding in the wrong direction, any help would be appreciated.
thanks.
You need to:
create a Selector
put the ServerSocketChannel into non-blocking mode
register the SSC with the Selector using OP_ACCEPT
write a select() loop, which you will find in the NIO tutorial
In the select() loop you will find keys for which isAcceptable() returns true: that means you need to call ServerSocketChannel.accept() to accept a connection. That returns a SocketChannel, which you must then put into non-blocking mode and register with OP_READ.
In turn that will cause keys for which isReadable() returns true: that means you should read the associated SocketChannel.
You will find examples of all this in the NIO Tutorial. It gets much more complicated than this ;-)
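The steps listed in this answer can be sketched as follows. This is a minimal illustration, not the tutorial's code: the class AcceptLoop and the methods open and selectOnce are hypothetical names, and a real server would call selectOnce in a loop and parse the bytes it reads instead of just counting them.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.Iterator;

// Hypothetical helper, for illustration only.
final class AcceptLoop {
    /** Opens a non-blocking server channel and registers it for OP_ACCEPT. */
    static ServerSocketChannel open(Selector selector, int port) throws IOException {
        ServerSocketChannel server = ServerSocketChannel.open();
        server.configureBlocking(false);
        server.bind(new InetSocketAddress(port));
        server.register(selector, SelectionKey.OP_ACCEPT);
        return server;
    }

    /** Runs one pass of the select loop; returns the number of bytes read. */
    static int selectOnce(Selector selector, ByteBuffer buffer, long timeoutMillis) throws IOException {
        int bytesRead = 0;
        selector.select(timeoutMillis);
        Iterator<SelectionKey> it = selector.selectedKeys().iterator();
        while (it.hasNext()) {
            SelectionKey key = it.next();
            it.remove(); // selected keys must be removed by hand
            if (!key.isValid()) {
                continue;
            }
            if (key.isAcceptable()) {
                SocketChannel client = ((ServerSocketChannel) key.channel()).accept();
                if (client != null) {
                    client.configureBlocking(false);
                    client.register(selector, SelectionKey.OP_READ);
                }
            } else if (key.isReadable()) {
                SocketChannel client = (SocketChannel) key.channel();
                int n = client.read(buffer);
                if (n == -1) {
                    client.close(); // peer closed; closing also cancels the key
                } else {
                    bytesRead += n;
                }
            }
        }
        return bytesRead;
    }
}
```

The server channel's key keeps its OP_ACCEPT interest across passes, so new connections continue to be reported without registering it again.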

Is it possible to close Java sockets on both client and server sides?

I have a socket tcp connection between two java applications. When one side closes the socket the other side remains open. but I want it to be closed. And also I can't wait on it to see whether it is available or not and after that close it. I want some way to close it completely from one side.
What can I do?
TCP doesn't work like this. The OS won't release the resources, namely the file descriptor and thus the port, until the application explicitly closes the socket or dies, even if the TCP stack knows that the other side closed it. There's no callback from kernel to user application on receipt of the FIN from the peer. The OS acknowledges it to the other side but waits for the application to call close() before sending its FIN packet. Take a look at the TCP state transition diagram - you are in the passive close box.
One way to detect a situation like this without dedicating a thread to each socket is to use the select/poll/epoll/kqueue family of functions. The socket being passively closed will be signaled as readable, and a read attempt will return EOF.
Hope this helps.
Both sides have to read from the connection, so they can detect when the peer has closed. When read returns -1 it will mean the other end closed the connection and that's your clue to close your end.
If you are still reading from your socket, then you will detect the -1 when it closes.
If you are no longer reading from your socket, go ahead and close it.
If it's neither of these, you are probably having a thread wait on an event. This is NOT the way you want to handle thousands of ports! Java will start to get pukey at around 3000 threads in windows--much less in Linux (I don't know why).
Make sure you are using NIO. Use a single thread to manage all your ports (connection pool). It should just grab the data from a port and forward it to a queue. At that point I think I'd have a thread pool take the data out of the queues and process it, because actually processing the data from a port will take some time.
Attaching a thread to each port will NOT work, and is the biggest reason NIO was needed.
Also, having some kind of a "Close" message as part of your stream to trigger closing the port may make things work faster, but you'll still need to handle the -1 to cover the case of broken streams.
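The rule from this answer (keep reading, and treat -1 as the cue to close your own end) can be sketched with plain blocking sockets as below. The names EofDemo and readUntilPeerCloses are hypothetical, and the sketch simply counts the bytes instead of processing them.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.Socket;

// Hypothetical helper, for illustration only.
final class EofDemo {
    /**
     * Reads until the peer closes its end (read returns -1),
     * then closes our end too. Returns the number of bytes received.
     */
    static long readUntilPeerCloses(Socket socket) throws IOException {
        long total = 0;
        byte[] buf = new byte[4096];
        // try-with-resources closes our side once the loop ends or fails.
        try (Socket s = socket; InputStream in = s.getInputStream()) {
            int n;
            while ((n = in.read(buf)) != -1) {
                total += n; // a real server would process buf[0..n) here
            }
        }
        return total;
    }
}
```

This detects an orderly close by the peer. A peer that vanishes without closing is not detected by read alone, which is where a read timeout or an application-level "Close" message comes in.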
The usual solution is to let the other side know you are going to close the connection, before actually closing it. For instance, in the case of the SMTP protocol, the server will send '221 Bye' before it closes the connection.
You probably want to have a connection pool.
