Reading from socket input stream - java

I'm trying to determine the best way to transfer data through a socket between a client and server. Currently I have a BufferedReader that reads one character at a time (or however many characters have arrived since the last iteration). Through each iteration, it pulls the data received so far and puts it into an array. When the '|' character is read, it knows that the current instruction is done.
I know what I have so far is grossly inefficient and burns the CPU, but I'm a little unclear as to the differences between all the ways to read from a socket input stream. What would I use to not have to read each character at a time, but rather to wait until the input stream is finished receiving the current instruction (which would be terminated by "\n")?

I find the best way is to create a new thread for each socket/client that listens for input. readLine() always works for me fine. Perhaps this might help you out a bit.

Related

Is available method in InputStream a good way to iterate through a file?

From the document, the function of the "available" method is:
returns an estimate of the number of bytes that can be read (or
skipped over) from this input stream without blocking by the next
invocation of a method for this input stream.
So, how long does it take for this method to return a result. If I have a file with 10000 words, and I want to go through each word by checking like this:
while (steam.available() > 0) {
steam.read(); // suppose that this read a word
}
So after each reading the first word, is the method going to go through the next 9999 words? And, after the second word, do it check the next 9998 words?
From the document, it say that the method "estimate the number of bytes", then how does it do that?
As it states, the purpose is to tell you how many bytes you can read without the read call blocking. This is mostly useful for network connections, where data is filling the buffer and you might want to process as much of that data without the read call blocking, waiting for more data.
It's not commonly used and doesn't tell you anything about how much is GOING to be available over all. For example, iv seen it used to test the length of a message, which is of course wrong, because only a part of the message may have been received at that point.
You are best to just read the whole stream until EOF is reached. available() will only be of use if you want to process as much data as you can without blocking. it says "estimate" because more data could be coming in all the time and you may have been able to read more bytes than available() returned at the exact moment you called it.
In practice, you need all the data from a stream, or you stop when you reach a certain value. But this is a separate issue to how quickly it streams in from where ever it's coming from. Wether it blocks or not - you will neither know nor care. :)

From classic multithreaded to java.nio asynchronous/non-blocking server

I'm the main developer of an online game.
Players use a specific client software that connects to the game server with TCP/IP (TCP, not UDP)
At the moment, the architecture of the server is a classic multithreaded server with one thread per connection.
But in peak hours, when there are often 300 or 400 connected people, the server is getting more and more laggy.
I was wondering, if by switching to a java.nio.* asynchronous I/O model with few threads managing many connections, if the performances would be better.
Finding example codes on the web that cover the basics of such a server architecture is very easy. However, after hours of googling, I didn't find the answers to some more advanced questions:
1 - The protocol is text-based, not binary-based. The clients and the server exchanges lines of text encoded in UTF-8. A single line of text represents a single command, each lines are properly terminated by \n or \r\n.
For the classic multithreaded server, I have that kind of code :
public Connection (Socket sock) {
this.in = new BufferedReader( new InputStreamReader( sock.getInputStream(), "UTF-8" ));
this.out = new BufferedWriter( new OutputStreamWriter(sock.getOutputStream(), "UTF-8"));
new Thread(this) .start();
}
And then in run, data are read line by line with readLine.
In the doc, I found an utilitiy class Channels that can create a Reader out of a SocketChannel. But it is said that the produced Reader wont work if the Channel is in non-blocking mode, what contradicts the fact that non-blocking mode is mandatory to use the highly performant channel selection API I'm willing to use. So, I suspect that it isn't the right solution for what I would like to do.
The first question is therefore the following: if I can't use that, how to efficiently and properly take care of breaking lines and converting native java strings from/to UTF-8 encoded data in the nio API, with buffers and channels?
Do I have to play with get/put or inside the wrapped byte array by hand? How to go from ByteBuffer to strings encoded in UTF-8 ? I admit to don't understand very well how to use classes in the charset package and how it works to do that.
2 - In the asynchronous/non-blocking I/O world, what about the handling of consecutive read/write that have by nature to be executed sequencially one after the other?
For example, the login procedure, which is typicly challenge-response-based: the server sends a question (a particular computation), the client sends the response, and then the server checks the response given by the client.
The answer is, I think, certainly not to make a single task to send to worker threads for the whole login process, as it is quite long, with the risk to freeze worker threads for too much time (Imagine that scenario: 10 pool threads, 10 players try to connect at the same time; tasks related to players already online are delayed until one thread is again ready).
3 - What happens if two different threads simultaneously call Channel.write(ByteBuffer) on the same Channel?
Do the client might receive mixed up lines ? For example if a thread sends "aaaaa" and another sends "bbbbb", could the client receive "aaabbbbbaa", or am I ensured that everyting is sent in a consist order? Am I allowed to modify the buffer used right after the call returned?
Or asked differently, do I need additional synchronization to avoid this sort of situation?
If I need additionnal synchronization, how to know when release locks and so on, upon write finishes?
I'm afraid that the answer isn't as simple as registering for OP_WRITE in the selector. By trying that, I noticed that I get the write-ready event all the time and always for all clients, exiting Selector.select early mostly for nothing, since there are only 3 or 4 messages to send pers second per client, while the selection loop is performed hundreds of times per second. So, potentially, active wait in perspective, what is very bad.
4 - Can multiple threads call Selector.select on the same selector simultaneously without any concurrency problems such as missing an event, scheduling it twice, etc?
5 - In fact, is nio as good as it is said to be ? Would it be interesting to stay to classic multithreaded model, but unstead of creating a thread per connection, use fewer threads and loop over the connections to look for data availability using InputStream.isAvailable ? Is that idea stupid and/or inefficient?
1) Yes. I think that you need to write your own nonblocking readLine method. Note also that a nonblocking read may be signaled when there are several lines in the buffer, or when there is an incomplete line:
Example: (first read)
USER foo
PASS
(second read)
bar
You will need to store (see 2) the data that was not consumed, until enough information is ready to process it.
//channel was select for OP_READ
read data from channel
prepend data from previous read
split complete lines
save incomplete line
execute commands
2) You will need to keep the state of each client.
Map<SocketChannel,State> clients = new HashMap<SocketChannel,State>();
when a channel is connected, put a fresh state into the map
clients.put(channel,new State());
Or store the current state as the attached object of the SelectionKey.
Then, when executing each command, update the state. You may write it as a monolithic method, or do something more fancy such as polymorphic implementations of State, where each state knows how to deal with some commands (e.g. LoginState expects USER and PASS, then you change the state into a new AuthorizedState).
3) I don't recall using NIO with many asynchronous writers per channel, but the documentation says it is thread safe (I won't elaborate, since I have no proof of this). About OP_WRITE, note that it signals when the write buffer is not full. In other words, as said here: OP_WRITE is almost always ready, i.e. except when the socket send buffer is full, so you will just cause your Selector.select() method to spin mindlessly.
4) Yes. Selector.select() performs a blocking selection operation.
5) I think that the most difficult part is switching from a thread-per-client architecture, to a different design where reads and writes are decoupled from processing. Once you have done that, it is easier to work with channels than working your own way with blocking streams.

Write contents of an InputStream (blocking) to a non-blocking socket

I'm programming a simple Java NIO server and have a little headache: I get normal InputStreams i need to pipe to my clients. I have a single thread performing all writes, so this creates a problem: if the InputStream blocks, all other connection writing will be paused.
I can use InputStream.available() to check if there are any incoming data I can read without blocking, but if I've reached end-of-stream it seems I must call read() to know.
This creates a major headache for me, but I can't possibly believe I'm the first to have this problem.
The only options I've come up with so far:
Have a separate thread for each InputStream, however that's just silly since I'm using non-blocking I/O in the first place. I could also have a thread pool doing this but then again that limits the amount of simultaneous clients I can pipe the InputStream to.
Have a separate thread reading these streams with a timeout (using another thread to interrupt if reading has lasted longer than a certain amount of time), but that'll most certainly choke the data flow should I have many open InputStreams not delivering data.
Of course, if there was a magic InputStream.isEof() or isClosed() then this wouldn't be any problem at all :'(
".....Have a separate thread for each InputStream, however that's just silly since I'm using non-blocking I/O in the first place...."
It's not silly at all. First you need to check whether you can retrieve a SelectableChannel from your InputStream implementation. If it does you are lucky and you can just register it with a selector and do as usual. But chances are that your InputStream may have a channel that's not a SelectableChannel, in which case "Have a separate thread for each InputStream" is the obvious thing to do and probably the right thing to do.
Note that there is a similar problem discussed in SO about not able to get a SelectableChannel from an inputstream. Unfortunately you are stuck.
I have a single thread performing all
writes
Have you stopped to consider whether that is part of the problem rather than part of the solution?

Java NIO: How to know when SocketChannel read() is complete with non-blocking I/O

I am currently using a non-blocking SocketChannel (Java 1.6) to act as a client to a Redis server. Redis accepts plain-text commands directly over a socket, terminated by CRLF and responds in-like, a quick example:
SEND: 'PING\r\n'
RECV: '+PONG\r\n'
Redis can also return huge replies (depending on what you are asking for) with many sections of \r\n-terminated data all as part of a single response.
I am using a standard while(socket.read() > 0) {//append bytes} loop to read bytes from the socket and re-assemble them client side into a reply.
NOTE: I am not using a Selector, just multiple, client-side SocketChannels connected to the server, waiting to service send/receive commands.
What I'm confused about is the contract of the SocketChannel.read() method in non-blocking mode, specifically, how to know when the server is done sending and I have the entire message.
I have a few methods to protect against returning too fast and giving the server a chance to reply, but the one thing I'm stuck on is:
Is it ever possible for read() to return bytes, then on a subsequent call return no bytes, but on another subsequent call again return some bytes?
Basically, can I trust that the server is done responding to me if I have received at least 1 byte and eventually read() returns 0 then I know I'm done, or is it possible the server was just busy and might sputter back some more bytes if I wait and keep trying?
If it can keep sending bytes even after a read() has returned 0 bytes (after previous successful reads) then I have no idea how to tell when the server is done talking to me and in-fact am confused how java.io.* style communications would even know when the server is "done" either.
As you guys know read never returns -1 unless the connection is dead and these are standard long-lived DB connections, so I won't be closing and opening them on each request.
I know a popular response (atleast for these NIO questions) have been to look at Grizzly, MINA or Netty -- if possible I'd really like to learn how this all works in it's raw state before adopting some 3rd party dependencies.
Thank you.
Bonus Question:
I originally thought a blocking SocketChannel would be the way to go with this as I don't really want a caller to do anything until I process their command and give them back a reply anyway.
If that ends up being a better way to go, I was a bit confused seeing that SocketChannel.read() blocks as long as there aren't bytes sufficient to fill the given buffer... short of reading everything byte-by-byte I can't figure out how this default behavior is actually meant to be used... I never know the exact size of the reply coming back from the server, so my calls to SocketChannel.read() always block until a time out (at which point I finally see that the content was sitting in the buffer).
I'm not real clear on the right way to use the blocking method since it always hangs up on a read.
Look to your Redis specifications for this answer.
It's not against the rules for a call to .read() to return 0 bytes on one call and 1 or more bytes on a subsequent call. This is perfectly legal. If anything were to cause a delay in delivery, either because of network lag or slowness in the Redis server, this could happen.
The answer you seek is the same answer to the question: "If I connected manually to the Redis server and sent a command, how could I know when it was done sending the response to me so that I can send another command?"
The answer must be found in the Redis specification. If there's not a global token that the server sends when it is done executing your command, then this may be implemented on a command-by-command basis. If the Redis specifications do not allow for this, then this is a fault in the Redis specifications. They should tell you how to tell when they have sent all their data. This is why shells have command prompts. Redis should have an equivalent.
In the case that Redis does not have this in their specifications, then I would suggest putting in some sort of timer functionality. Code your thread handling the socket to signal that a command is completed after no data has been received for a designated period of time, like five seconds. Choose a period of time that is significantly longer than the longest command takes to execute on the server.
If it can keep sending bytes even after a read() has returned 0 bytes (after previous successful reads) then I have no idea how to tell when the server is done talking to me and in-fact am confused how java.io.* style communications would even know when the server is "done" either.
Read and follow the protocol:
http://redis.io/topics/protocol
The spec describes the possible types of replies and how to recognize them. Some are line terminated, while multi-line responses include a prefix count.
Replies
Redis will reply to commands with different kinds of replies. It is possible to check the kind of reply from the first byte sent by the server:
With a single line reply the first byte of the reply will be "+"
With an error message the first byte of the reply will be "-"
With an integer number the first byte of the reply will be ":"
With bulk reply the first byte of the reply will be "$"
With multi-bulk reply the first byte of the reply will be "*"
Single line reply
A single line reply is in the form of a single line string starting with "+" terminated by "\r\n". ...
...
Multi-bulk replies
Commands like LRANGE need to return multiple values (every element of the list is a value, and LRANGE needs to return more than a single element). This is accomplished using multiple bulk writes, prefixed by an initial line indicating how many bulk writes will follow.
Is it ever possible for read() to return bytes, then on a subsequent call return no bytes, but on another subsequent call again return some bytes? Basically, can I trust that the server is done responding to me if I have received at least 1 byte and eventually read() returns 0 then I know I'm done, or is it possible the server was just busy and might sputter back some more bytes if I wait and keep trying?
Yes, that's possible. Its not just due to the server being busy, but network congestion and downed routes can cause data to "pause". The data is a stream that can "pause" anywhere in the stream without relation to the application protocol.
Keep reading the stream into a buffer. Peek at the first character to determine what type of response to expect. Examine the buffer after each successful read until the buffer contains the full message according to the specification.
I originally thought a blocking SocketChannel would be the way to go with this as I don't really want a caller to do anything until I process their command and give them back a reply anyway.
I think you're right. Based on my quick-look at the spec, blocking reads wouldn't work for this protocol. Since it looks line-based, BufferedReader may help, but you still need to know how to recognize when the response is complete.
I am using a standard
while(socket.read() > 0) {//append
bytes} loop
That is not a standard technique in NIO. You must store the result of the read in a variable, and test it for:
-1, indicating EOS, meaning you should close the channel
zero, meaning there was no data to read, meaning you should return to the select() loop, and
a positive value, meaning you have read that many bytes, which you should then extract and remove from the ByteBuffer (get()/compact()) before continuing.
It's been a long time, but . . .
I am currently using a non-blocking SocketChannel
Just to be clear, SocketChannels are blocking by default; to make them non-blocking, one must explicitly invoke SocketChannel#configureBlocking(false)
I'll assume you did that
I am not using a Selector
Whoa; that's the problem; if you are going to use non-blocking Channels, then you should always use a Selector (at least for reads); otherwise, you run into the confusion you described, viz. read(ByteBuffer) == 0 doesn't mean anything (well, it means that there are no bytes in the tcp receive buffer at this moment).
It's analogous to checking your mailbox and it's empty; does it mean that the letter will never arrive? was never sent?
What I'm confused about is the contract of the SocketChannel.read() method in non-blocking mode, specifically, how to know when the server is done sending and I have the entire message.
There is a contract -> if a Selector has selected a Channel for a read operation, then the next invocation of SocketChannel#read(ByteBuffer) is guaranteed to return > 0 (assuming there's room in the ByteBuffer arg)
Which is why you use a Selector, and because it can in one select call "select" 1Ks of SocketChannels that have bytes ready to read
Now there's nothing wrong with using SocketChannels in their default blocking mode; and given your description (a client or two), there's probably no reason to as its simpler; but if you want to use non-blocking Channels, use a Selector

Java NIO SocketChannel writing problem

I am using Java NIO's SocketChannel to write : int n = socketChannel.write(byteBuffer); Most of the times the data is sent in one or two parts; i.e. if the data could not be sent in one attemmpt, remaining data is retried.
The issue here is, sometimes, the data is not being sent completely in one attempt, rest of the data when tried to send multiple times, it occurs that even after trying several times, not a single character is being written to channel, finally after some time the remaning data is sent. This data may not be large, could be approx 2000 characters.
What could be the cause of such behaviour? Could external factors such as RAM, OS, etc cause the hindarance?
Please help me solve this issue. If any other information is required please let me know.
Thanks
EDIT:
Is there a way in NIO SocketChannel, to check, if the channel could be provided with data to write before actual writing. The intention here is, after attempting to write complete data, if some data hasn't been written on channel, before writing the remaining data can we check if the SocketChannel can take any more data; so instead of attempting multiple times fruitlessly, the thread responsible for writing this data could wait or do something else.
TCP/IP is a streaming protocol. There is no guarantee anywhere at any level that the data you send won't be broken up into single-byte segments, or anything in between that and a single segment as you wrote it.
Your expectations are misplaced.
Re your EDIT, write() will return zero when the socket send buffer fills. When you get that, register the channel for OP_WRITE and stop the write loop. When you get OP_WRITE, deregister it (very important) and continue writing. If write() returns zero again, repeat.
While using TCP, we can write over sender side socket channel only until the socket buffers are filled up and not after that. So, in case the receiver is slow in consuming the data, sender side socket buffers fill up and as you mentioned, write() might return zero.
In any case, when there is some data to be sent on the sender side, we should register the SocketChannel with the selector with OP_WRITE as the interested operation and when selector returns the SelectionKey, check key.isWritable() and try writing on that channel. As mentioned by Nilesh above, don't forget to unregister the OP_WRITE bit with the selector after writing the complete data.

Categories

Resources