Speed difference in write to socket on slow and fast connections? - java

Does a write to a plain OutputStream backed by a Java socket (on the server side) take the same time for clients with a fast connection as for clients with a slow connection?
I would suspect not, but I can also suspect that there is some kind of buffer in front of the socket internally.

There are two socket buffers: one for output, one for input. If you're writing data that fits in the output buffer, the speed of the connection doesn't matter for the first write.
After that there will be a noticeable difference, and writes to the slow connection will block a lot more, waiting for space in the output buffer. Of course that speed difference won't affect the code; you'll just be writing data that eventually gets transferred, whether that happens fast or slow. It's only the end user who notices the slowness.
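As a rough illustration, here is a minimal sketch (the host, port, iteration count and buffer size are placeholders, not from the question) that times successive writes to a socket OutputStream. The early writes return almost immediately because they only copy data into the OS send buffer; later writes block once a slow client stops draining it:

    import java.io.OutputStream;
    import java.net.Socket;

    public class SendBufferDemo {
        public static void main(String[] args) throws Exception {
            // Placeholder endpoint: any server that accepts the connection but reads slowly.
            try (Socket socket = new Socket("example.com", 9000)) {
                OutputStream out = socket.getOutputStream();
                byte[] chunk = new byte[64 * 1024];
                for (int i = 0; i < 1000; i++) {
                    long start = System.nanoTime();
                    out.write(chunk);   // returns quickly while the OS send buffer has room
                    long micros = (System.nanoTime() - start) / 1_000;
                    System.out.println("write " + i + " took " + micros + " us");
                    // Once the send buffer (plus the peer's receive window) is full,
                    // this write() blocks until the slow client drains some data.
                }
            }
        }
    }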

Related

Java Multithreading Network Connection Performance Advice

I need to have lots of network connections open at the same time(!) and transfer data as fast as possible. Thousands of connections. Right now, I have one thread per connection, reading character by character from the InputStream of that connection.
And I have the strong suspicion that the CPU cost of switching between thousands of threads might impose some performance problems here, even though the servers are really slow (low two-digit KB/s), since I've observed that the throughput isn't even close to being proportional to the number of threads.
Therefore I'd like to ask some programmers experienced in parallel programming:
Is it worth rewriting the entire program so that one thread reads from multiple InputStreams in a round-robin fashion? Would that, if there is a speedup, be worth the programming effort? How many connections per thread? Or do you have another idea for reading really, really fast from multiple network input streams?
If I don't read a char, will the server wait to send the next one until I do? What if my thread is sleeping?
reading charwise
You know data is transmitted in packets, right? Reading a single character at a time is very inefficient: each read has to traverse all the layers from your program down to the network stack in the operating system. You should try to read a full buffer of data at a time.
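For example, a minimal sketch of chunked reading (the 8 KB buffer size is an arbitrary choice, not something mandated by the question):

    import java.io.IOException;
    import java.io.InputStream;

    public final class ChunkedReader {
        // Reads the stream in blocks instead of one character at a time,
        // so each call into the OS can move a full buffer's worth of data.
        public static long drain(InputStream in) throws IOException {
            byte[] buffer = new byte[8192];          // arbitrary buffer size
            long total = 0;
            int n;
            while ((n = in.read(buffer)) != -1) {    // a single read may return up to 8 KB
                total += n;
                // process buffer[0..n) here
            }
            return total;
        }
    }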
If I don't read a char, will the server wait to send the next one until I do? What if my thread is sleeping?
That's why the operating system has a buffer for incoming data, also called a window. When TCP segments arrive, they are put into the receive buffer. When your program requests to read from the socket, the operating system returns data from that receive buffer. TCP flow control keeps the sender from sending more than the receiver has advertised space for; if a segment does arrive when the receive buffer is full, it is dropped and has to be sent again.
For more about how TCP works, see https://beej.us/guide/bgnet/
Wikipedia is pretty good but fairly dense
https://en.m.wikipedia.org/wiki/Transmission_Control_Protocol
Is it worth rewriting the entire program so that one thread reads from multiple InputStreams in a round robin like fashion? Would that, if there is a speedup, be worth the programming?
What you're describing would require moving from blocking I/O to non-blocking I/O. Non-blocking will require fewer system resources, but it is significantly harder to implement correctly and efficiently. So don't do it unless you have a pressing reason.
Thousands of threads (and their stacks...) are probably too many for the OS scheduler, memory management units, caches...
You need just a few threads (one per CPU), each running a select()-based loop.
Have a look at Selector, ServerSocketChannel and SocketChannel.
(see pages 30-31 of https://www.enib.fr/~harrouet/Data/Courses/Memo_Sockets.pdf)
Edit (after a question in the comments)
Selector is not just a clever algorithm encapsulated in a class. It relies internally on the select() system call (or an equivalent; there are many).
The operating system is aware of a set of file descriptors (communication channels) it has to watch and, as soon as something happens on one (or several) of them, it wakes up the process (or thread) which is blocked on this selector.
The idea is to stay blocked as long as possible (to save resources) and to be woken up only when something useful has to be done with incoming data (there are variants).
In your current implementation, you use thousands of threads which are all blocked on a read()/recv() operation, because you cannot know beforehand which connection will be the next one to deliver something.
On the other hand, with a select()-based implementation, a single thread can be blocked watching many connections at the same time, but will only wake up to handle the few that have just delivered new data.
So I suggest that you start a pool of a few threads (one per CPU, for example) and, as soon as the main program accepts a new incoming connection, have it choose one of them (you can keep a count for each) to put in charge of this new connection.
All of this requires proper synchronisation of course, and probably a trick (a special file descriptor in the selector, for example) to wake up a blocked thread when it is assigned a new connection.
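For illustration, here is a minimal single-threaded sketch of the Selector / ServerSocketChannel / SocketChannel approach (the port number and buffer size are arbitrary; error handling and the thread-pool hand-off described above are omitted):

    import java.io.IOException;
    import java.net.InetSocketAddress;
    import java.nio.ByteBuffer;
    import java.nio.channels.SelectionKey;
    import java.nio.channels.Selector;
    import java.nio.channels.ServerSocketChannel;
    import java.nio.channels.SocketChannel;
    import java.util.Iterator;

    public class SelectorSketch {
        public static void main(String[] args) throws IOException {
            Selector selector = Selector.open();
            ServerSocketChannel server = ServerSocketChannel.open();
            server.bind(new InetSocketAddress(9000));   // port chosen arbitrarily
            server.configureBlocking(false);
            server.register(selector, SelectionKey.OP_ACCEPT);

            ByteBuffer buffer = ByteBuffer.allocate(8192);
            while (true) {
                selector.select();                      // blocks until at least one channel is ready
                Iterator<SelectionKey> keys = selector.selectedKeys().iterator();
                while (keys.hasNext()) {
                    SelectionKey key = keys.next();
                    keys.remove();
                    if (key.isAcceptable()) {
                        SocketChannel client = server.accept();
                        client.configureBlocking(false);
                        client.register(selector, SelectionKey.OP_READ);
                    } else if (key.isReadable()) {
                        SocketChannel client = (SocketChannel) key.channel();
                        buffer.clear();
                        int n = client.read(buffer);
                        if (n == -1) {                  // connection closed by the peer
                            key.cancel();
                            client.close();
                        } else {
                            buffer.flip();
                            // handle the n bytes in 'buffer' here
                        }
                    }
                }
            }
        }
    }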

How is InputStream managed in memory?

I am familiar with the concept of InputStream, buffers and why they are useful (when you need to work with data that might be larger than the machine's RAM, for example).
I was wondering, though: how does the InputStream actually carry all that data? Could an OutOfMemoryError be caused if there is TOO much data being transferred?
Case scenario
If I connect from a client to a server, requesting a 100GB file, the server starts iterating through the bytes of the file with a buffer, and writing the bytes back to the client with outputStream.write(byte[]). The client is not ready to read the InputStream right now, for whatever reason. Will the server continue sending the bytes of the file indefinitely? And if so, won't the OutputStream/InputStream be larger than the RAM of one of these machines?
InputStream and OutputStream implementations do not generally use a lot of memory. In fact, the word "Stream" in these types means that they do not need to hold the data, because it is accessed in a sequential manner -- in the same way that a stream can transfer water between a lake and the ocean without holding a lot of water itself.
But "stream" is not the best word to describe this. It's more like a pipe, because when you transfer data from a server to a client, every stage transfers back-pressure from the client that controls the rate at which data gets sent. This is similar to how your faucet controls the rate of flow through your pipes all the way to the city reservoir:
As the client reads data, its InputStream only requests more data from the OS when its internal (small) buffers are empty. Each request allows only a limited amount of data to be transferred;
As data is requested from the OS, its own internal buffer empties, and it notifies the server about how much space there is for new data. The server can send only this much (that's called 'flow control' in TCP: https://en.wikipedia.org/wiki/Transmission_Control_Protocol#Resource_usage)
On the server side, the server-side OS sends out data from its own internal buffer when the client has space to receive it. As its own internal buffer empties, it allows the writing process to re-fill it with more data.
As the server-side process write()s to its OutputStream, the OutputStream will try to write data to the OS. When the OS buffer is full, it will make the server process wait until the server-side buffer has space to accept new data.
Notice that a slow client can make the server process take a very long time. If you're writing a server, and you don't control the clients, then it's very important to consider this and to ensure that there are not a lot of server-side resources tied up while a long data transfer takes place.
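As a sketch of the server side of the 100GB-file scenario (the file name, port and buffer size are placeholders): only one small buffer is ever held in memory, and the write() call itself blocks when the client stops reading, which is exactly the back-pressure described above.

    import java.io.InputStream;
    import java.io.OutputStream;
    import java.net.ServerSocket;
    import java.net.Socket;
    import java.nio.file.Files;
    import java.nio.file.Path;

    public class FileSender {
        public static void main(String[] args) throws Exception {
            Path file = Path.of("big-file.bin");          // hypothetical 100 GB file
            try (ServerSocket server = new ServerSocket(9000);
                 Socket client = server.accept();
                 InputStream in = Files.newInputStream(file);
                 OutputStream out = client.getOutputStream()) {

                byte[] buffer = new byte[64 * 1024];      // only this much is ever in memory
                int n;
                while ((n = in.read(buffer)) != -1) {
                    // If the client reads slowly, this call blocks once the OS send
                    // buffer is full -- that's the back-pressure described above.
                    out.write(buffer, 0, n);
                }
            }
        }
    }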
Your question is as interesting as it is difficult to answer properly.
First: InputStream and OutputStream are not a means of storage, but a means of access: they describe that the data shall be accessed in a sequential, unidirectional order, but not how it shall be stored. The actual way of storing the data is implementation-dependent.
So, could there be an InputStream that stores the whole amount of data in memory at once? Yes, there could be, though it would be an appalling implementation. The most common and sensible implementations of InputStream / OutputStream store just a fixed and short amount of data in a temporary buffer of 4K-8K, for example.
(So far, I suppose you already knew that, but it was necessary to say.)
Second: What about connected writing/reading streams between a server and a client? In a common scenario of buffered writing, the server will not write more data than the buffer allows. So, if the server starts writing, and the client then stops reading (for whatever reason), the server will just keep writing until the buffer is full, then mark it as ready for reading, and until the read is completed by the client peer, the server won't fill the buffer again. Remember: this kind of read/write is blocking. The client blocks until there is a buffer ready to be read, and the server (or, at least, the server thread bound to this connection) blocks until the last read is completed.
How long will the server block? Typically, a server should have a safety timeout to ensure that long blocks break the connection, thus releasing the blocked thread. The client should have the same.
The timeouts set for the connection depend on the implementation and the protocol.
No, it does not need to hold all the data. It just advances forward through the file (usually using buffered data). The stream can discard old buffers as it pleases.
Note that there are a lot of very different implementations of InputStream, so the exact behaviour varies a lot.

Using buffered streams for sending objects?

I'm currently using Java sockets in a client-server application with OutputStream, not BufferedOutputStream (and the same for input streams).
The client and server exchange serialized objects (via the writeObject() method).
Does it make sense (more speed) to use BufferedOutputStream and BufferedInputStream in this case?
And when do I have to flush, or should I not call flush() at all?
Does it make sense (more speed) to use BufferedOutputStream and BufferedInputStream in this case?
Actually, it probably doesn't make sense¹.
The object stream implementation internally wraps the stream it has been given in a private class called BlockDataOutputStream that does buffering. If you wrap the stream yourself, you will have two levels of buffering ... which is likely to make performance worse².
And when I have to flush or should I not write a flush() statement?
Yes, flushing is probably necessary. But there is no universal answer as to when to do it.
On the one hand, if you flush too often, you generate extra network traffic.
On the other hand, if you don't flush when it is needed, the server can stall waiting for an object that the client has written but not flushed.
You need to find the compromise between these two syndromes ... and that depends on your application's client/server interaction patterns; e.g. whether the message patterns are synchronous (e.g. message/response) or asynchronous (e.g. message streaming).
1 - To be certain of this, you would need to do some forensic testing to 1) measure the system performance, and 2) determine what syscalls are made and when network packets are sent. For a general answer, you would need to repeat this for a number of use-cases. I'd also recommend looking at the Java library code yourself to confirm my (brief) reading.
2 - Probably only a little bit worse, but a well designed benchmark would pick up a small performance difference.
UPDATE
After writing the above, I found this Q&A - Performance issue using Javas Object streams with Sockets - which seems to suggest that using BufferedInputStream / BufferedOutputStream helps. However, I'm not certain whether the performance improvement that was reported is 1) real (i.e. not a warm-up artefact) and 2) due to the buffering. It could be due just to the added flush() call. (Why? Because the flush could cause the network stack to push the data sooner.)
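For illustration, here is one hedged sketch of flush placement in a synchronous message/response exchange (the class and method names are made up): flush once after construction so the peer's ObjectInputStream constructor can read the stream header, and once per complete request just before waiting for the reply.

    import java.io.IOException;
    import java.io.ObjectInputStream;
    import java.io.ObjectOutputStream;
    import java.io.Serializable;
    import java.net.Socket;

    public class RequestResponseClient implements AutoCloseable {
        private final Socket socket;
        private final ObjectOutputStream out;
        private final ObjectInputStream in;

        public RequestResponseClient(Socket socket) throws IOException {
            this.socket = socket;
            this.out = new ObjectOutputStream(socket.getOutputStream());
            this.out.flush();   // push the stream header so the peer's ObjectInputStream
                                // constructor does not block waiting for it
            this.in = new ObjectInputStream(socket.getInputStream());
        }

        public Object call(Serializable request) throws IOException, ClassNotFoundException {
            out.writeObject(request);
            out.flush();        // flush once per complete request, just before blocking on the reply
            return in.readObject();
        }

        @Override
        public void close() throws IOException {
            socket.close();
        }
    }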
I think these links might help you:
What is the purpose of flush() in Java streams?
The flush method flushes the output stream and forces any buffered output bytes to be written out. The general contract of flush is that calling it is an indication that, if any bytes previously written have been buffered by the implementation of the output stream, such bytes should immediately be written to their intended destination.
How java.io.Buffer* stream differs from normal streams?
Internally a buffer array is used and instead of reading bytes individually from the underlying input stream enough bytes are read to fill the buffer. This generally results in faster performance as less reads are required on the underlying input stream.
http://www.oracle.com/technetwork/articles/javase/perftuning-137844.html
As a means of starting the discussion, here are some basic rules on how to speed up I/O: 1. Avoid accessing the disk. 2. Avoid accessing the underlying operating system. 3. Avoid method calls. 4. Avoid processing bytes and characters individually.
So using buffered streams usually speeds up the I/O process, as fewer read() calls are made on the underlying stream.
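A minimal sketch of that wrapping (the endpoint is a placeholder; the buffers are the default 8 KB ones):

    import java.io.BufferedInputStream;
    import java.io.BufferedOutputStream;
    import java.io.InputStream;
    import java.io.OutputStream;
    import java.net.Socket;

    public class BufferedSocketStreams {
        public static void main(String[] args) throws Exception {
            try (Socket socket = new Socket("example.com", 9000)) {   // placeholder endpoint
                // Reads and writes now go against in-memory buffers, so far fewer
                // calls reach the underlying socket streams.
                InputStream in = new BufferedInputStream(socket.getInputStream());
                OutputStream out = new BufferedOutputStream(socket.getOutputStream());

                out.write("hello".getBytes());
                out.flush();                   // buffered data only hits the network on flush()
                                               // (or when the buffer fills up)
                int first = in.read();         // fills the input buffer in one bulk read
                System.out.println(first);
            }
        }
    }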

Side effects of flush() OutputStreamWriter to Network

What are the side effects of flushing an OutputStreamWriter that is writing to a network socket?
I have a program which calls out.flush() after every few bytes. Is there any reason why I should wait until all the bytes I need are in the buffer?
Will I get a lower transfer rate if I flush too much (more overhead)?
Will this slow down execution of my program (blocking)?
Each time you write to a socket, you add between 5 and 15 microseconds. For buffered output, this occurs when you flush() the data. Note: if you don't have buffered output, the cost is incurred on every write() and the flush() won't do anything.
Fortunately the OS expects applications to make more calls than is optimal, so it uses Nagle's algorithm by default to group portions of the written data into larger packets. Note: not only does the OS do this, but some network adapters do this by default too.
In short, don't flush() too often, but unless tens of microseconds add up to something which matters to you, you might not notice the difference. E.g. if you do 100 flushes you might add a millisecond.
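If those microseconds do matter, Java exposes the Nagle setting directly on the socket. A tiny sketch (the helper class is made up for illustration):

    import java.net.Socket;
    import java.net.SocketException;

    public class NagleToggle {
        public static void configure(Socket socket, boolean lowLatency) throws SocketException {
            // With TCP_NODELAY off (the default), Nagle's algorithm coalesces small
            // writes into larger packets. Turning it on trades bandwidth efficiency
            // for lower per-write latency.
            socket.setTcpNoDelay(lowLatency);
        }
    }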
There is no reason to flush unless:
you want the peer to receive the data as soon as possible
you've sent a buffered request and you're now going to read the response.
In other cases it is better to let the buffer, the Nagle algorithm, and the TCP receive window work their magic.
When transferring data over a network, batching writes before flushing the OutputStreamWriter does improve performance.
I observed that a single flush of a data packet of about 520 bytes took around 3 milliseconds.
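A rough sketch of the comparison (the class and method names are invented; the printed timing is only indicative and depends entirely on the network):

    import java.io.BufferedOutputStream;
    import java.io.OutputStream;
    import java.io.OutputStreamWriter;
    import java.io.Writer;
    import java.net.Socket;
    import java.nio.charset.StandardCharsets;

    public class FlushBatching {
        // Writes 'count' short lines, flushing either after every line or once at the end.
        static void send(Socket socket, int count, boolean flushEachLine) throws Exception {
            OutputStream raw = socket.getOutputStream();
            Writer out = new OutputStreamWriter(new BufferedOutputStream(raw), StandardCharsets.UTF_8);
            long start = System.nanoTime();
            for (int i = 0; i < count; i++) {
                out.write("line " + i + "\n");
                if (flushEachLine) {
                    out.flush();               // one trip to the socket (and possibly one packet) per line
                }
            }
            out.flush();                       // otherwise a single flush for the whole batch
            long ms = (System.nanoTime() - start) / 1_000_000;
            System.out.println((flushEachLine ? "per-line" : "batched") + " flush: " + ms + " ms");
        }
    }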

Does the size of the objects you send through a Socket with object input/output streams really matter?

Is it more efficient to flush the OutputStream after each individual invocation of ObjectOutputStream#writeObject, rather than flushing the stream after a sequence of object writes? (Example: write an object and flush, four times, or write four times and then flush just once?)
How does ObjectOutputStream work internally?
Is it somehow better to send four Object[5] arrays (flushing each one) than one Object[20], for example?
It is not better. In fact it is probably worse, from a performance perspective. Each of those flushes will force the OS-level TCP/IP stack to send the data "right now". If you just do one flush at the end, you should save on system calls, and on network traffic.
If you haven't done this already, inserting a BufferedOutputStream between the Socket OutputStream and the ObjectOutputStream will make a much bigger difference to performance. This allows the serialized data to accumulate in memory before being written to the socket stream. It potentially saves many system calls and could improve performance by orders of magnitude ... depending on the actual objects being sent.
(The representation of four Object[5] objects is larger than that of one Object[20] object, and that results in a performance hit in the first case. However, this is marginal at most, and tiny compared with the flushing and buffering issues.)
How does this stream work internally?
That is too general a question to answer sensibly. I suggest that you read up on serialization starting with the documents on this page.
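As a sketch of the wrapping recommended above (the class name is made up; the 32 KB buffer size is an arbitrary choice): buffer between the object stream and the socket, write the whole Object[20], then flush once.

    import java.io.BufferedOutputStream;
    import java.io.IOException;
    import java.io.ObjectOutputStream;
    import java.net.Socket;

    public class ArraySender {
        // Preferred shape: buffer between the object stream and the socket,
        // write everything, then flush once.
        public static void sendOnce(Socket socket, Object[] twenty) throws IOException {
            ObjectOutputStream out = new ObjectOutputStream(
                    new BufferedOutputStream(socket.getOutputStream(), 32 * 1024));
            out.writeObject(twenty);   // e.g. one Object[20]
            out.flush();               // single flush, single burst of socket writes
        }
    }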
No, it shouldn't matter, unless you have reason to believe the network link is likely to go down and partial data is useful. Otherwise it just sounds like a way to make the code more complex for no reason.
If you look at the one and only public constructor of ObjectOutputStream, you note that it requires an underlying OutputStream for its instantiation.
When and how you flush your object stream is entirely dependent on the type of stream you are using. (And in considering all this, do keep in mind that not all extensions of OutputStream are guaranteed to respect your request to flush -- it is entirely implementation-dependent, as spelled out in the 'contract' of the javadocs.)
But certainly we can reason about it and even pull up the code and see what is actually done.
IFF the underlying OutputStream must use OS services for devices (such as the disk, or the network interface in the case of Sockets), then the behavior of flush() is entirely OS-dependent. For example, you may grab the output stream of a socket and then instantiate an ObjectOutputStream to write serialized objects to the net. The TCP/IP implementation of the host OS is in charge.
What is more efficient?
Well, if your object stream is wrapping a ByteArrayOutputStream, you are potentially looking at a series of reallocations and System.arraycopy() calls. I say potentially, since the byte array implementation doubles in size on each (internal) resize operation, and it is very unlikely that writing n (small) objects and flushing each time will result in n reallocations (where n is assumed to be a reasonably small number).
But if you are wrapping a network stream, you must keep in mind that network writes are very expensive. It makes much more sense, if your protocol allows it, to chunk your writes (to fill the send buffer) and just flush once.
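A hedged sketch of that chunking idea (the class and method names are invented for illustration): serialize the whole message into a ByteArrayOutputStream first, then hand it to the socket in one bulk write.

    import java.io.ByteArrayOutputStream;
    import java.io.IOException;
    import java.io.ObjectOutputStream;
    import java.io.OutputStream;
    import java.net.Socket;

    public class ChunkedProtocolWriter {
        // Serializes a whole message into memory first, then hands it to the socket
        // in one write, so the expensive network call happens once per message.
        public static void sendMessage(Socket socket, Object message) throws IOException {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream(8192);  // grows by doubling as needed
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(message);
            }
            OutputStream net = socket.getOutputStream();
            net.write(bytes.toByteArray());   // one bulk write instead of many small ones
            net.flush();
        }
    }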
