Sending ACK greatly slows down data transfer - java

Knowing that the TCP checksum is actually a very weak checksum prompted me to add an additional checksum (SHA-256) to each data block, so that the server can verify the integrity of the data and, if a block is corrupted, request it again. But adding an ACK greatly reduces the data transfer rate. In my case (the data is transmitted over Wi-Fi) the speed dropped from ~90 Mbps to ~12 Mbps.
// Client
SocketChannel socketChannel = SocketChannel.open(new InetSocketAddress("", 3333));
ByteBuffer byteBufferData = ByteBuffer.allocateDirect(1024 * 8);
ByteBuffer byteBufferACK = ByteBuffer.allocateDirect(1);
for (int i = 0; i < 1024; i++) {
    // write data (payload + checksum (SHA-256))
    socketChannel.write(byteBufferData);
    byteBufferData.clear();
    // read ACK (blocks until the server has answered)
    socketChannel.read(byteBufferACK);
    byteBufferACK.flip();
    // if (byteBufferACK.get() == XXX)
    //     ... retransmission byteBufferData
    byteBufferACK.clear();
}

// Server
ServerSocketChannel serverSocketChannel = ServerSocketChannel.open();
serverSocketChannel.socket().bind(new InetSocketAddress(3333));
SocketChannel socketChannel = serverSocketChannel.accept();
ByteBuffer byteBufferData = ByteBuffer.allocateDirect(1024 * 8);
ByteBuffer byteBufferACK = ByteBuffer.allocateDirect(1);
long startTime = System.currentTimeMillis();
while (socketChannel.read(byteBufferData) != -1) {
    // when 8192 bytes of data were read
    if (!byteBufferData.hasRemaining()) {
        byteBufferData.clear();
        // write ACK
        socketChannel.write(byteBufferACK);
        byteBufferACK.clear();
    }
}
System.out.println(System.currentTimeMillis() - startTime);
Please note that the code is a test code and is not intended to convey any useful data. It is intended only for testing the data transfer rate.
I have two questions:
1) Maybe I misunderstand something or am doing it incorrectly, but why does sending one byte of data as a confirmation of receipt (ACK) affect the overall data transfer rate so much? How can I avoid this?
2) Is SHA-256 sufficient as a checksum for data blocks of 8 KB? (On top of the existing TCP checksum)

Because you're waiting for it. Let's say there's 200 ms of latency between you and the server. Without the ack, you'd write packets as quickly as possible, saturate the bandwidth, and stop. With the ack, it looks like this:
t=0 client sends 1st 8k
t=200 server receives
t=205ish server sends ack
t=405 client receives ack
t=410ish client sends 2nd 8k
The client spends most of each round trip idle, waiting for the ack instead of sending. I'm actually surprised it wasn't worse.
TCP has a LOT of features in it that prevent these kinds of issues, including sliding windows of data (you don't send one packet and ack it, you send N packets and the server acks the ones it receives, allowing missing packets to be resent out of order). You're reimplementing TCP badly and almost certainly shouldn't be.
If you are going to do this, don't use TCP. Use UDP or raw sockets and write your new protocol on top of that. On top of TCP you're still using TCP's own acks and checksums, so yours are redundant.
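To put rough numbers on the timeline above, here is a small sketch of the arithmetic: if at most a fixed number of bytes may be unacknowledged per round trip, throughput cannot exceed that many bytes divided by the round-trip time. The class and method names are made up for illustration, and the 410 ms figure is simply the hypothetical round trip from the timeline, not a measurement.

```java
public class StopAndWaitThroughput {

    /**
     * Upper bound on throughput in bits per second when at most
     * windowBytes may be in flight (unacknowledged) per round trip.
     */
    static double maxThroughputBitsPerSecond(long windowBytes, double roundTripMillis) {
        return windowBytes * 8 / (roundTripMillis / 1000.0);
    }

    public static void main(String[] args) {
        double roundTripMillis = 410.0; // hypothetical value taken from the timeline above
        long blockBytes = 8 * 1024;     // one 8 KB block, as in the question

        // One block per round trip: the stop-and-wait scheme from the question.
        double oneBlock = maxThroughputBitsPerSecond(blockBytes, roundTripMillis);
        // Sixteen blocks in flight before waiting for acks: a simple sliding window.
        double sixteenBlocks = maxThroughputBitsPerSecond(16 * blockBytes, roundTripMillis);

        System.out.printf("1 block in flight:   %.0f bit/s%n", oneBlock);
        System.out.printf("16 blocks in flight: %.0f bit/s%n", sixteenBlocks);
    }
}
```

The bound scales linearly with the amount of data allowed in flight, which is why sending several blocks before waiting for their acks recovers most of the lost speed.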


Why doesn't Java DatagramSocket receive all the UDP packets that clients have sent?

My clients send UDP packets at a high rate.
I'm sure that my Java application layer doesn't receive all the UDP packets that the clients send, because the number of packets received in Wireshark and the number received by my Java app don't match.
Because Wireshark receives more UDP packets, I'm sure the packets weren't lost in the network.
The code is below.
One thread receives packets and offers them to a LinkedBlockingQueue; another thread takes packets from the LinkedBlockingQueue and then calls onNext on an RxJava subject.
socket = new DatagramSocket(this.port);
socket.setReceiveBufferSize(2 * 1024 * 1024);

// thread-1
while (true) {
    byte[] bytes = new byte[532];
    DatagramPacket packet = new DatagramPacket(bytes, bytes.length);
    try {
        socket.receive(packet);
        queue.offer(
            new UdpPacket(
                packet.getPort(), packet.getAddress().getHostAddress(), packet.getData()));
    } catch (IOException e) {
        // (handler body not shown)
    }
}

// thread-2
UdpPacket packet;
while ((packet = queue.take()) != null) {
    subject.onNext(packet); // "subject" stands for the RxJava subject mentioned above
}
Host OS: Ubuntu 18.04
It is very difficult to give a straight answer, but from my experience with UDP message processing in Java, the performance of the message processing really matters, especially with large volumes of data.
So here are some things that I would consider:
1) You are right to hand UDP messages off to a queue for processing on a different thread. But a queue can have a limited capacity. Do you manage to process messages fast enough? Otherwise the queue fills up and the receive loop either blocks or starts dropping packets. Some simple logging there would tell you whether this is the case. Putting messages on a queue where they can be popped off in a separate step is a good design, but you also need to make sure that the processing is fast as well and that the queue does not fill up.
2) Are all your datagrams smaller than 532 bytes? Maybe some loss occurs because larger messages don't fit in the buffer.
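As a sketch of the "simple logging" idea in point 1, the hand-off between the two threads can count how often the queue was full. This assumes a bounded queue and a non-blocking `offer`; the class name `DropCountingHandoff` and its methods are invented for this illustration and are not part of the original code.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

public class DropCountingHandoff {
    private final BlockingQueue<byte[]> queue;
    private final AtomicLong dropped = new AtomicLong();

    DropCountingHandoff(int capacity) {
        this.queue = new ArrayBlockingQueue<>(capacity);
    }

    /** Called by the receiving thread; never blocks, counts datagrams that did not fit. */
    boolean handOff(byte[] datagram) {
        boolean accepted = queue.offer(datagram);
        if (!accepted) {
            dropped.incrementAndGet();
        }
        return accepted;
    }

    /** Called by the processing thread; blocks until a datagram is available. */
    byte[] take() throws InterruptedException {
        return queue.take();
    }

    long droppedCount() {
        return dropped.get();
    }

    int backlog() {
        return queue.size();
    }

    public static void main(String[] args) {
        DropCountingHandoff handoff = new DropCountingHandoff(2);
        for (int i = 0; i < 5; i++) {
            handoff.handOff(new byte[532]); // nobody is consuming, so the queue fills up
        }
        // prints "backlog=2 dropped=3"
        System.out.println("backlog=" + handoff.backlog() + " dropped=" + handoff.droppedCount());
    }
}
```

If the drop counter stays at zero while Wireshark still sees more packets than the application, the loss is more likely happening before the queue, for example in the socket receive buffer.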
Hope this helps,
I had a similar issue to this recently in a different language. I'm unsure if it works the same in Java, but this may be helpful to you.
As data packets come into the socket they are buffered, and you have set your buffer size, but you are still only reading a single data packet at a time, even though the buffer could be holding more. Because you're processing one datagram at a time, your buffer keeps filling up, and eventually, when it's full, data can be lost because it can't store any more datagrams.
I checked the documentation for DatagramSocket.receive:
"Receives a datagram packet from this socket."
I'm unsure which functions you would need to call in Java, but here's a little snippet that I am using.
while (!m_server->BufferEmpty()) {
    std::shared_ptr<Stream> inStream = std::make_shared<Stream>();
    std::vector<unsigned char>& buffer = inStream->GetBuffer();
    boost::asio::ip::udp::endpoint senderEndpoint = m_server->receive(boost::asio::buffer(buffer),
        boost::posix_time::milliseconds(-1), ec);
    if (ec)
        std::cout << "Receive error: " << ec.message() << "\n";
    std::unique_ptr<IPacketIn> incomingPacket = std::make_unique<IPacketIn>();
    m_packetProcessor->ProcessPacket(incomingPacket, senderEndpoint);
}
This basically says that if the socket has any data for the current frame in its buffer, keep reading datagrams until the buffer is empty.
I'm unsure how LinkedBlockingQueue works, but it could also be causing a bit of a problem if both threads are trying to access it at the same time. Your UDP reading thread could be blocked for some time, and packets could arrive during that time.

Simulate back pressure in TCP send

I am writing some Java TCP/IP networking code (client-server) in which I have to deal with scenarios where the sends are much faster than the receives, which blocks the send operations at one end (because the send and receive buffers fill up). Before designing my code, I wanted to play around with these kinds of situations and see how the client and server behave under varying load. But I am not able to set the parameters appropriately to achieve this back pressure. I tried setting Socket.setSendBufferSize(int size) and Socket.setReceiveBufferSize(int size) to small values, hoping they would fill up quickly, but I can see that the send operation completes without waiting for the client to consume enough of the data already written (which means that the small send and receive buffer sizes have no effect).
Another approach I took is to use Netty and set ServerBootstrap.setOption("child.sendBufferSize", 256);, but even this is not much use. Can anyone help me understand what I am doing wrong?
The buffers have an OS-dependent minimum size, which is often around 8 KB.
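public static void main(String... args) throws IOException, InterruptedException {
    ServerSocketChannel ssc = ServerSocketChannel.open();
    ssc.bind(new InetSocketAddress(0)); // open on a random port
    InetSocketAddress remote = new InetSocketAddress("localhost", ssc.socket().getLocalPort());
    SocketChannel sc = SocketChannel.open(remote);
    configure(sc);
    SocketChannel accept = ssc.accept();
    configure(accept);

    ByteBuffer bb = ByteBuffer.allocateDirect(16 * 1024 * 1024);
    // write as much as you can
    while (sc.write(bb) > 0)
        Thread.sleep(1);
    System.out.println("The socket write wrote " + bb.position() + " bytes.");
}

private static void configure(SocketChannel socketChannel) throws IOException {
    // non-blocking, so that write() returns 0 instead of blocking once the buffers are full
    socketChannel.configureBlocking(false);
    // ask for tiny buffers; the OS rounds these up to its own minimum
    socketChannel.socket().setSendBufferSize(8);
    socketChannel.socket().setReceiveBufferSize(8);
}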
On my machine this prints:
The socket write wrote 32768 bytes.
This is the sum of the send and receive buffers, and I suspect they are both 16 KB.
I think Channel.setReadable is what you need. setReadable tells Netty to temporarily stop reading data from the system socket into its buffer; once the socket buffer is full, the other end will have to wait.

How can I create a UDP server that will be able to scale to 10000 (uncorrelated) connections?

Currently I'm experimenting with this code (I know it doesn't fit the purpose).
I tried sending from 3 sources simultaneously (UDP Test Tool) and it seems OK, but I want to know how it would behave if, of those 10K possible clients, 2K are sending at the same time. The packets are approximately 70 bytes in size. I'm supposed to do some simple operations on the contents and write the results to a database.
import java.net.DatagramPacket;
import java.net.DatagramSocket;

public class Test {
    public static void main(String[] args) throws Exception {
        int PACKETSIZE = 1400;
        int port = 5555;
        byte[] bytes = new byte[PACKETSIZE];
        //ByteBuffer bb = ByteBuffer.allocate(4);
        //Byte lat=null;
        DatagramSocket socket = new DatagramSocket(port);
        System.out.println("The server is running on port " + port + "\n");
        while (true) {
            DatagramPacket packet = new DatagramPacket(bytes, bytes.length);
            socket.receive(packet); // blocks until a datagram arrives
            System.out.println("Packet length = " + packet.getLength());
            System.out.println("Sender IP = " + packet.getAddress() + " Port = " + packet.getPort());
            // only the first getLength() bytes belong to this datagram
            for (int i = 0; i < packet.getLength(); i++) {
                System.out.print(" " + packet.getData()[i] + " ");
            }
        }
    }
}
Firstly, UDP sockets are not connection oriented, so the number of "connections" is meaningless. The number that you actually care about is the number of datagrams per second. The other issue that is normally overlooked is whether the datagrams span IP packets or not, since that affects packet assembly time and, ultimately, how expensive they are to receive. Your packet size is 1,400 bytes, which will fit comfortably in an Ethernet frame.
Now, what you need to do is limit your processing time using multiple threads, queueing, or some other processing scheme. You want the receiving thread busy pulling datagrams off of the wire and putting them somewhere else for workers to process. This is a common processing idiom that has been in use for years. It should scale to meet your needs provided that you can separate the processing of the data from the network IO.
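A minimal sketch of that idiom, assuming a fixed pool of worker threads: the receive loop only copies the payload out of the shared buffer and hands it off. The class name, the pool size of 4, and the `process` placeholder (which stands in for the "simple operations" and the database write) are assumptions made for this example.

```java
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.util.Arrays;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class UdpWorkerPoolSketch {
    static final AtomicInteger processed = new AtomicInteger();

    /** Placeholder for the real work (parsing, database write); here it only counts. */
    static void process(byte[] payload) {
        processed.incrementAndGet();
    }

    /** Copies the payload out of the reusable receive buffer and hands it to a worker. */
    static void dispatch(DatagramPacket packet, ExecutorService workers) {
        byte[] payload = Arrays.copyOfRange(packet.getData(), packet.getOffset(),
                packet.getOffset() + packet.getLength());
        workers.execute(() -> process(payload));
    }

    /** Stops accepting new work and waits up to five seconds for queued work to finish. */
    static boolean shutdownAndWait(ExecutorService workers) {
        workers.shutdown();
        try {
            return workers.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        ExecutorService workers = Executors.newFixedThreadPool(4);
        byte[] buffer = new byte[1400];
        try (DatagramSocket socket = new DatagramSocket(5555)) {
            while (true) {
                // The receiving thread does nothing except pull datagrams off the socket.
                DatagramPacket packet = new DatagramPacket(buffer, buffer.length);
                socket.receive(packet);
                dispatch(packet, workers);
            }
        }
    }
}
```

Note that `Executors.newFixedThreadPool` queues work without a bound, so if the workers cannot keep up, memory grows instead of datagrams being dropped; whether that trade-off is acceptable depends on the load.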
You can also use asynchronous or event-driven IO so that you do not have a thread responsible for reading datagrams from the socket directly. See this question for a discussion of Java NIO.
I'm not sure if this is homework or not, but you should read "The C10K problem", Dan Kegel's excellent article on this very subject. I think that you will probably find it enlightening, to say the least.

Why can the TCP receiver still receive data after the server socket has shut down?

I am using Java to implement a multithreaded TCP server-client application. But now I have encountered a strange problem: when I shut down the server socket, the receiver still receives the last sent packet, over and over. Since the details of a socket read are the kernel's concern, I can't figure out the reason. Can anybody give me some guidance?
Thanks in advance!
The code involved is simple:
public void run() {
    while (runFlag) {
        //in = socket.getSocketInputStream();
        //byte[] buffer = new byte[bufferSize];
        try {
            in.read(buffer);
            //process the buffer;
        } catch (IOException e) {
            // (handler body not shown)
        }
    }
}
When the server socket is shut down, this read operation keeps receiving the packet (each time it enters the while loop).
The TCP/IP stack inside the OS is buffering the data on both sides of the connection. Sender fills its socket send buffer, which is drained by the device driver pushing packets onto the wire. Receiver accumulates packets off the wire in the socket receive buffer, which is drained by the application reads.
If the data is already in the client socket's buffer (kernel-level, waiting for your application to read it into userspace memory), there is no way for the server to prevent it from being read. It's like with snail mail: once you've sent it away you cannot undo it.
That's how TCP works. It's a reliable byte-stream. Undelivered data continues to be delivered after a normal close. Isn't that what you want? Why is this a 'problem'?
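The behaviour described in these answers can be reproduced with a small loopback experiment. This is only a sketch under stated assumptions: the class name and the "last packet" payload are invented, and everything runs in one thread on the local machine. The server side writes a few bytes and closes both its sockets before the client reads anything; the client still gets the bytes, and only then does `read` return -1.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

public class ReadAfterCloseSketch {

    /** Returns everything the client could still read after the server side had closed. */
    static String readAfterServerClose() {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(server.getInetAddress(), server.getLocalPort())) {

            try (Socket peer = server.accept()) {
                peer.getOutputStream().write("last packet".getBytes(StandardCharsets.US_ASCII));
            } // the accepted socket is closed here, before the client has read anything
            server.close(); // the listening socket is closed as well

            InputStream in = client.getInputStream();
            StringBuilder received = new StringBuilder();
            byte[] buffer = new byte[64];
            int n;
            // read() returns -1 only after all buffered data has been delivered
            while ((n = in.read(buffer)) != -1) {
                received.append(new String(buffer, 0, n, StandardCharsets.US_ASCII));
            }
            return received.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Read after close: " + readAfterServerClose());
    }
}
```

The loop here checks the return value of `read`. A receive loop that ignores it never notices the -1 and keeps processing whatever is still sitting in its own buffer on every pass, which may be what the loop in the question is running into.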

Possible to sit on the network and receive a TCP stream/UDP datagrams?

Has anyone out there done the work of sitting on top of a packet capture interface (like jpcap) with an implementation of UDPSocket (for UDP datagrams) and InputStream (for TCP streams)?
I suppose it wouldn't be too hard to do given the callback API in jpcap, but has anyone out there already done it? Are there any issues with doing this (do I have to figure out how to reassemble a TCP stream myself, for example?)
I have not done this particular thing, but I do do a lot of work with parsing captured packets in C/C++. I don't know if there exist Java libraries for any of this.
Essentially, you need to work your way up the protocol stack, starting with IP. The pcap data starts with the link-level header, but I don't think there's much in it that you're concerned about, other than ignoring non-IP packets.
The trickiest thing with IP is reassembling fragmented datagrams. This is done using the More Fragments bit in the Flags field and the Fragment Offset field, combined with the Identification field to distinguish fragments from different datagrams. Then you use the Protocol field to identify TCP and UDP packets, and the Header Length field to find the start of the corresponding header.
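As a rough illustration of the fields just mentioned, this sketch pulls them out of a raw IPv4 header held in a byte array. The class name is made up, it assumes the array starts at the IP header (link-level header already skipped), and it does no validation of version, length, or checksum.

```java
public class Ipv4HeaderSketch {
    final int headerLengthBytes;   // Header Length (IHL) field, converted from 32-bit words
    final int identification;      // shared by all fragments of one datagram
    final boolean moreFragments;   // More Fragments bit of the Flags field
    final int fragmentOffsetBytes; // Fragment Offset field, converted from 8-byte units
    final int protocol;            // 6 = TCP, 17 = UDP

    Ipv4HeaderSketch(byte[] packet, int offset) {
        headerLengthBytes = (packet[offset] & 0x0F) * 4;
        identification = ((packet[offset + 4] & 0xFF) << 8) | (packet[offset + 5] & 0xFF);
        int flagsAndOffset = ((packet[offset + 6] & 0xFF) << 8) | (packet[offset + 7] & 0xFF);
        moreFragments = (flagsAndOffset & 0x2000) != 0;
        fragmentOffsetBytes = (flagsAndOffset & 0x1FFF) * 8;
        protocol = packet[offset + 9] & 0xFF;
    }

    /** True if this packet is one piece of a fragmented datagram. */
    boolean isFragment() {
        return moreFragments || fragmentOffsetBytes != 0;
    }

    public static void main(String[] args) {
        // Hand-built 20-byte header: IHL 5, ID 0x1234, MF set, offset 1, protocol 17 (UDP).
        byte[] header = {0x45, 0, 0, 0x1c, 0x12, 0x34, 0x20, 0x01, 64, 17,
                         0, 0, 10, 0, 0, 1, 10, 0, 0, 2};
        Ipv4HeaderSketch ip = new Ipv4HeaderSketch(header, 0);
        System.out.println("header=" + ip.headerLengthBytes + " id=" + ip.identification
                + " fragment=" + ip.isFragment() + " protocol=" + ip.protocol);
    }
}
```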
The next step, for both TCP and UDP, is demultiplexing, separating out the various connections in the captured packet stream. Both protocols identify connections (well, UDP doesn't have connections per se, but I don't have a better word handy) by the 4-tuple of the source and destination IP address and the source and destination port, so a connection would be a sequence of packets that matches on all 4 of these values.
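One way to sketch the demultiplexing step is a small value class for the 4-tuple, used as a map key. `FlowKey` and its field names are hypothetical, and addresses are kept as strings purely for readability.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

public class FlowKeySketch {

    /** Source/destination address and port; one key per direction of a connection. */
    static final class FlowKey {
        final String srcAddr;
        final int srcPort;
        final String dstAddr;
        final int dstPort;

        FlowKey(String srcAddr, int srcPort, String dstAddr, int dstPort) {
            this.srcAddr = srcAddr;
            this.srcPort = srcPort;
            this.dstAddr = dstAddr;
            this.dstPort = dstPort;
        }

        /** Key of the opposite direction of the same connection. */
        FlowKey reversed() {
            return new FlowKey(dstAddr, dstPort, srcAddr, srcPort);
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof FlowKey)) {
                return false;
            }
            FlowKey other = (FlowKey) o;
            return srcPort == other.srcPort && dstPort == other.dstPort
                    && srcAddr.equals(other.srcAddr) && dstAddr.equals(other.dstAddr);
        }

        @Override
        public int hashCode() {
            return Objects.hash(srcAddr, srcPort, dstAddr, dstPort);
        }
    }

    public static void main(String[] args) {
        // Count captured packets per flow; a real tool would store per-flow state instead.
        Map<FlowKey, Integer> packetsPerFlow = new HashMap<>();
        packetsPerFlow.merge(new FlowKey("10.0.0.1", 5000, "10.0.0.2", 80), 1, Integer::sum);
        packetsPerFlow.merge(new FlowKey("10.0.0.1", 5000, "10.0.0.2", 80), 1, Integer::sum);
        packetsPerFlow.merge(new FlowKey("10.0.0.2", 80, "10.0.0.1", 5000), 1, Integer::sum);
        System.out.println("distinct flows: " + packetsPerFlow.size());
    }
}
```

Keeping the two directions under separate keys matches the later point about tracking traffic in both directions; `reversed()` is there to look up the other half of a connection.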
Once that's done, for UDP, you're just about finished, unless you want to check the checksum. The Length field in the UDP header tells you how long the packet is; subtract 8 bytes for the header and there's your data.
TCP is somewhat more complicated, as you do indeed have to reassemble the stream. This is done using the sequence number in the header, combined with the length. The sum of these two tells you the next sequence number in the stream. Remember that you're keeping track of the traffic in two directions.
(This is a lot easier than writing an actual TCP implementation, as then you have to implement the Nagle algorithm and other minutiae.)
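A simplified sketch of that bookkeeping for one direction of one connection: segments are parked by sequence number and appended to the stream once they line up with the next expected sequence number. The class name is invented, and the sketch deliberately ignores 32-bit sequence number wraparound and the sequence numbers consumed by SYN and FIN.

```java
import java.io.ByteArrayOutputStream;
import java.util.Map;
import java.util.TreeMap;

public class TcpReassemblySketch {
    private long nextSeq; // sequence number of the next byte we expect
    private final TreeMap<Long, byte[]> pending = new TreeMap<>();
    private final ByteArrayOutputStream stream = new ByteArrayOutputStream();

    TcpReassemblySketch(long firstDataSeq) {
        this.nextSeq = firstDataSeq;
    }

    /** Feed one captured segment (sequence number plus payload), in capture order. */
    void onSegment(long seq, byte[] payload) {
        if (payload.length == 0) {
            return; // pure ACKs carry no stream data
        }
        pending.putIfAbsent(seq, payload);
        while (!pending.isEmpty()) {
            Map.Entry<Long, byte[]> first = pending.firstEntry();
            long start = first.getKey();
            byte[] data = first.getValue();
            if (start > nextSeq) {
                break; // gap: an earlier segment has not been seen yet
            }
            pending.pollFirstEntry();
            long end = start + data.length; // sequence number + length = next sequence number
            if (end <= nextSeq) {
                continue; // retransmission of data we already have
            }
            int skip = (int) (nextSeq - start); // overlap with already delivered bytes
            stream.write(data, skip, data.length - skip);
            nextSeq = end;
        }
    }

    byte[] assembled() {
        return stream.toByteArray();
    }

    public static void main(String[] args) {
        TcpReassemblySketch r = new TcpReassemblySketch(1000);
        r.onSegment(1005, "world".getBytes()); // arrives early, parked
        r.onSegment(1000, "hello".getBytes()); // fills the gap, both are delivered
        r.onSegment(1000, "hello".getBytes()); // retransmission, ignored
        System.out.println(new String(r.assembled())); // prints "helloworld"
    }
}
```

For a full connection you would keep one such object per direction, for example one per 4-tuple key.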
There's a lot of information on the net about the header formats; google "IP header" for starters. A network analyzer like Wireshark is indispensable for this work, as it will show you how your captured data is supposed to look. Indeed, as Wireshark is open source, you can probably find out a lot by looking at how it does things.
TCP reassembly can be done with jNetPcap. Here is a complete example:
final int SOME_PORT = 8888;
StringBuilder errbuf = new StringBuilder();
Pcap pcap = Pcap.openOffline("/dir/someFile.pcap", errbuf); // Can be replaced with .openLive(...)
if (pcap == null) {
    System.err.println("Error: " + errbuf.toString());
    return;
}
// Handler that receives TCP events one by one
AnalyzerListener<TcpStreamEvent> handler = new AnalyzerListener<TcpStreamEvent>() {
    public void processAnalyzerEvent(TcpStreamEvent evt) {
        JPacket packet = evt.getPacket();
        Tcp tcp = new Tcp();
        if (packet.hasHeader(tcp)) {
            // Limiting the analysis to a specific protocol
            if (tcp.destination() == SOME_PORT || tcp.source() == SOME_PORT) {
                String data = new String(tcp.getPayload());
                System.out.println("Capture data:{" + data + "}");
            }
        }
    }
};
TcpAnalyzer tcpAnalyzer = JRegistry.getAnalyzer(TcpAnalyzer.class);
tcpAnalyzer.addTcpStreamListener(handler, null);
// Starting the capture
pcap.loop(Pcap.LOOP_INFINATE, JRegistry.getAnalyzer(JController.class), null);

