DatagramChannel Send Missing On Wire


I'm seeing some occasional missing data with a datagram channel in a tool I'm developing. UDP is part of the requirement here, so I'm mostly just trying to troubleshoot the behavior I'm seeing. The tool is being developed with Java 7 (another requirement), but the computer on which I'm seeing the behavior occur is running on a Java 8 JRE.
I have a decorator class that decorates a call to DatagramChannel.send with some additional behavior, but the call effectively boils down to this:
public int send( ByteBuffer buffer, SocketAddress target ) throws IOException
{
    // some additional decorating code that can't be shared follows
    int bytesToWrite = buffer.remaining();
    int bytesWritten = decoratedChannel.send(buffer, target);
    if (bytesWritten != bytesToWrite) {
        // log the occurrence
    }
    return bytesWritten;
}
There is an additional bit of decoration above this that performs our own fragmentation (as part of the requirements of the remote host). Thus the source data is always guaranteed to be at most 1000 bytes (well within the limit for an ethernet frame). The decorated channel is also configured for blocking I/O.
What I'm seeing on rare occasions is that this routine (and thus the DatagramChannel's send method) is called, but no data is seen on the wire (which is monitored with Wireshark). In this case, too, the send routine returns the number of bytes that should have been written (so bytesWritten == bytesToWrite).
I understand that UDP has reliability issues (for which we have our own data reliability mechanism that accounts for data loss and other issues), but I'm curious about the behavior of the Datagram channel's implementation. If send is returning the number of bytes written, should I not at least see a corresponding frame in Wireshark? Otherwise, I would expect the native implementation to possibly throw an exception, or at least not return the number of bytes I expected to write?

I actually ended up discovering the cause with more fiddling in Wireshark. I was unintentionally filtering out ARP requests, which seem to be the cause of the problem, as mentioned in this answer:
ARP queues only one outbound IP datagram for a specified destination address while that IP address is being resolved to a MAC address. If a UDP-based application sends multiple IP datagrams to a single destination address without any pauses between them, some of the datagrams may be dropped if there is no ARP cache entry already present. An application can compensate for this by calling the Iphlpapi.dll routine SendArp() to establish an ARP cache entry, before sending the stream of packets.
It appears the ARP entries were going stale very quickly, and the occasional ARP request would cause the dropped packet. I increased the ARP timeout for the interface on the PC, and the dropped packets now happen much less often.
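For anyone who cannot change the ARP timeout, one possible workaround from the Java side is to make the first packet after an idle period a throwaway probe, so that the probe (and not a real fragment) is the one datagram ARP queues while resolving. This is only a sketch: the class name PrimedSender, the idle threshold and the 200 ms timeout are made up for illustration, and InetAddress.isReachable is used purely for its side effect of generating IP traffic to the target.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.DatagramChannel;
import java.util.List;

// Hypothetical helper, not part of the original tool.
final class PrimedSender {
    private static final long NEVER = Long.MIN_VALUE;

    private final DatagramChannel channel;
    private final InetSocketAddress target;
    private final long idleThresholdMillis;   // assumed to be shorter than the ARP timeout
    private long lastSendMillis = NEVER;

    PrimedSender(DatagramChannel channel, InetSocketAddress target, long idleThresholdMillis) {
        this.channel = channel;
        this.target = target;
        this.idleThresholdMillis = idleThresholdMillis;
    }

    /** True if nothing was sent yet, or the last send is old enough that the ARP entry may be gone. */
    static boolean needsPriming(long lastSendMillis, long nowMillis, long idleThresholdMillis) {
        return lastSendMillis == NEVER || nowMillis - lastSendMillis >= idleThresholdMillis;
    }

    /** Sends all fragments back to back, probing the target first after an idle period. */
    void sendBurst(List<ByteBuffer> fragments) throws IOException {
        long now = System.currentTimeMillis();
        if (needsPriming(lastSendMillis, now, idleThresholdMillis)) {
            // Any IP packet to an on-link host forces address resolution first.
            // The result is ignored; the call may block for up to the timeout.
            target.getAddress().isReachable(200);
        }
        for (ByteBuffer fragment : fragments) {
            channel.send(fragment, target);
        }
        lastSendMillis = System.currentTimeMillis();
    }
}
```

The reliability layer is still needed; this only tries to avoid the one predictable drop at the start of a burst.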

Related

udp file transfer project - is error checking necessary?

I have been given the classical task of transferring files using UDP. In different resources I have read both that checking the packets for errors (adding a CRC alongside the data) is necessary, AND that UDP already checks for corrupted packets and discards them, so I only need to worry about resending dropped packets.
Which one of them is correct? Do I need to manually perform an integrity check on the arrived packets, or are incorrect ones already discarded?
Language for the project is Java by the way.
EDIT: Some sources (course books, internet) say the checksum only covers the header, and therefore only ensures that the sender and receiver IPs are correct, etc. Some sources say the checksum also covers the data segment. Some sources say the checksum may cover the data segment BUT it's optional and decided by the OS.
EDIT 2: I asked my professors, and they say UDP error checking on the data segment is optional in IPv4 and the default in IPv6. But I still don't know whether it's in the programmer's control, or the OS's, or another layer's...

First fact:
UDP has a 16-bit checksum field starting at bit 48 of the packet header. This suffers from (at least) 2 weaknesses:
The checksum is not mandatory over IPv4; all bits set to 0 is defined as "no checksum".
It is a 16-bit checksum in the strict sense of the word, so it is susceptible to undetected corruption.
Together this means, that UDP's built-in checksum may or may not be reliable enough, depending on your environment.
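To make the second weakness concrete, here is a small sketch of the 16-bit one's-complement sum that the UDP checksum is based on (in the style of RFC 1071). It is simplified: the real UDP checksum also covers a pseudo-header with the IP addresses, which is left out here, and the class name is just for illustration. Because the sum is commutative, swapping two 16-bit words in the data goes undetected.

```java
// Simplified illustration; omits the UDP pseudo-header.
final class InternetChecksum {
    private InternetChecksum() {}

    /** 16-bit one's-complement checksum over the given bytes. */
    static int checksum(byte[] data) {
        long sum = 0;
        for (int i = 0; i < data.length; i += 2) {
            int hi = data[i] & 0xFF;
            // An odd trailing byte is padded with a zero byte.
            int lo = (i + 1 < data.length) ? data[i + 1] & 0xFF : 0;
            sum += (hi << 8) | lo;
        }
        // Fold the carries back into the low 16 bits.
        while ((sum >> 16) != 0) {
            sum = (sum & 0xFFFF) + (sum >> 16);
        }
        return (int) (~sum & 0xFFFF);
    }
}
```

A CRC-32 or a cryptographic hash at the application layer does not have this particular blind spot, which is one reason to add one when the data matters.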
Second fact:
An even more realistic threat than data corruption along the transport is packet loss and reordering: UDP makes no guarantees about
all packets (eventually) arriving at all
packets arriving in the same sequence as sent
Indeed, UDP has no built-in mechanism at all to deal with payloads bigger than a single packet, stemming from the fact that it wasn't built for that.
Conclusion:
Appending packet after packet as received, without additional measures, is bound to produce a receive stream differing from the send stream in all but the most favourable environments, making UDP a less than optimal protocol for direct file transfer.
If you want to or must use UDP to transfer files, you need to build those parts that are integral to TCP but not to UDP into the application. There is a saying, though, that this will most likely result in an inferior reimplementation of TCP.
Successful implementations include many peer-to-peer file sharing protocols, where protection against connection interruption and packet loss or reordering needs to be part of the application functionality anyway to defeat or mitigate filters.
Implementation recommendations:
What has worked for us is a chunked window implementation: the payload is separated into chunks of a fixed and convenient length (we used 1023 bytes), and a status array of N such chunks is kept on the sending and receiving end.
On the sending side:
A UDP message is initiated, containing such a chunk, its sequence number in the stream (more than once) and a checksum or hash.
The status array marks this chunk as "sent/pending" with a timestamp
Sending stops, if the complete status array (send window) is consumed
On the receiving side:
received packets are checked against their checksum,
corrupted packets are negatively acknowledged if all copies of the sequence number agree, dropped otherwise
OK packets are marked in the status array as "received/pending" with a timestamp
Acknowledgement works by sending an ack packet if either enough chunks have been received to fill an ack packet, or the timestamp of the oldest "receive/pending" grows too old (some ms to some 100ms).
Ack packets need checksumming, but no sequencing.
Chunks, for which an ack has been sent, are marked as "ack/pending" with timestamp in the status array
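The message format and the receive-side check described above could look roughly like the sketch below. The byte layout (two copies of an int sequence number, the payload, then a CRC-32) and all names are assumptions made for illustration, not the format we actually used. No length field is needed because UDP preserves datagram boundaries.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.zip.CRC32;

// Hypothetical chunk format: [seq][seq][payload...][crc32]
final class ChunkCodec {
    enum Verdict { OK, NACK, DROP }

    static final class Decoded {
        final Verdict verdict;
        final int seq;          // -1 when the sequence number cannot be trusted
        final byte[] payload;   // null unless verdict is OK

        Decoded(Verdict verdict, int seq, byte[] payload) {
            this.verdict = verdict;
            this.seq = seq;
            this.payload = payload;
        }
    }

    private static final int HEADER = 8;   // two copies of the sequence number
    private static final int TRAILER = 4;  // CRC-32 over header and payload

    static byte[] encode(int seq, byte[] payload) {
        ByteBuffer buf = ByteBuffer.allocate(HEADER + payload.length + TRAILER);
        buf.putInt(seq).putInt(seq).put(payload);
        buf.putInt((int) crc(buf.array(), HEADER + payload.length));
        return buf.array();
    }

    static Decoded decode(byte[] packet) {
        if (packet.length < HEADER + TRAILER) {
            return new Decoded(Verdict.DROP, -1, null);
        }
        ByteBuffer buf = ByteBuffer.wrap(packet);
        int seqA = buf.getInt();
        int seqB = buf.getInt();
        int bodyEnd = packet.length - TRAILER;
        if (buf.getInt(bodyEnd) == (int) crc(packet, bodyEnd)) {
            return new Decoded(Verdict.OK, seqA, Arrays.copyOfRange(packet, HEADER, bodyEnd));
        }
        // Corrupted: only NACK if both copies of the sequence number agree.
        return seqA == seqB
                ? new Decoded(Verdict.NACK, seqA, null)
                : new Decoded(Verdict.DROP, -1, null);
    }

    private static long crc(byte[] data, int length) {
        CRC32 crc = new CRC32();
        crc.update(data, 0, length);
        return crc.getValue();
    }
}
```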
On the sending side:
Ack packets are received and checked, corrupted packets are dropped
Chunks, for which an ack was received, are marked as "ack/done" in the status array
If the first chunk in the status array is marked "ack/done", the status array slides up until its first chunk is again not marked done.
This possibly releases one or more unsent chunks to be sent.
For chunks in status "sent/pending", a timeout on the timestamp triggers a new send for this chunk, as the original chunk might have been lost.
On the receiving side:
Reception of chunk i+N (N being the window width) marks chunk i as ack/done, sliding up the receive window. If not all chunks sliding out of the receive window are marked as "ack/pending", this constitutes an unrecoverable error.
For chunks in status "ack/pending", a timeout on the timestamp triggers a new ack for this chunk, as the original ack message might have been lost.
Obviously there is the need for a special message type from the sending side if the send window slides out past the end of the file, to signal reception of an ack without sending chunk N+i. We implemented it by simply sending N chunks more than exist, but without the payload.
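The sender half of the status array can be sketched as below. This is a minimal illustration under assumptions, not our production code: the names are invented, time is passed in by the caller, and the receive window and the end-of-file padding chunks are left out.

```java
// Hypothetical sender-side window over chunk sequence numbers 0..totalChunks-1.
final class SendWindow {
    enum State { UNSENT, SENT_PENDING, ACK_DONE }

    private final State[] state;      // one slot per chunk currently inside the window
    private final long[] sentAt;      // time of the last (re)send per slot
    private final int totalChunks;
    private final long timeoutMillis;
    private int base = 0;             // sequence number of the first chunk in the window

    SendWindow(int windowSize, int totalChunks, long timeoutMillis) {
        this.state = new State[windowSize];
        java.util.Arrays.fill(state, State.UNSENT);
        this.sentAt = new long[windowSize];
        this.totalChunks = totalChunks;
        this.timeoutMillis = timeoutMillis;
    }

    /** Returns the next sequence number to send or resend, or -1 if nothing is due. */
    int nextToSend(long now) {
        for (int i = 0; i < state.length && base + i < totalChunks; i++) {
            int slot = (base + i) % state.length;
            boolean timedOut = state[slot] == State.SENT_PENDING
                    && now - sentAt[slot] >= timeoutMillis;
            if (state[slot] == State.UNSENT || timedOut) {
                state[slot] = State.SENT_PENDING;
                sentAt[slot] = now;
                return base + i;
            }
        }
        return -1; // window consumed, sending stops until an ack slides it
    }

    /** Records an ack and slides the window past every leading acked chunk. */
    void onAck(int seq) {
        if (seq < base || seq >= base + state.length || seq >= totalChunks) {
            return; // outside the window: stale or bogus ack
        }
        int slot = seq % state.length;
        if (state[slot] == State.SENT_PENDING) {
            state[slot] = State.ACK_DONE;
        }
        while (base < totalChunks && state[base % state.length] == State.ACK_DONE) {
            state[base % state.length] = State.UNSENT; // slot is reused for chunk base + N
            base++;
        }
    }

    boolean isComplete() {
        return base >= totalChunks;
    }
}
```

A send loop would call nextToSend with the current time, transmit the returned chunk, and feed every valid ack packet into onAck until isComplete returns true.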

You can be sure the packets you receive are the same as what was sent (i.e. if you send packet A and receive packet A you can be sure they are identical). The transport layer CRC checking on the packets ensures this. Since UDP does not have guaranteed delivery however, you need to be sure you received everything that was sent and you need to make sure you order it correctly.
In other words, if packets A, B, and C were sent in that order you might actually receive only A and B (or none). You might get them out of order, C, B, A. So your checking needs to take care of the guaranteed delivery aspect that TCP provides (verify ordering, ensure all the data is there, and notify the server to resend whatever you didn't receive) to whatever degree you require.
The reason to prefer UDP over TCP is that for some applications neither data ordering nor data completeness matter. For example, when streaming AAC audio packets the individual audio frames are so small that a small amount of them can be safely discarded or played out of order without disrupting the listening experience to any significant degree. If 99.9% of the packets are received and ordered correctly you can play the stream just fine and no one will notice. This works well for some cellular/mobile applications and you don't even have to worry about resending missed frames (note that Shoutcast and some other servers do use TCP for streaming in some cases [to facilitate in-band metadata], but they don't have to).
If you need to be sure all the data is there and ordered correctly, then you should use TCP, which will take care of verifying that data is all there, ordering it correctly, and resending if necessary.

UDP uses the same strategy for detecting packets with errors that TCP uses: a 16-bit checksum in the packet header.
The UDP packet structure is well known (as is TCP's), so a packet can easily be tampered with if it is not encrypted; adding another checksum (for instance CRC-32) would also make it more robust. If the purpose is to encrypt data (manually or over an SSL channel), I wouldn't bother adding another checksum.
Please also take into consideration that a packet can be sent twice. Make sure you deal with that accordingly.
You can check both packet structures on Wikipedia; both have checksums:
Transmission Control Protocol
User Datagram Protocol
You can check the TCP packet structure with more detail to get tips on how to deal with dropped packets. TCP protocol uses a "Sequence Number" and "Acknowledgment Number" for that purpose.
I hope this helps, and good luck.

UDP will drop packets that fail the internal per-packet checksum; CRC checking is useful at the application layer to determine, once a payload appears to be complete, that what was received is actually complete (no dropped packets) and matches what was sent (no man-in-the-middle or other attacks).

I chose UDP for my peer-to-peer service; how can I prove it's reliable in my situation?

I have two Debian servers located on the same subnet. They are connected by a switch. I am aware that UDP is unreliable.
Question 1: I assume the link layer is Ethernet, and the MTU of standard Ethernet is 1500 bytes. However, when I pinged one server from the other, I found that the maximum packet size that can be sent is 65507. Shouldn't it be 1500 bytes? Can I say that, because there's no router between these two servers, the IP datagram will not be fragmented?
Question 2: Because the two servers are directly connected with a switch, can I assume that all datagrams arrive in order and that there is no loss on the path?
Question 3: How can I determine the chance of a datagram being dropped at the server because of buffer overflow? What size should I set the receive buffer to so that datagrams will not overflow it?

No. UDP is not even reliable between processes on the same machine. If packets are sent to a socket without giving the receiver process time to read them, the buffer will overflow and packets will be lost.
You did your ping test with fragmentation enabled. Besides that, ping doesn't use UDP, but ICMP, so the results mean nothing. UDP packets smaller than the MTU will not be fragmented, but the MTU depends on more factors, such as IP options and VLAN headers, so it may not be greater than 1500.
No. Switches perform buffering, and it's possible for the internal buffers to overflow. Consider a 24 port switch where 23 nodes are all transmitting as fast as possible to the last node. Clearly the connection to the last node cannot handle the aggregate traffic of 23 other links, the switch will try to buffer packets but eventually end up dropping them.
Besides that, electrical noise can corrupt packets in transit, causing them to be discarded when the checksum fails.
To analyze the chance of buffer overflow, you could employ queuing theory to find the probability that a packet arrives when the buffer is full. You'll need some assumptions regarding the probability distribution on the rate of packet transmission and the processing time. The number of packets in the buffer then form a finite chain, hopefully Markov, which you can solve for the steady-state probabilities of each state in the chain. Good search keywords to find out more would be "queuing theory", "Markov chain", "call capacity", "circuit capacity", "load factor".
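As a worked example of that kind of analysis, the simplest textbook model is M/M/1/K: Poisson arrivals, exponentially distributed processing times, one consumer, and room for K packets in total (including the one being processed). With load rho = arrival rate / processing rate, the probability that an arriving packet finds the buffer full is (1 - rho) * rho^K / (1 - rho^(K+1)). Real traffic is often burstier than Poisson, so treat the sketch below as a rough lower bound rather than a prediction; the class name is invented for illustration.

```java
// Blocking probability of an M/M/1/K queue (assumes Poisson arrivals, exponential service).
final class BufferLoss {
    private BufferLoss() {}

    /**
     * @param rho      offered load: arrival rate divided by service rate
     * @param capacity K, the number of packets the system can hold in total
     * @return probability that an arriving packet is dropped because the system is full
     */
    static double blockingProbability(double rho, int capacity) {
        if (rho < 0 || capacity < 1) {
            throw new IllegalArgumentException("rho must be >= 0 and capacity >= 1");
        }
        if (Math.abs(rho - 1.0) < 1e-12) {
            return 1.0 / (capacity + 1); // limit of the formula as rho approaches 1
        }
        return (1 - rho) * Math.pow(rho, capacity) / (1 - Math.pow(rho, capacity + 1));
    }
}
```

For example, at rho = 0.5 a buffer that holds only one packet drops a third of the arrivals, and every additional slot roughly halves the loss.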
EDIT: You changed the title of the question. The answer to your new question is: "You can't prove something that isn't true." If you want to make a reliable application using UDP, you should add your own acknowledgement and loss handling logic.

The 64 KB maximum packet size is the absolute limit of the protocol, as opposed to the 1500 byte MTU you may have configured (the MTU can be changed easily, the 64 KB limit cannot).
In practice you will probably never see reordered datagrams in your scenario. And you'll probably only lose them if the receiving side is not processing them fast enough (or is shut off completely).
The "chances" of a datagram being dropped by the receiver are not something we can really quantify without knowing a whole lot more about your situation. If the receiver processes datagrams faster than the sender sends them, you're fine; otherwise you may lose some. Knowing how many and exactly when is a considerably finer point.

The IP stack will fragment and defragment the packet for you. You can test this by setting the no-fragment flag. The packet will be dropped.
No. They will most likely arrive in order and probably won't be dropped, but the network stacks in your sender, router and receiver are free to drop a packet if they can't handle it when it arrives. Also remember that when a large packet is fragmented, one lost fragment means that the whole packet will be dropped by the stack.
I guess you can probe by sending 1000 packets and measuring the loss, but historical values do not predict the future...
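The arithmetic behind the 65507 figure and the fragment count can be sketched like this. It assumes IPv4 with a 20-byte header (no options) and independent per-fragment loss, which is a simplification; the class and method names are only for illustration.

```java
// IPv4/UDP size arithmetic, assuming a 20-byte IP header without options.
final class UdpSizes {
    static final int IPV4_HEADER = 20;
    static final int UDP_HEADER = 8;
    static final int MAX_IP_PACKET = 65535;

    private UdpSizes() {}

    /** Largest UDP payload that fits in a single IPv4 datagram. */
    static int maxUdpPayload() {
        return MAX_IP_PACKET - IPV4_HEADER - UDP_HEADER;
    }

    /** Number of IP fragments needed to carry one UDP datagram over a link with the given MTU. */
    static int fragmentCount(int udpPayload, int mtu) {
        int ipPayload = UDP_HEADER + udpPayload;
        int lastMax = mtu - IPV4_HEADER;       // the last fragment may use all remaining room
        if (ipPayload <= lastMax) {
            return 1;                          // fits unfragmented
        }
        int perFragment = lastMax / 8 * 8;     // non-final fragments carry a multiple of 8 bytes
        return 1 + (ipPayload - lastMax + perFragment - 1) / perFragment;
    }

    /** Chance the whole datagram survives if each fragment is lost independently. */
    static double deliveryProbability(int fragments, double perFragmentLoss) {
        return Math.pow(1.0 - perFragmentLoss, fragments);
    }
}
```

With a 1500-byte MTU, a maximum-size datagram needs 45 fragments, so even a small per-fragment loss rate is multiplied considerably.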

Question 1: You are confusing the MTU with the TCP maximum packet size; see here
Question 2: Two servers connected via a switch do not guarantee that datagrams arrive in order. There will be other network transmissions occurring that will interfere with the UDP stream, potentially causing out-of-sequence frames.
Question 3: Answered by Ben Voigt above.

How to minimize UDP packet loss

I am receiving ~3000 UDP packets per second, each of them having a size of ~200 bytes. I wrote a Java application which listens for those UDP packets and just writes the data to a file. Then the server sends 15000 messages at the previously specified rate. After writing, the file contains only ~3500 messages. Using Wireshark I confirmed that all 15000 messages were received by my network interface. After that I tried changing the buffer size of the socket (which was initially 8496 bytes):
((java.net.MulticastSocket) socket).setReceiveBufferSize(32 * 1024);
That change increased the number of messages saved to ~8000. I kept increasing the buffer size up to 1MB. After that, number of messages saved reached ~14400. Increasing buffer size to larger values wouldn't increase the number of messages saved. I think I have reached the maximum allowed buffer size. Still, I need to capture all 15000 messages which were received by my network interface.
Any help would be appreciated. Thanks in advance.

Smells like a bug, most likely in your code. If the UDP packets are delivered over the network, they will be queued for delivery locally, as you've seen in Wireshark. Perhaps your program just isn't making timely progress on reading from its socket - is there a dedicated thread for this task?
You might be able to make some headway by detecting which packets are being lost by your program. If all the packets lost are early ones, perhaps the data is being sent before the program is waiting to receive them. If they're all later, perhaps it exits too soon. If they are at regular intervals there may be some trouble in your code which loops receiving packets. etc.
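One way to do that, assuming the sender can put a running sequence number into each message (the question does not say whether it does), is to record the numbers that arrive and print the gaps afterwards. The helper below is a sketch with invented names.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Hypothetical diagnostic: which sequence numbers never arrived?
final class LossReport {
    private LossReport() {}

    /** Returns the missing sequence numbers in 0..expectedCount-1 as inclusive "first-last" ranges. */
    static List<String> missingRanges(int expectedCount, int[] receivedSeqs) {
        BitSet seen = new BitSet(expectedCount);
        for (int seq : receivedSeqs) {
            if (seq >= 0 && seq < expectedCount) {
                seen.set(seq);
            }
        }
        List<String> ranges = new ArrayList<>();
        int start = seen.nextClearBit(0);
        while (start < expectedCount) {
            int nextSeen = seen.nextSetBit(start);            // -1 if nothing else arrived
            int end = (nextSeen < 0 ? expectedCount : nextSeen) - 1;
            ranges.add(start + "-" + end);
            start = nextSeen < 0 ? expectedCount : seen.nextClearBit(nextSeen);
        }
        return ranges;
    }
}
```

A single range at the start or end points at startup or shutdown timing; many short ranges spread over the run point at the receive loop falling behind.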
In any case you seem exceptionally anxious about lost packets. By design UDP is not a reliable transport. If the loss of these multicast packets is a problem for your system (rather than just a mystery that you'd like to solve for performance reasons) then the system design is wrong.

The problem you appear to be having is that you get a delay when writing to a file. I would read all the data into memory before writing to the file (or write to the file in another thread).
However, there is no way to ensure 100% of packets are received with UDP without the ability to ask for packets to be sent again (something TCP does for you).

I see that you are using UDP to send the file contents. In UDP the order of packets is not assured. If you are not worried about the order, you can put all the packets in a queue and have another thread process the queue and write the contents to the file. This way the socket reader thread is not blocked by file operations.
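A minimal sketch of that split is shown below: one method for the socket thread that only receives and enqueues, and one for the writer thread that drains the queue. The class name, the 2048-byte buffer and the STOP sentinel are assumptions for illustration; the queue is assumed to be unbounded (for example a LinkedBlockingQueue), so memory use grows if the writer cannot keep up.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.util.Arrays;
import java.util.concurrent.BlockingQueue;

// Hypothetical helper separating socket reads from file writes.
final class QueuedUdpWriter {
    /** Sentinel telling the writer to finish; compared by identity, never by content. */
    static final byte[] STOP = new byte[0];

    private QueuedUdpWriter() {}

    /** Socket thread: receive and enqueue only, never touch the disk. */
    static void receiveLoop(DatagramSocket socket, BlockingQueue<byte[]> queue) throws IOException {
        byte[] buf = new byte[2048];
        DatagramPacket packet = new DatagramPacket(buf, buf.length);
        while (!socket.isClosed()) {
            packet.setLength(buf.length);   // reset, receive() shrinks the length
            socket.receive(packet);
            queue.offer(Arrays.copyOf(buf, packet.getLength()));
        }
    }

    /** Writer thread: write messages until STOP is taken. Returns the number of messages written. */
    static int drain(BlockingQueue<byte[]> queue, OutputStream out) {
        int written = 0;
        try {
            for (byte[] msg = queue.take(); msg != STOP; msg = queue.take()) {
                out.write(msg);
                written++;
            }
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // stop early, keep the interrupt flag
        }
        return written;
    }
}
```

Wrapping the file in a BufferedOutputStream before passing it to drain should help further, since ~200-byte writes are otherwise expensive.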

The receive buffer size is configured at OS level.
For example, on a Linux system: sysctl -w net.core.rmem_max=26214400, as in this article:
https://access.redhat.com/site/documentation/en-US/JBoss_Enterprise_Web_Platform/5/html/Administration_And_Configuration_Guide/jgroups-perf-udpbuffer.html
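Since setReceiveBufferSize is only a hint to the OS, it is worth reading the value back with getReceiveBufferSize to see what was actually granted. The sketch below does that on a throwaway socket; the class and method names are made up, and the granted value depends on the platform and its limits.

```java
import java.io.UncheckedIOException;
import java.net.DatagramSocket;
import java.net.SocketException;

// Hypothetical check: how much receive buffer does the OS actually give us?
final class ReceiveBufferCheck {
    private ReceiveBufferCheck() {}

    /** Requests a receive buffer size on a temporary socket and returns the size the OS reports. */
    static int requestReceiveBuffer(int requestedBytes) {
        try (DatagramSocket socket = new DatagramSocket()) {
            socket.setReceiveBufferSize(requestedBytes);
            return socket.getReceiveBufferSize();
        } catch (SocketException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

If requestReceiveBuffer(4 * 1024 * 1024) reports much less than was asked for, the OS limit is the bottleneck and raising the value in Java alone will not help.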

This is a Windows only answer, but the following changes in the Network Controller Card properties made a DRAMATIC difference in packet loss for our use-case.
We are consuming around 200 Mbps of UDP data and were experiencing substantial packet loss under moderate server load.
The network card in use is an Asus ROG Aerion 10G card, but I would expect most high-end network controller cards to expose similar properties. You can access them via Device Manager->Network card->Right-Click->Properties->Advanced Options.
1. Increase number of Receive Buffers:
Default value was 512; we could increase it up to 1024. In our case, higher settings were accepted, but the network card becomes disabled once we exceed 1024. Having a larger number of available buffers at the network-card level gives the system more tolerance to latency in transferring data from the network card buffers to the socket buffers where our apps finally can read the data.
2. Set Interrupt Moderation Rate to 'Off':
If I understood correctly, interrupt moderation coalesces multiple "buffer fill" notifications (via interrupts) into a single notification. So, the CPU will be interrupted less-often and fetch multiple buffers during each interrupt. This reduces CPU usage, but increases the chance a ready buffer is overwritten before being fetched, in case the interrupt is serviced late.
Additionally, we increased the socket buffer size (as the OP already did) and also enabled Circular Buffering at the socket level, as suggested by Len Holgate in a comment. This should also increase tolerance to latency in processing the socket buffers.

why does this Java programme cause UDP packet loss?

I'm running experiments on my machines A and B, both with Ubuntu Server 11.04 installed. A and B are connected to the same 1000 Mbps switch.
A is the sender:
while (total<=10,000)
send_udp_datagramPacket(new byte[100]) to B
B is the receiver:
while(true)
receive()
But finally I got less than 10,000 (about 9960) at B. Why is this happening?
Where did the lost packets go? Were they not actually sent to the wire to the switch? Or the switch lost them? Or they indeed got to B, but B's OS discarded them? Or they reached to Java, but Java threw them away because of a full buffer?
Any reply would be appreciated.

Remember, UDP does not provide reliable communication, it is intended for situations in which data loss is acceptable (streaming media for instance). Chances are good that this is a buffer overflow (my guess, don't rely on it) but the point is that if this data loss is not acceptable, use TCP instead.
If this is just for the sake of experimentation, try adding a delay (Thread.sleep()) in the loop and increase it until you get the maximum received packets.
EDIT:
As mentioned in a comment, the sleep() is NOT a fix and WILL eventually lose packets... that's just UDP.

But finally I got less than 10,000 (about 9960) at B. Why is this happening?
UDP is a lossy protocol. Even if you got 10,000 in this test you would still have to code for the possibility that some packets will be lost. They can also be fragmented (if larger than 532 bytes) and/or arrive out of order.
Where did the lost packets go?
They were dropped.
Were they not actually sent to the wire to the switch?
They can be dropped just about anywhere. I don't believe Java has any logic for dropping packets (but this too is not guaranteed in all implementations). They could be dropped by the OS or the network adapter, corrupted on the wire, or dropped by the switch.
Or the switch lost them?
It will do this if the packet arrived corrupt in some way or a buffer filled.
Or they indeed got to B, but B's OS discarded them?
Yes, or A's OS could have discarded them.
Or they reached to Java, but Java threw them away because of a full buffer?
Java doesn't have its own buffers. It uses the underlying buffers from the OS. But the packets could be lost at this stage.
Note: No matter how much you decrease the packet loss, you must always allow for some loss.

Why does this Java programme cause UDP packet loss?
The question is ill-formed. Neither Java nor your program causes UDP packet loss. UDP causes UDP packet loss. There is no guarantee that any UDP packet will ever arrive. See RFC 768.

In Java, how do I deal with UDP messages that are greater than the maximum UDP data payload?

I read this question about the error that I'm getting and I learned that UDP data payloads can't be more than 64k. The suggestions that I've read are to use TCP, but that is not an option in this particular case. I am interfacing with an external system that is transmitting data over UDP, but I don't have access to that external system at this time, so I'm simulating it.
I have data messages that are upwards of 1,400,000 bytes in some instances and it's a requirement that the UDP protocol is used. I am not able to change protocols (I would much rather use TCP or a reliable protocol built on UDP). Instead, I have to find a way to transmit large payloads over UDP from a test application into the system that I am building and to read those large payloads in the system that I'm building for processing. I don't have to worry about dropped packets, either - if I don't get the datagram, I don't care - just wait for the next payload to arrive. If it's incomplete or missing, just throw it all away and continue waiting. I also don't know the size of the datagram in advance (they range from a few hundred bytes to 1,400,000+ bytes).
I've already set my send and receive buffer sizes large enough, but that's not sufficient. What else can I do?

UDP packets have a 16 bit length field. It's nothing to do with Java. They cannot be bigger, period. If the server you are talking to is immutable, you are stuck with what you can fit into a packet.
If you can change the server and thus the protocol, you can more or less reimplement TCP for yourself. Since UDP is defined to be unreliable, you need the full retransmission mechanism to cope with packets that are dropped in the network somewhere. So, you have to split the 'message' into chunks, send the chunks, and have a protocol for requesting retransmission of lost chunks.
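If the framing is under your control, splitting and reassembling could look like the sketch below. The 12-byte header (message id, chunk index, chunk count) is purely hypothetical, since the external system's real format is unknown. Matching the question's requirement, there is no retransmission: an incomplete message is simply thrown away as soon as a chunk of a different message shows up. It assumes chunks of different messages are not interleaved.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical framing: [messageId][index][total][chunk bytes...]
final class ChunkedMessages {
    static final int HEADER = 12;

    private ChunkedMessages() {}

    /** Splits one message into datagram-sized chunks, each carrying the header above. */
    static List<byte[]> split(int messageId, byte[] message, int chunkSize) {
        int total = Math.max(1, (message.length + chunkSize - 1) / chunkSize);
        List<byte[]> out = new ArrayList<>();
        for (int i = 0; i < total; i++) {
            int from = i * chunkSize;
            int len = Math.min(chunkSize, message.length - from);
            ByteBuffer buf = ByteBuffer.allocate(HEADER + len);
            buf.putInt(messageId).putInt(i).putInt(total).put(message, from, len);
            out.add(buf.array());
        }
        return out;
    }

    static final class Reassembler {
        private int currentId;
        private byte[][] parts;   // null when no message is in progress
        private int received;
        private int bytes;

        /** Feeds one datagram; returns the whole message once every chunk has arrived, else null. */
        byte[] accept(byte[] datagram) {
            if (datagram.length < HEADER) {
                return null;
            }
            ByteBuffer buf = ByteBuffer.wrap(datagram);
            int id = buf.getInt();
            int index = buf.getInt();
            int total = buf.getInt();
            if (total < 1) {
                return null;
            }
            if (parts == null || id != currentId) {
                // A different message started: discard whatever was incomplete.
                currentId = id;
                parts = new byte[total][];
                received = 0;
                bytes = 0;
            }
            if (index < 0 || index >= parts.length || parts[index] != null) {
                return null; // out of range or duplicate
            }
            parts[index] = new byte[buf.remaining()];
            buf.get(parts[index]);
            bytes += parts[index].length;
            if (++received < parts.length) {
                return null;
            }
            ByteBuffer whole = ByteBuffer.allocate(bytes);
            for (byte[] part : parts) {
                whole.put(part);
            }
            parts = null;
            return whole.array();
        }
    }
}
```

With a chunk size of about 1400 bytes, a 1,400,000-byte message becomes roughly 1000 datagrams, so the receive buffer and the read loop need to keep up with that burst.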

It's a requirement ...
The requirement should also therefore dictate the packetization technique. You need more information about the external system and its protocol. Note that the maximum IPv4 UDP payload is 65535-28 bytes, and the maximum practical payload is < 1500 bytes once a router gets involved.
