Write a program that simulates a computer network using discrete time.
The first packet on each router queue makes one hop per time interval.
Each router has only a finite number of buffers. If a packet arrives
and there is no room for it, it is discarded and not retransmitted.
Instead, there is an end-to-end protocol, complete with timeouts and
acknowledgement packets, that eventually regenerates the packet from
the source router.
Plot the throughput of the network as a function of the end-to-end
timeout interval, parameterized by error rate.
I've had a few assignments kind of like this in the past. My best guess is that "hop" signifies transfer from one node in the network to another, and that "time interval" is arbitrary, and essentially represents an update cycle for the entire network.
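To make that concrete, here's a minimal Java sketch of one such update cycle under my interpretation (all names are mine; routing tables, the error rate and the end-to-end timeout bookkeeping are left out):

    import java.util.ArrayDeque;
    import java.util.Deque;

    // Hypothetical skeleton: one tick() is one "time interval"; the
    // head-of-line packet of every router moves at most one hop per tick.
    class Router {
        final Deque<Integer> queue = new ArrayDeque<>(); // queued packet ids
        final int capacity;                              // finite buffer count
        Router next;                                     // single outgoing link, for simplicity

        Router(int capacity) { this.capacity = capacity; }

        // Offer a packet; if the buffer is full it is discarded (the
        // end-to-end timeout, not modelled here, would later regenerate
        // it at the source).
        boolean offer(int packetId) {
            if (queue.size() >= capacity) return false;
            queue.addLast(packetId);
            return true;
        }
    }

    class Simulation {
        static void tick(Router[] routers) {
            // Snapshot the heads first so a forwarded packet cannot make
            // a second hop within the same time interval.
            Integer[] heads = new Integer[routers.length];
            for (int i = 0; i < routers.length; i++) heads[i] = routers[i].queue.pollFirst();
            for (int i = 0; i < routers.length; i++) {
                if (heads[i] == null) continue;
                if (routers[i].next != null) {
                    routers[i].next.offer(heads[i]); // may be dropped on arrival
                }
                // next == null: the packet has reached its destination;
                // count it toward throughput here.
            }
        }
    }

Counting deliveries per tick while sweeping the timeout interval and the error rate would then give the data for the requested plot.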
I have a durable queue which holds persistent messages. The messages arrive into the queue at a rate of about 10 messages per second.
The client is unable to fetch those messages at that rate. As a result the queue on the server keeps growing.
Each message is less than 1 KB and I have a healthy 2 Mbps line between the server and my machine. Using a network monitoring utility, I found that it is hardly using any of that bandwidth.
The client is doing nothing with the messages as of now, just printing them to console so processing time on client is almost 0.
Some other details:
I am using a java client.
I have set the client to prefetch 10000 messages. (also tried with default values)
The round trip time is about 350 ms.
Messages are individually acknowledged.
The available resources are being underutilized, and 10 messages per second is hardly any load in my opinion. How do I speed things up so that the messages held up in the queue are transferred to the client faster, possibly using some sort of batching?
If you are individually acknowledging messages every 350 ms, I would expect the consumer to achieve about 1/0.35, or about 2.9 messages per second. However, the protocol might not be that efficient, and it may need two round trips to the server to acknowledge a message and get the next one, i.e. 1.4 messages per second may be more realistic.
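A quick back-of-the-envelope check of those figures (the RTT is the only input):

    // With one message acknowledged per round trip, the RTT alone caps
    // the consume rate, regardless of bandwidth.
    public class AckCeiling {
        public static void main(String[] args) {
            double rttSeconds = 0.350;
            System.out.printf("1 round trip per message: %.1f msg/s%n", 1 / rttSeconds);        // ~2.9
            System.out.printf("2 round trips per message: %.1f msg/s%n", 1 / (2 * rttSeconds)); // ~1.4
        }
    }

This is also why batching the acknowledgements, as you suggest, raises the ceiling: the cost of one round trip is amortized over many messages.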
A round trip of 350 ms is very high, you can go around the world and back again in that time, so a simple solution may not work best for you. e.g. London -> New York -> Tokyo -> London.
I would try having a broker local to your client instead. This way the round trip is between your client and your local broker.
I have been given the classical task of transferring files using UDP. On different resources, I have read both checking for errors on the packets (adding CRC alongside data to packets) is necessary AND UDP already checks for corrupted packets and discards them, so I only need to worry about resending dropped packets.
Which one of them is correct? Do I need to manually perform an integrity check on the arrived packets or incorrect ones are already discarded?
Language for the project is Java by the way.
EDIT: Some sources (course books, the internet) say the checksum only covers the header, and therefore ensures the sender and receiver IPs are correct, etc. Some sources say the checksum also covers the data segment. Some sources say the checksum may cover the data segment BUT it's optional and decided by the OS.
EDIT 2: Asked my professors and they say UDP error checking on the data segment is optional in IPv4 and default in IPv6. But I still don't know whether it's under the programmer's control, the OS's, or another layer's...
First fact:
UDP has a 16-bit checksum field starting at bit 48 of the packet header. This suffers from (at least) two weaknesses:
The checksum is not mandatory: all bits set to 0 are defined as "no checksum"
It is a 16-bit checksum in the strict sense of the word, so it is susceptible to undetected corruption.
Together this means that UDP's built-in checksum may or may not be reliable enough, depending on your environment.
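For illustration, here is a sketch of the kind of 16-bit ones' complement sum meant above (the RFC 1071 family UDP belongs to; real UDP also covers a pseudo-header and the header itself, which this sketch omits):

    // Illustrative version of the checksum family UDP belongs to
    // (RFC 1071): a ones' complement sum of 16-bit words, finally inverted.
    public class InternetChecksum {
        static int checksum(byte[] data) {
            long sum = 0;
            for (int i = 0; i < data.length; i += 2) {
                int word = (data[i] & 0xFF) << 8;
                if (i + 1 < data.length) word |= (data[i + 1] & 0xFF); // odd length: pad with a zero byte
                sum += word;
                sum = (sum & 0xFFFF) + (sum >>> 16); // fold the carry back in
            }
            // Real UDP transmits a computed 0 as 0xFFFF, because an
            // all-zero field means "no checksum".
            return (int) (~sum & 0xFFFF);
        }
    }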
Second fact:
An even more realistic threat than data corruption in transit is packet loss and reordering: UDP makes no guarantees that
all packets will (eventually) arrive at all
packets will arrive in the same sequence as sent
Indeed, UDP has no built-in mechanism at all to deal with payloads bigger than a single packet, stemming from the fact that it wasn't built for that.
Conclusion:
Appending packet after packet as received, without additional measures, is bound to produce a receive stream differing from the send stream in all but the most favourable environments, making raw UDP a less than optimal protocol for direct file transfer.
If you want to or must use UDP to transfer files, you need to build those parts that are integral to TCP but not to UDP into the application. There is a saying, though, that this will most likely result in an inferior reimplementation of TCP.
Successful implementations include many peer-to-peer file-sharing protocols, where protection against connection interruption, packet loss and reordering needs to be part of the application functionality anyway, in order to defeat or mitigate filters.
Implementation recommendations:
What has worked for us is a chunked window implementation: the payload is separated into chunks of a fixed and convenient length (we used 1023 bytes), and a status array of N such chunks is kept on both the sending and the receiving end. (A minimal sketch of the sender-side bookkeeping follows this description.)
On the sending side:
A UDP message is initiated, containing such a chunk, its sequence number in the stream (included more than once) and a checksum or hash.
The status array marks this chunk as "sent/pending" with a timestamp
Sending stops if the complete status array (send window) is consumed
On the receiving side:
received packets are checked against their checksum
corrupted packets are negatively acknowledged if all copies of the sequence number agree, and otherwise dropped
OK packets are marked in the status array as "received/pending" with a timestamp
Acknowledgement works by sending an ack packet if either enough chunks have been received to fill an ack packet, or the timestamp of the oldest "received/pending" entry grows too old (a few ms up to a few hundred ms).
Ack packets need checksumming, but no sequencing.
Chunks for which an ack has been sent are marked as "ack/pending" with a timestamp in the status array
On the sending side:
Ack packets are received and checked; corrupted packets are dropped
Chunks for which an ack was received are marked as "ack/done" in the status array
If the first chunk in the status array is marked "ack/done", the status array slides up until its first chunk is again not marked done.
This possibly releases one or more unsent chunks to be sent.
For chunks in status "sent/pending", a timeout on the timestamp triggers a new send of this chunk, as the original chunk might have been lost.
On the receiving side:
Reception of chunk i+N (N being the window width) marks chunk i as ack/done, sliding up the receive window. If not all chunks sliding out of the receive window are marked as "ack/pending", this constitutes an unrecoverable error.
For chunks in status "ack/pending", a timeout on the timestamp triggers a new ack for this chunk, as the original ack message might have been lost.
Obviously a special message type is needed from the sending side once the send window slides past the end of the file, so that the receiver can still see its acks confirmed without an actual chunk i+N being sent; we implemented it by simply sending N more chunks than exist, but without payload.
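Here is a minimal Java sketch of the sender-side status array described above (names are mine; the actual DatagramSocket I/O, checksumming and the receiver side are omitted):

    import java.util.Arrays;

    // Hypothetical sender-side bookkeeping for the chunked window scheme.
    class SendWindow {
        enum State { UNSENT, SENT_PENDING, ACK_DONE }

        final State[] state;
        final long[] sentAt;      // timestamp per chunk, for retransmit timeouts
        final long timeoutMillis;
        int base;                 // sequence number of the first chunk in the window

        SendWindow(int width, long timeoutMillis) {
            state = new State[width];
            Arrays.fill(state, State.UNSENT);
            sentAt = new long[width];
            this.timeoutMillis = timeoutMillis;
        }

        // Next chunk to (re)send, or -1 if the whole send window is consumed.
        int nextToSend(long now) {
            for (int i = 0; i < state.length; i++) {
                if (state[i] == State.UNSENT
                        || (state[i] == State.SENT_PENDING && now - sentAt[i] > timeoutMillis)) {
                    state[i] = State.SENT_PENDING;
                    sentAt[i] = now;
                    return base + i;
                }
            }
            return -1; // sending stops until an ack slides the window
        }

        // Mark an acked chunk and slide the window past a done prefix.
        void onAck(int seq) {
            int i = seq - base;
            if (i < 0 || i >= state.length) return; // stale or out-of-window ack
            state[i] = State.ACK_DONE;
            while (state[0] == State.ACK_DONE) {
                System.arraycopy(state, 1, state, 0, state.length - 1);
                System.arraycopy(sentAt, 1, sentAt, 0, sentAt.length - 1);
                state[state.length - 1] = State.UNSENT; // releases one more chunk
                base++;
            }
        }
    }

The receiving side mirrors this with its "received/pending" and "ack/pending" states and its own timestamps.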
You can be sure the packets you receive are the same as what was sent (i.e. if you send packet A and receive packet A, you can be sure they are identical). The link-layer CRC and the transport-layer checksum ensure this. Since UDP does not have guaranteed delivery, however, you need to be sure you received everything that was sent, and you need to make sure you order it correctly.
In other words, if packets A, B, and C were sent in that order you might actually receive only A and B (or none). You might get them out of order, C, B, A. So your checking needs to take care of the guaranteed delivery aspect that TCP provides (verify ordering, ensure all the data is there, and notify the server to resend whatever you didn't receive) to whatever degree you require.
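A sketch of what that completeness check might look like on the receiving side (the sequence numbering and the resend-request format are whatever your protocol defines):

    import java.util.ArrayList;
    import java.util.BitSet;
    import java.util.List;

    // Hypothetical receiver-side bookkeeping: record which sequence
    // numbers arrived, then ask the sender for whatever is missing.
    public class DeliveryTracker {
        private final BitSet received = new BitSet();
        private final int total; // expected packet count, assumed known up front

        DeliveryTracker(int totalPackets) { this.total = totalPackets; }

        void onPacket(int seq) { received.set(seq); } // duplicates are harmless

        // Sequence numbers to ask the sender to retransmit.
        List<Integer> missing() {
            List<Integer> gaps = new ArrayList<>();
            for (int i = received.nextClearBit(0); i < total; i = received.nextClearBit(i + 1)) {
                gaps.add(i);
            }
            return gaps;
        }

        boolean complete() { return received.nextClearBit(0) >= total; }
    }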
The reason to prefer UDP over TCP is that for some applications neither data ordering nor data completeness matter. For example, when streaming AAC audio packets the individual audio frames are so small that a small amount of them can be safely discarded or played out of order without disrupting the listening experience to any significant degree. If 99.9% of the packets are received and ordered correctly you can play the stream just fine and no one will notice. This works well for some cellular/mobile applications and you don't even have to worry about resending missed frames (note that Shoutcast and some other servers do use TCP for streaming in some cases [to facilitate in-band metadata], but they don't have to).
If you need to be sure all the data is there and ordered correctly, then you should use TCP, which will take care of verifying that data is all there, ordering it correctly, and resending if necessary.
The UDP protocol uses the same strategy for detecting damaged packets that the TCP protocol uses: a 16-bit checksum in the packet header.
The UDP packet structure is well known (as is the TCP one), so a packet can easily be tampered with if not encrypted; adding another checksum (for instance CRC-32) would also make it more robust. If the purpose is to encrypt data (manually or over an SSL channel), I wouldn't bother adding another checksum.
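For example, a sketch of such an application-level wrapper using java.util.zip.CRC32 (the 8-byte trailer layout is just a convenient choice of mine):

    import java.nio.ByteBuffer;
    import java.util.zip.CRC32;

    // Hypothetical application-level integrity trailer: payload + CRC-32.
    public class CrcWrapper {
        static byte[] withCrc(byte[] payload) {
            CRC32 crc = new CRC32();
            crc.update(payload);
            return ByteBuffer.allocate(payload.length + 8)
                    .put(payload)
                    .putLong(crc.getValue()) // CRC-32 fits in 4 bytes; 8 keeps the layout simple
                    .array();
        }

        static boolean crcOk(byte[] packet) {
            if (packet.length < 8) return false;
            ByteBuffer buf = ByteBuffer.wrap(packet);
            byte[] payload = new byte[packet.length - 8];
            buf.get(payload);
            CRC32 crc = new CRC32();
            crc.update(payload);
            return buf.getLong() == crc.getValue();
        }
    }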
Please take also into consideration that a packet can be sent twice. Make sure you deal with that accordingly.
You can check both packet structure on Wikipedia, both have checksums:
Transmission Control Protocol
User Datagram Protocol
You can check the TCP packet structure with more detail to get tips on how to deal with dropped packets. TCP protocol uses a "Sequence Number" and "Acknowledgment Number" for that purpose.
I hope this helps, and good luck.
UDP will drop packets that don't pass the internal per-packet checksum; an application-level CRC check is useful to determine, once a payload appears to be complete, that what was received is actually complete (no dropped packets) and matches what was sent (no man-in-the-middle or other tampering).
I have two debian servers located on the same subnet. They are connected by a switch. I am aware the UDP is unreliable.
Question 1: I assume the link layer is Ethernet, and the MTU of standard
Ethernet is 1500 bytes. However, when I did a ping from one server to
another, I found that the maximum packet size that can be sent is
65507 bytes. Shouldn't it be 1500 bytes? Can I say that, because there's no router between these two servers, the IP datagram will
not be fragmented?
Question 2: Because the two servers are directly connected by a switch, can I
assume that all datagrams arrive in order and none are lost on the path?
Question 3: How can I determine the chance of a datagram being dropped
at the server because of buffer overflow? What size should the receive buffer be so that datagrams do not overflow it?
No. UDP is not even reliable between processes on the same machine. If packets are sent to a socket without giving the receiver process time to read them, the buffer will overflow and packets will be lost.
You did your ping test with fragmentation enabled. Besides that, ping doesn't use UDP but ICMP, so the results mean nothing. UDP packets smaller than the MTU will not be fragmented, but the MTU depends on further factors, such as IP options and VLAN headers, so it may be less than 1500.
No. Switches perform buffering, and it's possible for the internal buffers to overflow. Consider a 24-port switch where 23 nodes are all transmitting as fast as possible to the last node. Clearly the connection to the last node cannot handle the aggregate traffic of the 23 other links; the switch will try to buffer packets but will eventually end up dropping them.
Besides that, electrical noise can corrupt packets in transit, causing them to be discarded when the checksum fails.
To analyze the chance of buffer overflow, you could employ queuing theory to find the probability that a packet arrives when the buffer is full. You'll need some assumptions about the probability distributions of packet arrivals and processing times. The number of packets in the buffer then forms a finite chain, hopefully Markov, which you can solve for the steady-state probability of each state. Good search keywords to find out more are "queuing theory", "Markov chain", "call capacity", "circuit capacity" and "load factor".
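As a toy example of that approach: the simplest such chain is the M/M/1/K queue (Poisson arrivals at rate lambda, exponential service at rate mu, K buffer slots), where the steady-state probability that an arrival finds the buffer full has a closed form. The numbers below are purely illustrative:

    // For an M/M/1/K queue, the probability an arrival finds the buffer
    // full is P(K) = (1 - rho) * rho^K / (1 - rho^(K+1)), rho = lambda/mu.
    public class BufferLoss {
        static double dropProbability(double lambda, double mu, int k) {
            double rho = lambda / mu;
            if (rho == 1.0) return 1.0 / (k + 1); // limiting case
            return (1 - rho) * Math.pow(rho, k) / (1 - Math.pow(rho, k + 1));
        }

        public static void main(String[] args) {
            // e.g. 900 packets/s arriving, 1000 packets/s service rate, 16 buffers
            System.out.println(dropProbability(900, 1000, 16)); // ~0.022, i.e. about 2.2% dropped
        }
    }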
EDIT: You changed the title of the question. The answer to your new question is: "You can't prove something that isn't true." If you want to make a reliable application using UDP, you should add your own acknowledgement and loss handling logic.
The 64 KB maximum packet size is the absolute limit of the protocol, as opposed to the 1500 byte MTU you may have configured (the MTU can be changed easily, the 64 KB limit cannot).
In practice you will probably never see reordered datagrams in your scenario. And you'll probably only lose them if the receiving side is not processing them fast enough (or is shut off completely).
The "chances" of a datagram being dropped by the receiver are not something we can really quantify without knowing a whole lot more about your situation. If the receiver processes datagrams faster than the sender sends them, you're fine; otherwise you may lose some, and knowing how many and exactly when is a considerably finer point.
The IP stack will fragment and defragment the packet for you. You can test this by setting the don't-fragment flag: the packet will then be dropped.
No. They will most likely arrive in order, and probably not be dropped, but the network stack in your sender, router and receiver is free to drop a packet if it can't handle it when it arrives. Also remember that when a large packet is fragmented, one lost fragment means the whole packet will be dropped by the stack.
I guess you can probe by sending 1000 packets and measuring the loss, but historical values do not predict the future...
Question 1: You are confusing the MTU with the maximum UDP packet size; see here
Question 2: Connecting two servers via a switch does not guarantee that datagrams arrive in order. There will be other network transmissions occurring that can interfere with the UDP stream, potentially causing out-of-sequence frames
Question 3: Answered by Ben Voigt above.
I am receiving ~3000 UDP packets per second, each of them ~200 bytes in size. I wrote a Java application which listens for those UDP packets and just writes the data to a file. Then the server sends 15000 messages at the previously specified rate. Afterwards the file contains only ~3500 messages. Using Wireshark I confirmed that all 15000 messages were received by my network interface. After that I tried changing the buffer size of the socket (which was initially 8496 bytes):
    ((java.net.MulticastSocket) socket).setReceiveBufferSize(32 * 1024);
That change increased the number of messages saved to ~8000. I kept increasing the buffer size up to 1 MB; after that, the number of messages saved reached ~14400. Increasing the buffer size to larger values wouldn't increase the number of messages saved. I think I have reached the maximum allowed buffer size. Still, I need to capture all 15000 messages that were received by my network interface.
Any help would be appreciated. Thanks in advance.
Smells like a bug, most likely in your code. If the UDP packets are delivered over the network, they will be queued for delivery locally, as you've seen in Wireshark. Perhaps your program just isn't making timely progress on reading from its socket - is there a dedicated thread for this task?
You might be able to make some headway by detecting which packets are being lost by your program. If all the packets lost are early ones, perhaps the data is being sent before the program is ready to receive it. If they're all later ones, perhaps it exits too soon. If they are lost at regular intervals, there may be some trouble in the code that loops to receive packets, etc.
In any case you seem exceptionally anxious about lost packets. By design UDP is not a reliable transport. If the loss of these multicast packets is a problem for your system (rather than just a mystery that you'd like to solve for performance reasons) then the system design is wrong.
The problem you appear to be having is that writing to a file introduces delay. I would read all the data into memory before writing to the file (or write to the file in another thread).
However, there is no way to ensure 100% of packets are received with UDP without the ability to ask for packets to be sent again (something TCP does for you).
I see that you are using UDP to send the file contents. In UDP the order of packets is not assured. If you are not worried about the order, put all the packets in a queue and have another thread process the queue and write the contents to the file. That way the socket-reader thread is not blocked by file operations.
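A sketch of that reader/writer split (the port, file name and buffer sizes are placeholders):

    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.net.DatagramPacket;
    import java.net.DatagramSocket;
    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.LinkedBlockingQueue;

    // Hypothetical shape of the split: the socket thread only copies
    // datagrams into a queue; a second thread does the slow file I/O.
    public class UdpToFile {
        public static void main(String[] args) throws IOException {
            BlockingQueue<byte[]> pending = new LinkedBlockingQueue<>();
            DatagramSocket socket = new DatagramSocket(9999); // example port
            socket.setReceiveBufferSize(1024 * 1024);

            Thread writer = new Thread(() -> {
                try (FileOutputStream out = new FileOutputStream("capture.bin")) {
                    while (true) {
                        out.write(pending.take()); // blocks until data is queued
                    }
                } catch (IOException | InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            writer.setDaemon(true);
            writer.start();

            byte[] buf = new byte[2048];
            while (true) {
                DatagramPacket packet = new DatagramPacket(buf, buf.length);
                socket.receive(packet); // keep this loop as tight as possible
                pending.add(java.util.Arrays.copyOf(packet.getData(), packet.getLength()));
            }
        }
    }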
The receive buffer size is configured at the OS level.
For example, on a Linux system: sysctl -w net.core.rmem_max=26214400, as in this article:
https://access.redhat.com/site/documentation/en-US/JBoss_Enterprise_Web_Platform/5/html/Administration_And_Configuration_Guide/jgroups-perf-udpbuffer.html
This is a Windows-only answer, but the following changes in the network controller card's properties made a DRAMATIC difference in packet loss for our use case.
We are consuming around 200 Mbps of UDP data and were experiencing substantial packet loss under moderate server load.
The network card in use is an Asus ROG Aerion 10G card, but I would expect most high-end network controller cards to expose similar properties. You can access them via Device Manager->Network card->Right-Click->Properties->Advanced Options.
1. Increase number of Receive Buffers:
The default value was 512; we could increase it up to 1024. In our case higher settings were accepted, but the network card became disabled once we exceeded 1024. Having a larger number of available buffers at the network-card level gives the system more tolerance to latency in transferring data from the network-card buffers to the socket buffers, where our apps can finally read the data.
2. Set Interrupt Moderation Rate to 'Off':
If I understood correctly, interrupt moderation coalesces multiple "buffer fill" notifications (via interrupts) into a single notification. So the CPU will be interrupted less often and will fetch multiple buffers during each interrupt. This reduces CPU usage, but increases the chance that a ready buffer is overwritten before being fetched, if the interrupt is serviced late.
Additionally, we increased the socket buffer size (as the OP already did) and also enabled Circular Buffering at the socket level, as suggested by Len Holgate in a comment; this should also increase tolerance to latency in processing the socket buffers.
I need to be able to monitor the speed of my internal network using Java. I was thinking I could use a two-part system with a server and a client. I do not need response time such as what is generated with ping, but an actual speed in Mbps for upload and download.
My idea would be to have the Server send a packet or series of packets to the client which then replies and then the Server would calculate the speed of the network between those two points. Does anyone have any idea how I could implement this?
Thank You ahead of time.
Hmm, an interesting problem. I hope you like reading... :-)
I'd be interested to know how the monitoring tool would be used. At
work, the sysadmins just have a couple of large screens in the room,
showing a webpage containing loads of network stats, with it constantly
updating.
The rest of my description assumes the network monitoring tool would be
used as described above. If you just want to be able to do an ad-hoc
test between two random hosts on your network, I'd just use rsync to
transfer a reasonably large file (about 1 - 2MB). I'm sure there are
other file transfer tools that calculate the transfer speed too.
When implementing this, (especially within a large network) you must
minimise the risk that the test floods the network, hampering the people
(or programs) actually using it. You don't want to be blamed for a
massive slowdown (or worse, an outage) just because you were conducting
a test. Your sysadmins won't thank you...
I'd architect the tool in the following way:
Bob is a server which participates in an individual 'test' by doing
the following (a minimal sketch of Bob appears after this list):
Bob receives a request from a client. The request states how much data the client is about to send.
If the amount of data proposed to be sent is not too large, wait for the data. Otherwise Bob rejects the request immediately and ends the communication.
Once the required number of bytes has been received, reply with the amount of time it took to receive it all. Bob terminates the communication.
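Here's a bare-bones TCP sketch of Bob along those lines (the port and the size cap are illustrative):

    import java.io.DataInputStream;
    import java.io.DataOutputStream;
    import java.net.ServerSocket;
    import java.net.Socket;

    // Read the announced byte count, time the transfer, reply with the
    // elapsed nanoseconds. All names and limits are illustrative.
    public class Bob {
        public static void main(String[] args) throws Exception {
            try (ServerSocket server = new ServerSocket(4444)) {
                while (true) {
                    try (Socket client = server.accept()) {
                        DataInputStream in = new DataInputStream(client.getInputStream());
                        DataOutputStream out = new DataOutputStream(client.getOutputStream());
                        long announced = in.readLong();
                        if (announced > 10_000_000L) continue; // reject oversized tests, close socket
                        byte[] buf = new byte[8192];
                        long received = 0, start = System.nanoTime();
                        while (received < announced) {
                            int n = in.read(buf, 0, (int) Math.min(buf.length, announced - received));
                            if (n < 0) break; // client died mid-test
                            received += n;
                        }
                        out.writeLong(System.nanoTime() - start);
                    }
                }
            }
        }
    }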
Alice is the component that displays the result of the measurements
taken (via a webpage or otherwise). Alice is a long lived process
(maybe a web server), configured to periodically connect to a list of
Bob servers. For each configured Bob (a matching sketch of Alice appears after this description):
Send Bob a request with the amount of data Alice is about to
send.
Send Bob the specified amount of data, as fast as possible.
Await the reply from Bob, and compute the network speed.
'Display' the result for this instance of Bob. You may choose
to display an aggregate result. For example, the average result for
each of the last 20 tests, to iron out any anomalies...
When conducting a given test, Alice should report any failures. E.g.
'a TCP connection could not be established with Bob', or 'Bob
prematurely terminated the transfer' or whatever else...
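And the matching sketch of a single test from Alice's side (host, port and payload size are placeholders):

    import java.io.DataInputStream;
    import java.io.DataOutputStream;
    import java.net.Socket;

    // One test against one Bob, matching the Bob sketch above.
    public class Alice {
        static double measureMbps(String host, int port, int bytes) throws Exception {
            try (Socket socket = new Socket(host, port)) {
                DataOutputStream out = new DataOutputStream(socket.getOutputStream());
                DataInputStream in = new DataInputStream(socket.getInputStream());
                out.writeLong(bytes);            // announce the test size
                out.write(new byte[bytes]);      // send as fast as possible
                out.flush();
                long elapsedNanos = in.readLong(); // Bob's measured receive time
                return (bytes * 8.0 / 1_000_000.0) / (elapsedNanos / 1_000_000_000.0);
            }
        }

        public static void main(String[] args) throws Exception {
            System.out.printf("%.2f Mbps%n", measureMbps("bob.example.internal", 4444, 500 * 1024));
        }
    }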
Scatter Bob servers at strategic locations in your (possibly large)
network, and configure Alice to connect to them. For each instance of Bob, you
should configure:
The time interval in between tests.
The 'leeway' (I'll explain this in a bit).
The amount of data to send to Bob for each test.
Bob's address (duh).
You want to 'stagger' the tests that a given Alice will attempt. You
don't want Alice to trigger the test to all Bob servers at once, thereby
flooding your network, possibly giving skewed results and so forth.
Allow the test to occur at a randomised time in the future. For
example, if the test interval is every 10 minutes, configure a 'leeway'
of 1 minute, meaning the next test might occur anywhere between 9 and 11
minutes' time.
If there is to be more than one Alice running at a time, the total
number of instances should be small. The more Alices you have, the more
you interfere with the network. Again, you don't want to be responsible
for an outage.
The amount of data Alice should send in an individual test should be
small. 500KB? You probably want a given test to run for no more than
10 seconds. Maybe get Bob to timeout if the test takes too long.
I've deliberately omitted the transport to use (TCP, UDP, whatever)
because you'll get issues depending on the transport, and I don't know
how you want to handle those issues. For example, you'd have to
consider how to handle dropped datagrams with UDP. What result would
you compute? You don't get this issue with TCP, because it
automatically retransmits dropped packets. With TCP, your
throughput will be artificially low if the two endpoints
are far away from each other. Here's some
info on it.
If you had the patience to read this far, I hope it helped!
Rather than writing a server, you might want to just use Tomcat or Apache as the server; then you just have the client upload a file of a specific size and measure the time, then turn around and download the file to measure the download speed.
You could write your own server to do this, but you would basically be doing what has been done many times before, and then you would need to ensure your server isn't skewing the numbers.
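For instance, a client-side download measurement against any HTTP server could look like this (the URL is a placeholder):

    import java.io.InputStream;
    import java.net.HttpURLConnection;
    import java.net.URL;

    // Time the download of a file of known size and derive Mbps.
    public class HttpSpeedTest {
        public static void main(String[] args) throws Exception {
            URL url = new URL("http://server.example.internal/testfile-1mb.bin"); // placeholder
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            long start = System.nanoTime();
            long bytes = 0;
            try (InputStream in = conn.getInputStream()) {
                byte[] buf = new byte[8192];
                int n;
                while ((n = in.read(buf)) != -1) bytes += n;
            }
            double seconds = (System.nanoTime() - start) / 1e9;
            System.out.printf("Downloaded %d bytes in %.2f s: %.2f Mbps%n",
                    bytes, seconds, bytes * 8 / 1e6 / seconds);
        }
    }

Measuring upload is the mirror image: POST a payload of known size and time it.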