I have been doing socket programming for many years, but I have never had a missed message using TCP - until now. I have a Java server and a client in C, both on localhost. They send short string messages back and forth, with some delays in between. In one particular case a message never arrives on the client side. It is reproducible, but oddly machine-dependent.
To give some more details: I can debug the server side and see the send followed by the flush. I can attach to the client and step through the select calls in a loop, but the message simply never shows up. Has anyone experienced this, and is there an explanation other than a coding error?
In other words, if you have a connected socket and do a write on one side and a read on the other, what can happen in the middle to cause something like this?
One other detail - I've used tcpdump on the loopback interface and can see the missed message.
I've seen this happen in SMTP transactions before. Do you have a virus scanner running on that machine? If so try turning it off and see if that makes a difference.
Otherwise, I'd suggest installing Wireshark so you can take a look at what's actually happening.
Finally - after sniffing some more, I found the problem. Two messages were sometimes (but rarely) getting sent before a read, so both were read at once, but only the first was handled. This is why it seemed as though the second message never arrived: it was buried in the receive buffer.
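For anyone hitting the same symptom, the fix is to drain the receive buffer in a loop instead of assuming one read equals one message. Here is a minimal Java-side sketch of that idea, assuming the strings are newline-delimited (the actual framing in your protocol may differ, and the MessageDrainer name is just illustrative):

    import java.io.ByteArrayOutputStream;
    import java.io.IOException;
    import java.io.InputStream;
    import java.nio.charset.StandardCharsets;
    import java.util.ArrayList;
    import java.util.List;

    // Illustrative only: assumes messages are newline-delimited strings.
    // A single read() may return zero, one, or several complete messages,
    // so the buffered data must be drained in a loop instead of handling
    // only the first message found.
    public class MessageDrainer {
        private final ByteArrayOutputStream pending = new ByteArrayOutputStream();

        public List<String> readMessages(InputStream in) throws IOException {
            byte[] chunk = new byte[4096];
            int n = in.read(chunk);               // may contain more than one message
            if (n > 0) {
                pending.write(chunk, 0, n);
            }
            List<String> messages = new ArrayList<>();
            String buffered = pending.toString(StandardCharsets.UTF_8.name());
            int start = 0;
            int nl;
            while ((nl = buffered.indexOf('\n', start)) >= 0) {
                messages.add(buffered.substring(start, nl)); // one complete message
                start = nl + 1;
            }
            // keep any trailing partial message for the next read
            pending.reset();
            pending.write(buffered.substring(start).getBytes(StandardCharsets.UTF_8));
            return messages;
        }
    }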
Related
This is a similar answer, though it is not exactly what I want. I want to do the following two things:
1. I want to find out whether all the bytes have been sent to the receiver.
2. I also want to know the current remaining capacity of the socket's output buffer, without attempting a write to it.
Taking your numbered points in order:
1. The only way you can find that out is by having the peer application acknowledge receipt (see the sketch below).
2. There isn't such an API in Java. As far as I know there isn't one at the BSD sockets layer either, though I'm not familiar with the outer limits of Linux, where they may have introduced something like it.
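To illustrate point 1, here is a minimal sketch of a peer-level acknowledgment: the sender only treats the data as delivered once the other side explicitly says so. The AckingSender name, the "ACK" reply convention, and the line-based framing are all assumptions made up for the example:

    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.InputStreamReader;
    import java.io.PrintWriter;
    import java.net.Socket;
    import java.nio.charset.StandardCharsets;

    // Hypothetical helper: the sender only considers the data "received"
    // once the peer explicitly replies with an "ACK" line.
    public class AckingSender {
        public static boolean sendWithAck(Socket socket, String message, int timeoutMillis)
                throws IOException {
            socket.setSoTimeout(timeoutMillis);  // bound how long we wait for the ACK
            PrintWriter out = new PrintWriter(socket.getOutputStream(), true);
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(socket.getInputStream(), StandardCharsets.UTF_8));

            out.println(message);                // send the payload
            String reply = in.readLine();        // block (up to the timeout) for the peer's ACK
            return "ACK".equals(reply);
        }
    }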
You cannot know. The data is potentially buffered by the OS and TCP/IP stack, and there is no method for determining if it has actually been placed on the wire. Even knowing it was placed on the wire is no guarantee of anything as it could be lost in transit.
For UDP you will never know if the data was received by the destination system unless you write a UDP-based protocol such that the remote system acknowledges the data.
For TCP the protocol stack retransmits lost segments and will eventually report an error to your code if the data cannot be delivered, but it may be many seconds before you receive that notification.
I'm learning to make Minecraft servers similar to Bukkit for fun. I've dealt with NIO before, but not very much and not in a practical way. I'm running into an issue where Minecraft has many variable-length packets, and since there isn't any consistent "header" for these packets, NIO appears to fragment them because the data isn't always delivered in full in a single read.
I learned recently that this is a thing from this thread: Java: reading variable-size packets using NIO. I'd rather not use Netty/MINA/etc. because I'd like to learn this all myself; I'm doing this for the education, not with the intention of making it some huge project.
So my question is: how exactly do I go about preventing this sort of fragmenting of packets? I tried toggling Nagle's algorithm via java.net.Socket#setTcpNoDelay(boolean on), but oddly enough, all that does is make every single packet arrive fragmented, whereas when I don't enable it, the first packet always comes through OK and then the following packets become fragmented.
I followed the Rox Java NIO Tutorial pretty closely so I know this code should work, but that tutorial only went as far as echoing a string message back to peers, not complicated bytestreams.
Here's my code. For some context, I'm using Executor#execute(Runnable) to create the two threads. Since I'm still learning about threads and concurrency and trying to piece them together with networking, any feedback on that would be very appreciated as well!!
ServerSocketManager
ServerDataManager
Thanks a lot, I know this is quite a bit of stuff to take in, so I can't thank you enough for taking the time to read & respond!!
TCP is ALWAYS a stream of bytes. You don't get to control when you get them or how many you get: data can arrive at any time, in any amount. That's why protocols exist.
Headers are a common part of a protocol to tell you how much data you need to read before you have the whole message.
So the short answer here is: You can't.
Everything you're saying you don't want to do -- that's what you have to do.
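To make that concrete, here is a minimal sketch of the kind of framing a header gives you: a 4-byte length prefix followed by the payload, with incomplete frames held in a buffer until the rest arrives. (The Minecraft protocol itself, as I understand it, uses a VarInt length prefix; the fixed 4-byte prefix and the FrameAccumulator name are just simplifications for the example.)

    import java.nio.ByteBuffer;
    import java.util.ArrayList;
    import java.util.List;

    // Illustrative sketch of length-prefixed framing: each message is a
    // 4-byte big-endian length followed by that many payload bytes.
    // Bytes read from the channel are appended to an accumulation buffer,
    // and complete messages are extracted only once fully buffered.
    public class FrameAccumulator {
        private ByteBuffer buffer = ByteBuffer.allocate(8192);

        public List<byte[]> append(ByteBuffer fresh) {
            ensureCapacity(fresh.remaining());
            buffer.put(fresh);                       // accumulate whatever arrived

            List<byte[]> frames = new ArrayList<>();
            buffer.flip();                           // switch to reading mode
            while (buffer.remaining() >= 4) {
                buffer.mark();
                int length = buffer.getInt();        // the header: payload length
                if (buffer.remaining() < length) {
                    buffer.reset();                  // partial frame: wait for more bytes
                    break;
                }
                byte[] payload = new byte[length];
                buffer.get(payload);                 // one complete message
                frames.add(payload);
            }
            buffer.compact();                        // keep leftover partial data
            return frames;
        }

        private void ensureCapacity(int extra) {
            if (buffer.remaining() < extra) {
                ByteBuffer bigger = ByteBuffer.allocate((buffer.capacity() + extra) * 2);
                buffer.flip();
                bigger.put(buffer);
                buffer = bigger;
            }
        }
    }

On each OP_READ you would read from the channel into a ByteBuffer, flip it, pass it to append(), and dispatch every frame it returns - which is exactly the "handle all complete messages, keep the partial one" loop that a header makes possible.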
I've recently been writing Java code to send notifications to the Apple Push Notification server. The problem I'm running into is what happens if I create the socket and then disconnect from the network. I've bounced around articles online, and most suggest relying on these methods:
socket.setKeepAlive(false);
socket.setSoTimeout(1000);
Specifically the "setSoTimeout" method. But the Javadoc states that setSoTimeout will only throw an exception when reading from the InputStream, and the Apple Push Notification server never puts any data on the InputStream, so I can never read anything from it. Does anyone have suggestions for how to determine a network disconnect without using the socket's InputStream?
You can only reliably detect that a Socket has been disconnected when you attempt to read from or write to it. Reading is better, because a write often takes a while to detect a failure.
The server doesn't need to write anything for you to attempt to read. If you have a server which never writes, you will either read nothing or detect a failure.
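As a rough sketch of that idea (the ConnectionWatcher name and the 1-second poll interval are arbitrary choices for the example): read in a loop with a timeout, treat -1 or an IOException as a dead connection, and treat a timeout as "nothing to report yet".

    import java.io.IOException;
    import java.io.InputStream;
    import java.net.Socket;
    import java.net.SocketTimeoutException;

    // Sketch of detecting a dead connection by reading: a -1 return or an
    // IOException means the connection is gone, while a timeout simply means
    // the server had nothing to say yet.
    public class ConnectionWatcher implements Runnable {
        private final Socket socket;

        public ConnectionWatcher(Socket socket) {
            this.socket = socket;
        }

        @Override
        public void run() {
            try {
                socket.setSoTimeout(1000);       // wake up periodically
                InputStream in = socket.getInputStream();
                byte[] buf = new byte[256];
                while (true) {
                    try {
                        int n = in.read(buf);
                        if (n == -1) {           // orderly close by the peer
                            break;
                        }
                        // any data (e.g. an APNS error response) would arrive here
                    } catch (SocketTimeoutException e) {
                        // no data yet; the connection may still be fine
                    }
                }
            } catch (IOException e) {
                // broken connection detected
            }
        }
    }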
A quick precision: APNS will return data on your InputStream if you are using the enhanced notification format and some error occurs. You should therefore make sure you do not ignore your InputStream.
Unless you are doing this as a personal learning project, you might want to take a look at existing APNS-specific Java libraries that deal with all the communication details for you. Communicating reliably with APNS is much more difficult than it looks at first, especially when you get to the error management part which involves various vague or undocumented details.
My Java application receives data through UDP. It uses the data for an online data-mining task, which means it is not critical to receive each and every packet; that is what makes the choice of UDP reasonable in the first place. Also, the data is transferred over a LAN, so the physical network should be reasonably reliable. In any case, I have no control over the choice of protocol or the data included.
Still, I am concerned about packet loss that may arise from overload and long processing time of the application itself. I would like to know how often these things happen and how much data is lost.
Ideally I am looking for a way to monitor packet loss continuously in the production system. But a partial solution would also be welcome.
I realize it is impossible to always know about UDP packet loss (without control over the packet contents). I was thinking of something along the lines of packets received by the OS but never delivered to the application, or perhaps some clever solution inside the application, like a fast reading thread that drops data when its client is busy.
We are deploying under Windows Server 2003 or 2008.
The problem is that there is no way to tell that you have lost any packets if you are relying on UDP alone.
If you need to know this type of information, you need to build it into the format that you layer on top of UDP (like the TCP sequence number); a minimal sketch of that idea follows below. If you do that and the format is simple, then you can easily create filters in Microsoft's NetMon or Wireshark to log and track that information.
Also note that a sequence number, as in TCP, also helps to detect the out-of-order packets that may occur when using UDP.
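Here is a minimal sketch of that kind of layering, assuming you were able to put an 8-byte sequence number in front of each datagram (the LossCounter name and the header layout are made up for illustration):

    import java.nio.ByteBuffer;

    // Sketch: the sender prefixes each datagram with a monotonically increasing
    // sequence number; the receiver counts gaps to estimate packet loss.
    public class LossCounter {
        private long expected = -1;   // next sequence number we expect
        private long lost = 0;

        // Returns the payload; updates the loss estimate as a side effect.
        public synchronized byte[] onDatagram(byte[] datagram, int length) {
            ByteBuffer buf = ByteBuffer.wrap(datagram, 0, length);
            long seq = buf.getLong();             // 8-byte sequence number header
            if (expected < 0) {
                expected = seq + 1;               // first datagram seen
            } else if (seq >= expected) {
                lost += seq - expected;           // gap: that many datagrams presumed lost
                expected = seq + 1;
            } else {
                lost--;                           // late arrival of one we had counted as lost
            }
            byte[] payload = new byte[buf.remaining()];
            buf.get(payload);
            return payload;
        }

        public synchronized long lostSoFar() {
            return lost;
        }
    }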
If you are concerned about packet loss, use TCP.
That's one reason why it was invented.
I have the following situation: using a "classical" Java server (using ServerSocket), I would like to detect, as rapidly as possible, when the connection with the client fails unexpectedly (i.e. non-gracefully, without a FIN packet).
The way I'm simulating this is as follows:
I'm running the server on a Linux box
I connect with telnet to the box
After the connection has succeeded, I add a "DROP" rule to the box's firewall
What happens is that the sending blocks after ~10k of data. I don't know for how long, but I've waited more than 10 minutes on several occasions. What I've researched so far:
Socket.setSoTimeout - however this affects only reads. If there are only writes, it has no effect
Checking for errors with PrintWriter.checkError(), since PrintWriter swallows the exceptions - however it never returns true
How could I detect this error condition, or at least configure the timeout value? (either at the JVM or at the OS level)
Update: after ~20min checkError returned true on the PrintWriter (using the server JVM 1.5 on a CentOS machine). Where is this timeout value configured?
The ~20 min timeout is because of the standard TCP settings in Linux. It's really not a good idea to mess with them unless you know what you're doing. I had a similar project at work, where we were testing connection loss by disconnecting the network cable, and things would just hang for a long time, exactly as you're seeing. We tried messing with the following TCP settings, which made the timeout quicker, but they caused side effects in other applications, where connections would be broken when they shouldn't have been, due to small network delays when things got busy.
net.ipv4.tcp_retries2
net.ipv4.tcp_syn_retries
If you check the man page for tcp (man tcp) you can read about what these settings mean and maybe find other settings that might apply. You can either set them directly under /proc/sys/net/ipv4 or use sysctl.conf. These two were the ones we found made the send/recv fail quicker. Try setting them both to 1 and you'll see the send call fail a lot faster. Make sure to take note of the current settings before changing them.
I will reiterate that you really shouldn't mess with these settings. They can have side effects on the OS and other applications. The best solution is like Kitson says, use a heartbeat and/or application level timeout.
Also look into how to create a non-blocking socket, so that the send call won't block like that. Keep in mind, though, that sending with a non-blocking socket is usually successful as long as there's room in the send buffer; that's why it takes around 10k of data before it blocks, even though you broke the connection before that.
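For reference, a tiny sketch of what a non-blocking send looks like with NIO (the trySend helper is made up): write() returns immediately with however many bytes fit into the send buffer, possibly zero.

    import java.io.IOException;
    import java.nio.ByteBuffer;
    import java.nio.channels.SocketChannel;

    // Sketch: with a non-blocking channel, write() returns immediately with the
    // number of bytes actually accepted into the send buffer (possibly 0 when the
    // buffer is full), instead of blocking the way a stream write does.
    public class NonBlockingSender {
        public static int trySend(SocketChannel channel, ByteBuffer data) throws IOException {
            channel.configureBlocking(false);
            return channel.write(data);   // never blocks; 0 means "buffer full, try later"
        }
    }

When write() returns 0 you would typically register OP_WRITE with a Selector and retry once the channel becomes writable again.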
The only sure-fire way is to build application-level "checks" instead of relying on the transport level - for example, a bi-directional heartbeat message, where if either end does not get the expected message within some interval, it closes and resets the connection.
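A rough sketch of such a heartbeat, with made-up names (Heartbeat, onTraffic) and arbitrary intervals: each side periodically sends a "PING" and gives up on the connection if it has seen no traffic within the allowed silence.

    import java.io.IOException;
    import java.io.PrintWriter;
    import java.net.Socket;
    import java.util.concurrent.Executors;
    import java.util.concurrent.ScheduledExecutorService;
    import java.util.concurrent.TimeUnit;

    // Sketch of an application-level heartbeat: send "PING" periodically and
    // close the socket if no traffic (PING/PONG or data) was seen within the
    // allowed silence. onTraffic() is expected to be called from the read path.
    public class Heartbeat {
        private static final long INTERVAL_SECONDS = 5;
        private static final long MAX_SILENCE_MILLIS = 15_000;

        private final Socket socket;
        private volatile long lastSeen = System.currentTimeMillis();
        private final ScheduledExecutorService scheduler =
                Executors.newSingleThreadScheduledExecutor();

        public Heartbeat(Socket socket) {
            this.socket = socket;
        }

        public void onTraffic() {                 // call whenever anything is read
            lastSeen = System.currentTimeMillis();
        }

        public void start() throws IOException {
            PrintWriter out = new PrintWriter(socket.getOutputStream(), true);
            scheduler.scheduleAtFixedRate(() -> {
                if (System.currentTimeMillis() - lastSeen > MAX_SILENCE_MILLIS) {
                    close();                      // peer is presumed dead
                } else {
                    out.println("PING");
                    if (out.checkError()) {       // PrintWriter swallows IOExceptions
                        close();
                    }
                }
            }, INTERVAL_SECONDS, INTERVAL_SECONDS, TimeUnit.SECONDS);
        }

        private void close() {
            scheduler.shutdown();
            try {
                socket.close();
            } catch (IOException ignored) {
            }
        }
    }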