I read the following here:
On connect, the JVM (Java Virtual Machine) tries to resolve the
hostname to IP/port. Windows tries a netbios ns query on UDP (User
Datagram Protocol) port 137 with a timeout of 1.5 seconds, ignores any
ICMP (Internet Control Message Protocol) port unreachable packets and
repeats this two more times, adding up to a value of 4.5 seconds. I
suggest putting critical hostnames in your HOSTS file to make sure
they are resolved quickly. Another possibility is turning off NETBIOS
altogether and running pure TCP/IP on your LAN (Local Area Network).
Is this still an issue? I am working on a heartbeat sensor and was curious.
Your citation is not a normative reference, just another hobby site, and in this case it is dead wrong. None of this has anything to do with setSoTimeout(). He is totally confused between name resolution time, connect time, and read time. setSoTimeout() sets a read timeout, and is unaffected by the shenanigans he describes, whether accurately or otherwise, which wouldn't even happen at connect time as he states: they would happen at name-resolution time.
It's far from the only confusion to be found on that site, or even on that page, let me assure you. I told him about several errors on this page ten years ago, and about quite a lot of others, all of which remain uncorrected to this day, which gives you an idea of the site's accuracy, up-to-date-ness, and content review mechanisms. His only response was to add a rude remark about me. Unconvincing as a peer review mechanism.
Stick to authoritative sources.
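To make the three phases concrete, here is a minimal sketch (the class and method names are my own, not from any library). It assumes a local listener that accepts connections into its backlog but never writes anything. Name resolution happens when the address is built, the connect timeout is the second argument to connect(), and setSoTimeout() only governs the read that follows.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

final class TimeoutPhases {
    /** Connects with a connect timeout, then reads with a separate read timeout. */
    static boolean readTimesOut(InetSocketAddress address, int connectTimeoutMs,
                                int readTimeoutMs) throws IOException {
        try (Socket socket = new Socket()) {
            socket.connect(address, connectTimeoutMs); // bounds the connect phase only
            socket.setSoTimeout(readTimeoutMs);        // bounds blocking reads only
            try {
                socket.getInputStream().read();        // the peer never sends anything
                return false;
            } catch (SocketTimeoutException e) {
                return true;                           // the read timeout fired
            }
        }
    }

    /** Runs the check against a silent listener on the loopback interface. */
    static boolean demo() {
        try (ServerSocket silent = new ServerSocket(0, 50, InetAddress.getLoopbackAddress())) {
            // Any hostname lookup would happen here, before connect() is called.
            InetSocketAddress address =
                    new InetSocketAddress(silent.getInetAddress(), silent.getLocalPort());
            return readTimesOut(address, 1000, 200);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("read timed out: " + demo());
    }
}
```

Note that neither timeout covers the time spent resolving a hostname.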
Our project uses Business Objects for reports. Our Java webapps that launch reports go through a web service we set up to handle the business rules of how we want to launch them. Works great...with one wrinkle.
BO appears to be massively unreliable. The thing frequently goes down or fails to come up after a nightly timed restart. Our Ops team has sort of gotten used to this as a fact of life.
But the part of that which impacts me, on the Java team, is that our web service tries to log on to BO, and instead of timing out or erroring like it should, the BO Java library hangs forever. Evidently it is connecting to a half-started BO, and never gives up.
Looking around the internet, it appears that others have experienced this, but none of the things I see suggests how to set a timeout on the logon process so that if it fails, the web service doesn't lock up forever (which in turn can cause our app server to become unstable).
The connection is pretty simple:
session = CrystalEnterprise.getSessionMgr().logon(boUserName, boPassword, boServerName, boSecurityType);
All I am looking for is some way to make sure that if BO is dead, my web service doesn't die with it. A timeout...a way to reliably detect whether BO is started and healthy before trying to log on...something. Our BO "experts" don't seem to think there is anything they can do about BO's instability, and they know even less about the Java library.
Ideas?
The Java SDK does not detail how to define a timeout when calling logon. I can only assume that this means it falls back on a default network connection timeout.
However, if a connection is made but the SDK doesn't receive the required information (and keeps waiting for an answer), a network timeout will never be reached as this is an application issue, not a network issue.
Therefore, the only thorough solution would be to deal with the instabilities in your BusinessObjects platform (for which you should create a separate question and describe the issue in more detail).
If this is not an option, an alternative could be to launch the connection attempt in a separate thread and implement a timeout yourself, killing the thread when the predefined timeout is reached and optionally retrying the connection attempt several times.
Keep in mind though that while the initial logon might be successful, the instabilities described in your question could cause other issues (e.g. a different SDK call could remain hanging forever due to the same issue that caused your logon call to hang).
Again, the only good solution is to look at the root cause of your platform instabilities.
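The separate-thread approach can be sketched roughly as follows. This is only an illustration: TimedCall and callWithTimeout are names I made up, not part of the BO SDK. Also note that cancelling the future only interrupts the worker thread. If the SDK is blocked in I/O that ignores interrupts, the thread will linger, which is why the sketch uses daemon threads.

```java
import java.util.Optional;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class TimedCall {
    // Daemon threads, so a call that never returns cannot keep the JVM alive.
    private static final ExecutorService POOL = Executors.newCachedThreadPool(r -> {
        Thread t = new Thread(r, "timed-call");
        t.setDaemon(true);
        return t;
    });

    /** Runs the task on a worker thread; returns empty if it does not finish in time. */
    static <T> Optional<T> callWithTimeout(Callable<T> task, long timeoutMs) {
        Future<T> future = POOL.submit(task);
        try {
            return Optional.ofNullable(future.get(timeoutMs, TimeUnit.MILLISECONDS));
        } catch (TimeoutException e) {
            future.cancel(true); // interrupts the worker; blocked I/O may ignore this
            return Optional.empty();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the caller's interrupt status
            future.cancel(true);
            return Optional.empty();
        } catch (ExecutionException e) {
            // The task itself failed (for example, a logon error): surface the cause.
            throw new IllegalStateException("task failed", e.getCause());
        }
    }
}
```

With the logon call from the question, usage would look something like callWithTimeout(() -> CrystalEnterprise.getSessionMgr().logon(boUserName, boPassword, boServerName, boSecurityType), 30000), treating an empty result as "BO is not answering". The 30-second value is arbitrary.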
I have seen applications that can detect adjacent networks and desktops and devices attached to them. They can also know the computer/device name that is attached within 30 seconds.
Should I run the ping and net view commands through Runtime.exec() to do this? I find them fast.
How can I capture the output of these commands?
I tried sockets, but they are time-consuming. Their only advantage is that I can also tell whether the remote machine has my application installed (the one in which I created the socket that enables this communication).
Regards
Timeouts when initializing a Socket are useful, but you cannot expect each connection to be established in less than 300 milliseconds. There is also a timeout implementation on the server side. Communication is one-sided in both cases. Multithreading will help.
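On capturing the output of an external command such as ping or net view, a minimal sketch could look like this. CommandOutput and run are hypothetical names. It sends the output to a temporary file so that the wait can be bounded by a timeout, and it uses ProcessBuilder rather than Runtime.exec() only because redirecting output is simpler that way.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.util.List;
import java.util.concurrent.TimeUnit;

final class CommandOutput {
    /** Runs a command and returns its output lines, killing it after timeoutMs. */
    static List<String> run(long timeoutMs, String... command) {
        File out = null;
        try {
            out = File.createTempFile("cmd-output", ".txt");
            Process process = new ProcessBuilder(command)
                    .redirectErrorStream(true) // merge stderr into stdout
                    .redirectOutput(out)       // write output to the temp file
                    .start();
            if (!process.waitFor(timeoutMs, TimeUnit.MILLISECONDS)) {
                process.destroyForcibly().waitFor(); // give up on a hung command
            }
            // Whatever was written before exit (or before the kill) is returned.
            return Files.readAllLines(out.toPath(), Charset.defaultCharset());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for command", e);
        } finally {
            if (out != null) {
                out.delete();
            }
        }
    }
}
```

For example, CommandOutput.run(5000, "ping", "-n", "1", host) on Windows, or with "-c" instead of "-n" on Unix-like systems. Running one such call per host on its own thread is where the multithreading comes in.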
My team has a situation where an SNMP SET will fail once every two weeks or so. Since this set happens automatically, we don't necessarily notice it immediately when it fails, and this can result in an inconsistent configuration and associated wailing and gnashing of teeth. The plan is to fix this by having our software automatically retry the SET when it fails.
The problem is, we aren't sure why the failure is happening. My (extremely limited) knowledge of SNMP isn't particularly helpful in diagnosing this problem, so I thought I'd ask StackOverflow for some advice. We think that every so often a spike in network traffic will cause the SET to fail. Since SNMP uses UDP for communication, I would think it would be relatively easy for a command to be drowned out if traffic was high for a short period of time. However, I have no idea how common this is. We have a small network with a single Cisco router, and there are fewer than a dozen SNMP-controlled devices on that network. In addition to the SNMP traffic, there are some status web pages being loaded from the various devices. In case it makes a difference, I believe we are using the AdventNet SNMP API version 4.0.4 for Java.
Does it sound reasonable that there will be some SET commands dropped occasionally, or should we be looking for other causes?
SNMP was designed to be unreliable. It uses UDP as its transport protocol. Routers will drop SNMP packets when they've got high priority work to do. So yes, it sounds very reasonable that SET commands are dropped occasionally :)
First upgrade to the newest version of the SNMP library if there is one.
Then you can set up a retry mechanism: verify each SET with a GET. If this fails, queue the SET for a later attempt. This requires an elaborate queuing mechanism: a later SET for the same setting should be queued after, or over, an existing queued SET.
Another option is to synchronize the entire state every hour; use GET for a setting, if it has changed, SET it. Changes that do not make it through for over 3 hours can be reported using an alerting system.
There are many more options, but if you average just 1 failure per week, I'd go with the simplest one: verify a SET with a GET, retry up to 5 times, and if it still fails, send an email.
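The simplest option could be sketched like this. The SnmpClient interface below is a hypothetical stand-in that you would implement on top of whatever your SNMP library offers. It is not the AdventNet API, and I'm assuming a failed or timed-out request surfaces as a runtime exception.

```java
import java.util.Objects;

final class VerifiedSet {
    /** Hypothetical minimal client; adapt it to your real SNMP library. */
    interface SnmpClient {
        void set(String oid, String value);
        String get(String oid);
    }

    /**
     * Sends a SET and confirms it with a GET, retrying up to maxAttempts times.
     * Returns the attempt number that succeeded, or -1 if every attempt failed.
     */
    static int setAndVerify(SnmpClient client, String oid, String value, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                client.set(oid, value);
                if (Objects.equals(value, client.get(oid))) {
                    return attempt; // the device reports the value we asked for
                }
            } catch (RuntimeException e) {
                // Assumed failure mode: a lost request or timeout. Count it and retry.
            }
        }
        return -1; // the caller should raise an alert (for example, send the email)
    }
}
```

A short pause between attempts would probably help if the drops really are caused by traffic spikes, since an immediate retry may hit the same spike.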
I have the following situation: using a "classical" Java server (using ServerSocket) I would like to detect (as rapidly as possible) when the connection with the client failed unexpectedly (ie. non-gracefully / without a FIN packet).
The way I'm simulating this is as follows:
I'm running the server on a Linux box
I connect with telnet to the box
After the connection has succeeded, I add a "DROP" rule in the box's firewall
What happens is that the sending blocks after ~10k of data. I don't know for how long, but I've waited more than 10 minutes on several occasions. What I've researched so far:
Socket.setSoTimeout - however this affects only reads. If there are only writes, it doesn't have an effect
Checking for errors with PrintWriter.checkError(), since PW swallows the exceptions - however it never returns true
How could I detect this error condition, or at least configure the timeout value? (either at the JVM or at the OS level)
Update: after ~20min checkError returned true on the PrintWriter (using the server JVM 1.5 on a CentOS machine). Where is this timeout value configured?
The ~20 min timeout is because of standard TCP settings in Linux. It's really not a good idea to mess with them unless you know what you're doing. I had a similar project at work, where we were testing connection loss by disconnecting the network cable and things would just hang for a long time, exactly like you're seeing. We tried messing with the following TCP settings, which made the timeout quicker, but it caused side effects in other applications where connections would be broken when they shouldn't, due to small network delays when things got busy.
net.ipv4.tcp_retries2
net.ipv4.tcp_syn_retries
If you check the man page for tcp (man tcp) you can read about what these settings mean and maybe find other settings that might apply. You can either set them directly under /proc/sys/net/ipv4 or use sysctl.conf. These two were the ones we found made the send/recv fail quicker. Try setting them both to 1 and you'll see the send call fail a lot faster. Make sure to take note of the current settings before changing them.
I will reiterate that you really shouldn't mess with these settings. They can have side effects on the OS and other applications. The best solution is, as Kitson says, to use a heartbeat and/or an application-level timeout.
Also look into how to create a non-blocking socket, so that the send call won't block like that. Although keep in mind that sending with a non-blocking socket is usually successful as long as there's room in the send buffer. That's why it takes around 10k of data before it blocks, even though you broke the connection before that.
The only sure-fire way is to generate application-level "checks" instead of relying on the transport level. For example, use a bi-directional heartbeat message, where if either end does not get the expected message, it closes and resets the connection.
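One way to sketch the bookkeeping for such a heartbeat (all names here are my own invention): the reading thread records every heartbeat it receives, and a periodic task closes the connection once the peer has been silent for too long. Closing a java.net.Socket from another thread makes any thread blocked in I/O on that socket throw a SocketException, so it also frees a writer that is stuck.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

final class HeartbeatMonitor {
    private final long timeoutMs;
    private volatile long lastSeenMs;

    HeartbeatMonitor(long timeoutMs, long nowMs) {
        this.timeoutMs = timeoutMs;
        this.lastSeenMs = nowMs;
    }

    /** Call from the reading thread whenever a heartbeat message arrives. */
    void onHeartbeat(long nowMs) {
        lastSeenMs = nowMs;
    }

    /** True once the peer has been silent for longer than the timeout. */
    boolean isExpired(long nowMs) {
        return nowMs - lastSeenMs > timeoutMs;
    }

    /** Periodically checks the monitor and closes the connection when it expires. */
    static ScheduledFuture<?> watch(HeartbeatMonitor monitor, Closeable connection,
                                    ScheduledExecutorService scheduler, long periodMs) {
        return scheduler.scheduleAtFixedRate(() -> {
            if (monitor.isExpired(System.currentTimeMillis())) {
                try {
                    connection.close(); // repeated closes are harmless for a Socket
                } catch (IOException ignored) {
                    // Nothing useful to do here: the connection is being abandoned.
                }
            }
        }, periodMs, periodMs, TimeUnit.MILLISECONDS);
    }
}
```

The caller would cancel the returned future once the connection has been torn down. The sending side of the heartbeat (writing a small message every few seconds) is not shown.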
I'm currently hacking on a new project, a web-app. But something's wrong, and I think it's Vista's fault: when I'm stress-testing the app, not all of the requests are answered.
The only thing I can think of is that the queue of incoming requests is getting too long. I've googled around, but can't find out how long the queue is, only that it depends on the OS. Though this may simply be because I don't know the "real" name for it;)
The project is written in Java SE.
If someone knows the answer I'd be really happy as I can't find it myself anywhere;)
UPDATE: The app is just a pet-project, and I won't be running it on a Vista machine when it's done. I've even been thinking of running it on a Solaris box, just to try Solaris:)
There are no errors or anything, but the request counter is way too low. Testing is done from Opera, with 30 tabs on auto-refresh every one second. I know, it's not the right way to do it, but it works:)
I don't use any frameworks, EE or anything, just pure Java SE.
If you're writing a server app, but running it on Vista, you may be hitting the hard limit for inbound network connections on the Vista OS. I've seen numbers from 5 (for Home Basic) up to 25 (for Business Pro and Ultimate) simultaneous connections allowed.
Vista is not designed as a Server OS. If you want more simultaneous network connections, and you want to run Windows, get a license for one of the Windows Server products. Alternately, you can run on some *NIX system.
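As for the queue the question asks about: it is usually called the listen backlog. In plain Java SE you can request a size for it when creating the ServerSocket. The Javadoc gives the default as 50, and the operating system is free to cap or adjust whatever you ask for. A minimal sketch (BacklogExample is just an illustrative name):

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.ServerSocket;

final class BacklogExample {
    /** Opens a listening socket and requests a specific backlog (pending-connection queue). */
    static ServerSocket listen(int port, int backlog) {
        try {
            // The second argument is only a request; the OS may silently limit it.
            return new ServerSocket(port, backlog);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        try (ServerSocket server = listen(0, 200)) { // port 0 = any free port
            System.out.println("listening on port " + server.getLocalPort());
        }
    }
}
```

Raising the backlog only helps if connections are being refused while waiting for accept(). It does nothing about a per-OS connection limit like the one described above.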