This will be a network application that will always (or as near to always as I can manage) be listening on a given port.
I'm fairly new to Java, and very new to non-web server side programming, so I'd like to get feedback from the community on my assumptions and preliminary plans.
I've read about jsvc ( http://commons.apache.org/daemon/jsvc.html ) and am currently operating on the assumption that this is the "best" way to write a daemon in Java for a Linux box (likely running CentOS).
Can nagios be configured to monitor whether or not my daemon is running, and to alert me or the sys admin when it isn't? (I assume yes, but I'm not a very talented sys admin type)
This will be an SMPP client app (or ESME app I guess) which is why I've chosen Java as it seems to be a very mature platform for SMPP. However, I know that it's more "traditional" to write a daemon in C/C++. With modern Java, performing fairly uncomplicated tasks, am I likely to run into any major disadvantages?
What's the best way to manage deployment of new builds? Just stop the daemon and replace the binary as quickly as possible and restart?
Any other input would be greatly appreciated.
How to write a Java daemon that has 24/7 uptime...
We run a number of 24/365 applications on our Linux servers that just launch the Java process like the following -- no need for any C wrappers:
nohup java -D... -X... -jar something.jar ... < /dev/null > output.log 2>&1 &
That runs the jar in the background (nohup ... &) with no input (< /dev/null) and with the output (stdout and stderr) redirected to a logfile (> output.log 2>&1). We have distributed logging infrastructure, but some console output (such as thread dumps) is still expected. These applications can run for months until we upgrade them.
Can nagios be configured to monitor whether or not my daemon is running, and to alert me or the sys admin when it isn't?
In terms of monitoring, there is much you can do. Nagios looks to have a JMX plugin for testing the information that jconsole displays. There are also a lot of native JMX logging and monitoring utilities out there. We have internal green/yellow/red indicators that can be pulled up using JMX and easily checked. I've also exported a simple JMX/HTTP service from each application to provide status information, making it easy for 3rd-party monitoring tools to detect failures.
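To give a flavour of what such a status MBean might look like, here is a minimal sketch of my own (the interface/class names and the ObjectName are made up, not the poster's code):

    // HealthMBean.java -- standard MBean interface (the name must end in "MBean")
    public interface HealthMBean {
        String getStatus();   // e.g. "GREEN", "YELLOW" or "RED"
    }

    // Health.java -- implementation plus registration with the platform MBean server
    import java.lang.management.ManagementFactory;
    import javax.management.MBeanServer;
    import javax.management.ObjectName;

    public class Health implements HealthMBean {
        private volatile String status = "GREEN";

        public String getStatus() { return status; }
        public void setStatus(String status) { this.status = status; }

        public static Health register() throws Exception {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            Health health = new Health();
            // Visible in jconsole (and to a Nagios JMX check) under this name.
            server.registerMBean(health, new ObjectName("myapp:type=Health"));
            return health;
        }
    }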
This will be an SMPP client app (or ESME app I guess) which is why I've chosen Java as it seems to be a very mature platform for SMPP.
I assume you mean SMPP? If so then I see no reason why Java couldn't do a good job. Our applications do a wide variety of HTTP, UDP, SMTP, JDBC, LDAP, and other protocols in real time. We use JGroups a lot, which provides a complete authenticated, encrypted network stack in Java.
What's the best way to manage deployment of new builds? Just stop the daemon and replace the binary as quickly as possible and restart?
In terms of replacing a running binary on the fly, that is more complicated. We have VIPs up front and replace the binaries at our leisure. Our internal protocols are designed to fail over. If you do not have a VIP then one thing to consider would be an orderly handoff. You boot the new jar and, when it is ready to bind to the port, it talks to the application running the old jar. Then the old application unbinds and the new one binds immediately afterwards. Something like that.
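To make that handoff idea a bit more concrete, here is a rough sketch of what the new instance could do, assuming the old instance exposes a small admin port and unbinds its service port when told to. The ports and the one-line protocol are made up for illustration:

    import java.io.IOException;
    import java.net.ServerSocket;
    import java.net.Socket;
    import java.nio.charset.StandardCharsets;

    public class Handoff {
        static final int SERVICE_PORT = 2775;  // e.g. the SMPP listen port
        static final int ADMIN_PORT = 9999;    // assumed admin port of the old instance

        public static void main(String[] args) throws Exception {
            // 1. Ask the old instance to stop listening.
            try (Socket admin = new Socket("localhost", ADMIN_PORT)) {
                admin.getOutputStream().write("RELEASE\n".getBytes(StandardCharsets.US_ASCII));
            } catch (IOException e) {
                System.out.println("No old instance found; binding directly.");
            }

            // 2. Retry the bind briefly while the old instance unbinds.
            ServerSocket server = null;
            for (int attempt = 0; attempt < 50 && server == null; attempt++) {
                try {
                    server = new ServerSocket(SERVICE_PORT);
                } catch (IOException stillBound) {
                    Thread.sleep(100);
                }
            }
            if (server == null) {
                throw new IllegalStateException("Old instance never released port " + SERVICE_PORT);
            }
            System.out.println("New instance now serving on port " + SERVICE_PORT);
            // ... accept() loop and application logic go here ...
        }
    }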
Hope this helps.
If you really want to have something running non-stop on *nix, I recommend you have a look at daemontools.
There are some examples on how to do this here and here.
Basically, svscan (run from init) spawns a supervise process that monitors your Java process and restarts it every time it crashes.
I have been asked to investigate Oracle Java Mission Control, so that server-side Java applications may be monitored and actions taken (e.g., alerts emitted and logged, flight recordings saved) under certain conditions. Java Mission Control's trigger system, where you specify conditions and actions, meets our needs, but it seems to depend on the GUI application ("Oracle Java Mission Control") running, implying that triggers are not the monitored JMX server's responsibility. Is this the case? There are a number of servers usually accessed via terminal...
Is there a way of running Java Mission Control as a daemon, from a terminal session, unattended, while retaining and obeying any specified trigger rules (e.g., imported from an XML file)?
If not, are there competing tools with a similar trigger system that can fill the void?
Thanks! :)
Currently no, you can't run JMC without a GUI.
You are not the first person that wants to do this.
One option is to run JMC on another machine and make it connect to many servers, which of course requires running the remote JMX agent etc.
We have been discussing server-side triggers/rules, but AFAIK it is not planned for any JDK release.
It is possible to dump flight recordings from code, so you could write your own little agent that uses the DiagnosticMBean to do this on another JVM on the same machine or remotely. I'm pretty sure this is how some people solve the same sort of problem. It is also possible to parse and analyze flight recordings in code. If you're interested in this approach, I'm sure there's some sample code around; of course, it's more work than if JMC could run as a daemon :/
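For what it's worth, here is roughly what that could look like over a remote JMX connection, assuming a recording is already running in the target JVM. The object name, operation name, host, and port here are my assumptions about the DiagnosticCommand MBean, not something JMC provides:

    import javax.management.MBeanServerConnection;
    import javax.management.ObjectName;
    import javax.management.remote.JMXConnector;
    import javax.management.remote.JMXConnectorFactory;
    import javax.management.remote.JMXServiceURL;

    // Assumes the target JVM exposes a remote JMX agent on port 9010 and has a
    // recording named "my-recording" running (e.g. via -XX:StartFlightRecording=name=my-recording).
    public class JfrDumper {
        public static void main(String[] args) throws Exception {
            JMXServiceURL url = new JMXServiceURL(
                    "service:jmx:rmi:///jndi/rmi://targethost:9010/jmxrmi");
            JMXConnector connector = JMXConnectorFactory.connect(url);
            try {
                MBeanServerConnection conn = connector.getMBeanServerConnection();
                ObjectName diag = new ObjectName("com.sun.management:type=DiagnosticCommand");

                // Roughly equivalent to "jcmd <pid> JFR.dump name=my-recording filename=/tmp/dump.jfr"
                String result = (String) conn.invoke(
                        diag,
                        "jfrDump",
                        new Object[] { new String[] { "name=my-recording", "filename=/tmp/dump.jfr" } },
                        new String[] { String[].class.getName() });
                System.out.println(result);
            } finally {
                connector.close();
            }
        }
    }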
You should probably have a look at an APM tool instead of monitoring with JMC. The product is extremely weak, introduces a lot of overhead (making it unsuitable for production), and creates a lot of issues. There are also developer-focused tools available out there.
APM : AppDynamics (deepest of the bunch), New Relic, Ruxit
Java Developer Tools : Takipi, Fusion Reactor, Javosize
According to the Apache Commons Daemon project:
In case of a system-wide shutdown, the Virtual Machine process may be shut down directly by the operating system without notifying the running server application.
So I'm wondering: what is the value commons-daemon adds when you implement it? If I have an Oracle GlassFish Server instance running, and something happens (OOME, system-wide meltdown, etc.) that would normally send a SIGTERM or a SIGKILL to the JVM running OGS and all of its deployed apps, how could commons-daemon intervene and allow OGS and its deployed apps to shut down quietly/politely?
And, if that's not what commons-daemon is for, can someone please explain to me a use case where it is used and helpful? Thanks in advance.
According to the docs, operating systems have support for a special class of server/daemon programs, and when the OS is about to shut down, it will send those a signal (before the actual SIGTERM/SIGKILL I guess) to notify them about it. Commons Daemon can interface with that.
I am not sure if this is any help if someone terminates the process directly, but if you use the proper service management tools of the OS, then the app probably has enough time to clean up.
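For context, here is a minimal sketch of the lifecycle Commons Daemon hands you when you implement its Daemon interface; the port and the bare ServerSocket stand in for real application code. The point is that an orderly service stop goes through stop()/destroy(), which is where the polite cleanup happens:

    import java.net.ServerSocket;
    import org.apache.commons.daemon.Daemon;
    import org.apache.commons.daemon.DaemonContext;

    public class MyDaemon implements Daemon {
        private ServerSocket listener;
        private Thread worker;

        @Override
        public void init(DaemonContext context) throws Exception {
            // Parse context.getArguments(), read config, open resources.
            listener = new ServerSocket(2775);
        }

        @Override
        public void start() throws Exception {
            worker = new Thread(new Runnable() {
                public void run() {
                    // accept() loop would go here
                }
            });
            worker.start();
        }

        @Override
        public void stop() throws Exception {
            // Orderly shutdown: stop accepting, drain work, flush state.
            listener.close();
            worker.join(5000);
        }

        @Override
        public void destroy() {
            // Release anything left.
        }
    }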
I'm looking for a tool (in Linux) that can monitor a Tomcat/JBoss process and, if the process fails, respawn either or both without my having to manually ssh into the box, do any housekeeping, and then start them up again. I'm not too sure if there is a good tool out there that can monitor the health of JBoss/Tomcat and report on its performance. I know jvisualvm gives you various tools, but I'm looking for a disaster recovery solution that is a bit higher level than jvisualvm.
Java Service Wrapper is an application that wraps your Java process and installs it as a system service (Windows) or daemon (Linux). It pings the VM periodically and restarts it when it does not respond. It has worked for us in production with several applications, including Tomcat, JBoss, Mule, etc. In fact, the Mule ESB distribution even bundles this application.
Also, you don't have to start the application manually when the system boots.
I'm currently working on a daemon to do this and more, since neither JOPR nor Nagios did what we needed, but those are good tools you could use. I'm not sure, but JOPR (or whatever it is called today) may be able to restart your servers in case something goes wrong.
A custom-made solution like the one we're working on shouldn't take you more than a week. The main problem is that to start either JBoss or Tomcat you have to call the startup scripts, but the startup script will restart the service if the exit code is 10, something like this:
start_jboss
while [ $? -eq 10 ]; do
    start_jboss
done
So this daemon, which is written in Java, uses JMX to connect to the JBoss server and tells JBoss to shut down and exit with status code 10 using a method on an MBean. I'm at home, so I'm not sure of the exact name of the MBean you have to call for this, but I'll provide more info tomorrow.
I am using monit to control the launch of Tomcat/JBoss.
I want to find or develop an application that can run as a daemon and notify the administrator by email or SMS when the Java applications running on a host get any exceptions or errors. I know JVMTI can achieve part of my goal, but it will impact the performance of the monitored applications (I don't know by how much; it would be acceptable if it's slight). Besides, it seems to be a troublesome job to develop a JVMTI agent, and I'm not sure what would happen if several applications running at the same time used the same agent. Are there any better solutions? Thanks in advance.
One way would be to use a logging system like log4j that publishes all errors occurring on system A to a logging server on system B, from which you can monitor the errors that occurred. This isn't a completely generic solution, however, since only exceptions propagated to log4j (or any other logging system) would be handled - but it may be a good start.
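As a hypothetical sketch of that idea with log4j 1.x, you could attach a SocketAppender programmatically so that ERROR-and-above events are shipped to a log server on system B (the host name and the threshold choice here are assumptions):

    import org.apache.log4j.Level;
    import org.apache.log4j.Logger;
    import org.apache.log4j.net.SocketAppender;

    public class RemoteErrorLogging {
        public static void main(String[] args) {
            // Forward events to a log4j SocketNode/server on system B (default port 4560).
            SocketAppender remote = new SocketAppender("logserver.example.com", 4560);
            remote.setThreshold(Level.ERROR);           // only ship errors and worse
            Logger.getRootLogger().addAppender(remote); // keep any local appenders too

            Logger log = Logger.getLogger(RemoteErrorLogging.class);
            try {
                throw new IllegalStateException("something went wrong");
            } catch (IllegalStateException e) {
                log.error("Caught exception", e);       // this event goes to system B
            }
        }
    }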
The best solution is to have the Java application send its errors via email/SMS. The problem is that programs generate exceptions and handle them correctly in normal operation. You only want particular exceptions.
Failing this you could write a log reader, which reads the logs of the application. This is tricky to get right, but it can be done.
An application can generate 1,000+ exceptions per day and still be behaving normally, because the application knows how to handle these exceptions, e.g. every time a socket connection is closed an exception can be thrown.
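One cheap way to see only the exceptions the application did not handle itself, rather than the thousands it handles routinely, is a default uncaught-exception handler that forwards to whatever alerting you have. This is just a sketch, and sendAlert is a placeholder:

    public class UncaughtAlerter {
        public static void install() {
            Thread.setDefaultUncaughtExceptionHandler(new Thread.UncaughtExceptionHandler() {
                @Override
                public void uncaughtException(Thread t, Throwable e) {
                    // Only truly unhandled errors end up here.
                    sendAlert("Uncaught exception in thread " + t.getName() + ": " + e);
                }
            });
        }

        private static void sendAlert(String message) {
            // Placeholder: hook up email/SMS integration here.
            System.err.println("ALERT: " + message);
        }
    }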
IMO, the best approach is to deploy an external monitoring system. This can:
monitor multiple applications,
monitor infrastructure services,
monitor network availability and machine accessibility, and
monitor resources such as processor and file system usage.
Applications can be monitored in a variety of ways, including:
by processing log events,
by watching for application restarts,
by "pinging" the application's web apis to check service liveness, and
by using the application's JMX interfaces.
This information can be filtered and prioritized in an intelligent fashion, and critical events can be reported by whatever means is most appropriate.
You don't want individual applications sending emails, because they don't have sufficient information to do a decent job. Furthermore, putting the reporting logic into individual applications is likely to lead to inconsistent implementation, poor configurability, and so on.
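As a small illustration of the "pinging" item above, an external monitor might do something like this (the status URL and the alerting hook are placeholders):

    import java.io.IOException;
    import java.net.HttpURLConnection;
    import java.net.URL;

    public class LivenessCheck {
        public static void main(String[] args) {
            String statusUrl = "http://app-host:8080/status";   // assumed endpoint
            try {
                HttpURLConnection conn = (HttpURLConnection) new URL(statusUrl).openConnection();
                conn.setConnectTimeout(5000);
                conn.setReadTimeout(5000);
                int code = conn.getResponseCode();
                if (code != 200) {
                    alert("Status endpoint returned HTTP " + code);
                }
            } catch (IOException e) {
                alert("Application unreachable: " + e.getMessage());
            }
        }

        static void alert(String message) {
            // Placeholder: hand off to email/SMS/pager integration here.
            System.err.println("ALERT: " + message);
        }
    }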
There is a close alternative to JVMTI: JPDA. This infrastructure allows you to create a remote "debugger" (yes, that's effectively what you're planning to do) using Java code, and to connect it to the VM using either a local or a remote connection.
As with JVMTI, there will be an overhead to program execution. However, as the Trace.java example shows, it's quite simple both to implement and to connect to the target VM.
Finally, note that if you want to instrument code run by an application server (JBoss, GlassFish, Tomcat, you name it), there are various other means available.
I follow the pattern where every exception gets logged to a table.
Then an RSS feed selects from that table.
I subscribe to the RSS feed in MS Outlook at work and also on my Android phone with a program called NewsRob. NewsRob lets me set my phone to alert me when there is something new.
I blog about how to do this HERE. It is in .net, but you get the idea.
As a related step I found a way to notify myself when something DIDN'T happen. That blog is HERE.
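A Java take on the "log every exception to a table" half of this pattern might look roughly like the following; the table and column names are assumptions, and the RSS feed would then simply query the same table:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.SQLException;
    import java.sql.Timestamp;

    public class ExceptionTableLogger {
        private final String jdbcUrl;

        public ExceptionTableLogger(String jdbcUrl) {
            this.jdbcUrl = jdbcUrl;
        }

        public void log(Throwable t) {
            String sql = "INSERT INTO app_exceptions (occurred_at, message, stack_trace) VALUES (?, ?, ?)";
            try (Connection conn = DriverManager.getConnection(jdbcUrl);
                 PreparedStatement ps = conn.prepareStatement(sql)) {
                ps.setTimestamp(1, new Timestamp(System.currentTimeMillis()));
                ps.setString(2, String.valueOf(t.getMessage()));
                ps.setString(3, stackTraceOf(t));
                ps.executeUpdate();
            } catch (SQLException e) {
                e.printStackTrace();   // never let the logger itself take the app down
            }
        }

        private static String stackTraceOf(Throwable t) {
            java.io.StringWriter sw = new java.io.StringWriter();
            t.printStackTrace(new java.io.PrintWriter(sw));
            return sw.toString();
        }
    }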
There are loads of applications out there that do what you are looking for in a way that does not impact performance. Have you had a look at Kibana/ElasticSearch, or at Splunk or Logscape for enterprise solutions (they also have free versions)?
I'm going to echo what has already been said and highlight what Java already provides and what you can do with an external monitoring system. Java already provides:
log4j - log ERROR, WARN, and FATAL messages (and exceptions) to a file
JMX - create custom application metrics; you also have access to the java.lang:* platform MBeans, which give you heap memory usage, garbage collection, thread counts, etc. (see the sketch below)
JVM GC logging - you can log all your garbage collection events to a file and watch for any long full GC pauses.
An external monitoring system will allow you to set alerts triggered off different operational scenarios. You will also get visualisation of your system performance through charts. I've used Logscape's Java app in the past to monitor 30 Java processes spread out over 3 hosts.
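For example, the java.lang platform MBeans mentioned above can be read like this from inside the process (an external tool would read the same attributes over a remote JMX connection):

    import java.lang.management.ManagementFactory;
    import java.lang.management.MemoryMXBean;
    import java.lang.management.MemoryUsage;
    import java.lang.management.ThreadMXBean;

    public class PlatformMetrics {
        public static void main(String[] args) {
            MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
            MemoryUsage heap = memory.getHeapMemoryUsage();
            ThreadMXBean threads = ManagementFactory.getThreadMXBean();

            System.out.printf("heap used: %d / %d bytes%n",
                    heap.getUsed(), heap.getMax());
            System.out.println("live threads: " + threads.getThreadCount());
        }
    }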
I have a cluster of 32 servers and I need a tool to distribute a Java service, packaged as a Jar file, to each machine and remotely start the service. The cluster consists of Linux (Suse 10) servers with 8 cores per blade. The application is a data grid which uses Oracle Coherence. What is the best tool for doing this?
I asked something similar once, and it seems that the Java Parallel Processing Framework might be what you need:
http://www.jppf.org/
From the web site:
JPPF is an open source Grid Computing platform written in Java that makes it easy to run applications in parallel, and speed up their execution by orders of magnitude. Write once, deploy once, execute everywhere!
Have a look at OpenMOLE: http://www.openmole.org/
This tool enables you to distribute a computing workflow to several kinds of resources: from multicore machines to clusters and computing grids.
It is nicely documented and can be controlled through groovy code or a GUI.
Distributing a jar on a cluster should be very easy to do with OpenMOLE.
Is your service packaged as an EJB? JBoss does a fairly good job with clustering.
Use BitTorrent. Peer-to-peer sharing on a cluster can really speed up your deployment.
It depends on which operating system you have and how security is setup on your network.
If you can use NFS or a Windows share, I suggest you put the software on a shared drive that is visible to all machines. That way you can run them all from one copy.
If you have remote shell or secure remote shell you can write a script which runs the same command on each machine e.g. start on all machines, or stop on all machines.
If you have Windows you might want to set up a service on each machine. If you have Linux you might want to add a startup/shutdown script to each machine.
When you have a number of machines, it may be useful to have a tool which monitors that all your services are running, collects the logs and errors in one place and/or allows you to start/stop them from a GUI. There are a number of tools to do this, not sure which is the best these days.