I am looking for help starting my Java programs via Monit. I have written a start script, program.sh. The Monit configuration and the script are included in this post.
The issue is that I am not able to start and stop the program using the script file executed via Monit. I can monitor the process if I start it from the terminal, but I can't start or stop it with Monit. The Monit log says "Failed to start".
However, I can easily start and stop programs like ssh from Monit. Monit runs under sudo, and I am running the scripts from an account with administrative privileges.
It would be very helpful if someone could help me figure this out.
Thanks
monitrc file
#++++++++++#+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
#Monit settings
set daemon 10 with start delay 2 # check services at 10-second intervals, starting 2 seconds after launch
set logfile syslog facility log_daemon
set logfile /var/log/monit.log
set idfile /var/lib/monit/id
set statefile /var/lib/monit/state
#+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
#+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
# Mail Server
set mailserver smtp.gmail.com port 587
username "monit.abc123@gmail.com" password "password"
using tlsv1 with timeout 30 seconds
set eventqueue
basedir /var/lib/monit/events # set the base directory where events will be stored
slots 100 # optionally limit the queue size
set alert abc123@gmail.com # receive all alerts
set alert abc123@gmail.com only on { timeout } # receive just service-timeout alerts
#set alert foo@bar { nonexist, timeout, resource, icmp, connection }
#set alert security@bar on { checksum, permission, uid, gid }
# setup the email for the SMS thing over here.......................
#+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
#+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
set httpd port 2813 and
# use address localhost # only accept connection from localhost
allow localhost # allow localhost to connect to the server and
allow 0.0.0.0/0.0.0.0
allow admin:monit # require user 'admin' with password 'monit'
# allow @monit # allow users of group 'monit' to connect (rw)
# allow @users readonly # allow users of group 'users' to connect readonly
#+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
#+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
set mail-format {
from: monit@$HOST
subject: Monit Alert -- $EVENT $SERVICE
message: $EVENT Service $SERVICE
Date: $DATE
Action: $ACTION
Host: $HOST
Description: $DESCRIPTION
Your faithful employee,
Monit
}
#+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
#***********************************************************************************************
#Computer Resources check
check system myhost.mydomain.tld
if loadavg (1min) > 4 for 5 cycles then alert
if loadavg (5min) > 2 for 5 cycles then alert
if memory usage > 75% for 3 cycles then alert
if swap usage > 25% for 5 cycles then alert
if cpu usage (user) > 70% for 5 cycles then alert
if cpu usage (system) > 70% for 5 cycles then alert
if cpu usage (wait) > 20% for 5 cycles then alert
#***********************************************************************************************
################################################################################################
#Monitoring SSH Service
check process ssh123 with pidfile /var/run/sshd.pid
start program = "/etc/init.d/ssh start"
stop program = "/etc/init.d/ssh stop"
if cpu > 50% for 5 cycles then alert
if totalmem > 200 MB for 5 cycles then alert
if children > 2 then alert
#if loadavg(5min) greater than 10 for 8 cycles then stop
#if 5 restarts within 5 cycles then timeout
################################################################################################
#Monitoring Program in Java
check process javaprg with pidfile /home/user/Desktop/Binaries/javaprg.pid
start program = "/home/user/Desktop/Binaries/javaprg.sh start"
stop program = "/home/user/Desktop/Binaries/javaprg.sh stop"
if cpu > 50% for 5 cycles then alert
if totalmem > 1500 MB for 5 cycles then alert
if children > 2 then alert
#if loadavg(5min) greater than 10 for 8 cycles then stop
#if 5 restarts within 5 cycles then timeout
Start/Stop script
#!/bin/bash
case $1 in
start)
echo $$ > javaprg.pid;
exec /usr/bin/java -jar javaprg.jar
;;
stop)
kill $(cat javaprg.pid);
rm javaprg.pid
;;
*)
echo "usage: javaprg {start|stop}" ;;
esac
exit 0
You should use absolute paths in your start/stop script.
You can test it by launching it from a root shell (sudo -s).
You should also consider putting your configuration files in the /etc/monit/conf.d folder.
I had the same problem when I was trying to configure a shell script under Monit.
What solved the problem was putting /bin/sh before the script itself.
Try using:
start program = "/bin/sh /home/user/Desktop/Binaries/javaprg.sh start"
stop program = "/bin/sh /home/user/Desktop/Binaries/javaprg.sh stop"
I had the same problem using your script, and it was because the start script did not specify where to save the PID. It was saving javaprg.pid to / rather than to the folder Monit was watching. Change the start script to write to the same absolute path as the pidfile in your monitrc, for example 'echo $$ > /home/user/Desktop/Binaries/javaprg.pid', and it will work.
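Putting the answers above together, here is a minimal sketch of a start/stop script that uses absolute paths throughout. The directory, jar name, and java path are taken from the question; the APP_DIR override and the function names start_app and stop_app are only illustrative assumptions, so adjust them to your layout.

```shell
#!/bin/sh
# Sketch only: paths mirror the question; APP_DIR can be overridden for testing.
APP_DIR="${APP_DIR:-/home/user/Desktop/Binaries}"
PIDFILE="$APP_DIR/javaprg.pid"   # must match the pidfile path in monitrc
JAR="$APP_DIR/javaprg.jar"

start_app() {
    # Record this shell's PID at an absolute path, then replace the shell
    # with java so the recorded PID becomes the PID of the JVM.
    echo $$ > "$PIDFILE"
    exec /usr/bin/java -jar "$JAR"
}

stop_app() {
    # Nothing to do if no pid file was written.
    [ -f "$PIDFILE" ] || return 0
    kill "$(cat "$PIDFILE")" 2>/dev/null || true
    rm -f "$PIDFILE"
}

case "${1:-}" in
    start) start_app ;;
    stop)  stop_app ;;
    *)     echo "usage: javaprg.sh {start|stop}" ;;
esac
```

Because every path is absolute, the script behaves the same no matter which working directory it is launched from, which is what matters when Monit starts it.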
I have been looking for a long time to see if anyone has had an answer to my issue, but it doesn't seem to exist. I recently realized that I rarely used my M1 Mac Mini, so I decided to turn it into a server that runs 24/7. The only issue is that sometimes, while I'm sleeping and my friends are playing, the server crashes, and there's no way of starting it back up unless I'm awake. So I'm looking for help on how to make a .command file that does one of the following: A) pings the server every minute and, if it senses the server is down, terminates the current terminal and reruns the start command; or B) restarts the server once it crashes and the terminal ends. I prefer option A, but I'll take any help I can get! Thank you so much in advance, everyone!
I tried a script online, and one of them goes like this.
while true
do
cd Desktop
cd server
/Library/Internet_Plug-Ins/JavaAppletPlugin.plugin/Contents/Home/bin/java -Xmx7G -Xms7G -jar forgeserver.jar
echo "If you want to completely stop the server process now, press Ctrl+C before
the time is up!"
echo "Rebooting in:"
for i in 5 4 3 2 1
echo "$i..."
sleep 1
done
echo "Rebooting now!"
done
However I am met with this error
Last login: Thu Feb 9 02:37:12 on ttys001
/Users/myname/Desktop/start.command ; exit;
davidking@Davids-Mac-mini ~ % /Users/myname/Desktop/start.command ; exit;
/Users/davidking/Desktop/start.command: line 11: syntax error near unexpected token `echo'
'Users/davidking/Desktop/start.command: line 11: `echo "$i..."
Saving session...
...copying shared history...
...saving history...truncating history files...
...completed.
[Process completed]
You can try something like this:
while true; do
ping -c 1 -W 1 your_server_IP
if [ $? -eq 0 ]; then
echo "Server is up"
else
echo "Server is down, restarting..."
# Add your server restart command here
/path/to/restart/command
fi
sleep 60
done
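The ping -c 1 -W 1 your_server_IP command sends one Echo Request packet to your_server_IP and waits for a reply. On Linux, -W is given in seconds, so this waits 1 second; on macOS, -W is given in milliseconds, so use -W 1000 there. If a reply is received, the host is reachable; if not, the host is down or unreachable. Note that ping only tells you whether the machine answers, not whether the server process on it is still running.
Save this file as a .command file and make it executable with chmod +x. Adding its folder to your PATH is optional.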
Let me know if it works or you're getting errors! I'll be happy to help :)
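Separately from the ping approach, the syntax error in the question is consistent with the for loop as posted: it has no do before its body, and a for line followed directly by echo produces exactly "syntax error near unexpected token `echo'". Here is a hedged sketch of option B with the loop repaired. The function names countdown and run_forever and the DELAY variable are hypothetical helpers introduced for illustration, not part of the original script.

```shell
#!/bin/sh
# countdown: print 5..1, pausing DELAY seconds (default 1) between numbers.
countdown() {
    for i in 5 4 3 2 1; do      # note the "do" that the original loop lacked
        echo "$i..."
        sleep "${DELAY:-1}"
    done
}

# run_forever CMD...: run the command; whenever it exits, count down and rerun it.
run_forever() {
    while true; do
        "$@"
        echo "If you want to completely stop the server process now, press Ctrl+C before the time is up!"
        echo "Rebooting in:"
        countdown
        echo "Rebooting now!"
    done
}
```

In the .command file you would cd to the server folder using an absolute path and then call run_forever followed by the same java command line used in the question.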
I am trying to measure my Tomcat server shutdown interval and write it to the log. I'm trying to use the following Python code:
log_times.append(datetime.now().strftime(TIME_PATTERN))  # logs start time
subprocess.check_call(['service', 'tomcat7', 'stop'])
pid = subprocess.Popen(['pgrep', '-utomcat7'], stdout=subprocess.PIPE).communicate()[0]
while pid:  # loop until pgrep reports no process owned by tomcat7
    log.info('pgrep found tomcat process: %s', pid)
    pid = subprocess.Popen(['pgrep', '-utomcat7'], stdout=subprocess.PIPE).communicate()[0]
log_times.append(datetime.now().strftime(TIME_PATTERN))  # logs tomcat shutdown time
Is there a better or more accurate way to do this?
You can try using a bash script. For example, stop_tomcat.sh:
START=$(date +%s.%N)
service tomcat7 stop
END=$(date +%s.%N)
DIFF=$(echo "$END - $START" | bc)
echo $DIFF
Make it executable:
chmod 755 stop_tomcat.sh
Call it from Python and read its stdout:
elapsed = subprocess.Popen(['./stop_tomcat.sh'], stdout=subprocess.PIPE).communicate()[0]
More details about measuring time in bash https://unix.stackexchange.com/a/12069.
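The script above times only the service command itself. If the stop command can return before the Tomcat process has actually exited, as the polling loop in the question suggests, the wait can be folded into the measurement. The following is a minimal sketch under that assumption; wait_while and POLL are hypothetical names, and the result is in whole seconds only.

```shell
#!/bin/sh
# wait_while CMD...: rerun CMD until it fails, then print the elapsed whole seconds.
# POLL is the pause between checks (default 0.1 s; fractional sleep is assumed
# to be supported, as it is on GNU and BSD sleep).
wait_while() {
    start=$(date +%s)
    while "$@" >/dev/null 2>&1; do
        sleep "${POLL:-0.1}"
    done
    end=$(date +%s)
    echo $((end - start))
}
```

For the Tomcat case this could be used as: service tomcat7 stop; wait_while pgrep -u tomcat7. Start the clock before the stop command if you want the total shutdown interval rather than just the tail after the command returns.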
At this moment I am using command below as a part of my batch script to boot domain1:
asadmin start-domain domain1
However, I have recently installed domain1 as a service, so now when I use this command the domain starts under my user process instead of as a service. After I log out, the domain is gone. I used:
net start domain1
and
sc start domain1
However, both of these seem to return as soon as the start request is dispatched to the service, and they do not wait until domain1 has actually started. "asadmin start-domain" did not return until the domain had started.
I have to wait because my script undeploys and deploys a new app shortly after the domain starts. So is there any way to start GlassFish as a service from a batch command and wait until it has started?
Install:
sc create ServiceName binpath= <PATH_TO_SERVICE>.exe
net start ServiceName
PAUSE
Start:
net start ServiceName
PAUSE
Stop:
net stop ServiceName
PAUSE
Uninstall:
net stop ServiceName
sc delete ServiceName
PAUSE
One of the solutions I am using:
@echo off
SETLOCAL enableextensions enabledelayedexpansion
set GLASSFISH_HOME=c:\glassfish
set DOMAIN=domain1
net start %DOMAIN%
:loop
call timeout /t 1 /NOBREAK > NUL
echo Still waiting for domain to start
for /f "tokens=1,2 delims= " %%A IN ( '"%GLASSFISH_HOME%\bin\asadmin.bat" list-domains' ) DO IF "%%A"=="%DOMAIN%" SET GLASSFISH_RUNNING=%%B
if not "%GLASSFISH_RUNNING%"=="running" (
goto loop
)
I modified the above version a little bit for better understanding:
@echo off
SETLOCAL enableextensions enabledelayedexpansion
set GLASSFISH_HOME=D:\glassfish
set DOMAIN=domainName
set SERVICE_NAME="name of your service"
net start %SERVICE_NAME%
:loop
call timeout /t 1 /NOBREAK > NUL
echo Still waiting for domain to start
for /f "tokens=1,2 delims= " %%A IN ( '"%GLASSFISH_HOME%\bin\asadmin.bat" list-domains' ) DO IF "%%A"=="%DOMAIN%" SET GLASSFISH_RUNNING=%%B
if not "%GLASSFISH_RUNNING%"=="running" (
goto loop
)
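For readers who run asadmin from a POSIX shell instead (for example on Linux or in Git Bash), the same polling idea can be sketched as below. It assumes, as the batch scripts above do, that asadmin list-domains prints one line per domain with the domain name first and its state second, and that asadmin is on the PATH; domain_state and wait_for_domain are hypothetical helper names.

```shell
#!/bin/sh
# domain_state NAME: read `asadmin list-domains` output on stdin and print the
# second whitespace-separated field of the line whose first field is NAME.
domain_state() {
    awk -v name="$1" '$1 == name { print $2 }'
}

# wait_for_domain NAME: poll once per second until NAME is reported as "running".
wait_for_domain() {
    until [ "$(asadmin list-domains | domain_state "$1")" = "running" ]; do
        echo "Still waiting for domain to start"
        sleep 1
    done
}
```

A domain that is reported as "not running" yields "not" as its second field, so the loop keeps waiting, which mirrors the tokens=1,2 parsing in the batch version.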
Some applications already provide a way to register themselves as Windows services automatically. However, any .exe can be configured to run as a service.
GUI: http://www.sevenforums.com/tutorials/2495-services-start-disable.html
Console: https://support.microsoft.com/de-de/kb/137890
Automatic startup before logon: https://serverfault.com/questions/227862/run-a-program-without-user-being-logged-on
Type the following into Notepad and save it as [name].bat (for Windows):
cd C:\glassfish3\glassfish3\bin
asadmin start-domain
PAUSE
Situation: I have a keep-alive shell script that restarts an application whenever it shuts down. However I do not want it to do this if the application was closed via a SIGTERM or SIGINT (kill, Ctrl+C, etc.) i.e. a shutdown hook. However I have no way of setting the exit code, hence communicating to the keep-alive script, when exiting from a shutdown hook as calling exit is illegal.
From Javadocs for exit:
If this method is invoked after the virtual machine has begun its shutdown sequence then if shutdown hooks are being run this method will block indefinitely. If shutdown hooks have already been run and on-exit finalization has been enabled then this method halts the virtual machine with the given status code if the status is nonzero; otherwise, it blocks indefinitely.
Is this possible?
If the process has been killed by a signal, the $? variable will be set to 128 + signal:
bash$ sleep 3;echo $?
0
bash$ sleep 3;echo $?
^C
130
Here, 130 is 128 + 2, where 2 is the signal number of SIGINT.
Grab the PID of the process in a variable and use the wait builtin: if the process has been terminated by a signal, the return code of wait will be 128 + the signal number.
#
# Note: output from shell trimmed
#
# Launch cat in the background, capture the PID
$ cat & PIDTOCHECK=$!
$ echo $PIDTOCHECK
27764
#
# Call wait a first time: the program is halted waiting for input (SIGTTIN)
#
$ wait $PIDTOCHECK ; echo $?
149
#
# Now kill cat, and call wait again
#
$ kill %1
$ wait $PIDTOCHECK ; echo $?
143
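Applied to the keep-alive script from the question, the exit status can decide whether to restart. This is a minimal sketch assuming the application's exit status really is 130 (SIGINT) or 143 (SIGTERM) when it is stopped by those signals; please verify that on your JVM and platform. The names should_restart, keep_alive, and DELAY are hypothetical.

```shell
#!/bin/sh
# should_restart STATUS: succeed unless STATUS is 130 (128 + SIGINT)
# or 143 (128 + SIGTERM).
should_restart() {
    case "$1" in
        130|143) return 1 ;;
        *)       return 0 ;;
    esac
}

# keep_alive CMD...: rerun CMD whenever it exits, except after SIGINT/SIGTERM,
# in which case return that status to the caller.
keep_alive() {
    while true; do
        "$@"
        status=$?
        should_restart "$status" || return "$status"
        sleep "${DELAY:-1}"
    done
}
```

For example, keep_alive java -jar app.jar (app.jar being a placeholder) would restart after a crash or a normal exit but stop looping after kill or Ctrl+C.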
Here's what I do:
Runtime.getRuntime().halt(0);
Note that this will exit the program immediately, so you need to do it after the last shutdown hook has finished.
I am using RAD 7.5.0 and WebSphere server v6.1. When I start the server in debug mode, it displays an error message stating:
'Starting WebSphere application server in localhost' has encountered a problem.
JVM debug port #### is in use.
What is the problem? How do I resolve this?
I was also frustrated by this error many times, and finally found the solution.
To resolve this problem you will need to do the following to all of your subsequent servers:
1. Start the server in 'normal' mode (i.e. non-debug mode).
2. Launch the Administrative console and log in.
3. Expand 'Servers', click on 'Application Servers', and then your server instance (typically 'server1').
4. On the 'Configuration' tab expand 'Java and Process Management' and then click on 'Process Definition'.
5. Under the 'Additional Properties' header, click on 'Java Virtual Machine'.
6. Scroll to the bottom of the page, locate the 'Debug Arguments' text field, and increment the 'address' property at the very end of the string so it will use a unique port value.
7. Save your changes, exit the administrative console, stop the server, and then start it in debug mode.
It simply means that the debug port is currently in use. Do you have any other IBM products already running on that box? Does this happen when you start your server for the first time or for subsequent tries?
One suggestion would be to hunt down rogue hanging Java processes and kill them (in case you don't need them) to resolve this.
Open a Command Prompt as administrator and run the following two commands.
1: netstat -ano | findstr :PORT_NUMBER
e.g.: netstat -ano | findstr 7781
2: taskkill /PID PID /F (PID will be the number that is shown e.g.: 32181)
e.g.: taskkill /PID 32181 /F