PowerShell process from Java - Process may "delay" (not hang) on some environments

We are creating a number of PowerShell (PoSH) scripts and running them in order from Java
(building a process, handling all input/output streams that might cause hangs, and then invoking it with local admin privileges).
We are facing an interesting occurrence in some of our deployments.
When tracking the log file of this operation we sometimes see a considerable delay. On an
affected machine the delay is always 35 seconds:
if one log entry is written just before the process is built and the next one is written
from the invoked PoSH script, the gap between them is 35 seconds.
This holds for all scripts invoked on that machine.
The behavior does not occur everywhere. We have several (unrelated) machines that exhibit it, but also many others that take 2-5 seconds, which we accept as a normal time for building the process.
Our Java process is a 32-bit process, and most of the affected machines are 64-bit VMs running Windows Server 2008 R2. Most of these VMs use domain authentication and policies (different ones for different customers).
We tried launching PowerShell by various other methods (such as PSTools) but found nothing: launched that way, it always starts within a few seconds. Comparing machine installations did not give us any insights either. Performance does not seem to be an issue, though I'll admit we haven't analyzed it deeply; we only watched Task Manager closely.
It is important to mention that the process never hangs. Once started, it runs and finishes quickly. The delay happens during the startup of the PoSH process.
Any ideas, advice or speculative directions will be more than welcome.
Thanks,
Yaron
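For readers who want to reproduce the measurement, here is a minimal Java sketch of the kind of launch described above. It is not the poster's code: the class name PowerShellLauncher and the choice of the -NoProfile and -NonInteractive switches are assumptions made for illustration. The sketch closes the child's stdin right after starting it, sends the child's output straight to the parent's console so that no pipe buffer can fill up, and reports how long the whole run took. It will not by itself explain a 35-second delay, but it gives a small baseline that can be compared across machines.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;

final class PowerShellLauncher {

    /** Builds the command line. The switch selection is an assumption, not the poster's setup. */
    static List<String> buildCommand(String scriptPath) {
        List<String> command = new ArrayList<>();
        command.add("powershell.exe");
        command.add("-NoProfile");      // do not load profile scripts at startup
        command.add("-NonInteractive"); // never prompt for interactive input
        command.add("-File");
        command.add(scriptPath);
        return command;
    }

    /** Runs the command to completion and returns the elapsed wall-clock time in milliseconds. */
    static long runAndTime(List<String> command) throws IOException, InterruptedException {
        long start = System.nanoTime();
        Process process = new ProcessBuilder(command)
                .redirectErrorStream(true)                       // merge stderr into stdout
                .redirectOutput(ProcessBuilder.Redirect.INHERIT) // child writes to our console
                .start();
        process.getOutputStream().close(); // close the child's stdin so it never waits for input
        process.waitFor();
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: PowerShellLauncher <script.ps1>");
            return;
        }
        long millis = runAndTime(buildCommand(args[0]));
        System.out.println("PowerShell run took " + millis + " ms");
    }
}
```

Running the same script with and without -NoProfile on an affected machine is one cheap way to see whether profile loading contributes to the startup time.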

Related

JVM only using half the cores on a server

I have a number of Java processes using OpenJDK 11 running on Windows Server 2019. The server has two physical processors and 36 total cores; it is an HP machine. When I start my processes, I see work allocated across all the cores in Task Manager. This is good. However, after the processes have run for some time (the amount of time is not consistent), the machine begins to utilize only half the cores.
I am working off a few theories:
The JDK has some problem that is preventing it from consistently accessing all the cores.
Something with Windows Server 2019 is causing a problem, limiting Java from accessing all the cores.
There is a thermal management problem and one processor is getting too hot and the OS is directing all the processing to the other processor.
There is some issue with hyper-threading and the 'logical' processors that is causing the process to not be able to utilize all the cores.
I've tried searching for JDK issues and haven't found anything like this mentioned. I went down to the server and while it's running a little warm, it didn't appear excessively hot. I have not yet tried disabling hyper-threading. I have tried a number of parameters to force the JVM to use all the cores and indeed the process initially does use all the cores; I can see the activity in Task Manager.
Anyone have any thoughts? This is a really baffling problem and I'd appreciate any ideas.
UPDATE: I am able to make it use the other processor by using Task Manager to assign one of the java.exe processes to the other processor. This also works from the java invocation on the command line, with an argument for which socket to use.
Now that said, this feels like a hack. I don't see why I should have to manually assign a socket to each of my java processes; that job should be left to the OS. I'm still not sure exactly where the problem is, if it's the OS or what.
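One small check that may help narrow down the theories above is to ask the JVM how many logical processors it believes it can use and compare that with what the hardware should offer. The sketch below is an illustration only; the class name CpuReport and the idea of passing the expected count on the command line are assumptions. Note that availableProcessors() reports what the JVM sees, not where the OS actually schedules the threads.

```java
final class CpuReport {

    /** Number of logical processors the JVM reports as available to it. */
    static int visibleProcessors() {
        return Runtime.getRuntime().availableProcessors();
    }

    /** Formats a one-line comparison between the visible and the expected processor count. */
    static String describe(int visible, int expected) {
        if (visible < expected) {
            return "JVM sees " + visible + " of " + expected + " expected logical processors";
        }
        return "JVM sees all " + visible + " logical processors";
    }

    public static void main(String[] args) {
        int visible = visibleProcessors();
        // The expected count is supplied by the user; default to whatever the JVM reports.
        int expected = args.length > 0 ? Integer.parseInt(args[0]) : visible;
        System.out.println(describe(visible, expected));
    }
}
```

Running it once right after startup and again from inside a process that has dropped to half the cores would show whether the JVM's own view changes over time.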

Java server application slow after period of idleness (Windows)

I'm having trouble with a Jetty 9 server application that seems to go into some kind of resting state after a longer period of idleness. Normally the memory usage of the Java process is ~500 MB, but after being idle for some time it seems to drop to less than 50 MB. The first request that arrives then takes up to several seconds to get a response, whereas requests normally take tens of milliseconds. After one or two requests the application seems to be back to its normal responsive state.
I'm running on the 32-bit Oracle Java 8 JVM. My JVM configuration is very basic:
java -server -jar start.jar
I was hoping that this issue might be solvable through JVM configuration. Does anyone know if there's any particular parameter to disable this type of behavior?
edit: Based on the comment from Ivan, I was able to identify the source of the issue. Turns out Windows was swapping parts of the Java process out to disk. See my own answer below for a description of my solution.
Based on the comment from Ivan, I was able to identify the source of the issue. Turns out Windows was swapping parts of the Java process out to disk. This was clearly visible when comparing the private working set to the commit size in the task manager.
My solution to this was two-fold. First, I made a simple scheduled job inside my server app that runs every minute and does a simple test run to make sure that the important services never go inactive for long periods. I'm hoping this should ensure that Windows doesn't regard the related pages as inactive.
Afterwards, I also noticed that the process was executing with "Below normal" priority. So I changed the script that starts the server to ensure that it runs with "High" priority going forward. This seems likely to affect swapping behavior and may very well have been enough to resolve the issue on its own, but I only found it after already deploying my first solution, so that remains unclear. In any case, everything seems to be working as it should now.
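The scheduled job from the first part of this answer could look roughly like the following sketch. It is not the answerer's actual code: the class name KeepAlive is made up, and what the warm-up task does is left to the caller. It runs the task at a fixed rate on a daemon thread and catches runtime exceptions, because an uncaught exception would cancel all later runs of a scheduleAtFixedRate task.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

final class KeepAlive {

    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(runnable -> {
                Thread thread = new Thread(runnable, "keep-alive");
                thread.setDaemon(true); // do not keep the JVM alive just for this job
                return thread;
            });

    /** Runs warmUp immediately and then once per period until stop() is called. */
    void start(Runnable warmUp, long period, TimeUnit unit) {
        scheduler.scheduleAtFixedRate(() -> {
            try {
                warmUp.run();
            } catch (RuntimeException e) {
                // An uncaught exception would cancel all later runs, so log it and carry on.
                System.err.println("keep-alive run failed: " + e);
            }
        }, 0, period, unit);
    }

    void stop() {
        scheduler.shutdownNow();
    }
}
```

A server would call something like new KeepAlive().start(task, 1, TimeUnit.MINUTES) at startup, where task touches the services that should stay resident. Whether this is enough to keep Windows from paging them out depends on how much of the working set the task actually touches.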

Why does my Java app run faster with profiler attached?

I am developing a Java SE 8 application in NetBeans. A new feature I recently added to the app was running too slowly (it took about a minute for the calculations to finish). So I fired up the profiler to see what the major bottleneck was. To my surprise, the calculations completed in about 7 seconds.
Couldn't believe it at first, but the results were correct.
Tried it a few times again, but the app always ran 10 times faster with the profiler attached to it. I also tried to run the compiled .jar file directly from the Windows command line, but the computations took about a minute again and again.
How is it possible that the attached profiler provides such a massive boost to performance? What changes does it make to the JVM or the application?
BTW, I am using native OpenCV in these calculations with provided Java wrapper, if it makes any difference.
//Edit - Additional info: I am using the built-in Netbeans 8.1 profiler, which I believe is basically VisualVM. As for a profiling method I chose to monitor "Methods" and their execution times and invocation counts. The performance bump happens both with instrumented and sampled profiling.
Unfortunately there probably isn't one single answer that will explain why this is the case. Of course, it will depend on what the program is doing as well as how the program is being launched. For example, if you're using the profiler to launch the application (as opposed to connecting afterwards) then it may be that the profiler is launching with different configuration (heap size, garbage collector etc.) and that is the cause of the difference.
If you run jcmd you should see a list of processes. You can then run jcmd <id> VM.flags to see what the JVM has been configured with, and verify that the flags are the same when the application runs under the profiler and when it doesn't.
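The same comparison can also be made from inside the application, which is handy when the process is short-lived. The following is a small sketch, not part of the original answer; the class name JvmConfigDump is made up. It prints the arguments the JVM was started with, the maximum heap size and java.library.path, so that the output of a plain run and a profiled run can be compared side by side.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

final class JvmConfigDump {

    /** Arguments passed to the JVM itself (not to main), e.g. -Xmx or -agentpath options. */
    static List<String> inputArguments() {
        return ManagementFactory.getRuntimeMXBean().getInputArguments();
    }

    /** Maximum heap the JVM will attempt to use, in MiB. */
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024 * 1024);
    }

    public static void main(String[] args) {
        System.out.println("JVM arguments:     " + inputArguments());
        System.out.println("Max heap (MiB):    " + maxHeapMiB());
        System.out.println("java.library.path: " + System.getProperty("java.library.path"));
    }
}
```

Calling the two static methods at the start of the slow calculation and logging the result in both modes would show whether the profiler launches the JVM with different settings.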
Another possibility is that your program is excessively locking, and this excessive locking is causing thrashing in your application when the profiler isn't attached. With it attached the locking may be slower, resulting in the application threads co-operating and ultimately making faster progress.
However these are just suggestions of how you can investigate further; it's quite likely that there is another as yet undiscovered problem that you're seeing which is completely different (e.g. it's defaulting to a different level of logging ...)

What's the time tolerance of linux shutdown/logout TERM signal handlers in modern linux?

I'm seeing strange behavior (file missing, file outdated) in a Java program of mine that has to save some information at shutdown (using shutdown hooks, which in turn are triggered by the TERM signal).
The obvious workaround is to save as soon as that info is modified, but for performance reasons I'd like to avoid this.
The thing is, it seems to me that the tolerance value is set ridiculously short and init (I think that's the name of the watchdog process) is actually killing the JVM before it can terminate. I don't think it's a bug in my app, because I used a test case where the hook waited at least 20 seconds and it was still terminated almost instantly.
You can see this behavior on shutdown and logout, and also in NetBeans and its open tabs (it won't save them, at least in the recent 7.1 on Java 7).
Is this something I can't avoid and need to work around?
The documentation for telinit(8) says that the init process waits 5 seconds between sending the SIGTERM and SIGKILL signals. This delay can be changed through the -t option.
The same -t option is supported by shutdown(8) and relayed to telinit. Therefore, if you want to increase the delay globally on your system, you'll have to edit either your /etc/inittab configuration file or the helper files in /etc/init.d, depending on your distribution.
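On the Java side, since the window between SIGTERM and SIGKILL is short, it may help to keep the shutdown hook as small as possible and to make the write itself robust against being cut off. The sketch below is one way to do that, under stated assumptions: the class name StateSaver and the idea of passing a snapshot supplier are invented for illustration. It writes to a temporary file in the same directory and then renames it over the target, so a process killed mid-write leaves the previous file intact (plus a stray .tmp file) instead of a truncated one.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.function.Supplier;

final class StateSaver {

    /** Writes content to a temp file next to target, then renames it into place. */
    static void saveAtomically(Path target, String content) {
        try {
            Path dir = target.toAbsolutePath().getParent();
            Path tmp = Files.createTempFile(dir, "state", ".tmp");
            Files.write(tmp, content.getBytes(StandardCharsets.UTF_8));
            // Whether an existing target is replaced by ATOMIC_MOVE is implementation specific;
            // on typical Linux file systems the rename replaces it.
            Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads the saved state back, e.g. at the next startup. */
    static String load(Path target) {
        try {
            return new String(Files.readAllBytes(target), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Registers a shutdown hook that saves the current snapshot when the JVM exits. */
    static void installShutdownHook(Path target, Supplier<String> snapshot) {
        Runtime.getRuntime().addShutdownHook(
                new Thread(() -> saveAtomically(target, snapshot.get()), "state-saver"));
    }
}
```

This does not lengthen the time the system grants before SIGKILL; it only reduces the damage when the hook is interrupted. Calling saveAtomically periodically from a background thread as well would limit how outdated the file can get.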

How to execute a Java program 24 x 7 on linux

I have developed two small Java applications - a vanilla Java app and a Java Web application (i.e. Spring MVC, Servlets, JSP, etc.).
The vanilla application consists of several threads which read data continuously at varying rates (from once a second to twice a minute) from several websites, process the data and write it to a database.
The Web Application reads the data from the database and presents it using JSPs, etc.
I'd now like to deploy the applications to a Linux machine and have them run 24 x 7.
If the applications crash I would like them to be restarted.
What's the best way of doing this?
Your web container will run 24x7 by default. If your deployed application throws an exception, it's captured by the container. I wouldn't normally expect this process to not run. Perhaps if threads run away, then it may become unresponsive, so it's worth monitoring (perhaps by a separate process querying it via HTTP?).
Does your vanilla application need to run at regular intervals? If so, then cron is a good bet. It'll invoke a new instance every 'n' minutes (or however you configure it). If your instance suffers a problem, then it'll simply bail out and a new instance will be launched at the next configured interval. Again, you should probably monitor this (capture log files?) in case some problem means that it never succeeds completely.
With Ubuntu's upstart you can respawn processes automatically. A slightly more low-level option is to put the respawn directly in /etc/inittab. Both work well; upstart is more manageable (more tools) but requires a newer system (Ubuntu, Fedora, and Debian is switching soon).
For inittab you need to add a line like this to /etc/inittab (from the inittab manpage):
12:2345:respawn:/path/to/myapp flags
For upstart you do something similar (this is a standard script in ubuntu 9.10):
user@host:/etc/init$ cat tty1.conf
# tty1 - getty
#
# This service maintains a getty on tty1 from the point the system is
# started until it is shut down again.
start on stopped rc RUNLEVEL=[2345]
stop on runlevel [!2345]
respawn
exec /sbin/getty -8 38400 tty1
Check out ServletContextListener; it allows you to embed your Java application inside your web application (by creating a background thread). Then you can have it all running inside the web container.
Consider investigating and using a web container supported by the operating system vendor, so that all the scripts to bring it up and down (including in case of problems) are written and maintained by somebody other than you.
For example, Ubuntu has Tomcat as a package.
I have a crontab job running every 15 minutes to see if the script is still running. If not, it restarts the service. The script itself is a piece of Perl code:
#!/usr/bin/perl
use diagnostics;
use strict;
# Collect command-line flags into a lookup table.
my %is_set;
for (@ARGV) {
    $is_set{$_} = 1;
}
my $verbose = -1;
if ($is_set{"--verbose"}) {
    $verbose = 1;
}
my @components = ("cdk", "qsar", "rdf");
foreach my $comp (@components) {
    print "Checking component $comp\n" if ($verbose == 1);
    # Count the running processes whose command line matches this component.
    my $bla = `ps aux | grep component | grep $comp-xws | grep -v "ps aux" | wc -l`;
    $bla =~ s/\n|\r//g;
    if ($bla eq "1") {
        print " ... running\n" if ($verbose == 1);
    } else {
        # Not exactly one instance found: start the component in the background.
        print " ... restarting component $comp\n" if ($verbose == 1);
        system "cd /home/egonw/runtime/$comp; sh runCDKcomponent.sh &";
    }
}
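A related idea, sketched here in Java, is to keep the application as a child process of a small supervisor and restart it whenever it exits with a non-zero code. This is a different technique from the cron script above, which polls the process list every 15 minutes; the class name Supervisor, the restart budget and the treatment of exit code 0 as "stop supervising" are all assumptions made for illustration.

```java
import java.io.IOException;
import java.util.List;
import java.util.function.IntSupplier;

final class Supervisor {

    /**
     * Runs the task repeatedly until it returns exit code 0 or has been started
     * maxStarts times. Returns how many times the task was started.
     */
    static int superviseUntilCleanExit(IntSupplier runOnce, int maxStarts) {
        int starts = 0;
        while (starts < maxStarts) {
            starts++;
            int exitCode = runOnce.getAsInt();
            if (exitCode == 0) {
                break; // clean exit: nothing to restart
            }
            System.err.println("exit code " + exitCode + " on start " + starts + " of " + maxStarts);
        }
        return starts;
    }

    /** Wraps an external command as a task that returns the command's exit code. */
    static IntSupplier process(List<String> command) {
        return () -> {
            try {
                return new ProcessBuilder(command).inheritIO().start().waitFor();
            } catch (IOException e) {
                return -1; // could not start the command: count it as a failed run
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return 0; // treat interruption as a request to stop supervising
            }
        };
    }
}
```

Usage would look like Supervisor.superviseUntilCleanExit(Supervisor.process(command), 100), with command holding the java invocation of the application. Nothing restarts the supervisor itself if it dies, so it complements rather than replaces an init-level respawn or a cron check.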
First, when a problem occurs, it is in general a good idea to have a human look at it to find the root cause, as restarting a service without any other action will in many cases not magically solve the issue. The common way to handle this situation is to use a monitoring solution offering some kind of alerting (by email, SMS, etc.) to let a human know that something is wrong and needs human action. For example, have a look at HypericHQ, OpenNMS, Zenoss, Nagios, etc.
Second, if you want to offer some kind of highly available service, running multiple instances of the service (this is often referred to as clustering) would be a good idea. When doing so, if one instance goes down, the service won't be totally interrupted, obviously. Note that when using a cluster, if one node goes down because of too heavy load, it's very unlikely that the remaining part of the cluster will be able to handle the load so clustering isn't an absolute guarantee in all situations. Implementing this (at least for the web application) depends on the application server or servlet engine you are using.
But actually, if you are looking for something simple and pretty straightforward, I'd warmly suggest checking out monit, which is really a better alternative to a custom cron job (don't reinvent the wheel; monit does precisely what you want in a smart way). See this article for an introduction:
monit is a utility for managing and monitoring processes, files, directories and devices on a Unix system. Monit conducts automatic maintenance and repair and can execute meaningful causal actions in error situations. For example, monit can start a process if it does not run, restart a process if it does not respond and stop a process if it uses too many resources. You may use monit to monitor files, directories and devices for changes, such as timestamp changes, checksum changes or size changes.
Java Service Wrapper may help with keeping the Java program up 24x7 (or very close).
Several years ago I worked on a project using Java 1.2 and our goal was to run 24x7. We never made it. The longest we managed to keep Java running was about 2-3 weeks. Twice it crashed after about 15 days. The first time we just restarted it; the second time a colleague did some research and found that the crash was due to an int variable overflowing in the Calendar class: the JdbcDriver had called new Date(year, month, day, hour, minute, second) more than about 300 million times and each call had incremented the int 6 times. I think this particular bug may be fixed, but you may find there are others that you encounter as you try to keep the JVM running for a long time.
So you may need to design your application to be restarted occasionally to avoid this kind of thing.
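The arithmetic behind that overflow is easy to check. The sketch below assumes the counter was a signed 32-bit int that started at zero and grew by 6 per call, as described; the class name CounterOverflow is made up. It shows that roughly 358 million calls are enough to exhaust an int, which fits the "more than about 300 million" figure, and what happens on the next increment.

```java
final class CounterOverflow {

    /** Number of calls a zero-based int counter survives when each call adds incrementsPerCall. */
    static long callsUntilOverflow(int incrementsPerCall) {
        return Integer.MAX_VALUE / incrementsPerCall;
    }

    public static void main(String[] args) {
        System.out.println("Calls before overflow: " + callsUntilOverflow(6));

        int counter = Integer.MAX_VALUE;
        counter++; // int arithmetic wraps around silently instead of throwing
        System.out.println("MAX_VALUE + 1 = " + counter);
    }
}
```

Using a long for such counters, or Math.addExact where an exception is preferable to a silent wrap, avoids this class of problem in one's own code.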
