Second instance of Wildfly trying to start - java

We use Wildfly 10.1.0.Final here and over the last few days some curious behavior has manifested. First it was one user, but now over the course of the last few days, it's up to four users.
In house, we typically run Wildfly through standalone.bat on Windows (although I use mine as a service, and on Ubuntu).
The behavior is that after about an hour, a second instance of Wildfly attempts to start. These users do not have it installed as a service; it's run purely through the script. Sometimes it takes about two hours, but it's typically one. What we'll see in the log is that the Configured system properties: line prints in the log, followed by the usual startup information. There is no previous shutdown, no restart; the existing Java process controlling Wildfly is still running. A second whole Java process is being started. It gets about 1 second into the startup process, at which point Undertow sees the 8080 port is already in use, and then stops. What happens though, is that the two instances seem to start stepping all over each other and the end result is that both Java processes are still running, and our application is undeployed, with accompanying .undeployed file.
I have searched around but have turned up nothing. Are there any facilities in Wildfly to try and troubleshoot this? Is there a way to determine why Wildfly was started (to try and see why the second instance is popping up)? I think it's unlikely to be anything in our code that makes a whole new second Java process pop up, because we don't have any such thing coded in, but I'm open to possibilities.
EDIT: To add a detail, I just noticed that when this happens, while the original process was started by the logged in user (as shown in Windows Task Manager), the second process was started by the SYSTEM user. Keep in mind that Wildfly was not installed as a service (i.e. it's not erroneous service startup).
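One way to narrow down who launches the second process is to list the running Java processes together with their owner, start time and parent process. The following is only a minimal sketch: it assumes a separate Java 9+ runtime is available for the diagnostic (ProcessHandle was added in Java 9), and the class name JavaProcessReport is made up for illustration. On Windows, details of processes owned by another account such as SYSTEM may only be visible from an elevated prompt.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical diagnostic helper, not part of WildFly. Requires Java 9+.
class JavaProcessReport {

    // Describe one process: pid, owner, start time, command and parent.
    static String describe(ProcessHandle p) {
        ProcessHandle.Info info = p.info();
        String parent = p.parent()
                .map(pp -> pp.pid() + " (" + pp.info().command().orElse("unknown") + ")")
                .orElse("none");
        return "pid=" + p.pid()
                + " user=" + info.user().orElse("unknown")
                + " started=" + info.startInstant().map(Object::toString).orElse("unknown")
                + " command=" + info.command().orElse("unknown")
                + " parent=" + parent;
    }

    // All visible processes whose executable path contains "java".
    // Processes whose command cannot be read are skipped.
    static List<String> javaProcesses() {
        return ProcessHandle.allProcesses()
                .filter(p -> p.info().command()
                        .map(c -> c.toLowerCase().contains("java"))
                        .orElse(false))
                .map(JavaProcessReport::describe)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        javaProcesses().forEach(System.out::println);
    }
}
```

Running this while both instances are alive shows the parent of the SYSTEM-owned process, which may hint at what started it (for example a scheduled task or another service).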

Related

OpenJdk initial startup time very slow

I'm running OpenJDK 11.0.3 on a server. Whenever the server has been rebooted (every night), users have to wait 35 seconds on the first launch of my application before it even starts, that is, before the first System.out.println in the main method is written. Subsequent launches are very fast, though.
I have tried the following option to debug this:
-Xlog:class+load:file=classload.txt
Here are the most important findings:
...
[2.284s][info][class,load] jdk.internal.loader.URLClassPath$FileLoader source: jrt:/java.base
[5.032s][info][class,load] sun.security.rsa.RSASignature$SHA1withRSA source: jrt:/java.base
…
[5.051s][info][class,load] java.util.LinkedList$Node source: jrt:/java.base
[8.121s][info][class,load] pos.LFChangeable source: file:/C:/Users/rho/AppData/Roaming/edapp/pos.jar
…
[8.135s][info][class,load] java.io.FileNotFoundException source: jrt:/java.base
[10.584s][info][class,load] sun.reflect.misc.ReflectUtil source: jrt:/java.base
…
[11.744s][info][class,load] java.security.NoSuchAlgorithmException source: jrt:/java.base
[34.853s][info][class,load] jdk.internal.logger.DefaultLoggerFinder source: jrt:/java.base
Why is it hanging for 23 Seconds between loading java.security.NoSuchAlgorithmException and jdk.internal.logger.DefaultLoggerFinder? And what about the other seconds of slowdowns?
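Pauses like these are easier to spot when the gaps between consecutive log lines are computed rather than read off by eye. A minimal sketch, assuming the log uses the uptime decoration shown above ("[11.744s]") at the start of each line; the class name ClassLoadGaps and the 1-second threshold are arbitrary choices for illustration:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper that reports long pauses in a class-load log.
class ClassLoadGaps {

    // Matches the uptime decoration at the start of a line, e.g. "[11.744s]".
    private static final Pattern UPTIME = Pattern.compile("^\\[(\\d+\\.\\d+)s\\]");

    static final class Gap {
        final double seconds;
        final String before;
        final String after;

        Gap(double seconds, String before, String after) {
            this.seconds = seconds;
            this.before = before;
            this.after = after;
        }
    }

    // Returns every pair of consecutive timestamped lines that are
    // at least minSeconds apart.
    static List<Gap> findGaps(List<String> lines, double minSeconds) {
        List<Gap> gaps = new ArrayList<>();
        String prevLine = null;
        double prevTime = 0;
        for (String line : lines) {
            Matcher m = UPTIME.matcher(line);
            if (!m.find()) {
                continue; // skip lines without a timestamp
            }
            double t = Double.parseDouble(m.group(1));
            if (prevLine != null && t - prevTime >= minSeconds) {
                gaps.add(new Gap(t - prevTime, prevLine, line));
            }
            prevLine = line;
            prevTime = t;
        }
        return gaps;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.out.println("usage: java ClassLoadGaps.java <classload.txt>");
            return;
        }
        for (Gap g : findGaps(Files.readAllLines(Paths.get(args[0])), 1.0)) {
            System.out.printf("%.3fs between:%n  %s%n  %s%n", g.seconds, g.before, g.after);
        }
    }
}
```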
edit:
Based on the comments, I will clarify some.
This is a Windows RDP server. Actually, it is more than one server, but the problem occurs on all of them.
The application is a standalone application. So every morning there are problems, because users who log in and start the application will try to start it multiple times when "nothing happens".
I have now tried restarting one of the servers quite a few times, and this is what I found:
Starting my Application with java11 after reboot takes on average 40 seconds before the first System.out.println. Then it is only 1-2 Seconds before my first JFrame shows.
Starting my Application with java8 (sun) after reboots takes on average 16 Seconds before the first System.out.println. But I then get a 25 second delay before my first JFrame shows.
Starting my Application with java11 after already started with java8 takes on average 4-6 seconds.
Your application might be suffering from the absence of a “class data-sharing (CDS) archive”. Such an archive allows much faster loading of standard classes and has been generated by default by some installers of previous versions, but OpenJDK 11 does not have an installer.
This is addressed by JEP 341:
Currently, a JDK image includes a default class list, generated at build time, in the lib directory. Users who want to take advantage of CDS, even with just the default class list provided in the JDK, must run java -Xshare:dump as an extra step. This option is documented, but many users are unaware of it.
So while this JEP is about JDK 12 doing the necessary steps automatically, it also mentions the fix for JDK 11: just run java -Xshare:dump on the command line once, to generate the archive.
Note that you can improve the startup time even further by including application classes in the CDS. See also the Class Data Sharing section of the JDK 11 documentation.
I have now tested extensively, and I am prepared to publish my results, along with the 2 different "solutions" that I made.
First, let me explain my application a bit. It is a Swing enterprise application that started its life 13 years ago and has been extended ever since.
The application is therefore big, does a lot of different things (although most users use only a portion of it), and has about 120 jar-files on its classpath, including all the third-party jars.
As previously mentioned, after restart of the server it takes 35 seconds before my first login-JFrame is shown.
Solution 1:
This was my first solution, and is not a solution to the slow start, but more a solution to the user not starting multiple instances of the application.
I noticed that although my application was very slow on the first initial start, other applications were not.
A workaround was therefore to make a small standalone-application to display the splash screen, that I start like this in my program:
splashProcess = Runtime.getRuntime().exec("javaw -jar splash.jar");
Later I just kill it off with
splashProcess.destroy();
Note that if I should create a splashscreen with new JFrame() instead, it would take the usual 35 seconds before it is displayed.
Solution 2:
While testing, I found out I could simulate a restart by just deleting all the jar-files and copying them back.
In addition to reducing testing time, I found out that starting the application with just the 4-5 jar-files needed for the initial startup was very fast (although that would have led to ClassNotFoundExceptions later).
This also meant that I could try to figure out which jar-file led to the hang, by copying all jar-files back and then omitting them one by one.
However, I found out that it was not one jar-file that was to blame. The seconds it takes before the application start was steadily reduced a little bit each time I removed some jar-files.
So, it seems that the problem is that the first time I call new JFrame() in my application, java seems to build some sort of index or something of all classes in the classpath, although they are not used at this time.
I don't know why it does this, but this process takes quite some time with 120 jar-files on the classpath.
This led me to solution nr 2. When my application now starts, I check for an argument "startSilent".
If this is present, the only thing my application does is show a new JDialog with size 0,0 and then call System.exit(0);
I then made a script that runs my application with the "startSilent"-parameter that starts when the user logs in.
Now, if the user logs into the server and waits at least 35 seconds before starting our application, the start is now lightning fast, as the application has already started and exited once, so that the "classpath-index" or whatever it is has been built.
If the user starts the application after a shorter time, the start-time is reduced by how long the silent-script has already run.
(And the start will always be at least a fair degree faster than before, as the script starts before the desktop is ready).
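The "startSilent" check described above could look roughly like the following. This is only a sketch under assumptions: the class name WarmStart is made up, and the real application's normal startup is replaced by a placeholder.

```java
import java.util.Arrays;
import javax.swing.JDialog;
import javax.swing.SwingUtilities;

// Hypothetical entry point illustrating the "startSilent" warm-up run.
class WarmStart {

    static final String SILENT_FLAG = "startSilent";

    // True if the warm-up flag was passed on the command line.
    static boolean isSilentStart(String[] args) {
        return Arrays.asList(args).contains(SILENT_FLAG);
    }

    public static void main(String[] args) throws Exception {
        if (isSilentStart(args)) {
            // Touch Swing once so the slow first-time initialization happens
            // now, then exit without showing anything useful to the user.
            SwingUtilities.invokeAndWait(() -> {
                JDialog warmUp = new JDialog();
                warmUp.setSize(0, 0);
                warmUp.setVisible(true);
                warmUp.dispose();
            });
            System.exit(0);
        }
        // Placeholder for the real application's normal startup.
        System.out.println("normal startup");
    }
}
```

The login script would then run the application once with the startSilent argument, using the same classpath as the real launch so that the same jar-files are touched.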
So these are the results of my findings. I hope others will find them useful, and if someone can explain why what I call the "classpath-index" is created the way it is, I would welcome it.

Java server application slow after period of idleness (Windows)

I'm having trouble with a Jetty 9 server application that seems to go into some kind of resting state after a longer period of idleness. Normally the memory usage of the Java process is ~500 MB, but after being idle for some time it seems to drop down to less than 50 MB. The first request that comes in takes up to several seconds to respond, whereas requests are normally on the scale of tens of milliseconds. But after one or two requests it seems like the application is back to its normal responsive state.
I'm running on the 32-bit Oracle Java 8 JVM. My JVM configuration is very basic:
java -server -jar start.jar
I was hoping that this issue might be solvable through JVM configuration. Does anyone know if there's any particular parameter to disable this type of behavior?
edit: Based on the comment from Ivan, I was able to identify the source of the issue. Turns out Windows was swapping parts of the Java process out to disk. See my own answer below for a description of my solution.
Based on the comment from Ivan, I was able to identify the source of the issue. Turns out Windows was swapping parts of the Java process out to disk. This was clearly visible when comparing the private working set to the commit size in the task manager.
My solution to this was two-fold. First, I made a simple scheduled job inside my server app that runs every minute and does a simple test run to make sure that the important services never go inactive for long periods. I'm hoping this should ensure that Windows doesn't regard the related pages as inactive.
Afterwards, I also noticed that the process was executing with "Below normal" priority. So I changed the script that starts the server to ensure that it's running with "High" priority going forward. This seems likely to affect swapping behavior and may very well also have been enough to resolve the issue on its own, but I only found it after already deploying my first solution, so that remains unclear. In any case, everything seems to be working as it should now.
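The scheduled job from the first part of the solution could be sketched as follows. It assumes a plain ScheduledExecutorService is acceptable inside the server app; the class name KeepWarm and the selfTest callback are placeholders for whatever "simple test run" the application provides.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical keep-warm job: periodically exercises the important code
// paths so their memory pages stay in active use.
class KeepWarm {

    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "keep-warm");
                t.setDaemon(true); // never block JVM shutdown
                return t;
            });

    // Runs selfTest repeatedly with the given period.
    void start(Runnable selfTest, long period, TimeUnit unit) {
        scheduler.scheduleAtFixedRate(() -> {
            try {
                selfTest.run();
            } catch (RuntimeException e) {
                // An uncaught exception would cancel all further runs,
                // so log it and carry on.
                System.err.println("keep-warm run failed: " + e);
            }
        }, period, period, unit);
    }

    void stop() {
        scheduler.shutdownNow();
    }
}
```

For the once-a-minute schedule described above, the server would call something like start(selfTest, 1, TimeUnit.MINUTES) during startup and stop() during shutdown.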

web start jar validation getting slower with each Java update

We have an (Eclipse RCP) application of 90MB with 139 self-signed jars which starts in 8s without Web Start and in 10s in an older version of Java 7. We configured Java to not use the browser proxy, i.e. deployment.proxy.type=0.
With each update of Oracle's Java startup performance drops. It takes more and more time to fully start up:
7u60/7u65/8u25: 13s (starts after 5s of web start processing)
7u75: 23s
8u31: 20s
8u40: 29s
8u51/8u60ea: 32s
What can I do to solve this issue?
From the trace/logs I can see that it is very probable that this slowdown is completely due to validating the cached jars taking much more time. Note that this question is similar but doesn't provide the following details:
Diagnosis:
When cached, the update check runs in only 0.5s (server returns "304 Not Modified"), but even with a full download it takes only a few seconds on the gigabit network.
After the update check, for each jar XXX there is a log entry:
validating cached jar XXX.jar
When this is done, com.sun.javaws.Main is started after which the same validation seems to happen again and takes about the same amount of time, then the application starts.
The time spent in validating the cached jars seems to correspond to the extra time required before the application starts.
The web start splash screen always shows for about 2 seconds corresponding to the update check and is then hidden. Then after almost 20 seconds the Java console finally appears and my application actually starts.
During the delay, jp2launcher.exe uses about 16% processor time on a quad core with hyperthreading (8 logical cores). So it looks like it is fully using one of the logical processors.
What I have already tried but did not make any difference:
clearing web start cache (countless times)
configure deployment.properties to disable certificate revocation check (as well as blacklist.check and validation.ocsp, validation.crl)
running offline
using the version download protocol
add to site exceptions list
check web server logs for problems. None found, update check runs in about 500ms for all 138 jars.
use another web server
checked certificate expiration date 17 feb 2016
validated my jnlp with JaNela and found no serious issues
create a deployment rule set to allow the application to run unsigned in order to speed up validation. This should be possible and looks like a promising way to solve this, but I could not get it working. See also my answer on this post.
configured Java to "do not start the console"
Detail: some weird behavior on 7u60
In 7u60 the application is started after about 5 seconds, after which the Java console APPEARS to be doing the jar validation in the background while the application is already started. HOWEVER the .log file reports that the application gets started AFTER all the validations are done. It reports this as 25 seconds and then shows the first System.out of my application which actually happened after only 5 seconds or so. It also shows the jar update check with the server taking 10 times as long as reported by the server. So I guess this is an issue with the logging framework lagging behind! Haven't seen this on 8u51.
Not an answer per se (yet), but I found that Java 8u25, when tracing is enabled, only generates a single trace file. 8u51 generates two files, one from the JVM used to update the application and the other from the JVM used to run it. This is new (two JVM startups) and I think it is related to the new setting for using native Windows sandbox capabilities. The problem is that it shouldn't have to validate the signatures again on the second JVM. The separation into two JVM instances always happens, even when the setting for using the native sandbox is disabled (the default).
I reported a regression bug, will edit the answer if I get an answer from Oracle.
Note: Java 8u31 still runs everything on one JVM but has the same doubled startup time the question stated.

How to implement a java daemon program in weblogic?

I have the task of porting a standalone Java daemon program to J2EE on WebLogic.
Old: The java program starts two threads which loop endlessly based on an interval that can be configured via a properties file.
New: The program should run on WebLogic 10.1.x and start when the managed server it will be deployed to is started, or when the servlet is initialized, and it shouldn't have to be invoked by a client.
I already know that creating your own threads is highly discouraged on WebLogic, so I'm searching for another way to make this happen. I already tried a startup class, but then the server remains in the STARTING state forever, because the program is naturally designed to run forever; I didn't know the server actually waits for the startup class to finish. The next best thing I know of would be the usual servlet: call its URL once and start the program from it. Even then, how would you prevent the browser from hanging on the servlet call (because it does run forever) without making the program logic asynchronous by creating a thread? I also read something about listeners; would that be what I should be looking for?
One last thing: I definitely need to run it on WebLogic, so suggestions for other solutions wouldn't help me.
This is a confusing question because it's so basic... You just need to create a web service with your endless loops in it. You don't need to hit a URL to start it. Just deploy a .war or .ear file with your code and you're done.
http://docs.oracle.com/cd/E13222_01/wls/docs81/webserv/example.html

How to execute a Java program 24 x 7 on linux

I have developed two small Java applications - a vanilla Java app and a Java Web application (i.e. Spring MVC, Servlets, JSP, etc.).
The vanilla application consists of several threads which read data continuously at varying rates (from once a second to twice a minute) from several websites, process the data and write it to a database.
The Web Application reads the data from the database and presents it using JSPs, etc.
I'd now like to deploy the applications to a Linux machine and have them run 24 x 7.
If the applications crash I would like them to be restarted.
What's the best way of doing this?
Your web container will run 24x7 by default. If your deployed application throws an exception, it's captured by the container. I wouldn't normally expect this process to not run. Perhaps if threads run away, then it may become unresponsive, so it's worth monitoring (perhaps by a separate process querying it via HTTP?).
Does your vanilla application need to run at regular intervals ? If so, then cron is a good bet. It'll invoke a new instance every 'n' minutes (or however you configure it). If your instance suffers a problem, then it'll simply bail out and a new instance will be launched at the next configured interval. Again, you should probably monitor this (capture log files?) in case some problem determines that it'll never succeed completely.
With Ubuntu's upstart you can respawn processes automatically. A slightly lower-level option is to put the respawn directly in /etc/inittab. Both work well; upstart is more manageable (more tools) but requires a newer system (Ubuntu, Fedora, and Debian is switching soon).
For inittab you need to add a line like this to /etc/inittab (from the inittab manpage):
12:2345:respawn:/path/to/myapp flags
For upstart you do something similar (this is a standard script in ubuntu 9.10):
user@host:/etc/init$ cat tty1.conf
# tty1 - getty
#
# This service maintains a getty on tty1 from the point the system is
# started until it is shut down again.
start on stopped rc RUNLEVEL=[2345]
stop on runlevel [!2345]
respawn
exec /sbin/getty -8 38400 tty1
Check out the ServletContextListener, this allows you to embed your java application inside your web application (by creating a background thread). Then you can have it all running inside the web container.
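The background thread such a listener would manage might look like the sketch below. It is plain Java without the servlet API so that it stands alone; the assumption is that the listener's contextInitialized method calls start() and its contextDestroyed method calls stop(). The class name BackgroundWorker is made up, and some containers discourage application-created threads, so treat this as an illustration rather than a recommendation.

```java
// Hypothetical worker that a ServletContextListener could start and stop.
class BackgroundWorker {

    private final Thread thread;
    private volatile boolean running = true;

    // Runs task repeatedly, pausing intervalMillis between runs.
    BackgroundWorker(Runnable task, long intervalMillis) {
        thread = new Thread(() -> {
            while (running) {
                try {
                    task.run();
                } catch (RuntimeException e) {
                    // Keep the loop alive after a failed run.
                    System.err.println("background task failed: " + e);
                }
                try {
                    Thread.sleep(intervalMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return; // stop() was called
                }
            }
        }, "background-worker");
        thread.setDaemon(true); // do not keep the JVM alive on shutdown
    }

    void start() {
        thread.start();
    }

    // Asks the loop to end and waits briefly for the thread to finish.
    void stop() {
        running = false;
        thread.interrupt();
        try {
            thread.join(5000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    boolean isRunning() {
        return thread.isAlive();
    }
}
```

Stopping the thread when the context is destroyed matters: otherwise a redeploy can leave the old loop running next to the new one.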
Consider investigating and using a web container supported by the operating system vendor, so that all the scripts to bring it up and down (including in case of problems) are written and maintained by somebody other than you.
E.g. Ubuntu has a Tomcat as a package
I have a crontab job running every 15 minutes to see if the script is still running. If not, it restarts the service. The script itself is a piece of Perl code:
#!/usr/bin/perl
use diagnostics;
use strict;
my %is_set;
for (@ARGV) {
$is_set{$_} = 1;
}
my $verbose = -1;
if ($is_set{"--verbose"}) {
$verbose = 1;
}
my @components = ("cdk", "qsar", "rdf");
foreach my $comp (@components) {
print "Checking component $comp\n" if ($verbose == 1);
my $bla = `ps aux | grep component | grep $comp-xws | grep -v "ps aux" | wc -l`;
$bla =~ s/\n|\r//g;
if ($bla eq "1") {
print " ... running\n" if ($verbose == 1);
} else {
print " ... restarting component $comp\n" if ($verbose == 1);
system "cd /home/egonw/runtime/$comp; sh runCDKcomponent.sh &";
}
}
First, when a problem occurs, it is in general a good idea to have a human look at it to find the root cause, as restarting a service without any other action will in many cases not magically solve the issue. The common way to handle this situation is to use a monitoring solution offering some kind of alerting (by email, SMS, etc.) to let a human know that something is wrong and needs human action. For example, have a look at HypericHQ, OpenNMS, Zenoss, Nagios, etc.
Second, if you want to offer some kind of highly available service, running multiple instances of the service (this is often referred to as clustering) would be a good idea. When doing so, if one instance goes down, the service won't be totally interrupted, obviously. Note that when using a cluster, if one node goes down because of too heavy load, it's very unlikely that the remaining part of the cluster will be able to handle the load so clustering isn't an absolute guarantee in all situations. Implementing this (at least for the web application) depends on the application server or servlet engine you are using.
But actually, if you are looking for something simple and pretty straightforward, I'd warmly suggest checking out monit, which is really a better alternative to a custom cron job (don't reinvent the wheel; monit does precisely what you want in a smart way). See this article for an introduction:
monit is a utility for managing and monitoring processes, files, directories and devices on a Unix system. Monit conducts automatic maintenance and repair and can execute meaningful causal actions in error situations. For example, monit can start a process if it does not run, restart a process if it does not respond and stop a process if it uses to much resources. You may use monit to monitor files, directories and devices for changes, such as timestamps changes, checksum changes or size changes.
Java Service Wrapper may help with keeping the Java program up 24x7 (or very close).
Several years ago I worked on a project using Java 1.2 and our goal was to run 24x7. We never made it. The longest we managed to keep Java running was about 2-3 weeks. Twice it crashed after about 15 days. The first time we just restarted it; the second time a colleague did some research and found that the crash was due to an int variable overflowing in the Calendar class: the JdbcDriver had called new Date(year, month, day, hour, minute, second) more than about 300 million times and each call had incremented the int 6 times. I think this particular bug may be fixed, but you may find there are others that you encounter as you try to keep the JVM running for a long time.
So you may need to design your application to be restarted occasionally to avoid this kind of thing.
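The arithmetic behind that overflow can be checked quickly. This sketch assumes a signed 32-bit int counter that starts at zero and is bumped six times per call, as described above; the class name CounterOverflow is made up for illustration.

```java
// Hypothetical illustration of how an int counter overflows.
class CounterOverflow {

    // Number of calls that fit before an int counter, incremented
    // perCall times per call, exceeds Integer.MAX_VALUE.
    static long callsBeforeOverflow(int perCall) {
        return Integer.MAX_VALUE / perCall;
    }

    public static void main(String[] args) {
        // With 6 increments per call, the counter survives this many calls.
        System.out.println(callsBeforeOverflow(6)); // prints 357913941

        // Plain int arithmetic wraps around silently.
        int counter = Integer.MAX_VALUE;
        counter++;
        System.out.println(counter == Integer.MIN_VALUE); // prints true

        // Math.addExact turns the silent wrap into an exception.
        try {
            Math.addExact(Integer.MAX_VALUE, 1);
        } catch (ArithmeticException e) {
            System.out.println("overflow detected");
        }
    }
}
```

So a counter like this wraps after roughly 358 million calls, which is consistent with the "more than about 300 million" figure above.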
