JVM only using half the cores on a server - java

I have a number of Java processes using OpenJDK 11 running on Windows Server 2019. The server has two physical processors and 36 total cores; it is an HP machine. When I start my processes, I see work allocation in Task Manager across all the cores. This is good. However, after the processes have run for some period of time (not a consistent amount of time), the machine begins to utilize only half the cores.
I am working off a few theories:
The JDK has some problem that is preventing it from consistently accessing all the cores.
Something with Windows Server 2019 is causing a problem, limiting Java from accessing all the cores.
There is a thermal management problem and one processor is getting too hot and the OS is directing all the processing to the other processor.
There is some issue with hyper-threading and the 'logical' processors that is causing the process to not be able to utilize all the cores.
I've tried searching for JDK issues and haven't found anything like this mentioned. I went down to the server and while it's running a little warm, it didn't appear excessively hot. I have not yet tried disabling hyper-threading. I have tried a number of parameters to force the JVM to use all the cores and indeed the process initially does use all the cores; I can see the activity in Task Manager.
Anyone have any thoughts? This is a really baffling problem and I'd appreciate any ideas.
UPDATE: I am able to make it use the other processor by using Task Manager to assign one of the java.exe processes to the other processor. This also works from the java invocation on the command line, with an argument for which socket to use.
Now that said, this feels like a hack. I don't see why I should have to manually assign a socket to each of my java processes; that job should be left to the OS. I'm still not sure exactly where the problem is, if it's the OS or what.
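One way to narrow this down is to log what each JVM itself reports, once at startup and again when the slowdown is observed. The following is only a diagnostic sketch, not a fix. The class name CpuReport is made up for illustration, and it assumes that availableProcessors() reflects the processors the process is currently allowed to run on, which is how HotSpot is generally expected to behave.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

/** Reports how many logical processors this JVM believes it can use. */
final class CpuReport {

    /** Builds a one-line summary of the processor count and load seen by this JVM. */
    static String describe() {
        int visible = Runtime.getRuntime().availableProcessors();
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        // getSystemLoadAverage() returns a negative value where it is unsupported (e.g. Windows).
        double load = os.getSystemLoadAverage();
        return "availableProcessors=" + visible
                + " osArch=" + os.getArch()
                + " loadAverage=" + load;
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

If the reported count stays the same while Task Manager shows only half the cores busy, that would point more toward OS scheduling than toward the JDK.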

Related

Java server application slow after period of idleness (Windows)

I'm having trouble with a Jetty 9 server application that seems to go into some kind of resting state after a longer period of idleness. Normally the memory usage of the Java process is ~500 MB, but after being idle for some time it seems to drop down to less than 50 MB. The first request that comes in takes up to several seconds to respond, whereas requests normally take on the order of tens of milliseconds. But after one or two requests the application seems to be back to its normal responsive state.
I'm running on the 32-bit Oracle Java 8 JVM. My JVM configuration is very basic:
java -server -jar start.jar
I was hoping that this issue might be solvable through JVM configuration. Does anyone know if there's any particular parameter to disable this type of behavior?
edit: Based on the comment from Ivan, I was able to identify the source of the issue. Turns out Windows was swapping parts of the Java process out to disk. See my own answer below for a description of my solution.
Based on the comment from Ivan, I was able to identify the source of the issue. Turns out Windows was swapping parts of the Java process out to disk. This was clearly visible when comparing the private working set to the commit size in the task manager.
My solution to this was two-fold. First, I made a simple scheduled job inside my server app that runs every minute and does a simple test run to make sure that the important services never go inactive for long periods. I'm hoping this should ensure that Windows doesn't regard the related pages as inactive.
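A scheduled job of this kind could look roughly like the sketch below. It is an illustration under assumptions, not the code from the server app: the class name KeepWarm and the warm-up task are hypothetical, and a real task would exercise whatever services need to stay resident.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

/** Periodically runs a small warm-up task so the pages it touches stay in use. */
final class KeepWarm {
    /** Counts completed warm-up runs; handy for logging or monitoring. */
    static final AtomicLong RUNS = new AtomicLong();

    static ScheduledExecutorService start(Runnable warmUp, long period, TimeUnit unit) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "keep-warm");
            t.setDaemon(true); // do not keep the JVM alive just for this job
            return t;
        });
        scheduler.scheduleAtFixedRate(() -> {
            try {
                warmUp.run();
                RUNS.incrementAndGet();
            } catch (RuntimeException e) {
                // An uncaught exception would cancel all future runs, so report it and carry on.
                System.err.println("warm-up failed: " + e);
            }
        }, 0, period, unit);
        return scheduler;
    }

    public static void main(String[] args) throws InterruptedException {
        // Hypothetical warm-up task; a real one would call into the important services.
        ScheduledExecutorService s = start(() -> System.out.println("warm-up run"), 1, TimeUnit.MINUTES);
        Thread.sleep(100); // let the first run happen before the demo exits
        s.shutdownNow();
    }
}
```

Catching exceptions inside the task matters here because scheduleAtFixedRate suppresses all later executions once one run throws.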
Afterwards, I also noticed that the process was executing with "Below normal" priority. So I changed the script that starts the server to ensure that it's running with "High" priority going forward. This seems likely to affect swapping behavior and may very well also have been enough to resolve the issue on its own, but I only found it after already deploying my first solution, so that remains unclear. In any case, everything seems to be working as it should now.

Java application Performance issue

Issue at hand:
I have a java application which is taking twice as long to run on the DEV and QA servers as on my local machine. When running the job on Dev and QA I’m getting times around 1:45 – 2:30 (hh:mm), compared to about 0:45 – 1:10 on my local machine. I’m trying to determine what could be causing slow performance of a java application on servers.
What I have looked into so far, none of which provided an answer:
Testing with same maxheap size
Observing stress on cpu. Dev is idle about 75% of the time when running the batch application, so I don’t think this is an issue.
Observing RAM on Dev. Dev has more than enough memory to provide the JVM with the specified max heap (128 MB). If I understand correctly, the available memory of the machine doesn’t matter as long as the machine can provide the max heap size, correct?
Ensuring the version of java isn’t causing the issue.
Set logging level same: “INFO”
Processor: the servers have a 2.67 GHz processor; my local machine only has 2.19 GHz.
Additional Information.
Server OS: Linux
Local Computer OS: Windows
Single threaded Java application.
Application is reading and writing to text files and also has calls to a database (hibernate c3p0). These are my most/only expensive operations.
I’ve scoured dozens of sites to determine a root cause, but I haven’t been able to nail down what is causing the issue. Any help will be much appreciated.
Application is reading and writing to text files and also has calls to a database(hibernate c3p0). These are my most/only expensive operations
Most likely your server has slower access to the database it is using. For example, if you were to run the database on the same machine, it could be a lot faster than across a network. I would look at the time it takes to perform some simple hibernate operations locally and on your server. If performance is a concern, I suggest looking at removing hibernate or even your database; your program could then run 10x - 100x faster.
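To compare the two environments, a small timing helper can be wrapped around any single operation and run on both machines. This is a minimal sketch: the class name OpTimer is made up, and the sleeping task is only a stand-in for a real hibernate call such as a simple query.

```java
import java.util.concurrent.TimeUnit;

/** Measures the mean wall-clock time of a repeated operation. */
final class OpTimer {

    /** Runs the operation `repeats` times and returns the mean time per run in milliseconds. */
    static double meanMillis(Runnable operation, int repeats) {
        if (repeats <= 0) {
            throw new IllegalArgumentException("repeats must be positive");
        }
        long start = System.nanoTime();
        for (int i = 0; i < repeats; i++) {
            operation.run();
        }
        long elapsedNanos = System.nanoTime() - start;
        return elapsedNanos / 1_000_000.0 / repeats;
    }

    public static void main(String[] args) {
        // Placeholder workload; replace it with a simple load or save through the real session.
        Runnable fakeDbCall = () -> {
            try {
                TimeUnit.MILLISECONDS.sleep(5);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        System.out.printf("mean per call: %.2f ms%n", meanMillis(fakeDbCall, 20));
    }
}
```

If the mean per call is several times higher on the server than locally, the network path to the database is the first thing to look at.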
I’m trying to determine what could be causing slow performance of a java application on servers
The server I was testing on happened to be in the DMZ (outside the network), while the database and my local computer (when working in the office) are inside the network. This was the case I failed to evaluate.

How to handle thousands of threads in Java without using the new java.util.concurrent package

I have a situation in which I need to create thousands of instances of a class from a third-party API. Each new instance creates a new thread. I start getting OutOfMemoryError once there are more than 1000 threads. But my application requires creating 30,000 instances. Each instance is active all the time. The application is deployed on a 64-bit Linux box with 8 GB of RAM, of which only 2 GB is available to my application.
The way the third party library works, I cannot use the new Executor framework or thread pooling.
So how can I solve this problem?
Note that using thread pool is not an option. All threads are running all the time to capture events.
The memory size on the Linux box is not in my control, but if I had the choice to have 25 GB available to my application on a 32 GB system, would that solve my problem, or would the JVM still choke?
Are there some optimal Java settings for the above scenario ?
The system uses Oracle Java 1.6 64 bit.
I concur with Ryan's Answer. But the problem is worse than his analysis suggests.
Hotspot JVMs have a hard-wired minimum stack size - 128k for Java 6 and 160k for Java 7.
That means that even if you set the stack size to the smallest possible value, you'd need to use roughly twice your allocated space ... just for thread stacks.
In addition, having 30k native threads is liable to cause problems on some operating systems.
I put it to you that your task is impossible. You need to find an alternative design that does not require you to have 30k threads simultaneously. Alternatively, you need a much larger machine to run the application.
Reference: http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2012-June/003867.html
I'd say give up now and figure another way to do it. Default stack size is 512K. At 30k threads, that's 15G in stack space alone. To fit into 2G, you'll need to cut it down to less than 64K stacks, and that leaves you with zero memory for the heap, including all the Thread objects, or the JVM itself.
And that's just the most obvious problem you're likely to run into when running that many simultaneous threads in one JVM.
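The arithmetic behind these numbers can be checked with a few lines. This is just a worked calculation using the figures quoted in the answers (30,000 threads, 512K and 128K stacks, a 2G budget); the class name StackBudget is made up, and heap and JVM overhead are deliberately ignored.

```java
/** Back-of-the-envelope arithmetic for thread stack memory. */
final class StackBudget {
    static final long KIB = 1024L;
    static final long GIB = 1024L * 1024L * 1024L;

    /** Total bytes reserved for stacks when every thread gets `stackKiB` KiB. */
    static long totalStackBytes(int threads, long stackKiB) {
        return threads * stackKiB * KIB;
    }

    /** Largest per-thread stack in KiB that fits the budget, ignoring heap and JVM overhead. */
    static long maxStackKiB(int threads, long budgetBytes) {
        return budgetBytes / threads / KIB;
    }

    public static void main(String[] args) {
        int threads = 30_000;
        System.out.printf("512 KiB stacks: %.2f GiB%n", (double) totalStackBytes(threads, 512) / GIB);
        System.out.printf("128 KiB stacks: %.2f GiB%n", (double) totalStackBytes(threads, 128) / GIB);
        System.out.println("largest stack fitting 2 GiB: " + maxStackKiB(threads, 2 * GIB) + " KiB");
    }
}
```

With 512K stacks the total comes to about 14.6 GiB, with 128K stacks about 3.7 GiB, and a 2 GiB budget leaves roughly 69 KiB per thread before a single byte of heap is counted.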
I think we are missing lots of details, but would a distributed platform work? Each individual node would manage a range of your class instances. These nodes could run on different PCs or virtual machines and communicate with each other.
I had the same problem with an SNMP provider that required a thread for each outstanding get (I wanted to have tens of thousands of outstanding gets going on at once). Now that NIO exists I'd just rewrite the library myself if I had to do this again.
You cannot solve it in "Java Code" or configuration. Windows chokes at around 2,000-3,000 threads in my experience (this may have changed in later versions). When I was doing this I was surprised to find that Linux supported even fewer threads (around 1000).
When the system stops supplying threads, "Out of Memory" is the exception you should expect to see--so I'm sure that's it--I started getting this exception long before I ran out of memory. Perhaps you could hack linux somehow to support more, but I have no idea how.
Using the concurrent package will not help here. If you could switch over to "Green" threads it might, but that might take recompiling the JVM (it would be nice if it was available as a command line switch, but I really don't think it is).

Powershell process from java - Process may "delay" (Not hang) on some environments

We are creating a bunch of PoSH scripts and running them in an orderly manner from Java (building a process, handling all input/output streams that may cause hangups, and then invoking it using local admin privileges).
We are facing an interesting occurrence in some of our deployments.
When tracking the log file of this operation we sometimes see a considerable delay. The delay is consistent and clocks in at 35 seconds: if there's a log entry just prior to the process building and the next logging is done from the invoked PoSH script, the delay between them is consistently 35 seconds.
It is consistent for all scripts invoked on that machine.
This behavior does not occur everywhere. We have several (unrelated) machines that exhibit it, but also many others that show 2-5 seconds, which we accept as a normal time for building the process.
Our Java process is a 32bit process, and most of the delaying machines are 64bit VMs of Win2K8 server R2. Most of these VMs are using domain authentication and policies (Different ones of different customers).
We tried running the Powershell via various different methods (Such as PSTools) but with no real findings - It always starts in a few seconds. Comparing machine installations did not bring us any insights either. Performance does not seem to be an issue either though I'll admit we haven't analyzed it too deeply, just looked closely at the task manager.
It is important to mention that the process never hangs - It will run and run swiftly when started. The delay is happening during the startup of the PoSH process.
Any ideas, advice or speculative directions will be more than welcome.
Thanks,
Yaron
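For anyone trying to reproduce the measurement, the launch described above can be sketched roughly as follows. This is not the original code: the class name TimedLaunch is made up, and the PowerShell command line in main is only an example. It merges stderr into stdout so one reader drains both, closes stdin, and records how long it takes until the first line of output arrives.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Launches a process, drains its output, and times the startup. */
final class TimedLaunch {

    static final class Result {
        final long firstOutputMillis; // -1 if the process printed nothing
        final long totalMillis;
        final int exitCode;
        final List<String> lines;

        Result(long firstOutputMillis, long totalMillis, int exitCode, List<String> lines) {
            this.firstOutputMillis = firstOutputMillis;
            this.totalMillis = totalMillis;
            this.exitCode = exitCode;
            this.lines = lines;
        }
    }

    static Result run(List<String> command) throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(command);
        pb.redirectErrorStream(true); // merge stderr into stdout so a single reader drains both
        long start = System.nanoTime();
        Process p = pb.start();
        p.getOutputStream().close(); // nothing is sent to stdin, so the child never waits for it
        long first = -1;
        List<String> lines = new ArrayList<>();
        try (BufferedReader r = new BufferedReader(new InputStreamReader(p.getInputStream()))) {
            String line;
            while ((line = r.readLine()) != null) {
                if (first < 0) {
                    first = (System.nanoTime() - start) / 1_000_000;
                }
                lines.add(line);
            }
        }
        int exit = p.waitFor();
        long total = (System.nanoTime() - start) / 1_000_000;
        return new Result(first, total, exit, lines);
    }

    public static void main(String[] args) throws Exception {
        // Example command only; pass the real script invocation as program arguments instead.
        List<String> command = args.length > 0
                ? Arrays.asList(args)
                : Arrays.asList("powershell.exe", "-NoProfile", "-Command", "Write-Output started");
        Result r = run(command);
        System.out.println("first output after " + r.firstOutputMillis + " ms, total "
                + r.totalMillis + " ms, exit code " + r.exitCode);
    }
}
```

Running the same trivial command on a fast machine and on a delaying one separates the process startup time from anything the scripts themselves do.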

unresponsive java application for no reason

I have a java application that I run from eclipse 3.5.
My OS is WinXP(SP2) and the JRE version is 6.05.
I run the application on two identical computers (or so I think) but the application behaves differently on each computer.
The computers are the same Dell Optiplex model with the same amount of memory and have the same GPU.
On the first computer, the application runs flawlessly. However, on the second one the application freezes for a couple of minutes and then returns to run normally.
The strange thing is that the CPU usage on the second computer is not high at all. It seems as though my application does not receive any CPU for no apparent reason.
Computers should be deterministic so I assume there must be some difference between the machines but I don't know where to look.
I would love some ideas on where the problem might be.
Thanks,
Yoav.
I've found the problem.
The application that was unresponsive was run in debug mode.
Sorry to have wasted your time...
It may help you to get a Thread Dump when the app freezes. This will hopefully tell you exactly what is holding you up (i.e. waiting for IO somewhere).
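A thread dump can be taken from outside with the JDK's jstack tool, or with Ctrl+Break in a Windows console. It can also be produced from inside the application, for example from a watchdog thread. The sketch below shows the in-process variant; the class name ThreadDumper is made up for illustration.

```java
import java.util.Map;

/** Produces a plain-text dump of all live threads and their stack traces. */
final class ThreadDumper {

    static String dump() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> entry : Thread.getAllStackTraces().entrySet()) {
            Thread t = entry.getKey();
            sb.append('"').append(t.getName()).append("\" state=").append(t.getState()).append('\n');
            for (StackTraceElement frame : entry.getValue()) {
                sb.append("    at ").append(frame).append('\n');
            }
            sb.append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(dump());
    }
}
```

Threads that sit in the same frame across several dumps taken a few seconds apart are the ones worth looking at.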
Well, I would first update your JRE version as there are newer versions now.
As for both computers being identical, are they really identical? I find it difficult to believe that both have the same exact software and setup and that anything you have done to one, you have always done to the other. If this is indeed the case, you may want to try to debug your application on the second machine (the one that hangs) and find out specifically where it hangs.
It may also help us if you give more information about your application. The problem may not be your computer at all if the application is doing things like web access, network access, etc.
So both computers have nearly identical hardware. A few other things to check
Do they both have Eclipse 3.5, WinXP(SP2) and JRE 6.05 installed?
Do they behave differently when run from within Eclipse on both machines, or is one of them run from the command line?
Is this reproducible? If yes, when does it happen? On startup? Or on some specific action?
Does the program have a GUI?
Is there maybe some kind of virus scanner or other comparable software installed on one of the machines which could delay the program?
Is networking, file access, or multithreading involved?
I can think of two non-application possibilities:
Memory Paging. There's something extra happening on the slow machine, so your JVM is not getting a fair share of CPU time. A large daemon process or some such.
Network access. Your app is making some kind of network call and it's glitching or timing out. Perhaps it is fetching some XML schema, or perhaps it is a disk access to a mounted drive.
I've seen all manner of weirdness when apps attempt to access hosts by name and DNS is not well. One machine has an etc/hosts entry and the other does not. Each machine might even want to resolve itself.
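A quick way to test the name resolution theory is to time the lookups on both machines. This is a rough sketch; the class name DnsCheck is made up, and the host names to check would be whatever the application actually connects to.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

/** Times how long it takes to resolve a host name. */
final class DnsCheck {

    static final class Result {
        final boolean resolved;
        final long millis;

        Result(boolean resolved, long millis) {
            this.resolved = resolved;
            this.millis = millis;
        }
    }

    /** Resolves `host` and reports whether it succeeded and how long the attempt took. */
    static Result resolve(String host) {
        long start = System.nanoTime();
        boolean ok;
        try {
            InetAddress.getByName(host);
            ok = true;
        } catch (UnknownHostException e) {
            ok = false; // the elapsed time of a failed lookup is still interesting
        }
        return new Result(ok, (System.nanoTime() - start) / 1_000_000);
    }

    public static void main(String[] args) {
        // Hosts to check come from the command line; "localhost" is only a default.
        String[] hosts = args.length > 0 ? args : new String[] {"localhost"};
        for (String host : hosts) {
            Result r = resolve(host);
            System.out.println(host + ": resolved=" + r.resolved + " in " + r.millis + " ms");
        }
    }
}
```

A lookup that takes seconds on one machine and milliseconds on the other would fit the freezes described in the question.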
