How to calculate cpu usage for process java (JavaSysMon) [duplicate] - java

Is there a way to monitor CPU usage using pure Java?

There is a gem in the comments on the article which kgiannakakis linked:
javasysmon
JavaSysMon manages processes and reports useful system performance metrics cross-platform. You can think of it as a cross-platform version of the UNIX 'top' command, along with the ability to kill processes. It comes in the form of a single JAR file ...
- works on Windows, Mac OS X, Linux, and Solaris.

How about using JMX MBeans?
final OperatingSystemMXBean myOsBean=
ManagementFactory.getOperatingSystemMXBean();
double load = myOsBean.getSystemLoadAverage();
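A slightly fuller sketch of the MXBean approach, for reference. getSystemLoadAverage() reports the system load average over the last minute and returns a negative value where it is unavailable (for example on Windows). The second method assumes the com.sun.management.OperatingSystemMXBean extension and its getProcessCpuLoad() method are present, which is the case on HotSpot/OpenJDK but is not part of the standard java.lang.management API. The class name MxBeanCpu is only illustrative.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

public class MxBeanCpu {
    /** System load average over the last minute, or a negative value if unavailable. */
    static double systemLoadAverage() {
        return ManagementFactory.getOperatingSystemMXBean().getSystemLoadAverage();
    }

    /** CPU load of this JVM process in [0.0, 1.0], or a negative value if unavailable. */
    static double processCpuLoad() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        // Assumption: the JVM exposes the com.sun.management extension (HotSpot/OpenJDK do).
        if (os instanceof com.sun.management.OperatingSystemMXBean) {
            return ((com.sun.management.OperatingSystemMXBean) os).getProcessCpuLoad();
        }
        return -1.0; // extension not available on this JVM
    }

    public static void main(String[] args) {
        System.out.println("System load average: " + systemLoadAverage());
        System.out.println("Process CPU load:    " + processCpuLoad());
    }
}
```

The first call to getProcessCpuLoad() may report a negative value or 0 because the JVM has no previous sample to compare against, so it is worth polling it periodically rather than reading it once.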

You can use JMX beans to calculate the CPU load. Note that this measures the CPU load of your Java program (more precisely, of the thread that runs the loop), not the overall system load (the question didn't specify which).
Initialize:
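// Requires java.lang.management.ManagementFactory and java.lang.management.ThreadMXBean.
ThreadMXBean newBean = ManagementFactory.getThreadMXBean();
try
{
    if (newBean.isThreadCpuTimeSupported())
        newBean.setThreadCpuTimeEnabled(true);
    else
        throw new UnsupportedOperationException("Thread CPU time is not supported");
}
catch (UnsupportedOperationException | SecurityException e)
{
    // Either the JVM cannot measure thread CPU time or we are not allowed to enable it.
    System.out.println("CPU Usage monitoring is not available!");
    System.exit(0);
}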
Then as your loop (assuming your application uses a loop, otherwise what's the point in measuring CPU usage?) use this:
long lastTime = System.nanoTime();
long lastThreadTime = newBean.getCurrentThreadCpuTime();
while (true)
{
// Do something that takes at least 10ms (on windows)
try
{
int j = 0;
for (int i = 0; i < 20000000; i++)
j = (j + i) * j / 2;
Thread.sleep(100);
}
catch (InterruptedException e)
{
}
// Calculate coarse CPU usage:
long time = System.nanoTime();
long threadTime = newBean.getCurrentThreadCpuTime();
double load = (threadTime - lastThreadTime) / (double)(time - lastTime);
System.out.println((float)load);
// For next iteration.
lastTime = time;
lastThreadTime = threadTime;
}
You need to use double precision because a long doesn't fit in a float (though it might work 99.9999999999999999% of the time)
If the 'something' you're doing takes less than approximately 1.6ms (Windows), then the returned value will not even have increased at all and you'll perpetually measure 0% CPU erroneously.
Because getCurrentThreadCpuTime is VERY inaccurate (with delays less than 100ms), smoothing it helps a lot:
long lastTime = System.nanoTime();
long lastThreadTime = newBean.getCurrentThreadCpuTime();
float smoothLoad = 0;
while (true)
{
// Do something that takes at least 10ms (on windows)
try
{
int j = 0;
for (int i = 0; i < 2000000; i++)
j = (j + i) * j / 2;
Thread.sleep(10);
}
catch (InterruptedException e)
{
}
// Calculate coarse CPU usage:
long time = System.nanoTime();
long threadTime = newBean.getCurrentThreadCpuTime();
double load = (threadTime - lastThreadTime) / (double)(time - lastTime);
// Smooth it.
smoothLoad += (load - smoothLoad) * 0.1; // damping factor, lower means less responsive, 1 means no smoothing.
System.out.println(smoothLoad);
// For next iteration.
lastTime = time;
lastThreadTime = threadTime;
}
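The two calculations in the loops above (the coarse load and the damped update) can be pulled out into small helpers, sketched below. The class and method names are made up for illustration; only ThreadMXBean, getCurrentThreadCpuTime() and System.nanoTime() come from the answer. The damping factor 0.1 is the one used above.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

public class SmoothedThreadLoad {
    /** Fraction of wall-clock time the thread spent on the CPU during the interval. */
    static double load(long cpuDeltaNanos, long wallDeltaNanos) {
        return wallDeltaNanos <= 0 ? 0.0 : cpuDeltaNanos / (double) wallDeltaNanos;
    }

    /** Exponential smoothing: alpha = 1 means no smoothing, lower is less responsive. */
    static double smooth(double previous, double sample, double alpha) {
        return previous + (sample - previous) * alpha;
    }

    public static void main(String[] args) throws InterruptedException {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        if (!bean.isCurrentThreadCpuTimeSupported()) {
            System.out.println("CPU usage monitoring is not available!");
            return;
        }
        long lastWall = System.nanoTime();
        long lastCpu = bean.getCurrentThreadCpuTime();
        double smoothed = 0.0;
        long sink = 0; // accumulates results so the busy loop is not optimized away
        for (int n = 0; n < 20; n++) {
            for (int i = 0; i < 2_000_000; i++) {
                sink += i ^ sink;
            }
            Thread.sleep(10);
            long wall = System.nanoTime();
            long cpu = bean.getCurrentThreadCpuTime();
            smoothed = smooth(smoothed, load(cpu - lastCpu, wall - lastWall), 0.1);
            System.out.println(smoothed);
            lastWall = wall;
            lastCpu = cpu;
        }
        System.out.println("sink=" + sink);
    }
}
```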

This is not possible using pure Java. See this article for some ideas.

Maybe if stuck, you might 'sense' cpu availability by running an intermittent bogomips calculator in a background thread, and smoothing and normalising its findings.
...worth a shot no :?

If you are using Linux, just use jconsole; it gives you a full view of Java memory management.

Related

Java Benchmarking addition of arrays with both CPU and GPU and compare performance

I am trying to compare a simple addition task with both CPU and GPU, but the results that I get are so weird.
First of all, let me explain how I managed to run the GPU task.
Now let's dive into the code. It is simply:
package gpu;
import com.aparapi.Kernel;
import com.aparapi.Range;
public class Try {
public static void main(String[] args) {
final int size = 512;
final float[] a = new float[size];
final float[] b = new float[size];
for (int i = 0; i < size; i++) {
a[i] = (float) (Math.random() * 100);
b[i] = (float) (Math.random() * 100);
}
//##############CPU-TASK########################
long start = System.nanoTime();
final float[] sum = new float[size];
for(int i=0;i<size;i++){
sum[i] = a[i] + b[i];
}
long finish = System.nanoTime();
long timeElapsed = finish - start;
//######################################
//##############GPU-TASK########################
final float[] sum2 = new float[size];
Kernel kernel = new Kernel(){
@Override public void run() {
int gid = getGlobalId();
sum2[gid] = a[gid] + b[gid];
}
};
long start1 = System.nanoTime();
kernel.execute(Range.create(size));
long finish2 = System.nanoTime();
long timeElapsed2 = finish2 - start1;
//##############GPU-TASK########################
System.out.println("cpu"+timeElapsed);
System.out.println("gpu"+timeElapsed2);
kernel.dispose();
}
}
My specs are:
Aparapi is running on an untested OpenCL platform version: OpenCL 3.0 CUDA 11.6.13
Intel Core i7 6850K @ 3.60GHz Broadwell-E/EP 14nm Technology
2047MB NVIDIA GeForce GTX 1060 6GB (ASUStek Computer Inc)
The results that I get are this:
cpu12000
gpu5732829900
My question is why the GPU is so slow. Why does the CPU outperform the GPU? I expected the GPU to be faster than the CPU. Are my measurements wrong, and is there any way to improve this?
This code measures the host-side execution time of the GPU task. That means the measured time includes the time of the task execution on the GPU, the time of copying the data for the task to the GPU, the time of reading the data back from the GPU, and the overhead introduced by Aparapi. And, according to the documentation for the Kernel class, Aparapi uses lazy initialization:
On the first call to Kernel.execute(int _globalSize), Aparapi will determine the EXECUTION_MODE of the kernel.
This decision is made dynamically based on two factors:
Whether OpenCL is available (appropriate drivers are installed and the OpenCL and Aparapi dynamic libraries are included on the system path).
Whether the bytecode of the run() method (and every method that can be called directly or indirectly from the run() method)
can be converted into OpenCL.
Therefore, the host-side execution time of the GPU task cannot be compared with the execution time of the CPU task, because it includes additional work that is performed only once.
In this case, it is necessary to use getProfileInfo() call to get the execution time breakdown for the kernel:
kernel.execute(Range.create(size));
List<ProfileInfo> profileInfo = kernel.getProfileInfo();
for (final ProfileInfo p : profileInfo) {
System.out.println(p.getType() + " " + p.getLabel() + " " + (p.getEnd() - p.getStart()) + "ns");
}
Also, please note that the following property must be set: -Dcom.aparapi.enableProfiling=true. For more information please see Profiling the Kernel article and the implementation of ProfileInfo class.
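To see why a single host-side measurement is misleading, here is a plain-Java sketch with no Aparapi calls at all. The execute() method is a hypothetical stand-in for a kernel launch, and the one-time setup (mode detection, bytecode conversion, and so on) is simulated with Thread.sleep(50). All names are illustrative.

```java
public class FirstCallCost {
    private static boolean initialized = false;
    private static int initCount = 0;

    /** Stand-in for a kernel launch: pays a simulated one-time setup cost on first use. */
    static void execute(float[] a, float[] b, float[] out) {
        if (!initialized) {
            initCount++;
            try {
                Thread.sleep(50); // simulated lazy initialization, performed only once
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            initialized = true;
        }
        for (int i = 0; i < out.length; i++) {
            out[i] = a[i] + b[i];
        }
    }

    /** Host-side time of one call, including whatever setup that call happens to trigger. */
    static long timeNanos(float[] a, float[] b, float[] out) {
        long start = System.nanoTime();
        execute(a, b, out);
        return System.nanoTime() - start;
    }

    static int initCount() {
        return initCount;
    }

    public static void main(String[] args) {
        float[] a = new float[512];
        float[] b = new float[512];
        float[] out = new float[512];
        System.out.println("first call:  " + timeNanos(a, b, out) + "ns");
        System.out.println("second call: " + timeNanos(a, b, out) + "ns");
    }
}
```

The first call is dominated by the simulated setup, while the second reflects only the addition loop. The same reasoning suggests running the real kernel once as a warm-up before timing it, in addition to reading the profiling breakdown.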

Accuracy of System.nanoTime() to measure time elapsed decreases after a call to Thread.sleep()

I'm encountering a really unusual issue here. It seems that the calling of Thread.sleep(n), where n > 0 would cause the following System.nanoTime() calls to be less predictable.
The code below demonstrates the issue.
Running it on my computer (rMBP 15" 2015, OS X 10.11, jre 1.8.0_40-b26) outputs the following result:
Control: 48497
Random: 36719
Thread.sleep(0): 48044
Thread.sleep(1): 832271
On a virtual machine running Windows 8 (VMware Horizon, Windows 8.1, jre 1.8.0_60-b27):
Control: 98974
Random: 61019
Thread.sleep(0): 115623
Thread.sleep(1): 282451
However, running it on an enterprise server (VMware, RHEL 6.7, jre 1.6.0_45-b06):
Control: 1385670
Random: 1202695
Thread.sleep(0): 1393994
Thread.sleep(1): 1413220
Which is surprisingly the result I expect.
Clearly the Thread.sleep(1) affects the computation of the below code. I have no idea why this happens. Does anyone have a clue?
Thanks!
public class Main {
public static void main(String[] args) {
int N = 1000;
long timeElapsed = 0;
long startTime, endTime = 0;
for (int i = 0; i < N; i++) {
startTime = System.nanoTime();
//search runs here
endTime = System.nanoTime();
timeElapsed += endTime - startTime;
}
System.out.println("Control: " + timeElapsed);
timeElapsed = 0;
for (int i = 0; i < N; i++) {
startTime = System.nanoTime();
//search runs here
endTime = System.nanoTime();
timeElapsed += endTime - startTime;
for (int j = 0; j < N; j++) {
int k = (int) Math.pow(i, j);
}
}
System.out.println("Random: " + timeElapsed);
timeElapsed = 0;
for (int i = 0; i < N; i++) {
startTime = System.nanoTime();
//search runs here
endTime = System.nanoTime();
timeElapsed += endTime - startTime;
try {
Thread.sleep(0);
} catch (InterruptedException e) {
break;
}
}
System.out.println("Thread.sleep(0): " + timeElapsed);
timeElapsed = 0;
for (int i = 0; i < N; i++) {
startTime = System.nanoTime();
//search runs here
endTime = System.nanoTime();
timeElapsed += endTime - startTime;
try {
Thread.sleep(1);
} catch (InterruptedException e) {
break;
}
}
System.out.println("Thread.sleep(1): " + timeElapsed);
}
}
Basically I'm running a search within a while-loop which takes a break every iteration by calling Thread.sleep(). I want to exclude the sleep time from the overall time taken to run the search, so I'm using System.nanoTime() to record the start and finishing times. However, as you notice above, this doesn't work well.
Is there a way to remedy this?
Thanks for any input!
This is a complex topic because the timers used by the JVM are highly CPU- and OS-dependent and also change with JVM versions (e.g. by using newer OS APIs). Virtual machines may also limit the CPU capabilities they pass through to guests, which may alter the choices in comparison to a bare metal setup.
On x86 the RDTSC instruction provides the lowest latency and best granularity of all clocks, but under some configurations it's not available or reliable enough as a time source.
On Linux you should check the kernel startup messages (dmesg), the TSC-related /proc/cpuinfo flags and the selected /sys/devices/system/clocksource/*/current_clocksource. The kernel will try to use TSC by default; if it doesn't, there may be a reason for that.
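The clocksource check can also be done from Java, as in the following sketch. It assumes the usual Linux location clocksource0 for the wildcard in the path above and simply reports nothing on other systems. The class name is illustrative.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;

public class ClockSourceCheck {
    // Assumption: the first (and normally only) clocksource device is clocksource0.
    static final Path CURRENT =
            Paths.get("/sys/devices/system/clocksource/clocksource0/current_clocksource");

    /** Returns the trimmed content of the file, or empty if it cannot be read. */
    static Optional<String> read(Path file) {
        try {
            if (!Files.isReadable(file)) {
                return Optional.empty();
            }
            return Optional.of(new String(Files.readAllBytes(file), StandardCharsets.UTF_8).trim());
        } catch (IOException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println("Current clocksource: " + read(CURRENT).orElse("unknown (not Linux?)"));
    }
}
```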
For some history you may want to read the following, but note that some of those articles may be a bit dated, TSC reliability has improved a lot over the years:
OpenJDK Bug 8068730 exposing more precise system clocks in Java 9 through the Date and Time APIs introduced in java 8
http://shipilev.net/blog/2014/nanotrusting-nanotime/ (mentions the -XX:+AssumeMonotonicOSTimers manual override/footgun)
https://blog.packagecloud.io/eng/2017/03/14/using-strace-to-understand-java-performance-improvement/ (mentions the similar option for linux UseLinuxPosixThreadCPUClocks)
https://btorpey.github.io/blog/2014/02/18/clock-sources-in-linux/
https://stas-blogspot.blogspot.de/2012/02/what-is-behind-systemnanotime.html
https://en.wikipedia.org/wiki/Time_Stamp_Counter (especially CPU capabilities, constant_tsc tsc_reliable nonstop_tsc in linux nomenclature)
http://vanillajava.blogspot.de/2012/04/yield-sleep0-wait01-and-parknanos1.html
I can suggest at least two possible reasons of such behavior:
Power saving. When executing a busy loop, the CPU runs at its maximum performance state. However, after Thread.sleep it is likely to fall into one of the power-saving states, with frequency and voltage reduced. After that, the CPU won't return to its maximum performance immediately; this may take from several nanoseconds to microseconds.
Scheduling. After a thread is descheduled due to Thread.sleep, it will be scheduled for execution again after a timer event which might be related to the timer used for System.nanoTime.
In both cases you can't directly work around this - I mean Thread.sleep will also affect timings in your real application. But if the amount of useful work measured is large enough, the inaccuracy will be negligible.
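As a rough sketch of that last point: accumulate only the intervals around the work itself, and make each interval long enough that the post-sleep slowdown is small in comparison. The inner loop below is a placeholder for the real search, and all names are hypothetical.

```java
public class WorkOnlyTimer {
    /** Returns {workNanos, wallNanos}: time spent in the work sections, and total elapsed time. */
    static long[] run(int iterations, long sleepMillis) {
        long workNanos = 0;
        long sink = 0; // accumulates results so the placeholder work is not optimized away
        long wallStart = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            long start = System.nanoTime();
            for (int k = 0; k < 100_000; k++) { // placeholder for the search
                sink += k ^ i;
            }
            workNanos += System.nanoTime() - start; // sleep time is never counted
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        long wallNanos = System.nanoTime() - wallStart;
        if (sink == 42) {
            System.out.println("unlikely value"); // uses sink so the loop has an observable effect
        }
        return new long[] {workNanos, wallNanos};
    }

    public static void main(String[] args) {
        long[] r = run(100, 1);
        System.out.println("work: " + r[0] + "ns of " + r[1] + "ns wall time");
    }
}
```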
The inconsistencies probably arise not from Java, but from the different OSs and VMs "atomic-" or system- clocks themselves.
According to the official .nanoTime() documentation:
no guarantees are made except that the resolution is at least as good
as that of currentTimeMillis()
source
...I can tell from personal knowledge that this is because in some OSs and VMs, the system itself doesn't support "atomic" clocks, which are necessary for higher resolutions. (I will post the link to source this information as soon as I find it again...It's been a long time.)

Exact timing in Java game loop

I'm developing a network based game, and I'm now focusing on the server side simulation. Of course I need a game loop, and I opted for a fixed timestep loop so that it will be far easier to reproduce on the client(s) than a variable timestep one. I also decided to run my game at 60 Hz. This is the game logic speed, not rendering speed. Rendering will be handled with a variable timestep loop in the clients to have the best possible rendering.
The server is written in Java.
I already made an example game loop using code from http://www.java-gaming.org/index.php?topic=24220.0 and modifying the loop with my code. Here is the loop:
private void gameLoop()
{
final double GAME_HERTZ = 60.0;
final double TIME_BETWEEN_UPDATES = 1000000000 / GAME_HERTZ;
//We will need the last update time.
double lastUpdateTime = System.nanoTime();
//Store the last time we rendered.
double lastRenderTime = System.nanoTime();
int lastSecondTime = (int) (lastUpdateTime / 1000000000);
long extraSleepTime = 0;
while (running)
{
int updateCount = 0;
if (!paused)
{
long loopStartTime = System.nanoTime();
updateGame();
updateCount++;
long timeAfterUpdate = System.nanoTime();
lastUpdateTime = timeAfterUpdate;
//Render. To do so, we need to calculate interpolation for a smooth render.
float interpolation = Math.min(1.0f, (float) ((loopStartTime - lastUpdateTime) / TIME_BETWEEN_UPDATES) );
drawGame(interpolation);
lastRenderTime = loopStartTime;
//Update the frames we got.
int thisSecond = (int) (lastUpdateTime / 1000000000);
if (thisSecond > lastSecondTime)
{
long nanoTime = System.nanoTime();
System.out.println("NEW SECOND " + thisSecond + " " + frameCount + ": " + (nanoTime - lastNanoTime));
lastNanoTime = nanoTime;
fps = frameCount;
frameCount = 0;
lastSecondTime = thisSecond;
}
long loopExecutionTime = timeAfterUpdate - loopStartTime;
long sleepTime = (long)TIME_BETWEEN_UPDATES - loopExecutionTime - extraSleepTime;
// Only sleep for positive intervals
if(sleepTime >= 0)
{
try
{
Thread.sleep(sleepTime / 1000000);
}
catch(InterruptedException e) {}
}
else
{
System.out.println("WARN: sleepTime < 0");
}
// Counts the extra time that elapsed
extraSleepTime = System.nanoTime() - timeAfterUpdate - sleepTime;
}
}
The problem is that, when running, the FPS aren't stable at 60Hz, but sometimes go lower. For example I sometimes get 58-59Hz, going as low as 57Hz.
This variability wouldn't be a problem if the game was run locally, but as our game is networked, I need to keep the exact time so that I can reproduce the logic calculations on both client and server.
Is there any error in this code, or anything that could be improved to make it more stable? Our goal is 60Hz being kept exactly all the time.
EDIT: A first solution that came to mind is running the loop a bit faster than needed, for example at 70Hz, and checking the frame count to limit the updates to 60 per second. This way the simulation would run in bursts and would need buffering (up to 60 frames at a time), but it should never be slower than needed.
Thanks in advance.
If you want to achieve 60 frames per second, you'd be better off using a scheduled executor as Thread.sleep() may not be as precise as you'd like it to be. Consider the following sample for your server code: (Please note it contains Java 8 code)
public void gameLoop() {
// game logic here
}
Executors.newSingleThreadScheduledExecutor()
.scheduleAtFixedRate(this::gameLoop, 0, 16, TimeUnit.MILLISECONDS);
It will run your gameLoop() every 16 milliseconds, which is close to what you want. This should give you much more precise results. Note that a 16 ms period corresponds to 62.5 Hz rather than exactly 60 Hz; for a period closer to 1/60 s you can replace 16 and TimeUnit.MILLISECONDS with their nanosecond counterparts (16666667 and TimeUnit.NANOSECONDS).
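A self-contained sketch of the scheduled-executor approach with a nanosecond period follows. The tick body is just a countdown standing in for the game logic, and the class and method names are illustrative, not part of the answer.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class FixedRateLoop {
    static final long PERIOD_NANOS = 1_000_000_000L / 60; // about 16.67 ms per tick

    /** Runs the given number of ticks at a fixed rate and returns the elapsed time in ns. */
    static long runTicks(int ticks, long periodNanos) {
        ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();
        CountDownLatch done = new CountDownLatch(ticks);
        long start = System.nanoTime();
        // The countdown stands in for the game logic of one update.
        executor.scheduleAtFixedRate(done::countDown, 0, periodNanos, TimeUnit.NANOSECONDS);
        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            executor.shutdownNow();
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        long elapsed = runTicks(60, PERIOD_NANOS);
        System.out.println("60 ticks took " + elapsed / 1_000_000 + " ms");
    }
}
```

scheduleAtFixedRate schedules each run relative to the start time rather than to the end of the previous run, so a late tick does not push all later ticks back. If one run takes longer than the period, later runs start late but are not executed concurrently.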

Would this lower frame rate?

I am making a game, and the requirement is that it runs at 30 FPS or more and never drops below that. Would what I have below achieve this? Or am I off somewhere? Any help would be much appreciated.
private long period = 6 * 1000000;
private static final int DELAYS_BEFORE_YIELD = 5;
long before, after, difference, sleep, oversleep = 0;
int delays = 0;
while (running)
{
before = System.nanoTime();
after = System.nanoTime();
difference = after - before;
if (sleep < period && sleep > 0)
{
try
{
Thread.sleep(sleep / 35000L);
oversleep = 0;
}
catch (InterruptedException e)
{
}
}
else if (difference > period)
{
oversleep = difference - period;
}
else if (++delays >= DELAYS_BEFORE_YIELD)
{
Thread.yield();
oversleep = 0;
delays = 0;
}
else
{
oversleep = 0;
}
}
You can set an upper bound to frame rate but not a lower bound that is guaranteed to be always followed.
You can make a function be called no more than 30 times per second, but you can't be sure it will be called at least 30 times per second. At 30 fps you have about 0.033s per frame, which is shared between your threads, and usually the drawing one is the heaviest of them all (unless you have complex operations like AI or whatever, but that should be solved by lowering their rate or precomputing what can be precomputed).
If time of draw + time of logic > 0.033s, then there is no way to make your game run at 30fps or more.
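The budget arithmetic can be written down as a small sketch. The names are hypothetical; the point is only that the sleep time is whatever is left of the frame budget, and that nothing is left when logic plus drawing exceed it.

```java
public class FrameBudget {
    /** Time available per frame at the given target rate, in nanoseconds. */
    static long budgetNanos(int targetFps) {
        return 1_000_000_000L / targetFps;
    }

    /** How long the loop may sleep after doing its work; never negative. */
    static long remainingNanos(long budgetNanos, long workNanos) {
        return Math.max(0L, budgetNanos - workNanos);
    }

    /** True if logic and drawing together fit into one frame at the target rate. */
    static boolean meetsTarget(long logicNanos, long drawNanos, int targetFps) {
        return logicNanos + drawNanos <= budgetNanos(targetFps);
    }

    public static void main(String[] args) {
        long budget = budgetNanos(30);
        System.out.println("budget per frame: " + budget + "ns");
        System.out.println("sleep after 10ms of work: " + remainingNanos(budget, 10_000_000L) + "ns");
        System.out.println("20ms logic + 20ms draw fits: " + meetsTarget(20_000_000L, 20_000_000L, 30));
    }
}
```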
It's good that you are asking early, because I think you would be better off using a timer. That turns things inside out. (It could be done your way, though.)
Sorry for this non-answer, but I think it is worthwhile advice.
Look into some game/animation frameworks for their approach.

Monitor cpu usage per thread in java?

I would like to ask whether there is some simple way to determine cpu usage per thread in java. Thanks
I believe the JConsole (archived link) does provide this kind of information through a plugin
It uses the ThreadMXBean getThreadCpuTime() method.
Something along the line of:
long upTime = runtimeProxy.getUptime();
List<Long> threadCpuTime = new ArrayList<Long>();
for (int i = 0; i < threadIds.size(); i++) {
long threadId = threadIds.get(i);
if (threadId != -1) {
threadCpuTime.add(threadProxy.getThreadCpuTime(threadId));
} else {
threadCpuTime.add(0L);
}
}
int nCPUs = osProxy.getAvailableProcessors();
List<Float> cpuUsageList = new ArrayList<Float>();
if (prevUpTime > 0L && upTime > prevUpTime) {
// elapsedTime is in ms
long elapsedTime = upTime - prevUpTime;
for (int i = 0; i < threadIds.size(); i++) {
// elapsedCpu is in ns
long elapsedCpu = threadCpuTime.get(i) - prevThreadCpuTime.get(i);
// cpuUsage could go higher than 100% because elapsedTime
// and elapsedCpu are not fetched simultaneously. Limit to
// 99% to avoid Chart showing a scale from 0% to 200%.
// ns / (ms * 10000) yields a percentage: 1e6 ns per ms, divided by 100.
float cpuUsage = Math.min(99F, elapsedCpu / (elapsedTime * 10000F * nCPUs));
cpuUsageList.add(cpuUsage);
}
}
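The excerpt above depends on proxies and previous-sample fields that are not shown. A self-contained sketch of the same idea, sampling the threads of the current JVM twice and printing a percentage per thread, could look like this. The class and method names are illustrative. The divisor 10000 converts nanoseconds over milliseconds into a percentage (1e6 ns per ms, divided by 100).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.HashMap;
import java.util.Map;

public class ThreadCpuSampler {
    /** CPU usage in percent, capped at 99 as in the excerpt above. */
    static float cpuPercent(long elapsedCpuNanos, long elapsedWallMillis, int nCpus) {
        if (elapsedWallMillis <= 0 || nCpus <= 0) {
            return 0F;
        }
        return Math.min(99F, elapsedCpuNanos / (elapsedWallMillis * 10000F * nCpus));
    }

    /** Maps thread id to CPU time in ns; -1 for threads that died or when disabled. */
    static Map<Long, Long> snapshot(ThreadMXBean bean) {
        Map<Long, Long> cpu = new HashMap<>();
        for (long id : bean.getAllThreadIds()) {
            cpu.put(id, bean.getThreadCpuTime(id));
        }
        return cpu;
    }

    public static void main(String[] args) throws InterruptedException {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        if (!bean.isThreadCpuTimeSupported()) {
            System.out.println("Thread CPU time is not supported on this JVM");
            return;
        }
        int nCpus = Runtime.getRuntime().availableProcessors();
        Map<Long, Long> before = snapshot(bean);
        long t0 = System.nanoTime();
        Thread.sleep(1000);
        Map<Long, Long> after = snapshot(bean);
        long elapsedMillis = (System.nanoTime() - t0) / 1_000_000;
        for (Map.Entry<Long, Long> e : after.entrySet()) {
            Long prev = before.get(e.getKey());
            if (prev == null || prev < 0 || e.getValue() < 0) {
                continue; // thread started or ended during the interval
            }
            System.out.println("thread " + e.getKey() + ": "
                    + cpuPercent(e.getValue() - prev, elapsedMillis, nCpus) + "%");
        }
    }
}
```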
You can do this by using java.lang.management.ThreadMXBean. To obtain a ThreadMXBean:
ThreadMXBean tmxb = ManagementFactory.getThreadMXBean();
then you can query how much a specific thread is consuming by using:
long cpuTime = tmxb.getThreadCpuTime(aThreadID);
Hope it helps.
Option_1: Code level
In your business logic code, call the start() API at the beginning and stop() in the finally block. That gives you the CPU time the current thread spent executing your logic, which you can then log. Reference.
class CPUTimer
{
    private long _startTime = 0L;

    public void start ()
    {
        _startTime = getCpuTimeInMillis();
    }

    public long stop ()
    {
        long result = (getCpuTimeInMillis() - _startTime);
        _startTime = 0L;
        return result;
    }

    public boolean isRunning ()
    {
        return _startTime != 0L;
    }

    /** thread CPU time in milliseconds. */
    private long getCpuTimeInMillis ()
    {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        return bean.isCurrentThreadCpuTimeSupported() ? bean.getCurrentThreadCpuTime()/1000000: 0L;
    }
}
Option_2: Monitor level using plugins (IBM AIX box which doesn't have jvisualvm support)
If adding code now would cause a delay, you may prefer JConsole with plugin support. I followed this article. Download the topthreads jar from that article and run ./jconsole -pluginpath topthreads-1.1.jar
Option_3: Monitor level using TOP (Shift+H) + JSTACK (Unix machine which has 'Shift+H' support)
Follow this tutorial, where the top command gives the option to find the top CPU thread (nid). Then look up that nid in the jstack output file.
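One practical detail for Option_3: top shows thread IDs in decimal, while jstack prints them as hexadecimal nid=0x... values, so the ID has to be converted before searching the dump. A tiny helper for that, with illustrative names:

```java
public class NidConverter {
    /** Converts a decimal thread ID from top into the nid format used by jstack. */
    static String toNid(long threadId) {
        return "0x" + Long.toHexString(threadId);
    }

    public static void main(String[] args) {
        // Usage: java NidConverter <thread id from top>
        long tid = args.length > 0 ? Long.parseLong(args[0]) : 12345L;
        System.out.println("search the jstack output for nid=" + toNid(tid));
    }
}
```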
Try the "TopThreads" JConsole plugin. See http://lsd.luminis.nl/top-threads-plugin-for-jconsole/
Though this is platform dependent, I believe what you're looking for is the ThreadMXBean: http://java.sun.com/j2se/1.5.0/docs/api/java/lang/management/ThreadMXBean.html . You can use the getThreadUserTime method, for example, to get what you need. To check if your platform supports CPU measurement, you can call isThreadCpuTimeSupported() .
Indeed the object ThreadMXBean provides the functionality you need (however it might not be implemented on all virtual machines).
In JDK 1.5 there was a demo program doing exactly what you need. It was in the folder demo/management and it was called JTop.java
Unfortunately, it's not there in Java 6. Maybe you can find it with Google or download JDK 5.
