How should I test my algorithm in terms of speed? The enhanced algorithm I made and the original algorithm search to the same depth and both give the same move; they differ only in speed.
Do you know how I should test my new algorithm, aside from just subtracting the system time it started from the system time it ended? What I'm trying to say is that I need to run somewhat formal tests, with a bit of methodology behind them. Should I simulate all possible moves and tally the time each algorithm (enhanced and original) took to decide on a move? I'm quite clueless here.
I've used the below method a few times and have had success. If you are interested in multi-threaded benchmarking refer to the link at the bottom of the page.
Timing a single-threaded task using CPU, system, and user time
"User time" is the time spent running your application's own code.
"System time" is the time spent running OS code on behalf of your
application (such as for I/O).
Java 1.5 introduced the java.lang.management package to monitor the JVM. The entry point for the package is the ManagementFactory class. Its static methods return a variety of different "MXBean" objects that report JVM information. One such bean can report thread CPU and user time.
Call ManagementFactory.getThreadMXBean() to get a ThreadMXBean that describes current JVM threads. The bean's getCurrentThreadCpuTime() method returns the CPU time for the current thread, and getCurrentThreadUserTime() returns the thread's user time. Both report times in nanoseconds (but see the linked article's appendix on times and the (lack of) nanosecond accuracy).
Be sure to call isCurrentThreadCpuTimeSupported() first, though. If it returns false (rare), the JVM implementation or OS does not support getting CPU or user times. In that case, you're back to using wall clock time.
import java.lang.management.*;

/** Get CPU time in nanoseconds. */
public long getCpuTime() {
    ThreadMXBean bean = ManagementFactory.getThreadMXBean();
    return bean.isCurrentThreadCpuTimeSupported() ?
        bean.getCurrentThreadCpuTime() : 0L;
}

/** Get user time in nanoseconds. */
public long getUserTime() {
    ThreadMXBean bean = ManagementFactory.getThreadMXBean();
    return bean.isCurrentThreadCpuTimeSupported() ?
        bean.getCurrentThreadUserTime() : 0L;
}

/** Get system time in nanoseconds (CPU time minus user time). */
public long getSystemTime() {
    ThreadMXBean bean = ManagementFactory.getThreadMXBean();
    return bean.isCurrentThreadCpuTimeSupported() ?
        (bean.getCurrentThreadCpuTime() - bean.getCurrentThreadUserTime()) : 0L;
}
These methods return the CPU, user, and system time since the thread started. To time a task after the thread has started, call one or more of these before and after the task and take the difference:
long startSystemTimeNano = getSystemTime();
long startUserTimeNano = getUserTime();
... do task ...
long taskUserTimeNano = getUserTime() - startUserTimeNano;
long taskSystemTimeNano = getSystemTime() - startSystemTimeNano;
Taken from http://nadeausoftware.com/articles/2008/03/java_tip_how_get_cpu_and_user_time_benchmarking#TimingasinglethreadedtaskusingCPUsystemandusertime
Here is a sample program to capture timings; you can change it to suit your needs:
package com.quicklyjava;

public class Main {
    /**
     * @param args
     * @throws InterruptedException
     */
    public static void main(String[] args) throws InterruptedException {
        // start time
        long time = System.nanoTime();
        for (int i = 0; i < 5; i++) {
            System.out.println("Sleeping Zzzz... " + i);
            Thread.sleep(1000);
        }
        long difference = System.nanoTime() - time;
        System.out.println("It took " + difference + " nanoseconds to finish");
    }
}
And here is the output:
Sleeping Zzzz... 0
Sleeping Zzzz... 1
Sleeping Zzzz... 2
Sleeping Zzzz... 3
Sleeping Zzzz... 4
It took 5007507169 nanoseconds to finish
Currently working on a university assessment, so I won't share specifics and I'm not asking for any explanation that will help me solve the main problem. I've already solved the problem, but my solution might be considered a little messy.
Basically, we're working with concurrency and semaphores. There is some shared resource that up to X (where X > 1) number of threads can access at a time and an algorithm which makes it a little more complicated than just acquiring and releasing access. Threads come at a certain time, use the resource for a certain time and then leave. We are to assume that no time is wasted when arriving, accessing, releasing and leaving the resource. The goal is to demonstrate that the algorithm we have written works by outputting the times a thread arrives, accesses the resource and leaves for each thread.
I'm using a semaphore with X number of permits to govern access. And it's all working fine, but I think the way I arrive at the expected output might be a bit janky. Here's something like what I have currently:
@Override
public void run() {
    long alive = System.currentTimeMillis();
    try { Thread.sleep(arrivalTime * 1000); }
    catch (InterruptedException e) {} // no interrupts implemented
    long actualArriveTime = System.currentTimeMillis() - alive;
    boolean accessed = false;
    while (!accessed) accessed = tryAcquire();
    long actualAccessTime = System.currentTimeMillis() - alive;
    try { Thread.sleep(useTime * 1000); }
    catch (InterruptedException e) {} // no interrupts implemented
    release();
    long actualDepartTime = System.currentTimeMillis() - alive;
    System.out.println(actualArriveTime);
    System.out.println(actualAccessTime);
    System.out.println(actualDepartTime);
}
I do it this way because, whereas the expected output might be:
Thread Arrival Access Departure
A 0 0 3
B 0 0 5
C 2 2 6
... ... ... ...
My output looks something like:
Thread Arrival Access Departure
A 0 0 3006
B 0 0 5008
C 2 2 6012
... ... ... ...
I'm essentially making the time period much larger so that if the computer takes a few milliseconds to acquire(), for example, it doesn't affect the number much. Then I can round to the nearest second to get the expected output. My algorithm works, but there are issues with this. A: it's slow; B: with enough threads, the milliseconds of delay may add up so that I round to the wrong number.
I need something more like this:
public static void main(String[] args) {
    int clock = 0;
    while (threadsWaiting) {
        clock++;
    }
}

@Override
public void run() {
    Thread.waitUntil(clock == arrivalTime); // pseudocode; no such method exists
    boolean accessed = false;
    while (!accessed) accessed = tryAcquire();
    int accessTime = clock;
    int departureTime = accessTime + useTime;
    Thread.waitUntil(clock == departureTime);
    release();
    System.out.println(arrivalTime);
    System.out.println(accessTime);
    System.out.println(departureTime);
}
Hopefully that's clear. Any help is appreciated.
Thanks!
I'm running a simulation of thin films at very low temperatures. I tried using ExecutorService to properly multithread my code and hopefully reduce runtime, but I found out the hard way that this is easier said than done. I fiddled with many parameters but wasn't able to gain any efficiency. What I found weird was that running the code with a single-, double- or triple-thread ExecutorService takes almost the same time.
The bottom line is that although more iterations of the for loop run in parallel at the same time, the time taken by a single iteration increases. The overall runtime comes out almost the same no matter how many processors are utilized.
I'm confused. What am I doing wrong?
public abstract class IsingRun {
    public static void main(String[] args) {
        long starta = System.currentTimeMillis(); // time for single thread
        for (double t = temp[1]; t <= temp[0]; t += temp[2]) {
            long start = System.currentTimeMillis();
            // do stuff
            start = System.currentTimeMillis() - start;
            print(t, output, start); // start is the time taken for a single iteration
        }
        starta = System.currentTimeMillis() - starta;
        System.out.println("Total Runtime (Single): " + starta); // single thread total time
        /* end of single threaded process */

        long startb = System.currentTimeMillis();
        ExecutorService e = Executors.newFixedThreadPool(2);
        for (double t = temp[1]; t <= temp[0]; t += temp[2]) {
            simulate s = new simulate(t);
            e.execute(s);
        }
        e.shutdown();
        startb = System.currentTimeMillis() - startb;
        System.out.println("Total Runtime (Double): " + startb);
        /* end of double threaded process */

        long startc = System.currentTimeMillis();
        e = Executors.newFixedThreadPool(3);
        for (double t = temp[1]; t <= temp[0]; t += temp[2]) {
            simulate s = new simulate(t);
            e.execute(s);
        }
        e.shutdown();
        startc = System.currentTimeMillis() - startc;
        System.out.println("Total Runtime (Triple): " + startc);
        /* end of triple threaded process */
    }
}

class simulate implements Runnable {
    simulate(double T) { this.t = T; }
    public void run() {
        long start = System.currentTimeMillis();
        // do stuff
        start = System.currentTimeMillis() - start;
        print(t, output, start); // start is the time taken for a single iteration
    }
}
I got the following results
Temp - Output - Runtime for single iteration
2.10 - 0.85410 - 632288
2.20 - 0.83974 - 646527
2.30 - 0.81956 - 655128
2.40 - 0.80318 - 645012
2.50 - 0.79169 - 649863
2.60 - 0.77140 - 662429
Total Runtime (Single): 3891257
2.10 - 0.85585 - 1291943
2.20 - 0.83733 - 1299240
2.40 - 0.80284 - 1313495
2.30 - 0.82294 - 1334043
2.50 - 0.79098 - 1315072
2.60 - 0.77341 - 1338203
Total Runtime (Double): 3964290
2.10 - 0.85001 - 1954315
2.20 - 0.84137 - 1971372
2.30 - 0.82196 - 1996214
2.40 - 0.80684 - 1966009
2.50 - 0.78995 - 1970542
2.60 - 0.77437 - 1966503
Total Runtime (Triple): 3962763
What am I doing wrong? Task Manager shows all processors being used, but are they really?
A couple of things:
1) If you take a look at the ExecutorService Javadoc, the shutdown() method does not wait for previously submitted tasks to complete execution. In your code you are not waiting for the tasks to actually complete before stopping the clock. The Javadoc also contains sample code which demonstrates how to properly wait for an ExecutorService to finish.
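As a sketch (the class and task bodies here are made up for illustration, not your simulation code), the usual pattern is shutdown() followed by awaitTermination(), so the elapsed time covers the tasks themselves:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class AwaitDemo {
    public static void main(String[] args) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        long start = System.currentTimeMillis();
        for (int i = 0; i < 4; i++) {
            final int id = i;
            pool.execute(() -> {
                // stand-in for real work
                try { Thread.sleep(200); }
                catch (InterruptedException e) { Thread.currentThread().interrupt(); }
                System.out.println("task " + id + " done");
            });
        }
        pool.shutdown();                            // stop accepting new tasks
        pool.awaitTermination(1, TimeUnit.MINUTES); // block until submitted tasks finish
        // now the measured time includes the tasks, not just their submission
        System.out.println("elapsed ms: " + (System.currentTimeMillis() - start));
    }
}
```

Without the awaitTermination() call, the timing line would run almost immediately after the submit loop.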
2) Benchmarking Java programs is not really straightforward, so I may or may not trust the numbers here. Also take a look at this SO post regarding measuring execution time in Java. I would also test your different execution methods separately, not all in one run; running them back to back in the same JVM means the earlier passes warm up the JIT for the later ones.
3) If your task is really CPU intensive, I would create an ExecutorService with the number of threads matching the number of cores/processors in your machine. If your tasks are short-lived, then the additional overhead of threading/context switching may not be worth it and single-threaded might be the way to go.
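Sizing the pool to the machine can be sketched like this (note that availableProcessors() reports logical processors, which may include hyperthreads):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class PoolSizing {
    public static void main(String[] args) {
        // one worker per logical processor is a common default for CPU-bound tasks
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println("logical processors: " + cores);
        ExecutorService pool = Executors.newFixedThreadPool(cores);
        // ... submit CPU-bound tasks here ...
        pool.shutdown();
    }
}
```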
The short code below isolates the problem. Basically I'm timing the method addToStorage. I start by executing it one million times, and I'm able to get its time down to around 723 nanoseconds. Then I pause briefly (using a busy-spinning method so as not to release the CPU core) and time the method again N times, at a different code location. To my surprise I find that the smaller N is, the higher the addToStorage latency.
For example:
If N = 1 then I get 3.6 micros
If N = 2 then I get 3.1 and 2.5 micros
if N = 5 then I get 3.7, 1.8, 1.7, 1.5 and 1.5 micros
Does anyone know why this is happening and how to fix it? I would like my method to consistently perform at the fastest time possible, no matter where I call it.
Note: I would not think it is thread related since I'm not using Thread.sleep. I've also tested using taskset to pin my thread to a cpu core with the same results.
import java.util.ArrayList;
import java.util.List;
public class JvmOdd {
    private final StringBuilder sBuilder = new StringBuilder(1024);
    private final List<String> storage = new ArrayList<String>(1024 * 1024);

    public void addToStorage() {
        sBuilder.setLength(0);
        sBuilder.append("Blah1: ").append(System.nanoTime()).append('\n');
        sBuilder.append("Blah2: ").append(System.nanoTime()).append('\n');
        sBuilder.append("Blah3: ").append(System.nanoTime()).append('\n');
        sBuilder.append("Blah4: ").append(System.nanoTime()).append('\n');
        sBuilder.append("Blah5: ").append(System.nanoTime()).append('\n');
        sBuilder.append("Blah6: ").append(System.nanoTime()).append('\n');
        sBuilder.append("Blah7: ").append(System.nanoTime()).append('\n');
        sBuilder.append("Blah8: ").append(System.nanoTime()).append('\n');
        sBuilder.append("Blah9: ").append(System.nanoTime()).append('\n');
        sBuilder.append("Blah10: ").append(System.nanoTime()).append('\n');
        storage.add(sBuilder.toString());
    }

    public static long mySleep(long t) {
        long x = 0;
        for (int i = 0; i < t * 10000; i++) {
            x += System.currentTimeMillis() / System.nanoTime();
        }
        return x;
    }

    public static void main(String[] args) throws Exception {
        int warmup = Integer.parseInt(args[0]);
        int mod = Integer.parseInt(args[1]);
        int passes = Integer.parseInt(args[2]);
        int sleep = Integer.parseInt(args[3]);
        JvmOdd jo = new JvmOdd();
        // first warm up
        for (int i = 0; i < warmup; i++) {
            long time = System.nanoTime();
            jo.addToStorage();
            time = System.nanoTime() - time;
            if (i % mod == 0) System.out.println(time);
        }
        // now see how fast the method is:
        while (true) {
            System.out.println();
            // Thread.sleep(sleep);
            mySleep(sleep);
            long minTime = Long.MAX_VALUE;
            for (int i = 0; i < passes; i++) {
                long time = System.nanoTime();
                jo.addToStorage();
                time = System.nanoTime() - time;
                if (i > 0) System.out.print(',');
                System.out.print(time);
                minTime = Math.min(time, minTime);
            }
            System.out.println("\nMinTime: " + minTime);
        }
    }
}
Executing:
$ java -server -cp . JvmOdd 1000000 100000 1 5000
59103
820
727
772
734
767
730
726
840
736
3404
MinTime: 3404
There is so much going on in here that I don't know where to start. But let's start here...
long time = System.nanoTime();
jo.addToStorage();
time = System.nanoTime() - time;
The latency of addToStorage() cannot be measured using this technique. It simply runs too quickly, meaning you're likely below the resolution of the clock. Without running this, my guess is that your measurements are dominated by clock edge counts. You'll need to bulk up the unit of work to get a measurement with lower levels of noise in it.
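For instance, a sketch of bulking up the unit of work (the work() method here is a made-up stand-in, not the poster's addToStorage): time one large batch and divide, so each measurement spans many clock ticks.

```java
public class BatchTiming {
    // Hypothetical stand-in for the method being measured.
    static long work(long x) { return x * 31 + 17; }

    /** Times `batch` calls as one block and returns the average cost per call. */
    static double averageNanosPerCall(int batch) {
        long sink = 0; // keep a live result so the loop body isn't trivially removable
        long start = System.nanoTime();
        for (int i = 0; i < batch; i++) {
            sink = work(sink);
        }
        long elapsed = System.nanoTime() - start;
        if (sink == 42) System.out.println(sink); // consume sink to defeat dead-code elimination
        return elapsed / (double) batch;
    }

    public static void main(String[] args) {
        System.out.println("avg ns/call ~ " + averageNanosPerCall(100_000));
    }
}
```

This only reduces clock-resolution noise; it does not solve the JIT warm-up and inlining issues discussed below, which is why JMH is still the better tool.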
As for what is happening: there are a number of call-site optimizations, the most important being inlining. Inlining totally eliminates the call site, but it's a path-specific optimization. If you call the method from a different place, that call follows the slow path of performing a virtual method lookup followed by a jump to that code. So to see the benefits of inlining from a different path, that path would also have to be "warmed up".
I would strongly recommend that you look at JMH (an OpenJDK project). It has facilities such as Blackhole which help keep the JIT from optimizing away the very work you are trying to measure. You might also want to evaluate the quality of the benchmark with the help of tools like JITWatch (an Adopt OpenJDK project), which takes the logs produced by the JIT and helps you interpret them.
There is so much to this subject, but the bottom line is that you can't write a simplistic benchmark like this and expect it to tell you anything useful. You will need to use JMH.
I suggest watching this: https://www.infoq.com/presentations/jmh about microbenchmarking and JMH
There's also a chapter on microbenchmarking & JMH in my book: http://shop.oreilly.com/product/0636920042983.do
Java internally uses a JIT (just-in-time) compiler. Based on the number of times the same method executes, it optimizes the generated instructions so the method performs better. With a low call count, the method runs with little or no optimization, which shows up as a longer execution time. When the same method is called many more times, the JIT kicks in and it executes in less time because of the optimized code for that method.
The following Java method is meant to print the number i by nLoopsPerSecond times per second for seconds seconds:
public void test(int nLoopsPerSecond, int seconds) {
    double secondsPerLoop = 1.0 / (double) nLoopsPerSecond;
    long startTime = System.currentTimeMillis();
    long currentTime;
    int i = 0;
    while ((currentTime = System.currentTimeMillis()) < startTime + seconds * 1000) {
        System.out.println(i++);
        while (System.currentTimeMillis() < currentTime + secondsPerLoop * 1000);
    }
}
With the following call:
test(1000,1);
I expect this method to do the System.out.println(i++); 1,000 times, but I only got 63.
When I try to see how long each loop actually takes with this code:
public void test(int nLoopsPerSecond, int seconds) {
    double secondsPerLoop = 1.0 / (double) nLoopsPerSecond;
    long startTime = System.currentTimeMillis();
    long currentTime;
    int i = 0;
    while ((currentTime = System.currentTimeMillis()) < startTime + seconds * 1000) {
        while (System.currentTimeMillis() < currentTime + secondsPerLoop * 1000);
        System.out.println(System.currentTimeMillis() - currentTime);
    }
}
I expect it to print 1 milliseconds each loop, but it prints 15 or 16 milliseconds.
Please suggest what is wrong with my code.
Are you running on Windows, perhaps? System.currentTimeMillis() consults the underlying operating system clock, which on many versions of Windows is only updated every 15-16 ms (roughly 64 Hz).
Try System.nanoTime() since you are not measuring time since the epoch. System.currentTimeMillis vs System.nanoTime
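As a sketch (not the poster's exact code), the loop rewritten around System.nanoTime() with an absolute schedule looks something like this; note it still busy-waits, so it pins a core, and the println is replaced by a counter here since console output itself can dominate at 1000 lines per second:

```java
public class NanoLoop {
    /** Runs roughly nLoopsPerSecond iterations per second for `seconds` seconds. */
    public static int test(int nLoopsPerSecond, int seconds) {
        long nanosPerLoop = 1_000_000_000L / nLoopsPerSecond;
        long end = System.nanoTime() + seconds * 1_000_000_000L;
        long next = System.nanoTime();
        int count = 0;
        while (System.nanoTime() < end) {
            count++;                           // this is where the println would go
            next += nanosPerLoop;              // absolute schedule: drift does not accumulate
            while (System.nanoTime() < next) ; // busy-wait until the next slot
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(test(1000, 1)); // close to 1000 on an idle machine
    }
}
```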
That's probably because the processing takes up some time. The processor does not solely dedicate its time to the execution of your program, it performs several other functions in the background. So you might get different results based on the processor load.
Your output console is not fast enough. You don't mention how you run your test or where the output goes. The speed of the terminal and the buffers (not) used limit how fast the program can output data. If running unbuffered, your program always has to wait until each new line is printed on screen. If the console waits for a screen redraw and the screen is redrawn at 60 Hz, you've got your 16 ms/line and about 60 lines per second.
Running your code without the inner loop inside IntelliJ IDEA, I get about 140,000 lines per second (and IDEA warns me that it is not displaying every line, as my output is too fast).
With the inner loop, I get about 800-900 lines. That happens because the process may be scheduled off the CPU, or blocked by something else, like swapping. (If I simplify a lot, desktop OSes usually schedule at 1 ms granularity.)
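A small probe (my own, not from the question) makes the timer/scheduler granularity visible by timing how long Thread.sleep(1) actually takes:

```java
public class SleepGranularity {
    public static void main(String[] args) throws InterruptedException {
        for (int i = 0; i < 5; i++) {
            long start = System.nanoTime();
            Thread.sleep(1); // request a 1 ms sleep
            long actualMicros = (System.nanoTime() - start) / 1_000;
            // on Linux this is typically a bit over 1000 us; on Windows it can be much larger
            System.out.println("asked for 1000 us, slept ~" + actualMicros + " us");
        }
    }
}
```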
I have some exercises, and one of them concerns concurrency. This topic is new to me; however, I spent 6 hours and finally solved my problem. But my knowledge of the corresponding API is poor, so I need advice: is my solution correct, or is there a more appropriate way?
So, I have to implement the following interface:
public interface PerformanceTester {
    /**
     * Runs a performance test of the given task.
     * @param task which task to run performance tests on
     * @param executionCount how many times the task should be executed in total
     * @param threadPoolSize how many threads to use
     */
    public PerformanceTestResult runPerformanceTest(
            Runnable task,
            int executionCount,
            int threadPoolSize) throws InterruptedException;
}
where PerformanceTestResult contains total time (how long the whole performance test took in total), minimum time (how long the shortest single execution took) and maximum time (how long the longest single execution took).
So, I learned many new things today: thread pools, the Executors, ExecutorService, Future and CompletionService types, etc.
If I had a Callable task, I could do the following:
1. Return the current time at the end of the call() procedure.
2. Create some data structure (a Map, maybe) to store the start time and the Future object returned by fixedThreadPool.submit(task) (doing this executionCount times, in a loop).
3. After execution, just subtract the start time from the end time for every Future.
(Is this the right way in the case of a Callable task?)
But! I have only a Runnable task, so I kept looking. I even created a FutureListener implements Callable<Long> that returns the time once Future.isDone(), but that seems a little crazy to me (I'd have to double the thread count).
So, eventually I noticed the CompletionService type, with its interesting take() method that "Retrieves and removes the Future representing the next completed task, waiting if none are yet present", and a very nice example of using ExecutorCompletionService. And here is my solution.
public class PerformanceTesterImpl implements PerformanceTester {
    @Override
    public PerformanceTestResult runPerformanceTest(Runnable task,
            int executionCount, int threadPoolSize) throws InterruptedException {
        long totalTime = 0;
        long[] times = new long[executionCount];
        ExecutorService pool = Executors.newFixedThreadPool(threadPoolSize);

        // create a list of executionCount tasks
        ArrayList<Runnable> solvers = new ArrayList<Runnable>();
        for (int i = 0; i < executionCount; i++) {
            solvers.add(task);
        }

        CompletionService<Long> ecs = new ExecutorCompletionService<Long>(pool);

        // submit tasks and save the time execution started
        for (Runnable s : solvers)
            ecs.submit(s, System.currentTimeMillis());

        // take Futures one by one in order of completion
        for (int i = 0; i < executionCount; ++i) {
            long r = 0;
            try {
                // this is the saved start time
                r = ecs.take().get();
            } catch (ExecutionException e) {
                e.printStackTrace();
                return null;
            }
            // store the difference between the current time and the start time
            times[i] = System.currentTimeMillis() - r;
            // accumulate the total
            totalTime += times[i];
        }

        pool.shutdown();
        // sort the array to find min and max
        Arrays.sort(times);
        PerformanceTestResult performanceTestResult = new PerformanceTestResult(
                totalTime, times[0], times[executionCount - 1]);
        return performanceTestResult;
    }
}
So, what can you say? Thanks for replies.
I would use System.nanoTime() for higher-resolution timings. You might want to ignore the first 10,000 runs to ensure the JVM has warmed up.
I wouldn't bother creating a List of Runnable and then adding it to the Executor; I would just submit the tasks to the executor directly.
Using Runnable is not a problem as you get a Future<?> back.
Note: Timing how long the task spends in the queue can make a big difference to the timing. Instead of taking the time from when the task was created you can have the task time itself and return a Long for the time in nano-seconds. How the timing is done should reflect the use case you have in mind.
A simple way to convert a Runnable task into one which times itself:
final Runnable run = ...
ecs.submit(new Callable<Long>() {
    public Long call() {
        long start = System.nanoTime();
        run.run();
        return System.nanoTime() - start;
    }
});
There are many intricacies when writing performance tests in the JVM. You probably aren't worried about them as this is an exercise, but if you are this question might have more information:
How do I write a correct micro-benchmark in Java?
That said, there don't seem to be any glaring bugs in your code. You might want to ask this on the lower traffic code-review site if you want a full review of your code:
http://codereview.stackexchange.com