According to its documentation, System.nanoTime returns
nanoseconds since some fixed but arbitrary origin time. However, on all x64 machines I tried the code below, there were time jumps, moving that fixed origin time around. There may be some flaw in my method to acquire the correct time using an alternative method (here, currentTimeMillis). However, the main purpose of measuring relative times (durations) is negatively affected, too.
I came across this problem trying to measure latencies when comparing different queues to LMAX's Disruptor where I got very negative latencies sometimes. In those cases, start and end timestamps were created by different threads, but the latency was computed after those threads had finished.
My code here takes time using nanoTime, computes the fixed origin in currentTimeMillis time, and compares that origin between calls. And since I must ask a question here: What is wrong with this code? Why does it observe violations of the fixed origin contract? Or does it not?
import java.text.*;
/**
* test coherency between {#link System#currentTimeMillis()} and {#link System#nanoTime()}
*/
public class TimeCoherencyTest {
static final int MAX_THREADS = Math.max( 1, Runtime.getRuntime().availableProcessors() - 1);
static final long RUNTIME_NS = 1000000000L * 100;
static final long BIG_OFFSET_MS = 2;
static long startNanos;
static long firstNanoOrigin;
static {
initNanos();
}
private static void initNanos() {
long millisBefore = System.currentTimeMillis();
long millisAfter;
do {
startNanos = System.nanoTime();
millisAfter = System.currentTimeMillis();
} while ( millisAfter != millisBefore);
firstNanoOrigin = ( long) ( millisAfter - ( startNanos / 1e6));
}
static NumberFormat lnf = DecimalFormat.getNumberInstance();
static {
lnf.setMaximumFractionDigits( 3);
lnf.setGroupingUsed( true);
};
static class TimeCoherency {
long firstOrigin;
long lastOrigin;
long numMismatchToLast = 0;
long numMismatchToFirst = 0;
long numMismatchToFirstBig = 0;
long numChecks = 0;
public TimeCoherency( long firstNanoOrigin) {
firstOrigin = firstNanoOrigin;
lastOrigin = firstOrigin;
}
}
public static void main( String[] args) {
Thread[] threads = new Thread[ MAX_THREADS];
for ( int i = 0; i < MAX_THREADS; i++) {
final int fi = i;
final TimeCoherency tc = new TimeCoherency( firstNanoOrigin);
threads[ i] = new Thread() {
#Override
public void run() {
long start = getNow( tc);
long firstOrigin = tc.lastOrigin; // get the first origin for this thread
System.out.println( "Thread " + fi + " started at " + lnf.format( start) + " ns");
long nruns = 0;
while ( getNow( tc) < RUNTIME_NS) {
nruns++;
}
final long runTimeNS = getNow( tc) - start;
final long originDrift = tc.lastOrigin - firstOrigin;
nruns += 3; // account for start and end call and the one that ends the loop
final long skipped = nruns - tc.numChecks;
System.out.println( "Thread " + fi + " finished after " + lnf.format( nruns) + " runs in " + lnf.format( runTimeNS) + " ns (" + lnf.format( ( double) runTimeNS / nruns) + " ns/call) with"
+ "\n\t" + lnf.format( tc.numMismatchToFirst) + " different from first origin (" + lnf.format( 100.0 * tc.numMismatchToFirst / nruns) + "%)"
+ "\n\t" + lnf.format( tc.numMismatchToLast) + " jumps from last origin (" + lnf.format( 100.0 * tc.numMismatchToLast / nruns) + "%)"
+ "\n\t" + lnf.format( tc.numMismatchToFirstBig) + " different from first origin by more than " + BIG_OFFSET_MS + " ms"
+ " (" + lnf.format( 100.0 * tc.numMismatchToFirstBig / nruns) + "%)"
+ "\n\t" + "total drift: " + lnf.format( originDrift) + " ms, " + lnf.format( skipped) + " skipped (" + lnf.format( 100.0 * skipped / nruns) + " %)");
}};
threads[ i].start();
}
try {
for ( Thread thread : threads) {
thread.join();
}
} catch ( InterruptedException ie) {};
}
public static long getNow( TimeCoherency coherency) {
long millisBefore = System.currentTimeMillis();
long now = System.nanoTime();
if ( coherency != null) {
checkOffset( now, millisBefore, coherency);
}
return now - startNanos;
}
private static void checkOffset( long nanoTime, long millisBefore, TimeCoherency tc) {
long millisAfter = System.currentTimeMillis();
if ( millisBefore != millisAfter) {
// disregard since thread may have slept between calls
return;
}
tc.numChecks++;
long nanoMillis = ( long) ( nanoTime / 1e6);
long nanoOrigin = millisAfter - nanoMillis;
long oldOrigin = tc.lastOrigin;
if ( oldOrigin != nanoOrigin) {
tc.lastOrigin = nanoOrigin;
tc.numMismatchToLast++;
}
if ( tc.firstOrigin != nanoOrigin) {
tc.numMismatchToFirst++;
}
if ( Math.abs( tc.firstOrigin - nanoOrigin) > BIG_OFFSET_MS) {
tc.numMismatchToFirstBig ++;
}
}
}
Now I made some small changes. Basically, I bracket the nanoTime calls between two currentTimeMillis calls to see if the thread has been rescheduled (which should take more than currentTimeMillis resolution). In this case, I disregard the loop cycle. Actually, if we know that nanoTime is sufficiently fast (as on newer architectures like Ivy Bridge), we can bracket in currentTimeMillis with nanoTime.
Now the long >10ms jumps are gone. Instead, we count when we get more than 2ms away from first origin per thread. On the machines I have tested, for a runtime of 100s, there are always close to 200.000 jumps between calls. It is for those cases that I think either currentTimeMillis or nanoTime may be inaccurate.
As has been mentioned, computing a new origin each time means you are subject to error.
// ______ delay _______
// v v
long origin = (long)(System.currentTimeMillis() - System.nanoTime() / 1e6);
// ^
// truncation
If you modify your program so you also compute the origin difference, you'll find out it's very small. About 200ns average I measured which is about right for the time delay.
Using multiplication instead of division (which should be OK without overflow for another couple hundred years) you'll also find that the number of origins computed that fail the equality check is much larger, about 99%. If the reason for error is because of the time delay, they would only pass when the delay happens to be identical to the last one.
A much simpler test is to accumulate elapsed time over some number of subsequent calls to nanoTime and see if it checks out with the first and last calls:
public class SimpleTimeCoherencyTest {
public static void main(String[] args) {
final long anchorNanos = System.nanoTime();
long lastNanoTime = System.nanoTime();
long accumulatedNanos = lastNanoTime - anchorNanos;
long numCallsSinceAnchor = 1L;
for(int i = 0; i < 100; i++) {
TestRun testRun = new TestRun(accumulatedNanos, lastNanoTime);
Thread t = new Thread(testRun);
t.start();
try {
t.join();
} catch(InterruptedException ie) {}
lastNanoTime = testRun.lastNanoTime;
accumulatedNanos = testRun.accumulatedNanos;
numCallsSinceAnchor += testRun.numCallsToNanoTime;
}
System.out.println(numCallsSinceAnchor);
System.out.println(accumulatedNanos);
System.out.println(lastNanoTime - anchorNanos);
}
static class TestRun
implements Runnable {
volatile long accumulatedNanos;
volatile long lastNanoTime;
volatile long numCallsToNanoTime;
TestRun(long acc, long last) {
accumulatedNanos = acc;
lastNanoTime = last;
}
#Override
public void run() {
long lastNanos = lastNanoTime;
long currentNanos;
do {
currentNanos = System.nanoTime();
accumulatedNanos += currentNanos - lastNanos;
lastNanos = currentNanos;
numCallsToNanoTime++;
} while(currentNanos - lastNanoTime <= 100000000L);
lastNanoTime = lastNanos;
}
}
}
That test does indicate the origin is the same (or at least the error is zero-mean).
As far as I know the method System.currentTimeMillis() makes indeed sometimes jumps, dependent on the underlying OS. I have observed this behaviour myself sometimes.
So your code gives me the impression you try to get the offset between System.nanoTime() and System.currentTimeMillis() repeated times. You should rather try to observe this offset by calling System.currentTimeMillis() only once before you can say that System.nanoTimes() causes sometimes jumps.
By the way, I will not pretend that the spec (javadoc describes System.nanoTime() related to some fixed point) is always perfectly implemented. You can look on this discussion where multi-core CPUs or changes of CPU-frequencies can negatively affect the required behaviour of System.nanoTime(). But one thing is sure. System.currentTimeMillis() is far more subject to arbitrary jumps.
Related
in the case of using reflection we are accessing entities by their names encoded in strings like this m = getMethod("someMethod"). To find the requested entity a string comparison has to be done. Does it mean that the length of the entity name influences the performance. If it so how much is this impact on the performance?
The answer is heavily dependent on the Java Virtual Machine, you're using. I wrote a test program, just to get some numbers for a JVM 1.8.0_05 (yes, it's old ;-):
import java.lang.reflect.Method;
public class ReflectionAccessTest {
public final static void main(String[] args) throws Exception {
for (int i = 0; i < 100000; i++) {
// do some "training"
ReflectionTarget.class.getMethod("a", Integer.TYPE, Integer.TYPE);
ReflectionTarget.class.getMethod("aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa", Integer.TYPE, Integer.TYPE);
ReflectionTarget.class.getMethod("aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa", Integer.TYPE, Integer.TYPE);
}
Method method = null;;
long start;
start = System.currentTimeMillis();
for (int i = 0; i < 10000000; i++) {
// do some "training"
method = ReflectionTarget.class.getMethod("a", Integer.TYPE, Integer.TYPE);
}
System.out.println("Time to get method with short name " + (System.currentTimeMillis() - start) + " ms");
start = System.currentTimeMillis();
for (int i = 0; i < 10000000; i++) {
method.invoke(null, Integer.MAX_VALUE, Integer.MIN_VALUE);
}
System.out.println("Time to execute method with short name " + (System.currentTimeMillis() - start) + " ms");
start = System.currentTimeMillis();
for (int i = 0; i < 10000000; i++) {
// do some "training"
method = ReflectionTarget.class.getMethod("aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa", Integer.TYPE, Integer.TYPE);
}
System.out.println("Time to get method with medium name " + (System.currentTimeMillis() - start) + " ms");
start = System.currentTimeMillis();
for (int i = 0; i < 10000000; i++) {
method.invoke(null, Integer.MAX_VALUE, Integer.MIN_VALUE);
}
System.out.println("Time to execute method with medium name " + (System.currentTimeMillis() - start) + " ms");
start = System.currentTimeMillis();
for (int i = 0; i < 10000000; i++) {
// do some "training"
method = ReflectionTarget.class.getMethod("aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa", Integer.TYPE, Integer.TYPE);
}
System.out.println("Time to get method with long name " + (System.currentTimeMillis() - start) + " ms");
start = System.currentTimeMillis();
for (int i = 0; i < 10000000; i++) {
method.invoke(null, Integer.MAX_VALUE, Integer.MIN_VALUE);
}
System.out.println("Time to execute method with long name " + (System.currentTimeMillis() - start) + " ms");
}
private static class ReflectionTarget {
public static void a(int a, int b) {
// do nothing
}
public static void aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa(int a, int b) {
// do nothing
}
public static void aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa(int a, int b) {
// do nothing
}
}
}
The output is as follows:
Time to get method with short name 1012 ms
Time to execute method with short name 58 ms
Time to get method with medium name 3690 ms
Time to execute method with medium name 177 ms
Time to get method with long name 6279 ms
Time to execute method with long name 180 ms
The times are actually dependent on the length of the name (that surprised me first but on second thought it's obvious because there needs to be some kind of equaliy-test that is length-dependent).
But you can also see that the impact is negligible. A call of getMethod takes 0.1 nanoseconds for a method with a name with only one character and takes 0.6 nanoseconds for a method with a crazy long name (haven't counted the number of as).
If this difference is actually relevant for you, you might try out caching-mechanisms of the method you retrieved. But dependent on the time the called method takes, that might be completely useless unless its execution time is also in the range of sub-nanoseconds.
I have this code:
public class MainActivity extends AppCompatActivity implements SensorEventListener {
long start_time;
int record_state;
#Override
public void onSensorChanged(SensorEvent event) {
Long time = System.currentTimeMillis();
if (record_state == 1)
{
start_time = time;
record_state = 0;
}
if(Ax.size() == N_SAMPLES && Ay.size() == N_SAMPLES && Az.size() == N_SAMPLES) //assuming this gets executed
{
Toast.makeText(MainActivity.this, "Size of Ax: " + Integer.toString(Ax.size()) +
"\nSize of Ay: " + Integer.toString(Ay.size()) +
"\nSize of Az: " + Integer.toString(Az.size()) + "\n" + Long.toString((time)) + "\n" + Long.toString((start_time)) + "\nrecord state: " + Integer.toString((record_state)), Toast.LENGTH_LONG).show();
}
}
}
But it appears that time and start_time always have the same value. I want start_time to record the time at the very beginning (or freeze the value of time at one instant) only. How can I do this? What is wrong with this code?
If the code below does not work as you expected it to, then I suspect, that you have an issue with when and how "record_state" is being set somewhere in your code.
Local values with class wide scope:
int record_state = 1;
int iterations = 0;
long start_time;
When your onSensorChanged triggers call checkTimeDiff
private void checkTimeDiff(){
iterations++;
long time = SystemClock.elapsedRealtime();
if (record_state == 1)
{
start_time = SystemClock.elapsedRealtime();
record_state = 0;
}
long diff = time - start_time;
Log.e("My Timer", "Time difference = " + diff + " number of iterations = " + iterations);
if (diff >= 4000)
{
record_start = 1;
Toast.makeText(MainActivity.this, Long.toString(time) + "\n" + Long.toString(start_time), Toast.LENGTH_SHORT).show();
}
}
It appears as if you are using "record_state" as a digital flag. In that case a boolean would be more elegant. But I will use your code as much as possible.
Use lowercase 'long'. You are using objects currently and pointing them both to the same reference, so when you change the value of one it affects the other.
Because this code is going to be used in onSensorChanged (referring to your comments) then the solution should be these variables should be declared and used throughout your class and not in the single method and the problem will disappear so please declare this outside any method as local variables:
int record_state = 1;
Long start_time;
Code and logic improvements
Use long instead of Long
Use >= instead of == because its too hard to get it exactly equal!
I've written a program to scan for amicable numbers (a pair of 2 numbers that the sum of all devisors of one equals to the other) It works ok and I'll include the entire code below.
I tried to get it to run with several threads so I moved the code to a class called Breaker and my main looks as follows:
Breaker line1 = new Breaker("thread1");
Breaker line2 = new Breaker("thread2");
Breaker line3 = new Breaker("thread3");
Breaker line4 = new Breaker("thread4");
line1.scanRange(1L, 650000L);
line2.scanRange(650001L, 850000L);
line3.scanRange(850001L, 1000000L);
line4.scanRange(1000001L, 1200001L);
Now this does shorten the time noticably, but this is not a smart solution and the threads end each on very different times.
What I'm trying to do, is to automate the process so that a master thread that has the entire range, will fire up sections of short ranges (10000) from the master range, and when a thread ends, to fire up the next section in a new thread, until the entire master range is done.
I've tried understanding how to use synchronized, notify() and wait() but after several tries all ended with different errors and unwanted behaviour.
Here is Breaker.java:
import java.util.ArrayList;
public class Breaker implements Runnable{
Long from, to = null;
String name = null;
Thread t = new Thread(this);
public Breaker(String name){
this.name = name;
}
public void scanRange(Long from, Long to){
this.from = from;
this.to = to;
t.start();
}
#Override
public void run() {
this.scan();
}
private void scan() {
ArrayList<ArrayList<Long>> results = new ArrayList<ArrayList<Long>>();
Long startingTime = new Long(System.currentTimeMillis() / 1000L);
Long lastReport = new Long(startingTime);
System.out.println(startingTime + ": Starting number is: " + this.from);
for (Long i = this.from; i <= this.to; i++) {
if (((System.currentTimeMillis() / 1000L) - startingTime ) % 60 == 0 && (System.currentTimeMillis() / 1000L) != lastReport) {
System.out.println((System.currentTimeMillis() / 1000L) + ": " + this.name + " DOING NOW " + i.toString() + ".");
lastReport = (System.currentTimeMillis() / 1000L);
}
ArrayList<Long> a = new ArrayList<Long>();
a = getFriendPair(i);
if(a != null) {
results.add(a);
System.out.println(this.name + ": FOUND PAIR! " + a.toString());
}
}
System.out.println((System.currentTimeMillis() / 1000L) + ": " + this.name + " Done. Total pairs found: " + results.size() +
". Total working time: " + ((System.currentTimeMillis() / 1000L) - startingTime) + " seconds.");
}
/**
* Receives integer and returns an array of the integer and the number who is it's
* pair in case it has any. Else returns null.
* #param i
* #return
*/
private static ArrayList<Long> getFriendPair(Long i) {
Long possibleFriend = getAndSumAllDevisors(i);
if (possibleFriend.compareTo(i) <= 0) return null;
Long sumOfPossibleFriend = getAndSumAllDevisors(possibleFriend);
if(sumOfPossibleFriend.equals(i)) {
ArrayList<Long> pair = new ArrayList<Long>();
pair.add(i);
pair.add(possibleFriend);
return pair;
}
return null;
}
private static Long getAndSumAllDevisors(Long victim) {
Long sum = new Long(1);
Long i = 2L;
Long k = new Long(0);
while ((k = i * i) <= victim) {
if ((victim % i) == 0) {
sum += i;
if (k == victim) return sum;
sum += (victim / i);
}
i++;
}
return sum;
}
}
Consider ExecutorService, which is backed by a thread pool. You feed it tasks and they get shuffled off to worker threads as they become available:
http://www.vogella.com/articles/JavaConcurrency/article.html#threadpools
What you need is a "Fixed Thread Pool". See http://docs.oracle.com/javase/7/docs/api/java/util/concurrent/Executors.html#newFixedThreadPool%28int%29
I would go for a ExecutorService with a fixed thread pool size. Your master thread can either feed directly to the executor service or you can disconnect them via a BlockingQueue. The Java Doc of the blocking queue describes the producer-consumer pattern quite nicely.
I ended up taking none of the answers but rather Marko's comment and implemented my solution using a Fork/Join framework. It works and runs almost at twice the speed of the none optimized version.
My code looks now like so:
main file (runner)
public class runner {
private static Long START_NUM = 1L;
private static Long END_NUM = 10000000L;
public static void main(String[] args) {
Long preciseStartingTime = new Long(System.currentTimeMillis());
ForkJoinPool pool = new ForkJoinPool();
WorkManager worker = new WorkManager(START_NUM, END_NUM);
pool.invoke(worker);
System.out.println("precise time: " + (System.currentTimeMillis() - preciseStartingTime));
}
WorkManager
I've defined here 3 class variables. from and to are set from a constructor which is called from the main file. And threshold which is the maximum amount of numbers the program will assign a single thread to compute serially.
As you can see in the code, it will recursively make the range smaller until it is small enough to to compute directly, then it calls Breaker to start breaking.
import java.util.concurrent.RecursiveAction;
public class WorkManager extends RecursiveAction{
Long from, to;
Long threshold = 10000L;
public WorkManager(Long from, Long to) {
this.from = from;
this.to = to;
}
protected void computeDirectly(){
Breaker b = new Breaker(from, to);
b.scan();
}
#Override
protected void compute() {
if ((to - from) <= threshold){
System.out.println("New thread from " + from + " to " + to);
computeDirectly();
}
else{
Long split = (to - from) /2;
invokeAll(new WorkManager(from, from + split),
new WorkManager(from + split + 1L, to));
}
}
}
Breaker (is no longer an implementation of of Runnable)
public class Breaker{
Long from, to = null;
public Breaker(Long lFrom, Long lTo) {
this.from = lFrom;
this.to = lTo;
}
public void scan() {
ArrayList<ArrayList<Long>> results = new ArrayList<ArrayList<Long>>();
Long startingTime = new Long(System.currentTimeMillis() / 1000L);
for (Long i = this.from; i <= this.to; i++) {
ArrayList<Long> a = new ArrayList<Long>();
a = getFriendPair(i);
if(a != null) {
results.add(a);
System.out.println((System.currentTimeMillis() / 1000L) + ": FOUND PAIR! " + a.toString());
}
}
}
/**
* Receives integer and returns an array of the integer and the number who is it's
* pair in case it has any. Else returns null.
* #param i
* #return
*/
private static ArrayList<Long> getFriendPair(Long i) {
Long possibleFriend = getAndSumAllDevisors(i);
if (possibleFriend.compareTo(i) <= 0) return null;
Long sumOfPossibleFriend = getAndSumAllDevisors(possibleFriend);
if(sumOfPossibleFriend.equals(i)) {
ArrayList<Long> pair = new ArrayList<Long>();
pair.add(i);
pair.add(possibleFriend);
return pair;
}
return null;
}
private static Long getAndSumAllDevisors(Long victim) {
Long sum = new Long(1);
Long i = 2L;
Long k = new Long(0);
while ((k = i * i) <= victim) {
if ((victim % i) == 0) {
sum += i;
if (k == victim) return sum;
sum += (victim / i);
}
i++;
}
return sum;
}
}
I've programmed a (very simple) benchmark in Java. It simply increments a double value up to a specified value and takes the time.
When I use this singlethreaded or with a low amount of threads (up to 100) on my 6-core desktop, the benchmark returns reasonable and repeatable results.
But when I use for example 1200 threads, the average multicore duration is significantly lower than the singlecore duration (about 10 times or more). I've made sure that the total amount of incrementations is the same, no matter how much threads I use.
Why does the performance drop so much with more threads? Is there a trick to solve this problem?
I'm posting my source, but I don't think, that there is a problem.
Benchmark.java:
package sibbo.benchmark;
import java.text.DecimalFormat;
import java.util.LinkedList;
import java.util.List;
public class Benchmark implements TestFinishedListener {
private static final double TARGET = 1e10;
private static final int THREAD_MULTIPLICATOR = 2;
public static void main(String[] args) throws InterruptedException {
Benchmark b = new Benchmark(TARGET);
b.start();
}
private int coreCount;
private List<Worker> workers = new LinkedList<>();
private List<Worker> finishedWorkers = new LinkedList<>();
private double target;
public Benchmark(double target) {
this.target = target;
getSystemInfos();
printInfos();
}
private void getSystemInfos() {
coreCount = Runtime.getRuntime().availableProcessors();
}
private void printInfos() {
System.out.println("Usable cores: " + coreCount);
System.out.println("Multicore threads: " + coreCount * THREAD_MULTIPLICATOR);
System.out.println("Loops per core: " + new DecimalFormat("###,###,###,###,##0").format(TARGET));
System.out.println();
}
public synchronized void start() throws InterruptedException {
Thread.currentThread().setPriority(Thread.MAX_PRIORITY);
System.out.print("Initializing singlecore benchmark... ");
Worker w = new Worker(this, 0);
workers.add(w);
Thread.sleep(1000);
System.out.println("finished");
System.out.print("Running singlecore benchmark... ");
w.runBenchmark(target);
wait();
System.out.println("finished");
printResult();
System.out.println();
// Multicore
System.out.print("Initializing multicore benchmark... ");
finishedWorkers.clear();
for (int i = 0; i < coreCount * THREAD_MULTIPLICATOR; i++) {
workers.add(new Worker(this, i));
}
Thread.sleep(1000);
System.out.println("finished");
System.out.print("Running multicore benchmark... ");
for (Worker worker : workers) {
worker.runBenchmark(target / THREAD_MULTIPLICATOR);
}
wait();
System.out.println("finished");
printResult();
Thread.currentThread().setPriority(Thread.NORM_PRIORITY);
}
private void printResult() {
DecimalFormat df = new DecimalFormat("###,###,###,##0.000");
long min = -1, av = 0, max = -1;
int threadCount = 0;
boolean once = true;
System.out.println("Result:");
for (Worker w : finishedWorkers) {
if (once) {
once = false;
min = w.getTime();
max = w.getTime();
}
if (w.getTime() > max) {
max = w.getTime();
}
if (w.getTime() < min) {
min = w.getTime();
}
threadCount++;
av += w.getTime();
if (finishedWorkers.size() <= 6) {
System.out.println("Worker " + w.getId() + ": " + df.format(w.getTime() / 1e9) + "s");
}
}
System.out.println("Min: " + df.format(min / 1e9) + "s, Max: " + df.format(max / 1e9) + "s, Av per Thread: "
+ df.format((double) av / threadCount / 1e9) + "s");
}
#Override
public synchronized void testFinished(Worker w) {
workers.remove(w);
finishedWorkers.add(w);
if (workers.isEmpty()) {
notify();
}
}
}
Worker.java:
package sibbo.benchmark;
public class Worker implements Runnable {
private double value = 0;
private long time;
private double target;
private TestFinishedListener l;
private final int id;
public Worker(TestFinishedListener l, int id) {
this.l = l;
this.id = id;
new Thread(this).start();
}
public int getId() {
return id;
}
public synchronized void runBenchmark(double target) {
this.target = target;
notify();
}
public long getTime() {
return time;
}
#Override
public void run() {
synWait();
value = 0;
long startTime = System.nanoTime();
while (value < target) {
value++;
}
long endTime = System.nanoTime();
time = endTime - startTime;
l.testFinished(this);
}
private synchronized void synWait() {
try {
wait();
} catch (InterruptedException e) {
e.printStackTrace();
}
}
}
You need to understand that the OS (or Java thread scheduler, or both) is trying to balance between all of the threads in your application to give them all a chance to perform some work, and there is a non-zero cost to switch between threads. With 1200 threads, you have just reached (and probably far exceeded) the tipping point wherein the processor is spending more time context switching than doing actual work.
Here is a rough analogy:
You have one job to do in room A. You stand in room A for 8 hours a day, and do your job.
Then your boss comes by and tells you that you have to do a job in room B also. Now you need to periodically leave room A, walk down the hall to room B, and then walk back. That walking takes 1 minute per day. Now you spend 3 hours, 59.5 minutes working on each job, and one minute walking between rooms.
Now imagine that you have 1200 rooms to work in. You are going to spend more time walking between rooms than doing actual work. This is the situation that you have put your processor into. It is spending so much time switching between contexts that no real work gets done.
EDIT: Now, as per the comments below, maybe you spend a fixed amount of time in each room before moving on- your work will progress, but the number of context switches between rooms still affects the overall runtime of a single task.
Ok, I think I've found my problem, but until now, no solution.
When measuring the time every thread runs to do his part of the work, there are different possible minimums for different total amounts of threads. The maximum is the same everytime. In case that a thread is started first and then is paused very often and finishes last. For example this maximum value could be 10 seconds. Assuming that the total amount of operations that is done by every thread stays the same, no matter how much threads I use, the amount of operations that is done by a single thread has to be changed when using a different amount of threads. For example, using one thread, it has to do 1000 operations, but using ten threads, everyone of them has to do just 100 operations. Now, using ten threads, the minimum amount of time that one thread can use is much lower than using one thread. So calculating the average amount of time every thread needs to do his work is nonsense. The minimum using ten Threads would be 1 second. This happens if one thread does its work without interruption.
EDIT
The solution would be to simply measure the amount of time between the start of the first thread and the completion of the last.
This is my code, I am trying to test what's the average time to make a call to getLocationIp method by passing ipAddress in that. So what I did is that, I am generating some random ipAddress and passing that to getLocationIp and then calculating the time difference. And then putting that to HashMap with there counts. And after wards I am priniting the hash map to see
what's the actual count. So this is the right way to test this? or there is some other way. Becuase in my case I am not sure whether my generateIPAddress method generates random ipAddress everytime. I am also having start_total time before entering the loop and then end_total time after everything gets completed. So on that I can calculate the average time?
long total = 10000;
long found = 0;
long found_country = 0;
long runs = total;
Map<Long, Long> histgram = new HashMap<Long, Long>();
try {
long start_total = System.nanoTime();
while(runs > 0) {
String ipAddress = generateIPAddress();
long start_time = System.nanoTime();
resp = GeoLocationService.getLocationIp(ipAddress);
long end_time = System.nanoTime();
long difference = (end_time - start_time)/1000000;
Long count = histgram.get(difference);
if (count != null) {
count++;
histgram.put(Long.valueOf(difference), count);
} else {
histgram.put(Long.valueOf(difference), Long.valueOf(1L));
}
runs--;
}
long end_total = System.nanoTime();
long finalTotal = (end_total - start_total)/1000000;
float avg = (float)(finalTotal) / total;
Set<Long> keys = histgram.keySet();
for (Long key : keys) {
Long value = histgram.get(key);
System.out.println("$$$GEO OPTIMIZE SVC MEASUREMENT$$$, HG data, " + key + ":" + value);
}
This is my generateIpAddress method-
private String generateIPAddress() {
Random r = new Random();
String s = r.nextInt(256) + "." + r.nextInt(256) + "." + r.nextInt(256) + "." + r.nextInt(256);
return s;
}
Any suggestions will be appreciated.
Generally when you benchmark functions you want to run the multiple times and average the results That gives you are clearer indication of the actual time your program will spend in them, considering that you rarely care about the performance of something only run once.