I wrote a program to test whether a try/catch block affects the running time or not.
The code is shown below:
public class ExceptionTest {
public static void main(String[] args) {
System.out.println("Loop\t\tNormal(nano second)\t\tException(nano second)");
int[] arr = new int[] { 1, 500, 2500, 12500, 62500, 312500, 16562500 };
for (int i = 0; i < arr.length; i++) {
System.out.println(arr[i] + "," + NormalCase(arr[i]) + ","
+ ExceptionCase(arr[i]));
}
}
public static long NormalCase(int times) {
long firstTime=System.nanoTime();
for (int i = 0; i < times; i++) {
int a = i + 1;
int b = 2;
a = a / b;
}
return System.nanoTime()-firstTime;
}
public static long ExceptionCase(int times) {
long firstTime =System.nanoTime();
for (int i = 0; i < times; i++) {
try {
int a = i + 1;
int b = 0;
a = a / b;
} catch (Exception ex) {
}
}
return System.nanoTime()-firstTime;
}
}
The result is shown below:
I wonder why it takes less time at 62500 and bigger numbers. Is it overflow? It seems not.
You are not testing the computational cost of the try/catch block. You are really testing the cost of exception handling. A fair test would be making b= 2 ; also in ExceptionCase. I don't know what extremely wrong conclusions you will draw if you think you are testing only try/catch. I'm frankly alarmed.
The reason the timing changes so much is that you are executing the functions so many times that the JVM decides to compile and optimize them. Enclose your loop in an outer one:
for(int e= 0 ; e < 17 ; e++ ) {
for(int i= 0 ; i < arr.length ; i++) {
System.out.println(arr[i] + "," + NormalCase(arr[i]) + "," + ExceptionCase(arr[i]));
}
}
and you will see more stable results by the end of the run.
I also think that in the case NormalCase the optimizer is "realizing" that the for is not really doing anything and just skipping it (for an execution time of 0). For some reason (probably the side effect of exceptions), it's not doing the same with ExceptionCase. To solve this bias, compute something inside the loop and return it.
I don't want to change your code too much, so I'll use a trick to return a second value:
public static long NormalCase(int times,int[] result) {
long firstTime=System.nanoTime();
int computation= 0 ;
for(int i= 0; i < times; i++ ) {
int a= i + 1 ;
int b= 2 ;
a= a / b ;
computation+= a ;
}
result[0]= computation ;
return System.nanoTime()-firstTime;
}
You can call this with NormalCase(arr[i],result), preceded by declaration int[] result= new int[1] ;. Modify ExceptionCase in the same way, and output result[0] to avoid any other optimization. You will probably need one result variable for each function.
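Putting the suggestions above together, a minimal sketch could look like the following. The class name TryCatchBenchmarkSketch, the method name tryCatchCase and the two separate result arrays are my own choices, not part of your code. I set b to 2 in both methods so that only the presence of the try/catch differs, and I kept the outer loop of 17 runs for warm-up.

```java
class TryCatchBenchmarkSketch {

    // Same loop as NormalCase, but the computed value escapes through
    // result[0] so the JIT cannot discard the loop as dead code.
    public static long normalCase(int times, int[] result) {
        long firstTime = System.nanoTime();
        int computation = 0;
        for (int i = 0; i < times; i++) {
            int a = i + 1;
            int b = 2;
            a = a / b;
            computation += a; // may wrap around for large 'times'; that is fine here
        }
        result[0] = computation;
        return System.nanoTime() - firstTime;
    }

    // Identical work wrapped in try/catch; b is 2, so nothing is ever thrown.
    public static long tryCatchCase(int times, int[] result) {
        long firstTime = System.nanoTime();
        int computation = 0;
        for (int i = 0; i < times; i++) {
            try {
                int a = i + 1;
                int b = 2;
                a = a / b;
                computation += a;
            } catch (ArithmeticException ex) {
                // never reached with b == 2
            }
        }
        result[0] = computation;
        return System.nanoTime() - firstTime;
    }

    public static void main(String[] args) {
        int[] arr = { 1, 500, 2500, 12500, 62500, 312500, 16562500 };
        int[] normalResult = new int[1];
        int[] tryCatchResult = new int[1];
        // Repeat the whole measurement so the later rounds run compiled code.
        for (int e = 0; e < 17; e++) {
            for (int i = 0; i < arr.length; i++) {
                long normal = normalCase(arr[i], normalResult);
                long guarded = tryCatchCase(arr[i], tryCatchResult);
                // Printing the results keeps them observable.
                System.out.println(arr[i] + "," + normal + "," + guarded
                        + "," + normalResult[0] + "," + tryCatchResult[0]);
            }
        }
    }
}
```

Only the last few rounds of output are worth comparing; the early ones still include interpretation and compilation time.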
I am trying to get familiar with java multithreaded applications. I tried to think of a simple application that can be parallelized very well. I thought vector addition would be a good application to do so.
However, when running on my Linux server (which has 4 cores) I don't get any speedup. The time to execute on 4, 2 and 1 threads is about the same.
Here is the code I came up with:
public static void main(String[]args)throws InterruptedException{
final int threads = Integer.parseInt(args[0]);
final int length= Integer.parseInt(args[1]);
final int balk=(length/threads);
Thread[]th = new Thread[threads];
final double[]result =new double[length];
final double[]array1=getRandomArray(length);
final double[]array2=getRandomArray(length);
long startingTime =System.nanoTime();
for(int i=0;i<threads;i++){
final int current=i;
th[i]=new Thread(()->{
for(int k=current*balk;k<(current+1)*balk;k++){
result[k]=array1[k]+array2[k];
}
});
th[i].start();
}
for(int i=0;i<threads;i++){
th[i].join();
}
System.out.println("Time needed: "+(System.nanoTime()-startingTime));
}
length is always a multiple of threads and getRandomArray() creates a random array of doubles between 0 and 1.
Execution Time for 1-Thread: 84579446ns
Execution Time for 2-Thread: 74211325ns
Execution Time for 4-Thread: 89215100ns
length =10000000
Here is the Code for getRandomArray():
private static double[]getRandomArray(int length){
Random random =new Random();
double[]array= new double[length];
for(int i=0;i<length;i++){
array[i]=random.nextDouble();
}
return array;
}
I would appreciate any help.
The difference is observable for the following code. Try it.
public static void main(String[]args)throws InterruptedException{
for(int z = 0; z < 10; z++) {
final int threads = 1;
final int length= 100_000_000;
final int balk=(length/threads);
Thread[]th = new Thread[threads];
final boolean[]result =new boolean[length];
final boolean[]array1=getRandomArray(length);
final boolean[]array2=getRandomArray(length);
long startingTime =System.nanoTime();
for(int i=0;i<threads;i++){
final int current=i;
th[i]=new Thread(()->{
for(int k=current*balk;k<(current+1)*balk;k++){
result[k]=array1[k] | array2[k];
}
});
th[i].start();
}
for(int i=0;i<threads;i++){
th[i].join();
}
System.out.println("Time needed: "+(System.nanoTime()-startingTime)*1.0/1000/1000);
boolean x = false;
for(boolean d : result) {
x |= d;
}
System.out.println(x);
}
}
First things first: you need to warm up your code, so that you measure compiled code. The first two iterations take approximately the same time, but the following ones will differ. I also changed double to boolean because my machine doesn't have much memory. This lets me allocate a huge array, and it also makes the work more CPU-intensive.
There is a link in the comments. I suggest you read it.
Hi. If you are trying to see how your cores share work, you can give every core a very simple task, but keep each thread constantly busy with something that is not shared across threads (basically simulating, for example, a merge sort, where threads work on something complicated and touch shared resources only briefly). Using your code I did something like this. In this case you should see almost exactly a 2x speedup with 2 threads and a 4x speedup with 4.
public static void main(String[]args)throws InterruptedException{
for(int a=0; a<5; a++) {
final int threads = 2;
final int length = 10;
final int balk = (length / threads);
Thread[] th = new Thread[threads];
System.out.println(Runtime.getRuntime().availableProcessors());
final double[] result = new double[length];
final double[] array1 = getRandomArray(length);
final double[] array2 = getRandomArray(length);
long startingTime = System.nanoTime();
for (int i = 0; i < threads; i++) {
final int current = i;
th[i] = new Thread(() -> {
Random random = new Random();
int meaningless = 0;
for (int k = current * balk; k < (current + 1) * balk; k++) {
result[k] = array1[k] + array2[k];
for (int j = 0; j < 10000000; j++) {
meaningless+=random.nextInt(10);
}
}
});
th[i].start();
}
for (int i = 0; i < threads; i++) {
th[i].join();
}
System.out.println("Time needed: " + ((System.nanoTime() - startingTime) * 1.0) / 1000000000 + " s");
}
}
You see, in your code most of the time is consumed by building the big arrays, and then the threads execute very quickly. Their work is so short that your timing is misleading, because most of the measured time is spent creating threads. When I ran code that works on the precomputed arrays in a plain loop, like this:
long startingTime =System.nanoTime();
for(int k=0; k<length; k++){
result[k]=array1[k]|array2[k];
}
System.out.println("Time needed: "+(System.nanoTime()-startingTime));
It ran twice as fast as your code with 2 threads. I hope you understand what I mean here, and that you see my point in giving my threads much more meaningless work.
public class OneHundredDoors
{
static OneHundredDoors.Door[] doors = new OneHundredDoors.Door[100];
static public class Door
{
public int doorClosed = 0;
public void open ()
{
this.doorClosed = 0;
}
public void close ()
{
this.doorClosed = 1;
}
private void printStatus (int address)
{
if (this.doorClosed == 0)
{
System.out.println("Door: " + address + " is Open!");
}
}
public void printStatusOfAll ()
{
for (int i = 0; i < doors.length; i++)
{
doors[i].printStatus(i);
}
}
public void passDoor (int increment)
{
for (int k = 0; k < doors.length; k += increment)
{
if (doors[k].doorClosed == 0)
{
doors[k].close();
}
else
{
doors[k].open();
}
}
}
}
public static void main (String [] args)
{
for (int i = 0; i < doors.length; i++)
{
doors[i] = new OneHundredDoors.Door ();
}
for (int j = 0; j < doors.length; j++)
{
doors[5].passDoor(j);
}
doors[5].printStatusOfAll();
}
}
My problem here is that the loop for doors[5].passDoor(j) simply does not work at all. No errors come up, neither at runtime nor at compile time. Nothing happens. Leaving the program for a while and coming back to it does nothing, signifying that it is not doing anything in the background. Now this code solves the problem if you simply say doors[5].passDoor(2), then 3, then 4, up to 100. The problem is that that's a wasteful thing to do, and hence I instead want to do it with a for loop.
About the static array of objects: sorry about that, I'm doing that to make things easier in the testing stage, and will fix things when I've got it up and running (by making the array private to the class Door).
I'm only really posting this here because I'm at a complete loss for why this is happening. No errors to search for on the internet, no freezes so I know it's probably not an infinite (or long) loop, and no one seems to have similar problems with 100 doors (albeit this may be because they have not taken an object-oriented approach to it as I have done). Also, the code works completely fine if you type it 100 times as I have said (or at least, it APPEARS that it would do so had I the patience to actually type it out 100 times).
Note finally, that the loop here does not work for ANY value of x where j < x. (What I'm missing here must be something obvious and simple therefore).
The reason passDoor won't work is that your loop starts at j = 0, so the first call passes an increment of 0 to:
for (int k = 0; k < doors.length; k += increment) {
so the values of k never increment causing an infinite loop.
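A minimal sketch of the fix is to start the pass counter at 1 and to reject non-positive increments. To keep it short I replaced the Door objects with a boolean array (true meaning open) and assumed that all doors start closed; the class name DoorsLoopSketch and both of those choices are mine, not taken from the question.

```java
class DoorsLoopSketch {

    // Toggles every increment-th door starting at index 0, like passDoor.
    // An increment below 1 would never advance k, so it is rejected up front.
    public static void passDoor(boolean[] open, int increment) {
        if (increment < 1) {
            throw new IllegalArgumentException("increment must be >= 1, got " + increment);
        }
        for (int k = 0; k < open.length; k += increment) {
            open[k] = !open[k];
        }
    }

    public static void main(String[] args) {
        boolean[] open = new boolean[100]; // assumption: all doors start closed
        // Start at 1, not 0: with an increment of 0, k += increment never moves.
        for (int j = 1; j <= open.length; j++) {
            passDoor(open, j);
        }
        for (int i = 0; i < open.length; i++) {
            if (open[i]) {
                System.out.println("Door: " + i + " is Open!");
            }
        }
    }
}
```

The guard turns a silent hang into an immediate exception, which makes this kind of mistake much easier to find.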
The code below stops running when "n" gets to around 100,000. I need it to run up to 1 million. I don't know where it's going wrong. I am still learning Java, so there might be simple mistakes in the code as well.
public class Problem14{
public static void main(String[] args) {
int chainLength;
int longestChain = 0;
int startingNumber = 0;
for(int n =2; n<=1000000; n++)
{
chainLength = getChain(n);
if(chainLength > longestChain)
{
System.out.println("chainLength: "+chainLength+" start: "+n);
longestChain = chainLength;
startingNumber = n;
}
}
System.out.println("longest:"+longestChain +" "+"start:"+startingNumber);
}
public static int getChain(int y)
{
int count = 0;
while(y != 1)
{
if((y%2) == 0)
{
y = y/2;
}
else{
y = (3*y) + 1;
}
count = count + 1;
}
return count;
}
}
Please use long as the data type instead of int.
I want to point out that the numbers in the chain do climb far higher than 1000000, so the variable y needs to be a long to hold them.
It's the datatype for y. It should be long. Otherwise it wraps round to -2 billion.
I thought I recognised this - it's Euler problem 14. I've done this myself.
The getChain() method is causing the problem: y becomes negative and then the loop hangs forever.
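For reference, here is a minimal sketch of the change these answers describe: the only real difference from the question's code is that y is a long. The class name CollatzLongSketch is mine, and I dropped the progress printing to keep it short.

```java
class CollatzLongSketch {

    // Counts the steps from y down to 1. y is a long because intermediate
    // values can exceed Integer.MAX_VALUE even for starts below 1,000,000.
    public static int getChain(long y) {
        int count = 0;
        while (y != 1) {
            if (y % 2 == 0) {
                y = y / 2;
            } else {
                y = 3 * y + 1;
            }
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        int longestChain = 0;
        int startingNumber = 0;
        for (int n = 2; n <= 1000000; n++) {
            int chainLength = getChain(n); // n is widened to long automatically
            if (chainLength > longestChain) {
                longestChain = chainLength;
                startingNumber = n;
            }
        }
        System.out.println("longest:" + longestChain + " start:" + startingNumber);
    }
}
```

The loop counter n can stay an int; only the value that is repeatedly multiplied by 3 needs the wider type.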
So basically I needed to optimize this piece of code today. It tries to find the longest sequence produced by some function for the first million starting numbers:
public static void main(String[] args) {
int mostLen = 0;
int mostInt = 0;
long currTime = System.currentTimeMillis();
for(int j=2; j<=1000000; j++) {
long i = j;
int len = 0;
while((i=next(i)) != 1) {
len++;
}
if(len > mostLen) {
mostLen = len;
mostInt = j;
}
}
System.out.println(System.currentTimeMillis() - currTime);
System.out.println("Most len is " + mostLen + " for " + mostInt);
}
static long next(long i) {
if(i%2==0) {
return i/2;
} else {
return i*3+1;
}
}
My mistake was to try to introduce multithreading:
void doSearch() throws ExecutionException, InterruptedException {
final int numProc = Runtime.getRuntime().availableProcessors();
System.out.println("numProc = " + numProc);
ExecutorService executor = Executors.newFixedThreadPool(numProc);
long currTime = System.currentTimeMillis();
List<Future<ValueBean>> list = new ArrayList<Future<ValueBean>>();
for (int j = 2; j <= 1000000; j++) {
MyCallable<ValueBean> worker = new MyCallable<ValueBean>();
worker.setBean(new ValueBean(j, 0));
Future<ValueBean> f = executor.submit(worker);
list.add(f);
}
System.out.println(System.currentTimeMillis() - currTime);
int mostLen = 0;
int mostInt = 0;
for (Future<ValueBean> f : list) {
final int len = f.get().getLen();
if (len > mostLen) {
mostLen = len;
mostInt = f.get().getNum();
}
}
executor.shutdown();
System.out.println(System.currentTimeMillis() - currTime);
System.out.println("Most len is " + mostLen + " for " + mostInt);
}
public class MyCallable<T> implements Callable<ValueBean> {
public ValueBean bean;
public void setBean(ValueBean bean) {
this.bean = bean;
}
public ValueBean call() throws Exception {
long i = bean.getNum();
int len = 0;
while ((i = next(i)) != 1) {
len++;
}
return new ValueBean(bean.getNum(), len);
}
}
public class ValueBean {
int num;
int len;
public ValueBean(int num, int len) {
this.num = num;
this.len = len;
}
public int getNum() {
return num;
}
public int getLen() {
return len;
}
}
long next(long i) {
if (i % 2 == 0) {
return i / 2;
} else {
return i * 3 + 1;
}
}
Unfortunately, the multithreaded version worked 5 times slower than the single-threaded on 4 processors (cores).
Then I tried a bit more crude approach:
static int mostLen = 0;
static int mostInt = 0;
synchronized static void updateIfMore(int len, int intgr) {
if (len > mostLen) {
mostLen = len;
mostInt = intgr;
}
}
public static void main(String[] args) throws InterruptedException {
long currTime = System.currentTimeMillis();
final int numProc = Runtime.getRuntime().availableProcessors();
System.out.println("numProc = " + numProc);
ExecutorService executor = Executors.newFixedThreadPool(numProc);
for (int i = 2; i <= 1000000; i++) {
final int j = i;
executor.execute(new Runnable() {
public void run() {
long l = j;
int len = 0;
while ((l = next(l)) != 1) {
len++;
}
updateIfMore(len, j);
}
});
}
executor.shutdown();
executor.awaitTermination(30, TimeUnit.SECONDS);
System.out.println(System.currentTimeMillis() - currTime);
System.out.println("Most len is " + mostLen + " for " + mostInt);
}
static long next(long i) {
if (i % 2 == 0) {
return i / 2;
} else {
return i * 3 + 1;
}
}
and it worked much faster, but still it was slower than the single thread approach.
I hope it's not because I screwed up the way I'm doing multithreading, but rather this particular calculation/algorithm is not a good fit for parallel computation. If I change calculation to make it more processor intensive by replacing method next with:
long next(long i) {
Random r = new Random();
for(int j=0; j<10; j++) {
r.nextLong();
}
if (i % 2 == 0) {
return i / 2;
} else {
return i * 3 + 1;
}
}
both multithreaded versions start to execute more than twice as fast as the single-threaded version on a 4 core machine.
So clearly there must be some threshold that you can use to determine whether it is worth introducing multithreading, and my question is:
What is the basic rule that would help decide if a given calculation is intensive enough to be optimized by running it in parallel (without spending effort to actually implement it?)
The key to efficiently implementing multithreading is to make sure the cost is not too high. There are no fixed rules as they heavily depend on your hardware.
Starting and stopping threads has a high cost. Of course you already used the executor service which reduces these costs considerably because it uses a bunch of worker threads to execute your Runnables. However each Runnable still comes with some overhead. Reducing the number of runnables and increasing the amount of work each one has to do will improve performance, but you still want to have enough runnables for the executor service to efficiently distribute them over the worker threads.
You have chosen to create one runnable for each starting value, so you end up creating 1000000 runnables. You would probably get much better results if you let each Runnable handle a batch of, say, 1000 start values. That means you only need 1000 runnables, greatly reducing the overhead.
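A minimal sketch of that batching idea, assuming a batch size of 1000 and returning each batch's best result as a small int array. The class name BatchedSearchSketch and the helpers searchRange and search are hypothetical names of mine; len is counted the same way as in the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class BatchedSearchSketch {

    static long next(long i) {
        return (i % 2 == 0) ? i / 2 : i * 3 + 1;
    }

    // Sequentially scans the start values from..to (inclusive).
    // Returns {mostLen, mostInt} for that range.
    public static int[] searchRange(int from, int to) {
        int mostLen = 0;
        int mostInt = 0;
        for (int j = from; j <= to; j++) {
            long i = j;
            int len = 0;
            while ((i = next(i)) != 1) {
                len++;
            }
            if (len > mostLen) {
                mostLen = len;
                mostInt = j;
            }
        }
        return new int[] { mostLen, mostInt };
    }

    // Submits one task per batch of batchSize start values, not one per value.
    public static int[] search(int limit, int batchSize) {
        ExecutorService executor =
                Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());
        try {
            List<Future<int[]>> futures = new ArrayList<>();
            for (int from = 2; from <= limit; from += batchSize) {
                final int first = from;
                final int last = Math.min(limit, from + batchSize - 1);
                Callable<int[]> task = () -> searchRange(first, last);
                futures.add(executor.submit(task));
            }
            // Merge the per-batch winners in submission order.
            int[] best = { 0, 0 };
            for (Future<int[]> f : futures) {
                int[] candidate = f.get();
                if (candidate[0] > best[0]) {
                    best = candidate;
                }
            }
            return best;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("batched search failed", e);
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        long currTime = System.currentTimeMillis();
        int[] best = search(1000000, 1000);
        System.out.println(System.currentTimeMillis() - currTime);
        System.out.println("Most len is " + best[0] + " for " + best[1]);
    }
}
```

The batch size is a tuning knob: too small and the per-task overhead returns, too large and some worker threads may sit idle near the end.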
I think there is another component to this which you are not considering. Parallelization works best when the units of work have no dependence on each other. Running a calculation in parallel is sub-optimal when later calculation results depend on earlier calculation results. The dependence could be strong in the sense of "I need the first value to compute the second value". In that case, the task is completely serial and later values cannot be computed without waiting for earlier computations. There could also be a weaker dependence in the sense of "If I had the first value I could compute the second value faster". In that case, the cost of parallelization is that some work may be duplicated.
This problem lends itself to being optimized without multithreading because some of the later values can be computed faster if you have the previous results already in hand. Take, for example, j == 4. One pass through the inner loop produces i == 2, but you just computed the result for j == 2 two iterations ago. If you saved the value of len, you can compute it as len(4) = 1 + len(2).
Using an array to store previously computed values of len and a little bit twiddling in the next method, you can complete the task >50x faster.
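One way to sketch the caching idea (this is an illustration under my own assumptions, not the exact code behind the speedup mentioned above): walk each chain only until it drops below its own start value, then add the cached count for that smaller value. The class name MemoizedCollatzSketch is hypothetical, and this version counts every step down to 1, so its count is one higher than len in the question.

```java
class MemoizedCollatzSketch {

    // Returns {steps, start} for the start value in 2..limit (limit >= 1)
    // that needs the most steps to reach 1.
    public static int[] longest(int limit) {
        // steps[n] holds the full step count for n; steps[1] is 0 by default.
        int[] steps = new int[limit + 1];
        int bestSteps = 0;
        int bestStart = 0;
        for (int j = 2; j <= limit; j++) {
            long i = j;
            int count = 0;
            // Stop as soon as the chain falls below j: that tail is cached.
            while (i >= j) {
                i = (i % 2 == 0) ? i / 2 : i * 3 + 1;
                count++;
            }
            steps[j] = count + steps[(int) i];
            if (steps[j] > bestSteps) {
                bestSteps = steps[j];
                bestStart = j;
            }
        }
        return new int[] { bestSteps, bestStart };
    }

    public static void main(String[] args) {
        long currTime = System.currentTimeMillis();
        int[] best = longest(1000000);
        System.out.println(System.currentTimeMillis() - currTime);
        System.out.println("Most steps is " + best[0] + " for " + best[1]);
    }
}
```

Because every start value below j has already been processed, the cache lookup is always valid once i drops below j, and no bounds check on larger values is needed.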
"Will the performance gain be greater than the cost of context switching and thread creation?"
That cost depends heavily on the OS, language, and hardware; this question has some discussion about the cost in Java, along with some numbers and some pointers on how to calculate it.
You also want to have one thread per CPU, or less, for CPU intensive work. Thanks to David Harkness for the pointer to a thread on how to work out that number.
Estimate the amount of work which a thread can do without interaction with other threads (directly or via common data). If that piece of work can be completed in 1 microsecond or less, the overhead is too high and multithreading is of no use. If it takes 1 millisecond or more, multithreading should work well. If it is in between, experimental testing is required.
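To apply this rule of thumb you need a rough number for one unit of work. A minimal sketch, assuming the unit is the chain calculation for a single start value; the class GranularitySketch and the helpers classify and averageNanos are hypothetical names, and the measurement is deliberately crude.

```java
class GranularitySketch {

    static volatile long sink; // keeps the measured work observable to the JIT

    // Applies the 1 microsecond / 1 millisecond thresholds from the rule above.
    public static String classify(double nanosPerUnit) {
        if (nanosPerUnit <= 1_000) {
            return "too small: overhead dominates";
        } else if (nanosPerUnit >= 1_000_000) {
            return "large enough: multithreading should help";
        }
        return "in between: measure both versions";
    }

    // Average wall-clock time of one unit, measured after a warm-up pass
    // of the same length.
    public static double averageNanos(Runnable unit, int repetitions) {
        for (int i = 0; i < repetitions; i++) {
            unit.run();
        }
        long start = System.nanoTime();
        for (int i = 0; i < repetitions; i++) {
            unit.run();
        }
        return (System.nanoTime() - start) / (double) repetitions;
    }

    public static void main(String[] args) {
        // Assumed unit of work: one full chain for a single start value.
        Runnable unit = () -> {
            long i = 837799;
            long len = 0;
            while (i != 1) {
                i = (i % 2 == 0) ? i / 2 : i * 3 + 1;
                len++;
            }
            sink = len;
        };
        double nanos = averageNanos(unit, 100000);
        System.out.println(nanos + " ns per unit -> " + classify(nanos));
    }
}
```

If the unit lands in the "too small" range, grouping many units into one task is usually the first thing to try before giving up on parallelism.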
Possible Duplicates:
Which loop has better performance? Why?
Which is optimal ?
Efficiency of Java code with primitive types
When looping, for instance with
for ( int j = 0; j < 1000; j++) {}; where I need to instantiate 1000 objects, how does declaring the object inside the loop differ from declaring it outside the loop?
for ( int j = 0; j < 1000; j++) {Object obj; obj =}
vs
Object obj;
for ( int j = 0; j < 1000; j++) {obj =}
It's obvious that the object is accessible either only from the loop scope or from the scope that is surrounding it. But I don't understand the performance question, garbage collection etc.
What is the best practice ? Thank you
The first form is better. Limiting the scope of a variable makes it easier for readers to understand where and how it is used.
Performance-wise, there are some small advantages to limited scope as well, which you can read about in another answer. But these concerns are secondary to code comprehension.
There's no difference. The compiler will optimize them to the very same place.
I tested this on my machine, and the difference was about 2-4 ms over 10000 instances. I tested all kinds of things, such as declaring and assigning a value in one statement:
int i=0;
compared with:
int i;
i=0;
Here is the code I used for testing. Of course I changed it between tests, and there is an initial warm-up period before the JVM settles into optimized code, which you can see clearly once you run it:
package initializer;
import java.util.ArrayList;
public final class EfficiencyTests {
private static class Stoper {
private long initTime;
private long executionDuration;
public Stoper() {
// TODO Auto-generated constructor stub
}
private void start() {
initTime = System.nanoTime();
}
private void stop() {
executionDuration = System.nanoTime() - initTime;
}
@Override
public String toString() {
return executionDuration + " nanos";
}
}
private static Stoper stoper = new Stoper();
public static void main(String[] args) {
for (int i = 0; i < 100; i++) {
theCycleOfAForLoop(100000);
theCycleOfAForLoopWithACallToSize(100000);
howLongDoesItTakeToSetValueToAVariable(100000);
howLongDoesItTakeToDefineAVariable(100000);
System.out.println("\n");
}
}
private static void theCycleOfAForLoop(int loops) {
stoper.start();
for (int i = 0; i < loops; i++);
stoper.stop();
System.out.println("The average duration of 10 cycles of an empty 'for' loop over " + loops + " iterations is: " + stoper.executionDuration * 10 / loops);
}
private static void theCycleOfAForLoopWithACallToSize(int loops) {
ArrayList<Object> objects=new ArrayList<Object>();
for (int i = 0; i < loops; i++)
objects.add(new Object());
stoper.start();
for (int i = 0; i < objects.size(); i++);
stoper.stop();
System.out.println("The average duration of 10 cycles of an empty 'for' loop with call to size over " + loops + " iterations is: " + stoper.executionDuration * 10 / loops);
}
private static void howLongDoesItTakeToSetValueToAVariable(int loops) {
int value = 0;
stoper.start();
for (int i = 0; i < loops; i++) {
value = 2;
}
stoper.stop();
System.out.println("The average duration of 10 cycles of setting a variable to a constant over " + loops + " iterations is: " + stoper.executionDuration * 10 / loops);
}
private static void howLongDoesItTakeToDefineAVariable(int loops) {
stoper.start();
for (int i = 0; i < loops; i++) {
int value = 0;
}
stoper.stop();
System.out.println("The average duration of 10 cycles of initializing and setting a variable to a constant over " + loops + " iterations is: " + stoper.executionDuration * 10 / loops);
}
private static void runAForLoopOnAnArrayOfObjects() {
// TODO Auto-generated method stub
}}
You can derive how long one operation takes by subtracting the time of the other (if you understand what I mean).
Hope this saves you some time.
One thing you need to understand is that I tested these things to optimize the paint update loop of my platform, and it helped.
Adam.