Which of these pieces of code is faster in Java? - java

a) for(int i = 100000; i > 0; i--) {}
b) for(int i = 1; i < 100001; i++) {}
The answer is on this website (question 3); I just can't figure out why. From the website:
3. a

When you get down to the lowest level (machine code but I'll use assembly since it maps one-to-one mostly), the difference between an empty loop decrementing to 0 and one incrementing to 50 (for example) is often along the lines of:
Decrementing to zero:
ld a,50
loop: dec a
jnz loop
Incrementing to 50:
ld a,0
loop: inc a
cmp a,50
jnz loop
That's because the zero flag in most sane CPUs is set by the decrement instruction when you reach zero. The same can't usually be said for the increment instruction when it reaches 50 (since there's nothing special about that value, unlike zero). So you need to compare the register with 50 to set the zero flag.
However, asking which of the two loops:
for(int i = 100000; i > 0; i--) {}
for(int i = 1; i < 100001; i++) {}
is faster (in pretty much any environment, Java or otherwise) is useless, since neither of them does anything useful. The fastest version of both those loops is no loop at all. I challenge anyone to come up with a faster version than that :-)
They'll only become useful when you start doing some useful work inside the braces and, at that point, the work will dictate which order you should use.
For example if you need to count from 1 to 100,000, you should use the second loop. That's because the advantage of counting down (if any) is likely to be swamped by the fact that you have to evaluate 100000-i inside the loop every time you need to use it. In assembly terms, that would be the difference between:
Counting down (you have to rebuild the forward value before using it):
ld b,100000
sub b,a
dsw b
Counting up (you can use the counter directly):
dsw a
(dsw is, of course, the infamous do something with assembler mnemonic).
Since you'll only be taking the hit for an incrementing loop once per iteration, and you'll be taking the hit for the subtraction at least once per iteration (assuming you'll be using i, otherwise there's little need for the loop at all), you should just go with the more natural version.
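In Java terms, the same point can be sketched like this (doWork is a hypothetical placeholder for whatever the loop body needs to do with the value):
public class CountingDirection {
    // hypothetical stand-in for whatever work the loop body does with the value
    static void doWork(int value) { }

    public static void main(String[] args) {
        // Counting down forces you to reconstruct the value you actually need:
        for (int i = 100000; i > 0; i--) {
            doWork(100000 - i + 1);   // extra subtraction on every iteration
        }
        // Counting up uses the loop variable directly:
        for (int i = 1; i <= 100000; i++) {
            doWork(i);
        }
    }
}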
If you need to count up, count up. If you need to count down, count down.

On many compilers, the machine instructions emitted for a loop going backwards are more efficient, because testing for zero (and therefore zeroing a register) is faster than a load immediate of a constant value.
On the other hand, a good optimising compiler should be able to inspect the loop body and determine that going backwards won't cause any side effects...
BTW, that is a terrible interview question in my opinion. Unless you are talking about a loop which runs tens of millions of times AND you have ascertained that the slight gain is not outweighed by the many instances of recreating the forward loop value (n - i), any performance gain will be minimal.
As always, don't micro-optimise without performance benchmarking and at the expense of harder to understand code.

These kinds of questions are largely an irrelevant distraction that some people get obsessed with. Call it the Cult of Micro-optimization or whatever you like, but is it faster to loop up or down? Seriously? You use whichever is appropriate for what you're doing. You don't write your code around saving two clock cycles or whatever it is.
Let the compiler do what it's there for and make your intent clear (both to the compiler and the reader). Another common Java pessimization is:
public final static String BLAH = new StringBuilder().append("This is a ").append(3).append(" test").toString();
because excessive concatenation can result in memory fragmentation, but for a constant expression the compiler can (and will) optimize this:
public final static String BLAH = "This is a " + 3 + " test";
where it won't optimize the first and the second is easier to read.
And how about (a>b)?a:b vs Math.max(a,b)? I know I'd rather read the second so I don't really care that the first doesn't incur a function call overhead.
There are a couple of useful things in this list: knowing that a finally block isn't called on System.exit() is potentially useful, and knowing that dividing a float by 0.0 doesn't throw an exception is useful.
But don't bother second-guessing the compiler unless it really matters (and I bet you that 99.99% of the time it doesn't).

A better question is:
Which is easier to understand/work with?
This is far more important than a notional difference in performance. Personally, I would point out that performance shouldn't be the criterion for deciding between them here. If they didn't like me challenging their assumption on this, I wouldn't be unhappy about not getting the job. ;)

On a modern Java implementation this is not true.
Summing up the numbers up to one billion as a benchmark:
Java(TM) SE Runtime Environment 1.6.0_05-b13
Java HotSpot(TM) Server VM 10.0-b19
up 1000000000: 1817ms 1.817ns/iteration (sum 499999999500000000)
up 1000000000: 1786ms 1.786ns/iteration (sum 499999999500000000)
up 1000000000: 1778ms 1.778ns/iteration (sum 499999999500000000)
up 1000000000: 1769ms 1.769ns/iteration (sum 499999999500000000)
up 1000000000: 1769ms 1.769ns/iteration (sum 499999999500000000)
up 1000000000: 1766ms 1.766ns/iteration (sum 499999999500000000)
up 1000000000: 1776ms 1.776ns/iteration (sum 499999999500000000)
up 1000000000: 1768ms 1.768ns/iteration (sum 499999999500000000)
up 1000000000: 1771ms 1.771ns/iteration (sum 499999999500000000)
up 1000000000: 1768ms 1.768ns/iteration (sum 499999999500000000)
down 1000000000: 1847ms 1.847ns/iteration (sum 499999999500000000)
down 1000000000: 1842ms 1.842ns/iteration (sum 499999999500000000)
down 1000000000: 1838ms 1.838ns/iteration (sum 499999999500000000)
down 1000000000: 1832ms 1.832ns/iteration (sum 499999999500000000)
down 1000000000: 1842ms 1.842ns/iteration (sum 499999999500000000)
down 1000000000: 1838ms 1.838ns/iteration (sum 499999999500000000)
down 1000000000: 1838ms 1.838ns/iteration (sum 499999999500000000)
down 1000000000: 1847ms 1.847ns/iteration (sum 499999999500000000)
down 1000000000: 1839ms 1.839ns/iteration (sum 499999999500000000)
down 1000000000: 1838ms 1.838ns/iteration (sum 499999999500000000)
Note that the time differences are brittle, small changes somewhere near the loops can turn them around.
Edit:
The benchmark loops are
long sum = 0;
for (int i = 0; i < limit; i++)
{
sum += i;
}
and
long sum = 0;
for (int i = limit - 1; i >= 0; i--)
{
sum += i;
}
Using a sum of type int is about three times faster, but then sum overflows.
With BigInteger it is more than 50 times slower:
BigInteger up 1000000000: 105943ms 105.943ns/iteration (sum 499999999500000000)

Typically real code will run faster counting upwards. There are a few reasons for this:
Processors are optimised for reading memory forwards.
HotSpot (and presumably other bytecode->native compilers) heavily optimise forward loops, but don't bother with backward loops because they happen so infrequently.
Upwards is usually more obvious, and cleaner code is often faster.
So happily doing the right thing will usually be faster. Unnecessary micro-optimisation is evil. I haven't purposefully written backward loops since programming 6502 assembler.

There are really only two ways to answer this question.
To tell you that it really, really doesn't matter, and you're wasting your time even wondering.
To tell you that the only way to know is to run a trustworthy benchmark on your actual production hardware, OS and JRE installation that you care about.
So, I made you a runnable benchmark you could use to try that out here:
http://code.google.com/p/caliper/source/browse/trunk/test/examples/LoopingBackwardsBenchmark.java
This Caliper framework is not really ready for prime time yet, so it may not be totally obvious what to do with this, but if you really care enough you can figure it out. Here are the results it gave on my linux box:
max benchmark ns
2 Forwards 4
2 Backwards 3
20 Forwards 9
20 Backwards 20
2000 Forwards 1007
2000 Backwards 1011
20000000 Forwards 9757363
20000000 Backwards 10303707
Does looping backwards look like a win to anyone?

Are you sure that the interviewer who asks such a question expects a straight answer ("number one is faster" or "number two is faster"), or if this question is asked to provoke a discussion, as is happening in the answers people are giving here?
In general, it's impossible to say which one is faster, because it heavily depends on the Java compiler, JRE, CPU and other factors. Using one or the other in your program just because you think that one of the two is faster without understanding the details to the lowest level is superstitious programming. And even if one version is faster than the other on your particular environment, then the difference is most likely so small that it's irrelevant.
Write clear code instead of trying to be clever.

Such questions have their basis in old best-practice recommendations.
It's all about the comparison: comparing to 0 is known to be faster. Years ago this might have been seen as quite important. Nowadays, especially with Java, I'd rather let the compiler and the VM do their job, and I'd focus on writing code that is easiest to maintain and understand.
Unless there are reasons for doing it otherwise. Remember that Java apps don't always run on HotSpot and/or fast hardware.

With regards for testing for zero in the JVM: it can apparently be done with ifeq whereas testing for anything else requires if_icmpeq which also involves putting an extra value on the stack.
Testing for > 0, as in the question, might be done with ifgt, whereas testing for < 100001 would need if_icmplt.
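If you want to check this yourself, a minimal sketch (the class name is illustrative, and the exact opcodes javac picks depend on how it arranges the loop, so the comments are hedged):
public class LoopBytecode {
    static void down() {
        for (int i = 100000; i > 0; i--) { } // the test against zero should become a one-operand branch (ifgt/ifle)
    }
    static void up() {
        for (int i = 1; i < 100001; i++) { } // the test against 100001 needs the constant on the stack plus if_icmplt/if_icmpge
    }
}
Compile with javac LoopBytecode.java and disassemble with javap -c LoopBytecode to see what your compiler actually emits.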

This is about the dumbest question I have ever seen. The loop body is empty. If the compiler is any good it will just emit no code at all. It doesn't do anything, can't throw an exception and doesn't modify anything outside of its scope.
Assuming your compiler isn't that smart, or that you actually didn't have an empty loop body:
The "backwards loop counter" argument makes sense for some assembly languages (it may make sense to the java byte code too, I don't know it specifically). However, the compiler will very often have the ability to transform your loop to use decrementing counters. Unless you have loop body in which the value of i is explicitly used, the compiler can do this transformation. So again you often see no difference.

I decided to bite and necro back the thread.
Both of the loops are ignored by the JVM as no-ops, so essentially even if one of the loops ran to 10 and the other to 10000000, there would have been no difference.
Looping back to zero is another thing (it corresponds to a jne-style instruction, but again, it's not compiled like that here); the linked site is plain weird (and wrong).
This type of a question doesn't fit any JVM (nor any other compiler that can optimize).

The loops are identical, except for one critical part:
i > 0;
and
i < 100001;
The greater-than-zero check is done by checking the NZP bits (commonly known as the condition codes: Negative, Zero, Positive) of the computer.
The NZP bits are set whenever an operation such as a load, AND, or addition is performed.
The greater-than check cannot directly utilize these bits (and therefore takes a bit longer...). The general solution is to negate one of the values (by doing a bitwise NOT and then adding 1) and then add it to the compared value. If the result is zero, they're equal; if positive, the value that was not negated is greater; if negative, the negated value is greater. This check takes slightly longer than the direct NZP check.
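As a rough illustration of that "negate and add" idea in Java (ignoring overflow; this shows the arithmetic, not what the hardware literally executes):
public class CompareViaNegation {
    public static void main(String[] args) {
        int a = 5, b = 3;
        int negB = ~b + 1;    // two's-complement negation: -b
        int diff = a + negB;  // equivalent to a - b (ignoring overflow)
        if (diff == 0) {
            System.out.println("a == b");
        } else if (diff > 0) {
            System.out.println("a > b");  // prints this: 5 > 3
        } else {
            System.out.println("a < b");
        }
    }
}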
I'm not 100% certain that this is the reason behind it though, but it seems like a possible reason...

The answer is a (as you probably found out on the website)
I think the reason is that the i > 0 condition for terminating the loop is faster to test.

The bottom line is that for any non-performance critical application, the difference is probably irrelevant. As others have pointed out there are times when using ++i instead of i++ could be faster, however, especially in for loops any modern compiler should optimize that distinction away.
That said, the difference probably has to do with the underlying instructions that get generated for the comparison. Testing if a value is equal to 0 is essentially just a NOR across its bits, whereas testing if a value is equal to an arbitrary constant requires loading that constant into a register and then comparing the two registers. (This probably would require an extra gate delay or two.) That said, with pipelining and modern ALUs I'd be surprised if the distinction was significant to begin with.

I've been running tests for about 15 minutes now, with nothing running other than Eclipse just in case, and I saw a real difference; you can try it out.
When I timed how long Java takes to do "nothing", it took around 500 nanoseconds, just to give an idea.
Then I tested how long it takes to run a for statement where it increases:
for(i=0;i<100;i++){}
Then five minutes later I tried the "backwards" one:
for(i=100;i>0;i--)
And I got a huge difference (at a tiny, tiny level) of 16% between the first and the second for statements, the latter being 16% faster.
Average time for running the "increasing" for statement during 2000 tests: 1838 n/s
Average time for running the "decreasing" for statement during 2000 tests: 1555 n/s
Code used for such tests:
public static void main(String[] args) {
    long time = 0;
    for (int j = 0; j < 100; j++) {
        long startTime = System.nanoTime();
        int i;
        /*for (i = 0; i < 100; i++) {
        }*/
        for (i = 100; i > 0; i--) {
        }
        long endTime = System.nanoTime();
        time += (endTime - startTime);
    }
    time = time / 100;
    System.out.print("Time: " + time);
}
Conclusion:
The difference is basically nothing. It already takes a significant amount of "nothing" to do "nothing" relative to the for-statement tests, making the difference between them negligible; just importing a library such as java.util.Scanner takes far longer than running a for statement. This will not improve your application's performance significantly, but it's still really cool to know.

Related

Optimise modulo code with various divisors aside from 2

I know that we can optimise "find even numbers" code by using bitwise operator &. The following program:
if(i%2==0) sout("even")
else sout("odd")
can be optimised to:
if(i&1==0) sout("even")
else sout("odd")
The above approach works only for 2 as a divisor. What if we have to optimise the code when we have multiple divisors like 4, 9, 20, 56 and so on? Is there a way to further optimise this code?
You obviously didn't even try what you posted, because it doesn't compile (even with a reasonable sout added). First, expression statements in Java end in a semicolon, and second, i&1==0 parses as i & (1==0) -> i & true, and the & operator doesn't take an int and a boolean.
If i is negative and odd, i%2 is -1 while i&1 is +1. That's because % is remainder, not modulo.
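A quick sketch you can run to see that difference for negative operands (the class name is just illustrative):
public class RemainderVsMask {
    public static void main(String[] args) {
        int i = -7;
        System.out.println(i % 2);  // -1 : % is a remainder and keeps the sign of the dividend
        System.out.println(i & 1);  //  1 : masking looks only at the low bit
        // So "i % 2 == 0" and "(i & 1) == 0" agree on evenness,
        // but "i % 2 == 1" silently stops matching negative odd numbers.
    }
}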
In the limited cases where i%n and (i&(n-1)) are the same -- i nonnegative and n a power of two -- as the commenters said the Java runtime compiler (JIT) will actually produce the same code for both and obfuscating the source will only make your program more likely to be or become wrong without providing any benefit.
Fifty years ago when people were writing in assembler for machines with microsecond clocks (i.e. not even megaHertz much less gigaHertz) this sometimes helped -- only sometimes because usually only a small fraction of the code matters to execution time. In this century it's at best a waste and often harmful.

Java getChars method in Integer class, why is it using bitwise operations instead of arithmetic?

So I was examining the Integer class's source code (JDK 8) to understand how an int gets converted to a String. It seems to use a package-private method called getChars (line 433) to convert an int to a char array.
While the code is not that difficult to understand, there are multiple lines of code where they use bitwise shift operations instead of simple arithmetic multiplication/division, such as the following lines of code:
// really: r = i - (q * 100);
r = i - ((q << 6) + (q << 5) + (q << 2));
and
q = (i * 52429) >>> (16+3);
r = i - ((q << 3) + (q << 1)); // r = i-(q*10) ...
I just do not understand the point of doing that, is this actually an optimization and does it affect the runtime of the algorithm?
Edit:
To put it another way: since the compiler does this type of optimization internally, is this manual optimization necessary?
I don't know the reason for this specific change and unless you find the original author, it's unlikely you'll find an authoritative answer to that anyway.
But I'd like to respond to the wider point, which is that a lot of code in the runtime library (java.* and many internal packages) is optimized to a degree that would be very unusual (and I dare say irresponsible) to apply to "normal" application code.
And that has basically two reasons:
It's called a lot and in many different environments. Optimizing a method in your server to take 0.1% less CPU time when it's only executed 50 times per day on 3 servers won't be worth the effort you put into it. If, however, you can make Integer.toString 0.1% faster for everyone who will ever execute it, then this can turn into a very big change indeed.
If you optimize your application code on a specific VM then updating that VM to a newer version can easily undo your optimization, when the compiler decides to optimize differently. With code in java.* this is far less of an issue, because it is always shipped with the runtime that will run it. So if they introduce a compiler change that makes a given optimization no longer optimal, then they can change the code to match this.
tl;dr java.* code is often optimized to an insane degree because it's worth it and they can know that it will actually work.
There are a couple of reasons that this is done. Being a long-time embedded developer, using tiny microcontrollers that sometimes didn't even have multiplication or division instructions, I can tell you that this is significantly faster. The key here is that the multiplier is a constant. If you were multiplying two variables, you'd either need to use the slower multiply and divide operators or, if they didn't exist, perform multiplication using a loop with the add operator.
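To see that the shift-and-add forms in the question really are just a constant multiply and a divide-by-10, here is a small check you can run (the loop bounds are my own illustrative choices; the JDK applies the 52429 trick only to small values so the 32-bit product cannot overflow):
public class GetCharsIdentities {
    public static void main(String[] args) {
        // (q << 6) + (q << 5) + (q << 2) == 64q + 32q + 4q == 100q
        // (q << 3) + (q << 1)            ==  8q +  2q      ==  10q
        for (int q = 0; q < 1000000; q++) {
            if (((q << 6) + (q << 5) + (q << 2)) != q * 100) throw new AssertionError("100*q failed at " + q);
            if (((q << 3) + (q << 1)) != q * 10) throw new AssertionError("10*q failed at " + q);
        }
        // (i * 52429) >>> (16 + 3) approximates i / 10; it is exact while i * 52429 stays below 2^32
        for (int i = 0; i <= 65536; i++) {
            if (((i * 52429) >>> 19) != i / 10) throw new AssertionError("i/10 failed at " + i);
        }
        System.out.println("All identities hold over the tested ranges.");
    }
}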

How fast is left shift (<<2) compared to multiply by 2(*2) in java? [duplicate]

Shifting bits left and right is apparently faster than multiplication and division operations on most, maybe even all, CPUs if you happen to be using a power of 2. However, it can reduce the clarity of code for some readers and some algorithms. Is bit-shifting really necessary for performance, or can I expect the compiler or VM to notice the case and optimize it (in particular, when the power-of-2 is a literal)? I am mainly interested in the Java and .NET behavior but welcome insights into other language implementations as well.
Almost any environment worth its salt will optimize this away for you. And if it doesn't, you've got bigger fish to fry. Seriously, do not waste one more second thinking about this. You will know when you have performance problems. And after you run a profiler, you will know what is causing it, and it should be fairly clear how to fix it.
You will never hear anyone say "my application was too slow, then I started randomly replacing x * 2 with x << 1 and everything was fixed!" Performance problems are generally solved by finding a way to do an order of magnitude less work, not by finding a way to do the same work 1% faster.
Most compilers today will do more than convert multiply or divide by a power-of-two to shift operations. When optimizing, many compilers can optimize a multiply or divide with a compile time constant even if it's not a power of 2. Often a multiply or divide can be decomposed to a series of shifts and adds, and if that series of operations will be faster than the multiply or divide, the compiler will use it.
For division by a constant, the compiler can often convert the operation to a multiply by a 'magic number' followed by a shift. This can be a major clock-cycle saver since multiplication is often much faster than a division operation.
Henry Warren's book, Hacker's Delight, has a wealth of information on this topic, which is also covered quite well on the companion website:
http://www.hackersdelight.org/
See also a discussion (with a link or two ) in:
Reading assembly code
Anyway, all this boils down to allowing the compiler to take care of the tedious details of micro-optimizations. It's been years since doing your own shifts outsmarted the compiler.
Humans are wrong in these cases:
99% of the time when they try to second-guess modern (and all future) compilers.
99.9% of the time when they try to second-guess modern (and all future) JITs at the same time.
99.999% of the time when they try to second-guess modern (and all future) CPU optimizations.
Program in a way that accurately describes what you want to accomplish, not how to do it. Future versions of the JIT, VM, compiler, and CPU can all be independently improved and optimized. If you specify something so tiny and specific, you lose the benefit of all future optimizations.
You can almost certainly depend on the literal-power-of-two multiplication optimisation to a shift operation. This is one of the first optimisations that students of compiler construction will learn. :)
However, I don't think there's any guarantee for this. Your source code should reflect your intent, rather than trying to tell the optimiser what to do. If you're making a quantity larger, use multiplication. If you're moving a bit field from one place to another (think RGB colour manipulation), use a shift operation. Either way, your source code will reflect what you are actually doing.
Note that shifting down and division will (in Java, certainly) give different results for negative, odd numbers.
int a = -7;
System.out.println("Shift: "+(a >> 1));
System.out.println("Div: "+(a / 2));
Prints:
Shift: -4
Div: -3
Since Java doesn't have any unsigned numbers it's not really possible for a Java compiler to optimise this.
On computers I tested, integer divisions are 4 to 10 times slower than other operations.
While compilers may replace divisions by powers of 2 so that you see no difference, divisions by values that are not powers of 2 are significantly slower.
For example, I have a (graphics) program with many many many divisions by 255.
Actually my computation is:
r = (((top.R - bottom.R) * alpha + (bottom.R * 255)) * 0x8081) >> 23;
I can assure you that it is a lot faster than my previous computation:
r = ((top.R - bottom.R) * alpha + (bottom.R * 255)) / 255;
so no, compilers cannot do all the tricks of optimization.
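For what it's worth, you can check the 0x8081 trick yourself over the range an 8-bit blend can produce (the 65025 bound is my assumption based on 8-bit channels; the trick does not hold for arbitrarily large x, and the intermediate product would eventually overflow an int):
public class MagicDivide255 {
    public static void main(String[] args) {
        for (int x = 0; x <= 255 * 255; x++) {
            int fast = (x * 0x8081) >> 23;   // multiply by a "magic number", then shift
            int exact = x / 255;
            if (fast != exact) {
                throw new AssertionError("Mismatch at x=" + x + ": " + fast + " != " + exact);
            }
        }
        System.out.println("(x * 0x8081) >> 23 == x / 255 for all x in [0, 65025]");
    }
}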
I would ask "what are you doing that it would matter?". First design your code for readability and maintainability. The likelihood that doing bit shifting versus standard multiplication will make a performance difference is EXTREMELY small.
It is hardware dependent. If we are talking micro-controller or i386, then shifting might be faster but, as several answers state, your compiler will usually do the optimization for you.
On modern (Pentium Pro and beyond) hardware the pipelining makes this totally irrelevant, and straying from the beaten path usually means you lose a lot more optimizations than you can gain.
Micro optimizations are not only a waste of your time, they are also extremely difficult to get right.
If the compiler (compile-time constant) or JIT (runtime constant) knows that the divisor or multiplicand is a power of two and integer arithmetic is being performed, it will convert it to a shift for you.
According to the results of this microbenchmark, shifting is twice as fast as dividing (Oracle Java 1.7.0_72).
Most compilers will turn multiplication and division into bit shifts when appropriate. It is one of the easiest optimizations to do. So, you should do what is more easily readable and appropriate for the given task.
I am stunned as I just wrote this code and realized that shifting by one is actually slower than multiplying by 2!
(EDIT: changed the code to stop overflowing after Michael Myers' suggestion, but the results are the same! What is wrong here?)
import java.util.Date;

public class Test {
    public static void main(String[] args) {
        Date before = new Date();
        for (int j = 1; j < 50000000; j++) {
            int a = 1;
            for (int i = 0; i < 10; i++) {
                a *= 2;
            }
        }
        Date after = new Date();
        System.out.println("Multiplying " + (after.getTime() - before.getTime()) + " milliseconds");

        before = new Date();
        for (int j = 1; j < 50000000; j++) {
            int a = 1;
            for (int i = 0; i < 10; i++) {
                a = a << 1;
            }
        }
        after = new Date();
        System.out.println("Shifting " + (after.getTime() - before.getTime()) + " milliseconds");
    }
}
The results are:
Multiplying 639 milliseconds
Shifting 718 milliseconds

How can I prove that one algorithm is faster than another in Java

Is there anything in Java that would allow me to take a code snippet and see exactly how many "ticks" it takes to execute? I want to prove that an algorithm I wrote is faster than another.
"Ticks"? No. I'd recommend that you run them several times each and compare the average results:
public class AlgorithmDriver {
    public static void main(String[] args) {
        int numTries = 1000000;
        long begTime = System.currentTimeMillis();
        for (int i = 0; i < numTries; ++i) {
            Algorithm.someMethodCall();
        }
        long endTime = System.currentTimeMillis();
        System.out.printf("Total time for %10d tries: %d ms\n", numTries, (endTime - begTime));
    }
}
You probably are asking two different questions:
How can you measure the run time of a java implementation (Benchmarking)
How can you prove the asymptotic run time of an algorithm
For the first of these I wouldn't use the solutions posted here. They are mostly not quite right. First, it's probably better to use System.nanoTime than System.currentTimeMillis. Second, you need to use a try/finally block. Third, collect statistics over many runs of your code, accumulating them outside of the measured section, so that you can get a more complete picture.
Run code that looks vaguely like this many times:
long totalTime = 0;
long startTime = System.nanoTime();
try {
    // method to test
} finally {
    totalTime = System.nanoTime() - startTime;
}
Getting benchmarking correct is hard. For example, you must let your code "warm up" for a few minutes before testing it. Benchmark early and often, but don't put too much faith in your benchmarks. Particularly small micro-benchmarks almost always lie in one way or another.
The second way to interpret your question is about asymptotic run times. The truth is this has almost nothing to do with Java, it is general computer science. Here the question we want to ask is: what curves describe the behavior of the run time of our algorithm in terms of the input size.
The first thing is to understand Big-Oh notation. I'll do my best, but SO doesn't support math notation. O(f(n)) denotes a set of algorithms such that, in the limit as n goes to infinity, f(n) is within a constant factor of an upper bound on the algorithm's run time. Formally, T(n) is in O(f(n)) iff there exist some constant n0 and some constant c such that for all n > n0, c*f(n) >= T(n). Big Omega is the same thing, except for lower bounds, and big Theta f(n) just means it's both big Oh f(n) and big Omega f(n). This is not too hard.
Well, it gets a little more complicated because we can talk about different kinds of run time, i.e. "average case", best case, and worst case. For example, quicksort is normally O(n^2) in the worst case, but O(n log n) for random lists.
So I skipped over what T(n) means. Basically it is the number of "ticks." Some machine instructions (like reading from memory) take much longer than others (like adding). But, so long as they are only a constant factor apart from each other, we can treat them all as the same for the purposes of big Oh, since it will just change the value of c.
Proving asymptotic bounds isn't that hard. For simple structured programming problems you just count
public int square(int n) {
    int sum = 0;
    for (int i = 0; i < n; i++) {
        sum += n;
    }
    return sum;
}
In this example we have one instruction each for: initializing sum, initializing i, and returning the value. The loop happens n times, and each time we do a comparison, an addition, and an increment. So we have T(n) = 3 + 3n; using n0 of 2 and c of 4 we can easily prove this is in O(n). You can always safely simplify big Oh expressions by removing excess constant terms and by dividing by constant multiples.
When you are faced with a recursive function you have to solve a recurrence relation. If you have a function like T(n) = 2*T(n/2) + O(1), you want to find a closed-form solution. You sometimes have to do this by hand, or with a computer algebra system. For this example, using forward substitution, we can see the pattern (in an abuse of notation) T(1) = O(1), T(2) = O(3), T(4) = O(7), T(8) = O(15); this looks a lot like O(2n - 1). To prove this is the right value:
T(n) = 2*T(n/2) + 1
T(n) = 2*(2(n/2) - 1) + 1
T(n) = 2*(n-1) + 1
T(n) = 2n - 2 + 1
T(n) = 2n - 1
As we saw earlier, you can simplify O(2n - 1) to O(n).
More often, though, you can use the master theorem, which is a mathematical tool for saving you time on this kind of problem. If you check Wikipedia you can find the master theorem; if you plug the example above into it, you get the same answer.
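For the recurrence above, the master theorem application looks roughly like this (sketched in the same informal notation used earlier):
T(n) = 2*T(n/2) + O(1), so a = 2, b = 2 and f(n) = O(1) = O(n^0).
n^(log_b a) = n^(log_2 2) = n, and f(n) grows strictly slower than that,
so case 1 of the master theorem applies: T(n) = Theta(n^(log_2 2)) = Theta(n),
which matches the O(n) we got by forward substitution.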
For more, check out an algorithms text book like Levitin's "The Design & Analysis of Algorithms"
You could use System.currentTimeMillis() to get start and end times.
long start = System.currentTimeMillis();
// your code
long end = System.currentTimeMillis();
System.out.println( "time: " + (end - start) );
You can measure wall time with System.currentTimeMillis() or System.nanoTime() (which have different characteristics). This is relatively easy as you just have to print out the differences at the end.
If you need to count specific operations (which is common in algorithms), the easiest approach is to simply increment a counter when the operations are being done, and then print it when you are done; long is well suited for this. For multiple operations, use multiple counters, as in the sketch below.
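A minimal sketch of that idea (linearSearch and the counter are just illustrative, not a fixed API):
public class ComparisonCounter {
    static long comparisons = 0;

    static int linearSearch(int[] a, int key) {
        for (int i = 0; i < a.length; i++) {
            comparisons++;              // one "tick" per comparison against key
            if (a[i] == key) return i;
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] data = {5, 3, 9, 1, 7};
        linearSearch(data, 7);
        System.out.println("Comparisons: " + comparisons);
    }
}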
I had to do these algorithm efficiency proofs in my Data Structures course this year.
First, I measured the time as mentioned above.
Then I increased the method's input size each time (10, 100, 1000, ...).
Lastly, I put the time measurements in an Excel file and drew graphs for these time values.
This way, you can roughly check whether one algorithm is faster than another.
I would:
Come up with a few data sets for the current algorithm: a set where it performs well, a set where it performs ok, and a data set where it performs poorly. You want to show that your new algorithm outperforms the current one for each scenario.
Run and measure the performance of each algorithm multiple times for increasing input sizes of each of the three data sets, then take average, standard deviation etc. Standard deviation will show a crude measure of the consistency of the algorithm performance.
Finally look at the numbers and decide in your case what is important: which algorithm's performance is more suitable for the type of input you will have most of the time, and how does it degrade when the inputs are not optimal.
Timing the algorithm is not necessarily everything - would memory footprint be important as well? One algorithm might be better computationally but it might create more objects while it runs.. etc. Just trying to point out there is more to consider than purely timing!
I wouldn't use the current time in ms as some of the others have suggested. The methods provided by ThreadMXBean are more accurate (I dare not say 100% accurate).
They actually measure the CPU time taken by the thread, rather than elapsed system time, which may be skewed due to context switches performed by the underlying OS.
Java Performance Testing
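A minimal sketch of measuring per-thread CPU time that way (the summing loop is just a placeholder workload):
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

public class CpuTimeTimer {
    public static void main(String[] args) {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        // CPU-time measurement may be unsupported or disabled on some JVMs
        if (!bean.isCurrentThreadCpuTimeSupported()) {
            System.out.println("Thread CPU time not supported on this JVM");
            return;
        }
        long startCpu = bean.getCurrentThreadCpuTime(); // nanoseconds of CPU time for this thread
        long sum = 0;
        for (int i = 0; i < 10000000; i++) {            // placeholder workload
            sum += i;
        }
        long cpuNanos = bean.getCurrentThreadCpuTime() - startCpu;
        System.out.println("CPU time: " + cpuNanos + " ns (sum=" + sum + ")");
    }
}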
I am not too familiar with the Java framework, but I would do it the following way:
Define a set of test cases (mostly example data) that can be used for both algorithms
Implement a timing method to measure the amount of time that a specific function takes
Create a for loop and execute method A (repeatedly, e.g. 1000 times with the whole test data). Measure the timing of the whole loop, not the sum of the individual calls, since timing functions can bias your result when called a lot.
Do the same for method B
Compare your result and choose a winner
If both algorithms have the same definition of a macro-level "tick" (e.g. walking one node in a tree) and your goal is to prove that your algorithm accomplishes its goal in a lower number of those macro-level ticks than the other, then by far the best way is to just instrument each implementation to count those ticks. That approach is ideal because it doesn't reward low-level implementation tricks that can make code execute faster but are not algorithm-related.
If you don't have that luxury, but you are trying to calculate which approach solves the problem using the least amount of CPU resources, contrary to the approaches listed here involving System.currentTimeMillis etc, I would use an external approach: the linux time command would be ideal. You have each program run on the same set of (large) inputs, preferably that take on the order of minutes or hours to process, and just run time java algo1 vs time java algo2.
If your purpose is to compare the performance of two pieces of code, the best way is to use JMH. You can import it via Maven, and it is now official in OpenJDK 12.
https://openjdk.java.net/projects/code-tools/jmh/
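A minimal JMH sketch comparing the two loop directions (the bound and class name are illustrative; you would normally run it through the JMH Maven archetype or its Runner):
import org.openjdk.jmh.annotations.Benchmark;

public class LoopDirectionBenchmark {

    @Benchmark
    public long countUp() {
        long sum = 0;
        for (int i = 0; i < 100000; i++) {
            sum += i;
        }
        return sum; // returning the result keeps the JIT from eliminating the loop
    }

    @Benchmark
    public long countDown() {
        long sum = 0;
        for (int i = 100000 - 1; i >= 0; i--) {
            sum += i;
        }
        return sum;
    }
}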

Counting down to zero in contrast to counting up to length - 1

Is it recommended to count in small loops (where possible) down from length - 1 to zero
instead of counting up to length - 1?
1.) Counting down
for (int i = a.length - 1; i >= 0; i--) {
if (a[i] == key) return i;
}
2.) Counting up
for (int i = 0; i < a.length; i++) {
if (a[i] == key) return i;
}
The first one is slightly faster than the second one (because comparing to zero is faster), but it is a little more error-prone in my opinion. Besides, the first one could maybe not be optimized by future improvements of the JVM. Any ideas on that?
If you store the result of a.length in a variable, it won't be any "faster", if it actually is at all. In any event, it is rarely worth worrying about the performance of such a trivial operation. Focus on the readability of the method.
For me, counting up is more readable.
In my opinion, it's far better to favour convention and readability (in this case, the count-up approach) over preemptive optimisation. According to Josh Bloch, it's better not to optimise your code until you are sure that optimisation is required.
Counting downwards tends to be slower, despite the possibility of dropping one machine code instruction. In the modern day, performance ain't that simple. Compilers have optimisations geared towards forward loops, so your reverse loop may miss out on optimisation. Cache hardware is designed for normal forward scanning. So don't worry about this sort of micro-optimisation (and if you ever find yourself in a situation where you really need to, measure).
I would recommend you to make sure you have benchmark showing that this is a performance issue before doing too much changes like this. I'd go for the most readable any day (in my opinion it's the one counting upwards).
If you are into micro optimizations and don't trust the compiler to do the right thing, maybe you should consider caching a.length in a variable in the second loop to avoid an indirection as well.
I'd say if there is a reason to count one way vs. the other (say order of the items in the list) then don't twist yourself in a knot trying to go with convention (From experience, count up); If there isn't - make it easier for the next person to work on and just go with convention.
Comparing to 0 vs. comparing to int shouldn't really be a concern...
"We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil", Donald Knuth.
In this case I would argue that any possible performance gain would be outweighed by the loss of readability alone. Programmer hours are much more expensive than cpu hours.
P.S.: To further improve performance you should consider testing for inequality to zero. But watch out for empty arrays ;)
