After playing around with a simple palindrome function, I was surprised by the performance difference between two approaches.
public static boolean checkPalindrome(String inputString) {
    String[] arr = inputString.split("");
    for (int i = 0; i < arr.length / 2; i++) {
        if (!arr[i].equals(arr[arr.length - (i + 1)]))
            return false;
    }
    return true;
}
In this function I am only iterating through half of the array.
In the following one, I would imagine the whole string is traversed and a new object is created via the builder.
public static boolean checkPalindrome2(String inputString) {
    return inputString.equals(new StringBuilder(inputString).reverse().toString());
}
I was extremely surprised to find that the first function has an average execution time of 550146 nanoseconds, measured using System.nanoTime(), while the second averages 61665 nanoseconds, which is almost a tenfold difference in performance.
Could anybody help explain what is happening here?
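For reference, a character-index version that avoids both the array produced by split("") and the reversed copy would look roughly like this (a sketch I have not timed against the two functions above):

public static boolean checkPalindrome3(String inputString) {
    // Walk inward from both ends, comparing characters directly.
    for (int i = 0, j = inputString.length() - 1; i < j; i++, j--) {
        if (inputString.charAt(i) != inputString.charAt(j)) {
            return false;
        }
    }
    return true;
}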
Related
I'm relatively new to Java programming, and I'm running into an issue calculating the amount of time it takes for a function to run.
First some background - I've got a lot of experience with Python, and I'm trying to recreate the functionality of the Jupyter Notebook/Lab %%timeit function, if you're familiar with that. Here's a pic of it in action (sorry, not enough karma to embed yet):
Snip of Jupyter %%timeit
What it does is run the contents of the cell (in this case a recursive function) either 1k, 10k, or 100k times, and give you the average run time of the function, and the standard deviation.
My first implementation (using the same recursive function) used System.nanoTime():
public static void main(String[] args) {
    long t1, t2, diff;
    long[] times = new long[1000];
    int t;
    for (int i = 0; i < 1000; i++) {
        t1 = System.nanoTime();
        t = triangle(20);
        t2 = System.nanoTime();
        diff = t2 - t1;
        System.out.println(diff);
        times[i] = diff;
    }
    long total = 0;
    for (int j = 0; j < times.length; j++) {
        total += times[j];
    }
    System.out.println("Mean = " + total / 1000.0);
}
But the mean is wildly thrown off -- for some reason, the first iteration of the function (on many runs) takes upwards of a million nanoseconds:
Pic of initial terminal output
Every iteration after the first dozen or so takes either 395 nanos or 0 -- so there could be a problem there too... not sure what's going on!
Also -- the code of the recursive function I'm timing:
static int triangle(int n) {
    if (n == 1) {
        return n;
    } else {
        return n + triangle(n - 1);
    }
}
Initially I had the line n = Math.abs(n) on the first line of the function, but then I removed it because... meh. I'm the only one using this.
I tried a number of different suggestions brought up in this SO post, but they each have their own problems... which I can go into if you need.
Anyway, thank you in advance for your help and expertise!
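One way to get closer to %%timeit-style numbers is to discard a warm-up phase (so the early runs, before the JIT has compiled the method, don't skew the average) and then report the mean and standard deviation of the remaining samples. A rough sketch, with the warm-up and sample counts chosen arbitrarily:

public class TimeitSketch {

    public static void main(String[] args) {
        int warmup = 1_000;     // runs discarded so the JIT can compile triangle()
        int measured = 10_000;  // runs that actually count
        long[] samples = new long[measured];
        long sink = 0;          // keep results live so the JIT cannot drop the calls

        for (int i = 0; i < warmup; i++) {
            sink += triangle(20);
        }
        for (int i = 0; i < measured; i++) {
            long t1 = System.nanoTime();
            sink += triangle(20);
            samples[i] = System.nanoTime() - t1;
        }

        double mean = 0;
        for (long s : samples) mean += s;
        mean /= measured;
        double variance = 0;
        for (long s : samples) variance += (s - mean) * (s - mean);
        variance /= measured;

        System.out.printf("mean = %.1f ns, std dev = %.1f ns (sink = %d)%n",
                mean, Math.sqrt(variance), sink);
    }

    static int triangle(int n) {
        return n == 1 ? n : n + triangle(n - 1);
    }
}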
I recently built a Fibonacci generator that uses recursion and hashmaps to reduce complexity. I am using System.nanoTime() to keep track of the time it takes my program to print 10000 Fibonacci numbers. It started out well at less than a second but gradually became slower, and now it takes more than 4 seconds. Can someone explain why this might be happening? The code is below:
import java.util.*;
import java.math.*;

public class FibonacciGeneratorUnlimited {

    static int numFibCalls = 0;
    static HashMap<Integer, BigInteger> d = new HashMap<Integer, BigInteger>();
    static Scanner fibNumber = new Scanner(System.in);
    static BigInteger ans = new BigInteger("0");

    public static void main(String[] args) {
        d.put(0, new BigInteger("0"));
        d.put(1, new BigInteger("1"));
        System.out.print("Enter the term:\t");
        int n = fibNumber.nextInt();
        long startTime = System.nanoTime();
        for (int i = 0; i <= n; i++) {
            System.out.println(i + " : " + fib_efficient(i, d));
        }
        System.out.println((double) (System.nanoTime() - startTime) / 1000000000);
    }

    public static BigInteger fib_efficient(int n, HashMap<Integer, BigInteger> d) {
        numFibCalls += 1;
        if (d.containsKey(n)) {
            return (d.get(n));
        } else {
            ans = (fib_efficient(n - 1, d).add(fib_efficient(n - 2, d)));
            d.put(n, ans);
            return ans;
        }
    }
}
If you are restarting the program every time you make a new Fibonacci sequence, then your program most likely isn't the problem. It might just be that your processor got hot after running the program a few times, or that a background process on your computer suddenly started, causing your program to slow down.
Give the JVM more memory (java -Xmx...), or cache less aggressively:
public static BigInteger fib_efficient(int n, HashMap<Integer, BigInteger> d) {
    numFibCalls++;
    if ((n & 3) <= 1) { // only values with n % 4 == 0 or 1 are cached
        BigInteger cached = d.get(n);
        if (cached != null) {
            return cached;
        } else {
            BigInteger ans = fib_efficient(n - 1, d).add(fib_efficient(n - 2, d));
            d.put(n, ans);
            return ans;
        }
    } else {
        return fib_efficient(n - 1, d).add(fib_efficient(n - 2, d));
    }
}
Two consecutive numbers out of every four are cached, in order to stop the recursion on both branches of
fib(n) = fib(n-1) + fib(n-2)
BigInteger isn't the nicest class where performance and memory are concerned.
It started out well at less than a second but gradually became slower, and now it takes more than 4 seconds.
What do you mean by this? Do you mean that you ran this exact same program with the same input and its run-time changed from < 1 second to > 4 seconds?
If you have the same exact code running with the same exact inputs in a deterministic algorithm...
then the differences are probably external to your code - maybe other processes are taking up more CPU on one run.
Do you mean that you increased the inputs from some value X to 10,000 and now it takes > 4 seconds?
Then that's just a matter of the algorithm taking longer with larger inputs, which is perfectly normal.
recursion and hashmaps to reduce complexity
That's not quite how complexity works. You have improved the best-case and the average-case, but you have done nothing to change the worst-case.
Now for some actual performance improvement advice
Stop printing out the results... that's eating up over 99% of your processing time. Seriously, though, switch out "System.out.println(i + " : " + fib_efficient(i, d))" with "fib_efficient(i,d)" and it'll execute over 100x faster.
Concatenating strings and printing to console are very expensive processes.
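If you still want to see the output, a middle ground (not from the original answer) is to build it all in one StringBuilder and write it to the console once. Inside the question's main, reusing fib_efficient, d, and n, that would look roughly like:

StringBuilder out = new StringBuilder();
for (int i = 0; i <= n; i++) {
    out.append(i).append(" : ").append(fib_efficient(i, d)).append('\n');
}
System.out.print(out); // one write instead of n + 1 separate println calls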
It happens because the complexity of this Fibonacci code is O(n^2). This means that the time grows quadratically with the input, as you can see in the graph for O(n^2) in this link. Check this answer for a complete explanation of its complexity.
Now, the cost of your algorithm also increases because you are using a HashMap to search and insert elements each time the function is invoked. Consider removing this HashMap.
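Note that if the map is simply dropped from the recursive version, the memoization disappears and the recursion becomes exponential; the simplest map-free alternative is an iterative loop that carries only the last two values. A sketch (not part of the original answer), using n as read in the question's main:

BigInteger a = BigInteger.ZERO; // fib(0)
BigInteger b = BigInteger.ONE;  // fib(1)
for (int i = 0; i < n; i++) {
    BigInteger next = a.add(b);
    a = b;
    b = next;
}
// a now holds fib(n)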
Please have a look at the following code
// Divide the hash into a set of 3-character pieces
private void devideHash(String str)
{
    int lastIndex = 0;
    for (int i = 0; i <= str.length(); i = i + 3)
    {
        lastIndex = i;
        try
        {
            String stringPiece = str.substring(i, i + 3);
            // pw.println(stringPiece);
            hashSet.add(stringPiece);
        }
        catch (Exception arr)
        {
            String stringPiece = str.substring(lastIndex, str.length());
            // pw.println(stringPiece);
            hashSet.add(stringPiece);
        }
    }
}
The above method receives a String like abcdefghijklmnop as the parameter. Inside the method, its job is to divide the string into sets of 3 letters, so when the operation is completed the hashSet will have pieces like abc def ghi jkl mno p.
But the problem is that if the input String is big, this loop takes a noticeable amount of time to complete. Is there any way to speed this process up?
As an option, you could replace all your code with this line:
private void divideHash(String str) {
    hashSet.addAll(Arrays.asList(str.split("(?<=\\G...)")));
}
This performs well: the \G boundary matcher anchors each split point at the end of the previous match, so the string is split after every third character (the last piece may be shorter).
Here's some test code:
String str = "abcdefghijklmnop";
hashSet.addAll(Arrays.asList(str.split("(?<=\\G...)")));
System.out.println(hashSet);
Output:
[jkl, abc, ghi, def, mno, p]
There is not much we can tell unless you tell us what that "noticeable amount of time" is and what the expected time is. It is recommended that you run a profiler to find which logic takes the most time.
Some recommendations I can give from briefly reading your code are:
If the result Set is going to be huge, it will involve lots of resizing and rehashing as your HashSet grows. It is recommended that you allocate the required capacity up front, e.g.
HashSet<String> hashSet = new HashSet<>(str.length() / 3 + 1, 1.0f);
This will save you a lot of time otherwise spent on unnecessary rehashing.
Never use exceptions to control your program flow.
Why not simply do:
for (int i = 0; i < str.length(); i += 3) {
    if (i + 3 > str.length()) {
        hashSet.add(str.substring(i));          // last, possibly shorter piece
    } else {
        hashSet.add(str.substring(i, i + 3));
    }
}
This Java method gets used in benchmarks for simulating slow computation:
static int slowItDown() {
    int result = 0;
    for (int i = 1; i <= 1000; i++) {
        result += i;
    }
    return result;
}
This is IMHO a very bad idea, as its body can get replaced by return 500500. This seems to never happen [1]; probably because such an optimization is irrelevant for real code, as Jon Skeet stated.
Interestingly, a slightly simpler method with result += 1; gets fully optimized away (caliper reports 0.460543 ns).
But even when we agree that optimizing away methods returning a constant result is useless for real code, there's still loop unrolling, which could lead to something like
static int slowItDown() {
    int result = 0;
    for (int i = 1; i <= 1000; i += 2) {
        result += 2 * i + 1;
    }
    return result;
}
So my question remains: Why is no optimization performed here?
[1] Contrary to what I wrote originally; I must have seen something that wasn't there.
Well, the JVM does optimize away such code. The question is how many times it has to be detected as a real hotspot (benchmarks usually do more than this single method) before it will be analyzed this way. In my setup it required 16830 invocations before the execution time went to (almost) zero.
It's correct that such code does not appear in real code. However, it might remain after several inlining operations of other hotspots dealing with values that are not compile-time constants but runtime constants or de-facto constants (values that could change in theory but don't in practice). When such a piece of code remains, it's a great benefit to optimize it away entirely, but that is not expected to happen right away, i.e. when calling directly from the main method.
Update: I simplified the code and the optimization came even earlier.
public static void main(String[] args) {
    final int inner = 10;
    final float innerFrac = 1f / inner;
    int count = 0;
    for (int j = 0; j < Integer.MAX_VALUE; j++) {
        long t0 = System.nanoTime();
        for (int i = 0; i < inner; i++) slowItDown();
        long t1 = System.nanoTime();
        count += inner;
        final float dt = (t1 - t0) * innerFrac;
        System.out.printf("execution time: %.0f ns%n", dt);
        if (dt < 10) break;
    }
    System.out.println("after " + count + " invocations");
    System.out.println(System.getProperty("java.version"));
    System.out.println(System.getProperty("java.vm.version"));
}

static int slowItDown() {
    int result = 0;
    for (int i = 1; i <= 1000; i++) {
        result += i;
    }
    return result;
}
…
execution time: 0 ns
after 15300 invocations
1.7.0_13
23.7-b01
(64Bit Server VM)
Can anyone explain why the following recursive method is faster than the iterative one (both are doing string concatenation)? Isn't the iterative approach supposed to beat the recursive one? Plus, each recursive call adds a new layer on top of the stack, which can be very space inefficient.
private static void string_concat(StringBuilder sb, int count) {
    if (count >= 9999) return;
    string_concat(sb.append(count), count + 1);
}

public static void main(String[] arg) {
    long s = System.currentTimeMillis();
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < 9999; i++) {
        sb.append(i);
    }
    System.out.println(System.currentTimeMillis() - s);

    s = System.currentTimeMillis();
    string_concat(new StringBuilder(), 0);
    System.out.println(System.currentTimeMillis() - s);
}
I ran the program multiple times, and the recursive one always ends up 3-4 times faster than the iterative one. What could be the main reason the iterative one is slower?
See my comments.
Make sure you learn how to properly microbenchmark. You should be timing many iterations of both and averaging these for your times. Aside from that, you should make sure the VM isn't giving the second an unfair advantage by not compiling the first.
In fact, the default HotSpot compilation threshold (configurable via -XX:CompileThreshold) is 10,000 invocations, which might explain the results you see here. HotSpot doesn't really do any tail optimizations, so it's quite strange that the recursive solution is faster. It's quite plausible that StringBuilder.append is compiled to native code primarily for the recursive solution.
I decided to rewrite the benchmark and see the results for myself.
public final class AppendMicrobenchmark {

    static void recursive(final StringBuilder builder, final int n) {
        if (n > 0) {
            recursive(builder.append(n), n - 1);
        }
    }

    static void iterative(final StringBuilder builder) {
        for (int i = 10000; i >= 0; --i) {
            builder.append(i);
        }
    }

    public static void main(final String[] argv) {
        /* warm-up */
        for (int i = 200000; i >= 0; --i) {
            new StringBuilder().append(i);
        }

        /* recursive benchmark */
        long start = System.nanoTime();
        for (int i = 1000; i >= 0; --i) {
            recursive(new StringBuilder(), 10000);
        }
        System.out.printf("recursive: %.2fus\n", (System.nanoTime() - start) / 1000000D);

        /* iterative benchmark */
        start = System.nanoTime();
        for (int i = 1000; i >= 0; --i) {
            iterative(new StringBuilder());
        }
        System.out.printf("iterative: %.2fus\n", (System.nanoTime() - start) / 1000000D);
    }
}
Here are my results...
C:\dev\scrap>java AppendMicrobenchmark
recursive: 405.41us
iterative: 313.20us
C:\dev\scrap>java -server AppendMicrobenchmark
recursive: 397.43us
iterative: 312.14us
These are times for each approach averaged over 1000 trials.
Essentially, the problems with your benchmark are that it doesn't average over many trials (law of large numbers), and that it is highly dependent on the ordering of the individual benchmarks. The original result I was given for yours:
C:\dev\scrap>java StringBuilderBenchmark
80
41
This made very little sense to me. Recursion on the HotSpot VM is more than likely not going to be as fast as iteration because as of yet it does not implement any sort of tail optimization that you might find used for functional languages.
Now, the funny thing that happens here is that the default HotSpot JIT compilation threshold is 10,000 invocations. Your iterative benchmark will more than likely be executing for the most part before append is compiled. On the other hand, your recursive approach should be comparatively fast since it will more than likely benefit from append after it is compiled. To eliminate this from influencing the results, I passed -XX:CompileThreshold=0 and found...
C:\dev\scrap>java -XX:CompileThreshold=0 StringBuilderBenchmark
8
8
So, when it comes down to it, they're both roughly equal in speed. Note however that the iterative appears to be a bit faster if you average with higher precision. Order might still make a difference in my benchmark, too, as the latter benchmark will have the advantage of the VM having collected more statistics for its dynamic optimizations.
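For what it's worth, the standard way to run this kind of comparison today is JMH, which handles warm-up, forking, and averaging for you. A minimal sketch (the class name and structure here are illustrative, not from the original answer):

import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;

@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MICROSECONDS)
@State(Scope.Thread)
public class AppendJmhBenchmark {

    @Benchmark
    public StringBuilder iterative() {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 10000; i++) {
            sb.append(i);
        }
        return sb; // returning the builder keeps the work observable to the JIT
    }

    @Benchmark
    public StringBuilder recursive() {
        StringBuilder sb = new StringBuilder();
        append(sb, 0);
        return sb;
    }

    private static void append(StringBuilder sb, int count) {
        if (count >= 10000) return;
        append(sb.append(count), count + 1);
    }
}

By default JMH runs each benchmark method in a freshly forked JVM with its own warm-up iterations, so neither approach benefits from compilation triggered by the other, which avoids the ordering problem described above.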