If an algorithm with O(n^2) average-case time complexity takes 10 seconds to execute for an input size of 1,000 elements, how long will it take to run when the input size is 10,000 elements?
Strictly speaking, it can't be answered, and anybody who gives a definite number is wrong, because time complexity is independent of the underlying machine architecture. That is why we ignore the machine-dependent constants.
Each platform has its own overhead for certain operations, so again, an exact answer is not possible.
While it is not possible to give a specific number that applies to all machines, you can estimate that an O(n^2) algorithm should take around 100x longer for a 10x increase in n.
This comes with two important qualifications:
The time taken could be much less than this, because the first test could include a significant amount of warm-up; you might find the larger input takes only twice as long. That would be surprising, but it is possible. In Java, warm-up is a significant factor in short tests, and it is not uncommon to see a test of 10K take far less time if you run it again. In fact, a test of 10K run multiple times can start to show times faster than the 1K test run once, as the JIT kicks in, especially if the code is easily optimised.
The time taken could be much higher, especially if some resource threshold is reached. For example, 1,000 elements might fit in cache while 10,000 elements only fit in main memory; similarly, 1,000 elements could fit in memory while 10,000 must be paged from disk. This could make the run take far longer than the estimate.
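As a rough illustration of that estimate (my own sketch, not from the answers; the numbers are just the ones in the question), the extrapolation is simply:

    public class ScalingEstimate {
        // Naive O(n^2) extrapolation: time scales roughly with (newN / oldN)^2.
        static double estimateSeconds(double measuredSeconds, long oldN, long newN) {
            double ratio = (double) newN / oldN;
            return measuredSeconds * ratio * ratio;
        }

        public static void main(String[] args) {
            // 10 seconds for n = 1,000 suggests roughly 1,000 seconds for n = 10,000,
            // subject to the warm-up and memory-hierarchy caveats above.
            System.out.println(estimateSeconds(10.0, 1_000, 10_000) + " seconds (estimate)");
        }
    }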
Related
How can I achieve this without giving very large arrays as input? I am measuring the running time of different algorithms, and for an array of 20 elements I get very similar (in fact identical) values. I tried dividing the total time by 1000000000 to get rid of the scientific notation, and then used 16 "mirrors" where I copied the input array and executed the sort again on each copy. But the times are still the same for heap sort and quick sort. Any ideas that don't require writing redundant lines?
Sample output:
Random array:
MergeSort:
Total time 14.333066343496
QuickSort:
Total time 14.3330663435256
HeapSort:
Total time 14.3330663435256
If you need code snippets just notify me.
To your direct question, use System.nanoTime() for more granular timestamps.
To your underlying question of how to get better benchmarks, you should run the benchmark repeatedly and on larger data sets. A benchmark that takes ~14ms to execute is going to be very noisy, even with a more precise clock. See also How do I write a correct micro-benchmark in Java?
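For illustration, a minimal hand-rolled sketch along those lines (my own example, assuming you are timing a sort; the sizes and repeat counts are arbitrary):

    import java.util.Arrays;
    import java.util.Random;

    public class SortTiming {
        public static void main(String[] args) {
            Random rnd = new Random(42);
            int size = 100_000;   // large enough that one run takes milliseconds, not nanoseconds
            int repeats = 20;     // repeat to average out noise; the early runs double as warm-up

            long total = 0;
            for (int r = 0; r < repeats; r++) {
                int[] data = rnd.ints(size).toArray();   // fresh random input each run
                long start = System.nanoTime();
                Arrays.sort(data);
                total += System.nanoTime() - start;
            }
            System.out.printf("Average sort time: %.3f ms%n", total / (double) repeats / 1e6);
        }
    }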
You can't improve the granularity of this method. According to Java SE documentation:
Returns the current time in milliseconds. Note that while the unit of
time of the return value is a millisecond, the granularity of the
value depends on the underlying operating system and may be larger.
For example, many operating systems measure time in units of tens of
milliseconds.
(source)
As others have stated, for elapsed-time measurements, public static long nanoTime() gives you more precision, but not necessarily more resolution:
This method provides nanosecond precision, but not necessarily
nanosecond resolution.
(source)
I was wondering how I can estimate the total running time of a Java program on a specific machine before the program ends? I need to know how long it will take so I can report progress based on that.
FYI, the main algorithm of my program has O(n^3) time complexity. Suppose n = 100000; how long would it take to run this program on my machine (dual Intel Xeon e2650)?
Regards.
In theory, 1 GHz of computational power should yield about 1 billion simple operations per second. However, finding the number of simple operations is not always easy. Even if you know the time complexity of a given algorithm, this is not enough; you also need to know the constant factor. In theory it is possible to have a linear algorithm that takes several seconds to compute something for an input of size 10000 (and some algorithms like this exist, such as the linear pre-compute time RMQ).
What you do know, however, is that something of O(n^3) will need to perform on the order of 100000^3 operations. So even if your constant is about 1/10^6 (which is highly improbable), this computation will take a lot of time.
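To make that concrete (my own back-of-the-envelope arithmetic, not from the answer): 100000^3 = 10^15 operations, so even at 10^9 simple operations per second and a constant factor of 1, that is about 10^6 seconds, i.e. roughly eleven and a half days.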
I believe #ArturMalinowski's proposal is the right way to approach your problem: benchmark the performance of your algorithm for some sequence of sizes known beforehand, e.g. {32, 64, 128, ...} or, as he proposes, {1, 10, 100, ...}. This way you will be able to determine the constant factor with relatively good precision.
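A minimal sketch of that idea (the compute method below is only a stand-in for the real O(n^3) algorithm; the sizes and details are assumptions for illustration):

    public class ConstantFactorEstimate {
        // Placeholder cubic workload standing in for the real O(n^3) algorithm.
        static void compute(int n) {
            double sink = 0;
            for (int i = 0; i < n; i++)
                for (int j = 0; j < n; j++)
                    for (int k = 0; k < n; k++)
                        sink += i * 1e-9;
            if (sink < 0) System.out.println(sink);   // keeps the work from being optimised away
        }

        public static void main(String[] args) {
            // Time the algorithm on a growing sequence of sizes and derive constant = time / n^3.
            for (int n = 100; n <= 800; n *= 2) {
                long start = System.nanoTime();
                compute(n);
                double seconds = (System.nanoTime() - start) / 1e9;
                double constant = seconds / Math.pow(n, 3);
                System.out.printf("n=%d  time=%.3fs  constant=%.3e%n", n, seconds, constant);
            }
            // Once the constant stabilises, the estimate for n = 100000 is constant * 100000^3.
        }
    }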
I have a Java program that operates on a (large) graph. Thus, it uses a significant amount of heap space (~50GB, which is about 25% of the physical memory on the host machine). At one point, the program (repeatedly) picks one node from the graph and does some computation with it. For some nodes, this computation takes much longer than anticipated (30-60 minutes, instead of an expected few seconds). In order to profile these operations and find out what takes so much time, I have created a test program that creates only a very small part of the large graph and then runs the same operation on one of the nodes that took very long to compute in the original program. Thus, the test program obviously uses only very little heap space compared to the original program.
It turns out that an operation that took 48 minutes in the original program can be done in 9 seconds in the test program. This really confuses me. The first thought might be that the larger program spends a lot of time on garbage collection. So I turned on the verbose mode of the VM's garbage collector. According to that, no full garbage collections are performed during the 48 minutes, and only about 20 collections in the young generation, which each take less than 1 second.
So my question is: what else could there be that explains such a huge difference in timing? I don't know much about how Java internally organizes the heap. Is there something that takes significantly longer for a large heap with a large number of live objects? Could it be that object allocation takes much longer in such a setting, because it takes longer to find an adequate place in the heap? Or does the VM do any internal reorganization of the heap that might take a lot of time (besides garbage collection, obviously)?
I am using Oracle JDK 1.7, if that's of any importance.
While bigger memory can mean bigger problems, I'd say there's nothing (except the GC, which you've excluded) that could stretch 9 seconds into 48 minutes (a factor of 320).
A big heap arguably makes worse spatial locality possible, but I don't think it matters here. I disagree with Tim's answer w.r.t. "having to leave the cache for everything".
There's also the TLB, which is a cache for virtual address translation and could cause some problems with very large memory. But again, not a factor of 320.
I don't think there's anything in the JVM which could cause such problems.
The only reason I can imagine is that some swap space is getting used, despite the fact that you have enough physical memory. Even slight swapping can cause a huge slowdown. Make sure it's off (and possibly check the swappiness setting).
Even when everything is in memory, you have multiple levels of data caching on modern CPUs. Every time you have to leave the cache to fetch data, things get slower. Having 50GB of RAM could well mean that the program has to leave the cache for almost everything.
The symptoms and differences you describe are just massive, though, and I don't see something as simple as cache coherency making that much difference.
The best advice I can give you is to run a profiler against it both when it's running slow and when it's running fast, and compare the difference.
You need solid numbers and timings. "In this environment doing X took Y time". From that you can start narrowing things down.
I have read about time complexities only in theory. Is there any way to calculate them in a program? Not by assumptions like 'n' or anything, but by actual values?
For example, calculating the time complexities of merge sort and quick sort:
Merge sort = O(n log n)   // any case
Quick sort = O(n^2)       // worst case (when the pivot is the largest or smallest value)
There is a huge difference between n log n and n^2 mathematically, so I tried this in my program:
public static void main(String[] args) {
    long t1 = System.nanoTime();
    // code of program..
    long t2 = System.nanoTime();
    long timeTaken = t2 - t1;   // elapsed time in nanoseconds
}
The answer I get for both algorithms (in fact for any algorithm I tried) is mostly around 20.
Is System.nanoTime() not precise enough, or should I use a slower system? Or is there any other way?
Is there any way to calculate them in a program? Not by assumptions like 'n' or anything but by actual values.
I think you misunderstand what complexity is. It is not a value. It is not even a series of values. It is a formula. If you get rid of the N, it is meaningless as a complexity measure (except in the case of O(1), obviously).
Setting that issue to one side, it would be theoretically possible to automate a rigorous analysis of complexity. However, this is a hard problem: automated theorem proving is difficult, especially with no human in the loop to "guide" the process. And the halting problem implies that there cannot be an automated theorem prover that can prove the complexity of an arbitrary program. (Certainly there cannot be a complexity prover that works for all programs, whether or not they terminate...)
But there is one way to get a performance measure for a program with a given set of inputs: just run it! And indeed, you do a series of runs, graphing performance against some problem-size measure (i.e. an N), and make an educated guess at a formula that relates the performance to N. Or you could attempt to fit the measurements to a formula.
However ...
it is only a guess, and
this approach is not always going to work.
For example, if you tried this on classic quicksort, you would most likely conclude that the complexity is O(N log N) and miss the important caveat that there is a "worst case" where it is O(N^2). Another example is where the observable performance characteristics change as the problem size gets big.
In short, this approach is liable to give you unreliable answers.
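To illustrate the run-and-fit approach (my own sketch, using Arrays.sort as the workload; it only gives a guess, as warned above):

    import java.util.Arrays;
    import java.util.Random;

    public class GrowthRateGuess {
        public static void main(String[] args) {
            Random rnd = new Random(1);
            for (int n = 1 << 16; n <= 1 << 22; n <<= 1) {
                int[] data = rnd.ints(n).toArray();
                long start = System.nanoTime();
                Arrays.sort(data);   // the workload being investigated
                double t = System.nanoTime() - start;
                // If t / (n log n) stays roughly constant while t / n^2 keeps shrinking,
                // the measurements are consistent with O(n log n) -- but it is only a guess.
                System.out.printf("n=%d  t/(n log n)=%.3f  t/n^2=%.6f%n",
                        n, t / (n * Math.log(n)), t / ((double) n * n));
            }
        }
    }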
Well, in practice, with some assumptions about the program, you might be able to run it on a large number of test cases (and measure the time it takes), use interpolation to estimate the growth rate and hence the complexity of the program, and use statistical hypothesis testing to show the probability that you are correct.
However, this cannot be done in ALL cases. In fact, you cannot even have an algorithm that tells, for each program, whether it is going to halt or run forever (an infinite loop). This is known as the halting problem, which has been proven to be unsolvable.
Micro-benchmarks like this are inherently flawed, and you're never going to get brilliantly accurate readings from them, especially not in the nanosecond range. The JIT needs time to "warm up" to your code, during which time it will optimise itself around the code that is being called.
If you must go down this route, then you need a test set for your algorithm that is big enough to take seconds to run rather than nanoseconds, and preferably a "warm up" period as well; then you might see some differences close to what you're expecting. You're never going to be able to take those timings and calculate the time complexity from them directly, though. You'd need to run many cases with different input sizes and then plot a graph of the time taken against each input size. Even that approach won't give you brilliantly accurate results, but it would be enough to give you an idea.
Your question might be related to Can a program calculate the complexity of an algorithm? and Program/algorithm to find the time complexity of any given program. I think you could write a program that counts while or for loops and checks whether they are nested, but I don't see how you could calculate the complexity of some recursive functions that way.
The microbenchmark you wrote is incorrect. When you want to gather timing metrics for your code for further optimization, JFF, etc., use JMH. It will help you a lot.
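For reference, a hedged sketch of what a JMH benchmark for the sorting question above might look like (this assumes the jmh-core and jmh-generator-annprocess artifacts are on the classpath; the sizes are arbitrary):

    import java.util.Arrays;
    import java.util.Random;
    import java.util.concurrent.TimeUnit;

    import org.openjdk.jmh.annotations.Benchmark;
    import org.openjdk.jmh.annotations.BenchmarkMode;
    import org.openjdk.jmh.annotations.Mode;
    import org.openjdk.jmh.annotations.OutputTimeUnit;
    import org.openjdk.jmh.annotations.Param;
    import org.openjdk.jmh.annotations.Scope;
    import org.openjdk.jmh.annotations.Setup;
    import org.openjdk.jmh.annotations.State;

    @State(Scope.Thread)
    @BenchmarkMode(Mode.AverageTime)
    @OutputTimeUnit(TimeUnit.MICROSECONDS)
    public class SortBenchmark {

        @Param({"1000", "10000", "100000"})   // JMH repeats the benchmark for each size
        int size;

        int[] data;

        @Setup
        public void setup() {
            data = new Random(42).ints(size).toArray();
        }

        @Benchmark
        public int[] sortCopy() {
            int[] copy = Arrays.copyOf(data, data.length);
            Arrays.sort(copy);
            return copy;   // returning the result stops dead-code elimination
        }
    }

JMH takes care of warm-up iterations, forking and the statistics for you.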
When we say that an algorithm exhibits O(n log n) complexity, we're saying that the asymptotic upper bound for that algorithm is O(n log n). That is, for sufficiently large values of n, the algorithm behaves like the function n log n. We're not saying that for n inputs there will be exactly n log n operations, simply that this is the class of functions to which your algorithm's running time belongs.
By taking time intervals on your system, you're actually exposing yourself to the various variables involved in the computer system. That is, you're dealing with system latency, wire resistance, CPU speed, RAM usage, and so on. All of these things have a measurable effect on your outcome. That is why we use asymptotics to describe the time complexity of an algorithm.
One way to check the time complexity is to run both algorithms on different sizes of n and check the ratio between the run times. From this ratio you can infer the time complexity.
For example:
If the time complexity is O(n), the ratio will be linear (n1/n2).
If the time complexity is O(n^2), the ratio will be (n1/n2)^2.
If the time complexity is O(log(n)), the ratio will be log(n1)/log(n2).
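As an illustration of that ratio check (my own sketch; Arrays.sort stands in for whatever algorithm you are checking):

    import java.util.Arrays;
    import java.util.Random;

    public class RatioCheck {
        static double timeSort(int n) {
            int[] data = new Random(7).ints(n).toArray();
            long start = System.nanoTime();
            Arrays.sort(data);   // the algorithm whose growth rate is being checked
            return System.nanoTime() - start;
        }

        public static void main(String[] args) {
            int n1 = 1_000_000, n2 = 2_000_000;
            double observed = timeSort(n1) / timeSort(n2);
            // Compare the observed ratio t(n1)/t(n2) with what each hypothesis predicts.
            double linear    = (double) n1 / n2;
            double quadratic = Math.pow((double) n1 / n2, 2);
            double nLogN     = (n1 * Math.log(n1)) / (n2 * Math.log(n2));
            System.out.printf("observed=%.3f  O(n)=%.3f  O(n log n)=%.3f  O(n^2)=%.3f%n",
                    observed, linear, nLogN, quadratic);
        }
    }

Running this for several pairs of sizes (and repeating each measurement) makes the comparison less sensitive to noise.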
I'm trying to create a benchmark test with java. Currently I have the following simple method:
public static long runTest(int times) {
    long start = System.nanoTime();
    String str = "str";
    for (int i = 0; i < times; i++) {
        str = "str" + i;
    }
    return System.nanoTime() - start;
}
I'm currently running this loop multiple times within another loop that itself runs multiple times, and recording the min/max/avg time it takes to run this method. Then I start some activity on another thread and test again. Basically I just want to get consistent results... It seems pretty consistent if I have the runTest loop run 10 million times:
Number of times ran: 5
The max time was: 1231419504 (102.85% of the average)
The min time was: 1177508466 (98.35% of the average)
The average time was: 1197291937
The difference between the max and min is: 4.58%
Activated thread activity.
Number of times ran: 5
The max time was: 3872724739 (100.82% of the average)
The min time was: 3804827995 (99.05% of the average)
The average time was: 3841216849
The difference between the max and min is: 1.78%
Running with thread activity took 320.83% as much time as running without.
But this seems a bit much, and takes some time... If I try a lower number (100000) in the runTest loop, it starts to become very inconsistent:
Number of times ran: 5
The max time was: 34726168 (143.01% of the average)
The min time was: 20889055 (86.02% of the average)
The average time was: 24283026
The difference between the max and min is: 66.24%
Activated thread activity.
Number of times ran: 5
The max time was: 143950627 (148.83% of the average)
The min time was: 64780554 (66.98% of the average)
The average time was: 96719589
The difference between the max and min is: 122.21%
Running with thread activity took 398.3% as much time as running without.
Is there a way that I can do a benchmark like this that is both consistent and efficient/fast?
I'm not testing the code that is between the start and end times by the way. I'm testing the CPU load in a way (see how I'm starting some thread activity and retesting). So I think that what I'm looking for it something to substitute for the code I have in "runTest" that will yield quicker and more consistent results.
Thanks
In short:
(Micro-)benchmarking is very complex, so use a tool like the Benchmarking framework http://www.ellipticgroup.com/misc/projectLibrary.zip - and still be skeptical about the results ("Put micro-trust in a micro-benchmark", Dr. Cliff Click).
In detail:
There are a lot of factors that can strongly influence the results:
The accuracy and precision of System.nanoTime: in the worst case it is as bad as that of System.currentTimeMillis.
code warmup and class loading
mixed mode: the JVM JIT-compiles a code block (see Edwin Buck's answer) only after it has been called sufficiently often (1,500 or 10,000 times, depending on client or server mode)
dynamic optimizations: deoptimization, on-stack replacement, dead code elimination (you should use the result you computed in your loop, e.g. print it)
resource reclamation: garbage collection (see Michael Borgwardt's answer) and object finalization
caching: I/O and CPU
your operating system as a whole: screen saver, power management, other processes (indexer, virus scan, ...)
Brent Boyer's article "Robust Java benchmarking, Part 1: Issues" ( http://www.ibm.com/developerworks/java/library/j-benchmark1/index.html) is a good description of all those issues and whether/what you can do against them (e.g. use JVM options or call ProcessIdleTask beforehand).
You won't be able to eliminate all these factors, so doing statistics is a good idea. But:
instead of computing the difference between the max and min, you should put in the effort to compute the standard deviation (the result set {1, 1000 times 2, 3} is different from {501 times 1, 501 times 3}); a small hand-rolled sketch appears below.
The reliability is taken into account by producing confidence intervals (e.g. via bootstrapping).
The above mentioned Benchmark framework ( http://www.ellipticgroup.com/misc/projectLibrary.zip) uses these techniques. You can read about them in Brent Boyer's article "Robust Java benchmarking, Part 2: Statistics and solutions" ( https://www.ibm.com/developerworks/java/library/j-benchmark2/).
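As a small illustration of the standard-deviation point above (a hand-rolled sketch with made-up sample values, not a substitute for the framework):

    public class BenchmarkStats {
        public static void main(String[] args) {
            // Illustrative timing samples in nanoseconds (placeholder values).
            long[] samples = {1_197_291_937L, 1_231_419_504L, 1_177_508_466L, 1_201_004_120L, 1_183_330_110L};

            double mean = 0;
            for (long s : samples) mean += s;
            mean /= samples.length;

            double variance = 0;
            for (long s : samples) variance += (s - mean) * (s - mean);
            variance /= samples.length - 1;   // sample variance

            double stdDev = Math.sqrt(variance);
            System.out.printf("mean=%.0f ns  stddev=%.0f ns (%.2f%% of mean)%n",
                    mean, stdDev, 100.0 * stdDev / mean);
        }
    }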
Your code ends up mainly testing garbage collection performance, because building a String in a loop creates and immediately discards a large number of increasingly large String objects.
This is something that inherently leads to wildly varying measurements and is strongly influenced by multi-threaded activity.
I suggest you do something else in your loop that has more predictable performance, like mathematical calculations.
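For example, a sketch of a more allocation-free variant of runTest (my own illustration; the arithmetic in the loop is arbitrary, the point is that it allocates no objects):

    public static long runMathTest(int times) {
        long start = System.nanoTime();
        double acc = 0;
        for (int i = 0; i < times; i++) {
            acc += Math.sqrt(i) * Math.sin(i);   // pure arithmetic, no heap allocation
        }
        // Use the result so the JIT cannot eliminate the loop as dead code.
        if (acc == 42.0) System.out.println(acc);
        return System.nanoTime() - start;
    }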
In the 10-million-iteration run, odds are good that the HotSpot compiler detected a "heavily used" piece of code and compiled it into native machine code.
JVM bytecode is interpreted, which leaves it susceptible to more interruptions from other background processes occurring in the JVM (like garbage collection).
Generally speaking, these kinds of benchmarks are rife with assumptions that don't hold. You cannot believe that a micro-benchmark really proves what it set out to prove without a lot of evidence that the initial measurement (time) is actually measuring your task alone, and not your task plus some other background tasks. If you don't attempt to control for background tasks, then the measurement is much less useful.