I was going through the following article:
Understanding Collections and Thread Safety in Java
The article says:
You know, Vector and Hashtable are the two collections that existed early in Java's history, and they were designed to be thread-safe from the start (if you have a chance to look at their source code, you will see that their methods are all synchronized!). However, they quickly exposed poor performance in multi-threaded programs. As you may know, synchronization requires locks, which take time to acquire and release, and that reduces performance.
[I've also done a benchmark using Caliper; please hear me out on this]
A sample code has also been provided:
import java.util.ArrayList;
import java.util.List;
import java.util.Vector;

public class CollectionsThreadSafeTest {

    public void testVector() {
        long startTime = System.currentTimeMillis();
        Vector<Integer> vector = new Vector<>();
        for (int i = 0; i < 10_000_000; i++) {
            vector.addElement(i);
        }
        long endTime = System.currentTimeMillis();
        long totalTime = endTime - startTime;
        System.out.println("Test Vector: " + totalTime + " ms");
    }

    public void testArrayList() {
        long startTime = System.currentTimeMillis();
        List<Integer> list = new ArrayList<>();
        for (int i = 0; i < 10_000_000; i++) {
            list.add(i);
        }
        long endTime = System.currentTimeMillis();
        long totalTime = endTime - startTime;
        System.out.println("Test ArrayList: " + totalTime + " ms");
    }

    public static void main(String[] args) {
        CollectionsThreadSafeTest tester = new CollectionsThreadSafeTest();
        tester.testVector();
        tester.testArrayList();
    }
}
The output they have provided for the above code is as follows:
Test Vector: 9266 ms
Test ArrayList: 4588 ms
But when I ran it in my machine, it gave me the following result:
Test Vector: 521 ms
Test ArrayList: 2273 ms
I found this to be quite odd, so I thought a proper microbenchmark would be better. I wrote a benchmark for the above using Caliper. The code is as follows:
import java.util.ArrayList;
import java.util.List;
import java.util.Vector;

import com.google.caliper.Runner;
import com.google.caliper.SimpleBenchmark;

public class CollectionsThreadSafeTest extends SimpleBenchmark {

    public static final int ELEMENTS = 10_000_000;

    public void timeVector(int reps) {
        for (int i = 0; i < reps; i++) {
            Vector<Integer> vector = new Vector<>();
            for (int k = 0; k < ELEMENTS; k++) {
                vector.addElement(k);
            }
        }
    }

    public void timeArrayList(int reps) {
        for (int i = 0; i < reps; i++) {
            List<Integer> list = new ArrayList<>();
            for (int k = 0; k < ELEMENTS; k++) {
                list.add(k);
            }
        }
    }

    public static void main(String[] args) {
        String[] classesToTest = { CollectionsThreadSafeTest.class.getName() };
        Runner.main(classesToTest);
    }
}
But I got a similar result:
0% Scenario{vm=java, trial=0, benchmark=ArrayList} 111684174.60 ns; σ=18060504.25 ns # 10 trials
50% Scenario{vm=java, trial=0, benchmark=Vector} 67701359.18 ns; σ=17924728.23 ns # 10 trials
benchmark ms linear runtime
ArrayList 111.7 ==============================
Vector 67.7 ==================
vm: java
trial: 0
I'm kinda confused. What is happening here? Am I doing something wrong (that would be really embarrassing)?
If this is the expected behavior, then what is the explanation behind this?
Update #1
After reading @Kayaman's answer, I ran the Caliper tests again, changing the initial capacities of both the Vector and the ArrayList. Following are the timings (in ms):
Initial Capacity Vector ArrayList
-------------------------------------
10_000_000 49.2 67.1
10_000_001 48.9 71.2
10_000_010 48.1 61.2
10_000_100 43.9 70.1
10_001_000 45.6 70.6
10_010_000 44.8 68.0
10_100_000 52.8 64.6
11_000_000 52.7 71.7
20_000_000 74.0 51.8
-------------------------------------
Thanks for all the inputs :)
You're not really testing the add() method here. You're testing the different ways that a Vector and an ArrayList grow. A Vector doubles its capacity when it's full, while an ArrayList grows by only 50%, trading a few more reallocations for less wasted memory.
If you run your test with an initial capacity greater than 10000000 for both classes, they won't need to resize, and you'll be profiling just the adding part.
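As a quick illustration, here is a minimal sketch of that presized comparison (timings will vary by machine and are still subject to the usual microbenchmark caveats):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Vector;

public class PresizedTest {
    static final int N = 10_000_000;

    public static void main(String[] args) {
        // Presizing both collections means neither ever reallocates its
        // internal array, so the loops measure mostly add() itself.
        List<Integer> list = new ArrayList<>(N);
        Vector<Integer> vector = new Vector<>(N);

        long t0 = System.nanoTime();
        for (int i = 0; i < N; i++) list.add(i);
        long t1 = System.nanoTime();
        for (int i = 0; i < N; i++) vector.add(i);
        long t2 = System.nanoTime();

        System.out.println("ArrayList: " + (t1 - t0) / 1_000_000 + " ms");
        System.out.println("Vector:    " + (t2 - t1) / 1_000_000 + " ms");
    }
}
```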
Vector is expected to be slower in a multithreaded environment; in your single-threaded case its uncontended locks are relatively lightweight. Better to do the test adding these items from 10000 different threads.
Both ArrayList and Vector implement add essentially the same way:
ensureCapacity();
elementData[elementCount++] = newElement;
There is only one difference: Vector's add method is synchronized and ArrayList's is not, and in theory synchronized methods are slower than non-synchronized ones.
To improve the performance of add, specify initialCapacity in the constructor or call ensureCapacity. This creates the internal array at the size you need, so it never has to be recreated.
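For example (a small sketch; note that ensureCapacity is declared on ArrayList, not on the List interface):

```java
import java.util.ArrayList;

public class EnsureCapacityDemo {
    public static void main(String[] args) {
        ArrayList<Integer> list = new ArrayList<>();
        // Reserve room for all elements up front, so add() never has
        // to grow the internal array during the loop.
        list.ensureCapacity(1_000_000);
        for (int i = 0; i < 1_000_000; i++) {
            list.add(i);
        }
        System.out.println(list.size()); // prints 1000000
    }
}
```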
Related
The goal of my question is to enhance the performance of my algorithm by splitting the range of my loop iterations over a large array list.
For example: I have an ArrayList with about 10 billion long values. The goal is to loop from 0 to 100 million entries and output the result of whatever calculations happen inside the loop; then process 100 million to 200 million and output that result; then 200-300 million, 300-400 million, and so on.
After I get all the per-chunk results, I can sum them up outside the loop, collecting the outputs of the chunks in parallel.
I have tried to use a dynamic range-shift method to achieve something similar, but I can't seem to get the logic fully implemented the way I would like.
public static void tt4() {
    long essir2 = 0;
    long essir3 = 0;
    List<Long> cc = new ArrayList<>();
    List<Long> range = new ArrayList<>();
    // Breakpoint is a method that returns list values; they were converted to
    // String because of some concatenations and are converted back to long here
    for (String ari1 : Breakpoint()) {
        cc.add(Long.valueOf(ari1));
    }
    // the size of the list is huge, about 1 trillion entries at the minimum
    long hy = cc.size() - 1;
    for (long k = 0; k < hy; k++) {
        long t1 = cc.get((int) k);
        long t2 = cc.get((int) (k + 1));
        // My main question: I am trying to iterate the entire list in a dynamic way
        // which would exclude repeated endpoints on each iteration.
        range = LongStream.rangeClosed(t1 + 1, t2)
                .boxed()
                .collect(Collectors.toList());
        for (long i : range) {
            // Hard is another method call on the iteration;
            // complexcalc is a method as well
            essir2 = complexcalc((int) i, (int) Hard(i));
            essir3 += essir2;
        }
    }
    System.out.println("\n" + essir3);
}
I don't have any errors; I am just looking for a way to improve performance and running time. I can process a million entries in under a second directly, but at the size I require it runs forever. The sizes I'm giving are abstractions to illustrate magnitude, so please no opinions that 100 billion is not much. I'm talking about massively huge numbers I need to iterate over while doing complex tasks and calls; I just need help with the logic I'm trying to achieve, if I can get it.
One thing I would suggest right off the bat is to store your Breakpoint() return value in a plain array rather than a List. This may improve your execution time:
List<Long> cc = new ArrayList<>();
for (String ari1 : Breakpoint()) {
    cc.add(Long.valueOf(ari1));
}
Long[] ccArray = cc.toArray(new Long[0]);
I believe what you're looking for is to split your tasks across multiple threads. You can do this with ExecutorService "which simplifies the execution of tasks in asynchronous mode".
Note that I am not overly familiar with this whole concept, but I have experimented with it a bit recently, so I can give you a quick draft of how you could implement this.
I welcome those more experienced with multi-threading to either correct this post or provide additional information in the comments to help improve this answer.
Runnable Task class
public class CompartmentalizationTask implements Runnable {

    private final ArrayList<Long> cc;
    private final long index;

    public CompartmentalizationTask(ArrayList<Long> list, long index) {
        this.cc = list;
        this.index = index;
    }

    @Override
    public void run() {
        Main.compartmentalize(cc, index);
    }
}
Main class
private static ExecutorService exeService = Executors.newCachedThreadPool();
private static List<Future<?>> futureTasks = new ArrayList<>();

public static void tt4() throws ExecutionException, InterruptedException {
    ArrayList<Long> cc = new ArrayList<>();
    // Breakpoint is a method that returns list values; they were converted to
    // String because of some concatenations and are converted back to long here
    for (String ari1 : Breakpoint()) {
        cc.add(Long.valueOf(ari1));
    }
    // the size of the list is huge, about 1 trillion entries at the minimum
    long hy = cc.size() - 1;
    for (long k = 0; k < hy; k++) {
        futureTasks.add(Main.exeService.submit(new CompartmentalizationTask(cc, k)));
    }
    for (int i = 0; i < futureTasks.size(); i++) {
        futureTasks.get(i).get();
    }
    exeService.shutdown();
}

public static void compartmentalize(ArrayList<Long> cc, long index) {
    long t1 = cc.get((int) index);
    long t2 = cc.get((int) (index + 1));
    // Iterate the range, excluding repeated endpoints, as in the question.
    List<Long> range = LongStream.rangeClosed(t1 + 1, t2)
            .boxed()
            .collect(Collectors.toList());
    long partial = 0;
    for (long i : range) {
        // Hard and complexcalc are the methods from the question.
        partial += complexcalc((int) i, (int) Hard(i));
    }
    // Note: accumulating partial sums across tasks into a shared total
    // would need synchronization; here each task just prints its own.
    System.out.println(partial);
}
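One way to avoid shared mutable state entirely is to have each task return its partial sum as a Callable&lt;Long&gt; and add the futures up at the end. Here is a hedged, self-contained sketch of that pattern; the complexcalc below is a stand-in, since the question's real Breakpoint, Hard, and complexcalc methods aren't shown:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.stream.LongStream;

public class RangeSumDemo {
    // Stand-in for the question's real per-element calculation.
    static long complexcalc(long i) {
        return i * 2;
    }

    public static void main(String[] args) throws Exception {
        // Stand-in breakpoints; the real list would come from Breakpoint().
        List<Long> breakpoints = Arrays.asList(0L, 1000L, 2000L, 3000L);
        ExecutorService pool = Executors.newFixedThreadPool(
                Runtime.getRuntime().availableProcessors());

        List<Future<Long>> parts = new ArrayList<>();
        for (int k = 0; k < breakpoints.size() - 1; k++) {
            long t1 = breakpoints.get(k);
            long t2 = breakpoints.get(k + 1);
            // Each task computes its own partial sum over (t1, t2],
            // so no shared state is mutated across threads.
            parts.add(pool.submit(() ->
                    LongStream.rangeClosed(t1 + 1, t2)
                              .map(RangeSumDemo::complexcalc)
                              .sum()));
        }

        long total = 0;
        for (Future<Long> f : parts) {
            total += f.get(); // blocks until that chunk is done
        }
        pool.shutdown();
        System.out.println(total);
    }
}
```

With the stand-in calculation this sums 2*i over i = 1..3000.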
I'm relatively new to Java programming, and I'm running into an issue calculating the amount of time it takes for a function to run.
First some background - I've got a lot of experience with Python, and I'm trying to recreate the functionality of the Jupyter Notebook/Lab %%timeit function, if you're familiar with that. Here's a pic of it in action (sorry, not enough karma to embed yet):
Snip of Jupyter %%timeit
What it does is run the contents of the cell (in this case a recursive function) either 1k, 10k, or 100k times, and give you the average run time of the function, and the standard deviation.
My first implementation (using the same recursive function) used System.nanoTime():
public static void main(String[] args) {
    long t1, t2, diff;
    long[] times = new long[1000];
    int t;
    for (int i = 0; i < 1000; i++) {
        t1 = System.nanoTime();
        t = triangle(20);
        t2 = System.nanoTime();
        diff = t2 - t1;
        System.out.println(diff);
        times[i] = diff;
    }
    long total = 0;
    for (int j = 0; j < times.length; j++) {
        total += times[j];
    }
    System.out.println("Mean = " + total / 1000.0);
}
But the mean is wildly thrown off -- for some reason, the first iteration of the function (on many runs) takes upwards of a million nanoseconds:
Pic of initial terminal output
Every iteration after the first dozen or so takes either 395 nanos or 0 -- so there could be a problem there too... not sure what's going on!
Also -- the code of the recursive function I'm timing:
static int triangle(int n) {
    if (n == 1) {
        return n;
    } else {
        return n + triangle(n - 1);
    }
}
Initially I had the line n = Math.abs(n) on the first line of the function, but then I removed it because... meh. I'm the only one using this.
I tried a number of different suggestions brought up in this SO post, but they each have their own problems... which I can go into if you need.
Anyway, thank you in advance for your help and expertise!
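One common workaround for both symptoms (the slow first iterations, and readings stuck at 395 ns or 0) is to time a whole batch of calls per timestamp, which amortizes the timer's granularity, and to run an untimed warmup pass first so the JIT has already compiled the method. A rough sketch, still hand-rolled (JMH is the more reliable route):

```java
public class BatchTiming {
    static int triangle(int n) {
        return n == 1 ? n : n + triangle(n - 1);
    }

    public static void main(String[] args) {
        final int batch = 100_000;
        int sink = 0; // consume results so the JIT can't discard the calls

        // Untimed warmup pass: let the JIT compile triangle() first.
        for (int i = 0; i < batch; i++) sink += triangle(20);

        // One timestamp pair around the whole batch, not around each call.
        long t0 = System.nanoTime();
        for (int i = 0; i < batch; i++) sink += triangle(20);
        long t1 = System.nanoTime();

        System.out.println("avg ns/call ~ " + (t1 - t0) / (double) batch
                + " (sink=" + sink + ")");
    }
}
```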
This question already has answers here:
Java benchmarking - why is the second loop faster?
(6 answers)
Closed 9 years ago.
I had the code below. I just wanted to check the running time of a code block, and by mistake I copied and pasted the same code several times and got an interesting result. Though the code blocks are identical, their running times differ, and code block 1 takes more time than the others. If I reorder the code blocks (say, move code block 4 to the top), then code block 4 takes more time than the others.
I used two different types of arrays in my code blocks to check whether it depends on that, and the result is the same: whichever code block of a given array type comes first takes the most time. See the code and the output below.
import java.util.Arrays;

public class ABBYtest {

    public static void main(String[] args) {

        long startTime;
        long endTime;

        //code block 1
        startTime = System.nanoTime();
        Long a[] = new Long[10];
        for (int i = 0; i < a.length; i++) {
            a[i] = 12L;
        }
        Arrays.sort(a);
        endTime = System.nanoTime();
        System.out.println("code block (has Long array) 1 = " + (endTime - startTime));

        //code block 6
        startTime = System.nanoTime();
        Long aa[] = new Long[10];
        for (int i = 0; i < aa.length; i++) {
            aa[i] = 12L;
        }
        Arrays.sort(aa);
        endTime = System.nanoTime();
        System.out.println("code block (has Long array) 6 = " + (endTime - startTime));

        //code block 7
        startTime = System.nanoTime();
        Long aaa[] = new Long[10];
        for (int i = 0; i < aaa.length; i++) {
            aaa[i] = 12L;
        }
        Arrays.sort(aaa);
        endTime = System.nanoTime();
        System.out.println("code block (has Long array) 7 = " + (endTime - startTime));

        //code block 2
        startTime = System.nanoTime();
        long c[] = new long[10];
        for (int i = 0; i < c.length; i++) {
            c[i] = 12L;
        }
        Arrays.sort(c);
        endTime = System.nanoTime();
        System.out.println("code block (has long array) 2 = " + (endTime - startTime));

        //code block 3
        startTime = System.nanoTime();
        long d[] = new long[10];
        for (int i = 0; i < d.length; i++) {
            d[i] = 12L;
        }
        Arrays.sort(d);
        endTime = System.nanoTime();
        System.out.println("code block (has long array) 3 = " + (endTime - startTime));

        //code block 4
        startTime = System.nanoTime();
        long b[] = new long[10];
        for (int i = 0; i < b.length; i++) {
            b[i] = 12L;
        }
        Arrays.sort(b);
        endTime = System.nanoTime();
        System.out.println("code block (has long array) 4 = " + (endTime - startTime));

        //code block 5
        startTime = System.nanoTime();
        Long e[] = new Long[10];
        for (int i = 0; i < e.length; i++) {
            e[i] = 12L;
        }
        Arrays.sort(e);
        endTime = System.nanoTime();
        System.out.println("code block (has Long array) 5 = " + (endTime - startTime));
    }
}
The running times:
code block (has Long array) 1 = 802565
code block (has Long array) 6 = 6158
code block (has Long array) 7 = 4619
code block (has long array) 2 = 171906
code block (has long array) 3 = 4105
code block (has long array) 4 = 3079
code block (has Long array) 5 = 8210
As you can see, the first code block containing a Long array takes more time than the other Long-array blocks, and the same holds for the first block containing a long array.
Can anyone explain this behavior, or am I making some mistake here?
Faulty benchmarking. A non-exhaustive list of what is wrong:
No warmup: single-shot measurements are almost always wrong;
Mixing several codepaths in a single method: we probably start compiling the method with the execution data available only for the first loop in the method;
Sources are predictable: should the loop compile, we can actually predict the result;
Results are dead-code eliminated: should the loop compile, we can throw the loop away.
Here is how you do it arguably right with jmh:
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@BenchmarkMode(Mode.AverageTime)
@Warmup(iterations = 3, time = 1)
@Measurement(iterations = 3, time = 1)
@Fork(10)
@State(Scope.Thread)
public class Longs {

    public static final int COUNT = 10;

    private Long[] refLongs;
    private long[] primLongs;

    /*
     * Implementation notes:
     *  - copying the array from the field keeps the constant
     *    optimizations away, but we are implicitly counting the
     *    cost of arraycopy() in;
     *  - two additional baseline experiments quantify the
     *    scale of the arraycopy effects (note you can't directly
     *    subtract the baseline scores from the test scores, because
     *    the code is mixed together);
     *  - the resulting arrays are always fed back into JMH
     *    to prevent dead-code elimination.
     */

    @Setup
    public void setup() {
        primLongs = new long[COUNT];
        for (int i = 0; i < COUNT; i++) {
            primLongs[i] = 12L;
        }
        refLongs = new Long[COUNT];
        for (int i = 0; i < COUNT; i++) {
            refLongs[i] = 12L;
        }
    }

    @GenerateMicroBenchmark
    public long[] prim_baseline() {
        long[] d = new long[COUNT];
        System.arraycopy(primLongs, 0, d, 0, COUNT);
        return d;
    }

    @GenerateMicroBenchmark
    public long[] prim_sort() {
        long[] d = new long[COUNT];
        System.arraycopy(primLongs, 0, d, 0, COUNT);
        Arrays.sort(d);
        return d;
    }

    @GenerateMicroBenchmark
    public Long[] ref_baseline() {
        Long[] d = new Long[COUNT];
        System.arraycopy(refLongs, 0, d, 0, COUNT);
        return d;
    }

    @GenerateMicroBenchmark
    public Long[] ref_sort() {
        Long[] d = new Long[COUNT];
        System.arraycopy(refLongs, 0, d, 0, COUNT);
        Arrays.sort(d);
        return d;
    }
}
...this yields:
Benchmark Mode Samples Mean Mean error Units
o.s.Longs.prim_baseline avgt 30 19.604 0.327 ns/op
o.s.Longs.prim_sort avgt 30 51.217 1.873 ns/op
o.s.Longs.ref_baseline avgt 30 16.935 0.087 ns/op
o.s.Longs.ref_sort avgt 30 25.199 0.430 ns/op
At this point you may start to wonder why sorting Long[] and sorting long[] take different time. The answer lies in the Arrays.sort() overloads: OpenJDK sorts primitive and reference arrays via different algorithms (references with TimSort, primitives with dual-pivot quicksort). Here is the highlight of choosing another algorithm with -Djava.util.Arrays.useLegacyMergeSort=true, which falls back to merge sort for references:
Benchmark Mode Samples Mean Mean error Units
o.s.Longs.prim_baseline avgt 30 19.675 0.291 ns/op
o.s.Longs.prim_sort avgt 30 50.882 1.550 ns/op
o.s.Longs.ref_baseline avgt 30 16.742 0.089 ns/op
o.s.Longs.ref_sort avgt 30 64.207 1.047 ns/op
Hope that helps to explain the difference.
The explanation above barely scratches the surface of sorting performance. The performance is very different when presented with different source data (including available pre-sorted subsequences, their patterns and run lengths, and the size of the data itself).
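The overload split can be seen directly in the java.util.Arrays API: a long[] argument resolves to sort(long[]), a Long[] argument to sort(Object[]). A tiny sketch:

```java
import java.util.Arrays;

public class SortOverloadDemo {
    public static void main(String[] args) {
        long[] prim = {3, 1, 2};     // resolves to sort(long[]): dual-pivot quicksort
        Long[] boxed = {3L, 1L, 2L}; // resolves to sort(Object[]): TimSort
        Arrays.sort(prim);
        Arrays.sort(boxed);
        System.out.println(Arrays.toString(prim));  // [1, 2, 3]
        System.out.println(Arrays.toString(boxed)); // [1, 2, 3]
    }
}
```

Same result, different code paths, hence the different timings above.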
Can anyone explain this behavior, or am I making some mistake here?
Your problem is a badly written benchmark. You do not take account of JVM warmup effects: things like the overhead of loading code, initial expansion of the heap, and JIT compilation. In addition, startup of an application always generates extra garbage that needs to be collected.
In addition, if your application itself generates garbage (and I expect that sort and / or println are doing that) then you need to take account of possible GC runs during the "steady state" phase of your benchmark application's run.
See this Q&A for hints on how to write valid Java benchmarks:
How do I write a correct micro-benchmark in Java?
There are numerous other articles on this. Google for "how to write a java benchmark".
In this example, I suspect that the first code block takes so much longer than the rest because of (initially) bytecode interpretation followed by the overhead of JIT compilation. You may well be garbage collecting to deal with temporary objects created during loading and JIT compilation. The high value for the 4th measurement is most likely due to another garbage collection cycle.
However, one would need to turn on some JVM logging to figure out the real cause.
Just to add to what everyone else is saying: Java will not necessarily compile everything. When it analyses the code for optimization, the JVM will often choose to keep interpreting code that is not used extensively. If you look at the bytecode, your Long arrays should always cost more time (and certainly more space) than your long arrays, but as has been pointed out, warmup effects also play a role.
This could be due to a few things:
As noted by syrion, Java's virtual machine is allowed to perform optimizations on your code as it is running. Your first block is likely taking longer because Java hasn't yet optimized your code fully. As the first block runs, the JVM is applying changes which can then be utilized in the other blocks.
Your processor could be caching the results of your code, speeding up future blocks. This is similar to the previous point, but can vary even between identical JVM implementations.
While your program is running, your computer is also performing other tasks. These include handling the OS's UI, checking for program updates, etc. For this reason, some blocks of code can be slower than others, because your computer isn't concentrating as much resources towards its execution.
Java's virtual machine is garbage collected. That is to say, at unspecified points during your program's execution, the JVM takes some time to clean up any objects that are no longer used.
Points 1 and 2 are likely the cause for the large difference in the first block's execution time. Point 3 could be the reason for the smaller fluctuations, and point 4, as noted by Stephen, probably caused the large stall in block 3.
One more thing worth noting is your use of both long and Long. The boxed form carries extra memory overhead, and the two are subject to different optimizations.
class DummyInteger {
    private int i;

    public DummyInteger(int i) {
        this.i = i;
    }

    public int getI() {
        return i;
    }
}
long start = System.nanoTime();
DummyInteger n = new DummyInteger(10);
long end = System.nanoTime();
long duration = end - start;
System.out.println(duration);
The previous code produces the following output:
341000
Whereas:
long start = System.nanoTime();
ArrayList a = new ArrayList();
long end = System.nanoTime();
long duration = end - start;
System.out.println(duration);
produces the following output:
17000
Now, my question is, why do we observe such difference in the running time, even though the work done by the DummyInteger Class seems to be at most as much as that performed by the ArrayList constructor? Does it have to do with ArrayList's code being precompiled? or is it some other factor that's affecting the processing time?
Thank you.
--EDIT--
I thought the issue of comparing two different types of objects would arise, however, even with the following code, compared to creating an ArrayList that is:
class IntList {
    int[] elementData;

    public IntList() {
        elementData = new int[20];
    }
}
long start = System.nanoTime();
IntList n = new IntList();
long end = System.nanoTime();
long duration = end - start;
System.out.println(duration);
The result is still the same, bearing in mind that in this case the overhead of creating the ArrayList should be greater, due to certain checks that are performed (which can be found by going through the source code).
One more thing to note: I ran the two snippets in two separate runs, which I expected to eliminate any overhead resulting from initialisation of the JVM.
why do we observe such difference in the running time?
Because your measurement is meaningless:
it is a flawed microbenchmark (no JVM warmup; if that is all you run, then JVM startup probably interferes with your measurement too...) => How do I write a microbenchmark?
the resolution of nanoTime is WAY too low for the operations you are measuring - they probably take a few nanoseconds, whereas nanoTime's granularity can be far coarser than that.
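You can probe the timer yourself. A small sketch; the smallest observable increment varies a lot by JVM and OS, so treat the printed number as machine-specific:

```java
public class NanoTimeGranularity {
    public static void main(String[] args) {
        long smallest = Long.MAX_VALUE;
        // Take back-to-back readings and keep the smallest positive delta:
        // that approximates the finest tick nanoTime() can report here.
        for (int i = 0; i < 1_000_000; i++) {
            long a = System.nanoTime();
            long b = System.nanoTime();
            if (b > a) smallest = Math.min(smallest, b - a);
        }
        System.out.println("smallest observed increment: " + smallest + " ns");
    }
}
```

If that increment is comparable to the operation you time, a single-shot measurement is mostly noise.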
Here are the results I get following a more robust methodology: creating a new Integer takes around 6 nanoseconds, while creating an ArrayList of default size (10) takes about 19 nanoseconds on my machine.
DummyInteger creation:
Run result "newInteger": 6.064 ±(95%) 0.101 ±(99%) 0.167 nsec/op
Run statistics "newInteger": min = 6.007, avg = 6.064, max = 6.200, stdev = 0.081
Run confidence intervals "newInteger": 95% [5.964, 6.165], 99% [5.897, 6.231]
List creation:
Run result "newList": 19.139 ±(95%) 0.192 ±(99%) 0.318 nsec/op
Run statistics "newList": min = 18.866, avg = 19.139, max = 19.234, stdev = 0.155
Run confidence intervals "newList": 95% [18.948, 19.331], 99% [18.821, 19.458]
EDIT
My "more robust methodology" also had a flaw: the creation was actually optimised away by the cheeky JIT... New results above, although the conclusion is similar: these are extremely fast operations.
Loading a class is one of the most expensive things you can do (especially for a class which doesn't do much). Many of the built-in classes are used before your program starts, so they don't need to be loaded again, and as you use a class more, the code for it warms up.
Consider the following example, where the three classes you mentioned are created repeatedly:
static class DummyInteger {
    private int i;

    public DummyInteger(int i) {
        this.i = i;
    }

    public int getI() {
        return i;
    }
}

static class IntList {
    int[] elementData;

    public IntList() {
        elementData = new int[20];
    }
}

public static void main(String... ignored) {
    timeEach("First time", 1);
    for (int i = 1000; i <= 5000; i += 1000)
        timeEach(i + " avg", i);
    for (int i = 10000; i <= 20000; i += 10000)
        timeEach(i + " avg", i);
}

public static void timeEach(String desc, int repeats) {
    long time1 = System.nanoTime();
    for (int i = 0; i < repeats; i++) {
        List l = new ArrayList();
    }
    long time2 = System.nanoTime();
    for (int i = 0; i < repeats; i++) {
        DummyInteger di = new DummyInteger(i);
    }
    long time3 = System.nanoTime();
    for (int i = 0; i < repeats; i++) {
        IntList il = new IntList();
    }
    long time4 = System.nanoTime();
    System.out.printf("%s: ArrayList %,d; DummyInteger %,d; IntList %,d%n",
            desc, (time2 - time1) / repeats, (time3 - time2) / repeats, (time4 - time3) / repeats);
}
prints with Java 7 update 21 and -XX:+PrintCompilation
89 1 java.lang.String::hashCode (55 bytes)
89 2 java.lang.String::charAt (29 bytes)
First time: ArrayList 41,463; DummyInteger 422,837; IntList 334,986
1000 avg: ArrayList 268; DummyInteger 60; IntList 136
120 3 java.lang.Object::<init> (1 bytes)
2000 avg: ArrayList 321; DummyInteger 75; IntList 142
3000 avg: ArrayList 293; DummyInteger 63; IntList 133
123 4 Main::timeEach (152 bytes)
124 5 java.util.AbstractCollection::<init> (5 bytes)
124 6 java.util.AbstractList::<init> (10 bytes)
125 7 java.util.ArrayList::<init> (44 bytes)
4000 avg: ArrayList 309; DummyInteger 64; IntList 175
126 8 java.util.ArrayList::<init> (7 bytes)
127 9 Main$DummyInteger::<init> (10 bytes)
127 10 Main$IntList::<init> (13 bytes)
5000 avg: ArrayList 162; DummyInteger 70; IntList 149
10000 avg: ArrayList 0; DummyInteger 0; IntList 0
20000 avg: ArrayList 0; DummyInteger 0; IntList 0
You can see that as the code warms up performance improves, with ArrayList the slowest. Finally the JIT determines the objects are never used and don't need to be created; then the loops are empty and don't need to run at all, so the average time drops to 0.
Now, my question is, why do we observe such difference in the running
time, even though the work done by the DummyInteger Class seems to be
at most as much as that performed by the ArrayList constructor?
Because this benchmark is bad. You can't tell anything from a single run.
Does it have to do with ArrayList's code being precompiled?
No.
You are comparing the time performance of two different types of objects, the ArrayList and your DummyInteger, which is why the time stats differ. Now to your question:
Creating an object of a pre-implemented Java class is much faster than creating a custom object?
That's not the right framing: a pre-implemented Java class is also a kind of custom object, just one created by somebody else (as you create custom objects for yourself). So it is not "pre-implemented Java class vs custom object"; the cost actually depends on what operations take place during object creation.
I was doing some tests to find out what the speed differences are between using getters/setters and direct field access. I wrote a simple benchmark application like this:
public class FieldTest {

    private int value = 0;

    public void setValue(int value) {
        this.value = value;
    }

    public int getValue() {
        return this.value;
    }

    public static void doTest(int num) {
        FieldTest f = new FieldTest();

        // test direct field access
        long start1 = System.nanoTime();
        for (int i = 0; i < num; i++) {
            f.value = f.value + 1;
        }
        f.value = 0;
        long diff1 = System.nanoTime() - start1;

        // test method field access
        long start2 = System.nanoTime();
        for (int i = 0; i < num; i++) {
            f.setValue(f.getValue() + 1);
        }
        f.setValue(0);
        long diff2 = System.nanoTime() - start2;

        // print results
        System.out.printf("Field Access: %d ns\n", diff1);
        System.out.printf("Method Access: %d ns\n", diff2);
        System.out.println();
    }

    public static void main(String[] args) throws InterruptedException {
        int num = 2147483647;

        // wait for the VM to warm up
        Thread.sleep(1000);
        for (int i = 0; i < 10; i++) {
            doTest(num);
        }
    }
}
Whenever I run it, I get consistent results such as these: http://pastebin.com/hcAtjVCL
I was wondering if someone could explain to me why field access seems to be slower than getter/setter method access, and also why the last 8 iterations execute incredibly fast.
Edit: Having taken into account assylias' and Stephen C's comments, I changed the code (http://pastebin.com/Vzb8hGdc) and got slightly different results (http://pastebin.com/wxiDdRix).
The explanation is that your benchmark is broken.
The first iteration is done using the interpreter.
Field Access: 1528500478 ns
Method Access: 1521365905 ns
The second iteration is done by the interpreter to start with and then we flip to running JIT compiled code.
Field Access: 1550385619 ns
Method Access: 47761359 ns
The remaining iterations are all done using JIT compiled code.
Field Access: 68 ns
Method Access: 33 ns
etcetera
The reason they are unbelievably fast is that the JIT compiler has optimized the loops away. It has detected that they were not contributing anything useful to the computation. (It is not clear why the first number seems consistently faster than the second, but I doubt that the optimized code is measuring field versus method access in any meaningful way.)
Re the UPDATED code / results: it is obvious that the JIT compiler is still optimizing the loops away.
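The dead-code elimination can be defeated by making the loop's result observable. A minimal sketch: publishing the accumulated value into a volatile field (or returning it, as JMH's Blackhole machinery effectively does) keeps the JIT from proving the loop useless and deleting it:

```java
public class SinkDemo {
    static volatile long sink; // a volatile write is an observable side effect

    public static void main(String[] args) {
        long acc = 0;
        for (int i = 0; i < 1_000_000; i++) {
            acc += i & 7; // cycles through 0..7
        }
        sink = acc; // the loop's result now escapes, so it can't be eliminated
        System.out.println(sink); // prints 3500000
    }
}
```

Without the volatile write, a compiled version of this loop is a candidate for exactly the elimination described above, and the measured time collapses toward zero.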