[Java] Arrays and Collections test, unexpected outcome?

There are probably hundreds of questions about Java Collections vs. arrays, but this is something I really didn't expect.
I am developing a server for my game. To communicate between the client and server you need to send packets (obviously), so I ran some tests to see which Collection (or array) would be best for handling them: a HashMap, an ArrayList, and a PacketHandler array. The outcome was very unexpected to me, because the ArrayList wins.
The packet handling structure is just like dictionary usage (index to PacketHandler), and because an array is the most primitive form of a dictionary, I thought it would easily outperform an ArrayList. Could someone explain to me why this is?
My test
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Random;
public class Main {
/**
* Packet handler interface.
*/
private interface PacketHandler {
void handle();
}
/**
* A dummy packet handler.
*/
private class DummyPacketHandler implements PacketHandler {
@Override
public void handle() { }
}
public Main() {
Random r = new Random();
PacketHandler[] handlers = new PacketHandler[256];
HashMap<Integer, PacketHandler> m = new HashMap<Integer, PacketHandler>();
ArrayList<PacketHandler> list = new ArrayList<PacketHandler>();
// packet handler initialization
for (int i = 0; i < 255; i++) {
DummyPacketHandler p = new DummyPacketHandler();
handlers[i] = p;
m.put(new Integer(i), p);
list.add(p);
}
// array
long time = System.currentTimeMillis();
for (int i = 0; i < 10000000; i++)
handlers[r.nextInt(255)].handle();
System.out.println((System.currentTimeMillis() - time));
// hashmap
time = System.currentTimeMillis();
for (int i = 0; i < 10000000; i++)
m.get(new Integer(r.nextInt(255))).handle();
System.out.println((System.currentTimeMillis() - time));
// arraylist
time = System.currentTimeMillis();
for (int i = 0; i < 10000000; i++)
list.get(r.nextInt(255)).handle();
System.out.println((System.currentTimeMillis() - time));
}
public static void main(String[] args) {
new Main();
}
}
I think the problem is solved now; thanks, everybody.

The short answer is that ArrayList is slightly more optimised the first time around, but is still slower in the long run.
How and when the JVM optimises the code before it's completely warmed up isn't always obvious and can change between versions and based on your command-line options.
What is really interesting is what you get when you repeat the test. Repeating matters because the code is compiled in stages in the background, and you want tests where the code is already as fast as it will get, right from the start.
There are a few things you can do to make your benchmark more reproducible.
Generate your random numbers in advance; they are not part of your test, but they can slow you down.
Place each loop in a separate method. The first loop triggers the whole method to be compiled, for better or worse.
Repeat the test 5 to 10 times and ignore the first runs.
Use System.nanoTime() instead of currentTimeMillis(). It may not make any difference here, but it is a good habit to get into.
Use autoboxing where you can so it uses the Integer cache, or use Integer.valueOf(n), which does the same thing; new Integer(n) will always create a new object (see the sketch after this list).
Make sure your inner loop does something, otherwise it's quite likely the JIT will optimise it away to nothing. ;)
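For the Integer cache point, a quick sketch of the difference (the JDK caches boxed values from -128 to 127):
public class IntegerCacheDemo {
    public static void main(String[] args) {
        // Integer.valueOf (and autoboxing) reuse cached instances for small values
        System.out.println(Integer.valueOf(100) == Integer.valueOf(100)); // true: same cached object
        // new Integer(n) always allocates, so identity comparison fails
        System.out.println(new Integer(100) == new Integer(100));         // false: two fresh objects
    }
}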
Here is the benchmark rewritten along those lines:
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Random;
public class Main {
/**
* Packet handler interface.
*/
private interface PacketHandler {
void handle();
}
/**
* A dummy packet handler.
*/
static class DummyPacketHandler implements PacketHandler {
@Override
public void handle() {
}
}
public static void main(String[] args) {
Random r = new Random();
PacketHandler[] handlers = new PacketHandler[256];
HashMap<Integer, PacketHandler> m = new HashMap<Integer, PacketHandler>();
ArrayList<PacketHandler> list = new ArrayList<PacketHandler>();
// packet handler initialization
for (int i = 0; i < 256; i++) {
DummyPacketHandler p = new DummyPacketHandler();
handlers[i] = p;
m.put(new Integer(i), p);
list.add(p);
}
int runs = 10000000;
int[] handlerToUse = new int[runs];
for (int i = 0; i < runs; i++)
handlerToUse[i] = r.nextInt(256);
for (int i = 0; i < 5; i++) {
testArray(handlers, runs, handlerToUse);
testHashMap(m, runs, handlerToUse);
testArrayList(list, runs, handlerToUse);
System.out.println();
}
}
private static void testArray(PacketHandler[] handlers, int runs, int[] handlerToUse) {
// array
long time = System.nanoTime();
for (int i = 0; i < runs; i++)
handlers[handlerToUse[i]].handle();
System.out.print((System.nanoTime() - time)/1e6+" ");
}
private static void testHashMap(HashMap<Integer, PacketHandler> m, int runs, int[] handlerToUse) {
// hashmap
long time = System.nanoTime();
for (int i = 0; i < runs; i++)
m.get(handlerToUse[i]).handle();
System.out.print((System.nanoTime() - time)/1e6+" ");
}
private static void testArrayList(ArrayList<PacketHandler> list, int runs, int[] handlerToUse) {
// arraylist
long time = System.nanoTime();
for (int i = 0; i < runs; i++)
list.get(handlerToUse[i]).handle();
System.out.print((System.nanoTime() - time)/1e6+" ");
}
}
This prints (columns: array, HashMap, ArrayList):
24.62537 263.185092 24.19565
28.997305 206.956117 23.437585
19.422327 224.894738 21.191718
14.154433 194.014725 16.927638
13.897081 163.383876 16.678818
After the code warms up, the array is marginally faster.

There are at least a few problems with your benchmark:
you run your tests directly in main, meaning that when your main method gets compiled, the JIT compiler has not had time to optimise all the code because it has not run it yet
the map test creates a new Integer each time, which is not fair: use m.get(r.nextInt(255)).handle(); to allow the Integer cache to be used
you need to run your test several times before you can draw conclusions
you are not using the result of what you do in your loops, so the JIT is allowed to simply skip them
monitor GC, as it might run during one of your loops and bias its results; add a System.gc() call between the loops
But before doing all that, read this post ;-)
After tweaking your code a bit, I get these results:
Array: 116
Map: 139
List: 117
So array and list are close to identical once compiled and map is slightly slower.
Code:
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Random;
public class Main {
/**
* Packet handler interface.
*/
private interface PacketHandler {
int handle();
}
/**
* A dummy packet handler.
*/
private class DummyPacketHandler implements PacketHandler {
@Override
public int handle() {
return 123;
}
}
public Main() {
Random r = new Random();
PacketHandler[] handlers = new PacketHandler[256];
HashMap<Integer, PacketHandler> m = new HashMap<Integer, PacketHandler>();
ArrayList<PacketHandler> list = new ArrayList<PacketHandler>();
// packet handler initialization
for (int i = 0; i < 255; i++) {
DummyPacketHandler p = new DummyPacketHandler();
handlers[i] = p;
m.put(new Integer(i), p);
list.add(p);
}
long sum = 0;
runArray(handlers, r, 20000);
runMap(m, r, 20000);
runList(list, r, 20000);
// array
long time = System.nanoTime();
sum += runArray(handlers, r, 10000000);
System.out.println("Array: " + (System.nanoTime() - time) / 1000000);
// hashmap
time = System.nanoTime();
sum += runMap(m, r, 10000000);
System.out.println("Map: " + (System.nanoTime() - time) / 1000000);
// arraylist
time = System.nanoTime();
sum += runList(list, r, 10000000);
System.out.println("List: " + (System.nanoTime() - time) / 1000000);
System.out.println(sum);
}
public static void main(String[] args) {
new Main();
}
private long runArray(PacketHandler[] handlers, Random r, int loops) {
long sum = 0;
for (int i = 0; i < loops; i++)
sum += handlers[r.nextInt(255)].handle();
return sum;
}
private long runMap(HashMap<Integer, PacketHandler> m, Random r, int loops) {
long sum = 0;
for (int i = 0; i < loops; i++)
sum += m.get(new Integer(r.nextInt(255))).handle();
return sum;
}
private long runList(List<PacketHandler> list, Random r, int loops) {
long sum = 0;
for (int i = 0; i < loops; i++)
sum += list.get(r.nextInt(255)).handle();
return sum;
}
}

Related

Java Multithreaded vector addition

I am trying to get familiar with Java multithreaded applications. I tried to think of a simple application that parallelizes very well, and I thought vector addition would be a good fit.
However, when running on my Linux server (which has 4 cores) I don't get any speed-up. The time to execute on 4, 2, or 1 threads is about the same.
Here is the code I came up with:
public static void main(String[]args)throws InterruptedException{
final int threads = Integer.parseInt(args[0]);
final int length= Integer.parseInt(args[1]);
final int balk=(length/threads);
Thread[]th = new Thread[threads];
final double[]result =new double[length];
final double[]array1=getRandomArray(length);
final double[]array2=getRandomArray(length);
long startingTime =System.nanoTime();
for(int i=0;i<threads;i++){
final int current=i;
th[i]=new Thread(()->{
for(int k=current*balk;k<(current+1)*balk;k++){
result[k]=array1[k]+array2[k];
}
});
th[i].start();
}
for(int i=0;i<threads;i++){
th[i].join();
}
System.out.println("Time needed: "+(System.nanoTime()-startingTime));
}
length is always a multiple of threads and getRandomArray() creates a random array of doubles between 0 and 1.
Execution Time for 1-Thread: 84579446ns
Execution Time for 2-Thread: 74211325ns
Execution Time for 4-Thread: 89215100ns
length =10000000
Here is the Code for getRandomArray():
private static double[]getRandomArray(int length){
Random random =new Random();
double[]array= new double[length];
for(int i=0;i<length;i++){
array[i]=random.nextDouble();
}
return array;
}
I would appreciate any help.
The difference is observable for the following code. Try it.
public static void main(String[]args)throws InterruptedException{
for(int z = 0; z < 10; z++) {
final int threads = 1;
final int length= 100_000_000;
final int balk=(length/threads);
Thread[]th = new Thread[threads];
final boolean[]result =new boolean[length];
final boolean[]array1=getRandomArray(length);
final boolean[]array2=getRandomArray(length);
long startingTime =System.nanoTime();
for(int i=0;i<threads;i++){
final int current=i;
th[i]=new Thread(()->{
for(int k=current*balk;k<(current+1)*balk;k++){
result[k]=array1[k] | array2[k];
}
});
th[i].start();
}
for(int i=0;i<threads;i++){
th[i].join();
}
System.out.println("Time needed: "+(System.nanoTime()-startingTime)*1.0/1000/1000);
boolean x = false;
for(boolean d : result) {
x |= d;
}
System.out.println(x);
}
}
First things first: you need to warm up your code; this way you measure compiled code. The first two iterations take approximately the same time, but the next ones will differ. I also changed double to boolean because my machine doesn't have much memory; this lets me allocate a huge array, and it also makes the work more CPU-intensive.
There is a link in the comments; I suggest you read it.
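To make the warm-up point concrete, here is a minimal single-threaded sketch (my own stand-in for the workload above): run the workload a few times untimed so the JIT compiles it, then measure.
public class WarmupSketch {
    // stand-in for the vector-addition workload from the question
    static void addArrays(double[] a, double[] b, double[] out) {
        for (int k = 0; k < out.length; k++) out[k] = a[k] + b[k];
    }
    public static void main(String[] args) {
        int length = 10_000_000;
        double[] a = new double[length], b = new double[length], out = new double[length];
        for (int i = 0; i < 5; i++) addArrays(a, b, out); // warm-up runs, untimed
        long start = System.nanoTime();
        addArrays(a, b, out); // measured only after the JIT has had a chance to compile
        System.out.println("Time needed: " + (System.nanoTime() - start) / 1e6 + " ms");
    }
}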
If you are trying to see how your cores share work, you can give all of them a very simple task, but make them work constantly on something that is not shared across threads (basically simulating, for example, merge sort, where threads work on something complicated and only touch shared resources for a small amount of time). Using your code, I did something like this. In such a case you should see almost exactly a 2x and 4x speed-up.
public static void main(String[]args)throws InterruptedException{
for(int a=0; a<5; a++) {
final int threads = 2;
final int length = 10;
final int balk = (length / threads);
Thread[] th = new Thread[threads];
System.out.println(Runtime.getRuntime().availableProcessors());
final double[] result = new double[length];
final double[] array1 = getRandomArray(length);
final double[] array2 = getRandomArray(length);
long startingTime = System.nanoTime();
for (int i = 0; i < threads; i++) {
final int current = i;
th[i] = new Thread(() -> {
Random random = new Random();
int meaningless = 0;
for (int k = current * balk; k < (current + 1) * balk; k++) {
result[k] = array1[k] + array2[k];
for (int j = 0; j < 10000000; j++) {
meaningless+=random.nextInt(10);
}
}
});
th[i].start();
}
for (int i = 0; i < threads; i++) {
th[i].join();
}
System.out.println("Time needed: " + ((System.nanoTime() - startingTime) * 1.0) / 1000000000 + " s");
}
}
You see, in your code most of the time is consumed by building the big table; the threads themselves then execute very fast. Their work is so fast that your time measurement is wrong, because most of the time is spent creating the threads. When I invoked code that works on a precalculated loop, like this:
long startingTime =System.nanoTime();
for(int k=0; k<length; k++){
result[k]=array1[k]|array2[k];
}
System.out.println("Time needed: "+(System.nanoTime()-startingTime));
It ran twice as fast as your code with 2 threads. I hope you understand what I mean here, and see my point in giving my threads much more meaningless work.

Why is initial capacity important for deleting from ArrayList?

For my work I have done some tests for a timing chart.
I have come across something that surprised me and I need help understanding it.
I used a few data structures as queues and wanted to know how fast deletion is, depending on the number of items. An ArrayList with 10 items, deleting from the front without setting an initial capacity, is much slower than the same list with the initial capacity set (to 15). Why? And why is it the same at 100 items?
Here's the chart (image not reproduced):
Data structures legend: L - implements List, C - set initial capacity, B - removing from back, Q - implements Queue
EDIT:
Appending the relevant piece of code:
new Thread(new Runnable() {
#Override
public void run()
{
long time;
final int[] arr = {10, 100, 1000, 10000, 100000, 1000000};
for (int anArr : arr)
{
final List<Word> temp = new ArrayList<>();
while (temp.size() < anArr) temp.add(new Item());
final int top = (int) Math.sqrt(anArr);
final List<Word> first = new ArrayList<>();
final List<Word> second = new ArrayList<>(anArr);
...
first.addAll(temp);
second.addAll(temp);
...
SystemClock.sleep(5000);
time = System.nanoTime();
for (int i = 0; i < top; ++i) first.remove(0);
Log.d("al_l", "rem: " + (System.nanoTime() - time));
time = System.nanoTime();
for (int i = 0; i < top; ++i) second.remove(0);
Log.d("al_lc", "rem: " + (System.nanoTime() - time));
...
}
}
}).start();
Read this article about Avoiding Benchmarking Pitfalls on the JVM. It explains the impact of the HotSpot VM on test results; if you don't take care of it, your measurements aren't right, as you have found out with your own test.
If you want to do reliable benchmarking, use JMH.
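For illustration, a minimal JMH benchmark for this remove-from-front scenario might look like the sketch below (this assumes the org.openjdk.jmh dependency is set up and uses the current JMH annotations; the class and method names are illustrative):
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;
import org.openjdk.jmh.annotations.*;
@State(Scope.Thread)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
public class RemoveFromFrontBenchmark {
    @Param({"10", "100"})
    int size;
    List<String> withCapacity;
    List<String> withoutCapacity;
    // Level.Invocation rebuilds the lists before every call, since remove() mutates them
    @Setup(Level.Invocation)
    public void setUp() {
        withCapacity = new ArrayList<>(size + 5);
        withoutCapacity = new ArrayList<>();
        for (int i = 0; i < size; i++) {
            withCapacity.add("T");
            withoutCapacity.add("T");
        }
    }
    @Benchmark
    public Object removeWithCapacity() {
        Object last = null;
        while (!withCapacity.isEmpty()) last = withCapacity.remove(0);
        return last; // returning the result keeps the JIT from eliminating the loop
    }
    @Benchmark
    public Object removeWithoutCapacity() {
        Object last = null;
        while (!withoutCapacity.isEmpty()) last = withoutCapacity.remove(0);
        return last;
    }
}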
I too was able to replicate it with the code below. However, I noticed that whichever is run first (set capacity vs. non-set capacity) is the one that takes the longest. I assume this is some kind of optimization, maybe by the JVM, or some kind of caching?
import java.util.ArrayList;
public class Test {
public static void main(String[] args) {
measure(-1, 10); // switch with line below
measure(15, 10); // switch with line above
measure(-1, 100);
measure(15, 100);
}
public static void measure(int capacity, long numItems) {
ArrayList<String> arr = new ArrayList<>();
if (capacity >= 1) {
arr.ensureCapacity(capacity);
}
for (int i = 0; i <= numItems; i++) {
arr.add("T");
}
long start = System.nanoTime();
for (int i = 0; i <= numItems; i++) {
arr.remove(0);
}
long end = System.nanoTime();
System.out.println("Capacity: " + capacity + ", " + "Runtime: "
+ (end - start));
}
}

ThreadLocalRandom vs shared static Random instance performance test

In our project, for one task, we used a static Random instance for random number generation. After the Java 7 release, the new ThreadLocalRandom class appeared for generating random numbers.
From the spec:
When applicable, use of ThreadLocalRandom rather than shared Random objects in concurrent programs will typically encounter much less overhead and contention. Use of ThreadLocalRandom is particularly appropriate when multiple tasks (for example, each a ForkJoinTask) use random numbers in parallel in thread pools.
and also:
When all usages are of this form, it is never possible to accidently share a ThreadLocalRandom across multiple threads.
So I've made my little test:
import java.util.HashMap;
import java.util.Map;
import java.util.Random;
import java.util.concurrent.ThreadLocalRandom;
public class ThreadLocalRandomTest {
private static final int THREAD_COUNT = 100;
private static final int GENERATED_NUMBER_COUNT = 1000;
private static final int INT_RIGHT_BORDER = 5000;
private static final int EXPERIMENTS_COUNT = 5000;
public static void main(String[] args) throws InterruptedException {
System.out.println("Number of threads: " + THREAD_COUNT);
System.out.println("Length of generated numbers chain for each thread: " + GENERATED_NUMBER_COUNT);
System.out.println("Right border integer: " + INT_RIGHT_BORDER);
System.out.println("Count of experiments: " + EXPERIMENTS_COUNT);
int repeats = 0;
int workingTime = 0;
long startTime = 0;
long endTime = 0;
for (int i = 0; i < EXPERIMENTS_COUNT; i++) {
startTime = System.currentTimeMillis();
repeats += calculateRepeatsForSharedRandom();
endTime = System.currentTimeMillis();
workingTime += endTime - startTime;
}
System.out.println("Average repeats for shared Random instance: " + repeats / EXPERIMENTS_COUNT
+ ". Average working time: " + workingTime / EXPERIMENTS_COUNT + " ms.");
repeats = 0;
workingTime = 0;
for (int i = 0; i < EXPERIMENTS_COUNT; i++) {
startTime = System.currentTimeMillis();
repeats += calculateRepeatsForTheadLocalRandom();
endTime = System.currentTimeMillis();
workingTime += endTime - startTime;
}
System.out.println("Average repeats for ThreadLocalRandom: " + repeats / EXPERIMENTS_COUNT
+ ". Average working time: " + workingTime / EXPERIMENTS_COUNT + " ms.");
}
private static int calculateRepeatsForSharedRandom() throws InterruptedException {
final Random rand = new Random();
final Map<Integer, Integer> counts = new HashMap<>();
for (int i = 0; i < THREAD_COUNT; i++) {
Thread thread = new Thread() {
@Override
public void run() {
for (int j = 0; j < GENERATED_NUMBER_COUNT; j++) {
int random = rand.nextInt(INT_RIGHT_BORDER);
if (!counts.containsKey(random)) {
counts.put(random, 0);
}
counts.put(random, counts.get(random) + 1);
}
}
};
thread.start();
thread.join();
}
int repeats = 0;
for (Integer value : counts.values()) {
if (value > 1) {
repeats += value;
}
}
return repeats;
}
private static int calculateRepeatsForTheadLocalRandom() throws InterruptedException {
final Map<Integer, Integer> counts = new HashMap<>();
for (int i = 0; i < THREAD_COUNT; i++) {
Thread thread = new Thread() {
@Override
public void run() {
for (int j = 0; j < GENERATED_NUMBER_COUNT; j++) {
int random = ThreadLocalRandom.current().nextInt(INT_RIGHT_BORDER);
if (!counts.containsKey(random)) {
counts.put(random, 0);
}
counts.put(random, counts.get(random) + 1);
}
}
};
thread.start();
thread.join();
}
int repeats = 0;
for (Integer value : counts.values()) {
if (value > 1) {
repeats += value;
}
}
return repeats;
}
}
I've also added a test for a non-shared Random and got these results:
Number of threads: 100
Length of generated numbers chain for each thread: 100
Right border integer: 5000
Count of experiments: 10000
Average repeats for non-shared Random instance: 8646. Average working time: 13 ms.
Average repeats for shared Random instance: 8646. Average working time: 13 ms.
Average repeats for ThreadLocalRandom: 8646. Average working time: 13 ms.
To me this is a little strange, as I expected at least a speed increase when using ThreadLocalRandom compared to the shared Random instance, but I see no difference at all.
Can someone explain why it works that way? Maybe I haven't done the testing properly. Thank you!
You're not running anything in parallel because you're waiting for each thread to finish immediately after starting it. You need a waiting loop outside the loop that starts the threads:
List<Thread> threads = new ArrayList<Thread>();
for (int i = 0; i < THREAD_COUNT; i++) {
Thread thread = new Thread() {
@Override
public void run() {
for (int j = 0; j < GENERATED_NUMBER_COUNT; j++) {
int random = rand.nextInt(INT_RIGHT_BORDER);
if (!counts.containsKey(random)) {
counts.put(random, 0);
}
counts.put(random, counts.get(random) + 1);
}
}
};
threads.add(thread);
thread.start();
}
for (Thread thread: threads) {
thread.join();
}
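Note that counts is still a plain HashMap shared by all the threads; once they genuinely run in parallel, updates can be lost because HashMap is not thread-safe. A minimal sketch of a thread-safe tally (my assumption of the intent, using ConcurrentHashMap):
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ThreadLocalRandom;
public class SafeCounts {
    public static void main(String[] args) throws InterruptedException {
        final ConcurrentHashMap<Integer, Integer> counts = new ConcurrentHashMap<>();
        Runnable task = () -> {
            for (int j = 0; j < 1000; j++) {
                int random = ThreadLocalRandom.current().nextInt(5000);
                counts.merge(random, 1, Integer::sum); // atomic per-key increment
            }
        };
        Thread[] threads = new Thread[100];
        for (int i = 0; i < threads.length; i++) (threads[i] = new Thread(task)).start();
        for (Thread t : threads) t.join();
        System.out.println(counts.size() + " distinct values generated");
    }
}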
For one thing, your testing code is flawed; the bane of benchmarkers everywhere.
thread.start();
thread.join();
why not save LOCs and write
thread.run();
the outcome is the same.
EDIT: If you don't see why the outcome is the same, it's because you're running single-threaded tests; there's no multithreading going on.
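A minimal sketch of that difference: run() executes the body on the calling thread, while start() actually spawns a new one.
public class RunVsStart {
    public static void main(String[] args) throws InterruptedException {
        Runnable whoAmI = () -> System.out.println(Thread.currentThread().getName());
        new Thread(whoAmI).run();   // prints "main": runs on the calling thread, no concurrency
        Thread t = new Thread(whoAmI);
        t.start();                  // prints the new thread's name, e.g. "Thread-1"
        t.join();
    }
}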
Maybe it would be easier to just have a look at what actually happens. Here is the source for ThreadLocal.get(), which is also called by ThreadLocalRandom.current().
public T get() {
Thread t = Thread.currentThread();
ThreadLocalMap map = getMap(t);
if (map != null) {
ThreadLocalMap.Entry e = map.getEntry(this);
if (e != null)
return (T)e.value;
}
return setInitialValue();
}
Where ThreadLocalMap is a specialized HashMap-like implementation with optimizations.
So what basically happens is that ThreadLocal holds a map Thread->Object - or in this case Thread->Random - which is then looked up and either returned or created. As this is nothing 'magical', the timing will be equal to a HashMap lookup plus the initial creation overhead of the actual Object to be returned. Since a HashMap lookup (in this optimized case) is constant-time, the cost of a lookup is k, where k is the calculation cost of the hash function.
So you can make some assumptions:
ThreadLocal will be faster than creating the object each time in each Runnable, unless the creation cost is much smaller than k. So looking up a Random is a good thing; putting an int inside might not be so smart.
ThreadLocal will be better than using your own HashMap, as such a generic implementation can be assumed to be equal to k or worse.
ThreadLocal will be slower than any lookup with a cost < k. Example: store everything in an array first, then do myRandoms[threadID] (sketched below).
But that assumes you know up front which threads will be processing your work, so this isn't a real candidate for ThreadLocal anyway.
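A minimal sketch contrasting the two lookups (the array scheme and its names are illustrative):
import java.util.Random;
public class ThreadLocalLookupDemo {
    // one Random per thread, found via a per-thread map lookup of cost ~k
    static final ThreadLocal<Random> PER_THREAD = ThreadLocal.withInitial(Random::new);
    // the cheaper alternative from the last point: plain array indexing,
    // feasible only when the set of worker threads is known up front
    static final Random[] MY_RANDOMS = { new Random(), new Random() };
    public static void main(String[] args) {
        int a = PER_THREAD.get().nextInt(100);  // ThreadLocalMap lookup, then nextInt
        int b = MY_RANDOMS[0].nextInt(100);     // direct index, cost < k
        System.out.println(a + " " + b);
    }
}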

Java - Vector vs ArrayList performance - test

Everybody's saying that one should use ArrayList instead of Vector because of performance (since Vector synchronizes on every operation, and so on). I've written a simple test:
import java.util.ArrayList;
import java.util.Date;
import java.util.Vector;
public class ComparePerformance {
public static void main(String[] args) {
ArrayList<Integer> list = new ArrayList<Integer>();
Vector<Integer> vector = new Vector<Integer>();
int size = 10000000;
int listSum = 0;
int vectorSum = 0;
long startList = new Date().getTime();
for (int i = 0; i < size; i++) {
list.add(new Integer(1));
}
for (Integer integer : list) {
listSum += integer;
}
long endList = new Date().getTime();
System.out.println("List time: " + (endList - startList));
long startVector = new Date().getTime();
for (int i = 0; i < size; i++) {
vector.add(new Integer(1));
}
for (Integer integer : list) { // note: iterates the ArrayList here, not the Vector (bug in the original test)
vectorSum += integer;
}
long endVector = new Date().getTime();
System.out.println("Vector time: " + (endVector - startVector));
}
}
The results are as follows:
List time: 4360
Vector time: 4103
Based on this it seems that Vector performance at iterating over and reading is slightly better. Maybe this is a dumb question or I've made wrong assumptions - can somebody please explain this?
You have written a naïve microbenchmark. Microbenchmarking on the JVM is very tricky business and it is not even easy to enumerate all the pitfalls, but here are some classic ones:
you must warm up the code;
you must control for garbage collection pauses;
System.currentTimeMillis is imprecise, but you don't seem to be aware of even this method (your new Date().getTime() is equivalent, but slower).
If you want to do this properly, then check out Oracle's jmh tool or Google's Caliper.
My Test Results
Since I was kind of interested to see these numbers myself, here is the output of jmh. First, the test code:
import static java.util.Arrays.asList;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Vector;
public class Benchmark1
{
static Integer[] ints = new Integer[0];
static {
final List<Integer> list = new ArrayList<Integer>(asList(1,2,3,4,5,6,7,8,9,10));
for (int i = 0; i < 5; i++) list.addAll(list);
ints = list.toArray(ints);
}
static List<Integer> intList = Arrays.asList(ints);
static Vector<Integer> vec = new Vector<Integer>(intList);
static List<Integer> list = new ArrayList<Integer>(intList);
@GenerateMicroBenchmark
public Vector<Integer> testVectorAdd() {
final Vector<Integer> v = new Vector<Integer>();
for (Integer i : ints) v.add(i);
return v;
}
@GenerateMicroBenchmark
public long testVectorTraverse() {
long sum = (long)Math.random()*10;
for (int i = 0; i < vec.size(); i++) sum += vec.get(i);
return sum;
}
@GenerateMicroBenchmark
public List<Integer> testArrayListAdd() {
final List<Integer> l = new ArrayList<Integer>();
for (Integer i : ints) l.add(i);
return l;
}
@GenerateMicroBenchmark
public long testArrayListTraverse() {
long sum = (long)Math.random()*10;
for (int i = 0; i < list.size(); i++) sum += list.get(i);
return sum;
}
}
And the results:
testArrayListAdd 234.896 ops/msec
testVectorAdd 274.886 ops/msec
testArrayListTraverse 1718.711 ops/msec
testVectorTraverse 34.843 ops/msec
Note the following:
in the ...add methods I am creating a new, local collection. The JIT compiler uses this fact and elides the locking on Vector methods—hence almost equal performance;
in the ...traverse methods I am reading from a global collection; the locks cannot be elided and this is where the true performance penalty of Vector shows up.
The main takeaway from this should be: the performance model on the JVM is highly complex, sometimes even erratic. Extrapolating from microbenchmarks, even when they are done with all due care, can lead to dangerously wrong predictions about production system performance.
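To illustrate the elision point, a sketch under the assumption that escape analysis kicks in (this is JVM-dependent and not guaranteed): when a Vector never escapes the method, the JIT may prove no other thread can see it and drop its monitor operations.
import java.util.Vector;
public class LockElisionSketch {
    // v is purely local and never escapes, so its synchronization is a candidate for elision
    static int localSum() {
        Vector<Integer> v = new Vector<Integer>();
        for (int i = 0; i < 100; i++) v.add(i);
        int sum = 0;
        for (int i = 0; i < v.size(); i++) sum += v.get(i);
        return sum;
    }
    public static void main(String[] args) {
        System.out.println(localSum()); // 4950
    }
}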
I agree with Marko about using Caliper; it's an awesome framework.
But you can get part of the way there yourself if you organize your benchmark a bit better:
import java.util.ArrayList;
import java.util.List;
import java.util.Vector;
import java.util.concurrent.TimeUnit;
public class ComparePerformance {
private static final int SIZE = 1000000;
private static final int RUNS = 500;
private static final Integer ONE = Integer.valueOf(1);
static class Run {
private final List<Integer> list;
Run(final List<Integer> list) {
this.list = list;
}
public long perform() {
long oldNanos = System.nanoTime();
for (int i = 0; i < SIZE; i++) {
list.add(ONE);
}
return System.nanoTime() - oldNanos;
}
}
public static void main(final String[] args) {
long arrayListTotal = 0L;
long vectorTotal = 0L;
for (int i = 0; i < RUNS; i++) {
if (i % 50 == 49) {
System.out.println("Run " + (i + 1));
}
arrayListTotal += new Run(new ArrayList<Integer>()).perform();
vectorTotal += new Run(new Vector<Integer>()).perform();
}
System.out.println();
System.out.println("Runs: "+RUNS+", list size: "+SIZE);
output(arrayListTotal, "List");
output(vectorTotal, "Vector");
}
private static void output(final long value, final String name) {
System.out.println(name + " total time: " + value + " (" + TimeUnit.NANOSECONDS.toMillis(value) + " " + "ms)");
long avg = value / RUNS;
System.out.println(name + " average time: " + avg + " (" + TimeUnit.NANOSECONDS.toMillis(avg) + " " + "ms)");
}
}
The key part is running your code often. Also, remove stuff that's unrelated to your benchmark, and re-use Integers instead of creating new ones.
The above benchmark code creates this output on my machine:
Runs: 500, list size: 1000000
List total time: 3524708559 (3524 ms)
List average time: 7049417 (7 ms)
Vector total time: 6459070419 (6459 ms)
Vector average time: 12918140 (12 ms)
I'd say that should give you an idea of the performance differences.
As Marko Topolnik said, it is hard to write correct microbenchmarks and to interpret the results correctly. There are good articles available about the subject.
From my experience and what I know of the implementation, I use this rule of thumb:
Use ArrayList.
If the collection must be synchronized, consider using Vector. (I never end up using it, because there are other solutions for synchronization, concurrency and parallel programming; see the sketch after this list.)
If there are many elements in the collection and there are frequent inserts or removes inside the list (not at the end), use LinkedList.
Most collections do not contain many elements, and it would be a waste of time to spend more effort on them. In Scala there are also parallel collections, which perform some operations in parallel; maybe something similar is available in pure Java, too.
Whenever possible, use the List interface to hide implementation details, and try to add comments that show WHY you've chosen a specific implementation.
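For example, a minimal sketch of the usual alternative to Vector, declared against the List interface:
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
public class SynchronizedListDemo {
    public static void main(String[] args) {
        // a synchronized wrapper over ArrayList instead of Vector
        List<Integer> list = Collections.synchronizedList(new ArrayList<Integer>());
        list.add(1);
        // iteration still needs manual synchronization on the wrapper
        synchronized (list) {
            for (Integer i : list) System.out.println(i);
        }
    }
}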
I ran your test, and ArrayList is faster than Vector with a size of 1000000:
public static void main(String[] args) {
ArrayList<Integer> list = new ArrayList<Integer>();
Vector<Integer> vector = new Vector<Integer>();
int size= 1000000;
int listSum = 0;
int vectorSum = 0;
long startList = System.nanoTime();
for (int i = 0; i < size; i++) {
list.add(Integer.valueOf(1));
}
for (Integer integer : list) {
listSum += integer;
}
long endList = System.nanoTime();
System.out.println("List time: " + (endList - startList)/1000000);
//
// long startVector = System.nanoTime();
// for (int i = 0; i < size; i++) {
// vector.add(Integer.valueOf(1));
// }
// for (Integer integer : list) {
// vectorSum += integer;
// }
// long endVector = System.nanoTime();
// System.out.println("Vector time: " + (endVector - startVector)/1000000);
}
Output from running it several times:
List time: 83
Vector time: 113

java looping - declaration of a Class outside / inside the loop [duplicate]

This question already has answers here:
Closed 12 years ago.
Possible Duplicates:
Which loop has better performance? Why?
Which is optimal ?
Efficiency of Java code with primitive types
when looping, for instance:
for ( int j = 0; j < 1000; j++) {}; and I need to instantiate 1000 objects, how does it differ when I declare the object inside the loop from declaring it outside the loop ??
for ( int j = 0; j < 1000; j++) {Object obj; obj =}
vs
Object obj;
for ( int j = 0; j < 1000; j++) {obj =}
It's obvious that the object is accessible either only from the loop scope or from the surrounding scope. But I don't understand the performance side, garbage collection, etc.
What is the best practice? Thank you
The first form is better. Limiting the scope of a variable makes it easier for readers to understand where and how it is used.
Performance-wise, there are some small advantages to limited scope as well, which you can read about in another answer. But these concerns are secondary to code comprehension.
There's no difference. The compiler will optimize them to the very same place.
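A minimal sketch of the preferred form; per the answers above, the narrower scope helps readers, and the compiled code ends up the same either way.
public class ScopeDemo {
    public static void main(String[] args) {
        for (int j = 0; j < 1000; j++) {
            // declared inside the loop: obj cannot be used (or misused) after the loop body
            Object obj = new Object();
            if (obj.hashCode() == -1) System.out.println(j); // token use so the loop isn't dead code
        }
    }
}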
I've tested the issue on my machine: the difference was about 2-4 ms over 10000 instances. I tested all kinds of things, like instantiating with an assignment:
int i=0;
compared with:
int i;
i=0;
Here is the code I used for testing. Of course I changed it between tests, and there is an initial balancing effect before the machine reaches optimization; you can see that clearly once you test:
package initializer;
import java.util.ArrayList;
public final class EfficiencyTests {
private static class Stoper {
private long initTime;
private long executionDuration;
public Stoper() {
// TODO Auto-generated constructor stub
}
private void start() {
initTime = System.nanoTime();
}
private void stop() {
executionDuration = System.nanoTime() - initTime;
}
@Override
public String toString() {
return executionDuration + " nanos";
}
}
private static Stoper stoper = new Stoper();
public static void main(String[] args) {
for (int i = 0; i < 100; i++) {
theCycleOfAForLoop(100000);
theCycleOfAForLoopWithACallToSize(100000);
howLongDoesItTakeToSetValueToAVariable(100000);
howLongDoesItTakeToDefineAVariable(100000);
System.out.println("\n");
}
}
private static void theCycleOfAForLoop(int loops) {
stoper.start();
for (int i = 0; i < loops; i++);
stoper.stop();
System.out.println("The average duration of 10 cycles of an empty 'for' loop over " + loops + " iterations is: " + stoper.executionDuration * 10 / loops);
}
private static void theCycleOfAForLoopWithACallToSize(int loops) {
ArrayList<Object> objects=new ArrayList<Object>();
for (int i = 0; i < loops; i++)
objects.add(new Object());
stoper.start();
for (int i = 0; i < objects.size(); i++);
stoper.stop();
System.out.println("The average duration of 10 cycles of an empty 'for' loop with call to size over " + loops + " iterations is: " + stoper.executionDuration * 10 / loops);
}
private static void howLongDoesItTakeToSetValueToAVariable(int loops) {
int value = 0;
stoper.start();
for (int i = 0; i < loops; i++) {
value = 2;
}
stoper.stop();
System.out.println("The average duration of 10 cycles of setting a variable to a constant over " + loops + " iterations is: " + stoper.executionDuration * 10 / loops);
}
private static void howLongDoesItTakeToDefineAVariable(int loops) {
stoper.start();
for (int i = 0; i < loops; i++) {
int value = 0;
}
stoper.stop();
System.out.println("The average duration of 10 cycles of initializing and setting a variable to a constant over " + loops + " iterations is: " + stoper.executionDuration * 10 / loops);
}
private static void runAForLoopOnAnArrayOfObjects() {
// TODO Auto-generated method stub
}}
You can derive how long one takes if you subtract the time of the other (if you understand what I mean).
Hope this saves you some time.
The thing to understand is that I tested these things to optimize the paint-update loop of my platform, and it helped.
Adam.
