I had thought that HashMaps were faster for random access of individual values than ArrayLists... that is to say, that HashMap.get(key) should be faster than ArrayList.get(index), simply because the ArrayList has to traverse every element of the collection to reach its value, whereas the HashMap does not. You know, O(1) vs. O(n) and all that.
Edit: So my understanding of HashMaps was/is inadequate, hence my confusion. The results from this code are as expected. Thanks for the many explanations.
So I decided to test it, on a lark. Here is my code:
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.ListIterator;
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.Scanner;
public class Testing
{
public static void main(String[] args)
{
ArrayList<SomeClass> alist = new ArrayList<>();
HashMap<Short, SomeClass> hmap = new HashMap<>(4000, 0.75f);
ListIterator<SomeClass> alistiterator = alist.listIterator();
short j = 0;
do
{
alistiterator.add(new SomeClass());
j++;
}
while(j < 4000);
for (short i = 0; i < 4000; i++)
{
hmap.put(i, new SomeClass());
}
boolean done = false;
Scanner input = new Scanner(System.in);
String blargh = null;
do
{
System.out.println("\nEnter 1 to run iteration tests.");
System.out.println("Enter w to run warmup (recommended)");
System.out.println("Enter x to terminate program.");
try
{
blargh = input.nextLine();
}
catch (NoSuchElementException e)
{
System.out.println("Uh, what? Try again./n");
continue;
}
switch (blargh)
{
case "1":
long starttime = 0;
long total = 0;
for (short i = 0; i < 1000; i++)
{
starttime = System.nanoTime();
iteratearraylist(alist);
total += System.nanoTime() - starttime;
}
total = (long)(total * .001);
System.out.println(total + " ns: iterating sequentially"
+ " through ArrayList");
total = 0;
for (short i = 0; i < 1000; i++)
{
starttime = System.nanoTime();
iteratearraylistbyget(alist);
total += System.nanoTime() - starttime;
}
total = (long)(total * .001);
System.out.println(total + " ns: iterating sequentially"
+ " through ArrayList via .get()");
total = 0;
for (short i = 0; i < 1000; i++)
{
starttime = System.nanoTime();
iteratehashmap(hmap);
total += System.nanoTime() - starttime;
}
total = (long)(total * .001);
System.out.println(total + " ns: iterating sequentially"
+ " through HashMap via .next()");
total = 0;
for (short i = 0; i < 1000; i++)
{
starttime = System.nanoTime();
iteratehashmapbykey(hmap);
total += System.nanoTime() - starttime;
}
total = (long)(total * .001);
System.out.println(total + " ns: iterating sequentially"
+ " through HashMap via .get()");
total = 0;
for (short i = 0; i < 1000; i++)
{
starttime = System.nanoTime();
getvaluebyindex(alist);
total += System.nanoTime() - starttime;
}
total = (long)(total * .001);
System.out.println(total + " ns: getting end value"
+ " from ArrayList");
total = 0;
for (short i = 0; i < 1000; i++)
{
starttime = System.nanoTime();
getvaluebykey(hmap);
total += System.nanoTime() - starttime;
}
total = (long)(total * .001);
System.out.println(total + " ns: getting end value"
+ " from HashMap");
break;
case "w":
for (int i = 0; i < 60000; i++)
{
iteratearraylist(alist);
iteratearraylistbyget(alist);
iteratehashmap(hmap);
iteratehashmapbykey(hmap);
getvaluebyindex(alist);
getvaluebykey(hmap);
}
break;
case "x":
done = true;
break;
default:
System.out.println("Invalid entry. Please try again.");
break;
}
}
while (!done);
input.close();
}
public static void iteratearraylist(ArrayList<SomeClass> alist)
{
ListIterator<SomeClass> tempiterator = alist.listIterator();
do
{
tempiterator.next();
}
while (tempiterator.hasNext());
}
public static void iteratearraylistbyget(ArrayList<SomeClass> alist)
{
short i = 0;
do
{
alist.get(i);
i++;
}
while (i < 4000);
}
public static void iteratehashmap(HashMap<Short, SomeClass> hmap)
{
Iterator<Map.Entry<Short, SomeClass>> hmapiterator =
hmap.entrySet().iterator();
do
{
hmapiterator.next();
}
while (hmapiterator.hasNext());
}
public static void iteratehashmapbykey(HashMap<Short, SomeClass> hmap)
{
short i = 0;
do
{
hmap.get(i);
i++;
}
while (i < 4000);
}
public static void getvaluebykey(HashMap<Short, SomeClass> hmap)
{
hmap.get((short) 3999); // cast needed: a bare 3999 autoboxes to Integer and never matches a Short key
}
public static void getvaluebyindex(ArrayList<SomeClass> alist)
{
alist.get(3999);
}
}
and
public class SomeClass
{
int a = 0;
float b = 0;
short c = 0;
public SomeClass()
{
a = (int)(Math.random() * 100000) + 1;
b = (float)(Math.random() * 100000) + 1.0f;
c = (short)((Math.random() * 32000) + 1);
}
}
Interestingly enough, the code seems to warm up in stages. The final stage that I've identified comes after around 120,000 iterations of all the methods. Anyway, on my test machine (AMD X2-220, L3 cache + 1 extra core unlocked, 3.6 GHz, 2.1 GHz NB), the numbers that really jumped out at me were the last two reported: namely, the time taken to .get() the last entry of the ArrayList (index == 3999) and the time taken to .get() the value associated with a Short key of 3999.
After 2-3 warmup cycles, testing shows that ArrayList.get() takes around 56 ns, while HashMap.get() takes around 68 ns. That is... not what I expected. Is my HashMap all eaten up with collisions? All the keys are supposed to autobox to Shorts, which are supposed to report their stored short value in response to .hashCode(), so all the hash codes should be unique. I think?
Even without warmups, the ArrayList.get() is still faster. That is contrary to everything I've seen elsewhere, such as this question. Of course, I've also read that traversing an ArrayList with a ListIterator is faster than just using .get() in a loop, and obviously, that is also not the case . . .
HashMaps aren't faster at retrieving something at a known index. If you are storing things in a known order, the list will win.
But say that instead of inserting everything into the list in order 1-4000, as in your example, you did it in a totally random order. Now to retrieve the correct item from the list, you have to check each item one by one, looking for the right one. But to retrieve it from the hashmap, all you need to know is the key you gave it when you inserted it.
So really, you should be comparing HashMap.get(i) to
for (Integer i : integerList)
    if (i.equals(value)) // use equals(), not ==, when comparing boxed Integers
        // found it!
Then you would see the real efficiency of the hashmap.
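To see that difference concretely, here is a minimal, self-contained sketch of the fairer comparison (all names here are mine, and a single un-warmed measurement like this is noisy, so treat the printed numbers as purely illustrative): a linear scan through a shuffled list versus one keyed lookup.
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class LookupComparison {
    public static void main(String[] args) {
        int n = 100_000;
        List<Integer> list = new ArrayList<>();
        Map<Integer, Integer> map = new HashMap<>();
        for (int i = 0; i < n; i++) {
            list.add(i);
            map.put(i, i);
        }
        Collections.shuffle(list); // unknown order: position no longer equals value

        int target = n - 1;

        // List: scan element by element until the value turns up -> O(n)
        long t0 = System.nanoTime();
        boolean found = false;
        for (Integer i : list) {
            if (i.equals(target)) { found = true; break; }
        }
        long listNs = System.nanoTime() - t0;

        // Map: hash the key and go straight to its bucket -> O(1) expected
        t0 = System.nanoTime();
        Integer value = map.get(target);
        long mapNs = System.nanoTime() - t0;

        System.out.println("list scan: " + listNs + " ns (found=" + found + ")");
        System.out.println("map get:   " + mapNs + " ns (value=" + value + ")");
    }
}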
the ArrayList has to traverse every element of the collection to reach its value
This is not true. ArrayList is backed by an array which allows for constant-time get operations.
HashMap's get, on the other hand, first must hash its argument, then it must traverse the bucket to which the hash code corresponds, testing each element in the bucket for equality with the given key. This will generally be slower than just indexing an array.
ArrayList.get(index) actually takes constant time, since ArrayList is backed by an array; it just uses that index into the backing array. ArrayList.contains(Object), on the other hand, is a long operation: O(n) in the worst case.
Big O for HashMap lookup is O(1+α). The α comes from hash-code collisions, where a bucket must be traversed to check for equality.
Big O for pulling an item out of an ArrayList by index is O(1).
When in doubt... draw it out...
Both ArrayList and HashMap are backed by arrays. A HashMap has to compute a hash code of the key, from which it derives the index used to access its backing array, while with ArrayList.get you provide the index directly. So it's three operations versus one for the ArrayList.
But whether a List or a Map is backed by an array is an implementation detail, so the answer may differ depending on which implementations you use.
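As a toy illustration of the "three operations versus one" point (deliberately simplified; this is nothing like the JDK's real HashMap, and collision handling is omitted entirely):
public class ToyLookup {
    static final int CAP = 8;                    // power of two, like HashMap's table
    static String[] keys = new String[CAP];
    static String[] values = new String[CAP];

    static void put(String key, String value) {
        int i = key.hashCode() & (CAP - 1);      // hash -> table index
        keys[i] = key;                           // (a collision would simply overwrite here)
        values[i] = value;
    }

    // "HashMap-style" get: hash the key, derive an index, then an equals() check.
    static String get(String key) {
        int i = key.hashCode() & (CAP - 1);
        return (keys[i] != null && keys[i].equals(key)) ? values[i] : null;
    }

    public static void main(String[] args) {
        String[] array = { "a", "b", "c" };
        System.out.println(array[2]);            // "ArrayList-style": one array read

        put("greeting", "hello");
        System.out.println(get("greeting"));     // hash + index + equals: three steps
    }
}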
Related
So I'm still fairly new to programming and I'm wondering if I'm doing these benchmarks correctly. For the queue, I'm basically giving it a list filled with integers and timing how long it takes to find a number in the list. For the HashMap, it's the same idea: I time how long it takes to get a number from it. For both of them, I also time how long it takes to remove the contents. Any help on this would be appreciated. Thank you!
// Create a queue, and test its performance
PriorityQueue<Integer> queue = new PriorityQueue<>(list);
System.out.println("Member test time for Priority Queue is " +
getTestTime(queue) + " milliseconds");
System.out.println("Remove element time for Priority Queue is " +
getRemoveTime(queue) + " milliseconds");
// Create a hash map, and test its performance
HashMap<Integer, Integer> newmap = new HashMap<Integer, Integer>();
for (int i = 0; i < N; i++) {
newmap.put(i, i);
}
System.out.println("Member test time for hash map is " +
getTestTime1(newmap) + " milliseconds");
System.out.println("Remove element time for hash map is " +
getRemoveTime1(newmap) + " milliseconds");
}
public static long getTestTime(Collection<Integer> c) {
long startTime = System.currentTimeMillis();
// Test if a number is in the collection
for (int i = 0; i < N; i++)
c.contains((int)(Math.random() * 2 * N));
return System.currentTimeMillis() - startTime;
}
public static long getTestTime1(HashMap<Integer,Integer> newmap) {
long startTime = System.currentTimeMillis();
// Test if a number is in the collection
for (int i = 0; i < N; i++)
newmap.containsKey((int)(Math.random() * 2 * N));
return System.currentTimeMillis() - startTime;
}
public static long getRemoveTime(Collection<Integer> c) {
long startTime = System.currentTimeMillis();
for (int i = 0; i < N; i++)
c.remove(i);
return System.currentTimeMillis() - startTime;
}
public static long getRemoveTime1(HashMap<Integer,Integer> newmap) {
long startTime = System.currentTimeMillis();
for (int i = 0; i < N; i++)
newmap.remove(i);
return System.currentTimeMillis() - startTime;
}
}
I have two suggestions. First, when benchmarking, do the bare minimum of work immediately before and after the code you are evaluating. You don't want the benchmark scaffolding itself to affect the result.
Second, System.currentTimeMillis() can, depending on the OS, be accurate only to within 10 milliseconds. Better to use System.nanoTime(), which is accurate to within perhaps 200 nanoseconds. Divide by 1_000_000 to get milliseconds.
Practically,
final long startNanos, endNanos;
startNanos = System.nanoTime();
// your code goes here
endNanos = System.nanoTime();
// display the results of the benchmark
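Wrapped into a small helper, the same pattern looks like this (a sketch; the class and method names are mine, and the JIT may well optimize away trivial work like this):
public class NanoTimer {
    // Time an action and report milliseconds, using the higher-resolution clock.
    static long timeMillis(Runnable action) {
        long start = System.nanoTime();
        action.run();
        return (System.nanoTime() - start) / 1_000_000; // ns -> ms
    }

    public static void main(String[] args) {
        long ms = timeMillis(() -> {
            long sum = 0;                        // busy work, purely for demonstration
            for (int i = 0; i < 10_000_000; i++) sum += i;
        });
        System.out.println(ms + " ms");
    }
}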
For my work I have done some tests to build a timing chart.
I came across something that surprised me and I need help understanding it.
I used a few data structures as queues and wanted to know how fast deletion is relative to the number of items. An ArrayList with 10 items, deleting from the front, with no initial capacity set, is much slower than the same test with the initial capacity set (to 15). Why? And why is it the same at 100 items?
Here's the chart (image not reproduced). Legend: L - implements List, C - set initial capacity, B - removing from back, Q - implements Queue.
EDIT:
Appending the relevant piece of code:
new Thread(new Runnable() {
@Override
public void run()
{
long time;
final int[] arr = {10, 100, 1000, 10000, 100000, 1000000};
for (int anArr : arr)
{
final List<Word> temp = new ArrayList<>();
while (temp.size() < anArr) temp.add(new Word()); // the list holds Word elements
final int top = (int) Math.sqrt(anArr);
final List<Word> first = new ArrayList<>();
final List<Word> second = new ArrayList<>(anArr);
...
first.addAll(temp);
second.addAll(temp);
...
SystemClock.sleep(5000);
time = System.nanoTime();
for (int i = 0; i < top; ++i) first.remove(0);
Log.d("al_l", "rem: " + (System.nanoTime() - time));
time = System.nanoTime();
for (int i = 0; i < top; ++i) second.remove(0);
Log.d("al_lc", "rem: " + (System.nanoTime() - time));
...
}
}
}).start();
Read this article about Avoiding Benchmarking Pitfalls on the JVM. It explains the impact of the HotSpot VM on test results. If you don't take care of it, your measurements aren't valid, as you found out with your own test.
If you want to do reliable benchmarking, use JMH.
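For reference, a minimal JMH benchmark for this case might look like the sketch below (the class, method names, and list size are mine; it assumes the JMH dependency and annotation processing are set up):
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;
import org.openjdk.jmh.annotations.*;

@State(Scope.Thread)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MICROSECONDS)
public class RemoveFrontBench {
    List<String> list;

    // Rebuild the list before every invocation, since the benchmark mutates it.
    @Setup(Level.Invocation)
    public void fill() {
        list = new ArrayList<>(100);
        for (int i = 0; i < 100; i++) list.add("T");
    }

    @Benchmark
    public String removeFront() {
        return list.remove(0); // return the result so the JIT cannot eliminate it
    }
}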
I too was able to replicate it with the code below. However, I noticed that whichever configuration runs first (set capacity or not) is the one that takes the longest. I assume this is some kind of optimization, maybe by the JVM, or some kind of caching?
import java.util.ArrayList;
public class Test {
public static void main(String[] args) {
measure(-1, 10); // switch with line below
measure(15, 10); // switch with line above
measure(-1, 100);
measure(15, 100);
}
public static void measure(int capacity, long numItems) {
ArrayList<String> arr = new ArrayList<>();
if (capacity >= 1) {
arr.ensureCapacity(capacity);
}
for (int i = 0; i < numItems; i++) {
arr.add("T");
}
long start = System.nanoTime();
for (int i = 0; i < numItems; i++) {
arr.remove(0);
}
long end = System.nanoTime();
System.out.println("Capacity: " + capacity + ", " + "Runtime: "
+ (end - start));
}
}
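One way to probe the ordering effect (a sketch reusing the measure method above, not a rigorous benchmark): alternate the two configurations over several rounds. If the first-run penalty comes from JIT warmup, the gap should shrink in later rounds regardless of which configuration goes first.
public static void main(String[] args) {
    for (int round = 0; round < 5; round++) {
        System.out.println("--- round " + round + " ---");
        measure(-1, 10);  // no capacity set
        measure(15, 10);  // capacity set
    }
}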
I am trying to implement the Fisher-Yates shuffle algorithm in Java. It works, but when my ArrayList has a size > 100000 it becomes very slow. I will show you my code; do you see any way to optimize it? I did some research on the complexity of ArrayList's get and set, and it is O(1), which makes sense to me.
UPDATE 1: I noticed my implementation was wrong. This is now the proper Fisher-Yates algorithm. I have also included my next() function so you can see it. I tested with java.util.Random to see if my next() function was the problem, but it gives the same result. I believe the problem is with the usage of my data structure.
UPDATE 2: I ran a test, and the ArrayList is an instanceof RandomAccess. So the problem is not there.
private long next(){ // MurmurHash3
seed ^= seed >> 33;
seed *= 0xff51afd7ed558ccdL;
seed ^= seed >> 33;
seed *= 0xc4ceb9fe1a85ec53L;
seed ^= seed >> 33;
return seed;
}
public int next(int range){
return (int) Math.abs((next() % range));
}
public ArrayList<Integer> shuffle(ArrayList<Integer> pList){
Integer temp;
int index;
int size = pList.size();
for (int i = size - 1; i > 0; i--){
index = next(i + 1);
temp = pList.get(index);
pList.set(index, pList.get(i));
pList.set(i, temp);
}
return pList;
}
EDIT: Added some comments after you correctly implemented the Fisher-Yates algorithm.
The Fisher-Yates algorithm relies on uniformly distributed random integers to produce unbiased permutations. Using a hash function (MurmurHash3) to generate random numbers, and introducing abs and modulo operations to force the numbers into a fixed range, makes the implementation less robust.
This implementation uses the java.util.Random PRNG and should work fine for your needs:
public <T> List<T> shuffle(List<T> list) {
// trust the default constructor which sets the seed to a value very likely
// to be distinct from any other invocation of this constructor
final Random random = new Random();
final int size = list.size();
for (int i = size - 1; i > 0; i--) {
// pick a random number between one and the number
// of unstruck numbers remaining (inclusive)
int index = random.nextInt(i + 1);
list.set(index, list.set(i, list.get(index)));
}
return list;
}
I can't see any major performance bottleneck in your code. However, here is a fire&forget comparison of the implementation above against the Collections#shuffle method:
public void testShuffle() {
List<Integer> list = new ArrayList<>();
for (int i = 0; i < 1_000_000; i++) {
list.add(i);
}
System.out.println("size: " + list.size());
System.out.println("Fisher-Yates shuffle");
for (int i = 0; i < 10; i++) {
long start = System.currentTimeMillis();
shuffle(list);
long stop = System.currentTimeMillis();
System.out.println("#" + i + " " + (stop - start) + "ms");
}
System.out.println("Java shuffle");
for (int i = 0; i < 10; i++) {
long start = System.currentTimeMillis();
Collections.shuffle(list);
long stop = System.currentTimeMillis();
System.out.println("#" + i + " " + (stop - start) + "ms");
}
}
which gives me the following results:
size: 1000000
Fisher-Yates shuffle
#0 84ms
#1 60ms
#2 42ms
#3 45ms
#4 47ms
#5 46ms
#6 52ms
#7 49ms
#8 47ms
#9 53ms
Java shuffle
#0 60ms
#1 46ms
#2 44ms
#3 48ms
#4 50ms
#5 46ms
#6 46ms
#7 49ms
#8 50ms
#9 47ms
(This would be better suited to the Code Review forum.)
I changed what I could:
Random random = new Random(42);
int size = pList.size();
for (ListIterator<Integer> iter = pList.listIterator(); iter.hasNext(); ) {
    Integer value = iter.next();
    int index = random.nextInt(size);
    iter.set(pList.get(index));
    pList.set(index, value);
}
As an ArrayList is backed by an array, you might set the initialCapacity in the ArrayList constructor. trimToSize() might do something too. Using a ListIterator means that one is already positioned at the current element, and that might help.
The optional seed parameter of the Random constructor (here 42) picks a fixed (= repeatable) random sequence, which during development allows timing and tracing the same sequence.
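For instance, two generators seeded identically produce identical sequences, which makes separate timing runs directly comparable:
import java.util.Random;

public class SeedDemo {
    public static void main(String[] args) {
        Random r1 = new Random(42);
        Random r2 = new Random(42);
        // Same seed, same sequence: useful for repeatable timing and tracing.
        for (int i = 0; i < 3; i++) {
            System.out.println(r1.nextInt(100) + " == " + r2.nextInt(100));
        }
    }
}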
Combining some fragments that have been scattered in comments and other answers:
The original code was not an implementation of the Fisher-Yates shuffle. It was only swapping random elements. This means that certain permutations are more likely than others, and the result is not truly random.
If there is a bottleneck, it could (based on the code provided) only be in the next method, which you did not say anything about. It should be replaced by the nextInt method of an instance of java.util.Random.
Here is an example of what it may look like. (Note that the speedTest method is not even remotely intended as a "benchmark", but should only indicate that the execution time is negligible even for large lists).
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;
class FisherYatesShuffle {
public static void main(String[] args) {
basicTest();
speedTest();
}
private static void basicTest() {
List<Integer> list = new ArrayList<Integer>(Arrays.asList(1,2,3,4,5));
shuffle(list, new Random(0));
System.out.println(list);
}
private static void speedTest() {
List<Integer> list = new ArrayList<Integer>();
int n = 1000000;
for (int i=0; i<n; i++) {
list.add(i);
}
long before = System.nanoTime();
shuffle(list, new Random(0));
long after = System.nanoTime();
System.out.println("Duration "+(after-before)/1e6+"ms");
System.out.println(list.get(0));
}
public static <T> void shuffle(List<T> list, Random random) {
for (int i = list.size() - 1; i > 0; i--) {
int index = random.nextInt(i + 1);
T t = list.get(index);
list.set(index, list.get(i));
list.set(i, t);
}
}
}
An aside: You gave a list as an argument, and returned the same list. This may be appropriate in some cases, but it did not make any sense here. There are several options for the signature and behavior of such a method. Most likely, it should receive a List and shuffle it in place. In fact, it would also make sense to explicitly check whether the list implements the java.util.RandomAccess interface. For a List that does not implement RandomAccess, this algorithm would degrade to quadratic performance. In that case, it would be better to copy the given list into a list that does implement RandomAccess, shuffle this copy, and then copy the results back into the original list.
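A sketch of that idea, reusing the shuffle method above (the class and method names are mine; java.util.Collections.shuffle uses essentially this strategy internally):
import java.util.ArrayList;
import java.util.List;
import java.util.ListIterator;
import java.util.Random;
import java.util.RandomAccess;

class RandomAccessAwareShuffle {
    public static <T> void shuffleAny(List<T> list, Random random) {
        if (list instanceof RandomAccess) {
            FisherYatesShuffle.shuffle(list, random); // get/set are O(1): shuffle in place
        } else {
            // Copy into a RandomAccess list, shuffle the copy, then copy back in one pass.
            List<T> copy = new ArrayList<>(list);
            FisherYatesShuffle.shuffle(copy, random);
            ListIterator<T> it = list.listIterator();
            for (T t : copy) {
                it.next();
                it.set(t);
            }
        }
    }
}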
Try this code and compare the execution time with your Fisher-Yates method.
It is probably your "next" method that is slow:
function fisherYates(array) {
    for (var i = array.length - 1; i > 0; i--) {
        var index = Math.floor(Math.random() * (i + 1)); // include i itself to keep the shuffle unbiased
        // swap
        var tmp = array[index];
        array[index] = array[i];
        array[i] = tmp;
    }
}
Everybody says that one should use ArrayList rather than Vector because of performance (since Vector synchronizes on every operation, and so on). I've written a simple test:
import java.util.ArrayList;
import java.util.Date;
import java.util.Vector;
public class ComparePerformance {
public static void main(String[] args) {
ArrayList<Integer> list = new ArrayList<Integer>();
Vector<Integer> vector = new Vector<Integer>();
int size = 10000000;
int listSum = 0;
int vectorSum = 0;
long startList = new Date().getTime();
for (int i = 0; i < size; i++) {
list.add(new Integer(1));
}
for (Integer integer : list) {
listSum += integer;
}
long endList = new Date().getTime();
System.out.println("List time: " + (endList - startList));
long startVector = new Date().getTime();
for (int i = 0; i < size; i++) {
vector.add(new Integer(1));
}
for (Integer integer : vector) {
vectorSum += integer;
}
long endVector = new Date().getTime();
System.out.println("Vector time: " + (endVector - startVector));
}
}
The results are as follows:
List time: 4360
Vector time: 4103
Based on this, it seems that Vector's performance at iterating and reading is slightly better. Maybe this is a dumb question or I've made wrong assumptions; can somebody please explain this?
You have written a naïve microbenchmark. Microbenchmarking on the JVM is very tricky business and it is not even easy to enumerate all the pitfalls, but here are some classic ones:
you must warm up the code;
you must control for garbage collection pauses;
System.currentTimeMillis is imprecise, but you don't even seem to be aware of this method (your new Date().getTime() is equivalent, but slower).
If you want to do this properly, then check out Oracle's jmh tool or Google's Caliper.
My Test Results
Since I was kind of interested to see these numbers myself, here is the output of jmh. First, the test code:
public class Benchmark1
{
static Integer[] ints = new Integer[0];
static {
final List<Integer> list = new ArrayList<>(asList(1,2,3,4,5,6,7,8,9,10));
for (int i = 0; i < 5; i++) list.addAll(list);
ints = list.toArray(ints);
}
static List<Integer> intList = Arrays.asList(ints);
static Vector<Integer> vec = new Vector<Integer>(intList);
static List<Integer> list = new ArrayList<Integer>(intList);
@GenerateMicroBenchmark
public Vector<Integer> testVectorAdd() {
final Vector<Integer> v = new Vector<Integer>();
for (Integer i : ints) v.add(i);
return v;
}
@GenerateMicroBenchmark
public long testVectorTraverse() {
long sum = (long)Math.random()*10;
for (int i = 0; i < vec.size(); i++) sum += vec.get(i);
return sum;
}
@GenerateMicroBenchmark
public List<Integer> testArrayListAdd() {
final List<Integer> l = new ArrayList<Integer>();
for (Integer i : ints) l.add(i);
return l;
}
@GenerateMicroBenchmark
public long testArrayListTraverse() {
long sum = (long)Math.random()*10;
for (int i = 0; i < list.size(); i++) sum += list.get(i);
return sum;
}
}
And the results:
testArrayListAdd 234.896 ops/msec
testVectorAdd 274.886 ops/msec
testArrayListTraverse 1718.711 ops/msec
testVectorTraverse 34.843 ops/msec
Note the following:
in the ...add methods I am creating a new, local collection. The JIT compiler uses this fact and elides the locking on Vector methods—hence almost equal performance;
in the ...traverse methods I am reading from a global collection; the locks cannot be elided and this is where the true performance penalty of Vector shows up.
The main takeaway from this should be: the performance model on the JVM is highly complex, sometimes even erratic. Extrapolating from microbenchmarks, even when they are done with all due care, can lead to dangerously wrong predictions about production system performance.
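As an illustration of the lock-elision point above, a sketch of the two access patterns (class and method names are mine): the JIT can elide the lock on a Vector that never escapes the method, but not on one shared through a field.
import java.util.Vector;

public class LockElisionDemo {
    static Vector<Integer> shared = new Vector<>();

    // Local Vector: it never escapes, so escape analysis lets the JIT elide the locks.
    static int sumLocal() {
        Vector<Integer> v = new Vector<>();
        for (int i = 0; i < 1000; i++) v.add(i);
        int sum = 0;
        for (int i = 0; i < v.size(); i++) sum += v.get(i);
        return sum;
    }

    // Shared Vector: other threads could see it, so every get() really must lock.
    static int sumShared() {
        int sum = 0;
        for (int i = 0; i < shared.size(); i++) sum += shared.get(i);
        return sum;
    }

    public static void main(String[] args) {
        for (int i = 0; i < 1000; i++) shared.add(i);
        System.out.println(sumLocal() + " " + sumShared());
    }
}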
I agree with Marko about using Caliper; it's an awesome framework. But you can get part of the way there yourself if you organize your benchmark a bit better:
public class ComparePerformance {
private static final int SIZE = 1000000;
private static final int RUNS = 500;
private static final Integer ONE = Integer.valueOf(1);
static class Run {
private final List<Integer> list;
Run(final List<Integer> list) {
this.list = list;
}
public long perform() {
long oldNanos = System.nanoTime();
for (int i = 0; i < SIZE; i++) {
list.add(ONE);
}
return System.nanoTime() - oldNanos;
}
}
public static void main(final String[] args) {
long arrayListTotal = 0L;
long vectorTotal = 0L;
for (int i = 0; i < RUNS; i++) {
if (i % 50 == 49) {
System.out.println("Run " + (i + 1));
}
arrayListTotal += new Run(new ArrayList<Integer>()).perform();
vectorTotal += new Run(new Vector<Integer>()).perform();
}
System.out.println();
System.out.println("Runs: "+RUNS+", list size: "+SIZE);
output(arrayListTotal, "List");
output(vectorTotal, "Vector");
}
private static void output(final long value, final String name) {
System.out.println(name + " total time: " + value + " (" + TimeUnit.NANOSECONDS.toMillis(value) + " " + "ms)");
long avg = value / RUNS;
System.out.println(name + " average time: " + avg + " (" + TimeUnit.NANOSECONDS.toMillis(avg) + " " + "ms)");
}
}
The key part is running your code often. Also, remove anything that's unrelated to your benchmark, and re-use Integers instead of creating new ones.
The above benchmark code creates this output on my machine:
Runs: 500, list size: 1000000
List total time: 3524708559 (3524 ms)
List average time: 7049417 (7 ms)
Vector total time: 6459070419 (6459 ms)
Vector average time: 12918140 (12 ms)
I'd say that should give you an idea of the performance differences.
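On re-using Integers (as suggested above): boxed values in the small cache range come from a shared pool, so Integer.valueOf avoids a per-call allocation.
public class IntegerReuse {
    public static void main(String[] args) {
        Integer cached = Integer.valueOf(1);   // values in -128..127 come from a cache
        Integer another = Integer.valueOf(1);  // same shared instance, no new allocation
        System.out.println(cached == another); // true within the cached range

        Integer fresh = new Integer(1);        // always allocates (deprecated since Java 9)
        System.out.println(cached == fresh);   // false: a distinct object
    }
}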
As Marko Topolnik said, it is hard to write correct microbenchmarks and to interpret the results correctly. There are good articles about this subject available.
From my experience and what I know of the implementations, I use these rules of thumb:
Use ArrayList
If the collection must be synchronized, consider using Vector. (I never end up using it, because there are other solutions for synchronization, concurrency, and parallel programming.)
If there are many elements in the collection and there are frequent insert or remove operations inside the list (not at the end), then use LinkedList.
Most collections do not contain many elements, and it would be a waste of time to spend more effort on them. Also, in Scala there are parallel collections, which perform some operations in parallel. Maybe there is something available for use in pure Java, too.
Whenever possible, use the List interface to hide implementation details, and try to add comments that state WHY you've chosen a specific implementation.
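For example (a trivial sketch; the comment carries the reasoning):
import java.util.ArrayList;
import java.util.List;

public class InterfaceFirst {
    public static void main(String[] args) {
        // Chose ArrayList: mostly appends and indexed reads, no synchronization needed.
        List<String> names = new ArrayList<>();
        names.add("alice");
        System.out.println(names.get(0));
    }
}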
I ran your test, and ArrayList is faster than Vector with a size of 1000000:
public static void main(String[] args) {
ArrayList<Integer> list = new ArrayList<Integer>();
Vector<Integer> vector = new Vector<Integer>();
int size = 1000000;
int listSum = 0;
int vectorSum = 0;
long startList = System.nanoTime();
for (int i = 0; i < size; i++) {
list.add(Integer.valueOf(1));
}
for (Integer integer : list) {
listSum += integer;
}
long endList = System.nanoTime();
System.out.println("List time: " + (endList - startList)/1000000);
//
// long startVector = System.nanoTime();
// for (int i = 0; i < size; i++) {
// vector.add(Integer.valueOf(1));
// }
// for (Integer integer : vector) {
// vectorSum += integer;
// }
// long endVector = System.nanoTime();
// System.out.println("Vector time: " + (endVector - startVector)/1000000);
}
}
Output from running them separately:
list time: 83
vector time: 113
My partner and I are attempting to program a LinkedList data structure. We have completed the data structure, and it functions properly with all required methods. We are required to perform a comparative test of the runtimes of the addFirst() method in our LinkedList class vs. the add(0, item) method of Java's ArrayList structure. The expected complexity of the addFirst() method for our LinkedList data structure is O(1) constant. This held true in our test. In timing the ArrayList add() method, we expected a complexity of O(N), but we again received a complexity of approximately O(1) constant. This appeared to be a strange discrepancy, since we are utilizing Java's ArrayList. We thought there might be an issue in our timing structure, and we would be most appreciative if anyone could help us identify the problem. Our Java code for the timing of both methods is listed below:
import java.util.ArrayList;
public class timingAnalysis {
public static void main(String[] args) {
//timeAddFirst();
timeAddArray();
}
public static void timeAddFirst()
{
long startTime, midTime, endTime;
long timesToLoop = 10000;
int inputSize = 20000;
MyLinkedList<Long> linkedList = new MyLinkedList<Long>();
for (; inputSize <= 1000000; inputSize = inputSize + 20000)
{
// Clear the collection so we can add new random
// values.
linkedList.clear();
// Let some time pass to stabilize the thread.
startTime = System.nanoTime();
while (System.nanoTime() - startTime < 1000000000)
{ }
// Start timing.
startTime = System.nanoTime();
for (long i = 0; i < timesToLoop; i++)
linkedList.addFirst(i);
midTime = System.nanoTime();
// Run an empty loop to capture the cost of running the loop.
for (long i = 0; i < timesToLoop; i++)
{} // empty block
endTime = System.nanoTime();
// Compute the time: subtract the cost of the empty loop from
// the cost of the loop with the addFirst calls.
// Average it over the number of runs.
double averageTime = ((midTime - startTime) - (endTime - midTime)) / (double) timesToLoop;
System.out.println(inputSize + " " + averageTime);
}
}
public static void timeAddArray()
{
long startTime, midTime, endTime;
long timesToLoop = 10000;
int inputSize = 20000;
ArrayList<Long> testList = new ArrayList<Long>();
for (; inputSize <= 1000000; inputSize = inputSize + 20000)
{
// Clear the collection so we can add new random
// values.
testList.clear();
// Let some time pass to stabilize the thread.
startTime = System.nanoTime();
while (System.nanoTime() - startTime < 1000000000)
{ }
// Start timing.
startTime = System.nanoTime();
for (long i = 0; i < timesToLoop; i++)
testList.add(0, i);
midTime = System.nanoTime();
// Run an empty loop to capture the cost of running the loop.
for (long i = 0; i < timesToLoop; i++)
{} // empty block
endTime = System.nanoTime();
// Compute the time: subtract the cost of the empty loop from
// the cost of the loop with the add calls.
// Average it over the number of runs.
double averageTime = ((midTime - startTime) - (endTime - midTime)) / (double) timesToLoop;
System.out.println(inputSize + " " + averageTime);
}
}
}
You want to test for different values of inputSize, but you perform the operation under test timesToLoop times, which is constant. So of course it takes the same time. You should use:
for (long i = 0; i < inputSize; i++)
testList.add(0, i);
As far as I know, the ArrayList add operation runs in O(1) time, so the results of your experiment are correct. I think the constant time for the ArrayList add method is amortized constant time.
As per the Javadoc:
adding n elements requires O(n) time, which is why adding is described as amortized constant time.
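One caveat worth adding: that amortized O(1) applies to appends at the end of the list; add(0, item), which the timing code above exercises, still shifts every existing element and is O(n) per call. A quick, informal sketch of the difference (not a rigorous benchmark):
import java.util.ArrayList;
import java.util.List;

public class AddCostDemo {
    public static void main(String[] args) {
        int n = 100_000;

        List<Integer> appended = new ArrayList<>();
        long t = System.nanoTime();
        for (int i = 0; i < n; i++) appended.add(i);     // amortized O(1) per append
        System.out.println("append : " + (System.nanoTime() - t) / 1_000_000 + " ms");

        List<Integer> prepended = new ArrayList<>();
        t = System.nanoTime();
        for (int i = 0; i < n; i++) prepended.add(0, i); // O(n) per insert: shifts all elements
        System.out.println("prepend: " + (System.nanoTime() - t) / 1_000_000 + " ms");
    }
}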