I want to know if there is a difference in performance if I use a primitive array and then rebuild it to add new elements like this:
AnyClass[] elements = new AnyClass[0];
public void addElement(AnyClass e) {
AnyClass[] temp = new AnyClass[elements.length + 1];
for (int i = 0; i < elements.length; i++) {
temp[i] = elements[i];
}
temp[elements.length] = e;
elements = temp;
}
or if I just use an ArrayList and add the elements.
I'm not certain, which is why I ask: is the speed the same because an ArrayList is built the same way as my array version, or is there really a difference, and is a plain array always faster even if I rebuild it every time I add an element?
ArrayLists work in a similar way, but instead of rebuilding on every add they grow their capacity by a constant factor each time the limit is reached (about 1.5x in OpenJDK). So if you are constantly adding to it, an ArrayList will be faster, because recreating the array is fairly slow.
So your implementation could use less memory if you are not adding to it often but as far as speed goes it will be slower most of the time.
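To put rough numbers on that, here is a small sketch that only counts how many element copies each growth policy performs while appending n elements. The class and method names are made up for illustration, and the starting capacity of 10 and the 1.5x factor are assumptions modelled on OpenJDK's ArrayList; other implementations may differ.

```java
// Hypothetical helper: it counts element copies, it does not store any data.
final class CopyCounter {

    // Grow by exactly one slot on every add, like addElement() in the question.
    static long copiesGrowByOne(int n) {
        long copies = 0;
        for (int size = 0; size < n; size++) {
            copies += size; // every existing element is copied into the new array
        }
        return copies;
    }

    // Grow by roughly 50% whenever the backing array is full.
    static long copiesGrowByHalf(int n) {
        long copies = 0;
        int capacity = 10; // assumed starting capacity
        for (int size = 0; size < n; size++) {
            if (size == capacity) {
                copies += size;            // copy everything once
                capacity += capacity >> 1; // new capacity is about 1.5x the old one
            }
        }
        return copies;
    }

    public static void main(String[] args) {
        int n = 100_000;
        System.out.println("grow by one:  " + copiesGrowByOne(n));
        System.out.println("grow by half: " + copiesGrowByHalf(n));
    }
}
```

The first count grows quadratically with n, while the second stays within a small constant multiple of n, which is what "amortized constant time per add" means in practice.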
In a nutshell, stick with ArrayList. It is:
widely understood;
well tested;
will probably be more performant than your own implementation (for example, ArrayList.add() is guaranteed to be amortised constant-time, which your method is not).
When an ArrayList resizes, it grows by a constant factor of its current size, so you are not wasting time resizing on every add. Amortized, that means the cost of resizing averages out to constant time per add. That's why you shouldn't waste time reinventing the wheel. The people who created the first one already learned how to make one more efficient and know more about the platform than you do.
There is no performance issue with either arrays or ArrayList.
Arrays and ArrayList are both index-based, so both work in the same way.
If you require a dynamic array, use an ArrayList.
If the size is fixed, go with an array.
Your implementation is likely to lose clearly to Java's ArrayList in terms of speed. One particularly expensive thing you're doing is reallocating the array every time you want to add elements, while Java's ArrayList tries to optimize by having some "buffer" before having to reallocate.
ArrayList uses an array internally, so it is true that a raw array will be faster than an ArrayList. When writing high-performance code, always use arrays. For the same reason, arrays are the backbone of most collections.
It is worth going through the JDK implementation of the collections.
We use ArrayList when we are developing an application and are not concerned about such minor performance differences; we accept the trade-off because we get a ready-made API for put, get, resize, etc.
Context is very important: if you are constantly inserting new elements, ArrayList will certainly be faster than an array. On the other hand, if you just want to access an element at a known position, say arrayItems[8], an array is faster than ArrayList.get(8), since get() adds the overhead of a method call and other steps and checks.
Is there a fundamental difference in Java between an ArrayList, and a class that uses regular arrays to store items, has an index to keep track of the number of items in the list, and automatically increases the size of the array when it runs out of space?
class myArrayList {
private int[] array = new int[10];
private int itemsInArray = 0;
private void increaseArraySize() {
int[] newArray = new int[array.length + 10];
System.arraycopy(array, 0, newArray, 0, array.length);
array = newArray;
}
public void put(int i) {
if (itemsInArray == array.length)
{
increaseArraySize();
}
array[itemsInArray] = i;
itemsInArray++;
}
public int get(int idx) {
return array[idx];
}
public int size() {
return itemsInArray;
}
}
An ArrayList has some additional methods my class doesn't have (that I could add), and implements the List interface, but other than that, is ArrayList just for convenience? Do both use the heap to store data?
Is there a fundamental difference in Java between an ArrayList, and a class that uses regular arrays to store items, has an index to keep track of the number of items in the list, and automatically increases the size of the array when it runs out of space?
In general no. Under the hood, ArrayList is just an ordinary pure Java class i.e. no native code. It is (roughly speaking!) doing what your code does.
But (as the comments say) it already exists. You don't need to design it, code it, debug it, tune it ... You just use it! Also read Basil's answer!
However, I would note that your version is different from ArrayList in some (other) important respects:
A myArrayList holds only int values. It is not generic.
An ArrayList holds objects. If you needed a list of integers you would need to use Integer as the type parameter rather than int. (Because that's the way that Java generic classes work.)
In myArrayList, a set call beyond the end of the list will grow the list. It is behaving more like a dynamic array than a list.
In ArrayList, a set call beyond the end of the list will throw an exception.
If you want a Java "list" type that is specialized for int or some other primitive type, there are existing 3rd party libraries; e.g. the GNU Trove library.
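The ArrayList side of that comparison can be checked with a few lines. This is only a sketch with a made-up class name; it shows that the element type has to be Integer rather than int, and that set() past the current size is rejected instead of growing the list.

```java
import java.util.ArrayList;
import java.util.List;

final class SetBeyondEndDemo {

    // Returns true if set() past the current size is rejected.
    static boolean setPastEndThrows() {
        List<Integer> list = new ArrayList<>(); // Integer, not int: generics need object types
        list.add(7);                            // the int 7 is autoboxed to an Integer
        try {
            list.set(5, 42);                    // index 5 is beyond size() == 1
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(setPastEndThrows());
    }
}
```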
Do both use the heap to store data?
Yes. In fact, if you look at the source code of ArrayList you will see that it does something like what your code is doing. But it is doing it "smarter" and this will result in better "big O" performance in certain operations.
Consider this:
myArrayList list = new myArrayList();
for (int i = 0; i < N; i++) {
list.put(i);
}
The computational complexity of this is O(N^2).
Every 10th call to list.put(i) finds the array full and triggers a resize, which allocates an array 10 slots larger and copies all existing values into it. With about N / 10 resizes, the copies add up to 10 + 20 + ... + N, which is roughly N^2 / 20. That is O(N^2) for the N calls.
By contrast, ArrayList uses a resize strategy of growing the list by 50% of its current size. If you do the analysis, it turns out that the total amortized cost for N calls to ArrayList.add is O(N) ... not O(N^2).
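For comparison, here is a minimal sketch of the question's class with the fixed "+10" step swapped for growth by 50%. The class name GrowingIntList is hypothetical and this is not the JDK's actual code; it only illustrates the growth policy, plus a bounds check on get().

```java
import java.util.Arrays;

// Hypothetical variant of the question's class; not the JDK's implementation.
final class GrowingIntList {
    private int[] array = new int[10];
    private int size = 0;

    public void put(int value) {
        if (size == array.length) {
            // Grow by half the current capacity instead of a fixed 10 slots.
            array = Arrays.copyOf(array, array.length + (array.length >> 1));
        }
        array[size++] = value;
    }

    public int get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index + ", size " + size);
        }
        return array[index];
    }

    public int size() {
        return size;
    }
}
```

Because each resize adds space proportional to the current size, resizes become rarer as the list grows, and the total number of copied elements stays proportional to N.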
Lesson #1: Don't go trying to re-implement standard Java utility classes. It is usually a waste of time and there is a good chance that your efforts will actually make things worse!
There are exceptions to this lesson, but you need a lot of Java programming experience (and / or use of profiling tools) to identify them. Even then, there is a good chance that there is an existing 3rd-party alternative that addresses the problem.
Lesson #2: If your goal is to understand how the standard utility classes work under the hood, the best way is to read the OpenJDK source code. It is good code and well commented. In cases where it is complicated there is a good reason for that. But any experienced Java programmer should be capable of understanding it if they work hard at it.
You asked:
What's the difference between using ArrayList, or dynamically growing an array in Java?
Looking at the Collections Framework Overview, the very first bullet item says:
The primary advantages of a collections framework are that it:
• Reduces programming effort by providing data structures and algorithms so you don't have to write them yourself.
You asked:
Is there a fundamental difference in Java between an ArrayList, and a class that uses regular arrays
In terms of behavior, no fundamental difference. As the name implies, the current implementation of ArrayList is a class that uses regular arrays. So there is no point to you writing your own.
Keep in mind that future versions of ArrayList implementations are free to use some other approach besides actual arrays provided the contract promised in the Javadoc is met.
is ArrayList just for convenience?
Yes, as stated above. Rather than have every individual programmer write their own implementation, why not share one single well-written, well-debugged, and well-documented implementation?
Most implementations of Java are based on the OpenJDK open-source codebase. You are free to peruse the source code.
I have a short (12 elements) LinkedList of short strings (7 characters each).
I need to search through this list both by index and by content (i.e. search a particular string and get its index in the list).
I thought about making a copy of the LinkedList as an array at runtime (just once, since the LinkedList is a static member of my class), so I can access the strings by index more quickly.
Given that the LinkedList is never changed at runtime, is this bad programming practice or is this an idea worth considering?
IMPORTANT EDIT: the array can't be sorted, I need it to map specific strings to specific numbers.
Instead of a LinkedList just use an ArrayList - you can look up fast based on an index, and you can easily search through it.
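As a rough sketch of what that looks like (the class name and the three sample strings are made up, standing in for the 12 real ones):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class CodeTable {
    // Made-up sample data standing in for the 12 fixed strings.
    private static final List<String> CODES =
            new ArrayList<>(Arrays.asList("ALPHA01", "BRAVO02", "CHARL03"));

    // Constant-time access by index.
    static String codeAt(int index) {
        return CODES.get(index);
    }

    // Linear search by content; returns -1 if the string is absent.
    static int indexOfCode(String code) {
        return CODES.indexOf(code);
    }
}
```

No sorting is involved, so each string keeps the index it was inserted at.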
What problem are you trying to solve here? Are you worried that accessing elements by index is too slow in LinkedList? If so, you might want to use ArrayList instead.
But for a 12-element list, the improvement probably won't make any measurable difference. Unless this is something you're accessing several hundred times a second, I wouldn't waste any time on trying to optimize it.
Another idea you might want to consider is using a Map:
Map<Integer, String> someMap = new HashMap<>();
It's easy to search for values in a map by both key and value.
Might also not be the best idea, but at least better than creating 2 lists with the same values =)
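One way to get fast lookups in both directions, sketched under the assumption that the strings are unique: keep the values in index order and build a reverse map from string to index once. The class name TwoWayLookup is hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

final class TwoWayLookup {
    private final String[] byIndex;
    private final Map<String, Integer> byValue = new HashMap<>();

    TwoWayLookup(String... values) {
        byIndex = values.clone();
        for (int i = 0; i < byIndex.length; i++) {
            byValue.put(byIndex[i], i); // a later duplicate would overwrite an earlier index
        }
    }

    // Lookup by index.
    String valueAt(int index) {
        return byIndex[index];
    }

    // Lookup by content; returns -1 if the string is unknown.
    int indexOf(String value) {
        Integer index = byValue.get(value);
        return index == null ? -1 : index;
    }
}
```

For only 12 elements this is unlikely to be measurably faster than a plain linear search, so it is mostly a question of which version reads better.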
The question is, why are you using a LinkedList in the first place?
The main reason to choose a LinkedList over an array list is if you need to make a number of insertions/deletions in the middle of the List or if you don't know the exact size of the list and don't want to make a number of reallocations of the Array.
The main reason to choose an ArrayList over a LinkedList is if you need to have random access to each of the elements.
(There are other advantages/disadvantages to each, but those are probably the main ones that come to mind)
It looks like you do need random access to the list, so why did you pick a LinkedList over an ArrayList?
I would say it depends on your intention and the effect it really has.
With only 12 elements it seems unlikely to me that converting the LinkedList to an array has an impact on performance. So it could make the code unnecessarily (slightly) harder to understand for other people. From this point of view it could be considered a non-optimal programming style.
If the number of elements increases, e.g. because you need to pre-process some data, which requires a dynamic data structure, and an indexed lookup performs much better for later use, then this wouldn't be bad programming style but rather a required improvement.
Given that you know the exact amount of elements you are going to be using why not use an array from the start?
String[] myArray = new String[12];
// Add your data
Arrays.sort(myArray); // Sort your strings (java.util.Arrays)
int value = Arrays.binarySearch(myArray, "key"); // Search the sorted array
Or, since you can't sort the array, you could just write a linear search method:
public int search(String[] array, String key)
{
for (int i = 0; i < array.length; i++)
{
if (array[i].equals(key)) // compare contents, not references
return i;
}
return -1;
}
Edit: After re-loading the page and reading people's responses I agree that ArrayList should be exactly what you need.
I have a java list
List<myClass> myList = myClass.selectFromDB("where clause");
//myClass.selectFromDB returns a list of objects from DB
But I want a different list, specifically.
List<Integer> goodList = new ArrayList<Integer>();
for (int i = 0; i < myList.size(); i++) {
goodList.add(myList.get(i).getGoodInteger());
}
Yes, I could do a different query from the DB in the initial myList creation, but assume for now I must use that as the starting point and no other DB queries. Can I replace the for loop with something much more efficient?
Thank you very much for any input, apologies for my ignorance.
In order to extract a field from the "myclass", you're going to have to loop through the entire contents of the list. Whether you do that with a for loop, or use some sort of construct that hides the for loop from you, it's still going to take approximately the same time and use the same amount of resources.
An important question is: why do you want to do this? Are you trying to make your code cleaner? If so, you could write a method along these lines:
public static List<Integer> extractGoodInts (List<myClass> myList) {
List<Integer> goodInts = new ArrayList<Integer>();
for(int i = 0; i < myList.size(); i++){
goodInts.add(myList.get(i).getGoodInteger());
}
return goodInts;
}
Then, in your code, you can just go:
List<myClass> myList = myClass.selectFromDB("where clause");
List<Integer> goodInts = myClass.extractGoodInts(myList);
However, if you're trying to make your code more efficient and you're not allowed to change the query, you're out of luck; somehow or another, you're going to need to individually grab each int from the list, which means you're going to be running in O(n) time no matter what clever tricks you can come up with.
There are really only two ways I can think of that you can make this more "efficient":
Somehow split this up between multiple cores so you can do the work in parallel. Of course, this assumes that you've got other cores, they aren't doing anything useful already, and that there's enough processing going on that the overhead of doing this is even worth it. My guess is that (at least) the last point isn't true in your case given that you're just calling a getter. If you wanted to do this you'd try to have a number of threads (I'd probably actually use an Executor and Futures for this) equal to the number of cores, and then give roughly equal amounts of work to each of them (probably just by slicing your list into roughly equal sized pieces).
If you believe that you'll only be accessing a small subset of the resulting List, but are unsure of exactly which elements, you could try doing things lazily. The easiest way to do that would be to use a pre-built lazy mapping List implementation. There's one in Google Collections Library. You use it by calling Lists.transform(). It'll immediately return a List, but it'll only perform your transformation on elements as they are requested. Again, this is only more efficient if it turns out that you only ever look at a small fraction of the output List. If you end up looking at the entire thing this will not be more efficient, and will probably work out to be less efficient.
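To make the lazy idea concrete without pulling in a library, here is a hand-rolled sketch built on java.util.AbstractList. It is not the library's implementation, and the names LazyViews and lazyMap are made up; it only shows the principle that the transformation runs when an element is read, not when the view is created.

```java
import java.util.AbstractList;
import java.util.List;
import java.util.function.Function;

final class LazyViews {

    // Returns a read-only view that applies fn each time an element is read.
    static <A, B> List<B> lazyMap(List<A> source, Function<? super A, ? extends B> fn) {
        return new AbstractList<B>() {
            @Override
            public B get(int index) {
                return fn.apply(source.get(index)); // work happens here, on demand
            }

            @Override
            public int size() {
                return source.size();
            }
        };
    }
}
```

Note that nothing is cached in this sketch: reading the same index twice applies the function twice, which is one reason a lazy view can end up slower than an eager copy if you read everything.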
Not sure what you mean by efficient. As the others said, you have to call the getGoodInteger method on every element of that list one way or another. About the best you can do is avoid checking the size every time:
List<Integer> goodInts = new ArrayList<Integer>();
for (myClass myObj : myList) {
goodInts.add(myObj.getGoodInteger());
}
I also second jboxer's suggestion of making a function for this purpose.
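If you are on Java 8 or later, the same extraction can be written with streams. This is no faster than the loop, since it still visits every element; it is just shorter. The Row class below is a hypothetical stand-in for the question's database-backed class.

```java
import java.util.List;
import java.util.stream.Collectors;

final class StreamExtraction {

    // Hypothetical stand-in for the question's database-backed class.
    static final class Row {
        private final int goodInteger;

        Row(int goodInteger) {
            this.goodInteger = goodInteger;
        }

        Integer getGoodInteger() {
            return goodInteger;
        }
    }

    // Still O(n): the getter is called once per element.
    static List<Integer> extractGoodInts(List<Row> rows) {
        return rows.stream()
                   .map(Row::getGoodInteger)
                   .collect(Collectors.toList());
    }
}
```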
I have a fairly expensive array calculation (SpectralResponse) which I like to keep to a minimum. I figured the best way is to store them and bring it back up when same array is needed again in the future. The decision is made using BasicParameters.
So right now, I use a LinkedList of objects for the SpectralResponse arrays, and another LinkedList for the BasicParameters. BasicParameters has an isParamsEqualTo(BasicParameters) method to compare parameter sets.
LinkedList<SpectralResponse> responses
LinkedList<BasicParameters> fitParams
LinkedList<Integer> responseNumbers
So to look up, I just go through the list of BasicParameters, check for match, if matched, return the SpectralResponse. If no match, then calculate the SpectralResponse.
Here is the for loop I use for the lookup.
size: LinkedList size, limited to a reasonable value
responseNumber: just another variable to distinguish the SpectralResponse.
for (int i = size - 1; i >= 0; i--) {
if (responseNumbers.get(i) == responseNum)
{
tempFit = fitParams.get(i);
if (tempFit.isParamsEqualTo(fit))
{
return responses.get(i);
}
}
}
But somehow, doing it this way not only takes up lots of memory, it's actually slower than just calculating the SpectralResponse directly. Much slower.
So is it my implementation that's wrong, or was I mistaken that precalculating and looking up is faster?
You are accessing a LinkedList by index, this is the worst possible way to access it ;)
You should use ArrayList instead, or use iterators for all your lists.
Possibly you should merge the three objects into one, and keep them in a map with responseNum as key.
Hope this helps!
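A minimal sketch of that suggestion, with heavy assumptions: P and R are placeholders for BasicParameters and SpectralResponse, ParamsMatcher stands in for isParamsEqualTo, and the class name ResponseCache is made up. Unlike the original lists, this sketch does not limit the cache size.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

// P and R are placeholders for the question's BasicParameters and SpectralResponse.
final class ResponseCache<P, R> {

    // Hypothetical stand-in for BasicParameters.isParamsEqualTo(...).
    interface ParamsMatcher<P> {
        boolean matches(P a, P b);
    }

    // One entry replaces a row of the parallel lists.
    private static final class Entry<P, R> {
        final P params;
        final R response;

        Entry(P params, R response) {
            this.params = params;
            this.response = response;
        }
    }

    private final Map<Integer, List<Entry<P, R>>> byResponseNum = new HashMap<>();
    private final ParamsMatcher<P> matcher;

    ResponseCache(ParamsMatcher<P> matcher) {
        this.matcher = matcher;
    }

    R getOrCompute(int responseNum, P params, Supplier<R> compute) {
        List<Entry<P, R>> bucket =
                byResponseNum.computeIfAbsent(responseNum, k -> new ArrayList<>());
        for (Entry<P, R> e : bucket) {          // iterator-based, no get(i)
            if (matcher.matches(e.params, params)) {
                return e.response;              // cache hit
            }
        }
        R response = compute.get();             // cache miss: do the expensive work once
        bucket.add(new Entry<>(params, response));
        return response;
    }
}
```

The map narrows the search to entries with the same responseNum, so only those are compared parameter by parameter.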
You should probably use an array-backed type (such as Vector or ArrayList), not a linked list. Linked lists are best for stack or queue operations, not indexing (since you have to traverse the list from one end). Vector is an auto-resizing array, which has less overhead when accessing indexes.
The get(i) methods of LinkedList require that to fetch each item it has to go further and further along the list. Consider using an ArrayList, the iterator() method, or just an array.
The second line, 'if (responseNumbers.get(i) == responseNum)' will also be inefficient as the responseNumbers.get(i) is an Integer, and has to be unboxed to an int (Java 5 onwards does this automatically; your code would not compile on Java 1.4 or earlier if responseNum is declared as an int). See this for more information on boxing.
To remove this unboxing overhead, use an IntList from the Apache Commons Primitives library. This library contains collections that store the underlying values (ints in your case) in a primitive array (e.g. int[]) instead of an Object array. This means no boxing is required, as the IntList's methods return primitive types, not Integers.
What is the fastest list implementation (in java) in a scenario where the list will be created one element at a time then at a later point be read one element at a time? The reads will be done with an iterator and then the list will then be destroyed.
I know that the Big O notation for get is O(1) and add is O(1) for an ArrayList, while LinkedList is O(n) for get and O(1) for add. Does the iterator behave with the same Big O notation?
It depends largely on whether you know the maximum size of each list up front.
If you do, use ArrayList; it will certainly be faster.
Otherwise, you'll probably have to profile. While access to the ArrayList is O(1), creating it is not as simple, because of dynamic resizing.
Another point to consider is that the space-time trade-off is not clear cut. Each Java object has quite a bit of overhead. While an ArrayList may waste some space on surplus slots, each slot is only 4 bytes (or 8 on a 64-bit JVM). Each element of a LinkedList is probably about 50 bytes (perhaps 100 in a 64-bit JVM). So you have to have quite a few wasted slots in an ArrayList before a LinkedList actually wins its presumed space advantage. Locality of reference is also a factor, and ArrayList is preferable there too.
In practice, I almost always use ArrayList.
First Thoughts:
Refactor your code to not need the list.
Simplify the data down to a scalar data type, then use: int[]
Or even just use an array of whatever object you have: Object[] - John Gardner
Initialize the list to the full size: new ArrayList(123);
Of course, as everyone else is mentioning, do performance testing, prove your new solution is an improvement.
Iterating through a linked list is O(1) per element.
The Big O runtime for each option is the same. Probably the ArrayList will be faster because of better memory locality, but you'd have to measure it to know for sure. Pick whatever makes the code clearest.
Note that iterating through an instance of LinkedList can be O(n^2) if done naively. Specifically:
List<Object> list = new LinkedList<Object>();
for (int i = 0; i < list.size(); i++) {
list.get(i);
}
This is absolutely horrible in terms of efficiency, because every call to get(i) has to walk the list from one end to reach element i. If you do use LinkedList, be sure to use either an Iterator or Java 5's enhanced for-loop:
for (Object o : list) {
// ...
}
The above code is O(n), since the list is traversed statefully in-place.
To avoid all of the above hassle, just use ArrayList. It's not always the best choice (particularly for space efficiency), but it's usually a safe bet.
There is a new List implementation called GlueList which is faster than all classic List implementations.
Disclaimer: I am the author of this library
You almost certainly want an ArrayList. Both adding and reading are "amortized constant time" (i.e. O(1)) as specified in the documentation (note that this is true even if the list has to increase its size; it's designed like that, see http://java.sun.com/j2se/1.5.0/docs/api/java/util/ArrayList.html ). If you know roughly the number of objects you will be storing then even the ArrayList size increase is eliminated.
Adding to the end of a linked list is O(1), but the constant multiplier is larger than ArrayList (since you are usually creating a node object every time). Reading is virtually identical to the ArrayList if you are using an iterator.
It's a good rule to always use the simplest structure you can, unless there is a good reason not to. Here there is no such reason.
The exact quote from the documentation for ArrayList is: "The add operation runs in amortized constant time, that is, adding n elements requires O(n) time. All of the other operations run in linear time (roughly speaking). The constant factor is low compared to that for the LinkedList implementation."
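For the build-once, read-once scenario in the question, a sketch along these lines is probably all that is needed. The class and method names are made up, and the squares are just filler data; the point is the pre-sized constructor and the iterator-based read.

```java
import java.util.ArrayList;
import java.util.List;

final class BuildThenRead {

    static long sumOfSquares(int expectedSize) {
        // Pre-sizing avoids any intermediate re-allocation while filling.
        List<Integer> values = new ArrayList<>(expectedSize);
        for (int i = 0; i < expectedSize; i++) {
            values.add(i * i);
        }
        long sum = 0;
        for (int v : values) { // the enhanced for-loop uses the list's iterator
            sum += v;
        }
        return sum; // the list becomes garbage once the method returns
    }
}
```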
I suggest benchmarking it. It's one thing reading the API, but until you try it for yourself, it's academic.
It should be fairly easy to test; just make sure you do meaningful operations, or HotSpot will out-smart you and optimise it all to a NO-OP :)
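A deliberately naive timing sketch, assuming Java 8 or later. The class name, the element count, and the number of rounds are arbitrary, and the numbers it prints are only a rough indication; a dedicated harness such as JMH is the more reliable route for real measurements. The checksum is there so the work has an observable result.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

final class NaiveListBenchmark {

    // Fills the list, iterates it once, and returns a checksum so the JIT
    // cannot discard the work as dead code.
    static long fillAndSum(List<Integer> list, int n) {
        for (int i = 0; i < n; i++) {
            list.add(i);
        }
        long sum = 0;
        for (int v : list) {
            sum += v;
        }
        return sum;
    }

    static long timeNanos(List<Integer> list, int n) {
        long start = System.nanoTime();
        long checksum = fillAndSum(list, n);
        long elapsed = System.nanoTime() - start;
        if (checksum < 0) {
            throw new IllegalStateException("unexpected checksum"); // consumes the result
        }
        return elapsed;
    }

    public static void main(String[] args) {
        int n = 1_000_000;
        for (int round = 0; round < 5; round++) { // early rounds act as warm-up
            System.out.println("ArrayList:  " + timeNanos(new ArrayList<>(), n) + " ns");
            System.out.println("LinkedList: " + timeNanos(new LinkedList<>(), n) + " ns");
        }
    }
}
```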
I have actually begun to think that any use of data structures with non-deterministic behavior, such as ArrayList or HashMap, should be avoided, so I would say only use ArrayList if you can bound its size; any unbounded list use LinkedList. That is because I mainly code systems with near real time requirements though.
The main problem is that any memory allocation (which could happen randomly with any add operation) could also cause a garbage collection, and any garbage collection can cause you to miss a target. The larger the allocation, the more likely this is to occur, and this is also compounded if you are using CMS collector. CMS is non-compacting, so finding space for a new linked list node is generally going to be easier than finding space for a new 10,000 element array.
The more rigorous your approach to coding, the closer you can come to real time with a stock JVM. But choosing only data structures with deterministic behavior is one of the first steps you would have to take.