Dictionary dilemma: Array vs Arraylist

Which would be more efficient in Java: a 1D array holding 100 words, or an ArrayList holding the same words?
Note: I've only used arrays so far. I know what an ArrayList is, but I have never used one. Either way, the container would be used to select a word at random.

If you know the final number of elements from the beginning, there is no point in using an ArrayList over an array. ArrayLists are dynamic: they can grow, but you pay a small price for that in performance and memory. The difference is slim, but if you don't need the auto-grow feature of ArrayList, why ask for it?
Beyond that, there is another criterion that can make a bigger difference: arrays are covariant, whereas ArrayLists are not. That is, if B is a subclass of A, then a reference to an array of A can also hold a reference to an array of B, but a reference to an ArrayList of A cannot hold an ArrayList of B:
class A {}
class B extends A {}
A[] a = new B[1]; // OK
ArrayList<A> a2 = new ArrayList<B>(); // Error.
To circumvent this last error, you can try a bounded wildcard type such as:
ArrayList<? extends A> a3 = new ArrayList<B>();
but then you are restricting what a3 will accept, because nothing can be added through it:
a3.add(new A()); // Error!
a3.add(new B()); // Error again!
However, when you have a hierarchy of classes, it's usually a better idea to keep working with the superclass. Therefore, even when you have a set of objects of type B, where B is a subclass of A, keeping A[] and ArrayList<A> instead of B[] and ArrayList<B> to hold references to these objects is often better suited to OOP and easier to work with.
Sometimes you may have to cast from A to B in order to access a property or method of B that is not accessible from A. However, this can be considered a weakness in the design. OOP works best when you use polymorphism to its fullest extent: the base class (or superclass) should declare all the virtual methods needed to reach the behaviour of its subclasses, so that you can hold a reference to a subclass through the base class without any cast afterwards.
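The covariance difference described above also has a runtime consequence: a store that compiles against A[] can still fail if the array is really a B[]. A minimal sketch, assuming the A and B classes from this answer nested inside a hypothetical CovarianceDemo class (the method names are illustrative only):

```java
import java.util.ArrayList;
import java.util.List;

class CovarianceDemo {
    static class A {}
    static class B extends A {}

    // Returns true if storing an A into a B[] viewed as A[] fails at runtime.
    static boolean arrayStoreFails() {
        A[] array = new B[1];      // allowed: arrays are covariant
        try {
            array[0] = new A();    // compiles, but the runtime type is B[]
            return false;
        } catch (ArrayStoreException e) {
            return true;
        }
    }

    // Reading through a wildcard reference is always allowed.
    static int countElements(List<? extends A> list) {
        int n = 0;
        for (A ignored : list) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        System.out.println(arrayStoreFails());   // true
        List<B> bs = new ArrayList<>();
        bs.add(new B());
        System.out.println(countElements(bs));   // 1
    }
}
```

The generic version moves the same mistake to compile time, which is the trade-off the wildcard example illustrates.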

I suggest you use a List; there is almost no difference in performance between an array and a list.
But with a List, your code will be easier to manage and more flexible than with an array.

If efficiency is your biggest worry here, then you have no worries. Use whichever you want. You'll see no appreciable difference in performance between an array and a List for 99.99% (completely made up) of applications. In general Lists are preferred over arrays because they're easier to work with.
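For the use case in the question, picking a random word from a fixed set, the two containers are used almost identically. A minimal sketch, assuming a hypothetical three-word list in place of the 100 words and a made-up class name RandomWordDemo:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Random;

class RandomWordDemo {
    // Hypothetical word list; the question would have 100 entries.
    static final String[] WORDS = {"apple", "banana", "cherry"};

    // Pick a random element from a plain array.
    static String pickFromArray(String[] words, Random rng) {
        return words[rng.nextInt(words.length)];
    }

    // Pick a random element from a List (ArrayList or any other implementation).
    static String pickFromList(List<String> words, Random rng) {
        return words.get(rng.nextInt(words.size()));
    }

    public static void main(String[] args) {
        Random rng = new Random();
        System.out.println(pickFromArray(WORDS, rng));
        System.out.println(pickFromList(Arrays.asList(WORDS), rng));
    }
}
```

Both lookups are constant-time index accesses, so for 100 words the choice comes down to convenience.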

Related

Could java.util.ArrayList<T>.toArray() be made friendlier?

I'm surprised by how painful it is to use java.util.ArrayList<T>.toArray().
Suppose I declare my array list as:
java.util.ArrayList<double[]> arrayList = new java.util.ArrayList<double[]>();
... add some items ...
Then to convert it to an array, I have to do one of the following:
double[][] array = (double[][])arrayList.toArray(new double[0][]);
or:
double[][] array = (double[][])arrayList.toArray(new double[arrayList.size()][]);
or:
double[][] array = new double[arrayList.size()][];
arrayList.toArray(array);
None of the above are very readable. Shouldn't I be able to say the following instead?
double[][] array = arrayList.toArray();
But that gives a compile error because Object[] can't be converted to double[][].
Perhaps it's not possible because toArray has to return Object[]
for backwards compatibility with pre-template days.
But if that's the case, couldn't a friendlier alternative method be added
with a different name? I can't think of a good name, but almost anything
would be better than the existing ways; e.g. the following would be fine:
double[][] array = arrayList.toArrayOfNaturalType();
No such member function exists, but maybe it's possible to write a generic helper function that will do it?
double[][] array = MyToArray(arrayList);
The signature of MyToArray would be something like:
public static <T> T[] MyToArray(java.util.ArrayList<T> arrayList)
Is it possible to implement such a function?
My various attempts at implementing it resulted in compile errors
"error: generic array creation" or "error: cannot select from a type variable".
Here's the closest I was able to get:
public static <T> T[] MyToArray(java.util.ArrayList<T> arrayList, Class type)
{
T[] array = (T[])java.lang.reflect.Array.newInstance(type, arrayList.size());
arrayList.toArray(array);
return array;
}
It's called like this:
double[][] array = MyToArray(arrayList, double[].class);
I wish the redundant final parameter wasn't there, but, even so,
I think this is the least-horrible way that I've seen so far for converting array list to array.
Is it possible to do any better than this?

Is it possible to do any better than this?
Nope.
None of the above are very readable. Shouldn't I be able to say the following instead?
double[][] array = arrayList.toArray();
It would be nice ... but you can't.
The problem is that the toArray() method was specified way back in Java 1.2 with the behavior you are seeing. Generic types were not added to the language until Java 1.5. When they were added, the designers chose the "type erasure" approach, for compatibility with earlier versions of Java. So:
the semantics of the toArray() methods could not be changed without breaking compatibility, and
type erasure makes it impossible for a toArray() method implementation to know what the list's actual element type is, so it could not get it right anyway.
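The effect of erasure on toArray() can be observed directly. A small sketch, assuming a hypothetical ErasureDemo class; it only shows that the list's runtime class carries no element type and that the no-arg toArray() of ArrayList therefore hands back an Object[]:

```java
import java.util.ArrayList;
import java.util.List;

class ErasureDemo {
    // Both lists share one runtime class; the element type is not recorded.
    static boolean sameRuntimeClass() {
        List<String> strings = new ArrayList<>();
        List<double[]> arrays = new ArrayList<>();
        return strings.getClass() == arrays.getClass();
    }

    // The no-arg toArray() can only produce an Object[].
    static Class<?> noArgArrayType() {
        List<double[]> arrays = new ArrayList<>();
        arrays.add(new double[] {1.0});
        return arrays.toArray().getClass();
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass());                    // true
        System.out.println(noArgArrayType() == Object[].class);    // true
    }
}
```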

Unfortunately you cannot write
double[][] array = arrayList.toArray();
The reason is that toArray() was defined in JDK 1.2 (prior to generics) to return Object[]. This cannot be changed compatibly.
Generics were introduced in Java 5 but were implemented using erasure. This means that the ArrayList instance has no knowledge at runtime of the types of objects it contains; therefore, it cannot create an array of the desired element type. That's why you have to pass a type token of some sort -- in this case an actual array instance -- to tell ArrayList the type of the array to create.
You should be able to write
double[][] array = arrayList.toArray(new double[0][]);
without a cast. The one-arg overload of toArray() is generified, so you'll get the right return type.
One might think that it's preferable to pass a pre-sized array instead of a throwaway zero-length array. Aleksey Shipilev wrote an article analyzing this question. The answer is, somewhat counterintuitively, that creating a zero-length array is potentially faster.
Briefly, the reason is that allocation is cheap, a zero-length array is small, and it's probably going to be thrown away and garbage collected quickly, which is also cheap. By contrast, creating a pre-sized array requires it to be allocated and then filled with nulls/zeroes. It's then passed to toArray(), which then fills it with values from the list. Thus, every array element is typically written twice. By passing a zero-length array to toArray(), this allows the array allocation to occur in the same code as the array filling code, providing the opportunity for the JIT compiler to bypass the initial zero-fill, since it knows that every array element will be filled.
There is also JDK-8060192 which proposes to add the following:
<A> A[] Collection.toArray(IntFunction<A[]> generator)
This lets you pass a lambda expression that is given the array size and returns a created array of that size. (This is similar to Stream.toArray().) For example,
// NOT YET IMPLEMENTED
double[][] array = arrayList.toArray(n -> new double[n][]);
double[][] array = arrayList.toArray(double[][]::new);
This isn't implemented yet, but I'm still hopeful this can get into JDK 9.
You could rewrite your helper function along these lines:
static <T> T[] myToArray(List<T> list, IntFunction<T[]> generator) {
return list.toArray(generator.apply(list.size()));
}
(Note that there is some subtlety here with concurrent modification of the list, which I'm ignoring for this example.) This would let you write:
double[][] array = myToArray(arrayList, double[][]::new);
which isn't terribly bad. But it's not actually clear that it's any better than just allocating a zero-length array to pass to toArray().
Finally, one might ask why toArray() takes an actual array instance instead of a Class object to denote the desired element type. Joshua Bloch (creator of the Java collections framework) said in comments on JDK-5072831 that this is feasible but that he's not sure it's a good idea, though he could live with it.
There is an additional use case here as well, of copying the elements into an existing array, like the old Vector.copyInto() method. The array-bearing toArray(T[]) method also supports this use case. In fact, it's better than Vector.copyInto() because the latter cannot be used safely in the presence of concurrent modification, if the collection's size changes. The auto-sizing behavior of toArray(T[]) handles this, and it also handles the case of creating an array of the caller's desired type as described above. Thus, while adding an overload that takes a Class object would certainly work, it doesn't add much over the existing API.
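The two behaviours of toArray(T[]) mentioned above, allocating a correctly typed array when the argument is too small and copying into the argument when it is large enough, can be seen in a short sketch. The class name ToArrayDemo and the sample data are assumptions for illustration:

```java
import java.util.ArrayList;
import java.util.List;

class ToArrayDemo {
    static List<double[]> sampleList() {
        List<double[]> list = new ArrayList<>();
        list.add(new double[] {1.0, 2.0});
        list.add(new double[] {3.0});
        return list;
    }

    public static void main(String[] args) {
        List<double[]> list = sampleList();

        // Too small: toArray allocates a new array of the same runtime type.
        double[][] fresh = list.toArray(new double[0][]);
        System.out.println(fresh.length);            // 2

        // Large enough: toArray fills and returns the array it was given,
        // and writes null into the slot right after the last element.
        double[][] target = new double[3][];
        double[][] returned = list.toArray(target);
        System.out.println(returned == target);      // true
        System.out.println(target[2] == null);       // true
    }
}
```

Note that no cast is needed in either call, because the one-argument overload is generic.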

Creating a Generic array without using Arraylist

In order to complete one of my Java assignments, I have to do what seems like the impossible.
I have to create a method that takes in different stuff and plugs it into an array. We don't necessarily know what is being put into the array and thus the array must be able to accept Strings, Double, Integer, etc...
Of course, the obvious solution would be to use ArrayList<E> (i.e. a generic array). However, that's partly the complication of the problem. We cannot use an ArrayList, only a regular array. As far as I can find, when creating an array its intake value must be declared. Which leads me to believe that this assignment is impossible (yet I doubt the teacher would give me an impossible assignment).
Any suggestions?

You can always use an array of Object - Object[].
Object[] objects = new Object[2];
objects[0] = "ABC";
objects[1] = Integer.valueOf("15");

Are you sure you need a generic array or an array that can hold anything?
If the former, then create a class that will act as wrapper of Object[] array and use a <T> generic for type cast when getting the elements of the array, which is similar to the implementation of ArrayList class. If the latter, use Object[] directly.
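The wrapper idea can be sketched as follows. This is only an illustration, assuming a hypothetical class name GenericArray and a fixed length chosen at construction; it is not how the assignment must be solved:

```java
class GenericArray<T> {
    // The backing store is a plain Object[], so no generic array is created.
    private final Object[] elements;

    GenericArray(int length) {
        elements = new Object[length];
    }

    void set(int index, T value) {
        elements[index] = value;
    }

    @SuppressWarnings("unchecked")
    T get(int index) {
        // Safe because set() only ever stores values of type T.
        return (T) elements[index];
    }

    int length() {
        return elements.length;
    }

    public static void main(String[] args) {
        GenericArray<String> words = new GenericArray<>(2);
        words.set(0, "ABC");
        String first = words.get(0);   // no cast needed at the call site
        System.out.println(first);     // ABC
    }
}
```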

Fixed number of generic objects which can be iterated over

I have a class which always holds four objects:
class Foo<E> {
Cow<E> a, b, c, d;
}
I want to be able to iterate over them, so ideally I'd like to use an array:
class Foo<E> {
Cow<E>[] cows = new Cow<E>[4]; // won't work, can't create generic array
}
I don't want to use a list or a set since I want there to always be 4 Cow objects. What's the best solution for me?

If you want to preserve the genericity, you will have to reimplement something similar to a list and I don't think it is worth it.
You said:
The first is that you can add and remove elements to and from a list.
Well you can create an unmodifiable list:
List<Cow<E>> list = Collections.unmodifiableList(Arrays.asList(a, b, c, d));
The second is that I'm creating a quadtree data structure and using a list wouldn't be too good for performance. Quadtrees have a lot of quadrants and using lists would decrease performance significantly.
First you can initialise the list to the right size:
List<Cow<E>> list = new ArrayList<>(4);
Once you have done that, the list will only use a little bit more memory than an array (probably 8 bytes: 4 bytes for the backing array reference and another 4 bytes for the size).
And in terms of performance, an ArrayList performs almost as well as an array.
Bottom line: I would start by using a list and measure the performance. If it is not good enough AND it is due to using a list instead of an array, then you will have to adapt your design - but I doubt that this will be your main issue.

Use a generic ArrayList and simply have methods to insert values into your object, and do checks inside those methods, to make sure you don't end up having more than 4 Cow objects.

I would suggest creating a bounded list. Java does not have a built-in one; however, you can create a custom one using Google collections or use the one in Apache collections. See Is there a bounded non-blocking Collection in Java?

Use Collection instead of array:
List<Cow<E>> cows = new ArrayList<>(); // in Java 7
Or
List<Cow<E>> cows = new ArrayList<Cow<E>>(); //Java 6 and below
More information will show why it is IMPOSSIBLE to have arrays with generics. You can see here

Cow<E>[] cows = (Cow<E>[])new Cow[4];
or
Cow<E>[] cows = (Cow<E>[])new Cow<?>[4];
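The unchecked cast above can be kept private to the class so that callers only see four iterable slots. A sketch under assumptions: the Cow class body, the payload field, and the set method are hypothetical, since the question does not show them:

```java
import java.util.Arrays;
import java.util.Iterator;

// Hypothetical stand-in for the question's Cow<E>.
class Cow<E> {
    final E payload;

    Cow(E payload) {
        this.payload = payload;
    }
}

class Foo<E> implements Iterable<Cow<E>> {
    // Unchecked cast: the runtime type is Cow[], which is fine as long as
    // the array never escapes this class.
    @SuppressWarnings("unchecked")
    private final Cow<E>[] cows = (Cow<E>[]) new Cow<?>[4];

    void set(int index, Cow<E> cow) {
        cows[index] = cow;
    }

    @Override
    public Iterator<Cow<E>> iterator() {
        // Arrays.asList gives a fixed-size view, so the count stays at four.
        return Arrays.asList(cows).iterator();
    }

    public static void main(String[] args) {
        Foo<String> foo = new Foo<>();
        for (int i = 0; i < 4; i++) {
            foo.set(i, new Cow<>("cow" + i));
        }
        for (Cow<String> cow : foo) {
            System.out.println(cow.payload);
        }
    }
}
```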

Using Java List when array is enough

Is it advisable to use Java Collections List in the cases when you know the size of the list before hand and you can also use array there? Are there any performance drawbacks?
Can a list be initialised with elements in a single statement like an array (list of all elements separated by commas) ?

Is it advisable to use Java Collections List in the cases when you know the size of the list before hand and you can also use array there ?
In some (probably most) circumstances yes, it is definitely advisable to use collections anyway, in some circumstances it is not advisable.
On the pro side:
If you use a List instead of an array, your code can use methods like contains, add, remove and so on.
A lot of library classes expect collection-typed arguments.
You don't need to worry that the next version of the code may require a more dynamically sized array ... which would make an initial array-based approach a liability.
On the con side:
Collections are a bit slower, and more so if the base type of your array is a primitive type.
Collections do take more memory, especially if the base type of your array is a primitive type.
But performance is rarely a critical issue, and in many cases the performance difference is not relevant to the big picture.
And in practice, there is often a cost in performance and/or code complexity involved in working out what the array's size should be. (Consider the hypothetical case where you used a char[] to hold the concatenation of a series of strings. You can work out how big the array needs to be, e.g. by adding up the component string sizes. But it is messy!)

Collections/lists are more flexible and provide more utility methods. For most situations, any performance overhead is negligible.
And for this single statement initialization, use:
Arrays.asList(yourArray);
From the docs:
Returns a fixed-size list backed by the specified array. (Changes to the returned list "write through" to the array.) This method acts as bridge between array-based and collection-based APIs, in combination with Collection.toArray. The returned list is serializable and implements RandomAccess.
My guess is that this is the most efficient way to convert to a list, but I may be wrong.
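The "fixed-size" and "write through" parts of that documentation are easy to miss. A short sketch, assuming a hypothetical AsListDemo class, that shows both:

```java
import java.util.Arrays;
import java.util.List;

class AsListDemo {
    // Returns true if the list refuses to grow.
    static boolean addIsRejected(List<String> list) {
        try {
            list.add("four");
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        String[] array = {"one", "two", "three"};
        List<String> view = Arrays.asList(array);

        view.set(0, "ONE");                        // writes through to the array
        System.out.println(array[0]);              // ONE
        System.out.println(addIsRejected(view));   // true
    }
}
```

If a resizable list is needed, the view can be copied into a new ArrayList.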

1) A Collection is the most basic type and only implies that there is a collection of objects. If there is no order or duplication use java.util.Set, if there is possible duplication and ordering use java.util.List, if there is ordering but no duplication use java.util.SortedSet
2) Curly brackets to instantiate an Array, Arrays.asList() plus generics for the type inference
List<String> myStrings = Arrays.asList(new String[]{"one", "two", "three"});
There is also a trick using anonymous types but personally I'm not a big fan:
List<String> myStrings = new ArrayList<String>(){
// this is the inside of an anonymous class
{
// this is the inside of an instance block in the anonymous class
this.add("one");
this.add("two");
this.add("three");
}};

Yes, it is advisable.
Some of the various list constructors (like ArrayList) even take arguments so you can "pre-allocate" sufficient backing storage, alleviating the need for the list to "grow" to the proper size as you add elements.

There are different things to consider: Is the type of the array known? Who accesses the array?
There are several issues with arrays, e.g.:
you can not create generic arrays
arrays are covariant: if A extends B -> A[] extends B[], which can lead to ArrayStoreExceptions
you cannot make the elements of an array immutable
...
Also see, item 25 "Prefer lists to arrays" of the Effective Java book.
That said, sometimes arrays are convenient, e.g. the new Object... parameter syntax.
How can a list be initialised with elements in a single statement like an array = {list of all elements separated by commas} ?
Arrays.asList(): http://download.oracle.com/javase/6/docs/api/java/util/Arrays.html#asList%28T...%29

Is it advisable to use Java Collections List in the cases when you know the size of the list before hand and you can also use array there ? Performance drawbacks ???
If an array is enough, then use an array. Just to keep things simple. You may even get a slightly better performance out of it. Keep in mind that if you...
ever need to pass the resulting array to a method that takes a Collection, or
if you ever need to work with List-methods such as .contains, .lastIndexOf, or what not, or
if you need to use Collections methods, such as reverse...
then you may just as well go for the Collection/List classes from the beginning.
How can a list be initialised with elements in a single statement like an array = {list of all elements separated by commas} ?
You can do
List<String> list = Arrays.asList("foo", "bar");
or
List<String> arrayList = new ArrayList<String>(Arrays.asList("foo", "bar"));
or
List<String> list = new ArrayList<String>() {{ add("foo"); add("bar"); }};

Is it advisable to use Java Collections List in the cases when you know the size of the list before hand and you can also use array there ? Performance drawbacks ?
It can be perfectly acceptable to use a List instead of an array, even if you know the size before hand.
How can a list be initialised with elements in a single statement like an array = {list of all elements separated by commas} ?
See Arrays.asList().

Comparing an array and getting the difference

How would I compare two arrays that might have different lengths and get the difference between each array?
For example:
Cat cat = new Cat();
Dog dog = new Dog();
Alligator alligator = new Alligator();
Animal animals[] = { cat, dog };
Animal animals2[] = { cat, dog, alligator };
How would I compare these two arrays and make it return the instance of Alligator?

I would suggest that your question needs to be clarified. Currently, everyone is guessing about what you are actually asking.
Are the arrays intended to represent sets, or lists, or something in between? In other words, does element order matter, and can there be duplicates?
What does "equal" mean? Does new Cat() "equal" new Cat()? Your example suggests that it does!!
What do you mean by the "difference"? Do you mean set difference?
What do you want to happen if the two arrays have the same length?
Is this a once-off comparison or does it occur repeatedly for the same arrays?
How many elements are there in the arrays (on average)?
Why are you using arrays at all?
Making the assumption that these arrays are intended to be true sets, then you probably should be using HashSet instead of arrays, and using collection operations like addAll and retainAll to calculate the set difference.
On the other hand, if the arrays are meant to represent lists, it is not at all clear what "difference" means.
If it is critical that the code runs fast, then you most certainly need to rethink your data structures. If you always start with arrays, you are not going to be able to calculate the "differences" fast ... at least in the general case.
Finally, if you are going to use anything that depends on the equals(Object) method (and that includes any of the Java collection types), you really need to have a clear understanding of what "equals" is supposed to mean in your application. Are all Cat instances equal? Are they all different? Are some Cat instances equal and others not? If you don't figure this out, and implement the equals and hashCode methods accordingly, you will get confusing results.
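As an illustration of that last point, here is one possible definition of equality, assuming (purely for the sake of the example) that two cats are equal when they have the same name. The NamedCat class and its name field are hypothetical and not part of the question:

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

class NamedCat {
    private final String name;

    NamedCat(String name) {
        this.name = name;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof NamedCat)) return false;
        return name.equals(((NamedCat) other).name);
    }

    @Override
    public int hashCode() {
        // Must agree with equals: equal names give equal hash codes.
        return Objects.hash(name);
    }

    public static void main(String[] args) {
        Set<NamedCat> cats = new HashSet<>();
        cats.add(new NamedCat("Tom"));
        System.out.println(cats.contains(new NamedCat("Tom")));     // true
        System.out.println(cats.contains(new NamedCat("Felix")));   // false
    }
}
```

Without these two overrides, HashSet would fall back to identity, and a second new NamedCat("Tom") would not be found.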

I suggest that you put your objects in sets and then use an intersection of the sets:
// Considering you put your objects in setA and setB
Set<Object> intersection = new HashSet<Object>(setA);
intersection.retainAll(setB);
After that you can use removeAll to get a difference to any of the two sets:
setA.removeAll(intersection);
setB.removeAll(intersection);
Inspired by: http://hype-free.blogspot.com/2008/11/calculating-intersection-of-two-java.html
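If the original collections should not be modified, the difference can be computed on a copy instead. A minimal sketch, assuming a hypothetical helper named difference and using strings as stand-ins for the Animal instances; like the set operations above, it relies on equals and hashCode:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class SetDifferenceDemo {
    // Elements of 'left' that are not in 'right'; neither argument is modified.
    static <T> Set<T> difference(T[] left, T[] right) {
        Set<T> result = new HashSet<>(Arrays.asList(left));
        result.removeAll(Arrays.asList(right));
        return result;
    }

    public static void main(String[] args) {
        String[] animals = {"cat", "dog"};
        String[] animals2 = {"cat", "dog", "alligator"};
        System.out.println(difference(animals2, animals));   // [alligator]
        System.out.println(difference(animals, animals2));   // []
    }
}
```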

Well, you could maybe use Set instead and use the removeAll() method.
Or you could use the following simple and slow algorithm for doing:
List<Animal> differences = new ArrayList<Animal>();
for (Animal a1 : animals) {
boolean isInSecondArray = false;
for (Animal a2 : animals2) {
if (a1 == a2) {
isInSecondArray = true;
break;
}
}
if (!isInSecondArray)
differences.add(a1);
}
Then differences will have all the objects that are in animals array but not in animals2 array. In a similar way you can do the opposite (get all the objects that are in animals2 but not in animals).

You may want to look at this article for more information:
http://download-llnw.oracle.com/javase/tutorial/collections/interfaces/set.html
As was mentioned, removeAll() is made for this, but you will want to do it twice, so that you can create a list of all that are missing in both, and then you could combine these two results to have a list of all the differences.
But, this is a destructive operation, so if you don't want to lose the information, copy the Set and operate on that one.
UPDATE:
It appears that my assumption of what is in the array is wrong, so removeAll() won't work, but with a 5ms requirement, depending on the number of items to search it could be a problem.
So, it would appear a HashMap<String, Animal> would be the best option, as it is fast in searching.
Animal is an interface with at least one property, String name. For each class that implements Animal write code for equals and hashCode. You can find some discussion here: http://www.ibm.com/developerworks/java/library/j-jtp05273.html. This way, if you want the hash value to be a combination of the type of animal and the name then that will be fine.
So, the basic algorithm is to keep everything in the hashmaps, and then to search for differences, just get an array of keys, and search through to see if that key is contained in the other list, and if it isn't put it into a List<Object>, storing the value there.
You will want to do this twice, so, if you have at least a dual-core processor, you may get some benefit out of having both searches being done in separate threads, but then you will want to use one of the concurrent datatypes added in JDK5 so that you don't have to worry about synchronizations in the combined list of differences.
So, I would write it first as a single thread and test, to get some idea of how much faster it is, also comparing it to the original implementation.
Then, if you need it faster, try using threads, again, compare to see if there is a speed increase.
Before making any optimization ensure you have some metrics on what you already have, so that you can compare and see if the one change will lead to an increase in speed.
If you make too many changes at a time, one may have a large improvement on speed, but others may lead to a performance decrease, and it wouldn't be seen, which is why each change should be one at a time.
Don't lose the other implementations though, by using unit tests and testing perhaps 100 times each, you can get an idea as to what improvement each change gives you.

I don't care about perf for my usages (and you shouldn't either, unless you have a good reason to, and you find out via your profiler that this code is the bottleneck).
What I do is similar to functional's answer. I use LINQ set operators to get the exception on each list:
http://msdn.microsoft.com/en-us/library/bb397894.aspx
Edit:
Sorry, I didn't notice this is Java. Sorry, I'm off in C# la-la land, and they look very similar :)
