In Java, the ArrayList<E> implementation is based on an array of objects.
Can anybody explain why the implementation of ArrayList<E> uses an Object[] array for data storage instead of E[]? What is the benefit of using Object[]?
In Java, creating an array of a generic type is not straightforward.
The simple approach does not compile:
public class Container<E> {
E[] arr = new E[3]; // ERROR: Cannot create a generic array of E
}
Replace E with Object, and all is well (at the expense of added complexity elsewhere in the container implementation).
There are alternative approaches, but they present a different set of tradeoffs. For an extensive discussion, see How to create a generic array in Java?
So first of all, realize that the actual runtime type of the array object must be Object[]. This is because arrays know their component types at runtime (different array types are actually different types at runtime), and thus you need to specify the component type in creating the array, but the ArrayList object does not know its type argument at runtime.
That said, the compile-time type of the instance variable could be declared as either Object[] or E[], with different advantages and disadvantages:
If it is declared as Object[]:
private Object[] arr;
// to create it:
arr = new Object[3];
// to get an element:
E get(int i) { return (E)arr[i]; }
The disadvantage of this is that you must cast it to E every time you take something out of it, which means you are basically using it as a pre-generics container.
If it is declared as E[]:
private E[] arr;
// to create it:
arr = (E[])new Object[3];
// to get an element:
E get(int i) { return arr[i]; }
The advantage of this is that you no longer have to cast when you get things out of it -- it provides type-checking on the uses of arr, like generic containers. The disadvantage is that, logically, the cast is a lie -- we know we created an object whose runtime type is Object[], and so it is not an instance of E[], unless E is Object.
However, there is no immediate problem with doing this, because E is erased to Object inside the instance methods of the class. The only way a problem can occur is if the object is somehow exposed to the outside of the class (e.g. returned in a method, put in a public field, etc.) in a capacity that uses its type as E[] (which it's not):
// This would be bad. It would cause a class cast exception at the call site
E[] getArray() { return arr; }
But ArrayList, and indeed any properly designed container class, would never expose an implementation detail such as its internal array to the outside. It would break abstraction, among other things. So as long as the author of the class takes care never to expose this array, there is no problem with doing it this way (save perhaps confusing the next person who reads the code), and the author is free to take advantage of the increased type-checking that it brings.
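To make the failure mode described above concrete, here is a minimal sketch. The class LeakyContainer and its methods are made up for illustration and are not part of the JDK; the point is only that the cast to E[] is harmless until the array escapes under a more specific static type.

```java
public class LeakyContainer<E> {
    // The runtime type of this array is Object[], whatever E is.
    @SuppressWarnings("unchecked")
    private final E[] arr = (E[]) new Object[3];

    void set(int i, E value) { arr[i] = value; }

    // Safe: only a single element leaves the class; the array stays hidden.
    E get(int i) { return arr[i]; }

    // Unsafe: exposes the Object[] under the static type E[].
    E[] getArray() { return arr; }

    // Returns true if assigning the exposed array to a String[] fails.
    static boolean leakFails() {
        LeakyContainer<String> c = new LeakyContainer<>();
        c.set(0, "hello");
        try {
            String[] leaked = c.getArray(); // the compiler inserts a cast to String[] here
            return leaked == null;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        LeakyContainer<String> c = new LeakyContainer<>();
        c.set(0, "hello");
        System.out.println(c.get(0));    // element access works fine
        System.out.println(leakFails()); // exposing the array does not
    }
}
```

Note that the exception is raised at the call site, not inside getArray(), which is why the bug can be hard to trace back to the container.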
Considering type erasure (that is, the fact that generic type parameters such as E in your example are erased at compile time), I suspect the generated bytecode would be similar in both cases.
From a maintenance point of view, using a type parameter instead of Object would lead to easier to read code (since it would limit casts). But since the API of ArrayList never exposes that "raw" Object array, I suppose it does not make any difference for us mere Java developers :)
Related
I am working with data structures and I cannot seem to initialize an array of generic elements:
public class Heap<E extends Comparable<E>> {
    private E[] elements;

    public Heap(int n) {
        // Compiles with an unchecked warning, but the cast fails at runtime.
        E[] es = (E[]) new Object[n];
        elements = es;
    }

    public static void main(String[] args) {
        Heap<Integer> tree = new Heap<Integer>(10);
    }
}
When I run the program I get this error:
Exception in thread "main" java.lang.ClassCastException: [Ljava.lang.Object; cannot be cast to [Ljava.lang.Comparable;
at heaptree.Heap.<init>(Heap.java:16)
at heaptree.Heap.main(Heap.java:70)
Can somebody suggest a way to fix this?
Problem
You can't do it that way in Java, because you would need information at run time that is only available at compile time. You simply can't create an array of a generic type.
Wherever you create a new Heap<Whatever>, only the compiler knows that you want a Whatever for your generic E type. Inside the Heap constructor you try to create an array that can be used as E[], meaning a Whatever[] in that example case. Java arrays are covariant, so a Whatever[] can be used where an Object[] is expected, but not the other way round: an array created as new Object[n] is not a Whatever[], even though Whatever is of course a subclass of Object. Here E is bounded by Comparable<E>, so the cast (E[]) is compiled as a cast to Comparable[], which is exactly the ClassCastException you see.
So only an array whose runtime component type is Whatever (or a subtype of it) can be used where a Whatever[] is expected, which means you would need to know at run time that this time you have to create an array of Whatevers. And the <Whatever> information is erased by the compiler, so it is not available at run time.
Workaround / Solution
If you look at the sources of e.g. ArrayList, you'll see that the experts have done it differently, not trying to create an array of the specific type, but of a constant, general type Object. ArrayList only casts things like the return value of the get() method to the generic type, not the internal array itself.
So, create your array with a fixed type that covers all possible E types, e.g. Comparable[], and have methods like get() cast the array element to E before returning it.
Or, you can pass the E class object into the constructor like Heap(Class<E> targetClass, int n), so you get it available at run-time, and use Array.newInstance(targetClass, n) with that class object and the desired dimension.
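Both workarounds can be sketched roughly as follows. This is only an illustration under assumptions: HeapSketch, CastingHeap, TokenHeap and their set/get/componentType methods are hypothetical names, and no actual heap logic is shown, only the array handling.

```java
import java.lang.reflect.Array;

public class HeapSketch {

    // Variant 1: store elements in a Comparable[] and cast on the way out.
    static class CastingHeap<E extends Comparable<E>> {
        private final Comparable<?>[] elements;

        CastingHeap(int n) {
            elements = new Comparable<?>[n];
        }

        void set(int i, E e) { elements[i] = e; }

        // Only set(int, E) writes to the array, so this cast cannot fail.
        @SuppressWarnings("unchecked")
        E get(int i) { return (E) elements[i]; }
    }

    // Variant 2: the caller passes a Class<E> token, so the array really is an E[].
    static class TokenHeap<E extends Comparable<E>> {
        private final E[] elements;

        @SuppressWarnings("unchecked")
        TokenHeap(Class<E> targetClass, int n) {
            elements = (E[]) Array.newInstance(targetClass, n);
        }

        void set(int i, E e) { elements[i] = e; }

        E get(int i) { return elements[i]; }

        Class<?> componentType() { return elements.getClass().getComponentType(); }
    }

    public static void main(String[] args) {
        CastingHeap<Integer> a = new CastingHeap<>(10);
        a.set(0, 42);
        System.out.println(a.get(0));

        TokenHeap<Integer> b = new TokenHeap<>(Integer.class, 10);
        b.set(0, 7);
        System.out.println(b.get(0) + " " + b.componentType().getSimpleName());
    }
}
```

The first variant keeps the constructor signature unchanged at the cost of a cast in every accessor; the second needs an extra constructor argument but produces an array whose runtime type matches E.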
I'm learning Java generics and reading through Generic Methods.
This page starts with
Consider writing a method that takes an array of objects and a collection and puts all objects in the array into the collection
It then states
By now, you will have learned to avoid the beginner's mistake of trying to use Collection<Object> as the type of the collection parameter.
The page implies that using Collection<Object> won't work.
Why is that an error? Why is it a beginner's error?
Collection<Object> as the parameter works fine for me. Am I so beginner that I've somehow made code that works, but misses the point of the exercise?
import java.util.ArrayList;
import java.util.Collection;

public class test {
    static void fromArrayToCol(Object[] a, Collection<Object> c) {
        for (Object x : a) {
            c.add(x);
        }
        System.out.println(c);
    }

    public static void main(String[] args) {
        Object[] oa = new Object[] {"hello", 678};
        Collection<Object> c = new ArrayList<>();
        test.fromArrayToCol(oa, c);
    }
}
It looks to me like Oracle's tutorial is wrong in its assertion. But I'm a beginner, so it's likely that I'm not grasping what it's trying to tell me.
You can find the answer if you read the Wildcards section.
The problem is that this new version is much less useful than the old one. Whereas the old code could be called with any kind of collection as a parameter, the new code only takes Collection<Object>, which, as we've just demonstrated, is not a supertype of all kinds of collections!
Here, old version refers to parameter Collection whereas new code refers to Collection<Object>
When you have a parameter of type Collection<Object>, you can pass either a Collection (raw type) or a Collection<Object>. You cannot pass any other collection, such as Collection<String> or Collection<SomeClass>.
So, the goal of that tutorial is to copy the elements of an array containing any type to a new collection of the same type.
Example: Integer[] to Collection<Integer>
I would say it wasn't worded properly to bring out the above meaning.
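A small sketch of the difference, assuming hypothetical method names copyToObjects and copyGeneric (neither is from the tutorial): the Collection<Object> version compiles on its own, but rejects a List<Integer> argument, while the generic version accepts it.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

public class ArrayToCollectionDemo {

    // Accepts only a Collection<Object> (or a raw Collection).
    static void copyToObjects(Object[] a, Collection<Object> c) {
        for (Object o : a) {
            c.add(o);
        }
    }

    // Accepts any array/collection pair that shares the element type T.
    static <T> void copyGeneric(T[] a, Collection<T> c) {
        for (T o : a) {
            c.add(o);
        }
    }

    public static void main(String[] args) {
        Integer[] numbers = {1, 2, 3};
        List<Integer> ints = new ArrayList<>();

        // copyToObjects(numbers, ints); // does not compile: List<Integer> is not a Collection<Object>
        copyGeneric(numbers, ints);
        System.out.println(ints); // prints [1, 2, 3]
    }
}
```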
It's often a mistake not because it's a compiler error but because having a Collection<Object> is very rarely useful. Such a collection can hold anything. How often do you need a collection that can hold anything and everything? Very rarely. There will almost always be a more specific type parameter you can use for your collection.
Using Collection<Object> more often than not just makes a programmer's life harder than it needs to be. To get anything out of it, we need to inspect its type (e.g. use instanceof) and cast it.
By using the most appropriate type parameter, you give yourself compile-time assurance that the collection will only contain the types of objects that you expect it will and the resulting code is more concise and more readable.
The beginner's mistake they're referring to is the attempt to use Collection<Object> as the parameter type when you intend to accept any collection of something.
Because Object is the superclass of all Java classes, one may think Collection<Object> is the "super-collection" of all collections. That this is not the case is demonstrated in the Wildcards section:
The problem is that this new version is much less useful than the old one. Whereas the old code could be called with any kind of collection as a parameter, the new code only takes Collection<Object>, which, as we've just demonstrated, is not a supertype of all kinds of collections!
Instead of Collection<Object> you have to use Collection<T> to express that your method accepts any collection of something.
To clarify: if you do not know the element type of a collection at compile time, you cannot put an element in it.
In your current example, you do know the type of object you wish to put into the collection (in your case it's Object). The example shown on the site you linked is
static void fromArrayToCollection(Object[] a, Collection<?> c) {
for (Object o : a) {
c.add(o); // compile-time error
}
}
Notice that there is a ? and not an Object. Hence you get a compile-time error.
In the example you showed in your question which uses Object it does compile; however, there are better ways to solve this problem.
The biggest problem with that is that it only works with collections whose type argument is exactly Object. This makes it quite restrictive. The page is telling you that there is a way that will work for any collection and array as long as they hold the same type.
I think that's also an option:
private <T> void arrayToCollection(T[] objArray, Collection<T> collection) {
for (T obj : objArray) {
collection.add(obj);
}
}
I'm surprised by how painful it is to use java.util.ArrayList<T>.toArray().
Suppose I declare my array list as:
java.util.ArrayList<double[]> arrayList = new java.util.ArrayList<double[]>();
... add some items ...
Then to convert it to an array, I have to do one of the following:
double[][] array = (double[][])arrayList.toArray(new double[0][]);
or:
double[][] array = (double[][])arrayList.toArray(new double[arrayList.size()][]);
or:
double[][] array = new double[arrayList.size()][];
arrayList.toArray(array);
None of the above are very readable. Shouldn't I be able to say the following instead?
double[][] array = arrayList.toArray();
But that gives a compile error because Object[] can't be converted to double[][].
Perhaps it's not possible because toArray has to return Object[] for backwards compatibility with pre-template days. But if that's the case, couldn't a friendlier alternative method be added with a different name? I can't think of a good name, but almost anything would be better than the existing ways; e.g. the following would be fine:
double[][] array = arrayList.toArrayOfNaturalType();
No such member function exists, but maybe it's possible to write a generic helper function that will do it?
double[][] array = MyToArray(arrayList);
The signature of MyToArray would be something like:
public static <T> T[] MyToArray(java.util.ArrayList<T> arrayList)
Is it possible to implement such a function?
My various attempts at implementing it resulted in the compile errors "error: generic array creation" or "error: cannot select from a type variable".
Here's the closest I was able to get:
public static <T> T[] MyToArray(java.util.ArrayList<T> arrayList, Class type)
{
T[] array = (T[])java.lang.reflect.Array.newInstance(type, arrayList.size());
arrayList.toArray(array);
return array;
}
It's called like this:
double[][] array = MyToArray(arrayList, double[].class);
I wish the redundant final parameter wasn't there, but, even so, I think this is the least-horrible way that I've seen so far for converting an array list to an array.
Is it possible to do any better than this?
Is it possible to do any better than this?
Nope.
None of the above are very readable. Shouldn't I be able to say the following instead?
double[][] array = arrayList.toArray();
It would be nice ... but you can't.
The problem is that the toArray() method was specified way back in Java 1.2 with the behavior you are seeing. Generic types were not added to the language until Java 1.5. When they were added, the designers chose the "type erasure" approach, for compatibility with earlier versions of Java. So:
the semantics of the toArray() methods could not be changed without breaking compatibility, and
type erasure makes it impossible for a toArray() method implementation to know what the list's actual element type is, so it could not get it right anyway.
Unfortunately you cannot write
double[][] array = arrayList.toArray();
The reason is that toArray() was defined in JDK 1.2 (prior to generics) to return Object[]. This cannot be changed compatibly.
Generics were introduced in Java 5 but were implemented using erasure. This means that the ArrayList instance has no knowledge at runtime of the types of objects it contains; therefore, it cannot create an array of the desired element type. That's why you have to pass a type token of some sort -- in this case an actual array instance -- to tell ArrayList the type of the array to create.
You should be able to write
double[][] array = arrayList.toArray(new double[0][]);
without a cast. The one-arg overload of toArray() is generified, so you'll get the right return type.
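As a minimal, self-contained sketch of the cast-free call (the class ToArrayDemo and the helper toRows are names invented for this example):

```java
import java.util.ArrayList;
import java.util.List;

public class ToArrayDemo {

    static double[][] toRows(List<double[]> rows) {
        // The argument mainly supplies the runtime array type; toArray allocates
        // a correctly sized double[][] whenever the given array is too small.
        return rows.toArray(new double[0][]);
    }

    public static void main(String[] args) {
        List<double[]> rows = new ArrayList<>();
        rows.add(new double[] {1.0, 2.0});
        rows.add(new double[] {3.0});

        double[][] array = toRows(rows); // no cast needed
        System.out.println(array.length + " rows, second row starts with " + array[1][0]);
    }
}
```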
One might think that it's preferable to pass a pre-sized array instead of a throwaway zero-length array. Aleksey Shipilev wrote an article analyzing this question. The answer is, somewhat counterintuitively, that creating a zero-length array is potentially faster.
Briefly, the reason is that allocation is cheap, a zero-length array is small, and it's probably going to be thrown away and garbage collected quickly, which is also cheap. By contrast, creating a pre-sized array requires it to be allocated and then filled with nulls/zeroes. It's then passed to toArray(), which then fills it with values from the list. Thus, every array element is typically written twice. By passing a zero-length array to toArray(), this allows the array allocation to occur in the same code as the array filling code, providing the opportunity for the JIT compiler to bypass the initial zero-fill, since it knows that every array element will be filled.
There is also JDK-8060192 which proposes to add the following:
<A> A[] Collection.toArray(IntFunction<A[]> generator)
This lets you pass a lambda expression that is given the array size and returns a created array of that size. (This is similar to Stream.toArray().) For example,
// NOT YET IMPLEMENTED
double[][] array = arrayList.toArray(n -> new double[n][]);
double[][] array = arrayList.toArray(double[][]::new);
This isn't implemented yet, but I'm still hopeful this can get into JDK 9.
You could rewrite your helper function along these lines:
static <T> T[] myToArray(List<T> list, IntFunction<T[]> generator) {
return list.toArray(generator.apply(list.size()));
}
(Note that there is some subtlety here with concurrent modification of the list, which I'm ignoring for this example.) This would let you write:
double[][] array = myToArray(arrayList, double[][]::new);
which isn't terribly bad. But it's not actually clear that it's any better than just allocating a zero-length array to pass to toArray().
Finally, one might ask why toArray() takes an actual array instance instead of a Class object to denote the desired element type. Joshua Bloch (creator of the Java collections framework) said in comments on JDK-5072831 that this is feasible but that he's not sure it's a good idea, though he could live with it.
There is an additional use case here as well, of copying the elements into an existing array, like the old Vector.copyInto() method. The array-bearing toArray(T[]) method also supports this use case. In fact, it's better than Vector.copyInto() because the latter cannot be used safely in the presence of concurrent modification, if the collection's size changes. The auto-sizing behavior of toArray(T[]) handles this, and it also handles the case of creating an array of the caller's desired type as described above. Thus, while adding an overload that takes a Class object would certainly work, it doesn't add much over the existing API.
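The copy-into-an-existing-array behavior can be illustrated with a short sketch. CopyIntoDemo and copyInto are hypothetical names; the behavior shown is that of ArrayList.toArray(T[]), which reuses the given array when the list fits and writes a null after the last element if there is room to spare.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CopyIntoDemo {

    static String[] copyInto(List<String> list, String[] target) {
        // Returns target itself if the list fits, otherwise a new String[].
        return list.toArray(target);
    }

    public static void main(String[] args) {
        List<String> list = new ArrayList<>(Arrays.asList("a", "b"));

        String[] big = {"x", "x", "x", "x"};
        String[] result = copyInto(list, big);
        System.out.println(result == big);        // the same array is reused
        System.out.println(Arrays.toString(big)); // [a, b, null, x]

        String[] small = new String[1];
        System.out.println(copyInto(list, small) == small); // false: a new array was allocated
    }
}
```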
ArrayList chose to just use a reference type of Object in its instance variable elementData.
Using Object as its reference type would require explicit cast in getting the correct instance type of its elements. What's the difference if it just used type parameter in declaring said instance field?
By that, I think it could eliminate the need for suppressing unchecked explicit cast.
// From Java API:
public E get(int index) {
rangeCheck(index);
return elementData(index);
}
@SuppressWarnings("unchecked")
E elementData(int index) {
return (E) elementData[index];
}
could have been like this?
private transient E[] elementData;
public E get(int index) {
rangeCheck(index);
return elementData[index];
}
Please share your thoughts. Cheers!
I've already got the answer and it's from reading "Effective Java 2nd Edition" by Joshua Bloch. It says there in Item 26..
Which of the two techniques you choose for dealing with the generic array
creation error is largely a matter of taste. All other things being equal, it is riskier
to suppress an unchecked cast to an array type than to a scalar type, which would
suggest the second solution. But in a more realistic generic class than Stack, you
would probably be reading from the array at many points in the code, so choosing
the second solution would require many casts to E rather than a single cast to E[],
which is why the first solution is used more commonly [Naftalin07, 6.7]
ArrayList uses the second technique for dealing with generic array creation, which is to suppress the unchecked cast to a scalar type.
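For illustration, the second technique (an Object[] field plus an unchecked cast to the scalar type E at each read) might look like the sketch below. ScalarCastStack is a made-up class loosely modelled on the idea in the quoted passage; it is not the book's listing and not the ArrayList source.

```java
import java.util.Arrays;

public class ScalarCastStack<E> {
    private Object[] elements = new Object[4];
    private int size = 0;

    public void push(E e) {
        if (size == elements.length) {
            elements = Arrays.copyOf(elements, 2 * size); // grow when full
        }
        elements[size++] = e;
    }

    public E pop() {
        if (size == 0) {
            throw new IllegalStateException("stack is empty");
        }
        // push accepts only E, so this cast cannot fail at runtime.
        @SuppressWarnings("unchecked")
        E result = (E) elements[--size];
        elements[size] = null; // drop the stale reference
        return result;
    }

    public boolean isEmpty() { return size == 0; }

    public static void main(String[] args) {
        ScalarCastStack<String> stack = new ScalarCastStack<>();
        stack.push("a");
        stack.push("b");
        System.out.println(stack.pop() + stack.pop()); // prints ba
    }
}
```

With several read sites, each one would need its own cast, which is the trade-off the quoted passage describes.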
Due to type erasure, the type information will be lost anyway. The type information in Java generics is there to help the compiler report errors when a developer tries to use an incorrect type.
However, I believe the main reason to use Object there is that ArrayList has to allocate the array as well. Java does not allow you to write new E[startCapacity], and the ArrayList(int initialCapacity) constructor needs to do exactly that kind of allocation.
Say I have an ArrayList that I have cast to an ArrayList of objects. I know that all the objects that were in the ArrayList I cast were of the same type, but not what the type was.
Now, if the ArrayList is not empty, I could take one of the objects in it and use the instanceof operator to learn what the actual type is. But what of the case where the ArrayList is empty? How do I determine what type Object actually is then? Is it possible?
Edit: In retrospect, I suppose it doesn't strictly matter what type an empty ArrayList holds. I can discard the old one and construct a new empty ArrayList of the proper type I was looking for in the first place.
It's a bit messy, so if anyone has alternate suggestions for how to allow a large variety of potential types which may not share a relevant common superclass (e.g. Integer and ArrayList), I'm open to suggestions. The thing is, I have a class that will contain other classes specifically designed to be used interchangeably within it. The contained classes can make assumptions as to the type, as they're the ones defining it. The surrounding class cannot make such assumptions, but must confirm that it is specifying the same type as the contained classes that have been slotted into it.
Thus my question, as the containing class is generic (the generic specifying what sort of type it is handling), but it must ensure the contained classes it was passed are returning and operating on the type that was specified to it (they are designed to be created separately and as such the fact that they match must be confirmed when they are slotted in, not at the time of their creation).
Nope, type erasure makes sure of that.
As Jeffrey mentioned, this is impossible due to magics of type erasure. Your best choice I guess is to add an additional case for the empty list:
if (list.isEmpty()) {
// process empty ...
} else {
Object o = list.get(0);
if (o instanceof Integer) {
List<Integer> integers = (List<Integer>) list;
// process Integer ...
} else if (o instanceof String) {
List<String> strings = (List<String>) list;
// process String ...
} // etc...
}
But beware! This instanceof chain is not generally considered good OO practice. Rather than passing around bare lists and then trying to guess their constituent types consider creating a wrapper class for the lists which may also hold a reference to a Class object. That way you may also be able to refactor your different processing algorithms as overriding a single process() method...
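One possible shape for such a wrapper is sketched below. Everything here is an assumption made for illustration: the class name TypedList, the elementType() accessor and the as(Class) helper are not from any library.

```java
import java.util.ArrayList;
import java.util.List;

public class TypedList<T> {
    private final Class<T> elementType;
    private final List<T> items = new ArrayList<>();

    public TypedList(Class<T> elementType) {
        this.elementType = elementType;
    }

    // Available even when the list is empty.
    public Class<T> elementType() { return elementType; }

    public void add(T item) { items.add(item); }

    public List<T> items() { return items; }

    // Checked narrowing: succeeds only if this list was created for exactly `type`.
    @SuppressWarnings("unchecked")
    public <U> TypedList<U> as(Class<U> type) {
        if (elementType != type) {
            throw new ClassCastException("list holds " + elementType.getName());
        }
        return (TypedList<U>) this;
    }

    public static void main(String[] args) {
        TypedList<?> unknown = new TypedList<>(Integer.class); // empty, but the type is known
        System.out.println(unknown.elementType().getSimpleName());

        TypedList<Integer> ints = unknown.as(Integer.class);
        ints.add(1);
        System.out.println(ints.items());
    }
}
```

Because the Class token travels with the list, the containing class can compare tokens when a component is slotted in, instead of inspecting the first element.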