varargs heap pollution : what's the big deal? - java

I was reading about varargs heap pollution and I don't really get how varargs or non-reifiable types would be responsible for problems that do not already exist without genericity. Indeed, I can very easily replace
public static void faultyMethod(List<String>... l) {
Object[] objectArray = l; // Valid
objectArray[0] = Arrays.asList(42);
String s = l[0].get(0); // ClassCastException thrown here
}
with
public static void faultyMethod(String... l) {
Object[] objectArray = l; // Valid
objectArray[0] = 42; // ArrayStoreException thrown here
String s = l[0];
}
The second one simply uses the covariance of arrays, which is really the problem here. (Even if List<String> was reifiable, I guess it would still be a subclass of Object and I would still be able to assign any object to the array.) Of course I can see there's a little difference between the two, but this code is faulty whether it uses generics or not.
What do they mean by heap pollution (it makes me think about memory usage, but the only problem they talk about is potential type unsafety), and how is it different from any type violation using arrays' covariance?

You're right that the common (and fundamental) problem is with the covariance of arrays. But of those two examples you gave, the first is more dangerous, because it can modify your data structures and put them into a state that will break much later on.
Consider if your first example hadn't triggered the ClassCastException:
public static List<String> faultyMethod(List<String>... l) {
Object[] objectArray = l; // Valid
objectArray[0] = Arrays.asList(42); // Also valid
return l[0]; // Declared as List<String>, but actually holds an Integer
}
And here's how somebody uses it:
List<String> firstList = Arrays.asList("hello", "world");
List<String> secondList = Arrays.asList("hello", "dolly");
List<String> theList = faultyMethod(firstList, secondList);
return theList;
So now we have a List<String> that actually contains an Integer, and it's floating around, safely. At some point later — possibly much later, and if it's serialized, possibly much later and in a different JVM — someone finally executes String s = theList.get(0). This failure is so far distant from what caused it that it could be very difficult to track down.
Note that the ClassCastException's stack trace doesn't tell us where the error really happened; it just tells us who triggered it. In other words, it doesn't give us much information about how to fix the bug; and that's what makes it a bigger deal than an ArrayStoreException.
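A minimal sketch of that delayed failure. The class DelayedFailureDemo and the methods pollute and readLater are hypothetical names, and pollute is assumed to hand the overwritten array slot back to the caller so that the polluted list actually escapes. The store succeeds silently; the exception only appears where the element is finally read as a String.

```java
import java.util.Arrays;
import java.util.List;

class DelayedFailureDemo {
    // Hypothetical variant of faultyMethod that returns the overwritten slot.
    // javac warns about possible heap pollution here, but the code compiles.
    static List<String> pollute(List<String>... lists) {
        Object[] objectArray = lists;       // legal: arrays are covariant
        objectArray[0] = Arrays.asList(42); // no ArrayStoreException: the array only knows "List"
        return lists[0];                    // statically List<String>, actually holds an Integer
    }

    // Reads the first element as a String and reports which exception, if any, was raised.
    static String readLater(List<String> list) {
        try {
            String s = list.get(0); // the compiler-inserted cast to String fails here
            return "no exception: " + s;
        } catch (ClassCastException e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        List<String> polluted = pollute(Arrays.asList("hello"), Arrays.asList("world"));
        System.out.println(polluted.size());     // untyped use of the list works fine
        System.out.println(readLater(polluted)); // the typed read is where it blows up
    }
}
```

Nothing in pollute itself fails, which is why the stack trace of the eventual exception points at the reader rather than at the code that caused the problem.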

The difference between an array and a List is that the array checks its references, e.g.
Object[] array = new String[1];
array[0] = new Integer(1); // fails at runtime.
however
List list = new ArrayList<String>();
list.add(new Integer(1)); // doesn't fail.
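The two snippets above can be put side by side in a small runnable sketch. StoreCheckDemo and its method names are made up for illustration; the behaviour shown is only the standard array store check versus an unchecked add through a raw List reference.

```java
import java.util.ArrayList;
import java.util.List;

class StoreCheckDemo {
    // True if storing an Integer into a String[] through an Object[] reference is rejected.
    static boolean arrayRejectsInteger() {
        Object[] array = new String[1];
        try {
            array[0] = Integer.valueOf(1); // the JVM checks the array's runtime component type
            return false;
        } catch (ArrayStoreException e) {
            return true;
        }
    }

    // True if an Integer can be added to an ArrayList<String> through a raw reference.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static boolean rawListAcceptsInteger() {
        List list = new ArrayList<String>();
        list.add(Integer.valueOf(1)); // nothing at runtime knows this was meant for Strings
        return list.get(0) instanceof Integer;
    }

    public static void main(String[] args) {
        System.out.println("array rejects Integer: " + arrayRejectsInteger());
        System.out.println("raw list accepts Integer: " + rawListAcceptsInteger());
    }
}
```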

From the linked document, I believe what Oracle means by "heap pollution" is to have data values that are technically allowed by the JVM specification, but are disallowed by the rules for generics in the Java programming language.
To give you an example, let's say we define a simple List container like this:
class List<E> {
Object[] values;
int len = 0;
List() { values = new Object[10]; }
void add(E obj) { values[len++] = obj; }
E get(int i) { return (E)values[i]; }
}
This is an example of code that is generic and safe:
List<String> lst = new List<String>();
lst.add("abc");
This is an example of code that uses raw types (bypassing generics) but still respects type safety at a semantic level, because the value we added has a compatible type:
String x = (String)lst.values[0];
The twist - now here is code that works with raw types and does something bad, causing "heap pollution":
lst.values[lst.len++] = new Integer("3");
The code above works because the array is of type Object[], which can store an Integer. Now when we try to retrieve the value, it'll cause a ClassCastException - at retrieval time (which is way after the corruption occurred), instead of at add time:
String y = lst.get(1); // ClassCastException for Integer(3) -> String
Note that the ClassCastException happens in our current stack frame, not even in List.get(), because the cast in List.get() is a no-op at run time due to Java's type erasure system.
Basically, we inserted an Integer into a List<String> by bypassing generics. Then when we tried to get() an element, the list object failed to uphold its promise that it must return a String (or null).
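To see where the exception is actually raised, here is a hedged, self-contained variant of the container above. It is renamed to SimpleList to avoid clashing with java.util.List, and CastSiteDemo and failingMethod are hypothetical names. The sketch inspects the top stack frame of the ClassCastException.

```java
class CastSiteDemo {
    // Same idea as the List<E> class above, under a different (hypothetical) name.
    static class SimpleList<E> {
        Object[] values = new Object[10];
        int len = 0;
        void add(E obj) { values[len++] = obj; }
        @SuppressWarnings("unchecked")
        E get(int i) { return (E) values[i]; } // erased: no runtime check happens here
    }

    // Returns the name of the method on top of the stack when the cast fails.
    static String failingMethod() {
        SimpleList<String> lst = new SimpleList<>();
        lst.values[lst.len++] = Integer.valueOf(3); // bypasses add(E), pollutes the list
        try {
            String y = lst.get(0); // the real cast is inserted here, in the caller
            return "no exception: " + y;
        } catch (ClassCastException e) {
            return e.getStackTrace()[0].getMethodName();
        }
    }

    public static void main(String[] args) {
        System.out.println("exception raised in: " + failingMethod());
    }
}
```

The reported frame is the calling method rather than get, which matches the point that the cast inside get is a no-op after erasure.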

Prior to generics, there was absolutely no possibility that an object's runtime type is inconsistent with its static type. This is obviously a very desirable property.
We can cast an object to an incorrect runtime type, but the cast would fail immediately, at the exact site of casting; the error stops there.
Object obj = "string";
((Integer)obj).intValue();
// we are not gonna get an Integer object
With the introduction of generics, along with type erasure (the root of all evils), now it is possible that a method returns String at compile time, yet returns Integer at runtime. This is messed up. And we should do everything we can to stop it from the source. It is why the compiler is so vocal about every sight of unchecked casts.
The worst thing about heap pollution is that the runtime behavior is undefined! Different compiler/runtime may execute the program in different ways. See case1 and case2.

They are different because ClassCastException and ArrayStoreException are different.
Generics compile-time type checking rules should ensure that it's impossible to get a ClassCastException in a place where you didn't put an explicit cast, unless your code (or some code you called or called you) did something unsafe at compile-time, in which case you should (or whatever code did the unsafe thing should) receive a compile-time warning about it.
ArrayStoreException, on the other hand, is a normal part of how arrays work in Java, and pre-dates Generics. It is not possible for compile-time type checking to prevent ArrayStoreException because of the way the type system for arrays is designed in Java.

Related

Raw use of unparameterized class vs. unchecked assignment vs. generic array creation

For various reasons I'm stuck with this bit of Java code that uses Scala types:
scala.Tuple2<scala.Enumeration.Value, Integer>[] tokens = new scala.Tuple2[] {
new scala.Tuple2(scala.math.BigDecimal.RoundingMode.UP(), 0)
};
IntelliJ throws this warning on line 1:
Unchecked assignment: 'scala.Tuple2[]' to 'scala.Tuple2<scala.Enumeration.Value,java.lang.Integer>[]'
And it throws two warnings on line 2:
Raw use of parameterized class 'scala.Tuple2'
Unchecked call to 'Tuple2(T1, T2)' as a member of raw type 'scala.Tuple2'
I can get rid of the warnings on line 2 by simply adding <> after new scala.Tuple2 and before (:
scala.Tuple2<scala.Enumeration.Value, Integer>[] tokens = new scala.Tuple2[] {
new scala.Tuple2<>(scala.math.BigDecimal.RoundingMode.UP(), 0)
};
But the warning on line 1 remains. Adding <> after new scala.Tuple2 and before [] doesn't help. I also tried this:
scala.Tuple2<scala.Enumeration.Value, Integer>[] tokens = new scala.Tuple2<scala.Enumeration.Value, Integer>[] {
new scala.Tuple2<>(scala.math.BigDecimal.RoundingMode.UP(), 0)
};
This causes an error: Generic array creation. I don't understand what this means or why it wouldn't work.
Generics are entirely a compile time thing. The stuff in the <> either doesn't end up in class files at all, or if it does, it is, as far as the JVM is concerned, a comment. It has no idea what any of it means. The only reason <> survives is purely for javac's needs: It needs to know that e.g. the signature of the List interface is boolean add(E), even though as far as the JVM is concerned, it's just boolean add(Object).
As a consequence, given an instance of some list, e.g.:
// snippet 1:
List<?> something = foo();
List<String> foo() {
return new ArrayList<String>();
}
// snippet 2:
List<?> something = foo();
List<Integer> foo() {
return new ArrayList<String>();
}
These are bytecode wise identical, at least as far as the JVM is concerned. There's this one weird comment thing the JVM doesn't know about that is ever so slightly different, is all. The runtime structure of the object created here is identical and hence it is simply not possible to call anything on the something variable to determine if it is a list of strings or a list of integers.
But, array types are a runtime thing. You can figure it out:
// snippet 1:
Object[] something = foo();
String[] foo() {
return new String[0];
}
// snippet 2:
Object[] something = foo();
Integer[] foo() {
return new Integer[0];
}
Here, you can tell the difference: something.getClass().getComponentType() will be String.class in snippet 1, and Integer.class in snippet 2.
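A small sketch of that contrast, with hypothetical names (ReificationDemo, listsShareClass, componentTypeOf): two lists with different type arguments share one runtime class, while arrays carry their component type.

```java
import java.util.ArrayList;
import java.util.List;

class ReificationDemo {
    // Lists created with different type arguments have the very same runtime class.
    static boolean listsShareClass() {
        List<String> strings = new ArrayList<>();
        List<Integer> integers = new ArrayList<>();
        return strings.getClass() == integers.getClass();
    }

    // Arrays remember their component type at runtime.
    static Class<?> componentTypeOf(Object[] array) {
        return array.getClass().getComponentType();
    }

    public static void main(String[] args) {
        System.out.println(listsShareClass());                // true
        System.out.println(componentTypeOf(new String[0]));   // class java.lang.String
        System.out.println(componentTypeOf(new Integer[0]));  // class java.lang.Integer
    }
}
```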
Generics are 100% a compile time thing. If javac (or scalac, or whatever compiler you are using) doesn't stop you, then the runtime never will. You can trivially 'break' the heap if you insist on doing this:
List<String> strings = new ArrayList<String>();
List /* raw */ raw = strings; // warning, but, compiles
raw.add(Integer.valueOf(5));
String a = strings.get(0); // uhoh!
The above compiles fine. The only reason it crashes at runtime is because a ClassCastException occurs, but you can avoid that with more shenanigans if you must.
In contrast to arrays where all this is a runtime thing:
Object[] a = new String[10];
a[0] = Integer.valueOf(5);
The above compiles. At runtime you get an ArrayStoreException.
Thus, generics and arrays are like fire and water. Mutually exclusive; at opposite ends of a spectrum. Do not play together, at all.
Now we get to the construct new T[]. This doesn't even compile. Because javac doesn't know what T is going to be, but arrays know the component type, and it is not possible to derive T at runtime, so this creation isn't possible.
In other words, mixing arrays and generics is going to fail, in the sense that generics are entirely a compile time affair, and tossing arrays into the mix means the compiler can no longer do the job of ensuring you don't get 'heap corruption' (the notion that there's an integer in a list that a variable of type List<String> is pointing at).
You simply write this:
List<String>[] arr = new List[10];
And yes, the compiler will warn you that it has no way of ensuring that arr will in fact only contain this; you get a 'this code uses unchecked/unsafe operations' warning. But, key word, warning. You can ignore them with @SuppressWarnings.
There's no way to get rid of this otherwise: Mixing arrays and generics usually ends up there (in warnings that you have to suppress).
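One way to keep the suppression contained is to create the array in a single small helper, so the rest of the code stays warning-free. This is only a sketch; ListArrayDemo and newBuckets are hypothetical names, not part of any library.

```java
import java.util.ArrayList;
import java.util.List;

class ListArrayDemo {
    // Creates the raw array once and confines the unchecked assignment to this method.
    @SuppressWarnings("unchecked")
    static List<String>[] newBuckets(int n) {
        List<String>[] buckets = new List[n]; // unchecked: List[] assigned to List<String>[]
        for (int i = 0; i < n; i++) {
            buckets[i] = new ArrayList<>();
        }
        return buckets;
    }

    public static void main(String[] args) {
        List<String>[] buckets = newBuckets(3);
        buckets[0].add("hello");
        System.out.println(buckets.length + " buckets, first holds " + buckets[0]);
    }
}
```

The same shape should apply to the Tuple2 array in the question: create it raw, assign it to the parameterized array type, and suppress the one warning that results.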

Is ArrayList<String> any special from ArrayList containing other types in Java?

The code below ran on my HotSpot JVM and I got "a" as output.
ArrayList<Method> list = new ArrayList<>();
Method method = list.getClass().getDeclaredMethod("add", Object.class);
method.invoke(list, "a");
System.out.println(list.get(0));
But a ClassCastException occurred after running the code below:
ArrayList<String> list1 = new ArrayList<>();
Method method1 = list1.getClass().getDeclaredMethod("add", Object.class);
method1.invoke(list1, 1); // or replaced with method1.invoke(list1, new int[]{1});
System.out.println(list1.get(0));
What's wrong with the second code?
Is ArrayList<String> any special?
There's nothing special about ArrayList<String>. It's simply how generics and method invocation expressions interact in the language. What you are observing is as a result of there being overloads of PrintStream.println for both Object and String parameters.
The TL;DR is that the former case invokes PrintStream.println(Object); the latter case invokes PrintStream.println(String), for which the compiler inserts a cast because the String is coming from a list expected to contain Strings only.
Generics are simply compiler-inserted casts. When the compiler sees that a method returns an E (e.g. ArrayList<E>::get(int)), it thinks that the result of that method can be safely cast to E (because it is an E, a subclass of E, or null).
In order to use that result as an E, though, it has to cast the result to E, because the result of get is an Object, because of type erasure.
So, when you write things like:
List<String> list = ...
String s = list.get(0);
list.get(0).toString();
System.out.println("" + list.get(0));
the compiler will insert casts, and so the code which is actually executed looks like:
List<String> list = ...
String s = (String) list.get(0);
((String) list.get(0)).toString();
System.out.println("" + (String) list.get(0));
which is harder to read; but you don't need the explicit casts, because the compiler knows to insert them for you, based on the fact that list is a List<String>.
When you invoke a method, like System.out.println, the compiler goes through quite a complicated process to determine which method to invoke.
In this case, it looks at the PrintStream class, and finds all of the overloads of the method println; then it narrows these down to the ones which could be invoked for the given arguments; then it picks which of the potential matches is most specific.
Again, "most specific" is rather complicated, but it is summarised as a method is more specific if any valid parameters can also be passed to a less specific method, but not vice versa.
class Foo {
static void foo(String str) {}
static void foo(Object str) {}
}
So, foo(String) is more specific than foo(Object), because all Strings are Object, but not all Objects are Strings.
So, when you're invoking foo(something) and something is expected to be a String, foo(String) is invoked, even though foo(Object) could also be invoked; if it's any other kind of object, foo(Object) is invoked, because not-Strings can't be passed to foo(String).
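A runnable version of the Foo example, with the methods changed to return a label so the chosen overload is visible. OverloadDemo is a hypothetical name. Note that the choice is made at compile time from the static type of the argument expression.

```java
class OverloadDemo {
    static String foo(String str) { return "foo(String)"; }
    static String foo(Object obj) { return "foo(Object)"; }

    public static void main(String[] args) {
        Object hidden = "text"; // a String at runtime, but its static type is Object

        System.out.println(foo("text")); // foo(String): the most specific applicable overload
        System.out.println(foo(hidden)); // foo(Object): only the static type matters
        System.out.println(foo(42));     // foo(Object): an Integer is not a String
    }
}
```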
Enough theory, let's look at this specific example:
ArrayList<Method> list = new ArrayList<>();
// ...
System.out.println(list.get(0));
The overload of PrintStream.println which is most specific for this invocation is PrintStream.println(Object). The raw list.get(0) call returns an Object, so no cast needs to be inserted by the compiler to make it compatible.
Hence, there is no problem if list.get(0) returns something that isn't a Method.
ArrayList<String> list = new ArrayList<>();
// ...
System.out.println(list.get(0));
The overload of PrintStream.println which is most specific for this invocation is PrintStream.println(String). The raw list.get(0) call returns an Object, so a cast to String needs to be inserted by the compiler to make it compatible.
In fact, the code executed is effectively:
System.out.println((String) list.get(0));
Hence, there is a problem if list.get(0) returns something that isn't a String: you will get a ClassCastException, as you found.
The important thing to point out here is that this happens because of what the compiler expects the types to be, because of the type information it has at its disposal. These are reasonable expectations that are safe if you haven't done type-unsafe things (like adding to the list reflectively); but the protections offered by the compiler are somewhat trivial to work around.
For this reason, you should pay careful attention to ensure that what you are doing really is type-safe still, even when you are working behind the safety guard of the compiler.
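The effect of the target type can be sketched without println at all. In the example below (OverloadCastDemo and its method names are hypothetical), the same polluted ArrayList<String> is read once into an Object variable and once into a String variable; only the second read needs a compiler-inserted cast.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;

class OverloadCastDemo {
    // Adds an Integer to an ArrayList<String> reflectively, bypassing the compiler's checks.
    static ArrayList<String> pollutedList() {
        ArrayList<String> list = new ArrayList<>();
        try {
            Method add = ArrayList.class.getDeclaredMethod("add", Object.class);
            add.invoke(list, 1);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
        return list;
    }

    // Target type Object: no cast to String is needed, so nothing fails.
    static String readAsObject(ArrayList<String> list) {
        Object o = list.get(0);
        return String.valueOf(o);
    }

    // Target type String: the inserted cast throws ClassCastException.
    static boolean readAsStringFails(ArrayList<String> list) {
        try {
            String s = list.get(0);
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        ArrayList<String> list = pollutedList();
        System.out.println(readAsObject(list));      // 1
        System.out.println(readAsStringFails(list)); // true
    }
}
```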

Why can't I create an array of a type parameter in Java?

Well, I have read a lot of answers to this question, but I have a more specific one. Take the following snippet of code as an example.
public class GenericArray<E>{
E[] s= new E[5];
}
After type erasure, it becomes
public class GenericArray{
Object[] s= new Object[5];
}
This snippet of code seems to work well. Why does it cause a compile-time error?
In addition, I have learned from other answers that the following code works for the same purpose.
public class GenericArray<E>{
E[] s= (E[])new Object[5];
}
I've read some comments saying that the piece of code above is unsafe, but why is it unsafe? Could anyone provide me with a specific example where the above piece of code causes an error?
In addition, the following code is wrong as well. But why? It seems to work well after erasure, too.
public class GenericArray<E>{
E s= new E();
}
Array declarations are required to have a reifiable type, and generics are not reifiable.
From the documentation: the only type you can place on an array is one that is reifiable, that is:
It refers to a non-generic class or interface type declaration.
It is a parameterized type in which all type arguments are unbounded wildcards (§4.5.1).
It is a raw type (§4.8).
It is a primitive type (§4.2).
It is an array type (§10.1) whose element type is reifiable.
It is a nested type where, for each type T separated by a ".", T itself is reifiable.
This means that the only legal declaration for a "generic" array would be something like List<?>[] elements = new ArrayList[10];. But that's definitely not a generic array, it's an array of List of unknown type.
The main reason that Java complains about you performing the cast to E[] is that it's an unchecked cast. That is, the JVM cannot verify it at run time: E is erased, so the cast from Object[] to E[] checks nothing. However, this is the usual way to create an array that is generic, and it is generally considered safe as long as the array never escapes the class, if you have to use arrays.
In general, the advice to avoid a scenario like that is to use generic collections where and when you can.
This snippet of code seems to work well. Why does it cause a compile-time error?
First, because it would violate type safety (i.e. it is unsafe - see below), and in general code that can be statically determined to do this is not allowed to compile.
Remember that, due to type erasure, the type E is not known at run-time. The expression new E[10] could at best create an array of the erased type, in this case Object, rendering your original statement:
E[] s= new E[5];
Equivalent to:
E[] s= new Object[5];
Which is certainly not legal. For instance:
String[] s = new Object[10];
... is not compilable, for basically the same reason.
You argued that after erasure, the statement would be legal, implying that you think this means that the original statement should also be considered legal. However this is not right, as can be shown with another simple example:
ArrayList<String> l = new ArrayList<Object>();
The erasure of the above would be ArrayList l = new ArrayList();, which is legal, while the original is clearly not.
Coming at it from a more philosophical angle, type erasure is not supposed to change the semantics of the code, but it would do so in this case - the array created would be an array of Object rather than an array of E (whatever E might be). Storing a non-E object reference in it would then be possible, whereas if the array were really an E[], it should instead generate an ArrayStoreException.
why is it unsafe?
(Bearing in mind we are now talking about the case where E[] s= new E[5]; has been replaced with E[] s = (E[]) new Object[5];)
It is unsafe (which in this instance is short for type unsafe) because it creates at run-time a situation in which a variable (s) holds a reference to an object instance which is not a sub-type of the variable's declared type (Object[] is not a subtype of E[], unless E==Object).
Could anyone provide me with a specific example where the above piece of code causes an error?
The essential problem is that it is possible to put non-E objects into an array that you create by performing a cast (as in (E[]) new Object[5]). For example, say there is a method foo which takes an Object[] parameter, defined as:
void foo(Object [] oa) {
oa[0] = new Object();
}
Then take the following code:
String [] sa = new String[5];
foo(sa);
String s = sa[0]; // If this line was reached, s would
// definitely refer to a String (though
// with the given definition of foo, this
// line won't be reached...)
The array definitely contains String objects even after the call to foo. On the other hand:
E[] ea = (E[]) new Object[5];
foo(ea);
E e = ea[0]; // e may now refer to a non-E object!
The foo method might have inserted a non-E object into the array. So even though the third line looks safe, the first (unsafe) line has violated the constraints that guarantee that safety.
A full example:
class Foo<E>
{
void foo(Object [] oa) {
oa[0] = new Object();
}
public E get() {
E[] ea = (E[]) new Object[5];
foo(ea);
return ea[0]; // returns the wrong type
}
}
class Other
{
public void callMe() {
Foo<String> f = new Foo<>();
String s = f.get(); // ClassCastException on *this* line
}
}
The code generates a ClassCastException when run, and it is not safe. Code without unsafe operations such as casts, on the other hand, cannot produce this type of error.
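Another way the cast can bite, without any helper like foo, is simply letting the array escape to code that knows the concrete type. The following is a sketch with hypothetical names (LeakyArray, leakFails), not taken from the question.

```java
class LeakyArray<E> {
    @SuppressWarnings("unchecked")
    E[] s = (E[]) new Object[5]; // at run time this is an Object[], whatever E is

    // True if reading the field as a String[] raises a ClassCastException.
    static boolean leakFails() {
        LeakyArray<String> g = new LeakyArray<>();
        try {
            String[] leaked = g.s; // the compiler inserts a cast to String[] here
            return leaked == null;
        } catch (ClassCastException e) {
            return true;           // Object[] is not a String[]
        }
    }

    public static void main(String[] args) {
        System.out.println("leaking the array fails: " + leakFails());
    }
}
```

As long as the array stays private and is only read element by element through methods typed with E, this particular failure cannot be triggered from outside.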
In addition, the following code is wrong as well. But why? It seems to work well after erasure, too.
The code in question:
public class GenericArray<E>{
E s= new E();
}
After erasure, this would be:
Object s = new Object();
While this line itself would be fine, to treat the lines as being the same would introduce the semantic change and safety issue that I have described above, which is why the compiler won't accept it. As an example of why it could cause a problem:
public <E> E getAnE() {
return new E();
}
... because after type erasure, 'new E()' would become 'new Object()' and returning a non-E object from the method clearly violates its type constraints (it is supposed to return an E) and is therefore unsafe. If the above method were to compile, and you called it with:
String s = this.<String>getAnE();
... then you would get a type error at runtime, since you would be attempting to assign an Object to a String variable.
Further notes / clarification:
Unsafe (which is short for "type unsafe") means that it could potentially cause a run-time type error in code that would otherwise be sound. (It actually means more than this, but this definition is enough for purposes of this answer).
It's possible to cause a ClassCastException or ArrayStoreException or other exceptions with "safe" code, but these exceptions only occur at well-defined points. That is, you can normally only get a ClassCastException when you perform a cast, an operation that inherently carries this risk. Similarly, you can only get an ArrayStoreException when you store a value into an array.
The compiler doesn't verify that such an error will actually occur before it complains that an operation is unsafe. It just knows that certain operations are potentially able to cause problems, and warns about these cases.
That you can't create a new instance of (or an array of) a type parameter is both a language feature designed to preserve safety and probably also a reflection of the implementation restrictions posed by the use of type erasure. That is, new E() might be expected to produce an instance of the actual type parameter, when in fact it could only produce an instance of the erased type. To allow it to compile would be unsafe and potentially confusing. In general you can use E in place of an actual type with no ill effect, but that is not the case for instantiation.
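Since new E() is not available, one commonly used alternative is to have the caller pass in something that knows how to build an E. The sketch below assumes a java.util.function.Supplier is acceptable for that purpose; InstanceFactoryDemo is a hypothetical name, and getAnE mirrors the method name used above.

```java
import java.util.function.Supplier;

class InstanceFactoryDemo {
    // The caller supplies the constructor, so no runtime knowledge of E is needed here.
    static <E> E getAnE(Supplier<E> constructor) {
        return constructor.get();
    }

    public static void main(String[] args) {
        String s = getAnE(String::new);                // really a String, not an Object
        StringBuilder sb = getAnE(StringBuilder::new); // same method, different E
        System.out.println(s.isEmpty() + " " + sb.length());
    }
}
```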
A compiler can use a variable of type Object to do anything a variable of type Cat can do. The compiler may have to add a typecast, but such typecast will either throw an exception or yield a reference to an instance of Cat. Because of this, the generated code for a SomeCollection<T> doesn't have to actually use any variables of type T; the compiler can replace T with Object and cast things like function return values to T where necessary.
A compiler cannot use an Object[], however, to do everything a Cat[] can do. If a SomeCollection<T> had an array of type T[], it would not be able to create an instance of that array type without knowing the type of T. It could create an instance of Object[] and store references to instances of T in it without knowing the type of T, but any attempt to cast such an array to T[] would be guaranteed to fail unless T happened to be Object.
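If the collection is told the type of T explicitly, it can create a genuine T[]. A minimal sketch, assuming the caller can pass a Class object for a reference type; TypedArrayHolder is a hypothetical class, while java.lang.reflect.Array.newInstance is the standard reflective array factory.

```java
import java.lang.reflect.Array;

class TypedArrayHolder<E> {
    private final E[] items;

    @SuppressWarnings("unchecked")
    TypedArrayHolder(Class<E> type, int size) {
        // The array created here really has component type E (e.g. String[]),
        // so the unchecked cast does not lie about the runtime type.
        items = (E[]) Array.newInstance(type, size);
    }

    E[] items() {
        return items;
    }

    public static void main(String[] args) {
        TypedArrayHolder<String> holder = new TypedArrayHolder<>(String.class, 5);
        String[] array = holder.items(); // fine: the runtime type is String[]
        System.out.println(array.getClass().getComponentType());
    }
}
```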
Let's say generic arrays are allowed in Java. Now, take a look at following code,
Object[] myStrs = new Object[2];
myStrs[0] = 100; // This is fine
myStrs[1] = "hi"; // Ambiguity! Hence Error.
If users were allowed to create generic arrays, they could do what I've shown in the code above, and it would confuse the compiler. It defeats the purpose of arrays (arrays can hold only elements of the same or a similar type, remember?). You can always use an array of a class/struct if you want a heterogeneous array.

shouldn't this code produce a ClassCastException

The following code compiles and runs successfully without any exception
import java.util.ArrayList;
class SuperSample {}
class Sample extends SuperSample {
@SuppressWarnings("unchecked")
public static void main(String[] args) {
try {
ArrayList<Sample> sList = new ArrayList<Sample>();
Object o = sList;
ArrayList<SuperSample> ssList = (ArrayList<SuperSample>)o;
ssList.add(new SuperSample());
} catch (Exception e) {
e.printStackTrace();
}
}
}
shouldn't the line ArrayList<SuperSample> ssList = (ArrayList<SuperSample>)o; produce a ClassCastException ?
While the following code produces a compile-time error to prevent heap pollution, shouldn't the code mentioned above have a similar safeguard at runtime?
ArrayList<Sample> sList = new ArrayList<Sample>();
ArrayList<SuperSample> ssList = (ArrayList<SuperSample>) sList;
EDIT:
If Type Erasure is the reason behind this, shouldn't there be additional mechanisms to prevent an invalid object from being added to the List? for instance
String[] iArray = new String[5];
Object[] iObject = iArray;
iObject[0]= 5.5; // throws ArrayStoreException
then why,
ssList.add(new SuperSample());
is not made to throw any Exception?
No, it should not: at run time both lists have the same type, ArrayList. This is called erasure. Generic parameters are not part of the compiled class; they are all erased during compilation. From the JVM's perspective your code is equal to:
public static void main(String[] args) {
try {
ArrayList sList = new ArrayList();
Object o = sList;
ArrayList ssList = (ArrayList)o;
ssList.add(new SuperSample());
} catch (Exception e) {
e.printStackTrace();
}
}
Basically generics only simplify development, by producing compile time errors and warnings, but they don't affect execution at all.
EDIT:
Well, the base concept behind this is the reifiable type. I'd strongly recommend reading this manual:
A reifiable type is a type whose type information is fully available
at runtime. This includes primitives, non-generic types, raw types,
and invocations of unbound wildcards.
Non-reifiable types are types where information has been removed at
compile-time by type erasure
In short: arrays are reifiable and generic collections are not. So when you store something in an array, the type is checked by the JVM, because the array's type is present at runtime. An array represents just a piece of memory, while a collection is an ordinary class, which might have any sort of implementation. For example, it can store data in a database or on disk under the hood. If you'd like to go deeper, I suggest reading the book Java Generics and Collections.
In your code example,
class SuperSample { }
class Sample extends SuperSample { }
...
ArrayList<Sample> sList = new ArrayList<Sample>();
Object o = sList;
ArrayList<SuperSample> ssList = (ArrayList<SuperSample>)o;
Shouldn't the last line produce a ClassCastException?
No. That exception is thrown by the JVM when it detects incompatible types being cast at runtime. As others have noted, this is because of erasure of generic types. That is, generic types are known only to the compiler. At the JVM level, the variables are all of type ArrayList (the generics having been erased) so there is no ClassCastException at runtime.
As an aside, instead of assigning to an intermediate local variable of type Object, a more concise way to do this assignment is to cast through raw:
ArrayList<SuperSample> ssList = (ArrayList)sList;
where a "raw" type is the erased version of a generic type.
Shouldn't there be additional mechanisms to prevent an invalid object from being added to the List?
Yes, there are. The first mechanism is compile-time checking. In your own answer you found the right location in the Java Language Specification where it describes heap pollution which is the term for an invalid object occurring in the list. The money quote from that section, way down at the bottom, is
If no operation that requires a compile-time unchecked warning to be issued takes place, and no unsafe aliasing occurs of array variables with non-reifiable element types, then heap pollution cannot occur.
So the mechanism you're looking for is in the compiler, and the compiler notifies you of this via compilation warnings. However, you've disabled this mechanism by using the @SuppressWarnings annotation. If you were to remove this annotation, you'd get a compiler warning at the offending line. If you absolutely want to prevent heap pollution, don't use @SuppressWarnings, and add the options -Xlint:unchecked -Werror to your javac command line.
The second mechanism is runtime checking, which requires use of one of the checked wrappers. Replace the initialization of sList with the following:
List<Sample> sList = Collections.checkedList(new ArrayList<Sample>(), Sample.class);
This will cause a ClassCastException to be thrown at the point where a SuperSample is added to the list.
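A self-contained sketch of the checked wrapper in action. CheckedListDemo and addIsRejected are hypothetical names, and the variables are declared as List rather than ArrayList because Collections.checkedList returns a wrapper, not an ArrayList.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class CheckedListDemo {
    static class SuperSample {}
    static class Sample extends SuperSample {}

    // True if the checked wrapper rejects the SuperSample at the moment it is added.
    @SuppressWarnings("unchecked")
    static boolean addIsRejected() {
        List<Sample> sList = Collections.checkedList(new ArrayList<Sample>(), Sample.class);
        Object o = sList;
        List<SuperSample> ssList = (List<SuperSample>) o; // still no exception: erasure
        try {
            ssList.add(new SuperSample()); // the wrapper checks against Sample.class
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("add rejected: " + addIsRejected());
    }
}
```

The wrapper performs its check on every insertion, so it is usually treated as a debugging aid rather than something to leave on everywhere.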
The key to answering your question is type erasure in Java.
You get a warning at compile time in your first case and not in the second because the indirection through an Object prevents the compiler from raising it (I'm guessing that this warning is raised when casting one parameterized type to another, which is not what happens in your second case; if anyone can confirm that, I would be glad to hear about it).
And your code runs because, in the end, sList, ssList and o are all ArrayList.
I think that this can't produce a ClassCastException because of backward compatibility in Java.
Generic information is not included in the bytecode (the compiler gets rid of it during compilation).
Imagine that your project uses some old legacy code (an old library written in Java 1.4) and you pass a generic List to a method in this legacy code.
You can do this.
Before generics, legacy code was allowed to put anything at all (except primitives) into a collection.
So this legacy code can't get a ClassCastException even if it tries to put a String into a List<Integer>.
From the legacy code's perspective it is just a List.
So this strange behaviour is a consequence of type erasure, which exists to allow backward compatibility in Java.
EDIT:
You get an ArrayStoreException for arrays because at runtime the JVM knows the type of the array; you don't get any exception for collections because, due to type erasure and this backward compatibility constraint, the JVM doesn't know the element type of the collection at runtime.
You can read about this topic in "SCJP Sun® Certified Programmer for Java™ 6 Study Guide" book in chapter 7 "Generics and Collections"
From the JLS (4.12.2)
It is possible that a variable of a parameterized type refers to an object that is not
of that parameterized type. This situation is known as heap pollution. This situation
can only occur if the program performed some operation that would give rise
to an unchecked warning at compile-time.
For example, the code:
List l = new ArrayList<Number>();
List<String> ls = l; // unchecked warning
gives rise to an unchecked warning, because it is not possible to ascertain, either at
compile-time (within the limits of the compile-time type checking rules) or at run-time,
whether the variable l does indeed refer to a List<String>.
If the code above is executed, heap pollution arises, as the variable ls, declared to be a
List<String>, refers to a value that is not in fact a List<String>.
The problem cannot be identified at run-time because type variables are not reified,
and thus instances do not carry any information at run-time regarding the actual type
parameters used to create them.

Java use generics to set the primitive type of an array later

I am trying to write some simple numerical code in Java where one can choose between a float and double later. A simplified version of my class looks like the example below:
public class UniformGrid<T> {
public T[] data;
public UniformGrid(int arrayDim) {
data = new T[arrayDim];
}
}
This didn't work; I got a generic array creation error when trying to compile. After Googling and reading some SO answers, I learned about java.lang.reflect.Array and tried to use
data = (T[]) Array.newInstance(T.class, arrayDim);
Which also didn't work, since T is (probably) a primitive type. My Java knowledge is quite rusty (especially when it comes to generics) and I would like to know why the new operator cannot be used with a generic array type. Also of course I am interested in how one would solve this problem in Java.
You cannot create a generic array in Java because of type erasure. The easiest way to get around this would be to use a List<T>. But if you must use an array, you can use an Object[] for your array and ensure that only T objects are put into it. (This is the strategy ArrayList takes.)
Ex:
private Object[] data = new Object[10];
private int size = 0;
public void add(T obj) {
data[size++] = obj;
}
public T get(int i){
return (T) data[i];
}
Of course you'll get an unchecked warning from your compiler, but you can suppress that.
Generics can't be used when creating an array because the type T is not known at runtime. This is a consequence of type erasure.
The solution is simple: use List<T> data.
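As a rough sketch of the List<T> suggestion, assuming Float or Double is used as the type argument (type parameters cannot be primitives). ListBackedGrid is a made-up name, not the asker's class.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class ListBackedGrid<T> {
    private final List<T> data;

    ListBackedGrid(int size) {
        // Pre-fill with nulls so every index in [0, size) is addressable, like an array.
        data = new ArrayList<>(Collections.<T>nCopies(size, null));
    }

    void set(int pos, T value) {
        data.set(pos, value);
    }

    T get(int pos) {
        return data.get(pos);
    }

    int size() {
        return data.size();
    }

    public static void main(String[] args) {
        ListBackedGrid<Double> grid = new ListBackedGrid<>(3);
        grid.set(1, 2.5);
        System.out.println(grid.get(1));
    }
}
```

No unchecked cast is needed here, at the cost of boxing every element.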
Sorry, you'll have to take another approach:
Type parameters must be reference types, they can't be primitive types.
Only reference types support polymorphism, and only for instance methods. Primitive types do not. float and double don't have a common supertype; you can not write an expression like a + b and choose at runtime whether to perform float addition or double addition. And since Java (unlike C++ or C#, which emit new code for each type parameter) uses the same bytecode for all instances of a generic type, you'd need polymorphism to use a different operator implementation.
If you really need this, I'd look into code generation, perhaps as part of an automated build. (A simple search & replace on the source ought to be able to turn a library operating on double into a library operating on float.)
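To illustrate the point about needing polymorphism to select the operator implementation, here is one possible sketch. The Arithmetic interface and everything else in ArithmeticDemo are hypothetical names, and the approach works on boxed Float and Double values, so it is slower than primitive code.

```java
import java.util.Arrays;
import java.util.List;

class ArithmeticDemo {

    // Hypothetical strategy interface: supplies the operators that T itself cannot.
    interface Arithmetic<T> {
        T zero();
        T add(T a, T b);
    }

    static final Arithmetic<Float> FLOAT = new Arithmetic<Float>() {
        public Float zero() { return 0f; }
        public Float add(Float a, Float b) { return a + b; } // float addition
    };

    static final Arithmetic<Double> DOUBLE = new Arithmetic<Double>() {
        public Double zero() { return 0.0; }
        public Double add(Double a, Double b) { return a + b; } // double addition
    };

    // One compiled method body; which addition runs is decided by a virtual call.
    static <T> T sum(List<T> values, Arithmetic<T> ops) {
        T total = ops.zero();
        for (T v : values) {
            total = ops.add(total, v);
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sum(Arrays.asList(1.5f, 2.5f), FLOAT));
        System.out.println(sum(Arrays.asList(1.5, 2.5), DOUBLE));
    }
}
```

The generic sum method never writes a + b on a T directly; it delegates to an instance method, which is the only place Java can dispatch on the type.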
This is possible, as long as you use Float and Double instead of float and double, as primitive types are not allowed in Java Generics. Of course, this will probably be quite slow. And, you won't be able to (safely) allow direct public access to the array. So this answer is not very useful, but it might be theoretically interesting. Anyway, how to construct the array ...
data = (T[]) new Object[arrayDim];
This will give you a warning, but it's not directly anything to worry about. It works in this particular form - it's inside a generic constructor and data is the only reference to this newly constructed object. See this page about this.
You will not be able to access this array object publicly in the way you might like. You'll need to set up methods in UniformGrid<T> to get and set objects. This way, the compiler will ensure type-safety and the runtime won't give you any problems.
private T[] data;
public void set(int pos, T t) {
data[pos] = t;
}
public T get(int pos) {
return data[pos];
}
In this case, the signature of set will (at compile-time) enforce that the correct type is passed. The underlying array is of type Object[] but that's OK as it can hold any reference type, and after erasure T is treated as Object at runtime anyway (just as a List<T> behaves like a list of Object).
The interesting bit is the getter. The compiler 'knows' that the type of data is T[] and hence the getter will compile cleanly and promises to return a T. So as long as you keep the data private and only access it through get and set then everything will be fine.
Some example code is on ideone.
public static void main(String[] args) {
    UniformGrid<A> uf = new UniformGrid<A>(1);
    // uf.set(0, new Object()); // compile error
    uf.set(0, new A());
    uf.set(0, new B()); // accepted, assuming B is a subclass of A
    Object o1 = uf.get(0);
    A o2 = uf.get(0);
    // B o3 = uf.get(0); // compile error
    System.out.println(o1);
    System.out.println(o2);
    System.out.println("OK so far");
    // A via_array1 = uf.data[0]; // Exception in thread "main" java.lang.ClassCastException: [Ljava.lang.Object; cannot be cast to [LA;
}
As you would desire, there are compilation errors with uf.set(0, new Object()) and B o3 = uf.get(0);
But you shouldn't make the data member public. If you did, you could write and compile A via_array1 = uf.data[0];. That line looks like it should be OK, but you get a runtime exception: [Ljava.lang.Object; cannot be cast to [LA;.
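The failure described above can be reproduced with a self-contained sketch. ExposedArrayDemo and its nested Grid class are hypothetical stand-ins for UniformGrid, with String in place of A; the field is public only to demonstrate the problem.

```java
class ExposedArrayDemo {

    static class Grid<T> {
        public T[] data; // public only to demonstrate the problem

        @SuppressWarnings("unchecked")
        Grid(int size) {
            data = (T[]) new Object[size]; // really an Object[], whatever T is
        }

        void set(int pos, T t) {
            data[pos] = t;
        }

        T get(int pos) {
            return data[pos];
        }
    }

    // Going through get and set is safe.
    static boolean accessorsWork() {
        Grid<String> grid = new Grid<>(1);
        grid.set(0, "hello");
        return "hello".equals(grid.get(0));
    }

    // Touching the array from outside the generic class is not.
    static boolean directArrayAccessFails() {
        Grid<String> grid = new Grid<>(1);
        grid.set(0, "hello");
        try {
            String s = grid.data[0]; // grid.data is cast to String[] here, but it is an Object[]
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("accessors work: " + accessorsWork());
        System.out.println("direct array access fails: " + directArrayAccessFails());
    }
}
```

Inside Grid the compiler treats data as an erased array and inserts no cast; outside, the static type String[] forces a cast that the Object[] cannot satisfy.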
In short, the get and set methods provide a safe interface. But if you go to this much trouble to use an array, you should just use an ArrayList<T> instead. Moral of the story: in any language (Java or C++), with generics or without generics, just say no to arrays. :-)
Item 25 in Effective Java, 2nd Edition talks about this problem:
Arrays are covariant and reified; generics are invariant and erased.
As a consequence, arrays provide run-time type safety but not compile-time type safety and vice versa for generics. Generally speaking arrays and generics don't mix well.
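The two halves of that quote can be observed directly. This is only an illustrative sketch; ArraysVsGenericsDemo and its method names are made up.

```java
import java.util.ArrayList;
import java.util.List;

class ArraysVsGenericsDemo {

    // Arrays are reified: the element type is available at run time.
    static Class<?> runtimeElementType(Object[] array) {
        return array.getClass().getComponentType();
    }

    // Generics are erased: both lists share a single runtime class.
    static boolean sameRuntimeClass() {
        List<String> strings = new ArrayList<>();
        List<Integer> numbers = new ArrayList<>();
        return strings.getClass() == numbers.getClass();
    }

    public static void main(String[] args) {
        Object[] objects = new String[1]; // covariant: this compiles
        // List<Object> list = new ArrayList<String>(); // invariant: this does not compile
        System.out.println(runtimeElementType(objects));
        System.out.println(sameRuntimeClass());
    }
}
```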
