This has probably been asked before, but I haven't been able to find a matching question.
I would like to understand the underlying reasons why the following block of code does not compile:
public class Box<T> {
private T value;
public Box(T value) {
this.value = value;
}
public T getValue() {
return value;
}
public static void main(String[] args) {
Box<Integer> i = new Box<Integer>(13);
// does not compile
Box<Object> o = i;
}
}
One way to look at it is, let's assume Box defined a void setValue(T val) method.
Box<Object> o = i;
o.setValue("a string");
Integer x = i.getValue(); // ?!
The problem is described in the Java documentation here (it is similar to Vlad's answer):
https://docs.oracle.com/javase/tutorial/extra/generics/subtype.html
Let's test your understanding of generics. Is the following code snippet legal?
List<String> ls = new ArrayList<String>(); // 1
List<Object> lo = ls; // 2
Line 1 is certainly legal. The trickier part of the question is line 2. This boils down to the question: is a List of String a List of Object. Most people instinctively answer, "Sure!"
Well, take a look at the next few lines:
lo.add(new Object()); // 3
String s = ls.get(0); // 4: Attempts to assign an Object to a String!
Here we've aliased ls and lo. Accessing ls, a list of String, through the alias lo, we can insert arbitrary objects into it. As a result ls does not hold just Strings anymore, and when we try and get something out of it, we get a rude surprise.
First, you cannot cast between parameterized types. See this Oracle doc.
Typically, you cannot cast to a parameterized type unless it is
parameterized by unbounded wildcards. For example:
List<Integer> li = new ArrayList<>();
List<Number> ln = (List<Number>) li; // compile-time error
Hence the line Box<Object> o = i; causes compile time error.
Even if you leave the type argument to be inferred when creating the Box (for example with new Box<>(13)), Java's type inference determines the constructed object's type from the constructor argument.
You can assign any reference to an Object reference, but with generics the type arguments must match exactly at compile time, so Box<Integer> is not assignable to Box<Object>. If you change the declaration to Box<? extends Object> o = i; it will compile, because the wildcard accepts Box<Integer>.
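As a rough sketch of the wildcard fix (the setValue method and the class name WildcardBoxDemo are assumptions added for illustration, not part of the question's code), a Box<? extends Object> reference lets you read the value but rejects writes at compile time:

```java
class WildcardBoxDemo {
    static class Box<T> {
        private T value;
        Box(T value) { this.value = value; }
        T getValue() { return value; }
        void setValue(T value) { this.value = value; } // hypothetical setter
    }

    // Reading through the wildcard is safe: whatever the box holds is an Object.
    static Object read(Box<? extends Object> box) {
        // box.setValue("a string"); // would not compile: the type argument is unknown here
        return box.getValue();
    }

    public static void main(String[] args) {
        Box<Integer> i = new Box<>(13);
        Box<? extends Object> o = i; // compiles, unlike Box<Object> o = i;
        System.out.println(read(o));
        Integer x = i.getValue();    // still guaranteed to be an Integer
        System.out.println(x + 1);
    }
}
```

Because nothing can be written through o, the aliasing problem described in the other answers cannot arise.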
I've read the Oracle docs on generics and some reference books, and I still cannot grasp some things about Java type erasure. First of all, why aren't we allowed to say:
public class Gen<T> {
T obj = new T();
public T getObj() {
return obj;
}
public void setObj(T obj) {
this.obj = obj;
}
}
Why doesn't Java allow me to say new T()? I understand that memory for an object of type T is allocated at run time and type erasure is done at compile time, but when the type erasure is done, all of my T's will be replaced with Object, so why is this a big deal?
Also, how is this kind of manipulation with T[] possible:
T[] arr = (T[]) new Object[size];
I just can't wrap my head around these things.
Thanks in advance.
I expected it to create Object obj = new Object() and to give me type safety throughout the code, such as when inserting an element or extracting it with a getter. I don't understand why this is not allowed even with type erasure.
All of my T's will be replaced with Objects, so why is this a big deal?
Because T can be something other than Object.
class Gen<T> {
public T obj;
public Gen() { obj = new T(); /* illegal */ }
public Gen(T t) { obj = t; /* legal */ }
// getters and setters are unnecessary complications for this example
}
Gen<Integer> g = new Gen<Integer>();
Integer i = g.obj; // should be safe, but you would make it unsafe
i = i + 5; // uh oh
Gen<Integer> h = new Gen<Integer>(0);
Integer j = h.obj;
j = j + 5;
Type erasure is meant to remove generics while keeping the program the same, in the sense that if you ran the program without doing erasure you would get the same results. When this program is interpreted without erasure, i is an Integer. If we followed your method of type erasure, it would instead get assigned with an Object. So your way of doing it is wrong. Further, since new T() needs to know what T is to work, but erasure removes all runtime knowledge of T, there is in fact no way to compile new T(); while doing erasure, so it's banned. In contrast, the non-erased and erased versions of the h and j sequence do the same operations, so those are allowed.
The thing with the array is a hack and doesn't actually create a T[].
<T> T[] hack(int n) { return (T[])new Object[n]; }
Integer[] is = hack(5); // runtime error
Unchecked casts like (T) or (T[]) are where Java compromises on the "same-behavior" property of erased programs. A non-erased program would fail in hack because the cast would fail. The erased program can't actually perform the cast, so hack succeeds, and the failure is in the variable assignment. As long as an incorrectly cast object is not passed anywhere where the actual type is known, nothing goes wrong. It becomes your responsibility to maintain type safety. The above function, for example, fails to do that. The following example class does it correctly.
class SmallLIFO<T> {
private T[] buf = (T[])new Object[10]; // take responsibility for maintaining type safety
private int used = 0; // the Object[]-pretending-to-be-a-T[] is never given to the user, who may know what a T is and expose the lie
public boolean push(T t) { // this class's public interface only operates on objects that are the right type
boolean ret = used < 10;
if(ret) buf[used++] = t;
return ret;
}
public T pop() {
return used > 0 ? buf[--used] : null; // we'd either need a cast to (T[]) in buf or a cast to (T) here; no avoiding it
}
}
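If a genuine T[] does have to be handed to callers, one common alternative to the unchecked Object[] cast is to pass a Class<T> token and create the array reflectively with java.lang.reflect.Array.newInstance. The sketch below is illustrative only; the class name GenericArrayDemo and the method name create are made up for this example, and it assumes T is a reference type:

```java
import java.lang.reflect.Array;

class GenericArrayDemo {
    // The hack from above: really an Object[], only pretending to be a T[].
    @SuppressWarnings("unchecked")
    static <T> T[] hack(int n) {
        return (T[]) new Object[n];
    }

    // Reflective alternative: the Class token carries T to run time.
    // Intended for reference types (e.g. Integer.class, not int.class).
    @SuppressWarnings("unchecked")
    static <T> T[] create(Class<T> type, int n) {
        return (T[]) Array.newInstance(type, n);
    }

    public static void main(String[] args) {
        Integer[] ok = create(Integer.class, 5); // a genuine Integer[]
        System.out.println(ok.getClass().getSimpleName());
        try {
            Integer[] bad = hack(5); // the compiler-inserted cast fails here
            System.out.println(bad.length);
        } catch (ClassCastException e) {
            System.out.println("hack(5) did not produce an Integer[]");
        }
    }
}
```

The cast in create is still unchecked as far as the compiler is concerned, but the array's runtime component type now really is T, so the assignment at the call site succeeds.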
You seem to be saying that every new T() should simply be replaced with new Object(), which is a perfectly valid constructor to call. That is true as far as it goes, but is that the intention of "new T()"?
The purpose of new T() is of course not to create a new Object instance, but to create a new instance of T, whatever that may be. And it is exactly because the JVM doesn't know what T is, that it is impossible to create an instance of T.
Suppose that Java works the way you said it would, and changed all new T() to new Object(), and you have:
public class Foo {
private int x = 10;
public Foo() { System.out.println("Hello"); }
public static <T> T magicallyCreateT() {
return new T();
}
public int getX() { return x; }
}
What would a reasonable person expect if I did this?
Foo foo = Foo.magicallyCreateT();
System.out.println(foo.getX());
From a type-checking perspective, that snippet looks completely normal, doesn't it?
They would expect Hello to be printed, and foo.getX() to return 10, wouldn't they? But the truth is, since the Object constructor is called, not Foo's, no Hello is printed, and since magicallyCreateT returns an instance of Object, you wouldn't even be able to call getX on foo! There's no getX method in the Object class! I'd imagine the program would throw a ClassCastException at runtime.
So you see there are lots of problems if you just "create an Object", when you say "I want to create a T", so it is not allowed to do things like new T().
For the case of (T[])new Object[], it is different. You are explicitly saying that you are creating an Object[], and you are casting it to T[]. In the same way, you can also do (T)new Object(). In both cases, you'd get a ClassCastException if something goes wrong later down the line, like the scenario above. In the same way that you can't do new T(), you can't do new T[] either!
Whenever you're casting with a type parameter like this, you're basically telling the compiler that "trust me, I know what I'm doing".
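For the new T() case, a commonly used workaround is to let the caller say how a T is created, for example by passing a java.util.function.Supplier (Java 8 or later). This is only a sketch under that assumption; the class name SupplierGen is invented here and is not the asker's Gen class:

```java
import java.util.function.Supplier;

class SupplierGen<T> {
    private T obj;

    // The caller supplies the "new T()" logic, so nothing depends on erased type information.
    SupplierGen(Supplier<? extends T> factory) {
        this.obj = factory.get();
    }

    T getObj() { return obj; }

    void setObj(T obj) { this.obj = obj; }

    public static void main(String[] args) {
        // StringBuilder::new stands in for the "new T()" the class cannot write itself.
        SupplierGen<StringBuilder> g = new SupplierGen<>(StringBuilder::new);
        g.getObj().append("created via supplier");
        System.out.println(g.getObj());
    }
}
```

Here the real constructor of the chosen type runs, which is exactly what replacing new T() with new Object() could not provide.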
public static void main(String[] args) {
List<Integer> integers = new ArrayList<>();
integers.add(5); //element #0
List list = integers;
list.add("foo"); //element #1
integers.get(1); //no error
System.out.println(integers.get(1)); //no error, prints "foo"
Integer i = integers.get(1); //throws ClassCastException
}
I'm trying to understand the process of casting variables of type, declared as a generic type parameter, and I'm a bit confused.
As you can see in the example above, after we create a raw List that refers to the same object as the List<Integer>, we can add any object to that list (OK, nothing surprising here). What confuses me is that we can extract non-Integer values from List<Integer> integers. Why isn't a ClassCastException thrown at the first or the second call of integers.get(1)?
I assumed that methods returning a type parameter in fact always return Object, and that the returned values are implicitly converted to the type of the assignment target or method parameter at runtime (as there are no generics at runtime). However, the following test convinced me that Integer is always preferred over Object:
public static void main(String[] args) {
List<Integer> integers = new ArrayList<>();
integers.add(5); //element #0
List list = integers;
list.add("foo"); //element #1
print(integers.get(1));
}
private static void print(Object var) {
System.out.println(var);
}
//this method is entered
private static void print(Integer var) {
System.out.println(var);
}
private static void print(String var) {
System.out.println(var);
}
Another interesting fact is that although elements of ArrayList are stored in Object[] array, they are always converted to a type defined in type parameter before being returned in method get():
public E get(int index) {
rangeCheck(index);
return elementData(index);
}
E elementData(int index) {
return (E) elementData[index];
}
So, if anyone could point me to documentation where these questions are explained step by step, I would be very thankful.
The compiler inserts casts when casts are needed. The method System.out.println has a parameter of type Object, so no cast to Integer is required.
In the case of your three print methods, the method with a parameter of type Integer is chosen, so the compiler inserts a cast. The choice of which of the three methods to use occurs at compile time based on a complicated set of rules. These rules use the generic information to see that integers.get(1) has type Integer, and so the Integer version is chosen and the cast is needed. As a result, the code is more or less equivalent to Java 4 code
List integers = new ArrayList();
integers.add(Integer.valueOf(5)); // element #0; no autoboxing in Java 4!
integers.add("foo"); // element #1
print((Integer) integers.get(1)); // Cast inserted by compiler
The cast to (E) in the final part of your question does not actually do anything at runtime, and so will not throw a ClassCastException. It is only needed to make the code compile. You are telling the compiler that, yes, you are sure the Object is really an E and won't cause an exception later on (although you have subverted that by mixing raw and generic types).
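If you would rather have the failure happen at the point of insertion, the standard library offers Collections.checkedList, which wraps a list in a view that verifies each added element against a Class token at run time. A minimal sketch (the class and method names CheckedListDemo and rejectsString are invented for this example):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class CheckedListDemo {
    // Tries to sneak a String in through a raw alias; reports whether the list rejected it.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static boolean rejectsString(List<Integer> integers) {
        List raw = integers;
        try {
            raw.add("foo");
            return false; // a plain ArrayList accepts anything at run time
        } catch (ClassCastException e) {
            return true;  // the checked view refuses the element immediately
        }
    }

    public static void main(String[] args) {
        List<Integer> plain = new ArrayList<>();
        List<Integer> checked = Collections.checkedList(new ArrayList<>(), Integer.class);
        System.out.println(rejectsString(plain));
        System.out.println(rejectsString(checked));
    }
}
```

This does not change where the compiler inserts casts; it only adds an explicit runtime check on the write path.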
How are references to <T> handled by the compiler in the following code, since the method takes no parameters that would allow inference of T? Are any restrictions being placed on what type of object can be placed into the list? Is a cast even taking place on the line where I add the String to the list? My first thought is that without anything to infer T from, T becomes an Object type. Thanks in advance.
public class App {
private <T> void parameterizedMethod()
{
List<T> list = new ArrayList<>();
for(int i = 0; i < 10; i++)
{
list.add((T)new String()); //is a cast actually occurring here?
}
}
public App()
{
parameterizedMethod();
}
public static void main(String[] args) {
new App();
}
}
This is initially determined by JLS §18.1.3:
When inference begins, a bound set is typically generated from a list of type parameter declarations P1, ..., Pp and associated inference variables α1, ..., αp. Such a bound set is constructed as follows. For each l (1 ≤ l ≤ p):
If Pl has no TypeBound, the bound αl <: Object appears in the set.
Otherwise, for each type T delimited by & in the TypeBound, the bound αl <: T[P1:=α1, ..., Pp:=αp] appears in the set; [...].
At the end of inference, the bound set gets "resolved" to the inferred type. Without any additional context, the bound set will only consist of the initial bounds based on the declaration of the type parameter.
A bound with a form like αl <: Object means αl (an inference variable) is Object or a subtype of Object. This bound is resolved to Object.
So in your case, yes, Object is inferred.
If we declared a type bound:
private <T extends SomeType> void parameterizedMethod()
then SomeType will be inferred.
No cast actually happens in this case (erasure). That's why it's "unchecked". A cast only happens when the object is exposed due to e.g.:
<T> T parameterizedMethodWithAResult()
{
return (T) new String();
}
// the cast happens out here
Integer i = parameterizedMethodWithAResult();
// parameterizedMethodWithAResult returns Object actually,
// and we are implicitly doing this:
Integer i = (Integer) parameterizedMethodWithAResult();
Are any restrictions being placed on what type of object can be placed into the list?
Semantically (compile-time), yes. And note that the restriction is determined outside the method. Inside the method, we don't know what that restriction actually is. So we should not be putting String in a List<T>. We don't know what T is.
Practically (run-time), no. It's just a List and there's no checked cast. parameterizedMethod won't cause an exception...but that only holds for this kind of isolated example. This kind of code may very well lead to issues.
Inside the method body, Java provides us no way to get any information about the substitution for T, so how can we do anything useful with T?
Sometimes, T is not really important to the method body; it's just more convenient for the caller
public static <T> List<T> emptyList(){...}
List<String> emptyStringList = emptyList();
But if T is important to method body, there must be an out-of-band protocol, not enforceable by the compiler, that both the caller and the callee must obey. For example
class Conf
<T> T get(String key)
//
<conf>
<param name="size" type="int" ...
//
String name = conf.get("name");
Integer size = conf.get("size");
The API uses <T> here just so that the caller doesn't need to do an explicit cast. It is the caller's responsibility to ensure that the correct T is supplied.
In your example, the callee assumes that T is a supertype of String; the caller must uphold that assumption. It would be nice if such a constraint could be expressed to the compiler as
<T super String> void parameterizedMethod()
{
List<T> list
...
list.add( new String() ); // obviously correct; no cast is needed
}
//
this.<Integer>parameterizedMethod(); // compile error
Unfortunately, Java does not support <T super Foo> ... :) So you need to document the constraint in Javadoc instead
/** T must be a supertype of String! **/
<T> void parameterizedMethod()
I have an actual API example just like that.
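When the list can be passed in as a parameter instead of being created inside the method, the "T must be a supertype of String" constraint can be expressed with a lower-bounded wildcard, which Java does support on wildcards (just not on type parameters). A sketch under that assumption, with invented names SuperWildcardDemo and fill:

```java
import java.util.ArrayList;
import java.util.List;

class SuperWildcardDemo {
    // The lower-bounded wildcard states "element type is String or a supertype of String".
    static void fill(List<? super String> list, int n) {
        for (int i = 0; i < n; i++) {
            list.add("item" + i); // no cast needed
        }
    }

    public static void main(String[] args) {
        List<Object> objects = new ArrayList<>();
        List<CharSequence> sequences = new ArrayList<>();
        fill(objects, 2);
        fill(sequences, 3);
        // fill(new ArrayList<Integer>(), 1); // would not compile: Integer is not a supertype of String
        System.out.println(objects.size() + " " + sequences.size());
    }
}
```

This does not help when the method itself must choose or return T, which is the situation the Javadoc workaround above addresses.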
List<T> list = new ArrayList<>();
for(int i = 0; i < 10; i++)
{
list.add((T)new String()); //is a cast actually occurring here?
}
No, no cast is actually occurring there. If you did anything with list that forced it to be a List<T> -- such as returning it -- then that may cause ClassCastExceptions at the point where the compiler inserted the real cast.
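A small sketch of that effect, using hypothetical names (UncheckedCastDemo, makeList, readAsIntegerFails) that are not from the question: the unchecked cast inside the generic method never fails, and the exception only appears where the caller reads an element as a concrete type.

```java
import java.util.ArrayList;
import java.util.List;

class UncheckedCastDemo {
    @SuppressWarnings("unchecked")
    static <T> List<T> makeList() {
        List<T> list = new ArrayList<>();
        list.add((T) "not a T"); // unchecked: erased, so nothing is verified here
        return list;
    }

    static boolean readAsIntegerFails() {
        List<Integer> ints = makeList(); // still no exception
        try {
            Integer first = ints.get(0); // compiler-inserted cast to Integer
            return first == null;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(readAsIntegerFails());
    }
}
```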
The crux of the question is, why does this cause a compile-time error?
List<Collection> raws = new ArrayList<Collection>();
List<Collection<?>> c = raws; // error
Background
I understand why generics aren't covariant in general. If we could assign List<Integer> to List<Number>, we'd expose ourselves to ClassCastExceptions:
List<Integer> ints = new ArrayList<Integer>();
List<Number> nums = ints; // compile-time error
nums.add(Double.valueOf(1.2));
Integer i = ints.get(0); // ClassCastException
We get a compile-time error at line 2 to save us from a run-time error at line 4. That makes sense.
List<C> to List<C<?>>
But how about this:
List<Collection> rawLists = new ArrayList<Collection>();
List<Collection<?>> wildLists = rawLists; // compile-time error
// scenario 1: add to raw and get from wild
rawLists.add(new ArrayList<Integer>());
Collection<?> c1 = wildLists.get(0);
Object o1 = c1.iterator().next();
// scenario 2: add to wild and get from raw
wildLists.add(new ArrayList<String>());
Collection c2 = rawLists.get(0);
Object o2 = c2.iterator().next();
In both scenarios, I ultimately get only Object elements without casting, so I can't get a "mysterious" ClassCastException.
The section in the JLS that corresponds to this is §4.10.2, so I understand why the compiler is giving me the error; what I don't get is why the spec was written this way, and (to ward off speculative/opinion-based answers), whether it actually provides me any compile-time safety.
Motivating example
In case you're wondering, here's (a stripped-down version of) the use case:
public <T> Collection<T> readJsons(List<String> jsons, Class<T> clazz) {
List<T> list = new ArrayList<T>();
for (String json : jsons) {
T elem = jsonMapper.readAs(json, clazz);
list.add(elem);
}
return list;
}
// call site
List<GenericFoo<?>> foos = readJsons(GenericFoo.class); // error
The error is because GenericFoo.class has type Class<GenericFoo>, not Class<GenericFoo<?>> (§15.8.2). I'm not sure why that is, though I suspect it's a related reason; but regardless, that wouldn't be a problem if Class<GenericFoo> could be cast, either implicitly or explicitly, to Class<GenericFoo<?>>.
First of all, a raw type and a wildcard type are quite different. For one, a raw type completely erases all generic information.
So we have List<x> and List<y> where x is not y. This is certainly not a subtype relationship.
You can, nevertheless, ask for the cast to be allowed. But please read JLS 5.5.1, and tell me you want to add something more to it :) Browse the whole page, actually; it's a great wall of text just for casting.
And remember this is just the first ripple in the whole effect. What about List<List<x>> and List<List<y>>, etc.
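In practice, a workaround many codebases use is a double cast through an unbounded wildcard type, which the compiler accepts with an unchecked warning. The sketch below is an illustration, not something the JLS recommends; GenericFoo here is a stand-in class and the helper names are invented:

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

class DoubleCastDemo {
    static class GenericFoo<T> { } // stand-in for the question's GenericFoo

    // Class<GenericFoo> -> Class<?> -> Class<GenericFoo<?>>
    @SuppressWarnings("unchecked")
    static Class<GenericFoo<?>> wildcardClass() {
        return (Class<GenericFoo<?>>) (Class<?>) GenericFoo.class;
    }

    // List<Collection> -> List<?> -> List<Collection<?>>
    @SuppressWarnings({"unchecked", "rawtypes"})
    static List<Collection<?>> asWildcardList(List<Collection> raws) {
        return (List<Collection<?>>) (List<?>) raws;
    }

    @SuppressWarnings("rawtypes")
    public static void main(String[] args) {
        System.out.println(wildcardClass().getSimpleName());
        List<Collection> raws = new ArrayList<>();
        raws.add(new ArrayList<Integer>());
        List<Collection<?>> wild = asWildcardList(raws);
        System.out.println(wild.size());
    }
}
```

Both casts are unchecked, so the responsibility for keeping the aliased references consistent moves to the programmer.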
I have some code that I would write
GenericClass<Foo> foos = new GenericClass<>();
While a colleague would write it
GenericClass<Foo> foos = new GenericClass();
arguing that in this case the diamond operator adds nothing.
I'm aware that constructors that actually use arguments related to the generic type can cause a compile time error with <> instead of a run time error in the raw case. And that the compile time error is much better. (As outlined in this question)
I'm also quite aware that the compiler (and IDE) can generate warnings for the assignment of raw types to generics.
The question is instead for the case where there are no arguments, or no arguments related to the generic type. In that case, is there any way the constructed object GenericClass<Foo> foos can differ depending on which constructor was used, or does Java's type erasure guarantee they are identical?
For instantiations of two ArrayLists, one with the diamond operator at the end and one without...
List<Integer> fooList = new ArrayList<>();
List<Integer> barList = new ArrayList();
...the bytecode generated is identical.
LOCALVARIABLE fooList Ljava/util/List; L1 L4 1
// signature Ljava/util/List<Ljava/lang/Integer;>;
// declaration: java.util.List<java.lang.Integer>
LOCALVARIABLE barList Ljava/util/List; L2 L4 2
// signature Ljava/util/List<Ljava/lang/Integer;>;
// declaration: java.util.List<java.lang.Integer>
So there wouldn't be any difference between the two as far as the bytecode is concerned.
However, the compiler will generate an unchecked warning if you use the second approach. Hence, there's really no value in the second approach; all you're doing is generating a false positive unchecked warning with the compiler that adds to the noise of the project.
I've managed to demonstrate a scenario in which doing this is actively harmful. The formal name for this is heap pollution. This is not something that you want to occur in your code base, and any time you see this sort of invocation, it should be removed.
Consider this class which extends some functionality of ArrayList.
class Echo<T extends Number> extends ArrayList<T> {
public Echo() {
}
public Echo(Class<T> clazz) {
try {
this.add(clazz.newInstance());
} catch (InstantiationException | IllegalAccessException e) {
System.out.println("YOU WON'T SEE ME THROWN");
System.exit(-127);
}
}
}
Seems innocuous enough; you can add an instance of whatever your type bound is.
However, if we're playing around with raw types...there can be some unfortunate side effects to doing so.
final Echo<? super Number> oops = new Echo(ArrayList.class);
oops.add(2);
oops.add(3);
System.out.println(oops);
This prints [[], 2, 3] instead of throwing any kind of exception. If we wanted to do an operation on all Integers in this list, we'd run into a ClassCastException, thanks to that delightful ArrayList.class invocation.
Of course, all of that could be avoided if the diamond operator were added, which would guarantee that we wouldn't have such a scenario on our hands.
Now, because we've introduced a raw type into the mix, Java can't perform type checking per JLS 4.12.2:
For example, the code:
List l = new ArrayList<Number>();
List<String> ls = l; // Unchecked warning
gives rise to a compile-time unchecked warning, because it is not
possible to ascertain, either at compile time (within the limits of
the compile-time type checking rules) or at run time, whether the
variable l does indeed refer to a List<String>.
The situation above is very similar; if we take a look at the first example we used, all we're doing is not adding an extra variable into the matter. The heap pollution occurs all the same.
List rawFooList = new ArrayList();
List<Integer> fooList = rawFooList;
So, while the byte code is identical (likely due to erasure), the fact remains that different or aberrant behavior can arise from a declaration like this.
Don't use raw types, mmkay?
The JLS is actually pretty clear on this point. http://docs.oracle.com/javase/specs/jls/se8/html/jls-8.html#jls-8.1.2
First it says "A generic class declaration defines a set of parameterized types (§4.5), one for each possible parameterization of the type parameter section by type arguments. All of these parameterized types share the same class at run time."
Then it gives us the code block
Vector<String> x = new Vector<String>();
Vector<Integer> y = new Vector<Integer>();
boolean b = x.getClass() == y.getClass();
and says that it "will result in the variable b holding the value true."
The test for instance equality (==) says that both x and y share exactly the same Class object.
Now do it with the diamond operator and without.
Vector<Integer> z = new Vector<>();
Vector<Integer> w = new Vector();
boolean c = z.getClass() == w.getClass();
boolean d = y.getClass() == z.getClass();
Again, c is true, and so is d.
So if, as I understand, you're asking whether there is some difference at runtime or in the bytecode between using the diamond and not, the answer is simple. There is no difference.
Whether it's better to use the diamond operator in this case is a matter of style and opinion.
P.S. Don't shoot the messenger. I would always use the diamond operator in this case. But that's just because I like what the compiler does for me in general w/r/t generics, and I don't want to fall into any bad habits.
P.P.S. Don't forget that this may be a temporary phenomenon. http://docs.oracle.com/javase/specs/jls/se8/html/jls-4.html#jls-4.8 warns us that "The use of raw types in code written after the introduction of generics into the Java programming language is strongly discouraged. It is possible that future versions of the Java programming language will disallow the use of raw types."
You may have a problem with the default constructor if your type arguments are bounded. For example, here's a sloppy and incomplete implementation of a list of numbers that tracks the total sum:
public class NumberList<T extends Number> extends AbstractList<T> {
List<T> list = new ArrayList<>();
double sum = 0;
@Override
public void add(int index, T element) {
list.add(index, element);
sum += element.doubleValue();
}
@Override
public T remove(int index) {
T removed = list.remove(index);
sum -= removed.doubleValue();
return removed;
}
@Override
public T get(int index) {
return list.get(index);
}
@Override
public int size() {
return list.size();
}
public double getSum() {
return sum;
}
}
Omitting the type arguments when calling the default constructor may lead to a ClassCastException at run time:
List<String> list = new NumberList(); // compiles with warning and runs normally
list.add("test"); // throws CCE
However adding the diamond operator will produce a compile-time error:
List<String> list = new NumberList<>(); // error: incompatible types
list.add("test");
In your specific example: Yes, they are identical.
Generally: Beware, they may not be!
The reason is that a different overloaded constructor or method may be invoked when a raw type is used; the benefit of the diamond is not only better type safety and avoiding a runtime ClassCastException.
Overloaded constructors
public class Main {
public static void main(String[] args) {
Integer anInteger = Integer.valueOf(1);
GenericClass<Integer> foosRaw = new GenericClass(anInteger);
GenericClass<Integer> foosDiamond = new GenericClass<>(anInteger);
}
private static class GenericClass<T> {
public GenericClass(Number number) {
System.out.println("Number");
}
public GenericClass(T t) {
System.out.println("Parameter");
}
}
}
The version with the diamond invokes a different constructor; the output of the above program is:
Number
Parameter
Overloaded methods
public class Main {
public static void main(String[] args) {
method(new GenericClass());
method(new GenericClass<>());
}
private static void method(GenericClass<Integer> genericClass) {
System.out.println("First method");
}
private static void method(Object object) {
System.out.println("Second method");
}
private static class GenericClass<T> { }
}
The version with the diamond invokes a different method; the output:
First method
Second method
This is not a complete answer - but does provide a few more details.
While you cannot distinguish calls like
GenericClass<T> x1 = new GenericClass<>();
GenericClass<T> x2 = new GenericClass<T>();
GenericClass<T> x3 = new GenericClass();
There are tools that will allow you to distinguish between
GenericClass<T> x4 = new GenericClass<T>() { };
GenericClass<T> x5 = new GenericClass() { };
Note: While it looks like we're missing new GenericClass<>() { }, it is not currently valid Java.
The key is that type information about the generic parameters is stored for anonymous classes. In particular, we can get to the generic parameters via
Type superclass = x.getClass().getGenericSuperclass();
Type tType = (superclass instanceof ParameterizedType) ?
((ParameterizedType) superclass).getActualTypeArguments()[0] :
null;
For x1, x2, and x3 tType will be an instance of TypeVariableImpl (the same instance in all three cases, which is not surprising as getClass() returns the same object for all three cases).
For x4 tType will be T.class
For x5 getGenericSuperclass() does not return an instance of ParameterizedType, but instead a Class (in fact GenericClass.class).
We could then use this to determine whether our object was constructed via (x1, x2 or x3) or x4 or x5.
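A runnable sketch of the two anonymous-class cases, using a concrete String type argument instead of T so the result is easy to inspect. The GenericClass here is a stand-in that directly extends Object, and the names AnonymousTypeInfoDemo and firstTypeArgument are made up for the example:

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;

class AnonymousTypeInfoDemo {
    static class GenericClass<T> { } // stand-in for the GenericClass discussed above

    // Same logic as the snippet above, wrapped in a helper.
    static Type firstTypeArgument(Object x) {
        Type superclass = x.getClass().getGenericSuperclass();
        return (superclass instanceof ParameterizedType)
                ? ((ParameterizedType) superclass).getActualTypeArguments()[0]
                : null;
    }

    @SuppressWarnings("rawtypes")
    public static void main(String[] args) {
        Object parameterized = new GenericClass<String>() { }; // like x4, with String for T
        Object raw = new GenericClass() { };                    // like x5
        System.out.println(firstTypeArgument(parameterized));   // the recorded type argument
        System.out.println(firstTypeArgument(raw));             // no ParameterizedType, so null
    }
}
```

The type argument survives only because the anonymous subclass records its generic superclass in the class file; it is not recovered from the instance itself.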