Invariance, covariance and contravariance in Java


A Java lesson on generics is leading me to the concept of variance. This is giving me headaches, as I cannot find a very simple demonstration of what it is.
I have read several similar questions on Stack Overflow, but I found them too difficult to understand for a Java learner. Actually the problem is that the explanation of generics requires variance to be understood, and the concept of variance is demonstrated relying heavily on an understanding of generics.
I had some hope reading this, but in the end I shared C. R.'s feeling:
The title reminds me of the days learning general relativity. – C.R.
Dec 22 '13 at 7:34
Four theory questions are very confusing to me, and I cannot find good and simple explanations. Here they are, with my current partial understanding (I fear experts will have great fun reading this).
Your help to correct and clarify is welcome (remember this is for beginners, not for experts).
Is there something wrong with this understanding?
What is invariance / covariance / contravariance related to in the context of programming? My best guess is that:
This is something encountered in object-oriented programming.
This has to do with looking at method arguments and result types in a class and an ancestor.
This is used in the context of method overriding and overloading.
This is used to establish a connection between the type of a method argument, or the method return type, and the inheritance of the classes themselves, e.g. if class D is a descendant of class A, what can we say about the types of arguments and the method return type?
How does variance relate to Java methods? My best guess is that, given two classes A and D, with A being an ancestor of D, and an overridden/overloaded method f(arg):
If the relation between the argument types in the two methods IS THE SAME as the relation between the two classes, the argument type in the method is said to be COVARIANT with the class type. In other words: the inheritance between arg types in A and D is covariant with the inheritance of classes A and D.
If the relation between the arguments REVERSES the relation between classes, the arg type is said to be CONTRAVARIANT to the class type. In other words: the inheritance between arg types in A and D is contravariant with the inheritance of classes A and D.
Why is variance understanding so important for Java programmers? My guess is that:
Java language creators have implemented rules for variance in the language, and this has implications on what a programmer can do.
A rule states that the return type of an overriding/overloading method must be contravariant to the inheritance.
Another rule states that the type of an argument of an overriding/overloading method must be covariant to the inheritance.
The Java compiler checks the variance rules are valid, and provides errors or warnings accordingly. Deciphering the messages is easier with variance knowledge.
What is the difference between overriding and overloading? Best guess:
A method overrides another method when argument and return types are both invariant. All other cases are understood by the compiler as overloading.

This is not specific to OO, but has to do with the properties of certain types.
For example, with the function type
A -> B // functional notation
public B meth(A arg) // how this looks in Java
we have the following:
Let C be a subtype of A, and D be a subtype of B. Then the following is valid:
B b = meth(new C()); // B >= B, C < A
Object o = meth(new C()); // Object > B, C < A
but the following are invalid:
D d = meth(new A()); // because D < B
B b = meth(new Object()); // because Object > A
hence, to check whether a call of meth is valid, we must check
The expected return type is a supertype of the declared return type.
The actual argument type is a subtype of the declared argument type.
This is all well known and intuitive. By convention we say that the return type of a function is covariant, and the argument type of a method is contravariant.
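To connect this with the overriding/overloading part of the question, here is a minimal Java sketch. The class names (Animal, Cat, Shelter, CatShelter) are made up for illustration. It relies on two Java rules: an overriding method may narrow the return type (covariant return), but it must keep the same parameter types; changing a parameter type declares an overload instead.

```java
// Hypothetical classes, named only for this illustration.
class Animal {}
class Cat extends Animal {}

class Shelter {
    // Declared: takes an Animal, returns an Animal.
    Animal adopt(Animal companion) {
        return new Animal();
    }
}

class CatShelter extends Shelter {
    // Override: same parameter type, narrower (covariant) return type.
    @Override
    Cat adopt(Animal companion) {
        return new Cat();
    }

    // Different parameter type, so this is an overload, not an override.
    Cat adopt(Cat companion) {
        return new Cat();
    }
}

class VarianceDemo {
    static String describe() {
        Shelter shelter = new CatShelter();
        // The call is resolved against Shelter.adopt(Animal);
        // at runtime the override in CatShelter is the one that executes.
        Animal a = shelter.adopt(new Cat());
        return a.getClass().getSimpleName();
    }

    public static void main(String[] args) {
        System.out.println(describe()); // prints "Cat"
    }
}
```

Adding @Override to the adopt(Cat) overload would make the compiler reject it, which is a handy way to check which of the two you have actually written.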
With parameterized types, like List, the type parameter is invariant in languages like Java, where we have mutability. We can't say that a list of C's is a list of A's, because, if it were so, we could store an A in a list of C's, much to the surprise of the caller, who assumes only C's in the list. However, in languages where values are immutable, like Haskell, this is not a problem. Because the data we pass to functions cannot be mutated, a list of C actually is a list of A if C is a subtype of A. (Note that Haskell has no real subtyping, but has instead the related notion of "more/less polymorphic" types.)
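The "store an A in a list of C's" problem can be seen in Java itself, because Java arrays (unlike List) are covariant. The sketch below is only an illustration with Integer/Number standing in for C/A; the class and method names are made up. It shows the bad store being caught at runtime for arrays, the same assignment being rejected at compile time for List, and a read-only wildcard view that is safe.

```java
import java.util.ArrayList;
import java.util.List;

class InvarianceDemo {
    // Arrays are covariant in Java, so the bad store is only caught at runtime.
    static boolean arrayStoreFails() {
        Integer[] ints = new Integer[1];
        Number[] nums = ints;      // allowed: an Integer[] is accepted as a Number[]
        try {
            nums[0] = 3.14;        // tries to store a Double into an Integer[]
            return false;
        } catch (ArrayStoreException e) {
            return true;
        }
    }

    // Reading is safe, so a List<Integer> can be passed as List<? extends Number>.
    static double sum(List<? extends Number> numbers) {
        double total = 0;
        for (Number n : numbers) {
            total += n.doubleValue();
        }
        return total;
    }

    public static void main(String[] args) {
        List<Integer> ints = new ArrayList<>();
        ints.add(1);
        ints.add(2);
        // List<Number> nums = ints;   // does not compile: List is invariant
        System.out.println(arrayStoreFails());
        System.out.println(sum(ints));
    }
}
```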

Related

c# upcasting is always allowed

In Java,
"Up-casting is casting to a supertype, while downcasting is casting to
a subtype. Supercasting is always allowed, but subcasting involves a
type check and can throw a ClassCastException."
(What is the difference between up-casting and down-casting with respect to class variable)
Is upcasting also always allowed in C#?

Yes, it is allowed, since a subclass is a particularization of the ancestor class.
Example:
Let us consider the case when we have a class called Bird, another called Sparrow and a third one Eagle. Sparrow and Eagle are inherited from Bird. Sparrows differ from Eagles greatly, but they are Birds. So, if you want to have a Collection of Birds for some reason, then you can have Eagle and Sparrow objects in that Collection at the same time, since they are still Birds, if only specific Birds.
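As a rough sketch of the Bird example, written in Java rather than C# (the question quotes the Java rule, and C# behaves the same way for reference types, throwing InvalidCastException where Java throws ClassCastException). The CastingDemo class and its method names are invented for the illustration.

```java
import java.util.ArrayList;
import java.util.List;

class Bird {}
class Sparrow extends Bird {}
class Eagle extends Bird {}

class CastingDemo {
    // Upcasting happens implicitly when subclasses are added to a List<Bird>.
    static List<Bird> flock() {
        List<Bird> birds = new ArrayList<>();
        birds.add(new Sparrow());
        birds.add(new Eagle());
        return birds;
    }

    // Downcasting needs an explicit cast and is checked at runtime.
    static boolean downcastFails(Bird bird) {
        try {
            Eagle eagle = (Eagle) bird;
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(flock().size());
        System.out.println(downcastFails(new Sparrow()));
    }
}
```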

yes, up-casting is allowed :-)
Casting and Type Conversions (C# Programming Guide)

OOP principles state that you can always upcast; however, unlike Java, which has a very restricted number of primitive types, the .NET implementation allows declaring struct types, and some of them are weird counter-examples involving boxing:
TypedReference reference = new TypedReference();
// Compile time error here! Even if Object is the base type for all types
Object o = (Object)reference;
Technically, TypedReference is an Object:
Object
ValueType
TypedReference
you can easily check it:
Console.Write(typeof(TypedReference).BaseType.BaseType == typeof(Object)
? "TypedReference derived from Object via ValueType"
: "Very strange");
but in order to be represented as an Object instance (via a cast) it would have to be boxed, which can't be done.

What type is <?> when instantiating lists?

I have seen in multiple different places people who instantiate a list or ArrayList like:
List<?> l = new ArrayList<>();
What type is ?? Does this mean that it can hold any types in it? If so, why would this be used instead of just an ArrayList?

Does this mean that it can hold any types in it?
No. It means that your l variable could be referring to a list parameterized with any type. So it's actually a restriction: you will not be allowed to add any object to l because you have no idea which items it accepts. To give a concrete example, l could be a List<String> or it could be a List<ExecutorService>.

As Marko correctly pointed out, it's an unknown restriction on the List type.
The Java docs say:
The unbounded wildcard type is specified using the wildcard character
(?), for example, List<?>. This is called a list of unknown type.
There are two scenarios where an unbounded wildcard is a useful
approach:
If you are writing a method that can be implemented using functionality provided in the Object class.
When the code is using methods in the generic class that don't depend on the type parameter. For example, List.size or List.clear.
In fact, Class<?> is so often used because most of the methods in
Class do not depend on T.
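A small sketch of both scenarios, assuming a made-up helper class UnboundedWildcardDemo: one method uses only Object functionality (toString) plus List.size, the other uses List.clear. The commented-out line shows the restriction described in the previous answer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class UnboundedWildcardDemo {
    // Works for a List of any element type: uses only size() and Object.toString().
    static String describe(List<?> list) {
        StringBuilder sb = new StringBuilder();
        sb.append(list.size()).append(" item(s):");
        for (Object item : list) {
            sb.append(' ').append(item);
        }
        // list.add("x");   // does not compile: the element type is unknown
        return sb.toString();
    }

    // clear() does not depend on the element type either.
    static int sizeAfterClear(List<?> list) {
        list.clear();
        return list.size();
    }

    public static void main(String[] args) {
        System.out.println(describe(Arrays.asList("a", "b")));   // prints "2 item(s): a b"
        System.out.println(sizeAfterClear(new ArrayList<>(Arrays.asList(1, 2, 3))));
    }
}
```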

Let me make this a long bed time story; read it to fall asleep:)
Let's begin with this point -- To invoke a generic method, its type arguments must be supplied. (Unless the method is invoked in a "raw" manner, i.e. in the erased form, which is another topic:)
For example, to invoke Collections.<T>emptyList(), T must be supplied. It can be supplied explicitly by the programmer --
List<String> list = Collections.<String>emptyList(); // T=String
But that is tedious, and kind of dumb. Obviously in this context, T can only be String. It's stupid if the programmer has to repeat the obvious.
That's where type inference is helpful. We can omit the type argument, and the compiler can infer what the programmer intends it to be
List<String> list = Collections.emptyList(); // T=String is implied
Remember, <String> is still supplied, by the programmer, implicitly.
Supposedly, the programmer is the all-knowing dictator of all type arguments, and, the compiler and the programmer have a common understanding on when type arguments can be omitted and inferable from context. When the programmer omits a type argument, he knows the compiler can infer it exactly as he intended, based on a rigorous algorithm (which he masters:)
It is not the compiler's discretion to pick and choose type arguments, rather, the programmer does, and conveys it to the compiler.
Realistically, type inference is so complex that few programmers have any idea what's going on in a lot of cases:) The programmer is more like a dictator making vague commands, and the compiler tries its best to make sense out of it. We mostly write code on intuition, not paying attention to details, and we sort of believe that the code does what we want if the compiler approves it.
In any case, all type arguments are fixed precisely and predictably at compile time. Any omitted type argument is equivalent to an explicitly specified one.
Some type arguments are "undenotable", e.g. a type variable introduced by capture conversion. They can not be explicitly specified, they can only be inferred. (Nevertheless the programmer is supposed to know what they are, even though they cannot be named)
In the previous example, T can only be inferred as String, there's no other choices. But in a lot of cases, there are more candidates for T, and the type inference algorithm must have a strategy to resolve it to one of the candidates. For example, consider this lonely statement
Collections.emptyList();
T could be any type; T is resolved to Object, because, well, there's no good reason to resolve it to anything else, like Integer or String etc. Object is more special because it's the supertype of all.
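To make the explicit-versus-inferred distinction concrete, here is a small sketch with a made-up generic method twice (not a library method). The explicit call names the type argument before the method name; the inferred call leaves it out and relies on the assignment context.

```java
import java.util.ArrayList;
import java.util.List;

class InferenceDemo {
    // Hypothetical generic method: returns a list containing the value twice.
    static <T> List<T> twice(T value) {
        List<T> result = new ArrayList<>();
        result.add(value);
        result.add(value);
        return result;
    }

    static boolean sameResult() {
        List<String> explicit = InferenceDemo.<String>twice("x"); // T=String, written out
        List<String> inferred = twice("x");                       // T=String, inferred
        return explicit.equals(inferred);
    }

    public static void main(String[] args) {
        System.out.println(sameResult()); // prints "true"
    }
}
```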
Now, let's get to constructors. Formally speaking, constructors are not methods. But they are very much alike in a lot of aspects. Particularly, type inference on constructors is almost the same as on methods. Invoking a constructor of a class CLASS takes the form of new CLASS(args).
Just like methods, a constructor can be generic, with its own type parameters. For example,
class Bar
{
    <T>Bar(T x){ /* ... */ }
}
and type inference works on generic constructors too
new Bar("abc"); // inferred: T=String
To explicitly supply type arguments for a constructor,
new <String>Bar("abc");
It's pretty rare though that a constructor is generic.
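For completeness, a runnable version of the Bar example. The description field and the demo class are additions made up for the sketch, so that there is something to observe; the two constructor calls are the ones shown above.

```java
class Bar {
    final String description;

    // The constructor, not the class, declares the type parameter T.
    <T> Bar(T x) {
        this.description = x.getClass().getSimpleName();
    }
}

class GenericConstructorDemo {
    static String inferred() {
        return new Bar("abc").description;          // T=String is inferred
    }

    static String explicit() {
        return new <String>Bar("abc").description;  // T=String is written out
    }

    public static void main(String[] args) {
        System.out.println(inferred()); // prints "String"
        System.out.println(explicit()); // prints "String"
    }
}
```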
A generic constructor is different from a generic CLASS! Consider this
class Foo<T>
{
    Foo(T x){ /* ... */ }
}
The class is generic, the constructor is not. To invoke the constructor for class Foo<String>, we do
new Foo<String>(""); // CLASS = Foo<String>
Method type inference we've been talking about so far does not apply here, because the constructor is not even generic. In Java 5/6, there is no type inference on CLASS, therefore <String> must be explicitly specified. It's stupid, because <String> is obvious in this context. There were workarounds (i.e. using static factory methods), but people were of course very upset and demanded a solution.
In Java 7, this problem is solved by "diamond inference" -
new Foo<>(""); // inferred: T=String
"diamond" refers to the curious <> operator. It is required; we cannot simply write
new Foo("");
because that already had a different meaning - invoking the constructor of "raw" Foo.
With diamond inference, we can do things we couldn't in Java 5/6
List<Object> list = new ArrayList<>(); // Java 7. inferred: E=Object
// equivalent to
List<Object> list = new ArrayList<Object>(); // <Object> is required in Java 5/6
Remember, T=Object is still supplied, through diamond inference.
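Here is the same idea as a compilable sketch using the Foo class from above. The value field and the DiamondDemo class are assumptions added for the illustration.

```java
class Foo<T> {
    final T value;

    Foo(T x) {
        this.value = x;
    }
}

class DiamondDemo {
    static int totalLength() {
        Foo<String> explicit = new Foo<String>("abc"); // Java 5/6 style
        Foo<String> diamond = new Foo<>("abc");        // Java 7+: T=String is inferred
        // Foo raw = new Foo("abc");                   // compiles, but this is the raw type
        return explicit.value.length() + diamond.value.length();
    }

    public static void main(String[] args) {
        System.out.println(totalLength()); // prints "6"
    }
}
```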
Finally, we come back to your original question
List<?> list = new ArrayList<>();
Here, E=Object is inferred (what else?). The code is equivalent to
List<?> list = new ArrayList<Object>();
Yep, the list object is indeed an ArrayList<Object>, not ArrayList<SomethingElse>.
Also note that the following would be illegal and nonsensical
List<?> list = new ArrayList<?>();
^^^
CLASS in new CLASS(args) must be a concrete type. We can only instantiate an ArrayList of a specific element type.
The declared type List<?> of variable list is too general though. For a local variable, it is the best practice IMO to declare it in its more specific type
ArrayList<Object> list = new ArrayList<>();
Don't use <?> here - it just causes confusion to everybody.
On a related note, a lot of people would argue for "program against interface"
List<Object> list = new ArrayList<>();
^^^^
That is wrong IMO. Who are we providing abstraction for in a local block? Use the most specific type in implementation for max clarity;
use abstract types in interfaces.
zzzzzzzzzz

What is the difference between Class clazz and Class<?> clazz in java?

I need to make use of reflection in java. I understand that Class clazz creates a variable representing a Class object. However, I am trying to reference a Class object from a String using the forName("aClassName") method. My IDE (Eclipse), seems to prefer the notation Class<?> clazz for declaring the variable. I have seen this notation many times elsewhere. What does this mean?
Edit: Removed reference to ternary operator as it is not relevant to this question.

Class is a raw type - it's basically a generic type that you're treating as if you didn't know about generics at all.
Class<?> is a generic type using an unbounded wildcard - it basically means "Class<Foo> for some type Foo, but I don't know what".
Similarly you can have wildcards with bounds:
Class<? extends InputStream> means "Class<Foo> for some type Foo, but I don't know what so long as it's InputStream or a subclass"
Class<? super InputStream> means "Class<Foo> for some type Foo, but I don't know what so long as it's InputStream or a superclass"
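A short sketch of the three forms side by side. The class name ClassWildcardDemo is invented for this example, and ByteArrayInputStream is just one convenient InputStream subclass from the JDK.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;

class ClassWildcardDemo {
    static String names() {
        try {
            // forName cannot know the type at compile time, hence Class<?>.
            Class<?> unknown = Class.forName("java.lang.String");
            // InputStream or any subclass of it.
            Class<? extends InputStream> streamOrSub = ByteArrayInputStream.class;
            // InputStream or any superclass of it.
            Class<? super InputStream> streamOrSuper = Object.class;
            return unknown.getSimpleName() + ","
                    + streamOrSub.getSimpleName() + ","
                    + streamOrSuper.getSimpleName();
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(names()); // prints "String,ByteArrayInputStream,Object"
    }
}
```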
See also the Java Generics FAQ for a lot more information:
Raw types
Wildcards
And the Java Language Specification:
Raw types (section 4.8)
Type arguments and wildcards (section 4.5.1)
In particular, from the raw types section:
Raw types are closely related to wildcards. Both are based on existential types. Raw types can be thought of as wildcards whose type rules are deliberately unsound, to accommodate interaction with legacy code. Historically, raw types preceded wildcards; they were first introduced in GJ, and described in the paper Making the future safe for the past: Adding Genericity to the Java Programming Language by Gilad Bracha, Martin Odersky, David Stoutamire, and Philip Wadler, in Proceedings of the ACM Conference on Object-Oriented Programming, Systems, Languages and Applications (OOPSLA 98), October 1998.

The first thing to realize is that, in this case, the "?" is NOT the ternary operator, but is part of Java's generics implementation and indicates that the type of Class is unspecified, as some of the other answers have already explained.
To clarify the question about the ternary operator, it is actually very simple.
Imagine you have the following if statement:
boolean correct = true;
String message;
if (correct) {
    message = "You are correct.";
} else {
    message = "You are wrong.";
}
You can rewrite that with the ternary operator (think of it as the if-else-shortcut operator):
message = (correct) ? "You are correct." : "You are wrong.";
However, it's best to avoid the ternary operator for all but the simplest statements in order to improve the readability of your code.

In generic types, the wildcard ? means "whatever class" (so Class<?> accepts the same objects as plain Class, but it is a properly parameterized type rather than a raw type).

Isn't the argument type co- not contra-variant?

I understand the terms co-variance and contra-variance. But there is one small thing I am unable to understand. In the course "Functional Programming in Scala" on Coursera, Martin Odersky mentions that:
Functions are contravariant in their argument types and co-variant in
their return types
So for example in Java, let Dog extends Animal. And let a function be :
void getSomething(Animal a) { }
and I have the function call as
Dog d = new Dog();
getSomething(d);
So basically what is happening is that Animal a = d. And according to the wiki, covariance is "converting wider to narrow". And above we are converting from Dog to Animal. So isn't the argument type covariant rather than contravariant?

This is how functions are defined in Scala:
trait Function1 [-T1, +R] extends AnyRef
In English, parameter T1 is contravariant and result type R is covariant. What does it mean?
When some piece of code requires a function of Dog => Animal type, you can supply a function of Animal => Animal type, thanks to contravariance of parameter (you can use broader type).
Also you can supply function of Dog => Dog type, thanks to covariance of result type (you can use narrower type).
This actually makes sense: someone wants a function to transform dog to any animal. You can supply a function that transforms any animal (including dogs). Also your function can return only dogs, but dogs are still animals.
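For readers coming from the Java side of the question, roughly the same substitution can be sketched with java.util.function.Function. Java has no declaration-site variance, so the variance has to be spelled out with wildcards at the point of use. The Animal and Dog classes and the applyToDog method are made up for this illustration.

```java
import java.util.function.Function;

class Animal {
    String name() { return "animal"; }
}

class Dog extends Animal {
    @Override
    String name() { return "dog"; }
}

class FunctionVarianceDemo {
    // Asks for "a function from Dog to Animal"; the wildcards allow
    // a wider parameter type and a narrower result type.
    static String applyToDog(Function<? super Dog, ? extends Animal> f) {
        Animal result = f.apply(new Dog());
        return result.name();
    }

    static String demo() {
        Function<Dog, Animal> exact = d -> new Animal();
        Function<Animal, Animal> widerParameter = a -> a;  // accepts any Animal
        Function<Dog, Dog> narrowerResult = d -> d;        // returns only Dogs
        return applyToDog(exact) + ","
                + applyToDog(widerParameter) + ","
                + applyToDog(narrowerResult);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "animal,dog,dog"
    }
}
```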

Converting Dog to Animal is converting narrow to wider, so it's not covariance.

I remember being confused by that very sentence when I was reading the Scala Book back in 2007. Martin delivers it as if he was talking about a language feature, but in that sentence he only states a fact about functions in general. Scala, specifically, models that fact simply by a regular trait. Since Scala has declaration-site variance, expressing those semantics is natural to the language.
Java Generics, on the other hand, support only use-site variance, so the closest one can get to co/contravariance of a function type in Java is to hand-code it at each use site:
public int secondOrderFunction(Function<? super Integer, ? extends Number> fn) {
....
}
(assuming an appropriately declared interface Function<P, R>, P standing for parameter type and R for return type). Naturally, since this code is in the hands of the client, and not being specific to functions at all, the statement about param type/return type variance is not applicable to any language feature of Java. It is only applicable in a broader sense, pertaining to the nature of functions.
Java 8 will introduce closures, which implies first-class functions, but, as per Jörg's comment below, the implementation will not include a fully-fledged function type.

I think the original question about converting Dog to Animal has already been clarified, but it might be of interest to note that there is a reason why functions are defined as contravariant in their arguments and covariant in their return types. Let’s say you have two functions:
val f: Vertebrate => Mammal = ???
val g: Mammal => Primate = ???
As we are talking about functions, you would expect functions composition to be amongst your primitive operations. Indeed, you can compose f and g (g o f) and obtain as result a function:
val h: Vertebrate => Primate = f andThen g
But I can replace g with a subtype:
val gChild: Animal => Primate = ???
Without breaking the composability. And the type of gChild is a subtype of the type of g precisely because Function is defined as contravariant in its argument. In conclusion, a function type must be defined in such a way if you want to capture and preserve the idea of function composability.
You can find more details and few graphics that should help in digesting this subject here
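The composition argument can also be tried out in Java, where Function.andThen declares its parameter as Function<? super R, ? extends V>. This is only a sketch: the four classes below are placeholders for the hierarchy Animal > Vertebrate > Mammal > Primate assumed by the Scala snippet.

```java
import java.util.function.Function;

class Animal {}
class Vertebrate extends Animal {}
class Mammal extends Vertebrate {}
class Primate extends Mammal {}

class CompositionDemo {
    static boolean composes() {
        Function<Vertebrate, Mammal> f = v -> new Mammal();
        Function<Mammal, Primate> g = m -> new Primate();
        // Accepts a wider argument type than g does.
        Function<Animal, Primate> gChild = a -> new Primate();

        // Both compositions type-check: gChild can stand in for g.
        Function<Vertebrate, Primate> h1 = f.andThen(g);
        Function<Vertebrate, Primate> h2 = f.andThen(gChild);

        return h1.apply(new Vertebrate()) != null
                && h2.apply(new Vertebrate()) != null;
    }

    public static void main(String[] args) {
        System.out.println(composes()); // prints "true"
    }
}
```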

Wikipedia says for Generics - "One version of the class or function is compiled, works for all type parameters. "

How does Java do this? If there are not multiple Classes being created, then how does it support multiple Typed instantiations of the Generic class?
Until now I used to believe that it is like C++, but now I am totally confused.
Can't figure out how Java pulls this off?
-Ajay

This is due to type erasure. Java's generics are primarily a compile-time feature. At runtime, every type variable is replaced with its upper bound (Object if no bound is declared).
Thanks to Michael for the correction:
Generics are not strictly a compile-time feature. If a class, method or field has a generic type with a concrete type parameter specified, this information will be present at runtime and is available via reflection.
To elaborate:
When inspecting a parameterizable type itself at runtime, like java.util.List, there is no way of knowing what type it has been parameterized to. This makes sense since the type can be parameterized to all kinds of types in the same application. But, when you inspect the method or field that declares the use of a parameterized type, you can see at runtime what type the parameterizable type was parameterized to. In short:
You cannot see on a type itself what type it is parameterized to at runtime, but you can see it in fields and methods where it is used and parameterized; in other words, its concrete parameterizations.
Source
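A minimal sketch of the reflection case described above, assuming a hypothetical Holder class with a List<String> field. Field.getGenericType returns a ParameterizedType for such a field, and its actual type arguments are available at runtime.

```java
import java.lang.reflect.Field;
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.List;

class Holder {
    List<String> names; // hypothetical field, declared with a concrete type argument
}

class ReflectionDemo {
    static String elementTypeOfNames() {
        try {
            Field field = Holder.class.getDeclaredField("names");
            // The declaration keeps its generic signature in the class file.
            ParameterizedType type = (ParameterizedType) field.getGenericType();
            Type argument = type.getActualTypeArguments()[0];
            return argument.getTypeName();
        } catch (NoSuchFieldException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(elementTypeOfNames()); // prints "java.lang.String"
    }
}
```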

Since only reference types can be used as generic type arguments in Java, and all pointers are the same size, the same byte code can be used.
As for type safety, generics in Java are compile/link-time only. That is, during compilation generic types are replaced by their erasure. The erasure of a type variable T is its (leftmost) upper bound, or Object if it doesn't have one. For instance,
class Complex<N extends Number> {
    N real;
    N imag;
}
becomes
class Complex {
    Number real;
    Number imag;
}
as far as byte code is concerned.
Needless to say that is not pretty and causes numerous limitations. The most obvious one is that
new N();
does not compile, because the runtime does not know the type N stands for and hence can't instantiate the type. Similarly,
(N) n
will compile, but unlike an ordinary cast in Java, it will not be checked against the actual type N at runtime. An incorrect cast can therefore cause a variable to hold an object of the wrong type. This is called heap pollution. To ensure (a weaker form of) type safety, the compiler will introduce casts into calling code. For instance,
boolean right(Complex<Integer> c) {
    return c.real > 0;
}
will become
boolean right(Complex c) {
    return ((Integer) c.real) > 0;
}
To sum things up, the generics implementation in Java is not pretty, especially compared to the .NET one. The things we have to live with for the sake of backwards compatibility ...
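Heap pollution is easy to reproduce. In the sketch below (class and method names are made up), the unchecked cast itself raises no exception; the failure only shows up later, at the cast the compiler inserted into the calling code.

```java
import java.util.ArrayList;
import java.util.List;

class HeapPollutionDemo {
    // The cast to T is erased, so nothing is checked here at runtime.
    @SuppressWarnings("unchecked")
    static <T> T uncheckedCast(Object o) {
        return (T) o;
    }

    static boolean failsAtUseSite() {
        List<String> strings = new ArrayList<>();
        List<Integer> polluted = uncheckedCast(strings); // no exception
        polluted.add(42);                                // the heap is now polluted
        try {
            String s = strings.get(0); // compiler-inserted cast to String fails here
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(failsAtUseSite()); // prints "true"
    }
}
```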

Good question. The generic information is not kept during runtime. E.g. if you have this code
List<Apple> apples = new ArrayList<Apple>(); // this is a list of apples
But at runtime it becomes:
List apples = new ArrayList(); // this is how it looks in runtime
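One quick way to observe this, sketched with String and Integer instead of Apple so that it compiles on its own: two lists with different type arguments share a single runtime class.

```java
import java.util.ArrayList;
import java.util.List;

class ErasureDemo {
    static boolean sameRuntimeClass() {
        List<String> strings = new ArrayList<String>();
        List<Integer> numbers = new ArrayList<Integer>();
        // Only one ArrayList class exists at runtime, whatever the type argument.
        return strings.getClass() == numbers.getClass();
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass()); // prints "true"
    }
}
```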
