Let me preface this question by saying up front that I understand what Java can and can't do and am not asking about that. I'm wondering what the actual technical challenges are, from JVM and compiler standpoint, that require the compiler to behave the way it does.
Whenever I see discussions on weaknesses or most hated aspects of Java, type erasure always seems to be somewhere near the top of the list for Java developers (it is for me!). If my history is correct, Java 1.0 never implemented any type checking beyond passing Objects and recasting them. When a better type system was required, Sun had to decide between full typing support, which would break backwards compatibility, or going with their chosen solution of generics, which didn't break old code.
Meanwhile, C# ran into the same issue and went the opposite route, breaking backwards compatibility to implement a more complex type system around the same time (I believe).
My main question is why this was an either-or question for the two languages. What is it about the compiler process that means there is no way to support C#-style handling of types without breaking backwards compatibility with old code? I understand part of the problem is that the exact type is not always known at compile time, but at first (naive) glance it seems like sometimes it can be known at compile time, or that it can be left unknown at compile time and handled with a sort of reflection approach at runtime.
Is the problem that it's not feasible to implement, or that it was simply deemed too slow to implement a runtime sort of solution?
To go a step further, let's use a simple generic factory as an example of a place where type erasure feels rather cumbersome.
public class GenericFactory<FinalType, BuilderType<FinalType> extends GenericBuilder<FinalType>>{
    private Class builderClass;
    public GenericFactory(Class<BuilderType> builderClass){
        this.builderClass=builderClass;
    }
    public FinalType create(){
        GenericBuilder builder=builderClass.newInstance();
        builder.setFoo(getSystemProperty("foo"));
        builder.setBar(getSystemProperty("bar"));
        builder.setBaz(getSystemProperty("baz"));
        return builder.build();
    }
}
This example, assuming I didn't screw up on syntax somewhere, shows two particular annoyances of type erasure that at first glance seem like they should be easier to handle.
First, and less relevant, I had to add a FinalType parameter before I could refer to BuilderType extends GenericBuilder, even though it seems like FinalType could be inferred from BuilderType. I say less relevant since this may be more about generics syntax/implementation than the compiler limits that forced type erasure.
The second issue is that I had to pass in my BuilderClass object to the constructor in order to use reflection to build the builder, despite it being defined by the generics already. It seems as if it would be relatively easy for the compiler to store the generic class used here (so long as it didn't use the ? syntax) to allow reflection to look up the generic and then construct it.
Since this isn't done, I presume there is a very good reason why not. I'm trying to understand what these reasons are: what forces the JVM to stick with type erasure to maintain backwards compatibility?
I'm not sure that what you're describing (the two "annoyances") is a result of type erasure.
I had to add a FinalType parameter before I could refer to BuilderType extends GenericBuilder, even though it seems like FinalType could be inferred from BuilderType
BuilderType<FinalType> would not be a valid generic type name, unless I missed some changes to that in Java 8. Thus it should be BuilderType extends GenericBuilder<FinalType>, which is fine. FinalType can't be inferred here; how would the compiler know which type to provide?
The second issue is that I had to pass in my BuilderClass object to the constructor in order to use reflection to build the builder, despite it being defined by the generics already.
That's not true. The generic parameters don't define what FinalType actually is. I could create a GenericFactory<String, StringBuilderType> (with StringBuilderType extends GenericBuilder<String>) as well as a GenericFactory<Integer, IntegerBuilderType> (with IntegerBuilderType extends GenericBuilder<Integer>).
Here, if you provide the type parameters in a variable definition or method call, type erasure happens. As for the why, refer to Andy's comment.
However, if you have a field or subclass, e.g. private GenericFactory<String, StringBuilderType> stringFactory, there is no type erasure. The generic types can be extracted from the reflection data (unfortunately there's no easy built-in way, but have a look here: http://www.artima.com/weblogs/viewpost.jsp?thread=208860).
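The field case can be sketched roughly as follows. This is only a minimal illustration: the class FieldGenericsDemo and the helper declaredTypeArguments are hypothetical names, and a Map stands in for GenericFactory so that the snippet is self-contained. It relies only on java.lang.reflect.

```java
import java.lang.reflect.Field;
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.Map;

// Hypothetical demo class; the names are illustrative and not taken from the question.
class FieldGenericsDemo {
    // The type arguments written on a field declaration are recorded in the class file.
    private Map<String, Integer> stringFactory;

    // Returns the type arguments written in the declaration of the named field.
    static Type[] declaredTypeArguments(String fieldName) {
        try {
            Field field = FieldGenericsDemo.class.getDeclaredField(fieldName);
            ParameterizedType type = (ParameterizedType) field.getGenericType();
            return type.getActualTypeArguments();
        } catch (NoSuchFieldException e) {
            throw new IllegalArgumentException("No such field: " + fieldName, e);
        }
    }

    public static void main(String[] args) {
        for (Type argument : declaredTypeArguments("stringFactory")) {
            System.out.println(argument.getTypeName());
        }
    }
}
```

Note that this reads the declaration of the field, not the object stored in it; an instance created with new still carries no record of its type arguments.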
Studying Java, I've come across generic methods.
public <T> void foo(T variable) { }
That is, a method which takes a parameter with an undecided type (à la PHP?). I'm however unable to see how this would be a good solution, especially since I've come to love strongly typed languages after coming from loosely typed ones.
Is there any reason to use generic methods? If so, when?
Those who come from a pre-Java 5 background know how inconvenient it was to store an object in a Collection and then cast it back to the correct type before using it. Generics prevent that: they provide compile-time type safety, ensure that you only insert the correct type into a collection, and avoid ClassCastException at runtime.
So generics provide compile-time type safety and remove the need for casting. When you want to write complex APIs with complex method signatures, this saves you a lot, both when writing the API and when using it: you avoid writing lots of casting code and catch your errors at compile time. Just take a look at the source code of the java.util collection classes.
As a developer, I always want the compiler to catch my errors at compile time and tell me about them when I compile; then I fix them, and at runtime there won't be many errors related to type safety.
For more info, see:
http://javarevisited.blogspot.com/2011/09/generics-java-example-tutorial.html
http://javarevisited.blogspot.com/2012/06/10-interview-questions-on-java-generics.html
Generics, among other things, give you a way to provide a template -- i.e. you want to do the same thing, and the only difference is the type.
For example, look at the List API, you will see the methods
add(E e)
For every list of the same type you declare, the only thing different about the add method is the type of the thing going into the list. This is a prime example of where generics are useful. (Before generics were introduced to Java, you would declare a list, and you could add anything to the list, but you would have to cast the object when you retrieved it)
More specifically, you might want 2 ArrayList instances, one that takes type1 and one that takes type2. The list code for add is going to do the same thing, execute the same code, for each list (since the two lists are both ArrayList instances), right? So the only thing different is what's in the lists.
(As #michael points out, add isn't a true example of a generic method, but there are true generic methods in the API linked, and the concept is the same)
There's nothing non-strongly typed about generic functions in general. The type is resolved and checked at compile time. It's not an undecided type, it's one of a range of possible types (these can be constrained, in your example they are not). At compile time it is known and decided.
As hvgotcodes says, the Collections API contains a number of good examples of this in use.
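To make the "resolved and checked at compile time" point concrete, here is a small sketch of a constrained generic method. The class GenericMethodDemo and the method largest are made-up names for illustration; only java.util is used.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical example class; not part of any library.
class GenericMethodDemo {
    // T is fixed by the compiler at each call site; the bound restricts it
    // to types that can be compared with themselves.
    static <T extends Comparable<T>> T largest(List<T> items) {
        T best = items.get(0);
        for (T item : items) {
            if (item.compareTo(best) > 0) {
                best = item;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        String word = largest(Arrays.asList("pear", "apple", "fig")); // T is String here
        Integer number = largest(Arrays.asList(3, 9, 4));             // T is Integer here
        System.out.println(word + " " + number);
        // Integer wrong = largest(Arrays.asList("a", "b")); // rejected by the compiler
    }
}
```

The same method body serves both calls, yet each call is fully type checked: no cast appears in the calling code, and the commented-out line would not compile.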
The main objectives of generics are:
To provide type safety to the Collections so that they can hold only one particular type of object.
To resolve typecasting problems.
To hold only String objects, a generic version of ArrayList can be declared as follows:
ArrayList<String> l = new ArrayList<String>();
To know more : http://algovalley.com/java/generics.php
I've read the whole SCJP6 book by Sierra and Bates and scored 88% on the exam.
But still, I never heard of how this kind of code works, as it's not explained in the generics chapter:
Collections.<TimeUnit>reverseOrder()
What is this kind of generics usage?
I discovered it in some code but never read anything about it.
It seems to me that it gives some help to type inference.
I've tried to search for it, but it's not so easy to find (and it's not even in the SCJP book/exam!)
So can someone give me a proper explanation of how it works, what the use cases are, etc.?
Thanks
Edit
Thanks for the answers, but I expected more details :) so if someone wants to add some extra information:
What about more complex cases like
Using a type declared in the class: can I do something like Collections.<T>reverseOrder(), for example?
Using extends, super?
Using ?
Giving the compiler only partial help (ie O.manyTypesMethod<?,MyHelpTypeNotInfered,?,?,?,?,?>() )
It is explicit type specification of a generic method. You can always do it, but in most cases it's not needed. However, it is required in some cases if the compiler is unable to infer generic type on its own.
See an example towards the end of the tutorial page.
Update: only the first of your examples is valid. The explicit type argument must be, well, explicit, so no wildcards, extends or super is allowed there. Moreover, either you specify each type argument explicitly or none of them; i.e. the number of explicit type arguments must match the number of type parameters of the called method. A type parameter such as T is allowed if it is well defined in the current scope, e.g. as a type parameter of the enclosing class.
You are 100% correct, it is to help with type inference. Most of the time you don't need to do this in Java, as it can infer the type (even from the left hand side of an assignment, which is quite cool). This syntax is covered in the generics tutorial on the Java website.
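A minimal sketch of the two valid forms described above: an explicit concrete type argument, and an explicit type argument that is a type parameter of the enclosing class. The class ExplicitTypeArgumentDemo and its methods are hypothetical names; Collections.reverseOrder() and TimeUnit come from the JDK.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.TimeUnit;

// Hypothetical example class; only the JDK calls are real API.
class ExplicitTypeArgumentDemo<T extends Comparable<T>> {
    // A type parameter in scope (here T of the enclosing class) may be the explicit argument.
    Comparator<T> descending() {
        return Collections.<T>reverseOrder();
    }

    static List<TimeUnit> sortedDescending(List<TimeUnit> units) {
        List<TimeUnit> copy = new ArrayList<>(units);
        // Explicit form; a plain Collections.reverseOrder() would also be inferred here.
        copy.sort(Collections.<TimeUnit>reverseOrder());
        return copy;
    }

    public static void main(String[] args) {
        System.out.println(sortedDescending(
                Arrays.asList(TimeUnit.SECONDS, TimeUnit.DAYS, TimeUnit.MILLISECONDS)));
        System.out.println(new ExplicitTypeArgumentDemo<Integer>().descending().compare(1, 2));
    }
}
```

Wildcards and bounds are not accepted between the angle brackets of the call, which matches the update above.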
Just a small addition to the other responses.
When getting the corresponding compiler error:
While the "traditional" casting approach
(Comparator<TimeUnit>) Collections.reverseOrder()
looks similar to the generics approach
Collections.<TimeUnit>reverseOrder()
the casting approach is of course not type-safe (possible runtime exception), while the generics approach would create a compilation error, if there is an issue. Thus the generics approach is preferred, of course.
As the other answers have clarified, it's to help the compiler figure out what generic type you want. It's usually needed when using the Collections utility methods that return something of a generic type and do not receive parameters.
For example, consider the Collections.empty* methods, which return an empty collection. If you have a method that expects a Map<String, String>:
public static void foo(Map<String, String> map) { }
You cannot directly pass Collections.emptyMap() to it. The compiler will complain even if it knows that it expects a Map<String, String>:
// This won't compile.
foo(Collections.emptyMap());
You have to explicitly declare the type you want in the call, which I think looks quite ugly:
foo(Collections.<String, String>emptyMap());
Or you can omit that type declaration in the method call if you assign the emptyMap return value to a variable before passing it to the function, which I think is quite ridiculous, because it seems unnecessary and it shows that the compiler is really inconsistent: it sometimes does type inference on generic methods with no parameters, but sometimes it doesn't:
Map<String, String> map = Collections.emptyMap();
foo(map);
It may not seem like a very important thing, but when the generic types start getting more complex (e.g. Map<String, List<SomeOtherGenericType<Blah>>>) one kind of starts wishing that Java would have more intelligent type inference (but, as it doesn't, one will probably start writing new classes where it's not needed, just to avoid all those ugly <> =D).
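For reference, the two workarounds can be put side by side in one compilable sketch. The class name EmptyMapInferenceDemo is made up, and foo returns the map size only so that there is something to print. As far as I know, compilers for Java 8 and later also accept the bare call foo(Collections.emptyMap()), because inference there takes the parameter type of foo into account; the sketch sticks to the forms that older compilers accept as well.

```java
import java.util.Collections;
import java.util.Map;

// Hypothetical example class built around the foo method from the answer.
class EmptyMapInferenceDemo {
    static int foo(Map<String, String> map) {
        return map.size();
    }

    public static void main(String[] args) {
        // Workaround 1: explicit type arguments on the generic method call.
        int viaExplicitArguments = foo(Collections.<String, String>emptyMap());

        // Workaround 2: assign first, so inference uses the declared variable type.
        Map<String, String> map = Collections.emptyMap();
        int viaVariable = foo(map);

        System.out.println(viaExplicitArguments + " " + viaVariable);
    }
}
```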
In this case it is a way of telling the reverseOrder method what kind of ordering should be imposed on the object, based on what type you specify. The comparator needs to get specific information about how to order things.
I read from an interview with Neal Gafter:
"For example, adding function types to the programming language is much more difficult with Erasure as part of Generics."
EDIT:
Another place where I've met similar statement was in Brian Goetz's message in Lambda Dev mailing list, where he says that lambdas are easier to handle when they are just anonymous classes with syntactic sugar:
But my objection to function types was not that I don't like function types -- I love function types -- but that function types fought badly with an existing aspect of the Java type system, erasure. Erased function types are the worst of both worlds. So we removed this from the design.
Can anyone explain these statements? Why would I need runtime type information with lambdas?
The way I understand it, they decided that, because of erasure, it would be messy to go the way of 'function types' (e.g. delegates in C#), so they could only use lambda expressions, which are just a simplification of the single-abstract-method class syntax.
Delegates in C#:
public delegate void DoSomethingDelegate(Object param1, Object param2);
...
//now assign some method to the function type variable (delegate)
DoSomethingDelegate f = DoSomething;
f(new Object(), new Object());
(another sample here
http://geekswithblogs.net/joycsharp/archive/2008/02/15/simple-c-delegate-sample.aspx)
One argument they put forward in Project Lambda docs:
Generic types are erased, which would expose additional places where
developers are exposed to erasure. For example, it would not be
possible to overload methods m(T->U) and m(X->Y), which would be
confusing.
section 2 in:
http://cr.openjdk.java.net/~briangoetz/lambda/lambda-state-3.html
(The final lambda expressions syntax will be a bit different from the above document:
http://mail.openjdk.java.net/pipermail/lambda-dev/2011-September/003936.html)
(x, y) => { System.out.printf("%d + %d = %d%n", x, y, x+y); }
All in all, my best understanding is that only part of the syntax that could have been adopted will actually be used.
What Neal Gafter most likely meant was that not being able to use delegates will make the standard APIs more difficult to adjust to a functional style, rather than that the javac/JVM update would be more difficult to do.
If someone understands this better than me, I will be happy to read his account.
Goetz expands on the reasoning in State of the Lambda 4th ed.:
An alternative (or complementary) approach to function types,
suggested by some early proposals, would have been to introduce a new,
structural function type. A type like "function from a String and an
Object to an int" might be expressed as (String,Object)->int. This
idea was considered and rejected, at least for now, due to several
disadvantages:
It would add complexity to the type system and further mix structural and nominal types.
It would lead to a divergence of library styles—some libraries would continue to use callback interfaces, while others would use structural
function types.
The syntax could be unwieldy, especially when checked exceptions were included.
It is unlikely that there would be a runtime representation for each distinct function type, meaning developers would be further exposed to
and limited by erasure. For example, it would not be possible (perhaps
surprisingly) to overload methods m(T->U) and m(X->Y).
So, we have instead chosen to take the path of "use what you
know"—since existing libraries use functional interfaces extensively,
we codify and leverage this pattern.
To illustrate, here are some of the functional interfaces in Java SE 7
that are well-suited for being used with the new language features;
the examples that follow illustrate the use of a few of them.
java.lang.Runnable
java.util.concurrent.Callable
java.util.Comparator
java.beans.PropertyChangeListener
java.awt.event.ActionListener
javax.swing.event.ChangeListener
...
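As a rough illustration of the "use what you know" path, the sketch below assigns lambdas to some of the existing interfaces listed above. It assumes the lambda syntax that eventually shipped in Java 8; the class FunctionalInterfaceDemo and its helper methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.Callable;

// Hypothetical example class; the interfaces themselves are the JDK ones named above.
class FunctionalInterfaceDemo {
    static List<String> sortedByLength(List<String> words) {
        // The lambda is typed as the nominal interface Comparator<String>,
        // not as a structural type such as (String,String)->int.
        Comparator<String> byLength = (a, b) -> Integer.compare(a.length(), b.length());
        List<String> copy = new ArrayList<>(words);
        copy.sort(byLength);
        return copy;
    }

    // Callable.call() declares Exception, so it is wrapped here for convenience.
    static int call(Callable<Integer> task) {
        try {
            return task.call();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Runnable greeting = () -> System.out.println("run");
        greeting.run();
        System.out.println(call(() -> 6 * 7));
        System.out.println(sortedByLength(Arrays.asList("ccc", "a", "bb")));
    }
}
```

Because each lambda is converted to a named interface, overloads can be told apart by interface type rather than by a function type that would be erased.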
Note that erasure is just one of the considerations. In general, the Java lambda approach goes in a different direction from Scala, not just on the typed question. It's very Java-centric.
Maybe because what you'd really want would be a type Function<R, P...>, which is parameterised with a return type and some sequence of parameter types. But because of erasure, you can't have a construct like P..., because it could only turn into Object[], which is too loose to be of much use at runtime.
This is pure speculation. I am not a type theorist; I haven't even played one on TV.
I think what he means in that statement is that at runtime Java cannot tell the difference between these two function definitions:
void doIt(List<String> strings) {...}
void doIt(List<Integer> ints) {...}
Because the information about what type of data the List contains is erased at compile time, the runtime environment wouldn't be able to determine which function you wanted to call.
Trying to compile both of these methods in the same class will produce the following compile error:
doIt(List<String>) clashes with doIt(List<Integer>); both methods have the same erasure
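The clash can be made visible with a short sketch. It assumes the usual workaround of giving the two methods distinct names (doItWithStrings and doItWithInts are invented for this purpose) and then uses reflection to show that both have the same erased parameter type.

```java
import java.lang.reflect.Method;
import java.util.Arrays;
import java.util.List;

// Hypothetical example class; the method names are invented for illustration.
class ErasedSignatureDemo {
    // Distinct names are required: two overloads named doIt would both erase to doIt(List).
    static int doItWithStrings(List<String> strings) {
        int totalLength = 0;
        for (String s : strings) {
            totalLength += s.length();
        }
        return totalLength;
    }

    static int doItWithInts(List<Integer> ints) {
        int sum = 0;
        for (int i : ints) {
            sum += i;
        }
        return sum;
    }

    // Looks up a method of this class by name and returns its first erased parameter type.
    static Class<?> erasedParameterType(String methodName) {
        for (Method method : ErasedSignatureDemo.class.getDeclaredMethods()) {
            if (method.getName().equals(methodName)) {
                return method.getParameterTypes()[0];
            }
        }
        throw new IllegalArgumentException("No such method: " + methodName);
    }

    public static void main(String[] args) {
        System.out.println(erasedParameterType("doItWithStrings"));
        System.out.println(erasedParameterType("doItWithInts"));
        System.out.println(doItWithStrings(Arrays.asList("ab", "c")) + " " + doItWithInts(Arrays.asList(1, 2, 3)));
    }
}
```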
I've occasionally heard that with generics, Java didn't get it right. (nearest reference, here)
Pardon my inexperience, but what would have made them better?
Bad:
Type information is lost at compile time, so at execution time you can't tell what type it's "meant" to be
Can't be used for value types (this is a biggie - in .NET a List<byte> really is backed by a byte[] for example, and no boxing is required)
Syntax for calling generic methods sucks (IMO)
Syntax for constraints can get confusing
Wildcarding is generally confusing
Various restrictions due to the above - casting etc
Good:
Wildcarding allows covariance/contravariance to be specified at calling side, which is very neat in many situations
It's better than nothing!
The biggest problem is that Java generics are a compile-time only thing, and you can subvert it at run-time. C# is praised because it does more run-time checking. There is some really good discussion in this post, and it links to other discussions.
The main problem is that Java doesn't actually have generics at runtime. It's a compile time feature.
When you create a generic class in Java they use a method called "Type Erasure" to actually remove all of the generic types from the class and essentially replace them with Object. The mile high version of generics is that the compiler simply inserts casts to the specified generic type whenever it appears in the method body.
This has a lot of downsides. One of the biggest, IMHO, is that you can't use reflection to inspect a generic type. Types are not actually generic in the byte code and hence can't be inspected as generics.
Great overview of the differences here: http://www.jprl.com/Blog/archive/development/2007/Aug-31.html
1. Runtime implementation (i.e. not type erasure);
2. The ability to use primitive types (this is related to (1));
3. While the wildcarding is useful, the syntax and knowing when to use it is something that stumps a lot of people; and
4. No performance improvement (because of (1); Java generics are syntactic sugar for casting Objects).
(1) leads to some very strange behaviour. The best example I can think of is this. Assume:
public class MyClass<T> {
    T getStuff() { ... }
    List<String> getOtherStuff() { ... }
}
then declare two variables:
MyClass<T> m1 = ...
MyClass m2 = ...
Now call getOtherStuff():
List<String> list1 = m1.getOtherStuff();
List<String> list2 = m2.getOtherStuff();
The second has its generic type argument stripped off by the compiler because it is a raw type (meaning the parameterized type isn't supplied) even though it has nothing to do with the parameterized type.
I'll also mention my favourite declaration from the JDK:
public class Enum<T extends Enum<T>>
Apart from wildcarding (which is a mixed bag) I just think the .Net generics are better.
I'm going to throw out a really controversial opinion. Generics complicate the language and complicate the code. For example, let's say that I have a map that maps a string to a list of strings. In the old days, I could declare this simply as
Map someMap;
Now, I have to declare it as
Map<String, List<String>> someMap;
And every time I pass it into some method, I have to repeat that big long declaration all over again. In my opinion, all that extra typing distracts the developer and takes him out of "the zone". Also, when code is filled with lots of cruft, sometimes it's hard to come back to it later and quickly sift through all the cruft to find the important logic.
Java already has a bad reputation for being one of the most verbose languages in common use, and generics just add to that problem.
And what do you really buy for all that extra verbosity? How many times have you really had problems where someone put an Integer into a collection that's supposed to hold Strings, or where someone tried to pull a String out of a collection of Integers? In my 10 years of experience working at building commercial Java applications, this has just never been a big source of errors. So, I'm not really sure what you're getting for the extra verbosity. It really just strikes me as extra bureaucratic baggage.
Now I'm going to get really controversial. What I see as the biggest problem with collections in Java 1.4 is the necessity to typecast everywhere. I view those typecasts as extra, verbose cruft that have many of the same problems as generics. So, for example, I can't just do
List someList = someMap.get("some key");
I have to do
List someList = (List) someMap.get("some key");
The reason, of course, is that get() returns an Object which is a supertype of List. So the assignment can't be made without a typecast. Again, think about how much that rule really buys you. From my experience, not much.
I think Java would have been way better off if 1) it had not added generics but 2) instead had allowed implicit casting from a supertype to a subtype. Let incorrect casts be caught at runtime. Then I could have had the simplicity of defining
Map someMap;
and later doing
List someList = someMap.get("some key");
all the cruft would be gone, and I really don't think I'd be introducing a big new source of bugs into my code.
Another side effect of them being compile-time and not run time is that you can't call the constructor of the generic type. So you can't use them to implement a generic factory...
public class MyClass<T> {
    public T getStuff() {
        return new T(); // does not compile: T is not known at runtime
    }
}
--jeffk++
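One common way around this, sketched here under the assumption that Java 8 is available, is to hand the factory something that knows how to construct a T, for example a java.util.function.Supplier. The class SupplierFactory is a hypothetical name; passing a Class<T> object and calling it reflectively is the older variant of the same idea.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical factory class; only Supplier is real JDK API.
class SupplierFactory<T> {
    // The caller supplies the constructor, since "new T()" cannot be written.
    private final Supplier<T> constructor;

    SupplierFactory(Supplier<T> constructor) {
        this.constructor = constructor;
    }

    T getStuff() {
        return constructor.get();
    }

    public static void main(String[] args) {
        SupplierFactory<StringBuilder> builders = new SupplierFactory<>(StringBuilder::new);
        SupplierFactory<List<String>> lists = new SupplierFactory<>(ArrayList::new);
        System.out.println(builders.getStuff().append("made").toString());
        System.out.println(lists.getStuff().isEmpty());
    }
}
```

The type information still has to be passed in by hand, which is exactly the complaint, but the constructor reference is checked by the compiler instead of failing at runtime.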
Ignoring the whole type erasure mess, generics as specified just don't work.
This compiles:
List<Integer> x = Collections.emptyList();
But this is a compile-time error:
foo(Collections.emptyList());
Where foo is defined as:
void foo(List<Integer> x) { /* method body not important */ }
So whether an expression type checks depends on whether it is being assigned to a local variable or an actual parameter of a method call. How crazy is that?
Java generics are checked for correctness at compile time and then all type information is removed (the process is called type erasure). Thus, a generic List<Integer> will be reduced to its raw type, the non-generic List, which can contain objects of arbitrary class.
This makes it possible to insert arbitrary objects into the list at runtime, and it also makes it impossible to tell what types were used as generic parameters. The latter in turn results in
ArrayList<Integer> li = new ArrayList<Integer>();
ArrayList<Float> lf = new ArrayList<Float>();
if(li.getClass() == lf.getClass()) // evaluates to true
System.out.println("Equal");
Java generics are compile-time only and are compiled into non-generic code. In C#, the actual compiled MSIL is generic. This has huge implications for performance because Java still casts during runtime. See here for more.
The introduction of generics into Java was a difficult task because the architects were trying to balance functionality, ease of use, and backward compatibility with legacy code. Quite expectedly, compromises had to be made.
There are some who also feel that Java's implementation of generics increased the complexity of the language to an unacceptable level (see Ken Arnold's "Generics Considered Harmful"). Angelika Langer's Generics FAQs gives a pretty good idea as to how complicated things can become.
I wish this was a wiki so I could add to other people... but...
Problems:
Type Erasure (no runtime availability)
No support for primitive types
Incompatibility with annotations (they were both added in 1.5; I'm still not sure why annotations don't allow generics, aside from the features being rushed)
Incompatibility with arrays. (Sometimes I really want to do something like Class<? extends MyObject>[], but I'm not allowed)
Weird wildcard syntax and behavior
The fact that generic support is inconsistent across Java classes. They added it to most of the collections methods, but every once in a while, you run into an instance where it's not there.
Java doesn't enforce Generics at run time, only at compile time.
This means that you can do interesting things like adding the wrong types to generic Collections.
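A minimal sketch of what "adding the wrong types" looks like in practice. The class HeapPollutionDemo and its methods are made-up names; the raw-type assignment compiles with an unchecked warning, which is suppressed here on purpose.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical example class showing a List<String> that ends up holding an Integer.
class HeapPollutionDemo {
    @SuppressWarnings({"rawtypes", "unchecked"})
    static List<String> pollutedList() {
        List<String> strings = new ArrayList<>();
        List raw = strings;            // raw view of the very same list
        raw.add(Integer.valueOf(42));  // accepted: nothing checks the element type at runtime
        return strings;
    }

    // Returns true if reading the first element as a String fails.
    static boolean readFailsWithClassCast(List<String> strings) {
        try {
            String first = strings.get(0); // the compiler-inserted cast to String fails here
            System.out.println(first);
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> strings = pollutedList();
        System.out.println(strings.size());
        System.out.println(readFailsWithClassCast(strings));
    }
}
```

The failure shows up where the element is read, not where the wrong element was inserted, which is what makes this kind of bug awkward to trace.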
If you listen to Java Posse #279 - Interview with Joe Darcy and Alex Buckley, they talk about this issue. That also links to a Neal Gafter blog post titled Reified Generics for Java that says:
Many people are unsatisfied with the restrictions caused by the way generics are implemented in Java. Specifically, they are unhappy that generic type parameters are not reified: they are not available at runtime. Generics are implemented using erasure, in which generic type parameters are simply removed at runtime.
That blog post references an older entry, Puzzling Through Erasure: answer section, which stressed the point about migration compatibility in the requirements.
The goal was to provide backwards compatibility of both source and object code, and also migration compatibility.