Oracle's Collection<Object> tutorial is confusing - java

I'm learning Java generics and reading through Generic Methods.
This page starts with
Consider writing a method that takes an array of objects and a collection and puts all objects in the array into the collection
It then states
By now, you will have learned to avoid the beginner's mistake of trying to use Collection<Object> as the type of the collection parameter.
The page implies that using Collection<Object> won't work.
Why is that an error? Why is it a beginner's error?
Collection<Object> as the parameter works fine for me. Am I such a beginner that I've somehow written code that works, but misses the point of the exercise?
import java.util.ArrayList;
import java.util.Collection;

public class Test {

    static void fromArrayToCol(Object[] a, Collection<Object> c) {
        for (Object x : a) {
            c.add(x);
        }
        System.out.println(c);
    }

    public static void main(String[] args) {
        Object[] oa = new Object[] { "hello", 678 };
        Collection<Object> c = new ArrayList<>();
        Test.fromArrayToCol(oa, c);
    }
}
It looks to me like Oracle's tutorial is wrong in its assertion. But I'm a beginner, so it's likely that I'm not grasping what it's trying to tell me.

You can find the answer if you read the Wildcards section.
The problem is that this new version is much less useful than the old one. Whereas the old code could be called with any kind of collection as a parameter, the new code only takes Collection, which, as we've just demonstrated, is not a supertype of all kinds of collections!
Here, the "old version" refers to a parameter of the raw type Collection, whereas the "new code" refers to Collection<Object>.
When you have a parameter of type Collection<Object>, you can pass either a Collection (the raw type) or a Collection<Object>. You cannot pass any other collection, such as a Collection<String> or a Collection<SomeClass>.
So, the goal of that tutorial is to copy the elements of an array containing any type to a new collection of the same type.
Example: Integer[] to Collection<Integer>
I would say it wasn't worded properly to bring out the above meaning.

It's often a mistake not because it's a compiler error but because having a Collection<Object> is very rarely useful. Such a collection can hold anything. How often do you need a collection that can hold anything and everything? Very rarely. There will almost always be a more specific type parameter you can use for your collection.
Using Collection<Object> more often than not just makes a programmer's life harder than it needs to be. To get anything useful out of such a collection, we need to inspect each element's type (e.g. with instanceof) and cast it.
By using the most appropriate type parameter, you give yourself compile-time assurance that the collection will only contain the types of objects that you expect it will and the resulting code is more concise and more readable.
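As a minimal sketch of this point (class and variable names are purely illustrative): with Collection<Object> you end up inspecting and casting elements, while a more specific type parameter lets the compiler do the checking.

import java.util.ArrayList;
import java.util.Collection;

public class TypedVsObjectCollection {
    public static void main(String[] args) {
        // A Collection<Object> accepts anything, so reading elements back is awkward:
        Collection<Object> anything = new ArrayList<>();
        anything.add("hello");
        anything.add(678);
        for (Object o : anything) {
            if (o instanceof String) {           // must inspect the type...
                String s = (String) o;           // ...and cast
                System.out.println(s.toUpperCase());
            }
        }

        // A Collection<String> lets the compiler do the checking:
        Collection<String> strings = new ArrayList<>();
        strings.add("hello");
        // strings.add(678);                     // would be a compile-time error
        for (String s : strings) {
            System.out.println(s.toUpperCase()); // no cast needed
        }
    }
}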

The beginner's mistake they're referring to is the attempt to use Collection<Object> as the parameter type when you intend to accept any collection of anything.
Because Object is the superclass of every Java class, one may think Collection<Object> is a "super-collection" of all collections. It is not, as the Wildcards section demonstrates:
The problem is that this new version is much less useful than the old one. Whereas the old code could be called with any kind of collection as a parameter, the new code only takes Collection, which, as we've just demonstrated, is not a supertype of all kinds of collections!
Instead of Collection<Object> you have to use a type parameter, Collection<T> (or the wildcard Collection<?>), to express that your method accepts any collection of anything.
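A small illustrative sketch of the "not a supertype" point (the method and class names are made up): a method declared with a Collection<Object> parameter will not accept a Collection<String>.

import java.util.ArrayList;
import java.util.Collection;

public class NotASupertype {
    // A parameter of type Collection<Object> only accepts Collection<Object> (or the raw type).
    static void printAll(Collection<Object> c) {
        for (Object o : c) {
            System.out.println(o);
        }
    }

    public static void main(String[] args) {
        Collection<Object> objects = new ArrayList<>();
        objects.add("hello");
        printAll(objects);        // fine

        Collection<String> strings = new ArrayList<>();
        strings.add("world");
        // printAll(strings);     // compile-time error:
        //                        // Collection<String> is not a Collection<Object>
    }
}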

To clarify: if you do not know a collection's element type at compile time, you cannot put an element into it (other than null).
In your current example, you do know the type of object you wish to put into the collection (in your case it is simply Object). The example shown on the site you linked is
static void fromArrayToCollection(Object[] a, Collection<?> c) {
    for (Object o : a) {
        c.add(o); // compile-time error
    }
}
Notice that the parameter is Collection<?>, not Collection<Object>; hence you get a compile-time error.
The example you showed in your question, which uses Collection<Object>, does compile; however, there are better ways to solve this problem.
The biggest problem with your version is that it only works on collections whose type parameter is Object, which makes it quite restrictive. The page is telling you that there is a way that will work for any collection and array, as long as they hold the same element type.

I think that's also an option:
private <T> void arrayToCollection(T[] objArray, Collection<T> collection) {
    for (T obj : objArray) {
        collection.add(obj);
    }
}
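For completeness, here is a usage sketch of the same idea (made static and wrapped in a throwaway demo class, so the names are just illustrative); the type parameter T is inferred from the arguments.

import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

public class ArrayToCollectionDemo {
    // Same idea as the method above, made static so it can be called without an instance.
    static <T> void arrayToCollection(T[] objArray, Collection<T> collection) {
        for (T obj : objArray) {
            collection.add(obj);
        }
    }

    public static void main(String[] args) {
        Integer[] numbers = { 1, 2, 3 };
        Collection<Integer> ints = new ArrayList<>();
        arrayToCollection(numbers, ints);      // T is inferred as Integer

        String[] words = { "a", "b" };
        List<String> strings = new ArrayList<>();
        arrayToCollection(words, strings);     // T is inferred as String

        System.out.println(ints + " " + strings);
    }
}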

Related

Build iterator for an interface, using the child [duplicate]

I have a List<SubClass> that I want to treat as a List<BaseClass>. It seems like it shouldn't be a problem since casting a SubClass to a BaseClass is a snap, but my compiler complains that the cast is impossible.
So, what's the best way to get a reference to the same objects as a List<BaseClass>?
Right now I'm just making a new list and copying the old list:
List<BaseClass> convertedList = new ArrayList<BaseClass>(listOfSubClass);
But as I understand it that has to create an entirely new list. I'd like a reference to the original list, if possible!
The syntax for this sort of assignment uses a wildcard:
List<SubClass> subs = ...;
List<? extends BaseClass> bases = subs;
It's important to realize that a List<SubClass> is not interchangeable with a List<BaseClass>. Code that retains a reference to the List<SubClass> will expect every item in the list to be a SubClass. If another part of code referred to the list as a List<BaseClass>, the compiler will not complain when a BaseClass or AnotherSubClass is inserted. But this will cause a ClassCastException for the first piece of code, which assumes that everything in the list is a SubClass.
Generic collections do not behave the same as arrays in Java. Arrays are covariant; that is, it is allowed to do this:
SubClass[] subs = ...;
BaseClass[] bases = subs;
This is allowed, because the array "knows" the type of its elements. If someone attempts to store something that isn't an instance of SubClass in the array (via the bases reference), a runtime exception will be thrown.
Generic collections do not "know" their component type; this information is "erased" at compile time. Therefore, they can't raise a runtime exception when an invalid store occurs. Instead, a ClassCastException will be raised at some far distant, hard-to-associate point in code when a value is read from the collection. If you heed compiler warnings about type safety, you will avoid these type errors at runtime.
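A small sketch of that contrast (all names illustrative): with arrays the bad store fails immediately with an ArrayStoreException, while with an unchecked generic cast the failure is deferred to wherever the value is later read.

import java.util.ArrayList;
import java.util.List;

public class CovarianceDemo {
    public static void main(String[] args) {
        // Arrays are covariant: the bad store fails immediately, at the store site.
        String[] strings = new String[2];
        Object[] objects = strings;             // legal
        try {
            objects[0] = Integer.valueOf(42);   // throws ArrayStoreException here
        } catch (ArrayStoreException e) {
            System.out.println("caught: " + e);
        }

        // Generics: an unchecked cast lets the bad store through silently...
        List<String> stringList = new ArrayList<>();
        @SuppressWarnings("unchecked")
        List<Object> objectList = (List<Object>) (List<?>) stringList;
        objectList.add(Integer.valueOf(42));    // no error here

        // ...and the ClassCastException only appears later, when the value is read:
        // String s = stringList.get(0);        // would throw here
    }
}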
erickson already explained why you can't do this, but here are some solutions:
If you only want to take elements out of your base list, in principle your receiving method should be declared as taking a List<? extends BaseClass>.
But if it isn't and you can't change it, you can wrap the list with Collections.unmodifiableList(...), which allows the result to be declared as a List of a supertype of the argument's type parameter. (It avoids the type-safety problem by throwing UnsupportedOperationException on insertion attempts.)
As #erickson explained, if you really want a reference to the original list, make sure no code inserts anything to that list if you ever want to use it again under its original declaration. The simplest way to get it is to just cast it to a plain old ungeneric list:
List<BaseClass> baseList = (List)new ArrayList<SubClass>();
I would not recommend this if you don't know what happens to the List and would suggest you change whatever code needs the List to accept the List you have.
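For illustration, a minimal sketch of the Collections.unmodifiableList(...) approach mentioned above (BaseClass/SubClass are placeholder names): because its parameter is List<? extends T>, the result can be typed to the supertype without a copy.

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class UnmodifiableViewDemo {
    static class BaseClass { }
    static class SubClass extends BaseClass { }

    public static void main(String[] args) {
        List<SubClass> subs = new ArrayList<>();
        subs.add(new SubClass());

        // unmodifiableList takes a List<? extends T>, so T can be inferred as the
        // supertype; no copy is made, the result is a read-only view of 'subs'.
        List<BaseClass> bases = Collections.unmodifiableList(subs);

        System.out.println(bases.size());       // 1, backed by the original list
        // bases.add(new BaseClass());          // would throw UnsupportedOperationException
    }
}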
The most efficient and at the same time safe way of accomplishing this is as follows:
List<S> supers = List.copyOf( descendants );
The documentation for this method is at oracle.com in the Java SE 19 docs for List.copyOf(). It states that the method exists "Since: 10".
The use of this function has the following advantages:
It is a neat one-liner.
It produces no warnings.
It does not require any typecast.
It does not require the cumbersome List<? extends S> construct.
It does not necessarily make a copy!
Most importantly: it does the right thing. (It is safe.)
Why is this the right thing?
If you look at the source code of List.copyOf() you will see that it works as follows:
If your list was created with List.of(), then it will do the cast and return it without copying it.
Otherwise (e.g. if your list is an ArrayList), it will create a copy and return it.
If your original List<D> is an ArrayList<D>, then in order to obtain a List<S>, a copy of the ArrayList must be made. If a cast was made instead, it would be opening up the possibility of inadvertently adding an S into that List<S>, causing your original ArrayList<D> to contain an S among the Ds, which is a disastrous situation known as Heap Pollution (Wikipedia): attempting to iterate all the Ds in the original ArrayList<D> would throw a ClassCastException.
On the other hand, if your original List<D> has been created using List.of(), then it is unchangeable(*1), so it is okay to simply cast it to List<S>, because nobody can actually add an S among the Ds.
List.copyOf() takes care of this decision logic for you.
(*1) when these lists were first introduced they were called "immutable"; later they realized that it is wrong to call them immutable, because a collection cannot be immutable, since it cannot vouch for the immutability of the elements that it contains; so they changed the documentation to call them "unmodifiable" instead; however, "unmodifiable" already had a meaning before these lists were introduced, and it meant "an unmodifiable-to-you view of my list, which I am still free to mutate as I please, and the mutations will be very visible to you". So, neither "immutable" nor "unmodifiable" is correct. I like to call them "superficially immutable" in the sense that they are not deeply immutable, but that may ruffle some feathers, so I just called them "unchangeable" as a compromise.
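A tiny usage sketch of this (requires Java 10+ for List.copyOf and Java 9+ for List.of; the names are illustrative), showing that the element type can be widened without a cast:

import java.util.ArrayList;
import java.util.List;

public class CopyOfDemo {
    public static void main(String[] args) {
        List<Integer> ints = new ArrayList<>(List.of(1, 2, 3));

        // copyOf takes a Collection<? extends E>, so the element type can be
        // widened to a supertype with no cast and no warning.
        List<Number> numbers = List.copyOf(ints);

        System.out.println(numbers);            // [1, 2, 3]
        // numbers.add(4.5);                    // would throw UnsupportedOperationException
    }
}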
I missed the answer where you just cast the original list, using a double cast. So here it is for completeness:
List<BaseClass> baseList = (List<BaseClass>)(List<?>)subList;
Nothing is copied, and the operation is fast. However, you are tricking the compiler here so you must make absolutely sure to not modify the list in such a way that the subList starts containing items of a different sub type. When dealing with immutable lists this is usually not an issue.
Below is a useful snippet that works. It constructs a new ArrayList, but the JVM object-creation overhead is insignificant.
The other answers seem unnecessarily complicated to me.
List<BaseClass> baselist = new ArrayList<>(sublist);
What you are trying to do is very useful and I find that I need to do it very often in code that I write.
Most java programmers would not think twice before implementing getConvertedList() by allocating a new ArrayList<>(), populating it with all the elements from the original list, and returning it. I enjoy entertaining the thought that about 30% of all clock cycles consumed by java code running on millions of machines all over the planet is doing nothing but creating such useless copies of ArrayLists which are garbage-collected microseconds after their creation.
The solution to this problem is, of course, down-casting the collection. Here is how to do it:
static <T, U extends T> List<T> downCastList(List<U> list) {
    @SuppressWarnings("unchecked")
    List<T> result = (List<T>) list;
    return result;
}
The intermediate result variable is necessary due to a perversion of the Java language:
return (List<T>)list; would produce an "unchecked cast" warning;
in order to suppress the warning, you need a @SuppressWarnings("unchecked") annotation, and good programming practice mandates that it be placed in the smallest possible scope, which is the individual statement, not the method;
in Java, an annotation cannot be placed on just any line of code; it must be placed on some entity, like a class, a field, a method, etc.;
luckily, one such annotatable entity is a local variable declaration;
therefore, we have to declare a new local variable to use the @SuppressWarnings annotation on it, and then return the value of that variable. (It should not matter anyway, because it should be optimized away by the JIT.)
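A self-contained usage sketch of the method above (BaseClass/SubClass are placeholder names); note that T is inferred from the assignment target and no copy is made.

import java.util.ArrayList;
import java.util.List;

public class DownCastDemo {
    static class BaseClass { }
    static class SubClass extends BaseClass { }

    static <T, U extends T> List<T> downCastList(List<U> list) {
        @SuppressWarnings("unchecked")
        List<T> result = (List<T>) list;
        return result;
    }

    public static void main(String[] args) {
        List<SubClass> subs = new ArrayList<>();
        subs.add(new SubClass());

        // T is inferred from the assignment target, U from the argument; no copy is made.
        List<BaseClass> bases = downCastList(subs);
        System.out.println(bases.size());       // 1, same backing list

        // Danger: adding a plain BaseClass through 'bases' would pollute the original
        // List<SubClass>, so only do this when nothing will be inserted via 'bases'.
    }
}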
Note: this answer was just upvoted, which is cool, but if you are reading this, please be sure to also read the second, more recent answer of mine to this same question: https://stackoverflow.com/a/72195980/773113
How about casting all the elements? This creates a new list, but it references the original objects from the old list.
List<BaseClass> convertedList = listOfSubClass.stream().map(x -> (BaseClass)x).collect(Collectors.toList());
Something like this should work too:
public static <T> List<T> convertListWithExtendableClasses(
        final List<? extends T> originalList,
        final Class<T> clazz) {
    final List<T> newList = new ArrayList<>();
    for (final T item : originalList) {
        newList.add(item);
    }
    return newList;
}
I don't really know why clazz is needed in Eclipse...
This is a complete working piece of code using generics, to pass a subclass list to a method declared in terms of the superclass.
Caller code that passes the subclass type:
List<SubClass> subClassParam = new ArrayList<>();
getElementDefinitionStatuses(subClassParam);
Callee method that accepts any subtype of the base class:
private static List<String> getElementDefinitionStatuses(List<? extends BaseClass> baseClassList) {
    List<String> statuses = new ArrayList<>();
    // ... collect a status string for each element of baseClassList ...
    return statuses;
}

Difference between two ways of type casting List to List<T>

public static <T> List<T> templatizeList(final List list, final Class<T> clazz) {
    return (List<T>) list;
}

public static <T> List<T> typeSafeAdd(List<?> from, Class<T> clazz) {
    List<T> to = new ArrayList<>();
    from.forEach(item -> to.add(clazz.cast(item)));
    return to;
}
What is the difference between the two methods? Is one way safer or faster than the other or it does not matter?
As per the Java docs, generics are limited to compile time. They go away once the code compiles; this is called type erasure.
Now, regarding the methods: method 1 just casts the list, without checking the type of the elements in it. Meaning, you may get an unexpected ClassCastException anywhere in the code at runtime if the List is cast to Cat and what comes out of it is a Dog.
Method 2 creates an entirely new list; it iterates through all the elements and tries to cast each one to the target type. Meaning, it fails immediately if any element can't be cast to the target type.
I would say method 2 is safer, as it makes sure everything is fine before adding a cast (i.e. it localizes the risk). Method 1 may allow a List (which contains Cat, Dog, Dinosaur) to be cast to List<Cat>, and then you may get unexpected failures later.
This example explains it well.
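To make the timing difference concrete, here is a small sketch (Cat/Dog are placeholder classes): method 1's failure surfaces only when the bad element is read back, while method 2 fails immediately at the cast.

import java.util.ArrayList;
import java.util.List;

public class CastTimingDemo {
    static class Cat { }
    static class Dog { }

    public static void main(String[] args) {
        List<Object> mixed = new ArrayList<>();
        mixed.add(new Cat());
        mixed.add(new Dog());                       // the element that does not belong

        // Method 1: the cast itself succeeds; the failure is deferred.
        @SuppressWarnings("unchecked")
        List<Cat> cats = (List<Cat>) (List<?>) mixed;
        try {
            for (Cat c : cats) {                    // ClassCastException when the Dog is read
                System.out.println(c);
            }
        } catch (ClassCastException e) {
            System.out.println("method 1 failed late: " + e);
        }

        // Method 2: every element is checked up front, so it fails immediately.
        try {
            List<Cat> checked = new ArrayList<>();
            for (Object o : mixed) {
                checked.add(Cat.class.cast(o));     // throws here for the Dog
            }
        } catch (ClassCastException e) {
            System.out.println("method 2 failed early: " + e);
        }
    }
}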
Given the discussion, I'd like to suggest an option that combines the best of both worlds: it both localizes the risk of an unsafe cast AND it avoids creating a new List. And it's easy. (I'm making this community wiki since I'm just borrowing others' ideas.)
Step 1: Do the cast for each element from Method 2. Don't do anything with the result, just do the cast. That will ensure that a bad value will be caught right upfront.
Step 2: Do the List cast from Method 1.
Yet another idea
If you know that the List you got from Hibernate contains only the right type of element, then you can go ahead with Method 1. (The Eclipse JDT (at least prior to Mars) does the same thing in the AST. I have to deal with raw types far more often than I'd like.)
The difference is that in the first one you return the same list, while the second creates a new list.
I don't think there is a 'safer' one of the two: since all list objects eventually get cast in both cases, a non-T object will cause a ClassCastException either way.
The first is better in my mind: it does not create a rather redundant new list.
There is no way to determine whether the List really should have the given generic parameter. You must know beforehand what the parameter should be (or you'll find out when you get a ClassCastException). This is why the code generates a warning: the compiler can't possibly know whether the cast is safe.
If you want to support a more generic element type, then you can go with:
public static <T> List<?> templatizeList(final List list, final Class<T> clazz) {
    return (List<?>) list;
}
Otherwise, the second option makes sure the list always holds the same type, but it creates a new List object.
So if you are sure about the contents, use either your first solution or my option above; otherwise your second solution is decent, accepting the fact that it creates a new object.

Why has ArrayList add E and remove Object [duplicate]

This question already has answers here:
What are the reasons why Map.get(Object key) is not (fully) generic
Why isn't Collection.remove(Object o) generic?
Seems like Collection<E> could have boolean remove(E o);
Then, when you accidentally try to remove (for example) Set<String> instead of each individual String from a Collection<String>, it would be a compile time error instead of a debugging problem later.
remove() (in Map as well as in Collection) is not generic because you should be able to pass in any type of object to remove(). The object removed does not have to be the same type as the object that you pass in to remove(); it only requires that they be equal. From the specification of remove(), remove(o) removes the object e such that (o==null ? e==null : o.equals(e)) is true. Note that there is nothing requiring o and e to be the same type. This follows from the fact that the equals() method takes in an Object as parameter, not just the same type as the object.
Although, it may be commonly true that many classes have equals() defined so that its objects can only be equal to objects of its own class, that is certainly not always the case. For example, the specification for List.equals() says that two List objects are equal if they are both Lists and have the same contents, even if they are different implementations of List. So coming back to the example in this question, it is possible to have a Map<ArrayList, Something> and for me to call remove() with a LinkedList as argument, and it should remove the key which is a list with the same contents. This would not be possible if remove() were generic and restricted its argument type.
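A small sketch of that last point (Integer keys are used just for illustration): remove(Object) lets you remove an ArrayList key using an equal LinkedList, which would be impossible if the parameter type were restricted to the key type.

import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;

public class RemoveByEqualityDemo {
    public static void main(String[] args) {
        Map<ArrayList<Integer>, String> map = new HashMap<>();
        map.put(new ArrayList<>(Arrays.asList(1, 2, 3)), "value");

        // A LinkedList with the same contents is equal to the ArrayList key,
        // even though it is a different List implementation.
        List<Integer> probe = new LinkedList<>(Arrays.asList(1, 2, 3));

        System.out.println(map.remove(probe));   // prints "value"
        // If remove were declared as remove(ArrayList<Integer> key),
        // passing the LinkedList above would not even compile.
    }
}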
Josh Bloch and Bill Pugh refer to this issue in Java Puzzlers IV: The Phantom Reference Menace, Attack of the Clone, and Revenge of The Shift.
Josh Bloch says (6:41) that they attempted to generify the get method of Map, the remove method, and some others, but "it simply didn't work". There are too many reasonable programs that could not be generified if you only allow the generic type of the collection as the parameter type. The example given by him is an intersection of a List of Numbers and a List of Longs.
Because if your type parameter is a wildcard, you can't use a generic remove method.
I seem to recall running into this question with Map's get(Object) method. The get method in this case isn't generic, though it should reasonably expect to be passed an object of the same type as the first type parameter. I realized that if you're passing around Maps with a wildcard as the first type parameter, there would be no way to get an element out of the Map with that method if the argument were generic: wildcard arguments can't really be satisfied, because the compiler can't guarantee that the type is correct.
I speculate that the reason add is generic is that you're expected to guarantee that the type is correct before adding an element to the collection. However, when removing an object, if the type is incorrect then it simply won't match anything anyway. If the argument were a wildcard, the method would be unusable, even though you may have an object which you can GUARANTEE belongs to that collection, because you just got a reference to it on the previous line...
I probably didn't explain it very well, but it seems logical enough to me.
In addition to the other answers, there is another reason why the method should accept an Object, which is predicates. Consider the following sample:
class Person {
    public String name;
    // override equals()
}

class Employee extends Person {
    public String company;
    // override equals()
}

class Developer extends Employee {
    public int yearsOfExperience;
    // override equals()
}

class Test {
    public static void main(String[] args) {
        Collection<? extends Person> people = new ArrayList<Employee>();
        // ...

        // to remove the first employee with a specific name:
        people.remove(new Person(someName1));

        // to remove the first developer that matches some criteria:
        people.remove(new Developer(someName2, someCompany, 10));

        // to remove the first employee who is either
        // a developer or an employee of someCompany:
        people.remove(new Object() {
            public boolean equals(Object employee) {
                return employee instanceof Developer
                        || ((Employee) employee).company.equals(someCompany);
            }
        });
    }
}
The point is that the object being passed to the remove method is responsible for defining the equals method. Building predicates becomes very simple this way.
Assume one has a collection of Cat, and some object references of types Animal, Cat, SiameseCat, and Dog. Asking the collection whether it contains the object referred to by the Cat or SiameseCat reference seems reasonable. Asking whether it contains the object referred to by the Animal reference may seem dodgy, but it's still perfectly reasonable. The object in question might, after all, be a Cat, and might appear in the collection.
Further, even if the object happens to be something other than a Cat, there's no problem saying whether it appears in the collection--simply answer "no, it doesn't". A "lookup-style" collection of some type should be able to meaningfully accept reference of any supertype and determine whether the object exists within the collection. If the passed-in object reference is of an unrelated type, there's no way the collection could possibly contain it, so the query is in some sense not meaningful (it will always answer "no"). Nonetheless, since there isn't any way to restrict parameters to being subtypes or supertypes, it's most practical to simply accept any type and answer "no" for any objects whose type is unrelated to that of the collection.
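A minimal sketch of that reasoning (Animal/Cat/Dog are placeholder classes relying on default identity equality): asking through a supertype reference is fine, and an unrelated object simply yields "no".

import java.util.ArrayList;
import java.util.Collection;

public class ContainsDemo {
    static class Animal { }
    static class Cat extends Animal { }
    static class Dog extends Animal { }

    public static void main(String[] args) {
        Collection<Cat> cats = new ArrayList<>();
        Cat felix = new Cat();
        cats.add(felix);

        Animal maybeCat = felix;                     // statically only an Animal
        System.out.println(cats.contains(maybeCat)); // true: it really is in the collection

        Animal rex = new Dog();
        System.out.println(cats.contains(rex));      // false: the query just answers "no"

        // With a hypothetical contains(Cat) signature, neither call would compile,
        // because both references have the static type Animal.
    }
}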
I always figured this was because remove() has no reason to care what type of object you give it. It's easy enough, regardless, to check if that object is one of the ones the Collection contains, since it can call equals() on anything. It's necessary to check type on add() to ensure that it only contains objects of that type.
It was a compromise. Both approaches have their advantages:
remove(Object o) is more flexible. For example, it allows you to iterate through a list of numbers and remove them from a list of longs. Code that uses this flexibility can be more easily generified.
remove(E e) brings more type safety to what most programs want to do, by detecting subtle bugs at compile time, like mistakenly trying to remove an integer from a list of shorts.
Backwards compatibility was always a major goal when evolving the Java API, therefore remove(Object o) was chosen because it made generifying existing code easier. If backwards compatibility had NOT been an issue, I'm guessing the designers would have chosen remove(E e).
Remove is not a generic method so that existing code using a non-generic collection will still compile and still have the same behavior.
See http://www.ibm.com/developerworks/java/library/j-jtp01255.html for details.
Edit: A commenter asks why the add method is generic. [...removed my explanation...] Second commenter answered the question from firebird84 much better than me.
Another reason is because of interfaces. Here is an example to show it:
public interface A {}
public interface B {}

public class MyClass implements A, B {
    public static void main(String[] args) {
        Collection<A> collection = new ArrayList<>();
        MyClass item = new MyClass();
        collection.add(item);  // works fine

        B b = item;            // valid
        collection.remove(b);  // works because the remove method accepts an Object;
                               // if it were generic, this would not compile
    }
}
Because it would break existing (pre-Java5) code. e.g.,
Set stringSet = new HashSet();
// do some stuff...
Object o = "foobar";
stringSet.remove(o);
Now you might say the above code is wrong, but suppose that o came from a heterogeneous set of objects (i.e. it contained strings, numbers, other objects, etc.). You want to remove all the matches, which was legal because remove would just ignore the non-strings, since they were not equal. But if you make it remove(String o), that no longer compiles.

Why does the ArrayList implementation use Object[]?

In Java, the ArrayList<E> implementation is based on an array of objects.
Can anybody explain to me why the implementation of ArrayList<E> uses an Object[] array for data storage instead of E[]? What is the benefit of using Object[]?
In Java, creating an array of a generic type is not straightforward.
The simple approach does not compile:
public class Container<E> {
E[] arr = new E[3]; // ERROR: Cannot create a generic array of E
}
Replace E with Object, and all is well (at the expense of added complexity elsewhere in the container implementation).
There are alternative approaches, but they present a different set of tradeoffs. For an extensive discussion, see How to create a generic array in Java?
So first of all, realize that the actual runtime type of the array object must be Object[]. This is because arrays know their component types at runtime (different array types are actually different types at runtime), and thus you need to specify the component type in creating the array, but the ArrayList object does not know its type argument at runtime.
That said, the compile-time type of the instance variable could be declared as either Object[] or E[], with different advantages and disadvantages:
If it is declared as Object[]:
private Object[] arr;
// to create it:
arr = new Object[3];
// to get an element:
E get(int i) { return (E)arr[i]; }
The disadvantage of this is that you must cast it to E every time you take something out of it, which means you are basically using it as a pre-generics container.
If it is declared as E[]:
private E[] arr;
// to create it:
arr = (E[])new Object[3];
// to get an element:
E get(int i) { return arr[i]; }
The advantage of this is that you no longer have to cast when you get things out of it -- it provides type-checking on the uses of arr, like generic containers do. The disadvantage is that, logically, the cast is a lie -- we know we created an object whose runtime type is Object[], so it is not an instance of E[], unless E happens to be Object.
However, there is no immediate problem with doing this, because E is erased to Object inside the instance methods of the class. The only way a problem can occur is if the object is somehow exposed to the outside of the class (e.g. returned in a method, put in a public field, etc.) in a capacity that uses its type as E[] (which it's not):
// This would be bad. It would cause a class cast exception at the call site
E[] getArray() { return arr; }
But ArrayList, and indeed any properly-designed container class, would never expose an implementation detail such as its internal array to the outside. It would break abstraction, among other things. So as long as the author of this class is aware of not ever exposing this array, there is no problem with doing it this way (save perhaps confusing the next person who sees the code and is unaware of it), and is free to take advantage of the increased type-checking that this way brings.
Considering type erasure (that is, the fact that generic type parameters such as E in your example are deleted at compile time), I suspect the generated bytecode would be similar in both cases.
From a maintenance point of view, using a type parameter instead of Object would lead to easier to read code (since it would limit casts). But since the API of ArrayList never exposes that "raw" Object array, I suppose it does not make any difference for us mere Java developers :)

java: programming against interface, but having to instantiate a concrete class

I'm taking my first steps with generics, and I've just coded a generic function to compare two List objects, like this:
public static <T> List<T> diffAdded(List<T> source, List<T> dest) {
    List<T> ret = new ArrayList<T>();
    for (T element : dest) {
        if (!source.contains(element)) {
            ret.add(element);
        }
    }
    return ret;
}
Everything works fine, but I'm instantiating an ArrayList, because obviously I cannot instantiate the List interface.
The fact is that I want to return an object of the same type as source...
How do you handle this kind of situation?
Can I run into any cast trouble with the method as it is right now?
Thanks a lot.
The answer to this question is almost always that "returning an object of the same type as the source" is not actually relevant to your application, and that you're going about things the wrong way.
If your caller needs a specific List implementation, because they'll be doing a specific kind of operation on it, then they can do the copying themselves...but if your method takes an arbitrary argument of type List, and the output changes semantically depending on the exact implementation of the input, then that's a huge code smell.
Leave your code as it is, and if your method's callers really, really need a specific implementation, then they can copy your output into that implementation themselves.
You've got two choices here.
1: Realize that the point of accepting the List<T> interface as your input type means that you are explicitly saying that you don't care about the underlying implementation. Furthermore, by returning a List<T> it says that your caller shouldn't care about the underlying implementation either. In most cases, a List is a List and the details shouldn't matter. If it does, you should explicitly return ArrayList<T> instead.
2: Make a bunch of polymorphic calls that match every List implementation type that you want to support.
I very much think that the first answer is where you should direct your efforts.
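If a caller really does insist on a particular List implementation, one common pattern (shown here only as a sketch, with illustrative names) is to let the caller supply a factory for the result, so the method still doesn't hard-code a concrete class. This uses List.of, so it assumes Java 9+.

import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.function.Supplier;

public class DiffDemo {
    // Variant of the method above: the caller supplies the List implementation to use.
    static <T> List<T> diffAdded(List<T> source, List<T> dest, Supplier<List<T>> factory) {
        List<T> ret = factory.get();
        for (T element : dest) {
            if (!source.contains(element)) {
                ret.add(element);
            }
        }
        return ret;
    }

    public static void main(String[] args) {
        List<String> source = List.of("a", "b");
        List<String> dest = List.of("a", "b", "c");

        List<String> asArrayList = diffAdded(source, dest, ArrayList::new);
        List<String> asLinkedList = diffAdded(source, dest, LinkedList::new);
        System.out.println(asArrayList + " " + asLinkedList);   // [c] [c]
    }
}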
