Why isn't Collection.remove(Object o) generic?
Seems like Collection<E> could have boolean remove(E o);
Then, when you accidentally try to remove (for example) Set<String> instead of each individual String from a Collection<String>, it would be a compile time error instead of a debugging problem later.
remove() (in Map as well as in Collection) is not generic because you should be able to pass in any type of object to remove(). The object removed does not have to be the same type as the object that you pass in to remove(); it only requires that they be equal. From the specification of remove(), remove(o) removes the object e such that (o==null ? e==null : o.equals(e)) is true. Note that there is nothing requiring o and e to be the same type. This follows from the fact that the equals() method takes in an Object as parameter, not just the same type as the object.
Although many classes define equals() so that their objects can only be equal to objects of the same class, that is certainly not always the case. For example, the specification for List.equals() says that two List objects are equal if they are both Lists and have the same contents, even if they are different implementations of List. So, coming back to the example in this question, it is possible to have a Map<ArrayList, Something> and to call remove() with a LinkedList as the argument, and it should remove the key which is a list with the same contents. This would not be possible if remove() were generic and restricted its argument type.
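A minimal sketch of that Map case, using only java.util. The class name CrossTypeRemove and the sample values are made up for illustration:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;

class CrossTypeRemove {
    // Removes an ArrayList key by passing an equal LinkedList;
    // returns the value that was mapped to the removed key.
    static String demo() {
        Map<ArrayList<Integer>, String> map = new HashMap<>();
        map.put(new ArrayList<>(List.of(1, 2, 3)), "found");

        // A LinkedList is not an ArrayList, but List.equals() and
        // List.hashCode() are defined by contents, so the key matches.
        LinkedList<Integer> probe = new LinkedList<>(List.of(1, 2, 3));
        return map.remove(probe);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

With a hypothetical remove(K key), the call above would not compile, even though the probe is equal to the stored key.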
Josh Bloch and Bill Pugh refer to this issue in Java Puzzlers IV: The Phantom Reference Menace, Attack of the Clone, and Revenge of The Shift.
Josh Bloch says (6:41) that they attempted to generify the get method of Map, the remove method, and some others, but "it simply didn't work". There are too many reasonable programs that could not be generified if you only allowed the generic type of the collection as the parameter type. The example he gives is an intersection of a List of Numbers and a List of Longs.
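The intersection example might look roughly like the sketch below. This is not the code from the talk; the class and method names are invented, and the sample values are arbitrary:

```java
import java.util.ArrayList;
import java.util.List;

class NumberLongIntersection {
    // Collects the elements of 'numbers' that also occur in 'longs'.
    // longs.contains(n) compiles only because contains() takes Object:
    // n is a Number, not necessarily a Long.
    static List<Number> intersect(List<Number> numbers, List<Long> longs) {
        List<Number> common = new ArrayList<>();
        for (Number n : numbers) {
            if (longs.contains(n)) {
                common.add(n);
            }
        }
        return common;
    }

    static List<Number> demo() {
        List<Number> numbers = List.of(1L, 2, 3.0, 4L); // Long, Integer, Double, Long
        List<Long> longs = List.of(1L, 2L, 4L);
        return intersect(numbers, longs);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Only the two Long values survive: the Integer 2 is not equal to the Long 2L, so it is simply not found rather than rejected at compile time.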
Because if your type parameter is a wildcard, you can't use a generic remove method.
I seem to recall running into this question with Map's get(Object) method. The get method in this case isn't generic, though it should reasonably expect to be passed an object of the same type as the first type parameter. I realized that if you're passing around Maps with a wildcard as the first type parameter, then there's no way to get an element out of the Map with that method, if that argument was generic. Wildcard arguments can't really be satisfied, because the compiler can't guarantee that the type is correct. I speculate that the reason add is generic is that you're expected to guarantee that the type is correct before adding it to the collection. However, when removing an object, if the type is incorrect then it won't match anything anyway. If the argument were a wildcard the method would simply be unusable, even though you may have an object which you can GUARANTEE belongs to that collection, because you just got a reference to it in the previous line....
I probably didn't explain it very well, but it seems logical enough to me.
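A small sketch of the wildcard argument, with made-up names (WildcardLookup, lookup) and sample data:

```java
import java.util.HashMap;
import java.util.Map;

class WildcardLookup {
    // The key type of 'map' is unknown here. get(Object) is still callable;
    // a hypothetical get(K) could not be called with 'key' on a Map<?, ?>.
    static Object lookup(Map<?, ?> map, Object key) {
        // map.put(key, "x");  // does not compile: the wildcard types are unknown
        return map.get(key);
    }

    static Object demo() {
        Map<String, Integer> ages = new HashMap<>();
        ages.put("alice", 30);
        return lookup(ages, "alice");
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```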
In addition to the other answers, there is another reason why the method should accept an Object, which is predicates. Consider the following sample:
class Person {
    public String name;
    // override equals()
}

class Employee extends Person {
    public String company;
    // override equals()
}

class Developer extends Employee {
    public int yearsOfExperience;
    // override equals()
}

class Test {
    public static void main(String[] args) {
        Collection<? extends Person> people = new ArrayList<Employee>();
        // ...

        // to remove the first employee with a specific name:
        people.remove(new Person(someName1));

        // to remove the first developer that matches some criteria:
        people.remove(new Developer(someName2, someCompany, 10));

        // to remove the first employee who is either
        // a developer or an employee of someCompany:
        people.remove(new Object() {
            public boolean equals(Object employee) {
                return employee instanceof Developer
                    || ((Employee) employee).company.equals(someCompany);
            }
        });
    }
}
The point is that the object being passed to the remove method is responsible for defining the equals method. Building predicates becomes very simple this way.
Assume one has a collection of Cat, and some object references of types Animal, Cat, SiameseCat, and Dog. Asking the collection whether it contains the object referred to by the Cat or SiameseCat reference seems reasonable. Asking whether it contains the object referred to by the Animal reference may seem dodgy, but it's still perfectly reasonable. The object in question might, after all, be a Cat, and might appear in the collection.
Further, even if the object happens to be something other than a Cat, there's no problem saying whether it appears in the collection--simply answer "no, it doesn't". A "lookup-style" collection of some type should be able to meaningfully accept reference of any supertype and determine whether the object exists within the collection. If the passed-in object reference is of an unrelated type, there's no way the collection could possibly contain it, so the query is in some sense not meaningful (it will always answer "no"). Nonetheless, since there isn't any way to restrict parameters to being subtypes or supertypes, it's most practical to simply accept any type and answer "no" for any objects whose type is unrelated to that of the collection.
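The Cat example can be sketched as follows. The class hierarchy is a stand-in invented for illustration:

```java
import java.util.ArrayList;
import java.util.List;

class CatLookup {
    static class Animal {}
    static class Cat extends Animal {}
    static class Dog extends Animal {}

    // Returns { result for an Animal reference to a Cat, result for a Dog }.
    static boolean[] demo() {
        Cat tom = new Cat();
        List<Cat> cats = new ArrayList<>();
        cats.add(tom);

        Animal maybeCat = tom;   // static type Animal, runtime type Cat
        Animal dog = new Dog();  // can never be in a List<Cat>

        // Both calls compile because contains() takes Object.
        return new boolean[] { cats.contains(maybeCat), cats.contains(dog) };
    }

    public static void main(String[] args) {
        boolean[] r = demo();
        System.out.println(r[0] + " " + r[1]);
    }
}
```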
I always figured this was because remove() has no reason to care what type of object you give it. It's easy enough, regardless, to check if that object is one of the ones the Collection contains, since it can call equals() on anything. It's necessary to check type on add() to ensure that it only contains objects of that type.
It was a compromise. Both approaches have their advantage:
remove(Object o)
is more flexible. For example, it allows iterating through a list of numbers and removing them from a list of longs.
code that uses this flexibility can be more easily generified
remove(E e) brings more type safety to what most programs want to do by detecting subtle bugs at compile time, like mistakenly trying to remove an integer from a list of shorts.
Backwards compatibility was always a major goal when evolving the Java API, therefore remove(Object o) was chosen because it made generifying existing code easier. If backwards compatibility had NOT been an issue, I'm guessing the designers would have chosen remove(E e).
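The kind of subtle bug that remove(E e) would catch can be sketched like this. A Set is used instead of a List here as an assumption of the sketch, because List also has a remove(int index) overload that would change the meaning of the call:

```java
import java.util.HashSet;
import java.util.Set;

class ShortSetPuzzle {
    static int demo() {
        Set<Short> shorts = new HashSet<>();
        for (short i = 0; i < 10; i++) {
            shorts.add(i);
            // i - 1 is an int, so it is boxed to Integer, and an Integer never
            // equals a Short. remove(Object) accepts it and removes nothing.
            shorts.remove(i - 1);
        }
        return shorts.size();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The set still holds all ten values at the end. Writing shorts.remove((short) (i - 1)) gives the intended behavior; with remove(E e) the compiler would have pointed this out.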
Remove is not a generic method so that existing code using a non-generic collection will still compile and still have the same behavior.
See http://www.ibm.com/developerworks/java/library/j-jtp01255.html for details.
Edit: A commenter asks why the add method is generic. [...removed my explanation...] Second commenter answered the question from firebird84 much better than me.
Another reason is because of interfaces. Here is an example to show it :
public interface A {}
public interface B {}
public class MyClass implements A, B {}

public static void main(String[] args) {
    Collection<A> collection = new ArrayList<>();
    MyClass item = new MyClass();
    collection.add(item); // works fine
    B b = item; // valid
    collection.remove(b); /* It works because the remove method accepts an Object. If it were generic, this would not compile. */
}
Because it would break existing (pre-Java5) code. e.g.,
Set stringSet = new HashSet();
// do some stuff...
Object o = "foobar";
stringSet.remove(o);
Now you might say the above code is wrong, but suppose that o came from a heterogeneous set of objects (i.e., it contained strings, numbers, other objects, etc.). You want to remove all the matches, which was legal because remove would just ignore the non-strings, since they could never be equal. But if you make it remove(String o), that no longer works.
I'm learning Java generics and reading through Generic Methods.
This page starts with
Consider writing a method that takes an array of objects and a collection and puts all objects in the array into the collection
It then states
By now, you will have learned to avoid the beginner's mistake of trying to use Collection<Object> as the type of the collection parameter.
The page implies that using Collection<Object> won't work.
Why is that an error? Why is it a beginner's error?
Collection<Object> as the parameter works fine for me. Am I so beginner that I've somehow made code that works, but misses the point of the exercise?
import java.util.ArrayList;
import java.util.Collection;
public class test {
    static void fromArrayToCol(Object[] a, Collection<Object> c) {
        for (Object x : a) {
            c.add(x);
        }
        System.out.println(c);
    }

    public static void main(String[] args) {
        test r = new test();
        Object[] oa = new Object[]{"hello", 678};
        Collection<Object> c = new ArrayList<>();
        test.fromArrayToCol(oa, c);
    }
}
It looks to me like Oracle's tutorial is wrong in its assertion. But I'm a beginner, so it's likely that I'm not grasping what it's trying to tell me.
You can find the answer if you read the Wildcards section.
The problem is that this new version is much less useful than the old one. Whereas the old code could be called with any kind of collection as a parameter, the new code only takes Collection<Object>, which, as we've just demonstrated, is not a supertype of all kinds of collections!
Here, the old version refers to a parameter of the raw type Collection, whereas the new code refers to Collection<Object>.
When you have a parameter of type Collection<Object>, you can pass either a Collection (raw type) or a Collection<Object>. You cannot pass any other collection, such as Collection<String> or Collection<SomeClass>.
So, the goal of that tutorial is to copy the elements of an array containing any type to a new collection of the same type.
Example: Integer[] to Collection<Integer>
I would say it wasn't worded properly to bring out the above meaning.
It's often a mistake not because it's a compiler error but because having a Collection<Object> is very rarely useful. Such a collection can hold anything. How often do you need a collection that can hold anything and everything? Very rarely. There will almost always be a more specific type parameter you can use for your collection.
Using Collection<Object> more often than not just makes a programmer's life harder than it needs to be. To do anything with an element we take out of it, we need to inspect its type (e.g. with instanceof) and cast it.
By using the most appropriate type parameter, you give yourself compile-time assurance that the collection will only contain the types of objects that you expect it will and the resulting code is more concise and more readable.
The beginner's mistake they're referring to is the attempt to use Collection<Object> as a parameter when you intend to accept any collection of something.
Because Object is the superclass of every Java class, one may think Collection<Object> is a "super-collection" of all collections. This point is demonstrated in the Wildcards section:
The problem is that this new version is much less useful than the old one. Whereas the old code could be called with any kind of collection as a parameter, the new code only takes Collection<Object>, which, as we've just demonstrated, is not a supertype of all kinds of collections!
Instead of Collection<Object>, you have to use Collection<T> to express that your method accepts any collection of something.
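To make the difference concrete, here is a sketch with two hypothetical methods, intoObjects and intoAny, which are not taken from the tutorial:

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

class ArrayToCollection {
    // Accepts only a Collection<Object> (or a raw Collection).
    static void intoObjects(Object[] a, Collection<Object> c) {
        for (Object o : a) {
            c.add(o);
        }
    }

    // Accepts any collection whose element type matches the array's.
    static <T> void intoAny(T[] a, Collection<T> c) {
        for (T o : a) {
            c.add(o);
        }
    }

    static List<String> demo() {
        String[] words = {"a", "b"};
        List<String> target = new ArrayList<>();
        // intoObjects(words, target);  // does not compile: List<String> is not a Collection<Object>
        intoAny(words, target);          // T is inferred as String
        return target;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```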
To clarify: if you do not know the element type of a collection at compile time, you cannot put an element in it.
In your current example, you do know the type of object you wish to put into the collection (in your case it's Object). The example shown on the site you linked is
static void fromArrayToCollection(Object[] a, Collection<?> c) {
    for (Object o : a) {
        c.add(o); // compile-time error
    }
}
Notice that there is a ? and not an Object. Hence you get a compile-time error.
The example you showed in your question, which uses Object, does compile; however, there are better ways to solve this problem.
The biggest problem with it is that it only works with collections whose generic type is Object. This makes it quite restrictive. The page is telling you that there is a way that will work for any collection and array as long as they hold the same type.
I think that's also an option:
private <T> void arrayToCollection(T[] objArray, Collection<T> collection) {
    for (T obj : objArray) {
        collection.add(obj);
    }
}
I have a List<SubClass> that I want to treat as a List<BaseClass>. It seems like it shouldn't be a problem since casting a SubClass to a BaseClass is a snap, but my compiler complains that the cast is impossible.
So, what's the best way to get a reference to the same objects as a List<BaseClass>?
Right now I'm just making a new list and copying the old list:
List<BaseClass> convertedList = new ArrayList<BaseClass>(listOfSubClass);
But as I understand it that has to create an entirely new list. I'd like a reference to the original list, if possible!
The syntax for this sort of assignment uses a wildcard:
List<SubClass> subs = ...;
List<? extends BaseClass> bases = subs;
It's important to realize that a List<SubClass> is not interchangeable with a List<BaseClass>. Code that retains a reference to the List<SubClass> will expect every item in the list to be a SubClass. If another part of code referred to the list as a List<BaseClass>, the compiler will not complain when a BaseClass or AnotherSubClass is inserted. But this will cause a ClassCastException for the first piece of code, which assumes that everything in the list is a SubClass.
Generic collections do not behave the same as arrays in Java. Arrays are covariant; that is, it is allowed to do this:
SubClass[] subs = ...;
BaseClass[] bases = subs;
This is allowed, because the array "knows" the type of its elements. If someone attempts to store something that isn't an instance of SubClass in the array (via the bases reference), a runtime exception will be thrown.
Generic collections do not "know" their component type; this information is "erased" at compile time. Therefore, they can't raise a runtime exception when an invalid store occurs. Instead, a ClassCastException will be raised at some far distant, hard-to-associate point in code when a value is read from the collection. If you heed compiler warnings about type safety, you will avoid these type errors at runtime.
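The contrast between the two failure modes can be sketched as follows, with Number and Integer standing in for BaseClass and SubClass (an assumption of this sketch):

```java
import java.util.ArrayList;
import java.util.List;

class CovarianceDemo {
    // Arrays check every store at run time.
    static boolean arrayStoreFails() {
        Number[] numbers = new Integer[1];   // allowed: arrays are covariant
        try {
            numbers[0] = 1.5;                // a Double into an Integer[]
            return false;
        } catch (ArrayStoreException e) {
            return true;                     // fails at the faulty store
        }
    }

    // Generic lists cannot check the store; the failure shows up on a later read.
    @SuppressWarnings({"unchecked", "rawtypes"})
    static boolean listReadFails() {
        List<Integer> ints = new ArrayList<>();
        List<Number> numbers = (List) ints;  // raw cast: defeats the compiler's check
        numbers.add(1.5);                    // succeeds silently
        try {
            Integer first = ints.get(0);     // compiler-inserted cast to Integer
            return false;
        } catch (ClassCastException e) {
            return true;                     // fails far from the faulty store
        }
    }

    public static void main(String[] args) {
        System.out.println(arrayStoreFails() + " " + listReadFails());
    }
}
```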
erickson already explained why you can't do this, but here some solutions:
If you only want to take elements out of your base list, in principle your receiving method should be declared as taking a List<? extends BaseClass>.
But if it isn't and you can't change it, you can wrap the list with Collections.unmodifiableList(...), which allows returning a List of a supertype of the argument's type parameter. (It avoids the type-safety problem by throwing UnsupportedOperationException on any attempted insertion.)
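A short sketch of the wrapper approach, again with Number and Integer standing in for BaseClass and SubClass; the class name ReadOnlyView is made up:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class ReadOnlyView {
    static String demo() {
        List<Integer> ints = new ArrayList<>(List.of(1, 2));

        // No copy and no cast: unmodifiableList takes a List<? extends T>
        // and returns a List<T>.
        List<Number> view = Collections.unmodifiableList(ints);

        ints.add(3);                 // changes to the original show through the view
        try {
            view.add(1.5);           // writes through the view are rejected
        } catch (UnsupportedOperationException e) {
            return "size=" + view.size() + ", add rejected";
        }
        return "add accepted";
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Note that the result is a view, not a snapshot, so it keeps reflecting later changes to the original list.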
As @erickson explained, if you really want a reference to the original list, make sure no code inserts anything into that list if you ever want to use it again under its original declaration. The simplest way to get it is to just cast it to a plain old ungeneric list:
List<BaseClass> baseList = (List)new ArrayList<SubClass>();
I would not recommend this if you don't know what happens to the List and would suggest you change whatever code needs the List to accept the List you have.
The most efficient and at the same time safe way of accomplishing this is as follows:
List<S> supers = List.copyOf( descendants );
The documentation of this function is here: oracle.com - Java SE 19 docs - List.copyOf() The documentation states that this function exists "Since: 10".
The use of this function has the following advantages:
It is a neat one-liner.
It produces no warnings.
It does not require any typecast.
It does not require the cumbersome List<? extends S> construct.
It does not necessarily make a copy !!!
Most importantly: it does the right thing. (It is safe.)
Why is this the right thing?
If you look at the source code of List.copyOf() you will see that it works as follows:
If your list was created with List.of(), then it will do the cast and return it without copying it.
Otherwise, (e.g. if your list is an ArrayList(),) it will create a copy and return it.
If your original List<D> is an ArrayList<D>, then in order to obtain a List<S>, a copy of the ArrayList must be made. If a cast was made instead, it would be opening up the possibility of inadvertently adding an S into that List<S>, causing your original ArrayList<D> to contain an S among the Ds, which is a disastrous situation known as Heap Pollution (Wikipedia): attempting to iterate all the Ds in the original ArrayList<D> would throw a ClassCastException.
On the other hand, if your original List<D> has been created using List.of(), then it is unchangeable(*1), so it is okay to simply cast it to List<S>, because nobody can actually add an S among the Ds.
List.copyOf() takes care of this decision logic for you.
(*1) When these lists were first introduced they were called "immutable"; later it was realized that it is wrong to call them immutable, because a collection cannot be immutable, since it cannot vouch for the immutability of the elements that it contains; so the documentation was changed to call them "unmodifiable" instead; however, "unmodifiable" already had a meaning before these lists were introduced, and it meant "an unmodifiable-to-you view of my list, which I am still free to mutate as I please, and the mutations will be very visible to you". So, neither immutable nor unmodifiable is correct. I like to call them "superficially immutable" in the sense that they are not deeply immutable, but that may ruffle some feathers, so I just called them "unchangeable" as a compromise.
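The ArrayList case can be checked with a sketch like the one below, with Number and Integer standing in for S and D. Whether a list created with List.of() is returned without copying is an implementation detail that the documentation only describes as what will "generally" happen, so the sketch does not rely on it:

```java
import java.util.ArrayList;
import java.util.List;

class CopyOfDemo {
    static boolean demo() {
        List<Integer> descendants = new ArrayList<>(List.of(1, 2));
        List<Number> supers = List.copyOf(descendants);  // no cast, no warning

        descendants.add(3);                  // the copy does not see this
        boolean independent = supers.size() == 2;

        boolean readOnly;
        try {
            supers.add(4.5);                 // the result cannot be modified
            readOnly = false;
        } catch (UnsupportedOperationException e) {
            readOnly = true;
        }
        return independent && readOnly;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

One caveat: List.copyOf() throws NullPointerException if the source contains a null element, so it is not a drop-in replacement for new ArrayList<>(list) in every case.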
I missed the answer where you just cast the original list, using a double cast. So here it is for completeness:
List<BaseClass> baseList = (List<BaseClass>)(List<?>)subList;
Nothing is copied, and the operation is fast. However, you are tricking the compiler here so you must make absolutely sure to not modify the list in such a way that the subList starts containing items of a different sub type. When dealing with immutable lists this is usually not an issue.
Below is a useful snippet that works. It constructs a new ArrayList, but the JVM object-creation overhead is insignificant.
The other answers I saw are unnecessarily complicated.
List<BaseClass> baselist = new ArrayList<>(sublist);
What you are trying to do is very useful and I find that I need to do it very often in code that I write.
Most java programmers would not think twice before implementing getConvertedList() by allocating a new ArrayList<>(), populating it with all the elements from the original list, and returning it. I enjoy entertaining the thought that about 30% of all clock cycles consumed by java code running on millions of machines all over the planet is doing nothing but creating such useless copies of ArrayLists which are garbage-collected microseconds after their creation.
The solution to this problem is, of course, down-casting the collection. Here is how to do it:
static <T, U extends T> List<T> downCastList( List<U> list )
{
    @SuppressWarnings( "unchecked" )
    List<T> result = (List<T>)list;
    return result;
}
The intermediate result variable is necessary due to a perversion of the java language:
return (List<T>)list; would produce an "unchecked cast" warning;
in order to suppress the warning, you need a @SuppressWarnings( "unchecked" ) annotation, and good programming practices mandate that it must be placed in the smallest possible scope, which is the individual statement, not the method.
in java an annotation cannot be placed on just any line of code; it must be placed on some entity, like a class, a field, a method, etc.
luckily, one such annotatable entity is a local variable declaration.
therefore, we have to declare a new local variable to use the @SuppressWarnings annotation on it, and then return the value of that variable. (It should not matter anyway, because it should be optimized away by the JIT.)
Note: this answer was just upvoted, which is cool, but if you are reading this, please be sure to also read the second, more recent answer of mine to this same question: https://stackoverflow.com/a/72195980/773113
How about casting all elements. It will create a new list, but will reference the original objects from the old list.
List<BaseClass> convertedList = listOfSubClass.stream().map(x -> (BaseClass)x).collect(Collectors.toList());
Something like this should work too:
public static <T> List<T> convertListWithExtendableClasses(
        final List<? extends T> originalList,
        final Class<T> clazz )
{
    final List<T> newList = new ArrayList<>();
    for ( final T item : originalList )
    {
        newList.add( item );
    }// for
    return newList;
}
Don't really know why clazz is needed in Eclipse..
This is the complete working piece of code using Generics, to cast sub class list to super class.
Caller method that passes subclass type
List<SubClass> subClassParam = new ArrayList<>();
getElementDefinitionStatuses(subClassParam);
Callee method that accepts any subtype of the base class
private static List<String> getElementDefinitionStatuses(List<? extends BaseClass> baseClassVariableName) {
    return allElementStatuses;
}
The Collections.singleton() method returns a Set with that single argument instead of a Collection.
Why is that so? From what I can see, apart from Set being a subtype of Collection, I can see no advantage... Is this only because Set extends Collection anyway so there is no reason not to?
And yes, there is also Collections.singletonList() but this is another matter since you can access random elements from a List with .get()...
Immutable
The benefit is found in the first adjective read in that JavaDoc documentation: immutable.
There are times when you are working with code that demands a Set (or List, etc.). In your own context you may have a strict need for only a single item. To accomplish your own goal of enforcing the rule of single-item-only while needing to present that item in a set, use a Set implementation that forbids you from adding more than one item.
“Immutable” on Collections::singleton means that, once created, the resulting Set object is guaranteed to have one, and only one item. Not zero, and not more than one. No more can be added. The one item cannot be removed.
For example, imagine your code is working with an Employee object representing the CEO (Chief Executive Officer) of your company. Your code is explicitly dealing with the CEO only, so you know there can be only one such Employee object at a time, always one CEO exactly. Yet you want to leverage some existing code that creates a report for a specified collection of Employee objects. By using Collections.singleton you are guaranteed that your own code does not mistakenly have other than one single employee, while still being able to pass a Set.
Set< Employee > ceo = Collections.singleton( new Employee( "Tim Cook" ) ) ; // Always exactly one item in this context, only one CEO is possible.
ceo.add( … ) ; // Fails, as the collection is immutable.
ceo.clear() ; // Fails, as the collection is immutable.
ceo.remove( … ) ; // Fails, as the collection is immutable.
someReport.processEmployees( ceo ) ;
Java 9: Set.of & List.of
Java 9 and later offers new interface methods Set.of and List.of to get the same effect, an immutable collection of a single element.
Set< Pet > pet = Set.of( someDog ) ;
These methods are overloaded to accept any number of elements for the immutable collection, not just one.
Set< Pet > pets = Set.of( someDog , someOtherDog , someCat ) ;
I'm not sure there's a "benefit" or "advantage" per se? It's just the method that returns a singleton Set, and happens to be the default implementation when you want a singleton Collection as well, since a singleton Collection happens to be a mathematical set as well.
I wondered the same thing and came across your question in my research. Here is my conclusion:
Returning a Set keeps the Collections API clean.
Here are the methods for getting a singleton Collection:
public static <T> Set<T> singleton(T o)
public static <T> List<T> singletonList(T o)
public static <K,V> Map<K,V> singletonMap(K key, V value)
What if the API designers decided on having a singletonSet method and singleton method? It would look like this:
public static <T> Collection<T> singleton(T o)
public static <T> Set<T> singletonSet(T o)
public static <T> List<T> singletonList(T o)
public static <K,V> Map<K,V> singletonMap(K key, V value)
Is the singleton method really necessary? Let's think about why we would need some of these methods.
Think about when you would call singletonList? You probably have an API that requires List instead of Collection or Set. I will use this poor example:
public void needsList(List<?> list);
You can only pass a List. needsList hopefully needs the data indexed and is not arbitrarily requesting a List instead of a Collection.
However, you could also pass a List to a method that required any Collection:
public void needsAnyCollection(Collection<?> collection);
But if that is the case, then why use a List? A List has a more complicated API and involves storing indexes. Do you really need the indexes? Would a Set not suffice? I argue that you should use a Set, because needsAnyCollection does not care about the order.
This is where singletonSet really shines. You know that if the collection is of size 1 (singleton), then the data must be unique. Collections of size 1 are coincidentally a Set.
There is no need for a method which returns a singleton of type Collection, because it is accidentally a Set.
The reason singleton Collections exist is to provide a low memory collection if you know you are going to store 1 element and it is not going to be mutated. Especially in high volume services this can have significant impact due to garbage collection latency.
This applies for both Set.of("1"); and Collections.singleton("1");
Since Set is already a Collection, returning the more constrained contract is a good thing for users of the library.
You get additional functionality without paying anything for it.
And as a user, you should do the same as with any other API or library: declare your variables with the least specific contract you need.
So if the only thing you'll ever need to do with the structure is iterate over it in a loop, I'd suggest choosing Iterable instead of List, Set or Collection. Since a Collection is an Iterable, this will work out of the box.
Not all lists are random access; take LinkedList as a counter-example (random access in the API, not in the implementation). By the way, I agree with Louis Wasserman: a Set simply makes sense because it is closer to the mathematical definition; it just feels natural.
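The standard library marks this distinction with the java.util.RandomAccess marker interface. A tiny check, with a made-up helper name:

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.RandomAccess;

class RandomAccessCheck {
    // True when the list advertises fast indexed access.
    static boolean isRandomAccess(List<?> list) {
        return list instanceof RandomAccess;
    }

    public static void main(String[] args) {
        System.out.println(isRandomAccess(new ArrayList<String>()));
        System.out.println(isRandomAccess(new LinkedList<String>()));
    }
}
```

ArrayList implements the marker and LinkedList does not, even though both offer get(int index).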
I'm taking my first steps with generics, and I've just coded a generic function to compare two List objects, like this:
public static <T> List<T> diffAdded(List<T> source, List<T> dest) {
    List<T> ret = new ArrayList<T>();
    for (T element : dest) {
        if (!source.contains(element)) {
            ret.add(element);
        }
    }
    return ret;
}
Everything works fine, but I'm instantiating an ArrayList, because obviously I cannot instantiate the List interface.
The fact is that I want to return an object of the same type as source...
How do you handle this kind of situation?
Can I run into any cast trouble with the method as it is right now?
Thanks a lot
The answer to this question is almost always that "returning an object of the same type as the source" is not actually relevant to your application, and that you're going about things the wrong way.
If your caller needs a specific List implementation, because they'll be doing a specific kind of operation on it, then they can do the copying themselves...but if your method takes an arbitrary argument of type List, and the output changes semantically depending on the exact implementation of the input, then that's a huge code smell.
Leave your code as it is, and if your method's callers really, really need a specific implementation, then they can copy your output into that implementation themselves.
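For instance, a caller that really wants a LinkedList could do the copy itself, roughly like this. The class name DiffCaller and the sample data are invented; diffAdded is the method from the question:

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

class DiffCaller {
    // Same method as in the question: elements of dest that are not in source.
    static <T> List<T> diffAdded(List<T> source, List<T> dest) {
        List<T> ret = new ArrayList<>();
        for (T element : dest) {
            if (!source.contains(element)) {
                ret.add(element);
            }
        }
        return ret;
    }

    static LinkedList<String> demo() {
        List<String> before = List.of("a", "b");
        List<String> after = List.of("a", "b", "c");
        // The caller picks the implementation it needs by copying the result.
        return new LinkedList<>(diffAdded(before, after));
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```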
You've got two choices here.
1: Realize that the point of accepting the List<T> interface as your input type is that you are explicitly saying you don't care about the underlying implementation. Furthermore, returning a List<T> says that your caller shouldn't care about the underlying implementation either. In most cases, a List is a List and the details shouldn't matter. If they do, you should explicitly return ArrayList<T> instead.
2: Make a bunch of polymorphic calls that match every List implementation type that you want to support.
I very much think that the first answer is where you should direct your efforts.
Say I have an ArrayList that I have cast to an ArrayList of objects. I know that all the objects that were in the ArrayList I cast were of the same type, but not what the type was.
Now, if the ArrayList is not empty, I could take one of the objects in it and use the instanceof operator to learn what the actual type is. But what of the case where the ArrayList is empty? How do I determine what type Object actually is then? Is it possible?
Edit: In retrospect, I suppose it doesn't strictly matter what type an empty ArrayList holds. I can discard the old one and construct a new empty ArrayList of the proper type I was looking for in the first place.
It's a bit messy, so if anyone has alternate suggestions for how to allow a large variety of potential types which may not share a relevant common superclass (e.g. Integer and ArrayList), I'm open to suggestions. The thing is, I have a class that will contain other classes specifically designed to be interchangeably used within it. The contained classes can make assumptions as to the type, as they're the ones defining it. The surrounding class cannot make such assumptions, but must confirm that it is specifying the same type as the contained classes that have been slotted into it.
Thus my question, as the containing class is generic (the generic specifying what sort of type it is handling), but it must ensure the contained classes it was passed are returning and operating on the type that was specified to it (they are designed to be created separately and as such the fact that they match must be confirmed when they are slotted in, not at the time of their creation).
Nope, type erasure makes sure of that.
As Jeffrey mentioned, this is impossible because of type erasure. Your best choice, I guess, is to add an additional case for the empty list:
if (list.isEmpty()) {
    // process empty ...
} else {
    Object o = list.get(0);
    if (o instanceof Integer) {
        List<Integer> integers = (List<Integer>) list;
        // process Integer ...
    } else if (o instanceof String) {
        List<String> strings = (List<String>) list;
        // process String ...
    } // etc...
}
But beware! This instanceof chain is not generally considered good OO practice. Rather than passing around bare lists and then trying to guess their constituent types consider creating a wrapper class for the lists which may also hold a reference to a Class object. That way you may also be able to refactor your different processing algorithms as overriding a single process() method...
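One way such a wrapper might look is sketched below. TypedList and its methods are hypothetical names, not an existing library class:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical wrapper: pairs a list with the Class token of its element
// type, so the type can be queried even when the list is empty.
class TypedList<T> {
    private final Class<T> elementType;
    private final List<T> items = new ArrayList<>();

    TypedList(Class<T> elementType) {
        this.elementType = elementType;
    }

    Class<T> elementType() {
        return elementType;
    }

    void add(T item) {
        items.add(elementType.cast(item));  // cast() re-checks the type at run time
    }

    List<T> items() {
        return items;
    }

    // True when this list was declared with exactly the expected element type.
    boolean holds(Class<?> expected) {
        return elementType == expected;
    }

    public static void main(String[] args) {
        TypedList<?> unknown = new TypedList<>(Integer.class);  // still empty
        System.out.println(unknown.holds(Integer.class));
        System.out.println(unknown.holds(String.class));
    }
}
```

The containing class can then call holds(...) when a component is slotted in, instead of inspecting the first element.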