Java how to parametrize a generic method with a Set? - java

I have a method with such signature:
private <T> Map<String, byte[]> m(Map<String, T> data, Class<T> type)
When I invoke like this for example it is working fine:
Map<String, String> abc= null;
m(abc, String.class);
But when my parameter T is a Set it doesn't work:
Map<String, Set<String>> abc= null;
m(abc, Set.class);
Is there a way to make it work?

You're going to have to do something really ugly, using an unchecked cast like this:
m(abc, (Class<Set<String>>) (Class<?>) Set.class);
This comes down to type-erasure. At runtime Class<Set<String>> is the same as Class<Set<Integer>>, because we don't have reified generics, and so there is no way to know that what you have is a class for a "Set of strings" vs. a class for a "Set of integers".
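To make the erasure point concrete, here is a minimal sketch. The class name DoubleCastDemo and the body of m are assumptions made purely for illustration, since the question doesn't show what m actually does.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

class DoubleCastDemo {
    // Hypothetical stand-in for the question's m: it only reports which class token it received.
    static <T> String m(Map<String, T> data, Class<T> type) {
        return type.getName();
    }

    @SuppressWarnings("unchecked") // the double cast cannot be verified at runtime
    static String callWithSetOfStrings() {
        Map<String, Set<String>> abc = new HashMap<>();
        Class<Set<String>> token = (Class<Set<String>>) (Class<?>) Set.class;
        return m(abc, token);
    }

    @SuppressWarnings("unchecked")
    static boolean sameTokenForDifferentArguments() {
        Object strings = (Class<Set<String>>) (Class<?>) Set.class;
        Object integers = (Class<Set<Integer>>) (Class<?>) Set.class;
        return strings == integers; // both refer to the single Set.class object
    }
}
```

The cast only changes what the compiler believes; the object passed around is still plain Set.class.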
I asked a related question some time ago that should also give you some pointers:
Return a class instance with its generic type
IMO this confusion is due to the fact the generics were bolted on after the fact, and aren't reified. I think it's a failing of the language when the compiler tells you that the generic types don't match, but you don't have an easy way of even representing that particular type. For example, in your case you end up with the compile-time error:
m(abc, Set.class);
^
required: Map<String,T>,Class<T>
found: Map<String,Set<String>>,Class<Set>
reason: inferred type does not conform to equality constraint(s)
inferred: Set
equality constraints(s): Set,Set<String>
where T is a type-variable:
T extends Object declared in method <T>m(Map<String,T>,Class<T>)
Now it would be perfectly reasonable for you to think "Oh, I should use Set<String>.class then", but that is not legal. This is abstraction leakage from the implementation of generics in the language, specifically that they are subject to type-erasure. Semantically, Set<String>.class represents the runtime class instance of a set of strings. But actually at runtime we cannot represent the runtime class of a set of strings, because it is indistinguishable from a set that contains objects of any other type.
So we have a runtime semantic that is at odds with compile-time semantic, and knowing why Set<T>.class isn't legal requires knowing that generics are not reified at runtime. This mismatch is what leads to weird workarounds like these.
What compounds the problem is that class instances also ended up being conflated with type-tokens. Since you do not have access to the type of the generic parameter at runtime, the work around has been to pass in an argument of type Class<T>. On the surface this works great because you can pass in things like String.class (which is of type Class<String>) and the compiler is happy. But this method breaks down in your case: what if T itself represents a type with its own generic-type parameter? Now using classes as type-tokens is not useful because there is no way to distinguish between Class<Set<String>> and Class<Set<Integer>> because fundamentally, they are both Set.class at runtime and so share the same class instance. So IMO, using a class as a runtime type-token doesn't work as a general solution.
Due to this shortcoming in the language, there are some libraries that make it very easy to retrieve the generic type-information. In addition, they also provide classes that are better at representing the "type" of something:
TypeTools
Reflection Explained: Google Guava

The following signature works with super keyword. (I tested with Java7)
private <T> Map<String, byte[]> m(Map<String, T> data, Class<? super T> type)
Map<String, Set<String>> abc = null;
m(abc, Set.class);
This is subtyping for generics.
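A small sketch of why this compiles, and of what the looser bound gives up. SuperBoundDemo and the method bodies are placeholders I made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

class SuperBoundDemo {
    // Same shape as the signature above; the body is only a placeholder.
    static <T> String m(Map<String, T> data, Class<? super T> type) {
        return type.getSimpleName();
    }

    static String withSet() {
        Map<String, Set<String>> abc = new HashMap<>();
        // T is Set<String>; the raw Set is a supertype of it, so Class<Set> is accepted.
        return m(abc, Set.class);
    }

    static String withObject() {
        Map<String, Set<String>> abc = new HashMap<>();
        // The bound is loose: any supertype of T passes, including Object.
        return m(abc, Object.class);
    }
}
```

So the signature compiles without casts, but the token no longer pins down T exactly, which matters if the method needs the class to create or cast values.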

From what I see, there are two potential solutions to this problem in which both have their respective limitations.
The first solution relies on the fact that Java's type erasure is complete, meaning that type arguments are erased regardless of "depth". For example, a Map<String, Set<String>> is erased to the raw type Map, whose keys and values are plain Objects at runtime. So whilst type information is hard to obtain, it technically isn't needed at runtime, given that any object can be inserted into the Map (as long as it passes the casts the compiler inserts).
With this, we can create a relatively "ugly" (compared to the second solution) method of obtaining runtime type information through an instance present in the map. By doing so, regardless of how many sets you embed and what the resultant "type" is present after erasure, we can guarantee that an instance of it will be insertable back into the original map.
Demonstrated below:
// Java 7 approach
private <T> Map<String, byte[]> m(Map<String, T> data){
Class valueType = null;
Iterator<T> valueIterator = data.values().iterator();
while(valueIterator.hasNext()){
T nextCandidate = valueIterator.next();
if(nextCandidate != null){
valueType = nextCandidate.getClass();
break;
}
}
if(valueType == null){
// No instance present, fail
return null;
}
// Create a new instance
T obj = (T) valueType.newInstance(); // Exception handling not shown
// Rest of code here
return null;
}
As seen, the type information is extracted directly from the first non-null value present within the map. Under Java 8 we can do better using streams:
// Java 8 approach
private <T> Map<String, byte[]> m(Map<String, T> data){
// Note: use findFirst() for more consistent behaviour
Optional<T> optInstance = data.values().stream().filter(Objects::nonNull).findAny();
if(!optInstance.isPresent()){
// No instance present, fail
return null;
}
Class valueType = optInstance.get().getClass();
// Create a new instance
T obj = (T) valueType.newInstance(); // Exception handling not shown
// Rest of code here
return null;
}
However, this solution has a couple of limitations. First, as stated, the map has to contain at least one non-null value for the operation to be successful. Secondly, this solution doesn't take into account subclassing of the declared type (? extends T) by specific elements, which may prove to be problematic if you have elements of different classes (e.g. TreeSet and HashSet within the same map).
The second issue can be solved easily by dealing with type information on a key-value pair basis rather on a "whole" map basis though this comes at the cost of "knowing" the type information for all elements within the map. Alternatively, more complex solutions such as devising the most specific common superclass to all non-null values within the map can also be used, but for all intents and purposes, this becomes more of a guesstimate solution than a real one.
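The per-entry idea could look roughly like this. PerEntryTypes and freshCopies are hypothetical names, and the sketch assumes every value's class has an accessible no-argument constructor.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class PerEntryTypes {
    // Hypothetical helper: builds an empty instance of each value's own runtime class.
    static <T> Map<String, T> freshCopies(Map<String, T> data) {
        Map<String, T> result = new HashMap<>();
        for (Map.Entry<String, T> entry : data.entrySet()) {
            T value = entry.getValue();
            if (value == null) {
                continue; // no instance, so no type information for this key
            }
            try {
                @SuppressWarnings("unchecked")
                T fresh = (T) value.getClass().getDeclaredConstructor().newInstance();
                result.put(entry.getKey(), fresh);
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("No usable no-arg constructor for " + value.getClass(), e);
            }
        }
        return result;
    }

    static String demo() {
        Map<String, Set<String>> data = new HashMap<>();
        data.put("a", new TreeSet<>());
        data.put("b", new HashSet<>());
        Map<String, Set<String>> fresh = freshCopies(data);
        return fresh.get("a").getClass().getSimpleName() + "," + fresh.get("b").getClass().getSimpleName();
    }
}
```

Each key keeps its own concrete class, at the price of having nothing to work with for null values.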
The second solution to this problem is, in my opinion, a lot cleaner but poses additional complexity to the caller. This approach follows a more functional approach and can be applied if there are only a limited number of type-dependent operations within the method. Following your proposed case of instantiation of the generic type T, we can modify the method as follows:
private <T> Map<String, byte[]> m(Map<String, T> data, Callable<T> creator){
// Create a new instance
T obj = creator.call(); // Exception handling not shown
// Rest of code here
return null;
}
and called as follows:
Map<String, Set<String>> data = new HashMap<>();
// Instantiation method set to new HashSet (thanks to bayou.io for HashSet::new)
m(data, HashSet::new); // Note: replace with anonymous inner class for java 7
In this case, the need for type information (which is present at the level of the caller) can be bypassed by having the caller provide the type-dependent functionality required. The example provides a basic HashSet creation for all values, but more complex instantiation rules can be defined on a per-element basis.
The downside to this approach is that it adds complexity for the caller, which can be very bad if this were to be an external API function (though the use of private within your original method suggests otherwise). Java 7 and below also need quite a bit of boilerplate anonymous inner class code, making caller-side code harder to read. Additionally, if most of your method requires type information to be present then this solution is less feasible as well (since you'd be reprogramming most of your method on a per-type basis, defeating the point of using generics).
In all, I'd personally prefer to use the second approach if possible, only using the first approach if deemed infeasible. The gist of the solutions I'm getting at here is to not rely on type information when dealing with generics or at least set a bound such that you get functionality you require without ugly hacks. In the case where type-dependent operations have to be performed, have the caller provide the functionality for that (through Callables, Runnables or some FunctionalInterface of your creation).
If type information is absolutely critical for some reason not made apparent, I suggest reading this article to stop type erasure altogether, allowing type information to be present directly from within the method.

You'd need to do it like this:
Map<String, Set> abc = null; //gives a compiler warning
m(abc, Set.class);
The issue is that if you want T to be captured to Set<String>, there will be no way to express Class<T> since there's no such thing as Set<String>.class, just Set.class.

Related

How to specify the generic type of a collection?

I want to define a function which can convert a kind of Set to another, like convert HashSet to LinkedHashSet. Here is the function declaration. The sp is sharedpreferences.
public Set<String> decodeStringSet(String key, @Nullable Set<String> defaultValue, Class<? extends Set> cls){
Set<String> result = sp.getStringSet(key, defaultValue);
if(result == null){
return defaultValue;
}
else {
String[] array = result.toArray(new String[0]);
Set<String> a;
try {
a = cls.newInstance();
} catch (IllegalAccessException | InstantiationException var7) {
return defaultValue;
}
a.addAll(Arrays.asList(array));
return a;
}
}
However, the compiler warns me about "Unchecked assignment: '? extends java.util.Set' to 'java.util.Set<java.lang.String>'" on "a = cls.newInstance();". I don't know how to change cls to cls<java.lang.String>.
The warning is unavoidable. Isolate it in a helper method and toss the appropriate @SuppressWarnings at it. Or, refactor how this thing works. In general, the generics of Class<?> are weird and don't work well; if you try to write code that relies on the generics part to make it work, it's likely to result in many situations where you can't avoid these warnings, and the API is suboptimal. [1]
One tricky way to do what you're trying to do here in a one-size-fits-all way is so-called Super Type Tokens. You can search the web for this concept, because for what you're specifically doing here, STTs are overkill. What you are looking for, is a supplier.
You want the caller not to pass you the type of a set. No. You want the caller to pass you a piece of code that, if executed, creates the set.
While we're at it, let's get rid of the array, you're shifting the elements through that array for absolutely no sensible reason.
// defaultValue is typed as S so that it can be returned as-is when nothing is stored
public <S extends Set<String>> S decodeStringSet(String key, @Nullable S defaultValue, Supplier<S> setMaker) {
Set<String> result = sp.getStringSet(key, defaultValue);
if(result == null) return defaultValue;
S a = setMaker.get();
a.addAll(result);
return a;
}
This code can be used as follows:
LinkedHashSet<String> mySet = decodeStringSet("myKey", null, LinkedHashSet::new);
Perhaps you're unfamiliar with this syntax. new LinkedHashSet() will, when you run that code, create a LinkedHashSet. In contrast, LinkedHashSet::new will, when you run that code, produce an object that can be asked to create a LinkedHashSet, by invoking its get() method. One does the act right this very moment. The other wraps 'do the act' into a little machine. You can hand the machine to other code, or press the button on the machine to make it do the act, and you can press the button as often as you feel like.
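Here is a small sketch of the "machine with a button" idea that runs without Android. SupplierDemo and copyInto are made-up names, and copyInto only mirrors the copying part of the method above (no SharedPreferences involved).

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.function.Supplier;

class SupplierDemo {
    // Hypothetical helper: the caller decides which Set implementation gets created.
    static <S extends Set<String>> S copyInto(Set<String> source, Supplier<S> setMaker) {
        S target = setMaker.get(); // "press the button"
        target.addAll(source);
        return target;
    }

    static boolean eachCallMakesANewSet() {
        Supplier<LinkedHashSet<String>> maker = LinkedHashSet::new; // nothing is created yet
        return maker.get() != maker.get(); // two presses, two distinct sets
    }

    static String demo() {
        Set<String> source = new HashSet<>(Arrays.asList("a", "b"));
        LinkedHashSet<String> copy = copyInto(source, LinkedHashSet::new);
        return copy.getClass().getSimpleName() + ":" + copy.size();
    }
}
```

No cast and no warning is needed at the call site, because S is inferred from the constructor reference.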
[1] Need some more explanations as to why relying on the generics of j.l.Class is awkward and not a good idea?
A class object simply cannot, itself, represent generics, whereas generics can represent generics. That is: List<List<String>> is perfectly fine. However, Class<List<String>> does not make sense. You can write it (j.l.Class does not have hardcoded rules to keep sanity alive in the langspec), but it doesn't represent anything: there's just one class object that represents the type j.u.List. This one object cannot therefore represent the generics; you can't have one class object representing List<String> and another representing List<Integer>. Less important, but still annoying: there are things class objects can represent that generics cannot. int.class is typed as Class<Integer>, but this isn't quite right.
Hence, in your example, the compiler considers Class<? extends Set> problematic; it's got a raw type inside the generics. However, it is technically correct, in that it is not possible to represent e.g. a Set<T>, merely 'a Set, whose generics are unknown, given that j.l.Class objects cannot represent them'.
Lastly, classes basically only produce (the P in PECS - which explains what the difference is between <Number>, <? extends Number>, and <? super Number>); it is mentally difficult to fathom the difference between Class<? extends String> and Class<String>, because it's an irrelevant difference, given that j.l.Class only produces. And yet, often you really do need to write Class<? extends String> because if you don't, the compiler refuses to compile your code for imaginary, irrelevant reasons. That's because, again, j.l.Class is not hardcoded in the lang spec: The compiler does not know that there is no effective distinction between Class<T> and Class<? extends T>, and java does not have a way to mark off a given generics param as forced Produces-only or some such.
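Two of the quirks from this footnote, as a sketch you can compile. ClassTokenQuirks and castWith are hypothetical names used only for this illustration.

```java
class ClassTokenQuirks {
    // Accepting Class<? extends T> lets callers pass whatever getClass() returns.
    static <T> T castWith(Class<? extends T> type, Object value) {
        return type.cast(value);
    }

    static String viaGetClass() {
        CharSequence text = "hello";
        // getClass() is typed Class<? extends CharSequence>; declaring Class<CharSequence> here would not compile.
        Class<? extends CharSequence> runtimeType = text.getClass();
        CharSequence result = castWith(runtimeType, text);
        return runtimeType.getSimpleName() + ":" + result;
    }

    static boolean intClassIsNotIntegerClass() {
        Class<Integer> primitive = int.class; // compiles, although it stands for the primitive type
        return primitive.isPrimitive() && primitive != Integer.class;
    }
}
```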

What is the purpose of @CompatibleWith annotation from Guava?

From the documentation of com.google.errorprone.annotations.CompatibleWith:
Declares that a parameter to a method must be "compatible with" one of the type parameters in the
method's enclosing class, or on the method itself. "Compatible with" means that there can exist a
"reference casting conversion" from one type to the other (JLS 5.5.1).
For example, Collection.contains(java.lang.Object) would be annotated as follows:
interface Collection<E> {
boolean contains(@CompatibleWith("E") Object o);
}
To indicate that invocations of Collection.contains(java.lang.Object) must be passed an argument whose type is compatible with the generic type argument of the Collection instance:
Here is a usage from com.google.common.cache.Cache:
public interface Cache<K, V> {
V getIfPresent(@CompatibleWith("K") Object key);
V get(K key, Callable<? extends V> loader) throws ExecutionException;
...
What is the benefit of having @CompatibleWith("E") Object instead of E as the type of the parameter? And why did they use the @CompatibleWith annotation in the getIfPresent but not in the get method from Cache?
It's safe for the getIfPresent operation to accept objects of a "too broad" type (getIfPresent(42) simply returns nothing from a cache with String keys). On the other hand, a hypothetical get(Object, Callable) that allowed inserting an object of the wrong type (e.g. 42 instead of the string "foo") would damage the underlying collection, which is why compile-time checking doesn't allow it.
Having said that, this code:
Cache<String, Foo> cache = CacheBuilder.newBuilder().build();
// and later
Foo value = cache.getIfPresent(42);
is most probably wrong, and it makes sense for a framework like Error Prone to signal that as a possible bug.
More detailed clarification about "use Object not generic type in safe operations" convention (which isn't used only in Guava, but also in JDK collections framework) is explained in this old, but still relevant blog post "Why does Set.contains() take an Object, not an E?", where you read:
Why should code like the following compile?
Set<Long> set = new HashSet<Long>();
set.add(10L);
if (set.contains(10)) {
// we won't get here!
}
We're asking if the set contains the Integer ten; it's an "obvious"
bug, but the compiler won't catch it because Set.contains() accepts
Object. Isn't this stupid and evil?
and later answers the question in title:
The real difference is that add() can cause "damage" to the collection when called with the wrong type, and contains() and remove() cannot.
The conclusion is also relevant:
Static analysis plays an extremely important role in the construction of bug-free software.
Which makes sense, because the author, Kevin Bourrillion, is also lead developer of Guava.
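The quoted Set<Long> example is easy to run with plain JDK collections; ContainsDemo is just a name I picked for the sketch.

```java
import java.util.HashSet;
import java.util.Set;

class ContainsDemo {
    static boolean containsIntTen() {
        Set<Long> set = new HashSet<>();
        set.add(10L);
        return set.contains(10); // 10 is boxed to an Integer, which never equals a Long
    }

    static boolean containsLongTen() {
        Set<Long> set = new HashSet<>();
        set.add(10L);
        return set.contains(10L);
    }

    static int sizeAfterHarmlessLookup() {
        Set<Long> set = new HashSet<>();
        set.add(10L);
        set.contains("not a number"); // a pointless lookup, but the set is left intact
        return set.size();
    }
}
```

The wrong-typed lookups compile and quietly return false, which is exactly the kind of call a static check like @CompatibleWith is meant to flag.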

When is it acceptable to pass a Class<T> argument to a generic method?

Methods that are generic using the T parameter can for sure be handy. However, I am curious what the use of a generic method would be if you pass an argument such as Class<T> clazz to the method. I've come up with a case that could be a possible use. Perhaps you only want to run a part of the method based on the type of class. For example:
/** load(File, Collection<T>, Class<T>)
* Creates an object T from an xml. It also prints the contents of the collection if T is a House object.
* @return T
* @throws Exception
*/
private static <T> T load(File xml, Collection<T> t, Class<T> clazz) throws Exception{
T type = (T) Jaxb.unmarshalFile(xml.getAbsolutePath(), clazz); // This method accepts a class argument. Is there an alternative to passing the class here without "clazz"? How can I put "T" in replace of "clazz" here?
if (clazz == House.class) {
System.out.println(t.toString());
} else {
t.clear();
}
return type;
}
Is this an accepted practice? When is the Class<T> clazz argument useful with generic methods?
Is this an accepted practice?
Well, to me.. no not really. To me, it seems somewhat pointless when you can simply define some boundaries on the type of T. For example:
private static <T extends House> void load(Collection<T> t)
This will guarantee that the object is either of type House or of a subclass of House, but then again if you only want an instance of type House or its subclasses, it should really just be:
private static void load(Collection<House> houses)
The idea of generics is to make a method or a class more malleable and extensible, so to me it seems counter-intuitive to start comparing class types in the method body, when the very notion of generics is to abstract away from such details.
I'd only pass class objects if the generic type could not be derived otherwise. In your case, the compiler should be able to infer T from the collection. To treat specific objects differently, I'd use polymorphism - e.g. House#something() and Other#something(), and just call anyObject.something().
I think it is acceptable, but if it can be avoided then you should avoid it. Typically, if you can have different methods which accept different types, then do that instead of one method which uses if clauses to do something different depending on the type of the parameter. You could also delegate to the class the operation you want to make specific for a given type.
In your case, you could simply test the type of each element of the collection using instanceof, to do what you need for the specific type. But it won't work if the list is empty.
A typical use is if you need to get the type to create it and you can find it from another way. For instance, Spring uses it to load a bean from its name:
<T> T getBean(Class<T> requiredType)
In that case, it cannot be avoided (without having to cast).
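A minimal sketch of the same idea without Spring: the Class<T> token both tells the method what to build and gives the caller a typed result. Instances.create is a hypothetical helper, not how Spring implements getBean, and it assumes the class has an accessible no-argument constructor.

```java
class Instances {
    // Hypothetical factory: the token drives the reflection and fixes the return type.
    static <T> T create(Class<T> type) {
        try {
            return type.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot instantiate " + type.getName(), e);
        }
    }

    static int demo() {
        StringBuilder sb = create(StringBuilder.class); // no cast at the call site
        return sb.append("ok").length();
    }
}
```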
If the returned value or other parameters types are dependent or need to be equal, generics will add compile time checks, so that there's no need to cast to T.
Examples
<T> T createNewInstanceOfType(Class<T> type);
<T> void addValueToCollection(Collection<T> collection,T value);
<T> List<Class<? extends T>> findSubClassesInClasspath(Class<T> superType);
Raw types
It is still possible to defer a casting error until runtime (ClassCastException) with some casts, e.g. with implicit casts from non-generic (raw) types to generic ones:
List nonGenericList = new ArrayList();
nonGenericList.add(new Integer(42));
List<String> wreckedList = nonGenericList;
The compiler will generate a bunch of warnings, unless you suppress them with annotations or compiler settings.
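To see where the deferred failure actually surfaces, here is a runnable sketch; RawTypeDemo is a made-up name.

```java
import java.util.ArrayList;
import java.util.List;

class RawTypeDemo {
    @SuppressWarnings({"rawtypes", "unchecked"})
    static boolean failsOnlyWhenRead() {
        List raw = new ArrayList();
        raw.add(Integer.valueOf(42));
        List<String> wrecked = raw; // unchecked conversion: compiles, nothing fails yet
        try {
            int length = wrecked.get(0).length(); // the compiler-inserted cast to String fails here
            return false;
        } catch (ClassCastException expected) {
            return true;
        }
    }
}
```

The assignment itself never throws; the exception appears later, at the first place the element is used as a String.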
Compiler Settings (Eclipse):
For example, the usage of raw types generates a warning by default; one can also treat warnings as errors, or even as fatal errors.
You would pass a Class<T> argument in generics if, and only if, you would pass a Class argument before generics. In other words, only if the Class object is used in some way. Generics serves as a compile-time type checking tool. However, what arguments you pass should be determined by the runtime logic of the program, and should be irrelevant of generics.
I haven't seen passing a Class object in order to check the runtime type of an object as a common use case for generics. If you're doing that, there's a good chance that there's a better way to set up your class structure.
What I have seen is if you need to create a new instance of the class in question, or otherwise use reflection. In that case you do have to pass the Class object, because Java cannot derive it at runtime thanks to type erasure.
In your case actually having the Generic parameter is not strictly needed.
Since the output of the function you are describing does not depend on the type of the input you might as well use wild cards.
private static void stuff(Collection<?> t){
Object next = t.iterator().next(); //this is ugly and inefficient though
if(next instanceof House){
System.out.print(next.toString());
}else{
t.clear();
}
}
The only time you should use generic parameter is when the type of the result of a function will be dependent of the type of the parameters.
You will need to pass the Class corresponding to the type when your code will need it; most of the time this happens when:
- You need to cast/type check objects to T
- There is serialization/deserialization involved.
- You cannot access any instance of T in your function and you cannot call the getClass() method when you need it.
Passing a Class on every generic function will result in you passing an unnecessary parameter most of the time, which is regarded as bad practice.
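As an illustration of the "cast/type check objects to T" case, here is a sketch where the Class object is genuinely used at runtime. TypeFilter and filterByType are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

class TypeFilter {
    // The token is needed because T itself is erased and cannot be tested with instanceof.
    static <T> List<T> filterByType(Collection<?> items, Class<T> type) {
        List<T> matches = new ArrayList<>();
        for (Object item : items) {
            if (type.isInstance(item)) {
                matches.add(type.cast(item)); // checked cast, no compiler warning
            }
        }
        return matches;
    }

    static String demo() {
        List<Object> mixed = Arrays.<Object>asList(1, "a", 2.0, "b");
        return String.join(",", filterByType(mixed, String.class));
    }
}
```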
I answered a similar discussion in the past:
When to use generic methods and when to use wild-card?

Uses for the strange-looking explicit type argument declaration syntax in Java

I recently came upon the strange syntax for explicitly declaring generic types when calling Java methods. For example:
Collections.<String>emptyList();
returns an empty List<String>. However, this seems silly as the implementation of <T> emptyList() is just the unchecked type cast (List<T>) EMPTY_LIST, such that all results have the same type erasure (and are the same object.) Moreover, this sort of explicit type declaration is usually not needed because the compiler can often infer the types:
List<String> empty = Collections.emptyList();
After doing some more digging I found two other times where you'd want to use this syntax, and they're all due to using the Guava library and apparently trying to put too many statements on one line.
Decorating a collection, for example with a synchronized wrapper, and the compiler being not able to infer the types. The following doesn't work if you take out the type declaration: cannot convert from Set<Object> to Set<String>:
Set<String> set = Collections.synchronizedSet(Sets.<String>newHashSet());
Getting less specific type parameters when the compiler tries to make ones that are too specific. For example, without the type declaration the following statement complains as well: cannot convert from Map<String, String> to Map<String, Object>:
Map<String, Object> toJson = ImmutableMap.<String, Object>of("foo", "bar");
I find it ironic that in the first case the inferred type parameters are too general and in the second case they are too specific, but I suppose that is just an artifact of the generics system in Java.
However, this language construct itself seems to be avoidable except in these strange use cases invented by the Guava team. Moreover, it seems plain to me that there is a way for the compiler to infer type arguments in both the above examples, and the developers just chose not to do so. Are there examples of it ever being necessary or useful to use this construct in Java programming or does it exist solely to make the compiler simpler / JDK developer's life easier?
How is "shutting up the compiler" not "necessary or useful?" I find it both necessary and useful for my code to compile.
There are times when the correct type cannot be inferred, as you have already found. In such cases, it is necessary to explicitly specify the type parameters. Some examples of the compiler just not being smart enough:
Why can't javac infer generic type arguments for functions used as arguments?
Generics type inference fails?
And if you really want to dig into the complexities of type inference, it starts and ends with the Java Language Specification. You'll want to focus on JLS §15.12.2.7. Inferring Type Arguments Based on Actual Arguments and §15.12.2.8. Inferring Unresolved Type Arguments.
I found at least one case where the compiler infers the types correctly, and it's still needed: when you want to use the result as a more generic type. Take this method, which basically creates a List<T> from zero or more T objects:
public static <T> List<T> listOf(T... items) {
ArrayList<T> list = new ArrayList<T>();
for (T item : items)
list.add(item);
return list;
}
The idea is that you can use it like this:
List<Integer> numbers = ListUtils.listOf(1, 2, 3);
Now, suppose you have a method that can receive List<Object>:
public static void a(List<Object> objs) {
...
}
and that you want to supply a list built via the listOf() method:
a(ListUtils.listOf(1, 2, 3));
This will not compile, as the method parameter type is List<Object> and the supplied argument is List<Integer>. In that case, we can change the invocation to:
a(ListUtils.<Object>listOf(1, 2, 3));
which does compile, as expected.
Java type inference is incredibly weak. The only time it is not necessary to include the explicit type in a generic method like emptyList() is when the result of the method defines a variable. If you try to pass an empty list as the argument of another method (example 1), a situation which arises for me on a daily basis (and I do not yet use Guava), the compiler just gives up on type inference completely. I fail to see how declaring the empty list as a local, single-use variable is "putting too many statements on one line" as you call it; the empty list is a very simple sub-expression, except that Java's miserable type inference makes it complex. Compare with Scala, which will do inference in 3 different situations.
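For reference, a sketch of the explicit type argument syntax in two positions. As far as I know, Java 8 extended target typing to method arguments, so many of the calls discussed in this thread compile there without the witness; a chained call is one place where it still seems to be required. TypeWitnessDemo and its methods are made-up names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class TypeWitnessDemo {
    @SafeVarargs
    static <T> List<T> listOf(T... items) {
        List<T> list = new ArrayList<>();
        for (T item : items) {
            list.add(item);
        }
        return list;
    }

    static int size(List<Object> objs) {
        return objs.size();
    }

    static int withWitness() {
        // Forces T to be Object instead of Integer.
        return size(TypeWitnessDemo.<Object>listOf(1, 2, 3));
    }

    static String chained() {
        // Without <String>, the receiver would be a List<Object> and orElse would return Object.
        return Collections.<String>emptyList().stream().findFirst().orElse("none");
    }
}
```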

How to do `MyClass<String>.class` in Java?

How can I call public <T> T doit(Class<T> clazz); using MyClass<String>.class as clazz, where I cannot instantiate or extend MyClass?
EDIT: 'David Winslow' and 'bmargulies' responses are correct (MyClass<String>) doit(MyClass.class); works for the original question BUT surprisingly when the method returns say MyClass<T> instead of T casting will not compile any more.
Edit: I have replaced List with MyClass and added the condition to my original question.
Use List.class. Because of type erasure type parameters to Java classes are entirely a compile-time construct - even if List<String>.class was valid syntax, it would be the exact same class as List<Date>.class, etc. Since reflection is by nature a runtime thing, it doesn't deal well with type parameters (as implemented in Java).
If you want to use the Class object to (for example) instantiate a new List instance, you can cast the result of that operation to have the appropriate type parameter.
List<String> list = (List<String>)(ArrayList.class.newInstance());
I've seen similar questions asked several times, for example
Acquiring generic class type
There are legitimate reasons to construct static generic types. In the OP's case, he would probably like to write
MyClass<String> result = doit(MyClass<String>.class);
Without language syntax support, casting is the correct way to go. If this is needed quite often, the casting should be put in a method, as
public class MyClass<T>
{
@SuppressWarnings("unchecked")
// may need a better method name
static public <T2> Class<MyClass<T2>> of(Class<T2> tClass)
{
return (Class<MyClass<T2>>)(Class<?>)(MyClass.class);
}
}
MyClass<String> result = doit(MyClass.of(String.class)); // no warning
We can suppress the warning on that method alone, after making sure the cast is safe. No call site will see the warning.
This is all compile time casting game. At runtime all the type parameters are erased, and really only the naked class object is passed around. The of method will most likely be optimized off, so to JVM the last line is nothing but
MyClass result = doit(MyClass.class)
There are also times when at runtime we need a complete MyClass<String> type. A ParameterizedType object needs to be obtained to represent MyClass<String>.
When the two requirements are combined together, that is, we need a compile time expression regarding MyClass and String that will evaluate at runtime to a ParameterizedType
ParameterizedType type_MyClass_String = ???? MyClass ?? String ???
There is a technique involving an anonymous subclass of MyClass<String>
ParameterizedType type_MyClass_String = superTypeOf( new MyClass<String>(){} );
which I find quite disturbing.
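For the curious, this is roughly what such a superTypeOf could look like using only java.lang.reflect. The MyClass below is an empty stand-in and the helper body is my assumption; the answer above doesn't show its implementation. Note that it needs MyClass to be extendable, which the question rules out.

```java
import java.lang.reflect.ParameterizedType;

class MyClass<T> { } // hypothetical, extendable stand-in

class SuperTypeDemo {
    // The anonymous subclass records "extends MyClass<String>" in its class file,
    // and that declaration survives erasure.
    static ParameterizedType superTypeOf(Object anonymousSubclassInstance) {
        return (ParameterizedType) anonymousSubclassInstance.getClass().getGenericSuperclass();
    }

    static String typeArgument() {
        ParameterizedType type = superTypeOf(new MyClass<String>() { });
        return type.getActualTypeArguments()[0].getTypeName();
    }

    static boolean rawTypeIsMyClass() {
        return superTypeOf(new MyClass<String>() { }).getRawType() == MyClass.class;
    }
}
```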
See http://jackson.codehaus.org/1.7.0/javadoc/org/codehaus/jackson/type/TypeReference.html and the references that it references for a comprehensive discussion of the issues around generics.
The bottom line is that, if you really want to work with generic types in this way, you have to stop using Class and start using Type and its subclasses.
Contrary to your comment on another answer, you can write List<List<String>> obj = (List<List<String>>) doit(List.class);, you just can't avoid a warning when you write it.
Since after your update your question does not appear to be an exact duplicate:
You would need to call getClass() on an instance of MyClass. Better have a dummy static final instance somewhere:
public static final MyClass INSTANCE = new MyClass();
...
return (Class<MyClass<String>>) INSTANCE.getClass();
T corresponds to List, so any reference to String as the generic parameter of List is irrelevant.
How to do MyClass<String>.class in Java?
You can't.
Generics in Java use type erasure; the type of the parametrized argument is enforced during compilation, but it is lost after compilation. The resulting byte code for an instance of a generic class does not contain any run-time meta-data on its arguments whatsoever.
As it is now, it is just not possible, a major language design blunder IMO.
