Why is there no instance-level Stream.concat method in Java?

I know that Stream.concat exists (doc) to concatenate two streams. However, I have run into cases where I need to add "a few more" items to an existing stream, and then continue processing on it. In such a situation, I would have expected to be able to chain together methods like:
getStream(someArg)
.map(Arg::getFoo)
.concat(someOtherStreamOfFoos) // Or append, or...
.map(...)
However, no such instance-level chainable append/concat method exists.
This isn't a question asking for solutions to this problem, or more elegant approaches (although I would of course be grateful for any other viewpoints!). Rather, I'm asking about the design factors that led to this decision. The Stream interface was, I trust, designed by some extremely smart people who are aware of the Principle of Least Astonishment - so, I must assume that their decision to omit this (to me) intuitively-obvious method signifies either that the method is an antipattern, or that it is not possible due to some technical limitation. I'd love to know the reason.

I can give you one reason it wouldn't have worked.
Stream.concat is defined as
static <T> Stream<T> concat(Stream<? extends T> a,
Stream<? extends T> b)
You can concat a Stream<HashMap> and Stream<Map> into a Stream<Map>, or even concat a Stream<HashMap> and a Stream<TreeMap> into a Stream<Map>. To do that with an instance method, you would need to be able to declare a type parameter like <U super T>, which Java doesn't allow.
// It'd look kind of like this, if Java allowed it.
public <U super T> Stream<U> concat(Stream<? extends U> other)
Java only allows upper-bounded type parameters, not lower-bounded.
Concatenating a Stream<Something> and a Stream<SomethingElse> might seem unusual, but type inference often produces type parameters too specific to work with an instance method. For example,
Stream.concat(Stream.of(dog), animalStream)
which would require an explicit type parameter if written as
Stream.<Animal>of(dog).concat(animalStream)
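A minimal sketch of that inference behaviour, assuming hypothetical Animal and Dog classes (they are not part of the original question). The static method lets the compiler pick T = Animal from both arguments together, whereas a chained call would have to start from a Stream<Dog> or use an explicit type witness:

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical types used only for illustration.
class Animal {
    final String name;
    Animal(String name) { this.name = name; }
}

class Dog extends Animal {
    Dog(String name) { super(name); }
}

class ConcatInference {
    // With the static method, T is inferred from both arguments together:
    // Stream<Dog> and Stream<Animal> both fit Stream<? extends Animal>.
    static List<Animal> prepend(Dog dog, Stream<Animal> animals) {
        Stream<Animal> all = Stream.concat(Stream.of(dog), animals);
        return all.collect(Collectors.toList());
    }

    // Stream.of(dog) on its own is a Stream<Dog>; widening it at the call
    // site needs an explicit type witness, as a chained call would.
    static Stream<Animal> widened(Dog dog) {
        return Stream.<Animal>of(dog);
    }
}
```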

I think this is simply functionality that is missing from the Stream API.
Note that RxJava's Observable has a concatWith method that provides the required functionality, so your question is reasonable:
Observable<String> o1 = ...;
Observable<Integer> o2 = ...;
o1.map(s -> s.length())
.concatWith(o2)
....
Another nice-to-have feature that Java 8 lacks is getting another Optional when the current Optional is empty, like:
Optional.ofNullable(x).orElse(anotherOptional)
What I want to say is that the concat you described is possible; it is just not implemented in Stream.
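For the Optional case, a small helper can approximate the behaviour on Java 8. This is only a sketch; the class and method names (OptionalFallback, orElseOptional) are made up for illustration. Java 9 later added Optional.or(Supplier), which covers the same need in a chainable form.

```java
import java.util.Optional;

class OptionalFallback {
    // Returns the first Optional if it holds a value, otherwise the fallback.
    // Hypothetical helper, not part of the JDK.
    static <T> Optional<T> orElseOptional(Optional<T> first, Optional<T> fallback) {
        return first.isPresent() ? first : fallback;
    }
}
```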

Related

Java how to parametrize a generic method with a Set?

I have a method with such signature:
private <T> Map<String, byte[]> m(Map<String, T> data, Class<T> type)
When I invoke like this for example it is working fine:
Map<String, String> abc= null;
m(abc, String.class);
But when my parameter T is a Set it doesn't work:
Map<String, Set<String>> abc= null;
m(abc, Set.class);
Is there a way to make it work?
You're going to have to do something really ugly, using an unchecked cast like this:
m(abc, (Class<Set<String>>) (Class<?>) Set.class);
This comes down to type-erasure. At runtime Class<Set<String>> is the same as Class<Set<Integer>>, because we don't have reified generics, and so there is no way to know that what you have is a class for a "Set of strings" vs. a class for a "Set of integers".
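A short sketch of what erasure means here. The class name ErasureDemo and its methods are made up for illustration; the point is only that both sets share one runtime class, and that the double cast produces a token the compiler accepts:

```java
import java.util.HashSet;
import java.util.Set;

class ErasureDemo {
    // The double cast silences the type system; at runtime the result
    // is still just Set.class.
    @SuppressWarnings("unchecked")
    static Class<Set<String>> setOfStringsToken() {
        return (Class<Set<String>>) (Class<?>) Set.class;
    }

    // Both sets are plain HashSet instances at runtime, so their
    // class objects are identical.
    static boolean sameRuntimeClass() {
        Set<String> strings = new HashSet<>();
        Set<Integer> ints = new HashSet<>();
        return strings.getClass() == ints.getClass();
    }
}
```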
I asked a related question some time ago that should also give you some pointers:
Return a class instance with its generic type
IMO this confusion is due to the fact the generics were bolted on after the fact, and aren't reified. I think it's a failing of the language when the compiler tells you that the generic types don't match, but you don't have an easy way of even representing that particular type. For example, in your case you end up with the compile-time error:
m(abc, Set.class);
^
required: Map<String,T>,Class<T>
found: Map<String,Set<String>>,Class<Set>
reason: inferred type does not conform to equality constraint(s)
inferred: Set
equality constraints(s): Set,Set<String>
where T is a type-variable:
T extends Object declared in method <T>m(Map<String,T>,Class<T>)
Now it would be perfectly reasonable for you to think "Oh, I should use Set<String>.class then", but that is not legal. This is abstraction leakage from the implementation of generics in the language, specifically that they are subject to type-erasure. Semantically, Set<String>.class represents the runtime class instance of a set of strings. But actually at runtime we cannot represent the runtime class of a set of strings, because it is indistinguishable from a set that contains objects of any other type.
So we have a runtime semantic that is at odds with compile-time semantic, and knowing why Set<T>.class isn't legal requires knowing that generics are not reified at runtime. This mismatch is what leads to weird workarounds like these.
What compounds the problem is that class instances also ended up being conflated with type-tokens. Since you do not have access to the type of the generic parameter at runtime, the work around has been to pass in an argument of type Class<T>. On the surface this works great because you can pass in things like String.class (which is of type Class<String>) and the compiler is happy. But this method breaks down in your case: what if T itself represents a type with its own generic-type parameter? Now using classes as type-tokens is not useful because there is no way to distinguish between Class<Set<String>> and Class<Set<Integer>> because fundamentally, they are both Set.class at runtime and so share the same class instance. So IMO, using a class as a runtime type-token doesn't work as a general solution.
Due to this shortcoming in the language, there are some libraries that make it very easy to retrieve the generic type information. In addition, they also provide classes that are better at representing the "type" of something:
TypeTools
Reflection Explained: Google Guava
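The idea behind such libraries can be sketched with plain JDK reflection. The TypeRef class below is a hypothetical illustration of the "super type token" technique, not the API of TypeTools or Guava: an anonymous subclass records its generic superclass in its class metadata, and that information is still readable at runtime.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;

// Hypothetical helper: create it as an anonymous subclass,
// e.g. new TypeRef<Set<String>>() {}
abstract class TypeRef<T> {
    private final Type type;

    protected TypeRef() {
        // The anonymous subclass's generic superclass is TypeRef<Set<String>>,
        // which keeps the full type argument.
        Type superclass = getClass().getGenericSuperclass();
        if (!(superclass instanceof ParameterizedType)) {
            throw new IllegalStateException("TypeRef must be created with a type argument");
        }
        this.type = ((ParameterizedType) superclass).getActualTypeArguments()[0];
    }

    Type getType() {
        return type;
    }
}
```

Unlike Class<T>, the captured Type can tell a set of strings apart from a set of integers, at the cost of one anonymous class per call site.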
The following signature works with the super keyword (I tested with Java 7):
private <T> Map<String, byte[]> m(Map<String, T> data, Class<? super T> type)
Map<String, Set<String>> abc = null;
m(abc, Set.class);
This is subtyping for generics.
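A self-contained sketch of why this compiles: T is inferred as Set<String> from the map, and the raw type Set is a supertype of Set<String>, so Set.class satisfies Class<? super T>. The class SuperBoundDemo and the method body are assumptions made for illustration, not the original m:

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class SuperBoundDemo {
    // Same shape as the signature above; the body is only a placeholder.
    static <T> String describe(Map<String, T> data, Class<? super T> type) {
        return type.getSimpleName() + ":" + data.size();
    }

    static String demo() {
        Map<String, Set<String>> abc = new HashMap<>();
        Set<String> values = new HashSet<>();
        values.add("a");
        abc.put("k", values);
        // No cast needed: Class<Set> matches Class<? super Set<String>>.
        return describe(abc, Set.class);
    }
}
```

The trade-off is that the token only promises some supertype of T, so the method cannot use it to create or cast to T safely.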
From what I see, there are two potential solutions to this problem in which both have their respective limitations.
The first solution relies on the fact that Java's type erasure is complete, meaning that the type arguments of any parameterized type are erased regardless of "depth". For example, a Map<String, Set<String>> gets reduced to Map<String, Set> and then Map<Object, Object>, meaning that whilst type information is hard to obtain, it technically isn't needed at runtime, given that any object can be inserted into the Map (provided that it passes all class casts).
With this, we can create a relatively "ugly" (compared to the second solution) method of obtaining runtime type information through an instance present in the map. By doing so, regardless of how many sets you embed and what the resultant "type" is present after erasure, we can guarantee that an instance of it will be insertable back into the original map.
Demonstrated below:
// Java 7 approach
private <T> Map<String, byte[]> m(Map<String, T> data){
    Class valueType = null;
    Iterator<T> valueIterator = data.values().iterator();
    while(valueIterator.hasNext()){
        T nextCandidate = valueIterator.next();
        if(nextCandidate != null){
            valueType = nextCandidate.getClass();
            break;
        }
    }
    if(valueType == null){
        // No instance present, fail
        return null;
    }
    // Create a new instance
    T obj = (T) valueType.newInstance(); // Exception handling not shown
    // Rest of code here
    return null;
}
As seen, the type information is extracted directly from the first non-null value present within the map. Under Java 8 we can do better using streams:
// Java 8 approach
private <T> Map<String, byte[]> m(Map<String, T> data){
    // Note: use findFirst() for more consistent behaviour
    Optional<T> optInstance = data.values().stream().filter(Objects::nonNull).findAny();
    if(!optInstance.isPresent()){
        // No instance present, fail
        return null;
    }
    Class valueType = optInstance.get().getClass();
    // Create a new instance
    T obj = (T) valueType.newInstance(); // Exception handling not shown
    // Rest of code here
    return null;
}
However, this solution has a couple of limitations. First, as stated, the map has to contain at least one non-null value for the operation to be successful. Second, this solution doesn't take account of subclassing of the declared type (? extends T) on specific elements, which may prove to be problematic if you have elements of different classes (e.g. TreeSet and HashSet within the same map).
The second issue can be solved easily by dealing with type information on a key-value pair basis rather than on a "whole" map basis, though this comes at the cost of "knowing" the type information for all elements within the map. Alternatively, more complex solutions such as deriving the most specific common superclass of all non-null values within the map can also be used, but for all intents and purposes, this becomes more of a guesstimate solution than a real one.
The second solution to this problem is, in my opinion, a lot cleaner but poses additional complexity to the caller. This approach follows a more functional approach and can be applied if there are only a limited number of type-dependent operations within the method. Following your proposed case of instantiation of the generic type T, we can modify the method as follows:
private <T> Map<String, byte[]> m(Map<String, T> data, Callable<T> creator){
// Create a new instance
T obj = creator.call(); // Exception handling not shown
// Rest of code here
return null;
}
and called as follows:
Map<String, Set<String>> data = new HashMap<>();
// Instantiation method set to new HashSet (thanks to bayou.io for HashSet::new)
m(data, HashSet::new); // Note: replace with anonymous inner class for java 7
In this case, the need for type information (which is present at the level of the caller) can be bypassed by having the caller provide the type-dependent functionality required. The example provides a basic HashSet creation for all values, but more complex instantiation rules can be defined on a per-element basis.
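On Java 8, java.util.function.Supplier is a lighter alternative to Callable because get() declares no checked exception. A minimal sketch follows; fillMissing and its behaviour (replacing null values) are assumptions made for illustration, not the original method m:

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.function.Supplier;

class CreatorDemo {
    // Replaces null values with fresh instances supplied by the caller, so
    // the method never needs runtime type information about T.
    static <T> Map<String, T> fillMissing(Map<String, T> data, Supplier<? extends T> creator) {
        Map<String, T> result = new HashMap<>();
        for (Map.Entry<String, T> entry : data.entrySet()) {
            T value = entry.getValue();
            result.put(entry.getKey(), value != null ? value : creator.get());
        }
        return result;
    }

    static Map<String, Set<String>> demo() {
        Map<String, Set<String>> data = new HashMap<>();
        data.put("empty", null);
        // The caller knows the concrete type and hands over the constructor.
        return fillMissing(data, HashSet::new);
    }
}
```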
The downside to this approach is that it adds complexity for the caller and can be very bad if this were to be an external API function (though the use of private within your original method suggests otherwise). On Java 7 and below it also requires quite a bit of boilerplate anonymous inner class code, making caller-side code harder to read. Additionally, if most of your method requires type information to be present, then this solution is less feasible as well (since you'd be reprogramming most of your method on a per-type basis, defeating the point of using generics).
In all, I'd personally prefer to use the second approach if possible, only using the first approach if deemed infeasible. The gist of the solutions I'm getting at here is to not rely on type information when dealing with generics or at least set a bound such that you get functionality you require without ugly hacks. In the case where type-dependent operations have to be performed, have the caller provide the functionality for that (through Callables, Runnables or some FunctionalInterface of your creation).
If type information is absolutely critical for some reason not made apparent, I suggest reading this article to stop type erasure altogether, allowing type information to be present directly from within the method.
You'd need to do it like this:
Map<String, Set> abc = null; //gives a compiler warning
m(abc, Set.class);
The issue is that if you want T to be captured to Set<String>, there will be no way to express Class<T> since there's no such thing as Set<String>.class, just Set.class.

Is there any way of imitating OR in Java Generics

EDIT: I changed the example a bit to get the idea across:
Like
<Integer or Float>
...without having to create a common interface and make a subclass for Integer and Float to implement it
If not, maybe something like this would make more sense and be useful:
<E extends Number> <E = (Integer|Float)>
If ? is a wildcard, why shouldn't we be allowed to restrict it to certain types?
It's not possible, and I hardly see any value in it. You use generics to restrict types, e.g. in collections. With an or operator, you would know only as much about the type as you know about the most specific common supertype of the alternatives, Object in this case. So why not just use Object?
Hypothetical:
List<E extends String or Number> list = //...
What is the type of list.get(0)? Is it String or Number? But you cannot have a variable of such type. It cannot be String, it cannot be Number - it can only be... Object.
UPDATE: Since you changed your example in question to:
<Integer or Float>
why won't you just say:
<Number>
? Note that Number has methods that allow you to easily extract floatValue() and intValue(). Do you really need the exact type?
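A minimal sketch of the Number approach, with a made-up class name (NumberSum): one method accepts lists of Integer, Float, or any other Number subtype and reads the values through Number's accessors.

```java
import java.util.Arrays;
import java.util.List;

class NumberSum {
    // Works for List<Integer>, List<Float>, List<Number>, ...
    static double sum(List<? extends Number> values) {
        double total = 0;
        for (Number n : values) {
            total += n.doubleValue();
        }
        return total;
    }

    static double demo() {
        List<Integer> ints = Arrays.asList(1, 2);
        List<Float> floats = Arrays.asList(1.5f, 2.5f);
        return sum(ints) + sum(floats);
    }
}
```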
Note that you can use an and operator:
<E extends Serializable & Closeable>
And that makes perfect sense: you can use a variable of type E wherever either Serializable or Closeable is needed. In other words, E must extend both Serializable and Closeable. See also: Java Generics Wildcarding With Multiple Classes.
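A small sketch of such an intersection bound in use. The Snapshot class and the closeAndKeep method are hypothetical; they only show that a value of type E can be used both as a Closeable and as a Serializable:

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.Serializable;
import java.io.UncheckedIOException;

class IntersectionDemo {
    // Hypothetical resource that satisfies both bounds.
    static class Snapshot implements Serializable, Closeable {
        private static final long serialVersionUID = 1L;
        boolean closed;

        @Override
        public void close() {
            closed = true;
        }
    }

    static <E extends Serializable & Closeable> Serializable closeAndKeep(E resource) {
        try {
            resource.close();      // available because E is Closeable
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return resource;           // allowed because E is Serializable
    }
}
```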
In very extreme cases (pre-Java 7 without AutoCloseable), I would have liked to be able to do that, too. E.g.
<E extends Connection or Statement or ResultSet>
That would've allowed me to call E.close(), no matter what the actual type was. In other words, E would contain the "API intersection" of all supplied types. In this case it would contain close(), and all methods from java.sql.Wrapper and java.lang.Object.
But unfortunately, you cannot do that. Instead, use method overloading, e.g.
void close(Connection c);
void close(Statement s);
void close(ResultSet r);
Or plain old instanceof
if (obj instanceof Connection) {
((Connection) obj).close();
}
else if (obj instanceof Statement) { //...
Or fix your design, as you probably shouldn't have to intersect APIs of arbitrary types anyway.
I don't see a real use for it... But anyways, I believe the closest you'd get to it is extending common interfaces for the possible implementations.

Java Collections of Interfaces

I'm writing a small API to deal with objects that have specific 'traits'. In this case, they all have an interval of time and a couple of other bits of data, so I wrote an interface TimeInterval with some getters and setters.
Now most of these API methods deal with a Set or List of Objects. Internally these methods use the Java Collections Framework (HashMap/TreeMap in particular). So these API methods are like:
getSomeDataAboutIntervals(List<TimeInterval> intervalObjects);
Couple of Questions:
a) Should this be List<? extends TimeInterval> intervalObjects instead?
Is it mostly a matter of style? The one disadvantage of taking strictly an interface that I can see is, you need to create your list as a List<TimeInterval> rather than List<ObjectThatImplementsTimeInterval>.
This means potentially having to copy a List<Object..> to List<TimeInterval> to pass it to the API.
Are there other pros & cons to either approach?
b) And, one dumb question :) The collections framework guarantees I always get out the same instance I put in, the collections are really a collection of references, correct?
1) Yes.
Method parameters should be as general as possible. List<? extends A> is more general than List<A>, and can be used when you don't need to add things to the list.
If you were only adding to the list (and not reading from it), the most general signature would probably be List<? super A>
Conversely, method return types should be as specific as possible. You rarely, if ever, want to return a wildcard generic from a method.
Sometimes this can lead to generic signatures:
<T extends MyObject> List<T> filterMyObjects(List<T>)
This signature is both as specific and as general as possible
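The two wildcard directions can be sketched as follows. TimeInterval here is a pared-down, hypothetical version of the question's interface, and Meeting and IntervalApi are invented names:

```java
import java.util.List;

// Hypothetical minimal version of the question's interface.
interface TimeInterval {
    long getStart();
    long getEnd();
}

class Meeting implements TimeInterval {
    private final long start;
    private final long end;

    Meeting(long start, long end) {
        this.start = start;
        this.end = end;
    }

    @Override public long getStart() { return start; }
    @Override public long getEnd() { return end; }
}

class IntervalApi {
    // Read-only use: accepts List<TimeInterval>, List<Meeting>, ...
    static long totalDuration(List<? extends TimeInterval> intervals) {
        long total = 0;
        for (TimeInterval interval : intervals) {
            total += interval.getEnd() - interval.getStart();
        }
        return total;
    }

    // Write-only use: accepts List<Meeting>, List<TimeInterval>, List<Object>.
    static void addPlaceholder(List<? super Meeting> sink) {
        sink.add(new Meeting(0, 0));
    }
}
```

With these signatures a caller holding a List<Meeting> never has to copy it into a List<TimeInterval> first.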
2) Yes, except possibly in some rare very specific cases (I'm thinking of BitSet, although that isn't technically a Collection).
If you declare your list as List<? extends A>, then you can pass in any object whose static type is List<X>, where X extends A if A is a class, or X implements A if A is an interface. But you'll not be able to pass in a List or a List<Object> to it (unless A is Object) without force-casting it.
However, if you declare the parameter as a List<A>, you'll only be able to pass lists whose static type is strictly equivalent to List<A>, so not List<X> for instance. And by "you are not able to do otherwise", I really mean "unless you force the compiler to shut up and accept it", which I believe one should not do unless dealing with legacy code.
Collections are really collections of references. The abstraction actually is that everything you can put in a variable is a reference to something, unless that variable is of a primitive type.
1) I would recommend ? extends TimeInterval. Because of Java's polymorphism, it may not actually make a difference, but it is more robust and better style
2) Yes
a) No. List<? extends TimeInterval> will only accept interfaces that extend the interface TimeInterval. Your assertion that "you need to create your list as a List<TimeInterval>" is wrong, unless I misunderstand your point. Here's an example:
List<List> mylist= new ArrayList<List>();
mylist.add(new ArrayList());
b) Yes.
Should this be List<? extends TimeInterval> intervalObjects instead?
You only do that if you want to pass in a List<TimeIntervalSubclass>. Note you can put instances of subclasses of TimeInterval into a List<TimeInterval>. Keep in mind that the type of the list is different than the types in the list.
If you do List<? extends A> myList -- that only affects what you can assign to myList, which is different than what is in myList.
And, one dumb question :) The collections framework guarantees I always get out the same instance I put in, the collections are really a collection of references, correct?
When you create a collection Map myMap = new HashMap(), myMap is a reference to the underlying collection. Similarly, when you put something into a collection, you are putting the reference to the underlying object into the collection.
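A quick sketch that makes the reference semantics visible (the class name ReferenceDemo is made up): the list hands back the very same object, and a change made through the original variable is seen through the list.

```java
import java.util.ArrayList;
import java.util.List;

class ReferenceDemo {
    static boolean sameInstanceComesBack() {
        StringBuilder original = new StringBuilder("a");
        List<StringBuilder> list = new ArrayList<>();
        list.add(original);      // stores a reference, not a copy

        original.append("b");    // mutate through the original variable

        // Identity check plus the mutation being visible via the list.
        return list.get(0) == original
                && list.get(0).toString().equals("ab");
    }
}
```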

Does Google Collections API have an equivalent of the Ruby Enumerable#inject method?

I read through the javadoc and couldn't find anything that resembles it.
No, it does not.
While it does have certain functional programming elements (Predicate, Function), those were to support specific needs and its core focus is not adding functional programming elements for Java (seeing as how it's terribly verbose currently). See this issue for a bit on that.
I don't think there is an exact inject method, but you can obtain a similar result by using the supplied transform methods:
Maps.transformValues(Map<K,V1> fromMap, Function<? super V1,V2> function)
Lists.transform(List<F> fromList, Function<? super F,? extends T> function)
Of course you'll need a Function class defined ad hoc to work with the passed parameter of the inject:
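import com.google.common.base.Function;

class MyFunction<T> implements Function<T, T>
{
    // Holds the intermediate result (String is just an example type)
    static String variable;

    @Override
    public T apply(T t)
    {
        // do whatever you want with t,
        // storing the intermediate result in variable
        // return the same t to make this function work like identity
        return t;
    }
}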
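For comparison, inject itself is a left fold, and it is short to write by hand once lambdas are available. The sketch below uses plain Java 8 rather than Google Collections; the class and method names are made up for illustration:

```java
import java.util.Arrays;
import java.util.function.BiFunction;

class Inject {
    // Left fold: start from initial and combine it with each item in order.
    static <A, E> A inject(Iterable<? extends E> items, A initial,
                           BiFunction<A, ? super E, A> step) {
        A accumulator = initial;
        for (E item : items) {
            accumulator = step.apply(accumulator, item);
        }
        return accumulator;
    }

    static int sumOfLengths() {
        return inject(Arrays.asList("a", "bb", "ccc"), 0,
                (acc, s) -> acc + s.length());
    }
}
```

Unlike the static-variable trick above, the accumulator is passed explicitly, so the helper keeps no shared state between calls.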

Complex Generics Combinations

Imagine a generic class MySet which maintains a parent MySet instance and a child MySet instance. The idea is that the parent should be able to hold a superset of T and the child a subset. So given the following sample, consider the following problem:
class MySet<T> {
    MySet<? extends T> child;
    void doStuff (Collection<? extends T> args) {
        child.doStuff(this, args);
    }
}
EDIT: fixed question and sample code to reflect the real problem
Now, the child generic <T> may be more restrictive than the parent's <T>, so the parent must pass in a Collection<X> where <X> conforms to the child's <T>. Keep in mind that this parent->child chain could extend to be arbitrarily long. Is there any way to arrange the generics so that parent.doStuff(...) will compile, i.e., so that it can only be called with the arguments of its most restrictive child?
This would mean that the java compiler would pass up generic information all the way up the parent->child chain to determine what the allowable arguments to doStuff could be, and I don't know if it has that capability.
Is the only solution to ensure children cannot be more restrictive than their parents using generics (ie, MySet<T> child; rather than MySet<? extends T>) and have children be more restrictive than their parents elsewhere in the code?
I can give a negative answer to part of that question right away:
Is there any way to arrange the generics so that parent.doStuff(...) will compile, i.e., so that it can only be called with the arguments of its most restrictive child?
This would mean that the java compiler would pass up generic information all the way up the parent->child chain to determine what the allowable arguments to doStuff could be, and I don't know if it has that capability.
The simple answer is no, because the actual extent (and type requirements) of that chain are only known at run time, and by then any information about generics has been lost through erasure.
To push it even further, if you had a checked type instead of vanilla generics, with a method that could pass up in the chain the (most restrictive) actual type accepted, you could do a check at runtime, and raise a runtime error, but it will never be a compile error.
So no, unless the actual final type is known (and specified, maybe by a second type arg) in advance, the compiler is not going to be able to help you there.
What you can (and, in my opinion, should) do is keep what you just wrote (which should compile just fine) and pass it an argument that may be unsafe. Whether you type-check it yourself or let the JVM raise a ClassCastException when needed remains a matter of choice.
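A rough sketch of the runtime-checked variant described above. CheckedSet is a hypothetical stand-in for MySet that carries a Class<T> token; it assumes the element types are not themselves generic, and it rejects null elements because isInstance(null) is false:

```java
import java.util.Collection;

class CheckedSet<T> {
    private final Class<T> type;                  // runtime token for T
    private final CheckedSet<? extends T> child;  // null at the end of the chain

    CheckedSet(Class<T> type, CheckedSet<? extends T> child) {
        this.type = type;
        this.child = child;
    }

    // Accepts any collection and verifies it at runtime against this set's
    // type and then, recursively, against every more restrictive child.
    void doStuff(Collection<?> args) {
        for (Object arg : args) {
            if (!type.isInstance(arg)) {
                throw new ClassCastException(arg + " is not a " + type.getName());
            }
        }
        if (child != null) {
            child.doStuff(args);
        }
    }
}
```

The failure surfaces as a runtime exception from the most restrictive link in the chain, never as a compile error, which matches the limitation described in this answer.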
The code as presented doesn't seem to be attempting to make any sense.
Consider
MySet<String> c = ...;
MySet<Object> p = new MySet<Object>(c);
Collection<Integer> ints = ...;
p.doStuff(ints);
That is going to call c.doStuff(???, ints). ints is of type Collection<Integer>, but the callee requires doStuff(Collection<? extends String>).
