I've been looking at the difference between Collections.sort and list.sort, specifically regarding using the Comparator static methods and whether param types are required in the lambda expressions. Before we start, I know I could use method references, e.g. Song::getTitle to overcome my problems, but my query here is not so much something I want to fix but something I want an answer to, i.e. why is the Java compiler handling it in this way.
These are my findings. Suppose we have an ArrayList of type Song with some songs added; Song has 3 standard get methods:
ArrayList<Song> playlist1 = new ArrayList<Song>();
//add some new Song objects
playlist1.add( new Song("Only Girl (In The World)", 235, "Rhianna") );
playlist1.add( new Song("Thinking of Me", 206, "Olly Murs") );
playlist1.add( new Song("Raise Your Glass", 202,"P!nk") );
Here is a call to both types of sort method that works, no problem:
Collections.sort(playlist1,
Comparator.comparing(p1 -> p1.getTitle()));
playlist1.sort(
Comparator.comparing(p1 -> p1.getTitle()));
As soon as I start to chain thenComparing, the following happens:
Collections.sort(playlist1,
Comparator.comparing(p1 -> p1.getTitle())
.thenComparing(p1 -> p1.getDuration())
.thenComparing(p1 -> p1.getArtist())
);
playlist1.sort(
Comparator.comparing(p1 -> p1.getTitle())
.thenComparing(p1 -> p1.getDuration())
.thenComparing(p1 -> p1.getArtist())
);
i.e. syntax errors because it does not know the type of p1 anymore. So to fix this I add the type Song to the first parameter (of comparing):
Collections.sort(playlist1,
Comparator.comparing((Song p1) -> p1.getTitle())
.thenComparing(p1 -> p1.getDuration())
.thenComparing(p1 -> p1.getArtist())
);
playlist1.sort(
Comparator.comparing((Song p1) -> p1.getTitle())
.thenComparing(p1 -> p1.getDuration())
.thenComparing(p1 -> p1.getArtist())
);
Now here comes the CONFUSING part. For playlist1.sort, i.e. the List, this solves all compilation errors, for both of the following thenComparing calls. However, for Collections.sort, it solves it for the first one, but not the last one. I tested adding several extra calls to thenComparing and it always shows an error for the last one, unless I put (Song p1) for the parameter.
Now I went on to test this further with creating a TreeSet and with using Objects.compare:
int x = Objects.compare(t1, t2,
Comparator.comparing((Song p1) -> p1.getTitle())
.thenComparing(p1 -> p1.getDuration())
.thenComparing(p1 -> p1.getArtist())
);
Set<Song> set = new TreeSet<Song>(
Comparator.comparing((Song p1) -> p1.getTitle())
.thenComparing(p1 -> p1.getDuration())
.thenComparing(p1 -> p1.getArtist())
);
The same thing happens here: for the TreeSet there are no compilation errors, but for Objects.compare the last call to thenComparing shows an error.
Can anyone please explain why this is happening and also why there is no need to use (Song p1) at all when simply calling the comparing method (without further thenComparing calls).
One other query on the same topic is when I do this to the TreeSet:
Set<Song> set = new TreeSet<Song>(
Comparator.comparing(p1 -> p1.getTitle())
.thenComparing(p1 -> p1.getDuration())
.thenComparing(p1 -> p1.getArtist())
);
i.e. remove the type Song from the first lambda parameter for the comparing method call, it shows syntax errors under the call to comparing and the first call to thenComparing but not to the final call to thenComparing - almost the opposite of what was happening above! Whereas, for all the other 3 examples i.e. with Objects.compare, List.sort and Collections.sort when I remove that first Song param type it shows syntax errors for all the calls.
Edited to include screenshot of errors I was receiving in Eclipse Kepler SR2, which I have now since found are Eclipse specific because when compiled using the JDK8 java compiler on the command-line it compiles OK.
First, all the examples you say cause errors compile fine with the reference implementation (javac from JDK 8). They also work fine in IntelliJ, so it's quite possible the errors you're seeing are Eclipse-specific.
Your underlying question seems to be: "why does it stop working when I start chaining." The reason is, while lambda expressions and generic method invocations are poly expressions (their type is context-sensitive) when they appear as method parameters, when they appear instead as method receiver expressions, they are not.
When you say
Collections.sort(playlist1, comparing(p1 -> p1.getTitle()));
there is enough type information to solve for both the type argument of comparing() and the argument type p1. The comparing() call gets its target type from the signature of Collections.sort, so it is known comparing() must return a Comparator<Song>, and therefore p1 must be Song.
But when you start chaining:
Collections.sort(playlist1,
comparing(p1 -> p1.getTitle())
.thenComparing(p1 -> p1.getDuration())
.thenComparing(p1 -> p1.getArtist()));
now we've got a problem. We know that the compound expression comparing(...).thenComparing(...) has a target type of Comparator<Song>, but because the receiver expression for the chain, comparing(p -> p.getTitle()), is a generic method call, and we can't infer its type parameters from its other arguments, we're kind of out of luck. Since we don't know the type of this expression, we don't know that it has a thenComparing method, etc.
There are several ways to fix this, all of which involve injecting more type information so that the initial object in the chain can be properly typed. Here they are, in rough order of decreasing desirability and increasing intrusiveness:
Use an exact method reference (one with no overloads), like Song::getTitle. This then gives enough type information to infer the type variables for the comparing() call, and therefore give it a type, and therefore continue down the chain.
Use an explicit lambda (as you did in your example).
Provide a type witness for the comparing() call: Comparator.<Song, String>comparing(...).
Provide an explicit target type with a cast, by casting the receiver expression to Comparator<Song>.
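As a rough illustration of the first three options, here is a minimal sketch. The Song class is a hypothetical stand-in (the question never shows it, only its three getters), the fourth song is invented to exercise the tie-break on duration, and the class and method names are made up for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

final class ChainingFixes {

    // Hypothetical stand-in for the Song class from the question.
    static final class Song {
        private final String title;
        private final int duration;
        private final String artist;

        Song(String title, int duration, String artist) {
            this.title = title;
            this.duration = duration;
            this.artist = artist;
        }

        String getTitle() { return title; }
        int getDuration() { return duration; }
        String getArtist() { return artist; }
    }

    // Option 1: exact method references give the head of the chain a type.
    static final Comparator<Song> METHOD_REFS =
            Comparator.comparing(Song::getTitle)
                    .thenComparing(Song::getDuration)
                    .thenComparing(Song::getArtist);

    // Option 2: an explicitly typed lambda on the first call only.
    static final Comparator<Song> EXPLICIT_LAMBDA =
            Comparator.comparing((Song s) -> s.getTitle())
                    .thenComparing(s -> s.getDuration())
                    .thenComparing(s -> s.getArtist());

    // Option 3: a type witness on the comparing() call.
    static final Comparator<Song> TYPE_WITNESS =
            Comparator.<Song, String>comparing(s -> s.getTitle())
                    .thenComparing(s -> s.getDuration())
                    .thenComparing(s -> s.getArtist());

    // Sorts a small sample playlist and returns the durations in sorted order.
    static List<Integer> sortedDurations(Comparator<Song> order) {
        List<Song> playlist = new ArrayList<>(Arrays.asList(
                new Song("Only Girl (In The World)", 235, "Rhianna"),
                new Song("Thinking of Me", 206, "Olly Murs"),
                new Song("Raise Your Glass", 202, "P!nk"),
                new Song("Raise Your Glass", 180, "Hypothetical Cover")));
        playlist.sort(order);
        return playlist.stream().map(Song::getDuration).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(sortedDurations(METHOD_REFS));
        System.out.println(sortedDurations(EXPLICIT_LAMBDA));
        System.out.println(sortedDurations(TYPE_WITNESS));
    }
}
```

In each variant only the first call in the chain needs help. Once comparing() is typed as Comparator<Song>, the later thenComparing lambdas take their parameter type from the receiver.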
The problem is type inferencing. Without adding a (Song s) to the first comparison, Comparator.comparing doesn't know the type of the input so it defaults to Object.
You can fix this problem in one of three ways:
Use the new Java 8 method reference syntax
Collections.sort(playlist,
Comparator.comparing(Song::getTitle)
.thenComparing(Song::getDuration)
.thenComparing(Song::getArtist)
);
Pull out each comparison step into a local reference
Comparator<Song> byArtist = (s1, s2) -> s1.getArtist().compareTo(s2.getArtist());
Comparator<Song> byDuration = (s1, s2) -> Integer.compare(s1.getDuration(), s2.getDuration());
Collections.sort(playlist,
byArtist
.thenComparing(byDuration)
);
EDIT
Forcing the type returned by the Comparator (note you need both the input type and the comparison key type)
sort(
Comparator.<Song, String>comparing((s) -> s.getTitle())
.thenComparing(p1 -> p1.getDuration())
.thenComparing(p1 -> p1.getArtist())
);
I think the "last" thenComparing syntax error is misleading you. It's actually a type problem with the whole chain; the compiler only marks the end of the chain as an error because that's where the final return type doesn't match, I guess.
I'm not sure why List is doing a better inferencing job than Collections, since it should do the same capture type, but apparently not.
Another way to deal with this compile time error:
Declare the type of the lambda parameter in your first comparing call explicitly and you are good to go. I had to sort a list of org.bson.Document objects. Please look at the sample code:
Comparator<Document> comparator = Comparator.comparing((Document hist) -> (String) hist.get("orderLineStatus"), reverseOrder())
.thenComparing(hist -> (Date) hist.get("promisedShipDate"))
.thenComparing(hist -> (Date) hist.get("lastShipDate"));
list = list.stream().sorted(comparator).collect(Collectors.toList());
playlist1.sort(...) creates a bound of Song for the type variable E, from the declaration of playlist1, which "ripples" to the comparator.
In Collections.sort(...), there is no such bound, and the inference from the type of the first comparator is not enough for the compiler to infer the rest.
I think you would get "correct" behavior from Collections.<Song>sort(...), but I don't have a Java 8 install to test it out for you.
I have following expression that gets executed successfully:
Function<Long,Long> y = ((Function<Long,Long>)(x -> x*x)).andThen(x -> x+1).andThen(x -> x+2);
I understand why casting is required for the first lambda expression here. But the following expression gives an error saying that "x+1" is not a valid operation, reported for the second lambda expression (the one passed to the first compose call):
Function<Long,Long> y = ((Function<Long,Long>)(x -> x*x)).compose(x -> x+1).compose(x -> x+2);
I was able to resolve the above error using casting with compose:
Function<Long,Long> y = ((Function<Long,Long>)(x -> x*x)).compose((Function<Long,Long>)x -> x+1).compose(x -> x+2);
I have following questions:
Why do we need casting with compose calls but not with andThen calls?
Why do we need casting with intermediate compose calls but not with terminal compose calls?
Why do we need casting with compose calls but not with andThen calls?
The two methods are different. compose() takes a function whose input is of a type that is not necessarily the same as the current function's parameter type. Here's a slightly modified example to show that the compiler did not have to assume Long:
Function<Long, Long> f = (x -> x * x);
Function<String, Long> g = f.compose(Long::parseLong);
You can observe that f.compose() has a type argument of type String. In the above code, it's inferred from the assignment context (i.e., the compiler knows the input is String-typed because the resulting function is being assigned to a Function<String, Long> variable).
When it comes to .andThen(), however, things are simpler for the compiler : the type parameter <V> is for the output of the given function (not for the input, as is the case for compose). And because it already knows the input type, it has all the information: .andThen(x -> x+1) can only have Long as output type, because Long + int will produce long, boxed to Long. The end.
Why do we need casting with intermediate compose calls but not with terminal compose calls?
Now, think about it, what happens if I wrote this?
Function<String, Long> g = f.compose(Long::parseLong).compose(Long::parseLong);
What happens is that the compiler is ready to infer the <V> of the last .compose() to String because of the assignment context (see above). Question is: Should it assume String for the intermediate .compose()? The answer is Yes in this case* (because Long.parseLong only takes a string, there's no overload), but the compiler doesn't do that; it's a known limitation.
I can get it to work with f.<String>compose(Long::parseLong).compose(Long::parseLong); (which of course breaks my last .compose() call for obvious reasons, but you get the idea).
In other words, you can fix it with
A type witness
...<Long>compose(x -> x + 1).compose(x -> x + 2)
An explicit parameter type (my preferred option)
...compose((Long x) -> x + 1).compose(x -> x + 2)
*I say "yes in this case" because you cannot expect the compiler to always know the type. It's unambiguous here because Long.parseLong with a single parameter is not overloaded, so we can argue that the compiler could infer the intermediate .compose()'s <V> as <String>. But that should not be understood to mean that the compiler should be able to perform such inference in all situations. The function passed to .compose() could be one taking any other parameter type. The end to the discussion for now is that the compiler does not support this kind of inference.
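Both fixes, next to the andThen chain that needs no help, can be sketched as follows. The class and field names are invented for the example; the lambdas are the ones from the question.

```java
import java.util.function.Function;

final class ComposeFixes {

    static final Function<Long, Long> SQUARE = x -> x * x;

    // A type witness fixes <V> of the intermediate compose() call to Long.
    static final Function<Long, Long> WITH_WITNESS =
            SQUARE.<Long>compose(x -> x + 1).compose(x -> x + 2);

    // An explicit parameter type gives the compiler the same information.
    static final Function<Long, Long> WITH_TYPED_PARAM =
            SQUARE.compose((Long x) -> x + 1).compose(x -> x + 2);

    // andThen() needs neither: its lambda's input type is already known to be Long.
    static final Function<Long, Long> AND_THEN =
            SQUARE.andThen(x -> x + 1).andThen(x -> x + 2);

    public static void main(String[] args) {
        System.out.println(WITH_WITNESS.apply(10L));     // ((10 + 2) + 1) squared
        System.out.println(WITH_TYPED_PARAM.apply(10L)); // same composition order
        System.out.println(AND_THEN.apply(10L));         // 10 squared, then + 1, then + 2
    }
}
```

The last compose() call in each chain needs no annotation because it sits in an assignment context, so its <V> comes from the declared type of the field.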
The reason is the behavior of Function.compose and Function.andThen being non identical and non swappable.
If you run the following code.
Function<Long,Long> y1 = ((Function<Long,Long>)(x -> x*x)).andThen(x -> x+1).andThen(x -> x+2);
System.out.println(y1.apply(10L));
Function<Long,Long> y2 = ((Function<Long,Long>)(x -> x*x)).compose((Long x) -> x+1).compose(x -> x+2);
System.out.println(y2.apply(10L));
Even though we run both functions with the same value (10), they return different values. Where andThen is used it returns 103 (10x10 = 100, then +1 and +2), and where compose is used it returns 169 (10+2+1 = 13, then 13x13). Thus the function passed to compose is called before the multiplication lambda applies. At the point where compose is called, nothing fixes the input type of its argument, so the compiler treats it as a Function<Object, Long> instead of a Function<Long, Long>; compose has no visibility of any lambda that came before, because its function will be the first to be called.
Since there is no context at the time compose is called, we need to either cast to Function<Long, Long> or declare the parameter type in the lambda itself, as I have done. Hope this helps.
Can someone explain to me how come both of the lambdas can be replaced with method references here?
In RxJava, map() takes a parameter of type Func1<T, R>, whose comment states that it "Represents a function with one argument". Thus I completely understand why valueOf(Object) works here. But trim() takes no arguments at all.
So how does this work exactly?
Observable.just("")
.map(s -> String.valueOf(s)) //lambdas
.map(s -> s.trim()) //
.map(String::valueOf) //method references
.map(String::trim) //
.subscribe();
I haven't played with Rx in Java, but please note that String::valueOf refers to a static method, while String::trim refers to a non-static (instance) method, which has an implicit this argument. A method reference of the form Type::instanceMethod turns that receiver into the function's first parameter. So, in fact, both functions take a single argument. In Java it's not as visible as it is in Python, for example.
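To see the two shapes side by side without RxJava, here is a small sketch that uses java.util.function.Function as a stand-in for Func1 (that substitution, and the class and field names, are assumptions made for the example).

```java
import java.util.function.Function;

final class MethodReferenceShapes {

    // Static method: the function's single argument is passed to String.valueOf(Object).
    static final Function<String, String> VALUE_OF_REF = String::valueOf;
    static final Function<String, String> VALUE_OF_LAMBDA = s -> String.valueOf(s);

    // Instance method: the function's single argument becomes the receiver of trim().
    static final Function<String, String> TRIM_REF = String::trim;
    static final Function<String, String> TRIM_LAMBDA = s -> s.trim();

    public static void main(String[] args) {
        System.out.println("[" + VALUE_OF_REF.apply("abc") + "]");
        System.out.println("[" + TRIM_REF.apply("  abc  ") + "]");
        System.out.println("[" + TRIM_LAMBDA.apply("  abc  ") + "]");
    }
}
```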
I was reading this article and tried counting some words in a text file and found I could not reverse sort similarly to how it showed in listing 1 of the article.
I have some code that works though:
public class WordCounter {
public static final PrintWriter out = new PrintWriter(System.out, true);
public static void main(String... args) throws IOException {
//The need to put "", in front of args in the next line is frustrating.
try (Stream<String> lines = Files.lines(Paths.get("", args))) {
lines.parallel()
.map(l -> l.toLowerCase().replaceAll("[^a-z\\s]", "").split("\\s"))
.flatMap(Arrays::stream)
.filter(s -> !s.isEmpty())
.collect(Collectors.groupingBy(
Function.identity(), Collectors.counting()))
// Sort Map<K,V> Entries by their Integer value descending
.entrySet().parallelStream()
// MY QUESTION IS ABOUT THIS METHOD:
.sorted(
Comparator.comparing(Map.Entry::getValue, Comparator.reverseOrder()))
// --------------------------------- //
.forEachOrdered(e -> out.printf("%5d\t%s\n", e.getValue(), e.getKey()));
}
out.close();
}
}
So the article would suggest that the line:
.sorted(Comparator.comparing(Map.Entry::getValue, Comparator.reverseOrder()))
could be written as:
.sorted(Comparator.comparing(Map.Entry::getValue).reversed())
For this though, the Java compiler complains that:
Error:(46, 49) java: invalid method reference non-static method
getValue() cannot be referenced from a static context
The two comparing method signatures have the exact same first parameter and static scope, yet the former works while the latter complains about getValue being non-static.
My original thought was to write it as either:
.sorted(Map.Entry.comparingByValue())
Which compiles and runs but is not reversed. Or as:
.sorted(Map.Entry.comparingByValue().reversed())
Which again doesn't compile, giving an error message of:
Error:(48, 62) java: incompatible types: java.util.Comparator<java.util.Map.Entry<java.lang.Object,V>> cannot be converted to java.util.Comparator<? super java.util.Map.Entry<java.lang.String,java.lang.Long>>
Okay, so, that should be:
.sorted(Map.Entry.<String, Long>comparingByValue().reversed())
Which works.
I can't seem to see how to give a similar generic type specification to the Map.Entry::getValue form in my "could be written as" line though.
As to why this happens: while type inference has come leaps and bounds in Java 8, it will still only use the return target type if the return value is assigned to something.
In Java 7 we were only able to use this in an assignment context (using =) and it was a little bit clunky. In Java 8, it's less clunky and we can use it in invocation contexts (passed as a method argument, which assigns it to the formal parameter).
So the way I understand it, if the method invocation isn't used in an assignment context or invocation context, target type inference simply turns off, because it's no longer something called a poly expression (15.12, 18.5.2). So says the JLS.
In short, target type inference only works if the return value is:
assigned directly to a variable using =, as in v = foo();.
passed directly to a method, as in bar(foo()).
Once you chain a method call in, like v = foo().zap(), it stops working.
Lifted from my comment:
I can't seem to see how to give a similar generic type specification to the Map.Entry::getValue form though.
This would be Map.Entry<String, Long>::getValue.
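Here is a minimal sketch of that form in a stream pipeline, next to the type-witness variant from the question. The class name, method names and the sample word counts are invented for the example.

```java
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class ReverseByValue {

    // Hypothetical sample data with distinct counts, so the order is unambiguous.
    static Map<String, Long> sampleCounts() {
        Map<String, Long> counts = new LinkedHashMap<>();
        counts.put("the", 3L);
        counts.put("cat", 1L);
        counts.put("sat", 2L);
        return counts;
    }

    // The parameterized method reference types comparing(), so reversed() can be chained.
    static List<String> viaParameterizedMethodRef(Map<String, Long> counts) {
        return counts.entrySet().stream()
                .sorted(Comparator.comparing(Map.Entry<String, Long>::getValue).reversed())
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    // The same result using a type witness on comparingByValue().
    static List<String> viaTypeWitness(Map<String, Long> counts) {
        return counts.entrySet().stream()
                .sorted(Map.Entry.<String, Long>comparingByValue().reversed())
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(viaParameterizedMethodRef(sampleCounts()));
        System.out.println(viaTypeWitness(sampleCounts()));
    }
}
```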
I have a question regarding the usage of the Function.identity() method.
Imagine the following code:
Arrays.asList("a", "b", "c")
.stream()
.map(Function.identity()) // <- This,
.map(str -> str) // <- is the same as this.
.collect(Collectors.toMap(
Function.identity(), // <-- And this,
str -> str)); // <-- is the same as this.
Is there any reason why you should use Function.identity() instead of str->str (or vice versa). I think that the second option is more readable (a matter of taste of course). But, is there any "real" reason why one should be preferred?
As of the current JRE implementation, Function.identity() will always return the same instance while each occurrence of identifier -> identifier will not only create its own instance but even have a distinct implementation class. For more details, see here.
The reason is that the compiler generates a synthetic method holding the trivial body of that lambda expression (in the case of x -> x, equivalent to return x;) and tells the runtime to create an implementation of the functional interface calling this method. So the runtime sees only different target methods and the current implementation does not analyze the methods to find out whether certain methods are equivalent.
So using Function.identity() instead of x -> x might save some memory but that shouldn’t drive your decision if you really think that x -> x is more readable than Function.identity().
You may also consider that when compiling with debug information enabled, the synthetic method will have a line debug attribute pointing to the source code line(s) holding the lambda expression, therefore you have a chance of finding the source of a particular Function instance while debugging. In contrast, when encountering the instance returned by Function.identity() during debugging an operation, you won’t know who has called that method and passed the instance to the operation.
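If you want to observe this yourself, a small probe like the one below can help. It is only a sketch with invented names, and the two identity comparisons print whatever your JRE happens to do: instance sharing is an implementation detail, not something the language guarantees.

```java
import java.util.function.Function;

final class IdentityInstances {

    // Compares the instances returned by two separate Function.identity() calls.
    static boolean identityCallsShareInstance() {
        Function<String, String> first = Function.identity();
        Function<String, String> second = Function.identity();
        return first == second;
    }

    // Compares the instances created by two textually identical lambda expressions.
    static boolean separateLambdasShareInstance() {
        Function<String, String> first = s -> s;
        Function<String, String> second = s -> s;
        return first == second;
    }

    // Behaviourally both forms are the same: they hand back their argument.
    static boolean bothReturnArgument(String value) {
        Function<String, String> viaIdentity = Function.identity();
        Function<String, String> viaLambda = s -> s;
        return viaIdentity.apply(value) == value && viaLambda.apply(value) == value;
    }

    public static void main(String[] args) {
        System.out.println("identity() calls share an instance: " + identityCallsShareInstance());
        System.out.println("separate lambdas share an instance: " + separateLambdasShareInstance());
        System.out.println("both return their argument: " + bothReturnArgument("a"));
    }
}
```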
In your example there is no big difference between str -> str and Function.identity() since internally it is simply t->t.
But sometimes we can't use Function.identity because we can't use a Function. Take a look here:
List<Integer> list = new ArrayList<>();
list.add(1);
list.add(2);
this will compile fine
int[] arrayOK = list.stream().mapToInt(i -> i).toArray();
but if you try to compile
int[] arrayProblem = list.stream().mapToInt(Function.identity()).toArray();
you will get compilation error since mapToInt expects ToIntFunction, which is not related to Function. Also ToIntFunction doesn't have identity() method.
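For completeness, a short sketch of two forms that do satisfy ToIntFunction (the class and method names are invented for the example):

```java
import java.util.Arrays;
import java.util.List;

final class UnboxToIntArray {

    // The lambda is typed as ToIntFunction<Integer>; returning i unboxes it to int.
    static int[] viaLambda(List<Integer> values) {
        return values.stream().mapToInt(i -> i).toArray();
    }

    // A method reference to Integer.intValue() works as well.
    static int[] viaMethodRef(List<Integer> values) {
        return values.stream().mapToInt(Integer::intValue).toArray();
    }

    public static void main(String[] args) {
        List<Integer> list = Arrays.asList(1, 2);
        System.out.println(Arrays.toString(viaLambda(list)));
        System.out.println(Arrays.toString(viaMethodRef(list)));
    }
}
```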
From the JDK source:
static <T> Function<T, T> identity() {
return t -> t;
}
So, no, as long as it is syntactically correct.
I have a list with some User objects and I'm trying to sort the list, but it only works using a method reference; with a lambda expression the compiler gives an error:
List<User> userList = Arrays.asList(u1, u2, u3);
userList.sort(Comparator.comparing(u -> u.getName())); // works
userList.sort(Comparator.comparing(User::getName).reversed()); // works
userList.sort(Comparator.comparing(u -> u.getName()).reversed()); // Compiler error
Error:
com\java8\collectionapi\CollectionTest.java:35: error: cannot find symbol
userList.sort(Comparator.comparing(u -> u.getName()).reversed());
^
symbol: method getName()
location: variable u of type Object
1 error
This is a weakness in the compiler's type inferencing mechanism. In order to infer the type of u in the lambda, the target type for the lambda needs to be established. This is accomplished as follows. userList.sort() is expecting an argument of type Comparator<User>. In the first line, Comparator.comparing() needs to return Comparator<User>. This implies that Comparator.comparing() needs a Function that takes a User argument. Thus in the lambda on the first line, u must be of type User and everything works.
In the second and third lines, the target typing is disrupted by the presence of the call to reversed(). I'm not entirely sure why; both the receiver and the return type of reversed() are Comparator<T> so it seems like the target type should be propagated back to the receiver, but it isn't. (Like I said, it's a weakness.)
In the second line, the method reference provides additional type information that fills this gap. This information is absent from the third line, so the compiler infers u to be Object (the inference fallback of last resort), which fails.
Obviously if you can use a method reference, do that and it'll work. Sometimes you can't use a method reference, e.g., if you want to pass an additional parameter, so you have to use a lambda expression. In that case you'd provide an explicit parameter type in the lambda:
userList.sort(Comparator.comparing((User u) -> u.getName()).reversed());
It might be possible for the compiler to be enhanced to cover this case in a future release.
You can work around this limitation by using the two-argument Comparator.comparing with Comparator.reverseOrder() as the second argument:
users.sort(comparing(User::getName, reverseOrder()));
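The workarounds mentioned so far can be compared in one place. This is a sketch only: the User class below is a hypothetical stand-in with just a name, and the class, field and method names are invented for the example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

final class ReversedByName {

    // Hypothetical stand-in for the User class from the question.
    static final class User {
        private final String name;
        User(String name) { this.name = name; }
        String getName() { return name; }
    }

    // Explicit lambda parameter type, then reversed().
    static final Comparator<User> TYPED_LAMBDA =
            Comparator.comparing((User u) -> u.getName()).reversed();

    // Method reference, then reversed().
    static final Comparator<User> METHOD_REF =
            Comparator.comparing(User::getName).reversed();

    // Two-argument comparing(): no chained call, so target typing is not interrupted.
    static final Comparator<User> TWO_ARG =
            Comparator.comparing(User::getName, Comparator.reverseOrder());

    // Builds users from the given names, sorts them, and returns the names in order.
    static List<String> sortedNames(Comparator<User> order, String... names) {
        List<User> users = new ArrayList<>();
        for (String name : names) {
            users.add(new User(name));
        }
        users.sort(order);
        return users.stream().map(User::getName).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(sortedNames(TYPED_LAMBDA, "Bob", "Alice", "Carol"));
        System.out.println(sortedNames(METHOD_REF, "Bob", "Alice", "Carol"));
        System.out.println(sortedNames(TWO_ARG, "Bob", "Alice", "Carol"));
    }
}
```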
Contrary to the accepted and upvoted answer for which bounty has been awarded, this doesn't really have anything to do with lambdas.
The following compiles:
Comparator<LocalDate> dateComparator = naturalOrder();
Comparator<LocalDate> reverseComparator = dateComparator.reversed();
while the following does not:
Comparator<LocalDate> reverseComparator = naturalOrder().reversed();
This is because the compiler's type inference mechanism isn't strong enough to take two steps at once: determine that the reversed() method call needs type parameter LocalDate and therefore also the naturalOrder() method call will need the same type parameter.
There is a way to call methods and explicitly pass a type parameter. In simple cases it isn't necessary because it's inferred, but it can be done this way:
Comparator<LocalDate> reverseComparator = Comparator.<LocalDate>naturalOrder().reversed();
In the example given in the question, this would become:
userList.sort(Comparator.<User, String>comparing(u -> u.getName()).reversed());
But as shown in the currently accepted answer, anything that helps the compiler inferring type User for the comparing method call without taking extra steps will work, so in this case you can also specify the type of the lambda parameter explicitly or use a method reference User::getName that also includes the type User.
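A compact sketch of the LocalDate case, showing the two-step form and the one-step form with a type witness. The class name, method names and sample dates are invented for the example.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class NaturalOrderWitness {

    // Two steps: the assignment gives naturalOrder() its target type.
    static Comparator<LocalDate> twoSteps() {
        Comparator<LocalDate> natural = Comparator.naturalOrder();
        return natural.reversed();
    }

    // One step: the type witness replaces the missing target type.
    static Comparator<LocalDate> oneStepWithWitness() {
        return Comparator.<LocalDate>naturalOrder().reversed();
    }

    // Hypothetical sample dates.
    static List<LocalDate> sample() {
        return Arrays.asList(
                LocalDate.of(2020, 1, 1),
                LocalDate.of(2021, 6, 15),
                LocalDate.of(2019, 3, 3));
    }

    // Returns a sorted copy so the input list is left untouched.
    static List<LocalDate> sorted(Comparator<LocalDate> order, List<LocalDate> dates) {
        List<LocalDate> copy = new ArrayList<>(dates);
        copy.sort(order);
        return copy;
    }

    public static void main(String[] args) {
        System.out.println(sorted(twoSteps(), sample()));
        System.out.println(sorted(oneStepWithWitness(), sample()));
    }
}
```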
The static method Collections.reverseOrder(Comparator<T>) seems to be the most elegant solution that has been proposed. Just one caveat:
Comparator.reverseOrder() requires that T implements comparable and relies on the natural sorting order.
Collections.reverseOrder(Comparator<T>) has no restriction applied on type T
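A small sketch of the second variant, using a User type that deliberately does not implement Comparable. The User class, the names and the helper method are hypothetical; the comparator is first assigned to a typed local variable so the lambda gets its target type from the declaration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

final class ReverseOrderWrapper {

    // Hypothetical User type; note that it does not implement Comparable.
    static final class User {
        private final String name;
        User(String name) { this.name = name; }
        String getName() { return name; }
    }

    // Sorts users by name in descending order by wrapping an existing comparator.
    static List<String> namesDescending(String... names) {
        List<User> users = new ArrayList<>();
        for (String name : names) {
            users.add(new User(name));
        }
        Comparator<User> byName = Comparator.comparing(u -> u.getName());
        users.sort(Collections.reverseOrder(byName));
        return users.stream().map(User::getName).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(namesDescending("Bob", "Alice", "Carol"));
    }
}
```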