Why Java 8 Stream forEach method behaves differently?

As per my understanding of Java 8 lambda expressions, if the code after "->" is not enclosed in curly braces, its value is returned implicitly. In the example below, forEach expects a Consumer and the expression returns a value, yet the compiler (in Eclipse) does not report an error.
List<StringBuilder> messages = Arrays.asList(new StringBuilder(), new StringBuilder());
messages.stream().forEach(s-> s.append("helloworld"));//works fine
messages.stream().forEach((StringBuilder s)-> s.append("helloworld")); //works fine
messages.stream().forEach(s-> s); // doesn't work , Void methods cannot return a value
messages.stream().forEach(s-> s.toString()); // works fine
messages.stream().forEach(s-> {return s.append("helloworld");}); // doesn't work , Void methods cannot return a value
messages.stream().forEach((StringBuilder s)-> {return s.append("helloworld");}); // doesn't work , Void methods cannot return a value
s.append returns a StringBuilder and s.toString() returns a String, but the lambda treats both as void.
What am I missing here? Why isn't the compiler giving an error when we invoke a method on the object?

From JLS 15.27.3. Type of a Lambda Expression:
A lambda expression is congruent with a function type if all of the
following are true:
The function type has no type parameters.
The number of lambda parameters is the same as the number of parameter types of the function type.
If the lambda expression is explicitly typed, its formal parameter types are the same as the parameter types of the function type.
If the lambda parameters are assumed to have the same types as the function type's parameter types, then:
If the function type's result is void, the lambda body is either a statement expression (§14.8) or a void-compatible block.
If the function type's result is a (non-void) type R, then either i) the lambda body is an expression that is compatible with R
in an assignment context, or ii) the lambda body is a value-compatible
block, and each result expression (§15.27.2) is compatible with R in
an assignment context.
The rule for a void result above (highlighted in the original answer) means that any lambda whose body is a statement expression, i.e. an expression that can also stand alone as a statement, such as a method invocation, matches a functional interface whose single method's return type is void (such as the Consumer functional interface required by the forEach method).
This explains why s.append("helloworld") and s.toString() (your examples 1, 2 and 4) work fine as statement expression lambda bodies.
Examples 5 and 6 don't work, since they have block lambda bodies that are value-compatible. To be void-compatible, all the return statements must return nothing (i.e. just return;).
On the other hand, the following void-compatible block lambda bodies will pass compilation :
messages.stream().forEach(s-> {s.append("helloworld");});
messages.stream().forEach(s-> {s.append("helloworld"); return;});
Your 3rd example - messages.stream().forEach(s-> s); - doesn't work for the same reason the following method doesn't pass compilation (s on its own is not a statement expression):
void method (StringBuilder s)
{
s;
}
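The rules above can be tried out in a small sketch. The class name VoidCompatibleDemo and the appended strings are made up for illustration; the two variants that the compiler rejects are left as comments so that the rest compiles.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

class VoidCompatibleDemo {
    static String run() {
        List<StringBuilder> messages = Arrays.asList(new StringBuilder(), new StringBuilder());

        // Expression body that is a statement expression: the returned StringBuilder is discarded.
        Consumer<StringBuilder> expressionBody = s -> s.append("hello");
        // Void-compatible blocks: either no return statement or a bare "return;".
        Consumer<StringBuilder> blockBody = s -> { s.append("world"); };
        Consumer<StringBuilder> bareReturn = s -> { s.append("!"); return; };

        // These two are rejected when the target type is Consumer, so they stay commented out:
        // Consumer<StringBuilder> notAStatement = s -> s;
        // Consumer<StringBuilder> valueBlock = s -> { return s.append("x"); };

        messages.forEach(expressionBody.andThen(blockBody).andThen(bareReturn));
        return messages.toString();
    }

    public static void main(String[] args) {
        System.out.println(run()); // [helloworld!, helloworld!]
    }
}
```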

From java.util.stream.Stream, the signature of forEach is:
void forEach(Consumer<? super T> action)
From java.util.function.Consumer, the action must implement the following method:
void accept(T t)
In all your examples that don't work, you're returning a T, which doesn't match the return type of void.
Why compiler is not giving error when we invoke method on object?
Because you're not trying to return something, the lambda's return type is void, which matches the required Consumer signature.
The only possible type for s -> s is T -> T, whereas (StringBuilder s) -> s.append() could be StringBuilder -> (), which satisfies the void requirement.

Stream.forEach(Consumer) takes a consumer which implements one method
void accept(Object o);
i.e. it can't return anything.
If you want to return something you have to do something with the value returned.

You said
if we don't include code after "->" in curly braces that value will be returned implicitly
But that's not quite accurate. Rather, if you do not include curly braces and a value can be returned, it is returned implicitly. However, if a value should not be returned, because, for example, the functional interface method has a void return type, the compiler sees that and does not attempt to return anything implicitly.
In this case, forEach accepts a Consumer, whose return type is void, which is why you are getting compile errors when you attempt to return a value from it explicitly.
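As a minimal sketch of this point, the same expression body can be given both a value-returning and a void target type. The class name SameBodyTwoTargets is hypothetical and only serves the illustration.

```java
import java.util.function.Consumer;
import java.util.function.Function;

class SameBodyTwoTargets {
    static String run() {
        StringBuilder sb = new StringBuilder();

        // Target type returns a value: the result of append is returned implicitly.
        Function<StringBuilder, StringBuilder> asFunction = s -> s.append("a");
        // Target type returns void: the same kind of body is accepted and its result is discarded.
        Consumer<StringBuilder> asConsumer = s -> s.append("b");

        StringBuilder returned = asFunction.apply(sb); // append returns the same builder instance
        asConsumer.accept(sb);
        return returned.toString();
    }

    public static void main(String[] args) {
        System.out.println(run()); // ab
    }
}
```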

Related

Why does Java type inference fail to distinguish between Function and Consumer?

Given the following identity functions:
<T> Consumer<T> f(Consumer<T> c) { return c; } // (1)
<T,R> Function<T,R> f(Function<T, R> c) { return c; } // (2)
I observe the following behaviour in JDK 11 and JDK 17:
void _void() {}
f(x -> {}); // okay, dispatches to (1)
f(x -> { return; }); // okay, dispatches to (1)
f(x -> { _void(); }); // okay, dispatches to (1)
f(x -> _void()); // should dispatch to (1)
| Error:
| reference to f is ambiguous
| both method f(java.util.function.Function<java.lang.Object,java.lang.Object>) in
and method f(java.util.function.Consumer<java.lang.Object>) in match
int _one() { return 1; }
f(x -> 1); // okay, dispatches to (2)
f(x -> { return 1; }); // okay, dispatches to (2)
f(x -> { return _one(); }); // okay, dispatches to (2)
f(x -> _one()); // should dispatch to (2)
| Error:
| reference to f is ambiguous
| both method <T,R>f(java.util.function.Function<T,R>) in
and method <T>f(java.util.function.Consumer<T>) in match
Why can't the compiler resolve these symbols by using the return type of the expression? The curly brace versions work fine, and I would have thought they would be the more difficult cases. I understand that you can explicitly cast the lambda function, but that defeats the purpose of what I am trying to achieve.

x -> _void() and x -> _one() are both expected to be compatible with Consumer<T> (with the result of _one() discarded).
When the lambda body is of a block type, the compiler additionally checks the "return" compatibility.
The JLS is rather explicit about void/value compatibility for block bodies:
A block lambda body is void-compatible if every return statement in the block has the form return;.
A block lambda body is value-compatible if it cannot complete normally (§14.21) and every return statement in the block has the form return Expression;.
While that doesn't say why the single-expression bodies fail, it says exactly why block bodies compile: the compiler looks at the return forms to judge on those bodies' compatibility with Consumer or Function (in this case).
For the method invocation expressions, the fact that this is allowed:
Consumer<Integer> c = x -> _one(); // discarded result
Function<Integer, Integer> f = x -> _one(); // returned result
doesn't enable the compiler to resolve the conflict that you observed. You can rewrite the same lambda expression with block bodies to resolve the conflict, and that's simply because block bodies are checked differently, by spec.
I guess I'm trying to say that the more natural question is "why block bodies compile at all in this case", given that we normally don't expect return types (forms?) to participate in overload resolution. But lambda expressions' congruence with types is something else, isn't it... I think this (that block type helps target type inference) is the special behavior.

TLDR:
The cases that fail to compile, fail to compile because of two main reasons:
the lambdas have a statement expression (a method call in this case) as their bodies, making them compatible with both the Consumer<T> and Function<T, R> overloads,
the lambdas are also implicitly typed, making them not pertinent to applicability, so overload resolution is unable to decide between the overloads.
Let's go through the overload resolution steps in the spec to see where exactly this fails :)
First, let's determine the potentially applicable methods. For both x -> _void() and x -> _one(), both overloads are potentially applicable. This is because both lambda expressions are congruent to the function types of both Function<T, R> and Consumer<T>. The important condition is:
If the lambda parameters are assumed to have the same types as the
function type's parameter types, then:
If the function type's result is void, the lambda body is either a statement expression (§14.8) or a void-compatible block.
If the function type's result is a (non-void) type R, then either i) the lambda body is an expression that is compatible with R in an
assignment context, or ii) the lambda body is a value-compatible block, and each result expression (§15.27.2) is compatible with R in an assignment context.
(Also notice that for the cases that compile, exactly one of the methods is potentially applicable.)
Then we try to resolve the method to invoke using strict invocation. Loose and variable arity invocation are not very relevant here, so if this phase fails, the whole thing fails. Notice that at the start of that section, the spec defines "pertinent to applicability", and both x -> _void() and x -> _one() are not pertinent to applicability. This will be important.
We then reach:
If m is a generic method and the method invocation does not provide explicit type arguments, then the applicability of the method is inferred as specified in §18.5.1.
According to §18.5.1, to determine the applicability of a method with respect to a call, you first add inference bounds according to the arguments and type parameters. Then you reduce and incorporate the bounds. If there are no false bounds (which are produced when you have conflicting bounds) in the result, then the method is applicable. The relevant point here is that arguments that are not pertinent to applicability are not considered when adding those bounds:
To test for applicability by strict invocation:
If k ≠ n, or if there exists an i (1 ≤ i ≤ n) such that ei is
pertinent to applicability (§15.12.2.2) and either i) ei is a
standalone expression of a primitive type but Fi is a reference type,
or ii) Fi is a primitive type but ei is not a standalone expression of
a primitive type; then the method is not applicable and there is no
need to proceed with inference.
Otherwise, C includes, for all i (1 ≤ i ≤ k) where ei is pertinent to
applicability, ‹ei → Fi θ›.
So the only bounds that are added are those from the type parameters. They obviously are not going to disagree/conflict with each other and produce a false bound, since they are independent.
So again, both methods are applicable.
When there is more than one applicable method, we of course choose the most specific method. The process for doing this for generic methods is described in the spec (§18.5.4, More Specific Method Inference). It's quite long so I won't quote it here. In principle, it is similar to how §18.5.1 works: add some type bounds, and if they agree with each other (no false), then one method is more specific than the other. In this case, however, the implicitly typed lambdas cause a false bound to be added :(
Now knowing this, you can basically make it work the way you want by using explicitly typed lambdas, which are pertinent to applicability.
f((Integer x) -> _one()); // (2)
f((Integer x) -> _void()); // (1)
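To observe the dispatch described in this thread, here is a sketch in which the two overloads return a label instead of their argument. That change, and the class name OverloadDispatchDemo, are assumptions made only so the chosen overload becomes visible; the parameter types match the question's f.

```java
import java.util.function.Consumer;
import java.util.function.Function;

class OverloadDispatchDemo {
    // Variants of the question's overloads that report which one was selected.
    static <T> String f(Consumer<T> c) { return "Consumer"; }
    static <T, R> String f(Function<T, R> c) { return "Function"; }

    static void _void() {}
    static int _one() { return 1; }

    // Block bodies: only one overload is potentially applicable.
    static String blockVoid() { return f(x -> { _void(); }); }
    static String blockValue() { return f(x -> { return _one(); }); }

    // Explicitly typed lambdas are pertinent to applicability, so resolution succeeds.
    static String typedVoid() { return f((Integer x) -> _void()); }
    static String typedValue() { return f((Integer x) -> _one()); }

    // f(x -> _void()) and f(x -> _one()) are reported as ambiguous, as in the question.

    public static void main(String[] args) {
        System.out.println(blockVoid() + " " + blockValue() + " " + typedVoid() + " " + typedValue());
    }
}
```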

How lambda expression initialize parameter?

I'm confused by lambda expressions.
JavaDStream<ConsumerRecord<String, String>> rsvpsWithGuestsStream =
meetupStream.filter(f -> !f.value().contains("\"guests\":0"));
rsvpsWithGuestsStream.foreachRDD((JavaRDD<ConsumerRecord<String, String>> r) -> {
MongoSpark.save(
r.map(
e -> Document.parse(e.value())
)
);
});
The foreachRDD method has the signature void foreachRDD(VoidFunction<R> foreachFunc), so it accepts a functional interface.
In the code, JavaRDD<ConsumerRecord<String, String>> r is declared as the lambda's parameter, which is used internally by the functional interface's call method.
I want to know: does the lambda expression initialize r on its own? It can call map only if r is initialized, and I can't see anywhere in the code where it is created.
Can anyone help me understand this?

A lambda expression does not initialize anything. It is just an anonymous function which does something with its arguments - assuming they have been already initialized. I am not familiar with your use case (guess it is Spark) but it looks like r is just one of the elements in the stream, and each such element is passed in to your lambda. Below is a similar but much simpler example:
Stream.of("a", "b", "c").forEach(el -> System.out.println(el));
Here el is each of the elements "a", "b" and "c", passed in to the lambda which simply prints it. Just like a function, a lambda doesn't know anything about its arguments or whether they are initialized. That is up to the caller, which in the above cases is the forEach method.
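To make the caller's role explicit, here is a sketch with a hand-written method that plays the part of forEach or foreachRDD. The names forEachElement and LambdaParameterDemo are hypothetical; the only point is that the lambda parameter receives its value at the accept call.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

class LambdaParameterDemo {
    // Hypothetical stand-in for forEach: the caller decides what the lambda receives.
    static <T> void forEachElement(List<T> items, Consumer<T> action) {
        for (T item : items) {
            action.accept(item); // the lambda parameter is initialized with item here
        }
    }

    static List<String> run() {
        List<String> seen = new ArrayList<>();
        // el is never initialized by the lambda itself; forEachElement passes it in.
        forEachElement(Arrays.asList("a", "b", "c"), el -> seen.add(el.toUpperCase()));
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(run()); // [A, B, C]
    }
}
```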

Here is a quote from Java spec:
15.27.1. Lambda Parameters
When the lambda expression is invoked (via a method invocation
expression (§15.12)), the values of the actual argument expressions
initialize newly created parameter variables, each of the declared or inferred type, before execution of the lambda body. The Identifier
that appears in the LambdaParameter or directly in the
LambdaParameterList or LambdaParameters may be used as a simple name
in the lambda body to refer to the formal parameter.
https://docs.oracle.com/javase/specs/jls/se15/html/jls-15.html#jls-15.27.1

How to pass parameter in Supplier function with method reference operator(::)

Sorry, this seems to be very basic in functional programming, but I am not getting the idea. I have a method in my code which takes a method (as a Supplier) and another parameter:
private <R> CompletableFuture<R> retryRequest(Supplier<CompletableFuture<R>> supplier, int maxRetries)
I want to call this function and pass another method (anOtherMethod) which takes one integer parameter:
CompletableFuture<Boolean> retry = this.retryRequest(this:: anOtherMethod, 2);
I don't see how I can call retryRequest and give it anOtherMethod(123).
I know it can work like this:
CompletableFuture<Boolean> retry = this.retryRequest(()-> anOtherMethod(123), 2);

You cannot create a pure method reference that carries a specific captured value like 123. You need to write the explicit lambda version with the arrow if you want to pass captured values other than the instance to execute the method on. Read more on capturing values in lambdas in this answer: Enhanced 'for' loop and lambda expressions
The only exception is the object written in front of ::, which is captured as the receiver of the call.
Assume a signature that expects a Supplier of a String:
public void something(Supplier<String> job) {
...
The above signature will enable you to write the following calls:
String myString = " Hey Jack ";
something(myString::trim);
something(() -> myString.trim());
Both do the same. This may be unintuitive, because the method reference does not seem to capture anything, but it does: the instance myString is bound as the receiver, and the remaining (empty) parameter list of trim matches that of Supplier.get(). When the left-hand side of :: is a type name instead, as in String::trim, the compiler tries two possible resolutions: a static method that takes all the parameters of the functional interface method, or an instance method invoked on the first of those parameters with the rest passed as arguments. A Supplier passes no parameters and String has no static no-argument trim, so the following fails:
something(String::trim); // does not compile for Supplier<String>: no receiver is supplied and there is no static trim()
This is sometimes even a problem, because if a class offers a static version of a method and an instance-related one, you get ambiguity:
public void somethingElse(Function<Integer, String> transformation) {...}
// This will not compile:
somethingElse(Integer::toString);
The above example will not compile, because the toString method exists twice, once as static Integer.toString(someInt) and once as instance related someInteger.toString().
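A sketch of the options discussed in this thread follows. The bodies of anOtherMethod and retryRequest are hypothetical simplifications (no retry logic), and the class name SupplierArgumentDemo is made up. It shows the lambda that captures 123, a method reference with a bound receiver, and a lambda that sidesteps the Integer::toString ambiguity.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;
import java.util.function.Supplier;

class SupplierArgumentDemo {
    // Hypothetical stand-in for the question's anOtherMethod.
    static CompletableFuture<Boolean> anOtherMethod(int id) {
        return CompletableFuture.completedFuture(id == 123);
    }

    // Hypothetical simplification of retryRequest: it only invokes the supplier once.
    static <R> CompletableFuture<R> retryRequest(Supplier<CompletableFuture<R>> supplier, int maxRetries) {
        return supplier.get();
    }

    static boolean viaLambda() {
        // The argument 123 can only be supplied by an explicit lambda.
        CompletableFuture<Boolean> retry = retryRequest(() -> anOtherMethod(123), 2);
        return retry.join();
    }

    static String boundReceiver() {
        String myString = " Hey Jack ";
        Supplier<String> trimmed = myString::trim; // myString is captured as the receiver
        return trimmed.get();
    }

    static String disambiguated() {
        // Integer::toString is ambiguous for Function<Integer, String>; a lambda names the intended method.
        Function<Integer, String> toText = i -> Integer.toString(i);
        return toText.apply(42);
    }

    public static void main(String[] args) {
        System.out.println(viaLambda() + " [" + boundReceiver() + "] " + disambiguated());
    }
}
```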

Method reference - Difference between "Reference to a static method" and "Reference to an instance method of an arbitrary object of a particular type"

I've learned that there are four kinds of method references. But I don't understand the difference between "Reference to a static method" and "Reference to an instance method of an arbitrary object of a particular type".
For example:
List<String> weeks = new ArrayList<>();
weeks.add("Monday");
weeks.add("Tuesday");
weeks.add("Wednesday");
weeks.add("Thursday");
weeks.add("Friday");
weeks.add("Saturday");
weeks.add("Sunday");
weeks.stream().map(String::toUpperCase).forEach(System.out::println);
The method toUpperCase is not a static method... so why can one write in the way above, rather than needing to use it this way:
weeks.stream().map(s -> s.toUpperCase()).forEach(System.out::println);

Explanation
The method toUpperCase is not a static method... so why can one write in the way above, rather than needing to use it this way:
weeks.stream().map(s -> s.toUpperCase()).forEach(System.out::println);
Method references are not limited to static methods. Take a look at
.map(String::toUpperCase)
it is equivalent to
.map(s -> s.toUpperCase())
Java will just call the method you have referenced on the elements in the stream. In fact, this is the whole point of references.
The official Oracle tutorial explains this in more detail.
Insights, Examples
The method Stream#map (documentation) has the following signature:
<R> Stream<R> map(Function<? super T, ? extends R> mapper)
So it expects some Function. In your case this is a Function<String, String> which takes a String, applies some method on it and then returns a String.
Now we take a look at Function (documentation). It has the following method:
R apply(T t)
Applies this function to the given argument.
This is exactly what you are providing with your method reference. You provide a Function<String, String> that applies the given method reference on all objects. Your apply would look like:
String apply(String t) {
return t.toUpperCase();
}
And the Lambda expression
.map(s -> s.toUpperCase())
generates the exact same Function with the same apply method.
So what you could do is
Function<String, String> toUpper1 = String::toUpperCase;
Function<String, String> toUpper2 = s -> s.toUpperCase();
System.out.println(toUpper1.apply("test"));
System.out.println(toUpper2.apply("test"));
And they will both output "TEST", they behave the same.
More details on this can be found in the Java Language Specification JLS§15.13. Especially take a look at the examples in the end of the chapter.
Another note: how does Java even know that String::toUpperCase should be interpreted as Function<String, String>? Well, in general it does not. That's why the target type always needs to be clear from the context:
// The left side of the statement makes it clear to the compiler
Function<String, String> toUpper1 = String::toUpperCase;
// The signature of the 'map' method makes it clear to the compiler
.map(String::toUpperCase)
Also note that we can only do such stuff with functional interfaces:
@FunctionalInterface
public interface Function<T, R> { ... }
Note on System.out::println
For some reason you are not confused by
.forEach(System.out::println);
This method is not static either.
The out field is an ordinary object instance and println is a non-static method of the PrintStream (documentation) class. See System#out for the object's documentation.

Method references are quite an intelligent feature in Java. When you use a non-static method reference like String::toUpperCase, Java knows that it needs to call toUpperCase on the first parameter. If the lambda expression expects two parameters, the method is called on the first parameter and the second parameter is passed as the argument of the method. Let's take an example.
List<String> empNames = Arrays.asList("Tom","Bob");
String s1 = empNames.stream().reduce("",String::concat); //line -1
String s2 = empNames.stream().reduce("",(a,b)->a.concat(b)); // line -2
System.out.println(s1);
System.out.println(s2);
So, in the above example, on line 1 the String#concat method is called on the first parameter (a in line 2) and the second parameter (b in line 2) is passed as the argument.
This also works for methods with more than two parameters, but you need to be very careful about the order of the parameters.

I highly recommend you to read the Oracle's article about method references: https://docs.oracle.com/javase/tutorial/java/javaOO/methodreferences.html
That is the form of a lambda expression:
s->s.toUpperCase()
And that is a method reference:
String::toUpperCase
Semantically, the method reference is the same as the lambda expression, it just has different syntax.
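For comparison, here is a sketch that puts the four kinds of method references side by side. The class name MethodReferenceKinds and the sample strings are made up for illustration.

```java
import java.util.function.BiFunction;
import java.util.function.Function;

class MethodReferenceKinds {
    static String run() {
        // 1. Reference to a static method: the argument is passed to Integer.parseInt.
        Function<String, Integer> parse = Integer::parseInt;
        // 2. Reference to an instance method of an arbitrary object of a particular type:
        //    the argument itself becomes the receiver of toUpperCase.
        Function<String, String> upper = String::toUpperCase;
        //    With two parameters, the first is the receiver and the second is the argument.
        BiFunction<String, String, String> join = String::concat;
        // 3. Reference to an instance method of a particular object: prefix is the receiver.
        String prefix = "day: ";
        Function<String, String> label = prefix::concat;
        // 4. Reference to a constructor.
        Function<String, StringBuilder> builder = StringBuilder::new;

        return parse.apply("7") + " " + upper.apply("monday") + " " + join.apply("a", "b")
                + " " + label.apply("Friday") + " " + builder.apply("ab").reverse();
    }

    public static void main(String[] args) {
        System.out.println(run()); // 7 MONDAY ab day: Friday ba
    }
}
```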

Method Overloading ambiguity [duplicate]

The sample code is :
public class OverloadingTest {
public static void test(Object obj){
System.out.println("Object called");
}
public static void test(String obj){
System.out.println("String called");
}
public static void main(String[] args){
test(null);
System.out.println("10%2==0 is "+(10%2==0));
test((10%2==0)?null:new Object());
test((10%2==0)?null:null);
}
}
And the output is :
String called
10%2==0 is true
Object called
String called
The first call to test(null) invokes the method with the String argument, which is understandable according to The Java Language Specification.
1) Can anyone explain on what basis test() is invoked in the subsequent calls?
2) Again when we put , say a if condition :
if(10%2==0){
test(null);
}
else
{
test(new Object());
}
It always invokes the method with String argument .
Will the compiler compute the expression (10%2) while compiling ? I want to know whether expressions are computed at compile time or run time . Thanks.

Java uses early binding. The most specific method is chosen at compile time. The most specific method is chosen by number of parameters and type of parameters. Number of parameters is not relevant in this case. This leaves us with the type of parameters.
What type do the parameters have? Both parameters are expressions, using the ternary conditional operator. The question reduces to: What type does the conditional ternary operator return? The type is computed at compile time.
Given are the two expressions:
(10%2==0)? null : new Object(); // A
(10%2==0)? null : null; // B
The rules of type evaluation are listed here. In B it is easy, both terms are exactly the same: null will be returned (whatever type that may be) (JLS: "If the second and third operands have the same type (which may be the null type), then that is the type of the conditional expression."). In A the second term is from a specific class. As this is more specific and null can be substituted for an object of class Object the type of the whole expression is Object (JLS: "If one of the second and third operands is of the null type and the type of the other is a reference type, then the type of the conditional expression is that reference type.").
After the type evaluation of the expressions the method selection is as expected.
The example with if you give is different: You call the methods with objects of two different types. The ternary conditional operator always is evaluated to one type at compile time that fits both terms.
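The compile-time choice can be checked with a variant of the question's class in which each overload returns its label instead of printing it. That change and the class name ConditionalOverloadDemo are assumptions made for easier inspection.

```java
class ConditionalOverloadDemo {
    static String test(Object obj) { return "Object"; }
    static String test(String obj) { return "String"; }

    // The null literal is compatible with both overloads; String is the more specific one.
    static String viaNull() { return test(null); }

    // One operand is new Object(), so only test(Object) can accept the conditional.
    static String viaConditional() { return test((10 % 2 == 0) ? null : new Object()); }

    // Both operands are null, so this behaves like test(null).
    static String viaBothNull() { return test((10 % 2 == 0) ? null : null); }

    // A cast fixes the static type, whatever the run-time value is.
    static String viaCast() { return test((Object) "text"); }

    public static void main(String[] args) {
        System.out.println(viaNull() + " " + viaConditional() + " " + viaBothNull() + " " + viaCast());
    }
}
```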

JLS 15.25:
The type of a conditional expression is determined as follows:
[...]
If one of the second and third operands is of the null type and the type of the other
is a reference type, then the type of the conditional expression is that reference
type.
[...]
So the type of
10 % 2 == 0 ? null : new Object();
is Object.

test((10%2==0)?null:new Object());
Is the same as:
Object o;
if(10%2==0)
o=null;
else
o=new Object();
test(o);
Since type of o is Object (just like the type of (10%2==0)?null:new Object()) test(Object) will be always called. The value of o doesn't matter.

Your answer is: run time, because whether the parameter is an instance of String is only determined at run time, so it cannot be found at compile time.

This is a really nice question.
Let me try to clarify your code that you have written above.
In your first method call
test(null);
In this the null will be converted into string type so calling the test(String obj), as per JLS you are convinced with the call.
In the second method call
test((10%2==0)?null:new Object());
Which is going to return the boolean "true" value. So first boolean "true" value is going to auto cast into Boolean Wrapper class object. Boolean wrapper Object is finding the best match with your new Object() option in the ternary operator. And the method calls with Object as a parameter so it calls the following method
public static void test(Object obj)
For the experiment sake you can try the following combinations then you will get better clarity.
test((10 % 2 == 0) ? new Object() : "stringObj" );
test((10 % 2 == 0) ? new Object() : null );
test((10 % 2 == 0) ? "stringObj" : null );
Finally in the last when you are calling with the following code.
test((10%2==0)?null:null);
This time again it returns as boolean "true" value, and it will again follow the same casts as explained above. But this time there is no new Object() parameter is there in your ternary operator. So it will be auto type cast into null Object. Again it follows same method call as the your first method call.
In the last when you asked for code if you put in if .. else statement. Then also the compiler doing the fair decision with the code.
if(10%2==0) {
test(null);
}
Here your if condition is always true, so the code test(null) is called. Therefore it always calls the first test(String obj) method, with String as the parameter, as explained above.

I think your problem is that you are making the wrong assumption, your expressions:
test((10%2==0)?null:new Object());
and
test((10%2==0)?null:null);
Will always call test(null), and that's why they will go through test (Object).

As @Banthar mentioned, the ?: operator assigns a value to a variable first and then evaluates the condition.
On the other hand, the if condition you mentioned always returns true, so the compiler will replace the whole if-else block with only the body of the if.

1) the test() method is determined by the type of the parameter at the compilation time :
test((Object) null);
test((Object)"String");
output :
Object called
Object called
2) The compiler is even smarter, the compiled code is equivalent to just :
test(null);
you can check the bytecode with javap -c:
0: aconst_null
1: invokestatic #6 // Method test:(Ljava/lang/String;)V
4: return

This is what Java Language Specifications say about the problem.
If more than one method declaration is both accessible and applicable
to a method invocation, it is necessary to choose one to provide the
descriptor for the run-time method dispatch. The Java programming
language uses the rule that the most specific method is chosen.
This is test(String) method in your case.
And because of that if you add...
public static void test(Integer obj){
System.out.println("Integer called");
}
it will show a compilation error: The method test(String) is ambiguous for the type OverloadingTest.
Just like JLS says:
It is possible that no method is the most specific, because there are
two or more maximally specific methods. In this case:
If all the maximally specific methods have the same signature, then:
If one of the maximally specific methods is not declared abstract, it
is the most specific method. Otherwise, all the maximally specific
methods are necessarily declared abstract. The most specific method is
chosen arbitrarily among the maximally specific methods. However, the
most specific method is considered to throw a checked exception if and
only if that exception is declared in the throws clauses of each of
the maximally specific methods. Otherwise, we say that the method
invocation is ambiguous, and a compile-time error occurs.
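As a sketch of that ambiguity and one way around it: with an Integer overload added, a cast on the null argument tells the compiler which overload is meant. The overloads return labels instead of printing, and the class name AmbiguousNullDemo is hypothetical.

```java
class AmbiguousNullDemo {
    static String test(Object obj) { return "Object"; }
    static String test(String obj) { return "String"; }
    static String test(Integer obj) { return "Integer"; }

    static String run() {
        // test(null) would not compile here: String and Integer are both maximally specific.
        // Casting the argument gives it a static type and removes the ambiguity.
        return test((String) null) + "," + test((Integer) null) + "," + test((Object) null);
    }

    public static void main(String[] args) {
        System.out.println(run()); // String,Integer,Object
    }
}
```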
