Java 8 and aggregate operations on stream - java

When we use methods like filter, mapToInt, sum, etc. and
pass them lambda expressions, I don't understand whether the operations
are the methods themselves or the lambdas that we pass.
I'd like to know the correct terminology.
I think that the lambda is the function, and thus the operation, that we pass to
methods, which then use it to produce a result.
Why is it also said that filter, sum, etc. are operations that use functions as
their arguments?
Are both correct terminology?

Both the Stream methods and the lambda arguments that they accept are, broadly speaking, operations. This isn't confusing once we get used to the idea that the arguments to method calls can be functions. A Stream method applies the function that it's been given to the values in its stream, either to produce a new stream (intermediate methods) or to produce some aggregated result (terminal methods).
For a more detailed explanation, see http://www.lambdafaq.org/why-are-lambda-expressions-being-added-to-java/
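As a rough illustration of the answer above, the sketch below gives each lambda a name before passing it, so the function being passed and the Stream method receiving it are visibly separate things. The class name StreamTerminology and the method sumOfDoubledEvens are made up for this example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;
import java.util.function.ToIntFunction;

// Hypothetical demo class; all names here are illustrative only.
class StreamTerminology {

    static int sumOfDoubledEvens(List<Integer> numbers) {
        // The lambdas are values: functions stored in variables.
        Predicate<Integer> isEven = n -> n % 2 == 0;
        ToIntFunction<Integer> doubled = n -> n * 2;

        return numbers.stream()
                .filter(isEven)      // intermediate operation: returns a new Stream<Integer>
                .mapToInt(doubled)   // intermediate operation: returns an IntStream
                .sum();              // terminal operation: produces the aggregated int
    }

    public static void main(String[] args) {
        // 2, 4 and 6 pass the filter; doubled they are 4, 8 and 12.
        System.out.println(sumOfDoubledEvens(Arrays.asList(1, 2, 3, 4, 5, 6))); // prints 24
    }
}
```

Here filter and mapToInt are the methods (operations of the stream), while isEven and doubled are the functions handed to them.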

Not sure if this is commonly accepted, but I think it is thus:
A function is something that receives arguments and produces a value, ideally without side effects (though that is not enforceable in Java). Use this if you want to emphasize the mathematical/functional aspect.
A subroutine/procedure is a named piece of code that is reused for its side effect.
A method is how functions and subroutines are implemented/written in Java. There is no such thing as a function or procedure that does not belong to some class.
A lambda expression in Java is a way to write methods (of some anonymous class that happens to implement a functional interface) on the fly and at the same time obtain a reference to an instance of said interface.
An operation is a function or procedure.
So, depending on how you want to look at it: Since it is about Java, you could just call everything "method". But sometimes you want to emphasize different aspects. Like in your example:
filter, sum, etc. are operations that use functions as their arguments
Here, we could say: "filter is a method that takes a reference to a functional interface as argument", but this somehow changes the intention of the sentence.
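To make the terminology a little more concrete, here is a minimal sketch in which the same Predicate is supplied three ways: as an anonymous class, as a lambda, and as a reference to an ordinary named method. The class LambdaAsObject and the helpers isShort and keep are invented for illustration. How the JVM creates each object differs, but the receiving method cannot tell them apart.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Hypothetical demo class; the names are assumptions made for this sketch.
class LambdaAsObject {

    // An ordinary named method; usable as a function through a method reference.
    static boolean isShort(String s) {
        return s.length() <= 3;
    }

    // A method that takes a reference to a functional interface as its argument.
    static List<String> keep(List<String> words, Predicate<String> test) {
        return words.stream().filter(test).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("map", "filter", "sum", "collect");

        Predicate<String> asAnonymousClass = new Predicate<String>() {
            @Override
            public boolean test(String s) {
                return s.length() <= 3;
            }
        };
        Predicate<String> asLambda = s -> s.length() <= 3;
        Predicate<String> asMethodRef = LambdaAsObject::isShort;

        // All three lines print [map, sum]
        System.out.println(keep(words, asAnonymousClass));
        System.out.println(keep(words, asLambda));
        System.out.println(keep(words, asMethodRef));
    }
}
```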

A lambda is a function/callback that is usually passed as an argument. One way to understand this is to consider that you need to search for a value in a list:
Java 7:
int i = Arrays.asList(1,2,3,4,5,6).indexOf(3);
This finds the position of a single element. For a more dynamic query, such as every element within a range, you could collect the matching items with a for loop. To express it in the same call-based style instead, you could pass a lambda as an argument:
Java 8 (how it could be):
List<Integer> collect = asList(1, 2, 3, 4, 5, 6).filter(x -> x >= 2 && x < 4);
The real Java 8 API is a bit more verbose:
List<Integer> collect = Arrays.asList(1, 2, 3, 4, 5, 6)
.stream()
.filter(x -> x >= 2 && x < 4)
.collect(Collectors.toList());

Remember, what we pass to filter as a function is just the filtering logic; the filter method itself already expects a function that contains that logic. Specifically, filter accepts a function that returns a boolean and uses the result to decide whether to pass the current element along the stream or discard it.
I hope I have understood your question correctly.
For more on streams, here is an ongoing series of articles on the Java 8 Stream API:
amitph.com > Java 8 Streams API Tutorials

Related

Convert a For loop to a lambda expression in Java

I have the following code
//assume we have a list of custom type "details" already constructed
for(int i = 0; i < details.size(); ++i) {
CallerID number = details.get(i).getNextNumber();
ClientData.addToClient(number);
}
I have oversimplified the code. The enum CallerID and the ClientData object work as intended. I am asking for help converting this loop to a lambda function so I can understand the logic of how to do so, then fill in the appropriate code as needed.
Let's first write it as a modern basic for loop and golf it a bit, just so we're comparing apples to apples:
for (var detail : details) clientData.addToClient(detail.getNextNumber());
And this is probably the right answer. It is local var, exception, and control flow transparent (which is what you want), and short.
The lambda form is this, but it's got downsides (mostly, those transparencies). It also isn't any shorter. You shouldn't write it this way.
details.stream().forEach(d -> clientData.addToClient(d.getNextNumber()));
If details is a List (or any other Iterable), you can drop stream() from that and call details.forEach(...) directly.
Generally when people say "I want it in lambda form", that's not because someone is holding a gun to your head - you are saying that because somebody peddling a religion of sorts to you told you that 'it was better' and that this 'will scale'. Realize that they are full of it. There can be advantages to 'functional style', but none of these snippets are functional. A true functional style would involve a bunch of side-effect-free transformations, and then returning something.
.addToClient? You've lost the functional game there - you would want to instead convert each detail to something (presumably a CallerID), and from there construct an immutable object from that stream. You'd 'collect' your CallerIDs into a clientData object.
Let's say for example that clientData is just a 'list of CallerIDs' and nothing more. Then you'd write something like this:
var clientData = details.stream()
.map(MyDetailClass::getNextNumber)
.collect(Collectors.toList());
Is this better? No. However, if you're looking for 'a stream-style, lambda-based functional take on things', that qualifies. The output is constructed by way of collection (and not forEach that does a side-effect operation), and all elements involved are (or can be) immutable.
There's no particular reason why you'd want this, but if for some reason you're convinced this is better, now you know what you want to do. "Just replace it with a lambda" doesn't make it 'functional'.
I am asking for help converting this loop to a lambda function so I can understand the logic of how to do so, then fill in the appropriate code as needed.
A Function returns a value. As you are just updating something, what you need is a Consumer, which here accepts a single argument: a list of details. Assuming the elements are of a class named SomeDetails, here is how you would do it.
As you are iterating over some structure bounded by size() and using get(i), I am presuming a list is required here.
List<SomeDetails> details = new ArrayList<>(); // then populated
// lambda definition
Consumer<List<SomeDetails>> update = (lst)-> {
for(SomeDetails detail : lst) {
CallerID number = detail.getNextNumber();
ClientData.addToClient(number);
}
};
And then invoke it like this, passing the List.
update.accept(details);
All the above does is encapsulate the for loop (using the enhanced version for simplicity) and perform the operation.
If this is all you wanted, I would recommend just doing it as you were doing it sans the lambda.

Java Aggregate Operations vs Anonymous class suggestion

In this program, let’s say I have a class Leader that I want to assign to a class Mission. The Mission requires a class Skill, which has a type and a strength. The Leader has a List of Skills. I want to write a method that assigns a Leader (or a number of leaders) to a Mission and check if the Leaders’ combined skill strength is enough to accomplish the Mission.
public void assignLeaderToMission(Mission m, Leader... leaders) {
List<Leader> selectedLeaders = new ArrayList(Arrays.asList(leaders));
int combinedStrength = selectedLeaders
.stream()
.mapToInt(l -> l.getSkills()
.stream()
.filter(s -> s.getType() == m.getSkillRequirement().getType())
.mapToInt(s -> s.getStrength())
.sum())
.sum();
if(m.getSkillRequirement().getStrength() > combinedStrength)
System.out.println("Leader(s) do not meet mission requirements");
else {
// assign leader to mission
}
}
Is this the appropriate way to use a stream with lambda operations? NetBeans is giving a suggestion that I use an anonymous class, but I thought that lambdas and aggregate operations were supposed to replace the need for anonymous classes with a single method, or maybe I am interpreting this incorrectly.
In this case, I am accessing a List<> within a List<> and I am not sure this is the correct way to do so. Some help would be much appreciated.
There is nothing wrong with using lambda expressions here. NetBeans just offers that code transformation since it is possible (and NetBeans can do the transformation for you). If you accept the offer and let it convert the code, it will very likely start offering to convert the anonymous class to a lambda expression as soon as the conversion has been done, simply because it is (now) possible.
But if you want to improve your code, you should not use raw types, i.e. use
List<Leader> selectedLeaders = new ArrayList<>(Arrays.asList(leaders));
instead. But if you just want a List<Leader> without needing support for add or remove, there is no need to copy the list into an ArrayList, so you can use
List<Leader> selectedLeaders = Arrays.asList(leaders);
instead. But if all you want to do, is to stream over an array, you don’t need a List detour at all. You can simply use Arrays.stream(leaders) in the first place.
You may also use flatMap to reduce the amount of nested code, i.e.
int combinedStrength = Arrays.stream(leaders)
.flatMap(l -> l.getSkills().stream())
.filter(s -> s.getType() == m.getSkillRequirement().getType())
.mapToInt(s -> s.getStrength())
.sum();
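For anyone who wants to try the two shapes side by side, the following is a self-contained sketch. The Skill and Leader classes are minimal stand-ins for the ones in the question, and using a String for the skill type is an assumption made here for brevity (the question compares types with ==, which suggests an enum).

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins; fields, constructors and the String skill type are assumptions.
class MissionStrength {

    static class Skill {
        final String type;
        final int strength;
        Skill(String type, int strength) { this.type = type; this.strength = strength; }
        String getType() { return type; }
        int getStrength() { return strength; }
    }

    static class Leader {
        final List<Skill> skills;
        Leader(Skill... skills) { this.skills = Arrays.asList(skills); }
        List<Skill> getSkills() { return skills; }
    }

    // Nested form: one inner stream (and one inner sum) per leader.
    static int nested(String requiredType, Leader... leaders) {
        return Arrays.stream(leaders)
                .mapToInt(l -> l.getSkills().stream()
                        .filter(s -> s.getType().equals(requiredType))
                        .mapToInt(Skill::getStrength)
                        .sum())
                .sum();
    }

    // Flattened form: all skills of all leaders become one stream first.
    static int flattened(String requiredType, Leader... leaders) {
        return Arrays.stream(leaders)
                .flatMap(l -> l.getSkills().stream())
                .filter(s -> s.getType().equals(requiredType))
                .mapToInt(Skill::getStrength)
                .sum();
    }

    public static void main(String[] args) {
        Leader a = new Leader(new Skill("combat", 3), new Skill("stealth", 5));
        Leader b = new Leader(new Skill("combat", 4));
        System.out.println(nested("combat", a, b));     // prints 7
        System.out.println(flattened("combat", a, b));  // prints 7
    }
}
```

Both methods compute the same total; the flattened version simply keeps the pipeline at a single level of nesting.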
A lambda should be concise so that it is easy to maintain. If the lambda expression is lengthy, the code becomes hard to understand and maintain, and even debugging gets harder.
More details on Why the perfect lambda expression is just one line can be read here.
The perilously long lambda
To better understand the benefits of writing short, concise lambda expressions, consider the opposite: a sprawling lambda that unfolds over several lines of code:
System.out.println(
values.stream()
.mapToInt(e -> {
int sum = 0;
for(int i = 1; i <= e; i++) {
if(e % i == 0) {
sum += i;
}
}
return sum;
})
.sum());
Even though this code is written in the functional style, it misses the benefits of functional-style programming. Let's consider the reasons why.
1. It's hard to read
Good code should be inviting to read. This code takes mental effort to read: your eyes strain to find the beginning and end of the different parts.
2. Its purpose isn't clear
Good code should read like a story, not like a puzzle. A long, anonymous piece of code like this one hides the details of its purpose, costing the reader time and effort. Wrapping this piece of code into a named function would make it modular, while also bringing out its purpose through the associated name.
3. Poor code quality
Whatever your code does, it's likely that you'll want to reuse it sometime. The logic in this code is embedded within the lambda, which in turn is passed as an argument to another function, mapToInt. If we needed the code elsewhere in our program, we might be tempted to rewrite it, thus introducing inconsistencies in our code base. Alternatively, we might just copy and paste the code. Neither option would result in good code or quality software.
4. It's hard to test
Code always does what was typed and not necessarily what was intended, so it stands that any nontrivial code must be tested. If the code within the lambda expression can't be reached as a unit, it can't be unit tested. You could run integration tests, but that is no substitute for unit testing, especially when that code does significant work.
5. Poor code coverage
Lambdas that were embedded in arguments were not easily extracted as units, and many showed up red on the coverage report. With no insight, the team simply had to assume that those pieces worked.
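One possible way to act on these points, sketched under the assumption that values is a List<Integer>, is to move the body of the long lambda into a named method and pass a method reference instead. The names FactorSums, sumOfFactors and totalOfFactorSums are chosen here for illustration and do not come from the quoted article.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical refactoring of the long lambda; names are illustrative.
class FactorSums {

    // Sum of all positive divisors of number, including 1 and number itself.
    // As a named method it can be reused and unit tested on its own.
    static int sumOfFactors(int number) {
        int sum = 0;
        for (int i = 1; i <= number; i++) {
            if (number % i == 0) {
                sum += i;
            }
        }
        return sum;
    }

    // The pipeline now reads as a one-line description of what is computed.
    static int totalOfFactorSums(List<Integer> values) {
        return values.stream()
                .mapToInt(FactorSums::sumOfFactors)
                .sum();
    }

    public static void main(String[] args) {
        // Factor sums of 1..6 are 1, 3, 4, 7, 6 and 12.
        System.out.println(totalOfFactorSums(Arrays.asList(1, 2, 3, 4, 5, 6))); // prints 33
    }
}
```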

List forEach with method reference explanation

I have been learning Java for the past few months and just started to get into lambda functions. I recently switched my IDE and noticed a warning saying "Can be replaced with method reference" on code like this.
List<Integer> intList = new ArrayList<>();
intList.add(1);
intList.add(2);
intList.add(3);
intList.forEach(num -> doSomething(num));
void doSomething(int num) {
System.out.println("Number is: " + num);
}
After some digging, I realized that instead of the line
intList.forEach(num -> doSomething(num));
I can just use
intList.forEach(this::doSomething);
This is just amazing. A few days ago I did not even know about lambdas and was using for loops to do operations like this. Now I have replaced my for loops with lambdas and, even better, I can replace my lambdas with method references. The problem is that I don't really understand how all this works internally. Can anyone please explain, or provide a good resource explaining, how the doSomething function is called and how the argument is passed to it when we use a method reference?
The double-colon operator is simply a convenience operator for doing the same thing that your lambda is doing. Check out this page for more details: https://javapapers.com/core-java/java-method-reference/
The double colon is simply syntactic sugar for defining a lambda expression whose parameters and return type are the same as an existing function. It was created to allow lambdas to be integrated more easily with existing codebases.
Calling the forEach method of a List<Integer> object takes as its parameter any object implementing the Consumer functional interface. Your lambda num -> doSomething(num) itself happens to fulfill the formal requirements of this interface.
Thus, you can use the double colon as syntactic sugar for that lambda expression.
In general, if you have an object obj with a method func that accepts parameters params..., then writing obj::func is equivalent to the lambda (params...) -> obj.func(params...).
In your case, obj is this (the current object), which has a method doSomething() that takes an integer parameter; thus, this::doSomething is equivalent to num -> doSomething(num).
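The equivalence can be checked with a small sketch. To make the effect observable, doSomething here appends to a list instead of printing; that change, and the names MethodRefDemo, runWithLambda and runWithMethodRef, are assumptions made for this example only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical demo class; it records calls in a list so the two forms can be compared.
class MethodRefDemo {

    private final List<String> log = new ArrayList<>();

    void doSomething(int num) {
        log.add("Number is: " + num);
    }

    List<String> runWithLambda(List<Integer> numbers) {
        log.clear();
        Consumer<Integer> action = num -> doSomething(num); // explicit lambda
        numbers.forEach(action);                            // forEach calls action.accept(element)
        return new ArrayList<>(log);
    }

    List<String> runWithMethodRef(List<Integer> numbers) {
        log.clear();
        Consumer<Integer> action = this::doSomething;       // same behaviour, shorter syntax
        numbers.forEach(action);
        return new ArrayList<>(log);
    }

    public static void main(String[] args) {
        MethodRefDemo demo = new MethodRefDemo();
        List<Integer> intList = Arrays.asList(1, 2, 3);
        // Both lines print [Number is: 1, Number is: 2, Number is: 3]
        System.out.println(demo.runWithLambda(intList));
        System.out.println(demo.runWithMethodRef(intList));
    }
}
```

In both cases forEach receives a Consumer<Integer> and invokes its accept method once per element; the element is unboxed to int when doSomething is called.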
Given you've mentioned that it's only until recently you started getting into functional programming I'd like to keep things as simple and straightforward as possible, but note that with just the little code you've provided, we could derive a lot both from the high-level view of things as well the low-level view.
Can anyone please explain or provide a good resource explaining how
the doSomething function is called and the argument is passed to it
when we use method reference?
How the doSomething function is called is left to the library (internal iteration), regardless of whether we use a method reference or a lambda expression. In essence we specify the what, not the how: we give the forEach method a behaviour (a function) that we want executed for each element of the source intList, not instructions for how it should go about its work.
It is then up to the library to apply the specified function, doSomething, to each element of the source intList.
Method references can be seen as shorthand for lambdas that only call a specific method. The benefit is that, by referring to a method by name, the code becomes easier to read and follow, and in most cases code with method references reads like the problem statement, which is a good thing.
It's also important to know that not just any function can be passed to the forEach terminal operation, as every method that accepts a behaviour restricts the type of function allowed. This is accomplished with the functional interfaces in the java.util.function package.
Last but not least, in terms of refactoring, it's not always possible to use method references, nor is it always better to use lambda expressions over the code we wrote before Java 8. However, as you continue learning the Java 8 features, a few ways to improve your code are to try:
Refactoring anonymous classes to lambda expressions
Refactoring lambda expressions to method references
Refactoring imperative-style data processing to streams

Where are Java 8 lambda expressions evaluated?

Are the lambda expressions evaluated at the place where we write them or in any other class of Java?
For example :
Stream<Student> absent = students.values().stream().filter(s -> !s.present());
Will the lambda expression passed to the filter method above be executed immediately in the class where the code is written, or in another class? And will it take more time (in terms of nanoseconds) than if the code had been written in the conventional style used before Java 8?
When you compile your sources, the compiler will insert an invokedynamic bytecode instruction for the lambda expression that you use. The actual implementation (which in your case is a Predicate) will be created at runtime via ASM. It will not even be present on disk when you run it: the class is generated in memory, and there will be no .class file for the Predicate. That's a big difference from an anonymous class, for example, which will generate a class file when you compile it.
You can see the generated file for the Predicate if you run your example with :
-Djdk.internal.lambda.dumpProxyClasses=/Your/Path/Here
Otherwise Eran's answer is correct: Streams are driven by the terminal operation, and if none is present nothing gets executed. You should absolutely read Holger's excellent answer about even more interesting differences.
The body of the lambda expression passed to the filter method in your example won't be executed at all, since filter is an intermediate operation, which only gets executed for Streams that end in a terminal operation, such as collect, forEach, etc...
If you add a terminal operation, such as collecting the elements of the Stream to a List:
List<Student> absent = students.values().stream().filter(s -> !s.present()).collect(Collectors.toList());
the body of the lambda expression will be executed for each element of your Stream, in order for the terminal operation to be able to produce its output.
Note that this behavior would not change if you passed an anonymous class instance or some other implementation of the Predicate interface to your filter method instead of the lambda expression.
The expressions are lazily evaluated, which means they'll only be evaluated when you 'terminate' the stream, i.e. use an operation that takes a stream but returns something else, like collect, min, max, reduce, etc. Operations which take a stream as input and return a stream as output are usually lazy.
Lambda expressions are essentially objects with a single method, so they're evaluated whenever that method is called.
In your particular case they're never evaluated. A Stream does not evaluate the expressions until you call a terminating operation (collect, findAny, etcetera).
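The laziness described in these answers can be observed with a counter. The sketch below is only a demonstration: the side effect inside the filter lambda exists purely to count invocations, and the names LazyFilterDemo and countPredicateCalls are made up.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical demo class showing when the filter lambda actually runs.
class LazyFilterDemo {

    // Returns {calls counted before the terminal operation, calls counted after it}.
    static int[] countPredicateCalls(List<Integer> values) {
        AtomicInteger calls = new AtomicInteger();

        Stream<Integer> pending = values.stream()
                .filter(v -> {
                    calls.incrementAndGet(); // side effect used only to observe execution
                    return v % 2 == 0;
                });
        int before = calls.get();            // the pipeline is built, but nothing has run

        pending.collect(Collectors.toList()); // terminal operation drives the pipeline
        int after = calls.get();

        return new int[] { before, after };
    }

    public static void main(String[] args) {
        int[] counts = countPredicateCalls(Arrays.asList(1, 2, 3, 4));
        System.out.println("before terminal operation: " + counts[0]); // 0
        System.out.println("after terminal operation: " + counts[1]);  // 4
    }
}
```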

Predicates vs if statements

I have seen in some projects that people use Predicates instead of pure if statements, as illustrated with a simple example below:
int i = 5;
// Option 1
if (i == 5) {
// Do something
System.out.println("if statement");
}
// Option 2
Predicate<Integer> predicate = integer -> integer == 5;
if (predicate.test(i)) {
// Do something
System.out.println("predicate");
}
What's the point of preferring Predicates over if statements?
Using a predicate makes your code more flexible.
Instead of writing a condition that always checks if i == 5, you can write a condition that evaluates a Predicate, which allows you to pass different Predicates implementing different conditions.
For example, the Predicate can be passed as an argument to a method :
public void someMethod (Predicate<Integer> predicate) {
if(predicate.test(i)) {
// do something
System.out.println("predicate");
}
...
}
This is how the filter method of Stream works.
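A small runnable sketch of that flexibility follows. The method describe and the class PredicateParam are hypothetical; or and negate are default methods of java.util.function.Predicate.

```java
import java.util.function.Predicate;

// Hypothetical demo class; the condition is supplied by the caller instead of being hard-coded.
class PredicateParam {

    static String describe(int value, Predicate<Integer> condition) {
        return condition.test(value) ? "match" : "no match";
    }

    public static void main(String[] args) {
        Predicate<Integer> isFive = n -> n == 5;
        Predicate<Integer> isPositive = n -> n > 0;

        System.out.println(describe(5, isFive));                // match
        System.out.println(describe(4, isFive));                // no match
        System.out.println(describe(4, isFive.or(isPositive))); // match
        System.out.println(describe(5, isFive.negate()));       // no match
    }
}
```

The same describe method serves every condition, which a hard-coded if (i == 5) cannot do.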
For the exact example that you provided, using a Predicate is big overkill. The compiler and then the runtime will create:
1. a method (the de-sugared predicate)
2. a .class that implements java.util.function.Predicate
3. an instance of the class created in step 2
All this versus a simple if statement.
And all this for a stateless Predicate. If your predicate is stateful, like:
Predicate<Integer> p = (Integer j) -> this.isJGood(j); // you are capturing "this"
then every time this lambda expression is evaluated, a new instance will be created (at least under the current JVM).
The only viable option IMO to create such a Predicate is, of course, to re-use it in multiple places (like passing as arguments to methods).
Using if statements is the best (read: most performant) way to check binary conditions.
The switch statement may be faster for more complex situations.
A Predicate is a special form of Function. In fact, the Java language architects are working on a way to allow generic primitive types. This would make Predicate<T> roughly equivalent to Function<T, boolean> (modulo the test vs apply method name).
If a function (resp. method) takes one or more functions as argument(s), we call it a higher-order function. We say that we are passing behaviour to a function. This allows us to create powerful APIs.
String result = Match(arg).of(
Case(isIn("-h", "--help"), help()),
Case(isIn("-v", "--version"), version()),
Case($(), cmd -> "unknown command: " + cmd)
);
This example is taken from Javaslang, a library for object-functional programming in Java 8+.
Disclaimer: I'm the creator of Javaslang.
This is an old question, but I'll give it a try, since I am battling with it myself...
In my attempt to excuse my own usage of predicates I have made a self-rule.
I believe Predicates are useful where the "logic point" - is NOT the: leaf | corner | the end - of a: graph | tree | straight line, which would make the logic point effectively a "logic joint".
By it being a joint (aka node) it has a state, a re-usable and mutable state, that serves as a means towards an end.
In a stream, where the data is supposed to traverse a path, predicates are useful since they grant a degree of access while keeping the integrity of the stream, this is why the best predicates IMO are only method references minimizing side effects.
The most common form of Predicate is newObject.equals(old), which is in itself a BiPredicate, but it CAN be used as a single Predicate with a side-effecting lambda, lambda -> lambda.equals(localCache) (so this may be an exception to the Only Method References rule).
IF, the logic serves as the output/exit point towards a different architectural design, or component, or a code that is not written by you, or even if it is written by you, one that differs on its functionality, then an if-else is my way to go.
Another benefit of predicates in the case of reactive programming is that multiple subscribers can make use of the same defined logic gate.
But if the end point of a publisher will be a single lone subscriber (which would be a case similar to your example if I'm reaching), then the logic is better done with an if-else.
