How to decide between lambda iteration and normal loop? - java

Since the introduction of Java 8 I have been really hooked on lambdas and have started using them whenever possible, mostly to get accustomed to them. One of the most common uses is when we want to iterate over and act upon a collection of objects, in which case I resort to either forEach or stream(). I rarely write the old for (T t : ts) loop anymore, and I have almost forgotten about for (int i = 0; ...).
However, we were discussing this with my supervisor the other day and he told me that lambdas aren't always the best choice and can sometimes hinder performance. From a lecture I had seen on this new feature I got the impression that lambda iterations are always fully optimized by the compiler and will (always?) beat bare iterations, but he begs to differ. Is this true? If so, how do I identify the best solution in each scenario?
P.S.: I'm not talking about cases where it is recommended to use parallelStream. Obviously those will be faster.

Performance depends on so many factors that it's hard to predict. Normally we would say: if your supervisor claims that there is a problem with performance, it is up to your supervisor to explain exactly what that problem is.
One thing someone might be afraid of is that, behind the scenes, a class is generated for each lambda creation site (with the current implementation), so if the code in question is executed only once, this might be considered a waste of resources. This harmonizes with the fact that lambda expressions have a higher one-time initialization overhead than ordinary imperative code (we are not comparing to inner classes here), so inside class initializers, which run only once, you might consider avoiding them. This is also in line with the recommendation never to use parallel streams in class initializers, so that potential advantage isn't available there anyway.
For ordinary, frequently executed code that is likely to be optimized by the JVM, these problems do not arise. As you correctly supposed, classes generated for lambda expressions get the same treatment (optimizations) as other classes. In these places, calling forEach on a collection has the potential to be more efficient than a for loop.
The temporary object instances created for an Iterator or the lambda expression are negligible, but it is worth noting that a for-each loop will always create an Iterator instance, whereas a lambda-based forEach does not always do so. While the default implementation of Iterable.forEach creates an Iterator as well, some of the most often used collections take the opportunity to provide a specialized implementation, most notably ArrayList.
ArrayList's forEach is basically a for loop over an array, without any Iterator. It then invokes the accept method of the Consumer, which will be a generated class containing a trivial delegation to the synthetic method holding the code of your lambda expression. To optimize the entire loop, the optimizer's horizon has to span the ArrayList's loop over an array (a common idiom recognizable to an optimizer), the synthetic accept method containing a trivial delegation, and the method containing your actual code.
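To make that concrete, here is a minimal sketch of such a specialized forEach, modeled on what ArrayList does (the real OpenJDK version additionally checks modCount for concurrent modification; the class and field names here are illustrative):

import java.util.Arrays;
import java.util.Objects;
import java.util.function.Consumer;

// Minimal sketch of an array-backed list with a specialized forEach.
class SimpleArrayList<E> {
    private Object[] elementData = new Object[10];
    private int size;

    public void add(E e) {
        if (size == elementData.length)
            elementData = Arrays.copyOf(elementData, size * 2);
        elementData[size++] = e;
    }

    @SuppressWarnings("unchecked")
    public void forEach(Consumer<? super E> action) {
        Objects.requireNonNull(action);
        final Object[] es = elementData; // plain loop over the backing array,
        final int n = size;              // no Iterator instance involved
        for (int i = 0; i < n; i++)
            action.accept((E) es[i]);    // one trivial, easily inlined call
    }
}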
In contrast, when iterating over the same list using a for-each loop, an Iterator implementation is created whose ArrayList iteration logic is spread over two methods, hasNext() and next(), plus the instance variables of the Iterator. The loop repeatedly invokes hasNext() to check the end condition (index < size) and next(), which rechecks that condition before returning the element, as there is no guarantee that the caller properly invokes hasNext() before next(). An optimizer is, of course, capable of removing this duplication, but that requires more effort than not having it in the first place. So to match the performance of the forEach method, the optimizer's horizon has to span your loop code, the nontrivial hasNext() implementation, and the nontrivial next() implementation.
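For comparison, the language specification defines the for-each loop over an Iterable to desugar into roughly the following (shown here for an assumed List<String> named list):

import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class DesugaredForEach {
    public static void main(String[] args) {
        List<String> list = Arrays.asList("a", "b", "c"); // example data

        // What `for (String s : list) { ... }` effectively compiles to:
        for (Iterator<String> it = list.iterator(); it.hasNext(); ) {
            String s = it.next(); // next() rechecks the end condition internally
            System.out.println(s);
        }
    }
}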
Similar things may apply to other collections that have a specialized forEach implementation. This also applies to Stream operations, if the source provides a specialized Spliterator implementation, which does not spread the iteration logic over two methods the way an Iterator does.
So if you want to discuss the technical aspects of a for-each loop vs. forEach(…), you may use this information.
But as said, these are only potential performance aspects; the work of the optimizer and other aspects of the runtime environment may change the outcome completely. I think, as a rule of thumb, the smaller the loop body/action is, the more appropriate the forEach method is. This harmonizes perfectly with the guideline of avoiding overly long lambda expressions anyway.

It depends on the specific implementation.
In general, the forEach method and a for-each loop over an Iterator usually have pretty similar performance, as they use a similar level of abstraction. stream() is usually slower (often by 50-70%), as it adds another layer on top of the access to the underlying collection.
The advantages of stream() are generally the possible parallelism and the easy chaining of operations, with a lot of reusable ones provided by the JDK.
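As an illustration of that chaining, a small hypothetical pipeline (the data and names are invented), which would also parallelize by merely swapping stream() for parallelStream():

import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class ChainingExample {
    public static void main(String[] args) {
        List<String> names = Arrays.asList("ada", "alan", "grace", "edsger");

        // Several small, reusable JDK operations chained into one pipeline.
        List<String> result = names.stream()
                .filter(n -> n.startsWith("a"))  // keep names starting with 'a'
                .map(String::toUpperCase)        // transform each element
                .sorted()                        // sort the survivors
                .collect(Collectors.toList());   // gather into a List

        System.out.println(result); // [ADA, ALAN]
    }
}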

Related

AtomicInteger & lambda expressions in single-threaded app

I need to modify a local variable inside a lambda expression in a JButton's ActionListener, and since I'm not able to modify it directly, I came across the AtomicInteger type.
I implemented it and it works just fine, but I'm not sure whether this is good practice or the correct way to solve this situation.
My code is the following:
newAnchorageButton.addActionListener(e -> {
    AtomicInteger anchored = new AtomicInteger();
    anchored.set(0);
    cbSets.forEach(cbSet ->
        cbSet.forEach(cb -> {
            if (cb.isSelected())
                anchored.incrementAndGet();
        })
    );
    // more code where I use the 'anchored' variable...
});
I'm not sure if this is the right way to solve this since I've read that AtomicInteger is used mostly for concurrency-related applications and this program is single-threaded, but at the same time I can't find another way to solve this.
I could simply use two nested for-loops to go over those arrays, but I'm trying to reduce the method's cognitive complexity as much as I can according to the SonarLint VS Code extension, and leaving those for-loops in theoretically increases the method's complexity and therefore hurts its readability and maintainability.
Replacing the for-loops with lambda expressions reduces the cognitive complexity but maybe I shouldn't pay that much attention to it.
While it is safe enough in single-threaded code, it would be better to count them in a functional way, like this:
long anchored = cbSets.stream()        // get a stream of the sets
        .flatMap(List::stream)         // flatten to a stream of cb's
        .filter(JCheckBox::isSelected) // keep only the selected ones
        .count();                      // count them
Instead of mutating an accumulator, we limit the flattened stream to only the ones we're interested in and ask for the count.
More generally, though, it is always possible to sum things up, or aggregate values generally, without a mutable variable. Consider:
record Country(int population) { }

countries.stream()
        .mapToInt(Country::population)
        .reduce(0, Math::addExact);
Note: we never mutate any values; instead, we combine each successive value with the preceding one, producing a new value. One could use sum(), but I prefer reduce(0, Math::addExact) to avoid the possibility of silent overflow.
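To see the difference, a tiny hypothetical demonstration: sum() wraps around silently on int overflow, while reduce(0, Math::addExact) fails fast:

import java.util.stream.IntStream;

public class OverflowDemo {
    public static void main(String[] args) {
        // sum() is a plain reduction with +, so it silently wraps around:
        int wrapped = IntStream.of(Integer.MAX_VALUE, 1).sum();
        System.out.println(wrapped); // -2147483648

        // Math::addExact throws instead of producing a wrong result:
        int checked = IntStream.of(Integer.MAX_VALUE, 1)
                .reduce(0, Math::addExact); // throws ArithmeticException
        System.out.println(checked);        // never reached
    }
}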
and leaving those for-loops in theoretically increases the method's complexity and therefore hurts its readability and maintainability.
This is obvious horsepuckey. x.forEach(foo -> bar) is not 'cognitively simpler' than for (var foo : x) bar; - you can map each AST node straight over from one to the other.
If a definition is being used to define complexity which concludes that one is significantly more complex than the other, then the only correct conclusion is that the definition is silly and should be fixed or abandoned.
To make it practical: Yes, introducing AtomicInteger, whilst performance-wise it won't make one iota of difference, does make the code way more complicated. AtomicInteger's mere existence in the code suggests that concurrency is relevant here. It isn't, so you'd have to add a comment to explain why you're using it. Comments are evil (they imply the code does not speak for itself, and they cannot be tested in any way). They are often the least evil option, but evil they are nonetheless.
The general 'trick' for keeping lambda-based code cognitively easy to follow is to embrace the pipeline:
You write some code that 'forms' a stream. This can be as simple as list.stream(), but sometimes you do some stream joining or flat-mapping of a collection of collections.
You have a pipeline of operations that operate on single elements in the stream and do not refer to the whole or to any neighbour.
At the end, you reduce (using collect, reduce, max - some terminator) such that the reducing method returns what you need. A sketch of that shape follows below.
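A hypothetical sketch of that three-step shape (all names and data invented):

import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class PipelineShape {
    public static void main(String[] args) {
        List<String> words = Arrays.asList("alpha", "beta", "gamma");

        String result = words.stream()              // 1. form a stream
                .filter(w -> w.length() > 4)        // 2. per-element operations,
                .map(String::toUpperCase)           //    no neighbour access
                .collect(Collectors.joining(", ")); // 3. terminate with a reduction
        System.out.println(result); // ALPHA, GAMMA
    }
}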
The above model (and the other answer follows it precisely) tends to result in code that is as readable/complex as the 'old style' code, and rarely (but sometimes!) more readable, and significantly less complicated. Deviate from it and the result is virtually always considerably more complicated - a clear loser.
Not all for loops in Java fit the above model. If a loop doesn't fit, then trying to force that particular square peg into the round hole will take a lot of effort and almost always result in code that is significantly worse: either an order of magnitude slower or considerably more cognitively complicated.
It also means that it is virtually never 'worth' rewriting perfectly fine readable non-stream based code into stream based code; at best it becomes a percentage point more readable according to some personal tastes, with no significant universally agreed upon improvement.
Turn off that silly linter rule. The fact that it considers the above 'less' complex, and that it evidently determines that for (var foo : x) bar; is 'more complicated' than x.forEach(foo -> bar) is proof enough that it's hurting way more than it is helping.
I have the following to add to the two other answers:
Two general good practices in your code are in question:
Lambdas shouldn't be longer than 3-4 lines
Except in some precise cases, lambdas of stream operations should be stateless.
For #1, when a lambda is getting too long, consider extracting its code to a private method, for example.
You will probably gain in readability, and you will probably also gain from better separating UI code from business logic.
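For example (a hypothetical sketch reusing the names from the question; the method name onNewAnchorage is invented):

// The listener body shrinks to a single call...
newAnchorageButton.addActionListener(e -> onNewAnchorage());

// ...and the logic moves into a plain, testable private method.
private void onNewAnchorage() {
    long anchored = cbSets.stream()
            .flatMap(List::stream)
            .filter(JCheckBox::isSelected)
            .count();
    // more code where the 'anchored' variable is used...
}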
For #2, this probably doesn't concern you since you are working in a single thread at the moment, but streams can be parallelized, and they may not always execute exactly as you think they do.
For that reason, it's always better to keep the code of stream pipeline operations stateless. Otherwise you might be surprised.
More generally, streams are very good and very concise, but sometimes it's just better to do the same thing with good old loops.
Don't hesitate to come back to classic loops.
When Sonar tells you that the complexity is too high, what you should really do is refactor your code: split it into smaller methods, improve the model of your objects, etc.

Java collection performance when comparing items

A basic performance question from someone coming from C/C++.
I'm using a collection (ArrayDeque) to simply hold, add, and remove items by identity. I know the contract is for the collection to use equals() when checking equality, for example during remove(obj), but in my case I want reference semantics (like IdentityHashMap, but I don't need the map). So I am fine just knowing that I will never override equals() on any of the objects held inside the collection (which is declared to hold an interface).
Coming from native programming, I can't avoid asking myself: will the compiled code of remove(obj) traverse the items and perform a virtual call on Object.equals() only to end up comparing addresses? Since I'm storing interface references, there is no way (?) to optimise this using final so that the compiler doesn't bother making the useless calls (i.e. inlines them). But now I'm getting ahead of myself, because it may be that such an optimisation is not necessary anyway and the JVM has other means (devirtualisation?) of generating optimal code in this case.
Assuming my code needs the level of optimisation that would justify thinking about this aspect in the first place: is my understanding correct? What is a good design for this case?
Making the method final won't avoid the virtual call, because the invokevirtual opcode is used at the call site either way, regardless of whether the target method is final.
The good news is that the JVM might be able to inline the call or avoid the virtual dispatch if it can see that the method is not overridden by any loaded class, so your performance will improve as your program runs.
When you use the remove method, it will call the equals method for comparison. Ordinarily, you would override the equals and hashCode methods when using such operations; otherwise the default Object.equals implementation, a plain reference (address) comparison, is what runs. It is highly recommended to define your own equals and hashCode implementations when using the methods of the collections framework.
Regarding performance, yes, you are right: all the objects in the collection will be scanned linearly until the correct match is encountered. It is a linear search, hence this removal operation takes O(n) time.
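If you want the reference semantics to be explicit rather than relying on the default equals(), one possible sketch (the Item interface here is a stand-in for your own) is to spell out the identity comparison with removeIf:

import java.util.ArrayDeque;
import java.util.Collection;

public class IdentityRemove {
    interface Item { } // stand-in for the interface held by the collection

    public static void main(String[] args) {
        Collection<Item> items = new ArrayDeque<>();
        Item a = new Item() { };
        Item b = new Item() { };
        items.add(a);
        items.add(b);

        // Explicit reference comparison: no equals() call at all,
        // though still a linear O(n) scan, just like remove(obj).
        Item target = b;
        items.removeIf(e -> e == target);

        System.out.println(items.size()); // 1
    }
}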

Does a lambda expression create an object on the heap every time it's executed?

When I iterate over a collection using the new syntactic sugar of Java 8, such as
myStream.forEach(item -> {
    // do something useful
});
Isn't this equivalent to the 'old syntax' snippet below?
myStream.forEach(new Consumer<Item>() {
    @Override
    public void accept(Item item) {
        // do something useful
    }
});
Does this mean a new anonymous Consumer object is created on the heap every time I iterate over a collection? How much heap space does this take? What performance implications does it have? Does it mean I should rather use the old style for loops when iterating over large multi-level data structures?
It is equivalent but not identical. Simply put, if a lambda expression does not capture values, it will be a singleton that is reused on every invocation.
The behavior is not exactly specified; the JVM is given great freedom in how to implement it. Currently, Oracle's JVM creates (at least) one instance per lambda expression (i.e. it doesn't share instances between different but identical expressions), but it creates singletons for all expressions that don't capture values.
You may read this answer for more details. There, I not only gave a more detailed description but also testing code to observe the current behavior.
This is covered by The Java® Language Specification, chapter “15.27.4. Run-time Evaluation of Lambda Expressions”
Summarized:
These rules are meant to offer flexibility to implementations of the Java programming language, in that:
A new object need not be allocated on every evaluation.
Objects produced by different lambda expressions need not belong to different classes (if the bodies are identical, for example).
Objects produced by evaluations of the same lambda expression need not belong to the same class (captured local variables might be inlined, for example).
If an "existing instance" is available, it need not have been created at a previous lambda evaluation (it might have been allocated during the enclosing class's initialization, for example).
When an instance representing the lambda is created sensitively depends on the exact contents of your lambda's body. Namely, the key factor is what the lambda captures from the lexical environment. If it doesn't capture any state which is variable from creation to creation, then an instance will not be created each time the for-each loop is entered. Instead a synthetic method will be generated at compile time and the lambda use site will just receive a singleton object that delegates to that method.
Further note that this aspect is implementation-dependent and you can expect future refinements and advancements on HotSpot towards greater efficiency. There are general plans to e.g. make a lightweight object without a full corresponding class, which has just enough information to forward to a single method.
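A small test (a hypothetical sketch) to observe this: on current HotSpot, the non-capturing lambda evaluates to the same instance every time, while the capturing one is allocated anew; the specification guarantees neither outcome.

import java.util.function.IntUnaryOperator;

public class LambdaIdentity {
    // Non-capturing: references nothing from the enclosing environment.
    static IntUnaryOperator nonCapturing() {
        return x -> x + 1;
    }

    // Capturing: closes over the parameter n.
    static IntUnaryOperator capturing(int n) {
        return x -> x + n;
    }

    public static void main(String[] args) {
        // The same lambda expression, evaluated twice each:
        System.out.println(nonCapturing() == nonCapturing()); // true on current HotSpot
        System.out.println(capturing(1) == capturing(1));     // false: fresh instance per call
    }
}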
Here is a good, accessible in-depth article on the topic:
http://www.infoq.com/articles/Java-8-Lambdas-A-Peek-Under-the-Hood
You are passing a new instance to the forEach method. Every time you do that, you create a new object, but not one for every loop iteration. The iteration is done inside the forEach method using the same 'callback' object instance until the loop is done.
So the memory used by the loop does not depend on the size of the collection.
Isn't this equivalent to the 'old syntax' snippet?
Yes. There are slight differences at a very low level, but I don't think you should care about them. Lambda expressions use the invokedynamic feature instead of anonymous classes.

State of Lambda and Imperfections in Anonymous Classes

I was reading again Brian Goetz's document on the State of Lambda, where he details many of the reasons why Java needed lambda expressions.
In one of the paragraphs he wrote:
Given the increasing relevance of callbacks and other functional-style idioms, it is important that modeling code as data in Java be as lightweight as possible. In this respect, anonymous inner classes are imperfect for a number of reasons, primarily:
1. Bulky syntax
2. Confusion surrounding the meaning of names and this
3. Inflexible class-loading and instance-creation semantics
4. Inability to capture non-final local variables
5. Inability to abstract over control flow
From this list of imperfections I believe I understand reasonably well the items (1), (2) and (4).
But I have no clue of what exactly the problems are in (3) and (5).
Can anybody out there provide any examples of how these two could be an issue when using anonymous classes?
Not all the projects I work on are on Java 8 yet, so I think it is important to understand these shortcomings and, above all, to see clearly how things are better now with Java 8 lambdas. Also, since Brian was one of the leaders of Project Lambda, I thought it was worth my time to give some thought to what he meant by this; it could lead me to an epiphany :-)
Well 5. Inability to abstract over control flow is easy.
Lambdas are great for iterating over all the elements in a collection.
aCollection.forEach(myLambda)
The old way you would have to use for loops or Iterators or something similar.
for (....) {
    // same code as what's in the lambda
}
This is called external iteration: we have to tell the collection not only what to do with each element, but also how to get each element. This code iterates through all the objects sequentially, in order. Sometimes that isn't the best for performance reasons.
Lambdas allow the collection to do internal iteration instead: we only tell the collection what to do with each element. How each element is accessed, and in what order, is up to the Collection implementation, which can use internal implementation knowledge to do it in the most efficient way it can. It may even be parallel rather than sequential.
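A minimal side-by-side illustration (list contents invented):

import java.util.Arrays;
import java.util.List;

public class IterationStyles {
    public static void main(String[] args) {
        List<String> names = Arrays.asList("a", "b", "c");

        // External iteration: the caller dictates how and in what order
        // the elements are fetched.
        for (String s : names) {
            System.out.println(s);
        }

        // Internal iteration: the collection controls the traversal and
        // the caller only supplies the per-element action.
        names.forEach(s -> System.out.println(s));
    }
}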
3. Inflexible class-loading and instance-creation semantics
This is a lower-level issue concerning how anonymous classes are loaded and instantiated. I will point you to this article: http://www.infoq.com/articles/Java-8-Lambdas-A-Peek-Under-the-Hood
But basically
Anonymous classes require a new class file for each one (MyClass$1, etc.). This extra class has to be loaded. Lambdas don't produce new class files; their bytecode is created dynamically at runtime.
Future versions of Java may therefore be able to implement lambdas differently under the hood: because the lambda bytecode is generated at runtime, future versions can safely change how lambdas get created without breaking anything.
I also want to add another thing about (3). "Instance creation" might refer to the fact that when you create an instance of an anonymous class (new ...), just like when you create an instance of any class, you are guaranteed to get a new object, so the reference is guaranteed to compare unequal (!=) to a reference to any other object.
On the other hand, for lambdas, there is no guarantee that running a lambda expression twice will evaluate to two different objects. In particular, if the lambda doesn't capture any variables, then all instances of the lambda are functionally identical. In this case, it could just allocate one object statically and use it for the duration of the program. Allocating lots of objects is not cheap, so in the cases where it can avoid creating more objects, it makes the program more efficient.

Why does Scala implement for as a closure?

Recent events in the blogosphere have indicated that a possible performance problem with Scala is its use of closures to implement for.
What are the reasons for this design decision, as opposed to a C- or Java-style "primitive for", that is, one which will be turned into a simple loop?
(I'm making a distinction between Java's for and its "foreach" construct here, as the latter involves an implicit Iterator).
More detail, following up from Peter. This bit of Scala:
object ScratchFor {
  def main(args: Array[String]): Unit = {
    for (val s <- args) {
      println(s)
    }
  }
}
creates three classes: ScratchFor$$anonfun$main$1.class, ScratchFor$.class, and ScratchFor.class.
ScratchFor::main just forwards to the companion object, ScratchFor$.MODULE$::main, which spins up a ScratchFor$$anonfun$main$1 (an implementation of AbstractFunction1).
It's in the apply() method of this anonymous inner impl of AbstractFunction1 that the actual code lives, which is effectively the loop body.
I don't see HotSpot being able to rewrite this into a simple loop. Happy to be proved wrong on this, though.
Traditional for loops are clumsy, verbose and error-prone. I think it is proof enough of this that "for-each" loops were added to Java, C# and C++, but if you want more details you may check Item 46 of Effective Java.
Now, for-each loops are still much faster than Scala's for-comprehensions, but they are also much less powerful (and more clumsy), because they cannot return values. If you want to transform or filter a collection (or do both to a group of collections), you'll still have to handle all the mechanical details of constructing the result collection, in addition to computing the values. Not to mention that it inevitably uses some mutable state.
Finally, even though for-each loops are adequate for collections, they are not suited to other monadic classes (of which collections are a subset).
So Scala has a general method which takes care of all of the above. Yes, it is slower, but the goal is to have the compiler effectively optimise it well enough that this doesn't become a hindrance (and, of course, the JIT can help here as well).
That has not been accomplished to date, but -optimise has closed a lot of the ground between common for-each loops and for-comprehensions in the latest versions of Scala. If performance is essential, you can always use while or tail recursion.
Now, it would be possible for Scala to have common for loops or for-each loops as special cases specifically targeted at performance issues (since for-comprehensions can do everything they do). However, that would violate two principles that guide Scala's design:
Reduce complexity. Yes, contrary to what some say, that is a design goal, and special cases that serve no purpose other than optimising performance - even though a workable solution exists for performance-critical cases - would needlessly increase the complexity of the language.
Scalability. This is in the sense that the user can scale the language to any size of problem by writing libraries. The point here is that having the compiler optimise one particular class, such as Range, would make it impossible for the user to create a replacement class that performs just as well.
The for comprehension in Scala is a powerful general-purpose looping and pattern-matching construct. Look at what it can do:
case class Person(first: String, last: String)

val people = List(Person("Isaac", "Newton"), Person("Michael", "Jordan"))
val lastfirst = for (Person(f, l) <- people) yield l + ", " + f
for (n <- lastfirst) println(n)
The second case looks pretty straightforward: take each item in a collection and print it. But the first takes apart a list containing a custom data structure and transforms it into a different collection type!
The first for there highlights only a small portion of the capability of the construct; it is both extremely powerful and extremely general. In order to maintain this power, the for must be able to turn into something very general, which means closures. Then the question is: do you also introduce special cases that operate on known collections in simple ways with improved performance? The answer thus far has been mostly no, instead preferring solutions that optimize the general closure-taking methods that for turns into.
Whether this is useful for you in particular depends on whether you are using the general capabilities a lot (in which case you will be glad) or not (in which case you may wish progress was faster).
Still, try -optimise. It often usefully speeds up simple for-comprehensions these days.
The for-comprehension is much more than a simple loop.
If you need an imperative loop, use while. If you want to write performant code in Scala, you need to know this, just as you have to know about the language implementation when you want to write fast code in any other language.
So, since the for-comprehension is not a simple loop, I hope you understand that it's not compiled down to a simple loop.
I would assume using a closure is the general solution. A more optimal solution, in some cases, would be to "inline" the closure as a loop and eliminate the need to create an object. Perhaps the Scala designers feel the JIT should do this, rather than having the compiler do it.
Let's say in Java this is the same as writing
public static void main(String... args) {
    for_loop(args, new Function<String>() {
        public void apply(String s) {
            System.out.println(s);
        }
    });
}

interface Function<T> {
    void apply(T s);
}

// The array parameter has to be a plain T[] here: varargs are only
// allowed in the last parameter position.
public static <T> void for_loop(T[] ts, Function<T> tFunc) {
    for (T t : ts) tFunc.apply(t);
}
This is fairly easy to inline (if you're a human). What is surprising is that Scala doesn't have an intrinsic to perform this optimisation and eliminate the need for a new object. Certainly the JIT could do it in theory, but in practice it might be a while before it handles this specific case.
I'm surprised that no one has mentioned one of the pitfalls you can get into if for does not create a closure.
In Python for example:
ls = [None] * 3
for i in [0, 1, 2]:
ls[i] = lambda: i
print(ls[0]())
print(ls[1]())
print(ls[2]())
This prints 2 2 2, because i has a longer lifetime than the for loop. I run into this trap all the time in Python and R.
So even in the very simplest of cases, it is important that Scala's for is implemented using an anonymous function, because it creates a fresh environment in which to store variables.
