List.toArray(Object[]) performance [duplicate] - java

This question already has answers here:
.toArray(new MyClass[0]) or .toArray(new MyClass[myList.size()])?
(8 answers)
Closed 4 years ago.
I'm getting a List of A objects, then I use Apache Commons Collections 4 to transform it from a List of A instances into a List of B instances.
listOfBs = (List<B>) CollectionUtils.collect(listOfAs, componentTransformer);
However, eventually I need an array of Bs, not a List.
So my question is: which is faster?
Approach 1:
1. Convert the list using CollectionUtils.collect
2. Create an array using listOfBs.toArray(new B[listOfBs.size()])
Or approach 2:
1. Loop over the listOfAs
2. Transform each A object to a B object
3. Add each B object to an array (B[])
The difference between the two approaches is that the first needs much less code, but I'm not sure whether the toArray method hides a loop or other expensive operations.
With the second approach, I'm sure I'll loop only once over the listOfAs list.
So which approach is faster?

Don't be concerned about the performance of List.toArray(): its complexity is linear, as it boils down to a single loop internally.
In ArrayList it is implemented with Arrays.copyOf, which eventually calls System.arraycopy. Because that is implemented in native code, it can potentially be even faster than a Java-level loop.

A very interesting read is this article: http://shipilev.net/blog/2016/arrays-wisdom-ancients/#_conclusion
It goes into great detail about the different ways to convert a List to an array.
Conclusion: do not use listOfBs.toArray(new B[listOfBs.size()]) as you wrote, but use listOfBs.toArray(new B[0]).
Believe it or not, the article's measurements show this to be faster.
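The two approaches from the question can be sketched side by side as follows. This is only an illustration: Integer and String stand in for A and B, a plain java.util.function.Function stands in for the Commons Collections transformer, and the class and method names are made up for the example. It shows that both paths yield the same array; it does not measure anything.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

class ToArrayDemo {

    // Approach 1: build the intermediate list, then copy it into an array.
    static String[] viaList(List<Integer> listOfAs, Function<Integer, String> transformer) {
        List<String> listOfBs = new ArrayList<>(listOfAs.size());
        for (Integer a : listOfAs) {
            listOfBs.add(transformer.apply(a));
        }
        // The zero-length array only tells toArray which element type to allocate.
        return listOfBs.toArray(new String[0]);
    }

    // Approach 2: fill the target array directly in a single pass.
    static String[] direct(List<Integer> listOfAs, Function<Integer, String> transformer) {
        String[] result = new String[listOfAs.size()];
        int i = 0;
        for (Integer a : listOfAs) {
            result[i++] = transformer.apply(a);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Integer> listOfAs = Arrays.asList(1, 2, 3);
        Function<Integer, String> transformer = a -> "B" + a;
        System.out.println(Arrays.toString(viaList(listOfAs, transformer))); // [B1, B2, B3]
        System.out.println(Arrays.toString(direct(listOfAs, transformer)));  // [B1, B2, B3]
    }
}
```

Both variants are linear in the size of the list; the first one pays for one extra array copy, so a benchmark on the actual data is the only reliable way to tell whether the difference matters.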

Related

Stream.sorted() then collect, or collect then List.sort()? [duplicate]

This question already has answers here:
What is more efficient: sorted stream or sorting a list?
(4 answers)
Closed 4 years ago.
In general, is there a performance difference between these two pieces of code?
List<Integer> list1 = someStream1.sorted().collect(toList());
// vs.
List<Integer> list2 = someStream2.collect(toList());
list2.sort(Comparator.naturalOrder());
Variant 2 is obviously yucky and should be avoided, but I'm curious if there are any performance optimizations built into the mainstream (heh, mainstream) implementations of Stream that would result in a performance difference between these two.
I imagine that because the stream has strictly more information about the situation, it would have a better opportunity to optimize. E.g. I imagine if this had a findFirst() call tacked on, it would elide the sort, in favor of a min operation.
Both options should produce the same final result, but their runtime characteristics could differ. What if the initial stream is a parallel one? Then option 1 would sort in parallel, whereas option 2 would do a sequential sort. The result should be the same, but the overall runtime or CPU load could be very different.
I would definitely prefer option 1 over 2: why create a list first, to then later sort it?!
Imagine for example you later want to collect into an immutable list. Then all code that follows your second pattern would break. Whereas code written using pattern 1 wouldn't be affected at all!
Of course, in the example here that shouldn't lead to issues, but what if the sort() happens in a slightly different place?!
Conceptually, streams are usually looked at as "transient" data that's being processed/manipulated, and collecting the stream conveys the notion you're done manipulating it.
While the second snippet should work, the first one would be the more idiomatic way of doing things.
In the first case the sorting happens during the call to collect, because that terminal operation drives the pipeline. If the stream is already known to be sorted, the sorted() step will be a no-op (the data will just pass through as-is). It might not make a big difference, but calling Collections.sort on an already sorted collection is still O(n).
Also, the first case benefits from parallel execution when the stream is parallel, as at least OpenJDK then uses Arrays.parallelSort.
Apart from that the first line is cleaner, better to understand and less error prone when refactoring.
According to the documentation, the first sort is not guaranteed to be stable for unordered streams:
For ordered streams, the sort is stable. For unordered streams, no stability guarantees are made.
but the second one is a stable sort implementation:
This implementation is a stable, adaptive, iterative mergesort that requires far fewer than n lg(n) comparisons when the input array is partially sorted, while offering the performance of a traditional mergesort when the input array is randomly ordered. If the input array is nearly sorted, the implementation requires approximately n comparisons.
So the stability of the sort algorithm is one of the differences between these two ways of sorting a list.
The list you get back from Collectors.toList() is not guaranteed to be mutable. It might be an ArrayList or an ImmutableList; you cannot know. Therefore you must not try to modify that list.
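As a minimal sketch of the two variants discussed above (class and method names are invented for the example): the second variant collects with Collectors.toCollection(ArrayList::new), which is one way to get a list that is definitely mutable before sorting it in place.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class SortDemo {

    // Variant 1: sort inside the pipeline, then collect.
    static List<Integer> sortedThenCollect(Stream<Integer> stream) {
        return stream.sorted().collect(Collectors.toList());
    }

    // Variant 2: collect first, then sort in place.
    // toCollection(ArrayList::new) returns an ArrayList, so sorting it is safe.
    static List<Integer> collectThenSort(Stream<Integer> stream) {
        List<Integer> list = stream.collect(Collectors.toCollection(ArrayList::new));
        list.sort(Comparator.naturalOrder());
        return list;
    }

    public static void main(String[] args) {
        System.out.println(sortedThenCollect(Stream.of(3, 1, 2))); // [1, 2, 3]
        System.out.println(collectThenSort(Stream.of(3, 1, 2)));   // [1, 2, 3]
    }
}
```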

Can we use LinkedList even though there is a risk of memory overhead? In that case, isn't ArrayList the better option? [duplicate]

This question already has answers here:
When to use LinkedList over ArrayList in Java?
(33 answers)
Closed 7 years ago.
I know that ArrayList is better for search operations, and that LinkedList is better for insertion and deletion. But I have read that a linked list causes memory overhead. In that case, is it still safe to use LinkedList? If so, in what situations should we avoid LinkedList even though our logic consists mostly of insertions and deletions?
ArrayList is fast at random access and at iterating over elements, while LinkedList is fast at inserting or deleting once you already hold the position (for example at either end, or through an iterator). Which one is better depends on what you are doing.
You can also refer to this stackoverflow answer

When to use each Java Collections data structure [duplicate]

This question already has answers here:
When to use LinkedList over ArrayList in Java?
(33 answers)
Closed 8 years ago.
I see that there are a ton of generic data structures provided in Java. They all implement List, so they can be used almost interchangeably, but when would I want to use each? Personally, I stick to LinkedList because it's something I'm "familiar" with. I'm not asking for an explanation of every single structure, but can you explain some of the more common ones and give their uses, as well as compare and contrast the uses of "Vector-like" structures?
It depends on the performance characteristics and behavior you are looking for.
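For example, in a LinkedList add, delete, and retrieve are O(1), O(1), and O(n), where the O(1) figures assume you already hold the position (at either end, or through an iterator). For an ArrayList, adding or deleting at an arbitrary index is O(n), appending at the end is amortized O(1), get(int) is O(1), and looking an element up by value with indexOf(Object) or contains(Object) is O(n). However, ArrayList uses less memory than LinkedList per entry.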
One often uses Vector<type> to add elements to the structure that are part of the same collection, but do not have any relationship to other members (other than being part of the same collection). A LinkedList indicates that there is some sort of ordering that is important among the members of the collection.
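To make the trade-off concrete, here is a small sketch (the class and method names are hypothetical) that removes elements through an iterator. The same code runs on both list types; what differs is the cost per removal. A LinkedList only unlinks a node, whereas an ArrayList shifts every element that follows.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;

class ListChoiceDemo {

    // Removes every even element through the iterator.
    // LinkedList: each removal unlinks one node.
    // ArrayList: each removal shifts the remaining tail of the backing array.
    static void removeEvens(List<Integer> list) {
        Iterator<Integer> it = list.iterator();
        while (it.hasNext()) {
            if (it.next() % 2 == 0) {
                it.remove();
            }
        }
    }

    public static void main(String[] args) {
        List<Integer> linked = new LinkedList<>();
        List<Integer> array = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            linked.add(i);
            array.add(i);
        }
        removeEvens(linked);
        removeEvens(array);
        System.out.println(linked); // [1, 3, 5, 7, 9]
        System.out.println(array);  // [1, 3, 5, 7, 9]
    }
}
```

Whether the cheaper removal outweighs LinkedList's higher memory use per entry depends on list size and access pattern, so it is worth measuring before switching.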

What is the difference between multiple implementations of ArrayList in the (Java8) source code [duplicate]

This question already has answers here:
Arrays.asList() doubt?
(8 answers)
Closed 9 years ago.
I was trying to understand Streams in Java8 and intermittently I stumbled upon an interesting thing in the source code of Java8: ArrayList seems to be implemented twice:
The obvious one: java.util.ArrayList
The non-obvious one: java.util.Arrays.ArrayList, which is a private class.
One odd difference is that the normal version is way bigger, and implements List<E>, whereas Arrays.ArrayList does not do so (directly).
Why is it defined twice? And why with the same name?
Actually, it has been there ever since Arrays.asList() was introduced. Arrays.ArrayList is a view of the underlying array: if the array changes, the list is affected, and vice versa.
The main benefit is that no additional space is required, because the array is not copied into a new object (a java.util.ArrayList), and no additional time is spent copying the elements.
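The view behaviour can be checked with a short sketch. The helper isViewOf is made up for this example; it writes a probe value into the array and checks whether the list sees it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class AsListViewDemo {

    // Returns true if a write to the array shows up in the list, i.e. the list
    // is a view of the array rather than a copy. The array is restored afterwards.
    static boolean isViewOf(String[] array, List<String> list) {
        if (array.length == 0 || list.isEmpty()) {
            return false;
        }
        String original = array[0];
        String probe = new String("probe"); // fresh object, so identity comparison is meaningful
        array[0] = probe;
        boolean shared = list.get(0) == probe;
        array[0] = original;
        return shared;
    }

    public static void main(String[] args) {
        String[] array = {"a", "b", "c"};
        List<String> view = Arrays.asList(array);   // java.util.Arrays.ArrayList
        List<String> copy = new ArrayList<>(view);  // java.util.ArrayList

        System.out.println(isViewOf(array, view)); // true
        System.out.println(isViewOf(array, copy)); // false

        view.set(1, "y");                           // writes through to the array
        System.out.println(Arrays.toString(array)); // [a, y, c]

        try {
            view.add("d");                          // the view is fixed-size
        } catch (UnsupportedOperationException e) {
            System.out.println("add is not supported on the view");
        }
    }
}
```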

Array vs ArrayList in performance [duplicate]

This question already has answers here:
Array or List in Java. Which is faster?
(32 answers)
Closed 9 years ago.
Which one performs better: an array of type Object or an ArrayList of type Object?
Assume we have an array of Animal objects, Animal animal[], and an ArrayList, ArrayList<Animal> list.
Now I am doing animal[10] and list.get(10).
Which one should be faster, and why?
It is pretty obvious that animal[10] is faster than list.get(10), as the latter internally does the same array access but adds the overhead of a method call plus additional checks.
Modern JITs, however, will optimize this to such a degree that you rarely have to worry about it, unless you have a very performance-critical application and this has been measured to be your bottleneck.
From here:
ArrayList is internally backed by Array in Java, any resize operation in ArrayList will slow down performance as it involves creating new Array and copying content from old array to new array.
In terms of performance Array and ArrayList provides similar performance in terms of constant time for adding or getting element if you know index. Though automatic resize of ArrayList may slow down insertion a bit. Both Array and ArrayList is core concept of Java and any serious Java programmer must be familiar with these differences between Array and ArrayList or in more general Array vs List.
When deciding between an array and an ArrayList, your first instinct really shouldn't be to worry about performance, though they do perform differently. Your first concern should be whether or not you know the size of the array beforehand. If you don't, you would naturally go with an ArrayList, just for the functionality.
I agree with somebody's recently deleted post that the differences in performance are so small that, with very very few exceptions, (he got dinged for saying never) you should not make your design decision based upon that.
In your example, where the elements are Objects, the performance difference should be minimal.
If you are dealing with a large number of primitives, an array will offer significantly better performance, both in memory and time.
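The difference between the two representations can be illustrated as follows (names are invented for the example, and no timings are claimed). An int[] stores the values themselves, while a List<Integer> stores references to boxed objects that are unboxed on every read.

```java
import java.util.ArrayList;
import java.util.List;

class ArrayVsListDemo {

    // Primitive array: the ints are stored directly, no boxing involved.
    static long sum(int[] values) {
        long total = 0;
        for (int v : values) {
            total += v;
        }
        return total;
    }

    // List of boxed Integers: each element is an object reference,
    // and the loop unboxes it on every iteration.
    static long sum(List<Integer> values) {
        long total = 0;
        for (int v : values) {
            total += v;
        }
        return total;
    }

    public static void main(String[] args) {
        int n = 1000;
        int[] array = new int[n];
        List<Integer> list = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            array[i] = i;
            list.add(i); // autoboxing: int -> Integer
        }
        System.out.println(sum(array)); // 499500
        System.out.println(sum(list));  // 499500
    }
}
```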
Arrays are better in performance. ArrayList provides additional functionality such as "remove" at the cost of performance.
