I want to use a Java 8 lambda expression in the following scenario, but I am getting "Local variable fooCount defined in an enclosing scope must be final or effectively final". I understand what the error message says, but I need to calculate a percentage here, so I need to increment fooCount and barCount and then calculate the percentage. What's the way to achieve it:
// key is a String with values like "FOO;SomethinElse" and value is Long
final Map<String, Long> map = null;
....
private int calculateFooPercentage() {
long fooCount = 0L;
long barCount = 0L;
map.forEach((k, v) -> {
if (k.contains("FOO")) {
fooCount++;
} else {
barCount++;
}
});
final int fooPercentage = 0;
//Rest of the logic to calculate percentage
....
return fooPercentage;
}
One option I have is to use AtomicLong here instead of long, but I would like to avoid it so that later, if possible, I can use a parallel stream here.
There is a count method on Stream to do the counting for you:
long fooCount = map.keySet().stream().filter(k -> k.contains("FOO")).count();
long barCount = map.size() - fooCount;
If you want parallelisation, change .stream() to .parallelStream().
Alternatively, if you were trying to increment a variable manually, and use stream parallelisation, then you would want to use something like AtomicLong for thread safety. A simple variable, even if the compiler allowed it, would not be thread-safe.
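For illustration, a minimal sketch of that AtomicLong approach (shown only for comparison; the count-based solution above avoids the shared counters entirely):
// import java.util.concurrent.atomic.AtomicLong;
final AtomicLong fooCount = new AtomicLong();
final AtomicLong barCount = new AtomicLong();
map.forEach((k, v) -> {
    if (k.contains("FOO")) {
        fooCount.incrementAndGet();
    } else {
        barCount.incrementAndGet();
    }
});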
To get both numbers, matching and non-matching elements, you can use
Map<Boolean, Long> result = map.keySet().stream()
.collect(Collectors.partitioningBy(k -> k.contains("FOO"), Collectors.counting()));
long fooCount = result.get(true);
long barCount = result.get(false);
But since your source is a Map, which knows its total size, and you want to calculate a percentage, for which barCount is not needed, this specific task can be solved as
private int calculateFooPercentage() {
return (int)(map.keySet().stream().filter(k -> k.contains("FOO")).count()
*100/map.size());
}
Both variants are thread safe, i.e. changing stream() to parallelStream() will perform the operation in parallel, however, it’s unlikely that this operation will benefit from parallel processing. You would need humongous key strings or maps to get a benefit…
I agree with the other answers indicating you should use count or partitioningBy.
Just to explain the atomicity problem with an example, consider the following code:
private static AtomicInteger i1 = new AtomicInteger(0);
private static int i2 = 0;
public static void main(String[] args) {
IntStream.range(0, 100000).parallel().forEach(n -> i1.incrementAndGet());
System.out.println(i1);
IntStream.range(0, 100000).parallel().forEach(n -> i2++);
System.out.println(i2);
}
This returns the expected result of 100000 for i1 but an indeterminate number less than that (between 50000 and 80000 in my test runs) for i2. The reason should be pretty obvious.
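For completeness, the race-free way to get such a count in parallel is to let the stream do the reduction instead of mutating shared state (a small sketch, not part of the original example):
// Counting as a reduction needs no shared mutable state, so it is safe in parallel:
long count = IntStream.range(0, 100000).parallel().count();
System.out.println(count); // 100000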
I have a list of objects with many duplicates and some fields that need to be merged. I want to reduce this down to a list of unique objects using only Java 8 Streams (I know how to do this via old-skool means but this is an experiment.)
This is what I have right now. I don't really like this because the map-building seems extraneous and the values() collection is a view of the backing map, and you need to wrap it in a new ArrayList<>(...) to get a more specific collection. Is there a better approach, perhaps using the more general reduction operations?
#Test
public void reduce() {
Collection<Foo> foos = Stream.of("foo", "bar", "baz")
.flatMap(this::getfoos)
.collect(Collectors.toMap(f -> f.name, f -> f, (l, r) -> {
l.ids.addAll(r.ids);
return l;
})).values();
assertEquals(3, foos.size());
foos.forEach(f -> assertEquals(10, f.ids.size()));
}
private Stream<Foo> getfoos(String n) {
return IntStream.range(0,10).mapToObj(i -> new Foo(n, i));
}
public static class Foo {
private String name;
private List<Integer> ids = new ArrayList<>();
public Foo(String n, int i) {
name = n;
ids.add(i);
}
}
If you break the grouping and reducing steps up, you can get something cleaner:
Stream<Foo> input = Stream.of("foo", "bar", "baz").flatMap(this::getfoos);
Map<String, Optional<Foo>> collect = input.collect(Collectors.groupingBy(f -> f.name, Collectors.reducing(Foo::merge)));
Collection<Optional<Foo>> collected = collect.values();
This assumes a few convenience methods in your Foo class:
public Foo(String n, List<Integer> ids) {
this.name = n;
this.ids.addAll(ids);
}
public static Foo merge(Foo src, Foo dest) {
List<Integer> merged = new ArrayList<>();
merged.addAll(src.ids);
merged.addAll(dest.ids);
return new Foo(src.name, merged);
}
As already pointed out in the comments, a map is a very natural thing to use when you want to identify unique objects. If all you needed to do was find the unique objects, you could use the Stream::distinct method. This method hides the fact that there is a map involved, but apparently it does use a map internally, as hinted by this question that shows you should implement a hashCode method or distinct may not behave correctly.
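For comparison only, here is a small sketch of what Stream::distinct would give you if Foo compared by name (hypothetical: the question's Foo does not override equals/hashCode, and distinct drops duplicates instead of merging their ids):
List<Foo> uniqueByName = Stream.of("foo", "bar", "baz")
        .flatMap(this::getfoos)
        .distinct() // keeps the first Foo per name, discards the rest
        .collect(Collectors.toList());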
In the case of the distinct method, where no merging is necessary, it is possible to return some of the results before all of the input has been processed. In your case, unless you can make additional assumptions about the input that haven't been mentioned in the question, you do need to finish processing all of the input before you return any results. Thus this answer does use a map.
It is easy enough to use streams to process the values of the map and turn it back into an ArrayList, though. I show that in this answer, as well as providing a way to avoid the appearance of an Optional<Foo>, which shows up in one of the other answers.
public void reduce() {
ArrayList<Foo> foos = Stream.of("foo", "bar", "baz").flatMap(this::getfoos)
.collect(Collectors.collectingAndThen(Collectors.groupingBy(f -> f.name,
Collectors.reducing(Foo.identity(), Foo::merge)),
map -> map.values().stream().
collect(Collectors.toCollection(ArrayList::new))));
assertEquals(3, foos.size());
foos.forEach(f -> assertEquals(10, f.ids.size()));
}
private Stream<Foo> getfoos(String n) {
return IntStream.range(0, 10).mapToObj(i -> new Foo(n, i));
}
public static class Foo {
private String name;
private List<Integer> ids = new ArrayList<>();
private static final Foo BASE_FOO = new Foo("", 0);
public static Foo identity() {
return BASE_FOO;
}
// use only if side effects to the argument objects are okay
public static Foo merge(Foo fooOne, Foo fooTwo) {
if (fooOne == BASE_FOO) {
return fooTwo;
} else if (fooTwo == BASE_FOO) {
return fooOne;
}
fooOne.ids.addAll(fooTwo.ids);
return fooOne;
}
public Foo(String n, int i) {
name = n;
ids.add(i);
}
}
If the input elements are supplied in the random order, then having intermediate map is probably the best solution. However if you know in advance that all the foos with the same name are adjacent (this condition is actually met in your test), the algorithm can be greatly simplified: you just need to compare the current element with the previous one and merge them if the name is the same.
Unfortunately there's no Stream API method which would allow you to do such a thing easily and effectively. One possible solution is to write a custom collector like this:
public static List<Foo> withCollector(Stream<Foo> stream) {
return stream.collect(Collector.<Foo, List<Foo>>of(ArrayList::new,
(list, t) -> {
Foo f;
if(list.isEmpty() || !(f = list.get(list.size()-1)).name.equals(t.name))
list.add(t);
else
f.ids.addAll(t.ids);
},
(l1, l2) -> {
if(l1.isEmpty())
return l2;
if(l2.isEmpty())
return l1;
if(l1.get(l1.size()-1).name.equals(l2.get(0).name)) {
l1.get(l1.size()-1).ids.addAll(l2.get(0).ids);
l1.addAll(l2.subList(1, l2.size()));
} else {
l1.addAll(l2);
}
return l1;
}));
}
My tests show that this collector is always faster than collecting to map (up to 2x depending on average number of duplicate names), both in sequential and parallel mode.
Another approach is to use my StreamEx library which provides a bunch of "partial reduction" methods including collapse:
public static List<Foo> withStreamEx(Stream<Foo> stream) {
return StreamEx.of(stream)
.collapse((l, r) -> l.name.equals(r.name), (l, r) -> {
l.ids.addAll(r.ids);
return l;
}).toList();
}
This method accepts two arguments: a BiPredicate which is applied for two adjacent elements and should return true if elements should be merged and the BinaryOperator which performs merging. This solution is a little bit slower in sequential mode than the custom collector (in parallel the results are very similar), but it's still significantly faster than toMap solution and it's simpler and somewhat more flexible as collapse is an intermediate operation, so you can collect in another way.
Again, both these solutions work only if foos with the same name are known to be adjacent. It's a bad idea to sort the input stream by foo name and then use these solutions, because the sorting will drastically reduce the performance, making it slower than the toMap solution.
As already pointed out by others, an intermediate Map is unavoidable, as that’s the way of finding the objects to merge. Further, you should not modify source data during reduction.
Nevertheless, you can achieve both without creating multiple Foo instances:
List<Foo> foos = Stream.of("foo", "bar", "baz")
.flatMap(n->IntStream.range(0,10).mapToObj(i -> new Foo(n, i)))
.collect(collectingAndThen(groupingBy(f -> f.name),
m->m.entrySet().stream().map(e->new Foo(e.getKey(),
e.getValue().stream().flatMap(f->f.ids.stream()).collect(toList())))
.collect(toList())));
This assumes that you add a constructor
public Foo(String n, List<Integer> l) {
name = n;
ids=l;
}
to your Foo class, as it should have if Foo is really supposed to be capable of holding a list of IDs. As a side note, having a type which serves as a single item as well as a container for merged results seems unnatural to me. This is exactly why the code turns out to be so complicated.
If the source items had a single id, using something like groupingBy(f -> f.name, mapping(f -> f.id, toList())), followed by mapping the entries of (String, List<Integer>) to the merged items, would be sufficient.
Since this is not the case and Java 8 lacks the flatMapping collector, the flatmapping step is moved to the second step, making it look much more complicated.
But in both cases, the second step is not obsolete as it is where the result items are actually created and converting the map to the desired list type comes for free.
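For readers on Java 9 or later: the flatMapping collector mentioned above does exist there, so the grouping step can be written directly (a sketch of the idea, assuming the list-taking Foo constructor shown above; not part of the original answer):
Map<String, List<Integer>> idsByName = Stream.of("foo", "bar", "baz")
        .flatMap(n -> IntStream.range(0, 10).mapToObj(i -> new Foo(n, i)))
        .collect(Collectors.groupingBy(f -> f.name,
                Collectors.flatMapping(f -> f.ids.stream(), Collectors.toList())));
List<Foo> foos = idsByName.entrySet().stream()
        .map(e -> new Foo(e.getKey(), e.getValue()))
        .collect(Collectors.toList());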
Java 8 introduced a Stream class that resembles Scala's Stream, a powerful lazy construct using which it is possible to do something like this very concisely:
def from(n: Int): Stream[Int] = n #:: from(n+1)
def sieve(s: Stream[Int]): Stream[Int] = {
s.head #:: sieve(s.tail filter (_ % s.head != 0))
}
val primes = sieve(from(2))
primes takeWhile(_ < 1000) print // prints all primes less than 1000
I wondered if it is possible to do this in Java 8, so I wrote something like this:
IntStream from(int n) {
return IntStream.iterate(n, m -> m + 1);
}
IntStream sieve(IntStream s) {
int head = s.findFirst().getAsInt();
return IntStream.concat(IntStream.of(head), sieve(s.skip(1).filter(n -> n % head != 0)));
}
IntStream primes = sieve(from(2));
Fairly simple, but it produces java.lang.IllegalStateException: stream has already been operated upon or closed because both findFirst() and skip() are terminal operations on Stream which can be done only once.
I don't really have to use up the stream twice since all I need is the first number in the stream and the rest as another stream, i.e. equivalent of Scala's Stream.head and Stream.tail. Is there a method in Java 8 Stream that I can use to achieve this?
Thanks.
Even if you didn't have the problem that you can't split an IntStream, your code didn't work because you are invoking your sieve method recursively instead of lazily. So you had an infinite recursion before you could query your resulting stream for the first value.
Splitting an IntStream s into a head and a tail IntStream (which has not yet been consumed) is possible:
PrimitiveIterator.OfInt it = s.iterator();
int head = it.nextInt();
IntStream tail = IntStream.generate(it::next).filter(i -> i % head != 0);
At this place you need a construct of invoking sieve on the tail lazily. Stream does not provide that; concat expects existing stream instances as arguments and you can’t construct a stream invoking sieve lazily with a lambda expression as lazy creation works with mutable state only which lambda expressions do not support. If you don’t have a library implementation hiding the mutable state you have to use a mutable object. But once you accept the requirement of mutable state, the solution can be even easier than your first approach:
IntStream primes = from(2).filter(i -> p.test(i)).peek(i -> p = p.and(v -> v % i != 0));
IntPredicate p = x -> true;
IntStream from(int n)
{
return IntStream.iterate(n, m -> m + 1);
}
This will recursively create a filter but in the end it doesn’t matter whether you create a tree of IntPredicates or a tree of IntStreams (like with your IntStream.concat approach if it did work). If you don’t like the mutable instance field for the filter you can hide it in an inner class (but not in a lambda expression…).
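As an illustration of that last remark, the mutable predicate could be hidden in a small helper class like this (my sketch, not Holger's exact code):
// import java.util.function.IntPredicate; import java.util.stream.IntStream;
class PrimeSieve {
    private IntPredicate isPrime = x -> true; // the mutable state lives here, not in a lambda

    IntStream primes() {
        return IntStream.iterate(2, n -> n + 1)
                .filter(i -> isPrime.test(i))
                .peek(i -> isPrime = isPrime.and(v -> v % i != 0));
    }
}
// usage: new PrimeSieve().primes().limit(20).forEach(System.out::println);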
My StreamEx library has now headTail() operation which solves the problem:
public static StreamEx<Integer> sieve(StreamEx<Integer> input) {
return input.headTail((head, tail) ->
sieve(tail.filter(n -> n % head != 0)).prepend(head));
}
The headTail method takes a BiFunction which will be executed at most once during the stream terminal operation execution. So this implementation is lazy: it does not compute anything until traversal starts and computes only as much prime numbers as requested. The BiFunction receives a first stream element head and the stream of the rest elements tail and can modify the tail in any way it wants. You may use it with predefined input:
sieve(IntStreamEx.range(2, 1000).boxed()).forEach(System.out::println);
But infinite streams work as well:
sieve(StreamEx.iterate(2, x -> x+1)).takeWhile(x -> x < 1000)
.forEach(System.out::println);
// Not the primes till 1000, but 1000 first primes
sieve(StreamEx.iterate(2, x -> x+1)).limit(1000).forEach(System.out::println);
There's also alternative solution using headTail and predicate concatenation:
public static StreamEx<Integer> sieve(StreamEx<Integer> input, IntPredicate isPrime) {
return input.headTail((head, tail) -> isPrime.test(head)
? sieve(tail, isPrime.and(n -> n % head != 0)).prepend(head)
: sieve(tail, isPrime));
}
sieve(StreamEx.iterate(2, x -> x+1), i -> true).limit(1000).forEach(System.out::println);
It's interesting to compare the recursive solutions: how many primes are they capable of generating?
#John McClean solution (StreamUtils)
John McClean's solutions are not lazy: you cannot feed them with an infinite stream. So I just found by trial and error the maximal allowed upper bound (17793) (after that a StackOverflowError occurs):
public void sieveTest(){
sieve(IntStream.range(2, 17793).boxed()).forEach(System.out::println);
}
#John McClean solution (Streamable)
public void sieveTest2(){
sieve(Streamable.range(2, 39990)).forEach(System.out::println);
}
Increasing upper limit above 39990 results in StackOverflowError.
#frhack solution (LazySeq)
LazySeq<Integer> ints = integers(2);
LazySeq primes = sieve(ints); // sieve method from #frhack answer
primes.forEach(p -> System.out.println(p));
Result: stuck after prime number = 53327 with enormous heap allocation and garbage collection taking more than 90%. It took several minutes to advance from 53323 to 53327, so waiting more seems impractical.
#vidi solution
Prime.stream().forEach(System.out::println);
Result: StackOverflowError after prime number = 134417.
My solution (StreamEx)
sieve(StreamEx.iterate(2, x -> x+1)).forEach(System.out::println);
Result: StackOverflowError after prime number = 236167.
#frhack solution (rxjava)
Observable<Integer> primes = Observable.from(()->primesStream.iterator());
primes.forEach((x) -> System.out.println(x.toString()));
Result: StackOverflowError after prime number = 367663.
#Holger solution
IntStream primes=from(2).filter(i->p.test(i)).peek(i->p=p.and(v->v%i!=0));
primes.forEach(System.out::println);
Result: StackOverflowError after prime number = 368089.
My solution (StreamEx with predicate concatenation)
sieve(StreamEx.iterate(2, x -> x+1), i -> true).forEach(System.out::println);
Result: StackOverflowError after prime number = 368287.
So the three solutions involving predicate concatenation win, because each new condition adds only two more stack frames. I think the difference between them is marginal and should not be considered to define a winner. However, I like my first StreamEx solution more, as it is more similar to the Scala code.
The solution below does not do state mutations, except for the head/tail deconstruction of the stream.
The laziness is obtained using IntStream.iterate. The class Prime is used to keep the generator state.
import java.util.PrimitiveIterator;
import java.util.stream.IntStream;
import java.util.stream.Stream;
public class Prime {
private final IntStream candidates;
private final int current;
private Prime(int current, IntStream candidates)
{
this.current = current;
this.candidates = candidates;
}
private Prime next()
{
PrimitiveIterator.OfInt it = candidates.filter(n -> n % current != 0).iterator();
int head = it.next();
IntStream tail = IntStream.generate(it::next);
return new Prime(head, tail);
}
public static Stream<Integer> stream() {
IntStream possiblePrimes = IntStream.iterate(3, i -> i + 1);
return Stream.iterate(new Prime(2, possiblePrimes), Prime::next)
.map(p -> p.current);
}
}
The usage would be this:
Stream<Integer> first10Primes = Prime.stream().limit(10);
You can essentially implement it like this:
static <T> Tuple2<Optional<T>, Seq<T>> splitAtHead(Stream<T> stream) {
Iterator<T> it = stream.iterator();
return tuple(it.hasNext() ? Optional.of(it.next()) : Optional.empty(), seq(it));
}
In the above example, Tuple2 and Seq are types borrowed from jOOλ, a library that we developed for jOOQ integration tests. If you don't want any additional dependencies, you might as well implement them yourself:
class Tuple2<T1, T2> {
final T1 v1;
final T2 v2;
Tuple2(T1 v1, T2 v2) {
this.v1 = v1;
this.v2 = v2;
}
static <T1, T2> Tuple2<T1, T2> tuple(T1 v1, T2 v2) {
return new Tuple2<>(v1, v2);
}
}
static <T> Tuple2<Optional<T>, Stream<T>> splitAtHead(Stream<T> stream) {
Iterator<T> it = stream.iterator();
return tuple(
it.hasNext() ? Optional.of(it.next()) : Optional.empty(),
StreamSupport.stream(Spliterators.spliteratorUnknownSize(
it, Spliterator.ORDERED
), false)
);
}
If you don't mind using a 3rd-party library, cyclops-streams, a library I wrote, has a number of potential solutions.
The StreamUtils class has large number of static methods for working directly with java.util.stream.Streams including headAndTail.
HeadAndTail<Integer> headAndTail = StreamUtils.headAndTail(Stream.of(1,2,3,4));
int head = headAndTail.head(); //1
Stream<Integer> tail = headAndTail.tail(); //Stream[2,3,4]
The Streamable class represents a replayable Stream and works by building a lazy, caching intermediate data structure. Because it is caching and replayable, head and tail can be implemented directly and separately.
Streamable<Integer> replayable= Streamable.fromStream(Stream.of(1,2,3,4));
int head = replayable.head(); //1
Stream<Integer> tail = replayable.tail(); //Stream[2,3,4]
cyclops-streams also provides a sequential Stream extension that in turn extends jOOλ and has both Tuple based (from jOOλ) and domain object (HeadAndTail) solutions for head and tail extraction.
SequenceM.of(1,2,3,4)
.splitAtHead(); //Tuple[1,SequenceM[2,3,4]]
SequenceM.of(1,2,3,4)
.headAndTail();
Update per Tagir's request -> A Java version of the Scala sieve using SequenceM
public void sieveTest(){
sieve(SequenceM.range(2, 1_000)).forEach(System.out::println);
}
SequenceM<Integer> sieve(SequenceM<Integer> s){
return s.headAndTailOptional().map(ht ->SequenceM.of(ht.head())
.appendStream(sieve(ht.tail().filter(n -> n % ht.head() != 0))))
.orElse(SequenceM.of());
}
And another version via Streamable
public void sieveTest2(){
sieve(Streamable.range(2, 1_000)).forEach(System.out::println);
}
Streamable<Integer> sieve(Streamable<Integer> s){
return s.size()==0? Streamable.of() : Streamable.of(s.head())
.appendStreamable(sieve(s.tail()
.filter(n -> n % s.head() != 0)));
}
Note - neither Streamable nor SequenceM has an Empty implementation - hence the size check for Streamable and the use of headAndTailOptional.
Finally a version using plain java.util.stream.Stream
import static com.aol.cyclops.streams.StreamUtils.headAndTailOptional;
public void sieveTest(){
sieve(IntStream.range(2, 1_000).boxed()).forEach(System.out::println);
}
Stream<Integer> sieve(Stream<Integer> s){
return headAndTailOptional(s).map(ht ->Stream.concat(Stream.of(ht.head())
,sieve(ht.tail().filter(n -> n % ht.head() != 0))))
.orElse(Stream.of());
}
Another update - a lazy iterative solution based on #Holger's version, using objects rather than primitives (note that a primitive version is also possible):
final Mutable<Predicate<Integer>> predicate = Mutable.of(x->true);
SequenceM.iterate(2, n->n+1)
.filter(i->predicate.get().test(i))
.peek(i->predicate.mutate(p-> p.and(v -> v%i!=0)))
.limit(100000)
.forEach(System.out::println);
There are many interesting suggestions provided here, but if someone needs a solution without dependencies on third-party libraries, I came up with this:
import java.util.AbstractMap;
import java.util.Iterator;
import java.util.Map;
import java.util.Optional;
import java.util.Spliterators;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;
/**
 * Splits a stream into the head element and a tail stream.
 * Parallel streams are not supported.
 *
 * @param stream Stream to split.
 * @param <T> Type of the input stream.
 * @return A map entry where {@link Map.Entry#getKey()} contains an
 * optional with the first element (head) of the original stream
 * and {@link Map.Entry#getValue()} the tail of the original stream.
 * @throws IllegalArgumentException for parallel streams.
 */
public static <T> Map.Entry<Optional<T>, Stream<T>> headAndTail(final Stream<T> stream) {
    if (stream.isParallel()) {
        throw new IllegalArgumentException("parallel streams are not supported");
    }
    final Iterator<T> iterator = stream.iterator();
    return new AbstractMap.SimpleImmutableEntry<>(
            iterator.hasNext() ? Optional.of(iterator.next()) : Optional.empty(),
            StreamSupport.stream(Spliterators.spliteratorUnknownSize(iterator, 0), false)
    );
}
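A quick usage example of the helper above (my addition):
Map.Entry<Optional<Integer>, Stream<Integer>> split = headAndTail(Stream.of(1, 2, 3, 4));
System.out.println(split.getKey());            // Optional[1]
split.getValue().forEach(System.out::println); // 2, 3, 4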
To get head and tail you need a lazy Stream implementation. Java 8 streams and RxJava are not suitable.
You can use for example LazySeq as follows.
Lazy sequence is always traversed from the beginning using very cheap first/rest decomposition (head() and tail()). LazySeq implements java.util.List interface, thus can be used in variety of places. Moreover it also implements Java 8 enhancements to collections, namely streams and collectors.
package com.company;
import com.nurkiewicz.lazyseq.LazySeq;
public class Main {
public static void main(String[] args) {
LazySeq<Integer> ints = integers(2);
LazySeq primes = sieve(ints);
primes.take(10).forEach(p -> System.out.println(p));
}
private static LazySeq<Integer> sieve(LazySeq<Integer> s) {
return LazySeq.cons(s.head(), () -> sieve(s.filter(x -> x % s.head() != 0)));
}
private static LazySeq<Integer> integers(int from) {
return LazySeq.cons(from, () -> integers(from + 1));
}
}
Here is another recipe using the way suggested by Holger.
It uses RxJava just to add the possibility of using the take(int) method and many others.
package com.company;
import rx.Observable;
import java.util.function.IntPredicate;
import java.util.stream.IntStream;
public class Main {
public static void main(String[] args) {
final IntPredicate[] p={(x)->true};
IntStream primesStream=IntStream.iterate(2,n->n+1).filter(i -> p[0].test(i)).peek(i->p[0]=p[0].and(v->v%i!=0) );
Observable primes = Observable.from(()->primesStream.iterator());
primes.take(10).forEach((x) -> System.out.println(x.toString()));
}
}
This should work with parallel streams as well:
public static <T> Map.Entry<Optional<T>, Stream<T>> headAndTail(final Stream<T> stream) {
final AtomicReference<Optional<T>> head = new AtomicReference<>(Optional.empty());
final var spliterator = stream.spliterator();
spliterator.tryAdvance(x -> head.set(Optional.of(x)));
return Map.entry(head.get(), StreamSupport.stream(spliterator, stream.isParallel()));
}
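For example (my illustration), splitting a parallel stream this way keeps the tail parallel:
Map.Entry<Optional<Integer>, Stream<Integer>> split =
        headAndTail(Stream.of(1, 2, 3, 4).parallel());
System.out.println(split.getKey());           // Optional[1]
System.out.println(split.getValue().count()); // 3, computed on a parallel stream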
If you want to get head of a stream, just:
IntStream.range(1, 5).first();
If you want to get tail of a stream, just:
IntStream.range(1, 5).skip(1);
If you want to get both head and tail of a stream, just:
IntStream s = IntStream.range(1, 5);
int head = s.head();
IntStream tail = s.tail();
If you want to find the prime, just:
LongStream.range(2, n)
.filter(i -> LongStream.range(2, (long) Math.sqrt(i) + 1).noneMatch(j -> i % j == 0))
.forEach(N::println);
If you want to know more, check out abacus-common.
Declaration: I'm the developer of abacus-common.
package com.spse.pricing.client.main;
import java.util.stream.IntStream;
public class NestedParalleStream {
int total = 0;
public static void main(String[] args) {
NestedParalleStream nestedParalleStream = new NestedParalleStream();
nestedParalleStream.test();
}
void test(){
try{
IntStream stream1 = IntStream.range(0, 2);
stream1.parallel().forEach(a ->{
IntStream stream2 = IntStream.range(0, 2);
stream2.parallel().forEach(b ->{
IntStream stream3 = IntStream.range(0, 2);
stream3.parallel().forEach(c ->{
//2 * 2 * 2 = 8;
total ++;
});
});
});
//It should display 8
System.out.println(total);
}catch(Exception e){
e.printStackTrace();
}
}
}
Please help: how can I customize the parallel stream to make sure we get consistent results?
Since multiple threads are incrementing total, you must declare it volatile to avoid race conditions
Edit: volatile makes read/write operations atomic, but total++ requires more than one operation. For that reason, you should use an AtomicInteger:
AtomicInteger total = new AtomicInteger();
...
total.incrementAndGet();
The problem is in the statement total++; it is invoked from multiple threads simultaneously.
You should protect it with synchronized or use an AtomicInteger.
LongAdder or LongAccumulator are preferable to AtomicLong or AtomicInteger where multiple threads are mutating the value and it's intended to be read relatively few times, such as once at the end of the computation. The adder/accumulator objects avoid contention problems that can occur with the atomic objects. (There are corresponding adder/accumulator objects for double values.)
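For instance, the nested loops from the question could use a LongAdder like this (a sketch, assuming the same 2 * 2 * 2 ranges):
// import java.util.concurrent.atomic.LongAdder;
LongAdder total = new LongAdder();
IntStream.range(0, 2).parallel().forEach(a ->
        IntStream.range(0, 2).parallel().forEach(b ->
                IntStream.range(0, 2).parallel().forEach(c -> total.increment())));
System.out.println(total.sum()); // 8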
There is usually a way to rewrite accumulations using reduce() or collect(). These are often preferable, especially if the value being accumulated (or collected) isn't a long or a double.
There is a major problem regarding mutability with the way you are solving it. A better way to solve it the way you want would be as follows:
int total = IntStream.range(0,2)
.parallel()
.map(i -> {
return IntStream.range(0,2)
.map(j -> {
return IntStream.range(0,2)
.map(k -> i * j * k)
.reduce(0,(acc, val) -> acc + 1);
}).sum();
}).sum();
I have a class that has (among other things):
public class TimeSeries {
private final NavigableMap<LocalDate, Double> prices;
public TimeSeries() { prices = new TreeMap<>(); }
private TimeSeries(NavigableMap<LocalDate, Double> prices) {
this.prices = prices;
}
public void add(LocalDate date, double price) { prices.put(date, price); }
public Set<LocalDate> dates() { return prices.keySet(); }
//the 2 methods below are examples of why I need a TreeMap
public double lastPriceAt(LocalDate date) {
Map.Entry<LocalDate, Double> price = prices.floorEntry(date);
return price.getValue(); //after some null checks
}
public TimeSeries between(LocalDate from, LocalDate to) {
return new TimeSeries(this.prices.subMap(from, true, to, true));
}
}
Now I need to have a "filtered" view on the map where only some of the dates are available. To that effect I have added the following method:
public TimeSeries onDates(Set<LocalDate> retainDates) {
TimeSeries filtered = new TimeSeries(new TreeMap<> (this.prices));
filtered.dates().retainAll(retainDates);
return filtered;
}
The onDates method is a huge performance bottleneck, representing 85% of the processing time of the program. And since the program is running millions of simulations, that means hours spent in that method.
How could I improve the performance of that method?
I'd give ImmutableSortedMap a try, assuming you can use it. It's based on a sorted array rather than a balanced tree, so I guess its overhead is much smaller(*). For building it, you need to employ biziclop's idea, as the builder supports no removals.
(*) There's a call to Collections.sort there, but it should be harmless as the collection is already sorted and TimSort is optimized for such a case.
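A rough sketch of combining those two ideas, filtering while building since the builder supports no removals (assumes Guava's ImmutableSortedMap; not the author's code):
ImmutableSortedMap.Builder<LocalDate, Double> builder = ImmutableSortedMap.naturalOrder();
prices.forEach((date, price) -> {
    if (retainDates.contains(date)) {
        builder.put(date, price);
    }
});
NavigableMap<LocalDate, Double> filtered = builder.build(); // ImmutableSortedMap implements NavigableMap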
In case your original map doesn't change after creating onDates, maybe a view could help. In case it does, you'd need some "persistent" map, which sounds rather complicated.
Maybe some hacky solution based on sorted arrays and binary search could be fastest, maybe you could even convert LocalDate first to int and then to double and put everything into a single interleaved double[] in order to save memory (and hopefully also time). You'd need your own binary search, but this is rather trivial.
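To make the sorted-array idea concrete, a rough sketch (my own illustration; the parallel arrays and the floor lookup replace the TreeMap):
// Dates stored as epoch days in one sorted long[], prices in a parallel double[].
long[] days = prices.keySet().stream().mapToLong(LocalDate::toEpochDay).toArray();
double[] vals = prices.values().stream().mapToDouble(Double::doubleValue).toArray();

double lastPriceAt(LocalDate date) {
    int idx = Arrays.binarySearch(days, date.toEpochDay());
    if (idx < 0) {
        idx = -idx - 2; // insertion point minus one, i.e. the floor entry
    }
    return vals[idx];   // a negative idx here means no date on or before 'date'
}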
The view idea is rather simple, assuming that
you don't need all NavigableMap methods, but just a couple of methods
the original map doesn't change
only a few elements are missing in retainDates
An example method:
public double lastPriceAt(LocalDate date) {
Map.Entry<LocalDate, Double> price = prices.floorEntry(date);
while (!retainDates.contains(price.getKey())) {
price = prices.lowerEntry(price.getKey()); // after some null checks
}
return price.getValue(); // after some null checks
}
The simplest optimisation:
public TimeSeries onDates(Set<LocalDate> retainDates) {
TreeMap<LocalDate, Double> filteredPrices = new TreeMap<>();
for (Entry<LocalDate, Double> entry : prices.entrySet() ) {
if (retainDates.contains( entry.getKey() ) ) {
filteredPrices.put( entry.getKey(), entry.getValue() );
}
}
TimeSeries filtered = new TimeSeries( filteredPrices );
return filtered;
}
Saves you the cost of creating a full copy of your map first, then iterating across the copy again to filter.