I've got a list of input elements which I want to queue into several ThreadPools. Let's say this is my input:
final List<Integer> ints = Stream.iterate(1, i -> i + 1).limit(100).collect(Collectors.toList());
These are the three functions I want the elements to run through each after another:
final Function<Integer, Integer> step1 =
value -> { // input from the ints list
return value * 2;
};
final Function<Integer, Double> step2 =
value -> { // input from the previous step1
return (double) (value * 2); //
};
final Function<Double, String> step3 =
value -> { // input from the previous step2
return "Result: " + value * 2;
};
And these would be the pools for each step:
final ExecutorService step1Pool = Executors.newFixedThreadPool(4);
final ExecutorService step2Pool = Executors.newFixedThreadPool(3);
final ExecutorService step3Pool = Executors.newFixedThreadPool(1);
I want each element to run through step1Pool and apply the step1. As soon as one element is done its result should
end up in step2pool so that step2 can be applied here. As soon as something in step2Pool is done it should be
queued in step3Pool and step3 should be applied.
On my main thread I want to wait until I have all the results from step3. The order in which each element is processed
doesn't matter. Only that they all run through step1 -> step2 -> step3 on the correct thread pool.
Basically I want to parallelize the Stream.map, push each result immediately to the next queue and wait until I've
got ints.size() results from my last thread pool back.
Is there a simple way to do achieve in Java?
I believe that CompletableFuture will help you here!
List<CompletableFuture<String>> futures = ints.stream()
.map(i -> CompletableFuture.supplyAsync(() -> step1.apply(i), step1Pool)
.thenApplyAsync(step2, step2Pool)
.thenApplyAsync(step3, step3Pool))
.collect(Collectors.toList());
List<String> result = futures.stream()
.map(CompletableFuture::join)
.collect(Collectors.toList());
Better use streams for that :
List<String> stringList = Stream.iterate(1, i -> i + 1)
.limit(100)
.parallel()
.map(step1)
.map(step2)
.map(step3)
.collect(Collectors.toList());
Related
I need to combine the results from two reactive Publishers - Mono and Flux. I tried to do it with zip and join functions, but I was not able to fulfil two specific conditions:
result should contain as many element as Flux emits, but corresponding Mono source should be called only once (this condition alone can be implemented with join)
when Flux is empty, then chain should complete without waiting for Mono element
The solution for the first condition is presented in the Combine Mono with Flux entry (pasted below). But I was not able to achieve second condition without blocking the chain - and that I would like to avoid.
Flux<Integer> flux = Flux.concat(Mono.just(1).delayElement(Duration.ofMillis(100)),
Mono.just(2).delayElement(Duration.ofMillis(500))).log();
Mono<String> mono = Mono.just("a").delayElement(Duration.ofMillis(50)).log();
List<String> list = flux.join(mono, (v1) -> Flux.never(), (v2) -> Flux.never(), (x, y) -> {
return x + y;
}).collectList().block();
System.out.println(list);
If you want to cancel the entire operation if Flux is empty you could do the below
Flux<Integer> flux = Flux.concat(Mono.just(1).delayElement(Duration.ofMillis(100)),
Mono.just(2).delayElement(Duration.ofMillis(500))).log();
//Uncomment below and comment out above statement for empty flux
//Flux<Integer> flux = Flux.empty();
Mono<String> mono = Mono.just("a").delayElement(Duration.ofMillis(5000)).log();
//Throw exception if flux is empty
flux = flux.switchIfEmpty(Flux.error(IllegalStateException::new));
List<String> list = flux
.join(mono, s -> Flux.never() , s -> Flux.never(), (x, y) -> x + y)
//Catch exception and return nothing
.onErrorResume(s -> Flux.empty())
.collectList().block();
System.out.println(list);
You could do something like the following if you want Mono to complete but don't want join to hang
DirectProcessor<Integer> processor = DirectProcessor.create();
//Could omit sink, and use processor::onComplete in place of sink::complete
//But typically recommended as provides better thread safety
FluxSink<Integer> sink = processor.serialize().sink();
Flux<Integer> flux = Flux.concat(Mono.just(1).delayElement(Duration.ofMillis(100)),
Mono.just(2).delayElement(Duration.ofMillis(500))).log();
//Uncomment below and comment out above statement for empty flux
//Flux<Integer> flux = Flux.empty();
Mono<String> mono = Mono.just("a").delayElement(Duration.ofMillis(5000)).log();
List<String> list = flux
.doOnComplete(sink::complete)
.join(mono, s -> processor , s -> processor, (x, y) -> x + y).collectList().block();
System.out.println(list);
We have an async method:
public CompletableFuture<OlderCat> asyncGetOlderCat(String catName)
Given a list of Cats:
List<Cat> cats;
We like to create a bulk operation that will result in a map between the cat name and its async result:
public CompletableFuture<Map<String, OlderCat>>
We also like that if an exception was thrown from the asyncGetOlderCat, the cat will not be added to the map.
We were following this post and also this one and we came up with this code:
List<Cat> cats = ...
Map<String, CompletableFuture<OlderCat>> completableFutures = cats
.stream()
.collect(Collectors.toMap(Cat::getName,
c -> asynceGetOlderCat(c.getName())
.exceptionally( ex -> /* null?? */ ))
));
CompletableFuture<Void> allFutures = CompletableFuture
.allOf(completableFutures.values().toArray(new CompletableFuture[completableFutures.size()]));
return allFutures.thenApply(future -> completableFutures.keySet().stream()
.map(CompletableFuture::join) ???
.collect(Collectors.toMap(????)));
But it is not clear how in the allFutureswe can get access to the cat name and how to match between the OlderCat & the catName.
Can it be achieved?
You are almost there. You don't need to put an exceptionally() on the initial futures, but you should use handle() instead of thenApply() after the allOf(), because if any future fails, the allOf() will fail as well.
When processing the futures, you can then just filter out the failing ones from the result, and rebuild the expected map:
Map<String, CompletableFuture<OlderCat>> completableFutures = cats
.stream()
.collect(toMap(Cat::getName, c -> asyncGetOlderCat(c.getName())));
CompletableFuture<Void> allFutures = CompletableFuture
.allOf(completableFutures.values().toArray(new CompletableFuture[0]));
return allFutures.handle((dummy, ex) ->
completableFutures.entrySet().stream()
.filter(entry -> !entry.getValue().isCompletedExceptionally())
.collect(toMap(Map.Entry::getKey, e -> e.getValue().join())));
Note that the calls to join() are guaranteed to be non-blocking since the thenApply() will only be executed after all futures are completed.
As I get it, what you need is CompletableFuture with all results, the code below does exactly what you need
public CompletableFuture<Map<String, OlderCat>> getOlderCats(List<Cat> cats) {
return CompletableFuture.supplyAsync(
() -> {
Map<String, CompletableFuture<OlderCat>> completableFutures = cats
.stream()
.collect(Collectors.toMap(Cat::getName,
c -> asyncGetOlderCat(c.getName())
.exceptionally(ex -> {
ex.printStackTrace();
// if exception happens - return null
// if you don't want null - save failed ones to separate list and process them separately
return null;
}))
);
return completableFutures
.entrySet()
.stream()
.collect(Collectors.toMap(
Map.Entry::getKey,
e -> e.getValue().join()
));
}
);
}
What it does here - returns future, which creates more completable future inside and waits at the end.
This question already has answers here:
Copy a stream to avoid "stream has already been operated upon or closed"
(10 answers)
Closed 3 years ago.
I'm learning the new Java 8 features, and while experimenting with streams (java.util.stream.Stream) and collectors, I realized that a stream can't be used twice.
Is there any way to reuse it?
If you want to have the effect of reusing a stream, you might wrap the stream expression in a Supplier and call myStreamSupplier.get() whenever you want a fresh one. For example:
Supplier<Stream<String>> sup = () -> someList.stream();
List<String> nonEmptyStrings = sup.get().filter(s -> !s.isEmpty()).collect(Collectors.toList());
Set<String> uniqueStrings = sup.get().collect(Collectors.toSet());
From the documentation:
A stream should be operated on (invoking an intermediate or terminal stream operation) only once.
A stream implementation may throw IllegalStateException if it detects that the stream is being reused.
So the answer is no, streams are not meant to be reused.
As others have said, "no you can't".
But it's useful to remember the handy summaryStatistics() for many basic operations:
So instead of:
List<Person> personList = getPersons();
personList.stream().mapToInt(p -> p.getAge()).average().getAsDouble();
personList.stream().mapToInt(p -> p.getAge()).min().getAsInt();
personList.stream().mapToInt(p -> p.getAge()).max().getAsInt();
You can:
// Can also be DoubleSummaryStatistics from mapToDouble()
IntSummaryStatistics stats = personList.stream()
.mapToInt(p-> p.getAge())
.summaryStatistics();
stats.getAverage();
stats.getMin();
stats.getMax();
The whole idea of the Stream is that it's once-off. This allows you to create non-reenterable sources (for example, reading the lines from the network connection) without intermediate storage. If you, however, want to reuse the Stream content, you may dump it into the intermediate collection to get the "hard copy":
Stream<MyType> stream = // get the stream from somewhere
List<MyType> list = stream.collect(Collectors.toList()); // materialize the stream contents
list.stream().doSomething // create a new stream from the list
list.stream().doSomethingElse // create one more stream from the list
If you don't want to materialize the stream, in some cases there are ways to do several things with the same stream at once. For example, you may refer to this or this question for details.
As others have noted the stream object itself cannot be reused.
But one way to get the effect of reusing a stream is to extract the stream creation code to a function.
You can do this by creating a method or a function object which contains the stream creation code. You can then use it multiple times.
Example:
public static void main(String[] args) {
List<Integer> list = Arrays.asList(1, 2, 3, 4, 5);
// The normal way to use a stream:
List<String> result1 = list.stream()
.filter(i -> i % 2 == 1)
.map(i -> i * i)
.limit(10)
.map(i -> "i :" + i)
.collect(toList());
// The stream operation can be extracted to a local function to
// be reused on multiple sources:
Function<List<Integer>, List<String>> listOperation = l -> l.stream()
.filter(i -> i % 2 == 1)
.map(i -> i * i)
.limit(10)
.map(i -> "i :" + i)
.collect(toList());
List<String> result2 = listOperation.apply(list);
List<String> result3 = listOperation.apply(Arrays.asList(1, 2, 3));
// Or the stream operation can be extracted to a static method,
// if it doesn't refer to any local variables:
List<String> result4 = streamMethod(list);
// The stream operation can also have Stream as argument and return value,
// so that it can be used as a component of a longer stream pipeline:
Function<Stream<Integer>, Stream<String>> streamOperation = s -> s
.filter(i -> i % 2 == 1)
.map(i -> i * i)
.limit(10)
.map(i -> "i :" + i);
List<String> result5 = streamOperation.apply(list.stream().map(i -> i * 2))
.filter(s -> s.length() < 7)
.sorted()
.collect(toCollection(LinkedList::new));
}
public static List<String> streamMethod(List<Integer> l) {
return l.stream()
.filter(i -> i % 2 == 1)
.map(i -> i * i)
.limit(10)
.map(i -> "i :" + i)
.collect(toList());
}
If, on the other hand, you already have a stream object which you want to iterate over multiple times, then you must save the content of the stream in some collection object.
You can then get multiple streams with the same content from than collection.
Example:
public void test(Stream<Integer> stream) {
// Create a copy of the stream elements
List<Integer> streamCopy = stream.collect(toList());
// Use the copy to get multiple streams
List<Integer> result1 = streamCopy.stream() ...
List<Integer> result2 = streamCopy.stream() ...
}
Come to think of it, this will of "reusing" a stream is just the will of carry out the desired result with a nice inline operation. So, basically, what we're talking about here, is what can we do to keep on processing after we wrote a terminal operation?
1) if your terminal operation returns a collection, the problem is solved right away, since every collection can be turned back into a stream (JDK 8).
List<Integer> l=Arrays.asList(5,10,14);
l.stream()
.filter(nth-> nth>5)
.collect(Collectors.toList())
.stream()
.filter(nth-> nth%2==0).forEach(nth-> System.out.println(nth));
2) if your terminal operations returns an optional, with JDK 9 enhancements to Optional class, you can turn the Optional result into a stream, and obtain the desired nice inline operation:
List<Integer> l=Arrays.asList(5,10,14);
l.stream()
.filter(nth-> nth>5)
.findAny()
.stream()
.filter(nth-> nth%2==0).forEach(nth-> System.out.println(nth));
3) if your terminal operation returns something else, i really doubt that you should consider a stream to process such result:
List<Integer> l=Arrays.asList(5,10,14);
boolean allEven=l.stream()
.filter(nth-> nth>5)
.allMatch(nth-> nth%2==0);
if(allEven){
...
}
The Functional Java library provides its own streams that do what you are asking for, i.e. they're memoized and lazy. You can use its conversion methods to convert between Java SDK objects and FJ objects, e.g. Java8.JavaStream_Stream(stream) will return a reusable FJ stream given a JDK 8 stream.
The code I'm having problems with is:
Executor executor = (Executor) callList;
List<ProgState> newProgList = executor.invokeAll(callList).stream()
.map(future -> {try {return future.get();} catch(Exception e){e.printStackTrace();}})
.filter(p -> p!=null).collect(Collectors.toList());
The method invokeAll(List>) is undefined for the type Executor
I am told I should use an executor like the one in the code snippet.
The Callables are defined within the following code:
List<Callable<ProgState>> callList = (List<Callable<ProgState>>) lst.stream()
.map(p -> ((Callable<ProgState>)(() -> {return p.oneStep();})))
.collect(Collectors.toList());
Here is the teacher's code:
//prepare the list of callables
List<Callable<PrgState>> callList = prgList.stream().map(p -> (() -> {return p.oneStep();})).collect(Collectors.toList());
//start the execution of the callables
//it returns the list of new created threads
List<PrgState> newPrgList = executor.invokeAll(callList).stream()
.map(future -> { try {
return future.get();
}
catch(Exception e) {
//here you can treat the possible
// exceptions thrown by statements
// execution
}
})
.filter(p -> p!=null).collect(Collectors.toList());
//add the new created threads to the list of existing threads
prgList.addAll(newPrgList);
If you can use stream(), why not parallelStream() as it would be much simpler.
List<PrgState> prgStates = prgList.parallelStream()
.map(p -> p.oneStep())
.collect(Collectors.toList());
This way you have no thread pool to configure, start or stop when finished.
Some might suggest that parallelStream() was the main reason for adding Stream and lambdas to Java 8 in the first place. ;)
You can't cast list of Callables with ExecutorService. You need to define ExecutorService which will inturn pick up callables and execute them in one or multiple threads in parallel.
This is what i think you are after:
ExecutorService executor = Executors.newCachedThreadPool();//change executor type as per your need.
List<ProgState> newProgList = executor.invokeAll(callList).stream().map(future -> {...
I am having the following method:
public String getResult() {
List<String> serversList = getServerListFromDB();
List<String> appList = getAppListFromDB();
List<String> userList = getUserFromDB();
return getResult(serversList, appList, userList);
}
Here I am calling three method sequentially which in turns hits the DB and fetch me results, then I do post processing on the results I got from the DB hits. I know how to call these three methods concurrently via use of Threads. But I would like to use Java 8 Parallel Stream to achieve this. Can someone please guide me how to achieve the same via Parallel Streams?
EDIT I just want to call the methods in parallel via Stream.
private void getInformation() {
method1();
method2();
method3();
method4();
method5();
}
You may utilize CompletableFuture this way:
public String getResult() {
// Create Stream of tasks:
Stream<Supplier<List<String>>> tasks = Stream.of(
() -> getServerListFromDB(),
() -> getAppListFromDB(),
() -> getUserFromDB());
List<List<String>> lists = tasks
// Supply all the tasks for execution and collect CompletableFutures
.map(CompletableFuture::supplyAsync).collect(Collectors.toList())
// Join all the CompletableFutures to gather the results
.stream()
.map(CompletableFuture::join).collect(Collectors.toList());
// Use the results. They are guaranteed to be ordered in the same way as the tasks
return getResult(lists.get(0), lists.get(1), lists.get(2));
}
As already mentioned, a standard parallel stream is probably not the best fit for your use case. I would complete each task asynchronously using an ExecutorService and "join" them when calling the getResult method:
ExecutorService es = Executors.newFixedThreadPool(3);
Future<List<String>> serversList = es.submit(() -> getServerListFromDB());
Future<List<String>> appList = es.submit(() -> getAppListFromDB());
Future<List<String>> userList = es.submit(() -> getUserFromDB());
return getResult(serversList.get(), appList.get(), userList.get());
foreach is what used for side-effects, you can call foreach on a parallel stream. ex:
listOfTasks.parallelStream().foreach(list->{
submitToDb(list);
});
However, parallelStream uses the common ForkJoinPool which is arguably not good for IO-bound tasks.
Consider using a CompletableFuture and supply an appropriate ExecutorService. It gives more flexibility (continuation,configuration). For ex:
ExecutorService executorService = Executors.newCachedThreadPool();
List<CompletableFuture> allFutures = new ArrayList<>();
for(Query query:queries){
CompletableFuture<String> query = CompletableFuture.supplyAsync(() -> {
// submit query to db
return result;
}, executorService);
allFutures.add(query);
}
CompletableFuture<Void> all = CompletableFuture.allOf(allFutures.toArray(new CompletableFuture[allFutures.size()]));
Not quite clear what do you mean, but if you just want to run some process on these lists on parallel you can do something like this:
List<String> list1 = Arrays.asList("1", "234", "33");
List<String> list2 = Arrays.asList("a", "b", "cddd");
List<String> list3 = Arrays.asList("1331", "22", "33");
List<List<String>> listOfList = Arrays.asList(list1, list2, list3);
listOfList.parallelStream().forEach(list -> System.out.println(list.stream().max((o1, o2) -> Integer.compare(o1.length(), o2.length()))));
(it will print most lengthy elements from each list).