Consider the following code:
AtomicInteger counter1 = new AtomicInteger();
AtomicInteger counter2 = new AtomicInteger();
Flux<Object> source = Flux.generate(emitter -> {
    emitter.next("item");
});
Executor executor1 = Executors.newFixedThreadPool(32);
Executor executor2 = Executors.newFixedThreadPool(32);
Flux<String> flux1 = Flux.merge(source).concatMap(item -> Mono.fromCallable(() -> {
    Thread.sleep(1);
    return "1_" + counter1.incrementAndGet();
}).subscribeOn(Schedulers.fromExecutor(executor1)));
Flux<String> flux2 = Flux.merge(source).concatMap(item -> Mono.fromCallable(() -> {
    Thread.sleep(100);
    return "2_" + counter2.incrementAndGet();
}).subscribeOn(Schedulers.fromExecutor(executor2)));
Flux.merge(flux1, flux2).subscribe(System.out::println);
You can see that one publisher is 100x faster than the other. Still, when running the code, all the data appears to be printed, but there's a huge gap between the two publishers that increases over time.
Interestingly, when I change the numbers so that executor2 has 1024 threads and executor1 has only 1 thread, we still see a gap that keeps growing over time.
My expectation was that after tweaking the thread pools accordingly, the publishers would become balanced.
I'd like to achieve a balance between publishers (relative to the thread-pool sizes and processing time)
What would happen if I waited long enough? In other words, can backpressure occur? (By default I guess that's a runtime exception, right?)
I don't want to drop items, nor do I want a runtime exception. Instead, as I mentioned, I'd like the system to balance itself with respect to the resources it has and the processing times. Does the code above guarantee that?
Thanks!
Your Flux objects in this example are not ParallelFlux objects, so each of them only ever processes one element at a time, on one thread.
It doesn't matter if you create a scheduler that's capable of handling thousands of threads, and pass that to one of the Flux objects - they'll just sit there going unused, which is exactly what's happening in this example. There's no backpressure, and it won't result in an exception - it's just going as fast as it can using one thread.
If you want to make sure that the Flux takes full advantage of the 1024 threads available to it, then you need to call .parallel(1024):
ParallelFlux<String> flux1 = Flux.merge(source).parallel(1).concatMap(item -> Mono.fromCallable(() -> {
    Thread.sleep(1);
    return "1_" + counter1.incrementAndGet();
}).subscribeOn(Schedulers.fromExecutor(executor1)));
ParallelFlux<String> flux2 = Flux.merge(source).parallel(1024).concatMap(item -> Mono.fromCallable(() -> {
    Thread.sleep(100);
    return "2_" + counter2.incrementAndGet();
}).subscribeOn(Schedulers.fromExecutor(executor2)));
If you do that to your code, then you start to see results much closer to what you seem to be expecting, with 2_ sailing past 1_ despite the fact it's sleeping for 100 times as long:
...
2_17075
2_17076
1_863
1_864
2_17077
1_865
2_17078
2_17079
...
However, a word of warning:
I'd like to achieve a balance between publishers (relative to the thread-pool sizes and processing time)
You can't pick numbers here to balance the outputs, at least not reliably or in any meaningful way - the thread scheduling will be completely arbitrary. If you want to do that, then you could use this variant of the subscribe method, allowing you to explicitly call request() on the subscription consumer. This then allows you to provide backpressure by only requesting as many elements as you're prepared to deal with.
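To give a rough idea of what requesting elements explicitly looks like, here is a minimal sketch. Note the assumption: it uses the JDK's own java.util.concurrent.Flow and SubmissionPublisher rather than Reactor, because both follow the same request(n) protocol and this keeps the example dependency-free. The names OneAtATimeDemo and OneAtATimeSubscriber are made up for illustration.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Flow;
import java.util.concurrent.SubmissionPublisher;

class OneAtATimeDemo {
    // Subscriber that never has more than one outstanding request.
    static class OneAtATimeSubscriber implements Flow.Subscriber<String> {
        final List<String> received = new CopyOnWriteArrayList<>();
        final CountDownLatch done = new CountDownLatch(1);
        private Flow.Subscription subscription;

        @Override
        public void onSubscribe(Flow.Subscription s) {
            subscription = s;
            s.request(1); // ask for the first element only
        }

        @Override
        public void onNext(String item) {
            received.add(item);      // "process" the element
            subscription.request(1); // only now ask for the next one
        }

        @Override
        public void onError(Throwable t) {
            done.countDown();
        }

        @Override
        public void onComplete() {
            done.countDown();
        }
    }

    // Publishes 'count' items and returns what the subscriber received.
    static List<String> run(int count) {
        OneAtATimeSubscriber subscriber = new OneAtATimeSubscriber();
        try (SubmissionPublisher<String> publisher = new SubmissionPublisher<>()) {
            publisher.subscribe(subscriber);
            for (int i = 1; i <= count; i++) {
                publisher.submit("item_" + i); // blocks if the subscriber's buffer is full
            }
        } // close() signals onComplete once buffered items are delivered
        try {
            subscriber.done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return List.copyOf(subscriber.received);
    }

    public static void main(String[] args) {
        System.out.println(run(5));
    }
}
```

The point is that the subscriber, not the publisher, decides the pace: nothing new arrives until request() is called again.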
Related
Java version : 11
I have a List of objects, and I want to perform certain operations on them, where one operation depends on the result of another.
To achieve this workflow in an async fashion, I am using CompletableFuture.
Currently I am doing this by partitioning the List into sublists and giving each sublist to a CompletableFuture, so each thread in my thread pool can start working on a sublist.
The code I used for the above approach, which works, is:
List<SomeObject> someObjectList // original list
List<List<SomeObject>> partitionList = ListUtils.partition(someObjectList, partionSize);
partitionList.forEach(subList -> {
    CompletableFuture.supplyAsync(() -> firstOperation(subList), executorService)
        .thenAcceptAsync(firstOpresult -> secondOperationWithFirstOpResult(firstOpresult), executorService);
});
public static List<String> firstOperation(List<SomeObject> subList){
    //perform operation
    return List<String>;
}
public static void secondOperationWithFirstOpResult(List<String> firstOpProducedList) {
    //perform operation
    //print results.
}
The problem here is partitioning the original list: if my original list has 100 thousand records and the partition size is, say, 100 (meaning I want 100 items in each sublist), I will have 1000 sublist objects each holding 100 elements, which may not be good considering how many objects that keeps in memory. Moreover, if the partition size is user-controlled or config-controlled, a smaller partition size would result in a huge number of sublist objects.
Instead of partitioning the original list, I want to:
Take the original list (say 100 elements).
Have a startIndex and endIndex on the original list (say 0 to 9, 10 to 19, ...).
Give each of those batches to a thread in the thread pool with CompletableFuture,
so that this thread can perform the two operations.
I know SO is not the place for exact solutions, but if you could nudge me in the right direction, with pseudocode or even just whether this is possible with CompletableFuture in the first place, that would be a great help :)
Since ListUtils.partition is not a standard method, it's impossible to say how it works. But if it works "the smart way", it will use subList on the original list instead of copying data.
If you want to be on the safe side, you can do the trivial partitioning yourself:
IntStream.range(0, (someObjectList.size() + partionSize - 1) / partionSize)
    .forEach(p -> {
        int start = p * partionSize,
            end = Math.min(someObjectList.size(), start + partionSize);
        List<SomeObject> subList = someObjectList.subList(start, end);
        CompletableFuture.supplyAsync(() -> firstOperation(subList), executorService)
            .thenAcceptAsync(r -> secondOperationWithFirstOpResult(r), executorService);
    });
Since these sublists do not store the elements, they consume less memory than the CompletableFuture instances, for example. So this is nothing to worry about.
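To make that concrete, here is a small sketch (the class and method names are made up) showing the ceiling-division arithmetic and the fact that subList returns views backed by the original list rather than copies:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class SubListViews {
    // Splits 'source' into consecutive views of at most 'size' elements each.
    // No elements are copied: every sublist is a window onto 'source'.
    static <T> List<List<T>> partitionViews(List<T> source, int size) {
        int parts = (source.size() + size - 1) / size; // ceiling division
        return IntStream.range(0, parts)
            .mapToObj(p -> source.subList(p * size, Math.min(source.size(), (p + 1) * size)))
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> data = IntStream.range(0, 25).boxed()
            .collect(Collectors.toCollection(ArrayList::new));
        List<List<Integer>> parts = partitionViews(data, 10);
        System.out.println(parts.size());        // 3
        System.out.println(parts.get(2));        // [20, 21, 22, 23, 24]
        data.set(0, -1);                         // non-structural change to the original...
        System.out.println(parts.get(0).get(0)); // ...is visible through the view: -1
    }
}
```

Keep in mind that structural changes to the original list (adding or removing elements) invalidate such views, so this only suits a list that stays fixed while the tasks run.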
But if you can live with using the default thread pool¹ instead of executorService, you could use
IntStream.range(0, (someObjectList.size() + partionSize - 1) / partionSize)
    .parallel()
    .forEach(p -> {
        int start = p * partionSize,
            end = Math.min(someObjectList.size(), start + partionSize);
        List<SomeObject> subList = someObjectList.subList(start, end);
        secondOperationWithFirstOpResult(firstOperation(subList));
    });
where each sublist only exists while being processed.
This will already use Fork/Join tasks under the hood. There is no need to implement those Fork/Join operations yourself.
¹ the default pool is unspecified, but will be ForkJoinPool.commonPool() in practice.
I've noticed that it's not recommended to use extractors such as Mono.toFuture() and Flux.collectList(), since they block the flow.
I'm not sure in which way they are 'blocking'. In the code below, I know Flux.collectList() will wait for all the items to finish; does a dedicated thread keep waiting, or does whichever thread finishes last perform the collectList() step?
It has been mentioned that Mono.toFuture() blocks too. Does it return a 'future' immediately (which becomes usable once onNext() or onComplete() occurs), or does it not return until onNext() or onComplete() occurs?
var m = Flux.range(0, 100)
    .parallel()
    .runOn(Schedulers.boundedElastic())
    .map(i -> Mono.fromFuture(
        Mono.just(i).map(n -> {
            try {
                var s = (long) (Math.random() * 100);
                Thread.sleep(s);
                System.out.println(Thread.currentThread() + "after " + s + "ms awaking: " + n);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
            return n;
        }).toFuture())
    )
    .doOnNext(o -> System.out.println(Thread.currentThread() + "before sequential"))
    .sequential();
var mm = Flux.merge(m)
    .doOnNext(o -> System.out.println(Thread.currentThread() + "before collecting"))
    .collectList()
    .doOnNext(o -> System.out.println(Thread.currentThread() + "before map"))
    .map(list -> list.stream().map(i -> i).collect(Collectors.toList()))
    .publishOn(Schedulers.single())
    .toFuture();
Your assumptions aren't quite correct.
Mono.toFuture() isn't blocking at all - it simply returns a CompletableFuture, which you can either block on (if you call its get() method) or use asynchronously (if you use any of its async methods, like thenApply(), thenCompose() etc.) You break out of the reactor context and so forfeit things like backpressure, but you don't immediately have to block.
It's possible you're thinking of (very) old versions of reactor where I believe there was a toFuture() variant that returned a Future, rather than a CompletableFuture - and while that wasn't blocking either, it put you in a context where you had to then block, as Future has no async component. So while the method call itself wasn't blocking, that was then the only choice you had.
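As a rough illustration of the two ways of consuming the result, here is a sketch using plain CompletableFuture with no Reactor involved; the class and method names are made up for the example.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;

class FutureStyles {
    // Non-blocking style: attach a continuation and return immediately.
    static CompletableFuture<String> describeAsync(CompletableFuture<Integer> future) {
        return future.thenApply(n -> "value=" + n);
    }

    // Blocking style: the calling thread waits here until the future completes.
    static int waitFor(CompletableFuture<Integer> future) {
        try {
            return future.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        }
    }

    public static void main(String[] args) {
        CompletableFuture<Integer> future = new CompletableFuture<>();
        CompletableFuture<String> described = describeAsync(future); // returns at once
        System.out.println(described.isDone()); // false: nothing has completed yet
        future.complete(42);                    // now the continuation runs
        System.out.println(described.join());   // value=42
        System.out.println(waitFor(future));    // 42 (already complete, so no waiting)
    }
}
```

Obtaining the future never blocks; only get() or join() on a future that has not completed yet does.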
Contrary to popular belief Flux.collectList() also isn't blocking - it specifically returns a Mono<List<T>>, that is a non-blocking publisher that will emit a single element, which is a list of everything that's in that flux. You can call block() on this publisher of course, and that operation would be blocking - but calling collectList() by itself is no more blocking than any other operator.
That being said, it certainly can cause problems. Due to the nature of what it's doing (collecting all elements from a flux into a single list in memory), it may not be ideal:
You might have to wait a long time for the list to be emitted, with no feedback on how many elements it contains, or if it's being populated at all;
You might run out of memory if the number of elements, or size of elements in the flux is particularly large;
You can't output any intermediate state as elements are added, so you forfeit things like streaming JSON support.
That doesn't make it blocking however, it just means there's a different set of potential issues you need to weigh up before deciding whether it's an operator that's worth using in your particular scenario.
I need to make multiple RestTemplate calls, one for each Id in a List<Ids>. What is the best way to do this?
I have used parallelStream(). The code snippet below shows a similar scenario.
List<Integer> employeeIds;
employeeIds.parallelStream()
    .map(empId -> employeeService.fetchEmployeedetails(empId))
    .collect(Collectors.toList());
employeeService.fetchEmployeedetails is a kind of restCall which will fetch all the employeeDetails.
Is there any other way to tune the performance?
.parallelStream() does not let you control the degree of multithreading: by default it runs on the common ForkJoinPool, whose size is tied to the number of cores you have. To run with more threads than that, you need to use .parallelStream() inside your own ForkJoinPool.
High Performance using ForkJoinPool
List<Employee> employees = new ArrayList<>();
ForkJoinPool forkJoinPool = new ForkJoinPool(50);
try {
    forkJoinPool.submit(() -> employeeIds.parallelStream().forEach(empId -> {
        Employee em = employeeService.fetchEmployeedetails(empId);
        synchronized(employees) {
            employees.add(em);
        }
    })).get();
} catch (Exception e) {
    e.printStackTrace();
    throw new BusinessRuleException(ErrorCodeEnum.E002501.value(), e.getMessage());
} finally {
    if (forkJoinPool != null) {
        forkJoinPool.shutdown(); // always remember to shutdown the pool
    }
}
This will ensure that parallelStream() uses at most 50 threads instead of depending on the number of cores in your system. Make sure you do not forget to shutdown() the pool in the finally block, and do not forget .get(), which waits for the submitted task to complete. Note that running a parallel stream inside a custom pool relies on how the current implementation behaves; it is not a documented guarantee.
ArrayList is not thread-safe to modify, but the synchronized block provides that safety. Since adding to the list is a minor operation, it will not hold the thread for long either.
You will see tremendous performance improvement with this approach.
How to choose optimal number for threads
Too many threads cause context switching and slow down the process. Too few threads leave the system underutilized. The optimum thread count is often around 10 times the number of cores you have, but it also depends on how much time each task spends within the thread. I generally pass an environment variable so that I can tune the number 50 when I want. I arrived at 50 after quite a bit of experimentation on a quad-core instance that we have.
@Value("#{systemEnvironment['parallelism'] ?: 50}")
protected int parallelism;
and then use ForkJoinPool forkJoinPool = new ForkJoinPool(getParallelism());
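If you want a starting point before experimenting, one commonly cited rule of thumb (not part of the answer above, so treat it as an assumption) is threads ≈ cores × (1 + wait time / compute time). A sketch with a made-up class name and made-up workload numbers:

```java
class PoolSizing {
    // Rule of thumb: cores * (1 + waitTime / computeTime), never less than 1.
    static int suggestedThreads(int cores, double waitMillis, double computeMillis) {
        return (int) Math.max(1, Math.round(cores * (1 + waitMillis / computeMillis)));
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        // Assumed workload: each task waits 90 ms on the network and computes for 10 ms.
        System.out.println(suggestedThreads(cores, 90, 10));
    }
}
```

For a quad-core machine and that assumed 90/10 split, this gives 40, which is in the same range as the experimentally found 50. The measured number should still win over the formula.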
I have a key-value map. I iterate over its keys, call a service for each one, and, based on the response, add all the responses to some uberList.
How can I execute the different operations concurrently? Will changing stream() to parallelStream() do the trick? Does it synchronize when it adds to uberList?
The idea is to minimize the response time.
List<MyClass> uberList = new LinkedList<>();
Map<String, List<MyOtherClass>> map = new HashMap<>();
//Populate map
map.entrySet().stream().filter(s -> s.getValue().size() > 0 && s.getValue().values().size() > 0).forEach(
    y -> {
        // Do stuff
        if(noError) {
            uberList.add(MyClass3);
        }
    }
);
//Do stuff on uberList
How can I execute the different operations concurrently?
One thread can do one task at a time. If you want to do multiple operations concurrently, you have to offload work to other threads.
You can either create a new Thread yourself or use an ExecutorService to manage a thread pool, queue the tasks and execute them for you.
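For the ExecutorService route, a minimal sketch could look like the following. The class name and callService are hypothetical stand-ins for your real service call; invokeAll submits all tasks and waits until every one has finished.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ExecutorFanOut {
    // Runs one task per key on a fixed pool and gathers the results in input order.
    static List<String> fetchAll(List<String> keys, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<String>> tasks = new ArrayList<>();
            for (String key : keys) {
                tasks.add(() -> callService(key));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> f : pool.invokeAll(tasks)) { // blocks until all tasks finish
                results.add(f.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    // Hypothetical stand-in for a slow remote call.
    static String callService(String key) throws InterruptedException {
        Thread.sleep(50); // simulate latency
        return "response-for-" + key;
    }

    public static void main(String[] args) {
        System.out.println(fetchAll(List.of("a", "b", "c"), 3));
        // [response-for-a, response-for-b, response-for-c]
    }
}
```

Because each task returns its own result and the caller assembles the list, no shared collection has to be synchronized.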
Will changing stream() to parallelStream() do the trick?
Yes, it does. Internally, parallelStream() uses ForkJoinPool.commonPool() to run the tasks for you. But keep in mind that parallelStream() makes no guarantee that the returned stream is parallel (although the current implementation does return a parallel one).
Does it synchronize when it adds to uberList?
It's up to you to do the synchronization in the forEach pipeline. Normally you do not want to call collection.add() inside forEach to build a collection. Instead you should use the .map().collect(toX()) methods, which free you from the synchronization part:
It does not need to know about your local variable (in this case uberList), and it will not modify it during execution, which helps avoid a lot of strange concurrency bugs.
You can freely change the type of collection in the .collect() part, which gives you more control over the result type.
It does not require a thread-safe or synchronized collection when used with a parallel stream, because "multiple intermediate results may be instantiated, populated, and merged so as to maintain isolation of mutable data structures" (Read more about this here)
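A tiny self-contained sketch of that last point (the class name is made up): the parallel pipeline below never touches a shared list, yet the collected result is complete and in encounter order.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class CollectDemo {
    // No shared mutable list: each worker fills its own partial result and
    // the collector merges them, preserving encounter order.
    static List<Integer> doubled(int n) {
        return IntStream.range(0, n)
            .parallel()
            .boxed()
            .map(i -> i * 2)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> result = doubled(100_000);
        System.out.println(result.size());      // 100000
        System.out.println(result.get(99_999)); // 199998
    }
}
```

Doing the same with forEach and an unsynchronized ArrayList.add() could lose elements or throw, which is exactly the class of bug collect() avoids.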
So what you want is to execute multiple similar service call at the same time and collect your result into a list.
You can do it simply by parallel stream:
uberList = map.entrySet().stream()
    .parallel() // Use .stream().parallel() to force parallelism. parallelStream() does not guarantee that the returned stream is a parallel stream
    .filter(yourCondition)
    .map(e -> yourService.methodCall(e))
    .collect(Collectors.toList());
Pretty cool, isn't it?
But as I stated, the default parallel stream uses ForkJoinPool.commonPool() for queueing and executing tasks.
The bad part is that if yourService.methodCall(e) does heavy IO (like an HTTP call or even a DB call) or is a long-running task, it may exhaust the pool, and other incoming tasks will be queued for a long time waiting for execution.
So typically all other tasks that depend on this common pool (not only your own yourService.methodCall(e), but all other parallel streams) will be slowed down by queueing time.
To solve this problem, you can force the parallel execution onto your own fork-join pool:
ForkJoinPool forkJoinPool = new ForkJoinPool(4); // Typically set it to Runtime.getRuntime().availableProcessors()
uberList = forkJoinPool.submit(() -> {
    return map.entrySet().stream()
        .parallel() // Use .stream().parallel() to force parallelism. parallelStream() does not guarantee that the returned stream is a parallel stream
        .filter(yourCondition)
        .map(e -> yourService.methodCall(e))
        .collect(Collectors.toList());
}).get();
You probably don't want to use parallelStream for concurrency, only for parallelism. (That is: use it for tasks where you want to use multiple physical processors efficiently on a task that's conceptually sequential, not for tasks where you want multiple things going on at the same time conceptually.)
In your case you would probably be better off using an ExecutorService, or more specifically com.google.common.util.concurrent.ListeningExecutorService from Google Guava (warning: I haven't tried to compile the below code, there may be syntax errors):
int MAX_NUMBER_OF_SIMULTANEOUS_REQUESTS = 100;
ListeningExecutorService myExecutor =
    MoreExecutors.listeningDecorator(
        Executors.newFixedThreadPool(MAX_NUMBER_OF_SIMULTANEOUS_REQUESTS));
List<ListenableFuture<Optional<MyClass>>> futures = new ArrayList<>();
for (Map.Entry<String, List<MyOtherClass>> entry : map.entrySet()) {
    if (entry.getValue().size() > 0 && entry.getValue().values().size() > 0) {
        futures.add(myExecutor.submit(() -> {
            // Do stuff
            if(noError) {
                return Optional.of(MyClass3);
            } else {
                return Optional.empty();
            }
        }));
    }
}
List<MyClass> uberList = Futures.successfulAsList(futures)
    .get(1, TimeUnit.MINUTES /* adjust as necessary */)
    .stream()
    .filter(Objects::nonNull) // successfulAsList puts null in place of failed tasks
    .filter(Optional::isPresent)
    .map(Optional::get)
    .collect(Collectors.toList());
The advantage of this code is that it allows you to explicitly specify that the tasks should all start at the "same time" (at least conceptually) and allows you to control your concurrency explicitly (how many simultaneous requests are allowed? What do we do if some of the tasks fail? How long are we willing to wait? etc). Parallel streams aren't really for that.
Parallel streams will help with concurrent execution. But it is not recommended to use a forEach loop and add elements to an outside list. If you do that, you have to make sure the external list is synchronized. A better way is to use map and collect the result into a list. In this case, the parallel stream takes care of the synchronisation.
List<MyClass> uberList = map.entrySet().parallelStream()
    .filter(s -> s.getValue().size() > 0 && s.getValue().values().size() > 0)
    .map(y -> {
        // Do stuff
        return MyClass3;
    })
    .filter(t -> isErrorFree(t)) // placeholder: check the no-error condition here
    .collect(Collectors.toList());
My API makes about 100 downstream calls, in pairs, to two separate services. All responses need to be aggregated, before I can return my response to the client. I use hystrix-feign to make the HTTP calls.
I came up with what I believed was an elegant solution, until I found the following in the RxJava docs:
BlockingObservable is a variety of Observable that provides blocking operators. It can be useful for testing and demo purposes, but is generally inappropriate for production applications (if you think you need to use a BlockingObservable this is usually a sign that you should rethink your design).
My code looks roughly as follows
List<Observable<C>> observables = new ArrayList<>();
for (RequestPair request : requests) {
    Observable<C> zipped = Observable.zip(
        feignClientA.sendRequest(request.A()),
        feignClientB.sendRequest(request.B()),
        (a, b) -> new C(a,b));
    observables.add(zipped);
}
Collection<D> apiResponse = new ConcurrentLinkedQueue<>();
Observable
    .merge(observables)
    .toBlocking()
    .forEach(combinedResponse -> apiResponse.add(doSomeWork(combinedResponse)));
return apiResponse;
A few questions based on this setup:
Is toBlocking() justified given my use case?
Am I correct in understanding that the actual HTTP calls do not get made until the main thread gets to the forEach()
I've seen that the code in the forEach() block is executed by different threads, but I was not able to verify if there can be more than one thread in the forEach() block. Is the execution there concurrent?
A better option is to return the Observable to be consumed by other operators, but you may get away with blocking code. (It should, however, run on a background thread.)
public Observable<D> getAll(Iterable<RequestPair> requests) {
    return Observable.from(requests)
        .flatMap(request ->
            Observable.zip(
                feignClientA.sendRequest(request.A()),
                feignClientB.sendRequest(request.B()),
                (a, b) -> new C(a,b)
            )
        , 8) // maximum concurrent HTTP requests
        .map(both -> doSomeWork(both));
}
// for legacy users of the API
public Collection<D> getAllBlocking(Iterable<RequestPair> requests) {
    return getAll(requests)
        .toList()
        .toBlocking()
        .first();
}
Am I correct in understanding that the actual HTTP calls do not get made until the main thread gets to the forEach()
Yes, the forEach triggers the whole sequence of operations.
I've seen that the code in the forEach() block is executed by different threads, but I was not able to verify if there can be more than one thread in the forEach() block. Is the execution there concurrent?
Only one thread at a time is allowed to execute the lambda in forEach but you may indeed see different threads entering there.