Project Reactor - Parallel Execution - java

I have the below Flux,
@Test
public void fluxWithRange_CustomTest() {
    Flux<Integer> intFlux = Flux.range(1, 10).flatMap(i -> {
        if (i % 2 == 0) {
            try {
                Thread.sleep(5000);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
            return Mono.just(i);
        } else {
            return Mono.just(i);
        }
    }, 2).subscribeOn(Schedulers.newBoundedElastic(2, 2, "test")).log();
    StepVerifier.create(intFlux).expectNext(1, 2, 3, 4, 5, 6, 7, 8, 9, 10).verifyComplete();
}
I was expecting this to run in parallel, however, this just executes in 1 thread.

The subscribeOn method only provides a way to move execution to a different thread when "someone" subscribes to your Flux. When you use the StepVerifier you are subscribing to the flux, and because you defined a Scheduler, execution moves to one of the threads provided by that Scheduler. This does not imply that the Flux is going to jump between multiple threads.
The behaviour you are expecting can be achieved by adding a second subscribeOn, this time to the Mono you are using inside the flatMap. When flatMap subscribes to the inner Mono, it will then use another thread.
If you change your code to something like this:
@Test
public void fluxWithRange_CustomTest() throws InterruptedException {
    Flux<Integer> intFlux = Flux.range(1, 10)
            .flatMap(i -> subFlux(i), 2)
            .subscribeOn(Schedulers.newBoundedElastic(2, 2, "test")).log();
    // This now fails: flatMap merges the inner results as they complete, so the order is no longer guaranteed.
    StepVerifier.create(intFlux).expectNext(1, 2, 3, 4, 5, 6, 7, 8, 9, 10).verifyComplete();
}

private Mono<Integer> subFlux(int i) {
    Mono<Integer> result = Mono.create(sink -> {
        if (i % 2 == 0) {
            try {
                Thread.sleep(5000);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
        }
        sink.success(i);
    });
    return result.subscribeOn(Schedulers.newBoundedElastic(2, 2, "other"));
}
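Because the merge order now depends on thread timing, the verification has to be order-insensitive. A minimal sketch, assuming you only care that all ten values are emitted:

StepVerifier.create(intFlux)
        .expectNextCount(10)
        .verifyComplete();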

Related

List of Strings is not running in parallel - Java 8 parallel streams

I got a requirement to run a collection using a parallel stream, but it always runs in sequence. In the example below, the List is always processed sequentially, whereas the IntStream runs in parallel. Could someone please help me understand the difference between running a parallel stream on an IntStream and a parallel stream on a List<String>?
Also, can you help with a code snippet showing how to run the List<String> in parallel, similar to how the IntStream runs in parallel?
import java.util.List;
import java.util.stream.IntStream;

public class ParallelCollectionTest {
    public static void main(String[] args) {
        System.out.println("Parallel int stream testing.. ");
        IntStream range2 = IntStream.rangeClosed(1, 5);
        range2.parallel().peek(t -> {
            System.out.println("before");
            try {
                Thread.sleep(1000);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
        }).forEachOrdered(System.out::println);

        System.out.println("Parallel String collection testing.. ");
        List<String> list = List.of("a", "b", "c", "d");
        list.stream().parallel().forEachOrdered(o -> {
            System.out.println("before");
            try {
                Thread.sleep(10000);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
            System.out.println(o);
        });
    }
}
output of the above code is below.
Parallel int stream testing..
before
before
before
before
before
1
2
3
4
5
Parallel String collection testing..
before
a
before
b
before
c
before
d
The different behavior is not caused by the different stream types (IntStream vs. Stream<String>); the logic of your two stream pipelines is not the same.
In the IntStream snippet you perform the sleep in the peek() call, which allows it to run in parallel for different elements, which is why that pipeline finishes quickly.
In the Stream<String> snippet you perform the sleep inside forEachOrdered, which means the sleep() for each element must run after the sleep() of the previous element ends. That is the documented behavior of forEachOrdered: "This operation processes the elements one at a time, in encounter order if one exists."
You can make the second snippet behave like the first by adding a peek() call to it:
System.out.println("Parallel String collection testing.. ");
List<String> list = List.of("a","b","c","d","e","f","g","h");
list.stream().parallel().peek(t -> {
System.out.println("before");
try {
Thread.sleep(1000);
} catch (InterruptedException e) {
e.printStackTrace();
}
})
.forEachOrdered(System.out::println);
Now it will produce:
Parallel String collection testing..
before
before
before
before
a
b
c
d
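If the output order itself does not matter for your use case (an assumption beyond the question), you can go one step further and use forEach instead of forEachOrdered, letting the terminal action run in parallel as well:

// unordered terminal operation: the per-element work runs concurrently,
// so the printed order is not guaranteed
list.stream().parallel().forEach(o -> {
    try {
        Thread.sleep(1000);
    } catch (InterruptedException e) {
        e.printStackTrace();
    }
    System.out.println(o);
});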

Project Reactor: doOnNext (or the other doOnXXX operators) async

Is there any method like doOnNext, but async?
For example, I need to do some long-running logging (sending a notification by email) for a particular element.
Scheduler myParallel = Schedulers.newParallel("my-parallel", 4);
Flux<Integer> ints = Flux.just(1, 2, 3, 4, 5)
        .publishOn(myParallel)
        .doOnNext(v -> {
            // For example, we need to do something time-consuming only for 3
            if (v.equals(3)) {
                try {
                    Thread.sleep(3000);
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
            }
            System.out.println("LOG FOR " + v);
        });
ints.subscribe(System.out::println);
But why should I wait for the logging of 3? I want to do this logic asynchronously.
For now, the only solution I have is this:
Thread.sleep(10000);
Scheduler myParallel = Schedulers.newParallel("my-parallel", 4);
Scheduler myParallel2 = Schedulers.newParallel("my-parallel2", 4);
Flux<Integer> ints = Flux.just(1, 2, 3, 4, 5)
        .publishOn(myParallel)
        .doOnNext(v -> {
            Mono.just(v).publishOn(myParallel2).subscribe(value -> {
                if (value.equals(3)) {
                    try {
                        Thread.sleep(3000);
                    } catch (InterruptedException e) {
                        e.printStackTrace();
                    }
                }
                System.out.println("LOG FOR " + value);
            });
        });
ints.subscribe(System.out::println);
Is there any "nice" solution for this?
If you're absolutely sure you don't care whether or not the email sending succeeds, then you could use "subscribe-inside-doOnNext", but I'm pretty confident that would be a mistake.
In order to have your Flux propagate an onError signal if the "logging" fails, the recommended approach is to use flatMap.
The good news is that since flatMap merges results from the inner publishers immediately into the main sequence, you can still emit each element immediately AND trigger the email. The only caveat is that the whole sequence will only complete once the email-sending Mono has also completed. You can also check within the flatMap lambda whether the logging needs to happen at all (rather than inside the inner Mono):
// assuming sendEmail returns a Mono<Void> and takes care of offsetting any blocking send onto another Scheduler
source // we assume elements are also published on the relevant Scheduler in `source`
        .flatMap(v -> {
            // if we can decide right away whether or not to send the email, better do it here
            if (shouldSendEmailFor(v)) {
                // we want to immediately re-emit the value, then trigger the email and wait for it to complete
                return Mono.just(v)
                        .concatWith(
                                // since Mono<Void> never emits onNext, it is ok to cast it to V,
                                // which makes it compatible with concat, keeping the whole thing a Flux<V>
                                sendEmail(v).cast(V.class)
                        );
            } else {
                return Mono.just(v);
            }
        });
Another option, which simulates the time-consuming work with delayElement and offloads each element onto the parallel scheduler:

Flux<Integer> ints = Flux.just(1, 2, 3, 4, 5)
        .flatMap(integer -> {
            if (integer != 3) {
                return Mono.just(integer)
                        .map(integer1 -> {
                            System.out.println(integer1);
                            return integer;
                        })
                        .subscribeOn(Schedulers.parallel());
            } else {
                return Mono.just(integer)
                        .delayElement(Duration.ofSeconds(3))
                        .map(integer1 -> {
                            System.out.println(integer1);
                            return integer;
                        })
                        .subscribeOn(Schedulers.parallel());
            }
        });
ints.subscribe();
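For completeness, here is a self-contained sketch of the flatMap approach, with a hypothetical sendEmail helper (the name and the 3-second sleep are assumptions standing in for the real blocking send):

import reactor.core.publisher.Flux;
import reactor.core.publisher.Mono;
import reactor.core.scheduler.Schedulers;

public class AsyncLogDemo {

    // hypothetical email sender: wraps the blocking send, shifts it onto
    // boundedElastic, and completes empty as a Mono<Void>
    static Mono<Void> sendEmail(Integer v) {
        return Mono.<Void>fromRunnable(() -> {
            try {
                Thread.sleep(3000);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            System.out.println("LOG FOR " + v);
        }).subscribeOn(Schedulers.boundedElastic());
    }

    public static void main(String[] args) throws InterruptedException {
        Flux.just(1, 2, 3, 4, 5)
                .flatMap(v -> {
                    if (v == 3) {
                        // re-emit the value immediately, then run the email send
                        // without blocking the rest of the sequence
                        return Mono.just(v).concatWith(sendEmail(v).cast(Integer.class));
                    }
                    return Mono.just(v);
                })
                .subscribe(System.out::println);
        Thread.sleep(5000); // keep the demo JVM alive until the async "email" is done
    }
}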

Chain CompletableFuture and stop on first success

I'm consuming an API that returns CompletableFutures for querying devices (similar to digitalpetri modbus).
I need to call this API with a couple of options to query a device and figure out what it is - this is basically trial and error until it succeeds. These are embedded device protocols that I cannot change, but you can think of the process as working similar to the following:
Are you an apple?
If not, then are you a pineapple?
If not, then are you a pen?
...
While the API uses futures, in reality the communications are serial (going over the same physical piece of wire), so they will never execute concurrently. Once I know what the device is, I want to be able to stop trying and let the caller know what it is.
I already know that I can get the result of only one of the futures with any (see below), but that may result in additional attempts that should be avoided.
Is there a pattern for chaining futures where you stop once one of them succeeds?
The following is similar, but it is wasteful of very limited resources:
List<CompletableFuture<String>> futures = Arrays.asList(
        CompletableFuture.supplyAsync(() -> "attempt 1"),
        CompletableFuture.supplyAsync(() -> "attempt 2"),
        CompletableFuture.supplyAsync(() -> "attempt 3"));

// the typed toArray overload is needed here: plain toArray() returns Object[],
// which cannot be cast to CompletableFuture[]
CompletableFuture<String>[] futuresArray = futures.toArray(new CompletableFuture[0]);
CompletableFuture<Object> c = CompletableFuture.anyOf(futuresArray);
Suppose that you have a method that is "pseudo-asynchronous" as you describe, i.e. it has an asynchronous API but requires some locking to perform:
private final static Object lock = new Object();

private static CompletableFuture<Boolean> pseudoAsyncCall(int input) {
    return CompletableFuture.supplyAsync(() -> {
        synchronized (lock) {
            System.out.println("Executing for " + input);
            try {
                Thread.sleep(1000);
            } catch (InterruptedException e) {
                throw new RuntimeException(e);
            }
            return input > 3;
        }
    });
}
Given a List<Integer> of inputs that you want to check against this method, you can test each of them in sequence with recursive composition:
public static CompletableFuture<Integer> findMatch(List<Integer> inputs) {
    return findMatch(inputs, 0);
}

private static CompletableFuture<Integer> findMatch(List<Integer> inputs, int startIndex) {
    if (startIndex >= inputs.size()) {
        // no match found -- an exception could be thrown here if preferred
        return CompletableFuture.completedFuture(null);
    }
    return pseudoAsyncCall(inputs.get(startIndex))
            .thenCompose(result -> {
                if (result) {
                    return CompletableFuture.completedFuture(inputs.get(startIndex));
                } else {
                    return findMatch(inputs, startIndex + 1);
                }
            });
}
This would be used like this:
public static void main(String[] args) {
    List<Integer> inputs = Arrays.asList(0, 1, 2, 3, 4, 5);
    CompletableFuture<Integer> matching = findMatch(inputs);
    System.out.println("Found match: " + matching.join());
}
Output:
Executing for 0
Executing for 1
Executing for 2
Executing for 3
Executing for 4
Found match: 4
As you can see, it is not called for input 5, while your API (findMatch()) remains asynchronous.
I think the best you can do is, after you have retrieved the result, call:
futures.forEach(f -> f.cancel(true));
This will not affect the future that produced the result, and it tries its best to stop the others. Since (if I understand correctly) you get the futures from an outside source, there is no guarantee it will actually interrupt their work.
However, since
"this class has no direct control over the computation that causes it to be completed, cancellation is treated as just another form of exceptional completion"
(from the CompletableFuture documentation), I doubt it will do what you actually want.
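To make that concrete, a small sketch of the cleanup, assuming the futures list and futuresArray from the snippet above:

CompletableFuture<Object> first = CompletableFuture.anyOf(futuresArray);
// best-effort cleanup once the first result is in; cancelling an
// already-completed future is a no-op, so the winner is unaffected
first.thenRun(() -> futures.forEach(f -> f.cancel(true)));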

concurrency using ExecutorService

I have this simple program that sums the numbers from 1 to 9 using a thread pool and ExecutorService.
Each task waits for 1 second before executing. However, the program below gives me a different output on each execution.
How do I fix this so that it always produces 45?
public static void main(String[] args) throws InterruptedException {
    AtomicLong count = new AtomicLong(0);
    ExecutorService executor = Executors.newFixedThreadPool(10);
    List<Integer> list = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9);
    for (Integer i : list) {
        executor.execute(new Runnable() {
            @Override
            public void run() {
                try {
                    Thread.sleep(1000);
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
                count.set(count.get() + i);
            }
        });
    }
    System.out.println("Waiting...");
    executor.shutdown();
    executor.awaitTermination(Long.MAX_VALUE, TimeUnit.MINUTES);
    System.out.println("Total::" + count.get());
    System.out.println("Done");
}
Instead of
count.set(count.get() + i);
use
count.addAndGet(i);
The addAndGet method adds the value atomically, whereas a separate get followed by a set is not an atomic operation: two threads can read the same current value, and one of the updates is then lost.
AtomicLong has dedicated methods that are atomic, and you only get atomicity guarantees when using them; composing separate calls such as get() and set() will not be atomic as a whole. addAndGet is the method that "adds" to the current value in a single atomic step.
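For illustration, a minimal sketch of the corrected loop body; updateAndGet (Java 8+) is the more general form for updates that are not simple additions:

// atomic read-modify-write: no update can be lost between the read and the write
count.addAndGet(i);

// equivalent general form taking a lambda; the operator may be re-applied
// internally until the compare-and-set succeeds, so it must be side-effect free
count.updateAndGet(current -> current + i);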

RxJava 2 Observable that onComplete resubmits itself

I'm new to RxJava. I'm trying to create an observable that, when it completes, starts all over again until I call dispose, but I'm running into an OutOfMemory error after a while. Below is a simplified example of what I'm trying to do.
public void start() throws RuntimeException {
    log.info("\t * Starting {} Managed Service...", getClass().getSimpleName());
    try {
        executeObserve();
        log.info("\t * Starting {} Managed Service...OK!", getClass().getSimpleName());
    } catch (Exception e) {
        log.info("Managed Service {} FAILED! Reason is {} ", getClass().getSimpleName(), e.getMessage(), e);
    }
}
start is invoked once during the initialization phase; executeObserve looks as follows (in a simplified form). Notice that in doOnComplete I "resubmit" executeObserve:
public void executeObserve() throws RuntimeException {
    Observable<Book> booksObserve = manager.getAsObservable();
    booksObserve
            .map(Book::getAllOrders)
            .flatMap(Observable::fromIterable)
            .toList()
            .subscribeOn(Schedulers.io())
            .subscribe(collectedISBN ->
                    Observable.fromIterable(collectedISBN)
                            .buffer(10)
                            // ...some more steps here...
                            .toList()
                            .toObservable()
                            // resubmit
                            .doOnComplete(this::executeObserve)
                            .subscribe(validISBN -> {
                                // do something with the valid ones
                            })
            );
}
My guess is that this is not the way to go if I want to resubmit my tasks, but I was not able to find any documentation on it.
The booksObserve source is implemented as follows:
public Observable<Book> getAsObservable() {
    return Observable.create(e -> {
        try (CloseableResultSet<Book> rs = (CloseableResultSet<Book>) datasource.retrieveAll()) {
            for (Book r : rs) {
                e.onNext(r);
            }
            e.onComplete();
        } catch (Exception ex) {
            e.onError(ex);
        }
    });
}
What is the correct way to constantly resubmit an operation until we call dispose or equivalent? I'm using RxJava 2
You have created an endless recursion; the loop will create more and more resources, and at some point it will fail with an OutOfMemoryError or StackOverflowError.
In order to repeat the Observable's work you should use the repeat() operator; it resubscribes to the Observable when it receives onComplete().
Besides that, some general comments on your code:
Why are you nesting the second Observable inside the subscriber? You are breaking the chain; you can simply continue the chain instead of creating a new Observable in the Subscriber.
Moreover, it seems (assuming Observable.fromIterable(collectedISBN) operates on the collectedISBN received in onNext(), otherwise where would it come from?) that you are collecting all items to a list and then flattening it again with fromIterable, so you can just continue the stream, something like this:
booksObserve
        .map(Book::getAllOrders)
        .flatMap(Observable::fromIterable)
        .buffer(10)
        // ...some more steps here...
        .toList()
        .toObservable()
        // resubmit
        .doOnComplete(this::executeObserve)
        .subscribeOn(Schedulers.io())
        .subscribe(validISBN -> {
            // do something with the valid ones
        });
Anyhow, with the nested Observable, the repeat() operator would only repeat the nested one, not the entire stream (which is what you want), as it is not connected to it.
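To make that concrete, a sketch of executeObserve rewritten with repeat() in place of the doOnComplete resubmission, under the same assumptions about manager and the omitted steps:

public void executeObserve() {
    manager.getAsObservable()
            .map(Book::getAllOrders)
            .flatMap(Observable::fromIterable)
            .buffer(10)
            // ...some more steps here...
            .toList()
            .toObservable()
            .repeat() // resubscribes the whole chain, including the source, on every onComplete
            .subscribeOn(Schedulers.io())
            .subscribe(validISBN -> {
                // do something with the valid ones
            });
}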
In continuation to my question: repeat(), as @yosriz suggested, is the proper way to go. The following simple snippet demonstrates that the observable source is re-invoked on each repeat:
Observable<Integer> recursiveObservable = Observable.create(emitter -> {
    System.out.println("Calling to emit data");
    Lists.newArrayList(1, 2, 3, 4, 5, 6, 7, 8, 9, 0).forEach(emitter::onNext);
    emitter.onComplete();
});

recursiveObservable
        .buffer(2)
        .repeat()
        .subscribe(integers -> {
            System.out.println(integers);
            TimeUnit.SECONDS.sleep(1);
        });
