I am trying to write a simple program using RxJava to generate an infinite sequence of natural numbers. So far I have found two ways to generate a sequence of numbers, using Observable.timer() and Observable.interval(). I am not sure these functions are the right way to approach this problem. I was expecting a simple function like the one we have in Java 8 to generate an infinite stream of natural numbers.
IntStream.iterate(1, value -> value + 1).forEach(System.out::println);
I tried using IntStream with Observable, but that does not work correctly. It sends an infinite stream of numbers only to the first subscriber. How can I correctly generate an infinite sequence of natural numbers?
import rx.Observable;
import rx.functions.Action1;

import java.util.stream.IntStream;

public class NaturalNumbers {

    public static void main(String[] args) {
        Observable<Integer> naturalNumbers = Observable.<Integer>create(subscriber -> {
            IntStream stream = IntStream.iterate(1, val -> val + 1);
            stream.forEach(naturalNumber -> subscriber.onNext(naturalNumber));
        });

        Action1<Integer> first = naturalNumber -> System.out.println("First got " + naturalNumber);
        Action1<Integer> second = naturalNumber -> System.out.println("Second got " + naturalNumber);
        Action1<Integer> third = naturalNumber -> System.out.println("Third got " + naturalNumber);

        naturalNumbers.subscribe(first);
        naturalNumbers.subscribe(second);
        naturalNumbers.subscribe(third);
    }
}
The problem is that on naturalNumbers.subscribe(first);, the OnSubscribe you implemented is called, and it does a forEach over an infinite stream, which is why your program never terminates.
One way you could deal with this is to subscribe each observer asynchronously on a different thread. To see the results easily I had to introduce a sleep into the stream processing:
Observable<Integer> naturalNumbers = Observable.<Integer>create(subscriber -> {
    IntStream stream = IntStream.iterate(1, i -> i + 1);
    stream.peek(i -> {
        try {
            // Added to visibly see printing
            Thread.sleep(50);
        } catch (InterruptedException e) {
        }
    }).forEach(subscriber::onNext);
});

final Subscription subscribe1 = naturalNumbers
        .subscribeOn(Schedulers.newThread())
        .subscribe(first);
final Subscription subscribe2 = naturalNumbers
        .subscribeOn(Schedulers.newThread())
        .subscribe(second);
final Subscription subscribe3 = naturalNumbers
        .subscribeOn(Schedulers.newThread())
        .subscribe(third);

Thread.sleep(1000);
System.out.println("Unsubscribing");
subscribe1.unsubscribe();
subscribe2.unsubscribe();
subscribe3.unsubscribe();
Thread.sleep(1000);
System.out.println("Stopping");
Observable.generate is exactly the operator to solve this class of problem reactively. I also assume this is a pedagogical example, since using an Iterable for this is probably better anyway.
Your code produces the whole stream on the subscriber's thread. Since it is an infinite stream the subscribe call will never complete. Aside from that obvious problem, unsubscribing is also going to be problematic since you aren't checking for it in your loop.
You want to use a scheduler to solve this problem - certainly do not use subscribeOn since that would burden all observers. Schedule the delivery of each number to onNext - and as a last step in each scheduled action, schedule the next one.
Essentially this is what Observable.generate gives you - each iteration is scheduled on the provided scheduler (which defaults to one that introduces concurrency if you don't specify it). Scheduler operations can be cancelled and avoid thread starvation.
Rx.NET solves it like this (actually there is an async/await model that's better, but not available in Java afaik):
static IObservable<int> Range(int start, int count, IScheduler scheduler)
{
    return Observable.Create<int>(observer =>
    {
        return scheduler.Schedule(0, (i, self) =>
        {
            if (i < count)
            {
                Console.WriteLine("Iteration {0}", i);
                observer.OnNext(start + i);
                self(i + 1);
            }
            else
            {
                observer.OnCompleted();
            }
        });
    });
}
Two things to note here:
The call to Schedule returns a subscription handle; it is returned to Observable.Create, so disposing the subscription cancels any pending scheduled work
The Schedule is recursive - the self parameter is a continuation used to schedule the next iteration. This allows unsubscription to cancel the operation.
Not sure how this looks in RxJava, but the idea should be the same. Again, Observable.generate will probably be simpler for you as it was designed to take care of this scenario.
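For reference, a rough RxJava 1.x equivalent of that recursive scheduling might look like the sketch below (an approximation, not from the original answer; imports assumed: rx.Observable, rx.Scheduler, rx.functions.Action0, java.util.concurrent.atomic.AtomicInteger). It uses a Scheduler.Worker; adding the worker to the subscriber ties cancellation of pending work to unsubscription:
static Observable<Integer> range(int start, int count, Scheduler scheduler) {
    return Observable.<Integer>create(subscriber -> {
        Scheduler.Worker worker = scheduler.createWorker();
        // Unsubscribing the subscriber also disposes any pending scheduled work.
        subscriber.add(worker);
        AtomicInteger index = new AtomicInteger(0);
        worker.schedule(new Action0() {
            @Override
            public void call() {
                int i = index.getAndIncrement();
                if (i < count) {
                    subscriber.onNext(start + i);
                    worker.schedule(this); // schedule the next iteration recursively
                } else {
                    subscriber.onCompleted();
                }
            }
        });
    });
}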
When creating infinite sequences, care should be taken to:
subscribe and observe on different threads; otherwise you will only serve a single subscriber
stop generating values as soon as the subscription terminates; otherwise runaway loops will eat your CPU
The first issue is solved by using subscribeOn(), observeOn() and the various schedulers.
The second issue is best solved by using the library-provided methods Observable.generate() or Observable.fromIterable(). They do the proper checking.
Check this:
Observable<Integer> naturalNumbers =
        Observable.<Integer, Integer>generate(() -> 1, (s, g) -> {
            logger.info("generating {}", s);
            g.onNext(s);
            return s + 1;
        }).subscribeOn(Schedulers.newThread());

Disposable sub1 = naturalNumbers
        .subscribe(v -> logger.info("1 got {}", v));
Disposable sub2 = naturalNumbers
        .subscribe(v -> logger.info("2 got {}", v));
Disposable sub3 = naturalNumbers
        .subscribe(v -> logger.info("3 got {}", v));

Thread.sleep(100);
logger.info("unsubscribing...");
sub1.dispose();
sub2.dispose();
sub3.dispose();
Thread.sleep(1000);
logger.info("done");
I have a question about how Java Streams and chained CompletableFutures perform.
My question is this: if I run the following code, calling execute() with 10 items in the list takes ~11 seconds to complete (number of items in the list plus 1). This is because I have two threads working in parallel: the first executes the digItUp operation, and once that's complete, the second executes the fillItBackIn operation, and the first starts processing digItUp on the next item in the list.
If I comment out line 36 (.collect(Collectors.toList())), the execute() method takes ~20 seconds to complete. The threads do not operate in parallel; for each item in the list, the digItUp operation completes, and then the fillItBackIn operation completes in sequence before the next item in the list is processed.
It's unclear to me why the exclusion of (.collect(Collectors.toList())) should change this behavior. Can someone explain?
The complete class:
package com.test;

import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.stream.Collectors;

public class SimpleExample {

    private final ExecutorService diggingThreadPool = Executors.newFixedThreadPool(1);
    private final ExecutorService fillingThreadPool = Executors.newFixedThreadPool(1);

    public SimpleExample() {
    }

    public static void main(String[] args) {
        List<Double> holesToDig = new ArrayList<>();
        Random random = new Random();
        for (int c = 0; c < 10; c++) {
            holesToDig.add(random.nextDouble(1000));
        }
        new SimpleExample().execute(holesToDig);
    }

    public void execute(List<Double> holeVolumes) {
        long start = System.currentTimeMillis();
        holeVolumes.stream()
                .map(volume -> {
                    CompletableFuture<Double> digItUpCF =
                            CompletableFuture.supplyAsync(() -> digItUp(volume), diggingThreadPool);
                    return digItUpCF.thenApplyAsync(volumeDugUp -> fillItBackIn(volumeDugUp), fillingThreadPool);
                })
                .collect(Collectors.toList())
                .forEach(cf -> {
                    Double volume = cf.join();
                    System.out.println("Dug a hole and filled it back in. Net volume: " + volume);
                });
        System.out.println("Dug up and filled back in " + holeVolumes.size() + " holes in "
                + (System.currentTimeMillis() - start) + " ms");
    }

    public Double digItUp(Double volume) {
        try {
            Thread.sleep(1000);
        } catch (InterruptedException e) {
        }
        System.out.println("Dug hole with volume " + volume);
        return volume;
    }

    public Double fillItBackIn(Double volumeDugUp) {
        try {
            Thread.sleep(1000);
        } catch (InterruptedException e) {
        }
        System.out.println("Filled back in hole of volume " + volumeDugUp);
        return 0.0;
    }
}
The reason is that collect(Collectors.toList()) is a terminal operation, hence it triggers the stream pipeline (remember that streams are evaluated lazily). So when you call collect, all of the CompletableFuture instances are constructed and placed in the list. This means you end up with a list of CompletableFutures, where each one is in turn a chain composed of two stages; let's call them X and Y.
Every time the first thread executor finishes an X stage, it is free to process the X stage of the next CompletableFuture, while the other thread executor is processing stage Y of the previous one. This is the result that we intuitively expect.
On the other hand, when you don't call collect, forEach becomes the terminal operation. In this case every element in the stream is processed sequentially (to confirm, try switching to parallelStream()), so stages X and Y are executed for the first CompletableFuture, and only when stage Y of the previous stream element has finished does forEach move to the second element in the stream pipeline; only then is a new CompletableFuture mapped from the original Double value.
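To see that laziness in isolation, consider a tiny standalone sketch (illustrative, not from the question's code; assumes java.util.stream.Stream is imported). Without an intermediate collect, each element traverses the whole pipeline before the next one enters it:
Stream.of(1, 2, 3)
        .map(i -> { System.out.println("map " + i); return i; })
        .forEach(i -> System.out.println("forEach " + i));
// Prints: map 1, forEach 1, map 2, forEach 2, map 3, forEach 3.
// With .collect(Collectors.toList()) in between, all three map calls
// would run before any forEach call.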
Love this question, and M A's answer is awesome! I had a similar use case, and I was using RxJava there. It worked very well, but my colleagues challenged me to implement it without that. T.T
I tested your example and found a workaround that achieves the same performance without collect. The trick is to let the cf.join() be executed in another thread.
.forEach(cf -> CompletableFuture.supplyAsync(cf::join, anotherThreadpool)
        // another thread pool for the join; or omit it to use the default ForkJoinPool.commonPool()
        .thenAccept(v -> System.out.println("Dug a hole and filled it back in. Net volume: " + v))
);
But I have to say, this might lead to potential issues, as it lacks support for backpressure: if the upstream is infinite and fast but the consumer is too slow, all the rapidly created CompletableFutures in the map operator would accumulate and be submitted to the first diggingThreadPool, eventually causing RejectedExecutionException, OOM, etc.
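If you stay on plain CompletableFuture, one crude substitute for backpressure is to bound the number of in-flight items with a Semaphore. A hedged sketch (illustrative, not from the original answer), assuming it lives inside the question's SimpleExample class and that java.util.concurrent.Semaphore is imported:
public void executeBounded(List<Double> holeVolumes) throws InterruptedException {
    Semaphore inFlight = new Semaphore(4); // at most 4 holes in progress at once
    for (Double volume : holeVolumes) {
        inFlight.acquire(); // blocks the producer once the limit is reached
        CompletableFuture.supplyAsync(() -> digItUp(volume), diggingThreadPool)
                .thenApplyAsync(this::fillItBackIn, fillingThreadPool)
                .whenComplete((v, t) -> inFlight.release());
    }
}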
So I know this has been asked many times before, but I have tried many things and nothing seems to work.
Let's start with these blogs/articles/code:
https://blog.danlew.net/2016/01/25/rxjavas-repeatwhen-and-retrywhen-explained/
https://jimbaca.com/rxjava-retrywhen/
http://blog.inching.org/RxJava/2016-12-12-rx-java-error-handling.html
https://pamartinezandres.com/rxjava-2-exponential-backoff-retry-only-when-internet-is-available-5a46188ab175
https://gist.github.com/wotomas/35006d156a16345349a2e4c8e159e122
And many others.
In a nutshell all of them describe how you can use retryWhen to implement exponential back-off. Something like this:
source
    .retryWhen(
        errors -> {
            return errors
                .zipWith(Observable.range(1, 3), (n, i) -> i)
                .flatMap(
                    retryCount -> {
                        System.out.println("retry count " + retryCount);
                        // Note: Math.pow(1, retryCount) is always 1, so this delay is
                        // a constant one second; a base of 2 would make it exponential.
                        return Observable.timer((long) Math.pow(1, retryCount), SECONDS);
                    });
        })
Even the documentation in the library agrees with it:
https://github.com/ReactiveX/RxJava/blob/3.x/src/main/java/io/reactivex/rxjava3/core/Observable.java#L11919.
However, I've tried this and some pretty similar variations, not worth describing here, and nothing seems to work. The examples do work when using blocking subscribers, but I want to avoid blocking threads.
So if we apply a blocking subscriber to the previous observable, like this:
.blockingForEach(System.out::println);
It works as expected. But that's not the idea. If we try:
.subscribe(
        x -> System.out.println("onNext: " + x),
        Throwable::printStackTrace,
        () -> System.out.println("onComplete"));
The flow runs only once, which is not what I want to achieve.
Does that mean it cannot be used as I'm trying to? From the documentation it doesn't seem to be a problem trying to accomplish my requirement.
Any idea what I am missing?
TIA.
Edit: There are 2 ways I'm testing this:
A test method (using TestNG):
Observable<Integer> source =
        Observable.just("test")
                .map(
                        x -> {
                            System.out.println("trying again");
                            return Integer.parseInt(x);
                        });

source
        .retryWhen(
                errors -> {
                    return errors
                            .zipWith(Observable.range(1, 3), (n, i) -> i)
                            .flatMap(
                                    retryCount -> {
                                        return Observable.timer((long) Math.pow(1, retryCount), SECONDS);
                                    });
                })
        .subscribe(...);
From a Kafka consumer (using Spring Boot):
This is only the subscription to the observer; the retry logic is what I described earlier in the post.
@KafkaListener(topics = "${kafka.config.topic}")
public void receive(String payload) {
    log.info("received payload='{}'", payload);
    service
            .updateMessage(payload)
            .subscribe(...)
            .dispose();
}
The main issue with your code is that Observable.timer operates by default on the computation scheduler. This adds extra effort when trying to verify the behaviour within a test.
Here is some unit testing code that verifies that your retry code is actually retrying.
It adds a counter, just so we can easily check how many calls have happened.
It uses the TestScheduler instead of the computation scheduler so that we can simulate the passage of time through advanceTimeBy.
TestScheduler testScheduler = new TestScheduler();
AtomicInteger counter = new AtomicInteger();
Observable<Integer> source =
        Observable.just("test")
                .map(
                        x -> {
                            System.out.println("trying again");
                            counter.getAndIncrement();
                            return Integer.parseInt(x);
                        });

TestObserver<Integer> testObserver = source
        .retryWhen(
                errors -> {
                    return errors
                            .zipWith(Observable.range(1, 3), (n, i) -> i)
                            .flatMap(
                                    retryCount -> {
                                        return Observable.timer((long) Math.pow(1, retryCount), SECONDS, testScheduler);
                                    });
                })
        .test();

assertEquals(1, counter.get());
testScheduler.advanceTimeBy(1, SECONDS);
assertEquals(2, counter.get());
testScheduler.advanceTimeBy(1, SECONDS);
assertEquals(3, counter.get());
testScheduler.advanceTimeBy(1, SECONDS);
assertEquals(4, counter.get());
testObserver.assertComplete();
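Outside of a test, the same point explains why the plain subscribe() in the question appeared to run only once: subscribe() returns immediately, and the retries fire later on the computation scheduler's daemon threads, so if the calling thread exits (or disposes) right away, nothing more is observed. A self-contained sketch (illustrative, RxJava 3 assumed; the linear delay is a simple stand-in for the question's formula, and the sleep is just a crude way to keep the JVM alive):
import static java.util.concurrent.TimeUnit.SECONDS;

import io.reactivex.rxjava3.core.Observable;

public class RetryDemo {
    public static void main(String[] args) throws InterruptedException {
        Observable<Integer> source = Observable.just("test")
                .map(x -> {
                    System.out.println("trying again");
                    return Integer.parseInt(x);
                });

        source.retryWhen(errors -> errors
                        .zipWith(Observable.range(1, 3), (n, i) -> i)
                        .flatMap(retryCount -> Observable.timer(retryCount, SECONDS)))
                .subscribe(
                        x -> System.out.println("onNext: " + x),
                        Throwable::printStackTrace,
                        () -> System.out.println("onComplete"));

        // Without this, main exits before the timers on the daemon computation
        // threads fire, and only the first attempt is visible.
        Thread.sleep(10_000);
    }
}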
Below is my code snippet.
I know you are not supposed to block cachedFlowable like this, but this is just an example.
It gets stuck at the blockingGet line.
If I replace singleOrError with singleElement, the code will still get stuck. If I replace singleOrError with firstElement, the code will no longer get stuck.
Can someone please explain to me why this is the case?
public static void main(String[] args) {
    final Flowable<Integer> cachedFlowable = Flowable.just(1).cache();
    cachedFlowable
            .doOnNext(i -> {
                System.out.println("doOnNext " + i);
                final Integer j = cachedFlowable.singleOrError().blockingGet();
                System.out.println("after blockingGet " + j);
            })
            .blockingSubscribe();
}
The reason it deadlocks with the singleX operators is that they wait for a possible second item before emitting, but since you are blocking inside doOnNext, any second item or the completion signal from the main source can never be delivered. The firstX operators only care about the very first item, so they unblock almost immediately, which allows the source to complete.
So yes, you should not use blocking methods in flows like that; instead, use flatMap or concatMap to run a per-item subflow:
var cache = Flowable.just(1).cache();
cache
        .doOnNext(i -> System.out.println("doOnNext " + i))
        .concatMapSingle(item -> cache.firstOrError())
        .doOnNext(j -> System.out.println("after " + j))
        .blockingSubscribe();
I have an infinite hot flux of data. I am about to carry out an operation on each element in the stream; each operation returns a Mono that will complete (one way or another) after some finite time.
There is the possibility of an error being thrown from these operations. If so, I want to resubscribe to the hot flux without missing anything, retrying elements that were in the middle of being processed when the error was thrown (i.e. anything that did not complete successfully).
What do I do here? I can tolerate repeated operations on the same elements, but not losing elements entirely from the stream.
I've attempted to use a ReplayProcessor to handle this, but I can't see a way of making it work without repeating a lot of operations that might well have succeeded (using a very conservative timeout), or losing elements because new elements override old ones in the buffer (as below).
Test case:
@Test
public void fluxTest() {
    List<String> strings = new ArrayList<>();
    strings.add("one");
    strings.add("two");
    strings.add("three");
    strings.add("four");

    ConnectableFlux<String> flux = Flux.fromIterable(strings).publish();

    // Goes boom after three uses of its method; otherwise
    // returns a Mono completing after a little time
    DangerousClass dangerousClass = new DangerousClass(3);

    ReplayProcessor<String> replay = ReplayProcessor.create(3);
    flux.subscribe(replay);

    replay.flatMap(dangerousClass::doThis)
            .retry(1)
            .doOnNext(s -> LOG.info("Completed {}", s))
            .subscribe();

    flux.connect();
    flux.blockLast();
}
public class DangerousClass {

    private static final Logger LOG = LoggerFactory.getLogger(DangerousClass.class);

    private int boomCount;
    private AtomicInteger count;

    public DangerousClass(int boomCount) {
        this.boomCount = boomCount;
        this.count = new AtomicInteger(0);
    }

    public Mono<String> doThis(String s) {
        return Mono.fromSupplier(() -> {
            LOG.info("doing dangerous {}", s);
            if (count.getAndIncrement() == boomCount) {
                LOG.error("Throwing exception from {}", s);
                throw new RuntimeException("Boom!");
            }
            return s;
        }).delayElement(Duration.ofMillis(600));
    }
}
This prints:
doing dangerous one
doing dangerous two
doing dangerous three
doing dangerous four
Throwing exception from four
doing dangerous two
doing dangerous three
doing dangerous four
Completed four
Completed two
Completed three
One is never completed.
The error (at least in the above example) can only occur in the flatMap(dangerousClass::doThis) call - so resubscribing to the root Flux and replaying elements when this one flatMap() call has failed seems a bit odd, and (probably) isn't what you want to do.
Instead, I'd recommend ditching the ReplayProcessor and just calling retry on the inner flatMap() call instead, so you end up with something like:
ConnectableFlux<String> flux = Flux.range(1, 10).map(n -> "Entry " + n).publish();
DangerousClass dangerousClass = new DangerousClass(3);

flux.flatMap(x -> dangerousClass.doThis(x).retry(1))
        .doOnNext(s -> System.out.println("Completed " + s))
        .subscribe();

flux.connect();
This will give you something like the following, with all entries completed and only the failing entry retried:
doing dangerous Entry 1
doing dangerous Entry 2
doing dangerous Entry 3
doing dangerous Entry 4
Throwing exception from Entry 4
doing dangerous Entry 4
Completed Entry 2
Completed Entry 1
Completed Entry 3
Completed Entry 4
I'm consuming an API that returns CompletableFutures for querying devices (similar to digitalpetri modbus).
I need to call this API with a couple of options to query a device and figure out what it is - this is basically trial and error until it succeeds. These are embedded device protocols that I cannot change, but you can think of the process as working similar to the following:
Are you an apple?
If not, then are you a pineapple?
If not, then are you a pen?
...
While the API uses futures, in reality the communications are serial (going over the same physical piece of wire), so they will never execute concurrently. Once I know what the device is, I want to be able to stop trying and let the caller know what it is.
I already know that I can get the result of only one of the futures with anyOf (see below), but that may result in additional attempts that should be avoided.
Is there a pattern for chaining futures where you stop once one of them succeeds?
The following is similar, but wasteful of very limited resources.
List<CompletableFuture<String>> futures = Arrays.asList(
        CompletableFuture.supplyAsync(() -> "attempt 1"),
        CompletableFuture.supplyAsync(() -> "attempt 2"),
        CompletableFuture.supplyAsync(() -> "attempt 3"));

// Use a typed array; a bare cast of futures.toArray() would throw ClassCastException.
CompletableFuture<?>[] futuresArray = futures.toArray(new CompletableFuture<?>[0]);
CompletableFuture<Object> c = CompletableFuture.anyOf(futuresArray);
Suppose that you have a method that is "pseudo-asynchronous" as you describe, i.e. it has an asynchronous API but requires some locking to perform:
private final static Object lock = new Object();

private static CompletableFuture<Boolean> pseudoAsyncCall(int input) {
    return CompletableFuture.supplyAsync(() -> {
        synchronized (lock) {
            System.out.println("Executing for " + input);
            try {
                Thread.sleep(1000);
            } catch (InterruptedException e) {
                throw new RuntimeException(e);
            }
            return input > 3;
        }
    });
}
Given a List<Integer> of inputs that you want to check against this method, you can check each of them in sequence with recursive composition:
public static CompletableFuture<Integer> findMatch(List<Integer> inputs) {
    return findMatch(inputs, 0);
}

private static CompletableFuture<Integer> findMatch(List<Integer> inputs, int startIndex) {
    if (startIndex >= inputs.size()) {
        // no match found -- an exception could be thrown here if preferred
        return CompletableFuture.completedFuture(null);
    }
    return pseudoAsyncCall(inputs.get(startIndex))
            .thenCompose(result -> {
                if (result) {
                    return CompletableFuture.completedFuture(inputs.get(startIndex));
                } else {
                    return findMatch(inputs, startIndex + 1);
                }
            });
}
This would be used like this:
public static void main(String[] args) {
    List<Integer> inputs = Arrays.asList(0, 1, 2, 3, 4, 5);
    CompletableFuture<Integer> matching = findMatch(inputs);
    System.out.println("Found match: " + matching.join());
}
Output:
Executing for 0
Executing for 1
Executing for 2
Executing for 3
Executing for 4
Found match: 4
As you can see, it is not called for input 5, while your API (findMatch()) remains asynchronous.
I think the best you can do is, after your retrieval of the result,
futures.forEach(f -> f.cancel(true));
This will not affect the one having produced the result, and tries its best to stop the others. Since IIUC you get them from an outside source, there's no guarantee it will actually interrupt their work.
However, since "this class has no direct control over the computation that causes it to be completed, cancellation is treated as just another form of exceptional completion" (from the CompletableFuture doc), I doubt it will do what you actually want.
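To tie this back to the anyOf snippet from the question, the cancellation can be hooked onto the winner's completion (an illustrative sketch, reusing the futures and futuresArray variables from the question's example):
// When the first future completes, try to cancel the rest; cancelling the
// winner is a no-op since it is already complete.
CompletableFuture<Object> first = CompletableFuture.anyOf(futuresArray);
first.whenComplete((result, error) -> futures.forEach(f -> f.cancel(true)));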