Using RxJava 2, I'm trying to create an asynchronous Event Bus.
I have a singleton object, with a PublishSubject property. Emitters can send an event to the bus using onNext on the subject.
If subscribers have a long task to execute, I want my bus to dispatch the tasks on multiple threads so that they execute concurrently. In other words, I want work on an item to start immediately after the item is emitted, even if work on the previous item has not completed.
However, even using observeOn with a scheduler, I cannot run my tasks concurrently.
Sample code:
public void test() throws Exception {
    Subject<Integer> busSubject = PublishSubject.<Integer>create().toSerialized();
    busSubject.observeOn(Schedulers.computation())
            .subscribe(new LongTaskConsumer());
    for (int i = 1; i < 5; i++) {
        System.out.println(i + " - event");
        busSubject.onNext(i);
        Thread.sleep(1000);
    }
    Thread.sleep(1000);
}

private static class LongTaskConsumer implements Consumer<Integer> {
    @Override
    public void accept(Integer i) throws Exception {
        System.out.println(i + " - start work");
        System.out.println(i + " - computation on thread " + Thread.currentThread().getName());
        Thread.sleep(2000);
        System.out.println(i + " - end work");
    }
}
Prints:
1 - event
1 - start work
1 - computation on thread RxComputationThreadPool-1
2 - event
3 - event
1 - end work
2 - start work
2 - computation on thread RxComputationThreadPool-1
4 - event
2 - end work
3 - start work
3 - computation on thread RxComputationThreadPool-1
3 - end work
4 - start work
4 - computation on thread RxComputationThreadPool-1
4 - end work
Which means that the work on item 2 waited for the end of the work on item 1, even though event 2 had already been emitted.
When the call below happens, one worker is created from Schedulers.computation() and is used for the whole stream. That's why all of the work you submitted is done on RxComputationThreadPool-1.
busSubject.observeOn(Schedulers.computation())
.subscribe(new LongTaskConsumer());
To schedule work on multiple threads:
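// somethingIntensive is a placeholder for your own Consumer<Integer>.
// The inner source is an Observable because busSubject is an Observable.
busSubject.flatMap(x ->
        Observable.just(x)
                .subscribeOn(Schedulers.computation())
                .doOnNext(somethingIntensive))
        .subscribe(new LongTaskConsumer());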
Note also that the intensive work is performed inside the flatMap rather than in the LongTaskConsumer because all items will arrive serially to LongTaskConsumer.
There are other approaches to doing work in parallel that you may want to investigate depending on how many events are hitting the PublishSubject.
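For comparison, the same fan-out idea can be sketched with the plain JDK instead of RxJava: each event is handed to a thread pool as soon as it is emitted, so work on one item does not wait for the previous one. This is only an analogy, not the RxJava solution above; the class name FanOutSketch, the pool size and the 200 ms sleep are assumptions made up for illustration.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration using the plain JDK (no RxJava): one pool task per event.
class FanOutSketch {

    // Submits `events` tasks to a pool of `threads` workers and returns
    // the names of the threads that actually ran them.
    static Set<String> dispatch(int events, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        Set<String> workers = ConcurrentHashMap.newKeySet();
        for (int i = 1; i <= events; i++) {
            final int event = i;
            pool.execute(() -> {
                String name = Thread.currentThread().getName();
                workers.add(name);
                System.out.println(event + " - start work on " + name);
                sleepQuietly(200); // stands in for the long task
                System.out.println(event + " - end work");
            });
        }
        pool.shutdown(); // no new tasks; already submitted ones still run
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return workers;
    }

    private static void sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println("Threads used: " + dispatch(4, 4).size());
    }
}
```

With four events and four pool threads, every event gets its own worker, so the "start work" lines appear without waiting for an earlier "end work".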
I tried to compile the example from Thinking in Java by Bruce Eckel:
import java.util.concurrent.*;

public class SimplePriorities implements Runnable {
    private int countDown = 5;
    private volatile double d; // No optimization
    private int priority;

    public SimplePriorities(int priority) {
        this.priority = priority;
    }

    public String toString() {
        return Thread.currentThread() + ": " + countDown;
    }

    public void run() {
        Thread.currentThread().setPriority(priority);
        while(true) {
            // An expensive, interruptable operation:
            for(int i = 1; i < 100000; i++) {
                d += (Math.PI + Math.E) / (double)i;
                if(i % 1000 == 0)
                    Thread.yield();
            }
            System.out.println(this);
            if(--countDown == 0) return;
        }
    }

    public static void main(String[] args) {
        ExecutorService exec = Executors.newCachedThreadPool();
        for(int i = 0; i < 5; i++)
            exec.execute(
                new SimplePriorities(Thread.MIN_PRIORITY));
        exec.execute(
            new SimplePriorities(Thread.MAX_PRIORITY));
        exec.shutdown();
    }
}
According to the book, the output has to look like:
Thread[pool-1-thread-6,10,main]: 5
Thread[pool-1-thread-6,10,main]: 4
Thread[pool-1-thread-6,10,main]: 3
Thread[pool-1-thread-6,10,main]: 2
Thread[pool-1-thread-6,10,main]: 1
Thread[pool-1-thread-3,1,main]: 5
Thread[pool-1-thread-2,1,main]: 5
Thread[pool-1-thread-1,1,main]: 5
...
But in my case the 6th thread doesn't execute its task first, and the threads are out of order. Could you please explain what's wrong? I just copied the source and didn't add any lines of code.
The code works fine and produces the output from the book.
Your IDE probably has a console window with a scroll bar; just scroll up and you will see the 6th thread doing its job first.
However, the results may differ depending on the OS / JVM version. This code runs as expected for me on Windows 10 / JVM 8.
There are two issues here:
If two threads with the same priority want to write output, which one goes first?
The order of threads (with the same priority) is undefined, therefore the order of output is undefined. It is likely that a single thread is allowed to write several outputs in a row (because that's how most thread schedulers work), but it could also be completely random, or anything in between.
How many threads will a cached thread pool create?
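That depends on timing. A cached thread pool creates a new thread whenever a task is submitted and no idle thread is available, and reuses idle threads otherwise; it does not queue tasks. On a dual-core system, however, running more than 4 CPU-bound threads at once is pointless, because there will hardly be any CPU available to execute them. In this scenario the later threads may make real progress only after earlier tasks are completed.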
Hint: there is also a fixed-size thread pool, experimenting with that should change the output.
In summary, there is nothing wrong with your code; it is just wrong to assume that threads are executed in any particular order. It is even technically possible (although very unlikely) that the first task is already completed before the last task is even started. If your book says that the above order is "correct", then the book is simply wrong. On an average system that might be the most likely output, but, as above, with threads there is never any order unless you enforce it.
One way to influence it is thread priorities: higher-priority threads tend to get their work done first, although priorities are only a hint to the scheduler. You can find other concepts in the java.util.concurrent package.
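To illustrate what "enforcing" an order can look like, here is a minimal sketch, assuming it is acceptable to run the tasks one after another. A single-threaded executor (effectively a fixed-size pool of one thread) runs tasks in submission order, so the result is the same on every run. The class name OrderedTasks is hypothetical and not from the book.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration: one worker thread means tasks run in submission order.
class OrderedTasks {

    // Submits `tasks` numbered tasks and returns the order in which they ran.
    static List<Integer> runInOrder(int tasks) {
        ExecutorService exec = Executors.newSingleThreadExecutor();
        List<Integer> order = Collections.synchronizedList(new ArrayList<>());
        for (int i = 0; i < tasks; i++) {
            final int id = i;
            exec.execute(() -> order.add(id)); // queued behind earlier tasks
        }
        exec.shutdown();
        try {
            exec.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return order;
    }

    public static void main(String[] args) {
        System.out.println(runInOrder(6)); // prints [0, 1, 2, 3, 4, 5]
    }
}
```

The price is that nothing runs concurrently any more; with a cached or larger fixed pool the order is again up to the scheduler.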
I have a long running task (say an Observable<Integer>) that I want to trigger as few times in my application as possible. I have multiple "views" on the task that process the events that it sends in various ways. I only have one subscribe in my entire application.
How do I ensure that the long running task is only triggered once for each subscription, and is only triggered when required by a subscription?
To make things more concrete, here is a unit-test:
@Test
public void testSubscriptionCount() {
    final Counter counter = new Counter();

    // Some long running tasks that should be triggered once per subscribe
    final Observable<Integer> a = Observable.just(1, 2, 3, 4, 5)
            .doOnSubscribe(subscription -> {
                counter.increment();
            });

    // Some "view" on the long running task
    final Observable<Integer> b = a.filter(x -> x % 2 == 0);

    // Another "view" on the long running task
    final Observable<Integer> c = a.filter(x -> x % 2 == 1);

    // A view on the views
    final Observable<Integer> d = Observable.zip(b, c, (x, y) -> x + y);

    d.toList().blockingGet();
    assertEquals(1, counter.count); // Fails, counter.count == 2
}
I would like a to only be triggered when one of its views (b, c or d) is subscribed to, but also only once per subscription.
In the code above, the subscription happens twice (I presume that d triggers b and c, which both trigger a independently).
Adding .share() does not solve the problem (although I think it is along the right lines):
// Some long running tasks that should be triggered once per subscribe
final Observable<Integer> a = Observable.just(1, 2, 3, 4, 5)
        .doOnSubscribe(subscription -> counter.increment())
        .share();
java.lang.AssertionError:
Expected :1
Actual :2
If your goal is to prevent multiple executions when the observer is subscribed in parallel, .share() is what you are looking for:
Observable<Integer> shared = source.share();
// In thread 1:
shared.subscribe(...);
// In thread 2:
shared.subscribe(...);
So long as the source observable has not yet completed when the second subscription happens, it will receive the same results as the first, and will not force another execution of the source observable.
The RxJava documentation has a much more detailed explanation, but it's basically a wrapper that has some reference counting and only subscribes to the source observable when necessary to avoid concurrent executions.
Also keep in mind that timing will play an important part in which values are actually delivered. I don't believe .share() will do any specific buffering of elements, so if elements are delivered prior to the second subscription, the second subscription will not get those elements. You'd have to use .replay(), .cache() or some other means of holding onto results for late subscribers.
I need to create concurrent network requests. Depending on the result of these requests, more requests might be started.
I want to get a single Completable that completes once all of the requests have finished and no further requests need to be created.
My question is, is this possible to achieve using the following snippet:
return Completable.defer(() -> {
    startRequests();
    return Observable.merge(requestSubject.asObservable()).toCompletable();
});
In this example, startRequests would add network requests (Retrofit) to the requestSubject, which is a PublishSubject<Observable<SomeResponse>>.
Specifically I'd expect the network requests to start on the IO scheduler once subscribed, and the returned Completable to not complete until I, in the onNext of one of the requests, call requestSubject.onComplete().
I have yet to figure out how I will process the response of the requests without executing the request twice (Retrofit requests on each subscribe).
Does it work this way, or is there a better way to achieve what I am looking for? Thanks!
Just use flatMap() and convert it to a Completable.
Here is an example that simulates executing network requests, each returning 2 items, on the io pool and then performs a computation on these items on the computation pool, everything in parallel:
// Used by the delays below; not declared in the original snippet.
private final Random random = new Random();

@Test
public void foo() throws Exception {
    Observable.range(1, 10)
            .flatMap(this::getNItemsFromNetwork)
            .flatMap(this::asyncComputation)
            .ignoreElements()
            .subscribe(() -> System.out.println("onComplete"),
                    (t) -> System.out.println("onError"));
    // Keep the test thread alive until the asynchronous work has finished.
    Thread.sleep(10000);
}

Observable<String> getNItemsFromNetwork(int count) {
    return Observable.just(count)
            .subscribeOn(Schedulers.io())
            .doOnNext(i -> System.out.println("Executing request for " + count + " on thread: " + Thread.currentThread()))
            .flatMap(number -> Observable.just("Item nr " + number + ".1", "Item nr " + number + ".2"))
            .delay(random.nextInt(1000), TimeUnit.MILLISECONDS);
}

Observable<String> asyncComputation(String string) {
    return Observable.just(string)
            .subscribeOn(Schedulers.computation())
            .delay(random.nextInt(1000), TimeUnit.MILLISECONDS)
            .doOnNext(number -> System.out.println("Computing " + number + " on thread: " + Thread.currentThread()));
}
And output for validation:
Executing request for 7 on thread: Thread[RxCachedThreadScheduler-7,5,main]
Executing request for 6 on thread: Thread[RxCachedThreadScheduler-6,5,main]
Executing request for 5 on thread: Thread[RxCachedThreadScheduler-5,5,main]
Executing request for 1 on thread: Thread[RxCachedThreadScheduler-1,5,main]
Executing request for 4 on thread: Thread[RxCachedThreadScheduler-4,5,main]
Executing request for 3 on thread: Thread[RxCachedThreadScheduler-3,5,main]
Executing request for 8 on thread: Thread[RxCachedThreadScheduler-8,5,main]
Executing request for 2 on thread: Thread[RxCachedThreadScheduler-2,5,main]
Executing request for 9 on thread: Thread[RxCachedThreadScheduler-9,5,main]
Executing request for 10 on thread: Thread[RxCachedThreadScheduler-10,5,main]
Computing Item nr 7.1 on thread: Thread[RxComputationThreadPool-5,5,main]
Computing Item nr 10.2 on thread: Thread[RxComputationThreadPool-2,5,main]
Computing Item nr 6.2 on thread: Thread[RxComputationThreadPool-1,5,main]
Computing Item nr 3.1 on thread: Thread[RxComputationThreadPool-7,5,main]
Computing Item nr 4.1 on thread: Thread[RxComputationThreadPool-7,5,main]
Computing Item nr 3.2 on thread: Thread[RxComputationThreadPool-1,5,main]
Computing Item nr 6.1 on thread: Thread[RxComputationThreadPool-7,5,main]
Computing Item nr 2.1 on thread: Thread[RxComputationThreadPool-7,5,main]
Computing Item nr 5.2 on thread: Thread[RxComputationThreadPool-2,5,main]
Computing Item nr 5.1 on thread: Thread[RxComputationThreadPool-5,5,main]
Computing Item nr 7.2 on thread: Thread[RxComputationThreadPool-2,5,main]
Computing Item nr 2.2 on thread: Thread[RxComputationThreadPool-1,5,main]
Computing Item nr 10.1 on thread: Thread[RxComputationThreadPool-5,5,main]
Computing Item nr 9.1 on thread: Thread[RxComputationThreadPool-5,5,main]
Computing Item nr 4.2 on thread: Thread[RxComputationThreadPool-1,5,main]
Computing Item nr 9.2 on thread: Thread[RxComputationThreadPool-2,5,main]
Computing Item nr 8.1 on thread: Thread[RxComputationThreadPool-5,5,main]
Computing Item nr 8.2 on thread: Thread[RxComputationThreadPool-2,5,main]
Computing Item nr 1.1 on thread: Thread[RxComputationThreadPool-7,5,main]
Computing Item nr 1.2 on thread: Thread[RxComputationThreadPool-1,5,main]
onComplete
OK, I'm not sure I understood your question 100% correctly, but here's a rough sketch of what I would do. I believe you want to have a Subject as an intermediate level for caching, so that the actual request is not interrupted when you call unsubscribe.
1) Assume you have 2 Retrofit Observables.
2) In startRequests() you need to subscribe to both of them (on some scheduler you need), apply doOnNext operator and delegate data to your subject. So the subject will receive 2 ticks of data from API.
3) Subscribe to your subject, you will receive 2 ticks of data.
Basically there's no need to wait for completion, you will just receive N amount of onNext ticks.
But if you want to have some indicator that all of the requests have completed, you can for example merge all retrofit observables, and delegate all events to subject, so it will receive N amount of onNext ticks and onComplete in the end.
I think that using Subjects is an unnecessary complication.
You can simply use flatMap() and convert to a Completable at the end using toCompletable(). You didn't mention how your specific loop works, but assuming you have some List that you run queries over, it goes something like this, where startRequest(data) returns your Retrofit query Observable:
List<Data> list = ...;
Observable.from(list)
        .flatMap(new Func1<Data, Observable<Result>>() {
            @Override
            public Observable<Result> call(Data data) {
                return startRequest(data);
            }
        }).toCompletable();
Regarding your second requirement, making more requests depending on the results: in this case you might want to gather all results using toList(), which gives you a single onNext() notification. You can then filter it and get an Observable that emits an item when you want to request more data:
List<Data> list =...;
Observable.from(list)
        .flatMap(new Func1<Data, Observable<Result>>() {
            @Override
            public Observable<Result> call(Data data) {
                return startRequest(data);
            }
        })
        .toList()
        .filter(new Func1<List<Result>, Boolean>() {
            @Override
            public Boolean call(List<Result> results) {
                return shouldRequestMore(results);
            }
        });
Does Observable cache emitted items? I have two tests that lead me to different conclusions:
From test #1 I conclude that it does:
Test #1:
Observable<Long> clock = Observable
        .interval(1000, TimeUnit.MILLISECONDS)
        .take(10)
        .map(i -> i++);
// subscribe for the first time
clock.subscribe(i -> System.out.println("a: " + i));
// subscribe with 2.5 seconds delay
Executors.newScheduledThreadPool(1).schedule(
        () -> clock.subscribe(i -> System.out.println(" b: " + i)),
        2500,
        TimeUnit.MILLISECONDS
);
Output #1:
a: 0
a: 1
a: 2
b: 0
a: 3
b: 1
But the second test shows that we get different values for two observers:
Test #2:
Observable<Integer> observable = Observable
        .range(1, 1000000)
        .sample(7, TimeUnit.MILLISECONDS);
observable.subscribe(i -> System.out.println("Subscriber #1:" + i));
observable.subscribe(i -> System.out.println("Subscriber #2:" + i));
Output #2:
Subscriber #1:72745
Subscriber #1:196390
Subscriber #1:678171
Subscriber #2:336533
Subscriber #2:735521
There are two kinds of Observables: hot and cold. Cold observables tend to generate the same sequence for each of their Observers unless external effects, such as a timer-based action, are associated with them.
In the first example, you get the same sequence twice because there are no external effects other than the timer ticks, which you get one by one. In the second example, you sample a fast source, and sampling with time has a non-deterministic effect: each nanosecond counts, so even the slightest imprecision leads to a different value being reported.
And this is a normal thread program:
class Counter implements Runnable {
    private int currentValue;
    public Counter() { currentValue = 0; }
    public int getValue() { return currentValue; }
    public void run() { // (1) Thread entry point
        try {
            while (currentValue < 5) {
                System.out.println(Thread.currentThread().getName() + ": " + (currentValue++)); // (2) Print thread name.
                Thread.sleep(250); // (3) Current thread sleeps.
            }
        } catch (InterruptedException e) {
            System.out.println(Thread.currentThread().getName() + " interrupted.");
        }
        System.out.println("Exit from thread: " + Thread.currentThread().getName());
    }
}
//_______________________________________________________________________________
public class Client {
    public static void main(String[] args) {
        Counter counterA = new Counter(); // (4) Create a counter.
        Thread worker = new Thread(counterA, "Counter A");// (5) Create a new thread.
        System.out.println(worker);
        worker.start(); // (6) Start the thread.
        try {
            int val;
            do {
                val = counterA.getValue(); // (7) Access the counter value.
                System.out.println("Counter value read by " + Thread.currentThread().getName()+ ": " + val); // (8) Print thread name.
                Thread.sleep(1000); // (9) Current thread sleeps.
            } while (val < 5);
        } catch (InterruptedException e) {
            System.out.println("The main thread is interrupted.");
        }
        System.out.println("Exit from main() method.");
    }
}
and the output is
Thread[Counter A,5,main]
Counter value read by main thread: 0
Counter A: 0
Counter A: 1
Counter A: 2
Counter A: 3
Counter value read by main thread: 4
Counter A: 4
Exit from thread: Counter A
Counter value read by main thread: 5
Exit from main() method.
My question is: even though the worker thread was started before the main thread enters its try block, the main thread's execution starts first, and the child thread only gets into action when the main thread goes to sleep.
This picture (taken from "A Programmer's Guide to Java SCJP Certification: A Comprehensive Primer, 3rd Edition"
by Khalid A. Mughal and Rolf W. Rasmussen) depicts that when the start method is called on the thread, it returns immediately.
Please explain why the start method returns immediately when invoked, and whether the thread actually starts when start is called. Here, calling the start method doesn't seem to invoke the run method of the class. So when does the thread actually start?
Also explain this: "the call to the start() method is asynchronous."
There are three things that you are missing in your overall analysis.
The call to a thread's start method is sequential, not parallel. It's the call to the run method of the Thread that is concurrent. So if you have 5 statements in the main method that call start, the 5th is not going to be called first. That's the 'happens before' guarantee that the JVM specs give you. However, the run method of the first thread may get called before or after the call to the second start statement. This depends on CPU time slicing rather than on program execution.
When more than 1 thread runs in your program, the order of output is nondeterministic. That's because they run in parallel. You can never be sure that the same program will run in the same order on two machines, or even in two runs on the same machine. In your question you have posted only 1 output. Run the program 20 times, one after another, and compare the output. I am sure 2 or 3 would be entirely different.
Finally, you are basing your analysis on the order of execution of your concurrent code. That's the biggest mistake programmers make. Concurrent programs are never intended to run in a specific order or sequence. Just try to make your Runnable's work an atomic, mutually exclusive task (mutually exclusive to the rest of the program or even to other Runnables) and let it track its own execution. Don't mix Threads together.
You cannot directly enforce which Thread is running when. Once you start it, it's handled on a lower level (usually by the OS), and the results may differ on different machines or even between executions. If you need more control, you need to use some synchronization mechanism.
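As one example of such a synchronization mechanism, here is a minimal sketch using a CountDownLatch, assuming the goal is to make the worker wait until the main thread has done its part. The class name LatchOrdering and the log entries are hypothetical; only the thread name "Counter A" is borrowed from the question.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;

// Hypothetical illustration: a latch forces "main" to be logged before "worker".
class LatchOrdering {

    static List<String> run() {
        List<String> log = new CopyOnWriteArrayList<>();
        CountDownLatch mainDone = new CountDownLatch(1);

        Thread worker = new Thread(() -> {
            try {
                mainDone.await();   // block until main has counted down
                log.add("worker");
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "Counter A");

        worker.start();             // the worker may begin running at any time...
        log.add("main");            // ...but it cannot log before this line
        mainDone.countDown();       // release the worker

        try {
            worker.join();          // wait for the worker to finish
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [main, worker]
    }
}
```

Without the latch, either entry could come first; with it, the order no longer depends on the scheduler.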
The thread isn't started synchronously within the call to start(). It happens later (asynchronously). In other words, just because you called start() doesn't mean the thread has started.
The why and how are all implementation details that may depend on the JVM and/or OS implementation.
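A small sketch may make this more tangible. It records the thread's state before start(), the name of the thread that actually executes run(), and the state after join(). The class name StartIsAsync is hypothetical; the point is only that run() executes on the new thread, not on the caller of start().

```java
// Hypothetical illustration: start() hands the thread to the scheduler;
// run() is executed later, on the new thread.
class StartIsAsync {

    // Returns { state before start(), thread that executed run(), state after join() }.
    static String[] observe() {
        final String[] seen = new String[3];
        Thread worker = new Thread(
                () -> seen[1] = Thread.currentThread().getName(), "Counter A");

        seen[0] = worker.getState().toString(); // NEW: created but not started
        worker.start();                         // returns without waiting for run()
        try {
            worker.join();                      // wait until run() has finished
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        seen[2] = worker.getState().toString(); // TERMINATED after a successful join()
        return seen;
    }

    public static void main(String[] args) {
        String[] seen = observe();
        System.out.println("before start(): " + seen[0]);
        System.out.println("run() executed on: " + seen[1]);
        System.out.println("after join(): " + seen[2]);
    }
}
```

Between start() and join() the state can be anything the scheduler has got to so far, which is exactly why the main thread in the question may print first.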