Make sequential processing asynchronous without using the "parallel" operator - java

I have this piece of a Java method I'm working on right now:
Observable<Map.Entry<String, ConstituentInfo>> obs = Observable.from(constituents.entrySet());
Subscriber<Map.Entry<String, ConstituentInfo>> sub = new Subscriber<Map.Entry<String, ConstituentInfo>>() {
    String securityBySymbol, company, symbol, companyName, marketPlace, countryName, tier, tierId;
    ConstituentInfo constituent;
    Integer compId;

    @Override
    public void onNext(Map.Entry<String, ConstituentInfo> entry) {
        logger.info("blah blah test");
    }

    @Override
    public void onCompleted() {
        logger.info("completed successfully");
    }

    @Override
    public void onError(Throwable throwable) {
        logger.error(throwable.getMessage());
        throwable.printStackTrace();
    }
};
obs.observeOn(Schedulers.io()).subscribe(sub);
The method essentially processes each entry in the map, but it seems to process them sequentially (on the same thread). How would I make this processing asynchronous (i.e. process entries concurrently) without using the "parallel" operator? I tried running the code above and some results are missing (some entries are not processed properly).

As written, your observable emits on the main thread, and the Subscriber methods are called by a single worker supplied by Schedulers.io(), which is why everything appears to run on one thread. Your code does not show any real work being done by the subscriber beyond logging, so there is nothing to do asynchronously here.
Perhaps you meant to do this?
obs.subscribeOn(Schedulers.io()).subscribe(sub);
In terms of parallel processing if your last line was this:
obs.doOnNext(entry -> doWork(entry)).observeOn(Schedulers.io()).subscribe(sub);
Then you could have the doWork bit done asynchronously like this:
int numProcessors = Runtime.getRuntime().availableProcessors();
obs.buffer(Math.max(1, constituents.size() / numProcessors))
    .flatMap(list -> Observable.from(list)
        .doOnNext(entry -> doWork(entry))
        .subscribeOn(Schedulers.computation()))
    .observeOn(Schedulers.io())
    .subscribe(sub);
You need to buffer the work into roughly one batch per processor; otherwise there could be a lot of thread context switching going on.
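The batching idea can also be sketched without RxJava, using a plain ExecutorService. This is only an illustration under assumptions: the class name ChunkedWork, the method processInChunks, and passing doWork as a Function are hypothetical and not part of the question's code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

class ChunkedWork {

    // Splits items into batches of roughly items.size() / workers elements and
    // submits each batch as one task, so a thread handles a whole batch at a time.
    static <T, R> List<R> processInChunks(List<T> items, int workers, Function<T, R> doWork) {
        int chunkSize = Math.max(1, items.size() / workers);
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<List<R>>> pending = new ArrayList<>();
            for (int start = 0; start < items.size(); start += chunkSize) {
                List<T> chunk = items.subList(start, Math.min(items.size(), start + chunkSize));
                pending.add(pool.submit(() -> {
                    List<R> out = new ArrayList<>(chunk.size());
                    for (T item : chunk) {
                        out.add(doWork.apply(item));
                    }
                    return out;
                }));
            }
            // Collect the batch results in submission order.
            List<R> results = new ArrayList<>(items.size());
            for (Future<List<R>> batch : pending) {
                results.addAll(batch.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for a batch", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("doWork failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int workers = Runtime.getRuntime().availableProcessors();
        System.out.println(processInChunks(Arrays.asList(1, 2, 3, 4, 5, 6, 7), workers, n -> n * n));
    }
}
```

Each task processes a whole batch, which is the same reason the RxJava version buffers before calling flatMap.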
I'm not sure why you are missing results. If you have a repeatable test case, paste the code here or report it to RxJava on GitHub as an issue.

Related

How can I wrap black-boxed, asynchronous calls in a synchronous manner?

I am using a proprietary third-party framework in my Android app (EMDK from Zebra, to be specific), and two of its exposed methods, .read() and .cancelRead(), are asynchronous, each taking anywhere from a split second to a whole 5 seconds to complete. I need to be able to spam them without crashing my application and ensure that neither is called twice in a row. How can I go about doing this? I don't have any access to the methods themselves, and a decompiler will only give me runtime stubs.
Edit: I also have no idea when each of these two calls ever completes.
Turning an asynchronous call into a blocking one is the more general form of this problem.
In Java, we can do this with CountDownLatch (as well as Phaser), or LockSupport + Atomic.
For example, if it is required to change an asynchronous call asyncDoSomethingAwesome(param, callback) into a blocking one, we could write a "wrapper" method like this:
ResultType doSomethingAwesome(ParamType param) {
    AtomicReference<ResultType> resultContainer = new AtomicReference<>();
    Thread callingThread = Thread.currentThread();
    asyncDoSomethingAwesome(param, result -> {
        resultContainer.set(result);
        LockSupport.unpark(callingThread);
    });
    ResultType result;
    while ((result = resultContainer.get()) == null) {
        LockSupport.park();
    }
    return result;
}
I think this will be enough to solve your problem. However, when we are writing blocking programs, we usually want a "timeout" to keep the system stable even when an underlying interface is not working properly, for example:
ResultType doSomethingAwesome(ParamType param, Duration timeout) throws TimeoutException {
    AtomicReference<ResultType> resultContainer = new AtomicReference<>();
    Thread callingThread = Thread.currentThread();
    asyncDoSomethingAwesome(param, result -> {
        resultContainer.set(result);
        LockSupport.unpark(callingThread);
    });
    ResultType result;
    long deadline = Instant.now().plus(timeout).toEpochMilli();
    while ((result = resultContainer.get()) == null) {
        if (System.currentTimeMillis() >= deadline) {
            throw new TimeoutException();
        }
        LockSupport.parkUntil(deadline);
    }
    return result;
}
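The CountDownLatch variant mentioned earlier could look roughly like the following. It is a sketch under assumptions: LatchBlocking and callBlocking are hypothetical names, and the BiConsumer parameter stands in for asyncDoSomethingAwesome(param, callback).

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.BiConsumer;
import java.util.function.Consumer;

class LatchBlocking {

    // asyncCall is a hypothetical stand-in for asyncDoSomethingAwesome(param, callback).
    static <P, R> R callBlocking(BiConsumer<P, Consumer<R>> asyncCall, P param, long timeoutMillis)
            throws TimeoutException, InterruptedException {
        AtomicReference<R> resultContainer = new AtomicReference<>();
        CountDownLatch done = new CountDownLatch(1);
        asyncCall.accept(param, result -> {
            resultContainer.set(result);
            done.countDown(); // releases the waiting caller
        });
        // await returns false if the callback did not fire within the timeout.
        if (!done.await(timeoutMillis, TimeUnit.MILLISECONDS)) {
            throw new TimeoutException("no callback within " + timeoutMillis + " ms");
        }
        return resultContainer.get();
    }

    public static void main(String[] args) throws Exception {
        // Fake asynchronous API: delivers its result from another thread.
        BiConsumer<String, Consumer<String>> fakeAsync =
                (text, callback) -> new Thread(() -> callback.accept(text.toUpperCase())).start();
        System.out.println(callBlocking(fakeAsync, "done", 1000));
    }
}
```

Unlike the park loop, the latch does not rely on a non-null result to detect completion, so a callback that delivers null is still treated as finished.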
Sometimes we need finer control over the signaling between threads, especially when writing concurrency libraries. For example, when we need to know whether the blocking thread received the signal from another thread calling LockSupport.unpark, or whether that thread successfully notified the blocking thread, this is usually not easy to implement with the Java standard library. I therefore wrote a library with a more complete mechanism to solve this issue:
https://github.com/wmx16835/experimental_java_common/blob/master/alpha/src/main/java/mingxin/wang/common/concurrent/DisposableBlocker.java
With the support of DisposableBlocker, life will become much easier :)
ResultType doSomethingAwesome(ParamType param, Duration timeout) throws TimeoutException {
    // We can use org.apache.commons.lang3.mutable.MutableObject instead of AtomicReference,
    // because this object will never be accessed concurrently
    MutableObject<ResultType> resultContainer = new MutableObject<>();
    DisposableBlocker blocker = new DisposableBlocker();
    asyncDoSomethingAwesome(param, result -> {
        resultContainer.setValue(result);
        blocker.unblock();
    });
    if (!blocker.blockFor(timeout)) {
        throw new TimeoutException();
    }
    return resultContainer.getValue();
}
I might be off on this, as I'm not 100% sure what you're trying to achieve or how your code is structured, but could you wrap each call in an AsyncTask? Then, in a parent AsyncTask or background thread:
AsyncTask1.execute().get(); //get will block until complete
AsyncTask2.execute().get(); //get will block until complete
This is assuming there is some way of knowing the calls you're making completed.
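One way to picture the "never twice in a row" requirement from the question is a small gate in front of the two calls. This is a sketch under assumptions, not an EMDK API: CallGate, Op, and request are hypothetical names, and each Runnable is assumed to block until its operation has finished (for example through a blocking wrapper).

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class CallGate {

    enum Op { READ, CANCEL_READ }

    // A single worker thread runs accepted calls one at a time, in order.
    private final ExecutorService worker = Executors.newSingleThreadExecutor();
    private final Runnable read;
    private final Runnable cancelRead;
    private Op lastAccepted; // guarded by "this"

    CallGate(Runnable read, Runnable cancelRead) {
        this.read = read;
        this.cancelRead = cancelRead;
    }

    // Returns false when the request repeats the previously accepted one and is dropped.
    synchronized boolean request(Op op) {
        if (op == lastAccepted) {
            return false;
        }
        lastAccepted = op;
        worker.execute(op == Op.READ ? read : cancelRead);
        return true;
    }

    // Stops accepting work and waits for the queued calls to finish.
    boolean shutdownAndWait(long timeoutMillis) {
        worker.shutdown();
        try {
            return worker.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        CallGate gate = new CallGate(
                () -> System.out.println("read"),
                () -> System.out.println("cancelRead"));
        gate.request(Op.READ);
        gate.request(Op.READ);        // dropped: same call twice in a row
        gate.request(Op.CANCEL_READ);
        gate.shutdownAndWait(1000);
    }
}
```

The gate only orders and de-duplicates the invocations; it still depends on some way of knowing when each underlying call has completed.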

Use Reactor Core Java to send to browser the status of a long executing task, periodically

I have been working with Reactor Core for Java because I want to figure out whether this framework can solve a problem I currently have.
At present I have a long-running job that takes about 40-50 minutes to complete. The method looks more or less like this:
public void doLongTask(List<Something> list) {
    // instructions
    for (Something sm : list) {
        if (condition) {
            executeLongOperation();
        }
        // instructions
        if (condition) {
            executeLongOperation();
        }
    }
}
In my controller I have something like this:
@GetMapping(path = "/integersReactor", produces = MediaType.TEXT_EVENT_STREAM_VALUE)
@ResponseBody
public Flux<Integer> getIntegersReactor() {
    logger.debug("Request getIntegersReactor initialized.");
    return simpleSearchService.getIntegersReactor();
}
And in the service layer I have something like this:
@Override
public Flux<Integer> getIntegersReactor() {
    return Flux.range(0, Integer.MAX_VALUE);
}
This is just a placeholder that I am using as a proof of concept. My real intention is to return a Flux of some object that I will define myself; this object will have a few fields that I will use to tell the consumer the status of the job.
Things get somewhat complicated because I would like to send updates as each executeLongOperation() call is executed, and instead of returning a Flux of Integers, return a Flux of an object built from the return value of executeLongOperation().
Can this be accomplished with Flux? How can I leverage Reactor Core to push the return values of every executeLongOperation() call into a reactive stream that can be passed to the controller the same way getIntegersReactor() does in my example?
Yes, it should be possible, but since executeLongOperation is blocking, it will need to be offloaded to a dedicated thread (which reduces the benefits you get from a top-to-bottom reactive implementation).
Change your doLongTask to return a Flux<Foo> and make it concatenate Monos that wrap executeLongOperation on a dedicated thread (or better yet, change executeLongOperation itself to return a Mono<Foo>, doing the wrapping and the subscribeOn to another thread internally). Something like:
public Flux<Foo> doLongTask(List<Something> list) {
    return Flux.fromIterable(list)
        // ensure `Something` are published on a dedicated thread on which
        // we can block
        .publishOn(Schedulers.elastic()) // maybe a dedicated Scheduler?
        // for each `Something`, perform the work
        .flatMap(sm -> {
            // in case condition is false, we'll avoid long running task
            Flux<Foo> work = Flux.empty();
            // start declaring the work depending on conditions
            if (condition) {
                Mono<Foo> op = Mono.fromCallable(this::executeLongOperation);
                work = work.concatWith(op);
            }
            // all other instructions should preferably be non-blocking
            // but since we're on a dedicated thread at this point, it should be ok
            if (condition) {
                Mono<Foo> op = Mono.fromCallable(this::executeLongOperation);
                work = work.concatWith(op);
            }
            // let the flatMap trigger the work
            return work;
        });
}

Connectable Observer: observe termination of all subscribers

I have a connectable observer with multiple subscribers.
Each subscriber computes some business logic. For example, one subscriber stores results in the database on every onNext call; another accumulates its results in memory and writes them to a file when onCompleted is called. I want to know when they have all finished their work, so I can proceed with other things (aggregating with other connectable observers, reading the output data from the database, etc.).
This is how I'm observing termination. It only works because the subscribers execute on the same thread as the observer.
public Observable<Boolean> observeTermination() {
    return Observable.defer(() -> {
        try {
            start();
            return Observable.just(true);
        } catch (RuntimeException e) {
            return Observable.just(false);
        }
    });
}

void start() {
    Observable<List<Foo>> fooBatchReaderObservable = fooBatchReader.createObservable(BATCH_SIZE);
    ConnectableObservable<List<Foo>> connectableObservable = fooBatchReaderObservable.publish();
    subscribers.forEach(s -> connectableObservable.subscribe(s));
    connectableObservable.connect();
}
I don't want the logic in start() to run when observeTermination is called, only when someone subscribes to the returned observable.
Is there a better way to observe termination?
Well, it's all bad. The problem is that I need to call connect on the observable somewhere and also return a boolean result as an indication of termination.
Not a proper answer, but it needs the space to explain properly. It would be much easier if you could deal with Observables instead of Subscribers; that gives you much more flexibility in composing them.
Given that you have components:
Collection<Function<Observable<T>, Observable<?>>> components;
Observable<T> tObs = ... .publish().autoConnect(components.size());
Observable
    .from(components)
    .flatMap(component -> component.apply(tObs))
    .ignoreElements()
    .doOnTerminate(...) // or .defaultIfEmpty(...), or .switchIfEmpty(...)
    .subscribe(...);
In fact, I'd say you should not even subscribe at all here, just create the observable and return it, let it be usable for composition in other parts of your code.
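The same "return something composable that completes when every component is done" idea can be illustrated with the standard library's CompletableFuture, swapped in here for Observable so the sketch runs without RxJava. AllDone and runAll are hypothetical names, and each component is assumed to return a future that completes when its own work has finished.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

class AllDone {

    // Hands the batch to every component and returns a future that yields true
    // when all of them finished normally, or false when any of them failed.
    static <T> CompletableFuture<Boolean> runAll(
            List<T> batch, List<Function<List<T>, CompletableFuture<Void>>> components) {
        CompletableFuture<?>[] running = components.stream()
                .map(component -> component.apply(batch))
                .toArray(CompletableFuture[]::new);
        return CompletableFuture.allOf(running)
                .handle((ignored, error) -> error == null);
    }

    public static void main(String[] args) {
        List<Function<List<Integer>, CompletableFuture<Void>>> components = new ArrayList<>();
        components.add(batch -> CompletableFuture.runAsync(
                () -> System.out.println("stored " + batch)));
        components.add(batch -> CompletableFuture.runAsync(
                () -> System.out.println("summed " + batch.size() + " items")));
        // The caller decides when (and whether) to block on the combined result.
        System.out.println("all finished: " + runAll(Arrays.asList(1, 2, 3), components).join());
    }
}
```

As with returning the Observable instead of subscribing, runAll does not block; the result can be composed further before anything waits on it.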

RxJava doesn't work in Scheduler.io() thread

The problem is this: I have an Observable and a Subscriber. I try to run the Observable on an .io() thread because it works with files and zip archivers (I won't show the code, it is too large), but the Observable does nothing:
Observable<Double> creatingObservable = getCreatingObservable(image);
Subscriber<Double> creatingSubscriber = getCreatingSubscriber();
creatingObservable
.subscribeOn(Schedulers.io())
.subscribe(creatingSubscriber);
If I run the code without subscribeOn(), everything works. What is the problem and how do I solve it?
P.S. System.out.println() doesn't work either. The problem occurs with all of the Schedulers' threads.
It seems the problem is that the main thread terminated before creatingObservable could emit any values.
The simple solution: make the main thread wait long enough to enable creatingObservable to emit/complete.
Observable<Double> creatingObservable = getCreatingObservable(image);
Subscriber<Double> creatingSubscriber = getCreatingSubscriber();
creatingObservable
.subscribeOn(Schedulers.io())
.subscribe(creatingSubscriber);
Thread.sleep(5000); //to wait 5 seconds while creatingObservable is running on IO thread
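Instead of sleeping for a fixed time, the main thread can wait for an explicit completion signal. The following is a plain-Java sketch of that idea, not RxJava code: WaitForCompletion and runOnDaemonAndWait are hypothetical names, and the daemon thread stands in for a scheduler worker. With a Subscriber, the countDown() call would go into onCompleted and onError.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

class WaitForCompletion {

    // Runs work on a daemon thread and blocks the caller until it finishes
    // or the timeout expires. Returns true if the work finished in time.
    static boolean runOnDaemonAndWait(Runnable work, long timeoutMillis) {
        CountDownLatch finished = new CountDownLatch(1);
        Thread background = new Thread(() -> {
            try {
                work.run();
            } finally {
                finished.countDown(); // signal completion even if work throws
            }
        });
        // A daemon thread does not keep the JVM alive once main returns,
        // which is why unawaited background work can appear to do nothing.
        background.setDaemon(true);
        background.start();
        try {
            return finished.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        boolean finished = runOnDaemonAndWait(
                () -> System.out.println("working on " + Thread.currentThread().getName()), 5000);
        System.out.println("finished: " + finished);
    }
}
```

The wait ends as soon as the work is done, so the timeout is only an upper bound rather than a fixed delay.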
Try this one:
Subscriber<Double> creatingSubscriber = getCreatingSubscriber();
Observable.defer(new Func0<Observable<Double>>() {
        @Override
        public Observable<Double> call() {
            return getCreatingObservable(image);
        }
    })
    .subscribeOn(Schedulers.io())
    .observeOn(AndroidSchedulers.mainThread())
    .subscribe(creatingSubscriber);
Don't forget to add:
compile 'io.reactivex:rxandroid:1.2.1'
From here: https://github.com/ReactiveX/RxAndroid
Explanation
getCreatingObservable(image): most probably it uses some operators that do the 'hard' work at the moment of the call.
For example:
Observable.just(doSomeStuff())
.subscribeOn(...)
.observeOn(...)
So, the execution order will be:
1). Calculate doSomeStuff()
2). Pass the result to Observable.just()
3). Only after that are the schedulers applied
In other words, you are doing the 'hard' work first and only then applying the schedulers.
That's why you need to use Observable.defer().
For more explanation, please read this article by Dan Lew (section "Old, Slow Code"):
http://blog.danlew.net/2014/10/08/grokking-rxjava-part-4/
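The eager-versus-deferred difference described here is ordinary Java argument evaluation, so it can be shown with the standard library alone. In this sketch CompletableFuture is swapped in for Observable; EagerVsDeferred, newWorker, and the body of doSomeStuff are hypothetical and exist only for illustration.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class EagerVsDeferred {

    // Hypothetical stand-in for doSomeStuff(): reports the thread that ran it.
    static String doSomeStuff() {
        return Thread.currentThread().getName();
    }

    static ExecutorService newWorker() {
        return Executors.newSingleThreadExecutor(r -> new Thread(r, "worker"));
    }

    // Like Observable.just(doSomeStuff()): the argument is evaluated on the
    // calling thread before the future exists; the executor only affects the
    // pass-through stage that follows.
    static String eager(ExecutorService worker) {
        return CompletableFuture.completedFuture(doSomeStuff())
                .thenApplyAsync(value -> value, worker)
                .join();
    }

    // Like Observable.defer(...): the work is wrapped in a lambda, so it only
    // runs once the executor picks it up.
    static String deferred(ExecutorService worker) {
        return CompletableFuture.supplyAsync(() -> doSomeStuff(), worker).join();
    }

    public static void main(String[] args) {
        ExecutorService worker = newWorker();
        try {
            System.out.println("eager ran on:    " + eager(worker));
            System.out.println("deferred ran on: " + deferred(worker));
        } finally {
            worker.shutdown();
        }
    }
}
```

The eager variant reports the caller's thread and the deferred variant reports the worker thread, which mirrors why the 'hard' work has to be wrapped before the schedulers can move it.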
In this case your app creates the observable just once. You may try using
Observable.defer(() -> creatingObservable), so that the defer operator forces the observable to be created on every subscription.
Observable.defer(new Func0<Observable<Double>>() {
        @Override
        public Observable<Double> call() {
            return getCreatingObservable(image);
        }
    })
    .subscribeOn(Schedulers.io())
    .observeOn(Schedulers.io())
    .subscribe(getCreatingSubscriber());

Is unsubscribe thread safe in RxJava?

Suppose I have the following RxJava code (which accesses a DB, but the exact use case is irrelevant):
public Observable<List<DbPlaceDto>> getPlaceByStringId(final List<String> stringIds) {
    return Observable.create(new Observable.OnSubscribe<List<DbPlaceDto>>() {
        @Override
        public void call(Subscriber<? super List<DbPlaceDto>> subscriber) {
            try {
                Cursor c = getPlacseDb(stringIds);
                List<DbPlaceDto> dbPlaceDtoList = new ArrayList<>();
                while (c.moveToNext()) {
                    dbPlaceDtoList.add(getDbPlaceDto(c));
                }
                c.close();
                if (!subscriber.isUnsubscribed()) {
                    subscriber.onNext(dbPlaceDtoList);
                    subscriber.onCompleted();
                }
            } catch (Exception e) {
                if (!subscriber.isUnsubscribed()) {
                    subscriber.onError(e);
                }
            }
        }
    });
}
Given this code, I have the following questions:
If someone unsubscribes from the observable returned from this method (after a previous subscription), is that operation thread-safe? So are my 'isUnsubscribed()' checks correct in this sense, regardless of scheduling?
Is there a cleaner way with less boilerplate code to check for unsubscribed states than what I'm using here? I couldn't find anything in the framework. I thought SafeSubscriber solves the issue of not forwarding events when the subscriber is unsubscribed, but apparently it does not.
is that operation thread-safe?
Yes. You are receiving an rx.Subscriber which (eventually) checks against a volatile boolean that is set to true when the subscriber's subscription is unsubscribed.
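The volatile-flag mechanism can be pictured with a stripped-down stand-in. This is not RxJava's implementation, only a sketch of the idea: UnsubscribeFlag, Subscription, and emitRange are hypothetical names.

```java
import java.util.function.IntConsumer;

class UnsubscribeFlag {

    // Minimal subscription: a volatile flag, so a write from any thread
    // becomes visible to the emitting thread on its next check.
    static final class Subscription {
        private volatile boolean unsubscribed;

        void unsubscribe() {
            unsubscribed = true;
        }

        boolean isUnsubscribed() {
            return unsubscribed;
        }
    }

    // Emits 1..count, checking the flag before every emission.
    // Returns how many values were actually delivered.
    static int emitRange(int count, Subscription subscription, IntConsumer onNext) {
        int emitted = 0;
        for (int i = 1; i <= count && !subscription.isUnsubscribed(); i++) {
            onNext.accept(i);
            emitted++;
        }
        return emitted;
    }

    public static void main(String[] args) {
        Subscription subscription = new Subscription();
        int emitted = emitRange(10, subscription, value -> {
            System.out.println("onNext " + value);
            if (value == 3) {
                subscription.unsubscribe(); // could equally come from another thread
            }
        });
        System.out.println("emitted " + emitted);
    }
}
```

Here the loop stops after the third value because the flag is checked before each emission. An unsubscribe arriving from another thread is picked up at the next check, so a value already being delivered can still get through.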
cleaner way with less boilerplate code to check for unsubscribed states
The SyncOnSubscribe and the AsyncOnSubscribe (available as an @Experimental API as of release 1.0.15) were created for this use case. They function as a safe alternative to calling Observable.create. Here is a (contrived) example of the synchronous case.
public static class FooState {
    public Integer next() {
        return 1;
    }

    public void shutdown() {
    }

    public FooState nextState() {
        return new FooState();
    }
}

public static void main(String[] args) {
    OnSubscribe<Integer> sos = SyncOnSubscribe.createStateful(FooState::new,
        (state, o) -> {
            o.onNext(state.next());
            return state.nextState();
        },
        state -> state.shutdown());
    Observable<Integer> obs = Observable.create(sos);
}
Note that the SyncOnSubscribe next function is not allowed to call observer.onNext more than once per iteration, nor can it call into that observer concurrently. The SyncOnSubscribe implementation and tests are on the head of the 1.x branch. Its primary use is to simplify writing observables that iterate over or parse data synchronously and call onNext downstream, while doing so in a framework that supports back-pressure and checks for unsubscription. Essentially, you create a next function that is invoked every time the downstream operators need a new data element. Your next function can call onNext either 0 or 1 times.
The AsyncOnSubscribe is designed to play nicely with back-pressure for observable sources that operate asynchronously (such as off-box calls). The arguments to your next function include the request count, and the observable you return should emit data up to that requested amount. An example of this behavior would be paginated queries from an external data source.
Previously it was a safe practice to transform your OnSubscribe to an Iterable and use Observable.from(Iterable). This implementation gets an iterator and checks subscriber.isUnsubscribed() for you.
