I created a Subject instance in RxJava and call its onNext() from multiple threads:
PublishSubject<String> subject = PublishSubject.create();
//...
subject.onNext("A"); //thread A
subject.onNext("B"); //thread B
The RxJava documentation says that:
take care not to call its onNext( ) method (or its other on methods) from multiple threads, as this could lead to non-serialized calls, which violates the Observable contract and creates an ambiguity in the resulting Subject.
Do I have to call toSerialized() on such Subject assuming I don't care if "A" goes before or after "B"? How would serialization help?
Is Subject thread-safe anyway or will I break RxJava without toSerialized()?
What is the "Observable contract" that the documentation mentions?
Do I have to call toSerialized() on such Subject assuming I don't care if "A" goes before or after "B"?
Yes, use toSerialized(), because all operators applied to the subject assume that proper serialization is occurring upstream. The stream may fail or produce unexpected results if this does not happen.
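To make "serialized" concrete, here is a minimal JDK-only sketch. It is not RxJava code, and RxJava's own implementation differs; the names UnsafeCollector, serialize and emitFromThreads are made up for illustration. The downstream consumer is not thread-safe, just as most operators assume, and the wrapper only guarantees that one thread at a time is inside accept(). The order of "A" and "B" stays undefined, but no emission is lost or corrupted.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

final class SerializedEmitDemo {

    /** Hypothetical downstream that assumes it is never called concurrently. */
    static final class UnsafeCollector implements Consumer<String> {
        final List<String> seen = new ArrayList<>(); // ArrayList is not thread-safe

        @Override
        public void accept(String value) {
            seen.add(value);
        }
    }

    /** Wraps a consumer so that only one thread at a time is inside accept(). */
    static <T> Consumer<T> serialize(Consumer<T> downstream) {
        Object lock = new Object();
        return value -> {
            synchronized (lock) {
                downstream.accept(value);
            }
        };
    }

    /** Emits perThread values from each of threadCount threads; returns how many arrived. */
    static int emitFromThreads(int threadCount, int perThread) {
        UnsafeCollector collector = new UnsafeCollector();
        Consumer<String> onNext = serialize(collector);
        List<Thread> threads = new ArrayList<>();
        for (int t = 0; t < threadCount; t++) {
            String name = "thread-" + t;
            Thread thread = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    onNext.accept(name + ":" + i);
                }
            });
            threads.add(thread);
            thread.start();
        }
        for (Thread thread : threads) {
            try {
                thread.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return collector.seen.size();
    }

    public static void main(String[] args) {
        // With the serializing wrapper every emission is delivered exactly once.
        System.out.println(emitFromThreads(4, 10_000)); // prints 40000
    }
}
```

Passing the collector directly instead of serialize(collector) removes the guarantee: the list may then lose elements or throw, which is the kind of "unexpected result" the answer refers to.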
Is Subject thread-safe anyway or will I break RxJava without toSerialized()?
answered above
What is the "Observable contract" that the documentation mentions?
Rx Design Guidelines.pdf section 4 defines the Observable contract:
4.2. Assume observer instances are called in a serialized fashion
As Rx uses a push model and .NET supports multithreading, it is possible for different messages to arrive different execution contexts at the same time. If consumers of observable sequences would have to deal with this in every place, their code would need to perform a lot of housekeeping to avoid common concurrency problems. Code written in this fashion would be harder to maintain and potentially suffer from performance issues.
I think RxJava documentation should make this more discoverable so I'll raise an issue.
According to Dave's answer, if you know beforehand that your Subject is going to be accessed from different threads, you could wrap it into a SerializedSubject
http://reactivex.io/RxJava/javadoc/rx/subjects/SerializedSubject.html
Wraps a Subject so that it is safe to call its various on methods from different threads.
like:
private final Subject<Object, Object> bus = new SerializedSubject<Object, Object>(PublishSubject.create());
(taken from Ben Christensen's EventBus example here: http://bl.ocks.org/benjchristensen/04eef9ca0851f3a5d7bf )
Using async/await it is possible to code asynchronous functions in an imperative style. This can greatly facilitate asynchronous programming. After it was first introduced in C#, it was adopted by many languages such as JavaScript, Python, and Kotlin.
EA Async is a library that adds async/await like functionality to Java. The library abstracts away the complexity of working with CompletableFutures.
But why has async/await neither been added to Java SE, nor are there any plans to add it in the future?
The short answer is that the designers of Java try to eliminate the need for asynchronous methods instead of facilitating their use.
According to Ron Pressler's talk, asynchronous programming using CompletableFuture causes three main problems:
1. branching or looping over the results of asynchronous method calls is not possible
2. stack traces cannot be used to identify the source of errors, and profiling becomes impossible
3. it is viral: all methods that make asynchronous calls have to be asynchronous as well, i.e. the synchronous and asynchronous worlds don't mix
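The first problem can be illustrated with a small sketch. The fetchPage method and its data are hypothetical. What would be a plain while loop in blocking code ("keep fetching pages until one is empty") cannot be written as a loop over the asynchronous results; it has to be turned into a recursive chain of thenCompose calls.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

final class AsyncPagingDemo {

    /** Hypothetical asynchronous call: pages 0..2 hold two items each, later pages are empty. */
    static CompletableFuture<List<String>> fetchPage(int page) {
        return CompletableFuture.supplyAsync(() -> {
            List<String> items = new ArrayList<>();
            if (page < 3) {
                items.add("item-" + page + "-a");
                items.add("item-" + page + "-b");
            }
            return items;
        });
    }

    /**
     * Counts items across all pages without blocking a thread.
     * The "loop" is expressed as recursion inside thenCompose.
     */
    static CompletableFuture<Integer> countItems(int page, int countSoFar) {
        return fetchPage(page).thenCompose(items -> {
            if (items.isEmpty()) {
                return CompletableFuture.completedFuture(countSoFar);
            }
            return countItems(page + 1, countSoFar + items.size());
        });
    }

    public static void main(String[] args) {
        // join() blocks here only to print the result of the demo.
        System.out.println(countItems(0, 0).join()); // prints 6
    }
}
```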
While async/await solves the first problem, it can only partially solve the second problem and does not solve the third problem at all (e.g. all methods in C# doing an await have to be marked as async).
But why is asynchronous programming needed at all? Only to prevent the blocking of threads, because threads are expensive. Thus, instead of introducing async/await in Java, the Java designers are working in Project Loom on virtual threads (aka fibers/lightweight threads), which aim to significantly reduce the cost of threads and thus eliminate the need for asynchronous programming. This would also make all three problems above obsolete.
Better late than never!!!
Java is 10+ years late in trying to come up with lighter-weight units of execution that can be executed in parallel. As a side note, Project Loom also aims to expose 'delimited continuations' in Java, which I believe are nothing more than the good old 'yield' keyword of C# (again, almost 20 years late!!)
Java does recognize the need to solve the bigger problem solved by async/await (or actually by Tasks in C#, which are the big idea; async/await is more of a syntactic sugar: a highly significant improvement, but still not a necessity for solving the actual problem of OS-mapped threads being heavier than desired).
Look at the proposal for project loom here: https://cr.openjdk.java.net/~rpressler/loom/Loom-Proposal.html
and navigate to last section 'Other Approaches'. You will see why Java does not want to introduce async/await.
Having said this, I don't really agree with the reasoning being provided. Neither in this proposal nor in Stephan's answer.
First let us diagnose Stephan's answer, point by point:
1. async/await solves point 1 mentioned there. (Stephan also acknowledges this further down in his answer.)
2. It is extra work for sure on the part of the framework and tools, but not at all on the part of the programmers. Even with async/await, the .NET debuggers are pretty good in this respect.
3. This I only partially agree with. The whole purpose of async/await is to elegantly mix the asynchronous world with synchronous constructs. But yes, you either need to declare the caller as async too or deal directly with Task in the caller routine. However, Project Loom will not solve this in a meaningful way either. To fully benefit from the lightweight virtual threads, even the caller routine must be executing on a virtual thread. Otherwise what's the benefit? You will end up blocking an OS-backed thread!!! Hence even virtual threads need to be 'viral' in the code. On the contrary, it will be easier in Java to not notice that the routine you are calling is async and will block the calling thread (which is a concern if the calling routine is not itself executing on a virtual thread). The async keyword in C# makes the intent very clear and forces you to decide (it is possible in C# to block as well, if you want, by asking for Task.Result; most of the time the calling routine can just as easily be async itself).
Stephan is right when he says async programming is needed to prevent blocking of (OS) threads, as (OS) threads are expensive. And that's precisely the reason why virtual threads (or C# tasks) are needed. You should be able to 'block' on these tasks without losing any sleep. Of course, to not lose sleep, either the calling routine itself should be a task, or the blocking should be on non-blocking IO, with the framework being smart enough to not block the calling thread in that case (the power of continuations).
C# supports this and proposed Java feature aims to support this.
According to the proposed Java api, blocking on virtual thread will require calling vThread.join() method in Java.
How is it really more beneficial than calling await workDoneByVThread()?
Now let us look at project loom proposal reasoning
Continuations and fibers dominate async/await in the sense that async/await is easily implemented with continuations (in fact, it can be implemented with a weak form of delimited continuations known as stackless continuations, that don't capture an entire call-stack but only the local context of a single subroutine), but not vice-versa
I simply don't understand this statement. If someone does, please let me know in the comments.
For me, async/await is implemented using continuations, and as far as stack traces are concerned, since the fibers/virtual threads/tasks live within the virtual machine, it must be possible to manage that aspect. In fact, the .NET tools do manage that.
While async/await makes code simpler and gives it the appearance of normal, sequential code, like asynchronous code it still requires significant changes to existing code, explicit support in libraries, and does not interoperate well with synchronous code
I have already covered this. Making no significant changes to existing code and having no explicit support in libraries will actually mean not using this feature effectively. Unless Java is aiming to transparently transform all threads into virtual threads, which it can't and isn't, this statement does not make sense to me.
As a core idea, I find no real difference between Java virtual threads and C# tasks, to the point that Project Loom is also aiming for a work-stealing scheduler as the default, same as the scheduler used by .NET by default (https://learn.microsoft.com/en-us/dotnet/api/system.threading.tasks.taskscheduler?view=net-5.0, scroll to the last remarks section).
The only debate, it seems, is over what syntax should be adopted to consume these.
C# adopted:
A distinct class and interface, as compared to existing threads
Very helpful syntactic sugar for marrying async with sync
Java is aiming for:
The same familiar interface of Java Thread
No special constructs apart from try-with-resources support for ExecutorService, so that the results of submitted tasks/virtual threads can be waited for automatically (thus blocking the calling thread, virtual or non-virtual).
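For reference, this is roughly what the Java side looks like in the API that eventually shipped. It is a sketch that assumes a JDK 21 or newer runtime, where Executors.newVirtualThreadPerTaskExecutor() exists and ExecutorService is AutoCloseable. The blockingLookup method is a hypothetical stand-in for a blocking call.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class VirtualThreadDemo {

    /** Hypothetical blocking call, e.g. a database query. */
    static int blockingLookup(int id) {
        try {
            Thread.sleep(10); // parks the virtual thread; the carrier OS thread is freed
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return id * 2;
    }

    /** Runs one virtual thread per lookup and sums the results. Requires JDK 21+. */
    static int sumOfLookups(int count) {
        List<Future<Integer>> results = new ArrayList<>();
        // Leaving the try block calls close(), which waits for all submitted tasks.
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 1; i <= count; i++) {
                int id = i;
                Callable<Integer> task = () -> blockingLookup(id);
                results.add(executor.submit(task));
            }
        }
        int sum = 0;
        for (Future<Integer> result : results) {
            try {
                sum += result.get(); // already completed at this point
            } catch (InterruptedException | ExecutionException e) {
                throw new IllegalStateException(e);
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumOfLookups(100)); // prints 10100
    }
}
```

Note that the thread calling sumOfLookups is itself blocked while the try block closes, which is the point made above about the caller also needing to run on a virtual thread.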
IMHO, Java's choices are worse than those of C#. Having a separate interface and class actually makes it very clear that the behavior is a lot different. Retaining the same old interface can lead to subtle bugs when a programmer does not realize that she is now dealing with something different, or when a library implementation changes to take advantage of the new constructs but ends up blocking the calling (non-virtual) thread.
Also, no special language syntax means that async code will remain difficult to read and reason about. (I don't know why Java thinks programmers are in love with Java's Thread syntax and will be thrilled to learn that, instead of writing sync-looking code, they will be using the lovely Thread class.)
Heck, even JavaScript now has async/await (with all its 'single-threadedness').
I have released a new project, JAsync, which implements async/await in Java and uses Reactor as its low-level framework. It is in the alpha stage, and I need more suggestions and test cases.
This project makes the developer's asynchronous programming experience as close as possible to the usual synchronous programming, including both coding and debugging.
I think my project solves point 1 mentioned by Stephan.
Here is an example:
@RestController
@RequestMapping("/employees")
public class MyRestController {
    @Inject
    private EmployeeRepository employeeRepository;
    @Inject
    private SalaryRepository salaryRepository;

    // The standard JAsync async method must be annotated with the Async annotation, and return a JPromise object.
    @Async()
    private JPromise<Double> _getEmployeeTotalSalaryByDepartment(String department) {
        double money = 0.0;
        // A Mono object can be transformed to the JPromise object. So we get a Mono object first.
        Mono<List<Employee>> empsMono = employeeRepository.findEmployeeByDepartment(department);
        // Transformed the Mono object to the JPromise object.
        JPromise<List<Employee>> empsPromise = Promises.from(empsMono);
        // Use await just like es and c# to get the value of the JPromise without blocking the current thread.
        for (Employee employee : empsPromise.await()) {
            // The method findSalaryByEmployee also return a Mono object. We transform it to the JPromise just like above. And then await to get the result.
            Salary salary = Promises.from(salaryRepository.findSalaryByEmployee(employee.id)).await();
            money += salary.total;
        }
        // The async method must return a JPromise object, so we use just method to wrap the result to a JPromise.
        return JAsync.just(money);
    }

    // This is a normal webflux method.
    @GetMapping("/{department}/salary")
    public Mono<Double> getEmployeeTotalSalaryByDepartment(@PathVariable String department) {
        // Use unwrap method to transform the JPromise object back to the Mono object.
        return _getEmployeeTotalSalaryByDepartment(department).unwrap(Mono.class);
    }
}
In addition to coding, JAsync also greatly improves the debugging experience of async code.
When debugging, you can see all variables in the monitor window just like when debugging normal code. I will try my best to solve point 2 mentioned by Stephan.
For point 3, I don't think it is a big problem. Async/await is popular in C# and ES (JavaScript) even though it does not solve it.
In the book "Java 8 in Action" (by Urma, Fusco and Mycroft) they highlight that parallel streams internally use the common fork-join pool and that, whilst this can be configured globally, e.g. using System.setProperty(...), it is not possible to specify a value for a single parallel stream.
I have since seen the workaround that involves running the parallel stream inside a custom made ForkJoinPool.
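A minimal sketch of that workaround follows. It relies on the observed behaviour that a parallel stream started from inside a ForkJoinPool task runs its work in that pool; as far as I know this is an implementation detail rather than a documented guarantee. The class and method names are made up.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ForkJoinPool;
import java.util.stream.IntStream;

final class CustomPoolStreamDemo {

    /** Sums the squares of 1..upTo with a parallel stream run inside a dedicated pool. */
    static long sumOfSquares(int upTo, int parallelism) {
        ForkJoinPool pool = new ForkJoinPool(parallelism);
        try {
            Callable<Long> task = () -> IntStream.rangeClosed(1, upTo)
                    .parallel()
                    .mapToLong(i -> (long) i * i)
                    .sum();
            // The stream is started from a thread of the custom pool, not the common pool.
            return pool.submit(task).get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(100, 4)); // prints 338350
    }
}
```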
Later on in the book, they have an entire chapter dedicated to CompletableFuture, during which they have a case study where they compare the respective performance of using a parallelStream vs. a CompletableFuture. It turns out their performance is very similar; they highlight the reason for this as being that both use the same common pool by default (and therefore the same number of threads).
They go on to show a solution and argue that the CompletableFuture is better in this circumstance, as it can be configured to use a custom Executor with a thread pool size of the user's choice. When they update the solution to utilise this, the performance is significantly improved.
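The CompletableFuture variant described here could look roughly like the sketch below. This is not the book's code: slowPrice, the shop names and the pool-size cap of 100 are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.stream.Collectors;

final class CustomExecutorDemo {

    /** Hypothetical slow, blocking call (e.g. a remote price lookup). */
    static String slowPrice(String shop) {
        try {
            Thread.sleep(50);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return shop + ":" + shop.length();
    }

    /** Queries all shops concurrently on an executor sized for the workload. */
    static List<String> fetchPrices(List<String> shops) {
        // One thread per shop, capped at 100 (an assumed upper bound); at least one thread.
        int poolSize = Math.max(1, Math.min(shops.size(), 100));
        ExecutorService executor = Executors.newFixedThreadPool(poolSize);
        try {
            // First start every lookup, then wait for them; joining inside the
            // first stream would make the lookups run one after another.
            List<CompletableFuture<String>> futures = shops.stream()
                    .map(shop -> CompletableFuture.supplyAsync(() -> slowPrice(shop), executor))
                    .collect(Collectors.toList());
            return futures.stream()
                    .map(CompletableFuture::join)
                    .collect(Collectors.toList());
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(fetchPrices(Arrays.asList("alpha", "beta", "gamma")));
    }
}
```

Because the lookups mostly wait rather than compute, a pool larger than the number of CPU cores pays off here, which is what the common pool's default size does not offer.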
This made me think: if one were to do the same for the parallel stream version using the workaround highlighted above, would the performance benefits be similar, and would the two approaches therefore become similar again in terms of performance? In this case, why would one choose the CompletableFuture over the parallel stream when it clearly takes more work on the developer's part?
In this case, why would one choose the CompletableFuture over the parallel stream when it clearly takes more work on the developer's part.
IMHO This depends on the interface you are looking to support. If you are looking to support an asynchronous API e.g.
CompletableFuture<String> downloadHttp(URL url);
In this case, only a completable future makes sense because you may want to do something else unrelated while you wait for the data to come down.
On the other hand, parallelStream() is best for CPU-bound tasks where you want every task to perform a portion of some work, i.e. every thread is doing the same thing with different data. As you mention, it is also easier to use.
I'm looking for the Java/Akka equivalent of Python's yield from or gevent's monkey patch.
Update
There has been some confusion in the comments about what this question is asking, so let me restate the question:
If I have a future, how do I wait for the future to complete without blocking the thread AND without returning to the caller until the future is complete?
Let's say we have a method that blocks:
public Object foo() {
Object result = someBlockingCall();
return doSomethingWithResult(result);
}
To make this asynchronous, we would replace someBlockingCall() with an asynchronous call that takes a callback:
public void foo() {
someAsyncCall(new Handler() {
public void onSuccess(Object result) {
Object message = doSomethingWithResult(result);
callerRef.tell(message, ActorRef.noSender());
}
});
}
The call to foo() now returns before the result is ready, so the caller no longer gets the result. We have to get the result back to the caller by passing a message. To convert synchronous code to asynchronous Akka code, a redesign of the caller is required.
What I'd like is async code that looks like synchronous code like Python's Gevent.
I want to write:
public Object foo() {
Future future = someAsyncCall();
// save stack frame, go back to the event loop and relinquish CPU
// so other events can use the thread,
// and come back when future is complete and restore stack frame
return yield(future);
}
This would allow me to make my synchronous code asynchronous without a redesign.
Is this possible?
Note:
The Play framework seems to fake this with async() and AsyncResult. But this won't work in general since I have to write the code that handles the AsyncResult which would look like the above callback handler.
I think trying to get back a more straightforward sync design, although an efficient one, is actually a good intention and a good idea (see for example here).
Quasar has facilities to obtain sync/blocking APIs that are still highly efficient from async APIs (see this blog post), which looks exactly what you're looking for.
The fundamental problem is not that the sync/blocking style itself is bad (actually async and sync are dual styles and can be transformed into one another, see for example here), but rather that blocking Java's heavyweight threads is not efficient. It is not an abstraction problem but an implementation problem, so instead of giving up the easier thread abstraction only because the implementation is inefficient, I agree it is better for the future of your code to look for more efficient thread implementations.
As Roland hinted, Quasar adds lightweight threads or fibers to the JVM, so you can get the same performance of async frameworks without giving up the thread abstraction and regular imperative control flow constructs (sequence, loops etc.) available in the language.
It also unifies JVM/JDK's threads and its fibers under a common strand interface, so they can interoperate seamlessly, and provides a porting of java.util.concurrent to this unified concept.
On top of strands (either fibers or regular threads) Quasar also offers fully-fledged Erlang-style actors, blocking Go-like channels and dataflow programming, so you can choose the concurrent programming paradigm that suits best your skills and needs without being forced into one.
It also provides bindings for popular and standard technologies (as part of the Comsat project), so you can preserve your code assets because the porting effort will be minimal (if any). For the same reason you can also opt-out easily, should you choose to.
Currently Quasar has bindings for Java 7 and 8, for Clojure under the Pulsar project, and soon for JetBrains' Kotlin. Being based on JVM bytecode instrumentation, Quasar can really work with any JVM language if an integration module is present, and it offers tools to build additional ones.
The answer to your question is “no”, and that is very much by design. Writing a method to be asynchronous means returning the Future as its result; the method itself will not perform the computation but will arrange for the result to be provided later. You can then pass this Future to the right place where it is further used, for example by transforming it using one of the many combinators (like map, recover, etc.).
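As a rough Java illustration of this style, the sketch below uses the JDK's CompletableFuture as a stand-in for Akka/Scala futures, with thenApply playing the role of map and exceptionally the role of recover. The loadName and describeUser methods are hypothetical.

```java
import java.util.concurrent.CompletableFuture;

final class FutureCombinatorDemo {

    /** Hypothetical asynchronous lookup that fails for negative ids. */
    static CompletableFuture<String> loadName(int id) {
        return CompletableFuture.supplyAsync(() -> {
            if (id < 0) {
                throw new IllegalArgumentException("negative id");
            }
            return "name-" + id;
        });
    }

    /** Returns the future instead of awaiting it; the transformations run when the value arrives. */
    static CompletableFuture<String> describeUser(int id) {
        return loadName(id)
                .thenApply(name -> "user " + name)       // comparable to map
                .exceptionally(error -> "unknown user"); // comparable to recover
    }

    public static void main(String[] args) {
        // join() is used only at the very edge of the program to print the results.
        System.out.println(describeUser(7).join());  // prints user name-7
        System.out.println(describeUser(-1).join()); // prints unknown user
    }
}
```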
Awaiting a strict result for the Future will have to block the current thread of execution, no matter which technology you use. With plain JVM threads you will block a real thread from the operating system, with Quasar you will block your Fiber, with Akka you will block your Actor (*); blocking means blocking, no way around that.
(*) In an Actor you would get the result via a message at a later point, and until that point you will have to switch the behavior such that new incoming messages are stashed, rejected or dropped, depending on your use-case.
I have a class currently called Promise that works as follows:
It holds a future value
It can always accept a subsequent action to take that uses the future value as the parameter
When the value is completed the function queue launches
Any functions added after the future is complete happen synchronously
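The four behaviours listed above can be sketched in a few lines of plain Java. This is an illustrative reconstruction, not the actual class from the question; the names SimplePromise, then and complete are assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

final class SimplePromise<T> {
    private final List<Consumer<? super T>> pending = new ArrayList<>();
    private boolean completed;
    private T value;

    /** Registers an action on the future value; runs it at once if the value is already there. */
    void then(Consumer<? super T> action) {
        synchronized (this) {
            if (!completed) {
                pending.add(action); // queued until complete() is called
                return;
            }
        }
        action.accept(value); // already completed: runs synchronously on the caller's thread
    }

    /** Supplies the value and launches the queued actions in registration order. */
    void complete(T newValue) {
        List<Consumer<? super T>> toRun;
        synchronized (this) {
            if (completed) {
                throw new IllegalStateException("already completed");
            }
            value = newValue;
            completed = true;
            toRun = new ArrayList<>(pending);
            pending.clear();
        }
        // Run outside the lock so that actions may register further actions.
        for (Consumer<? super T> action : toRun) {
            action.accept(newValue);
        }
    }

    public static void main(String[] args) {
        SimplePromise<String> promise = new SimplePromise<>();
        List<String> log = new ArrayList<>();
        promise.then(v -> log.add("queued saw " + v));
        promise.complete("42");
        promise.then(v -> log.add("late saw " + v));
        System.out.println(log); // prints [queued saw 42, late saw 42]
    }
}
```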
So this seems to be a design pattern from functional programming that we're jamming into Java. The important thing is that we can daisy-chain on delayed events, which I understand is a feature more built into the C# 3.0 language but which you have to hack together with Java classes. Unfortunately, one, I don't know a better name for this than "promise" or "future," which seem misleading since the focus is more on the "DelayedCallStack" than the value at hand, and two, I don't know of any way to do this beyond writing our own fairly complicated Promise class. Ideally I'd like to lift this from the functional Java library but the concept eludes me thus far.
Note Java doesn't even give language/library support for an asynchronous callback that takes a parameter, which is one reason I'm so pessimistic about being able to find this.
So, what is this pattern, can it be done in libraries?
Take a look at ListenableFuture in Guava:
http://code.google.com/p/guava-libraries/wiki/ListenableFutureExplained
ListenableFuture allows you to add callbacks to be executed when the Future computation is completed. You can control what thread pool the callbacks get executed under, so they can be executed synchronously or asynchronously.
I can only say that we implemented pretty much exactly the same thing in Flex (ActionScript) and we also called it a Promise. In Clojure a promise is something quite a bit more lightweight: the get operation on it blocks until another thread delivers the promise. It's basically a one-element queue except that it retains its value forever, so subsequent gets always succeed.
What you have is a kind of a promise coupled with observers of its value. I'm not aware of any special term covering exactly that case.
EDIT
Now I notice that your "promise/future" might own the code that produces its future value (at least it's not entirely obvious whether it does). The ActionScript implementation I mentioned didn't do that -- it behaved like Clojure's, the value being supplied from the outside. I think this is the key distinction between a future and a promise.
I usually use the Observer pattern, my colleague at work though has implemented an Observer using Thread intercommunication (using wait and notify/notifyAll).
Should I implement my observers using the pattern or inter-Thread-communication using wait and notify? Are there any good reasons to avoid one approach and always use the other?
I've always gone with the first one, using the pattern, out of convention, and because it seems more expressive (involved identifiers are a good way to express and understand what is communicated and how).
EDIT:
I'm using the pattern in a Swing GUI, he's using the inter-thread solution in an Android application.
In his solution one thread generates data and then calls notify, to wake up another thread that paints the generated data and calls wait after every paint.
His argument for the wait/notify solution is that it creates fewer threads and that even several concurrent calls to notify will cause only 1 paint event, whereas an observer-based solution would call a repaint with every call. He says it's just another valid approach, but doesn't claim he's done it for performance reasons.
My argument is that I would express the communication among objects on the OO design level rather than use a language-specific feature that makes the communication almost invisible. Also, low-level thread communication is hard to master, might be hard for other readers to understand, and should rather be implemented on a higher level, e.g. using a CyclicBarrier. I don't have any sound arguments for one or the other solution, but I was wondering if there are any sound arguments for either approach (e.g. "This-and-that can happen if you use this approach, whereas in the other one that's not possible.").
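The coalescing effect he describes can be sketched as follows. This is a guess at the general shape, not the colleague's code, and the names RepaintSignal, markDirty and awaitDirty are made up. Several notifications that arrive before the painter wakes up collapse into a single flag, so the painter paints once.

```java
final class RepaintSignal {
    private boolean dirty;

    /** Called by the producer thread; repeated calls before the painter wakes collapse into one. */
    synchronized void markDirty() {
        dirty = true;
        notifyAll();
    }

    /** Called by the painter thread; blocks until at least one markDirty() has happened. */
    synchronized void awaitDirty() throws InterruptedException {
        while (!dirty) { // the loop guards against spurious wakeups
            wait();
        }
        dirty = false;   // consume all notifications seen so far
    }

    synchronized boolean isDirty() {
        return dirty;
    }

    /** Single-threaded demo: how many paints does a burst of notifications cause? */
    static int paintsAfterBurst(int notifications) {
        RepaintSignal signal = new RepaintSignal();
        for (int i = 0; i < notifications; i++) {
            signal.markDirty();
        }
        int paints = 0;
        while (signal.isDirty()) {
            try {
                signal.awaitDirty(); // returns immediately because the flag is set
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
            paints++; // a real painter thread would draw here
        }
        return paints;
    }

    public static void main(String[] args) {
        System.out.println(paintsAfterBurst(5)); // prints 1
    }
}
```

In real use, awaitDirty() would sit in the painter thread's loop while another thread calls markDirty(). A plain observer that repaints in its callback would instead paint five times for the same burst.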
You are comparing apples and oranges. The wait/notify mechanism is used for thread synchronization, and while your colleague may have used it within an Observer/Observable implementation, it is not in itself an implementation of the pattern. It simply means that his implementation is multithreaded.
There are many implementations of this pattern, and they are typically tailored to the environment in which you are working. There are event mechanisms built into most UI frameworks/toolkits. JMS for distributed environments, ...
I don't find much use for the generic Observer/Observable classes provided by the JDK, and from experience I haven't found many other developers use them either. Most will use a provided mechanism, if appropriate, or roll their own specific and ultimately more useful implementation if needed.
Since I have done most of my coding in an OSGi environment of late, I have a preference for a variation of observer/observable called the whiteboard pattern. This may or may not be feasible for you, depending on your environment.
You should avoid, or rather refrain from, inter-thread communication in 99.99% of the cases. If there is a real need for a multi threaded solution, you should use a higher level concurrency mechanism such as an ExecutorService or a good concurrency library such as jetlang: http://code.google.com/p/jetlang/.
Difficult. I would normally use Observer / Observable when not explicitly writing a multithreaded application. However, convention in this case might be for you to use his design. Perhaps see if you can abstract it out somehow so that you can replace it with the Observer pattern at a later stage if necessary?
However, I found these two articles which seem to indicate that the Observer/Observable pattern in Java is not ideal and should be avoided.
An inside view of Observer and
The event generator idiom