Setting and reading an instance variable within an RxJava chain from different Schedulers

I am not sure whether it is safe to read and write instance variables from RxJava chains that run on different schedulers. Here is a small example:
public class RxJavaThreadSafety {
private int variable = 0;
// First call
public void doWriting() {
Single.just(255)
.doOnSuccess(
newValue -> variable = newValue
)
.subscribeOn(Schedulers.io())
.subscribe();
}
// Second call
public void doReadingRxChain() {
Single.fromCallable((Callable<Integer>) () -> variable)
.subscribeOn(Schedulers.computation())
.subscribe(
result -> System.out.println(result)
);
}
// Third call
public void doReading() {
System.out.println(variable);
}
}
For simplicity, let's assume that these three methods are called one after another.
My question: is it thread safe to set variable "in" the io scheduler and later read it "from" the computation scheduler or the main thread?
I think it is not thread safe, but I would like some RxJava and concurrency experts to confirm it.

No, this is not thread safe.
When you use subscribeOn it means that calling subscribe() adds the task for producing the item to the work queue of a scheduler.
The doWriting() and doReadingRxChain() methods add tasks to different schedulers. There is no guarantee that the chain in doWriting() will even start to run before doReadingRxChain(). This can happen for example if all IO threads are busy.
There is a more fundamental problem: you are writing the value of variable in one thread and reading it in another. Without any concurrency controls, nothing guarantees that the new value of variable is seen by the thread reading it. One way to fix the visibility problem (though not the ordering problem described above) is to declare the variable as volatile:
private volatile int variable = 0;
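Note that volatile only addresses visibility; it does not make the write run before the read. A minimal sketch of one way to get both, assuming the read can be chained onto the write: the example below uses plain java.util.concurrent executors as stand-ins for the io and computation schedulers (it is not RxJava code, and the class and method names are hypothetical). The reading stage is a dependent of the writing stage, so it cannot start before the write has completed.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class OrderedWriteThenRead {
    // Plain field: ordering comes from chaining the read onto the write.
    private int variable = 0;

    // Writes on one pool, then reads on another, and returns what was read.
    int writeThenRead(int newValue) {
        ExecutorService ioLike = Executors.newSingleThreadExecutor();          // stand-in for Schedulers.io()
        ExecutorService computationLike = Executors.newSingleThreadExecutor(); // stand-in for Schedulers.computation()
        try {
            return CompletableFuture
                    .runAsync(() -> variable = newValue, ioLike)          // write stage
                    .thenApplyAsync(ignored -> variable, computationLike) // read stage, runs only after the write
                    .join();
        } finally {
            ioLike.shutdown();
            computationLike.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(new OrderedWriteThenRead().writeThenRead(255));
    }
}
```

In RxJava terms, the equivalent idea would be to pass the value down a single chain rather than through a shared field.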

Related

How to thread a sequence of actions through multiple threads?

I am exploring a problem which is likely a special case of a problem class, but I don't know the problem class nor the appropriate terminology, so I have to resort to describing the problem using ad-hoc vocabulary. I'll rephrase once I know the right terminology.
I have a bunch of singletons A, B, C. The singletons are:
Unrelated. There are no constraints like "you must access B before you can do X with C" or similar.
Not thread-safe.
The system accepts tasks to be processed in parallel as far as possible.
Each task consists of a sequence of actions, each action to be executed using one of the singletons. Different tasks may access different singletons in a different order, and tasks may contain loops of actions.
Pseudocode:
void myTask(in1, in2, ...) {
doWithA(() -> {
// use in1, in2, ...
// inspect and/or update A
// set up outputs to be used as inputs for the next action:
outA1 = ...
outA2 = ...
...
});
doWithB(() -> {
// use outA1, outA2, ...
// inspect and/or update B
// set up outputs to be used as inputs for the next action:
outB1 = ...
outB2 = ...
...
});
// Tasks may touch singletons repeatedly, in any order
doWithA(() -> {
// outB1, outB2, ..., inspect/modify A, set up outputs
outAx1 = ...
outAx2 = ...
...
});
// Tasks may have loops:
while (conditionInC(() -> ...)) {
doWithC(() -> ...);
doWithD(() -> ...);
}
// I am aware that a loop like this can cause a livelock.
// That's an aspect for another question, on another day.
}
There are multiple tasks like myTask above.
Tasks to be executed are wrapped in a closure and scheduled to a ThreadPoolExecutor (or something similar).
Approaches I considered:
1. Have singletons LockA, LockB, ...
Each doWithX is merely a synchronized(X) block.
outXn are local variables of myTask.
Problem: One of the singletons is Swing, and I can't move the EDT into a thread that I manage.
2. As above. Solve the Swing problem from approach (1) by coding doWithSwing(){...} as SwingUtilities.invokeAndWait(() -> {...}).
Problem: invokeAndWait is generally considered prone to deadlock. How do I find out whether I am in this kind of trouble with the pattern above?
3. Have threads threadA, threadB, ..., each of them "owning" one of the singletons (Swing already has this; it is the EDT).
doWithX schedules the block as a Runnable on threadX.
outXn are set up as Future<...> outXn = new SettableFuture<>(), and the assignments become outXn.set(...).
Problem: I couldn't find anything like SettableFuture in the JDK; all the ways to create a Future that I could find were somehow tied to a ThreadPool. Maybe I am looking at the wrong top-level interface and Future is a red herring?
Which of these approaches would be best?
Is there a superior approach that I didn't consider?
> I don't know the problem class nor the appropriate terminology
I'd probably just refer to the problem class as concurrent task orchestration.
There's a lot of things to consider when identifying the right approach. If you provide some more details, I'll try to update my answer with more color.
> There are no constraints like "you must access B before you can do X with C" or similar.
This is generally a good thing. A very common cause of deadlocks is different threads acquiring the same locks in differing orders. E.g., thread 1 locks A then B while thread 2 owns the lock B and is waiting to acquire A. Designing the solution such that this situation does not occur is very important.
> I couldn't find anything like SettableFuture in the JDK
Take a look at java.util.concurrent.CompletableFuture<T> - this is probably what you want here. It exposes a blocking get() as well as a number of asynchronous completion callbacks such as thenAccept(Consumer<? super T>).
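As a minimal sketch of using CompletableFuture the way the question describes SettableFuture: the future below is created with its public no-argument constructor, completed by hand from an arbitrary thread, and consumed both through a callback and through a blocking read. The class name, method name, and the value "result of A" are hypothetical and only serve the illustration.

```java
import java.util.concurrent.CompletableFuture;

class SettableFutureSketch {
    // Completes a future by hand from another thread and waits for the value.
    static String produceAndAwait() {
        CompletableFuture<String> outA1 = new CompletableFuture<>();

        // Asynchronous callback: runs once the value is available.
        CompletableFuture<Void> logged =
                outA1.thenAccept(value -> System.out.println("callback got " + value));

        // Any thread may complete the future; no thread pool is involved.
        Thread producer = new Thread(() -> outA1.complete("result of A"));
        producer.start();

        logged.join();       // wait until the callback has run
        return outA1.join(); // blocking read, like get() but without checked exceptions
    }

    public static void main(String[] args) {
        System.out.println(produceAndAwait());
    }
}
```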
> invokeAndWait is generally considered prone to deadlock
It depends. If your calling thread isn't holding any locks that are going to be necessary for the execution of the Runnable you're submitting, you're probably okay. That said, if you can base your orchestration on asynchronous callbacks, you can instead use SwingUtilities.invokeLater(Runnable) - this will submit the execution of your Runnable on the Swing event loop without blocking the calling thread.
I would probably avoid creating a thread per singleton. Each running thread contributes some overhead and it's better to decouple the number of threads from your business logic. This will allow you to tune the software to different physical machines based on the number of cores, for example.
It sounds like you need each doWithX(...) method to be atomic. In other words, once one thread has begun accessing X, another thread cannot do so until the first thread is finished with its task step. If this is the case, then creating a lock object per singleton and ensuring serial (rather than parallel) access is the right way to go. You can achieve this by wrapping the execution of the closures that get submitted to your doWithX(...) methods in a synchronized Java code block. The code within the block is also referred to as the critical section or monitor region.
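A minimal sketch of the lock-per-singleton idea, assuming a single singleton A whose state is modelled by a plain int field (the class, field, and demo method are hypothetical): every task step on A runs inside a synchronized block on A's lock, so steps from different threads never overlap.

```java
import java.util.function.Supplier;

class SingletonGuard {
    private final Object lockA = new Object();
    private int stateOfA = 0; // stands in for the non-thread-safe singleton A

    // Runs one task step while holding A's lock, so steps on A never overlap.
    <T> T doWithA(Supplier<T> step) {
        synchronized (lockA) {
            return step.get();
        }
    }

    // Hypothetical demo: several threads increment A's state through doWithA.
    static int runDemo(int threads, int incrementsPerThread) {
        SingletonGuard guard = new SingletonGuard();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int n = 0; n < incrementsPerThread; n++) {
                    guard.doWithA(() -> ++guard.stateOfA);
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join(); // join() also makes the workers' writes visible here
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return guard.stateOfA;
    }

    public static void main(String[] args) {
        System.out.println(runDemo(4, 10_000)); // no increments are lost
    }
}
```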
Another thing to consider is thread contention and order of execution. If two tasks both require access to X and task 1 gets submitted before task 2, is it a requirement that task 1's access to X occurs before task 2's? A requirement like that can complicate the design quite a bit and I would probably recommend a different approach than outlined above.
> Is there a superior approach that I didn't consider?
These days there are frameworks out there for solving these types of problems. I'm specifically thinking of reactive streams and RxJava. While it is a very powerful framework, it also comes with a very steep learning curve. A lot of analysis and consideration should be done before adopting such a technology within an organization.
Update:
Based on your feedback, I think a CompletableFuture-based approach probably makes the most sense.
I'd create a helper class to orchestrate task step execution:
class TaskHelper
{
private final Object lockA;
private final Object lockB;
private final Object lockC;
private final Executor poolExecutor;
private final Executor swingExecutor;
public TaskHelper()
{
poolExecutor = Executors.newFixedThreadPool( 2 );
swingExecutor = SwingUtilities::invokeLater;
lockA = new Object();
lockB = new Object();
lockC = new Object();
}
public <T> CompletableFuture<T> doWithA( Supplier<T> taskStep )
{
return doWith( lockA, poolExecutor, taskStep );
}
public <T> CompletableFuture<T> doWithB( Supplier<T> taskStep )
{
return doWith( lockB, poolExecutor, taskStep );
}
public <T> CompletableFuture<T> doWithC( Supplier<T> taskStep )
{
return doWith( lockC, swingExecutor, taskStep );
}
private <T> CompletableFuture<T> doWith( Object lock, Executor executor, Supplier<T> taskStep )
{
CompletableFuture<T> future = new CompletableFuture<>();
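Runnable serialTaskStep = () -> {
try {
T result;
synchronized ( lock ) {
result = taskStep.get();
}
future.complete( result );
} catch ( RuntimeException e ) {
// propagate the failure; otherwise the returned future would never complete
future.completeExceptionally( e );
}
};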
executor.execute( serialTaskStep );
return future;
}
}
In my example above, doWithA and doWithB get scheduled on a shared thread pool while doWithC is always executed on the Swing thread. The Swing Executor is already going to be serial in nature, so the lock is really optional there.
For creating actual tasks, I'd recommend creating an object for each task. This allows you to supply callbacks as method references, resulting in cleaner code and avoiding callback hell:
This example computes the square of a provided number on a background thread pool and then displays the results on the Swing thread:
class SampleTask
{
private final TaskHelper helper;
private final String id;
private final int startingValue;
public SampleTask( TaskHelper helper, String id, int startingValue )
{
this.helper = helper;
this.id = id;
this.startingValue = startingValue;
}
public void start()
{
helper.doWithB( () -> {
int square = startingValue * startingValue;
return String.format( "computed-thread: %s computed-square: %d",
Thread.currentThread().getName(), square );
} )
.thenAccept( this::step2 );
}
private void step2( String result )
{
helper.doWithC( () -> {
String message = String.format( "current-thread: %s task: %s result: %s",
Thread.currentThread().getName(), id, result );
JOptionPane.showConfirmDialog( null, message );
return null;
} );
}
}
@Test
public void testConcurrent() throws InterruptedException, ExecutionException
{
TaskHelper helper = new TaskHelper();
new SampleTask( helper, "task1", 5 ).start();
new SampleTask( helper, "task2", 7 ).start();
Thread.sleep( 60000 );
}
Update 2:
If you want to avoid callback hell while also avoiding the need to create an object per task, perhaps you should take a serious look at reactive streams after all.
Take a look at the "getting started" page for RxJava:
https://github.com/ReactiveX/RxJava/wiki/How-To-Use-RxJava
For reference here's how the same example above would look in Rx (I'm removing the concept of task ID for simplicity):
@Test
public void testConcurrentRx() throws InterruptedException
{
Scheduler swingScheduler = Schedulers.from( SwingUtilities::invokeLater );
Subject<Integer> inputSubject = PublishSubject.create();
inputSubject
.flatMap( input -> Observable.just( input )
.subscribeOn( Schedulers.computation() )
.map( this::computeSquare ))
.observeOn( swingScheduler )
.subscribe( this::displayResult );
inputSubject.onNext( 5 );
inputSubject.onNext( 7 );
Thread.sleep( 60000 );
}
private String computeSquare( int input )
{
int square = input * input;
return String.format( "computed-thread: %s computed-square: %d",
Thread.currentThread().getName(), square );
}
private void displayResult( String result )
{
String message = String.format( "current-thread: %s result: %s",
Thread.currentThread().getName(), result );
JOptionPane.showConfirmDialog( null, message );
}

Why would I check in a Runnable, that its Thread is not null?

I've found code like this in a project I'm taking over. I'm not sure, what the if condition is supposed to accomplish. If the Runnable is running, it does so in the Thread it checks for being null. So that is always the case, right?
public class Outer
{
Thread m_thread = null;
public Outer()
{
Runnable runner = new Runnable()
{
public void run()
{
if ( m_thread != null )
do_stuff();
}
};
m_thread = new Thread(runner);
m_thread.start();
}
}
There is actually another method, that sets m_thread to null, but since there is no loop in the runnable, does that make a difference? do_stuff() does not access m_thread.
Since m_thread is not marked volatile or guarded by any other memory barrier operation it's possible that when Runnable is running it will observe m_thread to be null. If do_stuff() requires non-null reference to m_thread, the code will fail.
Check the Safe Publication and Safe Initialization in Java article by Shipilev to understand safe publication idioms in Java. In short:
There are a few trivial ways to achieve safe publication:
Exchange the reference through a properly locked field (JLS 17.4.5)
Use static initializer to do the initializing stores (JLS 12.4)
Exchange the reference via a volatile field (JLS 17.4.5), or as the consequence of this rule, via the AtomicX classes
Initialize the value into a final field (JLS 17.5).
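Two of the idioms listed above (a volatile field and a final field) can be sketched as follows. This is only an illustration with hypothetical names (SafePublication, Config, publishAndRead): the Config object keeps its state in a final field, and the reference to it is handed to the other thread through a volatile field.

```java
class SafePublication {
    static final class Config {
        private final int port; // final field: safely visible once the constructor has finished

        Config(int port) {
            this.port = port;
        }

        int port() {
            return port;
        }
    }

    // volatile field: a write happens-before any later read that sees it
    private volatile Config current;

    void publish(int port) {
        current = new Config(port);
    }

    // A reader thread waits until a Config has been published, then reads it.
    static int publishAndRead(int port) {
        SafePublication holder = new SafePublication();
        int[] seen = new int[1];
        Thread reader = new Thread(() -> {
            Config config;
            while ((config = holder.current) == null) {
                Thread.yield(); // wait for the publishing thread
            }
            seen[0] = config.port();
        });
        reader.start();
        holder.publish(port);
        try {
            reader.join(); // join() makes the reader's write to seen[0] visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(publishAndRead(8080));
    }
}
```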
You don't. There was a fashion 20 years ago, that I think may have originated in a magazine, for run() methods to loop while (Thread.currentThread() != null). It was meaningless then and it is meaningless now, even when slightly re-expressed as in your code.
Simply put: that doesn't make any sense. When a line of code is executed in Java, some thread is running it.
Unless you start implementing your own tracking of threads, the fact that your code is running ... tells it that some thread is running it.
The code shown here A) violates Java naming conventions, and it also B) violates "common sense" in Java.
You see, you could still write code that first initializes that m_thread field and then invokes runner.run() directly from the "main" thread. The run method would find that the field is not null and invoke do_stuff(). If anything, you could check that Thread.currentThread() returns something other than your "main" thread.
As in:
class Outer {
private Thread mainThread;
private Thread m_thread;
public Outer()
{
mainThread = Thread.currentThread();
Runnable runner = new Runnable()
{
public void run()
{
// only do the work when not called directly from the constructing thread
if ( Thread.currentThread() != mainThread )
do_stuff();
}
};
m_thread = new Thread(runner);
m_thread.start();
}
}
( I didn't run the above through the compiler, it is meant as "pseudo code like" example, not necessarily 100% correct )

Thread structure in Java

I've got few questions about threads in Java. Here is the code:
TestingThread class:
public class TestingThread implements Runnable {
Thread t;
volatile boolean pause = true;
String msg;
public TestingThread() {
t = new Thread(this, "Testing thread");
}
public void run() {
while (pause) {
//wait
}
System.out.println(msg);
}
public boolean isPause() {
return pause;
}
public void initMsg() {
msg = "Thread death";
}
public void setPause(boolean pause) {
this.pause = pause;
}
public void start() {
t.start();
}
}
And main thread class:
public class Main {
public static void main(String[] args) {
TestingThread testingThread = new TestingThread();
testingThread.start();
testingThread.initMsg();
testingThread.setPause(false);
}
}
Question list:
1. Should t be volatile?
2. Should msg be volatile?
3. Should setPause() be synchronized?
4. Is this a good example of good thread structure?
You have hit quite a subtlety with your question number 2.
In your very specific case, you:
1. first write msg from the main thread;
2. then write the volatile pause from the main thread;
3. then read the volatile pause from the child thread;
4. then read msg from the child thread.
Therefore you have transitively established a happens-before relationship between the write and the read of msg. Therefore msg itself does not have to be volatile.
In real-life code, however, you should avoid depending on such subtle behavior: better overapply volatile and sleep calmly.
Here are some relevant quotes from the Java Language Specification:
If x and y are actions of the same thread and x comes before y in program order, then hb(x, y).
If an action x synchronizes-with a following action y, then we also have hb(x, y).
Note that, in my list of actions,
1 comes before 2 in program order;
same for 3 and 4;
2 synchronizes-with 3.
As for your other questions,
ad 1: t doesn't have to be volatile because it's written to prior to thread creation and never mutated later. Starting a thread induces a happens-before on its own;
ad 3: setPause does not have to be synchronized because all it does is set the volatile var.
> Should msg be volatile?
Yes. Does it have to be in this example? No. But I urge you to use it anyway, as the code's correctness becomes much clearer ;) Please note that I am assuming that we are discussing Java 5 or later; before then, volatile was broken anyway.
The tricky part to understand is why this example can get away without msg being declared as volatile.
Consider the order of the calls in main():
testingThread.start(); // starts the other thread
testingThread.initMsg(); // The other thread may or may not be in its
// while loop by now, and msg may or may not
// be visible to it yet. The important thing
// to note is that, either way, testingThread
// cannot leave its while loop yet.
testingThread.setPause(false); // After this volatile write, all earlier
// writes will be visible to a thread that
// reads pause. Thus, even though msg was not
// declared volatile, it piggybacks on pause's
// use of volatile, as described at
// http://www.cs.umd.edu/~pugh/java/memoryModel/jsr-133-faq.html#volatile
// Now the testingThread can leave its while loop.
So if we now consider the testingThread
while (pause) { // pause is volatile, so it will see the change as soon
// as it is made
//wait
}
System.out.println(msg); // this line cannot be reached until the value
// of pause has been set to false by the main
// method. Which under the post Java5
// semantics will guarantee that msg will
// have been updated too.
> Should t be volatile?
It does not matter, but I would suggest making it private final.
> Should setPause() be synchronized?
Before Java 5, yes. Since Java 5, reading a volatile has the same memory barrier as entering a synchronized block, and writing to a volatile has the same memory barrier as leaving a synchronized block. Thus, unless you need the scoping of a synchronized block, which in this case you do not, you are fine with volatile.
The changes to volatile in Java 5 are documented by the author of the change here.
1&2
volatile can be thought of as something like "synchronization on a variable": the mechanism is different, but the result is similar, namely that reads see a consistent value.
3.
I feel it does not need to be, since this.pause = pause should be an atomic statement.
4.
Looping with while (true) { /* do nothing */ } is a bad example because it results in busy waiting; putting Thread.sleep inside the loop helps only a little. Please refer to http://en.wikipedia.org/wiki/Busy_waiting
A more appropriate way to "wait until awakened" is to use a monitor object (every Object in Java is a monitor object), or to use a Condition object together with a Lock. You may want to refer to http://docs.oracle.com/javase/7/docs/api/java/util/concurrent/locks/Condition.html
Also, I don't think it is a good idea to keep a Thread field inside your custom Runnable. Please refer to Seelenvirtuose's comment.
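To illustrate the "wait until awakened" alternative to busy waiting, here is a minimal sketch using a ReentrantLock with a Condition. The class PauseGate and its methods are hypothetical and are not taken from the question; both paused and msg are guarded by the lock, so neither needs to be volatile.

```java
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

class PauseGate {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition resumed = lock.newCondition();
    private boolean paused = true; // guarded by lock
    private String msg;            // guarded by lock

    // Blocks without spinning until resume() has been called, then returns the message.
    String awaitMessage() throws InterruptedException {
        lock.lock();
        try {
            while (paused) {     // the loop guards against spurious wakeups
                resumed.await(); // releases the lock while waiting
            }
            return msg;
        } finally {
            lock.unlock();
        }
    }

    // Stores the message and wakes up every waiting thread.
    void resume(String message) {
        lock.lock();
        try {
            msg = message;
            paused = false;
            resumed.signalAll();
        } finally {
            lock.unlock();
        }
    }

    // Hypothetical demo mirroring the question: one thread waits, the caller resumes it.
    static String runDemo() {
        PauseGate gate = new PauseGate();
        String[] received = new String[1];
        Thread waiter = new Thread(() -> {
            try {
                received[0] = gate.awaitMessage();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "Testing thread");
        waiter.start();
        gate.resume("Thread death");
        try {
            waiter.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return received[0];
    }

    public static void main(String[] args) {
        System.out.println(runDemo());
    }
}
```

Because the waiter checks the paused flag under the lock before waiting, it does not matter whether resume() runs before or after the waiter reaches await().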

Java - threads + action

I'm new to Java so I have a simple question that I don't know where to start from -
I need to write a function that accepts an Action in a multi-threaded program. Only the first thread that enters the function should perform the action; all the other threads should wait for it to finish and then return from the function without doing anything.
As I said - I don't know where to begin because,
first - there is no static variable inside a function (static as in C / C++), so how do I make sure that only the first thread starts the action and the others do nothing?
second - for the threads to wait, should I use
public synchronized void lala(Action doThis)
{....}
or should I write something like this inside the function
synchronized (this)
{
...
notify();
}
Thanks !
If you want all threads arriving at a method to wait for the first, then they must synchronize on a common object. It could be the same instance (this) on which the methods are invoked, or it could be any other object (an explicit lock object).
If you want to ensure that the first thread is the only one that will perform the action, then you must store this fact somewhere, for all other threads to read, for they will execute the same instructions.
Going by the previous two points, one could lock on this 'fact' variable to achieve the desired outcome
static final AtomicBoolean flag = new AtomicBoolean(false); // synchronize on this, and also store the fact. It is static so that if this is in a Runnable instance will not appear to reset the fact. Don't use the Boolean wrapper, for the value of the flag might be different in certain cases.
public void lala(Action doThis)
{
synchronized (flag) // synchronize on the flag so that other threads arriving here, will be forced to wait
{
if(!flag.get()) // This condition is true only for the first thread.
{
doX();
flag.set(true); //set the flag so that other threads will not invoke doX.
}
}
...
doCommonWork();
...
}
If you're doing threading in any recent version of Java, you really should be using the java.util.concurrent package instead of using Threads directly.
Here's one way you could do it:
private final ExecutorService executor = Executors.newCachedThreadPool();
private final Map<Runnable, Future<?>> submitted
= new HashMap<Runnable, Future<?>>();
public void executeOnlyOnce(Runnable action) throws InterruptedException, ExecutionException {
Future<?> future = null;
// NOTE: I was tempted to use a ConcurrentHashMap here, but we don't want to
// get into a possible race with two threads both seeing that a value hasn't
// been computed yet and both starting a computation, so the synchronized
// block ensures that no other thread can be submitting the runnable to the
// executor while we are checking the map. If, on the other hand, it's not
// a problem for two threads to both create the same value (that is, this
// behavior is only intended for caching performance, not for correctness),
// then it should be safe to use a ConcurrentHashMap and use its
// putIfAbsent() method instead.
synchronized(submitted) {
future = submitted.get(action);
if(future == null) {
future = executor.submit(action);
submitted.put(action, future);
}
}
future.get(); // ignore return value because the runnable returns void
}
Note that this assumes that your Action class (I'm assuming you don't mean javax.swing.Action, right?) implements Runnable and also has a reasonable implementation of equals() and hashCode(). Otherwise, you may need to use a different Map implementation (for example, IdentityHashMap).
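As a hedged alternative to the synchronized block, on Java 8 or later ConcurrentHashMap.computeIfAbsent can do the check and the submission in one atomic step, because the mapping function is applied at most once per key. The sketch below is not the code from the answer; OncePerAction, shutdown, and runDemo are hypothetical names, and actions are compared with their own equals() and hashCode() (identity by default).

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

class OncePerAction {
    private final ExecutorService executor = Executors.newCachedThreadPool();
    private final ConcurrentHashMap<Runnable, Future<?>> submitted = new ConcurrentHashMap<>();

    // Submits each distinct action at most once; every caller waits for that single run.
    void executeOnlyOnce(Runnable action) {
        // The mapping function runs at most once per key, so there is one submit per action.
        Future<?> future = submitted.computeIfAbsent(action, task -> executor.submit(task));
        try {
            future.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        }
    }

    void shutdown() {
        executor.shutdown();
    }

    // Hypothetical demo: many threads request the same action; returns how often it ran.
    static int runDemo(int callers) {
        OncePerAction once = new OncePerAction();
        AtomicInteger runs = new AtomicInteger();
        Runnable action = runs::incrementAndGet;
        Thread[] threads = new Thread[callers];
        for (int i = 0; i < callers; i++) {
            threads[i] = new Thread(() -> once.executeOnlyOnce(action));
            threads[i].start();
        }
        for (Thread thread : threads) {
            try {
                thread.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        once.shutdown();
        return runs.get();
    }

    public static void main(String[] args) {
        System.out.println(runDemo(8));
    }
}
```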
Also, this assumes that you may have multiple different actions that you want to execute only once. If that's not the case, then you can drop the Map entirely and do something like this:
private final ExecutorService executor = Executors.newCachedThreadPool();
private final Object lock = new Object();
private volatile Runnable action;
private volatile Future<?> future = null;
public void executeOnlyOnce(Runnable action) throws InterruptedException, ExecutionException {
synchronized(lock) {
if(this.action == null) {
this.action = action;
this.future = executor.submit(action);
} else if(!this.action.equals(action)) {
throw new IllegalArgumentException("Unexpected action");
}
}
future.get();
}
public synchronized void foo()
{
...
}
is equivalent to
public void foo()
{
synchronized(this)
{
...
}
}
so either of the two options should work. I personally like the synchronized method option.
Synchronizing the whole method can sometimes be overkill if there is only a certain part of the code that deals with shared data (for example, a common variable that each thread is updating).
For best performance, use the synchronized keyword only around the code that touches shared data. If you synchronize the whole method when that is not necessary, many threads will wait even though they could still do work within their own local scope.
When a thread enters a synchronized block it acquires a lock (if you synchronize on this, it locks on the object itself), and the other threads wait until the lock-holding thread has left the block. You don't actually need a notify call in this situation, because a thread releases the lock when it exits the synchronized block.

Thread that can restart based on a condition

The basic idea is that I have a native function I want to call in a background thread with a user selected value and the thread cannot be interrupted when started. If the user decides to change the value used to perform the task while the thread is running (they can do this from a GUI), the thread should finish its task with the previous value and then restart with the new value. When the task is done and the value hasn't changed, the thread should end and call a callback function.
This is what my current code looks like for the thread starting part:
volatile int taskValue;
volatile boolean threadShouldRestart;
void setTaskValue(int value)
{
taskValue = value;
synchronized (threadShouldRestart)
{
if task thread is already running
threadShouldRestart = true
else
{
threadShouldRestart = false
create and start new thread
}
}
}
And the actual work thread looks like this:
while (true)
{
nativeFunctionCall(taskValue);
synchronized (threadShouldRestart)
{
if (!threadShouldRestart)
{
invokeTaskCompletedCallbackFunction();
return;
}
}
}
I'm locking on the "threadShouldRestart" part because e.g. I don't want this changing to true just as the thread decides it's done which means the thread wouldn't restart when it was meant to.
Are there any cleaner ways to do this or Java utility classes I could be using?
You could design your run() method as follows:
public void run() {
int currentTaskValue;
do {
currentTaskValue = taskValue;
// perform the work...
} while (currentTaskValue != taskValue);
}
I think the volatile declaration on taskValue is enough for this, since reads and writes of primitives no larger than 32 bits are atomic.
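A minimal, single-threaded sketch of how this loop behaves when the value changes during a pass. The names RestartingTask and runDemo are hypothetical, the native call is replaced by an IntConsumer, and -1 is used only as a marker for the completion callback. Note one limitation of this approach: a value that arrives between the final comparison and the callback is not picked up, so it does not fully replace the locking in the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntConsumer;

class RestartingTask {
    private volatile int taskValue;

    void setTaskValue(int value) {
        taskValue = value;
    }

    // Repeats the work until the value used for a pass is still current when the pass ends.
    void run(IntConsumer work, Runnable onCompleted) {
        int currentTaskValue;
        do {
            currentTaskValue = taskValue;
            work.accept(currentTaskValue); // stands in for nativeFunctionCall(taskValue)
        } while (currentTaskValue != taskValue);
        onCompleted.run();
    }

    // Hypothetical demo: the value changes from 1 to 2 while the first pass is running.
    static List<Integer> runDemo() {
        RestartingTask task = new RestartingTask();
        task.setTaskValue(1);
        List<Integer> processed = new ArrayList<>();
        task.run(value -> {
            processed.add(value);
            if (value == 1) {
                task.setTaskValue(2); // simulates the user changing the value mid-run
            }
        }, () -> processed.add(-1)); // -1 marks the completion callback
        return processed;
    }

    public static void main(String[] args) {
        System.out.println(runDemo()); // prints [1, 2, -1]
    }
}
```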
Have you considered a ThreadPoolExecutor? It seems to lend itself well to your problem as you mentioned you have no need to restart or stop a thread which has already started.
http://download.oracle.com/javase/1.5.0/docs/api/java/util/concurrent/ThreadPoolExecutor.html
A user could submit as many tasks as they like to a task queue, tasks will be processed concurrently by some number of worker threads you define in the ThreadPoolExecutor constructor.
