Is my class threadsafe? If not why?
class Foo {
boolean b = false;
void doSomething() throws Exception {
while (b) Thread.sleep();
}
void setB(boolean b) {
this.b = b;
}
}
The code is not thread safe because a running thread may see changes until the code is compiled (which could be at a random point later) and you no longer see changes.
BTW: This makes testing it very hard to do. e.g. if you sleep for 1 second you might not see this behaviour for almost three hours.
i.e. it may or may not work and you cannot say just because it has worked it will continue to work.
As b is not volatile the JIT can and will optimise
while (b) Thread.sleep(N);
to be
boolean b = this.b;
if (b) while (true) Thread.sleep(N);
so the value of b is not read each time.
It is not. setB() updating instance variable b value but is not synchronized.
Multiple threads may try to execute setB() method at same point of time may result in un-predicted results.
You need to synchronize the method (or) use synchronize block with lock on this object.
Take a look at AtomicBoolean. This will mean that only one thread can access it at any one time.
But, how is it relates to thread safety? little confused.
It goes back to the definition of "thread-safe". Wikipedia says this:
"Thread safety is a computer programming concept applicable in the context of multi-threaded programs. A piece of code is thread-safe if it functions correctly during simultaneous execution by multiple threads."
The key point here is functions correctly. If you look at the scenario explained by Peter Lawrey, you can see that JIT compilation can result in code that doesn't notice when the value of b changes from true to false, and instead loops for ever. This is clearly incorrect behaviour. And since this only manifests when there are two threads, it is a thread-safety issue.
An at-issue concept here is visibility, discussed in Concurrent Java. If doSomething() is running on a different thread, the JVM makes no guarantee that the action taken in setB will apply to the version of b that doSomething sees. So you could call setB and doSomething is not guaranteed to ever terminate.
Related
I am preparing for the OCP exam and I found this question in a mock exam:
Given:
class Calculator {
private AtomicInteger i = new AtomicInteger();
public void add(int value) {
int oldValue = i.get();
int newValue = oldValue + value;
System.out.print(i.compareAndSet(oldValue,newValue));
}
public int getValue() {
return i.get();
}
}
What could you do to make this class thread safe?
And surprisingly, to me, the answer is:
"Class Calculator is thread-safe"
It must be that I have not understood correctly the concept. To my understanding, a class is thread safe when all the methods work as expected under thread concurrency. Now, if two thread call at the same time getValue(), then call add() passing a different value, and then they call getValue() again, the 'second' thread won't see its value passed increased.
I understand that the oldValue and the newValue as local variables are stored in the method stack, but that doesn't prevent that the second call to compareAndSet will find that the oldValue is not the current value and so won't add the newValue.
What am I missing here?
According to JCIP
A class is thread-safe if it behaves correctly when accessed from multiple threads, regardless of the scheduling or
interleaving of the execution of those threads by the runtime environment, and with no additional synchronization or
other coordination on the part of the calling code.
Although there is no definition of thread-safety and no specification of the class, in my opinion, by any sane definition of an add method in a Calculator class it is "correct" if it the value of the AtomicInteger i is increased in any case, "regardless of the scheduling or interleaving of the execution".
Therefore, in my opinion, the class is not thread-safe by this definition.
There is clearly a problem with the term "thread-safe" here, in that it isn't absolute. What is considered thread-safe depends on what you expect the program to do. And in most real-world applications you wouldn't consider this code thread-safe.
However the JLS formally specifies a different concept:
A program is correctly synchronized if and only if all sequentially
consistent executions are free of data races.
If a program is correctly synchronized, then all executions of the
program will appear to be sequentially consistent
Correctly synchronized is a precisely defined, objective condition and according to that definition the code is correctly synchronized because every access to i is in a happens-before relationship with every other access, and that's enough to satisfy the criteria.
The fact that the exact order of reads/writes depends on unpredictable timing is a different problem, outside the realms of correct synchronization (but well within what most people would call thread safety).
The add method is going to do one of two things:
Add value to the value of i, and print true.
Do nothing and print false.
The theoretically sound1 definitions of thread-safe that I have seen say something like this:
Given a set of requirements, a program is thread-safe with respect to those requirements if it correct with respect to those requirements for all possible executions in a multi-threaded environment.
In this case, we don't have a clear statement of requirements, but if we infer that the intended behavior is as above, then that class is thread-safe with respect to the inferred requirements.
Now if the requirements were that add always added value to i, then that implementation does not meet the requirement. In that case, you could argue that the method is not thread-safe. (In a single-threaded use-case add would always work, but in a multi-threaded use-case the add method could occasionally fail to meet the requirement ... and hence would be non-thread-safe.)
1 - By contrast, the Wikipedia description (seen on 2016-01-17) is this: "A piece of code is thread-safe if it only manipulates shared data structures in a manner that guarantees safe execution by multiple threads at the same time." The problem is that it doesn't say what "safe execution" means. Eric Lippert's 2009 blog post "What is this thing you call thread-safe" is really pertinent.
It's threadsafe because compareAndSet is threadsafe and that's the only part that's modifying shared resources in the object.
It doesn't make a difference how many threads enter that method body at the same time. The first one to reach the end and call compareAndSet "wins" and gets to change the value while the others find the value has changed on them and compareAndSet returns false. It never results in the system being in an indeterminate state, though callers would probably have to handle the false outcome in any realistic scenario.
I have a situation and I need some advice about synchronized block in Java. I have a Class Test below:
Class Test{
private A a;
public void doSomething1(String input){
synchronized (a) {
result = a.process(input);
}
}
public void doSomething2(String input){
synchronized (a) {
result = a.process(input);
}
}
public void doSomething3(String input){
result = a.process(input);
}
}
What I want is when multi threads call methods doSomeThing1() or doSomeThing2(), object "a" will be used and shared among multi threads (it have to be) and it only processes one input at a time (waiting until others thread set object "a" free) and when doSomeThing3 is called, the input is processed immediately.
My question is will the method doSomeThing3() be impacted my method doSomeThing1() and doSomeThing2()? Will it have to wait if doSomeThing1() and doSomeThing2() are using object "a"?
A method is never impacted by anything that your threads do. What gets impacted is data, and the answer to your question depends entirely on what data are updated (if any) inside the a.process() call.
You asked "Variable reference or memory is blocked?"
First of all, "variable" and "memory" are the same thing. Variables, and fields and objects are all higher level abstractions that are built on top of the lower-level idea of "memory".
Second of all, No. Locking an object does not prevent other threads from accessing or modifying the object or, from accessing or modifying anything else.
Locking an object does two things: It prevents other threads from locking the same object at the same time, and it makes certain guarantees about the visibility of memory updates. The simple explanation is, if thread X updates some variables and then releases a lock, thread Y will be guaranteed to see the updates only after it has acquired the same lock.
What that means for your example is, if thread X calls doSomething1() and modifies the object a; and then thread Y later calls doSomething3(), thread Y is not guaranteed to see the the updates. It might see the a object in its original state, it might see it in the fully updated state, or it might see it in some invalid half-way state. The reason why is because, even though thread X locked the object, modified it, and then released the lock; thread Y never locked the same object.
In your code, doSomething3() can proceed in parallel with doSomething1() or doSomething2(), so in that sense it does what you want. However, depending on exactly what a.process() does, this may cause a race condition and corrupt data. Note that even if doSomething3() is called, any calls to doSomething1() or doSomething2() that have started will continue; they won't be put in abeyance while doSomething3() is processed.
So I've been reading on concurrency and have some questions on the way (guide I followed - though I'm not sure if its the best source):
Processes vs. Threads: Is the difference basically that a process is the program as a whole while a thread can be a (small) part of a program?
I am not exactly sure why there is a interrupted() method and a InterruptedException. Why should the interrupted() method even be used? It just seems to me that Java just adds an extra layer of indirection.
For synchronization (and specifically about the one in that link), how does adding the synchronize keyword even fix the problem? I mean, if Thread A gives back its incremented c and Thread B gives back the decremented c and store it to some other variable, I am not exactly sure how the problem is solved. I mean this may be answering my own question, but is it supposed to be assumed that after one of the threads return an answer, terminate? And if that is the case, why would adding synchronize make a difference?
I read (from some random PDF) that if you have two Threads start() subsequently, you cannot guarantee that the first thread will occur before the second thread. How would you guarantee it, though?
In synchronization statements, I am not completely sure whats the point of adding synchronized within the method. What is wrong with leaving it out? Is it because one expects both to mutate separately, but to be obtained together? Why not just have the two non-synchronized?
Is volatile just a keyword for variables and is synonymous with synchronized?
In the deadlock problem, how does synchronize even help the situation? What makes this situation different from starting two threads that change a variable?
Moreover, where is the "wait"/lock for the other person to bowBack? I would have thought that bow() was blocked, not bowBack().
I'll stop here because I think if I went any further without these questions answered, I will not be able to understand the later lessons.
Answers:
Yes, a process is an operating system process that has an address space, a thread is a unit of execution, and there can be multiple units of execution in a process.
The interrupt() method and InterruptedException are generally used to wake up threads that are waiting to either have them do something or terminate.
Synchronizing is a form of mutual exclusion or locking, something very standard and required in computer programming. Google these terms and read up on that and you will have your answer.
True, this cannot be guaranteed, you would have to have some mechanism, involving synchronization that the threads used to make sure they ran in the desired order. This would be specific to the code in the threads.
See answer to #3
Volatile is a way to make sure that a particular variable can be properly shared between different threads. It is necessary on multi-processor machines (which almost everyone has these days) to make sure the value of the variable is consistent between the processors. It is effectively a way to synchronize a single value.
Read about deadlocking in more general terms to understand this. Once you first understand mutual exclusion and locking you will be able to understand how deadlocks can happen.
I have not read the materials that you read, so I don't understand this one. Sorry.
I find that the examples used to explain synchronization and volatility are contrived and difficult to understand the purpose of. Here are my preferred examples:
Synchronized:
private Value value;
public void setValue(Value v) {
value = v;
}
public void doSomething() {
if(value != null) {
doFirstThing();
int val = value.getInt(); // Will throw NullPointerException if another
// thread calls setValue(null);
doSecondThing(val);
}
}
The above code is perfectly correct if run in a single-threaded environment. However with even 2 threads there is the possibility that value will be changed in between the check and when it is used. This is because the method doSomething() is not atomic.
To address this, use synchronization:
private Value value;
private Object lock = new Object();
public void setValue(Value v) {
synchronized(lock) {
value = v;
}
}
public void doSomething() {
synchronized(lock) { // Prevents setValue being called by another thread.
if(value != null) {
doFirstThing();
int val = value.getInt(); // Cannot throw NullPointerException.
doSecondThing(val);
}
}
}
Volatile:
private boolean running = true;
// Called by Thread 1.
public void run() {
while(running) {
doSomething();
}
}
// Called by Thread 2.
public void stop() {
running = false;
}
To explain this requires knowledge of the Java Memory Model. It is worth reading about in depth, but the short version for this example is that Threads have their own copies of variables which are only sync'd to main memory on a synchronized block and when a volatile variable is reached. The Java compiler (specifically the JIT) is allowed to optimise the code into this:
public void run() {
while(true) { // Will never end
doSomething();
}
}
To prevent this optimisation you can set a variable to be volatile, which forces the thread to access main memory every time it reads the variable. Note that this is unnecessary if you are using synchronized statements as both keywords cause a sync to main memory.
I haven't addressed your questions directly as Francis did so. I hope these examples can give you an idea of the concepts in a better way than the examples you saw in the Oracle tutorial.
I have an immutable object, which is capsulated in class and is global state.
Lets say i have 2 threads that get this state, execute myMethod(state) with it. And lets say thread1 finish first. It modify the global state calling GlobalStateCache.updateState(state, newArgs);
GlobalStateCache {
MyImmutableState state = MyImmutableState.newInstance(null, null);
public void updateState(State currentState, Args newArgs){
state = MyImmutableState.newInstance(currentState, newArgs);
}
}
So thread1 will update the cached state, then thread2 do the same, and it will override the state (not take in mind the state updated from thread1)
I searched google, java specifications and read java concurrency in practice but this is clearly not specified.
My main question is will the immutable state object value be visible to a thread which already had read the immutable state. I think it will not see the changed state, only reads after the update will see it.
So i can not understand when to use immutable objects? Is this depends on if i am ok with concurrent modifications during i work with the latest state i have saw and not need to update the state?
Publication seems to be somewhat tricky concept, and the way it's explained in java concurrency in practice didn't work well to me (as opposed to many other multithreading concepts explained in this wonderful book).
With above in mind, let's first get clear on some simpler parts of your question.
when you state lets say thread1 finish first - how would you know that? or, to be more precise, how would thread2 "know" that? as far as I can tell this could be only possible with some sort of synchronization, explicit or not-so-explicit like in thread join (see the JLS - 17.4.5 Happens-before Order). Code you provided so far does not give sufficient details to tell whether this is the case or not
when you state that thread1 will update the cached state - how would thread2 "know" that? with the piece of code you provided, it looks entirely possible (but not guaranteed mind you) for thread2 to never know about this update
when you state thread2... will override the state what does override mean here? There's nothing in GlobalStateCache code example that could somehow guarantee that thread1 will ever notice this override. Even more, the code provided suggests nothing that would somehow impose happen-before relation of updates from different threads so one can even speculate that override may happen the other way around, you see?
the last but not the least, the wording the immutable state sounds rather fuzzy to me. I would say dangerously fuzzy given this tricky subject. The field state is mutable, it can be changed, well, by invoking method updateState right? From your code I would rather conclude that instances of MyImmutableState class are assumed to be immutable - at least that's what name tells me.
With all above said, what is guaranteed to be visible with the code you provided so far? Not much I'm afraid... but maybe better than nothing at all. The way I see it is...
For thread1, it is guaranteed that prior to invoking updateState it will see either null or properly constructed (valid) object updated from thread2. After the update, it is guaranteed to see either of properly constructed (valid) objects updated from thread1 or thread2. Note after this update thread1 is guaranteed not to see null per the very JLS 17.4.5 I refer to above ("...x and y are actions of the same thread and x comes before y in program order...")
For thread2, guarantees are pretty similar to above.
Essentially, all that is guaranteed with the code you provided is that both threads will see either null or one of properly constructed (valid) instances of MyImmutableState class.
Above guarantees may look insignificant at the first glance, but if you skim one page above the one with quote that confused you ("Immutable objects can be used safely etc..."), you'll find an example worth deeper drilling into in 3.5.1. Improper Publication: When Good Objects Go Bad.
Yeah object being immutable alone won't guarantee its visibility but it at least will guarantee that the object won't "explode from inside", like in example provided in 3.5.1:
public class Holder {
private int n;
public Holder(int n) { this.n = n; }
public void assertSanity() {
if (n != n)
throw new AssertionError("This statement is false.");
}
}
Goetz comments for above code begin at explaining issues true for both mutable and immutable objects, ...we say the Holder was not properly published. Two things can go wrong with improperly published objects. Other threads could see a stale value for the holder field, and thus see a null reference or other older value even though a value has been placed in holder...
...then he dives into what can happen if object is mutable, ...But far worse, other threads could see an up-todate value for the holder reference, but stale values for the state of the Holder. To make things even less predictable, a thread may see a stale value the first time it reads a field and then a more up-to-date value the next time, which is why assertSanity can throw AssertionError.
Above "AssertionHorror" may sound counter-intuitive but all the magic goes away if you consider scenario like below (completely legal per Java 5 memory model - and for a good reason btw):
thread1 invokes sharedHolderReference = Holder(42);
thread1 first fills n field with default value (0) then is going to assign it within constructor but...
...but scheduler switches to thread2,
sharedHolderReference from thread1 becomes visible to thread2 because, say because why not? maybe optimizing hot-spot compiler decided it's a good time for that
thread2 reads the up-todate sharedHolderReference with field value still being 0 btw
thread2 invokes sharedHolderReference.assertSanity()
thread2 reads the left side value of if statement within assertSanity which is, well, 0 then it is going to read the right side value but...
...but scheduler switches back to thread1,
thread1 completes the constructor assignment suspended at step #2 above by setting n field value 42
value 42 in the field n from thread1 becomes visible to thread2 because, say because why not? maybe optimizing hot-spot compiler decided it's a good time for that
then, at some moment later, scheduler switches back to thread2
thread2 proceeds from where it was suspended at step #6 above, ie it reads right-hand side of if statement, which is, well, 42 now
oops our innocent if (n != n) suddenly turns into if (0 != 42) which...
...naturally throws AssertionError
As far as I understand, initialization safety for immutable objects just guarantees that above won't happen - no more... and no less
I think the key is to distinguish between objects and references.
The immutable objects are safe to publish, so any thread can publish object, and if any other thread reads a reference to such object - it can safely use the object. Of course, reader thread will see the immutable object state that was published at the moment the thread read the reference, it will not see any updates, until it reads the reference again.
It is very useful in many situations. E.g. if there is a single publisher, and many readers - and readers need to see a consistent state. The readers periodically read the reference, and work on the obtained state - it is guaranteed to be consistent, and it does not require any locking on reader thread. Also when it is OK to loose some updates, e.g. you don't care which thread updates the state.
If I understand your question, immutability doesn't seem to be relevant here. You're just asking whether threads will see updates to shared objects.
[Edit] after an exchange of comments, I now see that you need also to hold a reference to your shared singleton state while doing some actions, and then setting the state to reflect that action.
The good news, as before, is that providing this will of necessity also solve your memory consistency issue.
Instead of defining separate synchronized getState and updateState methods, you'll have to perform all three actions without being interrupted: getState, yourAction, and updateState.
I can see three ways to do it:
1) Do all three steps inside a single synchronized method in GlobalStateCache. Define an atomic doActionAndUpdateState method in GlobalStateCache, synchronized of course on your state singleton, which would take a functor object to do your action.
2) Do getState and updateState as separate calls, and change updateState so that it checks to be sure state hasn't changed since the get. Define getState and checkAndUpdateState in GlobalStateCache. checkAndUpdateState will take the original state caller got from getState, and must be able to check if state has changed since your get. If it has changed, you'll need to do something to let caller know they potentially need to revert their action (depends on your use case).
3) Define a getStateWithLock method in GlobalStateCache. This implies that you'll also need to assure callers release their lock. I'd create an explicit releaseStateLock method, and have your updateState method call it.
Of these, I advise against #3, because it leaves you vulnerable to leaving that state locked in the event of some kinds of bugs. I'd also advise (though less strongly) against #2, because of the complexity it creates with what happens in the event that the state has changed: do you just abandon the action? Do you retry it? Must it be (can it be) reverted? I'm for #1: a single synchronized atomic method, which will look something like this:
public interface DimitarActionFunctor {
public void performAction();
}
GlobalStateCache {
private MyImmutableState state = MyImmutableState.newInstance(null, null);
public MyImmutableState getState {
synchronized(state) {
return state;
}
}
public void doActionAndUpdateState(DimitarActionFunctor functor, State currentState, Args newArgs){
synchronized(state) {
functor.performAction();
state = MyImmutableState.newInstance(currentState, newArgs);
}
}
}
}
Caller then constructs a functor for the action (an instance of DimitarActionFunctor), and calls doActionAndUpdateState. Of course, if the actions need data, you'll have to define your functor interface to take that data as arguments.
Again, I point you to this question, not for the actual difference, but for how they both work in terms of memory consistency: Difference between volatile and synchronized in Java
So much depends on the actual use case here that it's hard to make a recommendation, but it looks like you want some sort of Compare-And-Set semantics for the GlobalStateCache, using a java.util.concurrent.atomic.AtomicReference.
public class GlobalStateCache {
AtomicReference<MyImmutableState> atomic = new AtomicReference<MyImmutableState>(MyImmutableState.newInstance(null, null);
public State getState()
{
return atomic.get();
}
public void updateState( State currentState, Args newArgs )
{
State s = currentState;
while ( !atomic.compareAndSet( s, MyImmutableState.newInstance( s, newArgs ) ) )
{
s = atomic.get();
}
}
}
This, of course, depends on the expense of potentially creating a few extra MyImmutableState objects, and whether you need to re-run myMethod(state) if the state has been updated underneath, but the concept should be correct.
Answering you "main" question: no Thread2 will not see the change. Immutable objects do not change :-)
So if Thread1 read state A and then Thread2 stores state B, Thread1 should read the variable again to see the changes.
Visibily of variables is affected by volatile keyword. If variable is declared as volatile then Java guarantees that if one thread updates the variable all other threads will see the change immediately (at the cost of speed).
Still immutable objects are very useful in multithreaded environments. I will give you an example how I used it once. Lets say you have an object that is periodically changed (life field in my case) by one thread and it is somehow processed by other threads (my program was sending it to clients over the network). These threads fail if the object is changed in the middle of processing (they send inconsistent life field state). If you make this object immutable and will create a new instance every time it changes, you don't have to write any synchronization at all. Updating thread will periodically publish new versions of an object and every time other threads read it they will have most recent version of it and can safely process it. This particular example saves time spent on synchronization but wastes more memory. This should give you a general understanding when you can use them.
Another link I found: http://download.oracle.com/javase/tutorial/essential/concurrency/immutable.html
Edit (answer comment):
I will explain my task. I had to write a network server that will send clients the most recent life field and will constantly update it. With design mentioned above, I have two types of threads:
Thread that updates object that represents life field. It is immutable, so actually it creates a new instance every time and only changes reference. The fact that reference is declared volatile is crucial.
Every client is served with its own thread. When client requests life field, it reads the reference once and starts sending. Because network connection can be slow, life field can be updated many times while this thread sends data. The fact that object is immutable guarantees that server will send consistent state of life field. In the comments you are concerned about changes made while this thread processes the object. Yes, when client receives data it may not be up to date but there is nothing you can do about it. This is not synchronization issue but rather a slow connection issue.
I am not stating that immutable objects can solve all of your concurrency problems. This is obviously not true and you point this out. I am trying to explain you where it actually can solve problems. I hope my example is clear now.
I just want to check if my understanding of the JMM's thread start synchronization rule is correct:
Does the following Java program must print "num:1 m_i:2 " just because of the following synchronization order.
http://java.sun.com/docs/books/jls/third_edition/html/memory.html#17.4.4
An action that starts a thread synchronizes-with the first action in the thread it starts.
public class ThreadHappenBefore {
static int num;
int m_i;
public static void main(String[] args) {
final ThreadHappenBefore hb = new ThreadHappenBefore();
num = 1;
hb.m_i = 2;
new Thread(new Runnable() {
#Override
public void run() {
System.out.println("num:"+num);
System.out.println("m_i:"+hb.m_i);
}
}).start();
}
}
Anything coded before other code is guaranteed to happen before the other code in a given thread when the earlier code has an effect on the later code. Because the thread start is coded after the assignments, and the assignments affect the outcome of the print statements, these assignments are "visible" (ie happened before) the the code that prints them.
However, there is no such guarantee made to the effects of order of execution when viewed from another thread.
EDITED (thanks to commenters)
Added a refinement (in bold) to the above regarding reordering.
I don't think u have a synchronize issue here
you will have the result u expect because of the code order.
synchronize issue can occur if you have several threads referring to same source
Since so far nobody except Stephen C in a comment has posted the correct answer: The creation of a thread (the start() not the new Thread()!) ensures a happens-before relationship.
If this was not the case, the JVM could easily reorder the instructions as long as it can guarantee that the own thread wouldn't see a difference.
You can be pretty sure that in every language with a sensible multithreaded memory model that there will be some memory barrier for this case, because otherwise you couldn't reliably pass parameters to a thread.
No, it prints the right thing because of the "happens-before" relation in section 17.4.5. If you look at the discussion below the section, it is also clearly stated:
A call to start() on a thread happens-before any actions in the
started thread.
and furthermore:
If x and y are actions of the same thread and x comes before y in
program order, then hb(x, y).
Therefore the two assignments are guaranteed to have happened before the read in the thread.
You have no synchronization in your example.
What 'synchronization order'? There is no synchronization here. This is just normal excecution order. All the assignments occur before the thread even starts. I don't know why this is considered mysterious.