I am writing some code where
class A {
    Integer x;
    String y;
}
I created an object of A and I am passing it to 2 runnable threads. First thread updates value x, while second one updates value y.
Is there any scenario where this can break? I mean, can there be a race condition if there are two threads updating different variables of the same object?
No, it will work fine. As long as any given variable is only updated by one thread (with some conditions on how that variable is read by other threads), you will be okay.
It may not be the most comprehensible design, depending on what you are doing. Also, as I alluded to above, don't count on reliably reading those variables from another thread; if you want that, look into atomic objects or volatile. (Atomic will be quicker for writes from multiple threads; volatile may still be better if you are writing from one thread and only reading from others.)
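For example, a minimal sketch (my own, using atomic wrappers in place of the plain fields from the question) in which reads from any thread see the latest written values:
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;

class A {
    final AtomicInteger x = new AtomicInteger();                // updated by thread 1
    final AtomicReference<String> y = new AtomicReference<>();  // updated by thread 2
}
Thread 1 would call a.x.set(1), thread 2 would call a.y.set("..."), and any thread can safely read a.x.get() or a.y.get().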
Multithreading is not only about race conditions; it is also about memory visibility, as formulated by JSR-133, which you absolutely have to learn if you want to understand concurrency in Java.
There are many ways in which your code can break. One obvious example: if you create your object in thread A, change x in thread B and y in thread C, you might never see those changes in thread A.
Some code to illustrate:
A a = new A();
a.x = 0;

Thread t1 = new Thread(() -> {
    while (true) {
        a.x = 1;
    }
});

Thread t2 = new Thread(() -> {
    a.y = "a";
});

t1.start();
t2.start();

while (a.x == 0) {
}
System.out.println("might never get here");
One possible fix is to make x and y volatile:
class A {
    volatile Integer x;
    volatile String y;
}
This will ensure that all threads see changes to x and y but you still need to make sure that the instance of A is safely published.
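A minimal sketch of one safe-publication idiom (the Publisher class and its method names are mine, not from the post): write the fully constructed instance to a volatile field and have the other thread read it from there.
class Publisher {
    // writing this volatile field after fully constructing A, and reading it
    // in another thread, establishes a happens-before edge (safe publication)
    static volatile A shared;

    static void publish() {
        A a = new A();
        a.x = 0;
        shared = a;              // volatile write publishes the fully built object
    }

    static void consume() {
        A seen = shared;         // volatile read
        if (seen != null) {
            System.out.println(seen.x);  // sees at least the writes made before publication
        }
    }
}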
Let's say I have the following code in Java:
public class SynchronizedCounter {
    private int c = 0;

    public synchronized void increment() {
        c++;
    }
}
And I create two threads T1 and T2
Thread T1 = new Thread(c1);
Thread T2 = new Thread(c2);
Where c1 and c2 are two different instances of the class SynchronizedCounter.
Is it really necessary to synchronize the method increment()? I know that when we use a synchronized method, the thread holds a lock on the object, so other threads cannot acquire the lock on the same object, but threads "associated" with other objects can execute that method without problems. Now, since I have only one thread associated with the object c1, is it still necessary to use a synchronized method? Even if no other thread associated with the same object exists?
In your specific example, synchronized is not needed because each thread has its own instance of the class, so there is no data "sharing" between them.
If you change your example to:
Thread T1 = new Thread(c);
Thread T2 = new Thread(c);
Then you need to synchronize the method because the ++ operation is not atomic and the instance is shared between threads.
The bottom line is that your class is not thread safe without synchronized. If you never use a single instance across threads it doesn't matter. There are plenty of legitimate use cases for classes which are not thread safe. But as soon as you start sharing them between threads all bets are off (i.e. vicious bugs may appear randomly).
The given code/example does not need synchronization, since it uses two distinct instances (and so two distinct variables). But if you have one instance shared between two or more threads, synchronization is needed, despite comments stating otherwise.
Actually, it is very simple to write a program that shows the behavior. Starting from the posted class, I removed synchronized and added code to call the method from two threads:
public class SynchronizedCounter {
    private int c = 0;

    public void increment() {
        c++;
    }

    public static void main(String... args) throws Exception {
        var counter = new SynchronizedCounter();
        var t1 = create(100_000, counter);
        var t2 = create(100_000, counter);
        t1.start();
        t2.start();
        // wait termination of both threads
        t1.join();
        t2.join();
        System.out.println(counter.c);
    }

    private static Thread create(int count, SynchronizedCounter counter) {
        return new Thread(() -> {
            for (var i = 0; i < count; i++) {
                counter.increment();
            }
            System.out.println(counter.c);
        });
    }
}
Eventually (often?) this will result in weird numbers like:
C:\TMP>java SynchronizedCounter.java
122948
136644
136644
Add synchronized back and the output should always end with 200000:
C:\TMP>java SynchronizedCounter.java
170134
200000
200000
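For reference, the synchronized variant is simply the method from the originally posted class:
public synchronized void increment() {
    c++;   // the read-modify-write now runs while holding the instance's monitor
}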
Apparently the posted code is not complete: the incremented variable is private and there is no method to retrieve the incremented value, so it is impossible to really know whether the method must be synchronized or not.
The Java memory model guarantees a happens-before relationship between an object's construction and finalizer:
There is a happens-before edge from the end of a constructor of an
object to the start of a finalizer (§12.6) for that object.
As well as the constructor and the initialization of final fields:
An object is considered to be completely initialized when its
constructor finishes. A thread that can only see a reference to an
object after that object has been completely initialized is guaranteed
to see the correctly initialized values for that object's final
fields.
There's also a guarantee about volatile fields, since there's a happens-before relation with regard to all accesses to such fields:
A write to a volatile field (§8.3.1.4) happens-before every subsequent
read of that field.
But what about regular, good old non-volatile fields? I've seen a lot of multi-threaded code that doesn't bother creating any sort of memory barrier after object construction with non-volatile fields. But I've never seen or heard of any issues because of it and I wasn't able to recreate such partial construction myself.
Do modern JVMs just put memory barriers after construction? Avoid reordering around construction? Or was I just lucky? If it's the latter, is it possible to write code that reproduces partial construction at will?
Edit:
To clarify, I'm talking about the following situation. Say we have a class:
public class Foo {
    public int bar = 0;

    public Foo() {
        this.bar = 5;
    }
    ...
}
And some Thread T1 instantiates a new Foo instance:
Foo myFoo = new Foo();
Then passes the instance to some other thread, which we'll call T2:
Thread t = new Thread(() -> {
    if (myFoo.bar == 5) {
        ....
    }
});
t.start();
T1 performed two writes that are interesting to us:
1) T1 wrote the value 5 to bar of the newly instantiated Foo
2) T1 wrote the reference to the newly created object to the myFoo variable
For T1, we get a guarantee that write #1 happened-before write #2:
Each action in a thread happens-before every action in that thread
that comes later in the program's order.
But as far as T2 is concerned, the Java memory model offers no such guarantee. Nothing prevents it from seeing the writes in the opposite order. So it could see a fully built Foo object, but with the bar field still equal to 0.
Edit2:
I took a second look at the example above a few months after writing it. That code is actually guaranteed to work correctly, since T2 is started after T1's writes (Thread.start() establishes a happens-before edge). That makes it an incorrect example for the question I wanted to ask. The fix is to assume that T2 is already running when T1 performs the writes. Say T2 is reading myFoo in a loop, like so:
static Foo myFoo = null;   // must be a field (not a local) so the lambda can see the later assignment

Thread t2 = new Thread(() -> {
    for (;;) {
        if (myFoo != null && myFoo.bar == 5) {
            ...
        }
        ...
    }
});
t2.start();

myFoo = new Foo(); // the creation of Foo happens after t2 is already running
Taking your example as the question itself, the answer would be: yes, that is entirely possible. Without further action, the initialized fields are only guaranteed to be visible to the constructing thread, like you quoted. Making them visible to other threads is what safe publication is about (but I bet you already knew about this).
The reason you are not seeing that via experimentation is that, AFAIK, x86 (being a strong memory model) does not re-order stores anyway, so unless the JIT re-orders the stores that T1 did, you can't observe it. But that is playing with fire, literally; see this question and its follow-up (it's close to the same) from a guy who (not sure if true) lost 12 million worth of equipment.
The JLS guarantees only a few ways to achieve this visibility. And note that it is not the other way around: the JLS will not say when this would break, it will say when it will work.
1) Final field semantics (sketched after this list). Notice how the example shows that each field has to be final, even if under the current implementation a single one would suffice; when finals are used, two memory barriers are inserted after the constructor: LoadStore and StoreStore.
2) volatile fields (and implicitly AtomicXXX); I think this one does not need any explanation, and it seems you already quoted it.
3) Static initializers; this kind of should be obvious IMO.
4) Some locking involved; this should be obvious too, the happens-before rule...
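As a minimal sketch of point 1, assuming the Foo class from the question: making bar final guarantees that any thread that obtains a reference to the object, even through a data race, sees the constructor's write.
class Foo {
    final int bar;   // final field semantics: the value written in the constructor
                     // is visible to any thread that later sees a reference to this Foo

    Foo() {
        this.bar = 5;
    }
}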
> But anecdotal evidence suggests that it doesn't happen in practice
To see this issue, you have to avoid using any memory barriers. E.g. using a thread-safe collection of any kind, or even a System.out.println, can prevent the problem from occurring.
I have seen a problem with this before, though a simple test I just wrote for Java 8 update 161 on x64 didn't show it.
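For reference, a sketch of the kind of test meant here (my own reconstruction, not the author's actual code): publish objects through a plain, non-volatile field from one thread and poll them from another, flagging any read of a default field value. On x86 it may well never print anything.
// Racy-publication probe: may or may not ever report a failure,
// depending on the JIT and the hardware memory model.
public class UnsafePublicationTest {
    static class Box { int value; Box() { value = 42; } }

    static Box shared;   // deliberately NOT volatile

    public static void main(String[] args) {
        new Thread(() -> {
            while (true) {
                Box b = shared;
                if (b != null && b.value != 42) {
                    System.out.println("saw a partially constructed object!");
                }
            }
        }).start();

        while (true) {
            shared = new Box();   // racy publication
        }
    }
}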
It seems there is no synchronization during object construction. The JLS doesn't permit it (a constructor cannot be declared synchronized), nor was I able to produce any sign of it in code. It is, however, possible to demonstrate the contrast.
Running the following code:
public class Main {
public static void main(String[] args) throws Exception {
new Thread(() -> {
while(true) {
new Demo(1, 2);
}
}).start();
}
}
class Demo {
int d1, d2;
Demo(int d1, int d2) {
this.d1 = d1;
new Thread(() -> System.out.println(Demo.this.d1+" "+Demo.this.d2)).start();
try {
Thread.sleep(500);
} catch(InterruptedException e) {
e.printStackTrace();
}
this.d2 = d2;
}
}
The output would continuously show 1 0, proving that the created thread was able to access data of a partially created object.
However, if we synchronized this:
Demo(int d1, int d2) {
    synchronized (Demo.class) {
        this.d1 = d1;
        new Thread(() -> {
            synchronized (Demo.class) {
                System.out.println(Demo.this.d1 + " " + Demo.this.d2);
            }
        }).start();
        try {
            Thread.sleep(500);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        this.d2 = d2;
    }
}
The output is 1 2, showing that the newly created thread will in fact wait for the lock, as opposed to the unsynchronized example.
Related: Why can't constructors be synchronized?
I've got a few questions about threads in Java. Here is the code:
TestingThread class:
public class TestingThread implements Runnable {

    Thread t;
    volatile boolean pause = true;
    String msg;

    public TestingThread() {
        t = new Thread(this, "Testing thread");
    }

    public void run() {
        while (pause) {
            //wait
        }
        System.out.println(msg);
    }

    public boolean isPause() {
        return pause;
    }

    public void initMsg() {
        msg = "Thread death";
    }

    public void setPause(boolean pause) {
        this.pause = pause;
    }

    public void start() {
        t.start();
    }
}
And main thread class:
public class Main {

    public static void main(String[] args) {
        TestingThread testingThread = new TestingThread();

        testingThread.start();
        testingThread.initMsg();
        testingThread.setPause(false);
    }
}
Question list:
Should t be volatile?
Should msg be volatile?
Should setPause() be synchronized?
Is this a good example of good thread structure?
You have hit quite a subtlety with your question number 2.
In your very specific case, you:
first write msg from the main thread;
then write the volatile pause from the main thread;
then read the volatile pause from the child thread;
then read msg from the child thread.
Therefore you have transitively established a happens-before relationship between the write and the read of msg. Therefore msg itself does not have to be volatile.
In real-life code, however, you should avoid depending on such subtle behavior: better overapply volatile and sleep calmly.
Here are some relevant quotes from the Java Language Specification:
If x and y are actions of the same thread and x comes before y in program order, then hb(x, y).
If an action x synchronizes-with a following action y, then we also have hb(x, y).
Note that, in my list of actions,
1 comes before 2 in program order;
same for 3 and 4;
2 synchronizes-with 3.
As for your other questions,
ad 1: t doesn't have to be volatile because it's written to prior to thread creation and never mutated later. Starting a thread induces a happens-before on its own;
ad 3: setPause does not have to be synchronized because all it does is set the volatile var.
> Should msg be volatile?
Yes. Does it have to be in this example? No. But I urge you to use it anyway, as the code's correctness becomes much clearer ;) Please note that I am assuming that we are discussing Java 5 or later; before then volatile was broken anyway.
The tricky part to understand is why this example can get away without msg being declared as volatile.
Consider this part of main(), in order:
testingThread.start();          // starts the other thread

testingThread.initMsg();        // the other thread may or may not be in the
                                // while loop by now; msg may or may not be
                                // visible to the testingThread yet.
                                // The important thing to note is that either way
                                // testingThread cannot leave its while loop yet

testingThread.setPause(false);  // after this volatile write, all data written earlier
                                // will be visible to other threads.
                                // Thus even though msg was not declared volatile,
                                // it piggybacks on pause's use of volatile,
                                // as described in the JSR-133 FAQ:
                                // http://www.cs.umd.edu/~pugh/java/memoryModel/jsr-133-faq.html#volatile
                                // And now the testingThread can leave its while loop
So if we now consider the testingThread:
while (pause) {             // pause is volatile, so it will see the change as soon
                            // as it is made
    //wait
}
System.out.println(msg);    // this line cannot be reached until the value
                            // of pause has been set to false by the main
                            // method, which under the post-Java-5 semantics
                            // guarantees that msg will have been updated too
> Should t be volatile?
It does not matter, but I would suggest making it private final.
> Should setPause() be synchronized?
Before Java 5, yes. Since Java 5, reading a volatile has the same memory-barrier effect as entering a synchronized block, and writing to a volatile has the same memory-barrier effect as leaving one. Thus, unless you need the scoping of a synchronized block, which in this case you do not, you are fine with volatile.
The changes to volatile in Java 5 are documented by the author of the change here.
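To make that equivalence concrete, here is a sketch of what a synchronized version of the pause flag could look like (a hypothetical alternative, not the posted code); note that the run() loop would then also have to call isPause() instead of reading the field directly:
private boolean pause = true;   // no longer volatile

public synchronized void setPause(boolean pause) {
    this.pause = pause;         // releasing the monitor publishes the write
}

public synchronized boolean isPause() {
    return pause;               // acquiring the monitor sees the latest write
}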
1 & 2: volatile can be treated somewhat like "synchronization on a variable"; the mechanism is different, but the result is alike, in that it makes sure reads of the variable are consistent.
3: I feel it does not need to be, since this.pause = pause is an atomic write.
4: It is a bad idea to do any while (true) { /* do nothing */ } loop, as that results in busy waiting; putting a Thread.sleep inside may help just a little bit. Please refer to http://en.wikipedia.org/wiki/Busy_waiting
A more appropriate way to do something like "wait until woken up" is to use the monitor built into every Java Object (a sketch follows below), or a Condition object along with a Lock. You may want to refer to http://docs.oracle.com/javase/7/docs/api/java/util/concurrent/locks/Condition.html
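A sketch of what that could look like with the built-in monitor methods (one possible rewrite of the pause flag; the method name awaitUnpause is mine):
private boolean pause = true;

public synchronized void awaitUnpause() throws InterruptedException {
    while (pause) {
        wait();          // releases the monitor and blocks until notifyAll()
    }
}

public synchronized void setPause(boolean pause) {
    this.pause = pause;
    notifyAll();         // wakes any thread blocked in awaitUnpause()
}
run() would then call awaitUnpause() before printing msg, instead of spinning on the flag.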
Also, I don't think it is a good idea to keep a Thread field inside your custom Runnable either. Please refer to Seelenvirtuose's comment.
Is it okay to synchronize all methods which mutate the state of an object, but not synchronize anything which is atomic? In this case, just returning a field?
Consider:
public class A
{
    private int a = 5;
    private static final Object lock = new Object();

    public void incrementA()
    {
        synchronized (lock)
        {
            a += 1;
        }
    }

    public int getA()
    {
        return a;
    }
}
I've heard people argue that it's possible for getA() and incrementA() to be called at roughly the same time and have getA() return the wrong thing. However it seems like, in the case that they're called at the same time, even if the getter is synchronized you can get the wrong thing. In fact the "right thing" doesn't even seem defined if these are called concurrently. The big thing for me is that the state remains consistent.
I've also heard talk about JIT optimizations. Given an instance of the above class and the following code (the code depends on a being set in another thread):
while (myA.getA() < 10)
{
    //incrementA is not called here
}
it is apparently a legal JIT optimization to change this to:
int temp = myA.getA();
while (temp < 10)
{
    //incrementA is not called here
}
which can obviously result in an infinite loop.
Why is this a legal optimization? Would this be illegal if a was volatile?
Update
I did a little bit of testing into this.
public class Test
{
    private int a = 5;
    private static final Object lock = new Object();

    public void incrementA()
    {
        synchronized (lock)
        {
            a += 1;
        }
    }

    public int getA()
    {
        return a;
    }

    public static void main(String[] args)
    {
        final Test myA = new Test();

        Thread t = new Thread(new Runnable() {
            public void run() {
                while (true)
                {
                    try {
                        Thread.sleep(100);
                    } catch (InterruptedException e) {
                        // TODO Auto-generated catch block
                        e.printStackTrace();
                    }
                    myA.incrementA();
                }
            }
        });
        t.start();

        while (myA.getA() < 15)
        {
            System.out.println(myA.getA());
        }
    }
}
Using several different sleep times, this worked even when a is not volatile. This of course isn't conclusive, it still may be legal. Does anyone have some examples that could trigger such JIT behaviour?
Is it okay to synchronize all methods which mutate the state of an object, but not synchronize anything which is atomic? In this case, just returning a field?
Depends on the particulars. It is important to realize that synchronization does two important things. It is not just about atomicity; it is also required because of memory synchronization. If one thread updates the a field, then other threads may not see the update because of memory caching on the local processor. Making the int field a volatile solves this problem. Making both the get and the set methods synchronized will as well, but it is more expensive.
If you want to be able to change and read a from multiple threads, the best mechanism is to use an AtomicInteger.
private AtomicInteger a = new AtomicInteger(5);

public void setA(int a) {
    // no need to synchronize because of the magic of the AtomicInteger code
    this.a.set(a);
}

public int getA() {
    // AtomicInteger also takes care of the memory synchronization
    return a.get();
}
I've heard people argue that it's possible for getA() and setA() to be called at roughly the same time and have getA() return the wrong thing.
This is true but you can get the wrong value if getA() is called after setA() as well. A bad cache value can stick forever.
which can obviously result in an infinite loop. Why is this a legal optimization?
It is a legal optimization because threads running asynchronously with their own memory cache is one of the important reasons why you see performance improvements with them. If all memory accesses were synchronized with main memory, then the per-CPU memory caches would not be used and threaded programs would run a lot slower.
Would this be illegal if a was volatile?
It is not legal if there is some way for a to be altered, possibly by another thread. If a were final then the JIT could make that optimization. If a were volatile, or the get method were marked as synchronized, then it would certainly not be a legal optimization.
It's not thread safe because that getter does not ensure that a thread will see the latest value, as the value may be stale. Having the getter be synchronized ensures that any thread calling the getter will see the latest value instead of a possible stale one.
You basically have two options:
1) Make your int volatile
2) Use an atomic type like AtomicInteger
using a normal int without synchronization is not thread safe at all.
Your best solution is to use an AtomicInteger, they were basically designed for exactly this use case.
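A minimal sketch of that, keeping the names from the question:
import java.util.concurrent.atomic.AtomicInteger;

public class A {
    private final AtomicInteger a = new AtomicInteger(5);

    public void incrementA() {
        a.incrementAndGet();   // atomic read-modify-write, no lock needed
    }

    public int getA() {
        return a.get();        // always sees the most recent increment
    }
}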
If this is more of a theoretical "could this be done" question, I think something like the following would be safe (but still not perform as well as an AtomicInteger):
public class A
{
    private volatile int a = 5;
    private static final Object lock = new Object();

    public void incrementA()
    {
        synchronized (lock)
        {
            final int tmp = a + 1;
            a = tmp;
        }
    }

    public int getA()
    {
        return a;
    }
}
The short answer is your example will be thread-safe, if
the variable is declared as volatile, or
the getter is declared as synchronized.
The reason that your example class A is not thread-safe is that one can create a program using it that doesn't have a "well-formed execution" (see JLS 17.4.7).
For instance, consider
// in thread #1
int a1 = A.getA();
Thread.sleep(...);
int a2 = A.getA();
if (a1 == a2) {
    System.out.println("no increment");
}

// in thread #2
A.incrementA();
in the scenario that the increment happens during the sleep.
For this execution to be well-formed, there must be a "happens before" (HB) chain between the assignment to a in incrementA called by thread #2, and the subsequent read of a in getA called by thread #1.
If the two threads synchronize using the same lock object, then there is a HB between one thread releasing the lock and a second thread acquiring the lock. So we get this:
thread #2 acquires lock --HB-->
thread #2 reads a --HB-->
thread #2 writes a --HB-->
thread #2 releases lock --HB-->
thread #1 acquires lock --HB-->
thread #1 reads a
If two threads share a volatile variable, there is a HB between any write and any subsequent read (without an intervening write). So we typically get this:
thread #2 acquires lock --HB-->
thread #2 reads a --HB-->
thread #2 writes a --HB-->
thread #1 reads a
Note that incrementA needs to be synchronized to avoid race conditions with other threads calling incrementA.
If neither of the above is true, we get this:
thread #2 acquires lock --HB-->
thread #2 reads a --HB-->
thread #2 writes a // No HB!!
thread #1 reads a
Since there is no HB between the write by thread #2 and the subsequent read by thread #1, the JLS does not guarantee that the latter will see the value written by the former.
Note that this is a simplified version of the rules. For the complete version, you need to read all of JLS Chapter 17.
I've read that "volatile" in Java allows different threads to have access to the same field and see changes the other threads has made to that field. If that's the case, I'd predict that when the first and second thread have completely run, the value of "d" will be incremented to 4. But instead, each thread increments "d" to a value of 2.
public class VolatileExample extends Thread {
    private int countDown = 2;
    private volatile int d = 0;

    public VolatileExample(String name) {
        super(name);
        start();
    }

    public String toString() {
        return super.getName() + ": countDown " + countDown;
    }

    public void run() {
        while (true) {
            d = d + 1;
            System.out.println(this + ". Value of d is " + d);
            if (--countDown == 0) return;
        }
    }

    public static void main(String[] args) {
        new VolatileExample("first thread");
        new VolatileExample("second thread");
    }
}
The results from running this program are:
first thread: countDown 2. Value of d is 1
second thread: countDown 2. Value of d is 1
first thread: countDown 1. Value of d is 2
second thread: countDown 1. Value of d is 2
I understand that if I add the keyword "static" to the program (that is, "private static volatile int d = 0;"), "d" would be incremented to 4.
And I know that's because d would become a variable that the whole class shares, rather than each instance getting its own copy.
The results look like:
first thread: countDown 2. Value of d is 1
first thread: countDown 1. Value of d is 3
second thread: countDown 2. Value of d is 2
second thread: countDown 1. Value of d is 4
My question is, why doesn't "private volatile int d = 0; " yield similar results if volatile is supposed to allow the sharing of "d" between the two threads? That is, if the first thread updates the value of d to 1, then why doesn't the second thread grab the value of d as 1 rather than as zero?
volatile doesn't "allow the sharing" of anything. It just prevents the variable from being cached thread local, so that changes to the variables value occur immediately. Your d variable is an instance variable and is thus owned by the instance that holds it. You'll want to re-read the threading tutorials to re-align your assumptions.
One decent reference is here
There are a couple of misunderstandings here. You seem not to properly understand what a thread is, what an instance field is and what a static field is.
An instance field is a memory location that gets allocated once you instantiate a class (ie, a memory location gets allocated for the field d when you do VolatileExample v = new VolatileExample()). To reference that memory location from within the class, you write this.d (then you can write to and read from that memory location). To reference that memory location from outside the class, it must be accessible (ie, not private), and then you'd write v.d. As you can see, each instance of a class gets its own memory location for its own field d. So, if you have 2 different instances of VolatileExample, each will have its own, independent field d.
A static field is a memory location that gets allocated once a class is initialized (which, forgetting about the possibility of using multiple ClassLoaders, happens exactly once). So, you can think that a static field is some kind of global variable. To reference that memory location, you'd use VolatileExample.d (accessibility also applies (ie, if it is private, it can only be done from within the class)).
Finally, a thread of execution is a sequence of steps that will be executed by the JVM. You must not think of a thread as a class, or an instance of the class Thread, it will only get you confused. It is as simple as that: a sequence of steps.
The main sequence of steps is what is defined in the main(...) method. It is that sequence of steps that the JVM will start executing when you launch your program.
If you want to start a new thread of execution to run simultaneously (ie, you want a separate sequence of steps to be run concurrently), in Java you do so by creating an instance of the class Thread and calling its start() method.
Let's modify your code a little bit so that it is easier to understand what is happening:
public class VolatileExample extends Thread {
    private int countDown = 2;
    private volatile int d = 0;

    public VolatileExample(String name) {
        super(name);
    }

    public String toString() {
        return super.getName() + ": countDown " + countDown;
    }

    public void run() {
        while (true) {
            d = d + 1;
            System.out.println(this + ". Value of d is " + d);
            if (--countDown == 0) return;
        }
    }

    public static void main(String[] args) {
        VolatileExample ve1 = new VolatileExample("first thread");
        ve1.start();
        VolatileExample ve2 = new VolatileExample("second thread");
        ve2.start();
    }
}
The line VolatileExample ve1 = new VolatileExample("first thread"); creates an instance of VolatileExample. This will allocate some memory locations: 4 bytes for countdown and 4 bytes for d. Then you start a new thread of execution: ve1.start();. This thread of execution will access (read from and write to) the memory locations described before in this paragraph.
The next line, VolatileExample ve2 = new VolatileExample("second thread"); creates another instance of VolatileExample, which will allocate 2 new memory locations: 4 bytes for ve2's countdown and 4 bytes for ve2's d. Then, you start a thread of execution, which will access THESE NEW memory locations, and not those described in the paragraph before this one.
Now, with or without volatile, you see that you have two different fields d : each thread operates on a different field. Therefore, it is unreasonable for you to expect that d would get incremented to 4, since there's no single d.
If you make d a static field, only then both threads would (supposedly) be operating on the same memory location. Only then volatile would come into play, since only then you'd be sharing a memory location between different threads.
If you make a field volatile, you are guaranteed that writes go straight to main memory and reads come straight from main memory (ie, they won't get cached in some extremely fast processor-local cache; the operations take longer, but are guaranteed to be visible to other threads).
It wouldn't, however, guarantee that you'd see the value 4 stored in d. That's because volatile solves the visibility problem, but not atomicity problems: an increment = read from main memory + operation on the value + write to main memory. As you can see, 2 different threads could each read the initial value (0), operate (locally) on it (obtaining 1), then write it to main memory (both would end up writing 1); the 2 increments would be perceived as only 1.
To solve this, you must make the increment an atomic operation. To do so, you'd need to either use a synchronization mechanism, such as a mutex (synchronized (...) { ... } or an explicit lock), or a class designed specifically for these things: AtomicInteger.
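Applied to the example from the question, a sketch (one shared static counter plus atomic increments, my own rewrite) that reliably drives d to 4 could look like this:
import java.util.concurrent.atomic.AtomicInteger;

public class VolatileExample extends Thread {
    // one counter shared by all instances; increments are atomic
    private static final AtomicInteger d = new AtomicInteger(0);
    private int countDown = 2;

    public VolatileExample(String name) {
        super(name);
    }

    public void run() {
        while (true) {
            int value = d.incrementAndGet();   // atomic: no lost updates
            System.out.println(getName() + ": countDown " + countDown + ". Value of d is " + value);
            if (--countDown == 0) return;
        }
    }

    public static void main(String[] args) {
        new VolatileExample("first thread").start();
        new VolatileExample("second thread").start();
    }
}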
volatile can make sharing safe (if atomicity of a single read or write operation is sufficient), but it doesn't cause sharing.
Note that if you make d static, it is actually unspecified what value d would have, because the statement d = d + 1 is not atomic, i.e. a thread may be interrupted between reading and writing d. A synchronized block, or an AtomicInteger are the typical solutions for that.