I have the following code snippet that I'm trying to see if it can crash/misbehave at some point. The HashMap is being called from multiple threads in which put is inside a synchronized block and get is not. Is there any issue with this code? If so, what modification I need to make to see that happens given that I only use put and get this way, and there is no putAll, clear or any operations involved.
import java.util.HashMap;
import java.util.Map;
public class Main {
Map<Integer, String> instanceMap = new HashMap<>();
public static void main(String[] args) {
System.out.println("Hello");
Main main = new Main();
Thread thread1 = new Thread("Thread 1"){
public void run(){
System.out.println("Thread 1 running");
for (int i = 0; i <= 100; i++) {
System.out.println("Thread 1 " + i + "-" + main.getVal(i));
}
}
};
thread1.start();
Thread thread2 = new Thread("Thread 2"){
public void run(){
System.out.println("Thread 2 running");
for (int i = 0; i <= 100; i++) {
System.out.println("Thread 2 " + i + "-" + main.getVal(i));
}
}
};
thread2.start();
}
private String getVal(int key) {
check(key);
return instanceMap.get(key);
}
private void check(int key) {
if (!instanceMap.containsKey(key)) {
synchronized (instanceMap) {
if (!instanceMap.containsKey(key)) {
// System.out.println(Thread.currentThread().getName());
instanceMap.put(key, "" + key);
}
}
}
}
}
What I have checked out:
Are size(), put(), remove(), get() atomic in Java synchronized HashMap?
Extending HashMap<K,V> and synchronizing only puts
Why does HashMap.get(key) needs to be synchronized when change operations are synchronized?
I somewhat modified your code:
removed System.out.println() from the "hot" loop, it is internally synchronized
increased the number of iterations
changed printing to only print when there's an unexpected value
There's much more we can do and try, but this already fails, so I stopped there. The next step would we to rewrite the whole thing to jcsctress.
And voila, as expected, sometimes this happens on my Intel MacBook Pro with Temurin 17:
Exception in thread "Thread 2" java.lang.NullPointerException: Cannot invoke "java.lang.Integer.intValue()" because the return value of "java.util.Map.get(Object)" is null
at com.gitlab.janecekpetr.playground.Playground.getVal(Playground.java:35)
at com.gitlab.janecekpetr.playground.Playground.lambda$0(Playground.java:21)
at java.base/java.lang.Thread.run(Thread.java:833)
Code:
private record Val(int index, int value) {}
private static final int MAX = 100_000;
private final Map<Integer, Integer> instanceMap = new HashMap<>();
public static void main(String... args) {
Playground main = new Playground();
Runnable runnable = () -> {
System.out.println(Thread.currentThread().getName() + " running");
Val[] vals = new Val[MAX];
for (int i = 0; i < MAX; i++) {
vals[i] = new Val(i, main.getVal(i));
}
System.out.println(Stream.of(vals).filter(val -> val.index() != val.value()).toList());
};
Thread thread1 = new Thread(runnable, "Thread 1");
thread1.start();
Thread thread2 = new Thread(runnable, "Thread 2");
thread2.start();
}
private int getVal(int key) {
check(key);
return instanceMap.get(key);
}
private void check(int key) {
if (!instanceMap.containsKey(key)) {
synchronized (instanceMap) {
if (!instanceMap.containsKey(key)) {
instanceMap.put(key, key);
}
}
}
}
To specifically explain the excellent sleuthing work in the answer by #PetrJaneček :
Every field in java has an evil coin attached to it. Anytime any thread reads the field, it flips this coin. It is not a fair coin - it is evil. It will flip heads 10,000 times in a row if that's going to ruin your day (for example, you may have code that depends on coinflips landing a certain way, or it'll fail to work. The coin is evil: You may run into the situation that just to ruin your day, during all your extensive testing, the coin flips heads, and during the first week in production it's all heads flips. And then the big new potential customer demos your app and the coin starts flipping some tails on you).
The coinflip decides which variant of the field is used - because every thread may or may not have a local cache of that field. When you write to a field from any thread? Coin is flipped, on tails, the local cache is updated and nothing more happens. Read from any thread? Coin is flipped. On tails, the local cache is used.
That's not really what happens of course (your JVM does not actually have evil coins nor is it out to get you), but the JMM (Java Memory Model), along with the realities of modern hardware, means that this abstraction works very well: It will reliably lead to the right answer when writing concurrent code, namely, that any field that is touched by more than one thread must have guards around it, or must never change at all during the entire duration of the multi-thread access 'session'.
You can force the JVM to flip the coin the way you want, by establishing so-called Happens Before relationships. This is explicit terminology used by the JMM. If 2 lines of code have a Happens-Before relationship (one is defined as 'happening before' the other, as per the JMM's list of HB relationship establishing actions), then it is not possible (short of a bug in the JVM itself) to observe any side effect of the HA line whilst not also observing all side effects of the HB line. (That is to say: the 'happens before' line happens before the 'happens after' line as far as your code could ever tell, though it's a bit of schrodiner's cat situation. If your code doesn't actually look at these files in a way that you'd ever be able to tell, then the JVM is free to not do that. And it won't, you can rely on the evil coin being evil: If the JMM takes a 'right', there will be some combination of CPU, OS, JVM release, version, and phase of the moon that combine to use it).
A small selection of common HB/HA establishing conditions:
The first line inside a synchronized(lock) block is HA relative to the hitting of that block in any other thread.
Exiting a synchronized(lock) block is HB relative to any other thread entering any synchronized(lock) block, assuming the two locks are the same reference.
thread.start() is HB relative to the first line that thread will run.
The 'natural' HB/HA: line X is HB relative to line Y if X and Y are run by the same thread and X is 'before it' in your code. You can't write x = 5; y = x; and have y be set by a version of x that did not witness the x = 5 happening (of course, if another thread is also modifying x, all bets are off unless you have HB/HA with whatever line is doing that).
writes and reads to volatile establish HB/HA but you usually can't get any guarantees about which direction.
This explains the way your code may fail: The get() call establishes absolutely no HB/HA relationship with the other thread that is calling put(), and therefore the get() call may or may not use locally cached variants of the various fields that HashMap uses internally, depending on the evil coin (which is of course hitting some fields; it'll be private fields in the HashMap implementation someplace, so you don't know which ones, but HashMap obviously has long-lived state, which implies fields are involved).
So why haven't you actually managed to 'see' your code asplode like the JMM says it will? Because the coin is EVIL. You cannot rely on this line of reasoning: "I wrote some code that should fail if the synchronizations I need aren't happening the way I want. I ran it a whole bunch of times, and it never failed, therefore, apparently this code is concurrency-safe and I can use this in my production code". That is simply not ever actually reliable. That's why you need to be thinking: Evil! That coin is out to get me! Because if you don't, you may be tempted to write test code like this.
You should be scared of writing code where more than one thread interacts with the same field. You should be bending over backwards to avoid it. Use message queues. Do your chat between threads by using databases, which have much nicer primitives for this stuff (transactions and isolation levels). Rewrite the code so that it takes a bunch of params up front and then runs without interacting with other threads via fields at all, until it is all done, and it then returns a result (and then use e.g. fork/join framework to make it all work). Make your webserver performant and using all the cores simply by relying on the fact that every incoming request will be its own thread, so the only thing that needs to happen for you to use all the cores is for that many folks to hit your server at the same time. If you don't have enough requests, great! Your server isn't busy so it doesn't matter you aren't using all the cores.
If truly you decide that interacting with the same field from multiple threads is the right answer, you need to think NASA programming mars rovers on the lines that interact with those fields, because tests simply cannot be relied upon. It's not as hard as it sounds - especially if you keep the actual interacting with the relevant fields down to a minimum and keep thinking: "Have I established HB/HA"?
In this case, I think Petr figured it out correctly: System.out.println is hella slow and does various synchronizing actions. JMM is a package deal, and commutative: Once HB/HA establishes, everything the HB line changed is observable to the code in the HA line, and add in the natural rule, which means all code that follows the HA line cannot possibly observe a universe where something any line before the HB line did is not yet visible. In other words, the System.out.println statements HB/HA with each other in some order, but you can't rely on that (System.out is not specced to synchronize. But, just about every implementation does. You should not rely on implementation details, and I can trivially write you some java code that is legal, compiles, runs, and breaks no contracts, because you can set System.out with System.setOut - that does not synchronize when interacting with System.out!). The evil coin in this case took the shape of 'accidental' synchronization via intentionally unspecced behaviour of System.out.
The following explanation is more in line with the terminology used in the JMM. Could be useful if you want a more solid understanding of this topic.
2 Actions are conflicting when they access the same address and there is at least 1 write.
2 Actions are concurrent when they are not ordered by a happens-before relation (there is no happens-before edge between them).
2 Actions are in data race when they are conflicting and concurrent.
When there are data races in your program, weird problems can happen like unexpected reordering of instructions, visibility problems, or atomicity problems.
So what makes up the happens-before relation. If a volatile read observes a particular volatile write, then there is a happens-before edge between the write and the read. This means that read will not only see that write, but everything that happened before that write. There are other sources of happens-before edges like the release of a monitor and subsequent acquire of the same monitor. And there is a happens-before edge between A, B when A occurs before B in the program order. Note: the happens-before relation is transitive, so if A happens-before B and B happens-before C, then A happens-before C.
In your case, you have a get/put operations which are conflicting since they access the same address(es) and there is at least 1 write.
The put/get action are concurrent, since is no happens-before edge between writing and reading because even though the write releases the monitor, the get doesn't acquire it.
Since the put/get operations are concurrent and conflicting, they are in data race.
The simplest way to fix this problem, is to execute the map.get in a synchronized block (using the same monitor). This will introduce the desired happens-before edge and makes the actions sequential instead of concurrent and as consequence, the data-race disappears.
A better-performing solution would be to make use of a ConcurrentHashMap. Instead of a single central lock, there are many locks and they can be acquired concurrently to improve scalability and performance. I'm not going to dig into the optimizations of the ConcurrentHashMap because would create confusion.
[Edit]
Apart from a data-race, your code also suffers from race conditions.
Related
I read the below program and answer in a blog.
int x = 0;
boolean bExit = false;
Thread 1 (not synchronized)
x = 1;
bExit = true;
Thread 2 (not synchronized)
if (bExit == true)
System.out.println("x=" + x);
is it possible for Thread 2 to print “x=0”?
Ans : Yes ( reason : Every thread has their own copy of variables. )
how do you fix it?
Ans: By using make both threads synchronized on a common mutex or make both variable volatile.
My doubt is : If we are making the 2 variable as volatile then the 2 threads will share the variables from the main memory. This make a sense, but in case of synchronization how it will be resolved as both the thread have their own copy of variables.
Please help me.
This is actually more complicated than it seems. There are several arcane things at work.
Caching
Saying "Every thread has their own copy of variables" is not exactly correct. Every thread may have their own copy of variables, and they may or may not flush these variables into the shared memory and/or read them from there, so the whole thing is non-deterministic. Moreover, the very term flushing is really implementation-dependent. There are strict terms such as memory consistency, happens-before order, and synchronization order.
Reordering
This one is even more arcane. This
x = 1;
bExit = true;
does not even guarantee that Thread 1 will first write 1 to x and then true to bExit. In fact, it does not even guarantee that any of these will happen at all. The compiler may optimize away some values if they are not used later. The compiler and CPU are also allowed to reorder instructions any way they want, provided that the outcome is indistinguishable from what would happen if everything was really in program order. That is, indistinguishable for the current thread! Nobody cares about other threads until...
Synchronization comes in
Synchronization does not only mean exclusive access to resources. It is also not just about preventing threads from interfering with each other. It's also about memory barriers. It can be roughly described as each synchronization block having invisible instructions at the entry and exit, the first one saying "read everything from the shared memory to be as up-to-date as possible" and the last one saying "now flush whatever you've been doing there to the shared memory". I say "roughly" because, again, the whole thing is an implementation detail. Memory barriers also restrict reordering: actions may still be reordered, but the results that appear in the shared memory after exiting the synchronized block must be identical to what would happen if everything was indeed in program order.
All that only works, of course, only if both blocks use the same locking object.
The whole thing is described in details in Chapter 17 of the JLS. In particular, what's important is the so-called "happens-before order". If you ever see in the documentation that "this happens-before that", it means that everything the first thread does before "this" will be visible to whoever does "that". This may even not require any locking. Concurrent collections are a good example: one thread puts there something, another one reads that, and that magically guarantees that the second thread will see everything the first thread did before putting that object into the collection, even if those actions had nothing to do with the collection itself!
Volatile variables
One last warning: you better give up on the idea that making variables volatile will solve things. In this case maybe making bExit volatile will suffice, but there are so many troubles that using volatiles can lead to that I'm not even willing to go into that. But one thing is for sure: using synchronized has much stronger effect than using volatile, and that goes for memory effects too. What's worse, volatile semantics changed in some Java version so there may exist some versions that still use the old semantics which was even more obscure and confusing, whereas synchronized always worked well provided you understand what it is and how to use it.
Pretty much the only reason to use volatile is performance because synchronized may cause lock contention and other troubles. Read Java Concurrency in Practice to figure all that out.
Q & A
1) You wrote "now flush whatever you've been doing there to the shared
memory" about synchronized blocks. But we will see only the variables
that we access in the synchronize block or all the changes that the
thread call synchronize made (even on the variables not accessed in the
synchronized block)?
Short answer: it will "flush" all variables that were updated during the synchronized block or before entering the synchronized block. And again, because flushing is an implementation detail, you don't even know whether it will actually flush something or do something entirely different (or doesn't do anything at all because the implementation and the specific situation already somehow guarantee that it will work).
Variables that wasn't accessed inside the synchronized block obviously won't change during the execution of the block. However, if you change some of those variables before entering the synchronized block, for example, then you have a happens-before relationship between those changes and whatever happens in the synchronized block (the first bullet in 17.4.5). If some other thread enters another synchronized block using the same lock object then it synchronizes-with the first thread exiting the synchronized block, which means that you have another happens-before relationship here. So in this case the second thread will see the variables that the first thread updated prior to entering the synchronized block.
If the second thread tries to read those variables without synchronizing on the same lock, then it is not guaranteed to see the updates. But then again, it isn't guaranteed to see the updates made inside the synchronized block as well. But this is because of the lack of the memory-read barrier in the second thread, not because the first one didn't "flush" its variables (memory-write barrier).
2) In this chapter you post (of JLS) it is written that: "A write to a
volatile field (§8.3.1.4) happens-before every subsequent read of that
field." Doesn't this mean that when the variable is volatile you will
see only changes of it (because it is written write happens-before
read, not happens-before every operation between them!). I mean
doesn't this mean that in the example, given in the description of the
problem, we can see bExit = true, but x = 0 in the second thread if
only bExit is volatile? I ask, because I find this question here: http://java67.blogspot.bg/2012/09/top-10-tricky-java-interview-questions-answers.html
and it is written that if bExit is volatile the program is OK. So the
registers will flush only bExits value only or bExits and x values?
By the same reasoning as in Q1, if you do bExit = true after x = 1, then there is an in-thread happens-before relationship because of the program order. Now since volatile writes happen-before volatile reads, it is guaranteed that the second thread will see whatever the first thread updated prior to writing true to bExit. Note that this behavior is only since Java 1.5 or so, so older or buggy implementations may or may not support this. I have seen bits in the standard Oracle implementation that use this feature (java.concurrent collections), so you can at least assume that it works there.
3) Why monitor matters when using synchronized blocks about memory
visibility? I mean when try to exit synchronized block aren't all
variables (which we accessed in this block or all variables in the
thread - this is related to the first question) flushed from registers
to main memory or broadcasted to all CPU caches? Why object of
synchronization matters? I just cannot imagine what are relations and
how they are made (between object of synchronization and memory).
I know that we should use the same monitor to see this changes, but I
don't understand how memory that should be visible is mapped to
objects. Sorry, for the long questions, but these are really
interesting questions for me and it is related to the question (I
would post questions exactly for this primer).
Ha, this one is really interesting. I don't know. Probably it flushes anyway, but Java specification is written with high abstraction in mind, so maybe it allows for some really weird hardware where partial flushes or other kinds of memory barriers are possible. Suppose you have a two-CPU machine with 2 cores on each CPU. Each CPU has some local cache for every core and also a common cache. A really smart VM may want to schedule two threads on one CPU and two threads on another one. Each pair of the threads uses its own monitor, and VM detects that variables modified by these two threads are not used in any other threads, so it only flushes them as far as the CPU-local cache.
See also this question about the same issue.
4) I thought that everything before writing a volatile will be up to
date when we read it (moreover when we use volatile a read that in
Java it is memory barrier), but the documentation don't say this.
It does:
17.4.5.
If x and y are actions of the same thread and x comes before y in program order, then hb(x, y).
If hb(x, y) and hb(y, z), then hb(x, z).
A write to a volatile field (§8.3.1.4) happens-before every subsequent
read of that field.
If x = 1 comes before bExit = true in program order, then we have happens-before between them. If some other thread reads bExit after that, then we have happens-before between write and read. And because of the transitivity, we also have happens-before between x = 1 and read of bExit by the second thread.
5) Also, if we have volatile Person p does we have some dependency
when we use p.age = 20 and print(p.age) or have we memory barrier in
this case(assume age is not volatile) ? - I think - No
You are correct. Since age is not volatile, then there is no memory barrier, and that's one of the trickiest things. Here is a fragment from CopyOnWriteArrayList, for example:
Object[] elements = getArray();
E oldValue = get(elements, index);
if (oldValue != element) {
int len = elements.length;
Object[] newElements = Arrays.copyOf(elements, len);
newElements[index] = element;
setArray(newElements);
} else {
// Not quite a no-op; ensures volatile write semantics
setArray(elements);
Here, getArray and setArray are trivial setter and getter for the array field. But since the code changes elements of the array, it is necessary to write the reference to the array back to where it came from in order for the changes to the elements of the array to become visible. Note that it is done even if the element being replaced is the same element that was there in the first place! It is precisely because some fields of that element may have changed by the calling thread, and it's necessary to propagate these changes to future readers.
6) And is there any happens before 2 subsequent reads of volatile
field? I mean does the second read will see all changes from thread
which reads this field before it(of course we will have changes only
if volatile influence visibility of all changes before it - which I am
a little confused whether it is true or not)?
No, there is no relationship between volatile reads. Of course, if one thread performs a volatile write and then two other thread perform volatile reads, they are guaranteed to see everything at least up to date as it was before the volatile write, but there is no guarantee of whether one thread will see more up-to-date values than the other. Moreover, there is not even strict definition of one volatile read happening before another! It is wrong to think of everything happening on a single global timeline. It is more like parallel universes with independent timelines that sometimes sync their clocks by performing synchronization and exchanging data with memory barriers.
It depends on the implementation which decides if threads will keep a copy of the variables in their own memory. In case of class level variables threads have a shared access and in case of local variables threads will keep a copy of it. I will provide two examples which shows this fact , please have a look at it.
And in your example if I understood it correctly your code should look something like this--
package com.practice.multithreading;
public class LocalStaticVariableInThread {
static int x=0;
static boolean bExit = false;
public static void main(String[] args) {
Thread t1=new Thread(run1);
Thread t2=new Thread(run2);
t1.start();
t2.start();
}
static Runnable run1=()->{
x = 1;
bExit = true;
};
static Runnable run2=()->{
if (bExit == true)
System.out.println("x=" + x);
};
}
Output
x=1
I am getting this output always. It is because the threads share the variable and the when it is changed by one thread other thread can see it. But in real life scenarios we can never say which thread will start first, since here the threads are not doing anything we can see the expected result.
Now take this example--
Here if you make the i variable inside the for-loop` as static variable then threads won t keep a copy of it and you won t see desired outputs, i.e. the count value will not be 2000 every time even if u have synchronized the count increment.
package com.practice.multithreading;
public class RaceCondition2Fixed {
private int count;
int i;
/*making it synchronized forces the thread to acquire an intrinsic lock on the method, and another thread
cannot access it until this lock is released after the method is completed. */
public synchronized void increment() {
count++;
}
public static void main(String[] args) {
RaceCondition2Fixed rc= new RaceCondition2Fixed();
rc.doWork();
}
private void doWork() {
Thread t1 = new Thread(new Runnable() {
#Override
public void run() {
for ( i = 0; i < 1000; i++) {
increment();
}
}
});
Thread t2 = new Thread(new Runnable() {
#Override
public void run() {
for ( i = 0; i < 1000; i++) {
increment();
}
}
});
t1.start();
t2.start();
try {
t1.join();
t2.join();
} catch (InterruptedException e) {
e.printStackTrace();
}
/*if we don t use join then count will be 0. Because when we call t1.start() and t2.start()
the threads will start updating count in the spearate threads, meanwhile the main thread will
print the value as 0. So. we need to wait for the threads to complete. */
System.out.println(Thread.currentThread().getName()+" Count is : "+count);
}
}
public class ThreadTest implements Runnable {
private int counter;
private Date mydate = new Date();
public void upCounter1() {
synchronized (mydate ) {
for (int i = 0; i < 5; i++) {
counter++;
System.out.println("1 " + counter);
}
}
}
public void upCounter2() {
synchronized (mydate ) {
for (int i = 0; i < 5; i++) {
counter++;
System.out.println("2 " + counter);
}
}
}
public void upCounter3() {
synchronized (mydate ) {
for (int i = 0; i < 5; i++) {
counter++;
System.out.println("3 " + counter);
}
}
}
#Override
public void run() {
upCounter1();
upCounter2();
upCounter3();
}
public static void main(String[] args) {
Threadtest mtt = new Threadtest();
Thread t1 = new Thread(mtt);
Thread t2 = new Thread(mtt);
Thread t3 = new Thread(mtt);
t1.start();
t2.start();
t3.start();
}
}
I tried this code with various synchronisation techniques and I'd like to make sure I get what's happening. I've read a bunch of articles on this, but none of them broke it down enough for me.
So here's what I observed:
synchronised (this): This works only, if I give the SAME instance of Threadtest to all threads, because if I give each thread its own instance, each will get that instance's intrinsic lock and can access the methods without interruption from the other threads.
However, if I give each thread its own instance, I can do: synchronised (getClass()), because then I get the instrinsic lock of the class
Alternatively, I could do: synchronised (mydate), where the same rules apply that apply to synchronised (this). But it has the advantage of not being public. > I dont really understand this. What is the "danger" of using this?
Alternatively to synchronised (getClass()), I could also use a private static field.
However, I cannot do synchronised(Date.class).
I could synchronise the entire methods (same effecte as with synchronised-block)
making counter volatile doesn't work, because incrementing isn't a truly atomic operation
If I want to make each method accessible individually, I would make three private fields and use them in the synchronised-blocks. I then am effectively using the intrinsic locks of those fields and not of my class or instance.
I also noted that when I use the class-lock, each method is viewed as separate and I have effectively 3 ounters that go to 15. If I use the instance lock, the counter goes to 45. Is that the correct and expected behaviour?
Are my explanations and observations correct? (I basically want to make sure I draw the correct conclusions form the console output I got)
a-c; e-f are correct.
c) Alternatively, I could do: synchronised (mydate), where the same rules apply that apply to synchronised (this). But it has the advantage of not being public. > I dont really understand this. What is the "danger" of using this?
The argument is that other code may also decide to use that object as a lock. Which could cause conflict; when you know that this can never be the case then it is not such an evil thing. It is also usually more of a problem when one uses wait/notify in their code.
d) Alternatively to synchronised (getClass()), I could also use a private static field. However, I cannot do synchronised(Date.class).
You can use Date.class, it would just be a bit weird and falls into the argument discussed in c above about not polluting other classes work spaces.
g) If I want to make each method accessible individually, I would make three private fields and use them in the synchronised-blocks. I then am effectively using the intrinsic locks of those fields and not of my class or instance.
Given that the three methods share the same state, then no, this would not be wise as it would lead to races between the threads.
h) I also noted that when I use the class-lock, each method is viewed as separate and I have effectively 3 counters that go to 15. If I use the instance lock, the counter goes to 45. Is that the correct and expected behaviour?
No, this sounds wrong but I may have misunderstood you. I would expect the total to be 45 in both cases when using either this or this.getClass() as the lock.
Your code is threadsafe as it stands, if slow (you are writing to the console while holding a lock) - but better correct and slow than wrong and fast!
a) synchronised (this): This works only, if I give the SAME instance of Threadtest to all threads, because if I give each thread its own instance, each will get that instance's intrinsic lock and can access the methods without interruption from the other threads.
Your code is threadsafe either case - that is, it will give the exact same results every time. If you pass the same instance to three different threads the final line of output will be "3 45" (since there is only one counter variable) and if you give each thread its own instance there will be three lines reading "3 15". It sounds to me like you understand this.
b) However, if I give each thread its own instance, I can do: synchronised (getClass()), because then I get the instrinsic lock of the class
If you do this your code is still threadsafe, but you will get three lines reading "3 15" as above. Be aware that you will also be more prone to liveness and deadlock issues for the reason stated below.
c) Alternatively, I could do: synchronised (mydate), where the same rules apply that apply to synchronised (this). But it has the advantage of not being public. I dont really understand this. What is the "danger" of using this?
You should try to use private locks where you can. If you use a globally-visible object (e.g. this or getClass or a field with visibility other than private or an interned String or an object that you got from a factory) then you open up the possibility that some other code will also try to lock on the object that you are locking on. You may end up waiting longer than you expect to acquire the lock (liveness issue) or even in a deadlock situation.
For a detailed analysis of things that can go wrong, see the secure coding guidelines for Java - but note that this is not just a security issue.
d) Alternatively to synchronised (getClass()), I could also use a private static field. However, I cannot do synchronised(Date.class).
A private static field is preferable to either getClass() or Date.class for the reasons stated above.
e) I could synchronise the entire methods (same effecte as with synchronised-block)
Pretty much (there are currently some insignificant byte code differences), but again you should prefer private locks.
f) making counter volatile doesn't work, because incrementing isn't a truly atomic operation
Yes, you may run into a race condition and your code is no longer threadsafe (although you don't have the visibility issue mentioned below)
g) If I want to make each method accessible individually, I would make three private fields and use them in the synchronised-blocks. I then am effectively using the intrinsic locks of those fields and not of my class or instance.
You should not do this, you should always use the same lock to access a variable. As well as the fact that you could have multiple threads reading/writing to the same variable at the same time giving race condition you also have a subtler issue to do with inter-thread visibility. The Java Memory Model guarantees that writes done by one thread before a lock is released will be seen another thread when that other thread acquires the same lock. So thread 2 executing upCounter2 may or may not see the results of thread 1 executing upCounter1.
Rather than thinking of "which blocks of code do I need to execute?" you should think "which pieces of state do I need to access?".
h) I also noted that when I use the class-lock, each method is viewed as separate and I have effectively 3 ounters that go to 15. If I use the instance lock, the counter goes to 45. Is that the correct and expected behaviour?
Yes, but it has nothing to do with the object you are using for synchronisation, rather it's because you have created three different ThreadTest objects and hence have three different counters, as I explained in my answer to your first question.
Make sure that you understand the difference between three threads operating on one object and one thread operating on three different objects. Then you will be able to understand the behaviour you are observing with three threads operating on three different objects.
a) Correct
b) Correct
c) There could be some other bunch of code using your this or class in another part of your application where your class is accessible. This will mean that unrelated code will be waiting for each other to complete.
d) You cannot do synchronisation on Date.class because of the same reason above. There may be unrelated threaded methods waiting for each other unnecessarily.
e) Method synchronisation is same as class lock
g) Correct
Program order rule states "Each action in a thread happens-before every action in that thread that comes later in the program order"
1.I read in another thread that an action is
reads and writes to variables
locks and unlocks of monitors
starting and joining with threads
Does this mean that reads and writes can be changed in order, but reads and writes cannot change order with actions specified in 2nd or 3rd lines?
2.What does "program order" mean?
Explanation with an examples would be really helpful.
Additional related question
Suppose I have the following code:
long tick = System.nanoTime(); //Line1: Note the time
//Block1: some code whose time I wish to measure goes here
long tock = System.nanoTime(); //Line2: Note the time
Firstly, it's a single threaded application to keep things simple. Compiler notices that it needs to check the time twice and also notices a block of code that has no dependency with surrounding time-noting lines, so it sees a potential to reorganize the code, which could result in Block1 not being surrounded by the timing calls during actual execution (for instance, consider this order Line1->Line2->Block1). But, I as a programmer can see the dependency between Line1,2 and Block1. Line1 should immediately precede Block1, Block1 takes a finite amount of time to complete, and immediately succeeded by Line2.
So my question is: Am I measuring the block correctly?
If yes, what is preventing the compiler from rearranging the order.
If no, (which is think is correct after going through Enno's answer) what can I do to prevent it.
P.S.: I stole this code from another question I asked in SO recently.
It probably helps to explain why such rule exist in the first place.
Java is a procedural language. I.e. you tell Java how to do something for you. If Java executes your instructions not in the order you wrote, it would obviously not work. E.g. in the below example, if Java would do 2 -> 1 -> 3 then the stew would be ruined.
1. Take lid off
2. Pour salt in
3. Cook for 3 hours
So, why does the rule not simply say "Java executes what you wrote in the order you wrote"? In a nutshell, because Java is clever. Take the following example:
1. Take eggs out of the freezer
2. Take lid off
3. Take milk out of the freezer
4. Pour egg and milk in
5. Cook for 3 hours
If Java was like me, it'll just execute it in order. However Java is clever enough to understand that it's more efficient AND that the end result would be the same should it do 1 -> 3 -> 2 -> 4 -> 5 (you don't have to walk to the freezer again, and that doesn't change the recipe).
So what the rule "Each action in a thread happens-before every action in that thread that comes later in the program order" is trying to say is, "In a single thread, your program will run as if it was executed in the exact order you wrote it. We might change the ordering behind the scene but we make sure that none of that would change the output.
So far so good. Why does it not do the same across multiple threads? In multi-thread programming, Java isn't clever enough to do it automatically. It will for some operations (e.g. joining threads, starting threads, when a lock (monitor) is used etc.) but for other stuff you need to explicitly tell it to not do reordering that would change the program output (e.g. volatile marker on fields, use of locks etc.).
Note:
Quick addendum about "happens-before relationship". This is a fancy way of saying no matter what reordering Java might do, stuff A will happen before stuff B. In our weird later stew example, "Step 1 & 3 happens-before step 4 "Pour egg and milk in" ". Also for example, "Step 1 & 3 do not need a happens-before relationship because they don't depend on each other in any way"
On the additional question & response to the comment
First, let us establish what "time" means in the programming world. In programming, we have the notion of "absolute time" (what's the time in the world now?) and the notion of "relative time" (how much time has passed since x?). In an ideal world, time is time but unless we have an atomic clock built in, the absolute time would have to be corrected time to time. On the other hand, for relative time we don't want corrections as we are only interested in the differences between events.
In Java, System.currentTime() deals with absolute time and System.nanoTime() deals with relative time. This is why the Javadoc of nanoTime states, "This method can only be used to measure elapsed time and is not related to any other notion of system or wall-clock time".
In practice, both currentTimeMillis and nanoTime are native calls and thus the compiler can't practically prove if a reordering won't affect the correctness, which means it will not reorder the execution.
But let us imagine we want to write a compiler implementation that actually looks into native code and reorders everything as long as it's legal. When we look at the JLS, all that it tells us is that "You can reorder anything as long as it cannot be detected". Now as the compiler writer, we have to decide if the reordering would violate the semantics. For relative time (nanoTime), it would clearly be useless (i.e. violates the semantics) if we'd reorder the execution. Now, would it violate the semantics if we'd reorder for absolute time (currentTimeMillis)? As long as we can limit the difference from the source of the world's time (let's say the system clock) to whatever we decide (like "50ms")*, I say no. For the below example:
long tick = System.currentTimeMillis();
result = compute();
long tock = System.currentTimeMillis();
print(result + ":" + tick - tock);
If the compiler can prove that compute() takes less than whatever maximum divergence from the system clock we can permit, then it would be legal to reorder this as follows:
long tick = System.currentTimeMillis();
long tock = System.currentTimeMillis();
result = compute();
print(result + ":" + tick - tock);
Since doing that won't violate the spec we defined, and thus won't violate the semantics.
You also asked why this is not included in the JLS. I think the answer would be "to keep the JLS short". But I don't know much about this realm so you might want to ask a separate question for that.
*: In actual implementations, this difference is platform dependent.
The program order rule guarantees that, within individual threads, reordering optimizations introduced by the compiler cannot produce different results from what would have happened if the program had been executed in serial fashion. It makes no guarantees about what order the thread's actions may appear to occur in to any other threads if its state is observed by those threads without synchronization.
Note that this rule speaks only to the ultimate results of the program, and not to the order of individual executions within that program. For instance, if we have a method which makes the following changes to some local variables:
x = 1;
z = z + 1;
y = 1;
The compiler remains free to reorder these operations however it sees best fit to improve performance. One way to think of this is: if you could reorder these ops in your source code and still obtain the same results, the compiler is free to do the same. (And in fact, it can go even further and completely discard operations which are shown to have no results, such as invocations of empty methods.)
With your second bullet point the monitor lock rule comes into play: "An unlock on a monitor happens-before every subsequent lock on that main monitor lock." (Java Concurrency in Practice p. 341) This means that a thread acquiring a given lock will have a consistent view of the actions which occurred in other threads before releasing that lock. However, note that this guarantee only applies when two different threads release or acquire the same lock. If Thread A does a bunch of stuff before releasing Lock X, and then Thread B acquires Lock Y, Thread B is not assured to have a consistent view of A's pre-X actions.
It is possible for reads and writes to variables to be reordered with start and join if a.) doing so doesn't break within-thread program order, and b.) the variables have not had other "happens-before" thread synchronization semantics applied to them, say by storing them in volatile fields.
A simple example:
class ThreadStarter {
Object a = null;
Object b = null;
Thread thread;
ThreadStarter(Thread threadToStart) {
this.thread = threadToStart;
}
public void aMethod() {
a = new BeforeStartObject();
b = new BeforeStartObject();
thread.start();
a = new AfterStartObject();
b = new AfterStartObject();
a.doSomeStuff();
b.doSomeStuff();
}
}
Since the fields a and b and the method aMethod() are not synchronized in any way, and the action of starting thread does not change the results of the writes to the fields (or the doing of stuff with those fields), the compiler is free to reorder thread.start() to anywhere in the method. The only thing it could not do with the order of aMethod() would be to move the order of writing one of the BeforeStartObjects to a field after writing an AfterStartObject to that field, or to move one of the doSomeStuff() invocations on a field before the AfterStartObject is written to it. (That is, assuming that such reordering would change the results of the doSomeStuff() invocation in some way.)
The critical thing to bear in mind here is that, in the absence of synchronization, the thread started in aMethod() could theoretically observe either or both of the fields a and b in any of the states which they take on during the execution of aMethod() (including null).
Additional question answer
The assignments to tick and tock cannot be reordered with respect to the code in Block1 if they are to be actually used in any measurements, for example by calculating the difference between them and printing the result as output. Such reordering would clearly break Java's within-thread as-if-serial semantics. It changes the results from what would have been obtained by executing instructions in the specified program order. If the assignments aren't used for any measurements and have no side-effects of any kind on the program result, they'll likely be optimized away as no-ops by the compiler rather than being reordered.
Before I answer the question,
reads and writes to variables
Should be
volatile reads and volatile writes (of the same field)
Program order doesn't guarantee this happens before relationship, rather the happens-before relationship guarantees program order
To your questions:
Does this mean that reads and writes can be changed in order, but reads and writes cannot change order with actions specified in 2nd or 3rd lines?
The answer actually depends on what action happens first and what action happens second. Take a look at the JSR 133 Cookbook for Compiler Writers. There is a Can Reorder grid that lists the allowed compiler reordering that can occur.
For instance a Volatile Store can be re-ordered above or below a Normal Store but a Volatile Store cannot be be reordered above or below a Volatile Load. This is all assuming intrathread semantics still hold.
What does "program order" mean?
This is from the JLS
Among all the inter-thread actions performed by each thread t, the
program order of t is a total order that reflects the order in which
these actions would be performed according to the intra-thread
semantics of t.
In other words, if you can change the writes and loads of a variable in such a way that it will preform exactly the same way as you wrote it then it maintains program order.
For instance
public static Object getInstance(){
if(instance == null){
instance = new Object();
}
return instance;
}
Can be reordered to
public static Object getInstance(){
Object temp = instance;
if(instance == null){
temp = instance = new Object();
}
return temp;
}
it simply mean though the thread may be multiplxed, but the internal order of the thread's action/operation/instruction would remain constant (relatively)
thread1: T1op1, T1op2, T1op3...
thread2: T2op1, T2op2, T2op3...
though the order of operation (Tn'op'M) among thread may vary, but operations T1op1, T1op2, T1op3 within a thread will always be in this order, and so as the T2op1, T2op2, T2op3
for ex:
T2op1, T1op1, T1op2, T2op2, T2op3, T1op3
Java tutorial http://docs.oracle.com/javase/tutorial/essential/concurrency/memconsist.html says that happens-before relationship is simply a guarantee that memory writes by one specific statement are visible to another specific statement. Here is an illustration
int x;
synchronized void x() {
x += 1;
}
synchronized void y() {
System.out.println(x);
}
synchronized creates a happens-before relationship, if we remove it there will be no guarantee that after thread A increments x thread B will print 1, it may print 0
I am refering to page 261 - 262 of Joshua Bloch Effective Java
// Properly synchronized cooperative thread termination
public class StopThread {
private static boolean stopRequested;
private static synchronized void requestStop() {
stopRequested = true;
}
private static synchronized boolean stopRequested() {
return stopRequested;
}
public static void main(String[] args) throws InterruptedException {
Thread backgroundThread = new Thread(new Runnable() {
public void run() {
int i = 0;
while (!stopRequested())
i++;
}
});
backgroundThread.start();
TimeUnit.SECONDS.sleep(1);
requestStop();
}
}
Note that both the write method
(requestStop) and the read method
(stop- Requested) are synchronized. It
is not sufficient to synchronize only
the write method! In fact,
synchronization has no effect unless
both read and write operations are
synchronized.
Joshua's example is synchronized on this. However My doubt is that, must synchronized be acted on the same object? Say, if I change the code to
private static void requestStop() {
synchronized(other_static_final_object_monitor) {
stopRequested = true;
}
}
private static synchronized boolean stopRequested() {
return stopRequested;
}
will this still able to avoid liveness failure?
That's is, we know grabbing monitor for a same object during read/write can avoid liveness failure (According to Joshua Bloch's example). But how about grabbing monitor for different object during read/write?
I don't believe it's guaranteed, although I wouldn't be surprised if it actually was okay in all existing implementations. The Java Language Specification, section 17.4.4 states this:
An unlock action on monitor m synchronizes-with all subsequent lock actions on m (where subsequent is defined according to the synchronization order).
I believe that all the safety of reading/writing shared variables within locks stems from that bullet point in the spec - and that only specifies anything about a lock and an unlock action on a single monitor.
EDIT: Even if this did work for a single variable, you wouldn't want to use it for multiple variables. If you update multiple variables while holding a monitor and only read from them when holding a monitor, you can ensure that you always read a consistent set of data: nothing's going to write to variable Y before you've read that but after you've read variable X. If you use different monitors for reading and writing, that consistency goes away: the values could be changed at any time while you're reading them.
Possibly, but there are no guarantees and it could be highly platform dependant. In your case there is no real test for liveliness so if the value is a few milli-seconds late your application will appear to work correctly anyway. The application will stop eventually without any synchronized and you may not see then difference.
The problem with memory consistency errors is I have seen examples where something can be updated correctly in a test 1 billion times and then fail when there is a different program running on the system. This is why guaranteed behaviour is more interesting.
According to the The Java Language Specification,
"We say that a read r of a variable v is allowed to observe a write w to v if, in the happens-before partial order of the execution trace:
r is not ordered before w (i.e., it is not the case that hb(r, w), and
there is no intervening write w' to v (i.e., no write w' to v such that hb(w, w') and hb(w', r).
Informally, a read r is allowed to see the result of a write w if there is no happens-before ordering to prevent that read."
This means that unless there is some explicit synchronization action that causes multiple threads to interleave their actions in some predictable way (i.e. there's a good happens-before relationship defined on their actions), then a thread is allowed to see pretty much any value of a variable at any point where it was written to.
If you synchronize on multiple different objects, there is no happens-before relationship connecting the reader and the writer. This means that the reading thread can keep seeing whatever value it wants for the stopRequested variable, which could either be the first value forever, or the new value as soon as its updated, or something delightfully in-between the two.
Theoretically it's wrong. Per lang spec v3, the background thread may not see the update.
Practically it'll work. VM just can't be that smart to optimize to such a degree. (In older version of Java, which has threading spec worded differently, it is possible that your suggestion is correct even in theory.)
In any case, don't do it.
If you use a different monitor, there is no synchronization. No other code is requesting the monitor of this or other_static_final_object_monitor.
Using a static object to synchronize is only useful, if you want to synchronize across classes and within methods.
Also, NEVER use a String as a lock/monitor. Always use something like this:
static final Object LOCK = new Object();
This question has been discussed in two blog posts (http://dow.ngra.de/2008/10/27/when-systemcurrenttimemillis-is-too-slow/, http://dow.ngra.de/2008/10/28/what-do-we-really-know-about-non-blocking-concurrency-in-java/), but I haven't heard a definitive answer yet. If we have one thread that does this:
public class HeartBeatThread extends Thread {
public static int counter = 0;
public static volatile int cacheFlush = 0;
public HeartBeatThread() {
setDaemon(true);
}
static {
new HeartBeatThread().start();
}
public void run() {
while (true) {
try {
Thread.sleep(500);
} catch (InterruptedException e) {
throw new RuntimeException(e);
}
counter++;
cacheFlush++;
}
}
}
And many clients that run the following:
if (counter == HeartBeatThread.counter) return;
counter = HeartBeatThread.cacheFlush;
is it threadsafe or not?
Within the java memory model? No, you are not ok.
I've seen a number of attempts to head towards a very 'soft flush' approach like this, but without an explicit fence, you're definitely playing with fire.
The 'happens before' semantics in
http://java.sun.com/docs/books/jls/third_edition/html/memory.html#17.7
start to referring to purely inter-thread actions as 'actions' at the end of 17.4.2. This drives a lot of confusion since prior to that point they distinguish between inter- and intra- thread actions. Consequently, the intra-thread action of manipulating counter isn't explicitly synchronized across the volatile action by the happens-before relationship. You have two threads of reasoning to follow about synchronization, one governs local consistency and is subject to all the nice tricks of alias analysis, etc to shuffle operations The other is about global consistency and is only defined for inter-thread operations.
One for the intra-thread logic that says within the thread the reads and writes are consistently reordered and one for the inter-thread logic that says things like volatile reads/writes and that synchronization starts/ends are appropriately fenced.
The problem is the visibility of the non-volatile write is undefined as it is an intra-thread operation and therefore not covered by the specification. The processor its running on should be able to see it as it you executed those statements serially, but its sequentialization for inter-thread purposes is potentially undefined.
Now, the reality of whether or not this can affect you is another matter entirely.
While running java on x86 and x86-64 platforms? Technically you're in murky territory, but practically the very strong guarantees x86 places on reads and writes including the total order on the read/write across the access to cacheflush and the local ordering on the two writes and the two reads should enable this code to execute correctly provided it makes it through the compiler unmolested. That assumes the compiler doesn't step in and try to use the freedom it is permitted under the standard to reorder operations on you due to the provable lack of aliasing between the two intra-thread operations.
If you move to a memory with weaker release semantics like an ia64? Then you're back on your own.
A compiler could in perfectly good faith break this program in java on any platform, however. That it functions right now is an artifact of current implementations of the standard, not of the standard.
As an aside, in the CLR, the runtime model is stronger, and this sort of trick is legal because the individual writes from each thread have ordered visibility, so be careful trying to translate any examples from there.
Well, I don't think it is.
The first if-statement:
if (counter == HeartBeatThread.counter)
return;
Does not access any volatile field and is not synchronized. So you might read stale data forever and never get to the point of accessing the volatile field.
Quoting from one of the comments in the second blog entry: "anything that was visible to thread A when it writes to volatile field f becomes visible to thread B when it reads f."
But in your case B (the client) never reads f (=cacheFlush). So changes to HeartBeatThread.counter do not have to become visible to the client.