Questions on Concurrency from Java Guide

So I've been reading up on concurrency and have some questions along the way (this is the guide I followed, though I'm not sure if it's the best source):
1. Processes vs. threads: Is the difference basically that a process is the program as a whole while a thread can be a (small) part of a program?
2. I am not exactly sure why there is both an interrupted() method and an InterruptedException. Why should the interrupted() method even be used? It seems to me that Java just adds an extra layer of indirection.
3. For synchronization (and specifically the example in that link), how does adding the synchronized keyword even fix the problem? I mean, if Thread A gives back its incremented c and Thread B gives back the decremented c and stores it in some other variable, I am not exactly sure how the problem is solved. This may be answering my own question, but is it supposed to be assumed that a thread terminates after it returns an answer? And if that is the case, why would adding synchronized make a difference?
4. I read (in some random PDF) that if you call start() on two threads one after the other, you cannot guarantee that the first thread will run before the second. How would you guarantee it, though?
5. In synchronized statements, I am not completely sure what the point of adding synchronized within the method is. What is wrong with leaving it out? Is it because one expects both fields to be mutated separately, but to be obtained together? Why not just have the two non-synchronized?
6. Is volatile just a keyword for variables, and is it synonymous with synchronized?
7. In the deadlock problem, how does synchronized even help the situation? What makes this situation different from starting two threads that change a variable?
8. Moreover, where is the "wait"/lock for the other person to bowBack? I would have thought that bow() was blocked, not bowBack().
I'll stop here because I think if I went any further without these questions answered, I will not be able to understand the later lessons.

Answers:
1. Yes, a process is an operating system process that has an address space, a thread is a unit of execution, and there can be multiple units of execution in a process.
2. The interrupt() method and InterruptedException are generally used to wake up threads that are waiting, either to have them do something or to have them terminate.
3. Synchronizing is a form of mutual exclusion or locking, something very standard and required in computer programming. Google these terms and read up on that and you will have your answer.
4. True, this cannot be guaranteed. You would need some mechanism, involving synchronization, that the threads use to make sure they run in the desired order. This would be specific to the code in the threads.
5. See the answer to #3.
6. Volatile is a way to make sure that a particular variable can be properly shared between different threads. It is necessary on multi-processor machines (which almost everyone has these days) to make sure the value of the variable is consistent between the processors. It is effectively a way to synchronize a single value.
7. Read about deadlocking in more general terms to understand this. Once you understand mutual exclusion and locking you will be able to understand how deadlocks can happen.
8. I have not read the materials that you read, so I don't understand this one. Sorry.
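As a rough illustration of answer 4, here is a minimal sketch of one possible ordering mechanism, a CountDownLatch from java.util.concurrent. The class and method names (OrderedStartDemo, runInOrder) are made up for this example and do not come from the guide. It also shows the usual way of reacting to InterruptedException from answer 2: restore the interrupt flag and stop.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;

// Hypothetical demo class: makes "second" wait until "first" has done its work.
class OrderedStartDemo {

    static List<String> runInOrder() throws InterruptedException {
        List<String> log = new CopyOnWriteArrayList<>();  // thread-safe list for the demo
        CountDownLatch firstDone = new CountDownLatch(1);

        Thread first = new Thread(() -> {
            log.add("first");
            firstDone.countDown();      // signal that the first thread's work is finished
        });
        Thread second = new Thread(() -> {
            try {
                firstDone.await();      // blocks until countDown() has been called
                log.add("second");
            } catch (InterruptedException e) {
                // Someone asked this thread to stop waiting: restore the flag and give up.
                Thread.currentThread().interrupt();
            }
        });

        second.start();                 // deliberately started first: start order no longer matters
        first.start();
        first.join();
        second.join();
        return log;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runInOrder());   // prints [first, second]
    }
}
```

The latch only orders the two pieces of work relative to each other; calling first.join() before second.start() would be a simpler option when the second thread does not need to run at all until the first has finished.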

I find that the examples used to explain synchronization and volatility are contrived and difficult to understand the purpose of. Here are my preferred examples:
Synchronized:
private Value value;
public void setValue(Value v) {
    value = v;
}
public void doSomething() {
    if (value != null) {
        doFirstThing();
        int val = value.getInt(); // Will throw NullPointerException if another
                                  // thread calls setValue(null);
        doSecondThing(val);
    }
}
The above code is perfectly correct if run in a single-threaded environment. However with even 2 threads there is the possibility that value will be changed in between the check and when it is used. This is because the method doSomething() is not atomic.
To address this, use synchronization:
private Value value;
private final Object lock = new Object();
public void setValue(Value v) {
    synchronized (lock) {
        value = v;
    }
}
public void doSomething() {
    synchronized (lock) { // Prevents setValue being called by another thread.
        if (value != null) {
            doFirstThing();
            int val = value.getInt(); // Cannot throw NullPointerException.
            doSecondThing(val);
        }
    }
}
Volatile:
private boolean running = true;
// Called by Thread 1.
public void run() {
    while (running) {
        doSomething();
    }
}
// Called by Thread 2.
public void stop() {
    running = false;
}
To explain this requires knowledge of the Java Memory Model. It is worth reading about in depth, but the short version for this example is that, without synchronization or volatile, nothing guarantees that a write made by one thread ever becomes visible to another. A thread may effectively work with its own copy of a variable, and the JIT compiler is allowed to hoist the read of running out of the loop, optimising the code into this:
public void run() {
    while (true) { // Will never end
        doSomething();
    }
}
To prevent this optimisation you can declare the variable volatile, which guarantees that every read sees the most recent write made by any thread (informally, the thread behaves as if it went to main memory on each read). Note that volatile is unnecessary if every access to the variable already happens inside synchronized blocks on the same lock, as both mechanisms provide this visibility guarantee.
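The volatile fix described above can be sketched as a small self-contained program. The class name VolatileFlagDemo and the helper stopsWithin are invented for this illustration, and the busy loop simply counts instead of calling doSomething().

```java
// Hypothetical demo class; only the volatile flag pattern is taken from the answer.
class VolatileFlagDemo {
    // volatile: a write by one thread is guaranteed to be seen by later reads in other threads
    private volatile boolean running = true;
    private long iterations = 0;   // touched only by the worker thread

    // Called by the worker thread.
    void run() {
        while (running) {          // re-read on every pass because the field is volatile
            iterations++;
        }
    }

    // Called by another thread.
    void stop() {
        running = false;
    }

    // Starts a worker, asks it to stop, and reports whether it ended within the timeout.
    static boolean stopsWithin(long timeoutMillis) throws InterruptedException {
        VolatileFlagDemo demo = new VolatileFlagDemo();
        Thread worker = new Thread(demo::run);
        worker.start();
        Thread.sleep(100);             // let the loop spin for a moment
        demo.stop();
        worker.join(timeoutMillis);    // wait at most timeoutMillis for the loop to exit
        return !worker.isAlive();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("worker stopped: " + stopsWithin(5000));
    }
}
```

If the volatile modifier is removed, the program may still terminate on a given machine, but the Java Memory Model no longer requires it to.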
I haven't addressed your questions directly as Francis did so. I hope these examples can give you an idea of the concepts in a better way than the examples you saw in the Oracle tutorial.

Related

Threads does not work without volatile and reads the value from RAM instead of caching

Volatile is supposed to make threads read values from RAM, disabling the thread cache, and without volatile caching is enabled, making a thread unaware of a variable change made by another thread. But this does not seem to hold for the code below.
Why does this happen, and why does the code work the same with and without the volatile keyword?
public class Racing {
    private boolean won = false; // without volatile keyword
    public void race() throws InterruptedException {
        Thread one = new Thread(() -> {
            System.out.println("Player-1 is racing...");
            while (!won) {
                won = true;
            }
            System.out.println("Player-1 has won...");
        });
        Thread two = new Thread(() -> {
            System.out.println("Player-2 is racing...");
            while (!won) {
                System.out.println("Player-2 Still Racing...");
            }
        });
        one.start();
        //Thread.sleep(2000);
        two.start();
    }
    public static void main(String k[]) {
        Racing racing = new Racing();
        try {
            racing.race();
        }
        catch (InterruptedException ie) {}
    }
}
Why does this behave the same with and without volatile ?

Volatile is supposed to make the threads read the values from RAM
disabling thread cache
No, this is not accurate. It depends on the architecture where the code is running. The Java language standard itself does not state anything about how the volatile should or not be implemented.
In "Myths Programmers Believe about CPU Caches" one can read:
As a computer engineer who has spent half a decade working with caches
at Intel and Sun, I’ve learnt a thing or two about cache-coherency.
(...)
For another, if volatile variables were truly written/read from main-memory every single time, they would be horrendously slow – main-memory references are 200x slower than L1 cache references. In reality, volatile-reads (in Java) can often be just as cheap as a L1 cache reference, putting to rest the notion that volatile forces reads/writes all the way to main memory. If you’ve been avoiding the use of volatiles because of performance concerns, you might have been a victim of the above misconceptions.
Unfortunately, there still are several articles online propagating this inaccuracy (i.e., that volatile forces variables to be read from main memory).
According to the language standard (§17.4):
A field may be declared volatile, in which case the Java Memory Model
ensures that all threads see a consistent value for the variable
So, informally, all threads will have a view of the most recent value of that variable. Nothing is said about how the hardware should enforce such a constraint.
Why does this happen and code works same with and without volatile
Well (in your case), without volatile the program has a data race: you may or may not see the most recent value of the flag won, so in theory the race condition is still there. However, because you have added the following statement
    System.out.println("Player-2 Still Racing...");
in:
    Thread two = new Thread(() -> {
        System.out.println("Player-2 is racing...");
        while (!won) {
            System.out.println("Player-2 Still Racing...");
        }
    });
two things happen. First, you avoid the spin-on-field problem. Second, if one looks at the System.out.println code:
    public void println(String x) {
        synchronized (this) {
            print(x);
            newLine();
        }
    }
one can see that a synchronized block is entered, which increases the likelihood that the thread reads the most recent value of the field (before the call to the println method). However, even that might change based on the JVM implementation.

Without volatile, there is no guarantee that another thread will see updates written to a variable. That does not mean that another thread will not see those updates if the value is not volatile. Other threads may eventually see the modified value.
In your example, you are using System.out.printlns, which contain memory barriers. That means once the println works, all variables updated before that point are visible to all the threads. The program might work differently if you do not print anything.

Why is this code working without volatile?

I am new to Java, I am currently learning about volatile. Say I have the following code:
public class Test
{
    private static boolean b = false;
    public static void main(String[] args) throws Exception
    {
        new Thread(new Runnable()
        {
            public void run()
            {
                while (true)
                {
                    b = true;
                }
            }
        }).start();
        // Give time for thread to start
        Thread.sleep(2000);
        System.out.println(b);
    }
}
Output:
true
This code has two threads (the main thread and another thread). Why is the other thread able to modify the value of b? Shouldn't b have to be volatile for this to happen?

The volatile keyword guarantees that changes are visible amongst multiple threads, but you're interpreting that to mean that the opposite is also true: that the absence of the volatile keyword guarantees isolation between threads. There's no such guarantee.
Also, while your code example is multi-threaded, it isn't necessarily concurrent. It could be that the values were cached per-thread, but there was enough time for the JVM to propagate the change before you printed the result.

You are right that with volatile, you can ensure/guarantee that your 2 threads will see the appropriate value from main memory at all times, and never a thread-specific cached version of it.
Without volatile, you lose that guarantee. And each thread is working with its own cached version of the value.
However, there is nothing preventing the 2 threads from resynchronizing their memory if and when they feel like it, and eventually viewing the same value (maybe). It's just that you can't guarantee that it will happen, and you most certainly cannot guarantee when it will happen. But it can happen at some indeterminate point in time.
The point is that your code may work sometimes, and sometimes not. Even if it seems to read the variable properly every time you run it on your personal computer, the same code may well break on a different machine. So you are taking big risks.

What happens to a field in java that is accessed asynchronously (inconsistently) by multiple threads?

I think I have a fairly firm grasp on using the synchronized keyword to prevent inconsistencies between threads in java, but I don't fully understand what happens if you don't use that keyword.
Say for instance that I have a field accessed/modified by two threads:
private String sharedString = "";
class OneThread extends Thread {
    private Boolean mRunning = false;
    public OneThread() {}
    public synchronized void setRunning(Boolean b) {
        mRunning = b;
    }
    @Override
    public void run() {
        while (mRunning) {
            // read or write to shared string
            sharedString = "text from thread 1";
            System.out.println("String seen from thread 1: " + sharedString);
            super.run();
        }
    }
}
class AnotherThread extends Thread {
    private Boolean mRunning = false;
    public AnotherThread() {}
    public synchronized void setRunning(Boolean b) {
        mRunning = b;
    }
    @Override
    public void run() {
        while (mRunning) {
            // read or write to shared string
            sharedString = "text from thread 2";
            System.out.println("String seen from thread 2: " + sharedString);
            super.run();
        }
    }
}
Since both of these threads are accessing and modifying the field sharedString without using the synchronized keyword, I would expect inconsistencies. What I am wondering is what actually happens, though. While debugging, I have stepped carefully through both threads in situations like this and noticed that even while one thread is paused, its state can be "sticky".
For the sake of the above example, suppose both threads are paused in the debugger. If I step through one of the threads and leave the other paused, I would expect it would operate like a single threaded application. Yet, many times right after modifying the field, the next line that accesses it retrieves the "wrong" value (a value inconsistent with what it was just modified to).
I know that this code is not good, but I am asking the question because I'm hoping someone can provide an answer that gives some insight into what actually happens in the virtual machine when multi-threaded applications are implemented poorly. Does the thread whose field modification attempt was unsuccessful have any effect at all?
If after poorly implementing multi-threaded code we are simply in the realm of "undefined" behavior, and there is no value in learning about this behavior, I'm ok with that.. just a multi-threading noob trying to understand what I observe in the debugger.

This is due to another critical function of synchronization across threads in Java: preventing data staleness. As part of the Java Memory Model, a Java thread may cache values for shared data. There is no guarantee that a thread will ever see updates made by another thread unless either the shared mutable data is accessed in synchronized blocks or it is marked as volatile. See here for more information.

There's really no way for the printout to be "wrong" if there is no other thread that can change the shared value (as would be the case if there are really only 2 threads and one is definitely paused). Can you by chance provide the code that "kicks off" these 2 threads (i.e. your main)?

Java: How exactly do synchronized operations relate to volatility?

Sorry this is such a long question.
I've been doing lots of research lately into multi-threading as I slowly implement it in a personal project. However, probably due to an abundance of slightly incorrect examples, the use of synchronized blocks and volatility in certain situations is still a bit unclear to me.
My core question is this: Are changes to references and primitives automatically volatile (that is, performed on main memory and not a cache) when a thread is inside a synchronized block, or does the read also have to be synchronized for it to work properly?
If so: What is the purpose of synchronizing a simple getter method? (see Example 1) Also, are ALL changes sent to main memory as long as the thread has synchronized on anything? E.g. if it is sent off to do loads of work all over the place inside a very high-level sync, will every single change it makes go to main memory, and nothing ever to cache, until it's unlocked again?
If not: Does the change have to be explicitly inside a synchronized block, or can Java actually pick up on, for example, uses of the Lock object? (see Example 3)
If either: Does the synchronized object need to be related to the reference/primitive being changed in any way (e.g. the immediate object that contains it)? Can I write while syncing on one object and read while syncing on another if it's otherwise safe? (see Example 2)
(please note for the following examples that I know that synchronized methods and synchronized(this) are frowned upon and why, but discussion about that is beyond the scope of my question)
Example 1:
class Counter{
    int count = 0;
    public synchronized void increment(){
        count++;
    }
    public int getCount(){
        return count;
    }
}
In this example, increment() needs to be synchronized since ++ is not an atomic operation. As such, two threads incrementing at the same time may result in an overall increase of only 1 to the count. The count primitive needs to be atomically readable and writable (e.g. not a non-volatile long or double), and it is, so that's fine.
Does getCount() need to be synchronized here, and why exactly? The explanation I have heard the most is that I will have no guarantee whether the count returned will be the pre- or post-increment value. However, this seems like the explanation for something slightly different that has found itself in the wrong place. I mean, if I were to synchronize getCount(), then I still see no guarantee: it's now down to not knowing the locking order, instead of not knowing whether the actual read happens to be before/after the actual write.
Example 2:
Is the following example thread-safe, if you assume that, through trickery not shown here, none of these methods will ever be called at the same time? Will count increment in the expected way if it's done using a random method each time, and then be read properly, or does the lock have to be the same object? (By the way, I fully realise how ridiculous this example is, but I'm more interested in theory than practice.)
class Counter{
    private final Object lock1 = new Object();
    private final Object lock2 = new Object();
    private final Object lock3 = new Object();
    int count = 0;
    public void increment1(){
        synchronized(lock1){
            count++;
        }
    }
    public void increment2(){
        synchronized(lock2){
            count++;
        }
    }
    public int getCount(){
        synchronized(lock3){
            return count;
        }
    }
}
Example 3:
Is the happens-before relationship simply a Java concept, or is it an actual thing built into the JVM? Even though I can guarantee a conceptual happens-before relationship for this next example, is Java smart enough to pick it up if it's a built-in thing? I am assuming it is not, but is this example actually thread-safe? If it's thread-safe, what about if getCount() did no locking?
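import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;
class Counter{
    // Lock is an interface, so a concrete implementation such as ReentrantLock is needed
    private final Lock lock = new ReentrantLock();
    int count = 0;
    public void increment(){
        lock.lock();
        count++;
        lock.unlock();
    }
    public int getCount(){
        lock.lock();
        int count = this.count;
        lock.unlock();
        return count;
    }
}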

Yes, the read has to be synchronized as well. This page says:
The results of a write by one thread are guaranteed to be visible to a
read by another thread only if the write operation happens-before the
read operation.
[...]
An unlock (synchronized block or method exit) of a monitor
happens-before every subsequent lock (synchronized block or method
entry) of that same monitor
The same page says:
Actions prior to "releasing" synchronizer methods such as Lock.unlock,
Semaphore.release, and CountDownLatch.countDown happen-before actions
subsequent to a successful "acquiring" method such as Lock.lock
So locks offer the same visibility guarantees as synchronized blocks.
Whether you use synchronized blocks or locks, the visibility is only guaranteed if the reader thread uses the same monitor or lock as the writer thread.
Your Example 1 is incorrect: the getter must be synchronized as well if you want to see the latest value of the count.
Your example 2 is incorrect because it uses different locks to guard the same count.
Your example 3 is OK. If the getter did not lock, you could see an older value of the count. The happens-before is something that is guaranteed by the JVM. The JVM has to respect the rules specified, by flushing caches to the main memory for example.
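To make the verdicts above concrete, here is a sketch of how Examples 1 and 3 might look once the reader and the writer share the same monitor or lock. The wrapper class SafeCounters and the helper runConcurrently are invented for the demonstration. ReentrantLock stands in for the question's `new Lock()`, and the try/finally around unlock() is a common idiom rather than something the question used.

```java
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical wrapper class holding two repaired counters.
class SafeCounters {

    // Example 1, repaired: the getter uses the same monitor (this) as increment().
    static class SyncCounter {
        private int count = 0;
        synchronized void increment() { count++; }
        synchronized int getCount() { return count; }
    }

    // Example 3 with a concrete lock; unlock() sits in finally so it always runs.
    static class LockCounter {
        private final Lock lock = new ReentrantLock();
        private int count = 0;

        void increment() {
            lock.lock();
            try { count++; } finally { lock.unlock(); }
        }

        int getCount() {
            lock.lock();
            try { return count; } finally { lock.unlock(); }
        }
    }

    // Runs the task perThread times on each of `threads` threads and waits for all of them.
    static void runConcurrently(Runnable task, int threads, int perThread)
            throws InterruptedException {
        Thread[] pool = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            pool[i] = new Thread(() -> {
                for (int n = 0; n < perThread; n++) task.run();
            });
            pool[i].start();
        }
        for (Thread t : pool) t.join();
    }

    public static void main(String[] args) throws InterruptedException {
        SyncCounter a = new SyncCounter();
        LockCounter b = new LockCounter();
        runConcurrently(a::increment, 4, 10_000);
        runConcurrently(b::increment, 4, 10_000);
        System.out.println(a.getCount() + " " + b.getCount());   // prints 40000 40000
    }
}
```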

Try to view it in terms of two distinct, simple operations:
Locking (mutual exclusion),
Memory barrier (cache sync, instruction reordering barrier).
Entering a synchronized block entails both locking and a memory barrier; leaving the synchronized block entails unlocking plus a memory barrier; reading/writing a volatile field entails a memory barrier only. Thinking in these terms, I think you can clarify for yourself all the questions above.
As for Example 1, the reading thread will not have any kind of memory barrier. It's not just between seeing the value before/after read, it's about never observing any change to the var after a thread is started.
Example 2. is the most interesting issue you raise. You are indeed given no guarantees by the JLS in this case. In practice you won't be given any ordering guarantees (it's as if the locking aspect wasn't there at all), but you'll still have the benefit of the memory barriers so you will observe changes, unlike the first example. Basically, this is exactly the same as removing synchronized and tagging the int as volatile (apart from the runtime costs of acquiring locks).
Regarding Example 3, by "just a Java thing" I feel you have generics with erasure in mind, something that only the static code checking is aware of. This is not like that -- both locks and memory barriers are pure runtime artifacts. In fact, the compiler can't reason about them at all.

Doubt on avoiding liveness failure - discussed in Effective Java

I am referring to pages 261-262 of Joshua Bloch's Effective Java:
// Properly synchronized cooperative thread termination
import java.util.concurrent.TimeUnit;
public class StopThread {
    private static boolean stopRequested;
    private static synchronized void requestStop() {
        stopRequested = true;
    }
    private static synchronized boolean stopRequested() {
        return stopRequested;
    }
    public static void main(String[] args) throws InterruptedException {
        Thread backgroundThread = new Thread(new Runnable() {
            public void run() {
                int i = 0;
                while (!stopRequested())
                    i++;
            }
        });
        backgroundThread.start();
        TimeUnit.SECONDS.sleep(1);
        requestStop();
    }
}
Note that both the write method
(requestStop) and the read method
(stopRequested) are synchronized. It
is not sufficient to synchronize only
the write method! In fact,
synchronization has no effect unless
both read and write operations are
synchronized.
Joshua's example is synchronized on this. However, my doubt is: must the synchronization be on the same object? Say I change the code to
private static void requestStop() {
    synchronized(other_static_final_object_monitor) {
        stopRequested = true;
    }
}
private static synchronized boolean stopRequested() {
    return stopRequested;
}
will this still be able to avoid the liveness failure?
That is, we know that grabbing the monitor of the same object during read and write avoids the liveness failure (according to Joshua Bloch's example). But what about grabbing the monitors of different objects for the read and the write?

I don't believe it's guaranteed, although I wouldn't be surprised if it actually was okay in all existing implementations. The Java Language Specification, section 17.4.4 states this:
An unlock action on monitor m synchronizes-with all subsequent lock actions on m (where subsequent is defined according to the synchronization order).
I believe that all the safety of reading/writing shared variables within locks stems from that bullet point in the spec - and that only specifies anything about a lock and an unlock action on a single monitor.
EDIT: Even if this did work for a single variable, you wouldn't want to use it for multiple variables. If you update multiple variables while holding a monitor and only read from them when holding a monitor, you can ensure that you always read a consistent set of data: nothing's going to write to variable Y before you've read that but after you've read variable X. If you use different monitors for reading and writing, that consistency goes away: the values could be changed at any time while you're reading them.
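The point about reading a consistent set of variables can be sketched as follows. The class ConsistentPair and its invariant (y is always the negation of x) are invented purely for illustration. Both the writer and the reader take the same monitor, so a reader should never observe x updated but y not yet.

```java
// Hypothetical example: two fields guarded by one monitor, with the invariant y == -x.
class ConsistentPair {
    private final Object lock = new Object();
    private int x = 0;   // guarded by lock
    private int y = 0;   // guarded by lock

    void update(int value) {
        synchronized (lock) {   // both writes happen while holding the monitor
            x = value;
            y = -value;
        }
    }

    int sum() {
        synchronized (lock) {   // same monitor: the reader sees x and y from the same update
            return x + y;
        }
    }

    // Runs a writer thread while the calling thread checks the invariant repeatedly.
    static boolean invariantHeld(int rounds) throws InterruptedException {
        ConsistentPair pair = new ConsistentPair();
        Thread writer = new Thread(() -> {
            for (int i = 1; i <= rounds; i++) pair.update(i);
        });
        writer.start();
        boolean ok = true;
        for (int i = 0; i < rounds; i++) {
            if (pair.sum() != 0) ok = false;   // would mean a half-finished update was seen
        }
        writer.join();
        return ok;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(invariantHeld(100_000));   // prints true
    }
}
```

If sum() synchronized on a different object than update(), the check could fail, because nothing would stop the writer from running between the read of x and the read of y.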

Possibly, but there are no guarantees and it could be highly platform dependent. In your case there is no real test for liveness, so if the value is a few milliseconds late your application will appear to work correctly anyway. The application may well stop eventually even without any synchronized, and you may not see the difference.
The problem with memory consistency errors is I have seen examples where something can be updated correctly in a test 1 billion times and then fail when there is a different program running on the system. This is why guaranteed behaviour is more interesting.

According to the Java Language Specification,
"We say that a read r of a variable v is allowed to observe a write w to v if, in the happens-before partial order of the execution trace:
r is not ordered before w (i.e., it is not the case that hb(r, w)), and
there is no intervening write w' to v (i.e., no write w' to v such that hb(w, w') and hb(w', r)).
Informally, a read r is allowed to see the result of a write w if there is no happens-before ordering to prevent that read."
This means that unless there is some explicit synchronization action that causes multiple threads to interleave their actions in some predictable way (i.e. there's a good happens-before relationship defined on their actions), then a thread is allowed to see pretty much any value of a variable at any point where it was written to.
If you synchronize on multiple different objects, there is no happens-before relationship connecting the reader and the writer. This means that the reading thread can keep seeing whatever value it wants for the stopRequested variable, which could either be the first value forever, or the new value as soon as it's updated, or something delightfully in-between the two.

Theoretically it's wrong. Per lang spec v3, the background thread may not see the update.
In practice it will work; the VM just isn't smart enough to optimize to such a degree. (In older versions of Java, whose threading spec was worded differently, it is possible that your suggestion is correct even in theory.)
In any case, don't do it.

If you use a different monitor, there is no synchronization. No other code is requesting the monitor of this or other_static_final_object_monitor.
Using a static object to synchronize is only useful, if you want to synchronize across classes and within methods.
Also, NEVER use a String as a lock/monitor. Always use something like this:
static final Object LOCK = new Object();
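Putting the advice from this thread together, a variant of the book's example that uses one dedicated lock object for both the write and the read could look like the sketch below. The class name StopThreadSameLock and the helper runAndStop are made up for this illustration. The only point is that requestStop() and stopRequested() synchronize on the same LOCK, which is what establishes the happens-before edge between them.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical variant of StopThread: one dedicated monitor for both read and write.
class StopThreadSameLock {
    private static final Object LOCK = new Object();
    private static boolean stopRequested;   // guarded by LOCK

    static void requestStop() {
        synchronized (LOCK) {
            stopRequested = true;
        }
    }

    static boolean stopRequested() {
        synchronized (LOCK) {               // same monitor as the writer
            return stopRequested;
        }
    }

    // Starts the background loop, requests a stop, and reports whether the loop ended.
    static boolean runAndStop() throws InterruptedException {
        Thread backgroundThread = new Thread(() -> {
            long i = 0;
            while (!stopRequested()) {
                i++;
            }
        });
        backgroundThread.start();
        TimeUnit.MILLISECONDS.sleep(200);
        requestStop();
        backgroundThread.join(5000);        // wait up to five seconds for termination
        return !backgroundThread.isAlive();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("background thread stopped: " + runAndStop());
    }
}
```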
