In Java, in which thread does an object run?

I cannot understand why a thread is able to call a method of an object that is declared in another thread, which is itself stuck in a while (true) loop.
From my basic knowledge, if a thread is in a while (true) loop you cannot interact with it, and therefore not with an object declared in that thread either.
Thank you in advance.
This is the main class:
/**
 * main
 */
public class mainClass {
    public static void main(String[] args) {
        whatTime wt = new whatTime();
        threadA ta = new threadA(wt);
        ta.start();
        while (true) {
        }
    }
}
This is the threadA class:
/**
 * threadA
 */
public class threadA extends Thread {
    private whatTime wt;
    public threadA(whatTime wt) {
        System.out.println("threadA() constructor");
        this.wt = wt;
    }
    public void run() {
        while (true) {
            // every 10s
            try {
                Thread.sleep(10000);
                System.out.println("threadA: " + wt.getTime());
            } catch (InterruptedException e) {
                // TODO Auto-generated catch block
                e.printStackTrace();
            }
        }
    }
}
And this is the object that I use:
public class whatTime {
    public whatTime() {
        System.out.println("whatTime() constructor");
    }
    public long getTime() {
        return System.currentTimeMillis();
    }
}

I cannot understand why a thread is able to call a method of an object that is declared in another thread
I think you misunderstand.
whatTime wt = new whatTime();
This does 2 completely different things.
In the local space (which is on the stack and indeed cannot be interacted with by other threads at all: that stack space is reused by whatever happens right after this method returns, so if any other thread dared to try, you'd be in big trouble), it declares a reference variable. That's generally a 64-bit value that tells the JVM where to find an object on the heap. It is not the instance of whatTime itself that is stored in wt! It is merely a reference to it. Think: "A page in an address book with the address of a house on it", not: "A house". An instance of whatTime is the house, wt is the address book page.
Other threads cannot interact with the address book page - at best, you can make a copy of it, and hand the copy to another thread. But they can take their copy of the address book page, drive over there, and toss a brick through the window. If later that day you also drive over there, you will see the damaged window.
new whatTime() - that 'builds' the house, and assigns the address of that house to the wt variable.
threadA ta = new threadA(wt);
This makes a new instance of threadA (and also notes its address in the local variable space), and hands a copy of your address book page to it (Java is pass-by-value, always, but the value of wt is the reference, not the object itself). You now have an address book page with '123 Fairfield lane' on it, and so does threadA - threadA has its own private copy. Whatever threadA does with that address book page has no effect on your address book. But if they decide to go to the house at that address and change something there, you can see that too.
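A minimal sketch of the address book idea, assuming a hypothetical House class and a hypothetical visitor thread (neither is part of the question's code). The thread gets its own copy of the reference, yet both copies lead to the same object on the heap. join() is used so that the main thread is guaranteed to see the change.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical "house": a plain object living on the heap.
final class House {
    // AtomicBoolean keeps the demo safe regardless of thread timing.
    final AtomicBoolean windowBroken = new AtomicBoolean(false);
}

final class AddressBookDemo {
    // Returns true if this thread observes the change made through the other copy of the reference.
    static boolean run() {
        House house = new House(); // local variable: the "address book page"
        // The lambda captures a copy of the reference, not a copy of the House.
        Thread visitor = new Thread(() -> house.windowBroken.set(true));
        visitor.start();
        try {
            visitor.join(); // wait until the visitor thread has finished
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return house.windowBroken.get(); // same object, so the damage is visible here
    }

    public static void main(String[] args) {
        System.out.println("window broken: " + run());
    }
}
```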
From my basic knowledge if a thread is in a while true loop you cannot interact with it
Where you say 'in a while true loop', what you really mean is 'if it is running'. It doesn't matter if it is looping, it just matters that it is running.
What this is referring to is the memory model. Modern CPU cores do not work on main memory directly, because main memory is waaaay too slow relative to the core: even with signals crossing your motherboard at a sizable fraction of the speed of light, a round trip to RAM costs on the order of hundreds of cycles. Instead, each core has small chunks of very fast cache memory on the die, split up into fixed-size cache lines (typically 64 bytes - it depends on your processor). The core works on those lines; whole lines are loaded from main memory and written back to it, and whenever the line it needs is not in cache, the core has to wait a long time for it to arrive.
Given that that is how CPUs work, Java needs to 'claim' some rights, because if it didn't, it would run incredibly slowly. One of the rights it claims is reordering and local caching.
A JVM may cache anything you get from the heap (so anything non-local), for example in a register or in the core's on-die cache. Which means if you have 1 field, and 2 threads are simultaneously reading and writing to that field, each thread may actually just be working with its own local copy, and the JVM is non-specific as to when and how this gets merged back into memory - generally, some arbitrary write 'wins' and it's all unreliable.
In other words, if 2 threads are accessing the same bit of heap (the same field), at least one of them writes, and nothing establishes an order between those accesses, your code is broken: Depending on the CPU, architecture, music playing on your music player as you do the operation, or the phase of the moon - you get one result or another. Ouch.
To make threading possible in the first place, the JMM (Java Memory Model) defines certain specific operations that establish 'Happens-Before'. If according to the JMM action A 'happens before' action B, then the B line, and any line that follows it, will not be able to observe state such as it was before A ran. Essentially, if A 'Happens-Before (per the JMM)' B, then A really does happen before B, unless B couldn't possibly observe this, in which case the JVM is still free to run these things in whatever order it wants.
Establishing HB can be done with synchronized, volatile, various peculiar actions (thread.start() is HB relative to the first line inside that thread, for example), and of course by using (core) library functions that themselves do this stuff, such as ConcurrentHashMap and many other things from the j.u.c package.
This is, presumably, what you read and misunderstood: You should not interact with (read or write) any field that another thread that has started and hasn't died yet also interacts with, unless you have carefully managed this access and ensured that all possible interactions are properly guarded by HB/HA relationships, so that your code runs the same way every time and doesn't turn into some crazy evil coin flip game where it depends on the flaps of the proverbial butterfly. If you break the rule and interact with the same field from 2 separate threads without establishing HB/HA anyway - then 'it works' (nothing directly crashes, no exceptions are thrown, the code compiles fine, and will run) - but the results are more or less arbitrary:
class Example {
    int x;
    void crazy() {
        x = 1;
        new Thread(() -> x = 5).start();
        new Thread(() -> x = 10).start();
        System.out.println(x);
    }
}
The above code can legally print 1, 5, or 10. A JVM is working properly and is fulfilling the spec no matter which one of those it happens to return - and the JVM doesn't guarantee it's random either (a JVM that always returns 1 here, is fine. A JVM that always returns 1 except it returns 10 on the 5th monday of the 3rd month - is fine). Code written like this is the problem. There's no way to know, really. It's essentially impossible to test for it (given that a JVM that always returns 1 is also fine here). The vast majority of ways you can run this (combos of JVM vendor, version, OS, architecture, etc) will print '1', but a '10' wouldn't be incorrect.
Here in this specific code, there is no problem at all - wt the reference is written once, and this is HB relative to the thread (because it precedes ta.start()), so accessing the ref isn't an issue as it never changes, and System.currentTimeMillis() doesn't access any fields, so the call to .getTime() isn't problematic either.
Hence, this code works just fine. There is nothing wrong with it.
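For contrast with the Example class above, here is a hedged sketch of one way to make such code deterministic. The class name FixedExample and the use of join() are my additions, not part of the question: start() orders the x = 1 write before the thread's write, and join() orders the thread's write before the final read.

```java
// Hypothetical variant of the Example class, ordered with start() and join().
final class FixedExample {
    private int x;

    int notCrazy() {
        x = 1;
        Thread t = new Thread(() -> x = 5);
        t.start(); // everything before start() happens-before the thread's first action
        try {
            t.join(); // everything the thread did happens-before join() returning
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return x; // 5, unless join() was interrupted before the thread finished
    }

    public static void main(String[] args) {
        System.out.println(new FixedExample().notCrazy());
    }
}
```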

Related

HashMap synchronized `put` but not `get`

I have the following code snippet, and I'm trying to see whether it can crash or misbehave at some point. The HashMap is accessed from multiple threads; put is inside a synchronized block and get is not. Is there any issue with this code? If so, what modification do I need to make to see the failure happen, given that I only use put and get this way and no putAll, clear or other operations are involved?
import java.util.HashMap;
import java.util.Map;
public class Main {
    Map<Integer, String> instanceMap = new HashMap<>();
    public static void main(String[] args) {
        System.out.println("Hello");
        Main main = new Main();
        Thread thread1 = new Thread("Thread 1"){
            public void run(){
                System.out.println("Thread 1 running");
                for (int i = 0; i <= 100; i++) {
                    System.out.println("Thread 1 " + i + "-" + main.getVal(i));
                }
            }
        };
        thread1.start();
        Thread thread2 = new Thread("Thread 2"){
            public void run(){
                System.out.println("Thread 2 running");
                for (int i = 0; i <= 100; i++) {
                    System.out.println("Thread 2 " + i + "-" + main.getVal(i));
                }
            }
        };
        thread2.start();
    }
    private String getVal(int key) {
        check(key);
        return instanceMap.get(key);
    }
    private void check(int key) {
        if (!instanceMap.containsKey(key)) {
            synchronized (instanceMap) {
                if (!instanceMap.containsKey(key)) {
                    // System.out.println(Thread.currentThread().getName());
                    instanceMap.put(key, "" + key);
                }
            }
        }
    }
}
What I have checked out:
Are size(), put(), remove(), get() atomic in Java synchronized HashMap?
Extending HashMap<K,V> and synchronizing only puts
Why does HashMap.get(key) needs to be synchronized when change operations are synchronized?

I somewhat modified your code:
removed System.out.println() from the "hot" loop, it is internally synchronized
increased the number of iterations
changed printing to only print when there's an unexpected value
There's much more we can do and try, but this already fails, so I stopped there. The next step would be to rewrite the whole thing as a jcstress test.
And voila, as expected, sometimes this happens on my Intel MacBook Pro with Temurin 17:
Exception in thread "Thread 2" java.lang.NullPointerException: Cannot invoke "java.lang.Integer.intValue()" because the return value of "java.util.Map.get(Object)" is null
at com.gitlab.janecekpetr.playground.Playground.getVal(Playground.java:35)
at com.gitlab.janecekpetr.playground.Playground.lambda$0(Playground.java:21)
at java.base/java.lang.Thread.run(Thread.java:833)
Code:
import java.util.HashMap;
import java.util.Map;
import java.util.stream.Stream;
public class Playground {
    private record Val(int index, int value) {}
    private static final int MAX = 100_000;
    private final Map<Integer, Integer> instanceMap = new HashMap<>();
    public static void main(String... args) {
        Playground main = new Playground();
        Runnable runnable = () -> {
            System.out.println(Thread.currentThread().getName() + " running");
            Val[] vals = new Val[MAX];
            for (int i = 0; i < MAX; i++) {
                vals[i] = new Val(i, main.getVal(i));
            }
            // Print only the unexpected values
            System.out.println(Stream.of(vals).filter(val -> val.index() != val.value()).toList());
        };
        Thread thread1 = new Thread(runnable, "Thread 1");
        thread1.start();
        Thread thread2 = new Thread(runnable, "Thread 2");
        thread2.start();
    }
    // Unboxes the result of get(); throws NullPointerException if the racy get() returns null
    private int getVal(int key) {
        check(key);
        return instanceMap.get(key);
    }
    // Only the put (and its re-check) is synchronized; the outer containsKey() and the get() are not
    private void check(int key) {
        if (!instanceMap.containsKey(key)) {
            synchronized (instanceMap) {
                if (!instanceMap.containsKey(key)) {
                    instanceMap.put(key, key);
                }
            }
        }
    }
}

To specifically explain the excellent sleuthing work in the answer by @PetrJaneček:
Every field in Java has an evil coin attached to it. Anytime any thread reads the field, it flips this coin. It is not a fair coin - it is evil. It will flip heads 10,000 times in a row if that's going to ruin your day (for example, you may have code that depends on coinflips landing a certain way, or it'll fail to work. The coin is evil: You may run into the situation that just to ruin your day, during all your extensive testing, the coin flips heads, and during the first week in production it's all heads flips too. And then the big new potential customer demos your app and the coin starts flipping some tails on you).
The coinflip decides which variant of the field is used - because every thread may or may not have a local cache of that field. When you write to a field from any thread? Coin is flipped, on tails, the local cache is updated and nothing more happens. Read from any thread? Coin is flipped. On tails, the local cache is used.
That's not really what happens of course (your JVM does not actually have evil coins nor is it out to get you), but the JMM (Java Memory Model), along with the realities of modern hardware, means that this abstraction works very well: It will reliably lead to the right answer when writing concurrent code, namely, that any field that is touched by more than one thread must have guards around it, or must never change at all during the entire duration of the multi-thread access 'session'.
You can force the JVM to flip the coin the way you want, by establishing so-called Happens-Before relationships. This is explicit terminology used by the JMM. If 2 lines of code have a Happens-Before relationship (one is defined as 'happening before' the other, as per the JMM's list of HB relationship establishing actions), then it is not possible (short of a bug in the JVM itself) to observe any side effect of the HA line whilst not also observing all side effects of the HB line. (That is to say: the 'happens before' line happens before the 'happens after' line as far as your code could ever tell, though it's a bit of a Schrödinger's cat situation. If your code doesn't actually look at these fields in a way that you'd ever be able to tell, then the JVM is free to not do that. And it won't, you can rely on the evil coin being evil: If the JMM takes a 'right', there will be some combination of CPU, OS, JVM release, version, and phase of the moon that combine to use it).
A small selection of common HB/HA establishing conditions:
The first line inside a synchronized(lock) block is HA relative to any other thread's earlier exit from a synchronized block on that same lock.
Exiting a synchronized(lock) block is HB relative to any other thread subsequently entering a synchronized(lock) block, assuming the two locks are the same reference.
thread.start() is HB relative to the first line that thread will run.
The 'natural' HB/HA: line X is HB relative to line Y if X and Y are run by the same thread and X is 'before it' in your code. You can't write x = 5; y = x; and have y be set by a version of x that did not witness the x = 5 happening (of course, if another thread is also modifying x, all bets are off unless you have HB/HA with whatever line is doing that).
A write to a volatile field is HB relative to any later read of that same field that observes the write, but you usually can't get any guarantees about which of the two comes first.
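As a rough illustration of the synchronized rule in the list above, here is a minimal sketch. The Handoff class and its method names are hypothetical, invented for this example. Both fields are plain (non-volatile); they are safe only because every access happens while holding the same lock.

```java
// Hypothetical example: passing one value between threads using only synchronized.
final class Handoff {
    private final Object lock = new Object();
    private int payload;   // plain fields: only ever touched while holding lock
    private boolean ready;

    void publish(int value) {
        synchronized (lock) {
            payload = value;
            ready = true;
        } // exiting this block is HB relative to a later entry on the same lock
    }

    // Returns the payload, or null if nothing has been published yet.
    Integer poll() {
        synchronized (lock) {
            if (!ready) {
                return null;
            }
            return payload;
        }
    }

    static int demo() {
        Handoff handoff = new Handoff();
        new Thread(() -> handoff.publish(42)).start();
        Integer seen;
        // Busy-waiting is fine for a demo, not for production code.
        while ((seen = handoff.poll()) == null) {
            Thread.yield();
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

If poll() skipped the synchronized block, the loop would have no HB/HA relationship with publish() and would be back at the mercy of the coin.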
This explains the way your code may fail: The get() call establishes absolutely no HB/HA relationship with the other thread that is calling put(), and therefore the get() call may or may not use locally cached variants of the various fields that HashMap uses internally, depending on the evil coin (which is of course hitting some fields; it'll be private fields in the HashMap implementation someplace, so you don't know which ones, but HashMap obviously has long-lived state, which implies fields are involved).
So why haven't you actually managed to 'see' your code asplode like the JMM says it will? Because the coin is EVIL. You cannot rely on this line of reasoning: "I wrote some code that should fail if the synchronizations I need aren't happening the way I want. I ran it a whole bunch of times, and it never failed, therefore, apparently this code is concurrency-safe and I can use this in my production code". That is simply not ever actually reliable. That's why you need to be thinking: Evil! That coin is out to get me! Because if you don't, you may be tempted to write test code like this.
You should be scared of writing code where more than one thread interacts with the same field. You should be bending over backwards to avoid it. Use message queues. Do your chat between threads by using databases, which have much nicer primitives for this stuff (transactions and isolation levels). Rewrite the code so that it takes a bunch of params up front and then runs without interacting with other threads via fields at all, until it is all done, and it then returns a result (and then use e.g. fork/join framework to make it all work). Make your webserver performant and using all the cores simply by relying on the fact that every incoming request will be its own thread, so the only thing that needs to happen for you to use all the cores is for that many folks to hit your server at the same time. If you don't have enough requests, great! Your server isn't busy so it doesn't matter you aren't using all the cores.
If truly you decide that interacting with the same field from multiple threads is the right answer, you need to think NASA programming mars rovers on the lines that interact with those fields, because tests simply cannot be relied upon. It's not as hard as it sounds - especially if you keep the actual interacting with the relevant fields down to a minimum and keep thinking: "Have I established HB/HA"?
In this case, I think Petr figured it out correctly: System.out.println is hella slow and does various synchronizing actions. JMM is a package deal, and transitive: Once HB/HA is established, everything the HB line changed is observable to the code in the HA line. Add in the natural rule, and all code that follows the HA line cannot possibly observe a universe where something any line before the HB line did is not yet visible. In other words, the System.out.println statements HB/HA with each other in some order, but you can't rely on that (System.out is not specced to synchronize. But, just about every implementation does. You should not rely on implementation details, and I can trivially write you some Java code that is legal, compiles, runs, and breaks no contracts, yet does not synchronize when interacting with System.out, because you can replace System.out with System.setOut!). The evil coin in this case took the shape of 'accidental' synchronization via intentionally unspecced behaviour of System.out.

The following explanation is more in line with the terminology used in the JMM. Could be useful if you want a more solid understanding of this topic.
2 Actions are conflicting when they access the same address and there is at least 1 write.
2 Actions are concurrent when they are not ordered by a happens-before relation (there is no happens-before edge between them).
2 Actions are in data race when they are conflicting and concurrent.
When there are data races in your program, weird problems can happen like unexpected reordering of instructions, visibility problems, or atomicity problems.
So what makes up the happens-before relation? If a volatile read observes a particular volatile write, then there is a happens-before edge between the write and the read. This means that the read will not only see that write, but also everything that happened before that write. There are other sources of happens-before edges, like the release of a monitor and a subsequent acquire of the same monitor. And there is a happens-before edge between A and B when A occurs before B in the program order. Note: the happens-before relation is transitive, so if A happens-before B and B happens-before C, then A happens-before C.
In your case, you have get/put operations which are conflicting, since they access the same address(es) and there is at least 1 write.
The put/get actions are concurrent, since there is no happens-before edge between the write and the read: even though the write releases the monitor, the get doesn't acquire it.
Since the put/get operations are concurrent and conflicting, they are in data race.
The simplest way to fix this problem is to execute the map.get in a synchronized block (using the same monitor). This introduces the desired happens-before edge and makes the actions sequential instead of concurrent; as a consequence, the data race disappears.
A better-performing solution would be to make use of a ConcurrentHashMap. Instead of a single central lock, it uses fine-grained synchronization, so many threads can work on it concurrently, which improves scalability and performance. I'm not going to dig into the optimizations of the ConcurrentHashMap because that would create confusion.
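A minimal sketch of the ConcurrentHashMap variant, assuming the same "insert the key's own string on first access" behaviour as the question. The class name SafeLookup and the countMismatches helper are made up for this example. computeIfAbsent performs the check and the insert as one atomic step, so the separate check() method is no longer needed.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical rewrite of the question's getVal/check pair.
final class SafeLookup {
    private final Map<Integer, String> instanceMap = new ConcurrentHashMap<>();

    // Atomically inserts the value if the key is missing, then returns the mapped value.
    String getVal(int key) {
        return instanceMap.computeIfAbsent(key, k -> "" + k);
    }

    // Runs two threads over the same keys and counts unexpected results.
    static int countMismatches(int max) {
        SafeLookup lookup = new SafeLookup();
        AtomicInteger mismatches = new AtomicInteger();
        Runnable task = () -> {
            for (int i = 0; i < max; i++) {
                if (!String.valueOf(i).equals(lookup.getVal(i))) {
                    mismatches.incrementAndGet();
                }
            }
        };
        Thread t1 = new Thread(task, "Thread 1");
        Thread t2 = new Thread(task, "Thread 2");
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return mismatches.get();
    }

    public static void main(String[] args) {
        System.out.println("mismatches: " + countMismatches(100_000));
    }
}
```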
[Edit]
Apart from a data-race, your code also suffers from race conditions.

java instance variable not visible to other threads

I've encountered this code in a book. It states that NoVisibility could loop forever because the value of ready might never become visible to the reader thread.
I'm confused by this statement. In order for the loop to run forever, ready must always be false, which is the default value. This means it must fail at executing ready = true; because the reader thread will always read the ready variable from memory. The assignment happens in the CPU, so it must have some problem flushing the data back to main memory. I think I need an explanation of a situation in which it can fail, or I may have missed something else.
public class NoVisibility {
    private static boolean ready;
    private static int number;
    private static class ReaderThread extends Thread {
        public void run() {
            while (!ready)
                Thread.yield();
            System.out.println(number);
        }
    }
    public static void main(String[] args) {
        new ReaderThread().start();
        number = 42;
        ready = true;
    }
}

Your understanding is flawed. You are assuming that Java will behave intuitively here. In fact, it may not. And, indeed, the Java Language specification allows non-intuitive behavior if you don't follow the rules.
To be more specific, in your example it is NOT GUARANTEED that the second thread will see the results of the first thread's assignment to ready [1]. This is due to such things as:
The compiler caching the value of ready in a register in the first or second thread.
The compiler not including instructions to force the write to be flushed from one core's memory cache to main memory, or similar.
If you want a guarantee that the second thread will see the result of the write then either reads and writes of ready by the two threads must be (properly) synchronized, or the ready variable must be declared to be volatile.
So ...
This means it must fail at executing ready = true; because the reader thread will always read the ready variable from memory.
is incorrect. The "because" is not guaranteed by the Java language specification in this example.
Yes. It is nonintuitive. Intuition based on your understanding of single-threaded programs is not reliable here. If you want to understand what is and is not guaranteed, please study the specification of the "Java Memory Model" in Section 17.4 of the JLS.
In short, the book is correct.
[1] It might see the results immediately, or after a short or long delay. Or it might never see them. And the behavior is liable to vary from one system to the next, and with versions of the Java platform. So your program that (by luck) works all of the time on one system may not always work on another system.
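A sketch of the volatile fix mentioned above, restructured so that it returns the observed value instead of printing it. The class name Visibility and the awaitNumber method are my own additions. Only ready needs to be volatile: the write to number comes before the volatile write in program order, so a reader that sees ready == true is also guaranteed to see number == 42.

```java
// Hypothetical variant of NoVisibility with a volatile flag.
final class Visibility {
    private static volatile boolean ready; // volatile: the reader will see the write
    private static int number;             // plain field, published via the volatile write

    static int awaitNumber() {
        ready = false; // reset so the method can be called more than once
        number = 0;
        int[] seen = new int[1];
        Thread reader = new Thread(() -> {
            while (!ready) {
                Thread.yield();
            }
            seen[0] = number; // ordered after number = 42 by the volatile read of ready
        });
        reader.start();
        number = 42;
        ready = true; // volatile write: everything written before it becomes visible
        try {
            reader.join(); // also makes seen[0] visible to this thread
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(awaitNumber());
    }
}
```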

The value of ready may be updated, but the other thread may never know about it. That is what volatile variables are for! Without volatile, the JVM may assume that the variable is used by this one thread only, so the thread can keep working with a value it has already read (held in a register, for example) instead of re-reading the field.
private static volatile boolean ready;
What volatile does is tell the JVM to read the variable from shared memory on every access, not from such a cached copy.
Actually, what the JIT compiler is allowed to do is translate:
while (flag) {...}
To:
if (flag) {
    while (true) {...}
}
Because nothing inside the loop changes flag, the value is read once up front and never re-checked, so a later write by another thread goes unnoticed.
This is what I have understood; correct me if I am wrong!

Concurrent variable modification: cannot fully understand this example

I need some help to fully understand what's happening when running this code:
public class Main extends Thread {
    private static int x;
    public static void main(String[] args) {
        Thread th1 = new Main("A");
        Thread th2 = new Main("B");
        th1.start();
        th2.start();
    }
    public Main(String n) {
        super(n);
    }
    public void run() {
        while(x<4) { //1
            x++; //2
            System.out.print(Thread.currentThread().getName()+x+" "); //3
        }
    }
}
I get the output
B2 B3 B4 A2
I understand that threads A and B both increment x, then B loops incrementing and outputting... but why is last output A2? Shouldn't A see x as 4 when executing //3?
Bonus question: why is it impossible for x to become 5?
EDIT
This question (in a slightly different form) comes from a mock test for OCP certification, where explanation states that x will never be 5. I'm glad to see that I'm not the only one to disagree.

When you update a variable's value in one thread, its value is not necessarily visible to all threads immediately. This is because memory is held in the CPU cache, which allows it to be read and written much more quickly than it would be to main memory.
Periodically, the updated contents of the cache are copied to main memory. It is only when this happens that other threads see updates to values.
What it looks like is happening here is that B is updating the value, but that value is not being committed to main memory; as such, A sees old values of it.
If you make the variable volatile, all reads and writes are done directly from/to main memory (or, at least, the cache is refreshed from/flushed to main memory), so updates to the values are visible immediately to all threads.
Note, however, that you are not performing atomic reads and writes: it is possible for another thread to update the value of x in between the current thread checking x < 4 and incrementing x++. As such, you might end up with a value of 5 being printed.
The easiest way to fix this is to make the checking/incrementing synchronized:
synchronized (Main.class) {
    if (x < 4) {
        x++;
        System.out.println(...);
    }
}
This also has the effect of ensuring visibility of updates to x in all threads, but also ensures that only one thread can check/increment x at once.

This is a classic race condition. Calling th1.start() and th2.start() only schedules the threads to start; it does not run them then and there, in sequence. As a result, your threads can and do start in any old order. Add to that the fact that between while (x<4), x++ and System.out.print any one of the threads can be descheduled to let another thread run, and you basically get unpredictable behavior.
Bonus question: why is it impossible for x to become 5?
It's not impossible (for the same reason the output is interleaved). Try increasing your number of threads and eventually you'll see x become 5 and maybe even higher depending on how much thread contention you can create.
I disagree with others that this is a volatility issue. Rather this is a shared memory access issue. Using volatile alone will not fix this. A simple mutex around the static x variable access will properly protect it and sequence how you expect with the exception of the order of 'A' vs. 'B' which would require additional synchronization.
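One way to sketch the "simple mutex" idea: make the check and the increment a single step under one lock. The names BoundedCounter, incrementIfBelow and runThreads are hypothetical. With this structure x cannot pass the limit no matter how many threads compete, although which thread performs which increment is still up to the scheduler.

```java
// Hypothetical example: check-then-increment guarded by a single lock.
final class BoundedCounter {
    private static final Object LOCK = new Object();
    private static int x;

    // Checks and increments as one indivisible step.
    // Returns the new value, or -1 once the limit has been reached.
    static int incrementIfBelow(int limit) {
        synchronized (LOCK) {
            if (x < limit) {
                return ++x;
            }
            return -1;
        }
    }

    // Lets threadCount threads race to increment x; returns the final value of x.
    static int runThreads(int threadCount, int limit) {
        synchronized (LOCK) {
            x = 0;
        }
        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            threads[i] = new Thread(() -> {
                while (incrementIfBelow(limit) != -1) {
                    // keep incrementing until the limit is reached
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        synchronized (LOCK) {
            return x;
        }
    }

    public static void main(String[] args) {
        System.out.println("final x = " + runThreads(8, 4));
    }
}
```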

You, my friend, have run into what is called a Data Race.
Wikipedia has an example depicting exactly what you are going through:
https://en.wikipedia.org/wiki/Race_condition.
So, why is this happening?
The reason lies hidden in the way a computer processes instructions. Take, for example, the following line of Java code:
x++;
Now, ignoring compiler magic for the moment, we have to think what the computer needs to do to execute this instruction.
We need to read the old value of x.
We need to perform the addition x + 1.
We need to write the new value back into the variable x.
This works wonderfully when just looking at it from a sequential standpoint. But what happens if two people are doing the exact same thing, at the same time?
See the Wikipedia example for exact answers.
The important thing to note here is that your single x++ instruction is actually multiple instructions for a computer. Even if each instruction can be carried out atomically by the processor, you are not guaranteed atomicity for the whole sequence of instructions.
The same holds true for using the variable x. When you are calling the System.out.println() function, you are once again accessing x. This access means that we have to read x from memory again.
Do we know what B has done to the variable from the time you changed it?
Nope.
Also, I noticed the volatile comment. volatile alone does not fix this (as confirmed by running the code on my computer). It ensures that individual reads and writes of the variable are visible to other threads and never jumbled; it does not make a compound operation such as x++ atomic.
Bonus question: why is it impossible for x to become 5?
It is very possible, although perhaps unlikely. The part of your program that takes time is the work and synchronization done inside your System.out.println() statement. This is probably why you do not see the value 5 often.
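The read/add/write breakdown of x++ described above can be made visible with a small experiment. This is only a sketch, and the class IncrementDemo is invented for it: two threads increment a plain int and an AtomicInteger the same number of times. The atomic counter always ends at the exact total; the plain counter may end lower because increments can overwrite each other, though on a given run it may also happen to be correct.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical experiment: racy x++ versus an atomic increment.
final class IncrementDemo {
    static int plain; // plain++ is read, add, write: three separate steps
    static final AtomicInteger atomic = new AtomicInteger();

    // Returns { final plain value, final atomic value } after two threads did perThread increments each.
    static int[] run(int perThread) {
        plain = 0;
        atomic.set(0);
        Runnable work = () -> {
            for (int i = 0; i < perThread; i++) {
                plain++;                  // racy: updates can be lost
                atomic.incrementAndGet(); // one indivisible read-modify-write
            }
        };
        Thread a = new Thread(work, "A");
        Thread b = new Thread(work, "B");
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new int[] { plain, atomic.get() };
    }

    public static void main(String[] args) {
        int[] result = run(100_000);
        System.out.println("plain = " + result[0] + ", atomic = " + result[1]);
    }
}
```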

Your variable x is static so it is shared between both threads.
Thread B increments x to 4 and completes, writing each step as it goes.
Thread A gets one chance to look at x when it is at 1 so increments it and prints A2. The next time it sees x it is at >= 4 so it exits its loop.
Bonus question - yes it is possible for x to become 5 - and even print as 5. If both threads check x<4 when it happens to be 3 at the same time they will both increment it.

First: start() is an asynchronous method call, so one of the threads will start running before the other.
Second: x is static, but each thread may work on its own locally cached copy of it. The first running thread changes x while the second is still waiting (while it waits, the second thread holds a stale local value of x, which it will use once it wakes up).
After that, once the second thread has printed its local x, it looks x up in (global) memory again, finds it equal to 4, and stops.
This may help:
|------------------------------------------------------------------------------------------|
| Thread A:x works |local| big static X that changed . . . . . . . . . . . . . . ..|
| Thread B:x=2 sleep |local| big static X that will be read after first loop.|
|------------------------------------------------------------------------------------------|
So, in effect, x behaves as both a thread-local cached value and a shared global at the same time.
Proof: change the condition to x<10, add a sleep with a random duration after the increment, and look at the results (don't forget the try/catch clause).

Interpretation of "program order rule" in Java concurrency

Program order rule states "Each action in a thread happens-before every action in that thread that comes later in the program order"
1. I read in another thread that an action is one of:
reads and writes to variables
locks and unlocks of monitors
starting and joining with threads
Does this mean that reads and writes can be changed in order, but reads and writes cannot change order with actions specified in 2nd or 3rd lines?
2. What does "program order" mean?
An explanation with examples would be really helpful.
Additional related question
Suppose I have the following code:
long tick = System.nanoTime(); //Line1: Note the time
//Block1: some code whose time I wish to measure goes here
long tock = System.nanoTime(); //Line2: Note the time
Firstly, it's a single threaded application to keep things simple. Compiler notices that it needs to check the time twice and also notices a block of code that has no dependency with surrounding time-noting lines, so it sees a potential to reorganize the code, which could result in Block1 not being surrounded by the timing calls during actual execution (for instance, consider this order Line1->Line2->Block1). But, I as a programmer can see the dependency between Line1,2 and Block1. Line1 should immediately precede Block1, Block1 takes a finite amount of time to complete, and immediately succeeded by Line2.
So my question is: Am I measuring the block correctly?
If yes, what is preventing the compiler from rearranging the order.
If no (which I think is correct after going through Enno's answer), what can I do to prevent it?
P.S.: I stole this code from another question I asked in SO recently.

It probably helps to explain why such a rule exists in the first place.
Java is an imperative language, i.e. you tell Java how to do something for you. If Java did not execute your instructions in the order you wrote them, it would obviously not work. E.g. in the example below, if Java did 2 -> 1 -> 3, the stew would be ruined.
1. Take lid off
2. Pour salt in
3. Cook for 3 hours
So, why does the rule not simply say "Java executes what you wrote in the order you wrote"? In a nutshell, because Java is clever. Take the following example:
1. Take eggs out of the freezer
2. Take lid off
3. Take milk out of the freezer
4. Pour egg and milk in
5. Cook for 3 hours
If Java were like me, it would just execute the steps in order. However, Java is clever enough to understand that it's more efficient, AND that the end result would be the same, should it do 1 -> 3 -> 2 -> 4 -> 5 (you don't have to walk to the freezer again, and that doesn't change the recipe).
So what the rule "Each action in a thread happens-before every action in that thread that comes later in the program order" is trying to say is: "In a single thread, your program will run as if it was executed in the exact order you wrote it. We might change the ordering behind the scenes, but we make sure that none of that changes the output."
So far so good. Why does it not do the same across multiple threads? In multi-thread programming, Java isn't clever enough to do it automatically. It will for some operations (e.g. joining threads, starting threads, when a lock (monitor) is used etc.) but for other stuff you need to explicitly tell it to not do reordering that would change the program output (e.g. volatile marker on fields, use of locks etc.).
Note:
Quick addendum about the "happens-before relationship". This is a fancy way of saying that no matter what reordering Java might do, A will happen before B. In the second stew example above, steps 1 and 3 happen-before step 4 ("Pour egg and milk in"). Steps 1 and 3, on the other hand, do not need a happens-before relationship with each other because they don't depend on each other in any way.
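The thread start and join guarantees mentioned above can be sketched as follows. This is only a minimal illustration, not code from the question: the class name StartJoinDemo and its fields are made up for the example. It relies on two rules from the Java memory model: a call to Thread.start() happens-before every action in the started thread, and every action in a thread happens-before another thread's successful return from join() on it. No volatile or lock is needed here.

```java
// Hypothetical demo class; the names are illustrative only.
class StartJoinDemo {
    // Plain (non-volatile) fields: visibility comes from start() and join().
    private int input;
    private int output;

    int run() {
        input = 21;                        // written before start()
        Thread worker = new Thread(() -> {
            output = input * 2;            // sees input == 21 thanks to the start rule
        });
        worker.start();                    // start() happens-before the worker's actions
        try {
            worker.join();                 // the worker's actions happen-before join() returns
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return output;                     // sees the worker's write thanks to the join rule
    }

    public static void main(String[] args) {
        System.out.println(new StartJoinDemo().run());
    }
}
```

Without the join() call, the read of output in run() would be a data race and could legally return 0.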
On the additional question & response to the comment
First, let us establish what "time" means in the programming world. In programming, we have the notion of "absolute time" (what's the time in the world now?) and the notion of "relative time" (how much time has passed since x?). In an ideal world, time is time, but unless we have an atomic clock built in, the absolute time has to be corrected from time to time. On the other hand, for relative time we don't want corrections, as we are only interested in the differences between events.
In Java, System.currentTimeMillis() deals with absolute time and System.nanoTime() deals with relative time. This is why the Javadoc of nanoTime states, "This method can only be used to measure elapsed time and is not related to any other notion of system or wall-clock time".
In practice, both currentTimeMillis and nanoTime are native calls, so the compiler can't practically prove that a reordering won't affect correctness, which means it will not reorder the execution.
But let us imagine we want to write a compiler implementation that actually looks into native code and reorders everything as long as it's legal. When we look at the JLS, all that it tells us is that "You can reorder anything as long as it cannot be detected". Now as the compiler writer, we have to decide if the reordering would violate the semantics. For relative time (nanoTime), it would clearly be useless (i.e. violates the semantics) if we'd reorder the execution. Now, would it violate the semantics if we'd reorder for absolute time (currentTimeMillis)? As long as we can limit the difference from the source of the world's time (let's say the system clock) to whatever we decide (like "50ms")*, I say no. For the below example:
long tick = System.currentTimeMillis();
result = compute();
long tock = System.currentTimeMillis();
print(result + ":" + (tock - tick));
If the compiler can prove that compute() takes less than whatever maximum divergence from the system clock we can permit, then it would be legal to reorder this as follows:
long tick = System.currentTimeMillis();
long tock = System.currentTimeMillis();
result = compute();
print(result + ":" + (tock - tick));
Doing that won't violate the spec we defined, and thus won't violate the semantics.
You also asked why this is not included in the JLS. I think the answer would be "to keep the JLS short". But I don't know much about this realm so you might want to ask a separate question for that.
*: In actual implementations, this difference is platform dependent.

The program order rule guarantees that, within individual threads, reordering optimizations introduced by the compiler cannot produce different results from what would have happened if the program had been executed in serial fashion. It makes no guarantees about what order the thread's actions may appear to occur in to any other threads if its state is observed by those threads without synchronization.
Note that this rule speaks only to the ultimate results of the program, and not to the order of individual executions within that program. For instance, if we have a method which makes the following changes to some local variables:
x = 1;
z = z + 1;
y = 1;
The compiler remains free to reorder these operations however it sees best fit to improve performance. One way to think of this is: if you could reorder these ops in your source code and still obtain the same results, the compiler is free to do the same. (And in fact, it can go even further and completely discard operations which are shown to have no results, such as invocations of empty methods.)
With your second bullet point, the monitor lock rule comes into play: "An unlock on a monitor lock happens-before every subsequent lock on that same monitor lock." (Java Concurrency in Practice p. 341) This means that a thread acquiring a given lock will have a consistent view of the actions which occurred in other threads before they released that lock. However, note that this guarantee only applies when the two threads release and acquire the same lock. If Thread A does a bunch of stuff before releasing Lock X, and then Thread B acquires Lock Y, Thread B is not assured of a consistent view of A's pre-X actions.
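A minimal sketch of the monitor lock rule, assuming a made-up class MonitorLockDemo with a counter guarded by a single lock object (lockX is a hypothetical name chosen to match the "Lock X" wording above). Because both threads lock and unlock the same monitor, each increment sees the result of the previous one.

```java
// Hypothetical demo class; the names are illustrative only.
class MonitorLockDemo {
    private final Object lockX = new Object();
    private int counter;                   // guarded by lockX

    void increment() {
        synchronized (lockX) {             // lock on X: sees writes made before the last unlock of X
            counter++;
        }                                  // unlock on X: publishes the new counter value
    }

    int get() {
        synchronized (lockX) {             // must be the SAME lock to get the guarantee
            return counter;
        }
    }

    // Runs two threads that each increment the shared counter perThread times.
    static int runTwoThreads(int perThread) {
        MonitorLockDemo demo = new MonitorLockDemo();
        Runnable task = () -> {
            for (int i = 0; i < perThread; i++) {
                demo.increment();
            }
        };
        Thread a = new Thread(task);
        Thread b = new Thread(task);
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return demo.get();
    }

    public static void main(String[] args) {
        System.out.println(runTwoThreads(10_000));
    }
}
```

If increment() and get() synchronized on two different objects, the rule would no longer apply and updates could be lost or remain invisible.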
It is possible for reads and writes to variables to be reordered with start and join if a.) doing so doesn't break within-thread program order, and b.) the variables have not had other "happens-before" thread synchronization semantics applied to them, say by storing them in volatile fields.
A simple example:
class ThreadStarter {
Object a = null;
Object b = null;
Thread thread;
ThreadStarter(Thread threadToStart) {
this.thread = threadToStart;
}
public void aMethod() {
a = new BeforeStartObject();
b = new BeforeStartObject();
thread.start();
a = new AfterStartObject();
b = new AfterStartObject();
a.doSomeStuff();
b.doSomeStuff();
}
}
Since the fields a and b and the method aMethod() are not synchronized in any way, and the action of starting thread does not change the results of the writes to the fields (or the doing of stuff with those fields), the compiler is free to reorder thread.start() to anywhere in the method. The only thing it could not do with the order of aMethod() would be to move the order of writing one of the BeforeStartObjects to a field after writing an AfterStartObject to that field, or to move one of the doSomeStuff() invocations on a field before the AfterStartObject is written to it. (That is, assuming that such reordering would change the results of the doSomeStuff() invocation in some way.)
The critical thing to bear in mind here is that, in the absence of synchronization, the thread started in aMethod() could theoretically observe either or both of the fields a and b in any of the states which they take on during the execution of aMethod() (including null).
Additional question answer
The assignments to tick and tock cannot be reordered with respect to the code in Block1 if they are to be actually used in any measurements, for example by calculating the difference between them and printing the result as output. Such reordering would clearly break Java's within-thread as-if-serial semantics. It changes the results from what would have been obtained by executing instructions in the specified program order. If the assignments aren't used for any measurements and have no side-effects of any kind on the program result, they'll likely be optimized away as no-ops by the compiler rather than being reordered.
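A minimal sketch of such a measurement, assuming a made-up method block1 that stands in for the code being timed. Both the computed result and the difference between tock and tick are printed, so neither the timed work nor the timing calls are unused. Treat the number as a rough figure only: JIT warm-up can still skew a single measurement like this.

```java
// Hypothetical demo class; block1 is a stand-in for the code under measurement.
class TimingDemo {
    // Sums the integers 0 .. n-1; the returned value is used by the caller.
    static long block1(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += i;
        }
        return sum;
    }

    public static void main(String[] args) {
        long tick = System.nanoTime();     // Line1: note the time
        long result = block1(1_000_000);   // Block1: the code being measured
        long tock = System.nanoTime();     // Line2: note the time
        // Printing both values makes them part of the program's observable output.
        System.out.println(result + " computed in " + (tock - tick) + " ns");
    }
}
```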

Before I answer the question,
reads and writes to variables
Should be
volatile reads and volatile writes (of the same field)
Program order doesn't guarantee this happens before relationship, rather the happens-before relationship guarantees program order
To your questions:
Does this mean that reads and writes can be changed in order, but reads and writes cannot change order with actions specified in 2nd or 3rd lines?
The answer actually depends on what action happens first and what action happens second. Take a look at the JSR 133 Cookbook for Compiler Writers. There is a Can Reorder grid that lists the allowed compiler reordering that can occur.
For instance, a Normal Store that follows a Volatile Store can be reordered above it (though a Normal Store that precedes a Volatile Store cannot be moved below it), but a Volatile Store cannot be reordered above or below a Volatile Load. This is all assuming intra-thread semantics still hold.
What does "program order" mean?
This is from the JLS
Among all the inter-thread actions performed by each thread t, the
program order of t is a total order that reflects the order in which
these actions would be performed according to the intra-thread
semantics of t.
In other words, if you can change the writes and loads of a variable in such a way that the code performs exactly the same way as you wrote it, then it maintains program order.
For instance
public static Object getInstance(){
if(instance == null){
instance = new Object();
}
return instance;
}
Can be reordered to
public static Object getInstance(){
Object temp = instance;
if(instance == null){
temp = instance = new Object();
}
return temp;
}

It simply means that although the threads may be multiplexed, the internal order of each thread's actions/operations/instructions remains constant (relative to one another).
thread1: T1op1, T1op2, T1op3...
thread2: T2op1, T2op2, T2op3...
Although the order of operations (TnopM) across threads may vary, the operations T1op1, T1op2, T1op3 within a thread will always be in this order, and so will T2op1, T2op2, T2op3.
For example:
T2op1, T1op1, T1op2, T2op2, T2op3, T1op3
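The interleaving above can be observed with a small sketch. The class InterleavingDemo and its helper inOrder are made up for this illustration. Each thread appends its three operations to a shared synchronized list, so the list records one possible interleaving. The exact interleaving differs from run to run, but each thread's own entries always appear in its program order.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical demo class; the names are illustrative only.
class InterleavingDemo {
    // Runs two threads that each log three operations and returns the combined log.
    static List<String> runOnce() {
        List<String> log = Collections.synchronizedList(new ArrayList<>());
        Thread t1 = new Thread(() -> {
            for (int i = 1; i <= 3; i++) {
                log.add("T1op" + i);
            }
        });
        Thread t2 = new Thread(() -> {
            for (int i = 1; i <= 3; i++) {
                log.add("T2op" + i);
            }
        });
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return log;
    }

    // True if prefix1, prefix2, prefix3 appear in that relative order in the log.
    static boolean inOrder(List<String> log, String prefix) {
        return log.indexOf(prefix + "1") < log.indexOf(prefix + "2")
            && log.indexOf(prefix + "2") < log.indexOf(prefix + "3");
    }

    public static void main(String[] args) {
        System.out.println(runOnce());
    }
}
```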

Java tutorial http://docs.oracle.com/javase/tutorial/essential/concurrency/memconsist.html says that happens-before relationship is simply a guarantee that memory writes by one specific statement are visible to another specific statement. Here is an illustration
int x;
synchronized void x() {
x += 1;
}
synchronized void y() {
System.out.println(x);
}
synchronized creates a happens-before relationship. If we remove it, there is no guarantee that after thread A increments x, thread B will print 1; it may print 0.

visibility of immutable object after publication

I have an immutable object, which is encapsulated in a class and is global state.
Let's say I have 2 threads that get this state and execute myMethod(state) with it. And let's say thread1 finishes first. It modifies the global state by calling GlobalStateCache.updateState(state, newArgs);
GlobalStateCache {
MyImmutableState state = MyImmutableState.newInstance(null, null);
public void updateState(State currentState, Args newArgs){
state = MyImmutableState.newInstance(currentState, newArgs);
}
}
So thread1 will update the cached state, then thread2 does the same, and it will override the state (without taking into account the state updated by thread1).
I searched Google and the Java specifications and read Java Concurrency in Practice, but this is clearly not specified.
My main question is: will the new value of the immutable state object be visible to a thread which has already read the immutable state? I think it will not see the changed state; only reads after the update will see it.
So I cannot understand when to use immutable objects. Does this depend on whether I am OK with concurrent modifications while I work with the latest state I have seen, without needing to update the state?

Publication seems to be somewhat tricky concept, and the way it's explained in java concurrency in practice didn't work well to me (as opposed to many other multithreading concepts explained in this wonderful book).
With above in mind, let's first get clear on some simpler parts of your question.
When you state "lets say thread1 finish first", how would you know that? Or, to be more precise, how would thread2 "know" that? As far as I can tell, this is only possible with some sort of synchronization, explicit or not-so-explicit as in thread join (see JLS 17.4.5, Happens-before Order). The code you provided so far does not give sufficient detail to tell whether this is the case.
When you state that "thread1 will update the cached state", how would thread2 "know" that? With the piece of code you provided, it looks entirely possible (but not guaranteed, mind you) for thread2 to never know about this update.
When you state "thread2... will override the state", what does override mean here? Nothing in the GlobalStateCache code example guarantees that thread1 will ever notice this override. What's more, nothing in the code provided imposes a happens-before relation between updates from different threads, so one can even speculate that the override may happen the other way around, you see?
Last but not least, the wording "the immutable state" sounds rather fuzzy to me; I would say dangerously fuzzy given this tricky subject. The field state is mutable: it can be changed by invoking the updateState method, right? From your code I would rather conclude that instances of the MyImmutableState class are assumed to be immutable; at least that's what the name tells me.
With all above said, what is guaranteed to be visible with the code you provided so far? Not much I'm afraid... but maybe better than nothing at all. The way I see it is...
For thread1, it is guaranteed that prior to invoking updateState it will see either null or properly constructed (valid) object updated from thread2. After the update, it is guaranteed to see either of properly constructed (valid) objects updated from thread1 or thread2. Note after this update thread1 is guaranteed not to see null per the very JLS 17.4.5 I refer to above ("...x and y are actions of the same thread and x comes before y in program order...")
For thread2, guarantees are pretty similar to above.
Essentially, all that is guaranteed with the code you provided is that both threads will see either null or one of properly constructed (valid) instances of MyImmutableState class.
Above guarantees may look insignificant at the first glance, but if you skim one page above the one with quote that confused you ("Immutable objects can be used safely etc..."), you'll find an example worth deeper drilling into in 3.5.1. Improper Publication: When Good Objects Go Bad.
Yeah object being immutable alone won't guarantee its visibility but it at least will guarantee that the object won't "explode from inside", like in example provided in 3.5.1:
public class Holder {
private int n;
public Holder(int n) { this.n = n; }
public void assertSanity() {
if (n != n)
throw new AssertionError("This statement is false.");
}
}
Goetz's comments on the above code begin by explaining issues that hold for both mutable and immutable objects: "...we say the Holder was not properly published. Two things can go wrong with improperly published objects. Other threads could see a stale value for the holder field, and thus see a null reference or other older value even though a value has been placed in holder..."
...then he dives into what can happen if the object is mutable: "...But far worse, other threads could see an up-to-date value for the holder reference, but stale values for the state of the Holder. To make things even less predictable, a thread may see a stale value the first time it reads a field and then a more up-to-date value the next time, which is why assertSanity can throw AssertionError."
Above "AssertionHorror" may sound counter-intuitive but all the magic goes away if you consider scenario like below (completely legal per Java 5 memory model - and for a good reason btw):
1. thread1 invokes sharedHolderReference = new Holder(42);
2. thread1 first fills the n field with its default value (0) and is about to assign it within the constructor, but the scheduler switches to thread2
3. sharedHolderReference from thread1 becomes visible to thread2 because, well, why not? Maybe the optimizing HotSpot compiler decided it's a good time for that
4. thread2 reads the up-to-date sharedHolderReference, with the field value still being 0
5. thread2 invokes sharedHolderReference.assertSanity()
6. thread2 reads the left side value of the if statement within assertSanity, which is 0, and is about to read the right side value, but the scheduler switches back to thread1
7. thread1 completes the constructor assignment suspended at step #2 above by setting the n field to 42
8. the value 42 in field n from thread1 becomes visible to thread2 because, well, why not? Maybe the optimizing HotSpot compiler decided it's a good time for that
9. then, at some later moment, the scheduler switches back to thread2
10. thread2 proceeds from where it was suspended at step #6 above, i.e. it reads the right-hand side of the if statement, which is now 42
11. oops, our innocent if (n != n) suddenly turns into if (0 != 42), which naturally throws AssertionError
As far as I understand, initialization safety for immutable objects just guarantees that above won't happen - no more... and no less
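One way to obtain that guarantee, sketched here under the assumption that the only change to Holder is declaring n final (the class name SafeHolder and the value() accessor are made up for the example). Under the final field semantics of JLS 17.5, a thread that obtains a reference to a fully constructed object is guaranteed to see the constructor-assigned value of its final fields, even if the reference itself was published through a data race.

```java
// Hypothetical variant of Holder; only the final modifier and value() are new.
class SafeHolder {
    private final int n;                   // final: covered by initialization safety

    SafeHolder(int n) {
        this.n = n;                        // "this" must not escape during construction
    }

    void assertSanity() {
        // With a final field, both reads of n observe the constructor-assigned value.
        if (n != n) {
            throw new AssertionError("This statement is false.");
        }
    }

    int value() {
        return n;
    }

    public static void main(String[] args) {
        SafeHolder holder = new SafeHolder(42);
        holder.assertSanity();
        System.out.println(holder.value());
    }
}
```

Note that this only protects the object's own state. Whether another thread sees the new reference at all, or a stale one, still depends on how the reference is published.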

I think the key is to distinguish between objects and references.
Immutable objects are safe to publish, so any thread can publish an object, and if any other thread reads a reference to such an object, it can safely use the object. Of course, the reader thread will see the immutable object state that was published at the moment the thread read the reference; it will not see any updates until it reads the reference again.
This is very useful in many situations, e.g. if there is a single publisher and many readers, and the readers need to see a consistent state. The readers periodically read the reference and work on the obtained state: it is guaranteed to be consistent, and it does not require any locking on the reader thread. It is also useful when it is OK to lose some updates, e.g. when you don't care which thread updates the state.
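The object/reference distinction can be sketched like this. SnapshotDemo and Range are made-up names, and declaring the reference volatile is an assumption of this sketch (it is one way to make sure readers see the latest published reference). A reader copies the reference into a local variable once and then works only with that snapshot, so the two fields it reads always belong to the same object.

```java
// Hypothetical demo class; the names are illustrative only.
class SnapshotDemo {
    // Immutable state: both fields are final and never change after construction.
    static final class Range {
        final int low;
        final int high;

        Range(int low, int high) {
            this.low = low;
            this.high = high;
        }
    }

    // The reference is mutable; the objects it points to are not.
    private volatile Range current = new Range(0, 10);

    // Publisher: swaps in a brand-new object instead of modifying the old one.
    void publish(int low, int high) {
        current = new Range(low, high);
    }

    // Reader: reads the reference once, then uses only the local snapshot.
    int width() {
        Range snapshot = current;
        return snapshot.high - snapshot.low;
    }

    public static void main(String[] args) {
        SnapshotDemo demo = new SnapshotDemo();
        demo.publish(5, 7);
        System.out.println(demo.width());
    }
}
```

Reading current twice inside width() (once for high, once for low) would reintroduce the inconsistency, because a publish could land between the two reads.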

If I understand your question, immutability doesn't seem to be relevant here. You're just asking whether threads will see updates to shared objects.
[Edit] after an exchange of comments, I now see that you need also to hold a reference to your shared singleton state while doing some actions, and then setting the state to reflect that action.
The good news, as before, is that providing this will of necessity also solve your memory consistency issue.
Instead of defining separate synchronized getState and updateState methods, you'll have to perform all three actions without being interrupted: getState, yourAction, and updateState.
I can see three ways to do it:
1) Do all three steps inside a single synchronized method in GlobalStateCache. Define an atomic doActionAndUpdateState method in GlobalStateCache, synchronized of course on your state singleton, which would take a functor object to do your action.
2) Do getState and updateState as separate calls, and change updateState so that it checks to be sure state hasn't changed since the get. Define getState and checkAndUpdateState in GlobalStateCache. checkAndUpdateState will take the original state caller got from getState, and must be able to check if state has changed since your get. If it has changed, you'll need to do something to let caller know they potentially need to revert their action (depends on your use case).
3) Define a getStateWithLock method in GlobalStateCache. This implies that you'll also need to assure callers release their lock. I'd create an explicit releaseStateLock method, and have your updateState method call it.
Of these, I advise against #3, because it leaves you vulnerable to leaving that state locked in the event of some kinds of bugs. I'd also advise (though less strongly) against #2, because of the complexity it creates with what happens in the event that the state has changed: do you just abandon the action? Do you retry it? Must it be (can it be) reverted? I'm for #1: a single synchronized atomic method, which will look something like this:
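public interface DimitarActionFunctor {
    void performAction();
}
public class GlobalStateCache {
    // Lock on a dedicated final object: the state field is reassigned, so locking
    // on state itself could leave two threads holding different monitors.
    private final Object lock = new Object();
    private MyImmutableState state = MyImmutableState.newInstance(null, null);
    public MyImmutableState getState() {
        synchronized (lock) {
            return state;
        }
    }
    public void doActionAndUpdateState(DimitarActionFunctor functor, State currentState, Args newArgs) {
        synchronized (lock) {
            // The action and the state update happen as one atomic step.
            functor.performAction();
            state = MyImmutableState.newInstance(currentState, newArgs);
        }
    }
}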
Caller then constructs a functor for the action (an instance of DimitarActionFunctor), and calls doActionAndUpdateState. Of course, if the actions need data, you'll have to define your functor interface to take that data as arguments.
Again, I point you to this question, not for the actual difference, but for how they both work in terms of memory consistency: Difference between volatile and synchronized in Java

So much depends on the actual use case here that it's hard to make a recommendation, but it looks like you want some sort of Compare-And-Set semantics for the GlobalStateCache, using a java.util.concurrent.atomic.AtomicReference.
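import java.util.concurrent.atomic.AtomicReference;
public class GlobalStateCache {
    private final AtomicReference<MyImmutableState> atomic =
        new AtomicReference<MyImmutableState>(MyImmutableState.newInstance(null, null));
    public MyImmutableState getState() {
        return atomic.get();
    }
    public void updateState(MyImmutableState currentState, Args newArgs) {
        MyImmutableState s = currentState;
        // Retry until the swap succeeds against the state the new value was built from.
        while (!atomic.compareAndSet(s, MyImmutableState.newInstance(s, newArgs))) {
            s = atomic.get(); // another thread won the race: rebuild from the latest state
        }
    }
}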
This, of course, depends on the expense of potentially creating a few extra MyImmutableState objects, and whether you need to re-run myMethod(state) if the state has been updated underneath, but the concept should be correct.
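The same compare-and-set retry idea can be tried out in a self-contained form. In this sketch, CasDemo and Counter are made-up stand-ins for GlobalStateCache and MyImmutableState, and the retry loop is delegated to AtomicReference.updateAndGet (available since Java 8), which re-applies the update function whenever another thread changed the reference in between.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical demo class; Counter stands in for an immutable state object.
class CasDemo {
    static final class Counter {
        final int value;

        Counter(int value) {
            this.value = value;
        }

        // Returns a new instance instead of modifying this one.
        Counter plus(int delta) {
            return new Counter(value + delta);
        }
    }

    private final AtomicReference<Counter> state = new AtomicReference<>(new Counter(0));

    void add(int delta) {
        // The function may run more than once under contention, so it must be side-effect free.
        state.updateAndGet(c -> c.plus(delta));
    }

    int value() {
        return state.get().value;
    }

    // Runs two threads that each add 1 to the shared state perThread times.
    static int runTwoThreads(int perThread) {
        CasDemo demo = new CasDemo();
        Runnable task = () -> {
            for (int i = 0; i < perThread; i++) {
                demo.add(1);
            }
        };
        Thread a = new Thread(task);
        Thread b = new Thread(task);
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return demo.value();
    }

    public static void main(String[] args) {
        System.out.println(runTwoThreads(1_000));
    }
}
```

No update is lost here because a failed swap is retried against the latest state, at the cost of occasionally building an instance that is thrown away.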

Answering your "main" question: no, Thread2 will not see the change. Immutable objects do not change :-)
So if Thread1 reads state A and then Thread2 stores state B, Thread1 has to read the variable again to see the change.
Visibility of variables is affected by the volatile keyword. If a variable is declared volatile, Java guarantees that once one thread updates the variable, every later read by any other thread sees the change (at some cost in speed).
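A minimal sketch of the volatile guarantee, using a made-up class VolatileFlagDemo. The writer thread stores an ordinary field and then sets a volatile flag. The reading thread waits for the flag and then reads the ordinary field. Because a write to a volatile field happens-before every subsequent read of that field, the reader is guaranteed to see the payload as well.

```java
// Hypothetical demo class; the names are illustrative only.
class VolatileFlagDemo {
    private volatile boolean done;         // volatile: writes become visible to later reads
    private int payload;                   // plain field, published via the volatile write

    int run() {
        Thread writer = new Thread(() -> {
            payload = 42;                  // ordinary write
            done = true;                   // volatile write: publishes everything before it
        });
        writer.start();
        while (!done) {                    // volatile read: will eventually observe true
            Thread.yield();
        }
        return payload;                    // guaranteed to see 42 after reading done == true
    }

    public static void main(String[] args) {
        System.out.println(new VolatileFlagDemo().run());
    }
}
```

If done were not volatile, the loop would be allowed to spin forever, since nothing would oblige the reading thread to ever observe the write.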
Immutable objects are still very useful in multithreaded environments. I will give you an example of how I used one once. Let's say you have an object that is periodically changed (a life field in my case) by one thread and is somehow processed by other threads (my program was sending it to clients over the network). These threads fail if the object is changed in the middle of processing (they send an inconsistent life field state). If you make this object immutable and create a new instance every time it changes, you don't have to write any synchronization at all. The updating thread periodically publishes new versions of the object, and every time other threads read it they get the most recent version and can safely process it. This particular example saves time spent on synchronization but uses more memory. This should give you a general understanding of when you can use them.
Another link I found: http://download.oracle.com/javase/tutorial/essential/concurrency/immutable.html
Edit (answer comment):
I will explain my task. I had to write a network server that sends clients the most recent life field and constantly updates it. With the design mentioned above, I have two types of threads:
A thread that updates the object representing the life field. The object is immutable, so the thread actually creates a new instance every time and only changes the reference. The fact that the reference is declared volatile is crucial.
One thread per client. When a client requests the life field, its thread reads the reference once and starts sending. Because the network connection can be slow, the life field can be updated many times while this thread sends data. The fact that the object is immutable guarantees that the server will send a consistent state of the life field. In the comments you are concerned about changes made while this thread processes the object. Yes, when the client receives the data it may not be up to date, but there is nothing you can do about that. This is not a synchronization issue but rather a slow connection issue.
I am not stating that immutable objects can solve all of your concurrency problems. This is obviously not true, and you point this out. I am trying to explain where they actually can solve problems. I hope my example is clear now.
