Reordering External Operations in Java Memory Model

I'm currently learning about the Java Memory Model, and how it affects the reorderings that a compiler may make. However, I'm a bit confused about external operations. The JMM defines an external action as an action that may be observable outside of an execution. Going off of this question, I understand external actions to be things like printing a value, writing to a file, network operations, etc.
Now, how are external actions affected by reordering? I think it's obvious that an external action cannot be reordered with another external action, as this will change the observable behaviour of the program (and thus is not a valid transformation according to the JMM). But what about reordering an external action with a normal memory access, or a synchronisation action? For example:
volatile int v = 5;
int x = v;
System.out.println("!");
Can the print and int x = v be reordered here? I can't see it changing behaviour, but the read of volatile v is the same as a monitor acquire, so I don't think the reordering is valid.

External actions are added to avoid surprising outcomes:
class ExternalAction {
    int foo = 0;
    void method() {
        jni();
        foo = 42;
    }
    native void jni(); /* {
        assert foo == 0;
    } */
}
Assuming that the JNI method was implemented to run the same assertion, you would not expect this to fail. The JIT compiler cannot determine the outcome of anything external, so the JMM forbids such reorderings, too.

I think it's obvious that an external action cannot be reordered with another external action, as this will change the observable behaviour of the program (and thus is not a valid transformation according to the JMM).
According to the JLS, observable behaviour doesn't require a total order of all external actions:
Note that a behavior B does not describe the order in which the external actions in B are observed, but other (internal) constraints on how the external actions are generated and performed may impose such constraints.
It seems that two external actions cannot be reordered if the result of the 1st external action is used as a parameter of the 2nd external action (either used directly, or indirectly — to compute the value of a parameter)
This is what the JLS says about the result of an external action:
An external action tuple contains an additional component, which contains the results of the external action as perceived by the thread performing the action. This may be information as to the success or failure of the action, and any values read by the action.
I suppose there could be strong ordering guarantees for external actions which might access the JVM's internal state, like the JNI call explained in Rafael's answer.
Other than that it seems like the JLS allows almost anything:
An implementation is free to produce any code it likes, as long as all resulting executions of a program produce a result that can be predicted by the memory model.
This provides a great deal of freedom for the implementor to perform a myriad of code transformations, including the reordering of actions and removal of unnecessary synchronization.
Of course, Java implementations can provide stronger guarantees. This would be legal because stronger guarantees don't produce new behaviours.

Related

akka stream behaviour based on outside variable

I have a generic question on Akka Streams,
I need to change stream behavior based on a variable outside of Akka. The variable is static and is changed by another piece of code.
How would you achieve this? Simply by checking the variable in a stream element?
For example:
.filterNot(ping -> pingRecieved)
The pingRecieved is static variable in Java class.

It is legal to have a stream stage check some global state and alter its behavior based on that state.
Whether it's a great idea is another question entirely.
At minimum, you'll want to be aware of the limits and subtleties of the Java Memory Model around visibility. If the code writing to that variable isn't executing on the same thread as the stream stage, there's no guarantee about when (or even possibly whether) the stream stage will see the write. If the writer is outside of Akka, it categorically won't be on the same thread; if it's code executed by an actor on the same dispatcher as the stream stage, it might at some point execute on the same thread, but controlling that is going to require some tradeoffs. Ensuring that visibility (e.g. volatile or using atomics) may in turn have substantial implications for performance, etc.
You may want to investigate alternatives like a custom stream stage which materializes as an object with methods which propagate updates to that value to the stage (e.g. via the async callback mechanisms in Akka): these would be guaranteed to become visible to the stage and would abstract away the concurrency. Another option would be to expose a source (e.g. a Source.queue) which injects changes to that value as stream elements which get merged into the stream and interpreted by the stream to change its behavior. Alternatively, in some cases it might be useful to use mapAsync or ask to pass stream elements to an actor.
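The visibility point above can be illustrated without Akka. The following is only a minimal sketch: the class name PingFlag and its methods are hypothetical, and a plain java.util.stream pipeline stands in for the Akka stage. The one thing it is meant to show is that the flag is read through an AtomicBoolean, so a write made by another thread is guaranteed to become visible to the filtering code.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.stream.Collectors;

// Hypothetical holder for the flag; stands in for the asker's static field.
public class PingFlag {
    // AtomicBoolean gives volatile-style visibility across threads.
    private static final AtomicBoolean pingReceived = new AtomicBoolean(false);

    // Called by the "other piece of code" that changes the flag.
    public static void setPingReceived(boolean value) {
        pingReceived.set(value);
    }

    // Plain-Java stand-in for .filterNot(ping -> pingReceived):
    // every element is dropped while the flag is true.
    public static List<String> filterPings(List<String> pings) {
        return pings.stream()
                .filter(ping -> !pingReceived.get())
                .collect(Collectors.toList());
    }
}
```

The flag is re-read for every element, so a change made mid-stream affects only the elements processed after the change becomes visible.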

How to know if a method is thread safe

Suppose I have a method that checks for an id in the db and, if the id doesn't exist, inserts a value with that id. How do I know if this is thread safe, and how do I ensure that it's thread safe? Are there any general rules that I can use to ensure that it doesn't contain race conditions and is generally thread safe?
public TestEntity save(TestEntity entity) {
    if (entity.getId() == null) {
        entity.setId(UUID.randomUUID().toString());
    }
    Map<String, TestEntity> map = dbConnection.getMap(DB_NAME);
    map.put(entity.getId(), entity);
    return map.get(entity.getId());
}

This is a how long is a piece of string question...
A method that uses the synchronized keyword in its declaration can be executed by only one thread at a time per object, which is the simplest way to make a single method thread safe.
However, even if your setId and getId methods used the synchronized keyword, your process of setting the id (if it has not been previously initialized) above is not thread safe. But even then there is an "it depends" aspect to the question. If it is impossible for two threads to ever get the same object with an uninitialised id, then you are thread safe, because you would never be attempting to concurrently modify the id.
It is entirely possible, given the code in your question, that there could be two calls to the thread safe getId at the same time for the same object. One by one they get the return value (null) and immediately get pre-empted to let the other thread run. This means both will then run the thread safe setId method - again one by one.
You could declare the whole save method as synchronized, but if you do that the entire method will be single threaded which defeats the purpose of using threads in the first place. You tend to want to minimize the synchronized code to the bare minimum to maximize concurrency.
You could also put a synchronized block around the critical if statement and minimise the single threaded part of the processing, but then you would also need to be careful if there were other parts of the code that might also set the Id if it wasn't previously initialized.
Another possibility which has various pros and cons is to put the initialization of the Id into the get method and make that method synchronized, or simply assign the Id when the object is created in the constructor.
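As a rough sketch of that last option (initializing the id inside a synchronized getter), assuming a simplified TestEntity whose only state is the id; the method name getOrCreateId is hypothetical and not part of the question's code:

```java
import java.util.UUID;

// Simplified, hypothetical version of the entity from the question.
public class TestEntity {
    private String id; // guarded by the intrinsic lock on "this"

    // The null check and the assignment run under one lock, so two
    // threads can never both observe null and assign different ids.
    public synchronized String getOrCreateId() {
        if (id == null) {
            id = UUID.randomUUID().toString();
        }
        return id;
    }
}
```

Only the check-then-set is serialized here; the rest of save can still run concurrently.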
I hope this helps...
Edit...
The above talks about java language features. A few people mentioned facilities in the java class libraries (e.g. java.util.concurrent) which also provide support for concurrency. So that is a good add on, but there are also whole packages which address the concurrency and other related parallel programming paradigms (e.g. parallelism) in various ways.
To complete the list I would add tools such as Akka and Cats-effect (concurrency) and more.
Not to mention the books and courses devoted to the subject.
I just reread your question and noted that you are asking about databases. Again the answer is it depends. RDBMSs usually let you do this type of operation with record locks, usually in a transaction. Some (like Teradata) use special clauses such as locking row for write select * from some table where pi_cols = 'somevalues', which locks the rowhash to you until you update it or certain other conditions are met. This is known as pessimistic locking.
Others (notably NoSQL) have optimistic locking. When you read the record (like you are implying with getId), there is no opportunity to lock the record. Then you do a conditional update. The conditional update is sort of like this: write the id as x provided that when you try to do so the id is still null (or whatever the value was when you checked). These types of operations are usually done through an API.
You can also do optimistic locking in an RDBMS as follows:
UPDATE tbl
SET x = 'some value',
    last_update_timestamp = current_timestamp()
WHERE x IS NULL
  AND last_update_timestamp = 'same value as when I last checked'
In this example the second part of the where clause is the critical bit, which basically says "only update the record if no one else did, and I trust that everyone else will update the last-update timestamp when they do". The "trust" bit can sometimes be replaced by triggers.
These types of database operations (if available) are guaranteed by the database engine to be "thread safe".
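The conditional update can be imitated in memory to see its shape. This is only an analogy, not a database client: the class OptimisticUpdate and its method names are hypothetical, and ConcurrentMap.replace(key, oldValue, newValue) plays the role of "update only if the row still holds what I last read".

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical in-memory stand-in for a table with one value per key.
public class OptimisticUpdate {
    private final ConcurrentMap<String, String> rows = new ConcurrentHashMap<>();

    public void insert(String key, String value) {
        rows.put(key, value);
    }

    public String read(String key) {
        return rows.get(key);
    }

    // Succeeds only if the row still holds the value the caller saw.
    // Note: ConcurrentHashMap does not accept null values, so "seen"
    // must be a real value here, unlike the NULL check in the SQL above.
    public boolean updateIfUnchanged(String key, String seen, String next) {
        return rows.replace(key, seen, next);
    }
}
```

A caller whose update returns false has lost the race and would typically re-read and retry.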
Which loops me back to the "how long is a piece of string" observation at the beginning of this answer...

Test-and-set is unsafe
a method that checks for an id in the db and if the id doesn't exist then inserts a value with that id.
Any test-and-set pair of operations on a shared resource is inherently unsafe, vulnerable to a race condition. If the two operations are separate (not atomic), then they must be protected as a pair. While one thread completes the test but has not yet done the set, another thread could sneak in and do both the test and the set. The first thread now completes its set without knowing a duplicate action has occurred.
Providing that necessary protection is too broad a topic for an Answer on Stack Overflow, as others have said here.
UPSERT
However, let me point out that an alternative approach is to make the test-and-set atomic.
In the context of a database, that can be done using the UPSERT feature. Also known as a Merge operation. For example, in Postgres 9.5 and later we have the INSERT INTO … ON CONFLICT command. See this explanation for details.
In the context of a Boolean-style flag, a semaphore makes the test-and-set atomic.
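Within a single JVM, the same idea of an atomic test-and-set is available on concurrent maps. A minimal sketch, assuming a hypothetical AtomicInsert class in place of the real database access; putIfAbsent performs the existence check and the insert as one atomic step:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical in-memory store; not a replacement for a database UPSERT.
public class AtomicInsert {
    private final ConcurrentMap<String, String> store = new ConcurrentHashMap<>();

    // Returns true only for the one caller that actually inserted the id.
    // putIfAbsent returns the previous value, or null if there was none.
    public boolean insertIfMissing(String id, String value) {
        return store.putIfAbsent(id, value) == null;
    }

    public String get(String id) {
        return store.get(id);
    }
}
```

Because no other thread can slip in between the test and the set, the duplicate insert described above cannot happen.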

In general, we say "a method is thread-safe" when there is no race condition on the internal and external data structures of the object it belongs to. In other words, the order of the method calls is strictly enforced.
For example, let's say you have a HashMap object and two threads, thread_a and thread_b.
thread_a calls put("a", "a") and thread_b calls put("a", "b").
The put method is not thread-safe (refer to its documentation) in the sense that while thread_a is executing its put, thread_b can also go in and execute its own put.
A put contains reading and writing part.
thread_a.read("a")
thread_b.read("a")
thread_b.write("a", "b")
thread_a.write("a", "a")
If the above sequence happens, you can say that the method is not thread-safe.
The way to make a method thread-safe is to ensure that the state of the whole object cannot be perturbed while the thread-safe method is executing. An easy way is to put the "synchronized" keyword in method declarations.
If you are worried about performance, use manual locking with synchronized blocks on a lock object. Further performance improvement can be achieved using well-designed semaphores.
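A small sketch of the synchronized approach, using a counter instead of a HashMap to keep it short. The class SynchronizedCounter and the helper runTwoThreads are hypothetical names; increment is a read followed by a write, like the put sequence shown earlier.

```java
// Hypothetical example class; count++ is a read-modify-write, so
// without "synchronized" two threads could lose each other's updates.
public class SynchronizedCounter {
    private int count = 0;

    public synchronized void increment() {
        count++;
    }

    public synchronized int get() {
        return count;
    }

    // Runs two threads that each increment perThread times and
    // returns the final count after both have finished.
    public static int runTwoThreads(int perThread) {
        SynchronizedCounter counter = new SynchronizedCounter();
        Runnable work = () -> {
            for (int i = 0; i < perThread; i++) {
                counter.increment();
            }
        };
        Thread a = new Thread(work);
        Thread b = new Thread(work);
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.get();
    }
}
```

With both methods synchronized, the final count is always exactly twice perThread; removing the keyword from increment makes lost updates possible.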

Java Memory Model in practice

I was trying to learn the Java Memory Model, but still cannot understand how people use it in practice.
I know that many just rely on appropriate memory barriers (as described in the Cookbook), but in fact the model itself does not operate in such terms.
The model introduces different orders defined on a set of actions and defines so called "well-formed executions".
Some people are trying to explain the memory model restrictions using one of such orders, namely "happens-before", but it seems like the order, at least by itself, does not define acceptable execution:
It should be noted that the presence of a happens-before relationship between two actions does not necessarily imply that they have to take place in that order in an implementation. If the reordering produces results consistent with a legal execution, it is not illegal
My question is how can one verify that certain code or change can lead to an "illegal execution" in practice (according to the model) ?
To be more concrete, let's consider a very simple example:
public class SomeClass {
    private int a;
    private int b;
    public void someMethod() {
        a = 2; // 1
        b = 3; // 2
    }
    // other methods
}
It's clear that within the thread w(a = 2) happens before w(b = 3) according to the program order.
How can compiler/optimizer be sure that reordering 1 and 2 won't produce an "illegal execution" (strictly in terms of the model) ? And why if we set b to be volatile it will ?

Are you asking about how the VM/JIT analyzes the bytecode flow? That's far too broad to answer; entire research papers have been written about that. And what the VM actually implements may change from release to release.
Or is the question simply about which rules of the memory model govern what is "legal"? For the executing thread, the memory model already makes the strong guarantee that every action on a given thread appears to happen in program order for that thread. That means that if the JIT determines, by whatever method(s) it implements, that a reordering produces the same observable result(s), the reordering is legal.
The presence of actions that establish happens-before guarantees with respect to other threads (such as volatile accesses) simply adds more constraints to the legal reorderings.
Simplified, it could be memorized like this: everything that happened before in program order also appears to have (already) happened to other threads when a happens-before establishing action is executed.
For your example that means: in the case of non-volatile (a, b), only the guarantee "appears to happen in program order" (to the executing thread) needs to be upheld. Any reordering of the writes to (a, b) is legal; even delaying them until they are actually read (e.g. holding the value in a CPU register and bypassing main memory) would be valid. The JIT could even omit writing the members at all if it detects they are never actually read before the object goes out of scope (and, to be precise, there is also no finalizer using them).
Making b volatile in your example changes the constraints in that other threads reading b would also be guaranteed to see the last update of a because it happened before the write to b. Again simplified, happens-before actions extend some of the perceived ordering guarantees from the executing thread to other threads.
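The guarantee described here can be sketched with the question's two fields. The class VolatilePublication and the methods readAAfterB and demo are hypothetical additions for illustration; only someMethod, a and b come from the question.

```java
// Hypothetical demo class built around the question's someMethod().
public class VolatilePublication {
    private int a;          // plain field
    private volatile int b; // volatile field

    public void someMethod() {
        a = 2; // 1
        b = 3; // 2: volatile write, publishes the earlier write to a
    }

    // Spins until the volatile read observes 3. From that point on the
    // write a = 2 happens-before this thread's read of a.
    public int readAAfterB() {
        while (b != 3) {
            Thread.yield();
        }
        return a;
    }

    // Starts a writer thread and reads from the calling thread.
    public static int demo() {
        VolatilePublication shared = new VolatilePublication();
        Thread writer = new Thread(shared::someMethod);
        writer.start();
        int seen = shared.readAAfterB();
        try {
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen;
    }
}
```

If b were not volatile, the reader would have no such guarantee: it might see b == 3 and still read a stale a, or never leave the loop at all.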

It seems you are making the common mistake of thinking too much about low level aspects of the JMM in isolation. Regarding your question “how people use it in practice”, if you are talking about an application programmer, (s)he will use it in practice by not thinking about memory barriers or possible reorderings all the time.
Regarding your example:
public void someMethod() {
    a = 2; // 1
    b = 3; // 2
}
Given a and b are non-final, non-volatile.
It's clear that within the thread w(a = 2) happens before w(b = 3) according to the program order. How can compiler/optimizer be sure that reordering 1 and 2 won't produce an "illegal execution" (strictly in terms of the model) ?
Here, it backfires that you are focusing on re-ordering in isolation. First of all, the resulting code (of HotSpot optimization, JIT compilation, etc.) does not need to write the values to the heap memory at all. It might hold the new values in CPU registers and use them from there in subsequent operations of the same thread. Only when reaching a point where these changes have to be made visible to other threads do they have to be written to the heap, which may happen in arbitrary order.
But if, for example, the caller of the method enters an infinite loop after calling this method, the values don’t have to be written ever.
And why if we set b to be volatile it will ?
Declaring b as volatile does not guarantee that a and b are written. This is another mistake which arises from focusing on memory barriers.
Let’s go more abstract:
Suppose you have two concurrent actions, A and B. For concurrent execution in Java, there are several perfectly valid behaviors, including:
A might be executed entirely before B
B might be executed entirely before A
All or parts of A and B run in parallel
In the case where B is executed entirely before A, there is no sense in having a write barrier in A and a read barrier in B; B will still not notice any activities of A. You can draw your conclusions about different parallel scenarios from this starting point.
This is where the happens-before relationship comes into play: a write of a value to a volatile variable happens before a read of that value from that variable by another thread. If the read operation is executed before the write operation, the reading thread will not see the value and hence there’s no happens-before relationship and so there is no statement about the other variables we can make.
To stay with your example with b being volatile: this implies that if, and only if, a reading thread reads b and sees the value 3, it is guaranteed to see the value 2 (or an even more recent value if there are other writes) for a on subsequent reads.
So if a JVM can prove that there will never be a read operation on b seeing the written value, maybe because the entire instance we are modifying will never be seen by another thread, there is no happens-before relationship to be ever established, in other words, b being volatile has no impact on the allowed code transformations in this case, i.e. it might be reordered as well, or even never written to the heap at all.
So the bottom line is that it is not useful to look at a small piece of code and ask whether it will allow reordering or whether it will contain a memory barrier. This might not even be answerable as the answer might change depending on how the code is actually used. Only if your view is wide enough to see how threads will interact when accessing the data and you can safely deduce whether a happens-before relationship will be established you can start drawing conclusions about the correct working of the code. As you found out by yourself, correct working does not imply that you know whether reordering will happen or not on the lowest level.

In Java can I depend on reference assignment being atomic to implement copy on write?

If I have an unsynchronized java collection in a multithreaded environment, and I don't want to force readers of the collection to synchronize[1], is a solution where I synchronize the writers and use the atomicity of reference assignment feasible? Something like:
private Collection global = new HashSet(); // start threading after this
void allUpdatesGoThroughHere(Object exampleOperand) {
    // My hypothesis is that this prevents operations in the block being re-ordered
    synchronized(global) {
        Collection copy = new HashSet(global);
        copy.remove(exampleOperand);
        // Given my hypothesis, we should have a fully constructed object here. So a
        // reader will either get the old or the new Collection, but never an
        // inconsistent one.
        global = copy;
    }
}
// Do multithreaded reads here. All reads are done through a reference copy like:
// Collection copy = global;
// for (Object elm: copy) {...
// so the global reference being updated half way through should have no impact
Rolling your own solution seems to often fail in these type of situations, so I'd be interested in knowing other patterns, collections or libraries I could use to prevent object creation and blocking for my data consumers.
[1] The reasons being a large proportion of time spent in reads compared to writes, combined with the risk of introducing deadlocks.
Edit: A lot of good information in several of the answers and comments, some important points:
A bug was present in the code I posted. Synchronizing on global (a badly named variable) can fail to protect the synchronized block after a swap.
You could fix this by synchronizing on the class (moving the synchronized keyword to the method), but there may be other bugs. A safer and more maintainable solution is to use something from java.util.concurrent.
There is no "eventual consistency guarantee" in the code I posted, one way to make sure that readers do get to see the updates by writers is to use the volatile keyword.
On reflection the general problem that motivated this question was trying to implement lock free reads with locked writes in java, however my (solved) problem was with a collection, which may be unnecessarily confusing for future readers. So in case it is not obvious the code I posted works by allowing one writer at a time to perform edits to "some object" that is being read unprotected by multiple reader threads. Commits of the edit are done through an atomic operation so readers can only get the pre-edit or post-edit "object". When/if the reader thread gets the update, it cannot occur in the middle of a read as the read is occurring on the old copy of the "object". A simple solution that had probably been discovered and proved to be broken in some way prior to the availability of better concurrency support in java.

Rather than trying to roll out your own solution, why not use a ConcurrentHashMap as your set and just set all the values to some standard value? (A constant like Boolean.TRUE would work well.)
I think this implementation works well with the many-readers-few-writers scenario. There's even a constructor that lets you set the expected "concurrency level".
Update: Veer has suggested using the Collections.newSetFromMap utility method to turn the ConcurrentHashMap into a Set. Since the method takes a Map<E,Boolean> my guess is that it does the same thing with setting all the values to Boolean.TRUE behind-the-scenes.
Update: Addressing the poster's example
That is probably what I will end up going with, but I am still curious about how my minimalist solution could fail. – MilesHampson
Your minimalist solution would work just fine with a bit of tweaking. My worry is that, although it's minimal now, it might get more complicated in the future. It's hard to remember all of the conditions you assume when making something thread-safe—especially if you're coming back to the code weeks/months/years later to make a seemingly insignificant tweak. If the ConcurrentHashMap does everything you need with sufficient performance then why not use that instead? All the nasty concurrency details are encapsulated away and even 6-months-from-now you will have a hard time messing it up!
You do need at least one tweak before your current solution will work. As has already been pointed out, you should probably add the volatile modifier to global's declaration. I don't know if you have a C/C++ background, but I was very surprised when I learned that the semantics of volatile in Java are actually much more complicated than in C. If you're planning on doing a lot of concurrent programming in Java then it'd be a good idea to familiarize yourself with the basics of the Java memory model. If you don't make the reference to global a volatile reference then it's possible that no thread will ever see any changes to the value of global until they try to update it, at which point entering the synchronized block will flush the local cache and get the updated reference value.
However, even with the addition of volatile there's still a huge problem. Here's a problem scenario with two threads:
We begin with the empty set, or global={}. Threads A and B both have this value in their thread-local cached memory.
Thread A obtains the synchronized lock on global and starts the update by making a copy of global and adding the new key to the set.
While Thread A is still inside the synchronized block, Thread B reads its local value of global onto the stack and tries to enter the synchronized block. Since Thread A is currently inside the monitor Thread B blocks.
Thread A completes the update by setting the reference and exiting the monitor, resulting in global={1}.
Thread B is now able to enter the monitor and makes a copy of the global={1} set.
Thread A decides to make another update, reads in its local global reference and tries to enter the synchronized block. Since Thread B currently holds the lock on {} there is no lock on {1} and Thread A successfully enters the monitor!
Thread A also makes a copy of {1} for purposes of updating.
Now Threads A and B are both inside the synchronized block and they have identical copies of the global={1} set. This means that one of their updates will be lost! This situation is caused by the fact that you're synchronizing on an object stored in a reference that you're updating inside your synchronized block. You should always be very careful which objects you use to synchronize. You can fix this problem by adding a new variable to act as the lock:
private volatile Collection global = new HashSet(); // start threading after this
private final Object globalLock = new Object(); // final reference used for synchronization
void allUpdatesGoThroughHere(Object exampleOperand) {
    // My hypothesis is that this prevents operations in the block being re-ordered
    synchronized(globalLock) {
        Collection copy = new HashSet(global);
        copy.remove(exampleOperand);
        // Given my hypothesis, we should have a fully constructed object here. So a
        // reader will either get the old or the new Collection, but never an
        // inconsistent one.
        global = copy;
    }
}
This bug was insidious enough that none of the other answers have addressed it yet. It's these kinds of crazy concurrency details that cause me to recommend using something from the already-debugged java.util.concurrent library rather than trying to put something together yourself. I think the above solution would work—but how easy would it be to screw it up again? This would be so much easier:
private final Set<Object> global = Collections.newSetFromMap(new ConcurrentHashMap<Object,Boolean>());
Since the reference is final you don't need to worry about threads using stale references, and since the ConcurrentHashMap handles all the nasty memory model issues internally you don't have to worry about all the nasty details of monitors and memory barriers!

According to the relevant Java Tutorial,
We have already seen that an increment expression, such as c++, does not describe an atomic action. Even very simple expressions can define complex actions that can decompose into other actions. However, there are actions you can specify that are atomic:
Reads and writes are atomic for reference variables and for most primitive variables (all types except long and double).
Reads and writes are atomic for all variables declared volatile (including long and double variables).
This is reaffirmed by Section §17.7 of the Java Language Specification
Writes to and reads of references are always atomic, regardless of whether they are implemented as 32-bit or 64-bit values.
It appears that you can indeed rely on reference access being atomic; however, recognize that this does not ensure that all readers will read an updated value for global after this write -- i.e. there is no memory ordering guarantee here.
If you use an implicit lock via synchronized on all access to global, then you can forge some memory consistency here... but it might be better to use an alternative approach.
You also appear to want the collection in global to remain immutable... luckily, there is Collections.unmodifiableSet which you can use to enforce this. As an example, you should likely do something like the following...
private volatile Collection global = Collections.unmodifiableSet(new HashSet());
... that, or using AtomicReference,
private AtomicReference<Collection> global = new AtomicReference<>(Collections.unmodifiableSet(new HashSet()));
You would then use Collections.unmodifiableSet for your modified copies as well.
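Putting the AtomicReference variant together might look roughly like the following. This is a sketch under assumptions: the class name CopyOnWriteHolder and its methods are hypothetical, and generics are used instead of the raw types in the question. Every update builds a modified copy, wraps it with Collections.unmodifiableSet, and swaps it in atomically.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical copy-on-write wrapper; readers never lock.
public class CopyOnWriteHolder<E> {
    private final AtomicReference<Set<E>> global =
            new AtomicReference<>(Collections.<E>emptySet());

    // updateAndGet may re-run the function if another writer wins the
    // race, so the function only builds a fresh copy and has no side effects.
    public void add(E element) {
        global.updateAndGet(current -> {
            Set<E> copy = new HashSet<>(current);
            copy.add(element);
            return Collections.unmodifiableSet(copy);
        });
    }

    public void remove(E element) {
        global.updateAndGet(current -> {
            Set<E> copy = new HashSet<>(current);
            copy.remove(element);
            return Collections.unmodifiableSet(copy);
        });
    }

    // Readers get an immutable snapshot that later writes never touch.
    public Set<E> snapshot() {
        return global.get();
    }
}
```

Because the swap is a compare-and-set on the reference, concurrent writers retry instead of losing updates, and no separate lock object is needed.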
// ... All reads are done through a reference copy like:
// Collection copy = global;
// for (Object elm: copy) {...
// so the global reference being updated half way through should have no impact
You should know that making a copy here is redundant, as internally for (Object elm : global) creates an Iterator as follows...
final Iterator it = global.iterator();
while (it.hasNext()) {
    Object elm = it.next();
}
There is therefore no chance of switching to an entirely different value for global in the midst of reading.
All that aside, I agree with the sentiment expressed by DaoWen... is there any reason you're rolling your own data structure here when there may be an alternative available in java.util.concurrent? I figured maybe you're dealing with an older Java, since you use raw types, but it won't hurt to ask.
You can find copy-on-write collection semantics provided by CopyOnWriteArrayList, or its cousin CopyOnWriteArraySet (which implements a Set using the former).
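A short illustration of those copy-on-write semantics, assuming a hypothetical SnapshotIteration class: the iterator of a CopyOnWriteArraySet works on the snapshot taken when iteration started, so the set can be modified inside the loop.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.concurrent.CopyOnWriteArraySet;

// Hypothetical demo of snapshot iteration.
public class SnapshotIteration {
    // Removes every element while iterating and returns how many
    // elements the loop visited.
    public static int countWhileRemoving() {
        Set<String> set = new CopyOnWriteArraySet<>(Arrays.asList("a", "b", "c"));
        int visited = 0;
        for (String element : set) {
            // On a plain HashSet this removal would typically cause a
            // ConcurrentModificationException on the next iteration step.
            set.remove(element);
            visited++;
        }
        return visited;
    }
}
```

The loop still visits all three original elements, because each removal replaces the backing array rather than changing the one being iterated.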
Also suggested by DaoWen, have you considered using a ConcurrentHashMap? They guarantee that using a for loop as you've done in your example will be consistent.
Similarly, Iterators and Enumerations return elements reflecting the state of the hash table at some point at or since the creation of the iterator/enumeration.
Internally, an Iterator is used for enhanced for over an Iterable.
You can craft a Set from this by utilizing Collections.newSetFromMap as follows:
final Set<E> safeSet = Collections.newSetFromMap(new ConcurrentHashMap<E, Boolean>());
...
/* guaranteed to reflect the state of the set at read-time */
for (final E elem : safeSet) {
    ...
}

I think your original idea was sound, and DaoWen did a good job getting the bugs out. Unless you can find something that does everything for you, it's better to understand these things than hope some magical class will do it for you. Magical classes can make your life easier and reduce the number of mistakes, but you do want to understand what they are doing.
ConcurrentSkipListSet might do a better job for you here. It could get rid of all your multithreading problems.
However, it is slower than a HashSet (usually; HashSets and SkipLists/Trees are hard to compare). If you are doing a lot of reads for every write, what you've got will be faster. More importantly, if you update more than one entry at a time, your reads could see inconsistent results. If you expect that whenever there is an entry A there is an entry B, and vice versa, the skip list could give you one without the other.
With your current solution, to the readers, the contents of the map are always internally consistent. A read can be sure there's an A for every B. It can be sure that the size() method gives the precise number of elements that will be returned by the iterator. Two iterations will return the same elements in the same order.
In other words, allUpdatesGoThroughHere and ConcurrentSkipListSet are two good solutions to two different problems.

Can you use the Collections.synchronizedSet method? From HashSet Javadoc http://docs.oracle.com/javase/6/docs/api/java/util/HashSet.html
Set s = Collections.synchronizedSet(new HashSet(...));

Replace the synchronized by making global volatile and you'll be alright as far as the copy-on-write goes.
Although the assignment is atomic, in other threads it is not ordered with the writes to the object referenced. There needs to be a happens-before relationship which you get with a volatile or synchronising both reads and writes.
The problem of multiple updates happening at once is separate - use a single thread or whatever you want to do there.
If you used a synchronized for both reads and writes then it'd be correct but the performance may not be great with reads needing to hand-off. A ReadWriteLock may be appropriate, but you'd still have writes blocking reads.
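For comparison, the ReadWriteLock option might look roughly like this. The class name ReadWriteGuardedSet is hypothetical; many readers can hold the read lock at once, while a writer holds the write lock exclusively and therefore still blocks readers for the duration of the write.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical wrapper guarding a plain HashSet with a read/write lock.
public class ReadWriteGuardedSet<E> {
    private final Set<E> elements = new HashSet<>();
    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    // Readers share the read lock, so they do not block each other.
    public boolean contains(E element) {
        lock.readLock().lock();
        try {
            return elements.contains(element);
        } finally {
            lock.readLock().unlock();
        }
    }

    // Writers take the exclusive write lock.
    public boolean add(E element) {
        lock.writeLock().lock();
        try {
            return elements.add(element);
        } finally {
            lock.writeLock().unlock();
        }
    }

    public boolean remove(E element) {
        lock.writeLock().lock();
        try {
            return elements.remove(element);
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```

Unlocking in a finally block keeps the lock from leaking if the guarded operation throws.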
Another approach to the publication issue is to use final field semantics to create an object that is (in theory) safe to be published unsafely.
Of course, there are also concurrent collections available.

Understanding JVM Happens-before and reorder

I am reading the JLS spec on memory model, 17.4.5 Happens-before Order.
I do not understand the first rule:
"If x and y are actions of the same thread and x comes before y in program order, then hb(x, y)."
Let's assume A and B are objects (instances of class Object) that can be shared between multiple threads:
int i=A.getNum(); // ActionA
int j=B.getNum(); // ActionB
Three questions:
1. According to the above rule, does it mean hb(ActionA, ActionB)?
2. If the answer to 1 is true, does it mean that, according to the happens-before rule, ActionB cannot be reordered to come before ActionA in any JVM that follows the JSR-133 memory model?
3. If 1 and 2 are both true, ActionA and ActionB seem unrelated, so why can't they be reordered? Just because of this spec?

It is my understanding that:
1. you're right
2. they can be reordered, but only if action B doesn't depend on the result of action A
Happens-before relationship doesn't say anything about reordering actions. It only says that if HB(A, B) holds, then action B must see memory effects of action A.
If action B doesn't use any result of action A, then there is no reason why they cannot be reordered. (In general, "use any result of another action" is pretty broad, and it can only be detected for quite simple actions, like memory reads/writes, not for actions using external resources like I/O operations, or time-based operations)

Yes, ActionA happens before ActionB. Read further in that section though. It doesn't necessarily mean that the JVM can't reorder these. It means that ActionB must observe the effect of ActionA, that is all. If ActionB never depends on ActionA's effect, that is trivially true.

You are basically correct in your understanding. However, the key thing to remember is:
Reordering is allowed if it doesn't affect the outcome of the thread in which it runs
Reordering is not forbidden just because it affects other threads
It is this last fact that is a common source of errors and bewilderment in multi-threaded programming in java.
