Why is reference assignment atomic in Java? - java

As far as I know, reference assignment is atomic in a 64-bit JVM.
Now, I assume the JVM doesn't use atomic pointers internally to model this, since otherwise there would be no need for atomic references. So my questions are:
Is atomic reference assignment in the "specs" of Java/Scala and guaranteed to happen, or is it just a happy coincidence that it works that way most of the time?
Is atomic reference assignment implied for any language that compiles to JVM bytecode (e.g. Clojure, Groovy, JRuby, Jython, etc.)?
How can reference assignment be atomic without using an atomic pointer internally?

First of all, reference assignment is atomic because the specification says so. Besides that, there is no obstacle for JVM implementors to fulfill this constraint, as 64-bit references are usually only used on 64-bit architectures, where atomic 64-bit assignment comes for free.
Your main confusion stems from the assumption that the additional “Atomic References” feature means exactly that, due to its name. But the AtomicReference class offers far more, as it encapsulates a volatile reference, which has stronger memory visibility guarantees in a multi-threaded execution.
Having an atomic reference update does not necessarily imply that a thread reading the reference will also see consistent values regarding the fields of the object reachable through that reference. All it guarantees is that you will read either the null reference or a valid reference to an existing object that actually was stored by some thread. If you want more guarantees, you need constructs like synchronization, volatile references, or an AtomicReference.
AtomicReference also offers atomic update operations like compareAndSet or getAndSet. These are not possible with ordinary reference variables using built-in language constructs (but only with special classes like AtomicReferenceFieldUpdater or VarHandle).
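To make the distinction concrete, here is a minimal sketch (class and variable names are my own, not from the question) contrasting a plain reference write with the extra operations AtomicReference provides:

```java
import java.util.concurrent.atomic.AtomicReference;

public class AtomicRefDemo {
    // A plain reference write is atomic (a reader never sees a torn value),
    // but it gives no visibility or ordering guarantees by itself.
    static String plain;

    public static void main(String[] args) {
        plain = "hello";   // atomic write, weak visibility

        // AtomicReference adds volatile semantics plus atomic
        // read-modify-write operations such as compareAndSet.
        AtomicReference<String> ref = new AtomicReference<>("initial");
        boolean swapped = ref.compareAndSet("initial", "updated");
        System.out.println(swapped + " " + ref.get());   // true updated

        // A CAS with a stale expected reference fails atomically.
        System.out.println(ref.compareAndSet("initial", "other"));   // false
    }
}
```

Note that compareAndSet compares by reference identity, not equals; the example works because string literals are interned.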

Atomic reference assignment is in the specs.
Writes to and reads of references are always atomic, regardless of
whether they are implemented as 32 or 64 bit values.
Quoted from JSR-133: Java(TM) Memory Model and Thread Specification, section 12 Non-atomic Treatment of double and long, http://www.cs.umd.edu/~pugh/java/memoryModel/jsr133.pdf.

As the other answer outlines, the Java Memory Model states that reference reads and writes are atomic.
But of course, that is the Java language's memory model. On the other hand: no matter whether we talk about Java, Scala, or Kotlin, in the end everything gets compiled to bytecode.
There are no bytecode instructions special to Java; Scala ends up using the very same instructions.
Leading to: the properties of that memory model must be implemented inside the VM itself, and thus they apply to other languages running on the platform as well.

Related

Is the term 'mutable variable' in Java concurrency programming the same as its meaning in functional programming?

In the book 'Java Concurrency in Practice', when talking about 'locking and visibility', the author said:
We can now give the other reason for the rule requiring all threads to synchronize on the same lock when accessing a shared mutable variable—to guarantee that values written by one thread are made visible to other threads. Otherwise, if a thread reads a variable without holding the appropriate lock, it might see a stale value.
Here is the figure (not reproduced in this excerpt):
I'm curious about the meaning of 'mutable' here. As far as I know from functional programming, 'immutable' means unchangeable and 'mutable' the opposite. The variable x in the figure is what the author refers to as a shared mutable variable. Is x (an int or some similar primitive) really mutable?
A shared variable is a placeholder for a location within the shared memory. There might be some confusion due to the fact that you can have an immutable reference variable pointing to an object with mutable instance variables.
But you can always decompose an object graph into a set of simple variables. If all these variables are immutable, the entire object graph is immutable. But if some of them are mutable, we may enter the discussion about possible data races when one or more of these variables are modified in one thread and read by another.
For this discussion, their place in the complex object graph is irrelevant, which is why the discussion uses just two mutable variables, x and y, apparently of type int. They may still be members of, e.g., a Point instance stored in a HashMap, but the only thing that matters is that these x and y variables are being modified and, as explained in the cited book, the unlocking of M will make these modifications visible to any thread subsequently locking M, as this applies to all variables, regardless of their place within the heap memory or object graph.
Note that the mutable nature of x and y implies that there might be older values they had before the x=1 resp. y=1 assignments, which can show up when being read without synchronization. This includes the default values (0) they have before the first assignment.
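The situation the book describes can be sketched minimally as follows (class, field, and lock names are illustrative): both threads must synchronize on the same lock M for the writes to x and y to be guaranteed visible:

```java
// Sketch of the book's figure: writes to the mutable variables x and y
// made while holding lock M become visible to any thread that later
// acquires the same lock M. Names are illustrative, not from the book.
public class SharedPoint {
    private final Object M = new Object();   // the common lock
    private int x, y;                        // shared mutable variables

    void writer() {
        synchronized (M) {   // unlocking M publishes all prior writes
            x = 1;
            y = 1;
        }
    }

    int[] reader() {
        synchronized (M) {   // locking M sees writes made before the unlock
            return new int[] { x, y };
        }
    }
}
```

A reader thread that skips the synchronized block could still observe the stale default values (0, 0), which is exactly the stale-read hazard the book warns about.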

Local References in ConcurrentHashMap

In ConcurrentHashMap, segments is marked final (and thus will never change), but the method ensureSegment creates a method-local copy, ss, of segments upon which to operate.
Does anybody know the purpose of this? What benefit do we get?
Update:
I searched Google and found a page explaining ConcurrentHashMap in JDK 7, The Concurrency Of ConcurrentHashMap; below are excerpts:
Local References
Even though segments is marked final (and thus will never change), Doug Lea prudently creates a method-local copy, ss, of segments upon which to operate. Such defensive programming allows a programmer to not worry about otherwise-volatile instance member references changing during execution of a method (i.e. inconsistent reads). Of course, this is simply a new reference and does not prevent your method from seeing changes to the referent.
Can anyone explain the bold text?
There is no semantic difference between accessing a final field and accessing a local variable holding a copy of the final field’s value. However, it is an established pattern to copy fields into local variables in performance critical code.
Even in cases where it does not make a difference (that depends on the state of the HotSpot optimizer), it will save at least some bytes in the method’s code.
Each access to an instance field, be it final or not (the only exception being compile-time constants), will get compiled as two instructions, first pushing the this reference onto the operand stack via aload_0, then performing a getfield operation, which has a two byte index to the constant pool entry describing the field. In other words, each field access needs four bytes in the method’s code, whereas reading a local variable needs only one byte if the variable is one of the method’s first four (counting this as a local variable), which is the case here (see aload_n).
So storing a field's value in a local variable when it is going to be accessed multiple times is a good habit: it protects against changes of a mutable variable, avoids the cost of repeated volatile reads, and still doesn't hurt even when it is unnecessary, as in the case of a final field, because it produces more compact bytecode. And in plain interpreted execution, i.e. before the optimizer kicks in, the single local-variable access may indeed be faster than the detour via the this instance.
The array reference is marked final, but the array elements are not (and cannot be). So each array element can be set or replaced, for example using:
this.segments[i] = ...
ensureSegment in Java 7 uses a different mechanism to set the array element, based on sun.misc.Unsafe, for improved performance when the method is called concurrently. This is not something a "regular" developer should do.
A local variable
final Segment<K,V>[] ss = this.segments;
is typically used to ensure (well - to increase that probability) that ss is in a CPU register, and not re-read from memory, for the duration of this method. I guess that would not be needed in this case, as the compiler can infer that. Using ss does make the following lines slightly shorter, maybe that's why it is used.
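The idiom itself can be sketched as follows (illustrative names; this is not the real ConcurrentHashMap code, and it omits the Unsafe-based volatile array access that the real ensureSegment uses):

```java
// Sketch of the "local reference" idiom: copy the field into a local
// once, then work only with the local. For a final field this merely
// shortens the bytecode (one getfield instead of many); for a volatile
// field it additionally guarantees that every use in the method sees
// the same array reference, avoiding inconsistent reads.
public class SegmentTable {
    private final Object[] segments = new Object[16];

    Object ensureSlot(int i) {
        final Object[] ss = this.segments;   // single field read, reused below
        if (ss[i] == null) {
            ss[i] = new Object();            // operate on the local copy
        }
        return ss[i];
    }
}
```

As the quoted text notes, ss is only a copy of the reference: changes to the array's elements made by other threads are still visible through it.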

using volatile keyword in java4 and java5

What is the difference in using the volatile keyword in Java 4 versus Java 5 onwards?
And related to that:
Read/write operations on non-atomic variables (long/double) are atomic when they are
declared as volatile.
Is this also true for Java 4, or is it valid only from Java 5 onwards?
Yes, there is a difference.
Up to Java 4, accesses to a volatile variable could be reordered by the compiler with respect to surrounding non-volatile reads and writes, leading to subtle concurrency bugs, e.g. making it impossible to implement double-checked locking (a very common idiom for a singleton).
This was fixed in Java 5.0, which introduced a new memory model and extended the semantics of volatile so that it can no longer be reordered with respect to surrounding reads and writes. See the well-known "Double-Checked Locking is Broken" declaration for a reference example.
This site gives a good explanation of the differences: http://www.javamex.com/tutorials/synchronization_volatile.shtml
They also give an explanation of the behavior of volatile in Java 5 on a separate page: http://www.javamex.com/tutorials/synchronization_volatile_java_5.shtml
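For illustration, here is the double-checked locking idiom that the pre-Java-5 memory model broke; it is correct since Java 5 precisely because the field is volatile (names are illustrative):

```java
// Double-checked locking, valid only under the Java 5+ memory model:
// without volatile, a thread could observe a non-null reference to a
// partially constructed object. Names are illustrative.
public class Singleton {
    private static volatile Singleton instance;

    private Singleton() {}

    public static Singleton getInstance() {
        Singleton local = instance;          // first (unsynchronized) check
        if (local == null) {
            synchronized (Singleton.class) {
                local = instance;            // second check, under the lock
                if (local == null) {
                    instance = local = new Singleton();
                }
            }
        }
        return local;
    }
}
```

The volatile write to instance happens-before any subsequent volatile read of it, which is what makes the unsynchronized first check safe.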
People have provided good points and references answering the first part of my question.
Going specific to the second part, this is what I read on some forum:
A volatile declared long is atomic (pre-Java 5 also) in the sense that
it guarantees (for all JVM implementations) a read or write go
directly to main memory instead of two 32-bit registers.
And
Pre-Java 5, volatile was supposed to provide such guarantees for long
and double. However things did not work out this way in practice, and
implementations frequently violated this guarantee. As I recall the
issue seemed to get fixed around JDK 1.4, but as they were still
working on the whole memory model thing, they didn't really make any
clear announcements about it until JDK 5, when the new rules were
announced, and memory guarantees actually meant something.
And this is from Java Language Specification,Second Edition:
17.4 Nonatomic Treatment of double and long
The load, store, read, and write actions on volatile variables are atomic,
even if the type of the variable is double or long.
What is the difference in using volatile keyword in java4 and java5 onwards?
The JMM before JDK 5 was broken, and using volatile in JDK 4 may not provide the intended result. For more details, see:
http://www.ibm.com/developerworks/library/j-jtp02244/
Read/write operations on non-atomic variables(long/double) are atomic when they are declared as volatile.
Reads and writes of long/double may happen as two separate 32-bit operations. It is therefore possible for a reader to observe the high 32 bits of one write combined with the low 32 bits of another (a "torn" value). In short, a plain read/write of a long is not an atomic operation, unlike the other primitives.
Declaring a long/double volatile is supposed to provide this guarantee: volatile reads and writes are not reordered by the compiler, are performed atomically, and additionally provide a visibility guarantee. But again, this may not work reliably on JDK 4 or before.
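As a sketch (class and field names are my own): declaring the 64-bit field volatile is what rules out torn reads, reliably so since the Java 5 memory model:

```java
// Sketch of the long tearing issue: without volatile, a 64-bit write
// may be performed as two 32-bit halves on some (mostly older, 32-bit)
// JVMs, so a concurrent reader could see a mix of two different writes.
// Declaring the field volatile makes each read/write a single atomic
// 64-bit operation, and adds the usual visibility guarantee.
public class Timestamp {
    private volatile long value;   // atomic even on 32-bit VMs

    void set(long v) { value = v; }     // one atomic 64-bit write
    long get()       { return value; }  // never a torn half-and-half value
}
```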

java and C++11 volatile

I'd like to port some code from Java to C++11, and I'm a bit confused by the volatile keyword.
I'm not familiar with Java, and I don't understand what a volatile variable is. It guarantees that every thread sees the up-to-date value of the variable, which sounds like the behaviour of C++ volatile. But it is usually used for synchronization: are all actions performed on a volatile variable atomic?
So I think that a good C++11 replacement for Java's volatile would be std::atomic. Or am I totally wrong because I missed some additional Java volatile features?
Yes, std::atomic would be a good match; there is a good article on this at Dr. Dobb's.
In a nutshell, ordered atomic variables are safe to read and write on
multiple threads at the same time without doing any explicit locking
because they provide two guarantees: their reads and writes are
guaranteed to be executed in the order they appear in your program's
source code; and each read or write is guaranteed to be atomic,
all-or-nothing.
Java provides this type of variable as volatile, C++ as std::atomic.
This page has a pretty nice explanation on Java's volatile keyword: http://www.javamex.com/tutorials/synchronization_volatile.shtml. It looks to me that C++11 std::atomic<> on primitive types (e.g., integers) indeed is a good replacement. Note that std::atomic<> provides support for read-modify-write operations (e.g., compare_exchange_strong and fetch_add).
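A typical Java-side example of the pattern being ported (illustrative names): a volatile flag used for cross-thread signaling, for which std::atomic&lt;bool&gt; would be the C++11 counterpart:

```java
// A volatile boolean used as a stop flag. Java's volatile gives atomic
// reads/writes plus ordering and visibility, but NOT atomic
// read-modify-write: something like "stopped = !stopped" would still be
// a race, and would need AtomicBoolean (or std::atomic's exchange ops
// on the C++ side). Names are illustrative.
public class StopFlag {
    private volatile boolean stopped;   // write is immediately visible to readers

    void requestStop()        { stopped = true; }
    boolean isStopRequested() { return stopped; }
}
```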

Method call and atomicity

I have a method with a single atomic operation, like this one:

int value;

public void setValue(int value) {
    this.value = value;
}

then I call it in the obvious way, like:

foo.setValue(10);

The question is: would that be an atomic operation? If not, what atomic operations will be executed? How can I test this on my computer (if I can)?
Yes, the
this.value = value;
operation is atomic. See the Java Language Specification: Threads and Locks.
Note, though, that threads are allowed to cache their own values of non-volatile variables, so it is not guaranteed that a subsequent get operation will yield the last value set.
To get rid of this kind of data race you need to synchronize access to the variable somehow. This can be done by
making the method synchronized,
declaring the variable volatile, or
using AtomicInteger from the java.util.concurrent package (the preferred way, in my opinion).
It should also be noted that the operation would not be atomic if you changed from int to long or double. Here is a relevant section from the JLS:
17.4 Non-atomic Treatment of double and long
If a double or long variable is not declared volatile, then for the purposes of load, store, read, and write actions they are treated as if they were two variables of 32 bits each: wherever the rules require one of these actions, two such actions are performed, one for each 32-bit half.
Some useful links:
Wikipedia article on the Java Memory Model
Java Language Specification, Interaction with the Memory Model
It is atomic, because it is just a primitive 32 bit value.
Hence when you read it, there is a guarantee that you will see a value set by one of the threads, but you won't know which one it was.
If it was a long, you wouldn't have the same guarantee, although in practice most VM implementations do treat long writes as atomic operations.
This is what the JLS has to say on the issue:
VM implementors are encouraged to avoid splitting their 64-bit values where possible. Programmers are encouraged to declare shared 64-bit values as volatile or synchronize their programs correctly to avoid possible complications.
But with ints you are safe. The question is: is this very weak guarantee enough for you? More often than not, the answer is no.
First of all, assignment to all primitive types (except the 64-bit ones) in Java is atomic according to the Java specification. But, for instance, auto-increment isn't thread-safe, no matter which type you use.
But the real problem with this code is not atomicity, but visibility. If two threads are modifying the value, they might not see the changes made by each other. Use the volatile keyword or, even better, AtomicInteger to guarantee correct synchronization and visibility.
Please note that the synchronized keyword also guarantees visibility: if a modification happens inside a synchronized block, it is guaranteed to be visible to other threads that subsequently synchronize on the same lock.
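Putting this together for the setter from the question, here is a sketch of the AtomicInteger-based variant (the class name Foo is assumed from the question's foo.setValue(10) call):

```java
import java.util.concurrent.atomic.AtomicInteger;

// Backing the field with an AtomicInteger gives both atomicity and
// visibility for set/get, and also makes the otherwise non-atomic
// increment a single atomic operation.
public class Foo {
    private final AtomicInteger value = new AtomicInteger();

    public void setValue(int v) { value.set(v); }        // atomic + visible
    public int  getValue()      { return value.get(); }
    public int  increment()     { return value.incrementAndGet(); }
}
```

Usage stays the same as in the question: foo.setValue(10), but now a reader in another thread is guaranteed to see the new value.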
