discussion for concurrency when using reference value

discussion for concurrency when using reference value - java

when writing concurrency program, sometimes we use the reference parameter, assume it is ref1 with fake type Reference, a method like
public void testRefVarInMethod(Reference ref1) {
Reference ref2 = ref1;
....
....
}
In this method, I declare a new variable ref2 which points to ref1. We all know that method variable is thread safe, however, as to reference ref1, anybody can change its value outside the method, so the ref2's value will be changed too. I guess this cannot guarantee thread safe, why do some people write code like this?

That's why people use methods like clone to ensure thread-safety.
Reference ref2 = ref1.clone();
By referencing the copy of ref1, ref2 will not be affected regardless how ref1 is changed by some other threads.
Edit:
As pointed out in the comments, the clone method does not necessarily enforce thread-safety. It has to be correctly implemented in a way that modifying ref1 will not change the state of ref2. i.e., ref1 and ref2 do not share any mutable fields.

The value of the local variable ref2 itself cannot be changed outside your method (none can make it point to another object from outside). It's only the state of the object it references that can be changed (someone can call ref1.setField(newValue)) concurrently.
People do that because they need to share objects between threads. Otherwise, they wouldn't be able to gain benefits of multithreading in many cases.
But people don't do it recklessly, they usually introduce various forms of synchronization to guarantee thread safety. For instance, one can use synchronized section as the simplest and most straightforward tool to delineate a critical section that can be executed by only one thread at any given time:
synchronized(ref2) {
// Change or read object here
}
If all the code uses the same approach, making changes (and reading them) on the object will be safe.
There're many other, more specialised and more efficient, synchronization primitives and techniques that you should learn about if you're going to write multithreaded programs with shared objects: immutability, volatile, ReadWriteLock etc. Books like "Java Concurrency in Practice" can give you a good introduction into the field.

Related

Is the word 'mutable variable' in java concurrency programming same as the meaning in functional programming?

In the book 'Java Concurrency in Practice', when talking about 'locking and visibility', the author said:
We can now give the other reason for the rule requiring all threads to synchronize on the same lock when accessing a shared mutable variable—to guarantee that values written by one thread are made visible to other threads. Otherwise, if a thread reads a variable without holding the appropriate lock, it might see a stale value.
Here is the figure:
I'm curious about the meaning of 'mutable' here. As per my knowledge in functional programming, 'immutable' means unchangeable and 'mutable' opposite. The variable x in the figure is what the author refers to as shared mutable variable. Is x(an integer or some other similar) mutable?

A shared variable is a placeholder for a location within the shared memory. There might be some confusion due to the fact that you can have an immutable reference variable pointing to an object with mutable instance variables.
But you can always decomposed all object graphs to a set of simple variables. If all these variables are immutable, then the entire object graph is immutable. But if some of these variables are mutable, we may enter the discussion about the possibility of data races if one or more of these variables are modified in one thread and read by another thread.
For this discussion, their place in the complex object graph is irrelevant, which is reason why the discussion uses just two mutable variables, x and y, apparently of type int. They still may be members of a, e.g. a Point instance being stored in a HashMap, but the only thing that matters is that these x and y variables are being modified and, as explained in the cited book, the unlocking of M will make these modifications visible to any thread subsequently locking M, as this applies to all variables, regardless of their place within the heap memory or object graph.
Note that the mutable nature of x and y implies that there might be older values they had before the x=1 resp. y=1 assignments, which can show up when being read without synchronization. This includes the default values (0) they have before the first assignment.

Are Mutable Atomic References a Bad Idea?

I have a data structure that I occasionally wish to modify, and occasionally wish to replace outright. At the moment, I'm storing this in an AtomicReference, and using synchrnonized blocks (synchronized on the AtomicReference itself, not its stored value) when I need to modify it, rather than replace it.
So something like:
public void foo(AtomicReference reference){
synchronized(reference){
reference.get()
.performSomeModification();
}
}
Notice that the modifying call is a member of the wrapped value, not the atomic reference, and is not guaranteed to have any thread safety of its own.
Is this safe? Findbugs (a freeware code reviewing tool) had this to say about it, so now I'm worried there's something happening under the hood, where it may be prematurely releasing the lock or something. I've also seen documentation referencing AtomicReference as specifically for immutable things.
Is this safe? If it isn't I could create my own Reference-storing class that I would be more certain about the behavior of, but I don't want to jump to conclusions.

From the linked documentation:
For example, synchronizing on an AtomicBoolean will not prevent other threads from modifying the AtomicBoolean.
It can't prevent other threads from modifying the AtomicBoolean because it can't force other threads to synchronize on the AtomicBoolean.
If I understand your question correctly, your intention is to synchronize calls to performSomeModification(). The code you've written will achieve that, if and only if every call to performSomeModification() is synchronized on the same object. As in the example from the docs, the basic problem is the enforceability of that requirement. You can't force other callers to synchronize on the AtomicReference. You or some other developer who comes after you could easily call performSomeModification() without external synchronization.
You should make it hard to use your API incorrectly. Since AtomicReference is a generic type (AtomicReference<V>), you can enforce the synchronization in a variety of ways, depending on what V is:
If V is an interface, you could easily wrap the instance in a synchronized wrapper.
If V is a class that you can modify, you could synchronize performSomeModification(), or create a subclass in which it is synchronized. (Possibly an anonymous subclass produced by a factory method.)
If V is a class that you cannot modify, it may be difficult to wrap. In that case, you could encapsulate the AtomicReference in a class that you do control, and have that class perform the required synchronization.

Are Mutable Atomic References a Bad Idea?
Definitely not! AtomicReference is designed to provide thread-safe, atomic updates of the underlying reference. In fact, the Javadoc description of AtomicReference is:
An object reference that may be updated atomically.
So they most definitely are designed to be mutated!
Is this safe?
It depends on what you mean by "safe", and what the rest of your code is doing. There's nothing inherently unsafe about your snippet of code in isolation. It's perfectly valid, though perhaps a bit unusual, to synchronize on an AtomicReference. As a developer unfamiliar with this code, I would see the synchronization on reference and assume that it means that the underlying object may be replaced at any time, and you want to make sure your code is always operating on the "newest" reference.
The standard best practices for synchronization apply, and violating them could result in unsafe behavior. For example, since you say performSomeModification() is not thread-safe, it would be unsafe if you accessed the underlying object somewhere else without synchronizing on reference.
public void bar(AtomicReference reference) {
// no synchronization: performSomeModification could be called on the object
// at the same time another thread is executing foo()
reference.get().performSomeModification();
}
If could also be "unsafe" if your application requires that only one instance of the underlying object be operated on at any one time, and you haven't synchronized on the reference when .set()ing it:
public void makeNewFoo(AtomicReference reference) {
// no synchronication on "reference", so it may be updated by another thread
// while foo() is executing performSomeModification() on the "old" reference
SomeObject foo = new SomeObject();
reference.set(foo);
}
If you need to synchronize on the AtomicReference, do so, it's perfectly safe. But I would highly recommend adding a few code comments about why you're doing it.

Can a volatile variable that is never assigned null to ever contain null?

Can in the following conceptual Java example:
public class X implements Runnable {
public volatile Object x = new Object();
#Runnable
public void run() {
for (;;) {
Thread.sleep(1000);
x = new Object();
}
}
}
x ever be read as null from another thread?
Bonus: Do I need to declare it volatile (I do not really care about that value, it suffices that sometime in the future it will be the newly assigned value and never is null)

Technically, yes it can. That is the main reason for the original ConcurrentHashMap's readUnderLock. The javadoc even explains how:
Reads value field of an entry under lock. Called if value field ever appears to be null. This is possible only if a compiler happens to reorder a HashEntry initialization with its table assignment, which is legal under memory model but is not known to ever occur.
Since the HashEntry's value is volatile this type of reordering is legal on consturction.
Moral of the story is that all non-final initializations can race with object construction.
Edit:
#Nathan Hughes asked a valid question:
#John: in the OP's example wouldn't the construction have happened before the thread the runnable is passed into started? it would seem like that would impose a happens-before barrier subsequent to the field's initialization.
Doug Lea had a couple comments on this topic, the entire thread can be read here. He answered the comment:
But the issue is whether assignment of the new C instance to some other memory must occur after the volatile stores.
With the answer
Sorry for mis-remembering why I had treated this issue as basically settled:
Unless a JVM always pre-zeros memory (which usually not a good option), then
even if not explicitly initialized, volatile fields must be zeroed
in the constructor body, with a release fence before publication.
And so even though there are cases in which the JMM does not
strictly require mechanics preventing publication reordering
in constructors of classes with volatile fields, the only good
implementation choices for JVMs are either to use non-volatile writes
with a trailing release fence, or to perform each volatile write
with full fencing. Either way, there is no reordering with publication.
Unfortunately, programmers cannot rely on a spec to guarantee
it, at least until the JMM is revised.
And finished with:
Programmers do not expect that even though final fields are specifically
publication-safe, volatile fields are not always so.
For various implementation reasons, JVMs arrange that
volatile fields are publication safe anyway, at least in
cases we know about.
Actually updating the JMM/JLS to mandate this is not easy
(no small tweak that I know applies). But now is a good time
to be considering a full revision for JDK9.
In the mean time, it would make sense to further test
and validate JVMs as meeting this likely future spec.

This depends on how the X instance is published.
Suppose x is published unsafely, eg. through a non-volatile field
private X instance;
...
void someMethod() {
instance = new X();
}
Another thread accessing the instance field is allowed to see a reference value referring to an uninitialized X object (ie. where its constructor hasn't run yet). In such a case, its field x would have a value of null.
The above example translates to
temporaryReferenceOnStack = new memory for X // a reference to the instance
temporaryReferenceOnStack.<init> // call constructor
instance = temporaryReferenceOnStack;
But the language allows the following reordering
temporaryReferenceOnStack = new memory for X // a reference to the instance
instance = temporaryReferenceOnStack;
temporaryReferenceOnStack.<init> // call constructor
or directly
instance = new memory for X // a reference to the instance
instance.<init> // call constructor
In such a case, a thread is allowed to see the value of instance before the constructor is invoked to initialize the referenced object.
Now, how likely this is to happen in current JVMs? Eh, I couldn't come up with an MCVE.
Bonus: Do I need to declare it volatile (I do not really care about
that value, it suffices that sometime in the future it will be the
newly assigned value and never is null)
Publish the enclosing object safely. Or use a final AtomicReference field which you set.

No. The Java memory model guarantees that you will never seen x as null. x must always be the initial value it was assigned, or some subsequent value.
This actually works with any variable, not just volatile. What you are asking about is called "out of thin air values". C.f. Java Concurrency in Practice which talks about this concept in some length.
The other part of your question "Do I need to declare x as volatile:" given the context, yes, it should be either volatile or final. Either one provides safe publication for your object referenced by x. C.f. Safe Publication. Obviously, x can't be changed later if it's final.

Equivalent of AtomicReference but without the volatile synchronization cost

What is the equivalent of:
AtomicReference<SomeClass> ref = new AtomicReference<SomeClass>( ... );
but without the synchronization cost. Note that I do want to wrap a reference inside another object.
I've looked at the classes extending the Reference abstract class but I'm a bit lost amongst all the choices.
I need something really simple, not weak nor phantom nor all the other references besides one. Which class should I use?

If you want a reference without thread safety you can use an array of one.
MyObject[] ref = { new MyObject() };
MyObject mo = ref[0];
ref[0] = n;

If you are simply trying to store a reference in an object. Can't you create a class with a field, considering the field would be a strong reference that should achieve what you want
You shouldn't create a StrongReference class (because it would be silly) but to demonstrate it
public class StrongReference{
Object refernece;
public void set(Object ref){
this.reference =ref;
}
public Object get(){
return this.reference;
}
}

Since Java 9 you can now use AtomicReference.setPlain() and AtomicReference.getPlain().
JavaDoc on setPlain:
"Sets the value to newValue, with memory semantics of setting as if the variable was declared non-volatile and non-final."

AtomicReference does not have the cost of synchronization in the sense of traditional synchronized sections. It is implemented as non-blocking, meaning that threads that wait to "acquire the lock" are not context-switched, which makes it very fast in practice. Probably for concurrently updating a single reference, you cannot find a faster method.

If you still want to use AtomicReference but don't want to incur the cost of the volatile write you can use lazySet
The write doesn't issue a memory barrier that a normal volatile write does, but the get still invokes a volatile load (which is relatively cheap)
AtomicReference<SomeClass> ref = new AtomicReference<SomeClass>();
ref.lazySet(someClass);

I think all you want is:
public class MyReference<T>{
T reference;
public void set(T ref){
this.reference =ref;
}
public T get(){
return this.reference;
}
}
You might consider adding delegating equals(), hashcode(), and toString().

To use java.util.concurrent.atomic.AtomicReference feels wrong to me too in order to share a reference of an object. Besides the "atomicity costs" AtomicReference is full of methods that are irrelevant for your use case and may raise wrong expectations to the user.
But I haven't encounter such an equivalent class in the JDK yet.
Here is a summary of your options - chose what fits best to you:
A self-written value container like the proposed StrongReference or MyReference from the other answers
MutableObject from Apache Commons Lang
Array with length == 1 or a List with size == 1
setPlain(V) and getPlain() in AtomicReference since Java 9

all provided classes extending Reference has some special functionality attached, from atomic CaS to allowing the referenced object to be collected event thoguh a reference still exists to the object
you can create your own StringReference as John Vint explained (or use a array with length==1) but there aren't that many uses for that though

There is no synchronization cost to AtomicReference. From the description of the java.util.concurrent.atomic package:
A small toolkit of classes that support lock-free thread-safe programming on single variables.
EDIT
Based on your comments to your original post, it seems that you used the term "synchronization cost" in a non-standard way to mean thread-local cache flushing in general. On most architectures, reading a volatile is nearly as cheap as reading a non-volatile value. Any update to a shared variable is going to require cache flushing of at least that variable (unless you are going to abolish thread-local caches entirely). There isn't anything cheaper (performance-wise) than the classes in java.util.concurrent.atomic.

If your value is immutable, java.util.Optional looks like a great option.

Are non-synchronised static methods thread safe if they don't modify static class variables?

I was wondering if you have a static method that is not synchronised, but does not modify any static variables is it thread-safe? What about if the method creates local variables inside it? For example, is the following code thread-safe?
public static String[] makeStringArray( String a, String b ){
return new String[]{ a, b };
}
So if I have two threads calling ths method continously and concurrently, one with dogs (say "great dane" and "bull dog") and the other with cats (say "persian" and "siamese") will I ever get cats and dogs in the same array? Or will the cats and dogs never be inside the same invocation of the method at the same time?

This method is 100% thread safe, it would be even if it wasn't static. The problem with thread-safety arises when you need to share data between threads - you must take care of atomicity, visibility, etc.
This method only operates on parameters, which reside on stack and references to immutable objects on heap. Stack is inherently local to the thread, so no sharing of data occurs, ever.
Immutable objects (String in this case) are also thread-safe because once created they can't be changed and all threads see the same value. On the other hand if the method was accepting (mutable) Date you could have had a problem. Two threads can simultaneously modify that same object instance, causing race conditions and visibility problems.

A method can only be thread-unsafe when it changes some shared state. Whether it's static or not is irrelevant.

The function is perfectly thread safe.
If you think about it... assume what would happen if this were different. Every usual function would have threading problems if not synchronized, so all API functions in the JDK would have to be synchronized, because they could potentially be called by multiple threads. And since most time the app is using some API, multithreaded apps would effectively be impossible.
This is too ridiculous to think about it, so just for you: Methods are not threadsafe if there is a clear reason why there could be problems. Try to always think about what if there were multiple threads in my function, and what if you had a step-debugger and would one step after another advance the first... then the second thread... maybe the second again... would there be problems? If you find one, its not thread safe.
Please be also aware, that most of the Java 1.5 Collection classes are not threadsafe, except those where stated, like ConcurrentHashMap.
And if you really want to dive into this, have a close look at the volatile keyword and ALL its side effects. Have a look at the Semaphore() and Lock() class, and their friends in java.util.Concurrent. Read all the API docs around the classes. It is worth to learn and satisfying, too.
Sorry for this overly elaborate answer.

Use the static keyword with synchronized static methods to modify static data shared among threads. With the static keyword all the threads created will contend for a single version of the method.
Use the volatile keyword along with synchronized instance methods will guarantee that each thread has its own copy of the shared data and no read/ writes will leak out between threads.

String objects being immutable is another reason for thread-safe scenario above. Instead if mutable objects are used (say makeMutableArray..) then surely thread-safety will break.

Since the complete method was pushed onto the stack, any variable creation that takes place lives within the stack (again exceptions being static variables) and only accessible to one thread. So all the methods are thread safe until they change the state of some static variable.
See also:
Is static method is thread safe in Java?

Develop Reference

Java is a programming language and computing platform first released by Sun Microsystems in 1995.