java.lang.String is only effectively immutable. Brian Goetz, in "Java Concurrency in Practice", says something like this: effectively immutable objects are thread-safe only if they are safely published. Now, say I unsafely publish a String like this:
public class MultiThreadingClass {
private String myPath ="c:\\somepath";
//beginmt runs simultaneously in several threads on a single instance of MultiThreadingClass
public void beginmt(){
Holder h = new Holder();
h.setPath(new File(myPath)); //line 6
h.begin();
}
}
public class Holder {
private File path;
public void setPath(File path){
this.path = path;
}
public void begin(){
System.out.println(path.getCanonicalPath()+"some string");
}
}
At the moment that the MultiThreadingClass is initializing with its constructor, it could happen that the File constructor on line 6 may not see the value of myPath.
Then, about three seconds after the construction of the unsafely published String object, threads on MultiThreadingClass are still running. Could there still be a chance that the File constructor may not see the value of myPath?
The statement your question is about:
At the moment that the MultiThreadingClass is initializing with its
constructor, it could happen that the File constructor on line 6 may
not see the value of myPath.
The answer is complicated.
You don't need to worry about the char-array value inside the String object. As I mentioned in the comments, because it is a final field that is assigned in the constructors, and because String doesn't pass a reference to itself before assigning the final field, it is always safely published. You don't need to worry about the hash and hash32 fields either. They're not safely published, but they can only have the value 0 or the valid hash code. If they're still 0, the method String.hashCode will recalculate the value - it only leads to other threads recalculating the hashCode when this was already done earlier in a different thread.
The reference myPath in MultiThreadingClass is not safely published, because it is not final. "At the moment that the MultiThreadingClass is initializing with its constructor", but also later, after the constructor has completed, threads other than the one that ran the constructor may see the value null in myPath rather than a reference to your string.
There's an example in the Java Memory Model section of the Java Language Specification [version 8 linked but this is unchanged since JMM was released in JSR-133]:
Example 17.5-1. final Fields In The Java Memory Model
The program below illustrates how final fields compare to normal fields.
class FinalFieldExample {
final int x;
int y;
static FinalFieldExample f;
public FinalFieldExample() {
x = 3;
y = 4;
}
static void writer() {
f = new FinalFieldExample();
}
static void reader() {
if (f != null) {
int i = f.x; // guaranteed to see 3
int j = f.y; // could see 0
}
}
}
The class FinalFieldExample has a final int field x and a non-final
int field y. One thread might execute the method writer and another
might execute the method reader.
Because the writer method writes f after the object's constructor
finishes, the reader method will be guaranteed to see the properly
initialized value for f.x: it will read the value 3. However, f.y is
not final; the reader method is therefore not guaranteed to see the
value 4 for it.
This is even likely to happen on a heavily loaded machine with many threads.
Workarounds/solutions:
Make myPath a final field (and don't add constructors that pass out the this reference before assigning the field)
Make myPath a volatile field
Make sure that all threads accessing myPath synchronize on the same monitor object before accessing myPath. For example, by making beginmt a synchronized method, or by any other means.
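As a rough sketch of the first workaround, here is a cut-down variant of the question's class with myPath declared final and assigned in the constructor. The class name FinalPathExample and the method toFile are made up for illustration; they are not part of the original code.

```java
import java.io.File;

// Hypothetical variant of MultiThreadingClass with the first workaround applied.
class FinalPathExample {
    // final: covered by the JMM's initialization-safety guarantee,
    // provided 'this' does not escape while the constructor is running.
    private final String myPath;

    FinalPathExample(String myPath) {
        this.myPath = myPath; // no reference to 'this' is handed out here
    }

    // Any thread that obtains a reference to this object sees the assigned path.
    File toFile() {
        return new File(myPath);
    }

    public static void main(String[] args) {
        FinalPathExample example = new FinalPathExample("c:\\somepath");
        System.out.println(example.toFile().getPath());
    }
}
```

The design point is that the guarantee comes from the field declaration itself, so callers need no locking to read it.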
Could there still be a chance that the File constructor may not see the value of myPath?
The answer is yes, it is possible, since the Java Memory Model guarantees visibility only for final fields:
"A new guarantee of initialization safety should be provided. If an object is properly constructed (which means that references to it do not escape during construction), then all threads which see a reference to that object will also see the values for its final fields that were set in the constructor, without the need for synchronization."
JSR 133 Link
However, I feel this situation is practically impossible to recreate (I tried to reproduce a similar case earlier, in vain).
There is also the case of unsafe publication where the this reference escapes from within the constructor, which can lead to myPath being observed before it is initialized. An example of this is given in Listing 3.7 of the book you mentioned. Below is an example that lets the this reference of your class escape in the constructor.
public class MultiThreadingClass implements Runnable{
public static volatile MultiThreadingClass unsafeObject;
private String myPath ="c:\\somepath";
public MultiThreadingClass() {
unsafeObject = this;
.....
}
public void beginmt(){
Holder h = new Holder();
h.setPath(new File(myPath)); //line 6
h.begin();
}
}
The class above lets other threads reach the object through unsafeObject before myPath is correctly set, but again, recreating this scenario might be difficult.
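For contrast, one common way to avoid the escape is a private constructor plus a static factory that publishes the reference only after construction has finished. This is a sketch under that assumption; SafeMultiThreadingClass, create and getMyPath are illustrative names, not taken from the question.

```java
// Hypothetical counterpart to the class above: 'this' never escapes the constructor.
class SafeMultiThreadingClass {
    // Shared reference, written only after the constructor has returned.
    static volatile SafeMultiThreadingClass instance;

    private final String myPath;

    private SafeMultiThreadingClass(String myPath) {
        this.myPath = myPath; // nothing else can see 'this' yet
    }

    // Factory method: construct first, publish second.
    static SafeMultiThreadingClass create(String myPath) {
        SafeMultiThreadingClass created = new SafeMultiThreadingClass(myPath);
        instance = created;
        return created;
    }

    String getMyPath() {
        return myPath;
    }

    public static void main(String[] args) {
        create("c:\\somepath");
        System.out.println(instance.getMyPath());
    }
}
```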
Consider the following code:
public class Resource {
public int val;
public Resource() {
val = 3;
}
}
public class UnsafePublication {
public static Resource resource;
public static void initialize() {
resource = new Resource();
}
}
In Thread1
UnsafePublication.initialize();
In Thread2
while (true) {
if (UnsafePublication.resource != null) {
System.out.println(UnsafePublication.resource.val);
break;
}
}
We know that the unsafe publication may lead to Thread2 printing 0 instead of 3.
After going through many reference materials, I concluded two explanations:
Assume the assignment resource = new Resource(); does write through, so Thread2 will find UnsafePublication.resource is not null. But the assignment val = 3; in the constructor of Resource does not write through, so the val will be the default 0 value.
There is a reorder when assigning new Resource() to resource. To be more specific:
allocate the memory for a Resource object
call Resource's constructor to initialize the object
assign the object to the field "resource"
is reordered to:
allocate the memory for a Resource object
assign the object to the field "resource"
call Resource's constructor to initialize the object.
So if Thread1 has just finished step 2 of the reordered version when Thread2 is scheduled, Thread2 will find that UnsafePublication.resource.val still has the default value 0.
So here is my question: are both explanations correct and possible? And in the real world, can these two factors even be mixed, making the situation more complicated?
There are three possibilities here.
Thread2 may view resource as null.
Thread2 may view a reference to a partially constructed instance (i.e. memory is allocated on the heap, but the variables are not yet initialized), so the value of val variable might be zero.
Thread2 may observe a fully constructed instance and the latest value of the variable, which is 3.
In fact, you can't predict the behavior of the program, since no happens-before relationship is established between the write and the subsequent read operations. So all of these outcomes are possible.
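To show what establishing a happens-before relationship could look like here, the sketch below publishes the object through a volatile field. The names PublishedResource, SafePublication and awaitValue are made up for this illustration; the shape follows the question's code.

```java
// Same shape as Resource in the question; renamed to keep the sketch standalone.
class PublishedResource {
    int val;

    PublishedResource() {
        val = 3;
    }
}

class SafePublication {
    // volatile: the write in initialize() happens-before every read that observes it,
    // so a reader that sees a non-null reference also sees val == 3.
    static volatile PublishedResource resource;

    static void initialize() {
        resource = new PublishedResource();
    }

    // Spins until the reference becomes visible, then reads the field.
    static int awaitValue() {
        PublishedResource seen;
        while ((seen = resource) == null) {
            Thread.yield();
        }
        return seen.val;
    }

    public static void main(String[] args) {
        new Thread(SafePublication::initialize).start();
        System.out.println(awaitValue());
    }
}
```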
When I read jsr-133-faq, in question "How do final fields work under the new JMM?", it said:
class FinalFieldExample {
final int x;
int y;
static FinalFieldExample f;
public FinalFieldExample() {
x = 3;
y = 4;
}
static void writer() {
f = new FinalFieldExample();
}
static void reader() {
if (f != null) {
int i = f.x;
int j = f.y;
}
}
}
The class above is an example of how final fields should be used. A thread executing reader is guaranteed to see the value 3 for f.x, because it is final. It is not guaranteed to see the value 4 for y, because it is not final.
This confuses me, because the code in writer is not a safe publication: a thread executing reader may see that f is not null while the constructor of the object that f references has not finished yet. So even if x is final, the thread executing reader cannot be guaranteed to see the value 3 for f.x.
This is where I'm confused. Please correct me if I am wrong. Thank you very much.
This is the whole point, and it is what's great about final fields in the JMM. Yes, the compiler can generally assign a reference to the object before the object is even fully constructed, and this is unsafe publication because the object may be accessed in a partially constructed state. However, for final fields, the JMM (and the compiler) guarantees that all final fields will be prepared first, before the reference to the object is assigned. The publication may still be unsafe and the object still only partially constructed when a new thread accesses it, but at least the final fields will be in their expected state. From chapter 16.3 in Java Concurrency in Practice:
Initialization safety makes visibility guarantees only for the values
that are reachable through final fields as of the time the constructor
finishes. For values reachable through nonfinal fields, or values that
may change after construction, you must use synchronization to ensure
visibility.
I also recommend reading chapter 16.3 for more detail.
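To illustrate "reachable through final fields", here is a small sketch in which a map is filled in the constructor and held in a final field. The class name StateCodes and its contents are assumptions chosen for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical example: the map is reachable through a final field and is
// never modified after the constructor finishes.
class StateCodes {
    private final Map<String, String> codes;

    StateCodes() {
        codes = new HashMap<>();
        codes.put("alaska", "AK");
        codes.put("wyoming", "WY");
        // Initialization safety covers the map contents as of this point.
    }

    // Read-only access; adding a mutator would require synchronization.
    String codeFor(String state) {
        return codes.get(state);
    }

    public static void main(String[] args) {
        System.out.println(new StateCodes().codeFor("alaska"));
    }
}
```

If the map were changed after construction, or held in a non-final field, the quoted guarantee would no longer apply.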
I was reading about Immutable objects in java.
There is a statement which states - "Immutable objects are thread-safe".
I need more clarification for the above statement:
Say I have a shared resource of type 'String' that is shared among multiple threads (say 3).
If one of the threads makes a change through the shared reference, a new String object is created that is available only to that thread, and the other threads will not know about the change.
Will it not lead to data inconsistency?
So could anyone please help me understand this?
Thanks in advance.
An object of an immutable class is always thread-safe. However, a reference to such an object is a separate matter. It may or may not be thread-safe, depending on how the class that holds the reference to this thread-safe object (a String object, for example) is written and on the methods that manipulate the reference.
See this example below, to illustrate the explanation:
public class User {
private String name;
public User(String name) {
this.name=name;
}
public String getName() {
return name;
}
public void setName(String name) {
this.name=name;
}
}
So, even though String is an immutable and thread-safe class, the reference to it in the class User above isn't thread-safe because its methods could be concurrently changing this reference to new String objects.
If you wanted it to be thread-safe, there would be many options. One of them would be to set all methods that can change the name object as synchronized.
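A minimal sketch of the synchronized option might look like this. SynchronizedUser is an illustrative name, and the sketch assumes the instance itself is handed to other threads in a safe way.

```java
// Hypothetical thread-safe variant of User: every access to 'name'
// goes through the same monitor (the instance itself).
class SynchronizedUser {
    private String name;

    SynchronizedUser(String name) {
        this.name = name;
    }

    synchronized String getName() {
        return name;
    }

    synchronized void setName(String name) {
        this.name = name;
    }

    public static void main(String[] args) {
        SynchronizedUser user = new SynchronizedUser("alice");
        user.setName("bob");
        System.out.println(user.getName());
    }
}
```

Synchronizing the getter as well as the setter matters here: without it, a reader has no guarantee of seeing the latest name.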
Immutable types (in any language) are special reference types.
What makes them special is that "changing" one never modifies it: the change produces an entirely new instance that reflects the previous state plus the change.
Let's say two threads run a function that receives an immutable reference-type object as a parameter (I'll use a string for this example):
// pseudo code !!
main()
{
var str = "initialState";
new Thread(Do,str,1).Start();
new Thread(Do,str,2).Start();
}
void Do(string arg,int tid)
{
int i = 0;
while(true)
{
arg += " running in thread " + tid + " for the " + i + " time";
Console.WriteLine(arg);
i++; // advance the iteration counter
}
}
This program will print, in parallel, the number of iterations run in each thread, without affecting what is printed in the other thread.
Why?
If you think about it, string is a reference type: what gets passed is the reference, so the string object itself is not copied, only the reference to it.
And yet, although both threads start with the same object, after the first change each thread holds a different, newly created one.
If it were still the same object after the changes, it would be a resource shared between the two threads and would corrupt the state those threads meant to print.
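The same idea can be sketched in Java. The class ConcatDemo and the method appendTag are made-up names; the point is only that += on a String rebinds the local variable to a new object and leaves the original untouched.

```java
// Hypothetical Java counterpart to the pseudo code above.
class ConcatDemo {
    // 'arg' is a local copy of the reference; += creates a new String and
    // rebinds only this local variable.
    static String appendTag(String arg, int tid) {
        arg += " [thread " + tid + "]";
        return arg;
    }

    public static void main(String[] args) {
        final String shared = "initialState";
        for (int tid = 1; tid <= 2; tid++) {
            final int id = tid;
            new Thread(() -> System.out.println(appendTag(shared, id))).start();
        }
        // The shared String object itself was never modified by either thread.
        System.out.println(shared);
    }
}
```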
Let's say I have code like this in my servlet:
private static final String RESOURCE_URL_PATTERN = "resourceUrlPattern";
private static final String PARAM_SEPARATOR = "|";
private List<String> resourcePatterns;
@Override
public void init() throws ServletException {
String resourcePatterns = getInitParameter(RESOURCE_URL_PATTERN);
this.resourcePatterns = com.google.common.base.Splitter.on(PARAM_SEPARATOR).trimResults().splitToList(resourcePatterns);
}
Is it thread-safe to use 'resourcePatterns' if it will never be modified?
Let's say like this:
private boolean isValidRequest(String servletPath) {
for (String resourcePattern : resourcePatterns) {
if (servletPath.matches(resourcePattern)) {
return true;
}
}
return false;
}
Should I use CopyOnWriteArrayList, or is ArrayList OK in this case?
Yes, List is fine to read from multiple threads concurrently, so long as nothing's writing.
For more detailed information on this, please see this answer that explains this further. There are some important gotchas.
From Java Concurrency in Practice we have:
To publish an object safely, both the reference to the object and the
object's state must be made visible to other threads at the same time.
A properly constructed object can be safely published by:
Initializing an object reference from a static initializer.
Storing a reference to it into a volatile field.
Storing a reference to it into a final field.
Storing a reference to it into a field that is properly guarded by a (synchronized) lock.
Your list is published in none of these ways. I suggest making it final (which means assigning it in the constructor or a field initializer rather than in init()), as this will make your object effectively immutable, which in this case would be enough. If init() is called several times, you should make it volatile instead. With this I of course assume that NO changes to the elements of the list occur and that you don't expose any elements of the list either (as in a getElementAtPosition(int pos) method or the like).
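A sketch of the volatile variant, written without the servlet API or Guava so that it runs on its own. ResourcePatternHolder and the String parameter of init are assumptions made for the example; in the real servlet the value would come from getInitParameter.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical stand-in for the servlet: the list is built completely,
// wrapped as unmodifiable, and only then stored in a volatile field.
class ResourcePatternHolder {
    private static final String PARAM_SEPARATOR = "|";

    private volatile List<String> resourcePatterns = Collections.emptyList();

    void init(String rawPatterns) {
        List<String> parsed = new ArrayList<>();
        // Pattern.quote is needed because "|" is a regex metacharacter.
        for (String part : rawPatterns.split(Pattern.quote(PARAM_SEPARATOR))) {
            parsed.add(part.trim());
        }
        resourcePatterns = Collections.unmodifiableList(parsed);
    }

    boolean isValidRequest(String servletPath) {
        for (String resourcePattern : resourcePatterns) {
            if (servletPath.matches(resourcePattern)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        ResourcePatternHolder holder = new ResourcePatternHolder();
        holder.init("/css/.* | /js/.*");
        System.out.println(holder.isValidRequest("/js/app.js"));
    }
}
```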
Say you have the following class
public class AccessStatistics {
private final int noPages, noErrors;
public AccessStatistics(int noPages, int noErrors) {
this.noPages = noPages;
this.noErrors = noErrors;
}
public int getNoPages() { return noPages; }
public int getNoErrors() { return noErrors; }
}
and you execute the following code
private AtomicReference<AccessStatistics> stats =
new AtomicReference<AccessStatistics>(new AccessStatistics(0, 0));
public void incrementPageCount(boolean wasError) {
AccessStatistics prev, newValue;
do {
prev = stats.get();
int noPages = prev.getNoPages() + 1;
int noErrors = prev.getNoErrors();
if (wasError) {
noErrors++;
}
newValue = new AccessStatistics(noPages, noErrors);
} while (!stats.compareAndSet(prev, newValue));
}
In the last line while (!stats.compareAndSet(prev, newValue)) how does the compareAndSet method determine equality between prev and newValue? Is the AccessStatistics class required to implement an equals() method? If not, why? The javadoc states the following for AtomicReference.compareAndSet
Atomically sets the value to the given updated value if the current value == the expected value.
... but this assertion seems very general, and the tutorials I've read on AtomicReference never suggest implementing an equals() for a class wrapped in an AtomicReference.
If classes wrapped in AtomicReference are required to implement equals() then for objects more complex than AccessStatistics I'm thinking it may be faster to synchronize methods that update the object and not use AtomicReference.
It compares the references exactly as if you had used the == operator. That means that the references must be pointing to the same instance. Object.equals() is not used.
Actually, it does not compare prev and newValue!
Instead, it compares the value stored within stats to prev, and only when those are the same does it update the value stored within stats to newValue. As said above, it uses the equality operator (==) to do so. This means that stats is updated only when prev points to the same object as the one stored in stats.
It simply checks the object reference equality (aka ==), so if object reference held by AtomicReference had changed after you got the reference, it won't change the reference, so you'll have to start over.
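A small sketch makes the reference comparison visible: two String objects that are equal according to equals() but are distinct instances behave differently as the expected value. CasIdentityDemo and its method names are made up for this illustration.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical demo: compareAndSet succeeds only for the very same instance.
class CasIdentityDemo {
    // Expected value is equal by equals() but is a different object: CAS fails.
    static boolean casWithEqualCopy() {
        String current = new String("v1");
        AtomicReference<String> ref = new AtomicReference<>(current);
        String equalButDistinct = new String("v1");
        return ref.compareAndSet(equalButDistinct, "v2");
    }

    // Expected value is the instance actually stored: CAS succeeds.
    static boolean casWithSameInstance() {
        String current = new String("v1");
        AtomicReference<String> ref = new AtomicReference<>(current);
        return ref.compareAndSet(current, "v2");
    }

    public static void main(String[] args) {
        System.out.println(casWithEqualCopy());
        System.out.println(casWithSameInstance());
    }
}
```

This is why the retry loop in the question works without an equals() method: prev is the exact reference that stats.get() returned.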
Here is some of the source code of AtomicReference. An AtomicReference holds an object reference, stored in a volatile member variable of the AtomicReference instance, as below.
private volatile V value;
get() simply returns the latest value of the variable (as volatiles do in a "happens before" manner).
public final V get()
The following is the most important method of AtomicReference.
public final boolean compareAndSet(V expect, V update) {
return unsafe.compareAndSwapObject(this, valueOffset, expect, update);
}
The compareAndSet(expect, update) method calls the compareAndSwapObject() method of Java's Unsafe class. That call is native and typically maps to a single atomic processor instruction. "expect" and "update" each reference an object.
If and only if the AtomicReference member variable "value" refers to the same object that "expect" refers to, "update" is assigned to that variable and true is returned. Otherwise, false is returned. The whole thing is done atomically; no other thread can intervene in between. Because this is a single processor operation (the magic of modern computer architecture), it's often faster than using a synchronized block. But remember that when multiple variables need to be updated atomically, AtomicReference won't help.
I would like to add a full-fledged running example, which can be run in Eclipse. It should clear up much of the confusion. Here 22 users (MyTh threads) are trying to book 20 seats. The code snippet below is followed by a link to the full code.
Code snippet where 22 users are trying to book 20 seats:
for (int i = 0; i < 20; i++) {// 20 seats
seats.add(new AtomicReference<Integer>());
}
Thread[] ths = new Thread[22];// 22 users
for (int i = 0; i < ths.length; i++) {
ths[i] = new MyTh(seats, i);
ths[i].start();
}
Here is the GitHub link for those who want to see the full running code, which is small and concise.
https://github.com/sankar4git/atomicReference/blob/master/Solution.java
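Since MyTh is not shown in the snippet, here is a self-contained sketch of how such a booking could work with compareAndSet. It is not the linked code: SeatBookingSketch, book and simulate are hypothetical names, and the booking rule (claim the first seat whose reference is still null) is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical seat booking: a seat is free while its reference is null.
class SeatBookingSketch {
    // Tries each seat in turn; returns the seat index claimed, or -1 if none is left.
    static int book(List<AtomicReference<Integer>> seats, int userId) {
        for (int i = 0; i < seats.size(); i++) {
            // Succeeds for exactly one thread per seat.
            if (seats.get(i).compareAndSet(null, userId)) {
                return i;
            }
        }
        return -1;
    }

    // Runs one thread per user and returns how many users got a seat.
    static int simulate(int seatCount, int userCount) {
        List<AtomicReference<Integer>> seats = new ArrayList<>();
        for (int i = 0; i < seatCount; i++) {
            seats.add(new AtomicReference<Integer>());
        }
        AtomicInteger booked = new AtomicInteger();
        Thread[] users = new Thread[userCount];
        for (int i = 0; i < users.length; i++) {
            final int userId = i;
            users[i] = new Thread(() -> {
                if (book(seats, userId) >= 0) {
                    booked.incrementAndGet();
                }
            });
            users[i].start();
        }
        for (Thread user : users) {
            try {
                user.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return booked.get();
    }

    public static void main(String[] args) {
        System.out.println(simulate(20, 22) + " of 22 users got a seat");
    }
}
```

Because each seat can change from null to a user id only once, no seat is ever double-booked, and two of the 22 users are left without one.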