How do the read (outside the critical section) and the write (inside the critical section) avoid atomicity issues?
I have read about this and discussed it with various people, but most answers do not address whether both operations are atomic, or how atomicity is actually achieved for the problem below.
class ABC {
    private static volatile ABC abcInstance;

    static ABC getInstance() {
        if (abcInstance == null) {
            synchronized (ABC.class) {
                if (abcInstance == null) {
                    abcInstance = new ABC();
                    return abcInstance;
                }
            }
        }
        return abcInstance;
    }
}
Are if (abcInstance == null) outside the synchronized block and abcInstance = new ABC(); atomic? If not, is this way of creating singletons wrong?
In C++, abcInstance = new ABC(); consists, broadly speaking, of three steps:
1. Allocate memory for the ABC object.
2. Construct the ABC object in that memory.
3. Assign the address to abcInstance.
For optimisation, the compiler (or CPU) can reorder these steps, for example as 1 -> 3 -> 2. Suppose it does, and another thread calls getInstance() after step 3 but before step 2: it will read a non-null abcInstance that points to memory which does not yet contain a constructed ABC object.
Please correct me if I am wrong, for both C++ and Java.
This answers the Java part of your question only.
Are if (abcInstance == null) and abcInstance = new ABC(); atomic? If not, is this way of creating a singleton wrong?
It is not atomicity that is the (potential) problem. (Reference assignment is atomic from the perspective of both the thread doing the assignment, and the thread reading the assigned variable.)
The problem is when the value written to abcInstance becomes visible to another thread.
Prior to Java 5, the memory model did not provide sufficient guarantees about memory visibility for that implementation to work reliably.
In the Java 5 (and later) memory model, there is a happens before relation between one thread's write to a volatile variable and another thread's subsequent read of the variable. This means:
The second thread is guaranteed to see the non-null value of abcInstance if the first thread has written it.
The happens before relation also guarantees that the second thread will see the fully initialized state of the ABC instance created by the first thread.
The synchronized block ensures that only one ABC instance may be created at a time.
This is the authoritative article explaining why old double-checked locking implementations were broken:
The "Double-Checked Locking is Broken" Declaration
As Andrew Turner states, there is a simpler, cleaner way to implement singleton classes in Java: use an enum.
Implementing Singleton with an Enum (in Java)
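The enum approach can be sketched as follows; the class name, field, and method here are illustrative, not taken from the linked article:

```java
// Minimal enum-based singleton sketch (names are illustrative).
// The JVM guarantees that INSTANCE is created exactly once and
// published safely, with no explicit synchronization needed.
public enum EnumSingleton {
    INSTANCE;

    // Example state; initialized exactly once, during enum class
    // initialization, which the JLS performs thread-safely.
    private final String config = "loaded";

    public String getConfig() {
        return config;
    }
}
```

Usage is simply EnumSingleton.INSTANCE.getConfig(); there is no lazy-initialization race to reason about at all.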
Here are two typical singleton variants in C++.
First one shared by all threads:
class singleton {
private:
    singleton() {}

public:
    singleton(const singleton&) = delete;

    static singleton& get_instance() {
        static singleton ins;
        return ins;
    }
};
And here's one that will create one instance per thread that needs it:
class tl_singleton {
private:
    tl_singleton() {}

public:
    tl_singleton(const tl_singleton&) = delete;

    static tl_singleton& get_instance() {
        static thread_local tl_singleton ins;
        return ins;
    }
};
Related
The following class is implemented with the Singleton pattern. When multiple threads access this method, only one thread should create the instance, so all I am doing is synchronising the method:
private static synchronized FactoryAPI getIOInstance() {
    if (factoryAPI == null) {
        FileUtils.initWrapperProp();
        factoryAPI = new FactoryAPIImpl();
    }
    return factoryAPI;
}
This feels unnecessary, because the instance is created only the first time and every later call just returns the instance that already exists, yet marking the method synchronized allows only one thread to access it at a time.
The getIOInstance method does two jobs:
i) initialising properties, and
ii) creating a new instance the first time.
So I'm trying to use block-level synchronisation instead, like the following:
private static FactoryAPI getIOInstance() {
    if (factoryAPI == null) {
        synchronized (FactoryAPI.class) {
            if (factoryAPI == null) {
                FileUtils.initWrapperProp();
                factoryAPI = new FactoryAPIImpl();
            }
        }
    }
    return factoryAPI;
}
I prefer the second one. Am I using it in the right way? Any suggestions are welcome.
Use the first method because the second one is not thread-safe.
When you say,
factoryAPI = new FactoryAPIImpl();
The compiler is free to execute the code in the following order:
1) Allocate some memory on the heap
2) Initialize factoryAPI to the address of that allocated space
3) Call the constructor of FactoryAPIImpl
The problem is when another thread calls getIOInstance() after step 2 and before step 3. It may see a non-null factoryAPI variable that points to an uninitialized FactoryAPI instance.
There are many different answers to this problem; you can find an extensive discussion at SEI, for example.
The modern Java solution is simple: use an enum, as the JLS guarantees that the JVM will create exactly one instance.
Found the initialization-on-demand holder idiom an interesting way of initialising, like the following:
public class FactoryAPI {
    private FactoryAPI() {}

    private static class LazyHolder {
        static final FactoryAPI INSTANCE = new FactoryAPI();
    }

    public static FactoryAPI getInstance() {
        return LazyHolder.INSTANCE;
    }
}
Since the class initialization phase is guaranteed by the JLS to be serial, i.e., non-concurrent, no further synchronization is required in the static getInstance method during loading and initialization.
And since the initialization phase writes the static variable INSTANCE in a serial operation, all subsequent concurrent invocations of the getInstance will return the same correctly initialized INSTANCE without incurring any additional synchronization overhead.
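The one-time, lazy initialization can be observed with a small runnable sketch (the class and field names are mine, not from the text above) that logs the order of events:

```java
// Demonstrates that the holder class is initialized only on first use,
// and that every call returns the same instance.
public class HolderDemo {
    static final StringBuilder log = new StringBuilder();

    private HolderDemo() { log.append("ctor;"); }

    private static class LazyHolder {
        static final HolderDemo INSTANCE = new HolderDemo();
    }

    public static HolderDemo getInstance() { return LazyHolder.INSTANCE; }

    public static void main(String[] args) {
        log.append("before;");        // LazyHolder not yet initialized
        HolderDemo a = getInstance(); // triggers LazyHolder class init
        HolderDemo b = getInstance(); // no further initialization
        log.append(a == b ? "same;" : "different;");
        System.out.println(log);      // prints before;ctor;same;
    }
}
```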
After reading dozens of articles about DCL, I feel that I should not use this idiom without volatile. If I don't follow this rule, my code will not be thread-safe, and will be very, very bad for a hundred different reasons.
Recently I reread the basics of happens-before, and now I have a somewhat different view.
Let's examine this singleton code listing:
public class Singleton {
    private static Singleton instance = null;

    public static Singleton getInstance() {
        if (instance == null) { // point 1
            synchronized (Singleton.class) {
                if (instance == null)
                    instance = new Singleton();
            }
        }
        return instance; // point 2
    }
}
We change instance only inside the synchronized (Singleton.class) block, so we will see its actual value inside any synchronized section that uses the same monitor, which is our case.
Thus I now suspect that this is not efficient, but that it is thread-safe.
Am I right?
Only one concern:
Can if (instance == null) at point 1 see a non-null value before the assignment to instance has actually completed?
But I am still absolutely sure that this code cannot create two singleton instances.
P.S.
I read a bit more, and it looks like if we read a non-null value at point 1, the return at point 2 could still return null.
The problem in your example is not the possible creation of two instances; it is true that only one instance will be created. The real problem is that, when multiple threads access this method, another thread can start using a partially constructed instance (1)(2).
So the instance variable should definitely be declared volatile (which is missing in your code block); otherwise you have to worry about the "freshness" of this variable's value.
Because there is no locking if the field is already initialized, it is
critical that the field be declared volatile (Item 66) [J. Bloch, "Effective Java", Item 71]
So:
private static volatile Something instance;
(BTW, explicitly assigning null value is redundant).
Why it doesn't work without volatile is well explained here:
The most obvious reason it doesn't work is that the writes that initialize the Helper object and the write to the helper field can be done or perceived out of order.
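Putting the pieces together, a corrected version of the listing above is only one keyword away. This is a sketch, with the class renamed to keep it self-contained:

```java
// Double-checked locking made correct by declaring the field volatile:
// the volatile write at the assignment happens-before the volatile read
// at point 1, so a non-null value always refers to a fully constructed object.
public class SafeSingleton {
    private static volatile SafeSingleton instance;

    private SafeSingleton() {}

    public static SafeSingleton getInstance() {
        if (instance == null) {                     // point 1: volatile read
            synchronized (SafeSingleton.class) {
                if (instance == null)
                    instance = new SafeSingleton(); // volatile write publishes the object
            }
        }
        return instance;                            // point 2
    }
}
```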
I've seen this type of code a lot in projects, where the application wants a global data holder, so it uses a static singleton that any thread can access.
public class GlobalData {
    // Data-related code. This could be anything; I've used a simple String.
    //
    private String someData;

    public String getData() { return someData; }
    public void setData(String data) { someData = data; }

    // Singleton code
    //
    private static GlobalData INSTANCE;

    private GlobalData() {}

    public static synchronized GlobalData getInstance() {
        if (INSTANCE == null) INSTANCE = new GlobalData();
        return INSTANCE;
    }
}
I hope it's easy to see what's going on. One can call GlobalData.getInstance().getData() at any time on any thread. If two threads call setData() with different values, even if you can't guarantee which one "wins", I'm not worried about that.
But thread-safety isn't my concern here. What I'm worried about is memory visibility. Whenever there's a memory barrier in Java, the cached memory is synched between the corresponding threads. A memory barrier happens when passing through synchronizations, accessing volatile variables, etc.
Imagine the following scenario happening in chronological order:
// Thread 1
GlobalData d = GlobalData.getInstance();
d.setData("one");
// Thread 2
GlobalData d = GlobalData.getInstance();
d.setData("two");
// Thread 1
String value = d.getData();
Isn't it possible that the last value of value in thread 1 can still be "one"? The reason being, thread 2 never called any synchronized methods after calling d.setData("two") so there was never a memory barrier? Note that the memory-barrier in this case happens every time getInstance() is called because it's synchronized.
You are absolutely correct.
There is no guarantee that writes in one thread will be visible in another.
To provide this guarantee you would need to use the volatile keyword:
private volatile String someData;
Incidentally, you can leverage the Java classloader to provide thread-safe lazy init of your singleton, as documented here. This avoids the synchronized keyword and therefore saves you some locking.
It is worth noting that the current accepted best practice is to use an enum for storing singleton data in Java.
Correct, it is possible that thread 1 still sees the value as being "one" since no memory synchronization event occurred and there is no happens before relationship between thread 1 and thread 2 (see section 17.4.5 of the JLS).
If someData was volatile then thread 1 would see the value as "two" (assuming thread 2 completed before thread 1 fetched the value).
Lastly, and off topic, the implementation of the singleton is slightly less than ideal since it synchronizes on every access. It is generally better to use an enum to implement a singleton or at the very least, assign the instance in a static initializer so no call to the constructor is required in the getInstance method.
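The static-initializer variant mentioned there could look like this (a sketch; the class name is mine, and the string field from GlobalData is kept for continuity):

```java
// Eagerly initialized singleton: the instance is created during class
// initialization, which the JLS performs safely, so getInstance needs
// no synchronization; the data field itself is volatile for visibility.
public class EagerGlobalData {
    private static final EagerGlobalData INSTANCE = new EagerGlobalData();

    private volatile String someData; // visibility across threads

    private EagerGlobalData() {}

    public static EagerGlobalData getInstance() { return INSTANCE; }

    public String getData() { return someData; }
    public void setData(String data) { someData = data; }
}
```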
Isn't it possible that the last value of value in thread 1 can still be "one"?
Yes it is. The java memory model is based on happens before (hb) relationships. In your case, you only have getInstance exit happens-before subsequent getInstance entry, due to the synchronized keyword.
So if we take your example (assuming the thread interleaving is in that order):
// Thread 1
GlobalData d = GlobalData.getInstance(); //S1
d.setData("one");
// Thread 2
GlobalData d = GlobalData.getInstance(); //S2
d.setData("two");
// Thread 1
String value = d.getData();
You have S1 hb S2. If you called d.getData() from Thread2 after S2, you would see "one". But the last read of d is not guaranteed to see "two".
I saw some examples in Java that synchronize on a block of code to change some variable while that variable was originally declared volatile. I saw this in a singleton class example that declared the unique instance volatile and synchronized the block that initializes it. My question is: why declare it volatile when we also synchronize on it? Why do we need both? Isn't one of them sufficient without the other?
public class SomeClass {
    volatile static Object uniqueInstance = null;

    public static Object getInstance() {
        if (uniqueInstance == null) {
            synchronized (SomeClass.class) {
                if (uniqueInstance == null) {
                    uniqueInstance = new SomeClass();
                }
            }
        }
        return uniqueInstance;
    }
}
thanks in advance.
Synchronization by itself would be enough in this case if the first check were within the synchronized block (but it's not, and one thread might not see changes performed by another if the variable were not volatile). Volatile alone would not be enough, because you need to perform more than one operation atomically. But beware! What you have here is so-called double-checked locking, a common idiom which unfortunately did not work reliably for a long time. I think this changed in Java 1.6, but this kind of code may still be risky.
EDIT: when the variable is volatile, this code works correctly since JDK 5 (not 6 as I wrote earlier), but it will not work as expected under JDK 1.4 or earlier.
This uses double-checked locking; note that the first if (uniqueInstance == null) is not within the synchronized part.
If uniqueInstance is not volatile, it might be "initialized" with a partially constructed object, parts of which are visible only to the thread executing inside the synchronized block. volatile makes this an all-or-nothing operation in this case.
If you didn't have the synchronized block, you could end up with two threads getting to this point at the same time:
if (uniqueInstance == null) {
    uniqueInstance = new SomeClass(); <---- here
and then you would construct two SomeClass objects, which defeats the purpose.
Strictly speaking, you don't need volatile; the method could have been:
public static SomeClass getInstance() {
    synchronized (SomeClass.class) {
        if (uniqueInstance == null) {
            uniqueInstance = new SomeClass();
        }
        return uniqueInstance;
    }
}
But that incurs the synchronization, and the serialization, of every thread that calls getInstance().
This post explains the idea behind volatile.
It is also addressed in the seminal work, Java Concurrency in Practice.
The main idea is that concurrency not only involves protection of shared state but also the visibility of that state between threads: this is where volatile comes in. (This larger contract is defined by the Java Memory Model.)
You can achieve the visibility effects of synchronization without using a synchronized block, so it is not always necessary to use a volatile variable together with one:
volatile forces reads and writes of that one variable to go through main memory, while synchronized flushes and refreshes all shared variables that were accessed from main memory.
So you can use one or the other according to your requirement.
My two cents here.
First, a quick explanation of the intuition behind this code:
if (uniqueInstance == null) {
    synchronized (SomeClass.class) {
        if (uniqueInstance == null) {
            uniqueInstance = new SomeClass();
        }
    }
}
The reason it checks uniqueInstance == null twice is to reduce the overhead of entering the synchronized block, which is relatively slow; hence the name double-checked locking.
Second, the reason it uses synchronized is easy to understand: it makes the two operations inside the synchronized block atomic.
Last, the volatile modifier makes sure all threads see the same copy, so the very first check, outside of the synchronized block, sees the value of uniqueInstance in a way that is "synchronized" with the synchronized block. Without the volatile modifier, one thread could assign a value to uniqueInstance that another thread's first check does not see (although the second check would see it).
I want to implement lazy initialization for multithreading in Java.
I have some code of the sort:
class Foo {
    private Helper helper = null;

    public Helper getHelper() {
        if (helper == null) {
            Helper h;
            synchronized (this) {
                h = helper;
                if (h == null)
                    synchronized (this) {
                        h = new Helper();
                    } // release inner synchronization lock
                helper = h;
            }
        }
        return helper;
    }

    // other functions and members...
}
And I'm getting the "Double-Checked Locking is Broken" declaration.
How can I solve this?
Here is the idiom recommended in Item 71, "Use lazy initialization judiciously", of Effective Java:
If you need to use lazy initialization for performance on an
instance field, use the double-check
idiom. This idiom avoids the cost
of locking when accessing the field
after it has been initialized (Item
67). The idea behind the idiom is to
check the value of the field twice
(hence the name double-check): once
without locking, and then, if the
field appears to be uninitialized, a
second time with locking. Only if the
second check indicates that the field
is uninitialized does the call
initialize the field. Because there is
no locking if the field is already
initialized, it is critical that the
field be declared volatile (Item
66). Here is the idiom:
// Double-check idiom for lazy initialization of instance fields
private volatile FieldType field;

private FieldType getField() {
    FieldType result = field;
    if (result != null) // First check (no locking)
        return result;
    synchronized (this) {
        if (field == null) // Second check (with locking)
            field = computeFieldValue();
        return field;
    }
}
This code may appear a bit convoluted.
In particular, the need for the local
variable result may be unclear. What
this variable does is to ensure that
field is read only once in the common
case where it’s already initialized.
While not strictly necessary, this may
improve performance and is more
elegant by the standards applied to
low-level concurrent programming. On
my machine, the method above is about
25 percent faster than the obvious
version without a local variable.
Prior to release 1.5, the double-check
idiom did not work reliably because
the semantics of the volatile modifier
were not strong enough to support it
[Pugh01]. The memory model introduced
in release 1.5 fixed this problem
[JLS, 17, Goetz06 16]. Today, the
double-check idiom is the technique of
choice for lazily initializing an
instance field. While you can apply
the double-check idiom to static
fields as well, there is no reason to
do so: the lazy initialization holder
class idiom is a better choice.
Reference
Effective Java, Second Edition
Item 71: Use lazy initialization judiciously
Here is a pattern for correct double-checked locking.
class Foo {
    private volatile HeavyWeight lazy;

    HeavyWeight getLazy() {
        HeavyWeight tmp = lazy; /* Minimize slow accesses to `volatile` member. */
        if (tmp == null) {
            synchronized (this) {
                tmp = lazy;
                if (tmp == null)
                    lazy = tmp = createHeavyWeightObject();
            }
        }
        return tmp;
    }
}
For a singleton, there is a much more readable idiom for lazy initialization.
class Singleton {
    private static class Ref {
        static final Singleton instance = new Singleton();
    }

    public static Singleton get() {
        return Ref.instance;
    }
}
DCL using ThreadLocal By Brian Goetz # JavaWorld
what's broken about DCL?
DCL relies on an unsynchronized use of the resource field. That appears to be harmless, but it is not. To see why, imagine that thread A is inside the synchronized block, executing the statement resource = new Resource(); while thread B is just entering getResource(). Consider the effect on memory of this initialization. Memory for the new Resource object will be allocated; the constructor for Resource will be called, initializing the member fields of the new object; and the field resource of SomeClass will be assigned a reference to the newly created object.
class SomeClass {
    private Resource resource = null;

    public Resource getResource() {
        if (resource == null) {
            synchronized (this) {
                if (resource == null)
                    resource = new Resource();
            }
        }
        return resource;
    }
}
However, since thread B is not executing inside a synchronized block, it may see these memory operations in a different order than the one thread A executes. It could be the case that B sees these events in the following order (and the compiler is also free to reorder the instructions like this): allocate memory, assign reference to resource, call constructor. Suppose thread B comes along after the memory has been allocated and the resource field is set, but before the constructor is called. It sees that resource is not null, skips the synchronized block, and returns a reference to a partially constructed Resource! Needless to say, the result is neither expected nor desired.
Can ThreadLocal help fix DCL?
We can use ThreadLocal to achieve the DCL idiom's explicit goal -- lazy initialization without synchronization on the common code path. Consider this (thread-safe) version of DCL:
Listing 2. DCL using ThreadLocal
class ThreadLocalDCL {
    private static ThreadLocal<Boolean> initHolder = new ThreadLocal<Boolean>();
    private static Resource resource = null;

    public Resource getResource() {
        if (initHolder.get() == null) {
            synchronized (ThreadLocalDCL.class) {
                if (resource == null)
                    resource = new Resource();
                initHolder.set(Boolean.TRUE);
            }
        }
        return resource;
    }
}
I think that here each thread enters the synchronized block at most once, to update its thread-local value; after that, it never does. So the ThreadLocal DCL ensures that each thread enters the synchronized block only once.
What does synchronized really mean?
Java treats each thread as if it runs on its own processor with its own local memory, each talking to and synchronizing with a shared main memory. Even on a single-processor system, that model makes sense because of the effects of memory caches and the use of processor registers to store variables. When a thread modifies a location in its local memory, that modification should eventually show up in the main memory as well, and the JMM defines the rules for when the JVM must transfer data between local and main memory. The Java architects realized that an overly restrictive memory model would seriously undermine program performance. They attempted to craft a memory model that would allow programs to perform well on modern computer hardware while still providing guarantees that would allow threads to interact in predictable ways.
Java's primary tool for rendering interactions between threads predictably is the synchronized keyword. Many programmers think of synchronized strictly in terms of enforcing a mutual exclusion semaphore (mutex) to prevent execution of critical sections by more than one thread at a time. Unfortunately, that intuition does not fully describe what synchronized means.
The semantics of synchronized do indeed include mutual exclusion of execution based on the status of a semaphore, but they also include rules about the synchronizing thread's interaction with main memory. In particular, the acquisition or release of a lock triggers a memory barrier -- a forced synchronization between the thread's local memory and main memory. (Some processors -- like the Alpha -- have explicit machine instructions for performing memory barriers.) When a thread exits a synchronized block, it performs a write barrier -- it must flush out any variables modified in that block to main memory before releasing the lock. Similarly, when entering a synchronized block, it performs a read barrier -- it is as if the local memory has been invalidated, and it must fetch any variables that will be referenced in the block from main memory.
The only way to do double-checked locking correctly in Java is to use "volatile" declarations on the variable in question. While that solution is correct, note that "volatile" means cache lines get flushed at every access. Since "synchronized" flushes them at the end of the block, it may not actually be any more efficient (or even less efficient). I'd recommend just not using double-checked locking unless you've profiled your code and found there to be a performance problem in this area.
Define the variable that should be double-checked with the volatile modifier.
You don't need the h variable.
Here is an example from here
class Foo {
    private volatile Helper helper = null;

    public Helper getHelper() {
        if (helper == null) {
            synchronized (this) {
                if (helper == null)
                    helper = new Helper();
            }
        }
        return helper;
    }
}
What do you mean? From whom are you getting the declaration?
Double-checked locking is fixed; check Wikipedia:
public class FinalWrapper<T> {
    public final T value;

    public FinalWrapper(T value) { this.value = value; }
}

public class Foo {
    private FinalWrapper<Helper> helperWrapper = null;

    public Helper getHelper() {
        FinalWrapper<Helper> wrapper = helperWrapper;
        if (wrapper == null) {
            synchronized (this) {
                if (helperWrapper == null)
                    helperWrapper = new FinalWrapper<Helper>(new Helper());
                wrapper = helperWrapper;
            }
        }
        return wrapper.value;
    }
}
As a few have noted, you definitely need the volatile keyword to make it work correctly, unless all members in the object are declared final; otherwise there is no happens-before or safe publication, and you could see the default values.
We got sick of the constant problems with people getting this wrong, so we coded a LazyReference utility that has final semantics and has been profiled and tuned to be as fast as possible.
Copied below from elsewhere; it explains why using a method-local variable as a copy of the volatile variable speeds things up.
Statement that needs explanation:
This code may appear a bit convoluted. In particular, the need for the
local variable result may be unclear.
Explanation:
The field would be read the first time in the if statement and a second time in the return statement. The field is declared volatile, which means it has to be refetched from memory every time it is accessed (roughly speaking; even more processing might be required to access volatile variables) and cannot be stored in a register by the compiler. When it is copied to the local variable, which is then used in both statements (if and return), the JVM can apply the register optimization.