How to solve the "Double-Checked Locking is Broken" Declaration in Java?

I want to implement lazy initialization for multithreading in Java.
I have some code of the sort:
class Foo {
    private Helper helper = null;

    public Helper getHelper() {
        if (helper == null) {
            Helper h;
            synchronized (this) {
                h = helper;
                if (h == null) {
                    synchronized (this) {
                        h = new Helper();
                    } // release inner synchronization lock
                }
                helper = h;
            }
        }
        return helper;
    }
    // other functions and members...
}
And I'm getting the "Double-Checked Locking is Broken" declaration.
How can I solve this?

Here is the idiom recommended in Item 71 ("Use lazy initialization judiciously") of Effective Java:
If you need to use lazy initialization for performance on an instance field, use the double-check idiom. This idiom avoids the cost of locking when accessing the field after it has been initialized (Item 67). The idea behind the idiom is to check the value of the field twice (hence the name double-check): once without locking, and then, if the field appears to be uninitialized, a second time with locking. Only if the second check indicates that the field is uninitialized does the call initialize the field. Because there is no locking if the field is already initialized, it is critical that the field be declared volatile (Item 66). Here is the idiom:
// Double-check idiom for lazy initialization of instance fields
private volatile FieldType field;

private FieldType getField() {
    FieldType result = field;
    if (result != null) // First check (no locking)
        return result;
    synchronized (this) {
        if (field == null) // Second check (with locking)
            field = computeFieldValue();
        return field;
    }
}
This code may appear a bit convoluted. In particular, the need for the local variable result may be unclear. What this variable does is to ensure that field is read only once in the common case where it's already initialized. While not strictly necessary, this may improve performance and is more elegant by the standards applied to low-level concurrent programming. On my machine, the method above is about 25 percent faster than the obvious version without a local variable.

Prior to release 1.5, the double-check idiom did not work reliably because the semantics of the volatile modifier were not strong enough to support it [Pugh01]. The memory model introduced in release 1.5 fixed this problem [JLS, 17; Goetz06, 16]. Today, the double-check idiom is the technique of choice for lazily initializing an instance field. While you can apply the double-check idiom to static fields as well, there is no reason to do so: the lazy initialization holder class idiom is a better choice.
Reference: Effective Java, Second Edition, Item 71: Use lazy initialization judiciously

Here is a pattern for correct double-checked locking.
class Foo {
    private volatile HeavyWeight lazy;

    HeavyWeight getLazy() {
        HeavyWeight tmp = lazy; /* Minimize slow accesses to `volatile` member. */
        if (tmp == null) {
            synchronized (this) {
                tmp = lazy;
                if (tmp == null)
                    lazy = tmp = createHeavyWeightObject();
            }
        }
        return tmp;
    }
}
For a singleton, there is a much more readable idiom for lazy initialization.
class Singleton {
    private static class Ref {
        static final Singleton instance = new Singleton();
    }

    public static Singleton get() {
        return Ref.instance;
    }
}
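To see why the holder idiom is lazy, note that the JVM defers initializing a nested class until its first use, so the instance is created only when get() first touches Ref.instance. A minimal sketch of that behavior (the class and method names below are illustrative, not from the code above):

```java
class HolderDemo {
    private static class Holder {
        // Runs only when Holder is first referenced, i.e. on the first get() call.
        static final String VALUE = init();

        private static String init() {
            System.out.println("Holder initialized");
            return "instance";
        }
    }

    static String get() {
        return Holder.VALUE; // triggers Holder's class initialization exactly once
    }
}
```

Class initialization is guaranteed by the JLS to happen exactly once and to be visible to all threads, which is why no explicit locking or volatile is needed here.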

DCL using ThreadLocal, by Brian Goetz (JavaWorld)
What's broken about DCL?
DCL relies on an unsynchronized use of the resource field. That appears to be harmless, but it is not. To see why, imagine that thread A is inside the synchronized block, executing the statement resource = new Resource(); while thread B is just entering getResource(). Consider the effect on memory of this initialization. Memory for the new Resource object will be allocated; the constructor for Resource will be called, initializing the member fields of the new object; and the field resource of SomeClass will be assigned a reference to the newly created object.
class SomeClass {
    private Resource resource = null;

    public Resource getResource() {
        if (resource == null) {
            synchronized (this) {
                if (resource == null)
                    resource = new Resource();
            }
        }
        return resource;
    }
}
However, since thread B is not executing inside a synchronized block, it may see these memory operations in a different order than the one thread A executes. It could be the case that B sees these events in the following order (and the compiler is also free to reorder the instructions like this): allocate memory, assign reference to resource, call constructor. Suppose thread B comes along after the memory has been allocated and the resource field is set, but before the constructor is called. It sees that resource is not null, skips the synchronized block, and returns a reference to a partially constructed Resource! Needless to say, the result is neither expected nor desired.
Can ThreadLocal help fix DCL?
We can use ThreadLocal to achieve the DCL idiom's explicit goal -- lazy initialization without synchronization on the common code path. Consider this (thread-safe) version of DCL:
Listing 2. DCL using ThreadLocal
class ThreadLocalDCL {
    private static ThreadLocal<Boolean> initHolder = new ThreadLocal<Boolean>();
    private static Resource resource = null;

    public Resource getResource() {
        if (initHolder.get() == null) {
            synchronized (ThreadLocalDCL.class) {
                if (resource == null)
                    resource = new Resource();
                initHolder.set(Boolean.TRUE);
            }
        }
        return resource;
    }
}
I think the point here is that each thread enters the synchronized block at most once, to update its thread-local value; after that it never does. So ThreadLocal DCL ensures each thread enters the synchronized block no more than once.
What does synchronized really mean?
Java treats each thread as if it runs on its own processor with its own local memory, each talking to and synchronizing with a shared main memory. Even on a single-processor system, that model makes sense because of the effects of memory caches and the use of processor registers to store variables. When a thread modifies a location in its local memory, that modification should eventually show up in the main memory as well, and the JMM defines the rules for when the JVM must transfer data between local and main memory. The Java architects realized that an overly restrictive memory model would seriously undermine program performance. They attempted to craft a memory model that would allow programs to perform well on modern computer hardware while still providing guarantees that would allow threads to interact in predictable ways.
Java's primary tool for rendering interactions between threads predictable is the synchronized keyword. Many programmers think of synchronized strictly in terms of enforcing a mutual exclusion semaphore (mutex) to prevent execution of critical sections by more than one thread at a time. Unfortunately, that intuition does not fully describe what synchronized means.
The semantics of synchronized do indeed include mutual exclusion of execution based on the status of a semaphore, but they also include rules about the synchronizing thread's interaction with main memory. In particular, the acquisition or release of a lock triggers a memory barrier -- a forced synchronization between the thread's local memory and main memory. (Some processors -- like the Alpha -- have explicit machine instructions for performing memory barriers.) When a thread exits a synchronized block, it performs a write barrier -- it must flush out any variables modified in that block to main memory before releasing the lock. Similarly, when entering a synchronized block, it performs a read barrier -- it is as if the local memory has been invalidated, and it must fetch any variables that will be referenced in the block from main memory.
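A small sketch of those barrier rules, under the assumption that both threads use the same lock: a plain (non-volatile) field written inside one synchronized block is guaranteed visible to a thread that later reads it inside another synchronized block on the same object. The class below is illustrative, not from the article:

```java
class SharedBox {
    private String value;              // deliberately not volatile
    private final Object lock = new Object();

    void set(String v) {
        synchronized (lock) {          // exiting performs the write barrier
            value = v;
        }
    }

    String get() {
        synchronized (lock) {          // entering performs the read barrier
            return value;
        }
    }
}
```

If get() were called without synchronization, the visibility guarantee would be lost, which is exactly the hole DCL falls into.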

The only way to do double-checked locking correctly in Java is to use "volatile" declarations on the variable in question. While that solution is correct, note that "volatile" means cache lines get flushed at every access. Since "synchronized" flushes them at the end of the block, it may not actually be any more efficient (or even less efficient). I'd recommend just not using double-checked locking unless you've profiled your code and found there to be a performance problem in this area.

Declare the variable that is double-checked with the volatile modifier.
You don't need the h variable.
Here is an example:
class Foo {
    private volatile Helper helper = null;

    public Helper getHelper() {
        if (helper == null) {
            synchronized (this) {
                if (helper == null)
                    helper = new Helper();
            }
        }
        return helper;
    }
}

What do you mean? From whom are you getting the declaration?
Double-checked locking is fixed. Check Wikipedia:
public class FinalWrapper<T> {
    public final T value;

    public FinalWrapper(T value) {
        this.value = value;
    }
}

public class Foo {
    private FinalWrapper<Helper> helperWrapper = null;

    public Helper getHelper() {
        FinalWrapper<Helper> wrapper = helperWrapper;
        if (wrapper == null) {
            synchronized (this) {
                if (helperWrapper == null)
                    helperWrapper = new FinalWrapper<Helper>(new Helper());
                wrapper = helperWrapper;
            }
        }
        return wrapper.value;
    }
}

As a few have noted, you definitely need the volatile keyword to make it work correctly unless all members in the object are declared final; otherwise there is no happens-before or safe publication, and you could see the default values.
We got sick of the constant problems with people getting this wrong, so we coded a LazyReference utility that has final semantics and has been profiled and tuned to be as fast as possible.

Copying below from another source, which explains why using a method-local variable as a copy of the volatile variable will speed things up.
Statement that needs explanation:
This code may appear a bit convoluted. In particular, the need for the local variable result may be unclear.
Explanation:
The field would be read a first time in the first if statement and a second time in the return statement. The field is declared volatile, which means it has to be refetched from memory every time it is accessed (roughly speaking; even more processing might be required to access volatile variables) and cannot be stored in a register by the compiler. When the value is copied to the local variable and then used in both statements (if and return), the JVM can perform the register optimization.


Double checking way of creating singleton issue

How do reading (outside the critical section) and writing (inside the critical section) avoid atomicity issues?
I have read about this and discussed it with various people, but most don't answer whether both operations are atomic and how atomicity is actually achieved for the above problem.
class ABC {
    private static volatile ABC abcInstance;

    static ABC getInstance() {
        if (abcInstance == null) {
            synchronized (ABC.class) {
                if (abcInstance == null) {
                    abcInstance = new ABC();
                    return abcInstance;
                }
            }
        }
        return abcInstance;
    }
}
Are the if (abcInstance == null) outside the synchronized block and abcInstance = new ABC(); atomic? If not, then this way of creating singletons is wrong.
In C++, abcInstance = new ABC(); consists, broadly speaking, of three steps:
1. Allocate memory for ABC.
2. Call the ABC constructor to create the object.
3. Assign the address to abcInstance.
For optimization, the compiler can reorder these steps. Suppose it follows 1->3->2, and after step 3 an interrupt happens: the next thread calling getInstance() will read that abcInstance has some value, but it will point to memory that does not yet hold a constructed ABC object.
Please correct me if I am wrong for both C++ and Java.
This answers the Java part of your question only.
Are if (abcInstance == null) and abcInstance = new ABC(); atomic? If not, then this way of creating a singleton is wrong.
It is not atomicity that is the (potential) problem. (Reference assignment is atomic from the perspective of both the thread doing the assignment, and the thread reading the assigned variable.)
The problem is when the value written to abcInstance becomes visible to another thread.
Prior to Java 5, the memory model did not provide sufficient guarantees about memory visibility for that implementation to work reliably.
In the Java 5 (and later) memory model, there is a happens before relation between one thread's write to a volatile variable and another thread's subsequent read of the variable. This means:
The second thread is guaranteed to see the non-null value of abcInstance if the first thread has written it.
The happens before relation also guarantees that the second thread will see the fully initialized state of the ABC instance created by the first thread.
The synchronized block ensures that only one ABC instance may be created at a time.
This is the authoritative article explaining why old double-checked locking implementations were broken:
The "Double-Checked Locking is Broken" Declaration
As Andrew Turner states, there is a simpler, cleaner way to implement singleton classes in Java: use an enum.
Implementing Singleton with an Enum (in Java)
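The enum variant mentioned above can be sketched as follows (the class name and the createdAt accessor are illustrative, not from the answer):

```java
enum EnumSingleton {
    INSTANCE;

    // Example state, initialized exactly once when the enum class is loaded.
    private final long createdAt = System.nanoTime();

    public long createdAt() {
        return createdAt;
    }
}
```

The JVM guarantees a single instance per enum constant, and enums additionally resist reflection and serialization attacks, which plain singleton implementations do not.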
Here are two typical singleton variants in C++.
First one shared by all threads:
class singleton {
private:
    singleton() {}

public:
    singleton(const singleton&) = delete;

    static singleton& get_instance() {
        static singleton ins;
        return ins;
    }
};
And here's one that will create one instance per thread that needs it:
class tl_singleton {
private:
    tl_singleton() {}

public:
    tl_singleton(const tl_singleton&) = delete;

    static tl_singleton& get_instance() {
        static thread_local tl_singleton ins;
        return ins;
    }
};

Need for the volatile keyword in DCL

I was just reading Java Concurrency in Practice. I learned that it is necessary to use the volatile keyword on the field in the double-checked locking mechanism; otherwise a thread can read a stale, non-null reference to an object that is not fully constructed. This is because instruction reordering is possible without the volatile keyword: the object reference could be assigned to the resource variable before the constructor runs, so a thread could see a partially constructed object.
I have a question regarding that.
I assume a synchronized block also restricts the compiler from reordering instructions, so why do we need the volatile keyword here?
public class DoubleCheckedLocking {
    private static volatile Resource resource;

    public static Resource getInstance() {
        if (resource == null) {
            synchronized (DoubleCheckedLocking.class) {
                if (resource == null)
                    resource = new Resource();
            }
        }
        return resource;
    }
}
The JMM only guarantees that a thread T1 will see a properly initialized object created by another thread T2 inside a synchronized block if the calling thread (T1) also reads it from a synchronized block (on the same lock).
Since T1 could see the resource as not null, and thus return it immediately without going through the synchronized block, it could get an object but not see its state properly initialized.
Using volatile brings back that guarantee, because there is a happens-before relationship between the write of a volatile field and the read of that volatile field.
Volatile is necessary in this case, as others have observed, because a data race is possible when first accessing the resource. There is no guarantee, absent volatile, that thread A, reading a non-null value, will actually access the fully initialized resource -- if it is, at the same time, being built in thread B within the synchronized section, which thread A has not yet reached. Thread A could then try to work with a half-initialized copy.
Double-checked locking with volatile, while working since JSR-133 (2004), is still not recommended, as it is not very readable and not as efficient as the recommended alternative:
private static class LazyResourceHolder {
    public static Resource resource = new Resource();
}

...

public static Resource getInstance() {
    return LazyResourceHolder.resource;
}
This is the Initialize-On-Demand Holder Class idiom, and according to the above page,
[...] derives its thread safety from the fact that operations that are part of class initialization, such as static initializers, are guaranteed to be visible to all threads that use that class, and its lazy initialization from the fact that the inner class is not loaded until some thread references one of its fields or methods.
Actually there is no need to use volatile here. Using volatile means that every time the instance variable is used in a thread's method, the read will not be optimized away; it will be performed again and again. The only time I've deliberately used volatile is in threads where I have a stop indicator (private volatile boolean stop = false;).
Creating singletons like in your sample code is needlessly complex and doesn't offer any actual speed improvement. The JIT compiler is very good at optimizing thread locking.
You'll be better off creating singletons using:
public static synchronized Resource getInstance() {
    if (resource == null) {
        resource = new Resource();
    }
    return resource;
}
This is much easier for human beings to read and reason about.
See also Do you ever use the volatile keyword in Java?, where volatile is indeed generally used for some end-of-loop flag in threads.
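For reference, the end-of-loop flag mentioned above typically looks like the sketch below (class and method names are illustrative); without volatile, the JIT could hoist the read of stop out of the loop and spin forever:

```java
class Worker implements Runnable {
    private volatile boolean stop = false;

    public void requestStop() {
        stop = true; // this write becomes visible to the running thread
    }

    @Override
    public void run() {
        while (!stop) {
            // do one unit of work per iteration
        }
    }
}
```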

Synchronizing singleton in Java

I have just encountered code where synchronization was done on the wrong class:
public class Test {
    public static volatile Test instance = null;

    public static void setIfNull(Test newInstance) {
        synchronized (WRONG.class) { // should be synchronized (Test.class)
            if (newInstance == null)
                throw new IllegalArgumentException("newInstance must not be null.");
            if (instance == null) instance = newInstance;
        }
    }
}
The above error would not happen if the whole method were synchronized:
public class Test {
    public static volatile Test instance = null;

    public static synchronized void setIfNull(Test newInstance) {
        if (newInstance == null)
            throw new IllegalArgumentException("newInstance must not be null.");
        if (instance == null) instance = newInstance;
    }
}
The way I see it, the second piece of code is more error-proof than the first one.
Are there any pitfalls of using method synchronization over a synchronized block concerning the above code pattern?
Warning: In the above code the instance field is not properly encapsulated. Since it is a public member, nothing prevents external code not only from reading it but also from writing to it in a thread-unsafe manner. This code should not be used as an example of a proper thread-safe singleton, because that is not what it is.
Are there any pitfalls of using method synchronization over a synchronized block concerning the above code pattern?
Since this:
public static synchronized void setIfNull(Test newInstance) {
    ...
}
...is exactly the same (JLS 8.4.3.6) as this:
public static void setIfNull(Test newInstance) {
    synchronized (Test.class) {
        ...
    }
}
...what you are really asking is: "What is the difference between synchronizing on some other class object, WRONG.class, and on Test.class?"
The only thing to look out for is whether something else in your code decides to synchronize on Test.class.
1) One significant difference between a synchronized method and a synchronized block is that a synchronized block generally reduces the scope of the lock. Since the scope of a lock is inversely proportional to performance, it is always better to lock only the critical section of code. One of the best examples of using a synchronized block is double-checked locking in the Singleton pattern, where instead of locking the whole getInstance() method we lock only the critical section that creates the Singleton instance. This improves performance drastically because locking is only required once or twice.
2) A synchronized block provides granular control over the lock: you can use any arbitrary lock to provide mutual exclusion for the critical section. A synchronized method, on the other hand, always locks either on the current object (this) or, for a static synchronized method, on the class-level lock.
3) A synchronized block can throw java.lang.NullPointerException if the expression provided to the block evaluates to null, which is not the case with synchronized methods.
4) With a synchronized method, the lock is acquired when the thread enters the method and released when it leaves, either normally or by throwing an exception. With a synchronized block, the thread acquires the lock when it enters the block and releases it when it leaves the block.
Read more: http://java67.blogspot.com/2013/01/difference-between-synchronized-block-vs-method-java-example.html
I can't remember any pitfalls when you synchronize whole methods. Of course, those are more "expensive" than locks around just certain areas.
If you're not sure, I would always go for the synchronized method first, until you encounter it as a bottleneck.
To avoid locking on the wrong object, simply create an instance variable:
private final Object block = new Object();
and use it when you need synchronization. Keep in mind, however, that other methods called by different threads do not automatically respect this lock, so you can get side effects. You have to be careful when going this way.
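A minimal sketch of that private-lock-object suggestion (the Counter class and its methods are illustrative): because the lock field is private and final, no external code can synchronize on it, so the class fully controls its own locking.

```java
class Counter {
    private final Object block = new Object(); // private lock object
    private int count;

    void increment() {
        synchronized (block) {
            count++;
        }
    }

    int get() {
        synchronized (block) {
            return count;
        }
    }
}
```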
I have read quite a few books on these topics, so a really concrete answer is hard to mark as the correct one.
I recommend that you read "Java Concurrency in Practice" by Brian Goetz.
Also "Java Core" by Angelika Langer and Klaus Kreft, a book that goes into depth on the volatile keyword (a German book; it is still curious that no one has translated it into English, as it is a masterpiece in its area).
Also, you could use reentrant locks to get fair locking if you like.
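A hedged sketch of that fair-locking suggestion using java.util.concurrent.locks.ReentrantLock (the FairCounter class is illustrative): passing true to the constructor requests a fair ordering policy, under which waiting threads acquire the lock roughly in arrival order, at some throughput cost.

```java
import java.util.concurrent.locks.ReentrantLock;

class FairCounter {
    private final ReentrantLock lock = new ReentrantLock(true); // fair mode

    private int count;

    void increment() {
        lock.lock();
        try {
            count++;
        } finally {
            lock.unlock(); // always release in finally
        }
    }

    int get() {
        lock.lock();
        try {
            return count;
        } finally {
            lock.unlock();
        }
    }
}
```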

why using volatile with synchronized block?

I saw some examples in Java where a block of code that changes some variable is synchronized, while that variable was also declared volatile. I saw this in a singleton class example where the unique instance was declared volatile and the block that initializes it was synchronized. My question is: why do we declare it volatile when we also synchronize on it? Why do we need both? Isn't one sufficient without the other?
public class SomeClass {
    volatile static Object uniqueInstance = null;

    public static Object getInstance() {
        if (uniqueInstance == null) {
            synchronized (SomeClass.class) {
                if (uniqueInstance == null) {
                    uniqueInstance = new SomeClass();
                }
            }
        }
        return uniqueInstance;
    }
}
thanks in advance.
Synchronization by itself would be enough in this case if the first check was within synchronized block (but it's not and one thread might not see changes performed by another if the variable were not volatile). Volatile alone would not be enough because you need to perform more than one operation atomically. But beware! What you have here is so-called double-checked locking - a common idiom, which unfortunately does not work reliably. I think this has changed since Java 1.6, but still this kind of code may be risky.
EDIT: when the variable is volatile, this code works correctly since JDK 5 (not 6 as I wrote earlier), but it will not work as expected under JDK 1.4 or earlier.
This uses double-checked locking; note that the first if (uniqueInstance == null) is not within the synchronized part.
If uniqueInstance is not volatile, it might be "initialized" with a partially constructed object, where parts of it are not visible to any thread other than the one executing in the synchronized block. volatile makes this an all-or-nothing operation in this case.
If you didn't have the synchronized block, you could end up with 2 threads getting to this point at the same time.
if (uniqueInstance == null) {
    uniqueInstance = new SomeClass(); // <---- here
And you construct 2 SomeClass objects, which defeats the purpose.
Strictly speaking, you don't need volatile; the method could have been:
public static SomeClass getInstance() {
    synchronized (SomeClass.class) {
        if (uniqueInstance == null) {
            uniqueInstance = new SomeClass();
        }
        return uniqueInstance;
    }
}
But that incurs the synchronization and serialization of every thread that performs getInstance().
This post explains the idea behind volatile.
It is also addressed in the seminal work, Java Concurrency in Practice.
The main idea is that concurrency not only involves protection of shared state but also the visibility of that state between threads: this is where volatile comes in. (This larger contract is defined by the Java Memory Model.)
You can do synchronization without using a synchronized block, and it is not strictly necessary to use a volatile variable together with it... volatile synchronizes the one variable with main memory, while synchronized synchronizes all shared variables that have been accessed with main memory. So you can use either according to your requirements.
My two cents here.
First, a quick explanation of the intuition behind this code:
if (uniqueInstance == null) {
    synchronized (SomeClass.class) {
        if (uniqueInstance == null) {
            uniqueInstance = new SomeClass();
        }
    }
}
The reason it checks uniqueInstance == null twice is to reduce the overhead of entering the synchronized block, which is relatively slow; hence the name double-checked locking.
Second, the reason it uses synchronized is easy to understand: it makes the two operations inside the synchronized block atomic.
Last, the volatile modifier makes sure all threads see the same copy, so the very first check outside of the synchronized block sees the value of uniqueInstance in a way that is "synchronized" with the synchronized block. Without the volatile modifier, one thread could assign a value to uniqueInstance and another thread might not see it in the first check (although the second check would see it).

Inner synchronization on the same object as the outer synchronization

Recently I attended a lecture concerning some design patterns:
The following code had been displayed:
public static Singleton getInstance() {
    if (instance == null) {
        synchronized (Singleton.class) {     //1
            Singleton inst = instance;       //2
            if (inst == null) {
                synchronized (Singleton.class) { //3
                    inst = new Singleton();      //4
                }
                instance = inst;             //5
            }
        }
    }
    return instance;
}
taken from: Double-checked locking: Take two
My question has nothing to do with the above mentioned pattern but with the synchronized blocks:
Is there any benefit whatsoever to the double synchronization done in lines 1 & 3, given that the synchronization is done on the same object?
In the old Java Memory Model (JMM), exiting a synchronized block allegedly flushed local data out to main memory, and entering a synchronized block caused cached data to be reread. (Here, cache includes registers, along with the associated compiler optimisations.) The old JMM was broken and not implemented correctly.
In the new JMM, the redundant inner synchronization doesn't do anything. The new JMM is specified for 1.5 and implemented in the "Sun" 1.4 JRE. 1.5 completed its End of Service Life period some time ago, so you shouldn't need to worry about the old JMM (well, perhaps Java ME will do something unpredictable).
I am no memory model expert, but I think one must consider that synchronized doesn't only signal the need to obtain a lock; it also imposes rules about possible code optimization and about flushing and refreshing of caches.
You'll find the details in the Java Memory Model
Synchronizing twice on the same object does enforce that all of the changes made inside the inner block are flushed to shared memory when the inner synchronized block is exited. But it is important to note that there is no rule saying that changes made after the inner synchronized block can't be moved before the inner synchronization is exited.
For example
public void doSomething() {
    synchronized (this) { // "this" locked
        methodCall1();
        synchronized (this) {
            methodCall2();
        } // memory flushed
        methodCall3();
    } // "this" unlocked and memory flushed
}
Can be compiled to execute in this order
public void doSomething() {
    synchronized (this) { // "this" locked
        methodCall1();
        synchronized (this) {
            methodCall2();
            methodCall3();
        } // memory flushed
    } // "this" unlocked and memory flushed
}
For a more detailed explanation, check out "Double-Checked Locking", in the "A fix that doesn't work" section about a third of the way down.
