After reading dozens of articles about DCL (double-checked locking), I felt that I should never use the idiom without volatile: if I don't follow that rule, my code will not be thread-safe and will be very bad for a hundred different reasons.
Recently I reread the basics of happens-before and now I have a somewhat different view.
Let's examine this singleton listing:
public class Singleton{
private static Something instance = null;
public static Something getInstance() {
if (instance == null) { // point 1
synchronized (Singleton.class) {
if (instance == null)
instance = new Something();
}
}
return instance; //point 2
}
}
We change instance only inside synchronized (Singleton.class) {
so we will see the actual value inside any synchronized section that uses the same monitor, which is our case.
So now I suspect that this code is inefficient but still thread-safe.
Am I right?
Only one concern:
Can if (instance == null) { see a non-null value before the assignment instance = new Something(); has actually completed?
But I am still absolutely sure that this code cannot create two singleton instances.
P.S.
I read a bit more, and it looks like if we read a non-null value at point 1, return instance at point 2 could still return null.
The problem in your example is not that two instances might be created; it is true that only one instance will be created. The real problem is that when multiple threads call this method, another thread can start using a partially constructed instance (1)(2).
So the instance variable should definitely be declared volatile (which is missing in your code block); otherwise you have to worry about the "freshness" of this variable's value.
Because there is no locking if the field is already initialized, it is
critical that the field be declared volatile (Item 66) [J. Bloch, "Effective Java", Item 71]
So:
private static volatile Something instance;
(BTW, explicitly assigning null is redundant.)
Why it doesn't work without "volatile" is well explained here:
The most obvious reason it doesn't work is that the writes that
initialize the Helper object and the write to the helper field can be
done or perceived out of order
Why do reading (outside the critical section) and writing (inside the critical section) not have atomicity issues?
I have read about this and discussed it with various people, but most of them don't say whether both operations are atomic or how atomicity is actually achieved for the problem above.
class ABC {
private static volatile ABC abcInstance;
static ABC getInstance(){
if(abcInstance == null){
synchronized(ABC.class){
if(abcInstance == null){
abcInstance = new ABC();
return abcInstance;
}
}
}
return abcInstance;
}
}
Are if(abcInstance == null) outside the synchronized block and abcInstance = new ABC(); atomic? If not, then this way of creating singletons is wrong.
In C++, abcInstance = new ABC(); consists of three instructions broadly speaking:
Create ABC object.
Allocate memory for ABC.
Assign it to abcInstance.
For optimisation, the compiler can reorder these three instructions in any way. Suppose it follows 2->3->1 and an interrupt happens after instruction 3. The next thread calling getInstance() will then read that abcInstance has some value, but that value will point to memory that does not yet hold a constructed ABC object.
Please correct me if I am wrong, for both C++ and Java.
This answers the Java part of your question only.
Are if(abcInstance == null) and abcInstance = new ABC(); atomic? If not, then this way of creating singletons is wrong.
It is not atomicity that is the (potential) problem. (Reference assignment is atomic from the perspective of both the thread doing the assignment, and the thread reading the assigned variable.)
The problem is when the value written to abcInstance becomes visible to another thread.
Prior to Java 5, the memory model did not provide sufficient guarantees about memory visibility for that implementation to work reliably.
In the Java 5 (and later) memory model, there is a happens-before relation between one thread's write to a volatile variable and another thread's subsequent read of that variable. This means:
The second thread is guaranteed to see the non-null value of abcInstance if the first thread has written it.
The happens-before relation also guarantees that the second thread will see the fully initialized state of the ABC instance created by the first thread.
The synchronized block ensures that only one ABC instance may be created at a time.
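To make the happens-before guarantee above concrete, here is a minimal sketch. The class VolatilePublication and its methods are hypothetical names invented for this illustration and do not come from the question. It shows that once a reader observes the volatile write, it also observes the plain write that preceded it.

```java
// Hypothetical demo class; not part of the original question or answer.
final class VolatilePublication {
    private int payload;            // plain (non-volatile) field
    private volatile boolean ready; // volatile flag guarding the payload

    // Writer: the plain write to payload precedes the volatile write in program order.
    void publish(int value) {
        payload = value;
        ready = true; // volatile write
    }

    // Reader: once the volatile read observes true, the earlier write to payload is visible too.
    int awaitPayload() {
        while (!ready) {    // volatile read
            Thread.yield(); // give other threads a chance while spinning
        }
        return payload;
    }

    // Starts one writer thread and reads the published value on the calling thread.
    static int demo() {
        VolatilePublication box = new VolatilePublication();
        Thread writer = new Thread(() -> box.publish(42));
        writer.start();
        int seen = box.awaitPayload();
        try {
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seen;
    }
}
```

If ready were not volatile, the reader would have no guarantee of ever seeing the flag change, or of seeing the payload written before it.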
This is the authoritative article explaining why old double-checked locking implementations were broken:
The "Double-Checked Locking is Broken" Declaration
As Andrew Turner states, there is a simpler, cleaner way to implement singleton classes in Java: use an enum.
Implementing Singleton with an Enum (in Java)
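As a rough sketch of the enum approach mentioned above: IdGenerator and nextId are hypothetical names made up for this example. The single enum constant is created once during class initialization, so no explicit locking or volatile is needed to publish it.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical singleton; the one enum constant is the only instance that can exist.
enum IdGenerator {
    INSTANCE;

    private final AtomicInteger next = new AtomicInteger();

    // All callers share the same counter because they share the same instance.
    int nextId() {
        return next.incrementAndGet();
    }
}
```

Callers would write IdGenerator.INSTANCE.nextId(). Note that the enum itself is initialized when the class is first used, which is usually lazy enough in practice.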
Here are two typical singleton variants in C++.
First one shared by all threads:
class singleton {
private:
singleton() {}
public:
singleton(const singleton&) = delete;
static singleton& get_instance() {
static singleton ins;
return ins;
}
};
And here's one that will create one instance per thread that needs it:
class tl_singleton {
private:
tl_singleton() {}
public:
tl_singleton(const tl_singleton&) = delete;
static tl_singleton& get_instance() {
static thread_local tl_singleton ins;
return ins;
}
};
A class implemented with the Singleton pattern is shown below. When multiple threads access this method, only one thread should create the instance, so all I am doing is synchronising the method:
private static synchronized FactoryAPI getIOInstance(){
if(factoryAPI == null){
FileUtils.initWrapperProp();
factoryAPI = new FactoryAPIImpl();
}
return factoryAPI;
}
which I feel is unnecessary, because the instance is created only the first time and every later call just returns the already-created instance. Adding synchronized to the method allows only one thread to enter it at a time.
The getIOInstance method does two jobs:
i) initialising properties, and
ii) creating a new instance the first time.
So I'm trying to do block-level synchronisation instead, like the following:
private static FactoryAPI getIOInstance(){
if(factoryAPI == null){
synchronized (FactoryAPI.class) { // lock object assumed; the original listing omitted it
if(factoryAPI == null){
FileUtils.initWrapperProp();
factoryAPI = new FactoryAPIImpl();
}
}
}
return factoryAPI;
}
I would prefer the second one. Is it correct as written? Any suggestions are welcome.
Use the first method because the second one is not thread-safe.
When you say,
factoryAPI = new FactoryAPIImpl();
The compiler is free to execute the code in the following order:
1) Allocate some memory on the heap
2) Initialize factoryAPI to the address of that allocated space
3) Call the constructor of FactoryAPIImpl
The problem is when another thread calls getIOInstance() after step 2 and before step 3. It may see a non-null factoryAPI variable that points to an uninitialized FactoryAPI instance.
There are many different answers to this problem; you can find an extensive discussion at SEI, for example.
The modern Java solution is simple: use an enum - as the JLS guarantees you that the compiler / JVM will create exactly one thing.
I found the initialization-on-demand holder idiom to be an interesting alternative, as in the following:
public class FactoryAPI {
    private FactoryAPI() {}
    private static class LazyHolder {
        static final FactoryAPI INSTANCE = new FactoryAPI();
    }
    public static FactoryAPI getInstance() {
        return LazyHolder.INSTANCE;
    }
}
Since the class initialization phase is guaranteed by the JLS to be serial, i.e., non-concurrent, no further synchronization is required in the static getInstance method during loading and initialization.
And since the initialization phase writes the static variable INSTANCE in a serial operation, all subsequent concurrent invocations of the getInstance will return the same correctly initialized INSTANCE without incurring any additional synchronization overhead.
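The laziness described above can be observed with a small sketch. LazyService and its construction counter are hypothetical and exist only for illustration: the nested holder class is not initialized until getInstance() first touches Holder.INSTANCE.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical class showing that the holder idiom defers construction until first use.
final class LazyService {
    // Counts constructor calls so the laziness can be observed from outside.
    private static final AtomicInteger CONSTRUCTIONS = new AtomicInteger();

    private LazyService() {
        CONSTRUCTIONS.incrementAndGet();
    }

    // Initialized by the JVM only when getInstance() first reads Holder.INSTANCE.
    private static final class Holder {
        static final LazyService INSTANCE = new LazyService();
    }

    static LazyService getInstance() {
        return Holder.INSTANCE;
    }

    // Touches LazyService but not Holder, so it does not trigger construction.
    static int constructionCount() {
        return CONSTRUCTIONS.get();
    }
}
```

Calling constructionCount() before any getInstance() call should report zero, and it should report one afterwards no matter how many threads call getInstance().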
I have seen some Java examples that synchronize a block of code to change a variable that was itself declared volatile. I saw this in a singleton class where the unique instance was declared volatile and the block that initializes that instance was synchronized. My question is: why declare it volatile when we also synchronize on it? Why do we need both? Isn't one of them sufficient?
public class SomeClass {
volatile static Object uniqueInstance = null;
public static Object getInstance() {
if (uniqueInstance == null) {
synchronized (SomeClass.class) {
if (uniqueInstance == null) {
uniqueInstance = new SomeClass();
}
}
}
return uniqueInstance;
}
}
thanks in advance.
Synchronization by itself would be enough in this case if the first check was within synchronized block (but it's not and one thread might not see changes performed by another if the variable were not volatile). Volatile alone would not be enough because you need to perform more than one operation atomically. But beware! What you have here is so-called double-checked locking - a common idiom, which unfortunately does not work reliably. I think this has changed since Java 1.6, but still this kind of code may be risky.
EDIT: when the variable is volatile, this code works correctly since JDK 5 (not 6 as I wrote earlier), but it will not work as expected under JDK 1.4 or earlier.
This uses double-checked locking; note that the if(uniqueInstance == null) is not within the synchronized part.
If uniqueInstance is not volatile, it might be "initialized" with a partially constructed object, parts of which are not visible to threads other than the one executing in the synchronized block. volatile makes this an all-or-nothing operation in this case.
If you didn't have the synchronized block, you could end up with 2 threads getting to this point at the same time.
if(uniqueInstance == null) {
uniqueInstance = new SomeClass(); <---- here
And you construct 2 SomeClass objects, which defeats the purpose.
Strictly speaking, you don't need volatile; the method could have been
public static SomeClass getInstance() {
    synchronized(SomeClass.class) {
        if(uniqueInstance == null) {
            uniqueInstance = new SomeClass();
        }
        return uniqueInstance;
    }
}
But that incurs the synchronization and serialization of every thread that performs getInstance().
This post explains the idea behind volatile.
It is also addressed in the seminal work, Java Concurrency in Practice.
The main idea is that concurrency not only involves protection of shared state but also the visibility of that state between threads: this is where volatile comes in. (This larger contract is defined by the Java Memory Model.)
You can achieve synchronization without using a synchronized block.
It's not necessary to use a volatile variable in it.
volatile updates that one variable from main memory, and
synchronized updates all shared variables that have been accessed from main memory.
So you can use whichever fits your requirement.
My two cents here
First, a quick explanation of the intuition behind this code:
if(uniqueInstance == null) {
synchronized(SomeClass.class) {
if(uniqueInstance == null) {
uniqueInstance = new SomeClass();
}
}
}
The reason it checks uniqueInstance == null twice is to reduce the overhead of entering the synchronized block, which is relatively slow. This is so-called double-checked locking.
Second, the reason it uses synchronized is easy to understand: it makes the two operations inside the synchronized block atomic.
Last, the volatile modifier makes sure all threads see the same copy, so the very first check outside the synchronized block sees the value of uniqueInstance in a way that is "synchronized" with the synchronized block. Without the volatile modifier, one thread can assign a value to uniqueInstance while another thread may not see it at the first check (although the second check will see it).
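Putting the three points together, here is a hedged sketch of the full pattern with a small race harness. CountedSingleton, its counter and the race method are hypothetical additions for illustration. The harness releases several threads at once and reports how many objects were constructed.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical class: volatile field plus synchronized block, as discussed above.
final class CountedSingleton {
    private static final AtomicInteger CONSTRUCTED = new AtomicInteger();
    private static volatile CountedSingleton uniqueInstance;

    private CountedSingleton() {
        CONSTRUCTED.incrementAndGet();
    }

    static CountedSingleton getInstance() {
        if (uniqueInstance == null) {                 // first check, no lock
            synchronized (CountedSingleton.class) {
                if (uniqueInstance == null) {         // second check, under the lock
                    uniqueInstance = new CountedSingleton();
                }
            }
        }
        return uniqueInstance;
    }

    // Releases `threads` workers at the same moment and returns the number of constructions.
    static int race(int threads) {
        CountDownLatch start = new CountDownLatch(1);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                try {
                    start.await();
                    getInstance();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            workers[i].start();
        }
        start.countDown();
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return CONSTRUCTED.get();
    }
}
```

A call such as CountedSingleton.race(8) is expected to report a single construction. Keep in mind that a passing run only demonstrates the mutual-exclusion part; the visibility problems that volatile prevents are rarely reproducible in a simple test like this.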
I want to implement lazy initialization for multithreading in Java.
I have some code of the sort:
class Foo {
private Helper helper = null;
public Helper getHelper() {
if (helper == null) {
Helper h;
synchronized(this) {
h = helper;
if (h == null)
synchronized (this) {
h = new Helper();
} // release inner synchronization lock
helper = h;
}
}
return helper;
}
// other functions and members...
}
And I'm getting the "Double-Checked Locking is Broken" declaration.
How can I solve this?
Here is the idiom recommended in Item 71, "Use lazy initialization judiciously", of Effective Java:
If you need to use lazy initialization for performance on an
instance field, use the double-check
idiom. This idiom avoids the cost
of locking when accessing the field
after it has been initialized (Item
67). The idea behind the idiom is to
check the value of the field twice
(hence the name double-check): once
without locking, and then, if the
field appears to be uninitialized, a
second time with locking. Only if the
second check indicates that the field
is uninitialized does the call
initialize the field. Because there is
no locking if the field is already
initialized, it is critical that the
field be declared volatile (Item
66). Here is the idiom:
// Double-check idiom for lazy initialization of instance fields
private volatile FieldType field;
private FieldType getField() {
FieldType result = field;
if (result != null) // First check (no locking)
return result;
synchronized(this) {
if (field == null) // Second check (with locking)
field = computeFieldValue();
return field;
}
}
This code may appear a bit convoluted.
In particular, the need for the local
variable result may be unclear. What
this variable does is to ensure that
field is read only once in the common
case where it’s already initialized.
While not strictly necessary, this may
improve performance and is more
elegant by the standards applied to
low-level concurrent programming. On
my machine, the method above is about
25 percent faster than the obvious
version without a local variable.
Prior to release 1.5, the double-check
idiom did not work reliably because
the semantics of the volatile modifier
were not strong enough to support it
[Pugh01]. The memory model introduced
in release 1.5 fixed this problem
[JLS, 17, Goetz06 16]. Today, the
double-check idiom is the technique of
choice for lazily initializing an
instance field. While you can apply
the double-check idiom to static
fields as well, there is no reason to
do so: the lazy initialization holder
class idiom is a better choice.
Reference
Effective Java, Second Edition
Item 71: Use lazy initialization judiciously
Here is a pattern for correct double-checked locking.
class Foo {
private volatile HeavyWeight lazy;
HeavyWeight getLazy() {
HeavyWeight tmp = lazy; /* Minimize slow accesses to `volatile` member. */
if (tmp == null) {
synchronized (this) {
tmp = lazy;
if (tmp == null)
lazy = tmp = createHeavyWeightObject();
}
}
return tmp;
}
}
For a singleton, there is a much more readable idiom for lazy initialization.
class Singleton {
private static class Ref {
static final Singleton instance = new Singleton();
}
public static Singleton get() {
return Ref.instance;
}
}
DCL using ThreadLocal, by Brian Goetz @ JavaWorld
What's broken about DCL?
DCL relies on an unsynchronized use of the resource field. That appears to be harmless, but it is not. To see why, imagine that thread A is inside the synchronized block, executing the statement resource = new Resource(); while thread B is just entering getResource(). Consider the effect on memory of this initialization. Memory for the new Resource object will be allocated; the constructor for Resource will be called, initializing the member fields of the new object; and the field resource of SomeClass will be assigned a reference to the newly created object.
class SomeClass {
private Resource resource = null;
public Resource getResource() {
if (resource == null) {
synchronized (this) {
if (resource == null)
resource = new Resource();
}
}
return resource;
}
}
However, since thread B is not executing inside a synchronized block, it may see these memory operations in a different order than the one thread A executes. It could be the case that B sees these events in the following order (and the compiler is also free to reorder the instructions like this): allocate memory, assign reference to resource, call constructor. Suppose thread B comes along after the memory has been allocated and the resource field is set, but before the constructor is called. It sees that resource is not null, skips the synchronized block, and returns a reference to a partially constructed Resource! Needless to say, the result is neither expected nor desired.
Can ThreadLocal help fix DCL?
We can use ThreadLocal to achieve the DCL idiom's explicit goal -- lazy initialization without synchronization on the common code path. Consider this (thread-safe) version of DCL:
Listing 2. DCL using ThreadLocal
class ThreadLocalDCL {
private static ThreadLocal initHolder = new ThreadLocal();
private static Resource resource = null;
public Resource getResource() {
if (initHolder.get() == null) {
synchronized (ThreadLocalDCL.class) {
if (resource == null)
resource = new Resource();
initHolder.set(Boolean.TRUE);
}
}
return resource;
}
}
I think each thread will enter the synchronized block once, to update its ThreadLocal value, and never again after that. So the ThreadLocal DCL ensures that a thread enters the synchronized block only once.
What does synchronized really mean?
Java treats each thread as if it runs on its own processor with its own local memory, each talking to and synchronizing with a shared main memory. Even on a single-processor system, that model makes sense because of the effects of memory caches and the use of processor registers to store variables. When a thread modifies a location in its local memory, that modification should eventually show up in the main memory as well, and the JMM defines the rules for when the JVM must transfer data between local and main memory. The Java architects realized that an overly restrictive memory model would seriously undermine program performance. They attempted to craft a memory model that would allow programs to perform well on modern computer hardware while still providing guarantees that would allow threads to interact in predictable ways.
Java's primary tool for rendering interactions between threads predictable is the synchronized keyword. Many programmers think of synchronized strictly in terms of enforcing a mutual exclusion semaphore (mutex) to prevent execution of critical sections by more than one thread at a time. Unfortunately, that intuition does not fully describe what synchronized means.
The semantics of synchronized do indeed include mutual exclusion of execution based on the status of a semaphore, but they also include rules about the synchronizing thread's interaction with main memory. In particular, the acquisition or release of a lock triggers a memory barrier -- a forced synchronization between the thread's local memory and main memory. (Some processors -- like the Alpha -- have explicit machine instructions for performing memory barriers.) When a thread exits a synchronized block, it performs a write barrier -- it must flush out any variables modified in that block to main memory before releasing the lock. Similarly, when entering a synchronized block, it performs a read barrier -- it is as if the local memory has been invalidated, and it must fetch any variables that will be referenced in the block from main memory.
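A small sketch of both aspects of synchronized described above, mutual exclusion and memory visibility. SyncCounter and runDemo are hypothetical names used only for this illustration.

```java
// Hypothetical counter: every access to count goes through the same monitor.
final class SyncCounter {
    private int count; // guarded by "this"

    // Mutual exclusion makes the read-modify-write of count++ indivisible.
    synchronized void increment() {
        count++;
    }

    // Acquiring the same lock ensures the reader sees the latest written value.
    synchronized int get() {
        return count;
    }

    // Runs `threads` workers that each increment the shared counter `perThread` times.
    static int runDemo(int threads, int perThread) {
        SyncCounter counter = new SyncCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    counter.increment();
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return counter.get();
    }
}
```

With four threads and 10000 increments each, runDemo should return 40000. Without synchronized on increment(), updates could be lost and the total could come out lower.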
The only way to do double-checked locking correctly in Java is to use "volatile" declarations on the variable in question. While that solution is correct, note that "volatile" means cache lines get flushed at every access. Since "synchronized" flushes them at the end of the block, it may not actually be any more efficient (or even less efficient). I'd recommend just not using double-checked locking unless you've profiled your code and found there to be a performance problem in this area.
Declare the variable that is double-checked with the volatile modifier.
You don't need the h variable.
Here is an example from here
class Foo {
private volatile Helper helper = null;
public Helper getHelper() {
if (helper == null) {
synchronized(this) {
if (helper == null)
helper = new Helper();
}
}
return helper;
}
}
What do you mean? From whom are you getting the declaration?
Double-checked locking is fixed. Check Wikipedia:
public class FinalWrapper<T>
{
public final T value;
public FinalWrapper(T value) { this.value = value; }
}
public class Foo
{
private FinalWrapper<Helper> helperWrapper = null;
public Helper getHelper()
{
FinalWrapper<Helper> wrapper = helperWrapper;
if (wrapper == null)
{
synchronized(this)
{
if (helperWrapper ==null)
helperWrapper = new FinalWrapper<Helper>( new Helper() );
wrapper = helperWrapper;
}
}
return wrapper.value;
}
}
As a few have noted, you definitely need the volatile keyword to make it work correctly, unless all members of the object are declared final; otherwise there is no happens-before or safe publication, and you could see the default values.
We got sick of the constant problems with people getting this wrong, so we coded a LazyReference utility that has final semantics and has been profiled and tuned to be as fast as possible.
Copying the following from elsewhere; it explains why using a method-local variable as a copy of the volatile variable speeds things up.
Statement that needs explanation:
This code may appear a bit convoluted. In particular, the need for the
local variable result may be unclear.
Explanation:
The field would be read first time in the first if statement and
second time in the return statement. The field is declared volatile,
which means it has to be refetched from memory every time it is
accessed (roughly speaking, even more processing might be required to
access volatile variables) and can not be stored into a register by
the compiler. When copied to the local variable and then used in both
statements (if and return), the register optimization can be done by
the JVM.
I recently wrote a class for an assignment in which I had to store names in an ArrayList (in java). I initialized the ArrayList as an instance variable private ArrayList<String> names. Later when I checked my work against the solution, I noticed that they had initialized their ArrayList in the run() method instead.
I thought about this for a bit and I kind of feel it might be a matter of taste, but in general how does one choose in situations like this? Does one take up less memory or something?
PS I like the instance variables in Ruby that start with an @ symbol: they are lovelier.
(meta-question: What would be a better title for this question?)
In the words of the great Knuth "Premature optimization is the root of all evil".
Just worry that your program functions correctly and that it does not have bugs. This is far more important than an obscure optimization that will be hard to debug later on.
But to answer your question: if you initialize the list in the field declaration, its memory is allocated whenever an instance of your class is created. If you initialize it in a method, the allocation occurs later, when you call that specific method.
So it is only a question of initializing later; this is called lazy initialization in the industry.
Initialization
As a rule of thumb, try to initialize variables when they are declared.
If the value of a variable is intended never to change, make that explicit with use of the final keyword. This helps you reason about the correctness of your code, and while I'm not aware of compiler or JVM optimizations that recognize the final keyword, they would certainly be possible.
Of course, there are exceptions to this rule. For example, a variable may be assigned in an if–else or a switch. In a case like that, a "blank" declaration (one with no initialization) is preferable to an initialization that is guaranteed to be overwritten before the dummy value is read.
/* DON'T DO THIS! */
Color color = null;
switch(colorCode) {
case RED: color = new Color("crimson"); break;
case GREEN: color = new Color("lime"); break;
case BLUE: color = new Color("azure"); break;
}
color.fill(widget);
Now you have a NullPointerException if an unrecognized color code is presented. It would be better not to assign the meaningless null. The compiler would produce an error at the color.fill() call, because it would detect that you might not have initialized color.
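For contrast, a sketch of the blank-declaration version. Palette, ColorCode and cssName are hypothetical, and plain strings stand in for the Color class, which the example above does not define. Because the variable has no dummy initializer, the compiler insists that every path either assigns it or leaves the method.

```java
// Hypothetical helper illustrating a "blank" final local variable.
final class Palette {
    enum ColorCode { RED, GREEN, BLUE }

    static String cssName(ColorCode code) {
        final String name; // blank declaration: no meaningless null
        switch (code) {
            case RED:   name = "crimson"; break;
            case GREEN: name = "lime";    break;
            case BLUE:  name = "azure";   break;
            default:
                // Without this branch the compiler rejects the return below,
                // because name might not have been assigned.
                throw new IllegalArgumentException("Unrecognized color code: " + code);
        }
        return name;
    }
}
```

An unrecognized code now fails immediately with a descriptive exception instead of a NullPointerException later on.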
In order to answer your question in this case, I'd have to see the code in question. If the solution initialized it inside the run() method, it must have been used either as temporary storage, or as a way to "return" the results of the task.
If the collection is used as temporary storage, and isn't accessible outside of the method, it should be declared as a local variable, not an instance variable, and most likely, should be initialized where it's declared in the method.
Concurrency Issues
For a beginning programming course, your instructor probably wasn't trying to confront you with the complexities of concurrent programming—although if that's the case, I'm not sure why you were using a Thread. But, with current trends in CPU design, anyone who is learning to program needs to have a firm grasp on concurrency. I'll try to delve a little deeper here.
Returning results from a thread's run method is a bit tricky. This method comes from the Runnable interface, and there's nothing stopping multiple threads from executing the run method of a single instance. The resulting concurrency issues are part of the motivation behind the Callable interface introduced in Java 5. It's much like Runnable, but can return a result in a thread-safe manner, and throw an Exception if the task can't be executed.
It's a bit of a digression, but if you are curious, consider the following example:
class Oops extends Thread { /* Note that Thread implements "Runnable" */
private int counter = 0;
private Collection<Integer> state = ...;
public void run() {
state.add(counter);
counter++;
}
public static void main(String... argv) throws Exception {
Oops oops = new Oops();
oops.start();
Thread t2 = new Thread(oops); /* Now pass the same Runnable to a new Thread. */
t2.start(); /* Execute the "run" method of the same instance again. */
...
}
}
By the end of the main method you pretty much have no idea what the "state" of the Collection is. Two threads are working on it concurrently, and we haven't specified whether the collection is safe for concurrent use. If we initialize it inside the thread, at least we can say that eventually, state will contain one element, but we can't say whether it's 0 or 1.
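As a sketch of the Callable alternative mentioned earlier: NameLoader, loadInBackground and the sample names are hypothetical. The list is a local variable of call(), so it is never shared while it is being built, and the finished result is handed back through a Future.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical task that builds its result locally and returns it.
final class NameLoader implements Callable<List<String>> {
    @Override
    public List<String> call() {
        List<String> names = new ArrayList<>(); // local: invisible to other threads
        names.add("Ada");
        names.add("Grace");
        return names;
    }

    // Runs the task on a background thread and waits for its result.
    static List<String> loadInBackground() {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            return pool.submit(new NameLoader()).get(); // get() publishes the result safely
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

Each submitted task gets its own list, so running the same Callable twice cannot interleave writes the way the Oops example does.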
From wikibooks:
There are three basic kinds of scope for variables in Java:
local variable, declared within a method in a class, valid for (and occupying storage only for) the time that method is executing. Every time the method is called, a new copy of the variable is used.
instance variable, declared within a class but outside any method. It is valid for and occupies storage for as long as the corresponding object is in memory; a program can instantiate multiple objects of the class, and each one gets its own copy of all instance variables. This is the basic data structure rule of Object-Oriented programming; classes are defined to hold data specific to a "class of objects" in a given system, and each instance holds its own data.
static variable, declared within a class as static, outside any method. There is only one copy of such a variable no matter how many objects are instantiated from that class.
So yes, memory consumption is an issue, especially if the ArrayList inside run() is local.
I am not sure I completely understand your problem.
But as far as I understand it right now, the performance/memory benefit will be rather minor. Therefore I would definitely favour simplicity.
So do what suits you the best. Only address performance/memory optimisation when needed.
My personal rule of thumb for instance variables is to initialize them, at least with a default value, either:
at declaration time, i.e.
private ArrayList<String> myStrings = new ArrayList<String>();
in the constructor
If it's something that really is an instance variable, and represents state of the object, it is then completely initialized by the time the constructor exits. Otherwise, you open yourself to the possibility of trying to access the variable before it has a value. Of course, that doesn't apply to primitives where you will get a default value automatically.
For static (class-level) variables, initialize them in the declaration or in a static initializer. I use a static initializer if I have to do calculations or other work to get a value. Initialize in the declaration if you're just calling new Foo() or setting the variable to a known value.
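The rules of thumb above might look like this in one place. Roster and all of its members are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical class showing where each kind of field gets its value.
final class Roster {
    // Static field with a known value: initialize in the declaration.
    static final int MAX_NAMES = 3;

    // Static field that needs some work to compute: use a static initializer.
    static final Map<String, Integer> NAME_LENGTHS;
    static {
        Map<String, Integer> lengths = new HashMap<>();
        for (String name : new String[] {"Ada", "Grace"}) {
            lengths.put(name, name.length());
        }
        NAME_LENGTHS = Collections.unmodifiableMap(lengths);
    }

    // Instance field initialized at declaration time.
    private final List<String> names = new ArrayList<>();

    // Instance field initialized in the constructor.
    private final String owner;

    Roster(String owner) {
        this.owner = owner;
    }

    void add(String name) {
        names.add(name);
    }

    int size() {
        return names.size();
    }

    String owner() {
        return owner;
    }
}
```

By the time the constructor returns, both instance fields hold usable values, so no caller can observe a half-initialized Roster.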
You should avoid lazy initialization; it leads to problems later.
But if you have to do it because the initialization is too heavy, do it like this:
Static fields:
// Lazy initialization holder class idiom for static fields
private static class FieldHolder {
static final FieldType field = computeFieldValue();
}
static FieldType getField() { return FieldHolder.field; }
Instance fields:
// Double-check idiom for lazy initialization of instance fields
private volatile FieldType field;
FieldType getField() {
FieldType result = field;
if (result == null) { // First check (no locking)
synchronized(this) {
result = field;
if (result == null) // Second check (with locking)
field = result = computeFieldValue();
}
}
return result;
}
According to Joshua Bloch's book "Effective Java™
Second Edition" (ISBN-13: 978-0-321-35668-0):
"Use lazy initialization judiciously"