Lazily initialising static variables in a multi-threaded situation - java

I am trying to write an instance method to lazily initialise several static variables. The objects I am initialising are immutable, and the references for the objects are not changed by any other instance or static methods in the class. I want the initialization code to never be executed more than once even though there may be several instances of the class in many different threads. The initialization needs to take place in an instance method as the method overrides a method in a superclass. The approach I am using is as follows.
private static volatile boolean isPrepared;
private static volatile Object object1;
private static volatile Object object2;
private static volatile Object object3;
@Override
void prepare() {
synchronized (this.getClass()) {
if (isPrepared) { return; }
object1 = expensiveCalculation1();
object2 = expensiveCalculation2();
object3 = expensiveCalculation3();
isPrepared = true;
}
}
I am assuming that since the initialization takes place in a single synchronized block, it would be impossible for an instance to ever observe isPrepared as being true unless object1, object2 and object3 are all non-null. I am also assuming that it wouldn't work by simply declaring prepare() as synchronized as the lock would just be this. Are my assumptions right? Also, is it a good idea to have several variables marked volatile when you want to regard them as being initialised together, or should I bundle them together into a single Immutable class?

Bundling all lazily-initialized state into an immutable object is usually the preferred approach because then all you need is a volatile variable, with no synchronization. That arrangement opens you to some duplicated effort if another thread starts initializing while initialization is in progress, but the chances of that can be minimized, such as by writing a sentinel value to the volatile to signal the "in progress" state.
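A minimal sketch of the bundling idea described above, under some assumptions: the names PreparedState and LazyWorker are hypothetical, and the three calculation methods are placeholders for the real expensive work. As the answer notes, this lock-free variant can run the calculations more than once if two threads race, but every reader sees either null or a complete, immutable bundle.

```java
// Hypothetical immutable bundle; the fields stand in for object1..object3.
final class PreparedState {
    final Object object1;
    final Object object2;
    final Object object3;

    PreparedState(Object object1, Object object2, Object object3) {
        this.object1 = object1;
        this.object2 = object2;
        this.object3 = object3;
    }
}

class LazyWorker {
    // Single volatile reference: null means "not prepared yet".
    private static volatile PreparedState state;

    void prepare() {
        if (state == null) {
            // No lock: racing threads may both compute, but each one
            // publishes a fully constructed, immutable bundle.
            state = new PreparedState(
                    expensiveCalculation1(),
                    expensiveCalculation2(),
                    expensiveCalculation3());
        }
    }

    static PreparedState current() { return state; }

    // Placeholder calculations (assumptions for illustration only).
    private static Object expensiveCalculation1() { return "one"; }
    private static Object expensiveCalculation2() { return "two"; }
    private static Object expensiveCalculation3() { return "three"; }
}
```

If the calculations must never run twice, this sketch is not enough on its own; a lock or the nested-class approach would be needed.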

You can use the following approach: https://stackoverflow.com/a/11879164/2413618
Put your variables into a static nested class. When the nested class is accessed for the first time, all of its variables will be initialized.
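Applied to the question, the nested-class approach might look roughly like the sketch below. HolderWorker, Holder and the CALCULATIONS counter are hypothetical names added only so that the single execution can be observed; expensiveCalculation is a stand-in for the real work.

```java
import java.util.concurrent.atomic.AtomicInteger;

class HolderWorker {
    // Counts calculation runs so single execution can be observed (illustrative).
    static final AtomicInteger CALCULATIONS = new AtomicInteger();

    // The JVM initializes Holder on first access, once per class loader.
    private static final class Holder {
        static final Object OBJECT1 = expensiveCalculation("one");
        static final Object OBJECT2 = expensiveCalculation("two");
        static final Object OBJECT3 = expensiveCalculation("three");
    }

    void prepare() {
        // Reading any Holder field triggers its one-time initialization.
        Object ignored = Holder.OBJECT1;
    }

    static Object object1() { return Holder.OBJECT1; }

    // Placeholder for the real expensive work (an assumption).
    private static Object expensiveCalculation(String label) {
        CALCULATIONS.incrementAndGet();
        return label;
    }
}
```

Calling prepare() from several instances should leave the counter at three, one run per field.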

Related

Singleton without volatile member

Let's say I have next class:
public class Singleton{
private static Singleton _instance;
public static Singleton getInstance(){
if(_instance == null){
synchronized(Singleton.class){
if(_instance == null)
_instance = new Singleton();
}
}
return _instance;
}
}
For example, suppose two threads A and B try to execute the getInstance() method simultaneously. Do I understand the process correctly?
Thread A enters the getInstance() method and acquires the lock;
Thread B also enters the getInstance() method and is blocked;
Thread A creates a new Singleton() object and releases the lock. Should the latest value of the _instance variable now be visible to thread B? Or could thread B still have its own copy of the _instance variable that is not synchronized with main memory (_instance == null)?
Thread B is blocked on the synchronized block; when it proceeds, it will see the changed field (_instance != null) and therefore does not construct a new instance but returns the existing one.
All other threads which come later see the instance being set and will not even lock.
Problem: your code is incomplete. You need volatile in order to make sure that threads which do not go through the synchronized block (most of them, hopefully) still only see a completely published singleton object.
The Java Memory Model only guarantees that final fields are initialized. For all others you need safe publication, which is possible in the following ways:
Exchange the reference through a properly locked field (JLS 17.4.5)
Use static initializer to do the initializing stores (JLS 12.4)
Exchange the reference via a volatile field (JLS 17.4.5), or as the consequence of this rule, via the AtomicX classes
Initialize the value into a final field (JLS 17.5).
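As one illustration of the "AtomicX classes" option in the list above, here is a small sketch that publishes a lazily created instance through an AtomicReference. The class name AtomicSingleton is hypothetical. Note the trade-off: racing threads may each construct a candidate object, but only one candidate is ever published and returned.

```java
import java.util.concurrent.atomic.AtomicReference;

final class AtomicSingleton {
    // The AtomicReference itself is created by the static initializer.
    private static final AtomicReference<AtomicSingleton> INSTANCE =
            new AtomicReference<>();

    private AtomicSingleton() { }

    static AtomicSingleton getInstance() {
        AtomicSingleton current = INSTANCE.get();
        if (current == null) {
            // Only the first compareAndSet from null succeeds; losers discard
            // their candidate and read the winner's instance instead.
            INSTANCE.compareAndSet(null, new AtomicSingleton());
            current = INSTANCE.get();
        }
        return current;
    }
}
```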
The easiest method to avoid the volatile (or an atomic reference which is also safe to publish objects to other threads) is to use normal Object initialisation, this is a valid and robust singleton (but not lazy) provided by the JVM:
class Singleton
{
private static final Singleton HIGHLANDER = new Singleton();
private Singleton() { } // not accessible
public static Singleton getSingleton() { return HIGHLANDER; }
}
The JDK internally uses a similar construct with "Holder" objects to implement the same simple and robust pattern, but in a lazy fashion:
class Singleton
{
private Singleton() { } // not accessible
private static class LazyHolder {
private static final Singleton LAZY_HIGHLANDER = new Singleton();
}
public static Singleton getInstance() {
return LazyHolder.LAZY_HIGHLANDER;
}
}
Both methods do not require volatile variable access (which you need in DCL case) or synchronisation (it is implicitly done by the JVM which does the initialisation protected by a class lock).
What you show here is called double-checked locking.
The static variable belongs to the class, not the thread. Both threads will see the proper value, but it is possible that the compiler may optimize the reads such that the static variable is not checked both times. For this reason, you should declare the variable with the volatile keyword.
Please note that in versions of Java prior to version 5 this might not work correctly even with a volatile variable. It used to be possible for the assignment to assign a partially-constructed object to the variable. Now the constructor must return before the assignment can proceed. This will work correctly in any modern version of Java.
Two problems that could exist as far as I know.
Thread B might or might not see the latest value.
Thread B might see a partially constructed object, in case the constructor does a lot of work and the JVM decides to reorder the code.
Making it volatile solves both problems, since it enforces the happens before relationship and stops JVM from re ordering the code execution, and updates the values in the other threads.

What is the use of static synchronized method in java?

I have one question in mind. I have read that a static synchronized method locks on the Class object, and a synchronized instance method locks on the current instance of an object. So what does it mean to lock on the Class object?
Can anyone please help me with this topic?
In general, synchronized methods are used to protect access to resources that are accessed concurrently. When a resource that is being accessed concurrently belongs to each instance of your class, you use a synchronized instance method; when the resource belongs to all instances (i.e. when it is in a static variable) then you use a synchronized static method to access it.
For example, you could make a static factory method that keeps a "registry" of all objects that it has produced. A natural place for such registry would be a static collection. If your factory is used from multiple threads, you need to make the factory method synchronized (or have a synchronized block inside the method) to protect access to the shared static collection.
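The registry example above could be sketched as follows. RegisteredWidget and its methods are hypothetical names chosen for illustration; the point is only that both static methods lock on the same Class object and so guard the shared list.

```java
import java.util.ArrayList;
import java.util.List;

final class RegisteredWidget {
    // Shared static registry of every widget produced so far.
    private static final List<RegisteredWidget> REGISTRY = new ArrayList<>();

    private final String name;

    private RegisteredWidget(String name) { this.name = name; }

    // static synchronized: locks RegisteredWidget.class while touching the list.
    static synchronized RegisteredWidget create(String name) {
        RegisteredWidget widget = new RegisteredWidget(name);
        REGISTRY.add(widget);
        return widget;
    }

    // Reads take the same lock so they see a consistent list.
    static synchronized int registrySize() { return REGISTRY.size(); }

    String name() { return name; }
}
```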
Note that using synchronized without a specific lock object is generally not the safest choice when you are building a library to be used in code written by others. This is because malicious code could synchronize on your object or a class to block your own methods from executing. To protect your code against this, create a private "lock" object, instance or static, and synchronize on that object instead.
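A minimal sketch of the private lock object suggestion, assuming a simple static counter as the protected resource (GuardedCounter and LOCK are illustrative names). Because LOCK is private, code outside the class cannot synchronize on it and stall these methods.

```java
final class GuardedCounter {
    // Private lock object: not reachable from outside this class.
    private static final Object LOCK = new Object();
    private static int count;

    static void increment() {
        synchronized (LOCK) {
            count++;
        }
    }

    static int get() {
        synchronized (LOCK) {
            return count;
        }
    }
}
```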
At run time every loaded class has an instance of a Class object. That is the object that is used as the shared lock object by static synchronized methods. (Any synchronized method or block has to lock on some shared object.)
You can also synchronize on this object manually if wanted (whether in a static method or not). These three methods behave the same, allowing only one thread at a time into the inner block:
class Foo {
static synchronized void methodA() {
// ...
}
static void methodB() {
synchronized (Foo.class) {
// ...
}
}
static void methodC() {
Object lock = Foo.class;
synchronized (lock) {
// ...
}
}
}
The intended purpose of static synchronized methods is when you want to allow only one thread at a time to use some mutable state stored in static variables of a class.
Nowadays, Java has more powerful concurrency features, in java.util.concurrent and its subpackages, but the core Java 1.0 constructs such as synchronized methods are still valid and usable.
In simple words a static synchronized method will lock the class instead of the object, and it will lock the class because the keyword static means: "class instead of instance".
The keyword synchronized means that only one thread can access the method at a time.
And static synchronized means:
Only one thread at a time can execute the static synchronized methods of the class.
Suppose there are multiple static synchronized methods (m1, m2, m3, m4) in a class, and suppose one thread is executing m1; then no other thread can execute any of the other static synchronized methods at the same time.
Static methods can be synchronized, but you have one lock per class. When a Java class is loaded, a corresponding java.lang.Class object is created. That object's lock is the one needed for static synchronized methods.
So when you have a static field which should be restricted to be accessed by multiple threads at once you can set those fields private and create public static synchronized setters or getters to access those fields.
Java VM contains a single class object per class. Each class may have some shared variables called static variables. If the critical section of the code plays with these variables in a concurrent environment, then we need to make that particular section as synchronized. When there is more than one static synchronized method only one of them will be executed at a time without preemption. That's what lock on class object does.
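To make the shared class lock concrete, here is a small sketch in which two threads call different static synchronized methods of the same class. SharedTally and runDemo are hypothetical names. Because both methods lock on SharedTally.class, the updates to the static field cannot interleave mid-operation.

```java
final class SharedTally {
    private static int total;

    static synchronized void reset() { total = 0; }
    static synchronized void add(int n) { total += n; }
    static synchronized void subtract(int n) { total -= n; }
    static synchronized int total() { return total; }

    // Runs one adding thread and one subtracting thread to completion.
    static int runDemo() {
        reset();
        Thread adder = new Thread(() -> {
            for (int i = 0; i < 10_000; i++) add(2);
        });
        Thread subtractor = new Thread(() -> {
            for (int i = 0; i < 10_000; i++) subtract(1);
        });
        adder.start();
        subtractor.start();
        try {
            adder.join();
            subtractor.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return total(); // 10_000 * 2 - 10_000 * 1
    }
}
```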

Instance methods and thread-safety of instance variables

I would like to know whether each instance of a class has its own copy of the methods in that class.
Let's say I have the following class MyClass:
public class MyClass {
private String s1;
private String s2;
private String method1(String s1){
...
}
private String method2(String s2){
...
}
}
So if two different users each create an instance of MyClass like:
MyClass instanceOfUser1 = new MyClass();
MyClass instanceOfUser2 = new MyClass();
Does each user now have, in their thread, a copy of the methods of MyClass? If so, the instance variables are thread-safe as long as only the instance methods manipulate them, right?
I am asking this question because I often read that instance variables are not thread-safe, and I cannot see why that should be the case when each user gets an instance by calling the new operator.
Each object gets its own copy of the class's instance variables - it's static variables that are shared between all instances of a class. The reason that instance variables are not necessarily thread-safe is that they might be simultaneously modified by multiple threads calling unsynchronized instance methods.
class Example {
private int instanceVariable = 0;
public void increment() {
instanceVariable++;
}
}
Now if two different threads call increment at the same time then you've got a data race - instanceVariable might have been incremented by 1 or 2 once both calls have returned. You could eliminate this data race by adding the synchronized keyword to increment, or using an AtomicInteger instead of an int, etc, but the point is that just because each object gets its own copy of the class's instance variables does not necessarily mean that the variables are accessed in a thread-safe manner - this depends on the class's methods. (The exception is final immutable variables, which can't be accessed in a thread-unsafe manner, short of something goofy like a serialization hack.)
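The two fixes mentioned above (synchronized and AtomicInteger) can be sketched side by side. SafeCounters and the hammer helper are hypothetical names; the helper shares one instance between two threads, which is exactly the situation where the unsynchronized version could lose updates.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class SafeCounters {
    private int lockedCount = 0;
    private final AtomicInteger atomicCount = new AtomicInteger();

    // Option 1: synchronized makes the read-modify-write atomic per instance.
    public synchronized void incrementLocked() { lockedCount++; }
    public synchronized int lockedCount() { return lockedCount; }

    // Option 2: AtomicInteger increments atomically without a lock.
    public void incrementAtomic() { atomicCount.incrementAndGet(); }
    public int atomicCount() { return atomicCount.get(); }

    // Two threads share ONE instance and each increments both counters.
    static SafeCounters hammer(int perThread) {
        SafeCounters shared = new SafeCounters();
        Runnable work = () -> {
            for (int i = 0; i < perThread; i++) {
                shared.incrementLocked();
                shared.incrementAtomic();
            }
        };
        Thread t1 = new Thread(work);
        Thread t2 = new Thread(work);
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return shared;
    }
}
```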
Issues with multi-threading arise primarily with static variables and instances of a class being accessed at the same time.
You shouldn't worry about methods in the class but more about the fields (meaning scoped at the class level). If multiple references to an instance of a class exist, different execution paths may attempt to access the instance at the same time, causing unintended consequences such as race conditions.
A class is basically a blueprint for making an instance of an object. When the object is instantiated it receives a spot in memory that is accessed by a reference. If more than one thread has a handle to this reference it can cause occurrences where the instance is accessed simultaneously, this will cause fields to be manipulated by both threads.
'Instance Variables are not thread safe' - this statement depends on the context.
It is true, if for example you are talking about Servlets. It is because, Servlets create only one instance and multiple threads access it. So in that case Instance Variables are not thread safe.
In the above simplified case, if you are creating new instance for each thread, then your instance variables are thread safe.
Hope this answers your question
A method is nothing but a set of instructions. Whichever thread calls the method executes those same instructions. The method may use local variables, which are scoped to the method and the calling thread, or it may use shared resources, like static resources, shared objects or other resources, which are visible across threads.
Each instance has its own set of instance variables. How would you detect whether every instance had a distinct "copy" of the methods? Wouldn't the difference be visible only by examining the state of the instance variables?
In fact, no, there is only one copy of the method, meaning the set of instructions executed when the method is invoked. But, when executing, an instance method can refer to the instance on which it's being invoked with the reserved identifier this. The this identifier refers to the current instance. If you don't qualify an instance variable (or method) with something else, this is implied.
For example,
final class Example {
private boolean flag;
public void setFlag(boolean value) {
this.flag = value;
}
public void setAnotherFlag(Example friend) {
friend.flag = this.flag;
}
}
There's only one copy of the bytes that make up the VM instructions for the setFlag() and setAnotherFlag() methods. But when they are invoked, this is set to the instance upon which the invocation occurred. Because this is implied for an unqualified variable, you could delete all the references to this in the example, and it would still function exactly the same.
However, if a variable is qualified, like friend.flag above, the variables of another instance can be referenced. This is how you can get into trouble in a multi-threaded program. But, as long as an object doesn't "escape" from one thread to be visible to others, there's nothing to worry about.
There are many situations in which an instance may be accessible from multiple classes. For example, if your instance is a static variable in another class, then all threads would share that instance, and you can get into big trouble that way. That's just the first way that pops into my mind...

If an object reference is static, does it mean that the attributes of that object is static too?

For example I have a class like this:
public class StaticObjectReference{
private static StaticObjectReference instance;
private Vector queue;
public static StaticObjectReference getInstance(){
if(instance == null){
instance = new StaticObjectReference();
}
return instance;
}
public Vector getQueue(){
queue = new Vector();
return queue;
}
}
And these next two classes called the StaticObjectReference class.
public class CallerOne{
Vector queue1;
public void callObjectInstance1(){
queue1 = StaticObjectReference.getInstance().getQueue();
}
}
class CallerTwo{
Vector queue2;
public void callObjectInstance2(){
queue2 = StaticObjectReference.getInstance().getQueue();
}
}
Is the queue1 in the class CallerOne the same instance queue2 in the class CallerTwo?
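You are using the same instance of the class StaticObjectReference to get to the queue. Note, however, that getQueue() as written assigns a new Vector to the field on every call, so queue1 and queue2 end up referring to different Vector objects. They would be the same queue only if the Vector were created once (for example, in the constructor) and getQueue() simply returned it.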
Please note that this has nothing to do directly with the fact that instance is static though. It has more to do that this class is implementing the Singleton pattern, so there is only ONE instance of the class.
As Brian points out, this implementation isn't thread-safe. Check the Wikipedia link for thread-safe methods.
If an object reference is static, does it mean that the attributes of that object is static too?
This question does not make sense if you read it literally ... and that is probably the root of your uncertainty.
The term static (in Java) means that the variable belongs to a class, not an instance or a method call. The thing here that is labelled as static is a variable. The distinction between a variable, and the value contained in a variable is critical.
An object reference cannot be static. Static-ness is not a meaningful property for object references. It simply doesn't make sense. An object reference is a value, not a variable.
While a static variable holds a reference to an object, that (in general) doesn't make the object reference it contains (or the object it refers to) static. In addition to the terminological problem, if you change a static variable, the object that the variable originally referred to may go away. That is distinctly non-static behaviour.
In fact, the static-ness of the instance variable and queue are orthogonal issues. If the variables are declared as static, they are static. Otherwise they are not.
Now your code ensures [1] that there will only ever be one instance of StaticObjectReference. But that is an emergent property of the way you have written the class, not anything to do with the declarations (static or otherwise) of those two variables. And we'd not call this property "static-ness". We'd call it "singleton-ness" ... if you'll excuse my abuse of the English language ...
[1] Actually, it doesn't always guarantee that, because it is not thread-safe as written.

Singleton instantiation

Shown below is the creation of the singleton object.
public class Map_en_US extends mapTree {
private static Map_en_US m_instance;
private Map_en_US() {}
static{
m_instance = new Map_en_US();
m_instance.init();
}
public static Map_en_US getInstance(){
return m_instance;
}
@Override
protected void init() {
//some code;
}
}
My question is: what is the reason for using a static block for instantiation? I am familiar with the form of singleton instantiation below.
public static Map_en_US getInstance(){
if(m_instance==null)
m_instance = new Map_en_US();
return m_instance;
}
The reason is thread safety.
The form you said you are familiar with has the potential of initializing the singleton a large number of times. Moreover, even after it has been initialized multiple times, future calls to getInstance() by different threads might return different instances! Also, one thread might see a partially-initialized singleton instance! (let's say the constructor connects to a DB and authenticates; one thread might be able to get a reference to the singleton before the authentication happens, even if it is done in the constructor!)
There are some difficulties when dealing with threads:
Concurrency: they have the potential to execute concurrently;
Visibility: modifications to the memory made by one thread might not be visible to other threads;
Reordering: the order in which the code is executed cannot be predicted, which can lead to very strange results.
You should study about these difficulties to understand precisely why those odd behaviors are perfectly legal in the JVM, why they are actually good, and how to protect from them.
The static block is guaranteed, by the JVM, to be executed only once (unless you load and initialize the class using different ClassLoaders, but the details are beyond the scope of this question, I'd say), and by one thread only, and the results of it are guaranteed to be visible to every other thread.
That's why you should initialize the singleton on the static block.
My preferred pattern: thread-safe and lazy
The pattern above will instantiate the singleton on the first time the execution sees a reference to the class Map_en_US (actually, only a reference to the class itself will load it, but might not yet initialize it; for more details, check the reference). Maybe you don't want that. Maybe you want the singleton to be initialized only on the first call to Map_en_US.getInstance() (just as the pattern you said you are familiar with supposedly does).
If that's what you want, you can use the following pattern:
public class Singleton {
private Singleton() { ... }
private static class SingletonHolder {
private static final Singleton instance = new Singleton();
}
public static Singleton getInstance() {
return SingletonHolder.instance;
}
}
In the code above, the singleton will only be instantiated when the class SingletonHolder is initialized. This will happen only once (unless, as I said before, you are using multiple ClassLoaders), the code will be executed by only one thread, the results will have no visibility problems, and the initialization will occur only on the first reference to SingletonHolder, which happens inside the getInstance() method. This is the pattern I use most often when I need a singleton.
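The laziness claimed above can be observed with a small variation of the pattern. In this sketch LazySingleton and the CONSTRUCTIONS counter are hypothetical additions for illustration; reading the counter initializes the outer class but not the nested Holder, so the constructor should not run until getInstance() is first called.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class LazySingleton {
    // Counts constructor runs so the laziness is observable (illustrative only).
    static final AtomicInteger CONSTRUCTIONS = new AtomicInteger();

    private LazySingleton() {
        CONSTRUCTIONS.incrementAndGet();
    }

    // Initialized by the JVM on the first reference to Holder.INSTANCE.
    private static final class Holder {
        static final LazySingleton INSTANCE = new LazySingleton();
    }

    static LazySingleton getInstance() {
        return Holder.INSTANCE;
    }
}
```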
Other patterns...
1. synchronized getInstance()
As discussed in the comments to this answer, there's another way to implement the singleton in a thread safe manner, and which is almost the same as the (broken) one you are familiar with:
public class Singleton {
private static Singleton instance;
public static synchronized Singleton getInstance() {
if (instance == null)
instance = new Singleton();
return instance;
}
}
The code above is guaranteed, by the memory model, to be thread safe. The JVM specification states the following (in a more cryptic way): let L be the lock of any object, let T1 and T2 be two threads. The release of L by T1 happens-before the acquisition of L by T2.
This means that every thing that has been done by T1 before releasing the lock will be visible to every other thread after they acquire the same lock.
So, suppose T1 is the first thread that entered the getInstance() method. Until it has finished, no other thread will be able to enter the same method (since it is synchronized). It will see that instance is null, will instantiate a Singleton and store it in the field. It will then release the lock and return the instance.
Then, T2, which was waiting for the lock, will be able to acquire it and enter the method. Since it acquired the same lock that T1 just released, T2 will see that the field instance contains the exact same instance of Singleton created by T1, and will simply return it. What is more, the initialization of the singleton, which has been done by T1, happened before the release of the lock by T1, which happened before the acquisition of the lock by T2, therefore there's no way T2 can see a partially-initialized singleton.
The code above is perfectly correct. The only problem is that the access to the singleton will be serialized. If it happens a lot, it will reduce the scalability of your application. That's why I prefer the SingletonHolder pattern I showed above: access to the singleton will be truly concurrent, without the need for synchronization!
2. Double checked locking (DCL)
Often, people are scared about the cost of lock acquisition. I've read that nowadays it is not that relevant for most applications. The real problem with lock acquisition is that it hurts scalability by serializing access to the synchronized block.
Someone devised an ingenious way to avoid acquiring a lock, and it has been called double-checked locking. The problem is that most implementations are broken. That is, most implementations are not thread-safe (i.e., are as thread-unsafe as the getInstance() method in the original question).
The correct way to implement the DCL is as follows:
public class Singleton {
private static volatile Singleton instance;
public static Singleton getInstance() {
if (instance == null) {
synchronized (Singleton.class) {
if (instance == null) {
instance = new Singleton();
}
}
}
return instance;
}
}
The difference between this correct and an incorrect implementation is the volatile keyword.
To understand why, let T1 and T2 be two threads. Let's first assume that the field is not volatile.
T1 enters the getInstance() method. It's the first one to ever enter it, so the field is null. It then enters the synchronized block, then the second if. It also evaluates to true, so T1 creates a new instance of the singleton and stores it in the field. The lock is then released, and the singleton is returned. For this thread, it is guaranteed that the Singleton is completely initialized.
Now, T2 enters the getInstance() method. It is possible (although not guaranteed) that it will see that instance != null. It will then skip the if block (and so it will never acquire the lock), and will directly return the instance of the Singleton. Due to reordering, it is possible that T2 will not see all the initialization performed by the Singleton in its constructor! Revisiting the db connection singleton example, T2 might see a connected but not yet authenticated singleton!
For more information...
... I'd recommend a brilliant book, Java Concurrency in Practice, and also, the Java Language Specification.
If you initialize in the getInstance() method, you can get a race condition, i.e. if 2 threads execute the if(m_instance == null) check simultaneously, both might see the instance as null and thus both might call m_instance = new Map_en_US();
Since the static initializer block is executed only once (by one thread that is executing the class loader), you don't have a problem there.
Here's a good overview.
How about this approach for eradicating the static block:
private static Map_en_US s_instance = new Map_en_US() {{init();}};
It does the same thing, but is way neater.
Explanation of this syntax:
The outer set of braces creates an anonymous class.
The inner set of braces is called an "instance block" - it fires during construction.
This syntax is often incorrectly called the "double brace initializer" syntax, usually by those who don't understand what is going on.
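The two sets of braces can be seen at work in a small, self-contained sketch. DoubleBraceDemo and build are hypothetical names; the example uses an ArrayList rather than the Map_en_US class from the question. The object returned is an instance of an anonymous subclass, not of ArrayList itself.

```java
import java.util.ArrayList;
import java.util.List;

final class DoubleBraceDemo {
    static List<String> build() {
        // Outer braces: body of an anonymous subclass of ArrayList.
        // Inner braces: instance initializer block, run during construction.
        return new ArrayList<String>() {{
            add("first");
            add("second");
        }};
    }
}
```

One consequence of this design is that every use of the syntax creates an extra class, which is worth keeping in mind before using it widely.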
Also, note:
m_ is a naming convention prefix for instance (ie member) fields.
s_ is a naming convention prefix for class (ie static) fields.
So I changed the name of the field to s_....
It depends on how resource-intensive the init method is. If it e.g. does a lot of work, maybe you want that work done at the startup of the application instead of on the first call. Maybe it downloads the map from Internet? I don't know...
The static block is executed when the class is first initialized by the JVM. As Bruno said, that helps with thread safety because there isn't a possibility that two threads will fight over the same getInstance() call for the first time.
With static instantiation there will be only one copy of the instance per class, irrespective of how many objects are created.
The second advantage of this approach is that the method is thread-safe, as you are not doing anything in the method except returning the instance.
The static block instantiates your class and calls the default constructor (if any) only once, when the class is initialized by the JVM.
Using the getInstance() method, the object is built and initialized when the method is called and not during static initialization. That is not really safe if you are calling getInstance() from different threads at the same time.
The static block is there to allow init() to be invoked. Another way to code it could be, e.g., like this (which to prefer is a matter of taste):
public class Map_en_US extends mapTree {
private static
/* thread safe without final,
see VM spec 2nd ed 2.17.5 */
Map_en_US m_instance = createAndInit();
private Map_en_US() {}
public static Map_en_US getInstance(){
return m_instance;
}
@Override
protected void init() {
//some code;
}
private static Map_en_US createAndInit() {
final Map_en_US tmp = new Map_en_US();
tmp.init();
return tmp;
}
}
update corrected per VM spec 2.17.5, details in comments
// Best way to implement the singleton class in java
package com.vsspl.test1;
class STest {
private static STest ob= null;
private STest(){
System.out.println("private constructor");
}
public static STest create(){
if(ob==null)
ob = new STest();
return ob;
}
public Object clone(){
STest obb = create();
return obb;
}
}
public class SingletonTest {
public static void main(String[] args) {
STest ob1 = STest.create();
STest ob2 = STest.create();
STest ob3 = STest.create();
System.out.println("obj1 " + ob1.hashCode());
System.out.println("obj2 " + ob2.hashCode());
System.out.println("obj3 " + ob3.hashCode());
STest ob4 = (STest) ob3.clone();
STest ob5 = (STest) ob2.clone();
System.out.println("obj4 " + ob4.hashCode());
System.out.println("obj5 " + ob5.hashCode());
}
}
-------------------------------- OUT PUT -------------------------------------
private constructor
obj1 1169863946
obj2 1169863946
obj3 1169863946
obj4 1169863946
obj5 1169863946
Interesting never seen that before. Seems largely a style preference. I suppose one difference is: the static initialisation takes place at VM startup, rather than on first request for an instance, potentially eliminating an issue with concurrent instantiations? (Which can also be handled with synchronized getInstance() method declaration)
