Someone somewhere told me that Java constructors are synchronized so that it can't be accessed concurrently during construction, and I was wondering: if I have a constructor that stores the object in a map, and another thread retrieves it from that map before its construction is finished, will that thread block until the constructor completes?
Let me demonstrate with some code:
public class Test {
private static final Map<Integer, Test> testsById =
Collections.synchronizedMap(new HashMap<>());
private static final AtomicInteger atomicIdGenerator = new AtomicInteger();
private final int id;
public Test() {
this.id = atomicIdGenerator.getAndIncrement();
testsById.put(this.id, this);
// Some lengthy operation to fully initialize this object
}
public static Test getTestById(int id) {
return testsById.get(id);
}
}
Assume that put/get are the only operations on the map, so I won't get CME's via something like iteration, and try to ignore other obvious flaws here.
What I want to know is if another thread (that's not the one constructing the object, obviously) tries to access the object using getTestById and calling something on it, will it block? In other words:
Test test = getTestById(someId);
test.doSomething(); // Does this line block until the constructor is done?
I'm just trying to clarify how far the constructor synchronization goes in Java and if code like this would be problematic. I've seen code like this recently that did this instead of using a static factory method, and I was wondering just how dangerous (or safe) this is in a multi-threaded system.
Someone somewhere told me that Java constructors are synchronized so that it can't be accessed concurrently during construction
This is certainly not the case. There is no implied synchronization with constructors. Not only can multiple constructors happen at the same time but you can get concurrency issues by, for example, forking a thread inside of a constructor with a reference to the this being constructed.
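To make the thread-forking hazard concrete, here is a minimal sketch. The class name LeakyCounter and its fields are made up for illustration, and the join() is there only so the outcome is repeatable; real code would usually hit this as a timing-dependent bug.

```java
class LeakyCounter {
    // Written by the worker thread that the constructor starts.
    static volatile int seenByWorker = -1;

    private int limit; // deliberately non-final

    LeakyCounter(int limit) {
        // 'this' escapes to the worker thread through the lambda.
        Thread worker = new Thread(() -> seenByWorker = this.limit);
        worker.start();
        try {
            worker.join(); // demo only: let the worker look before we finish
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        this.limit = limit; // assigned only after the worker has already read it
    }

    int limit() {
        return limit;
    }
}
```

The worker reads the default value 0 rather than the value passed to the constructor, and nothing blocks it from doing so.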
if I have a constructor that stores the object in a map, and another thread retrieves it from that map before its construction is finished, will that thread block until the constructor completes?
No it won't.
The big problem with constructors in threaded applications is that the compiler has the permission, under the Java memory model, to reorder the operations inside of the constructor so they take place after (of all things) the object reference is created and the constructor finishes. final fields will be guaranteed to be fully initialized by the time the constructor finishes but not other "normal" fields.
In your case, since you are putting your Test into the synchronized map and then continuing to do initialization, as @Tim mentioned, this will allow other threads to get ahold of the object in a possibly semi-initialized state. One solution would be to use a static method to create your object:
private Test() {
this.id = atomicIdGenerator.getAndIncrement();
// Some lengthy operation to fully initialize this object
}
public static Test createTest() {
Test test = new Test();
// this put to a synchronized map forces a happens-before of Test constructor
testsById.put(test.id, test);
return test;
}
My example code works because you are dealing with a synchronized map: its put method enters a synchronized block, which ensures that the Test constructor has completed and that its writes have been memory synchronized before another thread can get the object from the map.
The big problems in your example are both the "happens before" guarantee (the constructor may not have finished when Test is put into the map) and memory synchronization (the constructing thread and the get-ing thread may see different memory for the Test instance). If you move the put outside of the constructor then both are handled by the synchronized map. It doesn't matter which object the map synchronizes on: the synchronization itself guarantees that the constructor has finished before the object was put into the map and that the memory has been synchronized.
I believe that if you called testsById.put(this.id, this); at the very end of your constructor, you may in practice be okay. However, this is not good form and at the least would need careful commenting/documentation. It would also not solve the problem if the class were subclassed and initialization were done in the subclass after the super() call. The static solution I showed is a better pattern.
Someone somewhere told me that Java constructors are synchronized
'Somebody somewhere' is seriously misinformed. Constructors are not synchronized. Proof:
public class A
{
public A() throws InterruptedException
{
wait();
}
public static void main(String[] args) throws Exception
{
A a = new A();
}
}
This code throws java.lang.IllegalMonitorStateException at the wait() call. If there was synchronization in effect, it wouldn't.
It doesn't even make sense. There is no need for them to be synchronized. A constructor can only be invoked after a new(), and by definition each invocation of new() returns a different value. So there is zero possibility of a constructor being invoked by two threads simultaneously with the same value of this. So there is no need for synchronization of constructors.
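Another way to check this, as a small sketch: Thread.holdsLock reports whether the current thread owns an object's monitor. The class name LockProbe is made up for this example.

```java
class LockProbe {
    final boolean heldInConstructor;
    final boolean heldInSynchronizedBlock;

    LockProbe() {
        // No monitor is acquired just because a constructor is running.
        heldInConstructor = Thread.holdsLock(this);
        synchronized (this) {
            // Explicit synchronization is what actually takes the monitor.
            heldInSynchronizedBlock = Thread.holdsLock(this);
        }
    }
}
```

The first field ends up false and the second true, which matches the IllegalMonitorStateException seen with wait().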
if I have a constructor that stores the object in a map, and another thread retrieves it from that map before its construction is finished, will that thread block until the constructor completes?
No. Why would it do that? Who's going to block it? Letting 'this' escape from a constructor like that is poor practice: it allows other threads to access an object that is still under construction.
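A rough sketch of what that looks like in practice follows. The class EarlyPublished, its latches and the helper method are all invented for the demonstration; the latches only force the unlucky interleaving so that the result is repeatable.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;

class EarlyPublished {
    private String name; // set at the very end of construction

    EarlyPublished(int id, Map<Integer, EarlyPublished> registry,
                   CountDownLatch published, CountDownLatch readerDone) {
        registry.put(id, this);   // 'this' escapes here
        published.countDown();
        await(readerDone);        // stand-in for the lengthy initialization
        this.name = "ready";
    }

    String name() {
        return name;
    }

    static void await(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    /** Reports what a reader thread saw while the constructor was still running. */
    static String observeDuringConstruction() {
        Map<Integer, EarlyPublished> registry = new ConcurrentHashMap<>();
        CountDownLatch published = new CountDownLatch(1);
        CountDownLatch readerDone = new CountDownLatch(1);
        String[] seen = new String[1];
        Thread reader = new Thread(() -> {
            await(published);
            // The lookup and the method call return immediately; nothing blocks.
            seen[0] = "name=" + registry.get(1).name();
            readerDone.countDown();
        });
        reader.start();
        new EarlyPublished(1, registry, published, readerDone);
        try {
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }
}
```

The reader gets the object out of the map and calls a method on it while the constructor is still parked, so it observes the not-yet-assigned field.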
You've been misinformed. What you describe is actually referred to as improper publication and discussed at length in the Java Concurrency In Practice book.
So yes, it will be possible for another thread to obtain a reference to your object and begin trying to use it before it is finished initializing. But wait, it gets worse. Consider this answer: https://stackoverflow.com/a/2624784/122207 ... basically there can be a reordering of reference assignment and constructor completion. In the example referenced, one thread can assign h = new Holder(i) and another thread can call h.assertSanity() on the new instance with timing just right to get two different values for the n member that is assigned in Holder's constructor.
Constructors are just like other methods; there is no additional synchronization (except for the handling of final fields).
The code would work if this were published later, at the end of the constructor:
public Test()
{
// Some lengthy operation to fully initialize this object
this.id = atomicIdGenerator.getAndIncrement();
testsById.put(this.id, this);
}
Although this question has been answered, the code pasted in the question doesn't follow safe construction techniques, because it allows the this reference to escape from the constructor. I would like to share a beautiful explanation presented by Brian Goetz in the article "Java theory and practice: Safe construction techniques" on the IBM developerWorks website.
It's unsafe. There is no additional synchronization in the JVM. You can do something like this:
public class Test {
private final Object lock = new Object();
public Test() {
synchronized (lock) {
// your improper object reference publication
// long initialization
}
}
public void doSomething() {
synchronized (lock) {
// do something
}
}
}
I'm building a program that requires the construction of some objects that need such intense computation to create that my smartest course would be to have them built in their own dedicated threads, while the master thread keeps grinding away on other things until the objects are needed.
So I thought about creating a special class specifically designed to create custom objects in their own thread. Like so:
public abstract class DedicatedThreadBuilder<T> {
private T object;
public DedicatedThreadBuilder() {
DedicatedThread dt = new DedicatedThread(this);
dt.start();
}
private void setObject(T i) {
object = i;
}
protected abstract T constructObject();
public synchronized T getObject() {
return object;
}
private class DedicatedThread extends Thread {
private DedicatedThreadBuilder dtb;
public DedicatedThread(DedicatedThreadBuilder builder){
dtb = builder;
}
public void run() {
synchronized(dtb) {
dtb.setObject(dtb.constructObject());
}
}
}
}
My only concern is that this mechanism will only work properly if the master thread (i.e. the thread that constructs the DedicatedThreadBuilder) holds a synchronized lock on the DedicatedThreadBuilder until its construction is completed, and therefore blocks the DedicatedThread's attempt to build the product object until it has finished constructing the DedicatedThreadBuilder. Why? Because the subclasses of DedicatedThreadBuilder will no doubt need to be constructed with parameters that will need to be passed into their own private storage, so that they can be used in the constructObject() process.
e.g.
public class JellybeanStatisticBuilder extends DedicatedThreadBuilder<JellybeanStatistics> {
private int greens;
private int blacks;
private int yellows;
public JellybeanStatisticBuilder(int g, int b, int y) {
super();
greens = g;
blacks = b;
yellows = y;
}
protected JellybeanStatistics constructObject() {
return new JellybeanStatistics(greens, blacks, yellows);
}
}
This will only work properly if the object is blocked to other threads until after it is completely constructed. Otherwise, the DedicatedThread might try to build the object before the necessary variables have been assigned.
So is that how Java works?
I think what you want is to have some sort of synchronised factory class:
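import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
public class SyncFactory<T> {
// alternatively, use newCachedThreadPool or newFixedThreadPool if you want to allow some degree of parallel construction
private final ExecutorService executor = Executors.newSingleThreadExecutor();
// Java cannot instantiate a type parameter (new T() does not compile),
// so the caller passes in the code that builds the object
public Future<T> create(Callable<T> constructor) {
return executor.submit(constructor);
}
}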
Now you'd replace usages of T that may need to happen before T is ready with Future<T>, and have a choice between calling its Future.get() method to block until it's ready, set a timeout, or to call Future.isDone() to check up on the construction without blocking. In addition, instead of polling the Future, you may want to have the factory call a callback or post an event to notify the main thread when it has completed construction.
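As a minimal sketch of the calling side, assuming a made-up Report class as the expensive object and a five-second timeout chosen arbitrarily:

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class FutureConstructionDemo {
    /** Hypothetical expensive-to-build object. */
    static final class Report {
        final int total;

        Report(int total) {
            this.total = total;
        }
    }

    static int buildInBackground(int a, int b) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            // Construction runs on the executor's thread.
            Future<Report> pending = executor.submit(() -> new Report(a + b));
            // The calling thread could do other work here and poll pending.isDone().
            Report report = pending.get(5, TimeUnit.SECONDS); // block with a timeout
            return report.total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException | TimeoutException e) {
            throw new IllegalStateException(e);
        } finally {
            executor.shutdown();
        }
    }
}
```

Because the Report is only handed over through Future.get(), the caller never sees it before its constructor has finished.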
If (big if) this is truly needed (think that one over first)...
The overall idea you are heading toward can work, but your code is confusing and, at least at first glance, looks as though it might not work. This kind of complexity, which can easily break things, is a very good reason to double-think and even triple-think heading down this path.
The major problem I spotted right away is that you are only ever creating 1 instance of the object. If this is a factory which just creates things on another thread, then the DedicatedThread should be called upon in DedicatedThreadBuilder's constructObject, not in its constructor.
If, on the other hand, you actually intend for the DedicatedThreadBuilder to only create 1 instance of T, then this abstraction seems unnecessary... just move DedicatedThread's behavior out to DedicatedThreadBuilder, as DedicatedThreadBuilder doesn't really seem to be doing anything extra.
Second, a minor thing that isn't incorrect so much as unnecessary: you have an inner class whose constructor is passed an instance of the outer class (that is, DedicatedThread's constructor takes a reference to its parent DedicatedThreadBuilder). This is unnecessary, as non-static inner classes are already linked to their outer instances, so the inner class can reference the outer instance without any extra reference to it.
Third, if you move the behavior out of the constructor and into a separate method, then you could synchronize that. Personally, I would have had the constructObject be the thing that kicked off the process, so that calling dtb.constructObject() started the object's creation, and constructObject itself set object = newlyCreatedThing when it was done. Then you could synchronize that method if you want, or do whatever, and not have to worry about the constructor possibly not behaving how you want - in my opinion you should not generally have to worry that a constructor might have some odd side effects.
Fourth, do you have any way to know when the object is ready and available for getting? You might want to add some mechanism for that, such as an observer or other callback.
The problem is that you are using the subclass before it is constructed. It doesn't really have anything to do with multithreading. If you were calling constructObject directly from the DedicatedThreadBuilder constructor, it would be just as bad.
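A single-threaded sketch of the same failure, with the hypothetical classes EagerBase and EagerChild standing in for the builder and its subclass:

```java
abstract class EagerBase {
    final String builtInConstructor;

    EagerBase() {
        // Runs before the subclass constructor body has assigned anything.
        builtInConstructor = describe();
    }

    protected abstract String describe();
}

class EagerChild extends EagerBase {
    private int greens;

    EagerChild(int greens) {
        super();              // describe() is invoked in here...
        this.greens = greens; // ...before this assignment happens
    }

    @Override
    protected String describe() {
        return "greens=" + greens;
    }
}
```

The value captured in the base constructor reflects the default 0, while a later call to describe() sees the real value. No second thread is involved.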
The reasonable implementation that is closest to what you have is just to provide DedicatedThreadBuilder with a separate start() method that should be called after the object is constructed.
Or you could have it extend Thread and use the Thread methods.
Or you could have it implement Runnable so you could use it with a Thread or an Executor or whatever.
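One possible shape for the separate start() variant is sketched below. StartableBuilder and SumBuilder are invented names, and using FutureTask to carry the result is my own choice rather than something from the question.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.FutureTask;

abstract class StartableBuilder<T> {
    // The task only calls constructObject() once it is actually run.
    private final FutureTask<T> task = new FutureTask<>(this::constructObject);

    protected abstract T constructObject();

    /** Call only after the subclass constructor has returned. */
    public final void start() {
        new Thread(task).start();
    }

    /** Blocks until the background construction has finished. */
    public final T getObject() {
        try {
            return task.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        }
    }
}

/** Hypothetical subclass that needs its constructor arguments in constructObject(). */
class SumBuilder extends StartableBuilder<Integer> {
    private final int a;
    private final int b;

    SumBuilder(int a, int b) {
        this.a = a;
        this.b = b;
    }

    @Override
    protected Integer constructObject() {
        return a + b;
    }
}
```

Usage would be along the lines of new SumBuilder(2, 3), then start(), then getObject() when the result is needed. Since the thread is started after the constructor has returned, the subclass fields are already assigned when constructObject() runs.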
What does this java code mean? Will it gain lock on all objects of MyClass?
synchronized(MyClass.class) {
// are all objects of MyClass thread-safe now?
}
And how the above code differs from this one:
synchronized(this) {
// are all objects of MyClass thread-safe now?
}
The snippet synchronized(X.class) uses the class instance as a monitor. As there is only one class instance (the object representing the class metadata at runtime), only one thread at a time can be in this block.
With synchronized(this) the block is guarded by the instance. For every instance only one thread at a time may enter the block.
synchronized(X.class) is used to make sure that there is at most one thread in the block. synchronized(this) ensures that there is at most one thread per instance. Whether this makes the actual code in the block thread-safe depends on the implementation. If the block mutates only the state of the instance, synchronized(this) is enough.
To add to the other answers:
static void myMethod() {
synchronized(MyClass.class) {
//code
}
}
is equivalent to
static synchronized void myMethod() {
//code
}
and
void myMethod() {
synchronized(this) {
//code
}
}
is equivalent to
synchronized void myMethod() {
//code
}
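These equivalences can be checked with Thread.holdsLock. The following is only a small sketch, and MonitorProbe is a made-up class name:

```java
class MonitorProbe {
    // A static synchronized method locks the Class object.
    static synchronized boolean staticHoldsClassLock() {
        return Thread.holdsLock(MonitorProbe.class);
    }

    // An instance synchronized method locks 'this'.
    synchronized boolean instanceHoldsThisLock() {
        return Thread.holdsLock(this);
    }

    // ...and it does not lock the Class object.
    synchronized boolean instanceHoldsClassLock() {
        return Thread.holdsLock(MonitorProbe.class);
    }
}
```

Called from code that holds no locks of its own, the first two methods return true and the third returns false.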
No, the first will get a lock on the class definition of MyClass, not on all instances of it. However, if used in an instance method, it will effectively block every other thread that tries to enter the same block from any instance, since all instances share a single class definition.
The second will get a lock on the current instance only.
As to whether this makes your objects thread safe, that is a far more complex question - we'd need to see your code!
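Yes, in effect it will: while one thread holds the lock on MyClass.class, no other thread can enter any block or method synchronized on MyClass.class, whichever instance it is running in.
I had been wondering about this question myself for a couple of days (actually in Kotlin). I finally found a good explanation and want to share it: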
Class level lock prevents multiple threads to enter in synchronized block in any of all available instances of the class on runtime. This means if in runtime there are 100 instances of DemoClass, then only one thread will be able to execute demoMethod() in any one of instance at a time, and all other instances will be locked for other threads.
Class level locking should always be done to make static data thread safe. As we know that static keyword associate data of methods to class level, so use locking at static fields or methods to make it on class level.
Also note why .class is used. It is simply because .class behaves like a static variable of the class, similar to:
private final static Object lock = new Object();
where the lock variable's name is class and its type is Class<T>.
Read more:
https://howtodoinjava.com/java/multi-threading/object-vs-class-level-locking/
In Effective Java, 2nd Edition, Item 70, Josh Bloch describes thread-hostile classes:
This class is not safe for concurrent use even if all method
invocations are surrounded by external synchronization. Thread
hostility usually results from modifying static data without
synchronization
Can someone explain to me, with an example, how it's impossible to achieve thread safety by external synchronization if the class modifies shared static data without internal synchronization?
The quotation assumes that every call to a method of the class is synchronized on the object instance it is invoked on. For example, consider the following class:
public class Test {
private Set<String> set = new TreeSet<>();
public void add(String s) {
set.add(s);
}
}
While it's not thread-safe, you can safely call the add method this way:
public void safeAdd(Test t, String s) {
synchronized(t) {
t.add(s);
}
}
If safeAdd is called from multiple threads with the same t, the calls will be mutually exclusive. If a different t is used, that's also fine, as independent objects are updated.
However consider that we declare the set as static:
private static Set<String> set = new TreeSet<>();
This way even different Test objects access the shared collection. So in this case synchronizing on Test instances will not help, as the same set may still be modified concurrently from different Test instances, which may result in data loss, random exceptions, an infinite loop or whatever. So such a class would be thread-hostile.
Can Someone explain me with example how it's impossible to achieve thread safety by external synchronization if the class modifies shared static data without internal synchronization?
It's not impossible. If the class has methods that access global (i.e., static) data, then you can achieve thread-safety by synchronizing on a global lock.
But forcing the caller to synchronize threads on one global lock still is thread-hostile. A big global lock can be a serious bottleneck in a multi-threaded application. What the author wants you to do is design your class so that it would be sufficient for the client to use a separate lock for each instance of the class.
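As a rough sketch of the global-lock workaround: HostileCounter, GlobalLockDemo and GLOBAL_LOCK are names invented for this example, and every caller is assumed to cooperate by using the same lock object.

```java
class HostileCounter {
    private static int total; // shared static state, no internal synchronization

    void increment() {
        total++;
    }

    static int total() {
        return total;
    }

    static void reset() {
        total = 0;
    }
}

class GlobalLockDemo {
    // One lock for every caller; per-instance locks would not protect 'total'.
    private static final Object GLOBAL_LOCK = new Object();

    static int run(int threads, int incrementsPerThread) {
        synchronized (GLOBAL_LOCK) {
            HostileCounter.reset();
        }
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            HostileCounter own = new HostileCounter(); // each thread has its own instance
            workers[i] = new Thread(() -> {
                for (int j = 0; j < incrementsPerThread; j++) {
                    synchronized (GLOBAL_LOCK) {
                        own.increment();
                    }
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        synchronized (GLOBAL_LOCK) {
            return HostileCounter.total();
        }
    }
}
```

The count comes out right, but every increment from every thread is serialized through one lock, which is exactly the bottleneck described above. Synchronizing on each thread's own instance instead would leave the static field unprotected.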
Perhaps a contrived example, but this class is impossible to synchronize externally, because the value is accessible from outside the class:
public class Example {
public static int value;
public void setValue(int newValue) {
this.value = newValue;
}
}
However you synchronize invocation of the setter, you can't guarantee that some other thread isn't changing value.
If a class has only two synchronized methods (both either static or non static), the class is considered to be thread safe. What if one of the methods is static and one non static? Is it still thread safe, or bad things can happen if multiple threads call the methods?
There are some similar threads like static synchronized and non static synchronized methods in threads which describe the method calls are not blocking each other. But I am curious to know whether bad things in the world of thread safety (like inconsistent state, race condition, etc) can happen or not.
Edit 1:
Since static methods can't call non-static methods, there should be no thread conflict from this side. On the other hand, if a non-static method calls the static one, it has to acquire the class lock, which would still be thread safe. So by just having two methods (one static, one non-static) I don't see any thread conflict. Is that right? In other words, the only case in which I can see an issue is when the non-static method accesses some static variables. But if all accesses are done through methods then I don't see any issues with thread safety. These were my thoughts. I am not sure whether I am missing something here since I am a little bit new to Java concurrency.
In Java, the following:
public class MyClass
{
public synchronized void nonStaticMethod()
{
// code
}
public synchronized static void staticMethod()
{
// code
}
}
is equivalent to the following:
public class MyClass
{
public void nonStaticMethod()
{
synchronized(this)
{
// code
}
}
public static void staticMethod()
{
synchronized(MyClass.class)
{
// code
}
}
}
As you see, non-static methods use this as the monitor object, and static methods use the class object as the monitor.
As this and MyClass.class are different objects, static and non-static methods may run concurrently.
To "fix" this, create a dedicated static monitor object and use it in both static and non-static methods:
public class MyClass
{
private static final Object monitor = new Object();
public void nonStaticMethod()
{
synchronized(monitor)
{
// code
}
}
public static void staticMethod()
{
synchronized(monitor)
{
// code
}
}
}
What if one of the methods is static and one non static? Is it still thread safe, or bad things can happen if multiple threads call the methods?
Bad things can happen.
The static method will lock on the class monitor. The instance method will lock on the instance monitor. Since two different lock objects are in use, both methods could execute at the same time from different threads. If they share state (i.e. the instance method accesses static data) you will have problems.
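Here is a small sketch that shows the two methods really can overlap. MixedLocks and its latches are made up for the demonstration; the latches keep one thread parked inside the instance method while the static method is called.

```java
import java.util.concurrent.CountDownLatch;

class MixedLocks {
    synchronized void holdInstanceLock(CountDownLatch entered, CountDownLatch release) {
        entered.countDown();
        try {
            release.await(); // stay inside the synchronized instance method
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    static synchronized boolean staticHoldsClassLock() {
        return Thread.holdsLock(MixedLocks.class);
    }

    /** True if the static method ran while another thread was inside the instance method. */
    static boolean staticRunsWhileInstanceLockHeld() {
        MixedLocks target = new MixedLocks();
        CountDownLatch entered = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        Thread holder = new Thread(() -> target.holdInstanceLock(entered, release));
        holder.start();
        try {
            entered.await();
            // The holder thread now owns target's monitor and is parked.
            boolean staticRan = staticHoldsClassLock(); // takes MixedLocks.class instead
            boolean holderStillInside = holder.isAlive();
            release.countDown();
            holder.join();
            return staticRan && holderStillInside;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            release.countDown();
            return false;
        }
    }
}
```

If both methods used the same monitor, the call to the static method would wait for the holder thread, which itself is waiting to be released, and the demonstration would hang instead of returning.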
What if one of the methods is static and one non static? Is it still thread safe, or bad things can happen if multiple threads call the methods?
Synchronization works with a monitor (lock) that is taken on an object.
For a static method that object is the Class object of the class; for an instance method it is this, the object the method is called on.
Since these are different objects, a synchronized static method and a synchronized non-static method will not block each other when called from multiple threads. Both methods can execute simultaneously.
What if one of the methods is static and one non static
Yes, bad things can happen. If you synchronize a static method, you lock on the monitor of the Class object, not on a class instance, i.e. you lock on MyClass.class.
Whereas when you synchronize an instance (non-static) method, you actually lock on the current instance, i.e. this. So you are locking on two different objects: the two methods do not exclude each other, and any state they share is unprotected.
PS : In multi-threading, as a rule of thumb, please remember - If bad things can happen, they will happen.
What if ... Is it still thread safe?
Can't answer that question without a complete example. The question of thread safety is never a question about methods: It's a question about corruption of data structures and about liveness guarantees. Nobody can say whether a program is thread safe without knowing what all of the different threads do, what data they do it to, and how they coordinate (synchronize) with one another.
In the classic "Java Concurrency in Practice", Brian Goetz uses the following snippet of code to demonstrate how to safely publish an object using a private constructor and a factory method:
public class SafeListener {
private final EventListener listener;
private SafeListener() {
listener = new EventListener() {
public void onEvent(Event e) {
doSomething(e);
}
};
}
public static SafeListener newInstance(EventSource source) {
SafeListener safe = new SafeListener();
source.registerListener(safe.listener);
return safe;
}
}
What I can't figure out yet is how this code achieves safe publication through a private constructor.
I am aware that a private constructor is used to prevent instantiation outside of the object but how does that apply to a thread rather than an object? A thread is not necessarily an object and I can't see what stops another thread from acquiring a reference to safe before the constructor finishes execution.
The constructor’s being private has nothing to do with thread-safety. This is an example of the final field publication guarantee. For it to work, the this reference of the object holding the final field must not escape during construction; therefore the factory method constructs the holder instance first and registers the listener afterwards. It’s natural for applications of the factory pattern to have private constructors and public factory methods, but that is not important for achieving thread-safety here.
It’s all about how JIT and the HotSpot optimizer treat the code when performing the optimizations. They know what final or volatile fields are and what a constructor is. They will obey the limitations to the degree of optimization of such constructs. Most important, they ensure that all writes to an object you store in the final field and the write to the final field itself happen-before the constructor ends so that the final field stores happen-before any effect of the registerListener invocation in your example. Therefore, other threads can’t see the listener reference before its correct initialization.
Note that this is something you should rarely rely on. In your example, if the registerListener method of the EventSource is not thread-safe, very bad things can still happen. On the other hand, if it is thread-safe, its thread-safety will apply to the listener constructed before the registration as well, so the final field guarantee would not be needed.
The point here is to prevent this from escaping until the constructor has finished. To this end, the constructor is made private and a factory method is provided which takes care of registering the listener with external code after the object's constructor has finished.
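To see the difference between the two orderings without any threads, here is a sketch. The Listener interface and ImmediateSource are hypothetical stand-ins for EventListener and EventSource, and the source is assumed to fire an event as soon as a listener registers so that the effect is repeatable.

```java
import java.util.ArrayList;
import java.util.List;

interface Listener {
    void onEvent(String event);
}

/** Hypothetical source that calls back immediately on registration. */
class ImmediateSource {
    void registerListener(Listener listener) {
        listener.onEvent("registered");
    }
}

class EscapingListener {
    final List<String> log = new ArrayList<>();
    private final String prefix;

    EscapingListener(ImmediateSource source) {
        source.registerListener(this::handle); // 'this' escapes mid-construction
        this.prefix = "unsafe";                // too late for the first event
    }

    private void handle(String event) {
        log.add(prefix + ":" + event);
    }
}

class FactoryListener {
    final List<String> log = new ArrayList<>();
    private final String prefix;

    private FactoryListener() {
        this.prefix = "safe";
    }

    private void handle(String event) {
        log.add(prefix + ":" + event);
    }

    static FactoryListener newInstance(ImmediateSource source) {
        FactoryListener listener = new FactoryListener(); // fully constructed first
        source.registerListener(listener::handle);        // published afterwards
        return listener;
    }
}
```

The escaping variant handles its first event while its final field is still null, whereas the factory variant only registers after the constructor has returned.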
An example of thread-safe API.
how does that apply to a thread rather than an object? A thread is not necessarily an object and I can't see what stops another thread from acquiring a reference to safe before the constructor finishes execution.
A Thread is always an object in the java.lang.Thread sense, of course. The pattern should not be applied to a thread itself. Instead, it could be applied for cases where a new Thread has to be started "together" with the construction of an object. Starting the thread IN the constructor can allow the reference to the incompletely constructed object to escape. However, with this pattern, the newly constructed instance is trapped in the newInstance method until its construction is entirely complete.
(Or to put it another way: I can't imagine how another thread could acquire a reference to the safe instance before its construction is complete. Maybe you can give an example of how you think this might happen.)