In Effective Java, 2nd Edition, Item 70, Josh Bloch describes thread-hostile classes:
This class is not safe for concurrent use even if all method
invocations are surrounded by external synchronization. Thread
hostility usually results from modifying static data without
synchronization.
Can someone explain, with an example, why it is impossible to achieve thread safety through external synchronization if the class modifies shared static data without internal synchronization?
The quotation assumes that every call to the class's methods is synchronized on the same object instance. For example, consider the following class:
public class Test {
private Set<String> set = new TreeSet<>();
public void add(String s) {
set.add(s);
}
}
While it's not thread-safe, you can safely call the add method this way:
public void safeAdd(Test t, String s) {
synchronized(t) {
t.add(s);
}
}
If safeAdd is called from multiple threads with the same t, the calls are mutually exclusive. If different t objects are used, that is also fine, as independent objects are updated.
However, suppose we declare the set as static:
private static Set<String> set = new TreeSet<>();
This way even different Test objects access the same shared collection. In this case synchronizing on Test instances does not help, because the same set may still be modified concurrently through different Test instances, which may result in data loss, random exceptions, an infinite loop or whatever. So such a class would be thread-hostile.
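To make the point concrete, here is a minimal sketch (the names LockOverlapDemo and canOverlap are hypothetical, and plain Object instances stand in for Test instances). It checks whether two threads can be inside their synchronized blocks at the same moment. With two different instance locks they can, which is exactly why per-instance locking cannot protect a static set. The one-second timeout is an assumption about normal thread scheduling.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

class LockOverlapDemo {

    /** Returns true if two threads were inside their synchronized blocks at the same time. */
    static boolean canOverlap(Object lockA, Object lockB) {
        CountDownLatch bothInside = new CountDownLatch(2);
        boolean[] sawOther = new boolean[2];
        Thread a = new Thread(() -> sawOther[0] = enter(lockA, bothInside));
        Thread b = new Thread(() -> sawOther[1] = enter(lockB, bothInside));
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return sawOther[0] && sawOther[1];
    }

    private static boolean enter(Object lock, CountDownLatch bothInside) {
        synchronized (lock) {
            bothInside.countDown();
            try {
                // Succeeds only if the other thread enters its block while we still hold ours.
                return bothInside.await(1, TimeUnit.SECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) {
        Object t1 = new Object(); // stands in for one Test instance
        Object t2 = new Object(); // stands in for another Test instance
        System.out.println("different instance locks overlap: " + canOverlap(t1, t2));
        System.out.println("one shared lock overlaps: " + canOverlap(t1, t1));
    }
}
```

On a normally scheduled machine the first call should report an overlap and the second should not.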
Can someone explain, with an example, why it is impossible to achieve thread safety through external synchronization if the class modifies shared static data without internal synchronization?
It's not impossible. If the class has methods that access global (i.e., static) data, then you can achieve thread-safety by synchronizing on a global lock.
But forcing the caller to synchronize threads on one global lock still is thread-hostile. A big global lock can be a serious bottleneck in a multi-threaded application. What the author wants you to do is design your class so that it would be sufficient for the client to use a separate lock for each instance of the class.
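A rough sketch of the global-lock approach follows. It assumes that every caller agrees to lock on Test.class; that choice, and the names GlobalLockDemo, safeAdd and run, are illustrative rather than taken from the book. It is correct, but all threads queue on the same lock regardless of which instance they use.

```java
import java.util.Set;
import java.util.TreeSet;

class GlobalLockDemo {

    /** Like the Test class above, but with the static set. */
    static class Test {
        private static final Set<String> set = new TreeSet<>();

        void add(String s) {
            set.add(s);
        }

        static int size() {
            return set.size();
        }
    }

    /** Every caller must use the same global lock; Test.class is one possible choice. */
    static void safeAdd(Test t, String s) {
        synchronized (Test.class) {
            t.add(s);
        }
    }

    /** Starts the given number of threads; each adds perThread distinct strings via its own Test. */
    static int run(int threads, int perThread) {
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            final int id = i;
            workers[i] = new Thread(() -> {
                Test t = new Test();
                for (int j = 0; j < perThread; j++) {
                    safeAdd(t, id + "-" + j);
                }
            });
            workers[i].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        synchronized (Test.class) {
            return Test.size();
        }
    }

    public static void main(String[] args) {
        System.out.println(run(4, 1000)); // 4 threads * 1000 distinct strings each
    }
}
```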
Perhaps a contrived example, but this class is impossible to synchronize externally, because the value is accessible from outside the class:
public class Example {
public static int value;
public void setValue(int newValue) {
this.value = newValue;
}
}
However you synchronize invocation of the setter, you can't guarantee that some other thread isn't changing value.
Given below is a Java class using the Bill Pugh singleton solution.
public class Singleton {
int nonVolatileVariable;
private static class SingletonHelper {
private static Singleton INSTANCE = new Singleton();
}
private Singleton() { }
public static Singleton getInstance() {
return SingletonHelper.INSTANCE;
}
public int getNonVolatileVariable() {
return nonVolatileVariable;
}
public void setNonVolatileVariable(int nonVolatileVariable) {
this.nonVolatileVariable= nonVolatileVariable;
}
}
I have read in many places that this approach is thread safe. If I understand it correctly, the singleton instance is created only once and all threads calling getInstance will receive the same instance of the class Singleton. However, I was wondering whether the threads can locally cache the obtained singleton object. Can they? If yes, wouldn't that mean that each thread can change the instance field nonVolatileVariable to different values, which might create problems?
I know that there are other singleton creation methods such as the enum singleton, but I am particularly interested in this one.
So my question is: is there a need to use the volatile keyword, as in
volatile int nonVolatileVariable;, to make sure that the singleton using this approach is truly thread safe? Or is it already truly thread safe? If so, how?
So my question is: is there a need to use the volatile keyword, as in
volatile int nonVolatileVariable;, to make sure that the
singleton using this approach is truly thread safe? Or is it already
truly thread safe? If so, how?
The singleton pattern ensures that a single instance of the class is created. It doesn't ensure that its fields and methods are thread-safe, and volatile doesn't ensure that either.
However I was wondering if the threads can locally cache the obtained
singleton object ?
According to the memory model in Java, yes they can.
If yes, wouldn't that mean that each thread can change the
instance field nonVolatileVariable to different values, which might
create problems?
Indeed, but you would still have a consistency problem with a volatile variable, because volatile addresses memory visibility but does not make a read-modify-write sequence atomic between threads.
Try the following code, where 100 tasks running on several threads each increment a volatile int once.
You will see that you do not get 100 as the result every time.
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
public class Singleton {
volatile int volatileInt;
private static class SingletonHelper {
private static Singleton INSTANCE = new Singleton();
}
private Singleton() {
}
public static Singleton getInstance() {
return SingletonHelper.INSTANCE;
}
public int getVolatileInt() {
return volatileInt;
}
public void setVolatileInt(int volatileInt ) {
this.volatileInt = volatileInt ;
}
public static void main(String[] args) throws InterruptedException {
ExecutorService executorService = Executors.newFixedThreadPool(5);
List<Callable<Void>> callables = IntStream.range(0, 100)
.mapToObj(i -> {
Callable<Void> callable = () -> {
Singleton.getInstance().setVolatileInt(Singleton.getInstance().getVolatileInt()+1);
return null;
};
return callable;
})
.collect(Collectors.toList());
executorService.invokeAll(callables);
executorService.shutdown(); // otherwise the pool's threads keep the JVM alive
System.out.println(Singleton.getInstance().getVolatileInt());
}
}
To make sure each increment takes the other threads' updates into account, you have to use external synchronization.
For example :
synchronized (Singleton.getInstance()) {
Singleton.getInstance()
.setVolatileInt(Singleton.getInstance().getVolatileInt() + 1);
}
And in this case, volatile is not required any longer.
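As an alternative that this answer does not mention, java.util.concurrent.atomic.AtomicInteger performs the read-modify-write as a single atomic step, so no synchronized block is needed. A minimal sketch, with the hypothetical names AtomicCounterDemo and incrementConcurrently:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

class AtomicCounterDemo {
    private static final AtomicInteger counter = new AtomicInteger();

    /** Runs the given number of increment tasks on a pool of 5 threads and returns the final count. */
    static int incrementConcurrently(int tasks) {
        counter.set(0);
        ExecutorService pool = Executors.newFixedThreadPool(5);
        List<Callable<Integer>> work = new ArrayList<>();
        for (int i = 0; i < tasks; i++) {
            work.add(counter::incrementAndGet); // atomic read-modify-write
        }
        try {
            pool.invokeAll(work); // blocks until every task has completed
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            pool.shutdown();
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(incrementConcurrently(100));
    }
}
```

Unlike the volatile version, this should report 100 on every run, because no increment can be lost between the read and the write.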
The specific guarantee of this type of singleton basically works like this:
1. Each class has a unique lock for class initialization.
2. Any action which could cause class initialization (such as accessing a static method) is required to first acquire this lock, check if the class needs to be initialized and, if so, initialize it.
3. If the JVM can determine that the class is already initialized and the current thread can see the effect of that, then it can skip step 2, including skipping acquiring the lock.
(This is documented in §12.4.2.)
In other words, what's guaranteed here is that all threads must at least see the effects of the assignment in private static Singleton INSTANCE = new Singleton();, and anything else that was performed during static initialization of the SingletonHelper class.
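The guarantee can be observed with a small experiment. This is only a sketch: the counter field and the names HolderInitDemo and distinctInstances are added for illustration and are not part of the original class.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.atomic.AtomicInteger;

class HolderInitDemo {
    static final AtomicInteger constructions = new AtomicInteger();

    static final class Singleton {
        private static class SingletonHelper {
            private static final Singleton INSTANCE = new Singleton();
        }

        private Singleton() {
            constructions.incrementAndGet(); // counts how often the constructor runs
        }

        static Singleton getInstance() {
            return SingletonHelper.INSTANCE;
        }
    }

    /** Calls getInstance() from several threads and returns how many distinct instances were seen. */
    static int distinctInstances(int threads) {
        Set<Singleton> seen = Collections.synchronizedSet(new HashSet<>());
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> seen.add(Singleton.getInstance()));
            workers[i].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return seen.size();
    }

    public static void main(String[] args) {
        System.out.println("distinct instances: " + distinctInstances(8));
        System.out.println("constructor calls:  " + constructions.get());
    }
}
```

Both numbers should be 1: class initialization of SingletonHelper runs once, under the initialization lock, however many threads race to call getInstance().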
Your analysis that concurrent reads and writes of the non-volatile variable can be inconsistent between threads is correct, although the language specification isn't written in terms of caching. The way the language specification is written is that reads and writes can appear out of order. For example, suppose the following sequence of events, listed chronologically:
nonVolatileVariable is 0
ThreadA sets nonVolatileVariable to 1
ThreadB reads nonVolatileVariable (what value should it see?)
The language specification allows ThreadB to see the value 0 when it reads nonVolatileVariable, which is as if the events had happened in the following order:
nonVolatileVariable is 0
ThreadB reads nonVolatileVariable (and sees 0)
ThreadA sets nonVolatileVariable to 1
In practice, this is due to caching, but the language specification doesn't say what may or may not be cached (except here and here, as a brief mention), it only specifies the order of events.
One extra note regarding thread-safety: some actions are always considered atomic, such as reads and writes of object references (§17.7), so there are some cases where the use of a non-volatile variable can be considered thread-safe, but it depends on what you're specifically doing with it. There can still be memory inconsistency but concurrent reads and writes can't somehow interleave, so you can't end up with e.g. an invalid pointer value somehow. It's therefore sometimes safe to use non-volatile variables for e.g. lazily-initialized fields if it doesn't matter that the initialization procedure could happen more than once. I know of at least one place in the JDK where this is used, in java.lang.reflect.Field (also see this comment in the file), but it's not the norm.
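The lazily-initialized-field case can be sketched as follows. This is an illustration of the general idiom, not the code from java.lang.reflect.Field; the class Person and its fields are hypothetical. It relies on two things: the field is read exactly once into a local variable, and the cached value is an immutable String, so it does not matter which thread's result wins.

```java
class RacyCacheDemo {
    static final class Person {
        private final String first;
        private final String last;
        private String fullName; // deliberately not volatile; null means "not computed yet"

        Person(String first, String last) {
            this.first = first;
            this.last = last;
        }

        String fullName() {
            String cached = fullName;        // read the field exactly once
            if (cached == null) {
                cached = first + " " + last; // may run in more than one thread
                fullName = cached;           // reference writes are atomic; String is immutable
            }
            return cached;
        }
    }

    public static void main(String[] args) {
        Person p = new Person("Ada", "Lovelace");
        System.out.println(p.fullName()); // prints "Ada Lovelace"
    }
}
```

Several threads may each compute the value once, but every caller gets an equal result, which is why the missing volatile is tolerable here. The same shortcut would not be safe for a mutable cached object.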
If a class has only two synchronized methods (both static or both non-static), the class is considered to be thread safe. What if one of the methods is static and one is non-static? Is it still thread safe, or can bad things happen if multiple threads call the methods?
There are similar threads, such as "static synchronized and non static synchronized methods in threads", which explain that the method calls do not block each other. But I am curious whether bad things in the world of thread safety (inconsistent state, race conditions, etc.) can happen or not.
Edit 1:
Since static methods can't call non-static methods, there should be no thread conflict from this side. On the other hand, if a non-static method calls the static one, it has to acquire the class lock, which would still be thread safe. So with just two methods (one static, one not) I don't see any thread conflict. Is that right? In other words, the only case where I can see an issue is when the non-static method accesses some static variables. But if all accesses are done through methods then I don't see any issues with thread safety. These were my thoughts. I am not sure whether I am missing something here, since I am a little bit new to Java concurrency.
In Java, the following:
public class MyClass
{
public synchronized void nonStaticMethod()
{
// code
}
public synchronized static void staticMethod()
{
// code
}
}
is equivalent to the following:
public class MyClass
{
public void nonStaticMethod()
{
synchronized(this)
{
// code
}
}
public static void staticMethod()
{
synchronized(MyClass.class)
{
// code
}
}
}
As you see, non-static methods use this as the monitor object, and static methods use the class object as the monitor.
As this and MyClass.class are different objects, static and non-static methods may run concurrently.
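Which monitor each kind of method holds can be checked directly with Thread.holdsLock. A small sketch (the class name MonitorDemo and its methods are hypothetical):

```java
class MonitorDemo {

    /** Static synchronized method: reports {holds class lock, holds instance lock}. */
    static synchronized boolean[] fromStatic(MonitorDemo instance) {
        return new boolean[] {
            Thread.holdsLock(MonitorDemo.class),
            Thread.holdsLock(instance)
        };
    }

    /** Instance synchronized method: reports {holds class lock, holds instance lock}. */
    synchronized boolean[] fromInstance() {
        return new boolean[] {
            Thread.holdsLock(MonitorDemo.class),
            Thread.holdsLock(this)
        };
    }

    public static void main(String[] args) {
        MonitorDemo demo = new MonitorDemo();
        boolean[] s = fromStatic(demo);
        boolean[] i = demo.fromInstance();
        // prints: static method:   class lock=true, instance lock=false
        System.out.println("static method:   class lock=" + s[0] + ", instance lock=" + s[1]);
        // prints: instance method: class lock=false, instance lock=true
        System.out.println("instance method: class lock=" + i[0] + ", instance lock=" + i[1]);
    }
}
```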
To "fix" this, create a dedicated static monitor object and use it in both static and non-static methods:
public class MyClass
{
private static final Object monitor = new Object();
public void nonStaticMethod()
{
synchronized(monitor)
{
// code
}
}
public static void staticMethod()
{
synchronized(monitor)
{
// code
}
}
}
What if one of the methods is static and one non static? Is it still thread safe, or bad things can happen if multiple threads call the methods?
Bad things can happen.
The static method will lock on the class monitor. The instance method will lock on the instance monitor. Since two different lock objects are in use, both methods could execute at the same time from different threads. If they share state (i.e. the instance method accesses static data) you will have problems.
What if one of the methods is static and one non static? Is it still thread safe, or bad things can happen if multiple threads call the methods?
Synchronization works with a monitor (lock) that is taken on an object.
For a static method that object is the Class object (MyClass.class); for an instance method it is this, the object the method is called on.
Since these are different objects, a synchronized static method and a synchronized non-static method do not block each other; both can execute at the same time on different threads.
What if one of the methods is static and one non static
Yes, bad things can happen. If you synchronize a static method, you lock on the monitor of the Class object, not on the class instance, i.e. you lock on MyClass.class.
Whereas when you synchronize an instance (non-static) method, you actually lock on the current instance, i.e. this. So you are locking on two different objects, the two methods do not exclude each other, and any state they share is unprotected.
PS : In multi-threading, as a rule of thumb, please remember - If bad things can happen, they will happen.
What if ... Is it still thread safe?
Can't answer that question without a complete example. The question of thread safety is never a question about methods: It's a question about corruption of data structures and about liveness guarantees. Nobody can say whether a program is thread safe without knowing what all of the different threads do, what data they do it to, and how they coordinate (synchronize) with one another.
Someone somewhere told me that Java constructors are synchronized so that it can't be accessed concurrently during construction, and I was wondering: if I have a constructor that stores the object in a map, and another thread retrieves it from that map before its construction is finished, will that thread block until the constructor completes?
Let me demonstrate with some code:
public class Test {
private static final Map<Integer, Test> testsById =
Collections.synchronizedMap(new HashMap<>());
private static final AtomicInteger atomicIdGenerator = new AtomicInteger();
private final int id;
public Test() {
this.id = atomicIdGenerator.getAndIncrement();
testsById.put(this.id, this);
// Some lengthy operation to fully initialize this object
}
public static Test getTestById(int id) {
return testsById.get(id);
}
}
Assume that put/get are the only operations on the map, so I won't get CME's via something like iteration, and try to ignore other obvious flaws here.
What I want to know is if another thread (that's not the one constructing the object, obviously) tries to access the object using getTestById and calling something on it, will it block? In other words:
Test test = getTestById(someId);
test.doSomething(); // Does this line block until the constructor is done?
I'm just trying to clarify how far the constructor synchronization goes in Java and if code like this would be problematic. I've seen code like this recently that did this instead of using a static factory method, and I was wondering just how dangerous (or safe) this is in a multi-threaded system.
Someone somewhere told me that Java constructors are synchronized so that it can't be accessed concurrently during construction
This is certainly not the case. There is no implied synchronization with constructors. Not only can multiple constructors happen at the same time but you can get concurrency issues by, for example, forking a thread inside of a constructor with a reference to the this being constructed.
if I have a constructor that stores the object in a map, and another thread retrieves it from that map before its construction is finished, will that thread block until the constructor completes?
No it won't.
The big problem with constructors in threaded applications is that the compiler has the permission, under the Java memory model, to reorder the operations inside of the constructor so they take place after (of all things) the object reference is created and the constructor finishes. final fields will be guaranteed to be fully initialized by the time the constructor finishes but not other "normal" fields.
In your case, since you are putting your Test into the synchronized-map and then continuing to do initialization, as @Tim mentioned, this will allow other threads to get ahold of the object in a possibly semi-initialized state. One solution would be to use a static method to create your object:
private Test() {
this.id = atomicIdGenerator.getAndIncrement();
// Some lengthy operation to fully initialize this object
}
public static Test createTest() {
Test test = new Test();
// this put to a synchronized map forces a happens-before of Test constructor
testsById.put(test.id, test);
return test;
}
My example code works because you are using a synchronized map: put enters a synchronized block, and by that point the Test constructor has completed and its writes are made visible to any thread that later gets the object from the map.
The big problems in your example are both the "happens before" guarantee (the constructor may not have finished before the Test is put into the map) and memory synchronization (the constructing thread and the get-ing thread may see different memory for the Test instance). If you move the put outside of the constructor then both are handled by the synchronized map. It doesn't matter which object it synchronizes on; the constructor is guaranteed to have finished before the object is put into the map, and the memory has been synchronized.
I believe that if you called testsById.put(this.id, this); at the very end of your constructor, you might in practice be okay; however, this is not good form and at the least would need careful commenting/documentation. It would not solve the problem if the class were subclassed and initialization were done in the subclass after the super(). The static solution I showed is a better pattern.
Someone somewhere told me that Java constructors are synchronized
'Somebody somewhere' is seriously misinformed. Constructors are not synchronized. Proof:
public class A
{
public A() throws InterruptedException
{
wait();
}
public static void main(String[] args) throws Exception
{
A a = new A();
}
}
This code throws java.lang.IllegalMonitorStateException at the wait() call. If there was synchronization in effect, it wouldn't.
It doesn't even make sense. There is no need for them to be synchronized. A constructor can only be invoked after a new(), and by definition each invocation of new() returns a different value. So there is zero possibility of a constructor being invoked by two threads simultaneously with the same value of this. So there is no need for synchronization of constructors.
if I have a constructor that stores the object in a map, and another thread retrieves it from that map before its construction is finished, will that thread block until the constructor completes?
No. Why would it do that? Who's going to block it? Letting 'this' escape from a constructor like that is poor practice: it allows other threads to access an object that is still under construction.
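The following sketch shows a second thread getting hold of the object while its constructor is still running, without being blocked. All names (EscapeDemo, Widget, the two latches) are hypothetical, and the latches exist only to force the timing that would otherwise happen by chance.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

class EscapeDemo {

    static final class Widget {
        boolean initialized; // set only at the very end of the constructor

        Widget(int id, Map<Integer, Widget> registry,
               CountDownLatch published, CountDownLatch observed) {
            registry.put(id, this);   // 'this' escapes before construction is complete
            published.countDown();
            awaitQuietly(observed);   // stands in for the lengthy initialization
            initialized = true;
        }
    }

    static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Returns what a second thread saw in 'initialized' mid-construction, or null if it saw nothing. */
    static Boolean observeDuringConstruction() {
        Map<Integer, Widget> registry = new ConcurrentHashMap<>();
        CountDownLatch published = new CountDownLatch(1);
        CountDownLatch observed = new CountDownLatch(1);
        Boolean[] seen = new Boolean[1];
        Thread reader = new Thread(() -> {
            awaitQuietly(published);
            Widget w = registry.get(1);   // returns immediately; nothing waits for the constructor
            if (w != null) {
                seen[0] = w.initialized;
            }
            observed.countDown();
        });
        reader.start();
        new Widget(1, registry, published, observed);
        try {
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println("initialized as seen by the other thread: " + observeDuringConstruction());
    }
}
```

The reader obtains the Widget and reads initialized as false, because the constructor has not reached its last line yet.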
You've been misinformed. What you describe is actually referred to as improper publication and discussed at length in the Java Concurrency In Practice book.
So yes, it is possible for another thread to obtain a reference to your object and begin trying to use it before it has finished initializing. But wait, it gets worse. Consider this answer: https://stackoverflow.com/a/2624784/122207 ... basically, the reference assignment and the constructor's completion can be reordered. In the example referenced, one thread can assign h = new Holder(i) and another thread can call h.assertSanity() on the new instance with the timing just right to get two different values for the n member that is assigned in Holder's constructor.
Constructors are just like other methods; there is no additional synchronization (except for the handling of final fields).
The code would work if this were published later:
public Test()
{
// Some lengthy operation to fully initialize this object
this.id = atomicIdGenerator.getAndIncrement();
testsById.put(this.id, this);
}
Although this question is already answered, the code pasted in the question doesn't follow safe construction techniques, as it allows the this reference to escape from the constructor. I would like to share a beautiful explanation presented by Brian Goetz in the article "Java theory and practice: Safe construction techniques" on the IBM developerWorks website.
It's unsafe. There is no additional synchronization in the JVM. You can do something like this:
public class Test {
private final Object lock = new Object();
public Test() {
synchronized (lock) {
// your improper object reference publication
// long initialization
}
}
public void doSomething() {
synchronized (lock) {
// do something
}
}
}
I have a web application running on Tomcat.
There are several calculations that need to be done on multiple places in the web application. Can I make those calculations static helper functions? If the server has enough processor cores, can multiple calls to that static function (resulting from multiple requests to different servlets) run parallel? Or does one request have to wait until the other request finished the call?
public class Helper {
public static int doSomething(int arg1, int arg2) {
// do something with the args (placeholder computation)
int val = arg1 + arg2;
return val;
}
}
If the calls run in parallel:
I have another helper class with static functions, but this class contains a private static member which is used in the static functions. How can I make sure that the functions are thread-safe?
public class Helper {
private static SomeObject obj;
public static void changeMember() {
Helper.obj.changeValue();
}
public static String readMember() {
return Helper.obj.readValue();
}
}
changeValue() and readValue() read/change the same member variable of Helper.obj. Do I have to make the whole static functions synchronized, or just the block where Helper.obj is used? If I should use a block, what object should I use to lock it?
can i make those calculations static helper functions? if the server has enough processor cores, can multiple calls to that static function (resulting from multiple requests to different servlets) run parallel?
Yes, and yes.
do i have to make the whole static functions synchronized
That will work.
or just the block where Helper.obj is used
That will also work.
if i should use a block, what object should i use to lock it?
Use a static Object:
public class Helper {
private static SomeObject obj;
private static final Object mutex = new Object();
public static void changeMember() {
synchronized (mutex) {
obj.changeValue();
}
}
public static String readMember() {
synchronized (mutex) {
return obj.readValue();
}
}
}
Ideally, though, you'd write the helper class to be immutable (stateless or otherwise) so that you just don't have to worry about thread safety.
You should capture the calculations in a class, and create an instance of the class for each thread. What you have now is not threadsafe, as you are aware, and to make it threadsafe you will have to synchronize on the static resource/the methods that access that static resource, which will cause blocking.
Note that there are patterns to help you with this. You can use the strategy pattern (in its canonical form, the strategy must be chosen at runtime, which might or might not apply here) or a variant. Just create a class for each calculation with an execute method (and an interface that has the method), and pass a context object to execute. The context holds all the state of the calculation. One strategy instance per thread, with its context, and you shouldn't have any issues.
If you don't have to share it you can make it thread local, then it doesn't have to be thread safe.
public class Helper {
private static final ThreadLocal<SomeObject> obj = new ThreadLocal<SomeObject>() {
@Override
protected SomeObject initialValue() {
return new SomeObject();
}
};
public static void changeMember() {
Helper.obj.get().changeValue();
}
public static String readMember() {
return Helper.obj.get().readValue();
}
}
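A compact way to see the effect is ThreadLocal.withInitial (available since Java 8). In this sketch the class Counter is a hypothetical stand-in for SomeObject: each thread gets its own instance, so increments made on one thread never show up on another.

```java
class ThreadLocalDemo {
    /** Hypothetical stand-in for SomeObject: a mutable value that is not thread-safe. */
    static final class Counter {
        int value;
    }

    private static final ThreadLocal<Counter> LOCAL = ThreadLocal.withInitial(Counter::new);

    static int incrementAndGet() {
        Counter c = LOCAL.get(); // each thread sees its own Counter
        c.value++;
        return c.value;
    }

    /** Increments twice on a fresh thread and returns the value that thread ended up with. */
    static int countOnNewThread() {
        int[] result = new int[1];
        Thread t = new Thread(() -> {
            incrementAndGet();
            result[0] = incrementAndGet();
        });
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return result[0];
    }

    public static void main(String[] args) {
        System.out.println(countOnNewThread()); // prints 2
        System.out.println(countOnNewThread()); // prints 2 again: the second thread starts from a fresh Counter
    }
}
```

Keep in mind that in a servlet container the worker threads are pooled, so a thread-local value can outlive a single request.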
I'll sum up here what was said in the comments on Matt Ball's answer, since the thread got pretty long and the message got lost. The message was:
In a shared environment like a web/application server you should try very hard to find a solution without synchronizing. Static helpers synchronized on a static object might work well enough for a standalone application with a single user in front of the screen, but in a multi-user/multi-application scenario this would most probably result in very poor performance: it effectively serializes access to your application, and all users have to wait on the same lock. You might not notice the problem for a long time if the calculations are fast enough and the load is evenly distributed.
But then all of a sudden all your users might try to run the calculation at 9am and your app will stop working! Not really stop, but they would all block on the lock and form a huge queue.
Now, regardless of whether shared state is necessary, since you originally named calculations as the subject of synchronization: do their results need to be shared? Or are those calculations specific to a user/session? In the latter case a ThreadLocal as per Peter Lawrey would be enough. Otherwise I'd say that for overall performance it would be better to duplicate the calculations for everybody who needs them in order not to synchronize (depending on the cost).
Session management is also better left to the container: it has been optimized to handle sessions efficiently, including clustering if necessary. I doubt one could build a better solution without investing a lot of work and making lots of bugs along the way. But as Matt Ball has stated, that is better asked separately.
In the first case you don't have to worry about threading issues, because the variables are local to each thread. You correctly identify the problem in the second case, though, because multiple threads will be reading/writing the same object. Synchronizing on the methods will work, as would synchronized blocks.
For the first part:
Yes, these calls are independent and run in parallel when called by different threads.
For the last part:
Use synchronized blocks on the shared object, a dummy object or the class object. Be careful with nested synchronized blocks: they can lead to deadlocks when the locks are acquired in a different order.
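One common way to avoid that kind of deadlock is to acquire nested locks in a fixed global order. The sketch below is an illustration under assumptions: the Account class, its id field and the transfer method are hypothetical and have nothing to do with the helper class in the question.

```java
class LockOrderingDemo {
    static final class Account {
        final int id;   // used only to define a global lock order
        long balance;

        Account(int id, long balance) {
            this.id = id;
            this.balance = balance;
        }
    }

    /** Always locks the account with the smaller id first, whatever the transfer direction. */
    static void transfer(Account from, Account to, long amount) {
        Account first = from.id < to.id ? from : to;
        Account second = first == from ? to : from;
        synchronized (first) {
            synchronized (second) {
                from.balance -= amount;
                to.balance += amount;
            }
        }
    }

    /** Two threads transfer in opposite directions; returns the final balances {a, b}. */
    static long[] run(int transfersPerThread) {
        Account a = new Account(1, 1000);
        Account b = new Account(2, 1000);
        Thread t1 = new Thread(() -> {
            for (int i = 0; i < transfersPerThread; i++) {
                transfer(a, b, 1);
            }
        });
        Thread t2 = new Thread(() -> {
            for (int i = 0; i < transfersPerThread; i++) {
                transfer(b, a, 1);
            }
        });
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new long[] { a.balance, b.balance };
    }

    public static void main(String[] args) {
        long[] balances = run(10_000);
        System.out.println(balances[0] + " " + balances[1]); // prints "1000 1000"
    }
}
```

If transfer simply locked from and then to, the two threads could each hold one lock and wait forever for the other.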
If you are worried about synchronization and thread safety, don't use static helpers. Create a normal class with your helper methods and create an instance upon servlet request. Keep it simple :-)
I have a fairly trivial static variable question. I'm building out a solution that loosely follows the path of RMI. On my server, I have a ComputeEngine class that will execute 'Tasks' (class instances with an 'execute' method). However, the ComputeEngine will contain a global variable that will need to be accessed by different tasks, each executing in its own thread. What's the best way to give access to this? I want to keep everything as loosely coupled as possible. The shared global static variable in my ComputeEngine class will be a List. Should I have a getter for this static variable? I will have a read/write lock in my ComputeEngine class to give access to my global List. This too will be static and will need to be shared. I'm looking for best practice on how to provide access to a global static variable in a class.
If you want to decouple it, the best way is to pass a callback object when you create the Task.
interface FooListManipulator {
void addFoo( Foo f );
List<Foo> getFooList();
}
class Task {
private FooListManipulator fooListManipulator;
public Task( FooListManipulator fooListManipulator ) {
this.fooListManipulator = fooListManipulator;
}
}
This way the Task itself doesn't have to assume anything about who created it and how the list is stored.
And in your ComputeEngine you will do something like this:
class ComputeEngine {
private static final List<Foo> fooList = new ArrayList<>(); // must be non-null before it is used as a lock
class Manipulator implements FooListManipulator {
public void addFoo( Foo f ) {
synchronized( fooList ) {
fooList.add( f );
}
}
public List<Foo> getFooList() {
return Collections.unmodifiableList( fooList );
}
}
private Task createTask() {
return new Task( new Manipulator() );
}
}
If you want to change the storage of fooList later (which you should really consider, as static global variables aren't a great idea), Task will remain unchanged. Plus you will be able to unit test Task with a mock manipulator.
I'm looking for best practice on how
to provide access to a global static
variable
By best practices, you shouldn't have such variables.
Should I have a getter for this static
variable? I will have a read/write
lock in my ComputeEngine class to give
access to my global List.
No, you shouldn't provide such a getter. Just provide an addTask(Task task) or execute(task) method. Method synchronization will be a workable solution.
Nooooooooooooooooooooooooooooooooooooooooooooooooooo!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Static mutables are Bad, and dressing them up as Singletons just makes matters worse. Pass objects through constructors as necessary. And give objects sensible behaviour.
In the case of RMI, by default you are loading untrusted code from wherever directed by the client (top tip, when using RMI, use -Djava.rmi.server.useCodebaseOnly=true). As a global static, this code can fiddle with your server state (assuming in an accessible class loader, etc).
Don't return your list from the getter, as you won't know what people will do with it (they may add things and break your locking). So do this:
static synchronized List getTheList() {
return new ArrayList(theList);
}
Only implement the getter if someone actually needs it.
Don't implement any setters; instead implement addItemToList() and removeItemFromList().
Other than that, having a global static variable is frowned upon...
I have several recommendations for you:
Your "Task" sounds like a Runnable, change "execute" to "run" and you get a lot of stuff for free. Like all the awesome classes in java.util.concurrent.
Make ComputeEngine itself a Singleton via the technique in this post. To be clear, use the "enum" approach from Josh Bloch (2nd answer on that question).
Make your List a member of ComputeEngine
Tasks use ComputeEngine.saveResult(...), which modifies the List.
Consider using java.util.concurrent.Executors to manage your pool of Tasks.
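Put together, these recommendations might look roughly like this. It is a sketch under assumptions: the result type String, the method names saveResult, snapshot and runTasks, and the pool size of 4 are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class ComputeEngineDemo {
    /** Enum singleton; the list is an instance member guarded by the engine's own monitor. */
    enum ComputeEngine {
        INSTANCE;

        private final List<String> results = new ArrayList<>();

        synchronized void saveResult(String result) {
            results.add(result);
        }

        synchronized List<String> snapshot() {
            return new ArrayList<>(results); // a copy, so callers never touch the guarded list
        }
    }

    /** Submits 'count' Runnable tasks and returns how many results they added. */
    static int runTasks(int count) {
        int before = ComputeEngine.INSTANCE.snapshot().size();
        ExecutorService pool = Executors.newFixedThreadPool(4);
        for (int i = 0; i < count; i++) {
            final int n = i;
            Runnable task = () -> ComputeEngine.INSTANCE.saveResult("task-" + n);
            pool.execute(task);
        }
        pool.shutdown(); // no new tasks; already submitted ones still run
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return ComputeEngine.INSTANCE.snapshot().size() - before;
    }

    public static void main(String[] args) {
        System.out.println(runTasks(50));
    }
}
```

The tasks never see the list itself, only saveResult, so the locking policy stays inside ComputeEngine.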
Following @biziclop's answer, but with a different separation:
You can separate your code into the following parts.
interface Task {
void execute();
}
public final class TaskExecutor {
private final List<Task> tasks;
TaskExecutor(List<Task> tasks) { this.tasks = tasks; }
void addTask(Task task) { synchronized (tasks) { tasks.add(task); } }
}
Then,
public class SomeTaskAdder {
private final TaskExecutor executor;
SomeTaskAdder(TaskExecutor executor) { this.executor = executor; }
void foo() {
executor.addTask(new GoodTask(/* ... */));
}
}
public class SomeTasksUser {
SomeTasksUser(List<Task> tasks) { synchronized (tasks) { /* use the tasks */ } }
}
Then you create your objects with some constructor injection.
Everyone seems to be asserting you're using the List to keep a queue of Tasks, but I don't actually see that in your question. But if it is, or if the manipulations to the list are otherwise independent (that is, if you are simply adding to or removing from the list as opposed to, say, scanning the list and removing some items from the middle as a part of the job), then you should consider using a BlockingQueue or another concurrent Queue/Deque implementation from the java.util.concurrent package instead of a List. These types do not require external lock management.
If you genuinely need a List that is accessed concurrently by each job, where the reads and writes to the list are not independent, I would encapsulate the part of the processing that does this manipulation in a singleton and use an exclusive lock to control each thread's use of the list. For instance, if your list contains some sort of aggregate statistics which are only a part of the process execution, then I would make the job one class and the singleton aggregate statistics a separate class.
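As a rough illustration of the queue-based approach, here is a sketch using LinkedBlockingQueue. The producer/consumer split and the name produceAndConsume are assumptions made for the example, not something taken from the question.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class QueueDemo {

    /** Two producers add 1..n each; the calling thread takes everything and returns the sum. */
    static int produceAndConsume(int itemsPerProducer) {
        // Default capacity is Integer.MAX_VALUE, so add() will not fail for lack of space here.
        BlockingQueue<Integer> queue = new LinkedBlockingQueue<>();
        Runnable producer = () -> {
            for (int i = 1; i <= itemsPerProducer; i++) {
                queue.add(i); // no external lock needed
            }
        };
        Thread p1 = new Thread(producer);
        Thread p2 = new Thread(producer);
        p1.start();
        p2.start();
        int sum = 0;
        try {
            for (int i = 0; i < 2 * itemsPerProducer; i++) {
                sum += queue.take(); // blocks until an element is available
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(produceAndConsume(100)); // prints 10100 (2 * 5050)
    }
}
```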
class AggregateStatistics {
private static final AggregateStatistics aggregateStatistics =
new AggregateStatistics();
public static AggregateStatistics getAggregateStatistics () {
return aggregateStatistics;
}
private List list = new ArrayList ();
private Lock lock = new ReentrantLock();
public void updateAggregates (...) {
lock.lock();
try {
/* Mutation of the list */
}
finally {
lock.unlock();
}
}
}
Then have your task enter this portion of the job by accessing the singleton and calling the method on it which is managed with a lock.
Never pass a mutable collection into a concurrent environment; it will only cause you problems. You can always pass around an immutable "wrapper" though, if it's really suitable, by using java.util.Collections.unmodifiableList(List) and similar methods.
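One caveat: unmodifiableList returns a read-only view of the underlying list, not a copy, so a reader can still observe changes made by the owner while iterating. The sketch below therefore wraps a copy taken under the lock. The copy is my addition, and the names ReadOnlyViewDemo, snapshot and rejectsMutation are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class ReadOnlyViewDemo {
    private final List<String> items = new ArrayList<>();

    synchronized void add(String item) {
        items.add(item);
    }

    /** Read-only snapshot: a copy taken under the lock, wrapped so callers cannot mutate it. */
    synchronized List<String> snapshot() {
        return Collections.unmodifiableList(new ArrayList<>(items));
    }

    /** Returns true if the list refuses to be modified. */
    static boolean rejectsMutation(List<String> list) {
        try {
            list.add("x");
            return false;
        } catch (UnsupportedOperationException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        ReadOnlyViewDemo demo = new ReadOnlyViewDemo();
        demo.add("a");
        List<String> snap = demo.snapshot();
        demo.add("b");
        System.out.println(rejectsMutation(snap)); // prints true
        System.out.println(snap.size());           // prints 1: the snapshot is unaffected by later adds
    }
}
```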