Protecting a static class variable - java

I have a fairly trivial static variable question. I'm building out a solution that loosely follows the path of RMI. On my server, I have a ComputeEngine class that will execute 'Tasks' (class instances with an 'execute' method). However, the ComputeEngine will contain a global variable that will need to be accessed by different tasks, each executing in its own thread. What's the best way to give access to this? I want to keep everything as loosely coupled as possible. The shared global static variable in my ComputeEngine class will be a List. Should I have a getter for this static variable? I will have a read/write lock in my ComputeEngine class to give access to my global List. This too will be static and will need to be shared. I'm looking for best practice on how to provide access to a global static variable in a class.

If you want to decouple it, the best way is to pass a callback object when you create the Task.
interface FooListManipulator {
void addFoo( Foo f );
List<Foo> getFooList();
}
class Task {
private FooListManipulator fooListManipulator;
public Task( FooListManipulator fooListManipulator ) {
this.fooListManipulator = fooListManipulator;
}
}
This way the Task itself doesn't have to assume anything about who created it and how the list is stored.
And in your ComputeEngine you will do something like this:
class ComputeEngine {
private static final List<Foo> fooList = new ArrayList<>(); // must be non-null before synchronizing on it
class Manipulator implements FooListManipulator {
public void addFoo( Foo f ) {
synchronized( fooList ) {
fooList.add( f );
}
}
public List<Foo> getFooList() {
return Collections.unmodifiableList( fooList );
}
}
private Task createTask() {
return new Task( new Manipulator() );
}
}
If you want to change the storage of fooList later (which you should really consider, as static global variables aren't a great idea), Task will remain unchanged. Plus you will be able to unit test Task with a mock manipulator.
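To illustrate the unit-testing point, here is a minimal sketch of testing Task against a hand-written test double. The Foo class, the execute() body and the RecordingManipulator name are assumptions made up for the example; the answer above does not define them.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical element type; the answer never defines Foo.
class Foo {
    final String name;
    Foo(String name) { this.name = name; }
}

interface FooListManipulator {
    void addFoo(Foo f);
    List<Foo> getFooList();
}

// Task only knows the callback interface, not ComputeEngine.
class Task {
    private final FooListManipulator fooListManipulator;

    Task(FooListManipulator fooListManipulator) {
        this.fooListManipulator = fooListManipulator;
    }

    // Assumed behaviour for illustration: each run records one Foo.
    void execute() {
        fooListManipulator.addFoo(new Foo("result"));
    }
}

// Test double: records calls in a plain list. No locking is needed
// because the test itself is single-threaded.
class RecordingManipulator implements FooListManipulator {
    private final List<Foo> recorded = new ArrayList<>();

    @Override
    public void addFoo(Foo f) {
        recorded.add(f);
    }

    @Override
    public List<Foo> getFooList() {
        return Collections.unmodifiableList(recorded);
    }
}
```

A test then creates a RecordingManipulator, runs new Task(mock).execute(), and checks what ended up in mock.getFooList(), without ever touching ComputeEngine or its static list.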

I'm looking for best practice on how
to provide access to a global static
variable
By best practices, you shouldn't have such variables.
Should I have a getter for this static
variable? I will have a read/write
lock in my ComputeEngine class to give
access to my global List.
No, you shouldn't provide such a getter. Just provide an addTask(Task task) or execute(task) method. Synchronizing that method will be a workable solution.

Nooooooooooooooooooooooooooooooooooooooooooooooooooo!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Static mutables are Bad, and dressing them up as Singletons just makes matters worse. Pass objects through constructors as necessary. And give objects sensible behaviour.
In the case of RMI, by default you are loading untrusted code from wherever directed by the client (top tip, when using RMI, use -Djava.rmi.server.useCodebaseOnly=true). As a global static, this code can fiddle with your server state (assuming in an accessible class loader, etc).

don't return your list from the getter, as you won't know what people will do with it (they may add things and break your locking). So do this:
static synchronized List getTheList() {
return new ArrayList(theList);
}
and only implement the getter if anyone actually needs it
don't implement any setters; instead implement addItemToList() and removeItemFromList()
other than that, having a global static variable is frowned upon...
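Putting those points together, a rough sketch could look like the following. ListHolder and the String element type are placeholders I picked for the example, not anything from the question.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical holder class; all access to the list goes through
// synchronized static methods that share the class-level lock.
class ListHolder {
    private static final List<String> theList = new ArrayList<>();

    static synchronized void addItemToList(String item) {
        theList.add(item);
    }

    static synchronized boolean removeItemFromList(String item) {
        return theList.remove(item);
    }

    // Returns a copy, so callers cannot modify the shared list
    // behind the lock's back.
    static synchronized List<String> getTheList() {
        return new ArrayList<>(theList);
    }
}
```

Whatever a caller does with the returned copy (adding, sorting, clearing) has no effect on the shared list.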

I have several recommendations for you:
Your "Task" sounds like a Runnable, change "execute" to "run" and you get a lot of stuff for free. Like all the awesome classes in java.util.concurrent.
Make ComputeEngine itself a Singleton via the technique in this post. To be clear, use the "enum" approach from Josh Bloch (2nd answer on that question).
Make your List a member of ComputeEngine
Tasks use ComputeEngine.saveResult(...), which modifies the List.
Consider using java.util.concurrent.Executors to manage your pool of Tasks.
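The recommendations above could be sketched roughly like this. It is only an outline under my own assumptions: the String result type, the snapshot() and runAll() methods and the 10 second timeout are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Enum-based singleton: the list is an instance member, not a static.
enum ComputeEngine {
    INSTANCE;

    private final List<String> results = new ArrayList<>();

    // Tasks call this instead of touching the list directly.
    synchronized void saveResult(String result) {
        results.add(result);
    }

    // Hypothetical read accessor: hands out a copy, never the list itself.
    synchronized List<String> snapshot() {
        return new ArrayList<>(results);
    }

    // Runs the given Runnables on a fixed-size pool and waits for them.
    void runAll(List<Runnable> tasks, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (Runnable task : tasks) {
            pool.execute(task);
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                pool.shutdownNow();
            }
        } catch (InterruptedException e) {
            pool.shutdownNow();
            Thread.currentThread().interrupt();
        }
    }
}
```

A task is then just a Runnable such as () -> ComputeEngine.INSTANCE.saveResult("done"), handed to runAll.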

Following @biziclop's answer, but with a different separation:
You can separate your code into the following parts.
interface Task {
void execute();
}
public final class TaskExecutor {
private final List<Task> tasks;
TaskExecutor(List<Task> tasks) { this.tasks = tasks; }
void addTask(Task task) { synchronized (tasks) { tasks.add(task); } }
}
Then,
public class SomeTaskAdder {
private final TaskExecutor executor;
SomeTaskAdder(TaskExecutor executor) { this.executor = executor; }
void foo() {
executor.addTask(new GoodTask(/* bla-bla */));
}
}
public class SomeTasksUser {
SomeTasksUser(List<Task> tasks) { synchronized (tasks) { /* bla-bla */ } }
}
Then, you should create your objects with some magic constructor injection.

Everyone seems to be asserting you're using the List to keep a queue of Tasks, but I don't actually see that in your question. But if it is, or if the manipulations to the list are otherwise independent (that is, if you are simply adding to or removing from the list, as opposed to, say, scanning the list and removing some items from the middle as a part of the job), then you should consider using a BlockingQueue or BlockingDeque instead of a List and simply leveraging the java.util.concurrent package. These types do not require external lock management.
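As a rough illustration of the queue approach, here is a sketch in which several producer threads add to a LinkedBlockingQueue while the calling thread drains it, with no explicit locks anywhere. QueueDemo and its parameters are made-up names for the example.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class QueueDemo {
    // Starts the producers, consumes everything they add, and returns
    // the number of items consumed.
    static int produceAndConsume(int producers, int itemsPerProducer) {
        BlockingQueue<Integer> queue = new LinkedBlockingQueue<>();
        Thread[] threads = new Thread[producers];
        for (int p = 0; p < producers; p++) {
            threads[p] = new Thread(() -> {
                for (int i = 0; i < itemsPerProducer; i++) {
                    queue.add(i); // thread-safe without any external lock
                }
            });
            threads[p].start();
        }
        int consumed = 0;
        try {
            int expected = producers * itemsPerProducer;
            while (consumed < expected) {
                queue.take(); // blocks until an item is available
                consumed++;
            }
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return consumed;
    }
}
```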
If you genuinely require a List that is concurrently accessed by each job, where the reads and writes to the list are not independent, I would encapsulate the part of the processing that does this manipulation in a singleton, and use an exclusive lock so that only one thread at a time makes use of the list. For instance, if your list contains some sort of aggregate statistics which are only a part of the process execution, then I would have my job be one class and the singleton aggregate statistics be a separate class.
class AggregateStatistics {
private static final AggregateStatistics aggregateStatistics =
new AggregateStatistics();
public static AggregateStatistics getAggregateStatistics () {
return aggregateStatistics;
}
private List list = new ArrayList ();
private Lock lock = new ReentrantLock();
public void updateAggregates (...) {
lock.lock();
try {
/* Mutation of the list */
}
finally {
lock.unlock();
}
}
}
Then have your task enter this portion of the job by accessing the singleton and calling the method on it which is managed with a lock.
Never pass a mutable collection into a concurrent environment; it will only cause you problems. You can always pass around a read-only "wrapper" though, if it's really suitable, by using java.util.Collections.unmodifiableList(List) and similar methods.

Related

Are Constructors Synchronized Until Totally Complete?

I'm building a program that requires the construction of some objects that require such intense computation to create, my smartest course would be to have them built in their own dedicated threads, while the master thread keeps grinding away on other things until the objects are needed.
So I thought about creating a special class specifically designed to create custom objects in their own thread. Like so:
public abstract class DedicatedThreadBuilder<T> {
private T object;
public DedicatedThreadBuilder() {
DedicatedThread dt = new DedicatedThread(this);
dt.start();
}
private void setObject(T i) {
object = i;
}
protected abstract T constructObject();
public synchronized T getObject() {
return object;
}
private class DedicatedThread extends Thread {
private DedicatedThreadBuilder dtb;
public DedicatedThread(DedicatedThreadBuilder builder){
dtb = builder;
}
public void run() {
synchronized(dtb) {
dtb.setObject(dtb.constructObject());
}
}
}
}
My only concern is that this mechanism will only work properly if the master thread (i.e. the thread that constructs the DedicatedThreadBuilder) has a synchronized lock on the DedicatedThreadBuilder until its construction is completed, and therefore blocks the DedicatedThread's attempt to build the product object until it has finished construction of the DedicatedThreadBuilder. Why? Because the subclasses of DedicatedThreadBuilder will no doubt need to be constructed with parameters that will need to be passed into their own private storage, so that they can be used in the constructObject() process.
e.g.
public class JellybeanStatisticBuilder extends DedicatedThreadBuilder<JellybeanStatistics> {
private int greens;
private int blacks;
private int yellows;
public JellybeanStatisticBuilder(int g, int b, int y) {
super();
greens = g;
blacks = b;
yellows = y;
}
protected JellybeanStatistics constructObject() {
return new JellybeanStatistics(greens, blacks, yellows);
}
}
This will only work properly if the object is blocked to other threads until after it is completely constructed. Otherwise, the DedicatedThread might try to build the object before the necessary variables have been assigned.
So is that how Java works?
I think what you want is to have some sort of synchronised factory class:
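import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Supplier;
public class SyncFactory<T> {
// alternatively, use newCachedThreadPool or newFixedThreadPool if you want to allow some degree of parallel construction
private final ExecutorService executor = Executors.newSingleThreadExecutor();
// "new T()" does not compile in Java, so the caller passes in the construction logic
public Future<T> create(Supplier<T> constructor) {
Callable<T> job = constructor::get;
return executor.submit(job);
}
}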
Now you'd replace usages of T that may need to happen before T is ready with Future<T>, and have a choice between calling its Future.get() method to block until it's ready, set a timeout, or to call Future.isDone() to check up on the construction without blocking. In addition, instead of polling the Future, you may want to have the factory call a callback or post an event to notify the main thread when it has completed construction.
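For what it's worth, the calling side might look something like the sketch below. FutureDemo, expensiveBuild() and the 5 second timeout are placeholders of mine; a real caller would do useful work between submitting the job and calling get().

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class FutureDemo {
    // Hypothetical stand-in for an object that is expensive to construct.
    static String expensiveBuild() {
        return "built";
    }

    // Builds the object on another thread; returns null on failure or timeout.
    static String buildInBackground() {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            Callable<String> job = FutureDemo::expensiveBuild;
            Future<String> future = executor.submit(job);

            // Non-blocking check; may be true or false depending on timing.
            System.out.println("done already? " + future.isDone());

            // Blocks until the result is ready, but for at most 5 seconds.
            return future.get(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        } catch (ExecutionException | TimeoutException e) {
            return null;
        } finally {
            executor.shutdown();
        }
    }
}
```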
If (big if) this is truly needed (think that one over first)...
The overall idea you are heading toward can work, but your code is confusing and, at first glance at least, looks like it might not work. Complexity of this kind, which can easily break things, is a very good reason to think twice, or even three times, before heading down this path.
The major problem I spotted right away is that you are only ever creating 1 instance of the object. If this is a factory which just creates things on another thread, then the DedicatedThread should be called upon in DedicatedThreadBuilder's constructObject, not in its constructor.
If, on the other hand, you actually intend for the DedicatedThreadBuilder to only create 1 instance of T, then this abstraction seems unnecessary... just move DedicatedThread's behavior out to DedicatedThreadBuilder, as DedicatedThreadBuilder doesn't really seem to be doing anything extra.
Second, a minor thing that isn't incorrect so much as it is just unnecessary: you have an inner class which you pass an instance of the outer class to its constructor (that is, DedicatedThread's constructor takes a reference to its parent DedicatedThreadBuilder). This is unnecessary, as non-static inner classes are already linked to their outer classes, so the inner class can reference the outer class without any extra reference to it.
Third, if you move the behavior out of the constructor and into a separate method, then you could synchronize that. Personally, I would have had the constructObject be the thing that kicked off the process, so that calling dtb.constructObject() started the object's creation, and constructObject itself set object = newlyCreatedThing when it was done. Then you could synchronize that method if you want, or do whatever, and not have to worry about the constructor possibly not behaving how you want - in my opinion you should not generally have to worry that a constructor might have some odd side effects.
Fourth, do you have any way to know when the object is ready and available for getting? You might want to add some mechanism for that, such as an observer or other callback.
The problem is that you are using the subclass before it is constructed. It doesn't really have anything to do with multithreading. If you were calling constructObject directly from the DedicatedThreadBuilder constructor, it would be just as bad.
The reasonable implementation that is closest to what you have is just to provide DedicatedThreadBuilder with a separate start() method that should be called after the object is constructed.
Or you could have it extend Thread and use the Thread methods.
Or you could have it implement Runnable so you could use it with a Thread or an Executor or whatever.
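A minimal sketch of the separate start() idea, combined with implementing Runnable, might look like this. StartableBuilder and SumBuilder are hypothetical names, and having getObject() wait via join() is just one possible design.

```java
// Construction of the builder and the start of the worker thread are
// separate steps, so subclass fields are assigned before the thread runs.
abstract class StartableBuilder<T> implements Runnable {
    private T object;
    private Thread worker;

    protected abstract T constructObject();

    // Call this only after the (sub)class constructor has returned.
    public synchronized void start() {
        if (worker == null) {
            worker = new Thread(this);
            worker.start();
        }
    }

    @Override
    public void run() {
        T built = constructObject();
        synchronized (this) {
            object = built;
        }
    }

    // Waits for the worker thread to finish, then returns the product.
    public T getObject() throws InterruptedException {
        Thread w;
        synchronized (this) {
            w = worker;
        }
        if (w == null) {
            throw new IllegalStateException("start() was not called");
        }
        w.join();
        synchronized (this) {
            return object;
        }
    }
}

// Hypothetical subclass whose product depends on constructor arguments.
class SumBuilder extends StartableBuilder<Integer> {
    private final int a;
    private final int b;

    SumBuilder(int a, int b) {
        this.a = a;
        this.b = b;
    }

    @Override
    protected Integer constructObject() {
        return a + b;
    }
}
```

Usage is new SumBuilder(2, 3), then start(), then getObject() whenever the result is needed. Because Thread.start() is called after the constructor has completed, the worker sees the assigned fields.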

How to avoid creating Thread hostile classes in Java

In Effective Java, 2nd Edition, Item 70, Josh Bloch describes thread-hostile classes:
This class is not safe for concurrent use even if all method
invocations are surrounded by external synchronization. Thread
hostility usually results from modifying static data without
synchronization
Can someone explain, with an example, how it's impossible to achieve thread safety by external synchronization if the class modifies shared static data without internal synchronization?
The citation assumes the synchronization of every call to the class method on the same object instance. For example, consider the following class:
public class Test {
private Set<String> set = new TreeSet<>();
public void add(String s) {
set.add(s);
}
}
While it's not thread-safe, you can safely call the add method this way:
public void safeAdd(Test t, String s) {
synchronized(t) {
t.add(s);
}
}
If safeAdd is called from multiple threads with the same t, the calls will be mutually exclusive. If a different t is used, that's also fine, as independent objects are updated.
However consider that we declare the set as static:
private static Set<String> set = new TreeSet<>();
This way even different Test objects access the shared collection. So in this case synchronizing on Test instances will not help, as the same set may still be modified concurrently from different Test instances, which may result in data loss, random exceptions, an infinite loop or whatever. Such a class would be thread-hostile.
Can someone explain, with an example, how it's impossible to achieve thread safety by external synchronization if the class modifies shared static data without internal synchronization?
It's not impossible. If the class has methods that access global (i.e., static) data, then you can achieve thread-safety by synchronizing on a global lock.
But forcing the caller to synchronize threads on one global lock still is thread-hostile. A big global lock can be a serious bottleneck in a multi-threaded application. What the author wants you to do is design your class so that it would be sufficient for the client to use a separate lock for each instance of the class.
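To make the global-lock point concrete, here is a small sketch. SharedRegistry and GlobalLockDemo are invented names, and using SharedRegistry.class as the lock object is an assumption; any single lock shared by every caller would do.

```java
import java.util.Set;
import java.util.TreeSet;

// Modifies static data without internal synchronization.
class SharedRegistry {
    private static final Set<String> set = new TreeSet<>();

    public void add(String s) {
        set.add(s);
    }

    public static int size() {
        return set.size();
    }
}

class GlobalLockDemo {
    // Each thread uses its own SharedRegistry instance, so locking on the
    // instance would not help. Locking on one global object does.
    static int run(int threads, int perThread) {
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int id = t;
            workers[t] = new Thread(() -> {
                SharedRegistry own = new SharedRegistry();
                for (int i = 0; i < perThread; i++) {
                    synchronized (SharedRegistry.class) { // the one global lock
                        own.add(id + "-" + i);
                    }
                }
            });
            workers[t].start();
        }
        try {
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        synchronized (SharedRegistry.class) {
            return SharedRegistry.size();
        }
    }
}
```

This is correct, but every caller in the application has to know about and contend for the same lock, which is exactly the bottleneck described above.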
Perhaps a contrived example, but this class is impossible to synchronize externally, because the value is accessible from outside the class:
public class Example {
public static int value;
public void setValue(int newValue) {
this.value = newValue;
}
}
However you synchronize invocation of the setter, you can't guarantee that some other thread isn't changing value.

Constructor synchronization in Java

Someone somewhere told me that Java constructors are synchronized so that it can't be accessed concurrently during construction, and I was wondering: if I have a constructor that stores the object in a map, and another thread retrieves it from that map before its construction is finished, will that thread block until the constructor completes?
Let me demonstrate with some code:
public class Test {
private static final Map<Integer, Test> testsById =
Collections.synchronizedMap(new HashMap<>());
private static final AtomicInteger atomicIdGenerator = new AtomicInteger();
private final int id;
public Test() {
this.id = atomicIdGenerator.getAndIncrement();
testsById.put(this.id, this);
// Some lengthy operation to fully initialize this object
}
public static Test getTestById(int id) {
return testsById.get(id);
}
}
Assume that put/get are the only operations on the map, so I won't get CME's via something like iteration, and try to ignore other obvious flaws here.
What I want to know is if another thread (that's not the one constructing the object, obviously) tries to access the object using getTestById and calling something on it, will it block? In other words:
Test test = getTestById(someId);
test.doSomething(); // Does this line block until the constructor is done?
I'm just trying to clarify how far the constructor synchronization goes in Java and if code like this would be problematic. I've seen code like this recently that did this instead of using a static factory method, and I was wondering just how dangerous (or safe) this is in a multi-threaded system.
Someone somewhere told me that Java constructors are synchronized so that it can't be accessed concurrently during construction
This is certainly not the case. There is no implied synchronization with constructors. Not only can multiple constructors happen at the same time but you can get concurrency issues by, for example, forking a thread inside of a constructor with a reference to the this being constructed.
if I have a constructor that stores the object in a map, and another thread retrieves it from that map before its construction is finished, will that thread block until the constructor completes?
No it won't.
The big problem with constructors in threaded applications is that the compiler has the permission, under the Java memory model, to reorder the operations inside of the constructor so they take place after (of all things) the object reference is created and the constructor finishes. final fields will be guaranteed to be fully initialized by the time the constructor finishes but not other "normal" fields.
In your case, since you are putting your Test into the synchronized-map and then continuing to do initialization, as @Tim mentioned, this will allow other threads to get ahold of the object in a possibly semi-initialized state. One solution would be to use a static method to create your object:
private Test() {
this.id = atomicIdGenerator.getAndIncrement();
// Some lengthy operation to fully initialize this object
}
public static Test createTest() {
Test test = new Test();
// this put to a synchronized map forces a happens-before of Test constructor
testsById.put(test.id, test);
return test;
}
My example code works because you are dealing with a synchronized map: its put method synchronizes internally, which ensures that the Test constructor has completed and that its writes have been memory synchronized before the object is published.
The big problems in your example are both the "happens before" guarantee (the constructor may not finish before Test is put into the map) and memory synchronization (the constructing thread and the get-ing thread may see different memory for the Test instance). If you move the put outside of the constructor then both are handled by the synchronized-map. It doesn't matter which object the map synchronizes on; what matters is that the constructor has finished before the object is put into the map and that the memory has been synchronized.
I believe that if you called testsById.put(this.id, this); at the very end of your constructor, you may in practice be okay however this is not good form and at the least would need careful commenting/documentation. This would not solve the problem if the class was subclassed and initialization was done in the subclass after the super(). The static solution I showed is a better pattern.
Someone somewhere told me that Java constructors are synchronized
'Somebody somewhere' is seriously misinformed. Constructors are not synchronized. Proof:
public class A
{
public A() throws InterruptedException
{
wait();
}
public static void main(String[] args) throws Exception
{
A a = new A();
}
}
This code throws java.lang.IllegalMonitorStateException at the wait() call. If there was synchronization in effect, it wouldn't.
It doesn't even make sense. There is no need for them to be synchronized. A constructor can only be invoked after a new(), and by definition each invocation of new() returns a different value. So there is zero possibility of a constructor being invoked by two threads simultaneously with the same value of this. So there is no need for synchronization of constructors.
if I have a constructor that stores the object in a map, and another thread retrieves it from that map before its construction is finished, will that thread block until the constructor completes?
No. Why would it do that? Who's going to block it? Letting 'this' escape from a constructor like that is poor practice: it allows other threads to access an object that is still under construction.
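If you want to see this for yourself, here is a small deterministic sketch. Leaky, EscapeDemo and the latches are all invented for the demonstration: the latches hold the constructor open so the other thread is guaranteed to look at the object mid-construction.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;

class Leaky {
    static final Map<Integer, Leaky> byId = new ConcurrentHashMap<>();

    // Stays false until the very end of the constructor.
    volatile boolean initialized;

    Leaky(int id, CountDownLatch published, CountDownLatch mayFinish)
            throws InterruptedException {
        byId.put(id, this);     // 'this' escapes here
        published.countDown();
        mayFinish.await();      // stands in for the lengthy initialization
        initialized = true;
    }
}

class EscapeDemo {
    // Returns what a second thread saw while the constructor was still running.
    static boolean observedInitialized() {
        CountDownLatch published = new CountDownLatch(1);
        CountDownLatch mayFinish = new CountDownLatch(1);
        Thread constructing = new Thread(() -> {
            try {
                new Leaky(1, published, mayFinish);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        constructing.start();
        try {
            published.await();
            Leaky seen = Leaky.byId.get(1);      // returns immediately, no blocking
            boolean result = seen.initialized;   // reads a half-built object
            mayFinish.countDown();               // now let the constructor finish
            constructing.join();
            return result;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            mayFinish.countDown();
            return false;
        }
    }
}
```

The get() call returns a reference while the constructor is still parked on the latch, and the observer reads initialized as false.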
You've been misinformed. What you describe is actually referred to as improper publication and discussed at length in the Java Concurrency In Practice book.
So yes, it will be possible for another thread to obtain a reference to your object and begin trying to use it before it is finished initializing. But wait, it gets worse. Consider this answer: https://stackoverflow.com/a/2624784/122207 ... Basically, there can be a reordering of reference assignment and constructor completion. In the example referenced, one thread can assign h = new Holder(i) and another thread call h.assertSanity() on the new instance with timing just right to get two different values for the n member that is assigned in Holder's constructor.
Constructors are just like other methods; there's no additional synchronization (except for handling final fields).
The code would work if this is published later:
public Test()
{
// Some lengthy operation to fully initialize this object
this.id = atomicIdGenerator.getAndIncrement();
testsById.put(this.id, this);
}
Although this question has been answered, the code pasted in the question doesn't follow safe construction techniques, as it allows the this reference to escape from the constructor. I would like to share a beautiful explanation presented by Brian Goetz in the article "Java theory and practice: Safe construction techniques" at the IBM developerWorks website.
It's unsafe. There is no additional synchronization in the JVM. You can do something like this:
public class Test {
private final Object lock = new Object();
public Test() {
synchronized (lock) {
// your improper object reference publication
// long initialization
}
}
public void doSomething() {
synchronized (lock) {
// do something
}
}
}

Decorator for Java class with final methods

I have a (Java) class, WindowItem, that has a problem: One of the methods is not thread-safe. I can't fix WindowItem, because it's part of an external framework. So I figured I implement a Decorator for it, that has a "synchronized" keyword on the method in question.
The Decorator extends WindowItem and will also contain WindowItem. Following the Decorator pattern, I create methods in the Decorator that call the WindowItem it contains.
However, WindowItem has a few final methods, that I cannot override in the Decorator. That breaks the transparency of the Decorator. Let's make this explicit:
public class WindowItem {
private List<WindowItem> windows;
public Properties getMethodWithProblem() {
...
}
public final int getWindowCount() {
return windows.size();
}
}
public class WindowItemDecorator extends WindowItem {
private WindowItem item;
public WindowItemDecorator(WindowItem item) {
this.item = item;
}
// Here I solve the problem by adding the synchronized keyword:
public synchronized Properties getMethodWithProblem() {
return item.getMethodWithProblem();
}
// Here I should override getWindowCount() but I can't because it's final
}
In my own code, whenever I have to pass a WindowItem somewhere, I wrap it in a decorator first: new WindowItemDecorator(item) -- and the thread-safety problem disappears. However, if my code calls getWindowCount() on a WindowItemDecorator, it will always be zero: it executes getWindowCount() on the superclass instead of on the "item" member.
So I would say the design of WindowItem (the fact that it has public final methods) makes it impossible to create a Decorator for this class.
Is that correct, or am I missing something?
In this case I might be able to keep a copy of the list of windows in the decorator, and keep it in sync, and then the result of getWindowCount() would be correct. But in that case, I prefer to fork and patch the framework...
How about not thinking of the problem this way? Why not just handle the threading issues in your code, without assuming thread-safety of WindowItem.
// I personally prefer ReadWriteLocks, but this sounds like it will do...
synchronized (windowItem) {
windowItem.getMethodWithProblem();
}
And then submit an RFE with the package maintainer to better support thread safety.
Indeed, if the class isn't designed to be thread safe, it is unlikely that a few synchronized keywords are going to truly fix things. What somebody means by "thread safe" is always relative ;-)
(Incidentally, WindowItem is definitely NOT thread safe, as it is using a plain List instead of explicitly using a "thread ready" variant; see "Correct way to synchronize ArrayList in java". There are also no guarantees that the List is being accessed in a thread-safe manner.)
Perhaps you could employ the Delegation Pattern, which would work nicely if the WindowItem class implements an interface defining all the methods you care about. Or if it doesn't break too much of your existing code to refer to this delegated class rather than WindowItem.
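Assuming such an interface existed, the delegation could be sketched as below. WindowLike, FrameworkWindow and SynchronizedWindow are all hypothetical; nothing in the question says the real framework provides an interface like this.

```java
import java.util.Properties;

// Hypothetical interface covering the methods the client code cares about.
interface WindowLike {
    Properties getMethodWithProblem();
    int getWindowCount();
}

// Hypothetical stand-in for the framework class, assumed not thread-safe.
final class FrameworkWindow implements WindowLike {
    private final Properties props = new Properties();
    private final int windowCount;

    FrameworkWindow(int windowCount) {
        this.windowCount = windowCount;
    }

    @Override
    public Properties getMethodWithProblem() {
        return props;
    }

    @Override
    public int getWindowCount() {
        return windowCount;
    }
}

// Delegate: no inheritance from the framework class, so final methods
// on it are not an obstacle. Every call is forwarded to the wrapped item.
final class SynchronizedWindow implements WindowLike {
    private final WindowLike delegate;

    SynchronizedWindow(WindowLike delegate) {
        this.delegate = delegate;
    }

    @Override
    public synchronized Properties getMethodWithProblem() {
        return delegate.getMethodWithProblem();
    }

    @Override
    public int getWindowCount() {
        return delegate.getWindowCount();
    }
}
```

The catch is that all client code has to refer to the interface type rather than to WindowItem itself, which is the cost mentioned above.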
The answer to your question is yes, you can't override final methods, meaning that it's not possible to create a decorator for this class.
If you can override the method that has the problem, and solve the problem by synchronizing the method, you could just leave it at that. That is, just use your subclass, and not use the decorator pattern.
A coworker had an idea that I think can solve the problem. I can keep the state of the superclass and the state of the "item" member in sync by looking at all methods that modify the List windows. There are a few: addWindow, removeWindow. Instead of calling just "item.addWindow(...)" in the decorator, I call addWindow on the superclass as well:
Normal decorator:
public void addWindow(WindowItem newItem) {
item.addWindow(newItem);
}
In this case I do:
public void addWindow(WindowItem newItem) {
super.addWindow(newItem);
item.addWindow(newItem);
}
That keeps the state in sync and the return values of the final methods correct.
This is a solution that can work or not work depending on internals of the class being decorated.

Multiple Threads calling static helper method

I have a web application running on Tomcat.
There are several calculations that need to be done on multiple places in the web application. Can I make those calculations static helper functions? If the server has enough processor cores, can multiple calls to that static function (resulting from multiple requests to different servlets) run parallel? Or does one request have to wait until the other request finished the call?
public class Helper {
public static int doSomething(int arg1, int arg2) {
// do something with the args
int val = 0; // placeholder for the real calculation
return val;
}
}
if the calls run parallel:
I have another helper class with static functions, but this class contains a private static member which is used in the static functions. How can I make sure that the functions are thread-safe?
public class Helper {
private static SomeObject obj;
public static void changeMember() {
Helper.obj.changeValue();
}
public static String readMember() {
return Helper.obj.readValue();
}
}
changeValue() and readValue() read/change the same member variable of Helper.obj. Do I have to make the whole static functions synchronized, or just the block where Helper.obj is used? If I should use a block, what object should I use to lock it?
can i make those calculations static helper functions? if the server has enough processor cores, can multiple calls to that static function (resulting from multiple requests to different servlets) run parallel?
Yes, and yes.
do i have to make the whole static functions synchronized
That will work.
or just the block where Helper.obj is used
That will also work.
if i should use a block, what object should i use to lock it?
Use a static Object:
public class Helper {
private static SomeObject obj;
private static final Object mutex = new Object();
public static void changeMember() {
synchronized (mutex) {
obj.changeValue();
}
}
public static String readMember() {
synchronized (mutex) {
return obj.readValue();
}
}
}
Ideally, though, you'd write the helper class to be immutable (stateless or otherwise) so that you just don't have to worry about thread safety.
You should capture the calculations in a class, and create an instance of the class for each thread. What you have now is not threadsafe, as you are aware, and to make it threadsafe you will have to synchronize on the static resource/the methods that access that static resource, which will cause blocking.
Note that there are patterns to help you with this. You can use the strategy pattern (in its canonical form, the strategy must be chosen at runtime, which might or might not apply here) or a variant. Just create a class for each calculation with an execute method (and an interface that has the method), and pass a context object to execute. The context holds all the state of the calculation. One strategy instance per thread, with its context, and you shouldn't have any issues.
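A bare-bones sketch of that idea, with made-up names (Calculation, CalcContext, SumCalculation) and a trivial calculation standing in for the real one:

```java
// Holds all the state of one calculation. Each request/thread creates
// its own context, so nothing is shared between threads.
final class CalcContext {
    final int arg1;
    final int arg2;

    CalcContext(int arg1, int arg2) {
        this.arg1 = arg1;
        this.arg2 = arg2;
    }
}

// One implementation per kind of calculation.
interface Calculation {
    int execute(CalcContext context);
}

// Hypothetical calculation: stateless, reads everything from the context.
final class SumCalculation implements Calculation {
    @Override
    public int execute(CalcContext context) {
        return context.arg1 + context.arg2;
    }
}
```

A servlet would then do something like new SumCalculation().execute(new CalcContext(a, b)) per request, with no static state and no locks involved.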
If you don't have to share it you can make it thread local, then it doesn't have to be thread safe.
public class Helper {
private static final ThreadLocal<SomeObject> obj = new ThreadLocal<SomeObject>() {
@Override
protected SomeObject initialValue() {
return new SomeObject();
}
};
public static void changeMember() {
Helper.obj.get().changeValue();
}
public static String readMember() {
return Helper.obj.get().readValue();
}
}
I'll sum up here what has been said in the comments on Matt Ball's answer, since the thread got pretty long and the message gets lost. The message was:
In a shared environment like a web/application server you should try very hard to find a solution without synchronizing. Using static helpers synchronized on a static object might work well enough for a stand-alone application with a single user in front of the screen. In a multi-user/multi-application scenario, doing this would most probably end in very poor performance: it would effectively mean serializing access to your application, and all users would have to wait on the same lock. You might not notice the problem for a long time if the calculations are fast enough and the load is evenly distributed.
But then all of a sudden all your users might try to go through the calculation at 9am and your app will stop working! I mean, not really stop, but they would all block on the lock and form a huge queue.
Now, regardless of the necessity of shared state, since you originally named calculations as the subject of synchronization: do their results need to be shared? Or are those calculations specific to a user/session? In the latter case a ThreadLocal as per Peter Lawrey would be enough. Otherwise I'd say that for overall performance it would be better to duplicate the calculations for everybody needing them in order not to synchronize (depending on the cost).
Session management is also better left to the container: it has been optimized to handle sessions efficiently, including clustering if necessary. I doubt one could build a better solution without investing a lot of work and making lots of bugs along the way. But as Matt Ball has stated, that would be better asked separately.
In the first case you don't have to worry about threading issues, because the variables are local to each thread. You correctly identify the problem in the second case, though, because multiple threads will be reading/writing the same object. Synchronizing on the methods will work, as would synchronized blocks.
For the first part:
Yes, these calls are independent and run in parallel when called by different threads.
For the last part:
Use synchronized blocks on the concurrent object, a dummy object or the class object. Be aware of nested synchronized blocks: they can lead to deadlocks when the locks are acquired in different orders.
If you are worried about synchronization and thread safety, don't use static helpers. Create a normal class with your helper methods and create an instance upon servlet request. Keep it simple :-)
