Threads and synchronisation - java

I have a problem understanding the following piece of code:
public class SoCalledSigleton{
    private final static boolean allDataLoaded = SoCalledSigleton();
    private SoCalledSigleton(){
        loadDataFromDB();
        loadDataFromFile();
        loadDataAgainFromDB();
    }
}
Is this piece of code thread-safe? If not, why?

This will create an error in Java.
private final static boolean allDataLoaded = SoCalledSigleton();
You're assigning an object to a boolean variable.
You forgot to add new to instantiate the variable.
But if your code is like this
public class SoCalledSigleton{
    private final static SoCalledSigleton allDataLoaded = new SoCalledSigleton();
    private SoCalledSigleton(){
        loadDataFromDB();
        loadDataFromFile();
        loadDataAgainFromDB();
    }
}
It is thread-safe, as static initialization and static attributes are thread-safe. They are initialized only once and exist throughout the whole life cycle of the system.

The code is unusable in its current form, so any notions of thread safety are irrelevant.
What public interface would users use to get an instance of the singleton?

(I assume that allDataLoaded is meant to be a SoCalledSigleton and boolean is just a typo :-)
If the class has no other constructors, and the loadData* methods don't do funny business (such as publishing this), its initialization is thread safe, because the initialization of final static data members is guarded by the JVM. Such members are initialized when the class itself is initialized, which happens on its first active use. During this, there is a lock on the class so the initialization process is thread safe even if multiple threads try to access the class in parallel. So the constructor of the class is guaranteed to be called only once (per classloader - thanks Visage for the clarification :-).
Note that since you don't show us the rest of the class (I suppose it should have at least a static getInstance method, and probably further nonstatic members), we can't say anything about whether the whole implementation of the class is thread safe or not.
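A minimal sketch of the eager variant discussed above, assuming the class exposes a static getInstance() method (the question does not show one). The names EagerSingleton, CONSTRUCTOR_CALLS and EagerSingletonDemo are made up for illustration, and the counter exists only to make the once-only construction observable:

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the singleton in the question.
final class EagerSingleton {
    // Declared before INSTANCE: static initializers run in textual order,
    // and the constructor below needs the counter to exist already.
    static final AtomicInteger CONSTRUCTOR_CALLS = new AtomicInteger();

    // Created during class initialization, which the JVM performs under a per-class lock.
    private static final EagerSingleton INSTANCE = new EagerSingleton();

    private EagerSingleton() {
        CONSTRUCTOR_CALLS.incrementAndGet();
        // The loadData* calls from the question would go here.
    }

    static EagerSingleton getInstance() {
        return INSTANCE;
    }
}

final class EagerSingletonDemo {
    // Starts several threads that all ask for the instance, then reports
    // how many times the constructor ran.
    static int constructorCallsAfterConcurrentAccess(int threadCount) {
        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            threads[i] = new Thread(EagerSingleton::getInstance);
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return EagerSingleton.CONSTRUCTOR_CALLS.get();
    }

    public static void main(String[] args) {
        System.out.println(constructorCallsAfterConcurrentAccess(8)); // prints 1
    }
}
```

This only demonstrates that construction happens once; it says nothing about whether the data loaded by the constructor is safe to use from several threads afterwards.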

From what we can see, there are no specific issues - it's guaranteed that the constructor will only ever be called once (so by definition can't be run multithreaded), which I presume is what you were concerned about.
However, there are still possible areas for problems. Firstly, if the loadData... methods are public, then they can be called by anyone at any time, and quite possibly could lead to concurrency errors.
Additionally, these methods are presumably modifying some kind of collection somewhere. If these collections are publicly accessible before the constructor returns, then you can quite easily run into concurrency issues again. This could be an issue with anything except updating instance-specific fields (static fields may or may not exhibit this problem depending where they are defined in the file).
Depending on the way the class is used, simply writing all of the data single-threaded may not be good enough. Collection classes are not necessarily safe for multi-threaded access even if read-only, so you'll need to ensure you're using the thread-safe data structures if multiple threads might access your singleton.
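As a hedged illustration of the point about thread-safe data structures, here is a small sketch in which several threads write into a map owned by one shared object. SharedRegistry, record and countAfterConcurrentWrites are hypothetical names invented for this example; ConcurrentHashMap and its merge method are part of the JDK:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class SharedRegistry {
    // ConcurrentHashMap tolerates concurrent readers and writers;
    // a plain HashMap gives no such guarantee.
    private final Map<String, Integer> counts = new ConcurrentHashMap<>();

    void record(String key) {
        counts.merge(key, 1, Integer::sum); // atomic read-modify-write for this key
    }

    int count(String key) {
        return counts.getOrDefault(key, 0);
    }

    // Lets several threads hammer the same key and returns the final count.
    static int countAfterConcurrentWrites(int threadCount, int writesPerThread) {
        SharedRegistry registry = new SharedRegistry();
        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            threads[i] = new Thread(() -> {
                for (int j = 0; j < writesPerThread; j++) {
                    registry.record("hits");
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return registry.count("hits");
    }
}
```

With a plain HashMap in place of the ConcurrentHashMap, the same code could lose updates or corrupt the map, which is the kind of problem the paragraph above warns about.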
There are possibly other issues too. Thread-safety isn't a simple check-list; you need to think about what bits of code/data might be accessed concurrently, and ensure that appropriate action is taken (declaring methods synchronized, using concurrent collections, etc.). Thread-safety also isn't a binary thing (i.e. there's no such thing as "thread safe" per se); it depends how many threads will be accessing the class at once, what combinations of methods are thread-safe, whether sequences of operations will continue to function as one would expect (you can make a class "thread safe" in that it doesn't crash, but certain return values are undefined if pre-empted), what monitors threads need to hold to guarantee certain invariants etc.
I guess what I'm trying to say is that you need to think about and understand how the class is used. Showing people a snapshot of half a file (which doesn't even compile), and asking them to give a yes/no answer, is not going to be beneficial. At best they'll point out some of the issues for you if there are any; at worst you'll get a false sense of confidence.

Yeah, it's thread safe. The "method" is the constructor, and it will be called when the class is loaded, i.e. exactly once.
But looking at the stuff being done, I think it's probably a lousy idea to call it from the class loader. Essentially, you'll end up doing your DB connection and stuff at the point in time when something in your code touches the SoCalledSigleton. Chances are, this will not be inside some well-defined sequence of events where, if there's an error, you have catch blocks to take you to some helpful GUI message handling or whatever.
The "cleaner" way is to use a synchronized static getInstance() method, which will construct your class and call its code exactly when getInstance() is called the first time.
EDIT: As The Elite Gentleman pointed out, there's a syntax error in there. You need to say
private final static SoCalledSigleton allDataLoaded = new SoCalledSigleton();
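The synchronized static getInstance() approach mentioned in this answer could look roughly like the sketch below. LazySingleton is a hypothetical class name and the constructor body is left empty; in the question's case it would contain the loadData* calls:

```java
final class LazySingleton {
    // Only read and written inside getInstance(), i.e. while holding the class lock.
    private static LazySingleton instance;

    private LazySingleton() {
        // Expensive loading would happen here, on the first getInstance() call.
    }

    // synchronized static: only one thread at a time can run the null check
    // and the construction, so the constructor runs at most once.
    static synchronized LazySingleton getInstance() {
        if (instance == null) {
            instance = new LazySingleton();
        }
        return instance;
    }
}
```

The first caller pays for construction at a point of its own choosing, so it can wrap that call in whatever error handling makes sense there. Every later call still acquires the lock, which is the cost of this variant.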

Related

Can a thread-safe class contain any public instance fields?

Access modifiers are irrelevant in this context of thread-safety. Of course you can have public fields in a thread-safe class. The question you need to ask yourself is: does this conform to my design (or a design pattern), and what could I possibly achieve by doing this?
When people say that class C is "thread-safe", they usually mean that no interleaving of operations performed on a single instance of the class by multiple threads can leave the instance in an invalid state. (But as Marko says, that's not a formally agreed-upon definition.) So, what are the states of an instance of your class? Which states are valid and which are not valid? Is it possible to change a valid state into an invalid state by updating one of the public fields?
If there is any way that updating a public field can change the state from valid to invalid, then you can't say that the class is generally thread safe, but if that never happens in your application, then maybe the class is thread-safe in the limited context of your application.
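To make the valid/invalid state argument concrete, here is a small sketch with an invented class, Range, whose intended invariant is lower <= upper. Nothing here comes from the original question; it only shows how a public mutable field lets any caller (on any thread) break an invariant the constructor tried to establish:

```java
final class Range {
    // Invariant the class would like to keep: lower <= upper.
    // Because the fields are public and mutable, the class cannot enforce it.
    public int lower;
    public int upper;

    Range(int lower, int upper) {
        if (lower > upper) {
            throw new IllegalArgumentException("lower > upper");
        }
        this.lower = lower;
        this.upper = upper;
    }

    boolean isValid() {
        return lower <= upper;
    }

    // Builds a valid range, then writes one public field directly.
    static boolean validAfterDirectFieldWrite() {
        Range r = new Range(0, 10);
        r.lower = 20; // bypasses the constructor's check
        return r.isValid();
    }
}
```

If the fields were public final (or the class had no invariant spanning several fields), exposing them would not create this problem.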

Breaking legacy singletons

We have packaged all our legacy code into a library, and the new version of the code calls the legacy code as and when required. Though this approach is good, we are currently in a spot of bother: part of the legacy code has thread-unsafe singletons, whereas the newer code calling them expects them to be thread-safe. We cannot afford to have synchronized blocks, as that will clog the system when the load goes beyond a certain level. I thought I would ask for your input. Thanks!
Edit:
These singletons are lazy ones, without synchronization or a double check on the null instance:
public static Parser getInstance() {
    try {
        // Unsynchronized check-then-act: two threads can both see null here.
        if (instance == null) {
            instance = new Parser(...);
        }
    } catch (Exception x) {
        ...
    }
    return instance;
}
This code is at least 8 years old, and we cannot fix it.
If you have an object which is not thread safe, and the factory method returns a singleton, you have no choice but to synchronise.
Otherwise, you will need to change the factory method (or create a new one, if you can't edit the original code) so that it constructs new objects. If this is too expensive (and don't just assume it is, until you test it), look into which aspects of it are expensive; maybe some of the object's dependencies can still be singletons.
There is no magic solution here.
But... I am about to tell you about a hack I did once when I was in a similar situation. This is a LAST RESORT and may not work in your case anyway. I was once handed a web application consisting of many servlets which contained both effectively global and local variables, as member variables. The guy who wrote it didn't realise the members of a servlet were single instance. The app worked in testing with 1 client but failed with multiple connections. We needed a fix fast. I let the servlets construct themselves as written. As the doGet and doPost methods were called, I got the servlet to clone itself and pass requests down to the clone. This copied the "global" members and gave each request fresh uninitialised members. But it's problematic, so don't do it. Just fix your code. :)
As was mentioned above in the comments, you should simply fix these classes (that's an easy fix!). That said, assuming you cannot touch this code, you can inherit from it and "override" (actually it's called "hiding" since the method is static) getInstance(). That would fix only the broken part and will maintain the same logic for the other parts.
BTW, if you decide to implement a singleton with the double null check, not only do you have to synchronize the innermost check, you also have to declare instance as volatile.
There are better ways to implement both lazy and eager singletons (static class, inner helper class and enum); make sure to assess all the options before you choose one.
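For reference, here is a rough sketch of three of the options mentioned above: the inner helper (holder) class, double-checked locking with a volatile field, and an enum. All class names are invented for illustration and none of this is the legacy Parser code:

```java
// Inner helper class: lazy, and relies only on JVM class initialization.
final class HolderSingleton {
    private HolderSingleton() {}

    private static final class Holder {
        static final HolderSingleton INSTANCE = new HolderSingleton();
    }

    static HolderSingleton getInstance() {
        return Holder.INSTANCE; // the first call triggers Holder's initialization
    }
}

// Double-checked locking: the field must be volatile for this to be correct.
final class DclSingleton {
    private static volatile DclSingleton instance;

    private DclSingleton() {}

    static DclSingleton getInstance() {
        DclSingleton local = instance; // single volatile read on the fast path
        if (local == null) {
            synchronized (DclSingleton.class) {
                local = instance; // re-check while holding the lock
                if (local == null) {
                    local = new DclSingleton();
                    instance = local;
                }
            }
        }
        return local;
    }
}

// Enum: eager with respect to the enum class, and the JVM guarantees one INSTANCE.
enum EnumSingleton {
    INSTANCE
}
```

The holder variant is usually the least error-prone of the lazy options, since it needs neither volatile nor explicit locking.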

Is there a pthread_once equivalent in Java?

In the C/C++ world, it is very easy to make a routine execute just once by using pthread_once. In Java, I generally use static atomic variables to do the explicit check whether the routine was run already. But that looks ugly, hence I am wondering if there is something like pthread_once in Java.
Since you refer to “static atomic variables” you seem to talk about static resources which do not need special actions if you initialize them within the class initializer itself:
class Foo {
    static ResourceType X = createResource();
}
Here, createResource() will be executed exactly once in a thread-safe manner on the first use of Foo, e.g. when Foo.X is accessed the first time. Threads accessing X while the class initialization is in progress are forced to wait, but subsequent access will be performed without any synchronization overhead. Typically, but not necessarily, the variable will be declared final as well.
If you have multiple resources whose creation should be deferred independently, the owner class might use nested classes, each of them holding one resource.
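A sketch of that idea, with invented names (Resources, FirstHolder, SecondHolder, create) and a String standing in for the real resource type. The counter is only there to make the creation calls observable:

```java
import java.util.concurrent.atomic.AtomicInteger;

final class Resources {
    // Counts how many resources have actually been created.
    static final AtomicInteger CREATED = new AtomicInteger();

    // Each holder is initialized on its own first use, independently of the other.
    private static final class FirstHolder {
        static final String VALUE = create("first");
    }

    private static final class SecondHolder {
        static final String VALUE = create("second");
    }

    static String first() {
        return FirstHolder.VALUE;
    }

    static String second() {
        return SecondHolder.VALUE;
    }

    // Stand-in for an expensive creation routine.
    private static String create(String name) {
        CREATED.incrementAndGet();
        return "resource:" + name;
    }
}
```

Calling Resources.first() any number of times creates only the first resource; the second one is not created until Resources.second() is called.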
If your question is about an action which should be executed exactly once without returning a value, the static initialization can be used as well. You only have to add a member you can access to trigger the class initialization, e.g.:
class Foo {
    static { performAction(); }
    static void performActionOnce() {}
}
Here, the first call to Foo.performActionOnce() will cause performAction() to be executed, while all subsequent invocations do nothing. You can also rely on the action within performAction() having completed by the time performActionOnce() returns, even when there is contention on the first invocation.
This is different from any atomic variable approach, as atomic variables do not provide a sufficient waiting capability for the case where the first invocation is contended. If you combine the atomic variable with a waiting queue, you end up with what Lock (or any other AQS-based concurrency tool) provides. For instance variables, where static initialization does not work, there is no simple workaround (besides thinking about whether initialization really has to be lazy).
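The static-initializer trick described in this answer can be exercised with a small, self-contained sketch. OnceCounter, OnceAction and OnceDemo are hypothetical names; the counter lives in a separate class so that reading it does not itself trigger OnceAction's initialization:

```java
import java.util.concurrent.atomic.AtomicInteger;

// Kept separate so that reading RUNS does not initialize OnceAction.
final class OnceCounter {
    static final AtomicInteger RUNS = new AtomicInteger();
}

final class OnceAction {
    static {
        performAction(); // runs during class initialization, exactly once
    }

    private static void performAction() {
        OnceCounter.RUNS.incrementAndGet();
    }

    // Intentionally empty: calling it merely forces class initialization.
    static void performActionOnce() {
    }
}

final class OnceDemo {
    // Calls performActionOnce() from several threads and reports how often
    // the action ran.
    static int runsAfterConcurrentCalls(int threadCount) {
        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            threads[i] = new Thread(OnceAction::performActionOnce);
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return OnceCounter.RUNS.get();
    }
}
```

Threads that lose the race block inside the JVM until the initializer has finished, which is the waiting behaviour an atomic flag alone does not give you.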

Does accessing methods in a non static way affect/benefit performance?

Assuming all method calls here are static, like this:
public class Util {
    public static void method1() {
    }
}
Accessing in a static way:
Util.method1();
Util.method2();
Util.method3();
Accessing in a non-static way:
Util util = new Util();
util.method1();
util.method2();
util.method3();
Is there any performance difference for either way? I know the first way of doing it here is accessing it properly. But the second way only instantiates the util object once as opposed to three times. I can't find anything pointing to anything other than to be accessing these methods properly. From what I can tell there is no functional difference, but a logical difference. Looking for sort of a cost vs. benefit of either way if anyone knows.
Is there any performance difference for either way?
Yes - the second is marginally slower, due to a pointless instance being constructed.
I know the first way of doing it here is accessing it properly. But the second way only instantiates the util object once as opposed to three times.
No, the second way creates one instance of Util whereas the first way doesn't create any instances.
The first way is significantly better, because it makes it clear that it is a static method. Consider this code:
Thread t = new Thread(someRunnable);
t.start();
t.sleep(1000);
What does it look like that last call does? Surely it makes the new thread sleep, right? No... it just calls Thread.sleep(), which only ever makes the current thread sleep.
When you mangle a static method call to act "through" a reference, the value of the reference is completely ignored - it can even be null:
Util util = null;
util.method1(); // This will still work...
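Here is a runnable sketch of that last point, using an invented class StaticCallDemo in place of Util. The call through the null reference is bound to the static method at compile time, so no NullPointerException is thrown (some compilers and IDEs will flag the line with a warning):

```java
final class StaticCallDemo {
    static int calls = 0;

    static void method1() {
        calls++;
    }

    // Invokes the static method through a null reference and returns the call count.
    static int callThroughNullReference() {
        StaticCallDemo util = null;
        util.method1(); // compiled as StaticCallDemo.method1(); util is never dereferenced
        return calls;
    }
}
```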
There's no difference for the code you show, since all those methods are static. (The compiler or IDE may issue a warning for the second group, however.) I think there's a small performance benefit to static methods. The underlying bytecode instruction for static calls, invokestatic, should be faster than invokevirtual, which has to dispatch on the runtime type of the receiver.
But it's not enough to worry about. Use whichever type of method (static vs. instance) is right for your design. Don't try to optimize the method calls like that.

A member variable's hashCode() value is different

There's a piece of code that looks like this. The problem is that during bootup, two initializations take place. (1) Some method does a reflection on ForumRepository and performs a newInstance() purely to invoke #setCacheEngine. (2) Another method following that invokes #start(). I am noticing that the hashCode of the #cache member variable is sometimes different in some weird scenarios. Since only one piece of code invokes #setCacheEngine, how can the hashCode change during runtime? (I am assuming that a different instance will have a different hashCode.) Is there a bug here somewhere?
public class ForumRepository implements Cacheable
{
    private static CacheEngine cache;
    private static ForumRepository instance;
    public void setCacheEngine(CacheEngine engine) { cache = engine; }
    public synchronized static void start()
    {
        instance = new ForumRepository();
    }
    public synchronized static void addForum( ... )
    {
        cache.add( .. );
        System.out.println( cache.hashCode() );
        // snipped
    }
    public synchronized static void getForum( ... )
    {
        ... cache.get( .. );
        System.out.println( cache.hashCode() );
        // snipped
    }
}
The whole system is wired up & initialized in the init method of a servlet.
And the init() method looks like this conceptually:
// create an instance of the DefaultCacheEngine
cache = (CacheEngine)Class.forName( "com..DefaultCacheEngine" ).newInstance();
cache.init();
// init the ForumRepository's static member
Object o = Class.forName( "com.jforum....ForumRepository" ).newInstance();
if( o instanceof Cacheable )
    ((Cacheable)o).setCacheEngine(cache);
// Now start the ForumRepository
ForumRepository.start();
UPDATE I didn't write this code. It is taken from jforum
UPDATE 2 Solution found. I added a separate comment below describing the cause of the problem. Thanks to everyone.
You're going to have to give WAY more information than this, but CacheEngine is probably a mutable data type, and worse, it may even be shared by others. Depending on how CacheEngine defines its hashCode(), this could very well lead to a ForumRepository seeing various different hash codes from its cache.
It's perfectly fine for the same object, if it's mutable, to change its hashCode() over a period of time, as long as it's done in a consistent manner (which is another topic altogether).
See also
Object.hashCode() -- make sure you understand the implications of the contract
On cache being static
More information has surfaced, and we now know that the object in question, while mutable, does not @Override hashCode(). However, there seems to be a serious issue in design in making cache a static field of ForumRepository class, with a non-static "setter" setCacheEngine (which looks to be specified by Cacheable).
This means that there is only one incarnation of cache, no matter how many ForumRepository instances are created! In a way, all instances of ForumRepository are "sharing" the same cache!
JLS 8.3.1.1 static Fields
If a field is declared static, there exists exactly one incarnation of the field, no matter how many instances (possibly zero) of the class may eventually be created. A static field, sometimes called a class variable, is incarnated when the class is initialized.
I think it's important to step back right now and ask these questions:
- Does cache need to be static? Is this intended?
  - Should instances of ForumRepository have their own cache?
  - ... or should they all "share" the same cache?
- How many instances of ForumRepository will be created?
  - Putting pros and cons of the design pattern aside, should ForumRepository be a singleton?
- How many times can setCacheEngine be called on a ForumRepository object?
  - Would it benefit from a defensive mechanism of throwing IllegalStateException if it's called more than once?
The best recommendations would depend on the answers to the above questions. The third bullet point is something that is immediately actionable, and would reveal if setCacheEngine is getting invoked multiple times. Even if it's only invoked once for each ForumRepository instance, it's still effectively a multiple "set" in the current state of affairs, since there's only one cache.
A static field with a non-static setter is a design decision that needs to be thoroughly reexamined.
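The defensive IllegalStateException idea could be sketched as follows. This is not jforum code: GuardedForumRepository and the empty CacheEngineStub interface are invented stand-ins, and the field is kept static on purpose to mirror the design under discussion, so a second "set" is detected even when it comes from a different instance:

```java
// Hypothetical stand-in for the real CacheEngine type.
interface CacheEngineStub {
}

final class GuardedForumRepository {
    // Static, as in the original design: one cache shared by all instances.
    private static CacheEngineStub cache;

    // Non-static setter, as in the original, but it refuses a second assignment.
    public void setCacheEngine(CacheEngineStub engine) {
        if (engine == null) {
            throw new IllegalArgumentException("engine must not be null");
        }
        synchronized (GuardedForumRepository.class) {
            if (cache != null) {
                throw new IllegalStateException("cache engine has already been set");
            }
            cache = engine;
        }
    }
}
```

With a guard like this in place, an unexpected second call to setCacheEngine fails loudly with a stack trace instead of silently replacing the shared cache.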
Are you sure the ForumRepository classes that you are comparing are the same? If you are doing newInstance, you might have a weird classloader issue.
Mutable objects with hashCode implementations based on mutable state are almost always a bad idea. For example, they behave very strangely in hash-based collections -- if you insert such an object into a HashSet and then mutate it, the contains method won't be able to find it anymore.
Objects that are naturally distinguished by their identity should not override equals and hashCode at all. If you override hashCode based on state, that state should be immutable. Examples are String and the boxing types. Those are often referred to as "value classes", because their identity has no significance -- it is meaningless to distinguish between multiple instances of new Integer(42).
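The HashSet behaviour described above is easy to reproduce. In this sketch, MutableKey and MutableKeyDemo are invented names; the key's hashCode() depends on a mutable field:

```java
import java.util.HashSet;
import java.util.Set;

final class MutableKey {
    int value; // mutable state that feeds equals() and hashCode()

    MutableKey(int value) {
        this.value = value;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof MutableKey && ((MutableKey) o).value == value;
    }

    @Override
    public int hashCode() {
        return Integer.hashCode(value);
    }
}

final class MutableKeyDemo {
    // Inserts a key, mutates it, and reports whether the set can still find it.
    static boolean foundAfterMutation() {
        Set<MutableKey> set = new HashSet<>();
        MutableKey key = new MutableKey(1);
        set.add(key);
        key.value = 2; // the hash code changes while the key sits in the set
        return set.contains(key);
    }

    public static void main(String[] args) {
        System.out.println(foundAfterMutation()); // prints false
    }
}
```

The set still holds the object, but the lookup uses the new hash code and therefore never matches the entry stored under the old one.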
The thing that puzzles me about this question is this:
Why are you looking at the hashcode of a CacheEngine instance?
It seems that your code is putting Forum objects into a cache and getting them back. As such, it makes sense to look at the hashcode values for the Forum instances. But the hashcode of the cache itself would appear to be entirely irrelevant.
Having said that, if the DefaultCacheEngine class inherits its implementation of hashcode from java.lang.Object then the value returned by the method is the "identity" hashcode, and this cannot change over the lifetime of an object. If it does appear to change, this means that you are now looking at a different instance of the DefaultCacheEngine class.
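A quick way to convince yourself of the identity hash code argument is the sketch below. PlainEngine is an invented class that, like the engine described here, is mutable but does not override hashCode():

```java
final class PlainEngine {
    int entries; // mutable state; hashCode() is inherited from Object
}

final class IdentityHashDemo {
    // Checks that mutating the object does not change its hash code, and that
    // the inherited hashCode() agrees with System.identityHashCode().
    static boolean hashStableAcrossMutation() {
        PlainEngine engine = new PlainEngine();
        int before = engine.hashCode();
        engine.entries = 42;
        return before == engine.hashCode()
                && before == System.identityHashCode(engine);
    }
}
```

So if the printed value changes between calls, the calls are looking at different objects, whatever the reason for that turns out to be.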
I've solved my problem and I would like to share with you what I've learnt or discovered.
Root cause of the bug
The jforum.war webapp was being loaded twice by Tomcat 6.x, via two different virtual hosts. So yes, the CacheEngine was displaying two different hashCodes because they were loaded up in separate webapp classloaders.
Solution
The quick fix for me was to limit the invocation of the servlets in jforum.war via one specific virtual host address.
