Breaking legacy singletons

Breaking legacy singletons - java

We have packaged all our legacy code into a library and the new version of the code calls the legacy code as and when required; Though this approach is good, currently we are in a spot of bother as part of the legacy code has thread-unsafe singletons whereas the newer code calling them expect them to be thread-safe; we cannot afford to have synchronized blocks as that will clog the system when the load goes beyond certain number. Thought will take your inputs. Thanks!
Edit:
These singletons are lazy ones without synchronized and double-checks on null instance:
public static Parser getInstance() {
Parser p = null;
try {
if (instance == null) {
instance = new Parser(...);
}
} catch (Exception x) {
...
}
return p;
}
and this code is at least 8 years old, we cannot fix them.

if you have an object which is not thread safe, and the factory method returns a singleton, you have no choice but to synchronise.
you will need to change the factory method (or create a new one, if you can't edit the original code) which constructs new objects. if this is too expensive (and don't just assume it is, until you test it), look into which aspects of that are expensive and maybe some of the object's dependencies can still be singletons.
there is no magic solution here.
but.. i am about to tell you about a hack I did once when I was in a similar situation. but this is LAST RESORT and may not work in your case anyway. i was once handed a web application consisting of many servlets which contained both effectively global and local variables, as member variables. the guy who wrote it didn't realise the members of a servlet were single instance. the app worked in testing with 1 client but failed with multiple connections. we needed a fix fast. i let the servlets construct themselves as written. as the doGet and doPost methods were called, i got the servlet to clone itself and pass requests down to the clone. this copied the "global" members and gave the request fresh uninitialised members to the request. but, it's problematic. so don't do it. just fix your code. :)

As was mentioned above in the comments, you should simply fix these classes (that's an easy fix!). That said, assuming you cannot touch this code, you can inherit from it and "override" (actually it's called "hiding" since the method is static) getInstance(). That would fix only the broken part and will maintain the same logic for the other parts.
BTW, if you decide to implement a singleton with the double null check, not only that you have to synchronize the innermost check, you also have to declare instance as volatile.
There are better ways to implement both lazy and eager singleton (static class, inner helper class and enum), make sure to asses all the options before you choose one.

Related

Initializing in Constructor

I have seen lots of codes where the coders define an init() function for classes and call it the first thing after creating the instance.
Is there any harm or limitation of doing all the initializations in Constructor?

Usually for maintainability and to reduce code size when multiple constructors call the same initialization code:
class stuff
{
public:
stuff(int val1) { init(); setVal = val1; }
stuff() { init(); setVal = 0; }
void init() { startZero = 0; }
protected:
int setVal;
int startZero;
};

Just the opposite: it's usually better to put all of the initializations
in the constructor. In C++, the "best" policy is usually to put the
initializations in an initializer list, so that the members are
directly constructed with the correct values, rather than default
constructed, then assigned. In Java, you want to avoid a function
(unless it is private or final), since dynamic resolution can put
you into an object which hasn't been initialized.
About the only reason you would use an init() function is because you
have a lot of constructors with significant commonality. (In the case
of C++, you'd still have to weigh the difference between default
construction, then assignment vs. immediate construction with the
correct value.)

In Java, there's are good reasons for keeping constructors short and moving initialization logic into an init() method:
constructors are not inherited, so any subclasses have to either reimplement them or provide stubs that chain with super
you shouldn't call overridable methods in a constructor since you can find your object in an inconsistent state where it is partially initialized

It is a design pattern that has to do with exceptions thrown from inside an object constructor.
In C++ if an exception is thrown from inside an object costructor then that object is considered as not-constructed at all, by the language runtime.
As a consequence the object destructor won't be called when the object goes out of scope.
This means that if you had code like this inside your constructor:
int *p1 = new int;
int *p2 = new int;
and code like this in your destructor:
delete p1;
delete p2;
and the initialization of p2 inside the constructor fails due to no more memory available, then a bad_alloc exception is thrown by the new operator.
At that point your object is not fully constructred, even if the memory for p1 has been allocated correctly.
If this happens the destructor won't be called and you are leaking p1.
So the more code you place inside the constructor, the more likely an error will occur leading to potential memory leaks.
That's the main reason for that design choice, which isn't too mad after all.
More on this on Herb Sutter's blog: Constructors exceptions in C++

It's a design choice. You want to keep your constructor as simple as possible, so it's easy to read what it's doing. That why you'll often see constructors that call other methods or functions, depending on the language. It allows the programmer to read and follow the logic without getting lost in the code.
As constructors go, you can quickly run into a scenario where you have a tremendous sequence of events you wish to trigger. Good design dictates that you break down these sequences into simpler methods, again, to make it more readable and easier to maintain in the future.
So, no, there's no harm or limitation, it's a design preference. If you need all that initialisation done in the constructor then do it there. If you only need it done later, then put it in a method you call later. Either way, it's entirely up to you and there are no hard or fast rules around it.

If you have multiple objects of the same class or different classes that need to be initialized with pointers to one another, so that there is at least one cycle of pointer dependency, you can't do all the initialization in constructors alone. (How are you going to construct the first object with a pointer/reference to another object when the other object hasn't been created yet?)
One situation where this can easily happen is in an event simulation system where different components interact, so that each component needs pointers to other components.
Since it's impossible to do all the initialization in the constructors, then at least some of the initialization has to happen in init functions. That leads to the question: Which parts should be done in init functions? The flexible choice seems to be to do all the pointer initialization in init functions. Then, you can construct the objects in any order since when constructing a given object, you don't have to be concerned about whether you already have the necessary pointers to the other objects it needs to know about.

Threads and synchronisation

I've problem understanding the following piece of code:-
public class SoCalledSigleton{
private final static boolean allDataLoaded = SoCalledSigleton();
private SoCalledSigleton(){
loadDataFromDB();
loadDataFromFile();
loadDataAgainFromDB();
}
}
Is this piece of code thread safe? If not then Why?

This will create an error in Java.
private final static boolean allDataLoaded = SoCalledSigleton();
You're assigning an object to a boolean variable.
You forgot to add new to instantiate the variable.
But if your code is like this
public class SoCalledSigleton{
private final static SoCalledSigleton allDataLoaded = new SoCalledSigleton();
private SoCalledSigleton(){
loadDataFromDB();
loadDataFromFile();
loadDataAgainFromDB();
}
}
It is thread-safe as static initialization and static attributes are thread-safe. They are initialized only once and exists throughout the whole life-cycle of the system.

The code is unusable in its current form, so any notions of thread safety are irrelevent.
What public interface would users use to get an instance of the singleton?

(I assume that allDataLoaded is meant to be a SoCalledSigleton and boolean is just a typo :-)
If the class has no other constructors, or the loadData* methods don't do funny business (such as publishing this), its initialization is thread safe, because the initialization of final static data members is guarded by the JVM. Such members are initialized by the class loader when the class is first loaded. During this, there is a lock on the class so the initialization process is thread safe even if multiple threads try to access the class in parallel. So the constructor of the class is guaranteed to be called only once (per classloader - thanks Visage for the clarification :-).
Note that since you don't show us the rest of the class (I suppose it should have at least a static getInstance method, and probably further nonstatic members), we can't say anything about whether the whole implementation of the class is thread safe or not.

From what we can see, there are no specific issues - it's guaranteed that the constructor will only ever by called once (so by definition can't be run multithreaded), which I presume is what you were concerned about.
However, there are still possible areas for problems. Firstly, if the loadData... methods are public, then they can be called by anyone at any time, and quite possibly could lead to concurrency errors.
Additionally, these methods are presumably modifying some kind of collection somewhere. If these collections are publically accessible before the constructor returns, then you can quite easily run into concurrency issues again. This could be an issue with anything exception updating instance-specific fields (static fields may or may not exhibit this problem depending where they are defined in the file).
Depending on the way the class is used, simply writing all of the data single-threaded may not be good enough. Collection classes are not necessarily safe for multi-threaded access even if read-only, so you'll need to ensure you're using the thread-safe data structures if multiple threads might access your singleton.
There are possibly other issues too. Thread-safety isn't a simple check-list; you need to think about what bits of code/data might be accessed concurrently, and ensure that appropriate action is taken (declaring methods synchronized, using concurrent collections, etc.). Thread-safety also isn't a binary thing (i.e. there's no such thing as "thread safe" per se); it depends how many threads will be accessing the class at once, what combinations of methods are thread-safe, whether sequences of operations will continue to function as one would expect (you can make a class "thread safe" in that is doesn't crash, but certain return values are undefined if pre-empted), what monitors threads need to hold to guarantee certain invariants etc.
I guess what I'm trying to say is that you need to think about and understand how the class is used. Showing people a snapshot of half a file (which doesn't even compile), and asking them to give a yes/no answer, is not going to be beneficial. At best they'll point out some of the issues for you if there are any; at worst you'll get a false sense of confidence.

Yeah, it's thread safe. The "method" is the constructor, and it will be called when the class is loaded, i.e. exactly once.
But looking at the stuff being done, I think it's probably a lousy idea to call it from the class loader. Essentially, you'll end up doing your DB connection and stuff at the point in time when something in your code touches the SoCalledSingleton. Chances are, this will not be inside some well-defined sequence of events where, if there's an error you have catch blocks to take you to some helpful GUI message handling or whatever.
The "cleaner" way is to use a synchronized static getInstance() method, which will construct your class and call its code exactly when getInstance() is called the first time.
EDIT: As The Elite Gentleman pointed out, there's a syntax error in there. You need to say
private final static SoCalledSingleton allDataLoaded = new SoCalledSigleton();

Is there a rule of thumb for when to code a static method vs an instance method?

I'm learning Java (and OOP) and although it might irrelevant for where I'm at right now, I was wondering if SO could share some common pitfalls or good design practices.

One important thing to remember is that static methods cannot be overridden by a subclass. References to a static method in your code essentially tie it to that implementation. When using instance methods, behavior can be varied based on the type of the instance. You can take advantage of polymorphism. Static methods are more suited to utilitarian types of operations where the behavior is set in stone. Things like base 64 encoding or calculating a checksum for instance.

I don't think any of the answers get to the heart of the OO reason of when to choose one or the other. Sure, use an instance method when you need to deal with instance members, but you could make all of your members public and then code a static method that takes in an instance of the class as an argument. Hello C.
You need to think about the messages the object you are designing responds to. Those will always be your instance methods. If you think about your objects this way, you'll almost never have static methods. Static members are ok in certain circumstances.
Notable exceptions that come to mind are the Factory Method and Singleton (use sparingly) patterns. Exercise caution when you are tempted to write a "helper" class, for from there, it is a slippery slope into procedural programming.

If the implementation of a method can be expressed completely in terms of the public interface (without downcasting) of your class, then it may be a good candidate for a static "utility" method. This allows you to maintain a minimal interface while still providing the convenience methods that clients of the code may use a lot. As Scott Meyers explains, this approach encourages encapsulation by minimizing the amount of code impacted by a change to the internal implementation of a class. Here's another interesting article by Herb Sutter picking apart std::basic_string deciding what methods should be members and what shouldn't.
In a language like Java or C++, I'll admit that the static methods make the code less elegant so there's still a tradeoff. In C#, extension methods can give you the best of both worlds.
If the operation will need to be overridden by a sub-class for some reason, then of course it must be an instance method in which case you'll need to think about all the factors that go into designing a class for inheritance.

My rule of thumb is: if the method performs anything related to a specific instance of a class, regardless of whether it needs to use class instance variables. If you can consider a situation where you might need to use a certain method without necessarily referring to an instance of the class, then the method should definitely be static (class). If this method also happens to need to make use of instance variables in certain cases, then it is probably best to create a separate instance method that calls the static method and passes the instance variables. Performance-wise I believe there is negligible difference (at least in .NET, though I would imagine it would be very similar for Java).

If you keep state ( a value ) of an object and the method is used to access, or modify the state then you should use an instance method.
Even if the method does not alter the state ( an utility function ) I would recommend you to use an instance method. Mostly because this way you can have a subclass that perform a different action.
For the rest you could use an static method.
:)

This thread looks relevant: Method can be made static, but should it? The difference's between C# and Java won't impact its relevance (I think).

Your default choice should be an instance method.

If it uses an instance variable it must be an instance method.
If not, it's up to you, but if you find yourself with a lot of static methods and/or static non-final variables, you probably want to extract all the static stuff into a new class instance. (A bunch of static methods and members is a singleton, but a really annoying one, having a real singleton object would be better--a regular object that there happens to be one of, the best!).

Basically, the rule of thumb is if it uses any data specific to the object, instance. So Math.max is static but BigInteger.bitCount() is instance. It obviously gets more complicated as your domain model does, and there are border-line cases, but the general idea is simple.

I would use an instance method by default. The advantage is that behavior can be overridden in a subclass or if you are coding against interfaces, an alternative implementation of the collaborator can be used. This is really useful for flexibility in testing code.
Static references are baked into your implementation and can't change. I find static useful for short utility methods. If the contents of your static method are very large, you may want to think about breaking responsibility into one or more separate objects and letting those collaborate with the client code as object instances.

IMHO, if you can make it a static method (without having to change it structure) then make it a static method. It is faster, and simpler.
If you know you will want to override the method, I suggest you write a unit test where you actually do this and so it is no longer appropriate to make it static. If that sounds like too much hard work, then don't make it an instance method.
Generally, You shouldn't add functionality as soon as you imagine a use one day (that way madness lies), you should only add functionality you know you actually need.
For a longer explanation...
http://en.wikipedia.org/wiki/You_Ain%27t_Gonna_Need_It
http://c2.com/xp/YouArentGonnaNeedIt.html

the issue with static methods is that you are breaking one of the core Object Oriented principles as you are coupled to an implementation. You want to support the open close principle and have your class implement an interface that describes the dependency (in a behavioral abstract sense) and then have your classes depend on that innterface. Much easier to extend after that point going forward . ..

My static methods are always one of the following:
Private "helper" methods that evaluate a formula useful only to that class.
Factory methods (Foo.getInstance() etc.)
In a "utility" class that is final, has a private constructor and contains nothing other than public static methods (e.g. com.google.common.collect.Maps)
I will not make a method static just because it does not refer to any instance variables.

Java: Rationale of the Object class not being declared abstract

Why wasn't the java.lang.Object class declared to be abstract ?
Surely for an Object to be useful it needs added state or behaviour, an Object class is an abstraction, and as such it should have been declared abstract ... why did they choose not to ?

An Object is useful even if it does not have any state or behaviour specific to it.
One example would be its use as a generic guard that's used for synchronization:
public class Example {
private final Object o = new Object();
public void doSomething() {
synchronized (o) {
// do possibly dangerous stuff
}
}
}
While this class is a bit simple in its implementation (it isn't evident here why it's useful to have an explicit object, you could just declare the method synchronized) there are several cases where this is really useful.

Ande, I think you are approaching this -- pun NOT intended -- with an unnecessary degree of abstraction. I think this (IMHO) unnecessary level of abstraction is what is causing the "problem" here. You are perhaps approaching this from a mathematical theoretical approach, where many of us are approaching this from a "programmer trying to solve problems" approach. I believe this difference in approach is causing the disagreements.
When programmers look at practicalities and how to actually implement something, there are a number of times when you need some totally arbitrary Object whose actual instance is totally irrelevant. It just cannot be null. The example I gave in a comment to another post is the implementation of *Set (* == Hash or Concurrent or type of choice), which is commonly done by using a backing *Map and using the Map keys as the Set. You often cannot use null as the Map value, so what is commonly done is to use a static Object instance as the value, which will be ignored and never used. However, some non-null placeholder is needed.
Another common use is with the synchronized keyword where some Object is needed to synchronize on, and you want to ensure that your synchronizing item is totally private to avoid deadlock where different classes are unintentionally synchronizing on the same lock. A very common idiom is to allocate a private final Object to use in a class as the lock. To be fair, as of Java 5 and java.util.concurrent.locks.Lock and related additions, this idiom is measurably less applicable.
Historically, it has been quite useful in Java to have Object be instantiable. You could make a good point that with small changes in design or with small API changes, this would no longer be necessary. You're probably correct in this.
And yes, the API could have provided a Placeholder class that extends Object without adding anything at all, to be used as a placeholder for the purposes described above. But -- if you're extending Object but adding nothing, what is the value in the class other than allowing Object to be abstract? Mathematically, theoretically, perhaps one could find a value, but pragmatically, what value would it add to do this?
There are times in programming where you need an object, some object, any concrete object that is not null, something that you can compare via == and/or .equals(), but you just don't need any other feature to this object. It exists only to serve as a unique identifier and otherwise does absolutely nothing. Object satisfies this role perfectly and (IMHO) very cleanly.
I would guess that this is part of the reason why Object was not declared abstract: It is directly useful for it not to be.

Does Object specify methods that classes extending it must implement in order to be useful? No, and therefor it needn't be abstract.
The concept of a class being abstract has a well defined meaning that does not apply to Object.

You can instantiate Object for synchronization locks:
Object lock = new Object();
void someMethod() {
//safe stuff
synchronized(lock) {
//some code avoiding race condition
}
}
void someOtherMethod() {
//safe code
synchronized(lock) {
//some other stuff avoiding race condition
}
}

I am not sure this is the reason, but it allows (or allowed, as there are now better ways of doing it) for an Object to be used as a lock:
Object lock = new Object();
....
synchronized(lock)
{
}

How is Object any more offensive than null?
It makes a good place marker (as good as null anyway).
Also, I don't think it would be good design to make an object abstract without an abstract method that needs to go on it.
I'm not saying null is the best thing since sliced bread--I read an article the other day by the "Inventor" discussing the cost/value of having the concept of null... (I didn't even think null was inventable! I guess someone somewhere could claim he invented zero..) just that being able to instantiate Object is no worse than being able to pass null.

You never know when you might want to use a simple Object as a placeholder. Think of it as like having a zero in a numerical system (and null doesn't work for this, since null represents the absence of data).

There should be a reason to make a class abstract. One is to prevent clients from instantiating the class and force them into using only subclasses (for whatever reasons). Another is if you wish to use it as an interface by providing abstract methods, which subclasses must implement. Probably, the designers og Java saw no such reasons, so java.lang.Object remains concrete.

As always, Guava comes to help: with http://docs.guava-libraries.googlecode.com/git/javadoc/com/google/common/base/Optional.html
Stuff here can be used to kill nulls / Object instances for "a not-null placeholder" from the code.
There are entirely seperated questions here:
why did not they make Object abstract?
how much disaster comes after if they decide to make it abstract in a future release?

I'll just throw in another reason that I've found Object to useful to instantiate on its own. I have a pool of objects I've created that has a number of slots. Those slots can contain any of a number of objects, all that inherit from an abstract class. But what do I put in the pool to represent "empty". I could use null, but for my purpose, it made more sense to insure that there was always some object in each slot. I can't instantiate the abstract class to put in there, and I wouldn't have wanted to. So I could have created a concrete subclass of my abstract class to represent "not a useful foo", but that seemed unnecessary when using an instance of Object was just as good..in fact better, as it clearly says that what's in the slot has no functionality. So when I initialize my pool, I do so by creating an Object to assign to each slot as the initial condition of the pool.
I agree that it might have made sense for the original Java crew to have defined a Placeholder object as a concrete subclass of Object, and then made Object abstract, but it doesn't rub me wrong at all that they went the way they did. I would then have used Placeholder in place of Object.

How much code should one put in a constructor?

I was thinking how much code one should put in constructors in Java? I mean, very often you make helper methods, which you invoke in a constructor, but sometimes there are some longer initialization things, for example for a program, which reads from a file, or user interfaces, or other programs, in which you don't initialize only the instance variables, in which the constructor may get longer (if you don't use helper methods). I have something in mind that the constructors should generally be short and concise, shouldn't they? Are there exceptions to this?

If you go by the SOLID principles, each class should have one reason to change (i.e. do one thing). Therefore a constructor would normally not be reading a file, but you would have a separate class that builds the objects from the file.

Take a look at this SO question. Even though the other one is for C++, the concepts are still very similar.

As little as is needed to complete the initialization of the object.
If you can talk about a portion (5 or so lines is my guideline) of your constructor as a chunk of logic or a specific process, it's probably best to split it into a separate method for clarity and organizational purposes.
But to each his own.

My customary practice is that if all the constructor has to do is set some fields on an object, it can be arbitrarily long. If it gets too long, it means that the class design is broken anyway, or data need to be packaged in some more complex structures.
If, on the other hand, the input data need some more complex processing before initializing the class fields, I tend to give the constructor the processed data and move the processing to a static factory method.

Constructors should be just long enough, but no longer =)
If you are defining multiple overloaded constructors, don't duplicate code; instead, consolidate functionality into one of them for improved clarity and ease of maintenance.

As Knuth said, "Premature optimization is the root of all evil."
How much should you put in the consructor? Everything you need to. This is the "eager" approach. When--and only when--performance becomes an issue do you consider optimizing it (to the "lazy" or "over-eager" approaches).

Constructors should create the most minimal, generic instance of your object. How generic? Choose the test cases that every instance or object that inherits from the class must pass to be valid - even if "valid" only means fails gracefully (programatically generated exception).
Wikipedia has a good description :
http://en.wikipedia.org/wiki/Constructor_(computer_science)
A Valid object is the goal of the constructor, valid not necessarily useful - that can be done in an initialization method.

Your class may need to be initialized to a certain state, before any useful work can be done with it.
Consider this.
public class CustomerRecord
{
private Date dateOfBirth;
public CustomerRecord()
{
dateOfBirth = new Date();
}
public int getYearOfBirth()
{
Calendar calendar = Calendar.getInstance();
calendar.setTime(dateOfBirth);
return calendar.get(Calendar.YEAR);
}
}
Now if you don't initialize the dateOfBirth member varialble, any subsequent invocation of getYearOfBirth(), will result in a NullPointerException.
So the bare minimum initialization which may involve
Assigning of values.
Invoking helper functions.
to ensure that the class behaves correctly when it's members are invoked later on, is all that needs to be done.

Constructor is like an Application Setup Wizard where you do only configuration. If the Instance is ready to take any (possible) Action on itself then Constructor doing well.

Develop Reference

Java is a programming language and computing platform first released by Sun Microsystems in 1995.

Breaking legacy singletons - java

Related

Initializing in Constructor

Threads and synchronisation

Is there a rule of thumb for when to code a static method vs an instance method?

Java: Rationale of the Object class not being declared abstract

How much code should one put in a constructor?

Categories

Resources