How much code should one put in a constructor? - java

How much code should one put in a constructor in Java? Very often you write helper methods and invoke them from the constructor, but some initialization is longer: a program that reads from a file, a user interface, or any class that does more than set instance variables. In those cases the constructor can get long if you don't use helper methods. My impression is that constructors should generally be short and concise. Is that right, and are there exceptions?

If you go by the SOLID principles, each class should have one reason to change (i.e. do one thing). Therefore a constructor would normally not be reading a file, but you would have a separate class that builds the objects from the file.

Take a look at this SO question. Even though the other one is for C++, the concepts are still very similar.

As little as is needed to complete the initialization of the object.
If you can talk about a portion (5 or so lines is my guideline) of your constructor as a chunk of logic or a specific process, it's probably best to split it into a separate method for clarity and organizational purposes.
But to each his own.

My customary practice is that if all the constructor has to do is set some fields on an object, it can be arbitrarily long. If it gets too long, it means that the class design is broken anyway, or data need to be packaged in some more complex structures.
If, on the other hand, the input data need some more complex processing before initializing the class fields, I tend to give the constructor the processed data and move the processing to a static factory method.
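A minimal sketch of that split, assuming a hypothetical Temperature class whose input arrives as text such as "21.5C" or "212F". The class, its methods, and the input format are illustrative and not from the answer above.

```java
// Hypothetical example: the constructor only assigns, the factory does the processing.
final class Temperature {
    private final double celsius;

    // The constructor receives data that has already been processed.
    private Temperature(double celsius) {
        this.celsius = celsius;
    }

    // Static factory method: parsing and unit conversion happen here.
    static Temperature parse(String text) {
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            throw new IllegalArgumentException("empty input");
        }
        char unit = Character.toUpperCase(trimmed.charAt(trimmed.length() - 1));
        double value = Double.parseDouble(trimmed.substring(0, trimmed.length() - 1));
        switch (unit) {
            case 'C': return new Temperature(value);
            case 'F': return new Temperature((value - 32.0) * 5.0 / 9.0);
            default: throw new IllegalArgumentException("unknown unit: " + unit);
        }
    }

    double getCelsius() {
        return celsius;
    }

    public static void main(String[] args) {
        System.out.println(Temperature.parse("21.5C").getCelsius());
    }
}
```

The constructor stays trivial, and any failure caused by bad input is raised by the factory before an object exists.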

Constructors should be just long enough, but no longer =)
If you are defining multiple overloaded constructors, don't duplicate code; instead, consolidate functionality into one of them for improved clarity and ease of maintenance.
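One way to consolidate overloaded constructors is to chain them with this(...). The sketch below assumes a hypothetical Rectangle class; only the three-line pattern matters, not the class itself.

```java
// Hypothetical class: every overload funnels into one constructor.
final class Rectangle {
    private final int width;
    private final int height;

    // Default: a 1x1 rectangle.
    Rectangle() {
        this(1, 1);
    }

    // Square: delegates instead of repeating the checks.
    Rectangle(int side) {
        this(side, side);
    }

    // The only constructor that validates and assigns.
    Rectangle(int width, int height) {
        if (width <= 0 || height <= 0) {
            throw new IllegalArgumentException("sides must be positive");
        }
        this.width = width;
        this.height = height;
    }

    int area() {
        return width * height;
    }

    public static void main(String[] args) {
        System.out.println(new Rectangle(3).area()); // prints 9
    }
}
```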

As Knuth said, "Premature optimization is the root of all evil."
How much should you put in the constructor? Everything you need to. This is the "eager" approach. When--and only when--performance becomes an issue do you consider optimizing it (to the "lazy" or "over-eager" approaches).

Constructors should create the most minimal, generic instance of your object. How generic? Choose the test cases that every instance, or every object that inherits from the class, must pass to be valid, even if "valid" only means that it fails gracefully (with a programmatically generated exception).
Wikipedia has a good description:
http://en.wikipedia.org/wiki/Constructor_(computer_science)
A valid object is the goal of the constructor: valid, not necessarily useful. Making it useful can be done in an initialization method.

Your class may need to be initialized to a certain state, before any useful work can be done with it.
Consider this.
import java.util.Calendar;
import java.util.Date;
public class CustomerRecord
{
    private Date dateOfBirth;
    public CustomerRecord()
    {
        // Without this assignment dateOfBirth would stay null.
        dateOfBirth = new Date();
    }
    public int getYearOfBirth()
    {
        Calendar calendar = Calendar.getInstance();
        calendar.setTime(dateOfBirth);
        return calendar.get(Calendar.YEAR);
    }
}
Now if you don't initialize the dateOfBirth member variable, any subsequent invocation of getYearOfBirth() will result in a NullPointerException.
So the bare minimum initialization, which may involve
assigning values, and
invoking helper functions,
to ensure that the class behaves correctly when its members are invoked later on, is all that needs to be done.

A constructor is like an application setup wizard, where you only do configuration. If the instance is ready to take any (possible) action on itself, then the constructor is doing well.

Related

Is passing 'this' in a method call accepted practice in java

Is it good/bad/acceptable practice to pass the current object in a method call. As in:
public class Bar {
    public Bar() {}
    public void foo(Baz baz) {
        // modify some values of baz
    }
}
public class Baz {
    // constructor omitted
    public void method() {
        Bar bar = new Bar();
        bar.foo(this);
    }
}
Specifically, is the line bar.foo(this) acceptable?
There's nothing wrong with that. What is NOT a good practice is to do the same inside constructors, because you would give a reference to a not-yet-completely-initialized object.
There is a sort of similar post here: Java leaking this in constructor
where they give an explanation of why the latter is a bad practice.
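To make the constructor case concrete, here is a small sketch. All types in it (Named, Registry, LeakyWidget, SafeWidget) are hypothetical and exist only for illustration. The leaky version hands out this before its field is assigned, so the registry observes null; the safe version publishes the reference only after construction has finished.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical types, for illustration only.
interface Named {
    String getName();
}

final class Registry {
    final List<String> seenNames = new ArrayList<>();

    // Reads the object immediately, as a listener list or cache might.
    void register(Named item) {
        seenNames.add(String.valueOf(item.getName()));
    }
}

final class LeakyWidget implements Named {
    private final String name;

    LeakyWidget(Registry registry, String name) {
        registry.register(this); // 'this' escapes before name is assigned
        this.name = name;
    }

    @Override
    public String getName() {
        return name;
    }
}

final class SafeWidget implements Named {
    private final String name;

    private SafeWidget(String name) {
        this.name = name;
    }

    // Factory: construct fully first, then publish the reference.
    static SafeWidget createRegistered(Registry registry, String name) {
        SafeWidget widget = new SafeWidget(name);
        registry.register(widget);
        return widget;
    }

    @Override
    public String getName() {
        return name;
    }
}
```

With one Registry, constructing a LeakyWidget named "gear" records the string "null", while SafeWidget.createRegistered records the real name.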
There's no reason not to use it, this is the current instance and it's perfectly legitimate to use. In fact there's often no clean way to omit it.
So use it.
As it's hard to convince anyone that it's acceptable without an example (a negative answer to such a question is always easier to argue), I just opened one of the most common java.lang classes, String, and of course I found instances of this use, for example
// Argument is a String
if (cs.equals(this))
    return true;
Look for (this in big "accepted" projects, you won't fail to find it.
Yes, but you should be careful about two things
Passing this when the object has not been constructed yet (i.e. in its constructor)
Passing this to a long-living object, that will keep the reference alive and will prevent the this object from being garbage collected.
It's perfectly normal and perfectly acceptable.
this stands for the current object. What you are doing is syntactically correct, but I don't see a need for this if you are calling the method in the same class.
It is bad practice to pass the current object in a method call if there are less complex alternatives to achieve the same behaviour.
By definition, a bidirectional association is created as soon as this is passed from one object to another.
To quote Refactoring, by Martin Fowler:
Change Bidirectional Association to Unidirectional (200)
Bidirectional associations are useful, but they carry a price. The
price is the added complexity of maintaining the two-way links and
ensuring that objects are properly created and removed. Bidirectional
associations are not natural for many programmers, so they often are a
source of errors
...
You should use bidirectional associations when you need to but not
when you don’t. As soon as you see a bidirectional association is no
longer pulling its weight, drop the unnecessary end.
So, theoretically, we should be hearing alarm bells when we find we need to pass this and try really hard to think of other ways to solve the problem at hand. There are, of course, times when, at last resort, it makes sense to do it.
Also it is often necessary to corrupt your design temporarily, doing 'bad practice things', during a longer term refactoring of your code for an overall improvement. (One step back, two steps forward).
In practice I have found my code has improved massively by avoiding bidirectional links like the plague.
Yes, you can use it. It's common in programming to pass this. There are pros and cons to doing so, but it is not hazardous.
Just to add one more example where passing this is correct and follows good design: Visitor pattern. In Visitor design pattern, method accept(Visitor v) is typically implemented in a way it just calls v.visit(this).
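A compact sketch of the Visitor case, assuming a hypothetical expression tree (Expr, Num, Add) and an Evaluator visitor. None of these names come from the answer; the point is only that each accept method calls v.visit(this).

```java
// Hypothetical expression tree used to illustrate accept(v) -> v.visit(this).
interface Expr {
    <R> R accept(ExprVisitor<R> visitor);
}

interface ExprVisitor<R> {
    R visit(Num num);
    R visit(Add add);
}

final class Num implements Expr {
    final int value;

    Num(int value) {
        this.value = value;
    }

    @Override
    public <R> R accept(ExprVisitor<R> visitor) {
        return visitor.visit(this); // passing this selects the visit(Num) overload
    }
}

final class Add implements Expr {
    final Expr left;
    final Expr right;

    Add(Expr left, Expr right) {
        this.left = left;
        this.right = right;
    }

    @Override
    public <R> R accept(ExprVisitor<R> visitor) {
        return visitor.visit(this); // passing this selects the visit(Add) overload
    }
}

// One possible visitor: evaluates the tree to an integer.
final class Evaluator implements ExprVisitor<Integer> {
    @Override
    public Integer visit(Num num) {
        return num.value;
    }

    @Override
    public Integer visit(Add add) {
        return add.left.accept(this) + add.right.accept(this);
    }

    public static void main(String[] args) {
        Expr expr = new Add(new Num(2), new Add(new Num(3), new Num(4)));
        System.out.println(expr.accept(new Evaluator())); // prints 9
    }
}
```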
Acceptable
Snippet from Oracle JAVA docs:
Within an instance method or a constructor, this is a reference to the
current object — the object whose method or constructor is being
called. You can refer to any member of the current object from within
an instance method or a constructor by using this.
Using this with a Field
The most common reason for using the this keyword is because a field
is shadowed by a method or constructor parameter.
Everything in Java is passed by value. But objects are NEVER passed to the method!
When Java passes an object to a method, it first makes a copy of a reference to the object, not a copy of the object itself. Hence passing this is perfectly valid in Java, and a very common usage.
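The difference between copying the reference and copying the object can be checked with a short sketch. Counter and PassByValueDemo are hypothetical names chosen for this illustration.

```java
final class Counter {
    int value;
}

final class PassByValueDemo {
    // The copied reference still points at the caller's object, so this change is visible.
    static void increment(Counter counter) {
        counter.value++;
    }

    // Reassigning the parameter only changes the local copy of the reference.
    static void replace(Counter counter) {
        counter = new Counter();
        counter.value = 100;
    }

    public static void main(String[] args) {
        Counter c = new Counter();
        increment(c);
        replace(c);
        System.out.println(c.value); // prints 1
    }
}
```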

Initializing in Constructor

I have seen lots of code where the authors define an init() function for a class and call it first thing after creating the instance.
Is there any harm or limitation of doing all the initializations in Constructor?
Usually for maintainability and to reduce code size when multiple constructors call the same initialization code:
class stuff
{
public:
    stuff(int val1) { init(); setVal = val1; }
    stuff() { init(); setVal = 0; }
    void init() { startZero = 0; }
protected:
    int setVal;
    int startZero;
};
Just the opposite: it's usually better to put all of the initializations
in the constructor. In C++, the "best" policy is usually to put the
initializations in an initializer list, so that the members are
directly constructed with the correct values, rather than default
constructed, then assigned. In Java, you want to avoid a function
(unless it is private or final), since dynamic resolution can put
you into an object which hasn't been initialized.
About the only reason you would use an init() function is because you
have a lot of constructors with significant commonality. (In the case
of C++, you'd still have to weigh the difference between default
construction, then assignment vs. immediate construction with the
correct value.)
In Java, there are good reasons for keeping constructors short and moving initialization logic into an init() method:
constructors are not inherited, so any subclasses have to either reimplement them or provide stubs that chain with super
you shouldn't call overridable methods in a constructor, since you can find your object in an inconsistent state where it is only partially initialized
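The second point can be reproduced in a few lines. Base and Derived are hypothetical classes written for this sketch: the superclass constructor calls an overridable method, which then runs in the subclass before the subclass's own field initializers have executed.

```java
class Base {
    Base() {
        init(); // overridable call: runs before any subclass field is initialized
    }

    void init() {
        // default does nothing
    }
}

class Derived extends Base {
    private String label = "ready"; // assigned only after Base() returns
    String seenDuringInit;          // deliberately no initializer, so the value set in init() survives

    @Override
    void init() {
        seenDuringInit = String.valueOf(label); // label is still null here
    }

    String getLabel() {
        return label;
    }

    public static void main(String[] args) {
        Derived d = new Derived();
        System.out.println(d.seenDuringInit); // prints null
        System.out.println(d.getLabel());     // prints ready
    }
}
```

Making init() private or final in Base removes the problem, because the call can then no longer be dispatched to a subclass.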
It is a design pattern that has to do with exceptions thrown from inside an object constructor.
In C++, if an exception is thrown from inside an object constructor, then that object is considered not constructed at all by the language runtime.
As a consequence the object destructor won't be called when the object goes out of scope.
This means that if you had code like this inside your constructor:
int *p1 = new int;
int *p2 = new int;
and code like this in your destructor:
delete p1;
delete p2;
and the initialization of p2 inside the constructor fails due to no more memory available, then a bad_alloc exception is thrown by the new operator.
At that point your object is not fully constructed, even if the memory for p1 has been allocated correctly.
If this happens the destructor won't be called and you are leaking p1.
So the more code you place inside the constructor, the more likely an error will occur leading to potential memory leaks.
That's the main reason for that design choice, which isn't too mad after all.
More on this on Herb Sutter's blog: Constructors exceptions in C++
It's a design choice. You want to keep your constructor as simple as possible, so it's easy to read what it's doing. That's why you'll often see constructors that call other methods or functions, depending on the language. It allows the programmer to read and follow the logic without getting lost in the code.
As constructors go, you can quickly run into a scenario where you have a tremendous sequence of events you wish to trigger. Good design dictates that you break down these sequences into simpler methods, again, to make it more readable and easier to maintain in the future.
So, no, there's no harm or limitation, it's a design preference. If you need all that initialisation done in the constructor then do it there. If you only need it done later, then put it in a method you call later. Either way, it's entirely up to you and there are no hard or fast rules around it.
If you have multiple objects of the same class or different classes that need to be initialized with pointers to one another, so that there is at least one cycle of pointer dependency, you can't do all the initialization in constructors alone. (How are you going to construct the first object with a pointer/reference to another object when the other object hasn't been created yet?)
One situation where this can easily happen is in an event simulation system where different components interact, so that each component needs pointers to other components.
Since it's impossible to do all the initialization in the constructors, then at least some of the initialization has to happen in init functions. That leads to the question: Which parts should be done in init functions? The flexible choice seems to be to do all the pointer initialization in init functions. Then, you can construct the objects in any order since when constructing a given object, you don't have to be concerned about whether you already have the necessary pointers to the other objects it needs to know about.
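The answer above is phrased in terms of pointers, but the same cycle appears with Java references. A minimal sketch, assuming a hypothetical SimComponent class in which connect() plays the role of the init function:

```java
final class SimComponent {
    private final String name;
    private SimComponent peer; // wired later; cannot be known at construction time

    SimComponent(String name) {
        this.name = name;
    }

    // Plays the role of the init function: called once every object exists.
    void connect(SimComponent peer) {
        this.peer = peer;
    }

    String describe() {
        return name + " -> " + (peer == null ? "(unconnected)" : peer.name);
    }

    public static void main(String[] args) {
        SimComponent a = new SimComponent("a");
        SimComponent b = new SimComponent("b");
        a.connect(b);
        b.connect(a);
        System.out.println(a.describe()); // prints a -> b
    }
}
```

Because no constructor needs the other object, the two components can be created in either order before being wired together.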

No-Parameter Constructor v/s Constructor with params

Which of the following is better, or to be preferred?
MyObject obj = new MyObject();
obj.setValue1("1");
obj.setValue2("2");
obj.setValue3("3");
or
MyObject obj = new MyObject("1", "2", "3");
(I assume you're talking about the design of your own classes, rather than how to use other already-designed classes.)
Neither is always "better," it depends on the nature of the object and (to an extent) on your preferred style.
If an object cannot have a meaningful state without some external information, then requiring that information in the constructor makes sense, because then you can't create instances with an invalid state.
However, having constructors that require as little information as possible is useful in terms of making the class easy to use in a variety of situations. If the class is such that a zero-arguments constructor is feasible and doesn't complicate the class, it's great in terms of supporting various use-cases, including cases where the instance is being built as part of parsing some other structure (JSON, XML, etc.).
There is also a third option that builds on the use of fluent interfaces:
MyObject obj = new MyObject().setValue1("1").setValue2("2").setValue3("3");
I personally like this approach, but if the number of parameters is small and known at the time of construction AND the number of possible parameter combinations is small, then I would take the route of parameters on the constructor. I think most would agree that 12 constructor overloads are an eyesore.
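For the chained form to compile, each setter has to return the object itself. A minimal sketch, assuming a hypothetical ConnectionSettings class with two illustrative properties:

```java
// Hypothetical class showing setters that return this so calls can be chained.
final class ConnectionSettings {
    private String host = "localhost";
    private int port = 80;

    ConnectionSettings setHost(String host) {
        this.host = host;
        return this;
    }

    ConnectionSettings setPort(int port) {
        this.port = port;
        return this;
    }

    String describe() {
        return host + ":" + port;
    }

    public static void main(String[] args) {
        ConnectionSettings settings = new ConnectionSettings().setHost("example.org").setPort(8080);
        System.out.println(settings.describe()); // prints example.org:8080
    }
}
```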
Depends on whether you know the values at the time of object construction.
If yes, then use the constructor version; if not, you will have to use the other version.
Of course, initialization through the constructor is faster because it involves just one function call instead of three setter calls, and it is also the more logical way.
It always performs better to pass the values to the constructor if you already know them.
Then there is my preferred alternative. Thing is the name of some interface. It creates an anonymous class.
Thing createThing(final int val1, final int val2, final int val3)
{
    return new Thing()
    {
        // can use val1, val2, val3
    };
}
It all depends on the application.
Calling a parameterized constructor is a good idea if you already know at compile time which values your variables should get, rather than calling setters: instead of making three setter calls, you just pass the values to the constructor itself.
But if you don't know the values at compile time, how can you call the parameterized constructor?
For initialization, though, it is better to call the parameterized constructor.
There is a semantic difference. In (1) you are instantiating an object with an initial state. In (2) you are changing the state of an existing instance (3 times). It's a small difference but may become very important in more complex systems (especially if you rely on various JavaBean conventions). But still, neither one is wrong or better.

the use of private keyword

I am new to programming and am learning Java now. There is something I am not really sure about: the use of private. Why do programmers declare a variable private and then write a getter and setter to access it? Why not make everything public, since we use it anyway?
public class BadOO {
    public int size;
    public int weight;
    ...
}
public class ExploitBadOO {
    public static void main(String[] args) {
        BadOO b = new BadOO();
        b.size = -5; // Legal but bad!!
    }
}
I found some code like this, and I saw the comment "Legal but bad". I don't understand why; please explain.
The most important reason is to hide the internal implementation details of your class. If you prevent programmers from relying on those details, you can safely modify the implementation without worrying that you will break existing code that uses the class.
So by declaring the field private you prevent a user from accessing the variable directly. By providing getters and setters you control exactly how a user may access the variable.
The main reason to not just make the variable public in the first place is that if you did make it public, you would create more headaches later on.
For example, one programmer writes public getters and setters around a private member variable. Three months later, he needs to verify that the variable is never "set" to null. He adds in a check in the "setFoo(...)" method, and all attempts to set the variable will then be checked for "setting it to null". Case closed, and with little effort.
Another programmer decides that putting public getters and setters around a private member variable violates the spirit of encapsulation; he sees the methods as futile and just makes the member variable public. Perhaps this gains a bit of a performance boost, or perhaps the programmer just wants to "write it as it is used". Three months later, he needs to verify that the variable is never "set" to null. He scans every access to the variable, effectively searching through the entire code base, including all code that might be accessing the variable via reflection. This includes all 3rd party libraries which have extended his code, and all newly written modules which used his code after it was written. He then modifies every call he finds to guarantee that the variable is never set to null. The case is never closed, because he can't effectively find all accesses to the exposed member, nor does he have access to all 3rd party source code. With imperfect knowledge of newly written modules, the survey is guaranteed to be incomplete. Finally, he has no control over the future code which may access the public member, and that code may contain lines which set the member variable to null.
Of course the second programmer could then break all existing code by putting "get" and "set" methods around the variable and making it private, but hey, he could have done that three months earlier and saved himself the explanation of why he needed to break everyone else's code.
Call it what you will, but putting public "get" and "set" methods around a private member variable is defensive programming which has been brought about by many years (i.e. decades) of experience.
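The first programmer's fix can be sketched as follows. Account and its owner field are hypothetical stand-ins for the "Foo" in the story; the null check lives in one place and every caller is covered.

```java
import java.util.Objects;

// Hypothetical class standing in for the one described in the story.
final class Account {
    private String owner;

    Account(String owner) {
        setOwner(owner);
    }

    String getOwner() {
        return owner;
    }

    // The check added "three months later": callers did not have to change.
    void setOwner(String owner) {
        this.owner = Objects.requireNonNull(owner, "owner must not be null");
    }

    public static void main(String[] args) {
        Account account = new Account("ann");
        account.setOwner("bob");
        System.out.println(account.getOwner()); // prints bob
    }
}
```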
Anything public in your class is a contract with the users of the class. As you modify the class, you must maintain the contract. You can add to the contract (new methods, variables, etc.), but you can't remove from it. Ideally you want that contract to be as small as possible. It is useful to make everything private that you can. If you need direct access from other classes in the same package, use package-private (default) access; if subclasses need it too, make it protected. Only make those things public which are required by your users.
Exposing variables means that you are contracting, forever, to have that variable and allow users to modify it. As discussed above, you may find you need to invoke behaviour when a variable is accessed. This can be done if you only contract for the getter and setter methods.
Many of the early Java classes have contracts which require them to be thread safe. This adds significant overhead in cases where only one thread can access the instance. Newer releases have new classes which duplicate or enhance the functionality but drop the synchronization. Hence StringBuilder was added and in most cases should be used instead of StringBuffer.
It's considered bad mainly because you lose control over who can change the value and what happens when the value changes.
In tiny application written by you for you it won't seem that important but as you start developing for larger and larger applications having control over who changes what and when becomes critical.
Imagine, from your example above, that you publish the library as is and other people use it. Then you decide you want to calculate another value in your bad class whenever the size changes. Suddenly the BadOO class has no way of knowing that size changed, and you can't change the field because other people rely on it.
Instead if you had a set method you could extend it to say
void setSize(int newSize)
{
    size = newSize;
    doCalculation();
}
You can extend the functionality without breaking other people's reliance on you.
I highly recommend the book Effective Java, it contains a lot of useful information about how to write better programs in Java.
Your question is addressed in items 13 and 14 of that book:
Item 13: Minimize the accessibility of classes and members
Item 14: In public classes, use accessor methods, not public fields
You shouldn't allow implementations to alter your records directly. Providing getters and setters means that you have exact control over how variables get assigned or what gets returned, etc. The same thing goes for the code in your constructor. What if the setter does something special when you assign a value to size? This won't happen if you assign it directly.
It's a common pet peeve of many programmers: Java code with private fields and public accessors and mutators. The effect is as you say, those fields might as well have been public.
There are programming languages that go to the other extreme, too. Look at Python; just about everything is public, to some extent.
These are different coding practices and a common thing programmers deal with every day. But in Java, here's my rule of thumb:
If the field is used purely as an attribute, both readable and writeable by anyone, make it public.
If the field is used internally only, use private. Provide a getter if you want read access, and provide a setter if you want write access.
There is a special case: sometimes, you want to process extra data when an attribute is accessed. In that case, you would provide both getters and setters, but inside these property functions, you would do more than just return - for example, if you want to track the number of times an attribute is read by other programs during an object's life time.
That's just a brief overview on access levels. If you're interested, also read up on protected access.
This is indeed used to hide the internal implementation. It also lets you attach an extra bit of logic to your variables. Say you need to make sure that the value passed for a variable is never 0/null: you can put this logic in the set method. In the same way you can add logic when getting the value. Say you have an object variable which is not initialised and you are accessing it: in this case you can put a null check in the getter and always return an object.
C# programmers use this just as much, or maybe more frequently than I see in Java. C# calls them properties, where in Java they are accessors/mutators.
For me it makes sense to have getter and setter methods to encapsulate the classes so that no class can change the instance variables of another class.
Okay. We are talking about objects here, real-world objects. If their attributes are not private, the user of your class is allowed to change them. What if, for the radius attribute of a Circle class, the user sets the value to 0? It doesn't make sense for a Circle to exist with a radius of 0. You can avoid such mistakes if you make your attributes private and provide a setter method that throws an Exception/Error telling the user that it is not allowed to create a Circle with a radius of 0. Basically, the objects that are created out of your class are meant to exist as you wished them to exist. This is one of the ways to achieve it.
As stated earlier, the reason for making a variable private is to hide it from the outside. But if you make a getter AND a setter then you may as well make the variable itself public. If you find yourself later in a position that you made the wrong choice, then you must refactor your code from using the public variable into using the getter/setter which may not be a problem. But it can be a problem if other code, which you do not control, starts depending on your code. Then such a refactoring will break the other code. If you use getters and setters from the start you will reduce that risk in exchange for a little effort. So it depends on your situation.
It depends on who accesses these public variables. Most likely it is only people inside your company/team. Then it's trivial to refactor them into getters/setters when necessary. I say in this case it's better to leave the variables public, unless you are forced to follow the JavaBean convention.
If you are writing a framework or a library intended for the public, then you shouldn't expose variables. It's impossible for you to change them into getters/setters later.
But the 2nd case is rarer than the first; people apply extremely unreasonable assumptions when it comes to software engineering, as if they were not writing code but carving it in stone, and as if the whole world were watching while they code. In reality, nobody will ever read your code except yourself.

Is there a rule of thumb for when to code a static method vs an instance method?

I'm learning Java (and OOP) and although it might be irrelevant for where I'm at right now, I was wondering if SO could share some common pitfalls or good design practices.
One important thing to remember is that static methods cannot be overridden by a subclass. References to a static method in your code essentially tie it to that implementation. When using instance methods, behavior can be varied based on the type of the instance. You can take advantage of polymorphism. Static methods are more suited to utilitarian types of operations where the behavior is set in stone. Things like base 64 encoding or calculating a checksum for instance.
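A short sketch of that difference, using hypothetical Animal and Dog classes: redeclaring a static method in a subclass hides it rather than overriding it, so a call written against Animal always reaches Animal's version, while the instance method is chosen by the runtime type.

```java
class Animal {
    static String kind() {
        return "animal";
    }

    String sound() {
        return "...";
    }
}

class Dog extends Animal {
    // Hides Animal.kind(); this is not an override.
    static String kind() {
        return "dog";
    }

    @Override
    String sound() {
        return "woof";
    }

    public static void main(String[] args) {
        Animal pet = new Dog();
        System.out.println(pet.sound());   // prints woof (chosen by the runtime type)
        System.out.println(Animal.kind()); // prints animal (fixed at compile time)
        System.out.println(Dog.kind());    // prints dog
    }
}
```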
I don't think any of the answers get to the heart of the OO reason of when to choose one or the other. Sure, use an instance method when you need to deal with instance members, but you could make all of your members public and then code a static method that takes in an instance of the class as an argument. Hello C.
You need to think about the messages the object you are designing responds to. Those will always be your instance methods. If you think about your objects this way, you'll almost never have static methods. Static members are ok in certain circumstances.
Notable exceptions that come to mind are the Factory Method and Singleton (use sparingly) patterns. Exercise caution when you are tempted to write a "helper" class, for from there, it is a slippery slope into procedural programming.
If the implementation of a method can be expressed completely in terms of the public interface (without downcasting) of your class, then it may be a good candidate for a static "utility" method. This allows you to maintain a minimal interface while still providing the convenience methods that clients of the code may use a lot. As Scott Meyers explains, this approach encourages encapsulation by minimizing the amount of code impacted by a change to the internal implementation of a class. Here's another interesting article by Herb Sutter picking apart std::basic_string deciding what methods should be members and what shouldn't.
In a language like Java or C++, I'll admit that the static methods make the code less elegant so there's still a tradeoff. In C#, extension methods can give you the best of both worlds.
If the operation will need to be overridden by a sub-class for some reason, then of course it must be an instance method in which case you'll need to think about all the factors that go into designing a class for inheritance.
My rule of thumb is: if the method performs anything related to a specific instance of a class, make it an instance method, regardless of whether it needs to use instance variables. If you can think of a situation where you might need to use a certain method without necessarily referring to an instance of the class, then the method should definitely be static (class). If this method also happens to need instance variables in certain cases, then it is probably best to create a separate instance method that calls the static method and passes the instance variables. Performance-wise I believe there is negligible difference (at least in .NET, though I would imagine it would be very similar for Java).
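The last arrangement, an instance method that forwards its own state to a static method, might look like the sketch below. Money, addCents, and plus are hypothetical names picked for the illustration.

```java
// Hypothetical value class: the static method holds the logic, the instance method delegates.
final class Money {
    private final long cents;

    Money(long cents) {
        this.cents = cents;
    }

    // Static: usable without any Money instance.
    static long addCents(long a, long b) {
        return Math.addExact(a, b); // throws ArithmeticException on overflow
    }

    // Instance method: passes its own state to the static helper.
    Money plus(Money other) {
        return new Money(addCents(this.cents, other.cents));
    }

    long getCents() {
        return cents;
    }

    public static void main(String[] args) {
        System.out.println(new Money(150).plus(new Money(275)).getCents()); // prints 425
    }
}
```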
If you keep state (a value) in an object and the method is used to access or modify that state, then you should use an instance method.
Even if the method does not alter the state (a utility function), I would recommend using an instance method, mostly because this way you can have a subclass that performs a different action.
For the rest you could use a static method.
:)
This thread looks relevant: Method can be made static, but should it? The differences between C# and Java won't impact its relevance (I think).
Your default choice should be an instance method.
If it uses an instance variable it must be an instance method.
If not, it's up to you, but if you find yourself with a lot of static methods and/or static non-final variables, you probably want to extract all the static stuff into a new class instance. (A bunch of static methods and members is a singleton, but a really annoying one, having a real singleton object would be better--a regular object that there happens to be one of, the best!).
Basically, the rule of thumb is if it uses any data specific to the object, instance. So Math.max is static but BigInteger.bitCount() is instance. It obviously gets more complicated as your domain model does, and there are border-line cases, but the general idea is simple.
I would use an instance method by default. The advantage is that behavior can be overridden in a subclass or if you are coding against interfaces, an alternative implementation of the collaborator can be used. This is really useful for flexibility in testing code.
Static references are baked into your implementation and can't change. I find static useful for short utility methods. If the contents of your static method are very large, you may want to think about breaking responsibility into one or more separate objects and letting those collaborate with the client code as object instances.
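To illustrate the testing point, here is a minimal sketch with a hypothetical TimeSource collaborator and a Greeter that depends on it. Because the collaborator is an instance behind an interface, a test can substitute a fixed value, which a hard-wired static call would not allow.

```java
// Hypothetical collaborator interface; a production implementation would read the real clock.
interface TimeSource {
    int hourOfDay();
}

final class Greeter {
    private final TimeSource time;

    Greeter(TimeSource time) {
        this.time = time;
    }

    String greet() {
        return time.hourOfDay() < 12 ? "Good morning" : "Good afternoon";
    }

    public static void main(String[] args) {
        // A test double supplied as a lambda: the hour is fixed at 9.
        System.out.println(new Greeter(() -> 9).greet()); // prints Good morning
    }
}
```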
IMHO, if you can make it a static method (without having to change its structure) then make it a static method. It is faster, and simpler.
If you know you will want to override the method, I suggest you write a unit test where you actually do this, so that it is no longer appropriate to make it static. If that sounds like too much hard work, then don't make it an instance method.
Generally, you shouldn't add functionality as soon as you imagine a use for it one day (that way madness lies); you should only add functionality you know you actually need.
For a longer explanation...
http://en.wikipedia.org/wiki/You_Ain%27t_Gonna_Need_It
http://c2.com/xp/YouArentGonnaNeedIt.html
The issue with static methods is that you are breaking one of the core object-oriented principles, because you are coupled to an implementation. You want to support the open/closed principle: have your class implement an interface that describes the dependency (in a behavioral, abstract sense) and then have your classes depend on that interface. It is much easier to extend from that point going forward.
My static methods are always one of the following:
Private "helper" methods that evaluate a formula useful only to that class.
Factory methods (Foo.getInstance() etc.)
In a "utility" class that is final, has a private constructor and contains nothing other than public static methods (e.g. com.google.common.collect.Maps)
I will not make a method static just because it does not refer to any instance variables.
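The third kind of class in that list could be sketched like this. Checksums and sumOfBytes are hypothetical; the shape (final class, private constructor, only static methods) is what the answer describes.

```java
// Hypothetical utility class: final, not instantiable, static methods only.
final class Checksums {
    private Checksums() {
        throw new AssertionError("no instances");
    }

    // Adds the bytes as unsigned values and keeps the low 16 bits.
    static int sumOfBytes(byte[] data) {
        int sum = 0;
        for (byte b : data) {
            sum += b & 0xFF;
        }
        return sum & 0xFFFF;
    }

    public static void main(String[] args) {
        System.out.println(sumOfBytes(new byte[] {1, 2, 3})); // prints 6
    }
}
```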
