Initialize class fields in constructor or at declaration? - java

I've been programming in C# and Java recently and I am curious where the best place is to initialize my class fields.
Should I do it at declaration?:
public class Dice
{
private int topFace = 1;
private Random myRand = new Random();
public void Roll()
{
// ......
}
}
or in a constructor?:
public class Dice
{
private int topFace;
private Random myRand;
public Dice()
{
topFace = 1;
myRand = new Random();
}
public void Roll()
{
// .....
}
}
I'm really curious what some of you veterans think is the best practice. I want to be consistent and stick to one approach.

My rules:
Don't initialize with the default values in declaration (null, false, 0, 0.0…).
Prefer initialization in declaration if you don't have a constructor parameter that changes the value of the field.
If the value of the field changes because of a constructor parameter put the initialization in the constructors.
Be consistent in your practice (the most important rule).

In C# it doesn't matter. The two code samples you give are utterly equivalent. In the first example the C# compiler (or is it the CLR?) will construct an empty constructor and initialise the variables as if they were in the constructor (there's a slight nuance to this that Jon Skeet explains in the comments below).
If there is already a constructor then any initialisation "above" will be moved into the top of it.
In terms of best practice the former is less error prone than the latter as someone could easily add another constructor and forget to chain it.

I think there is one caveat. I once committed such an error: Inside of a derived class, I tried to "initialize at declaration" the fields inherited from an abstract base class. The result was that there existed two sets of fields, one is "base" and another is the newly declared ones, and it cost me quite some time to debug.
The lesson: to initialize inherited fields, you'd do it inside of the constructor.
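The same trap can be reproduced in Java, where redeclaring a field in a subclass hides the inherited field instead of initializing it. The sketch below is only an illustration of that caveat; the class names `Shape`, `BrokenCircle`, and `FixedCircle` are hypothetical and not taken from the answer above.

```java
class FieldHidingDemo {
    // Hypothetical base class with a field that subclasses are meant to share.
    abstract static class Shape {
        protected String label;

        String describe() {
            return "label=" + label; // always reads Shape.label
        }
    }

    static class BrokenCircle extends Shape {
        // Redeclaring the field creates a second, separate field that hides Shape.label.
        protected String label = "circle";
    }

    static class FixedCircle extends Shape {
        FixedCircle() {
            label = "circle"; // assigns the inherited field itself
        }
    }

    public static void main(String[] args) {
        System.out.println(new BrokenCircle().describe()); // label=null
        System.out.println(new FixedCircle().describe());  // label=circle
    }
}
```

`BrokenCircle` ends up with two `label` fields, and the base class only ever sees its own, which is still `null`.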

The semantics of C# differs slightly from Java here. In C# assignment in declaration is performed before calling the superclass constructor. In Java it is done immediately after which allows 'this' to be used (particularly useful for anonymous inner classes), and means that the semantics of the two forms really do match.
If you can, make the fields final.
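A small sketch of the Java ordering described above: field initializers of a subclass run immediately after the superclass constructor returns, so a superclass constructor that calls an overridden method still sees the default value. The names `Base`, `Derived`, and `LOG` are made up for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

class InitOrderDemo {
    // Records what each step observes, in order.
    static final List<String> LOG = new ArrayList<>();

    static class Base {
        Base() {
            // Runs before any field initializer of the subclass.
            LOG.add("Base constructor sees " + describe());
        }

        String describe() {
            return "nothing";
        }
    }

    static class Derived extends Base {
        // Assigned right after Base() returns, before the rest of Derived().
        private String label = "initialized";

        Derived() {
            super();
            LOG.add("Derived constructor sees " + describe());
        }

        @Override
        String describe() {
            return label;
        }
    }

    public static void main(String[] args) {
        new Derived();
        LOG.forEach(System.out::println);
        // Base constructor sees null
        // Derived constructor sees initialized
    }
}
```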

Assuming the type in your example, definitely prefer to initialize fields in the constructor. The exceptional cases are:
Fields in static classes/methods
Fields typed as static/final/et al
I always think of the field listing at the top of a class as the table of contents (what is contained herein, not how it is used), and the constructor as the introduction. Methods of course are chapters.

In Java, an initializer with the declaration means the field is always initialized the same way, regardless of which constructor is used (if you have more than one) or the parameters of your constructors (if they have arguments), although a constructor might subsequently change the value (if it is not final). So using an initializer with a declaration suggests to a reader that the initialized value is the value that the field has in all cases, regardless of which constructor is used and regardless of the parameters passed to any constructor. Therefore use an initializer with the declaration only if, and always if, the value for all constructed objects is the same.
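To make that rule concrete, here is a minimal sketch with a hypothetical `Playlist` class: the list starts out the same for every object, so it is initialized at the declaration, while the name depends on a constructor argument and is assigned in the constructor.

```java
import java.util.ArrayList;
import java.util.List;

class Playlist {
    // Same starting value for every object, whichever constructor runs.
    private final List<String> tracks = new ArrayList<>();

    // Depends on a constructor argument, so it is assigned in a constructor.
    private final String name;

    Playlist() {
        this("untitled"); // chains to the constructor that sets the name
    }

    Playlist(String name) {
        this.name = name;
    }

    void add(String track) {
        tracks.add(track);
    }

    int size() {
        return tracks.size();
    }

    String name() {
        return name;
    }

    public static void main(String[] args) {
        Playlist a = new Playlist();
        Playlist b = new Playlist("road trip");
        b.add("first track");
        System.out.println(a.name() + " " + a.size()); // untitled 0
        System.out.println(b.name() + " " + b.size()); // road trip 1
    }
}
```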

There are many and various situations.
I just need an empty list
The situation is clear. I just need to prepare my list and prevent an exception from being thrown when someone adds an item to the list.
public class CsvFile
{
private List<CsvRow> lines = new List<CsvRow>();
public CsvFile()
{
}
}
I know the values
I exactly know what values I want to have by default or I need to use some other logic.
public class AdminTeam
{
private List<string> usernames;
public AdminTeam()
{
usernames = new List<string>() {"usernameA", "usernameB"};
}
}
or
public class AdminTeam
{
private List<string> usernames;
public AdminTeam()
{
usernames = GetDefaultUsers(2);
}
}
Empty list with possible values
Sometimes I expect an empty list by default with a possibility of adding values through another constructor.
public class AdminTeam
{
private List<string> usernames = new List<string>();
public AdminTeam()
{
}
public AdminTeam(List<string> admins)
{
admins.ForEach(x => usernames.Add(x));
}
}

What if I told you, it depends?
I in general initialize everything and do it in a consistent way. Yes it's overly explicit but it's also a little easier to maintain.
If we are worried about performance, well then I initialize only what has to be done and place it in the areas it gives the most bang for the buck.
In a real time system, I question if I even need the variable or constant at all.
And in C++ I often do next to no initialization in either place and move it into an Init() function. Why? Well, in C++ if you're initializing something that can throw an exception during object construction you open yourself to memory leaks.

The design of C# suggests that inline initialization is preferred, or it wouldn't be in the language. Any time you can avoid a cross-reference between different places in the code, you're generally better off.
There is also the matter of consistency with static field initialization, which needs to be inline for best performance. The Framework Design Guidelines for Constructor Design say this:
✓ CONSIDER initializing static fields inline rather than explicitly using static constructors, because the runtime is able to optimize the performance of types that don’t have an explicitly defined static constructor.
"Consider" in this context means to do so unless there's a good reason not to. In the case of static initializer fields, a good reason would be if initialization is too complex to be coded inline.

Being consistent is important, but this is the question to ask yourself:
"Do I have a constructor for anything else?"
Typically, I am creating models for data transfers that the class itself does nothing except work as housing for variables.
In these scenarios, I usually don't have any methods or constructors. It would feel silly to me to create a constructor for the exclusive purpose of initializing my lists, especially since I can initialize them in-line with the declaration.
So as many others have said, it depends on your usage. Keep it simple, and don't make anything extra that you don't have to.

Consider the situation where you have more than one constructor. Will the initialization be different for the different constructors? If it will be the same, then why repeat it for each constructor? This is in line with kokos's statement, but may not be related to parameters. Let's say, for example, you want to keep a flag which shows how the object was created. Then that flag would be initialized differently for different constructors regardless of the constructor parameters. On the other hand, if you repeat the same initialization in each constructor, you leave open the possibility that you (unintentionally) change the initialization value in some of the constructors but not in others. So, the basic concept here is that common code should have a common location and not be potentially repeated in different locations. So I would say always put it in the declaration until you have a specific situation where that no longer works for you.

There is a slight performance benefit to setting the value in the declaration. If you set it in the constructor it is actually being set twice (first to the default value, then reset in the ctor).

When you don't need some logic or error handling:
Initialize class fields at declaration
When you need some logic or error handling:
Initialize class fields in constructor
This works well when the initialization value is available and the
initialization can be put on one line. However, this form of
initialization has limitations because of its simplicity. If
initialization requires some logic (for example, error handling or a
for loop to fill a complex array), simple assignment is inadequate.
Instance variables can be initialized in constructors, where error
handling or other logic can be used.
From https://docs.oracle.com/javase/tutorial/java/javaOO/initial.html .
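As a rough illustration of that split, the sketch below reworks the question's `Dice` into a hypothetical `ValidatedDice` with a `sides` parameter (an assumption, not part of the original class). Fields that need no logic are initialized at declaration; the field that needs validation is set in the constructor.

```java
import java.util.Random;

class ValidatedDice {
    // No logic needed: initialize at declaration.
    private final Random random = new Random();
    private int topFace = 1;

    // Needs error handling: initialize in the constructor.
    private final int sides;

    ValidatedDice(int sides) {
        if (sides < 2) {
            throw new IllegalArgumentException("a die needs at least 2 sides, got " + sides);
        }
        this.sides = sides;
    }

    int roll() {
        topFace = random.nextInt(sides) + 1; // value in 1..sides
        return topFace;
    }

    int topFace() {
        return topFace;
    }

    public static void main(String[] args) {
        ValidatedDice dice = new ValidatedDice(6);
        System.out.println("rolled " + dice.roll());
    }
}
```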

I normally try to have the constructor do nothing but receive the dependencies and initialize the related instance members with them. This will make your life easier if you want to unit test your classes.
If the value you are going to assign to an instance variable is not influenced by any of the parameters you pass to your constructor, then assign it at declaration time.

Not a direct answer to your question about best practice, but an important related point: in a generic class definition, either leave it to the compiler to initialize fields with their default values, or use a special method to initialize them to their default values (if that is absolutely necessary for code readability).
class MyGeneric<T>
{
T data;
//T data = ""; // <-- ERROR
//T data = 0; // <-- ERROR
//T data = null; // <-- ERROR
public MyGeneric()
{
// All of the above errors would be errors here in constructor as well
}
}
And the special method to initialize a generic field to its default value is the following:
class MyGeneric<T>
{
T data = default(T);
public MyGeneric()
{
// The same method can be used here in constructor
}
}

"Prefer initialization in declaration", seems like a good general practice.
Here is an example which cannot be initialized in the declaration so it has to be done in the constructor.
"Error CS0236 A field initializer cannot reference the non-static field, method, or property"
class UserViewModel
{
// Cannot be set here
public ICommand UpdateCommad { get; private set; }
public UserViewModel()
{
UpdateCommad = new GenericCommand(Update_Method); // <== THIS WORKS
}
void Update_Method(object? parameter)
{
}
}

Related

Why can I instantiate an interface without declaring a class? [duplicate]

This question already has answers here:
Why are only final variables accessible in anonymous class?
(15 answers)
Closed 6 years ago.
So in Java it's possible to instantiate an interface without creating an explicit class
public interface Foo {
public void OnNotify();
}
Say I do the following somewhere else, say in a method Subscribe
public void Subscribe()
{
final int someInt = 5;
Foo bar = new Foo() {
final int value = someInt;
@Override
public void OnNotify()
{
Log.d("Debug", "You are being notified that I hold the value " + value);
}
};
someObject.AddSubscription(bar);
}
This is used extensively in Android for setting listeners to events.
Why is this possible, and does this kind of instantiation have a special name? Is this related to lambda functions in some way perhaps?
And why do I need to make a 'final' variable if I want to give it to this instantiated interface to hold. Say for example I wanted to pass the current iteration 'i' of a for loop to identify what index of an array a subscription references. I need to declare a final variable to hold 'i', and then pass it into the instantiated interface.
Edit:
I'm still asking why I can instantiate an interface without making a class first, and what it's called. Not knowing what this is, there's no way I could have found the duplicate question, which doesn't cover what a Java anonymous class is.
Why is this possible, and does this kind of instantiation have a special name? Is this related to lambda functions in some way perhaps?
I don't know of any "special name" for this. I generally refer to it as an "interface implementation declaration." That's just me though, and maybe that's not accurate or correct.
I believe they are not, because aside from syntax differences, I look to lambdas as a method-class like structure, and this form is overriding methods from an object type. When passing an interface object, you're passing reference of a type.
And why do I need to make a 'final' variable if I want to give it to this instantiated interface to hold.
The way I have looked at it, and understood it to be, is that declaring this inline rather than in a separate class doesn't change what an interface still is: "In the Java programming language, an interface is a reference type, similar to a class, that can contain only constants, method signatures, default methods, static methods, and nested types."
An interface is required to have a constant, and to me final is a keyword that works similarly to the C/C++ keyword const in that it isn't holding the value but a reference to the type that you've declared. These values are immutable, preventing any copies and changes, as they only contain a reference to the location of the value.
So, overall, I believe that an interface declaration like how you illustrate is not working as declaring an instance of the actual interface class, but rather it's creating an object of that type and that object is a new location in memory that holds references to these methods and members (like a class).
Hope this helps some. I don't normally like to volunteer answers to something I am not 100% sure on, and wish this was a comment more (but lack the rep still for that), but hopefully it helps clear some things.
You can read the Java Language Specification here for more information on Interfaces too. Hope that helps.
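For reference, the construct in the question is usually called an anonymous class, the term used in the question's edit and in the linked duplicate. The sketch below is a simplified, self-contained take on the loop scenario from the question; `Foo` here returns a string instead of logging so the result can be inspected, and `subscribeAll` and `asLambda` are hypothetical helpers. Since Java 8 a captured local only has to be effectively final, and an interface with a single abstract method can also be implemented with a lambda.

```java
import java.util.ArrayList;
import java.util.List;

class AnonymousClassDemo {
    // Stand-in for the question's Foo interface.
    interface Foo {
        String onNotify();
    }

    static List<Foo> subscribeAll(int count) {
        List<Foo> subscriptions = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            final int index = i; // i changes every iteration, so capture a per-iteration copy
            subscriptions.add(new Foo() { // anonymous class implementing Foo
                @Override
                public String onNotify() {
                    return "subscription " + index;
                }
            });
        }
        return subscriptions;
    }

    // The same idea written as a lambda (Java 8+); value is captured the same way.
    static Foo asLambda(int value) {
        return () -> "value " + value;
    }

    public static void main(String[] args) {
        for (Foo foo : subscribeAll(3)) {
            System.out.println(foo.onNotify()); // subscription 0, 1, 2
        }
        System.out.println(asLambda(5).onNotify()); // value 5
    }
}
```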

Thinking in OOP way

Whenever I think I am gaining some confidence in OOP, I suddenly get bitten by some advanced example. In this great article by Uncle Bob, for instance, he uses the class below as an example for his kata.
public class WordWrapper {
private int length;
public WordWrapper(int length) {
this.length = length;
}
public static String wrap(String s, int length) {
return new WordWrapper(length).wrap(s);
}
public String wrap(String s) {
if (length < 1)
throw new InvalidArgument();
if (s == null)
return "";
if (s.length() <= length)
return s;
else {
int space = s.indexOf(" ");
if (space >= 0)
return breakBetween(s, space, space + 1);
else
return breakBetween(s, length, length);
}
}
private String breakBetween(String s, int start, int end) {
return s.substring(0, start) +
"\n" +
wrap(s.substring(end), length);
}
public static class InvalidArgument extends RuntimeException {
}
}
I have following doubts:
Why the static helper method wrap?
Why the InvalidArgument class is nested and static?
Why do we even need to initialize this class, since it's nothing but an algorithm and can operate without any instance variable? Why do we need ~100 instances (e.g.) of it?
Why the static helper method wrap?
There is no especially good reason - I think that it is a subjective judgement that:
WordWrapper.wrap("foo", 5);
is neater than
new WordWrapper(5).wrap("foo");
(which I would agree it is). I tend to find myself adding methods like this when the code just feels very repetitive.
However, the static form can lead to hidden problems: invoking that in a loop results in the creation of a lot of unnecessary instances of WordWrapper, whereas the non-static form just creates one and reuses it.
Why the InvalidArgument class is nested and static?
The implication of it being nested is that it is only for use in reporting invalid arguments of methods in WordWrapper. For instance, it wouldn't make much sense if some database-related class threw an instance of WordWrapper.InvalidArgument.
Remember that you can reference it as InvalidArgument for convenience if appropriately imported; you're still always using some.packagename.WordWrapper.InvalidArgument, so its use in other classes doesn't make semantic sense.
If you expect to use it in other classes, it should not be nested.
As for why static: there are two reasons that I can think of (which are sort of different sides of the same coin):
It doesn't need to be non-static. A non-static nested class is called an inner class. It is related to the instance of the containing class which created it; in some way, the data in the inner class is related to the data in the outer class.
What this actually means is there is a hidden reference to the outer class passed into the inner class when it is created. If you never need to refer to this instance, make it static, so the reference isn't passed. It's just like removing unused parameters of methods: if you don't need it, don't pass it.
Holding this reference has unexpected consequences. (I draw this as a separate point because whereas the previous one refers to a logical requirement/design for the reference or not, this refers to practical implications of holding that reference).
Just as with holding any reference, if you have a reference to an instance of the inner class, you make everything that it references ineligible for garbage collection, since it is still reachable. Depending upon how you use instances of the inner class, this can lead to a memory leak. The static version of the class doesn't suffer from this problem, since there is no reference: you can have a reference to an InvalidArgument when all of the instances of WordWrapper are cleared up.
Another consequence is that the contract of InvalidArgument is invalid: Throwable, a superclass of InvalidArgument, implements Serializable, meaning that InvalidArgument also implements Serializable. However, WordWrapper is not Serializable. As such, serialization of a non-static InvalidArgument would fail because of the non-null reference to WordWrapper.
The simple solution to both of these issues is to make the nested class static; as a defensive strategy, one should make all nested classes static, unless you really need them not to be.
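A short sketch of the difference, using a hypothetical `NestingDemo` class: the inner class can reach its creating instance through the hidden reference, while the static nested class has no such reference and can be created on its own.

```java
class NestingDemo {
    private final String name;

    NestingDemo(String name) {
        this.name = name;
    }

    // Inner (non-static) class: every instance carries a hidden reference to a NestingDemo.
    class Inner {
        String outerName() {
            return NestingDemo.this.name; // reads through the hidden reference
        }
    }

    // Static nested class: no hidden reference, so no outer instance is needed.
    static class Nested {
        String describe() {
            return "no outer instance";
        }
    }

    public static void main(String[] args) {
        NestingDemo outer = new NestingDemo("outer-1");
        NestingDemo.Inner inner = outer.new Inner();          // must be created from an outer instance
        NestingDemo.Nested nested = new NestingDemo.Nested(); // created independently
        System.out.println(inner.outerName());  // outer-1
        System.out.println(nested.describe());  // no outer instance
    }
}
```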
Why do we even need to initialize this class since its nothing but an algorithm...
Good question. This is sort of related to your first question: you could get away with just the static helper method, and remove the instance methods and state.
Before you chuck away your instance methods, there are advantages to instance methods over static methods.
The obvious one is that you are able to store state in the instances, for instance length. This allows you to pass fewer parameters to wrap, which might make the code less repetitive; I suppose it gives an effect a bit like partial evaluation. (You can store state in static variables too, but global mutable state is a royal PITA; that's another story).
Static methods are a tight coupling: the class using WordWrapper is tightly bound to a specific implementation of word wrapping.
For many purposes, one implementation might be fine. However, there is almost always a case for at least two implementations (your production and test implementations).
So, whereas the following is tightly bound to one implementation:
void doStuffWithAString(String s) {
// Do something....
WordWrapper.wrap(s, 100);
// Do something else ....
}
the following can have an implementation provided at runtime:
void doStuffWithAString(WordWrapper wrapper, String s) {
// Do something....
wrapper.wrap(s);
// Do something else ....
}
which is using the wrapper as a strategy.
Now, you can select the word wrapping algorithm used for a particular case (e.g. one algorithm works well for English, but another works better for Chinese - maybe, I don't know, it's just an example).
Or, for a test, you can inject a mocked instance for tests which just returns the parameter - this allows you to test doStuffWithAString without testing the implementation of WordWrapper at the same time.
But, with flexibility comes overhead. The static method is more concise. For very simple methods, static could well be the way to go; as the method gets more complicated (and, particularly in the testing case, it becomes harder and harder to work out the input to provide to get a specific output which is important to your test case), the instance method form becomes a better choice.
Ultimately, there is no hard-and-fast rule for which to use. Be aware of both, and notice which works best in given situations.
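The strategy idea can be sketched as follows. Note the assumptions: the `Wrapper` interface and the naive `FixedWidthWrapper` are hypothetical and are not part of Uncle Bob's `WordWrapper`, which is a concrete class; they only show how an implementation can be swapped, for example for a stub in a test.

```java
class StrategyDemo {
    // Hypothetical abstraction over word wrapping.
    interface Wrapper {
        String wrap(String s);
    }

    // Naive implementation for illustration: breaks every `length` characters.
    static class FixedWidthWrapper implements Wrapper {
        private final int length;

        FixedWidthWrapper(int length) {
            if (length < 1) {
                throw new IllegalArgumentException("length must be at least 1");
            }
            this.length = length;
        }

        @Override
        public String wrap(String s) {
            StringBuilder out = new StringBuilder();
            for (int i = 0; i < s.length(); i += length) {
                if (i > 0) {
                    out.append('\n');
                }
                out.append(s, i, Math.min(i + length, s.length()));
            }
            return out.toString();
        }
    }

    // The caller depends only on the interface, so the implementation is chosen at runtime.
    static String doStuffWithAString(Wrapper wrapper, String s) {
        return "[" + wrapper.wrap(s) + "]";
    }

    public static void main(String[] args) {
        System.out.println(doStuffWithAString(new FixedWidthWrapper(3), "abcdefg"));
        // A test can pass a stub that just returns its input.
        System.out.println(doStuffWithAString(s -> s, "abcdefg")); // [abcdefg]
    }
}
```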

Deferred initialization of immutable variables

I've been using Scala's lazy val idiom a lot and I would like to achieve something similar in Java. My main problem is that to construct some value I need some other value which is not known at object construction time, but I do not want to be able to change it afterwards. The reason for that is, I'm using a GUI library which instantiates the object on my behalf and calls a different method when everything I need is created, which is when I know the values I need.
Here are the properties I try to achieve:
* Immutability of my variable.
* Initialization in some other method than the constructor.
I do not think this is possible in Java, for only final achieves immutability of the variable and final variables cannot be initialized outside of the constructor.
What would be the closest thing in Java to what I am trying to achieve ?
One way to do it would be to push the actual instantiation of the value in question into another class. This will be final, but won't be actually created until the class is loaded, which is deferred until it is needed. Something like the following:
public class MyClass
{
private static class Loader
{
public static final Foo INSTANCE = new Foo();
}
Foo getInstance()
{
return Loader.INSTANCE;
}
}
This will lazily initialise the Foo as and when desired.
If you absolutely need the Foo to be an instance variable of your top-level class - I can't think of any way off-hand to do this. The variable must be populated in the constructor, as you noted.
In fact I'm not sure exactly how Scala gets around this, but my guess would be that it sets the lazy val variable to some kind of thunk which is replaced by the actual object when first evaluated. Scala can of course do this by subverting the normal access modifiers in this case, but I don't think you can transparently do this in Java. You could declare the field to be e.g. a Future<Foo> which creates the value on first invocation and caches it from that point on, but that's not referentially transparent, and by the definition of final I don't see a way around this.
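The "creates the value on first invocation and caches it" idea might look roughly like the sketch below. `Lazy` is a hypothetical helper written for this illustration, not a JDK class, and it is not how Scala implements lazy val. The field holding it is final; only the computation of the value inside is deferred.

```java
import java.util.Objects;
import java.util.function.Supplier;

class LazyDemo {
    // Hypothetical helper: computes its value on the first get() and caches it afterwards.
    static final class Lazy<T> {
        private Supplier<? extends T> supplier;
        private T value;
        private boolean initialized;

        Lazy(Supplier<? extends T> supplier) {
            this.supplier = Objects.requireNonNull(supplier);
        }

        synchronized T get() {
            if (!initialized) {
                value = supplier.get();
                initialized = true;
                supplier = null; // release the supplier once the value is cached
            }
            return value;
        }
    }

    // Counts how often the expensive computation actually runs.
    static int computations = 0;

    // The field is final and cannot be reassigned; its value is produced on first use.
    private final Lazy<String> title = new Lazy<>(() -> {
        computations++;
        return "computed on first use";
    });

    String title() {
        return title.get();
    }

    public static void main(String[] args) {
        LazyDemo demo = new LazyDemo();
        System.out.println(computations); // 0
        System.out.println(demo.title()); // computed on first use
        System.out.println(demo.title()); // computed on first use
        System.out.println(computations); // 1
    }
}
```

The supplier can read state that only becomes available after construction, as long as `get()` is not called before that state is set.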
Andrzej's answer is great, but there is also a way to do it without changing the source code. Use AspectJ to capture Constructor invocations and return non-initialized objects:
pointcut lazyInit() : execution(* com.mycompany.expensiveservices.*.init(*));
void around() : lazyInit() && within(@Slow *) {
new Thread(new Runnable(){
@Override
public void run(){
// initialize Object in separate thread
proceed();
}
}).start();
}
Given this aspect, all constructors of objects marked with a @Slow annotation will be run in a separate thread.
I did not find much reference to link to, but please read AspectJ in Action by Ramnivas Laddad for more info.

Why should I use the keyword "final" on a method parameter in Java?

I can't understand where the final keyword is really handy when it is used on method parameters.
If we exclude the usage of anonymous classes, readability and intent declaration then it seems almost worthless to me.
Enforcing that some data remains constant is not as strong as it seems.
If the parameter is a primitive then it will have no effect since the parameter is passed to the method as a value and changing it will have no effect outside the scope.
If we are passing a parameter by reference, then the reference itself is a local variable and if the reference is changed from within the method, that would not have any effect from outside of the method scope.
Consider the simple test example below.
This test passes: although each method reassigns the reference it was given, the reassignment has no effect on the caller.
public void testNullify() {
Collection<Integer> c = new ArrayList<Integer>();
nullify(c);
assertNotNull(c);
final Collection<Integer> c1 = c;
assertTrue(c1.equals(c));
change(c);
assertTrue(c1.equals(c));
}
private void change(Collection<Integer> c) {
c = new ArrayList<Integer>();
}
public void nullify(Collection<?> t) {
t = null;
}
Stop a Variable’s Reassignment
While these answers are intellectually interesting, I've not read the short simple answer:
Use the keyword final when you want the compiler to prevent a
variable from being re-assigned to a different object.
Whether the variable is a static variable, member variable, local variable, or argument/parameter variable, the effect is entirely the same.
Example
Let’s see the effect in action.
Consider this simple method, where the two variables (arg and x) can both be re-assigned different objects.
// Example use of this method:
// this.doSomething( "tiger" );
void doSomething( String arg ) {
String x = arg; // Both variables now point to the same String object.
x = "elephant"; // This variable now points to a different String object.
arg = "giraffe"; // Ditto. Now neither variable points to the original passed String.
}
Mark the local variable as final. This results in a compiler error.
void doSomething( String arg ) {
final String x = arg; // Mark variable as 'final'.
x = "elephant"; // Compiler error: The final local variable x cannot be assigned.
arg = "giraffe";
}
Instead, let’s mark the parameter variable as final. This too results in a compiler error.
void doSomething( final String arg ) { // Mark argument as 'final'.
String x = arg;
x = "elephant";
arg = "giraffe"; // Compiler error: The passed argument variable arg cannot be re-assigned to another object.
}
Moral of the story:
If you want to ensure a variable always points to the same object,
mark the variable final.
Never Reassign Arguments
As good programming practice (in any language), you should never re-assign a parameter/argument variable to an object other than the object passed by the calling method. In the examples above, one should never write the line arg = . Since humans make mistakes, and programmers are human, let’s ask the compiler to assist us. Mark every parameter/argument variable as 'final' so that the compiler may find and flag any such re-assignments.
In Retrospect
As noted in other answers…
Given Java's original design goal of helping programmers to avoid dumb mistakes such as reading past the end of an array, Java should have been designed to automatically enforce all parameter/argument variables as 'final'. In other words, Arguments should not be variables. But hindsight is 20/20 vision, and the Java designers had their hands full at the time.
So, always add final to all arguments?
Should we add final to each and every method parameter being declared?
In theory, yes.
In practice, no.
➥ Add final only when the method’s code is long or complicated, where the argument may be mistaken for a local or member variable and possibly re-assigned.
If you buy into the practice of never re-assigning an argument, you will be inclined to add a final to each. But this is tedious and makes the declaration a bit harder to read.
For short simple code where the argument is obviously an argument, and not a local variable nor a member variable, I do not bother adding the final. If the code is quite obvious, with no chance of me nor any other programmer doing maintenance or refactoring accidentally mistaking the argument variable as something other than an argument, then don’t bother. In my own work, I add final only in longer or more involved code where an argument might be mistaken for a local or member variable.
Another case, added for completeness
public class MyClass {
private int x;
//getters and setters
}
void doSomething( final MyClass arg ) { // Mark argument as 'final'.
arg = new MyClass(); // Compiler error: The passed argument variable arg cannot be re-assigned to another object.
arg.setX(20); // allowed
// We can re-assign properties of argument which is marked as final
}
record
Java 16 brings the new records feature. A record is a very brief way to define a class whose central purpose is to merely carry data, immutably and transparently.
You simply declare the class name along with the names and types of its member fields. The compiler implicitly provides the constructor, getters, equals & hashCode, and toString.
The fields are read-only, with no setters. So a record is one case where there is no need to mark the arguments final. They are already effectively final. Indeed, the compiler forbids using final when declaring the fields of a record.
public record Employee( String name , LocalDate whenHired ) // 🡄 Marking `final` here is *not* allowed.
{
}
If you provide an optional constructor, there you can mark final.
public record Employee(String name , LocalDate whenHired) // 🡄 Marking `final` here is *not* allowed.
{
public Employee ( final String name , final LocalDate whenHired ) // 🡄 Marking `final` here *is* allowed.
{
this.name = name;
whenHired = LocalDate.MIN; // 🡄 Compiler error, because of `final`.
this.whenHired = whenHired;
}
}
Sometimes it's nice to be explicit (for readability) that the variable doesn't change. Here's a simple example where using final can save some possible headaches:
public void setTest(String test) {
test = test;
}
If you forget the 'this' keyword on a setter, then the variable you want to set doesn't get set. However, if you used the final keyword on the parameter, then the bug would be caught at compile time.
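A runnable sketch of that bug, using a hypothetical `SetterDemo` class. The buggy setter compiles without a final parameter and silently leaves the field unset; with `final` on the parameter, the same slip (`test = test;`) would be rejected by the compiler.

```java
class SetterDemo {
    private String test;

    // Buggy: without `this.`, the parameter is assigned to itself and the field stays null.
    void setTestBuggy(String test) {
        test = test;
    }

    // With a final parameter, writing `test = test;` here would not compile.
    void setTest(final String test) {
        this.test = test;
    }

    String getTest() {
        return test;
    }

    public static void main(String[] args) {
        SetterDemo demo = new SetterDemo();
        demo.setTestBuggy("hello");
        System.out.println(demo.getTest()); // null
        demo.setTest("hello");
        System.out.println(demo.getTest()); // hello
    }
}
```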
Yes, excluding anonymous classes, readability and intent declaration it's almost worthless. Are those three things worthless though?
Personally I tend not to use final for local variables and parameters unless I'm using the variable in an anonymous inner class, but I can certainly see the point of those who want to make it clear that the parameter value itself won't change (even if the object it refers to changes its contents). For those who find that adds to readability, I think it's an entirely reasonable thing to do.
Your point would be more important if anyone were actually claiming that it did keep data constant in a way that it doesn't - but I can't remember seeing any such claims. Are you suggesting there's a significant body of developers suggesting that final has more effect than it really does?
EDIT: I should really have summed all of this up with a Monty Python reference; the question seems somewhat similar to asking "What have the Romans ever done for us?"
Let me explain a bit about the one case where you have to use final, which Jon already mentioned:
If you create an anonymous inner class in your method and use a local variable (such as a method parameter) inside that class, then the compiler forces you to make the parameter final:
public Iterator<Integer> createIntegerIterator(final int from, final int to)
{
return new Iterator<Integer>(){
int index = from;
public Integer next()
{
return index++;
}
public boolean hasNext()
{
return index <= to;
}
// remove method omitted
};
}
Here the from and to parameters need to be final so they can be used inside the anonymous class.
The reason for that requirement is this: Local variables live on the stack, therefore they exist only while the method is executed. However, the anonymous class instance is returned from the method, so it may live for much longer. You can't preserve the stack, because it is needed for subsequent method calls.
So what Java does instead is to put copies of those local variables as hidden instance variables into the anonymous class (you can see them if you examine the byte code). But if they were not final, one might expect the anonymous class and the method seeing changes the other one makes to the variable. In order to maintain the illusion that there is only one variable rather than two copies, it has to be final.
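The copying can be made visible with a small sketch (class and variable names are hypothetical). A captured primitive is a snapshot, while a captured reference is a copy of the reference, so the object behind it is still shared. That is why a one-element array, or a similar holder object, is a common workaround when the method and the anonymous class really need to share a changing value.

```java
import java.util.function.Supplier;

class CaptureDemo {
    static String demo() {
        final int snapshot = 1;   // the value is copied into the anonymous class
        final int[] shared = {1}; // the reference is copied; the array itself is shared

        Supplier<String> reader = new Supplier<String>() {
            @Override
            public String get() {
                return snapshot + "," + shared[0];
            }
        };

        // snapshot = 2;  // would not compile: a captured local cannot be reassigned
        shared[0] = 2;    // allowed: the array contents change, not the variable

        return reader.get();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // 1,2
    }
}
```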
I use final all the time on parameters.
Does it add that much? Not really.
Would I turn it off? No.
The reason: I found 3 bugs where people had written sloppy code and failed to set a member variable in accessors. All bugs proved difficult to find.
I'd like to see this made the default in a future version of Java. The pass by value/reference thing trips up an awful lot of junior programmers.
One more thing.. my methods tend to have a low number of parameters so the extra text on a method declaration isn't an issue.
Using final in a method parameter has nothing to do with what happens to the argument on the caller side. It is only meant to mark it as not changing inside that method. As I try to adopt a more functional programming style, I kind of see the value in that.
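A small sketch of that point, with made-up names (PassByValueSketch, reassign, describe): reassigning a parameter only changes the callee's own copy, so the caller sees the same thing whether or not the parameter is final.

```java
class PassByValueSketch {
    // Without final: the assignment only changes the local copy of the reference.
    static void reassign(String s) {
        s = "changed inside";
    }

    // With final: the same assignment would be rejected by the compiler.
    static String describe(final String s) {
        // s = "changed inside"; // would not compile
        return "got " + s;
    }

    // What the caller observes after calling reassign().
    static String callerView() {
        String value = "original";
        reassign(value);
        return value;
    }

    public static void main(String[] args) {
        System.out.println(callerView());
        System.out.println(describe("original"));
    }
}
```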
Personally I don't use final on method parameters, because it adds too much clutter to parameter lists.
I prefer to enforce that method parameters are not changed through something like Checkstyle.
For local variables I use final whenever possible, I even let Eclipse do that automatically in my setup for personal projects.
I would certainly like something stronger like C/C++ const.
Since Java passes copies of arguments I feel the relevance of final is rather limited. I guess the habit comes from the C++ era where you could prohibit reference content from being changed by doing a const char * const. I feel this kind of stuff makes you believe the developer is inherently stupid as f*** and needs to be protected against truly every character he types. In all humility may I say, I write very few bugs even though I omit final (unless I don't want someone to override my methods and classes). Maybe I'm just an old-school dev.
Short answer: final helps a tiny bit but... use defensive programming on the client side instead.
Indeed, the problem with final is that it only enforces the reference is unchanged, gleefully allowing the referenced object members to be mutated, unbeknownst to the caller. Hence the best practice in this regard is defensive programming on the caller side, creating deeply immutable instances or deep copies of objects that are in danger of being mugged by unscrupulous APIs.
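As a rough illustration (the class and method names here are invented for the example), a final parameter does not stop the callee from mutating the object, while a defensive copy on the caller side does protect the original:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class DefensiveCopySketch {
    // Hypothetical "unscrupulous API": final blocks reassignment, not mutation.
    static void unscrupulous(final List<String> items) {
        items.add("surprise");
    }

    // Passing the list directly lets the callee change it.
    static int sizeAfterDirectCall() {
        List<String> mine = new ArrayList<>(Arrays.asList("a", "b"));
        unscrupulous(mine);
        return mine.size();
    }

    // Passing a copy keeps the caller's list untouched.
    static int sizeAfterDefensiveCopy() {
        List<String> mine = new ArrayList<>(Arrays.asList("a", "b"));
        unscrupulous(new ArrayList<>(mine));
        return mine.size();
    }

    public static void main(String[] args) {
        System.out.println(sizeAfterDirectCall());
        System.out.println(sizeAfterDefensiveCopy());
    }
}
```

Note that new ArrayList<>(mine) is only a shallow copy. That is enough here because the elements are immutable Strings; mutable elements would need a deep copy.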
I never use final in a parameter list, it just adds clutter like previous respondents have said. Also in Eclipse you can set parameter assignment to generate an error so using final in a parameter list seems pretty redundant to me.
Interestingly, when I enabled the Eclipse setting that flags parameter assignment as an error, it caught this code (this is just how I remember the flow, not the actual code):
private String getString(String A, int i, String B, String C)
{
if (i > 0)
A += B;
if (i > 100)
A += C;
return A;
}
Playing devil's advocate, what exactly is wrong with doing this?
One additional reason to add final to parameter declarations is that it helps to identify variables that need to be renamed as part of an "Extract Method" refactoring. I have found that adding final to each parameter prior to starting a large method refactoring quickly tells me if there are any issues I need to address before continuing.
However, I generally remove them as superfluous at the end of the refactoring.
Following up on Michel's post, I made another example to explain it. I hope it helps.
public static void main(String[] args){
    MyParam myParam = thisIsWhy(new MyObj());
    myParam.setArgNewName();
    System.out.println(myParam.showObjName());
}
public static MyParam thisIsWhy(final MyObj obj){
    MyParam myParam = new MyParam() {
        @Override
        public void setArgNewName() {
            obj.name = "afterSet";
        }
        @Override
        public String showObjName(){
            return obj.name;
        }
    };
    return myParam;
}
public static class MyObj{
    String name = "beforeSet";
    public MyObj() {
    }
}
public abstract static class MyParam{
    public abstract void setArgNewName();
    public abstract String showObjName();
}
In the code above, thisIsWhy() never assigns the [argument MyObj obj] to a declared field of MyParam. Instead, the argument is simply used inside the methods of the anonymous MyParam subclass.
But once thisIsWhy() has returned, should the MyObj argument still exist?
It seems it should, because main still calls showObjName(), which needs to reach obj. MyParam keeps using the method argument even after the method has returned!
Java achieves this by generating a copy of the argument: a hidden reference to MyObj obj stored inside the MyParam object (it is not a declared field of MyParam, so we can't see it in the source).
When we call showObjName(), it uses that hidden reference to get the corresponding value.
But if the argument were not final, we could reassign it to point to a new object.
Technically there would be no clash at all. If that were allowed, the situation would be:
A hidden [MyObj obj] that points to [Memory A in heap] and lives in the MyParam object.
Another [MyObj obj], the argument, that points to [Memory B in heap] and lives in the thisIsWhy method.
No clash, but confusing, because both use the same reference name, "obj".
To avoid this, the argument must be declared final, which keeps the programmer from writing such mistake-prone code.

Why must delegation to a different constructor happen first in a Java constructor?

In a constructor in Java, if you want to call another constructor (or a super constructor), it has to be the first line in the constructor. I assume this is because you shouldn't be allowed to modify any instance variables before the other constructor runs. But why can't you have statements before the constructor delegation, in order to compute a complex value to pass to the other constructor? I can't think of any good reason, and I have hit some real cases where I have written some ugly code to get around this limitation.
So I'm just wondering:
Is there a good reason for this limitation?
Are there any plans to allow this in future Java releases? (Or has Sun definitively said this is not going to happen?)
For an example of what I'm talking about, consider some code I wrote which I gave in this StackOverflow answer. In that code, I have a BigFraction class, which has a BigInteger numerator and a BigInteger denominator. The "canonical" constructor is the BigFraction(BigInteger numerator, BigInteger denominator) form. For all the other constructors, I just convert the input parameters to BigIntegers, and call the "canonical" constructor, because I don't want to duplicate all the work.
In some cases this is easy; for example, the constructor that takes two longs is trivial:
public BigFraction(long numerator, long denominator)
{
this(BigInteger.valueOf(numerator), BigInteger.valueOf(denominator));
}
But in other cases, it is more difficult. Consider the constructor which takes a BigDecimal:
public BigFraction(BigDecimal d)
{
this(d.scale() < 0 ? d.unscaledValue().multiply(BigInteger.TEN.pow(-d.scale())) : d.unscaledValue(),
d.scale() < 0 ? BigInteger.ONE : BigInteger.TEN.pow(d.scale()));
}
I find this pretty ugly, but it helps me avoid duplicating code. The following is what I'd like to do, but it is illegal in Java:
public BigFraction(BigDecimal d)
{
BigInteger numerator = null;
BigInteger denominator = null;
if(d.scale() < 0)
{
numerator = d.unscaledValue().multiply(BigInteger.TEN.pow(-d.scale()));
denominator = BigInteger.ONE;
}
else
{
numerator = d.unscaledValue();
denominator = BigInteger.TEN.pow(d.scale());
}
this(numerator, denominator);
}
Update
There have been good answers, but none so far that I'm completely satisfied with. I don't care enough to start a bounty, so I'm answering my own question (mainly to get rid of that annoying "have you considered marking an accepted answer" message).
Workarounds that have been suggested are:
Static factory.
I've used the class in a lot of places, so that code would break if I suddenly got rid of the public constructors and went with valueOf() functions.
It feels like a workaround to a limitation. I wouldn't get any other benefits of a factory because this cannot be subclassed and because common values are not being cached/interned.
Private static "constructor helper" methods.
This leads to lots of code bloat.
The code gets ugly because in some cases I really need to compute both numerator and denominator at the same time, and I can't return multiple values unless I return a BigInteger[] or some kind of private inner class.
The main argument against this functionality is that the compiler would have to check that you didn't use any instance variables or methods before calling the superconstructor, because the object would be in an invalid state. I agree, but I think this would be an easier check than the one which makes sure all final instance variables are always initialized in every constructor, no matter what path through the code is taken. The other argument is that you simply can't execute code beforehand, but this is clearly false because the code to compute the parameters to the superconstructor is getting executed somewhere, so it must be allowed at a bytecode level.
Now, what I'd like to see, is some good reason why the compiler couldn't let me take this code:
public MyClass(String s) {
this(Integer.parseInt(s));
}
public MyClass(int i) {
this.i = i;
}
And rewrite it like this (the bytecode would be basically identical, I'd think):
public MyClass(String s) {
int tmp = Integer.parseInt(s);
this(tmp);
}
public MyClass(int i) {
this.i = i;
}
The only real difference I see between those two examples is that the "tmp" variable's scope allows it to be accessed after calling this(tmp) in the second example. So maybe a special syntax (similar to static{} blocks for class initialization) would need to be introduced:
public MyClass(String s) {
//"init{}" is a hypothetical syntax where there is no access to instance
//variables/methods, and which must end with a call to another constructor
//(using either "this(...)" or "super(...)")
init {
int tmp = Integer.parseInt(s);
this(tmp);
}
}
public MyClass(int i) {
this.i = i;
}
I think several of the answers here are wrong because they assume encapsulation is somehow broken when calling super() after invoking some code. The fact is that the super constructor can actually break encapsulation itself, because Java allows a constructor to call methods that a subclass overrides.
Consider these classes:
class A {
protected int i;
public void print() { System.out.println("Hello"); }
public A() { i = 13; print(); }
}
class B extends A {
private String msg;
public void print() { System.out.println(msg); }
public B(String msg) { super(); this.msg = msg; }
}
If you do
new B("Wubba lubba dub dub");
the message printed out is "null". That's because the constructor from A is accessing the uninitialized field from B. So frankly it seems that if someone wanted to do this:
class C extends A {
public C() {
System.out.println(i); // i not yet initialized
super();
}
}
Then that's just as much their problem as if they make class B above. In both cases the programmer has to know how the variables are accessed during construction. And given that you can call super() or this() with all kinds of expressions in the parameter list, it seems like an artificial restriction that you can't compute any expressions before calling the other constructor. Not to mention that the restriction applies to both super() and this() when presumably you know how to not break your own encapsulation when calling this().
My verdict: This feature is a bug in the compiler, perhaps originally motivated by a good reason, but in its current form it is an artificial limitation with no purpose.
"I find this pretty ugly, but it helps me avoid duplicating code. The following is what I'd like to do, but it is illegal in Java ..."
You could also work around this limitation by using a static factory method that returns a new object:
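public static BigFraction valueOf(BigDecimal d)
{
    // compute numerator and denominator from d (same logic as in the question)
    BigInteger numerator = d.scale() < 0 ? d.unscaledValue().multiply(BigInteger.TEN.pow(-d.scale())) : d.unscaledValue();
    BigInteger denominator = d.scale() < 0 ? BigInteger.ONE : BigInteger.TEN.pow(d.scale());
    return new BigFraction(numerator, denominator);
}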
Alternatively, you could cheat by calling a private static method to do the computations for your constructor:
public BigFraction(BigDecimal d)
{
this(computeNumerator(d), computeDenominator(d));
}
private static BigInteger computeNumerator(BigDecimal d) { ... }
private static BigInteger computeDenominator(BigDecimal d) { ... }
The constructors must be called in order, from the root parent class to the most derived class. You can't execute any code beforehand in the derived constructor because before the parent constructor is called, the stack frame for the derived constructor hasn't even been allocated yet, because the derived constructor hasn't started executing. Admittedly, the syntax for Java doesn't make this fact clear.
Edit: To summarize, when a derived class constructor is "executing" before the this() call, the following points apply.
Member variables can't be touched, because they are invalid before base classes are constructed.
Arguments are read-only, because the stack frame has not been allocated.
Local variables cannot be accessed, because the stack frame has not been allocated.
You can gain access to arguments and local variables if you allocated the constructors' stack frames in reverse order, from derived classes to base classes, but this would require all frames to be active at the same time, wasting memory for every object construction to allow for the rare case of code that wants to touch local variables before base classes are constructed.
"My guess is that, until a constructor has been called for every level of the heierarchy, the object is in an invalid state. It is unsafe for the JVM to run anything on it until it has been completely constructed."
Actually, it is possible to construct objects in Java without calling every constructor in the hierarchy, although not with the new keyword.
For example, when Java's serialization constructs an object during deserialization, it calls the constructor of the first non-serializable class in the hierarchy. So when java.util.HashMap is deserialized, first a java.util.HashMap instance is allocated and then the constructor of its first non-serializable superclass java.util.AbstractMap is called (which in turn calls java.lang.Object's constructor).
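The following sketch shows this with two made-up classes, Base (not serializable) and Child (serializable), plus counters that record constructor calls. It assumes nothing beyond standard java.io serialization: after a round trip, the Base constructor has run twice (once for new, once for deserialization), while the Child constructor has run only once.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class SkipConstructorSketch {
    static int baseCalls = 0;
    static int childCalls = 0;

    // First non-serializable class in the hierarchy: its no-arg constructor runs on deserialization.
    static class Base {
        Base() { baseCalls++; }
    }

    // Serializable class: its constructor is skipped on deserialization.
    static class Child extends Base implements Serializable {
        private static final long serialVersionUID = 1L;
        int value;
        Child(int value) { childCalls++; this.value = value; }
    }

    // Returns { Base constructor calls, Child constructor calls, restored value }.
    static int[] roundTrip() {
        baseCalls = 0;
        childCalls = 0;
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(new Child(42));
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                Child copy = (Child) in.readObject();
                return new int[] { baseCalls, childCalls, copy.value };
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        int[] r = roundTrip();
        System.out.println("Base: " + r[0] + ", Child: " + r[1] + ", value: " + r[2]);
    }
}
```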
You can also use the Objenesis library to instantiate objects without calling the constructor.
Or if you are so inclined, you can generate the bytecode yourself (with ASM or similar). At the bytecode level, new Foo() compiles to two instructions:
NEW Foo
INVOKESPECIAL Foo.<init> ()V
If you want to avoid calling the constructor of Foo, you can change the second command, for example:
NEW Foo
INVOKESPECIAL java/lang/Object.<init> ()V
But even then, the constructor of Foo must contain a call to its superclass constructor. Otherwise the JVM will reject the class with a verification error when it is loaded, complaining that there is no call to super().
Allowing code to not call the super constructor first breaks encapsulation - the idea that you can write code and be able to prove that no matter what someone else does - extend it, invoke it, instantiate it - it will always be in a valid state.
IOW: it's not a JVM requirement as such, but a Comp Sci requirement. And an important one.
To solve your problem, incidentally, you make use of private static methods - they don't depend on any instance:
public BigFraction(BigDecimal d)
{
this(appropriateInitializationNumeratorFor(d),
appropriateInitializationDenominatorFor(d));
}
private static BigInteger appropriateInitializationNumeratorFor(BigDecimal d)
{
if(d.scale() < 0)
{
return d.unscaledValue().multiply(BigInteger.TEN.pow(-d.scale()));
}
else
{
return d.unscaledValue();
}
}
If you don't like having separate methods (a lot of common logic you only want to execute once, for instance), have one method that returns a private little static inner class which is used to invoke a private constructor.
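A minimal sketch of that last suggestion, assuming a stripped-down BigFraction that only stores its two fields (the real class presumably does more, such as reducing the fraction). The Parts holder and the partsOf helper are hypothetical names introduced for this example.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

class BigFraction {
    private final BigInteger numerator;
    private final BigInteger denominator;

    // Hypothetical private holder so one static helper can return both values.
    private static final class Parts {
        final BigInteger numerator;
        final BigInteger denominator;

        Parts(BigInteger numerator, BigInteger denominator) {
            this.numerator = numerator;
            this.denominator = denominator;
        }
    }

    // The "canonical" constructor.
    public BigFraction(BigInteger numerator, BigInteger denominator) {
        this.numerator = numerator;
        this.denominator = denominator;
    }

    // Public constructor: the shared logic runs once, inside the static helper.
    public BigFraction(BigDecimal d) {
        this(partsOf(d));
    }

    // Private constructor that unpacks the holder.
    private BigFraction(Parts p) {
        this(p.numerator, p.denominator);
    }

    // Static, so it may legally be called in the this(...) argument list.
    private static Parts partsOf(BigDecimal d) {
        if (d.scale() < 0) {
            return new Parts(d.unscaledValue().multiply(BigInteger.TEN.pow(-d.scale())),
                             BigInteger.ONE);
        }
        return new Parts(d.unscaledValue(), BigInteger.TEN.pow(d.scale()));
    }

    @Override
    public String toString() {
        return numerator + "/" + denominator;
    }

    public static void main(String[] args) {
        System.out.println(new BigFraction(new BigDecimal("1.25")));
    }
}
```

The public constructors stay in place, so existing callers keep working, and the numerator and denominator are computed together in one pass.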
My guess is that, until a constructor has been called for every level of the hierarchy, the object is in an invalid state. It is unsafe for the JVM to run anything on it until it has been completely constructed.
Well, the problem is that Java cannot detect what 'statements' you are going to put before the super call. For example, you could refer to member variables which are not yet initialized. So I don't think Java will ever support this.
Now, there are many ways to work around this problem such as by using factory or template methods.
Look at it this way.
Let's say that an object is composed of 10 parts.
1,2,3,4,5,6,7,8,9,10
Ok?
Parts 1 to 9 are in the superclass; part #10 is your addition.
You simply cannot add the 10th part until the previous 9 are completed.
That's it.
If parts 1-6 come from yet another superclass, that's fine. The point is that one single object is created in a specific sequence; that's the way it was designed.
Of course the real reason is far more complex than this, but I think this pretty much answers the question.
As for the alternatives, I think there are plenty already posted here.
