Thinking in OOP way - java

Whenever I think that I am gaining some confidence in OOP, I suddenly get bitten by some advanced example. For instance, in this great article by Uncle Bob, he uses the class below as an example for his kata.
public class WordWrapper {
private int length;
public WordWrapper(int length) {
this.length = length;
}
public static String wrap(String s, int length) {
return new WordWrapper(length).wrap(s);
}
public String wrap(String s) {
if (length < 1)
throw new InvalidArgument();
if (s == null)
return "";
if (s.length() <= length)
return s;
else {
int space = s.indexOf(" ");
if (space >= 0)
return breakBetween(s, space, space + 1);
else
return breakBetween(s, length, length);
}
}
private String breakBetween(String s, int start, int end) {
return s.substring(0, start) +
"\n" +
wrap(s.substring(end), length);
}
public static class InvalidArgument extends RuntimeException {
}
}
I have the following doubts:
Why the static helper method wrap?
Why the InvalidArgument class is nested and static?
Why do we even need to initialize this class, since it's nothing but an algorithm and can operate without any instance variable? Why would we need ~100 instances (for example) of it?

Why the static helper method wrap?
There is no especially good reason - I think that it is a subjective judgement that:
WordWrapper.wrap("foo", 5);
is neater than
new WordWrapper(5).wrap("foo");
(which I would agree it is). I tend to find myself adding methods like this when the code just feels very repetitive.
However, the static form can lead to hidden problems: invoking that in a loop results in the creation of a lot of unnecessary instances of WordWrapper, whereas the non-static form just creates one and reuses it.
Why the InvalidArgument class is nested and static?
The implication of it being nested is that it is only for use in reporting invalid arguments of methods in WordWrapper. For instance, it wouldn't make much sense if some database-related class threw an instance of WordWrapper.InvalidArgument.
Remember that you can reference it as InvalidArgument for convenience if appropriately imported; you're still always using some.packagename.WordWrapper.InvalidArgument, so its use in other classes doesn't make semantic sense.
If you expect to use it in other classes, it should not be nested.
As for why static: there are two reasons that I can think of (which are sort of different sides of the same coin):
It doesn't need to be non-static. A non-static nested class is called an inner class. It is related to the instance of the containing class which created it; in some way, the data in the inner class is related to the data in the outer class.
What this actually means is there is a hidden reference to the outer class passed into the inner class when it is created. If you never need to refer to this instance, make it static, so the reference isn't passed. It's just like removing unused parameters of methods: if you don't need it, don't pass it.
Holding this reference has unexpected consequences. (I draw this as a separate point because whereas the previous one refers to a logical requirement/design for the reference or not, this refers to practical implications of holding that reference).
Just as with holding any reference, if you have a reference to an instance of the inner class, you make everything that it references ineligible for garbage collection, since it is still reachable. Depending upon how you use instances of the inner class, this can lead to a memory leak. The static version of the class doesn't suffer from this problem, since there is no reference: you can still hold a reference to an InvalidArgument after all of the instances of WordWrapper have been cleared up.
Another consequence is that the contract of InvalidArgument is invalid: Throwable, a superclass of InvalidArgument, implements Serializable, meaning that InvalidArgument also implements Serializable. However, WordWrapper is not Serializable. As such, serialization of a non-static InvalidArgument would fail because of the non-null reference to WordWrapper.
The simple solution to both of these issues is to make the nested class static; as a defensive strategy, one should make all nested classes static, unless you really need them not to be.
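To make the two points above concrete, here is a minimal sketch. The names Outer, InnerError, NestedError and SerializationDemo are made up for illustration and are not part of the kata. The inner exception reads a field of its enclosing instance, so it carries the hidden reference and cannot be serialized, whereas the static nested one can:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;

// Hypothetical outer class; like WordWrapper, it is NOT Serializable.
class Outer {
    private final String name = "outer";

    // Inner (non-static) class: every instance holds a hidden reference to an Outer.
    class InnerError extends RuntimeException {
        @Override
        public String getMessage() {
            return "raised by " + name; // reads Outer.this.name through the hidden reference
        }
    }

    // Static nested class: no reference to any Outer instance.
    static class NestedError extends RuntimeException {
    }

    InnerError newInnerError() {
        return new InnerError();
    }
}

class SerializationDemo {
    // Returns true if the object can be written with Java serialization.
    static boolean canSerialize(Object o) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return true;
        } catch (NotSerializableException e) {
            return false; // some reachable object does not implement Serializable
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With these definitions, canSerialize(new Outer().newInnerError()) returns false, while canSerialize(new Outer.NestedError()) returns true.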
Why do we even need to initialize this class, since it's nothing but an algorithm...
Good question. This is sort of related to your first question: you could get away with just the static helper method, and remove the instance methods and state.
Before you chuck away your instance methods, there are advantages to instance methods over static methods.
The obvious one is that you are able to store state in the instances, for instance length. This allows you to pass fewer parameters to wrap, which might make the code less repetitive; I suppose it gives an effect a bit like partial evaluation. (You can store state in static variables too, but global mutable state is a royal PITA; that's another story).
Static methods are a tight coupling: the class using WordWrapper is tightly bound to a specific implementation of word wrapping.
For many purposes, one implementation might be fine. However, there is almost always a case for at least two implementations (your production and test implementations).
So, whereas the following is tightly bound to one implementation:
void doStuffWithAString(String s) {
// Do something....
WordWrapper.wrap(s, 100);
// Do something else ....
}
the following can have an implementation provided at runtime:
void doStuffWithAString(WordWrapper wrapper, String s) {
// Do something....
wrapper.wrap(s);
// Do something else ....
}
which is using the wrapper as a strategy.
Now, you can select the word wrapping algorithm used for a particular case (e.g. one algorithm works well for English, but another works better for Chinese - maybe, I don't know, it's just an example).
Or, for a test, you can inject a mocked instance for tests which just returns the parameter - this allows you to test doStuffWithAString without testing the implementation of WordWrapper at the same time.
But, with flexibility comes overhead. The static method is more concise. For very simple methods, static could well be the way to go; as the method gets more complicated (and, particularly in the testing case, it becomes harder and harder to work out the input to provide to get a specific output which is important to your test case), the instance method form becomes a better choice.
Ultimately, there is no hard-and-fast rule for which to use. Be aware of both, and notice which works best in given situations.
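As a rough sketch of the strategy idea discussed above: the TextWrapper interface and the Report class below are invented for illustration. The original WordWrapper is a concrete class, so extracting an interface is my assumption, not something taken from the article.

```java
// Hypothetical strategy interface; WordWrapper could be made to implement it.
interface TextWrapper {
    String wrap(String s);
}

// Hypothetical client that receives its wrapping strategy from outside.
class Report {
    private final TextWrapper wrapper;

    Report(TextWrapper wrapper) {
        this.wrapper = wrapper;
    }

    String render(String body) {
        // The client does not know (or care) which wrapping algorithm is used.
        return "== Report ==\n" + wrapper.wrap(body);
    }
}

class ReportDemo {
    public static void main(String[] args) {
        // Test double: returns its argument unchanged, so only Report is exercised.
        TextWrapper identity = s -> s;
        System.out.println(new Report(identity).render("some long text"));
    }
}
```

A test of Report can now pass in the identity wrapper and check the header logic without depending on any real wrapping algorithm.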

Related

Difference between methods that take parameters and the ones don't?

Ok, this might be a stupid question. However, it really confuses me;
What is the difference between a method like :
toString()
and a method like
toString(String s) ?
I know the last one doesn't exist, but let's say I made one like this.
Why should I choose to make a method that can be invoked like this: object.method(); and not like this: method(object j)? I hope I have explained myself.
It all depends on what you want the method to do.
For instance, the .toString(), is used to provide a string representation of the object. In this case, all the data which the method requires to operate as expected is stored within the object itself, thus, no external input is required.
Take a different example, printToStream(Stream stream) (where Stream would be some interface used by all the streams, be it file streams, console streams, network streams, etc). In this case, the method needs to know which stream it must write to.
The class to which the printToStream method belongs could have a property which denotes to which stream it must print, and thus allowing us to change the signature of printToStream to printToStream(), but that would require 2 lines of code to set up, and in this case, this can potentially introduce problems, especially if you need to print to different streams and at some point, you forget to set the property. Thus, in this case, having the method take in the extra parameter results in cleaner code which is less error prone.
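The contrast above can be sketched as follows. Invoice and printTo are hypothetical names chosen for this example, and java.io.PrintStream stands in for the generic Stream interface mentioned above:

```java
import java.io.PrintStream;

// Hypothetical class, used only to contrast the two method shapes.
class Invoice {
    private final String customer;
    private final int total;

    Invoice(String customer, int total) {
        this.customer = customer;
        this.total = total;
    }

    // Needs no parameter: everything it uses is already stored in the object.
    @Override
    public String toString() {
        return customer + ": " + total;
    }

    // Needs a parameter: the object cannot know where the caller wants the output.
    void printTo(PrintStream out) {
        out.println(this);
    }
}
```

A caller could then write new Invoice("Ann", 42).printTo(System.out); and pass a different stream on the next call, without any set-up step in between.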
EDIT: The parseInt method is a static method provided by the Integer class, which is used to convert a String into an int. Static methods need to be self-contained, so if you want to transform an object from one form to another using a static method, passing it as a parameter is the way to go.
The suggestion you are making would take the look of something like so: int a = "123".ToInteger();. I guess that that would be one way to go around doing the same thing, but in the end the Java specification was not designed with that frame of mind.
EDIT 2: I'll try to expand on my answer and the one provided by @user3284549. Please note that my answer is by no means final and should be supplemented by other material you can find online. In Java, methods are either:
Static. Static methods are methods which have no state (more on that later) and are self-contained. As seen earlier, the parseInt method from the Integer class is one example of such a method. Other examples are provided by the Math class.
Non-static. Non-static methods are methods which become available once an object is instantiated. These methods act upon the state of the object, or else expose some functionality which the object provides and which might, in turn, affect how it behaves.
Taking your dice example, again, you can achieve this in two ways:
Let us assume that we have a class called Dice:
public class Dice {
private int value;
public int getValue() {
return this.value;
}
public void setValue(int value) {
this.value = value;
}
}
Static method
public static void roll(Dice d) {
// Rolling the dice entails a potential change in its current value. Thus, you would need to access the dice itself and update the value.
Random r = ...;
d.setValue(randomNumber);
}
Non Static Method
We just enhance the Dice class to have a new method which mimics a dice roll as follows:
public void roll() {
Random r = ...;
this.value = randomNumber;
}
Notice that even in the static method we still used a method which takes a parameter. The setXXX methods (and their counterpart getXXX methods) provide us with encapsulation, and we need them here because the static method has to change the state of the Dice object from the outside.
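For reference, here is a runnable variant of the two roll approaches in one class. The six-sided range and the shared Random field are my assumptions, since the snippets above elide how the random number is produced:

```java
import java.util.Random;

// Runnable variant of the Dice class above; the 1..6 range is an assumption.
class Dice {
    private static final Random RANDOM = new Random();
    private int value = 1;

    int getValue() {
        return value;
    }

    void setValue(int value) {
        this.value = value;
    }

    // Instance method: works directly on this object's own state.
    void roll() {
        this.value = RANDOM.nextInt(6) + 1; // nextInt(6) yields 0..5
    }

    // Static method: there is no "this", so the object must be passed in
    // and its state changed through the setter.
    static void roll(Dice d) {
        d.setValue(RANDOM.nextInt(6) + 1);
    }
}
```

Both dice.roll() and Dice.roll(dice) leave the object with a value between 1 and 6; the difference is only in who owns the logic.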
Let's proceed with your example:
object.method();
object.method(Object j);
The first one performs some operations on its own variables and returns or sets some value (there may be many other cases, but for simplicity just consider these). When the method is toString(), it returns a string representation of some important properties of the object. In this case the toString() method doesn't need to be fed any argument from outside the object.
And when you use object.method(Object j), it means you need to provide some arguments for it to complete its task.
For the second part of your question. Think about encapsulation. If the method needs access to data that is private to the object then object.method() would be the correct way. And also it is probably not going to be helpful in processing objects of any other type.
For the second type, it could be a static method or utility method commonly used on many different objects. The method implementation logic doesn't have to be tightly related to the object's internal implementation.

Calling static method from a (superclass) list of subclass instances

Say I have a base class A (with a virtual method called normalInit()), and 300 subclasses: A1, A2, A3, ... Each of these subclasses have a staticInit() static method, plus a normalInit() override. (Please don't ask why; this is in a production software, already given, can't change the design for better reuse. Actually, those subclasses are generated by a code generator, but this is irrelevant now.)
Depending on the different executions of the application, a (small) subset of A1, A2, A3, ... needs to be initialized. In other words, there are some data that all instances of a particular Ai share or access commonly. Obviously, it is reasonable to define and treat these entities as static members/methods (as they are shared by all instances of an Ai).
So how to initialize the statics (and call the static methods) of this subset?
To be brief, statically initializing all Ai subclasses is not a solution, because only a small subset will be required (it would be a waste of memory). The static behavior in Java apparently gives a solution to this: the static initializers of a class are run when the class is accessed for the first time (I disregard some special cases here, e.g. compiler inlining of primitive final statics, as in that case there is technically no class access, only at the source code level).
The problem is, I need deterministic (actually at-predefined-time) static initialization, because their static behavior also accesses the current static (global) state of the application. So static initializers are not an option, I need static methods, to call them explicitly at the appropriate place.
In the application in question, this must be done when instances of various Ai classes are accessed via iterating through an ArrayList<A>, where A is the superclass.
for (int i = 0; i < list.size(); ++i) {
list.get(i).normalInit(args); // normalInit() is an instance method
}
This list consists of Ai instances (e.g. 950 instances of A1, 1750 instances of A2, etc., in an unsorted, "random" order).
In other words, I don't have access to concrete class names (so I can't just call A4.staticInit()), because I don't know which Ai has instances in the list. Note that I know statics are bound at compile time and I do know that polymorphism is not possible here, so I'm not asking how to call the static methods from the above loop! The concretely called instance (and thus its Class) is decided at runtime, due to dynamic dispatch, when normalInit() is called.
An apparent solution is to call the concrete class' staticInit() method from the normalInit() override:
public class A2 extends A {
private static boolean sStaticInitialized = false;
@Override
public void normalInit(int[] args) {
// ...
staticInit();
}
private static void staticInit() {
if (!sStaticInitialized) {
sStaticInitialized = true;
...
}
}
}
For this, the code generator templates that generate the Ai subclasses must be modified.
But this (and the above code) doesn't look like a nice solution. I understand if the overall app design is somewhat flawed, but even if that's your viewpoint, I would be grateful if such claims were accompanied by additional (independent) constructive advice. Is there a nicer solution/idiom for the above issue?
OK, here is an answer using reflection:
String classPrefixName = "com.your.company.A";
for (int i = 1; i <= 300; i++) { // assuming the subclasses are named A1 ... A300
Class<?> clazz = Class.forName(classPrefixName + i); // look for the class
Method method = clazz.getDeclaredMethod("staticInit"); // look for the method
method.setAccessible(true); // needed if staticInit() is private, as in the question
method.invoke(null); // invoke(null), since it's a static method
}
That way you don't need to wrap the static method inside an instance one.
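Since the question says only a small subset of the subclasses should be initialized, the same reflective call could be driven by the instances actually present in the list rather than by a fixed range of class names. The following is only a sketch under that assumption; InitBase, InitSub1, InitSub2, InitSub3 and StaticInitRunner are placeholder names, not the real generated classes:

```java
import java.lang.reflect.Method;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Placeholder hierarchy standing in for A and its generated subclasses.
abstract class InitBase {
    abstract void normalInit();
}

class InitSub1 extends InitBase {
    static int staticInitCalls = 0;
    private static void staticInit() { staticInitCalls++; }
    @Override void normalInit() { }
}

class InitSub2 extends InitBase {
    static int staticInitCalls = 0;
    private static void staticInit() { staticInitCalls++; }
    @Override void normalInit() { }
}

class InitSub3 extends InitBase {
    static int staticInitCalls = 0;
    private static void staticInit() { staticInitCalls++; }
    @Override void normalInit() { }
}

class StaticInitRunner {
    // Calls staticInit() exactly once for every concrete class that has
    // at least one instance in the list, at the moment the caller chooses.
    static void initClassesPresentIn(List<? extends InitBase> instances) {
        Set<Class<?>> done = new HashSet<>();
        for (InitBase instance : instances) {
            Class<?> clazz = instance.getClass();
            if (!done.add(clazz)) {
                continue; // this class was already initialized
            }
            try {
                Method method = clazz.getDeclaredMethod("staticInit");
                method.setAccessible(true); // staticInit() is private here
                method.invoke(null);        // null receiver: static method
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("staticInit() failed for " + clazz, e);
            }
        }
    }
}
```

For a list holding two InitSub1 instances and one InitSub2 instance, each of those classes has staticInit() invoked once and InitSub3 is left untouched. The lookup cost is paid once per distinct class, not once per instance.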

Why must delegation to a different constructor happen first in a Java constructor?

In a constructor in Java, if you want to call another constructor (or a super constructor), it has to be the first line in the constructor. I assume this is because you shouldn't be allowed to modify any instance variables before the other constructor runs. But why can't you have statements before the constructor delegation, in order to compute the complex value to the other function? I can't think of any good reason, and I have hit some real cases where I have written some ugly code to get around this limitation.
So I'm just wondering:
Is there a good reason for this limitation?
Are there any plans to allow this in future Java releases? (Or has Sun definitively said this is not going to happen?)
For an example of what I'm talking about, consider some code I wrote which I gave in this StackOverflow answer. In that code, I have a BigFraction class, which has a BigInteger numerator and a BigInteger denominator. The "canonical" constructor is the BigFraction(BigInteger numerator, BigInteger denominator) form. For all the other constructors, I just convert the input parameters to BigIntegers, and call the "canonical" constructor, because I don't want to duplicate all the work.
In some cases this is easy; for example, the constructor that takes two longs is trivial:
public BigFraction(long numerator, long denominator)
{
this(BigInteger.valueOf(numerator), BigInteger.valueOf(denominator));
}
But in other cases, it is more difficult. Consider the constructor which takes a BigDecimal:
public BigFraction(BigDecimal d)
{
this(d.scale() < 0 ? d.unscaledValue().multiply(BigInteger.TEN.pow(-d.scale())) : d.unscaledValue(),
d.scale() < 0 ? BigInteger.ONE : BigInteger.TEN.pow(d.scale()));
}
I find this pretty ugly, but it helps me avoid duplicating code. The following is what I'd like to do, but it is illegal in Java:
public BigFraction(BigDecimal d)
{
BigInteger numerator = null;
BigInteger denominator = null;
if(d.scale() < 0)
{
numerator = d.unscaledValue().multiply(BigInteger.TEN.pow(-d.scale()));
denominator = BigInteger.ONE;
}
else
{
numerator = d.unscaledValue();
denominator = BigInteger.TEN.pow(d.scale());
}
this(numerator, denominator);
}
Update
There have been good answers, but thus far, no answers have been provided that I'm completely satisfied with, but I don't care enough to start a bounty, so I'm answering my own question (mainly to get rid of that annoying "have you considered marking an accepted answer" message).
Workarounds that have been suggested are:
Static factory.
I've used the class in a lot of places, so that code would break if I suddenly got rid of the public constructors and went with valueOf() functions.
It feels like a workaround to a limitation. I wouldn't get any other benefits of a factory because this cannot be subclassed and because common values are not being cached/interned.
Private static "constructor helper" methods.
This leads to lots of code bloat.
The code gets ugly because in some cases I really need to compute both numerator and denominator at the same time, and I can't return multiple values unless I return a BigInteger[] or some kind of private inner class.
The main argument against this functionality is that the compiler would have to check that you didn't use any instance variables or methods before calling the superconstructor, because the object would be in an invalid state. I agree, but I think this would be an easier check than the one which makes sure all final instance variables are always initialized in every constructor, no matter what path through the code is taken. The other argument is that you simply can't execute code beforehand, but this is clearly false because the code to compute the parameters to the superconstructor is getting executed somewhere, so it must be allowed at a bytecode level.
Now, what I'd like to see, is some good reason why the compiler couldn't let me take this code:
public MyClass(String s) {
this(Integer.parseInt(s));
}
public MyClass(int i) {
this.i = i;
}
And rewrite it like this (the bytecode would be basically identical, I'd think):
public MyClass(String s) {
int tmp = Integer.parseInt(s);
this(tmp);
}
public MyClass(int i) {
this.i = i;
}
The only real difference I see between those two examples is that the "tmp" variable's scope allows it to be accessed after calling this(tmp) in the second example. So maybe a special syntax (similar to static{} blocks for class initialization) would need to be introduced:
public MyClass(String s) {
//"init{}" is a hypothetical syntax where there is no access to instance
//variables/methods, and which must end with a call to another constructor
//(using either "this(...)" or "super(...)")
init {
int tmp = Integer.parseInt(s);
this(tmp);
}
}
public MyClass(int i) {
this.i = i;
}
I think several of the answers here are wrong because they assume encapsulation is somehow broken when calling super() after invoking some code. The fact is that super() can actually break encapsulation itself, because Java allows a constructor to call methods that a subclass overrides.
Consider these classes:
class A {
protected int i;
public void print() { System.out.println("Hello"); }
public A() { i = 13; print(); }
}
class B extends A {
private String msg;
public void print() { System.out.println(msg); }
public B(String msg) { super(); this.msg = msg; }
}
If you do
new B("Wubba lubba dub dub");
the message printed out is "null". That's because the constructor from A is accessing the uninitialized field from B. So frankly it seems that if someone wanted to do this:
class C extends A {
public C() {
System.out.println(i); // i not yet initialized
super();
}
}
Then that's just as much their problem as if they make class B above. In both cases the programmer has to know how the variables are accessed during construction. And given that you can call super() or this() with all kinds of expressions in the parameter list, it seems like an artificial restriction that you can't compute any expressions before calling the other constructor. Not to mention that the restriction applies to both super() and this() when presumably you know how to not break your own encapsulation when calling this().
My verdict: This feature is a bug in the compiler, perhaps originally motivated by a good reason, but in its current form it is an artificial limitation with no purpose.
"I find this pretty ugly, but it helps me avoid duplicating code. The following is what I'd like to do, but it is illegal in Java ..."
You could also work around this limitation by using a static factory method that returns a new object:
public static BigFraction valueOf(BigDecimal d)
{
// compute numerator and denominator from d
return new BigFraction(numerator, denominator);
}
Alternatively, you could cheat by calling a private static method to do the computations for your constructor:
public BigFraction(BigDecimal d)
{
this(computeNumerator(d), computeDenominator(d));
}
private static BigInteger computeNumerator(BigDecimal d) { ... }
private static BigInteger computeDenominator(BigDecimal d) { ... }
The constructors must be called in order, from the root parent class to the most derived class. You can't execute any code beforehand in the derived constructor because before the parent constructor is called, the stack frame for the derived constructor hasn't even been allocated yet, because the derived constructor hasn't started executing. Admittedly, the syntax for Java doesn't make this fact clear.
Edit: To summarize, when a derived class constructor is "executing" before the this() call, the following points apply.
Member variables can't be touched, because they are invalid before base classes are constructed.
Arguments are read-only, because the stack frame has not been allocated.
Local variables cannot be accessed, because the stack frame has not been allocated.
You can gain access to arguments and local variables if you allocated the constructors' stack frames in reverse order, from derived classes to base classes, but this would require all frames to be active at the same time, wasting memory for every object construction to allow for the rare case of code that wants to touch local variables before base classes are constructed.
"My guess is that, until a constructor has been called for every level of the hierarchy, the object is in an invalid state. It is unsafe for the JVM to run anything on it until it has been completely constructed."
Actually, it is possible to construct objects in Java without calling every constructor in the hierarchy, although not with the new keyword.
For example, when Java's serialization constructs an object during deserialization, it calls the constructor of the first non-serializable class in the hierarchy. So when java.util.HashMap is deserialized, first a java.util.HashMap instance is allocated and then the constructor of its first non-serializable superclass java.util.AbstractMap is called (which in turn calls java.lang.Object's constructor).
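The deserialization behaviour described above can be observed with a small experiment. The classes below are hypothetical and exist only to count constructor calls:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical non-serializable superclass that counts its constructor calls.
class PlainBase {
    static int baseConstructorCalls = 0;

    PlainBase() {
        baseConstructorCalls++;
    }
}

// Hypothetical serializable subclass that counts its own constructor calls.
class SerializableChild extends PlainBase implements Serializable {
    private static final long serialVersionUID = 1L;
    static int childConstructorCalls = 0;

    SerializableChild() {
        childConstructorCalls++;
    }
}

class DeserializationDemo {
    // Serializes the object to a byte array and reads it back.
    static Object roundTrip(Serializable original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

After new SerializableChild() followed by one roundTrip call, baseConstructorCalls is 2 but childConstructorCalls is still 1: the copy was created without running the constructor of SerializableChild.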
You can also use the Objenesis library to instantiate objects without calling the constructor.
Or if you are so inclined, you can generate the bytecode yourself (with ASM or similar). At the bytecode level, new Foo() compiles to two instructions:
NEW Foo
INVOKESPECIAL Foo.<init> ()V
If you want to avoid calling the constructor of Foo, you can change the second command, for example:
NEW Foo
INVOKESPECIAL java/lang/Object.<init> ()V
But even then, the constructor of Foo must contain a call to its superclass. Otherwise the JVM's class loader will throw an exception when loading the class, complaining that there is no call to super().
Allowing code to not call the super constructor first breaks encapsulation - the idea that you can write code and be able to prove that no matter what someone else does - extend it, invoke it, instantiate it - it will always be in a valid state.
IOW: it's not a JVM requirement as such, but a Comp Sci requirement. And an important one.
To solve your problem, incidentally, you make use of private static methods - they don't depend on any instance:
public BigFraction(BigDecimal d)
{
this(appropriateInitializationNumeratorFor(d),
appropriateInitializationDenominatorFor(d));
}
private static BigInteger appropriateInitializationNumeratorFor(BigDecimal d)
{
if(d.scale() < 0)
{
return d.unscaledValue().multiply(BigInteger.TEN.pow(-d.scale()));
}
else
{
return d.unscaledValue();
}
}
If you don't like having separate methods (a lot of common logic you only want to execute once, for instance), have one method that returns a private little static inner class which is used to invoke a private constructor.
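A minimal sketch of that last suggestion follows. Fraction is a hypothetical stand-in for BigFraction, and the Parts holder class and the accessor names are my own invention; the arithmetic is the one from the question:

```java
import java.math.BigDecimal;
import java.math.BigInteger;

// Hypothetical stand-in for the BigFraction class from the question.
final class Fraction {
    private final BigInteger numerator;
    private final BigInteger denominator;

    // The "canonical" constructor.
    Fraction(BigInteger numerator, BigInteger denominator) {
        this.numerator = numerator;
        this.denominator = denominator;
    }

    // Public constructor: delegates in its first statement, as Java requires.
    Fraction(BigDecimal d) {
        this(Parts.of(d));
    }

    // Private constructor that unpacks the values computed together.
    private Fraction(Parts parts) {
        this(parts.numerator, parts.denominator);
    }

    // Small private holder so both values can be computed in one place.
    private static final class Parts {
        final BigInteger numerator;
        final BigInteger denominator;

        Parts(BigInteger numerator, BigInteger denominator) {
            this.numerator = numerator;
            this.denominator = denominator;
        }

        static Parts of(BigDecimal d) {
            if (d.scale() < 0) {
                return new Parts(
                    d.unscaledValue().multiply(BigInteger.TEN.pow(-d.scale())),
                    BigInteger.ONE);
            }
            return new Parts(d.unscaledValue(), BigInteger.TEN.pow(d.scale()));
        }
    }

    BigInteger numerator() {
        return numerator;
    }

    BigInteger denominator() {
        return denominator;
    }
}
```

The public constructors stay in place, so existing callers are unaffected, and the branch on d.scale() is evaluated only once. For example, new Fraction(new BigDecimal("1.25")) ends up with numerator 125 and denominator 100.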
My guess is that, until a constructor has been called for every level of the hierarchy, the object is in an invalid state. It is unsafe for the JVM to run anything on it until it has been completely constructed.
Well, the problem is that Java cannot know what statements you are going to put before the super call. For example, you could refer to member variables which are not yet initialized. So I don't think Java will ever support this.
Now, there are many ways to work around this problem such as by using factory or template methods.
Look at it this way.
Let's say that an object is composed of 10 parts.
1,2,3,4,5,6,7,8,9,10
Ok?
Parts 1 to 9 are in the superclass; part #10 is your addition.
You simply cannot add the 10th part until the previous 9 are completed.
That's it.
If parts 1-6 come from yet another superclass, that's fine; the point is that one single object is created in a specific sequence. That's the way it was designed.
Of course the real reason is far more complex than this, but I think this pretty much answers the question.
As for the alternatives, I think there are plenty already posted here.

How do I identify immutable objects in Java

In my code, I am creating a collection of objects which will be accessed by various threads in a fashion that is only safe if the objects are immutable. When an attempt is made to insert a new object into my collection, I want to test to see if it is immutable (if not, I'll throw an exception).
One thing I can do is to check a few well-known immutable types:
private static final Set<Class<?>> knownImmutables = new HashSet<Class<?>>(Arrays.<Class<?>>asList(
String.class, Byte.class, Short.class, Integer.class, Long.class,
Float.class, Double.class, Boolean.class, BigInteger.class, BigDecimal.class
));
...
public static boolean isImmutable(Object o) {
return knownImmutables.contains(o.getClass());
}
This actually gets me 90% of the way, but sometimes my users will want to create simple immutable types of their own:
public class ImmutableRectangle {
private final int width;
private final int height;
public ImmutableRectangle(int width, int height) {
this.width = width;
this.height = height;
}
public int getWidth() { return width; }
public int getHeight() { return height; }
}
Is there some way (perhaps using reflection) that I could reliably detect whether a class is immutable? False positives (thinking it's immutable when it isn't) are not acceptable but false negatives (thinking it's mutable when it isn't) are.
Edited to add: Thanks for the insightful and helpful answers. As some of the answers pointed out, I neglected to define my security objectives. The threat here is clueless developers -- this is a piece of framework code that will be used by large numbers of people who know next-to-nothing about threading and won't be reading the documentation. I do NOT need to defend against malicious developers -- anyone clever enough to mutate a String or perform other shenanigans will also be smart enough to know it's not safe in this case. Static analysis of the codebase IS an option, so long as it is automated, but code reviews cannot be counted on because there is no guarantee every review will have threading-savvy reviewers.
There is no reliable way to detect if a class is immutable. This is because there are so many ways a property of a class might be altered and you can't detect all of them via reflection.
The only way to get close to this is:
Only allow final properties of types that are immutable (primitive types and classes you know are immutable),
Require the class to be final itself
Require that they inherit from a base class you provide (which is guaranteed to be immutable)
Then you can check with the following code if the object you have is immutable:
static boolean isImmutable(Object obj) {
Class<?> objClass = obj.getClass();
// Class of the object must be a direct child class of the required class
Class<?> superClass = objClass.getSuperclass();
if (!Immutable.class.equals(superClass)) {
return false;
}
// Class must be final
if (!Modifier.isFinal(objClass.getModifiers())) {
return false;
}
// Check all fields defined in the class for type and if they are final
Field[] objFields = objClass.getDeclaredFields();
for (int i = 0; i < objFields.length; i++) {
if (!Modifier.isFinal(objFields[i].getModifiers())
|| !isValidFieldType(objFields[i].getType())) {
return false;
}
}
// Lets hope we didn't forget something
return true;
}
static boolean isValidFieldType(Class<?> type) {
// Check for all allowed property types...
return type.isPrimitive() || String.class.equals(type);
}
Update: As suggested in the comments, it could be extended to recurse on the superclass instead of checking for a certain class. It was also suggested to recursively use isImmutable in the isValidFieldType method. This could probably work, and I have also done some testing, but it is not trivial. You can't just check all field types with a call to isImmutable, because String already fails this test (its field hash is not final!). You can also easily run into endless recursion, causing StackOverflowErrors ;) Other problems might be caused by generics, where you also have to check their types for immutability.
I think with some work, these potential problems might be solved somehow. But then, you have to ask yourself first if it really is worth it (also performance wise).
Use the Immutable annotation from Java Concurrency in Practice. The tool FindBugs can then help in detecting classes which are mutable but shouldn't be.
At my company we've defined an annotation called @Immutable. If you choose to attach that to a class, it means you promise you're immutable.
It works for documentation, and in your case it would work as a filter.
Of course you're still depending on the author keeping his word about being immutable, but since the author explicitly added the annotation it's a reasonable assumption.
Basically no.
You could build a giant white-list of accepted classes, but I think the less crazy way would be to just write in the documentation for the collection that everything that goes into this collection must be immutable.
Edit: Other people have suggested having an immutable annotation. This is fine, but you need the documentation as well. Otherwise people will just think "if I put this annotation on my class I can store it in the collection" and will just chuck it on anything, immutable and mutable classes alike. In fact, I would be wary of having an immutable annotation just in case people think that annotation makes their class immutable.
In my code, I am creating a collection of objects which will be accessed by various threads in a fashion that is only safe if the objects are immutable.
Not a direct answer to your question, but keep in mind that objects that are immutable are not automatically guaranteed to be thread safe (sadly). Code needs to be side-effect free to be thread safe, and that's quite a bit more difficult.
Suppose you have this class:
class Foo {
final String x;
final Integer y;
...
public void bar() {
Singleton.getInstance().foolAround();
}
}
Then the foolAround() method might include some non-thread safe operations, which will blow up your app. And it's not possible to test for this using reflection, as the actual reference can only be found in the method body, not in the fields or exposed interface.
Other than that, the others are correct: you can scan for all declared fields of the class, check if every one of them is final and also an immutable class, and you're done. I don't think methods being final is a requirement.
Also, be careful about recursively checking dependent fields for immutability, you might end up with circles:
class A {
final B b; // might be immutable...
}
class B {
final A a; // same so here.
}
Classes A and B are perfectly immutable (and possibly even usable through some reflection hacks), but naive recursive code will go into an endless loop checking A, then B, then A again, onwards to B, ...
You can fix that with a 'seen' map that disallows cycles, or with some really clever code that decides classes are immutable if all their dependees are immutable only depending on themselves, but that's going to be really complicated...
This could be another hint:
If the class has no setters then it cannot be mutated, provided the parameters it was created with are either "primitive" types or not mutable themselves.
Also, no methods may be overridable, and all fields must be final and private.
I'll try to code something tomorrow for you, but Simon's code using reflection looks pretty good.
In the meantime, try to grab a copy of the "Effective Java" book by Joshua Bloch; it has an Item related to this topic. While it does not say how to detect an immutable class, it shows how to create a good one.
The item is called: "Favor immutability"
Updated link: https://www.amazon.com/Effective-Java-Joshua-Bloch/dp/0134685997
You can ask your clients to add metadata (annotations) and check them at runtime with reflection, like this:
Metadata:
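// RUNTIME retention keeps the annotation visible to reflection
@Retention(RetentionPolicy.RUNTIME)
// ElementType.TYPE covers classes, interfaces and enums
@Target(ElementType.TYPE)
public @interface Immutable { }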
Client Code:
@Immutable
public class ImmutableRectangle {
private final int width;
private final int height;
public ImmutableRectangle(int width, int height) {
this.width = width;
this.height = height;
}
public int getWidth() { return width; }
public int getHeight() { return height; }
}
Then, by using reflection on the class, check whether it has the annotation (I would paste the code, but it's boilerplate and can be found easily online).
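The reflection check mentioned above is short enough to sketch. This is only an illustration: the class names ImmutablePoint, MutableCounter and ImmutableAnnotationCheck are made up for the example, and the check merely trusts the author's promise rather than verifying anything.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Marker annotation; RUNTIME retention is required for reflection to see it.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface Immutable { }

// Hypothetical client class that carries the promise.
@Immutable
final class ImmutablePoint {
    private final int x;
    private final int y;

    ImmutablePoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    int getX() { return x; }
    int getY() { return y; }
}

// Hypothetical class without the annotation, for contrast.
final class MutableCounter {
    int count;
}

final class ImmutableAnnotationCheck {
    // True only if the runtime class of the object carries @Immutable.
    static boolean isMarkedImmutable(Object o) {
        return o != null && o.getClass().isAnnotationPresent(Immutable.class);
    }

    public static void main(String[] args) {
        System.out.println(isMarkedImmutable(new ImmutablePoint(1, 2))); // true
        System.out.println(isMarkedImmutable(new MutableCounter()));     // false
    }
}
```

Because the annotation is not declared with @Inherited, a subclass of an annotated class would not pass this check unless it is annotated itself.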
Why do all the recommendations require the class to be final? If you are using reflection to check the class of each object, and you can determine programmatically that the class is immutable (immutable, final fields), then you don't need to require that the class itself is final.
You can use AOP and the @Immutable annotation from jcabi-aspects:
@Immutable
public class Foo {
private String data;
}
// this line will throw a runtime exception since class Foo
// is actually mutable, despite the annotation
Object object = new Foo();
Like the other answerers already said, IMHO there is no reliable way to find out if an object is really immutable.
I would just introduce an interface "Immutable" to check against when appending. This works as a hint that only immutable objects should be inserted for whatever reason you're doing it.
interface Immutable {}
class MyImmutable implements Immutable{...}
public void add(Object o) {
if (!(o instanceof Immutable) && !checkIsImmutableBasePrimitive(o))
throw new IllegalArgumentException("o is not immutable!");
...
}
Try this:
public static boolean isImmutable(Object object){
if (object instanceof Number) { // Numbers are immutable
if (object instanceof AtomicInteger) {
// AtomicIntegers are mutable
} else if (object instanceof AtomicLong) {
// AtomicLongs are mutable
} else {
return true;
}
} else if (object instanceof String) { // Strings are immutable
return true;
} else if (object instanceof Character) { // Characters are immutable
return true;
} else if (object instanceof Class) { // Classes are immutable
return true;
}
Class<?> objClass = object.getClass();
// Class must be final
if (!Modifier.isFinal(objClass.getModifiers())) {
return false;
}
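// Check that every declared field is final and of an immutable type.
// Caveat: getType() returns a Class object, which the "instanceof Class"
// branch above always accepts, so the type check below never rejects a field.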
Field[] objFields = objClass.getDeclaredFields();
for (int i = 0; i < objFields.length; i++) {
if (!Modifier.isFinal(objFields[i].getModifiers())
|| !isImmutable(objFields[i].getType())) {
return false;
}
}
// Let's hope we didn't forget something
return true;
}
To my knowledge, there is no way to identify immutable objects that is 100% correct. However, I have written a library to get you closer. It performs analysis of bytecode of a class to determine if it is immutable or not, and can execute at runtime. It is on the strict side, so it also allows whitelisting known immutable classes.
You can check it out at: www.mutabilitydetector.org
It allows you to write code like this in your application:
/*
* Request an analysis of the runtime class, to discover if this
* instance will be immutable or not.
*/
AnalysisResult result = analysisSession.resultFor(dottedClassName);
if (result.isImmutable.equals(IMMUTABLE)) {
/*
* rest safe in the knowledge the class is
* immutable, share across threads with joyful abandon
*/
} else if (result.isImmutable.equals(NOT_IMMUTABLE)) {
/*
* be careful here: make defensive copies,
* don't publish the reference,
* read Java Concurrency In Practice right away!
*/
}
It is free and open source under the Apache 2.0 license.
Something which works for a high percentage of built-in classes is to test for instanceof Comparable. Classes that pass this test but are not immutable, like Date, are often treated as immutable in most cases anyway.
I appreciate and admire the amount of work Grundlefleck has put into his mutability detector, but I think it is a bit of an overkill. You can write a simple but practically very adequate (that is, pragmatic) detector as follows:
(note: this is a copy of my comment here: https://stackoverflow.com/a/28111150/773113)
First of all, you are not going to be just writing a method which determines whether a class is immutable; instead, you will need to write an immutability detector class, because it is going to have to maintain some state. The state of the detector will be the detected immutability of all classes which it has examined so far. This is not only useful for performance, but it is actually necessary because a class may contain a circular reference, which would cause a simplistic immutability detector to fall into infinite recursion.
The immutability of a class has four possible values: Unknown, Mutable, Immutable, and Calculating. You will probably want to have a map which associates each class that you have encountered so far to an immutability value. Of course, Unknown does not actually need to be implemented, since it will be the implied state of any class which is not yet in the map.
So, when you begin examining a class, you associate it with a Calculating value in the map, and when you are done, you replace Calculating with either Immutable or Mutable.
For each class, you only need to check the field members, not the code. The idea of checking bytecode is rather misguided.
First of all, you should not check whether a class is final; The finality of a class does not affect its immutability. Instead, a method which expects an immutable parameter should first of all invoke the immutability detector to assert the immutability of the class of the actual object that was passed. This test can be omitted if the type of the parameter is a final class, so finality is good for performance, but strictly speaking not necessary. Also, as you will see further down, a field whose type is of a non-final class will cause the declaring class to be considered as mutable, but still, that's a problem of the declaring class, not the problem of the non-final immutable member class. It is perfectly fine to have a tall hierarchy of immutable classes, in which all the non-leaf nodes must of course be non-final.
You should not check whether a field is private; it is perfectly fine for a class to have a public field, and the visibility of the field does not affect the immutability of the declaring class in any way, shape, or form. You only need to check whether the field is final and its type is immutable.
When examining a class, what you want to do first of all is to recurse to determine the immutability of its super class. If the super is mutable, then the descendant is by definition mutable too.
Then, you only need to check the declared fields of the class, not all fields.
If a field is non-final, then your class is mutable.
If a field is final, but the type of the field is mutable, then your class is mutable. (Arrays are by definition mutable.)
If a field is final, and the type of the field is Calculating, then ignore it and proceed to the next field. If all fields are either immutable or Calculating, then your class is immutable.
If the type of the field is an interface, or an abstract class, or a non-final class, then it is to be considered as mutable, since you have absolutely no control over what the actual implementation may do. This might seem like an insurmountable problem, because it means that wrapping a modifiable collection inside an UnmodifiableCollection will still fail the immutability test, but it is actually fine, and it can be handled with the following workaround.
Some classes may contain non-final fields and still be effectively immutable. An example of this is the String class. Other classes which fall into this category are classes which contain non-final members purely for performance monitoring purposes (invocation counters, etc.), classes which implement popsicle immutability (look it up), and classes which contain members that are interfaces which are known to not cause any side effects. Also, if a class contains bona fide mutable fields but promises not to take them into account when computing hashCode() and equals(), then the class is of course unsafe when it comes to multi-threading, but it can still be considered as immutable for the purpose of using it as a key in a map. So, all these cases can be handled in one of two ways:
Manually adding classes (and interfaces) to your immutability detector. If you know that a certain class is effectively immutable despite the fact that the immutability test for it fails, you can manually add an entry to your detector which associates it with Immutable. This way, the detector will never attempt to check whether it is immutable, it will always just say 'yes, it is.'
Introducing an @ImmutabilityOverride annotation. Your immutability detector can check for the presence of this annotation on a field, and if present, it may treat the field as immutable despite the fact that the field may be non-final or its type may be mutable. The detector may also check for the presence of this annotation on the class, thus treating the class as immutable without even bothering to check its fields.
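The steps above can be sketched roughly as follows. This is a simplified illustration, not a complete detector: every name in it is made up for the example, only the "manually adding classes" workaround is shown, and skipping static fields is my own assumption. It is meant to be called with the runtime class of an actual object.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical detector; it keeps state so that cycles terminate.
final class ImmutabilityDetector {
    // "Unknown" is implied by a class being absent from the map.
    enum Status { MUTABLE, IMMUTABLE, CALCULATING }

    private final Map<Class<?>, Status> cache = new HashMap<>();
    // Classes and interfaces manually declared immutable.
    private final Set<Class<?>> declared = new HashSet<>();

    ImmutabilityDetector() {
        declareImmutable(String.class); // effectively immutable despite a non-final field
    }

    void declareImmutable(Class<?> type) {
        declared.add(type);
    }

    boolean isImmutable(Class<?> type) {
        return statusOf(type) == Status.IMMUTABLE;
    }

    private Status statusOf(Class<?> type) {
        if (type.isPrimitive() || declared.contains(type)) return Status.IMMUTABLE;
        if (type.isArray()) return Status.MUTABLE; // arrays are always mutable
        Status cached = cache.get(type);
        if (cached != null) return cached; // may be CALCULATING inside a cycle
        cache.put(type, Status.CALCULATING);
        Status result = examine(type);
        cache.put(type, result);
        return result;
    }

    private Status examine(Class<?> type) {
        Class<?> parent = type.getSuperclass();
        if (parent != null && statusOf(parent) == Status.MUTABLE) return Status.MUTABLE;
        for (Field field : type.getDeclaredFields()) {
            int mods = field.getModifiers();
            if (Modifier.isStatic(mods)) continue; // assumption: only instance state counts
            if (!Modifier.isFinal(mods)) return Status.MUTABLE;
            Class<?> fieldType = field.getType();
            if (fieldType.isPrimitive() || declared.contains(fieldType)) continue;
            // Interfaces, abstract classes and non-final classes may hide a mutable subtype.
            if (!Modifier.isFinal(fieldType.getModifiers())) return Status.MUTABLE;
            // A field type that is still CALCULATING is skipped, as described above.
            if (statusOf(fieldType) == Status.MUTABLE) return Status.MUTABLE;
        }
        return Status.IMMUTABLE;
    }
}

// Hypothetical sample classes used to exercise the detector.
final class Point { final int x; final int y; Point(int x, int y) { this.x = x; this.y = y; } }
final class Counter { int count; }
final class Tags { final List<String> values; Tags(List<String> values) { this.values = values; } }
final class NodeA { final NodeB b; NodeA(NodeB b) { this.b = b; } }
final class NodeB { final NodeA a; NodeB(NodeA a) { this.a = a; } }
```

With a fresh detector, Point and the mutually referencing NodeA/NodeB come out immutable, Counter comes out mutable, and Tags is rejected because List is an interface. Calling declareImmutable(List.class) on a fresh detector before the check makes Tags pass, which is the manual workaround in action.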
I hope this helps future generations.

Initialize class fields in constructor or at declaration?

I've been programming in C# and Java recently and I am curious where the best place is to initialize my class fields.
Should I do it at declaration?:
public class Dice
{
private int topFace = 1;
private Random myRand = new Random();
public void Roll()
{
// ......
}
}
or in a constructor?:
public class Dice
{
private int topFace;
private Random myRand;
public Dice()
{
topFace = 1;
myRand = new Random();
}
public void Roll()
{
// .....
}
}
I'm really curious what some of you veterans think is the best practice. I want to be consistent and stick to one approach.
My rules:
Don't initialize with the default values in declaration (null, false, 0, 0.0…).
Prefer initialization in declaration if you don't have a constructor parameter that changes the value of the field.
If the value of the field changes because of a constructor parameter put the initialization in the constructors.
Be consistent in your practice (the most important rule).
In C# it doesn't matter. The two code samples you give are utterly equivalent. In the first example the C# compiler (or is it the CLR?) will construct an empty constructor and initialise the variables as if they were in the constructor (there's a slight nuance to this that Jon Skeet explains in the comments below).
If there is already a constructor then any initialisation "above" will be moved into the top of it.
In terms of best practice the former is less error prone than the latter as someone could easily add another constructor and forget to chain it.
I think there is one caveat. I once committed such an error: Inside of a derived class, I tried to "initialize at declaration" the fields inherited from an abstract base class. The result was that there existed two sets of fields, one is "base" and another is the newly declared ones, and it cost me quite some time to debug.
The lesson: to initialize inherited fields, you'd do it inside of the constructor.
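In Java, the same mistake looks roughly like this (a minimal sketch; Shape, BrokenCircle and FixedCircle are made-up names). Redeclaring the field in the subclass creates a second field that hides the inherited one, so code in the base class keeps reading the original value.

```java
abstract class Shape {
    protected String name = "shape";

    // Reads Shape.name, never a field declared in a subclass.
    String describe() {
        return name;
    }
}

class BrokenCircle extends Shape {
    // This is a new field that hides Shape.name; the inherited field is untouched.
    protected String name = "circle";
}

class FixedCircle extends Shape {
    FixedCircle() {
        // Assigns the inherited field instead of declaring a second one.
        name = "circle";
    }
}

final class FieldHidingDemo {
    public static void main(String[] args) {
        System.out.println(new BrokenCircle().describe()); // prints "shape"
        System.out.println(new FixedCircle().describe());  // prints "circle"
    }
}
```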
The semantics of C# differ slightly from Java here. In C#, assignment in the declaration is performed before calling the superclass constructor. In Java it is done immediately after, which allows 'this' to be used (particularly useful for anonymous inner classes), and means that the semantics of the two forms really do match.
If you can, make the fields final.
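The Java ordering described above can be observed with a small experiment (a sketch with made-up class names). The superclass constructor calls an overridden method before the subclass field initializer has run, so it sees the default value.

```java
class Base {
    final String seenDuringSuperConstructor;

    Base() {
        // Dispatches to Derived.label() before Derived's initializers have run.
        seenDuringSuperConstructor = label();
    }

    String label() {
        return "base";
    }
}

class Derived extends Base {
    // In Java this assignment runs right after Base() returns.
    private String text = "derived";

    @Override
    String label() {
        return text;
    }
}

final class InitOrderDemo {
    public static void main(String[] args) {
        Derived d = new Derived();
        System.out.println(d.seenDuringSuperConstructor); // prints "null"
        System.out.println(d.label());                    // prints "derived"
    }
}
```

Calling overridable methods from a constructor is generally best avoided for exactly this reason; it is done here only to make the ordering visible.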
Assuming the type in your example, definitely prefer to initialize fields in the constructor. The exceptional cases are:
Fields in static classes/methods
Fields typed as static/final/et al
I always think of the field listing at the top of a class as the table of contents (what is contained herein, not how it is used), and the constructor as the introduction. Methods of course are chapters.
In Java, an initializer with the declaration means the field is always initialized the same way, regardless of which constructor is used (if you have more than one) or the parameters of your constructors (if they have arguments), although a constructor might subsequently change the value (if it is not final). So using an initializer with a declaration suggests to a reader that the initialized value is the value that the field has in all cases, regardless of which constructor is used and regardless of the parameters passed to any constructor. Therefore use an initializer with the declaration only if, and always if, the value for all constructed objects is the same.
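A small sketch of that rule, using a hypothetical Playlist class: the list is the same for every constructed object, so it is initialized at its declaration, while the title depends on a constructor parameter and is therefore assigned in the constructor.

```java
import java.util.ArrayList;
import java.util.List;

class Playlist {
    // Same starting value no matter which constructor runs.
    private final List<String> tracks = new ArrayList<>();
    // Depends on a constructor argument, so no initializer here.
    private final String title;

    Playlist() {
        this("untitled");
    }

    Playlist(String title) {
        this.title = title;
    }

    Playlist(String title, List<String> initialTracks) {
        this(title);
        // The field initializer has already run, so the list exists here.
        tracks.addAll(initialTracks);
    }

    String title() { return title; }
    int size() { return tracks.size(); }
}
```

A reader of the field declarations can tell at a glance that tracks always starts as an empty list and that title varies per object.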
There are many and various situations.
I just need an empty list
The situation is clear. I just need to prepare my list and prevent an exception from being thrown when someone adds an item to the list.
public class CsvFile
{
private List<CsvRow> lines = new List<CsvRow>();
public CsvFile()
{
}
}
I know the values
I exactly know what values I want to have by default or I need to use some other logic.
public class AdminTeam
{
private List<string> usernames;
public AdminTeam()
{
usernames = new List<string>() {"usernameA", "usernameB"};
}
}
or
public class AdminTeam
{
private List<string> usernames;
public AdminTeam()
{
usernames = GetDefaultUsers(2);
}
}
Empty list with possible values
Sometimes I expect an empty list by default with a possibility of adding values through another constructor.
public class AdminTeam
{
private List<string> usernames = new List<string>();
public AdminTeam()
{
}
public AdminTeam(List<string> admins)
{
admins.ForEach(x => usernames.Add(x));
}
}
What if I told you, it depends?
I in general initialize everything and do it in a consistent way. Yes it's overly explicit but it's also a little easier to maintain.
If we are worried about performance, well then I initialize only what has to be done and place it in the areas it gives the most bang for the buck.
In a real time system, I question if I even need the variable or constant at all.
And in C++ I often do next to no initialization in either place and move it into an Init() function. Why? Well, in C++ if you're initializing something that can throw an exception during object construction you open yourself to memory leaks.
The design of C# suggests that inline initialization is preferred, or it wouldn't be in the language. Any time you can avoid a cross-reference between different places in the code, you're generally better off.
There is also the matter of consistency with static field initialization, which needs to be inline for best performance. The Framework Design Guidelines for Constructor Design say this:
✓ CONSIDER initializing static fields inline rather than explicitly using static constructors, because the runtime is able to optimize the performance of types that don’t have an explicitly defined static constructor.
"Consider" in this context means to do so unless there's a good reason not to. In the case of static initializer fields, a good reason would be if initialization is too complex to be coded inline.
Being consistent is important, but this is the question to ask yourself:
"Do I have a constructor for anything else?"
Typically, I am creating models for data transfers that the class itself does nothing except work as housing for variables.
In these scenarios, I usually don't have any methods or constructors. It would feel silly to me to create a constructor for the exclusive purpose of initializing my lists, especially since I can initialize them in-line with the declaration.
So as many others have said, it depends on your usage. Keep it simple, and don't make anything extra that you don't have to.
Consider the situation where you have more than one constructor. Will the initialization be different for the different constructors? If they will be the same, then why repeat for each constructor? This is in line with kokos statement, but may not be related to parameters. Let's say, for example, you want to keep a flag which shows how the object was created. Then that flag would be initialized differently for different constructors regardless of the constructor parameters. On the other hand, if you repeat the same initialization for each constructor you leave the possibility that you (unintentionally) change the initialization parameter in some of the constructors but not in others. So, the basic concept here is that common code should have a common location and not be potentially repeated in different locations. So I would say always put it in the declaration until you have a specific situation where that no longer works for you.
There is a slight performance benefit to setting the value in the declaration. If you set it in the constructor it is actually being set twice (first to the default value, then reset in the ctor).
When you don't need some logic or error handling:
Initialize class fields at declaration
When you need some logic or error handling:
Initialize class fields in constructor
This works well when the initialization value is available and the
initialization can be put on one line. However, this form of
initialization has limitations because of its simplicity. If
initialization requires some logic (for example, error handling or a
for loop to fill a complex array), simple assignment is inadequate.
Instance variables can be initialized in constructors, where error
handling or other logic can be used.
From https://docs.oracle.com/javase/tutorial/java/javaOO/initial.html .
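As an illustration of the quoted passage, here is a sketch with a hypothetical Histogram class: one field fits on a single line and is initialized at its declaration, while the other needs validation and a loop, so it is filled in by the constructor.

```java
class Histogram {
    // Simple value: a one-line initializer at the declaration is enough.
    private final String unit = "ms";
    // Needs error handling and a loop, so it is set in the constructor.
    private final int[] bucketLimits;

    Histogram(int buckets, int width) {
        if (buckets < 1 || width < 1) {
            throw new IllegalArgumentException("buckets and width must be positive");
        }
        bucketLimits = new int[buckets];
        for (int i = 0; i < buckets; i++) {
            bucketLimits[i] = (i + 1) * width; // upper limit of bucket i
        }
    }

    String unit() { return unit; }
    int limitOf(int bucket) { return bucketLimits[bucket]; }
}
```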
I normally try to have the constructor do nothing but receive the dependencies and initialize the related instance members with them. This will make your life easier if you want to unit test your classes.
If the value you are going to assign to an instance variable is not influenced by any of the parameters you pass to your constructor, then assign it at declaration time.
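Applied to the Dice class from the question, this could look like the following Java sketch. Passing the Random in through the constructor is my own assumption about what "getting the dependencies" would mean here; the question's version creates it internally.

```java
import java.util.Random;

class Dice {
    // Dependency supplied by the caller, so a test can pass a predictable source.
    private final Random random;
    // Not influenced by any constructor parameter, so it is set at declaration.
    private int topFace = 1;

    Dice(Random random) {
        this.random = random;
    }

    int roll() {
        topFace = random.nextInt(6) + 1; // value in 1..6
        return topFace;
    }

    int topFace() {
        return topFace;
    }
}
```

A unit test can then hand in a seeded Random, or a subclass that overrides nextInt(int), and assert on the exact face that comes up.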
Not a direct answer to your question about best practice, but an important related point: in a generic class definition, either leave it to the compiler to initialize the fields with default values, or use a special method to initialize them to their default values (if that is absolutely necessary for code readability).
class MyGeneric<T>
{
T data;
//T data = ""; // <-- ERROR
//T data = 0; // <-- ERROR
//T data = null; // <-- ERROR
public MyGeneric()
{
// All of the above errors would be errors here in constructor as well
}
}
And the special method to initialize a generic field to its default value is the following:
class MyGeneric<T>
{
T data = default(T);
public MyGeneric()
{
// The same method can be used here in constructor
}
}
"Prefer initialization in declaration", seems like a good general practice.
Here is an example which cannot be initialized in the declaration so it has to be done in the constructor.
"Error CS0236 A field initializer cannot reference the non-static field, method, or property"
class UserViewModel
{
// Cannot be set here
public ICommand UpdateCommand { get; private set; }
public UserViewModel()
{
UpdateCommand = new GenericCommand(Update_Method); // <== THIS WORKS
}
void Update_Method(object? parameter)
{
}
}
