I would like to have some guarantee that instances of a particular class Content are only accessed by their "owner", and that if another object wants the same Content it needs to take a deep copy. Coming from C++ I would achieve that using a unique_ptr; is there anything similar in Java?
Currently I am resolving this by just keeping the Content private everywhere I keep one and paying attention to creating a new Content (the constructor implements the deep copy mechanism) on a getContent. But I have no means of enforcing possible other users of the Content class to follow the same pattern, it's easy to forget. It would be nicer if it could take care of itself somehow, like not being copyable.
I realize that it goes somewhat against the spirit of the language, but in some cases I think it's justified. For example, if Content represents some stream of data that is modified even by reading it. I thought, if not in the core language, maybe there is some @interface for compile-time checking or a way of creating one?
Edit: The idea is that the owner can modify the object freely, before or after taking copies, and if someone takes a deep copy, they can modify theirs (not affecting the original), so making the Content immutable is a bit too harsh (unless I'm misunderstanding what that implies).
There are a couple of common strategies here:
Privacy with defensive copying
In this strategy, you'd have the owner have a private reference to the content, and if it's appropriate for it to give out copies of that content, to do so via a defensive copy:
class Owner {
private Content content;
// ...unnecessary detail omitted...
public Content getContent() {
return new Content(this.content);
}
}
The Cloneable interface can sometimes be useful here.
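As a rough sketch of what the copy constructor behind new Content(this.content) could look like: the list of integers inside Content and the add/size/record methods are assumptions made up for illustration, since the question does not say what Content holds.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical Content holding a mutable list of values.
final class Content {
    private final List<Integer> values;

    Content() {
        this.values = new ArrayList<>();
    }

    // Copy constructor: copies the list so the two instances share no mutable state.
    // Integer is immutable, so copying the list is enough for a deep copy here.
    Content(Content other) {
        this.values = new ArrayList<>(other.values);
    }

    void add(int value) { values.add(value); }

    int size() { return values.size(); }
}

class Owner {
    private final Content content = new Content();

    // The owner mutates its own instance freely.
    void record(int value) { content.add(value); }

    // Callers only ever receive an independent copy.
    Content getContent() { return new Content(content); }

    int size() { return content.size(); }
}
```

A caller that modifies the Content it got from getContent() changes only its own copy. Nothing stops another class from storing a Content in a public field, though, so this is still a convention rather than an enforced rule.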
Immutable objects
The other common strategy is to use immutable objects (e.g., ensure that Content, once instantiated, cannot be modified). Then you don't care who has a reference to the content, since they cannot change it.
No there isn't.
Once you have established a reference to an object, there's absolutely nothing you can do to stop someone from assigning another reference to that object via that established reference.
Java programmers get round this by making objects immutable (see java.lang.String). Then you ought not give two hoots about who else is referring to a particular instance.
You can declare the class Content as Immutable by doing this:
Don't provide "setter" methods — methods that modify fields or objects referred to by fields.
Make all fields final and private.
Don't allow subclasses to override methods. The simplest way to do this is to declare the class as final.
If the instance fields include references to mutable objects, don't allow those objects to be changed
Here is a java official doc: https://docs.oracle.com/javase/tutorial/essential/concurrency/imstrat.html
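The four rules above could be applied roughly like this. The name and tags fields are hypothetical, picked only to have one immutable and one mutable-typed field:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// final class: no subclass can override methods.
final class Content {
    // All fields are private and final.
    private final String name;
    private final List<String> tags;

    Content(String name, List<String> tags) {
        this.name = name;
        // Copy the mutable argument so later changes by the caller do not leak in,
        // then wrap the copy so nobody can change it afterwards.
        this.tags = Collections.unmodifiableList(new ArrayList<>(tags));
    }

    // Only getters, no setters.
    String getName() { return name; }

    // The list is an unmodifiable private copy, so it is safe to hand out.
    List<String> getTags() { return tags; }
}
```

Calling add on the list returned by getTags() throws UnsupportedOperationException, and changing the list that was passed to the constructor has no effect on the Content.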
Java does not have something like that. There are some language elements that can help with such requirements:
Enums that have only one constant; to be used as "built-in" singletons
Methods in Collections to create immutable copies of collections
And of course, you can make all fields in your class final; so they get initialized only during construction time; to prevent later changes
But as Java is also missing a const concept, you can only partially work around such things. For example:
class Foo {
private final List<Bar> bars = new ArrayList<>();
}
doesn't mean that instances of Foo will be immutable - as you still can add/remove elements to that list owned by Foo.
Similar; given
List<Foo> root = ...
List<Foo> immutableCopy = Collections.unmodifiableList(root);
one can still change that immutableCopy ... by messing up root.
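To make the difference concrete, here is a small sketch contrasting a read-only view with a read-only snapshot. The class and method names are made up for the example:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class ViewVersusCopy {
    // Read-only view: rejects writes, but still reflects changes made to the backing list.
    static List<String> view(List<String> root) {
        return Collections.unmodifiableList(root);
    }

    // Read-only snapshot: copies first, so later changes to root are not visible.
    static List<String> snapshot(List<String> root) {
        return Collections.unmodifiableList(new ArrayList<>(root));
    }
}
```

If an element is added to root after both calls, the view grows with it while the snapshot keeps its original size.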
Related
Let's say I have a simple class that stores a user's friends in an ArrayList of strings, with a getter to access that ArrayList:
public class User
{
private ArrayList<String> mFriends;
// ...other code like constructors and setters...
public ArrayList<String> getFriends()
{
return mFriends;
}
}
Since Java and many other languages are (equivalently) pass-by-reference, does this not allow the caller of getFriends() direct access to my ArrayList of strings i.e. they could modify it without even calling my setter?
I feel this immensely breaks the concept of encapsulation. Is there a normal workaround for this, besides dynamically creating a new ArrayList with the same values and returning that, or am I misunderstanding something?
Edit: I understand that Java is not truly pass-by-reference but rather passes over a copy of the address of the original object, but this has the same effect as pass-by-reference when exposing objects to those outside your class
Java does not implement pass-by-reference and never has; it has always been pass-by-value;
The problem you are describing is known as Reference Escape, and yes, you are right, caller can modify your object, if you expose it via reference;
In order to avoid the Reference Escape problem, you can either:
return a copy of the object (e.g. with .clone(); note that the default Object.clone() is shallow, so a deep copy needs extra work);
create a new object with the existing data (e.g. new ArrayList<>(yourObjectHere));
or come up with some other idea, there are some other ways to do this;
This does not really break the Encapsulation, per se, it is rather a point of correct design how you implement the encapsulation;
Your concern about performance: no, it is not going to break performance; moreover, it has nothing to do with performance. Rather, it is a matter of proper OOP design, deciding whether the object should be mutable or immutable. If you always returned a deep copy instead of a reference, you would lose the ability to make good use of the object itself.
Taking your example: what if you want to change the state of the object without just setting a new object via its setter? what if you want to amend the existing friends (which is in your example)? do you think it is rather better to create a new List of friends and set it into the object? no, you are simply losing control over your object in the latter case.
If you are worried about encapsulation then you can return a copy of your list e.g.
public ArrayList<String> getFriends() {
return new ArrayList<>(mFriends);
}
By the way, Java is not pass-by-reference; it is pass-by-value (for objects, the reference itself is passed by value).
You are right for mutable objects. You could wrap the field with Collections.unmodifiableList and the like. Another common approach for (mutable) collections is to have no getter for the collection at all, only methods such as addFriend and getFriend(index).
In fact, getters (and especially setters) are no longer a very esteemed pattern.
The member "m" prefix is imho better suited for other languages.
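A sketch of the two suggestions above applied to the User class from the question: the list is never handed out directly, changes go through addFriend, and getFriends returns a read-only view. The validation rule and friendCount are assumptions added for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class User {
    private final List<String> friends = new ArrayList<>();

    // All modifications go through the class, so it can validate them.
    void addFriend(String name) {
        if (name == null || name.isEmpty()) {
            throw new IllegalArgumentException("name must not be empty");
        }
        friends.add(name);
    }

    String getFriend(int index) { return friends.get(index); }

    int friendCount() { return friends.size(); }

    // Read-only view: callers can iterate it but cannot add or remove entries.
    List<String> getFriends() { return Collections.unmodifiableList(friends); }
}
```

Because String is immutable, the view is all that is needed here; with mutable elements the elements themselves would still be exposed.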
I read Effective Java, and it says:
If a class cannot be made immutable, limit its mutability as much as
possible...
and
...make every field final unless there is a compelling reason to make it
nonfinal.
So do I always need to make all my POJO classes (for example, a simple Book class with ID, Title and Author fields) immutable? And when I want to change the state of an object (for example, when the user changes it in a table that shows many Books), should I use a method like this instead of setters:
public Book changeAuthor(String author) {
return new Book(this.id, this.title, author); //Book constructor is private
}
But I think this is really not a good idea.
Please explain when to make a class immutable.
No, you don't always need to make your POJOs immutable. As you said, sometimes it can be a bad idea. If your object has attributes that will change over time, a setter is the most convenient way to do it.
But you should consider making your object immutable. It will help you find errors, program more clearly, and deal with concurrency.
But I think your quotes say everything:
If a class cannot be made immutable, limit its mutability as much as
possible...
and
...make every field final unless there is a compelling reason to make
it nonfinal.
That's what you should do. Unless it's not possible, because you have a setter. But then be aware of concurrency.
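One possible reading of "limit its mutability as much as possible" for the Book example, assuming that only the author is expected to change (the field types are guesses, the question does not give them):

```java
// Hypothetical Book: identity fields are final, only author can change.
class Book {
    private final long id;
    private final String title;
    private String author; // non-final: the one field expected to change

    Book(long id, String title, String author) {
        this.id = id;
        this.title = title;
        this.author = author;
    }

    long getId() { return id; }
    String getTitle() { return title; }
    String getAuthor() { return author; }

    // The only mutator; not thread-safe without external synchronization.
    void setAuthor(String author) { this.author = author; }
}
```

If Book objects are shared between threads, access to author would still need synchronization or a volatile field.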
In the OOP world we have state. State is all the properties of your object. Returning a new object whenever you change the state guarantees that your application will work correctly in a concurrent environment without special mechanisms (synchronized, locks, atomics, etc.). But you always create a new object.
Imagine that your object contains 100 properties or, more realistically, a collection with 100 elements. To follow the idea of immutability you need to copy this collection as well. That is considerable memory overhead, although the GC may handle it. In most situations it is better to manage the object's state manually than to make the object immutable. In some hard cases it is better to return a copy, if the concurrency problems are very hard. It depends on the task. There is no silver bullet.
1. A POJO is one which has private Instance Variables with Getter and Setter methods.
2. Classes like String, which need constant behavior/implementation at all times, need to be final, not the ones which need to change over time.
3. For making a class immutable, final is not the only solution. One can have private instance variables with only getter methods, with their state set in the constructor.
4. Now, depending on your design, decide which fields need to be constant throughout the program; if you feel that certain fields should be immutable, make them final.
5. The Java compiler uses a mechanism called constant folding to pre-calculate compile-time constant expressions.
In the book Java Concurrency In Practice it explains the advantages of "effectively immutable" objects versus mutable objects concurrency-wise. But it does not explain what advantage "effectively immutable" objects would offer over really immutable objects.
And I don't get it: can't you always build a really immutable object at the moment you'd decide to publish safely an "effectively immutable" object? (instead of doing your "safe publication" you'd build a really immutable object and that's it)
When I'm designing classes I fail to see cases where I couldn't always build a truly immutable object (using delegation if needed etc. to build other wrapped objects, themselves truly immutable of course) at the moment I'd decide to "safely publish".
So are "effectively immutable" object and their "safe publication" just a case of bad design or poor APIs?
Where would you be forced to use an effectively immutable object and be forced to safely publish it where you couldn't build a much superior really immutable object?
Yes, they make sense in some cases. An easy example is when you want some property to be generated lazily and cached so you can avoid the overhead of generating it if it's never accessed. String is an example of an effectively immutable class that does this (with its hashcode).
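A minimal sketch of that lazy-caching idea, modeled loosely on how String caches its hash code. The Name class and its fields are invented for the example; only the caching pattern is the point.

```java
// Hypothetical value class that caches its hash code lazily.
final class Name {
    private final String first;
    private final String last;
    private int hash; // non-final cache; 0 means "not computed yet"

    Name(String first, String last) {
        this.first = first;
        this.last = last;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof Name)) return false;
        Name other = (Name) o;
        return first.equals(other.first) && last.equals(other.last);
    }

    @Override
    public int hashCode() {
        int h = hash;            // read the field once into a local
        if (h == 0) {
            h = 31 * first.hashCode() + last.hashCode();
            hash = h;            // racy but benign: every thread computes the same value
        }
        return h;
    }
}
```

The class has a non-final field, so it is not immutable in the strict "all fields final" sense, yet no caller can observe a change in its state.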
For circular immutables:
class Foo
{
final Object param;
final Foo other;
Foo(Object param, Foo other)
{
this.param = param;
this.other = other;
}
// create a pair of Foo's, A=this, B=other
Foo(Object paramA, Object paramB)
{
this.param = paramA;
this.other = new Foo(paramB, this);
}
Foo getOther(){ return other; }
}
// usage
Foo fooA = new Foo(paramA, paramB);
Foo fooB = fooA.getOther();
// publish fooA/fooB (unsafely)
A question is, since this of fooA is leaked inside constructor, is fooA still a thread safe immutable? That is, if another thread reads fooB.getOther().param, is it guaranteed to see paramA? The answer is yes, since this is not leaked to another thread before the freeze action; we can establish hb/dc/mc orders required by spec to prove that paramA is the only visible value for the read.
Back to your original question. In practice there are always constraints beyond the purely technical ones. Initializing everything inside the constructor is not necessarily the best option for a design, considering all the engineering, operational, political and other human-ish reasons.
Ever wondered why we are fed the idea that it is a great, supreme idea?
The deeper problem is that Java lacks a general cheap fence for safe publication that is cheaper than volatile. Java only has it for final fields; for some reason, that fence is not available otherwise.
Now final carries two independent meanings: 1st, that a final field must be assigned exactly once; 2nd, the memory semantics of safe publication. These two meanings have nothing to do with each other. It is quite confusing to bundle them together. When people need the 2nd meaning, they are forced to accept the 1st meaning too. When the 1st is very inconvenient to achieve in a design, people wonder what they have done wrong - not realizing that it's Java that did wrong.
Bundling of two meanings under one final makes it double plus good, so that apparently we have more reason and motivation to use final. The more sinister story is actually we are forced to use it because we are not given a more flexible choice.
Using effectively immutable objects lets you avoid creating a considerable number of classes. Instead of making pairs of [mutable builder]/[immutable object] classes, you can build one effectively immutable class. I usually define an immutable interface, and a mutable class that implements this interface. An object is configured through its mutable class methods, and then published through its immutable interface. As long as the clients of your library program to the interface, to them your objects remain immutable through their published lifetime.
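The interface-plus-mutable-class approach described above might look something like this. Settings, MutableSettings and SettingsFactory are hypothetical names, and the host/port values are placeholders:

```java
// Read-only interface that clients program against.
interface Settings {
    String host();
    int port();
}

// Mutable implementation used only while the object is being configured.
final class MutableSettings implements Settings {
    private String host = "localhost";
    private int port = 80;

    MutableSettings host(String host) { this.host = host; return this; }
    MutableSettings port(int port) { this.port = port; return this; }

    @Override public String host() { return host; }
    @Override public int port() { return port; }
}

class SettingsFactory {
    // Configure through the mutable class, publish through the read-only interface.
    static Settings create() {
        return new MutableSettings().host("example.org").port(8080);
    }
}
```

This relies on clients not casting back to MutableSettings, and the object still has to be safely published before other threads read it, since its fields are not final.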
Suppose one has an immutable class Foo with five properties, named Alpha, Beta, etc., and one wishes to provide WithAlpha, WithBeta, etc. methods which will return an instance which is identical to the original except with the particular property changed. If the class is truly and deeply immutable, the methods have to take the form:
Foo WithAlpha(string newAlpha)
{
return new Foo(newAlpha, Beta, Gamma, Delta, Epsilon);
}
Foo WithBeta(string newBeta)
{
return new Foo(Alpha, newBeta, Gamma, Delta, Epsilon);
}
Ick. A massive violation of "Don't Repeat Yourself" (DRY) principles. Further, adding a new property to the class would require adding it to every single one of those methods.
On the other hand, if each Foo held an internal FooGuts which included a copy constructor, one could instead do something like:
Foo WithAlpha(string newAlpha)
{
FooGuts newGuts = new FooGuts(Guts); // Guts is a private or protected field
newGuts.Alpha = newAlpha;
return new Foo(newGuts); // Private or protected constructor
}
The number of lines of code for each method has increased, but the methods no longer need to make any reference to any properties they aren't "interested" in. Note that while a Foo might not be immutable if its constructor were called with a FooGuts to which any outside reference existed, its constructor is only accessible to code which is trusted not to maintain any such reference after construction.
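For readers who want to try the FooGuts idea in Java, here is a cut-down sketch with three properties instead of five. Java naming conventions are used (withAlpha rather than WithAlpha), and the getters are added only so the result can be inspected.

```java
// Mutable holder for all properties; never exposed outside Foo.
final class FooGuts {
    String alpha;
    String beta;
    String gamma;

    FooGuts() { }

    // Copy constructor: the only place that has to list every property.
    FooGuts(FooGuts other) {
        this.alpha = other.alpha;
        this.beta = other.beta;
        this.gamma = other.gamma;
    }
}

final class Foo {
    private final FooGuts guts;

    Foo(String alpha, String beta, String gamma) {
        FooGuts g = new FooGuts();
        g.alpha = alpha;
        g.beta = beta;
        g.gamma = gamma;
        this.guts = g;
    }

    // Trusted constructor: callers must not keep a reference to guts.
    private Foo(FooGuts guts) { this.guts = guts; }

    String getAlpha() { return guts.alpha; }
    String getBeta() { return guts.beta; }
    String getGamma() { return guts.gamma; }

    Foo withAlpha(String newAlpha) {
        FooGuts newGuts = new FooGuts(guts); // copy everything
        newGuts.alpha = newAlpha;            // change the one property of interest
        return new Foo(newGuts);
    }

    Foo withBeta(String newBeta) {
        FooGuts newGuts = new FooGuts(guts);
        newGuts.beta = newBeta;
        return new Foo(newGuts);
    }
}
```

Adding a fourth property would touch FooGuts and the public constructor, but not the existing with-methods.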
In C++ a getter & setter for a private data member is very useful due to the ability to control mutability via a const return value.
In Java, if I understand correctly (please correct me if I am mistaken), specifying final on a getter doesn't work that way. Once the caller received the data member reference through the getter, it can modify it, despite it being private...
If that's the case (and please correct me if I have a gross misconception here), why not declare the data member public and simplify things?
Making immutable return values in java is a matter of either returning already immutable objects types (such as String) or returning a copy for non-immutable objects.
Sample 1 - Already immutable object
public String getValue() {
return value;
}
Sample 2 - Collection of already immutable objects
public List<String> getValues() {
return new ArrayList<String>(values);
}
Sample 3 - Non-immutable object
public Complex getComplex() {
return complex.clone();
}
Sample 4 - Collection of non-immutable objects
public List<Complex> getComplex() {
List<Complex> copy = new ArrayList<Complex>(complexs.size());
for (Complex c : complexs)
copy.add(c.clone());
return copy;
}
Samples 3 and 4 assume, for convenience, that the Complex type implements the Cloneable interface.
Furthermore, to avoid subclasses overriding your immutable methods you can declare them final. As a side note, the builder pattern is typically useful for constructing immutable objects.
If you want your class to be immutable (i.e. having only final fields and getters) you must be sure that the values you return are immutable as well. You get this for free when returning Strings and built-in primitives, however some extra steps are necessary for other data types:
wrap collections with immutable decorators or defensively copy them before returning from a getter
make a copy of Date and Calendar
Only return immutable objects or defensively clone them. This also applies to objects in collections.
Note that if you defensively copy a collection, the client can view or modify the copy, but this does not affect the original collection:
return new ArrayList<Foo>(foos);
On the other hand if you wrap the original collection, the client is able to see all the changes that were introduced to the collection after the wrapper was created, but trying to change the contents of the wrapper will result in runtime exception:
return Collections.unmodifiableList(foos);
The bottom line is: Foo has to be immutable as well, otherwise the collection is immutable, but the client code can still modify members of the collection. So the same rules apply to Foo.
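A short sketch of that last point, using a deliberately mutable Foo with a single hypothetical value field: the wrapper protects the list structure, but not the elements.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical mutable element type.
class Foo {
    int value;
    Foo(int value) { this.value = value; }
}

class Holder {
    private final List<Foo> foos = new ArrayList<>();

    Holder() { foos.add(new Foo(1)); }

    // Structure is protected, but each Foo is still shared with the caller.
    List<Foo> getFoos() { return Collections.unmodifiableList(foos); }

    int firstValue() { return foos.get(0).value; }
}
```

Calling add on the returned list throws UnsupportedOperationException, but holder.getFoos().get(0).value = 99 silently changes the Holder's own element.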
If that's the case (and please correct me if I have a gross misconception here), why not declare the data member public and simplify things?
Because:
you might wish to store mutable data inside an object and only provide immutable (read-only) view of the data (like wrapping collections)
you can change the implementation in the future, get rid of the field and for instance compute the value on the fly.
If you want to return an immutable view of a mutable standard container (eg list), then you should take a look at the Collections library:
http://download.oracle.com/javase/1.4.2/docs/api/java/util/Collections.html
It provides some useful wrappers such as unmodifiableMap and unmodifiableList. That way you don't have to make a wasteful copy. Of course, if the elements of the list are mutable, then this won't help as much -- there's no easy way in Java to get "deep" immutability. Of course, the same is true in C++ -- e.g., if you have a const vector of pointers to Foo objects, then the Foo objects themselves can still be modified (because const doesn't propagate across pointers).
If that's the case (and please correct me if I have a gross misconception here), why not declare the data member public and simplify things?
First of all, the JavaBeans spec. requires you to provide getters (and setters for mutable properties).
Second, getters might enable you to add some logic, e.g. one getter might actually decide what to return (e.g. if the property is null return something different). If you didn't have getters in the first place you'd have more trouble adding such logic later on. With getters you'd just change the method without touching the callers.
why not declare the data member public and simplify things?
Because information hiding makes it easier to manage and maintain a complex codebase. If the data members are private, you can change representation and behavior in one class, rather than throughout a large codebase.
Once the caller received the data member reference through the getter, it can modify it, despite it being private...
To clarify, a caller cannot modify a data member returned from a getter. It might be able to modify an object to which the data member points.
If this is a problem, and you're providing access through a getter, you can return an immutable instance, or a defensive copy.
The setter is also valuable for controlling modification to a referenced object. You can make a defensive copy in the setter.
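As an illustration of defensive copies in both the setter and the getter, here is a sketch with java.util.Date, which is mutable. The Event class and its start field are made up for the example.

```java
import java.util.Date;

class Event {
    private Date start = new Date(0L);

    // Copy on the way in: later changes to the caller's Date do not affect this object.
    void setStart(Date start) {
        this.start = new Date(start.getTime());
    }

    // Copy on the way out: the caller cannot reach the internal Date.
    Date getStart() {
        return new Date(start.getTime());
    }
}
```

With an immutable type such as java.time.Instant neither copy would be needed.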
You often read about immutable objects requiring final fields to be immutable in Java. Is this in fact the case, or is it simply enough to have no public mutability and not actually mutate the state?
For example, if you have an immutable object built by the builder pattern, you could do it by having the builder assign the individual fields as it builds, or having the builder hold the fields itself and ultimately return the immutable object by passing the values to its (private) constructor.
Having the fields final has the obvious advantage of preventing implementation errors (such as allowing code to retain a reference to the builder and "building" the object multiple times while in fact mutating an existing object), but having the Builder store its data inside the object as it is built would seem to be DRYer.
So the question is: Assuming the Builder does not leak the Object early and stops itself from modifying the object once built (say by setting its reference to the object as null) is there actually anything gained (such as improved thread safety) in the "immutability" of the object if the object's fields were made final instead?
Yes, you do get "thread safety" from final fields. That is, the value assigned to a final field during construction is guaranteed to be visible to all threads. The other alternative for thread safety is to declare the fields volatile, but then you are incurring a high overhead with every read… and confusing anyone who looks at your class and wonders why the fields of this "immutable" class are marked "volatile."
Marking the fields final is the most correct technically, and conveys your intent most clearly. Unfortunately, it does make the builder pattern very cumbersome. I think it should be possible to create an annotation processor to synthesize a builder for an immutable class, much like Project Lombok does with setters and getters. The real work would be the IDE support needed so that you could code against the builders that don't really exist.
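For comparison, a hand-written version of the "builder holds the fields and passes them to a private constructor" variant could look like this. Point and its x/y fields are hypothetical; it is the boilerplate being called cumbersome, not a recommendation for a particular tool.

```java
// Immutable class with final fields; the builder holds the values until build().
final class Point {
    private final int x;
    private final int y;

    private Point(Builder b) {
        this.x = b.x;
        this.y = b.y;
    }

    int getX() { return x; }
    int getY() { return y; }

    static final class Builder {
        private int x;
        private int y;

        Builder x(int x) { this.x = x; return this; }
        Builder y(int y) { this.y = y; return this; }

        // Each call creates a fresh object, so reusing the builder cannot mutate earlier results.
        Point build() { return new Point(this); }
    }
}
```

Every field appears twice (once in Point, once in Builder), which is the DRY cost the question is worried about; in exchange, a retained builder cannot alter a Point that was already built.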
An Object can certainly have mutable private fields and still work as an immutable object. All that matters to meet the contract of immutability is that the object appears immutable from the outside. An object with non-final private fields but no setters would for example satisfy this requirement.
In fact, if your encapsulation is right then you can actually mutate the internal state and still operate successfully as an "immutable" object. An example might be some sort of lazy evaluation or caching of data structures.
Clojure for example does this in its internal implementation of lazy sequences, these objects behave as if they are immutable but only actually calculate and store future values when they are directly requested. Any subsequent request retrieves the stored value.
However - I would add as a caveat that the number of places where you would actually want to mutate the internals of an immutable object are probably quite rare. If in doubt, make them final.
I think you would just need to consider the environment it's running in and decide if frameworks that use reflection to manipulate objects are a hazard.
One could easily cook up an oddball scenario where a supposedly immutable object gets clobbered via a POST injection attack because of a web binding framework that's configured to use reflection instead of bean setters.
You definitely can have an immutable object with non-final fields.
For example see java 1.6 implementation of java.lang.String.
Comment:
@erickson
Like that:
class X { volatile int i, j; }
X y;
// thread A:
X x = new X();
x.i = 1;
x.j = 2;
y = x;
// thread B:
if (y != null) {
a = y.i;
b = y.j;
}
?