Defensive copy: should it be specified in the Javadoc?

As far as I understand, getters and setters should always make copies in order to protect the data.
However, for many of my classes, it is safe to have the getter return a reference to the property asked for, so that the following code
b = a.getB();
b.setC(someValue);
actually changes the state of object a. If I can prove that this is OK for my class, is it good practice to implement the getter this way? Should the user then be notified of this, for example in the Javadoc? I think that this would break the implementation-hiding paradigm, so should I instead always assume that the state of a did not change, and make an explicit call to the setter, as follows?
b = a.getB();
b.setC(someValue);
a.setB(b);
Thanks in advance
S

There's a good argument in your above example that since A is maintaining a reference to B, A should look after B, and not hand it out but manipulate it on your behalf. Otherwise you can argue that you're breaking encapsulation (since A reveals it has a reference to B), and ideally objects should do things for you, rather than export their contents such that you can manipulate them.
Having said all that, the above is certainly not an uncommon practice, and it is often a pragmatic choice.
When you expose an object via get(), you have three options:
expose the actual object
make a defensive copy
expose an object that wraps the original, but prohibits modification. e.g. you can wrap the original object in a restricted interface. See (for example) Collections.unmodifiableCollection() which wraps the original collection (and doesn't copy it) but provides an interface that doesn't permit modification.
Whatever you do, you should document it in the interface (and hence in the Javadoc). Otherwise you're at liberty to change it later, and dependent code can easily break.
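The three options above might look like this in one class, each with its contract spelled out in the Javadoc. This is only a minimal sketch: the Roster class and its method names are hypothetical and do not come from the question.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical class illustrating three ways to expose internal state.
class Roster {
    private final List<String> names = new ArrayList<>();

    void add(String name) {
        names.add(name);
    }

    /**
     * Returns the live internal list. Changes made by the caller
     * change the state of this Roster.
     */
    List<String> getNamesLive() {
        return names;
    }

    /**
     * Returns a defensive copy. Changes made by the caller do not
     * affect this Roster, and later changes to this Roster do not
     * show up in the returned list.
     */
    List<String> getNamesCopy() {
        return new ArrayList<>(names);
    }

    /**
     * Returns a read-only view. Mutator calls on it throw
     * UnsupportedOperationException, but later changes to this
     * Roster remain visible through the view.
     */
    List<String> getNamesView() {
        return Collections.unmodifiableList(names);
    }
}
```

Whichever variant a real class offers, the Javadoc sentence is what lets callers rely on the behaviour.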

Well, the setC violates the Law of Demeter, so I don't think I'd call it a best practice. ("Law" is a bit strong - for instance, it's generally not applied to fluent interfaces.)
That said, getters should not always make copies IMHO. Doing a deep clone can be expensive. There are other options, such as immutable objects.
And, realistically, there are pragmatic considerations.
But I'd err on the side of TMI (too much information) in the JavaDoc.

A further option not yet mentioned is to expose the object via an immutable interface. Obviously this isn't fool-proof as the calling code could always downcast the object into the mutable version, but it avoids any overhead in wrapping the object or creating a copy.
I usually take this approach if I'm writing an API that I'm likely to use myself or within my programming team; i.e. where I know "clients" are going to be good citizens!
// Immutable interface definition.
public interface Record {
String getContent();
}
// Mutable implementation of Record interface.
public class MutableRecord implements Record {
private String content; // not final: setContent() reassigns it
public MutableRecord(String content) {
this.content = content;
}
public String getContent() {
return content;
}
public void setContent(String content) {
this.content = content;
}
}
// API that only exposes the object via its Record interface.
public class MyApi {
private final MutableRecord mutableRecord = new MutableRecord("example"); // placeholder initial value
public Record getRecord() {
return mutableRecord;
}
}

You will get copying wrong before you know it! And then you have a real bug, not a potential one.
Therefore, if you trust the client code, don't bother about it.
This is highly academic and I personally never had a problem with it. I also immediately turn the check off in FindBugs (Java) for example...
And unless there is a problem, who reads JavaDoc and the like anyway? Anybody out there?

Related

Don't getters returning objects allow callers direct access to member variables?

Let's say I have a simple class that stores a user's friends in an ArrayList of strings, with a getter to access that ArrayList:
public class User
{
private ArrayList<String> mFriends;
// ...other code like constructors and setters...
public ArrayList<String> getFriends()
{
return mFriends;
}
}
Since Java and many other languages are (equivalently) pass-by-reference, does this not allow the caller of getFriends() direct access to my ArrayList of strings i.e. they could modify it without even calling my setter?
I feel this immensely breaks the concept of encapsulation. Is there a normal workaround for this, besides dynamically creating a new ArrayList with the same values and returning that, or am I misunderstanding something?
Edit: I understand that Java is not truly pass-by-reference but rather passes over a copy of the address of the original object, but this has the same effect as pass-by-reference when exposing objects to those outside your class

Java does not implement pass-by-reference and never has; it has always been pass-by-value.
The problem you are describing is known as reference escape, and yes, you are right: a caller can modify your object if you hand out a reference to it.
In order to avoid the Reference Escape problem, you can either:
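return a copy of the object (e.g. with .clone(); note that ArrayList.clone() makes a shallow copy, which is sufficient here because the String elements are immutable);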
create a new object with the existing data (e.g. new ArrayList<>(yourObjectHere));
or come up with some other idea; there are other ways to do this.
This does not really break the Encapsulation, per se, it is rather a point of correct design how you implement the encapsulation;
As for your concern about performance: no, copying is not going to break performance. More to the point, this has little to do with performance; it is a matter of proper object-oriented design, of deciding whether the object is meant to be mutable or immutable. If you always returned a copy instead of the reference, callers could never work with the object's actual state.
Taking your example: what if you want to change the state of the object without replacing it wholesale via its setter? What if you want to amend the existing list of friends? Creating a new list of friends and setting it into the object is hardly better: in that case you are simply giving up control over your object.

If you are worried about encapsulation then you can return a copy of your list e.g.
public ArrayList<String> getFriends() {
return new ArrayList<>(mFriends);
}
By the way, Java is not pass-by-reference; it is pass-by-value (object references themselves are passed by value).

You are right as far as mutable objects are concerned. You could wrap the field with Collections.unmodifiableList and the like. Another common approach for (mutable) collections is to provide no getter for the collection at all, only methods such as addFriend and getFriend(index).
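A rough sketch of both ideas follows, assuming a simplified User class. The validation rule, friendCount() and friendsView() are illustrative additions, not part of the question's code.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class User {
    private final List<String> friends = new ArrayList<>();

    // Mutation goes through the owning class, so it can validate input.
    void addFriend(String name) {
        if (name == null || name.isEmpty()) {
            throw new IllegalArgumentException("name must be non-empty");
        }
        friends.add(name);
    }

    String getFriend(int index) {
        return friends.get(index);
    }

    int friendCount() {
        return friends.size();
    }

    // Read-only view for callers that need to iterate; no copy is made.
    List<String> friendsView() {
        return Collections.unmodifiableList(friends);
    }
}
```

Callers can read and iterate, but every change has to pass through addFriend.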
In fact, getters (and especially setters) are no longer a very esteemed pattern.
The member "m" prefix is imho better suited for other languages.

Does Java have a concept of reference ownership or noncopyable classes?

I would like to have some guarantee that instances of a particular class Content are only accessed by their "owner", and that if another object wants the same Content it needs to take a deep copy. Coming from C++ I would achieve that using a unique_ptr; is there anything similar in Java?
Currently I am resolving this by just keeping the Content private everywhere I keep one and paying attention to creating a new Content (the constructor implements the deep copy mechanism) on a getContent. But I have no means of enforcing possible other users of the Content class to follow the same pattern, it's easy to forget. It would be nicer if it could take care of itself somehow, like not being copyable.
I realize that it goes somewhat against the spirit of the language, but in some cases I think it's justified, for example if Content represents some stream of data that is modified even by reading it. I thought that, if not in the core language, maybe there is some @interface (annotation) for compile-time checking, or a way of creating one?
Edit: The idea is that the owner can modify the object freely, before or after taking copies, and if someone takes a deep copy, they can modify theirs (not affecting the original), so making the Content immutable is a bit too harsh (unless I'm misunderstanding what that implies).

There are a couple of common strategies here:
Privacy with defensive copying
In this strategy, you'd have the owner have a private reference to the content, and if it's appropriate for it to give out copies of that content, to do so via a defensive copy:
class Owner {
private Content content;
// ...unnecessary detail omitted...
public Content getContent() {
return new Content(this.content);
}
}
The Cloneable interface can sometimes be useful here.
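For the defensive copy to protect anything, the copy constructor of Content has to copy deeply enough. The sketch below assumes a hypothetical Content made of mutable StringBuilder parts. The fields and the helper methods addPart, appendToPart and render are invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical Content holding mutable state.
class Content {
    private final List<StringBuilder> parts = new ArrayList<>();

    Content() {
    }

    // Copy constructor: copies each part, so the new instance
    // shares no mutable state with the original.
    Content(Content other) {
        for (StringBuilder part : other.parts) {
            parts.add(new StringBuilder(part));
        }
    }

    void addPart(String text) {
        parts.add(new StringBuilder(text));
    }

    void appendToPart(int index, String suffix) {
        parts.get(index).append(suffix);
    }

    String render() {
        return String.join("|", parts);
    }
}

class Owner {
    private final Content content = new Content();

    void addPart(String text) {
        content.addPart(text);
    }

    // Hands out a deep copy; the caller may modify it freely.
    Content getContent() {
        return new Content(content);
    }

    String render() {
        return content.render();
    }
}
```

Copying only the list, and not the StringBuilder elements, would leave the two instances sharing their parts.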
Immutable objects
The other common strategy is to use immutable objects (e.g., ensure that Content, once instantiated, cannot be modified). Then you don't care who has a reference to the content, since they cannot change it.

No there isn't.
Once you have established a reference to an object, there's absolutely nothing you can do to stop someone from assigning another reference to that object via that established reference.
Java programmers get round this by making objects immutable (see java.lang.String). Then you ought not give two hoots about who else is referring to a particular instance.

You can declare the class Content as Immutable by doing this:
Don't provide "setter" methods — methods that modify fields or objects referred to by fields.
Make all fields final and private.
Don't allow subclasses to override methods. The simplest way to do this is to declare the class as final.
If the instance fields include references to mutable objects, don't allow those objects to be changed.
Here is the official Java documentation: https://docs.oracle.com/javase/tutorial/essential/concurrency/imstrat.html
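Applied to a small class, the four rules above could look roughly like this. The name ImmutableContent and its two fields are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// final: no subclass can override methods or add mutable behaviour.
final class ImmutableContent {
    // All fields are private and final, and there are no setters.
    private final String title;
    private final List<String> lines;

    ImmutableContent(String title, List<String> lines) {
        this.title = title;
        // Copy on the way in, then wrap: later changes to the caller's
        // list are not seen here, and nobody can modify our copy.
        this.lines = Collections.unmodifiableList(new ArrayList<>(lines));
    }

    String getTitle() {
        return title;
    }

    // Safe to hand out directly: the list is unmodifiable and
    // no other code holds a reference to the backing copy.
    List<String> getLines() {
        return lines;
    }
}
```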

Java does not have something like that. There are some language elements that can help with such requirements:
Enums that have only one constant; to be used as "built-in" singletons
Methods in Collections to create immutable copies of collections
And of course, you can make all fields in your class final; so they get initialized only during construction time; to prevent later changes
But as Java also lacks a const concept, you can only partially work around such things. For example,
class Foo {
private final List<Bar> bars = new ArrayList<>();
}
doesn't mean that instances of Foo will be immutable, as you can still add elements to, or remove them from, the list owned by Foo.
Similar; given
List<Foo> root = ...
List<Foo> immutableCopy = Collections.unmodifiableList(root);
one can still change that immutableCopy ... by messing up root.
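The difference between a read-only view and a real snapshot can be demonstrated in a few lines. This is a small sketch; the class name ViewVersusSnapshot and its two helper methods are made up for the demonstration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class ViewVersusSnapshot {
    // A view: read-only for its holder, but still backed by the source list.
    static <T> List<T> viewOf(List<T> source) {
        return Collections.unmodifiableList(source);
    }

    // A snapshot: copy first, then wrap, so the source is no longer involved.
    static <T> List<T> snapshotOf(List<T> source) {
        return Collections.unmodifiableList(new ArrayList<>(source));
    }

    public static void main(String[] args) {
        List<String> root = new ArrayList<>();
        root.add("a");

        List<String> view = viewOf(root);
        List<String> snapshot = snapshotOf(root);

        root.add("b");

        System.out.println(view.size());     // 2: the change to root shows through
        System.out.println(snapshot.size()); // 1: the snapshot is unaffected
    }
}
```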

Do effectively immutable objects make sense?

The book Java Concurrency in Practice explains the advantages, concurrency-wise, of "effectively immutable" objects over mutable objects. But it does not explain what advantage "effectively immutable" objects would offer over really immutable objects.
And I don't get it: can't you always build a really immutable object at the moment you decide to safely publish an "effectively immutable" object? (Instead of doing your "safe publication" you'd build a really immutable object and that's it.)
When I'm designing classes, I fail to see cases where I couldn't always build a truly immutable object (using delegation if needed to build other wrapped objects, themselves truly immutable of course) at the moment I decide to "safely publish".
So are "effectively immutable" objects and their "safe publication" just a case of bad design or poor APIs?
Where would you be forced to use an effectively immutable object and to publish it safely, with no way to build a much superior really immutable object instead?

Yes, they make sense in some cases. An easy example is when you want some property to be generated lazily and cached so you can avoid the overhead of generating it if it's never accessed. String is an example of an effectively immutable class that does this (with its hashcode).
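A sketch of that lazy-caching idiom follows, modelled loosely on how java.lang.String caches its hash code. The Name class is hypothetical. The visible state sits in final fields, while the cached value lives in a plain non-final field.

```java
import java.util.Objects;

// Hypothetical class: the observable state is fixed at construction,
// but a derived value is computed on first use and then cached.
final class Name {
    private final String first;
    private final String last;
    private int hash; // 0 means "not computed yet"

    Name(String first, String last) {
        this.first = Objects.requireNonNull(first);
        this.last = Objects.requireNonNull(last);
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof Name)) {
            return false;
        }
        Name other = (Name) o;
        return first.equals(other.first) && last.equals(other.last);
    }

    @Override
    public int hashCode() {
        int h = hash; // read the field once into a local
        if (h == 0) {
            h = 31 * first.hashCode() + last.hashCode();
            hash = h; // racy but harmless: every thread computes the same value
        }
        return h;
    }
}
```

The object is not immutable in the strict sense, because a field changes after construction, yet no caller can observe a difference.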

For circular immutables:
class Foo
{
final Object param;
final Foo other;
Foo(Object param, Foo other)
{
this.param = param;
this.other = other;
}
// create a pair of Foo's, A=this, B=other
Foo(Object paramA, Object paramB)
{
this.param = paramA;
this.other = new Foo(paramB, this);
}
Foo getOther(){ return other; }
}
// usage
Foo fooA = new Foo(paramA, paramB);
Foo fooB = fooA.getOther();
// publish fooA/fooB (unsafely)
A question is, since this of fooA is leaked inside constructor, is fooA still a thread safe immutable? That is, if another thread reads fooB.getOther().param, is it guaranteed to see paramA? The answer is yes, since this is not leaked to another thread before the freeze action; we can establish hb/dc/mc orders required by spec to prove that paramA is the only visible value for the read.
Back to your original question. In practice there are always constraints beyond the purely technical ones. Initializing everything inside the constructor is not necessarily the best option for a design, considering all the engineering, operational, political and other human-ish reasons.
Ever wondered why we are led to think that it is such a supremely great idea?
The deeper problem is that Java lacks a general fence for safe publication that is cheaper than volatile. Java only has it for final fields; for some reason, that fence is not available otherwise.
Now final carries two independent meanings: 1st, that a final field must be assigned exactly once; 2nd, the memory semantics of safe publication. These two meanings have nothing to do with each other. It is quite confusing to bundle them together. When people need the 2nd meaning, they are forced to accept the 1st meaning too. When the 1st is very inconvenient to achieve in a design, people wonder what they have done wrong - not realizing that it's Java that did wrong.
Bundling of two meanings under one final makes it double plus good, so that apparently we have more reason and motivation to use final. The more sinister story is actually we are forced to use it because we are not given a more flexible choice.

Using effectively immutable objects lets you avoid creating a considerable number of classes. Instead of making pairs of [mutable builder]/[immutable object] classes, you can build one effectively immutable class. I usually define an immutable interface, and a mutable class that implements this interface. An object is configured through its mutable class methods, and then published through its immutable interface. As long as the clients of your library program to the interface, to them your objects remain immutable through their published lifetime.

Suppose one has an immutable class Foo with five properties, named Alpha, Beta, etc., and one wishes to provide WithAlpha, WithBeta, etc. methods which will return an instance which is identical to the original except with the particular property changed. If the class is truly and deeply immutable, the methods have to take the form:
Foo WithAlpha(string newAlpha)
{
return new Foo(newAlpha, Beta, Gamma, Delta, Epsilon);
}
Foo WithBeta(string newBeta)
{
return new Foo(Alpha, newBeta, Gamma, Delta, Epsilon);
}
Ick. A massive violation of "Don't Repeat Yourself" (DRY) principles. Further, adding a new property to the class would require adding it to every single one of those methods.
On the other hand, if each Foo held an internal FooGuts which included a copy constructor, one could instead do something like:
Foo WithAlpha(string newAlpha)
{
FooGuts newGuts = new FooGuts(Guts); // Guts is a private or protected field
newGuts.Alpha = newAlpha;
return new Foo(newGuts); // Private or protected constructor
}
The number of lines of code for each method has increased, but the methods no longer need to make any reference to any properties they aren't "interested" in. Note that while a Foo might not be immutable if its constructor were called with a FooGuts to which any outside reference existed, its constructor is only accessible to code which is trusted not to maintain any such reference after construction.
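The snippets above read like C#. A Java rendering of the same idea might look like the sketch below, cut down to three properties for brevity. The nested Guts class, the lower-case method names and the getters are assumptions made for the example.

```java
// Hypothetical Java version of the "guts" pattern.
final class Foo {
    // Mutable holder, visible only inside Foo.
    private static final class Guts {
        String alpha;
        String beta;
        String gamma;

        Guts() {
        }

        // Copy constructor: the only place that lists every field.
        Guts(Guts other) {
            this.alpha = other.alpha;
            this.beta = other.beta;
            this.gamma = other.gamma;
        }
    }

    private final Guts guts;

    Foo(String alpha, String beta, String gamma) {
        Guts g = new Guts();
        g.alpha = alpha;
        g.beta = beta;
        g.gamma = gamma;
        this.guts = g;
    }

    // Private: only code in Foo may pass in a Guts, and it must not
    // keep a reference to it afterwards.
    private Foo(Guts guts) {
        this.guts = guts;
    }

    Foo withAlpha(String newAlpha) {
        Guts copy = new Guts(guts);
        copy.alpha = newAlpha;
        return new Foo(copy);
    }

    Foo withBeta(String newBeta) {
        Guts copy = new Guts(guts);
        copy.beta = newBeta;
        return new Foo(copy);
    }

    String getAlpha() { return guts.alpha; }
    String getBeta() { return guts.beta; }
    String getGamma() { return guts.gamma; }
}
```

Adding a fourth property would touch only Guts, its copy constructor and the public constructor; the existing with-methods stay unchanged.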

Shallow/deep-copy semantics in accessor/mutator

What's considered best practice when it comes to accessors/mutators and shallow/deep-copy? Or is this question specific to the situation at hand?
i.e.
public class Test {
private final Point point = new Point(-32, 168);
public Point getPoint() {
return point; // returns the internal reference, not a copy
}
}
My current thinking is to use deep-copy for anything mutable; therefore, if Point provided a setter, I would use deep-copy in Test#getPoint.
What's the status quo?
[Edit]
After JB Nizet's answer, I found this great resource.

It depends. Sometimes you want to return a reference to the object, and let the caller modify it. Sometimes you want to return a copy. In the case of a Point, a copy seems more appropriate, though.
Another way to protect the state of your class is to return an unmodifiable view of your field. This can be done by declaring the return type as an interface that only provides read-only methods, or, as in the Java collections api, by returning a wrapper object offering the same interface as the object but throwing exceptions when mutator methods are called.
Whatever the solution you choose, the key is to document what your method does.
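Both alternatives, a copy and a read-only interface, are sketched below. The Point class here is a hypothetical stand-in for whichever Point the question uses, ReadablePoint is an invented interface, and PointHolder plays the role of the question's Test class.

```java
// Hypothetical read-only interface: getters only.
interface ReadablePoint {
    int getX();
    int getY();
}

// Hypothetical mutable point with a copy constructor.
class Point implements ReadablePoint {
    private int x;
    private int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    Point(Point other) {
        this(other.x, other.y);
    }

    public int getX() { return x; }
    public int getY() { return y; }
    void setX(int x) { this.x = x; }
    void setY(int y) { this.y = y; }
}

class PointHolder {
    private final Point point = new Point(-32, 168);

    /** Returns a copy; changes to it do not affect this object. */
    Point getPoint() {
        return new Point(point);
    }

    /**
     * Returns the internal point behind a read-only interface.
     * No copy is made, so a caller that downcasts can still mutate it.
     */
    ReadablePoint viewPoint() {
        return point;
    }
}
```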

What have you used Object.clone() for?

A colleague recently asked me how to deep-clone a Map and I realized that I probably have never used the clone() method- which worries me.
What are the most common scenarios you have found where you need to clone an object?

I assume you are referring to Object.clone() in Java. If so, be advised that Object.clone() has some major problems, and its use is discouraged in most cases. Please see Item 11 of "Effective Java" by Joshua Bloch (second edition; it is Item 13 in the third edition) for a complete answer. I believe you can safely use Object.clone() on arrays of primitive types, but apart from that you need to be judicious about using and overriding clone. You are probably better off defining a copy constructor or a static factory method that explicitly copies the object according to your semantics.
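A copy constructor and a static factory can be sketched as follows, with clone() used only on a primitive array. The Samples class and its members are hypothetical.

```java
// Hypothetical class that is copied without implementing Cloneable.
final class Samples {
    private final String label;
    private final int[] values;

    Samples(String label, int[] values) {
        this.label = label;
        // clone() on a primitive array yields an independent array.
        this.values = values.clone();
    }

    // Copy constructor: states the copy semantics explicitly.
    Samples(Samples other) {
        this(other.label, other.values);
    }

    // Static factory alternative with a descriptive name.
    static Samples copyOf(Samples other) {
        return new Samples(other);
    }

    void set(int index, int value) {
        values[index] = value;
    }

    int get(int index) {
        return values[index];
    }
}
```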

Most commonly, when I have to return a mutable object to a caller that I'm worried the caller might muck with, often in a thread-unfriendly way. Lists and Date are the ones that I do this to most. If the caller is likely to want to iterate over a List and I've got threads possibly updating it, it's safer to return a clone or copy of it.
Actually, that brings up something I'm going to have to open up another question for: copy constructors or clone? When I did C++, we ALWAYS did a copy constructor and implemented clone with it, but FindBugs does not like it if you implement your clone using a copy constructor.

When I need to make a duplicate of something to modify the duplicate without impacting the original, and of course in this scenario deep cloning (only) will suffice. I've had to do this on a system where I would clone a domain class instance, apply the user's changes, and then perform a comparison of the two for the user to verify their changes.

The Object.clone() method doesn't specify whether the copy of a subclass is a deep or shallow copy, it's completely dependent of the specific class. The Object.clone() method itself does a shallow copy (copies internal state of the Object class), but subclasses must override it, call super.clone(), and copy their internal state as needed (shallow or deep).
It does specify some conventions, which you may or may not follow. For (a.getClass() == a.clone().getClass()) to return true, super.clone() should be called instead of simply 'new Subclass()', since super.clone() instantiates the correct runtime class of this object (even in subclasses) and copies all internal state, including private fields, which subclasses could not copy through a copy constructor due to visibility rules. Otherwise you would be forced to expose a constructor that, for better encapsulation, shouldn't be exposed.
Example:
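import java.util.Calendar;
import java.util.Date;
// simple clone
class A implements Cloneable {
private int value;
@Override
public A clone() {
try {
// Object.clone() already copies every field, including value.
return (A) super.clone();
} catch (CloneNotSupportedException ex) {
// Cannot happen: A implements Cloneable.
throw new AssertionError(ex);
}
}
}
// clone with deep and shallow copying
class B extends A {
Calendar calendar;
Date date;
@Override
public B clone() {
B copy = (B) super.clone();
// clones the object (guarding against an unset field)
copy.calendar = (this.calendar == null) ? null : (Calendar) this.calendar.clone();
copy.date = this.date; // copies the reference
return copy;
}
}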
Deep copy is usually used when dependent objects are mutable (like Calendar), and the copy must be completely independent of the original.
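When dependent objects are immutable, or are treated as immutable by convention, sharing the same instance usually isn't an issue, and a shallow copy may be sufficient. (Note that java.util.Date is in fact mutable, so sharing it as in the example is only safe as long as nobody modifies it.)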
When using Object.clone() you must follow some rules, but they are simple enough to be understandable. Probably the most difficult part is correctly defining how deep you should copy into your object graph. A logical issue, not a language issue, that is.

I have used Object.clone() in a Spring webflow application to check what has changed when a user edits / enters data on a form for auditing purposes.
At the beginning of the flow, I call the clone method which was implemented on the form backing object used in the spring webflow and save the instance of the clone to the user session. Once the user has completed editing data on the html form and pressed the save button I compare the new values bound to the backing object to the cloned value to determine what data the user has changed.
This worked well and was really easy to implement, I haven't really experienced any issues with cloning in Java.
