Shallow/deep-copy semantics in accessor/mutator - java

What's considered best practice when it comes to accessors/mutators and shallow/deep-copy? Or is this question specific to the situation at hand?
i.e.
public class Test {
    private final Point point = new Point(-32, 168);
    public Point getPoint() {
        return point;
    }
}
My current thinking is to use deep-copy for anything mutable; therefore, if Point provided a setter, I would use deep-copy in Test#getPoint.
What's the status quo?
[Edit]
After JB Nizet's answer, I found this great resource.

It depends. Sometimes you want to return a reference to the object, and let the caller modify it. Sometimes you want to return a copy. In the case of a Point, a copy seems more appropriate, though.
Another way to protect the state of your class is to return an unmodifiable view of your field. This can be done by declaring the return type as an interface that only provides read-only methods, or, as in the Java Collections API, by returning a wrapper object that offers the same interface as the original but throws exceptions when mutator methods are called.
Whatever the solution you choose, the key is to document what your method does.
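A minimal sketch of the "return a copy" option for the question's example. The question does not say which Point class it uses, so the Point below (with its copy constructor) and the holder class name PointHolder are assumptions made for illustration only.

```java
// Hypothetical mutable Point; not the Point class from the question.
final class Point {
    private int x;
    private int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    // Copy constructor: duplicates the state of another Point.
    Point(Point other) {
        this(other.x, other.y);
    }

    int getX() { return x; }
    int getY() { return y; }
    void setX(int x) { this.x = x; }
    void setY(int y) { this.y = y; }
}

// Hypothetical stand-in for the question's Test class.
class PointHolder {
    private final Point point = new Point(-32, 168);

    /** Returns a copy of the point; changes to the copy do not affect this object. */
    Point getPoint() {
        return new Point(point);
    }
}
```

Because Point only holds primitives, a field-by-field copy is already a deep copy here. For objects that reference other mutable objects, the copy would have to go further.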

Related

Don't getters returning objects allow callers direct access to member variables?

Let's say I have a simple class that stores a user's friends in an ArrayList of strings, with a getter to access that ArrayList:
public class User
{
private ArrayList<String> mFriends;
// ...other code like constructors and setters...
public ArrayList<String> getFriends()
{
return mFriends;
}
}
Since Java and many other languages are (equivalently) pass-by-reference, does this not allow the caller of getFriends() direct access to my ArrayList of strings i.e. they could modify it without even calling my setter?
I feel this immensely breaks the concept of encapsulation. Is there a normal workaround for this, besides dynamically creating a new ArrayList with the same values and returning that, or am I misunderstanding something?
Edit: I understand that Java is not truly pass-by-reference but rather passes over a copy of the address of the original object, but this has the same effect as pass-by-reference when exposing objects to those outside your class
Java does not implement, and has never implemented, pass-by-reference; it has always been pass-by-value.
The problem you are describing is known as reference escape, and yes, you are right: the caller can modify your object if you expose it via a reference.
In order to avoid the Reference Escape problem, you can either:
return a copy of the object (e.g. with .clone(); note that ArrayList.clone() is a shallow copy, which is enough when the elements are immutable, as Strings are);
create a new object with the existing data (e.g. new ArrayList<>(yourObjectHere));
or use some other approach; there are other ways to do this.
This does not really break the Encapsulation, per se, it is rather a point of correct design how you implement the encapsulation;
As for performance: no, copying is not going to ruin performance; more to the point, this has nothing to do with performance. It is a matter of proper object-oriented design, which mandates either mutability or immutability of the object. If you always returned a deep copy instead of a reference, callers could never work with the object's actual state.
Taking your example: what if you want to change the state of the object without setting a whole new object via its setter? What if you want to amend the existing list of friends (as in your example)? Would it really be better to create a new List of friends and set it on the object? No; in that case you are simply losing control over your object.
If you are worried about encapsulation then you can return a copy of your list e.g.
public ArrayList<String> getFriends() {
return new ArrayList<>(mFriends);
}
By the way, Java is not pass-by-reference; it is pass-by-value (object references themselves are passed by value).
You are right for mutable objects. You could wrap the field with Collections.unmodifiableList and the like. Another common approach for (mutable) collections is to provide no getter for the collection at all, only methods such as addFriend and getFriend(index).
In fact, getters (and especially setters) are no longer a very esteemed pattern.
The member "m" prefix is imho better suited for other languages.
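The last two suggestions can be combined, as in the sketch below: mutation goes through addFriend, and the getter hands out a read-only view. The validation rule in addFriend is an assumption added only to show why routing changes through the class is useful.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class User {
    private final List<String> friends = new ArrayList<>();

    // Mutation goes through the owning class, so it can validate the input.
    // (The non-empty rule is an illustrative assumption.)
    void addFriend(String name) {
        if (name == null || name.isEmpty()) {
            throw new IllegalArgumentException("name must not be empty");
        }
        friends.add(name);
    }

    // Read-only view: it reflects later changes made via addFriend,
    // but add/remove calls on it throw UnsupportedOperationException.
    List<String> getFriends() {
        return Collections.unmodifiableList(friends);
    }
}
```

The view is not a snapshot. A caller who needs a stable list must still copy it, for example with new ArrayList<>(user.getFriends()).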

How does one know if a certain method will change the state of a java object?

Some methods are mutator methods, the so-called setters, which usually return nothing. Others, like the plusDays() method of the LocalDate class, return a fully instantiated object of type LocalDate, so if you want to change the object, you need to point your existing variable at the newly created one.
Is there a way to know beforehand if a method will be a mutator, or work like the before-mentioned apart from looking at its return value?
No, there is no way to know (short of looking at the documentation or implementation) whether a method will change some sort of state.
Methods that return void are generally going to change some sort of state (otherwise what are they doing?), but there's still no guarantee what will change (options include the object, one of its fields, the method's parameters, global state, or even the JVM runtime itself).
There's no general-purpose way to tell whether methods that return something will also have other side-effects or not.
If a type is immutable you can be confident that none of its methods will mutate its own state, but then the question has simply shifted to "how do you tell whether a type is immutable or not?" This is easier to answer, but still tricky. Static analysis tools like Error Prone's @Immutable check are helpful but still fallible.
Well, pure setters which follow the pattern void setProperty(PropertyType property) are likely to modify the internal state (ok, one could implement it in a different way, e.g. modify the state of the passed parameter, but that would be strange).
Methods found in Builders for instance (like Builder withProperty(PropertyType property)) are free to choose whether they update the state of the actual instance or create and return new instance holding the updated property.
In the end one cannot foresee whether one or the other implementation strategy has been chosen just by looking at the method, so one has to read the docs (and sometimes the code).
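A small sketch of why the signature alone is not enough: both calls below return an object, but only one of them mutates its receiver. The class name MutationDemo is made up for this example.

```java
import java.time.LocalDate;

class MutationDemo {
    public static void main(String[] args) {
        LocalDate date = LocalDate.of(2020, 1, 30);
        date.plusDays(3);                    // result discarded; date is unchanged
        LocalDate later = date.plusDays(3);  // a new LocalDate instance

        StringBuilder sb = new StringBuilder("a");
        StringBuilder same = sb.append("b"); // mutates sb and returns sb itself

        System.out.println(date);            // still the original date
        System.out.println(later);
        System.out.println(sb + " " + (same == sb));
    }
}
```

LocalDate is documented as immutable and StringBuilder as mutable. That knowledge comes from the Javadoc, not from the method signatures.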

Create a hashmap of immutable generic objects

I don't think that there is a way that is efficient (if at all) of doing this, but I figured I'd ask in case someone else knows otherwise. I'm looking to create my own Cache/lookup table. To make it as useful as possible, I'd like it to be able to store generic objects. The problem with this approach is that even though you can make a Collections.unmodifiableMap, immutableMap, etc, these implementations only prevent you from changing the Map itself. They don't prevent you from getting the value from the map and modifying its underlying values. Essentially what I'd need is for something to the effect of HashMap<K, ? extends Immutable>, but to my knowledge nothing like this exists.
I had originally thought that I could just return a copy of the values in the cache in the get method, but since Java's Cloneable interface is jacked up, you cannot simply call
public V getItem(K key){
    return (V) map.get(key).clone();
}
Your thinking is good, and you're right that there's no built-in way of handling immutability.
However, you could try this:
interface Copyable<T> {
T getCopy();
}
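Then override the get() method to return copies instead of the value itself:
class CopyMap<K, V extends Copyable<V>> extends HashMap<K, V> {
    @Override
    public V get(Object key) {
        V value = super.get(key);
        // Hand out a copy so callers cannot mutate the stored instance; null means no mapping.
        return value == null ? null : value.getCopy();
    }
}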
Then it's up to the implementation to return a copy of itself, rather than this (unless the class itself is immutable). Although you can't enforce that in code, you would be within your rights to publicly humiliate the programmer that doesn't conform.
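A usage sketch of the Copyable/CopyMap idea, repeated here so the example is self-contained. The Counter value type is hypothetical, and the null check in get() is a small addition so that a missing key does not throw.

```java
import java.util.HashMap;

interface Copyable<T> {
    T getCopy();
}

class CopyMap<K, V extends Copyable<V>> extends HashMap<K, V> {
    @Override
    public V get(Object key) {
        V value = super.get(key);
        // Return a copy so callers cannot mutate the stored instance.
        return value == null ? null : value.getCopy();
    }
}

// Hypothetical mutable value type, used only to exercise the map.
final class Counter implements Copyable<Counter> {
    private int count;

    Counter(int count) {
        this.count = count;
    }

    int getCount() { return count; }

    void increment() { count++; }

    @Override
    public Counter getCopy() {
        return new Counter(count);
    }
}
```

With this in place, cache.get("hits").increment() changes only the returned copy. Note that only get() is covered: values() and entrySet() still expose the stored instances, so a wrapper class that holds a map privately may be a safer design than extending HashMap.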
I'm looking to create my own Cache/lookup table.
Why not use Guava's cache?
The problem with this approach is that even though you can make a
Collections.unmodifiableMap, immutableMap, etc, these implementations
only prevent you from changing the Map itself. They don't prevent you
from getting the value from the map and modifying its underlying
values.
This is not something any collection can enforce for you. You need to make the classes themselves immutable. There is a hacky approach using Reflection (which can also be used to make a class mutable!), but really, you should avoid this and simply create classes that are immutable.
There are other options for object cloning in Java: Making a copy of an object Dynamically?
Be aware, though, that deep cloning an arbitrary object can be dangerous. The objects stored in this map must be isolated from each other, i.e. they must not reference one another, so that the whole object graph is not copied when a single entry is returned.
There is no formal concept of "mutability" or "immutability" in the language. The compiler cannot tell whether a type is "mutable" or "immutable". To determine whether something is immutable, we humans have to examine every field and method of the class, and reason through the behavior of the methods to discover that none of them will alter the state of the object, then we call it "immutable". But there is no difference from the perspective of the language.
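Since immutability is a convention rather than a language feature, it has to be built by hand. The sketch below shows the usual ingredients (final class, final fields, defensive copy in the constructor, no mutators). The Team class is a made-up example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical class that is immutable by construction, not by any language keyword.
final class Team {
    private final String name;
    private final List<String> members;

    Team(String name, List<String> members) {
        this.name = name;
        // Copy on the way in, so later changes to the caller's list are not visible here.
        this.members = Collections.unmodifiableList(new ArrayList<>(members));
    }

    String getName() { return name; }

    // Safe to return directly: the list is unmodifiable and Strings are immutable.
    List<String> getMembers() { return members; }

    // "Modification" produces a new instance and leaves this one untouched.
    Team withMember(String member) {
        List<String> extended = new ArrayList<>(members);
        extended.add(member);
        return new Team(name, extended);
    }
}
```

Nothing in the compiler checks this reasoning. If the element type were mutable, the class would silently stop being immutable.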

Defensive copy: should it be specified in the Javadoc?

As far as I understand, getters/setters should always make copies, in order to protect the data.
However, for many of my classes, it is safe to have the getter return a reference to the property asked for, so that the following code
b = a.getB();
b.setC(someValue);
actually changes the state of object a. If I can prove that it is OK for my class, is it good practice to implement the getter this way? Should the user then be notified of this, for example in the Javadoc? I think that this would break the implementation-hiding paradigm, so, should I always assume that the state of a did not change, and make a call to the setter
b = a.getB();
b.setC(someValue);
a.setB(b);
Thanks in advance
S
There's a good argument in your above example that since A is maintaining a reference to B, A should look after B, and not hand it out but manipulate it on your behalf. Otherwise you can argue that you're breaking encapsulation (since A reveals it has a reference to B), and ideally objects should do things for you, rather than export their contents such that you can manipulate them.
Having said all that, the above is certainly not an uncommon practise and often a pragmatic choice.
When you expose an object via get(), you have three options:
expose the actual object
make a defensive copy
expose an object that wraps the original, but prohibits modification. e.g. you can wrap the original object in a restricted interface. See (for example) Collections.unmodifiableCollection() which wraps the original collection (and doesn't copy it) but provides an interface that doesn't permit modification.
Whatever you do, you should document it in the interface (and hence in the Javadoc). Otherwise you're at liberty to change it later, and dependent code can easily break.
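A sketch of the three options side by side, each with the Javadoc sentence that tells the caller which one it gets. The Playlist class and its method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical class; the point is the documented contract of each accessor.
class Playlist {
    private final List<String> tracks = new ArrayList<>();

    /**
     * Returns the live track list. Changes made through the returned
     * list change this playlist.
     */
    List<String> getTracks() {
        return tracks;
    }

    /**
     * Returns a snapshot of the tracks. The returned list is independent
     * of this playlist and may be modified freely.
     */
    List<String> copyOfTracks() {
        return new ArrayList<>(tracks);
    }

    /**
     * Returns a read-only view of the tracks. The view reflects later
     * changes to this playlist; attempts to modify it throw
     * UnsupportedOperationException.
     */
    List<String> viewOfTracks() {
        return Collections.unmodifiableList(tracks);
    }
}
```

A real class would normally pick one of these. Whichever it picks, the Javadoc sentence is what makes the behaviour part of the contract rather than an implementation detail.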
Well, the setC violates the Law of Demeter, so I don't think I'd call it a best practice. ("Law" is a bit strong - for instance, it's generally not applied to fluent interfaces.)
That said, getters should not always make copies IMHO. Doing a deep clone can be expensive. There are other options, such as immutable objects.
And, realistically, there are pragmatic considerations.
But I'd err on the side of TMI (too much information) in the JavaDoc.
A further option not yet mentioned is to expose the object via an immutable interface. Obviously this isn't fool-proof as the calling code could always downcast the object into the mutable version, but it avoids any overhead in wrapping the object or creating a copy.
I usually take this approach if I'm writing an API that I'm likely to use myself or within my programming team; i.e. where I know "clients" are going to be good citizens!
// Immutable interface definition.
public interface Record {
    String getContent();
}
// Mutable implementation of Record interface.
public class MutableRecord implements Record {
    private String content; // not final, because setContent reassigns it
    public MutableRecord(String content) {
        this.content = content;
    }
    public String getContent() {
        return content;
    }
    public void setContent(String content) {
        this.content = content;
    }
}
// API that only exposes the object via its Record interface.
public class MyApi {
    private final MutableRecord mutableRecord;
    // Constructor added so the final field is initialized (an assumption; the original omitted it).
    public MyApi(MutableRecord mutableRecord) {
        this.mutableRecord = mutableRecord;
    }
    public Record getRecord() {
        return mutableRecord;
    }
}
You will get copying wrong before you know it! And then you have a real bug, not a potential one.
Therefore, if you trust the client code, don't bother about it.
This is highly academic and I personally never had a problem with it. I also immediately turn the check off in FindBugs (Java) for example...
And unless there is a problem, who reads JavaDoc and the like anyway? Anybody out there?

Java: Rationale of the Object class not being declared abstract

Why wasn't the java.lang.Object class declared to be abstract ?
Surely for an Object to be useful it needs added state or behaviour, an Object class is an abstraction, and as such it should have been declared abstract ... why did they choose not to ?
An Object is useful even if it does not have any state or behaviour specific to it.
One example would be its use as a generic guard that's used for synchronization:
public class Example {
private final Object o = new Object();
public void doSomething() {
synchronized (o) {
// do possibly dangerous stuff
}
}
}
While this class is a bit simple in its implementation (it isn't evident here why it's useful to have an explicit object, you could just declare the method synchronized) there are several cases where this is really useful.
Ande, I think you are approaching this -- pun NOT intended -- with an unnecessary degree of abstraction. I think this (IMHO) unnecessary level of abstraction is what is causing the "problem" here. You are perhaps approaching this from a mathematical theoretical approach, where many of us are approaching this from a "programmer trying to solve problems" approach. I believe this difference in approach is causing the disagreements.
When programmers look at practicalities and how to actually implement something, there are a number of times when you need some totally arbitrary Object whose actual instance is totally irrelevant. It just cannot be null. The example I gave in a comment to another post is the implementation of *Set (* == Hash or Concurrent or type of choice), which is commonly done by using a backing *Map and using the Map keys as the Set. You often cannot use null as the Map value, so what is commonly done is to use a static Object instance as the value, which will be ignored and never used. However, some non-null placeholder is needed.
Another common use is with the synchronized keyword where some Object is needed to synchronize on, and you want to ensure that your synchronizing item is totally private to avoid deadlock where different classes are unintentionally synchronizing on the same lock. A very common idiom is to allocate a private final Object to use in a class as the lock. To be fair, as of Java 5 and java.util.concurrent.locks.Lock and related additions, this idiom is measurably less applicable.
Historically, it has been quite useful in Java to have Object be instantiable. You could make a good point that with small changes in design or with small API changes, this would no longer be necessary. You're probably correct in this.
And yes, the API could have provided a Placeholder class that extends Object without adding anything at all, to be used as a placeholder for the purposes described above. But -- if you're extending Object but adding nothing, what is the value in the class other than allowing Object to be abstract? Mathematically, theoretically, perhaps one could find a value, but pragmatically, what value would it add to do this?
There are times in programming where you need an object, some object, any concrete object that is not null, something that you can compare via == and/or .equals(), but you just don't need any other feature to this object. It exists only to serve as a unique identifier and otherwise does absolutely nothing. Object satisfies this role perfectly and (IMHO) very cleanly.
I would guess that this is part of the reason why Object was not declared abstract: It is directly useful for it not to be.
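The Set-backed-by-a-Map idea described above can be sketched as follows. SimpleSet is a hypothetical, deliberately minimal class; OpenJDK's HashSet is implemented along similar lines, with a shared dummy Object as the map value.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical minimal set: the keys of the backing map are the set's elements.
class SimpleSet<E> {
    // A single shared placeholder; its identity is all that matters.
    private static final Object PRESENT = new Object();

    private final Map<E, Object> map = new HashMap<>();

    // Returns true if the element was not already in the set.
    boolean add(E element) {
        return map.put(element, PRESENT) == null;
    }

    boolean contains(E element) {
        return map.containsKey(element);
    }

    // Returns true if the element was in the set.
    boolean remove(E element) {
        return map.remove(element) == PRESENT;
    }

    int size() {
        return map.size();
    }
}
```

Using a non-null placeholder lets put() and remove() distinguish "was present" from "was absent" by their return value, which a null value could not do.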
Does Object specify methods that classes extending it must implement in order to be useful? No, and therefore it needn't be abstract.
The concept of a class being abstract has a well defined meaning that does not apply to Object.
You can instantiate Object for synchronization locks:
Object lock = new Object();
void someMethod() {
//safe stuff
synchronized(lock) {
//some code avoiding race condition
}
}
void someOtherMethod() {
//safe code
synchronized(lock) {
//some other stuff avoiding race condition
}
}
I am not sure this is the reason, but it allows (or allowed, as there are now better ways of doing it) for an Object to be used as a lock:
Object lock = new Object();
....
synchronized(lock)
{
}
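For comparison, one of the "better ways" alluded to here is presumably java.util.concurrent.locks.Lock, which another answer mentions. A minimal sketch, with a hypothetical SafeCounter class:

```java
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical counter guarded by an explicit Lock instead of synchronized(lock).
class SafeCounter {
    private final Lock lock = new ReentrantLock();
    private int value;

    void increment() {
        lock.lock();
        try {
            value++;
        } finally {
            lock.unlock(); // always release, even if the body throws
        }
    }

    int get() {
        lock.lock();
        try {
            return value;
        } finally {
            lock.unlock();
        }
    }
}
```

The private final Object idiom remains perfectly valid; an explicit Lock mainly adds options such as tryLock() and timed or interruptible acquisition.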
How is Object any more offensive than null?
It makes a good place marker (as good as null anyway).
Also, I don't think it would be good design to make an object abstract without an abstract method that needs to go on it.
I'm not saying null is the best thing since sliced bread--I read an article the other day by the "Inventor" discussing the cost/value of having the concept of null... (I didn't even think null was inventable! I guess someone somewhere could claim he invented zero..) just that being able to instantiate Object is no worse than being able to pass null.
You never know when you might want to use a simple Object as a placeholder. Think of it as like having a zero in a numerical system (and null doesn't work for this, since null represents the absence of data).
There should be a reason to make a class abstract. One is to prevent clients from instantiating the class and force them into using only subclasses (for whatever reasons). Another is if you wish to use it as an interface by providing abstract methods, which subclasses must implement. Probably, the designers of Java saw no such reasons, so java.lang.Object remains concrete.
As always, Guava comes to the rescue with http://docs.guava-libraries.googlecode.com/git/javadoc/com/google/common/base/Optional.html
Optional can be used to eliminate nulls, or bare Object instances used as "a not-null placeholder", from the code.
There are two entirely separate questions here:
why did not they make Object abstract?
how much disaster comes after if they decide to make it abstract in a future release?
I'll just throw in another reason that I've found Object useful to instantiate on its own. I have a pool of objects I've created that has a number of slots. Those slots can contain any of a number of objects, all of which inherit from an abstract class. But what do I put in the pool to represent "empty"? I could use null, but for my purpose it made more sense to ensure that there was always some object in each slot. I can't instantiate the abstract class to put in there, and I wouldn't have wanted to. So I could have created a concrete subclass of my abstract class to represent "not a useful foo", but that seemed unnecessary when using an instance of Object was just as good; in fact better, as it clearly says that what's in the slot has no functionality. So when I initialize my pool, I do so by creating an Object to assign to each slot as the initial condition of the pool.
I agree that it might have made sense for the original Java crew to have defined a Placeholder object as a concrete subclass of Object, and then made Object abstract, but it doesn't rub me wrong at all that they went the way they did. I would then have used Placeholder in place of Object.
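A rough sketch of such a pool. The SlotPool class is hypothetical, and it uses one shared sentinel Object for all empty slots rather than a fresh Object per slot as described above, which is a small variation on the same idea.

```java
import java.util.Arrays;
import java.util.Objects;

// Hypothetical slot pool in which no slot is ever null.
class SlotPool {
    // Plain Object used purely as an "empty" marker; it has no behaviour of its own.
    private static final Object EMPTY = new Object();

    private final Object[] slots;

    SlotPool(int capacity) {
        slots = new Object[capacity];
        Arrays.fill(slots, EMPTY);
    }

    boolean isEmpty(int index) {
        return slots[index] == EMPTY; // identity comparison against the marker
    }

    void put(int index, Object item) {
        slots[index] = Objects.requireNonNull(item, "item");
    }

    void clear(int index) {
        slots[index] = EMPTY;
    }
}
```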
