How can implementation know if an input parameter is mutable? - java

For example, some method has the next implementation:
void setExcludedCategories(List<Long> excludedCategories) {
if (excludedCategories.contains(1L)) {
excludedCategories.remove(1L);
}
}
And it's called in the next way:
setExcludedCategories(Arrays.asList(1L, 2L, 3L));
Of course, this leads to a java.lang.UnsupportedOperationException when the method tries to remove the item.
The question: how can I modify this code to be sure that the input parameter excludedCategories supports remove?
UPD:
Thanks for answers. Let's summarize results:
1. Always create a new ArrayList from the input list to be sure it's mutable - a lot of useless memory would be used -> NO.
2. Catch the UnsupportedOperationException.
3. Specify in the JavaDoc that a caller mustn't pass an immutable list - does anybody read the JavaDoc? Only when something doesn't work :)
4. Don't use Arrays.asList() in the caller's code - that's an option if you are the owner of that code, but you still need to know whether this particular method accepts immutable lists or not (see 3).
It seems the second variant is the only way to resolve this problem.

How can I modify this code to be sure that the input parameter excludedCategories supports remove?
In the general case, you can't. Given an arbitrary class that implements the List API, you cannot tell (statically or dynamically) if the optional methods are supported.
You can use instanceof tests to check if the class of the list is known to implement the method or to not implement it. For example ArrayList and LinkedList do, but Collections.UnmodifiableList does not. The problem is that your code could encounter list classes that your tests don't cover. (Especially if it is a library that is intended to be reusable in other people's applications.)
You could also try to test the behavior of previously unknown classes; e.g. create a test instance, try a remove to see what happens, and record the behavior in a Map<Class, Boolean>. There are two problems with this:
You may not be able to (correctly) instantiate the list class to test it.
The behavior could depend on how you instantiate the class (e.g. constructor parameters) or even on the nature of the element you are trying to remove ... though the latter is pushing the boundary of plausibility.
In fact, the only completely reliable approach is to call the method and catch the exception (if it is thrown) each and every time.
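A minimal sketch of the call-and-catch approach, assuming the caller is happy to receive a mutable copy when the original list rejects removal. The class and method names (SafeRemoval, removeOrCopy) are hypothetical and not part of the original question.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; names are illustrative only.
final class SafeRemoval {

    /**
     * Removes the first occurrence of value from list in place when the list
     * supports removal. If the list throws UnsupportedOperationException,
     * returns a new mutable copy without the value and leaves the input as is.
     */
    static <T> List<T> removeOrCopy(List<T> list, T value) {
        try {
            list.remove(value); // resolves to remove(Object) because value has type T
            return list;
        } catch (UnsupportedOperationException e) {
            List<T> copy = new ArrayList<>(list);
            copy.remove(value);
            return copy;
        }
    }
}
```

The caller has to use the returned list rather than the argument, which is the price for not copying on every call.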

In short, you can't know. If an object implements an interface (such as List) you can't know if it will actually do what is expected for all of the methods. For instance Collections.unmodifiableList() returns a List that throws UnsupportedOperationException. It can't be filtered out via the method signature if you want to be able to get other List implementations.
The best you can do is to throw IllegalArgumentException for known subtypes that don't support what you want. And catch UnsupportedOperationException for other types of cases. But really you should javadoc your method with what is required and that it throws IllegalArgumentException in other cases.
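One way this advice could look in code, as a sketch only: the method documents its requirement and translates UnsupportedOperationException into IllegalArgumentException. The names CategoryFilter and dropDefaultCategory are assumptions made for illustration. The sketch does not test for known subtypes, since the JDK's unmodifiable wrapper classes are not accessible by name.

```java
import java.util.List;

// Hypothetical class; names are illustrative only.
final class CategoryFilter {

    /**
     * Removes category 1 from the given list in place.
     *
     * @param excludedCategories a list that must support remove
     * @throws IllegalArgumentException if the list does not support removal
     */
    static void dropDefaultCategory(List<Long> excludedCategories) {
        try {
            // Long.valueOf makes it explicit that remove(Object) is intended.
            excludedCategories.remove(Long.valueOf(1L));
        } catch (UnsupportedOperationException e) {
            throw new IllegalArgumentException(
                    "excludedCategories must be a modifiable list", e);
        }
    }
}
```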

That depends somewhat on what you're trying to do. In your posted example, you could just catch the UnsupportedOperationException and do something else instead.
This assumes that non-mutable containers throw that exception on every attempt to modify them and do so without side effects (that is, they really are non-mutable).
In other cases, where your code has side effects other than trying to modify the container, you will have to make sure these don't happen before you know that you can modify the container.

You can catch the exception in a utility class, as in the example below (as others mentioned). The drawback is that you have to actually perform a remove/insert to test whether an exception is thrown. You cannot use instanceof, since all the Collections.Unmodifiablexxx classes have default (package-private) access.
CollectionUtils:
import java.util.List;
public class CollectionUtils {
public <T> boolean isUnmodifiableList(List<T> listToCheck) {
T object = listToCheck.get(0);
try {
listToCheck.remove(object);
} catch (UnsupportedOperationException unsupportedOperationException) {
return true;
}
listToCheck.add(0, object);
return false;
}
}
Main:
import java.util.Arrays;
import java.util.List;
public class Main {
private static final CollectionUtils COLLECTION_UTILS = new CollectionUtils();
public static void main(String[] args) {
setExcludedCategories(Arrays.asList(1L, 2L, 3L));
}
private static void setExcludedCategories(List<Long> excludedCategories) {
if (excludedCategories.contains(1L)) {
if(!COLLECTION_UTILS.<Long>isUnmodifiableList(excludedCategories)){
excludedCategories.remove(1L);
}
}
}
}

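Arrays.asList(T... a) returns a List<T> backed by the private class java.util.Arrays.ArrayList, which is a fixed-size list: set() works, but add() and remove() throw UnsupportedOperationException. To get your code working, just wrap the result in a java.util.ArrayList<T>, as shown below: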
setExcludedCategories(new ArrayList<Long>(Arrays.asList(1L, 2L, 3L)));
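To make the difference concrete, here is a small sketch (the class and method names are hypothetical) showing that the list from Arrays.asList accepts set() but rejects removal, while an ArrayList copy of it accepts both.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class; names are illustrative only.
final class FixedSizeDemo {

    /**
     * Tries to remove the first element of a non-empty list.
     * Returns true if the list rejected the removal; otherwise the
     * element is really removed and false is returned.
     */
    static boolean removalRejected(List<Long> list) {
        try {
            list.remove(0); // remove(int index)
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<Long> fixed = Arrays.asList(1L, 2L, 3L);
        fixed.set(0, 10L); // allowed: the size does not change
        System.out.println(removalRejected(fixed));

        List<Long> copy = new ArrayList<>(fixed);
        System.out.println(removalRejected(copy));
    }
}
```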

Always create new ArrayList from the input list to be sure it's mutable - a lot of useless memory would be used -> NO.
That's actually the preferred way to do things. "A lot of useless memory" isn't a lot in most practical situations, certainly not in your cited example.
And ignoring that, it's the only robust and intuitively understood idiom.
The only workable alternative would be to explicitly change the name of your method (thus communicating its behavior better). From the example you show, name it "removeExcludedCategories" if it's meant to modify the argument list (but not an object's state).
Otherwise, if it is meant as a bulk setter, you're out of luck: there is no commonly recognized naming idiom that clearly communicates that the argument collection is directly incorporated into the state of an object (it's also dangerous because the object's state can then be altered without the object knowing about it).
Also, only marginally related, I would design not an exclusion list but an exclusion set. Sets are conceptually better suited (no duplicates), and there are set implementations that have far better runtime complexity for the most commonly asked question: contains().
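A rough sketch of what the defensive copy into a set might look like, assuming the method is a bulk setter on some object. CategoryExclusions and isExcluded are hypothetical names, and dropping category 1 simply mirrors the question's example.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;

// Hypothetical class; names are illustrative only.
final class CategoryExclusions {

    private final Set<Long> excluded;

    CategoryExclusions(Collection<Long> excludedCategories) {
        // Defensive copy: always mutable, and later changes to the caller's
        // collection cannot alter this object's state behind its back.
        this.excluded = new HashSet<>(excludedCategories);
        this.excluded.remove(1L); // category 1 is never excluded in the question's example
    }

    boolean isExcluded(long categoryId) {
        return excluded.contains(categoryId); // expected constant time for HashSet
    }
}
```

The caller can pass Arrays.asList(...), an unmodifiable list or any other collection, and its argument is never modified.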

Related

Side effects in Java methods

This might be a trivial question, but I need some clarification...
There is a book called Clean Code that says that our methods should be small, preferably up to 5-10 lines long. In order to achieve that we need to split our methods into smaller ones.
For instance, we may have someMethod() shown below. Let's say, modification of 'Example' takes 5 lines and I decide to move it into a separate method, modify 'Example' there and return it back to someMethod(). By doing this, someMethod() becomes smaller and easier to read. That's good, but there is a thing called "side effects" which says that we shouldn't pass an object to another method and modify it there. At least, I was told that it's a bad idea ) But I haven't seen anything prohibiting me from doing so in Clean Code.
public Example someMethod() {
// ... different lines here
Example example = new Example();
example = doSomethingHere(example, param1, param2, ...);
// ... different lines here
return example;
}
private Example doSomethingHere(Example example, 'some additional params here') {
// ... modify example's fields here ...
return example;
}
So, am I allowed to split the methods this way or such a side effect is prohibited and instead I should deal with a rather long-line method that definitely breaks Clean Code's rules talking about short methods?
UPDATED (more specific name for the sub-method)
public Example someMethod() {
// ... different lines here
Example example = new Example();
example = setExampleFields(example, param1, param2, ...);
// ... different lines here
return example;
}
private Example setExampleFields(Example example, 'some additional params here') {
// ... modify example's fields here ...
return example;
}
As JB Nizet commented, it's not actually a side effect if it's the only effect, so any blanket statement that "all side effects are bad" doesn't apply here.
Still, the main question stands: Is this (side) effect okay?
Talking about the principles first, side effects are, in general, dangerous for two reasons:
they make concurrency more difficult
they obscure/hide information
In your example, there is some information that is hidden. You could call this a potential side effect, and it can be exposed with a question: "Does this doSomethingHere method create a new object or modify the one I pass in?"
The answer is important, and even more so if it's a public method.
The answer should be trivial to find by reading the doSomethingHere method, especially if you're keeping your methods 'clean', but the information is nonetheless hidden/obscured.
In this specific case, I would make doSomethingHere return void. That way there's no potential for people to think that you've created a new object.
This is just a personal approach - I'm sure that plenty of developers say you should return the object you modify.
Alternatively, you can pick a 'good' method name. "modifyExampleInPlace" or "changeSomeFieldsInPlace" are pretty safe names for your specific example, imo.
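As a sketch of the void-returning variant, assuming a made-up Example class with two fields (the fields, the factory class and the method names are all illustrative, not taken from the question):

```java
// Hypothetical stand-in for the question's Example class.
final class Example {
    String name;
    int count;
}

// Hypothetical class; names are illustrative only.
final class ExampleFactory {

    Example someMethod(String name, int count) {
        Example example = new Example();
        populateInPlace(example, name, count); // no reassignment needed
        return example;
    }

    // Returning void signals that the argument itself is modified
    // and that no new object is created.
    private void populateInPlace(Example example, String name, int count) {
        example.name = name;
        example.count = count;
    }
}
```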
we shouldn't pass an object to another method and modify it there.
Who says that? That is actually a good practice in order to split your function in a way that forms a "recipe" and have specific functions that know exactly how to populate your object properly.
What is not recommended (and probably what your source actually meant) is defining a public API that modifies its arguments. Users appreciate not having their arguments modified, as it leads to fewer surprises. An example of that is passing arrays as arguments to methods.
When you define an object and pass it to another method, that method can modify the object's contents, which may be unwanted in some cases. This is because Java passes a copy of the reference, not a copy of the object, so the method works on the same object as the caller. For example, arrays are objects, so when you pass an array to a method, the method can change its contents, which may not be what the caller expects.
public static void main(String[] args){
int[] arr = {1,2,3,4};
y(arr);
// After the call, arr[0] is 10
}
// static, so that it can be called from main
public static void y(int[] comingArray){
comingArray[0] = 10;
}
To make sure the values of the array cannot be changed, a copy of the array (a deep copy, if the elements are themselves mutable objects) should be passed to the method, which is another story.
However, this is not the case when you use primitive types (int, float, etc.):
public static void main(String[] args){
int a = 1;
y(a);
// After the call, a is still 1
}
// static, so that it can be called from main
public static void y(int comingInt){
comingInt = 5; // changes only the method's local copy
}
Because objects behave this way, you should be careful. To learn more about shallow copy and deep copy, see https://www.cs.utexas.edu/~scottm/cs307/handouts/deepCopying.htm
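For the array case above, a minimal sketch of protecting the caller's data by passing a copy. For an int[] a plain clone() is enough, because the elements are primitives; the class and method names are hypothetical.

```java
import java.util.Arrays;

// Hypothetical demo class; names are illustrative only.
final class ArrayCopyDemo {

    static void overwriteFirst(int[] values) {
        values[0] = 10;
    }

    public static void main(String[] args) {
        int[] arr = {1, 2, 3, 4};

        overwriteFirst(arr.clone());              // the method only sees a copy
        System.out.println(Arrays.toString(arr)); // [1, 2, 3, 4]

        overwriteFirst(arr);                      // the method sees the caller's array
        System.out.println(Arrays.toString(arr)); // [10, 2, 3, 4]
    }
}
```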

Preserving encapsulation of a generic in Java

Good evening.
I have a rather involved question. To practice Java, I've been re-implementing some of the data structures in the standard library. Stacks, LinkedLists, Trees, etc. I just established, through a very simple example, that the java.util.Stack class performs a deep copy when either the peek() or pop() methods are used. This is understandable, since the goal would be to protect the contents of the class from outside interference. So far, in my own implementation of the Stack (a naive implementation with a simple array, linked lists will come later), I have not cared for this at all:
public class ArrayStack<T> implements Stack<T> {
private T[] data; // Will expand the array when stack is full.
private int top; // serves as both top and count indicator.
...
...
@Override
public T pop() throws EmptyStackException {
if(top == -1)
throw new EmptyStackException("Stack is empty.");
return data[top--]; // Shallow copy, dangerous!
}
Unfortunately, since a generic cannot be instantiated, I cannot assume a copy constructor and do stuff like return new T(data[top--]); I've been looking around in S.O and I've found two relevant threads which attempt to solve the problem by using some variant of clone(). This thread suggests that the class's signature be extended to:
public class ArrayStack<T extends DeepCloneableClass> implements Stack<T>
...
where DeepCloneableClass is a class that implements an interface that allows for "deep cloning" (see the top response in that thread for the relevant details). The problem with this method, of course, is that I can't really expect from standard classes such as String or Integer to be extending that custom class of mine, and, of course, all my existing jUnit tests are now complaining at compile-time, since they depend on such Stacks of Integers and Strings. So I don't feel as if this solution is viable.
This thread suggests the use of a third-party library for cloning pretty much any object. While it appears that this library is still supported (latest bug fixes date from less than a month ago), I would rather not rely on third-party tools and use whatever Java can provide for me. The reason for this is that the source code for these ADTs might be someday shared with undergraduate college students, and I would rather not have them burdened with installing extra tools.
I am therefore looking for a simple and, if possible, efficient way to maintain a generic Java data structure's inner integrity while still allowing for a simple interface to methods such as pop(), peek(), popFront(), etc.
Thanks very much for any help!
Jason
Why do you need to clone the objects?
Your stack just has a collection of references. You probably don't need to clone them, just make a new array and put the appropriate references in it, then throw away the old array.
Integer, String, etc. are all immutable, so their contents are safe by design.
As for custom objects, while experienced Java programmers will certainly have mixed feelings about it, implementing a custom interface is certainly one way to approach the problem.
Another one is to declare <T extends Serializable> (Serializable is implemented by Integer, String, etc.) and "clone" through serialization.
But if you want to teach your students the "right way", I would definitely use a third-party library... You can just create a lib folder in your project and configure your build tool / IDE to add the needed jars to the classpath using relative paths, so your undergraduate students will not have to install or set up anything.
Just for reference, this question may be very useful.
I've been teaching Java introductory classes (as an IT Instructor / not as a college Professor) using this kind of approach, and it is way less painful than it sounds.
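The serialization-based "clone" mentioned above could be sketched roughly as follows, using only java.io. SerialCopy and deepCopy are hypothetical names. The sketch assumes that every object reachable from the original is Serializable, and it is considerably slower than a hand-written copy.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical helper; names are illustrative only.
final class SerialCopy {

    /** Copies an object graph by writing it to a byte array and reading it back. */
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T deepCopy(T original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (T) in.readObject();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```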
The comments helped me understand what I had wrong. I was using the following example to "prove" to myself and others that the Java Standard Library's Collections do a deep copy when providing references to objects in the Collection:
import java.util.Stack;
public class StackTestDeepCopy {
public static void main(String[] args){
Stack<String> st = new Stack<String>();
st.push("Jim");
st.push("Jill");
String top = st.peek();
top = "Jack";
System.out.println(st);
}
}
When printing st, I saw that objects had been unchanged, and concluded that a deep copy had taken place. Wrong! Strings are immutable, and therefore the statement top = "Jack" does not in any way modify the String (not that any Object would be "modified" by a statement like that, but I wasn't thinking straight), it just makes the reference point to a new place on the heap. A new example, involving an actually mutable class, made me understand the error in my ways.
Now that this problem has been solved, I'm quite baffled by the fact that the standard library allows for this. Why is it that accessing elements in the standard library is implemented as a shallow copy? Sounds very unsafe.
java.util.Stack doesn't do a deep copy:
import java.util.Stack;
public class Test {
String foo;
public static void main(String[] args) {
Test test = new Test();
test.foo = "bar";
Stack<Test> stack = new Stack<Test>();
stack.push(test);
Test otherTest = stack.pop();
otherTest.foo = "wibble";
System.out.println("Are the same object: "+(test == otherTest));
}
}
Results in:
Are the same object: true
If it did do a copy then test and otherTest would point to a different object. A typical stack implementation simply returns a reference to the same object that was added onto the stack, not a copy.
You probably also want to set the array item to null before returning, otherwise the array will still hold a reference to the object.
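A sketch of a pop() that clears the slot, assuming a simple growable array stack. SimpleArrayStack is a hypothetical, cut-down stand-in for the ArrayStack in the question, and it throws NoSuchElementException rather than the question's own exception type.

```java
import java.util.Arrays;
import java.util.NoSuchElementException;

// Hypothetical, simplified stack; not the asker's actual class.
final class SimpleArrayStack<T> {

    private T[] data;
    private int top = -1; // index of the top element, -1 when empty

    @SuppressWarnings("unchecked")
    SimpleArrayStack(int capacity) {
        data = (T[]) new Object[Math.max(1, capacity)];
    }

    void push(T item) {
        if (top + 1 == data.length) {
            data = Arrays.copyOf(data, data.length * 2); // grow when full
        }
        data[++top] = item;
    }

    T pop() {
        if (top == -1) {
            throw new NoSuchElementException("Stack is empty.");
        }
        T item = data[top];
        data[top--] = null; // drop the stack's reference so the object can be collected
        return item;        // the same reference that was pushed, not a copy
    }
}
```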

Builder pattern with error-checking: is it possible/advisable?

I am using the Builder pattern to make it easier to create objects. However, the standard builder pattern examples do not include error-checking, which is needed in my code. For example, the accessibility and demandMean arrays in the Simulator object should have the same length. A brief framework of the code is shown below:
public class Simulator {
double[] accessibility;
double[] demandMean;
// Constructor omitted for brevity
public static class Builder {
private double[] _accessibility;
private double[] _demandMean;
public Builder accessibility(double[] accessibility) {
_accessibility = accessibility.clone();
return this;
}
public Builder demandMean(double[] demandMean) {
_demandMean = demandMean.clone();
return this;
}
// build() method omitted for brevity
}
}
As another example, in a promotion optimization problem, there are various promotional vehicles (e.g. flyers, displays) and promotion modes, which are a set of promotional vehicles (e.g. none, flyer only, display only, flyer and display). When I create the Problem, I have to define the set of vehicles available, and check that the promotion modes use a subset of these vehicles and not some other unavailable vehicles, as well as that the promotion modes are not identical (e.g. there aren't two promo modes that are both "flyer only"). A brief framework of the code is shown below:
public class Problem {
Set<Vehicle> vehicles;
Set<PromoMode> promoModes;
public static class Builder {
Set<Vehicle> _vehicles;
Set<PromoMode> _promoModes;
}
}
public class PromoMode {
Set<Vehicle> vehiclesUsed;
}
My questions are the following:
Is there a standard approach to address such a situation?
Should the error checking be done in the constructor or in the builder when the build() method is called?
Why is this the "right" approach?
When you need invariants to hold while creating an object then stop construction if any parameter violates the invariants. This is also a fail-fast approach.
The builder pattern helps creating an object when you have a large number of parameters.
That does not mean that you don't do error checking.
Just throw an appropriate RuntimeException as soon as a parameter violates the object's invariants.
You should use the constructor, since that follows the Single Responsibility Principle better. It is not the responsibility of the Builder to check invariants. Its only real job is to collect the data needed to build the object.
Also, if you decide to change the class later to have public constructors, you don't have to move that code.
You definitely shouldn't check invariants in setter methods. This has several benefits:
* You only need to do checking ONCE
* In cases such as your code, you CAN'T check your invariants earlier, since you're adding your two arrays at different times. You don't know what order your users are going to add them, so you don't know which method should run the check.
Unless a setter in your builder does some intense calculations (which is rarely the case - generally, if there's some sort of calculation required, it should happen in the constructor anyway), it doesn't help very much to 'fail early' in the setters, especially since fluent Builders like yours use only 1 line of code to build the object anyway, so any try block would surround that whole line either way.
The "right" approach really depends on the situation. If it is invalid to construct the object with arrays of different sizes, I'd say it's better to do the check at construction: the sooner an invalid state is caught, the better.
Now, if the arrays can be changed or swapped for different ones later, then it might be better to do the check in the methods that set them.
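A sketch of the "validate in the constructor" variant for the first example from the question. SimulatorSketch, its size() accessor and the choice of exception types are assumptions made for illustration; the real Simulator class may look different.

```java
// Hypothetical, cut-down version of the question's Simulator.
final class SimulatorSketch {

    private final double[] accessibility;
    private final double[] demandMean;

    // All invariant checks live here, so they run exactly once,
    // whatever order the builder methods were called in.
    private SimulatorSketch(Builder builder) {
        if (builder.accessibility == null || builder.demandMean == null) {
            throw new IllegalStateException("accessibility and demandMean are both required");
        }
        if (builder.accessibility.length != builder.demandMean.length) {
            throw new IllegalArgumentException(
                    "accessibility and demandMean must have the same length");
        }
        this.accessibility = builder.accessibility;
        this.demandMean = builder.demandMean;
    }

    int size() {
        return accessibility.length;
    }

    static final class Builder {
        private double[] accessibility;
        private double[] demandMean;

        Builder accessibility(double[] values) {
            this.accessibility = values.clone(); // defensive copy, as in the question
            return this;
        }

        Builder demandMean(double[] values) {
            this.demandMean = values.clone();
            return this;
        }

        SimulatorSketch build() {
            return new SimulatorSketch(this);
        }
    }
}
```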

How to guarantee Java collection

I would like to find an API like Apache Commons that will easily and always return a collection.
The intent is to produce code that doesn't require NPE checks or CollectionUtils.isNotEmpty checks prior to collection iteration. The assumption in the code is to always guarantee a list instance thus eliminating code complexity for every collection iteration.
Here's an example of a method, but I would like an API instead of rolling my own.
private List<Account> emptyCollection(
List<Account> requestedAccounts) {
if (CollectionUtils.isNotEmpty(requestedAccounts)) {
return requestedAccounts;
} else {
return new ArrayList<Account>();
}
}
I would like to find a generic API / method that could be used for any class generically.
Here are some of my research classes inside commons that may help me do the trick.
http://commons.apache.org/collections/apidocs/org/apache/commons/collections/TransformerUtils.html
http://commons.apache.org/collections/apidocs/org/apache/commons/collections/CollectionUtils.html
Maybe the .collect might work using a transformer.
I'm open to using alternative API's as well.
Is this an example of what you mean?
public static <T> List<T> nullToEmpty(List<T> list) {
if (list != null) {
return list;
}
return Collections.emptyList();
}
Your question is a bit hard to understand. Do you simply want to avoid NPEs, or do you also want to avoid the CollectionUtils.isNotEmpty check?
The first is very easy, the second not so, because you essentially want to guarantee that your API will always return a Collection with at least one element.
That is a business centric constraint IMO, and not something you can guarantee via an API contract.
If all you want to avoid is an NPE, you can use the java.util.Collections.EMPTY_SET, EMPTY_MAP and EMPTY_LIST constants (or the type-safe emptySet(), emptyMap() and emptyList() methods). But mind you, these are immutable, i.e. the calling code can't add objects to a collection returned this way. If you want the calling code to mutate the collection (i.e. add/remove/update elements), then you'll have to return a zero-element concrete implementation of your List, Map, Set, etc.
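The trade-off between the two kinds of empty result can be sketched with plain JDK classes. NullSafeLists and both method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper; names are illustrative only.
final class NullSafeLists {

    /** Never returns null. The fallback is a shared immutable empty list. */
    static <T> List<T> nullToEmpty(List<T> list) {
        if (list != null) {
            return list;
        }
        return Collections.emptyList();
    }

    /** Never returns null. The fallback is a fresh list the caller may modify. */
    static <T> List<T> nullToEmptyMutable(List<T> list) {
        if (list != null) {
            return list;
        }
        return new ArrayList<>();
    }
}
```

Both variants make a for-each loop over the result safe without a null check; only the second one lets the caller add elements when the input was null.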

Use of guava immutable collection as method parameter and/or return type

I am trying to determine what the best practices would be for an ImmutableList. Below is a simplistic example that will help drive my questions:
Ex:
public ImmutableCollection<Foo> getFooOne(ImmutableList<Foo> fooInput){
//.. do some work
ImmutableList<Foo> fooOther = // something generated during the code
return fooOther;
}
public Collection<Foo> getFooTwo(List<Foo> fooInput){
//.. do some work
List<Foo> fooOther = // something generated during the code
return ImmutableList.copyOf(fooOther);
}
public void doSomethingOne(){
ImmutableCollection<Foo> myFoo = getFooOne(myList);
...
someOtherMethod(myFoo);
}
public void doSomethingTwo(){
Collection<Foo> myFoo = getFooTwo(myList);
...
someOtherMethod(myFoo);
}
My Questions:
Which makes the most sense to use in an application? [doSomethingOne and getFooOne] or [doSomethingTwo and getFooTwo]? In other words, if you know you are using ImmutableCollections, does it make sense to keep casting back and forth and doing copyOf(), or just use Immutable everywhere?
These examples are public methods which could imply that other people use them. Would any of these answers change if the methods were private and used internally?
If a user tries to add anything to an immutable List an exception will be thrown. Because they may not be aware of this, wouldn't it make more sense to explicitly return an ImmutableCollection instead of a Collection?
In general, it's wise not to commit to a specific implementation in your declared return type, but we think of the immutable types as an exception. There are a few reasons to declare a return type of Immutable*:
They document that you're returning a snapshot, not a live view.
They document that the caller can't mutate the result.
They document that insertion order is preserved (which may or may not be significant in your use case).
They document that the collection won't contain null.
Someone might want the asList() or reverse() method.
You may save someone a copyOf() call if he wishes to assign to an Immutable* field. (But note that, if he does include copyOf(), it will short-circuit for most immutable inputs, even if you don't declare the return type.)
Basically, I'm just cribbing from https://github.com/google/guava/wiki/TenThingsAboutImmutableCollections, which you may want to check out in its entirety.
If I understood your intentions, the proper way of designing getFooXxx for making an immutable copy of maybe-mutable-list is something like this:
/**
* Returns an <b>immutable copy</b> of input list.
*/
public ImmutableList<Foo> immutableCopyOfFoo(List<Foo> input){
return ImmutableList.copyOf(input);
}
Why?
ImmutableList.copyOf() does its magic when the given list is already immutable,
method signature explicitly says what it does,
method returns ImmutableList which is, in fact, an ImmutableCollection, but why would you want to hide the information about ImmutableList from the user? If he wants, he'll write Iterable foo = immutableCopyOfFoo(mutableFoo); instead, but 99% of the time he'll use an ImmutableList,
returning an ImmutableList makes a promise - "I am immutable, and I will blow everything up if you try to change me!"
and last but not least - proposed method is unnecessary in internal use; just use
someOtherMethod(ImmutableList.copyOf(foo));
directly in your code...
You should check @ChrisPovirk's answer (and the link to the wiki in that answer) to learn that, e.g., when List<Foo> input contains nulls, you will get a nasty NPE at runtime if you try to make an immutable copy...
EDIT answering comment #1:
Collection contract is less strict than List's one; i.e. Collection doesn't guarantee any order of elements ("Some are ordered and others unordered") while List does ("An ordered collection (also known as a sequence)").
If an input is a List it suggests that order is important and therefore output should guarantee the same. Imagine that:
public ImmutableCollection<Foo> immutableCopyOfFoo(List<Foo> input){
return ImmutableSortedSet.copyOf(input, someFancyComparator);
}
It doesn't smell right. If you don't care about order then maybe method signature should be immutableCopyOfFoo(Collection<Foo> input)? But it depends on concrete use case.
public ImmutableCollection<Foo> getFooOne(ImmutableList<Foo> fooInput){
ImmutableList<Foo> fooOther= fooInput;
return ImmutableList.copyOf(fooOther);
}
This makes no sense at all. Why would you ever copy an immutable collection? The whole point of immutability is: it can't be changed, so you might as well re-use it.
public Collection<Foo> getFooTwo(List<Foo> fooInput){
ImmutableList<Foo> fooOther= ImmutableList.copyOf(fooInput);
return ImmutableList.copyOf(fooOther);
}
??? Why do it twice??? This is fine:
public Collection<Foo> getFooTwo(List<Foo> fooInput){
return ImmutableList.copyOf(fooInput);
}
ImmutableList.copyOf(Collection) is smart enough to return ImmutableList unmodified and create a new ImmutableList for everything else.
My usual approach is:
accept List for parameters (so the interface is easier to use for clients)
if performance/memory usage/thread-safety is important, copy the contents of the provided List into a data structure that is optimized for usage by your class
when returning an ImmutableList, ImmutableList should be the return type (because it gives the caller more information about how it can use the returned value)
when returning a mutable implementation of List, List should be the return type, unless something else about the return type is important (thread-safety, as a bad* example)
* It's a bad example because if your return values need to be thread-safe, it probably means something else is wrong with your code.
Replace List/ImmutableList with any of the immutable collection types.
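The first two points of this approach can be sketched without Guava, using only JDK classes. Note that this swaps in Collections.unmodifiableList for ImmutableList, so the return type can no longer advertise immutability, which is exactly the information the third point wants to keep. FooRegistry and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical class; JDK-only stand-in for the Guava-based approach.
final class FooRegistry {

    private final List<String> names;

    // Accepts the general List type so callers can pass any implementation.
    FooRegistry(List<String> names) {
        // Copy first, then wrap: the caller's later changes cannot leak in,
        // and nobody holding the returned view can modify the copy.
        this.names = Collections.unmodifiableList(new ArrayList<>(names));
    }

    /** Returns an unmodifiable snapshot taken at construction time. */
    List<String> names() {
        return names;
    }
}
```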
You should always use the standard JRE classes on public interfaces. There are no extra methods on Guava's Immutable... classes so you're not gaining any compile-time safety: any attempts to make changes to those objects will only fail at run-time (but see Bart's comment). You should document in methods that return collections that they're immutable.
You should make defensive copies of lists provided on public methods if you're worried about concurrent modification, but it's OK to specify ImmutableCollection on private method arguments.
