JavaDoc of ImmutableSet says:
Unlike Collections.unmodifiableSet, which is a view of a separate collection that can still change, an instance of this class contains its own private data and will never change. This class is convenient for public static final sets ("constant sets") and also lets you easily make a "defensive copy" of a set provided to your class by a caller.
But ImmutableSet still stores references to its elements, so I couldn't figure out how it differs from Collections.unmodifiableSet(). Sample:
StringBuffer s=new StringBuffer("a");
ImmutableSet<StringBuffer> set= ImmutableSet.of(s);
s.append("b");//s is "ab", s is still changed here!
Could anyone explain it?
Consider this:
Set<String> x = new HashSet<String>();
x.add("foo");
ImmutableSet<String> guava = ImmutableSet.copyOf(x);
Set<String> builtIn = Collections.unmodifiableSet(x);
x.add("bar");
System.out.println(guava.size()); // Prints 1
System.out.println(builtIn.size()); // Prints 2
In other words, ImmutableSet is immutable despite whatever collection it's built from potentially changing - because it creates a copy. Collections.unmodifiableSet prevents the returned collection from being directly changed, but it's still a view on a potentially-changing backing set.
Note that if you start changing the contents of the objects referred to by any set, all bets are off anyway. Don't do that. Indeed, it's rarely a good idea to create a set using a mutable element type in the first place. (Ditto maps using a mutable key type.)
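To make the warning about mutable elements concrete, here is a minimal sketch using only the JDK: a plain HashSet with a mutable List as its element. The class name MutableElementDemo is made up for illustration and is not part of Guava. Once the element's hash code changes, the set can no longer find it:

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical demo class; not part of Guava or the JDK.
final class MutableElementDemo {

    // Returns whether the set still finds its own element after the element was mutated.
    static boolean foundAfterMutation() {
        List<String> element = new ArrayList<>(List.of("a"));
        Set<List<String>> set = new HashSet<>();
        set.add(element);

        element.add("b"); // changes element.hashCode(), which the set has already used

        return set.contains(element);
    }

    public static void main(String[] args) {
        System.out.println(foundAfterMutation()); // false
    }
}
```

The element is still physically in the set (its size is still 1), but lookups use the new hash code and miss it. Neither an immutable copy nor an unmodifiable view protects against this.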
Besides the behavioral difference that Jon mentions, an important difference between ImmutableSet and the Set created by Collections.unmodifiableSet is that ImmutableSet is a type. You can pass one around and have it remain clear that the set is immutable by using ImmutableSet rather than Set throughout the code. With Collections.unmodifiableSet, the returned type is just Set... so it's only clear that the set is unmodifiable at the point where it is created unless you add Javadoc everywhere you pass that Set saying "this set is unmodifiable".
Kevin Bourrillion (Guava lead developer) compares immutable / unmodifiable collections in this presentation. While the presentation is two years old, and focuses on "Google Collections" (which is now a subpart of Guava), this is a very interesting presentation. The API may have changed here and there (the Google Collections API was in Beta at the time), but the concepts behind Google Collections / Guava are still valid.
You might also be interested in this other SO question ( What is the difference between google's ImmutableList and Collections.unmodifiableList() ).
A difference between the two not stated in other answers is that ImmutableSet does not permit null values, as described in the Javadoc:
A high-performance, immutable Set with reliable, user-specified iteration order. Does not permit null elements.
(The same restriction applies to values in all Guava immutable collections.)
For example:
ImmutableSet.of(null);
ImmutableSet.builder().add("Hi").add(null); // Fails in the Builder.
ImmutableSet.copyOf(Arrays.asList("Hi", null));
All of these fail at runtime. In contrast:
Collections.unmodifiableSet(new HashSet<>(Arrays.asList("Hi", null)));
This is fine.
In C#, I just got the need of having an immutable list, meaning that the list can not be changed.
Much like in Java's immutable list: https://www.geeksforgeeks.org/immutable-list-in-java/
From there:
If any attempt is made to add null element in List,
UnsupportedOperationException is thrown.
Now, with .NET (at least with Core 2.2) there is also an immutable list, documented here.
They say (emphasis mine):
When you add or remove items from an immutable list, a copy of the
original list is made with the items added or removed, and the
original list is unchanged.
So, this implementation basically allows changing the list (by getting a manipulated copy each time), as opposed to the Java understanding, and what's more, the copies will mostly go undetected, clogging memory.
What's the point in having an immutable list that supports add and remove methods in the first place?
The problem for me here is that users of my code would get a list, presumably immutable, but out of negligence would happily add items, which will never make it to the original "repository". This will cause confusion.
I guess the (only) way to go here, to forbid manipulation entirely and make it clear to the code user, would be to use the IEnumerable interface?
What's the point in having an immutable list that supports add and
remove methods in the first place?
None, other than conforming to the List contract: even an immutable implementation exposes every List method.
That leaves two ways to cope with these modification methods: throwing an exception, or guaranteeing immutability by creating and returning a new List for each modification.
About :
I guess the (only) way to go here, to forbid manipulation entirely,
would be to use the IEnumerable interface?
Indeed, in Java you use Iterable (that is close enough) when you want to be able to manipulate a collection of things without a way to change it.
As an alternative, you can also use an array.
As you said: "a copy of the original list is made with the items added or removed, and the original list is unchanged.".
So you can add/remove elements and a new list is made with the changes. The original list is unchanged.
What's the point in having an immutable list that supports add and remove methods in the first place?
First think of this: what is the point of an immutable list that doesn't support adding or removing items in any way? There is nothing particularly useful about that. You can use an array for that.
Now back to your question. The list is immutable, so consumers can't change the instance itself which was provided through some other method or class. The backing storage can't be altered by consumers! But the producer of the immutable list can 'alter' the backing store by creating a new immutable list and assigning that to the original variable. Isn't that useful!
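The "add returns a new list, the original is unchanged" behavior discussed above can be sketched in Java. This is only an illustration under stated assumptions: TinyImmutableList and its methods are invented names, it is not the .NET ImmutableList, and it copies the whole list on every add, which production libraries usually avoid.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical class for illustration; it is not the .NET ImmutableList.
final class TinyImmutableList<E> {
    private final List<E> items;

    private TinyImmutableList(List<E> items) {
        this.items = items;
    }

    static <E> TinyImmutableList<E> empty() {
        return new TinyImmutableList<>(Collections.emptyList());
    }

    // "Adding" copies the contents into a new instance; this instance is untouched.
    TinyImmutableList<E> add(E item) {
        List<E> copy = new ArrayList<>(items);
        copy.add(item);
        return new TinyImmutableList<>(Collections.unmodifiableList(copy));
    }

    int size() {
        return items.size();
    }

    public static void main(String[] args) {
        TinyImmutableList<String> original = TinyImmutableList.<String>empty().add("foo");
        TinyImmutableList<String> changed = original.add("bar");
        System.out.println(original.size()); // 1
        System.out.println(changed.size());  // 2
    }
}
```

A consumer who calls add and ignores the return value changes nothing, which is exactly the confusion the question worries about; the producer, on the other hand, can publish a new version by reassigning its own field.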
I have a number of Java classes that use private sets or lists internally. I want to be able to return these sets/lists using a get...List() method.
The alternatives I am considering:
return a reference to the internal object
construct a new set/list and fill it up (this seems bad practice?)
use Collections.unmodifiableList(partitions);
Which of these is the most common / best way to solve this issue?
There are many aspects to consider here. As others already have pointed out, the final decision depends on what your intention is, but some general statements regarding the three options:
1. return a reference to the internal object
This may pose problems. You can hardly ever guarantee a consistent state when you are doing this. The caller might obtain the list, and then do nasty things:
List<Element> list = object.getList();
list.clear();
list.add(null);
...
Maybe not with a malicious intention but accidentally, because he assumed that it was safe/allowed to do this.
2. construct a new set/list and fill it up (this seems bad practice?)
This is not a "bad practice" in general. In any case, it's by far the safest solution in terms of API design. The only caveat here may be that there might be a performance penalty, depending on several factors. E.g. how many elements are contained in the list, and how the returned list is used. Some (questionable?) patterns like this one
for (int i=0; i<object.getList().size(); i++)
{
Element element = object.getList().get(i);
...
}
might become prohibitively expensive (although one could argue whether in this particular case, it was the fault of the user who implemented it like that, the general issue remains valid)
3. use Collections.unmodifiableList(partitions);
This is what I personally use rather often. It's safe in the sense of API design, and involves only a negligible overhead compared to copying the list. However, it's important for the caller to know whether this list may change after he obtained a reference to it.
This leads to...
The most important recommendation:
Document what the method is doing! Don't write a comment like this
/**
* Returns the list of elements.
*
* @return The list of elements.
*/
public List<Element> getList() { ... }
Instead, specify what you can make sure about the list. For example
/**
* Returns a copy of the list of elements...
*/
or
/**
* Returns an unmodifiable view on the list of elements...
*/
Personally, I'm always torn between the two options that one has for this sort of documentation:
Make clear what the method is doing and how it may be used
Don't expose or overspecify implementation details
So for example, I'm frequently writing documentations like this one:
/**
* Returns an unmodifiable view on the list of elements.
* Changes in this object will be visible in the returned list.
*/
The second sentence is a clear and binding statement about the behavior. It's important for the caller to know that. For a concurrent application (and most applications are concurrent in one way or the other), this means that the caller has to assume that the list may change concurrently after he obtained the reference, which may lead to a ConcurrentModificationException when the change happens while he is iterating over the list.
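The ConcurrentModificationException mentioned above does not even need a second thread to show up. A minimal single-threaded sketch, assuming a hypothetical ElementOwner class whose getList() returns an unmodifiable view of an internal ArrayList:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.ConcurrentModificationException;
import java.util.List;

// Hypothetical owner class, named only for this sketch.
final class ElementOwner {
    private final List<String> elements = new ArrayList<>(List.of("a", "b"));

    /** Returns an unmodifiable view; changes in this object are visible in it. */
    List<String> getList() {
        return Collections.unmodifiableList(elements);
    }

    void addElement(String element) {
        elements.add(element);
    }

    // Returns true if iterating the view failed because the owner changed underneath it.
    static boolean iterationFailsWhenOwnerChanges() {
        ElementOwner owner = new ElementOwner();
        try {
            for (String element : owner.getList()) {
                owner.addElement(element + "!"); // owner changes while the caller iterates
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(iterationFailsWhenOwnerChanges()); // true
    }
}
```

The caller never modifies the view directly, yet the iteration breaks, which is why the "changes will be visible" sentence in the documentation matters.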
However, such detailed specifications limit the possibilities for changing the implementation afterwards. If you later decide to return a copy of the internal list, then the behavior will change in an incompatible way.
So sometimes I also explicitly specify that the behavior is not specified:
/**
* Returns an unmodifiable list of elements. It is unspecified whether
* changes in this object will be visible in the returned list. If you
* want to be informed about changes, you may attach a listener to this
* object using this-and-that method...
*/
These questions are mainly important when you intend to create a public API. Once you have implemented it one way or another, people will rely on that behavior.
So coming back to the first point: It always depends on what you want to achieve.
Your decision should be based on one thing (primarily)
Allow other methods to modify the original collection ?
Yes : return a reference of the internal object.
No :
construct a new set/list and fill it up (this seems bad practice? -- No. Not at all. This is called Defensive programming and is widely used).
use Collections.unmodifiableList(partitions);
return a reference to the internal object
In this case the receiving end is able to modify the object's set or list, which might not be what you want. If you do allow users to modify the state of the object, this is the simplest approach.
construct a new set/list and fill it up (this seems bad practice?)
This is an example of a shallow copy: the returned collection is separate, so changes to it do not affect the internal one, but the element objects are shared. So any change to an element's state will still affect the actual collection.
use Collections.unmodifiableList(partitions);
In this case it returns an unmodifiable view of the specified list. This method allows modules to provide users with "read-only" access to internal lists. This could be used as best practice in situation where you want to keep object's state safe.
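The three options can be compared side by side in a small sketch. The field name partitions is borrowed from the question; the class PartitionHolder and its method names are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical class; only the field name "partitions" comes from the question.
final class PartitionHolder {
    private final List<String> partitions = new ArrayList<>(List.of("p0"));

    List<String> getInternalList() { return partitions; }                       // option 1
    List<String> getCopy() { return new ArrayList<>(partitions); }              // option 2
    List<String> getView() { return Collections.unmodifiableList(partitions); } // option 3

    void addPartition(String partition) { partitions.add(partition); }

    public static void main(String[] args) {
        PartitionHolder holder = new PartitionHolder();
        List<String> copy = holder.getCopy();
        List<String> view = holder.getView();

        holder.addPartition("p1");
        System.out.println(copy.size()); // 1: the copy is a snapshot
        System.out.println(view.size()); // 2: the view follows the internal list

        copy.add("x"); // allowed, but only changes the caller's copy
        System.out.println(holder.getView().size()); // 2

        try {
            view.add("x");
        } catch (UnsupportedOperationException e) {
            System.out.println("view is read-only");
        }
    }
}
```

In all three cases the element objects themselves are shared, so mutable elements can still be changed by the caller.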
I believe the best solution is to return an unmodifiable list. If compared to the construction of a new list, returning an unmodifiable "proxy" of the original list may save the client from implicitly generating a lot of unnecessary lists. On the other hand, if the client really needs to have a modifiable list, let it create a new list by itself.
The problem you still have to consider is that the objects contained in the list may be modified. There is no cheap and easy const-correctness in Java.
The second option is definitely the right way to go.
The other two options depend on your requirements.
If you are not going to modify the list values outside the class, return an unmodifiable list.
otherwise, just return the reference.
Say you are adding x objects to a collection, and either before or after adding them you modify the objects' attributes. When would you add the element to the collection: before or after the object has been modified?
Option A)
public static void addToCollection(List<MyObject> objects) {
MyObject newObject = new MyObject();
objects.add(newObject);
newObject.setMyAttr("ok");
}
Option B)
public static void addToCollection(List<MyObject> objects) {
MyObject newObject = new MyObject();
newObject.setMyAttr("ok");
objects.add(newObject);
}
To be on the safe side, you should modify before adding, unless there is a specific reason you cannot do this, and you know the collection can handle the modification. The example can reasonably be assumed to be safe, since the general List contract does not depend upon object attributes - but that says nothing about specific implementations, which may have additional behavior that depends upon the object's value.
TreeSet, and Maps in general, do not tolerate modifying objects after they have been inserted, because the structure of the collection depends on the attributes of the object. For trees, any attributes used by the comparator cannot be changed once the item has been added. For hash-based maps, it's the key's hashCode that must remain constant.
So, in general, modify first, and then add. This becomes even more important with concurrent collections, since adding first can lead to other collection users seeing an object before it has been assigned its final state.
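As a rough sketch of the tree case, the following uses a TreeSet ordered by a mutable field. Box and TreeSetMutationDemo are hypothetical names invented for this example.

```java
import java.util.Comparator;
import java.util.TreeSet;

// Hypothetical demo class for this sketch.
final class TreeSetMutationDemo {

    // Hypothetical mutable element type; the comparator below sorts by 'value'.
    static final class Box {
        int value;
        Box(int value) { this.value = value; }
    }

    // Returns whether the set still finds an element whose sort key changed after insertion.
    static boolean foundAfterMutation() {
        TreeSet<Box> set = new TreeSet<>(Comparator.comparingInt((Box b) -> b.value));
        Box a = new Box(1);
        set.add(a);
        set.add(new Box(2));
        set.add(new Box(3));

        a.value = 10; // the tree still holds 'a' in the position for value 1

        return set.contains(a);
    }

    public static void main(String[] args) {
        System.out.println(foundAfterMutation()); // false
    }
}
```

The lookup walks the tree using the new value, goes right where the element sits on the left, and reports that the element is missing even though it was never removed.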
The example you provided won't have any issues because you're using a List collection which doesn't care about the Object contents.
If you were using something like TreeMap, which internally sorts the keys it stores, it could cause the collection to get into an unexpected state. Again, this depends on whether the comparison (compareTo or the Comparator) uses the attribute you're changing.
The safest way is to modify the object before placing it into the collection.
One of the good design rules to follow is not to expose a half-constructed object to a 3rd party subsystem.
So, according to this rule, initialize your object to the best of your abilities and then add it to the list.
If objects is an ArrayList then the net result is probably the same; however, imagine that objects is a special flavor of List that fires some kind of notification event every time a new object is added to it. Then the order will matter greatly.
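A sketch of such a notifying list shows the difference between Option A and Option B. NotifyingList and this MyObject are hypothetical stand-ins written for the example; the question does not show the real MyObject.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical demo class for this sketch.
final class AddOrderDemo {

    // Hypothetical stand-in for the question's MyObject.
    static final class MyObject {
        private String myAttr;
        void setMyAttr(String myAttr) { this.myAttr = myAttr; }
        String getMyAttr() { return myAttr; }
    }

    // Hypothetical list that notifies a listener on every add.
    static final class NotifyingList<E> extends ArrayList<E> {
        private final Consumer<E> listener;

        NotifyingList(Consumer<E> listener) { this.listener = listener; }

        @Override
        public boolean add(E element) {
            boolean added = super.add(element);
            listener.accept(element);
            return added;
        }
    }

    // Returns what the listener saw for Option A (add first) and Option B (modify first).
    static List<String> observedAttrs() {
        List<String> seen = new ArrayList<>();
        List<MyObject> objects =
                new NotifyingList<MyObject>(o -> seen.add(String.valueOf(o.getMyAttr())));

        MyObject a = new MyObject(); // Option A: add, then modify
        objects.add(a);
        a.setMyAttr("ok");

        MyObject b = new MyObject(); // Option B: modify, then add
        b.setMyAttr("ok");
        objects.add(b);

        return seen;
    }

    public static void main(String[] args) {
        System.out.println(observedAttrs()); // [null, ok]
    }
}
```

With Option A the listener is handed an object whose attribute has not been set yet; with Option B it sees the finished object.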
In my opinion it depends on the attribute being set and the type of collection. If the collection is a Set and the attribute influences equals or hashCode, then I would definitely set the property first; the same goes for sorted lists and so on. In other cases it is irrelevant. But for this example, where the object is created, I would first set the attributes and then add it to the collection, because the code is better organized.
I think either way it's the same, personally I like B, :)
It really does boil down to what the situation requires. Functionally there's no difference.
One thing you should be careful with, is being sure you have the correct handle to the object you want to modify.
Certainly in this instance, modifying the object is part of the "create the object" thought, and so should be grouped with the constructor as such. After you "create the object" you "add it to the collection". Thus, I would do B, and maybe even add a blank line after the modification to give more emphasis on the two separate thoughts.
The possible answers are either "never" or "it depends".
Personally, I would say, it depends.
Following usage would make a collection appear (to me) to be a flyweight:
public final static List<Integer> SOME_LIST =
Collections.unmodifiableList(
new LinkedList<Integer>(){ // scope begins
{
add(1);
add(2);
add(3);
}
} // scope ends
);
Right? You can't ever change it, because the only place where the
"original" collection object is known (which could be changed), is the
scope inside unmodifiableList's parameter list, which ends immediately.
Second thing is: when you retrieve an element from the list, it's an
Integer which itself is a flyweight.
Other obvious cases where final static and unmodifiableList are
not used, would not be considered as flyweights.
Did I miss something?
Do I have to consider some internal aspects of LinkedList which could
compromise the flyweight?
i think you are referring to the flyweight pattern. the fundamental idea of this pattern is that you are dealing with complex objects whose instances can be reused, and put out different representations with its methods.
to make such an object work correctly it should be immutable.
immutability is clearly given when creating a List the way you described.
but since there are no external objects/parameters that SOME_LIST operates on, i would not call this an example of a flyweight pattern.
another typical property of the flyweight pattern is the "interning" of such objects. when creating just a single instance of an object this does not make sense.
if you are dealing a lot with lists that are passed around from one object to another and you want to ensure the Immutability, a better option might be to use Google-Collections.
final static ImmutableList<Integer> someList = ImmutableList.of(1, 2, 3);
of course it is also possible to construct more complex Immutable Objects with Builders.
this creates an instance of an immutable list. it will still implement the List interface, but will refuse to execute any add(), addAll(), set(), remove() operation.
so you can still pass it to methods when a List interface is required, yet be sure that its content is not altered.
I think your example is about immutable objects; a flyweight is something quite different. Immutable objects are candidates for flyweight, but a flyweight doesn't have to be immutable, it just has to be designed to save memory.
Having the library detect that the mutable List has not otherwise escaped is a bit of an ask, although theoretically possible.
If you serialise the returned object, then trusted code could view the internal object. Although the serialised forms of the classes are documented, it's not documented that the method uses those classes.
In practical terms, any cache is down to the user of the API.
(Why LinkedList for an immutable list, btw? Other than it changes the unmodifiable implementation.)
Integer is only a flyweight from -128 to 127.
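A small sketch of that range limit, assuming default JVM settings. The class name IntegerCacheDemo is invented for illustration. Only the -128 to 127 range is guaranteed to be cached by Integer.valueOf, so the example makes no firm claim about values outside it.

```java
// Hypothetical demo class for this sketch.
final class IntegerCacheDemo {

    // Compares references, not values: true only if valueOf hands out the same instance twice.
    static boolean sameInstance(int value) {
        Integer first = Integer.valueOf(value);
        Integer second = Integer.valueOf(value);
        return first == second;
    }

    public static void main(String[] args) {
        System.out.println(sameInstance(127));  // true: inside the guaranteed cache range
        System.out.println(sameInstance(-128)); // true: inside the guaranteed cache range
        System.out.println(sameInstance(128));  // outside the guaranteed range; may be either
    }
}
```

So the elements of SOME_LIST (1, 2 and 3) are shared instances, but a list of larger numbers would not necessarily be.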
See also http://www.javaworld.com/javaworld/jw-07-2003/jw-0725-designpatterns.html.
The method concat() does not modify the original value. It returns a new value.
like this:
String str = "good";
str.concat("ness");
System.out.println(str); //"good"
But some methods modify the original value. Why?
In Groovy:
def languages = ["Java", "Groovy", "JRuby"]
languages.reverse()
===> [JRuby, Groovy, Java]
println languages
===> [Java, Groovy, JRuby]
languages.sort()
===> [Groovy, JRuby, Java]
println languages
===> [Groovy, JRuby, Java]
String is immutable in Java. Any method that "modifies" a String must return a new instance of String.
From the Java API Specifications for the String class:
Strings are constant; their values
cannot be changed after they are
created.
The Java Language Specifications defines this behavior in Section 4.3.3: The Class String.
Response to the edit:
It appears that an example in Groovy has been added. (I haven't used Groovy before, so my understanding of it may not be correct.)
From what I understand from looking at the example, there seems to be a languages list that is being reverse-ed and sort-ed -- those operations themselves do not modify the String objects contained in the list, but are acting upon the list itself.
Whether a list operation returns a new list, or modifies the list in place, is not related to the behavior of the String objects themselves.
The Java API was designed by many many different people, as such it's hard to keep everything consistent. I believe people generally accept that immutability (i.e., the internal states should not change) is a good thing now though, at least where value objects are concerned.
Another similar question would be, "why are indexes sometimes 0-based (most of the time), and sometimes 1-based (JDBC)?" Again, I believe it's another situation of the API being too broad, and developers of different APIs not coordinating (I could be wrong here though; if anyone knows the real reason for JDBC being 1-based, please let me know).
I think you mean str.concat("ness") instead. In this particular example with Strings, no method can mutate the object because Strings are designed to be immutable. In the library, you will find many methods that mutate the state of the object (e.g. StringBuffer.replace()) and others that don't (e.g. String.replace()). You'll have to read the API carefully to determine which is the case. Ultimately, this is a choice made by the library designer, who has to consider the functionality, ease of use, and conventions associated with the package he or she is writing.
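The two replace methods named above can be put next to each other in a short sketch. The class and helper names are made up for the example; String.replace and StringBuffer.replace are the real JDK methods.

```java
// Hypothetical demo class for this sketch.
final class ReplaceDemo {

    // String.replace returns a new String; the receiver keeps its value.
    static String stringAfterReplace() {
        String s = "good";
        s.replace("good", "bad"); // result is discarded on purpose
        return s;
    }

    // StringBuffer.replace changes the buffer itself.
    static String bufferAfterReplace() {
        StringBuffer sb = new StringBuffer("good");
        sb.replace(0, 4, "bad"); // replaces characters 0..3 in place
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(stringAfterReplace()); // good
        System.out.println(bufferAfterReplace()); // bad
    }
}
```

The method names are identical, so only the documentation (or the return type and the class's mutability) tells you which behavior you get.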
Because there are immutable and mutable classes.
String, as another answer points out, is an immutable class. Their value always stays the same once a String is created.
If you have an ArrayList<Integer> object, you can use its add function to add another Integer to the list. The add function changes the list in-place, instead of returning a new list. An ArrayList is mutable.
Response to Edit:
For your Groovy example, probably its designers sat down and noticed that more often one would want a new list that contains the reversed result, and keep the old list untouched. (Why? I don't know.) On the other side, they may have noticed there are more cases where you do not want a new list containing the sorted result, so it does its job in-place. But I don't know and haven't used Groovy before, so this is just a guess.
In Ruby, I have heard there is a convention for this: methods that change objects in-place have an exclamation mark after their name, and methods that return the result as a new object have none:
newObj = obj.sort # new sorted list is returned
obj.sort! # obj is sorted in-place
To some extent this also has to do with programming style. Not changing the original object and creating new copies to reflect the changes is an idiom for safe programming. I believe Josh Bloch mentioned it in his book "Effective Java" (The first edition). Though I cannot remember the exact term he used for it.
In the case of String it returns a new object because String is immutable. However, across the Java API, you will see some places where the original object is changed and some places where a new object is returned. As someone pointed out earlier, it is because different people have worked on the API, and they bring their own programming styles.
On a slightly different note: keeping objects immutable adds safety to the code, and it also allows us to code in a certain way.
(new Date()).add(new Month(7)).add(new Day(4))
If every method on Date returns a new object instead of changing its own state, then we can write such code. This makes programs very readable.
However, keeping objects immutable may reduce the performance of the system if we have large objects.
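The chained Date example above is pseudocode, but the same style works with the immutable java.time.LocalDate class (Java 8 and later). This is a swapped-in illustration, not the API the answer had in mind; FluentDateDemo and shifted are invented names.

```java
import java.time.LocalDate;

// Hypothetical demo class for this sketch.
final class FluentDateDemo {

    // Each plus... call returns a new LocalDate, so the calls can be chained.
    static LocalDate shifted(LocalDate start) {
        return start.plusMonths(7).plusDays(4);
    }

    public static void main(String[] args) {
        LocalDate start = LocalDate.of(2020, 1, 1);
        LocalDate later = shifted(start);
        System.out.println(start); // 2020-01-01
        System.out.println(later); // 2020-08-05
    }
}
```

Every step allocates a new object, which is the performance cost mentioned above; for small value objects like dates it is usually negligible.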