I downloaded the source code for Google Guava just out of curiosity to see what is backing the immutable collections.
I was going through the ImmutableList and I noticed that it was still backed by an old-fashioned array, as in Object[]
So I'm just curious, the Object[] array is probably mutable by its nature, and it is not threadsafe. I already knew that ArrayList and CopyOnWriteArrayList were backed by an Object[] array, but they are mutable.
So is the ImmutableList immutable and threadsafe only because its internals are well encapsulated, so that nothing can modify the array after construction? Will there ever be a day when low-level arrays like this are swapped out for something better and less legacy, something inherently immutable and final rather than immutable by careful encapsulation?
Yes, the ImmutableList is "only" immutable because it does not allow its internal backing array to be modified.
That is exactly the same situation as for a java.lang.String, which also wraps a private char[]. Along the same lines, the very useful concurrency libraries are largely implemented as regular Java classes relying on only very few (and very basic) JVM synchronization primitives.
So that should be good enough. Obviously, people writing these library classes have to be careful and knowledgeable, but this requirement does not magically go away by moving the code even more "low-level" into JVM primitives. (Agreed, you can shoot yourself in the foot using dark voodoo like reflection now, but that hardly happens by accident, and in regular usage a "user-land" implementation works just as well.)
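As a minimal sketch of "immutable by encapsulation": the class below is hypothetical (FrozenArray is not Guava's implementation), and it only assumes that the array is copied on the way in and never handed out. The backing Object[] stays mutable, but no outside code can reach it.

```java
import java.util.Arrays;

// Hypothetical class for illustration; this is not how Guava is written.
final class FrozenArray {
    // The array itself is still mutable; only this class can reach it.
    private final Object[] elements;

    FrozenArray(Object... source) {
        // Copy on the way in so the caller's array is never shared.
        this.elements = Arrays.copyOf(source, source.length);
    }

    Object get(int index) {
        return elements[index];
    }

    int size() {
        return elements.length;
    }

    Object[] toArray() {
        // Copy on the way out so callers never see the backing array.
        return elements.clone();
    }

    public static void main(String[] args) {
        Object[] source = {"a", "b"};
        FrozenArray frozen = new FrozenArray(source);
        source[0] = "changed";            // only the caller's array changes
        frozen.toArray()[1] = "changed";  // only a copy changes
        System.out.println(frozen.get(0) + " " + frozen.get(1)); // prints "a b"
    }
}
```

Because the field is final and is never written after construction, other threads can read such an object safely once they hold a reference to it.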
Will there ever be a day
That is pure speculation. But there is apparently work on "value types", which is a related topic.
I came across newArrayListWithCapacity(int) (1) in the codebase and was wondering if it has any advantages.
From what I understand, if we don't know the size, new ArrayList is going to serve the purpose, and if we do know the exact size, we can simply use an array. I'm trying to understand the benefits of using newArrayListWithCapacity(int). And how is it different from newArrayListWithExpectedSize?
If I define the expected size as x and if I end up having y number of entries, does it adversely affect the performance?
1: https://guava.dev/releases/15.0/api/docs/com/google/common/collect/Lists.html#newArrayListWithCapacity(int)
@GwtCompatible(serializable=true)
public static <E> ArrayList<E> newArrayListWithCapacity(int initialArraySize)
Creates an ArrayList instance backed by an array of the exact size specified; equivalent to ArrayList.ArrayList(int).
Note: if you know the exact size your list will be, consider using a fixed-size list (Arrays.asList(Object[])) or an ImmutableList instead of a growable ArrayList.
Note: If you have only an estimate of the eventual size of the list, consider padding this estimate by a suitable amount, or simply use newArrayListWithExpectedSize(int) instead.
The method and the constructor do the same thing, which is to provide an ArrayList with a specific capacity, meaning that the initial backing array is created with the provided size.
Guava wraps the ArrayList(int) constructor in a factory method because the constructor, while well documented, can still confuse people. The Guava team did the same for other collections, with factory methods such as Sets.newHashSetWithExpectedSize(int) and Maps.newHashMapWithExpectedSize(int).
The pros are the following:
You have an explicit factory method name. This is good because not everyone knows what the constructors mean. So having a factory method that says what it creates is handy for people not entirely familiar with the API, but who are reading the code.
You don't have to handle generics in any way. Those are handled automatically for you through the generic method mechanism. Guava added this method back in the Java 6 days, when you had to write out <FullTypeName> in every constructor call (such as List<String> strings = new ArrayList<String>(10)). Nowadays you can simply use <> (such as List<String> strings = new ArrayList<>(10)), but at the time the gain was huge!
The cons are:
You pay for one extra method call. But usually, when you're already dealing with Java's standard API, performance is not what you're looking for. Some other libraries provide high performance collections, and neither the Java Collections nor Guava's collections are serious contenders there. Also, as MikeFHay mentions in the comments below, the method call will likely be inlined in modern JVMs.
You depend on Google Guava. This is not really a letdown because Guava is a fantastic library, but if this is the only reason you use Guava, it might be.
The Guava team made that method because the pros clearly outweigh the cons in their view (which I personally share).
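On the question of what happens when the expected size is x and you end up with y entries, here is a small sketch using only the plain ArrayList(int) constructor that the factory method is documented to be equivalent to. The class name CapacityDemo and the helper fill are made up for illustration. The capacity is only a hint: guessing too low costs some extra array copies as the list grows, guessing too high wastes some memory, and the contents are the same either way.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; not part of Guava or the JDK.
final class CapacityDemo {
    // Fills a list pre-sized for `expected` elements with `actual` elements.
    static List<Integer> fill(int expected, int actual) {
        List<Integer> list = new ArrayList<>(expected); // capacity hint only
        for (int i = 0; i < actual; i++) {
            list.add(i); // the backing array grows if the hint was too small
        }
        return list;
    }

    public static void main(String[] args) {
        // Capacity is not size: a freshly created list is still empty.
        System.out.println(new ArrayList<String>(10).size()); // prints 0
        // Underestimating is safe; the list simply resizes itself.
        System.out.println(fill(4, 10).size()); // prints 10
    }
}
```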
I have been looking for options to implement a mutable sorted map in Scala. I know I can store my data in a mutable map and then transform it into a sorted map if needed, or wrap the TreeMap from Java. However, does anyone know why this is not implemented in Scala? Is it against functional programming style?
Regards
There is no reason other than that nobody had written one. In fact, a mutable sorted map was added in Scala 2.12.x.
There's some discussion in this old answer about possible reasons there isn't an implementation.
With regards to your second question, there are other mutable collections in Scala so I don't see any hard reason there couldn't be a mutable sorted map (see the older question as well). In a more general sense, functional programming can be taken to imply that mutable data is not used, and in this case a mutable sorted map would be avoided. However mutable collections may well be used "behind the scenes" in a library to improve performance, as long as they won't be visible to users of the library.
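For reference, this is roughly what the java.util.TreeMap mentioned in the question offers: a map that can be changed in place and always iterates in key order. The sketch is in Java rather than Scala, and the class name and sample keys are made up.

```java
import java.util.TreeMap;

// Hypothetical demo class showing java.util.TreeMap as a mutable sorted map.
final class SortedMapDemo {
    static TreeMap<String, Integer> build() {
        TreeMap<String, Integer> map = new TreeMap<>();
        map.put("pear", 3);
        map.put("apple", 1);
        map.put("fig", 2);
        map.remove("fig"); // in-place mutation; key order is maintained
        return map;
    }

    public static void main(String[] args) {
        System.out.println(build()); // prints {apple=1, pear=3}
    }
}
```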
I have read a lot of places where it is written that immutability in java is a desired feature. Why is this so?
Immutability makes it easier to reason about the life cycle of the object. It is especially useful in multi-threaded programs as it makes it simpler to share between threads.
Some data structures, e.g. Maps and Sets, assume that their keys or elements are immutable, or at least are not changed in critical ways. They don't have to be strictly immutable, but it makes things a lot easier if they are.
The downside of immutable objects is that it makes recycling them much harder, which can significantly impact performance.
In short, if performance is an issue, consider mutable objects; if not, use immutable objects as much as possible.
The main benefit of immutability is that an object cannot be changed "out from under you". Eg, consider a String that is a key to a Map. If it were mutable then one could insert a key/value pair then modify the String that is the key. This would break the hash scheme used in the map, and lead to some potentially very unpredictable operation.
In some cases such mutability could lead to security exposures, due to intentional abuse, but more commonly it would simply lead to mysterious bugs. And "defensive" copying of the mutable objects (to avoid such bugs) would lead to many more objects being created (especially if a formulaic approach to the problem is taken).
(On the other hand, lots of objects are created, eg, because a new String is created when someone appends to the String, or "chops off" a part of it, vs simply mutating the existing String, so both approaches end up causing more objects to be created, for different reasons.)
Immutability is highly desired when you need to know that an object hasn't changed when you don't expect it to. Consider the keys in a hash map. If you use a mutable object as a key, and the map stores it at the key's current hash value (mod hash array size), what happens when you change it? The hash changes and suddenly it's in the wrong spot! You (the HashMap) don't know that it has changed, so you don't know to update its location, so the next time someone tries to look up that key, you won't find it. Immutability removes this problem: if you put an object somewhere based on its hash, you know it will always be where you expect it to be.
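The failure described above is easy to reproduce. In this sketch a mutable ArrayList is used as a HashMap key; the class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical demo class: what happens when a key changes inside a HashMap.
final class MutableKeyDemo {
    // Returns whether the key can still be found after it has been mutated.
    static boolean lookupAfterMutation() {
        Map<List<String>, String> map = new HashMap<>();
        List<String> key = new ArrayList<>();
        key.add("a");
        map.put(key, "value"); // stored under the hash of ["a"]
        key.add("b");          // the key's hashCode() is now different
        return map.containsKey(key);
    }

    public static void main(String[] args) {
        // The entry is still in the map, but it can no longer be looked up.
        System.out.println(lookupAfterMutation()); // prints false
    }
}
```

With an immutable key such as String, the second step is impossible, so the entry can always be found again.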
And, as pointed out elsewhere, being immutable means it's safe to read concurrently from multiple threads without synchronization.
In programming, an immutable class or object is one whose state cannot be modified after it is created. In some cases objects are still considered immutable even if some of their attributes (fields) can change, as long as the observable nature of the object stays the same.
Immutable objects are often desired for three main reasons:
Thread-safe
Higher security than mutable objects
Simplicity
For a better understanding of immutability in Java classes and objects, Effective Java is a good reference:
Effective Java Item 15: Immutable classes are easier to design, implement, and use than mutable classes. They are less prone to error and are more secure.
I have been using Guava for some time now and truly trusted it, until I stumbled on an example yesterday that got me thinking. Long story short, here it is:
public static void testGuavaImmutability() {
    StringBuilder stringBuilder = new StringBuilder("partOne");
    ImmutableList<StringBuilder> myList = ImmutableList.of(stringBuilder);
    System.out.println(myList.get(0));
    stringBuilder.append("appended");
    System.out.println(myList.get(0));
}
After running this you can see that the value of an entry inside an ImmutableList has changed. If two threads were involved here, one might not see the update made by the other.
Also the thing that makes me very impatient for an answer is that Item15 in Effective Java, point five says this:
Make defensive copies in the constructor, which seems pretty logical.
Looking at the source code of the ImmutableList, I see this:
SingletonImmutableList(E element) {
    this.element = checkNotNull(element);
}
So, no copy is actually made, although I have no idea how a generic deep copy would be implemented in such a case (maybe serialization?).
So.. why are they called Immutable then?
What you're getting at here is the difference between immutable and deeply immutable.
An immutable object will never change, but anything that it refers to might change. Deep immutability is much stronger: neither the base object nor any object you can navigate to from it will change.
Each is appropriate in its own situations. When you create your own class that has a field of type Date, that date is owned by your object; it's truly a part of it. Therefore, you should make defensive copies of it (on the way in and the way out!) to provide deep immutability.
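A minimal sketch of the Date case, with copies made on the way in and on the way out. The class name Appointment and its field are hypothetical.

```java
import java.util.Date;

// Hypothetical class that owns a mutable Date and protects it with copies.
final class Appointment {
    private final Date start;

    Appointment(Date start) {
        // Copy on the way in: later changes to the caller's Date don't leak in.
        this.start = new Date(start.getTime());
    }

    Date getStart() {
        // Copy on the way out: callers can't mutate our internal Date.
        return new Date(start.getTime());
    }

    public static void main(String[] args) {
        Date original = new Date(0L);
        Appointment appointment = new Appointment(original);
        original.setTime(1000L);                   // caller mutates its own Date
        appointment.getStart().setTime(2000L);     // caller mutates a returned copy
        System.out.println(appointment.getStart().getTime()); // prints 0
    }
}
```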
But a collection does not really "own" its elements. Their states are not considered part of the collection's state; it is a different type of class -- a container. (Furthermore, as you allude, it has no deep knowledge of what element type is being used, so it wouldn't know how to copy the elements anyway.)
Another answer states that the Guava collections should have used the term unmodifiable. But there is a very well-defined difference between the terms unmodifiable and immutable in the context of collections, and it has nothing to do with shallow vs. deep immutability. "Unmodifiable" says you cannot change this instance, via the reference you have; "immutable" means this instance cannot change, period, whether by you or any other actor.
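The distinction can be shown with the JDK alone. In this sketch (class and method names are made up), an unmodifiable view still reflects changes made through the original reference, while a wrapper around a private copy cannot change, because nobody else holds a reference to the copy. Guava's ImmutableList.copyOf likewise works from a copy rather than a view.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical demo class contrasting an unmodifiable view with a copy.
final class ViewVersusCopy {
    // Returns {size seen through the view, size seen through the snapshot}.
    static int[] sizesAfterBackingListGrows() {
        List<String> backing = new ArrayList<>();
        backing.add("a");
        // Unmodifiable: a read-only view over a list someone else can change.
        List<String> view = Collections.unmodifiableList(backing);
        // Effectively immutable: a read-only view over a private copy.
        List<String> snapshot = Collections.unmodifiableList(new ArrayList<>(backing));
        backing.add("b");
        return new int[] {view.size(), snapshot.size()};
    }

    public static void main(String[] args) {
        int[] sizes = sizesAfterBackingListGrows();
        System.out.println(sizes[0]); // prints 2: the view changed
        System.out.println(sizes[1]); // prints 1: the snapshot did not
    }
}
```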
The list itself is immutable because you cannot add/remove elements. The elements are on their own regarding immutability. In more precise terms, we have definitions from a historical Java 1.4.2 document:
Collections that do not support any modification operations (such as add, remove and clear) are referred to as unmodifiable. Collections that are not unmodifiable are referred to modifiable.
Collections that additionally guarantee that no change in the Collection object will ever be visible are referred to as immutable. Collections that are not immutable are referred to as mutable.
Note that for these definitions to make any sense we must assume an implicit distinction between a collection in an abstract sense and an object that represents that collection. This is important because the object that represents an immutable collection is not itself immutable by any standard definition of that term. For example, its equals relation has no temporal consistency, a vital requirement on immutable objects.
As for defensive copying, note that it is an ill-defined problem in general, and there will never be a general immutable collection in Java that manages to defensively copy its elements. Note additionally that such a collection would be less useful than the immutable collections that really exist: when you put an object into a collection, in 99.99% of cases you want that very object to be there, not some other object that is not even equal to it.
There is a quite standard definition of object immutability (as opposed to collection immutability) which assumes transitive immutability of the whole object graph reachable from the immutable object. Taken too literally, though, such a definition will almost never be satisfied in the real world. Two cases in point:
nothing is immutable in the face of reflection. Even final fields are writable.
even String, that bastion of immutability, has been proven mutable outside the Java sandbox (without a SecurityManager, which covers 99% of real-world Java programs).
You mix the immutability of the list and the immutability of the objects it contains.
In an immutable collection you cannot add/remove objects, but if the objects it contains are mutable you can modify them after get()ing them.
I want to have all the efficiencies of EnumSet and pass it around without worrying that somebody would modify it.
You can get an immutable EnumSet with Google collections (Guava).
Resources :
Guava home page
Google-collections bug tracker - immutable enum set convenience constructor
Google documentation - Sets.immutableEnumSet()
What's wrong with Collections.unmodifiableSet() wrapping an EnumSet?
True, the original EnumSet is still mutable, but as long as you discard the original reference, it's as good as immutable inside the wrapper.
edit: OK, since EnumSet doesn't offer any instance methods over and above the Set interface, the only reason for not using this solution is that the EnumSet type is useful for documentation purposes, and you lose that when wrapping it in a Set. Other than that, EnumSet behaviour will be preserved.
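A short sketch of the wrapper approach, where the only reference to the underlying EnumSet is discarded immediately. The Color enum and the class name are made up for illustration.

```java
import java.util.Collections;
import java.util.EnumSet;
import java.util.Set;

// Hypothetical demo class: a read-only Set backed by an EnumSet.
final class ReadOnlyEnumSetDemo {
    enum Color { RED, GREEN, BLUE } // hypothetical enum

    // The EnumSet is created inline, so no mutable reference escapes.
    static Set<Color> warmColors() {
        return Collections.unmodifiableSet(EnumSet.of(Color.RED));
    }

    // Returns true if the wrapper refuses to add an element.
    static boolean rejectsAdd() {
        try {
            warmColors().add(Color.BLUE);
            return false;
        } catch (UnsupportedOperationException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(warmColors().contains(Color.RED)); // prints true
        System.out.println(rejectsAdd());                     // prints true
    }
}
```

The declared type is Set rather than EnumSet, which is the documentation trade-off mentioned above.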