I have read in a lot of places that immutability in Java is a desirable feature. Why is this so?
Immutability makes it easier to reason about the life cycle of an object. It is especially useful in multi-threaded programs, as it makes objects simpler to share between threads.
Some data structures assume their keys or elements are immutable, or at least are not changed in critical ways, e.g. maps and sets. They don't have to be strictly immutable, but it makes things a lot easier if they are.
The downside of immutable objects is that recycling them is much harder, which can significantly impact performance.
In short, if performance is an issue, consider mutable objects; if not, use immutable objects as much as possible.
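To make that concrete, here is a minimal sketch of an immutable class (the Money class and its fields are illustrative, not from any particular library): every field is final, there are no setters, and "modifying" operations return a new instance.

final class Money {
    private final String currency;
    private final long cents;

    Money(String currency, long cents) {
        this.currency = currency;
        this.cents = cents;
    }

    String currency() { return currency; }
    long cents() { return cents; }

    // A "mutator" returns a new instance instead of changing this one,
    // so existing instances can be shared freely across threads.
    Money plus(long moreCents) {
        return new Money(currency, cents + moreCents);
    }
}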
The main benefit of immutability is that an object cannot be changed "out from under you". E.g., consider a String that is a key to a Map. If it were mutable, one could insert a key/value pair and then modify the String that is the key. This would break the hash scheme used in the map and lead to some potentially very unpredictable behavior.
In some cases such mutability could lead to security exposures, due to intentional abuse, but more commonly it would simply lead to mysterious bugs. And "defensive" copying of the mutable objects (to avoid such bugs) would lead to many more objects being created (especially if a formulaic approach to the problem is taken).
(On the other hand, immutability causes lots of objects to be created too: e.g., a new String is created when someone appends to a String or "chops off" a part of it, versus simply mutating the existing String. So both approaches end up causing more objects to be created, for different reasons.)
Immutability is highly desired when you need to know that an object hasn't changed when you don't expect it to. Consider the keys in a hash map. If you use a mutable object as a key, store the entry at the key's current hash value (mod hash array size), and then change the key, what happens? The hash changes and suddenly the entry is in the wrong spot! You (the HashMap) don't know that it has changed, so you don't know to update its location, and the next time someone tries to look up that key, you won't find it. Immutability removes this problem: if you put an object somewhere based on its hash, you know it will always be where you expect it to be.
And, as pointed out elsewhere, being immutable means it's safe to read concurrently from multiple threads without synchronization.
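A small, self-contained demo of the failure mode both answers describe (MutableKey is a deliberately broken, hypothetical key class):

import java.util.HashMap;
import java.util.Map;

final class MutableKey {
    int id;
    MutableKey(int id) { this.id = id; }
    @Override public boolean equals(Object o) {
        return o instanceof MutableKey && ((MutableKey) o).id == id;
    }
    @Override public int hashCode() { return id; }
}

public class MutableKeyDemo {
    public static void main(String[] args) {
        Map<MutableKey, String> map = new HashMap<>();
        MutableKey key = new MutableKey(1);
        map.put(key, "value");

        key.id = 2;  // mutate the key after insertion

        // The entry sits in the bucket for hash 1, but the stored key
        // now reports hash 2, so neither lookup can find it.
        System.out.println(map.get(key));               // null: looks in the bucket for hash 2
        System.out.println(map.get(new MutableKey(1))); // null: right bucket, but equals() fails
    }
}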
In programming, an immutable class or object is one whose state cannot be modified after it is created. In some cases an object is still considered immutable even if some of its attributes (fields) can change, as long as the object's observable nature stays the same.
Immutable objects are often desired for three main reasons:
- Thread safety
- Higher security than mutable objects
- Simplicity
There is a reference you can review for a better understanding of immutability for Java classes and objects:
Effective Java, Item 15: "Immutable classes are easier to design, implement, and use than mutable classes. They are less prone to error and are more secure."
I was wondering why hashCode is implemented in the Object class, when its purpose is served only while using collections like HashMap. So shouldn't hashCode be declared in the interfaces implementing maps instead?
It's not accurate to say that the hashCode implementation is used in collections only.
In the Java API documentation, the general contract of hashCode is given as:
Whenever it is invoked on the same object more than once during an execution of a Java application, the hashCode method must consistently return the same integer, provided no information used in equals comparisons on the object is modified. This integer need not remain consistent from one execution of an application to another execution of the same application.

If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.

It is not required that if two objects are unequal according to the equals(java.lang.Object) method, then calling the hashCode method on each of the two objects must produce distinct integer results. However, the programmer should be aware that producing distinct integer results for unequal objects may improve the performance of hashtables.
So hashCode has to do only with Object. Collections just benefit from this feature for their own use cases, e.g. checking for objects with the same hash code, storing objects based on their hash code, etc.
Note: collections do not use the hashCode value to sort objects.
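As a sketch of what honoring this contract looks like in practice (Point is a hypothetical class, not from the question): the hash code is derived from exactly the fields that equals compares, so equal objects always produce equal hash codes.

import java.util.Objects;

final class Point {
    private final int x, y;

    Point(int x, int y) { this.x = x; this.y = y; }

    @Override public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Point)) return false;
        Point p = (Point) o;
        return x == p.x && y == p.y;
    }

    @Override public int hashCode() {
        return Objects.hash(x, y);  // same fields as equals()
    }
}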
The hashCode() method is used mainly in hash-based collections like HashMap and HashSet. The hash code returned by this method is used for the calculation of the hash index, or bucket index.
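Roughly, the table turns a hash code into a bucket index with bit arithmetic. The sketch below is modelled on what java.util.HashMap does internally; it is illustrative, not the actual JDK source.

public class BucketIndex {
    static int bucketIndex(Object key, int tableLength) {
        int h = key.hashCode();
        h ^= (h >>> 16);               // spread high bits into the low bits
        return (tableLength - 1) & h;  // tableLength is a power of two
    }
}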
A hashCode function is a necessity for any class, POJO, or bean where we need comparison or equality checks.
Suppose we need to compare two objects irrespective of the Collection API; there should be a way to achieve that.
If hashCode were not part of the Object class, it would be difficult to calculate the hash every time, and an extra burden.
It's a pragmatic design decision, but the question is essentially correct. A purist analysis would say that it's an example of interface bloat, or a "fat interface".
Java's java.lang.Object has more methods than are strictly required by (or even meaningful for) all objects. And it's not just hashCode().
It's arguable that the only method on Object that makes sense for all objects is getClass().
Not all applications are concurrent, let alone in need of their own monitors. So a purist object model would move notify(), notifyAll(), and the three versions of wait() to an interface called (say) Monitored, and then only permit synchronized to be used with objects implementing it.
It's very common for it to be either invalid or unnecessary to clone() objects, though that method is fortunately protected. Again, it would be best off in an interface, say Cloneable<T>.
Object identity comparison (are these references to the same object?) is provided by the intrinsic operator ==, so equals(Object) should (still being a purist) live in a ValueComparable<T> interface for objects that have that semantic (many don't).
Being very pure, even then you'd push hashCode() into another interface, say HashCodable.
finalize() could also be put in an interface, say HasFinalize. Indeed, that could make the garbage collector's life a bit easier, especially given that its use is so rare and specialized.
However, there was a clear design decision in Java to simplify things, and the designers put a number of apparently "frequently used" or useful methods in Object, rather than only what is strictly part of the minimal nature of being an object, which is (in the Java model at least) "being an instance of some class of objects (having common methods, interfaces and semantics)".
IMHO hashCode() is probably the least out of place!
It is totally unnecessary to provide a monitor on every object, and it leaves implementers with the headache of supporting those methods on every object while knowing they will be called on a minuscule fraction of them. Don't underestimate the overhead that can cause: it may mean allocating things like mutexes (a whole cache line, typically tens of bytes) for every one of millions of objects, with no sane way most of them would ever get used.
I'm not suggesting for a second that Java is "broken" or "badly designed"; I am not here to knock Java. It is a great language. As with the design of generics, it has always chosen to keep things simple and has been willing to make some compromises on performance for simplicity, and as a result it is a very powerful and accessible language in which, thanks to great implementations, those performance overheads only occasionally grate.
But to repeat the point: I think we should recognise that those methods are not part of the intrinsic nature of all objects.
I have been using Guava for some time now and have truly trusted it, until I stumbled on an example yesterday that got me thinking. Long story short, here it is:
public static void testGuavaImmutability() {
    StringBuilder stringBuilder = new StringBuilder("partOne");
    ImmutableList<StringBuilder> myList = ImmutableList.of(stringBuilder);
    System.out.println(myList.get(0));
    stringBuilder.append("appended");
    System.out.println(myList.get(0));
}
After running this you can see that the value of an entry inside an ImmutableList has changed. If two threads were involved here, one could fail to see the update made by the other.
Also, the thing that makes me very impatient for an answer is that Item 15 in Effective Java, point five, says this:
Make defensive copies in the constructor. This seems pretty logical.
Looking at the source code of the ImmutableList, I see this:
SingletonImmutableList(E element) {
    this.element = checkNotNull(element);
}
So no copy is actually made, although I have no idea how a generic deep copy would be implemented in such a case (maybe serialization?).
So... why are they called immutable, then?
What you're getting at here is the difference between immutable and deeply immutable.
An immutable object will never change, but anything that it refers to might change. Deep immutability is much stronger: neither the base object nor any object you can navigate to from it will change.
Each is appropriate in its own situations. When you create your own class that has a field of type Date, that date is owned by your object; it's truly a part of it. Therefore, you should make defensive copies of it (on the way in and the way out!) to provide deep immutability.
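A minimal sketch of that defensive copying, assuming a hypothetical Event class with a Date field:

import java.util.Date;

final class Event {
    private final Date when;

    Event(Date when) {
        this.when = new Date(when.getTime());  // copy on the way in
    }

    Date getWhen() {
        return new Date(when.getTime());       // copy on the way out
    }
}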
But a collection does not really "own" its elements. Their states are not considered part of the collection's state; it is a different type of class -- a container. (Furthermore, as you allude, it has no deep knowledge of what element type is being used, so it wouldn't know how to copy the elements anyway.)
Another answer states that the Guava collections should have used the term unmodifiable. But there is a very well-defined difference between the terms unmodifiable and immutable in the context of collections, and it has nothing to do with shallow vs. deep immutability. "Unmodifiable" says you cannot change this instance, via the reference you have; "immutable" means this instance cannot change, period, whether by you or any other actor.
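That distinction is easy to demonstrate. Here is a short sketch contrasting an unmodifiable view with a Guava immutable copy (the class and variable names are illustrative):

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import com.google.common.collect.ImmutableList;

public class UnmodifiableVsImmutable {
    public static void main(String[] args) {
        List<String> backing = new ArrayList<>();
        backing.add("a");

        List<String> view = Collections.unmodifiableList(backing);
        List<String> copy = ImmutableList.copyOf(backing);

        backing.add("b");
        System.out.println(view);  // [a, b] -- the view reflects the change
        System.out.println(copy);  // [a]    -- the immutable copy does not
    }
}

Neither one lets you call add() or remove() through it, but only the ImmutableList guarantees its contents will never change.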
The list itself is immutable because you cannot add/remove elements. The elements are on their own regarding immutability. In more precise terms, we have definitions from a historical Java 1.4.2 document:
Collections that do not support any modification operations (such as add, remove and clear) are referred to as unmodifiable. Collections that are not unmodifiable are referred to as modifiable.
Collections that additionally guarantee that no change in the Collection object will ever be visible are referred to as immutable. Collections that are not immutable are referred to as mutable.
Note that for these definitions to make any sense we must assume an implicit distinction between a collection in an abstract sense and an object that represents that collection. This is important because the object that represents an immutable collection is not itself immutable by any standard definition of that term. For example, its equals relation has no temporal consistency, a vital requirement on immutable objects.
As far as defensive copying goes, note that it is an ill-defined problem in general, and there will never be a general immutable collection in Java that manages to defensively copy its elements. Note additionally that such a collection would be less useful than the immutable collections that really exist: when you put an object into a collection, in 99.99% of cases you want that very object to be there, not some other object that is not even equal to it.
There is a quite standard definition of object immutability (as opposed to collection immutability) which assumes transitive immutability of the whole object graph reachable from the immutable object. Taken too literally, though, such a definition will almost never be satisfied in the real world. Two cases in point:
- nothing is immutable in the face of reflection; even final fields are writable.
- even String, that bastion of immutability, has been proven mutable outside the Java sandbox (i.e., without a SecurityManager, which covers 99% of real-world Java programs).
You are mixing up the immutability of the list and the immutability of the objects it contains.
In an immutable collection you cannot add or remove objects, but if the objects it contains are mutable, you can modify them after retrieving them with get().
It is obvious that immutability increases reusability, since a new object is created for each state change. Can somebody tell me a practical scenario where we need an immutable class?
Consider java.lang.String. If it weren't immutable, then every time you had a string you wanted to be confident wouldn't change underneath you, you'd have to create a copy.
Another example is collections: it's nice to be able to accept or return a genuinely immutable collection (e.g. from Guava - not just an immutable view on a mutable collection) and have confidence that it won't be changed.
Whether those count as "needs" or not, I don't know - but I wouldn't want to develop without them.
A good example is related to hashing. A class overrides the equals() and hashCode() methods so that it can be used in data structures like HashSet and (as keys in) HashMap, and the hash code is typically derived from some identifying member attributes. However, if those attributes were to change, then so would the object's hash code, so the object would no longer be findable in a hashing data structure.
Java provides a nice example: String.
This article has a good color example (since color definitions don't change).
http://www.ibm.com/developerworks/java/library/j-jtp02183/index.html
I want to have an object that allows other objects of a specific type to register themselves with it. Ideally it would store the references to them in some sort of set collection and have .equals() compare by reference rather than value. It shouldn't have to maintain a sort at all times, but it should be able to be sorted before the collection is iterated over.
Looking through the Java Collection Library, I've seen the various features I'm looking for on different collection types, but I am not sure about how I should go about using them to build the kind of collection I'm looking for.
This is Java in the context of Android if that is significant.
Java's built-in tree-based collections won't work.
To illustrate, consider a tree containing weak references to nodes 'B', 'C', and 'D':

  C
 / \
B   D

Now let the weak reference to 'C' get collected, leaving null behind:

  -
 / \
B   D
Now insert an element into the tree. The TreeMap/TreeSet doesn't have sufficient information to select the left or right subtree. If your comparator says null is a small value, then it will be incorrect when inserting 'A'. If it says null is a large value, it will be incorrect when inserting 'E'.
Sort on demand is a good choice.
A more robust solution is to use an ArrayList<WeakReference<T>> and to implement a Comparator<WeakReference<T>> that delegates to a Comparator<T>. Then call Collections.sort() prior to iteration.
Android's Collections.sort uses TimSort behind the scenes, so it runs quite efficiently if the input is already partially sorted.
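A sketch of that approach (sortRefs is a hypothetical helper): the comparator delegates to a Comparator<T> and pushes cleared (null) referents to the end.

import java.lang.ref.WeakReference;
import java.util.Comparator;
import java.util.List;

public class WeakSort {
    static <T> void sortRefs(List<WeakReference<T>> refs, Comparator<T> cmp) {
        refs.sort((a, b) -> {
            T ta = a.get(), tb = b.get();
            if (ta == null && tb == null) return 0;
            if (ta == null) return 1;   // cleared references sort last
            if (tb == null) return -1;
            return cmp.compare(ta, tb);
        });
    }
}

One caveat: if referents are cleared while the sort is running, the comparator can become inconsistent; a more robust version would snapshot the referents into strong references first, sort those, and drop the nulls.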
Perhaps the collections classes are a level of abstraction below what you're looking for? It sounds like the end product you want is a cache with the ability to iterate in a user-defined sort order. If so, perhaps the cache interface in the Google Guava library is close enough to what you want:
http://code.google.com/p/guava-libraries/source/browse/trunk/guava/src/com/google/common/cache/Cache.java
At a glance, it looks like CacheBuilder in that package doesn't allow you to build an implementation with user-defined iteration order. However, it does provide a Map view that might be good enough for your needs:
List<Thing> cachedThings = Lists.newArrayList(cache.asMap().values());
Collections.sort(cachedThings, YOUR_THING_COMPARATOR);
for (Thing thing : cachedThings) { ... }
Even if this isn't exactly what you want, the classes in that package might give you some useful insights re: using References with Collections.
DISCLAIMER: This was a comment but it got kinda big, sorry if it doesn't solve your problem:
References in Java
Just to clarify what I mean when I say reference, since it isn't really a term commonly used in Java: Java does not really use references or pointers. It uses a kind of pseudo-reference that can be (and is by default) assigned to the special null instance. That's one way to explain it anyway. In Java, these pseudo-references are the only way that an Object can be handled. When I say reference, I mean these pseudo-references.
Sets
Any Set implementation will not allow two references to the same object to be included in it, since an object is always equal() to itself and duplicates would violate the mathematical concept of a set. The Java sets simply ignore any attempt to add a duplicate.
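If reference (==) equality really is what's wanted for registration, the JDK has a ready-made building block. A minimal sketch, where Listener is a placeholder for whatever type registers itself:

import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

public class Registry {
    interface Listener {}  // placeholder type

    // Membership is checked with ==, not equals().
    private final Set<Listener> registered =
            Collections.newSetFromMap(new IdentityHashMap<Listener, Boolean>());

    public void register(Listener l) {
        registered.add(l);
    }
}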
You mention a Map in your comment though... Could you clarify what kind of collection you are after? And why you need that kind of equality checking within it? Are you thinking in C++ terms? I'll try to edit my answer to be more helpful then :)
EDIT: I thought that might have been your goal ;) So a TreeSet should do the trick then! I would not get concerned about performance until there is a performance issue. Simplicity is fantastic for readability, maintenance and preventing bugs. If performance does become a problem, ideally you should profile your code and only optimize the areas that are proven to be the problem.
I have a Java HashMap whose keys are instances of java.lang.Object, that is: the keys are of different types. The hashCode values of two key objects of different types are likely to be the same when they contain identical variable values.
In order to improve the performance of the get method for my HashMap, I'm inclined to mix the name of the Java type into the hashCode methods of my key objects. I have not seen examples of this elsewhere, and so my this-might-be-wacky alarm went off. Do you think mixing the type into hashCode is a good idea? Should I mix in the class name, or the hashCode of the relevant Class object?
I wouldn't mix the type name in - but if you're controlling the hashCode algorithm already, why not just change it so that they won't clash? For example, if you're using the common "add and multiply" approach, you could start off with different base cases or use different multipliers.
Before you worry about this too much though, have you actually measured how often you're really getting collisions with real data? Is this definitely a problem, or are you just concerned that it might be a problem?
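To illustrate the suggestion above, here is a sketch with two hypothetical key classes using different seeds and multipliers in the same "add and multiply" recipe (equals() implementations omitted for brevity):

final class TypeAKey {
    private final int a, b;
    TypeAKey(int a, int b) { this.a = a; this.b = b; }
    @Override public int hashCode() {
        int h = 17;            // seed for type A
        h = 31 * h + a;
        h = 31 * h + b;
        return h;
    }
}

final class TypeBKey {
    private final int a, b;
    TypeBKey(int a, int b) { this.a = a; this.b = b; }
    @Override public int hashCode() {
        int h = 23;            // different seed for type B...
        h = 37 * h + a;        // ...and a different multiplier
        h = 37 * h + b;
        return h;
    }
}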
I think your this-might-be-wacky alarm should have gone off when you decided to have keys of different types. But let's assume this is a case where Object is really the way to go.
You should try it without mixing in the type name and stress test the performance if you find that this particular lookup is determined to be a hotspot in the system. Chances are the performance doesn't matter that much.
Like Jon implied, the performance of the hash map is improved by reducing collisions. Mixing in the type name is just as likely to increase collisions as it is to reduce them. To keep your hash map in peak condition, you want the likelihood of any particular hash code to be about the same as any other over the domain of valid key values. So the probability of a hash code of 10 should be about the same as the probability of 100 or any other number. That way the hash table buckets fill evenly (in all likelihood). So whether you have an object of type A or type B should not matter; only the probability distribution of the hash codes over all occurring key values does.
Years later...
Apart from it being a premature optimization, it's not a bad idea and the overhead is tiny. Choy's recommendation to profile first is surely good in general, but sometimes a simple optimization takes much less time than the profiling. This seems to be such a case.
I'd use a different multiplier as already suggested and mix in getClass().hashCode().
Or maybe getClass().getName().hashCode(), as it stays consistent across JVM invocations, which might be helpful if you want a reproducible HashMap iteration order for easier debugging. Note that you should never rely on such reproducibility and that quite a few things can destroy it.
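A sketch of that mix-in (MixedKey and its fields are placeholders for the real key class and its identifying attributes):

final class MixedKey {
    private final int fieldOne, fieldTwo;

    MixedKey(int fieldOne, int fieldTwo) {
        this.fieldOne = fieldOne;
        this.fieldTwo = fieldTwo;
    }

    @Override public int hashCode() {
        int h = getClass().getName().hashCode();  // stable across JVM runs
        h = 31 * h + fieldOne;
        h = 31 * h + fieldTwo;
        return h;
    }

    @Override public boolean equals(Object o) {
        if (!(o instanceof MixedKey)) return false;
        MixedKey k = (MixedKey) o;
        return fieldOne == k.fieldOne && fieldTwo == k.fieldTwo;
    }
}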