Is one-instance-per-unique-immutable design pattern considered evil? - java

I was reading a chapter of Effective Java that describes the advantages of keeping only one instance of each distinct immutable value, so that we can compare objects by identity (x == y) instead of comparing their values for equality.
Also, POJOs like java.awt.RenderingHints.Key often use the one-instance-per-unique-immutable design pattern:
Instances of this class are immutable and unique which means that tests for matches can be made using the == operator instead of the more expensive equals() method.
I can understand the speed boost with this approach, but wouldn't this design pattern eventually cause a memory leak?

Yes, it may cause memory growth (it's not a leak if it's intentional behavior). Whether it will or won't depends on just how the uniqueness contract is specified. For example, if you serialize one of these objects to disk, exit the scope in which it exists, and then deserialize it back from disk, one of two things happens: either you get the same object, or you get a different one. If you get the same object, then every object ever used in the life of the JVM needs to be kept, and you'll have memory growth. If you get a different object, then the objects only need to exist while there is a reference to them, and you won't have memory growth.

That is sometimes called the Flyweight pattern, especially if the space of possible objects is bounded.

To implement the cache, you can use a WeakHashMap (http://docs.oracle.com/javase/6/docs/api/java/util/WeakHashMap.html) or a bounded LRU cache.
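As a rough illustration of the bounded LRU option, here is a minimal sketch built on an access-ordered LinkedHashMap. The Label class and the MAX_ENTRIES limit are hypothetical names invented for this example. Note the trade-off: once an entry has been evicted, a later request creates a second instance for the same value, so callers can no longer rely on == and should fall back to equals().

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical immutable value type, used only for illustration.
final class Label {
    private static final int MAX_ENTRIES = 2; // deliberately tiny for the demo

    // Access-ordered map that evicts the least recently used entry
    // as soon as the size limit is exceeded.
    private static final Map<String, Label> CACHE =
        new LinkedHashMap<String, Label>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Label> eldest) {
                return size() > MAX_ENTRIES;
            }
        };

    private final String text;

    private Label(String text) {
        this.text = text;
    }

    String text() {
        return text;
    }

    // Returns the cached instance if present, otherwise creates and caches one.
    static synchronized Label of(String text) {
        return CACHE.computeIfAbsent(text, Label::new);
    }
}
```

With this limit, Label.of("a") returns the same instance on repeated calls until two other values have been requested; after that, "a" is evicted and the next Label.of("a") yields a fresh object.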

Related

Stored area of immutable classes

In a recent interview, I was asked: if strings are stored in the string pool because String is immutable, then where are our custom immutable classes stored in Java?
I gave the following explanations:
All class variables, whether primitives or object references (which are just pointers to the location where the object is stored, i.e. the heap), are also stored in the heap.
Classes loaded by the class loader, static variables, and static object references are stored in a special area of the heap, the permanent generation.
But the interviewer kept arguing that -
If String has string-pool, can immutable classes also have some concept like that ?
Can anyone please explain where instances of immutable classes are stored, given that they are immutable like String?
In a recent interview, I was asked: if strings are stored in the string pool because String is immutable, then where are our custom immutable classes stored in Java?
This is a nonsensical question with a wrong premise. Strings are not “stored in string-pool”, they are stored in the heap, like any other object. That’s a kind of tautology, as the heap memory is precisely defined as “The heap is the run-time data area from which memory for all class instances and arrays is allocated.”
The string pool can be seen as containing strings, just like a Collection may contain objects, but in either case, it’s just holding references to objects. So a string contained in the pool still is stored in the heap memory, by definition, while the pool has a reference to it.
But the interviewer kept arguing that - If String has string-pool, can immutable classes also have some concept like that?
That’s an entirely different question. Of course, you can implement a pool of objects, as the Collection analogy above already indicated. Just as strings contained in the pool are still stored in the heap memory, objects of your class are still stored in the heap memory while whatever data structure you use for the pool references them. The objects don’t even need to be immutable for such a pool to exist, but the implied sharing of instances would cause semantic problems when mutations are possible. So creating a pool usually only makes sense for immutable objects.
For example, lots of the wrapper classes have such sharing, the valueOf methods for Short, Integer and Long will return shared instances for values in the -128 … +127 range and implementations are allowed to share even more, whereas Byte and Boolean return shared instances for all possible values.
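The sharing done by the wrapper classes can be observed with a small sketch like the following. BoxCacheDemo is a hypothetical name; only the results for values inside the -128 to 127 range and for Boolean are guaranteed, which is why the out-of-range call carries no expected output.

```java
final class BoxCacheDemo {
    // True when two valueOf calls for the same int return the same object.
    static boolean sameInstance(int value) {
        return Integer.valueOf(value) == Integer.valueOf(value);
    }

    public static void main(String[] args) {
        System.out.println(sameInstance(127));   // true: inside the guaranteed cache range
        System.out.println(sameInstance(-128));  // true: inside the guaranteed cache range
        System.out.println(sameInstance(1000));  // implementation-dependent
        System.out.println(Boolean.valueOf(true) == Boolean.TRUE); // true
    }
}
```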
However, there are reasons why not every immutable class implements a pool for all of its values:
If there is a large value space, you have to consider supporting garbage collection of unused objects, even when referenced by the pool
You have to consider the thread safety of your pool
Both points above may lead to a complex solution with a performance penalty that you don’t want to pay when the object is only used for a short time, as sharing only reduces memory consumption for long living objects
This applies to the existing examples as well. The wrapper classes only provide shared objects for a limited value space, and those objects are preallocated and never GCed. The string pool, on the other hand, is dynamic, thread safe and supports garbage collection of its elements, which is the reason why intern() is not a cheap operation and should not be applied to every string. Instead, the pool is primarily used for constants, which are indeed long-living.
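To make the distinction between "stored in the heap" and "referenced by the pool" concrete, here is a minimal sketch (InternDemo is a hypothetical name). Both strings are ordinary heap objects; only one of them is the instance the pool refers to.

```java
final class InternDemo {
    // Returns { identity without interning, equality, identity after interning }.
    static boolean[] compare() {
        String literal = "pool";            // literals are added to the string pool
        String fresh = new String("pool");  // separate heap object with equal contents
        return new boolean[] {
            fresh == literal,           // false: two distinct objects
            fresh.equals(literal),      // true: same characters
            fresh.intern() == literal   // true: intern() returns the pooled instance
        };
    }
}
```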

java cyclic reference and garbage collections

Let's consider the following 2 cyclic referencing examples:
Straightforward cyclic referencing
class A {
    B b;
}
class B {
    A a;
}
Weak referencing
import java.lang.ref.WeakReference;

class A {
    B b;
}
class B {
    WeakReference<A> aRef;
}
An SO answer by Jon Skeet makes clear that the straightforward example will also be garbage collected, as long as no path from a GC root leads to the cycle.
My question is as follows:
Is there any reason performance or otherwise to use or not to use the idiom represented in example 2 - the one employing a WeakReference?
Is there any reason performance or otherwise to use or not to use the idiom represented in example 2
The Java Reference types have a couple of performance implications:
They use more space than regular references.
They are significantly more work for the garbage collector than ordinary references.
I also believe that they can cause the collection of objects to be delayed by one or more GC cycles ... depending on the GC implementation.
In addition, the application has to deal with the possibility that a WeakReference has been cleared.
By contrast, there are no performance or space overheads for normal cyclic references as you use them in your first example.
In summary, your weak reference idiom reduces performance and increases program complexity ... with no tangible benefits that I can see.
My guess is that this Question derives from the mistaken notion that cyclic references are more expensive than non-cyclic references in Java ... or that they are somehow problematic. (What other logical reason would cause one to propose an "idiom" like this?) In fact, this is not the case. Java garbage collectors don't suffer from the problems of reference counting; e.g. C++ "smart pointers". Cyclic references are handled correctly (i.e. without leaking memory) and efficiently in Java.
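The extra complexity mentioned above shows up wherever the weak back-reference is read. The following sketch uses hypothetical Owner and Child classes in place of A and B; simulateCollection() clears the reference by hand with WeakReference.clear(), purely to make the cleared case reproducible without depending on GC timing.

```java
import java.lang.ref.WeakReference;

final class Owner {               // plays the role of A
    final String name;

    Owner(String name) {
        this.name = name;
    }
}

final class Child {               // plays the role of B
    private final WeakReference<Owner> ownerRef;

    Child(Owner owner) {
        this.ownerRef = new WeakReference<>(owner);
    }

    // Every read has to handle the case where the referent is gone.
    String ownerName() {
        Owner owner = ownerRef.get();   // take one strong reference, then test it
        return owner == null ? "<owner collected>" : owner.name;
    }

    // Demo helper (assumption for this sketch): mimics what the GC may do at any time.
    void simulateCollection() {
        ownerRef.clear();
    }
}
```

With a plain field of type Owner, the null check and the fallback value would not be needed at all.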
The problem is that you do not know when the GC will clear a WeakReference.
It may be cleared right after you create it, because the GC collects weakly reachable objects eagerly.
You can keep a strong reference to the referent somewhere to prevent it from being garbage collected.
Or you can detect that it has been cleared through a ReferenceQueue.
It's like the finalize method: you do not know when the GC will run it.
Sources:
http://pawlan.com/monica/articles/refobjs/
http://docs.oracle.com/javase/7/docs/api/java/lang/ref/WeakReference.html

Freeing memory used by no longer needed objects in instance controlled classes in java

Consider the following scenario.
You are building a class, in Java, where the fundamental semantics of the class demand that no two instances of the class be equal in value unless they are in fact the same object (see instance-controlled classes in Effective Java by Joshua Bloch). In a sense this is like a very large enum (possibly hundreds of millions of "constants") whose values are not known until runtime. So, to recap, you want the class to ensure that there are no "equal" instances on the heap. There may be lots of references to a particular object on the heap, but no extraneous equal objects. This can obviously be done in code but it seems to me that there is a major flaw that I have not seen addressed anywhere, including in Effective Java. It seems to me that in order to make this guarantee the instance-controlled class must keep a reference to every instance of itself that has EVER been created at any point during program execution and can NEVER "delete" one of those objects because it can never know that there are no longer any "pointers" to that object (besides the one that it itself keeps). In other words, if you think about this in the context of reference-counting, there will come some point in the program where the only reference to the object is the one held by the class itself (the one that says, "this was created at some point"). At that point you would like to release the memory associated with the object, but you can't because that one pointer that is left has no way of knowing that it is the last one.
Is there a good approach to providing instance-controlled classes which can also free no-longer-needed memory?
Update: So, I think I've found something that might help. It turns out Java has a java.lang.ref package that provides weak references. From Wikipedia: "A WeakReference is used to implement weak maps. An object that is not strongly or softly reachable, but is referenced by a weak reference is called "weakly reachable". A weakly reachable object is garbage collected in the next collection cycle. This behavior is used in the class java.util.WeakHashMap. A weak map allows the programmer to put key/value pairs in the map and not worry about the objects taking up memory when the key is no longer reachable anywhere else. Another possible application of weak references is the string intern pool. Semantically, a weak reference means "get rid of this object when nothing else references it at the next garbage collection.""
You need to use one of the special reference objects, like a weak reference. These were created just to support the use case you mention.
As you create an object, you search your collection of weak references to see if the object already exists; if it does, you return a regular reference to it. If it does not, you create it and return a regular reference, and add a weak reference to it to your collection.
Once the object is no longer strongly referenced anywhere outside of your collection, the garbage collector clears the weak reference and can reclaim the object. If you registered a ReferenceQueue, the cleared reference is enqueued there, so you can remove the stale entry from your collection.
The general concept is called a "canonicalizing cache."
The WeakHashMap class is a shortcut that does some of the plumbing for this for you.
It is not clear what your requirements are. You say you want hundreds of millions of entries. This suggests that a database or a NoSQL store is the best way to store them.
To ensure you have no duplicates, you can keep track of referenced objects which have been retained with a WeakHashMap.
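The canonicalizing cache described in these answers could be sketched roughly as follows. WeakInterner is a hypothetical name, and the sketch assumes the element type implements equals() and hashCode() consistently. The map values are wrapped in WeakReference because a strongly held value would keep its own key reachable, and the entry would then never be removed.

```java
import java.lang.ref.WeakReference;
import java.util.Map;
import java.util.WeakHashMap;

// Minimal weak canonicalizing cache: at most one live instance per equal value.
final class WeakInterner<T> {
    // Keys are held weakly by WeakHashMap; values are weak as well so that
    // the map alone never keeps an instance alive.
    private final Map<T, WeakReference<T>> pool = new WeakHashMap<>();

    // Returns the canonical instance equal to the candidate,
    // registering the candidate itself if none exists yet.
    synchronized T intern(T candidate) {
        WeakReference<T> ref = pool.get(candidate);
        T canonical = (ref == null) ? null : ref.get();
        if (canonical != null) {
            return canonical;
        }
        pool.put(candidate, new WeakReference<>(candidate));
        return candidate;
    }
}
```

A factory method of the instance-controlled class would build a candidate and return interner.intern(candidate); entries whose instances are no longer referenced elsewhere become eligible for collection, and WeakHashMap drops them on later accesses.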

Does Java create a new object each time the new operator is called

In Java, when we call new Constructor(), is a new object created each time, i.e. is new memory allocated? Suppose many objects of a class have already been created and are no longer referenced.
Can Java return one of those objects that are marked for deallocation, or will it create a new object each time new Constructor() is called?
My reason for asking is that, if such reuse happened, performance could improve because the cost of allocating new memory and destroying unreferenced objects would be reduced.
Yes.
Java will never re-use an object.
Java always creates a new object. Note that the new operator is very fast in Java. Allocation is cheap: a typical JVM just increments a pointer into the heap. Once the heap is full, old, unneeded objects are removed and the live ones are compacted. But garbage collection is a different story.
Your idea is clever, but it would actually worsen performance. The JVM would not only have to keep track of dead objects (eligible for GC), which it does not do today, but it would also have to clean up the old object somehow so that it appears fresh. That's non-trivial and would take a lot of time. new is both faster and simpler.
There is one catch, though:
Integer answer = 42;
42 is a literal that has to be converted to an Integer object. However, the compiler won't simply emit new Integer(42) but a call to Integer.valueOf(42) instead. And in the latter case valueOf() sometimes returns cached values (it will e.g. for 42).
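The difference between new and autoboxing can be checked with a short sketch (NewVsBoxing is a hypothetical name):

```java
final class NewVsBoxing {
    // Two 'new' expressions never yield the same object.
    static boolean newIsDistinct() {
        return new Object() != new Object();
    }

    // Autoboxing goes through Integer.valueOf, which shares instances
    // for small values such as 42.
    static boolean boxedIsShared() {
        Integer first = 42;
        Integer second = 42;
        return first == second;
    }

    public static void main(String[] args) {
        System.out.println(newIsDistinct());  // true
        System.out.println(boxedIsShared());  // true
    }
}
```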
Yes, when you use new in Java, a new object is always created.
However, that does not necessarily mean that the JVM has to ask the operating system to allocate memory. How the JVM exactly allocates memory is up to the particular JVM implementation, and most likely it contains a lot of optimizations to make it fast and efficient.
Object allocation is generally considered a cheap operation in Java - usually you do not need to worry about it.
One example of a sophisticated optimization that's implemented in the current versions of Oracle's Java is escape analysis.
The cost of creating objects and destroying unreferenced objects is trivial. What takes the time is:
detecting when an object is no longer referenced. To do this every strong reference must be checked.
copying objects which are kept from one generation to another and defragmenting a generation.
finalising objects which implement the finalize() method.
If you create short lived temporary objects whether your Eden size is 8 MB or 8 GB, the time it takes to do a minor collection is almost the same.
There is a design pattern called Flyweight whose main advantage is object reuse. Java itself uses it for strings: string literals are shared through the string pool.
You can learn more about it here: http://en.wikipedia.org/wiki/Flyweight_pattern
Wherever you see new, you can be pretty sure that a new object is being created. It's as simple as that.

Why String class is immutable even though it has a non-final field called "hash"

I was reading through Item 15 of Effective Java by Joshua Bloch. Inside Item 15, which is about 'minimizing mutability', he mentions five rules to make objects immutable. One of them is to make all fields final. Here is the rule:
Make all fields final : This clearly expresses your intent in a manner that is enforced
by the system. Also, it is necessary to ensure correct behavior if a reference
to a newly created instance is passed from one thread to another without
synchronization, as spelled out in the memory model [JLS, 17.5; Goetz06 16].
I know that the String class is an example of an immutable class. Going through the source code, I see that it actually has a hash field which is not final.
//Cache the hash code for the string
private int hash; // Default to 0
How does String become immutable then?
The remark explains why this is not final:
//Cache the hash code for the string
It's a cache. If you don't call hashCode, the value for it will not be set. It could have been set during the creation of the string, but that would mean longer creation time, for a feature you might not need (hash code). On the other hand, it would be wasteful to calculate the hash each time it's asked for, given that the string is immutable and the hash code will never change.
The fact that there's a non-final field does somewhat contradict that definition you quote, but here it's not part of the object's interface. It's merely an internal implementation detail, which has no effect on the mutability of the string (as a characters container).
Edit - due to popular demand, completing my answer: although hash is not directly part of the public interface, it could have affected the behavior of that interface, as hashCode returns its value. Now, since hashCode is not synchronized, it is possible that hash is set more than once if more than one thread calls that method concurrently. However, the value that is assigned to hash is always the result of a stable calculation, which relies only on final fields (value, offset and count). Therefore, every calculation of the hash yields the exact same result. For an external user, this is just as if hash was calculated once - and just as if it was calculated each and every time, as the contract of hashCode requires that it consistently returns the same result for a given value. Bottom line, even though hash is not final, its mutability is never visible to an external viewer, hence the class can be considered immutable.
String is immutable because as far as its users are concerned, it can never be modified and will always look the same to all threads.
hashCode() is computed using the racy single-check idiom (EJ item 71), and it's safe because it doesn't hurt anybody if hashCode() is computed more than once accidentally.
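A minimal sketch of the racy single-check idiom on a user-defined class might look like this. Name is a hypothetical class, not the actual String source. The idiom relies on the cached value being an int (so reads and writes are atomic) and on the computation depending only on final fields.

```java
// Logically immutable class with a lazily cached hash code.
final class Name {
    private final String first;
    private final String last;
    private int hash; // non-final cache; 0 means "not computed yet"

    Name(String first, String last) {
        this.first = first;
        this.last = last;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof Name)) {
            return false;
        }
        Name other = (Name) o;
        return first.equals(other.first) && last.equals(other.last);
    }

    @Override
    public int hashCode() {
        int h = hash;                 // read the field exactly once
        if (h == 0) {
            h = 31 * first.hashCode() + last.hashCode();
            hash = h;                 // benign race: every thread stores the same value
        }
        return h;
    }
}
```

If the computed hash happens to be 0, it is simply recomputed on every call, which costs time but never changes the result.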
Making all fields final is the easiest and simplest way to make classes immutable, but it's not strictly required. So long as all methods return the same thing no matter which thread calls it when, the class is immutable.
Even though String is immutable, it can change through reflection. If you make hash final, you could mess things up royally were this to occur. The hash field is different too in that it is there mainly as a cache, a way to speed up the calculation of hashCode() and should really be thought of as a calculated field, less so a constant.
There are many situations in which it may be helpful for a class which is logically immutable to have several different representations for the same observable state, and for instances of the class to be able to switch among them. The hashcode value that will be returned from a string whose hash field is zero will be the same as the value that would be returned if the hash field held the result of an earlier hashcode call. Consequently, changing the hash value from the former to the latter will not change the object's observable state, but will cause future operations to run faster.
The biggest difficulties with coding things in those ways are:
If an object is changed from holding a reference to some particular immutable object to holding a reference to a different object with identical semantic content, such a change shouldn't affect the observable state of the object holding the reference, but if it turns out the supposedly-identical object wasn't really identical, bad things can happen, especially if the object supposedly holding the reference was assumed to be substitutable for other semantically-identical objects.
Even if there aren't any mistakes in which objects are "identical", there may still be a danger that objects which appear identical to a thread which makes a substitution may not appear identical to other threads. This scenario isn't likely to occur, but if it does occur the effects may be very bad.
Still, there can be some advantages to making substitutions of immutable objects. For example, if a program will be comparing many objects which hold long strings and many of them, though separately generated, will be identical to each other, it may be useful to use a weak-reference-based map (such as WeakHashMap) to build a pool of distinct string instances, and replace any string which is found to be identical to one in the pool with a reference to the pool copy. Doing that would cause many strings which are identical to be mapped to the same string, thus greatly accelerating any future comparisons that may be done between them. Of course, as noted, it's very important that the objects are properly logically immutable and that the comparisons are done correctly. Any problems in that regard can turn what should be an optimization into a mess.
To make an object immutable, you need to make the class final and all its members final so that once an object is created no one can modify its state. You can achieve the same result by making the members non-final but private and not modifying them except in the constructor.
EDIT:
Notice :
When hashing a string, Java also caches the hash value in the hash attribute, but only if the result is different from zero.
