Storage area of immutable classes - Java

In a recent interview, I was asked: if a String is stored in the string pool because it is immutable, then where are our custom immutable classes stored in Java?
I gave the following explanation:
All of a class's fields, whether primitives or object references (a reference is just a pointer to the location on the heap where the object is stored), are also stored on the heap.
Classes loaded by the class loader, static variables and static object references are stored in a special location in the heap, which is the permanent generation.
But the interviewer kept arguing:
If String has a string pool, can immutable classes also have some concept like that?
Can anyone please explain the storage area of immutable classes, given that they are immutable like String?

In a recent interview, I was asked: if a String is stored in the string pool because it is immutable, then where are our custom immutable classes stored in Java?
This is a nonsensical question with a wrong premise. Strings are not “stored in string-pool”, they are stored in the heap, like any other object. That’s a kind of tautology, as the heap memory is precisely defined as “The heap is the run-time data area from which memory for all class instances and arrays is allocated.”
The string pool can be seen as containing strings, just like a Collection may contain objects, but in either case, it’s just holding references to objects. So a string contained in the pool still is stored in the heap memory, by definition, while the pool has a reference to it.
But the interviewer kept arguing that - If String has string-pool, can immutable classes also have some concept like that?
That’s an entirely different question. Of course, you can implement a pool of objects, as the Collection analogy above already indicated. Just as strings contained in the pool are still stored in the heap memory, objects of your class are still stored in the heap memory while being referenced by whatever data structure is used for the pool. The objects don’t even need to be immutable to have such a pool, but the implied sharing of instances would cause semantic problems when mutations are possible. So creating a pool usually only makes sense for immutable objects.
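As a minimal sketch of such a pool for a custom immutable class: the class name CountryCode and its factory method of() are hypothetical and only serve as illustration. The pool is an ordinary map of references, and the pooled objects live on the heap like any others.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical immutable value class with its own instance pool.
final class CountryCode {
    // The pool only holds references; the CountryCode objects themselves live on the heap.
    private static final Map<String, CountryCode> POOL = new ConcurrentHashMap<>();

    private final String code;

    private CountryCode(String code) {
        this.code = code;
    }

    // Returns the shared instance for the given code, creating it on first use.
    static CountryCode of(String code) {
        return POOL.computeIfAbsent(code, CountryCode::new);
    }

    String code() {
        return code;
    }
}

class CountryCodeDemo {
    public static void main(String[] args) {
        CountryCode a = CountryCode.of("DE");
        CountryCode b = CountryCode.of("DE");
        System.out.println(a == b); // true: both variables refer to the same pooled instance
    }
}
```

This sketch keeps strong references, so pooled instances are never garbage collected; that is only reasonable when the value space is small.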
For example, lots of the wrapper classes have such sharing, the valueOf methods for Short, Integer and Long will return shared instances for values in the -128 … +127 range and implementations are allowed to share even more, whereas Byte and Boolean return shared instances for all possible values.
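The sharing done by the wrapper classes can be observed with a small experiment. The class and method names below are made up for the demo; only Integer.valueOf and Boolean.valueOf are real API. For values outside the -128 … +127 range the result depends on the implementation, so no output is claimed for that case.

```java
class WrapperCacheDemo {
    // Reports whether two valueOf calls for the same int yield the same object.
    static boolean sameInstance(int value) {
        return Integer.valueOf(value) == Integer.valueOf(value);
    }

    public static void main(String[] args) {
        System.out.println(sameInstance(127));   // true: inside the guaranteed cache range
        System.out.println(sameInstance(-128));  // true: inside the guaranteed cache range
        System.out.println(Boolean.valueOf(true) == Boolean.TRUE); // true: only two Boolean values are shared
        System.out.println(sameInstance(1000));  // implementation-dependent: may or may not be shared
    }
}
```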
However, there are reasons why not every immutable class implements a pool for all of its values:
If there is a large value space, you have to consider supporting garbage collection of unused objects, even when referenced by the pool
You have to consider the thread safety of your pool
Both points above may lead to a complex solution with a performance penalty that you don’t want to pay when the object is only used for a short time, as sharing only reduces memory consumption for long living objects
This applies to the existing examples as well. The wrapper classes only provide shared objects for a limited value space; these objects are preallocated and never GCed. The string pool, on the other hand, is dynamic, thread safe and supports garbage collection of its elements, which is the reason why intern() is not a cheap operation and should not be applied to every string. Instead, the pool is primarily used for constants, which are indeed long-living.

Related

How does variable memory allocation work?

When I create a new StringBuilder object and use that variable, how does memory allocation work, and what is the result of my code snippet? My sample code is:
1)
String nextPoint=new StringBuilder().append("My").append("next").append("point").toString();
System.out.println(nextPoint);
2)
StringBuilder downPoint=new StringBuilder().append("My").append("next").append("point");
System.out.println(downPoint.toString());
Which variables/instances consume memory? Which solution is better: using the "nextPoint" variable or the "downPoint" variable?
Which variables/instance can consume memory?
Every class occupies memory. How much depends on the class. Every object occupies heap memory. How much depends on its class. Many classes and objects also contain references to other objects, and those other objects occupy their own heap memory. Some objects also have associated native resources, which occupy an idiosyncratic amount of memory. Local variables occupy stack memory appropriate for their type, though under some circumstances certain local variables may share the same stack memory as others.
In your case (1):
String nextPoint=new StringBuilder().append("My").append("next").append("point").toString();
System.out.println(nextPoint);
variable nextPoint is a local reference variable, consuming stack memory (for a reference, not a whole String). It is initialized by creating a new StringBuilder object (on the heap) and appending three Strings to it (each one an object occupying heap memory), and then creating a new String object (also occupying heap memory), and storing a reference to it in nextPoint. The StringBuilder will have some kind of associated storage for the accumulated character data; this will not overlap that of any of the Strings involved.
Your case (2) differs only in that a reference to the StringBuilder is retained instead of a reference to the generated String. That may have implications for code that follows, but it makes no difference to which objects are created and what memory is needed.
which solution is better when i'm using "nextPoint" variable or "downPoint" variable?
It depends on what you want to do afterward. If you're not going to use either of those variables again then the difference is purely stylistic.
what is result of my code snippet [?]
Put it in a class, run it, and find out for yourself. Or figure it out from the code. This one is not something you should need us to answer for you.
Both snippets perform the same sequence of method calls, namely
new StringBuilder().append("My").append("next").append("point").toString(),
so their memory usage is (mostly) the same. The only difference is that snippet (1) stores the resulting String reference in a variable, while snippet (2) stores a reference to the StringBuilder. But as references are the same size, no matter what they refer to, that results in the same number of bytes occupied.
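For reference, here are the two snippets from the question wrapped in a runnable class. The class name BuilderDemo and the two method names are invented for this sketch; the statements themselves are taken from the question.

```java
class BuilderDemo {
    // Snippet (1): keep only the resulting String.
    static String viaString() {
        String nextPoint = new StringBuilder().append("My").append("next").append("point").toString();
        return nextPoint;
    }

    // Snippet (2): keep the StringBuilder and convert it when needed.
    static String viaBuilder() {
        StringBuilder downPoint = new StringBuilder().append("My").append("next").append("point");
        return downPoint.toString();
    }

    public static void main(String[] args) {
        System.out.println(viaString());  // prints "Mynextpoint"
        System.out.println(viaBuilder()); // prints "Mynextpoint"
    }
}
```

Keeping the StringBuilder, as in (2), is only useful if more text is going to be appended later.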

Interning a string

When we intern a string, we are making sure that all uses of that string are referring to the same instance.
I would assume that the underlying string object is in the heap.
However, where is the referring variable stored in the memory?
Does it have the same behaviour as static - wherein the reference gets stored in permgen and makes the string instance available for gc only after the classloader(and application) exits?
Up to JDK 6, interned strings are stored in the memory pool in a place called the Permanent Generation, which is an area of the JVM that is reserved for non-user objects, like Classes, Methods and other internal JVM objects. The size of this area is limited, and is usually much smaller than the heap.
From JDK 7, interned strings are no longer allocated in the permanent generation of the Java heap, but are instead allocated in the main part of the Java heap (known as the young and old generations), along with the other objects created by the application. This change will result in more data residing in the main Java heap, and less data in the permanent generation, and thus may require heap sizes to be adjusted. Most applications will see only relatively small differences in heap usage due to this change, but larger applications that load many classes or make heavy use of the String.intern() method will see more significant differences.
A detailed explanation of this can be found on this answer.
When we intern a string, we are making sure that all uses of that string are referring to the same instance.
Not exactly. When you do this:
String s2 = s1.intern();
what you are doing is ensuring that s2 refers to a String in the string pool. This does not affect the value in s1, or any other String references or variables. If you want other copies of the string to be interned, you need to do that explicitly ... or assign interned string references to the respective variables.
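A small sketch of that behaviour, with hypothetical class and method names: intern() returns the pooled reference, while the variable it was called on keeps pointing at the original object.

```java
class InternDemo {
    // True if the given reference already is the pooled instance for its content.
    static boolean isPooled(String s) {
        return s == s.intern();
    }

    public static void main(String[] args) {
        String literal = "hello";        // literals refer to the pooled instance
        String s1 = new String("hello"); // a distinct object with equal content
        String s2 = s1.intern();         // the pooled instance

        System.out.println(s1 == literal); // false: s1 is a separate object
        System.out.println(s2 == literal); // true: intern() returned the pooled instance
        System.out.println(s1 == s2);      // false: calling intern() did not change s1
    }
}
```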
I would assume that the underlying string object is in the heap.
That is correct. It might be in the "permgen" heap or the regular heap, depending on the version of Java you are using. But it is always "in the heap".
However, where is the referring variable stored in the memory?
The "referring variable" ... i.e. the one that holds the reference that you got from calling intern() ... is no different from any other variable. It can be
a local variable or parameter (held in a stack frame),
an instance field (held in a regular heap object),
a static field (held in a permgen heap object) ... or even
a jstring variable or similar in JNI code (held "somewhere else".)
In fact, a typical JVM uses a private hash table to hold the references to interned strings, and it uses the JVM's weak reference mechanism to ensure that interned strings can be garbage collected if nothing else is using them.
Does it have the same behaviour as static - wherein the reference gets stored in permgen and makes the string instance available for gc only after the classloader(and application) exits?
Typically no ... see above.
In most Java platforms, interned Strings can be garbage collected just like other Strings. If the interned Strings are stored in "permgen" space, it may take longer for the object to be garbage collected, because "permgen" is collected infrequently. However the lifetime of an interned String is not tied to the lifetime of a classloader, etc.

Freeing memory used by no longer needed objects in instance controlled classes in java

Consider the following scenario.
You are building a class, in Java, where the fundamental semantics of the class demand that no two instances of the class be equal in value unless they are in fact the same object (see instance-controlled classes in Effective Java by Joshua Bloch). In a sense this is like a very large enum (possibly hundreds of millions of "constants") that are not known until runtime. So, to recap, you want the class to ensure that there are no "equal" instances on the heap. There may be lots of references to a particular object on the heap, but no extraneous equal objects. This can obviously be done in code but it seems to me that there is a major flaw that I have not seen addressed anywhere, including in Effective Java. It seems to me that in order to make this guarantee the instance-controlled class must keep a reference to every instance of itself that has EVER been created at any point during program execution and can NEVER "delete" one of those objects because it can never know that there are no longer any "pointers" to that object (besides the one that it itself keeps). In other words, if you think about this in the context of reference-counting, there will come some point in the program where the only reference to the object is the one held by the class itself (the one that says, "this was created at some point"). At that point you would like to release the memory associated with the object, but you can't because that one pointer that is left has no way of knowing that it is the last one.
Is there a good approach to providing instance-controlled classes which can also free no-longer-needed memory?
Update: I think I've found something that might help. It turns out Java has a java.lang.ref package that provides weak references. From Wikipedia: "A WeakReference is used to implement weak maps. An object that is not strongly or softly reachable, but is referenced by a weak reference is called 'weakly reachable'. A weakly reachable object is garbage collected in the next collection cycle. This behavior is used in the class java.util.WeakHashMap. A weak map allows the programmer to put key/value pairs in the map and not worry about the objects taking up memory when the key is no longer reachable anywhere else. Another possible application of weak references is the string intern pool. Semantically, a weak reference means 'get rid of this object when nothing else references it at the next garbage collection.'"
You need to use one of the special reference objects, like a weak reference. These were created just to support the use case you mention.
As you create an object, you search your collection of weak references to see if the object already exists; if it does, you return a regular reference to it. If it does not, you create it and return a regular reference, and add a weak reference to it to your collection.
With no strong references anywhere, the object can be garbage collected. If you register the weak reference with a ReferenceQueue, you are notified when that happens and can then remove the stale entry from your collection.
The general concept is called a "canonicalizing cache."
The WeakHashMap class is a shortcut that does some of the plumbing for this for you.
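A minimal sketch of a canonicalizing cache built on WeakHashMap follows. The class name Canonicalizer and its method are hypothetical, and the sketch assumes that the cached type implements equals() and hashCode() and is effectively immutable.

```java
import java.lang.ref.WeakReference;
import java.util.Map;
import java.util.WeakHashMap;

// Hypothetical helper: hands out one canonical instance per equal value.
final class Canonicalizer<T> {
    // WeakHashMap holds its keys weakly. The value is wrapped in a WeakReference too,
    // because a strong reference from the value to the key would keep the entry alive forever.
    private final Map<T, WeakReference<T>> cache = new WeakHashMap<>();

    // Returns the previously registered instance equal to the candidate,
    // or registers and returns the candidate itself if there is none.
    synchronized T canonicalize(T candidate) {
        WeakReference<T> ref = cache.get(candidate);
        T existing = (ref == null) ? null : ref.get();
        if (existing != null) {
            return existing;
        }
        cache.put(candidate, new WeakReference<>(candidate));
        return candidate;
    }
}

class CanonicalizerDemo {
    public static void main(String[] args) {
        Canonicalizer<String> canon = new Canonicalizer<>();
        String a = canon.canonicalize(new String("point"));
        String b = canon.canonicalize(new String("point"));
        System.out.println(a == b); // true: the second call returns the first instance
    }
}
```

Once no strong reference to a canonical instance remains outside the cache, the entry becomes eligible for removal, and WeakHashMap discards it on a later access. The synchronized method is the simplest way to make the sketch thread safe, at the cost of contention under heavy use.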
It is not clear what your requirements are. You say you want hundreds of millions of entries. This suggests that a database (SQL or NoSQL) is the best way to store this.
To ensure you have no duplicates, you can keep track of referenced objects which have been retained with a WeakHashMap.

String Constant Pool memory sector and garbage collection

I read the question "How is the java memory pool divided?" on this site, and I was wondering which of these sectors the "String Constant Pool" belongs to.
Also, do the String literals in the pool ever get GCed?
The intern() method returns the canonical reference to the String from the pool.
If the pool does get GCed, wouldn't that be counter-productive to the idea of the string pool? New String literals would be created again, nullifying the benefit of the GC.
(This assumes that only a specific set of literals exists in the pool, that they never become obsolete, and that sooner or later they will be needed again.)
As far as I know String literals end up in the "Perm Gen" part of non-Heap JVM memory. Perm Gen space is only examined during Full GC runs (not Partials).
In early JVMs (and I confess I had to look this up because I wasn't sure), String literals in the String Pool never got GC'ed. In newer JVMs, WeakReferences are used to reference the Strings in the pool, so interned Strings can actually get GC'ed, but only during Full Garbage collections.
Reading the JavaDoc for String.intern() doesn't give hints to the implementation, but according to this page, the interned strings are held by a weak reference. This means that if the GC detects that there are no references to the interned string except for the repository that holds interned strings then it is allowed to collect them. Of course this is transparent to external code so unless you are using weak references of your own you'll never know about the garbage collection.
String pooling
String pooling (sometimes also called as string canonicalisation) is a
process of replacing several String objects with equal value but
different identity with a single shared String object. You can achieve
this goal by keeping your own Map (with possibly soft
or weak references depending on your requirements) and using map
values as canonicalised values. Or you can use String.intern() method
which is provided to you by JDK.
In the days of Java 6, using String.intern() was forbidden by many
standards due to a high possibility of getting an OutOfMemoryError if
pooling went out of control. Oracle Java 7 implementation of string
pooling was changed considerably. You can look for details in
http://bugs.sun.com/view_bug.do?bug_id=6962931 and
http://bugs.sun.com/view_bug.do?bug_id=6962930.
String.intern() in Java 6
In those good old days all interned strings were stored in the PermGen
– the fixed size part of heap mainly used for storing loaded classes
and string pool. Besides explicitly interned strings, PermGen string
pool also contained all literal strings earlier used in your program
(the important word here is used – if a class or method was never
loaded/called, any constants defined in it will not be loaded).
The biggest issue with such string pool in Java 6 was its location –
the PermGen. PermGen has a fixed size and can not be expanded at
runtime. You can set it using -XX:MaxPermSize=96m option. As far as I
know, the default PermGen size varies between 32M and 96M depending on
the platform. You can increase its size, but its size will still be
fixed. Such limitation required very careful usage of String.intern –
you’d better not intern any uncontrolled user input using this method.
That’s why string pooling at times of Java 6 was mostly implemented in
the manually managed maps.
String.intern() in Java 7
Oracle engineers made an extremely important change to the string
pooling logic in Java 7 – the string pool was relocated to the heap.
It means that you are no longer limited by a separate fixed size
memory area. All strings are now located in the heap, as most of other
ordinary objects, which allows you to manage only the heap size while
tuning your application. Technically, this alone could be a sufficient
reason to reconsider using String.intern() in your Java 7 programs.
But there are other reasons.
String pool values are garbage collected
Yes, all strings in the JVM string pool are eligible for garbage
collection if there are no references to them from your program roots.
It applies to all discussed versions of Java. It means that if your
interned string went out of scope and there are no other references to
it – it will be garbage collected from the JVM string pool.
Being eligible for garbage collection and residing in the heap, a JVM
string pool seems to be a right place for all your strings, isn’t it?
In theory it is true – non-used strings will be garbage collected from
the pool, and used strings will allow you to save memory in case you
get an equal string from the input. Seems to be a perfect memory
saving strategy? Nearly so. You must know how the string pool is
implemented before making any decisions.
source.
String literals don't get created into the pool at runtime. I don't know for sure if they get GC'd or not, but I suspect that they do not for two reasons:
It would be immensely complex to detect in the general case when a literal will not be used anymore
There is likely a static code segment where it is stored for performance. The rest of the data is likely built around it, where the boundaries are also static
Strings, even though they are immutable, are still objects like any other in Java. Objects are created on the heap and Strings are no exception. So, Strings that are part of the "String Literal Pool" still live on the heap, but they have references to them from the String Literal Pool.
For more please refer this link
`http://www.javaranch.com/journal/200409/ScjpTipLine-StringsLiterally.html`
Edit:
public class ImmutableStrings
{
    public static void main(String[] args)
    {
        String one = "someString";
        String two = new String("someString");
        one = two = null;
    }
}
Just before the main method ends, how many objects are available for garbage collection? 0? 1? 2?
The answer is 1. Unlike most objects, String literals always have a reference to them from the String Literal Pool. That means that they always have a reference to them and are, therefore, not eligible for garbage collection.
Even though neither of our local variables, one or two, refers to our String object, there is still a reference to it from the String Literal Pool. Therefore, the object is not eligible for garbage collection. The object is always reachable through use of the intern() method.

Is one-instance-per-unique-immutable design pattern considered evil?

I was reading a chapter of Effective Java that talks about the advantages of keeping only one instance of each immutable value, so that we can compare object identity with x == y instead of comparing the values for equality.
Also, POJOs like java.awt.RenderingHints.Key often use the one-instance-per-unique-immutable design pattern:
Instances of this class are immutable and unique which means that tests for matches can be made using the == operator instead of the more expensive equals() method.
I can understand the speed boost with this approach, but wouldn't this design pattern eventually cause a memory leak?
Yes, it may cause memory growth (it's not a leak if it's an intentional behavior). Whether it will or won't depends on just how the uniqueness contract is specified. For example, if you serialize one of these objects to disk, exit the scope in which it exists, and then deserialize it back from disk, one of two things happens: either you get the same object, or you get a different one. If you get the same object, then every object ever used in the life of the JVM needs to be kept, and you'll have memory growth. If you get a different object, then the objects only need to exist while there is a reference to them, and you won't have memory growth.
That is sometimes called the Flyweight pattern, especially if the space of possible objects is bounded.
To implement the cache, you can use a WeakHashMap (http://docs.oracle.com/javase/6/docs/api/java/util/WeakHashMap.html) or implement a bounded LRU cache.
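One common way to get a bounded LRU cache is to subclass LinkedHashMap in access order and override removeEldestEntry. The class name LruCache below is made up for this sketch, and the capacity is an arbitrary example value.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical bounded cache: evicts the least recently used entry once the limit is exceeded.
final class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    LruCache(int maxEntries) {
        super(16, 0.75f, true); // true = iterate in access order, least recently used first
        this.maxEntries = maxEntries;
    }

    // Called by LinkedHashMap after each insertion; returning true drops the eldest entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }
}

class LruCacheDemo {
    public static void main(String[] args) {
        LruCache<String, Integer> cache = new LruCache<>(2);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.get("a");    // touch "a" so that "b" becomes the least recently used entry
        cache.put("c", 3); // exceeds the limit, so "b" is evicted
        System.out.println(cache.keySet()); // prints "[a, c]"
    }
}
```

Note that a bounded cache trades away the uniqueness guarantee: an evicted value that is still referenced elsewhere can later be recreated as a second, equal instance, so == comparisons are no longer safe. LinkedHashMap is also not thread safe, so concurrent use needs external synchronization.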
