Java has the concept of a string literal pool. If I am not creating any strings in my code, this memory pool is wasted for me. How can I use this memory area for something else instead of leaving it reserved for the string literal pool?
There is no "string literal pool"; string literals are interned, but that means they are just normal objects on the heap. They presumably get referenced a lot and in this way save on memory, but fundamentally they are no different than any other object.
If no string literals exist in your program (and you don't ever call String.intern) then the JVM does not allocate heap memory for such. There is no "hidden" memory area involved, and you don't need to do anything to "get access to it".
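To make the point above concrete, here is a minimal sketch showing that literals are simply shared String objects. The class and method names are made up for illustration; the behaviour relies only on the language rule that string-valued constant expressions are interned.

```java
// Sketch: string literals are ordinary objects that the JVM shares via interning.
final class LiteralIdentityDemo {

    // Two occurrences of the same literal refer to one shared String object.
    static boolean sameLiteralIsShared() {
        String a = "hello";
        String b = "hello";
        return a == b;
    }

    // A compile-time constant expression is folded by the compiler and interned too.
    static boolean constantExpressionIsShared() {
        String a = "hel" + "lo";
        return a == "hello";
    }

    // A string assembled at run time is a distinct object, even with equal content.
    static boolean runtimeStringIsDistinct() {
        String part = "hel";
        String built = new StringBuilder(part).append("lo").toString();
        return built != "hello" && built.equals("hello");
    }

    public static void main(String[] args) {
        System.out.println(sameLiteralIsShared());
        System.out.println(constantExpressionIsShared());
        System.out.println(runtimeStringIsDistinct());
    }
}
```

All three methods return true. If a program contains no literals and never calls intern(), none of these shared objects are created by that program's own code.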
I don't think the question makes sense. Anyway, in theory the string pool is in the permgen area of the Java heap. This is the same memory where the JVM stores classes. By default (at least for the Oracle HotSpot JVM) it is 64 MB. On JVMs that still have a permgen (Java 7 and earlier), you can try to configure this area with two HotSpot options: -XX:MaxPermSize and -XX:PermSize. The smaller the permgen, the more memory is left for objects.
How can I use this memory area instead of keeping it for string literal pool.
It is an area reserved by the JVM, and it grows (up to some threshold) as the program runs. In short, you can't use it for any other purpose.
Related
I have a map of format Map stored in a file.
This file has over 100,000 records.
The value of each entry is nearly 10k.
I load 1000 records into a map in memory, process them, then clear the map and load the next 1000 records.
My question is :
Since the strings are stored in the String pool, which is in the permgen
memory area, when I clear the map will the Strings be garbage
collected?
In case they are not garbage collected, is there any way to force
them to be garbage collected?
Is there any guarantee that if the program is running out of memory,
the JVM would clean the permgen memory before throwing an
OutOfMemoryError?
Ok.. Let's start....
Since the strings are stored in the String pool, which is in the permgen memory
area, when I clear the map will the Strings be garbage collected?
All strings are NOT stored in String constants pool. Only interned Strings and String literals go into the String constants pool. There is no concept of permgen in java-8. Metaspace has (almost gracefully) replaced Permgen.
If you have Strings read from a file (which are not interned), then yes, your strings will get GCed. If you have String literals (and God save you if you do.. :P), then they will be GCed when the classloader that loaded the class defining those literals gets GCed.
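For reference, the batch pattern from the question could be sketched roughly as below. This is only an illustration: the "key=value" line format, the BatchLoader class, and the process method are assumptions, since the question does not show the actual file layout or processing step.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

final class BatchLoader {

    // Reads "key=value" lines (an assumed format) and processes them batchSize at a time.
    // Returns the total number of records handed to process().
    static int processInBatches(BufferedReader reader, int batchSize) {
        Map<String, String> batch = new HashMap<>();
        int processed = 0;
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                int sep = line.indexOf('=');
                if (sep < 0) {
                    continue; // skip malformed lines
                }
                // Keys and values built here are plain heap Strings, not interned ones.
                batch.put(line.substring(0, sep), line.substring(sep + 1));
                if (batch.size() == batchSize) {
                    processed += process(batch);
                    batch.clear(); // drops the map's references to this batch
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        processed += process(batch); // final, possibly partial, batch
        return processed;
    }

    // Hypothetical stand-in for the real work; it only counts the entries.
    static int process(Map<String, String> batch) {
        return batch.size();
    }

    public static void main(String[] args) {
        BufferedReader in = new BufferedReader(new StringReader("a=1\nb=2\nc=3\n"));
        System.out.println(processInBatches(in, 2));
    }
}
```

Once clear() has run and nothing else refers to the old keys and values, they are ordinary garbage and can be reclaimed by any later collection.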
In case they are not garbage collected, is there any way to force
them to be garbage collected?
Well, you could always call System.gc() explicitly (NOT a good idea in a production environment). If you are using Java 8, use the G1 collector and enable String deduplication (-XX:+UseG1GC -XX:+UseStringDeduplication, available from 8u20).
Is there any guarantee that if the program is running out of memory,
the JVM would clean the permgen memory before throwing an
OutOfMemoryError?
The GC will try its best to cleanup as much as it can. No, there is no guarantee that this would happen.
In Java, why are String datatypes allocated memory on the heap?
The reason is simple: all objects are stored on the heap. It is designed like that. String is a class, and its objects are stored on the heap.
Also note that String literals were previously stored in a part of the heap called the "permgen". Now, according to the JVM Specification, the area for storing string literals is in the runtime constant pool.
Only local variables (primitive values and object references) are stored on the stack.
Heap memory is used by java runtime to allocate memory to Objects and
JRE classes. Whenever we create any object, it’s always created in the
Heap space. Garbage Collection runs on the heap memory to free the
memory used by objects that don’t have any reference. Any object
created in the heap space has global access and can be referenced from
anywhere in the application.
A good point to quote from the JDK7
Area: HotSpot
Synopsis: In JDK 7, interned strings are no longer
allocated in the permanent generation of the Java heap, but are
instead allocated in the main part of the Java heap (known as the
young and old generations), along with the other objects created by
the application. This change will result in more data residing in the
main Java heap, and less data in the permanent generation, and thus
may require heap sizes to be adjusted. Most applications will see only
relatively small differences in heap usage due to this change, but
larger applications that load many classes or make heavy use of the
String.intern() method will see more significant differences. RFE:
6962931
By default all objects are on the heap. A String involves two objects: the String itself and the char[] it wraps. It is not unusual to find that the most numerous object by type is char[], even if you create none directly.
What is surprising is that the JVM doesn't always create objects on the heap; it can place objects on the stack through escape analysis. Note: it can't do this for String literals, as they are stored in the String literal pool.
When the user enters Strings, the input is always dynamic, that is, the size of the string may change on each execution, so the compiler doesn't know the exact amount of memory the String needs. Even at run time, the size of the string cannot be predicted until the user has entered the complete string, so no memory can be assigned on the stack. Hence the JVM generally stores a pointer on the stack that points to the string (on the heap).
After exploring Java's string internals I've grown confused about what is referred to as the "perm space." My initial understanding was that it held String literals as well as class metadata, as explained in this question.
I've also read about the String.intern() method and that it places Strings into the String Pool, returning a reference to the unique instance. It is my understanding that this is the same string pool holding String literals that exists in the JVM's perm-space. It didn't seem possible to me that the "perm-space" could be modifiable (it is permanent after all, yes?). But then I found this question, where the top voted comment by EJP on the accepted answer explains that
Intern'd strings have been GC-able for quite some years now.
Implying that the GC runs on the perm-space which doesn't seem very permanent. How does this reconcile? Does the GC check everything in the perm-space? Does the GC check everything in the string pool including string literals from the source? Is there a second string pool for intern'd strings? Does the GC know only to look over intern'd strings when collecting? Or is this comment mistaken and intern'ing a string prevents it from ever being GC'd (which I hope is not the case)?
String literals are interned. As of Java 7, the HotSpot JVM puts interned Strings in the heap, not permgen.
Prior to Java 7, HotSpot put interned Strings in permgen. However, interned Strings in permgen were garbage collected. Apparently, Class objects in permgen are also collectable, so everything in permgen is collectable, though permgen collection might not be enabled by default in some old JVMs.
String literals, being interned, would be a reference held by the declaring Class object to the String object in the intern pool. So the interned literal String would only be collected if the Class object that referred to it were also collected.
When we intern a string, we are making sure that all uses of that string are referring to the same instance.
I would assume that the underlying string object is in the heap.
However, where is the referring variable stored in the memory?
Does it have the same behaviour as static, wherein the reference gets stored in permgen and makes the string instance available for GC only after the classloader (and application) exits?
Up to JDK 6, interned strings were stored in a memory pool called the Permanent Generation, which is an area of the JVM that is reserved for non-user objects, like Classes, Methods and other internal JVM objects. The size of this area is limited, and is usually much smaller than the heap.
From JDK 7, interned strings are no longer allocated in the permanent generation of the Java heap, but are instead allocated in the main part of the Java heap (known as the young and old generations), along with the other objects created by the application. This change will result in more data residing in the main Java heap, and less data in the permanent generation, and thus may require heap sizes to be adjusted. Most applications will see only relatively small differences in heap usage due to this change, but larger applications that load many classes or make heavy use of the String.intern() method will see more significant differences.
A detailed explanation of this can be found in this answer.
When we intern a string, we are making sure that all uses of that string are referring to the same instance.
Not exactly. When you do this:
String s2 = s1.intern();
what you are doing is ensuring that s2 refers to a String in the string pool. This does not affect the value in s1, or any other String references or variables. If you want other copies of the string to be interned, you need to do that explicitly ... or assign interned string references to the respective variables.
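A small sketch of that behaviour, with made-up class and method names, might look like this:

```java
// Sketch: intern() returns the pooled instance but leaves the original reference alone.
final class InternDemo {

    static boolean internLeavesOriginalAlone() {
        String s1 = new String("pooled"); // a fresh object, distinct from the literal
        String s2 = s1.intern();          // the canonical instance from the string pool
        // s1 still refers to its own object; only s2 refers to the pooled one.
        return s1 != s2 && s2 == "pooled";
    }

    // To make other variables share the pooled instance, assign the result back explicitly.
    static boolean reassignmentSharesInstance() {
        String a = new String("pooled");
        String b = new String("pooled");
        a = a.intern();
        b = b.intern();
        return a == b;
    }

    public static void main(String[] args) {
        System.out.println(internLeavesOriginalAlone());
        System.out.println(reassignmentSharesInstance());
    }
}
```

Both methods return true: interning never rewrites existing references, it only hands back the canonical one.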
I would assume that the underlying string object is in the heap.
That is correct. It might be in the "permgen" heap or the regular heap, depending on the version of Java you are using. But it is always "in the heap".
However, where is the referring variable stored in the memory?
The "referring variable" ... i.e. the one that holds the reference that you got from calling intern() ... is no different from any other variable. It can be
a local variable or parameter (held in a stack frame),
an instance field (held in a regular heap object),
a static field (held in a permgen heap object) ... or even
a jstring variable or similar in JNI code (held "somewhere else".)
In fact, a typical JVM uses a private hash table to hold the references to interned strings, and it uses the JVM's weak reference mechanism to ensure that interned strings can be garbage collected if nothing else is using them.
Does it have the same behaviour as static, wherein the reference gets stored in permgen and makes the string instance available for GC only after the classloader (and application) exits?
Typically no ... see above.
In most Java platforms, interned Strings can be garbage collected just like other Strings. If the interned Strings are stored in "permgen" space, it may take longer for the object to be garbage collected, because "permgen" is collected infrequently. However the lifetime of an interned String is not tied to the lifetime of a classloader, etc.
I read the question How is the java memory pool divided? on this site, and I was wondering which of these sectors the "String Constant Pool" belongs to.
Also, do the String literals in the pool ever get GCed?
The intern() method returns the reference to the String literal from the pool.
If the pool does get GCed, wouldn't that be counter-productive to the idea of the string pool? New String literals would be created again, nullifying the GC.
(It is assuming that only a specific set of literals exist in the pool, they never go obsolete and sooner or later they will be needed again)
As far as I know String literals end up in the "Perm Gen" part of non-Heap JVM memory. Perm Gen space is only examined during Full GC runs (not Partials).
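If you want to see which memory pools your own JVM actually has, a rough sketch using the standard java.lang.management API is shown below (the MemoryPools class name is made up). Pool names depend on the JVM version and the collector in use: older HotSpot versions report a "Perm Gen" pool, while Java 8 and later report "Metaspace" instead.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.util.ArrayList;
import java.util.List;

final class MemoryPools {

    // Returns one "type: name" line per memory pool reported by the running JVM.
    static List<String> describePools() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            lines.add(pool.getType() + ": " + pool.getName());
        }
        return lines;
    }

    // True if at least one pool is classified as heap memory.
    static boolean hasHeapPool() {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == MemoryType.HEAP) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        for (String line : describePools()) {
            System.out.println(line);
        }
    }
}
```

The output only tells you how the JVM partitions memory; it does not say where interned strings live, which depends on the Java version as discussed in the other answers.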
In early JVMs (and I confess I had to look this up because I wasn't sure), String literals in the String Pool never got GC'ed. In the newer JVMs, WeakReferences are used to reference the Strings in the pool, so interned Strings can actually get GC'ed, but only during Full Garbage collections.
Reading the JavaDoc for String.intern() doesn't give hints to the implementation, but according to this page, the interned strings are held by a weak reference. This means that if the GC detects that there are no references to the interned string except for the repository that holds interned strings then it is allowed to collect them. Of course this is transparent to external code so unless you are using weak references of your own you'll never know about the garbage collection.
String pooling
String pooling (sometimes also called string canonicalisation) is a
process of replacing several String objects with equal value but
different identity with a single shared String object. You can achieve
this goal by keeping your own Map (with possibly soft
or weak references depending on your requirements) and using map
values as canonicalised values. Or you can use String.intern() method
which is provided to you by JDK.
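A hand-rolled pool of the kind described above could be sketched as follows. This is a minimal illustration, not the JDK's implementation: the StringCanonicalizer class is hypothetical, and the choice of WeakHashMap with weak values is one possible design among several.

```java
import java.lang.ref.WeakReference;
import java.util.Map;
import java.util.WeakHashMap;

final class StringCanonicalizer {

    // Weak keys let the GC drop an entry once nothing else holds the canonical String.
    // The value is wrapped in a WeakReference too; a strong value would keep its key alive.
    private final Map<String, WeakReference<String>> pool = new WeakHashMap<>();

    // Returns the shared instance equal to s, registering s itself if none exists yet.
    synchronized String canonicalize(String s) {
        WeakReference<String> ref = pool.get(s);
        String existing = (ref == null) ? null : ref.get();
        if (existing != null) {
            return existing;
        }
        pool.put(s, new WeakReference<>(s));
        return s;
    }

    // Number of live entries currently tracked by the pool.
    synchronized int size() {
        return pool.size();
    }
}
```

Usage is the same as with intern(): replace `s` with `canonicalizer.canonicalize(s)` wherever equal strings should share one instance. Unlike String.intern(), the pool is scoped to the canonicalizer object, so discarding it releases the whole pool at once.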
In the days of Java 6, using String.intern() was forbidden by many
standards due to a high possibility of getting an OutOfMemoryError if
pooling went out of control. The Oracle Java 7 implementation of string
pooling was changed considerably. You can look for details in
http://bugs.sun.com/view_bug.do?bug_id=6962931 and
http://bugs.sun.com/view_bug.do?bug_id=6962930.
String.intern() in Java 6
In those good old days all interned strings were stored in the PermGen
– the fixed size part of heap mainly used for storing loaded classes
and string pool. Besides explicitly interned strings, PermGen string
pool also contained all literal strings earlier used in your program
(the important word here is used – if a class or method was never
loaded/called, any constants defined in it will not be loaded).
The biggest issue with such string pool in Java 6 was its location –
the PermGen. PermGen has a fixed size and can not be expanded at
runtime. You can set it using -XX:MaxPermSize=96m option. As far as I
know, the default PermGen size varies between 32M and 96M depending on
the platform. You can increase its size, but its size will still be
fixed. Such limitation required very careful usage of String.intern –
you’d better not intern any uncontrolled user input using this method.
That’s why string pooling in the days of Java 6 was mostly implemented in
manually managed maps.
String.intern() in Java 7
Oracle engineers made an extremely important change to the string
pooling logic in Java 7 – the string pool was relocated to the heap.
It means that you are no longer limited by a separate fixed size
memory area. All strings are now located in the heap, as most of other
ordinary objects, which allows you to manage only the heap size while
tuning your application. Technically, this alone could be a sufficient
reason to reconsider using String.intern() in your Java 7 programs.
But there are other reasons.
String pool values are garbage collected
Yes, all strings in the JVM string pool are eligible for garbage
collection if there are no references to them from your program roots.
It applies to all discussed versions of Java. It means that if your
interned string went out of scope and there are no other references to
it – it will be garbage collected from the JVM string pool.
Being eligible for garbage collection and residing in the heap, the JVM
string pool seems to be the right place for all your strings, doesn’t it?
In theory it is true – unused strings will be garbage collected from
the pool, and used strings will allow you to save memory in case you
get an equal string from the input. Seems to be a perfect memory
saving strategy? Nearly so. You must know how the string pool is
implemented before making any decisions.
source.
String literals don't get created into the pool at runtime. I don't know for sure if they get GC'd or not, but I suspect that they do not for two reasons:
It would be immensely complex to detect in the general case when a literal will not be used anymore
There is likely a static code segment where it is stored for performance. The rest of the data is likely built around it, where the boundaries are also static
Strings, even though they are immutable, are still objects like any other in Java. Objects are created on the heap and Strings are no exception. So, Strings that are part of the "String Literal Pool" still live on the heap, but they have references to them from the String Literal Pool.
For more, please refer to this link:
`http://www.javaranch.com/journal/200409/ScjpTipLine-StringsLiterally.html`
Edit:
public class ImmutableStrings
{
    public static void main(String[] args)
    {
        String one = "someString";
        String two = new String("someString");
        one = two = null;
    }
}
Just before the main method ends, how many objects are available for garbage collection? 0? 1? 2?
The answer is 1. Unlike most objects, String literals always have a reference to them from the String Literal Pool. That means that they always have a reference to them and are, therefore, not eligible for garbage collection.
Even though neither of our local variables, one or two, refers to our String object, there is still a reference to it from the String Literal Pool. Therefore, the object is not eligible for garbage collection. The object is always reachable through use of the intern() method.