What I would like to ask is this:
String str1;
for(int i=0;i<10;i++){
str1 = Integer.toString(i);
}
Will this create one String object and reassign its value 10 times, or will it create 10 String objects, using 10 * (the size of a String in bytes) of memory?
That will create 10 different string objects, each assigned to the str1 variable in turn. All but the final one (the one currently referenced by str1) will be available for garbage collection at the end of the loop.
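A minimal sketch to make the count visible, assuming nothing beyond the loop from the question. The class and method names are made up for illustration. The identity-based set compares elements with == rather than equals(), so its size is the number of distinct objects the loop produced. Note that the set itself keeps the strings reachable, which the original loop does not.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

// Hypothetical demo class; not part of the original question.
final class DistinctStringsDemo {

    // Runs the loop from the question and counts the distinct String objects it creates.
    static int countDistinctInstances() {
        // Identity-based set: two elements are "the same" only if they are the same object.
        Set<String> seen = Collections.newSetFromMap(new IdentityHashMap<>());
        String str1;
        for (int i = 0; i < 10; i++) {
            str1 = Integer.toString(i); // rebinds the variable; no existing String is modified
            seen.add(str1);
        }
        return seen.size();
    }

    public static void main(String[] args) {
        System.out.println(countDistinctInstances()); // prints 10
    }
}
```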
There are different concepts at play:
Local variables such as str1 live in stack memory; they hold references, not the objects themselves.
Objects are stored in heap space.
Because strings are immutable, each call to Integer.toString() creates a new String instance, so you will have 10 String instances on the heap.
On each loop iteration, you point str1 at the newly created instance, but that reference is replaced on the next iteration.
The garbage collector will eventually find the heap objects that are no longer reachable from any live reference (for example a stack variable or a static field) and throw them into oblivion.
It will create 10 of them, because String is immutable. The garbage collector will take care of the unused ones if need be.
You can see that in the doc ;)
https://docs.oracle.com/javase/7/docs/api/java/lang/String.html
In Java, String is immutable. By immutable, we mean that Strings are constant: their values cannot be changed after they are created.
From the specification:
a string literal always refers to the same instance of class String.
This is because string literals - or, more generally, strings that are
the values of constant expressions - are "interned" so as to share
unique instances, using the method String.intern.
Integer.toString(i) does not return a literal, so in this case it will create 10 immutable String objects.
A lot of answers have been provided and I won't add to them. Moving forward, if you are interested in how JVM memory management works, I would suggest checking out the Java Memory Management course by Matt Greencroft, https://www.udemy.com/java-memory-management/. I bought it 2 years ago during a sale and it was worth it.
I ran your code in an infinite loop with a 5 secs. pause.
class MemoryTest {
public static void main(String[] args) throws InterruptedException {
while (true) {
String str1;
for (int i = 0; i < 10; i++) {
str1 = Integer.toString(i);
}
Thread.sleep(5000);
System.out.println("Running...");
}
}
}
This is the result from VisualVM with the Visual GC plugin, which I learned about in the course. I ran it for 12 minutes and the JVM executed only one garbage collection. Even in an infinite loop, the short-lived strings stayed in the young generation (the S1 survivor space in my run): the JVM never needed more than one GC and never promoted the objects to the Old space.
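If you don't have VisualVM at hand, a rough check is also possible from inside the program. The sketch below is an illustration only (the class name and the number of rounds are my own choices): it reads the collection counters that the JVM exposes through the standard java.lang.management API before and after running the loop many times. The numbers depend on the JVM, the collector, and the heap settings, so no particular output is implied.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical demo class; not the code that produced the VisualVM result above.
final class GcCountDemo {

    // Sums the collection counts reported by all garbage collectors of this JVM.
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount(); // -1 means the count is not available
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long before = totalCollections();
        String str1 = null;
        // Assumption: 1,000 rounds of the original loop, i.e. 10,000 short-lived strings.
        for (int round = 0; round < 1_000; round++) {
            for (int i = 0; i < 10; i++) {
                str1 = Integer.toString(i);
            }
        }
        long after = totalCollections();
        System.out.println("last value: " + str1);
        System.out.println("collections during the loop: " + (after - before));
    }
}
```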
Related
I was looking for a method to turn a byte array into a string, the only way I found is this:
String s1="horse";
String s2;
byte[] b=s1.getBytes();
s2=new String(b);
So my questions are, is there any other way to convert a byte array to a string without creating a new instance of String?
Also if I repeatedly did:
String s;
while(true){
s=new String();
}
Would this take up more and more memory, or is it automatically deallocated and reallocated? If the memory were deallocated automatically, I would no longer have to look for an alternative method to convert an array of bytes to a string.
P.S.(I want s2 to be "horse")
P.S.2 (Sorry for my bad English)
The comment by Johannes is a good starting place as Garbage collection is a key concept in Java.
To answer your question, though: no, you need to create a new String instance when initializing from a byte array.
In your second code snippet:
String s;
while(true){
s=new String();
}
What we have is a String variable 's' that does not yet refer to anything. In the loop you point it at a new String object on the heap. When you reassign 's' on the next iteration, memory is allocated for another new String, and the garbage collector will later reclaim the old String object from the heap, because nothing references it anymore.
Here's a good article on string immutability.
So my questions are, is there any other way to convert a byte array to a string without creating a new instance of String?
No there isn't. Java strings are immutable. That means that you cannot update / replace the characters in a string. Converting a byte array to an existing string would violate immutability.
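A new String is unavoidable, but the conversion itself can be made more robust. Here is a small sketch, assuming the bytes are UTF-8 (the question does not say which encoding is intended, and the class name is made up). Passing the charset explicitly avoids depending on the platform default, which getBytes() and new String(byte[]) use otherwise.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class for the byte[] -> String conversion.
final class BytesToStringDemo {

    // Encodes a string to bytes and decodes it again with an explicit charset.
    static String roundTrip(String s1) {
        byte[] b = s1.getBytes(StandardCharsets.UTF_8); // assumption: UTF-8 is the intended encoding
        return new String(b, StandardCharsets.UTF_8);   // always allocates a new String object
    }

    public static void main(String[] args) {
        String s2 = roundTrip("horse");
        System.out.println(s2); // prints horse
    }
}
```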
String s;
while (true) {
s = new String();
}
Would this take up more and more memory, or is it automatically deallocated and reallocated?
Memory is automatically reclaimed by the Garbage collector (GC). The GC runs occasionally, identifies objects that are no longer reachable; i.e. that the program cannot find anymore. Those objects are then deleted.
That is the simple version. In reality not all "lost" objects are reclaimed at the same time, and there are some special kinds of reachability that are handled differently.
Anyway, in your example, each time the program goes around the loop, a new String object is created, and the reference to the previous String is lost. Then later (as required) the GC finds and deletes the lost objects.
If the memory were deallocated automatically, I would no longer have to look for an alternative method to convert an array of bytes to a string.
It is, and you don't.
In Java, you just let the runtime system deal with allocation and deallocation of object memory. (Just make sure that you don't cause objects to remain reachable by accident.)
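To illustrate the "reachable by accident" caveat, here is a minimal sketch with a hypothetical static list (all names are made up). Everything added to the list stays strongly reachable, so the GC cannot reclaim it until the list is cleared or the entries are removed.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical example of accidental retention; not taken from the question.
final class AccidentalRetentionDemo {

    // A long-lived collection: whatever is added here stays reachable.
    private static final List<String> HISTORY = new ArrayList<>();

    static void remember(String s) {
        HISTORY.add(s);
    }

    static int retainedCount() {
        return HISTORY.size();
    }

    // Dropping the references makes the strings eligible for garbage collection again.
    static void forgetAll() {
        HISTORY.clear();
    }

    public static void main(String[] args) {
        for (int i = 0; i < 1000; i++) {
            remember(new String("temp" + i)); // never unreachable while HISTORY holds it
        }
        System.out.println(retainedCount()); // prints 1000
        forgetAll();
        System.out.println(retainedCount()); // prints 0
    }
}
```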
List<String> list = new ArrayList<>();
for (int i = 0; i < 1000; i++)
{
StringBuilder sb = new StringBuilder();
String string = sb.toString();
string = string.intern();
list.add(string);
}
In the above sample, after invoking string.intern() method, when will the 1000 objects created in heap (sb.toString) be cleared?
Edit 1:
If there is no guarantee that these objects will be cleared, and assuming that the GC hasn't run, is it pointless to use string.intern() at all (in terms of memory usage)?
Is there any way to reduce memory usage / object creation while using intern() method?
Your example is a bit odd, as it creates 1000 empty strings. If you want to get such a list while consuming minimal memory, you should use
List<String> list = Collections.nCopies(1000, "");
instead.
If we assume that there is something more sophisticated going on, not creating the same string in every iteration, well, then there is no benefit in calling intern(). What will happen, is implementation dependent. But when calling intern() on a string that is not in the pool, it will be just added to the pool in the best case, but in the worst case, another copy will be made and added to the pool.
At this point, we have no savings yet, but potentially created additional garbage.
Interning at this point can only save you some memory, if there are duplicates somewhere. This implies that you construct duplicate strings first, to look up their canonical instance via intern() afterwards, so having the duplicate string in memory until garbage collected, is unavoidable. But that’s not the real problem with interning:
In older JVMs, there was special treatment of interned strings that could result in worse garbage collection performance or even running out of resources (i.e. the fixed-size “PermGen” space).
In HotSpot, the string pool holding the interned strings is a fixed-size hash table, yielding hash collisions, and hence poor performance, when referencing significantly more strings than the table size.
Before Java 7, update 40, the default size was about 1,000, not even sufficient to hold all string constants for any nontrivial application without hash collisions, not to speak of manually added strings. Later versions use a default size of about 60,000, which is better, but still a fixed size that should discourage you from adding an arbitrary number of strings.
The string pool has to obey the inter-thread semantics mandated by the language specification (as it is used for string literals), and hence needs to perform thread-safe updates that can degrade performance.
Keep in mind that you pay the price of the disadvantages named above, even in the cases that there are no duplicates, i.e. there is no space saving. Also, the acquired reference to the canonical string has to have a much longer lifetime than the temporary object used to look it up, to have any positive effect on the memory consumption.
The latter touches your literal question. The temporary instances are reclaimed when the garbage collector runs the next time, which will be when the memory is actually needed. There is no need to worry about when this will happen, but well, yes, up to that point, acquiring a canonical reference had no positive effect, not only because the memory hasn’t been reused up to that point, but also, because the memory was not actually needed until then.
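The point that duplicates must exist before intern() can canonicalize them can be shown with a short sketch (the class and method names are made up for illustration). Two equal strings built at run time are separate objects; only the references returned by intern() are identical, and the two temporary objects stay around until the GC reclaims them.

```java
// Hypothetical demo class for the behaviour of String.intern().
final class InternDemo {

    // Two strings built at run time with equal content are still two objects.
    static boolean sameInstanceBeforeIntern() {
        String a = new String(new char[] {'a', 'b', 'c'});
        String b = new String(new char[] {'a', 'b', 'c'});
        return a == b;
    }

    // intern() returns the canonical instance for the content, so both calls agree.
    static boolean sameInstanceAfterIntern() {
        String a = new String(new char[] {'a', 'b', 'c'});
        String b = new String(new char[] {'a', 'b', 'c'});
        return a.intern() == b.intern();
    }

    public static void main(String[] args) {
        System.out.println(sameInstanceBeforeIntern()); // prints false
        System.out.println(sameInstanceAfterIntern());  // prints true
    }
}
```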
This is the place to mention the new String Deduplication feature. This does not change string instances, i.e. the identity of these objects, as that would change the semantics of the program, but changes identical strings to use the same char[] array. Since these character arrays are the biggest payload, this may still achieve great memory savings, without the performance disadvantages of using intern(). Since this deduplication is done by the garbage collector, it will only be applied to strings that survived long enough to make a difference. Also, this implies that it will not waste CPU cycles when there is still plenty of free memory.
However, there might be cases, where manual canonicalization might be justified. Imagine, we’re parsing a source code file or XML file, or importing strings from an external source (Reader or data base) where such canonicalization will not happen by default, but duplicates may occur with a certain likelihood. If we plan to keep the data for further processing for a longer time, we might want to get rid of duplicate string instances.
In this case, one of the best approaches is to use a local map, not being subject to thread synchronization, dropping it after the process, to avoid keeping references longer than necessary, without having to use special interaction with the garbage collector. This implies that occurrences of the same strings within different data sources are not canonicalized (but still being subject to the JVM’s String Deduplication), but it’s a reasonable trade-off. By using an ordinary resizable HashMap, we also do not have the issues of the fixed intern table.
E.g.
static List<String> parse(CharSequence input) {
List<String> result = new ArrayList<>();
Matcher m = TOKEN_PATTERN.matcher(input);
CharBuffer cb = CharBuffer.wrap(input);
HashMap<CharSequence,String> cache = new HashMap<>();
while(m.find()) {
result.add(
cache.computeIfAbsent(cb.subSequence(m.start(), m.end()), Object::toString));
}
return result;
}
Note the use of the CharBuffer here: it wraps the input sequence and its subSequence method returns another wrapper with different start and end index, implementing the right equals and hashCode method for our HashMap, and computeIfAbsent will only invoke the toString method, if the key was not present in the map before. So, unlike using intern(), no String instance will be created for already encountered strings, saving the most expensive aspect of it, the copying of the character arrays.
If we have a really high likelihood of duplicates, we may even save the creation of wrapper instances:
static List<String> parse(CharSequence input) {
List<String> result = new ArrayList<>();
Matcher m = TOKEN_PATTERN.matcher(input);
CharBuffer cb = CharBuffer.wrap(input);
HashMap<CharSequence,String> cache = new HashMap<>();
while(m.find()) {
cb.limit(m.end()).position(m.start());
String s = cache.get(cb);
if(s == null) {
s = cb.toString();
cache.put(CharBuffer.wrap(s), s);
}
result.add(s);
}
return result;
}
This creates only one wrapper per unique string, but also has to perform one additional hash lookup for each unique string when putting. Since the creation of a wrapper is quite cheap, you really need a significantly large number of duplicate strings, i.e. a small number of unique strings compared to the total number, to benefit from this trade-off.
As said, these approaches are very efficient, because they use a purely local cache that is just dropped afterwards. With this, we don’t have to deal with thread safety nor interact with the JVM or garbage collector in a special way.
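For anyone who wants to try the first variant, here is a self-contained sketch of it. TOKEN_PATTERN is not defined in the snippets above, so the pattern used here (runs of non-whitespace characters) is purely an assumption for the demo, as are the class name and the sample input.

```java
import java.nio.CharBuffer;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical wrapper class around the parse method shown above.
final class LocalCacheDemo {

    // Assumption: a token is a run of non-whitespace characters.
    private static final Pattern TOKEN_PATTERN = Pattern.compile("\\S+");

    static List<String> parse(CharSequence input) {
        List<String> result = new ArrayList<>();
        Matcher m = TOKEN_PATTERN.matcher(input);
        CharBuffer cb = CharBuffer.wrap(input);
        // Local cache: dropped when the method returns, so nothing is retained longer than needed.
        Map<CharSequence, String> cache = new HashMap<>();
        while (m.find()) {
            // toString() is only invoked when the token has not been seen before.
            result.add(cache.computeIfAbsent(cb.subSequence(m.start(), m.end()), Object::toString));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> tokens = parse("foo bar foo");
        System.out.println(tokens);
        // Both occurrences of "foo" refer to the same String object.
        System.out.println(tokens.get(0) == tokens.get(2));
    }
}
```

The identity check at the end is the whole point: equal tokens share one String instance without going through the JVM-wide intern table.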
You can open JMC (Java Mission Control) and, under the Memory tab of the MBean Server for the particular JVM, check when a GC was performed and how much it cleared. Still, there is no guarantee about when the GC will be called. You can also initiate a GC under Diagnostic Commands for a specific JVM.
Hope it helps.
In the code snippet below, new String objects will be created to store the modified strings, as Strings are immutable in Java. But I'm not sure which statements will create a new object and which objects will be marked for garbage collection?
String s1 = "It";
String s2 = "was";
String s3 = s1+" "+s2;
s2+=" roses";
s3 = s3+s2+" roses all the way";
System.out.println(s3);
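It depends on the scope of your code snippet.
If all the code is inside one method, the strings built at run time become eligible for garbage collection once the method has finished; the string literals stay in the string pool.
The garbage collector works on reachability: an object can be collected once no live reference leads to it. (HotSpot does not count references.)
For example, you declare a local reference String s1 inside a method and assign something to it. When the method completes, that reference is gone, so if nothing else refers to the object, it becomes garbage.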
This is a brief, basic explanation, the actual behavior can vary based on compiler and JVM used. Also there are a lot of articles available that go into this topic in depth and can provide more detailed explanations.
These literals will be put in the JVM's string pool and will normally not be GC'ed (i.e. they will exist in memory for as long as the class that defines them is loaded, which is typically the duration of the JVM):
"It", "was", " ", " roses", " roses all the way"
Then, as far as the String references are concerned, It depends on the scope of the variables. I will assume local method level for this answer:
String s1 = "It"; // a reference will be created on the stack
String s2 = "was"; // a reference will be created on the stack
String s3 = s1+" "+s2; // a reference will be created on the stack for s3; the concatenation allocates a new String on the heap, plus possibly temporary helper objects (e.g. a StringBuilder), depending on how the compiler translates it. Any temporaries become eligible for GC immediately.
s2+=" roses"; // same as above.
s3 = s3+s2+" roses all the way"; // same as above but the object s3 was pointing to will become eligible for GC immediately.
System.out.println(s3); // no memory allocation.
When the method ends, s1, s2, and s3 references will be cleared from the stack and any remaining objects pointed to become eligible for GC.
Hope this helps, remember this is a very basic explanation, I recommend reading up on this topic, as how it will actually behave can vary greatly depending on how the compiler decides to optimize. (For example, the compiler may see all these temporary references and concatenations are not needed and discard them)
Note: Since you are concatenating to create strings, you may want to consider the Mutable alternatives like StringBuffer or StringBuilder which can help optimize.
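As a rough illustration of the StringBuilder suggestion, here is the same snippet rewritten so that s3 is accumulated in one builder. The class and method names are made up; the result is the same text that the original snippet prints.

```java
// Hypothetical rewrite of the question's snippet using StringBuilder.
final class ConcatDemo {

    static String build() {
        String s1 = "It";
        String s2 = "was";
        StringBuilder s3 = new StringBuilder();
        s3.append(s1).append(' ').append(s2);        // builder now holds "It was"
        s2 += " roses";                               // s2 now refers to a new String "was roses"
        s3.append(s2).append(" roses all the way");  // appends in place, no intermediate String
        return s3.toString();                         // one final String is created here
    }

    public static void main(String[] args) {
        System.out.println(build()); // prints It waswas roses roses all the way
    }
}
```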
I understand that if a String is initialized with a literal, it is allotted space in the String Pool, and if it is initialized with the new keyword, it creates a String object. But I am confused by the case written below.
My question is: what happens if a String is created with the new keyword and then its value is updated with a literal?
E.g.
String s = new String("Value1"); // Creates a new object in heap space
Then what if I write the next statement as below?
s = "value2";
So my question is,
1 Will it create a String literal in the String Pool, or will it update the value of that object?
2 If it creates a new literal in the String Pool, what will happen to the existing object? Will it be destroyed, or will it stay there until the garbage collector is called?
This is a small string, but if the string has, say, thousands of characters, then I am worried about the space it uses. So my key question is about space.
Will it immediately free the space from the heap after assigning the literal?
Can anyone explain what value goes where from the first statement to the second, and what will happen to the memory areas (heap and String Pool)?
Modifying Strings
The value is not updated when running
s = "value2";
In Java, except for the primitive types, all variables are references to objects. This means that only the reference s changes: it now points to a different object.
Immutability guarantees that the state of an object cannot change after construction. In other words, there is no means to modify the content of any String object in Java. If you, for instance, write s = s+"a"; you have created a new string that stores the new text.
Garbage collection
This answer already provides an in-depth answer. Below a short summary if you don't want to read the full answer, but it omits some details.
By default new String(...) objects are not interned and thus the normal rules of garbage collection apply. These are just ordinary objects.
The constant strings in your code, which are interned, are typically never removed, as it is likely that you will eventually refer back to them.
There is however a side-note in the answer that sometimes classes are dynamically (un)loaded, in which case the literals can be removed from the pool.
To answer your additional questions:
Will it immediately free the space from the heap after assigning the literal?
No, that would not be very efficient: the garbage collector needs to analyze which objects it can remove. It is possible that you shared the reference to your old string with other objects, so it is not guaranteed that the object can be recycled. Furthermore, there is not much wrong with keeping data that is no longer useful, as long as you don't need to ask the operating system for additional memory. (Compare it with your computer: as long as all your data fits on your hard disk drive, you don't really have to worry about useless files; only when you would have to buy an additional drive will you probably try to remove some files first.) The analysis requires some computational effort, so in general a garbage collector only runs when memory is (nearly) exhausted. So you shouldn't worry much about memory.
Can anyone explain what value goes where from the first statement to the second, and what will happen to the memory areas (heap and String Pool)?
Your first string:
String s = new String("Value1");
is a reference to an object on the heap. When the statement executes, space is allocated on the heap for the new String (the literal "Value1" itself lives in the String Pool).
Now if you call:
s = "value2";
"value2" is an element of the String Pool, it will remain there until your program ends.
Since you no longer have a reference to your old string (Value1), that object is a candidate for collection. If the garbage collector later walks by, it will remove the object from the heap and mark the space as free.
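The steps above can be checked with a small sketch (the class, method, and variable names are made up for illustration). A second variable keeps pointing at the old object to show that the assignment only rebinds s and does not touch the content of the String created with new.

```java
// Hypothetical demo class for the two statements from the question.
final class ReassignDemo {

    // Returns the content seen through a second reference after s has been reassigned.
    static String oldObjectContent() {
        String s = new String("Value1"); // new object on the heap
        String alias = s;                // second reference to the same object
        s = "value2";                    // s now points at the pooled literal
        return alias;                    // the old object is unchanged
    }

    // After the assignment, s refers to the same instance as the literal "value2".
    static boolean pointsAtLiteral() {
        String s = new String("Value1");
        s = "value2";
        return s == "value2";
    }

    public static void main(String[] args) {
        System.out.println(oldObjectContent()); // prints Value1
        System.out.println(pointsAtLiteral());  // prints true
    }
}
```

In the question's code there is no alias, so the object created with new becomes unreachable right after the assignment and is reclaimed whenever the GC next needs the space.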
If you need to change a string, you can always create a new one that contains the modifications.
Java defines a peer class of String, called StringBuffer, which allows strings to be altered.
I have run into a strange case like this:
public class SoftRefDemo {
private static List<SoftReference<String>> cache;
public static void main(String[] args) {
int total = 3000000;
cache = new ArrayList<SoftReference<String>>(total);
for (int i = 0; i < total; i++) {
cache.add(new SoftReference<String>("fafsfsfsdf" + i));
}
System.out.println(cache.size());
}
}
I have set the JVM options -Xms20m -Xmx40m. When I try to put many SoftReference objects into the cache, the JVM exits without any message or exception. I am unsure about the behaviour of SoftReference, since it is a special object for the JVM. Could anyone explain what happens in this program?
Two more questions:
1. Is there an extra memory allocation mechanism for those 'special reference instances' in the JVM heap?
2. When can those reference instances be freed once the instances that they point to have been freed? Thanks a lot!
When the OOM occurs, there are always many SoftReference instances still in existence. Can you help to explain this case?
Each instance of SoftReference occupies 24 bytes of heap memory by itself (this is true for the 32-bit Sun JVM, and may differ for other VMs). You are trying to store 3,000,000 instances in the list, which means you will need at least ~70 MB of heap space just to store the SoftReference objects, well above your -Xmx40m limit.
A SoftReference object keeps a "soft reference" to the softly reachable object (a String in your case), and the SoftReference documentation guarantees that those references will be cleared (i.e. the String objects can be garbage collected) before the virtual machine throws an OutOfMemoryError. However, the JVM will NOT garbage-collect the SoftReference objects themselves, since you keep strong references to them from your cache list. (So if you need a SoftReference object to be removed from the heap, remove it from the cache list.)
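A minimal sketch of that clean-up step, with made-up class and method names. Since the moment at which the GC clears a soft reference cannot be controlled, the demo calls clear() by hand to simulate it; in a real cache the entries would be cleared by the GC under memory pressure.

```java
import java.lang.ref.SoftReference;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that drops SoftReference wrappers whose referent is gone.
final class SoftCachePurgeDemo {

    // Removes every cleared reference from the list and returns the remaining size.
    static int purgeCleared(List<SoftReference<String>> cache) {
        cache.removeIf(ref -> ref.get() == null);
        return cache.size();
    }

    public static void main(String[] args) {
        List<SoftReference<String>> cache = new ArrayList<>();

        String kept = "kept"; // strongly reachable, so its soft reference is not cleared
        cache.add(new SoftReference<>(kept));

        SoftReference<String> gone = new SoftReference<>(new String("gone"));
        gone.clear(); // simulates the GC clearing the reference
        cache.add(gone);

        System.out.println(purgeCleared(cache)); // prints 1
    }
}
```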
If I run this program with -mx40m
char[] chars = new char[4096];
List<SoftReference<String>> strings = new ArrayList<SoftReference<String>>();
do {
strings.add(new SoftReference<String>(new String(chars)));
} while(strings.get(0).get()!=null);
int nulls=0, set=0;
for (SoftReference<String> string : strings) {
if(string.get() == null) nulls++; else set++;
}
System.out.println("nulls= "+nulls+", was still set= "+set);
I get
nulls= 4618, was still set= 1
One of the problems with WeakReferences and SoftReferences is that they tend to all be cleared at once.
Try WeakReference; SoftReferences are only cleared when the JVM really has to. To see them cleared, try creating an array large enough to trigger an OutOfMemoryError.
All objects are on the heap. Even static fields are wrapped in a pseudo-object which is on the heap. (This latter behaviour is not specified, but it is how I have seen it work.)
The reference instance is freed after it has been discarded (like any other object).