vector and garbage collector - java

I'm running a java program that uses many vectors. I'm afraid my use of them is causing the garbage collector not to work.
I have many threads that do:
vec.addAll(<collection>);
and other threads that do:
vec.remove(0);
I have printouts that show the vector is empty from time to time but I was wondering if the memory is actually freed.
Do I need to worry?

If the objects in your Vectors are not referenced anywhere (by the Vector or by any other code), then they will get collected at the garbage collector's discretion. 99.999% of the time the garbage collector won't need your help with this.
However, even after the garbage collector frees the objects, it may not give heap memory back to the operating system, so your process may appear to hold more memory than it should.
Additionally, I'm not very familiar with the implementation of the Vector class (as others have pointed out, you should really be using ArrayList instead), but when you call .remove() I don't think the underlying array is ever resized downward. So if you stuff several thousand objects into a Vector and then remove them all, it will probably still hold a backing array with several thousand empty slots. The solution in this case is to call vector.trimToSize().
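To make the capacity point concrete, here is a minimal sketch that reads Vector.capacity() after emptying a Vector and again after trimToSize(). The class and method names (VectorCapacityDemo, capacities) are made up for illustration; only the Vector calls come from the standard library.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Vector;

// Hypothetical demo class, not part of the original question.
class VectorCapacityDemo {

    /** Returns { capacity after removing everything, capacity after trimToSize() }. */
    static int[] capacities(int n) {
        List<Integer> batch = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            batch.add(i);
        }

        Vector<Integer> vec = new Vector<>();
        vec.addAll(batch);            // same call the question uses
        while (!vec.isEmpty()) {
            vec.remove(0);            // same call the question uses
        }

        int afterRemovals = vec.capacity(); // backing array has not shrunk
        vec.trimToSize();                   // swap in an array sized to the contents
        int afterTrim = vec.capacity();
        return new int[] { afterRemovals, afterTrim };
    }

    public static void main(String[] args) {
        int[] c = capacities(10_000);
        System.out.println("capacity after removals:   " + c[0]);
        System.out.println("capacity after trimToSize: " + c[1]);
    }
}
```

Note that the removed elements are already unreferenced before trimToSize() is called; the call only releases the empty backing array.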

The memory will eventually be freed if there are no references to the objects in question. Seeing your vectors become empty indicates that at least they are not holding on to references. If there is nothing else with references to those objects, they will be cleaned up, but only when the garbage collector chooses to do so (which is always before you run out of memory).

(Caveat: Obviously calling remove(0) will only remove the first element from the Vector, not multiple elements.)
Assuming your Vector is empty then you do not need to worry about garbage collection if the objects in your vector are not being referenced elsewhere. However, if there are still other references to the objects then there is no way they can be garbage collected.
To verify this, I'd recommend running a profiler (e.g. JProfiler) and periodically "snapping" the object count for the type of object being stored in your Vector, and then monitoring this count to see if it increases over time.
One other piece of advice: Vector is obsolete; you should consider using LinkedList or ArrayList instead, which are unsynchronized equivalents. If you wish to make them thread-safe you should initialise them using Collections.synchronizedList(new ArrayList<>());
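As a rough sketch of that last suggestion, several threads can append to a list wrapped by Collections.synchronizedList without losing updates. The class and method names here (SynchronizedListDemo, fillConcurrently) are invented for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical demo class illustrating Collections.synchronizedList.
class SynchronizedListDemo {

    /** Starts `threads` workers that each add `perThread` items; returns the final size. */
    static int fillConcurrently(int threads, int perThread) {
        List<Integer> list = Collections.synchronizedList(new ArrayList<>());
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    list.add(i); // each individual call is synchronized by the wrapper
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return list.size();
    }

    public static void main(String[] args) {
        System.out.println("final size: " + fillConcurrently(4, 1000));
    }
}
```

Only single calls are atomic with this wrapper; iterating over the list still requires synchronizing on it manually.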

No, you don't. If the vector is empty it is not referencing objects you once put in there. If you want to know what is holding on to memory, you can get a profiler and look at what is consuming the memory.

No.
Just trust the implementors of the standard library. They probably have done a good job.
If you are really worried about memory leaks, call this from time to time:
System.out.println("Total Memory: " + Runtime.getRuntime().totalMemory());
System.out.println("Free Memory: " + Runtime.getRuntime().freeMemory());
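The two figures are easier to read when combined into a "used heap" number. A minimal sketch, assuming a helper class named MemoryReport (not from the original answer):

```java
// Hypothetical helper built on the same Runtime calls as above.
class MemoryReport {

    /** Approximate heap currently in use: allocated heap minus its free part. */
    static long usedBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        System.out.println("Used Memory: " + usedBytes() + " bytes");
    }
}
```

The value includes garbage that has not been collected yet, so a single reading says little; a figure that keeps climbing across many collections is what hints at a leak.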

The implementation of java.util.Vector.remove(int) from JDK 1.6 is:
public synchronized E remove(int index) {
    modCount++;
    if (index >= elementCount)
        throw new ArrayIndexOutOfBoundsException(index);
    Object oldValue = elementData[index];
    int numMoved = elementCount - index - 1;
    if (numMoved > 0)
        System.arraycopy(elementData, index+1, elementData, index, numMoved);
    elementData[--elementCount] = null; // Let gc do its work
    return (E)oldValue;
}
As you can see, if elements are removed from the vector, the vector will not keep a reference to them, and hence will not impede their garbage collection.
However, the vector itself (and the potentially large array backing it) might not be reclaimed.

Related

Does java compiler insert free when pointer is allocated and go out of scope in a block?

I am scratching my head trying to understand the point of the following code
Map<String, Set<MyOtherObj>> myMap = myapi.getMyMap();
final MyObj[] myObjList;
{
    final List<MyObj> list = new ArrayList<>(myMap.size());
    for (Entry<String, Set<MyOtherObj>> entry : myMap.entrySet()) {
        final int myCount = MyUtility.getCount(entry.getValue());
        if (myCount <= 0)
            continue;
        list.add(new MyObj(entry.getKey(), myCount));
    }
    if (list.isEmpty())
        return;
    myObjList = list.toArray(new MyObj[list.size()]);
}
Which can be rewritten as the following
Map<String, Set<MyOtherObj>> myMap = myapi.getMyMap();
final List<MyObj> list = new ArrayList<>(myMap.size());
for (Entry<String, Set<MyOtherObj>> entry : myMap.entrySet()) {
    final int myCount = MyUtility.getCount(entry.getValue());
    if (myCount <= 0)
        continue;
    list.add(new MyObj(entry.getKey(), myCount));
}
if (list.isEmpty())
    return;
The only reasons I can think of for putting the ArrayList in a block and then copying its contents into an array are:
The size of ArrayList is bigger than the size of list, so reassigning ArrayList to array save space
There is some sort of compiler magic or gc magic that deallocates and reclaim the memory use by ArrayList immediately after the block scope ends (eg. like rust), otherwise we are now sitting on up to 2 times amount of space until gc kicks in.
So my question is, does the first code sample make sense, is it more efficient?
This code currently handles 20k messages per second.

As stated in this answer:
Scope is a language concept that determines the validity of names. Whether an object can be garbage collected (and therefore finalized) depends on whether it is reachable.
So, no, the scope is not relevant to garbage collection, but for maintainable code, it’s recommended to limit the names to the smallest scope needed for their purpose. This, however, does not apply to your scenario, where a new name is introduced to represent the same thing that apparently still is needed.
You suggested the possible motivation
The size of ArrayList is bigger than the size of list, so reassigning ArrayList to array save space
but you can achieve the same by declaring the variable list as ArrayList<MyObj> rather than List<MyObj> and calling trimToSize() on it after populating it.
There’s another possible reason, the idea that subsequently using a plain array was more efficient than using the array encapsulated in an ArrayList. But, of course, the differences between these constructs, if any, rarely matter.
Speaking of esoteric optimizations, specifying an initial array size when calling toArray was believed to be an advantage, until someone measured and analyzed it and found that, e.g., myObjList = list.toArray(new MyObj[0]); would actually be more efficient in real life.
Anyway, we can’t look into the author’s mind, which is the reason why any deviation from straight-forward code should be documented.
Your alternative suggestion:
There is some sort of compiler magic or gc magic that deallocates and reclaim the memory use by ArrayList immediately after the block scope ends (eg. like rust), otherwise we are now sitting on up to 2 times amount of space until gc kicks in.
is missing the point. Any space optimization in Java is about minimizing the amount of memory occupied by objects still alive. It doesn’t matter whether unreachable objects have been identified as such, it’s already sufficient that they are unreachable, hence, potentially reclaimable. The garbage collector will run when there is an actual need for memory, i.e. to serve a new allocation request. Until then, it doesn’t matter whether the unused memory contains old objects or not.
So the code may be motivated by a space saving attempt and in that regard, it’s valid, even without an immediate freeing. As said, you could achieve the same in a simpler fashion by just calling trimToSize() on the ArrayList. But note that if the capacity does not happen to match the size, trimToSize()’s shrinking of the array doesn’t work differently behind the scenes, it implies creating a new array and letting the old one become subject to garbage collection.
But the fact that there’s no immediate freeing, and rarely a need for it, should allow the conclusion that space saving attempts like this only matter in practice when the resulting object is supposed to persist for a very long time. When the lifetime of the copy is shorter than the time to the next garbage collection, it didn’t save anything, and all that remains is the unnecessary creation of a copy. Since we can’t predict the time to the next garbage collection, we can only make a rough categorization of the object’s expected lifetime (long or not so long)…
The general approach is to assume that in most cases, the higher capacity of an ArrayList is not a problem and the performance gain matters more. That’s why this class maintains a higher capacity in the first place.
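The two space-saving variants discussed above can be sketched side by side. This is only an illustration: CompactCopyDemo, collect and toCompactArray are hypothetical names, and String stands in for the question's MyObj.

```java
import java.util.ArrayList;

// Hypothetical demo; String replaces the question's MyObj type.
class CompactCopyDemo {

    /** Variant 1: keep the ArrayList, but shrink its backing array to the actual size. */
    static ArrayList<String> collect(int expected, int kept) {
        ArrayList<String> list = new ArrayList<>(expected); // capacity sized like myMap.size()
        for (int i = 0; i < kept; i++) {
            list.add("item-" + i);                          // only some entries pass the filter
        }
        list.trimToSize(); // declared on ArrayList, hence the concrete variable type
        return list;
    }

    /** Variant 2: copy into a plain array that is exactly as long as the list. */
    static String[] toCompactArray(ArrayList<String> list) {
        return list.toArray(new String[0]);
    }

    public static void main(String[] args) {
        ArrayList<String> list = collect(100, 3);
        String[] array = toCompactArray(list);
        System.out.println(list.size() + " elements in list, " + array.length + " in array");
    }
}
```

Either way the oversized array only becomes unreachable; when it is actually reclaimed is still up to the garbage collector.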

No, it is done for the same reason as empty lines are added to the code.
The variables in the block are scoped to that block, and can no longer be used after the block. So one does not need to pay attention to those block variables.
So this is more readable:
A a;
{ B b; C c; ... }
...
Than:
A a;
B b;
C c;
...
...
It is an attempt to make the structure of the code more readable. For instance, the first version above can be read as "a declaration of A a;" followed by a block that probably fills a.
Lifetime analysis in the JVM works fine without such blocks, just as there is absolutely no need to set variables to null at the end of their usage.
Sometimes blocks are also abused to repeat code with the same local variable names:
A a1;
{ B b; C c; ... a1 ... }
A a2;
{ B b; C c; ... a2 ... }
A a3;
{ B b; C c; ... a3 ... }
Needless to say, this is the opposite of good style.

What is an Obsolete reference in java

I'm reading the Effective Java book, and it says that eliminating obsolete references is one of the best ways to avoid memory leaks. In the program below, the line elements[size] = null; eliminates an obsolete reference.
My problem here what is the advantage of doing elements[size] = null;. Any other program can use that freed memory location? Or is it garbage collected?
As I understand it, the array has already been allocated memory for its full size. Even if we do elements[size] = null;, nobody can use that freed memory location until we do elements = null;. Could someone please tell me what the advantage of doing elements[size] = null; is here?
public Object pop() {
    if (size == 0)
        throw new EmptyStackException();
    Object result = elements[--size];
    elements[size] = null; // Eliminate obsolete reference
    return result;
}

My problem here what is the advantage of doing elements[size] = null;.
Here, obsolete references are object references that the program no longer needs.
You want unneeded objects to be freed so that your program consumes only the memory it needs. Generally this is done to keep the current application working well.
Any other program can use that freed memory location?
Theoretically yes, but it also depends on the JVM memory options used. You generally don't need to focus on it.
elements[size] = null and elements = null; do not have the same intention or the same effect at all.
In the context of the book, elements is an internal field of a class.
The idea is that some elements of the array may be stale and no longer required after some removal operations.
The first one (elements[size] = null) makes the object referenced by the array element at index size eligible for GC, provided no other object references it.
But the second one (elements = null) does much more. It makes all elements of the array eligible for GC, provided no other objects reference them.

There are two cases we have to distinguish:
The outer object is "torn down" somehow, so it closes any open resources and also "voluntarily" releases all objects it had referred to. This is simply the explicit way of telling the JVM that the corresponding reference is "gone". You make it easier for the GC to understand that the corresponding object is eligible for garbage collection. Of course, that only has an effect if there are no other references to the same object elsewhere. Beyond that, doing so isn't really required: the JVM/GC must of course be able to detect any eligible object all by itself.
But nullifying makes sense for references that exist for longer periods of time and point to different objects over that time span. Think of a container, such as the stack class in the underlying example. A container must forget about the objects it referred to when they get "removed". Otherwise you create a memory leak!

What happens here?
Let's imagine elements is a 20-element Object array (elements = new Object[20];) that has been filled with 18 BigInteger instances, the remaining two places being null.
So the heap now contains 18 BigInteger instances and a 20-element Object[] array. The garbage collector won't reclaim any of these instances, and that's okay as you'll most probably use them later (via the pop() method).
Now you call the pop() method to get the BigInteger most recently added to the array. Let's assume you just want to print it and then forget it, so in your overall application that number isn't needed any more, and you'd expect the garbage collector to reclaim it. But that won't happen unless you do the null assignment
elements[size] = null; // Eliminate obsolete reference
Why that?
As long as you store the reference to an object in some accessible place, the garbage collector believes that you'll still need the object later.
As long as elements[17] still refers to the BigInteger, it can potentially be accessed by your program, so it can't be reclaimed. Once elements[17] is set to null, the BigInteger that used to be there isn't accessible via elements any more and can be reclaimed by the garbage collector (if no other part of your code still uses it).
Conclusion
It's only worth thinking about "obsolete references" if you have a long-living storage structure that contains fat objects, and you can tell at some point in time that you won't need one of the stored objects any more. As you won't need this object any more, you can now re-assign the storage with null, and then the GC no longer believes you still need the object and is able to reclaim the storage space.
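For reference, here is a self-contained sketch of such a stack. The pop() body follows the snippet from the question; push(), the initial capacity of 16 and the slotIsCleared() helper are assumptions added only so the effect of the null assignment can be observed.

```java
import java.util.Arrays;
import java.util.EmptyStackException;

// Hypothetical minimal stack; only pop() mirrors the code quoted in the question.
class SimpleStack {
    private Object[] elements = new Object[16];
    private int size = 0;

    void push(Object e) {
        if (size == elements.length) {
            elements = Arrays.copyOf(elements, 2 * size); // grow when full
        }
        elements[size++] = e;
    }

    Object pop() {
        if (size == 0)
            throw new EmptyStackException();
        Object result = elements[--size];
        elements[size] = null; // Eliminate obsolete reference
        return result;
    }

    /** Illustration only: reports whether the array slot at `index` holds null. */
    boolean slotIsCleared(int index) {
        return elements[index] == null;
    }

    public static void main(String[] args) {
        SimpleStack stack = new SimpleStack();
        stack.push("a");
        stack.push("b");
        stack.pop();
        System.out.println("slot 1 cleared after pop: " + stack.slotIsCleared(1));
    }
}
```

Without the null assignment, slot 1 would keep pointing at the popped object for as long as the stack itself stays reachable, even though the stack considers that slot unused.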

WeakReference of a Collection in java

Backstory
In a library that I maintain we have an internal map keeping track of our cache.
Users of the library are interested in having list access to this map, however we can only provide this by copying its contents (thread-safety reasons).
The idea is to cache this list when it is first accessed without having much memory overhead on a second access.
To illustrate:
List<Bob> list = cache.asList();
List<Bob> otherList = cache.asList(); // use from cache, if still available
The problem is, we don't want to keep this list forever if it's not needed anymore. Since Java uses GC, we thought it would be appropriate to use a WeakReference for this, so that the list can be reused as long as it has not been collected.
Question
If I have a WeakReference<List<Bob>> stored inside my class, what happens if one of the elements becomes weakly reachable (which implies the list is weakly reachable)? Is it possible that the GC decides to just collect the element inside the list or would it look for all other weakly reachable objects referencing it and also collect them, in this case the list?
The problem would be: if the GC collected an element of the list and we then tried to access the list again (if that's even possible), what would happen?
Clarifications
I'm not interested in the reachability of the list, I know that the list is inside the WeakReference and that the elements are irrelevant to its reachability. I care about a specific state, in which both the list and an element of the list are weakly reachable and whether it is possible that the GC only collects the element but not the list itself. What exactly does the GC do in this specific scenario?

As long as the List itself is not weakly reachable its elements will not be either. (Assuming the list implementation itself does not use weak references or similar)
So there is no problem with having the list cached with a weak reference because it would either be garbage collected completely or not at all.
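A minimal sketch of the caching idea from the question, under some loud assumptions: SnapshotCache and its methods are invented names, the map holds Strings rather than Bob, and plain synchronized methods stand in for whatever thread-safety scheme the library really uses.

```java
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical cache class; not the library described in the question.
class SnapshotCache {
    private final Map<String, String> entries = new HashMap<>();
    private WeakReference<List<String>> snapshot = new WeakReference<List<String>>(null);

    synchronized void put(String key, String value) {
        entries.put(key, value);
        snapshot.clear(); // the cached copy is stale now
    }

    synchronized List<String> asList() {
        List<String> cached = snapshot.get(); // strong reference from here on
        if (cached == null) {                 // never built, invalidated, or collected
            cached = Collections.unmodifiableList(new ArrayList<>(entries.values()));
            snapshot = new WeakReference<>(cached);
        }
        return cached;
    }

    /** Illustration only: two calls return the same list while a caller still holds it. */
    static boolean reusesSnapshotWhileReferenced() {
        SnapshotCache cache = new SnapshotCache();
        cache.put("a", "1");
        List<String> first = cache.asList();
        List<String> second = cache.asList();
        return first == second && first.size() == 1;
    }

    public static void main(String[] args) {
        System.out.println("snapshot reused: " + reusesSnapshotWhileReferenced());
    }
}
```

The important detail is reading snapshot.get() into a local variable once: after that the list is strongly reachable, so neither it nor its elements can disappear while the caller is using it.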

In the given case (WeakReference<List<Something>>), only the following scenario is possible:
import java.lang.ref.WeakReference;
import java.util.Arrays;
import java.util.List;

public class Test {
    private WeakReference<List<String>> listWeakReference;

    public Test(final WeakReference<List<String>> listWeakReference) {
        this.listWeakReference = listWeakReference;
    }

    public static void main(String[] args) {
        List<String> testList = Arrays.asList("a", "b", "c");
        Test test = new Test(new WeakReference<>(testList));
        // Initial check
        System.out.println(test.listWeakReference.get());
        // Call gc and check
        System.gc();
        System.out.println(test.listWeakReference.get());
        // Remove reference and call gc
        testList = null;
        System.gc();
        System.out.println(test.listWeakReference.get());
    }
}

Firstly, SoftReference is better for caches, and even that isn't very good.
A WeakReference may be cleared as soon as its referent becomes weakly reachable. However, that might not happen until some time into execution, i.e. it doesn't happen during extensive testing, but it does in production. Fun times. NetBeans used to do this in its caching of files. Of course, the rest of the code was expecting the caching, so it grabbed and released references with incredible frequency. After some time using the application, it would suddenly hammer file I/O and become unusable.
For best performance you need to explicitly estimate how much memory the process is using and release as necessary. Not easy.
Back to the question. Collection of the contents of a WeakReference (and SoftReference) is a two-phase operation. The first phase just clears the Reference (and queues it, if you are using a queue). The associated memory is not collected. The memory may be resurrected through a finaliser. The WeakReference is forever cleared and queued; it does not reset. Only when an object is completely unreachable can the associated memory be collected as a separate phase.
Fear not, Java is memory-safe (bugs excepted).

Objects and pointers in Java

If you have that:
while (true) { byte[] fillbuffer = new byte[400]; }
What will happen? Will create thousand of objects and each time just link the pointer of the fillbuffer with the new object's pointer? or something else?

Yes, then the old objects will be garbage collected as needed since they are no longer referenced. You can roughly see this yourself if you hook up VisualVM or similar and watch the memory usage (consider adding a sleep).
As pointed out, to be technical the array is the only object. You are allocating 400 bytes, and one object, whose main job is to know where the 400 bytes are, each loop.
I'm not aware of any optimizations that are done to avoid the allocations, but in general compilers/virtual machines in any language have a lot of license to take shortcuts. "Logically" my answer explains what happens here, but YMMV (specifically, YMMV depending on how much of the JVM spec you have read.)

If you are lucky, the Hotspot JVM will succeed at escape analysis and notice that the array never escapes the loop. It may then be optimized away altogether.
Escape analysis is a technique by which the Java Hotspot Server Compiler can analyze the scope of a new object's uses and decide whether to allocate it on the Java heap.
But at least for the first 1000 or so iterations - before optimization kicks in - it will likely allocate these, and eventually garbage collect them.
Congratulations, you have written an infinite loop.

What will happen?
Upon each iteration of the while loop, a new array of 400 bytes is allocated on the heap.
Will create thousand of objects and each time just link the pointer of
the fillbuffer with the new object's pointer?
Yes, a new array object is created each time. Since the variable fillbuffer is in scope only within the body of the while loop, the referenced byte array becomes immediately available for garbage collection upon completion of each loop iteration.
Edit: Note
If you were to define fillbuffer outside the loop, then its value would not be immediately available for garbage collection upon completion of each loop iteration, but the old value would become available for garbage collection as soon as the variable was assigned a new value. I.e.
byte[] fillbuffer;
while(true)
    fillbuffer = new byte[400];
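A bounded variant of the loop makes the "new object each time" behaviour observable without running forever. This is only a sketch; AllocationLoopDemo and countDistinctAllocations are hypothetical names.

```java
// Hypothetical bounded version of the loop from the question.
class AllocationLoopDemo {

    /** Runs the allocation `iterations` times and counts how often a new, distinct array appears. */
    static int countDistinctAllocations(int iterations) {
        int distinct = 0;
        byte[] previous = null;
        for (int i = 0; i < iterations; i++) {
            byte[] fillbuffer = new byte[400]; // a fresh array object on every pass
            if (fillbuffer != previous) {
                distinct++;                    // reference differs from the previous pass
            }
            previous = fillbuffer; // the array from the pass before is now unreferenced
        }
        return distinct;
    }

    public static void main(String[] args) {
        System.out.println("distinct arrays: " + countDistinctAllocations(1000));
    }
}
```

At any moment at most two of the arrays are reachable here; all earlier ones are garbage that the collector may reclaim whenever it runs.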

Java Collection#clear reclaim memory

With the following function:
Collection#clear
how can I attempt to reclaim memory that could be freed from an invocation? Code sample:
public class Foo
{
    private static Collection<Bar> bars;
    public static void main(String[] args){
        bars = new ArrayList<Bar>();
        for(int i = 0; i < 100000;i++)
        {
            bars.add(new Bar());
        }
        bars.clear();
        //how to get memory back here
    }
}
EDIT
What I am looking for is similar to how ArrayList.remove reclaims memory by copying the new smaller array.

It is more efficient to only reclaim memory when you need to. In this case it is much simpler and faster to let the GC do it asynchronously when there is a need to. You can give the JVM a hint using System.gc(), but this is likely to be slower and to complicate your program.
how ArrayList.remove reclaims memory by copying the new smaller array.
It doesn't do this. It never shrinks the array, nor would you need to.
If you really need to make the collection smaller, which I seriously doubt, you can create a new ArrayList which has a copy of the elements you want to keep.

bars = null;
would be the best. clear doesn't guarantee to release any memory, only to reset the logical contents to "empty".
In fact, bars = null; doesn't guarantee that memory will be immediately released. However, it would make the object previously pointed to by bars and all its dependents "ready for garbage collection" ("finalization", really, but let's keep this simple). If the JVM finds itself needing memory, it will collect these objects (another simplification here: this depends on the exact garbage collection algorithm the JVM is configured to use).

You can't.
At some point after there are no more references to the objects, the GC will collect them for you.
EDIT: To force the ArrayList to release its reference to the giant empty array, call trimToSize()
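A short sketch of that suggestion, assuming the field is declared as ArrayList rather than Collection (trimToSize() is not part of the Collection interface). ClearAndTrimDemo and fillClearTrim are made-up names, and Object stands in for the question's Bar.

```java
import java.util.ArrayList;

// Hypothetical variant of the question's Foo class; Object replaces Bar.
class ClearAndTrimDemo {
    // Declared as ArrayList so that trimToSize() is available.
    private static final ArrayList<Object> bars = new ArrayList<>();

    static int fillClearTrim(int n) {
        for (int i = 0; i < n; i++) {
            bars.add(new Object());
        }
        bars.clear();      // drops the references to the elements, keeps the large backing array
        bars.trimToSize(); // replaces the backing array with one sized to the (now empty) list
        return bars.size();
    }

    public static void main(String[] args) {
        System.out.println("size after clear and trim: " + fillClearTrim(100_000));
    }
}
```

After both calls the elements and the old backing array are unreachable, but they are still only reclaimed when the garbage collector next runs.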

You can't force memory reclamation, that will happen when garbage collection occurs.
If you use clear() you will clear the references to objects that were contained in the collection. If there are no other references to those objects, then they will be reclaimed next time GC is run.
The collection itself (which just contains references, not the objects referred to) will not be resized by clear(). One way to get back the storage used by the collection is to set the reference bars to null so that the collection will eventually be reclaimed; for an ArrayList, trimToSize() also shrinks the backing array.
