I have some doubts about the garbage collector and how I can clear memory in Java.
I have a program that writes a binary search tree to a file. I wrote one function that inserts an element and another that removes one. In the remove method, I put the removed elements in an area of the file that I call "empty blocks" (which is a stack). In C, free() releases memory; in Java there is the garbage collector, which runs at Java's discretion. How can I free the memory of these blocks in the file (the removed elements)?
Is there a way to free the memory of an element on file in Java (the element is of type int)?
I put the elements that I remove in a space in the file that I call "empty blocks" (which is a stack)
Whatever data structure you use to track your data will be in an object of some class.
When that object no longer has any references pointing to it, that object becomes a candidate for garbage collection. No need for you to do anything except not hang on to any reference longer than needed.
The garbage collector may clear the unneeded object immediately, or may clear it later. Either way, we as Java programmers do not care. Eventually the memory will be freed up.
If the reference variable pointing to an object is a local variable, that reference is dropped when the local variable goes out of scope.
If the reference variable is a member field on another object, the
object in question will be released when the other object becomes
garbage.
If the reference variable is static, you should assign null explicitly to let the referenced object become garbage. In Java, static variables stay in memory throughout the execution run of your app.
In the first two cases, you can release the object sooner by setting the reference variable to null. Generally this is not needed, but doing so may be wise if a large amount of memory is at stake. Ditto if other precious resources are being needlessly held.
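The three cases above (local variable, member field, static field) can be sketched as follows. This is only an illustration: the class name ReferenceLifetimes and all of its members are hypothetical, and the code shows when an object becomes eligible for collection, not when the collector actually reclaims it.

```java
/** Hypothetical class showing the three kinds of reference holders described above. */
final class ReferenceLifetimes {
    // Static field: stays reachable for the whole run unless it is cleared.
    private static int[] cache;

    // Member field: reachable for as long as the owning instance is reachable.
    private final int[] buffer = new int[16];

    static void loadCache() {
        cache = new int[1_000_000];
    }

    /** Dropping the static reference makes the array eligible for collection. */
    static void releaseCache() {
        cache = null;
    }

    static boolean cacheLoaded() {
        return cache != null;
    }

    int bufferLength() {
        return buffer.length;
    }

    static int sumLocal() {
        int[] scratch = {1, 2, 3}; // local: the reference goes away when the method returns
        int sum = 0;
        for (int v : scratch) {
            sum += v;
        }
        return sum;
    }

    public static void main(String[] args) {
        loadCache();
        System.out.println("cache loaded: " + cacheLoaded());
        releaseCache();
        System.out.println("cache loaded: " + cacheLoaded());
        System.out.println("local sum: " + sumLocal());
    }
}
```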
Is there a way to free the memory of an element on file in Java (the element is of type int)?
Your question is really hard to understand, but I think you are asking about freeing up disk blocks in a data structure stored in a file [1].
There is no Java support for this. If you write a data structure to a file, the problem of reclaiming space in the file is yours, not Java's. Indeed, I don't think that a typical OS will allow you to (literally) free disk blocks in the middle of a file [2].
There may be 3rd-party libraries that support this kind of thing, but I don't have the background knowledge to make a recommendation.
If I have correctly understood what you are asking, your discussion of C's malloc / free versus Java's garbage collection is only peripherally relevant. Both of these schemes are for managing memory, not space in a random access file. Now you could conceivably implement similar schemes for managing space in a file, but you would need to take account of the different characteristics of memory and disk I/O. (Even if you are mapping the file into memory.)
1 - If you are actually talking about managing objects in heap memory in Java, your best bet is to just let the garbage collector deal with it; see Basil's answer. There are also 3rd-party libraries for storing objects in off-heap memory, but it is unclear if they would help you. I understand that such libraries typically leave it to the programmer to decide when to free an object. (They are not garbage collected.)
2 - It would be a bad idea. If the disk blocks thus freed were then used in a different file, you would get a lot of file fragmentation. That would be bad for file I/O performance.
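If you do decide to manage space in the file yourself, one possible approach is to reuse removed slots instead of trying to free them. The sketch below is an assumption about how that might look, not the asker's actual code: SlotFile and its methods are hypothetical names, every element is a fixed-size int slot, and the stack of empty slots is kept in memory here (the question keeps it in the file itself). It does not guard against removing the same slot twice.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical sketch: fixed-size int slots in a file, with removed slots reused. */
final class SlotFile implements AutoCloseable {
    private static final int SLOT_BYTES = Integer.BYTES;
    private final RandomAccessFile file;
    // Stack of slot indexes that were removed and may be overwritten.
    private final Deque<Long> freeSlots = new ArrayDeque<>();

    SlotFile(Path path) throws IOException {
        this.file = new RandomAccessFile(path.toFile(), "rw");
    }

    /** Stores a value, reusing a removed slot if one exists; returns the slot index. */
    long insert(int value) throws IOException {
        long slot = freeSlots.isEmpty() ? file.length() / SLOT_BYTES : freeSlots.pop();
        file.seek(slot * SLOT_BYTES);
        file.writeInt(value);
        return slot;
    }

    /** Marks a slot as reusable. Its bytes stay in the file until overwritten. */
    void remove(long slot) {
        freeSlots.push(slot);
    }

    int read(long slot) throws IOException {
        file.seek(slot * SLOT_BYTES);
        return file.readInt();
    }

    long lengthInBytes() throws IOException {
        return file.length();
    }

    @Override
    public void close() throws IOException {
        file.close();
    }

    /** Runs a small scenario; returns {reusedSlot, fileLengthInBytes, valueAtReusedSlot}. */
    static long[] demo() {
        try {
            Path path = Files.createTempFile("slots", ".bin");
            try (SlotFile f = new SlotFile(path)) {
                f.insert(10); // slot 0
                f.insert(20); // slot 1
                f.insert(30); // slot 2
                f.remove(1);  // slot 1 goes on the empty-slot stack
                long reused = f.insert(40);
                return new long[] { reused, f.lengthInBytes(), f.read(reused) };
            } finally {
                Files.deleteIfExists(path);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        long[] r = demo();
        System.out.println("reused slot " + r[0] + ", file length " + r[1] + " bytes, value " + r[2]);
    }
}
```

In this scenario the file does not grow when the fourth value is inserted, because it overwrites the removed slot. Space at the very end of a file can additionally be given back with RandomAccessFile.setLength, but that does not help for holes in the middle.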
Related
Is there a way to free memory in Java, similar to C's free() function? Or is setting the object to null and relying on GC the only option?
Java uses managed memory, so the only way you can allocate memory is by using the new operator, and the only way you can deallocate memory is by relying on the garbage collector.
This memory management whitepaper (PDF) may help explain what's going on.
You can also call System.gc() to suggest that the garbage collector run immediately. However, the Java Runtime makes the final decision, not your code.
According to the Java documentation,
Calling the gc method suggests that
the Java Virtual Machine expend effort
toward recycling unused objects in
order to make the memory they
currently occupy available for quick
reuse. When control returns from the
method call, the Java Virtual Machine
has made a best effort to reclaim
space from all discarded objects.
No one seems to have mentioned explicitly setting object references to null, which is a legitimate technique to "freeing" memory you may want to consider.
For example, say you'd declared a List<String> at the beginning of a method which grew in size to be very large, but was only required until half-way through the method. You could at this point set the List reference to null to allow the garbage collector to potentially reclaim this object before the method completes (and the reference falls out of scope anyway).
Note that I rarely use this technique in reality but it's worth considering when dealing with very large data structures.
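A minimal sketch of that idea follows. The class and method names are made up for illustration, and whether the list is actually reclaimed before the method returns is up to the JVM; the code only makes the list eligible earlier.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical example of dropping a large list half-way through a method. */
final class ReleaseEarly {
    static long process(int n) {
        List<String> lines = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            lines.add("line " + i);
        }

        // First half of the method: the only part that needs the list.
        long totalChars = 0;
        for (int i = 0; i < lines.size(); i++) {
            totalChars += lines.get(i).length();
        }

        // The list is no longer needed, so drop the only reference to it.
        lines = null;

        // Second half of the method: works from the summary value only.
        return totalChars * 2;
    }

    public static void main(String[] args) {
        System.out.println(process(100_000));
    }
}
```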
System.gc();
Runs the garbage collector.
Calling the gc method suggests that the Java Virtual Machine expend effort toward recycling unused objects in order to make the memory they currently occupy available for quick reuse. When control returns from the method call, the Java Virtual Machine has made a best effort to reclaim space from all discarded objects.
Not recommended.
Edit: I wrote the original response in 2009. It's now 2015.
Garbage collectors have gotten steadily better in the ~20 years Java's been around. At this point, if you're manually calling the garbage collector, you may want to consider other approaches:
If you're forcing GC on a limited number of machines, it may be worth having a load balancer point away from the current machine, waiting for it to finish serving to connected clients, timeout after some period for hanging connections, and then just hard-restart the JVM. This is a terrible solution, but if you're looking at System.gc(), forced-restarts may be a possible stopgap.
Consider using a different garbage collector. For example, the (new in the last six years) G1 collector is a low-pause model; it uses more CPU overall, but does its best to never force a hard-stop on execution. Since server CPUs now almost all have multiple cores, this is A Really Good Tradeoff to have available.
Look at your flags tuning memory use. Especially in newer versions of Java, if you don't have that many long-term running objects, consider bumping up the size of newgen in the heap. newgen (young) is where new objects are allocated. For a webserver, everything created for a request is put here, and if this space is too small, Java will spend extra time upgrading the objects to longer-lived memory, where they're more expensive to kill. (If newgen is slightly too small, you're going to pay for it.) For example, in G1:
-XX:G1NewSizePercent (defaults to 5; probably doesn't matter.)
-XX:G1MaxNewSizePercent (defaults to 60; probably raise this.)
Consider telling the garbage collector you're not okay with a longer pause. This will cause more-frequent GC runs, to allow the system to keep the rest of its constraints. In G1:
-XX:MaxGCPauseMillis (defaults to 200.)
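Before tuning, it can help to confirm which collector and flags a running JVM actually uses. The sketch below relies on the standard java.lang.management API; the class name GcInfo is hypothetical, and the collector names it prints vary by JVM vendor, version, and flags.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that reports the collectors and startup flags of this JVM. */
final class GcInfo {
    /** Names of the garbage collectors registered in this JVM. */
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    /** Arguments passed to the JVM at startup, such as -XX: options. */
    static List<String> jvmFlags() {
        return ManagementFactory.getRuntimeMXBean().getInputArguments();
    }

    public static void main(String[] args) {
        System.out.println("collectors: " + collectorNames());
        System.out.println("JVM flags:  " + jvmFlags());
    }
}
```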
"I personally rely on nulling variables as a placeholder for future proper deletion. For example, I take the time to nullify all elements of an array before actually deleting (making null) the array itself."
This is unnecessary. The way the Java GC works is it finds objects that have no reference to them, so if I have an Object x with a reference (=variable) a that points to it, the GC won't delete it, because there is a reference to that object:
a -> x
If you null a then this happens:
a -> null
x
So now x doesn't have a reference pointing to it and will be deleted. The same thing happens when you set a to reference to a different object than x.
So if you have an array arr that references objects x, y and z, and a variable a that references the array, it looks like this:
a -> arr -> x
         -> y
         -> z
If you null a then this happens:
a -> null
arr -> x
    -> y
    -> z
So the GC finds arr as having no reference set to it and deletes it, which gives you this structure:
a -> null
x
y
z
Now the GC finds x, y and z and deletes them as well. Nulling each reference in the array won't make anything better; it will just use up CPU time and space in the code. (That said, it won't hurt beyond that. The GC will still be able to perform the way it should.)
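One way to observe this is to watch an array element through a WeakReference while dropping only the array reference. This is a sketch under assumptions: ArrayDropDemo is a hypothetical name, and because System.gc() is only a request, the second result is typically true on common JVMs but is not guaranteed.

```java
import java.lang.ref.WeakReference;

/** Hypothetical demo: dropping the array reference is enough, elements are not nulled. */
final class ArrayDropDemo {
    /**
     * Returns {elementAliveWhileArrayHeld, elementClearedAfterArrayDropped}.
     * The second value depends on the collector actually running.
     */
    static boolean[] run() {
        Object[] arr = { new Object(), new Object(), new Object() };
        // A weak reference lets us watch an element without keeping it alive.
        WeakReference<Object> watched = new WeakReference<>(arr[0]);

        // arr is still in use here, so the element must still be reachable.
        boolean aliveWhileHeld = watched.get() != null && arr.length == 3;

        // Drop only the array reference; the elements are not nulled one by one.
        arr = null;

        // System.gc() is only a request, so retry a few times and then give up.
        for (int i = 0; i < 20 && watched.get() != null; i++) {
            System.gc();
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return new boolean[] { aliveWhileHeld, watched.get() == null };
    }

    public static void main(String[] args) {
        boolean[] r = run();
        System.out.println("alive while array held: " + r[0]);
        System.out.println("cleared after dropping array: " + r[1]);
    }
}
```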
To extend upon the answer and comment by Yiannis Xanthopoulos and Hot Licks (sorry, I cannot comment yet!), you can set VM options like this example:
-XX:+UseG1GC -XX:MinHeapFreeRatio=15 -XX:MaxHeapFreeRatio=30
In my jdk 7 this will then release unused VM memory if more than 30% of the heap becomes free after GC when the VM is idle. You will probably need to tune these parameters.
While I didn't see it emphasized in the link below, note that some garbage collectors may not obey these parameters and by default java may pick one of these for you, should you happen to have more than one core (hence the UseG1GC argument above).
VM arguments
Update: For Java 1.8.0_73 I have seen the JVM occasionally release small amounts with the default settings. It appears to do so only if ~70% of the heap is unused, though. I don't know whether it would release more aggressively if the OS were low on physical memory.
A valid reason for wanting to free memory from any program (Java or not) is to make more memory available to other programs at the operating-system level. If my Java application is using 250MB I may want to force it down to 1MB and make the 249MB available to other apps.
I have done experimentation on this.
It's true that System.gc() only suggests running the garbage collector.
But in my experiments, calling System.gc() after setting all references to null improved performance and memory occupation.
If you really want to allocate and free a block of memory you can do this with direct ByteBuffers. There is even a non-portable way to free the memory.
However, as has been suggested, just because you have to free memory in C doesn't mean it is a good idea to have to do this.
If you feel you really have a good use case for free(), please include it in the question so we can see what you are trying to do; it is quite likely there is a better way.
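For reference, allocating a block outside the Java heap with a direct ByteBuffer looks roughly like this. DirectBlock is a hypothetical name, and the sketch shows only allocation and use; the native memory is released after the buffer object itself is garbage collected, and the non-portable explicit release is deliberately left out.

```java
import java.nio.ByteBuffer;

/** Hypothetical example of using a direct (off-heap) buffer as a raw memory block. */
final class DirectBlock {
    /** Writes an int into a freshly allocated direct buffer and reads it back. */
    static int roundTrip(int value) {
        ByteBuffer block = ByteBuffer.allocateDirect(1024); // 1 KiB outside the Java heap
        block.putInt(0, value);                             // absolute write at offset 0
        return block.getInt(0);                             // absolute read at offset 0
    }

    static boolean isOffHeap() {
        return ByteBuffer.allocateDirect(16).isDirect();
    }

    public static void main(String[] args) {
        System.out.println("round trip: " + roundTrip(42));
        System.out.println("direct: " + isOffHeap());
    }
}
```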
Entirely from javacoffeebreak.com/faq/faq0012.html
A low priority thread takes care of garbage collection automatically
for the user. During idle time, the thread may be called upon, and it
can begin to free memory previously allocated to an object in Java.
But don't worry - it won't delete your objects on you!
When there are no references to an object, it becomes fair game for
the garbage collector. Rather than calling some routine (like free in
C++), you simply assign all references to the object to null, or
assign a new class to the reference.
Example :
public static void main(String[] args)
{
    // Instantiate a large memory using class
    MyLargeMemoryUsingClass myClass = new MyLargeMemoryUsingClass(8192);
    // Do some work
    for ( .............. )
    {
        // Do some processing on myClass
    }
    // Clear reference to myClass
    myClass = null;
    // Continue processing, safe in the knowledge
    // that the garbage collector will reclaim myClass
}
If your code is about to request a large amount of memory, you may
want to request the garbage collector begin reclaiming space, rather
than allowing it to do so as a low-priority thread. To do this, add
the following to your code
System.gc();
The garbage collector will attempt to reclaim free space, and your
application can continue executing, with as much memory reclaimed as
possible (memory fragmentation issues may apply on certain platforms).
In my case, since my Java code is meant to be ported to other languages in the near future (Mainly C++), I at least want to pay lip service to freeing memory properly so it helps the porting process later on.
I personally rely on nulling variables as a placeholder for future proper deletion. For example, I take the time to nullify all elements of an array before actually deleting (making null) the array itself.
But my case is very particular, and I know I'm taking performance hits when doing this.
"For example, say you'd declared a List at the beginning of a
method which grew in size to be very large, but was only required
until half-way through the method. You could at this point set the
List reference to null to allow the garbage collector to potentially
reclaim this object before the method completes (and the reference
falls out of scope anyway)."
This is correct, and it generalizes. A List cannot hold primitives, only references (an int is boxed to an Integer), so the real question is what happens to the elements. Setting the List reference to null makes the list unreachable, and any elements that were reachable only through the list become unreachable with it. Java's collectors trace reachability from root references rather than requiring each element to be dereferenced individually, so those elements are eligible for collection too. Elements that are still referenced from somewhere else stay alive, as they should.
Although Java provides automatic garbage collection, sometimes you will want to know how large the heap is and how much of it is left. You can check this programmatically. Obtain the runtime with Runtime r = Runtime.getRuntime(); (Runtime lives in java.lang, so no import is needed), read the free memory with long mem1 = r.freeMemory();, request a collection with r.gc();, and then call freeMemory() again to compare.
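Put together, those calls might look like the sketch below. MemoryReport is a hypothetical name; the Runtime methods are standard, but the numbers printed depend entirely on the JVM and its settings, and r.gc() is a request rather than a guarantee.

```java
/** Hypothetical helper that reports heap figures before and after requesting a GC. */
final class MemoryReport {
    /** Returns {freeBytes, totalBytes, maxBytes} as reported by the runtime. */
    static long[] snapshot() {
        Runtime r = Runtime.getRuntime();
        return new long[] { r.freeMemory(), r.totalMemory(), r.maxMemory() };
    }

    public static void main(String[] args) {
        long[] before = snapshot();
        Runtime.getRuntime().gc(); // same effect as System.gc(): a request, not a command
        long[] after = snapshot();
        System.out.println("free before gc: " + before[0] + " bytes");
        System.out.println("free after gc:  " + after[0] + " bytes");
        System.out.println("current heap:   " + after[1] + " bytes, max " + after[2] + " bytes");
    }
}
```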
Recommendation from JAVA is to assign to null
From https://docs.oracle.com/cd/E19159-01/819-3681/abebi/index.html
Explicitly assigning a null value to variables that are no longer needed helps the garbage collector to identify the parts of memory that can be safely reclaimed. Although Java provides memory management, it does not prevent memory leaks or using excessive amounts of memory.
An application may induce memory leaks by not releasing object references. Doing so prevents the Java garbage collector from reclaiming those objects, and results in increasing amounts of memory being used. Explicitly nullifying references to variables after their use allows the garbage collector to reclaim memory.
One way to detect memory leaks is to employ profiling tools and take memory snapshots after each transaction. A leak-free application in steady state will show a steady active heap memory after garbage collections.
Are there JVMs out there, that create Objects on the stack?
Or JVMs that do not interact with Java Garbage Collection via Reference Counters etc?
Assuming we have a temporary Object created in a method.
And this Object's reference never gets passed/stored/accessed outside the method.
It is just used internally.
When following the classic approach of allocating objects (on the heap, along with reference counters), the following steps would have to be taken care of:
1. Find a spot in the Heap that is large enough to hold the Object
2. Allocate the space
3. Update reference pointer
4. Register Object with garbage collection
5. [... object gets used, eventually discarded ...]
6. Identify for Garbage Collection
7. Remove from Heap
8. Unregister from GC
So if now a VM created Objects on the stack, the steps 1,3,4,6,7,8 would not be necessary, and step 2 and its 7ish counterpart would be easy stack management.
So are there JVMs that optimize this?
Or any hybrid systems, like allocating the Object in Heap, but not touching the normal GC, and instead directly removing the Object at the end of its scope?
Are there implementations with multiple Heaps (one GC-supervised and the other stack-supervised)?
Kind of. There is a project called Valhalla that aims to bring value types to Java. It can already be downloaded and used, but it is NOT ready for production use (and when it is ready, it will probably just be merged into one of the official Java releases).
You can download the early access (EA) release from https://jdk.java.net/valhalla/ and the page about the feature itself is https://openjdk.java.net/jeps/169
Additional notes:
Java does not use reference counting. The GC works by starting from root objects that are definitely in use, such as objects referenced from currently executing methods, then finds every other object reachable from those roots, and removes all the rest.
Also, the JIT performs escape analysis and can remove the need to allocate an object at all; instead it just uses the stack to store the data that would normally be stored in that object. (Note that this is NOT stack allocation, as the object is not even created.) Thanks to inlining it can also do that across methods, but you can't control it or have any guarantee that it will happen.
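As an illustration, the temporary object in the sketch below never leaves the method, so it is the kind of allocation escape analysis may be able to eliminate. EscapeDemo and Point are hypothetical names, and nothing in the code forces or verifies the optimization; whether it happens is decided by the JIT at runtime.

```java
/** Hypothetical example of a temporary object that never escapes its method. */
final class EscapeDemo {
    private static final class Point {
        final int x;
        final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    static long sumOfSquares(int n) {
        long total = 0;
        for (int i = 0; i < n; i++) {
            // p is never stored in a field, passed elsewhere, or returned.
            Point p = new Point(i, i + 1);
            total += (long) p.x * p.x + (long) p.y * p.y;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(1_000_000));
    }
}
```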
Quoting Wikipedia 'A heap is a useful data structure when you need to remove the object with the highest (or lowest) priority'.
I am familiar with what a heap is and the kind of problems I can solve with them, but I was wondering why this data structure is the one used for the allocation of Objects in Java? Also, what determines the priority of an Object?
The quoted text is referring to a kind of data structure called a heap.
The word heap is also used for a form of dynamic memory management.
This is a case where one IT English word has taken on two different and independent meanings. (This is a fairly common phenomenon in normal English ...)
I was wondering why this data structure is the one used for the allocation of Objects in Java?
Simply, it isn't. A dynamic memory heap (such as the Java heap) is not organized using a heap data structure.
In fact, the Java heap isn't really a data structure at all. Rather it is an area of memory in which objects are allocated. Space is reclaimed by tracing the reachable objects, and then deleting the remaining objects and consolidating the remaining space.
By contrast, a C or C++ heap cannot be traced and consolidated (because there is insufficient reliable type information to allow pointers to be identified unambiguously). Therefore a C / C++ heap will include a data structure to organize the free space. However, this isn't a heap data structure in the sense of the quoted text. Typically it is an array of lists of "nodes" of the same size.
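To make the distinction concrete, the heap data structure from the quoted text is what java.util.PriorityQueue implements, and there the priority comes from the element ordering (natural ordering or a Comparator), not from the memory manager. A small sketch, with a hypothetical class name:

```java
import java.util.PriorityQueue;

/** Hypothetical example of the heap data structure, unrelated to the Java memory heap. */
final class HeapStructureDemo {
    /** Adds the values to a binary heap and removes them in priority order. */
    static int[] drain(int... values) {
        PriorityQueue<Integer> heap = new PriorityQueue<>(); // smallest element first
        for (int v : values) {
            heap.add(v);
        }
        int[] out = new int[values.length];
        for (int i = 0; i < out.length; i++) {
            out[i] = heap.poll(); // removes the element with the lowest value
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(drain(5, 1, 3)));
    }
}
```

The objects stored in this queue are themselves allocated on the Java heap, which has no notion of priority.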
I will explain that with a reference to C++.
You have local variables that are created on the stack when they are initialized and destroyed when execution leaves the block. Basically that means every local variable lives inside the stack frame of its block. Hence, when the block dies, the variable dies.
If you don't know in advance how big your object is going to be, you have to allocate memory on the heap. An example would be a dynamically resizable array. In C++ this is done with the "new" operator (or malloc, calloc, realloc etc.), which means you are responsible for both allocating and releasing the memory. In Java you allocate with the "new" operator too, but you do not release the memory yourself.
Objects on the heap don't just get destroyed when you leave a block, unless you create them in your main function and the program exits right after.
In C++ you call either delete or free() to release the memory of your heap object. In Java, on the other hand, the garbage collector does this for you. It does so by working out which instances can still be reached from the running program and reclaiming the rest (of course it's a bit more complicated than that).
I have a situation where there are 2 files A and B, and data is being written continuously in both of them (like a stream).
Now I know that both files A and B are going to be competing for memory and the garbage collector is going to decide what page for what file will be replaced.
I want to control garbage collection by making the garbage collector favor file A (i.e. garbage collector should always choose eviction of pages of file B compared to A). Other possibility is to force writing of file B to disk instead of caching in memory.
Can these things happen in java?
I suspect you are confusing memory management with garbage collection. Yes, garbage collection is a form of memory management, but it's not what you are talking about when discussing "which pages of memory will be swapped out to disk when memory space is low". That's not garbage collection, because there are still active references to the A and B files. The garbage collector won't do anything until there are no references to an object.
You want to control memory page swapping not garbage collection. I'm sure I'll be corrected in comments if I'm wrong about this, but I don't think you can control in Java which pages of memory get swapped to disk when available memory is low.
You cannot forcefully ask Java to do garbage collection.
But you can call System.gc() to request the JVM to do a garbage collection.
To make an object eligible for garbage collection you can assign null to every reference to it. That way, when the garbage collector runs, it can find the object unreachable and remove it from the heap.
Java has automatic garbage collection: it identifies which objects are in use and which are not, and deletes the unused objects.
A good source about garbage collection within Java is here
The description of your problem lacks certain details, specifically, are the writes to your files sequential or is there random access involved?
As geneSummons correctly points out, you have memory management in the JVM confused with that of the Operating System. Even sun.misc.Unsafe will not allow you control over paging activity at the OS level from a Java application.
What you may want to look at is using memory mapped files, but that does depend on whether you are using random access for your writes. If all you're doing is writing sequentially this is most likely no use. Although this does not give you control over the paging of the files at the OS level it may provide you with a more efficient way of solving your problem.
There is a useful article on this subject, https://howtodoinjava.com/java-7/nio/java-nio-2-0-memory-mapped-files-mappedbytebuffer-tutorial/
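A minimal sketch of a memory-mapped write follows, using the standard java.nio API. MappedWriteDemo, the temporary file, and the 64-byte region are assumptions made for illustration. The force() call asks the OS to write the mapped region to storage, which is the closest Java gets to "force writing to disk"; it does not control which pages the OS keeps cached.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical example of writing through a memory-mapped region of a file. */
final class MappedWriteDemo {
    /** Maps 64 bytes of a temporary file, writes an int, flushes it, and reads it back. */
    static int writeAndReadBack(int value) {
        try {
            Path path = Files.createTempFile("mapped", ".bin");
            path.toFile().deleteOnExit();
            try (FileChannel channel = FileChannel.open(
                    path, StandardOpenOption.READ, StandardOpenOption.WRITE)) {
                // Mapping READ_WRITE past the end of the file extends the file.
                MappedByteBuffer map = channel.map(FileChannel.MapMode.READ_WRITE, 0, 64);
                map.putInt(0, value);
                map.force(); // ask the OS to write the mapped region to storage
                return map.getInt(0);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(writeAndReadBack(1234));
    }
}
```

For ordinary stream-style writes without mapping, FileChannel.force(true) plays the same role.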
You create a variable to store a value that you can refer to that variable in the future. I've heard that you must set a variable to 'null' once you're done using it so the garbage collector can get to it (if it's a field var).
If I were to have a variable that I won't be referring to again, would removing the reference/value vars I'm using (and just using the numbers when needed) save memory? For example:
int number = 5;
public void method() {
System.out.println(number);
}
Would that take more space than just plugging '5' into the println method?
I have a few integers that I don't refer to in my code ever again (game loop), but I've seen others use reference vars on things that really didn't need them. Been looking into memory management, so please let me know, along with any other advice you have to offer about managing memory
I've heard that you must set a variable to 'null' once you're done using it so the garbage collector can get to it (if it's a field var).
This is very rarely a good idea. You only need to do this if the variable is a reference to an object which is going to live much longer than the object it refers to.
Say you have an instance of Class A and it has a reference to an instance of Class B. Class B is very large and you don't need it for very long (a pretty rare situation) You might null out the reference to class B to allow it to be collected.
A better way to handle objects which don't live very long is to hold them in local variables. These are naturally cleaned up when they drop out of scope.
If I were to have a variable that I won't be referring to again, would removing the reference vars I'm using (and just using the numbers when needed) save memory?
You don't free the memory for a primitive until the object which contains it is cleaned up by the GC.
Would that take more space than just plugging '5' into the println method?
The JIT is smart enough to turn fields which don't change into constants.
Been looking into memory management, so please let me know, along with any other advice you have to offer about managing memory
Use a memory profiler instead of chasing down 4 bytes of memory. Something like 4 million bytes might be worth chasing if you have a smartphone. If you have a PC, I wouldn't bother with 4 million bytes.
In your example number is a primitive, so will be stored as a value.
If you want to use a reference then you should use one of the wrapper types (e.g. Integer)
Note that local variables live on the stack, while the objects that reference variables refer to live on the heap. Having variables is not a problem, although reference variables do keep the objects they point to alive. In the simple case you describe it is of no real consequence. If a variable is never read again and is within a contained scope, the compiler will probably strip it out before runtime. Even if it doesn't, the garbage collector can safely reclaim whatever it referred to once the stack frame is popped. If you are running into issues with too many stack variables, it is usually because you have really deep stacks; adjusting the stack size per thread is a better fix than making your code unreadable. Setting variables to null is also generally no longer needed.
It's really a matter of opinion. In your example, System.out.println(5) would be slightly more efficient, as you only refer to the number once and never change it. As was said in a comment, int is a primitive type and not a reference, so it doesn't take up much space. However, you might want to set actual reference variables to null only if they are used in a very complicated method. Local reference variables disappear when the method they are declared in returns, and the objects they referred to become eligible for garbage collection if nothing else references them.
Well, the JVM memory model works something like this: values are stored on one pile of memory called the stack, and objects are stored on another pile of memory called the heap. The garbage collector looks for garbage by working out which of the objects you've made aren't pointed at by anything. This is where setting a reference to null comes in; all nonprimitive (think of classes) variables are really references that point to an object on the heap, so by setting the reference you have to null, the garbage collector can see that there's nothing else pointing at the object and can decide to garbage collect it. All Java objects are stored on the heap so they can be seen and collected by the garbage collector.
Primitive (ints, chars, doubles, those sort of things) values held in local variables, however, aren't stored on the heap. They're created and stored temporarily as they're needed and there's not much you can do there, but thankfully the compilers nowadays are really efficient and will avoid needing to store them on the JVM stack unless they absolutely need to.
On a bytecode level, that's basically how it works. The JVM is based on a stack-based machine, with a couple of instructions to allocate objects on the heap as well, and a ton of instructions to manipulate values and to push and pop them on and off the stack. Local variables are stored on the stack, allocated objects on the heap.* These are the heap and the stack I'm referring to above. Here's a pretty good starting point if you want to get into the nitty gritty details.
In the resulting compiled code, there's a bit of leeway in terms of implementing the heap and stack. Allocation's implemented as allocation; there's really not a way around doing so. Thus the virtual machine heap becomes an actual heap, and allocations in the bytecode are allocations in actual memory. But you can get around using a stack to some extent, since instead of storing the values on a stack (and accessing a ton of memory), you can store them in registers on the CPU, which can be up to a hundred times (maybe even a thousand) faster than storing them in memory. But there are cases where this isn't possible (look up register spilling for one example of when this may happen), and using a stack to implement a stack kind of makes a lot of sense.
And quite frankly, in your case a few integers probably won't matter. The compiler will probably optimize them out by itself in this case anyway. Optimization should always happen after you get it running and notice it's a tad slower than you'd prefer it to be. Worry about making simple, elegant, working code first, then later make it fast and (hopefully) still simple, elegant, working code.
Java's actually very nicely made so that you shouldn't have to worry about nulling variables very often. Whenever you stop needing to use something, it will usually incidentally be disappearing from the scope of your program (and thus becoming eligible for garbage collection). So I guess the real lesson here is to use local variables as often as you can.
*There's also a constant pool, a local variable pool, and a couple other things in memory but you have close to no control over the size of those things and I want to keep this fairly simple.