JVM creating Objects on the Stack? (and more)


Are there JVMs out there, that create Objects on the stack?
Or JVMs that do not interact with Java Garbage Collection via Reference Counters etc?
Assuming we have a temporary Object created in a method.
And this Object's reference never gets passed/stored/accessed outside the method.
It is just used internally.
When following the classic approach of allocating objects (on the heap, along with reference counters), the following steps have to be taken care of:
1. Find a spot in the heap that is large enough to hold the object
2. Allocate the space
3. Update the reference pointer
4. Register the object with garbage collection
5. [... object gets used, eventually discarded ...]
6. Identify it for garbage collection
7. Remove it from the heap
8. Unregister it from GC
So if a VM created objects on the stack instead, steps 1, 3, 4, 6, 7 and 8 would not be necessary, and step 2 and its rough counterpart in step 7 would be simple stack management.
So are there JVMs that optimize this?
Or any hybrid systems, like allocating the Object in the Heap, but not touching the normal GC, and instead directly removing the Object at the end of its scope?
Are there implementations with multiple Heaps (one GC-supervised and the other stack-supervised)?

Kind of. There is a project called Valhalla that aims to bring value types to Java. It can already be downloaded and used, but it is NOT ready for production use (and if it becomes ready, it will probably just be merged into one of the official Java releases).
You can download the early-access (EA) release from https://jdk.java.net/valhalla/ and the page about the feature itself is https://openjdk.java.net/jeps/169
Additional notes:
Java does not use reference counting. The GC works by looking for root objects that are definitely in use, such as objects referenced from currently executing methods. It then finds every other object that is referenced from these roots, and removes all the rest.
The JIT also performs escape analysis and can remove the need to allocate an object at all. Instead, it just uses the stack to store the data that would normally be stored in that object. (Note that this is NOT stack allocation, as the object is not even created.) Thanks to inlining it can also do that across methods, but you can't control it or have any guarantee that it will happen.
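As a rough illustration of the kind of code escape analysis targets, here is a minimal sketch. The class and method names (Point, EscapeDemo, sumOfSquares) are made up for this example, and whether a given JIT actually eliminates the allocation is an assumption that cannot be guaranteed from the source code alone.

```java
// Hypothetical example: these names are illustrative, not from any library.
final class Point {
    final int x;
    final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }
}

class EscapeDemo {
    // The Point never leaves this method: it is not returned, not stored in a
    // field, and not passed to code that keeps it. A JIT compiler may therefore
    // replace it with two plain int values instead of allocating an object.
    static int sumOfSquares(int a, int b) {
        Point p = new Point(a, b);
        return p.x * p.x + p.y * p.y;
    }

    public static void main(String[] args) {
        long total = 0;
        // Call the method often enough that the JIT is likely to compile it.
        for (int i = 0; i < 1_000_000; i++) {
            total += sumOfSquares(i % 10, 3);
        }
        System.out.println(total);
    }
}
```

The program behaves the same whether or not the optimization kicks in; only the amount of heap allocation differs.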

Related

Exclusion of elements in a Java file

I have some doubts about the garbage collector and how I can clear memory in Java.
I have a program that writes a binary search tree to a file. I made one function that inserts an element and another that removes an element, and in the remove method I put the elements that I remove in a space in the file that I call "empty blocks" (which is a stack). In the C language there is a function, free(), that frees memory; in Java there is the garbage collector, which runs at Java's discretion. How can I free the memory of these blocks in the file (the excluded elements)?
Is there a way to free the memory of an element on file in Java (the element is of type int)?

I put the elements that I remove in a space in the file that I call "empty blocks" (which is a stack)
Whatever data structure you use to track your data will be in an object of some class.
When that object no longer has any references pointing to it, that object becomes a candidate for garbage collection. No need for you to do anything except not hang on to any reference longer than needed.
The garbage collector may clear the unneeded object immediately, or may clear it later. Either way, we as Java programmers do not care. Eventually the memory will be freed up.
If the reference variable pointing to an object is a local variable, that reference is dropped when the local variable goes out of scope.
If the reference variable is a member field on another object, the object in question will be released when the other object becomes garbage.
If the reference variable is static, you should assign null explicitly to let the referenced object become garbage. In Java, static variables stay in memory throughout the execution run of your app.
In the first two cases, you can release the object sooner by setting the reference variable to null. Generally this is not needed, but doing so may be wise if a large amount of memory is at stake. Ditto if other precious resources are being needlessly held.
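The three cases above can be sketched as follows. This is only an illustration under assumed names (ReferenceLifetimes and its members are hypothetical), and it shows when objects become unreachable, not when the collector actually reclaims them.

```java
// Hypothetical example: class and member names are made up for illustration.
class ReferenceLifetimes {
    // Static reference: stays reachable as long as the class is loaded,
    // unless it is reassigned or cleared.
    static int[] cache = new int[1024];

    // Member field: reachable as long as the enclosing object is reachable.
    private int[] buffer = new int[1024];

    static int sumLocal() {
        // Local variable: the array becomes unreachable once the method returns.
        int[] temp = {1, 2, 3};
        int sum = 0;
        for (int v : temp) {
            sum += v;
        }
        return sum;
    }

    void releaseBuffer() {
        buffer = null; // optional early release of a large object
    }

    boolean hasBuffer() {
        return buffer != null;
    }

    static void clearCache() {
        cache = null; // the array is now eligible for garbage collection
    }

    public static void main(String[] args) {
        System.out.println(sumLocal());
        ReferenceLifetimes r = new ReferenceLifetimes();
        r.releaseBuffer();
        System.out.println(r.hasBuffer());
        clearCache();
        System.out.println(cache == null);
    }
}
```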

Is there a way to free the memory of an element on file in Java (the element is of type int)?
Your question is really hard to understand, but I think you are asking about freeing up disk blocks in a data structure stored in a file [1].
There is no Java support for this. If you write a data structure to a file, the problem of reclaiming space in the file is yours, not Java's. Indeed, I don't think that a typical OS will allow you to (literally) free disk blocks in the middle of a file [2].
There may be 3rd-party libraries that support this kind of thing, but I don't have the background knowledge to make a recommendation.
If I have correctly understood what you are asking, your discussion of C's malloc / free versus Java's garbage collection is only peripherally relevant. Both of these schemes are for managing memory, not space in a random access file. Now you could conceivably implement similar schemes for managing space in a file, but you would need to take account of the different characteristics of memory and disk I/O. (Even if you are mapping the file into memory.)
[1] If you are actually talking about managing objects in heap memory in Java, your best bet is to just let the garbage collector deal with it; see Basil's answer. There are also 3rd-party libraries for storing objects in off-heap memory, but it is unclear if they would help you. I understand that such libraries typically leave it to the programmer to decide when to free an object. (They are not garbage collected.)
[2] It would be a bad idea. If the disk blocks thus freed were then used in a different file, you would get a lot of file fragmentation. That would be bad for file I/O performance.
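One conceivable way to manage space in a file yourself is a free list: removed blocks are remembered and reused by later insertions. The sketch below is only an assumption about how that could look. BlockFile and its methods are hypothetical names, blocks hold a single int, and the free list lives only in memory, so a real implementation would also have to persist it.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch: fixed-size blocks in a file, with freed blocks kept
// on an in-memory stack so that later insertions can reuse them.
class BlockFile implements AutoCloseable {
    private final RandomAccessFile file;
    private final Deque<Long> freeBlocks = new ArrayDeque<>();

    BlockFile(File path) throws IOException {
        this.file = new RandomAccessFile(path, "rw");
    }

    // Writes the value into a reused block if one is free, else appends it.
    long insert(int value) throws IOException {
        long offset = freeBlocks.isEmpty() ? file.length() : freeBlocks.pop();
        file.seek(offset);
        file.writeInt(value);
        return offset;
    }

    // "Freeing" only records the offset; the file itself does not shrink.
    void remove(long offset) {
        freeBlocks.push(offset);
    }

    int read(long offset) throws IOException {
        file.seek(offset);
        return file.readInt();
    }

    @Override
    public void close() throws IOException {
        file.close();
    }

    public static void main(String[] args) throws IOException {
        File tmp = File.createTempFile("blocks", ".bin");
        tmp.deleteOnExit();
        try (BlockFile blocks = new BlockFile(tmp)) {
            long a = blocks.insert(10);
            long b = blocks.insert(20);
            blocks.remove(a);
            long c = blocks.insert(30); // reuses the block that held 10
            System.out.println(a == c);
            System.out.println(blocks.read(b));
        }
    }
}
```

Note that this has nothing to do with the garbage collector: the bookkeeping of which file blocks are free is entirely the program's responsibility.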

Why are Java objects allocated in a heap?

Quoting Wikipedia 'A heap is a useful data structure when you need to remove the object with the highest (or lowest) priority'.
I am familiar with what a heap is and the kind of problems I can solve with them, but I was wondering why this data structure is the one used for the allocation of Objects in Java? Also, what determines the priority of an Object?

The quoted text is referring to a kind of data structure called a heap.
The word heap is also used for a form of dynamic memory management.
This is a case where one IT English word has taken on two different and independent meanings. (This is a fairly common phenomenon in normal English ...)
I was wondering why this data structure is the one used for the allocation of Objects in Java?
Simply, it isn't. A dynamic memory heap (such as the Java heap) is not organized using a heap data structure.
In fact, the Java heap isn't really a data structure at all. Rather it is an area of memory in which objects are allocated. Space is reclaimed by tracing the reachable objects, and then discarding the unreachable objects and consolidating the freed space.
By contrast, a C or C++ heap cannot be traced and consolidated (because there is insufficient reliable type information to allow pointers to be identified unambiguously). Therefore a C / C++ heap will include a data structure to organize the free space. However, this isn't a heap data structure in the sense of the quoted text. Typically it is an array of lists of "nodes" of the same size.
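To make the two meanings concrete, here is a small sketch. java.util.PriorityQueue is a standard class documented as being based on a priority heap (the data structure); the TwoHeaps class and its smallest method are hypothetical names used only for this example.

```java
import java.util.PriorityQueue;

// Hypothetical example class; only PriorityQueue comes from the JDK.
class TwoHeaps {
    // Returns the smallest of the given values (at least one is required).
    static int smallest(int... values) {
        // Meaning 1: PriorityQueue orders its elements with a heap data
        // structure, so the lowest value is always polled first.
        // Meaning 2: the queue object itself, like any Java object, is
        // allocated in the memory area that is also called "the heap".
        PriorityQueue<Integer> queue = new PriorityQueue<>();
        for (int v : values) {
            queue.add(v);
        }
        return queue.poll();
    }

    public static void main(String[] args) {
        System.out.println(smallest(5, 1, 3));
    }
}
```

The "priority" in the quoted text belongs to the elements of such a queue; ordinary Java objects have no priority as far as memory allocation is concerned.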

I will explain that with a reference to C++.
You got local variables that get created on the stack when initializing the variable and destroyed when leaving the block. Basically that means that every local variable lives inside the stack frame of the block. Hence, dies the block, dies the variable.
If you don't know in advance how big your object is going to be, you have to allocate memory on the heap. An example would be a dynamically resizable array. In C++ this is done with the "new" operator (or malloc, calloc, realloc etc.), which means you are responsible for both creating and releasing the memory. In Java you also allocate with the "new" operator, but you do not release the memory yourself.
Objects on the heap don't just get destroyed when you leave a block, unless you create them in your main function and the program exits right after that.
In C++ you call either delete or free() to release the memory of your heap object. In Java, on the other hand, the garbage collector does this for you. It does that by working out which instances are still reachable from your program, rather than by keeping a reference count (of course it's a bit more complicated than that).

How does Java solve retain cycles in garbage collection?

I know that a retain cycle (at least in Objective-C and Swift) is when two objects claim ownership of one another (they have references to each other). And in Objective-C we can solve the issue by declaring one of them weak.
From what I have read and understood, the Java GC is not affected by retain cycles, and we do not have to worry about weak references. How does it solve it?

The Java (JVM) garbage collector works by looking for "reachable" objects, starting from the root(s) of the object tree. If objects can't be reached (if they have no outside object references), then entire object graphs can be discarded.
Essentially it just traverses the tree from the root(s) to the leaf nodes and marks all objects it encounters. Any memory in the heap not taken up by marked objects is swept (marked as free). This is called mark and sweep.
This can't be done easily in Objective-C because it uses reference counting rather than mark and sweep, and reference counting has its flaws.
The reason there can be no retain cycles is that objects which aren't linked to the "tree" anywhere aren't marked and can be discarded.
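The idea can be sketched with a toy model. This is not how any real JVM collector is implemented; HeapObject, MarkSweepDemo and their methods are hypothetical names, and the "heap" is just a list of objects. It only shows why two objects that reference each other are still swept when no root leads to them.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model of an object that holds references to other objects.
class HeapObject {
    final String name;
    final List<HeapObject> refs = new ArrayList<>();

    HeapObject(String name) {
        this.name = name;
    }
}

class MarkSweepDemo {
    // Mark phase: collect everything reachable from the roots.
    static Set<HeapObject> mark(List<HeapObject> roots) {
        Set<HeapObject> marked = new HashSet<>();
        Deque<HeapObject> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            HeapObject current = pending.pop();
            if (marked.add(current)) {      // true only on the first visit
                pending.addAll(current.refs);
            }
        }
        return marked;
    }

    // Sweep phase: anything in the heap that was not marked is garbage.
    static List<String> sweep(List<HeapObject> heap, Set<HeapObject> marked) {
        List<String> garbage = new ArrayList<>();
        for (HeapObject o : heap) {
            if (!marked.contains(o)) {
                garbage.add(o.name);
            }
        }
        return garbage;
    }

    static List<String> demo() {
        HeapObject root = new HeapObject("root");
        HeapObject a = new HeapObject("a");
        HeapObject x = new HeapObject("x");
        HeapObject y = new HeapObject("y");
        root.refs.add(a);
        x.refs.add(y); // x and y form a cycle ...
        y.refs.add(x); // ... but nothing reachable from root points to them
        List<HeapObject> heap = List.of(root, a, x, y);
        return sweep(heap, mark(List.of(root)));
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [x, y]
    }
}
```

A reference-counting scheme would see a count of one on both x and y and keep them alive, which is exactly the retain-cycle problem.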

The garbage collector looks for reachable objects, starting from the roots (typically: variables on the call stack or global variables). So if two objects reference each other but are not otherwise reachable they won't be flagged as "live" and will be collected.

As the name suggests, garbage collection refers to the removal of objects which are no longer in use. It is a well-known fact that Java stores objects on the heap, irrespective of their scope. Thus, if we keep on creating objects without clearing the heap, we might run out of heap space and get an 'Out of Memory' error. Garbage collection in Java is a mechanism which is controlled and executed by the Java Virtual Machine (JVM) to release the heap space occupied by objects which are no longer in use. In contrast to C++, garbage collection in Java relieves the developer of memory-management-related activities. The JVM executes this process with the help of a daemon thread called the 'Garbage Collector'. If an object overrides the finalize method, the JVM invokes it before reclaiming the object, so that cleanup can be performed on it (finalization is deprecated in recent Java versions). As developers we cannot force the JVM to run the garbage collector thread. There are methods such as Runtime.gc() or System.gc(), but none of these guarantees that the garbage collector will run. They only send a garbage collection request to the JVM. It is up to the Java Virtual Machine when it will initiate the garbage collection process.
Take a look at this article:
How Garbage Collection works in Java

In basic terms, Garbage Collection works by walking the object graphs from a number of predefined roots. Anything not accessible from those roots is garbage, therefore one object referencing another is irrelevant unless either can be accessed from one or more roots.
It's all explained in more detail in How Garbage Collection Really Works.

The behavior of a tracing garbage collector may be viewed as analogous to that of a bowling alley pinsetter, which automatically sweeps up all pins that have been knocked over without disrupting pins that are still standing. Rather than trying to identify knocked-over pins, the pinsetter grabs all of the pins that are still standing, lifts them off the alley, and then runs a sweeper bar over the alley surface, removing wholesale any pins that might happen to be there without knowing or caring where they are.
A tracing GC works by visiting a certain set of "rooted" object references (which are regarded as always "reachable") and objects that are reachable via references held in reachable objects. The GC will mark such objects and protect their contents somehow. Once all such objects have been visited, the system will then visit some "special" objects (e.g. lists of weak or phantom references, or references to objects with finalizers) and others which are reachable from them but weren't reachable from ordinary rooted references, and then regard any storage which hasn't been guarded as eligible for reuse.
The system will need to specially treat objects that were reachable from special objects but weren't reachable from ordinary ones, but otherwise won't need to care about "ordinary" objects that become eligible for collection. If an object doesn't have a finalizer and isn't targeted by a weak or phantom reference, the GC may reuse its associated storage without ever bothering to look at any of it. There's no need for the GC to worry about the possibility that a group of objects that aren't reachable via any rooted references might hold references to each other, because the GC wouldn't bother examining any of those references even if they existed.

When to create variables (memory management)

You create a variable to store a value so that you can refer to it in the future. I've heard that you must set a variable to 'null' once you're done using it so the garbage collector can get to it (if it's a field var).
If I were to have a variable that I won't be referring to again, would removing the reference/value vars I'm using (and just using the numbers when needed) save memory? For example:
int number = 5;
public void method() {
    System.out.println(number);
}
Would that take more space than just plugging '5' into the println method?
I have a few integers that I don't refer to in my code ever again (game loop), but I've seen others use reference vars on things that really didn't need them. Been looking into memory management, so please let me know, along with any other advice you have to offer about managing memory

I've heard that you must set a variable to 'null' once you're done using it so the garbage collector can get to it (if it's a field var).
This is very rarely a good idea. You only need to do this if the variable belongs to an object which is going to live much longer than the object the variable refers to.
Say you have an instance of Class A and it has a reference to an instance of Class B. Class B is very large and you don't need it for very long (a pretty rare situation). You might null out the reference to class B to allow it to be collected.
A better way to handle objects which don't live very long is to hold them in local variables. These are naturally cleaned up when they drop out of scope.
If I were to have a variable that I won't be referring to again, would removing the reference vars I'm using (and just using the numbers when needed) save memory?
You don't free the memory for a primitive until the object which contains it is cleaned up by the GC.
Would that take more space than just plugging '5' into the println method?
The JIT is smart enough to turn fields which don't change into constants.
Been looking into memory management, so please let me know, along with any other advice you have to offer about managing memory
Use a memory profiler instead of chasing down 4 bytes of memory. Something like 4 million bytes might be worth chasing if you have a smart phone. If you have a PC, I wouldn't bother with 4 million bytes.

In your example number is a primitive, so will be stored as a value.
If you want to use a reference then you should use one of the wrapper types (e.g. Integer)
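A minimal sketch of the difference, assuming a made-up class name (PrimitiveVsWrapper) and field names chosen to match the question:

```java
// Hypothetical example class; int and Integer are standard Java types.
class PrimitiveVsWrapper {
    int number = 5;    // primitive: the value 5 is stored inside this object
    Integer boxed = 5; // reference to an Integer object, and it may be null

    static boolean demo() {
        PrimitiveVsWrapper p = new PrimitiveVsWrapper();
        p.boxed = null;     // legal: a reference can point to nothing
        // p.number = null; // would not compile: a primitive always holds a value
        return p.boxed == null && p.number == 5;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Setting a variable to null therefore only makes sense for the reference case; there is nothing to "release" for the int.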

Note that variables live on the stack, while the objects they refer to live on the heap. So having variables is not too bad, although reference variables do keep other objects reachable. In the simple case you describe, however, it is really of no consequence. If a variable is never read again and sits within a contained scope, the compiler will probably strip it out before runtime. Even if it doesn't, the garbage collector will be able to safely remove whatever it referenced once the stack frame is popped. If you are running into issues where you have too many stack variables, it's usually because you have really deep stacks. Adjusting the amount of stack space per thread is a better fix than making your code unreadable. Setting variables to null is also no longer needed.

It's really a matter of opinion. In your example, System.out.println(5) would be slightly more efficient, as you only refer to the number once and never change it. As was said in a comment, int is a primitive type and not a reference, so it doesn't take up much space. However, you might want to set actual reference variables to null only if they are used in a very complicated method. Objects referenced only by local variables become eligible for garbage collection when the method the variables are declared in returns.

Well, the JVM memory model works something like this: values are stored on one pile of memory called the stack and objects are stored on another pile of memory called the heap. The garbage collector looks for garbage by looking at a list of objects you've made and seeing which ones aren't pointed at by anything. This is where setting an object to null comes in; all nonprimitive (think of classes) variables are really references that point to the object on the heap, so by setting the reference you have to null, the garbage collector can see that there's nothing else pointing at the object and it can decide to garbage collect it. All Java objects are stored on the heap so they can be seen and collected by the garbage collector.
Primitive (ints, chars, doubles, those sorts of things) values held in local variables, however, aren't stored on the heap. They're created and stored temporarily as they're needed and there's not much you can do there, but thankfully the compilers nowadays are really efficient and will avoid needing to store them on the JVM stack unless they absolutely have to.
On a bytecode level, that's basically how it works. The JVM is based on a stack-based machine, with a couple of instructions to allocate objects on the heap as well, and a ton of instructions to manipulate values and push and pop them on and off the stack. Local variables are stored on the stack, allocated variables on the heap.* These are the heap and the stack I'm referring to above. Here's a pretty good starting point if you want to get into the nitty gritty details.
In the resulting compiled code, there's a bit of leeway in terms of implementing the heap and stack. Allocation's implemented as allocation; there's really not a way around doing so. Thus the virtual machine heap becomes an actual heap, and allocations in the bytecode are allocations in actual memory. But you can get around using a stack to some extent, since instead of storing the values on a stack (and accessing a ton of memory), you can store them in registers on the CPU, which can be up to a hundred times (maybe even a thousand) faster than storing them in memory. But there are cases where this isn't possible (look up register spilling for one example of when this may happen), and using a stack to implement a stack kind of makes a lot of sense.
And quite frankly, in your case a few integers probably won't matter. The compiler will probably optimize them out by itself in this case anyway. Optimization should always happen after you get it running and notice it's a tad slower than you'd prefer it to be. Worry about making simple, elegant, working code first, then later make it fast (and hopefully still simple, elegant, working) code.
Java's actually very nicely made so that you shouldn't have to worry about nulling variables very often. Whenever you stop needing to use something, it will usually incidentally be disappearing from the scope of your program (and thus becoming eligible for garbage collection). So I guess the real lesson here is to use local variables as often as you can.
*There's also a constant pool, a local variable pool, and a couple other things in memory but you have close to no control over the size of those things and I want to keep this fairly simple.
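To see the stack-based machine described above in action, you can compile a tiny class and inspect it with the JDK's javap -c tool. The class below is a made-up example (StackMachineDemo is a hypothetical name); the instructions in the comments are the ones javac typically emits for such methods.

```java
// Hypothetical example class for inspecting bytecode with `javap -c`.
class StackMachineDemo {
    // Typical bytecode for this method:
    //   iload_0   push local variable 0 (a) onto the operand stack
    //   iload_1   push local variable 1 (b)
    //   iadd      pop both values, push their sum
    //   ireturn   pop the result and return it
    static int add(int a, int b) {
        return a + b;
    }

    // Object creation starts with the `new` instruction, which allocates
    // on the heap; only the reference ever sits on the stack.
    static StringBuilder create() {
        return new StringBuilder();
    }

    public static void main(String[] args) {
        System.out.println(add(2, 3));
        System.out.println(create().append("heap").length());
    }
}
```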

In Java, is there a performance difference between new and local?

In C and C++ I know that there could be a huge difference in performance between instantiating objects on the stack vs. using 'new' to create them on the heap.
Is this the same in Java?
The 'new' operator in Java is very convenient (especially when I don't have to remember freeing/deleting the objects created with 'new'), but does this mean that I can go wild with 'new'?

Erm, there is no other way in java to instantiate an object.
All objects are created with new, and all objects are created on the heap.
In Java, when you say
MyObject foo;
You're simply declaring a variable (reference). It isn't instantiated until you say
foo = new MyObject();
When all references to that object are out of scope, the object becomes eligible for garbage collection. You'll note there's no such thing as delete in Java :)
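Put together, the difference between declaring a reference and instantiating an object looks roughly like this. MyObject is the name used above; its value field and the DeclareVsInstantiate class are assumptions added only to make the sketch runnable.

```java
// MyObject is the class name from the answer; its field is an assumption.
class MyObject {
    int value = 42;
}

class DeclareVsInstantiate {
    static int demo() {
        MyObject foo = null;    // only a reference; no object exists yet
        foo = new MyObject();   // now an object is created on the heap
        MyObject alias = foo;   // a second reference to the same object
        foo = null;             // the object is still reachable through alias
        return alias.value;
    }   // after returning, nothing refers to the object: it is eligible for GC

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```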

There is no allocation of objects on the stack in Java.
Only local variables (and parameters) can live on the stack and those can only contain references or primitive values, but never objects.

You can't create objects on the stack, you can only have primitives and references on the stack, so the question doesn't apply to Java.
There have been attempts to use escape analysis to optimise objects which are short lived (and possibly put them on the stack instead) however I haven't seen any evidence this improved performance.
Part of the reason there isn't the same performance hit/benefit as there would be in C/C++ is that Java has thread-local allocation on the heap and objects are not recycled as aggressively. C/C++ has thread-local stacks, but you need additional libraries to support multi-threaded object allocation. Objects are recycled more aggressively, which increases the cost of object allocation.
One of the biggest changes coming from the C/C++ world is to find that Java has far fewer features, but tries to make the most of them (there is a lot of complex optimisation going on in the JVM). On the other hand, Java has a rich/baffling array of open-source libraries.

Repeat after me: there is no allocation of objects on the stack in Java.
In Java, unlike C++, all objects are allocated on the heap, and the only way out is when they are garbage collected.
In Java, unlike C++, the variable falling out of scope does not mean that the destructor of the object runs; in fact, there is no destructor. So the variable might fall out of scope, but the object remains alive on the heap.
Can I go wild with 'new'?
Yes. First, because it's the only way to instantiate an object. Second, because the JVM is so good it can create up to 2^32 lightweight objects in less than a second.

In Java, there is no way to manually allocate objects on the Stack, though the compiler may decide to allocate objects created with 'new' on the stack, see Java theory and practice: Urban performance legends, revisited.

There's really nothing to compare here: you can't create objects on the stack in Java.
If it's any comfort, however, heap-based allocation in Java is (at least usually) quite fast. Java's garbage collector periodically "cleans up" the heap, so it basically looks a lot like a stack, and allocating from it is a lot like allocating from a stack as well -- in a typical case, you have a pointer to the beginning (or end) of the free memory area, and allocating a chunk of memory simply means adding (or subtracting) the amount from that pointer, and returning the address of the beginning (then, of course, constructing an object (or objects) in that area, etc.)
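The pointer-bumping scheme described above can be sketched as a toy allocator. This is only an illustration under assumed names (BumpAllocator, allocate, reset); it hands out offsets into a byte array and is not how any specific JVM is implemented.

```java
// Toy bump-pointer allocator over a fixed byte array (hypothetical names).
class BumpAllocator {
    private final byte[] memory;
    private int top = 0; // start of the free area

    BumpAllocator(int capacity) {
        memory = new byte[capacity];
    }

    // Returns the start offset of the new chunk, or -1 if there is not
    // enough space (a real runtime would trigger a collection here).
    int allocate(int size) {
        if (size < 0 || size > memory.length - top) {
            return -1;
        }
        int start = top;
        top += size; // allocating is just moving the pointer
        return start;
    }

    // Stand-in for a collection that empties the area: all space is free again.
    void reset() {
        top = 0;
    }

    public static void main(String[] args) {
        BumpAllocator heap = new BumpAllocator(32);
        System.out.println(heap.allocate(16));
        System.out.println(heap.allocate(8));
        System.out.println(heap.allocate(1000));
    }
}
```

Because allocation is a bounds check plus an addition, it costs about as much as growing a stack frame, which is why "going wild" with new is usually affordable.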
