A Java application starts up with one heap for all threads. Each thread has its own stack.
When a Java application is started, we use the JVM options -Xms and -Xmx to control the size of the heap and -Xss to control the stack size.
My understanding is that the heap becomes "managed" memory of the JVM, and all the objects being created are placed there.
But how does stack creation work? Does Java create a stack for each thread when it is created? If so, where exactly is that stack in memory? It is certainly not in the "managed" heap.
Does the JVM create the stack from native memory, or does it pre-allocate a section of the managed memory area for stacks? If so, how does the JVM know how many threads will be created?
The Java Virtual Machine Specification tells us a few things about thread stacks. Among other things:
Each Java Virtual Machine thread has a private Java Virtual Machine stack, created at the same time as the thread.
Because the Java Virtual Machine stack is never manipulated directly except to push and pop frames, frames may be heap allocated. The memory for a Java Virtual Machine stack does not need to be contiguous.
The specification permits Java Virtual Machine stacks either to be of a fixed size or to dynamically expand and contract as required by the computation.
Now, if we focus on JVM implementations such as HotSpot, we can get some more information. Here are a few facts I've collected from different sources:
In HotSpot, a thread's stack size seems to be fixed when the thread is created; this is what the aforementioned -Xss option controls, and there is a hard minimum per thread.
(Source)
In Java SE 6, the default on Sparc is 512k in the 32-bit VM, and 1024k in the 64-bit VM. ... You can reduce your stack size by running with the -Xss option. ...
64k is the least amount of stack space allowed per thread.
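To make that concrete, a launch command like the following (the class name MyApp is just a placeholder) would request a 256 KB stack for every thread alongside the usual heap bounds:

    java -Xss256k -Xms64m -Xmx512m MyApp

Per the quote above, the -Xss value applies per thread and cannot go below the 64k floor.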
JRockit allocates memory separate from the heap where stacks are located. (Source)
Note that the JVM uses more memory than just the heap. For example Java methods, thread stacks and native handles are allocated in memory separate from the heap, as well as JVM internal data structures.
There is a direct mapping between a Java Thread and a native OS Thread in HotSpot. (Source).
But the Java thread stack in HotSpot is software-managed; it is not an OS native thread stack. (Source)
It uses a separate software stack to pass Java arguments, while the native C stack is used by the VM itself. A number of JVM internal variables, such as the program counter or the stack pointer for a Java thread, are stored in C variables, which are not guaranteed to be always kept in the hardware registers. Management of these software interpreter structures consumes a considerable share of total execution time.
JVM also utilizes the same Java thread stack for the native methods and JVM runtime calls (e.g. class loading). (Source).
Interestingly, allocated objects may sometimes even be located on the stack instead of on the heap, as a performance optimization. (Source)
JVMs can use a technique called escape analysis, by which they can tell that certain objects remain confined to a single thread for their entire lifetime, and that lifetime is bounded by the lifetime of a given stack frame. Such objects can be safely allocated on the stack instead of the heap.
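As a hedged sketch of the kind of code that qualifies (whether HotSpot actually eliminates the allocation depends on the JIT and on flags such as -XX:+DoEscapeAnalysis, which is on by default in modern HotSpot):

    public class EscapeDemo {
        static final class Point { int x, y; }   // plain data holder

        // 'p' never escapes this frame, so an escape-analysis-capable JIT may
        // scalar-replace it or place it on the stack instead of the heap.
        static int sumOfSquares(int n) {
            Point p = new Point();
            int sum = 0;
            for (int i = 0; i < n; i++) {
                p.x = i;
                sum += p.x * p.x;
            }
            return sum;
        }

        public static void main(String[] args) {
            System.out.println(sumOfSquares(1_000));
        }
    }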
And because an image is worth a thousand words, here is one from James Bloom
Now answering some of your questions:
How does the JVM know how many threads will be created?
It doesn't. This can easily be proved by contradiction by creating a variable number of threads, as in the sketch below. The JVM does make some assumptions about the maximum number of threads and the stack size of each thread; that's why you may run out of memory (not meaning heap memory!) if you create too many threads.
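A trivial sketch of that argument (the class name is arbitrary): the thread count is read from the command line, so it cannot possibly be known to the JVM in advance, and each stack is only created when the corresponding thread starts.

    public class SpawnThreads {
        public static void main(String[] args) throws InterruptedException {
            int count = Integer.parseInt(args[0]);   // known only at run time
            for (int i = 0; i < count; i++) {
                Thread t = new Thread(() -> {
                    try {
                        Thread.sleep(60_000);        // keep the thread (and its stack) alive
                    } catch (InterruptedException ignored) {
                    }
                });
                t.start();                           // the stack is allocated here
            }
            Thread.sleep(60_000);
        }
    }

Push the count high enough (or combine it with a large -Xss) and thread creation eventually fails for lack of native memory, not because the Java heap is full.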
Does Java create a stack for each thread when it is created?
As mentioned earlier, each Java Virtual Machine thread has a private Java Virtual Machine stack, created at the same time as the thread. (Source).
If so, where exactly is the stack in memory? It is certainly not in the "managed" heap.
As stated above, the Java specification technically allows stack memory to be stored on the heap, but the JRockit JVM, at least, uses a separate part of memory.
Does JVM create stack from native memory or does it pre-allocate a section of managed memory area for stack?
The stack is JVM-managed because the Java specification prescribes how it must behave: A Java Virtual Machine stack stores frames (§2.6). A Java Virtual Machine stack is analogous to the stack of a conventional language. One exception is the native method stacks used for native methods. More about this, again, in the specification.
JVM uses more memory than just the heap. For example Java methods, thread stacks and native handles are allocated in memory separate from the heap, as well as JVM internal data structures.
Further reading.
So to answer your questions:
Does Java create a stack for each thread when it is created?
Yes.
If so, where exactly the stack is on the memory?
In the JVM allocated memory, but not on the heap.
If so, how does the JVM know how many threads will be created?
It doesn't.
You can create as many as you'd like until you've maxed out your JVM memory and get
Exception in thread "main" java.lang.OutOfMemoryError: unable to create new native thread
EDIT:
All of the above refers to the JRockit JVM, although I find it hard to believe other JVMs would be different on such fundamental issues.
Related
I come from C/C++ background, where a process memory is divided into:
Per thread stack
Heap
Instructions
Data
I am trying to understand how the JVM works. I looked at different resources and gathered that JVM memory is divided into a heap and a stack as well, plus a few other things.
I want to wrap my mind around this: when I read about the heap and stack in the JVM, are we talking about the C/C++ concepts of a stack and a heap? And does the actual memory of the entire JVM reside on the heap (and here I mean the C++ concept of a heap)?
I want to wrap my mind around this: when I read about the heap and stack in the JVM, are we talking about the C/C++ concepts of a stack and a heap?
Yes, in general this is the case. Each thread has its own per-thread stack, which is used to store local variables in stack frames (corresponding to method calls). This stack need not be located anywhere near the per-thread stack at the OS level. If the stack attempts to grow past the size specified by -Xss (or the implementation's default), a StackOverflowError is thrown.
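A minimal illustration (the depth reached before the error depends on -Xss and on the size of each frame):

    public class Overflow {
        static long depth = 0;

        static void recurse() {
            depth++;
            recurse();                 // each call pushes another frame onto this thread's stack
        }

        public static void main(String[] args) {
            try {
                recurse();
            } catch (StackOverflowError e) {
                System.out.println("stack overflowed after " + depth + " frames");
            }
        }
    }

Running it with, say, -Xss256k versus -Xss8m changes the reported depth roughly in proportion.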
The stack can exist in C/C++ heap memory, and need not be contiguous (JVM spec v7):
Each Java Virtual Machine thread has a private Java Virtual Machine stack, created at the same time as the thread. A Java Virtual Machine stack stores frames (§2.6). A Java Virtual Machine stack is analogous to the stack of a conventional language such as C: it holds local variables and partial results, and plays a part in method invocation and return. Because the Java Virtual Machine stack is never manipulated directly except to push and pop frames, frames may be heap allocated. The memory for a Java Virtual Machine stack does not need to be contiguous.
The Java heap is a means of storing objects, including automatic garbage collection when objects are no longer reachable via strong references. It is shared between all threads running on a JVM.
The Java Virtual Machine has a heap that is shared among all Java Virtual Machine threads. The heap is the run-time data area from which memory for all class instances and arrays is allocated.
The heap is created on virtual machine start-up. Heap storage for objects is reclaimed by an automatic storage management system (known as a garbage collector); objects are never explicitly deallocated. The Java Virtual Machine assumes no particular type of automatic storage management system, and the storage management technique may be chosen according to the implementor's system requirements. The heap may be of a fixed size or may be expanded as required by the computation and may be contracted if a larger heap becomes unnecessary. The memory for the heap does not need to be contiguous.
By simply calling a constructor (e.g. HashMap foo = new HashMap()) the JVM will allocate the requisite memory on the heap for this object (or throw an OutOfMemoryError if that is not possible). It's also important to note that objects never live on the stack--only references to them do. Additionally, non-primitive fields always contain references to objects, not the objects themselves.
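A small sketch of that distinction (names are arbitrary):

    import java.util.HashMap;
    import java.util.Map;

    public class HeapVsStack {
        public static void main(String[] args) {
            // 'foo' and 'bar' are references held in main's stack frame;
            // the single HashMap they point to lives on the heap.
            Map<String, Integer> foo = new HashMap<>();
            Map<String, Integer> bar = foo;            // copies the reference, not the object
            bar.put("answer", 42);
            System.out.println(foo.get("answer"));     // 42 -- both names refer to the same heap object
        }
    }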
It's also possible to allocate memory off-heap through sun.misc.Unsafe on some JVMs, through the NIO classes that allocate direct buffers, and through the use of JNI. This memory is not part of the JVM heap and does not undergo automatic garbage collection (meaning it needs to be released explicitly, much like delete in C++), though it may well be part of heap memory as C++ would refer to it.
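For example, a direct NIO buffer keeps its backing storage outside the Java heap; this is only a sketch of the allocation, not of how such memory is managed or released:

    import java.nio.ByteBuffer;

    public class OffHeap {
        public static void main(String[] args) {
            // The 1 MB backing store is native memory, not part of the -Xmx heap;
            // only the small ByteBuffer wrapper itself is an ordinary heap object.
            ByteBuffer direct = ByteBuffer.allocateDirect(1024 * 1024);
            direct.putInt(0, 42);
            System.out.println(direct.getInt(0));   // 42
        }
    }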
Does anyone know if there is a way to dynamically (at runtime) increase the stack size of the main Thread? Also, and I believe it is the same question, is it possible to increase / update the stack size of a Thread after its instantiation?
Thread's constructor allows its stack size to be specified, but I can't find any way to update it. Actually, I didn't find any management of the stack size in the JDK at all (which tends to indicate that it's not possible); everything is done in the VM.
According to the JVM specification it is possible to set the stack size 'when the stack is created', but there is a note:
A Java virtual machine implementation may provide the programmer or the user control over the initial size of Java virtual machine stacks, as well as, in the case of dynamically expanding or contracting Java virtual machine stacks, control over the maximum and minimum sizes.
IMO that's not very clear. Does that mean that some VMs handle threads whose maximum stack size can evolve within a given range? Can we do that with HotSpot (I didn't find any stack-size-related options besides -Xss)?
Thanks !
The stack usage grows and shrinks dynamically as the thread runs, so you never need to do this.
What you can set with -Xss is the maximum size the stack can reach. This is a virtual memory size, and you can make it as large as you like on 64-bit JVMs. The actual memory used depends on how much of it you actually use. ;)
EDIT: The important distinction is that the maximum size is reserved as virtual memory (so is the heap, by the way), i.e. the address space is reserved, which is also why it cannot be extended later. On 32-bit systems you have limited address space and this can still be a problem, but on 64-bit systems you usually have up to 256 TB of virtual memory (a processor limitation), so virtual memory is cheap. The actual memory is allocated in pages (typically 4 KB), and these are only allocated when used. This is why the memory of a Java application appears to grow over time even though the maximum heap size is reserved at startup. The same thing happens with thread stacks: only the pages actually touched are allocated.
There's not a way to do this in the standard JDK, and even the stackSize argument isn't set in stone:
The effect of the stackSize parameter, if any, is highly platform dependent. ... On some platforms, the value of the stackSize parameter may have no effect whatsoever. ... The virtual machine is free to treat the stackSize parameter as a suggestion.
(Emphasis in original.)
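For completeness, a sketch of that constructor (the names and sizes here are arbitrary); whether the request is honoured is entirely up to the VM:

    public class BigStackThread {
        // Plain recursion, just to exercise the stack.
        static int depth(int n) {
            return n == 0 ? 0 : 1 + depth(n - 1);
        }

        public static void main(String[] args) throws InterruptedException {
            // Request a 16 MB stack for this one thread via
            // Thread(ThreadGroup, Runnable, String, long stackSize).
            // Whether the recursion below completes depends entirely on how the
            // VM honours (or ignores) that request.
            Thread worker = new Thread(null, () -> System.out.println(depth(100_000)),
                                       "big-stack-worker", 16L * 1024 * 1024);
            worker.start();
            worker.join();
        }
    }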
http://courses.washington.edu/css342/zander/css332/arch.html
bottom of the page:
The C++ memory model differs from the Java memory model. In C++, memory comes from two places, the run time stack and the memory heap.
This reads as if Java doesn't have a heap (or stack)?
I am trying to learn all the "under the bonnet" details for Java and C++
Java has a heap and a (per-thread) stack as well. The difference is that in Java, you cannot choose where to allocate a variable or object.
Basically, all objects and their instance variables are allocated on the heap, and all method parameters and local variables (just the references in the case of objects) are allocated on the stack.
However, some modern JVMs will allocate some objects on the stack as a performance optimization when they detect that the object is only used locally.
Java uses a heap memory model. All objects are created on the heap; references are used to refer to them.
It also puts method frames onto a stack when processing them.
I would say it has both.
Yes, Java has both a heap (common to the entire JVM) and stacks (one stack per thread).
And having a stack and a heap is more a property of implementations than of languages.
I would even say that most Linux programs have a heap (obtained through the mmap and sbrk system calls) and a stack (at the level of the operating system, this is not dependent on the language).
What Java has, but C++ usually does not, is a garbage collector. You don't need to release unused memory in Java, but in C++ you need to release it, by calling delete, for every C++ object allocated on the heap with new.
See however Boehm's garbage collector for a GC usable in C & C++. It works very well in practice (even if it can leak in theory, being a conservative, not a precise, GC).
Some restricted C++ or C environments (in particular free standing implementations for embedded systems without operating system kernel) don't have any heap.
Is there a way to find out how much memory my Java thread is taking in the VM?
For example, using stack trace dump, or some other means.
Thanks
Java threads use the heap as shared memory. Each individual thread has its own stack (the size of which you can set via the -Xss command-line option; the default is 512 KB), but all other memory (the heap) does not belong to specific threads, and asking how much of it one specific thread uses does not really make sense.
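That said, HotSpot exposes a non-standard extension, com.sun.management.ThreadMXBean, which can report how many bytes a given thread has allocated on the heap over its lifetime. Note that this is cumulative allocation, not the thread's current footprint or its stack usage, and the cast below only works on JVMs that provide the extension:

    import java.lang.management.ManagementFactory;

    public class PerThreadAllocation {
        public static void main(String[] args) {
            // HotSpot-specific; not part of the standard java.lang.management API.
            com.sun.management.ThreadMXBean tmx =
                    (com.sun.management.ThreadMXBean) ManagementFactory.getThreadMXBean();
            long id = Thread.currentThread().getId();

            long before = tmx.getThreadAllocatedBytes(id);
            byte[] chunk = new byte[10 * 1024 * 1024];          // ~10 MB of heap allocation
            long after = tmx.getThreadAllocatedBytes(id);

            // Cumulative bytes allocated by this thread, not what it currently retains.
            System.out.println("allocated ~" + (after - before) + " bytes, chunk=" + chunk.length);
        }
    }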
I've been running into a peculiar issue with certain Java applications in the HP-UX environment.
The heap is set to -mx512, yet, looking at the memory regions for this Java process using gpm, it shows upwards of 1.6 GB of RSS memory, with 1.1 GB allocated to the DATA region. It grows quite rapidly over a 24-48 hour period and then slows down substantially, still growing about 2 MB every few hours. However, the Java heap shows no sign of leaking.
Curious how this was possible I researched a bit and found this HP write-up on memory leaks in java heap and c heap: http://docs.hp.com/en/JAVAPERFTUNE/Memory-Management.pdf
My question is: what determines what is allocated in the C heap vs. the Java heap, and for things that do not go through the Java heap, how would you identify those objects on the C heap? Additionally, does the Java heap sit inside the C heap?
Consider what makes up a Java process.
You have:
the JVM (a C program)
JNI Data
Java byte codes
Java data
Notably, they ALL live in the C heap (the JVM Heap is part of the C heap, naturally).
The Java heap contains simply the Java byte codes and the Java data. But what is also in the Java heap is "free space".
The typical (i.e. Sun) JVM only grows its Java heap as necessary, but never shrinks it. Once it reaches its defined maximum (-Xmx512M), it stops growing and deals with whatever is left. When that maximum heap is exhausted, you get the OutOfMemoryError.
What that Xmx512M option DOES NOT do, is limit the overall size of the process. It limits only the Java Heap part of the process.
For example, you could have a contrived Java program that uses 10 MB of Java heap but calls a JNI routine that allocates 500 MB of C heap. You can see how the process size would be large even though the Java heap is small. Also, with the new NIO libraries, you can allocate memory outside of the heap as well.
The other aspect that you must consider is that the Java GC is typically a "copying collector", which means it takes the "live" data from the memory it's collecting and copies it to a different section of memory. The empty space that the data is copied to IS NOT PART OF THE HEAP, at least not in terms of the Xmx parameter. It's, like, "the new heap", and becomes part of the heap after the copy (the old space is used for the next GC). If you have a 512 MB heap and it's at 510 MB, Java is going to copy the live data someplace. The naive thought would be that it needs another large open space (500+ MB): if all of your data were "live", then it would need a large chunk like that to copy into.
So, you can see that in the most extreme edge case, you need at least double the free memory on your system to handle a specific heap size. At least 1GB for a 512MB heap.
It turns out that's not the case in practice, and memory allocation and such is more complicated than that, but you do need a large chunk of free memory to handle the heap copies, and this impacts the overall process size.
Finally, note that the JVM does fun things like mapping the rt.jar classes into the VM to ease startup. They're mapped in a read-only block and can be shared across other Java processes. These shared pages will "count" against all Java processes, even though the physical memory is really only consumed once (the magic of virtual memory).
Now as to why your process continues to grow: if you never hit the Java OOM message, that means your leak is NOT in the Java heap, but that doesn't mean it isn't in something else (the JRE runtime, a third-party JNI library, a native JDBC driver, etc.).
In general, only the data in Java objects is stored on the Java heap, all other memory required by the Java VM is allocated from the "native" or "C" heap (in fact, the Java heap itself is just one contiguous chunk allocated from the C heap).
Since the JVM requires the Java heap (or heaps if generational garbage collection is in use) to be a contiguous piece of memory, the whole maximum heap size (-mx value) is usually allocated at JVM start time. In practice, the Java VM will attempt to minimise its use of this space so that the Operating System doesn't need to reserve any real memory to it (the OS is canny enough to know when a piece of storage has never been written to).
The Java heap, therefore, will occupy a certain amount of space in memory.
The rest of the storage will be used by the Java VM and any JNI code in use. For example, the JVM requires memory to store Java bytecode and constant pools from loaded classes, the result of JIT compiled code, work areas for compiling JIT code, native thread stacks and other such sundries.
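You can see one side of that split from inside the process: the Runtime figures describe only the Java heap, and comparing them against the RSS that gpm/top/ps reports shows how much the native side (JIT code, class data, thread stacks, JNI) is contributing. A quick sketch:

    public class HeapVsProcess {
        public static void main(String[] args) {
            Runtime rt = Runtime.getRuntime();
            // These figures cover only the Java heap (the -mx / -Xmx managed area).
            System.out.println("max heap:       " + rt.maxMemory());
            System.out.println("committed heap: " + rt.totalMemory());
            System.out.println("used heap:      " + (rt.totalMemory() - rt.freeMemory()));
            // The process RSS reported by the OS will be larger: it also includes
            // JIT-compiled code, class metadata, thread stacks, JNI allocations, etc.
        }
    }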
JNI code is just platform-specific (compiled) C code that can be bound to a Java object in the form of a "native" method. When this method is executed the bound code is executed and can allocate memory using standard C routines (eg malloc) which will consume memory on the C heap.
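A sketch of just the Java side of such a binding (the library and method names are hypothetical); the matching C implementation, written against the generated JNI header, would typically call malloc, and that memory would show up only in the C heap, never in the Java heap statistics:

    public class NativeBuffer {
        static {
            System.loadLibrary("nativebuffer");   // hypothetical JNI library
        }

        // Implemented in C; a typical body would 'return (jlong) malloc(size);'.
        // That allocation lives on the C heap and is invisible to -Xmx accounting.
        public static native long allocate(long size);

        // The C side would call free() on the pointer; nothing here is garbage collected.
        public static native void free(long address);
    }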
My only guess, given the figures you have provided, is a memory leak in the Java VM itself. You might want to try one of the other VMs listed in the paper you referred to. Another (much more difficult) alternative might be to compile the open Java on the HP platform.
Sun's Java isn't 100% open yet (they are working on it), but I believe there is one on SourceForge that is.
Java also thrashes memory, by the way. Sometimes it confuses OS memory management a little (you see it when Windows runs out of memory and asks Java to free some up: Java touches all its objects, causing them to be loaded in from the swap file, and Windows screams in agony and dies), but I don't think that's what you are seeing.