I haven't taken a deep dive into how Java manages memory while a program is running, as I have been working at the application level. I recently hit a case where I needed to know, owing to performance issues in an application.
I have been aware of the "stack" and "heap" regions of memory, and I thought this was the whole memory model of a Java program. However, it turns out there is much more to it.
For example, I came across terms like Eden, s0, s1, Old memory and so on. I was never aware of this terminology before.
Java has been changing, so these terms may or may not still be relevant as of Java 8.
Can anyone point me to where to find this information, and under what circumstances we need to know it? Are these areas part of main memory, that is, RAM?
Eden, s0, s1, Old memory and other memory areas exist only in the context of a specific garbage collector implementation. For example, generational collectors like G1 divide the heap into the areas mentioned, whereas non-generational collectors, such as ZGC before it gained a generational mode in JDK 21, do not.
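A quick way to see which areas your own JVM defines is to ask it at run time. The following is a minimal sketch using the standard java.lang.management API. The class name HeapPools is made up for illustration, and the pool names it prints (for instance "G1 Eden Space" under G1) depend on the JVM vendor, version and selected collector.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

// Illustrative class name; not part of any library.
public class HeapPools {

    // Names of the memory pools the running JVM classifies as heap.
    static List<String> heapPoolNames() {
        List<String> names = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == MemoryType.HEAP) {
                names.add(pool.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage(); // may be null if the pool is no longer valid
            long usedBytes = (usage == null) ? -1 : usage.getUsed();
            System.out.printf("%-16s %-32s used=%d bytes%n",
                    pool.getType(), pool.getName(), usedBytes);
        }
    }
}
```

Running it once with -XX:+UseG1GC and once with -XX:+UseSerialGC should show that the set of heap pools changes with the collector, while the application code stays the same.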
Start by reviewing the main garbage collectors in the JVM:
ParNew
CMS
G1
ZGC / Shenandoah / Azul C4
and then try to understand related concepts:
Thread-local allocation buffers (TLAB)
Escape analysis
String constant pools, string interning, string de-duplication
Permanent generation vs Metaspace
Object layout, e.g. why a boolean does not take just 1 bit (word tearing)
Native memory e.g. JNI or off-heap memory access
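Of the concepts listed, string interning is the easiest to observe directly. Here is a small sketch (class and method names are illustrative) showing that a string literal and an explicitly constructed copy are distinct objects until the copy is interned:

```java
// Illustrative class name; not part of any library.
public class InternDemo {

    // True only if both variables refer to the very same object.
    static boolean sameObject(String a, String b) {
        return a == b;
    }

    public static void main(String[] args) {
        String literal = "eden";          // literals are interned automatically
        String copy = new String("eden"); // explicitly creates a second object

        System.out.println(sameObject(literal, copy));          // false
        System.out.println(sameObject(literal, copy.intern())); // true
        System.out.println(literal.equals(copy));               // true
    }
}
```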
I don't believe that there is a single website that will explain the full JVM memory management approach.
Java, as defined by the Java Language Specification and the Java Virtual Machine Specification, talks about the stack and the heap (as well as the method area).
Those are the things that are needed to describe, conceptually, what makes a Java Virtual Machine.
If you wanted to implement a JVM you'd need to implement those in some way. They are just as valid in Java 13 as they were back in Java 1. Nothing has fundamentally changed about how those work.
The other terms you mentioned (as well as "old gen", "new gen", ...) are memory areas used in the implementation of specific garbage collection mechanisms, specifically those implemented in the Oracle JDK / OpenJDK.
All of those areas are basically specific parts of the heap. The exact way the heap is split into those areas is up to the garbage collector to decide and knowing about them shouldn't be necessary unless you want to tweak your garbage collector.
Since garbage collectors change between releases and new garbage collector approaches are implemented regularly (as this is one of the primary ways to speed up JVMs), the concrete terms used here will change over the years.
Related
ChronicleMap on OpenHFT's repository on Github states in their documentation:
Chronicle Map implements the java.util.concurrent.ConcurrentMap, that stores
its data off the java heap.
I've built a compiler and contributed to a few off-shoot languages' compiler implementations. The ones I've worked with allocate everything on the stack (that's what's available during code generation). I've never worked on the JVM or the Java compiler, but I do know that typically only the heap and stack are available to allocate instances of classes, local variables, function parameters, etc.
Could someone please explain how we're able to write code where we can tell the compiler to instantiate data structures such as ChronicleMap, have them available to garbage collection by the JVM (and be kept track of with the JVM's general memory management features), but live off the heap?
I've read up on the simple construction documentation and the associated example. I see the how, but the reasoning underlying what exactly is going on in conjunction with the JVM is unclear.
An important thing to remember is that the javac compiler doesn't do much in the way of optimisation, nor does it give you any means of specifying where data is stored or how code should be optimised (with a few obscure exceptions in Java 8, like @Contended).
Java derives much of its extensibility from libraries, which generally operate at runtime (there is often a build-time option as well). A key thing to realise is that a Java program can generate and alter code while it is running, so in fact much of the smarts happen at runtime.
In the case of off-heap usage, you need a library which supports this functionality, and this will directly or indirectly use sun.misc.Unsafe (on most popular JVMs). This class allows you to do many things the language doesn't support, and it is really useful to have if you are a low-level library builder.
While off-heap memory is not directly managed by the GC, you can have proxy objects, e.g. a ByteBuffer, which have a Cleaner so that when these objects are GC-ed the off-heap memory associated with them is also cleaned up.
Disclaimer: I wrote most of ChronicleMap.
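The proxy-object idea can be tried with nothing but the JDK. This is a minimal sketch using ByteBuffer.allocateDirect. It is not how ChronicleMap itself is implemented, and the class name OffHeapBuffer is made up for illustration.

```java
import java.nio.ByteBuffer;

// Illustrative class name; not part of any library.
public class OffHeapBuffer {

    // Stores a long in a direct buffer and reads it back.
    static long roundTrip(long value) {
        // The small ByteBuffer object lives on the heap; its backing storage
        // is native memory, typically outside the garbage-collected heap.
        ByteBuffer buffer = ByteBuffer.allocateDirect(Long.BYTES);
        buffer.putLong(0, value);
        return buffer.getLong(0);
    }

    // Confirms that allocateDirect really hands out a direct buffer.
    static boolean allocatesDirect() {
        return ByteBuffer.allocateDirect(16).isDirect();
    }

    public static void main(String[] args) {
        System.out.println("direct buffer: " + allocatesDirect());
        System.out.println("round trip:    " + roundTrip(42L));
    }
}
```

The GC only ever sees the small on-heap ByteBuffer object. Once that object becomes unreachable and is collected, the JDK releases the native memory behind it.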
The term off-heap refers to the ability to use "raw" memory buffers in Java. These may be regular memory buffers from the process address space, or memory-mapped files.
These buffers are "raw" - you manage their content yourself - they are not managed by the garbage collector.
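As a rough illustration of the memory-mapped variant, the sketch below maps a temporary file and reads and writes it through a MappedByteBuffer. The class name, method name and temp-file prefix are made up for this example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Illustrative class name; not part of any library.
public class MappedFileDemo {

    // Writes an int through a memory-mapped temp file and reads it back.
    static int writeAndReadBack(int value) {
        try {
            Path file = Files.createTempFile("offheap-demo", ".bin");
            file.toFile().deleteOnExit(); // best-effort cleanup when the JVM exits
            try (FileChannel channel = FileChannel.open(file,
                    StandardOpenOption.READ, StandardOpenOption.WRITE)) {
                // A READ_WRITE mapping grows the empty file to the requested size.
                MappedByteBuffer mapped =
                        channel.map(FileChannel.MapMode.READ_WRITE, 0, Integer.BYTES);
                mapped.putInt(0, value); // you decide the layout of the bytes yourself
                mapped.force();          // ask the OS to flush the change to the file
                return mapped.getInt(0);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(writeAndReadBack(7));
    }
}
```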
I have a Tomcat webapp which does some pretty memory- and CPU-intensive tasks on behalf of clients. This is normal and is the desired functionality. However, when I run Tomcat, memory usage skyrockets over time to upwards of 4.0GB, at which point I usually kill the process as it's messing with everything else running on my development machine.
I thought I had inadvertently introduced a memory leak with my code, but after checking into it with VisualVM, I'm seeing a different story.
VisualVM is showing the heap as taking up approximately a GB of RAM, which is what I set it to do with CATALINA_OPTS="-Xms256m -Xmx1024m".
Why is my system seeing this process as taking up a ton of memory when according to VisualVM, it's taking up hardly any at all?
After a bit of further sniffing around, I'm noticing that if multiple jobs are running simultaneously in the application, memory does not get freed. However, if I wait for each job to complete before submitting another to my BlockingQueue serviced by an ExecutorService, then memory is recycled effectively. How can I debug this? Why would garbage collection/memory reuse differ?
You can't control what you want to control. -Xmx only controls the Java heap; it doesn't control the JVM's consumption of native memory, which varies by implementation. VisualVM only shows what the heap is consuming; it doesn't show what the entire JVM consumes in native memory as an OS process. You will have to use OS-level tools to see that, and they will report radically different numbers, usually much larger than anything VisualVM reports, because the JVM uses native memory in an entirely different way.
From the article Thanks for the Memory (Understanding How the JVM Uses Native Memory on Windows and Linux):
Maintaining the heap and the garbage collector uses native memory you can't control.
More native memory is required to maintain the state of the
memory-management system maintaining the Java heap. Data structures
must be allocated to track free storage and record progress when
collecting garbage. The exact size and nature of these data structures
varies with implementation, but many are proportional to the size of
the heap.
and the JIT compiler uses native memory just like javac would
Bytecode compilation uses native memory (in the same way that a static
compiler such as gcc requires memory to run), but both the input (the
bytecode) and the output (the executable code) from the JIT must also
be stored in native memory. Java applications that contain many
JIT-compiled methods use more native memory than smaller applications.
and then you have the classloader(s) which use native memory
Java applications are composed of classes that define object structure
and method logic. They also use classes from the Java runtime class
libraries (such as java.lang.String) and may use third-party
libraries. These classes need to be stored in memory for as long as
they are being used. How classes are stored varies by implementation.
I won't even start quoting the section on threads. I think you get the idea: -Xmx doesn't control what you think it controls. It controls the JVM heap, not everything goes in the JVM heap, and the heap takes up way more native memory than what you specify, for management and bookkeeping.
Plain and simple the JVM uses more memory than what is supplied in -Xms and -Xmx and the other command line parameters.
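The gap between the heap and the rest can be made visible from inside the JVM. This is a minimal sketch using the standard MemoryMXBean; the class and method names are made up for illustration. Note that the "non-heap" figure reported here covers areas the JVM tracks as memory pools (such as class metadata and compiled code on HotSpot) and is still not the whole process footprint: thread stacks and GC bookkeeping, for example, are not included.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;

// Illustrative class name; not part of any library.
public class HeapVsNonHeap {

    static long heapUsedBytes() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        return memory.getHeapMemoryUsage().getUsed();
    }

    static long nonHeapUsedBytes() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        return memory.getNonHeapMemoryUsage().getUsed();
    }

    // Roughly the limit set by -Xmx (or the JVM's default when it is not set).
    static long maxHeapBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    public static void main(String[] args) {
        System.out.println("max heap      : " + maxHeapBytes() + " bytes");
        System.out.println("heap used     : " + heapUsedBytes() + " bytes");
        System.out.println("non-heap used : " + nonHeapUsedBytes() + " bytes");
    }
}
```

Comparing these numbers with what an OS-level tool reports for the same process gives a feel for how much memory sits outside the reach of -Xmx.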
Here is a very detailed article on how the JVM allocates and manages memory. It isn't as simple as you might expect based on the assumptions in your question, and it is well worth a comprehensive read.
Thread stack size in many implementations has minimum limits that vary by operating system and sometimes JVM version; the thread stack setting is ignored if you set the limit below the native OS limit for the JVM or the OS (ulimit on *nix has to be set instead sometimes). Other command-line options work the same way, silently defaulting to higher values when values that are too small are supplied. Don't assume that all the values passed in represent what is actually used.
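The "requested is not necessarily used" behaviour can be probed for thread stacks with the four-argument Thread constructor, whose stack size parameter is documented as a hint that a platform may round or ignore. The sketch below is only an experiment, not a measurement tool: the class name and the request sizes are arbitrary choices for illustration, and the depths it prints will differ between machines and JVM versions.

```java
// Illustrative class name; not part of any library.
public class StackSizeHint {

    // Recurses until the thread's stack is exhausted, counting frames.
    private static void recurse(int[] depth) {
        depth[0]++;
        recurse(depth);
    }

    // Starts a thread with the requested stack size and reports how deep it got.
    static int depthReached(long requestedStackBytes) {
        int[] depth = {0};
        Runnable probe = () -> {
            try {
                recurse(depth);
            } catch (StackOverflowError expected) {
                // stack exhausted; depth[0] now holds the number of frames reached
            }
        };
        Thread thread = new Thread(null, probe, "stack-probe", requestedStackBytes);
        thread.start();
        try {
            thread.join(); // join() also makes the probe's writes visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return depth[0];
    }

    public static void main(String[] args) {
        long[] requests = {16L * 1024, 512L * 1024, 8L * 1024 * 1024};
        for (long request : requests) {
            System.out.println(request + " bytes requested -> depth " + depthReached(request));
        }
    }
}
```

If a very small request yields about the same depth as a larger one, the JVM or OS has most likely substituted its own minimum.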
The Classloaders, and Tomcat has more than one, eat up lots of memory that isn't documented easily. The JIT eats up a lot of memory, trading space for time, which is a good trade off most of the time.
You should also check CPU usage and the garbage collector.
It is possible that garbage collection pauses, and the CPU time the GC consumes, further slow down your machine.
Can someone give me some advice on this? I am reading in an old text and some notes from my teacher that when using multiple threads with Java it's necessary to write a special program for garbage collection.
Does this still apply in Java SE 6 and above? If it does, could someone provide the standard way to do this?
Using a garbage collector makes writing multi-threaded code easier. This is because manual freeing of resources in a multi-threaded context is hard to get right. With GC it's something you don't need to worry about most of the time.
I am reading that when using multiple threads it's necessary to write a special program for garbage collection.
I don't believe this was ever the case.
Does this still apply in SE6 and above and if so is there a standard way to do this.
The standard way to do this is to not reference objects you don't need. e.g. if you have a local variable you don't need, let it drop out of scope.
It doesn't have to be complicated.
As far as I know, as long as nothing is pointing to an object, that object gets freed by the garbage collector.
Java's garbage collector is very robust in terms of circular references, and I don't see why it wouldn't work with multiple threads running at the same time.
So it is safe for you to assume that you don't need to write a special program for garbage collection, because Java will do it for you very effectively.
If you want to free objects in Java, just make sure that no variables are referencing your object (including structures such as lists and arrays from the Java collections or other libraries).
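To make "nothing is referencing the object" concrete, here is a small sketch that uses a WeakReference as a probe. The class and method names are made up for illustration. System.gc() is only a hint to the JVM, so the last line may print either result; the point is that nothing beyond dropping the reference is required.

```java
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.List;

// Illustrative class name; not part of any library.
public class Reachability {

    // An object still held by a collection must survive a GC cycle.
    static boolean heldObjectSurvivesGc() {
        List<byte[]> cache = new ArrayList<>();
        cache.add(new byte[1024]);
        WeakReference<byte[]> probe = new WeakReference<>(cache.get(0));
        System.gc();
        boolean alive = probe.get() != null;
        // The list is used again here, so it (and its element) stays reachable above.
        return alive && cache.size() == 1;
    }

    public static void main(String[] args) {
        List<byte[]> cache = new ArrayList<>();
        cache.add(new byte[1024 * 1024]);
        WeakReference<byte[]> probe = new WeakReference<>(cache.get(0));

        System.out.println("reachable while cached: " + (probe.get() != null));

        cache.clear(); // drop the only strong reference
        System.gc();   // a hint only; the JVM decides when to collect
        System.out.println("collected after hint:   " + (probe.get() == null));
    }
}
```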
This article from JavaWorld in 2003, J2SE 1.4.1 boosts garbage collection, has this to say about the Java garbage collection prior to J2SE 1.4.1:
Mark and sweep is a "stop-the-world" garbage collection technique;
that is, all application threads stop until garbage collection
completes, or until a higher-priority thread interrupts the garbage
collector. If the garbage collector is interrupted, it must restart,
which can lead to application thrashing with little apparent result.
The other problem with mark and sweep is that many types of
applications can't tolerate its stop-the-world nature. That is
especially true of applications that require near real-time behavior
or those that service large numbers of transaction-oriented clients.
An article in Dr. Dobbs from 2009, G1: Java's Garbage First Garbage Collector, has this to say about Java garbage collector before SE 6.
Until recently, Java SE came with two main collectors: the parallel
collector, and the concurrent-mark-sweep (CMS) collector -- see the
sidebar Parallelism and Concurrency. As of the latest Java SE 6 update
release, the G1 collector is another option. The plan is for G1 to
eventually replace CMS as a low-pause, soft real-time collector. Let's
take a look at how it works.
So prior to SE 6, some additional precautions to assist Java garbage collection may have helped, especially with multi-threaded applications with a fair number of temporary variables generating garbage that needed collecting. However, this should entail at most an explicit call to the garbage collector during slow times. Writing something special would seem very unusual.
However, things are much improved compared with how they were. In addition, garbage collection can vary between versions of the Java Virtual Machine.
So what may have been true years ago is almost definitely not true now with current technology.
This posting, How to monitor Java memory usage?, discusses monitoring Java memory usage as well as some of the pros and cons of calling the garbage collector explicitly.
Oracle has a Java Garbage Collection Basics tutorial that covers Java SE 7 Hotspot JVM.
Use the following code to call the garbage collector explicitly:
Runtime runtime = Runtime.getRuntime();
runtime.gc();
But it is not needed; the JVM will automatically run the GC at appropriate times.
Almost certainly your instructor's notes are stating (correctly) that since Java is a multithreaded environment, more care is needed when implementing the garbage collector inside the Java run time environment than would be necessary if only a single thread were involved. This is true of any multithreaded environment.
As others have said, you the programmer don't see any of this complexity. That's the gift of automatic memory management that gc provides.
Almost everyone eventually runs into GC issues with Java.
Is there a cookbook guide or semi-automated tool to tune GC for Java?
My rationale is this:
Almost anyone eventually has these problems
There are many possible factors (say 20) out of which only a few affect your problem.
Most people don't know how to identify the key factors so GC tuning is more like a black art than a science.
Not everyone uses a HotSpot VM. Different Sun versions have different GC characteristics.
There is little incentive to experiment (like run the VM with slightly different settings every day to see how they play out).
So the question really is: Is there something that I can use in a check-list manner? Or maybe even a tool that analyzes GC logs or heap dumps and gives me specific hints where to look (instead of telling me "95% of the data is allocated in objects of the type byte[]" which is basically useless).
Related questions:
Appropriate Tomcat 5.5 start-up parameters to tune JVM for extremely high demand, large heap web application? which is very specific. My question is broader.
What are the best garbage collection settings for client side? Again very narrow scope
Does anyone know of a good guide to configure GC in Java? HotSpot only
JVM memory management & garbage collection book? is 80% there but I'm missing the checklist/cookbook/for-dummies approach.
From various resources I have compiled a sanity checklist that I use to analyze the GC behavior and performance of my applications. These guidelines are general and apply to any vendor-specific JVM, but they also contain HotSpot-specific information for illustration.
Disable Explicit GC. Explicit GC is a bad coding practice, it never helps. Use -XX:+DisableExplicitGC.
Enable Full GC logging. Lightweight yet powerful.
Compute Live Data Set, Allocation Rate, and Promotion Rate. This will tell you if you need a bigger heap, or if e.g. your Young Gen is too small, or if your Survivor spaces are overflowing, etc.
Compute total GC time, it should be <5% of total running time.
Use -XX:+PrintTenuringDistribution -XX:+UnlockDiagnosticVMOptions -XX:+LogVMOutput -XX:LogFile=jvm.log -XX:+HeapDumpOnOutOfMemoryError -Xloggc:gc.log -XX:+PrintGCTimeStamps -XX:+PrintGCDetails -showversion
Consider additional means of collecting information about your GC. Logging is fine, but sometimes lightweight command-line tools are available that will give you even more insight, e.g. jstat for HotSpot, which will show you the occupancy/capacity of Eden, Survivor and Old Gen.
Collect Class Histograms. These are lightweight and will show you the content of the heap. You can take snapshots whenever you notice some strange GC activity, or you can take them before/after Full GC:
Content of the OldGen space: You can find out which objects reside in the OldGen. You need to print histograms before and after Full GC. And since a YoungGen collection is executed before the Full GC, these Histograms will show you the content of the Old generation. Use -XX:+PrintClassHistogramBeforeFullGC -XX:+PrintClassHistogramAfterFullGC.
Detecting prematurely promoted objects: To determine if any instances are promoted early, you need to study the Histograms to see which classes are expected to reside in the OldGen and which classes should be seen only in the YoungGen. This cannot be done automatically, you need to reason about the purpose of each class and its instance to determine if the object is temporary or not.
Consider a different GC algorithm. VMs usually come with several different GC implementations that provide various tradeoffs: throughput, footprint, pause-less/short pauses, real-time, etc. Consider the options you have and pick the one that suits your needs.
Beware of finalize(). Check that GC keeps up with classes using finalize(). The execution of this method may be quite costly and this can impact GC and application throughput.
Heap Dumps. This is the first step that is heavyweight and will impact the running application. Collect the Heap Dump to further study the heap content or to confirm a hypothesis observed in step 4.
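The "total GC time below 5% of running time" check from the list above can be approximated from inside the application. This is a rough sketch using the standard GarbageCollectorMXBean and RuntimeMXBean; the class and method names are made up for illustration. getCollectionTime() is documented as an approximate accumulated elapsed time, and for concurrent collectors it is not the same thing as application pause time, so treat the result as an indicator and cross-check it against the GC log.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Illustrative class name; not part of any library.
public class GcOverhead {

    // Sum of the approximate collection times reported by all collectors, in ms.
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 if the collector does not report it
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    // GC time as a percentage of JVM uptime.
    static double gcOverheadPercent() {
        long uptimeMillis = ManagementFactory.getRuntimeMXBean().getUptime();
        return uptimeMillis <= 0 ? 0.0 : 100.0 * totalGcMillis() / uptimeMillis;
    }

    public static void main(String[] args) {
        System.out.printf("GC time: %d ms, about %.2f%% of uptime%n",
                totalGcMillis(), gcOverheadPercent());
    }
}
```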
Resources used:
Books:
Java Performance - practical guide
The Garbage Collection Handbook - theory explained
Talks/Articles:
Java One 2012 Advanced JVM Tuning
From Java code to Java heap
Java One 2012 G1 Garbage Collector Performance Tuning
Garbage Collection Tuning Guide
Mailing Lists:
OpenJDK Hotspot GC Use
References for various GC information:
Oracle
Tuning Garbage Collection with the 5.0 Java[tm] Virtual Machine
and this also
Java SE 6 HotSpot[tm] Virtual Machine Garbage Collection Tuning
IBM
Fine Tuning Garbage Collection [link dead]
Extensible Verbose Toolkit
SAP JVM
Memory Management (Garbage Collection)
Detecting Memory Leaks
Detecting Hanging / Looping VMs
Analyzing Out-of-Memory Situations
Sorry I don't know much about SAP but have provided some things I have found.
As for a cookbook, tuning is most likely application specific at this level, but it is an interesting topic.
ADDENDUM
You also mentioned analysis tools. Some candidates are listed here:
Know of any Java garbage collection log analysis tools?
Is the garbage collection algorithm in Java "vendor implemented"?
From the introduction paragraph to Chapter 3 of the Java Virtual Machine Specification:
For example, the memory layout of
run-time data areas, the
garbage-collection algorithm used, and
any internal optimization of the Java
virtual machine instructions (for
example, translating them into machine
code) are left to the discretion of
the implementor. [emphasis mine]
Yes, and not only that, each JVM can contain more than one garbage collection strategy:
Sun
JRockit
IBM
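To check which collectors a particular JVM is actually running with, you can query it through the standard management API. This is a minimal sketch; the class name is made up for illustration, and the collector names printed are whatever the vendor chose to report.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative class name; not part of any library.
public class ListCollectors {

    // Names of the garbage collectors active in the running JVM.
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Each collector reports which memory pools it manages.
            System.out.println(gc.getName() + " manages "
                    + Arrays.toString(gc.getMemoryPoolNames()));
        }
    }
}
```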
Definitely vendor dependent. GCJ and the Sun VM use totally different garbage collectors, for example.
Yes. The Java VM spec doesn't say anything specific about garbage collection. Each vendor has its own implementation for performing GC. In fact, each vendor will have multiple GC policies, from which the best one can be chosen for a particular task.
Example
A GC tuned for throughput may not be good for real-time systems, since it will have erratic (and often longer) pause times which are not predictable. Unpredictability is a killer for real-time applications.
Some GCs, such as the ones from Oracle and IBM, are very tunable and can be tuned based on your application's run-time memory characteristics.
The internals of GC are not too complicated at a higher level. Many algorithms that began in the early days of LISP are still in use today.
Read this (http://nd.edu/~dthain/courses/cse40243/spring2006/gc-survey.pdf "GC Introduction") for a good introduction to Garbage Collection at a moderately high-level.
Yes. The Java Virtual Machine Specification doesn't say anything specific about garbage collection. Each vendor has its own implementation for performing the task.
Each one calls the garbage collector automatically, so we don't need manual calls for garbage collection.