In the spirit of question Java: Why does MaxPermSize exist?, I'd like to ask why the Oracle JVM uses a fixed upper limit for the size of its memory allocation pool.
The default is 1/4 of your physical RAM (with upper and lower limits); as a consequence, if you have a memory-hungry application you have to manually change the limit (parameter -Xmx), or your app will perform poorly, possibly even crash with an OutOfMemoryError.
Why does this fixed limit even exist? Why does the JVM not allocate memory as needed, like native programs do on most operating systems?
This would solve a whole class of common problems with Java software (just Google to see how many hints there are on the net about solving problems by setting -Xmx).
Edit:
Some answers point out that this will protect the rest of the system from a Java program with a run-away memory leak; without the limit, this would bring the whole system down by exhausting all memory. This is true. However, it is equally true for any other program, and modern OSes already let you limit the maximum memory for a program (Linux ulimit, Windows "Job Objects"). So this does not really answer the question, which is "Why does the JVM do it differently from most other programs / runtime environments?".
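For illustration (the value and class name are placeholders, assuming a bash-like shell), such an OS-level cap could look like this:
ulimit -v 1048576
java MyApp
Everything the JVM tried to allocate beyond roughly 1 GB of virtual memory would then simply fail, just as it would for any native program.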
Why does this fixed limit even exist? Why does the JVM not allocate memory as needed, like native programs do on most operating systems?
The reason is NOT that the GC needs to know beforehand what the maximum heap size can be. The JVM is clearly capable of expanding its heap ... up to the maximum ... and I'm sure it would be a relatively small change to remove that maximum. (After all, other Java implementations do this.) And it would equally be possible to have a simple way to say "use as much memory as you like" to the JVM.
I'm sure that the real reason is to protect the host operating system against the effects of faulty Java applications using all available memory. Running with an unbounded heap is potentially dangerous.
Basically, many operating systems (e.g. Windows, Linux) suffer serious performance degradation if some application tries to use all available memory. On Linux for example, the system may thrash badly, resulting in everything on the system running incredibly slowly. In the worst case, the system won't be able to start new processes, and existing processes may start crashing when the operating system refuses their (legitimate) requests for more memory. Often, the only option is to reboot.
If the JVM ran with an unbounded heap by default, any time someone ran a Java program with a storage leak ... or that simply tried to use too much memory ... they would risk bringing down the entire operating system.
In summary, having a default heap bound is a good thing because:
it protects the health of your system,
it encourages developers / users to think about memory usage by "hungry" applications, and
it potentially allows GC optimizations. (As suggested by other answers: it is plausible, but I cannot confirm this.)
EDIT
In response to the comments:
It doesn't really matter why Sun's JVMs live within a bounded heap while other applications don't. They do, and the advantages of doing so are (IMO) clear. Perhaps a more interesting question is why other managed languages don't put a bound on their heaps by default.
The -Xmx and ulimit approaches are qualitatively different. In the former case, the JVM has full knowledge of the limits it is running under and gets a chance to manage its memory usage accordingly. In the latter case, the first thing a typical C application knows about it is when a malloc call fails. The typical response is to exit with an error code (if the program checks the malloc result), or die with a segmentation fault. OK, a C application could in theory keep track of how much memory it has used, and try to respond to an impending memory crisis. But it would be hard work.
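To illustrate the difference (a sketch of my own, not from the original answer): a Java program can query its own heap bound through the standard Runtime API and react before allocations start failing, which is much harder for a C program running under ulimit:

public class HeapAwareness {
    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long maxBytes = rt.maxMemory();                      // the -Xmx bound
        long usedBytes = rt.totalMemory() - rt.freeMemory(); // currently occupied heap
        System.out.printf("using %d of %d MB%n", usedBytes >> 20, maxBytes >> 20);
        // e.g. an application cache could evict entries when usedBytes
        // approaches maxBytes, instead of dying on a failed allocation
    }
}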
The other thing that is different about Java and C/C++ applications is that the former tend to be both more complicated and longer running. In practice, this means that Java applications are more likely to suffer from slow leaks. In the C/C++ case, the fact that memory management is harder means that developers don't attempt to build single applications of that complexity. Rather, they are more likely to build (say) a complex service by having a listener process fork off child processes to do stuff ... and then exit. This naturally mitigates the effect of memory leaks in the child process.
The idea of a JVM responding "adaptively" to requests from the OS to give memory back is interesting. But there is a BIG problem. In order to give a segment of memory back, the JVM first has to clear out any reachable objects in the segment. Typically that means running the garbage collector. But running the garbage collector is the last thing you want to do if the system is in a memory crisis ... because it is pretty much guaranteed to generate a burst of virtual memory paging.
Hm, I'll try summarizing the answers so far.
There is no technical reason why the JVM needs to have a hard limit for its heap size. It could have been implemented without one, and in fact many other dynamic languages do not have this.
Therefore, giving the JVM a heap size limit was simply a design decision by the implementors. Second-guessing why this was done is a bit difficult, and there may not be a single reason. The most likely reason is that it helps protect a system from a Java program with a memory leak, which might otherwise exhaust all RAM and cause other apps to crash or the system to thrash.
Sun could have omitted the feature and simply told people to use the OS-native resource limiting mechanisms, but they probably wanted to always have a limit, so they implemented it themselves.
At any rate, the JVM needs to be aware of any such limit (to adapt its GC strategy), so using an OS-native mechanism would not have saved much programming effort.
Also, there is one reason why such a built-in limit is more important for the JVM than for a "normal" program without GC (such as a C/C++ program):
Unlike a program with manual memory management, a program using GC does not really have a well-defined memory requirement, even with fixed input data. It only has a minimum requirement, i.e. the sum of the sizes of all objects that are actually live (reachable) at a given point in time. However, in practice a program will need additional memory to hold dead, but not yet GCed objects, because the GC cannot collect every object right away, as that would cause too much GC overhead. So GC only kicks in from time to time, and therefore some "breathing room" is required on the heap, where dead objects can await the GC.
This means that the memory required for a program using GC is really a compromise between saving memory and having good throughput (by letting the GC run less often). So in some cases it may make sense to set the heap limit lower than what the JVM would use if it could, to save RAM at the expense of performance. To do this, there needs to be a way to set a heap limit.
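For instance (values purely illustrative), a deliberately tight configuration that trades GC overhead for RAM could be started as
java -Xms128m -Xmx256m MyBatchJob
where MyBatchJob is a placeholder; the GC will run more often, but the process will never claim more than 256 MB of heap.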
I think part of it has to do with the implementation of the Garbage Collector (GC). The GC is typically lazy, meaning it will only start really trying to reclaim memory internally when the heap is at its maximum size. If you didn't set an upper limit, the runtime would happily continue to inflate until it used every available bit of memory on your system.
That's because from the application's perspective, it's more performant to take more resources than exert effort to use the resources you already have to full utilization. This tends to make sense for a lot of (if not most) uses of Java, which is a server setting where the application is literally the only thing that matters on the server. It tends to be slightly less ideal when you're trying to implement a client in Java, which will run amongst dozens of other applications at the same time.
Remember that with native programs, the programmer typically requests but also explicitly cleans up resources. That isn't typically true with environments that do automatic memory management.
It is due to the design of the JVM. Other JVMs (like the one from Microsoft and some IBM ones) can use all the memory available in the system if needed, without an arbitrary limit.
I believe it allows for GC-optimizations.
I think that the upper limit for memory is linked to the fact that the JVM is a VM.
As any physical machine has a given (fixed) amount of RAM, so the VM has one.
The maximal size makes the JVM easier to manage by the operating system and ensures some performance gains (less swapping).
Sun's JVM also works on quite limited hardware architectures (embedded ARM systems), and there the management of resources is crucial.
One answer that no-one above gave is that the JVM uses both heap and non-heap memory pools. Putting an upper limit on the heap defines not only how much memory is available for the heap memory pools, but it also defines how much memory is available for NON-HEAP usages. I suppose that the JVM could just allocate non-heap at the top of virtual memory and heap at the bottom of virtual memory and grow both toward each other.
Non-heap memory includes the DLLs or SOs that comprise the JVM and any native code being used as well as compiled Java code, thread stacks, native objects, PermGen (meta-data about compiled classes), among other uses. I've seen Java programs crash because so much memory was given to the heap that the application ran out of non-heap memory. This is where I learned that it can be important to reserve memory for non-heap usages by not setting the heap to be too large.
This makes a much bigger difference in a 32-bit world where an application often has only 2GB of virtual address space than it does in a 64-bit world, of course.
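As a side note (my own sketch, not part of the original answer), the heap/non-heap split described above can be observed at runtime through the standard java.lang.management API:

import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

public class HeapVsNonHeap {
    public static void main(String[] args) {
        MemoryMXBean mem = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = mem.getHeapMemoryUsage();       // bounded by -Xmx
        MemoryUsage nonHeap = mem.getNonHeapMemoryUsage(); // PermGen, code cache, etc.
        System.out.println("heap used/max:     " + heap.getUsed() + " / " + heap.getMax());
        System.out.println("non-heap used/max: " + nonHeap.getUsed() + " / " + nonHeap.getMax());
        // getMax() may return -1 if the pool has no defined maximum
    }
}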
Would it not make more sense to separate the upper bound that triggers GC and the maximum that can be allocated? Once the memory allocated hits the upper bound, GC can kick in and release some memory to the free pool.
Sort of like how I clean the desk that I share with my co-worker: I have a large desk, and my threshold for how much junk I can tolerate on the table is much less than the size of my desk. I don't need to fill up every available inch before I garbage collect.
I could also return some of the desk space that I am using to my co-worker, who is sharing my desk ... I understand JVMs don't return memory back to the system after they've allocated it to themselves, but it does not have to be that way, no?
It does allocate memory as needed, up to -Xmx ;)
One reason I can think of is that once the JVM allocates an amount of memory for its heap, it will never let it go. So if your heap has no upper bound, the JVM may just grab all the free memory on the system and then never let it go.
The upper bound also tells the JVM when it needs to do a full garbage collection. If your app is still under the upper bound, the JVM will postpone garbage collection and let the memory footprint of your application grow.
Native programs can die due to out-of-memory errors as well, since native applications also have a memory limit: the memory available on the system minus the memory already held by other applications.
The JVM also needs a contiguous block of system memory in order for garbage collection to be performed efficiently.
EDIT
Contiguous memory claim or here
The JVM will apparently let some memory go, but it is rare with the default configuration.
I am looking into a potential memory leak (or at least memory waste) in a largish Java-based system. The JVM is running with a maximum heap size of 5 GB, and 2-3 GB heap usage is an expected baseline for the application. (There can be peaks that are higher.)
In an overload scenario which I am investigating, the heap gets filled up. Analyzing a heap dump with the Eclipse Memory Analyzer Tool (MAT) shows (no surprise) that the heap is entirely used up.
MAT shows 2 potential leak candidates, both roughly retaining 2.5GB: java.lang.Thread and a domain object from the system which is used extensively during transaction processing in the system. All these domain objects are however (no surprise) reachable from the Thread instances. Those threads are processing the transactions, after all. Thus, the 2.5 GB attributed to java.lang.Thread is almost entirely caused by those domain objects. No surprise here.
Listing the object tree of all java.lang.Thread instances and summing up the retained heap of all threads results in 2.5 GB of retained heap.
Where should I look for the other 2.5 GB that are needed to fill up the heap, if they are not reachable from an instance of java.lang.Thread?
- There is nothing in the finalizer queue
- There is not a significant amount of unreachable objects pending GC
I think another way to put this question is: "How do I find all objects that are not reachable from an instance of java.lang.Thread? Maybe with an OQL query?" And the other question: "What kind of objects are there that are not reachable from an instance of java.lang.Thread, other than objects in the finalizer queue and unreferenced objects pending GC?"
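For what it's worth, a MAT OQL query can at least enumerate all instances of a suspect class as a starting point (com.example.DomainObject is a placeholder; as far as I know, OQL has no direct "not reachable from X" predicate, so the thread-retained set would still have to be subtracted by hand):
SELECT * FROM INSTANCEOF com.example.DomainObject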
I too faced a problem with memory leaks at our site.
Use the YourKit Java Profiler, which provides lots of information; with it you can get a much wider picture of where all the memory is being utilized.
You can find a great tutorial, Find Java Memory Leaks, for the above tool.
Your question,
"What kind of Objects are there that are not reachable from an instance of java.lang.Thread other then Objects in the Finalizer Queue and unreferenced objects pending GC?"
There are four kinds of objects (see the small code sketch after this list):
Strongly reachable: objects that can be reached directly via references from live objects.
Weakly/softly reachable: objects that have a weak/soft reference associated with them.
Pending finalization: objects that are pending finalization and whose reference can be reached through the finalizer queue.
Unreachable: objects that are unreachable from the GC roots, but not yet collected.
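A minimal sketch of the first two categories in code (my own illustration; the printed result is only "likely", since System.gc() is merely a hint to the JVM):

import java.lang.ref.WeakReference;

public class ReachabilityDemo {
    public static void main(String[] args) {
        Object strong = new Object();              // strongly reachable
        WeakReference<Object> weak =
                new WeakReference<>(new Object()); // referent is only weakly reachable
        System.gc();                               // a hint, not a guarantee
        System.out.println("weak referent: " + weak.get()); // likely null after GC
        System.out.println("strong object: " + strong);     // still alive
    }
}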
Besides these, the JVM also uses native memory, about which you can find information in IBM's "Heap and native memory use by the JVM" and "Thanks for the memory". According to YourKit, the JVM memory structure also includes non-heap memory, whose definition according to them is:
Also, the JVM has memory other than the heap, referred to as non-heap memory. It is created at the JVM startup and stores per-class structures such as runtime constant pool, field and method data, and the code for methods and constructors, as well as interned Strings.
Since the extra memory is not showing in MAT, it's hard to know what to suggest. My apologies if some (or even most) of this is things you already know; I've just tried to pull together everything I could think of.
FindBugs
FindBugs is a static analysis tool that will scan your code looking for common anti-patterns and problems and giving you a nice report on them. It does pick up on a lot of causes of potential memory and resource leaks.
Manual dump
You could try using something like jmap or VisualVM to take a heap dump manually for analysis and see if you get different results from letting Eclipse do it:
http://docs.oracle.com/javase/1.5.0/docs/tooldocs/share/jmap.html
http://java.dzone.com/articles/java-heap-dump-are-you-task
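For example, on a Sun/Oracle JVM a dump of a running process can be taken like this (the PID 1234 and the file name are placeholders):
jmap -dump:format=b,file=heap.hprof 1234
The resulting heap.hprof file can then be opened in MAT or VisualVM for comparison.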
Analyzer Quirks
The memory analyzer FAQ:
http://wiki.eclipse.org/MemoryAnalyzer/FAQ
says:
Symptom: When monitoring the memory usage interactively, the used heap size is much bigger than what MAT reports.
During the index creation, the Memory Analyzer removes unreachable objects because the various garbage collector algorithms tend to leave some garbage behind (if the object is too small, moving and re-assigning addresses is too expensive). This should, however, be no more than 3 to 4 percent. If you want to know what objects are removed, enable debug output as explained here: MemoryAnalyzer/FAQ#Enable_Debug_Output
Another reason could be that the heap dump was not written properly. Especially older VMs (1.4, 1.5) can have problems if the heap dump is written via jmap.
Enabling debug output will allow you to see what is going on there and confirm there is nothing odd in that area.
Some of these tips may be relevant
http://eclipsesource.com/blogs/2013/01/21/10-tips-for-using-the-eclipse-memory-analyzer/
Use JProfiler and break the heap object count down by class - find which class has lots of instances and start your hunt there.
You can also take a couple of snapshots a short time apart and compare the two heap dumps to see what objects were created during that time. This is particularly handy if you know that a certain action is causing the problem and you want to ignore all the background JVM object noise and just examine the delta.
I have used it with great success to find a memory leak. It isn't free, but it's worth the licence fee.
FYI: I have no affiliation with JProfiler.
Since the extra memory is not showing in MAT it's hard to know what to suggest.
That isn't true. MAT does show unreachable objects. Just go to the Preferences and select the checkbox enabling this option. After a MAT restart you will see these objects with details. Of course, the GC roots will not be available.
Maybe you should look for memory leaks in the database connector code, or maybe in the ORM. If you are using a raw connection library and you don't close cursors, you can get a memory leak. My second thought is also related to the database connector: some of them (maybe not yours) use native code underneath, and this can be the source of such a leak. Given the heavy concurrent usage, that makes sense to me. You can check that if you want.
I use the NetBeans profiler (which is actually an embedded VisualVM) to monitor the memory consumption of my Java applications. I use the heap view, the surviving generation view, and memory dumps to track memory leaks.
The heap view shows the total used memory, but it's a bit chaotic, due to the way the garbage collector manages the memory. The graph is essentially sawtooth-shaped, and thus not particularly readable. Sometimes, I force the GC to happen, so that I can have a more precise value of the real memory consumption.
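(Besides the profiler's "Perform GC" button, a collection can also be requested from code via System.gc(); note that this is only a hint to the JVM, not a guarantee that a full collection actually runs.)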
I was wondering: is there a garbage collector which is more appropriate for memory profiling, and which would yield a heap graph closer to the real memory usage? Or, more generally, what JVM settings (-XX options or other) can I use in order to efficiently track memory leaks?
What you are seeing in your graph is the real behavior of your application's memory utilization. The repeated sawtooth pattern is likely due to allocations of short-lived objects which are being scavenged. If you believe you have a memory leak, take a heap dump snapshot and see what objects are being retained in the heap. You can take a snapshot using JConsole and open the resulting dump file using HPjmeter.
I suggest you use the GC you intend to use without the profiler. Using this approach you will get a graph which is more like how the application will behave, though not always as readable.
If you want a graph which is more readable, but not as realistic, you can increase the minimum memory size to, say, 1 GB. This will result in fewer GCs and a less spiky graph, but may not help you beyond making the graph prettier.
I have heard several people claiming that you cannot scale the JVM heap size up. I've heard claims of the practical limit being 4 gigabytes (I heard an IBM consultant say that), 10 gigabytes, 32 gigabytes, and so on... I simply cannot believe any of those numbers and have been wondering about the issue now for a while.
So, I have a three-part question which I hope someone with experience could answer:
Given the following case how would you tune the heap and GC settings?
Would there be noticeable hiccups (pauses of the JVM, etc.) that would be noticed by the end users?
Should this really still work? I think it should.
The case:
64 bit platform
64 cores
64 gigabytes of memory
The application server is client facing (i.e. a JBoss/Tomcat web application server) - complete pauses of the JVM would probably be noticed by end users
Sun JVM, probably 1.5
To prove I am not asking you guys to do my homework this is what I came up with:
-XX:+UseConcMarkSweepGC -XX:+AggressiveOpts -XX:+UnlockDiagnosticVMOptions -XX:-EliminateZeroing -Xmn768m -Xmx55000m
CMS should reduce the amount of pauses, although it comes with overhead. The other settings for CMS seem to default automatically to the number of CPUs so they seem sane to me. The rest that I added are extras that might do good or bad generally for performance, and they should probably be tested.
Definitely.
I think it's going to be difficult for anybody to give you anything more than general advice, without having further knowledge of your application.
What I would suggest is that you use VisualGC (or the VisualGC plugin for VisualVM) to actually look at what the garbage collection is doing when your app is running. Once you have a greater understanding of how the GC is working alongside your application, it'll be far easier to tune it.
#1. Given the following case how would you tune the heap and GC settings?
First, having 64 gigabytes of memory doesn't imply that you have to use them all for one JVM. Actually, it rather means you can run many of them. Then, it is impossible to answer your question without any access to your machine and application to measure and analyse things (knowing what your application is doing isn't enough). And no, I'm not asking to get access to your environment :)
#2. Would there be noticeable hickups (pauses of JVM etc) that would be noticed by the end users?
The goal of tuning is to find a good compromise between frequency and duration of (major) GCs. With a ~55g heap, GC won't be frequent but will take noticeable time, for sure (the bigger the heap, the longer the major GC). Using a Parallel or Concurrent garbage collector will help on multiprocessor systems but won't entirely solve this issue. Why do you need ~55g (this is mega ultra huge for a webapp IMO), that's my question. I'd rather run many clustered JVMs to handle load if required (at some point, the database will become the bottleneck anyway with a data oriented application).
#3. Should this really still work? I think it should.
Hmm... not sure I get the question. What is "this"? Instantiating a JVM with a big heap? Yes, it should. Is it equivalent to running several JVMs? No, certainly not.
PS: 4G is the maximum theoretical heap limit for the 32-bit JVM running on a 64-bit operating system (see Why can't I get a larger heap with the 32-bit JVM?)
PPS: On 64-bit VMs, you have 64 bits of addressability to work with resulting in a maximum Java heap size limited only by the amount of physical memory and swap space your system provides. (see How large a heap can I create using a 64-bit VM?)
Obviously heap size is not unlimited, and the larger the heap size, the more time your JVM will eventually spend on GC. Though I think it is possible to set the heap size quite high on a 64-bit JVM, I still think it's not really practical. The advice here is that it's better to have several JVMs running with the same parameters, i.e. a cluster of JBoss/Tomcat nodes running on the same physical machine, and you will get better throughput.
EDIT: Also, your GC behavior depends on the taxonomy of your heap. If you have a lot of short-lived objects and each request to the server creates a lot of those, then your GC will collect a lot of garbage very often, and thus on a large heap size this will result in longer pauses. If you have very many long-lived objects (e.g. caching most of your data in memory) and the amount of short-lived objects is not that big, then having a bigger heap size is OK.
As Chris Rice already wrote, I wouldn't expect any obvious problems with the GC for heap sizes up to 32-64GB, although there may of course be some point of your application logic, which can cause problems.
Not directly related to GC, but I would still recommend you to perform a realistic load test on your production system. I used to work on a project, where we had a similar setup (relatively large, clustered JBoss/Tomcat setup to serve a public web application) and without exaggeration, JBoss is not behaving very well under high load or with a high number of concurrent calls if you are using EJBs. JBoss is spending a lot of time in synchronized blocks when accessing and managing the EJB instance pools and if you opt for a cluster, it will even wait for intra-cluster network communication within these synchronized blocks. Be especially aware of poorly performing state replication, if you are using SFSBs.
Only to add some more switches I would use by default: -Xms55g can help to reduce the ramp-up time because it frees Java from the need to check if it can fall back to the initial size, and it also allows better internal initial sizing of memory areas.
Additionally, we have had good experiences with NewSize, giving you a large young generation to get rid of short-term garbage: -XX:NewSize=1g. Most webapps create a lot of short-lived garbage that will never survive the request processing, so you can even make that bigger. With -Xms55g, the VM reserves a large chunk already; maybe downsizing can help.
-Xincgc helps to clean the young generation incrementally and returns the CPU to the user threads more often.
-XX:CMSInitiatingOccupancyFraction=70: if you really fill all that memory, this starts CMS garbage collection earlier.
-XX:+CMSIncrementalMode puts the CMS into incremental mode, returning the CPU to the user threads more often.
Attach to the process with jstat -gc -h 10 <pid> 1s and watch the GC working.
Will you really fill up the memory? I assume that 64 CPUs for request processing might even be able to work with less memory. What do you store in there?
Depending on your GC pause analysis, you may wish to implement Incremental mode whereby the long pause may be broken out over a period of time.
I have found that memory architecture plays a part at large memory sizes. Applications in general don't perform as well if they use more than one memory bank. The JVM appears to suffer as well, especially the GC, which has to sweep the whole memory.
If you have an application which doesn't fit into one memory bank, your application has to pull in memory which is not local to a processor and use memory local to another processor.
On linux you can run numactl --hardware to see the layout of processors and memory banks.
I know there is no "right" heap size, but which heap size do you use in your applications (application type, jdk, os)?
The JVM Options -Xms (initial/minimum) and -Xmx (maximum) allow for controlling the heap size. What settings make sense under which circumstances? When are the defaults appropriate?
You have to try your application and see how it performs. For example, I used to always run IDEA out of the box until I got this new job where I work on a huge monolithic project. IDEA was running very slowly and regularly throwing out-of-memory errors when compiling the full project.
The first thing I did was to ramp up the heap to 1 gig. This got rid of the out-of-memory issues, but it was still slow. I also noticed IDEA was regularly freezing for 10 seconds or so, after which the used memory was cut in half only to ramp up again, and that triggered the garbage collection idea. I now use it with -Xms512m and -Xmx768m, but I also added -Xincgc to activate incremental garbage collection.
As a result, I've got my old IDEA back: it runs smooth, doesn't freeze anymore and never uses more than 600m of heap.
For your application you have to use a similar approach: try to determine the typical memory usage and tune your heap for the application to run well in those conditions. But also let advanced users tune the setting, to address out-of-the-ordinary data loads.
It depends on the application type. A desktop application is much different than a web application. An application server is much different than a standalone application.
It also depends on the JVM that you are using. JDK5 and later 6 include enhancements that help understand how to tune your application.
Heap size is important, but it's also important to know how it plays with the garbage collector.
JDK1.4 Garbage Collector Tuning
JDK5 Garbage Collector Tuning
JDK6 Garbage Collector Tuning
Actually I always considered it very strange that Java limits the heap size. A native application can usually use as much heap as it wants, until it runs out of virtual address space. The only reason to limit the heap in Java seems to be the garbage collector, which has a certain kind of "laziness" and may not garbage collect objects unless there is a necessity to do so. That means if you choose the heap too big, your app constantly uses more memory than is really necessary.
However, Sun has improved the GC a lot over the years, and to emulate the behavior of a native C app, I would set the initial heap size to 32 MB (for small programs) or 64 MB (for bigger ones) and the maximum to something between 1-2 GB. If your app really needs over 1 GB of memory, it is most likely broken (unless you deal with data objects that large), but I see no reason why your app should be killed just because it goes over a certain heap size.
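In concrete flags, that would be something like (values purely illustrative):
java -Xms64m -Xmx1024m MyApp
with MyApp standing in for your main class.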
Of course, this is referring to normal PCs. If you create Java code for mobile phones or other limited devices, you should probably adapt the initial and maximum heap size to the limitations of that device.
Typically I try not to use heaps larger than 1 GB.
It will cost you on major garbage collections.
Sometimes it is better to split your application across a few JVMs on the same machine rather than use large heap sizes.
A major collection with a large heap size can take >10 minutes (on unoptimized GC applications).
This is entirely dependent on your application and any hardware limitations you may have. There is no one size fits all.
jmap can be used to have a look at what heap you are actually using and is a good starting point for right-sizing the heap.
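For example (the PID 1234 is a placeholder), on a Sun/Oracle JVM
jmap -heap 1234
prints the configured capacities and current usage of the individual heap generations.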
You need to spend quite some time in JConsole or VisualVM to get a clear picture of what the plateau memory usage is. Wait until everything is stable and you see the characteristic sawtooth curve of heap memory usage. The peaks should be at 70-80% of your heap, depending on what garbage collector you use.
Most garbage collectors trigger full GCs when heap usage reaches a certain percentage. This percentage is from 60% to 80% of max heap, depending on what strategy is involved.
1.3 GB for a heavy GUI application.
Unfortunately, on Linux the JVM seems to reserve 1.3 GB of virtual memory up front in that situation, which looks bad even if it's not needed (and causes a lot of confused grumbling from users).
On my most memory intensive app:
-Xms250M -Xmx1500M -XX:+UnlockExperimentalVMOptions -XX:+UseG1GC