Is there a cookbook guide for GC problems? - java

Almost everyone eventually runs into GC issues with Java.
Is there a cookbook guide or semi-automated tool to tune GC for Java?
My rationale is this:
Almost everyone eventually has these problems.
There are many possible factors (say 20) out of which only a few affect your problem.
Most people don't know how to identify the key factors so GC tuning is more like a black art than a science.
Not everyone uses a HotSpot VM. Different Sun versions have different GC characteristics.
There is little incentive to experiment (e.g. running the VM with slightly different settings every day to see how they play out).
So the question really is: Is there something that I can use in a check-list manner? Or maybe even a tool that analyzes GC logs or heap dumps and gives me specific hints where to look (instead of telling me "95% of the data is allocated in objects of the type byte[]" which is basically useless).
Related questions:
Appropriate Tomcat 5.5 start-up parameters to tune JVM for extremely high demand, large heap web application? - which is very specific; my question is broader.
What are the best garbage collection settings for client side? - again, very narrow in scope.
Does anyone know of a good guide to configure GC in Java? HotSpot only
JVM memory management & garbage collection book? is 80% there but I'm missing the checklist/cookbook/for-dummies approach.

Out of various resources I have compiled a sanity checklist that I use to analyze the GC behavior and performance of my applications. These guidelines are general and apply to any vendor's JVM, but they also contain HotSpot-specific details for illustration.
Disable Explicit GC. Explicit GC is a bad coding practice; it never helps. Use -XX:+DisableExplicitGC.
Enable Full GC logging. Lightweight yet powerful.
Compute Live Data Set, Allocation Rate, and Promotion Rate. This will tell you whether you need a bigger heap, whether e.g. your Young Gen is too small, whether your Survivor spaces are overflowing, etc.
Compute total GC time; it should be <5% of total running time (a minimal sketch for this check is shown right after this checklist).
Use -XX:+PrintTenuringDistribution -XX:+UnlockDiagnosticVMOptions -XX:+LogVMOutput -XX:LogFile=jvm.log -XX:+HeapDumpOnOutOfMemoryError -Xloggc:gc.log -XX:+PrintGCTimeStamps -XX:+PrintGCDetails -showversion
Consider additional means of collecting information about your GC. Logging is fine, but lightweight command-line tools are sometimes available that give you even more insight, e.g. jstat for HotSpot, which shows the occupancy and capacity of Eden, the Survivor spaces, and the Old Gen.
Collect Class Histograms. These are lightweight and will show you the content of the heap. You can take snapshots whenever you notice some strange GC activity, or you can take them before/after Full GC:
Content of the OldGen space: You can find out which objects reside in the OldGen. You need to print histograms before and after Full GC. And since a YoungGen collection is executed before the Full GC, these Histograms will show you the content of the Old generation. Use -XX:+PrintClassHistogramBeforeFullGC -XX:+PrintClassHistogramAfterFullGC.
Detecting prematurely promoted objects: To determine whether any instances are promoted early, study the histograms to see which classes are expected to reside in the OldGen and which classes should be seen only in the YoungGen. This cannot be done automatically; you need to reason about the purpose of each class and its instances to determine whether the objects are temporary or not.
Consider a different GC algorithm. VMs usually come with several GC implementations that provide different tradeoffs: throughput, footprint, pause-less/short pauses, real-time, etc. Consider the options you have and pick the one that suits your needs.
Beware of finalize(). Check that the GC keeps up with classes that use finalize(). Executing this method can be quite costly, which can impact GC and application throughput.
Heap Dumps. This is the first step that is heavyweight and will impact the running application. Collect a heap dump to study the heap content further or to confirm a hypothesis observed in one of the earlier steps.
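For the "total GC time < 5%" check above, here is a minimal sketch (my own illustration, not part of the original checklist) that reads the standard GarbageCollectorMXBeans and compares cumulative collection time with JVM uptime; you can run the same logic inside your application or expose it over JMX:

    import java.lang.management.GarbageCollectorMXBean;
    import java.lang.management.ManagementFactory;

    public class GcOverhead {
        public static void main(String[] args) {
            long gcTimeMs = 0;
            for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
                // getCollectionTime() is the cumulative time (ms) spent in this collector,
                // or -1 if the value is unavailable.
                long t = gc.getCollectionTime();
                if (t > 0) {
                    gcTimeMs += t;
                }
                System.out.printf("%s: %d collections, %d ms%n",
                        gc.getName(), gc.getCollectionCount(), t);
            }
            long uptimeMs = ManagementFactory.getRuntimeMXBean().getUptime();
            System.out.printf("GC overhead: %.2f%% of %d ms uptime%n",
                    100.0 * gcTimeMs / uptimeMs, uptimeMs);
        }
    }

The same numbers can of course be derived from the GC log; the MXBean route is just convenient for a quick in-process sanity check.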
Resources used:
Books:
Java Performance - practical guide
The Garbage Collection Handbook - theory explained
Talks/Articles:
Java One 2012 Advanced JVM Tuning
From Java code to Java heap
Java One 2012 G1 Garbage Collector Performance Tuning
Garbage Collection Tuning Guide
Mailing Lists:
OpenJDK Hotspot GC Use

References for various GC information:
Oracle
Tuning Garbage Collection with the 5.0 Java[tm] Virtual Machine
and this also
Java SE 6 HotSpot[tm] Virtual Machine Garbage Collection Tuning
IBM
Fine Tuning Garbage Collection [link dead]
Extensible Verbose Toolkit
SAP JVM
Memory Management (Garbage Collection)
Detecting Memory Leaks
Detecting Hanging / Looping VMs
Analyzing Out-of-Memory Situations
Sorry, I don't know much about SAP, but I have provided some things I found.
As for a cookbook, tuning is most likely application specific at this level, but it is an interesting topic.
ADDENDUM
You also mentioned analysis tools. Some candidates are listed here:
Know of any Java garbage collection log analysis tools?

Related

JAVA : Why releasing Heap memory not reflecting in Task manager [duplicate]


Does GC release back memory to OS?

When the garbage collector runs and releases memory, does this memory go back to the OS, or is it kept as part of the process? I was under the strong impression that the memory is never actually released back to the OS but kept as part of the memory area/pool to be reused by the same process.
As a result, the actual memory footprint of a process would never decrease. An article that reminded me of this was this one, and Java's runtime is written in C/C++, so I guess the same thing applies?
Update
My question is about Java. I am mentioning C/C++ since I assume that Java's allocation/deallocation is done by the JRE using some form of malloc/delete.
The HotSpot JVM does release memory back to the OS, but it does so reluctantly, since resizing the heap is expensive and it is assumed that if you needed that much heap once, you'll need it again.
In general, shrinking ability and behavior depend on the chosen garbage collector and on the JVM version, since shrinking capability was often introduced in later releases, long after the GC itself was added. Some collectors also require additional options to opt into shrinking, and some most likely never will support it, e.g. Epsilon GC.
So if heap shrinking is desired, it should be tested for the particular JVM version and GC configuration.
JDK 8 and earlier
There are no explicit options for prompt memory reclamation in these versions, but you can make the GC more aggressive in general by setting -XX:GCTimeRatio=19 -XX:MinHeapFreeRatio=20 -XX:MaxHeapFreeRatio=30, which allows it to spend more CPU time on collecting and constrains the amount of allocated-but-unused heap memory after a GC cycle.
If you're using a concurrent collector, you can also set -XX:InitiatingHeapOccupancyPercent=N with N set to some low value to let the GC run concurrent collections almost continuously, which will consume even more CPU cycles but shrink the heap sooner. This is generally not a good idea, but on machines with lots of spare CPU cores and little memory it can make sense.
If you're using G1GC, note that it only gained the ability to give back unused chunks in the middle of the heap with JDK 8u20; earlier versions were only able to return chunks at the end of the heap, which put significant limits on how much could be reclaimed.
If you're using a collector with a default pause time goal (e.g. CMS or G1), you can also relax that goal to place fewer constraints on the collector, or you can switch to the parallel collector to prioritize footprint over pause times.
To verify that shrinking occurs, or to diagnose why the GC decides not to shrink, you can use GC logging; -XX:+PrintAdaptiveSizePolicy may also provide insight, e.g. when the JVM tries to use more memory for the young generation to meet some goals.
JDK 9
Added the -XX:-ShrinkHeapInSteps option, which can be used to apply the shrinking caused by the options mentioned in the previous section more aggressively. Relevant OpenJDK bug.
For logging, -XX:+PrintAdaptiveSizePolicy has been replaced with -Xlog:gc+ergo.
JDK 12
Introduced the option to enable prompt memory release for G1GC via -XX:G1PeriodicGCInterval (JEP 346), again at the expense of some additional CPU. The JEP also mentions similar features in Shenandoah and the OpenJ9 VM.
JDK 13
Adds similar behavior for ZGC; in this case it is enabled by default. Additionally, -XX:SoftMaxHeapSize can be helpful for some workloads to keep the average heap size below some threshold while still allowing transient spikes.
The JVM does release back memory under some circumstances, but (for performance reasons) this does not happen whenever some memory is garbage collected. It also depends on the JVM, OS, garbage collector etc. You can watch the memory consumption of your app with JConsole, VisualVM or another profiler.
Also see this related bug report
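If you'd rather watch this from inside the process than in JConsole or VisualVM, a minimal sketch (my own addition, not from the bug report) using the standard MemoryMXBean prints heap "used" versus "committed"; committed memory shrinking over time is the sign that the JVM is actually giving heap back:

    import java.lang.management.ManagementFactory;
    import java.lang.management.MemoryUsage;

    public class HeapFootprint {
        public static void main(String[] args) throws InterruptedException {
            while (true) {
                // "used" is data currently in the heap (live objects plus not-yet-collected garbage);
                // "committed" is the amount of memory the JVM currently holds for the heap.
                MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
                System.out.printf("used=%d MB committed=%d MB max=%d MB%n",
                        heap.getUsed() >> 20, heap.getCommitted() >> 20, heap.getMax() >> 20);
                Thread.sleep(5000);
            }
        }
    }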
If you use the G1 collector and call System.gc() occasionally (I do it once a minute), Java will reliably shrink the heap and give memory back to the OS.
Since Java 12, G1 does this automatically if the application is idle.
I recommend using these options combined with the above suggestion for a very compact resident process size:
-XX:+UseG1GC -XX:MaxHeapFreeRatio=30 -XX:MinHeapFreeRatio=10
I've been using these options daily for months with a big application (a whole Java-based guest OS) that dynamically loads and unloads classes, and the Java process almost always stays between 400 and 800 MB.
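If you want to reproduce the "call System.gc() once a minute" approach from this answer, a minimal sketch could look like the following (my own illustration; whether the heap actually shrinks still depends on the collector and the free-ratio flags above, and -XX:+DisableExplicitGC would turn the call into a no-op):

    import java.util.concurrent.Executors;
    import java.util.concurrent.ScheduledExecutorService;
    import java.util.concurrent.TimeUnit;

    public class PeriodicGc {
        public static void main(String[] args) {
            ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "periodic-gc");
                t.setDaemon(true); // don't keep the JVM alive just for this thread
                return t;
            });
            // Request a full GC once a minute; with G1 and suitable Min/MaxHeapFreeRatio
            // settings this tends to shrink the committed heap.
            scheduler.scheduleAtFixedRate(System::gc, 1, 1, TimeUnit.MINUTES);
            // ... rest of the application runs here ...
        }
    }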
This article explains how the GC works in Java 7. In a nutshell, there are many different garbage collectors available. Usually the memory is kept for the Java process, and only some GCs release it to the system (upon request, I think). But the memory used by the Java process will not grow indefinitely, as there is an upper limit defined by the -Xmx option (which is usually 256m, but I think it is OS/machine dependent).
Since Java 13, ZGC can return unused heap memory to the operating system (JEP 351).
Please see the link

Best suitable JVM implementation for Realtime Applications of Telecom domain

Out of the many JVM implementations, which is most suitable for real-time applications such as those in the telecom domain?
I am working on a telecom-domain application and wanted some advice regarding the choice of JVM.
We are currently using HotSpot, but I have read about JRockit and Azul.
If someone uses one of these JVMs and has seen major performance improvements, please share.
The HotSpot JVM is a pretty good and cost-effective option. It provides a few GC algorithms; in particular, Concurrent Mark Sweep is good for certain kinds of real-time applications.
G1 is another low-pause GC algorithm, which is actively promoted by Oracle, but so far its results are quite disappointing.
JRockit is a dead end. It never had a good low-pause GC algorithm, and it is going to be merged into HotSpot.
Azul Zing is in another league compared to HotSpot/JRockit. It reliably keeps GC pauses on the order of milliseconds, but it requires a more complex setup. It has a few deployment options (virtual appliance, etc.); you should check whether it would fit your infrastructure.
On a general note:
No JVM can guarantee you a minimal GC pause time; it is always best effort. There are a lot of factors affecting GC pause duration, and most of them are very application specific.
If you are seeking a guaranteed response time below 5 ms (not just for 99.9...% of responses but for 100%), you should consider techniques that avoid Java heap memory usage, i.e. off-heap memory or static memory preallocation.
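As a small illustration of the off-heap idea (my own sketch, not from the links that follow), a direct ByteBuffer keeps the bulk of the data outside the Java heap, so the collector only sees the tiny wrapper object rather than the data itself:

    import java.nio.ByteBuffer;

    public class OffHeapBuffer {
        public static void main(String[] args) {
            // 256 MB allocated outside the Java heap; the GC never scans or copies this data,
            // it only tracks the small ByteBuffer wrapper object.
            ByteBuffer buffer = ByteBuffer.allocateDirect(256 * 1024 * 1024);
            buffer.putLong(0, 42L); // absolute write at offset 0
            System.out.println(buffer.getLong(0));
            // The native memory is freed only when the wrapper is collected, so for strict
            // latency control such buffers are usually preallocated once at startup and reused.
        }
    }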
Here are a few links where you can find more specific details about GC algorithms and low-pause tuning.
Understanding GC pauses in JVM, HotSpot's minor GC
Understanding GC pauses in JVM, HotSpot's CMS collector
JRockit GC in action
GC checklist for data grid nodes

What are the best garbage collection settings for client side?

Recent JVMs have a lot of -XX parameters for garbage collection (see here for example), but which options can make a client-side Swing application really perform better?
I should note that one of the things that really annoys me in client-side Java applications is the long stop-the-world garbage collection pauses. In IntelliJ IDEA I have seen them last three minutes or more.
EDIT: Thanks for all the responses. Just to report back, I enabled the CMS garbage collector for IDEA (which is a good common reference for the type of application most people reading this question are familiar with) using the settings suggested here. I also set -XX:+StringCache to see if it would reduce memory requirements.
In general, the observation is that regular running performance is not degraded to the point where you notice it by looking at it. The memory reduction with the String Cache option is huge; however, the CMS method is not thorough and ends up requiring a stop-the-world garbage collection cycle (back to the three-minute wait) to clear out the memory (400 MB in one run).
However, given the reduced memory footprint, I might be able to just set a smaller maximum heap, which will keep the stop-the-world collections smaller.
IDEA 8.1.4 comes with JDK 1.6.0_12, so I haven't tested G1 yet. Also, my machine only has 2 cores, so a G1 approach won't really be maximized. Time to hit the boss up for a better machine ;).
There is no single answer to this question; it depends highly on what your application is doing and how it manages its objects. Maybe have a look at How does garbage collection work and Parallel and concurrent garbage collectors to understand the various options.
Then, check the Java SE 6 HotSpot[tm] Virtual Machine Garbage Collection Tuning document that expands on GC tuning concepts and techniques for Java SE 6 that were introduced in the Tuning Garbage Collection with the 5.0 Java Virtual Machine document.
If you want to keep garbage collection pauses short, the concurrent collector is likely the right direction as it performs most of its work concurrently (i.e., while the application is still running). But finding the best setup will require profiling (consider measuring the GC throughput, the max and average pause time, the frequency of full GCs and their duration too).
(EDIT: Having read a comment from the OP, I think that reading My advice on JVM heap tuning, keep your fingers off the knobs! from the performance guru Kirk Pepperdine would be a good idea.)
Garbage collection tuning is more of an art than a science, and it really depends on your application and its usage. If the standard stop-the-world strategies bother you, why not switch to the CMS (concurrent mark and sweep) or the new G1 collector?
The best way is to change the parameters and attach a profiler to examine the application behaviour.
This is quite automatic and works for us:
-server -Xss4096k -Xms12G -Xmx12G -XX:MaxPermSize=512m -XX:+HeapDumpOnOutOfMemoryError -verbose:gc -Xmaxf1 -XX:+UseCompressedOops -XX:+DisableExplicitGC -XX:+AggressiveOpts -XX:+ScavengeBeforeFullGC -XX:CMSFullGCsBeforeCompaction=10 -XX:CMSInitiatingOccupancyFraction=80 -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode -XX:+CMSIncrementalPacing -XX:+CMSParallelRemarkEnabled -XX:GCTimeRatio=19 -XX:+UseAdaptiveSizePolicy -XX:MaxGCPauseMillis=500 -XX:+PrintGCTaskTimeStamps -XX:+PrintGCApplicationStoppedTime -XX:+PrintHeapAtGC -XX:+PrintTenuringDistribution -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCApplicationConcurrentTime -Xloggc:gc.log
There is no "best" option (if there were, everyone would use it, right?), but maybe there is an option which helps in your case. Here are some tips:
Use the latest VM. The GC code got better with every release.
Use the client jvm.dll (available since Java 1.5 in jre/bin/client/). This should be the default.
Allocating and freeing objects in Java is cheap. It's expensive to keep them around.
If you want better performance, then give the garbage collector less work. Consider using a pool of objects rather than constantly creating and dumping them (see the sketch below), and make sure you need every object you create.
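If you do go the pooling route, a minimal single-threaded sketch based on ArrayDeque might look like this (the class and method names are mine, purely for illustration); note that pooling only pays off for objects that are genuinely expensive to create, otherwise it just keeps objects alive longer and gives the collector more to trace:

    import java.util.ArrayDeque;
    import java.util.Deque;
    import java.util.function.Supplier;

    // Tiny single-threaded object pool: borrow() reuses a previously released
    // instance if one is available, otherwise it creates a new one.
    public class SimplePool<T> {
        private final Deque<T> free = new ArrayDeque<>();
        private final Supplier<T> factory;

        public SimplePool(Supplier<T> factory) {
            this.factory = factory;
        }

        public T borrow() {
            T obj = free.pollFirst();
            return obj != null ? obj : factory.get();
        }

        public void release(T obj) {
            free.addFirst(obj); // caller must reset the object's state before reuse
        }
    }

Usage would be something like new SimplePool<>(StringBuilder::new): borrow a builder, use it, clear it, and release it back.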

Java very large heap sizes [closed]

Does anyone have experience with using very large heaps, 12 GB or higher in Java?
Does the GC make the program unusable?
What GC params do you use?
Which JVM, Sun's or BEA's, would be better suited for this?
Which platform, Linux or Windows, performs better under such conditions?
In the case of Windows is there any performance difference to be had between 64 bit Vista and XP under such high memory loads?
If your application is not interactive, and GC pauses are not an issue for you, there shouldn't be any problem for 64-bit Java to handle very large heaps, even in hundreds of GBs. We also haven't noticed any stability issues on either Windows or Linux.
However, when you need to keep GC pauses low, things get really nasty:
Forget the default throughput, stop-the-world GC. It will pause your application for several tens of seconds for moderate heaps (< ~30 GB) and several minutes for large ones (> ~30 GB). And buying faster DIMMs won't help.
The best bet is probably the CMS collector, enabled by -XX:+UseConcMarkSweepGC. The CMS garbage collector stops the application only for the initial-mark and remark phases. For very small heaps (< 4 GB) this is usually not a problem, but for an application that creates a lot of garbage and has a large heap, the remark phase can take quite a long time - usually much less than a full stop-the-world collection, but it can still be a problem for very large heaps.
When the CMS garbage collector is not fast enough to finish before the tenured generation fills up, it falls back to standard stop-the-world GC. Expect pauses of ~30 seconds or more for heaps around 16 GB. You can try to avoid this by keeping the long-lived garbage production rate of your application as low as possible. Note that the more cores are running your application, the bigger this problem gets, because CMS utilizes only one core. Obviously, beware that there is no guarantee the CMS does not fall back to the STW collector. And when it does, it usually happens at peak load, and your application is dead for several seconds. You would probably not want to sign an SLA for such a configuration.
Well, there is that new G1 thing. It is theoretically designed to avoid the problems with CMS, but we have tried it and observed that:
Its throughput is worse than that of CMS.
It theoretically should avoid collecting the popular blocks of memory first; however, it soon reaches a state where almost all blocks are "popular", and the assumptions it is based on simply stop working.
Finally, the stop-the-world fallback still exists for G1; ask Oracle when that code is supposed to run. If they say "never", ask them why the code is there. So IMHO G1 really doesn't make the huge-heap problem of Java go away, it only makes it (arguably) a little smaller.
If you have the bucks for a big server with big memory, you probably also have the bucks for good, commercial, hardware-accelerated, pauseless GC technology, like the one offered by Azul. We have one of their servers with 384 GB RAM and it really works fine - no pauses, zero lines of stop-the-world code in the GC.
Write the damn part of your application that requires lots of memory in C++, like LinkedIn did with social graph processing. You still won't avoid all the problems by doing this (e.g. heap fragmentation), but it would definitely be easier to keep the pauses low.
I am CEO of Azul Systems so I am obviously biased in my opinion on this topic! :) That being said...
Azul's CTO, Gil Tene, has a nice overview of the problems associated with Garbage Collection and a review of various solutions in his Understanding Java Garbage Collection and What You Can Do about It presentation, and there's additional detail in this article: http://www.infoq.com/articles/azul_gc_in_detail.
Azul's C4 Garbage Collector in our Zing JVM is both parallel and concurrent, and uses the same GC mechanism for both the new and old generations, working concurrently and compacting in both cases. Most importantly, C4 has no stop-the-world fallback. All compaction is performed concurrently with the running application. We have customers running very large heaps (hundreds of GB) with worst-case GC pause times of <10 ms, and depending on the application, often less than 1-2 ms.
The problem with CMS and G1 is that at some point the Java heap must be compacted, and both of those garbage collectors stop the world (i.e. pause the application) to perform compaction. So while CMS and G1 can push out STW pauses, they don't eliminate them. Azul's C4, however, does completely eliminate STW pauses, and that's why Zing has such low GC pauses even for gigantic heap sizes.
We have an application that we allocate 12-16 GB for, but it really only reaches 8-10 during normal operation. We use the Sun JVM (we tried IBM's and it was a bit of a disaster, but that just might have been ignorance on our part; I have friends that swear by it, and they work at IBM). As long as you give your app breathing room, the JVM can handle large heap sizes without too much GC. Plenty of 'extra' memory is key.
Linux is almost always more stable than Windows and when it is not stable it is a hell of a lot easier to figure out why. Solaris is rock solid as well and you get DTrace too :)
With these kinds of loads, why on earth would you be using Vista or XP? You are just asking for trouble.
We don't do anything fancy with the GC params. We do set the minimum allocation equal to the maximum so the JVM is not constantly trying to resize, but that's it.
I have used over 60 GB heap sizes on two different applications under Linux and Solaris respectively using 64-bit versions (obviously) of the Sun 1.6 JVM.
I never encountered garbage collection problems with the Linux-based application except when pushing up near the heap size limit. To avoid the thrashing problems inherent to that scenario (too much time spent doing garbage collection), I simply optimized memory usage throughout the program so that peak usage was about 5-10% below a 64 GB heap size limit.
With a different application running under Solaris, however, I encountered significant garbage-collection problems which made it necessary to do a lot of tweaking. This consisted primarily of three steps:
Enabling/forcing use of the parallel garbage collector via the -XX:+UseParallelGC -XX:+UseParallelOldGC JVM options, as well as controlling the number of GC threads used via the -XX:ParallelGCThreads option. See "Java SE 6 HotSpot Virtual Machine Garbage Collection Tuning" for more details.
Extensive and seemingly ridiculous setting of local variables to "null" after they are no longer needed. Most of these were variables that should have been eligible for garbage collection after going out of scope, and they were not memory leak situations since the references were not copied. However, this "hand-holding" strategy to aid garbage collection was inexplicably necessary for some reason for this application under the Solaris platform in question.
Selective use of the System.gc() method call in key code sections after extensive periods of temporary object allocation. I'm aware of the standard caveats against using these calls, and the argument that they should normally be unnecessary, but I found them to be critical in taming garbage collection when running this memory-intensive application.
The three above steps made it feasible to keep this application contained and running productively at around 60 GB heap usage instead of growing out of control up into the 128 GB heap size limit that was in place. The parallel garbage collector in particular was very helpful since major garbage-collection cycles are expensive when there are a lot of objects, i.e., the time required for major garbage collection is a function of the number of objects in the heap.
I cannot comment on other platform-specific issues at this scale, nor have I used non-Sun (Oracle) JVMs.
12Gb should be no problem with a decent JVM implementation such as Sun's Hotspot.
I would advise you to use the Concurrent Mark and Sweep collector (-XX:+UseConcMarkSweepGC) when using a Sun VM. Otherwise you may face long "stop the world" phases, where all threads are stopped during a GC.
The OS should not make a big difference for the GC performance.
You will need of course a 64 bit OS and a machine with enough physical RAM.
I also recommend taking a heap dump to see where memory usage can be improved in your app, and analyzing the dump in something such as Eclipse's MAT. There are a few articles on the MAT page on getting started with looking for memory leaks. You can use jmap to obtain the dump with something such as:
jmap -heap:format=b pid
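If you prefer to trigger the dump from inside the application (say, right after a suspicious phase), a sketch using the HotSpot-specific diagnostic MXBean is shown below; this relies on the com.sun.management API, so it works on HotSpot/OpenJDK only, and the file name is just an example:

    import com.sun.management.HotSpotDiagnosticMXBean;
    import java.io.IOException;
    import java.lang.management.ManagementFactory;

    public class DumpHeap {
        public static void main(String[] args) throws IOException {
            HotSpotDiagnosticMXBean diag =
                    ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            // Second argument: true = dump only live objects (triggers a full GC first),
            // false = include unreachable objects as well. Fails if the file already exists.
            diag.dumpHeap("heap.hprof", true);
            System.out.println("Wrote heap.hprof - open it in Eclipse MAT");
        }
    }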
As mentioned above, if you have a non-interactive program, the default (compacting) garbage collector (GC) should work well. If you have an interactive program, and you (1) don't allocate memory faster than the GC can keep up, and (2) don't create temporary objects (or collections of objects) that are too big (relative to the total maximum JVM memory) for the GC to work around, then CMS is for you.
You run into trouble if you have an interactive program where the GC doesn't have enough breathing room. That's true regardless of how much memory you have, but the more memory you have, the worse it gets. That's because when you get too low on memory, CMS will run out of memory, whereas the compacting GCs (including G1) will pause everything until all the memory has been checked for garbage. This stop-the-world pause gets bigger the more memory you have. Trust me, you don't want your servlets to pause for over a minute. I wrote a detailed StackOverflow answer about these pauses in G1.
Since then, my company has switched to Azul Zing. It still can't handle the case where your app really needs more memory than you've got, but up until that very moment it runs like a dream.
But, of course, Zing isn't free and its special sauce is patented. If you have far more time than money, try rewriting your app to use a cluster of JVMs.
On the horizon, Oracle is working on a high-performance GC for multi-gigabyte heaps. However, as of today that's not an option.
If you switch to 64-bit you will use more memory. Pointers become 8 bytes instead of 4. If you are creating lots of objects, this can be noticeable, since every reference to an object is a pointer.
I have recently allocated 15 GB of memory in Java using the Sun 1.6 JVM with no problems, though it is all only allocated once. Not much more memory is allocated or released after the initial amount. This was on Linux, but I imagine the Sun JVM will work just as well on 64-bit Windows.
You should try running visualgc against your app. It's a heap visualization tool that's part of the jvmstat download at http://java.sun.com/performance/jvmstat/
It is a lot easier than reading GC logs.
It quickly helps you understand how the parts (generations) of the heap are working. While your total heap may be 10 GB, the various parts of the heap will be much smaller. GCs in the Eden portion of the heap are relatively cheap, while full GCs in the old generation are expensive. Sizing your heap so that Eden is large and the old generation is hardly ever touched is a good strategy. This may result in a very large overall heap, but what the heck: if the JVM never touches a page, it's just a virtual page and doesn't have to take up RAM.
A couple of years ago, I compared JRockit and the Sun JVM for a 12G heap. JRockit won, and Linux hugepages support made our test run 20% faster. YMMV as our test was very processor/memory intensive and was primarily single-threaded.
Here's an article on GC from one of the Java Champions:
http://kirk.blog-city.com/is_your_concurrent_collector_failing_you.htm
Kirk, the author, writes:
"Send me your GC logs
I'm currently interested in studying Sun JVM produced GC logs. Since these logs contain no business relevant information, it should ease concerns about protecting proprietary information. All I ask is that with the log you mention the OS, complete version information for the JRE, and any heap/GC related command line switches that you have set. I'd also like to know if you are running Grails/Groovy, JRuby, Scala or something other than or alongside Java. The best setting is -Xloggc:. Please be aware that this log does not roll over when it reaches your OS size limit. If I find anything interesting I'll be happy to give you a very quick synopsis in return."
An article from Sun on Java 6 can help you: https://www.oracle.com/java/technologies/javase/troubleshooting-javase.html
The max memory that XP can address is 4 GB (here), so you may not want to use XP for that (use a 64-bit OS).
Sun has had an Itanium 64-bit JVM for a while, although Itanium is not a popular destination. The Solaris and Linux 64-bit JVMs are what you should be after.
Some questions:
1) Is your application stable?
2) Have you already tested the app in a 32-bit JVM?
3) Is it OK to run multiple JVMs on the same box?
I would expect 64-bit Windows to become stable in about a year or so, but until then Solaris/Linux might be the better bet.
