Python uses reference counting to manage object lifetimes, so an object that is no longer in use is destroyed immediately.
But in Java, the GC (garbage collector) destroys objects that are no longer used only at certain times, not immediately.
Why does Java choose this strategy and what is the benefit from this?
Is this better than the Python approach?
There are drawbacks to using reference counting. One of the most frequently mentioned is circular references: suppose A references B, B references C, and C references B. If A drops its reference to B, both B and C still have a reference count of 1 and won't be deleted by traditional reference counting. CPython (reference counting is not part of Python itself, but of its C implementation) catches circular references with a separate garbage collection routine that it runs periodically...
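To make the A/B/C case concrete, here is a minimal sketch that assumes CPython specifically (other Python implementations will behave differently). The Node class and the function name are made up for illustration; gc and weakref are standard library modules.

```python
import gc
import weakref


class Node:
    """Plain object that can hold a reference to another Node."""

    def __init__(self, name):
        self.name = name
        self.ref = None


def cycle_survives_refcounting():
    """Return (alive_before_gc, alive_after_gc) for a B <-> C cycle."""
    gc.disable()                    # keep the cycle collector from running on its own
    try:
        a, b, c = Node("A"), Node("B"), Node("C")
        a.ref = b                   # A -> B
        b.ref = c                   # B -> C
        c.ref = b                   # C -> B closes the cycle
        probe = weakref.ref(b)      # observes B without keeping it alive

        a.ref = None                # A drops its reference to B
        del a, b, c                 # drop the local names too
        alive_before = probe() is not None   # the counts never reached zero

        gc.collect()                # run the cycle detector explicitly
        alive_after = probe() is not None
        return alive_before, alive_after
    finally:
        gc.enable()
```

On CPython, B is still alive after every outside reference is gone and only disappears once gc.collect() has run.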
Another drawback: reference counting can make execution slower. Each time a reference to an object is created or dropped, the interpreter/VM must update the count and check whether it has gone down to 0 (and deallocate the object if it has). A tracing garbage collector does not need to do this.
Also, Garbage Collection can be done in a separate thread (though it can be a bit tricky). On machines with lots of RAM and for processes that use memory only slowly, you might not want to be doing GC at all! Reference counting would be a bit of a drawback there in terms of performance...
Actually reference counting and the strategies used by the Sun JVM are all different types of garbage collection algorithms.
There are two broad approaches for tracking down dead objects: tracing and reference counting. In tracing, the GC starts from the "roots" (things like stack references) and traces all reachable (live) objects. Anything that can't be reached is considered dead. In reference counting, each time a reference is modified, the objects involved have their counts updated. Any object whose reference count drops to zero is considered dead.
With basically all GC implementations there are trade-offs, but tracing is usually good for high-throughput (i.e. fast) operation at the cost of longer pause times (larger gaps where the UI or program may freeze up). Reference counting can operate in smaller chunks but will be slower overall. It may mean fewer freezes but poorer performance overall.
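As a rough illustration of the reference counting half, here is a toy model. The RcHeap class and its method names are invented for this sketch and do not come from any real runtime. Objects are just names, and passing None as the source models a root (stack) reference.

```python
class RcHeap:
    """Toy heap: each object has a count and a list of outgoing references."""

    def __init__(self):
        self.count = {}      # object name -> reference count
        self.fields = {}     # object name -> names it references
        self.freed = []      # order in which objects were reclaimed

    def new(self, name):
        self.count[name] = 0
        self.fields[name] = []

    def add_ref(self, src, dst):
        """Record src -> dst; src may be None to model a root reference."""
        if src is not None:
            self.fields[src].append(dst)
        self.count[dst] += 1

    def drop_ref(self, src, dst):
        """Remove src -> dst and update dst's count."""
        if src is not None:
            self.fields[src].remove(dst)
        self._decrement(dst)

    def _decrement(self, name):
        self.count[name] -= 1
        if self.count[name] == 0:
            # Dead: release everything it pointed at, then free it.
            for child in self.fields.pop(name):
                self._decrement(child)
            del self.count[name]
            self.freed.append(name)
```

Dropping the only root reference to a chain frees the whole chain at once, while a cycle keeps its counts at 1 and is never freed by this scheme.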
Additionally, a reference counting GC requires a cycle detector to clean up objects in a cycle, which their reference counts alone will never catch. Perl 5 didn't have a cycle detector in its GC implementation and could leak cyclic data structures.
Research has also been done to get the best of both worlds (low pause times, high throughput):
http://cs.anu.edu.au/~Steve.Blackburn/pubs/papers/urc-oopsla-2003.pdf
Darren Thomas gives a good answer. However, one big difference between the Java and Python approaches is that with reference counting in the common case (no circular references) objects are cleaned up immediately rather than at some indeterminate later date.
For example, I can write sloppy, non-portable code in CPython such as
def parse_some_attrs(fname):
    # Relies on CPython's reference counting to close the file.
    return open(fname).read().split("~~~")[2:4]
and the file descriptor for that file I opened will be cleaned up immediately because as soon as the reference to the open file goes away, the file is garbage collected and the file descriptor is freed. Of course, if I run Jython or IronPython or possibly PyPy, then the garbage collector won't necessarily run until much later; possibly I'll run out of file descriptors first and my program will crash.
So you SHOULD be writing code that looks like
def parse_some_attrs(fname):
    # The with block closes the file on exit, whatever the GC strategy.
    with open(fname) as f:
        return f.read().split("~~~")[2:4]
but sometimes people like to rely on reference counting to always free up their resources because it can sometimes make your code a little shorter.
I'd say that the best garbage collector is the one with the best performance, which currently seems to be the Java-style generational garbage collectors that can run in a separate thread and have all these crazy optimizations, etc. The differences in how you write your code should be negligible and ideally non-existent.
I think the article "Java theory and practice: A brief history of garbage collection" from IBM should help explain some of the questions you have.
One big disadvantage of Java's tracing GC is that from time to time it will "stop the world" and freeze the application for a relatively long time to do a full GC. If the heap is big and the object tree complex, it will freeze for a few seconds. Also, each full GC visits the whole object tree over and over again, something that is probably quite inefficient. Another drawback of the way Java does GC is that you have to tell the JVM what heap size you want (if the default is not good enough); the JVM derives from that value several thresholds that will trigger the GC process when there is too much garbage stacking up in the heap.
I presume that this is actually the main cause of the jerky feeling of Android (based on Java), even on the most expensive cellphones, in comparison with the smoothness of iOS (based on Objective-C, and using RC).
I'd love to see a JVM option to enable RC memory management, maybe keeping GC only as a last resort when there is no more memory left.
Garbage collection is faster (more time efficient) than reference counting, if you have enough memory. For example, a copying gc traverses the "live" objects and copies them to a new space, and can reclaim all the "dead" objects in one step by marking a whole memory region. This is very efficient, if you have enough memory. Generational collections use the knowledge that "most objects die young"; often only a few percent of objects have to be copied.
[This is also the reason why gc can be faster than malloc/free]
Reference counting is much more space efficient than garbage collection, since it reclaims memory the moment an object becomes unreachable. This is nice when you want to attach finalizers to objects (e.g. to close a file once the File object becomes unreachable). A reference counting system can work even when only a few percent of the memory is free. But the management cost of having to increment and decrement counters on each pointer assignment costs a lot of time, and some kind of garbage collection is still needed to reclaim cycles.
So the trade-off is clear: if you have to work in a memory-constrained environment, or if you need precise finalizers, use reference counting. If you have enough memory and need the speed, use garbage collection.
Reference counting is particularly difficult to do efficiently in a multi-threaded environment. I don't know how you'd even start to do it without getting into hardware assisted transactions or similar (currently) unusual atomic instructions.
Reference counting is easy to implement. JVMs have had a lot of money sunk into competing implementations, so it shouldn't be surprising that they implement very good solutions to very difficult problems. However, it's becoming increasingly easy to target your favourite language at the JVM.
The latest Sun Java VM actually has multiple GC algorithms which you can tweak. The Java VM specification intentionally omits specifying actual GC behaviour to allow different (and multiple) GC algorithms for different VMs.
For example, for all the people who dislike the "stop-the-world" approach of the default Sun Java VM GC behaviour, there are VMs such as IBM's WebSphere Real Time which allow real-time applications to run on Java.
Since the Java VM spec is publicly available, there is (theoretically) nothing stopping anyone from implementing a Java VM that uses CPython's GC algorithm.
Late in the game, but I think one significant rationale for RC in Python is its simplicity. See this email by Alex Martelli, for example.
(I could not find a link outside Google cache; the email dates from 13th October 2005 on the Python list).
Related
Is there any way to change the frequency of the garbage collector, whether to reduce it or to increase it?
I found some articles that say that in order to reduce the frequency, I need to increase the size of the young generation so that more objects fit into it before a GC is triggered.
But I didn't find anywhere a real way to do it, with actual commands or instructions on HOW to make it happen (to reduce or to increase GC frequency).
You first configure which garbage collector you want to use, then you may be able to configure said garbage collector.
Whatever you read about it is irrelevant / invalid / misleading / vastly oversimplified. Garbage Collection is extremely complicated, specifically so complicated that talking about 'frequency' doesn't really make sense. 20 years ago garbage collection was simple:
Freeze every thread.
Make a list of all live objects.
Tree-walk these objects to find all objects reachable from those live objects, and keep walking.
Start from position 0 in the heap memory allocated to the JVM and start moving every object still reachable, updating all pointers as you go, and therefore silently overwriting all non-reachables.
Now you're done; memory is nicely compacted, lots of free space, unfreeze the world.
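The old-style cycle described in the steps above can be sketched roughly as follows. This is a toy model under loud assumptions: the heap is a Python list, "pointers" are list indices, and the function name is made up. A real collector also has to freeze and resume threads, which is left out here.

```python
def mark_compact(heap, roots):
    """Compact `heap` in place and return the updated root indices.

    heap  : list of objects; each object is a dict with a "refs" list holding
            heap indices of the objects it points to.
    roots : heap indices referenced from outside the heap (stacks, statics).
    """
    # 1. Mark: walk everything reachable from the roots.
    live = set()
    stack = list(roots)
    while stack:
        i = stack.pop()
        if i not in live:
            live.add(i)
            stack.extend(heap[i]["refs"])

    # 2. Compute new addresses: live objects slide toward index 0 in order.
    new_index = {old: new for new, old in enumerate(sorted(live))}

    # 3. Move objects and rewrite every pointer to the new addresses.
    compacted = []
    for old in sorted(live):
        obj = heap[old]
        obj["refs"] = [new_index[r] for r in obj["refs"]]
        compacted.append(obj)
    heap[:] = compacted            # unreachable objects are simply dropped
    return [new_index[r] for r in roots]
```

Nothing ever visits the dead objects individually. They vanish because the live ones are packed over them.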
That model? It died 20 years ago. Garbage collection is vastly more complicated now, with aspects like:
Live tracking: Where the JVM uses heuristic mechanisms to be able to fast-collect a subset of garbage. (basically, reference counting; if the refcount is 0, it's definitely garbage. However, non-0 refcounts could also be garbage, for example if a refers to b, b refers to a, and nothing 'live' refers to either: Both refcounts are 1, but they're still garbage). These garbage collectors still collect them, just, not as quickly as refcount-0 garbage. What does 'frequency' mean now?
Generations, with vastly different approaches between generations. For example, your basic eden/'fast garbage' system works in reverse: A java thread gets a page worth of memory, new objects are created here, completely unreachable by any other thread. Once it is full, the system does a quick check on what this and only this thread can currently reach in context, makes a new page, copies over just the objects still reachable, and marks the old page as free. "Free garbage collection" just occurred. What the heck would 'frequency' mean here? There is nothing to configure: When the page is full, this process kicks in. Until the page is full, it doesn't. There's nothing to configure.
That's just 2 of about 50 things that garbage collectors do that cannot be described simply as a thing to which the term 'frequency' can be applied unambiguously.
Every JDK version sees pretty massive changes to the GC implementations available, the way these implementations work, and even the settings they support. None of it is part of the core Java spec, which means that the OpenJDK team is far more cavalier about changing them between Java releases, and for the same reason, alternate JDK providers like Azul, Corretto etc. often provide extra GC impls and extra settings.
So what do I do?
Stop worrying. The general rule of thumb is: If you mess with GC settings, you'll make everything worse. Get an expert if you need to tweak GC settings, and rest safe in the knowledge that it is highly unlikely you need it.
Forget about what you read. It's outdated information.
I have a memory leak in Java in which I have 9600 ImapClients in my heap dump and only 7800 MonitoringTasks. This is a problem since every ImapClient should be owned by a MonitoringTask, so those extra 1800 ImapClients are leaked.
One problem is I can't isolate them in the heap dump and see what's keeping them alive. So far I've only been able to pinpoint them by using external evidence to guess at which ImapClients are dangling. I'm learning OQL which I believe can solve this but it's coming slowly, and it'll take a while before I can understand how to perform something recursive like this in a new query language.
Determining a leak exists is difficult, so here is my full situation:
This process was spewing OOMEs a week ago. I thought I fixed it, and I'm trying to verify whether my fix worked without waiting another full week to see if it spews OOMEs again.
This task creates 7000-9000 ImapClients on start then under normal operation connects and disconnects very few of them.
I checked another process running older pre-OOME code, and it showed numbers of 9000/9100 instead of 7800/9600. I do not know why old code will be different from new code but this is evidence of a leak.
The point of this question is so I can determine if there is a leak. There is a business rule that every ImapClient should be referenced by a MonitoringTask. If this query I am asking about comes up empty, there is not a leak. If it comes up with objects, together with this business rule, it is not only evidence of a leak but conclusive proof of one.
Your expectations are incorrect; there is no actual evidence of any leaks occurring
The Garbage Collector's goal is to free space when it is needed and
only then, anything else is a waste of resources. There is absolutely
no benefit in attempting to keep as much free space as possible
available all the time and only down sides.
Just because something is a candidate for garbage collection doesn't
mean it will ever actually be collected, and there is no way to
force garbage collection either.
I don't see any mention of OutOfMemoryError anywhere.
What you are concerned about you can't control, not directly anyway
What you should focus on is what is in your control, which is making sure you don't hold on to references longer than you need to, and that you are not duplicating things unnecessarily. The garbage collection routines in Java are highly optimized, and if you learn how their algorithms work, you can make sure your program behaves in the optimal way for those algorithms to work.
Java Heap Memory isn't like manually managed memory in other languages, those rules don't apply
What are considered memory leaks in other languages aren't the same thing/root cause as in Java with its garbage collection system.
Most likely in Java memory isn't consumed by one single uber-object that is leaking ( dangling reference in other environments ).
Intermediate objects may be held around longer than expected by the garbage collector because of the scope they are in and lots of other things that can vary at run time.
EXAMPLE: the garbage collector may decide that there are candidates for collection, but because there is still plenty of memory available, it may consider it too expensive time-wise to flush them out at that point and wait until memory pressure gets higher.
The garbage collector is really good now, but it isn't magic, if you are doing degenerate things, it will cause it to not work optimally. There is lots of documentation on the internet about the garbage collector settings for all the versions of the JVMs.
These unreferenced objects may simply not have been around long enough for the garbage collector to decide to expunge them from memory, or references to them may be held by some other object (a List, for example) that you don't realize still points to them. This is what is most commonly referred to as a leak in Java; more specifically, it is a reference leak.
I don't see any mention of OutOfMemoryError
You probably don't have a problem in your code, the garbage collection system just might not be getting put under enough pressure to kick in and deallocate objects that you think it should be cleaning up. What you think is a problem probably isn't, not unless your program is crashing with OutOfMemoryError. This isn't C, C++, Objective-C, or any other manual memory management language / runtime. You don't get to decide what is in memory or not at the detail level you are expecting you should be able to.
Check your code for finalizers, especially anything relating to ImapClient.
It could be that your MonitoringTasks are being collected easily, whereas your ImapClients have finalizers and therefore stay on the heap (though dead) until the finalizer thread runs.
The obvious answer is to add a WeakHashMap<X, Object> (and Y) to your code -- one tracking all instances of X and another tracking all instances of Y (make them static members of the class and insert every object into the map in the constructor with a null 'value'). Then you can at any time iterate over these maps to find all live instances of X and Y and see which Xs are not referenced by Ys. You might want to trigger a full GC first, to ignore objects that are dead and not yet collected.
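Here is a rough sketch of that registry idea, written in Python to match the other code in this thread rather than Java. weakref.WeakSet stands in for the WeakHashMap, and the `client` field on MonitoringTask is an assumption about how a task holds its client, not something taken from the real classes.

```python
import gc
import weakref


class ImapClient:
    _instances = weakref.WeakSet()      # registry that does not keep clients alive

    def __init__(self):
        ImapClient._instances.add(self)


class MonitoringTask:
    _instances = weakref.WeakSet()

    def __init__(self, client):
        self.client = client            # hypothetical owning field
        MonitoringTask._instances.add(self)


def unowned_clients():
    """Return live ImapClients that no live MonitoringTask refers to."""
    gc.collect()                        # drop dead-but-uncollected objects first
    owned = {task.client for task in MonitoringTask._instances}
    return [c for c in ImapClient._instances if c not in owned]
```

Anything this returns is reachable from somewhere yet has no owning task, which is the population the question is trying to isolate. It does not say what is holding those clients, only which ones they are.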
I understand that in Java, if an object doesn't have any references to it any more, the garbage collector will reclaim it back some time later.
But how does the garbage collector know whether an object still has references to it?
Does the garbage collector use some kind of hashmap or table?
Edit:
Please note that I am not asking how GC works in general. Really, I am not asking that.
I am asking specifically how the GC efficiently knows which objects are live and which are dead.
That's why I ask whether the GC maintains some kind of hashmap or set, and constantly updates the number of references each object has.
A typical modern JVM uses several different types of garbage collectors.
One type that's often used for objects that have been around for a while is called Mark-and-Sweep. It basically involves starting from known "live" objects (the so-called garbage collection roots), following all chains of object references, and marking every reachable object as "live".
Once this is done, the sweep stage can reclaim those objects that haven't been marked as "live".
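A minimal sketch of the two phases, assuming a toy heap modelled as a dict from object name to the names it references (the function name and the representation are invented for illustration, not how a JVM stores objects):

```python
def mark_and_sweep(heap, roots):
    """heap: dict name -> list of referenced names. Returns the set swept."""
    marked = set()
    worklist = list(roots)
    while worklist:                      # mark phase: follow every reference chain
        name = worklist.pop()
        if name not in marked:
            marked.add(name)
            worklist.extend(heap[name])
    garbage = set(heap) - marked         # sweep phase: whatever was never marked
    for name in garbage:
        del heap[name]
    return garbage
```

Note that no per-object reference counts or lookup tables of "who points at me" are needed. Two objects that only reference each other are swept like any other unmarked object.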
For this process to work, the JVM has to know the location in memory of every object reference. This is a necessary condition for a garbage collector to be precise (which Java's is).
Java has a variety of different garbage collection strategies, but they all basically work by keeping track of which objects are reachable from known active objects.
A great summary can be found in the article How Garbage Collection works in Java but for the real low-down, you should look at Tuning Garbage Collection with the 5.0 Java[tm] Virtual Machine
An object is considered garbage when it can no longer be reached from any pointer in the running program. The most straightforward garbage collection algorithms simply iterate over every reachable object. Any objects left over are then considered garbage. The time this approach takes is proportional to the number of live objects, which is prohibitive for large applications maintaining lots of live data.
Beginning with the J2SE Platform version 1.2, the virtual machine incorporated a number of different garbage collection algorithms that are combined using generational collection. While naive garbage collection examines every live object in the heap, generational collection exploits several empirically observed properties of most applications to avoid extra work.
The most important of these observed properties is infant mortality. ...
I.e. many objects like iterators only live for a very short time, so younger objects are more likely to be eligible for garbage collection than much older objects.
For more up to date tuning guides, take a look at:
Java SE 6 HotSpot[tm] Virtual Machine Garbage Collection Tuning
Java Platform, Standard Edition HotSpot Virtual Machine Garbage Collection Tuning Guide (Java SE 8)
Incidentally, be careful about trying to second-guess your garbage collection strategy. I've seen the performance of many a program trashed by overzealous use of System.gc() or inappropriate -XX options.
The GC will work out that an object can be removed as soon as it is able to. You are not expected to manage this process.
But you can ask GC very politely to run using System.gc(). It is just a tip to the system. GC does not have to run at that moment, it does not have to remove your specific object etc. Because GC is the BIG boss and we (Java programmers) are just its slaves... :(
The truth is that the garbage collector does not, in general, quickly know which objects no longer have any incoming references. And, in fact, an object can be garbage even when there are incoming references to it.
The garbage collector uses a traversal of the object graph to find the objects that are reachable. Objects that are not reached in this traversal are deemed garbage, even if they are part of a cycle of references. The delay between an object being unreachable, and the garbage collector actually collecting the object, could be arbitrarily long.
There is no efficient way - it still requires a traversal of the heap. But there is a hacky way: divide the heap into smaller pieces, so there is no need to scan the entire heap. This is the reason we have generational garbage collectors: scanning takes less time.
This is relatively "easy" to answer when your entire application is stopped and you can analyze the graph of objects. It all starts from GC roots (I'll let you find the documentation for what these are), but basically these are "roots" that are not collected by the GC.
From here a scan starts that finds the "live" objects: objects that have a direct (or transitive) connection to these roots and are thus not reclaimable. In graph theory this is known as "coloring/traversing" your graph using 3 colors: black, grey and white. White means not (yet) known to be connected to the roots, grey means reached but its sub-graph is not yet traversed, black means traversed and connected to the roots. So to know what exactly is dead/alive right now, you take your heap, which is all white initially, and color everything reachable from the roots black. Everything that is still white afterwards is garbage. It is interesting that "garbage" is really identified by a GC by knowing what is actually alive. There are some drawings to visualize this here for example.
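The three-color scheme can be sketched like this for the simple, fully stopped case. The graph representation (a dict of successor lists) and the function name are assumptions made for the example.

```python
from collections import deque

WHITE, GREY, BLACK = "white", "grey", "black"


def tricolor_mark(graph, roots):
    """graph: dict node -> list of successors. Returns node -> final color."""
    color = {node: WHITE for node in graph}   # everything starts white
    grey = deque()
    for r in roots:
        color[r] = GREY                        # roots are reachable by definition
        grey.append(r)
    while grey:
        node = grey.popleft()
        for child in graph[node]:
            if color[child] == WHITE:          # first time seen: needs scanning
                color[child] = GREY
                grey.append(child)
        color[node] = BLACK                    # all of its references are scanned
    return color
```

When the grey queue is empty there are only black and white nodes left, and the white ones are the garbage.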
But this is the simple scenario: when your application is entirely stopped (for seconds at times) and you can scan the heap. This is called a STW - stop the world event and people hate these usually. This is what parallel collectors do: stop everything, do whatever GC has to (including finding garbage), let the application threads start after that.
What happens when your app is running and you are scanning the heap? Concurrently? G1/CMS do this. Think about it: how can you reason about a leaf of a graph being alive or not when your app can change that leaf via a different thread?
Shenandoah, for example, solves this by "intercepting" changes to the graph. While running concurrently with your application, it catches all the changes and inserts them into special thread-local queues, called SATB queues (snapshot-at-the-beginning queues), instead of altering the heap directly. When that is finished, a very short STW event occurs and these queues are drained. Still under the STW, what that drain has "caused" is computed, i.e. extra coloring of the graph. This is far simplified, just FYI. G1 and CMS do it differently AFAIK.
So in theory, the process is not really that complicated, but implementing it concurrently is the most challenging part.
I've developed a web application which processes a huge amount of data and takes a lot of time to complete.
So now I am profiling my application, and I noticed one very bad thing about GC.
When a full GC occurs, it stops all processing for 30-40 seconds.
I wonder if there is any way to improve this. I don't want to waste that much of my CPUs' time on GC alone. Below are some details that may be useful:
I am using Java 1.6.0.23
My Application takes 20 GB max memory.
A full GC occurs every 14 minutes.
Memory Before GC is 20 GB and after GC is 7.8 GB
Memory used by the process (i.e. as shown in Task Manager) is 41 GB.
After the process completes (JVM is still running), used memory is 5 GB and free memory is 15 GB.
There are many algorithms that modern JVMs use for garbage collection. Some algorithms, such as reference counting, are very fast, and some, such as memory copying, are slow. You could change your code to help the JVM use the faster algorithms most of the time.
One of the fastest algorithms is reference counting. As the name suggests, it counts references to an object; when the count reaches zero, the object is ready for garbage collection, and the reference counts of the objects it referenced are then decreased.
To help the JVM use this algorithm, avoid circular references (object A references B, then B references C, C references D, ..., and Z references A again), because even when the whole object graph is unreachable, none of the objects' reference counters reaches zero.
You can break the circle when you no longer need the objects in it (by assigning null to one of the references)....
If you use a 64-bit architecture, add:
-XX:+UseCompressedOops (64-bit addresses are compressed to 32 bits)
Use G1GC instead of CMS:
-XX:+UseG1GC (it collects in incremental steps)
Set the same initial and max size: -Xms5g -Xmx5g
Tune parameters (just example):
-XX:MaxGCPauseMillis=100 -XX:GCPauseIntervalMillis=1000
See Java HotSpot VM Options Performance Options
Either improve the app by reusing resources, or invoke System.gc() yourself in some critical regions of the app (which is not guaranteed to help you). Most likely you have a memory leak somewhere that you have to investigate, and then restructure the code accordingly.
The fewer things you new, the fewer things need to be collected.
Suppose you have class A.
You can include in it a reference to another instance of class A.
That way you can make a "free list" of instances of A.
Whenever you need an A, just pop one off the free list.
If the free list is empty, then new one.
When you no longer need it, push it on the free list.
This can save a lot of time.
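The free-list idea looks roughly like this. It is sketched in Python to stay consistent with the other code in this thread (in Java the shape would be the same), and the field and method names are made up. The sketch is not thread-safe.

```python
class A:
    """Hypothetical pooled class; `_next_free` links instances on the free list."""

    _free_head = None            # top of the free list (class-wide)

    def __init__(self):
        self.payload = None
        self._next_free = None

    @classmethod
    def acquire(cls):
        """Pop a recycled instance, or construct one if the list is empty."""
        obj = cls._free_head
        if obj is None:
            return cls()
        cls._free_head = obj._next_free
        obj._next_free = None
        return obj

    def release(self):
        """Reset state and push this instance back on the free list."""
        self.payload = None
        self._next_free = A._free_head
        A._free_head = self
```

The caller has to promise not to touch an instance after release(), which is the usual price of pooling: you are back to manual lifetime management for that class.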
The amount of time spent in GC depends on two factors:
How many objects are live (= can be reached from anywhere)
How many dead objects implement finalize()
Objects which can't be reached and which don't use finalize() cost nothing to clean up in Java which is why Java is usually on par with other languages like C++ (and often much better because C++ spends a lot of time to delete objects).
So what you need to do in your app is cut down on the number of objects that survive and/or cut references to objects (that you no longer need) earlier in the code. Example:
When you have a very long method, you will keep all the objects alive that you reference from local variables. If you split that method in many smaller methods, the references will be lost faster and the GC won't have to deal with those objects.
If you put everything that you might need in huge hash maps, the maps will keep all those instances alive until your code completes. So even when you don't need those anymore, the GC will still have to spend time on them.
Does the Java virtual machine ever move objects in memory, and if so, how does it handle updating references to the moved object?
I ask because I'm exploring an idea of storing objects in a distributed fashion (i.e. across multiple servers), but I need the ability to move objects between servers for efficiency reasons. Objects need to be able to contain pointers to each other, even to objects on remote servers. I'm trying to think of the best way to update references to moved objects.
My two ideas so far are:
Maintain a reference indirection somewhere that doesn't move for the lifetime of the object, which we update if the object moves. But - how are these indirections managed?
Keep a list of reverse-references with each object, so we know what has to be updated if the object is moved. Of course, this creates a performance overhead.
I'd be interested in feedback on these approaches, and any suggestions for alternative approaches.
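For the first idea, a minimal sketch of such an indirection (handle) table might look like this. Everything here is hypothetical: the class, the (server, slot) location format and the method names are only there to show the shape of the approach. It deliberately ignores the hard part, which is where the table itself lives and how it is kept consistent across servers.

```python
class HandleTable:
    """Maps stable handles to the current (server, slot) location of an object."""

    def __init__(self):
        self._location = {}      # handle -> (server, slot)
        self._next_handle = 0

    def register(self, server, slot):
        """Create a handle that stays valid for the lifetime of the object."""
        handle = self._next_handle
        self._next_handle += 1
        self._location[handle] = (server, slot)
        return handle

    def move(self, handle, server, slot):
        """Relocate the object: one table update, no referrer is touched."""
        self._location[handle] = (server, slot)

    def resolve(self, handle):
        """Look up where the object currently lives."""
        return self._location[handle]
```

Objects store handles instead of addresses, so a move costs one update regardless of how many referrers exist. The trade-off is an extra lookup on every dereference.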
In reference to the comment above about walking the heap.
Different GCs do it in different ways.
Typically, when copying collectors walk the heap, they don't walk all of the objects in it. Rather, they walk the LIVE objects in the heap. The implication is that if an object is reachable from the "root" object, it is live.
So, at this stage it has to touch all of the live objects anyway, as it copies them from the old heap to the new heap. Once the copy of the live objects is done, all that remains in the old heap are either objects already copied, or garbage. At that point the old heap can be discarded completely.
The two primary benefits of this kind of collector are that it compacts the heap during the copy phase, and that it only copies living objects. This is important to many systems because with this kind of collector, object allocation is dirt cheap, literally little more than incrementing a heap pointer. When GC happens, none of the "dead" objects are copied, so they don't slow the collector down. It also turns out in dynamic systems that there's a lot more little, temporary garbage, than there is long standing garbage.
Also, by walking the live object graph, you can see how the GC can "know" about every object, and keep track of them for any address adjustment purposes performed during the copy.
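A rough sketch of that copy-and-fix-up process, in the style of Cheney's algorithm. It assumes a toy heap where objects are dicts and references are list indices. The `forward` map plays the role of the forwarding addresses a real collector leaves behind in the old heap.

```python
def copy_collect(from_space, roots):
    """Copy live objects out of from_space; return (to_space, new_roots).

    from_space: list of dicts with a "refs" list of from_space indices.
    """
    to_space = []
    forward = {}                          # old index -> new index

    def evacuate(old):
        """Copy one object if not yet copied, and return its new address."""
        if old not in forward:
            forward[old] = len(to_space)  # allocation is just a pointer bump
            to_space.append(dict(from_space[old]))  # refs still hold old addresses
        return forward[old]

    new_roots = [evacuate(r) for r in roots]
    scan = 0
    while scan < len(to_space):           # scan copied objects, fixing their refs
        obj = to_space[scan]
        obj["refs"] = [evacuate(r) for r in obj["refs"]]
        scan += 1
    return to_space, new_roots
```

Dead objects are never looked at, and every reference gets rewritten at the moment the collector follows it, which is how it "knows" what to update.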
This is not the forum to talk deeply about GC mechanics, as it's a non-trivial problem, but that's the basics of how a copying collector works.
A generational copying GC will put "older" objects in different heaps, and those end up being collected less often than "newer" heaps. The theory is that the long lasting objects get promoted to older generations and get collected less and less, improving overall GC performance.
The keyword you're after is "compacting garbage collector". JVMs are permitted to use one, meaning that objects can be relocated. Consult your JVM's manual to find out whether yours does, and to see whether there are any command-line options which affect it.
The conceptually simplest way to explain compaction is to assume that the garbage collector freezes all threads, relocates the object, searches heap and stack for all references to that object, and updates them with the new address. Actually it's more complex than that, since for performance reasons you don't want to perform a full sweep with threads stalled, so an incremental garbage collector will do work in preparation for compaction whenever it can.
If you're interested in indirect references, you could start by researching weak and soft references in Java, and also the remote references used by various RPC systems.
I'd be curious to know more about your requirements. As another answer suggests, Terracotta may be exactly what you are looking for.
There is a subtle difference however between what Terracotta provides, and what you are asking for, thus my inquiry.
The difference is that as far as you are concerned, Terracotta does not provide "remote" references to objects - in fact the whole "remote" notion of RMI, JMS, etc. is entirely absent when using Terracotta.
Rather, in Terracotta, all objects reside in a large virtual heap. Threads, whether on Node 1, Node 2, Node 3, Node 4, etc., all have access to any object in the virtual heap.
There's no special programming to learn and no special APIs; objects in the "virtual" heap have exactly the same behavior as objects in the local heap.
In short, what Terracotta provides is a programming model for multiple JVMs that operates exactly the same as the programming model for a single JVM. Threads in separate nodes simply behave like threads in a single node - object mutations, synchronized, wait, notify all behave exactly the same across nodes as across threads - there's no difference.
Furthermore, unlike any solution to come before it, object references are maintained across nodes - meaning you can use ==. It's all a part of maintaining the Java Memory Model across the cluster which is the fundamental requirement to make "regular" Java (e.g. POJOs, synchronized, wait/notify) work (none of that works if you don't / can't preserve object identity across the cluster).
So the question comes back to you to further refine your requirements - for what purpose do you need "remote" pointers?
(Practically) Any garbage collected system has to move objects around in memory to pack them more densely and avoid fragmentation problems.
What you are looking at is a very large and complex subject. I'd suggest you read up on existing remote object style API's: .NET remoting and going further back technologies like CORBA
Any solution for tracking the references will be complicated by having to deal with all the failure modes that exist in distributed systems. The JVM doesn't have to worry about suddenly finding it can't see half of its heap because a network switch glitched.
When you drill into the design I think a lot of it will come down to how you want to handle different failure cases.
Response to comments:
Your question talks about storing objects in a distributed fashion, which is exactly what .NET remoting and CORBA address. Admittedly neither technology supports migration of these objects (AFAIK). But they both deal extensively with the concepts of object identity which is a critical part of any distributed object system: how do different parts of the system know which objects they are talking about.
I am not overly familiar with the details of the Java garbage collector, and I'm sure the Java and .NET garbage collectors have a lot of complexity in them to achieve maximum performance with minimum impact on the application.
However, the basic idea for garbage collection is:
The VM stops all threads from running managed code
It performs a reachability analysis from the set of known 'roots': static variables, local variables on all the threads. For each object it finds it follows all references within the object.
Any object not identified by the reachability analysis is garbage.
Objects that are still alive can then be moved down in memory to pack them densely. This means that any references to these objects also have to be updated with the new address. By controlling when a garbage collect can occur the VM is able to guarantee that there are no object references 'in-the-air' (ie. being held in a machine register) that would cause a problem.
Once the process is complete the VM starts the threads executing again.
As a refinement of this process the VM can perform generational garbage collection, where separate heaps are maintained based on the 'age' of an object. Objects start in heap 0, and if they survive several GCs they migrate to heap 1 and eventually to heap 2 (and so on - .NET supports 3 generations only though). The advantage of this is that the GC can run heap 0 collections very frequently, and not have to worry about doing the work to prove the long lived objects (which have ended up in heap 2) are still alive (which they almost certainly are).
There are other refinements to support concurrent garbage collection, and details around threads that are actually executing unmanaged code when the GC is scheduled that add a lot more complexity to this area.
Sounds like you are looking for a distributed cache, something like Terracotta or Oracle's Java Object Cache (formerly Tangosol).
If you are willing to go that deep, you can take a look at the JBoss Cache architecture docs and grab some of its source code as a reference.
This is not exactly what you described, but it works very similarly.
Here's the link.
http://www.jboss.org/jbosscache/
I hope this helps.