Can someone give me some advice on this? I am reading in an old text and some notes from my teacher that when using multiple threads with Java it's necessary to write a special program for garbage collection.
Does this still apply in Java SE6 and above? If it does could someone provide the standard way to do this.
Using a garbage collector makes writing multi-threaded code easier. This is because manual freeing of resources in a multi-threaded context is hard to get right. With GC its something you don't need to worry about most of the time.
I am reading that when using multiple threads it's necessary to write a special program for garbage collection.
I don't believe this was ever the case.
Does this still apply in SE6 and above and if so is there a standard way to do this.
The standard way to do this is to not reference objects you don't need. e.g. if you have a local variable you don't need, let it drop out of scope.
It doesn't have to be complicated.
As far as I know, as long if nothing is pointing to an object, that object get's freed by the garbage collector.
Java's garbage collector is very robust in terms of circular referencing, I don't see why It won't work with multiple threads running at the same time.
So it is safe for you to assume that you don't need to write a special program for garbage collection, because java will do it for you very effectively.
If you want to free objects in java, just make sure that no variables are referencing your object. (Including structures (lists, arrays, etc) from java collections or other libraries)
This article from JavaWorld in 2003, J2SE 1.4.1 boosts garbage collection, has this to say about the Java garbage collection prior to J2SE 1.4.1:
Mark and sweep is a "stop-the-world" garbage collection technique;
that is, all application threads stop until garbage collection
completes, or until a higher-priority thread interrupts the garbage
collector. If the garbage collector is interrupted, it must restart,
which can lead to application thrashing with little apparent result.
The other problem with mark and sweep is that many types of
applications can't tolerate its stop-the-world nature. That is
especially true of applications that require near real-time behavior
or those that service large numbers of transaction-oriented clients.
An article in Dr. Dobbs from 2009, G1: Java's Garbage First Garbage Collector, has this to say about Java garbage collector before SE 6.
Until recently, Java SE came with two main collectors: the parallel
collector, and the concurrent-mark-sweep (CMS) collector -- see the
sidebar Parallelism and Concurrency. As of the latest Java SE 6 update
release, the G1 collector is another option. The plan is for G1 to
eventually replace CMS as a low-pause, soft real-time collector. Let's
take a look at how it works.
So it may be that prior to SE 6 some additional precautions to assist with Java garbage collection may have helped, especially with multi-threaded applications with a fair amount of temporary variables generating garbage that needed collecting. However this should entail at most an explicit call to the garbage collector during slow times. Writing something special would seem very unusual.
However things are much more improved than they were. Plus garbage collection can vary between different versions of Java Virtual Machines.
So what may have been true years ago is almost definitely not true now with current technology.
This posting, How to monitor Java memory usage?, discusses monitoring Java memory usage as well as some of the pros and cons of calling the garbage collector explicitly.
Oracle has a Java Garbage Collection Basics tutorial that covers Java SE 7 Hotspot JVM.
Use following code to call garbage collector explicitly
Runtime runtime = Runtime.getRuntime();
runtime.gc();
But it is not needed, jvm will automatically handle correct timely running of GC.
Almost certainly your instructor's notes are stating (correctly) that since Java is a multithreaded environment, more care is needed when implementing the garbage collector inside the Java run time environment than would be necessary if only a single thread were involved. This is true of any multithreaded environment.
As others have said, you the programmer don't see any of this complexity. That's the gift of automatic memory management that gc provides.
Related
I am trying to understand the garbage collector method.
"In Java, your program must call the garbage collector method
frequently; otherwise your program will be overrun with garbage
objects"
I believe this is false because the program should do this automatically, running in the background correct?
Correct. In fact, there is no garbage-collector method (System.gc() is a hint that now might be a good time to garbage collect, but it's nothing more). The JVM, if it implements garbage collection (and all Java SE and Java EE ones do), will collect based on its own rules, which usually include concurrently cleaning up short-lived objects, and doing a major collection when memory starts getting low or fragmented.
While in general calling System.gc() is considered bad, bad, bad, very bad practice, in some cases this may prevent the out of memory error. This is mostly when you allocate lots of objects in a very rapid succession and have time moments when many of these are no longer referenced and can be dropped. While System.gc() is just a hint, sometimes the system may need this hint. Alternatively you may rethink you algorithm.
These situations are, however, increasingly rare with the newer Java versions; used to be more frequent in the past. The official point of view is never call garbage collector manually.
The method System.gc() does not necessarily call the Garbage Collector. Garbage Collection is dependent on JVM implementation. For Example a Linux JVM may actually call GC when you do System.gc() a Windows JVM may not.
Lets say you have too many unreachable objects in memory but you still have enough Heap space left so the JVM may decide not to run the GC thread. So conclusion "There is no guaruntee that the GC thread will run when you call System.gc() and it is not necessary to call it."
This is one of Java's main advantage the developer need not worry about Memory management and stuff. unlike C++ where you have to do something like
free(obj);
This question already has answers here:
Why is it bad practice to call System.gc()?
(13 answers)
Closed 9 years ago.
My application has lot of iterations. Till now I haven't faced any memory issues. But from code level I can suspect there are few places, which causes memory leaks and out of memory problem. I am thinking of calling garbage collector manually. Is it good practice to call Garbage Collector manually?
You can call Garbage collector using:
System.gc();
But this does not mean that it'll be executed immediately. The JVM decides when to execute it. In general if the JVM is about to throw an OutOfMemoryError, calling System.gc() won't prevent it. Better investigate why you're leaking so much memory and clean it up along the way.
JavaDoc:
Calling the gc method suggests that the Java Virtual Machine expend
effort toward recycling unused objects in order to make the memory
they currently occupy available for quick reuse. When control returns
from the method call, the Java Virtual Machine has made a best effort
to reclaim space from all discarded objects
Is it good practice to call Garbage Collector manually?
No, it is definitely not good practice.
You can use System.gc(). Note that this is not guaranteed to call the garbage collector - it only gives a hint to the system that it might be a good idea to do garbage collection.
The garbage collector in Oracle's JVM contains a lot of sophisticated logic to determine when and what to cleanup. Tuning it requires knowledge about details on how it works. Just putting a System.gc() somewhere in your program is not likely to help a lot, and in fact, it can even make it worse.
See Java SE 6 HotSpot Virtual Machine Garbage Collection Tuning for details on how to tune garbage collection with Java SE 6.
Yes you can explicitly call garbage collector using
System.gc();
But what happens is that you can't order JVM to do garbage collection immediately. JVM decides on its own when to garbage collect.Hence its not a good idea of calling it manually.
Moreover regarding OutOfMemoryException doing manual garbage collection won't help you prevent the exception, since JVM throws this exception after reclaiming all the memory it can. It has some very sophisticated algorithms to determine when and how to do perform the garbage collection.
So what i suggest is that if you are getting OutOfMemoryException then recheck your program make it more efficient or increase heap space.
You can call Garbage Collector explicitly, but JVM decides whether to process the call or not.
Ideally, you should never write code dependent on call to garbage collector.
JVM internally uses some algorithm to decide when to make this call. When you make call using System.gc(), it is just a request to JVM and JVM can anytime decide to ignore it.
Regardless of whether you can or cannot manually trigger the garbage collector (and the different levels of collection), and what impact this has on performance (which indeed is a topic worth discussion), it will not prevent OutOfMemoryErrors, because when the JVM is about to run out of memory, it does the most thorough collection it can anyway. Only if after that collection not enough memory is available, will it error out. Even if you trigger the collection earlier yourself, the result (the amount of memory reclaimed) is the same.
Memory leaks cannot be fixed by running garbage collection more often.
They have to be fixed in your program (stop referencing things you don't need anymore earlier), or (worst case, if it is a "real" leak) in the JVM or runtime library itself (but genuine memory management bugs should not exist anymore after so many years of service).
Calling System.gc() does not guarantee any GC. Garbage is collected if there is a real need for memory.
You can have a look at different garbage collectors here.
http://javarevisited.blogspot.in/2011/04/garbage-collection-in-java.html
You can include any of these GCs in your command line params as per your needs.
When building a system which needs to respond very consistently and fast, is having a garbage collector a potential problem?
I remember horror stories from years ago where the typical example always was an action game where your character would stop for a few seconds in mid-jump, when the garbage collector would do its cleanup.
We are some years further, but I'm wondering if this is still an issue. I read about the new garbage collector in .Net 4, but it still seems a lot like a big black box, and you just have to trust everything will be fine.
If you have a system which always has to be quick to respond, is having a garbage collector too big of a problem and is it better to chose for a more hardcore, control it yourself language like c++? I would hate it that if it turns out to be a problem, that there is basically almost nothing you can do about it, other than waiting for a new version of the runtime or doing very weird things to try and influence the collector.
EDIT
thanks for all the great resources. However, it seems that most articles/custom gc's/solutions pertain to the Java environment. Does .Net also have tuning capabilities or options for a custom GC?
To be precise, garbage collectors are a problem for real-time systems. To be even more precise, it is possible to write real-time software in languages that have automatic memory management.
More details can be found in the Real Time Specification for Java on one of the approaches for achieving real-time behavior using Java. The idea behind RTSJ is very simple - do not use a heap. RTSJ provides for new varieties of Runnable objects that ensure threads do not access heap memory of any kind. Threads can either access scoped memory (nothing unusual here; values are destroyed when the scope is closed) or immortal memory (that exists throughout the application lifetime). Variables in the immortal memory are written over, time and again with new values.
Through the use of immortal memory, RTSJ ensures that threads do not access the heap, and more importantly, the system does not have a garbage collector that preempts execution of the program by the threads.
More details are available in the paper "Project Golden Gate: Towards Real-Time Java in Space Missions" published by JPL and Sun.
I've written games in Java and .NET and never found this to be a big problem. I expect your "horror stories" are based on the garbage collectors of many years ago - the technology really has moved a long way since then.
The only thing I would hesitate to use Java/.NET for on the the basis of garbage collection would be something like embedded programming with hard real time constraints (e.g. motion controllers).
However you do need to be aware of GC pauses and all of the following can be helpful in minimising the risk of GC pauses:
Minimise new object allocations - while object allocations are extremely fast in modern GC systems, they do contribute to future pauses so should be minimised. You can use techniques like pre-allocating arrays of objects, keeping object pools or using unboxed primitives.
Use specialized low-latency libraries such as Javalution for heavily used functions and data types. These are designed specifically for real-time / low latency application
Make sure you are using the best GC algorithm when there are multiple versions available. I've heard good things about the Sun G1 Collector for low latency applications. The best GC systems do most of their collections concurrently so that garbage collections do not have to "stop the world" for very long if at all.
Tune the GC parameters appropriately. Usually there is a trade-off between overall performance and pause times, you may want to improve the latter at the expense of the former.
If you're very rich, you can of course buy machines with hardware GC support. :-)
Yes, garbage must be handled in a deterministic manner in real-time systems.
One approach is to schedule a certain amount of garbage collection time during each memory allocation. This is called "work-based garbage collection." The idea is that in the absence of leaks, allocation and collection should be proportional.
Another simple approach ("time-based garbage collection") is to schedule a certain proportion of time for periodic garbage collection, whether it is needed or not.
In either case, it is possible that a program will run out of usable memory because it is not allowed to spend enough time to do a full garbage collection. This is in contrast to a non-realtime system, which is permitted to pause as long as it needs to in order to collect garbage.
On a theoretical point of view, garbage collectors are not a problem but a solution. Real-time systems are hard, when there is dynamic memory allocation. In particular, the usual C functions malloc() and free() do not offer real-time guarantees (they are normally fast but have, at least theoretically, "worst cases" where they use inordinate amounts of time).
It so happens that it is possible to build a dynamic memory allocator which offers real-time guarantees, but this requires the allocator to do some heavy stuff, in particular moving some objects in RAM. Object moving implies adjusting pointers (transparently, from the application code point of view), and at that point the allocator is just one small step away from being a garbage collector.
Usual Java or .NET implementations do not offer real-time garbage collection, in the sense of guaranteed response times, but their GC are still heavily optimized and have very short response times most of the time. Under normal conditions, very short average response times are better than guaranteed response times ("guaranteed" does not mean "fast").
Also, note that usual Java or .NET implementations run on operating systems which are not real-time either (the OS can decide to schedule other threads, or may aggressively send some data to a swap file, and so on), and neither is the underlying hardware (e.g. a typical hard disk may make "recalibration pauses" on time to time). If you are ready to tolerate the occasional timing glitch due to the hardware, then you should be fine with a (carefully tuned) JVM garbage collector. Even for games.
It is a potential problem, BUT...
Your character might also freeze in the middle of your C++ program while the OS retrieves a page of memory from an overtaxed hard disk. If you are not using a real-time OS on hardware designed to provide concrete performance guarantees, you are never guaranteed performance.
To get a more specific answer, you'd have to ask about a specific implementation of a specific virtual machine. You can use a garbage-collected virtual machine for real-time systems if it provides suitable performance guarantees about garbage collection.
You bet it is a problem. If you are writing low-latency applications you cannot afford the stop-the-world pauses that most garbage collectors impose. Since Java does not allow you to turn off the GC, your only option is to produce no garbage. That can be done and has been done through object pooling and bootstrapping. I wrote a blog article where I talk about this in detail.
Our company is employing a large .Net-based software application that amongst other things monitors binary sensors over fieldbus networks. In some situations, the sensors activate only for a short amount of time (300 ms) but our software still needs to capture those events as the controlled system will immediately fail when an event is missed. We recently observed increased problems at our customer sites due to the garbage collector running for long timespans (up to 1 second). We are still trying to figure out how to enforce a time limit on the garbage collector. In conclusion of this short story, i would say the garbage collector is a handicap in time critical applications.
Recently I heard Kirk Pepperdine speak about changing garbage collectors for better performance -- but what exactly does that mean and what makes one garbage collector better or different than the other?
You ask two questions:
What does it mean to change garbage collectors in Java for better performance?
This is a huge topic, and like some of the other responders, I urge you to do some reading. I recommend Java SE 6 HotSpot[tm] Virtual Machine Garbage Collection Tuning from Sun. The information below mostly comes from there. The "turbo-charging" java article recommended in another answer is older.
In brief, one of the many options we have when running the JVM is to select a garbage collector, of which there are presently three:
The serial collector (selected with the -XX:+UseSerialGC option) - this uses a single thread to do all collection work, and everything waits while it happens.
The parallel collector (selected with the -XX:+UseParallelGC option) - this does minor collections (of the young generation) in parallel, but everything waits during the major collections.
The concurrent collector (selected with the -XX:+UseConcMarkSweepGC option) - this allows most collection operations to happen while the application is running.
What makes one garbage collector better than another?
Your application does. Each of the garbage collectors has a "sweet spot" - a range of application profiles for which it is the superior collector.
First, know that the VM is pretty good at selecting a collector for you, and as with most optimizations, you should not consider second-guessing it until you've identified that your application is not performing well, and that garbage collection is the likely culprit.
In that case, you have to ask these questions: 1) is your app running on a single-processor machine, or multi? 2) Are you more concerned with "minimizing pause time", or with "maximizing throughput"? That is, if you had to choose between the application never pausing but getting less work done overall, versus getting more work done overall, but pausing from time to time, which would you pick?
Roughly speaking, as a starting point:
On a Multi-processor machine, mostly concerned with minimizing pause time, you'd tend to use the Concurrent collector (consider enabling incremental mode)
On a Multi-processor machine, mostly concerned with maximizing throughput, you'd tend to use the Parallel collector (consider enabling parallel compaction)
On a Single-processor machine, with small datasets (up to roughly 100Mb), you'd tend to use the Serial collector
On a Single-processor machine, mostly concerned with maximizing throughput, you'd tend to use the Serial collector
On a Single-processor machine, mostly concerned with minimizing pause time, you'd tend to use the Concurrent collector (consider enabling incremental mode)
Again, though, the VM does a pretty good job of selecting a collector for you, and you're better off not overriding that unless and until you discover that it's not working well enough for your application.
Some collectors are better for throughput, others are better for response time. The difference is usually in how the collector chooses to pause the application. Some such as CMS use mutiple passes to triage the garbage before stopping the application. This triage can happen in a background thread while the application is running, and thus not interfere with your application as much as one that "stops the world" to do a GC.
Edit
Check out this document by sun. Also, about half way down there is a nice image showing the default mark-compact collector against the CMS collector. A picture is worth a thousand words, but the article is a good read too ;) Also worth reading is all the documents on the new G1 collector.
The basic problem is that the way that Java program sees memory (you call "new MyObject" and there it is, and when you are done with it you just forget about it) does not map very well to the underlying operating system and hardware.
The job of the garbage collector is to identify those memory areas which are not in use by an object, and "melt" them together to give a LARGE memory area from where new objects can be allocated. This is very vaguely worded in the Java specification HOW this is done, most likely in order to provide maximum flexibility for the designers of this important task.
Several approaches exist, with advantages and disadvantages. What you usually want is a garbage collector that can keep up in the background with the rate of objects being abandoned, as the only way for it to catch up is to stop the program while catching up. That gives really bad user experiences.
A typical trend for Java objects is that either they live for a very short time (current block or method) or a very long time. Modern garbage collectors deal with this by having multiple pools so that young objects are treated differnetly than old objects.
is garbage collection algorithm in java "vendor implemented?"
From the introduction paragraph to Chapter 3 of the Java Virtual Machine Specification:
For example, the memory layout of
run-time data areas, the
garbage-collection algorithm used, and
any internal optimization of the Java
virtual machine instructions (for
example, translating them into machine
code) are left to the discretion of
the implementor. [emphasis mine]
Yes, and not only that, each JVM can contain more than one garbage collection strategy:
Sun
JRockit
IBM
Definitely vendor dependent. GCJ and the Sun VM use totally different garbage collectors, for example.
Yes. The Java VM Spec's don't say anything specific about garbage collection. Each vendor has their own implementation for performing GC. In fact, each vendor will have multiple GC policies that can be best chosen for a particular task.
Example
A GC tuned for throughput may not be good for real-time systems since they will have erratic (and often longer) pause times which are not predictable. Non-predictability is a killer for real-time application.
Some GC's such as the ones from Oracle and IBM are very tunable and can be tune based on your application's run-time memory characteristics.
The internals of GC are not too complicated at a higher level. Many algorithms that began in the early days of LISP are still in use today.
Read this (http://nd.edu/~dthain/courses/cse40243/spring2006/gc-survey.pdf "GC Introduction") for a good introduction to Garbage Collection at a moderately high-level.
Yes. The Java VirtualMachine Specification don't say anything specific about garbage collection. Each vendor has their own implementation for performing the task.
each can automatically calls garbage collector, then we didn't need manual calls for garbage collection