I am developing a Java desktop application where I have many caches, such as object pools, cached JPanels ...etc.
Example:
When a user switches from one Panel to another, I don't destroy the previous one in case the user switches back.
But the application's memory consumption might get high while the system is in desperate need of the memory that I am holding on to without much justification.
In an iOS application, I would release these in the "applicationDidReceiveMemoryWarning" method. But in Java... ?
So, when is a good time to release cached objects in Java?
Caching often isn't a good idea in Java - creating new objects is usually much cheaper than you think, and often results in better performance than keeping objects cached "just in case". Having a lot of long-lived objects around is bad for GC performance and also can put considerable pressure on processor caches.
JPanels for example - are sufficiently lightweight that it's fine to create a completely new one whenever you need it.
If I were you, I'd cache as little as possible, and only do so when you've proved a substantial performance benefit from doing so.
Even if you do need to cache, consider using a cache that uses Soft References - this way the JVM will be able to clear items from the cache automatically if it needs to free up memory. This is simpler and safer than trying to roll your own caching strategy. You could use an existing Soft Reference cache implementation like Guava's CacheBuilder (thanks AlistairIsrael!).
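To make the soft-reference idea concrete, here is a minimal sketch using only java.lang.ref.SoftReference from the JDK. The class name SoftCache, its loader function and the evict method are hypothetical names made up for this illustration; this is not Guava's implementation, and a production cache would also want to remove map entries whose references have been cleared.

```java
import java.lang.ref.SoftReference;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical helper for illustration; not a JDK or Guava class.
final class SoftCache<K, V> {
    private final Map<K, SoftReference<V>> map = new ConcurrentHashMap<>();
    private final Function<K, V> loader;

    SoftCache(Function<K, V> loader) {
        this.loader = loader;
    }

    V get(K key) {
        SoftReference<V> ref = map.get(key);
        V value = (ref == null) ? null : ref.get();
        if (value == null) {
            // Either never cached, or the GC cleared the soft reference: rebuild.
            value = loader.apply(key);
            map.put(key, new SoftReference<>(value));
        }
        return value;
    }

    // Drops an entry explicitly, e.g. when the cached object is known to be stale.
    void evict(K key) {
        map.remove(key);
    }
}
```

A caller would create it once, for example new SoftCache<String, StringBuilder>(name -> new StringBuilder(name)), and always go through get; the value is rebuilt transparently if the JVM reclaimed it.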
Should Java Objects be reused as often as it can be reused ? Or should we reuse it only when they are "heavyweight", ie have OS resources associated with it ?
All old articles on the internet talk about object reuse and object pooling as much as possible, but I have read recent articles that say new Object() is highly optimized now ( 10 instructions ) and Object reuse is not as big a deal as it used to be.
What is the current best practice and how are you people doing it ?
I let the garbage collector do that kind of deciding for me, the only time I've hit heap limit with freshly allocated objects was after running a buggy recursive algorithm for a couple of seconds which generated 3 * 27 * 27... new objects as fast as it could.
Do what's best for readability and encapsulation. Sometimes reusing objects may be useful, but generally you shouldn't worry about it.
If you use them very intensively and the construction is costly, you should try to reuse them as much as you can.
If your objects are very small, and cheap to create ( like Object ) you should create new ones.
For instance, database connections are pooled because the cost of creating a new one is much higher than that of creating, say, a new Integer.
So the answer to your question is: reuse objects when they are heavy AND are used often (it is not worth pooling a 3 MB object that is only used twice).
Edit:
Additionally, this item from Effective Java: Favor Immutability is worth reading and may apply to your situation.
Object creation is cheap, yes, but sometimes not cheap enough.
If you create a lot (and I mean A LOT) of temporary objects in rapid succession, the costs for the garbage collector are considerable. However, even with a good profiler you may not necessarily see the costs easily, as the garbage collector nowadays works in short intervals instead of blocking the whole application for a second or two.
Most of the performance improvements I got in my projects came from either avoiding object creation or avoiding the whole work (including the object creation) through aggressive caching. No matter how big or small the object is, it still takes time to create it and to manage the references and heap structures for it. (And of course, the cleanup and the internal heap-defrag/copying also takes time.)
I would not become religious about avoiding object creation at all costs, but if you see a sawtooth pattern in your memory profiler, it means your garbage collector is working hard. And if your garbage collector is using the CPU, that CPU is not available to your application.
Regarding object pooling: Doing it right and not running into either memory leaks or invalid states or spending more time on the management than you would save is difficult. So I never used that strategy.
My strategy has been to simply strive for immutable objects. Immutable things can be cached easily and therefore help to keep the system simple.
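As a small sketch of why immutable objects are easy to share and cache: the Point class below is a hypothetical value type invented for this example. Because its fields never change, a single instance such as ORIGIN can be handed to any number of callers without defensive copies or reset logic.

```java
// Hypothetical value type used only for illustration.
final class Point {
    // Safe to share everywhere: nobody can modify it.
    static final Point ORIGIN = new Point(0, 0);

    private final int x;
    private final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    int x() { return x; }
    int y() { return y; }

    // "Modification" returns another instance and leaves this one untouched.
    Point withX(int newX) {
        return newX == x ? this : new Point(newX, y);
    }

    @Override
    public boolean equals(Object other) {
        if (!(other instanceof Point)) {
            return false;
        }
        Point p = (Point) other;
        return x == p.x && y == p.y;
    }

    @Override
    public int hashCode() {
        return 31 * x + y;
    }
}
```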
However, no matter what you do: Make sure you check your hotspots with a profiler first. Premature optimization is the root of most evilness.
Let the garbage collector do its job, it can be considered better than your code.
Unless a profiler proves it guilty. And don't even use common sense to try to figure out when it's wrong. In unusual cases even cheap objects like byte arrays are better pooled.
Rule 1 of optimization: don't do it.
Rule 2 (for experts only): don't do it yet.
The rule of thumb should be to use your common sense and reuse objects when their creation consumes significant resources such as I/O, network traffic, DB connections, etc...
If it's just creating a new String(), forget about the reuse, you'll gain nothing from it. Code readability has higher preference.
I would worry about performance issues if they arise. Do what makes sense first (would you do this with primitives?). If you then run a profiling tool and find that it is new causing you problems, start to think about pre-allocation (i.e. when your program isn't doing much work).
Re-using objects sounds like a disaster waiting to happen by the way:
SomeClass someObject = new SomeClass();
someObject.doSomething();
someObject.changeState();
someObject.changeOtherState();
someObject.sendSignal();
// stuff
//re-use
someObject.reset(); // urgh, had to put this in to support reuse
someObject.doSomethingElse(); // oh oh, this is wrong after calling changeOtherState, regardless of reset
someObject.changeState(); // crap, now this is wrong but it's not obvious yet
someObject.doImportantStuff(); // what's going on?
Object creation is certainly faster than it used to be. The newer generational GCs in JDK 5 and higher are an improvement, too.
I don't think either of these makes excessive creation of objects cost-free, but they do reduce the importance of object pooling. I think pooling makes sense for database connections, but I don't attempt it for my own domain objects.
Reuse puts a premium on thread-safety. You need to think carefully to ensure that you can reuse objects safely.
If I decided that object reuse was important I'd do it with products like Terracotta, Tangosol, GridGain, etc. and make sure that my server had scads of memory available to it.
Second the above comments.
Don't try to second-guess the GC and HotSpot. Object pooling may have been useful once, but these days it's not so useful unless you are talking about database connections or unique system resources.
Just try and write clean and simple code and be amazed at what Hotspot can do.
Why not use VisualVM or a profiler to take a look at your code?
Maybe this is a well-known question, but I didn't find the best reference for this ques...
What is the formula to calculate and assign the default ulimit, verbose (for GC) and max heap memory values?
If there is no specific formula, what are the criteria for specifying these for a particular machine?
If possible, could anyone please explain these concepts as well?
Are there any other concepts we need to consider for performance improvement?
How do I tune the JVM for better performance?
Stop what you're doing right now.
Tuning the JVM is probably the last thing you should worry about. Until you've gone through every other performance trick in the book, the default settings should be just fine.
Firstly you need to profile your application and find out where the bottlenecks are. Specifically, you will want to know:
What functions/methods are consuming the majority of CPU time?
Where are all the memory allocations happening?
What kind of objects are taking up most space on the heap?
Then you should apply targeted optimisations to the areas that are causing problems. There are thousands of valid techniques, but here are the ones that I find are most useful:
Improve algorithms - anything that is taking up a decent chunk of CPU time and has complexity of O(n^2) or worse is probably a good candidate for improvement. Try to get it to O(n log n) or better.
Share immutable data - if you have a lot of copies of the same data then it makes sense to turn these into immutable objects and share a single instance. This can save a lot of memory (and has the nice effect of improving thread safety / concurrency)
Use primitive types - replace Integer with int etc. This saves memory and makes numerical operations faster.
Be lazy - don't compute things until they are definitely needed.
Cache things - if something is expensive to compute but frequently requested, store it in a cache after the first request. Use a cache backed by a soft-reference map (a SoftHashMap-style class; the JDK itself does not ship one) so that the memory can still be released if needed.
Offload work - Can you make use of multiple cores? Can the client application do some of the work for you?
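Two of the techniques in the list above, "be lazy" and "cache things", can be combined in a few lines. The following is only a sketch: Memoizer is a hypothetical class name, and it holds its results strongly and without a size limit, so it suits a small, bounded set of keys rather than a general-purpose cache.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical helper for illustration: computes each result lazily, at most once per key.
final class Memoizer<K, V> {
    private final Map<K, V> results = new ConcurrentHashMap<>();
    private final Function<K, V> compute;

    Memoizer(Function<K, V> compute) {
        this.compute = compute;
    }

    V get(K key) {
        // Nothing is computed until the first request for this key.
        return results.computeIfAbsent(key, compute);
    }
}
```

Note that ConcurrentHashMap.computeIfAbsent rejects null results, so the compute function must not return null.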
After making any changes you then need to profile again. At the very least, you will want to confirm that your optimisations actually helped. Additionally, fixing one bottleneck will usually move the bottleneck to another part of the application. So you will need to identify the new place to focus next.
Repeat until your application is fast enough (as defined by your own or your customers' requirements).
In my application, I would like to load some amount of data into memory when first needed and keep it there in case another part of the application wants to use it. The same data would be accessed from a couple of different Activities, but by no means all of the ones the user could interact with. So, when the user is not working with the relevant part of my application, I would like Android to feel free to discard the data, reloading it again when needed. Note that I cannot predict what the user will do, so I want Android to free the data only if it hasn't been used for some time. What is a good approach to doing this?
I thought of creating a class that would be only used statically, loading the data in its static initialisation block. However, I am not sure if Dalvik would ever discard any static data stored this way. I have read something on class loaders but I have no idea what loader is used in loading my class and how it could potentially become discarded. Perhaps someone does...?
Another way I came up with is using weak reference to keep an instance of the data-holding class (non-static, obviously) but here I am afraid that the GC could decide it's useless when no Activity is currently actively operating it, even when memory is no concern at that moment. (In that case, I would like to keep the data loaded.)
The loading of my data is costly. I want, if possible, to destroy it only when the system is running out of memory or when the application exits.
It sounds like SoftReferences are what you need. These are cleared at the garbage collector's discretion when it detects that there is a memory shortfall.
If you read the class javadoc, it gives some hints about how to prevent recently used cache entries from being reclaimed.
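The hint in the SoftReference javadoc is to keep strong references to the most recently used entries. A minimal sketch of that idea for a single cached object might look like the following. RecentlyUsedHolder, its loader and the idle timeout are hypothetical names and parameters chosen for this example, and the caller passes in the current time (for instance from System.currentTimeMillis()) and is assumed to call releaseIfIdle periodically.

```java
import java.lang.ref.SoftReference;
import java.util.function.Supplier;

// Hypothetical helper for illustration; not part of the JDK or Android.
final class RecentlyUsedHolder<T> {
    private final Supplier<T> loader;
    private final long keepStrongMillis;
    private SoftReference<T> soft = new SoftReference<>(null);
    private T strong;             // pins the value while it counts as recently used
    private long lastUsedMillis;

    RecentlyUsedHolder(Supplier<T> loader, long keepStrongMillis) {
        this.loader = loader;
        this.keepStrongMillis = keepStrongMillis;
    }

    synchronized T get(long nowMillis) {
        T value = soft.get();
        if (value == null) {
            // Not loaded yet, or reclaimed by the GC while idle: load again.
            value = loader.get();
            soft = new SoftReference<>(value);
        }
        strong = value;
        lastUsedMillis = nowMillis;
        return value;
    }

    // Once idle for long enough, only the soft reference remains,
    // so the GC may reclaim the value under memory pressure.
    synchronized void releaseIfIdle(long nowMillis) {
        if (strong != null && nowMillis - lastUsedMillis >= keepStrongMillis) {
            strong = null;
        }
    }

    synchronized boolean isPinned() {
        return strong != null;
    }
}
```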
For the record, classloaders won't help you manage instances of a class. But making the cache a static should allow the cached objects to be discarded if the cache class gets unloaded.
FOLLOWUP
My data is a solid block that would be represented by a single object.
This rather changes things. If you have a single object to cache, then LRU makes no sense. Basically it sounds like you want to hang onto the object as long as possible ... without triggering OOMEs by hanging onto it too long. This is kind of hard. Indeed, doing a perfect job is going to entail correctly predicting what the user is going to do ... which is clearly impossible.
Possibly the best strategy is to make use of the reference enqueuing mechanism, and implement the queue processor to make an "intelligent" choice between letting the object die or recreating the soft link. The "intelligence" might entail looking at how much free memory there is, and / or how long it was since the object was last used. But beware!! If you get this wrong you can cause OOMEs or cause the platform to spend lots of time thrashing the garbage collector.
If I set up the cache to hold 1 object, it would be equivalent to a hard reference, wouldn't it?
Nope. If you use a SoftReference the GC will break the reference if it is running out of memory.
You can use SoftReferences. Take a look at:
http://docs.oracle.com/javase/6/docs/api/java/lang/ref/SoftReference.html
With SoftReferences you can achieve what you need:
" I want, if possible, to destroy it only when the system is running out of memory or when the application exits."
Take a look at
SoftReference gets garbage collected too early
You can also look into LruCache if you're looking to cache some data in memory in your app.
http://developer.android.com/reference/android/support/v4/util/LruCache.html
For a longer lived disk based cache take a look at Android Objects Cache
You can find the DiskLruCache source at https://github.com/JakeWharton/DiskLruCache/
Object creation is a bottleneck in my application.
I think that adding more threads for object creation makes the situation worse, because object creation is a CPU-bound task, right?
Then, how to improve performance?
Often the problem is not object creation itself, but repeated object creation and garbage generation. That causes two performance hits: creating all those objects and extra garbage collection stalls.
First, you should use profiling tools to verify that excessive object creation is the source of your performance problems. Assuming that you have verified that this is the problem, there are various things to look for and strategies to try. It all depends on how your code is written, so there's no one recommendation that will work. This list of Java performance guidelines from IBM is definitely worth applying. It identifies how to avoid many of the most common sins: don't create objects inside loops; use StringBuilder instead of a series of string concatenation expressions; use primitive types and avoid auto-boxing/unboxing where possible; cache frequently used objects; allocate collection classes with an explicit capacity instead of allowing them to grow; etc.
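A few of the guidelines listed above, sketched in code. The class and method names here are hypothetical and exist only to show the patterns: one pre-sized StringBuilder instead of repeated concatenation, a primitive accumulator instead of a boxed one, and a collection allocated with an explicit capacity.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical examples of allocation-friendly patterns.
final class AllocationTips {
    // One StringBuilder for the whole loop instead of a new String per concatenation.
    static String join(List<String> parts) {
        int capacity = 0;
        for (String part : parts) {
            capacity += part.length() + 1;
        }
        StringBuilder sb = new StringBuilder(capacity);
        for (int i = 0; i < parts.size(); i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(parts.get(i));
        }
        return sb.toString();
    }

    // A primitive long accumulator: no Long objects are created while summing.
    static long sum(int[] values) {
        long total = 0;
        for (int v : values) {
            total += v;
        }
        return total;
    }

    // The list is sized up front, so its backing array never has to grow.
    static List<Integer> squares(int n) {
        List<Integer> out = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            out.add(i * i);
        }
        return out;
    }
}
```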
Another nice resource is Chapter 4 of the book Java Performance Tuning. (You can read it on-line here.)
If you search the web for excessive object creation java, you can find lots of other recommendations.
You can still get significant performance improvement by multi-threading CPU bound tasks when your app is running on a machine with multiple processors.
As @Pst says - are you sure it's the bottleneck? Because these days it's not a common one.
But given that, one thing you could try is avoiding creation by caching and reusing instances. That totally depends on what your program does, though.
Java uses a TLAB (Thread Local Allocation Buffer) for small to medium sized objects. This means each thread can allocate objects concurrently, i.e. you don't get a slowdown from using multiple threads.
In general, more CPUs improve CPU-bound problems. It's IO-bound tasks where one CPU can use all the available bandwidth, like disk access, that are no faster when you use multiple CPUs.
The simplest way to reduce the cost of object creation is to create/discard fewer objects. There is a common assumption that object creation is unavoidable, but for the last 2.5 years I have worked on applications which GC less than once per day, even under production load.
Most applications don't work this way because they don't need to. However, if you need to minimise object creation, you can.
I'm looking to implement a simple cache without doing too much work (naturally). It seems to me that one of the standard Java collections ought to suffice, with a little extra work. Specifically, I'm storing responses from a server, and the keys can either be the request URL string or a hash code generated from the URL.
I originally thought I'd be able to use a WeakHashMap, but it looks like that method forces me to manage which objects I want to keep around, and any objects I don't manage with strong references are immediately swept away. Should I try out a ConcurrentHashMap of SoftReference values instead? Or will those be cleaned up pretty aggressively too?
I'm now looking at the LinkedHashMap class. With some modifications it looks promising for an MRU cache. Any other suggestions?
Whichever collection I use, should I attempt to manually prune the LRU values, or can I trust the VM to bias against reclaiming recently accessed objects?
FYI, I'm developing on Android so I'd prefer not to import any third-party libraries. I'm dealing with a very small heap (16 to 24 MB) so the VM is probably pretty eager to reclaim resources. I assume the GC will be aggressive.
If you use SoftReference-based keys, the VM will bias (strongly) against reclaiming recently accessed objects. However it would be quite difficult to determine the caching semantics - the only guarantee that a SoftReference gives you (over a WeakReference) is that it will be cleared before an OutOfMemoryError is thrown. It would be perfectly legal for a JVM implementation to treat them identically to WeakReferences, at which point you might end up with a cache that doesn't cache anything.
I don't know how things work on Android, but with Sun's recent JVMs one can tweak the SoftReference behaviour with the -XX:SoftRefLRUPolicyMSPerMB command-line option, which determines the number of milliseconds that a softly-reachable object will be retained for, per MB of free memory in the heap. As you can see, this is going to be exceptionally difficult to get any predictable lifespan behaviour out of, with the added pain that this setting is global for all soft references in the VM and can't be tweaked separately for individual classes' use of SoftReferences (chances are each use will want different parameters).
The simplest way to make an LRU cache is by extending LinkedHashMap as described here. Since you need thread-safety, the simplest way to extend this initially is to just use Collections.synchronizedMap on an instance of this custom class to ensure safe concurrent behaviour.
Beware premature optimisation - unless you need very high throughput, the theoretically suboptimal overhead of the coarse synchronization is not likely to be an issue. And the good news - if profiling shows that you are performing too slowly due to heavy lock contention, you'll have enough information available about the runtime use of your cache that you'll be able to come up with a suitable lockless alternative (probably based on ConcurrentHashMap with some manual LRU treatment) rather than having to guess at its load profile.
LinkedHashMap is easy to use for cache. This creates an MRU cache of size 10.
// accessOrder = true: entries are ordered from least to most recently accessed
private LinkedHashMap<File, ImageIcon> cache = new LinkedHashMap<File, ImageIcon>(10, 0.7f, true) {
    @Override
    protected boolean removeEldestEntry(Map.Entry<File, ImageIcon> eldest) {
        return size() > 10;
    }
};
I guess you can make a class with synchronized delegates to this LinkedHashMap. Forgive me if my understanding of synchronization is wrong.
www.javolution.org has some interesting features, such as synchronized fast collections.
In your case it is worth a try, as it also offers some nifty enhancements for small devices such as Android ones.
For synchronization, the Collections framework provides a synchronized map:
Map<V,T> myMap = Collections.synchronizedMap(new HashMap<V, T>());
You could then wrap this, or handle the LRU logic in a cache object.
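Putting the two suggestions together, a thread-safe LRU map can be sketched with only JDK classes. LruCaches and synchronizedLru are hypothetical names for this example; the technique itself is just an access-ordered LinkedHashMap with removeEldestEntry overridden, wrapped in Collections.synchronizedMap.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical factory for illustration.
final class LruCaches {
    static <K, V> Map<K, V> synchronizedLru(final int maxEntries) {
        // accessOrder = true: every get() moves the entry to the "most recent" end.
        LinkedHashMap<K, V> lru = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                // Called after each insertion; true evicts the least recently used entry.
                return size() > maxEntries;
            }
        };
        // Coarse locking: one lock guards every operation on the map.
        return Collections.synchronizedMap(lru);
    }
}
```

The wrapper matters even for readers: in an access-ordered LinkedHashMap a plain get() reorders the entries, so unsynchronized concurrent reads are not safe. Iterating over the map still requires synchronizing on it manually.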
I like Apache Commons Collections LRUMap