A while ago I published a game written entirely in Java for the Android platform, and I'm currently trying to squeeze as much performance out of it as possible. The problem in my game's case seems to be that I use ArrayList too often in places where a Map would be better suited. I did this because I was afraid of the dynamic memory allocations that Map/Tree structures would trigger behind the scenes on Android. Is there some structure on the Android/Java platform I don't know about that provides fast lookups and does not dynamically allocate extra memory when adding new elements?
UPDATE:
For example, I'm using an ArrayList to hold most of my game's particles. Removing them individually (not sequentially) is a pain in the b**t, because in the worst case the system has to iterate through the whole container just to remove one entity object.
I wouldn't worry about slowdown because of memory allocation unless you specifically find it to be an issue. Memory allocation isn't really the cause of slowdowns in Android games, it's when the GC runs that's usually the problem. Unless you are inserting and deleting from the Map very often, you might not have to worry about the allocations.
Update:
Instead of using a Map, you might want to consider just marking particles as "dead" when you no longer need them and using that flag to skip over them in your update iteration. Store references to the dead particles in a separate deadParticles ArrayList, and take one out of that list whenever you need a new particle. That way you have instant access to reusable particles when you need them.
Are you preallocating your element objects, and reusing the empties rather than allocating new ones?
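A rough sketch of that combination, preallocation plus a dead list (the Particle fields and pool size here are hypothetical):

import java.util.ArrayList;
import java.util.List;

// Hypothetical particle pool: particles are preallocated once and recycled,
// so no allocations (and no GC pressure) happen during gameplay.
class ParticlePool {
    static class Particle {
        float x, y, vx, vy;
        boolean alive;   // "dead" particles are skipped during update/draw
    }

    private final List<Particle> particles = new ArrayList<>();
    private final List<Particle> deadParticles = new ArrayList<>();

    ParticlePool(int capacity) {
        for (int i = 0; i < capacity; i++) {
            Particle p = new Particle();
            particles.add(p);
            deadParticles.add(p);     // everything starts out dead/available
        }
    }

    // O(1): take a recycled particle instead of allocating a new one
    Particle spawn(float x, float y) {
        if (deadParticles.isEmpty()) return null;            // pool exhausted
        Particle p = deadParticles.remove(deadParticles.size() - 1);
        p.x = x; p.y = y; p.vx = 0; p.vy = 0;
        p.alive = true;
        return p;
    }

    // O(1): no searching or shifting, just flag it and return it to the free list
    void kill(Particle p) {
        p.alive = false;
        deadParticles.add(p);
    }

    void update(float dt) {
        for (int i = 0; i < particles.size(); i++) {
            Particle p = particles.get(i);
            if (!p.alive) continue;                           // skip dead particles
            p.x += p.vx * dt;
            p.y += p.vy * dt;
        }
    }
}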
I'm developing a software package which makes heavy use of arrays (ArrayLists). Instructions to be processed are put into an array acting as a queue and deleted from it once they have been handled. The same goes for plotting: data is placed into an array queue, which is read to plot the data, and the oldest data is eventually deleted as new data comes in. We are talking about thousands of instructions over an hour, and at any time maybe 200,000 points plotted, with the arrays continually growing and shrinking.
After some time, the software begins to slow down and instructions are processed more slowly. Nothing really changes in the processing itself; the system is stable in how much data is plotted and which instructions are processed, it just works through similar incoming data time after time.
Is there some memory issue caused by the "abuse" of these variable-sized (no defined size, add/delete as needed) arrays/queues that could explain the eventual slowdown?
Is there a better structure than an ArrayList of Strings to act as a queue?
Thanks!
Yes, you are most likely using the wrong data structure for the job. An ArrayList is a list with a backing array, so get() is fast, but removing an element from the front means shifting every element behind it.
The Java runtime library has a very rich set of data structures, so you can get a well-written and debugged implementation with the characteristics you need out of the box. You most likely should be using one or more Queues instead; see the sketch below.
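For example, something like this (the instruction strings are made up): ArrayDeque gives O(1) offers at the tail and O(1) polls at the head, backed by a single circular array with no per-node allocation.

import java.util.ArrayDeque;
import java.util.Queue;

public class InstructionQueueDemo {
    public static void main(String[] args) {
        // ArrayDeque is backed by a circular array: no per-node allocation,
        // O(1) offer() at the tail and O(1) poll() at the head.
        Queue<String> instructions = new ArrayDeque<>();

        instructions.offer("MOVE 10 20");   // hypothetical instruction format
        instructions.offer("PLOT 42.0");

        while (!instructions.isEmpty()) {
            String next = instructions.poll(); // removes from the head without shifting
            System.out.println("processing: " + next);
        }
    }
}

If a producer thread feeds the queue while another thread plots, the java.util.concurrent queues (e.g. ArrayBlockingQueue) are drop-in, thread-safe alternatives.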
My guess is that you forgot to null out values in your ArrayList, so the JVM has to keep all of them around. This is a memory leak.
To confirm, use a profiler to see where your memory and CPU go. VisualVM is a nice standalone one; NetBeans includes one as well.
The use of VisualVM helped. It showed heavy use of a "message" form that I was dumping incoming data into and had forgotten existed; because I never limited its size, it was holding around a million characters by the time the sluggishness became apparent.
I know that we can use two stacks to implement Undo/Redo for text editors. For a Piece Table, you can simply push the nodes that are going to be affected onto the stack, as mentioned here (great write-up about Piece Tables in general, btw). And for a Rope, my understanding is that since a Rope should be immutable, whenever there is a change you simply push the root of the old tree onto the stack, as mentioned here:
"Not only can text insertions and deletions be performed in near-constant time for extremely large documents, but ropes' immutability makes implementation of an undo stack trivial: simply store a reference to the previous rope with every change."
If this is the case, then Rope seems very memory intensive, and can quickly fill up your memory with a large file after a couple of modifications. How is this handled in modern text editors?
This leads to another question: what would you do if there is a 5GB file and you only have 2GB of memory? I was thinking of using paging or dynamic loading, so that when you scroll down the editor discards some old text in memory and loads more from disk. But how is this realized with a Piece Table or a Rope? Maybe we could serialize the older parts of the data structure to disk as we load more content into it, but that does not seem like an optimal solution to me.
Cheers!
One of the main advantages of immutable objects is that they can share structures with each other. Because a structure never changes, the whole structure doesn't need to be copied; only the parts affected by the modification need to be duplicated. This means that only a relatively small amount of memory is needed for each addition. Here's a good explanation of how this is realized with another tree-like structure: Understanding Clojure's Persistent Vector.
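To make that concrete, here is a toy, deliberately unbalanced rope in Java; it's only meant to show the sharing, not to be a real implementation. Every insert rebuilds the nodes on the path to the affected leaf and reuses every other subtree unchanged, so pushing the old root onto the undo stack costs O(depth) new nodes per edit rather than a copy of the document.

// Toy immutable rope: every "edit" returns a new root but shares all
// untouched subtrees with the previous version. Not balanced; for
// illustration of structural sharing only.
abstract class Rope {
    abstract int length();
    abstract Rope insert(int pos, String text);

    static Rope leaf(String s) { return new Leaf(s); }

    static final class Leaf extends Rope {
        final String s;
        Leaf(String s) { this.s = s; }
        int length() { return s.length(); }
        Rope insert(int pos, String text) {
            // Only this leaf is rewritten; every other leaf stays shared.
            return new Node(new Leaf(s.substring(0, pos) + text),
                            new Leaf(s.substring(pos)));
        }
    }

    static final class Node extends Rope {
        final Rope left, right;
        final int len;
        Node(Rope left, Rope right) {
            this.left = left; this.right = right;
            this.len = left.length() + right.length();
        }
        int length() { return len; }
        Rope insert(int pos, String text) {
            // Rebuild only the side containing pos; reuse the other side as-is.
            if (pos <= left.length())
                return new Node(left.insert(pos, text), right);
            return new Node(left, right.insert(pos - left.length(), text));
        }
    }
}

// Undo stack: just keep the old roots. Each entry shares most nodes with
// its neighbours, so memory grows by O(tree depth) per edit, not O(document):
// java.util.Deque<Rope> undo = new java.util.ArrayDeque<>();
// undo.push(current); current = current.insert(5, "hello");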
Even with memory-saving optimizations, very large files that don't fit into RAM or a program's allowed memory space can still pose a problem. To fix this, text editors will store parts of the file that aren't directly being edited in some type of swap file. Some editors use their own virtual memory implementations to accomplish this, while others just pick parts of the file that are sufficiently far away from the cursor. As a (hypothetical) example, a rope's subtrees could be saved to a file if no modifications have been made to them for a certain amount of time.
I have a Tomcat Java webapp which is thrashing the Java GC when under load. I think this is due to a combination of a large amount of short lived objects along with an unknown amount of moderately long lived objects.
To validate this theory I want to find a tool which will let me determine the object lifetimes for all allocated objects (or every 10th object etc for better performance). Ideally the final output will be a histogram showing the relative number of objects which live for different amounts of time.
I think this tool will likely be built on top of either the Instrumentation API or the JVMTI. If there are no good tools which already do this I would also appreciate suggestions about which of the JVM's interfaces would be best to use when writing such a tool.
I have now started writing a tool to do what I originally asked about. The current code can be found here:
http://wiki.github.com/mchr3k/org.inmemprofiler/
So far I have managed to get a textual histogram of all object allocations by instance count. This does not include array allocations which are handled differently.
I am now working on adding instance size information along with tracking of array allocations by using the JVMTI.
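For anyone curious about the shape of such a tool, this is roughly what the Instrumentation-based entry point looks like (a skeleton only; the real work is rewriting constructors inside transform() with a bytecode library such as ASM, which is omitted here):

import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;

// Launched with: java -javaagent:profiler.jar -jar yourapp.jar
// The jar's manifest must declare: Premain-Class: LifetimeProfilerAgent
public class LifetimeProfilerAgent {
    public static void premain(String agentArgs, Instrumentation inst) {
        inst.addTransformer(new ClassFileTransformer() {
            @Override
            public byte[] transform(ClassLoader loader, String className,
                                    Class<?> classBeingRedefined,
                                    ProtectionDomain protectionDomain,
                                    byte[] classfileBuffer) {
                // A real tool would rewrite each constructor here (e.g. with ASM)
                // to call a static recordAllocation() hook, and pair that with
                // weak references plus a ReferenceQueue to observe collection times.
                return null; // null means "leave the class unchanged"
            }
        });
        // Instrumentation also exposes inst.getObjectSize(obj) for size histograms.
    }
}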
I need to cache objects in Java using a proportion of whatever RAM is available. I'm aware that others have asked this question, but none of the responses meet my requirements.
My requirements are:
Simple and lightweight
Not dramatically slower than a plain HashMap
Use LRU, or some deletion policy that approximates LRU
I tried LinkedHashMap, however it requires you to specify a maximum number of elements, and I don't know how many elements it will take to fill up available RAM (their sizes will vary significantly).
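(For completeness, the count-bounded LinkedHashMap approach I ruled out looks roughly like this; it works fine when you can estimate a sensible entry count, which is exactly what I can't do here since entry sizes vary so much.)

import java.util.LinkedHashMap;
import java.util.Map;

// Count-bounded LRU cache: an access-ordered LinkedHashMap evicts the
// least-recently-used entry once MAX_ENTRIES is exceeded.
class LruCache<K, V> extends LinkedHashMap<K, V> {
    private static final int MAX_ENTRIES = 10_000; // has to be guessed up front

    LruCache() {
        super(16, 0.75f, true); // true = access order, not insertion order
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > MAX_ENTRIES;
    }
}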
My current approach is to use Google Collection's MapMaker as follows:
Map<String, Object> cache = new MapMaker().softKeys().makeMap();
This seemed attractive as it should automatically delete elements when it needs more RAM, however there is a serious problem: its behavior is to fill up all available RAM, at which point the GC begins to thrash and the whole app's performance deteriorates dramatically.
I've heard of stuff like EHCache, but it seems quite heavy-weight for what I need, and I'm not sure if it is fast enough for my application (remembering that the solution can't be dramatically slower than a HashMap).
I've got similar requirements to you - concurrency (on 2 hexacore CPUs) and LRU or similar - and also tried Guava MapMaker. I found softValues() much slower than weakValues(), but both made my app excruciatingly slow when memory filled up.
I tried WeakHashMap and it was less problematic, oddly even faster than using LinkedHashMap as an LRU cache via its removeEldestEntry() method.
But by far the fastest for me is ConcurrentLinkedHashMap, which has made my app 3-4 (!!) times faster than any other cache I tried. Joy, after days of frustration! It's apparently been incorporated into Guava's MapMaker, but the LRU feature isn't in Guava r07 at any rate. Hope it works for you.
I've implemented several caches, and it's about as difficult as implementing a new data source or thread pool, so my recommendation is to use jboss-cache or another well-known caching lib.
That way you'll sleep well, without issues.
I've heard of stuff like EHCache, but it seems quite heavy-weight for what I need, and I'm not sure if it is fast enough for my application (remembering that the solution can't be dramatically slower than a HashMap).
I really don't know if one can say that EhCache is heavy-weight. At least, I don't consider it as such, especially when using a Memory Store (which is backed by an extended LinkedHashMap and is of course the fastest caching option). You should give it a try.
I believe MapMaker is going to be the only reasonable way to get what you're asking for. If "the GC begins to thrash and the whole app's performance deteriorates dramatically," you should spend some time properly setting the various tuning parameters. This document may seem a little intimidating at first, but it's actually written very clearly and is a goldmine of helpful information about GC:
https://www.oracle.com/technetwork/java/javase/memorymanagement-whitepaper-150215.pdf
I don't know if this would be a simple solution, especially compared with EhCache or similar, but have you looked at the Javolution library? It is not designed for caching as such, but in the javolution.context package they have an Allocator pattern which can reuse objects without the need for garbage collection. This way they keep object creation and garbage collection to a minimum, an important feature for real-time programming. Perhaps you should take a look and try to adapt it to your problem.
This seemed attractive as it should automatically delete elements when it needs more RAM, however there is a serious problem: its behavior is to fill up all available RAM
Using soft keys just allows the garbage collector to remove objects from the cache when no other objects reference them (i.e., when the only thing referring to the cache key is the cache itself). It does not guarantee any other kind of eviction.
Most solutions you find will be features added on top of the java Map classes, including EhCache.
Have you looked at the commons-collections LRUMap?
Note that there is an open issue against MapMaker to provide LRU/MRU functionality. Perhaps you can voice your opinion there as well.
In your existing cache, store WeakReferences rather than normal object references.
If the GC starts running out of free space, the values held by the WeakReferences will be released.
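Roughly like this (a sketch; note that SoftReference is often preferred for caches, because weakly-referenced values can be collected at the very next GC even when memory is plentiful):

import java.lang.ref.WeakReference;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Values are held through WeakReferences, so the GC may reclaim them;
// get() treats a cleared reference as a cache miss.
class WeakValueCache<K, V> {
    private final ConcurrentMap<K, WeakReference<V>> map = new ConcurrentHashMap<>();

    void put(K key, V value) {
        map.put(key, new WeakReference<>(value));
    }

    V get(K key) {
        WeakReference<V> ref = map.get(key);
        if (ref == null) return null;
        V value = ref.get();
        if (value == null) map.remove(key); // referent was collected; drop the stale entry
        return value;
    }
}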
In the past I have used JCS. You can set up the configuration to try to meet your needs. I'm not sure if it will meet all of your requirements, but I found it to be pretty powerful when I used it.
You cannot "delete elements" you can only stop to hard reference them and wait for the GC to clean them, so go on with Google Collections...
I'm not aware of an easy way to find out an object's size in Java. Therefore, I don't think you'll find a way to limit a data structure by the amount of RAM it's taking.
Based on this assumption, you're stuck with limiting it by the number of cached objects. I'd suggest running simulations of a few real-life usage scenarios and gathering statistics on the types of objects that go into the cache. Then you can calculate the statistically average size, and the number of objects you can afford to cache. Even though it's only an approximation of the amount of RAM you want to dedicate to the cache, it might be good enough.
As to the cache implementation, in my project (a performance-critical application) we're using EhCache, and personally I don't find it to be heavyweight at all.
In any case, run several tests with several different configurations (regarding size, eviction policy etc.) and find out what works best for you.
For caching, SoftReference is probably the best way I can think of so far.
Or you can reinvent an object pool: objects you are no longer using don't need to be destroyed and re-created. But that saves CPU rather than memory.
Assuming you want the cache to be thread-safe, then you should examine the cache example in Brian Goetz's book "Java Concurrency in Practice". I can't recommend this highly enough.
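For reference, a sketch in the spirit of the Memoizer example from that book: a ConcurrentHashMap of Futures, so each value is computed at most once even under contention. It doesn't bound memory by itself, so you'd still combine it with one of the eviction strategies discussed above.

import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.FutureTask;

// Thread-safe compute-once cache in the spirit of JCIP's Memoizer.
class ComputingCache<K, V> {
    interface Computable<K, V> { V compute(K key) throws Exception; }

    private final ConcurrentMap<K, Future<V>> cache = new ConcurrentHashMap<>();
    private final Computable<K, V> computer;

    ComputingCache(Computable<K, V> computer) { this.computer = computer; }

    V get(final K key) throws InterruptedException, ExecutionException {
        Future<V> f = cache.get(key);
        if (f == null) {
            FutureTask<V> task = new FutureTask<>(new Callable<V>() {
                public V call() throws Exception { return computer.compute(key); }
            });
            f = cache.putIfAbsent(key, task); // only one thread wins the race
            if (f == null) { f = task; task.run(); }
        }
        return f.get(); // other threads block until the value is ready
    }
}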
How do you optimize the heap size usage of an application that has a lot (millions) of long-lived objects? (big cache, loading lots of records from a db)
Use the right data type
Avoid java.lang.String to represent other data types
Avoid duplicated objects
Use enums if the values are known in advance
Use object pools
String.intern() (good idea?)
Load/keep only the objects you need
I am looking for general programming or Java specific answers. No funky compiler switch.
Edit:
Optimize the memory representation of a POJO that can appear millions of times in the heap.
Use cases
Load a huge CSV file into memory (converted into POJOs)
Use Hibernate to retrieve millions of records from a database
Summary of answers:
Use flyweight pattern
Copy on write
Instead of loading 10M objects with 3 properties, is it more efficient to have 3 arrays (or other data structure) of size 10M? (Could be a pain to manipulate data but if you are really short on memory...)
I suggest you use a memory profiler, see where the memory is being consumed, and optimise that. Without quantitative information you could end up changing things which either have no effect or actually make things worse.
You could look at changing the representation of your data, especially if your objects are small.
For example, you could represent a table of data as a series of columns with object arrays for each column, rather than one object per row. This can save a significant amount of per-object overhead if you don't need to represent an individual row; e.g. a table with 12 columns and 10,000,000 rows could use 12 objects (one per column) rather than 10 million (one per row). A sketch of the idea follows.
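Something like this (the Trade class and its fields are made up for illustration):

// Row-oriented: 10,000,000 Trade objects, each with its own header and padding.
class Trade {
    long timestamp;
    double price;
    int quantity;
}

// Column-oriented: three arrays, three object headers in total.
class TradeTable {
    final long[] timestamps;
    final double[] prices;
    final int[] quantities;

    TradeTable(int rows) {
        timestamps = new long[rows];
        prices = new double[rows];
        quantities = new int[rows];
    }

    // "Row" access becomes index access into each column.
    double priceAt(int row) { return prices[row]; }
}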
You don't say what sort of objects you're looking to store, so it's a little difficult to offer detailed advice. However some (not exclusive) approaches, in no particular order, are:
Use a flyweight pattern wherever possible (sketched below).
Caching to disc. There are numerous cache solutions for Java.
There is some debate as to whether String.intern is a good idea. See here for a question re. String.intern(), and the amount of debate around its suitability.
Make use of soft or weak references to store data that you can recreate/reload on demand. See here for how to use soft references with caching techniques.
Knowing more about the internals and lifetime of the objects you're storing would result in a more detailed answer.
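As a small illustration of the flyweight point above (hypothetical domain): keep one canonical instance per distinct value and share it across all the records that use it.

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Flyweight: one shared Currency instance per ISO code, however many
// records reference it, instead of a fresh object per database row.
final class Currency {
    private static final ConcurrentMap<String, Currency> POOL = new ConcurrentHashMap<>();

    private final String isoCode;

    private Currency(String isoCode) { this.isoCode = isoCode; }

    static Currency of(String isoCode) {
        // computeIfAbsent returns the already-pooled instance if one exists
        return POOL.computeIfAbsent(isoCode, Currency::new);
    }

    String isoCode() { return isoCode; }
}

// Usage: every record with "EUR" ends up pointing at the same object.
// Currency eur = Currency.of("EUR");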
Ensure good normalization of your object model, don't duplicate values.
Ahem, and, if it's only millions of objects I think I'd just go for a decent 64 bit VM and lots of ram ;)
Normal "profilers" won't help you much, because you need an overview of all your "live" objects. You need heap dump analyzer. I recommend the Eclipse Memory analyzer.
Check for duplicated objects, starting with Strings.
Check whether you can apply patterns like flyweight, copy-on-write, or lazy initialization (Google will be your friend).
Take a look at this presentation linked from here. It lays out the memory use of common java object and primitives and helps you understand where all the extra memory goes.
Building Memory-efficient Java Applications: Practices and Challenges
You could just store fewer objects in memory. :) Use a cache that spills to disk or use Terracotta to cluster your heap (which is virtual) allowing unused parts to be flushed out of memory and transparently faulted back in.
I want to add something to the point Peter already made (can't comment on his answer :(): it's always better to use a memory profiler than to go by intuition. 80% of the time the problem is in some routine we overlook, and collection classes are especially prone to memory leaks.
If you have millions of Integers and Floats etc. then see if your algorithms allow for representing the data in arrays of primitives. That means fewer references and lower CPU cost of each garbage collection.
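For instance (sizes are approximate, assuming a typical 64-bit JVM): an int[] with a million entries is a single ~4 MB object, while a List<Integer> of the same size is about a million small objects plus an array of references.

import java.util.ArrayList;
import java.util.List;

public class PrimitiveVsBoxed {
    public static void main(String[] args) {
        int n = 1_000_000;

        // One object, ~4 MB of contiguous data, nothing for the GC to trace per element.
        int[] primitives = new int[n];

        // Roughly 1,000,000 Integer objects (about 16 bytes each on a typical
        // 64-bit JVM) plus an Object[] of references: several times the memory,
        // and every element is a pointer the GC has to follow.
        List<Integer> boxed = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            primitives[i] = i;
            boxed.add(i); // autoboxing allocates (outside the small Integer cache)
        }
        System.out.println(primitives.length + " / " + boxed.size());
    }
}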
A fancy one: keep most data compressed in ram. Only expand the current working set. If your data has good locality that can work nicely.
Use better data structures. The standard collections in Java are rather memory intensive.
[what is a better data structure]
If you take a look at the source for the collections, you'll see that if you restrict yourself in how you access the collection, you can save space per element.
The way the collections handle growing is no good for large collections: too much copying. For large collections, you need a block-based algorithm, like a B-tree.
Spend some time getting acquainted with and tuning the VM command line options, especially those concerning garbage collection. While this won't change the memory used by your objects, it can have a big impact on performance with memory-intensive apps on machines with a lot of RAM.
Assign null to variables that are no longer used, so the objects they reference become eligible for garbage collection.
De-reference collections once you are done with them, otherwise the GC won't be able to sweep their contents.
1) Use the right data types wherever possible
class Person {
    int age;
    int status;
}
Here we can use smaller primitive types to save memory for each Person object:
class Person {
    short age;
    byte status;
}
2) Instead of returning new ArrayList<>() from a method, you can return Collections.emptyList(), which is a single shared immutable instance, so no new object (or default backing array of 10) is allocated at all.
For example:
public List getResults() {
    .....
    if (failedOperation)
        return new ArrayList<>();
}
// Use this instead
public List getResults() {
    .....
    if (failedOperation)
        return Collections.emptyList();
}
3) Prefer creating short-lived objects inside methods rather than holding them in long-lived static fields. Objects themselves always live on the heap, but local references go out of scope quickly so the GC can reclaim the objects, whereas static references keep them reachable for the lifetime of the class.
4) Use binary serialization formats such as protobuf, Thrift, Avro, or MessagePack instead of JSON or XML to reduce the size of data exchanged between services.