I have been working on a Java program that generates fractal orbits for quite some time now. Much like photographs, the larger the image, the better it will look when scaled down. The program uses a 2D object (Point) array, which is written to when a point's value is calculated. That is to say, each Point is stored at its corresponding indices, i.e.:
Point p = new Point(25,30);
histogram[25][30] = p;
Of course, this is edited for simplicity. I could just write the point values to a CSV and apply them to the raster later, but similar methods have yielded undesirable results. I tried for quite some time because I enjoyed being able to make larger images with the space freed by not having this array, but it just won't work. For clarity, I'd like to add that the Point object also stores color data.
The next problem is the WritableRaster, which will have the same dimensions as the array. Combined, the two take up a great deal of memory. I have come to accept this after trying to change the way it is done several times, each with lower-quality results.
After trying to optimize for memory and time, I've come to the conclusion that I'm really limited by RAM. This is what I would like to change. I am aware of the -Xmx switch (set to 10GB). Is there any way to use Windows' virtual memory to store the raster and/or the array? I am well aware of the significant performance hit this will cause, but short of lowering quality, there really doesn't seem to be much choice.
The OS already backs RAM with hard drive space (the page file) for you and every process, of course -- no magic needed. Relying on it will be more of a performance disaster than you think; it will be so slow as to effectively not work.
Are you looking for memory-mapped files?
http://docs.oracle.com/javase/6/docs/api/java/nio/MappedByteBuffer.html
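A minimal sketch of that approach, keeping pixel data in a file-backed buffer rather than on the heap, so the OS pages it to disk as needed. The class, file name, and layout here are made up for illustration, not taken from the original program:

```java
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

public class MappedRasterDemo {
    // Maps a small region of `path`, writes `value` at offset 0, reads it back.
    // A real raster would map width * height * 4 bytes and index by pixel.
    static int roundTrip(String path, int value) throws Exception {
        try (RandomAccessFile raf = new RandomAccessFile(path, "rw");
             FileChannel ch = raf.getChannel()) {
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_WRITE, 0, 4096);
            buf.putInt(0, value);  // e.g. a packed ARGB pixel
            return buf.getInt(0);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(Integer.toHexString(roundTrip("raster.bin", 0xFF00FF00)));
    }
}
```

The buffer lives outside the Java heap, so -Xmx no longer bounds it; the trade-off is that every access may fault a page in from disk.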
If this is really to be done in memory, I would bet that you could dramatically lower your memory usage with some optimization. For example, your Point object is mostly overhead and not data. Count up the bytes needed for the reference, then for the Object overhead, compared to two ints.
You could reduce the overhead to nothing with two big parallel int arrays for your x and y coordinates. Of course you'd have to encapsulate this for access in your code. But it could halve your memory usage for this data structure. Millions fewer objects also speeds up GC runs.
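A sketch of what that encapsulation might look like. The class and method names are hypothetical, and a packed-int color column is included since the Point objects also carry color data:

```java
// Parallel primitive arrays instead of per-point objects: no object header
// or reference overhead per point, and far fewer objects for the GC to scan.
public class PointStore {
    private final int[] xs;
    private final int[] ys;
    private final int[] colors; // packed 0xAARRGGBB

    public PointStore(int capacity) {
        xs = new int[capacity];
        ys = new int[capacity];
        colors = new int[capacity];
    }

    public void set(int i, int x, int y, int color) {
        xs[i] = x;
        ys[i] = y;
        colors[i] = color;
    }

    public int x(int i)     { return xs[i]; }
    public int y(int i)     { return ys[i]; }
    public int color(int i) { return colors[i]; }

    public static void main(String[] args) {
        PointStore store = new PointStore(1_000_000);
        store.set(0, 25, 30, 0xFF00FF00);
        System.out.println(store.x(0) + "," + store.y(0));
    }
}
```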
Instead of putting a WritableRaster in memory, consider writing out the image file in some simple image format directly, yourself. BMP can be very simple. Then perhaps use an external tool to convert it efficiently.
Try -XX:+UseCompressedOops to reduce object overhead too. Also try -XX:NewRatio=20 or higher to make the JVM reserve almost all its heap for long-lived objects. This can actually let you use more heap.
It is not recommended to configure your JVM memory parameters (-Xmx) so that the operating system has to allocate from its swap memory. Apparently the garbage collection mechanism needs random access to heap memory, and if it doesn't get it, the program will thrash for a long time and possibly lock up. Please check the answer already given to my question (last paragraph):
does large value for -Xmx postpone Garbage Collection
Related
I have an application that produces large results objects and puts them in a queue. Multiple worker threads create the results objects and queue them, and a single writer thread de-queues the objects, converts them to CSV, and writes them to disk. Due to both I/O and the size of the results objects, writing the results takes far longer than generating them. This application is not a server, it is simply a command-line app that runs through a large batch of requests and finishes.
I would like to decrease the overall memory footprint of the application. Using a heap analysis tool (IBM HeapAnalyzer), I am finding that just before the program terminates, most of the large results objects are still on the heap, even though they were de-queued and have no other references to them. That is, they are all root objects. They take up the majority of the heap space.
To me, this means that they made it into tenured heap space while they were still in the queue. As no full GC is ever triggered during the run, that is where they remain. I realize that they should be tenured, otherwise I'd be copying them back and forth within the Eden spaces while they are still in the queue, but at the same time I wish there was something I could do to facilitate getting rid of them after de-queueing, short of calling System.gc().
I realize one way of getting rid of them would be to simply shrink the maximum heap size and trigger a full GC. However the inputs to this program vary considerably in size and I would prefer to have one -Xmx setting for all runs.
Added for clarification: this is all an issue because there is also a large memory overhead in Eden for actually writing the objects out (mostly String instances, which also appear as roots in the heap analysis). There are frequent minor GCs in Eden as a result. These would be less frequent if the result objects were not hanging around in the tenured space. The argument could be made that my real problem is the output overhead in Eden, and I am working on that, but I wanted to pursue this tenuring issue at the same time.
As I research this, are there any particular garbage collector settings or programmatic approaches I should be focusing on? Note I am using JDK 1.8.
Answer Update: #maaartinus made some great suggestions that helped me avoid queueing (and thus tenuring) the large objects in the first place. He also suggested bounding the queue, which would surely cut down on the tenuring of what I am now queueing instead (the CSV byte[] representations of the results objects). The right mix of thread count and queue bounds will definitely help, though I have not tried this as the problem basically disappeared by finding a way to not tenure the big objects in the first place.
I'm sceptical concerning a GC-related solution, but it looks like you're creating a problem you needn't have:
Multiple worker threads create the results objects and queue them, and a single writer...
... writing the results takes far longer than generating them ...
So it looks like it should actually be the other way round: single producer and many consumers to keep the game even.
Multiple writers mightn't give you much of a speed-up, but I'd try it if possible. The number of producers doesn't matter much as long as you use a bounded queue for their results (I'm assuming they have no substantially sized input, as you haven't mentioned any). This bounded queue would also ensure that the objects never get too old.
In any case, you can use multiple to-CSV converters, effectively replacing a big object with a big String, byte[], ByteBuffer, or whatever (assuming you want to do the conversion in memory). The nice thing about the buffer is that you can recycle it (so the fact that it gets tenured is no longer a problem).
You could also use some unmanaged memory, but I really don't believe it's necessary. Simply bounding the queue should be enough, unless I'm missing something.
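A minimal sketch of the bounded handoff; sizes and names are illustrative only. `put` blocks once the cap is reached, so pending results cannot accumulate (and tenure) faster than the writer drains them:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class BoundedHandoff {
    // Pushes `total` buffers through a queue bounded at `cap` and returns
    // how many the consuming side drained.
    static int pump(int total, int cap) throws InterruptedException {
        BlockingQueue<byte[]> queue = new ArrayBlockingQueue<>(cap);

        Thread producer = new Thread(() -> {
            try {
                for (int i = 0; i < total; i++) {
                    queue.put(new byte[1024]); // blocks while `cap` items are pending
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();

        int drained = 0;
        while (drained < total) {
            queue.take(); // the writer thread would convert to CSV and write here
            drained++;
        }
        producer.join();
        return drained;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(pump(100, 16));
    }
}
```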
And by the way, quite often the cheapest solution is to buy more RAM. Really, one hour of work is worth a couple of gigabytes.
Update
how much should I be worried about contention between multiple writer threads, since they would all be sharing one thread-safe Writer?
I can imagine two kinds of problems:
Atomicity: While synchronization ensures that each operation happens atomically, it doesn't mean that the output makes any sense. Imagine multiple writers, each of them generating a single CSV, where the resulting file should contain all the CSVs (in any order). Using a PrintWriter would keep each line intact, but it'd intermix them.
Concurrency: For example, a FileWriter performs the conversion from chars to bytes, which may in this context end up inside a synchronized block. This could reduce parallelism a bit, but as the IO seems to be the bottleneck, I guess it doesn't matter.
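For the atomicity point, a sketch of one way to keep each record intact: build the whole CSV line locally, then hand it to the shared Writer in a single synchronized call. All names here are illustrative:

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;

public class AtomicCsvWriter {
    private final Writer out;

    public AtomicCsvWriter(Writer out) { this.out = out; }

    public void writeRecord(String... fields) throws IOException {
        // Build the complete record off-lock, in thread-local memory.
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) sb.append(',');
            sb.append(fields[i]);
        }
        sb.append('\n');
        synchronized (out) {  // one lock acquisition per complete record
            out.write(sb.toString());
        }
    }

    public static void main(String[] args) throws IOException {
        StringWriter sink = new StringWriter();
        AtomicCsvWriter w = new AtomicCsvWriter(sink);
        w.writeRecord("a", "b", "c");
        System.out.print(sink);
    }
}
```

Records from different threads can still interleave with each other, but no record is ever split mid-line.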
This question already has answers here:
Calculate size of Object in Java [duplicate]
(3 answers)
Closed 1 year ago.
I am comparing a Trie with a HashMap storing English words, over 1 million of them. After the data is loaded, only lookups are performed. I am writing code to test both speed and memory. The speed seems easy to measure: simply record the system time before and after the test code.
What's the way to measure the memory usage of an object? In this case, it's either a Trie or a HashMap. I watched the system performance monitor while testing in Eclipse. The OS performance monitor shows over 1 GB of memory in use after my testing program is launched. I doubt that storing the data needs so much memory.
Also, on my Windows machine, memory usage keeps rising throughout the testing time. This shouldn't happen, since the initial loading of the data is short, and after that, during the lookup phase, there shouldn't be any additional memory consumption, since no new objects are created. On Linux, the memory usage seems more stable, though it also increased somewhat.
Would you please share some thoughts on this? Thanks a lot.
The short answer is: you can't.
The long answer is: you can estimate the size of objects in memory by repeating a differential memory analysis, calling the GC multiple times before and after the tests. But even then, only a very large number of rounds can approximate the real size. You need a warm-up phase first, and even if it all seems to work smoothly, you can get stuck on JIT and other optimizations you were not aware of.
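The differential analysis described above might be sketched like this; treat the number it prints as a rough estimate only, since `System.gc()` is merely a hint and JIT activity can skew the samples:

```java
import java.util.HashMap;
import java.util.Map;

public class MemoryDiff {
    static long usedHeap() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        System.gc();                 // a hint, not a guarantee
        long before = usedHeap();

        // Build the structure under test (here, a stand-in HashMap).
        Map<String, Integer> map = new HashMap<>();
        for (int i = 0; i < 100_000; i++) {
            map.put("word" + i, i);
        }

        System.gc();
        long after = usedHeap();
        System.out.println("approx bytes: " + (after - before));
        if (map.size() != 100_000) throw new AssertionError(); // keep map reachable
    }
}
```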
In general, it's a good rule of thumb to count the number of objects you use.
If your trie implementation uses objects as the structure representing the data, it is quite possible that your memory consumption is high compared to a map.
If you have a vast amount of data, a map might become slow because of collisions.
A common approach is to optimize later in case optimization is needed.
Did you try the "jps" tool provided by Oracle in the Java SDK? You can find it in the JavaSDK/bin folder. It's a great tool for performance checking and even memory usage.
I'm developing a visualization app for Android (including older devices running Android 2.2).
The input model of my app contains an area which typically consists of tens of thousands of vertices. Typical models have 50,000-100,000 vertices (each with x, y, z float coordinates), i.e. they use up 600-1200 KB of total memory. The app requires all vertices to be available in memory at any time. This is all I can share about the app (I'm not allowed to share high-level use cases), so I'm wondering whether my conclusions below are correct and whether there is a better solution.
For example, assume there are count=50000 vertices. I see two solutions:
1.) My earlier solution was using my own VertexObj class (better readability due to encapsulation, better locality when accessing individual coordinates):
public static class VertexObj {
public float x, y, z;
}
VertexObj[] mVertices = new VertexObj[count]; // 50,000 objects
2.) My other idea is using a large float[] instead:
float[] mVertices = new float[count * 3]; // 150,000 float values
The problem with the first solution is the big memory overhead -- we are on a mobile device where the app's heap might be limited to 16-24MB (and my app needs memory for other things too). According to the official Android pages, object allocation should be avoided when it is not truly necessary. In this case, the memory overhead can be huge even for 50,000 vertices:
First of all, the "useful" memory is 50000*3*4 = 600K (this is used up by float values). Then we have +200K overhead due to the VertexObj elements, and probably another +400K due to Java object headers (they're probably at least 8 bytes per object on Android, too). This is 600K "wasted" memory for 50,000 vertices, which is 100% overhead (!). In case of 100,000 vertices, the overhead is 1.2MB.
The second solution is much better, as it requires only the useful 600K for float values.
Apparently, the conclusion is that I should go with float[], but I would like to know the risks in this case. Note that my doubts might be related with lower-level (not strictly Android-specific) aspects of memory management as well.
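For reference, a sketch of how the flat float[] could be hidden behind accessors to keep some of the readability of VertexObj; the names are hypothetical:

```java
// Flat x0,y0,z0, x1,y1,z1, ... layout with the indexing arithmetic
// encapsulated, so call sites stay readable.
public class VertexBuffer {
    private final float[] coords;

    public VertexBuffer(int count) { coords = new float[count * 3]; }

    public void set(int i, float x, float y, float z) {
        coords[3 * i]     = x;
        coords[3 * i + 1] = y;
        coords[3 * i + 2] = z;
    }

    public float x(int i) { return coords[3 * i]; }
    public float y(int i) { return coords[3 * i + 1]; }
    public float z(int i) { return coords[3 * i + 2]; }

    public static void main(String[] args) {
        VertexBuffer buf = new VertexBuffer(50_000);
        buf.set(0, 1.0f, 2.0f, 3.0f);
        System.out.println(buf.x(0) + " " + buf.y(0) + " " + buf.z(0));
    }
}
```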
As far as I know, when I write new float[300000], the app requests the VM to reserve a contiguous block of 300000*4 = 1200K bytes. (It happened to me in Android that I requested a 1MB byte[], and I got an OutOfMemoryException, even though the Dalvik heap had much more than 1MB free. I suppose this was because it could not reserve a contiguous block of 1MB.)
Since the GC of Android's VM is not a compacting GC, I'm afraid that if the memory is "fragmented", such a huge float[] allocation may result in an OOM. If I'm right here, then this risk should be handled. E.g. what about allocating several float[] objects, each storing a portion such as 200 KB? Such linked-list memory management mechanisms are used by operating systems and VMs, so it sounds unusual to me that I would need to use them here (at the application level). What am I missing?
If nothing, then I guess that the best solution is using a linked list of float[] objects (to avoid OOM but keep overhead small)?
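A hypothetical sketch of the chunking I have in mind, so that no single allocation needs a huge contiguous region; the chunk size and names are made up:

```java
// Vertex data split across fixed-size float[] blocks. Each chunk is
// ~200 KB (50,000 floats * 4 bytes), small enough to place even in a
// fragmented heap.
public class ChunkedFloats {
    private static final int CHUNK = 50_000; // floats per chunk
    private final float[][] chunks;

    public ChunkedFloats(int totalFloats) {
        int n = (totalFloats + CHUNK - 1) / CHUNK; // ceil division
        chunks = new float[n][];
        for (int i = 0; i < n; i++) {
            int len = Math.min(CHUNK, totalFloats - i * CHUNK);
            chunks[i] = new float[len];
        }
    }

    public void set(int index, float value) {
        chunks[index / CHUNK][index % CHUNK] = value;
    }

    public float get(int index) {
        return chunks[index / CHUNK][index % CHUNK];
    }

    public static void main(String[] args) {
        ChunkedFloats cf = new ChunkedFloats(150_000);
        cf.set(120_000, 1.5f);
        System.out.println(cf.get(120_000));
    }
}
```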
The out-of-memory error you are facing while allocating the float array is quite strange.
If the biggest contiguous memory block available in the heap is smaller than the memory required by the float array, the heap grows in order to accommodate the required memory.
Of course, this fails if the heap has already reached the maximum available to your application. That would mean your application has exhausted the heap and then released a significant number of objects, resulting in memory fragmentation and no more heap to allocate. However, if this is the case, and assuming the fragmented memory is enough to hold the float array (otherwise your application wouldn't run anyway), it's just a matter of allocation order.
If you allocate the memory required for the float array during application startup, you have plenty of contiguous memory for it. Then you just let your application do the remaining work, as the contiguous memory is already allocated.
You can easily check the memory blocks being allocated (and the free ones) using DDMS in Eclipse: select your app and press the Update Heap button.
Just for the sake of avoiding misleading you, I tested this before posting, allocating several contiguous memory blocks of float[300000].
Regards.
I actually ran into a problem where I wanted to embed data for a test case. You'll have quite a fun time embedding huge arrays: Eclipse kept complaining when the function exceeded something like 65,535 bytes of data because I declared an array that way. However, this is actually a quite common approach.
The rest comes down to optimization. The big question is: would it be worth the trouble doing all of that optimizing? If you aren't hard-up on RAM, you should be fine using 1.2 MB. There's also a chance that Java will complain if you have an array that large, but you can do things like use a fancier data structure such as a LinkedList, or chop up the array into smaller ones. For statically set data, I feel an array is a good choice if you are reading it heavily.
I know you can make .xml files for integers, so storing each value as an integer (multiplying it by a factor before storage, reading it in, then dividing by the same factor) would be another option. You can also put things like text files into your assets folder. Just read them once when the application starts, and you can read/write however you like.
As for double vs. float, I feel that in your case, a math or science case, doubles would be safer if you can pull it off. If you do any math, you'll have less chance of error with double, especially with an operation like multiplication. Floats are usually faster. I'm not sure if Java does SIMD packing, but if it does, more floats than doubles can be packed into a SIMD register.
I am implementing a program that has about 2,000,000 (2 million) arrays, each of size 16,512 (128 x 129) integers. I only need 200 arrays at a time (about 13 MB), but I wonder if I can expand the program to have more than 2 million (say 200 million) arrays while still only needing 200 at a time. So what is the limit on making more and more arrays while I never use more than 200 at a time?
I highly doubt that, unless you're running on a 64 bit machine with a lot of RAM and a very generous heap.
Let's calculate the memory you'll need for your data:
2,000,000 * 128 * 129 * 4 / 1024 / 1024 / 1024 ≈ 123GB (a Java int is 4 bytes).
You'll need additional RAM for the JVM, the rest of your program, and the operating system.
Sounds like a poorly conceived solution to me.
If you mean "I only have 200 arrays in memory at a time" you can certainly do that, but you'll have to move the rest out to secondary storage or a relational database. Query for them, use them, GC them. It might not be the best solution, but it's hard to tell based on the little you've posted.
Update:
Does "trigger" mean "database trigger"?
Yes, you can store them on disk. I can't guarantee that it'll perform. Your hard drive can certainly handle this much data; it's feasible that it'll accommodate ten times as much if it's large enough.
Just remember that you have to think about how you'll manage RAM. GC thrashing might be a problem. A good caching solution might be your friend here. Don't write one yourself.
What happens if that hard drive fails and you lose all that data? Do you back it up? Can your app afford to be down if the disk fails? Think about those scenarios, too. Good luck.
As long as you increase the max heap size to make sure your application doesn't run out of memory, you should be fine.
As long as you don't keep references to arrays you no longer need, there is no hard limit. Old arrays will automatically get garbage collected, so you can keep allocating and abandoning arrays pretty much ad infinitum.
There is, of course, a limit on how many arrays you can keep around at any given time. This is limited by the amount of memory available to the JVM.
I am developing an application that allows users to set the maximum data set size they want me to run their algorithm against.
It has become apparent that array sizes around 20,000,000 in size causes an 'out of memory' error. Because I am invoking this via reflection, there is not really a great deal I can do about this.
I was just wondering: is there any way I can check or calculate what the maximum array size could be based on the user's heap-space settings, and therefore validate user entry before running the application?
If not, are there any better solutions?
Use Case:
The user provides a data size they want to run their algorithm against, we generate a scale of numbers to test it against up to the limit they provided.
We record the time it takes to run and measure the values (in order to work out the o-notation).
We need to somehow limit the user's input so as not to hit this error. Ideally we want to measure n^2 algorithms on array sizes as big as we can (runs could last for days), so we really don't want it running for 2 days and then failing, as that would have been a waste of time.
You can use the result of Runtime.freeMemory() to estimate the amount of available memory. However, it might be that actually a lot of memory is occupied by unreachable objects, which will be reclaimed by GC soon. So you might actually be able to use more memory than this. You can try invoking the GC before, but this is not guaranteed to do anything.
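A sketch of such an estimate. The per-entry cost here is a made-up guess and would have to be tuned for the objects actually stored, and as noted the available-memory figure is only approximate:

```java
public class Headroom {
    // maxMemory() is the -Xmx ceiling; totalMemory() - freeMemory()
    // approximates what is already in use (including garbage not yet collected).
    static long approxAvailable() {
        Runtime rt = Runtime.getRuntime();
        long used = rt.totalMemory() - rt.freeMemory();
        return rt.maxMemory() - used;
    }

    public static void main(String[] args) {
        long perEntry = 16; // hypothetical bytes per entry; depends on the stored objects
        long safeEntries = (long) (approxAvailable() * 0.5 / perEntry); // keep 50% slack
        System.out.println("rough upper bound on entries: " + safeEntries);
    }
}
```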
The second difficulty is estimating the amount of memory needed for a number given by the user. While it is easy to calculate the size of an ArrayList with so many entries, this might not be all. For example, which objects are stored in this list? I would expect at least one object per entry, so you need to add this memory too. Calculating the size of an arbitrary Java object is much more difficult (and in practice only possible if you know the data structures and algorithms behind the objects). And then there might be a lot of temporary objects created during the run of the algorithm (for example boxed primitives, iterators, StringBuilders, etc.).
Third, even if the available memory is theoretically sufficient for running a given task, it might be practically insufficient. Java programs can get very slow if the heap is repeatedly filled with objects, then some are freed, some new ones are created and so on, due to a large amount of Garbage Collection.
So in practice, what you want to achieve is very difficult and probably next to impossible. I suggest just try running the algorithm and catch the OutOfMemoryError.
Usually, catching errors is something you should not do, but this seems like an occasion where it's OK (I do this in some similar cases). You should make sure that as soon as the OutOfMemoryError is thrown, some memory becomes reclaimable for GC. This is usually not a problem: as the algorithm aborts, the call stack is unwound and some (hopefully a lot of) objects are no longer reachable. In your case, you should probably ensure that the large list is among the objects which immediately become unreachable in the case of an OOM. Then you have a good chance of being able to continue your application after the error.
However, note that this is not a guarantee. For example, if you have multiple threads working and consuming memory in parallel, the other threads might as well receive an OutOfMemoryError and not be able to cope with this. Also the algorithm needs to support the fact that it might get interrupted at any arbitrary point. So it should make sure that the necessary cleanup actions are executed nevertheless (and of course you are in trouble if those need a lot of memory!).
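A minimal sketch of the catch-and-continue idea; the array here simply stands in for the user's data set, and the sizes are illustrative:

```java
public class OomProbe {
    // Attempts one run at the given size; returns false if it was too big.
    static boolean tryRun(int size) {
        try {
            long[] data = new long[size]; // stands in for the algorithm's data set
            data[size - 1] = 1;           // touch it so it isn't optimized away
            return true;
        } catch (OutOfMemoryError e) {
            // `data` is unreachable here, so the failed allocation's memory
            // is immediately reclaimable and the app can continue.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(tryRun(1_000));
    }
}
```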