Non runtime allocation solution - ArrayList - java

I'm making a game in Java. I need a solution for my current runtime allocation, which is caused by my ArrayList. Every 30 seconds to a minute the garbage collector starts to run, because I am calling the draw and update methods through this collection.
How should I be able to do a non runtime allocation solution?
Thanks in advance. If it helps, here is the code from my Manager class, which contains the ArrayList of objects:
@Override
public void draw(GL10 gl) {
    final int size = objects.size();
    for (int x = 0; x < size; x++) {
        Object object = objects.get(x);
        object.draw(gl);
    }
}
public void add(Object parent) {
    objects.add(parent);
}
// Get the collection; later we call the draw function on these objects
public ArrayList<Object> getObjects() {
    return objects;
}
public int getNumberOfObjects() {
    return objects.size();
}
More explanation: The reason I'm tinkering with this is that (1) I see that the ArrayList implementation is slow and causing lags and (2) I want to merge the objects/components together. When an update call is fired from my Thread class, it goes through my collection, sending things down the tree/graph using the Manager's update function.
When looking at an open source project, Replica Island, I found that the author used an alternative class, FixedSizeArray, that he wrote himself. Since I'm not that good at Java, I wanted to make things easier, and now I'm looking for another solution. He also explained WHY he made the special class:
FixedSizeArray is an alternative to a standard Java collection like ArrayList. It is designed to provide a contiguous array of fixed length which can be accessed, sorted, and searched without requiring any runtime allocation. This implementation makes a distinction between the "capacity" of an array (the maximum number of objects it can contain) and the "count" of an array (the current number of objects inserted into the array). Operations such as set() and remove() can only operate on objects that have been explicitly add()-ed to the array; that is, indexes larger than getCount() but smaller than getCapacity() can't be used on their own.

I see that the ArrayList implementation is slow and causing lags ...
If you see that, you are misinterpreting the evidence and jumping to unjustifiable conclusions. ArrayList is NOT slow, and it does NOT cause lags ... unless you use the class in a particularly suboptimal way.
The only times that an array list allocates memory are when you create the list, add more elements, copy the list, or call iterator().
When you create the array list, 2 java objects are created; one for the ArrayList and one for its backing array. If you use the initialCapacity argument and give an appropriate value, you can arrange that subsequent updates will not allocate memory.
When you add or insert an element, the array list may allocate one new object. But this only happens when the backing array is too small to hold all of the elements, and when it does happen the new backing array is larger than the old one by a constant factor (about 1.5 times in current OpenJDK; the exact policy is implementation-specific). So inserting N elements will result in only O(log N) allocations. Besides, if you create the array list with an appropriate initialCapacity, you can guarantee that there are zero allocations on add or insert.
When you copy a list to another list or array (using toArray or a copy constructor) you will get 1 or 2 allocations.
The iterator() method creates a new object each time you call it. But you can avoid this by iterating using an explicit index variable, List.size() and List.get(int). (Be aware that for (E e : someList) { ... } implicitly calls List.iterator().)
(External operations like Collections.sort do entail extra allocations, but that is not the fault of the array list. It will happen with any list type.)
In short, the only way you can get lots of allocations using an array list is if you create lots of array lists, or use them unintelligently.
The FixedSizedArray class you have found sounds like a waste of time. It sounds like it is equivalent to creating an ArrayList with an initial capacity ... with the restriction that it will break if you get the initial capacity wrong. Whoever wrote it probably doesn't understand Java collections very well.
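As a rough illustration of the points above, here is a minimal sketch of a pre-sized list that is drawn with an indexed loop. The names Drawable, DrawManager and MAX_OBJECTS are hypothetical stand-ins for the asker's classes (the real draw method takes a GL10 argument), and the sketch assumes an upper bound on the object count is known up front.

```java
import java.util.ArrayList;

// Hypothetical stand-in for the game's drawable objects.
interface Drawable {
    void draw();
}

class DrawManager {
    // Assumed upper bound on the number of objects; pick one that fits the game.
    static final int MAX_OBJECTS = 256;

    // The backing array is allocated once, here, with room for MAX_OBJECTS references.
    private final ArrayList<Drawable> objects = new ArrayList<>(MAX_OBJECTS);

    void add(Drawable d) {
        // No new backing array is needed while the size stays at or below MAX_OBJECTS.
        objects.add(d);
    }

    void drawAll() {
        // Indexed loop: avoids the Iterator object that a for-each loop would create.
        final int size = objects.size();
        for (int i = 0; i < size; i++) {
            objects.get(i).draw();
        }
    }

    int getNumberOfObjects() {
        return objects.size();
    }
}

class DrawManagerDemo {
    public static void main(String[] args) {
        DrawManager manager = new DrawManager();
        int[] drawn = {0};
        manager.add(() -> drawn[0]++);
        manager.add(() -> drawn[0]++);
        manager.drawAll();
        System.out.println(drawn[0] + " objects drawn"); // prints "2 objects drawn"
    }
}
```

If the bound turns out to be too small, the list still works; it simply allocates a larger backing array at that point.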

It's not quite clear what you are asking, but:
If you know at compile time what objects should be in the collection, make it an array not an ArrayList and set the contents in an initialisation block.
Object[] objects = new Object[]{obj1,obj2,obj3};

What makes you think you know what the GC is reclaiming? Have you profiled your application?

What do you mean by "non-runtime allocation"? I'm really not even sure what you mean by "allocation" in this context... allocation of memory? That's done at runtime, obviously. You clearly aren't referring to any kind of fixed pool of objects that are known at compile time either, since your code allows adding objects to your list several different ways (not that you'd be able to allocate anything for them at compile time even if you were).
Beyond that, nothing in the code you've posted is going to cause garbage collection by itself. Objects can only be garbage collected when nothing in the program has a strong reference to them, and your posted code only allows adding objects to the ArrayList (though they can be removed by calling getObjects() and removing from that, of course). As long as you aren't removing objects from the objects list, you aren't reassigning objects to point to a different list, and the object containing it isn't itself becoming eligible for garbage collection, none of the objects it contains will ever be available for garbage collection either.
So basically, there isn't any specific problem with the code you've posted and your question doesn't make sense as asked. Perhaps there are more details you can provide or there's a better explanation of what exactly your issue is and what you want. If so, please try to add that to your question.
Edit:
From the description of FixedSizeArray and the code I looked at in it, it seems largely equivalent to an ArrayList that is initialized with a specific array capacity (using the constructor that takes an int initialCapacity), except that it will fail at runtime if something tries to add to it when its array is full, where ArrayList will expand itself to hold more and continue working just fine. To be honest, it seems like a pointless class, possibly written because the author didn't actually understand ArrayList.
Note also that its statement about "not requiring any runtime allocation" is a bit misleading... it does of course have to allocate an array when it is created, but it just refuses to allocate a new array if its initial array fills up. You can achieve the same thing using ArrayList by simply giving it an initialCapacity that is at least large enough to hold the maximum number of objects you will ever add to it. If you do so, and you do in fact ensure you never add more than that number of objects to it, it will never allocate a new array after it is created.
However, none of this relates in any way to your stated issue about garbage collection, and your code still doesn't show anything that would cause huge numbers of objects to be garbage collected. If there is any issue at all, it may relate to the code that is actually calling the add and getObjects methods and what it's doing.

Related

Why do arrays in Java need to have a pre-defined length when Objects don't?

Sorry if this is a really stupid question, but having heard that "Java arrays are literally just Objects", it makes no sense to me that they need to have a pre-defined length.
I understand why primitive types do, for example int myInt = 15; allocates 32 bits of memory to store an integer and that makes sense to me. But if I had the following code:
class Integer {
    int myValue;
    public Integer(int myValue) {
        this.myValue = myValue;
    }
}
Followed by Integer myInteger = new Integer(15); myInteger.myValue = 5; then there's no limit on the amount of data I can store in myInteger. It's not limited to 32 bits, but rather it's a pointer to an Object which can store any amount of ints, doubles, Strings, or really anything. It allocated 32 bits of memory to store the pointer, but the object itself can store any amount of data, and it doesn't need to be specified beforehand.
So why can't an array do that? Why do I need to tell an array how much memory to allocate beforehand? If an array is "literally just an object" then why can't I simply say String[] myStrings = new String[];myStrings[0] = "Something";?
I'm super new to Java so there's a 100% chance that this is a stupid question and that there's a very simple and clear answer, but I am curious.
Also, to give another example, I can say ArrayList<String> myStrings = new ArrayList<String>();myStrings.add("Something"); without any problem... So what makes an ArrayList different from an array? Why does an array NEED to be told how much memory to allocate when an ArrayList doesn't?
Thanks in advance to anybody who takes the time to fill me in. :)
EDIT: Okay, so far everybody in the comments has misunderstood my post, and I feel like it's my fault for wording it wrong.
My question is not "how do I define an array?", or "does changing the value of a variable change its memory usage?", or "do pointers store the data of the object they point to?", or "are arrays objects?", nor is it "how do ArrayLists work?"
My question is, how come when I make an array I need to tell it how big the object it points to is, but when I make any other object it scales on its own without me telling it anything upfront? (With ArrayLists being an example of the difference)
I hope this makes more sense now... I'm not sure why everybody misunderstood? (Did I word something wrong? If so, let me know and I'll change it for others' convenience)
My question is why does a pointer to an array need to know how big the array is beforehand, when a pointer to any other object doesn't?
It doesn't. Here, this runs perfectly fine:
String[] x = new String[10];
x = new String[15];
The whole 'needs to know in advance how large it is' refers only to the ARRAY OBJECT. As in, new int[10] goes to the heap, which is like a giant beach, and creates a new treasure chest out of thin air, big enough to hold exactly 10 ints (Which, being primitives, are like coins in this example). It then buries it in the sand, lost forever. Hence why new int[10]; all by its lonesome is quite useless.
When you write int[] arr = new int[10];, you still do that, but you now also make a treasure map. X marks the spot. 'arr' is this map. It is NOT AN INT ARRAY. It is a map to an int array. In java, both [] and . are 'follow the map, dig down, and open er up'.
arr[5] = 10; means: Follow your arr map, dig down, open up the chest you find there, and you'll see it has room for precisely 10 little pouches, each pouch large enough to hold one coin. Take the 6th pouch. Remove whatever was there, put a 10ct coin in.
It's not the map that needs to know how large the chest is that the map leads to. It's the chest itself. And this is true for objects as well, it is not possible in java to make a treasure chest that can arbitrarily resize itself.
So how does ArrayList work?
Maps-in-boxes.
ArrayList has, internally, a field of type Object[]. That field doesn't hold an object array. It can't. It holds a map to an object array: It's a reference.
So, what happens when you make a new arraylist? It is a treasure chest, fixed size, with room for exactly 2 things:
A map to an 'object array' treasure chest (which it will also make, with room for 10 maps; it buries that chest in the sand and stores the map to this chest-of-maps inside itself).
A coinpouch. The coin inside represents how many objects the list actually contains. The map to the treasure it has leads to a treasure with room for 10 maps, but this coin (value: 0) says that so far, none of those maps go anywhere.
If you then run list.add("foo"), what that does is complicated:
"foo" is an object (i.e. treasure), so "foo" as an expression resolves to be a map to "foo". It then takes your list treasuremap, follows it, digs down, opens the box, and you yell 'oi! ADD THIS!', handing it a copy of your treasuremap to the "foo" treasure. What the box then does with this is opaque to you - that's the point of OO.
But let's dig into the sources of arraylist: What it will do, is query its treasuremap to the object array (which is private, you can't get to it, it's in a hidden compartment that only the djinn that lives in the treasure chest can open), follows it, digs down, and goes to the first slot (why? Because the 'size coin' in the coinpouch is currently at 0). It takes the map-to-nowhere that is there, tosses it out, makes a copy of your map to the "foo" treasure, and puts the copy in there. It then replaces its coin in the coin pouch with a penny, to indicate it is now size 1.
If you add an 11th element, the ArrayList djinn goes out to the other treasure, notices there is no room, and goes: Well, dang. Okay. It then conjures up an entirely new treasure chest that can hold 15 treasure maps, copies the 10 maps in the old treasure over to the new treasure chest, adds the copy of the map of the thing you added as 11th, then goes back to its own chest, rips out the map to the old treasure, replaces it with a map to the newly made treasure (with 15 slots), and puts an 11ct coin in the pouch.
The old treasure chest remains exactly where it is. If nobody has any maps to this (and nobody does), eventually, the beachcomber finds it, gets rid of it (that'd be the garbage collector).
Thus, ALL treasure chests are fixed size, but by replacing maps with new maps and conjuring up new treasure chests, you can nevertheless make it look like ArrayList is capable of shrinking and growing.
So why don't arrays allow it? Because that shrinking and growing stuff is complicated and arrays expose low-level functionality. Don't use arrays, use Lists.
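To make the maps-in-boxes picture concrete, here is a simplified sketch of a growable list. GrowableList and its capacity() method are hypothetical names for illustration; this is not the real java.util.ArrayList source, and the 1.5x growth factor is an assumption chosen to match the 10-to-15 step described above.

```java
import java.util.Arrays;

// Simplified illustration only; the real java.util.ArrayList differs in detail.
class GrowableList {
    private Object[] elements = new Object[10]; // the "map" to a fixed-size chest
    private int size = 0;                       // the "coin pouch"

    void add(Object item) {
        if (size == elements.length) {
            // The chest is full: allocate a larger one (1.5x, an assumed factor),
            // copy the references over, and repoint the field at the new array.
            // The old array is now unreferenced and can be garbage collected.
            elements = Arrays.copyOf(elements, elements.length + (elements.length >> 1));
        }
        elements[size++] = item;
    }

    Object get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index + ", size " + size);
        }
        return elements[index];
    }

    int size() {
        return size;
    }

    // Exposed only so the demo can show the backing array being replaced.
    int capacity() {
        return elements.length;
    }
}

class GrowableListDemo {
    public static void main(String[] args) {
        GrowableList list = new GrowableList();
        for (int i = 0; i < 11; i++) {
            list.add("item" + i);
        }
        System.out.println("size=" + list.size() + " capacity=" + list.capacity());
        // prints "size=11 capacity=15"
    }
}
```

No array ever changes length here; the list only swaps which array its field refers to.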
You seem to misunderstand what "storage" means. You say "there's no limit on the amount of data I can store", but if you run myInteger.myValue = 5, you overwrite the value of 15 that you put there originally. You still can't store any more than 32 bits; you can only change which 32 bits you put in that variable.
If you want to see how ArrayList works, you can read the source code; it can expand because if it runs out of space it creates a new larger array and switches its single array variable elementData to it.
Based on your update, it seems like you may be wondering about the ability to add lots of different fields to your object definition. In this case, those fields and their types are fixed when the class is compiled, and from that point on the class has a fixed size. You can't just pile in extra properties at runtime like you can in JavaScript. You are telling it up front about the scale it needs.
I'm going to ignore most of the details you've given, and answer the question in your edit.
My question is, how come when I make an array I need to tell it how big the object it points to is, but when I make any other object it scales on its own without me telling it anything upfront?
It's worth starting by dealing with "when I make any other object it scales on its own", because this isn't true. If you create a class like this:
class MyInteger {
    public int value;
    public MyInteger(int value) {
        this.value = value;
    }
}
Then that class has a statically defined size. Once you've compiled this class, the amount of memory for an instance of MyInteger is already determined. In this case, it's the object header size (JVM dependent), and the size of an integer (at least 4 bytes).
Once an object has been allocated by the JVM, its size cannot change. It is treated as a block of bytes by the JVM (and importantly, the garbage collector) until it is reclaimed. Classes like ArrayList give the illusion of growing, but they actually work by allocating other objects, which they store references to.
class MyArrayList {
    public int[] values;
    public MyArrayList(int[] values) {
        this.values = values;
    }
}
In this case, the MyArrayList instance will always take the same amount of memory (object header size + reference size), but the array that is referenced may change. We could do something like this:
MyArrayList list = new MyArrayList(new int[50]);
This allocates a block of memory for list, and a block of memory for list.values. If we then do (as ArrayList effectively does internally):
list.values = new int[500];
then the memory allocated for list is still the same size, but we have allocated a new block which we then reference in list.values. This leaves our old int[50] with no references (so it can be garbage collected). Importantly, though, no allocation has changed size. We have reallocated a new, bigger, block for our list to use, and have referenced it from our MyArrayList instance.
Why do arrays in Java need to have a pre-defined length when Objects don't?
In order to understand this, we need to establish that "size" is a complicated concept in Java. There are a variety of meanings:
Each object is stored in the heap as one or more heap nodes, where one of these is the primary node, and the rest are component objects that can be reached from the primary node.
The primary heap node is represented by a fixed and unchanging number of bytes of heap memory. I will call this the native size of the object [1].
An array has an explicit length field. This field is not declared. It has a type int and cannot be assigned to. There is actually a 32 bit field in the header of each array instance that holds the length.
The length of an array directly maps to its native size. The JVM can compute the native size from the length.
An object that is not an array instance also has a native size. This is determined by the number and types of the object's fields. Since fields cannot be added or removed at runtime, the native size does not change. But it doesn't need to be stored since it can be determined (when needed) at runtime from the object's class.
Some objects support a class specific size concept. For example, a String has a size returned by its length() method, and an ArrayList has a size returned by its size() method.
NB:
The meaning of the class specific size is ... class specific.
The class specific size does not correlate to the native size of an instance. (Except in degenerate cases ...)
In fact, all objects have a fixed native size.
1 - This term is solely for the purposes of this answer. I claim no authority for this term ...
Examples:
A String[] has a native size that depends on its length. On a typical JVM it will be 12 + length * (<reference size>) rounded up to a multiple of 16 bytes.
Your Integer class has a fixed native size. On a typical JVM each instance will be 16 bytes long.
An ArrayList object has 2 private int fields and a private Object[] field. That gives it a fixed native size of either 16 or 24 bytes. One of the int fields is called size, and it contains the value returned by size().
The size of an ArrayList may change, but this is implemented by the code of the class. In order to do this, it may need to reallocate its internal Object[] to make it large enough to hold more elements. If you examine the source code for the ArrayList class, you can see how this happens. (Look for the ensureCapacity and grow methods.)
So, the differences between the size(s) of regular object and the length of an array are:
The native size of a regular object is determined solely by the type of the object, and it never changes. It is rarely relevant to the application and it is not exposed via a field.
The length of an array depends on the value supplied when you instantiate it. It never changes. The native size can be determined from the length.
The class specific size of an object (if relevant) is managed by the class.
To your revised question:
My question is, how come when I make an array I need to tell it how big the object it points to is, but when I make any other object it scales on its own without me telling it anything upfront? (With ArrayLists being an example of the difference)
The point is that at the JVM level, NOTHING scales automatically. The native size of a Java object CANNOT change.
Why? Because increasing the size of the object's heap node would entail moving the heap node, and a heap node cannot be moved without updating all references for the object. That cannot be done efficiently.
(It has been pointed out that the GC can efficiently move heap nodes. However, that is not a viable solution. Running the GC is expensive. It would be highly inefficient to perform a GC in order to (say) grow a single Java array. If Java had been specified so that arrays could "grow", it would need to be implemented using an underlying non-growable array type.)
The ArrayList case is handled by the ArrayList class itself, and it does it by (if necessary) creating a new, larger backing array, copying the elements from the old to the new, and then discarding the old backing array. It also adjusts the size field that holds the logical size of the list.
Object arrays allocate space for object pointers, and not entire objects in memory.
So new String[10] doesn't allocate space for 10 strings, but for 10 object references that will point to whatever strings are later stored in the array.
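A small sketch of this, with the hypothetical class name ReferenceArrayDemo: a freshly created String[] holds only null references until something is assigned, and assigning a string fills a slot without changing the array's length.

```java
class ReferenceArrayDemo {
    // Returns how many slots of a freshly created String[] are null.
    static int countNullSlots(int length) {
        // Allocates `length` reference slots; no String objects are created.
        String[] strings = new String[length];
        int nulls = 0;
        for (String s : strings) {
            if (s == null) {
                nulls++;
            }
        }
        return nulls;
    }

    public static void main(String[] args) {
        System.out.println(countNullSlots(10)); // prints 10

        String[] strings = new String[10];
        strings[0] = "Something";           // slot 0 now refers to a String object
        System.out.println(strings.length); // prints 10, the length is unchanged
    }
}
```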

Does appending/removing entries to a Java list reallocate memory?

This is low-level memory question about how Java performs .add and .remove on an ArrayList or other types of lists. I would think that Java would have to do a reallocation of memory to append/remove items to a list, but it could be doing something I'm not thinking of to avoid this. Does anyone know?
If by "regular list" you mean java.util.List, that is an interface. It does not specify anything about whether or when any memory is allocated in association with adding or removing elements -- those are details of specific implementations.
As for java.util.ArrayList in particular, its docs say:
Each ArrayList instance has a capacity. The capacity is the size of the array used to store the elements in the list. It is always at least as large as the list size. As elements are added to an ArrayList, its capacity grows automatically. The details of the growth policy are not specified beyond the fact that adding an element has constant amortized time cost.
In other words, Java does not specify the answer to your question.
If I were to speculate based on the available documentation, I would guess that java.util.ArrayList.remove() never performs any memory allocation or reallocation. It seems to follow from the docs overall that java.util.ArrayList.add() allocates additional space at least sometimes (in the form of a new, longer internal array). In order to achieve constant amortized cost for element additions, however, I don't see how it could reallocate on every element addition. Almost certainly, it reallocates only when its capacity is insufficient, and then it scales the capacity by a constant factor -- e.g. doubles it.
All list implementations require storage of some information about the objects in the list and the order of those objects. Larger lists require more such information because there is some information for each object in the list. Thus adding to a list must, on average, result in allocation of more storage for this information.
Adding an element to a list does not copy the object that was added to the list. Indeed, no Java statements cause an additional copy of an object to be visible to your program (you have to explicitly use a copy constructor or a clone method to do that). This is because Java objects are never accessed directly, but are always accessed through a reference. Adding an object to a collection really means adding a new reference to the object to the collection.
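One way to see that a list stores references rather than copies is to mutate an object after adding it. The sketch below uses a hypothetical class name, ReferenceInListDemo, and a StringBuilder simply because it is mutable.

```java
import java.util.ArrayList;
import java.util.List;

class ReferenceInListDemo {
    // Returns true if the list element is the very same object that was added
    // and reflects a change made after the add.
    static boolean listSeesLaterChange() {
        StringBuilder sb = new StringBuilder("abc");
        List<StringBuilder> list = new ArrayList<>();
        list.add(sb);     // stores a reference to sb, not a copy of it
        sb.append("def"); // mutate the object after it was added
        return list.get(0) == sb && list.get(0).toString().equals("abcdef");
    }

    public static void main(String[] args) {
        System.out.println(listSeesLaterChange()); // prints true
    }
}
```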

What is the best practice for setting the initial capacity of an ArrayList that is not always used?

I have some classes that have ArrayList fields that are only sometimes used. I generally initialize these fields like so:
private List<Widget> widgets = new ArrayList<>();
I understand about using the overload constructors to set the initial capacity, so I'm wondering if I should declare these fields as such:
private List<Widget> widgets = new ArrayList<>(0);
The dilemma, is that if I initialize the list with 0 then the list will always have to re-initialize itself for adding even one item. But, if I use the default constructor, which gives a default capacity of 10, then I may have a bunch of items (and there can be many) that are wasting memory with unused capacity.
I know some of you are going to push back asking 'how often' and 'how many items are you expecting' but I'm really looking for the "best practices" approach. All things being ~equal, should one initialize with (0) or () on a list that is sometimes used?
It's our department policy to always initialize lists, so I may not simply leave the lists as null, besides, that would just side-step the question.
Premature optimisation is the root of all evil. - D. Knuth.
This seems like the kind of "performance issue" which actually never has any effect on performance. For one thing, how sure are you that these empty lists are actually initialised? I suspect that most modern compilers delay initialisation of objects until they know for sure that there will be a call on them. So if you pass the no arg constructor it will most likely never be used unless something is added to the list. On the other hand, if you use the 0 argument constructor, it guarantees that it has to resize every one that it uses.
These are the three laws of performance optimisation
Never assume that you know what the compiled code is actually doing, or that you can spot small optimisations better than the compiler can.
Never optimise without using a profiler to work out where the bottleneck is. If you think that you know, refer to rule number (1).
Don't bother unless your application has a performance issue. Then refer to rule (2).
On the off chance that you somehow still believe that you understand compilers, check out this question: Why is it faster to process a sorted array than an unsorted array?
If the list is not always used, use lazy initialization:
private List<Widget> widgets;
private List<Widget> getList() {
    if (widgets == null) {
        widgets = new ArrayList<>();
    }
    return widgets;
}
If you set it to 0 the ArrayList will have to resize anyhow, so really you're shooting yourself in the foot. The only time you would benefit from an explicit declaration of size would be if you already know the maximum bounds that you will be reaching in your list.
As stated, this is a micro-optimization, it's more likely you will find other things that you can significantly improve than the initial size of your ArrayList.
I tend to disagree that these optimisations are bad. If you declare an ArrayList that holds n elements (10 by default, if I'm not mistaken) and you add one more, the ArrayList will internally grow its backing array (by about half in OpenJDK). When you remove that element later, the backing array will not shrink.
ArrayList makes good use of the processor cache and is actually so fast that you don't need to optimize it any further. Still, if you have to create millions of tiny ArrayList instances, it may be worth rethinking your overall design rather than worrying about the default ArrayList capacity.
As has already been said, Lazy initialization can help you by postponing the moment when you have to initialize the list (and therefore choose its initial size).
If lazy initialization is not possible because your department policy does not allow initializing an object to null (which I do not see much sense in), a workaround might be to initialize an empty list as
List<Widget> widget = new ArrayList<>(0);
and only when (and if) you actually need to work with the list, you create a new list object:
widget = new ArrayList<>(someSize);
and hopefully at that moment you will know the maximum size that the list can reach (or at least its order of magnitude).
I know this is a very stupid trick, but it adheres to your policy.

Java: Efficiently keep track of used objects

I have a program that collects objects over time. Those objects are often, but not always duplicates of objects the program has already received. The number of unique objects can sometimes be up in the tens of thousands. As my lists grow, it takes more time to identify whether an object has appeared or not before.
My current method is to store everything in an ArrayList, al; use Collections.sort(al); and use Collections.binarySearch(al, key) to determine whether I've used an object. Every time I come across a new object, however, I have to insert and sort.
I'm wondering if there's just a better way to do this. Contains tends to slow up too quickly. I'm looking for something as close to O(1) as possible.
Thanks much.
This is java. For the purpose of understanding what I'm talking about, I basically need a method that does this:
public boolean objectAlreadyUsed(Object o) {
    return /* have we seen this object already? */;
}
Instead of using an ArrayList, why wouldn't you use a Set implementation (likely a HashSet)? You'll get constant-time lookup, no sorting needed.
N.B. your objects will need to correctly override hashCode() and equals().
This begs the question - why not use a data structure that doesn't allow duplicates (e.g. Set)? If you attempt to add a duplicate item, the method will return false and the data structure will remain unchanged.
Make sure the objects have correct equals() and hashCode() methods, and store them in a HashSet. Lookup then becomes constant time.
If retaining unwanted objects becomes an issue, by the way, you could consider using one of the many WeakHashSet implementations available on the Internet -- it will hold the objects but still allow them to be garbage collected if necessary.
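A minimal sketch of the Set-based approach suggested in these answers. SeenTracker is a hypothetical wrapper class; unlike the method in the question, this version also records the object as part of the check, which is an assumption about the intended use. It relies on the stored objects having consistent equals() and hashCode() implementations.

```java
import java.util.HashSet;
import java.util.Set;

class SeenTracker {
    private final Set<Object> seen = new HashSet<>();

    // Returns true if o was already recorded; otherwise records it and returns false.
    boolean objectAlreadyUsed(Object o) {
        // Set.add returns false when an equal element is already present,
        // so a single call both checks and records.
        return !seen.add(o);
    }
}

class SeenTrackerDemo {
    public static void main(String[] args) {
        SeenTracker tracker = new SeenTracker();
        System.out.println(tracker.objectAlreadyUsed("a")); // prints false
        System.out.println(tracker.objectAlreadyUsed("b")); // prints false
        System.out.println(tracker.objectAlreadyUsed("a")); // prints true
    }
}
```

With a HashSet, both the check and the insert take expected constant time, so no sorting or binary search is needed.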

Is Java HashMap.clear() and remove() memory effective?

Consider the following HashMap.clear() code:
/**
 * Removes all of the mappings from this map.
 * The map will be empty after this call returns.
 */
public void clear() {
    modCount++;
    Entry[] tab = table;
    for (int i = 0; i < tab.length; i++)
        tab[i] = null;
    size = 0;
}
It seems that the internal array (table) of Entry objects is never shrunk. So when I add 10000 elements to a map and then call map.clear(), it will keep 10000 nulls in its internal array. So my question is: how does the JVM handle this array of nothing, and is HashMap therefore memory efficient?
The idea is that clear() is only called when you want to re-use the HashMap. Reusing an object should only be done for the same reason it was used before, so chances are that you'll have roughly the same number of entries. To avoid useless shrinking and resizing of the Map the capacity is held the same when clear() is called.
If all you want to do is discard the data in the Map, then you need not (and in fact should not) call clear() on it, but simply clear all references to the Map itself, in which case it will be garbage collected eventually.
Looking at the source code, it does look like HashMap never shrinks. The resize method is called to double the size whenever required, but there is nothing like ArrayList.trimToSize().
If you're using a HashMap in such a way that it grows and shrinks dramatically often, you may want to just create a new HashMap instead of calling clear().
You are right, but considering that increasing the array is a much more expensive operation, it's not unreasonable for the HashMap to think "once the user has increased the array, chances are he'll need the array this size again later" and just leave the array instead of decreasing it and risking to have to expensively expand it later again. It's a heuristic I guess - you could advocate the other way around too.
Another thing to consider is that each element in table is simply a reference. Setting these entries to null will remove the references from the items in your Map, which will then be free for garbage collection. So it isn't as if you are not freeing any memory at all.
However, if you need to free even the memory being used by the Map itself, then you should release it as per Joachim Sauer's suggestion.
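The two options discussed in these answers can be sketched side by side. MapReuseDemo and its method names are hypothetical; the size of HashMap's internal table is not visible through the public API, so the sketch only shows the two usage patterns, not the memory effect itself.

```java
import java.util.HashMap;
import java.util.Map;

class MapReuseDemo {
    private Map<Integer, String> cache = new HashMap<>();

    void fill(int n) {
        for (int i = 0; i < n; i++) {
            cache.put(i, "value" + i);
        }
    }

    // Option 1: empty the map but keep the same HashMap object for reuse.
    // The entries become collectable; the internal table stays allocated.
    void reset() {
        cache.clear();
    }

    // Option 2: drop the old map entirely. Once nothing else references it,
    // the old map and its internal table can be garbage collected.
    void release() {
        cache = new HashMap<>();
    }

    int size() {
        return cache.size();
    }

    public static void main(String[] args) {
        MapReuseDemo demo = new MapReuseDemo();
        demo.fill(10000);
        demo.reset();   // suitable when a similar number of entries will be added again
        demo.fill(10000);
        demo.release(); // suitable when the map will stay small or unused afterwards
        System.out.println(demo.size()); // prints 0
    }
}
```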
