How to store an array of instances - java

I'm coding in Java, but most languages would do just fine.
Right now, I have an implementation like this:
I have an array that stores objects from a class. The array's length is 10,000.
This is for a little project I'm working on. Essentially, over time, any place in the array can be unused or have an object in it. Objects can be created or destroyed at any moment.
What I was trying to figure out was the best way to store and recall them to minimize time for two steps:
Creating an object. Cycling through the array until you find an open slot can be slow when lots of instances are near the front.
Drawing an object. I have to cycle through the array constantly, and, based on the existence of an instance, retrieve and display information regarding it.
My current system uses a list of the Objects, as well as a list of Booleans. When creating an object, I just cycle through till the first empty place in the array, then fill it, and to draw, I just go over the whole thing.
Granted, it isn't slow enough to make the project impossible, but I'd still like to know the most efficient method.

The best way is not to do this. Use one of the Java collections. If you aren't going to be adding or removing things very often, maybe use an ArrayList. If you are, then use a LinkedList. And if you want to maintain a mapping to particular indices, you might consider e.g. a TreeMap. These are all iterable.

What about using some sort of named-map construct instead of an array such as a HashMap?
This will link your objects to a unique key, and then you can just use .remove(key) to remove the object once it's done.
No real iterations involved, just direct access.

You could maintain a second datastructure, a simple list of 'unused' locations in your array. When you remove an object, push either a reference to that array location or the index of that array location into your list. When you need to add an object, take the first item off your list. When the list is empty, then you get to scan your array as you do now, but when the list isn't empty, finding an unused element can be very quick. (You could keep a list of all unused places, which would remove the full-array search, but keeping a list of 10,000 unused items also seems excessive.)
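A rough sketch of the free-list idea described above, in Java. The class name SlotPool and its methods are made up for illustration. One deliberate difference from the description: instead of scanning the array when the free list is empty, this version keeps a counter of the first never-used slot (an assumption on my part, not something the answer specifies).

```java
import java.util.ArrayDeque;

/** Fixed-capacity pool that remembers freed indices so adds avoid scanning. */
class SlotPool<T> {
    private final Object[] slots;
    // Indices that were used once and have since been freed.
    private final ArrayDeque<Integer> freeIndices = new ArrayDeque<>();
    // Every slot at or beyond this index has never been used.
    private int nextUnused = 0;

    SlotPool(int capacity) {
        slots = new Object[capacity];
    }

    /** Stores a non-null item and returns its index, or -1 if the pool is full. */
    int add(T item) {
        int index;
        if (!freeIndices.isEmpty()) {
            index = freeIndices.pop();   // reuse a freed slot
        } else if (nextUnused < slots.length) {
            index = nextUnused++;        // take a never-used slot
        } else {
            return -1;                   // no room left
        }
        slots[index] = item;
        return index;
    }

    /** Clears the slot and records it as reusable. */
    void remove(int index) {
        if (slots[index] != null) {
            slots[index] = null;
            freeIndices.push(index);
        }
    }

    @SuppressWarnings("unchecked")
    T get(int index) {
        return (T) slots[index];
    }
}
```

With this layout both add and remove avoid any search; drawing still walks the array and skips null slots.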

You could have a list where you could place the indexes of the empty spaces in the array of objects.

Instead of using an array, you could use a HashSet or even a HashMap without worrying about open slots or the maximum capacity your array can hold. You also don't need to worry about adding duplicate objects to a HashSet or HashMap. A HashMap lets you get your object by whatever key you assigned to it, which makes your code look cleaner without all the looping.
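To make the HashMap suggestion concrete, here is a small sketch. EntityRegistry, the integer ids, and the use of String as a stand-in for the real object type are all assumptions for illustration.

```java
import java.util.HashMap;
import java.util.Map;

class EntityRegistry {
    // Key: an id handed out at creation time. Value: the stored object.
    private final Map<Integer, String> entities = new HashMap<>();
    private int nextId = 0;

    /** Stores the entity and returns the key it can later be found under. */
    int create(String entity) {
        int id = nextId++;
        entities.put(id, entity);
        return id;
    }

    /** Removes by key; no scan for the slot is needed. */
    void destroy(int id) {
        entities.remove(id);
    }

    int size() {
        return entities.size();
    }

    /** "Drawing" here just appends each live entity to the output. */
    void drawAll(StringBuilder out) {
        for (String entity : entities.values()) {
            out.append(entity).append('\n');
        }
    }
}
```

Note that a HashMap does not iterate in insertion order; if draw order matters, a LinkedHashMap would be the usual substitute.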

You could use Object[] and just keep around an index to the first open slot. Although, that's pretty much what an ArrayList does.
Also, be aware that iterating through an array of 10k items is incredibly fast. I wouldn't be concerned about that time unless you have proven it to be an issue.


What is the purpose of a Stream.Builder in Java

To put it differently, what do I gain by using Stream.Builder.add() to add items to the builder and then using Stream.Builder.build(), versus adding the items in a collection or array and creating a Stream from that?
I assume there is a benefit somewhere in some circumstances but it's not obvious to me...
Assuming the machine has enough memory, using Stream.Builder allows you to add more than Integer.MAX_VALUE elements to it.
Internally, Stream.Builder uses a SpinedBuffer, which is a non public class.
From SpinedBuffer docs:
An ordered collection of elements. Elements can be added, but not removed.
Goes through a building phase, during which elements can be added, and a
traversal phase, during which elements can be traversed in order but no
further modifications are possible.
One or more arrays are used to store elements. The use of a multiple
arrays has better performance characteristics than a single array used by
ArrayList, as when the capacity of the list needs to be increased
no copying of elements is required. This is usually beneficial in the case
where the results will be traversed a small number of times.
So, it also avoids ArrayList resizing.
It has a very interesting internal data structure that is highly optimized for insertions, without the possibility of removal or random access.
Your array and/or Collection pay an additional price for all other functionality that they support.
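For reference, a minimal sketch of the builder in use. The class and method names are hypothetical; only Stream.builder(), add() and build() come from the JDK.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class StreamBuilderDemo {
    /** Adds elements one at a time, then switches to the traversal phase. */
    static List<String> evenLabels(int limit) {
        Stream.Builder<String> builder = Stream.builder();
        for (int i = 0; i < limit; i++) {
            if (i % 2 == 0) {
                builder.add("item-" + i);   // building phase: append only
            }
        }
        // After build() the builder rejects further additions.
        return builder.build().collect(Collectors.toList());
    }
}
```

This fits the case where the number of elements is not known up front and no intermediate collection is otherwise needed.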

Performance difference between arrays, stacks and queues

What is the search performance of arrays, stacks and queues?
I think that arrays are the quickest and most straightforward, because I can access any element immediately by calling it using its index. Is this correct? What about the performance of stacks and queues? How do they compare?
Arrays (and collections based on arrays, eg. ArrayList) are bad for search performance because in the worst case you end up comparing every item (O(n)).
But if you don't mind modifying the order of your elements, you could sort the array (Arrays.sort(yourArray)) and then use Arrays.binarySearch(yourArray, element) on it, which provides an O(log n) performance profile (much better than O(n)).
Searching a stack is O(n), so it offers no improvement.
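A short sketch of the sort-then-search approach. SortedSearch is a made-up helper; the array passed in must already be sorted, otherwise the result of Arrays.binarySearch is undefined.

```java
import java.util.Arrays;

class SortedSearch {
    /** Requires 'sorted' to be in ascending order. */
    static boolean contains(int[] sorted, int value) {
        // binarySearch returns a non-negative index when the value is found.
        return Arrays.binarySearch(sorted, value) >= 0;
    }

    public static void main(String[] args) {
        int[] data = {42, 7, 19, 3};
        Arrays.sort(data);                       // one-time O(n log n) cost
        System.out.println(contains(data, 19));  // each lookup is O(log n)
        System.out.println(contains(data, 20));
    }
}
```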
Queues are not even meant to be iterated on, so looking for an object in here would mean consuming the whole queue, which is 1. not performant (O(n)) and 2. probably not what you want.
So among the datastructures you suggested, I would go for a sorted array.
Now, if you don't mind considering other datastructures, you really should take a look at those that use hash functions (HashSet, HashMap...).
Hash-based structures are really good at searching for elements, with a performance profile in O(1) (given a good hashCode() method in your objects).
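As an illustration of what "a good hashCode() method" involves, here is a sketch with a hypothetical Point class. The key point is that equals() and hashCode() are overridden together and use the same fields.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

final class Point {
    final int x;
    final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof Point)) {
            return false;
        }
        Point other = (Point) o;
        return x == other.x && y == other.y;
    }

    @Override
    public int hashCode() {
        // Must be consistent with equals: equal points give equal hashes.
        return Objects.hash(x, y);
    }
}

class HashLookupDemo {
    static Set<Point> build() {
        Set<Point> points = new HashSet<>();
        points.add(new Point(1, 2));
        points.add(new Point(3, 4));
        return points;
    }
}
```

With these overrides, contains(new Point(1, 2)) finds the stored element without scanning the set.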
I'll try to answer in a very simple way.
Stacks and queues are for storing data temporarily so that you can process the contents one by one. Just like a queue for buying movie tickets, or a stack of pancakes, you process one element at a time.
Arrays are for storing data as well as for accessing elements from the beginning, the end or in between. For searching, arrays would be a better choice.
Can you search elements inside stacks and queues? Possibly. But that's not what they are used for.
It depends on how your search (or which search algorithm) is implemented. A Stack or Queue may also be helpful in some search applications, such as BFS and DFS. But in the normal case, when you are using linear search, you can consider an array or ArrayList.
In Java you have ArrayList (built on an array), Stack (built on an array), and ArrayBlockingQueue and ArrayDeque (both also built on arrays). As they all use the same underlying data structure, their access speeds are basically the same.
For a brute force search, the time to scan or iterate over them (all of them support iteration) is O(n). By the way, even a HashMap uses an array to store its entries, which is why iterating over its elements to find a value, e.g. containsValue, is O(n) as well.
While you could have a sorted array which would more naturally sit in an ArrayList, you could equally argue that a PriorityQueue will find and remove the next element the most efficiently. A Stack is ideal for finding the most recently added element.
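A small sketch of the two cases mentioned above. The helper names are hypothetical, and ArrayDeque is used as the stack here rather than java.util.Stack, which is my choice and not the answer's.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.PriorityQueue;

class NextElementDemo {
    /** Stack: the most recently pushed element is on top. Needs at least one value. */
    static int mostRecent(int... values) {
        Deque<Integer> stack = new ArrayDeque<>();
        for (int v : values) {
            stack.push(v);
        }
        return stack.peek();
    }

    /** Priority queue: poll() removes the smallest element in O(log n). Needs at least one value. */
    static int smallest(int... values) {
        PriorityQueue<Integer> queue = new PriorityQueue<>();
        for (int v : values) {
            queue.add(v);
        }
        return queue.poll();
    }
}
```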
To answer the question you have to determine what assumptions the person asking the question is making. Without these further assumptions you would have to say they could all be used. In that case I would use an ArrayList as it is the simplest to understand IMHO.
Stacks, queues and arrays are three different and efficient data structures, but as a bioinformatician, if you want to store biological data, you should choose a stack, because its last-in, first-out behaviour matches recursive procedures. Recursion is naturally expressed with a stack: at each procedure call a value can be pushed onto it and retrieved when the procedure exits. So it is an easy method to use.
You cannot really compare one data structure with another. Each one has its own advantages and disadvantages.
Although arrays are good, you cannot use them all the time because of their fixed size. They suit reading and overwriting elements by index, because that takes O(1) time; inserting or deleting in the middle requires shifting elements and takes O(n).
But when you want to access and insert data from one end only and don't want to search or delete in between, you go for a stack or a queue. Stacks and queues differ only in how elements are accessed. In a stack you access data from the same side you entered it (LIFO), which takes O(1) time. In a queue you access data from the opposite end (FIFO). Accessing elements in between can be done, but that's not what they are used for.

Efficiency of ArrayList

I am making a program in Java in which a ball bounces around on the screen. The user can add other balls, and they all bounce off of each other. My question lies in the storage of the added balls. At the moment, I am using an ArrayList to store them, and every time the space bar is pressed, a new ball class is created and added to an Array List. Is this the most efficient way of doing things? I don't specify the size of the Array List at the beginning, so is it inefficient to have to allocate a new space on the array every time the user wants a new ball, even if the ball count will get up in the hundreds? Is there another class I could use to handle this in a more efficient manner?
Thanks!
EDIT:
Sorry, I should have been more clear. I iterate through the balls every 30 milliseconds, using nested for loops to see if they are intersecting with each other. I do access one ball the most often (the ball which the user can control with the arrow keys, another feature of the game), but the user can choose to switch control balls. Balls are never removed. So, I am performing some fairly complex calculations (I use my own vector class to move them off of each other every time there is a collision) on the balls very often.
Measure it and find out! In all seriousness, often times the best way to get answers to these questions is to set up a benchmark and swap in different collection types.
I can tell you that it won't allocate new space every time you add a new item to the ArrayList. Extra space is allocated so that it has room to grow.
LinkedList is another List option. It is super cheap to add items, but random access (list.get(10)) is expensive. Sets could also be good if you don't need ordered access (though there are ordered sets, too), and you want a Map implementation if you're accessing them by some sort of key/id. It really all depends on how you're using the collection.
Update based on added details
It sounds like you are mostly doing sequential reads through the entire list. In that scenario, a LinkedList is probably your best choice. Though again, if you only expose the List interface to the rest of your code (or even a more general Collection), you can easily swap in different implementations and actually measure the difference.
ArrayList is a highly optimized and very efficient wrapper on top of a plain Java array. A small timing overhead comes from copying array elements, which happens when the allocated size is less than the required number of elements. When you grow the array to a few hundred items, the copying will happen fewer than ten times, so the price you pay for not knowing the size in advance is very small. You can further reduce that overhead by suggesting an initial size for your ArrayList.
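Suggesting an initial size looks roughly like this. Ball, BallStore and the figure of 512 are placeholders for illustration; the only real API used is the ArrayList(int initialCapacity) constructor.

```java
import java.util.ArrayList;
import java.util.List;

class Ball {
    double x;
    double y;

    Ball(double x, double y) {
        this.x = x;
        this.y = y;
    }
}

class BallStore {
    // Hypothetical upper estimate of how many balls a session will create.
    static final int EXPECTED_BALLS = 512;

    // The capacity is only a hint: the list still grows if more balls are added.
    final List<Ball> balls = new ArrayList<>(EXPECTED_BALLS);

    void addBall(double x, double y) {
        balls.add(new Ball(x, y));
    }
}
```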
Removing from the middle of the ArrayList does take linear time. If you plan to remove items and/or insert them in the middle of the list frequently, this may become an issue. Note, however, that the overhead is not going to be worse than that for a plain array.
I iterate through the balls every 30 milliseconds, using nested for loops to see if they are intersecting with each other.
This does not have much to do with the collection in which the balls are stored. You could use a spatial index to improve the speed of finding intersections.
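One simple kind of spatial index is a uniform grid. The sketch below is only one possible approach and makes assumptions the question does not state: all balls share the same positive radius, and positions are passed as parallel arrays. GridCollisions and its method names are invented for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class GridCollisions {
    /** Returns index pairs {i, j} with i < j for every pair of overlapping circles. */
    static List<int[]> findOverlaps(double[] xs, double[] ys, double radius) {
        // With cells as wide as a diameter, overlapping balls are at most one cell apart.
        double cell = 2 * radius;
        Map<Long, List<Integer>> grid = new HashMap<>();
        for (int i = 0; i < xs.length; i++) {
            long k = key(cellOf(xs[i], cell), cellOf(ys[i], cell));
            grid.computeIfAbsent(k, unused -> new ArrayList<>()).add(i);
        }

        List<int[]> pairs = new ArrayList<>();
        for (int i = 0; i < xs.length; i++) {
            int cx = cellOf(xs[i], cell);
            int cy = cellOf(ys[i], cell);
            // Only the ball's own cell and its eight neighbours can hold a partner.
            for (int dx = -1; dx <= 1; dx++) {
                for (int dy = -1; dy <= 1; dy++) {
                    List<Integer> bucket = grid.get(key(cx + dx, cy + dy));
                    if (bucket == null) {
                        continue;
                    }
                    for (int j : bucket) {
                        if (j <= i) {
                            continue;   // report each pair once
                        }
                        double ddx = xs[i] - xs[j];
                        double ddy = ys[i] - ys[j];
                        if (ddx * ddx + ddy * ddy < cell * cell) {
                            pairs.add(new int[] {i, j});
                        }
                    }
                }
            }
        }
        return pairs;
    }

    private static int cellOf(double coordinate, double cell) {
        return (int) Math.floor(coordinate / cell);
    }

    /** Packs two cell coordinates into one map key. */
    private static long key(int cx, int cy) {
        return ((long) cx << 32) | (cy & 0xFFFFFFFFL);
    }
}
```

For a few hundred balls the plain nested loop may well be fast enough; measuring both is the only way to know.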
For ArrayList in Java, removing at the end and adding one element are both amortized O(1). In other words, it is efficient in most cases. (In some rare cases, such as when the backing array has to be resized, a single operation will be slow.)
But you should think more carefully about your design before choosing your data structure.
How many objects are usually in your collection? If the number is small, you are free to choose whichever data structure you find easiest to work with; your code will lose almost no performance.
If you often need to find one particular ball among all of your balls, another data structure such as HashMap or HashSet would be better.
Or, if you often delete from the middle of your list, LinkedList may be the appropriate choice :)
I'd recommend working out the way in which you need to access the balls, and pick an appropriate interface (Not implementation) eg. If you're accessing sequentially only, use a List. If you need to look up the ball by ID, think of a Map. The interface should match your requirements in terms of functionality, not in terms of speed/efficiency.
Then pick an implementation, eg. HashMap or TreeMap, and write your code.
Afterwards, profile it. Is your code inefficient in the ball access code? If so, then try to optimise by switching to an alternate implementation that's more appropriate to your needs.
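A sketch of what coding against the interface can look like. BallNames is a hypothetical class, and String stands in for the real ball type; the point is that the implementation is chosen in exactly one place.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

class BallNames {
    // Only the List interface is visible to the rest of this class.
    private final List<String> names;

    /** The caller decides which implementation backs the list. */
    BallNames(List<String> storage) {
        this.names = storage;
    }

    void add(String name) {
        names.add(name);
    }

    /** Sequential access only, so any List implementation works. */
    int totalLength() {
        int total = 0;
        for (String name : names) {
            total += name.length();
        }
        return total;
    }

    public static void main(String[] args) {
        // Swapping implementations for a benchmark is a one-line change.
        BallNames arrayBacked = new BallNames(new ArrayList<>());
        BallNames linkBacked = new BallNames(new LinkedList<>());
        arrayBacked.add("red");
        linkBacked.add("red");
        System.out.println(arrayBacked.totalLength() == linkBacked.totalLength());
    }
}
```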

ArrayList.remove() induces lags

I have a problem that's been bothering me for some time. I have a software where I generate a certain number of objects in a physical world, storing them in an ArrayList. However, this implementation generates lags when removing objects from the ArrayList.
A static array implementation does not lead to lags, but is less practical as I cannot use "add" and "remove".
I'm assuming the lags with ArrayLists are due to memory freeing and re-allocation. As my ArrayList has a fixed maximal size, is it possible to preallocate it a certain memory, to avoid these issues? Or is there another solution for this?
Thanks a lot in advance for any help!
I normally would not just repost someone else's answer, but this seems to fit very well.
The problem here is that an ArrayList is internally implemented, as the name states, with an array. This means that elements of the collection can't be freely inserted or removed without shifting the elements after the index you are working at.
So if your collection has a lot of elements and you, for example, remove the 5th, then all the elements from 6th to the end of the list must be shifted to the left by one position. This can be expensive indeed and leads to a O(n) complexity.
To avoid these issues, you should choose an appropriate collection according to the most common operations you are going to perform on it. A LinkedList can be good if you need to iterate, remove (removal requires finding the element first, so it only pays off if you are already enumerating the elements) or insert elements, but whenever you want to access a specific index you will run into trouble.
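Removing while already enumerating might look like the sketch below. The helper name and the "negative means dead" rule are invented for the example. On a LinkedList each it.remove() unlinks the current node without shifting anything.

```java
import java.util.Iterator;
import java.util.List;

class RemoveWhileIterating {
    /** Removes every negative value in a single pass over the list. */
    static void removeNegatives(List<Integer> values) {
        Iterator<Integer> it = values.iterator();
        while (it.hasNext()) {
            if (it.next() < 0) {
                // Removes the element just returned by next().
                it.remove();
            }
        }
    }
}
```

The same code works on an ArrayList, but there each removal still shifts the tail of the array.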
You could also look for a HashSet or a TreeSet, they could be suitable to your solution.
In these circumstances, knowing how the most common data structures work and what each is good or bad at is always useful for making appropriate choices.

Speeding up a linked list?

I'm a student and fairly new to Java. I was looking over the different speeds achieved by two collections in Java, LinkedList and ArrayList. I know that an ArrayList is much, much faster at looking up values and placing them at its indexes. My question is:
how can one make a linked list faster, if at all possible?
Thanks for any help.
zmahir
When talking about speed, perhaps you mean complexity. Retrieval by index for an ArrayList (and arrays) is O(1), as is appending at the end (amortized), while for a LinkedList retrieving or inserting at a given index is O(n). This follows from how the two structures are built and cannot be changed.
O(n) means that in order to insert an object at a given position, or retrieve it, you must traverse, in the worst case, all n items in the list. Hence n operations. For an ArrayList, retrieval by index is only one operation.
You probably can't. You don't know the size (well, OK, you can), nor the location of each element. To find element 100 in a linked list, you need to start with item 1, find its link to item 2, etc. until you reach 100. This makes inserting into this list a tedious job.
There are many alternatives depending on your exact goals. You can use b-trees or similar methods to split the large linked list into smaller ones. Or use hashlists if you want to quickly find items. Or use simple arrays. But if you want a list that performs like an ArrayList, why not use an ArrayList?
You can split off regions which are linked to the main linked list, so this gives you entry points directly inside the list so you don't have to walk up to them. See the subList method here: http://download.oracle.com/javase/1.4.2/docs/api/java/util/AbstractList.html. This is useful if you have a number of 'sentences' made out of words, say. You can use a separate linked list to iterate over the sentences, which are sublists of the main linked list.
You can also use a ListIterator when adding, removing, or accessing elements. This helps greatly with increasing the speed of sequential access. See the listIterator method for this, and the class: http://download.oracle.com/javase/1.4.2/docs/api/java/util/ListIterator.html.
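A sketch of using a ListIterator to insert at the current position during one sequential pass. The method and class names are made up; listIterator(), next() and add() are the standard java.util API.

```java
import java.util.List;
import java.util.ListIterator;

class InsertWhileIterating {
    /** Inserts 'addition' directly after every occurrence of 'target'. */
    static void insertAfterEach(List<String> words, String target, String addition) {
        ListIterator<String> it = words.listIterator();
        while (it.hasNext()) {
            if (it.next().equals(target)) {
                // Inserted at the cursor; the next call to next() skips over it.
                it.add(addition);
            }
        }
    }
}
```

On a LinkedList each it.add() is a constant-time relink, because the iterator already holds the position.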
Speed of a linked list could be improved by using skip lists: http://igoro.com/archive/skip-lists-are-fascinating/
A linked list uses pointers to walk through the items, so, for example, if you ask for the 5th item, the runtime will start from the first item and walk through each pointer until it reaches the 5th item.
There is really not much you can do about it. A linked list may not be a good choice if you need fast access to items. There are some optimizations for it, such as creating a circular linked list or a doubly linked list where you can walk back and forth through the list, but this really depends on the business logic and the application requirements.
My advice is to avoid linked lists if they do not match your needs; changing to a different data structure might be the best approach.
As a general rule, data structures are designed to do certain things well. LinkedLists are designed to be faster than ArrayLists at inserting elements and removing elements and about the same as ArrayLists at iterating across the list in order. When you change the way a LinkedList works, you make it no longer a true LinkedList, so there's not really any way to modify them to be faster at something and still be a LinkedList.
You'll need to examine the way you're using this particular collection and decide whether a LinkedList is really the best data structure for your purposes. If you share with us how you're using it, and why you need it to be faster, then we can advise you on which data structure you ought to consider using.
Lots of people smarter than you or me have looked at the implementation of the Java collection classes. If there were an optimization to be made, they would have found it and already made it.
Since the collection classes are pretty much as optimized as they can be, our primary task should be to choose the correct one.
When choosing your collection type, don't forget about things like HashSet. If order doesn't matter, and you don't need to put duplicates in the collection, then HashSet may be appropriate.
I'm a student and fairly new to Java. ... how can one make a linked list faster, if at all possible?
The standard Java collection types (indeed all data structures implemented in any language!) represent compromises on various "measures" such as:
The amount of memory needed to represent the data structure.
The time taken to perform various operations; e.g. for a "list" the operations of interest are insertion, removal, indexing, contains, iteration and so on.
How easy or hard it is to integrate / reuse the collection type; see below.
So for instance:
ArrayList offers lower memory overheads, fast indexing (O(1)), but slow contains, random insertion and removal (O(N)).
LinkedList has higher memory overheads, slow indexing and contains (O(N)), but faster removal (O(1)) under certain circumstances.
The various performance measures are typically determined by the maths of the various data structures. For example, if you have a chain of nodes, the only way to get the ith node is to step through them from the beginning. This involves following i pointers.
Sometimes you can modify the data structures to improve one aspect of the performance. But this typically comes at the cost of some other aspect of the performance. (For example, you could add a separate index to make indexing of a linked list faster. But the cost of maintaining the index on insertion / deletion would mean that you'd probably be better off using an ArrayList.)
In some cases the integration / reuse requirements have significant impact on performance.
For example, it is theoretically possible to optimize a linked list's space usage by adding a next field to the list element type, combining the element and node objects and saving 16 or so bytes per list entry. However, this would make the list type less general (the member/element class would need to implement a specific interface), and has the restriction that an element can belong to at most one list at any time. These restrictions are so limiting that this approach is rarely used in Java.
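A minimal sketch of such a combined element-and-node design, with hypothetical names (Particle, IntrusiveList). It shows the restriction directly: because the link lives in the element, a Particle can sit in only one list at a time, and this sketch does not check for that.

```java
/** Element type that doubles as its own list node. */
class Particle {
    final String name;
    // The link is stored in the element, so no separate node object is allocated.
    Particle next;

    Particle(String name) {
        this.name = name;
    }
}

class IntrusiveList {
    private Particle head;
    private int size;

    /** Links the particle in at the front; it must not already be in a list. */
    void addFirst(Particle p) {
        p.next = head;
        head = p;
        size++;
    }

    int size() {
        return size;
    }

    /** Walks the chain from the head and joins the names with commas. */
    String names() {
        StringBuilder sb = new StringBuilder();
        for (Particle p = head; p != null; p = p.next) {
            if (sb.length() > 0) {
                sb.append(',');
            }
            sb.append(p.name);
        }
        return sb.toString();
    }
}
```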
For a second example, consider the problem of inserting at a given position in a linked list. For the LinkedList class, this is normally an O(N) operation, because you have to step through the list to find the position. In theory, if an application could find and remember a position, it should be able to perform the insertion at that position in O(1). Unfortunately, the List API provides no way to "remember" a position.
While neither of these examples is a fundamental roadblock to a developer "doing his own thing", they illustrate that using general data structure APIs and general implementations of those APIs has performance implications, and therefore represents a trade-off between performance and ease-of-use.
I'm a bit surprised by the answers here. There are big differences between the theoretical performance of LinkedLists and ArrayLists and the actual performance of the Java implementations.
What makes the Java LinkedList slower than a theoretical linked list is that it does a lot more than just the basic operations. For example, it checks for concurrent modifications and performs other safety checks.
If you know your use case, you can write your own simple implementation of a linked list, and it will be much faster.
