I have an ArrayList which I fill with objects of type Integer in a serial fashion (i.e. one-by-one) from the end of the ArrayList (i.e. using the method add(object)). Every time I do this, the other objects in the ArrayList are of course left-shifted by one index.
In my code I want to find the index of a random object in the ArrayList. I want to avoid using the indexOf method because I have a very big ArrayList and the looping will take an enormous amount of time. Are there any workarounds? Some idea how to keep in some data structure maybe the indexes of the objects that are in the ArrayList?
EDIT: Apparently my question was not clear, or I misunderstood the arraylist.add(object) method (which is also very possible!). What I want is something like a sliding window: objects are inserted at one end of the ArrayList and dropped from the other, and each insertion shifts the other objects by one index. I could use arraylist.add(0, object) to insert at the left end, right-shifting the previous objects by one index each time, but a Google search told me this is a very processing-intensive operation, O(N) if I remember right. So I thought "OK, let's insert the objects at the right end of the ArrayList, no problem!", assuming that each insertion would still move the previous objects by one index (to the left this time).
Also, when I use the term "index" I simply mean the position of the object in the ArrayList. Maybe there is a more formal term "index" which means something different.
You have a couple of options. Here are the two basic options:
You can maintain a Map<Object,Integer> that holds indexes, in parallel to the array. When you append an element to the array you can just add it to the map. When you remove an element from the beginning you will have to iterate through the entire map and subtract one from every index.
If it's appropriate for your situation and the Map does not meet your performance requirements, you could add an index field to your objects and store the index directly when you add it to the array. When you remove an element from the beginning you will have to iterate through all objects in the list and subtract one from their index. Then you can obtain the index in constant time given an object.
These still have the performance hit of updating the indexes after a remove. Now, after you choose one of these options, you can avoid having to iterate through the map / list to update after removal if you make a simple improvement:
Instead of storing the index of each object, store a count of the total number of objects added so far. Then to get the actual index, simply subtract the count value of the first object from the value of the one you are looking for. E.g. when you add:
add a to end;
a.counter = counter++;
remove first object;
(The initial value of counter when starting the program doesn't really matter.) Then to find an object "x":
index = x.counter - first object.counter;
Whether you store counter as a new field or in a map is up to you. Hope that helps.
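The counter idea above can be sketched as follows. This is only an illustration, not a drop-in implementation: the class name IndexedWindow and its methods are hypothetical, the counters are kept in a map rather than in a field, the elements are assumed to be distinct, and an ArrayDeque stands in for the ArrayList so that removing from the front is cheap.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: a sliding window that reports an element's position in O(1).
class IndexedWindow<T> {
    private final ArrayDeque<T> window = new ArrayDeque<>();
    private final Map<T, Long> counterOf = new HashMap<>(); // element -> insertion counter
    private long nextCounter = 0;  // total number of elements added so far
    private long firstCounter = 0; // counter of the element currently at the front

    // Append at the end. Assumption: elements are distinct (by equals/hashCode).
    void addLast(T item) {
        window.addLast(item);
        counterOf.put(item, nextCounter++);
    }

    // Drop the oldest element; no other stored counter has to be updated.
    T removeFirst() {
        T item = window.removeFirst();
        counterOf.remove(item);
        firstCounter++;
        return item;
    }

    // Position counted from the front of the window, or -1 if the item is absent.
    int indexOf(T item) {
        Long c = counterOf.get(item);
        return c == null ? -1 : (int) (c - firstCounter);
    }

    int size() {
        return window.size();
    }
}
```

After adding 10, 20 and 30 and then calling removeFirst(), indexOf(30) goes from 2 to 1 without any stored value being touched, which is the point of storing counters instead of indexes.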
By the way, a linked list will have better performance when removing objects from the front of the list, but worse when accessing an object by index. It may be more appropriate depending on your balance of add/remove vs. random access (if you only care about the index but never actually need to retrieve an object by index, random access performance doesn't matter). If you really need to optimize further you could consider using a fixed-capacity ring buffer instead (back inserts, front removes, and random access will all be O(1)).
Of course, option 3 is to reconsider your algorithm at a higher level; perhaps there is a way to accomplish the behavior you are seeking that does not require finding the objects in the list.
To solve a dynamic programming problem I used two approaches to store the table entries: one using a multi-dimensional array, e.g. tb[m][n][p][q], and the other using a HashMap whose keys are strings built from the indexes of the first approach, as in "m,n,p,q". But on one input the first approach completes in 2 minutes while the other takes more than 3 minutes.
If the access times of a HashMap and an array are asymptotically equal, then why is there such a big difference in performance?
Like mentioned here:
"HashMap uses an array underneath so it can never be faster than using an array correctly."
You are right, the access time of both an array and a HashMap is O(1), but this only says that it is independent of the input size or the current size of your collection. It doesn't say anything about the actual work which has to be done for each action.
To access an entry of an array you have to calculate the memory address of your entry. This is easy as array's memory address + (index * size of entity).
To access an entry of a HashMap, you first have to hash the given key (which needs many cpu cycles), then access the entry of the HashMap's array using the hash which holds a list (depends on implementation details of the HashMap), and last you have to linear search the list for the correct entry (those lists are very short most of the time, so it is treated as O(1)).
So you see, informally it is more like O(10) for arrays and O(5000) for hash maps: both are constants, but very different ones. More precisely, the cost is T(array access) for arrays and T(hashing) + T(array access) + T(linear search) for HashMaps, where T(x) is the actual time taken by action x.
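To make the difference in per-access work concrete, here is a small sketch that stores each entry both ways. The class DpTable and its method names are made up for illustration, and a single flat long[] stands in for tb[m][n][p][q]; the string key follows the "m,n,p,q" scheme from the question.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical DP table that stores every entry twice so the two access paths can be compared.
class DpTable {
    private final int n, p, q;                               // sizes of dimensions 2..4
    private final long[] flat;                               // array-backed storage
    private final Map<String, Long> byKey = new HashMap<>(); // string-keyed storage

    DpTable(int m, int n, int p, int q) {
        this.n = n;
        this.p = p;
        this.q = q;
        this.flat = new long[m * n * p * q];
    }

    // Array path: a few multiplications and additions.
    private int offset(int i, int j, int k, int l) {
        return ((i * n + j) * p + k) * q + l;
    }

    // Map path: builds a new String, which the HashMap then hashes and compares.
    private static String key(int i, int j, int k, int l) {
        return i + "," + j + "," + k + "," + l;
    }

    void put(int i, int j, int k, int l, long value) {
        flat[offset(i, j, k, l)] = value;
        byKey.put(key(i, j, k, l), value); // also boxes the long into a Long
    }

    long getFromArray(int i, int j, int k, int l) {
        return flat[offset(i, j, k, l)];
    }

    long getFromMap(int i, int j, int k, int l) {
        Long v = byKey.get(key(i, j, k, l));
        return v == null ? 0L : v; // missing entries read as 0, like a fresh array
    }
}
```

Both getters return the same values, but the map path allocates and hashes a string on every call, which is the kind of constant-factor cost described above.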
I am a relatively new programmer and am working on my first project to build a portfolio. In my project I have 2 rather large lists of strings (about 3.1 million) and I need to "associate" the elements in each one with a 1 to 1 relationship from predetermined values (elements are selected according to a set method) not just linearly (from top to bottom). For example:
lista(0) = list1(5);
listb(0) = list2(2);
lista(1) = list1(1);
listb(1) = list2(4);
lista(2) = list1(3);
listb(2) = list2(1);
The point of this is to reorder the lists in a manner that can be recreated at a later time or by a different program by "remembering" a set of values. I am using 2 lists because I need to be able to search one list for a String then pull the value from the corresponding element in the other list.
I have tried many different methods like storing each list in an arrayList then accessing the elements in the preset order and storing them in new arrayLists in the new order, then removing the elements from the old arrayLists. This would be ideal but didn't work because removing elements from a really large arrayList was very slow. I figured that removing an element from the lists will prevent it from being used again.
I tried storing them in String arrays, accessing each element in the predefined order, storing them in another array, and then nulling out the elements so that they won't be used again. But creating null spaces made searching a nightmare: if the program hit a null element during the predefined "move" value, I had to add checks for nulls and then more movement, which made things more complicated and harder to reproduce later.
I need an easy, and efficient way to create these associations between these 2 lists and ANY ideas are welcome.
This is my first post to stackoverflow and I apologize if its formatted improperly or confusing, but please be gentle.
If you need to pull one value from a given string, why not use a map? The key is the value from the first list and the value is the value from the second list.
Use a Map<String,String>, which stores the key as a string and the value as a string. The best part is that the time complexity of removing an element would be O(1) on average.
As mentioned before, Map is an option. More specifically HashMap, or another option could be Hashtable. Make sure you look at what each has to offer. Some major differences: HashMap allows nulls but is not synchronized, whereas Hashtable is synchronized and does not accept null keys or values.
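A minimal sketch of the map suggestion, under the assumptions that both lists have the same length and that the strings used as keys are unique. PairLookup and associate are hypothetical names; LinkedHashMap is used here (rather than plain HashMap) only so that iteration follows the order in which the pairs were added.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper that pairs element i of one list with element i of the other.
class PairLookup {
    // Assumptions: keys and values have equal length, and keys contains no duplicates.
    static Map<String, String> associate(List<String> keys, List<String> values) {
        if (keys.size() != values.size()) {
            throw new IllegalArgumentException("lists differ in length");
        }
        // LinkedHashMap keeps insertion order, so the chosen ordering can be replayed.
        Map<String, String> map = new LinkedHashMap<>();
        for (int i = 0; i < keys.size(); i++) {
            map.put(keys.get(i), values.get(i));
        }
        return map;
    }
}
```

Looking up a string is then map.get(key), and map.remove(key) takes an entry out of play without shifting anything, which avoids the slow removals from a large ArrayList.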
More specifically, suppose I have an array with duplicates:
{3,2,3,4,2,2,1,4}
I want to have a data structure that supports search and remove the first occurrence of some value faster than O(n), say if the value is 4, then it becomes:
{3,2,3,2,2,1,4}
I also need to iterate the list from head according to the same order. Other operations like get(index) or insert are not needed.
You can use O(n) time to record the original data(say it's an int[]) in your data structure, I just need the later search and remove faster than O(n).
"Search and remove" is considered as ONE operation as shown above.
If I have to make it myself, I would use a LinkedList to store the data, and HashMap to map every key to a list of all occurrence of nodes together with their previous and next ones.
Is it a right approach? Are there any better choices already there in Java?
The data structure you describe, essentially a hybrid linked list and map, I think is the most efficient way of handling your stated problem. You'll have to keep track of the nodes yourself, since Java's LinkedList doesn't provide access to the actual nodes. The AbstractSequentialList may be helpful here.
The index structure you'll need is a map from an element value to the appearances of that element in the list. I recommend a hash table from hashCode % modulus to a linked list of (value, list of main-list nodes).
Note that this approach is still O(n) in the worst case, when you have universal hash collisions; this applies whether you use open or closed hashing. In the average case it should be something closer to O(ln(n)), but I'm not prepared to prove that.
Consider also whether the overhead of keeping track of all of this is really worth the gains. Unless you've actually profiled running code and determined that a LinkedList is causing problems because remove is O(n), stick with that until you do.
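For reference, the hybrid structure discussed above might look roughly like the sketch below. It is one possible arrangement, not a standard library class: OccurrenceList and its methods are hypothetical names, it uses its own doubly linked nodes (since java.util.LinkedList does not expose them), and it relies on java.util.HashMap rather than a hand-written hash table.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical structure: a doubly linked list plus a map from value to its nodes.
class OccurrenceList {
    private static final class Node {
        final int value;
        Node prev, next;
        Node(int value) { this.value = value; }
    }

    private Node head, tail;
    // value -> nodes holding it, in list order (front of the deque = first occurrence)
    private final Map<Integer, ArrayDeque<Node>> nodesByValue = new HashMap<>();

    // O(n) construction from the original data.
    OccurrenceList(int[] data) {
        for (int v : data) {
            Node node = new Node(v);
            if (tail == null) {
                head = node;
            } else {
                tail.next = node;
                node.prev = tail;
            }
            tail = node;
            nodesByValue.computeIfAbsent(v, k -> new ArrayDeque<>()).addLast(node);
        }
    }

    // Search and remove the first occurrence in one step; false if the value is absent.
    boolean removeFirst(int value) {
        ArrayDeque<Node> nodes = nodesByValue.get(value);
        if (nodes == null) return false;
        Node node = nodes.pollFirst();
        if (nodes.isEmpty()) nodesByValue.remove(value);
        // Unlink the node from the main list.
        if (node.prev == null) head = node.next; else node.prev.next = node.next;
        if (node.next == null) tail = node.prev; else node.next.prev = node.prev;
        return true;
    }

    // Iterate from the head in the original relative order.
    List<Integer> toList() {
        List<Integer> out = new ArrayList<>();
        for (Node n = head; n != null; n = n.next) out.add(n.value);
        return out;
    }
}
```

With the array from the question, removeFirst(4) leaves 3,2,3,2,2,1,4. Each removal costs one map lookup plus a constant amount of pointer work, so its running time depends on how well the hashing behaves rather than on the length of the list.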
Since your requirement is that the first occurrence of the element should be removed and the remaining occurrences retained, there would be no way to do it faster than O(n) as you would definitely have to move through to the end of the list to find out if there is another occurrence. There is no standard api from Oracle in the java package that does this.
I looked at the Android documentation for the "SparseBooleanArray" class, but I still don't understand what the purpose of that class is. What would we use it for?
Here is the Doc Link
http://developer.android.com/reference/android/util/SparseBooleanArray.html
From what I get from the documentation it is for mapping Integer values to booleans.
That is, if you want to map, if for a certain userID a widget should be shown and some userIDs have already been deleted, you would have gaps in your mapping.
Meaning, with a normal array, you would create an array of size=maxID and add a boolean value to element at index=userID. Then when iterating over the array, you would have to iterate over maxID elements in the worst case and have to check for null if there is no boolean for that index (eg. the ID does not exist). That is really inefficient.
When using a hashmap to do that you could map the ID to the boolean, but with the added overhead of generating the hashvalue for the key (that is why it is called *hash*map), which would ultimately hurt performance firstly in CPU cycles as well as RAM usage.
So that SparseBooleanArray seems like a good middleway of dealing with such a situation.
NOTE: Even though my example is really contrived, I hope it illustrates the situation.
As the javadoc says, a SparseBooleanArray maps integers to booleans, which basically means it is like a map with Integer keys and Boolean values (Map<Integer, Boolean>).
However, it is intended to be more efficient than using a HashMap to map Integers to Booleans.
Hope this clears out any issues you had with the description.
I found a very specific and wonderful use for the sparse boolean array.
You can put a true or false value to be associated with a position in a list.
For example: List item #7 was clicked, so putting 7 as the key and true as the value.
There can be three ways to store resource id's
1 Array
Boolean array with the ids as indexes. If an id has been used, set its entry to true, else false.
All operations are fast, but this implementation requires a huge amount of space, so it can't be used.
High Space Complexity
2 HashMap
Key-ID
Value-Boolean True/False
Using this, we need to process each id with the hashing function, which will consume memory. There may also be some empty locations where no id is stored, and we also need to deal with collisions. So, due to usage complexity and medium space complexity, it is not used.
Medium Space Complexity
3 SparseBooleanArray
It is a middle way. It uses mapping and an array implementation.
Key - ID
Value - Boolean True/False
It is backed by arrays that store the ids in increasing order, so minimum space is used, as it only contains the ids that are in use. Binary search is used to look up an id.
Binary search, O(log n), is slower than hashing, O(1), or array indexing, O(1), i.e. insertion, deletion and searching will all take more time, but there is the least memory wastage. So to save memory we prefer SparseBooleanArray.
Least Space Complexity
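To illustrate the third option, here is a plain Java sketch of the sorted-keys-plus-binary-search idea. It is not the Android source: SimpleSparseBooleanArray is a hypothetical class written for this example, and it only offers put, get and size.

```java
import java.util.Arrays;

// Hypothetical, simplified take on the idea behind android.util.SparseBooleanArray.
class SimpleSparseBooleanArray {
    private int[] keys = new int[8];          // ids, kept sorted in increasing order
    private boolean[] values = new boolean[8]; // values[i] belongs to keys[i]
    private int size = 0;

    // Binary search over the used part of keys; absent ids read as false.
    boolean get(int key) {
        int i = Arrays.binarySearch(keys, 0, size, key);
        return i >= 0 && values[i];
    }

    void put(int key, boolean value) {
        int i = Arrays.binarySearch(keys, 0, size, key);
        if (i >= 0) {           // id already stored: overwrite
            values[i] = value;
            return;
        }
        int at = -(i + 1);      // insertion point that keeps keys sorted
        if (size == keys.length) {
            keys = Arrays.copyOf(keys, size * 2);
            values = Arrays.copyOf(values, size * 2);
        }
        // Shift the tail one slot to the right to open a gap at 'at'.
        System.arraycopy(keys, at, keys, at + 1, size - at);
        System.arraycopy(values, at, values, at + 1, size - at);
        keys[at] = key;
        values[at] = value;
        size++;
    }

    int size() {
        return size;
    }
}
```

Only the ids actually stored take up space, and no key is boxed into an Integer. The price in this sketch is that a lookup is a binary search and inserting a new id may shift array elements.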
I often* find myself in need of a data structure which has the following properties:
can be initialized with an array of n objects in O(n).
one can obtain a random element in O(1), after this operation the picked element is removed from the structure (without replacement).
one can undo p 'picking without replacement' operations in O(p)
one can remove a specific object (eg by id) from the structure in O(log(n))
one can obtain an array of the objects currently in the structure in O(n).
the complexity (or even possibility) of other actions (eg insert) does not matter. Besides the complexity it should also be efficient for small numbers of n.
Can anyone give me guidelines on implementing such a structure? I currently implemented a structure having all above properties, except the picking of the element takes O(d) with d the number of past picks (since I explicitly check whether it is 'not yet picked'). I can figure out structures allowing picking in O(1), but these have higher complexities on at least one of the other operations.
BTW:
note that O(1) above implies that the complexity is independent from #earlier picked elements and independent from total #elements.
*in monte carlo algorithms (iterative picks of p random elements from a 'set' of n elements).
HashMap has complexity O(1) both for insertion and removal.
You specify a lot of operations, but all of them are nothing other than insertion, removal and traversal:
can be initialized with an array of n objects in O(n).
n * O(1) insertion. HashMap is fine
one can obtain a random element in O(1), after this operation the picked element is removed from the structure (without replacement).
This is the only operation that requires O(n).
one can undo p 'picking without replacement' operations in O(p)
it's an insertion operation: O(1).
one can remove a specific object (eg by id) from the structure in O(log(n))
O(1).
one can obtain an array of the objects currently in the structure in O(n).
you can traverse a HashMap in O(n)
EDIT:
example of picking up a random element in O(n):
Map<Object, Object> map = ...; // the populated map
int randomIndex = new Random().nextInt(map.size()); // 0 <= randomIndex < map.size()
Object[] values = map.values().toArray(); // copies every value: O(n)
Object picked = values[randomIndex];
Ok, same answer as 0verbose with a simple fix to get the O(1) random lookup. Create an array which stores the same n objects. Now, in the HashMap, store the pairs (object, index of the object in the array). For example, say your objects (strings for simplicity) are:
{"abc" , "def", "ghi"}
Create a list:
List<String> array = new ArrayList<>(Arrays.asList("abc", "def", "ghi"));
Create a HashMap map with the following values:
for (int i = 0; i < array.size(); i++)
{
    map.put(array.get(i), i); // object -> its index in the list
}
O(1) random lookup is easily achieved by picking any index in the array. The only complication that arises is when you delete an object. For that, do:
Find the object in the map and get its array index. Let's call this index i (i = map.get(object)) - O(1)
Swap array[i] with array[size of array - 1] (the last element in the array). Reduce the size of the array by 1 (since there is one less number now) - O(1)
Update the index of the new object in position i of the array in map (map.put(array[i], i)) - O(1)
I apologize for the mix of java and cpp notation, hope this helps
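Put together in plain Java, the array-plus-map steps above could look like the sketch below. The class name RandomRemovalSet is made up for this example, elements are assumed to be distinct, and the O(1) figures are the usual expected-time bounds for HashMap.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Hypothetical structure: a list for O(1) random access plus a map from element to index.
class RandomRemovalSet<T> {
    private final List<T> items = new ArrayList<>();
    private final Map<T, Integer> indexOf = new HashMap<>();
    private final Random random = new Random();

    // O(n) initialization. Assumption: the elements are distinct.
    RandomRemovalSet(List<T> initial) {
        for (T item : initial) {
            indexOf.put(item, items.size());
            items.add(item);
        }
    }

    // Remove a specific element by swapping the last element into its slot.
    boolean remove(T item) {
        Integer boxed = indexOf.remove(item);
        if (boxed == null) return false;
        int i = boxed;
        int last = items.size() - 1;
        if (i != last) {
            T moved = items.get(last);
            items.set(i, moved);
            indexOf.put(moved, i); // the moved element now lives at index i
        }
        items.remove(last);        // removing the last slot shifts nothing
        return true;
    }

    // Pick a uniformly random element and remove it; throws if the set is empty.
    T pickRandom() {
        T item = items.get(random.nextInt(items.size()));
        remove(item);
        return item;
    }

    boolean contains(T item) {
        return indexOf.containsKey(item);
    }

    int size() {
        return items.size();
    }
}
```

Because the element removed from the list is always the last one, nothing is shifted, so both the random pick and removal by value stay constant time on average.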
Here's my analysis of using Collections.shuffle() on an ArrayList:
✔ can be initialized with an array of n objects in O(n).
Yes, although the cost is amortized unless n is known in advance.
✔ one can obtain a random element in O(1), after this operation the picked element is removed from the structure, without replacement.
Yes, choose the last element in the shuffled array; replace the array with a subList() of the remaining elements.
✔ one can undo p 'picking without replacement' operations in O(p).
Yes, append the element to the end of this list via add().
❍ one can remove a specific object (eg by id) from the structure in O(log(n)).
No, it looks like O(n).
✔ one can obtain an array of the objects currently in the structure in O(n).
Yes, using toArray() looks reasonable.
How about an array (or ArrayList) that's divided into "picked" and "unpicked"? You keep track of where the boundary is, and to pick, you generate a random index below the boundary, then (since you don't care about order), swap the item at that index with the last unpicked item, and decrement the boundary. To unpick, you just increment the boundary.
Update: Forgot about O(log(n)) removal. Not that hard, though, just a little memory-expensive, if you keep a HashMap of IDs to indices.
If you poke around online you'll find various IndexedHashSet implementations that all work on more or less this principle -- an array or ArrayList plus a HashMap.
(I'd love to see a more elegant solution, though, if one exists.)
Update 2: Hmm... or does the actual removal become O(n) again, if you have to either recopy the arrays or shift them around?
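The picked/unpicked partition described above can be sketched like this. PickPool is a hypothetical name, the Random is passed in so that runs can be made repeatable, and removal of a specific object by id is left out; the sketch only covers picking, undoing and listing.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical pool: items[0..boundary) are unpicked, items[boundary..length) are picked.
class PickPool<T> {
    private final T[] items;
    private int boundary;
    private final Random random;

    // O(n): copies the input so the caller's array is left untouched.
    PickPool(T[] initial, Random random) {
        this.items = initial.clone();
        this.boundary = items.length;
        this.random = random;
    }

    // O(1): choose a random unpicked item and swap it just past the new boundary.
    T pick() {
        int i = random.nextInt(boundary); // throws if nothing is left to pick
        T chosen = items[i];
        boundary--;
        items[i] = items[boundary];
        items[boundary] = chosen;
        return chosen;
    }

    // Undo the last p picks by moving the boundary back; order is not restored.
    void undo(int p) {
        if (p < 0 || boundary + p > items.length) {
            throw new IllegalArgumentException("cannot undo " + p + " picks");
        }
        boundary += p;
    }

    // O(n): the items that have not been picked yet.
    T[] remaining() {
        return Arrays.copyOf(items, boundary);
    }

    int remainingCount() {
        return boundary;
    }
}
```

In this sketch undo(p) is a single addition, which is within the O(p) budget from the question. No arrays are recopied or shifted on a pick, since the picked item is swapped rather than deleted.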