Storing & lookup double array - java

I have a fairly expensive array calculation (SpectralResponse) which I would like to keep to a minimum. I figured the best way is to store the results and bring them back up when the same array is needed again in the future. The decision is made using BasicParameters.
So right now I use one LinkedList for the SpectralResponse arrays and another LinkedList for the BasicParameters. BasicParameters has an isParamsEqualTo(BasicParameters) method to compare parameter sets.
LinkedList<SpectralResponse> responses
LinkedList<BasicParameters> fitParams
LinkedList<Integer> responseNumbers
So to look up, I just go through the list of BasicParameters and check for a match; if one matches, I return the SpectralResponse. If there is no match, I calculate the SpectralResponse.
Here is the for loop I use for the lookup.
size: LinkedList size, limited to a reasonable value
responseNumber: just another variable to distinguish the SpectralResponse.
for ( i = size-1; i > 0 ; i--) {
    if (responseNumbers.get(i) == responseNum)
    {
        tempFit = fitParams.get(i);
        if (tempFit.isParamsEqualTo(fit))
        {
            return responses.get(i);
        }
    }
}
But somehow, doing it this way not only takes up lots of memory, it's actually slower than just calculating the SpectralResponse directly. Much slower.
So is it my implementation that's wrong, or was I mistaken that precalculating and looking up is faster?

You are accessing a LinkedList by index, this is the worst possible way to access it ;)
You should use ArrayList instead, or use iterators for all your lists.
Possibly you should merge the three objects into one, and keep them in a map with responseNum as key.
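A minimal sketch of that merge, under stated assumptions: ResponseKey is a hypothetical class standing in for BasicParameters plus responseNum (the real fields are not shown in the question), the response is modelled as a plain double[], and the expensive calculation is passed in as a function. The point is only that a key with proper equals() and hashCode() turns the lookup into a single hash probe.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical key: stands in for BasicParameters combined with responseNum.
final class ResponseKey {
    private final int responseNum;
    private final double[] params;

    ResponseKey(int responseNum, double... params) {
        this.responseNum = responseNum;
        this.params = params.clone(); // defensive copy keeps the key immutable
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof ResponseKey)) return false;
        ResponseKey other = (ResponseKey) o;
        return responseNum == other.responseNum && Arrays.equals(params, other.params);
    }

    @Override
    public int hashCode() {
        return 31 * responseNum + Arrays.hashCode(params);
    }
}

// Cache that computes a response only the first time a key is seen.
final class ResponseCache {
    private final Map<ResponseKey, double[]> cache = new HashMap<>();
    private final Function<ResponseKey, double[]> calculator;

    ResponseCache(Function<ResponseKey, double[]> calculator) {
        this.calculator = calculator;
    }

    double[] get(ResponseKey key) {
        // Runs the calculator only when the key is absent.
        return cache.computeIfAbsent(key, calculator);
    }
}
```

If the cache has to stay bounded, as the asker's size limit suggests, the same key class would work with a size-limited map instead of a plain HashMap.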
Hope this helps!

You probably should use an array-backed type (an actual array, or a Vector or ArrayList), not linked lists. Linked lists are best for stack or queue operations, not indexing (since you have to traverse the list from one end). Vector is an auto-resizing array, which has less overhead when accessing indexes.

The get(i) methods of LinkedList require that to fetch each item it has to go further and further along the list. Consider using an ArrayList, the iterator() method, or just an array.
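As a rough illustration of the iterator approach, here is a sketch that walks two parallel lists once instead of calling get(i) at every step. The class and method names are made up for the example, and it scans from the front, whereas the loop in the question scans from the back.

```java
import java.util.Iterator;
import java.util.List;

final class ParallelLookup {
    // Walks both lists in lockstep with iterators: one O(n) pass in total,
    // instead of an O(n) traversal for every get(i) on a LinkedList.
    static <V> V find(List<Integer> ids, List<V> values, int wantedId) {
        Iterator<Integer> idIt = ids.iterator();
        Iterator<V> valueIt = values.iterator();
        while (idIt.hasNext() && valueIt.hasNext()) {
            int id = idIt.next();
            V value = valueIt.next();
            if (id == wantedId) {
                return value;
            }
        }
        return null; // no match found
    }
}
```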

The second line, 'if (responseNumbers.get(i) == responseNum)', will also be inefficient, as responseNumbers.get(i) is an Integer and has to be unboxed to an int (Java 5 onwards does this automatically; your code would not compile on Java 1.4 or earlier if responseNum is declared as an int). See this for more information on boxing.
To remove this unboxing overhead, use an IntList from the apache primitives library. This library contains collections that store the underlying objects (ints in your case) as a primitive array (e.g. int[]) instead of an Object array. This means no boxing is required as the IntList's methods return primitive types, not Integers.

Related

Duplicate item's index in linkedHashSet

I am adding some values to a LinkedHashSet and based on add() method's output i.e. true/false, I am performing other operations.
If the Set contains duplicate element it returns false and in this case I want to know the index of the duplicate element in the Set as I need to use that index somewhere else. Being a 'linked' collection there must be some way to get the index, but I couldn't find any such thing in Set/LinkedHashSet API.
LinkedHashSet is not explicitly indexed per se. If you require an index, using a Set for such an application is usually a sign of wrong abstraction and/or lousy programming. LinkedHashSet only guarantees you a predictable iteration order, not proper indexing of elements. You should use a List in such cases, since that's the interface giving you an indexing guarantee. You can, however, infer the index using a couple of methods, for example (not recommended, mind you):
a) use indexed iteration through the collection (e.g. with for loop), seeking the duplicate and breaking when it's found; it's O(n) complexity for getting the index,
Object o; // this is the object you want to add to collection
if ( !linkedHashSet.add(o) ) {
    int index = 0;
    for( Object obj : linkedHashSet ) {
        if ( obj == o ) // or obj.equals(o), depending on your code's semantics
            return index;
        index++;
    }
}
b) use .toArray() and find the element in the array, e.g. by
Object o; // this is the object you want to add to collection
int index;
if ( !linkedHashSet.add(o) )
    index = Arrays.asList(linkedHashSet.toArray()).indexOf(o);
again, O(n) complexity of acquiring index.
Both would incur heavy runtime penalty (the second solution is obviously worse with respect to efficiency, as it creates an array every time you seek the index; creating a parallel array mirroring the set would be better there). All in all, I see a broken abstraction in your example. You say
I need to use that index somewhere else
... if that's really the case, using Set is 99% of the time wrong by itself.
You can, on the other hand, use a Map (HashMap for example), containing an [index,Object] (or [Object,index], depending on the exact use case) pairs in it. It'd require a bit of refactoring, but it's IMO a preferred way to do this. It'd give you the same order of complexity for most operations as LinkedHashSet, but you'd get O(1) for getting index essentially for free (Java's HashSet uses HashMap internally anyway, so you're not losing any memory by replacing HashSet with HashMap).
Even better way would be to use a class explicitly handling integer maps - see HashMap and int as key for more information; tl;dr - http://trove.starlight-systems.com/ has TIntObjectHashMap & TObjectIntHashMap , giving you possibly the best speed for such operations possible.
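The [Object,index] map idea can be sketched with a plain HashMap, assuming elements are only ever added and never removed, so that the map size at insertion time equals the insertion index. IndexedSet and addOrIndex are names invented for this example.

```java
import java.util.HashMap;
import java.util.Map;

final class IndexedSet<E> {
    // Maps each element to the position at which it was first added.
    private final Map<E, Integer> indexOf = new HashMap<>();

    // Returns -1 if the element was newly added, otherwise the insertion
    // index of the equal element that is already present.
    int addOrIndex(E element) {
        Integer existing = indexOf.putIfAbsent(element, indexOf.size());
        return existing == null ? -1 : existing;
    }
}
```

Each call is a single hash lookup on average, so finding the duplicate's index no longer needs the O(n) scan from options a) and b).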

Performance primitive Array vs ArrayList

I want to know if there is a difference in performance if I use a primitive array and then rebuild it to add new elements like this:
AnyClass[] elements = new AnyClass[0];
public void addElement(AnyClass e) {
    AnyClass[] temp = new AnyClass[elements.length + 1];
    for (int i = 0; i < elements.length; i++) {
        temp[i] = elements[i];
    }
    temp[elements.length] = e;
    elements = temp;
}
or if I just use an ArrayList and add the elements.
I am not certain, which is why I ask: is it the same speed because an ArrayList is built the same way as my primitive array, or is there really a difference, with a primitive array always being faster even if I rebuild it every time I add an element?
ArrayLists work in a similar way, but instead of rebuilding every time, they grow their capacity by a constant factor (roughly 1.5x in OpenJDK) whenever the limit is reached. So if you are constantly adding to it, ArrayList will be faster, because recreating the array is fairly slow.
So your implementation could use less memory if you are not adding to it often, but as far as speed goes it will be slower most of the time.
In a nutshell, stick with ArrayList. It is:
widely understood;
well tested;
will probably be more performant than your own implementation (for example, ArrayList.add() is guaranteed to be amortised constant-time, which your method is not).
When an ArrayList resizes, it grows by a constant factor rather than by one slot, so you are not wasting time resizing on every add. Amortized, that means the cost of resizing works out to constant time per add. That's why you shouldn't waste time reinventing the wheel. The people who created the first one already learned how to make one more efficient and know more about the platform than you do.
There is no performance problem with either arrays or ArrayList.
Arrays and ArrayList are both index-based, so element access works the same way in both.
If you require a dynamic array, you can use ArrayList.
If the array size is static, then go with an array.
Your implementation is likely to lose clearly to Java's ArrayList in terms of speed. One particularly expensive thing you're doing is reallocating the array every time you want to add elements, while Java's ArrayList tries to optimize by having some "buffer" before having to reallocate.
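To make the difference concrete, the sketch below only counts element copies; it does not allocate anything. It assumes a growth policy of starting at capacity 10 and growing by about 1.5x when full, which mirrors what OpenJDK's ArrayList does but is not guaranteed by the specification. GrowthCost and its methods are illustrative names.

```java
final class GrowthCost {
    // Copies performed when the array grows by exactly one slot per add,
    // as in the question: adding element number 'size' copies 'size' old elements.
    static long copiesGrowByOne(int n) {
        long copies = 0;
        for (int size = 0; size < n; size++) {
            copies += size;
        }
        return copies; // equals n * (n - 1) / 2
    }

    // Copies performed when capacity grows by about 1.5x only when full.
    // Initial capacity 10 and the 1.5x factor are assumptions for this sketch.
    static long copiesGrowGeometric(int n) {
        long copies = 0;
        int capacity = 10;
        for (int size = 0; size < n; size++) {
            if (size == capacity) {
                copies += size;            // copy everything into the larger array
                capacity += capacity >> 1; // grow by half the current capacity
            }
        }
        return copies;
    }
}
```

For 1,000 adds, the grow-by-one strategy performs 499,500 element copies, while the geometric strategy stays in the low thousands, which is the quadratic versus amortised-constant gap the answers describe.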
ArrayList also uses an array internally, so it is true that a raw array will be faster than ArrayList. When writing high-performance code, prefer arrays. For the same reason, the array is the backbone of most of the collections.
You should read through the JDK implementation of the collections.
We use ArrayList when we are developing an application and are not concerned about such minor performance issues; we accept the trade-off because we get an already written API to put, get, resize, etc.
Context is very important: if you are constantly inserting new elements, ArrayList will certainly be faster than rebuilding an array. On the other hand, if you just want to access an element at a known position, say arrayItems[8], an array is faster than ArrayList.get(8), since there is the overhead of the get() call and its bounds checks.

Large array of 'int' type needs to be passed to a generic array & collections

I am generating large arrays (size > 1000) with elements of int type from a function. I need to pass such an array to a generic type array, but since a generic type array doesn't accept arrays of primitive type, I am unable to do so.
I am wary of using an Integer array, since it will be costly in terms of creation, performance, and space used (an array of 12-byte objects) for large arrays. Moreover, it will create immutable Integers when I need to perform some addition operations on the array elements.
What would be the best way to go with ?
EDIT Just to remove some confusions around, I need to pass int[] to a method of signature type: void setKeys(K... keys).
I want to pass an int[] to this function: public Query<K> setKeys(K... keys);
I assume that you mean that int[] should be the set of keys ... not just one key.
That is impossible. The type parameters of a generic type have to be reference types. Your use-case requires K to be an int.
You have two choices:
use Integer (or a mutable int holder class) and pay the performance penalty, or
forgo the use of generics and change the signature of that method.
Incidentally, the Integer class keeps a cache of Integer objects for small int values. If you create your objects using Integer.valueOf(int) there's a good chance that you will get a reference to a pre-existing object. (Of course, this only works because Integer objects are immutable.)
If your arrays are on the order of 1000 (or even 10,000 or 100,000) elements, the cost difference in terms of memory and performance probably wouldn't be noticeable unless you're processing the arrays thousands of times each. Write the code with Integer and optimize later if you have performance problems.
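A short sketch of the "just use Integer" route, assuming Java 8 or later for the stream call. countKeys is a hypothetical stand-in for a generic varargs method shaped like setKeys(K... keys); it is not the asker's API.

```java
import java.util.Arrays;

final class BoxingExample {
    // Hypothetical stand-in for a generic varargs API such as setKeys(K... keys).
    @SafeVarargs
    static <K> int countKeys(K... keys) {
        return keys.length;
    }

    // Boxes every element once, producing an Integer[] that a generic method accepts.
    static Integer[] box(int[] values) {
        return Arrays.stream(values).boxed().toArray(Integer[]::new);
    }
}
```

Passing the boxed array gives the method one key per element. Passing the int[] directly still compiles, but K is inferred as int[] and the method sees a single key, which is the mismatch the asker ran into.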
If you're that concerned about performance, you could write a simple class that wraps a public int, thus meaning you can make your call and still mutate it as needed. Having said that, I do agree that you want to make absolute sure you need this performance improvement before doing it.
If you actually do need to worry about the performance implications of boxing/unboxing integers, you could consider GNU Trove, specifically their TIntArrayList. It lets you mimic the functionality of an ArrayList<Integer> while being backed by primitives. That said, I'm not certain you need this, and I'm not certain this is exactly what you are looking for.
If you don't want the integers permanently boxed, you could pass in the result of Ints.asList() from the Google Collections library (http://guava-libraries.googlecode.com/svn/tags/release08/javadoc/com/google/common/primitives/Ints.html#asList(int...)), which would be a List<Integer> backed by the array. The values will get boxed as they're accessed, so this only makes sense if the values are not being accessed lots of times.

Is having a List and an array with the exact same element bad programming style?

I have a short (12 elements) LinkedList of short strings (7 characters each).
I need to search through this list both by index and by content (i.e. search a particular string and get its index in the list).
I thought about making a copy of the LinkedList as an array at runtime (just once, since the LinkedList is a static member of my class), so I can access the strings by index more quickly.
Given that the LinkedList is never changed at runtime, is this bad programming practice or is this an idea worth considering?
IMPORTANT EDIT: the array can't be sorted, I need it to map specific strings to specific numbers.
Instead of a LinkedList just use an ArrayList - you can look up fast based on an index, and you can easily search through it.
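For illustration, a minimal sketch of both lookups on an ArrayList. The three strings are placeholders, since the asker's actual 12 values are not shown, and the class name Codes is invented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class Codes {
    // Placeholder data; unmodifiable because the list never changes at runtime.
    static final List<String> CODES = Collections.unmodifiableList(
            new ArrayList<>(Arrays.asList("ALPHA01", "BRAVO02", "CHARL03")));

    // Lookup by index: constant time on an array-backed list.
    static String byIndex(int i) {
        return CODES.get(i);
    }

    // Lookup by content: linear scan, returns -1 when the string is absent.
    static int indexOf(String s) {
        return CODES.indexOf(s);
    }
}
```

A single list covers both directions, so no second copy of the data is needed.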
What problem are you trying to solve here? Are you worried that accessing elements by index is too slow in LinkedList? If so, you might want to use ArrayList instead.
But for a 12-element list, the improvement probably won't make any measurable difference. Unless this is something you're accessing several hundred times a second, I wouldn't waste any time on trying to optimize it.
Another idea you might want to consider is using a Map:
Map<Integer, String> someMap
It's easy to search for values in a map by both key and value.
Might also not be the best idea, but at least better than creating 2 lists with the same values =)
The question is, why are you using a LinkedList in the first place?
The main reason to choose a LinkedList over an array list is if you need to make a number of insertions/deletions in the middle of the List or if you don't know the exact size of the list and don't want to make a number of reallocations of the Array.
The main reason to choose an ArrayList over a LinkedList is if you need to have random access to each of the elements.
(There are other advantages/disadvantages to each, but those are probably the main ones that come to mind)
It looks like you do need random access to the list, so why did you pick a LinkedList over an ArrayList?
I would say it depends on your intention and the effect it really has.
With only 12 elements it seems unlikely to me that converting the LinkedList to an array has an impact on performance. So it could make the code unnecessarily (slightly) harder to understand for other people. From this point of view it could be considered a non optimal programming style.
If the number of elements increases, e.g. because you need to pre-process some data, which requires a dynamic data structure, and an indexed lookup performs much better for later use, this wouldn't be bad programming style but rather a required improvement.
Given that you know the exact amount of elements you are going to be using why not use an array from the start?
String[] myArray = new String[12]; // 12 elements, as stated in the question
// Add your data
Arrays.sort(myArray); // Sort your strings (java.util.Arrays)
int value = Arrays.binarySearch(myArray, "key"); // Search your array
Or, since you can't sort the array, you could just write a linear search method
public static int search(String[] array, String key)
{
    for (int i = 0; i < array.length; i++)
    {
        if (array[i].equals(key)) // compare contents, not references
            return i;
    }
    return -1;
}
Edit: After re-loading the page and reading peoples responses I agree that ArrayList should be exactly what you need.

Best way to remove repeats in a collection in Java?

This is a two-part question:
First, I am interested to know what the best way to remove repeating elements from a collection is. The way I have been doing it up until now is to simply convert the collection into a set. I know sets cannot have repeating elements so it just handles it for me.
Is this an efficient solution? Would it be better/more idiomatic/faster to loop and remove repeats? Does it matter?
My second (related) question is: What is the best way to convert an array to a Set? Assuming an array arr, the way I have been doing it is the following:
Set x = new HashSet(Arrays.asList(arr));
This converts the array into a list, and then into a set. Seems to be kinda roundabout. Is there a better/more idiomatic/more efficient way to do this than the double conversion way?
Thanks!
Do you have any information about the collection, like say it is already sorted, or it contains mostly duplicates or mostly unique items? With just an arbitrary collection I think converting it to a Set is fine.
Arrays.asList() doesn't create a brand new list. It actually just returns a List which uses the array as its backing store, so it's a cheap operation. So your way of making a Set from an array is how I'd do it, too.
Use HashSet's standard Collection conversion constructor. According to The Java Tutorials:
Here's a simple but useful Set idiom. Suppose you have a Collection, c, and you want to create another Collection containing the same elements but with all duplicates eliminated. The following one-liner does the trick.
Collection<Type> noDups = new HashSet<Type>(c);
It works by creating a Set (which, by definition, cannot contain a duplicate), initially containing all the elements in c. It uses the standard conversion constructor described in the The Collection Interface section.
Here is a minor variant of this idiom that preserves the order of the original collection while removing duplicate element.
Collection<Type> noDups = new LinkedHashSet<Type>(c);
The following is a generic method that encapsulates the preceding idiom, returning a Set of the same generic type as the one passed.
public static <E> Set<E> removeDups(Collection<E> c) {
    return new LinkedHashSet<E>(c);
}
Assuming you really want set semantics, creating a new Set from the duplicate-containing collection is a great approach. It's very clear what the intent is, it's more compact than doing the loop yourself, and it leaves the source collection intact.
For creating a Set from an array, creating an intermediate List is a common approach. The wrapper returned by Arrays.asList() is lightweight and efficient. There's not a more direct API in core Java to do this, unfortunately.
I think your approach of putting items into a set to produce the collection of unique items is the best one. It's clear, efficient, and correct.
If you're uncomfortable using Arrays.asList() on the way into the set, you could simply run a foreach loop over the array to add items to the set, but I don't see any harm (for non-primitive arrays) in your approach. Arrays.asList() returns a list that is "backed by" the source array, so it doesn't have significant cost in time or space.
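The loop alternative mentioned above can also be written with Collections.addAll, which adds each array element to the set without creating a List view first. This is only a sketch: the class name Dedup is invented, and LinkedHashSet is chosen here so that first-seen order is kept.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

final class Dedup {
    // Copies the array into a set, dropping repeats and keeping first-seen order.
    static <T> Set<T> toSet(T[] array) {
        Set<T> set = new LinkedHashSet<>();
        Collections.addAll(set, array);
        return set;
    }
}
```

As with Arrays.asList(), this only works for object arrays; an int[] would need boxing first.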
1. Duplicates
Concurring with other answers: using a Set should be the most efficient way to remove duplicates. HashSet should run in O(n) time on average. Looping and removing repeats would run in the order of O(n^2). So using a Set is recommended in most cases. There are some cases (e.g. limited memory) where iterating might make sense.
2. Converting an array to a Set
Arrays.asList() is a cheap operation that doesn't copy the array, with minimal memory overhead. Alternatively, you can manually add elements by iterating through the array.
public static <T> Set<T> arrayToSet(T[] array) {
    Set<T> set = new HashSet<T>(array.length / 2);
    for (T item : array)
        set.add(item);
    return set;
}
Barring any specific performance bottlenecks that you know of (say a collection of tens of thousands of items) converting to a set is a perfectly reasonable solution and should be (IMO) the first way you solve this problem, and only look for something fancier if there is a specific problem to solve.
