Efficiently removing an item from Java LinkedList - java

Here is the pseudo-code I am using to remove an item from the linked-list:
public static void removeByID(LinkedList<Fruit> fruits, int fruitID) {
for(Fruit f : fruits) {
if (f.ID == fruitID) {
fruits.remove(f);
return;
}
}
}
I am thinking this is not very efficient as fruits.remove() will once again iterate over the list. Wondering if there is a better way to achieve this.

For a java.util.LinkedList, use the Iterator.
Iterator<Fruit> it = fruits.iterator();
while(it.hasNext()) {
if(it.next().ID == fruitID) {
it.remove();
break;
}
}
This will result in only a single traversal. The Iterator returned has access to the underlying link structure and can perform a removal without iterating.
The Iterator is implicitly used anyway when you use the for-each loop form. You'd just be retaining the reference to it so you can make use of its functionality.
You may also use listIterator for O(1) insertions at the iterator's current position.
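As a rough sketch of the listIterator point: once the traversal has reached the right spot, ListIterator.add inserts at the cursor without another walk over the list. The insertAfterFirst helper and the use of plain strings are illustrative assumptions, not part of the original question.

```java
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;
import java.util.ListIterator;

class ListIteratorInsertSketch {
    // Hypothetical helper: inserts newItem right after the first element
    // equal to target. Returns true if target was found.
    static <T> boolean insertAfterFirst(List<T> list, T target, T newItem) {
        ListIterator<T> it = list.listIterator();
        while (it.hasNext()) {
            if (target.equals(it.next())) {
                // add() inserts just after the element returned by next();
                // on a LinkedList this is a constant-time relinking
                it.add(newItem);
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        LinkedList<String> fruits = new LinkedList<>(Arrays.asList("apple", "cherry"));
        insertAfterFirst(fruits, "apple", "banana");
        System.out.println(fruits); // [apple, banana, cherry]
    }
}
```

Finding the position is still O(n); only the insertion itself is constant time.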

Nope, not in terms of asymptotic complexity. That's the price you pay for using a LinkedList: removals require a traversal over the list. If you want something more efficient, you need to use a different data structure.
You're in fact doing two traversals here if you've got a singly linked list: the .remove() call needs to find the parent of the given node, which it can't do without another traversal.

If you need to access elements from a collection that have a unique attribute, it is better to use a HashMap instead, with that attribute as key.
Map<Integer, Fruit> fruits = new HashMap<Integer, Fruit>();
// ...
Fruit f = fruits.remove(fruitID);

As stated above, linked lists generally need a traversal for any operation. Avoiding multiple traversals can probably be done with an iterator. However, if you are able to relate each fruit to its fruit.ID ahead of time, you may be able to speed up your operations, because you can avoid the slow iterative lookup. This will still require a different data structure, namely a Map (probably a HashMap).
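A minimal sketch of that idea, assuming a hypothetical Fruit class with a public ID field like the one in the question. LinkedHashMap is my own substitution here: it behaves like HashMap for lookups and removals but also remembers insertion order, which may or may not matter in your case.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class FruitIndexSketch {
    // Hypothetical stand-in for the Fruit class from the question.
    static class Fruit {
        final int ID;
        final String name;
        Fruit(int id, String name) { this.ID = id; this.name = name; }
    }

    // Keyed by fruit ID; LinkedHashMap keeps insertion order for iteration.
    private final Map<Integer, Fruit> byId = new LinkedHashMap<>();

    void add(Fruit f) { byId.put(f.ID, f); }

    // Expected constant time: no traversal of the other entries is needed.
    // Returns the removed fruit, or null if the ID was not present.
    Fruit removeByID(int fruitID) { return byId.remove(fruitID); }

    int size() { return byId.size(); }
}
```

The trade-off is that you give up positional access (get(index)) in exchange for fast access by key.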

Regarding your post, using a HashMap appears to be a good solution.
In addition, if you also need to search for a fruit by its fruitID, a HashMap makes the lookup time effectively constant.
Regarding complexity, you can find additional information in the Simple Notions article, depending on the data structure that you use.

Related

Sorted Lists in Java [duplicate]

In Java there are the SortedSet and SortedMap interfaces. Both belong to the Java Collections framework and provide a sorted way to access the elements.
However, in my understanding there is no SortedList in Java. You can use java.util.Collections.sort() to sort a list.
Any idea why it is designed like that?
List iterators guarantee first and foremost that you get the list's elements in the internal order of the list (aka. insertion order). More specifically it is in the order you've inserted the elements or on how you've manipulated the list. Sorting can be seen as a manipulation of the data structure, and there are several ways to sort the list.
I'll order the ways in the order of usefulness as I personally see it:
1. Consider using Set or Bag collections instead
NOTE: I put this option at the top because this is what you normally want to do anyway.
A sorted set automatically sorts the collection at insertion, meaning that it does the sorting while you add elements into the collection. It also means you don't need to manually sort it.
Furthermore if you are sure that you don't need to worry about (or have) duplicate elements then you can use the TreeSet<T> instead. It implements SortedSet and NavigableSet interfaces and works as you'd probably expect from a list:
TreeSet<String> set = new TreeSet<String>();
set.add("lol");
set.add("cat");
// automatically sorts natural order when adding
for (String s : set) {
System.out.println(s);
}
// Prints out "cat" and "lol"
If you don't want the natural ordering you can use the constructor parameter that takes a Comparator<T>.
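For instance, here is a small sketch using the JDK's String.CASE_INSENSITIVE_ORDER as the comparator (my choice, purely for illustration; the helper name is made up). Note that the set also uses the comparator to decide whether two elements are duplicates, so "Cat" and "cat" count as the same element here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

class TreeSetComparatorSketch {
    // Hypothetical helper: returns the input sorted case-insensitively,
    // with case-insensitive duplicates dropped.
    static List<String> sortedIgnoringCase(List<String> input) {
        // The comparator defines both the ordering and what counts as a duplicate
        TreeSet<String> set = new TreeSet<>(String.CASE_INSENSITIVE_ORDER);
        set.addAll(input);
        return new ArrayList<>(set);
    }

    public static void main(String[] args) {
        System.out.println(sortedIgnoringCase(Arrays.asList("lol", "Cat", "cat")));
        // [Cat, lol]  ("cat" compares equal to "Cat" and is not added)
    }
}
```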
Alternatively, you can use a Multiset (also known as a Bag), which is a Set that allows duplicate elements. The JDK does not provide one, but there are third-party implementations. Most notably, the Guava libraries include TreeMultiset, which works a lot like TreeSet.
2. Sort your list with Collections.sort()
As mentioned above, sorting of Lists is a manipulation of the data structure. So for situations where you need "one source of truth" that will be sorted in a variety of ways then sorting it manually is the way to go.
You can sort your list with the java.util.Collections.sort() method. Here is a code sample on how:
List<String> strings = new ArrayList<String>();
strings.add("lol");
strings.add("cat");
Collections.sort(strings);
for (String s : strings) {
System.out.println(s);
}
// Prints out "cat" and "lol"
Using comparators
One clear benefit is that you may use Comparator in the sort method. Java also provides some implementations for the Comparator such as the Collator which is useful for locale sensitive sorting strings. Here is one example:
Collator usCollator = Collator.getInstance(Locale.US);
usCollator.setStrength(Collator.PRIMARY); // ignores casing
Collections.sort(strings, usCollator);
Sorting in concurrent environments
Do note though that using the sort method is not friendly in concurrent environments, since the collection instance will be manipulated, and you should consider using immutable collections instead. This is something Guava provides in the Ordering class and is a simple one-liner:
List<String> sorted = Ordering.natural().sortedCopy(strings);
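If Guava is not available, a similar effect can be had with the JDK alone by copying first and sorting the copy. This is only a sketch: the sortedCopy helper name is made up here, and taking the copy still has to happen while no other thread is modifying the source list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class SortedCopySketch {
    // Returns a new, read-only sorted list and leaves the input untouched.
    static <T extends Comparable<? super T>> List<T> sortedCopy(List<T> source) {
        List<T> copy = new ArrayList<>(source);   // snapshot of the source
        Collections.sort(copy);                   // only the copy is reordered
        return Collections.unmodifiableList(copy);
    }

    public static void main(String[] args) {
        List<String> strings = Arrays.asList("lol", "cat");
        System.out.println(sortedCopy(strings)); // [cat, lol]
        System.out.println(strings);             // [lol, cat]
    }
}
```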
3. Wrap your list with java.util.PriorityQueue
Though there is no sorted list in Java there is however a sorted queue which would probably work just as well for you. It is the java.util.PriorityQueue class.
Nico Haase linked in the comments to a related question that also answers this.
In a sorted collection you most likely don't want to manipulate the internal data structure which is why PriorityQueue doesn't implement the List interface (because that would give you direct access to its elements).
Caveat on the PriorityQueue iterator
The PriorityQueue class implements the Iterable<E> and Collection<E> interfaces so it can be iterated as usual. However, the iterator is not guaranteed to return elements in the sorted order. Instead (as Alderath points out in the comments) you need to poll() the queue until empty.
Note that you can convert a list to a priority queue via the constructor that takes any collection:
List<String> strings = new ArrayList<String>();
strings.add("lol");
strings.add("cat");
PriorityQueue<String> sortedStrings = new PriorityQueue<String>(strings);
while(!sortedStrings.isEmpty()) {
System.out.println(sortedStrings.poll());
}
// Prints out "cat" and "lol"
4. Write your own SortedList class
NOTE: You shouldn't have to do this.
You can write your own List class that sorts each time you add a new element. This can get rather computation heavy depending on your implementation and is pointless, unless you want to do it as an exercise, because of two main reasons:
It breaks the contract that List<E> interface has because the add methods should ensure that the element will reside in the index that the user specifies.
Why reinvent the wheel? You should be using the TreeSet or Multisets instead as pointed out in the first point above.
However, if you want to do it as an exercise here is a code sample to get you started, it uses the AbstractList abstract class:
public class SortedList<E> extends AbstractList<E> {
    private ArrayList<E> internalList = new ArrayList<E>();
    // Note that add(E e) in AbstractList is calling this one
    @Override
    public void add(int position, E e) {
        // The requested position is ignored; the list is re-sorted after every insert
        internalList.add(e);
        // A null comparator means natural ordering, so elements must be Comparable
        Collections.sort(internalList, null);
    }
    @Override
    public E get(int i) {
        return internalList.get(i);
    }
    @Override
    public int size() {
        return internalList.size();
    }
}
Note that if you haven't overridden the methods you need, then the default implementations from AbstractList will throw UnsupportedOperationExceptions.
Because the concept of a List is incompatible with the concept of an automatically sorted collection. The point of a List is that after calling list.add(7, elem), a call to list.get(7) will return elem. With an auto-sorted list, the element could end up in an arbitrary position.
Since all lists are already "sorted" by the order the items were added (FIFO ordering), you can "resort" them with another order, including the natural ordering of elements, using java.util.Collections.sort().
EDIT:
Lists as data structures are built around the order in which the items were inserted.
Sets do not have that information.
If you want to order by adding time, use List. If you want to order by other criteria, use SortedSet.
Set and Map are non-linear data structures. List is a linear data structure.
TreeSet and TreeMap implement the SortedSet and SortedMap interfaces respectively, using a red-black tree. They also ensure that there are no duplicate items (or keys, in the case of a Map).
List already maintains an ordered, index-based collection; trees are not index-based data structures.
These tree-based collections cannot contain duplicates.
In a List we can have duplicates, so there is no TreeList (i.e. no SortedList).
List maintains elements in insertion order. So if we want to sort the list we have to use java.util.Collections.sort(). It sorts the specified list into ascending order, according to the natural ordering of its elements.
JavaFX SortedList
Though it took a while, Java 8 does have a sorted List.
http://docs.oracle.com/javase/8/javafx/api/javafx/collections/transformation/SortedList.html
As you can see in the javadocs, it is part of the JavaFX collections, intended to provide a sorted view on an ObservableList.
Update: Note that with Java 11, the JavaFX toolkit has moved outside the JDK and is now a separate library. JavaFX 11 is available as a downloadable SDK or from MavenCentral. See https://openjfx.io
For any newcomers, as of April 2015, Android now has a SortedList class in the support library, designed specifically to work with RecyclerView. Here's the blog post about it.
Another point is the time complexity of insert operations.
For a list insert, one expects a complexity of O(1).
But this could not be guaranteed with a sorted list.
And the most important point is that lists assume nothing about their elements.
For example, you can make lists of things that do not implement equals or compare.
Think of it like this: the List interface has methods like add(int index, E element), set(int index, E element). The contract is that once you added an element at position X you will find it there unless you add or remove elements before it.
If any list implementation would store elements in some order other than based on the index, the above list methods would make no sense.
In case you are looking for a way to sort elements, but also be able to access them by index in an efficient way, you can do the following:
Use a random access list for storage (e.g. ArrayList)
Make sure it is always sorted
Then to add or remove an element you can use Collections.binarySearch to get the insertion / removal index. Since your list implements random access, you can efficiently modify the list with the determined index.
Example:
import java.util.ArrayList;
import java.util.Collections;

/**
 * @deprecated
 * Only for demonstration purposes. Implementation is incomplete and does not
 * handle invalid arguments.
 */
@Deprecated
public class SortingList<E extends Comparable<E>> {
    private ArrayList<E> delegate;
    public SortingList() {
        delegate = new ArrayList<>();
    }
    public void add(E e) {
        int insertionIndex = Collections.binarySearch(delegate, e);
        // < 0 if element is not in the list, see Collections.binarySearch
        if (insertionIndex < 0) {
            insertionIndex = -(insertionIndex + 1);
        }
        else {
            // Insertion index is index of existing element, to add new element
            // behind it increase index
            insertionIndex++;
        }
        delegate.add(insertionIndex, e);
    }
    public void remove(E e) {
        int index = Collections.binarySearch(delegate, e);
        // binarySearch returns a negative value if the element is not present
        if (index >= 0) {
            delegate.remove(index);
        }
    }
    public E get(int index) {
        return delegate.get(index);
    }
}
(See a more complete implementation in this answer)
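The same binary-search idea also works with an explicit Comparator instead of Comparable elements. The following is a sketch under that assumption; the insertSorted helper is a made-up name, and the list passed in must already be sorted according to the same comparator.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

class SortedInsertSketch {
    // Inserts e into an already sorted list, keeping it sorted.
    // Elements that compare as equal are kept (duplicates are allowed).
    static <T> void insertSorted(List<T> sorted, T e, Comparator<? super T> cmp) {
        int idx = Collections.binarySearch(sorted, e, cmp);
        // A negative result encodes the insertion point as -(insertionPoint) - 1
        if (idx < 0) {
            idx = -(idx + 1);
        }
        // The search is O(log n); the shift inside an ArrayList is still O(n)
        sorted.add(idx, e);
    }

    public static void main(String[] args) {
        List<Integer> list = new ArrayList<>();
        for (int x : new int[] {5, 1, 3, 3}) {
            insertSorted(list, x, Comparator.<Integer>naturalOrder());
        }
        System.out.println(list); // [1, 3, 3, 5]
    }
}
```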
The first line of the List API says it is an ordered collection (also known as a sequence). If you sort the list you can't maintain that order, so there is no TreeList in Java.
As the API says, Java's List was inspired by the sequence; see the sequence properties at http://en.wikipedia.org/wiki/Sequence_(mathematics)
This doesn't mean that you can't sort a list, but Java sticks to its definition and doesn't provide sorted versions of lists by default.
I think none of the above answers this question, for the following reasons:
The same functionality can be achieved with other collections such as TreeSet, Collections, PriorityQueue, etc. (But these are alternatives that impose their own constraints, e.g. a Set will remove duplicate elements. And even where they impose no constraint, this does not answer why a SortedList was not created by the Java community.)
List elements do not implement compare/equals methods. (This holds true for Set & Map as well: in general their items do not implement the Comparable interface, but when we need the items in sorted order and want to use TreeSet/TreeMap, the items should implement Comparable.)
List uses indexing, and with sorting that won't work. (This could easily be handled by introducing an intermediate interface/abstract class.)
None of these gives the exact reason behind the design. I believe this kind of question is best answered by the Java community itself, since there will be only one specific answer, but let me try my best to answer it as follows.
As we know, sorting is an expensive operation, and there is a basic difference between List and Set/Map: a List can have duplicates, but a Set/Map cannot.
This is the core reason why we have a default implementation for Set/Map in the form of TreeSet/TreeMap. Internally this is a red-black tree in which every operation (insert/delete/search) has a complexity of O(log N), whereas a List, because of its duplicates, could not fit into this data storage structure.
Now the question arises: we could also choose a default sorting method for List, such as the merge sort used by the Collections.sort(list) method, with a complexity of O(N log N). The community deliberately did not do this, since there are multiple choices of sorting algorithm for non-distinct elements, like QuickSort, ShellSort, RadixSort, etc., and in future there may be more. Also, the same sorting algorithm sometimes performs differently depending on the data to be sorted. Therefore they wanted to keep this option open and left it to us to choose. This was not the case with Set/Map, since O(log N) per operation is already the best available for keeping them sorted.
https://github.com/geniot/indexed-tree-map
Consider using indexed-tree-map. It is an enhanced version of the JDK's TreeSet that provides access to an element by index, and finds the index of an element, without iteration or hidden underlying lists that back up the tree. The algorithm is based on updating the weights of changed nodes every time there is a change.
We have the Collections.sort(arr) method, which can sort an ArrayList arr. To sort in descending order we can use Collections.sort(arr, Collections.reverseOrder()).
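A short sketch of the descending sort, assuming a list of Integer values (the sortedDescending helper name is made up for the example):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class ReverseSortSketch {
    // Returns a copy of the input sorted from largest to smallest.
    static List<Integer> sortedDescending(List<Integer> input) {
        List<Integer> arr = new ArrayList<>(input);
        // reverseOrder() is the reverse of the elements' natural ordering
        Collections.sort(arr, Collections.reverseOrder());
        return arr;
    }

    public static void main(String[] args) {
        System.out.println(sortedDescending(Arrays.asList(2, 3, 1))); // [3, 2, 1]
    }
}
```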

SortedList that maintains order like SortedSet but also permits duplicate elements [duplicate]

Remove list elements - my approach for best performance in Java

If I need to remove elements in a list, will the following be better than using LinkedList:
int j = 0;
List list = new ArrayList(1000000);
...
// fill in the list code here
...
for (Iterator i = list.listIterator(); i.hasNext(); j++) {
if (checkCondition) {
i.remove();
i = list.listIterator(j);
}
}
?
LinkedList does "remove and add elements" more effectively than ArrayList, but LinkedList as a doubly-linked list needs more memory, since each element is wrapped as an Entry object. While I need a one-direction List interface, because I'm running over in ascending order of index.
The answer is: it depends on the frequency and distribution of your adds and removes. If you only have to do a single remove infrequently, then you might use a linked list. However, the main advantage of an ArrayList over a LinkedList is constant-time random access. You can't really do this with a normal linked list (however, look at a skip list for some inspiration). On the other hand, if you're removing elements relative to other elements (where, for example, you need to remove the next element), then you should use a linked list.
There is no simple answer to this:
It depends on what you are optimizing for. Do you care more about the time taken to perform the operations, or the space used by the lists?
It depends on how long the lists are.
It depends on the proportion of elements that you are removing from the lists.
It depends on the other things that you do to the list.
The chances are that one or more of these determining factors is not predictable up-front; i.e. you don't really know. So my advice would be to put this off for now; i.e. just pick one or the other based on gut feeling (or a coin toss). You can revisit the decision later, if you have a quantifiable performance problem in this area ... as demonstrated by cpu or memory usage profiling.
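Whichever list you pick, on Java 8 or later the removal loop itself can usually be replaced by Collection.removeIf, which on an ArrayList removes all matching elements in bulk rather than shifting the tail once per removal. A minimal sketch, where the "remove even numbers" predicate is just a placeholder for checkCondition and the helper name is made up:

```java
import java.util.ArrayList;
import java.util.List;

class RemoveIfSketch {
    // Builds the list 0..n-1 and removes every even number from it.
    static List<Integer> withoutEvens(int n) {
        List<Integer> list = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            list.add(i);
        }
        // Placeholder condition; substitute your own check here
        list.removeIf(x -> x % 2 == 0);
        return list;
    }

    public static void main(String[] args) {
        System.out.println(withoutEvens(6)); // [1, 3, 5]
    }
}
```

Profiling both variants on your real data is still the only way to know which one is faster for you.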

List implementation that maintains ordering

Is there an existing List implementation in Java that maintains order based on provided Comparator?
Something that can be used in the following way:
Comparator<T> cmp = new MyComparator<T>();
List<T> l = new OrderedList<T>(cmp);
l.add(someT);
so that someT gets inserted such that the order in the list is maintained according to cmp
(On @andersoj's suggestion I am completing my question with one more request)
Also I want to be able to traverse the list in sorted order without removing the elements, i.e:
T min = Const.SMALLEST_T;
for (T e: l) {
assertTrue(cmp.compare(min, e) <= 0);
min = e;
}
should pass.
All suggestions are welcome (except telling me to use Collections.sort on the unordered full list), though, I would prefer something in java.* or eventually org.apache.* since it would be hard to introduce new libraries at this moment.
Note: (UPDATE4) I realized that implementations of this kind of list would have inadequate performance. There are two general approaches:
1. Use a linked structure, (sort of) a B-tree or similar
2. Use an array and insertion (with binary search)
No 1. has a problem with CPU cache misses.
No 2. has a problem with shifting elements in the array.
UPDATE2:
TreeSet does not work because it uses the provided comparator (MyComparator) to check for equality and based on it assumes that the elements are equal and exclude them. I need that comparator only for ordering, not "uniqueness" filtering (since the elements by their natural ordering are not equal)
UPDATE3:
PriorityQueue does not work as List (as I need) because there is no way to traverse it in the order it is "sorted", to get the elements in the sorted order you have to remove them from the collection.
UPDATE:
Similar question:
A good Sorted List for Java
Sorted array list in Java
You should probably be using a TreeSet:
The elements are ordered using their natural ordering, or by a Comparator provided at set creation time, depending on which constructor is used.
Example:
Comparator<T> cmp = new MyComparator<T>();
TreeSet<T> t = new TreeSet<T>(cmp);
t.add(someT);
Note that this is a set, so no duplicate entries are allowed. This may or may not work for your specific use-case.
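To make the "no duplicates" caveat concrete, here is a minimal sketch of what happens when two distinct elements compare as equal. The class name TreeSetDuplicateDemo, the helper byLength and the length-only comparator are made up for illustration; they are not from the question.

```java
import java.util.Comparator;
import java.util.TreeSet;

class TreeSetDuplicateDemo {
    // Adds the given words to a TreeSet ordered by length only and returns the set.
    static TreeSet<String> byLength(String... words) {
        Comparator<String> cmp = Comparator.comparingInt(String::length);
        TreeSet<String> set = new TreeSet<>(cmp);
        for (String w : words) {
            // add() returns false and keeps the old element when cmp reports 0
            set.add(w);
        }
        return set;
    }

    public static void main(String[] args) {
        // "pear" and "plum" have the same length, so "plum" is silently rejected.
        System.out.println(byLength("fig", "pear", "plum")); // prints [fig, pear]
    }
}
```

So the set treats "compares as 0" as "is the same element", regardless of equals().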
Response to the new requirement. I see two options:
Do what the JavaDoc for PriorityQueue says:
This class and its iterator implement all of the optional methods of the Collection and Iterator interfaces. The Iterator provided in method iterator() is not guaranteed to traverse the elements of the priority queue in any particular order. If you need ordered traversal, consider using Arrays.sort(pq.toArray()).
I suspect this will yield the best performance given your requirements. If this is not acceptable, you'll need to better explain what you're trying to accomplish.
Build a List that simply sorts itself upon addition of new elements. This is a real pain... if you used a linked structure, you can do an efficient insertion sort, but locality is bad. If you used an array-backed structure, insertion sort is a pain but traversal is better. If iteration/traversal is infrequent, you could hold the list contents unsorted and sort only on demand.
Consider using a PriorityQueue as I suggested, and in the event you need to iterate in order, write a wrapper iterator:
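import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.PriorityQueue;

// Iterates over a copy of the queue in priority order; the source queue is left untouched.
class PqIter<T> implements Iterator<T>
{
    final PriorityQueue<T> pq;
    public PqIter(PriorityQueue<T> source)
    {
        // The copy constructor keeps the source's ordering (comparator).
        pq = new PriorityQueue<T>(source);
    }
    @Override
    public boolean hasNext()
    {
        return pq.peek() != null;
    }
    @Override
    public T next()
    {
        if (pq.isEmpty()) throw new NoSuchElementException();
        return pq.poll();
    }
    @Override
    public void remove()
    { throw new UnsupportedOperationException(""); }
}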
Use Guava's TreeMultiset. I tested the following code with Integer and it seems to do the right thing.
import com.google.common.collect.TreeMultiset;
public class TreeMultiSetTest {
public static void main(String[] args) {
TreeMultiset<Integer> ts = TreeMultiset.create();
ts.add(1); ts.add(0); ts.add(2);
ts.add(-1); ts.add(5); ts.add(2);
for (Integer i : ts) {
System.out.println(i);
}
}
}
The below addresses the uniqueness/filtering problem you were having when using a SortedSet. I see that you also want an iterator, so this won't work.
If what you really want is an ordered list-like thing, you can make use of a PriorityQueue.
Comparator<T> cmp = new MyComparator<T>();
PriorityQueue<T> pq = new PriorityQueue<T>(cmp);
pq.add(someT);
Take note of what the API documentation says about the time properties of various operations:
Implementation note: this implementation provides O(log(n)) time for the enqueing and dequeing methods (offer, poll, remove() and add); linear time for the remove(Object) and contains(Object) methods; and constant time for the retrieval methods (peek, element, and size).
You should also be aware that the iterators produced by PriorityQueue do not behave as one might expect:
The Iterator provided in method iterator() is not guaranteed to traverse the elements of the priority queue in any particular order. If you need ordered traversal, consider using Arrays.sort(pq.toArray()).
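A minimal sketch of that Javadoc advice, assuming you want a sorted snapshot and are happy to pay for a copy plus a sort on each traversal. It uses a List copy rather than Arrays.sort(pq.toArray()) so that the queue's own comparator can be reused; PqSortedView and sortedSnapshot are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

class PqSortedView {
    // Returns the queue's elements in priority order without modifying the queue.
    static <T> List<T> sortedSnapshot(PriorityQueue<T> pq) {
        List<T> copy = new ArrayList<>(pq);
        // pq.comparator() is null when the queue uses natural ordering;
        // List.sort(null) then falls back to natural ordering as well.
        copy.sort(pq.comparator());
        return copy;
    }

    public static void main(String[] args) {
        PriorityQueue<Integer> pq = new PriorityQueue<>(Comparator.<Integer>reverseOrder());
        pq.add(3);
        pq.add(1);
        pq.add(2);
        System.out.println(sortedSnapshot(pq)); // prints [3, 2, 1]
        System.out.println(pq.size());          // prints 3, the queue is unchanged
    }
}
```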
I just noticed that Guava provides a MinMaxPriorityQueue. It is a double-ended priority queue backed by an array-based min-max heap (the JDK's PriorityQueue is also array-backed, as a binary heap), so it likely has different timing behavior. If you're doing something performance sensitive, you may wish to take a look. While the notes give slightly different (linear and logarithmic) big-O times, all those times should also be bounded, which may be useful.
There is no List implementation per se that maintains ordering, but what you are likely looking for are implementations of SortedSet. A TreeSet is the most common. The other implementation, ConcurrentSkipListSet, is for more specific uses. Note that a SortedSet provides ordering but, unlike a List, does not allow duplicate entries.
Refs:
Blog post PriorityQueue iterator is not ordered
SO question on PQ ordered iterator
I have a similar problem and I'm thinking of using a TreeSet. To avoid excluding "equal" elements, I will modify the comparator so that instead of returning 0 it returns a random number between (-1,1), or always returns 1.
If you have no control over the Comparator, or if you use it for anything other than insertion, this solution won't work for you.
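As an alternative to a comparator that never returns 0 (which leaves lookups and removals in the TreeSet unreliable), here is a sketch of a different technique: break ties with an insertion counter. This is not what the answer above proposes; the class DuplicateFriendlySortedBag and its Entry wrapper are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

class DuplicateFriendlySortedBag<T> {
    // Pairs each element with the sequence number of its insertion.
    private static final class Entry<E> {
        final E value;
        final long seq;
        Entry(E value, long seq) { this.value = value; this.seq = seq; }
    }

    private final TreeSet<Entry<T>> entries;
    private long nextSeq = 0;

    DuplicateFriendlySortedBag(Comparator<? super T> cmp) {
        // Order by the caller's comparator first, then by insertion order,
        // so two different entries never compare as equal.
        Comparator<Entry<T>> byValue = (a, b) -> cmp.compare(a.value, b.value);
        entries = new TreeSet<>(byValue.thenComparingLong(e -> e.seq));
    }

    void add(T value) {
        entries.add(new Entry<>(value, nextSeq++));
    }

    // Copies the elements out in sorted order; ties keep insertion order.
    List<T> toSortedList() {
        List<T> out = new ArrayList<>(entries.size());
        for (Entry<T> e : entries) {
            out.add(e.value);
        }
        return out;
    }

    public static void main(String[] args) {
        Comparator<String> byLength = Comparator.comparingInt(String::length);
        DuplicateFriendlySortedBag<String> bag = new DuplicateFriendlySortedBag<>(byLength);
        bag.add("pear");
        bag.add("fig");
        bag.add("plum");
        System.out.println(bag.toSortedList()); // prints [fig, pear, plum]
    }
}
```

The comparator stays consistent (it is a proper total order), at the cost of one small wrapper object per element.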

Is there an indexable sorted list in the java.util package?

I'm looking for a data structure in the java.util package. I need it to meet the following requirements:
The number of elements is (theoretically) unbounded.
The elements are sorted in an ascending order.
You can get the nth element (fast).
You can remove the nth element (fast).
I expected to find an indexable skip list, but I didn't. Is there any data structure that meets the requirements I've stated?
There is no such container in the Java standard libraries.
When I need a data structure with these properties, I use a List implementation (generally an ArrayList, but it doesn't matter), and I do all the insertions using Collections.binarySearch.
If I had to encapsulate a sorted list as a reusable class, I'd implement the List interface, delegating all methods to a 'standard' List implementation (it can even be passed as a parameter to the constructor). I'd implement every insertion method (add, addAll, set, Iterator's remove) by throwing an exception (UnsupportedOperationException), so that nobody can break the 'always sorted' property. Finally, I'd provide a method insertSorted that would use Collections.binarySearch to do the insertion.
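A minimal sketch of the insertion step described above, written as a static helper rather than a full List wrapper. The names SortedInsert and insertSorted are illustrative, and the helper assumes the list passed in is already sorted under cmp.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

class SortedInsert {
    // Inserts value into an already sorted list so that it stays sorted under cmp.
    static <T> void insertSorted(List<T> sorted, T value, Comparator<? super T> cmp) {
        int pos = Collections.binarySearch(sorted, value, cmp);
        if (pos < 0) {
            // Not found: binarySearch returned -(insertionPoint) - 1.
            pos = -pos - 1;
        }
        // If found, the value lands next to an element that compares equal,
        // so duplicates are kept rather than dropped.
        sorted.add(pos, value);
    }

    public static void main(String[] args) {
        List<Integer> list = new ArrayList<>();
        Comparator<Integer> cmp = Comparator.naturalOrder();
        insertSorted(list, 5, cmp);
        insertSorted(list, 1, cmp);
        insertSorted(list, 3, cmp);
        insertSorted(list, 3, cmp);
        System.out.println(list);        // prints [1, 3, 3, 5]
        System.out.println(list.get(2)); // prints 3, indexed access stays O(1) on an ArrayList
    }
}
```

The search itself is O(log n), but on an ArrayList the add(pos, value) call still shifts elements, so each insertion is O(n) in the worst case.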
There exists no simple data structure that fulfills all your criteria.
The only one I know of that fulfills them all would be an indexable skip list. However, I don't know of any readily available Java implementations.
This question is very similar to
Sorted array list in Java
Have a look at my answer to that question.
Basically it suggests the following:
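import java.util.ArrayList;

class SortedArrayList<T> extends ArrayList<T> {
    // Appends value, then swaps it towards the front until the list is sorted again.
    // Assumes T implements Comparable<T>; otherwise the cast below fails at runtime.
    @SuppressWarnings("unchecked")
    public void insertSorted(T value) {
        add(value);
        Comparable<T> cmp = (Comparable<T>) value;
        for (int i = size()-1; i > 0 && cmp.compareTo(get(i-1)) < 0; i--) {
            T tmp = get(i);
            set(i, get(i-1));
            set(i-1, tmp);
        }
    }
}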
A note on your first requirement: "The number of elements is unbounded.":
You may want to restrict this to something like "The number of elements should not be bounded by less than 2^31-1..." since otherwise you're ruling out all options which are backed by a Java array. (You could get away with an arbitrary number of elements using for instance a LinkedList, but I can't see how you could do fast lookups in that.)
TreeSet keeps its elements sorted (by natural ordering or by a Comparator) as you add them.
But if you don't need this and Collections.sort() is permitted, you can use a simple ArrayList.
Consider List combined with Collections.sort().
Going with what dwb stated with List<T> and Collections.sort(), you can use ArrayList<T> as that implements List<T> (and is not synchronized like Vector<T> unless of course you want that overhead). That is probably your best bet because they (Sun) typically do lots of research into these areas (from what I've seen anyway). If you need to sort by something other than the "default" (i.e. you are not sorting a list of integers, etc), then supply your own comparator.
EDIT: The only requirement this does not meet is fast removal...
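A rough sketch of the ArrayList plus Collections.sort() approach with a caller-supplied comparator. SortOnDemand and sortedCopy are made-up names, and copying before sorting is my own choice here so that the input list is left alone.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

class SortOnDemand {
    // Returns a new ArrayList holding the elements of input sorted by cmp.
    static <T> List<T> sortedCopy(List<T> input, Comparator<? super T> cmp) {
        List<T> copy = new ArrayList<>(input);
        Collections.sort(copy, cmp); // stable sort: ties keep their original order
        return copy;
    }

    public static void main(String[] args) {
        List<String> fruits = Arrays.asList("pear", "fig", "banana");
        List<String> sorted = sortedCopy(fruits, Comparator.comparingInt(String::length));
        System.out.println(sorted);        // prints [fig, pear, banana]
        System.out.println(sorted.get(1)); // prints pear, get(n) is O(1)
        sorted.remove(0);                  // remove(n) shifts the tail, so it is O(n)
        System.out.println(sorted);        // prints [pear, banana]
    }
}
```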
Look at PriorityQueue.
If you don't need duplicate elements in your data structure, then a plain TreeSet also fits your requirements.
