Sorted Lists in Java [duplicate] - java

In Java there are the SortedSet and SortedMap interfaces. Both belong to the Java Collections framework and provide a sorted way to access the elements.
However, in my understanding there is no SortedList in Java. You can use java.util.Collections.sort() to sort a list.
Any idea why it is designed like that?

List iterators guarantee first and foremost that you get the list's elements in the internal order of the list (aka. insertion order). More specifically it is in the order you've inserted the elements or on how you've manipulated the list. Sorting can be seen as a manipulation of the data structure, and there are several ways to sort the list.
I'll order the ways in the order of usefulness as I personally see it:
1. Consider using Set or Bag collections instead
NOTE: I put this option at the top because this is what you normally want to do anyway.
A sorted set automatically sorts the collection at insertion, meaning that it does the sorting while you add elements into the collection. It also means you don't need to manually sort it.
Furthermore if you are sure that you don't need to worry about (or have) duplicate elements then you can use the TreeSet<T> instead. It implements SortedSet and NavigableSet interfaces and works as you'd probably expect from a list:
TreeSet<String> set = new TreeSet<String>();
set.add("lol");
set.add("cat");
// automatically sorts natural order when adding
for (String s : set) {
System.out.println(s);
}
// Prints out "cat" and "lol"
If you don't want the natural ordering you can use the constructor parameter that takes a Comparator<T>.
Alternatively, you can use Multisets (also known as Bags), that is a Set that allows duplicate elements, instead and there are third-party implementations of them. Most notably from the Guava libraries there is a TreeMultiset, that works a lot like the TreeSet.
2. Sort your list with Collections.sort()
As mentioned above, sorting of Lists is a manipulation of the data structure. So for situations where you need "one source of truth" that will be sorted in a variety of ways then sorting it manually is the way to go.
You can sort your list with the java.util.Collections.sort() method. Here is a code sample on how:
List<String> strings = new ArrayList<String>()
strings.add("lol");
strings.add("cat");
Collections.sort(strings);
for (String s : strings) {
System.out.println(s);
}
// Prints out "cat" and "lol"
Using comparators
One clear benefit is that you may use Comparator in the sort method. Java also provides some implementations for the Comparator such as the Collator which is useful for locale sensitive sorting strings. Here is one example:
Collator usCollator = Collator.getInstance(Locale.US);
usCollator.setStrength(Collator.PRIMARY); // ignores casing
Collections.sort(strings, usCollator);
Sorting in concurrent environments
Do note though that using the sort method is not friendly in concurrent environments, since the collection instance will be manipulated, and you should consider using immutable collections instead. This is something Guava provides in the Ordering class and is a simple one-liner:
List<string> sorted = Ordering.natural().sortedCopy(strings);
3. Wrap your list with java.util.PriorityQueue
Though there is no sorted list in Java there is however a sorted queue which would probably work just as well for you. It is the java.util.PriorityQueue class.
Nico Haase linked in the comments to a related question that also answers this.
In a sorted collection you most likely don't want to manipulate the internal data structure which is why PriorityQueue doesn't implement the List interface (because that would give you direct access to its elements).
Caveat on the PriorityQueue iterator
The PriorityQueue class implements the Iterable<E> and Collection<E> interfaces so it can be iterated as usual. However, the iterator is not guaranteed to return elements in the sorted order. Instead (as Alderath points out in the comments) you need to poll() the queue until empty.
Note that you can convert a list to a priority queue via the constructor that takes any collection:
List<String> strings = new ArrayList<String>()
strings.add("lol");
strings.add("cat");
PriorityQueue<String> sortedStrings = new PriorityQueue(strings);
while(!sortedStrings.isEmpty()) {
System.out.println(sortedStrings.poll());
}
// Prints out "cat" and "lol"
4. Write your own SortedList class
NOTE: You shouldn't have to do this.
You can write your own List class that sorts each time you add a new element. This can get rather computation heavy depending on your implementation and is pointless, unless you want to do it as an exercise, because of two main reasons:
It breaks the contract that List<E> interface has because the add methods should ensure that the element will reside in the index that the user specifies.
Why reinvent the wheel? You should be using the TreeSet or Multisets instead as pointed out in the first point above.
However, if you want to do it as an exercise here is a code sample to get you started, it uses the AbstractList abstract class:
public class SortedList<E> extends AbstractList<E> {
private ArrayList<E> internalList = new ArrayList<E>();
// Note that add(E e) in AbstractList is calling this one
#Override
public void add(int position, E e) {
internalList.add(e);
Collections.sort(internalList, null);
}
#Override
public E get(int i) {
return internalList.get(i);
}
#Override
public int size() {
return internalList.size();
}
}
Note that if you haven't overridden the methods you need, then the default implementations from AbstractList will throw UnsupportedOperationExceptions.

Because the concept of a List is incompatible with the concept of an automatically sorted collection. The point of a List is that after calling list.add(7, elem), a call to list.get(7) will return elem. With an auto-sorted list, the element could end up in an arbitrary position.

Since all lists are already "sorted" by the order the items were added (FIFO ordering), you can "resort" them with another order, including the natural ordering of elements, using java.util.Collections.sort().
EDIT:
Lists as data structures are based in what is interesting is the ordering in which the items where inserted.
Sets do not have that information.
If you want to order by adding time, use List. If you want to order by other criteria, use SortedSet.

Set and Map are non-linear data structure. List is linear data structure.
The tree data structure SortedSet and SortedMap interfaces implements TreeSet and TreeMap respectively using used Red-Black tree implementation algorithm. So it ensure that there are no duplicated items (or keys in case of Map).
List already maintains an ordered collection and index-based data structure, trees are no index-based data structures.
Tree by definition cannot contain duplicates.
In List we can have duplicates, so there is no TreeList(i.e. no SortedList).
List maintains elements in insertion order. So if we want to sort the list we have to use java.util.Collections.sort(). It sorts the specified list into ascending order, according to the natural ordering of its elements.

JavaFX SortedList
Though it took a while, Java 8 does have a sorted List.
http://docs.oracle.com/javase/8/javafx/api/javafx/collections/transformation/SortedList.html
As you can see in the javadocs, it is part of the JavaFX collections, intended to provide a sorted view on an ObservableList.
Update: Note that with Java 11, the JavaFX toolkit has moved outside the JDK and is now a separate library. JavaFX 11 is available as a downloadable SDK or from MavenCentral. See https://openjfx.io

For any newcomers, as of April 2015, Android now has a SortedList class in the support library, designed specifically to work with RecyclerView. Here's the blog post about it.

Another point is the time complexity of insert operations.
For a list insert, one expects a complexity of O(1).
But this could not be guaranteed with a sorted list.
And the most important point is that lists assume nothing about their elements.
For example, you can make lists of things that do not implement equals or compare.

Think of it like this: the List interface has methods like add(int index, E element), set(int index, E element). The contract is that once you added an element at position X you will find it there unless you add or remove elements before it.
If any list implementation would store elements in some order other than based on the index, the above list methods would make no sense.

In case you are looking for a way to sort elements, but also be able to access them by index in an efficient way, you can do the following:
Use a random access list for storage (e.g. ArrayList)
Make sure it is always sorted
Then to add or remove an element you can use Collections.binarySearch to get the insertion / removal index. Since your list implements random access, you can efficiently modify the list with the determined index.
Example:
/**
* #deprecated
* Only for demonstration purposes. Implementation is incomplete and does not
* handle invalid arguments.
*/
#Deprecated
public class SortingList<E extends Comparable<E>> {
private ArrayList<E> delegate;
public SortingList() {
delegate = new ArrayList<>();
}
public void add(E e) {
int insertionIndex = Collections.binarySearch(delegate, e);
// < 0 if element is not in the list, see Collections.binarySearch
if (insertionIndex < 0) {
insertionIndex = -(insertionIndex + 1);
}
else {
// Insertion index is index of existing element, to add new element
// behind it increase index
insertionIndex++;
}
delegate.add(insertionIndex, e);
}
public void remove(E e) {
int index = Collections.binarySearch(delegate, e);
delegate.remove(index);
}
public E get(int index) {
return delegate.get(index);
}
}
(See a more complete implementation in this answer)

First line in the List API says it is an ordered collection (also known as a sequence). If you sort the list you can't maintain the order, so there is no TreeList in Java.
As API says Java List got inspired from Sequence and see the sequence properties http://en.wikipedia.org/wiki/Sequence_(mathematics)
It doesn't mean that you can't sort the list, but Java strict to his definition and doesn't provide sorted versions of lists by default.

I think all the above do not answer this question due to following reasons,
Since same functionality can be achieved by using other collections such as TreeSet, Collections, PriorityQueue..etc (but this is an alternative which will also impose their constraints i.e. Set will remove duplicate elements. Simply saying even if it does not impose any constraint, it does not answer the question why SortedList was not created by java community)
Since List elements do not implements compare/equals methods (This holds true for Set & Map also where in general items do not implement Comparable interface but when we need these items to be in sorted order & want to use TreeSet/TreeMap,items should implement Comparable interface)
Since List uses indexing & due to sorting it won't work (This can be easily handled introducing intermediate interface/abstract class)
but none has told the exact reason behind it & as I believe these kind of questions can be best answered by java community itself as it will have only one & specific answer but let me try my best to answer this as following,
As we know sorting is an expensive operation and there is a basic difference between List & Set/Map that List can have duplicates but Set/Map can not.
This is the core reason why we have got a default implementation for Set/Map in form of TreeSet/TreeMap. Internally this is a Red Black Tree with every operation (insert/delete/search) having the complexity of O(log N) where due to duplicates List could not fit in this data storage structure.
Now the question arises we could also choose a default sorting method for List also like MergeSort which is used by Collections.sort(list) method with the complexity of O(N log N). Community did not do this deliberately since we do have multiple choices for sorting algorithms for non distinct elements like QuickSort, ShellSort, RadixSort...etc. In future there can be more. Also sometimes same sorting algorithm performs differently depending on the data to be sorted. Therefore they wanted to keep this option open and left this on us to choose. This was not the case with Set/Map since O(log N) is the best sorting complexity.

https://github.com/geniot/indexed-tree-map
Consider using indexed-tree-map . It's an enhanced JDK's TreeSet that provides access to element by index and finding the index of an element without iteration or hidden underlying lists that back up the tree. The algorithm is based on updating weights of changed nodes every time there is a change.

We have Collections.sort(arr) method which can help to sort ArrayList arr. to get sorted in desc manner we can use Collections.sort(arr, Collections.reverseOrder())

Related

SortedList that maintains order like SortedSet but also permits duplicate elements [duplicate]

In Java there are the SortedSet and SortedMap interfaces. Both belong to the Java Collections framework and provide a sorted way to access the elements.
However, in my understanding there is no SortedList in Java. You can use java.util.Collections.sort() to sort a list.
Any idea why it is designed like that?
List iterators guarantee first and foremost that you get the list's elements in the internal order of the list (aka. insertion order). More specifically it is in the order you've inserted the elements or on how you've manipulated the list. Sorting can be seen as a manipulation of the data structure, and there are several ways to sort the list.
I'll order the ways in the order of usefulness as I personally see it:
1. Consider using Set or Bag collections instead
NOTE: I put this option at the top because this is what you normally want to do anyway.
A sorted set automatically sorts the collection at insertion, meaning that it does the sorting while you add elements into the collection. It also means you don't need to manually sort it.
Furthermore if you are sure that you don't need to worry about (or have) duplicate elements then you can use the TreeSet<T> instead. It implements SortedSet and NavigableSet interfaces and works as you'd probably expect from a list:
TreeSet<String> set = new TreeSet<String>();
set.add("lol");
set.add("cat");
// automatically sorts natural order when adding
for (String s : set) {
System.out.println(s);
}
// Prints out "cat" and "lol"
If you don't want the natural ordering you can use the constructor parameter that takes a Comparator<T>.
Alternatively, you can use Multisets (also known as Bags), that is a Set that allows duplicate elements, instead and there are third-party implementations of them. Most notably from the Guava libraries there is a TreeMultiset, that works a lot like the TreeSet.
2. Sort your list with Collections.sort()
As mentioned above, sorting of Lists is a manipulation of the data structure. So for situations where you need "one source of truth" that will be sorted in a variety of ways then sorting it manually is the way to go.
You can sort your list with the java.util.Collections.sort() method. Here is a code sample on how:
List<String> strings = new ArrayList<String>()
strings.add("lol");
strings.add("cat");
Collections.sort(strings);
for (String s : strings) {
System.out.println(s);
}
// Prints out "cat" and "lol"
Using comparators
One clear benefit is that you may use Comparator in the sort method. Java also provides some implementations for the Comparator such as the Collator which is useful for locale sensitive sorting strings. Here is one example:
Collator usCollator = Collator.getInstance(Locale.US);
usCollator.setStrength(Collator.PRIMARY); // ignores casing
Collections.sort(strings, usCollator);
Sorting in concurrent environments
Do note though that using the sort method is not friendly in concurrent environments, since the collection instance will be manipulated, and you should consider using immutable collections instead. This is something Guava provides in the Ordering class and is a simple one-liner:
List<string> sorted = Ordering.natural().sortedCopy(strings);
3. Wrap your list with java.util.PriorityQueue
Though there is no sorted list in Java there is however a sorted queue which would probably work just as well for you. It is the java.util.PriorityQueue class.
Nico Haase linked in the comments to a related question that also answers this.
In a sorted collection you most likely don't want to manipulate the internal data structure which is why PriorityQueue doesn't implement the List interface (because that would give you direct access to its elements).
Caveat on the PriorityQueue iterator
The PriorityQueue class implements the Iterable<E> and Collection<E> interfaces so it can be iterated as usual. However, the iterator is not guaranteed to return elements in the sorted order. Instead (as Alderath points out in the comments) you need to poll() the queue until empty.
Note that you can convert a list to a priority queue via the constructor that takes any collection:
List<String> strings = new ArrayList<String>()
strings.add("lol");
strings.add("cat");
PriorityQueue<String> sortedStrings = new PriorityQueue(strings);
while(!sortedStrings.isEmpty()) {
System.out.println(sortedStrings.poll());
}
// Prints out "cat" and "lol"
4. Write your own SortedList class
NOTE: You shouldn't have to do this.
You can write your own List class that sorts each time you add a new element. This can get rather computation heavy depending on your implementation and is pointless, unless you want to do it as an exercise, because of two main reasons:
It breaks the contract that List<E> interface has because the add methods should ensure that the element will reside in the index that the user specifies.
Why reinvent the wheel? You should be using the TreeSet or Multisets instead as pointed out in the first point above.
However, if you want to do it as an exercise here is a code sample to get you started, it uses the AbstractList abstract class:
public class SortedList<E> extends AbstractList<E> {
private ArrayList<E> internalList = new ArrayList<E>();
// Note that add(E e) in AbstractList is calling this one
#Override
public void add(int position, E e) {
internalList.add(e);
Collections.sort(internalList, null);
}
#Override
public E get(int i) {
return internalList.get(i);
}
#Override
public int size() {
return internalList.size();
}
}
Note that if you haven't overridden the methods you need, then the default implementations from AbstractList will throw UnsupportedOperationExceptions.
Because the concept of a List is incompatible with the concept of an automatically sorted collection. The point of a List is that after calling list.add(7, elem), a call to list.get(7) will return elem. With an auto-sorted list, the element could end up in an arbitrary position.
Since all lists are already "sorted" by the order the items were added (FIFO ordering), you can "resort" them with another order, including the natural ordering of elements, using java.util.Collections.sort().
EDIT:
Lists as data structures are based in what is interesting is the ordering in which the items where inserted.
Sets do not have that information.
If you want to order by adding time, use List. If you want to order by other criteria, use SortedSet.
Set and Map are non-linear data structure. List is linear data structure.
The tree data structure SortedSet and SortedMap interfaces implements TreeSet and TreeMap respectively using used Red-Black tree implementation algorithm. So it ensure that there are no duplicated items (or keys in case of Map).
List already maintains an ordered collection and index-based data structure, trees are no index-based data structures.
Tree by definition cannot contain duplicates.
In List we can have duplicates, so there is no TreeList(i.e. no SortedList).
List maintains elements in insertion order. So if we want to sort the list we have to use java.util.Collections.sort(). It sorts the specified list into ascending order, according to the natural ordering of its elements.
JavaFX SortedList
Though it took a while, Java 8 does have a sorted List.
http://docs.oracle.com/javase/8/javafx/api/javafx/collections/transformation/SortedList.html
As you can see in the javadocs, it is part of the JavaFX collections, intended to provide a sorted view on an ObservableList.
Update: Note that with Java 11, the JavaFX toolkit has moved outside the JDK and is now a separate library. JavaFX 11 is available as a downloadable SDK or from MavenCentral. See https://openjfx.io
For any newcomers, as of April 2015, Android now has a SortedList class in the support library, designed specifically to work with RecyclerView. Here's the blog post about it.
Another point is the time complexity of insert operations.
For a list insert, one expects a complexity of O(1).
But this could not be guaranteed with a sorted list.
And the most important point is that lists assume nothing about their elements.
For example, you can make lists of things that do not implement equals or compare.
Think of it like this: the List interface has methods like add(int index, E element), set(int index, E element). The contract is that once you added an element at position X you will find it there unless you add or remove elements before it.
If any list implementation would store elements in some order other than based on the index, the above list methods would make no sense.
In case you are looking for a way to sort elements, but also be able to access them by index in an efficient way, you can do the following:
Use a random access list for storage (e.g. ArrayList)
Make sure it is always sorted
Then to add or remove an element you can use Collections.binarySearch to get the insertion / removal index. Since your list implements random access, you can efficiently modify the list with the determined index.
Example:
/**
* #deprecated
* Only for demonstration purposes. Implementation is incomplete and does not
* handle invalid arguments.
*/
#Deprecated
public class SortingList<E extends Comparable<E>> {
private ArrayList<E> delegate;
public SortingList() {
delegate = new ArrayList<>();
}
public void add(E e) {
int insertionIndex = Collections.binarySearch(delegate, e);
// < 0 if element is not in the list, see Collections.binarySearch
if (insertionIndex < 0) {
insertionIndex = -(insertionIndex + 1);
}
else {
// Insertion index is index of existing element, to add new element
// behind it increase index
insertionIndex++;
}
delegate.add(insertionIndex, e);
}
public void remove(E e) {
int index = Collections.binarySearch(delegate, e);
delegate.remove(index);
}
public E get(int index) {
return delegate.get(index);
}
}
(See a more complete implementation in this answer)
First line in the List API says it is an ordered collection (also known as a sequence). If you sort the list you can't maintain the order, so there is no TreeList in Java.
As API says Java List got inspired from Sequence and see the sequence properties http://en.wikipedia.org/wiki/Sequence_(mathematics)
It doesn't mean that you can't sort the list, but Java strict to his definition and doesn't provide sorted versions of lists by default.
I think all the above do not answer this question due to following reasons,
Since same functionality can be achieved by using other collections such as TreeSet, Collections, PriorityQueue..etc (but this is an alternative which will also impose their constraints i.e. Set will remove duplicate elements. Simply saying even if it does not impose any constraint, it does not answer the question why SortedList was not created by java community)
Since List elements do not implements compare/equals methods (This holds true for Set & Map also where in general items do not implement Comparable interface but when we need these items to be in sorted order & want to use TreeSet/TreeMap,items should implement Comparable interface)
Since List uses indexing & due to sorting it won't work (This can be easily handled introducing intermediate interface/abstract class)
but none has told the exact reason behind it & as I believe these kind of questions can be best answered by java community itself as it will have only one & specific answer but let me try my best to answer this as following,
As we know sorting is an expensive operation and there is a basic difference between List & Set/Map that List can have duplicates but Set/Map can not.
This is the core reason why we have got a default implementation for Set/Map in form of TreeSet/TreeMap. Internally this is a Red Black Tree with every operation (insert/delete/search) having the complexity of O(log N) where due to duplicates List could not fit in this data storage structure.
Now the question arises we could also choose a default sorting method for List also like MergeSort which is used by Collections.sort(list) method with the complexity of O(N log N). Community did not do this deliberately since we do have multiple choices for sorting algorithms for non distinct elements like QuickSort, ShellSort, RadixSort...etc. In future there can be more. Also sometimes same sorting algorithm performs differently depending on the data to be sorted. Therefore they wanted to keep this option open and left this on us to choose. This was not the case with Set/Map since O(log N) is the best sorting complexity.
https://github.com/geniot/indexed-tree-map
Consider using indexed-tree-map . It's an enhanced JDK's TreeSet that provides access to element by index and finding the index of an element without iteration or hidden underlying lists that back up the tree. The algorithm is based on updating weights of changed nodes every time there is a change.
We have Collections.sort(arr) method which can help to sort ArrayList arr. to get sorted in desc manner we can use Collections.sort(arr, Collections.reverseOrder())

In what order are the elements of a List accessed

I am looking at lists and ordering in Java. According to the documentation a List is an ordered list of Objects and when the Objects are standard class such as String, Integer it will use a standard comparator to do the comparison. A few questions jump to mind.
If your list of some arbitrary object (not a standard class) do you have to implement the Comparator interface, or will it rely on toString()?
Do you have to use a ListIterator to traverse the list in the order you require?
Looking at the documentation elements are added to the list at the end and thus will not be in order unless you sort it or use the ListIterator.
Items added to a List are stored in the order they are entered (so it's insertion order). If you want to sort a List you can use Collections.sort(List, Comparator) or the Collections.sort(List) method.
Ordered != Sorted
Ordered means that the elements in the structure have a determined order (given by insertion).
Sets are unordered, Lists are ordered, LinkedHashSet is ordered, and so on...
If you want the list to have in a specific order, determined by
the natural order (implements Comparable) or
a provivided order (implements Comparator),
you have to sort it (see Collections.sort()).
First, I am afraid that you are confusing with ordered and sorted terms. "Ordered" does not mean that the elements are sorted according to any criteria. It just means that elements order is predictable. In case of lists the order is dictated by list index and, for example, if you add 5 elements using add() method you will then iterate over the list and get the elements in the same order.
Sorted means that elements may be re-arranged (sorted) using one of available methods (e.g. Collections.sort()). In this case comparator is relevant.
Concerning the "standard" and non standard classes. Neither String nor Integer does not have any privileges in terms of sorting. Additionally to Comparator interface JDK provides Comparable interface implemented by both String and Integer. This is the reason that String and Interger lists can be sorted in natural order. Please take a look on javadoc of mentioned interfaces for more details. You can make your class to implement Comparable and enjoy the same features.
Neither Comparator nor Comparable do not relate to toString() and although it can be used when implementing both it is highly not recommended.
ListIterator provides more methods relatively to Iterator. For example you can traverse list backwards. You should choose iterator type according to your needs. Although since java 5 is released (~10 years ago) the iterators are needed more seldom because all collections implement Iterable that can be directly used in for loops. Iteraters are needed basically if you want to remove elements during iteration. BTW starting from Java 8 that was released several months ago List.forEach() will be used more and more, so iterators will become even less popular.
a List is an ordered list of Objects The "ordered" here means insertion order.
if you want to sort the list in certain order, you have to provide the Comparator or make your elements implement Comparable. you can call Collectoins class's:
public static <T extends Comparable<? super T>> void sort(List<T> list)
or
public static <T> void sort(List<T> list, Comparator<? super T> c)
If you just want to go through your list, a normal Iterator is enough. while, ListIterator provides more funtionality, like:
backwards iteration
get the index of pre/next element

List implementation that maintains ordering

Is there an existing List implementation in Java that maintains order based on provided Comparator?
Something that can be used in the following way:
Comparator<T> cmp = new MyComparator<T>();
List<T> l = new OrderedList<T>(cmp);
l.add(someT);
so that someT gets inserted such that the order in the list is maintained according to cmp
(On #andersoj suggestion I am completing my question with one more request)
Also I want to be able to traverse the list in sorted order without removing the elements, i.e:
T min = Const.SMALLEST_T;
for (T e: l) {
assertTrue(cmp.compare(min, e) >= 0);
min = e;
}
should pass.
All suggestions are welcome (except telling me to use Collections.sort on the unordered full list), though, I would prefer something in java.* or eventually org.apache.* since it would be hard to introduce new libraries at this moment.
Note: (UPDATE4) I realized that implementations of this kind of list would have inadequate performance. There two general approaches:
Use Linked structure (sort of) B-tree or similar
Use array and insertion (with binary search)
No 1. has problem with CPU cache misses
No 2. has problem with shifting elements in array.
UPDATE2:
TreeSet does not work because it uses the provided comparator (MyComparator) to check for equality and based on it assumes that the elements are equal and exclude them. I need that comparator only for ordering, not "uniqueness" filtering (since the elements by their natural ordering are not equal)
UPDATE3:
PriorityQueue does not work as List (as I need) because there is no way to traverse it in the order it is "sorted", to get the elements in the sorted order you have to remove them from the collection.
UPDATE:
Similar question:
A good Sorted List for Java
Sorted array list in Java
You should probably be using a TreeSet:
The elements are ordered using their natural ordering, or by a Comparator provided at set creation time, depending on which constructor is used.
Example:
Comparator<T> cmp = new MyComparator<T>();
TreeSet<T> t = new TreeSet<T>(cmp);
l.add(someT);
Note that this is a set, so no duplicate entries are allowed. This may or may not work for your specific use-case.
Response to new requirement. I see two potentials:
Do what the JavaDoc for PriorityQueue says:
This class and its iterator implement all of the optional methods of the Collection and Iterator interfaces. The Iterator provided in method iterator() is not guaranteed to traverse the elements of the priority queue in any particular order. If you need ordered traversal, consider using Arrays.sort(pq.toArray()).
I suspect this will yield the best performance given your requirements. If this is not acceptable, you'll need to better explain what you're trying to accomplish.
Build a List that simply sorts itself upon addition of new elements. This is a real pain... if you used a linked structure, you can do an efficient insertion sort, but locality is bad. If you used an array-backed structure, insertion sort is a pain but traversal is better. If iteration/traversal is infrequent, you could hold the list contents unsorted and sort only on demand.
Consider using a PriorityQueue as I suggested, and in the event you need to iterate in order, write a wrapper iterator:
class PqIter implements Iterator<T>
{
final PriorityQueue<T> pq;
public PqIter(PriorityQueue <T> source)
{
pq = new PriorityQueue(source);
}
#Override
public boolean hasNext()
{
return pq.peek() != null
}
#Override
public T next()
{ return pq.poll(); }
#Override
public void remove()
{ throw new UnsupportedOperationException(""); }
}
Use Guava's TreeMultiSet. I tested the following code with Integer and it seems to do the right thing.
import com.google.common.collect.TreeMultiset;
public class TreeMultiSetTest {
public static void main(String[] args) {
TreeMultiset<Integer> ts = TreeMultiset.create();
ts.add(1); ts.add(0); ts.add(2);
ts.add(-1); ts.add(5); ts.add(2);
for (Integer i : ts) {
System.out.println(i);
}
}
}
The below addresses the uniqueness/filtering problem you were having when using a SortedSet. I see that you also want an iterator, so this won't work.
If what you really want is an ordered list-like thing, you can make use of a PriorityQueue.
Comparator<T> cmp = new MyComparator<T>();
PriorityQueue<T> pq = new PriorityQueue<T>(cmp);
pq.add(someT);
Take note of what the API documentation says about the time properties of various operations:
Implementation note: this implementation provides O(log(n)) time for the enqueing and dequeing methods (offer, poll, remove() and add); linear time for the remove(Object) and contains(Object) methods; and constant time for the retrieval methods (peek, element, and size).
You should also be aware that the iterators produced by PriorityQueue do not behave as one might expect:
The Iterator provided in method iterator() is not guaranteed to traverse the elements of the priority queue in any particular order. If you need ordered traversal, consider using Arrays.sort(pq.toArray()).
I just noticed that Guava provides a MinMaxPriorityQueue. This implementation is array-backed, rather than the linked form provided in the JDK's PriorityQueue, and thus likely has different timing behavior. If you're doing something performance sensitive, you may wish to take a look. While the notes give slightly different (linear and logarithmic) big-O times, all those times should also be bounded, which may be useful.
There is not a List implementation per se that maintains ordering, but what you are likely looking for are implementations of SortedSet. A TreeSet is the most common. The other implementation, a ConcurrentSkipListSet is for more specific uses. Note that a SortedSet provides ordering, but does not allow duplicate entries, as does a List.
Refs:
Blog post PriorityQueue iterator is not ordered
SO question on PQ ordered iterator
I have a similar problem and I'm thinking of using a TreeSet. To avoid excluding "equal" elements I will modify the comparator so instead of returning 0 it will return a random number between (-1,1) or it will return always 1.
If you have no control over the Comparator or if you are using it for something else different than inserting this solution won't work for you.

Is there an indexable sorted list in the Java.util package?

I'm looking for a data structure in the java.util package. I need it to meet the following requirements:
The number of elements is (theoretically) unbounded.
The elements are sorted in an ascending order.
You can get the nth element (fast).
You can remove the nth element (fast).
I expected to find an indexable skip list, but I didn't. Do they have any data structure which meets the requirements I'v stated?
There is no such container in the Java standard libraries.
When I need a data structure with these properties, I use a List implementation (generally an ArrayList, but it doesn't matter), and I do all the insertions using Collections.binarySearch.
If I had to encapsulate a sorted list as a reusable class, I'd implement the List interface, delegating all methods to a 'standard' List implementation (it can even be passed as a parameter to the constructor). I'd implement every insertion method (add, addAll, set, Iterator's remove) by throwing an exception (UnsupportedOperationException), so that nobody can break the 'always sorted' property. Finally, I'd provide a method insertSorted that would use Collections.binarySearch to do the insertion.
There exists no simple data structure that fulfills all your criteria.
The only one that I know which does fulfills them all would be an indexable skip list. Hoewever,I don't know of any readily available Java implementations.
This question is very similar to
Sorted array list in Java
Have a look at my answer to that question.
Basically it suggests the following:
class SortedArrayList<T> extends ArrayList<T> {
#SuppressWarnings("unchecked")
public void insertSorted(T value) {
add(value);
Comparable<T> cmp = (Comparable<T>) value;
for (int i = size()-1; i > 0 && cmp.compareTo(get(i-1)) < 0; i--) {
T tmp = get(i);
set(i, get(i-1));
set(i-1, tmp);
}
}
}
A note on your first requirement: "The number of elements is unbounded.":
You may want to restrict this to something like "The number of elements should not be bound by less than 231-1..." since otherwise you're ruling out all options which are backed by a Java array. (You could get away with an arbitrary number of elements using for instance a LinkedList, but I can't see how you could do fast lookups in that.)
TreeSet provides you the functionality of natural sorting while adding elements to the list.
But if you don't need this and Collections.sort() is permitted you can use simple ArrayList.
Consider List combined with Collections.sort().
Going with what dwb stated with List<T> and Collections.sort(), you can use ArrayList<T> as that implements List<T> (and is not synchronized like Vector<T> unless of course you want that overhead). That is probably your best bet because they (Sun) typically do lots of research into these areas (from what I've seen anyway). If you need to sort by something other than the "default" (i.e. you are not sorting a list of integers, etc), then supply your own comparator.
EDIT: The only thing that does not meet your requirements are the fast removals...
Look at PriorityQueue.
If you don't need similar elements in your data structure, then usual TreeSet also fits your requirements.

When do you know when to use a TreeSet or LinkedList?

What are the advantages of each structure?
In my program I will be performing these steps and I was wondering which data structure above I should be using:
Taking in an unsorted array and
adding them to a sorted structure1.
Traversing through sorted data and removing the right one
Adding data (never removing) and returning that structure as an array
When do you know when to use a TreeSet or LinkedList? What are the advantages of each structure?
In general, you decide on a collection type based on the structural and performance properties that you need it to have. For instance, a TreeSet is a Set, and therefore does not allow duplicates and does not preserve insertion order of elements. By contrast a LinkedList is a List and therefore does allow duplicates and does preserve insertion order. On the performance side, TreeSet gives you O(logN) insertion and deletion, whereas LinkedList gives O(1) insertion at the beginning or end, and O(N) insertion at a selected position or deletion.
The details are all spelled out in the respective class and interface javadocs, but a useful summary may be found in the Java Collections Cheatsheet.
In practice though, the choice of collection type is intimately connected to algorithm design. The two need to be done in parallel. (It is no good deciding that your algorithm requires a collection with properties X, Y and Z, and then discovering that no such collection type exists.)
In your use-case, it looks like TreeSet would be a better fit. There is no efficient way (i.e. better than O(N^2)) to sort a large LinkedList that doesn't involve turning it into some other data structure to do the sorting. There is no efficient way (i.e. better than O(N)) to insert an element into the correct position in a previously sorted LinkedList. The third part (copying to an array) works equally well with a LinkedList or TreeSet; it is an O(N) operation in both cases.
[I'm assuming that the collections are large enough that the big O complexity predicts the actual performance accurately ... ]
The genuine power and advantage of TreeSet lies in interface it realizes - NavigableSet
Why is it so powerfull and in which case?
Navigable Set interface add for example these 3 nice methods:
headSet(E toElement, boolean inclusive)
tailSet(E fromElement, boolean inclusive)
subSet(E fromElement, boolean fromInclusive, E toElement, boolean toInclusive)
These methods allow to organize effective search algorithm(very fast).
Example: we need to find all the names which start with Milla and end with Wladimir:
TreeSet<String> authors = new TreeSet<String>();
authors.add("Andreas Gryphius");
authors.add("Fjodor Michailowitsch Dostojewski");
authors.add("Alexander Puschkin");
authors.add("Ruslana Lyzhichko");
authors.add("Wladimir Klitschko");
authors.add("Andrij Schewtschenko");
authors.add("Wayne Gretzky");
authors.add("Johann Jakob Christoffel");
authors.add("Milla Jovovich");
authors.add("Taras Schewtschenko");
System.out.println(authors.subSet("Milla", "Wladimir"));
output:
[Milla Jovovich, Ruslana Lyzhichko, Taras Schewtschenko, Wayne Gretzky]
TreeSet doesn't go over all the elements, it finds first and last elemenets and returns a new Collection with all the elements in the range.
TreeSet:
TreeSet uses Red-Black tree underlying. So the set could be thought as a dynamic search tree. When you need a structure which is operated read/write frequently and also should keep order, the TreeSet is a good choice.
If you want to keep it sorted and it's append-mostly, TreeSet with a Comparator is your best bet. The JVM would have to traverse the LinkedList from the beginning to decide where to place an item. LinkedList = O(n) for any operations, TreeSet = O(log(n)) for basic stuff.
The most important point when choosing a data structure are its inherent limitations. For example if you use TreeSet to store objects and during run-time your algorithm changes attributes of these objects which affect equal comparisons while the object is an element of the set, get ready for some strange bugs.
The Java Doc for Set interface state that:
Note: Great care must be exercised if mutable objects are used as set elements. The behavior of a set is not specified if the value of an object is changed in a manner that affects equals comparisons while the object is an element in the set. A special case of this prohibition is that it is not permissible for a set to contain itself as an element.
Interface Set Java Doc

Categories

Resources