List implementation that maintains ordering - java

Is there an existing List implementation in Java that maintains order based on provided Comparator?
Something that can be used in the following way:
Comparator<T> cmp = new MyComparator<T>();
List<T> l = new OrderedList<T>(cmp);
l.add(someT);
so that someT gets inserted such that the order in the list is maintained according to cmp
(On @andersoj's suggestion I am completing my question with one more request.)
Also I want to be able to traverse the list in sorted order without removing the elements, i.e:
T min = Const.SMALLEST_T;
for (T e : l) {
    // each element must be no smaller than the one before it
    assertTrue(cmp.compare(min, e) <= 0);
    min = e;
}
should pass.
All suggestions are welcome (except telling me to use Collections.sort on the unordered full list), though, I would prefer something in java.* or eventually org.apache.* since it would be hard to introduce new libraries at this moment.
Note (UPDATE4): I realized that implementations of this kind of list would have inadequate performance. There are two general approaches:
1. Use a linked structure (a B-tree or similar).
2. Use an array and insertion (with binary search).
No. 1 has a problem with CPU cache misses.
No. 2 has a problem with shifting elements in the array.
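The second approach (an array with binary-search insertion) can be sketched with the JDK alone. This is only an illustration: the class name SortedInsertSketch and the method insertSorted are made up for this example, and the byLength comparator stands in for MyComparator.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper, not a JDK class: keeps an ArrayList ordered by a Comparator.
final class SortedInsertSketch {

    // Inserts e at the position found by binary search; comparator-equal elements are kept.
    static <T> void insertSorted(List<T> list, T e, Comparator<? super T> cmp) {
        int idx = Collections.binarySearch(list, e, cmp);
        if (idx < 0) {
            idx = -(idx + 1); // not found: decode the insertion point
        }
        list.add(idx, e); // shifts the tail of the backing array
    }

    public static void main(String[] args) {
        // Stand-in for MyComparator: orders strings by length only.
        Comparator<String> byLength = new Comparator<String>() {
            @Override
            public int compare(String a, String b) {
                return Integer.compare(a.length(), b.length());
            }
        };
        List<String> l = new ArrayList<String>();
        insertSorted(l, "ccc", byLength);
        insertSorted(l, "a", byLength);
        insertSorted(l, "bb", byLength);
        insertSorted(l, "dd", byLength); // same length as "bb", still kept
        System.out.println(l); // four elements, ordered by length
    }
}
```

The binary search costs O(log n) comparisons, but list.add(idx, e) still moves every element after idx, which is the shifting cost mentioned above.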
UPDATE2:
TreeSet does not work because it uses the provided comparator (MyComparator) to check for equality: elements that the comparator reports as equal are treated as duplicates and excluded. I need the comparator only for ordering, not for "uniqueness" filtering (the elements are not equal by their natural ordering).
UPDATE3:
PriorityQueue does not work as a List (as I need) because there is no way to traverse it in the order in which it is "sorted"; to get the elements in sorted order you have to remove them from the collection.
UPDATE:
Similar question:
A good Sorted List for Java
Sorted array list in Java

You should probably be using a TreeSet:
The elements are ordered using their natural ordering, or by a Comparator provided at set creation time, depending on which constructor is used.
Example:
Comparator<T> cmp = new MyComparator<T>();
TreeSet<T> t = new TreeSet<T>(cmp);
t.add(someT);
Note that this is a set, so no duplicate entries are allowed. This may or may not work for your specific use-case.
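The duplicate caveat is easy to reproduce. In the sketch below, the class name and the byLength comparator are assumptions made for illustration; the point is only that TreeSet treats elements as duplicates when the comparator returns 0, regardless of equals().

```java
import java.util.Comparator;
import java.util.TreeSet;

final class TreeSetDedupSketch {

    static TreeSet<String> build() {
        // Hypothetical comparator: orders strings by length only.
        Comparator<String> byLength = new Comparator<String>() {
            @Override
            public int compare(String a, String b) {
                return Integer.compare(a.length(), b.length());
            }
        };
        TreeSet<String> set = new TreeSet<String>(byLength);
        set.add("aa");
        set.add("bb"); // compares as equal to "aa", so it is not added
        set.add("c");
        return set;
    }

    public static void main(String[] args) {
        System.out.println(build()); // prints [c, aa]
    }
}
```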

Response to new requirement. I see two potentials:
Do what the JavaDoc for PriorityQueue says:
This class and its iterator implement all of the optional methods of the Collection and Iterator interfaces. The Iterator provided in method iterator() is not guaranteed to traverse the elements of the priority queue in any particular order. If you need ordered traversal, consider using Arrays.sort(pq.toArray()).
I suspect this will yield the best performance given your requirements. If this is not acceptable, you'll need to better explain what you're trying to accomplish.
Build a List that simply sorts itself upon addition of new elements. This is a real pain... if you used a linked structure, you can do an efficient insertion sort, but locality is bad. If you used an array-backed structure, insertion sort is a pain but traversal is better. If iteration/traversal is infrequent, you could hold the list contents unsorted and sort only on demand.
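The first option (sorting a snapshot of the queue) could look roughly like the following. It uses a list copy plus Collections.sort instead of Arrays.sort(pq.toArray()), so that the queue's own comparator can be reused; the class and method names are invented for this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

final class PqSnapshotSketch {

    // Returns the queue's elements in priority order without removing any of them.
    static <T> List<T> sortedSnapshot(PriorityQueue<T> pq) {
        List<T> copy = new ArrayList<T>(pq);
        // pq.comparator() is null for natural ordering, which Collections.sort accepts
        Collections.sort(copy, pq.comparator());
        return copy;
    }

    public static void main(String[] args) {
        PriorityQueue<Integer> pq = new PriorityQueue<Integer>(Arrays.asList(5, 1, 3));
        System.out.println(sortedSnapshot(pq)); // prints [1, 3, 5]
        System.out.println(pq.size());          // prints 3
    }
}
```

Each snapshot costs O(n log n), so this fits best when ordered traversal is rare compared to insertion.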
Consider using a PriorityQueue as I suggested, and in the event you need to iterate in order, write a wrapper iterator:
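import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.PriorityQueue;

// Iterates in priority order over a private copy, so the source queue is not drained.
class PqIter<T> implements Iterator<T> {
    private final PriorityQueue<T> pq;

    public PqIter(PriorityQueue<T> source) {
        pq = new PriorityQueue<T>(source); // the copy keeps the source's comparator
    }

    @Override
    public boolean hasNext() {
        return !pq.isEmpty();
    }

    @Override
    public T next() {
        if (pq.isEmpty()) {
            throw new NoSuchElementException();
        }
        return pq.poll();
    }

    @Override
    public void remove() {
        throw new UnsupportedOperationException();
    }
}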
Use Guava's TreeMultiset. I tested the following code with Integer and it seems to do the right thing.
import com.google.common.collect.TreeMultiset;

public class TreeMultiSetTest {
    public static void main(String[] args) {
        TreeMultiset<Integer> ts = TreeMultiset.create();
        ts.add(1); ts.add(0); ts.add(2);
        ts.add(-1); ts.add(5); ts.add(2);
        for (Integer i : ts) {
            System.out.println(i);
        }
    }
}
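If Guava cannot be added to the project, a similar effect can be approximated with the JDK alone by grouping comparator-equal elements into buckets inside a TreeMap. This is a rough sketch rather than a replacement for TreeMultiset; SortedBagSketch and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeMap;

// Hypothetical class, not part of the JDK or Guava.
final class SortedBagSketch<T> {
    // One bucket per group of elements that the comparator reports as equal.
    private final TreeMap<T, List<T>> buckets;

    SortedBagSketch(Comparator<? super T> cmp) {
        buckets = new TreeMap<T, List<T>>(cmp);
    }

    void add(T e) {
        List<T> bucket = buckets.get(e);
        if (bucket == null) {
            bucket = new ArrayList<T>();
            buckets.put(e, bucket);
        }
        bucket.add(e); // comparator-equal elements are kept, in insertion order
    }

    // Flattens the buckets into one list in comparator order.
    List<T> toSortedList() {
        List<T> out = new ArrayList<T>();
        for (List<T> bucket : buckets.values()) {
            out.addAll(bucket);
        }
        return out;
    }

    public static void main(String[] args) {
        SortedBagSketch<String> bag = new SortedBagSketch<String>(String.CASE_INSENSITIVE_ORDER);
        bag.add("b");
        bag.add("A");
        bag.add("a"); // equal to "A" under the comparator, but still stored
        System.out.println(bag.toSortedList()); // prints [A, a, b]
    }
}
```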
The below addresses the uniqueness/filtering problem you were having when using a SortedSet. I see that you also want an iterator, so this won't work.
If what you really want is an ordered list-like thing, you can make use of a PriorityQueue.
Comparator<T> cmp = new MyComparator<T>();
PriorityQueue<T> pq = new PriorityQueue<T>(cmp);
pq.add(someT);
Take note of what the API documentation says about the time properties of various operations:
Implementation note: this implementation provides O(log(n)) time for the enqueing and dequeing methods (offer, poll, remove() and add); linear time for the remove(Object) and contains(Object) methods; and constant time for the retrieval methods (peek, element, and size).
You should also be aware that the iterators produced by PriorityQueue do not behave as one might expect:
The Iterator provided in method iterator() is not guaranteed to traverse the elements of the priority queue in any particular order. If you need ordered traversal, consider using Arrays.sort(pq.toArray()).
I just noticed that Guava provides a MinMaxPriorityQueue. Note that the JDK's PriorityQueue is itself backed by an array-based binary heap rather than a linked structure; MinMaxPriorityQueue is a double-ended variant and thus likely has different timing behavior. If you're doing something performance sensitive, you may wish to take a look. While its notes give slightly different (linear and logarithmic) big-O times, all those times should also be bounded, which may be useful.
There is no List implementation per se that maintains ordering, but what you are likely looking for are implementations of SortedSet. A TreeSet is the most common. The other implementation, ConcurrentSkipListSet, is for more specific uses. Note that a SortedSet provides ordering but, unlike a List, does not allow duplicate entries.
Refs:
Blog post PriorityQueue iterator is not ordered
SO question on PQ ordered iterator

I have a similar problem and I'm thinking of using a TreeSet. To avoid excluding "equal" elements, I will modify the comparator so that instead of returning 0 it returns either a random choice of -1 or 1, or always 1.
If you have no control over the Comparator, or if you are using it for something other than insertion, this solution won't work for you.
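A sketch of the "always return 1" variant is below, with the caveat that it deliberately breaks the Comparator contract (compare(a, b) and compare(b, a) should have opposite signs). Insertion and ordered iteration still behave as hoped, but lookups such as contains and remove can no longer find anything. The names NeverEqualComparatorSketch and neverEqual are invented for the example.

```java
import java.util.Comparator;
import java.util.TreeSet;

final class NeverEqualComparatorSketch {

    // Wraps a comparator so that ties are reported as "greater".
    // This deliberately violates the Comparator contract.
    static <T> Comparator<T> neverEqual(final Comparator<T> base) {
        return new Comparator<T>() {
            @Override
            public int compare(T a, T b) {
                int c = base.compare(a, b);
                return c != 0 ? c : 1;
            }
        };
    }

    public static void main(String[] args) {
        TreeSet<String> set = new TreeSet<String>(neverEqual(String.CASE_INSENSITIVE_ORDER));
        set.add("b");
        set.add("A");
        set.add("a"); // a tie with "A", but kept because compare never returns 0
        System.out.println(set);               // prints [A, a, b]
        System.out.println(set.contains("b")); // prints false: lookups never see a match
    }
}
```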

Related

Sorted Lists in Java [duplicate]

In Java there are the SortedSet and SortedMap interfaces. Both belong to the Java Collections framework and provide a sorted way to access the elements.
However, in my understanding there is no SortedList in Java. You can use java.util.Collections.sort() to sort a list.
Any idea why it is designed like that?
List iterators guarantee first and foremost that you get the list's elements in the internal order of the list (aka. insertion order). More specifically it is in the order you've inserted the elements or on how you've manipulated the list. Sorting can be seen as a manipulation of the data structure, and there are several ways to sort the list.
I'll order the ways in the order of usefulness as I personally see it:
1. Consider using Set or Bag collections instead
NOTE: I put this option at the top because this is what you normally want to do anyway.
A sorted set automatically sorts the collection at insertion, meaning that it does the sorting while you add elements into the collection. It also means you don't need to manually sort it.
Furthermore if you are sure that you don't need to worry about (or have) duplicate elements then you can use the TreeSet<T> instead. It implements SortedSet and NavigableSet interfaces and works as you'd probably expect from a list:
TreeSet<String> set = new TreeSet<String>();
set.add("lol");
set.add("cat");
// automatically sorts natural order when adding
for (String s : set) {
    System.out.println(s);
}
// Prints out "cat" and "lol"
If you don't want the natural ordering you can use the constructor parameter that takes a Comparator<T>.
Alternatively, you can use a Multiset (also known as a Bag), that is, a Set-like collection that allows duplicate elements; there are third-party implementations of these. Most notably, the Guava libraries provide a TreeMultiset, which works a lot like the TreeSet.
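For the Comparator constructor mentioned above, a minimal sketch using the JDK's Collections.reverseOrder() might look like this (the class name is made up for the example):

```java
import java.util.Collections;
import java.util.TreeSet;

final class TreeSetComparatorSketch {

    static TreeSet<String> reversed() {
        // The comparator passed to the constructor replaces natural ordering.
        TreeSet<String> set = new TreeSet<String>(Collections.<String>reverseOrder());
        set.add("cat");
        set.add("lol");
        set.add("cat"); // duplicate, ignored by the set
        return set;
    }

    public static void main(String[] args) {
        System.out.println(reversed()); // prints [lol, cat]
    }
}
```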
2. Sort your list with Collections.sort()
As mentioned above, sorting of Lists is a manipulation of the data structure. So for situations where you need "one source of truth" that will be sorted in a variety of ways then sorting it manually is the way to go.
You can sort your list with the java.util.Collections.sort() method. Here is a code sample on how:
List<String> strings = new ArrayList<String>();
strings.add("lol");
strings.add("cat");
Collections.sort(strings);
for (String s : strings) {
    System.out.println(s);
}
// Prints out "cat" and "lol"
Using comparators
One clear benefit is that you may pass a Comparator to the sort method. Java also provides some Comparator implementations, such as Collator, which is useful for locale-sensitive sorting of strings. Here is one example:
Collator usCollator = Collator.getInstance(Locale.US);
usCollator.setStrength(Collator.PRIMARY); // ignores casing
Collections.sort(strings, usCollator);
Sorting in concurrent environments
Do note though that using the sort method is not friendly in concurrent environments, since the collection instance will be manipulated, and you should consider using immutable collections instead. This is something Guava provides in the Ordering class and is a simple one-liner:
List<String> sorted = Ordering.natural().sortedCopy(strings);
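Without Guava, the same idea (sort a copy and leave the shared list untouched) can be sketched with the JDK alone. The helper below is hypothetical and only mirrors the intent of sortedCopy; it is not Guava's implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

final class SortedCopySketch {

    // Returns a new sorted list; the source collection is not modified.
    static <T extends Comparable<? super T>> List<T> sortedCopy(Collection<? extends T> source) {
        List<T> copy = new ArrayList<T>(source);
        Collections.sort(copy);
        return copy;
    }

    public static void main(String[] args) {
        List<String> strings = Arrays.asList("lol", "cat");
        System.out.println(sortedCopy(strings)); // prints [cat, lol]
        System.out.println(strings);             // prints [lol, cat]
    }
}
```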
3. Wrap your list with java.util.PriorityQueue
Though there is no sorted list in Java there is however a sorted queue which would probably work just as well for you. It is the java.util.PriorityQueue class.
Nico Haase linked in the comments to a related question that also answers this.
In a sorted collection you most likely don't want to manipulate the internal data structure which is why PriorityQueue doesn't implement the List interface (because that would give you direct access to its elements).
Caveat on the PriorityQueue iterator
The PriorityQueue class implements the Iterable<E> and Collection<E> interfaces so it can be iterated as usual. However, the iterator is not guaranteed to return elements in the sorted order. Instead (as Alderath points out in the comments) you need to poll() the queue until empty.
Note that you can convert a list to a priority queue via the constructor that takes any collection:
List<String> strings = new ArrayList<String>();
strings.add("lol");
strings.add("cat");
PriorityQueue<String> sortedStrings = new PriorityQueue<String>(strings);
while (!sortedStrings.isEmpty()) {
    System.out.println(sortedStrings.poll());
}
// Prints out "cat" and "lol"
4. Write your own SortedList class
NOTE: You shouldn't have to do this.
You can write your own List class that sorts each time you add a new element. This can get rather computation heavy depending on your implementation and is pointless, unless you want to do it as an exercise, because of two main reasons:
It breaks the contract that List<E> interface has because the add methods should ensure that the element will reside in the index that the user specifies.
Why reinvent the wheel? You should be using the TreeSet or Multisets instead as pointed out in the first point above.
However, if you want to do it as an exercise here is a code sample to get you started, it uses the AbstractList abstract class:
import java.util.AbstractList;
import java.util.ArrayList;
import java.util.Collections;

public class SortedList<E> extends AbstractList<E> {
    private final ArrayList<E> internalList = new ArrayList<E>();

    // Note that add(E e) in AbstractList is calling this one
    @Override
    public void add(int position, E e) {
        internalList.add(e);
        // The position is ignored; a null comparator means natural ordering
        Collections.sort(internalList, null);
    }

    @Override
    public E get(int i) {
        return internalList.get(i);
    }

    @Override
    public int size() {
        return internalList.size();
    }
}
Note that if you haven't overridden the methods you need, then the default implementations from AbstractList will throw UnsupportedOperationExceptions.
Because the concept of a List is incompatible with the concept of an automatically sorted collection. The point of a List is that after calling list.add(7, elem), a call to list.get(7) will return elem. With an auto-sorted list, the element could end up in an arbitrary position.
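The positional contract, and how sorting undoes it, can be seen in a few lines. This is only a small illustration with made-up names and values:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class ListContractSketch {

    // Returns the index of the inserted element before and after sorting.
    static int[] demo() {
        List<Integer> list = new ArrayList<Integer>(Arrays.asList(10, 20, 30));
        list.add(1, 99);               // the caller chooses index 1
        int before = list.indexOf(99); // 1: the element is where it was put
        Collections.sort(list);
        int after = list.indexOf(99);  // 3: sorting moved it to the end
        return new int[] {before, after};
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(demo())); // prints [1, 3]
    }
}
```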
Since all lists are already "sorted" by the order the items were added (FIFO ordering), you can "resort" them with another order, including the natural ordering of elements, using java.util.Collections.sort().
EDIT:
Lists as data structures are based on the idea that what matters is the order in which the items were inserted.
Sets do not have that information.
If you want to order by adding time, use List. If you want to order by other criteria, use SortedSet.
Set and Map are non-linear data structures; List is a linear data structure.
TreeSet and TreeMap implement the SortedSet and SortedMap interfaces, respectively, using a red-black tree. This ensures that there are no duplicated items (or keys, in the case of Map).
List already maintains an ordered, index-based collection; trees are not index-based data structures.
Tree by definition cannot contain duplicates.
In List we can have duplicates, so there is no TreeList(i.e. no SortedList).
List maintains elements in insertion order. So if we want to sort the list we have to use java.util.Collections.sort(). It sorts the specified list into ascending order, according to the natural ordering of its elements.
JavaFX SortedList
Though it took a while, Java 8 does have a sorted List.
http://docs.oracle.com/javase/8/javafx/api/javafx/collections/transformation/SortedList.html
As you can see in the javadocs, it is part of the JavaFX collections, intended to provide a sorted view on an ObservableList.
Update: Note that with Java 11, the JavaFX toolkit has moved outside the JDK and is now a separate library. JavaFX 11 is available as a downloadable SDK or from MavenCentral. See https://openjfx.io
For any newcomers, as of April 2015, Android now has a SortedList class in the support library, designed specifically to work with RecyclerView. Here's the blog post about it.
Another point is the time complexity of insert operations.
For a list insert, one expects a complexity of O(1).
But this could not be guaranteed with a sorted list.
And the most important point is that lists assume nothing about their elements.
For example, you can make lists of things that do not implement equals or compare.
Think of it like this: the List interface has methods like add(int index, E element), set(int index, E element). The contract is that once you added an element at position X you will find it there unless you add or remove elements before it.
If any list implementation would store elements in some order other than based on the index, the above list methods would make no sense.
In case you are looking for a way to sort elements, but also be able to access them by index in an efficient way, you can do the following:
Use a random access list for storage (e.g. ArrayList)
Make sure it is always sorted
Then to add or remove an element you can use Collections.binarySearch to get the insertion / removal index. Since your list implements random access, you can efficiently modify the list with the determined index.
Example:
import java.util.ArrayList;
import java.util.Collections;

/**
 * @deprecated
 * Only for demonstration purposes. Implementation is incomplete and does not
 * handle invalid arguments.
 */
@Deprecated
public class SortingList<E extends Comparable<E>> {
    private final ArrayList<E> delegate;

    public SortingList() {
        delegate = new ArrayList<>();
    }

    public void add(E e) {
        int insertionIndex = Collections.binarySearch(delegate, e);
        // < 0 if element is not in the list, see Collections.binarySearch
        if (insertionIndex < 0) {
            insertionIndex = -(insertionIndex + 1);
        } else {
            // Insertion index is index of existing element, to add new element
            // behind it increase index
            insertionIndex++;
        }
        delegate.add(insertionIndex, e);
    }

    public void remove(E e) {
        int index = Collections.binarySearch(delegate, e);
        // A negative result means the element is not in the list
        if (index >= 0) {
            delegate.remove(index);
        }
    }

    public E get(int index) {
        return delegate.get(index);
    }
}
(See a more complete implementation in this answer)
The first line of the List API says it is an ordered collection (also known as a sequence). If you sort the list you can't maintain that order, so there is no TreeList in Java.
As the API says, Java's List was inspired by the mathematical sequence; see the sequence properties at http://en.wikipedia.org/wiki/Sequence_(mathematics)
This doesn't mean that you can't sort a list, but Java sticks strictly to this definition and doesn't provide sorted versions of lists by default.
I think none of the answers above really answers this question, for the following reasons:
The same functionality can be achieved with other collections such as TreeSet, Collections, PriorityQueue, etc. (But these are alternatives that impose their own constraints, e.g. a Set removes duplicate elements. Even if they imposed no constraints, this would not explain why a SortedList was not created by the Java community.)
List elements do not implement compare/equals methods. (This also holds for Set and Map: in general, items do not implement the Comparable interface, but when we need them in sorted order and want to use TreeSet/TreeMap, the items must implement Comparable.)
List uses indexing, and sorting would break it. (This could easily be handled by introducing an intermediate interface or abstract class.)
None of these gives the exact reason. I believe this kind of question is best answered by the Java community itself, since it has only one specific answer, but let me try my best to answer it as follows.
As we know, sorting is an expensive operation, and there is a basic difference between List and Set/Map: a List can have duplicates, but a Set/Map cannot.
This is the core reason why we have a default implementation for Set/Map in the form of TreeSet/TreeMap. Internally this is a red-black tree in which every operation (insert/delete/search) has O(log N) complexity; because of duplicates, a List could not fit into this data structure.
Now the question arises: could we not also choose a default sorting method for List, such as the merge sort used by Collections.sort(list), with O(N log N) complexity? The community deliberately did not do this, since there are multiple choices of sorting algorithm for non-distinct elements, such as QuickSort, ShellSort, RadixSort, etc., and there may be more in the future. Also, the same sorting algorithm sometimes performs differently depending on the data to be sorted. Therefore they wanted to keep this option open and left the choice to us. This was not the case with Set/Map, since O(log N) per operation is the best achievable complexity there.
https://github.com/geniot/indexed-tree-map
Consider using indexed-tree-map. It is an enhanced version of the JDK's TreeSet that provides access to an element by index, and finds the index of an element, without iteration or hidden underlying lists that back up the tree. The algorithm is based on updating the weights of changed nodes every time there is a change.
We have the Collections.sort(arr) method, which sorts an ArrayList arr. To sort in descending order, use Collections.sort(arr, Collections.reverseOrder()).

Filter Java List in place without external libraries

This question is similar to What is the best way to filter a Java Collection? "filter a java.util.Collection based on a predicate." with the additional requirements that
The filter be done in place (O(1) memory excluding the input) because the list is large
No external libraries (i.e. Guava, Apache commons, etc.) may be used
Java 7 compatible (no Java 8 streams)
We can make the assumption that the java.util.Collection type is a java.util.List that implements .remove(int)
Possible solutions:
Use the .remove() method on an Iterator of the List. This could throw an UnsupportedOperationException since the .remove() method is optionally supported on Iterator
Write our own iterator that iterates through the list using an index, .size(), and .remove(int)
Are there any simpler solutions?
Is Iterator.remove() implemented for all standard Java Lists and/or Collections that implement .remove(int)?
There is no optimal solution that fits all Lists, and this is where you can never reach the efficiency of Java 8: its removeIf is an interface default method, so any List implementation can override it with an implementation tailored to that particular class.
When you want a reasonable implementation of a similar feature before Java 8, you have to focus on the common cases. There are almost no JRE-provided lists for which remove(int) works but Iterator.remove doesn't [1]. But consider that ArrayList is the most used mutable List implementation, and for it an iterator-based solution will perform poorly for large lists with many removed items. This is because every remove operation, whether remove(int) or Iterator.remove, shifts all subsequent items by one position before you can proceed and possibly remove another item. In the worst case, with a predicate matching all items, that imposes quadratic complexity. So it's important to provide a more sophisticated solution for that case:
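import java.util.BitSet;
import java.util.Iterator;
import java.util.List;
import java.util.RandomAccess;

interface Predicate<T> {
    boolean test(T object);
}

public static <T> boolean removeIf(List<T> list, Predicate<? super T> p) {
    if (list instanceof RandomAccess) {
        int num = list.size();
        // First pass: mark every index whose element matches the predicate.
        BitSet bs = new BitSet(num);
        for (int index = 0; index < num; index++) {
            if (p.test(list.get(index))) bs.set(index);
        }
        if (bs.isEmpty()) {
            return false;
        }
        // Second pass: move retained elements forward over the removed ones,
        // then cut off the leftover tail in a single operation.
        for (int dst = bs.nextSetBit(0), src = dst; ; dst++, src++) {
            src = bs.nextClearBit(src);
            if (src == num) {
                list.subList(dst, src).clear();
                break;
            }
            list.set(dst, list.get(src));
        }
        return true;
    }
    else {
        // Lists without fast random access: rely on Iterator.remove().
        boolean changed = false;
        for (Iterator<T> it = list.iterator(); it.hasNext(); ) {
            if (p.test(it.next())) {
                it.remove();
                changed = true;
            }
        }
        return changed;
    }
}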
In the case of lists implementing RandomAccess, which includes all ArrayList-style implementations, the solution mimics Java 8's ArrayList.removeIf implementation, though we don't have direct access to the internal array and I left out all the fail-fast concurrent-modification detection. For ArrayList-like lists it has linear complexity, and the same holds for LinkedList, which doesn't implement RandomAccess and is therefore processed using its Iterator.
The method also fulfills the contract of Java 8’s removeIf method of returning whether the list has been changed by the operation.
[1] CopyOnWriteArrayList is an exception, but for a copy-on-write list the idea of an in-place removeIf is moot unless the list provides it itself: implementing it via remove(int) (or any other public operation) effectively copies the entire list on each change. In that case, copying the entire list into an ordinary list, performing removeIf on that copy, and copying the result back will be more efficient in most cases.
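The copy-out/copy-back idea from the footnote could look roughly like the sketch below. The class name CowFilter is hypothetical, and the sketch filters while copying rather than calling a separate removeIf on the copy. It also assumes no other thread writes to the list during the call, since clear() followed by addAll() is not atomic:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

class CowFilter {
    interface Predicate<T> {
        boolean test(T object);
    }

    // Filters into an ordinary list, then writes the result back.
    // Costs two array copies in total instead of one copy per removed element.
    static <T> boolean removeIf(CopyOnWriteArrayList<T> cow, Predicate<? super T> p) {
        int seen = 0;
        List<T> kept = new ArrayList<T>();
        for (T e : cow) {            // the iterator works on a snapshot of the list
            seen++;
            if (!p.test(e)) kept.add(e);
        }
        if (kept.size() == seen) {
            return false;            // nothing matched, leave the list untouched
        }
        cow.clear();                 // assumption: no concurrent writers between these two calls
        cow.addAll(kept);
        return true;
    }
}
```

Note that this trades the O(1) extra memory requirement for speed, which is unavoidable for a copy-on-write list anyway.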
Filters and Predicates are Java 8 types, so if you don't want to use Java 8, you need something similar.
You could fake the filter with a wrapped Iterator and make it work with an object (similar to how Predicates could be implemented); however, there are secondary questions:
You state the list is quite large and that the memory impact of the solution should be O(1), but that is impossible to guarantee without knowing the list being operated upon. The remove(int) operation could, within its implementation, allocate a new backing store and copy into it.
Assuming the list does no such thing, the best you can do is implement your own iterator that takes a Predicate-like test, or write a specific loop to handle the list.
In any case, this sounds like an interview question. Here's one example:
public interface MyPredicate<T> {
    boolean isTrue(T value);
}

// <T> must be declared on the method for the generic signature to compile
public <T> void removeOnTrue(List<T> list, MyPredicate<T> predicate) {
    Iterator<T> iterator = list.iterator();
    while (iterator.hasNext()) {
        T next = iterator.next();
        if (predicate.isTrue(next)) {
            iterator.remove();
        }
    }
}
Doing it with a for loop across indexes is about the same, except that you then have to keep track of the index yourself (and remove by index).
To use the above example:
...
List<String> names = ...;
removeOnTrue(names, new MyPredicate<String>() {
    public boolean isTrue(String value) {
        return value.startsWith("A");
    }
});
...
This would leave names with all strings starting with "A" removed.
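The index-based variant mentioned above can be sketched as follows. The class name IndexRemoval is made up, and MyPredicate is repeated only to keep the example self-contained. The point to watch is that the index must not advance after a removal, because the following elements shift left:

```java
import java.util.List;

class IndexRemoval {
    interface MyPredicate<T> {
        boolean isTrue(T value);
    }

    static <T> void removeOnTrue(List<T> list, MyPredicate<T> predicate) {
        int i = 0;
        while (i < list.size()) {
            if (predicate.isTrue(list.get(i))) {
                list.remove(i);  // element i+1 now sits at i, so stay on this index
            } else {
                i++;
            }
        }
    }
}
```

On an ArrayList each remove(int) still shifts the tail, so this has the same quadratic worst case as the iterator version.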

In what order are the elements of a List accessed

I am looking at lists and ordering in Java. According to the documentation, a List is an ordered list of Objects, and when the Objects are of a standard class such as String or Integer it will use a standard comparator to do the comparison. A few questions come to mind.
If your list holds some arbitrary object type (not a standard class), do you have to implement the Comparator interface, or will it rely on toString()?
Do you have to use a ListIterator to traverse the list in the order you require?
Looking at the documentation, elements are added at the end of the list and thus will not be in order unless you sort it or use the ListIterator.
Items added to a List are stored in the order they are entered (so it's insertion order). If you want to sort a List you can use Collections.sort(List, Comparator) or the Collections.sort(List) method.
Ordered != Sorted
Ordered means that the elements in the structure have a determined order (given by insertion).
Sets are unordered, Lists are ordered, LinkedHashSet is ordered, and so on...
If you want the list to be in a specific order, determined by
the natural order (implements Comparable) or
a provided order (implements Comparator),
you have to sort it (see Collections.sort()).
First, I am afraid you are confusing the terms "ordered" and "sorted". "Ordered" does not mean that the elements are sorted according to some criterion. It just means that the element order is predictable. For lists, the order is dictated by the list index: for example, if you add 5 elements using the add() method and then iterate over the list, you get the elements in the same order.
Sorted means that the elements have been rearranged (sorted) using one of the available methods (e.g. Collections.sort()). In this case a comparator is relevant.
Concerning "standard" and non-standard classes: neither String nor Integer has any special privileges in terms of sorting. In addition to the Comparator interface, the JDK provides the Comparable interface, which both String and Integer implement. This is why lists of String and Integer can be sorted in natural order. Please take a look at the javadoc of these interfaces for more details. You can make your own class implement Comparable and enjoy the same features.
Neither Comparator nor Comparable relates to toString(), and although toString() can be used when implementing either, doing so is strongly discouraged.
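As a small illustration of the points above, the sketch below uses a made-up Person class (all names here are hypothetical). It shows insertion order, then natural order via Comparable, then an alternative order via a separate Comparator; toString() plays no role in any of them:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

class Person implements Comparable<Person> {
    final String name;
    final int age;

    Person(String name, int age) {
        this.name = name;
        this.age = age;
    }

    // Natural order: by name.
    @Override
    public int compareTo(Person other) {
        return name.compareTo(other.name);
    }
}

// An alternative order supplied from outside the class: by age.
class ByAge implements Comparator<Person> {
    @Override
    public int compare(Person a, Person b) {
        return a.age < b.age ? -1 : (a.age == b.age ? 0 : 1);
    }
}

class OrderedVsSorted {
    static List<String> names(List<Person> people) {
        List<String> out = new ArrayList<String>();
        for (Person p : people) out.add(p.name);
        return out;
    }

    public static void main(String[] args) {
        List<Person> people = new ArrayList<Person>();
        people.add(new Person("Carol", 25));
        people.add(new Person("Alice", 40));
        people.add(new Person("Bob", 31));
        System.out.println(names(people)); // insertion order: [Carol, Alice, Bob]

        Collections.sort(people);          // uses compareTo
        System.out.println(names(people)); // [Alice, Bob, Carol]

        Collections.sort(people, new ByAge());
        System.out.println(names(people)); // [Carol, Bob, Alice]
    }
}
```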
ListIterator provides more methods than Iterator; for example, you can traverse the list backwards. Choose the iterator type according to your needs. Since Java 5 was released (~10 years ago), explicit iterators are needed less often, because all collections implement Iterable and can be used directly in for-each loops. Iterators are mainly needed if you want to remove elements during iteration. BTW, starting from Java 8, which was released several months ago, List.forEach() will be used more and more, so iterators will become even less popular.
"a List is an ordered list of Objects": the "ordered" here means insertion order.
If you want to sort the list in a certain order, you have to provide a Comparator or make your elements implement Comparable. You can then call the Collections class's:
public static <T extends Comparable<? super T>> void sort(List<T> list)
or
public static <T> void sort(List<T> list, Comparator<? super T> c)
If you just want to go through your list, a normal Iterator is enough, while ListIterator provides more functionality, like:
backwards iteration
getting the index of the previous/next element
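Both ListIterator features listed above can be seen in a short sketch (the class and method names are invented for the example). It starts the iterator at the end of the list and walks backwards, recording each element together with its index:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.ListIterator;

class BackwardIteration {
    static List<String> reversedWithIndex(List<String> list) {
        List<String> out = new ArrayList<String>();
        // listIterator(size) positions the cursor after the last element.
        ListIterator<String> it = list.listIterator(list.size());
        while (it.hasPrevious()) {
            int index = it.previousIndex(); // index of the element previous() will return
            String value = it.previous();
            out.add(index + ":" + value);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(reversedWithIndex(Arrays.asList("a", "b", "c"))); // [2:c, 1:b, 0:a]
    }
}
```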

Is there an indexable sorted list in the Java.util package?

I'm looking for a data structure in the java.util package. I need it to meet the following requirements:
The number of elements is (theoretically) unbounded.
The elements are sorted in an ascending order.
You can get the nth element (fast).
You can remove the nth element (fast).
I expected to find an indexable skip list, but I didn't. Does java.util have any data structure which meets the requirements I've stated?
There is no such container in the Java standard libraries.
When I need a data structure with these properties, I use a List implementation (generally an ArrayList, but it doesn't matter), and I do all the insertions using Collections.binarySearch.
If I had to encapsulate a sorted list as a reusable class, I'd implement the List interface, delegating all methods to a 'standard' List implementation (it can even be passed as a parameter to the constructor). I'd implement every insertion method (add, addAll, set, Iterator's remove) by throwing an exception (UnsupportedOperationException), so that nobody can break the 'always sorted' property. Finally, I'd provide a method insertSorted that would use Collections.binarySearch to do the insertion.
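A rough sketch of such a wrapper is shown below. As a shortcut it extends AbstractList instead of implementing every List method by hand: AbstractList already throws UnsupportedOperationException for add and set, so only reads and remove(int) are delegated. The class name SortedList and the decision to leave removal enabled (removing an element cannot break the ordering) are assumptions of this sketch, not something the answer prescribes:

```java
import java.util.AbstractList;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

class SortedList<T> extends AbstractList<T> {
    private final List<T> delegate = new ArrayList<T>();
    private final Comparator<? super T> cmp;

    SortedList(Comparator<? super T> cmp) {
        this.cmp = cmp;
    }

    // The only way to add elements: binary search for the slot, then insert there.
    public void insertSorted(T value) {
        int pos = Collections.binarySearch(delegate, value, cmp);
        if (pos < 0) {
            pos = -pos - 1; // not found: binarySearch returns -(insertion point) - 1
        }
        delegate.add(pos, value);
    }

    @Override
    public T get(int index) {
        return delegate.get(index);
    }

    @Override
    public int size() {
        return delegate.size();
    }

    @Override
    public T remove(int index) {
        return delegate.remove(index);
    }

    public static void main(String[] args) {
        SortedList<String> s = new SortedList<String>(String.CASE_INSENSITIVE_ORDER);
        s.insertSorted("pear");
        s.insertSorted("Apple");
        s.insertSorted("fig");
        System.out.println(s); // [Apple, fig, pear]
    }
}
```

Finding the slot is O(log N), but with an ArrayList delegate the insertion itself still shifts elements, so each insert remains O(N) in the worst case.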
There exists no simple data structure that fulfills all your criteria.
The only one I know of that does fulfill them all would be an indexable skip list. However, I don't know of any readily available Java implementations.
This question is very similar to
Sorted array list in Java
Have a look at my answer to that question.
Basically it suggests the following:
import java.util.ArrayList;

class SortedArrayList<T> extends ArrayList<T> {
    @SuppressWarnings("unchecked")
    public void insertSorted(T value) {
        // Append, then bubble the new element down to its sorted position.
        add(value);
        Comparable<T> cmp = (Comparable<T>) value;
        for (int i = size()-1; i > 0 && cmp.compareTo(get(i-1)) < 0; i--) {
            T tmp = get(i);
            set(i, get(i-1));
            set(i-1, tmp);
        }
    }
}
A note on your first requirement: "The number of elements is unbounded.":
You may want to restrict this to something like "The number of elements should not be bounded by less than 2^31-1..." since otherwise you're ruling out all options which are backed by a Java array. (You could get away with an arbitrary number of elements using, for instance, a LinkedList, but I can't see how you could do fast lookups in that.)
TreeSet provides you the functionality of natural sorting while adding elements to the list.
But if you don't need this and Collections.sort() is permitted you can use simple ArrayList.
Consider List combined with Collections.sort().
Going with what dwb stated with List<T> and Collections.sort(), you can use ArrayList<T> as that implements List<T> (and is not synchronized like Vector<T> unless of course you want that overhead). That is probably your best bet because they (Sun) typically do lots of research into these areas (from what I've seen anyway). If you need to sort by something other than the "default" (i.e. you are not sorting a list of integers, etc), then supply your own comparator.
EDIT: The only requirement this does not meet is fast removal...
Look at PriorityQueue.
If you don't need duplicate (equal) elements in your data structure, then a regular TreeSet also fits your requirements.
