Sorted array list in Java

I'm baffled that I can't find a quick answer to this. I'm essentially looking for a data structure in Java which implements the java.util.List interface, but which stores its members in a sorted order. I know that you can use a normal ArrayList and use Collections.sort() on it, but I have a scenario where I am occasionally adding and often retrieving members from my list, and I don't want to have to sort it every time I retrieve a member in case a new one has been added. Can anyone point me towards such a thing in the JDK or even a third-party library?
EDIT: The data structure will need to preserve duplicates.
ANSWERS SUMMARY: I found all of this very interesting and learned a lot. Aioobe in particular deserves mention for his perseverance in trying to achieve my requirements above (mainly a sorted java.util.List implementation which supports duplicates). I have accepted his answer as the most accurate for what I asked and the most thought-provoking on the implications of what I was looking for, even if what I asked for wasn't exactly what I needed.
The problem with what I asked for lies in the List interface itself and the concept of optional methods in an interface. To quote the javadoc:
The user of this interface has precise control over where in the list each element is inserted.
Inserting into a sorted list doesn't give precise control over the insertion point. Then you have to think about how you will handle some of the methods. Take add, for example:
public boolean add(Object o)
Appends the specified element to the end of this list (optional operation).
You are now left in the uncomfortable situation of either
1) Breaking the contract and implementing a sorted version of add
2) Letting add add an element to the end of the list, breaking your sorted order
3) Leaving add out (as it's optional) by throwing an UnsupportedOperationException and implementing another method which adds items in a sorted order.
Option 3 is probably the best, but I find it unsavory having an add method you can't use and another sortedAdd method which isn't in the interface.
Other related solutions (in no particular order):
java.util.PriorityQueue, which is probably closer to what I needed than what I asked for. A queue isn't the most precise definition of a collection of objects in my case, but functionally it does everything I need it to.
net.sourceforge.nite.util.SortedList. However, this implementation breaks the contract of the List interface by implementing the sorting in the add(Object obj) method and bizarrely has a no-op method for add(int index, Object obj). General consensus suggests throw new UnsupportedOperationException() might be a better choice in this scenario.
Guava's TreeMultiset, a set implementation which supports duplicates.
ca.odell.glazedlists.SortedList This class comes with the caveat in its javadoc: Warning: This class breaks the contract required by List

Minimalistic Solution
Here is a quick and dirty solution.
class SortedArrayList<T> extends ArrayList<T> {

    @SuppressWarnings("unchecked")
    public void insertSorted(T value) {
        int i = Collections.binarySearch((List<Comparable<T>>) this, value);
        add(i < 0 ? -i - 1 : i, value);
    }
}
Note that despite the binarySearch, insertSorted will run in linear time since add(index, value) runs in linear time for an ArrayList.
Inserting something non-comparable results in a ClassCastException. (This is the approach taken by PriorityQueue as well: A priority queue relying on natural ordering also does not permit insertion of non-comparable objects (doing so may result in ClassCastException).)
A more complete implementation would, just like the PriorityQueue, also include a constructor that allows the user to pass in a Comparator.
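For illustration, here is a minimal sketch of that comparator-based variant (the class name and field below are my own, not part of the original answer):
class ComparatorSortedArrayList<T> extends ArrayList<T> {

    private final Comparator<? super T> comparator;

    ComparatorSortedArrayList(Comparator<? super T> comparator) {
        this.comparator = comparator;
    }

    public void insertSorted(T value) {
        // binarySearch returns (-(insertion point) - 1) when the value is not present
        int i = Collections.binarySearch(this, value, comparator);
        add(i < 0 ? -i - 1 : i, value);
    }
}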
Demo
SortedArrayList<String> test = new SortedArrayList<String>();
test.insertSorted("ddd"); System.out.println(test);
test.insertSorted("aaa"); System.out.println(test);
test.insertSorted("ccc"); System.out.println(test);
test.insertSorted("bbb"); System.out.println(test);
test.insertSorted("eee"); System.out.println(test);
....prints:
[ddd]
[aaa, ddd]
[aaa, ccc, ddd]
[aaa, bbb, ccc, ddd]
[aaa, bbb, ccc, ddd, eee]
Overriding List.add
Note that overriding List.add (or List.addAll for that matter) to insert elements in a sorted fashion would be a direct violation of the interface specification.
From the docs of List.add:
boolean add(E e)
    Appends the specified element to the end of this list (optional operation).
Maintaining the sortedness invariant
Unless this is some throw-away code, you probably want to guarantee that all elements remain sorted. This would include throwing UnsupportedOperationException for methods like add, addAll and set, as well as overriding listIterator to return a ListIterator whose set method throws.
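For illustration, a rough sketch of such guarding (my own, under the assumption that insertSorted bypasses the blocked methods via super; overriding listIterator is omitted for brevity):
class GuardedSortedArrayList<T> extends ArrayList<T> {

    @SuppressWarnings("unchecked")
    public void insertSorted(T value) {
        int i = Collections.binarySearch((List<Comparable<T>>) this, value);
        super.add(i < 0 ? -i - 1 : i, value);   // bypasses the guarded add(int, T) below
    }

    @Override
    public boolean add(T value) {
        throw new UnsupportedOperationException("use insertSorted instead");
    }

    @Override
    public void add(int index, T value) {
        throw new UnsupportedOperationException("use insertSorted instead");
    }

    @Override
    public boolean addAll(Collection<? extends T> values) {
        throw new UnsupportedOperationException("use insertSorted instead");
    }

    @Override
    public T set(int index, T value) {
        throw new UnsupportedOperationException("a sorted list chooses its own positions");
    }
}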

Use java.util.PriorityQueue.
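For example (a quick sketch; note that a PriorityQueue keeps duplicates and only guarantees sorted order when you poll, not when you iterate):
PriorityQueue<String> queue = new PriorityQueue<>();
queue.add("ddd");
queue.add("aaa");
queue.add("aaa");                       // duplicates are allowed
while (!queue.isEmpty()) {
    System.out.println(queue.poll());   // prints aaa, aaa, ddd
}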

You can try Guava's TreeMultiset.
Multiset<Integer> ms = TreeMultiset.create(Arrays.asList(1, 2, 3, 1, 1, -1, 2, 4, 5, 100));
System.out.println(ms);

Lists typically preserve the order in which items are added. Do you definitely need a list, or would a sorted set (e.g. TreeSet<E>) be okay for you? Basically, do you need to preserve duplicates?
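To illustrate the trade-off, a quick sketch: a TreeSet keeps its elements sorted but silently collapses duplicates.
TreeSet<String> set = new TreeSet<>();
set.add("bbb");
set.add("aaa");
set.add("aaa");                  // ignored: an equal element is already present
System.out.println(set);         // [aaa, bbb]
System.out.println(set.first()); // aaa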

Have a look at SortedList
This class implements a sorted list. It is constructed with a comparator that can compare two objects and sort objects accordingly. When you add an object to the list, it is inserted in the correct place. Objects that are equal according to the comparator will be in the list in the order that they were added to this list. Add only objects that the comparator can compare.
When the list already contains objects that are equal according to the comparator, the new object will be inserted immediately after these other objects.
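That "equal elements keep insertion order" behaviour can be reproduced over a plain ArrayList by searching for the upper bound rather than for any matching index; a rough sketch of the idea (my own, not the library's code):
// Insert value after any elements that compare equal to it, keeping the list sorted.
static <T> void insertStable(List<T> list, T value, Comparator<? super T> comparator) {
    int low = 0, high = list.size();
    while (low < high) {                                      // find the upper bound
        int mid = (low + high) >>> 1;
        if (comparator.compare(list.get(mid), value) <= 0) {
            low = mid + 1;                                    // equal elements stay to the left
        } else {
            high = mid;
        }
    }
    list.add(low, value);
}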

Aioobe's approach is the way to go. I would like to suggest the following improvement over his solution though.
class SortedList<T> extends ArrayList<T> {

    public void insertSorted(T value) {
        int insertPoint = insertPoint(value);
        add(insertPoint, value);
    }

    /**
     * @return The insert point for a new value. If the value is found, the insert point can be any
     * of the possible positions that keeps the collection sorted (.33 or 3.3 or 33.).
     */
    private int insertPoint(T key) {
        int low = 0;
        int high = size() - 1;
        while (low <= high) {
            int mid = (low + high) >>> 1;
            Comparable<? super T> midVal = (Comparable<T>) get(mid);
            int cmp = midVal.compareTo(key);
            if (cmp < 0)
                low = mid + 1;
            else if (cmp > 0)
                high = mid - 1;
            else {
                return mid; // key found
            }
        }
        return low; // key not found
    }
}
aioobe's solution gets very slow when using large lists. Using the fact that the list is sorted allows us to find the insert point for new values using binary search.
I would also use composition over inheritance, something along the lines of
SortedList<E> implements List<E>, RandomAccess, Cloneable, java.io.Serializable

It might be a bit too heavyweight for you, but GlazedLists has a SortedList that is perfect to use as the model of a table or JList

You could subclass ArrayList, and call Collections.sort(this) after any element is added - you would need to override two versions of add, and two of addAll, to do this.
Performance would not be as good as a smarter implementation which inserted elements in the right place, but it would do the job. If addition to the list is rare, the cost amortised over all operations on the list should be low.
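A rough sketch of that suggestion (my own illustration, not code from the answer), showing the two element-appending overloads:
class SortOnAddList<T extends Comparable<? super T>> extends ArrayList<T> {

    @Override
    public boolean add(T e) {
        boolean changed = super.add(e);
        Collections.sort(this);          // re-sort after every append
        return changed;
    }

    @Override
    public boolean addAll(Collection<? extends T> c) {
        boolean changed = super.addAll(c);
        Collections.sort(this);
        return changed;
    }

    // add(int, T) and addAll(int, Collection) would need the same treatment.
}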

Just make a new class like this:
public class SortedArrayList<T> extends ArrayList<T> {

    private final Comparator<? super T> comparator;

    public SortedArrayList() {
        super();
        this.comparator = null;
    }

    public SortedArrayList(Comparator<T> comparator) {
        super();
        this.comparator = comparator;
    }

    @Override
    public boolean add(T item) {
        int index = comparator == null
                ? Collections.binarySearch((List<? extends Comparable<? super T>>) this, item)
                : Collections.binarySearch(this, item, comparator);
        if (index < 0) {
            index = index * -1 - 2;
        }
        super.add(index + 1, item);
        return true;
    }

    @Override
    public void add(int index, T item) {
        throw new UnsupportedOperationException("'add' with an index is not supported in SortedArrayList");
    }

    @Override
    public boolean addAll(Collection<? extends T> items) {
        boolean allAdded = true;
        for (T item : items) {
            allAdded = allAdded && add(item);
        }
        return allAdded;
    }

    @Override
    public boolean addAll(int index, Collection<? extends T> items) {
        throw new UnsupportedOperationException("'addAll' with an index is not supported in SortedArrayList");
    }
}
You can test it like this:
List<Integer> list = new SortedArrayList<>((Integer i1, Integer i2) -> i1.compareTo(i2));
for (Integer i : Arrays.asList(4, 7, 3, 8, 9, 25, 20, 23, 52, 3)) {
    list.add(i);
}
System.out.println(list);

I think the choice between sorted sets/lists and 'normal' sortable collections depends on whether you need sorting only for presentation purposes or at almost every point during runtime. Using a sorted collection may be much more expensive because the sorting is done every time you insert an element.
If you can't opt for a collection in the JDK, you can take a look at the Apache Commons Collections

Since the currently proposed implementations which do implement a sorted list either break the Collection API or roll their own implementation of a tree or something similar, I was curious how an implementation based on TreeMap would perform (especially since TreeSet is itself based on TreeMap).
If someone is interested in that too, feel free to look into it:
TreeList
It's part of the core library; you can add it via a Maven dependency, of course. (Apache License)
Currently the implementation seems to compare quite well, on the same level as Guava's SortedMultiset and the TreeList of the Apache Commons library.
But I would be happy if more people than just me would test the implementation, to be sure I did not miss something important.
Best regards!

https://github.com/geniot/indexed-tree-map
I had the same problem. So I took the source code of java.util.TreeMap and wrote IndexedTreeMap. It implements my own IndexedNavigableMap:
public interface IndexedNavigableMap<K, V> extends NavigableMap<K, V> {
    K exactKey(int index);
    Entry<K, V> exactEntry(int index);
    int keyIndex(K k);
}
The implementation is based on updating node weights in the red-black tree when the tree is changed. Weight is the number of child nodes beneath a given node, plus one for the node itself. For example, when a tree is rotated to the left:
private void rotateLeft(Entry<K, V> p) {
    if (p != null) {
        Entry<K, V> r = p.right;

        int delta = getWeight(r.left) - getWeight(p.right);
        p.right = r.left;
        p.updateWeight(delta);

        if (r.left != null) {
            r.left.parent = p;
        }

        r.parent = p.parent;

        if (p.parent == null) {
            root = r;
        } else if (p.parent.left == p) {
            delta = getWeight(r) - getWeight(p.parent.left);
            p.parent.left = r;
            p.parent.updateWeight(delta);
        } else {
            delta = getWeight(r) - getWeight(p.parent.right);
            p.parent.right = r;
            p.parent.updateWeight(delta);
        }

        delta = getWeight(p) - getWeight(r.left);
        r.left = p;
        r.updateWeight(delta);

        p.parent = r;
    }
}
updateWeight simply updates weights up to the root:
void updateWeight(int delta) {
    weight += delta;
    Entry<K, V> p = parent;
    while (p != null) {
        p.weight += delta;
        p = p.parent;
    }
}
And when we need to find the element by index here is the implementation that uses weights:
public K exactKey(int index) {
    if (index < 0 || index > size() - 1) {
        throw new ArrayIndexOutOfBoundsException();
    }
    return getExactKey(root, index);
}

private K getExactKey(Entry<K, V> e, int index) {
    if (e.left == null && index == 0) {
        return e.key;
    }
    if (e.left == null && e.right == null) {
        return e.key;
    }
    if (e.left != null && e.left.weight > index) {
        return getExactKey(e.left, index);
    }
    if (e.left != null && e.left.weight == index) {
        return e.key;
    }
    return getExactKey(e.right, index - (e.left == null ? 0 : e.left.weight) - 1);
}
Finding the index of a key also comes in very handy:
public int keyIndex(K key) {
    if (key == null) {
        throw new NullPointerException();
    }
    Entry<K, V> e = getEntry(key);
    if (e == null) {
        throw new NullPointerException();
    }
    if (e == root) {
        return getWeight(e) - getWeight(e.right) - 1; // index to return
    }
    int index = 0;
    int cmp;
    index += getWeight(e.left);

    Entry<K, V> p = e.parent;
    // split comparator and comparable paths
    Comparator<? super K> cpr = comparator;
    if (cpr != null) {
        while (p != null) {
            cmp = cpr.compare(key, p.key);
            if (cmp > 0) {
                index += getWeight(p.left) + 1;
            }
            p = p.parent;
        }
    } else {
        Comparable<? super K> k = (Comparable<? super K>) key;
        while (p != null) {
            if (k.compareTo(p.key) > 0) {
                index += getWeight(p.left) + 1;
            }
            p = p.parent;
        }
    }
    return index;
}
You can find the result of this work at https://github.com/geniot/indexed-tree-map
TreeSet/TreeMap (as well as their indexed counterparts from the indexed-tree-map project) do not allow duplicate keys, but you can use one key for an array of values. If you need a SortedSet with duplicates, use a TreeMap with arrays (or lists) as values. That is what I would do.
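A minimal sketch of that pattern (the keys and values here are just placeholders):
TreeMap<String, List<String>> byKey = new TreeMap<>();
byKey.computeIfAbsent("aaa", k -> new ArrayList<>()).add("first aaa");
byKey.computeIfAbsent("aaa", k -> new ArrayList<>()).add("second aaa");   // duplicate key, both values kept
byKey.computeIfAbsent("bbb", k -> new ArrayList<>()).add("bbb");
// iterate all values in key order, duplicates included
byKey.values().forEach(bucket -> bucket.forEach(System.out::println));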

The following method can be used to print the elements of a LinkedList of String objects.
The method accepts a LinkedList object as input and prints each element of the list, separated by a space, to the console.
To make the output more readable, the method also adds a newline before and after the list.
public static void printList(LinkedList<String> list) {
    System.out.println("\nList is: ");
    for (String element : list) {
        System.out.print(element + " ");
    }
    System.out.println();
}

Related

Sorting a list of collections performance tips

I have a list of collections that contains pairs. I need to keep the list sorted alphabetically by its collections' pair keys. My current solution is keeping the list sorted by overriding the add method, like the code below.
Note: within each collection the pairs' keys are always the same, like:
(Car,1)(Car,1)
(Bear,1)
So I just need to get the first pair's key of each collection to sort the list.
List<Collection<Pair<String, Integer>>> shufflingResult;

public void init() {
    shufflingResult = new ArrayList<>() {
        public boolean add(Collection<Pair<String, Integer>> c) {
            super.add(c);
            Collections.sort(shufflingResult, new Comparator<Collection<Pair<String, Integer>>>() {
                @Override
                public int compare(Collection<Pair<String, Integer>> pairs, Collection<Pair<String, Integer>> t1) {
                    return pairs.iterator().next().getKey().compareTo(t1.iterator().next().toString());
                }
            });
            return true;
        }
    };
}
Is this the most performant way to do what I'm looking for?
Performance is a tricky thing. The best sort algorithm will depend largely on the volume and type of data, and on the degree to which it is random. Some algorithms are best for data which is partially sorted, others for truly random data.
Generally speaking, don't worry about optimization until you've determined that working code is not sufficiently performant. Get things working first, and then determine where the bottleneck is. It may not be sorting, but something else.
Java provides good general sorting algorithms. You're using one with Collections.sort(). There is no SortedList in the Java collections framework, but javafx.base contains a SortedList which wraps a supplied List and keeps it sorted based on a Comparator supplied at instantiation. This would prevent you from having to override the base behavior of the List implementation.
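A quick sketch of that javafx.base approach (assuming the JavaFX collections classes are available; the element type and comparator are just placeholders):
// javafx.collections.FXCollections, javafx.collections.ObservableList,
// javafx.collections.transformation.SortedList
ObservableList<String> source = FXCollections.observableArrayList();
SortedList<String> sorted = new SortedList<>(source, Comparator.naturalOrder());

source.addAll("ddd", "aaa", "ccc");   // mutate only the backing list
System.out.println(sorted);           // [aaa, ccc, ddd]; the view stays sorted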
While your code seems like it may work, here are a couple of suggestions:
pairs.iterator().next().getKey() will throw an NPE if pairs is null.
pairs.iterator().next().getKey() will throw a NoSuchElementException if pairs is empty.
pairs.iterator().next().getKey() will throw an NPE if the first Pair has a null key.
All of this is true for t1 as well.
You're comparing pairs.iterator().next().getKey() to t1.iterator().next().toString(). One is the String representation of the Pair, and the other is the Key from the Pair. Is this correct?
While your code may make sure these conditions never happen, someone may modify it later with resulting unpleasant surprises. You may want to add validations to your add method to ensure that these cases won't occur. Throwing IllegalArgumentException when arguments aren't valid is generally good practice.
Another thought: Since your Collection contents are always the same, and if no two Collections have the same kind of Pairs, you should be able to use a SortedMap<String, Collection<Pair<String,Integer>>> instead of a List. If you are comparing by Key this sort of Map will keep things sorted for you. You'll put the Collection using the first Pair's Key as the map/entry key. The map's keySet(), values() and entrySet() will all return Collections which iterate in sorted order.
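A small sketch of that idea, assuming a Pair type with getKey() as in the question (the data is made up):
SortedMap<String, Collection<Pair<String, Integer>>> byKey = new TreeMap<>();

Collection<Pair<String, Integer>> cars = List.of(new Pair<>("Car", 1), new Pair<>("Car", 1));
Collection<Pair<String, Integer>> bears = List.of(new Pair<>("Bear", 1));

byKey.put(cars.iterator().next().getKey(), cars);
byKey.put(bears.iterator().next().getKey(), bears);

// values() iterates in key order: Bear before Car
byKey.values().forEach(System.out::println);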
If the collection is already sorted and all you want to do is add, do a binary search and then just use list.add(index, element); sorting every time you want to insert is bad. You should sort once and then maintain the order with good insertion points.
Adding some code to show the binary search, since the one in Collections will only return matches. Just supply the list, the new object, and the comparator that sorts the list the way you want it. If you are adding many items, where N > current size of the list, it is probably better to add them all and then sort.
private static void add(List<ThingObject> l, ThingObject t, Comparator<ThingObject> c) {
    if (l != null) {
        if (l.size() == 0) {
            l.add(t);
        } else {
            int index = bSearch(l, t, c);
            l.add(index, t);
        }
    }
}

private static int bSearch(List<ThingObject> l, ThingObject t, Comparator<ThingObject> c) {
    boolean notFound = true;
    int high = l.size() - 1;
    int low = 0;
    int look = (low + high) / 2;
    while (notFound) {
        if (c.compare(l.get(look), t) > 0) {
            // it's to the left of look
            if (look == 0 || c.compare(l.get(look - 1), t) < 0) {
                // is it adjacent?
                notFound = false;
            } else {
                // look again.
                high = look - 1;
                look = (low + high) / 2;
            }
        } else if (c.compare(l.get(look), t) < 0) {
            // it's to the right of look
            if (look == l.size() - 1 || c.compare(l.get(look + 1), t) > 0) {
                // is it adjacent?
                look = look + 1;
                notFound = false;
            } else {
                // look again.
                low = look + 1;
                look = (low + high) / 2;
            }
        } else {
            notFound = false;
        }
    }
    return look;
}


How can I get Sorted List behavior in Java without using Collections.sort()?

I understand that Java does not possess a sorted List for various conceptual reasons, but consider the case where I need a collection which is kind of like a PriorityQueue but also allows random access (is indexable); in other words, I need a List that follows a particular ordering. I would prefer not to use Collections.sort().
Preferable operation constraints:
retrieve - O(1) (index-based random access)
search - O(log n)
insert - O(log n)
delete - O(log n)
An iterator over the collection should give me all elements in Sorted Order (based on predefined Comparator supplied during instantiation of the data-structure)
I would prefer to use Java's inbuilt library to accomplish this, but feel free to suggest external libraries as well.
EDIT:
TreeSet won't do, as index-based access is difficult; using wrapper collections is also not my best choice, as removal would imply removing from both collections.
EDIT2: I was unable to find an implementation and/or documentation for an indexable skip list, which seems a bit relevant; can anyone help me find it? Any comments for or against the proposed data structure are also welcome.
EDIT3: Though this may not be the most perfect answer, I want to add this piece of code that I wrote so that anyone who has a similar need for a sorted list can use it if they find it useful.
Do check for errors (if any), and suggest improvements (especially to the sortedSubList method)
import java.util.ArrayList;
import java.util.Collection;
import java.util.Comparator;

public class SortedList<E> extends ArrayList<E> {

    private final Comparator<? super E> comparator;

    public SortedList(Comparator<? super E> comparator) {
        this.comparator = comparator;
    }

    public SortedList(int initialCapacity, Comparator<? super E> comparator) {
        super(initialCapacity);
        this.comparator = comparator;
    }

    @Override
    public boolean add(E e) {
        if (comparator == null)
            return super.add(e);
        if (e == null)
            throw new NullPointerException();
        int start = 0;
        int end = size() - 1;
        while (start <= end) {
            int mid = (start + end) / 2;
            if (comparator.compare(get(mid), e) == 0) {
                super.add(mid, e);
                return true;
            }
            if (comparator.compare(get(mid), e) < 0) {
                end = mid - 1;
            } else {
                start = mid + 1;
            }
        }
        super.add(start, e);
        return true;
    }

    @Override
    public boolean contains(Object o) {
        if (comparator == null)
            return super.contains(o);
        if (o == null)
            return false;
        E other = (E) o;
        int start = 0;
        int end = size() - 1;
        while (start <= end) {
            int mid = (start + end) / 2;
            if (comparator.compare(get(mid), other) == 0) {
                return true;
            }
            if (comparator.compare(get(mid), other) < 0) {
                end = mid - 1;
            } else {
                start = mid + 1;
            }
        }
        return false;
    }

    @Override
    public int indexOf(Object o) {
        if (comparator == null)
            return super.indexOf(o);
        if (o == null)
            throw new NullPointerException();
        E other = (E) o;
        int start = 0;
        int end = size() - 1;
        while (start <= end) {
            int mid = (start + end) / 2;
            if (comparator.compare(get(mid), other) == 0) {
                return mid;
            }
            if (comparator.compare(get(mid), other) < 0) {
                end = mid - 1;
            } else {
                start = mid + 1;
            }
        }
        return -(start + 1);
    }

    @Override
    public void add(int index, E e) {
        throw new UnsupportedOperationException();
    }

    @Override
    public boolean addAll(int index, Collection<? extends E> c) {
        throw new UnsupportedOperationException();
    }

    @Override
    public E set(int index, E e) {
        throw new UnsupportedOperationException();
    }

    public SortedList<E> sortedSubList(int fromIndex, int toIndex) {
        SortedList<E> sl = new SortedList<>(comparator);
        for (int i = fromIndex; i < toIndex; i++)
            sl.add(get(i));
        return sl;
    }
}
It's hard to get O(1) indexing and O(log n) insertion/deletion in the same data structure. O(1) indexing means we can't afford the link-following involved in indexing a tree, list, skip list, or other link-based data structure, while O(log n) modification means we can't afford to shift half the elements of an array on every insertion. I don't know if it's possible to fulfill these requirements simultaneously.
If we relax one of these requirements, things become much easier. For example, O(log n) for all operations can be achieved by an indexable skip list or a self-balancing BST with nodes that keep track of the size of the subtree rooted at the node. Neither of these can be built on top of the skip list or BST in Java's standard library, though, so you'd probably need to install another library or write your own data structure.
O(1) indexing, O(log n) search, and O(n) insert and delete can be done by keeping a sorted ArrayList and using Collections.binarySearch to search for elements or insert/delete positions. You never need to call Collections.sort, but you still need to call the ArrayList's O(n) insert and delete methods. This is probably the easiest option to build on top of Java's built-in tools. Note that with recent Java versions, Collections.sort is an adaptive mergesort that would take O(n) time to sort an array where only the last element is out of sorted order, so you could probably get away with relying on Collections.sort. However, that's an implementation detail that alternative Java implementations don't have to follow.
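For reference, a small sketch of that sorted-ArrayList-plus-binarySearch approach (illustrative only):
List<String> sorted = new ArrayList<>();   // kept sorted at all times

// insert: O(log n) search + O(n) shift
String value = "ccc";
int pos = Collections.binarySearch(sorted, value);
sorted.add(pos < 0 ? -pos - 1 : pos, value);

// search: O(log n)
boolean present = Collections.binarySearch(sorted, "ccc") >= 0;

// retrieve: O(1)
String first = sorted.get(0);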
If your primary goal is O(1) for indexed lookup (get()), then you can implement your own class implementing List, backed by an array, using Arrays.binarySearch().
retrieve: get(int) - O(1) - array index
search: contains(Object) - O(log n) - binarySearch
indexOf(Object) - O(log n) - binarySearch
insert: add(E) - O(n) - binarySearch + array shift
delete: remove(int) - O(n) - array shift
remove(Object) - O(n) - binarySearch + array shift
The add(E) method is violating the List definition (append), but is consistent with the Collection definition.
The following methods should throw UnsupportedOperationException:
add(int index, E element)
addAll(int index, Collection<? extends E> c)
set(int index, E element)
If duplicate values are not allowed, which could be a logical restriction, consider also implementing NavigableSet, which is a SortedSet.
Build a custom collection that is backed by an ArrayList and a TreeSet. Delegate the random access to the ArrayList and the search to the TreeSet. Of course this means that every write operation will be very expensive, as it will have to sort the ArrayList every time. But the reads should be very efficient.
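A rough sketch of that delegation idea (my own illustration; it re-sorts the ArrayList on every write, as described above, and rejects duplicates the way a TreeSet does):
class ArrayBackedSortedCollection<E extends Comparable<? super E>> {

    private final ArrayList<E> byIndex = new ArrayList<>();   // fast random access
    private final TreeSet<E> bySearch = new TreeSet<>();      // fast search

    boolean add(E e) {
        if (!bySearch.add(e)) return false;   // TreeSet semantics: no duplicates
        byIndex.add(e);
        Collections.sort(byIndex);            // the expensive write mentioned above
        return true;
    }

    E get(int index)      { return byIndex.get(index); }    // O(1)
    boolean contains(E e) { return bySearch.contains(e); }  // O(log n)
    int size()            { return byIndex.size(); }
}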

Maintain a sorted list with performance

Summary of my question: I need a list that can quickly be iterated and sorted (either by sorting method or adding/removing object).
I'm coding a game in which there are a lot of "collision zones" that are checked every frame. For optimization, I have an idea of sorting them by their X position. The problem is that not all collision zones are static, because some of them can move around.
I have managed to handle all the changes, but maintaining the ArrayList (or ConcurrentLinkedQueue) in sorted order using Collections.sort() is too slow.
So I got a new idea: I may use a tree, and whenever a zone's X changes, instead of sorting all elements again, I can just remove it and re-add it to the tree. However, I think that add and remove operations on a tree are expensive too. Moreover, iterating through a tree is not as efficient as through a ConcurrentLinkedQueue, LinkedList or ArrayList.
Please tell me if there is any built-in data structure that satisfies my need. If there is no such data structure, I intend to extend the ArrayList class and override the add method to ensure the order (by using the overload add(index, item)). If you think this is the best way, please give me the best way to find the index. I already use binary search, but I think there is a bug:
@Override
public boolean add(T e) {
    // Find the position
    int left = 0;
    int right = this.size() - 1;
    int pos = right / 2;
    if (e.compareTo(this.get(0)) <= 0) {
        pos = 0;
    } else if (e.compareTo(this.get(this.size() - 1)) >= 0) {
        pos = this.size();
    } else {
        // Need: e[pos - 1] <= e[pos] <= e[pos + 1]
        boolean firstCondition = false;
        boolean secondCondition = false;
        do {
            firstCondition = this.get(pos - 1).compareTo(this.get(pos)) <= 0;
            secondCondition = this.get(pos).compareTo(this.get(pos + 1)) >= 0;
            if (!firstCondition) {
                right = pos - 1;
                pos = (left + right) / 2;
            } else if (!secondCondition) {
                left = pos + 1;
                pos = (left + right) / 2;
            }
        } while (!(firstCondition && secondCondition));
    }
    this.add(pos, e);
    return true;
}
I would use a TreeSet. If you need to allow duplicates you can use a custom comparator. While iterating a TreeSet is slightly slower than an array, adding and removing is much faster.
It appears you are doing an insertion sort, which is O(n). An insert into a TreeSet is O(log n).
IMHO the best way to store duplicates is by using a TreeMap<MyKey, List<MyType>>, like this:
Map<MyKey, List<MyType>> map = new TreeMap<>();

// to add
MyType type = ...
MyKey key = ...
List<MyType> myTypes = map.get(key);
if (myTypes == null)
    map.put(key, myTypes = new ArrayList<>());
myTypes.add(type);

// to remove
MyType type = ...
MyKey key = ...
List<MyType> myTypes = map.get(key);
if (myTypes != null) {
    myTypes.remove(type);
    if (myTypes.isEmpty())
        map.remove(key);
}
In this case, addition and removal are O(log N).
You can allow "duplicates" in a TreeSet by defining all objects as different, e.g.
Set<MyType> set = new TreeSet<>(new Comparator<MyType>() {
    public int compare(MyType o1, MyType o2) {
        int cmp = /* compare the normal way */;
        if (cmp == 0) {
            // or use System.identityHashCode()
            cmp = Integer.compare(o1.hashCode(), o2.hashCode());
            return cmp == 0 ? 1 : cmp; // returning 0 is a bad idea.
        }
        return cmp;
    }
});
As you can see this approach is ugly unless you have some way of making every object unique.
It sounds like you want a TreeSet.
If you intend to use a binary search / insert on an already sorted array or ArrayList, then you will get the same big-O complexity as a binary tree.
So I recommend that you actually test the provided tree implementations (i.e. TreeSet), not just guess. As they are not plain tree implementations, I would not be surprised if iteration is also fast on them.

Java Code Review: Merge sorted lists into a single sorted list [closed]

I want to merge sorted lists into a single list. How is this solution? I believe it runs in O(n) time. Any glaring flaws, inefficiencies, or stylistic issues?
I don't really like the idiom of setting a flag for "this is the first iteration" and using it to make sure "lowest" has a default value. Is there a better way around that?
public static <T extends Comparable<? super T>> List<T> merge(Set<List<T>> lists) {
    List<T> result = new ArrayList<T>();

    int totalSize = 0; // every element in the set
    for (List<T> l : lists) {
        totalSize += l.size();
    }

    boolean first; // awkward
    List<T> lowest = lists.iterator().next(); // the list with the lowest item to add

    while (result.size() < totalSize) { // while we still have something to add
        first = true;
        for (List<T> l : lists) {
            if (!l.isEmpty()) {
                if (first) {
                    lowest = l;
                    first = false;
                } else if (l.get(0).compareTo(lowest.get(0)) <= 0) {
                    lowest = l;
                }
            }
        }
        result.add(lowest.get(0));
        lowest.remove(0);
    }
    return result;
}
Note: this isn't homework, but it isn't for production code, either.
Efficiency will suck if lists contains an ArrayList, since lowest.remove(0) will take linear time in the length of the list, making your algorithm O(n^2).
I'd do:
List<T> result = new ArrayList<T>();
for (List<T> list : lists) {
    result.addAll(list);
}
Collections.sort(result);
which is in O(n log n), and leaves far less tedious code to test, debug and maintain.
Your solution is probably the fastest one. SortedLists have an insert cost of log(n), so you'll end up with M log (M) (where M is the total size of the lists).
Adding them to one list and sorting, while easier to read, is still M log(M).
Your solution is just M.
You can clean up your code a bit by sizing the result list, and by using a reference to the lowest list instead of a boolean.
public static <T extends Comparable<? super T>> List<T> merge(Set<List<T>> lists) {
int totalSize = 0; // every element in the set
for (List<T> l : lists) {
totalSize += l.size();
}
List<T> result = new ArrayList<T>(totalSize);
List<T> lowest;
while (result.size() < totalSize) { // while we still have something to add
lowest = null;
for (List<T> l : lists) {
if (! l.isEmpty()) {
if (lowest == null) {
lowest = l;
} else if (l.get(0).compareTo(lowest.get(0)) <= 0) {
lowest = l;
}
}
}
result.add(lowest.get(0));
lowest.remove(0);
}
return result;
}
If you're really particular, use a List object as input, and lowest can be initialized to be lists.get(0) and you can skip the null check.
To expand on Anton's comment:
Place the first remaining item from each List, along with an indicator of which list it came from, into a heap; then repeatedly take the top off the heap and put a new item on the heap from the list belonging to the item you just took off.
Java's PriorityQueue can provide the heap implementation.
This is a really old question, but I don't like any of the submitted answers, so this is what I ended up doing.
The solution of just adding them all into one list and sorting is bad because of the log-linear complexity (O(m n log(m n))). If that's not important to you, then it's definitely the simplest and most straightforward answer. Your initial solution isn't bad, but it's a little messy, and @Dathan pointed out that the complexity is O(m n) for m lists and n total elements. You can reduce this to O(n log(m)) by using a heap to reduce the number of comparisons for each element. I use a helper class that allows me to compare iterables. This way I don't destroy the initial lists, and it should operate with reasonable complexity no matter what type of lists are input. The only flaw I see with the implementation below is that it doesn't support lists with null elements; however, this could be fixed with sentinels if desired.
public static <E extends Comparable<? super E>> List<E> merge(Collection<? extends List<? extends E>> lists) {
    PriorityQueue<CompIterator<E>> queue = new PriorityQueue<CompIterator<E>>();
    for (List<? extends E> list : lists)
        if (!list.isEmpty())
            queue.add(new CompIterator<E>(list.iterator()));

    List<E> merged = new ArrayList<E>();
    while (!queue.isEmpty()) {
        CompIterator<E> next = queue.remove();
        merged.add(next.next());
        if (next.hasNext())
            queue.add(next);
    }
    return merged;
}

private static class CompIterator<E extends Comparable<? super E>> implements Iterator<E>, Comparable<CompIterator<E>> {

    E peekElem;
    Iterator<? extends E> it;

    public CompIterator(Iterator<? extends E> it) {
        this.it = it;
        if (it.hasNext()) peekElem = it.next();
        else peekElem = null;
    }

    @Override
    public boolean hasNext() {
        return peekElem != null;
    }

    @Override
    public E next() {
        E ret = peekElem;
        if (it.hasNext()) peekElem = it.next();
        else peekElem = null;
        return ret;
    }

    @Override
    public void remove() {
        throw new UnsupportedOperationException();
    }

    @Override
    public int compareTo(CompIterator<E> o) {
        if (peekElem == null) return 1;
        else return peekElem.compareTo(o.peekElem);
    }
}
Every element of the returned list involves two O(log(m)) heap operations, and there is also an initial iteration over all of the lists. Therefore the overall complexity is O(n log(m) + m) for n total elements and m lists, making this always faster than concatenating and sorting.
Since Balus and meriton have together given an excellent response to your question about the algorithm, I'll speak to your aside about the "first" idiom.
There are definitely other approaches (like setting lowest to a 'magic' value), but I happen to feel that "first" (to which I'd probably give a longer name, but that's being pedantic) is the best, because it's very clear. Presence of a boolean like "first" is a clear signal that your loop will do something special the first time through. It helps the reader.
Of course you don't need it if you take the Balus/meriton approach, but it's a situation which crops up.
