Sort List of Objects in Java [duplicate] - java

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 9 years ago.
I want to merge sorted lists into a single list. How is this solution? I believe it runs in O(n) time. Any glaring flaws, inefficiencies, or stylistic issues?
I don't really like the idiom of setting a flag for "this is the first iteration" and using it to make sure "lowest" has a default value. Is there a better way around that?
public static <T extends Comparable<? super T>> List<T> merge(Set<List<T>> lists) {
List<T> result = new ArrayList<T>();
int totalSize = 0; // every element in the set
for (List<T> l : lists) {
totalSize += l.size();
}
boolean first; //awkward
List<T> lowest = lists.iterator().next(); // the list with the lowest item to add
while (result.size() < totalSize) { // while we still have something to add
first = true;
for (List<T> l : lists) {
if (! l.isEmpty()) {
if (first) {
lowest = l;
first = false;
}
else if (l.get(0).compareTo(lowest.get(0)) <= 0) {
lowest = l;
}
}
}
result.add(lowest.get(0));
lowest.remove(0);
}
return result;
}
Note: this isn't homework, but it isn't for production code, either.

Efficiency will suck if lists contains an ArrayList, since lowest.remove(0) will take linear time in the length of the list, making your algorithm O(n^2).
I'd do:
List<T> result = new ArrayList<T>();
for (List<T> list : lists) {
result.addAll(list);
}
Collections.sort(result);
which is in O(n log n), and leaves far less tedious code to test, debug and maintain.

Your solution is probably the fastest one. SortedLists have an insert cost of log(n), so you'll end up with M log (M) (where M is the total size of the lists).
Adding them to one list and sorting, while easier to read, is still M log(M).
Your solution is just M.
You can clean up your code a bit by sizing the result list, and by using a reference to the lowest list instead of a boolean.
public static <T extends Comparable<? super T>> List<T> merge(Set<List<T>> lists) {
int totalSize = 0; // every element in the set
for (List<T> l : lists) {
totalSize += l.size();
}
List<T> result = new ArrayList<T>(totalSize);
List<T> lowest;
while (result.size() < totalSize) { // while we still have something to add
lowest = null;
for (List<T> l : lists) {
if (! l.isEmpty()) {
if (lowest == null) {
lowest = l;
} else if (l.get(0).compareTo(lowest.get(0)) <= 0) {
lowest = l;
}
}
}
result.add(lowest.get(0));
lowest.remove(0);
}
return result;
}
If you're really particular, use a List object as input, and lowest can be initialized to be lists.get(0) and you can skip the null check.

To expand on Anton's comment:
Place the latest element from each list, along with an indicator of which list it came from, into a heap. Then repeatedly take the top off the heap and put a new item on the heap from the list that the removed item belonged to.
Java's PriorityQueue can provide the heap implementation.
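The heap idea described above can be sketched roughly as follows. This is only a minimal illustration, not the answerer's code: the class name HeapMergeSketch and the {list index, position} encoding of heap entries are assumptions made for the example, and it presumes the input lists are random-access (such as ArrayList) and contain no nulls.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical helper class, named for this sketch only.
final class HeapMergeSketch {
    // Merges already-sorted lists into one sorted list without modifying the inputs.
    static <T extends Comparable<? super T>> List<T> merge(List<List<T>> lists) {
        // Each heap entry is {list index, position within that list};
        // entries are ordered by the element they currently point at.
        PriorityQueue<int[]> heap = new PriorityQueue<>(
                (a, b) -> lists.get(a[0]).get(a[1]).compareTo(lists.get(b[0]).get(b[1])));
        for (int i = 0; i < lists.size(); i++) {
            if (!lists.get(i).isEmpty()) {
                heap.add(new int[] {i, 0});
            }
        }
        List<T> merged = new ArrayList<>();
        while (!heap.isEmpty()) {
            int[] top = heap.remove();
            List<T> source = lists.get(top[0]);
            merged.add(source.get(top[1]));
            // Replace the removed entry with the next element of the same list, if any.
            if (top[1] + 1 < source.size()) {
                heap.add(new int[] {top[0], top[1] + 1});
            }
        }
        return merged;
    }
}
```

With m lists and n total elements, each element passes through the heap once, so the work is about O(n log(m)) comparisons.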

This is a really old question, but I don't like any of the submitted answers, so this is what I ended up doing.
The solution of just adding them all into one list and sorting is bad because of its log-linear complexity (O(n log(n)) for n total elements). If that's not important to you, then it's definitely the simplest and most straightforward answer. Your initial solution isn't bad, but it's a little messy, and @Dathan pointed out that the complexity is O(m n) for m lists and n total elements. You can reduce this to O(n log(m)) by using a heap to reduce the number of comparisons for each element. I use a helper class that allows me to compare iterators. This way I don't destroy the initial lists, and it should operate with reasonable complexity no matter what type of lists are input. The only flaw I see with the implementation below is that it doesn't support lists with null elements; however, this could be fixed with sentinels if desired.
public static <E extends Comparable<? super E>> List<E> merge(Collection<? extends List<? extends E>> lists) {
PriorityQueue<CompIterator<E>> queue = new PriorityQueue<CompIterator<E>>();
for (List<? extends E> list : lists)
if (!list.isEmpty())
queue.add(new CompIterator<E>(list.iterator()));
List<E> merged = new ArrayList<E>();
while (!queue.isEmpty()) {
CompIterator<E> next = queue.remove();
merged.add(next.next());
if (next.hasNext())
queue.add(next);
}
return merged;
}
private static class CompIterator<E extends Comparable<? super E>> implements Iterator<E>, Comparable<CompIterator<E>> {
E peekElem;
Iterator<? extends E> it;
public CompIterator(Iterator<? extends E> it) {
this.it = it;
if (it.hasNext()) peekElem = it.next();
else peekElem = null;
}
@Override
public boolean hasNext() {
return peekElem != null;
}
@Override
public E next() {
E ret = peekElem;
if (it.hasNext()) peekElem = it.next();
else peekElem = null;
return ret;
}
@Override
public void remove() {
throw new UnsupportedOperationException();
}
@Override
public int compareTo(CompIterator<E> o) {
if (peekElem == null) return 1;
else return peekElem.compareTo(o.peekElem);
}
}
Every element of the returned list involves two O(log(m)) heap operations, and there is also an initial iteration over all of the lists. Therefore the overall complexity is O(n log(m) + m) for n total elements and m lists, which is asymptotically better than concatenating and sorting (O(n log(n))) whenever m is much smaller than n.

Since Balus and meriton have together given an excellent response to your question about the algorithm, I'll speak to your aside about the "first" idiom.
There are definitely other approaches (like setting lowest to a 'magic' value), but I happen to feel that "first" (to which I'd probably give a longer name, but that's being pedantic) is the best, because it's very clear. Presence of a boolean like "first" is a clear signal that your loop will do something special the first time through. It helps the reader.
Of course you don't need it if you take the Balus/meriton approach, but it's a situation which crops up.

Related

How can I get Sorted List behavior in Java without using Collections.sort()?

I understand that Java does not possess a Sorted List for various conceptual reasons, but consider the case I need to have a collection which is kind of like a Priority Queue but also allows me random access (indexable), in other words, I need a List that follows a particular ordering. I would prefer not to use Collections.sort()
Preferable operation constraints:
retrieve - O(1) (index-based random access)
search - O(log n)
insert - O(log n)
delete - O(log n)
An iterator over the collection should give me all elements in Sorted Order (based on predefined Comparator supplied during instantiation of the data-structure)
I would prefer to use Java's inbuilt library to accomplish this, but feel free to suggest external libraries as well.
EDIT:
TreeSet won't do as index based access is difficult, using wrapper collections is also not my best choice as removal would imply I need to remove from both collections.
EDIT2: I was unable to find an implementation and/or documentation for an indexable skip list, which seems relevant here. Can anyone help me find one? Any comments for or against the proposed data structure are also welcome.
EDIT3: Though this may not be the most perfect answer, I want to add this piece of code that I wrote so that anyone who has similar problems for the need of a sorted list can use this if they find it useful.
Do check for errors (if any), and suggest improvements (especially to the sortedSubList method)
import java.util.ArrayList;
import java.util.Collection;
import java.util.Comparator;
public class SortedList<E> extends ArrayList<E> {
private final Comparator<? super E> comparator;
public SortedList(Comparator<? super E> comparator) {
this.comparator = comparator;
}
public SortedList(int initialCapacity, Comparator<? super E> comparator) {
super(initialCapacity);
this.comparator = comparator;
}
@Override
public boolean add(E e) {
if (comparator == null)
return super.add(e);
if (e == null)
throw new NullPointerException();
// Binary search for the insertion point, keeping ascending comparator order
int start = 0;
int end = size() - 1;
while (start <= end) {
int mid = (start + end) >>> 1;
int cmp = comparator.compare(get(mid), e);
if (cmp == 0) {
super.add(mid, e);
return true;
}
if (cmp < 0) {
start = mid + 1; // get(mid) sorts before e: continue in the right half
}
else {
end = mid - 1; // get(mid) sorts after e: continue in the left half
}
}
super.add(start, e);
return true;
}
@Override
@SuppressWarnings("unchecked")
public boolean contains(Object o) {
if (comparator == null)
return super.contains(o);
if (o == null)
return false;
E other = (E) o;
int start = 0;
int end = size() - 1;
while (start <= end) {
int mid = (start + end) >>> 1;
int cmp = comparator.compare(get(mid), other);
if (cmp == 0) {
return true;
}
if (cmp < 0) {
start = mid + 1;
}
else {
end = mid - 1;
}
}
return false;
}
@Override
@SuppressWarnings("unchecked")
public int indexOf(Object o) {
if (comparator == null)
return super.indexOf(o);
if (o == null)
throw new NullPointerException();
E other = (E) o;
int start = 0;
int end = size() - 1;
while (start <= end) {
int mid = (start + end) >>> 1;
int cmp = comparator.compare(get(mid), other);
if (cmp == 0) {
return mid;
}
if (cmp < 0) {
start = mid + 1;
}
else {
end = mid - 1;
}
}
// Not found: returns -(insertion point) - 1, like Collections.binarySearch
// (note that List.indexOf normally returns -1 here)
return -(start+1);
}
@Override
public void add(int index, E e) {
throw new UnsupportedOperationException();
}
@Override
public boolean addAll(int index, Collection<? extends E> c) {
throw new UnsupportedOperationException();
}
@Override
public E set(int index, E e) {
throw new UnsupportedOperationException();
}
public SortedList<E> sortedSubList(int fromIndex, int toIndex) {
SortedList<E> sl = new SortedList<>(comparator);
for (int i = fromIndex; i < toIndex; i++)
sl.add(get(i));
return sl;
}
}
It's hard to get O(1) indexing and O(log n) insertion/deletion in the same data structure. O(1) indexing means we can't afford the link-following involved in indexing a tree, list, skip list, or other link-based data structure, while O(log n) modification means we can't afford to shift half the elements of an array on every insertion. I don't know if it's possible to fulfill these requirements simultaneously.
If we relax one of these requirements, things become much easier. For example, O(log n) for all operations can be achieved by an indexable skip list or a self-balancing BST with nodes that keep track of the size of the subtree rooted at the node. Neither of these can be built on top of the skip list or BST in Java's standard library, though, so you'd probably need to install another library or write your own data structure.
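To make the "nodes that keep track of the size of the subtree" idea concrete, here is a deliberately simplified sketch. It is an assumption-laden illustration: the class name SizedBst is made up, it stores plain ints, and it does no rebalancing, so the O(log n) bounds hold only while the tree happens to stay balanced. A real implementation would add the size bookkeeping to a self-balancing tree such as a red-black tree.

```java
// Hypothetical, unbalanced binary search tree whose nodes record subtree sizes.
final class SizedBst {
    private static final class Node {
        final int key;
        Node left, right;
        int size = 1; // number of nodes in the subtree rooted at this node

        Node(int key) {
            this.key = key;
        }
    }

    private Node root;

    private static int size(Node n) {
        return n == null ? 0 : n.size;
    }

    int size() {
        return size(root);
    }

    void insert(int key) {
        root = insert(root, key);
    }

    private static Node insert(Node n, int key) {
        if (n == null) {
            return new Node(key);
        }
        if (key < n.key) {
            n.left = insert(n.left, key);
        } else {
            n.right = insert(n.right, key); // duplicates go to the right
        }
        n.size = 1 + size(n.left) + size(n.right);
        return n;
    }

    // Returns the element at the given position in sorted order.
    int get(int index) {
        if (index < 0 || index >= size(root)) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        Node n = root;
        while (true) {
            int leftSize = size(n.left);
            if (index < leftSize) {
                n = n.left;
            } else if (index == leftSize) {
                return n.key;
            } else {
                index -= leftSize + 1; // skip the left subtree and this node
                n = n.right;
            }
        }
    }
}
```

The index lookup only follows one root-to-leaf path, which is why it costs O(height) rather than O(n).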
O(1) indexing, O(log n) search, and O(n) insert and delete can be done by keeping a sorted ArrayList and using Collections.binarySearch to search for elements or insert/delete positions. You never need to call Collections.sort, but you still need to call the ArrayList's O(n) insert and delete methods. This is probably the easiest option to build on top of Java's built-in tools. Note that with recent Java versions, Collections.sort is an adaptive mergesort that would take O(n) time to sort an array where only the last element is out of sorted order, so you could probably get away with relying on Collections.sort. However, that's an implementation detail that alternative Java implementations don't have to follow.
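A minimal sketch of the sorted-ArrayList option, assuming natural ordering and no null elements. The class and method names (SortedListOps, insertSorted, removeSorted) are hypothetical; only Collections.binarySearch and the List methods are real library calls.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical helpers for keeping a random-access list sorted.
final class SortedListOps {
    // Inserts value at its sorted position: O(log n) search plus O(n) element shift.
    static <T extends Comparable<? super T>> void insertSorted(List<T> sorted, T value) {
        int pos = Collections.binarySearch(sorted, value);
        if (pos < 0) {
            pos = -pos - 1; // decode the insertion point for a missing key
        }
        sorted.add(pos, value);
    }

    // Removes one occurrence of value if present: O(log n) search plus O(n) element shift.
    static <T extends Comparable<? super T>> boolean removeSorted(List<T> sorted, T value) {
        int pos = Collections.binarySearch(sorted, value);
        if (pos < 0) {
            return false;
        }
        sorted.remove(pos);
        return true;
    }
}
```

When the list contains duplicates, binarySearch makes no promise about which of the equal elements it finds, so the relative order of equal elements is not controlled here.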
If your primary goal is O(1) for indexed lookup (get()), then you can implement your own class implementing List, backed by an array, using Arrays.binarySearch().
retrieve: get(int) - O(1) - array index
search: contains(Object) - O(log n) - binarySearch
indexOf(Object) - O(log n) - binarySearch
insert: add(E) - O(n) - binarySearch + array shift
delete: remove(int) - O(n) - array shift
remove(Object) - O(n) - binarySearch + array shift
The add(E) method is violating the List definition (append), but is consistent with the Collection definition.
The following methods should throw UnsupportedOperationException:
add(int index, E element)
addAll(int index, Collection<? extends E> c)
set(int index, E element)
If duplicate values are not allowed, which could be a logical restriction, consider also implementing NavigableSet, which is a SortedSet.
Build a custom collection that is backed by an ArrayList and a TreeSet. Delegate the random access to the ArrayList and the search to the TreeSet. Of course this means that every write operation will be very expensive, as it will have to sort the ArrayList every time. But the reads should be very efficient.

Fastest way to remove a Collection of Longs from another in Java

I have two collections of Long type. Both of size 20-30 million. What is the quickest way to remove from one, those that are common in the second? Lesser the heap space taken, the better, as there are other things going on in parallel.
I know LinkedList is better than ArrayList for removals using Iterator, but I'm just not sure if I need to iterate over each element. I want to poll for any better approaches, both Collections are sorted.
Edit: I previously stated my collection sizes as 2-3 million, I realized it is 20-30 million.
There will be lots of overlaps. The exact type of Collections is open to debate as well.
With counts in the range of millions, solutions with O(n^2) complexity should be out. You have two basic solutions here:
Sort the second collection, and use binary search for an O((N+M)*logM) solution, or
Put elements from the second collection into a hash container, for an O(N+M) solution
Above, N is the number of elements in the first collection, and M is the number of elements in the second collection.
Set<Long> toRemove = new HashSet<Long>(collection2);
Iterator<Long> iter = collection1.iterator();
while (iter.hasNext()) {
if (toRemove.contains(iter.next())) {
iter.remove();
}
}
Note that if collection1 is an ArrayList, this will be very slow. If you must keep it an ArrayList, you can do it like this:
int rd = 0, wr = 0;
// Copy the elements you are keeping into a contiguous range
while (rd != arrayList1.size()) {
Long last = arrayList1.get(rd++);
if (!toRemove.contains(last)) {
arrayList1.set(wr++, last);
}
}
// Remove "tail" elements, working backwards from the end
while (rd > wr) {
arrayList1.remove(--rd);
}
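Because the question states that both collections are already sorted, a single merge-style pass is another option that avoids building a HashSet at all. The following is only a sketch under stated assumptions: the class and method names are made up, both inputs are sorted ascending with no nulls, and the first list is random-access (for example an ArrayList).

```java
import java.util.Iterator;
import java.util.List;

// Hypothetical helper: in-place removal using two cursors over sorted inputs.
final class SortedSubtract {
    // Removes from a every element that also occurs in b. Runs in O(N + M)
    // and allocates nothing beyond an iterator.
    static void removeAllSorted(List<Long> a, Iterable<Long> b) {
        Iterator<Long> itB = b.iterator();
        Long curB = itB.hasNext() ? itB.next() : null;
        int wr = 0;
        for (int rd = 0; rd < a.size(); rd++) {
            Long x = a.get(rd);
            // Advance b until its current element is no longer smaller than x.
            while (curB != null && curB.compareTo(x) < 0) {
                curB = itB.hasNext() ? itB.next() : null;
            }
            if (curB == null || !curB.equals(x)) {
                a.set(wr++, x); // keep x by compacting it towards the front
            }
        }
        // Drop the now-unused tail in one operation.
        a.subList(wr, a.size()).clear();
    }
}
```

Like removeAll, this drops every occurrence in the first list of a value that appears in the second.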
Without growing heap.
Collection<Long> a = new HashSet<Long>();
//fill a
Collection<Long> b = new ArrayList<Long>();
//fill b
for(int i = 0; i < b.size(); i++){
a.remove(b.get(i));
}
b.size() and b.get(int i) run in constant time according to Oracle's Javadoc.
Also, a.remove(Object o) runs in constant time on average for a HashSet.
The first port of call would be the Collection.removeAll method. This uses no extra heap space, and its time complexity depends on the performance of the contains method on your second collection. If your second collection is a TreeSet, then a.removeAll(b) takes O(n log(m)) time (where n is the size of a and m is the size of b). If b is a HashSet, it takes O(n) time. If b is a sorted ArrayList, it's O(n m), but you can create a new wrapper Collection that uses a binary search to reduce it to O(n log(m)) for a negligible constant memory cost:
private static class SortedList<T extends Comparable<? super T>> extends com.google.common.collect.ForwardingList<T>
{
private final List<T> delegate;
public SortedList(ArrayList<T> delegate)
{
this.delegate = delegate;
}
@Override
protected List<T> delegate()
{
return delegate;
}
@Override
@SuppressWarnings("unchecked")
public boolean contains(Object object)
{
// Binary search is valid only because the delegate is assumed to be sorted
return Collections.binarySearch(delegate, (T) object) >= 0;
}
}
static <E extends Comparable<? super E>> void removeAll(Collection<E> a, ArrayList<E> b)
{
//assumes that b is sorted
a.removeAll(new SortedList<E>(b));
}
You should take a look at Apache Commons Collections.
I tested it with LinkedList containing ~3M Longs, it gives pretty good results :
Random r = new Random();
List<Long> list1 = new LinkedList<Long>();
for (int i = 0; i < 3000000; i++) {
list1.add(r.nextLong());
}
List<Long> list2 = new LinkedList<Long>();
for (int i = 0; i < 2000000; i++) {
list2.add(r.nextLong());
}
Collections.sort(list1);
Collections.sort(list2);
long time = System.currentTimeMillis();
List<Long> list3 = ListUtils.subtract(list2, list1);
System.out.println("ListUtils.subtract = " + (System.currentTimeMillis() - time));
I can't guarantee that this is the best solution, but it is an easy one.
I get an execution time of:
1247 ms
Drawback: it creates a new List.

Inserting into Sorted LinkedList Java

I have this code below where I am inserting a new integer into a sorted LinkedList of ints but I do not think it is the "correct" way of doing things as I know there are singly linkedlist with pointer to the next value and doubly linkedlist with pointers to the next and previous value. I tried to use Nodes to implement the below case but Java is importing this import org.w3c.dom.Node (document object model) so got stuck.
Insertion Cases
Insert into Empty Array
If value to be inserted less than everything, insert in the beginning.
If value to be inserted greater than everything, insert in the last.
Could be in between if value less than/greater than certain values in LL.
import java.util.*;
public class MainLinkedList {
public static void main(String[] args) {
LinkedList<Integer> llist = new LinkedList<Integer>();
llist.add(10);
llist.add(30);
llist.add(50);
llist.add(60);
llist.add(90);
llist.add(1000);
System.out.println("Old LinkedList " + llist);
//WHat if you want to insert 70 in a sorted LinkedList
LinkedList<Integer> newllist = insertSortedLL(llist, 70);
System.out.println("New LinkedList " + newllist);
}
public static LinkedList<Integer> insertSortedLL(LinkedList<Integer> llist, int value){
llist.add(value);
Collections.sort(llist);
return llist;
}
}
If we use listIterator the complexity for doing get will be O(1).
public class OrderedList<T extends Comparable<T>> extends LinkedList<T> {
private static final long serialVersionUID = 1L;
public boolean orderedAdd(T element) {
ListIterator<T> itr = listIterator();
while(true) {
if (itr.hasNext() == false) {
itr.add(element);
return(true);
}
T elementInList = itr.next();
if (elementInList.compareTo(element) > 0) {
itr.previous();
itr.add(element);
System.out.println("Adding");
return(true);
}
}
}
}
This might serve your purpose perfectly:
Use this code:
import java.util.*;
public class MainLinkedList {
private static LinkedList<Integer> llist;
public static void main(String[] args) {
llist = new LinkedList<Integer>();
addValue(60);
addValue(30);
addValue(10);
addValue(-5);
addValue(1000);
addValue(50);
addValue(60);
addValue(90);
addValue(1000);
addValue(0);
addValue(100);
addValue(-1000);
System.out.println("Linked List is: " + llist);
}
private static void addValue(int val) {
if (llist.size() == 0) {
llist.add(val);
} else if (llist.get(0) > val) {
llist.add(0, val);
} else if (llist.get(llist.size() - 1) < val) {
llist.add(llist.size(), val);
} else {
int i = 0;
while (llist.get(i) < val) {
i++;
}
llist.add(i, val);
}
}
}
This one method will manage insertion in the List in sorted manner without using Collections.sort(list)
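You can find the insertion point with only O(log N) comparisons; there is no need to compare against every value. Use binary search and add the value at the upper-bound position returned by the function below. Note that LinkedList.get(i) itself takes O(N) time, so the search is truly O(log N) only on a random-access list such as ArrayList.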
Check code... you may understand better.
public static int ubound(LinkedList<Integer> ln, int x) {
int l = 0;
int h = ln.size();
while (l < h) {
int mid = (l + h) / 2;
if (ln.get(mid) <= x) l = mid + 1;
else h = mid;
}
return l;
}
public void solve()
{
LinkedList<Integer> ln = new LinkedList<>();
ln.add(4);
ln.add(6);
ln.add(ubound(ln, 5), 5);
System.out.println(ln);
}
Output : [4, 5, 6]
you can learn about binary search more at : https://www.topcoder.com/community/data-science/data-science-tutorials/binary-search/
@Atrakeur
"sorting all the list each time you add a new element isn't efficient"
That's true, but if you need the list to always be in a sorted state, it is really the only option.
"The best way is to insert the element directly where it has to be (at his correct position). For this, you can loop all the positions to find where this number belong to"
This is exactly what the example code does.
"or use Collections.binarySearch to let this highly optimised search algorithm do this job for you"
Binary search is efficient, but only for random-access lists. So you could use an array list instead of a linked list, but then you have to deal with memory copies as the list grows. You're also going to consume more memory than you need if the capacity of the list is higher than the actual number of elements (which is pretty common).
So which data structure/approach to take is going to depend a lot on your storage and access requirements.
[edit]
Actually, there is one problem with the sample code: it results in multiple scans of the list when looping.
int i = 0;
while (llist.get(i) < val) {
i++;
}
llist.add(i, val);
The call to get(i) is going to traverse the list once to get to the ith position. Then the call to add(i, val) traverses it again. So this will be very slow.
A better approach would be to use a ListIterator to traverse the list and perform insertion. This interface defines an add() method that can be used to insert the element at the current position.
Have a look at com.google.common.collect.TreeMultiset.
This is effectively a sorted set that allows multiple instances of the same value.
It is a nice compromise for what you are trying to do. Insertion is cheaper than ArrayList, but you still get search benefits of binary/tree searches.
A linked list isn't the best implementation for a sorted list.
Also, sorting all the list each time you add a new element isn't efficient.
The best way is to insert the element directly where it has to be (at his correct position).
For this, you can loop all the positions to find where this number belong to, then insert it, or use Collections.binarySearch to let this highly optimised search algorithm do this job for you.
binarySearch returns the index of the object if it is found in the list (you can check for duplicates here if needed), or (-(insertion point) - 1) if the object isn't already in the list (the insertion point is the index where the object needs to be placed to maintain order).
You have to find where to insert the data by knowing the order criteria.
The simple method is to brute force search the insert position (go through the list, binary search...).
Another method, if you know the nature of your data, is to estimate an insertion position to cut down the number of checks. For example if you insert 'Zorro' and the list is alphabetically ordered you should start from the back of the list... or estimate where your letter may be (probably towards the end).
This can also work for numbers if you know where they come from and how they are distributed.
This is called interpolation search: http://en.wikipedia.org/wiki/Interpolation_search
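For reference, a textbook-style interpolation search over a sorted int array might look like the sketch below. It is not taken from the answer: the class name is hypothetical, and it searches for an existing key rather than an insertion point. It pays off only when the values are spread fairly evenly.

```java
// Hypothetical helper illustrating interpolation search on an ascending int array.
final class InterpolationSearch {
    // Returns an index of key in the sorted array, or -1 if key is absent.
    static int indexOf(int[] sorted, int key) {
        int lo = 0;
        int hi = sorted.length - 1;
        while (lo <= hi && key >= sorted[lo] && key <= sorted[hi]) {
            if (sorted[lo] == sorted[hi]) {
                return lo; // all remaining values are equal, and key lies in that range
            }
            // Estimate the position by assuming values grow linearly between lo and hi.
            long offset = ((long) key - sorted[lo]) * (hi - lo)
                    / ((long) sorted[hi] - sorted[lo]);
            int pos = lo + (int) offset;
            if (sorted[pos] == key) {
                return pos;
            }
            if (sorted[pos] < key) {
                lo = pos + 1;
            } else {
                hi = pos - 1;
            }
        }
        return -1;
    }
}
```

The arithmetic is done in long so that the differences and the product cannot overflow for int inputs.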
Also think about batch insert:
If you insert a lot of data quickly you may consider doing many insertions in one go and only sort once afterwards.
Solution of Amruth, simplified:
public class OrderedList<T extends Comparable<T>> extends LinkedList<T> {
private static final long serialVersionUID = 1L;
public boolean orderedAdd(T element) {
ListIterator<T> itr = listIterator();
while(itr.hasNext()) {
if (itr.next().compareTo(element) > 0) {
itr.previous();
break;
}
}
itr.add(element);
return true;
}
}
Obviously it's O(n)

Sorted array list in Java

I'm baffled that I can't find a quick answer to this. I'm essentially looking for a datastructure in Java which implements the java.util.List interface, but which stores its members in a sorted order. I know that you can use a normal ArrayList and use Collections.sort() on it, but I have a scenario where I am occasionally adding and often retrieving members from my list and I don't want to have to sort it every time I retrieve a member in case a new one has been added. Can anyone point me towards such a thing which exists in the JDK or even 3rd party libraries?
EDIT: The datastructure will need to preserve duplicates.
ANSWER's SUMMARY: I found all of this very interesting and learned a lot. Aioobe in particular deserves mention for his perseverance in trying to achieve my requirements above (mainly a sorted java.util.List implementation which supports duplicates). I have accepted his answer as the most accurate for what I asked and most thought provoking on the implications of what I was looking for even if what I asked wasn't exactly what I needed.
The problem with what I asked for lies in the List interface itself and the concept of optional methods in an interface. To quote the javadoc:
The user of this interface has precise control over where in the list each element is inserted.
Inserting into a sorted list doesn't have precise control over insertion point. Then, you have to think how you will handle some of the methods. Take add for example:
public boolean add(Object o)
Appends the specified element to the end of this list (optional operation).
You are now left in the uncomfortable situation of either
1) Breaking the contract and implementing a sorted version of add
2) Letting add add an element to the end of the list, breaking your sorted order
3) Leaving add out (as its optional) by throwing an UnsupportedOperationException and implementing another method which adds items in a sorted order.
Option 3 is probably the best, but I find it unsavory having an add method you can't use and another sortedAdd method which isn't in the interface.
Other related solutions (in no particular order):
java.util.PriorityQueue which is probably closest to what I needed than what I asked for. A queue isn't the most precise definition of a collection of objects in my case, but functionally it does everything I need it to.
net.sourceforge.nite.util.SortedList. However, this implementation breaks the contract of the List interface by implementing the sorting in the add(Object obj) method and bizarrely has a no effect method for add(int index, Object obj). General consensus suggests throw new UnsupportedOperationException() might be a better choice in this scenario.
Guava's TreeMultiset: a sorted multiset implementation, which supports duplicates
ca.odell.glazedlists.SortedList This class comes with the caveat in its javadoc: Warning: This class breaks the contract required by List
Minimalistic Solution
Here is a quick and dirty solution.
class SortedArrayList<T> extends ArrayList<T> {
@SuppressWarnings("unchecked")
public void insertSorted(T value) {
int i = Collections.binarySearch((List<Comparable<T>>) this, value);
add(i < 0 ? -i - 1 : i, value);
}
}
Note that despite the binarySearch, insertSorted will run in linear time since add(index, value) runs in linear time for an ArrayList.
Inserting something non-comparable results in a ClassCastException. (This is the approach taken by PriorityQueue as well: A priority queue relying on natural ordering also does not permit insertion of non-comparable objects (doing so may result in ClassCastException).)
A more complete implementation would, just like the PriorityQueue, also include a constructor that allows the user to pass in a Comparator.
Demo
SortedArrayList<String> test = new SortedArrayList<String>();
test.insertSorted("ddd"); System.out.println(test);
test.insertSorted("aaa"); System.out.println(test);
test.insertSorted("ccc"); System.out.println(test);
test.insertSorted("bbb"); System.out.println(test);
test.insertSorted("eee"); System.out.println(test);
....prints:
[ddd]
[aaa, ddd]
[aaa, ccc, ddd]
[aaa, bbb, ccc, ddd]
[aaa, bbb, ccc, ddd, eee]
Overriding List.add
Note that overriding List.add (or List.addAll for that matter) to insert elements in a sorted fashion would be a direct violation of the interface specification.
From the docs of List.add:
boolean add(E e)
    Appends the specified element to the end of this list (optional operation).
Maintaining the sortedness invariant
Unless this is some throw-away code, you probably want to guarantee that all elements remain sorted. This would include throwing UnsupportedOperationException for methods like add, addAll and set, as well as overriding listIterator to return a ListIterator whose set method throws.
Use java.util.PriorityQueue.
You can try Guava's TreeMultiset.
Multiset<Integer> ms=TreeMultiset.create(Arrays.asList(1,2,3,1,1,-1,2,4,5,100));
System.out.println(ms);
Lists typically preserve the order in which items are added. Do you definitely need a list, or would a sorted set (e.g. TreeSet<E>) be okay for you? Basically, do you need to preserve duplicates?
Have a look at SortedList
This class implements a sorted list. It is constructed with a comparator that can compare two objects and sort objects accordingly. When you add an object to the list, it is inserted in the correct place. Objects that are equal according to the comparator will be in the list in the order that they were added to this list. Add only objects that the comparator can compare.
When the list already contains objects that are equal according to the comparator, the new object will be inserted immediately after these other objects.
Aioobe's approach is the way to go. I would like to suggest the following improvement over his solution though.
class SortedList<T> extends ArrayList<T> {
public void insertSorted(T value) {
int insertPoint = insertPoint(value);
add(insertPoint, value);
}
/**
* @return The insert point for a new value. If the value is found the insert point can be any
* of the possible positions that keeps the collection sorted (.33 or 3.3 or 33.).
*/
private int insertPoint(T key) {
int low = 0;
int high = size() - 1;
while (low <= high) {
int mid = (low + high) >>> 1;
Comparable<? super T> midVal = (Comparable<T>) get(mid);
int cmp = midVal.compareTo(key);
if (cmp < 0)
low = mid + 1;
else if (cmp > 0)
high = mid - 1;
else {
return mid; // key found
}
}
return low; // key not found
}
}
aioobe's solution gets very slow when using large lists. Using the fact that the list is sorted allows us to find the insert point for new values using binary search.
I would also use composition over inheritance, something along the lines of
SortedList<E> implements List<E>, RandomAccess, Cloneable, java.io.Serializable
It might be a bit too heavyweight for you, but GlazedLists has a SortedList that is perfect to use as the model of a table or JList
You could subclass ArrayList, and call Collections.sort(this) after any element is added - you would need to override two versions of add, and two of addAll, to do this.
Performance would not be as good as a smarter implementation which inserted elements in the right place, but it would do the job. If addition to the list is rare, the cost amortised over all operations on the list should be low.
Just make a new class like this:
public class SortedArrayList<T> extends ArrayList<T> {
private final Comparator<? super T> comparator;
public SortedArrayList() {
super();
this.comparator = null;
}
public SortedArrayList(Comparator<T> comparator) {
super();
this.comparator = comparator;
}
@Override
public boolean add(T item) {
int index = comparator == null ? Collections.binarySearch((List<? extends Comparable<? super T>>)this, item) :
Collections.binarySearch(this, item, comparator);
if (index < 0) {
index = index * -1 - 2;
}
super.add(index+1, item);
return true;
}
@Override
public void add(int index, T item) {
throw new UnsupportedOperationException("'add' with an index is not supported in SortedArrayList");
}
@Override
public boolean addAll(Collection<? extends T> items) {
boolean allAdded = true;
for (T item : items) {
allAdded = allAdded && add(item);
}
return allAdded;
}
@Override
public boolean addAll(int index, Collection<? extends T> items) {
throw new UnsupportedOperationException("'addAll' with an index is not supported in SortedArrayList");
}
}
You can test it like this:
List<Integer> list = new SortedArrayList<>((Integer i1, Integer i2) -> i1.compareTo(i2));
for (Integer i : Arrays.asList(4, 7, 3, 8, 9, 25, 20, 23, 52, 3)) {
list.add(i);
}
System.out.println(list);
I think the choice between SortedSets/Lists and 'normal' sortable collections depends on whether you need sorting only for presentation purposes or at almost every point during runtime. Using a sorted collection may be much more expensive because the sorting is done every time you insert an element.
If you can't opt for a collection in the JDK, you can take a look at Apache Commons Collections.
Since the currently proposed implementations, which implement a sorted list by breaking the Collection API, each bring their own implementation of a tree or something similar, I was curious how an implementation based on TreeMap would perform (especially since TreeSet is based on TreeMap, too).
If anyone else is interested in that, feel free to look into it:
TreeList
It's part of the core library; you can of course add it as a Maven dependency. (Apache License)
Currently the implementation seems to perform on about the same level as the Guava SortedMultiSet and the TreeList of the Apache Commons library.
But I would be happy if others tested the implementation as well, to be sure I did not miss something important.
Best regards!
https://github.com/geniot/indexed-tree-map
I had the same problem. So I took the source code of java.util.TreeMap and wrote IndexedTreeMap. It implements my own IndexedNavigableMap:
public interface IndexedNavigableMap<K, V> extends NavigableMap<K, V> {
K exactKey(int index);
Entry<K, V> exactEntry(int index);
int keyIndex(K k);
}
The implementation is based on updating node weights in the red-black tree whenever it changes. The weight of a node is the number of nodes beneath it, plus one for the node itself. For example, when a tree is rotated to the left:
private void rotateLeft(Entry<K, V> p) {
if (p != null) {
Entry<K, V> r = p.right;
int delta = getWeight(r.left) - getWeight(p.right);
p.right = r.left;
p.updateWeight(delta);
if (r.left != null) {
r.left.parent = p;
}
r.parent = p.parent;
if (p.parent == null) {
root = r;
} else if (p.parent.left == p) {
delta = getWeight(r) - getWeight(p.parent.left);
p.parent.left = r;
p.parent.updateWeight(delta);
} else {
delta = getWeight(r) - getWeight(p.parent.right);
p.parent.right = r;
p.parent.updateWeight(delta);
}
delta = getWeight(p) - getWeight(r.left);
r.left = p;
r.updateWeight(delta);
p.parent = r;
}
}
updateWeight simply updates weights up to the root:
void updateWeight(int delta) {
weight += delta;
Entry<K, V> p = parent;
while (p != null) {
p.weight += delta;
p = p.parent;
}
}
And when we need to find the element by index here is the implementation that uses weights:
public K exactKey(int index) {
if (index < 0 || index > size() - 1) {
throw new ArrayIndexOutOfBoundsException();
}
return getExactKey(root, index);
}
private K getExactKey(Entry<K, V> e, int index) {
if (e.left == null && index == 0) {
return e.key;
}
if (e.left == null && e.right == null) {
return e.key;
}
if (e.left != null && e.left.weight > index) {
return getExactKey(e.left, index);
}
if (e.left != null && e.left.weight == index) {
return e.key;
}
return getExactKey(e.right, index - (e.left == null ? 0 : e.left.weight) - 1);
}
The weights also come in very handy for finding the index of a key:
public int keyIndex(K key) {
if (key == null) {
throw new NullPointerException();
}
Entry<K, V> e = getEntry(key);
if (e == null) {
throw new NullPointerException();
}
if (e == root) {
return getWeight(e) - getWeight(e.right) - 1;//index to return
}
int index = 0;
int cmp;
index += getWeight(e.left);
Entry<K, V> p = e.parent;
// split comparator and comparable paths
Comparator<? super K> cpr = comparator;
if (cpr != null) {
while (p != null) {
cmp = cpr.compare(key, p.key);
if (cmp > 0) {
index += getWeight(p.left) + 1;
}
p = p.parent;
}
} else {
Comparable<? super K> k = (Comparable<? super K>) key;
while (p != null) {
if (k.compareTo(p.key) > 0) {
index += getWeight(p.left) + 1;
}
p = p.parent;
}
}
return index;
}
You can find the result of this work at https://github.com/geniot/indexed-tree-map
TreeSet/TreeMap (as well as their indexed counterparts from the indexed-tree-map project) do not allow duplicate keys, but you can map one key to an array of values. If you need a SortedSet with duplicates, use a TreeMap with arrays as values. I would do that.
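One way this could look, as a sketch only: the helper class SortedMultiMap is hypothetical, and it uses a List per key instead of a raw array because a list grows more conveniently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical helper: keeps duplicates by mapping each key to a bucket of values.
class SortedMultiMap<K extends Comparable<? super K>, V> {
    private final TreeMap<K, List<V>> map = new TreeMap<>();

    void put(K key, V value) {
        // Create the bucket on first use, then append to it.
        map.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
    }

    // All values in ascending key order; values under one key keep insertion order.
    List<V> values() {
        List<V> out = new ArrayList<>();
        for (List<V> bucket : map.values()) {
            out.addAll(bucket);
        }
        return out;
    }
}
```

The TreeMap still holds each key once, so lookups stay logarithmic in the number of distinct keys.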
The following method can be used to print the elements of a LinkedList of String objects.
The method accepts a LinkedList object as input and prints each element of the list, separated by a space, to the console.
To make the output more readable, the method prints a blank line and a header before the elements, and a newline after them.
public static void printList(LinkedList<String> list) {
System.out.println("\nList is: ");
for (String element : list) {
System.out.print(element + " ");
}
System.out.println();
}

Java Code Review: Merge sorted lists into a single sorted list [closed]

I want to merge sorted lists into a single list. How is this solution? I believe it runs in O(n) time. Any glaring flaws, inefficiencies, or stylistic issues?
I don't really like the idiom of setting a flag for "this is the first iteration" and using it to make sure "lowest" has a default value. Is there a better way around that?
public static <T extends Comparable<? super T>> List<T> merge(Set<List<T>> lists) {
List<T> result = new ArrayList<T>();
int totalSize = 0; // every element in the set
for (List<T> l : lists) {
totalSize += l.size();
}
boolean first; //awkward
List<T> lowest = lists.iterator().next(); // the list with the lowest item to add
while (result.size() < totalSize) { // while we still have something to add
first = true;
for (List<T> l : lists) {
if (! l.isEmpty()) {
if (first) {
lowest = l;
first = false;
}
else if (l.get(0).compareTo(lowest.get(0)) <= 0) {
lowest = l;
}
}
}
result.add(lowest.get(0));
lowest.remove(0);
}
return result;
}
Note: this isn't homework, but it isn't for production code, either.
Efficiency will suck if lists contains an ArrayList, since lowest.remove(0) will take linear time in the length of the list, making your algorithm O(n^2).
I'd do:
List<T> result = new ArrayList<T>();
for (List<T> list : lists) {
result.addAll(list);
}
Collections.sort(result);
which is in O(n log n), and leaves far less tedious code to test, debug and maintain.
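If the hand-written merge is kept, the linear-time remove(0) can be avoided by tracking a read position per list instead of removing elements. The following is only a sketch under assumptions: the class name CursorMerge is invented, the input is a List of lists rather than a Set so that lists can be addressed by index, and the lists are assumed to support fast get(i).

```java
import java.util.ArrayList;
import java.util.List;

class CursorMerge {
    // Merges already-sorted lists without modifying them.
    // Still scans every list per output element: O(n * m) for n elements in m lists.
    static <T extends Comparable<? super T>> List<T> merge(List<List<T>> lists) {
        int total = 0;
        for (List<T> l : lists) {
            total += l.size();
        }
        int[] cursor = new int[lists.size()]; // next unread position in each list
        List<T> result = new ArrayList<>(total);
        while (result.size() < total) {
            int lowest = -1; // index of the list holding the smallest unread element
            for (int i = 0; i < lists.size(); i++) {
                List<T> l = lists.get(i);
                if (cursor[i] >= l.size()) {
                    continue; // this list is exhausted
                }
                if (lowest < 0
                        || l.get(cursor[i]).compareTo(lists.get(lowest).get(cursor[lowest])) < 0) {
                    lowest = i;
                }
            }
            result.add(lists.get(lowest).get(cursor[lowest]++));
        }
        return result;
    }
}
```

This also leaves the input lists intact, which the remove(0) version does not.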
Your solution is probably the fastest one. SortedLists have an insert cost of log(n), so you'll end up with M log (M) (where M is the total size of the lists).
Adding them to one list and sorting, while easier to read, is still M log(M).
Your solution is just M.
You can clean up your code a bit by sizing the result list, and by using a reference to the lowest list instead of a boolean.
public static <T extends Comparable<? super T>> List<T> merge(Set<List<T>> lists) {
int totalSize = 0; // every element in the set
for (List<T> l : lists) {
totalSize += l.size();
}
List<T> result = new ArrayList<T>(totalSize);
List<T> lowest;
while (result.size() < totalSize) { // while we still have something to add
lowest = null;
for (List<T> l : lists) {
if (! l.isEmpty()) {
if (lowest == null) {
lowest = l;
} else if (l.get(0).compareTo(lowest.get(0)) <= 0) {
lowest = l;
}
}
}
result.add(lowest.get(0));
lowest.remove(0);
}
return result;
}
If you're really particular, use a List object as input, and lowest can be initialized to be lists.get(0) and you can skip the null check.
To expand on Anton's comment:
Place the current head element of each List, along with an indicator of which list it came from, into a heap. Then repeatedly take the top off the heap and put a new item on the heap from the list that the removed item belonged to.
Java's PriorityQueue can provide the heap implementation.
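A compact sketch of that idea, with assumptions spelled out: HeapMerge and Head are hypothetical names, the "indicator" is simply the iterator of the list an element came from, and null elements are not supported.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

class HeapMerge {
    // Hypothetical heap entry: a list's current head value plus the iterator it came from.
    private static final class Head<T extends Comparable<? super T>> implements Comparable<Head<T>> {
        final T value;
        final Iterator<T> source;

        Head(T value, Iterator<T> source) {
            this.value = value;
            this.source = source;
        }

        @Override
        public int compareTo(Head<T> other) {
            return value.compareTo(other.value);
        }
    }

    static <T extends Comparable<? super T>> List<T> merge(Collection<List<T>> lists) {
        PriorityQueue<Head<T>> heap = new PriorityQueue<>();
        // Seed the heap with the first element of every non-empty list.
        for (List<T> list : lists) {
            Iterator<T> it = list.iterator();
            if (it.hasNext()) {
                heap.add(new Head<>(it.next(), it));
            }
        }
        List<T> result = new ArrayList<>();
        while (!heap.isEmpty()) {
            Head<T> top = heap.poll(); // smallest head among all lists
            result.add(top.value);
            if (top.source.hasNext()) {
                // Refill from the list that the removed element belonged to.
                heap.add(new Head<>(top.source.next(), top.source));
            }
        }
        return result;
    }
}
```

The heap never holds more than one entry per list, so each poll or add costs O(log m) for m lists.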
This is a really old question, but I don't like any of the submitted answers, so this is what I ended up doing.
The solution of just adding them all into one list and sorting is bad because of the log-linear complexity (O(n log(n)) for n total elements). If that's not important to you, then it's definitely the simplest and most straightforward answer. Your initial solution isn't bad, but it's a little messy, and @Dathan pointed out that the complexity is O(m n) for m lists and n total elements. You can reduce this to O(n log(m)) by using a heap to reduce the number of comparisons for each element. I use a helper class that allows me to compare iterables. This way I don't destroy the initial lists, and it should operate with reasonable complexity no matter what type of lists are input. The only flaw I see with the implementation below is that it doesn't support lists with null elements; however, this could be fixed with sentinels if desired.
public static <E extends Comparable<? super E>> List<E> merge(Collection<? extends List<? extends E>> lists) {
PriorityQueue<CompIterator<E>> queue = new PriorityQueue<CompIterator<E>>();
for (List<? extends E> list : lists)
if (!list.isEmpty())
queue.add(new CompIterator<E>(list.iterator()));
List<E> merged = new ArrayList<E>();
while (!queue.isEmpty()) {
CompIterator<E> next = queue.remove();
merged.add(next.next());
if (next.hasNext())
queue.add(next);
}
return merged;
}
private static class CompIterator<E extends Comparable<? super E>> implements Iterator<E>, Comparable<CompIterator<E>> {
E peekElem;
Iterator<? extends E> it;
public CompIterator(Iterator<? extends E> it) {
this.it = it;
if (it.hasNext()) peekElem = it.next();
else peekElem = null;
}
@Override
public boolean hasNext() {
return peekElem != null;
}
@Override
public E next() {
E ret = peekElem;
if (it.hasNext()) peekElem = it.next();
else peekElem = null;
return ret;
}
@Override
public void remove() {
throw new UnsupportedOperationException();
}
@Override
public int compareTo(CompIterator<E> o) {
if (peekElem == null) return 1;
else return peekElem.compareTo(o.peekElem);
}
}
Every element of the returned list involves two O(log(m)) heap operations, and there is also an initial iteration over all of the lists. Therefore the overall complexity is O(n log(m) + m) for n total elements and m lists, making this asymptotically at least as fast as concatenating and sorting.
Since Balus and meriton have together given an excellent response to your question about the algorithm, I'll speak to your aside about the "first" idiom.
There are definitely other approaches (like setting lowest to a 'magic' value), but I happen to feel that "first" (to which I'd probably give a longer name, but that's being pedantic) is the best, because it's very clear. Presence of a boolean like "first" is a clear signal that your loop will do something special the first time through. It helps the reader.
Of course you don't need it if you take the Balus/meriton approach, but it's a situation which crops up.
