I am calculating a large number of possible resulting combinations of an algorithm. To sort these combinations I rate them with a double value and store them in a PriorityQueue. Currently, there are about 200k items in that queue, which is pretty memory intensive. Actually, I only need, let's say, the best 1000 or 100 of all items in the list.
So I just started to ask myself if there is a way to have a priority queue with a fixed size in Java. It should behave like this:
Is the item better than one of those already stored? If yes, insert it at the appropriate position and throw away the element with the lowest rating.
Does anyone have an idea? Thanks very much again!
Marco
que.add(d);
if (que.size() > YOUR_LIMIT)
que.poll();
or did I misunderstand your question?
edit: forgot to mention that for this to work you probably have to invert your compareTo function, since it will throw away the one with the highest priority each cycle (if a is "better" than b, compare(a, b) should return a positive number).
example to keep the biggest numbers use something like this:
public int compare(Double first, Double second) {
// keep the biggest values
return first > second ? 1 : -1;
}
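For completeness, here is a minimal self-contained sketch of the same idea (the method name, the limit, and the ratings source are illustrative, not from the original code). A natural-order PriorityQueue keeps its smallest element at the head, so polling after each insert evicts the worst rating once the limit is exceeded:
import java.util.PriorityQueue;

static PriorityQueue<Double> keepBest(Iterable<Double> ratings, int limit) {
    // natural ordering: the smallest (worst) rating sits at the head of the heap
    PriorityQueue<Double> que = new PriorityQueue<>();
    for (double d : ratings) {
        que.add(d);
        if (que.size() > limit) {
            que.poll(); // drop the element with the lowest rating
        }
    }
    return que; // holds the "limit" best ratings, in heap order (not sorted)
}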
MinMaxPriorityQueue, Google Guava
There is indeed a class for maintaining a queue that, when adding an item that would exceed the maximum size of the collection, compares the items to find an item to delete and thereby create room: MinMaxPriorityQueue found in Google Guava as of version 8.
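A hedged sketch of how it might be used (the comparator and the limit of 1000 are assumptions for illustration; check the builder API of your Guava version). With a reversed comparator the best ratings come first, so the element evicted once the maximum size is exceeded is the worst one:
import com.google.common.collect.MinMaxPriorityQueue;
import java.util.Comparator;

MinMaxPriorityQueue<Double> best = MinMaxPriorityQueue
        .orderedBy(Comparator.<Double>reverseOrder()) // best ratings first
        .maximumSize(1000)                            // evict the worst once full
        .create();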
EvictingQueue
By the way, if you merely want to delete the oldest element without doing any comparison of the objects' values, Google Guava 15 gained the EvictingQueue class.
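A short sketch (the capacity of 1000 is only illustrative); once the queue is full, adding a new element silently evicts the oldest one:
import com.google.common.collect.EvictingQueue;

EvictingQueue<Double> recent = EvictingQueue.create(1000);
recent.add(42.0); // when 1000 elements are already present, the head is removed first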
There is a fixed size priority queue in Apache Lucene: http://lucene.apache.org/java/2_4_1/api/org/apache/lucene/util/PriorityQueue.html
It has excellent performance based on my tests.
Use SortedSet:
SortedSet<Item> items = new TreeSet<Item>(new Comparator<Item>(...));
...
void addItem(Item newItem) {
    if (items.size() < 100) {
        items.add(newItem);
    } else {
        Item lowest = items.first();
        if (newItem.greaterThan(lowest)) {
            // replace the current minimum with the better item
            items.remove(lowest);
            items.add(newItem);
        }
    }
}
Just poll() the queue if its least element is less than (in your case, has worse rating than) the current element.
static <V extends Comparable<? super V>>
PriorityQueue<V> nbest(int n, Iterable<V> valueGenerator) {
PriorityQueue<V> values = new PriorityQueue<V>();
for (V value : valueGenerator) {
if (values.size() == n && value.compareTo(values.peek()) > 0)
values.poll(); // remove least element, current is better
if (values.size() < n) // we removed one or haven't filled up, so add
values.add(value);
}
return values;
}
This assumes that you have some sort of combination class that implements Comparable that compares combinations on their rating.
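For illustration, such a combination class might look roughly like this (a hypothetical sketch, not taken from the question):
class Combination implements Comparable<Combination> {
    final double rating;
    Combination(double rating) {
        this.rating = rating;
    }
    @Override
    public int compareTo(Combination other) {
        // higher rating means "better", matching the nbest logic above
        return Double.compare(this.rating, other.rating);
    }
}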
Edit: Just to clarify, the Iterable in my example doesn't need to be pre-populated. For example, here's an Iterable<Integer> that will give you all natural numbers an int can represent:
Iterable<Integer> naturals = new Iterable<Integer>() {
public Iterator<Integer> iterator() {
return new Iterator<Integer>() {
int current = 0;
@Override
public boolean hasNext() {
return current >= 0;
}
@Override
public Integer next() {
return current++;
}
@Override
public void remove() {
throw new UnsupportedOperationException();
}
};
}
};
Memory consumption is very modest, as you can see - for over 2 billion values, you need two objects (the Iterable and the Iterator) plus one int.
You can of course rather easily adapt my code so it doesn't use an Iterable - I just used it because it's an elegant way to represent a sequence (also, I've been doing too much Python and C# ☺).
A better approach would be to more tightly moderate what goes into the queue, removing from and appending to it as the program runs. It sounds like there would be some room to exclude some of the items before you add them to the queue. It would be simpler than reinventing the wheel, so to speak.
It seems natural to just keep the top 1000 each time you add an item, but the PriorityQueue doesn't offer anything to achieve that gracefully. Maybe you can, instead of using a PriorityQueue, do something like this in a method:
List<Double> list = new ArrayList<Double>();
...
list.add(newOutput);
Collections.sort(list, Collections.reverseOrder()); // highest ratings first
if (list.size() > 1000) {
    list = list.subList(0, 1000);
}
Related
I have the following method which adds an element to a size-limited ArrayList. If the size of the ArrayList exceeds the limit, previous elements are removed (like FIFO = "first in, first out") (version 1):
// adds the "item" into "list" and satisfies the "limit" of the list
public static <T> void add(List<T> list, final T item, int limit) {
var size = list.size() + 1;
if (size > limit) {
var exeeded = size - limit;
for (var i = 0; i < exeeded; i++) {
list.remove(0);
}
}
list.add(item);
}
The "version 1"-method works. However, I wanted to improve this method by using subList (version 2):
public static <T> void add(List<T> list, final T item, int limit) {
var size = list.size() + 1;
if (size > limit) {
var exeeded = size - limit;
list.subList(0, exeeded).clear();
}
list.add(item);
}
Both methods work. However, I want to know whether "version 2" is also more performant than "version 1".
EDIT:
improved "Version 3":
public static <T> void add(List<T> list, final T item, int limit) {
var size = list.size() + 1;
if (size > limit) {
var exeeded = size - limit;
if (exeeded > 1) {
list.subList(0, exeeded).clear();
} else {
list.remove(0);
}
}
list.add(item);
}
It seems you have the ArrayList implementation in mind where remove(0) imposes the cost of copying all remaining elements in the backing array, repeatedly if you invoke remove(0) repeatedly.
In this case, using subList(0, number).clear() is a significant improvement, as you’re paying the cost of copying elements only once instead of number times.
Since the copying costs of remove(0) and subList(0, number).clear() are identical when number is one, the 3rd variant would save the cost of creating a temporary object for the sub list in that case. This, however, is a tiny impact that doesn't depend on the size of the list (or any other aspect of the input) and usually isn't worth the more complex code. See also this answer for a discussion of the costs of a single temporary object. It's even possible that the costs of the sub list construction get removed by the JVM's runtime optimizer. Hence, such a conditional should only be used when you experience an actual performance problem, the profiler traces the problem back to this point, and benchmarks prove that the more complicated code has a positive effect.
But this is all moot when you use an ArrayDeque instead. This class has no copying costs when removing its head element, hence you can simply remove excess elements in a loop.
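A minimal sketch of that ArrayDeque variant, assuming the same FIFO semantics as the original method (the signature mirrors the question's helper but is only illustrative):
import java.util.ArrayDeque;
import java.util.Deque;

public static <T> void add(Deque<T> deque, final T item, int limit) {
    // removing the head of an ArrayDeque is O(1) and copies nothing
    while (deque.size() >= limit) {
        deque.pollFirst();
    }
    deque.addLast(item);
}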
Question 1: The problem is this line:
list = list.subList(exeeded, list.size());
You're reassigning the variable list, which will not change the object passed as an argument but only its local counterpart.
Question 2: The sublist will (on an array list) still need to recreate the array at some point. If you don't want that you could use a LinkedList. But as a general rule the ArrayList will still perform better on the whole. Since the underlying array only has to be recreated when exceeding the maximum capacity it usually doesn't matter a lot.
You could also try to actually shift the array, i.e. move every element to the next slot in the array. That way you would have to move all elements when a new one is added, but you don't need to recreate the array. So you avoid the trip to the heap, which is usually the biggest impact on performance.
I would like to discuss a bit of performance of a particular collection, LinkedHashMap, for a particular requirement and how Java 8 or 9 new features could help on that.
Let's suppose I have the following LinkedHashMap:
private Map<Product, Item> items = new LinkedHashMap<>();
Using the default constructor means this Map follows the insertion-order when it is iterated.
--EDITED--
Just to be clear here, I understand that Maps are not the right data structure to be accessed by index. It just happens that this class actually needs two remove methods: one by Product (the key, which is the right way), and another by position, or index, which is not common, so that's my concern about performance. BTW, it's not MY requirement.
I have to implement a removeItem() method by index. For those who don't know, a LinkedHashMap doesn't have a map.get(index) method available.
So I will list a couple of solutions:
Solution 1:
public boolean removeItem(int position) {
List<Product> orderedList = new ArrayList<>(items.keySet());
Product key = orderedList.get(position);
return items.remove(key) != null;
}
Solution 2:
public boolean removeItem(int position) {
int counter = 0;
Product key = null; //assuming there's no null keys
for(Map.Entry<Product, Item> entry: items.entrySet() ){
if( counter == position ){
key = entry.getKey();
break;
}
counter++;
}
return items.remove(key) != null;
}
Considerations about these 2 solutions.
S1: I understand that ArrayLists have fast iteration and access, so I believe the problem here is that a whole new collection is being created, so the memory would be compromised if I had a huge collection.
S2: I understand that LinkedHashMap iteration is faster than a HashMap but not as fast as an ArrayList, so I believe the time of iteration here would be compromised if we had a huge collection, but not the memory.
Considering all of that, and that my considerations are correct, can I say that both solutions have O(n) complexity?
Is there a better solution for this case in terms of performance, using the latest features of Java 8 or 9?
Cheers!
As said by Stephen C, the time complexity is the same, as in either case, you have a linear iteration, but the efficiency still differs, as the second variant will only iterate to the specified element, instead of creating a complete copy.
You could optimize this even further, by not performing an additional lookup after finding the entry. To use the pointer to the actual location within the Map, you have to make the use of its Iterator explicit:
public boolean removeItem(int position) {
if(position >= items.size()) return false;
Iterator<?> it=items.values().iterator();
for(int counter = 0; counter < position; counter++) it.next();
boolean result = it.next() != null;
it.remove();
return result;
}
This follows the logic of your original code to return false if the key was mapped to null. If you never have null values in the map, you could simplify the logic:
public boolean removeItem(int position) {
if(position >= items.size()) return false;
Iterator<?> it=items.entrySet().iterator();
for(int counter = 0; counter <= position; counter++) it.next();
it.remove();
return true;
}
You may retrieve a particular element using the Stream API, but the subsequent remove operation requires a lookup, which makes it less efficient than calling remove on an iterator which already has a reference to the position in the map, for most implementations.
public boolean removeItem(int position) {
if(position >= items.size() || position < 0)
return false;
Product key = items.keySet().stream()
.skip(position)
.findFirst()
.get();
items.remove(key);
return true;
}
Considering all of that, and that my considerations are correct, can I say that both solutions have O(n) complexity?
Yes. The average complexity is the same.
In the first solution the new ArrayList<>(entrySet) step is O(N).
In the second solution the loop is O(N).
There is difference in the best case complexity though. In the first solution you always copy the entire list. In the second solution, you only iterate as far as you need to. So the best case is that it can stop iterating at the first element.
But while the average complexity is O(N) in both cases, my gut feeling is that the second solution will be fastest. (If it matters to you, benchmark it ...)
Is there a better solution for this case in terms of performance, using the latest features of Java 8 or 9?
Java 8 and Java 9 don't offer any performance improvements.
If you want better that O(N) average complexity, you will need a different data structure.
The other thing to note is that indexing the Map's entry sets is generally not a useful thing to do. Whenever an entry is removed from the set, the index values for some of the other entries change ....
Mimicking this "unstable" indexing behavior efficiently is difficult. If you want stable behavior, then you can augment your primary HashMap<K,V> / LinkedHashMap<K,V> with a HashMap<Integer,K> which you use for positional lookup / insertion / retrieval. But even that is a bit awkward ... considering what happens if you need to insert a new entry between entries at positions i and i + 1.
I have a java.util.LinkedList containing data logically like
1 > 2 > 3 > 4 > 5 > null
and I want to remove elements from 2 to 4 and make the LinkedList like this
1 > 5 > null
In reality we should be able to achieve this in O(n) complexity, considering that you only have to break the chain at 2 and connect it to 5 in a single operation.
In Java's LinkedList I am not able to find any method which lets me remove a range of elements using from and to indices in a single O(n) operation.
It only provides me an option to remove the elements individually (making each operation O(n)).
Is there any way I can achieve this in just a single operation (without writing my own List)?
One solution provided here solves the problem using a single line of code, but not in a single operation.
list.subList(1, 4).clear();
The question was more about algorithmics and performance. When I checked the performance, this is actually slower than removing the elements one by one. I am guessing this solution does not actually remove an entire sublist in O(n) but does it one by one for each element (each removal being O(n)). It also adds extra computation to create the sublist.
Average of 1000000 computations in ms:
Without subList = 1414
With the provided subList solution = 1846
The way to do it in one step is
list.subList(1, 4).clear();
as documented in the Javadoc for java.util.LinkedList#subList(int, int).
Having checked the source code, I see that this ends up removing the elements one at a time. subList is inherited from AbstractList. This implementation returns a List that simply calls removeRange on the backing list when you invoke clear on it. removeRange is also inherited from AbstractList and the implementation is
protected void removeRange(int fromIndex, int toIndex) {
ListIterator<E> it = listIterator(fromIndex);
for (int i=0, n=toIndex-fromIndex; i<n; i++) {
it.next();
it.remove();
}
}
As you can see, this removes the elements one at a time. listIterator is overridden in LinkedList, and it starts by finding the first node by following chains either by following links from the start of the list or the end (depending on whether fromIndex is in the first or second half of the list). This means that list.subList(i, j).clear() has time complexity
O(j - i + min(i, list.size() - i)).
Apart from the case where you are better off starting from the end and removing the elements in reverse order, I am not convinced there is a solution that is noticeably faster. Testing the performance of code is not easy, and it is easy to be drawn to false conclusions.
There is no way of using the public API of the LinkedList class to remove all the elements in the middle in one go. This surprised me, as about the only reason for using a LinkedList rather than an ArrayList is that you are supposed to be able to insert and remove elements from the middle efficiently, so I thought this case worth optimising (especially as it's so easy to write).
If you absolutely need the O(1) performance that you should be able to get from a call such as
list.subList(1, list.size() - 1).clear();
you will either have to write your own implementation or do something fragile and unwise with reflection like this:
public static void main(String[] args) {
LinkedList<Integer> list = new LinkedList<>();
for (int a = 0; a < 5; a++)
list.add(a);
removeRange_NEVER_DO_THIS(list, 2, 4);
System.out.println(list); // [0, 1, 4]
}
public static void removeRange_NEVER_DO_THIS(LinkedList<?> list, int from, int to) {
try {
Method node = LinkedList.class.getDeclaredMethod("node", int.class);
node.setAccessible(true);
Object low = node.invoke(list, from - 1);
Object hi = node.invoke(list, to);
Class<?> clazz = low.getClass();
Field nextNode = clazz.getDeclaredField("next");
Field prevNode = clazz.getDeclaredField("prev");
nextNode.setAccessible(true);
prevNode.setAccessible(true);
nextNode.set(low, hi);
prevNode.set(hi, low);
Field size = LinkedList.class.getDeclaredField("size");
size.setAccessible(true);
size.set(list, list.size() - to + from);
} catch (Exception e) {
throw new RuntimeException(e);
}
}
To remove the middle elements in a single operation (method call) you could subclass java.util.LinkedList and then expose a call to List.removeRange(int, int):
list.removeRange(1, 4);
(Credit to the person who posted this answer then removed it. :)) However, even this method calls ListIterator.remove() n times.
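A sketch of that subclassing approach (the class name is made up for illustration); removeRange is protected in AbstractList, so a subclass may widen its visibility, but as noted it still removes the elements one at a time internally:
import java.util.LinkedList;

class RangeRemovableLinkedList<E> extends LinkedList<E> {
    @Override
    public void removeRange(int fromIndex, int toIndex) {
        super.removeRange(fromIndex, toIndex); // inherited loop: one remove per element
    }
}

// usage
RangeRemovableLinkedList<Integer> list = new RangeRemovableLinkedList<>();
// ... populate ...
list.removeRange(1, 4);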
I do not believe there is a way to remove n consecutive entries from a java.util.LinkedList without performing n operations under the hood.
In general removing n consecutive items from any linked list seems to require O(n) operations as one must traverse from the start index to the end index one item at a time - inherently - in order to find the next list entry in the modified list.
I'm iterating through a huge file, reading a key and a value from every line. I need to obtain a specific number (say 100k) of elements with the highest values. To store them I figured that I need a collection that allows me to check the minimum value in O(1) or O(log(n)) and, if the currently read value is higher, remove the element with the minimum value and put in the new one. What collection enables me to do that? Values are not unique, so BiMap is probably not adequate here.
EDIT:
Ultimate goal is to obtain best [key, value] that will be used later. Say my file looks like below (first column - key, second value):
3 6
5 9
2 7
1 6
4 5
Let's assume I'm looking for the best two elements and an algorithm to achieve that. I figured that I'll use a key-based collection to store the best elements. The first two elements (<3, 6>, <5, 9>) will obviously be added to the collection as its capacity is 2. But when I get to the third line I need to check whether <2, 7> is eligible to be added to the collection (so I need to be able to check whether 7 is higher than the minimum value in the collection, which is 6).
It sounds like you don't actually need a key-based structure, because you are simply looking for the largest N values with their corresponding keys, and the keys are not actually used for sorting or retrieval for the purpose of this problem.
I would use the PriorityQueue, with the minimum value at the root. This allows you to retrieve the smallest element in constant time, and if your next value is larger, removal and insertion in O(log N) time.
class V {
    int key;
    int value;
}
class ComparatorV implements Comparator<V> {
    @Override
    public int compare(V a, V b) {
        return Integer.compare(a.value, b.value);
    }
}
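A hedged sketch of how those pieces could be wired together (the names best, n, and input are illustrative; input stands in for whatever yields the parsed lines of the file):
import java.util.PriorityQueue;

int n = 100_000;
PriorityQueue<V> best = new PriorityQueue<>(n, new ComparatorV()); // min-heap on value
for (V candidate : input) {
    if (best.size() < n) {
        best.add(candidate);                          // still filling up
    } else if (candidate.value > best.peek().value) { // peek() is the current minimum, O(1)
        best.poll();                                  // drop the minimum, O(log n)
        best.add(candidate);                          // insert the better one, O(log n)
    }
}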
For your specific situation, you can use a TreeSet, and to get around the uniqueness of elements in a set you can store pairs which are comparable but which never appear equal when compared. This will allow you to violate the contract with Set which specifies that the Set not contain equal values.
The documentation for TreeSet contains:
The behavior of a set is well-defined even if its ordering is inconsistent with equals; it just fails to obey the general contract of the Set interface.
So using the TreeSet with the Comparable inconsistent with equals should be fine in this situation. If you ever need to compare your chess pairs for a different reason (perhaps some other algorithm you are also running in this app) where the comparison should be consistent with equals, then provide a Comparator for the other use. Notice that TreeSet has a constructor which takes a Comparator, so you can use that instead of having ChessPair implement Comparable.
Notice: A TreeSet provides more flexibility than a PriorityQueue in general because of all of its utility methods, but by violating the "comparable consistent with equals" contract of Set some of the functionality of the TreeSet is lost. For example, you can still remove the first element of the set using Set.pollFirst, but you cannot remove an arbitrary element using remove since that will rely on the elements being equivalent.
Per your "n or at worst log(n)" requirement, the documentation also states:
This implementation provides guaranteed log(n) time cost for the basic operations (add, remove and contains).
Also, I provide an optimization below which reduces the minimum-value query to O(1).
Example
Set<ChessPair> s = new TreeSet<ChessPair>();
and
public class ChessPair implements Comparable<ChessPair>
{
final int location;
final int value;
public ChessPair(final int location, final int value)
{
this.location = location;
this.value = value;
}
@Override
public int compareTo(ChessPair o)
{
if(value < o.value) return -1;
return 1;
}
}
Now you have an ordered set containing your pairs of numbers, they are ordered by your value, you can have duplicate values, and you can get the associated locations. You can also easily grab the first element (set.first()), the last (set.last()), get a sub-set (set.subSet(a, b)), or iterate over the first (or last, by using descendingSet()) n elements. This provides everything you asked for.
Example Use
You specified wanting to keep the 100 000 best elements. So I would use one algorithm for the first 100 000 possibilities which simply adds every time.
for(int i = 0; i < 100000 && dataSource.hasNext(); i += 1)
{
ChessPair p = dataSource.next(); // or whatever you do to get the next line
set.add(p);
}
and then a different one after that
while(dataSource.hasNext())
{
ChessPair p = dataSource.next();
if(p.value > set.first().value)
{
set.pollFirst(); // removes the current minimum
set.add(p);
}
}
Optimization
In your case, you can insert an optimization into the algorithm where you compare against the lowest value. The above, simple version performs an O(log(n)) operation every time it compares against minimum-value since set.first() is O(log(n)). Instead, you can store the minimum value in a local variable.
This optimization works well for scaling this algorithm because the impact is negligible - no gain, no loss - when n is close to the total data set size (ie: you want best 100 values out of 110), but when the total data set is vastly larger than n (ie: best 100 000 out of 100 000 000 000) the query for the minimum value is going to be your most common operation and will now be constant.
So now we have (after loading the initial n values)...
int minimum = set.first().value;
while(dataSource.hasNext())
{
ChessPair p = dataSource.next();
if(p.value > minimum)
{
set.pollFirst(); // removes the current minimum
set.add(p);
minimum = set.first().value;
}
}
Now your most common operation - querying the minimum value - is constant time (O(1)), your second most common operation - add - is worst case log(n) time, and your least common operation - remove - is worst case log(n) time.
For arbitrarily large data sets, almost every input is now handled in constant O(1) time, since most candidates fail the minimum-value check and are never inserted.
See java.util.TreeSet
Previous answer (now obsolete)
Based on question edits and discussion in the question's comments, I no longer believe my original answer to be correct. I am leaving it below for reference.
If you want a Map collection which allows fast access to elements based on order, then you want an ordered Map, for which there is a sub-interface SortedMap. Fortunately for you, Java has a great implementation of SortedMap: it's TreeMap, a Map which is backed by a "red-black" tree structure which is an ordered tree.
Red-black-trees are nice since they rotate branches in order to keep the tree balanced. That is, you will not end up with a tree that branches n times in one direction, yielding n layers, just because your data may already have been sorted. You are guaranteed to have approximately log(n) layers in the tree, so it is always fast and guarantees log(n) query even for worst-case.
For your situation, try out the java.util.TreeMap. On the page linked in the previous sentence, there are links also to Map and SortedMap. You should check out the one for SortedMap too, so you can see where TreeMap gets some of the specific functionality that you are looking for. It allows you to get the first key, the last key, and a sub-map that fetches a range from within this map.
For your situation though, it is probably sufficient to just grab an iterator from the TreeMap and iterate over the first n pairs, where n is the number of lowest (or highest) values that you want.
Use a TreeSet, which offers O(log n) insertion and O(1) retrieval of either the highest or lowest scored item.
Your item class must:
Implement Comparable
Not implement equals()
To keep the top 100K items only, use this code:
Item item; // to add
if (treeSet.size() == 100_000) {
if (treeSet.first().compareTo(item) < 0) {
treeSet.remove(treeSet.first());
treeSet.add(item);
}
} else {
treeSet.add(item);
}
If you want a collection ordered by values, you can use a TreeSet which stores tuples of your keys and values. A TreeSet has O(log(n)) access times.
class KeyValuePair<Key, Value extends Comparable<Value>> implements Comparable<KeyValuePair<Key, Value>> {
    Key key;
    Value value;
    KeyValuePair(Key key, Value value) {
        this.key = key;
        this.value = value;
    }
    @Override
    public int compareTo(KeyValuePair<Key, Value> other) {
        return this.value.compareTo(other.value);
    }
}
or instead of implementing Comparable, you can pass a Comparator to the set at creation time.
You can then retrieve the first value using treeSet.first().value.
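A brief usage sketch (Integer keys and values chosen only for illustration). Note that because compareTo looks only at the value, a TreeSet would treat two pairs with equal values as duplicates, so in practice you may want a tie-breaker on the key:
import java.util.TreeSet;

TreeSet<KeyValuePair<Integer, Integer>> byValue = new TreeSet<>();
byValue.add(new KeyValuePair<>(3, 6));
byValue.add(new KeyValuePair<>(5, 9));
int lowestValue = byValue.first().value;  // 6
Integer lowestKey = byValue.first().key;  // 3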
Something like this?
entry for your data structure, that can be sorted based on the value
class Entry implements Comparable<Entry> {
    public final String key;
    public final long value;
    public Entry(String key, long value) {
        this.key = key;
        this.value = value;
    }
    public int compareTo(Entry other) {
        return Long.compare(this.value, other.value); // avoids int overflow / lossy conversion
    }
    public int hashCode() {
        // hashcode based on the same values on which equals works
        return java.util.Objects.hash(key, value);
    }
}
actual code that works with a PriorityQueue. The sorting is based on the value, not on the key as it would be with a TreeMap. This is because of the compareTo method defined in Entry. If the queue grows above maxSize (e.g. 100000), the lowest entry (the one with the lowest value) is removed.
public class ProcessData {
    private final int maxSize;
    private final PriorityQueue<Entry> largestEntries;
    public ProcessData(int maxSize) {
        this.maxSize = maxSize;
        // create the queue here, after maxSize has been set
        this.largestEntries = new PriorityQueue<>(maxSize);
    }
    public void addKeyValue(String key, long value) {
        largestEntries.add(new Entry(key, value));
        if (largestEntries.size() > maxSize) {
            largestEntries.poll();
        }
    }
}
I have the following problem: I need to find pairs of the same elements in two lists, which are unordered. The thing about these two lists is that they are "roughly equal" - only certain elements are shifted by a few indexes e.g. (Note, these objects are not ints, I am just using integers in this example):
[1,2,3,5,4,8,6,7,10,9]
[1,2,3,4,5,6,7,8,9,10]
My first attempt would be to iterate through both lists and generate two HashMaps based on some unique key for each object. Then, upon the second pass, I would simply pull the elements from both maps. This yields O(2N) in space and time.
I was thinking about a different approach: we would keep pointers to the current element in both lists, as well as a currentlyUnmatched set for each list. The pseudocode would be something like the following:
while(elements to process)
    elem1 = list1.get(index1)
    elem2 = list2.get(index2)
    if(elem1 == elem2){ //do work
        ... index1++;
        index2++;
    }
    else{
        //Move index of the list that has no unmatched elems
        if(firstListUnmatched.size() == 0){
            //Didn't find it in the other list either, so we save it for later
            if(secondListUnmatched.remove(elem1) != true)
                firstListUnmatched.insert(elem1)
            index1++
        }
        else { // same but with other index }
    }
The above probably does not work... I just wanted to get a rough idea of what you think about this approach. Basically, this maintains a hashset on the side of each list, whose size is much smaller than the problem size. This should be ~O(N) for a small number of misplaced elements and for small "gaps". Anyway, I look forward to your replies.
EDIT: I cannot simply return a set intersection of two object lists, as I need to perform operations (multiple operations even) on the objects I find as matching/non-matching
I cannot simply return a set intersection of two object lists, as I need to perform operations (multiple operations even) on the objects I find as matching/non-matching
You can maintain a set of the objects which don't match. This will be O(M) in space where M is the largest number of swapped elements at any point. It will be O(N) for time where N is the number of elements.
interface Listener<T> {
void matched(T t1);
void onlyIn1(T t1);
void onlyIn2(T t2);
}
public static <T> void compare(List<T> list1, List<T> list2, Listener<T> tListener) {
Set<T> onlyIn1 = new HashSet<T>();
Set<T> onlyIn2 = new HashSet<T>();
for (int i = 0; i < list1.size(); i++) {
T t1 = list1.get(i);
T t2 = list2.get(i);
if (t1.equals(t2)) {
tListener.matched(t1);
continue;
}
if (onlyIn2.remove(t1))
tListener.matched(t1);
else
onlyIn1.add(t1);
if (!onlyIn1.remove(t2))
onlyIn2.add(t2);
}
for (T t1 : onlyIn1)
tListener.onlyIn1(t1);
for (T t2 : onlyIn2)
tListener.onlyIn2(t2);
}
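A possible usage sketch with the two example lists from the question (the printed messages are only for illustration):
import java.util.Arrays;
import java.util.List;

List<Integer> list1 = Arrays.asList(1, 2, 3, 5, 4, 8, 6, 7, 10, 9);
List<Integer> list2 = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
compare(list1, list2, new Listener<Integer>() {
    public void matched(Integer t)  { System.out.println("matched " + t); }
    public void onlyIn1(Integer t)  { System.out.println("only in list1: " + t); }
    public void onlyIn2(Integer t)  { System.out.println("only in list2: " + t); }
});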
If I have understood your question correctly, you can use Collection.retainAll and then iterate over the collection that has been retained and do what you have to do.
list2.retainAll(list1);
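Worth noting: retainAll mutates list2 in place, so it must be a modifiable list; a minimal illustration with integers:
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

List<Integer> list1 = new ArrayList<>(Arrays.asList(1, 2, 3, 5, 4));
List<Integer> list2 = new ArrayList<>(Arrays.asList(1, 2, 3, 4, 5, 6));
list2.retainAll(list1); // list2 is now [1, 2, 3, 4, 5]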
All approaches based on maps will be O(n log(n)) at best, because creating the map is an insertion sort. The effect is to do an insertion sort on both, and then compare them, which is as good as it's going to get.
If the lists are nearly sorted to begin with, a sort step shouldn't take as long as the average case, and will scale with O(n log(n)), so just do a sort on both and compare. This allows you to step through and perform your operations on the items that match or do not match as appropriate.
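A hedged sketch of that sort-then-merge idea, assuming the elements are Comparable (the method name and the callback comments are illustrative only):
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

static <T extends Comparable<T>> void compareSorted(List<T> list1, List<T> list2) {
    List<T> a = new ArrayList<>(list1);
    List<T> b = new ArrayList<>(list2);
    Collections.sort(a); // nearly sorted input keeps this step cheap in practice
    Collections.sort(b);
    int i = 0, j = 0;
    while (i < a.size() && j < b.size()) {
        int cmp = a.get(i).compareTo(b.get(j));
        if (cmp == 0)      { /* matched: a.get(i) */ i++; j++; }
        else if (cmp < 0)  { /* only in list1: a.get(i) */ i++; }
        else               { /* only in list2: b.get(j) */ j++; }
    }
    // any elements left over in a or b are unmatched
}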