I'm reading about PriorityQueue in the Javadocs, and it mentions the term tie-breaking. I couldn't understand what the term means. I hope someone can explain.
In Java, comparison is done using a compare(a,b) method (for comparators) or a.compareTo(b) method (for class instances that can be compared). This method is supposed to return a negative number whenever a < b, a positive number when a > b, and 0 when a = b.
However sometimes people just use return value 0 to mean a and b are incomparable (some orderings aren't total). In this case, the PriorityQueue has to decide which element goes first. This is tie-breaking. Specifically some priority queues preserve the order in which zero-comparing elements were inserted, so in that case insertion time is the tie-breaker. Then for a collection of elements where compareTo() always returns 0, the priority queue would act just like a normal queue.
If you have a priority queue that uses a node's score to assign priority, then if two scores are the same you have a tie.
You could implement first-in-first-out tie-breaking for elements that compare as equal. If two scores are the same, priority is given to the node that was added first.
if (NodeA.getScore() == NodeB.getScore()) {
    // this is a tie: the node with the smaller insertion number was added first
    if (NodeA.getOrderAdded() < NodeB.getOrderAdded()) {
        // NodeA has priority
    } else {
        // NodeB has priority
    }
}
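As a minimal sketch of the first-in-first-out tie-breaking described above: java.util.PriorityQueue does not preserve insertion order among equal elements by itself, so one option is to store an insertion sequence number and compare on it second. The Node class, its fields and the drainOrder helper are hypothetical names made up for this illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical illustration: queue entries remember when they were added.
class FifoTieBreakDemo {
    static final class Node {
        final String name;
        final int score;
        final long orderAdded; // insertion sequence number, used as the tie-breaker

        Node(String name, int score, long orderAdded) {
            this.name = name;
            this.score = score;
            this.orderAdded = orderAdded;
        }
    }

    // Lower score first; equal scores fall back to insertion order.
    static final Comparator<Node> BY_SCORE_THEN_FIFO =
            Comparator.<Node>comparingInt(n -> n.score)
                      .thenComparingLong(n -> n.orderAdded);

    // Adds the (name, score) pairs in order and returns the names in poll order.
    static List<String> drainOrder(String[] names, int[] scores) {
        PriorityQueue<Node> queue = new PriorityQueue<>(BY_SCORE_THEN_FIFO);
        long sequence = 0;
        for (int i = 0; i < names.length; i++) {
            queue.add(new Node(names[i], scores[i], sequence++));
        }
        List<String> result = new ArrayList<>();
        while (!queue.isEmpty()) {
            result.add(queue.poll().name);
        }
        return result;
    }

    public static void main(String[] args) {
        // prints [b, d, a, c]: ties on score are resolved by insertion order
        System.out.println(drainOrder(new String[] {"a", "b", "c", "d"},
                                      new int[] {2, 1, 2, 1}));
    }
}
```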
Suppose I have a class not implementing the Comparable interface like
class Dummy {
}
and a collection of instances of this class plus some function external to the class that allows comparing these instances partially (a map will be used for this purpose below):
Collection<Dummy> col = new ArrayList<>();
Map<Dummy, Integer> map = new HashMap<>();
for (int i = 0; i < 12; i++) {
Dummy d = new Dummy();
col.add(d);
map.put(d, i % 4);
}
Now I want to sort this collection using the TreeSet class with a custom comparator:
TreeSet<Dummy> sorted = new TreeSet<>(new Comparator<Dummy>() {
@Override
public int compare(Dummy o1, Dummy o2) {
return map.get(o1) - map.get(o2);
}
});
sorted.addAll(col);
The result is obviously unsatisfactory (contains less elements than the initial collection). This is because such a comparator is not consistent with equals, i.e. sometimes returns 0 for non-equal elements. My next attempt was to change the compare method of the comparator to
@Override
public int compare(Dummy o1, Dummy o2) {
int d = map.get(o1) - map.get(o2);
if (d != 0)
return d;
if (o1.equals(o2))
return 0;
return 1; // is this acceptable?
}
It seemingly gives the desired result for this simple demonstration example, but I'm still in doubt: is it correct to always return 1 for unequal objects that the map cannot distinguish? Such a relation still violates the general contract of the Comparator.compare() method, because sgn(compare(x, y)) == -sgn(compare(y, x)) does not hold in general. Do I really need to implement a correct total ordering for TreeSet to work correctly, or is the above enough? How can I do this when an instance has no fields to compare?
For more real-life example imagine that, instead of Dummy, you have a type parameter T of some generic class. T may have some fields and implement the equals() method through them, but you don't know these fields and yet need to sort instances of this class according to some external function. Is this possible with the help of TreeSet?
Edit
Using System.identityHashCode() is a great idea but there is (not so small) chance of collision.
Besides possibility of such a collision, there is one more pitfall. Suppose you have 3 objects: a, b, c such that map.get(a) = map.get(b) = map.get(c) (here = isn't assignment but the mathematical equality), identityHashCode(a) < identityHashCode(b) < identityHashCode(c), a.equals(c) is true, but a.equals(b) (and hence c.equals(b)) is false. After adding these 3 elements to a TreeSet in this order: a, b, c you can get into a situation when all of them have been added to the set, that contradicts the prescribed behaviour of the Set interface - it should not contain equal elements. How to deal with that?
In addition, it would be great if someone familiar with TreeSet mechanics explained to me what does the term "well-defined" in the phrase "The behavior of a set is well-defined even if its ordering is inconsistent with equals" from TreeSet javadoc mean.
Unless you have an absolutely huge number of Dummy objects and really bad luck, you can use System.identityHashCode() to break ties:
Comparator.<Dummy>comparingInt(d -> map.get(d))
.thenComparingInt(System::identityHashCode)
Your comparator is not acceptable, since it violates the contract: you have d1 > d2 and d2 > d1 at the same time if they're not equal but share the same value in the map.
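A small sketch of that violation, assuming two distinct objects mapped to the same value. The alwaysOneOnTies helper is a hypothetical name that reproduces the "return 1" comparator from the question; plain Object instances stand in for Dummy.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;

class AlwaysOneDemo {
    // Reproduces the comparator from the question: fall back to "return 1" on ties.
    static <T> Comparator<T> alwaysOneOnTies(Map<T, Integer> keys) {
        return (o1, o2) -> {
            int d = Integer.compare(keys.get(o1), keys.get(o2));
            if (d != 0) return d;
            if (o1.equals(o2)) return 0;
            return 1; // claims o1 > o2 regardless of the argument order
        };
    }

    public static void main(String[] args) {
        Object a = new Object();
        Object b = new Object();
        Map<Object, Integer> keys = new HashMap<>();
        keys.put(a, 0);
        keys.put(b, 0);
        Comparator<Object> cmp = alwaysOneOnTies(keys);
        System.out.println(cmp.compare(a, b)); // prints 1
        System.out.println(cmp.compare(b, a)); // prints 1 as well, so a > b and b > a
    }
}
```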
This answer covers just the first example in the question. The remainder of the question, and the various edits, are I think better answered as part of separate, focused questions.
The first example sets up 12 instances of Dummy, creates a map that maps each instance to an Integer in the range [0, 3], and then adds the 12 Dummy instances to a TreeSet. That TreeSet is provided with a comparator that uses the Dummy-to-Integer map. The result is that the TreeSet contains only four of the Dummy instances. The example concludes with the following statement:
The result is obviously unsatisfactory (contains less elements than the initial collection). This is because such a comparator is not consistent with equals, i.e. sometimes returns 0 for non-equal elements.
This last sentence is incorrect. The result contains fewer elements than were inserted because the comparator considers many of the instances to be duplicates and therefore they aren't inserted into the set. The equals method doesn't enter the discussion at all. Therefore, the concept of "consistent with equals" isn't relevant to this discussion. TreeSet never calls equals. The comparator is the only thing that determines membership in the TreeSet.
This seems like an unsatisfactory result, but only because we happen to "know" that there are 12 distinct Dummy instances. However, the TreeSet doesn't "know" that they are distinct. It only knows how to compare the Dummy instances using the comparator. When it does so, it finds that several are duplicates. That is, the comparator returns 0 sometimes even though it's being called with Dummy instances that we believe to be distinct. That's why only four Dummy instances end up in the TreeSet.
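The effect can be reproduced with a short sketch. It assumes plain Object instances in place of Dummy, and sizeWithMapComparator is a hypothetical helper name.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

class TreeSetDuplicatesDemo {
    // Builds `instances` distinct objects, maps each to i % distinctKeys,
    // and returns how many survive in a TreeSet ordered by that map alone.
    static int sizeWithMapComparator(int instances, int distinctKeys) {
        List<Object> col = new ArrayList<>();
        Map<Object, Integer> map = new HashMap<>();
        for (int i = 0; i < instances; i++) {
            Object d = new Object(); // stands in for a Dummy instance
            col.add(d);
            map.put(d, i % distinctKeys);
        }
        TreeSet<Object> sorted = new TreeSet<>(Comparator.<Object>comparingInt(map::get));
        sorted.addAll(col);
        return sorted.size();
    }

    public static void main(String[] args) {
        // prints 4: the comparator only distinguishes four different keys
        System.out.println(sizeWithMapComparator(12, 4));
    }
}
```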
I'm not entirely sure what the desired outcome is, but it seems like the result TreeSet should contain all 12 instances ordered by values in the Dummy-to-Integer map. My suggestion was to use Guava's Ordering.arbitrary() which provides a comparator that distinguishes between distinct-but-otherwise-equal elements, but does so in a way that satisfies the general contract of Comparator. If you create the TreeSet like this:
SortedSet<Dummy> sorted = new TreeSet<>(Comparator.<Dummy>comparingInt(map::get)
.thenComparing(Ordering.arbitrary()));
the result will be that the TreeSet contains all 12 Dummy instances, sorted by Integer value in the map, and with Dummy instances that map to the same value ordered arbitrarily.
In the comments, you stated that the Ordering.arbitrary doc "unequivocally cautions against using it in SortedSet". That's not quite right; that doc says,
Because the ordering is identity-based, it is not "consistent with Object.equals(Object)" as defined by Comparator. Use caution when building a SortedSet or SortedMap from it, as the resulting collection will not behave exactly according to spec.
The phrase "not behave exactly according to spec" really means that it will behave "strangely" as described in the class doc of Comparator:
The ordering imposed by a comparator c on a set of elements S is said to be consistent with equals if and only if c.compare(e1, e2)==0 has the same boolean value as e1.equals(e2) for every e1 and e2 in S.
Caution should be exercised when using a comparator capable of imposing an ordering inconsistent with equals to order a sorted set (or sorted map). Suppose a sorted set (or sorted map) with an explicit comparator c is used with elements (or keys) drawn from a set S. If the ordering imposed by c on S is inconsistent with equals, the sorted set (or sorted map) will behave "strangely." In particular the sorted set (or sorted map) will violate the general contract for set (or map), which is defined in terms of equals.
For example, suppose one adds two elements a and b such that (a.equals(b) && c.compare(a, b) != 0) to an empty TreeSet with comparator c. The second add operation will return true (and the size of the tree set will increase) because a and b are not equivalent from the tree set's perspective, even though this is contrary to the specification of the Set.add method.
You seemed to indicate that this "strange" behavior was unacceptable, in that Dummy elements that are equals shouldn't appear in the TreeSet. But the Dummy class doesn't override equals, so it seems like there's an additional requirement lurking behind here.
There are some additional questions added in later edits to the question, but as I mentioned above, I think these are better handled as separate question(s).
UPDATE 2018-12-22
After rereading the question edits and comments, I think I've finally figured out what you're looking for. You want a comparator over any object that provides a primary ordering based on some int-valued function that may result in duplicate values for unequal objects (as determined by the objects' equals method). Therefore, a secondary ordering is required that provides a total ordering over all unequal objects, but which returns zero for objects that are equals. This implies that the comparator should be consistent with equals.
Guava's Ordering.arbitrary comes close in that it provides an arbitrary total ordering over any objects, but it only returns zero for objects that are identical (that is, ==) but not for objects that are equals. It's thus inconsistent with equals.
It sounds, then, that you want a comparator that provides an arbitrary ordering over unequal objects. Here's a function that creates one:
static Comparator<Object> arbitraryUnequal() {
Map<Object, Integer> map = new HashMap<>();
return (o1, o2) -> Integer.compare(map.computeIfAbsent(o1, x -> map.size()),
map.computeIfAbsent(o2, x -> map.size()));
}
Essentially, this assigns a sequence number to every newly seen unequal object and keeps these numbers in a map held by the comparator. It uses the map's size as the counter. Since objects are never removed from this map, the size and thus the sequence number always increases.
(If you intend for this comparator to be used concurrently, e.g., in a parallel sort, the HashMap should be replaced with a ConcurrentHashMap and the size trick should be modified to use an AtomicInteger that's incremented when new entries are added.)
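One possible shape for that concurrent variant is sketched below. The name arbitraryUnequalConcurrent is made up for this illustration, and the sketch assumes non-null elements, since ConcurrentHashMap rejects null keys.

```java
import java.util.Comparator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

class ConcurrentArbitraryUnequal {
    // Sketch only: assigns a sequence number to each newly seen unequal object.
    // Assumes non-null arguments because ConcurrentHashMap does not allow null keys.
    static Comparator<Object> arbitraryUnequalConcurrent() {
        Map<Object, Integer> ids = new ConcurrentHashMap<>();
        AtomicInteger next = new AtomicInteger();
        // computeIfAbsent is atomic on ConcurrentHashMap, so each key gets one number.
        return (o1, o2) -> Integer.compare(
                ids.computeIfAbsent(o1, x -> next.getAndIncrement()),
                ids.computeIfAbsent(o2, x -> next.getAndIncrement()));
    }

    public static void main(String[] args) {
        Comparator<Object> cmp = arbitraryUnequalConcurrent();
        System.out.println(cmp.compare("a", new String("a"))); // prints 0: equal objects
        System.out.println(cmp.compare("a", "b") != 0);        // prints true
    }
}
```

A counter can skip values if two threads race on different keys, which is harmless here because only the relative order of the numbers matters.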
Note that the map in this comparator builds up entries for every unequal object that it's ever seen. If this is attached to a TreeSet, objects will persist in the comparator's map even after they've been removed from the TreeSet. This is necessary so that if objects are added or removed, they'll retain consistent ordering over time. Guava's Ordering.arbitrary uses weak references to allow objects to be collected if they're no longer used. We can't do that, because we need to preserve the ordering of non-identical but equal objects.
You'd use it like this:
SortedSet<Dummy> sorted = new TreeSet<>(Comparator.<Dummy>comparingInt(map::get)
.thenComparing(arbitraryUnequal()));
You had also asked what "well-defined" means in the following:
The behavior of a set is well-defined even if its ordering is inconsistent with equals
Suppose you were to use a TreeSet using a comparator that's inconsistent with equals, such as the one using Guava's Ordering.arbitrary shown above. The TreeSet will still work as expected, consistent with itself. That is, it will maintain objects in a total ordering, it will not contain any two objects for which the comparator returns zero, and all its methods will work as specified. However, it is possible for there to be an object for which contains returns true (since that's computed using the comparator) but for which equals is false if called with the object actually in the set.
For example, BigDecimal is Comparable but its comparison method is inconsistent with equals:
> BigDecimal z = new BigDecimal("0.0")
> BigDecimal zz = new BigDecimal("0.00")
> z.compareTo(zz)
0
> z.equals(zz)
false
> TreeSet<BigDecimal> ts = new TreeSet<>()
> ts.add(z)
> HashSet<BigDecimal> hs = new HashSet<>(ts)
> hs.equals(ts)
true
> ts.contains(zz)
true
> hs.contains(zz)
false
This is what the spec means when it says things can behave "strangely". We have two sets that are equal. Yet they report different results for contains of the same object, and the TreeSet reports that it contains an object even though that object is unequal to an object in the set.
Here's the comparator I ended up with. It is both reliable and memory efficient.
public static <T> Comparator<T> uniqualizer() {
return new Comparator<T>() {
private final Map<T, Integer> extraId = new HashMap<>();
private int id;
@Override
public int compare(T o1, T o2) {
int d = Integer.compare(o1.hashCode(), o2.hashCode());
if (d != 0)
return d;
if (o1.equals(o2))
return 0;
d = extraId.computeIfAbsent(o1, key -> id++)
- extraId.computeIfAbsent(o2, key -> id++);
assert id > 0 : "ID overflow";
assert d != 0 : "Concurrent modification";
return d;
}
};
}
It creates a total ordering on all objects of the given class T and thus makes it possible to distinguish objects that a given comparator cannot tell apart, by attaching it like this:
Comparator<T> partial = ...
Comparator<T> total = partial.thenComparing(uniqualizer());
In the example given at the question, T is Dummy and
partial = Comparator.<Dummy>comparingInt(map::get);
Note that you don't need to specify the type T when calling uniqualizer(); the compiler determines it automatically via type inference. You only have to make sure that hashCode() in T is consistent with equals(), as described in the general contract of hashCode(). Then uniqualizer() will give you the comparator (total) consistent with equals(), and you can use it in any code that requires comparing objects of type T, e.g. when creating a TreeSet:
TreeSet<T> sorted = new TreeSet<>(total);
or sorting a list:
List<T> list = ...
Collections.sort(list, total);
I suspect it doesn't. If I want to use the fact that the list is ordered, should I implement my own contains() method, using binary search, for example? Are there any methods that assume that the list is ordered?
This question is different to the possible duplicate because the other question doesn't ask about the contains() method.
No, because ArrayList is backed by an array, and contains() internally calls the indexOf(Object o) method, which searches sequentially. Thus sorting is not relevant to it. Here's the source code:
/**
* Returns the index of the first occurrence of the specified element
* in this list, or -1 if this list does not contain the element.
* More formally, returns the lowest index <tt>i</tt> such that
* <tt>(o==null ? get(i)==null : o.equals(get(i)))</tt>,
* or -1 if there is no such index.
*/
public int indexOf(Object o) {
if (o == null) {
for (int i = 0; i < size; i++)
if (elementData[i]==null)
return i;
} else {
for (int i = 0; i < size; i++)
if (o.equals(elementData[i]))
return i;
}
return -1;
}
Use the binary search from Collections to search in an ordered ArrayList:
Collections.<T>binarySearch(List<T> list, T key)
ArrayList.contains treats the list as a normal list, so it takes the same time as for any unordered list, that is O(n), whereas the complexity of binary search is O(log n) in the worst case.
No. contains uses indexOf:
public boolean contains(Object var1) {
return this.indexOf(var1) >= 0;
}
and indexOf just simply iterates over the internal array:
for(var2 = 0; var2 < this.size; ++var2) {
if (var1.equals(this.elementData[var2])) {
return var2;
}
}
Collections.binarySearch is what you're looking for:
Searches the specified list for the specified object using the binary
search algorithm. The list must be sorted into ascending order
according to the natural ordering of its elements (as by the
sort(List) method) prior to making this call. If it is not sorted, the
results are undefined.
Emphasis mine
Also consider using a SortedSet such as a TreeSet which will provide stronger guarantees that the elements are kept in the correct order, unlike a List which must rely on caller contracts (as highlighted above)
Does the ArrayList's contains() method work faster if the ArrayList is ordered?
It doesn't. The implementation of ArrayList does not know if the list is ordered or not. Since it doesn't know, it cannot optimize in the case when it is ordered. (And an examination of the source code bears this out.)
Could a (hypothetical) array-based-list implementation know? I think "No" for the following reasons:
Without either a Comparator or a requirement that elements implement Comparable, the concept of ordering is ill-defined.
The cost of checking that a list is ordered is O(N). The cost of incrementally checking that a list is still ordered is O(1) ... but still one or two calls to compare on each update operation. That is a significant overhead ... for a general purpose data structure to incur in the hope of optimizing (just) one operation in the API.
But that's OK. If you (the programmer) are able to ensure (ideally by efficient algorithmic means) that a list is always ordered, then you can use Collections.binarySearch ... with zero additional checking overhead in update operations.
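A minimal sketch of such a sorted lookup, assuming the caller keeps the list sorted in ascending natural order. The helper name containsSorted is hypothetical; Collections.binarySearch returns a non-negative index only when the key is found.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class SortedContainsDemo {
    // Precondition (not checked): sortedList is in ascending natural order.
    static <T extends Comparable<? super T>> boolean containsSorted(List<T> sortedList, T key) {
        // A negative result encodes the insertion point, meaning "not found".
        return Collections.binarySearch(sortedList, key) >= 0;
    }

    public static void main(String[] args) {
        List<Integer> sorted = new ArrayList<>(Arrays.asList(1, 3, 5, 7));
        System.out.println(containsSorted(sorted, 5)); // prints true
        System.out.println(containsSorted(sorted, 4)); // prints false
    }
}
```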
Just to keep it simple.
If you have an array [5,4,3,2,1] and you sort it into [1,2,3,4,5], a linear search will find 1 faster but will take longer to find 5. Consequently, from a mathematical point of view, sorting the array does not help: searching for an item still requires looping over up to n elements in the worst case.
It may be that sorting helps for your problem, say you receive unordered timestamps, but if
your array is not too small
you want to avoid the additional cost of sorting for each new entry in the array
you just want to find an object quickly
you know the properties of the object you are searching for
then you can create a KeyObject containing the properties you are looking for, implement equals and hashCode for it, and store your items in a Map. Using Map.containsKey(new KeyObject(prop1, prop2)) would in any case be faster than looping over the array. If you do not have the real object, you can always create a fake KeyObject, filled with the properties you expect, to check the Map.
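A possible shape for such a key class is sketched here. The property types (a String and a long) are assumptions chosen for illustration, since the answer does not say what prop1 and prop2 are.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

class KeyLookupDemo {
    // Hypothetical key: the property types are assumed for this sketch.
    static final class KeyObject {
        private final String prop1;
        private final long prop2;

        KeyObject(String prop1, long prop2) {
            this.prop1 = Objects.requireNonNull(prop1);
            this.prop2 = prop2;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof KeyObject)) return false;
            KeyObject other = (KeyObject) o;
            return prop2 == other.prop2 && prop1.equals(other.prop1);
        }

        @Override
        public int hashCode() {
            return Objects.hash(prop1, prop2); // consistent with equals
        }
    }

    public static void main(String[] args) {
        Map<KeyObject, String> items = new HashMap<>();
        items.put(new KeyObject("sensor", 42L), "stored item");
        // A freshly built key with the same properties finds the entry.
        System.out.println(items.containsKey(new KeyObject("sensor", 42L))); // prints true
        System.out.println(items.containsKey(new KeyObject("sensor", 43L))); // prints false
    }
}
```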
I have a class that I'd like to put in a TreeSet, which implements Comparable to sort them by priority. Here's a small segment:
public abstract class PacketListener implements Comparable<PacketListener> {
public enum ListenerPriority {
LOWEST, LOW, NORMAL, HIGH, HIGHEST
}
private final ListenerPriority priority; // Initialized in constructor
// ... class body ...
@Override
public final int compareTo(PacketListener o) {
return priority.compareTo(o.priority);
}
}
The idea is obviously for the TreeSet to sort the objects by priority, allowing me to iterate through the listeners in order. However, I've discovered that, for some reason, I can't add a second PacketListener to the set object. After adding two different PacketListener objects, the size of the set remains at 1.
Should I not be using TreeSet?
The API docs for TreeSet contain this important information:
Note that the ordering maintained by a set (whether or not an explicit comparator is provided) must be consistent with equals if it is to correctly implement the Set interface. [...] This is so because the Set interface is defined in terms of the equals operation, but a TreeSet instance performs all element comparisons using its compareTo (or compare) method, so two elements that are deemed equal by this method are, from the standpoint of the set, equal.
In other words, a TreeSet can accommodate multiple instances of your PacketListener class, but only so long as each has a different priority than all the others, so that for every pair of elements A and B, exactly one of these holds: either A == B or A.compareTo(B) != 0.
If you must accommodate multiple PacketListener instances with the same priority in the same collection, then you need a different type of collection. HashSet is perfectly good with classes that use the equals() and hashCode() methods inherited from Object, provided that that is indeed the desired sense of instance equality. You could also consider a LinkedHashSet if you want some kind of guarantee about iteration order, or maybe a PriorityQueue if you want to order by priority but you are willing to use a different mechanism to avoid duplicates.
TreeSet treats two objects for which compareTo returns 0 as equal. This means you can never have two objects with the same priority in your tree set with your current implementation.
One way to fix your problem is to make your compareTo method consider all values that you care about (i.e. only objects that are actually equal return 0).
A Set is a collection of unique elements. As you are using the priority to compare PacketListener, I assume that there are at most five instances in your TreeSet, one for each priority.
If the structure allows it, you can find a secondary key to compare PacketListener instances in case they have the same priority. If you can't, then TreeSet is the wrong way to go.
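As a sketch of the secondary-key idea, the class below compares by priority first and by a per-instance creation number second. The Listener class and its id field are hypothetical stand-ins for PacketListener; whether a creation counter is an appropriate secondary key depends on the real class.

```java
import java.util.TreeSet;
import java.util.concurrent.atomic.AtomicLong;

class PrioritySecondaryKeyDemo {
    enum ListenerPriority { LOWEST, LOW, NORMAL, HIGH, HIGHEST }

    // Hypothetical stand-in for PacketListener with an assumed secondary key.
    static class Listener implements Comparable<Listener> {
        private static final AtomicLong NEXT_ID = new AtomicLong();

        private final ListenerPriority priority;
        private final long id = NEXT_ID.getAndIncrement(); // unique per instance

        Listener(ListenerPriority priority) {
            this.priority = priority;
        }

        @Override
        public final int compareTo(Listener o) {
            int byPriority = priority.compareTo(o.priority);
            // Only the very same instance compares as 0.
            return byPriority != 0 ? byPriority : Long.compare(id, o.id);
        }
    }

    // Creates one listener per given priority and returns the resulting set size.
    static int sizeAfterAdding(ListenerPriority... priorities) {
        TreeSet<Listener> set = new TreeSet<>();
        for (ListenerPriority p : priorities) {
            set.add(new Listener(p));
        }
        return set.size();
    }

    public static void main(String[] args) {
        // prints 3: listeners sharing a priority no longer collapse into one
        System.out.println(sizeAfterAdding(ListenerPriority.NORMAL,
                ListenerPriority.NORMAL, ListenerPriority.HIGH));
    }
}
```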
I'm just studying for exams right now, and came across this question in the sample exam:
Block Implementation of a Priority Queue
If we know in advance that a priority queue will only ever need to cater for a small number of discrete priorities (say 10), we can implement all operations of the priority queue in constant time by representing the priority queue as an array of queues - each queue storing the elements of a single priority. Note that while an operation may be linear in the number of priorities in the priority queue, the operation is still constant with respect to the size of the overall data structure.
The objects stored in this priority queue are not comparable.
I have attempted it, but I am lost as to how I am supposed to assign priority with an array-based priority queue implementation.
I have also tried looking for solutions, but all I've managed to find are examples that used Comparable, which we did not learn in this course.
Question: http://imgur.com/3mlBoW7
Each of the arrays will correspond to a different priority. Your lowest level priority array will deal only with objects of that priority level. Your highest level priority array will deal with objects of highest priority level, and so on. When you receive a new object, you place it into the array that corresponds to its priority.
It doesn't matter, then, that objects are not comparable since they are sorted by priority based on the stratification of the arrays. Then, when you are looking for next elements to execute, you check the highest priority array and see if there are any elements; if not, move to the next priority, and so on through each array.
I'm hoping I understood the problem and your question correctly; let me know if you have any additional questions in regards to my answer.
Following on Imcphers' answer, this would be a simple implementation in Java. Note that you do not need Comparable because enqueue takes an extra parameter, namely the discrete priority of the newly-added element:
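import java.util.ArrayDeque;
import java.util.ArrayList;

public class PQueue<T> {
    public static final int MAX_PRIORITIES = 10;
    private final ArrayList<ArrayDeque<T>> queues = new ArrayList<>();

    public PQueue() {
        for (int i = 0; i < MAX_PRIORITIES; i++) {
            queues.add(new ArrayDeque<T>());
        }
    }

    public void enqueue(int priority, T element) {
        // add element to the end of the queue for this priority (0 to MAX_PRIORITIES-1)
        queues.get(priority).addLast(element);
    }

    public T dequeue() {
        // find the first non-empty queue and pop and return its first element
        for (ArrayDeque<T> queue : queues) {
            if (!queue.isEmpty()) {
                return queue.pollFirst();
            }
        }
        return null; // every queue is empty
    }
    // ... other methods
}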
Here, enqueue() and dequeue() are both O(1), because you know in advance how many priorities there can be, and what their values are (0 to MAX_PRIORITIES-1), so that no sorting is required, and the search for a non-empty queue is constant-time (at most, MAX_PRIORITIES queues will have to be tested for emptiness). If these parameters are not known, the best implementation would use
private TreeSet<ArrayDeque<T extends Comparable> > queues
= new TreeSet<>(CustomComparator);
Where the CustomComparator asks queues to sort themselves depending on the natural order of their first elements, and which needs to keep these internal queues sorted after each call to enqueue --- this ups the complexity of enqueue/dequeue to O(log p), where p is the number of distinct priorities (and therefore, internal queues).
Note that Java's PriorityQueue belongs to the same complexity class, but avoids all the object overhead contributed by the TreeSet / ArrayDeque wrappers by implementing its own internal priority heap.
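For the case where the priorities are not known in advance, here is a rough sketch of an O(log p) variant. It deliberately swaps the TreeSet of queues for a TreeMap keyed by an explicit int priority, which is a different technique from the one above, chosen because it needs no custom comparator. SparsePQueue is a made-up name.

```java
import java.util.ArrayDeque;
import java.util.Map;
import java.util.TreeMap;

// Sketch: one FIFO queue per priority, created on demand and kept sorted by key.
class SparsePQueue<T> {
    private final TreeMap<Integer, ArrayDeque<T>> queues = new TreeMap<>();

    public void enqueue(int priority, T element) {
        // O(log p) lookup of the queue for this priority, creating it if needed
        queues.computeIfAbsent(priority, p -> new ArrayDeque<>()).addLast(element);
    }

    public T dequeue() {
        // The smallest key is treated as the first priority to serve.
        Map.Entry<Integer, ArrayDeque<T>> first = queues.firstEntry();
        if (first == null) {
            return null; // nothing queued
        }
        T element = first.getValue().pollFirst();
        if (first.getValue().isEmpty()) {
            queues.remove(first.getKey()); // drop empty queues so firstEntry stays useful
        }
        return element;
    }

    public static void main(String[] args) {
        SparsePQueue<String> pq = new SparsePQueue<>();
        pq.enqueue(5, "x");
        pq.enqueue(1, "a");
        pq.enqueue(5, "y");
        pq.enqueue(1, "b");
        // prints a b x y
        System.out.println(pq.dequeue() + " " + pq.dequeue() + " " + pq.dequeue() + " " + pq.dequeue());
    }
}
```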
I'm trying to order pairs of integers in ascending order, where a pair is considered less than another pair if both its entries are strictly less than those of the other pair, and larger than the other pair if both its entries are strictly larger than those of the other pair. All other cases are considered incomparable.
The way I want to solve this is by defining a Comparator that implements the above, but will throw an exception for incomparable cases, and provide that to a PriorityQueue. Of course, while inserting a pair the priority queue does several comparisons while bubbling the new entry up to its correct position in the heap, and many of these will be comparable. But it may happen during the bubbling process that a pair is encountered with which this new pair is incomparable, and an exception will be thrown. If this happens, what will be the state of the PriorityQueue? Will the pair I was trying to insert sit in the heap at the last position it was in before the exception was thrown? If I use the PriorityQueue's remove(Object o) method, will the PriorityQueue be restored to a consistent state?
Thanks
If you look at the PriorityQueue source code, when adding/offering new elements the .compare() method is called without any try/catch (this is in siftUpUsingComparator()), which makes sense, as it's not the PriorityQueue's responsibility to prevent you from putting incomparable elements in the queue. So any RuntimeException thrown by your Comparator will bubble up to your calling code.
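The propagation can be observed with a short sketch. It assumes pairs are represented as int[2] arrays and that the comparator throws IllegalArgumentException, neither of which the question specifies. It makes no claim about the internal state of the queue afterwards, which is an implementation detail of the JDK in use.

```java
import java.util.Comparator;
import java.util.PriorityQueue;

class ThrowingComparatorDemo {
    // Strict dominance order on int[2] pairs, following the question literally:
    // every case that is not strictly less or strictly greater is incomparable.
    static final Comparator<int[]> DOMINANCE = (p, q) -> {
        if (p[0] < q[0] && p[1] < q[1]) return -1;
        if (p[0] > q[0] && p[1] > q[1]) return 1;
        throw new IllegalArgumentException("incomparable pairs");
    };

    // Offers two pairs and reports whether the second offer threw.
    static boolean offerThrows(int[] first, int[] second) {
        PriorityQueue<int[]> queue = new PriorityQueue<>(DOMINANCE);
        queue.offer(first); // a single element needs no comparison
        try {
            queue.offer(second); // compared against `first` while sifting up
            return false;
        } catch (IllegalArgumentException e) {
            return true; // the comparator's exception reached the caller
        }
    }

    public static void main(String[] args) {
        System.out.println(offerThrows(new int[] {1, 1}, new int[] {0, 2})); // prints true
        System.out.println(offerThrows(new int[] {1, 1}, new int[] {2, 2})); // prints false
    }
}
```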
The larger question here is why you are attempting to sort items which are, by your definition, "incomparable". This doesn't make much sense. If the items are of the same type but are neither greater than nor less than another item, your Comparator should return them as equal. The semantics of Comparator are such that "equal" doesn't mean "has the same value" but rather "has the same order as the element being compared to"; in other words, you return equal (0) when you want the two items to be ordered next to each other in the sort.