I think I may have found a bug in Java.
I have a TreeMap in which I use a custom comparator. However, it seems that when I call put(key, value) with a key that already exists, it does not overwrite the existing entry, thus creating duplicate keys. I think I have verified this because I tried:
System.out.println(testMap.firstKey().equals(testMap.lastKey()));
And this prints out true. Anyone know why this is happening?
This is the comparator code:
private class TestComp implements Comparator<String> {
    @Override
    public int compare(String s1, String s2) {
        if (s1.equals(s2)) {
            return 0;
        }
        int temp = otherMap.get(s1).compareTo(otherMap.get(s2));
        if (temp > 0) {
            return 1;
        }
        return -1;
    }
}
A comparator always needs to return consistent results, and when used in a TreeMap, be consistent with equals.
In this case your comparator violates the first constraint since it does not necessarily give consistent results.
Example: If for instance otherMap maps
"a" -> "someString"
"b" -> "someString"
then both compare("a", "b") and compare("b", "a") will return -1.
Note that if you change the implementation to
if (s1.equals(s2)) {
return 0;
}
return otherMap.get(s1).compareTo(otherMap.get(s2));
you break the other criterion of being consistent with equals, since otherMap.get(s1).compareTo(otherMap.get(s2)) might return 0 even though s1 does not equal s2.
I've elaborated on this in a self-answered follow up question here.
From the comments:
Even if a comparator gives inconsistent results, shouldn't the Java language still not allow duplicate keys?
No, when you insert a key, the TreeMap will use the comparator to search the data structure to see if the key already exists. If the comparator gives inconsistent results, the TreeMap might look in the wrong place and conclude that the key does not exist, leading to undefined behavior.
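A minimal sketch of a comparator that meets both constraints, assuming the keys themselves are an acceptable tie-breaker: order by the mapped value first, and fall back to the key when the values are equal. The class name ConsistentComparatorDemo and the constant BY_VALUE_THEN_KEY are made up for illustration; otherMap stands in for the map from the question.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

class ConsistentComparatorDemo {
    // Hypothetical stand-in for the question's otherMap.
    static final Map<String, String> otherMap = new HashMap<>();

    // Orders keys by their mapped value and breaks ties on the key itself,
    // so compare(a, b) == 0 only when a.equals(b).
    static final Comparator<String> BY_VALUE_THEN_KEY = (s1, s2) -> {
        int byValue = otherMap.get(s1).compareTo(otherMap.get(s2));
        return byValue != 0 ? byValue : s1.compareTo(s2);
    };

    public static void main(String[] args) {
        otherMap.put("a", "someString");
        otherMap.put("b", "someString");

        TreeMap<String, Integer> testMap = new TreeMap<>(BY_VALUE_THEN_KEY);
        testMap.put("a", 1);
        testMap.put("b", 2);
        testMap.put("a", 3); // replaces the value stored under the existing key "a"

        System.out.println(testMap.size());   // 2
        System.out.println(testMap.get("a")); // 3
    }
}
```

Note that the comparator still depends on otherMap, so changing a mapped value while its key is in the TreeMap would break the ordering again.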
I have a class for which equality (as per equals()) must be defined by the object identity, i.e. this == other.
I want to implement Comparable to order such objects (say by some getName() property). To be consistent with equals(), compareTo() must not return 0, even if two objects have the same name.
Is there a way to compare object identities in the sense of compareTo? I could compare System.identityHashCode(o), but that would still return 0 in case of hash collisions.
I think the real answer here is: don't implement Comparable then. Implementing this interface implies that your objects have a natural order. Things that are "equal" should be in the same place when you follow up that thought.
If at all, you should use a custom comparator ... but even that doesn't make much sense. If the thing that defines a < b ... is not allowed to give you a == b (when a and b are "equal" according to your < relation), then the whole approach of comparing is broken for your use case.
In other words: just because you can put code into a class that "somehow" results in what you want ... doesn't make it a good idea to do so.
If you assign each object a universally unique identifier (UUID), also known as a globally unique identifier (GUID), as its identity property, then by definition the UUID is comparable and consistent with equals. Java already has a UUID class, and once generated, you can just use the string representation for persistence. The dedicated property will also ensure that the identity is stable across versions/threads/machines. You could also just use an incrementing ID if you have a way of ensuring everything gets a unique ID, but using a standard UUID implementation will protect you from issues caused by set merges and by parallel systems generating data at the same time.
If you use anything else for the comparison, that means the object is comparable in a way that is separate from its identity/value. So you will need to define what comparable means for this object, and document that. For example, people are comparable by name, DOB, height, or a combination in order of precedence; most naturally by name as a convention (for easier lookup by humans), which is separate from whether two people are the same person. You will also have to accept that compareTo and equals are disjoint because they are based on different things.
You could add a second property (say int id or long id) which would be unique for each instance of your class (you can have a static counter variable and use it to initialize the id in your constructor).
Then your compareTo method can first compare the names, and if the names are equal, compare the ids.
Since each instance has a different id, compareTo will never return 0.
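A minimal sketch of that idea follows. The class name Named and the counter NEXT_ID are hypothetical, and the use of AtomicLong instead of a plain static counter is an assumption made here so that ids stay unique when instances are created from several threads.

```java
import java.util.concurrent.atomic.AtomicLong;

class Named implements Comparable<Named> {
    // Shared counter; every new instance takes the next value.
    private static final AtomicLong NEXT_ID = new AtomicLong();

    private final long id = NEXT_ID.getAndIncrement();
    private final String name;

    Named(String name) {
        this.name = name;
    }

    String getName() {
        return name;
    }

    // equals() and hashCode() are deliberately not overridden: equality stays identity-based.
    @Override
    public int compareTo(Named other) {
        int byName = name.compareTo(other.name);
        // Same name: fall back to the unique id, so 0 is returned only when this == other.
        return byName != 0 ? byName : Long.compare(id, other.id);
    }
}
```

With this ordering, two distinct instances with the same name can both live in a TreeSet, and the one created first sorts first.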
While I stick by my original answer that you should use a UUID property for a stable and consistent compare / equality setup, I figured I'd go ahead and answer the question of "how far could you go if you were REALLY paranoid and wanted a guaranteed unique identity for comparable".
Basically, in short if you don't trust UUID uniqueness or identity uniqueness, just use as many UUIDs as it takes to prove god is actively conspiring against you. (Note that while not technically guaranteed not to throw an exception, needing 2 UUID should be overkill in any sane universe.)
import java.util.ArrayList;
import java.util.UUID;

public class Test implements Comparable<Test> {
    private final UUID antiCollisionProp = UUID.randomUUID();
    private final ArrayList<UUID> antiuniverseProp = new ArrayList<UUID>();

    // Lazily generates UUIDs until index i exists, then returns the i-th one
    private UUID getParanoiaLevelId(int i) {
        while (antiuniverseProp.size() <= i) {
            antiuniverseProp.add(UUID.randomUUID());
        }
        return antiuniverseProp.get(i);
    }

    @Override
    public int compareTo(Test o) {
        if (this == o)
            return 0;
        // Integer.compare avoids the overflow that plain subtraction could cause
        int temp = Integer.compare(System.identityHashCode(this), System.identityHashCode(o));
        if (temp != 0)
            return temp;
        //If the universe hates you
        temp = this.antiCollisionProp.compareTo(o.antiCollisionProp);
        if (temp != 0)
            return temp;
        //If the universe is actively out to get you
        temp = Integer.compare(System.identityHashCode(this.antiCollisionProp), System.identityHashCode(o.antiCollisionProp));
        if (temp != 0)
            return temp;
        for (int i = 0; i < Integer.MAX_VALUE; i++) {
            UUID id1 = this.getParanoiaLevelId(i);
            UUID id2 = o.getParanoiaLevelId(i);
            temp = id1.compareTo(id2);
            if (temp != 0)
                return temp;
            temp = Integer.compare(System.identityHashCode(id1), System.identityHashCode(id2));
            if (temp != 0)
                return temp;
        }
        // If you reach this point, I have no idea what you did to deserve this
        throw new IllegalStateException("RAGNAROK HAS COME! THE MIDGARD SERPENT AWAKENS!");
    }
}
I assume you want the following: for two objects with the same name, if equals() returns false then compareTo() must not return 0. If so, the following can help:
Override hashCode() and make sure it doesn't rely solely on name
Implement compareTo() as follows:
public int compareTo(MyObject object) {
    // Order by name first; hash codes only break ties between same-named objects.
    // A hash collision between two such objects would still yield 0.
    int byName = this.getName().compareTo(object.getName());
    return byName != 0 ? byName : Integer.compare(this.hashCode(), object.hashCode());
}
Your objects are unique, but as Eran said, you may need an extra counter or rehash code to handle collisions.
private static Set<Pair<C, C>> collisions = ...;

@Override
public boolean equals(Object other) {
    return this == other;
}

@Override
public int compareTo(C other) {
    ...
    if (this == other) {
        return 0;
    }
    if (super.equals(other)) {
        // Some stable order would be fine:
        // return either -1 or 1
        if (collisions.contains(new Pair<>(other, this))) {
            // other was already ordered before this
            return 1;
        } else if (!collisions.contains(new Pair<>(this, other))) {
            collisions.add(new Pair<>(this, other));
        }
        // (this, other) is recorded, so this is ordered before other
        return -1;
    }
    ...
}
So go with Eran's answer, or state this requirement explicitly in the question.
One might consider the overhead of handling non-identical 0 comparisons negligible.
One might look into perfect hash functions if at some point no more instances are created. This implies you have a collection of all instances.
There are times (although rare) when it is necessary to implement an identity-based compareTo override. In my case, I was implementing java.util.concurrent.Delayed.
Since the JDK also implements this class, I thought I would share the JDK's solution, which uses an atomically incrementing sequence number. Here is a snippet from ScheduledThreadPoolExecutor (slightly modified for clarity):
/**
* Sequence number to break scheduling ties, and in turn to
* guarantee FIFO order among tied entries.
*/
private static final AtomicLong sequencer = new AtomicLong();
private class ScheduledFutureTask<V>
extends FutureTask<V> implements RunnableScheduledFuture<V> {
/** Sequence number to break ties FIFO */
private final long sequenceNumber = sequencer.getAndIncrement();
}
If the other fields used in compareTo are exhausted, this sequenceNumber value is used to break ties. The range of a 64-bit integer (long) is sufficiently large to count on this.
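As a rough illustration of how such a sequence number can be used in a Delayed implementation, here is a minimal sketch. The class DelayedTask and its fields are hypothetical and are not taken from the JDK source; only the idea of an AtomicLong tie-breaker comes from the snippet above.

```java
import java.util.concurrent.Delayed;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

class DelayedTask implements Delayed {
    private static final AtomicLong SEQUENCER = new AtomicLong();

    // Absolute trigger time, in System.nanoTime() units.
    private final long triggerNanos;
    // Unique per instance; used only to break ties on triggerNanos.
    private final long sequenceNumber = SEQUENCER.getAndIncrement();

    DelayedTask(long triggerNanos) {
        this.triggerNanos = triggerNanos;
    }

    @Override
    public long getDelay(TimeUnit unit) {
        return unit.convert(triggerNanos - System.nanoTime(), TimeUnit.NANOSECONDS);
    }

    @Override
    public int compareTo(Delayed other) {
        if (other == this) {
            return 0;
        }
        if (other instanceof DelayedTask) {
            DelayedTask that = (DelayedTask) other;
            long diff = triggerNanos - that.triggerNanos;
            if (diff != 0) {
                return diff < 0 ? -1 : 1;
            }
            // Same trigger time: the task created first sorts first (FIFO).
            return Long.compare(sequenceNumber, that.sequenceNumber);
        }
        // Some other Delayed implementation: compare the remaining delays.
        return Long.compare(getDelay(TimeUnit.NANOSECONDS), other.getDelay(TimeUnit.NANOSECONDS));
    }
}
```

Two tasks scheduled for the same instant never compare as 0 here, so a sorted structure keeps both of them.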
The HashSet class has an add(Object o) method, which is not inherited from another class. The Javadoc for that method says the following:
Adds the specified element to this set if it is not already present. More formally, adds the specified element e to this set if this set contains no element e2 such that (e==null ? e2==null : e.equals(e2)). If this set already contains the element, the call leaves the set unchanged and returns false.
In other words, if two objects are equal, then the second object will not be added and the HashSet will remain the same. However, I've discovered that this is not true if objects e and e2 have different hashcodes, despite the fact that e.equals(e2). Here is a simple example:
import java.util.HashSet;
import java.util.Iterator;
import java.util.Random;
public class BadHashCodeClass {
/**
* A hashcode that will randomly return an integer, so it is unlikely to be the same
*/
@Override
public int hashCode(){
return new Random().nextInt();
}
/**
* An equal method that will always return true
*/
@Override
public boolean equals(Object o){
return true;
}
public static void main(String... args){
HashSet<BadHashCodeClass> hashSet = new HashSet<>();
BadHashCodeClass instance = new BadHashCodeClass();
System.out.println("Instance was added: " + hashSet.add(instance));
System.out.println("Instance was added: " + hashSet.add(instance));
System.out.println("Elements in hashSet: " + hashSet.size());
Iterator<BadHashCodeClass> iterator = hashSet.iterator();
BadHashCodeClass e = iterator.next();
BadHashCodeClass e2 = iterator.next();
System.out.println("Element contains e and e2 such that (e==null ? e2==null : e.equals(e2)): " + (e==null ? e2==null : e.equals(e2)));
}
}
The results from the main method are:
Instance was added: true
Instance was added: true
Elements in hashSet: 2
Element contains e and e2 such that (e==null ? e2==null : e.equals(e2)): true
As the example above clearly shows, HashSet was able to add two elements where e.equals(e2).
I'm going to assume that this is not a bug in Java and that there is in fact some perfectly rational explanation for why this is. But I can't figure out what exactly. What am I missing?
I think what you're really trying to ask is:
"Why does a HashSet add objects with unequal hash codes even if they claim to be equal?"
The distinction between my question and the question you posted is that you're assuming this behavior is a bug, and therefore you're getting grief for coming at it from that perspective. I think the other posters have done a thoroughly sufficient job of explaining why this is not a bug, however they have not addressed the underlying question.
I will try to do so here; I would suggest rephrasing your question to remove the accusations of poor documentation / bugs in Java so you can more directly explore why you're running into the behavior you're seeing.
The equals() documentation states (emphasis added):
Note that it is generally necessary to override the hashCode method whenever this method is overridden, so as to maintain the general contract for the hashCode method, which states that equal objects must have equal hash codes.
The contract between equals() and hashCode() isn't just an annoying quirk in the Java specification. It provides some very valuable benefits in terms of algorithm optimization. By being able to assume that a.equals(b) implies a.hashCode() == b.hashCode() we can do some basic equivalence tests without needing to call equals() directly. In particular, the invariant above can be turned around - a.hashCode() != b.hashCode() implies a.equals(b) will be false.
If you look at the code for HashMap (which HashSet uses internally), you'll notice an inner static class Entry, defined like so:
static class Entry<K,V> implements Map.Entry<K,V> {
final K key;
V value;
Entry<K,V> next;
int hash;
...
}
HashMap stores the key's hash code along with the key and value. Because a hash code is expected to not change over the time a key is stored in the map (see Map's documentation, "The behavior of a map is not specified if the value of an object is changed in a manner that affects equals comparisons while the object is a key in the map.") it is safe for HashMap to cache this value. By doing so, it only needs to call hashCode() once for each key in the map, as opposed to every time the key is inspected.
Now let's look at the implementation of put(), where we see these cached hashes being taken advantage of, along with the invariant above:
public V put(K key, V value) {
...
int hash = hash(key);
int i = indexFor(hash, table.length);
for (Entry<K,V> e = table[i]; e != null; e = e.next) {
Object k;
if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
// Replace existing element and return
}
}
// Insert new element
}
In particular, notice that the conditional only ever calls key.equals(k) if the hash codes are equal and the key isn't the exact same object, due to short-circuit evaluation. By the contract of these methods, it should be safe for HashMap to skip this call. If your objects are incorrectly implemented, these assumptions being made by HashMap are no longer true, and you will get back unusable results, including "duplicates" in your set.
Note that your claim "HashSet ... has an add(Object o) method, which is not inherited from another class" is not quite correct. While its parent class, AbstractSet, does not implement this method, the parent interface, Set, does specify the method's contract. The Set interface is not concerned with hashes, only equality, therefore it specifies the behavior of this method in terms of equality with (e==null ? e2==null : e.equals(e2)). As long as you follow the contracts, HashSet works as documented, but avoids actually doing wasteful work whenever possible. As soon as you break the rules however, HashSet cannot be expected to behave in any useful way.
Consider also that if you attempted to store objects in a TreeSet with an incorrectly implemented Comparator, you would similarly see nonsensical results. I documented some examples of how a TreeSet behaves when using an untrustworthy Comparator in another question: how to implement a comparator for StringBuffer class in Java for use in TreeSet?
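For contrast, here is a minimal sketch of a class that keeps the contract, assuming equality is defined by a single name field. The class GoodHashCodeClass is hypothetical and only mirrors the shape of the question's example.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

class GoodHashCodeClass {
    private final String name;

    GoodHashCodeClass(String name) {
        this.name = name;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof GoodHashCodeClass)) return false;
        return name.equals(((GoodHashCodeClass) o).name);
    }

    @Override
    public int hashCode() {
        // Derived from exactly the fields that equals() compares.
        return Objects.hash(name);
    }

    public static void main(String[] args) {
        Set<GoodHashCodeClass> set = new HashSet<>();
        System.out.println(set.add(new GoodHashCodeClass("x"))); // true
        System.out.println(set.add(new GoodHashCodeClass("x"))); // false
        System.out.println(set.size());                          // 1
    }
}
```

Because equal objects now produce equal hash codes, the second add() reaches the equals() check and the set rejects the duplicate as documented.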
You've violated the contract of equals/hashCode basically:
From the hashCode() docs:
If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.
and from equals:
Note that it is generally necessary to override the hashCode method whenever this method is overridden, so as to maintain the general contract for the hashCode method, which states that equal objects must have equal hash codes.
HashSet relies on equals and hashCode being implemented consistently - the Hash part of the name HashSet basically implies "This class uses hashCode for efficiency purposes." If the two methods are not implemented consistently, all bets are off.
This shouldn't happen in real code, because you shouldn't be violating the contract in real code...
@Override
public int hashCode(){
return new Random().nextInt();
}
You are returning a different hash code for the same object every time it is evaluated, so you will obviously get wrong results.
add() function is as follows
public boolean add(E e) {
return map.put(e, PRESENT)==null;
}
and put() is
public V put(K key, V value) {
if (key == null)
return putForNullKey(value);
int hash = hash(key.hashCode());
int i = indexFor(hash, table.length);
for (Entry<K,V> e = table[i]; e != null; e = e.next) {
Object k;
if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
V oldValue = e.value;
e.value = value;
e.recordAccess(this);
return oldValue;
}
}
modCount++;
addEntry(hash, key, value, i);
return null;
}
Notice that the hash is calculated first. In your case it differs on every call, which is why the object is added again. equals() comes into the picture only if the hashes of the two objects are the same, i.e. a collision has occurred. When the hashes differ, equals() is never executed:
if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
Read up on short-circuit evaluation: since e.hash == hash is false, nothing else is evaluated.
I hope this helps.
Because hashCode() is implemented so badly here, each add() looks in a random bucket and so almost never meets the element that is already stored. If you returned a constant value from hashCode(), the set would not let you add a second equal element.
It is not required that hash codes be different for all elements! It is only required that two elements are not equal.
The hash code is used first to find the hash bucket the object should occupy. If the hash codes are different, the objects are assumed not to be equal. If the hash codes are equal, then the equals() method is used to determine equality. The use of hashCode() is an efficiency mechanism.
And...
Your hash code implementation violates the contract that it should not change unless the object's identifying fields change.
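To illustrate the point about constant hash codes, here is a small sketch. ConstantHashCodeClass is a hypothetical variant of the question's class: every instance is equal to every other instance, and hashCode() always returns the same value, which is legal but puts every element in the same bucket.

```java
import java.util.HashSet;
import java.util.Set;

class ConstantHashCodeClass {
    @Override
    public int hashCode() {
        // Stable and identical for all instances: allowed by the contract, but slow for large sets.
        return 42;
    }

    @Override
    public boolean equals(Object o) {
        // All instances of this class are considered equal to each other.
        return o instanceof ConstantHashCodeClass;
    }

    public static void main(String[] args) {
        Set<ConstantHashCodeClass> set = new HashSet<>();
        System.out.println(set.add(new ConstantHashCodeClass())); // true
        System.out.println(set.add(new ConstantHashCodeClass())); // false
        System.out.println(set.size());                           // 1
    }
}
```

Since both instances land in the same bucket, HashSet does call equals() on the second add() and finds the existing element.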
Am somewhat confused with Java's compareTo() and Collections.sort() behavior.
I am supposed to sort a column in ascending order using compareTo() & Collections.sort().
My criteria are (if the same number occurs, then sort by the next available column):
(1) Document Number
(2) Posting Date
(3) Transaction Date
(4) Transaction Reference Number Comparison
Here's the compareTo() code that Collections.sort() relies on (the sort itself is invoked from a calling method):
public int compareTo(CreditCardTransactionDetail t) {
    int comparison = 0;
    int documentNumberComparison = this.getDocumentNumber().compareTo(t.getDocumentNumber());
    if (documentNumberComparison != 0) {
        comparison = documentNumberComparison;
    }
    else {
        int postingDateComparison = this.getTransactionPostingDate().compareTo(t.getTransactionPostingDate());
        if (postingDateComparison != 0) {
            comparison = postingDateComparison;
        }
        else {
            int transactionDateComparison = this.getTransactionDate().compareTo(t.getTransactionDate());
            if (transactionDateComparison != 0) {
                comparison = transactionDateComparison;
            }
            else {
                int transactionRefNumberComparison = this.getTransactionReferenceNumber().compareTo(t.getTransactionReferenceNumber());
                LOG.info("\n\n\t\ttransactionRefNumberComparison = " + transactionRefNumberComparison + "\n\n");
                if (transactionRefNumberComparison != 0) {
                    comparison = transactionRefNumberComparison;
                }
            }
        }
    }
    return comparison;
}
Question(s):
(1) Am I doing the right thing? When a comparison = 0, it returns as -2. Is this correct behavior because I always thought it to be between -1,0,1.
(2) Should I be using the comparator?
Happy programming...
To address your specific questions:
Yes, that looks fine. The result does not have to be -1, 0 or 1. Your code could be slightly less verbose, though, and just return as soon as it finds a result without using the comparison variable at all.
If you're implementing Comparable, no need to deal with a Comparator. It's for when you need to compare something that isn't Comparable or need to compare in a different way.
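The "return as soon as it finds a result" suggestion could look roughly like the sketch below. Txn is a hypothetical, cut-down stand-in for CreditCardTransactionDetail with only two of the four fields, and the field types (String and LocalDate) are assumptions.

```java
import java.time.LocalDate;

class Txn implements Comparable<Txn> {
    final String documentNumber;
    final LocalDate postingDate;

    Txn(String documentNumber, LocalDate postingDate) {
        this.documentNumber = documentNumber;
        this.postingDate = postingDate;
    }

    @Override
    public int compareTo(Txn t) {
        int c = documentNumber.compareTo(t.documentNumber);
        if (c != 0) {
            return c; // the first differing field decides; may be any negative or positive int
        }
        // Document numbers are equal: fall through to the next criterion.
        return postingDate.compareTo(t.postingDate);
    }
}
```

Further criteria follow the same pattern: compare, return if non-zero, otherwise move on to the next field.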
Guava's ComparisonChain class makes a compareTo method like this incredibly easy:
public int compareTo(CreditCardTransactionDetail o) {
return ComparisonChain.start()
.compare(getDocumentNumber(), o.getDocumentNumber())
.compare(getTransactionPostingDate(), o.getTransactionPostingDate())
.compare(getTransactionDate(), o.getTransactionDate())
.compare(getTransactionReferenceNumber(), o.getTransactionReferenceNumber())
.result();
}
Answer for (1): It's correct. see javadoc of Comparator.compare(T, T): "a negative integer, zero, or a positive integer as the first argument is less than, equal to, or greater than the second."
Or use Google Guava, which wraps Comparator for easier and more powerful usage:
//Write 4 Comparators for each comparable field
Ordering ordering = Ordering.from(comparatorDocumentNumber)
.compound(comparatorTransactionPostingDate)
.compound(comparatorTransactionDate)
.compound(comparatorTransactionReferenceNumber);
Collections.sort(list, ordering);
It decouples each Comparator, it's easy to change/ add/ remove fields order.
EDIT: see ColinD's lighter solution.
Your compareTo is reasonable enough. compareTo can return values other than -1,0,1. Just negative, 0 and positive.
You should be using a comparator.
According to the Comparable Documentation, compareTo():
Returns a negative integer, zero, or a positive integer as this object
is less than, equal to, or greater than the specified object.
So -2 is a valid result.
That's a matter of preference, really. Personally I prefer using a Comparator, but compareTo() works just as well. In either case, your code would look pretty much the same.
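If you go the Comparator route on Java 8 or later, the JDK's own Comparator.comparing() and thenComparing() can express the same chain without Guava. This is only a sketch: CardTxn is a hypothetical, reduced stand-in for CreditCardTransactionDetail with two String fields.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class CardTxn {
    private final String documentNumber;
    private final String referenceNumber;

    CardTxn(String documentNumber, String referenceNumber) {
        this.documentNumber = documentNumber;
        this.referenceNumber = referenceNumber;
    }

    String getDocumentNumber() { return documentNumber; }

    String getReferenceNumber() { return referenceNumber; }

    // Document number first; the reference number is consulted only on ties.
    static final Comparator<CardTxn> ORDER =
            Comparator.comparing(CardTxn::getDocumentNumber)
                      .thenComparing(CardTxn::getReferenceNumber);

    public static void main(String[] args) {
        List<CardTxn> list = new ArrayList<>();
        list.add(new CardTxn("D2", "R1"));
        list.add(new CardTxn("D1", "R2"));
        list.add(new CardTxn("D1", "R1"));
        list.sort(ORDER);
        for (CardTxn t : list) {
            System.out.println(t.getDocumentNumber() + " " + t.getReferenceNumber());
        }
        // prints "D1 R1", "D1 R2", "D2 R1" on separate lines
    }
}
```

Collections.sort(list, ORDER) would work equally well in place of list.sort(ORDER).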
Is checking for key existence in HashMap always necessary?
I have a HashMap with say a 1000 entries and I am looking at improving the efficiency.
If the HashMap is being accessed very frequently, then checking for the key existence at every access will lead to a large overhead. Instead if the key is not present and hence an exception occurs, I can catch the exception. (when I know that this will happen rarely). This will reduce accesses to the HashMap by half.
This might not be a good programming practice, but it will help me reduce the number of accesses. Or am I missing something here?
[Update] I do not have null values in the HashMap.
Do you ever store a null value? If not, you can just do:
Foo value = map.get(key);
if (value != null) {
...
} else {
// No such key
}
Otherwise, you could just check for existence if you get a null value returned:
Foo value = map.get(key);
if (value != null) {
...
} else {
// Key might be present...
if (map.containsKey(key)) {
// Okay, there's a key but the value is null
} else {
// Definitely no such key
}
}
You won't gain anything by checking that the key exists. This is the code of HashMap:
@Override
public boolean containsKey(Object key) {
Entry<K, V> m = getEntry(key);
return m != null;
}
@Override
public V get(Object key) {
Entry<K, V> m = getEntry(key);
if (m != null) {
return m.value;
}
return null;
}
Just check if the return value for get() is different from null.
This is the HashMap source code.
Resources :
HashMap source code Bad one
HashMap source code Good one
A better way is to use the containsKey() method of HashMap. Tomorrow somebody may add a null value to the map, and you should be able to differentiate between a key being absent and a key being mapped to null.
Do you mean that you've got code like
if(map.containsKey(key)) doSomethingWith(map.get(key))
all over the place ? Then you should simply check whether map.get(key) returned null and that's it.
By the way, HashMap doesn't throw exceptions for missing keys, it returns null instead. The only case where containsKey is needed is when you're storing null values, to distinguish between a null value and a missing value, but this is usually considered bad practice.
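A small sketch of that behavior, using a throwaway map with made-up keys: get() returns null both for a missing key and for a key mapped to null, and only containsKey() tells the two apart.

```java
import java.util.HashMap;
import java.util.Map;

class MissingKeyDemo {
    public static void main(String[] args) {
        Map<String, String> map = new HashMap<>();
        map.put("present", "value");
        map.put("nullValued", null);

        System.out.println(map.get("missing"));            // null, and no exception is thrown
        System.out.println(map.get("nullValued"));         // null as well
        System.out.println(map.containsKey("missing"));    // false
        System.out.println(map.containsKey("nullValued")); // true
    }
}
```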
Just use containsKey() for clarity. It's fast and keeps the code clean and readable. The whole point of HashMaps is that the key lookup is fast, just make sure the hashCode() and equals() are properly implemented.
if(map.get(key) != null || (map.get(key) == null && map.containsKey(key)))
You can also use the computeIfAbsent() method in the HashMap class.
In the following example, map stores a list of transactions (integers) that are applied to the key (the name of the bank account). To add 2 transactions of 100 and 200 to checking_account you can write:
HashMap<String, ArrayList<Integer>> map = new HashMap<>();
// computeIfAbsent() returns the list; ArrayList.add() returns boolean, so the add() calls cannot be chained
ArrayList<Integer> transactions = map.computeIfAbsent("checking_account", key -> new ArrayList<>());
transactions.add(100);
transactions.add(200);
This way you don't have to check to see if the key checking_account exists or not.
If it does not exist, one will be created and returned by the lambda expression.
If it exists, then the value for the key will be returned by computeIfAbsent().
Really elegant! 👍
I usually use the idiom
Object value = map.get(key);
if (value == null) {
value = createValue(key);
map.put(key, value);
}
This means you only hit the map twice if the key is missing
If the key class is yours, make sure its hashCode() and equals() methods are implemented.
Basically, access to a HashMap should be O(1), but with a poor hashCode() implementation it becomes O(n), because entries whose keys have the same hash are stored in a linked list.
Jon Skeet's answer handles the two scenarios (maps with and without null values) efficiently.
I would like to add something about the number of entries and the efficiency concern.
I have a HashMap with say a 1.000 entries and I am looking at improving
the efficiency. If the HashMap is being accessed very frequently, then
checking for the key existence at every access will lead to a large
overhead.
A map with 1,000 entries is not a huge map.
Neither is a map with 5,000 or 10,000 entries.
Maps are designed for fast retrieval at such sizes.
This assumes that hashCode() of the map keys provides a good distribution.
If you can use Integer as the key type, do so.
Its hashCode() method is very efficient, since distinct int values always yield distinct hash codes:
public final class Integer extends Number implements Comparable<Integer> {
...
@Override
public int hashCode() {
return Integer.hashCode(value);
}
public static int hashCode(int value) {
return value;
}
...
}
If you have to use another built-in type for the key, such as String, which is often used in maps, you may get some collisions, but with one thousand to a few thousand objects in the map you should see very few, as String.hashCode() provides a good distribution.
If you use a custom type, override hashCode() and equals() correctly and make sure that hashCode() provides a fair distribution.
See Item 9 of Effective Java for details.
Here's a post that details the way.
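Since Java 8 you can use getOrDefault() (the var keyword shown here additionally requires Java 10):
var item = mapObject.getOrDefault(key, null);
if (item != null) {
    // the key is present and mapped to a non-null value
}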
Consider the code:
class A {
    private int i;

    @Override
    public boolean equals(Object t) {
        if (this == t)
            return true;
        if (!(t instanceof A))
            return false;
        return this.i == ((A) t).i;
    }
}
Map<String,A> orig;
Map<String,B> dup;
I am trying to do this
orig.entrySet().removeAll(dup.entrySet());
I see that the equals method is called; is this always true, or might it call compareTo instead?
Yes, it calls equals(). compareTo() could only be used if the Set knew that it contained Comparable objects (sorted sets, for instance, might possibly do this).
It depends on the implementation.
For instance, a HashSet will use hashCode and equals. A TreeSet will probably use compareTo. Ultimately, so long as your types behave appropriately it shouldn't matter.
The TreeSet uses the compareTo, try this:
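import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
public class A {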
private int i;
A(int i) {
this.i = i;
}
@Override
public boolean equals(Object t) {
if (this == t)
return true;
if (!( t instanceof A))
return false;
return (this.i == ((A)t).i);
}
public static void main(String[] args) {
List<A> remove = Arrays.asList(new A(123), new A(789));
Set<A> set = new TreeSet<A>(new Comparator<A>() {
@Override
public int compare(A o1, A o2) {
return o1.i - o2.i;
// return 0; // everything get removed
}
});
set.add(new A(123));
set.add(new A(456));
set.add(new A(789));
set.add(new A(999));
set.removeAll(remove);
for (A a : set) {
System.out.println(a.i);
}
System.out.println("done");
}
}
Make the Comparator always return 0 and everything will be removed! The same happens if you implement Comparable instead of using a Comparator.
The TreeSet is based on a TreeMap, which uses compareTo (or the Comparator) in getEntry.
In the Javadoc of the TreeSet you can (finally) read:
...the Set interface is defined in terms of the equals operation, but a TreeSet instance performs all element comparisons using its compareTo (or compare) method...
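A quick way to see this difference is to put the same strings into a HashSet and into a TreeSet built with String.CASE_INSENSITIVE_ORDER. The sketch below uses a made-up class name, TreeSetEqualityDemo; the comparator constant is part of the JDK.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

class TreeSetEqualityDemo {
    public static void main(String[] args) {
        Set<String> hashSet = new HashSet<>();
        Set<String> treeSet = new TreeSet<>(String.CASE_INSENSITIVE_ORDER);

        for (String s : new String[] {"a", "A"}) {
            hashSet.add(s);
            treeSet.add(s);
        }

        System.out.println(hashSet.size());        // 2: "a".equals("A") is false
        System.out.println(treeSet.size());        // 1: compare("a", "A") returns 0
        System.out.println(treeSet.contains("A")); // true, although only "a" was stored
    }
}
```

The TreeSet never calls equals() here; membership is decided entirely by the comparator.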
http://java.sun.com/j2se/1.5.0/docs/api/java/util/Collection.html
"Implementations are free to implement optimizations whereby the equals invocation is avoided, for example, by first comparing the hash codes of the two elements."
Most likely will use equals, but considering the statement above, you cannot fully rely on equals() to be called. Remember that it's always a good idea to override hashCode() whenever you override equals().
Some Set implementations rely on hashCode (e.g. HashSet). That is why you should always override hashCode too when you override equals.
The only implementation within the Java library that I am aware of that won't do this is IdentityHashMap. TreeMap for instance does not have an appropriate Comparator.
I don't see where compareTo is used; the javadoc for remove() for the Map interface says "More formally, if this map contains a mapping from key k to value v such that (key==null ? k==null : key.equals(k)), that mapping is removed." While for the Set interface it similarly says "More formally, removes an element e such that (o==null ? e==null : o.equals(e)), if the set contains such an element."
Note that removeAll()'s javadoc doesn't say how it operates, which means, as others have said, that it's an implementation detail.
In Sun's Java, according to Bloch in his Effective Java (if I remember correctly), it iterates over the collection and calls remove(), but he stresses that you must never assume that's how it's always done.