I have a scenario in my code where I need to compare two Lists and remove from the first list the objects which are present in the second list, akin to how the "removeAll" method works for List. Since my List is created on a custom object, the removeAll method won't work for me.
I have tried various methods to make this work:
- implemented equals() and hashCode for the custom object comprising the list
- implemented the Comparable Interface for the custom object
- implemented the Comparator Interface for the custom object
I've even tried using the Apache Commons CollectionUtils and ListUtils methods (subtract, intersect, removeAll). None seem to work.
I understand I will perhaps need to write some custom removal code, but I am not sure how to go about doing that. Any pointers helping me move in the right direction will be really appreciated.
Thanks,
Jay
Java Collections already cater for your scenario. Call Collection.removeAll(Collection) on the first list and it will remove every element that is also contained in the passed-in collection, using the equals() method to test for equality.
List<String> list1 = new ArrayList<String>();
Collections.addAll(list1, "one", "two", "three", "four");
List<String> list2 = new ArrayList<String>();
Collections.addAll(list2, "three", "four", "five");
list1.removeAll(list2); // now contains "one", "two"
To make this work the objects you're storing just need to properly implement the equals/hashCode contract, which is: given any two objects a and b:
a.equals(b) == b.equals(a)
and:
a.hashCode() == b.hashCode() if a.equals(b)
Improperly defined equals and hashCode methods lead to unpredictable behaviour and are a common cause of collection-related issues.
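As a minimal sketch of the point above, here is removeAll working on a custom object once equals() and hashCode() are overridden consistently. The Customer class, its fields and the RemoveAllDemo helper are made up for illustration; they are not from the question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

class RemoveAllDemo {
    // Returns a copy of first without the elements that equal an element of second.
    static List<Customer> subtract(List<Customer> first, List<Customer> second) {
        List<Customer> result = new ArrayList<>(first);
        result.removeAll(second); // relies on Customer.equals()
        return result;
    }

    public static void main(String[] args) {
        List<Customer> first = Arrays.asList(new Customer("Ann", 1), new Customer("Bob", 2));
        List<Customer> second = Arrays.asList(new Customer("Bob", 2));
        System.out.println(subtract(first, second)); // prints [Ann#1]
    }
}

// Hypothetical value class; name and fields are assumptions for this sketch.
final class Customer {
    private final String name;
    private final int id;

    Customer(String name, int id) {
        this.name = name;
        this.id = id;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Customer)) return false;
        Customer other = (Customer) o;
        return id == other.id && Objects.equals(name, other.name);
    }

    @Override
    public int hashCode() {
        // Built from the same fields as equals(), so equal objects share a hash code.
        return Objects.hash(name, id);
    }

    @Override
    public String toString() {
        return name + "#" + id;
    }
}
```

Note that the two "Bob" objects are distinct instances; they match only because equals() compares state.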
Overriding equals and hashCode methods is enough to make method removeAll work on custom objects.
It's likely that you didn't override them in a proper way. Some code will help us a lot.
You said:
... Since my List is created on a custom object, the removeAll method won't work for me.
As others have stated, .removeAll() should work for the scenario you described, even for custom objects, as long as the custom objects obey the contracts that Java Collections expects of its objects, including properly implemented equals() and hashCode() methods.
I have tried various methods to make this work: - implemented equals() and hashCode for the custom object comprising the list - implemented the Comparable Interface for the custom object - implemented the Comparator Interface for the custom object ...
It sounds like you are shot-gunning different approaches: coding one, trying it, quickly coding another, trying it, coding yet another, ... It may be worthwhile to slow down and try to understand why each approach failed and/or determine why that approach won't work for your situation, before moving on to the next. If you've already investigated and determined why each approach won't work, please explain in your question. If you haven't, then let us help by posting code.
Since most people agree the first approach (.removeAll()) should have worked and since custom objects are involved, why not take a quick look at this StackOverflow question to see if anything jumps out at you:
Overriding equals and hashCode in Java
"What issues / pitfalls do I need to consider when overriding equals and hashCode in a java class?"
I have found that his original statement is true when you write your own collection class. removeAll works automatically only if you override remove in the iterator. Just overriding remove in the Collection is not enough, because removeAll (and clear and retainAll) all use the iterator to do their work. Since you should not change the underlying collection while iterating except through the iterator's own remove, if you do not override remove in the iterator, then removeAll, clear and retainAll will not work. If you throw an UnsupportedOperationException in the iterator's remove method, that is what you will see when you call any of the three methods discussed.
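To illustrate that last answer, here is a rough sketch of a collection built on java.util.AbstractCollection whose iterator supports remove(). SimpleBag and SimpleBagDemo are invented names; the only assumption about the JDK is that AbstractCollection implements removeAll, retainAll and clear through the iterator, as the answer describes.

```java
import java.util.AbstractCollection;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class SimpleBagDemo {
    static SimpleBag<String> bagOf(String... values) {
        SimpleBag<String> bag = new SimpleBag<>();
        bag.addAll(Arrays.asList(values)); // inherited addAll calls add() per element
        return bag;
    }

    public static void main(String[] args) {
        SimpleBag<String> bag = bagOf("one", "two", "three");
        bag.removeAll(Arrays.asList("two", "three"));
        System.out.println(bag); // prints [one]
    }
}

// Hypothetical minimal collection; only add, size and iterator are written by hand.
class SimpleBag<E> extends AbstractCollection<E> {
    private final List<E> items = new ArrayList<>();

    @Override
    public boolean add(E e) {
        return items.add(e);
    }

    @Override
    public int size() {
        return items.size();
    }

    @Override
    public Iterator<E> iterator() {
        final Iterator<E> inner = items.iterator();
        return new Iterator<E>() {
            @Override public boolean hasNext() { return inner.hasNext(); }
            @Override public E next() { return inner.next(); }
            // Without this override the default remove() throws
            // UnsupportedOperationException, and the inherited removeAll,
            // retainAll and clear fail as soon as they try to remove an element.
            @Override public void remove() { inner.remove(); }
        };
    }
}
```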
Related
I'm using a TreeMap (SortedMap) whose keys are Object[] with elements of varying types.
TreeMap's equals() doesn't work on Object[] like Arrays's equals() would do -- which means it won't work when using its methods like containsKey() and get() unless I work around it.
Is there somewhere a solution for this that doesn't involve creating a whole new Class?
EDIT :
Just to make it clear, I made a mistaken assumption. Supplying a new Comparator(){} does also affect every method that relies on key equality, such as containsKey() and get(), not only the tree sorter.
Is there somewhere a solution for this that doesn't involve creating a whole new Class?
No. In fact, you shouldn't be using mutable values for map keys at all.
While I agree with Matt Ball that you generally shouldn't use mutable (changeable) types as your keys, it is possible to use a TreeMap in this manner as long as you are not planning on modifying the arrays once they are in the tree.
This solution does involve the creation of a class, but not a new Map class, which is what it seems you are asking. Instead, you would need to create your own class which implements Comparator<Object[]> that can compare arrays. The class could use the Arrays.equals() method to determine if they are equal, but would need to also have a consistent rule to determine which array comes before another array when the arrays are not equal.
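A rough sketch of such a comparator follows. The ordering rule is my own assumption, not something from the question: shorter arrays come first, nulls sort before non-nulls, elements of different types are ordered by class name, and elements of the same type are assumed to be mutually Comparable (String, Integer and so on). It returns 0 exactly when Arrays.equals() would return true for arrays that meet those assumptions.

```java
import java.util.Comparator;
import java.util.TreeMap;

class ObjectArrayKeyDemo {
    public static void main(String[] args) {
        TreeMap<Object[], String> map = new TreeMap<>(new ObjectArrayComparator());
        map.put(new Object[] {"a", 1}, "first");
        // A different array instance with equal contents finds the same entry.
        System.out.println(map.get(new Object[] {"a", 1})); // prints first
    }
}

// Hypothetical comparator; the ordering rule is an assumption of this sketch.
class ObjectArrayComparator implements Comparator<Object[]> {
    @Override
    public int compare(Object[] a, Object[] b) {
        if (a.length != b.length) {
            return Integer.compare(a.length, b.length);
        }
        for (int i = 0; i < a.length; i++) {
            int c = compareElements(a[i], b[i]);
            if (c != 0) {
                return c;
            }
        }
        return 0;
    }

    @SuppressWarnings("unchecked")
    private static int compareElements(Object x, Object y) {
        if (x == y) return 0;
        if (x == null) return -1; // nulls sort first
        if (y == null) return 1;
        if (x.getClass() != y.getClass()) {
            // Different types: fall back to class name so the rule stays consistent.
            return x.getClass().getName().compareTo(y.getClass().getName());
        }
        // Assumption: same-typed elements implement Comparable.
        return ((Comparable<Object>) x).compareTo(y);
    }
}
```

As noted above, the arrays must not be modified after they have been used as keys.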
I am looking for a way to determine if a Collection (or maybe even any Iterable) is guaranteed to be ordered by its class contract.
I already know the Guava method : Ordering.natural().isOrdered(myCollection)
But this method is not relevant to my needs, because it checks whether the values inside the collection are ordered. That's not what I need to determine; what I want is an isSorted method that will behave like this:
isSorted(new HashSet()) -> false
isSorted(new ArrayList()) -> true
etc...
What I am looking for would typically be implemented by checking the class of the collection and comparing it to some kind of reference table of the collections whose contract states that they are ordered, returning true only for those.
Do you know if something like this already exists in some library ?
You can do the following to determine if a collection is defined to be sorted.
collection instanceof SortedSet
There are three interfaces for ordered collections: List, SortedSet and SortedMap. You can check if your class is implementing one of them.
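A small sketch of that interface check, with the caveat that it is only a heuristic: OrderContracts and hasDefinedOrder are names I made up, and a class such as LinkedHashSet, which keeps insertion order but implements none of the three interfaces, is reported as unordered.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.SortedMap;
import java.util.SortedSet;
import java.util.TreeSet;

class OrderContracts {
    // True only for the three interfaces named above; everything else is
    // treated as having no guaranteed order.
    static boolean hasDefinedOrder(Object collection) {
        return collection instanceof List
                || collection instanceof SortedSet
                || collection instanceof SortedMap;
    }

    public static void main(String[] args) {
        System.out.println(hasDefinedOrder(new HashSet<String>()));   // prints false
        System.out.println(hasDefinedOrder(new ArrayList<String>())); // prints true
        System.out.println(hasDefinedOrder(new TreeSet<String>()));   // prints true
    }
}
```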
No, this doesn't exist in any library, and for good reason.
That library would have to know all the collection types that are flying around. If you're using Apache Commons Collections, it'd have to know about all of those. If you're using Guava, it'd have to know about all of those. If someone comes along and introduces a new collection type, you're now going to reject that type, even if it's ordered.
It doesn't make sense to provide that method in a library that can't know what other libraries you might have with whatever other collection types might be out there.
In an end application, it might make sense to implement it, with the heuristic techniques you've already been describing.
It might help if we knew what you were actually trying to do with this method.
This is in Java, and there is one catch. The classes are already compiled, so their hashCode() and equals() methods cannot be overridden. Throwing the list into a Set and then back into a List will not work, because the criteria for uniqueness are not currently defined in equals() and that method cannot be overridden.
You should still be able to create subclasses and create equals and hashcode methods that work, unless the classes/methods are final.
If that is the case, you could use composition, basically create a wrapper for the things you are putting in the collection, and have the wrapper's equals and hashcode implement the contract correctly, for the thing being wrapped.
You are in a tough position, because what I am reading is that the original classes are not following the contract for equals and hashcode which is a real problem in Java. It's a pretty serious bug.
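The composition idea from this answer might look roughly like the sketch below. LegacyItem stands in for the compiled class that cannot be changed, and the choice of getCode() as the uniqueness criterion is purely an assumption for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Objects;
import java.util.Set;

class WrapperDedup {
    // Removes duplicates by wrapping each item in a key with a proper equals/hashCode.
    static List<LegacyItem> distinct(List<LegacyItem> input) {
        Set<LegacyItemKey> seen = new LinkedHashSet<>(); // keeps first-seen order
        for (LegacyItem item : input) {
            seen.add(new LegacyItemKey(item));
        }
        List<LegacyItem> out = new ArrayList<>();
        for (LegacyItemKey key : seen) {
            out.add(key.unwrap());
        }
        return out;
    }

    public static void main(String[] args) {
        List<LegacyItem> items = Arrays.asList(
                new LegacyItem("A"), new LegacyItem("B"), new LegacyItem("A"));
        System.out.println(distinct(items).size()); // prints 2
    }
}

// Stand-in for a compiled third-party class we cannot modify (hypothetical).
final class LegacyItem {
    private final String code;

    LegacyItem(String code) {
        this.code = code;
    }

    String getCode() {
        return code;
    }
}

// Wrapper that supplies the equality the wrapped class lacks.
final class LegacyItemKey {
    private final LegacyItem item;

    LegacyItemKey(LegacyItem item) {
        this.item = item;
    }

    LegacyItem unwrap() {
        return item;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof LegacyItemKey
                && Objects.equals(item.getCode(), ((LegacyItemKey) o).item.getCode());
    }

    @Override
    public int hashCode() {
        return Objects.hashCode(item.getCode());
    }
}
```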
Write a custom Comparator for your objects and use Collections.sort() to sort your list. Then remove duplicates by going through the list in a loop.
The comparator's compare method returns a negative number, zero, or a positive number; if it returns 0 for two neighbouring elements, remove one of them from the list.
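A sketch of the sort-then-scan idea, under the assumption that the caller supplies a Comparator that returns 0 exactly for the elements that should count as duplicates. SortedDedup and distinctSorted are invented names, and the result comes back sorted rather than in the original order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;

class SortedDedup {
    // Sorts a copy of the input, then drops every element that compares
    // equal (compare == 0) to the element kept just before it.
    static <T> List<T> distinctSorted(List<T> input, Comparator<? super T> cmp) {
        List<T> sorted = new ArrayList<>(input);
        Collections.sort(sorted, cmp);
        Iterator<T> it = sorted.iterator();
        T previous = null;
        boolean first = true;
        while (it.hasNext()) {
            T current = it.next();
            if (!first && cmp.compare(previous, current) == 0) {
                it.remove(); // duplicate of the previous kept element
            } else {
                previous = current;
                first = false;
            }
        }
        return sorted;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("b", "A", "a", "B", "c");
        System.out.println(distinctSorted(words, String.CASE_INSENSITIVE_ORDER)); // prints [A, b, c]
    }
}
```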
I have 2 linked lists.
I have the same object in both of those lists. By same object, I mean the object has the same state, but is referenced by a different object pointer.
I can call .remove(object); on the first list to remove it, but if I do the same for the second list it is not removed (because the object pointer reference is different).
Is there an easy way to remove objects with the same state from various lists?
Thinking about it, I will probably loop through the second list comparing state on its objects, but I was looking for a cleaner way
Override the equals method for the object. Once equals compares the objects by state, they will be removed correctly from both lists.
Edit - for the sake of correctness:
You should always override the hashCode method when overriding the equals method. Failure to do so may not show any strange functionality in your List but once you try to use the same object in say a HashMap, you may find that the remove or put may not function like you wanted.
If the objects have the same state, then it is probably correct for you to override their equals and hashCode methods to reflect this. If the objects are the same as far as the equals method is concerned, then you can call remove on both lists.
If the linked lists are implemented properly, the fact that different objects are being pointed to in memory should not prevent this from working. According to the List API, the remove method:
...removes the element with the lowest index i such that (o==null ? get(i)==null : o.equals(get(i))) (if such an element exists)...
You must override both equals() and hashCode() on your object. When these are not overridden, the default behaviour is to compare object identity, i.e. the reference. When you override equals() you can change the comparison to be based on the object state, i.e. logical equality. It is important to remember to override hashCode() as well; if this is not done it can lead to strange behaviour when your object is used in a HashSet or Hashtable.
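For what it's worth, here is a small sketch of remove() working across two linked lists that hold different instances with the same state. Token and removeFromBoth are hypothetical names used only for this example.

```java
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;

class LinkedListRemoveDemo {
    // Removes the first element equal to target from each list;
    // returns true only if both lists contained a match.
    static boolean removeFromBoth(List<Token> a, List<Token> b, Token target) {
        boolean removedFromA = a.remove(target);
        boolean removedFromB = b.remove(target);
        return removedFromA && removedFromB;
    }

    public static void main(String[] args) {
        List<Token> first = new LinkedList<>(Arrays.asList(new Token("x"), new Token("y")));
        List<Token> second = new LinkedList<>(Arrays.asList(new Token("x")));
        // A third instance with the same state matches the elements in both lists.
        System.out.println(removeFromBoth(first, second, new Token("x"))); // prints true
    }
}

// Hypothetical class whose equality is based on state, not on the reference.
final class Token {
    private final String text;

    Token(String text) {
        this.text = text;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof Token && text.equals(((Token) o).text);
    }

    @Override
    public int hashCode() {
        return text.hashCode();
    }
}
```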
I have a List which contains objects, and I want to remove from this list all the elements which have the same values in two of their attributes. I had thought about doing something like this:
List<Class1> myList;
....
Set<Class1> mySet = new HashSet<Class1>();
mySet.addAll(myList);
and overriding the hashCode method in Class1 so it returns a number which depends only on the attributes I want to consider.
The problem is that I need to do a different filtering in another part of the application, so I can't override the hashCode method in this way (I would need two different hashCode methods).
What's the most efficient way of doing this filtering without overriding the hashCode method?
Thanks
Overriding hashCode and equals in Class1 (just to do this) is problematic. You end up with your class having an unnatural definition of equality, which may turn out to be wrong for other current and future uses of the class.
Review the Comparator interface and write a Comparator<Class1> implementation to compare instances of your Class1 based on your criteria; e.g. based on those two attributes. Then instantiate a TreeSet<Class1> for duplicate detection using the TreeSet(Comparator) constructor.
EDIT
Comparing this approach with @Tom Hawtin's approach:
The two approaches use roughly comparable space overall. The treeset's internal nodes roughly balance the hashset's array and the wrappers that support the custom equals / hash methods.
The wrapper + hashset approach is O(N) in time (assuming good hashing) versus O(NlogN) for the treeset approach. So that is the way to go if the input list is likely to be large.
The treeset approach wins in terms of the lines of code that need to be written.
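The TreeSet approach could be sketched along these lines. The fields of Class1 are not shown in the question, so the stand-in below (attr1 as a String, attr2 as an int, plus an unrelated payload field) is an assumption, as are the TreeSetFilter and BY_TWO_ATTRS names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class TreeSetFilter {
    // Compares only the two attributes that matter for this particular filter.
    static final Comparator<Class1> BY_TWO_ATTRS = new Comparator<Class1>() {
        @Override
        public int compare(Class1 a, Class1 b) {
            int c = a.attr1.compareTo(b.attr1);
            return c != 0 ? c : Integer.compare(a.attr2, b.attr2);
        }
    };

    // Keeps the first element seen for each (attr1, attr2) pair, in input order.
    static List<Class1> distinct(List<Class1> input) {
        Set<Class1> seen = new TreeSet<>(BY_TWO_ATTRS);
        List<Class1> out = new ArrayList<>();
        for (Class1 c : input) {
            if (seen.add(c)) { // add() returns false for a duplicate
                out.add(c);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Class1> list = Arrays.asList(
                new Class1("x", 1, "first"),
                new Class1("x", 1, "second"),
                new Class1("y", 1, "third"));
        System.out.println(distinct(list).size()); // prints 2
    }
}

// Stand-in for the asker's Class1; field names and types are assumptions.
class Class1 {
    final String attr1;
    final int attr2;
    final String payload;

    Class1(String attr1, int attr2, String payload) {
        this.attr1 = attr1;
        this.attr2 = attr2;
        this.payload = payload;
    }
}
```

A second filter elsewhere in the application would simply use a different Comparator, leaving Class1 itself untouched.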
Let your Class1 implement Comparable. Then use a TreeSet as in your example (i.e. use the addAll method).
As an alternative to what Roman said you can have a look at this SO question about filtering using Predicates. If you use Google Collections anyway this might be a good fit.
I would suggest introducing a class for the concept of the parts of Class1 that you want to consider significant in this context. Then use a HashSet or HashMap.
Sometimes programmers make things too complicated trying to use all the nice features of a language, and the answers to this question are an example. Overriding anything on the class is overkill. What you need is this:
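import java.util.HashSet;
import java.util.List;
import java.util.Objects;
import java.util.Set;
// Key holding only the attributes that matter for this filter.
class MyClass {
    Object attr1;
    Object attr2;
    // Without equals/hashCode the HashSet would compare keys by identity.
    @Override public boolean equals(Object o) {
        if (!(o instanceof MyClass)) return false;
        MyClass other = (MyClass) o;
        return Objects.equals(attr1, other.attr1) && Objects.equals(attr2, other.attr2);
    }
    @Override public int hashCode() { return Objects.hash(attr1, attr2); }
}
List<Class1> list;
Set<Class1> set=....
Set<MyClass> tempset = new HashSet<MyClass>();
for (Class1 c : list) {
    MyClass myc = new MyClass();
    myc.attr1 = c.attr1;
    myc.attr2 = c.attr2;
    // Keep c only if no earlier element had the same two attributes.
    if (!tempset.contains(myc)) {
        tempset.add(myc);
        set.add(c);
    }
}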
Feel free to fix up minor irregularities. There will be some issues depending on what you mean by equality for the attributes (and obvious changes if the attributes are primitive). Sometimes we need to write code, not just use the built-in libraries.