How to implement a compareTo() method that is consistent with equals() and hashCode() - java

I have a class Product, which has three variables:
class Product implements Comparable<Product>{
private Type type; // Type is an enum
Set<Attribute> attributes; // Attribute is a regular class
ProductName name; // ProductName is another enum
}
I used Eclipse to automatically generate the equals() and hashCode() methods:
@Override
public int hashCode() {
final int prime = 31;
int result = 1;
result = prime * result + ((attributes == null) ? 0 : attributes.hashCode());
result = prime * result + ((type == null) ? 0 : type.hashCode());
return result;
}
@Override
public boolean equals(Object obj) {
if (this == obj)
return true;
if (obj == null)
return false;
if (getClass() != obj.getClass())
return false;
Product other = (Product) obj;
if (attributes == null) {
if (other.attributes != null)
return false;
} else if (!attributes.equals(other.attributes))
return false;
if (type != other.type)
return false;
return true;
}
Now in my application I need to sort a Set of Product, so I need to implement the Comparable interface and its compareTo method:
@Override
public int compareTo(Product other){
int diff = type.hashCode() - other.getType().hashCode();
if (diff > 0) {
return 1;
} else if (diff < 0) {
return -1;
}
diff = attributes.hashCode() - other.getAttributes().hashCode();
if (diff > 0) {
return 1;
} else if (diff < 0) {
return -1;
}
return 0;
}
Does this implementation make sense? What if I just want to sort the products based on the String values of "type" and "attributes"? How would I implement that?
Edit:
The reason I want to sort a Set of Product is that I have a JUnit test which asserts on the string value of a HashSet. My goal is to get the same output order every time by sorting the set. Otherwise, even if the Set's values are the same, the assertion will fail because of the unpredictable iteration order of a HashSet.
Edit2:
Through the discussion, it became clear that asserting on the String value of a HashSet isn't a good idea in unit tests. For my situation I have written a sort() function that sorts the HashSet's String values in natural order, so it consistently outputs the same String for my unit tests, and that suffices for now. Thanks all.

From all the comments here, it looks like you don't need a Comparator at all, because:
1) You are using a HashSet, which does not work with a Comparator. It is not ordered.
2) You just need to make sure that two HashSets containing Products are equal, meaning they are the same size and contain the same Products.
Since you already added hashCode and equals methods to Product, all you need to do is call the equals method on those HashSets.
HashSet<Product> set1 = ...
HashSet<Product> set2 = ...
assertTrue( set1.equals(set2) );
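As a minimal sketch of why the set comparison above does not depend on ordering, the following uses plain String elements as a stand-in for Product (an assumption made only to keep the example short). The class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Set;

public class SetEqualityDemo {
    // Set.equals compares size and membership only; iteration order is irrelevant.
    public static boolean sameElements(Set<String> first, Set<String> second) {
        return first.equals(second);
    }

    public static void main(String[] args) {
        Set<String> set1 = new HashSet<String>(Arrays.asList("apple", "banana", "cherry"));
        // A different Set implementation with a different insertion order.
        Set<String> set2 = new LinkedHashSet<String>(Arrays.asList("cherry", "apple", "banana"));
        System.out.println(sameElements(set1, set2));
    }
}
```

The same holds for a Set of Product, provided Product overrides equals and hashCode consistently.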

This implementation does not seem to be consistent. You have no control over what the hash codes look like. If obj1 < obj2 according to compareTo on the first run, the next time you start your JVM it could be the other way around, obj1 > obj2.
The only thing that you really know is that equal objects give diff == 0 (the converse does not hold, because unequal objects can share a hash code). However, you can also just use the equals method for that check.
It is now up to you to define when obj1 < obj2 or obj1 > obj2. Just make sure that it is consistent.
By the way, are you aware that the current implementation does not include ProductName name in the equals check? I don't know if that is intended, hence the remark.
The question is, what do you know about the attributes? Maybe they implement Comparable (for example if they are Numbers), and then you can order according to their compareTo method. If you know nothing at all about the objects, it will be hard to build a consistent ordering.
If you just want them to be ordered consistently but the ordering itself does not matter, you could give them ids at creation time and sort by those. At this point you could indeed use the hash codes if it does not matter that they can change between JVM runs, but only then.
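One way to get an ordering that is stable across JVM runs is to compare by the enum's name and then by the attributes in sorted order. The sketch below is only an illustration under stated assumptions: the enum constants are made up, and the attributes are modelled as Strings standing in for Attribute (for the real class you would compare whatever stable, comparable value Attribute exposes).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class ProductOrderingDemo {
    enum Type { BOOK, TOY } // hypothetical constants

    static final class Product implements Comparable<Product> {
        private final Type type;
        private final Set<String> attributes; // stand-in for Set<Attribute>

        Product(Type type, String... attributes) {
            this.type = type;
            this.attributes = new HashSet<String>(Arrays.asList(attributes));
        }

        // Sorted copy, so the result does not depend on HashSet iteration order.
        private List<String> sortedAttributes() {
            List<String> sorted = new ArrayList<String>(attributes);
            Collections.sort(sorted);
            return sorted;
        }

        @Override
        public int compareTo(Product other) {
            int byType = type.name().compareTo(other.type.name());
            if (byType != 0) {
                return byType;
            }
            List<String> mine = sortedAttributes();
            List<String> theirs = other.sortedAttributes();
            int common = Math.min(mine.size(), theirs.size());
            for (int i = 0; i < common; i++) {
                int byAttribute = mine.get(i).compareTo(theirs.get(i));
                if (byAttribute != 0) {
                    return byAttribute;
                }
            }
            // All shared positions match: the shorter attribute list sorts first.
            return Integer.compare(mine.size(), theirs.size());
        }

        @Override
        public boolean equals(Object obj) {
            if (this == obj) return true;
            if (!(obj instanceof Product)) return false;
            Product other = (Product) obj;
            return type == other.type && attributes.equals(other.attributes);
        }

        @Override
        public int hashCode() {
            return 31 * type.hashCode() + attributes.hashCode();
        }

        @Override
        public String toString() {
            return type + sortedAttributes().toString();
        }
    }

    // Copies an unordered HashSet into a TreeSet, which sorts with compareTo.
    public static String sortedRendering() {
        Set<Product> products = new HashSet<Product>();
        products.add(new Product(Type.TOY, "red"));
        products.add(new Product(Type.BOOK, "paperback", "english"));
        products.add(new Product(Type.BOOK, "hardcover"));
        return new TreeSet<Product>(products).toString();
    }

    public static void main(String[] args) {
        System.out.println(sortedRendering());
    }
}
```

In this sketch compareTo returns 0 exactly when equals returns true, so a TreeSet built from these products keeps the same elements a HashSet would.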

Related

Comparing two large lists in java

I have two ArrayLists with 1000 objects in each of them. I need to remove all elements in ArrayList 1 which are also in ArrayList 2. Currently I am running two nested loops, which results in 1000 x 1000 operations in the worst case.
List<DataClass> dbRows = object1.get("dbData");
List<DataClass> modifiedData = object1.get("dbData");
List<DataClass> dbRowsForLog = object2.get("dbData");
for (DataClass newDbRows : dbRows) {
boolean found=false;
for (DataClass oldDbRows : dbRowsForLog) {
if (newDbRows.equals(oldDbRows)) {
found=true;
modifiedData.remove(oldDbRows);
break;
}
}
}
public class DataClass{
private int categoryPosition;
private int subCategoryPosition;
private Timestamp lastUpdateTime;
private String lastModifiedUser;
// + so many other variables
public boolean equals(Object o) {
if (this == o) {
return true;
}
if (o == null || getClass() != o.getClass()) {
return false;
}
DataClass dataClassRow = (DataClass) o;
return categoryPosition == dataClassRow.categoryPosition
&& subCategoryPosition == dataClassRow.subCategoryPosition && (lastUpdateTime.compareTo(dataClassRow.lastUpdateTime)==0?true:false)
&& stringComparator(lastModifiedUser,dataClassRow.lastModifiedUser);
}
public String toString(){
return "DataClass[categoryPosition="+categoryPosition+",subCategoryPosition="+subCategoryPosition
+",lastUpdateTime="+lastUpdateTime+",lastModifiedUser="+lastModifiedUser+"]";
}
public static boolean stringComparator(String str1, String str2){
return (str1 == null ? str2 == null : str1.equals(str2));
}
public int hashCode() {
int hash = 7;
hash = 31 * hash + (int) categoryPosition;
hash = 31 * hash + (int) subCategoryPosition;
hash = 31 * hash + (lastModifiedUser == null ? 0 : lastModifiedUser.hashCode());
return hash;
}
}
The best workaround I could think of is to create two sets of strings by calling the toString() method of DataClass and compare the strings. It will result in 1000 (for making set 1) + 1000 (for making set 2) + 1000 (searching in the set) = 3000 operations. I am stuck on Java 7. Is there any better way to do this? Thanks.
Let Java's built-in collection classes handle most of the optimization for you by taking advantage of a HashSet. Its contains method runs in O(1) expected time. I would highly recommend looking up how it achieves this, because it's very interesting.
List<DataClass> a = object1.get("dbData");
HashSet<DataClass> b = new HashSet<>(object2.get("dbData"));
a.removeAll(b);
return a;
And it's all done for you.
EDIT: caveat
In order for this to work, DataClass needs to override Object::hashCode consistently with equals. Otherwise, you can't use any of the hash-based collection algorithms.
EDIT 2: implementing hashCode
An object's hash code does not need to change every time an instance variable changes. The hash code only needs to reflect the instance variables that determine equality.
For example, imagine each object had a unique field private final UUID id. In this case, you could determine if two objects were the same by simply testing the id value. Fields like lastUpdateTime and lastModifiedUser would provide information about the object, but two instances with the same id would refer to the same object, even if the lastUpdateTime and lastModifiedUser of each were different.
The point is that if you really want to optimize this, include as few fields as possible in the hash computation. From your example, it seems like categoryPosition and subCategoryPosition might be enough.
Whatever fields you choose to include, the simplest way to compute a hash code from them is to use Objects::hash rather than running the numbers yourself.
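The points above can be sketched as follows. Row is a hypothetical, cut-down stand-in for DataClass (only three of its fields are modelled), and difference is an illustrative helper name. The code sticks to Java 7 features.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Objects;
import java.util.Set;

public class RemoveAllDemo {
    // Hypothetical stand-in for DataClass.
    static final class Row {
        final int categoryPosition;
        final int subCategoryPosition;
        final String lastModifiedUser;

        Row(int categoryPosition, int subCategoryPosition, String lastModifiedUser) {
            this.categoryPosition = categoryPosition;
            this.subCategoryPosition = subCategoryPosition;
            this.lastModifiedUser = lastModifiedUser;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (o == null || getClass() != o.getClass()) return false;
            Row other = (Row) o;
            return categoryPosition == other.categoryPosition
                    && subCategoryPosition == other.subCategoryPosition
                    && Objects.equals(lastModifiedUser, other.lastModifiedUser);
        }

        @Override
        public int hashCode() {
            // Uses a subset of the equality fields: equal rows still hash alike.
            return Objects.hash(categoryPosition, subCategoryPosition);
        }
    }

    // Returns the rows of dbRows that do not occur in dbRowsForLog.
    public static List<Row> difference(List<Row> dbRows, List<Row> dbRowsForLog) {
        Set<Row> toRemove = new HashSet<Row>(dbRowsForLog);
        List<Row> result = new ArrayList<Row>(dbRows); // copy, so the input stays untouched
        result.removeAll(toRemove);                    // one hash lookup per element
        return result;
    }

    public static int remainingAfterRemoval() {
        List<Row> dbRows = Arrays.asList(
                new Row(1, 1, "ann"), new Row(1, 2, "bob"), new Row(2, 1, "cid"));
        List<Row> dbRowsForLog = Arrays.asList(
                new Row(1, 2, "bob"), new Row(2, 1, "dan"));
        return difference(dbRows, dbRowsForLog).size();
    }

    public static void main(String[] args) {
        System.out.println(remainingAfterRemoval());
    }
}
```

Rows (2, 1, "cid") and (2, 1, "dan") share a hash code but are not equal, so only (1, 2, "bob") is removed; equals still has the final say.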
This is a set difference, A - B: retain only the elements of set A that are not in set B.
If using a Set is acceptable, we can do it as shown below. An ArrayList could be used in place of the Set, but then every remove/retain check has to scan the entire other list.
Set<DataClass> a = new HashSet<>(object1.get("dbData"));
Set<DataClass> b = new HashSet<>(object2.get("dbData"));
a.removeAll(b);
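If ordering is needed, use a TreeSet (DataClass would then have to implement Comparable, or you would have to pass a Comparator).
If possible, have object1.get("dbData") and object2.get("dbData") return a Set directly, which skips the creation of one more intermediate collection.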

TreeSet.contains() does not call overridden equals

I have a problem with the contains() method of TreeSet. As I understand it, contains() should call equals() of the contained objects, as the javadoc says:
boolean java.util.TreeSet.contains(Object o): Returns true if this set
contains the specified element. More formally, returns true if and
only if this set contains an element e such that (o==null ? e==null :
o.equals(e)).
What I try to do:
I have a list of TreeSets of Result objects, each of which has a String member wordbase. I want to compare each TreeSet with all the others and build, for each pair, a list of the wordbases they share. For this, I iterate over the list once for a treeSet1 and a second time for a treeSet2, then I iterate over all Result objects in treeSet2 and call treeSet1.contains(result) for each, to see if treeSet1 contains a Result with this wordbase. I adjusted the compareTo and equals methods of Result, but it seems that my equals is never called.
Can anyone explain to me why this doesn't work?
Greetings,
Daniel
public static void getIntersection(ArrayList<TreeSet<Result>> list, int value){
for (TreeSet<Result> treeSet : list){
//for each treeSet, we iterate again through the list of TreeSet, starting at the TreeSet that is next
//to the one we got in the outer loop
for (TreeSet<Result> treeSet2 : list.subList((list.indexOf(treeSet))+1, list.size())){
//so at this point, we got 2 different TreeSets
HashSet<String> intersection = new HashSet<String>();
for (Result result : treeSet){
//we iterate over each result in the first treeSet and see if the wordbase exists also in the second one
//!!!
if (treeSet2.contains(result)){
intersection.add(result.wordbase);
}
}
if (!intersection.isEmpty()){
intersections.add(intersection);
}
}
}
}
public class Result implements Comparable<Result>{
public Result(String wordbase, double result[]){
this.result = result;
this.wordbase = wordbase;
}
public String wordbase;
public double[] result;
public int compareTo(Result o) {
if (o == null) return 0;
return this.wordbase.compareTo(o.wordbase);
}
@Override
public int hashCode() {
final int prime = 31;
int result = 1;
result = prime * result
+ ((wordbase == null) ? 0 : wordbase.hashCode());
return result;
}
//never called
@Override
public boolean equals(Object obj) {
if (this == obj)
return true;
if (obj == null)
return false;
if (getClass() != obj.getClass())
return false;
Result other = (Result) obj;
if (wordbase == null) {
if (other.wordbase != null)
return false;
} else if (!wordbase.equals(other.wordbase))
return false;
return true;
}
}
As I understand it, contains() should call equals() of the contained Objects
Not for TreeSet, no. It calls compareTo (or the compare method of the supplied Comparator):
A NavigableSet implementation based on a TreeMap. The elements are ordered using their natural ordering, or by a Comparator provided at set creation time, depending on which constructor is used.
...
Note that the ordering maintained by a set (whether or not an explicit comparator is provided) must be consistent with equals if it is to correctly implement the Set interface.
Your compareTo method isn't currently consistent with equals - x.compareTo(null) returns 0, whereas x.equals(null) returns false. Maybe you're okay with that, but you shouldn't expect equals to be called.
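To make the difference visible, here is a small sketch in which equals is deliberately stricter than compareTo. The Word class, its source field and the call counter are all hypothetical, introduced only for this demonstration.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

public class TreeSetContainsDemo {
    // Counts how often Word.equals is invoked.
    public static int equalsCalls = 0;

    static final class Word implements Comparable<Word> {
        final String wordbase;
        final int source; // hypothetical extra field, ignored by compareTo

        Word(String wordbase, int source) {
            this.wordbase = wordbase;
            this.source = source;
        }

        @Override
        public int compareTo(Word other) {
            return wordbase.compareTo(other.wordbase);
        }

        @Override
        public boolean equals(Object obj) {
            equalsCalls++;
            if (!(obj instanceof Word)) return false;
            Word other = (Word) obj;
            return wordbase.equals(other.wordbase) && source == other.source;
        }

        @Override
        public int hashCode() {
            return wordbase.hashCode();
        }
    }

    // TreeSet locates elements with compareTo only.
    public static boolean treeSetFinds() {
        Set<Word> set = new TreeSet<Word>();
        set.add(new Word("cat", 1));
        return set.contains(new Word("cat", 2));
    }

    // HashSet locates elements with hashCode and then equals.
    public static boolean hashSetFinds() {
        Set<Word> set = new HashSet<Word>();
        set.add(new Word("cat", 1));
        return set.contains(new Word("cat", 2));
    }

    public static void main(String[] args) {
        System.out.println("TreeSet contains: " + treeSetFinds());
        System.out.println("equals calls so far: " + equalsCalls);
        System.out.println("HashSet contains: " + hashSetFinds());
    }
}
```

The TreeSet reports the element as present because compareTo returns 0, without ever consulting equals, while the HashSet reports it as absent because equals returns false.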

compareTo() method in java (Student ids)

I'm learning Java and having trouble with the compareTo method. I tried Google but it was not much help for what I need. What I need is:
// compareTo public int compareTo(Student other)
// is defined in the Comparable Interface
// and should compare the student ID's (they are positive integers).
// Must be able to handle null "other" students. A null student should be
// ordered before any student s so the s.compareTo(null) should be positive.
So basically I need a compareTo(); in the end this method is going to help me put my students in order based on their student IDs, lowest to greatest. I've hit a brick wall and just need some help in the right direction.
public int compareTo(StudentIF other) {
// do stuff
return 0;
}
There's a good tutorial about implementing compareTo() here. That said, when learning how to do something in general it's often helpful for me to see how to implement it in my specific use case - so, in this case, I would imagine something like this would suffice:
public int compareTo(StudentIF other) {
if (other == null) {return 1;} //satisfies your null student requirement
return this.studentId > other.studentId ? 1 :
this.studentId < other.studentId ? -1 : 0;
}
compareTo() is expected to return a positive value if the other object is comparatively smaller, a negative value if it's comparatively larger, and 0 if they are equal. Assuming you're familiar with the ternary operator, you'll see that that's what this is doing. If you're not, then the if/else equivalent would be:
public int compareTo(StudentIF other) {
if (other == null) { return 1; } //satisfies your null student requirement
if (this.studentId > other.studentId) return 1;
else if (this.studentId < other.studentId) return -1;
else return 0; //if it's neither smaller nor larger, it must be equal
}
As the compareTo contract requires:
a negative integer, zero, or a positive integer as this object is less than, equal to, or greater than the specified object.
plus your additional requirement of handling null, we can simply check whether the other parameter is null and otherwise subtract the IDs. Because the IDs are positive integers, the subtraction cannot overflow.
public int compareTo(StudentIF other) {
if (other == null) {
return 1;
}
return this.id - other.id;
}
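A variant that does not rely on the IDs being positive is to use Integer.compare (available since Java 7). The sketch below assumes a concrete Student class with an int studentId field; both names are illustrative rather than taken from the assignment.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class StudentCompareDemo {
    static final class Student implements Comparable<Student> {
        private final int studentId;

        Student(int studentId) {
            this.studentId = studentId;
        }

        @Override
        public int compareTo(Student other) {
            if (other == null) {
                return 1; // a null student is ordered before any real student
            }
            // Integer.compare avoids the overflow that subtraction can cause.
            return Integer.compare(studentId, other.studentId);
        }

        @Override
        public String toString() {
            return "Student#" + studentId;
        }
    }

    public static int compareIds(int first, int second) {
        return new Student(first).compareTo(new Student(second));
    }

    public static int compareToNull(int id) {
        return new Student(id).compareTo(null);
    }

    public static void main(String[] args) {
        List<Student> students = new ArrayList<Student>();
        students.add(new Student(42));
        students.add(new Student(7));
        students.add(new Student(19));
        Collections.sort(students); // lowest ID first
        System.out.println(students);
    }
}
```

With plain subtraction, comparing -2000000000 with 2000000000 would wrap around and give a positive result; Integer.compare gives the correct sign for any pair of ints.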

Correct way to implement Map<MyObject,ArrayList<MyObject>>

I was asked this in an interview. Using Google Guava or MultiMap is not an option.
I have a class
public class Alpha
{
String company;
int local;
String title;
}
I have many instances of this class (on the order of millions). I need to process them and at the end find the unique ones and their duplicates.
e.g.
instance1 --> instance1, instance5, instance7 (instance1 has instance5 and instance7 as duplicates)
instance2 --> instance2 (no duplicates for instance 2)
My code works fine.
Declare the data structure:
HashMap<Alpha,ArrayList<Alpha>> hashmap = new HashMap<Alpha,ArrayList<Alpha>>();
Add instances:
for (Alpha x : arr)
{
ArrayList<Alpha> list = hashmap.get(x); ///<<<<---- doubt about this. comment#1
if (list == null)
{
list = new ArrayList<Alpha>();
hashmap.put(x, list);
}
list.add(x);
}
Print instances and their duplicates.
for (Alpha x : hashmap.keySet())
{
ArrayList<Alpha> list = hashmap.get(x); //<<< doubt about this. comment#2
System.out.println(x + "<---->");
for(Alpha y : list)
{
System.out.print(y);
}
System.out.println();
}
Question: My code works, but why? When I call hashmap.get(x) (comment#1 in the code), it is possible that two different instances have the same hash code. In that case, I would add two different objects to the same List.
When I retrieve it (comment#2), I should get a List which holds two different instances, and when I iterate over the list I should see at least one instance which is not a duplicate of the key but still exists in the list. I don't. Why? I tried returning a constant value from my hashCode function, and it still works fine.
If you want to see my implementation of equals and hashCode, let me know.
Bonus question: Is there any way to optimize it?
Edit:
@Override
public boolean equals(Object obj) {
if (obj==null || obj.getClass()!=this.getClass())
return false;
if (obj==this)
return true;
Alpha guest = (Alpha)obj;
return guest.getLocal()==this.getLocal()
&& guest.getCompany() == this.getCompany()
&& guest.getTitle() == this.getTitle();
}
@Override
public int hashCode() {
final int prime = 31;
int result = 1;
result = prime * result + (title==null?0:title.hashCode());
result = prime * result + local;
result = prime * result + (company==null?0:company.hashCode());
return result;
}
it is possible that two different instances might have same hashcode
Yes, but the hashCode method is only used to find the bucket in which to store the element. Two or more keys can have the same hashCode, which is why they are also compared using equals.
From Map#containsKey javadoc:
Returns true if this map contains a mapping for the specified key. More formally, returns true if and only if this map contains a mapping for a key k such that (key==null ? k==null : key.equals(k)). (There can be at most one such mapping.)
Some enhancements to your current code:
Code to interfaces: declare the variable as Map and instantiate it as HashMap, and likewise for List and ArrayList.
Compare Strings, and objects in general, using the equals method. == compares references, while equals compares the data stored in the objects, depending on how the method is implemented. So, change the code in Alpha#equals:
public boolean equals(Object obj) {
if (obj==null || obj.getClass()!=this.getClass())
return false;
if (obj==this)
return true;
Alpha guest = (Alpha)obj;
return guest.getLocal() == this.getLocal() // local is a primitive int, so == is correct here
&& guest.getCompany().equals(this.getCompany())
&& guest.getTitle().equals(this.getTitle());
}
When iterating over all the key/value pairs of a map, use Map#entrySet instead; it saves the time spent in Map#get (since that is supposed to be O(1) you won't save much, but it is better):
for (Map.Entry<Alpha, List<Alpha>> entry : hashmap.entrySet()) { // assumes hashmap is declared as Map<Alpha, List<Alpha>>
List<Alpha> list = entry.getValue();
System.out.println(entry.getKey() + "<---->");
for(Alpha y : list) {
System.out.print(y);
}
System.out.println();
}
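Putting these points together, the following sketch groups duplicates while forcing every key to collide through a constant hashCode. It is only meant to show that equals keeps unequal keys apart; the constant hash is deliberately bad for performance, and the group helper and sample data are made up for the illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

public class GroupDuplicatesDemo {
    static final class Alpha {
        final String company;
        final int local;
        final String title;

        Alpha(String company, int local, String title) {
            this.company = company;
            this.local = local;
            this.title = title;
        }

        @Override
        public boolean equals(Object obj) {
            if (obj == this) return true;
            if (obj == null || obj.getClass() != getClass()) return false;
            Alpha other = (Alpha) obj;
            return local == other.local
                    && Objects.equals(company, other.company) // null-safe equals
                    && Objects.equals(title, other.title);
        }

        @Override
        public int hashCode() {
            return 42; // deliberately constant: every key lands in the same bucket
        }

        @Override
        public String toString() {
            return company + "/" + local + "/" + title;
        }
    }

    // Maps each distinct Alpha to the list of all instances equal to it.
    public static Map<Alpha, List<Alpha>> group(List<Alpha> items) {
        Map<Alpha, List<Alpha>> groups = new HashMap<Alpha, List<Alpha>>();
        for (Alpha item : items) {
            List<Alpha> bucket = groups.get(item);
            if (bucket == null) {
                bucket = new ArrayList<Alpha>();
                groups.put(item, bucket);
            }
            bucket.add(item);
        }
        return groups;
    }

    public static List<Alpha> sample() {
        return Arrays.asList(
                new Alpha("acme", 1, "dev"),
                new Alpha("acme", 1, "dev"),
                new Alpha("globex", 2, "ops"));
    }

    public static void main(String[] args) {
        for (Map.Entry<Alpha, List<Alpha>> entry : group(sample()).entrySet()) {
            System.out.println(entry.getKey() + " <----> " + entry.getValue());
        }
    }
}
```

Even though all three keys share one hash code, the map ends up with two groups, because get and put fall back to equals within a bucket.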
Use equals together with hashCode to resolve collisions.
Steps:
First, hashCode() (here based on title) selects the bucket.
If the hash codes match, equals() (here based on company name) decides whether the keys are really the same.
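Sample code (illustrative only: a real class should derive hashCode() and equals() from the same fields, so that equal objects always have equal hash codes)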
class Alpha {
String company;
int local;
String title;
public Alpha(String company, int local, String title) {
this.company = company;
this.local = local;
this.title = title;
}
@Override
public int hashCode() {
return title.hashCode();
}
@Override
public boolean equals(Object obj) {
if (obj instanceof Alpha) {
return this.company.equals(((Alpha) obj).company);
}
return false;
}
}
...
Map<Alpha, ArrayList<Alpha>> hashmap = new HashMap<Alpha, ArrayList<Alpha>>();
hashmap.put(new Alpha("a", 1, "t1"), new ArrayList<Alpha>());
hashmap.put(new Alpha("b", 2, "t1"), new ArrayList<Alpha>());
hashmap.put(new Alpha("a", 3, "t1"), new ArrayList<Alpha>());
System.out.println("Size : "+hashmap.size());
Output
Size : 2

Do I need to implement hashCode() and equals() methods?

If I have a map and an object as map key, are the default hash and equals methods enough?
class EventInfo{
private String name;
private Map<String, Integer> info;
}
Then I want to create a map:
Map<EventInfo, String> map = new HashMap<EventInfo, String>();
Do I have to explicitly implement hashCode() and equals()? Thanks.
Yes, you do. HashMaps work by computing the hash code of the key and using that as a base point. If the hashCode function isn't overridden (by you), then the default from Object is used, which is typically derived from the object's identity (often described as its memory address), and equals will be the same as ==.
If you're in Eclipse, it'll generate them for you. Click Source menu → Generate hashCode() and equals().
If you don't have Eclipse, here's some that should work. (I generated these in Eclipse, as described above.)
@Override
public int hashCode() {
final int prime = 31;
int result = 1;
result = prime * result + ((info == null) ? 0 : info.hashCode());
result = prime * result + ((name == null) ? 0 : name.hashCode());
return result;
}
@Override
public boolean equals(Object obj) {
if (this == obj) {
return true;
}
if (obj == null) {
return false;
}
if (!(obj instanceof EventInfo)) {
return false;
}
EventInfo other = (EventInfo) obj;
if (info == null) {
if (other.info != null) {
return false;
}
} else if (!info.equals(other.info)) {
return false;
}
if (name == null) {
if (other.name != null) {
return false;
}
} else if (!name.equals(other.name)) {
return false;
}
return true;
}
Yes, you need them; otherwise you won't be able to compare two EventInfo objects by value (and your map won't work as expected).
Strictly speaking, no. The default implementations of hashCode() and equals() will produce results that ought to work. See http://docs.oracle.com/javase/6/docs/api/java/lang/Object.html#hashCode()
My understanding is that the default implementation of hashCode() works by taking the object's address in memory and converting to integer, and the default implementation of equals() returns true only if the two objects are actually the same object.
In practice, you could (and should) probably improve on both of those implementations. For example, both methods should ignore object members that aren't important. In addition, equals() might want to recursively compare references in the object.
In your particular case, you might define equals() as true if the two objects refer to the same string or the two strings are equal and the two maps are the same or they are equal. I think WChargin gave you pretty good implementations.
Depends on what you want to happen. If two different EventInfo instances with the same name and info should result in two different keys, then you don't need to implement equals and hashCode.
So
EventInfo info1 = new EventInfo();
info1.setName("myname");
info1.setInfo(null);
EventInfo info2 = new EventInfo();
info2.setName("myname");
info2.setInfo(null);
info1.equals(info2) would return false and info1.hashCode() would return a different value to info2.hashCode().
Therefore, when you are adding them to your map:
map.put(info1, "test1");
map.put(info2, "test2");
you would have two different entries.
Now, that may be the desired behaviour. For example, if your EventInfo is collecting different events, you may well want two distinct events with the same data to be two different entries.
The equals and hashCode contract also applies to a Set.
So for example, if your event info contains mouse clicks, you may well want to end up with:
Set<EventInfo> collectedEvents = new HashSet<EventInfo>();
collectedEvents.add(info1);
collectedEvents.add(info2);
two collected events instead of just one.
Hope I'm making sense here...
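The identity-based behaviour described above can be checked with a short sketch. The EventInfo here is a reduced, hypothetical version with only a name field and no equals or hashCode override.

```java
import java.util.HashMap;
import java.util.Map;

public class IdentityKeyDemo {
    // No equals()/hashCode() override, so Object's identity-based versions apply.
    static final class EventInfo {
        final String name;

        EventInfo(String name) {
            this.name = name;
        }
    }

    // Two keys with the same data are still two separate map entries.
    public static int entriesForTwoEqualLookingKeys() {
        Map<EventInfo, String> map = new HashMap<EventInfo, String>();
        map.put(new EventInfo("myname"), "test1");
        map.put(new EventInfo("myname"), "test2");
        return map.size();
    }

    // A lookup only succeeds with the very same instance that was inserted.
    public static String lookupWithFreshKey() {
        Map<EventInfo, String> map = new HashMap<EventInfo, String>();
        map.put(new EventInfo("myname"), "test1");
        return map.get(new EventInfo("myname"));
    }

    public static void main(String[] args) {
        System.out.println(entriesForTwoEqualLookingKeys());
        System.out.println(lookupWithFreshKey());
    }
}
```

The practical consequence is that a value stored under one instance cannot be retrieved with a freshly built key holding the same data.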
EDIT:
If, however, the above set and map should only contain a single entry, then you could use the Apache Commons Lang EqualsBuilder and HashCodeBuilder to simplify the implementation of equals and hashCode:
@Override
public boolean equals(Object obj) {
if (obj instanceof EventInfo) {
EventInfo other = (EventInfo) obj;
EqualsBuilder builder = new EqualsBuilder();
builder.append(name, other.name);
builder.append(info, other.info);
return builder.isEquals();
}
return false;
}
@Override
public int hashCode() {
HashCodeBuilder builder = new HashCodeBuilder();
builder.append(name);
builder.append(info);
return builder.toHashCode();
}
EDIT2:
It could also be appropriate if two EventInfo instances are considered the same, if they have the same name, for example if the name is some unique identifier (I know it's a bit far fetched with your specific object, but I'm generalising here...)
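If adding a library is not an option, java.util.Objects (Java 7 and later) offers a JDK-only alternative to the Apache Commons builders. This is a sketch with a hypothetical constructor-based EventInfo, not the original class.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

public class ValueKeyDemo {
    static final class EventInfo {
        private final String name;
        private final Map<String, Integer> info;

        EventInfo(String name, Map<String, Integer> info) {
            this.name = name;
            this.info = info;
        }

        @Override
        public boolean equals(Object obj) {
            if (this == obj) return true;
            if (!(obj instanceof EventInfo)) return false;
            EventInfo other = (EventInfo) obj;
            // Objects.equals handles null fields without extra checks.
            return Objects.equals(name, other.name) && Objects.equals(info, other.info);
        }

        @Override
        public int hashCode() {
            return Objects.hash(name, info);
        }
    }

    // Builds a map with two keys that hold the same data.
    public static Map<EventInfo, String> twoPuts() {
        Map<EventInfo, String> map = new HashMap<EventInfo, String>();
        map.put(new EventInfo("myname", null), "test1");
        map.put(new EventInfo("myname", null), "test2"); // replaces the first value
        return map;
    }

    public static void main(String[] args) {
        Map<EventInfo, String> map = twoPuts();
        System.out.println(map.size());
        System.out.println(map.get(new EventInfo("myname", null)));
    }
}
```

One caveat with this design: info is a mutable Map, so changing its contents after the key has been inserted changes the hash code and can make the entry unreachable.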
