Object equality vs uniqueness - java

In my hobby Kotlin project I've run into a dilemma about how to implement the equals() method for a class like:
// I'm using Kotlin-like syntax
class ConfigurationParameter {
val name: String // used for command line option, *.conf file parameter, ENV variable, ...
val allowedValues: Set<String> // the valid values of the configuration parameter
val description: String // used in --help, as a comment above parameter in *.conf file, ...
}
Equality
Now, from my POV, two objects of this class are equal only if they are equal in all their properties. Otherwise they would behave differently:
In case of name ... that's a completely different parameter.
In case of allowedValues ... the validation would differ.
In case of description ... the printed usage help would differ.
Uniqueness
At the same time I don't want two objects with just the same name (but possibly with distinct allowedValues or description) to appear in one set (Set<ConfigurationParameter>).
That would lead to problems like duplicate command line options and the like.
This should not happen
I'm aware that two configuration parameters with the same name and distinct other properties should not be created in the application in the first place. But let's consider this to be an internal self-check mechanism.
Solution
The only solution I've come up with so far is to create a brand new ConfigurationParameterSet (not based on Set) that decides the "sameness" of its items by their name and not by their equals() method.
The problem with this solution is that there must be such a new Set class for every entity class that has equality distinct from its uniqueness.
Question
Is there any well-established generic solution to this equality vs uniqueness dilemma?

Instead of your custom set-like class, you can use a Map that uses the name property as the keys. You could also add extension functions so you can use it kind of like a Set. In Java, you'd have to extend the class to add these.
fun MutableMap<String, ConfigurationParameter>.add(parameter: ConfigurationParameter) =
put(parameter.name, parameter)
fun MutableMap<String, ConfigurationParameter>.remove(parameter: ConfigurationParameter) =
remove(parameter.name, parameter)
operator fun Map<String, ConfigurationParameter>.contains(parameter: ConfigurationParameter) =
containsValue(parameter)
If you have lots of classes like this where you want to store them by a name property, you could make an interface with a name property that they can all use and then create the above extension function for any map that uses values that implement the interface:
interface NamedItem { val name: String }
class ConfigurationParameter(
    override val name: String,
    val allowedValues: Set<String>,
    val description: String
) : NamedItem
fun <T: NamedItem> MutableMap<String, T>.add(parameter: T) =
put(parameter.name, parameter)
fun <T: NamedItem> MutableMap<String, T>.remove(parameter: T) =
remove(parameter.name, parameter)
operator fun <T: NamedItem> Map<String, T>.contains(parameter: T) =
containsValue(parameter)
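For Java readers, the map-keyed-by-name idea could be sketched roughly as below. This is only an illustration under assumptions: KeyedRegistry and ConfigParam are hypothetical names that do not appear in the question, and the registry rejects a second item with the same key instead of overwriting the first one, which is one possible reading of the "internal self-check".

```java
import java.util.Collection;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical helper: a set-like registry whose uniqueness is decided by a key
// (for example the name), independently of how the items implement equals().
final class KeyedRegistry<K, T> {
    private final Map<K, T> items = new LinkedHashMap<>();
    private final Function<? super T, ? extends K> keyOf;

    KeyedRegistry(Function<? super T, ? extends K> keyOf) {
        this.keyOf = keyOf;
    }

    // Returns true if the item was added, false if an item with the same key already exists.
    boolean add(T item) {
        return items.putIfAbsent(keyOf.apply(item), item) == null;
    }

    boolean containsKey(K key) {
        return items.containsKey(key);
    }

    // Read-only view of the registered items, in insertion order.
    Collection<T> values() {
        return Collections.unmodifiableCollection(items.values());
    }
}

// Minimal stand-in for the class from the question (assumption: only two fields).
final class ConfigParam {
    final String name;
    final String description;

    ConfigParam(String name, String description) {
        this.name = name;
        this.description = description;
    }
}
```

Because the key extractor is passed in, the same registry class can serve any entity whose uniqueness differs from its equality, so there is no need for one set class per entity.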

I'm not well versed in Kotlin, but this problem sounds exactly the same in Java. In Java you have two types of equality: (1) reference equality (a == b) where a and b are both references to the same object and (2) hashCode/equals equality. I suspect when you are talking about "uniqueness" that you don't mean reference equality but rather a notion of hash/equals equality where all fields are the same.
What you have isn't a language problem. It's a design problem. You need to decide what makes two objects equal OR take another approach.
So, one way to do this would be to define a method like:
enum Similarity { FULL, NAME }
boolean same(Object object, Similarity similarity)
Then you can call same() from equals() to give the default kind of similarity. You can also imagine making the object sort of modal, where it has a similarity state and the equals method uses that state to decide which kind of similarity to use. The downside of this state is (1) the concern of similarity/equality isn't necessarily best defined by methods in the class itself (separation of concerns) and (2) mutable state is not the best if you can avoid it.
Another, possibly better, approach might be to create two Comparator implementations, where one comparator uses just the name and the other uses all values. This is a very common approach in Java and should be just as easy in Kotlin. Comparators give sort order, but a return value of 0 indicates equality. If you prefer a boolean, you could use the same technique but create an interface like:
interface SimilarityComparator
{
    boolean same(Object a, Object b);
}
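As a rough sketch of the Comparator idea: equals() can keep comparing every field while a name-only comparator decides uniqueness inside a TreeSet. The names Param, BY_NAME and ParamSets are made up for this example, and the class is cut down to two fields.

```java
import java.util.Comparator;
import java.util.Objects;
import java.util.Set;
import java.util.TreeSet;

// Simplified stand-in for the configuration parameter (hypothetical name).
final class Param {
    final String name;
    final String description;

    Param(String name, String description) {
        this.name = name;
        this.description = description;
    }

    // "Full" equality: every field takes part.
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Param)) return false;
        Param other = (Param) o;
        return name.equals(other.name) && description.equals(other.description);
    }

    @Override
    public int hashCode() {
        return Objects.hash(name, description);
    }

    // Uniqueness: only the name takes part.
    static final Comparator<Param> BY_NAME = Comparator.comparing((Param p) -> p.name);
}

final class ParamSets {
    // A TreeSet treats two elements as duplicates when the comparator returns 0.
    static Set<Param> newUniqueByNameSet() {
        return new TreeSet<>(Param.BY_NAME);
    }
}
```

One caveat: a TreeSet whose comparator is inconsistent with equals() does not obey the general Set contract, so this is best kept as an internal detail rather than handed out as a plain Set.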
BTW, if you implement the comparator as a nested class, you can increase encapsulation by obviating the need to expose property values or fields to allow comparison (property getters and setters are bad, see Allen Holub).
https://www.baeldung.com/java-comparator-comparable
Hopefully this helps.
Jon

Related

get a getter method from field name to avoid if-else

I have this code, which obviously doesn't look nice - it seems all the if-else can somehow be avoided.
if(sortBy.equals("firstName"))
personList.sort(Comparator.comparing(Person::getFirstName));
else if(sortBy.equals("lastName"))
personList.sort(Comparator.comparing(Person::getLastName));
else if(sortBy.equals("age"))
personList.sort(Comparator.comparing(Person::getAge));
else if(sortBy.equals("city"))
personList.sort(Comparator.comparing(Person::getCity));
else if(sortBy.equals("state"))
personList.sort(Comparator.comparing(Person::getState));
else if(sortBy.equals("zipCode"))
personList.sort(Comparator.comparing(Person::getZipCode));
The function takes sortBy, which is the name of one of the attributes of a Person, and sorts personList based on that field. How can I avoid the if-else and write better-looking, possibly one-line code?
Currently I have found that I can use a HashMap to create a mapping between a field name and a corresponding comparator.
map.put("age", Comparator.comparing(Person::getAge));
map.put("firstName", Comparator.comparing(Person::getFirstName));
...
And use personList.sort(map.get(sortBy)).
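A minimal sketch of that map-based lookup, with the unknown-key case handled explicitly, might look like this. The Person class is reduced to two fields and PersonSorter is a hypothetical name chosen for the example.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Reduced stand-in for the Person class from the question.
final class Person {
    private final String firstName;
    private final int age;

    Person(String firstName, int age) {
        this.firstName = firstName;
        this.age = age;
    }

    String getFirstName() { return firstName; }
    int getAge() { return age; }
}

final class PersonSorter {
    // One entry per supported sort key; each entry may use a different strategy.
    private static final Map<String, Comparator<Person>> COMPARATORS = new HashMap<>();
    static {
        COMPARATORS.put("firstName", Comparator.comparing(Person::getFirstName));
        COMPARATORS.put("age", Comparator.comparingInt(Person::getAge));
    }

    // Sorts in place; fails loudly instead of passing a null comparator to sort().
    static void sortBy(List<Person> people, String sortBy) {
        Comparator<Person> comparator = COMPARATORS.get(sortBy);
        if (comparator == null) {
            throw new IllegalArgumentException("Unknown sort key: " + sortBy);
        }
        people.sort(comparator);
    }
}
```

The explicit null check matters because List.sort(null) does not fail: it falls back to natural ordering, which for a non-Comparable Person would surface as a ClassCastException far from the real cause.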
But still felt like it can further be improved without an extra step, to the point where it follows the open-closed principle, and adding a new field to Person would not need us to modify the code. I'm looking for something like
personList.sort(Comparator.comparing(Person::getterOfField(sortBy)))
UPDATE-1
For now, I decided to stick with using a Map<String, Function<Person, Comparable<?>>> and I would rather not consider reflection-based solutions. But I am still searching for a similar way where the sort key is a parameter.
UPDATE-2
I think a one-liner is not a good solution, because you wouldn't get a compile-time error if one of the fields does not implement Comparable.
In general java doesn't want you to work with it this way [1]; it is not a structurally typed language, and unlike e.g. JavaScript or Python, objects aren't "hashmaps of strings to thingies".
Also, your request more fundamentally doesn't add up: You can't just go from "field name" to "sort on that": What if the field's type isn't inherently sortable (is not a subtype of Comparable<Self>)?
What if there is a column in whatever view we're talking about / config file that is 'generated'? Imagine you have a field LocalDate birthDate; but you have a column 'birth month' [2]. You can sort on birth month, no problem. However, given that it's a 'generated value' (not backed directly by a field, instead, derived from a calculation based on field(s)), you can't just sort on this. You can't even sort on the backing field (as that would sort by birth year first, not what you want), nor does 'backing field' make sense; what if the virtual column is based on multiple fields?
It is certainly possible that currently you aren't imagining either virtual columns or fields whose type isn't self-sortable and that therefore you want to posit a rule that for this class, you close the door on these two notions until a pretty major refactor, but it goes to show perhaps why "java does not work that way" is in fact somewhat 'good' (closely meshes with real life concerns), and why your example isn't as boilerplatey as you may have initially thought: No, it is not, in fact, inevitable. Specifically, you seem to want:
There is an exact 1-to-1 match between 'column sort keys' and field names.
The strategy to deliver on the request to sort on a given column sort key is always the same: Take the column sort key. Find the field (it has the same name); now find its getter. Create a comparator based on comparing get calls; this getter returns a type that has a natural sorting order guaranteed.
Which are 2 non-obvious preconditions that seem to have gotten a bit lost. At any rate, a statement like:
if(sortBy.equals("firstName"))
personList.sort(Comparator.comparing(Person::getFirstName));
encodes these 2 non-obvious properties explicitly, which trivially means it is also possible to add virtual columns as well as sort keys that work differently (for example, one that sorts on birth month, or on some explicit comparator you write for this purpose). You could even sort case-insensitively; strings do not do that by default, so you'd have to sort with String.CASE_INSENSITIVE_ORDER instead.
It strikes me as a rather badly written app if a change request comes in with: "Hey, could you make the sort option that sorts on patient name be case insensitive?" and you go: "Hooo boy that'll be a personweek+ of refactoring work!", no?
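To make the "virtual column" and case-insensitive cases concrete, here is a small sketch. Patient and PatientOrders are hypothetical names; the only assumption carried over from the text is a LocalDate birthDate field.

```java
import java.time.LocalDate;
import java.util.Comparator;

// Hypothetical entity with one String field and one date field.
final class Patient {
    private final String name;
    private final LocalDate birthDate;

    Patient(String name, LocalDate birthDate) {
        this.name = name;
        this.birthDate = birthDate;
    }

    String getName() { return name; }
    LocalDate getBirthDate() { return birthDate; }
}

final class PatientOrders {
    // Sort key that does not use the field's natural order: names compared ignoring case.
    static final Comparator<Patient> BY_NAME_IGNORING_CASE =
            Comparator.comparing(Patient::getName, String.CASE_INSENSITIVE_ORDER);

    // "Virtual column": derived from birthDate, not backed by a field of its own.
    static final Comparator<Patient> BY_BIRTH_MONTH =
            Comparator.comparing((Patient p) -> p.getBirthDate().getMonth());
}
```

Neither comparator could be derived from a field name alone, which is the point of keeping the key-to-comparator mapping explicit.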
But, if you insist, you have 2 broad options:
Reflection
Reflection lets you write code that programmatically gets a list of field names, method names, and can also be used to programmatically call them. You can fetch a list of method names and filter out everything except:
instance methods
with no arguments
whose name starts with get
And do a simple-ish get-prefix-to-sort-key conversion (basically, .substring(3) to lop off the get, then lowercase the first character, though note that the rules for getter to field name get contradictory if the first 'word' of the field is a single letter, such as getXAxis, where half of the beanspec documents say the field name is definitely XAxis, as xAxis would have become getxAxis, and the other half say it is ambiguous and could mean the field name is XAxis or xAxis).
It looks something like this:
// needs: java.lang.reflect.Method, java.lang.reflect.Modifier, java.util.*
// intentionally raw types!
Map comparators = new HashMap();
for (Method m : Person.class.getMethods()) {
    if (Modifier.isStatic(m.getModifiers())) continue;
    if (m.getParameterCount() != 0) continue;
    String n = m.getName();
    if (!n.startsWith("get") || n.length() < 4) continue;
    // getFirstName -> firstName
    n = Character.toLowerCase(n.charAt(3)) + n.substring(4);
    comparators.put(n, (Comparator) (a, b) -> {
        try {
            Object aa = m.invoke(a);
            Object bb = m.invoke(b);
            return ((Comparable) aa).compareTo(bb);
        } catch (ReflectiveOperationException e) {
            // checked reflection exceptions are rethrown unchecked
            throw new RuntimeException(e);
        }
    });
}
MyClass.COMPARATORS = (Map<String, Comparator<?>>) Collections.unmodifiableMap(comparators);
Note how this causes a boatload of warnings because you just chucked type checking out the window - there is no actual way to ensure that any given getter type actually is an appropriate Comparable. The warnings are correct and you have to ignore them, no fixing that, if you go by this route.
You also get a ton of checked exceptions issues that you'll have to deal with by catching them and rethrowing something appropriate; possibly RuntimeException or similar if you want to disregard the need to deal with them by callers (some RuntimeException is appropriate if you consider any attempt to add a field of a type that isn't naturally comparable 'a bug').
Annotation Processors
This is a lot more complicated: You can stick annotations on a method, and then have an annotation processor that sees these and generates a source file that does what you want. This is more flexible and more 'compile time checked', in that you can e.g. check that things are of an appropriate type, or add support for mentioning a class in the annotation that is an implementation of Comparator<T>, T being compatible with the type of the field you so annotate. You can also annotate methods themselves (e.g. a public Month getBirthMonth() method). I suggest you search the web for an annotation processor tutorial, it'd be a bit much to stuff an example in an SO answer. Expect to spend a few days learning and writing it, it won't be trivial.
[1] This is a largely objective statement. Falsifiable elements: There are no field-based 'lambda accessors'; no foo::fieldName support. Java does not support structural typing and there is no way to refer to things in the language by name alone, only by fully qualified name (you can let the compiler infer things, but the compiler always translates what you write to a fully "named" (package name, type name that the thing you are referring to is in, and finally the name of the method or field) and then sticks that in the class file).
[2] At least in the Netherlands it is somewhat common to split patient populations up by birth month (as a convenient way to split a population into 12 roughly equally sized, mostly arbitrary chunks) e.g. for inviting them in for a checkup or a flu shot or whatnot.
Assuming that the sortBy values and the corresponding getters are known at compile time, this would be a good place to use a string switch statement:
Function<Person, String> getter = null;
switch (sortBy) {
case "firstName":
getter = Person::getFirstName; break;
case "lastName":
getter = Person::getLastName; break;
...
}
personList.sort(Comparator.comparing(getter));
If you use a recent version of Java (Java 14 or later; switch expressions were a preview feature in Java 12 and 13) you could use a switch expression rather than a switch statement.
Function<Person, String> getter;
getter = switch (sortBy) {
    case "firstName" -> Person::getFirstName;
    case "lastName" -> Person::getLastName;
    ...
    default -> null;
};
personList.sort(Comparator.comparing(getter));
Note: you should do a better job (than my dodgy code) of dealing with the case where the sortBy value is not recognized.
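One possible way to handle the unrecognized case is to throw from the default branch instead of returning null. The sketch below uses a classic switch statement so it also compiles on older Java versions; SortKeys and the two-field Person are assumptions made for the example.

```java
import java.util.Comparator;
import java.util.List;
import java.util.function.Function;

// Reduced stand-in for the Person class from the question.
final class Person {
    private final String firstName;
    private final String lastName;

    Person(String firstName, String lastName) {
        this.firstName = firstName;
        this.lastName = lastName;
    }

    String getFirstName() { return firstName; }
    String getLastName() { return lastName; }
}

final class SortKeys {
    // Maps a sort key to a getter; an unknown key is reported immediately.
    static Function<Person, String> getterFor(String sortBy) {
        switch (sortBy) {
            case "firstName":
                return Person::getFirstName;
            case "lastName":
                return Person::getLastName;
            default:
                throw new IllegalArgumentException("Unknown sort key: " + sortBy);
        }
    }

    static void sort(List<Person> people, String sortBy) {
        people.sort(Comparator.comparing(getterFor(sortBy)));
    }
}
```

Throwing here keeps a null getter from reaching Comparator.comparing, which would otherwise fail with a less helpful NullPointerException.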
As keshlam suggested, I think using the reflection API is the best fitting answer to your question, but keep in mind that using it in production code is generally discouraged.
Note: if you add a new Person-attribute which isn't itself Comparable, you'll have to resort to a custom Comparator anyway. With that in mind, you might want to keep the Map<String, Comparator<?>> solution you already have.

Should I compare all fields in my class's "equals" method?

I'm working on an application that allows the user to manage accounts. So, suppose I have an Account class, representing one of the user's accounts:
class Account
{
public int id;
public String accountName;
public String accountIdentifier;
public String server;
public String notes;
}
My equals method looks like this:
public boolean equals(Object o)
{
if (this == o)
return true;
if (o == null || !(o instanceof Account))
return false;
Account other = (Account) o;
if (!accountIdentifier.equals(other.accountIdentifier))
return false;
if (!server.equals(other.server))
return false;
return true;
}
As you can see, I'm only comparing the accountIdentifier and the server, but not the other fields. There are several reasons why I chose this approach.
I keep the accounts in a List. When the user updates an account, by changing the account name (which is just a name specified by the user to identify the account) or the notes, I can do accountList.set(accountList.indexOf(account), account); to update the account in the list. If equals compared all properties, this approach wouldn't work, and I'd have to work around it (for example by iterating over the list and checking for these properties manually).
This might actually be more important, but it only came to my mind after thinking about it for a while. An Account is uniquely identified by the accountIdentifier and the server it belongs to. The user might decide to rename the account, or change the notes, but it's still the same account. But if the server is changed, I think I would consider it a different account. The id is just an internal ID since the accounts are stored in a database. Even if that changed, the account is still considered the same account if the accountIdentifier and the server stayed the same.
What I'm trying to say is that I basically implemented equals this way to allow for shorter, more concise code in the rest of the application. But I'm not sure if I'm breaking some rules here, or if I'm doing something that might cause other developers headaches if it ever happens that someone is working with my application's API.
Is it okay to only compare some fields in the equals method, or should I compare all fields?
Yes, it's definitely okay to do this. You get to decide what equality means for your class, and you should use it in a way that makes the most sense for your application's logic — in particular, for collections and other such classes that make use of equality. It sounds like you have thought about that and decided that the (server, identifier) pair is what uniquely distinguishes instances.
This would mean, for instance, that two instances with the same (server, identifier) pair but a different accountName are different versions of the same Account, and that the difference might need to be resolved somehow; that's a perfectly reasonable semantic.
It may make sense to define a separate boolean allFieldsEqual(Account other) method to cover the "extended" definition, depending on whether you need it (or would find it useful for testing).
And, of course, you should override hashCode to make it consistent with whatever definition of equals you go with.
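Put together, the identity-based equals, a matching hashCode and the optional allFieldsEqual could look roughly like this. It is a sketch: the fields are made final and a constructor is added purely so the example is self-contained.

```java
import java.util.Objects;

final class Account {
    final int id;
    final String accountName;
    final String accountIdentifier;
    final String server;
    final String notes;

    Account(int id, String accountName, String accountIdentifier, String server, String notes) {
        this.id = id;
        this.accountName = accountName;
        this.accountIdentifier = accountIdentifier;
        this.server = server;
        this.notes = notes;
    }

    // Identity: the (server, accountIdentifier) pair.
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Account)) return false;
        Account other = (Account) o;
        return Objects.equals(accountIdentifier, other.accountIdentifier)
                && Objects.equals(server, other.server);
    }

    // Must use exactly the fields that equals() uses.
    @Override
    public int hashCode() {
        return Objects.hash(accountIdentifier, server);
    }

    // "Extended" comparison: same account and no field differs.
    boolean allFieldsEqual(Account other) {
        return equals(other)
                && id == other.id
                && Objects.equals(accountName, other.accountName)
                && Objects.equals(notes, other.notes);
    }
}
```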
You should compare all of the fields that are necessary to determine equality. If the accountIdentifier and server fields are enough to determine if two objects are equal, then that is perfectly fine. No need to include any of the other fields that don't matter in terms of equality.
For the key, you should normally use the business key. It can be a simple or a composite key and does not necessarily need to include all the fields of the entity. So it depends on each case which fields identify an entity; if possible, use the minimum set of fields that fully and uniquely identifies it.
Some people prefer to create a surrogate key that identifies the object (and this is good practice). This is very useful when you persist your objects using an ORM, because you don't need to export the keys to the child entities in 1:M or M:N relations. For example, the id in your sample can be considered a surrogate key if you create it as an internal unique identifier.
Also may want to take into consideration:
Whenever you override equals you must override hashCode too; this is important for classes like collections and maps to work properly.
Apache Commons Lang provides a really nice API to help with implementing equals and hashCode: the EqualsBuilder and HashCodeBuilder classes. Both let you append the fields you want to use in your comparison, and they also offer a reflection-based mode.
The answer is "it depends on the semantics of your data".
For example, you might internally store a field that can be derived (calculated) from the other fields. In which case, you don't need to compare the calculated value.
As a gross generalisation, anything that cannot be derived from other fields should be included.
This is fine - and probably a good thing to do. If you've identified equality as the accountIdentifier and the server being distinct and unique, then that's perfectly valid for your use case.
You don't want to use more fields than you need to, since that would produce false negatives in your code (objects that represent the same account being reported as not equal). This approach is perfectly suitable to your needs.

Deep Comparison of Two Java Objects

I have already looked at a few questions here and there.
Use case:
1.) Given any two objects, compare them property by property.
2.) The objects will contain collections too, so the collections of both objects must be compared. Inside a collection the same data can be in a different order: say I have a List<Address> with two entries in both objects (a residential address and an office address), but the entries may be at different indexes in the two lists.
3.) I need to create a third object of the same type, with the matching data copied and the properties whose data differs set to null.
4.) The objects might reference other classes as well.
I tried many solutions but got stuck somewhere each time, so I am thinking of writing a generic solution. I thought of generating two XML documents out of the two objects and then comparing them node by node, but I would like to hear about more options.
Or, how powerful is Java reflection in this case?
Answer to @markspace's question:
To access a private field you will need to call the Class.getDeclaredField(String name) or Class.getDeclaredFields() method. The methods Class.getField(String name) and Class.getFields() methods only return public fields, so they won't work.
To access a private method you will need to call the Class.getDeclaredMethod(String name, Class[] parameterTypes) or Class.getDeclaredMethods() method. The methods Class.getMethod(String name, Class[] parameterTypes) and Class.getMethods() methods only return public methods.
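As a rough illustration of how far getDeclaredFields() gets you, the sketch below lists the fields that differ between two objects of the same class, comparing collections without regard to element order. It is deliberately shallow: superclass fields are ignored and nested objects are compared with their own equals(). FieldDiff and Customer are hypothetical names.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Objects;

final class FieldDiff {
    // Returns the names of the declared instance fields whose values differ.
    static List<String> differingFields(Object a, Object b) {
        if (a.getClass() != b.getClass()) {
            throw new IllegalArgumentException("Objects must be of the same class");
        }
        List<String> differing = new ArrayList<>();
        for (Field field : a.getClass().getDeclaredFields()) {
            if (Modifier.isStatic(field.getModifiers())) continue;
            field.setAccessible(true); // required to read private fields
            try {
                if (!sameValue(field.get(a), field.get(b))) {
                    differing.add(field.getName());
                }
            } catch (IllegalAccessException e) {
                throw new IllegalStateException("Cannot read field " + field.getName(), e);
            }
        }
        return differing;
    }

    // Collections are compared ignoring element order; everything else via equals().
    private static boolean sameValue(Object x, Object y) {
        if (x instanceof Collection && y instanceof Collection) {
            List<Object> remaining = new ArrayList<>((Collection<?>) y);
            for (Object element : (Collection<?>) x) {
                if (!remaining.remove(element)) return false;
            }
            return remaining.isEmpty();
        }
        return Objects.equals(x, y);
    }
}

// Hypothetical class used only to exercise the sketch.
final class Customer {
    private final String name;
    private final List<String> addresses;

    Customer(String name, List<String> addresses) {
        this.name = name;
        this.addresses = addresses;
    }
}
```

The returned field names could then drive the "set differing properties to null" step from the question; making the comparison recurse into referenced classes is left out here.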
XMLUnit will work, it compares the list references recursively, also it has option to exclude fields which you do not wish to compare.
String expectedXML = "some xml";
String actualXML = "some xml";
DetailedDiff diff1 = new DetailedDiff(XMLUnit.compareXML(expectedXML, actualXML));
diff1.overrideElementQualifier(new RecursiveElementNameAndTextQualifier());
System.out.println("Differences found: " + diff1.getAllDifferences().toString());
RecursiveElementNameAndTextQualifier
Compares all Element and Text nodes in two pieces of XML. Allows
elements of complex, deeply nested types that are returned in
different orders but have the same content to be recognized as
comparable.

Why doesn't the Java library provide `HashSet.get(Object o)` and `HashMap.getKey(Object o)`

I know this question has already been asked on SO a couple of times, but I still haven't found a satisfying solution, and I'm unsure which way to go. The question is:
Why doesn't the Java library provide HashSet.get(Object o) and HashMap.getKey(Object o) methods that return the actual instance in the map providing an equal instance? Example:
// Retrieve a house with ID=10 that contains additional information like size,
// location and price.
houses.get(new House(10));
I think the best answer can be found here. So here's a mixture of answers that I'm aware of:
Why would you need the instance when you already have it? It doesn't make sense to try to get the same object you already have. The object has an identifier (which controls its equality to other Foo types) plus any number of other fields that do not contribute to its identity. I want to be able to get the object from the Set (which includes the extra fields) by constructing an 'equal' Foo object (text is taken from one of the comments). -> no answer
Iterate the Collection and search for the instance using equals(). This uses linear search and is extremely slow in big collections. -> bad answer
Use a HashMap instead of a HashSet I don't need a map and I think it's not adequate to return a map in a method like getHouses(). The getter should return a Set and not a Map.
Use TreeSet.ceiling - don't know
This hacky code below (Java 8 HashSet only) uses reflection and provides the missing functionality. I did not find something like this in other answers (no surprise). This could have been an acceptable solution if the target Java version is defined and future Java versions would finally provide such a method, now that we have default methods for interfaces. One could think of default E get(E o) { return stream().filter(e -> e.equals(o)).findAny().orElse(null); }
// Alternative: Subclass HashSet/HashMap and provide a get()/getKey() methods
public static <T> T getFromSet(HashSet<T> set, T key) throws Exception {
Field mapField = set.getClass().getDeclaredField("map");
mapField.setAccessible(true);
HashMap<T, Object> map = (HashMap) mapField.get(set);
Method getNodeMethod = map.getClass().getDeclaredMethod("getNode",
int.class, Object.class);
getNodeMethod.setAccessible(true);
return (T) ((Map.Entry) getNodeMethod.invoke(map, key.hashCode(),
key)).getKey();
}
Here are the questions:
Is the best solution the use of HashMap<House, House> instead of HashSet<House>?
Is there another library out there that provides this functionality and supports concurrent access?
Do you know of a bug addressing this feature?
Similar questions on SO:
Why doesn't java.util.HashSet have a get(Object o) method?
Java: Retrieving an element from a HashSet
Why does the java.util.Set interface not provide a get(Object o) method?
The reason this behaviour hasn't been catered for is that creating a House instance with invalid data just to obtain one with valid data is really poor design.
Composition is the correct solution here:
/** immutable class containing all the fields defining identity */
public final class HouseIdentifier {
private final String id;
}
public class House {
private final HouseIdentifier id;
/** all the mutable, ephemeral properties of the house should go here */
private int size;
private Person owner;
}
If you design your class hierarchy like this, then all you need for your lookups is a simple and straightforward Map<HouseIdentifier, House>.
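A fuller sketch of that composition is below. The constructors, equals/hashCode and the HouseRegistry wrapper are additions made for illustration and are not part of the answer above; the owner field is left out to keep the example short.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

/** Immutable class containing all the fields defining identity. */
final class HouseIdentifier {
    private final String id;

    HouseIdentifier(String id) {
        this.id = Objects.requireNonNull(id);
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof HouseIdentifier && id.equals(((HouseIdentifier) o).id);
    }

    @Override
    public int hashCode() {
        return id.hashCode();
    }
}

final class House {
    private final HouseIdentifier id;
    private int size; // mutable property, not part of the identity

    House(HouseIdentifier id, int size) {
        this.id = id;
        this.size = size;
    }

    HouseIdentifier getId() { return id; }
    int getSize() { return size; }
}

// Hypothetical wrapper around the Map<HouseIdentifier, House> lookup.
final class HouseRegistry {
    private final Map<HouseIdentifier, House> houses = new HashMap<>();

    void add(House house) {
        houses.put(house.getId(), house);
    }

    // Returns null when no house with that identifier is known.
    House find(HouseIdentifier id) {
        return houses.get(id);
    }
}
```

The lookup key is now a small, always-valid object, so nobody has to construct a half-initialized House just to find the real one.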
Map doesn't have a getKey(Object o) because it's not a bidirectional map. It only maps keys to values, not the other way around.
Set doesn't have get(Object o) because that's the job for a Map.
Mapping a House object to another House object is just bad design on your part. You want to get a House by an address or a number or similar, so you have one or more maps that give you those mappings (or more likely, a database). Your question makes sense only to you, because you're thinking "in the wrong way".
Your "wrong way of thinking" is evidenced by your statement
I don't need a map and I think it's not adequate to return a map in a
method like getHouses(). The getter should return a Set and not a Map.
I have never heard that a getter can't return a Map. Although I would probably name it getHouseMap(). You're creating a huge problem out of a trivial little issue. This is the job for a database anyways, so your dataset must be quite small.

What is the best way to compare two complex Java objects and generate events depending on the comparison

I have a requirement to compare two complex objects, e.g.:
class Policy {
    private VehicleInformation info1;
    private DriverInformation info2;
    ...
}
I have two populated instances of this class. I want to compare those instances and, depending on the differences, show them in the UI marked in colors using some flag.
What is the best way to compare these objects? Can we achieve it using XML, because the Java code will be complex?
Override the equals() and hashCode() method in your Policy class. Then you can check for equality like:
if(object1.equals(object2)) {
// do something
}
Implement Comparable and override the compareTo() method if you need to order the objects.
One solution: use Jackson to serialize your objects as JSON, then use this: it is a Java implementation of JSON Patch which also can generate differences between two JSONs as JSON Patches.
Which means you can know what has changed and where. And since this is JSON, you can send the result to your browser and have it handled by some JavaScript code easily. Unlike XML!
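If serializing is more than you need, a plain-Java alternative is to list the properties explicitly and flag the ones that differ. The sketch below assumes a simplified Policy whose two parts are plain strings; PolicyDiff and the property names are made up for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;
import java.util.function.Function;

// Simplified stand-in: the real class would hold richer vehicle/driver objects.
final class Policy {
    private final String vehicleInfo;
    private final String driverInfo;

    Policy(String vehicleInfo, String driverInfo) {
        this.vehicleInfo = vehicleInfo;
        this.driverInfo = driverInfo;
    }

    String getVehicleInfo() { return vehicleInfo; }
    String getDriverInfo() { return driverInfo; }
}

final class PolicyDiff {
    // Property name -> accessor; LinkedHashMap keeps the order stable for display.
    private static final Map<String, Function<Policy, Object>> PROPERTIES = new LinkedHashMap<>();
    static {
        PROPERTIES.put("vehicleInfo", Policy::getVehicleInfo);
        PROPERTIES.put("driverInfo", Policy::getDriverInfo);
    }

    // Maps each property name to true if its value differs between the two policies.
    static Map<String, Boolean> changedFlags(Policy before, Policy after) {
        Map<String, Boolean> flags = new LinkedHashMap<>();
        for (Map.Entry<String, Function<Policy, Object>> entry : PROPERTIES.entrySet()) {
            Object oldValue = entry.getValue().apply(before);
            Object newValue = entry.getValue().apply(after);
            flags.put(entry.getKey(), !Objects.equals(oldValue, newValue));
        }
        return flags;
    }
}
```

The resulting map can be handed to the UI layer, which colors every property whose flag is true. This relies on each property type having a meaningful equals().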
