Diff between two instances of same class [duplicate]

This question already has answers here:
find out the differences between two java beans for version tracking
(4 answers)
Closed 6 years ago.
I have two instances of the same class. I need to find the properties that differ between them (that is, the property values; firstName, say, might differ between the two). The fields are primitives, complex objects, and collections.
Basically, I need to find the differences between the two instances, and wherever a field differs, copy its value from the first instance to a third instance (a diff object).
I think I can use reflection, but the classes are very complex and it might become error-prone.

Your question is a bit unclear. First you said "two instances of the same class", then you said "the classes are very complex". Do you want a generic solution that finds differences between instances of any class, or is this just a specific case? The other ambiguity is what you mean by "copy the differences". For example, if it's a String, what is the difference between the two strings that would get copied into the new instance? If it's a collection, what is the difference between the collections? What if you have two ArrayLists that hold the same elements in different orders?
If it's a specific case, then you can just add a method to the class that takes another instance of the same class. Then you can go through each field and compare the values.
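public TheClass difference(TheClass that) {
    TheClass diff = new TheClass();
    // Copy a field into the diff object only when the two instances disagree.
    // Assumes getName() never returns null.
    if (!this.getName().equals(that.getName())) {
        diff.setName(that.getName());
    }
    ...
    return diff;
}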
But this can get out of hand depending on how deep your object graph is.
Perhaps the builder pattern might come in handy here if you're trying to build a generic/reusable solution. You might look at the Apache Commons code base and see how they implement HashCodeBuilder and ToStringBuilder. They even have a reflection version of the utilities. See how they handle deep and shallow equals.

There is no other way than to use reflection if you want to do it generically. If your fields have getters, you could instead compare them one by one with the equals method, but then you have to know the fields at compile time.

write a method like
public YourClass diff(YourClass other) {
// diff code here
}
or
public static YourClass diff(YourClass first, YourClass second) {
// diff code here
}
and wrap some unit tests around the implementations.
Reflection might seem more complicated, but it has the following advantages:
1) It can be generic and you can make the same code work with any class
2) It will not have to be changed if the class evolves.
If you want to go with reflection, you can use Class.getDeclaredFields() (google for "java class api 6") to get all the fields declared on the class (inherited fields are not included), and then use the Field API to read their values from an object. If you want to be really fancy, keep a static String[] "diffIgnoredFields" listing the field names you want to ignore in the diff.
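Here is a minimal sketch of that reflective approach, under a few assumptions: the comparison is shallow (Objects.equals on each field), only fields declared directly on the class are visited, and the result is returned as a map of field name to the first instance's value rather than as a third instance. ReflectiveDiff, diff, ignoredFields and Person are hypothetical names used only for illustration.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.Set;

// Hypothetical helper; not part of the JDK or any library.
final class ReflectiveDiff {
    private ReflectiveDiff() {}

    /**
     * Returns field name -> value taken from 'first' for every declared,
     * non-static field whose value differs between the two instances.
     * Both arguments are assumed to be instances of the same class.
     */
    static <T> Map<String, Object> diff(T first, T second, Set<String> ignoredFields) {
        Map<String, Object> changed = new LinkedHashMap<>();
        for (Field field : first.getClass().getDeclaredFields()) {
            if (Modifier.isStatic(field.getModifiers())
                    || field.isSynthetic()
                    || ignoredFields.contains(field.getName())) {
                continue; // skip constants, compiler-generated and ignored fields
            }
            field.setAccessible(true); // allow reading private fields
            try {
                Object a = field.get(first);
                Object b = field.get(second);
                if (!Objects.equals(a, b)) { // shallow comparison only
                    changed.put(field.getName(), a);
                }
            } catch (IllegalAccessException e) {
                throw new IllegalStateException("Cannot read " + field.getName(), e);
            }
        }
        return changed;
    }
}

// Hypothetical class used only to exercise the sketch.
final class Person {
    private final String firstName;
    private final int age;
    private final List<String> tags;

    Person(String firstName, int age, List<String> tags) {
        this.firstName = firstName;
        this.age = age;
        this.tags = tags;
    }
}
```

Calling ReflectiveDiff.diff(a, b, ignored) on two Person objects that differ only in firstName would report just that field. Nested objects are compared with their own equals method, so a deep diff would need recursion on top of this.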

You need to be careful. Some formulations of this problem are essentially the same as the Subgraph Isomorphism Problem, which is known to be NP-complete.

Related

Java ArrayList subclass extraction

I'm creating an application where I use genetic algorithm (not implemented yet) to make creatures follow food and avoid obstacles.
In my simulation class (where the magic happens) I have an ArrayList where all the creatures are stored. Note that the ArrayList is declared to hold the abstract class Creature, while the actual creatures are all instances of its subclasses.
My question is: how can I make another ArrayList or similar by iterating over the first one and extracting only the instances of a particular subclass? I had a look and it seems there is no way for me to do so because of how Java Collections work. Is there any kind of workaround or some library that could make this possible for me?
It is important for me to have separate lists because I need to apply behaviours to different kind of creature and weigh them according to the "dna" of the creature.
GitHub repository for the whole project: https://github.com/Jamesinvi/Animosity/tree/master/Animosity
I tried this but I get a list of all creatures because they are all of the Creature class
//in PSEUDOCODE i would like to do this:
ArrayList<Creature> newlist=new ArrayList<Creature>();
for(Creature old:oldList){
if (old instanceof CreatureSubclass){
newlist.add(old);
}
}
Disclaimer: I am a student so forgive me if this is kind of a stupid question but I am struggling a bit with this. Thanks for the help :)
ArrayList<Creature> oldlist = new ArrayList<Creature>();
ArrayList<Creature> newlist = new ArrayList<Creature>();
for (int i = 0; i < oldlist.size(); i++) {
    if (oldlist.get(i) instanceof CreatureSubclass) {
        newlist.add(oldlist.get(i));
    }
}
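For what it's worth, the loop above can be made generic so that the extracted list has the subclass as its element type. The following is only a sketch: Herbivore, Carnivore, CreatureFilter and ofType are hypothetical names, and the Creature class here is an empty stand-in for the one in the project.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-ins for the project's classes; the subclass names are hypothetical.
abstract class Creature {}
final class Herbivore extends Creature {}
final class Carnivore extends Creature {}

final class CreatureFilter {
    private CreatureFilter() {}

    /**
     * Returns a new list holding references (not copies) to the elements
     * of 'all' that are instances of 'type'.
     */
    static <T extends Creature> List<T> ofType(List<? extends Creature> all, Class<T> type) {
        List<T> result = new ArrayList<>();
        for (Creature creature : all) {
            if (type.isInstance(creature)) {      // runtime equivalent of instanceof
                result.add(type.cast(creature));  // checked cast, no compiler warning
            }
        }
        return result;
    }
}
```

With this, CreatureFilter.ofType(creatures, Herbivore.class) gives a List<Herbivore>, so subclass-specific methods can be called on its elements without further casts. The new list shares the creature objects with the old one.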

Totally agree with OldCurmudgeon. You should not extract the subclass.
If you really want to do that, one ugly method is to add a String member variable called flag to the Creature class. Then you can use a string comparison instead of instanceof, which is very dangerous.
if (oldCreature.flag.equals("SmallCreature"))
{
newList.add(oldCreature); // another possible error: do you need a new copy or just reference?
}
You could also consider using an enum instead of a string, which would be a feasible and simple solution.
public enum CreatureName{SmallCreature, LargeCreature}
And again, if you want to apply different behaviors to the different kinds of subclass (dna), have you ever considered design patterns like strategy, or an abstract class? The visitor pattern mentioned by JB Nizet may be a good one, but it may be overkill for this question.

Any simple examples for a Rubyist on how to use java clone the right way and the wrong way?

I'm a rubyist and have started learning java. Came across some dialog that says not to use clone() method in java or if I do, make sure to know what I'm doing with it.
Java clone method seems to be a popular topic on stackoverflow but most questions have been about advanced topics related to why cloning is not working or shallow or deep copy etc. Don't know what to make of that. What about a few simple examples of how to use clone the right way and the wrong way?
It looks like clone is in the interface of an object but has absolutely no implementation. If there is no implementation, why do I have to throw CloneNotSupportedException? Could someone provide a comprehensive list of examples of how clone can be used the right way as well as the wrong way?
thank you in advance.

I think clone() could perhaps be used with your own well-defined, final data structure (or record) that is not exactly a class (it has all public fields and no methods). Also, these fields should be either primitive data types or immutable types (like String), so they can be shared without problems.
Cloning such a structure by assigning all fields manually simply means more code and makes maintenance more difficult (more changes after you add or remove a field), and I really do not understand what benefit this brings.
C / C++ has an assignment statement for structures that copies all fields in one go, and in some cases does this transparently ("structure passed by value"). Java could use clone for a similar goal. After all, simply assigning a double value to a variable is a kind of cloning: all fields of the IEEE data structure (sign bit, exponent, fraction) are copied. There have never been any fundamental problems with this. How is this different from cloning a final Point class with two public integer fields, x and y?
Most of the arguments against clone() are valid for cases where the exact class of the instance in use is not known, or where there may be unknown, invisible fields that may not get initialized correctly.
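To make the Point case concrete, here is a small sketch of one reasonable use and one common pitfall. Both classes are hypothetical. The first follows the conditions above (final class, primitive fields only); the second has a mutable array field, and because Object.clone() makes a shallow copy, the clone ends up sharing that array with the original.

```java
// Reasonable use: a final class whose fields are all primitives.
final class Point implements Cloneable {
    public int x;
    public int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public Point clone() {
        try {
            // Object.clone() copies every field; that is a full copy here
            // because both fields are primitives.
            return (Point) super.clone();
        } catch (CloneNotSupportedException e) {
            // Cannot happen: this class implements Cloneable.
            throw new AssertionError(e);
        }
    }
}

// Pitfall: the same clone() code on a class with a mutable field.
final class ShallowPolyline implements Cloneable {
    public int[] xs = {1, 2, 3};

    @Override
    public ShallowPolyline clone() {
        try {
            // Copies the array reference, not the array, so the original
            // and the clone share one int[].
            return (ShallowPolyline) super.clone();
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e);
        }
    }
}
```

Changing q.x on a cloned Point leaves the original untouched, whereas writing to the clone's xs[0] on a ShallowPolyline also changes the original. Fixing the second class would mean copying the array explicitly inside clone(), for example with xs.clone().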

Checking for deep equality in JUnit tests

I am writing unit tests for objects that are cloned, serialized, and/or written to an XML file. In all three cases I would like to verify that the resulting object is the "same" as the original one. I have gone through several iterations in my approach and having found fault with all of them, was wondering what other people did.
My first idea was to manually implement the equals method in all the classes, and use assertEquals. I abandoned this approach after deciding that overriding equals to perform a deep compare on mutable objects is a bad thing, as you almost always want collections to use reference equality for the mutable objects they contain[1].
Then I figured I could just rename the method to contentEquals or something. However, after thinking more, I realized this wouldn't help me find the sort of regressions I was looking for. If a programmer adds a new (mutable) field, and forgets to add it to the clone method, then he will probably forget to add it to the contentEquals method too, and all these regression tests I'm writing will be worthless.
I then wrote a nifty assertContentEquals function that uses reflection to check the value of all the (non-transient) members of an object, recursively if necessary. This avoids the problems with the manual compare method above, since it assumes by default that all fields must be preserved and the programmer must explicitly declare fields to skip. However, there are legitimate cases when a field really shouldn't be the same after cloning[2]. I put in an extra parameter to assertContentEquals that lists which fields to ignore, but since this list is declared in the unit test, it gets real ugly real fast in the case of recursive checking.
So I am now thinking of moving back to including a contentEquals method in each class being tested, but this time implemented using a helper function similar to the assertContentEquals described above. This way, when operating recursively, the exemptions will be defined in each individual class.
Any comments? How have you approached this issue in the past?
Edited to expound on my thoughts:
[1] I got the rationale for not overriding equals on mutable classes from this article. Once you stick a mutable object in a Set/Map, if a field changes then its hash will change but its bucket will not, breaking things. So the options are to not override equals/hashCode on mutable objects, or to have a policy of never changing a mutable object once it has been put into a collection.
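The bucket problem in [1] can be reproduced in a few lines. This is only an illustrative sketch: MutableName and MutableKeyDemo are hypothetical classes, with equals and hashCode deliberately based on a field that can change.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical mutable class whose equals/hashCode depend on a mutable field.
final class MutableName {
    String value;

    MutableName(String value) {
        this.value = value;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof MutableName
                && Objects.equals(value, ((MutableName) o).value);
    }

    @Override
    public int hashCode() {
        return Objects.hashCode(value);
    }
}

final class MutableKeyDemo {
    private MutableKeyDemo() {}

    /** Returns whether the set still finds the element after its field was changed. */
    static boolean foundAfterMutation() {
        Set<MutableName> set = new HashSet<>();
        MutableName name = new MutableName("a");
        set.add(name);        // stored under the hash of "a"
        name.value = "b";     // hash now differs from the one the set recorded
        return set.contains(name);
    }
}
```

With java.util.HashSet the lookup fails even though the very same object is still in the set, because the set searches using the new hash while the entry was filed under the old one.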
I didn't mention that I am implementing these regression tests on an existing codebase. In this context, the idea of changing the definition of equals, and then having to find all the places where it could change the behavior of the software, frightens me. I feel like I could easily break more than I fix.
[2] One example in our code base is a graph structure, where each node needs a unique identifier that is used to link the nodes when they are eventually written to XML. When we clone these objects we want the identifier to be different, but everything else to remain the same. After ruminating about it more, it seems like the questions "is this object already in this collection" and "are these objects defined the same" use fundamentally different concepts of equality in this context. The first is asking about identity, and I would want the ID included if doing a deep compare, while the second is asking about similarity, and I don't want the ID included. This is making me lean more against implementing the equals method.
Do you guys agree with this decision, or do you think that implementing equals is the better way to go?

I would go with the reflection approach and define a custom Annotation with RetentionPolicy.RUNTIME to allow the implementers of the tested classes to mark the fields that are expected to change after cloning. You can then check the annotation with reflection and skip the marked fields.
This way you can keep your test code generic and simple and have a convenient means to mark exceptions directly in the code without affecting the design or runtime behavior of the code that needs to be tested.
The annotation could look like this:
import java.lang.annotation.*;
@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.FIELD})
public @interface ChangesOnClone
{
}
This is how it can be used in the code that is to be tested:
class ABC
{
    private String name;
    @ChangesOnClone
    private Cache cache;
}
And finally the relevant part of the test code:
for ( Field field : fields )
{
    // Skip fields marked as expected to change after cloning.
    if( field.isAnnotationPresent( ChangesOnClone.class ) )
        continue;
    // else test it
}
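Put together, the three pieces might look like the sketch below. It is deliberately simplified and rests on assumptions: the comparison is a single level deep (Objects.equals per field, no recursion), and ContentAssert, assertContentEquals and GraphNode are hypothetical names, the latter standing in for a node class with an identifier that changes on clone.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.Objects;

@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface ChangesOnClone {}

// Hypothetical class under test: the id is expected to differ after cloning.
final class GraphNode {
    private final String label;
    @ChangesOnClone
    private final long id;

    GraphNode(String label, long id) {
        this.label = label;
        this.id = id;
    }
}

// Hypothetical test helper; one level deep only.
final class ContentAssert {
    private ContentAssert() {}

    /** Throws AssertionError at the first unmarked, non-transient field that differs. */
    static void assertContentEquals(Object expected, Object actual) {
        if (expected.getClass() != actual.getClass()) {
            throw new AssertionError("Different classes");
        }
        for (Field field : expected.getClass().getDeclaredFields()) {
            int mods = field.getModifiers();
            if (Modifier.isStatic(mods) || Modifier.isTransient(mods)
                    || field.isSynthetic()
                    || field.isAnnotationPresent(ChangesOnClone.class)) {
                continue; // not part of the compared content
            }
            field.setAccessible(true);
            try {
                if (!Objects.equals(field.get(expected), field.get(actual))) {
                    throw new AssertionError("Field '" + field.getName() + "' differs");
                }
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
        }
    }
}
```

Two GraphNode objects with the same label but different ids pass the check, while a differing label fails it. A recursive version would call assertContentEquals on field values instead of Objects.equals, with some guard against cycles in the object graph.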

AssertJ offers a recursive comparison function:
assertThat(newObject).usingRecursiveComparison().isEqualTo(oldObject);
See the AssertJ documentation for details: https://assertj.github.io/doc/#basic-usage
Prerequisites for using AssertJ:
import:
import static org.assertj.core.api.Assertions.*;
maven dependency:
<!-- test -->
<dependency>
<groupId>org.assertj</groupId>
<artifactId>assertj-core</artifactId>
<version>3.19.0</version>
<scope>test</scope>
</dependency>

Java need help checking if string is instance

I have an interface, GenericExpression, that gets extended to create expressions (i.e. AndExpression, OrExpression, etc.).
Each GenericExpression implementation has a string that represents it (i.e. "&", "+", etc.), stored as a static variable "stringRep".
Is there any way to take a user input String and check if it represents a GenericExpression?
If not (seems likely this is the case), is there any way to achieve a similar effect with a refactored design?
Thanks!
EDIT: Offered a little bit more detail above.
Also, the end goal is to be able to implement GenericExpression arbitrarily and still check whether a string represents an instance of one of its implementations. As such, I can't just store a map of implementation/string-representation pairs, because that would make GenericExpression no longer easily extensible.
Also, this is homework

Well, I think you will need to define somewhere which expressions are supported by your program. I think the best way is to use a map from strings to implementations of your interface. That way you can easily look up an expression by its representing string. Where you define this map depends on your design. One possibility is a static method in a helper class that resolves a string to an expression, like:
Expressions.get("&").invoke(true, false);
Where get is a static method on Expressions that looks up the desired expression in a static map. You will have to initialize this map in a static initializer, or let the expression instances add themselves on creation.
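A minimal sketch of such a helper class follows. The invoke(boolean, boolean) signature is inferred from the call above, and the pairing of "&" with AndExpression and "+" with OrExpression is an assumption; isExpression is a hypothetical extra method that answers the original "does this string represent an expression" question.

```java
import java.util.HashMap;
import java.util.Map;

// Assumed shape of the interface, inferred from Expressions.get("&").invoke(true, false).
interface GenericExpression {
    boolean invoke(boolean left, boolean right);
}

final class AndExpression implements GenericExpression {
    @Override
    public boolean invoke(boolean left, boolean right) {
        return left && right;
    }
}

final class OrExpression implements GenericExpression {
    @Override
    public boolean invoke(boolean left, boolean right) {
        return left || right;
    }
}

final class Expressions {
    private static final Map<String, GenericExpression> BY_SYMBOL = new HashMap<>();

    static {
        // Central registration: every supported expression is listed here.
        BY_SYMBOL.put("&", new AndExpression());
        BY_SYMBOL.put("+", new OrExpression());
    }

    private Expressions() {}

    /** Returns true if the string is the representation of a known expression. */
    static boolean isExpression(String symbol) {
        return BY_SYMBOL.containsKey(symbol);
    }

    /** Looks up the expression for a symbol, failing loudly on unknown input. */
    static GenericExpression get(String symbol) {
        GenericExpression expression = BY_SYMBOL.get(symbol);
        if (expression == null) {
            throw new IllegalArgumentException("Unknown expression: " + symbol);
        }
        return expression;
    }
}
```

Adding a new expression then means writing the class and adding one line to the static initializer.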
EDIT:
(I wanted to comment this on an answer but it seems to be deleted)
Personally I don't like the idea of classes registering themselves. It gives me the feeling of not being in control of my code. I would prefer to instantiate the classes in the Expressions class itself. The code for registering a class must be written for every new subclass anyway. I prefer to centralize this code in a single class so if I want to change logic or refactor, I only have to touch one class.

trying to use only one method name

When I was programming a Form Validator in PHP, when creating new methods, I needed to increase the number of arguments in old methods.
When I was learning Java, when I read that extends is to not touch previously tested, working code, I thought I shouldn't have increased the number of arguments in the old methods, but overridden the old methods with the new methods.
Imagine you have to verify whether a field is empty in one part of the form, in another, and in yet another.
If the arguments are different, you'll overload isEmpty. But if the arguments are the same, is it right to use isEmpty, isEmpty2, isEmpty3, or three classes with one isEmpty per class? Or, if both are wrong, what should I have done?

So the question is:
If I need different behaviors for a method isEmpty which receives the same number arguments, what should I do?
Use different names? ( isEmpty, isEmpty2, isEmpty3 )
Have three classes with a single isEmpty method?
Other?
If that's the question, then I think you should use:
Different names, when the methods belong to the same logical unit (they are the same sort of validation). But don't use numbers as versions; it is better to name them after what they do: isEmptyUser, isEmptyAddress, isEmptyWhatever.
Separate classes, when the validator object can be created in one place and passed around during the program lifecycle. Let's say: Validator v = Validator.getInstance( ... ); and then use it as v.isEmpty() and let polymorphism do its job.
Alternatively you could pack the arguments in one class and pass it to the isEmpty method, although you'll end up with pretty much the same problem of the name. Still it's easier to refactor from there and have the new class doing the validation for you.
isEmpty( new Arguments(a,b,c ) ); => arguments.isEmpty();

The Open/Closed Principle [usually attributed to Bertrand Meyer] says that "software entities (classes, modules, functions, etc.) should be open for extension, but closed for modification". This might be the principle that you came across in your Java days. In real life this applies to completed code where the cost of modification, re-testing and re-certification outweighs the benefit of the simplicity gained by making a direct change.
If you are changing a method because it needs an additional argument, you might choose to use the following steps:
1) Copy the old method.
2) Remove the implementation from the copy.
3) Change the signature of the original method to add the new argument.
4) Update the implementation of the original method to use the new argument.
5) Implement the copy in terms of the new method with a default value for the argument.
If your implementation language doesn't support method overloading then the principle is the same but you need to find a new name for the new method signature.
The advantage of this approach is that you have added the new argument to the method, and your existing client code will continue to compile and run.
This works well if there is an obvious default for the new argument, and less well if there isn't.
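The end result of those steps might look like the sketch below. FormValidator and the trim argument are hypothetical, chosen only to show the old signature delegating to the new one with a default value.

```java
// Hypothetical validator; 'trim' stands in for the newly added argument.
final class FormValidator {
    private FormValidator() {}

    /** New signature: the implementation lives here and uses the extra argument. */
    static boolean isEmpty(String value, boolean trim) {
        if (value == null) {
            return true;
        }
        return (trim ? value.trim() : value).isEmpty();
    }

    /** Old signature, kept so existing callers still compile; delegates with the default. */
    static boolean isEmpty(String value) {
        return isEmpty(value, false);
    }
}
```

Existing calls such as FormValidator.isEmpty(name) keep their old behavior, and new code can opt in with FormValidator.isEmpty(name, true).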

Since Java 5 you can use a variable-length argument list (varargs), as in void foo(Object... params)
You will need to come up with creative names for your methods, since you can't overload methods that have the same number and types of arguments (and you can't overload on return type alone). I actually prefer this to overloading anyway. So you can have isEmpty and isEmptyWhenFoo and isEmptyWhenIHaveTheseArguments (well, maybe not the last one :)

Not sure if this actually answers your question, but the best way to think about OO in "real life" is to think of the Nygaard Classification:
ObjectOrientedProgramming. A program execution is regarded as a physical model, simulating the behavior of either a real or imaginary part of the world.
So how would you build a physical device to do what you are trying to do in code? You'd probably have some kind of "Form" object, and the form object would have little tabs or bits connected to it to represent the different Form variables, and then you would build a Validator object that would take the Form object in a slot and then flash one light if the form was valid and another if it was invalid. Or your Validator could take a Form object in one slot and return a Form object out (possibly the same one), but modified in various ways (that only the Validator understood) to make it "valid". Or maybe a Validator is part of a Form, and so the Form has this Validator thingy sticking out of it...
My point is, try to imagine what such a machine would look like and how it would work. Then think of all of the parts of that machine, and make each one an object. That's how "object-oriented" things work in "real life", right?
With that said, what is meant by "extending" a class? Well, a class is a "template" for objects -- each object instance is made by building it from a class. A subclass is simply a class that "inherits" from a parent class. In Java at least, there are two kinds of inheritance: interface inheritance and implementation inheritance. In Java, you are allowed to inherit implementation (actual method code) from at most one class at a time, but you can inherit many interfaces -- which are basically just collections of attributes that someone can see from outside your class.
Additionally, a common way of thinking about OO programming is to think about "messages" instead of "method calls" (in fact, this is the original term invented by Alan Kay for Smalltalk, which was the first language to actually be called "object-oriented"). So when you send an isEmpty message to the object, how do you want it to respond? Do you want to be able to send different arguments with the isEmpty message and have it respond differently? Or do you want to send the isEmpty message to different objects and have them respond differently? Either is an appropriate answer, depending on the design of your code.

Instead of having one class provide multiple versions of isEmpty with differing names, try breaking your model down into finer-grained pieces that can be put together in more flexible ways.
Create an interface called Empty with one method, isEmpty(String value).
Create implementations of this interface, like EmptyIgnoreWhiteSpace and EmptyIgnoreZero.
Create a FormField class whose validation methods delegate to implementations of Empty.
Your Form object will have instances of FormField, which will know how to validate themselves.
Now you have a lot of flexibility: you can combine your Empty implementation classes to make new classes like EmptyIgnoreWhiteSpaceAndZero, and you can use them in other places that have nothing to do with form field validation.
You don't have to have multiple similarly named methods polluting your object model.
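As a rough sketch of that structure: the interface and class names come from the description above, but what each implementation actually counts as "empty" is an assumption made here for illustration (whitespace-only strings for one, the string "0" for the other).

```java
// Strategy interface: one notion of "empty" per implementation.
interface Empty {
    boolean isEmpty(String value);
}

// Assumption: whitespace-only input counts as empty.
final class EmptyIgnoreWhiteSpace implements Empty {
    @Override
    public boolean isEmpty(String value) {
        return value == null || value.trim().isEmpty();
    }
}

// Assumption: the string "0" counts as empty.
final class EmptyIgnoreZero implements Empty {
    @Override
    public boolean isEmpty(String value) {
        return value == null || value.isEmpty() || value.equals("0");
    }
}

// A field knows its value and which emptiness rule applies to it.
final class FormField {
    private final String value;
    private final Empty emptiness;

    FormField(String value, Empty emptiness) {
        this.value = value;
        this.emptiness = emptiness;
    }

    /** Delegates the decision to the configured Empty implementation. */
    boolean isEmpty() {
        return emptiness.isEmpty(value);
    }
}
```

Each field is then built with the rule it needs, for example new FormField(input, new EmptyIgnoreWhiteSpace()), and callers only ever see a single isEmpty() method.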
