This question also relates to the discussions in the following questions:
https://stackoverflow.com/search?page=2&tab=Relevance&q=one%20to%20many%20unidirectional%20java
Best practise for adding a bidirectional relation in OO model
I tried implementing the 8 association combinations formed by [Unidirectional/Bidirectional] X [(One/Many) to (One/Many)] in Java. I found that two cases cannot be implemented, namely Unidirectional One to One and Unidirectional One to Many (e.g. Person->*Vehicle). The other 6 combinations and Composition are possible programmatically.
I feel this is not specific to Java: these 2 cases simply do not exist. For example, allocating one Aadhar/SSN number to only one person is possible only if we know that the number is not allocated to anybody else (reverse navigation is must). Does this mean that, when building our design model, we need to take care not to arrive at these specific associations (though they might be present in the analysis model)? I am confused about this.
Basic (No Aggregation)
If you are looking at basic unidirectional association, then that's the simplest of them all.
Unidirectional One to One
class Person {
String name;
}
Unidirectional One to Many
class Person {
List<Vehicle> vehicles;
}
Composite Aggregation
If I assume that you are asking about composite relationships (where one SSN can be assigned to at most one person), then you can still implement it.
How exactly you implement it, however, depends on your specific domain and on e.g. how you store your data, because the claim that
reverse navigation is must
is not actually true: you can just check all Person instances, or you can store all the SSNs in a data structure that lets you quickly check whether a new one is unique, and then assign it to the Person without additional checks (because you already know that it is unique).
Or you can also implement the opposite lookup, which is not prohibited even if the association is "uni-directional".
To quote the UML Specs (11.5.3.1 Associations) [emphasis mine]:
Navigability means that instances participating in links at runtime (instances of an Association) can be
accessed efficiently from instances at the other ends of the Association. The precise mechanism by which such efficient
access is achieved is implementation specific. If an end is not navigable, access from the other ends may or may not be
possible, and if it is, it might not be efficient.
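As a rough illustration of the "smart data structure" idea above, here is a minimal sketch that keeps the association unidirectional (a Person knows its SSN, nothing points back) while still guaranteeing uniqueness. The names SsnRegistry, reserve and assignSsn are invented for this example and do not come from the question.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical registry: remembers which SSNs are taken, but not who holds them,
// so there is still no navigation from an SSN back to a Person.
class SsnRegistry {
    private final Set<String> allocated = new HashSet<>();

    // Returns true if the number was free and is now reserved.
    boolean reserve(String ssn) {
        return allocated.add(ssn);
    }
}

class Person {
    private final String name;
    private String ssn; // unidirectional: Person -> SSN only

    Person(String name) {
        this.name = name;
    }

    // Assigns the SSN only if the registry confirms nobody else has it.
    void assignSsn(String ssn, SsnRegistry registry) {
        if (!registry.reserve(ssn)) {
            throw new IllegalStateException("SSN already allocated: " + ssn);
        }
        this.ssn = ssn;
    }

    String getName() { return name; }
    String getSsn() { return ssn; }
}
```

Whether such a registry lives in memory, in a unique database index, or somewhere else is exactly the implementation-specific part.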
Update from comments
No one claims that upholding the relationship constraints has to be done in the accessors. In fact, you will pretty much always have temporarily invalid relationships. Imagine:
person = new Person();
// right now person is in an invalid state because it doesn't have an SSN
ssn = ssnGenerator.createNew();
// now ssn is also in invalid state because it has no person
person.setSSN(ssn);
// only now are person and ssn valid
(Creating a constructor wouldn't help, because the constructor is called after the object has already been created, so another part of the constructor could need the ssn to be set already.)
So it is the responsibility of the programmer to ensure that the system upholds all constraints in whatever way it makes most sense. Using constructors/accessors is the easiest way in some circumstances, but you could e.g. wrap the code above in an atomic transaction. After all, if you kept your validation in the setSSN(), then what would happen if the programmer were to forget to call the method at all?
(person 1->* vehicle)
p1.add(v1) and p2.add(v1) are possible violations
You asked about "person ->* vehicle", now you've changed it to "person 1 -> * vehicle", so obviously the answer differs. But the same principle as above applies -- it is the responsibility of the system to uphold all constraints, and whether that's done in accessors, validation methods, or the way the system is constructed is an implementation detail -- there's no single best way, and there will always be trade-offs.
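One possible way to uphold the "1" end of person 1 -> * vehicle without giving Vehicle a back-reference is sketched below. The Ownership class and its assign method are hypothetical, and the rule that only Ownership calls addVehicle is a convention in this sketch, not something the code enforces.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

class Vehicle {
    final String plate;

    Vehicle(String plate) { this.plate = plate; }
}

class Person {
    private final List<Vehicle> vehicles = new ArrayList<>();

    // Unidirectional: Person -> Vehicle. Vehicle has no owner field.
    List<Vehicle> getVehicles() {
        return Collections.unmodifiableList(vehicles);
    }

    // Intended to be called only by Ownership (a convention, not enforced here).
    void addVehicle(Vehicle v) {
        vehicles.add(v);
    }
}

// Hypothetical system-level guard: each vehicle may be assigned at most once.
class Ownership {
    // Identity-based set of vehicles that already have an owner.
    private final Set<Vehicle> owned =
            Collections.newSetFromMap(new IdentityHashMap<Vehicle, Boolean>());

    void assign(Person p, Vehicle v) {
        if (!owned.add(v)) {
            throw new IllegalStateException("vehicle already has an owner: " + v.plate);
        }
        p.addVehicle(v);
    }
}
```

With this in place, the second of the two calls p1.add(v1) and p2.add(v1) from the comment would be rejected, at the cost of routing all assignments through one object.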
In Java apps, I prefer to use unique fields in equals and hashCode methods instead of adding only id field or all the fields. However, I am confused about the following points:
By considering object states in Hibernate, I think it is good practice not using id field in equals and hashCode methods, right?
When there is a unique field in a class, is it enough to use only one of the unique fields in equals and hashCode methods (except from id field)?
Should I add all the fields except from id field when there is not any unique field except from id field in a class? Or should I only add some numeric field instead of adding text fields?
JPA and Hibernate don't specify or rely on any particular semantics for entities' equals() and hashCode() methods, so you can do what you want.
Good alternatives
With that said, there is a handful of alternatives for equality that make much more sense to me than any others:
Equality corresponds to object identity. This is of course the default provided by Object.equals(), and it can serve perfectly well for entities. OR
Equality corresponds to persistent identity. That is, entities are equal if and only if they have the same entity type and primary key. OR
Equality corresponds to (only) value equality. That is, equality of all corresponding persistent fields except the ID. There are additional variations around how that applies to mapped relationships. OR
Equality corresponds to persistent identity AND value equality. Again, there are variations around how the value equality part applies to mapped relationships.
General advice
In general, you would do well to follow a few rules of thumb:
As with most other classes, especially mutable ones, default to just inheriting Object.equals() and Object.hashCode(). Have a specific purpose and plan before you do otherwise, and remember that you get only one choice for this, and that it is impactful.
If you do override equals() (and therefore hashCode() as well) then do it in a consistent way across all your entities.
Think carefully before you go with an option involving value equality. This is usually a poor choice for mutable classes in general, and entities are no exception.
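To make the last rule of thumb concrete, here is a small sketch of what can go wrong with value equality on a mutable class. The Customer class and its email field are invented for the example; the effect shown is ordinary java.util.HashSet behaviour, not anything specific to Hibernate.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical mutable class whose equality is based on a mutable value.
class Customer {
    String email;

    Customer(String email) { this.email = email; }

    @Override public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Customer)) return false;
        return Objects.equals(email, ((Customer) o).email);
    }

    @Override public int hashCode() {
        return Objects.hashCode(email);
    }
}

class MutableEqualityDemo {
    // Returns whether the set still finds the object after its value changed.
    static boolean foundAfterChange() {
        Set<Customer> customers = new HashSet<>();
        Customer c = new Customer("a@example.com");
        customers.add(c);
        c.email = "b@example.com"; // hash code changes while c sits in the set
        return customers.contains(c); // looked up under the new hash, so it is missed
    }
}
```

The object is still physically in the set, but lookups no longer reach it, which is the kind of surprise that makes value equality a risky default for entities.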
Specific Questions
1. By considering object states in Hibernate, I think it is good practice not using id field in equals and hashCode methods, right?
I think using the ID is fine. It's simply a question of what you want equality to represent for your entities. You absolutely can have distinct entity objects with the same type and ID, and you might want to be able to detect that with equals(). The other persistent fields might or might not factor into that.
In particular, an equals() method based solely on entity ID might make sense for entities that appear on the "many" side of a one-to-many relationship when that is mapped to a Set.
2. When there is a unique field in a class, is it enough to use only one of the unique fields in equals and hashCode methods (except from id field)?
I see no good reason to consider only a proper subset of unique fields, except that subset consisting only of the entity ID. Or if all the fields are unique then the one consisting of all the fields except the ID. The logic that suggests that you might be able to consider other proper subsets revolves around the persistent identity of the entity, which is completely and best represented by its ID.
3. Should I add all the fields except from id field when there is not any unique field except from id field in a class? Or should I only add some numeric field instead of adding text fields?
If your sense of equality is to be based on entity value then I don't see how it makes much sense to omit any persistent fields except, possibly, the ID. Do not arbitrarily omit the ID -- it may very well be something you want to include. Again, it depends on what equals() is intended to mean for your entities.
That's a tricky question that hibernate itself doesn't have a clear answer on.
John Bollinger's answer covers your specific question, but there is some additional context about how to think about equality and hibernate that should help figure out what to do. After all, given that hibernate doesn't require you to do anything particular, you can do whatever you want, which leads to the obvious question: ... okay, so what should I do, then?
That question boils down to the following (using Person as an arbitrary example of a model class plus associated table; furthermore, let's say the person table has a single unique ID that is generated: a random UUID or auto-sequenced integer value):
What does an instance of Person represent?
There are in broad strokes 3 answers:
It represents a person. A row in the person table also represents a person; these 2 things aren't related.
It represents a row in the person table.
It represents a state in my application, nothing more.
Even though these things sound quite similar, they result in opposite meanings as to equality.
Which choice is correct? That's up to you.
When reading on, remember:
Any Person instance which isn't "saved" yet, would have a null value for id, because upon insertion, hibernate will ask the DB to generate a value for it or generates one itself and only then fills it in.
An instance represents a row
Equality under the second model (an instance of Person represents a row in the table) should look only at the id column, because that defines row uniqueness; any 2 representations of a row in the person table are guaranteed to be referring to the same row (hence, equal) if and only if the id is equal. That is a necessary and sufficient condition: If they are equal the 2 objects are necessarily referring to the same row, and if they aren't equal, then they are necessarily referring to different rows.
Notably, if id is still null, then the object cannot be equal to any other object. More generally, the question "Is this object-representing-a-row equal to this other object-representing-a-row?" is meaningless if these objects represent rows-to-be (unsaved rows): if you invoke save() on each object, you end up with 2 rows. Ideally, attempting to invoke equals on such an object would be a failure, but equals is not supposed to throw, so false is the best answer. This would mean you want:
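class Person {
    UUID id; // java.util.UUID; stays null until the row has been saved
    // other fields

    @Override public boolean equals(Object other) {
        if (other == this) return true;
        if (other == null || other.getClass() != Person.class) return false;
        UUID otherId = ((Person) other).id;
        // an unsaved object (null id) is never equal to another object
        return id != null && id.equals(otherId);
    }
}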
This defines your equals method as 'ends up representing the same row'. This holds even if you change meaningful state:
Change the name and save the object? It's... still the same row, and this equality implementation reflects this.
Call save() on each in the comparison when they were unsaved? Then you get 2 rows - and this equality implementation reflects this before and after attempting to save it.
If invoking on self (a.equals(a)) this returns true as the equality spec demands; it also works out in the 'modelling a row' view: If you invoke save() on the same object twice, it's still just one row.
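An equals like this needs a matching hashCode. One option, sketched here under the assumption that the id is assigned on save and never changes afterwards, is a constant hash per class: it stays stable when the id goes from null to a real value, at the price of putting all instances in the same hash bucket. PersonRow is a stand-in name so that the sketch is self-contained.

```java
import java.util.UUID;

class PersonRow {
    UUID id;      // null until the row has been saved
    String name;  // not part of equality in the "instance is a row" model

    @Override public boolean equals(Object other) {
        if (other == this) return true;
        if (other == null || other.getClass() != PersonRow.class) return false;
        UUID otherId = ((PersonRow) other).id;
        return id != null && id.equals(otherId);
    }

    @Override public int hashCode() {
        // Constant per class: does not change when save() assigns the id,
        // so an object added to a HashSet before saving is still found afterwards.
        return PersonRow.class.hashCode();
    }
}
```

Hashing on the id itself would be faster for large sets, but then an object placed in a HashSet before it was saved could no longer be found after the id is filled in.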
An instance represents a person
The nature of what a person is is entirely unrelated to the autosequence/autogen ID it gets; the fact that we're using hibernate is an implementation detail that should play no part at all in considering equality; after all, this object represents the notion of a person, and that notion exists entirely independent of the database. The database is one thing that is modelling persons; instances of this class are another.
In this model you should do the exact opposite: Find something that uniquely identifies a person itself, and compare against that. After all, if you have 2 rows in a database that both contain the same social security number, then you have only 1 person.. and you just happen to have 2 rows that are both referring to the same person. Given that we chose our instance to imply that it represents a person, then an instance loaded from row A, and an instance loaded from row B, ought to be considered as equal - after all, they are representing the same individual.
In this case, you write an equals method that considers all relevant fields except the autoseq/autogen ID field! If there is a separate unique id such as social security number, use that. If there isn't, essentially it boils down to an equals method that compares all fields, except ID. Because that's the one field that definitely has zero bearing on what defines a person.
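A minimal sketch of this second style, assuming the social security number is the natural identifier; the class name PersonRecord and its fields are illustrative only.

```java
import java.util.Objects;

class PersonRecord {
    Long id;           // generated surrogate key, deliberately ignored below
    final String ssn;  // assumed natural identifier of the person
    String name;

    PersonRecord(Long id, String ssn, String name) {
        this.id = id;
        this.ssn = ssn;
        this.name = name;
    }

    @Override public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof PersonRecord)) return false;
        // Same SSN means same person, whichever row the object was loaded from.
        return Objects.equals(ssn, ((PersonRecord) o).ssn);
    }

    @Override public int hashCode() {
        return Objects.hashCode(ssn);
    }
}
```

Two instances loaded from different rows (different id) but carrying the same ssn compare as equal here, which is the behaviour this model asks for.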
An instance defines a state in your application
This is almost a cop-out, and in general means equality is irrelevant / not applicable. It's like asking how to implement an equals method to an InputStream implementation - mostly, you.. don't.
Here, the default behaviour (Object's own implementations) is what you want, and therefore you implement neither hashCode nor equals. Any instance of Person is equal to itself (as in a.equals(a), same reference) and not equal to any other, even if the other has identical values for each and every field, and even if the id field isn't null (so both represent the same row).
Such an object cannot meaningfully be used as a value object. For example, it would be pointless to stuff such things in a hashmap (at best, you can stuff them in an IdentityHashMap, as those semantics would apply. Only way to do any lookups is to have a ref that was .put() into it before and call .get() with that).
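For this third model, a short sketch of the IdentityHashMap usage mentioned above. PersonState and StateNotes are made-up names; the point is only that lookups succeed for the very reference that was put in and for nothing else.

```java
import java.util.IdentityHashMap;
import java.util.Map;

// No equals/hashCode overrides: Object's identity semantics apply.
class PersonState {
    String name;

    PersonState(String name) { this.name = name; }
}

// Hypothetical side table keyed by reference rather than by value.
class StateNotes {
    private final Map<PersonState, String> notes = new IdentityHashMap<>();

    void attach(PersonState state, String note) {
        notes.put(state, note);
    }

    // Returns the note only for the exact reference passed to attach().
    String noteFor(PersonState state) {
        return notes.get(state);
    }
}
```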
Which one is right? Up to you. But document it clearly, because in my experience, lots of hibernate users are absolutely convinced either the first or second model is the one, and only, right answer, and consider the other answer utterly bonkers. This is problematic - they'd be writing their code assuming all hibernate model classes work precisely as they want, and would therefore not even be thinking of checking docs/impl to know how it actually works.
For what it's worth, objects are objects and database rows do not neatly map to the notion of an object. SQL's and Java's notions of null are utterly incompatible, and the notion of 'a query' does not neatly map to tables (between selecting expressions, selecting on views, and JOINs, that should be obvious) - Hibernate is tilting at windmills. It is a leaky abstraction and this is one of its many, many leaks. Leaky abstractions can be useful; just be aware that at the 'edges' the principle Hibernate tries to peddle you (that objects can represent query results and rows) has limits you will run into. A lot.
This is either a Java coding design question, or a domain modelling question, I'm not sure yet!
Background, simplified as I can't share all the detail:
The system I'm working on is deployed in multiple instances (e.g. at different customers, and at one customer there may be development, test, preprod, prod instances)
The system configuration is a list of ConfigArtefact<T>, where T indicates that it might be a database connection configuration, or a predefined-query, or....
ConfigArtefacts are named. The names are semantically meaningful/well-known (e.g. there could be objects for "Console.translations.en", "Console.translations.fr", "Application.Database.connection.credentials" or "Reporting.Database.connection.credentials") and are distinct within each deployment: no two different ConfigArtefacts will have the same name in a given deployment.
ConfigArtefacts have other attributes (e.g. for the database, username and password) depending on the type used for <T>. The value of the attributes could be different in different deployments of this system.
There's no natural ordering of artefacts, even ones of the same type <T>. Where some arbitrary ordering is needed, I use the name.
Goal:
I need to write something that compares the configuration of two deployments of this system and identifies artefacts that have been added, removed, or changed. In order to find the same artefact on each deployment, I need to compare by name only (I always know what type of artefact I'm working with). In order to say if they've changed, I need to compare by all other attributes.
So, two kinds of comparison. One can be modelled with equals/hashcode, but not the other. Which should use equals()? (I think the one by name, as then added and deleted are just set subtraction, using one of the many collection libraries).
Would that be the normal choice? And if so, is there a conventional name for the other ("full compare") one? I'm considering identicalTo() (so two objects are changed if one.identicalTo(two) is false)
Your design is flawed - you have misused equals() by only comparing name.
If all attributes must be compared to know if the object has changed, then it is not true that objects with the same name are “equal”, because the use of the word “changed” implies there’s a difference, and if there’s a difference they’re not equal.
Finding something by using an identifier is different to two objects being equal if they have the same identifier.
Implement equals() and hashCode() using all attributes that matter for determining if an object is “different”.
To facilitate convenience and performance, populate a Map<String, ConfigArtefact<?>> for each environment using the name as the key.
Finding differences between 2 such maps is a fairly trivial O(n) task.
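A sketch of that map comparison, assuming the values implement equals() over all attributes as suggested above. ConfigDiff and between are names chosen for this example; the method is generic so it works for Map<String, ConfigArtefact<?>> as well as for the plain strings used when trying it out.

```java
import java.util.Map;
import java.util.Objects;
import java.util.Set;
import java.util.TreeSet;

final class ConfigDiff {
    // Sorted by name so that reports come out in a stable order.
    final Set<String> added = new TreeSet<>();
    final Set<String> removed = new TreeSet<>();
    final Set<String> changed = new TreeSet<>();

    // Compares two name -> artefact maps, one per deployment.
    static <V> ConfigDiff between(Map<String, V> before, Map<String, V> after) {
        ConfigDiff diff = new ConfigDiff();
        for (Map.Entry<String, V> entry : after.entrySet()) {
            String name = entry.getKey();
            if (!before.containsKey(name)) {
                diff.added.add(name);
            } else if (!Objects.equals(entry.getValue(), before.get(name))) {
                diff.changed.add(name); // same name, different attributes
            }
        }
        for (String name : before.keySet()) {
            if (!after.containsKey(name)) {
                diff.removed.add(name);
            }
        }
        return diff;
    }
}
```

Lookup is by name (the map key) and change detection is by equals(), so the two kinds of comparison from the question stay separate without overloading equals() with identifier-only semantics.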
I am trying to understand the relationship between classes in Object oriented world, and came across various terms like:
Association , Aggregation, Composition, Dependency, Generalization, Realization, Using (and may be there are more to the list, which I would encounter soon).
I came across the following UML diagram:
Here, we have two different Classes (and so objects), Car and Road, and the connector symbol connecting them (and I believe it is directed association symbol, as per MS Visio).
So this means that Car and Road classes are having some relationship (association). I have some doubts on this to understand this relation:
1) How would this relationship be translated to Java classes? I am having difficulty in understanding how Car and Road would have "some code" connecting them?
2) what does * and 0..1 mean in this diagram? Usually I have seen these in an Entity-Relationship diagrams (in DB).
Any pointer to understand this would be of great help.
A Car object would have a reference to a Road object (in other words, an instance variable of type Road). A Road object would also have a list (or List) of Car objects. The first sentence represents the 0..1 relationship between the classes; note that the reference could be null (car is on 0 roads) or not (car is on one road). The list in the Road object represents the * relationship -- 0 or more cars are on the road.
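The two-sided reading described here could be sketched as follows. This is one possible translation, not a standard one; the method names driveOn, addCar and removeCar are assumptions, and Car is made responsible for keeping both ends consistent.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class Road {
    private final List<Car> cars = new ArrayList<>(); // the "*" end

    List<Car> getCars() {
        return Collections.unmodifiableList(cars);
    }

    // Intended to be called by Car.driveOn only.
    void addCar(Car car) { cars.add(car); }
    void removeCar(Car car) { cars.remove(car); }
}

class Car {
    private Road road; // the "0..1" end: null means the car is on no road

    Road getRoad() { return road; }

    // Moves the car to another road (or to no road) and updates both sides.
    void driveOn(Road newRoad) {
        if (road != null) {
            road.removeCar(this);
        }
        road = newRoad;
        if (newRoad != null) {
            newRoad.addCar(this);
        }
    }
}
```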
1) is too broad to answer. UML and Java are both formal languages with well-defined structures, but there is no standardized way of expressing any particular UML concept in Java or vice versa. Thus, any answer would be opinion-based.
Furthermore, an association is a loosely-defined relationship. UML has many others more strictly defined (you've listed a few), and those are easier to translate to source code.
Because there are more strictly defined relationships, the correct reading of an association relationship is along the lines "these two things are related somehow, but not so tightly that the one contains the other, or that the one uses the other, or is dependent on the other." Those concepts all have their own connectors, and the modeller has made a conscious decision not to use them.
2) * means "any number" and 0..1 means "zero or one", which is usually read as "an optional". So the drivesOn relationship associates any number of Cars with an optional Road.
Presumably this should be taken to mean that a car may drive on a road, but never on more than one, and a road may have any number of cars driving on it.
In terms of understanding UML, this is a very poor example so don't try to read too much from it.
Your questions do have simple answers:
Your uni-directional many-to-one association drivesOn is expressed in (or translated to) Java in the form of a single-valued reference property in the following way:
class Car {
int passengers;
Road drivesOn;
}
The symbols * and 0..1 represent multiplicities: * means many (or unbounded) and 0..1 means at most one, so your model makes two multiplicity statements: (1) a Car drivesOn at most one Road, and (2) a Road has many Cars driving on it.
A pointer for reading more about the meaning of associations and multiplicities and how they are expressed in Java is my book chapter Reference Properties and Unidirectional Associations.
It means that many cars can be associated with either no road or at most one road each. In other words, many cars can drive on one road, and a car may also not be driving on any road at all.
I'm working on an application that allows the user to manage accounts. So, suppose I have an Account class, representing one of the user's accounts:
class Account
{
public int id;
public String accountName;
public String accountIdentifier;
public String server;
public String notes;
}
My equals method looks like this:
public boolean equals(Object o)
{
if (this == o)
return true;
if (o == null || !(o instanceof Account))
return false;
Account other = (Account) o;
if (!accountIdentifier.equals(other.accountIdentifier))
return false;
if (!server.equals(other.server))
return false;
return true;
}
As you can see, I'm only comparing the accountIdentifier and the server, but not the other fields. There are several reasons why I chose this approach.
I keep the accounts in a List. When the user updates an account, by changing the account name (which is just a name specified by the user to identify the account) or the notes, I can do accountList.set(accountList.indexOf(account), account); to update the account in the list. If equals compared all properties, this approach wouldn't work, and I'd have to work around it (for example by iterating over the list and checking for these properties manually).
This might actually be more important, but it only came to my mind after thinking about it for a while. An Account is uniquely identified by the accountIdentifier and the server it belongs to. The user might decide to rename the account, or change the notes, but it's still the same account. But if the server is changed, I think I would consider it a different account. The id is just an internal ID since the accounts are stored in a database. Even if that changed, the account is still considered the same account if the accountIdentifier and the server stayed the same.
What I'm trying to say is that I basically implemented equals this way to allow for shorter, more concise code in the rest of the application. But I'm not sure if I'm breaking some rules here, or if I'm doing something that might cause other developers headaches if it ever happens that someone is working with my application's API.
Is it okay to only compare some fields in the equals method, or should I compare all fields?
Yes, it's definitely okay to do this. You get to decide what equality means for your class, and you should use it in a way that makes the most sense for your application's logic — in particular, for collections and other such classes that make use of equality. It sounds like you have thought about that and decided that the (server, identifier) pair is what uniquely distinguishes instances.
This would mean, for instance, that two instances with the same (server, identifier) pair but a different accountName are different versions of the same Account, and that the difference might need to be resolved somehow; that's a perfectly reasonable semantic.
It may make sense to define a separate boolean allFieldsEqual(Account other) method to cover the "extended" definition, depending on whether you need it (or would find it useful for testing).
And, of course, you should override hashCode to make it consistent with whatever definition of equals you go with.
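Putting these points together, a sketch of the Account class with the (server, identifier) definition of equality, a consistent hashCode, and the separate "extended" comparison might look like this. The constructor is added only so the example is self-contained, and Objects.equals is used so that null fields do not cause a NullPointerException.

```java
import java.util.Objects;

class Account {
    public int id;
    public String accountName;
    public String accountIdentifier;
    public String server;
    public String notes;

    Account(int id, String accountName, String accountIdentifier, String server, String notes) {
        this.id = id;
        this.accountName = accountName;
        this.accountIdentifier = accountIdentifier;
        this.server = server;
        this.notes = notes;
    }

    // Identity of an account: the (server, identifier) pair only.
    @Override public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Account)) return false;
        Account other = (Account) o;
        return Objects.equals(accountIdentifier, other.accountIdentifier)
                && Objects.equals(server, other.server);
    }

    // Uses exactly the fields that equals() uses.
    @Override public int hashCode() {
        return Objects.hash(accountIdentifier, server);
    }

    // "Extended" comparison: same account and no field differs.
    public boolean allFieldsEqual(Account other) {
        return equals(other)
                && id == other.id
                && Objects.equals(accountName, other.accountName)
                && Objects.equals(notes, other.notes);
    }
}
```

With this definition, accountList.set(accountList.indexOf(account), account) from the question keeps working after a rename, while allFieldsEqual can still tell the two versions apart.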
You should compare all of the fields that are necessary to determine equality. If the accountIdentifier and server fields are enough to determine if two objects are equal, then that is perfectly fine. No need to include any of the other fields that don't matter in terms of equality.
For the key, you should normally use the business key. This key can be simple or composite and does not necessarily need to include all the fields in the entity, so which fields identify an entity depends on the case. If possible, it should be the minimum set of fields that fully and uniquely identifies the entity.
Some people prefer (and it is good practice) to create a surrogate key that identifies the object. This is very useful when you want to persist your objects using an ORM, because you don't need to export the business keys to the child entities in 1:M or M:N relations. For example, the id in your sample can be considered a surrogate key if you create it as an internal unique identifier.
You may also want to take into consideration:
Whenever you override equals you must override hashCode too; this is important for working properly with collections, maps etc.
Apache Commons Lang provides a really nice API to help with the implementation of equals and hashCode: the EqualsBuilder and HashCodeBuilder classes. Both let you chain the fields you want to use in your comparison and also offer a reflection-based variant.
The answer is "it depends on the semantics of your data".
For example, you might internally store a field that can be derived (calculated) from the other fields. In which case, you don't need to compare the calculated value.
As a gross generalisation, anything that cannot be derived from other fields should be included.
This is fine - and probably a good thing to do. If you've identified equality as the accountIdentifier and the server being distinct and unique, then that's perfectly valid for your use case.
You don't want to use more fields than you need to, since that would make accounts you consider the same compare as unequal. This approach is perfectly suitable for your needs.
I'm in a position where our company has a database search service that is highly configurable, for which it's very useful to configure queries in a programmatic fashion. The Criteria API is powerful but when one of our developers refactors one of the data objects, the criteria restrictions won't signal that they're broken until we run our unit tests, or worse, are live and on our production environment. Recently, we had a refactoring project essentially double in working time unexpectedly due to this problem, a gap in project planning that, had we known how long it would really take, we probably would have taken an alternative approach.
I'd like to use the Example API to solve this problem. The Java compiler can loudly indicate that our queries are borked if we are specifying 'where' conditions on real POJO properties. However, there's only so much functionality in the Example API and it's limiting in many ways. Take the following example
Product product = new Product();
product.setName("P%");
Example prdExample = Example.create(product);
prdExample.excludeProperty("price");
prdExample.enableLike();
prdExample.ignoreCase();
Here, the property "name" is being queried against (where name like 'P%'), and if I were to remove or rename the field "name", we would know instantly. But what about the property "price"? It's being excluded because the Product object has some default value for it, so we're passing the "price" property name to an exclusion filter. Now if "price" got removed, this query would be syntactically invalid and you wouldn't know until runtime. LAME.
Another problem - what if we added a second where clause:
product.setPromo("Discounts up to 10%");
Because of the call to enableLike(), this example will match on the promo text "Discounts up to 10%", but also "Discounts up to 10,000,000 dollars" or anything else that matches. In general, the Example object's query-wide modifications, such as enableLike() or ignoreCase() aren't always going to be applicable to every property being checked against.
Here's a third, and major, issue - what about other special criteria? There's no way to get every product with a price greater than $10 using the standard example framework. There's no way to order results by promo, descending. If the Product object joined on some Manufacturer, there's no way to add a criterion on the related Manufacturer object either. There's no way to safely specify the FetchMode on the criteria for the Manufacturer either (although this is a problem with the Criteria API in general - invalid fetched relationships fail silently, even more of a time bomb)
For all of the above examples, you would need to go back to the Criteria API and use string representations of properties to make the query - again, eliminating the biggest benefit of Example queries.
What alternatives exist to the Example API that can get the kind of compile-time advice we need?
My company gives developers days when we can experiment and work on pet projects (a la Google) and I spent some time working on a framework to use Example queries while getting around the limitations described above. I've come up with something that could be useful to other people interested in Example queries too. Here is a sample of the framework using the Product example.
Criteria criteriaQuery = session.createCriteria(Product.class);
Restrictions<Product> restrictions = Restrictions.create(Product.class);
Product example = restrictions.getQueryObject();
example.setName(restrictions.like("N%"));
example.setPromo("Discounts up to 10%");
restrictions.addRestrictions(criteriaQuery);
Here's an attempt to fix the issues in the code example from the question - the problem of the default value for the "price" field no longer exists, because this framework requires that criteria be explicitly set. The second problem of having a query-wide enableLike() is gone - the matcher is only on the "name" field.
The other problems mentioned in the question are also gone in this framework. Here are example implementations.
product.setPrice(restrictions.gt(10)); // price > 10
product.setPromo(restrictions.order(false)); // order by promo desc
Restrictions<Manufacturer> manufacturerRestrictions
= Restrictions.create(Manufacturer.class);
//configure manuf restrictions in the same manner...
product.setManufacturer(restrictions.join(manufacturerRestrictions));
/* there are also joinSet() and joinList() methods
for one-to-many relationships as well */
Even more sophisticated restrictions are available.
product.setPrice(restrictions.between(45,55));
product.setManufacturer(restrictions.fetch(FetchMode.JOIN));
product.setName(restrictions.or("Foo", "Bar"));
After showing the framework to a coworker, he mentioned that many data mapped objects have private setters, making this kind of criteria setting difficult as well (a different problem with the Example API!). So, I've accounted for that too. Instead of using setters, getters are also queryable.
restrictions.is(product.getName()).eq("Foo");
restrictions.is(product.getPrice()).gt(10);
restrictions.is(product.getPromo()).order(false);
I've also added some extra checking on the objects to ensure better type safety: for example, the relative criteria (gt, ge, le, lt) all require a value whose type extends Comparable for the parameter. Also, if you use a getter in the style specified above and there's a @Transient annotation present on the getter, it will throw a runtime error.
But wait, there's more!
If you like that Hibernate's built-in Restrictions utility can be statically imported, so that you can do things like criteria.add(eq("name", "foo")) without making your code really verbose, there's an option for that too.
Restrictions<Product> restrictions = new Restrictions<Product>(){
public void query(Product queryObject){
queryObject.setPrice(gt(10));
queryObject.setPromo(order(false));
//gt() and order() inherited from Restrictions
}
};
That's it for now - thank you very much in advance for any feedback! We've posted the code on Sourceforge for those that are interested. http://sourceforge.net/projects/hqbe2/
The API looks great!
Restrictions.order(boolean) smells like control coupling. It's a little unclear what the values of the boolean argument represent.
I suggest replacing or supplementing with orderAscending() and orderDescending().
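To show what that suggestion could look like, here is a small sketch. OrderBy is a hypothetical stand-in and not the framework's actual class; the assumption that true means ascending is inferred from the order(false) example being commented as descending.

```java
// Hypothetical stand-in for the ordering part of the Restrictions class.
final class OrderBy {
    enum Direction { ASCENDING, DESCENDING }

    final Direction direction;

    private OrderBy(Direction direction) {
        this.direction = direction;
    }

    // Intention-revealing alternatives to order(boolean).
    static OrderBy orderAscending() {
        return new OrderBy(Direction.ASCENDING);
    }

    static OrderBy orderDescending() {
        return new OrderBy(Direction.DESCENDING);
    }

    // Kept for existing callers; assumes true means ascending.
    static OrderBy order(boolean ascending) {
        return ascending ? orderAscending() : orderDescending();
    }
}
```

A call site such as orderDescending() then reads unambiguously, whereas order(false) requires looking up what the flag means.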
Have a look at Querydsl. Their JPA/Hibernate module requires code generation. Their Java collections module uses proxies but cannot be used with JPA/Hibernate at the moment.