I have been working with the following class named City
@ToString
@AllArgsConstructor
public class City {
Integer id;
String name;
}
and tried to convert it to a record called CityRecord as
record CityRecord(Integer id, String name) {} // much cleaner!
After moving to this representation, one of our unit tests started failing. The test works with a list of cities read from a JSON file and mapped to objects, then counts the cities while grouping them into a Map. Simplified to something like:
List<City> cities = List.of(
new City(1, "one"),
new City(2, "two"),
new City(3, "three"),
new City(2, "two"));
Map<City, Long> cityListMap = cities.stream()
.collect(Collectors.groupingBy(Function.identity(),
Collectors.counting()));
With the City class, the test asserted that the resulting Map contains 4 keys, each with a count of 1. With the record representation, there are only 3 keys in the resulting Map. What is causing this, and what is the right way to work around it?
Cause
The reason for the observed behavior is documented in java.lang.Record:
For all record classes, the following invariant must hold: if a record R's components are c1, c2, ... cn, then if a record instance is copied as follows:
R copy = new R(r.c1(), r.c2(), ..., r.cn());
then it must be the case that r.equals(copy).
In short, your CityRecord class now has an equals (and hashCode) implementation that compares the two components and treats two records as equal when all of their components are equal. As a result, two record objects with the same attributes are grouped under the same key.
It is therefore correct to assert that there are three keys, with the one having id=2, name="two" counted twice.
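To see the effect in isolation, here is a minimal sketch that runs the grouping from the question against the record. The wrapper class RecordGroupingDemo and the helper countCities are names made up for this illustration, and it assumes a JDK where records are a final feature (Java 16 or later).

```java
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical demo class; only CityRecord comes from the question.
class RecordGroupingDemo {
    record CityRecord(Integer id, String name) {}

    // Groups equal records together and counts them, relying on the
    // equals/hashCode that the compiler generates for the record.
    static Map<CityRecord, Long> countCities(List<CityRecord> cities) {
        return cities.stream()
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    }

    public static void main(String[] args) {
        List<CityRecord> cities = List.of(
                new CityRecord(1, "one"),
                new CityRecord(2, "two"),
                new CityRecord(3, "three"),
                new CityRecord(2, "two"));
        Map<CityRecord, Long> counts = countCities(cities);
        System.out.println(counts.size());                        // 3
        System.out.println(counts.get(new CityRecord(2, "two"))); // 2
    }
}
```

The lookup with a freshly constructed CityRecord(2, "two") succeeds for the same reason the two list entries collapse into one key: equality is decided by the component values, not by object identity.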
Immediate Remedy
An immediate, temporary solution would be to add a custom (but flawed, for reasons explained later) equals implementation to your record as well. This would look like:
record CityRecord(Integer id, String name) {
// WARNING, BROKEN CODE
// Does not adhere to contract of `Record::equals`
@Override
public boolean equals(Object o) {
return this == o;
}
@Override
public int hashCode() {
return System.identityHashCode(this);
}
}
Now that the comparison is based on object identity, as it was with the existing City class, your tests would pass again. But note the caution below before using any such remedy.
Caution
As JEP 359 reads, records are more like "data carriers", and when choosing to migrate your existing classes you must be aware of the standard members a record acquires automatically.
Before migrating, one must understand the complete details of the current implementation. In the example you quoted, where you group by City, there should be no reason for two cities with the same id and name to be listed separately. They should be equal: it is the same data repeated twice, and hence the counts produced with the record are the correct ones.
In that case, if your existing class represents a data model, it could be rectified to match the record by overriding equals (and hashCode) to compare the individual attributes. This is exactly what the immediate remedy stated above contradicts, which is why it should be avoided.
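As a rough sketch of that rectification, the class below gives City value-based equals and hashCode. It is written by hand, without Lombok, purely so the example is self-contained. The private final fields and the main method are additions for illustration and are not part of the original class.

```java
import java.util.Objects;

// Sketch of the existing class with equality based on its attributes,
// mirroring what the record generates automatically.
class City {
    private final Integer id;
    private final String name;

    City(Integer id, String name) {
        this.id = id;
        this.name = name;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof City)) return false;
        City other = (City) o;
        return Objects.equals(id, other.id) && Objects.equals(name, other.name);
    }

    @Override
    public int hashCode() {
        return Objects.hash(id, name);
    }

    @Override
    public String toString() {
        return "City(id=" + id + ", name=" + name + ")";
    }

    public static void main(String[] args) {
        City a = new City(2, "two");
        City b = new City(2, "two");
        System.out.println(a.equals(b));                  // true
        System.out.println(a.hashCode() == b.hashCode()); // true
    }
}
```

With this in place, the grouping in the test yields three keys for the class as well, so the class and the record agree and the test expectation is the thing to update.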
Related
I have a list of this sort:
List<Employee> emp = Arrays.asList(new Employee("Jack", 29), new Employee("Tom", 24));
class Employee {
private String name;
private Integer id;
}
I want to turn it into an EmployeeFullName list as follows:
List<EmployeeFullName> empFullName = Arrays.asList(new EmployeeFullName("Jack Tom", 29));
class EmployeeFullName {
private String fullName;
private Integer id;
}
How can I combine the name fields of the Employee objects into fullName in an EmployeeFullName list? I want to use Java 8 for the solution.
Notwithstanding all the reasonable questions previous commenters have posted, it feels to me like your main problem boils down to "How do I get pairs of objects out of a stream".
Once you have paired up the objects into a new collection (or stream) of pairs, you can do whatever you want to with them (i.e. make a new object out of them).
Collect successive pairs from a stream
You would still have to decide how to "merge" the pairs. In your case, it looks like you're taking the "name" and joining them together for each Pair to produce a fullName. And you're using the left-hand-side ID. That still leaves one to wonder what happened to the right-hand-side ID, but maybe with your real data-set, it's functionally duplicated..? Even so, it might be worth doing a programmatic Assert to make sure Pairs you're streaming out are consistent in that way. Otherwise one missing element in your stream and you'll be tying together all sorts of random users...
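One possible way to do the pairing in plain Java 8 is sketched below, assuming the list has random access (as Arrays.asList does) and that elements 0 and 1, 2 and 3, and so on belong together. The class PairMergeDemo, the method mergePairs and the constructors are invented for this illustration. The merge rule (join the names with a space, keep the left-hand id) is only what the question's sample data suggests.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical demo class; field names follow the question.
class PairMergeDemo {
    static class Employee {
        final String name;
        final Integer id;
        Employee(String name, Integer id) { this.name = name; this.id = id; }
    }

    static class EmployeeFullName {
        final String fullName;
        final Integer id;
        EmployeeFullName(String fullName, Integer id) { this.fullName = fullName; this.id = id; }
    }

    // Walks the list two elements at a time and merges each successive pair.
    static List<EmployeeFullName> mergePairs(List<Employee> emp) {
        if (emp.size() % 2 != 0) {
            // A missing element would silently shift every later pair.
            throw new IllegalArgumentException("Expected an even number of employees");
        }
        return IntStream.range(0, emp.size() / 2)
                .mapToObj(i -> {
                    Employee left = emp.get(2 * i);
                    Employee right = emp.get(2 * i + 1);
                    return new EmployeeFullName(left.name + " " + right.name, left.id);
                })
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Employee> emp = Arrays.asList(new Employee("Jack", 29), new Employee("Tom", 24));
        for (EmployeeFullName e : mergePairs(emp)) {
            System.out.println(e.fullName + " " + e.id); // Jack Tom 29
        }
    }
}
```

The even-size check is the only consistency check here. If the two ids in a pair are supposed to match in the real data, that is the place to assert it.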
In Effective Java in Item 8 the recommendation is that
For each "significant" field in the class, check if that field of the argument matches the corresponding field of this object.
I understand that we can have secondary fields that are calculated from primary fields, but what exactly is the meaning of "for each significant field"? Is the equals contract implemented properly only when all fields of an object are compared?
If I have e.g. a class Employee which has a multitude of fields like id, first and last name, dob, position, location etc all these seem significant but to me it seems that just using the id would suffice for a proper and performant equals implementation.
Am I wrong on this? Or is the id I mention exactly what Bloch means by "significant" fields?
class Employee {
private UUID id;
private String firstName;
private String lastName;
@Override
public boolean equals(Object obj) {
if (this == obj)
return true;
if (!(obj instanceof Employee))
return false;
return id.equals(((Employee)obj).id);
}
@Override
public int hashCode() {
return Objects.hash(id);
}
}
If Employee is stored in a DB, i.e. has a unique id, then there is no need to check other fields like firstName and lastName in equals; for comparing data objects, only the id field is significant.
A significant field is merely one that, if omitted, would result in an incorrect implementation of equals (according to the notion of equality you have defined for instances of your class).
I appreciate that is a bit of a self-referential definition, but that's what it means.
The canonical example of a non-significant field is String.hashCode: as you observe, this is calculated from other fields (and lazily), so it would not be appropriate to include in the equality because there is no guarantee that it has been calculated for either of the strings being compared; and, if it has been calculated for both, it tells you nothing more than you already know.
In your case, yes, it sounds like comparing instances using only the id would suffice: this is a significant field, the name (etc) is not significant: there should only be one person (little p, as in an actual real human) with a particular id.
It does raise a question of how you would deal with "same id, different name" instances, but this is getting into the realm of Falsehoods Programmers Believe About Names:
People have exactly one canonical full name.
People have exactly one full name which they go by.
People have, at this point in time, exactly one canonical full name.
People have, at this point in time, one full name which they go by.
People have exactly N names, for any value of N.
(People’s names fit within a certain defined amount of space.)
People’s names do not change.
...
Taking these into consideration, if you want to say "this Person is the same as that Person" (and you want to do that using equals), id seems like the only reasonable thing to use.
How do I make my code add identical objects to a Set? I guess I will have to do something with the hashCode() or equals() methods.
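class Order {
private Long id; // id type not shown in the question; Long assumed
private Set<Discount> discounts; // field name assumed
}
class Discount {
private Long id; // id type not shown in the question; Long assumed
private Long amount;
}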
Now if I try to save two discounts of $1 each, the Set only shows one discount. When Hibernate saves them, the discounts will have different IDs, but they are the same as of now. I don't want to change the definition of the Order class, as it's a big project and the changes would be endless.
According to the JavaDoc for the Set interface, a set is not allowed to contain duplicate elements (as defined by equals and hashCode). While this will work fine once Hibernate saves the discounts (since you said the ids will be different), the ids are the same right now, so what you are trying to accomplish is not possible without doing things that the people who end up maintaining your code will hate you for.
Since you do not desire to change the Order class, your best recourse is to retroactively change the ids on your discounts to be unique.
You cannot add identical objects to a set, because that is the point of a set. A set contains unique elements. You would be better off using a list or a map.
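A small sketch of the difference follows, assuming Discount implements equals and hashCode over id and amount. The question does not show those methods, so that part is a guess, and DiscountSetDemo is a name invented here.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Objects;
import java.util.Set;

// Hypothetical demo: two unsaved $1 discounts in a Set versus a List.
class DiscountSetDemo {
    static final class Discount {
        final Long id;     // null until the entity is saved (assumption)
        final Long amount;

        Discount(Long id, Long amount) { this.id = id; this.amount = amount; }

        // Assumed equality: same id and same amount.
        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Discount)) return false;
            Discount other = (Discount) o;
            return Objects.equals(id, other.id) && Objects.equals(amount, other.amount);
        }

        @Override
        public int hashCode() {
            return Objects.hash(id, amount);
        }
    }

    public static void main(String[] args) {
        Discount first = new Discount(null, 1L);
        Discount second = new Discount(null, 1L);

        Set<Discount> asSet = new HashSet<>();
        asSet.add(first);
        asSet.add(second); // rejected: equal to an element already present
        System.out.println(asSet.size());  // 1

        List<Discount> asList = new ArrayList<>();
        asList.add(first);
        asList.add(second);
        System.out.println(asList.size()); // 2
    }
}
```

Giving the two discounts distinct ids before adding them makes the Set keep both, which matches the suggestion above to make the ids unique up front.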
I'm replacing an application used at work, using Hibernate with an existing database. I can't modify the database since it's in use with other processes. When Hibernate pulls the main object from the db, the child objects are put in an unordered set. I've never really dealt with sets or sorting sets much before.
I need to display the last (chronologically) child for each set. There are no dates stored for the child objects, but since the id field in the db is AUTO_INCREMENT, I can sort them by id in lieu of a date.
One of the complaints about the existing system in use is that it's really, really slow. I'd like to show a definite increase of speed with the new application.
Given a Person object (variable name "off") with 0 to n "home addresses", I'm using:
Set addressSet = off.getAddresses();
List<Address> addressList = new ArrayList<>();
Iterator i = addressSet.iterator();
while(i.hasNext()){
addressList.add((Address) i.next());
}
Collections.sort(addressList, new AddressComparator());
Address a = null;
if(addressList.size()>0){
a = addressList.get(addressList.size()-1);
}else{
a = new Address(); //creates new Address object with empty strings
//for fields
}
My simple comparator is:
public int compare(Address t, Address t1) {
return t.getId().compareTo(t1.getId());
}
My question: Through either Java or Hibernate, is there a faster method to sort the sets?
From my point of view, you don't need to sort at all. Use
Collections.max()
or
Collections.min()
with your custom comparator provided to find the address you want. This runs in O(n) time in the worst case, compared to O(n log n) for sorting, since you do not sort and only iterate your set once. Another benefit is that you don't need to convert your Set to a List, as the max and min methods work with any Collection instance.
Another advantage (at least for me) is that Collections utilities are part of the java runtime, so you don't need to add any third-party libraries.
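A minimal sketch of that approach, using a stripped-down Address that only has an id. The names LatestAddressDemo and latest are made up for this example, and Comparator.comparing stands in for the AddressComparator from the question. Collections.max throws NoSuchElementException on an empty collection, so the empty case is handled separately.

```java
import java.util.Collection;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashSet;
import java.util.Optional;
import java.util.Set;

// Hypothetical demo class; Address is reduced to its id for illustration.
class LatestAddressDemo {
    static class Address {
        private final Integer id;
        Address(Integer id) { this.id = id; }
        Integer getId() { return id; }
    }

    // Single pass over the collection; no copy into a List and no sort.
    static Optional<Address> latest(Collection<Address> addresses) {
        if (addresses.isEmpty()) {
            return Optional.empty();
        }
        return Optional.of(Collections.max(addresses, Comparator.comparing(Address::getId)));
    }

    public static void main(String[] args) {
        Set<Address> addresses = new HashSet<>();
        addresses.add(new Address(3));
        addresses.add(new Address(7));
        addresses.add(new Address(5));
        System.out.println(latest(addresses).get().getId()); // 7
    }
}
```

The caller can replace the empty Optional with a blank Address, as the original loop does, via orElseGet.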
I'm not sure if there are multiple sets, but from the code it seems like you are just getting the Address with the highest id. This can be achieved with the following sql, which wouldn't require sorting.
select * from table where id = (select max(id) from table);
You can do this without a temporary List.
TreeSet<Address> sortedSet = Sets.newTreeSet(new AddressComparator());
sortedSet.addAll(off.getAddresses());
return sortedSet.first(); // or sortedSet.last() see what is suitable for you
Details on Sets.
UPD.
Please also see the solution with Guava Ordering. It allows you to get the max element without any temporary collection at all.
Ordering<Address> ordering = Ordering.from(new AddressComparator());
return ordering.max(off.getAddresses());
You can sort at the database level in JPA/Hibernate by using the @OrderBy annotation where the sort is on a non-nested property. So in your case you can do this.
e.g.
@OneToMany
@OrderBy("id")
public Set<Address> addresses;
and Hibernate will ensure the collection is in a sorted set.
If the sort field happened to be on a nested property (which it isn't in your case), e.g. person.address.town.population, then you can still have Hibernate deal with the sort using the Hibernate-specific (non-JPA) @Sort annotation, which will ensure a sorted set as above but will sort in memory rather than with a DB ORDER BY clause.
@OneToMany
@Sort(type = SortType.NATURAL) // or SortType.COMPARATOR with a comparator class
public Set<Address> addresses;
That does not get you the most recent address of course. If you don't want to change the mapping from Set to List which would allow you get the latest based on index, then you could also do this in the Database tier by various means e.g. by creating a view based on max address id for each person.
@Entity
@Table(name = "vw_most_recent_addresses")
public class MostRecentAddress extends Address {
}
public class Person {
@OneToMany
@OrderBy("id")
public Set<Address> addresses;
@OneToOne
public MostRecentAddress mostRecentAddress;
}
I need to make a list of people and their time of arrival at a party, and whenever they leave I need to take them off this list (the party maximum is 150).
A Set would guarantee that I never add the same person twice.
A List would give me the flexibility to start with only a few slots (in case no one shows up).
Arrays (not sure what they provide), but I have used them more often.
My idea was to create 2 arrays, one with names and one with times. When someone comes in, I save the name in one and the time in the other. When he/she leaves, I search for his/her name, delete it, and use the same index to delete the time in the other array.
A list could hold arrays of 2 elements each, so I would only need to add in one place, but searching would be a tiny bit more complicated.
Or maybe I am complicating this too much?
Map implementation:
public final class Person
{
... remainder left to the student ...
}
Map<Person, Date> currentPartyAttendees; // date is arrival time.
Set implementation:
public final class PartyAttendee
{
... person details ...
Date arrive;
public int hashCode()
{
... use Apache HashCodeBuilder ...
}
public boolean equals(Object other)
{
... implementation left to student. Use Apache EqualsBuilder ...
}
}
Set<PartyAttendee> currentPartyAttendees;
Using a HashMap would suit your purpose, as you can use the person's name as the key to add and retrieve the entry for that person. It offers expected constant-time put and get, so performance should stay consistent regardless of how large the map grows.
The way you've described your use-case, why not consider the HashMap, or some other Map based implementation?
Unless of course, there's a binding for you to use a List [or similar] based data structure.
Just use a List<> and a data structure that represents a guest.
Subclass List to record the arrival and departure times in the add/remove methods. You can also use a Set, but then you'll have to write hashCode and equals methods. I'm not sure you want to do that, because people may have the same names (unless you have other data like SSN, birthday, middle name, etc.).
public class Guest {
private String firstName, lastName;
private long arrivalTime, departureTime;
....
}
public class MyGuests extends ArrayList<Guest> {
@Override
public boolean add(Guest g) {
// record arrival time here
return super.add(g);
}
@Override
public boolean remove(Object o) {
// record departure time here
return super.remove(o);
}
}
I think you can use arrays as well: instead of two arrays, use one array of a 'Person' model that holds the name of the person, the arrival time, and the leave time. Before you insert into the array, you can check whether it already contains this person.
ps: don't forget to override equals() and hashCode() in your model
LinkedHashMap - a container of key-value pairs that maintains the order of their insertion. The key would be the person (a simple String or a designated class), the value would be the time of arrival, e.g. a Date.
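A rough sketch of the map-based variant, assuming names are unique enough to serve as keys (if they are not, a dedicated person class with proper equals and hashCode would be the key instead). The class PartyDemo and its method names are invented for this example, and java.time.Instant is used in place of Date.

```java
import java.time.Instant;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical guest register keyed by name, kept in arrival order.
class PartyDemo {
    static final int MAX_GUESTS = 150; // party maximum from the question

    private final Map<String, Instant> arrivals = new LinkedHashMap<>();

    // Returns false if the party is full or the person is already inside.
    boolean arrive(String name, Instant when) {
        if (arrivals.size() >= MAX_GUESTS || arrivals.containsKey(name)) {
            return false;
        }
        arrivals.put(name, when);
        return true;
    }

    // Removes the guest and returns the arrival time, or null if not present.
    Instant leave(String name) {
        return arrivals.remove(name);
    }

    int size() {
        return arrivals.size();
    }

    public static void main(String[] args) {
        PartyDemo party = new PartyDemo();
        party.arrive("Ann", Instant.now());
        party.arrive("Bob", Instant.now());
        party.leave("Ann");
        System.out.println(party.size()); // 1
    }
}
```

One map holds both the name and the time, so there are no parallel arrays to keep in sync, and removing a guest is a single call.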