Fastest way to sort a set - java

I'm replacing an application used at work, using Hibernate with an existing database. I can't modify the database since it's in use with other processes. When Hibernate pulls the main object from the db, the child objects are put in an unordered set. I've never really dealt with sets or sorting sets much before.
I need to display the last (chronologically) child for each set. There are no dates stored for the child objects, but since the id field in the db is AUTO_INCREMENT, I can sort them by id in lieu of a date.
One of the complaints about the existing system in use is that it's really, really slow. I'd like to show a definite increase of speed with the new application.
Given a Person object (variable name "off") with 0 to n "home addresses", I'm using:
Set addressSet = off.getAddresses();
List<Address> addressList = new ArrayList<>();
Iterator i = addressSet.iterator();
while (i.hasNext()) {
    addressList.add((Address) i.next());
}
Collections.sort(addressList, new AddressComparator());
Address a = null;
if (addressList.size() > 0) {
    a = addressList.get(addressList.size() - 1);
} else {
    a = new Address(); // creates new Address object with empty strings for fields
}
My simple comparator is:
public int compare(Address t, Address t1) {
    return t.getId().compareTo(t1.getId());
}
My question: Through either Java or Hibernate, is there a faster method to sort the sets?

From my point of view, you don't need to sort at all. Use
Collections.max()
or
Collections.min()
with your custom comparator to find the address you want. This runs in O(n) time, compared with O(n log n) for sorting, because it iterates over the set only once instead of sorting it. It also means you don't need to convert your Set to a List, since the max and min methods work with any Collection instance.
Another advantage (at least for me) is that Collections utilities are part of the java runtime, so you don't need to add any third-party libraries.
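A minimal sketch of this approach, assuming an Address with a Long id. The Address class, the BY_ID comparator and the latest helper below are stand-ins written for illustration, not the question's real entity or mapping:

```java
import java.util.Collection;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashSet;
import java.util.Set;

class LatestAddressDemo {

    // Stand-in for the question's entity; only the id matters here.
    static class Address {
        private final Long id;
        Address(Long id) { this.id = id; }
        Long getId() { return id; }
    }

    // Same ordering as the question's AddressComparator: ascending by id.
    static final Comparator<Address> BY_ID = Comparator.comparing(Address::getId);

    // Returns the address with the highest id in a single pass, or null if there is none.
    static Address latest(Collection<Address> addresses) {
        // Collections.max throws NoSuchElementException on an empty collection,
        // so the empty case is handled explicitly.
        return addresses.isEmpty() ? null : Collections.max(addresses, BY_ID);
    }

    public static void main(String[] args) {
        Set<Address> addresses = new HashSet<>();
        addresses.add(new Address(3L));
        addresses.add(new Address(17L));
        addresses.add(new Address(5L));
        System.out.println(latest(addresses).getId()); // prints 17
    }
}
```

Returning null for an empty set is only one choice; the question's code substitutes a blank Address instead, which would fit in the same place.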

I'm not sure if there are multiple sets, but from the code it seems like you are just getting the Address with the highest id. This can be achieved with the following SQL, which doesn't require sorting.
select * from table where id = (select max(id) from table);

You can do this without a temporary List.
TreeSet<Address> sortedSet = Sets.newTreeSet(new AddressComparator());
sortedSet.addAll(off.getAddresses());
return sortedSet.first(); // or sortedSet.last(), whichever suits you
Details on Sets.
UPD.
Please also see the solution with Guava Ordering. It lets you get the max element without any temporary collection at all.
Ordering<Address> ordering = Ordering.from(new AddressComparator());
return ordering.max(off.getAddresses());

You can sort at the database level in JPA/Hibernate by using the @OrderBy annotation when the sort is on a non-nested property. So in your case you can do this, e.g.
@OneToMany
@OrderBy("id")
public Set<Address> addresses;
and Hibernate will ensure the collection comes back in that order.
If the sort field happened to be on a nested property (which it isn't in your case), e.g. person.address.town.population, then you can still have Hibernate deal with the sort using the Hibernate-specific (non-JPA) @Sort annotation, which will ensure a sorted set as above but will sort in memory rather than with a DB ORDER BY clause.
@OneToMany
@Sort(/* natural, or specify a comparator */)
public Set<Address> addresses;
That does not get you the most recent address, of course. If you don't want to change the mapping from Set to List, which would allow you to get the latest based on index, then you could also do this in the database tier by various means, e.g. by creating a view based on the max address id for each person.
@Entity
@Table(name = "vw_most_recent_addresses")
public class MostRecentAddress extends Address {
}
public class Person {
    @OneToMany
    @OrderBy("id")
    public Set<Address> addresses;
    @OneToOne
    public MostRecentAddress mostRecentAddress;
}

Related

Hibernate Collection vs List as field type

I have two entities, Author and Book, connected with a one-to-many relationship. What's the difference between specifying the field type as Collection<Book> and List<Book>? The scenario is shown below:
@Entity
public class Author {
    @Id
    @GeneratedValue
    private Long id;
    private String name;
    @OneToMany(mappedBy = "author")
    private Collection<Book> books = new ArrayList<>(); // List<Book> instead?
}
The only difference I have noticed so far is that when I want to use the @OrderColumn annotation I need to use List, but are there any other differences I don't know about? Should I always use Collection if I don't need an order?
Set - contains no duplicates no order
(Bag)Collection - duplicates no order
List - duplicates order
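The duplicate and ordering semantics listed above can be seen with plain java.util collections. This is only an illustrative sketch using strings; the class and helper names are made up, and Hibernate's own persistent collections are not involved:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class CollectionSemanticsDemo {

    // A List keeps duplicates and the order in which elements were added.
    static List<String> asList(String... titles) {
        return new ArrayList<>(Arrays.asList(titles));
    }

    // A Set drops duplicates (as decided by equals/hashCode); HashSet promises no order.
    static Set<String> asSet(String... titles) {
        return new HashSet<>(Arrays.asList(titles));
    }

    public static void main(String[] args) {
        System.out.println(asList("Dune", "Emma", "Dune"));       // prints [Dune, Emma, Dune]
        System.out.println(asSet("Dune", "Emma", "Dune").size()); // prints 2
    }
}
```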
For Set you need to be careful about hashCode and equals. And there is one interesting twist with Bags in relation to the SQL generated:
If we are using List as a mapped attribute in Hibernate without
indexed column, Hibernate treats it as a Bag. Since Hibernate handles
List as a Bag (Unordered collection with non unique values. The best
feature of a bag is that you can get the number of occurrences of an
object through the API. With a list, there is no way to do the same
without iterating through the whole list.) as soon as we delete and
add an element in this collection, Hibernate issues SQL to delete all
the elements first from the join table, including those which are not
supposed to be deleted, and then it re-inserts all of them back from the Bag.
http://lkumarjain.blogspot.no/2013/07/why-hibernate-does-delete-all-entries.html
java.util.Collection is the most generic unordered collection of elements while the java.util.List implies existence of an iteration order.
Using @OrderColumn will give this iteration order; however, it might change the generated SQL query. Often it results in an ORDER BY clause being added to the SQL query. Without @OrderColumn the JPA provider has more flexibility, but you should always measure the performance in your actual database instead of tuning it blindly.

How to retrieve particular emp object from a list without looping through datastructure

I have 65000 employee records in a database. I am retrieving all the records and storing them as Employee objects in a list used as a cache. When a customer enters the emp id in the browser, the record should be fetched from the list on one condition: without looping through the list. How can we achieve this?
Using indexOf(Object) we can achieve it by implementing the equals method, but what business logic should go in that? Kindly let me know your views.
class Employee
{
    private int id;
    private String name;
    private String address;
    public void setAddress(String address) { this.address = address; }
    public void setId(int id) { this.id = id; }
    public void setName(String name) { this.name = name; }
    // similarly the get methods
}
1) I would implement a cache based on a HashMap rather than a list:
Map<Integer, Employee> cache = new HashMap<>();
This way you can retrieve an Employee object by a given ID very efficiently.
Additionally, I wouldn't add a setter for the employee id, since it can corrupt the mapping. Consider setting the id through a constructor parameter only.
--EDIT--
If you MUST use a list:
2) You may want to sort it first. This will allow performing a binary search (See Collections.binarySearch(..) methods). This requires implementing a Comparator or the Comparable interface, in order to define an ordering between the Employee objects. Also, you will have to create a dummy Employee object with the required id each time you want to perform the search.
3) If performance is not an issue, simply use List.indexOf(..). This requires implementing the equals(..) method in the Employee class.
4) In order to do it really without loops, you can create a sparse list, containing Employee with id N at index N. This is only feasible if the Employee id value range is not too big. The benefit is an optimal retrieval time.
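Options 1 and 2 above can be sketched as follows. The Employee class here is a hypothetical, cut-down version with the id set only through the constructor, and buildCache, findInSortedList and BY_ID are names invented for this example:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class EmployeeLookupDemo {

    // Hypothetical Employee: the id is set once in the constructor (no setter).
    static class Employee {
        private final int id;
        private final String name;
        Employee(int id, String name) { this.id = id; this.name = name; }
        int getId() { return id; }
        String getName() { return name; }
    }

    static final Comparator<Employee> BY_ID = Comparator.comparingInt(Employee::getId);

    // Option 1: index the employees by id; lookups are then O(1) on average.
    static Map<Integer, Employee> buildCache(List<Employee> employees) {
        Map<Integer, Employee> cache = new HashMap<>();
        for (Employee e : employees) {
            cache.put(e.getId(), e);
        }
        return cache;
    }

    // Option 2: binary search; the list must already be sorted by BY_ID.
    static Employee findInSortedList(List<Employee> sortedById, int id) {
        Employee probe = new Employee(id, null); // dummy object carrying only the id
        int index = Collections.binarySearch(sortedById, probe, BY_ID);
        return index >= 0 ? sortedById.get(index) : null;
    }

    public static void main(String[] args) {
        List<Employee> employees = new ArrayList<>();
        employees.add(new Employee(42, "Ann"));
        employees.add(new Employee(7, "Bob"));

        Map<Integer, Employee> cache = buildCache(employees);
        System.out.println(cache.get(42).getName()); // prints Ann

        employees.sort(BY_ID); // sort once, then search many times
        System.out.println(findInSortedList(employees, 7).getName()); // prints Bob
    }
}
```

The map lookup needs no equals method on Employee at all, because the key is the Integer id rather than the Employee object.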

Using Hibernate with annotations, I want a one-to-many relationship to be sorted

Using Hibernate with annotations, I want a one-to-many relationship to be sorted by the 'created' field on the 'many' table.
So far I've got this, which always ends up in a random order:
// The notes
@OneToMany
@JoinColumn(name="task_id")
Set<TaskNote> notes;
public Set<TaskNote> getNotes() {return notes;}
public void setNotes(Set<TaskNote> notes) {this.notes = notes;}
Since neither answer gave you the full solution:
@OneToMany
@JoinColumn(name="task_id")
@OrderBy("created")
List<TaskNote> notes;
public List<TaskNote> getNotes() {return notes;}
public void setNotes(List<TaskNote> notes) {this.notes = notes;}
Set is unordered, so use List instead, and you need the @OrderBy annotation too.
Use a List instead of a Set. A List preserves order, a Set doesn't. Using a List, the order of the elements will match whatever you specify for your ORDER BY in HQL or using Criteria.
You have two options.
You can use @OrderBy("created"), which will do what you would expect in SQL.
You can also use @Sort, which allows you to specify an arbitrary comparator implementation, if you want to sort in memory for some reason.
Edit: unpredictable iteration order is an implementation detail of HashSet, it's not part of the contract of the Set interface. Hibernate will happily use LinkedHashSet when you use XML mapping and specify ordering on a set. I assumed it does the same when you use the annotation, apologies if that was incorrect.
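The difference can be sketched with plain Java collections. This does not use Hibernate; it only assumes that rows arrive already sorted (for example from an ORDER BY query) and shows that a LinkedHashSet keeps that arrival order. The class and method names are illustrative:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class SetOrderDemo {

    // Copies rows (assumed to arrive already sorted) into a Set that
    // iterates in insertion order; duplicates are dropped.
    static Set<String> keepOrder(List<String> rowsInQueryOrder) {
        return new LinkedHashSet<>(rowsInQueryOrder);
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("first", "second", "third");
        System.out.println(new ArrayList<>(keepOrder(rows))); // prints [first, second, third]
    }
}
```

A HashSet built from the same rows would contain the same elements but gives no guarantee about the order in which they come back.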

In Hibernate, why Set is the recommended way to represent many-valued associations

Taken from here: http://docs.jboss.org/hibernate/stable/core/reference/en/html/persistent-classes.html#persistent-classes-equalshashcode
I tend to use List since Criteria returns List, so it makes my code cleaner since I don't have to do conversion.
I do something like this:
@OneToMany(cascade= {CascadeType.PERSIST, CascadeType.REMOVE}, mappedBy="parent")
@Column(name="PARENT_ID")
public List<Menu> getChildMenus() {
    return childMenus;
}
If I had used Set there, somewhere in my DAO I would have to convert the results returned by Criteria to a Set first.
I wonder what the repercussions could be of using List the way I am doing.
A Set is used because the child table has a primary key, so a child can appear only once in a parent. If you use a List, it can contain duplicate children, and those cannot be saved to the database.

Order multiple one-to-many relations

I have a search screen, using JSF, JBoss Seam, and Hibernate underneath. There are columns for A, B, and C, where the relations are as follows:
A (1 <--> *) B (1 <--> *) C
Let's say A has a List<B> and B has a List<C> (both relations are one-to-many).
The UI table supports ordering by any column (ASC or DESC), so I want the results of the query to be ordered. This is the reason I used Lists in the model.
However, I got an exception that Hibernate cannot eagerly fetch multiple bags (it considers both lists to be bags). There is an interesting blog post here, and they identify the following solutions:
Use the @IndexColumn annotation (there is no such column in my DB, and what's more, I want the position of results to be determined by the ordering, not by an index column)
Fetch lazily (for performance reasons, I need eager fetching)
Change List to Set
I changed the List to Set, which by the way is more correct, model-wise.
First, if I don't use @OrderBy, the PersistentSet returned by Hibernate wraps a HashSet, which has no ordering. So, when I iterate over it in the UI, the order is random, whatever ordering the database did.
Second, if I do use @OrderBy, the PersistentSet wraps a LinkedHashSet, which has ordering, and is what I would like. However, the OrderBy property is hardcoded, and takes precedence over whatever ordering I set using either Collections or HQL. As such, all other ordering I request through the UI comes after it.
I tried again with Sets, and used SortedSet (and its implementation, TreeSet), but I have some issues:
I want ordering to take place in the DB, and not in-memory, which is what TreeSet does (either through a Comparator, or through the Comparable interface of the elements).
I found that there is the Hibernate annotation @Sort, which has a SortType.UNSORTED option, and you can also set a Comparator. I still haven't managed to make it compile, and I am still not convinced it is what I need.
One of the requirements is for the sorting to take place in the DB.
Created a simple Maven project and committed it as a Google Code project. This is my personal playground for the problem.
What's the point of ordering in the DB when the same result set can be reordered by any column? If you need to hit the DB every time when a different column is clicked on the UI, you just create a performance issue for yourself. This is exactly the case when it makes sense to order the set in memory.
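Reordering in memory by whichever column was clicked could look roughly like this. The Row class, the column names and the helper methods are hypothetical and stand in for whatever the search screen actually displays:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class ColumnSortDemo {

    // Hypothetical row shown in the UI table.
    static class Row {
        private final String name;
        private final int amount;
        Row(String name, int amount) { this.name = name; this.amount = amount; }
        String getName() { return name; }
        int getAmount() { return amount; }
    }

    // Picks a comparator for the clicked column; the column names are illustrative.
    static Comparator<Row> comparatorFor(String column, boolean ascending) {
        Comparator<Row> c;
        switch (column) {
            case "name":
                c = Comparator.comparing(Row::getName);
                break;
            case "amount":
                c = Comparator.comparingInt(Row::getAmount);
                break;
            default:
                throw new IllegalArgumentException("Unknown column: " + column);
        }
        return ascending ? c : c.reversed();
    }

    // Sorts a copy, so the list loaded from the database is left untouched.
    static List<Row> sorted(List<Row> rows, String column, boolean ascending) {
        List<Row> copy = new ArrayList<>(rows);
        copy.sort(comparatorFor(column, ascending));
        return copy;
    }

    public static void main(String[] args) {
        List<Row> rows = new ArrayList<>();
        rows.add(new Row("pear", 10));
        rows.add(new Row("apple", 3));
        rows.add(new Row("fig", 7));
        System.out.println(sorted(rows, "name", true).get(0).getName());    // prints apple
        System.out.println(sorted(rows, "amount", false).get(0).getName()); // prints pear
    }
}
```

With this shape the query runs once, and each click on a column header only re-sorts the rows already in memory.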
About bags and lists, this is what the Hibernate book has to say:
Bags may not be sorted (there is no TreeBag, unfortunately), nor may lists; the
order of list elements is defined by the list index.
Based on what Hibernate in Action said and the workaround provided by your own answer, you could sort your collection at runtime to avoid your exception:
@Entity
public class Aa {
    private List<Bb> bbList = new ArrayList<Bb>();
    @OneToMany
    public List<Bb> getBbList() {
        return bbList;
    }
    @Transient
    public List<Bb> getBbListSortedBySomeProperty() {
        Collections.sort(bbList, new Comparator<Bb>() {
            public int compare(Bb o1, Bb o2) {
                return o1.getSomeProperty().compareTo(o2.getSomeProperty());
            }
        });
        return bbList;
    }
}
Be aware someProperty must implement Comparable
...
@Entity
public class Bb {
    private List<Cc> ccList = new ArrayList<Cc>();
    @OneToMany
    public List<Cc> getCcList() {
        return ccList;
    }
}
