OpenJPA Eager Fetching

I use OpenJPA 2.3 bundled with WebSphere 8.5, and I have to read a lot of data from a table. I also have to fetch a lot of relations along with the root entity.
At the moment I am using the Criteria API to create the search query and select the entities. I annotated all collections with EAGER. According to the log file, OpenJPA issues 5 queries to fetch all the children, which is what I want.
The catch is that I have to filter a lot in Java after the select and stop after 1000 matching entities. So I thought I would specify the fetch size and stop reading entities from the DB as soon as I have my 1000 results.
If I introduce the FetchBatchSize setting, OpenJPA creates a separate query for each entity to load its children (the n+1 problem).
I also tried to use the fetch join syntax directly in my query, but without any success. So what am I doing wrong?
I tried:
1)
query.setHint("openjpa.FetchPlan.FetchBatchSize", 1000);
query.setHint("openjpa.FetchPlan.ResultSetType", "SCROLL_INSENSITIVE");
2)
OpenJPAQuery<?> kq = OpenJPAPersistence.cast(query);
JDBCFetchPlan fetch = (JDBCFetchPlan) kq.getFetchPlan();
fetch.setFetchBatchSize(1000);
fetch.setResultSetType(ResultSetType.FORWARD_ONLY);
fetch.setFetchDirection(FetchDirection.FORWARD);
fetch.setLRSSizeAlgorithm(LRSSizeAlgorithm.UNKNOWN);
The entity:
@Entity
@Table(name = "CONTRACT")
public class Contract {
    // omitted the other properties. The other relationships are annotated the same way
    @OneToMany(fetch = FetchType.EAGER, cascade = CascadeType.ALL, mappedBy = "contract")
    private List<Vehicle> vehicles = new ArrayList<Vehicle>();
The query:
CriteriaBuilder cb = em.getCriteriaBuilder();
CriteriaQuery<Contract> crit = cb.createQuery(Contract.class);
crit.distinct(true);
Root<Contract> r = crit.from(Contract.class);
// omitted the where clause. In the worst case I have a full table scan without any where clause (the reason I need the batch size).
Fetch<Contract, Vehicle> fetchVehicles = r.fetch("vehicles", JoinType.LEFT); // I tried to work with a fetch join as well
TypedQuery<Contract> query = em.createQuery(crit);
// query.setHint("openjpa.FetchPlan.FetchBatchSize", FETCH_SIZE);
// query.setHint("openjpa.FetchPlan.ResultSetType", "SCROLL_INSENSITIVE");
OpenJPAQuery<?> kq = OpenJPAPersistence.cast(query);
JDBCFetchPlan fetch = (JDBCFetchPlan) kq.getFetchPlan();
fetch.setFetchBatchSize(FETCH_SIZE);
fetch.setResultSetType(ResultSetType.FORWARD_ONLY);
fetch.setFetchDirection(FetchDirection.FORWARD);
fetch.setLRSSizeAlgorithm(LRSSizeAlgorithm.UNKNOWN);
fetch.setEagerFetchMode(FetchMode.PARALLEL);
List<Contract> queryResult = query.getResultList();
// here begins the filtering, and I stop as soon as I have 1000 results
Thanks for the help!

Have a look at how to deal with large result sets and you will see that EAGER is the opposite of what you should do.
As I stated in the comments, EAGER means that JPA loads all results at once, so it is not recommended for large result sets. Setting fetchBatchSize causes JPA to lazy-load every x (in your case 1000) results. So it would be practically the same as using @OneToMany(fetch = FetchType.LAZY, ...) (also worth a try).
Setting the fetch batch size to a much lower number (e.g. 50) will also reduce the number of objects that are kept in memory.
Also try
query.setHint("openjpa.FetchPlan.ResultSetType", "SCROLL_SENSITIVE");

It seems that there are some bugs filed that apply to my scenario. I found a workaround which scales well.
First I select only the IDs (the Criteria API can select scalar values) and apply the batching there. That way I no longer have the n+1 problem caused by the wrong fetching strategy.
After that I select my entities with an IN() statement in batches of 1000, without limiting via fetch batch size or max results. So I do not run into this bug, and OpenJPA generates one query for each relation.
So I have around 6 queries for the entity with all its dependencies.
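The batching part of this workaround can be sketched in plain Java. This is only an illustration under assumptions: the class name BatchedLoader and the loadBatch function are hypothetical, with loadBatch standing in for whatever query loads the entities for one chunk of IDs via IN(); it is not OpenJPA API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;
import java.util.function.Predicate;

final class BatchedLoader {

    /** Splits ids into consecutive chunks of at most batchSize elements. */
    static <T> List<List<T>> partition(List<T> ids, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < ids.size(); start += batchSize) {
            int end = Math.min(start + batchSize, ids.size());
            batches.add(new ArrayList<>(ids.subList(start, end)));
        }
        return batches;
    }

    /**
     * Loads entities chunk by chunk and stops once limit matches are collected.
     * loadBatch is a hypothetical stand-in for the "WHERE id IN (...)" query.
     */
    static <I, E> List<E> loadFiltered(List<I> ids, int batchSize, int limit,
                                       Function<List<I>, List<E>> loadBatch,
                                       Predicate<E> filter) {
        List<E> matches = new ArrayList<>();
        if (limit <= 0) {
            return matches;
        }
        for (List<I> batch : partition(ids, batchSize)) {
            for (E entity : loadBatch.apply(batch)) {
                if (filter.test(entity)) {
                    matches.add(entity);
                    if (matches.size() == limit) {
                        return matches; // enough results: remaining chunks are never loaded
                    }
                }
            }
        }
        return matches;
    }
}
```

Because each chunk is loaded with its own set of queries, the total number of statements grows with the number of chunks actually visited, not with the number of entities.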
Thanks again thobens for your help!

Related

Is there way to fetch queries with Hibernate session.get()?

I am working with Hibernate 5 and fetch associations using Criteria queries. But when I call session.get(), Hibernate issues multiple SQL queries for the related entities when I access them. Is there a way to fetch them with one query, as with Criteria query fetching?
CriteriaQuery<AdvanceRecieved> advanceCriteria = builder.createQuery(AdvanceRecieved.class);
Root<AdvanceRecieved> advanceRoot = advanceCriteria.from(AdvanceRecieved.class);
advanceRoot.fetch(AdvanceRecieved_.department,JoinType.LEFT);
I fetched these entities with fetch(), but I haven't found an example of fetching for the code below.
ItemsABS selectedItem = jpaSess.get(ItemsABS.class, dealer.id);
Set<Tax> itemtaxes = selectedItem.getTaxEligibility();

You are seeing multiple queries because you probably have a one-to-many relation between the ItemsABS and Tax entities. So when you request ItemsABS data, it fetches the attached references (i.e. Tax data) by default, and hence multiple queries are fired.
If you just need ItemsABS data, you would probably have to use lazy loading while fetching data for ItemsABS.
This can be defined at the entity level using @OneToMany(fetch = FetchType.LAZY)

JPA 2.1 Load extra data after object is loaded

With JPA, can I run a query after an object is loaded from the database?
For example, I have an entity that has this field:
@OneToMany(mappedBy = "widget", fetch = FetchType.EAGER, cascade = {})
@OrderBy("happenedDate DESC")
public List<Widget> getWidgets() {
    return widgets;
}
This will load all of the associated widgets. What I really want are the first 20 in the result set, ordered by happenedDate. Is there an annotation that can specify a method to run after my object is loaded from the DB, so I can run a query and get limited results? Something like:
@AfterDataLoaded
List<Widget> loadLast20Widgets() {
    // Do query here
}
Does this annotation or pattern exist?

You can use a combination of @EntityListeners (class level) and @PostLoad (method level) to achieve your goal.

You can achieve this using the Hibernate-specific, non-JPA-compliant @Where annotation, which takes a native SQL clause to limit the results loaded into an associated collection.
https://docs.jboss.org/hibernate/stable/annotations/reference/en/html_single/#entity-hibspec-collection
How this query looks will depend on your database's support for LIMIT, TOP, etc.
@OneToMany(mappedBy = "widget", fetch = FetchType.EAGER, cascade = {})
@OrderBy("happenedDate DESC")
@Where(clause = "id in (select top 20 id from table_name order by some_field desc)")
public List<Widget> getLast20Widgets() {
    return widgets;
}

There is no annotation to limit the number of records fetched. To accomplish that, you will have to run a JPA query, which is trivial.

It's not possible, since JPA/Hibernate has to manage the entire collection's transitions, say from the persistent to the removed state.
You can use one of the alternatives below anyway:
1. HQL or native SQL
2. Criteria query

Hibernate: Criteria query taking a long time with three-level relationship

I'm having trouble using Criteria to find all the objects that belong to a certain entity. My model is like the following (just showing the relevant code):
@Entity...
class A {
    @ManyToOne(fetch = FetchType.EAGER)
    @Fetch(FetchMode.JOIN)
    @JoinColumn(name = "ide_b", referencedColumnName = "ide_b", nullable = false)
    private B b;
}
@Entity...
class B {
    @ManyToOne(fetch = FetchType.EAGER)
    @Fetch(FetchMode.JOIN)
    @JoinColumn(name = "ide_c", referencedColumnName = "ide_c", nullable = false)
    private C c;
}
@Entity...
class C {
    ...
}
My Criteria query is as simple as this (actually, there would be some filters, but they're not being used):
Criteria criteria = getSession().createCriteria(A.class);
criteria.list(); // MY SYSTEM STAYS HERE FOREVER WHEN RUNNING AGAINST A REAL DATABASE
Would anybody have a clue on this issue? The system just stays forever on the line "criteria.list()" and never returns.
I've already tested the SQL that it generates directly on the database and it works just fine.
I've already tested this query both with code involving only class A having a reference to B and class A having a reference to C (directly). They both work. This third level in the association seems to be causing problems... Note: my Hibernate version is an old one, like 3.0.0

You need to remove the join fetch type. Eager is fine for your data layout, but do you really need that much data at once?
With the default fetch, Hibernate queries table A only. Then, for each distinct foreign key to B, it queries B once per key. The same is done for C.
I.e. once b_id=1 has been fetched from B, it won't be fetched again, even if it is used by a million rows of A. Hibernate's session (first-level) cache handles that.
With join fetching, for each row from A you get one single row containing the columns of all 3 tables.
If your relation were OneToMany, you would get A x B x C rows returned. But since it is ManyToOne, there is no such issue.
Your problem is that even though this query returns one big row for each item in A, there is too much replication of B and C data. So the DB response is huge, and processing it is hard for both the DB and your app.
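As a rough illustration of the counting argument above (this is not Hibernate code), the sketch below estimates how many statements select-style fetching issues when each distinct foreign key is loaded only once. The class name SelectFetchEstimate and the sample ids are made up for the example.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class SelectFetchEstimate {

    /**
     * bIdsOfA holds the B foreign key of every A row;
     * cIdOfB maps each B id to the C id it references.
     * Returns 1 (query on A) + one query per distinct B + one per distinct C.
     */
    static int queryCount(List<Integer> bIdsOfA, Map<Integer, Integer> cIdOfB) {
        Set<Integer> distinctB = new HashSet<>(bIdsOfA);
        Set<Integer> distinctC = new HashSet<>();
        for (Integer bId : distinctB) {
            distinctC.add(cIdOfB.get(bId));
        }
        return 1 + distinctB.size() + distinctC.size();
    }
}
```

With six A rows that reference three distinct B rows, which in turn reference two distinct C rows, the estimate is six statements, however many additional A rows share those keys.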

Well, actually I think the problem is something else. I've changed both relationships (A -> B and B -> C) to lazy (and removed the @Fetch). Then I queried for a specific A object. So far so good. I was able to get the object and I was able to call a.getB() successfully.
Nonetheless, when I call b.getC() Hibernate does not return the C object to me (I mean, Hibernate gets stuck at this line).
The query Hibernate creates to fetch the C object when I call b.getC() is the following:
select
myCtable0_.id as id1_85_0_,
myCtable0_.name as nam2_85_0_
from
MyTableC myCtable0_
where
myCtable0_.id in (?, ?)
The C table has an id field (the primary key) and a name (varchar).

Another option is to fetch all data separately. Then hibernate cache will have each entity and then, I think, criteria query will be faster.

NamedEntityGraph - JPA / Hibernate throwing org.hibernate.loader.MultipleBagFetchException: cannot simultaneously fetch multiple bags

We have a project where we need to lazily load collections of an entity, but in some cases we need them loaded eagerly. We have added a @NamedEntityGraph annotation to our entity. In our repository methods we add a "javax.persistence.loadgraph" hint to eagerly load 4 of the attributes defined in said annotation. When we invoke that query, Hibernate throws org.hibernate.loader.MultipleBagFetchException: cannot simultaneously fetch multiple bags.
Funnily enough, when I redefine all of those collections as eagerly fetched, Hibernate does fetch them eagerly with no MultipleBagFetchException.
Here is the distilled code.
Entity:
@Entity
@NamedEntityGraph(name = "Post.Full", attributeNodes = {
        @NamedAttributeNode("comments"),
        @NamedAttributeNode("plusoners"),
        @NamedAttributeNode("sharedWith")
    }
)
public class Post {
    @OneToMany(cascade = CascadeType.ALL, mappedBy = "postId")
    private List<Comment> comments;
    @ElementCollection
    @CollectionTable(name="post_plusoners")
    private List<PostRelatedPerson> plusoners;
    @ElementCollection
    @CollectionTable(name="post_shared_with")
    private List<PostRelatedPerson> sharedWith;
}
Query method (all crammed together to make it postable):
@Override
public Page<Post> findFullPosts(Specification<Post> spec, Pageable pageable) {
    CriteriaBuilder builder = entityManager.getCriteriaBuilder();
    CriteriaQuery<Post> query = builder.createQuery(Post.class);
    Root<Post> post = query.from(Post.class);
    Predicate postsPredicate = spec.toPredicate(post, query, builder);
    query.where(postsPredicate);
    // the graph name has to match the named entity graph declared on Post
    EntityGraph<?> entityGraph = entityManager.createEntityGraph("Post.Full");
    TypedQuery<Post> typedQuery = entityManager.createQuery(query);
    typedQuery.setHint("javax.persistence.loadgraph", entityGraph);
    // paging is set on the TypedQuery; CriteriaQuery has no such methods
    typedQuery.setFirstResult(pageable.getOffset());
    typedQuery.setMaxResults(pageable.getPageSize());
    Long total = QueryUtils.executeCountQuery(getPostCountQuery(spec));
    List<Post> resultList = total > pageable.getOffset() ? typedQuery.getResultList() : Collections.<Post>emptyList();
    return new PageImpl<Post>(resultList, pageable, total);
}
Any hints on why is this working with eager fetches on entity level, but not with dynamic entity graphs?

I'm betting the eager fetches you thought were working were actually working incorrectly.
When you eagerly fetch more than one "bag" (an unordered collection allowing duplicates), the SQL used to perform the eager fetch (a left outer join) will return multiple results for the joined associations, as explained by this SO answer. So while Hibernate does not throw the org.hibernate.loader.MultipleBagFetchException when you have more than one List eagerly fetched, it would not return accurate results, for the reason given above.
However, when you give the query the entity graph hint, Hibernate will (rightly) complain. Hibernate developer Emmanuel Bernard addresses the reasons for this exception being thrown:
eager fetching is not the problem per se, using multiple joins in one SQL query is. It's not limited to the static fetching strategy; it has never been supported (properly), because it's conceptually not possible.
Emmanuel goes on to say in a different JIRA comment that,
most uses of "non-indexed" List or raw Collection are erroneous and should semantically be Sets.
So, bottom line, in order to get multiple eager fetches to work as you desire:
use a Set rather than a List,
persist the List index using JPA 2's @OrderColumn annotation,
if all else fails, fall back to Hibernate-specific fetch annotations (FetchMode.SELECT or FetchMode.SUBSELECT)
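To see why joining two bags in one statement is a problem, here is a small plain-Java simulation (not Hibernate code; the class BagJoinDemo and its sample values are invented for illustration). It builds the rows that a single statement joining two collections of the same parent would return, then collects the comment column once as a bag and once as a set. It ignores the null rows an outer join produces for empty collections.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class BagJoinDemo {

    /** Simulates the result rows of joining two collections of one parent in a single statement. */
    static List<String[]> joinRows(List<String> comments, List<String> plusoners) {
        List<String[]> rows = new ArrayList<>();
        for (String comment : comments) {
            for (String plusoner : plusoners) {
                rows.add(new String[] {comment, plusoner}); // one row per (comment, plusoner) pair
            }
        }
        return rows;
    }

    /** Collects the comment column as a bag: repeated rows cannot be told apart from real duplicates. */
    static List<String> commentsAsBag(List<String[]> rows) {
        List<String> bag = new ArrayList<>();
        for (String[] row : rows) {
            bag.add(row[0]);
        }
        return bag;
    }

    /** Collects the comment column as a set: repeated rows collapse to one element. */
    static Set<String> commentsAsSet(List<String[]> rows) {
        Set<String> set = new LinkedHashSet<>();
        for (String[] row : rows) {
            set.add(row[0]);
        }
        return set;
    }
}
```

With two comments and three plusoners the join yields six rows, so the bag ends up with six comment entries while the set keeps the two real ones. That is the intuition behind preferring a Set, or giving the List an index so rows can be matched to positions.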
EDIT
related:
https://stackoverflow.com/a/17567590/225217
https://stackoverflow.com/a/24676806/225217

JPA eager fetch does not join

What exactly does JPA's fetch strategy control? I can't detect any difference between eager and lazy. In both cases JPA/Hibernate does not automatically join many-to-one relationships.
Example: Person has a single address. An address can belong to many people. The JPA annotated entity classes look like:
@Entity
public class Person {
    @Id
    public Integer id;
    public String name;
    @ManyToOne(fetch=FetchType.LAZY or EAGER)
    public Address address;
}
@Entity
public class Address {
    @Id
    public Integer id;
    public String name;
}
If I use the JPA query:
select p from Person p where ...
JPA/Hibernate generates one SQL query to select from Person table, and then a distinct address query for each person:
select ... from Person where ...
select ... from Address where id=1
select ... from Address where id=2
select ... from Address where id=3
This is very bad for large result sets. If there are 1000 people it generates 1001 queries (1 from Person and 1000 distinct from Address). I know this because I'm looking at MySQL's query log. It was my understanding that setting address's fetch type to eager will cause JPA/Hibernate to automatically query with a join. However, regardless of the fetch type, it still generates distinct queries for relationships.
Only when I explicitly tell it to join does it actually join:
select p, a from Person p left join p.address a where ...
Am I missing something here? I now have to hand code every query so that it left joins the many-to-one relationships. I'm using Hibernate's JPA implementation with MySQL.
Edit: It appears (see Hibernate FAQ here and here) that FetchType does not impact JPA queries. So in my case I have to explicitly tell it to join.

JPA doesn't specify how mapping annotations select a fetch strategy. In general, related entities can be fetched in any one of the ways given below:
SELECT => one query for root entities + one query for related mapped entity/collection of each root entity = (n+1) queries
SUBSELECT => one query for root entities + second query for related mapped entity/collection of all root entities retrieved in first query = 2 queries
JOIN => one query to fetch both root entities and all of their mapped entity/collection = 1 query
So SELECT and JOIN are two extremes and SUBSELECT falls in between. One can choose suitable strategy based on her/his domain model.
By default SELECT is used by both JPA/EclipseLink and Hibernate. This can be overridden in Hibernate by using:
@Fetch(FetchMode.JOIN)
@Fetch(FetchMode.SUBSELECT)
Hibernate also allows setting SELECT mode explicitly using @Fetch(FetchMode.SELECT), which can be tuned with a batch size, e.g. @BatchSize(size=10).
The corresponding annotations in EclipseLink are:
@JoinFetch
@BatchFetch
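The query counts for the strategies listed above can be tallied with a few lines of plain Java. This is only an idealized sketch: the class FetchQueryCount is made up for illustration, and the batched figure assumes the provider loads exactly batchSize related entities per statement, which real providers may not do.

```java
final class FetchQueryCount {

    /** SELECT: one query for the roots plus one per root entity. */
    static long select(long roots) {
        return 1 + roots;
    }

    /** SUBSELECT: one query for the roots plus one for all related entities. */
    static long subselect() {
        return 2;
    }

    /** JOIN: roots and related entities come back in a single query. */
    static long join() {
        return 1;
    }

    /** SELECT with batching: one query for the roots plus one per full or partial batch. */
    static long selectBatched(long roots, long batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        return 1 + (roots + batchSize - 1) / batchSize; // integer ceiling of roots / batchSize
    }
}
```

For the 1000 people from the question, plain SELECT gives 1001 queries, while a batch size of 10 would bring the idealized count down to 101.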

"mxc" is right. fetchType just specifies when the relation should be resolved.
To optimize eager loading by using an outer join you have to add
@Fetch(FetchMode.JOIN)
to your field. This is a Hibernate-specific annotation.

The fetchType attribute controls whether the annotated field is fetched immediately when the primary entity is fetched. It does not necessarily dictate how the fetch statement is constructed; the actual SQL implementation depends on the provider you are using (TopLink, Hibernate, etc.).
If you set fetchType=EAGER, the annotated field is populated with its values at the same time as the other fields in the entity. So if you open an entity manager, retrieve your person objects and then close the entity manager, subsequently accessing person.address will not result in a lazy load exception being thrown.
If you set fetchType=LAZY, the field is only populated when it is accessed. If you have closed the entity manager by then, a lazy load exception will be thrown when you access person.address. To load the field you need to put the entity back into an entity manager's context with em.merge(), then do the field access, and then close the entity manager.
You might want lazy loading when constructing a customer class with a collection of customer orders. If you retrieved every order for a customer when you only wanted a customer list, this may be an expensive database operation when you are only looking for the customer name and contact details. Best to leave the DB access till later.
For the second part of the question: how do you get Hibernate to generate optimised SQL?
Hibernate should allow you to provide hints as to how to construct the most efficient query, but I suspect there is something wrong with your table construction. Is the relationship established in the tables? Hibernate may have decided that a simple query will be quicker than a join, especially if indexes etc. are missing.

Try with:
select p from Person p left join FETCH p.address a where...
It works for me in a similar case with JPA2/EclipseLink, but it seems this feature is present in JPA1 too.

If you use EclipseLink instead of Hibernate you can optimize your queries by "query hints". See this article from the Eclipse Wiki: EclipseLink/Examples/JPA/QueryOptimization.
There is a chapter about "Joined Reading".

To join, you can do multiple things (using EclipseLink):
in JPQL you can use left join fetch
in a named query you can specify a query hint
in a TypedQuery you can say something like
query.setHint("eclipselink.join-fetch", "e.projects.milestones");
there is also a batch fetch hint
query.setHint("eclipselink.batch", "e.address");
See
http://java-persistence-performance.blogspot.com/2010/08/batch-fetching-optimizing-object-graph.html

I had exactly this problem, except that the Person class had an embedded key class.
My own solution was to join them in the query AND remove
@Fetch(FetchMode.JOIN)
My embedded id class:
@Embeddable
public class MessageRecipientId implements Serializable {
    @ManyToOne(targetEntity = Message.class, fetch = FetchType.LAZY)
    @JoinColumn(name="messageId")
    private Message message;
    private String governmentId;
    public MessageRecipientId() {
    }
    public Message getMessage() {
        return message;
    }
    public void setMessage(Message message) {
        this.message = message;
    }
    public String getGovernmentId() {
        return governmentId;
    }
    public void setGovernmentId(String governmentId) {
        this.governmentId = governmentId;
    }
    public MessageRecipientId(Message message, GovernmentId governmentId) {
        this.message = message;
        this.governmentId = governmentId.getValue();
    }
}

Two things occur to me.
First, are you sure you mean ManyToOne for address? That means multiple people will have the same address. If it's edited for one of them, it'll be edited for all of them. Is that your intent? 99% of the time addresses are "private" (in the sense that they belong to only one person).
Secondly, do you have any other eager relationships on the Person entity? If I recall correctly, Hibernate can only handle one eager relationship on an entity but that is possibly outdated information.
I say that because your understanding of how this should work is essentially correct from where I'm sitting.
