Delete cascade hangs in JPA when large number of objects

I have JPA entities like this:
@Entity
class MyEntity {
    @JsonIgnore
    @OneToMany(mappedBy = "application", cascade = ALL, fetch = LAZY)
    private List<MyChildEnity> myChildEntities;
}
...
@Entity
class MyChildEnity {
    @ManyToOne(optional = false, fetch = FetchType.LAZY, cascade = { REFRESH, DETACH })
    @JoinColumn(name = "APPLICATION_ID")
    private MyEntity application;
}
I access this entity from a REST call. When the number of child elements is very large and I try to delete the MyEntity object, the REST call hangs and then times out. With a small number of rows in the MyChildEnity table it works fine. When I debugged, I saw that JPA fetches one record at a time and deletes it. This is too slow and does far too much work.
Is this expected behavior? Shouldn't JPA be intelligent enough to convert this into a single DELETE statement on the MyChildEnity table?
I'm using OpenJPA with Derby and DB2 databases.

The reason you get one delete statement per element probably has to do with the fact that JPA lets you hook into the removal with pre- and post-removal callbacks. If you write a JPQL delete statement, you bypass the callback mechanism and can delete everything in a single request.
See the documentation for entity listeners and callbacks (this is standard JPA functionality).
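A minimal sketch of the JPQL approach, under stated assumptions: the entity names come from the question, while UpdateRunner, the :parent parameter, and the class name are hypothetical. UpdateRunner is only a stand-in for em.createQuery(jpql).setParameter(name, value).executeUpdate(), so the sketch compiles without a JPA provider. Keep in mind that bulk statements bypass the persistence context, lifecycle callbacks, and cascade settings, so they should run inside a transaction and ideally before the affected entities are loaded.

```java
class BulkDeleteSketch {

    /**
     * Stand-in (assumption, not a JPA type) for
     * em.createQuery(jpql).setParameter(name, value).executeUpdate().
     */
    public interface UpdateRunner {
        int executeUpdate(String jpql, String paramName, Object paramValue);
    }

    // Entity and field names match the question; the parameter name is illustrative.
    public static final String DELETE_CHILDREN =
            "DELETE FROM MyChildEnity c WHERE c.application = :parent";
    public static final String DELETE_PARENT =
            "DELETE FROM MyEntity e WHERE e = :parent";

    /**
     * Removes all children with one statement, then the parent with a second one.
     * Children go first so the foreign key on APPLICATION_ID is never violated.
     * Returns the total number of rows reported as deleted.
     */
    public static int deleteWithChildren(UpdateRunner runner, Object parent) {
        int removed = runner.executeUpdate(DELETE_CHILDREN, "parent", parent);
        removed += runner.executeUpdate(DELETE_PARENT, "parent", parent);
        return removed;
    }
}
```

With a real EntityManager em, the runner could be the lambda (jpql, name, value) -> em.createQuery(jpql).setParameter(name, value).executeUpdate(). Whether the provider translates this into exactly two SQL statements is worth confirming in the SQL log.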

Related

Hibernate select with bidirectional mapping

In a bidirectional relationship between two entities (a ControlTable made up of ControlSteps), I'm trying different ways to load a ControlTable together with its collection of ControlSteps. I know this bidirectional mapping is not recommended, but I need to know every child of a parent, and the parent of every child.
I configured it like this in the ControlTable class:
@OneToMany(mappedBy = "controlTable", cascade = CascadeType.ALL, fetch = FetchType.EAGER)
@Cache(usage = CacheConcurrencyStrategy.NONSTRICT_READ_WRITE)
@Fetch(FetchMode.JOIN)
private Set<ControlStep> controlSteps;
And like this in the ControlStep class:
@ManyToOne(optional = false, fetch = FetchType.LAZY)
@JoinColumn(name = "ctrl_table_id", referencedColumnName = "id")
private ControlTable controlTable;
When I use the default JPA query findAll(), I cannot get the list of ControlTables (or even a single one), because the response recursively includes the parent inside each child, then that parent's children, and so on (an infinite response).
I also tried making everything LAZY and fetching the children with an HQL query, but the result is the same.
Do you have any idea how to get these collections without this problem?
Thank you very much in advance.

Found it. The problem was Spring Data REST and the JSON transformation. For more details, see:
Infinite Recursion with Jackson JSON and Hibernate JPA issue
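To illustrate why the response never ends, here is a toy serializer (plain Java, not Jackson or Spring Data REST). The class and field names follow the question; the toJson methods and addStep are hypothetical. The child deliberately leaves out its back-reference to the parent, which is roughly the effect that marking the child side with Jackson's @JsonBackReference has.

```java
import java.util.ArrayList;
import java.util.List;

class BackReferenceSketch {

    static class ControlTable {
        final long id;
        final List<ControlStep> controlSteps = new ArrayList<>();

        ControlTable(long id) { this.id = id; }

        void addStep(ControlStep step) {
            controlSteps.add(step);
            step.controlTable = this;   // keep both sides of the association in sync
        }

        /** Serializes the parent together with all of its children. */
        String toJson() {
            StringBuilder sb = new StringBuilder("{\"id\":" + id + ",\"controlSteps\":[");
            for (int i = 0; i < controlSteps.size(); i++) {
                if (i > 0) sb.append(',');
                sb.append(controlSteps.get(i).toJson());
            }
            return sb.append("]}").toString();
        }
    }

    static class ControlStep {
        final long id;
        ControlTable controlTable;      // back-reference to the parent

        ControlStep(long id) { this.id = id; }

        /** Omits controlTable on purpose; writing it here would recurse without end. */
        String toJson() {
            return "{\"id\":" + id + "}";
        }
    }

    public static void main(String[] args) {
        ControlTable table = new ControlTable(1);
        table.addStep(new ControlStep(10));
        table.addStep(new ControlStep(11));
        // prints {"id":1,"controlSteps":[{"id":10},{"id":11}]}
        System.out.println(table.toJson());
    }
}
```

If ControlStep.toJson() also wrote out controlTable.toJson(), each call would trigger the other until the stack overflowed, which matches the "infinite response" described in the question.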

Spring Data JPA: Batch insert for nested entities

I have a test case where I need to persist 100'000 entity instances into the database. The code I'm currently using does this, but it takes up to 40 seconds until all the data is persisted in the database. The data is read from a JSON file which is about 15 MB in size.
Now I had already implemented a batch insert method in a custom repository before for another project. However, in that case I had a lot of top level entities to persist, with only a few nested entities.
In my current case I have 5 Job entities that contain a List of about 30 JobDetail entities. One JobDetail contains between 850 and 1100 JobEnvelope entities.
When writing to the database I commit the List of Job entities with the default save(Iterable<Job> jobs) interface method. All nested entities have the CascadeType PERSIST. Each entity has its own table.
The usual way to enable batch inserts would be to implement a custom method like saveBatch that flushes every once in a while. But my problem in this case is the JobEnvelope entities. I don't persist them with a JobEnvelope repository; instead I let the repository of the Job entity handle it. I'm using MariaDB as the database server.
So my question boils down to the following: How can I make the JobRepository insert its nested entities in batches?
These are my 3 entities in question:
Job
@Entity
public class Job {
    @Id
    @GeneratedValue
    private int jobId;
    @OneToMany(fetch = FetchType.EAGER, cascade = CascadeType.PERSIST, mappedBy = "job")
    @JsonManagedReference
    private Collection<JobDetail> jobDetails;
}
JobDetail
@Entity
public class JobDetail {
    @Id
    @GeneratedValue
    private int jobDetailId;
    @ManyToOne(fetch = FetchType.EAGER, cascade = CascadeType.PERSIST)
    @JoinColumn(name = "jobId")
    @JsonBackReference
    private Job job;
    @OneToMany(fetch = FetchType.EAGER, cascade = CascadeType.PERSIST, mappedBy = "jobDetail")
    @JsonManagedReference
    private List<JobEnvelope> jobEnvelopes;
}
JobEnvelope
@Entity
public class JobEnvelope {
    @Id
    @GeneratedValue
    private int jobEnvelopeId;
    @ManyToOne(fetch = FetchType.EAGER, cascade = CascadeType.PERSIST)
    @JoinColumn(name = "jobDetailId")
    private JobDetail jobDetail;
    private double weight;
}

Make sure to configure Hibernate batch-related properties properly:
<property name="hibernate.jdbc.batch_size">100</property>
<property name="hibernate.order_inserts">true</property>
<property name="hibernate.order_updates">true</property>
The point is that successive statements can be batched only if they target the same table. When a statement that inserts into another table comes along, the batch built so far has to be closed and executed before that statement. With the hibernate.order_inserts property you give Hibernate permission to reorder inserts before building batches (hibernate.order_updates has the same effect for update statements).
hibernate.jdbc.batch_size is the maximum batch size that Hibernate will use. Try different values and pick the one that performs best for your use cases.
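To make the effect of ordering concrete, here is a toy model in plain Java (not Hibernate code). It assumes a batch ends whenever the target table changes or the batch is full, and it approximates reordering by simply grouping statements per table; Hibernate's real sorter also has to respect dependencies between entities. All names in the sketch are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class BatchOrderingSketch {

    /** Counts the batches needed when a batch ends on a table change or when it is full. */
    public static int countBatches(List<String> tables, int batchSize) {
        int batches = 0;
        int currentSize = 0;
        String currentTable = null;
        for (String table : tables) {
            boolean sameTable = table.equals(currentTable);
            if (!sameTable || currentSize == batchSize) {
                batches++;              // the previous batch is executed, a new one starts
                currentSize = 0;
                currentTable = table;
            }
            currentSize++;
        }
        return batches;
    }

    /** Groups statements by table, keeping the order in which tables first appear. */
    public static List<String> groupByTable(List<String> tables) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String table : tables) {
            counts.merge(table, 1, Integer::sum);
        }
        List<String> ordered = new ArrayList<>();
        counts.forEach((table, n) -> {
            for (int i = 0; i < n; i++) ordered.add(table);
        });
        return ordered;
    }

    public static void main(String[] args) {
        // Three JobDetail inserts, each followed by four JobEnvelope inserts.
        List<String> interleaved = new ArrayList<>();
        for (int i = 0; i < 3; i++) {
            interleaved.add("JobDetail");
            for (int j = 0; j < 4; j++) interleaved.add("JobEnvelope");
        }
        System.out.println(countBatches(interleaved, 100));               // prints 6
        System.out.println(countBatches(groupByTable(interleaved), 100)); // prints 2
    }
}
```

In this model the interleaved sequence needs six round trips while the grouped one needs two, which is the kind of gain hibernate.order_inserts is meant to enable.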
Note that batching of insert statements is disabled if the IDENTITY id generator is used.
Specific to MySQL: you have to specify rewriteBatchedStatements=true as part of the connection URL. To make sure that batching works as expected, add profileSQL=true to inspect the SQL the driver sends to the database. More details here.
If your entities are versioned (for optimistic locking), then to use batch updates (this does not affect inserts) you also have to turn on:
<property name="hibernate.jdbc.batch_versioned_data">true</property>
With this property you tell Hibernate that the JDBC driver is capable of returning the correct count of affected rows when executing a batch update (needed to perform the version check). You have to check whether this works properly for your database/JDBC driver. For example, it does not work in Oracle 11 and older Oracle versions.
You may also want to flush and clear the persistence context after each batch to release memory, otherwise all of the managed objects remain in the persistence context until it is closed.
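A minimal sketch of the flush-and-clear pattern, under stated assumptions: PersistenceOps is a hypothetical stand-in for the EntityManager methods persist, flush, and clear, so the example runs without a JPA provider, and persistInChunks is an illustrative name. The chunk size would normally match hibernate.jdbc.batch_size.

```java
import java.util.List;

class FlushClearSketch {

    /** Stand-in (assumption, not a JPA type) for the three EntityManager methods used here. */
    public interface PersistenceOps {
        void persist(Object entity);
        void flush();
        void clear();
    }

    /**
     * Persists all entities, flushing and clearing after every batchSize entities
     * and once more for a trailing partial chunk. Returns the number of flushes.
     */
    public static int persistInChunks(PersistenceOps ops, List<?> entities, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int flushes = 0;
        int pending = 0;
        for (Object entity : entities) {
            ops.persist(entity);
            pending++;
            if (pending == batchSize) {
                ops.flush();    // send the queued inserts to the database
                ops.clear();    // detach everything so the context does not keep growing
                flushes++;
                pending = 0;
            }
        }
        if (pending > 0) {
            ops.flush();
            ops.clear();
            flushes++;
        }
        return flushes;
    }
}
```

In real code the three methods map directly onto EntityManager.persist, flush, and clear, and the loop has to run inside a transaction. With cascading, a single persist call can queue many inserts, so counting top-level entities gives only a rough bound on how much the persistence context holds between flushes.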
Also, you may find this blog useful as it nicely explains the details of Hibernate batching mechanism.

To add to Dragan Bozanovic's answer: Hibernate sometimes silently disables the ordering of batches, for example when it encounters cyclic relations between entities while building the dependency graph between batches (see the InsertActionSorter.sort(..) method). It would be helpful if Hibernate logged this when it happens.

cascade type save update in Hibernate

I am using hibernate with JPA annotations for relationship mapping.
I have three entities in my code: User, Group, and User_Group.
User and Group are in a ManyToMany relationship.
User_Group is a kind of bridge table, but with some additional fields. So here is the modified mapping code.
User
@Entity
@Table(name = "USERS")
public class User {
    @OneToMany(mappedBy = "user")
    private Set<UserGroup> userGroups;
}
Group
@Entity
@Table(name = "GROUPS")
public class Group {
    @OneToMany(mappedBy = "group")
    private Set<UserGroup> userGroups;
}
UserGroup
@Entity
@Table(name = "USERS_GROUPS")
public class UserGroup {
    @ManyToOne(cascade = CascadeType.ALL)
    @JoinColumn(name = "USER_ID")
    private User user;
    @ManyToOne(cascade = CascadeType.ALL)
    @JoinColumn(name = "GROUP_ID")
    private Group group;
}
I set the user and group objects on the userGroup and save it:
User user = new User("tommy", "ymmot", "tommy@gmail.com");
Group group = new Group("Coders");
UserGroup userGroup = new UserGroup();
userGroup.setGroup(group);
userGroup.setUser(user);
userGroup.setActivated(true);
userGroup.setRegisteredDate(new Date());
session.save(userGroup);
Things work fine. With CascadeType.ALL the group and user objects are updated too. But when I delete the userGroup object, the child objects are deleted too.
Deletion of child objects is a strict no-no.
There is no CascadeType.SAVE_UPDATE in JPA, which would cascade save or update but not delete. How do I achieve this?
If I remove CascadeType.ALL from the mapping, the child objects don't get updated, and I need them to be updated.

SAVE_UPDATE is for save(), update(), and saveOrUpdate(), which are three Hibernate-proprietary methods. JPA only has persist() and merge(). So if you want cascading on Hibernate-proprietary methods, you'll need to use Hibernate-proprietary annotations: in this case, @Cascade.
Or you could stop using the Hibernate Session, and use the standard JPA API instead.

CascadeType.ALL includes CascadeType.REMOVE too.
The solution is to use all CascadeType.* you need except CascadeType.REMOVE, like so:
@ManyToOne(cascade = {CascadeType.PERSIST, CascadeType.REFRESH, CascadeType.MERGE})
in your UserGroup definitions.

Propagating from a child to a parent entity is almost always a code smell; it should be the other way round.
From Cascading best practices:
Cascading only makes sense only for Parent – Child associations (the
Parent entity state transition being cascaded to its Child entities).
Cascading from Child to Parent is not very useful and usually, it’s a
mapping code smell.
From Hibernate best practices:
Avoid cascade remove for huge relationships
Most developers (myself included) get a little nervous when they see a
CascadeType.REMOVE definition for a relationship. It tells Hibernate
to also delete the related entities when it deletes this one. There is
always the fear that the related entity also uses cascade remove for
some of its relationships and that Hibernate might delete more
database records than intended. During all the years I’ve worked with
Hibernate, this has never happened to me, and I don’t think it’s a
real issue. But cascade remove makes it incredibly hard to understand
what exactly happens if you delete an entity. And that’s something you
should always avoid. If you have a closer look at how Hibernate
deletes the related entities, you will find another reason to avoid
it. Hibernate performs 2 SQL statements for each related entity: 1
SELECT statement to fetch the entity from the database and 1 DELETE
statement to remove it. This might be OK, if there are only 1 or 2
related entities but creates performance issues if there are large
numbers of them.

JPA Hibernate unexpectedly fetches records of @OneToOne mapped entity, should I change mapping to @ManyToOne or do something else?

I have an entity with a @OneToOne mapped subentity:
@Entity @Table
public class BaseEntity {
    @Id
    private String key;
    @OneToOne(fetch = FetchType.LAZY, cascade = CascadeType.ALL)
    private InnerEntity inner;
}
@Entity @Table
public class InnerEntity {
    private String data;
}
It was working perfectly for persist and merge operations until I decided to fetch all records with a named query (SELECT e FROM BaseEntity e). The problem is that after running it, Hibernate fetches all records from BaseEntity and then executes a separate query for each InnerEntity. Because the table is quite big, this takes a lot of time and memory.
First, I checked whether getInner() is called anywhere in the running code. Then I tried changing the fetch type to EAGER to see whether Hibernate would fetch everything with one query. It didn't. Next I changed the mapping to @ManyToOne and added updatable = false and insertable = false to the @JoinColumn annotation. Fetching started to work perfectly, with one SELECT and no JOIN (I changed EAGER back to LAZY), but then problems with updating began. Hibernate expects InnerEntity to be persisted first, but there is no property holding a primary key. Of course I can do this and explicitly persist InnerEntity, calling setKey() first, but I would rather solve this without that.
Any ideas?

If you want the inner field to be loaded on demand and your relation is @OneToOne, you can try this:
@OneToOne(fetch = FetchType.LAZY, optional = false)

When you use HQL, Hibernate does not take the fetch annotations into account, so you have to tell it what to fetch.
In your case you should write the HQL like this:
SELECT e FROM BaseEntity as e left join fetch e.inner

How to retrieve nested JPA entities in a single query

I am trying to retrieve entities using EclipseLink JPA and am looking for a way to reduce the number of queries run to retrieve a single entity. I believe I should be using the @JoinFetch annotation to retrieve sub-entities in the same query as the main entity. This works fine for a single level of join, but not for multiple levels.
In the example below, EntityA contains a collection of EntityB which contains an EntityC. When I retrieve EntityA, I want a single query to return all 3 sets of entity data. In reality it generates 2 queries, 1 joining EntityA and EntityB and then a separate query joining EntityB and EntityC.
Is it possible to combine this into one query?
class EntityA {
    @OneToMany(mappedBy = "entityALink", fetch = FetchType.EAGER)
    @JoinFetch
    private Collection<EntityB> entityBs;
}
class EntityB {
    @JoinColumn(name = "X", referencedColumnName = "Y")
    @ManyToOne(optional = false, fetch = FetchType.EAGER)
    private EntityA entityALink;
    @JoinColumn(name = "A", referencedColumnName = "B")
    @ManyToOne(optional = false, fetch = FetchType.EAGER)
    @JoinFetch
    private EntityC entityCLink;
}
class EntityC {
    @Id
    @Basic(optional = false)
    @Column(name = "SomeColumn")
    private String someField;
}

If you need to reduce the number of queries, you can use lazy initialization (FetchType.LAZY instead of FetchType.EAGER); JPA then reads the data from the database only when it is needed. Keep in mind that this does not work once the entity is detached from the entity manager. So if you send the entity to another server in serialized form (e.g. in a multi-tier application), you have to attach it to an entity manager again. If your application runs on a single server, you don't have this problem.
To sum up, this is not an exact answer to your question, but it may help you optimize this code.
An exact answer to your question:
You can use named queries, but the query is then translated into a native SQL query, and you cannot be sure it will work the way you want. Maybe you could use a native query instead?
em.createNativeQuery("SELECT ... your queries")
For this, read about using the @SqlResultSetMapping annotation to map the result to an entity class...

First write a query to get EntityA:
EntityA entity = <your Query> ;
Then call:
Collection<EntityB> entityB = entity.getEntityBs();
for (EntityB eachB : entityB) {
    EntityC entityCLink = eachB.getEntityCLink();
}
Note: create getters and setters in each entity.
