Hibernate many to many fetching associated objects - java

@Entity
@Table(name = "MATCHES")
public class Match implements Serializable {

    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    @Column(name = "MATCH_ID")
    private Long id;

    @ManyToMany(mappedBy = "matches", cascade = CascadeType.ALL)
    private Set<Team> teams = new HashSet<Team>();
}
@Entity
@Table(name = "Teams")
public class Team implements Serializable {

    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    @Column(name = "TEAM_ID")
    private long id;

    @ManyToMany(fetch = FetchType.LAZY, cascade = CascadeType.ALL)
    @JoinTable(name = "TEAM_MATCH",
            joinColumns = { @JoinColumn(name = "TEAM_ID") },
            inverseJoinColumns = { @JoinColumn(name = "MATCH_ID") })
    private Set<Match> matches = new HashSet<Match>();
}
I have these classes; now I want to get all the matches and, let's say, print the names of both teams.
public List getAllMatches() {
    Session session = HibernateUtil.getSession();
    Transaction t = session.beginTransaction();
    Criteria criteria = session.createCriteria(Match.class, "match");
    criteria.createAlias("match.teams", "mt", JoinType.LEFT_OUTER_JOIN);
    List result = criteria.list();
    t.commit();
    session.close();
    return result;
}
But when I invoke that method, result has size 2 even though I have only 1 match in my table. Both of those matches in result have 2 teams, which is correct. I have no idea why this happens. What I want is one Match object with two Team objects in its 'teams' set, but instead I get two of those Match objects. They are fine, but there are two of them. I'm completely new to this and have no idea how to fix these criteria. I tried deleting 'FetchType.LAZY' from @ManyToMany in Team but it doesn't work. Team also has properties like Players/Trainer etc. which are in their own tables, but I don't want to dig that deep yet, baby steps. I also wonder whether doing such queries is a good idea; should I just return Matches and then, if I want the Teams, get them in another session?
Edit: I added criteria.setResultTransformer(DistinctRootEntityResultTransformer.INSTANCE); and it works. Is that how I was supposed to fix it, or is this for something completely different and I just got lucky?

I think the duplication is a result of your createAlias call, which besides having this side effect is redundant in the first place.
By calling createAlias with those arguments, you are telling Hibernate to not just return all matches, but to first cross index the MATCHES table with the TEAM_MATCH table and return a result for each matching pair of rows. You get one result for a row in the matches table paired with the many-to-many mapping to the first team, and another result for the same row in the matches table paired with the many-to-many mapping to the second team.
I'm guessing your intent with that line was to tell Hibernate to fetch the association. This is not necessary; Hibernate will fetch associated objects on its own automatically when needed.
Simply delete the criteria.createAlias call, and you should get the result you expected - with one caveat. Because the association is using lazy fetching, Hibernate won't load it until you access it, and if that comes after the session is closed you will get a LazyInitializationException. In general I would suggest you prefer solving this by having the session opened and closed at a higher level of abstraction - getting all matches is presumably part of some larger task, and in most cases you should really use one session for the duration of the entire task unless there are substantial delays (such as waiting for user input) involved. Changing that would likely require significant redesign of your code, however; the quick solution is to simply loop over the result list and call Hibernate.initialize() on the teams collection in each Match. Or you could just change the fetch type to eager, if the performance cost of always loading the association whether or not you need it is acceptable.
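For illustration, here is a minimal sketch of the revised method without the createAlias call, initializing the lazy collection before the session is closed (HibernateUtil.getSession() is taken from the question; the getTeams() accessor is an assumption):
public List<Match> getAllMatches() {
    Session session = HibernateUtil.getSession();
    Transaction t = session.beginTransaction();
    Criteria criteria = session.createCriteria(Match.class, "match");
    // no createAlias here: we only want the Match entities themselves
    List<Match> result = criteria.list();
    // force the lazy 'teams' collection to load while the session is still open
    for (Match match : result) {
        Hibernate.initialize(match.getTeams());
    }
    t.commit();
    session.close();
    return result;
}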


What is the best practice to create repository on Spring Boot?

I want to create a one-to-many mapping, like Post has many Comments. I have two solutions for adding comments. The first solution is to create a repository for Comment; the second is to use PostRepository, load the post, and add the comment to it. Each solution has its own challenges.
In the first solution, creating a repository per entity increases the number of repositories too much, and according to DDD, repositories should be created for aggregate roots.
In the second solution, there are performance issues. To load, add or remove nested entities, the root entity must be loaded first. To add an entity, other related entities, like the User entity in the Comment entity, must be loaded from userRepository. As a result, these additional loadings cause a decrease in speed and total performance.
What is the best practice to load, add or remove nested entities?
File Post.java
@Entity
@Table(name = "posts")
@Getter
@Setter
public class Post {

    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;

    @Size(max = 250)
    private String description;

    @NotNull
    @Lob
    private String content;

    @OneToMany(mappedBy = "post", fetch = FetchType.LAZY, cascade = CascadeType.ALL)
    private Set<Comment> comments = new HashSet<>();

    @ManyToOne(fetch = FetchType.LAZY, optional = false)
    @JoinColumn(name = "user_id", nullable = false)
    @OnDelete(action = OnDeleteAction.CASCADE)
    private User user;
}
File Comment.java
@Entity
@Table(name = "comments")
@Getter
@Setter
public class Comment {

    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;

    @NotNull
    @Lob
    private String text;

    @ManyToOne(fetch = FetchType.LAZY, optional = false)
    @JoinColumn(name = "post_id", nullable = false)
    @OnDelete(action = OnDeleteAction.CASCADE)
    private Post post;

    @ManyToOne(fetch = FetchType.LAZY, optional = false)
    @JoinColumn(name = "user_id", nullable = false)
    @OnDelete(action = OnDeleteAction.CASCADE)
    private User user;
}
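File User.java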
@Entity
@Table(name = "Users")
@Getter
@Setter
public class User {

    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;

    @OneToMany(mappedBy = "user", fetch = FetchType.LAZY, cascade = CascadeType.ALL)
    private Set<Comment> comments = new HashSet<>();

    @OneToMany(mappedBy = "user", fetch = FetchType.LAZY, cascade = CascadeType.ALL)
    private Set<Post> posts = new HashSet<>();
}
"best" is not well defined.
But here is what is probably the canonical stance of the Spring Data team on this question.
You definitely should NOT have one repository per entity (see Are you supposed to have one repository per table in JPA?).
The reason is certainly not that you'd have too many classes/interfaces.
Classes and interfaces are really cheap to create, both at implementation time and at run time.
It is kind of hard to have so many of them that it poses a significant problem.
And if it did, the entities themselves would already be a problem.
The reason is that repositories handle aggregates, not entities.
Admittedly, the difference is hard to see in JPA-based code.
So your question boils down to: what should be an aggregate?
At least part of the answer is already in your question:
In the second solution, there are performance issues. To load, add or remove nested entities, the root entity must be loaded first. To add an entity, other related entities, like the User entity in the Comment entity, must be loaded from userRepository. As a result, these additional loadings cause a decrease in speed and total performance.
The concepts of aggregate and repository are widely adopted in the microservice community because they lead to good scalability.
This certainly isn't the same as "speed and total performance", but it is related.
So how do these two views fit together?
Andrey B. Panfilov is onto something with their comment:
#OneToMany is actually #OneToFew like "person may be reachable by a couple of phone numbers".
But it only describes a heuristic.
The real rule is: An aggregate should group classes that need to be consistent at all times.
The canonical example is a purchase order with its line items.
Line items on their own don't make sense.
And if you modify a line item (or add/remove one) you might have to update the purchase order, for example in order to update the total price or in order to maintain constraints like a maximum value.
So purchase order should be an aggregate including its line items.
This also means that you need to completely load an aggregate.
This in turn means that it can't be too big, because otherwise you'd run into performance problems.
In your example of Post, Comment, and User, Post might form an aggregate with Comment.
But in most systems the number of comments is close to unlimited and can be huge.
I therefore would vote for making each entity in your example its own aggregate.
For more input about aggregates and repositories you might find Spring Data JDBC, References, and Aggregates interesting.
It is about Spring Data JDBC not Spring Data JPA, but the conceptual ideas do apply.
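Following that advice (making each entity its own aggregate), here is a rough sketch of how adding a comment could look with one repository per aggregate; the repository and service names are illustrative, not from the question, and getReferenceById (getOne/getById in older Spring Data versions) is used so that neither Post nor User has to be loaded just to set the foreign keys:
public interface PostRepository extends JpaRepository<Post, Long> { }
public interface CommentRepository extends JpaRepository<Comment, Long> { }
public interface UserRepository extends JpaRepository<User, Long> { }

@Service
public class CommentService {

    private final PostRepository postRepository;
    private final UserRepository userRepository;
    private final CommentRepository commentRepository;

    public CommentService(PostRepository postRepository,
                          UserRepository userRepository,
                          CommentRepository commentRepository) {
        this.postRepository = postRepository;
        this.userRepository = userRepository;
        this.commentRepository = commentRepository;
    }

    @Transactional
    public Comment addComment(Long postId, Long userId, String text) {
        Comment comment = new Comment();
        // getReferenceById returns a lazy proxy, so no SELECT on Post or User is needed here
        comment.setPost(postRepository.getReferenceById(postId));
        comment.setUser(userRepository.getReferenceById(userId));
        comment.setText(text);
        return commentRepository.save(comment);
    }
}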
N+1 problem: fetching data in a loop. If you have 2000+ posts with comments, you need to avoid a separate fetch for each one.
// Ex: 2000 posts are fetched
for (Post post : userRepository.findById(1L).get().getPosts()) {
    // fetching in a loop: you go to the database for each of the 2000 posts to get its comments
    Set<Comment> comments = post.getComments();
}
Solution: create a repository for Post and fetch with a custom query. There are a lot of ways to fetch eagerly, e.g. EntityGraph, FetchType.EAGER, JPQL ...
@Query("select p from Post p left join fetch p.comments c where p.id = :postId")
public Set<Post> postsWithComments(@Param("postId") Long postId);

Set<Post> posts = postRepository.postsWithComments(1L);
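As an alternative to the JPQL fetch join, an entity graph on a derived query method achieves the same single-query load declaratively (a sketch; the method name is illustrative):
public interface PostRepository extends JpaRepository<Post, Long> {

    // loads the post together with its comments in one query
    @EntityGraph(attributePaths = "comments")
    Optional<Post> findWithCommentsById(Long postId);
}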
Even so, you need to be careful when fetching data eagerly. If there are a lot of comments per post, simply use another repository for Comment.
public Set<Comment> findByPostId(Long postId);
Set<Comment> comments = commentRepository.findByPostId(1L);
And if there are 60000 comments for a single post, you need to fetch with pagination, which can be helpful in critical situations.
public Page<Comment> findByPostId(Long postId, Pageable pageable);

int pageSize = 2000;
int pageNumber = 0;
Page<Comment> comments;
do {
    comments = commentRepository.findByPostId(1L, PageRequest.of(pageNumber, pageSize));
    // do something with comments.getContent()
    pageNumber++;
} while (pageNumber < comments.getTotalPages());
Beyond that, you can use caching strategies to improve performance.
Also, define what the acceptable response time of a request is and what the actual response time is. You can fetch with a left join or simply issue another request. For long-running processes you can use async operations as well.

Is there a way to avoid N+1 queries when using a unidirectional @ManyToOne relationship in JPA + Hibernate?

I have the following entities:
DummyA:
@Entity
@Table(name = "dummy_a")
@Data
public class DummyA implements Serializable {

    private static final long serialVersionUID = 1L;

    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Integer id;

    @ManyToOne(fetch = FetchType.LAZY)
    @JoinColumn(name = "dummy_b_name", referencedColumnName = "name", updatable = false, insertable = false)
    private DummyB dummyB;
}
DummyB:
@Entity
@Table(name = "dummy_b")
@Data
public class DummyB implements Serializable {

    private static final long serialVersionUID = 1L;

    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    @Column(name = "entity_id")
    private Integer id;

    @Column(name = "name")
    private String name;
}
As it currently stands, any attempt to fetch DummyA objects results in additional queries to fetch DummyB objects as well. This causes unacceptable extra delay due to N+1 queries and also breaks Page objects returned by repository.findAll(specification, pageable), causing incorrect total page counts and element counts to be returned (in my case repository extends JpaRepository). Is there a way to do it such that DummyB objects are lazily loaded or, if that's not possible, so that they're all eagerly loaded in a single query?
Limitations:
I'm fairly new to JPA and Hibernate and have been learning how to use them. I've come across the following in a project I'm working on. I don't have the liberty to include new dependencies, and my project currently does not allow Hibernate bytecode enhancement through @LazyToOne(LazyToOneOption.NO_PROXY).
Things I've tried so far and did not work / did not work as expected:
@ManyToOne(optional = false, fetch = FetchType.LAZY)
Tried to see if accessing the dummyB field in dummyA is what caused the N+1 queries by removing dummyB's setter and getter. Still had N+1 queries.
Using @EntityGraph on findAll.
Tried implementing PersistentAttributeInterceptable and using PersistentAttributeInterceptor to solve the problem.
Links to resources I've looked up so far:
@ManyToOne(fetch = FetchType.LAZY) doesn't work on non-primary key referenced column
N+1 query problem with JPA and Hibernate
Hibernate lazy loading for reverse one to one workaround - how does this work?
PersistentAttributeInterceptable
JPA Entity Graph
Any help is greatly appreciated.
I've come back with an answer in case anyone is curious. It turns out that some entries had an invalid "magic" value in the column DummyA uses as the foreign key to DummyB, causing Hibernate to execute separate queries for those values in order to check whether the association is truly not found (see this doc and this answer from a related question). I mistook those queries for N+1. The project also had an interceptor extending Hibernate's EmptyInterceptor in order to modify the queries that produced pages, resulting in incorrect counts when those secondary queries were executed.

Hibernate update before insert in one to many

I am getting the constraint violation exception because of the order of operations performed by Hibernate. I have the following entities defined.
@Entity
public class A {

    @Id
    private Integer id;

    @OneToMany(mappedBy = "a", fetch = FetchType.LAZY, cascade = CascadeType.ALL, orphanRemoval = true)
    private List<B> bList;

    public void setBList(List<B> bList) {
        if (CollectionUtils.isNotEmpty(this.bList)) {
            this.bList.clear();
        }
        if (CollectionUtils.isNotEmpty(bList)) {
            this.bList.addAll(bList);
        }
    }
}
@Entity
@Table(uniqueConstraints = {@UniqueConstraint(columnNames = {"name", "a_id", "isDeleted"})})
public class B {

    @Id
    private Integer id;

    private String name;

    @ManyToOne(fetch = FetchType.LAZY, optional = false)
    @JoinColumn(name = "a_id")
    private A a;

    private boolean isDeleted;
}
When I set the new list of Bs (containing one item updated as deleted and a new item having the same values in the columns covered by the constraint) on entity A and save entity A, I get a constraint violation.
Hibernate performs the insert of the new item before updating the old item as deleted, leading to a constraint violation even though the data is correct from the application's point of view.
Am I doing something wrong here, or is there a configuration or fix for this?
Answer changed on 2021/05/07 due to comment from the OP pointing out it was missing the point
There are 2 things you should change for this to work.
You should not rely on Hibernate to guess the right order of operations for you. It relies on heuristics that might not fit your intent. In your case, call EntityManager.flush after your soft-delete of the old B and before persisting the new one (a sketch follows at the end of this answer).
Your unique constraint will cause problems anyway when you soft-delete your second B, which is identical regarding the unique columns. More on that below.
In general, enforcing this kind of constraint in the DB is a bad idea. If you try to update/insert an entity that violates it, you'll get an obscure PersistenceException and it will be hard to tell your users the exact cause. So you will have to programmatically check those constraints before insertion/update anyway. Hence, you'd better remove them and ensure uniqueness in your program, unless they're vital to data integrity. The same goes for non-nullable columns and other constraints that are pure business logic.
One last piece of advice from experience: for the soft-delete column, use a timestamp rather than a boolean. It takes the same effort to update and read your records, but it gives you valuable information about when a record was deleted.
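To illustrate the flush advice above, here is a minimal sketch of the explicit flush between the soft-delete and the insert; the service method and the getters/setters on A and B are assumptions, not taken from the question:
@Transactional
public void replaceB(A a, B oldB, B newB) {
    // mark the existing item as deleted first
    oldB.setDeleted(true);

    // force Hibernate to issue the UPDATE now, before any INSERT,
    // so the (name, a_id, isDeleted) unique constraint is never violated mid-flush
    entityManager.flush();

    // now add the replacement item carrying the same name
    newB.setA(a);
    a.getBList().add(newB);
    entityManager.persist(newB);
}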

What is the SQL/JDBC equivalent to JPA merge API? What does it exactly do?

I'm migrating JPA APIs like persist, save, merge, refresh, detach and remove to plain SQL using JDBC, and I am finding it hard to understand the concept of EntityManager.merge(someTask).
I tried a SQL update query for the merge API, but the explanation of merge is as follows: "Merge the state of the given entity into the current persistence context." With plain SQL and JDBC it's hard to understand how to do the same, and I need to handle optimistic locking as well.
The entity class which is used for JPA is as follows.
@Entity
@Table(name = "TASK", indexes = {@Index(name = "RIO", columnList = "priority", unique = false),
        @Index(name = "EXP", columnList = "expiry", unique = false),
        @Index(name = "STA", columnList = "taskStatus", unique = false),
        @Index(name = "CAT", columnList = "category", unique = false),
        @Index(name = "NEXTTRY", columnList = "nextTry", unique = false)})
public class TaskEntity {

    @Version
    private int version;

    @Basic
    @Column(length = Integer.MAX_VALUE, columnDefinition = "varchar(" + Integer.MAX_VALUE + ")")
    private String taskId;

    @Basic
    private String category;

    @ElementCollection(fetch = FetchType.EAGER)
    @MapKeyColumn(name = "KEY")
    @CollectionTable(name = "TASKPROPERTIES", foreignKey = @ForeignKey(
            name = "TASK_ID_FK",
            foreignKeyDefinition = "FOREIGN KEY (TASKENTITY_ID) REFERENCES TASK (ID) ON DELETE CASCADE"))
    @Column(length = Integer.MAX_VALUE, columnDefinition = "varchar(" + Integer.MAX_VALUE + ")")
    private Map<String, String> TaskProperties;

    @Basic
    @Column(length = Integer.MAX_VALUE, columnDefinition = "varchar(" + Integer.MAX_VALUE + ")")
    private String destination;

    @Enumerated(EnumType.STRING)
    private TaskStatus taskStatus;

    @Basic
    private String type;

    @Basic
    private Long expiry;

    @Basic
    private Long nextTry;

    @Basic
    private Integer retries;

    @Basic
    private Integer priority;

    // Setters and Getters
    // Equals and HashCode
}
Hence, what would be the equivalent of EntityManager.merge(task) in SQL/HSQLDB?
Merge in essence is the process of merging an existing record in a table with what has been provided in the statement (i.e. UPDATE if the record exists else INSERT). Also known as UPSERT.
Let us say you have a table tbl_person that has primary key person_ssn and two other columns, name and age. If you try to insert a row for a person_ssn that already exists, the DB will throw an error. Your requirement is to insert a record if the person_ssn doesn't exist, else update the name and age. In such a situation you would use merge.
There are a few ways to achieve this; two of them are:
Issue at least two DML statements. First do a SELECT on the person_ssn, and based on whether you found a record, subsequently issue either an UPDATE or an INSERT statement.
Use the MERGE SQL statement. This is the more modern and direct way, but not all databases support it. Read more information here. Further, just to get an idea, check here how the MERGE SQL statement works in SQL Server, which supports it.
As far as Java JPA is concerned, implementations abstract this concept. Depending on the DB's support for the MERGE SQL statement, either it is used, or two statements (a SELECT followed by either an UPDATE or an INSERT) are issued to accomplish the same.
HSQLDB offers MERGE SQL support, as per the comment provided.
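Since the question also asks about handling optimistic locking, here is a hedged JDBC sketch of the SELECT-then-UPDATE-or-INSERT approach using a version column; the column names (ID, VERSION, TASK_STATUS) are a simplified, illustrative subset of the TASK table, not its full layout:
// uses java.sql.Connection, PreparedStatement, ResultSet, SQLException
// returns the new version on success; throws if a concurrent update won
int mergeTask(Connection con, long id, String taskStatus, int expectedVersion) throws SQLException {
    try (PreparedStatement select = con.prepareStatement("SELECT VERSION FROM TASK WHERE ID = ?")) {
        select.setLong(1, id);
        try (ResultSet rs = select.executeQuery()) {
            if (!rs.next()) {
                // no row yet: behave like merge on a new entity and INSERT it
                try (PreparedStatement insert = con.prepareStatement(
                        "INSERT INTO TASK (ID, VERSION, TASK_STATUS) VALUES (?, 0, ?)")) {
                    insert.setLong(1, id);
                    insert.setString(2, taskStatus);
                    insert.executeUpdate();
                }
                return 0;
            }
        }
    }
    // row exists: UPDATE it, but only if the version still matches what the caller read earlier
    try (PreparedStatement update = con.prepareStatement(
            "UPDATE TASK SET TASK_STATUS = ?, VERSION = VERSION + 1 WHERE ID = ? AND VERSION = ?")) {
        update.setString(1, taskStatus);
        update.setLong(2, id);
        update.setInt(3, expectedVersion);
        if (update.executeUpdate() == 0) {
            throw new IllegalStateException("Optimistic lock failure: the row was modified concurrently");
        }
    }
    return expectedVersion + 1;
}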
There is more to merge semantically (in an ORM context) than just upsert. Essentially your entity model is a graph of objects holding references to each other. The objective of the merge API is to reconcile the expected future state of the object graph with the current state. Typically the ORM issues SQL inserts/updates/deletes to reflect the expected future state, not necessarily a SQL MERGE. For example, if the future entity state has a one-to-many relation set to null, the ORM will issue a query to nullify the foreign key in the child table to reflect that state. In a nutshell: when you pass an object (which is a graph of interconnected objects) to merge, the ORM first determines for each individual object whether it needs to be newly persisted or, if it contains the identifier of already-persisted data, loads it into the persistence context (if not already there) and applies all data changes and relationship updates. Finally, the dirty-checking mechanism of the ORM generates the equivalent SQL to reflect this final state.
EntityManager - merge(T entity): Merge the state of the given entity into the current persistence context.

How do I maintain consistency of cached ManyToOne collections with READ_WRITE CacheConcurrencyStrategy in Hibernate?

I'm running into a difference between NONSTRICT_READ_WRITE and READ_WRITE CacheConcurrencyStrategy when writing "denormalized" collections... the idea being that I have a join table modeled as an entity but it also contains read only links to the tables it joins to.
My entities, roughly:
@Entity
@org.hibernate.annotations.Entity(dynamicUpdate = true)
@Cache(usage = CacheConcurrencyStrategy.READ_WRITE)
class Actor {

    @Id
    Integer id;

    @Column
    String name;
}

@Entity
@org.hibernate.annotations.Entity(dynamicUpdate = true)
@Cache(usage = CacheConcurrencyStrategy.READ_WRITE)
class Movie {

    @Id
    Integer id;

    @Column
    String title;
}

@Entity
@org.hibernate.annotations.Entity(dynamicUpdate = true)
@Cache(usage = CacheConcurrencyStrategy.READ_WRITE)
class Credit {

    @Column
    String roleName;

    @ManyToOne(targetEntity = Movie.class, optional = true)
    @JoinColumn(name = "movie_id", insertable = false, updatable = false)
    @NotFound(action = NotFoundAction.IGNORE)
    Movie movie;

    @Column(name = "movie_id")
    Long movieId;

    @ManyToOne(targetEntity = Actor.class, optional = true)
    @JoinColumn(name = "actor_id", insertable = false, updatable = false)
    @NotFound(action = NotFoundAction.IGNORE)
    Actor actor;

    @Column(name = "actor_id")
    Long actorId;
}
Second level object cache is enabled (with ehcache).
My application writes Movies and Actors... and sometime later, it links them together by writing Credit. When Credit is written, I only fill in the roleName, movieId, and actorId fields, I do not provide the Movie and Actor objects.
Using NONSTRICT_READ_WRITE caching, I am then able to read back that Credit object and it will contain the referenced Movie and Actor objects.
Using READ_WRITE caching, reading back the Credit will return a Credit with empty Movie and Actor fields. If I clear the hibernate cache, reading back that Credit then contains the Movie and Actor objects as expected. This is also the behavior with TRANSACTIONAL caching (but of course not with NONE caching).
So it would seem that hibernate is inserting Credit into the 2nd level cache with null Actor and Movie fields when using READ_WRITE cache. Is there a way to prevent this from happening and always read from the database to get back these joined fields? I've tried annotating just the fields with CacheConcurrencyStrategy.NONE, but this does not work.
I think you have probably stumbled across a Hibernate bug because of your weird (or at least non-standard) mapping. There is no real reason to have two fields, one with the id and one with the entity.
You can turn an id into an entity reference using session.load, which just creates a proxy and doesn't load the data from the DB.
If you get rid of the movieId and actorId fields and remove insertable/updatable = false on the movie/actor fields, it should work the same irrespective of READ_WRITE or NONSTRICT_READ_WRITE:
Credit c = new Credit();
Movie m = session.load(Movie.class, movieId);
Actor a = session.load(Actor.class, actorId);
c.setMovie(m);
c.setActor(a);
session.save(c);
Hibernate doesn't save the complete objects in the second-level cache. It stores them in a flattened ("hydrated") form, more like DB rows, and reconstructs the objects from that. So the assertion that it stored the object with nulls in the cache and never updated it is incorrect. There is something else going on.
The main difference between READ_WRITE and NONSTRICT_READ_WRITE is that in READ_WRITE the cache entry is locked while updating, whereas in NONSTRICT_READ_WRITE it isn't. This only matters if you are updating the entity from multiple threads concurrently.
