Strange hibernate cache behaviour - java

I use Ehcache and Hibernate 3.6.7.Final. This is a pseudo-code sample that reveals a problem with caching.
@Entity
@Cache(usage = CacheConcurrencyStrategy.READ_WRITE)
public class A {
    @Id
    long id;

    @OneToMany(mappedBy = "aId", targetEntity = B.class, fetch = FetchType.LAZY)
    @Fetch(value = FetchMode.JOIN)
    @Cache(usage = CacheConcurrencyStrategy.READ_WRITE)
    protected Set<B> fieldB;
}

@Entity
@Cache(usage = CacheConcurrencyStrategy.READ_WRITE)
public class B {
    @Id
    long id;
    long aId; // referenced by mappedBy = "aId" on A.fieldB
}
1) The first time I load entity A, Hibernate does not read fieldB. This is expected, because FetchType.LAZY is set.
2) The second time I load entity A, I see SQL queries retrieving entity A joined with entity B.
3) If I remove @Fetch(value = FetchMode.JOIN), point 2 no longer happens.
So the question is: is this a bug or a feature? And how can I avoid such hidden behavior?

You have two conflicting fetch settings: you definitely do not want to specify both the fetch type on the association and the @Fetch annotation, as that produces unpredictable behavior.
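As a minimal sketch of one way to resolve the conflict (an assumption on my part: keep the lazy setting from the original mapping and drop @Fetch(FetchMode.JOIN) so only one strategy remains):
// keep a single strategy: lazy select fetching; the collection is loaded on
// access, or eagerly via an explicit join fetch query when needed
@OneToMany(mappedBy = "aId", targetEntity = B.class, fetch = FetchType.LAZY)
@Cache(usage = CacheConcurrencyStrategy.READ_WRITE)
protected Set<B> fieldB;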

Related

Hibernate - join ElementCollection with the rest of an entity fetch query

I have major performance issues when I try to map an entity into a response.
This is the entity:
@Entity
@Table(name = "MyEntity")
public class MyEntity extends BaseEntity {

    @Column(name = "someOtherId", nullable = false)
    private String someOtherId;

    @ElementCollection
    @CollectionTable(name = "Phones", joinColumns = @JoinColumn(name = "myEntityId"))
    @Column(name = "phone")
    private List<String> phones; // <------- we care about this

    @ElementCollection
    @CollectionTable(name = "Websites", joinColumns = @JoinColumn(name = "myEntityId"))
    @Column(name = "websites")
    private List<String> websites; // <------- we care about this

    @Fetch(FetchMode.SUBSELECT)
    @OneToMany(cascade = CascadeType.ALL, fetch = FetchType.LAZY, mappedBy = "myEntity")
    private List<ContactEntity> bbb;

    @Fetch(FetchMode.SUBSELECT)
    @OneToMany(cascade = CascadeType.ALL, fetch = FetchType.LAZY, mappedBy = "myEntity")
    private List<AddressEntity> ccc;
}
This is how I use the DAL to fetch it:
List<MyEntity> findByTenantIdAndIdIn(String someOtherId, Set<String> MyEntityIds);
Now when I iterate over the List<MyEntity> to map it and call myEntity.getPhones(), I see that a DB call is made, which is what causes the slowdown: roughly 70 seconds for 1000 entities.
So what can I do to force it to join in the first query it did when I called findByTenantIdAndIdIn?
Notes:
Phones is a simple table with columns: [myEntityId, phone]
The same problem happens with Websites
This has nothing to do with it being an @ElementCollection. As you figured out, you can use subselect fetching, or you could also use a batch size for select fetching (the default strategy; a batch-size sketch follows the HQL example below). Another possibility is to use a fetch join in the query, but be careful when fetch joining multiple collections, as that might create a cartesian product, which leads to a performance problem caused by too many rows being transferred. A fetch join example HQL query looks like this:
SELECT e FROM MyEntity e LEFT JOIN FETCH e.phones LEFT JOIN FETCH e.websites
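For the batch-size alternative mentioned above, a minimal sketch (the use of org.hibernate.annotations.BatchSize and the size value of 20 are illustrative assumptions, not part of the original code):
// With select fetching (the default), @BatchSize lets Hibernate initialize the
// phones collections of up to 20 pending MyEntity instances with one IN query
// instead of issuing one query per entity.
@ElementCollection
@CollectionTable(name = "Phones", joinColumns = @JoinColumn(name = "myEntityId"))
@Column(name = "phone")
@BatchSize(size = 20) // illustrative size
private List<String> phones;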
I solved it right after I posted this.
I annotated phones and websites with @Fetch(FetchMode.SUBSELECT), which loads each collection for all fetched entities with a single additional subselect query (a sketch follows below).
Another way to solve this is simply to not use @ElementCollection, because it has poor performance; use an entity instead, as recommended in the video here: https://thorben-janssen.com/hibernate-tips-query-elementcollection/
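A hedged sketch of that subselect fix on one of the element collections (this mirrors the asker's description rather than their exact code; the same annotation would go on websites):
// FetchMode.SUBSELECT: when one phones collection is accessed, Hibernate loads
// the phones of all MyEntity instances returned by the original query with a
// single extra query that reuses that query as a subselect.
@ElementCollection
@CollectionTable(name = "Phones", joinColumns = @JoinColumn(name = "myEntityId"))
@Column(name = "phone")
@Fetch(FetchMode.SUBSELECT)
private List<String> phones;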

Too many queries problem with JPA + Hibernate even when using @Fetch(FetchMode.JOIN)

I am developing a REST application using Spring Boot and I am trying to optimize the performance of the queries. I am currently using findAll from the repositories, which is causing performance issues. The code is given below:
Person Entity
@Entity
@Table(name = "cd_person")
@Data
@NoArgsConstructor
public class Person {
    ....
    @OneToOne(fetch = FetchType.EAGER, cascade = CascadeType.ALL)
    @JoinColumn(name = "password_id")
    @Fetch(FetchMode.JOIN)
    private Password password;
    ....
    @ManyToMany(fetch = FetchType.EAGER, cascade = {CascadeType.MERGE, CascadeType.PERSIST, CascadeType.REFRESH})
    @JoinTable(name = "cd_person_role",
            joinColumns = @JoinColumn(name = "person_id", referencedColumnName = "id"),
            inverseJoinColumns = @JoinColumn(name = "role_id", referencedColumnName = "id"))
    @Fetch(FetchMode.JOIN)
    private Set<Role> roles = new HashSet<>();
}
Password Entity
@Entity
@Table(name = "cd_password")
@Data
@NoArgsConstructor
public class Password {
    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    @Column(name = "id", updatable = false, nullable = false)
    private Long id;

    @Column(name = "password_hash", nullable = false)
    private String passwordHash;
    .......
}
Role Entity
@Entity
@Table(name = "cd_role")
@Data
@NoArgsConstructor
public class Role {
    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;

    @Column(name = "role_type")
    @Enumerated(EnumType.STRING)
    private RoleType roleType;
    ....
}
Person Repository
public interface PersonRepository extends CrudRepository<Person, Long> {
    Optional<Person> findByEmail(String email);
}
When I do a personRepository.findAll(), separate select queries are fired for each row in the person table to fetch the password and roles when I access the person. I know I can use the @Query annotation with JOIN FETCH in the repository to force it to generate a single query, but I was wondering if there is any other way to do so. I am looking for something that can be done at the entity level to reduce the number of queries.
Using Spring Boot 2.1.5.RELEASE and related dependencies.
PS. @Data and @NoArgsConstructor are Lombok annotations.
The most minimal code change is to use the ad-hoc EntityGraph feature from Spring Data. Just override PersonRepository's findAll() and use @EntityGraph to configure the graph. All entities in this graph will be fetched together.
public interface PersonRepository extends CrudRepository<Person, Long> {

    @EntityGraph(attributePaths = { "password", "roles" })
    public List<Person> findAll();
}
Behind the scenes it works like JOIN FETCH: only a single SQL statement with LEFT JOINs will be generated.
I would leave the entity as is and override the findAll method in the repository with an @Query annotation.
This way, the refactoring is minimal (only one repository change instead of an entity change).
The unsatisfying answer to your question is: no, there's no way to annotate/configure the entities so that the fetch mode applies to a query as well.
As you correctly found yourself, you can manipulate the query itself. Alternatives to this are using Hibernate's fetch profiles or leveraging JPA entity graphs - but all of them require programmatic intervention at the query/session level as well.
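For the fetch-profile alternative mentioned above, a hedged sketch (the profile name "person-with-credentials" is an illustrative assumption; fetch profiles only support JOIN-style overrides and have to be enabled on the Hibernate session, and the example below uses session.get for loading):
// Declared on the entity using org.hibernate.annotations.FetchProfile / FetchMode.
@Entity
@Table(name = "cd_person")
@FetchProfile(name = "person-with-credentials", fetchOverrides = {
        @FetchProfile.FetchOverride(entity = Person.class, association = "password", mode = FetchMode.JOIN),
        @FetchProfile.FetchOverride(entity = Person.class, association = "roles", mode = FetchMode.JOIN)
})
public class Person {
    ....
}

// Enabling the profile for the current session before loading:
Session session = entityManager.unwrap(Session.class);
session.enableFetchProfile("person-with-credentials");
Person person = session.get(Person.class, personId); // personId is an illustrative variable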
You should place @BatchSize on top of the Password class:
@Entity
@Table(name = "cd_password")
@Data
@NoArgsConstructor
@BatchSize(size = 50)
public class Password {
    ...
}
Here are the queries with @BatchSize:
Hibernate:
select
person0_.id as id1_1_,
person0_.password_id as password2_1_
from
cd_person person0_
Hibernate:
select
password0_.id as id1_0_0_,
password0_.password_hash as password2_0_0_
from
cd_password password0_
where
password0_.id in (
?, ?, ?, ?, ?
)
Can't you use lazy fetching and remove the @Fetch? Using @NamedQuery on top of your entity and a Hibernate session to call session.createNamedQuery in a custom service would do it (a sketch follows below).
If you can afford not to use the default personRepository.findAll() but this custom service instead, you would run an optimized query. I realize this does not exactly answer your question, but my team and I faced the exact same issue and this is how we solved it.
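A hedged sketch of that approach (the query name "Person.findAllWithRoles" is an illustrative assumption; DISTINCT and the LEFT joins are chosen to avoid duplicate rows and to keep persons without roles):
// Named query declared on the entity.
@Entity
@Table(name = "cd_person")
@NamedQuery(name = "Person.findAllWithRoles",
        query = "SELECT DISTINCT p FROM Person p LEFT JOIN FETCH p.password LEFT JOIN FETCH p.roles")
public class Person {
    ....
}

// In a custom service, run the named query through the Hibernate session:
Session session = entityManager.unwrap(Session.class);
List<Person> persons = session
        .createNamedQuery("Person.findAllWithRoles", Person.class)
        .getResultList();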
My suggestions would be:
Try to refactor and use lazy fetching.
I might not understand this part well, but why do you need personRepository.findAll() exactly? I think you would merely need something like personRepository.findById(), so you could fetch the roles and other data easily. Selecting all persons seems like a huge overhead here.
You might need the extended functions of JpaRepository later, so it might be worth changing it now instead of working a little bit more later.
This should work:
public interface PersonRepository extends CrudRepository<Person, Long> {

    @Override
    @Query("SELECT p FROM Person p JOIN FETCH p.roles JOIN FETCH p.password")
    Iterable<Person> findAll();
}

Why does the JPA merge operation cause multiple selects before update?

We are using Spring Data repositories with Hibernate 5.x
We have an entity graph with a deep hierarchy.
The mapping looks like this:
@Entity
public class FooBar {
    @OneToMany(mappedBy = "fooBar", cascade = CascadeType.ALL, orphanRemoval = true)
    private Set<Foo> chassis = new HashSet<>(0);
    ...
}

@Entity
public class Foo {
    @ManyToOne(fetch = FetchType.LAZY)
    @JoinColumn(name = "foobar_id")
    private FooBar fooBar;

    @OneToMany(mappedBy = "foo", cascade = CascadeType.ALL, orphanRemoval = true)
    private Set<Bar> chassis = new HashSet<>(0);
    ...
}

@Entity
public class Bar {
    @ManyToOne(fetch = FetchType.LAZY)
    @JoinColumn(name = "foo_id")
    private Foo foo;
    ...
}
As you can see the FooBar entity has a set of Foo entities. Each Foo entity contains more Bar entities and so on.
We use the fetch graph feature to load the FooBar entity with the relations we need at runtime, to avoid the N+1 query issue when fetching lazy associations.
After the service call to load the entity graph the transaction has ended and the entity is detached.
When save is called on the FooBar entity at a later time, this causes multiple select statements, each fetching one of the child entities.
I know that this comes from the EntityManager merge() call, which fetches the object graph from the DB before copying state changes from the detached objects.
I have two questions:
Why is Hibernate not able to combine these statements into one big select, like what happens when using the fetch graph?
When I remove all cascade options from the relations, it still causes multiple selects, but only attributes of the top-level FooBar entity are updated. Why is Hibernate still fetching all loaded child entities during merge, even with no cascade merge?
Thanks
You can use session.update instead of merge to overcome this issue.
Session session = entityManager.unwrap(Session.class);
// update() reattaches each detached instance and schedules an UPDATE at flush,
// without the extra SELECTs that merge() issues first
for (FooBar fooBar : fooBars) {
    session.update(fooBar);
}
I have a similar issue to your case, and the reason is the CascadeType.ALL setting on the @OneToMany association. Updating and merging the parent entity causes a lot of selects on the child association.
@Entity
public class FooBar {
    @OneToMany(mappedBy = "fooBar", cascade = CascadeType.ALL, orphanRemoval = true)
    private Set<Foo> chassis = new HashSet<>(0);
    ...
}
I fixed my case by reducing the scope of cascading; PERSIST and REMOVE alone are sufficient:
@OneToMany(mappedBy = "fooBar", cascade = {CascadeType.PERSIST, CascadeType.REMOVE}, orphanRemoval = true)
private Set<Foo> chassis = new HashSet<>(0);

Cache inconsistency - Entity not always persisted in cached Collection

I'm having an issue where a Validation instance is added to a Collection on a Step instance.
Declaration is as follows:
Step class:
@Entity
@Table
@Cacheable
@Cache(usage = CacheConcurrencyStrategy.READ_WRITE)
public class Step extends AbstractEntity implements ValidatableStep {

    @OneToMany(fetch = FetchType.LAZY, cascade = CascadeType.ALL, orphanRemoval = true)
    @JoinColumn(name = "step_id", nullable = false)
    @Cache(usage = CacheConcurrencyStrategy.READ_WRITE)
    private Set<Validation> validations = new HashSet<>();

    @Override
    public void addValidation(Validation validation) {
        // do some stuff
        ...
        // add validation instance to collection
        getValidations().add(validation);
    }
}
Validation class:
@Entity
@Table
@Cacheable
@Cache(usage = CacheConcurrencyStrategy.READ_WRITE)
@NoArgsConstructor(access = AccessLevel.PROTECTED)
public class Validation extends AbstractEntity {
    // some properties
}
Both classes are Cacheable with a READ_WRITE strategy applied. The unidirectional collection of Validations is also cached with the same strategy.
One would expect that when a read-write transaction invoking addValidation(new Validation("userName")) commits, the new Validation would be visible in a subsequent read-only transaction. The weird thing is that sometimes it works and sometimes it doesn't...
The first transaction always succeeds; we see the new Validation being persisted in the database and Step's version property (for optimistic locking purposes) getting incremented. But sometimes the second read transaction contains a Step instance with an empty Validation collection...
Our Hibernate caching config is as follows:
hibernate.cache.use_second_level_cache = true
hibernate.cache.use_query_cache = true
hibernate.cache.region.factory_class = org.hibernate.cache.ehcache.SingletonEhCacheRegionFactory
hibernate.cache.provider_configuration_file_resource_path = classpath:ehcache.xml
net.sf.ehcache.hibernate.cache_lock_timeout = 10000
Any idea what's causing this weird (and random) behavior?
The Hibernate Collection Cache always invalidates existing entries and both the Entity and the Collection caches are sharing the same AbstractReadWriteEhcacheAccessStrategy, so a soft-lock is acquired when updating data.
Because you are using a unidirectional one-to-many association, you will end up with a Validation table and a Step_validation link table too. Whenever you add/remove a Validation you have to hit two tables and that's less efficient.
I suggest adding the @ManyToOne side in the Validation entity (sketched after the collection mapping below) and turning the @OneToMany side into a mappedBy collection:
#OneToMany(fetch = FetchType.LAZY, cascade = CascadeType.ALL, orphanRemoval = true, mappedBy = "step")
@Cache(usage = CacheConcurrencyStrategy.READ_WRITE)
private Set<Validation> validations = new HashSet<>();
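A hedged sketch of the suggested owning side on the Validation entity (the field name "step" is an illustrative assumption and must match the mappedBy value above; the foreign key column name mirrors the original @JoinColumn):
@Entity
@Table
@Cacheable
@Cache(usage = CacheConcurrencyStrategy.READ_WRITE)
public class Validation extends AbstractEntity {

    // owning side of the now-bidirectional association; the step_id foreign key
    // lives on the Validation table
    @ManyToOne(fetch = FetchType.LAZY)
    @JoinColumn(name = "step_id", nullable = false)
    private Step step;

    // some properties
}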

JPA, Hibernate + EhCache + JoinColumn on empty field

I am using Ehcache together with Hibernate, and I am kind of stuck on the following:
If my entity has fields like:
@Cache(usage = CacheConcurrencyStrategy.READ_WRITE)
@OneToOne(cascade = CascadeType.ALL)
@PrimaryKeyJoinColumn
VkAuth vka;

@Cache(usage = CacheConcurrencyStrategy.READ_WRITE)
@OneToOne(cascade = CascadeType.ALL)
@PrimaryKeyJoinColumn
OkAuth oka;
and vka is present but oka is null, then vka gets cached but a query for oka is sent every time.
I understand that oka is simply not being cached because there is nothing to cache, but what could be the possible workaround for this scenario?
