Deep eager load in J2EE entities - java

I have a three-level model in my J2EE application using EJB: a Cart has many LineItems, each having many Books (a Book doesn't necessarily refer back to a LineItem; the relation is not bidirectional).
Cart(1) <--> (M) LineItem (1) --> (M) Book
I want it all eagerly loaded, i.e. when I fetch the Cart it should also load all its LineItems and all of their Books with a minimal number of SQL queries (I'm using a relational DB, e.g. MySQL). It can be done with 3 queries, one per type of object. Setting FetchType.EAGER causes all objects to be loaded, but it results in 2+n queries: one for the Cart (obviously), another for the LineItems, and then n more queries for the Books, where n is the number of line items.
I used to work with Ruby on Rails, where eager loading (using includes) does exactly what I need. Can I do the same in J2EE?
(Note: a join might be an option, but I want the entities to be populated automatically from the query, and I find the join approach less convenient.)
Sample of my code:
@Entity
public class Cart implements Serializable {
    @OneToMany(cascade = ALL, mappedBy = "cart", fetch = FetchType.EAGER)
    private List<LineItem> lineItems;
}
@Entity
public class LineItem implements Serializable {
    @ManyToOne(fetch = FetchType.LAZY)
    @JoinColumn(name = "cart_id", referencedColumnName = "id")
    private Cart cart;
    @ManyToOne(fetch = FetchType.EAGER)
    @JoinColumn(name = "book_id", referencedColumnName = "id")
    private Book book;
}
@Entity
public class Book implements Serializable {
    ...
}
Here is an example of the SQL queries where the Cart has 3 Line Items:
SELECT id, name FROM carts WHERE (id = 19)
SELECT id, quantity, book_id, cart_id FROM line_items WHERE (cart_id = 19)
SELECT id, description, name, price FROM books WHERE (id = 4)
SELECT id, description, name, price FROM books WHERE (id = 3)
SELECT id, description, name, price FROM books WHERE (id = 1)

Standard JPA provides join fetch, which marks a relation to be fetched eagerly in a particular query, as if it were marked eager via the annotation. In your case it is only necessary to join fetch lineItems, because book will be loaded eagerly together with each LineItem in the same query.
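For example (a minimal sketch, not part of the original answer; it assumes an injected EntityManager and a cartId variable):
Cart cart = entityManager.createQuery(
        "SELECT DISTINCT c FROM Cart c JOIN FETCH c.lineItems WHERE c.id = :id", Cart.class)
        .setParameter("id", cartId)
        .getSingleResult();
// DISTINCT avoids duplicate Cart results caused by the collection join;
// each LineItem's book is @ManyToOne(EAGER), so it is resolved together with the line items.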
With JPA 2.1, you may use an entity graph: you don't need to modify your query, you just attach a descriptor to it that defines which relations should be fetched eagerly.
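A rough sketch of the entity graph variant (same assumptions as above; the hint key is the standard JPA fetch graph hint):
EntityGraph<Cart> graph = entityManager.createEntityGraph(Cart.class);
graph.addSubgraph("lineItems").addAttributeNodes("book");
Cart cart = entityManager.find(Cart.class, cartId,
        Collections.singletonMap("javax.persistence.fetchgraph", graph));
// attributes reachable through the fetch graph are treated as EAGER, everything else as LAZY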
If you want to optimize down to the smallest possible number of queries, you might want to use batch fetching, which is available in some JPA providers. But beware: there is no standardized way to turn this on - I just linked to how to do it with EclipseLink.
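As an illustration only (not from the original answer, so verify against your provider's documentation): EclipseLink exposes a @BatchFetch annotation for this, and Hibernate offers a comparable @BatchSize annotation.
@Entity
public class LineItem implements Serializable {
    @ManyToOne(fetch = FetchType.EAGER)
    @JoinColumn(name = "book_id", referencedColumnName = "id")
    @BatchFetch(BatchFetchType.IN)   // EclipseLink: load all referenced books with one IN (...) query
    private Book book;
}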

Related

Incomplete @NodeEntity loaded from database

From one of the 'recommendations with Neo4j' tutorials I have downloaded and imported the product catalog data set. I'm creating a Spring Neo4j project using Object Graph Mapping.
For now I have created a @NodeEntity for both Product and Category. In order to quickly validate that everything is OK, I used a @PostConstruct method in a ProductService and a CategoryService to get a product and a category from the DB.
What I notice is that if I query the product, then get the product's category, and then all the products in that category, the set does not contain all products, but only the product I started the query with.
However, if I query the category itself directly it does contain all products.
(The graph model and the subset of data being queried were shown as images in the original post.)
The Product entity is defined as:
@NodeEntity
public class Product {
    @Id
    private String sku;
    private String name;
    @Relationship(type = "IN_CATEGORY")
    private Category category;
    @Convert(PriceConverter.class)
    private BigDecimal price;
}
The Category entity is defined as:
@NodeEntity
public class Category {
    @Id @GeneratedValue
    private Long id;
    private String name;
    @Relationship(type = "PARENT_CATEGORY")
    private Category parent;
    @Relationship(type = "IN_CATEGORY", direction = Relationship.INCOMING)
    private Set<Product> products = new HashSet<>();
}
For both I have created a Repository class.
If I query the CategoryRepository with categoryRepository.findByName("Desks") and print the result, this category has three products, as expected.
If I query the ProductRepository for the "Height Adjustable Standing Desk" and print its category information, the category is "Desks", but it only contains a single product (the Height Adjustable Standing Desk) and not the other two products.
private void showInfo(final Category category) {
    System.out.printf("Name: %s%n", category.getName());
    System.out.printf("Parent: %s%n", category.getParent());
    System.out.printf("Products: %s%n", category.getProducts());
}
I would have expected the set to be lazily evaluated into the full set of products. Do I need to force it to do so? When do additional nodes get loaded into a @NodeEntity, and how can you be sure the complete subgraph for a certain node is loaded?
Edit:
The documentation contains the following quote:
For graph to object mapping, the automatic transitive loading of related entities depends on the depth of the horizon specified on the call to Session.load(). The default depth of 1 implies that related node or relationship entities will be loaded and have their properties set, but none of their related entities will be populated.
This suggests that the session object should be used to load more data, but I don't know which session object.
Your analysis is correct. The default load depth in Spring Data Neo4j (and the underlying OGM) is 1. When you load the product, you will get its category, but not the other products, as those are 2 hops away from the original product in the graph. If you want to fetch all the related products, I can think of 2 possible approaches.
Having obtained the category from the product, query the category repository with its id. This will return the category together with its full list of products.
Set the query depth on the original product request to 2. The default Spring Data repository methods allow you to specify the query depth. This will then return everything related to that product up to 2 hops away from it in the graph.
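A rough sketch of both approaches (the depth-taking findById overload and the getId() getter are assumptions; check the repository methods of your Spring Data Neo4j / OGM version):
// 1) re-query the category through its id; its products set then comes back populated
Product product = productRepository.findById(sku).get();
Category fullCategory = categoryRepository.findById(product.getCategory().getId()).get();

// 2) or load the product with a depth of 2, so the category and its other products are fetched too
Product productWithSiblings = productRepository.findById(sku, 2).get();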
There is only one way to load the "complete graph" for an entity, and that is to set the query depth to -1. If your graph model is not particularly dense, this may work for you. However, it might cause performance problems in other circumstances. Also, this technique is not compatible with loading only those entities that exist in your domain model. In other words, if the graph contains nodes and relationships you don't want, setting the query depth to -1 will blindly include all of these in the query, only to discard them again before returning those that do match your domain. Again, depending on the match between your domain model and the underlying graph, this may or may not be a problem.
Please refer to https://neo4j.com/docs/ogm-manual/current/migration/#_performance_and_unlimited_load_depth for more details

Controlling lazy/eager loading of @Formula columns dynamically

We have a few entities with a bunch of properties annotated with Hibernate's @Formula annotation. The SQL snippets in the annotations mainly run scalar sub-queries (e.g. COUNT queries). As an example, we have a one-to-many relationship hierarchy that's four levels deep: A <- B <- C <- D (where <- marks a one-to-many association). Pretty often when fetching an entity of type A, we'd like to know the number of associated entities of type D. For this we use a @Formula-annotated property in A.
As we don't need these values every time, we've declared the @Formula properties as lazy-loaded (we've enabled Hibernate's bytecode enhancement to make this possible). But for some queries we'd like to load these properties eagerly. We often load hundreds of entities of type A in one query, and it'd be important performance-wise to control the eager/lazy loading of these properties dynamically. We already use JPA's entity graphs to control which properties get loaded eagerly for certain queries, but entity graphs don't seem to work here. Even if we list the @Formula properties in the entity graph, they're still loaded lazily.
Is it possible to control lazy/eager loading of @Formula columns dynamically on a per-query basis? We're currently restricted to the JPA Criteria Query API, and named queries are not a possibility here.
Update:
The properties in question are not associations to other entities, but just calculated values. This means that e.g. fetch profiles don't apply here, as they're only applicable to entity associations (or at least that's how I understood the Hibernate manual). Here's an example of one of our @Formula properties:
@Entity
public class A {
    @Basic(fetch = FetchType.LAZY)
    @Formula("(select count(*) from entity_D_table where ...)")
    private int associatedDCount;
    ...
}
You could use the Criteria API to make it return a DTO instead of an entity.
In your criteria query, use a projection to select only the columns you need.
ProjectionList properties = Projections.projectionList();
properties.add(Projections.property("id").as("id"));
properties.add(Projections.property("name").as("name"));
properties.add(Projections.property("lazyField").as("lazyField"));
criteria.setProjection(properties);
criteria.setResultTransformer(new AliasToBeanResultTransformer(MyEntityDTO.class));
That way the SELECT query will only contain the fields you ask for, regardless of whether the mapping is EAGER or LAZY.
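For completeness, the DTO referenced above could be a plain class whose property names match the projection aliases (a hypothetical sketch, since MyEntityDTO is not shown in the original answer):
public class MyEntityDTO {
    private Long id;
    private String name;
    private String lazyField;

    // AliasToBeanResultTransformer populates the DTO through these setters
    public void setId(Long id) { this.id = id; }
    public void setName(String name) { this.name = name; }
    public void setLazyField(String lazyField) { this.lazyField = lazyField; }

    public Long getId() { return id; }
    public String getName() { return name; }
    public String getLazyField() { return lazyField; }
}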
You can have a look at Hibernate's fetch profiles: https://docs.jboss.org/hibernate/orm/4.2/manual/en-US/html/ch20.html#performance-fetching-profiles.
For example, you can annotate an entity like this:
@Entity
@FetchProfile(name = "country_states", fetchOverrides = {
    @FetchProfile.FetchOverride(entity = Country.class, association = "states", mode = FetchMode.JOIN)
})
public class Country implements Serializable {...
and activate the JOIN mode when querying, like this:
session = getSession();
session.beginTransaction();
// enable fetch profile for EAGER fetching
session.enableFetchProfile("country_states");
As shown in http://www.concretepage.com/hibernate/fetchprofile_hibernate_annotation
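A possible continuation of that snippet (assumed, not taken from the linked page): with the profile enabled, querying countries join-fetches the states association.
List<Country> countries = session.createQuery("from Country").list();   // states are join-fetched
session.getTransaction().commit();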
It turns out it's not hard to pull this off without having to resort to bytecode instrumentation.
Create a "formula" entity mapped to the same table:
@Entity
@Table(name = "A")
public class ACounts {
    @Id
    private Long id;
    @Formula("(select count(*) from entity_D_table where ...)")
    private int dCount;

    public int getDCount() {
        return dCount;
    }
}
Then, in your parent entity A, use @ManyToOne to relate lazily to this "formula" entity:
@Entity
public class A {
    @Id
    private Long id;
    @ManyToOne(fetch = FetchType.LAZY)
    @JoinColumn(name = "id", nullable = false, insertable = false, updatable = false)
    private ACounts counts;

    public ACounts getCounts() {
        return counts;
    }
    ...
}
Now the count query will only be issued when the count is requested (i.e. it's lazy!):
A a = ...
// lazily invoke count query now:
a.getCounts().getDCount()
ref: https://stackoverflow.com/a/55581854/225217

Mapping two fields from one table onto another

I have these 2 tables
Users(
id PK,
name VARCHAR(30)
);
The other table is
Orders(
id PK,
orderBy FK Users.id,
orderTo FK Users.id
);
Now, what I want to do is create an Orders entity class which maps orderBy and orderTo to the user. But the thing I am most confused about is what cascading I should use.
class Orders {
    ///
    @ManyToOne(fetch = FetchType.LAZY)
    @JoinColumn(name = "orderBy")
    Users orderBy;
    ///
    @ManyToOne(fetch = FetchType.LAZY)
    @JoinColumn(name = "orderTo")
    Users orderTo;
}
I am also thinking of creating two fields in the Users entity, such that
class Account {
    ///
    @OneToMany(fetch = FetchType.LAZY)
    @JoinColumn(name = "orderTo")
    List<Orders> ordersReceived;
    ///
    @OneToMany(fetch = FetchType.LAZY)
    @JoinColumn(name = "orderBy")
    List<Orders> ordersPlaced;
}
But again, I am not sure what cascading I should use. My Users table is populated by some other processes, so Orders has nothing to do with that. When I am placing an order, I don't want that particular transaction to add or delete any user. HOWEVER, I might need to update a specific field of the User whenever I place an order.
I suggest avoiding cascade altogether (if possible). When you place an order, you should follow these steps:
1) load your user from your database
2) create your order ...
3) link up your order to your user (that is, order.setOrderBy(user))
4) persist your order with your EntityManager.
5) Change your user attribute.
From my experience, cascade should be used carefully. I have only used it to persist entities in one shot (CascadeType.PERSIST), for example persisting a new user together with other new entities like orders.
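A minimal sketch of those steps (the setter names and the lastOrderDate field are illustrative assumptions; em is a JPA EntityManager and the code runs inside an active transaction):
Users orderBy = em.find(Users.class, orderById);      // 1) load the users from the database
Users orderTo = em.find(Users.class, orderToId);
Orders order = new Orders();                          // 2) create the order
order.setOrderBy(orderBy);                            // 3) link the order to the users
order.setOrderTo(orderTo);
em.persist(order);                                    // 4) persist only the order, no cascade
orderBy.setLastOrderDate(new java.util.Date());       // 5) the user is managed, so this update
                                                      //    is flushed without any cascade setting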

Spring Data JPA - simulate a "create + join" query for an existing collection

Let's say I have a List of entities:
List<SomeEntity> myEntities = new ArrayList<>();
SomeEntity.java:
@Entity
@Table(name = "entity_table")
public class SomeEntity {
    @Id
    @GeneratedValue(strategy = GenerationType.AUTO)
    private long id;
    private int score;

    public SomeEntity() {}

    public SomeEntity(long id, int score) {
        this.id = id;
        this.score = score;
    }
}
MyEntityRepository.java:
@Repository
public interface MyEntityRepository extends JpaRepository<SomeEntity, Long> {
    List<SomeEntity> findAllByScoreGreaterThan(int score);
}
So when I run:
myEntityRepository.findAllByScoreGreaterThan(10);
Then Hibernate will load all of the records in the table into memory for me.
There are millions of records, so I don't want that. Then, in order to intersect, I need to compare each record in the result set to my List.
In native MySQL, what I would have done in this situation is:
create a temporary table and insert into it the entities' ids from the List.
join this temporary table with the "entity_table", use the score filter and then only pull the entities that are relevant to me (the ones that were in the list in the first place).
This way I gain a big performance increase, avoid any OutOfMemoryErrors and have the machine of the database do most of the work.
Is there a way to achieve such an outcome with Spring Data JPA's query methods (with hibernate as the JPA provider)? I couldn't find in the documentation or in SO any such use case.
I understand you have a set of entity_table identifiers and you want to find each entity_table whose identifier is in that subset and whose score is greater than a given score.
So the obvious question is: how did you arrive at the initial subset of entity_tables, and couldn't you just add the criteria of that query to the query that also checks for "score is greater than x"?
But if we ignore that, I think there are two possible solutions. If the list of some_entity identifiers is small (what exactly is "small" depends on your database), you could just use an IN clause and define your method as:
List<SomeEntity> findByScoreGreaterThanAndIdIn(int score, Set<Long> ids);
If the number of identifiers is too large to fit in an IN clause (or you're worried about the performance of using an IN clause) and you need to use a temporary table, the recipe would be:
Create an entity that maps to your temporary table. Create a Spring Data JPA repository for it:
@Entity
class TempEntity {
    @Id
    private Long entityId;
    protected TempEntity() { }                                 // required by JPA
    TempEntity(Long entityId) { this.entityId = entityId; }    // used in the save loop below
}
interface TempEntityRepository extends JpaRepository<TempEntity, Long> { }
Use its save method to save all the entity identifiers into the temporary table. As long as you enable insert batching this should perform all right -- how to enable differs per database and JPA provider, but for Hibernate at the very least set the hibernate.jdbc.batch_size Hibernate property to a sufficiently large value. Also flush() and clear() your entityManager regularly or all your temp table entities will accumulate in the persistence context and you'll still run out of memory. Something along the lines of:
int count = 0;
for (SomeEntity someEntity : myEntities) {
    tempEntityRepository.save(new TempEntity(someEntity.getId()));
    if (++count % 1000 == 0) {     // flush and clear every 1000 inserts
        entityManager.flush();
        entityManager.clear();
    }
}
Add a find method to your MyEntityRepository that runs a native query doing the select on entity_table and joining to the temp table:
#Query("SELECT id, score FROM entity_table t INNER JOIN temp_table tt ON t.id = tt.id WHERE t.score > ?1", nativeQuery = true)
List<SomeEntity> findByScoreGreaterThan(int score);
Make sure you run both methods in the same transaction, so create a method in a @Service class that you annotate with @Transactional(propagation = Propagation.REQUIRES_NEW) and that calls both repository methods in succession. Otherwise your temp table's contents will be gone by the time the SELECT query runs and you'll get zero results.
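A sketch of such a service (class and method names are made up; it assumes Spring's @Transactional and the repositories defined above):
@Service
public class EntityFilterService {

    @PersistenceContext
    private EntityManager entityManager;
    private final TempEntityRepository tempEntityRepository;
    private final MyEntityRepository myEntityRepository;

    public EntityFilterService(TempEntityRepository tempEntityRepository,
                               MyEntityRepository myEntityRepository) {
        this.tempEntityRepository = tempEntityRepository;
        this.myEntityRepository = myEntityRepository;
    }

    @Transactional(propagation = Propagation.REQUIRES_NEW)
    public List<SomeEntity> filterByScore(List<SomeEntity> myEntities, int score) {
        // 1) batch-insert the ids into the temp table (the loop shown earlier)
        // 2) run the join query while the temp table rows are still visible
        return myEntityRepository.findByScoreGreaterThan(score);
    }
}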
You might be able to avoid native queries by having your temp table entity have a @ManyToOne to SomeEntity since then you can join in JPQL; I'm just not sure if you'll be able to avoid actually loading the SomeEntitys to insert them in that case (or if creating a new SomeEntity with just an ID would work). But since you say you already have a list of SomeEntity that's perhaps not a problem.
I need something similar myself, so will amend my answer as I get a working version of this.
You can:
1) Make a paginated native query via JPA (remember to add an ORDER BY clause to it) and process a fixed number of records at a time
2) Use a StatelessSession (see the documentation)
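A sketch of option 1 using Spring Data pagination (this assumes you add a Page-returning overload such as Page<SomeEntity> findAllByScoreGreaterThan(int score, Pageable pageable) to the repository; a derived query is used here instead of a native one for brevity):
Pageable page = PageRequest.of(0, 1000, Sort.by("id"));   // fixed chunk size, ordered by id
Page<SomeEntity> chunk;
do {
    chunk = myEntityRepository.findAllByScoreGreaterThan(10, page);
    chunk.forEach(entity -> {
        // compare the entity against the in-memory list here
    });
    page = chunk.nextPageable();
} while (chunk.hasNext());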

Hibernate: many-to-many linker table, best OO design

If I have 3 tables, with the expected normal columns: Customer, CustomerProductLinker and Product.
And I want in my Java code to do this :
Customer customer = myService.getCustomer(id); //calls hibernate session
List<Product> customerProducts = customer.getProducts();
What would my 3 entities look like, and what would the respective collections within each be, specifically the getProducts() method? Or is it better to use HQL and a named query for this?
I am creating the database tables from the Java code (using the create option in the Hibernate config), so the table design can be altered if preferred.
Try a @ManyToMany relationship using @JoinTable. A customer has a set (or a list) of products. A product has a set (or a list) of customers.
@Entity
public class Customer {
    @ManyToMany(cascade = CascadeType.ALL)
    @JoinTable(name = "customer_product",
        joinColumns = {@JoinColumn(name = "customer_id")},
        inverseJoinColumns = {@JoinColumn(name = "product_id")})
    private Set<Product> products = new HashSet<Product>();
    ...
@Entity
public class Product {
    @ManyToMany
    @JoinTable(name = "customer_product",
        joinColumns = {@JoinColumn(name = "product_id")},
        inverseJoinColumns = {@JoinColumn(name = "customer_id")})
    private Set<Customer> customers = new HashSet<Customer>();
    ...
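A small usage sketch matching the question (session handling and transaction demarcation are assumed): getProducts() simply exposes the mapped collection, which Hibernate populates from the customer_product join table.
Customer customer = session.get(Customer.class, id);
Set<Product> customerProducts = customer.getProducts();   // loaded via customer_product

// linking a product to a customer is just a collection update inside a transaction:
customer.getProducts().add(someProduct);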
I would set up the entities like wannik suggested. Try to keep it simple. If you start using named queries you are doing more work and you are just covering a specific case.
