Designing hierarchy of parent child relationships with spring data jpa - java

I am trying to build a to-do log keeper.
I am using java spring-boot with data-jpa which is built on hibernate.
I want a user to have several projects to work on. Every project then has several tasks associated with it, and the user tracks how much time was spent per task by completing short atomic units of work (log entries).
So far I have ended up with the most naive implementation of this system: several levels of one-to-many relationships, user->projects->tasks->entries. The current db schema mirrors this hierarchy.
Code for the entity classes (getters, setters, constructors and some annotations are omitted for brevity):
@MappedSuperclass
public abstract class AbstractEntity {
    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Integer id;
}
@Entity
public class User extends AbstractEntity {
    @Column
    private String name;
    @OneToMany(mappedBy = "user", fetch = FetchType.LAZY)
    private List<Project> projects;
}
@Entity
public class Project extends AbstractEntity {
    @Column
    private String name;
    @OneToMany(mappedBy = "project", fetch = FetchType.LAZY)
    private List<Task> tasks;
    @ManyToOne
    @JoinColumn(name = "user_id")
    private User user;
}
@Entity
public class Task extends AbstractEntity {
    @Column
    private String name;
    @OneToMany(mappedBy = "task", fetch = FetchType.LAZY)
    private List<Entry> entries;
    @ManyToOne
    @JoinColumn(name = "project_id")
    private Project project;
}
@Entity
public class Entry extends AbstractEntity {
    @Column
    private Integer duration;
    @Column
    private LocalDateTime finish;
    @ManyToOne
    @JoinColumn(name = "task_id")
    private Task task;
}
I want to let a user view all the log entries in a user-specified time frame. I added a JPA repository like this:
public interface EntryRepository extends JpaRepository<Entry, Integer> {
    @Query("SELECT e FROM Entry e WHERE (e.task.project.user.id=:user_id) AND " +
           "(e.finish BETWEEN :from AND :to)")
    List<Entry> getAllForUserInDateRange(@Param("from") LocalDateTime from,
                                         @Param("to") LocalDateTime to,
                                         @Param("user_id") int userId);
}
1) Is it correct to say that this query is inefficient? I suspect that a fetch like this is inefficient because the query cannot take advantage of indexes. Since there is no user_id foreign key in the Entry table, every row has to be looked up and the chain entry->task->project->user followed. I end up with linear complexity instead of logarithmic.
2) What is a better way to solve the problem? Is it OK to store a foreign key to the user in the Entry table? If I want to fetch entries from the database for a particular project or task, I will have to add foreign keys for those relationships as well. Is that OK?

You should check the real SQL that is being executed. Set the org.hibernate.SQL log level to DEBUG and you'll see the statements.
I think for your query you will actually get three inner joins between four tables. You say the query cannot take advantage of indexes. It absolutely can. Create the following indexes:
USER (ID)
PROJECT (USER_ID, ID)
TASK (PROJECT_ID, ID)
ENTRY (TASK_ID, ID)
See Concatenated Indexes from Use The Index, Luke.
With these indexes your joins across four tables will likely use indexes. I won't put my hand in the fire for this, but it should work. Check the query plan.
You are right that the chain ENTRY->TASK->PROJECT->USER will be followed, but it should be quite fast with indexes.
Your database schema is pretty normalized, which results in three joins across four tables. You could denormalize this schema by bringing, say, user_id into ENTRY. This may improve query performance, but honestly I doubt it will gain much. You may want to run a real-world benchmark before actually switching to this solution.

Related

Orchestrating Spring Boot CrudRepositories with foreign key relationships

I am writing a Spring Boot application that will use Hibernate/JPA to persist between the app and a MySQL DB.
Here we have the following JPA entities:
@MappedSuperclass
public abstract class BaseEntity {
    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    @JsonIgnore
    private Long id;
    @Type(type="uuid-binary")
    private UUID refId;
}
@Entity(name = "contacts")
@AttributeOverrides({
    @AttributeOverride(name = "id", column=@Column(name="contact_id")),
    @AttributeOverride(name = "refId", column=@Column(name="contact_ref_id"))
})
public class Contact extends BaseEntity {
    @Column(name = "contact_given_name")
    private String givenName;
    @Column(name = "contact_surname")
    private String surname;
    @Column(name = "contact_phone_number")
    private String phone;
}
@Entity(name = "assets")
@AttributeOverrides({
    @AttributeOverride(name = "id", column=@Column(name="asset_id")),
    @AttributeOverride(name = "refId", column=@Column(name="asset_ref_id"))
})
public class Asset extends BaseEntity {
    @Column(name = "asset_location")
    private String location;
}
@Entity(name = "accounts")
@AttributeOverrides({
    @AttributeOverride(name = "id", column=@Column(name="account_id")),
    @AttributeOverride(name = "refId", column=@Column(name="account_ref_id"))
})
public class Account extends BaseEntity {
    @OneToOne(fetch = FetchType.EAGER)
    @JoinColumn(name = "contact_id", referencedColumnName = "contact_id")
    private Contact contact;
    @OneToOne(fetch = FetchType.EAGER)
    @JoinColumn(name = "asset_id", referencedColumnName = "asset_id")
    private Asset asset;
    @Column(name = "account_code")
    private String code;
}
And the repository and @RestController, where an Account instance will be POSTed (to be created):
public interface AccountRepository extends CrudRepository<Account, Long> {
    // JPQL refers to the entity attribute (code), not the column name (account_code)
    @Query("FROM accounts a WHERE a.code = :accountCode")
    public Account findByCode(@Param("accountCode") String accountCode);
}
@RestController
@RequestMapping(value = "/accounts")
public class AccountController {
    @Autowired
    private AccountRepository accountRepository;
    @RequestMapping(method = RequestMethod.POST)
    public void createNewAccount(@RequestBody Account account) {
        // Do some stuff maybe
        accountRepository.save(account);
    }
}
So the idea here is that "Account JSON" will be sent to this controller, where it will be deserialized into an Account instance and (somehow) persisted to the backing MySQL. My concern is this: Account is a composition (via foreign keys) of several other entities. Do I need to:
Either create CrudRepository impls for each of these entities, and then orchestrate save(...) calls to those repositories such that the "inner entities" get saved first, before the "outer" Account entity; or
Just save the Account entity (via AccountRepository.save(account)) and let Hibernate/JPA automagically take care of creating all the inner/dependent entities for me?
What would the code/solution look like in either scenario? And how do we specify values for BaseEntity#id when it is an auto-incrementing PK in the DB?
That depends on your design and specific use cases, and what level of flexibility you want to keep. Both ways are used in practice.
In most CRUD situations you would rather save the account and let Hibernate save the entire graph (the second option). There is usually another case, which you didn't mention: updating the graph. You would probably handle it the same way, and Spring's repository save method already does this: if the entity is new (transient), it persists it; otherwise it merges it.
All you need to do is to tell Hibernate to cascade the desired entity lifecycle operations from the Account to the related entities:
@Entity
...
public class Account extends ... {
    @OneToOne(..., cascade = {CascadeType.PERSIST, CascadeType.MERGE})
    ...
    private Contact contact;
    @OneToOne(..., cascade = {CascadeType.PERSIST, CascadeType.MERGE})
    ...
    private Asset asset;
    ...
}
However, you pay the penalty of reloading the object graph from the db in case of merge operation, but if you want everything done automatically, Hibernate has no other way to check what has actually changed, other than comparing it with the current state in the db.
Cascade operations are applied always, so if you want more flexibility, you obviously have to take care of things manually. In that case, you would omit cascade options (which is your current code), and save and update the parts of the object graph manually in the order that does not break any integrity constraints.
While involving some boilerplate code, manual approach gives you flexibility in more complex or performance-demanding situations, like when you don't want to load or reinitialize the parts of the detached graph for which you know that they are not changed in some context in which you save it.
For example, let's assume a case where there are separate web service methods for updating account, contact and asset. In the case of the account method, with cascading options you would need to load the entire account graph just to merge the changes on the account itself, although contact and asset are not changed (or worse, depending on how you do it, you may here revert changes on them made by somebody else in their dedicated methods in the meantime if you just use the detached instances contained in the account).
Regarding auto-generated ids, you don't have to specify them yourself, just take them from the saved entities (Hibernate will set it there). It is important to take the result of the repository's save method if you plan to use the updated entity afterwards, because merge operation always returns the merged copy of the passed-in instance, and if there are any newly persisted associated entity instances in the updated detached graph, their ids will be set in the copy, and the original instances are not modified.

JPA/validation @ManyToOne relations should not create new rows

I have a JPA entity which contains a ManyToOne reference to another table. A simplified version of that entity is shown below:
@Entity
@Table(name = "ENTITIES")
public class Entity implements Serializable {
    @Id @NotNull
    private String id;
    @JoinColumn(name = "REFERENCE", referencedColumnName = "ID")
    @ManyToOne(optional = false)
    private ReferencedEntity referencedEntity;
}
@Entity
@Table(name = "REFERENCES")
public class ReferencedEntity implements Serializable {
    @Id @NotNull @Column(name = "ID")
    private String id;
    @Size(max = 50) @Column(name = "DSC")
    private String description;
}
Finding entities works fine. Persisting entities also works fine, a bit too well for my particular setup; I need some extra validation.
Problem
My requirement is that the rows in table REFERENCES are static and should not be modified or new rows added.
Currently when I create a new Entity instance with a non-existing (yet) ReferencedEntity and persist that instance, a new row is added to REFERENCES.
Right now I've implemented this check in my own validate() method before calling the persist(), but I'd rather do it more elegantly.
Using an enum instead of a real entity is not an option, I want to add rows myself without a rebuild/redeployment several times in the future.
My question
What is the best way to implement a check like this?
Is there some BV annotation/constraint that helps me restrict this? Maybe a third party library?
It sounds like you need to first do a DB query to check if the value exists and then insert the record. This must be done in a transaction in order to ensure that the result of the query is still true at the time of insertion. I had a similar problem half a year back which might provide you with some leads on how to set up locking. Please see this SO question.
You should add this => insertable=false, updatable=false
And remove => optional=false , and maybe try nullable=true

Data JPA - Remove entities from repository

I have an existing Java application in which I now use Spring Boot and Spring Data JPA to save some data in a database. One class, Order, which I am now converting to an @Entity, has a member of type List<Position>. The reduced classes are as follows:
@Entity
public class Order
{
    @Id
    @GeneratedValue(strategy=GenerationType.AUTO)
    private long id;
    private List<Position> positions;
    //some other members follow here...
}
@Entity
public class Position
{
    @Id
    @GeneratedValue(strategy=GenerationType.AUTO)
    private long id;
    //some members follow here...
}
So what I have done is the following: I added the @Transient annotation to the list in Order and added a reference to an Order in Position:
@Entity
public class Order
{
    @Id
    @GeneratedValue(strategy=GenerationType.AUTO)
    private long id;
    @Transient
    private List<Position> positions;
    //some other members follow here...
}
@Entity
public class Position
{
    @Id
    @GeneratedValue(strategy=GenerationType.AUTO)
    private long id;
    @ManyToOne
    private Order order;
    //some members follow here...
}
Now when I want to save an Order object, I first save the Order in its repository, then go through the list of Positions, set the reference to the Order in each one, and save each Position to its own repository. If I want to fetch an Order, I first fetch the Order and then fetch its Positions from the corresponding repository with findByOrder(..).
So far this works. The problem arises when the application modifies the list of Positions in the Order and I have to update the database with the new Order object: I find no smooth way to delete the Positions that the application removed from the list (as I no longer have a reference to the removed ones). I could first delete all Positions of that Order and then save the remaining ones again.
So my question is whether there is a better way to delete the Positions that were removed by the application. But maybe this is an XY question, because my approach to saving the Position list is the reason I am facing this problem. I appreciate any hints concerning this.
You're not doing it right.
First, it's not clear why you're making the @OneToMany side @Transient.
It is best to use the cascade features of JPA.
In your example, if you put:
@OneToMany(mappedBy = "order", cascade = CascadeType.ALL, orphanRemoval = true)
private List<Position> positions;
all operations on Order will cascade to its Positions as well, so you don't need to manage them explicitly, and with orphanRemoval = true, Positions removed from the list are deleted when the Order is saved.
See these examples with Hibernate
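If the manual approach from the question has to stay (Order and Position saved through separate repositories), the removed Positions can still be found without deleting everything first: compare the IDs currently stored for the order with the IDs still present in the list. The sketch below is plain Java with no JPA involved; the class name PositionDiff and the method removedIds are hypothetical, and it assumes that not-yet-saved positions carry the ID 0.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class PositionDiff {

    /**
     * Returns the IDs that are stored for an order but no longer
     * appear in the order's current position list.
     */
    static Set<Long> removedIds(Set<Long> storedIds, List<Long> currentIds) {
        Set<Long> removed = new HashSet<>(storedIds);
        removed.removeAll(currentIds); // keep only IDs missing from the list
        return removed;
    }

    public static void main(String[] args) {
        // IDs as loaded from the database for one order (assumed values)
        Set<Long> stored = new HashSet<>(Arrays.asList(1L, 2L, 3L));
        // IDs in the modified list; 0 marks a new, unsaved position
        List<Long> current = Arrays.asList(1L, 3L, 0L);
        System.out.println(removedIds(stored, current)); // prints [2]
    }
}
```

The resulting IDs could then be handed to the Position repository's delete operation before the remaining positions are saved.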

JPA repository - improve findAll() performances

I am using a JPA repository and need to retrieve a whole MySQL table (40000 records) which has 5 foreign keys to smaller tables (500 records). I need one field from each of these 5 tables.
If I call the JpaRepository findAll(), it takes a few seconds to retrieve all the data.
I need it to be faster. Is there a way to do that?
I don't know what the best solution would be, or whether it can be done on the MySQL side or must be done on the Java side.
All the tables are properly mapped to JPA entities:
@Entity
@Table(name = "T_CLIENT")
@Cache(usage = CacheConcurrencyStrategy.NONSTRICT_READ_WRITE)
public class Client implements Serializable {
    @Id
    @GeneratedValue(strategy = GenerationType.AUTO)
    private Long id;
    @Column(name = "code")
    private String code;
    @OneToOne
    private Seller seller;
    @OneToOne
    private Language language;
    @OneToOne
    private Address address;
    @OneToOne
    private Country billCountry;
    @OneToOne
    private ClientType clientType;
}
Thank you for your answers.
You can choose which associations to the foreign-key tables are loaded by using the EntityGraph mechanism.

Hibernate Criteria, Select order according to newest events

I have classes in Hibernate like this:
@Entity
class Order{
    private MyPattern pat;
    @Id
    private int id;
    @OneToMany(cascade = CascadeType.ALL)
    @JoinColumn(name = "order_id")
    private List<Event> events;
    public DetachedCriteria getCriteria() {
        //here I create criterias
    }
}
@Entity
class Event{
    @Column
    @Temporal(value = javax.persistence.TemporalType.DATE)
    private Date date;
    @Id
    private int id;
    @Column
    private String name;
}
What I need is to create a DetachedCriteria in Order from MyPattern (the detailed structure is not important). I have this partially implemented, but my problem now is to select only Orders whose newest event.name matches the one in the pattern. I think selecting the row with the newest date might help, but I just can't figure out how to do this in Criteria. So I am open to solutions and help. Thanks
edit:
I have a request, which is an Order. I have to respond with the correct Order instances (according to the content of the pattern). For example: a client requests only orders which have already been shipped. So I need to select Orders whose newest Event has the name "Shipped".
DetachedCriteria dc = DetachedCriteria.forClass(Order.class, "or").createAlias("events", "eve");
dc.add(Restrictions.eq("eve.name", "Shipped"));
orders = dc.getExecutableCriteria(session).list();
Basically this code should do the trick, but it has one BIG flaw: it also returns orders which were "Delivered", because events in Order is a List containing every event of the order. So maybe a simple fix would be something like "select orders which have event.name=Shipped but NOT event.name=Delivered".
Fetching results based on an event name pattern, ordered by date. Event_ and Order_ are the metamodel classes of the Event and Order entities.
// cb is a CriteriaBuilder, e.g. obtained from entityManager.getCriteriaBuilder()
CriteriaQuery<Order> cq = cb.createQuery(Order.class);
Root<Order> order = cq.from(Order.class);
Join<Order, Event> event = order.join(Order_.events);
cq.select(order);
cq.where(cb.like(event.get(Event_.name), "%somePattern")); // pattern for results; % is the LIKE wildcard
cq.orderBy(cb.asc(event.get(Event_.date)));
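The requirement from the question, that the most recent event of an order carries a given name, is stricter than "some event has that name". As a way to pin down the intended predicate, here is a minimal in-memory sketch in plain Java. It is not a Criteria query: the nested Event class is a hypothetical stand-in for the entity, and it uses LocalDate instead of the entity's Date for brevity. In a query, the same idea would typically be expressed by restricting on the event whose date equals the maximum event date of that order.

```java
import java.time.LocalDate;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class NewestEventFilter {

    // Hypothetical stand-in for the Event entity (name and date only)
    static final class Event {
        final String name;
        final LocalDate date;

        Event(String name, LocalDate date) {
            this.name = name;
            this.date = date;
        }
    }

    /** True when the event with the latest date has the given name. */
    static boolean newestEventIs(List<Event> events, String name) {
        return events.stream()
                .max(Comparator.comparing((Event e) -> e.date))
                .map(e -> e.name.equals(name))
                .orElse(false); // an order without events never matches
    }

    public static void main(String[] args) {
        List<Event> shipped = Arrays.asList(
                new Event("Created", LocalDate.of(2020, 1, 1)),
                new Event("Shipped", LocalDate.of(2020, 1, 3)));
        List<Event> delivered = Arrays.asList(
                new Event("Shipped", LocalDate.of(2020, 1, 3)),
                new Event("Delivered", LocalDate.of(2020, 1, 5)));
        System.out.println(newestEventIs(shipped, "Shipped"));   // prints true
        System.out.println(newestEventIs(delivered, "Shipped")); // prints false
    }
}
```

An order that was shipped and later delivered is rejected here, which is exactly the case the simple name restriction lets through.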
