merge vs find to update entities JPA

From the book Pro EJB3 JPA:
The most common strategy to handle this (updating entities) in a Java EE application that uses JPA is to place the results of the changes into detached entity instances and merge the pending changes into a persistence context so that they can be written to the database.
Example:
The emp param is a detached entity
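import javax.ejb.Stateless;
import javax.persistence.EntityManager;
import javax.persistence.PersistenceContext;

@Stateless
public class EmployeeServiceBean {
    @PersistenceContext
    EntityManager em;

    public void updateEmployee(Employee emp) {
        // emp is detached; make sure a matching row exists before merging
        if (em.find(Employee.class, emp.getId()) == null) {
            throw new IllegalArgumentException("Unknown Employee id");
        }
        em.merge(emp);
    }
}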
Then, says:
If the amount of information being updated is very small, we can avoid the detached object and merge() operation entirely by locating the managed version and manually copying the changes into it.
Example:
Here the emp is attached
public void updateEmployee(int id, String newName, long newSalary) {
    Employee emp = em.find(Employee.class, id);
    if (emp == null) {
        throw new IllegalArgumentException("Unknown Employee id");
    }
    // emp is managed, so these changes are written at flush/commit
    emp.setEmpName(newName);
    emp.setSalary(newSalary);
}
So it looks like, for small updates and create operations, the strategy of calling find() and then setting the new values one by one is convenient. But for big updates of data (e.g. collections), it is preferable to have a detached entity with all its relations (with CascadeType.MERGE) and do one big merge().
OK, but why?

Because if your bean has a lot of attributes and you are dealing with a detached object, JPA will check every attribute, one by one, during the merge.
Now, if you have a bean with 200 attributes and want to change only one field, it's easier for JPA to just get the managed version (internally, JPA knows whether a field of a managed entity is "dirty" or not); it will then deal only with that specific attribute.
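To make the difference concrete, here is a minimal toy model in plain Java. It is not the JPA API: ManagedRowSketch and its methods are hypothetical names, and real providers track state in their own ways. It only illustrates that the merge path copies every attribute of the detached copy, while the find-and-set path touches only the attribute you name; both end up with the same dirty set.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stand-in for a managed entity: keeps a snapshot of the loaded
// state and compares it with the current state, as a flush would.
final class ManagedRowSketch {
    private final Map<String, Object> snapshot;
    private final Map<String, Object> current;

    ManagedRowSketch(Map<String, Object> loaded) {
        this.snapshot = new LinkedHashMap<>(loaded);
        this.current = new LinkedHashMap<>(loaded);
    }

    // find() + setter path: only the named attribute is touched.
    void set(String attribute, Object value) {
        current.put(attribute, value);
    }

    // merge() path: every attribute of the detached copy is copied over.
    // Returns how many attributes were copied.
    int mergeFrom(Map<String, Object> detached) {
        current.putAll(detached);
        return detached.size();
    }

    // Attributes whose current value differs from the loaded snapshot.
    Set<String> dirtyAttributes() {
        Set<String> dirty = new TreeSet<>();
        for (Map.Entry<String, Object> entry : current.entrySet()) {
            if (!Objects.equals(entry.getValue(), snapshot.get(entry.getKey()))) {
                dirty.add(entry.getKey());
            }
        }
        return dirty;
    }
}
```

With a 200-attribute row, mergeFrom would copy 200 values to change one, whereas set touches exactly one.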

Related

JPA flushing to Database before @PreUpdate is called

I am trying to capture the entity data in the database before the save is executed, for the purpose of creating a shadow copy.
I have implemented the following EntityListener in my Spring application:
public class CmsListener {
    public CmsListener() {
    }

    @PreUpdate
    private void createShadow(CmsModel entity) {
        EntityManager em = BeanUtility.getBean(EntityManager.class);
        CmsModel p = em.find(entity.getClass(), entity.getId());
        System.out.println(entity);
    }
}
The entity parameter does indeed contain the object that is about to be saved, and I obtain the EntityManager through another tool, which works fine. But for some reason the entity appears to have already been saved to the database: CmsModel p = em.find(...) returns data identical to what is in entity.
Why is JPA/Hibernate persisting the changes before @PreUpdate is called? How can I prevent that?

I would assume this is because em.find doesn't actually query the database but fetches the object from cache, so it actually fetches the same object entity refers to (with changes already applied).
You could check your database log for the query that fetches the data for entity.id to verify this is indeed the case or you could add a breakpoint in createShadow() and have a look at the database entry for entity at the time the function is called to see for yourself if the changes are already applied to the database at that time.
To actually solve your problem and get your shadow copy you could fetch the object directly from database via native query.
Here is an untested example of what this could look like:
public CmsModel fetchCmsModelDirectly(Long id) { // id type assumed to be Long
    Query q = em.createNativeQuery(
        "SELECT cm.id, cm.value_a, cm.value_b FROM CmsModel cm WHERE cm.id = ?1", CmsModel.class);
    q.setParameter(1, id);
    try {
        return (CmsModel) q.getSingleResult();
    } catch (NoResultException e) {
        return null;
    }
}

Did you check whether the entity is really updated in the database? My suspicion is that the change has only been applied to the persistence context (cache). When the entity is queried again in the listener, the one from the cache is returned, so the two are identical.
This is the default behavior of most ORMs (JPA in this case) to speed up data lookup. The ORM framework takes care of synchronizing the persistence context with the database, usually when the transaction is committed.
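The caching behaviour both answers describe can be mimicked with a small identity map. This is only a toy sketch in plain Java, not the real EntityManager: FirstLevelCacheSketch and its loader are hypothetical names. It shows why a second find for the same id returns the very same object, including any in-memory changes, without touching the "database" again.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongFunction;

// Hypothetical model of a persistence context acting as a first-level cache.
final class FirstLevelCacheSketch<T> {
    private final Map<Long, T> managed = new HashMap<>();
    private final LongFunction<T> databaseLoader; // stands in for a SELECT
    private int databaseHits = 0;

    FirstLevelCacheSketch(LongFunction<T> databaseLoader) {
        this.databaseLoader = databaseLoader;
    }

    // Returns the already-managed instance if there is one; otherwise loads it.
    T find(long id) {
        T cached = managed.get(id);
        if (cached != null) {
            return cached;
        }
        databaseHits++;
        T loaded = databaseLoader.apply(id);
        if (loaded != null) {
            managed.put(id, loaded);
        }
        return loaded;
    }

    int databaseHits() {
        return databaseHits;
    }
}
```

In this model, modifying the object returned by the first find and then calling find again yields the modified object, which matches what the listener in the question observes.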

Bulk inserting existing data: Preventing JPA from doing a select before every insert

I'm working on a Spring Boot application that uses JPA (Hibernate) for the persistence layer.
I'm currently implementing a migration functionality. We basically dump all the existing entities of the system into an XML file. This export includes ids of the entities as well.
The problem I'm having is located on the other side, reimporting the existing data. In this step the XML gets transformed to a Java object again and persisted to the database.
When trying to save the entity, I'm using the merge method of the EntityManager class, which works: everything is saved successfully.
However when I turn on the query logging of Hibernate I see that before every insert query, a select query is executed to see if an entity with that id already exists. This is because the entity already has an id that I provided.
I understand this behavior and it actually makes sense. I'm sure however that the ids will not exist so the select does not make sense for my case. I'm saving thousands of records so that means thousands of select queries on large tables which is slowing down the importing process drastically.
My question: Is there a way to turn this "checking if an entity exists before inserting" off?
Additional information:
When I use entityManager.persist() instead of merge, I get this exception:
org.hibernate.PersistentObjectException: detached entity passed to persist
To be able to use a supplied/provided id I use this id generator:
@Id
@GeneratedValue(generator = "use-id-or-generate")
@GenericGenerator(name = "use-id-or-generate", strategy = "be.stackoverflowexample.core.domain.UseIdOrGenerate")
@JsonIgnore
private String id;
The generator itself:
public class UseIdOrGenerate extends UUIDGenerator {
    private String entityName;

    @Override
    public void configure(Type type, Properties params, ServiceRegistry serviceRegistry) throws MappingException {
        entityName = params.getProperty(ENTITY_NAME);
        super.configure(type, params, serviceRegistry);
    }

    @Override
    public Serializable generate(SessionImplementor session, Object object) {
        // Reuse the id supplied on the object; generate one only if it is missing
        Serializable id = session
            .getEntityPersister(entityName, object)
            .getIdentifier(object, session);
        if (id == null) {
            return super.generate(session, object);
        } else {
            return id;
        }
    }
}

If you are certain that you will never be updating any existing entry on the database and all the entities should be always freshly inserted, then I would go for the persist operation instead of a merge.
Per update
In that case (id field being set-up as autogenerated) the only way would be to remove the generation annotations from the id field and leave the configuration as:
@Id
@JsonIgnore
private String id;
So basically you set the id up to always be assigned manually. The persistence provider will then consider your entity transient even when the id is present, meaning that persist will work and no extra selects will be generated.

I'm not sure whether you fill in the ID yourself. If you fill it in on the application side, check the answer here. I copied it below:
Here is the code of Spring's SimpleJpaRepository, which is what runs when you use a Spring Data repository:
@Transactional
public <S extends T> S save(S entity) {
    if (entityInformation.isNew(entity)) {
        em.persist(entity);
        return entity;
    } else {
        return em.merge(entity);
    }
}
It does the following:
By default Spring Data JPA inspects the identifier property of the given entity. If the identifier property is null, then the entity will be assumed as new, otherwise as not new.
Link to Spring Data documentation
So if one of your entities has a non-null ID field, Spring will call merge, which makes Hibernate issue a SELECT first.
You can override this behavior in the two ways listed in the same documentation. An easy way is to make your entity implement Persistable (instead of Serializable), which requires you to implement the isNew() method.
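The isNew() idea can be sketched without any framework. In the toy code below, NewAwareSketch is a local, hypothetical stand-in for the shape of Spring Data's Persistable (only getId() and isNew()), and chooseOperation mirrors the branch in save() shown above. The alreadyStored flag is an assumption about how an entity might track its own state; in a real entity it would typically be flipped from lifecycle callbacks after insert or load.

```java
// Local stand-in for the contract; not the Spring Data interface itself.
interface NewAwareSketch<ID> {
    ID getId();
    boolean isNew();
}

// An imported record that carries its id from the export file but still
// reports itself as new until it has been stored once.
final class ImportedRecordSketch implements NewAwareSketch<String> {
    private final String id;
    private boolean alreadyStored = false; // hypothetical state flag

    ImportedRecordSketch(String id) {
        this.id = id;
    }

    @Override
    public String getId() {
        return id;
    }

    @Override
    public boolean isNew() {
        return !alreadyStored;
    }

    void markStored() {
        alreadyStored = true;
    }
}

final class SaveDecisionSketch {
    private SaveDecisionSketch() {
    }

    // Same decision as save(): new entities are persisted, others merged.
    static String chooseOperation(NewAwareSketch<?> entity) {
        return entity.isNew() ? "persist" : "merge";
    }
}
```

The point of the design is that "new" is decided by the entity itself rather than by whether the id happens to be null, so a pre-assigned id no longer forces the merge path.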

JPA handle merge() of relationship

I have a unidirectional relation Project -> ProjectType:
@Entity
public class Project extends NamedEntity
{
    @ManyToOne(optional = false)
    @JoinColumn(name = "TYPE_ID")
    private ProjectType type;
}

@Entity
public class ProjectType extends Lookup
{
    @Min(0)
    private int progressive = 1;
}
Note that there's no cascade.
Now, when I insert a new Project I need to increment the type progressive.
This is what I'm doing inside an EJB, but I'm not sure it's the best approach:
public void create(Project project)
{
    em.persist(project);
    /* is necessary to merge the type? */
    ProjectType type = em.merge(project.getType());
    /* is necessary to set the type again? */
    project.setType(type);
    int progressive = type.getProgressive();
    type.setProgressive(progressive + 1);
    project.setCode(type.getPrefix() + progressive);
}
I'm using EclipseLink 2.6.0, but I'd like to know whether there is an implementation-independent best practice and/or whether persistence providers behave differently in this specific scenario.
UPDATE
To clarify the context when entering the EJB create method (it is invoked by a JSF @ManagedBean):
project.projectType is DETACHED
project is NEW
no transaction (I'm using JTA/CMT) is active
I am not asking about the difference between persist() and merge(). I'm asking:
if em.persist(project) automatically "reattach" project.projectType (I suppose not)
if it is legal the call order: first em.persist(project) then em.merge(projectType) or if it should be inverted
since em.merge(projectType) returns a different instance, if it is required to call project.setType(managedProjectType)
An explanation of why this works one way and not another is also welcome.

You need merge(...) only to make a transient entity managed by your entity manager. Depending on the implementation of JPA (not sure about EclipseLink) the returned instance of the merge call might be a different copy of the original object.
MyEntity unmanaged = new MyEntity();
MyEntity managed = entityManager.merge(unmanaged);
assert(entityManager.contains(managed)); // true if everything worked out
assert(managed != unmanaged); // probably true, depending on JPA impl.
If you call merge(entity) where entity is already managed, nothing will happen.
Calling persist(entity) will also make your entity managed, but it returns no copy. Instead it makes the original object itself managed, and it might also call an ID generator (e.g. a sequence), which is not the case when using merge.
See this answer for more details on the difference between persist and merge.
Here's my proposal:
public void create(Project project) {
    ProjectType type = project.getType(); // maybe check if null
    if (!entityManager.contains(type)) { // type is transient
        type = entityManager.merge(type); // or load the type
        project.setType(type); // update the reference
    }
    int progressive = type.getProgressive();
    type.setProgressive(progressive + 1); // mark as dirty, update on flush
    // set "code" before persisting "project" ...
    project.setCode(type.getPrefix() + progressive);
    entityManager.persist(project);
    // ... now no additional UPDATE is required after the
    // INSERT on "project".
}
UPDATE
if em.persist(project) automatically "reattach" project.projectType (I suppose not)
No. You'll probably get an exception (Hibernate does anyway) stating that you're trying to merge with a transient reference.
Correction: I tested it with Hibernate and got no exception. The project was created with the unmanaged project type (which had been managed and then detached before persisting the project). But the project type's progressive was not incremented, as expected, since the type wasn't managed. So yes, make it managed before persisting the project.
if it is legal the call order: first em.persist(project) then em.merge(projectType) or if it should be inverted
It's best practice to invert it, i.e. merge the type first. When both statements are executed within the same batch (before the entity manager gets flushed), merging the type after persisting the project may still work; in my test it did. But as I said, it's better to merge the entities before persisting new ones.
since em.merge(projectType) returns a different instance, if it is required to call project.setType(managedProjectType)
Yes. See example above. A persistence provider may return the same reference, but it isn't required to. So to be sure, call project.setType(mergedType).
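Why the reference has to be reassigned can be seen in a toy model of merge(). This is plain Java, not a JPA provider: MergeContextSketch and ProjectTypeSketch are hypothetical names, and the model always returns a separate managed copy (which a real provider may or may not do). It shows that the object passed in stays unmanaged, so changes must be made on the returned instance.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical minimal entity with an id and the counter from the question.
final class ProjectTypeSketch {
    final long id;
    int progressive;

    ProjectTypeSketch(long id, int progressive) {
        this.id = id;
        this.progressive = progressive;
    }
}

// Hypothetical persistence context that models merge() as "copy state onto
// the managed instance and return that instance".
final class MergeContextSketch {
    private final Map<Long, ProjectTypeSketch> managed = new HashMap<>();

    ProjectTypeSketch merge(ProjectTypeSketch detached) {
        ProjectTypeSketch copy = managed.get(detached.id);
        if (copy == null) {
            copy = new ProjectTypeSketch(detached.id, detached.progressive);
            managed.put(detached.id, copy);
        } else if (copy != detached) {
            copy.progressive = detached.progressive;
        }
        return copy;
    }

    // True only for the exact instance this context manages.
    boolean contains(ProjectTypeSketch candidate) {
        return managed.get(candidate.id) == candidate;
    }
}
```

In this model, incrementing progressive on the detached argument would change nothing the context knows about, which is the same effect the answer observed when the project type was not managed.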

Do you need to merge? Well, it depends. According to the merge() javadoc:
Merge the state of the given entity into the current persistence context
How did you get the instance of ProjectType that you attach to your Project? If that instance is already managed, then all you need to do is
type.setProgressive(type.getProgressive() + 1)
and JPA will automatically issue an update on the next context flush.
Otherwise, if the type is not managed, you need to merge it first.
Although not directly related, this question has some good insight about persist vs merge: JPA EntityManager: Why use persist() over merge()?
With the call order of em.persist(project) vs em.merge(projectType), you probably should ask yourself what should happen if the type is gone in the database? If you merge the type first it will get re-inserted, if you persist the project first and you have FK constraint the insert will fail (because it's not cascading).

In this code, merge basically returns the record as a different object. Let's say there is an Account POJO:
Account account =null;
account = entityManager.merge(account);
You can then store the result of the call.
But in your code you are using merge under a different condition:
public void create(Project project)
{
    em.persist(project);
    /* is necessary to merge the type? */
    ProjectType type = em.merge(project.getType());
}
Here Project and ProjectType are two different POJOs, and merge is normally used on the same POJO.
If there is a relationship between your POJOs, you can also use it.

Save embedded entity with objectify

I have two entities.
@Entity
public class Recipe {
    @Id
    private Long id;
    private List<Step> steps;
}

@Entity
public class Step {
    @Id
    private Long id;
    private String instruction;
}
And the following Cloud Endpoint:
@ApiMethod(
    name = "insert",
    path = "recipe",
    httpMethod = ApiMethod.HttpMethod.POST)
public Recipe insert(Recipe recipe) {
    ofy().save().entities(recipe.getSteps()).now(); //superfluous?
    ofy().save().entity(recipe).now();
    logger.info("Created Recipe with ID: " + recipe.getId());
    return ofy().load().entity(recipe).now();
}
I'm wondering how I can skip the step where I have to save the embedded entity first. The id of neither entity is set; I want Objectify to create those automatically. But if I don't save the embedded entity, I get an exception:
com.googlecode.objectify.SaveException: Error saving com.devmoon.meadule.backend.entities.Recipe@59e4ff19: You cannot create a Key for an object with a null @Id. Object was com.devmoon.meadule.backend.entities.Step@589a3afb
Since my object structure will get a lot more complex, I need to find a way to skip this manual step.

I presume you are trying to create real embedded objects, not separate objects stored in the datastore and linked. Your extra save() is actually saving separate entities. You don't want that.
You have two options:
1. Don't give your embedded object an id. Don't give it @Entity and don't give it an id field (or at least eliminate @Id). It's just a POJO. 90% of the time, this is what people want with embedded objects.
2. Allocate the id yourself with the allocator, typically in your (non-default) constructor.
Assuming you want a true embedded entity with a real key, #2 is probably what you should use. Keep in mind that this key is somewhat whimsical since you can't actually load it; only the container object can be looked up in the datastore.
I suggest going one step further and never using automatic id generation for any entities, ever. Always use the allocator in the (non-default) constructor of your entities. This ensures that entities always have a valid, stable id. If you always allocate the id before a transaction starts, it also prevents the duplicate entities that can be created when a transaction gets retried. Populating null ids is just a bad idea all around and really should not have been added to GAE.
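The "allocate in the constructor" advice can be sketched in plain Java. IdAllocatorSketch below is a hypothetical in-memory counter standing in for the datastore's id allocator, and StepSketch is a simplified version of the Step class; neither uses the Objectify API. The point is only that the id exists from construction onward and never changes.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical allocator; a real one would reserve ids in the datastore.
final class IdAllocatorSketch {
    private static final AtomicLong NEXT_ID = new AtomicLong(1);

    private IdAllocatorSketch() {
    }

    static long allocate() {
        return NEXT_ID.getAndIncrement();
    }
}

// The id is assigned once, in the non-default constructor, so the object
// never exists with a null id.
final class StepSketch {
    private final long id;
    private final String instruction;

    StepSketch(String instruction) {
        this.id = IdAllocatorSketch.allocate();
        this.instruction = instruction;
    }

    long getId() {
        return id;
    }

    String getInstruction() {
        return instruction;
    }
}
```

Because the id is fixed before any save happens, retrying the same save writes the same key again instead of creating a second entity.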

The concept of the embedded is that the embedded content is persisted inside the main entity.
Is this the behaviour you are trying to configure?
By default, a Collection (List) of an @Entity-annotated class is stored as references to those entities rather than embedding them. In your current configuration, the List<Step> field has no annotation that overrides this default, so each Step is a separate entity related to the Recipe.
The error you are getting occurs because Objectify, when it saves the recipe entity, tries to get the key of each step to create the relationship (and save the keys in the recipe entity), but a step that has not yet been saved to the datastore does not have a key.
If you are trying to persist the steps inside the recipe entity, you need to set up Objectify like this:
@Entity
public class Recipe {
    @Id
    private Long id;
    private List<Step> steps;
}

public class Step {
    private Long id;
    private String instruction;
}
As you can see, I removed the @Id annotation (an embedded entity does not require an ID because it lives inside another entity) and the @Entity annotation from the Step class. With this configuration, Objectify saves the step entities inside the recipe entity.
Source: https://code.google.com/p/objectify-appengine/wiki/Entities#Embedded_Object_Native_Representation

JPA, removing an entity which was found by a different manager

Assume we have a simple entity bean, like the one below:
@Entity
public class Schemes implements Serializable {
    ...
    @Id private long id;
    ...
}
I find a record using the find method and it works perfectly. The problem is that I cannot manipulate (remove) it later with another EntityManager. For example, I find it in one method and later want to remove it. What is the problem? If I find it again with the same manager I can remove it, but if the object was found by another manager I cannot.
@ManagedBean @SessionScoped class JSFBean {
    private Schemes s;

    public JSFBean() {
        ....
        EntityManager em;//.....
        s = em.find(Schemes.class, 0x10L); // okay!
        ....
    }

    public void remove() { // later
        ....
        EntityManager em;//.....
        em.getTransaction().begin();
        em.remove(s); // Error! It throws IllegalArgumentException
        em.getTransaction().commit();
        ....
    }
}
many thanks.

You are probably getting a java.lang.IllegalArgumentException: Removing a detached instance.
The two EMs do not share a persistence context and for the second EM, your object is considered detached. Trying to remove a detached object will result in an IllegalArgumentException.
You can refetch the entity before the removal:
Schemes originalS = em.find(Schemes.class, s.getId());
em.remove(originalS);
EDIT You can also delete the entity without fetching it first by using parametrized bulk queries:
DELETE FROM Schemes s WHERE s.id = :id
Be aware that bulk queries can cause problems of their own. First, they bypass the persistence context, meaning that whatever you do with a bulk query will not be reflected by the objects in the persistence context. This is less of an issue for delete queries than for update queries. Second, if you have defined any cascading rules on your entities, they will be ignored by a bulk query.
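The detached-instance failure and the refetch fix described in this answer can be reproduced with a toy model of two entity managers sharing one table. This is plain Java, not JPA: EmSketch and SchemeSketch are hypothetical names, and the exception message is copied from the answer only for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical minimal entity identified by its id.
final class SchemeSketch {
    final long id;

    SchemeSketch(long id) {
        this.id = id;
    }
}

// Hypothetical entity manager: its own persistence context, shared database.
final class EmSketch {
    private final Map<Long, String> database; // shared table: id -> row
    private final Map<Long, SchemeSketch> managed = new HashMap<>();

    EmSketch(Map<Long, String> database) {
        this.database = database;
    }

    // Returns the instance managed by THIS context, or null if no row exists.
    SchemeSketch find(long id) {
        if (!database.containsKey(id)) {
            return null;
        }
        return managed.computeIfAbsent(id, key -> new SchemeSketch(key));
    }

    // Only instances managed by this context can be removed.
    void remove(SchemeSketch entity) {
        if (managed.get(entity.id) != entity) {
            throw new IllegalArgumentException("Removing a detached instance");
        }
        managed.remove(entity.id);
        database.remove(entity.id);
    }
}
```

An instance found through one EmSketch is unknown to another, so the second one rejects it; calling find on the second one first, as in the refetch snippet above, makes the removal succeed.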
