I'm using Spring Framework and JPA to insert beans into my database. I need to insert almost 8000 entities, and this takes far too long.
Why should I disable the second-level cache in Hibernate (hibernate.cache.use_second_level_cache=false)?
When I set "hibernate.jdbc.batch_size=20" in Hibernate, will it insert my beans like this?
INSERT INTO VALUES (1),(2),(3)...(20);
INSERT INTO VALUES (21),(22),(23)...(40);
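Not exactly: by default, Hibernate sends each batch as 20 individual INSERT statements submitted in one JDBC batch, rather than one multi-row INSERT (some drivers, e.g. MySQL with rewriteBatchedStatements=true, rewrite them into the multi-row form). As a plain-Java illustration of how rows end up grouped into batches (this is a sketch of the grouping only, not Hibernate code):

```java
import java.util.ArrayList;
import java.util.List;

public class BatchGrouping {
    // Split a list of values into consecutive groups of at most batchSize;
    // each group corresponds to one JDBC batch that Hibernate executes.
    static List<List<Integer>> toBatches(List<Integer> values, int batchSize) {
        List<List<Integer>> batches = new ArrayList<>();
        for (int i = 0; i < values.size(); i += batchSize) {
            batches.add(new ArrayList<>(
                    values.subList(i, Math.min(i + batchSize, values.size()))));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<Integer> ids = new ArrayList<>();
        for (int i = 1; i <= 50; i++) ids.add(i);
        List<List<Integer>> batches = toBatches(ids, 20);
        System.out.println(batches.size()); // 3 batches: 20 + 20 + 10 rows
    }
}
```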
The documentation says: "Hibernate disables insert batching at the JDBC level transparently if you use an identity identifier generator.". So, all my beans have this configuration:
@Id
@GeneratedValue(strategy = javax.persistence.GenerationType.IDENTITY)
private Integer id;
When I use the IDENTITY strategy above, is batch inserting disabled? How can I solve this?
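Yes: with GenerationType.IDENTITY, Hibernate must execute each INSERT immediately to learn the generated key, so JDBC batching is disabled. A sketch of a sequence-based mapping that keeps batching possible (the generator and sequence names here are illustrative, and SEQUENCE requires database support, e.g. Oracle or PostgreSQL):

```java
@Id
@GeneratedValue(strategy = javax.persistence.GenerationType.SEQUENCE,
                generator = "customer_seq")
@SequenceGenerator(name = "customer_seq", sequenceName = "CUSTOMER_SEQ",
                   allocationSize = 20) // often matched to hibernate.jdbc.batch_size
private Integer id;
```

With SEQUENCE (or TABLE) generation, Hibernate can obtain identifiers before flushing, so the inserts can be grouped into JDBC batches.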
In Hibernate you cannot disable the session-level (first-level) cache. If you don't want it, use a StatelessSession, which does not cache anything.
Furthermore, the Hibernate documentation shows how to do batch inserts:
Session session = sessionFactory.openSession();
Transaction tx = session.beginTransaction();
for (int i = 0; i < 100000; i++) {
    Customer customer = new Customer(.....);
    session.save(customer);
    if (i % 20 == 0) { // 20, same as the JDBC batch size
        // flush a batch of inserts and release memory:
        session.flush();
        session.clear();
    }
}
tx.commit();
session.close();
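If the first-level cache itself is the concern, a StatelessSession variant of the same loop might look like this (a sketch; note that a stateless session bypasses cascades, interceptors, and all caching):

```java
StatelessSession session = sessionFactory.openStatelessSession();
Transaction tx = session.beginTransaction();
for (int i = 0; i < 100000; i++) {
    // insert() issues the INSERT directly; nothing is held in a
    // persistence context, so no periodic flush()/clear() is needed.
    session.insert(new Customer(.....));
}
tx.commit();
session.close();
```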
While learning Hibernate I came across this in the official Hibernate documentation:
That is because Hibernate caches all the newly inserted Customer instances in the session-level cache.
I am aware that Hibernate caches retrieved entities, but does it cache newly persisted ones as well?
EDIT: by a newly created instance I mean, e.g., session.save(new Customer()).
Session session = sessionFactory.openSession();
Transaction tx = session.beginTransaction();
for (int i = 0; i < 100000; i++) {
    Customer customer = new Customer(.....);
    session.save(customer);
}
tx.commit();
session.close();
The caching here means that after session.save(customer), the customer still lives in the session and is not removed until the session is closed (or cleared).
It also means that if you use session.get(Customer.class, id) to fetch a customer that was saved earlier in the same session, no SQL SELECT is issued; the cached instance is simply returned from the session.
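A minimal illustration of this first-level cache behavior (a sketch, assuming a mapped Customer entity):

```java
Session session = sessionFactory.openSession();
Transaction tx = session.beginTransaction();
Customer customer = new Customer(.....);
session.save(customer); // customer now sits in the session-level cache
// No SELECT is issued here; the very same instance comes back:
Customer cached = (Customer) session.get(Customer.class, customer.getId());
assert cached == customer;
tx.commit();
session.close();
```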
While streaming over a "data provider" I need to insert a fairly large number of entities into the database, say around 100,000. This whole step needs to be transactional.
To simplify my use-case as much as possible let's assume this is my code:
@Transactional
public void execute() {
    for (int i = 0; i < 100000; i++) {
        carRepository.save(new Car());
    }
}
The problem with this code is that, even though I clearly have no use for the Car entities after the INSERT is generated, each entity stays attached to the persistence context and is held in memory until the transaction completes.
I would like the created entities to be eligible for garbage collection as soon as they have been inserted. Currently I see two solutions:
create a native insert query on the repository
inject the EntityManager into the service and call em.detach(car) after every insert
I tend to prefer the second option, as I would not have to maintain the native insert statement as the entity changes.
Can you confirm I am taking the correct approach, or suggest a better alternative?
You can find in the Hibernate documentation the way to insert data in batches:
"When making new objects persistent flush() and then clear() the session regularly in order to control the size of the first-level cache."
Thus the following approach is recommended:
Session session = sessionFactory.openSession();
Transaction tx = session.beginTransaction();
for (int i = 0; i < 100000; i++) {
    session.save(new Car());
    if (i % 20 == 0) {
        session.flush();
        session.clear();
    }
}
tx.commit();
session.close();
You can also try the saveAndFlush(S entity) method from Spring Data JPA's JpaRepository instead of save().
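In JPA/Spring terms, the same flush-and-clear pattern could be sketched like this (the repository and entity names follow the question and are illustrative):

```java
@PersistenceContext
private EntityManager entityManager;

@Transactional
public void execute() {
    for (int i = 0; i < 100000; i++) {
        carRepository.save(new Car());
        if (i % 20 == 0) {          // match hibernate.jdbc.batch_size
            entityManager.flush();  // push the pending INSERTs to JDBC
            entityManager.clear();  // detach everything, freeing memory
        }
    }
}
```

Clearing the persistence context detaches all managed entities at once, which is usually simpler than calling em.detach(car) one entity at a time.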
I have a batch operation in which I have to either insert or update each record. I want to insert a large number of records, so I need to commit batch after batch:
1) insert if new
2) update if existing.
I can typically do it using:
Session session = sessionFactory.openSession();
Transaction tx = session.beginTransaction();
for (int i = 0; i < 100000; i++) {
    Customer customer = new Customer(.....);
    session.saveOrUpdate(customer);
    if (i % 20 == 0) { // 20, same as the JDBC batch size
        // flush a batch of inserts and release memory:
        session.flush();
        session.clear();
    }
}
tx.commit();
session.close();
The problem is that Hibernate generates a SELECT before each saveOrUpdate(), which seems to be an issue.
The primary key of the object is always populated before it is passed to Hibernate; it is never generated by Hibernate via a sequence or anything else.
How can I avoid this extra SELECT for each saveOrUpdate()?
I don't want to use a stored procedure.
These are the steps Hibernate takes to decide whether to update or insert a record into the database; saveOrUpdate() does the following:
if the object is already persistent in this session, do nothing
if another object associated with the session has the same identifier, throw an exception
if the object has no identifier property, save() it
if the object's identifier has the value assigned to a newly instantiated object, save() it
if the object is versioned by a <version> or <timestamp>, and the version property value is the same value assigned to a newly instantiated object, save() it
otherwise update() the object.
If Hibernate cannot decide from these rules which operation to perform, it issues a SELECT.
Coming to your question: try giving Hibernate a hint, for example via a version or timestamp field.
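For example, a version property lets Hibernate recognize new instances without querying (a sketch; the field names are illustrative):

```java
@Entity
public class Customer {
    @Id
    private Integer id; // assigned by the application, not generated

    // An unsaved instance has version == null, so saveOrUpdate() can
    // decide to save() it without a SELECT against the database first.
    @Version
    private Integer version;
}
```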
Credits: JBoss Hibernate docs, Stack Overflow.
I am currently working on a Java EJB project being deployed to Weblogic 10.3.3. We are using JPA 1.0 with Hibernate 3.4 as the implementor. We are also using the Oracle10g Dialect.
The issue we are running into involves the SQL Hibernate generates when attempting to lock a row for update.
We execute a query:
Query q = entityManager.createNamedQuery("findMyObject");
MyHibernateObject myObject= (MyHibernateObject ) q.getSingleResult();
And then lock that object with:
entityManager.lock(myObject, LockModeType.WRITE);
This act of locking generates the query:
SELECT myObject FROM myTable FOR UPDATE NOWAIT
What I want it to generate is:
SELECT myObject FROM myTable FOR UPDATE
This would enable other threads to query for this object without hitting org.hibernate.exception.LockAcquisitionException, and to simply wait their turn or let the EJB transaction time out.
So knowing all this, can I force Hibernate to generate the SQL without the NOWAIT keyword?
I know that using Hibernate 3.6 and JPA 2.0 will allow this using a pessimistic lock but due to Weblogic only supporting JPA 1.0 our hands are tied.
Ideally I want to avoid writing our own retry and/or timeout mechanism by handling the exception when all I need is to just augment the SQL that Hibernate is generating when the EntityManager creates the lock.
OK, we are using a workaround until we update to WebLogic 10.3.4.
Here it is in case someone else stumbles upon this:
SessionImpl session = (SessionImpl)entityManager.getDelegate();
session.lock(myObject, LockMode.UPGRADE);
This of course breaks from the JPA standard, in that we are exposing Hibernate's implementation and using the Hibernate session directly.
But it will generate a
SELECT myObject FOR UPDATE
instead of
SELECT myObject FOR UPDATE NOWAIT
Hope this helps someone.
Use the code below to skip a locked row. This is the equivalent of:
select * from student where student_id = n for update nowait
The findStudent method throws an error if the row is already locked. If there is no error, call the updateStudent method to update the student entity; otherwise log the error for audit.
@Override
public Student findStudent(final Long studentId) {
    TypedQuery<Student> query = getEntityManager().createQuery(
            "from Student s where s.studentId = :studentId", Student.class);
    query.setParameter("studentId", studentId);
    query.setLockMode(LockModeType.PESSIMISTIC_WRITE);
    query.setHint(JAVAX_PERSISTENCE_LOCK_TIMEOUT, ZERO_NUMBER);
    return query.getSingleResult();
}
@Override
@Transactional(readOnly = false, propagation = Propagation.REQUIRED)
public void updateStudent(Student student) {
    makePersistent(student);
    getEntityManager().flush();
}
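A caller might combine the two methods like this (a sketch; the exception types shown are what a zero lock timeout typically produces, and the logger is assumed to exist):

```java
public void updateIfNotLocked(final Long studentId) {
    try {
        Student student = findStudent(studentId); // fails fast if the row is locked
        updateStudent(student);
    } catch (PessimisticLockException | LockTimeoutException e) {
        // Another transaction holds the row lock; log for audit and skip.
        LOGGER.warn("Student {} is locked, skipping update", studentId, e);
    }
}
```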
I've created a Hibernate project using the Spring Template Project. Two domain objects, a JUnit test, app-context.xml and persistence-context.xml were created. Now I noticed this line:
<jdbc:embedded-database id="dataSource"></jdbc:embedded-database>
and assume that the following happens:
A default HSQL db is used.
The two created models, Order.java and Item.java, automatically get in-memory tables T_ORDER and T_ITEM, mapped as per the annotations on the objects. Inside the auto-created classes, one of the test methods is as follows:
@Test
@Transactional
public void testSaveAndGet() throws Exception {
    Session session = sessionFactory.getCurrentSession();
    Order order = new Order();
    order.getItems().add(new Item());
    session.save(order);
    session.flush();
    // Otherwise the query returns the existing order
    // (and we didn't set the parent in the item)...
    session.clear();
    Order other = (Order) session.get(Order.class, order.getId());
    assertEquals(1, other.getItems().size());
    assertEquals(other, other.getItems().iterator().next().getOrder());
}
Questions ...
Am I correct to think that the in-memory tables are created from the domain models (Order/Item) and mapped, and that session.flush() therefore synchronizes the objects to the physical (in-memory) tables?
Are these tables auto-mapped? Because if I do the following
session.save(order);
session.flush();
session.clear();
Order other = (Order) session
        .createQuery("from T_ORDER where ORDER_ID =: orderid")
        .setLong("orderid", order.getId())
        .uniqueResult();
I get an exception:
org.hibernate.hql.ast.QuerySyntaxException: \
T_ORDER is not mapped [from T_ORDER where ORDER_ID =: orderid]
...
If these tables are not mapped automatically, how is flushing working in the first place?
Table creation is a feature of Hibernate (and other JPA providers). It takes place when the application/test starts and has nothing to do with any particular query: even if you only start your test, with Hibernate running and configured, it can create the tables.
Whether Hibernate creates tables, drops old ones, and so on depends on its configuration: the property hibernate.hbm2ddl.auto controls what Hibernate does at startup. For example, the value update will add missing tables and columns.
More details can be found in the documentation.
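For example, in the persistence unit's properties (the values shown are the standard ones):

```properties
# other values: create, create-drop, validate; leave unset to do nothing
hibernate.hbm2ddl.auto=update
```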
Your exception
When you use Hibernate and write Hibernate query statements, you have to use HQL, not SQL. The main difference is that HQL is based on the classes, not on the tables. So in your case you must not use T_ORDER but Order (the same goes for the id: use the property/field name, not the column name).
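So the query from the question, rewritten as HQL against the entity (assuming the identifier property is named id):

```java
Order other = (Order) session
        .createQuery("from Order o where o.id = :orderid")
        .setLong("orderid", order.getId())
        .uniqueResult();
```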