OpenJPA merging/persisting is very slow - java

I use OpenJPA 2.2.0 on WebSphere Application Server 8 with a MySQL 5.0 DB.
I have a list of objects that I want to merge into the DB.
The loop looks like this:
for (Object ob : list) {
    Long start = Calendar.getInstance().getTimeInMillis();
    em = factory.createEntityManager();
    em.getTransaction().begin();
    em.merge(ob);
    em.getTransaction().commit();
    em.close();
    Long end = Calendar.getInstance().getTimeInMillis();
    Long diff = end - start;
    LOGGER.info("Time: " + diff);
}
When I run this loop, merging one object takes about 300-600 milliseconds. If I remove the line "em.merge(ob);", iterating over one list element takes effectively 0 milliseconds.
So my question is: What can I do to improve the time to merge one object?
Thanks!

You can try starting the transaction before the iteration and committing it afterwards, so everything runs within a single transaction. That way you are essentially building a batch that is merged/persisted on commit.
You can also limit the number of objects processed in one batch and explicitly flush the changes to the database.
In your code you begin and commit a transaction in every iteration, and you also create and close the entity manager each time; for a large amount of data this hurts performance.
It would look something like the code below.
em = factory.createEntityManager();
em.getTransaction().begin();
int i = 0;
for (Object ob : list) {
    Long start = Calendar.getInstance().getTimeInMillis();
    em.merge(ob);
    Long end = Calendar.getInstance().getTimeInMillis();
    Long diff = end - start;
    LOGGER.info("Time: " + diff);
    i++;
    /* BATCH_SIZE is the number of entities
       that will be persisted/merged at once */
    if (i % BATCH_SIZE == 0) {
        em.flush();
        em.clear();
    }
}
em.getTransaction().commit();
em.close();
Here you can also roll back the whole transaction if any of the objects fails to persist/merge.
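For example, a minimal sketch of that rollback, reusing the variables from the snippet above (list and BATCH_SIZE are the same as before):
em = factory.createEntityManager();
try {
    em.getTransaction().begin();
    int i = 0;
    for (Object ob : list) {
        em.merge(ob);
        i++;
        if (i % BATCH_SIZE == 0) {
            em.flush();
            em.clear();
        }
    }
    em.getTransaction().commit();
} catch (RuntimeException e) {
    // if any merge fails, undo everything merged so far in this transaction
    if (em.getTransaction().isActive()) {
        em.getTransaction().rollback();
    }
    throw e;
} finally {
    em.close();
}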

Related

How to properly / efficiently manage entity manager JPA Spring @Transactional for large datasets?

I am attempting to insert ~57,000 entities into my database, but the insert method takes longer and longer as the loop progresses. I have implemented batches of 25, each time flushing, clearing, and (I'm pretty sure) closing the transaction, without success. Is there something else I need to do in the code below to maintain the insert rate? I feel like it should not take 4+ hours to insert 57K records.
[Migrate.java]
This is the main class that loops through 'Xaction' entities and adds 'XactionParticipant' records based off each Xaction.
// Use hibernate cursor to efficiently loop through all xaction entities
String hql = "select xaction from Xaction xaction";
Query<Xaction> query = session.createQuery(hql, Xaction.class);
query.setFetchSize(100);
query.setReadOnly(true);
query.setLockMode("xaction", LockMode.NONE);
ScrollableResults results = query.scroll(ScrollMode.FORWARD_ONLY);
int count = 0;
Instant lap = Instant.now();
List<Xaction> xactionsBatch = new ArrayList<>();
while (results.next()) {
    count++;
    Xaction xaction = (Xaction) results.get(0);
    xactionsBatch.add(xaction);
    // save new XactionParticipants in batches of 25
    if (count % 25 == 0) {
        xactionParticipantService.commitBatch(xactionsBatch);
        float rate = ChronoUnit.MILLIS.between(lap, Instant.now()) / 25f / 1000;
        System.out.printf("Batch rate: %.4fs per xaction\n", rate);
        xactionsBatch = new ArrayList<>();
        lap = Instant.now();
    }
}
xactionParticipantService.commitBatch(xactionsBatch);
results.close();
[XactionParticipantService.java]
This service provides a method annotated with "REQUIRES_NEW" in an attempt to close the transaction for each batch.
@Transactional(propagation = Propagation.REQUIRES_NEW)
public void commitBatch(List<Xaction> xactionBatch) {
    for (Xaction xaction : xactionBatch) {
        try {
            XactionParticipant xp = new XactionParticipant();
            // ... create xp based off Xaction info ...
            // Use native query for efficiency
            String nativeQueryStr = "INSERT INTO XactionParticipant .... xp info/data";
            Query q = em.createNativeQuery(nativeQueryStr);
            q.executeUpdate();
        } catch (Exception e) {
            log.error("Unable to update", e);
        }
    }
    // Clear just in case??
    em.flush();
    em.clear();
}
It is not clear what the root cause of your performance problem is: Java memory consumption or DB performance. Please check some thoughts below.
The following code does not actually optimize memory consumption:
String hql = "select xaction from Xaction xaction";
Query<Xaction> query = session.createQuery(hql, Xaction.class);
query.setFetchSize(100);
query.setReadOnly(true);
query.setLockMode("xaction", LockMode.NONE);
ScrollableResults results = query.scroll(ScrollMode.FORWARD_ONLY);
Since you are retrieving fully functional entities, those entities are stored in the persistence context (session-level cache). To free memory, you need to detach each entity once it has been processed (i.e. after xactionsBatch.add(xaction) or after // ... create xp based off Xaction info ...). Otherwise, at the end of processing you consume the same amount of memory as if you had done List<> results = query.getResultList();, and here I'm not sure which is better: consuming all the required memory at the start of the transaction and releasing the other resources, or keeping the cursor and JDBC connection open for 4 hours.
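A minimal sketch of that idea, applied to the scroll loop above (assumption: nothing lazy on xaction is needed once it has been handed off to the batch):
while (results.next()) {
    count++;
    Xaction xaction = (Xaction) results.get(0);
    xactionsBatch.add(xaction);
    if (count % 25 == 0) {
        xactionParticipantService.commitBatch(xactionsBatch);
        // detach the processed entities so the session-level cache stops growing
        for (Xaction processed : xactionsBatch) {
            session.evict(processed);
        }
        xactionsBatch = new ArrayList<>();
    }
}
Calling session.clear() after each batch would have the same effect, provided nothing else in the session needs to stay managed.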
The following code does not actually optimize JDBC interactions:
for (Xaction xaction : xactionBatch) {
    try {
        XactionParticipant xp = new XactionParticipant();
        // ... create xp based off Xaction info ...
        // Use native query for efficiency
        String nativeQueryStr = "INSERT INTO XactionParticipant .... xp info/data";
        Query q = em.createNativeQuery(nativeQueryStr);
        q.executeUpdate();
    } catch (Exception e) {
        log.error("Unable to update", e);
    }
}
Yes, in general JDBC should be faster than the JPA API, but that is not your case: you are inserting records one by one instead of using batch inserts. To take advantage of batching, your code should look like:
@Transactional(propagation = Propagation.REQUIRES_NEW)
public void commitBatch(List<Xaction> xactionBatch) {
    session.doWork(connection -> {
        String insert = "INSERT INTO XactionParticipant VALUES (?, ?, ...)";
        try (PreparedStatement ps = connection.prepareStatement(insert)) {
            for (Xaction xaction : xactionBatch) {
                ps.setString(1, "val1");
                ps.setString(2, "val2");
                ps.addBatch();
                ps.clearParameters();
            }
            ps.executeBatch();
        }
    });
}
BTW, Hibernate can do the same if hibernate.jdbc.batch_size is set to a large enough positive integer and the entities are properly designed (id generation is backed by a DB sequence and allocationSize is large enough).
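A minimal sketch of that setup; the property values, sequence name, and allocationSize below are illustrative, not taken from the question:
hibernate.jdbc.batch_size=50
hibernate.order_inserts=true
and on the entity (javax.persistence annotations):
@Id
@GeneratedValue(strategy = GenerationType.SEQUENCE, generator = "xp_seq")
@SequenceGenerator(name = "xp_seq", sequenceName = "XP_SEQ", allocationSize = 50)
private Long id;
With this in place, Hibernate groups the generated INSERT statements into JDBC batches instead of sending them one by one.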

Insert performance with JPA (Spring Boot) and an Oracle database

First of all, thanks for your time.
I am trying to insert data into the database with JPA (Spring Boot); the project uses Oracle.
Currently, inserting 5000 records takes a long time with repository.save(...) or repository.saveAll(...).
I tried batch_size, but it is not working (it looks like it does not work for Oracle?).
Code config:
Properties properties = new Properties();
properties.setProperty("hibernate.ddl-auto", "none");
properties.setProperty("hibernate.dialect", "org.hibernate.dialect.Oracle12cDialect");
properties.setProperty("hibernate.show_sql", "true");
properties.put("hibernate.jdbc.batch_size", 5);
properties.put("hibernate.order_inserts", true);
properties.put("hibernate.order_updates", true);
setJpaProperties(properties);
I also created an SQL query that inserts several rows in one statement:
INSERT ALL INTO table(...)...
I hope there is a better and more efficient way.
So, can you give me any solution?
Thank you so much!
How about:
batch_size: 1000;
when the entity count reaches 1000, call repository.saveAndFlush();
then move on to the next batch, as in the sketch below.
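One way to read that suggestion, as a minimal sketch (demoRepository is an assumed Spring Data JpaRepository and entities an assumed List<DemoEntity>; saveAll plus flush per chunk stands in for the saveAndFlush wording above):
int batchSize = 1000;
for (int from = 0; from < entities.size(); from += batchSize) {
    int to = Math.min(from + batchSize, entities.size());
    demoRepository.saveAll(entities.subList(from, to));
    demoRepository.flush(); // push this chunk to the DB before starting the next one
}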
Another option is to call EntityManager.persist() directly while saving the batch, like:
public int saveDemoEntities(List<DemoEntity> demoEntities) {
    long start = System.currentTimeMillis();
    int count = 0;
    for (DemoEntity o : demoEntities) {
        entityManager.persist(o);
        count++;
        if (count % BATCH_COUNT == 0) {
            entityManager.flush();
            entityManager.clear();
        }
    }
    entityManager.flush();
    entityManager.clear();
    return count;
}

What is the use of Hibernate batch processing

I am new to Hibernate and I have a doubt about Hibernate batch processing. I read some tutorials on Hibernate batch processing and they said:
Session session = sessionFactory.openSession();
Transaction tx = session.beginTransaction();
for (int i = 0; i < 100000; i++) {
    Employee employee = new Employee(.....);
    session.save(employee);
}
tx.commit();
session.close();
Hibernate will cache all the persisted objects in the session-level cache, and ultimately your application will fall over with an OutOfMemoryException somewhere around the 50,000th row. You can resolve this problem by using batch processing with Hibernate, like:
Session session = sessionFactory.openSession();
Transaction tx = session.beginTransaction();
for (int i = 0; i < 100000; i++) {
    Employee employee = new Employee(.....);
    session.save(employee);
    if (i % 50 == 0) {
        // Same as the JDBC batch size
        // flush a batch of inserts and release memory:
        session.flush();
        session.clear();
    }
}
tx.commit();
session.close();
My doubt is: instead of initializing the session outside, why can't we initialize it inside the for loop, like this?
Session session = null;
Transaction tx = session.beginTransaction();
for (int i = 0; i < 100000; i++) {
    session = sessionFactory.openSession();
    Employee employee = new Employee(.....);
    session.save(employee);
}
tx.commit();
session.close();
Is this the correct way or not? Can anyone suggest the correct way?
No. Don't initialize the session inside the for loop; every time you start a new session, you start a new batch (so your version has a batch size of one, i.e. it does not batch at all). Your version would also be much slower. That is why the first example has
if (i % 50 == 0) {
    // flush a batch of inserts and release memory:
    session.flush();
    session.clear();
}
That is what "flush a batch of inserts and release memory" refers to.
Batch processing in Hibernate means dividing one huge task into smaller ones.
When you call session.save(obj), Hibernate actually caches that object in memory (the object is not yet written to the database) and saves it to the database when you commit the transaction, i.e. when you call transaction.commit().
Let's say you have millions of records to insert; calling session.save(obj) for all of them would consume a lot of memory and eventually result in an OutOfMemoryException.
Solution: create batches of a smaller size and save them to the database.
if (i % 50 == 0) {
    // flush a batch of inserts and release memory:
    session.flush();
    session.clear();
}
Note: in the code above, session.flush() actually writes the objects to the database, and session.clear() frees the memory occupied by those objects, for each batch of size 50.
Batch processing allows you to optimize writing data.
However, the usual advice of flushing and clearing the Hibernate Session is incomplete.
You need to commit the transaction at the end of each batch to avoid long-running transactions, which can hurt performance; moreover, if the last item fails, undoing all the changes puts a lot of pressure on the DB.
Therefore, this is how you should do batch processing:
int entityCount = 50;
int batchSize = 25;
EntityManager entityManager = entityManagerFactory().createEntityManager();
EntityTransaction entityTransaction = entityManager.getTransaction();
try {
    entityTransaction.begin();
    for (int i = 0; i < entityCount; i++) {
        if (i > 0 && i % batchSize == 0) {
            entityTransaction.commit();
            entityTransaction.begin();
            entityManager.clear();
        }
        Post post = new Post(
            String.format("Post %d", i + 1)
        );
        entityManager.persist(post);
    }
    entityTransaction.commit();
} catch (RuntimeException e) {
    if (entityTransaction.isActive()) {
        entityTransaction.rollback();
    }
    throw e;
} finally {
    entityManager.close();
}

Delay during commit (using JPA/JTA)

I would like to ask for help with the following problem. I have this method:
String sql = "INSERT INTO table ...."
Query query = em.createNativeQuery(sql);
query.executeUpdate();
sql = "SELECT max(id) FROM ......";
query = em.createNativeQuery(sql);
Integer importId = ((BigDecimal) query.getSingleResult()).intValue();
for (EndurDealItem item : deal.getItems()) {
String sql2 = "INSERT INTO another_table";
em.createNativeQuery(sql2).executeUpdate();
}
After executing it, the data are not committed (it takes around 10 or 15 minutes until the data are committed). Is there any way to commit the data explicitly or to trigger a commit? And what causes the transaction to remain uncommitted for such a long time?
The reason we use native queries is that we are exporting the data over a shared interface and do not use the data afterwards.
I would like to mention that the transaction is container-managed (by Geronimo). The EntityManager is injected like this:
@PersistenceContext(unitName = "XXXX", type = PersistenceContextType.TRANSACTION)
private EntityManager em;
Explicitly commit the transaction:
EntityManager em = /* get an entity manager */;
em.getTransaction().begin();
// make some changes
em.getTransaction().commit();
This should work. The execution time of all the operations between .begin() and .commit() depends, of course, on the loop you are running, the number of rows you are inserting, the location of the database (network speed matters), and so on...
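Since the question mentions a container-managed JTA transaction (where em.getTransaction() is not available), one possible alternative is to switch the bean to bean-managed transactions and demarcate the commit yourself. This is only a sketch under that assumption; the class and method names are illustrative:
@Stateless
@TransactionManagement(TransactionManagementType.BEAN)
public class ExportBean {

    @PersistenceContext(unitName = "XXXX")
    private EntityManager em;

    @Resource
    private UserTransaction utx;

    public void doExport() throws Exception {
        utx.begin();
        em.joinTransaction();   // bind the injected entity manager to this transaction
        // ... run the native INSERTs from the question ...
        utx.commit();           // the data becomes visible to other sessions here
    }
}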

Hibernate is not deleting my objects. Why?

I have just set up a test that checks that I am able to insert entries into my database using Hibernate. The thing that drives me crazy is that Hibernate does not actually delete the entries, although it reports that they are gone!
The test below runs successfully, but when I check my DB afterwards, the entries that were inserted are still there! I even try to check it using assert (yes, I have -ea as a VM parameter). Does anyone have a clue why the entries are not deleted?
public class HibernateExportStatisticDaoIntegrationTest {

    HibernateExportStatisticDao dao;
    Transaction transaction;

    @Before
    public void setUp() {
        assert numberOfStatisticRowsInDB() == 0;
        dao = new HibernateExportStatisticDao(HibernateUtil.getSessionFactory());
    }

    @After
    public void deleteAllEntries() {
        assert numberOfStatisticRowsInDB() != 0;
        Session session = HibernateUtil.getSessionFactory().getCurrentSession();
        for (PersistableStatisticItem item : allStatisticItemsInDB()) {
            session.delete(item);
        }
        session.flush();
        assert numberOfStatisticRowsInDB() == 0;
    }

    @Test
    public void exportAllSavesEntriesToDatabase() {
        int expectedNumberOfStatistics = 20;
        dao.exportAll(StatisticItemFactory.createTestStatistics(expectedNumberOfStatistics));
        assertEquals(expectedNumberOfStatistics, numberOfStatisticRowsInDB());
    }

    private int numberOfStatisticRowsInDB() {
        return allStatisticItemsInDB().size();
    }

    @SuppressWarnings("unchecked")
    private List<PersistableStatisticItem> allStatisticItemsInDB() {
        Session session = HibernateUtil.getSessionFactory().getCurrentSession();
        transaction = session.beginTransaction();
        Query q = session.createQuery("FROM PersistableStatisticItem item");
        return q.list();
    }
}
The console is filled with
Hibernate: delete from UPTIME_STATISTICS where logDate=? and serviceId=?
but nothing has been deleted when I check it.
I guess it's related to inconsistent use of transactions (note that beginTransaction() in allStatisticItemsInDB() is called several times without corresponding commits).
Try to manage the transactions properly, for example like this:
Session session = HibernateUtil.getSessionFactory().getCurrentSession();
Transaction tx = session.beginTransaction();
for (PersistableStatisticItem item :
        session.createQuery("FROM PersistableStatisticItem item").list()) {
    session.delete(item);
}
session.flush();
assert session.createQuery("FROM PersistableStatisticItem item").list().size() == 0;
tx.commit();
See also:
13.2. Database transaction demarcation
I had the same problem, although I was not using a transaction at all. I was using a named query like this:
Query query = session.getNamedQuery(EmployeeNQ.DELETE_EMPLOYEES);
int rows = query.executeUpdate();
session.close();
It was returning 2 rows, but the database still had all the records. Then I wrapped the code above with this:
Transaction transaction = session.beginTransaction();
Query query = session.getNamedQuery(EmployeeNQ.DELETE_EMPLOYEES);
int rows = query.executeUpdate();
transaction.commit();
session.close();
Then it started working fine. I was using SQL Server, but I think with H2 the code above (without a transaction) would also work fine.
One more observation: for inserting and fetching records, using a transaction is not mandatory, but for deleting records we have to use a transaction. (Only tested on SQL Server.)
Can you post your DB schema and HBM or Fluent maps? One thing that got me a while back was that I had a ReadOnly() in my Fluent map. It never threw an error, and I too saw the "delete from blah where blahblah=..." statements in the logs.
