I would like to improve the insert performance of Hibernate in my Java EJB application deployed on WildFly 10.0.
I am trying to perform batch inserts with Hibernate, but with my code the insert is slower than without batching.
Here is the method that performs the insert. It takes a List of customers and persists them; the if clause is what makes the difference.
public List<Customer> insertCustomerList(List<Customer> cusList)
{
    try {
        int batchSize = 25;
        for ( int i = 0; i < cusList.size(); ++i ) {
            em.persist( cusList.get(i) );
            if ( i > 0 && i % batchSize == 0 ) {
                System.out.println("FLUSHFG");
                // flush a batch of inserts and release memory
                em.flush();
                em.clear();
            }
        }
    } catch (RuntimeException e) {
        throw e;
    }
    return cusList;
}
In my opinion it should be faster with the flush and clear, but it is much slower than without!
In my WildFly container I cannot open a new session or a new transaction, because I get an error.
Can you tell me how to manage batch inserts with WildFly so that inserting many large entities becomes faster rather than slower?
In my persistence.xml I have this property:
<property name="hibernate.jdbc.batch_size" value="25"/>
Thanks!
First, thanks for your time.
I am trying to insert data into the database with JPA (Spring Boot); the project uses Oracle.
Currently, inserting 5000 records takes a long time with repository.save(...) or repository.saveAll(...).
I tried batch_size, but it does not seem to work (does it perhaps not work for Oracle?).
Configuration code:
Properties properties = new Properties();
properties.setProperty("hibernate.ddl-auto", "none");
properties.setProperty("hibernate.dialect", "org.hibernate.dialect.Oracle12cDialect");
properties.setProperty("hibernate.show_sql", "true");
properties.put("hibernate.jdbc.batch_size", 5);
properties.put("hibernate.order_inserts", true);
properties.put("hibernate.order_updates", true);
setJpaProperties(properties);
As a workaround, I build a SQL query that inserts several rows in a single executed statement:
INSERT ALL INTO table(...)...
I hope there is a better and more efficient way.
So, can you give me any solution?
Thank you so much!
How about:
batch_size: 1000
When the entity count reaches 1000, call repository.saveAndFlush();
then move on to the next batch (a sketch follows below).
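A minimal sketch of that chunked idea, using save() plus an explicit flush() every 1000 entities; the JpaRepository for DemoEntity and the class and field names below are assumptions, not part of the original code:

import java.util.List;

import org.springframework.data.jpa.repository.JpaRepository;

public class DemoBatchSaver {

    // Assumed repository; any JpaRepository<DemoEntity, Long> would work here.
    private final JpaRepository<DemoEntity, Long> demoRepository;

    public DemoBatchSaver(JpaRepository<DemoEntity, Long> demoRepository) {
        this.demoRepository = demoRepository;
    }

    // Save in chunks of 1000 and flush after each chunk, as suggested above.
    public void saveInChunks(List<DemoEntity> entities) {
        final int chunkSize = 1000; // keep in line with hibernate.jdbc.batch_size
        for (int i = 0; i < entities.size(); i++) {
            demoRepository.save(entities.get(i));
            if ((i + 1) % chunkSize == 0) {
                demoRepository.flush(); // push the current chunk to the database
            }
        }
        demoRepository.flush(); // flush any remaining entities
    }
}

Note that flush() pushes the pending statements but does not clear the persistence context, so for very large lists the EntityManager approach below is still preferable.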
Another option is to call the EntityManager's persist directly when saving the batch, like this:
private static final int BATCH_COUNT = 1000; // align with hibernate.jdbc.batch_size

public int saveDemoEntities(List<DemoEntity> demoEntities) {
    int count = 0;
    for (DemoEntity o : demoEntities) {
        entityManager.persist(o);
        count++;
        if (count % BATCH_COUNT == 0) {
            // flush a batch of inserts and release memory
            entityManager.flush();
            entityManager.clear();
        }
    }
    // flush the remaining entities
    entityManager.flush();
    entityManager.clear();
    return count;
}
Well, I'm trying to do a batch insert in JPA, but I don't think it works.
My method is this:
public void saveBatch(List<? extends AbstractBean> beans) {
    try {
        begin();
        logger.info("Saving " + beans.size() + " bean(s) in batch");
        for (int i = 0; i < beans.size(); i++) {
            if (i % 50 == 0) {
                entityManager.flush();
                entityManager.clear();
            }
            entityManager.merge(beans.get(i));
        }
        commit();
    } catch (Exception e) {
        logger.error("An error occurred while trying to save the batch. ORIGINAL MSG: "
                + e.getMessage());
        rollback();
        throw new DAOException("An error occurred while trying to save the batch");
    }
}
My idea is that every 50 rows Hibernate will issue:
insert into tableA values (),(),()...
But watching the log I see one INSERT per merge() call, like this:
insert into tableA values ()
insert into tableA values ()
insert into tableA values ()
insert into tableA values ()
What is wrong? Is this correct?
Hibernate does not enable batching by default. You will want to consider the following settings (I think at least batch_size is required to get any batch inserts/updates to work):
hibernate.jdbc.batch_size
A non-zero value enables use of JDBC2 batch updates by Hibernate. Recommended values are between 5 and 30.
hibernate.jdbc.batch_versioned_data
Set this property to true if your JDBC driver returns correct row counts from executeBatch(). It is usually safe to turn this option on. Hibernate will then use batched DML for automatically versioned data. Defaults to false. Allowed values: true | false.
hibernate.order_updates (similarly, hibernate.order_inserts)
Forces Hibernate to order SQL updates by the primary key value of the items being updated. This will result in fewer transaction deadlocks in highly concurrent systems. Allowed values: true | false.
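For example, using the same Properties-based style as the configuration shown earlier (only a sketch; the values are suggestions, not requirements):

import java.util.Properties;

Properties properties = new Properties();
// A non-zero batch size is what actually turns JDBC batching on.
properties.setProperty("hibernate.jdbc.batch_size", "30");
// Safe if the JDBC driver reports correct row counts from executeBatch().
properties.setProperty("hibernate.jdbc.batch_versioned_data", "true");
// Order statements so consecutive inserts/updates for the same table can share a batch.
properties.setProperty("hibernate.order_inserts", "true");
properties.setProperty("hibernate.order_updates", "true");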
I am new to Hibernate and I have a doubt about Hibernate batch processing. I read some tutorials on Hibernate batch processing, and they said:
Session session = sessionFactory.openSession();
Transaction tx = session.beginTransaction();
for ( int i = 0; i < 100000; i++ )
{
    Employee employee = new Employee(.....);
    session.save(employee);
}
tx.commit();
session.close();
Hibernate will cache all the persisted objects in the session-level cache, and ultimately your application will fall over with an OutOfMemoryException somewhere around the 50,000th row. You can resolve this problem by using batch processing with Hibernate, like this:
Session session = sessionFactory.openSession();
Transaction tx = session.beginTransaction();
for ( int i = 0; i < 100000; i++ )
{
    Employee employee = new Employee(.....);
    session.save(employee);
    if ( i % 50 == 0 )
    {   // Same as the JDBC batch size
        // flush a batch of inserts and release memory:
        session.flush();
        session.clear();
    }
}
tx.commit();
session.close();
My doubt is: instead of initializing the session outside, why can't we initialize it inside the for loop, like this:
Session session = null;
Transaction tx = session.beginTransaction();
for ( int i = 0; i < 100000; i++ )
{
    session = sessionFactory.openSession();
    Employee employee = new Employee(.....);
    session.save(employee);
}
tx.commit();
session.close();
Is this the correct way or not? Can anyone suggest the correct way?
No. Don't initialize the session in the for loop; every time you start a new session you're starting a new batch (so with your approach the batch size is effectively one, i.e. no batching at all). Also, your approach would be much slower. That is why the first example has
if ( i % 50 == 0 ) {
    // flush a batch of inserts and release memory:
    session.flush();
    session.clear();
}
that is what "flush a batch of inserts and release memory" was for.
Batch processing in Hibernate means dividing one huge task into several smaller tasks.
When you call session.save(obj), Hibernate actually caches that object in memory (the object is not yet written to the database) and saves it to the database when you commit the transaction, i.e. when you call transaction.commit().
Let's say you have millions of records to insert; calling session.save(obj) for all of them would consume a lot of memory and eventually result in an OutOfMemoryException.
Solution:
Create smaller batches and save them to the database.
if ( i % 50 == 0 ) {
    // flush a batch of inserts and release memory:
    session.flush();
    session.clear();
}
Note:
In the code above, session.flush() flushes, i.e. actually writes the objects to the database, and session.clear() releases the memory occupied by those objects, for a batch of size 50.
Batch processing allows you to optimize writing data.
However, the usual advice of flushing and clearing the Hibernate Session is incomplete.
You need to commit the transaction at the end of the batch to avoid long-running transactions which can hurt performance and, if the last item fails, undoing all changes is going to put a lot of pressure on the DB.
Therefore, this is how you should do batch processing:
int entityCount = 50;
int batchSize = 25;
EntityManager entityManager = entityManagerFactory().createEntityManager();
EntityTransaction entityTransaction = entityManager.getTransaction();
try {
    entityTransaction.begin();
    for (int i = 0; i < entityCount; i++) {
        if (i > 0 && i % batchSize == 0) {
            entityTransaction.commit();
            entityTransaction.begin();
            entityManager.clear();
        }
        Post post = new Post(
            String.format("Post %d", i + 1)
        );
        entityManager.persist(post);
    }
    entityTransaction.commit();
} catch (RuntimeException e) {
    if (entityTransaction.isActive()) {
        entityTransaction.rollback();
    }
    throw e;
} finally {
    entityManager.close();
}
According to the Hibernate docs, the best way to do a bulk insert is this:
Session session = sessionFactory.openSession();
Transaction tx = session.beginTransaction();
for ( int i = 0; i < 100000; i++ ) {
    Customer customer = new Customer(.....);
    session.save(customer);
    if ( i % 20 == 0 ) { // 20, same as the JDBC batch size
        // flush a batch of inserts and release memory:
        session.flush();
        session.clear();
    }
}
tx.commit();
session.close();
Actually, I'm using Hibernate with Spring (@Transactional), so I don't use Session or the flush() and clear() methods.
I have a major performance problem: 1 hour to insert 7400 rows...
I think Spring is mismanaging the session and disconnecting from the database between calls to the DAO class's methods.
How can I check that?
I have code that looks like this:
Session session = sessionFactory.openSession();
Transaction tx = session.beginTransaction();
int i = 0;
try {
    for ( Customer customer : customers ) {
        i++;
        session.update(customer);
        if ( i % 200 == 0 ) { // 200, same as the JDBC batch size
            // flush a batch of updates and release memory:
            session.flush();
            session.clear();
        }
    }
} catch (DataException e) {
    // TODO want to know the customer id here!
}
tx.commit();
session.close();
Say, at some point session.flush() raises a DataException because one of the fields in that batch of 200 customers did not fit the database column size. Nothing wrong with that; the data can be corrupt, and that is acceptable in this case. BUT, I really need to know the id of the customer that failed. The database returns a meaningless error message that does not state the parameters of the statement, etc. The caught exception also does not say which customer failed; it only contains the SQL statement text, looking like 'update Customer set name=?'.
Can I somehow determine it using the Hibernate session? Does it store information anywhere about the last entity it tried to save?
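One possible way to narrow it down (just a sketch, not a built-in Hibernate feature): keep the entities of the current batch in a separate list, and when flush() throws, log their ids; the failing row is necessarily one of them. The getId() accessor and the log variable are assumptions about the Customer class and the surrounding code:

List<Customer> currentBatch = new ArrayList<>();
int i = 0;
for ( Customer customer : customers ) {
    i++;
    session.update(customer);
    currentBatch.add(customer);
    if ( i % 200 == 0 ) {
        try {
            session.flush();
            session.clear();
            currentBatch.clear();
        } catch (DataException e) {
            // The failing row is one of the customers flushed in this batch.
            for ( Customer c : currentBatch ) {
                log.error("flush failed, candidate customer id=" + c.getId());
            }
            throw e; // a Session that has thrown an exception should be discarded
        }
    }
}

To pinpoint the exact row, the logged candidates can then be retried one at a time in a fresh session, where the batch size is effectively 1.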