How to turn off select before SaveAll()? - java

I need to turn off the select that runs before saveAll() in a Spring Boot application with Hibernate and JPA, to boost performance with a high number of records.
I've found an approach with JPQL that performs well (delete + save of 10k records in 30s), but I'd like to stay with Hibernate and JPA.
What I need to do is deleteAll the records of one table, then saveAll the records from another one. When I do that the classic way (deleteAll(), findAll() and then saveAll()), saveAll() is slow because it issues a select for every record in the list before saving it.
I'd like to avoid executing all those selects before saving the records. Is that possible without using EntityManager or EntityManagerFactory?

Write two native queries, a DELETE and an INSERT ... SELECT, using the @Query annotation on your repository interface.
It's the best way to resolve these issues. If you only have to copy records from one table to another, there is no point in going through JPA and loading thousands of objects. Using findAll() could even throw out-of-memory errors.
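A minimal sketch of such a repository, assuming hypothetical tables source_record and target_record with matching columns (the entity, table and column names are illustrative, not from the question):

import org.springframework.data.jpa.repository.JpaRepository;
import org.springframework.data.jpa.repository.Modifying;
import org.springframework.data.jpa.repository.Query;
import org.springframework.transaction.annotation.Transactional;

public interface RecordCopyRepository extends JpaRepository<TargetRecord, Long> {

    // Bulk delete executed entirely on the database side; no entities are loaded.
    @Modifying
    @Transactional
    @Query(value = "DELETE FROM target_record", nativeQuery = true)
    void deleteAllRows();

    // Copy rows in a single statement, so nothing is materialized in Java.
    @Modifying
    @Transactional
    @Query(value = "INSERT INTO target_record (id, payload) SELECT id, payload FROM source_record", nativeQuery = true)
    int copyFromSource();
}

Calling deleteAllRows() followed by copyFromSource() replaces the deleteAll()/findAll()/saveAll() round trip with two statements and no per-record selects.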

Related

How to bulk delete records using temporary tables in Hibernate?

I have a question. Where did these methods go?
Dialect.supportsTemporaryTables();
Dialect.generateTemporaryTableName();
Dialect.dropTemporaryTableAfterUse();
Dialect.getDropTemporaryTableString();
I've tried to browse git history for Dialect.java, but no luck. I found that something like
MultiTableBulkIdStrategy was created but I couldn't find any example of how to use it.
To the point... I have legacy code (using Hibernate 4.3.11) which does a batch delete from multiple tables using a temporary table. Those tables may contain 1000 rows, but they may also contain 10 million rows. So, to make sure I don't kill the DB with some crazy delete, I create a temp table into which I put (using a select query with some condition) 1000 ids at a time, and then use this temp table to delete data from 4 tables. This runs in a while loop until all data matching the condition has been deleted.
The transaction is committed after each iteration.
To make it more complicated, this code has to run on top of MySQL, MariaDB, Oracle, PostgreSQL, SQL Server and H2.
It was done using native SQL, with the methods mentioned above. But now I can't find a way to refactor it.
My first try was to create a query using a nested select like this:
delete from TABLE where id in (select id from TABLE where CONDITION limit 1000)
but this is way slower, as the nested select has to run again for every delete, and limit is not supported in a nested select in HQL.
Any ideas or pointers?
Thanks.
The methods were present in version 4.3.11 but removed in version 5.0.0. It seems a bit unusual that they were removed rather than deprecated - the background is on this Jira ticket.
To quote from this:
Long term, I think the best approach is to remove the Dialect methods
intended to support temp tables in a piecemeal fashion and to make
MultiTableBulkIdStrategy be a fully self-contained contract.
The methods were removed in this commit.
So it seems that getDefaultMultiTableBulkIdStrategy() is the intended replacement for these methods, but I'm not entirely clear on how, as it currently has no Javadoc. I guess you could try to work it out from the source code... or, if all else fails, perhaps try to contact Steve Ebersole, who implemented the change?
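If MultiTableBulkIdStrategy turns out to be a dead end, one portable fallback that matches the loop described in the question is to keep the chunking in Java and bind each chunk of ids as a parameter list instead of writing it to a temporary table. A rough sketch (table, column and condition names are placeholders, not from the original code):

import java.util.Arrays;
import java.util.List;
import org.hibernate.Session;
import org.hibernate.Transaction;

// Select up to 1000 candidate ids, delete them from each table, commit, repeat.
void deleteInChunks(Session session) {
    List<?> ids;
    do {
        Transaction tx = session.beginTransaction();
        // setMaxResults() lets Hibernate apply the dialect-specific LIMIT/TOP/ROWNUM syntax,
        // which keeps this portable across the databases listed above.
        ids = session.createSQLQuery("select id from parent_table where some_condition = 1")
                .setMaxResults(1000)
                .list();
        if (!ids.isEmpty()) {
            for (String sql : Arrays.asList(
                    "delete from child_a where parent_id in (:ids)",
                    "delete from child_b where parent_id in (:ids)",
                    "delete from parent_table where id in (:ids)")) {
                session.createSQLQuery(sql).setParameterList("ids", ids).executeUpdate();
            }
        }
        tx.commit(); // commit after every chunk, as in the original code
    } while (!ids.isEmpty());
}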

Hibernate - Why shouldn't we use FETCH JOIN with scroll()?

I'm reading the Hibernate documentation (version 5.1), and I came across this sentence:
Fetch joins should not be used in paged queries (e.g. setFirstResult()
or setMaxResults()), nor should they be used with the scroll() or
iterate() features.
(http://docs.jboss.org/hibernate/orm/5.1/userguide/html_single/Hibernate_User_Guide.html#hql-explicit-join)
I've recently worked on a project with a big database (2 TB), and for many batch jobs we don't have any other solution than to use FETCH JOIN with the scroll() method (or we have to leave Hibernate and go back to plain SQL queries).
I understand that paged queries can't guarantee a complete result (when fetching a collection, for example).
But when using scroll() (which relies on database cursors, if I'm not mistaken), if we sort the results by the root entities' ids, I can't see why Hibernate couldn't guarantee the result's coherence and completeness.
What is the reason for that restriction?
What is the risk for the fetched data?
Is there a way to prevent it (like sorting by the root entities' ids)?
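For reference, a minimal sketch of the kind of query the question is about; the Parent/Child entity names are made up, and this is the pattern the documentation warns against rather than a recommendation:

import org.hibernate.ScrollMode;
import org.hibernate.ScrollableResults;

// With a fetch join, each Parent row is repeated in the JDBC result set once per
// joined Child row, so the cursor no longer advances one root entity at a time.
ScrollableResults results = session.createQuery(
        "select p from Parent p join fetch p.children order by p.id")
        .setReadOnly(true)
        .scroll(ScrollMode.FORWARD_ONLY);
while (results.next()) {
    Parent parent = (Parent) results.get(0);
    // process parent...
}
results.close();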

Does JPA load entities in JPQL, as it does when using entities in code?

I have two tables, Person and House; the mapping is one-to-one.
Now I have to set the address of both Person and House (which can currently differ) to the same value.
There are more than 5000 records. Which will be faster? Updating the entities one by one in code, e.g.
for (Long id : ids) {
    Person person = PersonDAO.find(id);   // loads the entity (one select per id)
    person.setAddress("abc");             // dirty-checked and flushed as one update per row
}
and then doing the same with House;
Or should I use JPQL to update both in two separate queries, e.g.
UPDATE Person p SET p.address = 'abc' WHERE p.id IN (.....ID QUERY)
My question is: which will be faster? Will the JPQL update have the same performance as the code above? Or should I use a native query so that the entities are NOT loaded, since I only care about performance?
Using the query will be faster (and much more memory efficient), as the persistence provider translates the JPQL query into a single native SQL statement. Also, if you use entities directly, the number of queries made against the database will be significantly higher (one select and one update for each and every row).
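A minimal sketch of that bulk update through the EntityManager (the address value and the ids list are placeholders, and the statement must run inside a transaction):

// One JPQL statement updates all matching rows without loading a single entity.
int updated = entityManager.createQuery(
        "UPDATE Person p SET p.address = :address WHERE p.id IN :ids")
        .setParameter("address", "abc")
        .setParameter("ids", ids)
        .executeUpdate();

Repeat the same for House. Note that bulk updates bypass the persistence context, so clear or refresh any Person/House instances that were already loaded.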
The native query will be slightly faster still, as it doesn't have to be translated.
If you want it to be even faster, you can use a PreparedStatement. With addBatch() you add a statement to the batch, and with executeBatch() you execute the whole batch, minimizing the number of round trips to the database.
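A short sketch of that JDBC batching, assuming a plain javax.sql.DataSource and a list of ids to update (table and column names are illustrative):

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.util.List;
import javax.sql.DataSource;

void updateAddresses(DataSource dataSource, List<Long> ids) throws Exception {
    try (Connection con = dataSource.getConnection();
         PreparedStatement ps = con.prepareStatement(
                 "UPDATE person SET address = ? WHERE id = ?")) {
        con.setAutoCommit(false);
        for (Long id : ids) {
            ps.setString(1, "abc");
            ps.setLong(2, id);
            ps.addBatch();     // queue the statement instead of executing it immediately
        }
        ps.executeBatch();     // send the whole batch in one round trip
        con.commit();
    }
}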

Java Persist without Entity

I am working on a Java EE application with 1000+ tables in the database, and now I have to query records based on parameters sent from the client.
Normally I would create one entity for each table, plus a DAO and service to do the query.
However, I run into two problems:
1. Number of tables
As I said, there are 1000+ tables with 40+ columns each; it would be a nightmare to create the entities one by one.
2. Schema updates
Even if I could generate the entities programmatically, the schema may change at some point, which is out of my control.
And in my application only read operations touch this data; no update, delete or create is required.
So I wonder if the following solution is possible:
1. Use Map instead of POJOs
Do not create POJOs at all; use a plain Map to hold the column names and values.
2. Row mapping
When querying with Hibernate, Spring JdbcTemplate or something else, use a mapper that turns each row into such a Map.
If yes, I would use ResultSetMetaData to detect each column's name, type and value:
ResultSetMetaData rmd = rs.getMetaData();
// row is assumed to be a Map<String, Object> holding the current row's values;
// note that JDBC column indexes are 1-based.
for (int i = 1; i <= rmd.getColumnCount(); i++) {
    String name = rmd.getColumnName(i);
    int type = rmd.getColumnType(i);   // a java.sql.Types constant
    if (type == Types.VARCHAR) {
        row.put(name, rs.getString(i));
    } else if (type == Types.INTEGER) {
        row.put(name, rs.getInt(i));
    }
    // ... other types
}
This looks like part of JPA's job; is there any library that can be used here?
If not, are there any other alternatives?
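If Spring is already on the classpath, one possible approach along these lines is JdbcTemplate's built-in column-map support; a minimal sketch (the SQL, table and parameter are placeholders):

import java.util.List;
import java.util.Map;
import org.springframework.jdbc.core.JdbcTemplate;

// Each row comes back as a Map<columnName, value>, so no per-table POJO is needed
// and schema changes only affect which keys appear in the map.
List<Map<String, Object>> rows = jdbcTemplate.queryForList(
        "select * from some_table where some_column = ?", clientParam);

for (Map<String, Object> row : rows) {
    Object value = row.get("some_column");   // column-name casing depends on the driver
    // ...
}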

How to make update faster with Hibernate while using huge number of records

I am facing an issue with the Hibernate update portion (Session.update()) on a huge number of records: it is becoming very slow, but there is no issue with the insert (Session.insert()) portion. Is there any way to speed up the update portion when we update lakhs (hundreds of thousands) of records? Is there any way to tune SQL Server so that the update becomes faster? When we add separate indexes on all the primary fields, the delete portion starts taking time. Is there a better way to tune SQL Server so that it performs well with inserts, deletes and updates?
Thank you,
Saif.
Do a batch update instead of an individual update for each record. This way you hit the database once per batch instead of once per record.
When you do a save, the data is simply inserted into the database, whereas when you update a record it first has to be looked up and then updated; that is why you are facing issues on updates and not on saves. When handling a huge number of records you can use Hibernate's batch processing to update them, as sketched below. Here is a good link for batch processing in Hibernate from TutorialsPoint:
http://www.tutorialspoint.com/hibernate/hibernate_batch_processing.htm
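A minimal sketch of that batch-processing pattern, assuming hibernate.jdbc.batch_size is set (e.g. to 50) in the configuration; the entity and setter are placeholders:

import org.hibernate.ScrollMode;
import org.hibernate.ScrollableResults;
import org.hibernate.Session;
import org.hibernate.Transaction;

int batchSize = 50;
Transaction tx = session.beginTransaction();
ScrollableResults rows = session.createQuery("from MyEntity")
        .scroll(ScrollMode.FORWARD_ONLY);
int count = 0;
while (rows.next()) {
    MyEntity entity = (MyEntity) rows.get(0);
    entity.setStatus("PROCESSED");       // dirty-checked, flushed as an UPDATE
    if (++count % batchSize == 0) {
        session.flush();                 // push the pending batch of updates
        session.clear();                 // detach processed entities to free memory
    }
}
tx.commit();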
There may be other solutions to this, but one way I know is:
Whenever you save or update an instance through the session (e.g. session.save(), session.update(), session.saveOrUpdate() etc.), Hibernate also processes the instance's FK associations.
So if your POJO has multiple FK associations, it will fire queries against those tables as well.
So instead of updating instances that way, I would suggest using an HQL bulk update (if it fits your requirement), as in the sketch below.
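A short sketch of such an HQL update; the entity, property and condition are placeholders, and it must run inside a transaction:

// A single bulk statement updates matching rows directly in the database,
// without loading the entities or touching their FK associations.
int updated = session.createQuery(
        "update MyEntity e set e.status = :newStatus where e.status = :oldStatus")
        .setParameter("newStatus", "PROCESSED")
        .setParameter("oldStatus", "PENDING")
        .executeUpdate();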
