How do I achieve a Hibernate batch insert with Spring transactional (@Transactional)? - java

I am using Spring 5.x and plain Hibernate in the backend. I want to transfer data from one database to another, 1000 records at a time as a batch. Both databases have the same tables. We have configured two transaction managers, one for each data source.
I have configured batch-size="1000" in the hbm file as below.
<hibernate-mapping>
    <class table="User" name="com.my.User" batch-size="1000">
        <id name="id" .../>
        <property name="name" ...>
        ...
I have two service classes annotated with @Transactional, implemented as below.
@Transactional(value = "oldDBTransactionManager", propagation = Propagation.SUPPORTS, readOnly = true)
public class GetUserServiceImpl extends HibernateDaoImpl implements GetUserService {
    ...
    public List<User> getAllUsers(int firstResult, int maxResults) {
        DetachedCriteria criteria = DetachedCriteria.forClass(User.class);
        criteria.addOrder(Order.asc("id"));
        // findByCriteria returns an untyped list, so cast it before iterating as User
        @SuppressWarnings("unchecked")
        List<User> results = (List<User>) getHibernateTemplate().findByCriteria(criteria, firstResult, maxResults);
        if (!results.isEmpty()) {
            for (User user : results) {
                saveUserService.saveUser(user);
            }
        }
        return results;
    }
    ...
}
@Transactional(value = "newDBTransactionManager", propagation = Propagation.SUPPORTS, readOnly = true)
public class SaveUserServiceImpl extends HibernateDaoImpl implements SaveUserService {
    ...
    @Transactional(value = "newDBTransactionManager", propagation = Propagation.REQUIRED, readOnly = false)
    @Qualifier("newDBTransactionManager")
    public void saveUser(User user) {
        getHibernateTemplate().save(user);
    }
    ...
I am calling the above getAllUsers service method from outside the service layer (outside any transaction), and it internally calls the saveUser service method (which uses another transaction manager) as below:
public class UserProcessActivity {
    private int maxUsersPerIteration = 50;
    ...
    private void processUsers() {
        List<User> usersList;
        long counter = 0;
        int firstResult = 0;
        usersList = getUserService.getAllUsers(firstResult, maxUsersPerIteration);
        while (!usersList.isEmpty()) {
            // some logic...
            firstResult += usersList.size();
            usersList = getUserService.getAllUsers(firstResult, maxUsersPerIteration);
        }
    }
    ...
My question is whether the above code performs batch inserts or not.
If yes, how many records are inserted per batch: 1000 (the batch-size declared in the hbm file) or 50 (the value of maxUsersPerIteration passed to the service method)?
If it does not insert in batches, how can I achieve batch inserts with Spring @Transactional and Hibernate?
How can I enable logging of the batch insert queries to see what the batch operation actually does?
Thanks in advance.
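A few points that seem relevant here. The batch-size attribute on a class mapping controls batch fetching when entities are loaded, not JDBC insert batching; insert batching is governed by the hibernate.jdbc.batch_size setting. Also, as far as I can tell, getAllUsers runs without a transaction (SUPPORTS, called from outside), so every saveUser call opens and commits its own transaction and rows are written one at a time. The usual pattern is to save a whole chunk inside one transaction and to flush and clear the session every N rows. The sketch below shows only that counting logic in plain Java: ChunkedWriter and its two hooks are hypothetical names, and in real code the hooks would be session.save(entity) and session.flush() followed by session.clear().

```java
import java.util.function.Consumer;

/** Hypothetical helper: hands each item to a save hook and runs a flush hook every batchSize items. */
final class ChunkedWriter<T> {
    private final int batchSize;
    private final Consumer<T> onSave;  // assumption: stand-in for session.save(entity)
    private final Runnable onFlush;    // assumption: stand-in for session.flush(); session.clear();
    private int pending = 0;

    ChunkedWriter(int batchSize, Consumer<T> onSave, Runnable onFlush) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.batchSize = batchSize;
        this.onSave = onSave;
        this.onFlush = onFlush;
    }

    /** Saves one item and flushes as soon as a full batch has accumulated. */
    void write(T item) {
        onSave.accept(item);
        pending++;
        if (pending == batchSize) {
            flush();
        }
    }

    /** Flushes whatever is left over after the last full batch. */
    void finish() {
        if (pending > 0) {
            flush();
        }
    }

    private void flush() {
        onFlush.run();
        pending = 0;
    }
}
```

The batch size given to the writer would normally match hibernate.jdbc.batch_size, and the whole loop would run inside a single transaction on the target database.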

Related

What caused the PersistenceException with the message "detached entity passed to persist"?

I'm using:
Quarkus with JPA (javax)
Postgres 11 database
I have:
An Entity
@Entity
@Table(name = "MyEntityTable")
@NamedQuery(name = MyEntity.DOES_EXIST, query = "SELECT x FROM MyEntity x WHERE x.type = :type")
public class MyEntity {
    public static final String DOES_EXIST = "MyEntity.DoesExists";
    @Id
    @SequenceGenerator(name = "myEntitySequence", allocationSize = 1)
    @GeneratedValue(generator = "myEntitySequence")
    private long id;
    @Column(name = "type")
    private String type;
}
A repository
@ApplicationScoped
@Transactional(Transactional.TxType.SUPPORTS)
public class MyEntityRepository {
    @Inject
    EntityManager entityManager;
    @Transactional(Transactional.TxType.REQUIRED)
    public void persist(final MyEntity entity) {
        entityManager.persist(entity);
    }
    public boolean doesExist(final String type) {
        final TypedQuery<MyEntity> query = entityManager
            .createNamedQuery(MyEntity.DOES_EXIST, MyEntity.class)
            .setParameter("type", type);
        return query.getResultList().size() > 0;
    }
}
A test with two variations
Variation 1
@QuarkusTest
@QuarkusTestResource(DatabaseResource.class) // used to set up a docker container with postgres db
public class MyEntityRepositoryTest {
    private static final MyEntity ENTITY = entity();
    @Inject
    MyEntityRepository subject;
    @Test
    public void testDoesExist() {
        subject.persist(ENTITY);
        final boolean actual = subject.doesExist("type");
        assertTrue(actual);
    }
    @Test
    public void testDoesExist_notMatching() {
        subject.persist(ENTITY);
        final boolean actual = subject.doesExist("another_type");
        assertFalse(actual);
    }
    private static MyEntity entity() {
        final MyEntity result = new MyEntity();
        result.setType("type");
        return result;
    }
}
When I execute this test class (both tests), I get the following exception the second time the persist method is called:
javax.persistence.PersistenceException: org.hibernate.PersistentObjectException: detached entity passed to persist com.mypackage.MyEntity
...
Variation 2
I removed the constant ENTITY from the test class; instead, I now call the entity() method inside the tests, like:
...
subject.persist(entity());
...
in both places. Now the exception is gone and everything is fine.
Question
Can someone explain to me why this is the case (why variation 2 works and variation 1 does not)?
https://vladmihalcea.com/jpa-persist-and-merge/
The persist operation must be used only for new entities. From JPA perspective, an entity is new when it has never been associated with a database row, meaning that there is no table record in the database to match the entity in question.
testDoesExist is executed: ENTITY is saved to the database and ENTITY.id is set to 1.
testDoesExist_notMatching is executed: calling persist on ENTITY raises the error because the entity already exists in the database and has an id assigned.
The simplest fix is to call entity() twice, as in your variation 2.
But don't forget that the records will still exist after a test has run and might affect your other test cases. You might want to consider cleaning up the data in an @After method or, if you intend to use this entity in multiple test cases, putting the persist code into a @BeforeClass method.
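The effect of the shared constant can be reproduced without JPA at all. The sketch below is a plain-Java stand-in, not Hibernate: FakeEntity and FakeStore are hypothetical classes, and FakeStore.persist only mimics the rule that an instance which already carries a generated id is rejected.

```java
/** Hypothetical stand-in for an entity with a generated id (0 means "not persisted yet"). */
final class FakeEntity {
    long id;
    final String type;

    FakeEntity(String type) {
        this.type = type;
    }
}

/** Hypothetical stand-in for a persistence context; it mimics only the "new entities only" rule. */
final class FakeStore {
    private long nextId = 1;

    void persist(FakeEntity entity) {
        if (entity.id != 0) {
            // The instance was given an id by an earlier persist call, so it is no longer "new"
            throw new IllegalStateException("detached entity passed to persist");
        }
        entity.id = nextId++;
    }
}

final class SharedEntityDemo {
    public static void main(String[] args) {
        FakeStore store = new FakeStore();
        FakeEntity shared = new FakeEntity("type");
        store.persist(shared); // first test: the shared instance gets its id
        try {
            store.persist(shared); // second test reuses the same instance
        } catch (IllegalStateException e) {
            System.out.println("second persist failed: " + e.getMessage());
        }
        store.persist(new FakeEntity("type")); // a fresh instance is accepted
    }
}
```

A static final field is created once per test class, so the id assigned during the first test is still on the object when the second test runs.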

Using a Hibernate filter with Spring Boot JPA

I have found the need to limit the size of a child collection by a property in the child class.
I have the following after following this guide:
@FilterDef(name = "dateFilter", parameters = @ParamDef(name = "fromDate", type = "date"))
public class SystemNode implements Serializable {
    @Getter
    @Setter
    @Builder.Default
    // "startTime" is a property in HealthHistory
    @Filter(name = "dateFilter", condition = "startTime >= :fromDate")
    @OneToMany(mappedBy = "system", targetEntity = HealthHistory.class, fetch = FetchType.LAZY)
    private Set<HealthHistory> healthHistory = new HashSet<HealthHistory>();
    public void addHealthHistory(HealthHistory health) {
        this.healthHistory.add(health);
        health.setSystem(this);
    }
}
However, I don't really understand how to toggle this filter when using Spring Data JPA. I am fetching my parent entity like this:
public SystemNode getSystem(UUID uuid) {
    return systemRepository.findByUuid(uuid)
        .orElseThrow(() -> new EntityNotFoundException("Could not find system with id " + uuid));
}
And this method in turn calls the Spring-supported repository interface:
public interface SystemRepository extends CrudRepository<SystemNode, UUID> {
    Optional<SystemNode> findByUuid(UUID uuid);
}
How can I make this filter play nicely with Spring? I would like to activate it programmatically when I need it, not globally. There are scenarios where it would be viable to disregard the filter.
I am using Spring Boot 1.3.5.RELEASE, I cannot update this at the moment.
Update and solution
I tried the following as suggested to me in the comments above.
@Autowired
private EntityManager entityManager;
public SystemNode getSystemWithHistoryFrom(UUID uuid) {
    Session session = entityManager.unwrap(Session.class);
    Filter filter = session.enableFilter("dateFilter");
    filter.setParameter("fromDate", new DateTime().minusHours(4).toDate());
    SystemNode systemNode = systemRepository.findByUuid(uuid)
        .orElseThrow(() -> new EntityNotFoundException("Could not find system with id " + uuid));
    session.disableFilter("dateFilter");
    return systemNode;
}
I also had the wrong type in the FilterDef annotation:
@FilterDef(name = "dateFilter", parameters = @ParamDef(name = "fromDate", type = "timestamp"))
I changed from date to timestamp.
This returns the correct number of objects, verified against the database.
Thank you!

Spring JPA Hibernate : slow SELECT query

I have run into an optimisation problem and I can't figure out why my query is so slow.
Here is my entity:
@Entity
@Table(name = "CLIENT")
public class Client {
    private static final long serialVersionUID = 1L;
    @Id
    @Column(name = "CLIENT_ID")
    @SequenceGenerator(name = "ID_GENERATOR", sequenceName = "CLIENT_S", allocationSize = 1, initialValue = 1)
    @GeneratedValue(strategy = GenerationType.SEQUENCE, generator = "ID_GENERATOR")
    private Long id;
    @Column(name = "LOGIN")
    private String login;
    @Column(name = "PASSWORD")
    private String password;
And the DAO
@NoRepositoryBean
public interface ClientDao extends JpaRepository<Client, Long>, JpaSpecificationExecutor<Client> {
    Client findByPasswordAndLogin(@Param("login") String customerLogin, @Param("password") String customerHashedPassword);
}
When the method findByPasswordAndLogin is executed, it takes about 200 ms to complete (measured both through JUnit tests and with JProfiler).
Here is the Hibernate query:
Hibernate: select clientx0_.CLIENT_ID as CLIENT_ID1_4_, clientx0_.LOGIN as LOGIN9_4_, clientx0_.PASSWORD as PASSWORD10_4_, clientx0_.STATUT as STATUT13_4_ from CLIENT clientx0_ where clientx0_.PASSWORD=? and clientx0_.LOGIN=?
When I execute the SQL query manually on the database, it takes only 3 ms:
select * from CLIENT where PASSWORD='xxxxx' and LOGIN='yyyyyyyy'
We have 4000 clients in our development environment, and more than a million in production.
Here is the context:
JDK 8
Spring 4.1.6.RELEASE + JPA + Hibernate
Oracle Database 10
Any ideas?
I have tested different types of DAO (I am not publishing the code here because it is too messy):
With Hibernate : ~200ms
With (Injected) Spring JDBCTemplate and RowMapper : ~70 ms
With Java Statement : ~2 ms
With Java OracleStatement : ~5 ms
With Java PreparedStatement : ~100ms
With Java PreparedStatement adjusted with Fetch size = 5000 : ~50ms
With Java OraclePreparedStatement : ~100ms
With Java OraclePreparedStatement adjusted with PreFetch size = 5000 : ~170ms
Notes :
DAO injected by Spring instead of new ClientDao() : +30ms lost (-sick-)
Connection time to DB : 46ms
I could use:
A Java Statement with manually sanitized fields.
A pre-established connection on application launch.
No Spring injection.
But:
That is not really secure or safe.
It is fast for a small number of rows, but slow to map the ResultSet to entities for a large number of rows (I also have this use case).
So:
The Spring JdbcTemplate with a RowMapper seems to be the best solution to improve performance in specific cases.
We can still keep the SQL queries safe.
But a specific RowMapper has to be written to transform the ResultSet into an entity.
Example of Spring JdbcTemplate:
@Repository
public class ClientJdbcTemplateDao {
    private final Logger logger = LoggerFactory.getLogger(ClientJdbcTemplateDao.class);
    private JdbcTemplate jdbcTemplate;
    @Autowired
    public void setDataSource(DataSource dataSource) {
        this.jdbcTemplate = new JdbcTemplate(dataSource);
    }
    public List<Client> find() {
        List<Client> c = this.jdbcTemplate.query("SELECT login FROM Client WHERE LOGIN='xxxx' AND PASSWORD='xxx'", new ClientRowMapper());
        return c;
    }
}
Example of Client RowMapper
public class ClientRowMapper implements RowMapper<Client> {
    @Override
    public Client mapRow(ResultSet arg0, int arg1) throws SQLException {
        // Implement the ResultSet-to-entity conversion here.
        // Sample:
        // String login = arg0.getString("LOGIN");
        // Client client = new Client(login);
        // return client;
    }
}
This can probably be improved; any suggestion is welcome.

Hibernate EntityManager keeps old data

I have a Java EE application with Hibernate. I want to implement a feature that updates one of the existing rows in the database every minute. I have the following classes:
@Singleton
@Startup
public class TimerRunnerImpl implements TimerRunner {
    @EJB
    private WorkProcessor workProcessor;
    private String jobId;
    @Timeout
    @AccessTimeout(value = 90, unit = TimeUnit.MINUTES)
    @TransactionAttribute(value = TransactionAttributeType.NEVER)
    public void doProcessing(Timer timer) {
        jobId = workProcessor.doWork(jobId);
    }
    // other methods: startTimer, etc.
}
@Stateless
public class WorkProcessorImpl implements WorkProcessor {
    @EJB
    private MyEntityDao myEntityDao;
    @TransactionAttribute(TransactionAttributeType.REQUIRES_NEW)
    @Override
    public String doWork(String jobId) {
        if (jobId == null) {
            MyEntity myEntity = myEntityDao.oldestEntityToProcess();
            String uuid = UUID.randomUUID().toString();
            myEntity.setJobId(uuid);
            myEntityDao.update(myEntity); // this invokes merge()
            return uuid;
        } else {
            // line below can never find entity, although there is one in DB
            MyEntity myEntity = myEntityDao.findByJobId(jobId);
            myEntity.setSomeProperty("someValue");
            // some other updates
            myEntityDao.update(myEntity); // this invokes merge()
            return jobId;
        }
    }
}
The first run of doWork updates MyEntity with the job ID. This is persisted to the database: I can query it manually from SQL Developer. The second run always fails to find the entity by job ID. If I try to retrieve it by entity_id in debug mode, the object retrieved from the EntityManager still has the previous job ID value.
This is not a cache problem: I have tried evicting the whole cache at the beginning of each run and the results are identical.
As far as I understand, the transaction wraps workProcessor.doWork(jobId). This is confirmed by the fact that when this method returns I can see the changes in the DB. But why does the EntityManager keep my unmodified object and return it when I query for it?

How to optimize hibernate method call in loop?

I have a java web application built using spring+hibernate.
I have code like this:
for (Account account : accountList) {
    Client client = clientService.findById(account.getFkClient()); // fkClient is foreign key to Client
    if (client != null) {
        ...
        anObject.setName(client.getName());
        anObject.setAccountNo(account.getAccountNo());
        ...
    } else {
        ...
        anObject.setAccountNo(account.getAccountNo());
        ...
    }
    ...
}
accountList is a List of Account entities retrieved through a Hibernate call. Inside the for loop, a Client entity is retrieved for each account through a Hibernate call inside the clientService.findById method.
These are the classes involved in the call:
public class ClientService implements IClientService {
    private IClientDAO clientDAO;
    ...
    @Override
    public Client findById(Long id) throws Exception {
        return clientDAO.findById(id);
    }
}
public class ClientDAO extends AbstractHibernateDAO<Client, Long> implements IClientDAO {
    @Override
    public Client findById(Long id) throws Exception {
        return super.findById(id);
    }
}
public class AbstractHibernateDAO<T, Y extends Serializable> extends HibernateDaoSupport {
    protected Class<T> domainClass = getDomainClass();
    private Class<T> getDomainClass() {
        if (domainClass == null) {
            ParameterizedType thisType = (ParameterizedType) getClass().getGenericSuperclass();
            domainClass = (Class<T>) thisType.getActualTypeArguments()[0];
        }
        return domainClass;
    }
    public T findById(final Y id) throws SystemException {
        return (T) this.execute(new HibernateCallback<T>() {
            @Override
            public T doInHibernate(Session session) throws HibernateException, SQLException {
                return (T) session.get(domainClass, id);
            }
        });
    }
}
Note: clientService and clientDAO are Spring beans.
My question is how to optimize the clientService.findById call inside the loop with Hibernate. I feel that the findById call makes the loop slower.
The accountList usually contains 7000+ records, so I need something like a pre-compiled query mechanism, just like PreparedStatement in JDBC. Is it possible to do this with Hibernate?
Note: the code above has been simplified by removing unrelated parts, and the method, variable and class names are fictitious for privacy reasons. If you find a typo, please let me know in the comment section, since I typed the code manually.
In Hibernate/JPA you can write queries in Hibernate Query Language or JPA Query Language and create NamedQueries. A NamedQuery is compiled when the server starts, so you can think of it as a kind of prepared statement.
You can try to write an HQL query that gets all the entity instances with a single query.
I will give you an example in JPQL, but you can write it in HQL as well.
@NamedQueries({
    @NamedQuery(name = "QUERY_BY_ID",
        query = "SELECT se FROM SomeEntity se WHERE se.id IN (:idList)")
})
class SomeEntity {
}
class SomeEntityDao {
    public List<SomeEntity> findIdList(List<Long> idList) {
        Query query = entityManager.createNamedQuery("QUERY_BY_ID");
        query.setParameter("idList", idList);
        return query.getResultList();
    }
}
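With 7000+ accounts the id list passed to the IN clause can get large, and some databases limit the number of elements in one IN list (Oracle, for example, allows at most 1000). A common workaround is to split the ids into chunks and run the named query once per chunk. The helper below is a plain-Java sketch of that splitting step only; IdChunks and partition are hypothetical names, and the right chunk size is an assumption to adjust for your database.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that splits an id list into chunks small enough for one IN (...) clause. */
final class IdChunks {
    private IdChunks() {
    }

    static <T> List<List<T>> partition(List<T> ids, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<List<T>> chunks = new ArrayList<>();
        for (int from = 0; from < ids.size(); from += chunkSize) {
            int to = Math.min(from + chunkSize, ids.size());
            // Copy the sublist so each chunk is independent of the source list
            chunks.add(new ArrayList<>(ids.subList(from, to)));
        }
        return chunks;
    }
}
```

Each chunk would then be passed to findIdList and the results collected into one list.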
I found the best solution. I put the query that selects columns from the Account and Client tables joined together into a view (VIEW_ACCOUNT_CLIENT), then I made an entity class (AccountClientView) for the view and fetched it using Hibernate. The result is impressive: it boosts performance drastically. With the original code the loop could take about 15-20 minutes to finish, but using the view it takes only 8-10 seconds.
@Entity
@Table(name = "VIEW_ACCOUNT_CLIENT")
public class AccountClientView implements Serializable {
    ...
}
It's not clear what you want to achieve. I wouldn't make service calls in a loop. Why don't you use a NamedQuery?
Retrieve all Clients attached to the given Accounts, then iterate over that list of Clients.
SELECT c FROM Client c JOIN c.account a WHERE a.id IN (:accountIds)
But it really depends on the business requirement!
It's also not clear to me why you don't just call:
Client client = account.getClient();
You might want to load your accountList with the clients already fetched. Either use eager fetching or a fetch join. If the Account entity does not contain a Client, you should have a very good reason for it.
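If the mapping cannot be changed so that Account references Client directly, another option is to load the needed clients once and index them by id, so the loop does a map lookup instead of one query per account. The sketch below assumes the clients were already fetched with a single query; ClientRow and ClientIndex are hypothetical names, since the real Client entity is not shown in full.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical minimal client holder standing in for the real Client entity. */
final class ClientRow {
    final Long id;
    final String name;

    ClientRow(Long id, String name) {
        this.id = id;
        this.name = name;
    }
}

final class ClientIndex {
    private ClientIndex() {
    }

    /** Indexes clients by id so each account can find its client without another query. */
    static Map<Long, ClientRow> byId(List<ClientRow> clients) {
        Map<Long, ClientRow> index = new HashMap<>();
        for (ClientRow client : clients) {
            index.put(client.id, client);
        }
        return index;
    }
}
```

Inside the loop, index.get(account.getFkClient()) would then replace clientService.findById(...); a null result corresponds to the else branch of the original code.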
