Running multiple JPA transactions in parallel

I have two (or more) Java threads creating, updating and deleting entities in a MySQL database using JPA. To achieve this I have a PersistenceLayer class that creates the EntityManager and provides save, update and delete methods for all my entities, which look like this:
public void saveEntity(Entity entity) {
    manager.getTransaction().begin();
    manager.persist(entity);
    manager.getTransaction().commit();
}
public void saveEntity2(Entity2 entity) {
    manager.getTransaction().begin();
    manager.persist(entity);
    manager.getTransaction().commit();
}
If one thread enters saveEntity and the other enters saveEntity2 at the same time, both try to get the transaction from the same EntityManager, which fails. However, I always thought that the underlying database is able to manage multiple transactions, especially if they work on different rows or even different tables. Of course I could synchronize the blocks, but that would mean only one DB connection can be used at a time, which will not scale to multiple users creating several threads.
What wrong assumption am I making here? Is it possible to submit multiple transactions to a database via JPA and let the DB handle concurrency issues?

EntityManager is not intended to be used by multiple threads. You need to obtain separate instances of EntityManager for each thread.
Actually, if you use EJB or Spring you can use a transaction-scoped EntityManager, which can be used from multiple threads (it's a proxy which delegates actual work to separate thread-bound instances of EntityManager), but I think it's not your case.
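As a rough illustration of "one instance per thread", here is a plain-Java sketch using ThreadLocal. FakeEntityManager, PerThread and PerThreadDemo are hypothetical names invented for this example; in real code the factory passed to PerThread would presumably be something like emf::createEntityManager, and release() would also close the EntityManager.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Stand-in for a non-thread-safe resource such as an EntityManager (hypothetical class).
class FakeEntityManager {
    private static final AtomicInteger COUNTER = new AtomicInteger();
    final int id = COUNTER.incrementAndGet();
}

// Hands each thread its own instance, created lazily on first use.
class PerThread<T> {
    private final ThreadLocal<T> holder;

    PerThread(Supplier<T> factory) {
        this.holder = ThreadLocal.withInitial(factory);
    }

    T get() {
        return holder.get();
    }

    // Call when the thread's unit of work is finished.
    void release() {
        holder.remove();
    }
}

class PerThreadDemo {
    // Starts `threads` threads and returns the distinct instance ids they observed.
    static Set<Integer> idsSeenBy(int threads) {
        PerThread<FakeEntityManager> managers = new PerThread<>(FakeEntityManager::new);
        Set<Integer> seen = ConcurrentHashMap.newKeySet();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                // Two calls on the same thread return the same instance.
                FakeEntityManager first = managers.get();
                FakeEntityManager second = managers.get();
                if (first == second) {
                    seen.add(first.id);
                }
                managers.release();
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return seen;
    }
}
```

With three worker threads, three distinct instances are created, so no two threads ever share one.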

To handle concurrency issues when an object is processed across multiple transactions, a lock can be acquired on the object. Either optimistic or pessimistic locking can be used, with the appropriate lock mode.
Locking a versioned object with entityManager.lock(entity, LockModeType.READ) ensures that dirty reads and non-repeatable reads of that entity are prevented. (Since JPA 2.0, READ is a synonym for OPTIMISTIC.)
LockModeType.WRITE (a synonym for OPTIMISTIC_FORCE_INCREMENT) additionally forces an increment of the entity's version column.
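To make the role of the version column more concrete, here is a small in-memory sketch of the idea behind optimistic locking: an update is accepted only if the row has not changed since the writer read it. VersionedRow and VersionedStore are hypothetical names for this example, and this is only a model of the idea, not how a JPA provider is implemented.

```java
import java.util.concurrent.atomic.AtomicReference;

// Immutable snapshot of a row: a value plus the version it was read at (hypothetical type).
final class VersionedRow {
    final long version;
    final String value;

    VersionedRow(long version, String value) {
        this.version = version;
        this.value = value;
    }
}

// In-memory stand-in for one database row guarded by a version column.
class VersionedStore {
    private final AtomicReference<VersionedRow> row =
            new AtomicReference<>(new VersionedRow(0, "initial"));

    VersionedRow read() {
        return row.get();
    }

    // Succeeds only if nobody has updated the row since `expected` was read;
    // on success the version is incremented, as a version column would be.
    boolean update(VersionedRow expected, String newValue) {
        VersionedRow next = new VersionedRow(expected.version + 1, newValue);
        return row.compareAndSet(expected, next);
    }
}
```

If two writers read the same snapshot, the first update wins and the second is rejected. In JPA the rejected writer would typically see an OptimisticLockException and have to retry or give up.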

Related

What is a transaction boundary in hibernate

I have two related questions.
Q1 What exactly is a transaction boundary in Hibernate/Spring Data JPA?
I am new to JPA, so please give a very basic example so I can understand. I have read multiple blogs but it is still not very clear.
Q2 On top of that, what does this mean:
In hibernate, persist() method guarantees that it will not execute an INSERT statement if it is called outside of transaction boundaries, save() method does not guarantee the same.
What is inside and outside of a transaction boundary, and how are statements executed outside the boundaries?

A transaction is a unit of work that is either executed completely or not at all.
Transactions are fairly simple to use in a typical relational database.
You start a transaction by modifying some data: every modification starts a transaction, and you typically can't avoid it. You end the transaction by executing a commit or a rollback.
Before your transaction is finished, your changes can't be seen by other transactions (there are exceptions, variations and details). If you roll back your transaction, all your changes in the database are undone.
If you commit, your changes become visible to other transactions, i.e. to other users connected to the same database. Implementations vary, among many other things, in whether changes become visible only to new transactions or also to transactions that are already running.
A transaction in JPA is a database transaction plus additional stuff.
You can start and end a transaction by getting a Transaction object and calling methods on it. But that is rarely done by hand anymore, since it is error-prone. Instead you annotate methods with @Transactional: entering the method starts a transaction and exiting the method ends it.
The details are taken care of by Spring.
The tricky part with JPA's transactions is that within a JPA transaction JPA might (and will) choose to delay or even avoid read and write operations as long as possible. For example, when you load an entity and then load it again in the same JPA transaction, JPA won't load it from the database again but will return the same instance it returned for the first load. If you want to learn more about this, I recommend looking into JPA's first-level cache.
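The "same instance on the second load" behaviour can be pictured as an identity map that lives as long as the transaction. The sketch below is a plain-Java model of that idea only; IdentityMap and its loader function (which stands in for the SQL SELECT) are hypothetical names, not part of JPA.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Sketch of a per-transaction identity map: each id is loaded at most once.
// Deliberately not thread-safe, much like a persistence context.
class IdentityMap<K, V> {
    private final Map<K, V> loaded = new HashMap<>();
    private final Function<K, V> loader;   // stands in for the database read
    private int loaderCalls = 0;

    IdentityMap(Function<K, V> loader) {
        this.loader = loader;
    }

    V find(K id) {
        V cached = loaded.get(id);
        if (cached == null) {
            // First request for this id: hit the "database" and remember the instance.
            loaderCalls++;
            cached = loader.apply(id);
            loaded.put(id, cached);
        }
        return cached;
    }

    int loaderCalls() {
        return loaderCalls;
    }
}
```

Calling find twice with the same id returns the very same object and runs the loader only once, which is roughly what you observe when calling find on an EntityManager twice inside one transaction.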

A transaction boundary is where the transaction starts or is committed or rolled back.

When a transaction is started, the transaction context is bound to the current thread. So regardless of how many endpoints and channels you have in your Message flow your transaction context will be preserved as long as you are ensuring that the flow continues on the same thread. As soon as you break it by introducing a Pollable Channel or Executor Channel or initiate a new thread manually in some service, the Transactional boundary will be broken as well.
Other people have asked about this as well, so it is worth looking up.
If you do not understand something write to me again more accurately and I will explain.
I really hope I helped!
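The point about the transaction context being bound to the current thread can be illustrated with a plain ThreadLocal. This is only a sketch: ThreadBoundContext is a hypothetical stand-in for whatever holder the framework uses, and the "transaction" is just a name.

```java
import java.util.concurrent.atomic.AtomicReference;

class ThreadBoundContext {
    // Hypothetical stand-in for a framework's thread-bound transaction holder.
    private static final ThreadLocal<String> CURRENT_TX = new ThreadLocal<>();

    static void begin(String name) {
        CURRENT_TX.set(name);
    }

    static String current() {
        return CURRENT_TX.get();
    }

    static void end() {
        CURRENT_TX.remove();
    }

    // Returns what a freshly started thread sees while the caller has a transaction bound.
    static String seenFromNewThread() {
        AtomicReference<String> seen = new AtomicReference<>("unset");
        Thread other = new Thread(() -> seen.set(String.valueOf(current())));
        other.start();
        try {
            other.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen.get();
    }
}
```

After begin("tx-1"), current() on the same thread returns "tx-1", while the new thread sees nothing (the string "null" here). That is the sense in which handing work to another thread breaks the boundary.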

Spring @Transactional and @Async usage for database operations

In a Spring application, when we receive a message, a @Service persist bean calls the database operation to insert it into the database, and in parallel another @Service parses and processes the message. The persist service uses @Transactional. To make the flow run in parallel, is it advisable to add @Async to the persist service?
Additionally, there is an @Aspect on each save method called by the persist service, for logging and auditing.
Is @Async advisable for database operations?
Does @Async create table locks?

All that @Async does is cause the methods of the annotated component to be executed on another thread, which it takes from a pool (the pool can be specified, so you can give some operations a dedicated pool).
@Async itself doesn’t do anything to lock database tables, or anything else database-related. If you want database-level locking you will have to implement that through some other means. If you want the call to use a transaction, you have to put the @Transactional annotation on the component being called asynchronously. That transaction will be separate from the caller's transaction. Of course the transaction can cause database locking, depending on the isolation level and the database implementation.
It’s tricky to use @Async with database work. One pitfall occurs with JPA persistent entities passed across threads, when they have a lazy property that gets realized in the new thread (where the proxy is now invalid because it can’t get to the EntityManager from the old thread). It’s safer if the things passed between threads are immutable.
@Async adds complexity and is hard to reason about. There are opportunities for race conditions and deadlocks, where bad things can happen if you don’t get it exactly right, and you can’t count on testing to uncover the issues. It’s working without a net: if you want any infrastructure to help with exception handling, retries, or other recovery, you will have to provide it yourself.
So no, I wouldn’t necessarily call it advisable. It's a good capability to have in your toolbox that might be helpful for a few isolated cases, but pervasive usage would seem like a bad thing. There are alternatives if you’re looking for ways to persist data without blocking.
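To show what "executed on another thread from a pool" amounts to, here is a plain-Java sketch using CompletableFuture and a dedicated executor, with an immutable object as the only thing handed across threads. ParsedMessage and AsyncPersistSketch are hypothetical names; no Spring or JPA is involved, and the "save" is just a string.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;

// Immutable message passed across threads (hypothetical type); no lazy state to go stale.
final class ParsedMessage {
    final String id;
    final String body;

    ParsedMessage(String id, String body) {
        this.id = id;
        this.body = body;
    }
}

class AsyncPersistSketch {
    // Runs the "persist" step on a pool thread and reports which thread did the work.
    static String persistAsync(ParsedMessage message, ExecutorService pool) {
        CompletableFuture<String> result = CompletableFuture.supplyAsync(
                // A real implementation would open its own transaction here,
                // separate from any transaction of the caller.
                () -> Thread.currentThread().getName() + " saved " + message.id,
                pool);
        // Blocks only so the sketch can return the result to the caller.
        return result.join();
    }
}
```

If the pool's thread is named "persist-pool", the call returns "persist-pool saved m1" for a message with id "m1", which makes the thread hop visible. Exception handling and retries are left out on purpose; as noted above, you would have to supply them yourself.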

Using the same EntityManager across multiple threads

I have a JPA environment with an application-managed EntityManager. I manually create the EntityManagerFactory and create the EntityManager from that. I would like to use the same EntityManager across multiple threads. The documentation says that EntityManager is not thread-safe, but all of my operations will be reads only and no writes will occur through the EntityManager. I also have a timeout on the data in the cache to ensure consistency. In such a scenario, is it OK to use the same instance of the EntityManager across threads? Or can there be any side effects or wrong data when using the same EntityManager across threads?
Thanks

To be sure, just lock the EntityManager instance wherever you use it by using synchronized.
So instead of writing
    em.persist(...);
write
    synchronized (em) {
        em.persist(...);
    }
You can read up about the locking mechanism here.

Modifying hibernate entities from multiple threads

I've got a problem with understanding the details of thread safety in Hibernate.
I know that Hibernate Sessions are not by themselves thread safe, so I'm not going to access them from more than one thread. However, I can't find any information on the thread safety of Hibernate entities. Can I modify them in multiple threads, while they remain attached to the Session which was used to load them?
I won't be using lazy loading (I know it would lead to concurrency problems).
Entities will be properly synchronized and hibernate will access them via synchronized getters.
The scenario that I've envisioned:
Use a Hibernate Session to load entity A from database,
Subsequently, modify entity A from multiple threads, other than the thread in which the entity was loaded,
All the time entity A remains attached to the Session and is in persistent state,
Flush the Session so that modifications are synchronized with the database.
Entity A remains attached to the Session, so the cycle can repeat, with further modification and flushing.

That depends on the nature of the modifications. If you modify an entity by creating, persisting and associating another entity with it in another thread, then it will not work, because the other entity instance will be considered detached in the first thread.
Setting aside use cases like the one above, this should work in theory, but only if you don't use bytecode instrumentation for dirty checking. Hibernate will just check whether the objects are dirty when they need to be flushed; basically, it does not care how you modified the objects.
However, this is not recommended.
Firstly, it may not be compatible with future versions of Hibernate/JPA (there may be more restrictions preventing concurrent access to the entities).
Secondly, the workaround is fairly simple: just create DTOs for the data that you want to modify concurrently, submit them for processing, and update the entities when the processing is done. This way the code is clearer, there are no unexpected Hibernate thread-related complaints, and you keep the flexibility to use other useful features like lazy loading.
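A minimal plain-Java sketch of that DTO workaround might look as follows. Account stands in for a session-attached entity and BalanceChange for the DTO; both names, the doubling "computation" and the pool size of 4 are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Mutable stand-in for a session-attached entity (hypothetical class).
class Account {
    long balance;
}

// Immutable DTO describing one computed change.
final class BalanceChange {
    final long delta;

    BalanceChange(long delta) {
        this.delta = delta;
    }
}

class DtoWorkflow {
    // Computes changes on worker threads, then applies them on the calling thread only.
    static long process(Account account, List<Long> inputs) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Future<BalanceChange>> pending = new ArrayList<>();
            for (Long input : inputs) {
                // Workers never touch the entity; they only produce DTOs.
                Callable<BalanceChange> task = () -> new BalanceChange(input * 2);
                pending.add(pool.submit(task));
            }
            for (Future<BalanceChange> future : pending) {
                // Applied by the thread that owns the entity (and its Session).
                account.balance += future.get().delta;
            }
            return account.balance;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

The entity is only ever read and written by the thread that calls process, so the Session never sees concurrent access, while the expensive part still runs in parallel.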

Hibernate entities are tightly integrated with the session, and I'm pretty confident that if the session isn't thread safe, the entities aren't either.
Even without considering the session, entities are just Java beans, which aren't thread safe. If, for example, you set a reference from anA to anB in one thread and change a property of anA in a second thread (or persist the entity), there is no guarantee that the second thread will ever see the changes from the first.
So no: Hibernate entities are not thread safe.

sync entitymanager from database

I have a Swing desktop application that uses an EntityManagerFactory. When several instances of this application run at the same time, the different entity managers hold old data that has since been modified by the others, and these changes are not visible until the next EntityManagerFactory... How can I sync the EntityManager with the database data at any time?

EntityManager instances should not be held for prolonged periods of time; instead, each should be used for a single unit of work and discarded afterwards.
That said, EntityManager has a refresh() method you can invoke to reload state of a particular entity from the database.
It also has a clear() method which will clear the persistence context completely of "old" data. You need to be careful with it, though - calling clear() without flush() will discard all pending updates.
