I'm having trouble understanding the details of thread safety in Hibernate.
I know that Hibernate Sessions are not by themselves thread safe, so I'm not going to access them from more than one thread. However, I can't find any information on the thread safety of Hibernate entities. Can I modify them in multiple threads, while they remain attached to the Session which was used to load them?
I won't be using lazy loading (I know it would lead to concurrency problems).
Entities will be properly synchronized, and Hibernate will access them via synchronized getters.
The scenario that I've envisioned:
Use a Hibernate Session to load entity A from database,
Subsequently, modify entity A from multiple threads, other than the thread in which the entity was loaded,
All the time entity A remains attached to the Session and is in persistent state,
Flush the Session so that modifications are synchronized with the database.
Entity A remains attached to the Session, so the cycle can repeat, with further modification and flushing.
That depends on the nature of the modifications. If you modify an entity by creating, persisting and associating another entity with it in another thread, then it will not work, because the other entity instance will be considered detached in the first thread.
Setting aside use cases like the one above, this should work in theory, but only if you don't use bytecode instrumentation for dirty checking. Hibernate just checks whether the objects are dirty when they need to be flushed; basically, it does not care how you modified them.
However, this is not recommended.
Firstly, it may not be compatible with future versions of Hibernate/JPA (there may be more restrictions preventing concurrent access to the entities).
Secondly, the workaround is fairly simple: create DTOs for the data that you want to modify concurrently, submit them for processing, and update the entities when the processing is done. This way the code is clearer, there are no unexpected Hibernate thread-related complaints, and you keep the flexibility to use other useful features like lazy loading.
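A minimal sketch of that DTO workaround, using plain Java only. AccountEntity is a hypothetical stand-in for a Hibernate-managed entity, and AccountSnapshot, DtoWorkaround and the "one percent per worker" calculation are made up for illustration; nothing here is Hibernate API. The point is that worker threads only ever see the immutable DTO, and the entity is modified solely by the thread that owns the Session.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-in for a managed entity; only the session thread touches it.
final class AccountEntity {
    private long balance;
    AccountEntity(long balance) { this.balance = balance; }
    long getBalance() { return balance; }
    void setBalance(long balance) { this.balance = balance; }
}

// Immutable DTO: safe to hand to worker threads.
final class AccountSnapshot {
    final long balance;
    AccountSnapshot(long balance) { this.balance = balance; }
}

final class DtoWorkaround {
    // Copies entity state into a DTO, computes partial results on worker threads,
    // then applies the combined result to the entity on the calling thread.
    static void applyInterest(AccountEntity entity, int workers) {
        final AccountSnapshot snapshot = new AccountSnapshot(entity.getBalance());
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<Long>> parts = new ArrayList<>();
            for (int i = 0; i < workers; i++) {
                // Each task reads only the immutable snapshot (illustrative: 1% of the balance).
                Callable<Long> task = () -> snapshot.balance / 100;
                parts.add(pool.submit(task));
            }
            long delta = 0;
            for (Future<Long> part : parts) {
                delta += part.get(); // waits for the worker and makes its result visible
            }
            // Only the calling (session-owning) thread mutates the entity;
            // in a real application the Session flush would follow here.
            entity.setBalance(snapshot.balance + delta);
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("processing failed", e);
        } finally {
            pool.shutdown();
        }
    }
}
```

With an entity holding 1000 and four workers, each worker contributes 10, so the entity ends up at 1040, and no worker ever held a reference to the entity itself.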
Hibernate entities are tightly integrated with the session, and I'm pretty confident that if the session isn't thread safe, the entities aren't either.
Even without considering the session, entities are just Java beans, which aren't thread safe. If you, for example, set a reference from anA to anB in one thread and change a property of anA in a second thread (or persist the entity), there is no guarantee that the second thread will ever see the changes from the first.
So NO: Hibernate Entities are not thread safe.
Suppose I have an application deployed on 3 nodes, and on each node one thread is trying to fetch and update the same data record at the same time. Data can only be fetched and updated from the central database. The method that makes the database connection is thread safe. In this scenario, is it possible for the data to be modified in a way that leads to inconsistency? If yes, how can we solve this problem?
You're conflating several completely different things:
In general, to protect "data" between "threads", one would use a "lock". One way of protecting data in Java is with synchronized.
In general, a "thread" running on one "node" cannot - and will not - interfere with in-memory data objects being manipulated by some thread on a different "node".
"Database access" brings completely different issues to the table. In particular, read up about isolation levels.
Finally, IF you're doing "database updates" and IF "concurrency" is an issue ... then you probably want to perform your update(s) within a DB transaction.
A "transaction" is ACID:
Atomicity
Consistency
Isolation
Durability
In a Spring application, when we receive a message, a @Service persist bean performs the database insert while, in parallel, another @Service parses and processes the message. In this case the persist service uses @Transactional. To make the flow parallel, is it advisable to add @Async to the persist service?
Additionally, there is an @Aspect on each save method called by the persist service, for logging and audit.
Is @Async advisable for database operations?
Does @Async create table locks?
All that @Async does is cause the methods of the annotated component to be executed on another thread, which it gets from a pool (the pool can be specified, so you can choose for some operations to have a dedicated pool).
@Async itself doesn’t do anything to lock database tables, or anything else database-related. If you want database-level locking you will have to implement that through some other means. If you want the call to use a transaction you have to use the @Transactional annotation on the component being called asynchronously. The transaction will be separate from the caller's transaction. Of course the transaction can possibly cause database locking depending on the isolation level and database implementation.
It’s tricky to use @Async with database work. One pitfall occurs with JPA persistent entities passed across threads, when they have a lazy property that gets realized in the new thread (where the proxy is now invalid because it can’t get to the EntityManager from the old thread). It’s safer if the things passed between threads are immutable.
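To make the threading part concrete, here is a rough plain-Java analogy of what an @Async call amounts to: the work is handed to a pool thread and the caller gets a future back. This is only a sketch and not how Spring implements it. IncomingMessage, AsyncSketch, persistAsync and the "persist-worker" thread name are all hypothetical, and the "persist" step is just a string instead of a database call. Note that the object crossing the thread boundary is immutable, as suggested above.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Immutable message: safe to pass between threads.
final class IncomingMessage {
    final String id;
    final String payload;
    IncomingMessage(String id, String payload) {
        this.id = id;
        this.payload = payload;
    }
}

final class AsyncSketch {
    // Dedicated pool for persistence work; daemon threads so the JVM can exit freely.
    private static final ExecutorService PERSIST_POOL =
            Executors.newFixedThreadPool(2, runnable -> {
                Thread thread = new Thread(runnable, "persist-worker");
                thread.setDaemon(true);
                return thread;
            });

    // Stand-in for a persist method: runs on the pool and reports which thread did the work.
    static CompletableFuture<String> persistAsync(IncomingMessage message) {
        return CompletableFuture.supplyAsync(
                () -> Thread.currentThread().getName() + " stored " + message.id,
                PERSIST_POOL);
    }
}
```

Calling AsyncSketch.persistAsync(msg) returns immediately; the result is produced on a "persist-worker" thread, never on the caller's thread. Any transaction started inside that task would therefore be independent of whatever transaction the caller has open.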
@Async adds complexity and is hard to reason about. There are opportunities for race conditions and deadlocks where if you don’t get it exactly right then bad things can happen, and you can’t count on testing to uncover the issues. It’s working without a net; if you want any infrastructure to help with exception handling, retries, or other recovery, you will have to provide it yourself.
So no, I wouldn’t necessarily call it advisable. It's a good capability to have in your toolbox that might be helpful for a few isolated cases, but pervasive usage would seem like a bad thing. There are alternatives if you’re looking for ways to persist data without blocking.
Should I create an EclipseLink EntityManager per method call, store it in a thread local, or guard it with a lock?
Obviously it's initially created from entityManagerFactory.createEntityManager().
Which is best practise please?
With per method call I'm concerned with performance.
With thread local I'm concerned about cache visibility between threads.
With guarding a single EntityManager with a lock I have the cost of a lock each time.
I'm using JSE, so no EJBs and no injection, just entityManagerFactory.createEntityManager() in a multi-threaded JSE app.
Thanks
EntityManagers are not thread safe, and are designed to represent a unit of work. Each method/thread should have its own unless participating in a larger transaction. It is also better to close/clear them at logical points because they maintain a cache of managed entities that can grow large with long lived EntityManagers. There are numerous posts on what the best way to go is outside and inside a container.
EclipseLink's EntityManager uses EclipseLink's native sessions and unitOfWork underneath, which will lazily fetch resources as required, and release them when done. But they can be configured to operate differently.
I have two (or more) Java Threads creating, updating and deleting entities from a mysql database using JPA. To achieve this I have a PersistenceLayer class creating the EntityManager and providing save, update and delete methods for all my entities looking like:
public void saveEntity(Entity entity) {
    manager.getTransaction().begin();
    manager.persist(entity);
    manager.getTransaction().commit();
}
public void saveEntity2(Entity2 entity) {
    manager.getTransaction().begin();
    manager.persist(entity);
    manager.getTransaction().commit();
}
If one thread enters saveEntity and the other enters saveEntity2 at the same time, both will try to get the transaction from the same EntityManager, which will fail. However, I always thought that the underlying database is able to manage multiple transactions, especially if both are working on different rows or even different tables. Of course I could synchronize the blocks, but that would mean only one DB connection is possible at a time, which will not scale to multiple users creating several threads.
What wrong assumption am I making here? Is it possible to submit multiple transactions to a database via JPA and let the DB handle concurrency issues?
EntityManager is not intended to be used by multiple threads. You need to obtain separate instances of EntityManager for each thread.
Actually, if you use EJB or Spring you can use a transaction-scoped EntityManager, which can be used from multiple threads (it's a proxy which delegates the actual work to separate thread-bound instances of EntityManager), but I don't think that applies in your case.
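In a plain JSE app, the "one instance per thread" idea can be sketched with a ThreadLocal. This is an illustration only: StubEntityManager is a made-up stand-in for javax.persistence.EntityManager (so the snippet has no JPA dependency), and ThreadBoundManagers, current() and release() are hypothetical names. With real JPA, the initializer would call entityManagerFactory.createEntityManager(), and release() would also close the instance.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Stand-in for javax.persistence.EntityManager; NOT the real API.
final class StubEntityManager {
    private static final AtomicInteger COUNTER = new AtomicInteger();
    // Unique id per instance, only to make the instances distinguishable.
    final int id = COUNTER.incrementAndGet();
}

final class ThreadBoundManagers {
    // Each thread lazily gets its own instance on first access.
    private static final ThreadLocal<StubEntityManager> CURRENT =
            ThreadLocal.withInitial(StubEntityManager::new);

    // Returns the instance bound to the calling thread.
    static StubEntityManager current() {
        return CURRENT.get();
    }

    // Call at the end of a unit of work so the next one starts with a fresh instance.
    static void release() {
        CURRENT.remove();
    }
}
```

Two threads calling ThreadBoundManagers.current() never share an instance, so their transactions cannot collide the way they do with a single shared manager. Calling release() at logical points also keeps the managed-entity cache from growing without bound.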
To handle concurrency issues when an object is processed across multiple transactions, a lock can be acquired on the object. Either optimistic or pessimistic locking can be used, with the appropriate lock mode.
Locking a versioned object with entityManager.lock(entity, LockModeType.READ) prevents dirty reads and non-repeatable reads on that object.
LockModeType.WRITE will force incrementing/updating the version column of the entity.
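The idea behind version-based (optimistic) locking can be sketched without a database. This is an in-memory illustration under stated assumptions, not JPA code: VersionedRow and OptimisticStore are hypothetical, and the compare-and-set stands in for an update that only succeeds when the version column still has the value that was read. A JPA provider would signal the failed case with an OptimisticLockException rather than a boolean.

```java
import java.util.concurrent.atomic.AtomicReference;

// Immutable row state: a value plus a version number.
final class VersionedRow {
    final String value;
    final long version;
    VersionedRow(String value, long version) {
        this.value = value;
        this.version = version;
    }
}

final class OptimisticStore {
    private final AtomicReference<VersionedRow> row =
            new AtomicReference<>(new VersionedRow("initial", 0));

    // Returns the current row state, including its version.
    VersionedRow read() {
        return row.get();
    }

    // Succeeds only if nobody changed the row since 'seen' was read.
    // Every successful update installs a new object with version + 1, so the
    // identity check done by compareAndSet is equivalent to a version check.
    boolean update(VersionedRow seen, String newValue) {
        return row.compareAndSet(seen, new VersionedRow(newValue, seen.version + 1));
    }
}
```

If two writers read version 0 and both try to update, the first succeeds and bumps the version to 1; the second fails because its copy is stale, and it has to re-read and retry instead of silently overwriting the first change.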
What is the 'Extended Session Antipattern'?
An extended (or Long) session (or session-per-conversation) is a session that may live beyond the duration of a transaction, as opposed to transaction-scoped sessions (or session-per-request). This is not necessarily an anti-pattern; it is a way to implement Long conversations (i.e. conversations with the database that span multiple transactions), which are just another way of designing units of work.
Like anything, I'd just say that long conversations can be misused or wrongly implemented.
Here is how the documentation introduces Long conversations:
12.1.2. Long conversations
The session-per-request pattern is not the only way of designing units of work. Many business processes require a whole series of interactions with the user that are interleaved with database accesses. In web and enterprise applications, it is not acceptable for a database transaction to span a user interaction. Consider the following example:
The first screen of a dialog opens. The data seen by the user has been loaded in a particular Session and database transaction. The user is free to modify the objects.
The user clicks "Save" after 5 minutes and expects their modifications to be made persistent. The user also expects that they were the only person editing this information and that no conflicting modification has occurred.
From the point of view of the user, we call this unit of work a long-running conversation or application transaction. There are many ways to implement this in your application.
A first naive implementation might keep the Session and database transaction open during user think time, with locks held in the database to prevent concurrent modification and to guarantee isolation and atomicity. This is an anti-pattern, since lock contention would not allow the application to scale with the number of concurrent users.
You have to use several database transactions to implement the conversation. In this case, maintaining isolation of business processes becomes the partial responsibility of the application tier. A single conversation usually spans several database transactions. It will be atomic if only one of these database transactions (the last one) stores the updated data. All others simply read data (for example, in a wizard-style dialog spanning several request/response cycles). This is easier to implement than it might sound, especially if you utilize some of Hibernate's features:
Automatic Versioning: Hibernate can perform automatic optimistic concurrency control for you. It can automatically detect if a concurrent modification occurred during user think time. Check for this at the end of the conversation.
Detached Objects: if you decide to use the session-per-request pattern, all loaded instances will be in the detached state during user think time. Hibernate allows you to reattach the objects and persist the modifications. The pattern is called session-per-request-with-detached-objects. Automatic versioning is used to isolate concurrent modifications.
Extended (or Long) Session: the Hibernate Session can be disconnected from the underlying JDBC connection after the database transaction has been committed and reconnected when a new client request occurs. This pattern is known as session-per-conversation and makes even reattachment unnecessary. Automatic versioning is used to isolate concurrent modifications and the Session will not be allowed to be flushed automatically, but explicitly.
Both session-per-request-with-detached-objects and session-per-conversation have advantages and disadvantages. These disadvantages are discussed later in this chapter in the context of optimistic concurrency control.
I've added some references below but I suggest reading the whole Chapter 12. Transactions and Concurrency.
References
Hibernate Core Reference Guide
12.1.2. Long conversations
12.3. Optimistic concurrency control