I have 2 questions related to each other
Q1 What exactly is a transaction boundary in hibernate/Spring Data JPA.
I am new to JPA , so please give a very basic example so I can understand as I tried to read multiple blogs but still not very clear.
Q2 And on top of it, what does this mean-
In hibernate, persist() method guarantees that it will not execute an INSERT statement if it is called outside of transaction boundaries, save() method does not guarantee the same.
What is outside and inside of a transaction boundary and how executions are performed outside boundaries?
A transaction is a unit of work that is either executed completely or not at all.
Transactions are fairly simple to use in a typical relational database.
You start a transaction by modifying some data. Every modification starts a transaction, you typically can't avoid it. You end the transaction by executing a commit or rollback.
Before your transaction is finished your changes can't be seen in other transactions (there are exceptions, variations and details). If you rollback your transaction all your changes in the database are undone.
If you commit your changes your changes become visible to other transactions, i.e. for other users connected to the same database. Implementations vary among many other things if changes become visible only for new transactions or also for already running transactions.
A transaction in JPA is a database transaction plus additional stuff.
You can start and end a transaction by getting a Transaction object and calling methods on it. But nobody does that anymore since it is error prone. Instead you annotate methods with #Transaction and entering the method will start a transaction and exiting the method will end the transaction.
The details are taken care of by Spring.
The tricky part with JPAs transactions is that within a JPA transaction JPA might (and will) choose to delay or even avoid read and write operations as long as possible. For example when you load an entity, and load it again in the same JPA transaction, JPA won't load it from the database but return the same instance it returned during the first load operation. If you want to learn more about this I recommend looking into JPAs first level cache.
A transaction boundary it's where the transaction starts or is committed/rollbacked.
When a transaction is started, the transaction context is bound to the current thread. So regardless of how many endpoints and channels you have in your Message flow your transaction context will be preserved as long as you are ensuring that the flow continues on the same thread. As soon as you break it by introducing a Pollable Channel or Executor Channel or initiate a new thread manually in some service, the Transactional boundary will be broken as well.
some other people ask about it - look it up.
If you do not understand something write to me again more accurately and I will explain.
I really hope I helped!
Related
I'm using JPA and I begin a transaction before a function is called which will in turn store some data into the database. The transaction is then ended after the function call is executed. But I also use entityManager.flush() and entityManager.clear() guessing that memory will be freed on using clear(). before ending the transaction. Please do correct me. When I executed the program, it failed to roll back.
I also tried removing the flush() and clear(), still I'm not able to roll back. When can a roll back fail in my scenario when I'm using clear and flush?
I also use entityManager.flush() and entityManager.clear() guessing
that memory will be freed on using clear().
Well you should stop guessing and start reading some documentation.
EntityManager flushes on tx.commit() by itself, and entity manager is cleared when it is closed.
If you are using container manager transactions, you should rather not worry about flushing and clearing entity managers. If you are using JavaSE then your normal workflow with DB should look like this:
Create new entity manager
Start transaction
Do your job - delete/update/insert whatever
Commit transaction / Rollback if needed
Close entity manager
Normally this would be closed in small try-catch-finally block
EntityManager is lightweight and short living component. You should create one just when you need it, and close it right after you use it.
What is supposed to be a singleton instance, and created (most cases) only once is EntityManagerFactory
Why do I need Transaction in Hibernate for read-only operations?
Does the following transaction put a lock in the DB?
Example code to fetch from DB:
Transaction tx = HibernateUtil.getCurrentSession().beginTransaction(); // why begin transaction?
//readonly operation here
tx.commit() // why tx.commit? I don't want to write anything
Can I use session.close() instead of tx.commit()?
Transactions for reading might look indeed strange and often people don't mark methods for transactions in this case. But JDBC will create transaction anyway, it's just it will be working in autocommit=true if different option wasn't set explicitly. But there are practical reasons to mark transactions read-only:
Impact on databases
Read-only flag may let DBMS optimize such transactions or those running in parallel.
Having a transaction that spans multiple SELECT statements guarantees proper Isolation for levels starting from Repeatable Read or Snapshot (e.g. see PostgreSQL's Repeatable Read). Otherwise 2 SELECT statements could see inconsistent picture if another transaction commits in parallel. This isn't relevant when using Read Committed.
Impact on ORM
ORM may cause unpredictable results if you don't begin/finish transactions explicitly. E.g. Hibernate will open transaction before the 1st statement, but it won't finish it. So connection will be returned to the Connection Pool with an unfinished transaction. What happens then? JDBC keeps silence, thus this is implementation specific: MySQL, PostgreSQL drivers roll back such transaction, Oracle commits it. Note that this can also be configured on Connection Pool level, e.g. C3P0 gives you such an option, rollback by default.
Spring sets the FlushMode=MANUAL in case of read-only transactions, which leads to other optimizations like no need for dirty checks. This could lead to huge performance gain depending on how many objects you loaded.
Impact on architecture & clean code
There is no guarantee that your method doesn't write into the database. If you mark method as #Transactional(readonly=true), you'll dictate whether it's actually possible to write into DB in scope of this transaction. If your architecture is cumbersome and some team members may choose to put modification query where it's not expected, this flag will point you to the problematic place.
All database statements are executed within the context of a physical transaction, even when we don’t explicitly declare transaction boundaries (e.g., BEGIN, COMMIT, ROLLBACK).
If you don't declare transaction boundaries explicitly, then each statement will have to be executed in a separate transaction (autocommit mode). This may even lead to opening and closing one connection per statement unless your environment can deal with connection-per-thread binding.
Declaring a service as #Transactional will give you one connection for the whole transaction duration, and all statements will use that single isolation connection. This is way better than not using explicit transactions in the first place.
On large applications, you may have many concurrent requests, and reducing database connection acquisition request rate will definitely improve your overall application performance.
JPA doesn't enforce transactions on read operations. Only writes end up throwing a TransactionRequiredException in case you forget to start a transactional context. Nevertheless, it's always better to declare transaction boundaries even for read-only transactions (in Spring #Transactional allows you to mark read-only transactions, which has a great performance benefit).
Transactions indeed put locks on the database — good database engines handle concurrent locks in a sensible way — and are useful with read-only use to ensure that no other transaction adds data that makes your view inconsistent. You always want a transaction (though sometimes it is reasonable to tune the isolation level, it's best not to do that to start out with); if you never write to the DB during your transaction, both committing and rolling back the transaction work out to be the same (and very cheap).
Now, if you're lucky and your queries against the DB are such that the ORM always maps them to single SQL queries, you can get away without explicit transactions, relying on the DB's built-in autocommit behavior, but ORMs are relatively complex systems so it isn't at all safe to rely on such behavior unless you go to a lot more work checking what the implementation actually does. Writing the explicit transaction boundaries in is far easier to get right (especially if you can do it with AOP or some similar ORM-driven technique; from Java 7 onwards try-with-resources could be used too I suppose).
It doesn't matter whether you only read or not - the database must still keep track of your resultset, because other database clients may want to write data that would change your resultset.
I have seen faulty programs to kill huge database systems, because they just read data, but never commit, forcing the transaction log to grow, because the DB can't release the transaction data before a COMMIT or ROLLBACK, even if the client did nothing for hours.
Why do I need Transaction in Hibernate for read-only operations?
Does the following transaction put a lock in the DB?
Example code to fetch from DB:
Transaction tx = HibernateUtil.getCurrentSession().beginTransaction(); // why begin transaction?
//readonly operation here
tx.commit() // why tx.commit? I don't want to write anything
Can I use session.close() instead of tx.commit()?
Transactions for reading might look indeed strange and often people don't mark methods for transactions in this case. But JDBC will create transaction anyway, it's just it will be working in autocommit=true if different option wasn't set explicitly. But there are practical reasons to mark transactions read-only:
Impact on databases
Read-only flag may let DBMS optimize such transactions or those running in parallel.
Having a transaction that spans multiple SELECT statements guarantees proper Isolation for levels starting from Repeatable Read or Snapshot (e.g. see PostgreSQL's Repeatable Read). Otherwise 2 SELECT statements could see inconsistent picture if another transaction commits in parallel. This isn't relevant when using Read Committed.
Impact on ORM
ORM may cause unpredictable results if you don't begin/finish transactions explicitly. E.g. Hibernate will open transaction before the 1st statement, but it won't finish it. So connection will be returned to the Connection Pool with an unfinished transaction. What happens then? JDBC keeps silence, thus this is implementation specific: MySQL, PostgreSQL drivers roll back such transaction, Oracle commits it. Note that this can also be configured on Connection Pool level, e.g. C3P0 gives you such an option, rollback by default.
Spring sets the FlushMode=MANUAL in case of read-only transactions, which leads to other optimizations like no need for dirty checks. This could lead to huge performance gain depending on how many objects you loaded.
Impact on architecture & clean code
There is no guarantee that your method doesn't write into the database. If you mark method as #Transactional(readonly=true), you'll dictate whether it's actually possible to write into DB in scope of this transaction. If your architecture is cumbersome and some team members may choose to put modification query where it's not expected, this flag will point you to the problematic place.
All database statements are executed within the context of a physical transaction, even when we don’t explicitly declare transaction boundaries (e.g., BEGIN, COMMIT, ROLLBACK).
If you don't declare transaction boundaries explicitly, then each statement will have to be executed in a separate transaction (autocommit mode). This may even lead to opening and closing one connection per statement unless your environment can deal with connection-per-thread binding.
Declaring a service as #Transactional will give you one connection for the whole transaction duration, and all statements will use that single isolation connection. This is way better than not using explicit transactions in the first place.
On large applications, you may have many concurrent requests, and reducing database connection acquisition request rate will definitely improve your overall application performance.
JPA doesn't enforce transactions on read operations. Only writes end up throwing a TransactionRequiredException in case you forget to start a transactional context. Nevertheless, it's always better to declare transaction boundaries even for read-only transactions (in Spring #Transactional allows you to mark read-only transactions, which has a great performance benefit).
Transactions indeed put locks on the database — good database engines handle concurrent locks in a sensible way — and are useful with read-only use to ensure that no other transaction adds data that makes your view inconsistent. You always want a transaction (though sometimes it is reasonable to tune the isolation level, it's best not to do that to start out with); if you never write to the DB during your transaction, both committing and rolling back the transaction work out to be the same (and very cheap).
Now, if you're lucky and your queries against the DB are such that the ORM always maps them to single SQL queries, you can get away without explicit transactions, relying on the DB's built-in autocommit behavior, but ORMs are relatively complex systems so it isn't at all safe to rely on such behavior unless you go to a lot more work checking what the implementation actually does. Writing the explicit transaction boundaries in is far easier to get right (especially if you can do it with AOP or some similar ORM-driven technique; from Java 7 onwards try-with-resources could be used too I suppose).
It doesn't matter whether you only read or not - the database must still keep track of your resultset, because other database clients may want to write data that would change your resultset.
I have seen faulty programs to kill huge database systems, because they just read data, but never commit, forcing the transaction log to grow, because the DB can't release the transaction data before a COMMIT or ROLLBACK, even if the client did nothing for hours.
I have a method that has the propagation = Propagation.REQUIRES_NEW transactional property:
#Transactional(propagation = Propagation.REQUIRES_NEW)
public void createUser(final UserBean userBean) {
//Some logic here that requires modification in DB
}
This method can be called multiple times simultaneously, and for every transaction if an error occurs than it's rolled back (independently from the other transactions).
The problem is that this might force Spring to create multiple transactions, even if another one is available, and may cause some performance problems.
Java doc of propagation = Propagation.REQUIRED says: Support a current transaction, create a new one if none exists.
This seems to solve the performance problem, doesn't it?
What about the rollback issue ? What if a new method call rolls back while using an existing transaction ? won't that rollback the whole transaction even the previous calls ?
[EDIT]
I guess my question wasn't clear enough:
We have hundreds of clients connected to our server.
For each client we naturally need to send a feedback about the transaction (OK or exception -> rollback).
My question is: if I use REQUIRED, does it mean only one transaction is used, and if the 100th client encounters a problem the 1st client's transaction will rollback as well ?
Using REQUIRES_NEW is only relevant when the method is invoked from a transactional context; when the method is invoked from a non-transactional context, it will behave exactly as REQUIRED - it will create a new transaction.
That does not mean that there will only be one single transaction for all your clients - each client will start from a non-transactional context, and as soon as the the request processing will hit a #Transactional, it will create a new transaction.
So, with that in mind, if using REQUIRES_NEW makes sense for the semantics of that operation - than I wouldn't worry about performance - this would textbook premature optimization - I would rather stress correctness and data integrity and worry about performance once performance metrics have been collected, and not before.
On rollback - using REQUIRES_NEW will force the start of a new transaction, and so an exception will rollback that transaction. If there is also another transaction that was executing as well - that will or will not be rolled back depending on if the exception bubbles up the stack or is caught - your choice, based on the specifics of the operations.
Also, for a more in-depth discussion on transactional strategies and rollback, I would recommend: «Transaction strategies: Understanding transaction pitfalls», Mark Richards.
If you really need to do it in separate transaction you need to use REQUIRES_NEW and live with the performance overhead. Watch out for dead locks.
I'd rather do it the other way:
Validate data on Java side.
Run everyting in one transaction.
If anything goes wrong on DB side -> it's a major error of DB or validation design. Rollback everything and throw critical top level error.
Write good unit tests.
I can't understand the behavior difference between the PROPAGATION_REQUIRES_NEW and PROPAGATION_NESTED propagation policies. It seems to me that in both cases, the current process is rollbacked but not the whole transaction. Any clue?
See this link: PROPAGATION_NESTED versus PROPAGATION_REQUIRES_NEW? Juergen Hoeller explain it very well. -- the Spring Source Forum is completely offline since February 28, 2019, but you can read the relevant part of the article in the quote below
PROPAGATION_REQUIRES_NEW starts a new, independent "inner" transaction
for the given scope. This transaction will be committed or rolled back
completely independent from the outer transaction, having its own
isolation scope, its own set of locks, etc. The outer transaction will
get suspended at the beginning of the inner one, and resumed once the
inner one has completed. ...
PROPAGATION_NESTED on the other hand starts a "nested" transaction,
which is a true subtransaction of the existing one. What will happen
is that a savepoint will be taken at the start of the nested
transaction. Íf the nested transaction fails, we will roll back to
that savepoint. The nested transaction is part of of the outer
transaction, so it will only be committed at the end of of the outer
transaction. ...
PROPAGATION_REQUIRES_NEW : uses a completely independent transaction for each affected transaction scope. In that case, the underlying physical transactions are different and hence can commit or roll back independently, with an outer transaction not affected by an inner transaction's rollback status.
PROPAGATION_NESTED : uses a single physical transaction with multiple savepoints that it can roll back to. Such partial rollbacks allow an inner transaction scope to trigger a rollback for its scope, with the outer transaction being able to continue the physical transaction despite some operations having been rolled back. This setting is typically mapped onto JDBC savepoints, so will only work with JDBC resource transactions.
check spring documentation
Please find the difference
1.) Use of NESTED Transaction
Execute within a nested transaction if a current transaction exists, behave like PROPAGATION_REQUIRED else.
Nested transaction is supporting by Spring
2.)Use of REQUIRED Transaction
Support a current transaction, create a new one if none exists.
. It means for banking domain like withdraw,deposit, update the transaction
3.) Use of REQUIRES_NEW Transaction
Create a new transaction, and suspend the current transaction if one exists.