From the App Engine documentation on transactions:
Note: In extremely rare cases, the transaction is fully committed even if a transaction
returns a timeout or internal error exception. For this reason, it's best to make transactions
idempotent whenever possible.
Suppose A transfers money to another person B. The operations should run in a transaction. If the situation described in the note above did occur, the data would end up in an inconsistent state, because the transfer action cannot be idempotent. Is my understanding correct?
You would need to make such a transaction idempotent. See this earlier StackOverflow item for a much deeper description of the issue and resolutions: GAE transaction failure and idempotency
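One common way to make a transfer idempotent is to give every transfer a unique ID and record that ID in the same transaction that moves the money. A retry after a timeout then becomes a no-op if the first attempt actually committed. The following is only a minimal in-memory sketch of that idea: `IdempotentLedger`, `transfer` and the transfer ID format are hypothetical names, and in a real application the balances and the applied IDs would live in the datastore and be updated in one transaction.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory stand-in for the datastore. The key assumption is that
// the balances and the set of applied transfer IDs are updated atomically.
class IdempotentLedger {
    private final Map<String, Long> balances = new HashMap<>();
    private final Set<String> appliedTransferIds = new HashSet<>();

    synchronized void deposit(String account, long amount) {
        balances.merge(account, amount, Long::sum);
    }

    synchronized long balance(String account) {
        return balances.getOrDefault(account, 0L);
    }

    // Returns true if the transfer was applied by this call,
    // false if the same transfer ID had already been applied earlier.
    synchronized boolean transfer(String transferId, String from, String to, long amount) {
        if (appliedTransferIds.contains(transferId)) {
            return false; // retry of a transfer that already committed: do nothing
        }
        if (balance(from) < amount) {
            throw new IllegalStateException("insufficient funds");
        }
        balances.merge(from, -amount, Long::sum);
        balances.merge(to, amount, Long::sum);
        appliedTransferIds.add(transferId);
        return true;
    }

    public static void main(String[] args) {
        IdempotentLedger ledger = new IdempotentLedger();
        ledger.deposit("A", 100);
        ledger.transfer("transfer-1", "A", "B", 30);
        // The client saw a timeout and retries with the SAME transfer ID.
        ledger.transfer("transfer-1", "A", "B", 30);
        System.out.println(ledger.balance("A") + " " + ledger.balance("B")); // prints "70 30"
    }
}
```

The caller has to generate the transfer ID once, before the first attempt, and reuse it on every retry; generating a new ID per attempt would defeat the check.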
I have 2 questions related to each other.
Q1: What exactly is a transaction boundary in Hibernate/Spring Data JPA?
I am new to JPA, so please give a very basic example so I can understand. I have read multiple blogs, but it is still not very clear to me.
Q2: On top of that, what does this mean:
In hibernate, persist() method guarantees that it will not execute an INSERT statement if it is called outside of transaction boundaries, save() method does not guarantee the same.
What is inside and outside of a transaction boundary, and how are statements executed outside those boundaries?
A transaction is a unit of work that is either executed completely or not at all.
Transactions are fairly simple to use in a typical relational database.
You start a transaction by modifying some data. Every modification starts a transaction, you typically can't avoid it. You end the transaction by executing a commit or rollback.
Before your transaction is finished, your changes can't be seen in other transactions (there are exceptions, variations and details). If you roll back your transaction, all your changes in the database are undone.
If you commit, your changes become visible to other transactions, i.e. to other users connected to the same database. Among many other things, implementations differ in whether changes become visible only to new transactions or also to transactions that are already running.
A transaction in JPA is a database transaction plus additional stuff.
You can start and end a transaction by getting a transaction object and calling methods on it. But nobody does that anymore since it is error prone. Instead you annotate methods with @Transactional: entering the method starts a transaction and exiting the method ends it.
The details are taken care of by Spring.
The tricky part with JPA's transactions is that within a JPA transaction JPA might (and will) choose to delay or even avoid read and write operations as long as possible. For example, when you load an entity and load it again in the same JPA transaction, JPA won't load it from the database but will return the same instance it returned during the first load operation. If you want to learn more about this, I recommend looking into JPA's first-level cache.
A transaction boundary is where the transaction starts or is committed or rolled back.
When a transaction is started, the transaction context is bound to the current thread. So regardless of how many endpoints and channels you have in your Message flow your transaction context will be preserved as long as you are ensuring that the flow continues on the same thread. As soon as you break it by introducing a Pollable Channel or Executor Channel or initiate a new thread manually in some service, the Transactional boundary will be broken as well.
Other people have asked about this as well, so it is worth looking up.
If something is still unclear, write to me again with a more specific question and I will explain.
I really hope I helped!
I have a JPA transaction like the following (Using controller advice to catch exceptions)
@Transactional
public void save(MyObj myObj) {
    // Attempt to save the object
    this.myRepo.save(myObj);
    // After it saves, call my audit log service to record the change
    this.myAuditLogService.logChange(myObj);
}
This works fine, but the problem is that if the save fails, the audit log service is still called and the exception is only thrown afterwards, which causes erroneous audit entries to be created.
Expected Flow
Call save function
Save fails
Transaction stops and rolls back
Controller advice catches the exception
Actual Flow
Call save function
Save fails
Audit log service is called
Transaction rolls back
Controller advice catches the exception
This is a common problem in Computer Science in Distributed Systems.
Basically what you want to achieve is to have atomic operation across multiple systems.
Your transaction spans only your local (or first) database and that's all.
When the REST call to the second system has been made and has succeeded but the first save then crashes, you want a rollback on the first system (the first save) and a rollback on the second system as well. There are multiple problems with that, and it's really hard to get atomic-like consistency across multiple systems.
You could use Database supported technologies for such cases:
What you probably need is 2PC / 3PC (two-phase or three-phase commit), or to change how your request is processed.
The trade-off of course will be that you'll have to sacrifice immediate results to have eventual consistency.
You could use eventual-consistency
For example send message to some storage for later processing -> make both systems read the message:
System1 reads from storage this message and will save myObj
System2 reads from storage this message and will log change
This will of course happen "eventually" - there will never be a guarantee that either system is up at the time of the processing or even later on (e.g. somebody killed the server or deployed code with bug and the server restarts indefinitely).
Moreover you'll sacrifice read-after-write consistency.
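To illustrate the message-storage idea above, here is a rough in-memory sketch, not a real messaging setup. `MessageLog` and `Consumer` are hypothetical names standing in for a durable log and the two systems. Each consumer keeps its own read offset, so a system that was down can catch up later, which is exactly the "eventually" part.

```java
import java.util.ArrayList;
import java.util.List;

class SharedLogSketch {
    // Hypothetical durable message storage, modeled here as an append-only list.
    static final class MessageLog {
        private final List<String> messages = new ArrayList<>();

        synchronized void append(String message) {
            messages.add(message);
        }

        synchronized List<String> readFrom(int offset) {
            return new ArrayList<>(messages.subList(offset, messages.size()));
        }
    }

    // Each system remembers how far it has read, so it can catch up after downtime.
    static final class Consumer {
        private int offset = 0;
        final List<String> processed = new ArrayList<>();

        void poll(MessageLog log) {
            for (String message : log.readFrom(offset)) {
                processed.add(message); // System1 would save myObj, System2 would log the change
                offset++;
            }
        }
    }

    public static void main(String[] args) {
        MessageLog log = new MessageLog();
        Consumer system1 = new Consumer();
        Consumer system2 = new Consumer();

        log.append("change myObj#1");
        system1.poll(log);             // system2 is "down" and has seen nothing yet
        log.append("change myObj#2");
        system1.poll(log);
        system2.poll(log);             // system2 comes back and catches up

        System.out.println(system1.processed.equals(system2.processed)); // prints "true"
    }
}
```

Between the first `system1.poll` and the final `system2.poll` the two systems disagree, which is the loss of read-after-write consistency mentioned above.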
In case of a failure, you could use a compensating transaction.
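A compensating transaction means explicitly undoing the step that already succeeded on the other system when a later step fails. The sketch below shows only the control flow, under assumptions: `AuditLog`, `Repo`, `revokeChange` and `saveWithAudit` are hypothetical stand-ins, not the asker's real services, and a real compensation call can itself fail and would need retries.

```java
import java.util.ArrayList;
import java.util.List;

class CompensatingSave {
    // Hypothetical stand-in for the remote audit log service.
    static class AuditLog {
        final List<String> entries = new ArrayList<>();

        void logChange(String id) {
            entries.add(id);
        }

        // Compensating action: undoes a previously recorded change.
        void revokeChange(String id) {
            entries.remove(id);
        }
    }

    // Hypothetical stand-in for the local repository; can be told to fail.
    static class Repo {
        final List<String> saved = new ArrayList<>();
        boolean failNextSave = false;

        void save(String id) {
            if (failNextSave) {
                throw new IllegalStateException("save failed");
            }
            saved.add(id);
        }
    }

    static void saveWithAudit(Repo repo, AuditLog audit, String id) {
        audit.logChange(id);           // step on the second system succeeds
        try {
            repo.save(id);             // step on the first system may fail
        } catch (RuntimeException e) {
            audit.revokeChange(id);    // compensate: undo the audit entry
            throw e;                   // still report the failure to the caller
        }
    }

    public static void main(String[] args) {
        Repo repo = new Repo();
        AuditLog audit = new AuditLog();
        saveWithAudit(repo, audit, "obj-1");
        repo.failNextSave = true;
        try {
            saveWithAudit(repo, audit, "obj-2");
        } catch (IllegalStateException expected) {
            // the failed save left no audit entry behind
        }
        System.out.println(audit.entries); // prints "[obj-1]"
    }
}
```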
I recommend reading more on the topic of Distributed Systems in:
[Fallacies of distributed computing](https://en.wikipedia.org/wiki/Fallacies_of_distributed_computing)
[Designing Data Intensive Applications](https://www.amazon.com/Designing-Data-Intensive-Applications-Reliable-Maintainable/dp/1449373321)
[CAP theorem](https://en.wikipedia.org/wiki/CAP_theorem)
[Consistency models](https://en.wikipedia.org/wiki/Consistency_model)
Why do I need Transaction in Hibernate for read-only operations?
Does the following transaction put a lock in the DB?
Example code to fetch from DB:
Transaction tx = HibernateUtil.getCurrentSession().beginTransaction(); // why begin transaction?
//readonly operation here
tx.commit(); // why tx.commit? I don't want to write anything
Can I use session.close() instead of tx.commit()?
Transactions for reads might indeed look strange, and people often don't mark methods as transactional in this case. But JDBC will create a transaction anyway; it will simply run with autocommit=true unless a different option was set explicitly. Still, there are practical reasons to mark transactions read-only:
Impact on databases
Read-only flag may let DBMS optimize such transactions or those running in parallel.
Having a transaction that spans multiple SELECT statements guarantees proper Isolation for levels starting from Repeatable Read or Snapshot (e.g. see PostgreSQL's Repeatable Read). Otherwise 2 SELECT statements could see inconsistent picture if another transaction commits in parallel. This isn't relevant when using Read Committed.
Impact on ORM
An ORM may cause unpredictable results if you don't begin and finish transactions explicitly. E.g. Hibernate will open a transaction before the first statement, but it won't finish it. So the connection will be returned to the connection pool with an unfinished transaction. What happens then? The JDBC specification is silent on this, so it is implementation specific: the MySQL and PostgreSQL drivers roll back such a transaction, Oracle commits it. Note that this can also be configured at the connection pool level, e.g. C3P0 gives you such an option, with rollback as the default.
Spring sets the FlushMode=MANUAL in case of read-only transactions, which leads to other optimizations like no need for dirty checks. This could lead to huge performance gain depending on how many objects you loaded.
Impact on architecture & clean code
There is no guarantee that your method doesn't write into the database. If you mark the method as @Transactional(readOnly = true), you declare whether writing to the DB is allowed in the scope of this transaction. If your architecture is cumbersome and some team members may put a modification query where it's not expected, this flag will point you to the problematic place.
All database statements are executed within the context of a physical transaction, even when we don’t explicitly declare transaction boundaries (e.g., BEGIN, COMMIT, ROLLBACK).
If you don't declare transaction boundaries explicitly, then each statement will have to be executed in a separate transaction (autocommit mode). This may even lead to opening and closing one connection per statement unless your environment can deal with connection-per-thread binding.
Declaring a service as @Transactional will give you one connection for the whole transaction duration, and all statements will use that single isolation connection. This is way better than not using explicit transactions in the first place.
On large applications, you may have many concurrent requests, and reducing database connection acquisition request rate will definitely improve your overall application performance.
JPA doesn't enforce transactions on read operations. Only writes end up throwing a TransactionRequiredException in case you forget to start a transactional context. Nevertheless, it's always better to declare transaction boundaries even for read-only transactions (in Spring, @Transactional allows you to mark read-only transactions, which has a great performance benefit).
Transactions indeed put locks on the database — good database engines handle concurrent locks in a sensible way — and are useful with read-only use to ensure that no other transaction adds data that makes your view inconsistent. You always want a transaction (though sometimes it is reasonable to tune the isolation level, it's best not to do that to start out with); if you never write to the DB during your transaction, both committing and rolling back the transaction work out to be the same (and very cheap).
Now, if you're lucky and your queries against the DB are such that the ORM always maps them to single SQL queries, you can get away without explicit transactions, relying on the DB's built-in autocommit behavior, but ORMs are relatively complex systems so it isn't at all safe to rely on such behavior unless you go to a lot more work checking what the implementation actually does. Writing the explicit transaction boundaries in is far easier to get right (especially if you can do it with AOP or some similar ORM-driven technique; from Java 7 onwards try-with-resources could be used too I suppose).
It doesn't matter whether you only read or not - the database must still keep track of your resultset, because other database clients may want to write data that would change your resultset.
I have seen faulty programs kill huge database systems because they only read data but never commit. This forces the transaction log to grow, because the DB can't release the transaction data before a COMMIT or ROLLBACK, even if the client has done nothing for hours.
My backend system serves about 10K POS devices. Each device issues its requests sequentially; however, I am wondering how the backend can guarantee that it handles the requests of a given client sequentially.
For example, a device issues a 'sale' request and times out waiting for the response (maybe the DB is blocked), so it issues a 'cancellation' to cancel that sale request. In this case, the backend may still be handling the 'sale' transaction when it gets the 'cancellation' request, which may cause unexpected results.
My idea is to set up a persistent queue for each device (client), but is it OK to set up 10K queues? I am not sure, please help.
This is an incredibly complex area of computer science and a lot of these problems have been solved many times. I would not try to reinvent the wheel.
My advice:
Read about and thoroughly understand ACID (summaries paraphrased):
Atomicity - If any part of a transaction fails, the whole transaction fails, and the database is not left in an unknown or corrupted state. This is hugely important. Rely on existing software to make this happen in the actual database. And don't invent data structures that require you to reinvent your own transaction system. Make your transactions as small as possible to reduce failures.
Consistency - The database is never left in an invalid state. All operations committed to it will take it to a new valid state.
Isolation - The operations you perform on a database can be performed at the same time and result in the same state as if performed one after the other. OR performed safely inside a locking transaction.
Durability - Once a transaction is committed, it will remain so.
Both your existing system and your proposed idea sound like they could potentially be violating ACID:
A stateful request system probably violates (or makes it hard not to violate) isolation.
A queue could violate durability if not done in a bullet-proof way.
Not to mention, you have scalability issues as well. Combine scalability and ACID and you have a heavyweight situation requiring serious expertise.
If you can help it, I would strongly suggest relying on existing systems, especially if this is for point of sale.
I have a method that has the propagation = Propagation.REQUIRES_NEW transactional property:
@Transactional(propagation = Propagation.REQUIRES_NEW)
public void createUser(final UserBean userBean) {
//Some logic here that requires modification in DB
}
This method can be called multiple times simultaneously, and for every transaction, if an error occurs then it's rolled back (independently from the other transactions).
The problem is that this might force Spring to create multiple transactions, even if another one is available, and may cause some performance problems.
Java doc of propagation = Propagation.REQUIRED says: Support a current transaction, create a new one if none exists.
This seems to solve the performance problem, doesn't it?
What about the rollback issue? What if a new method call rolls back while using an existing transaction? Won't that roll back the whole transaction, including the previous calls?
[EDIT]
I guess my question wasn't clear enough:
We have hundreds of clients connected to our server.
For each client we naturally need to send a feedback about the transaction (OK or exception -> rollback).
My question is: if I use REQUIRED, does it mean only one transaction is used, and if the 100th client encounters a problem, the 1st client's transaction will roll back as well?
Using REQUIRES_NEW is only relevant when the method is invoked from a transactional context; when the method is invoked from a non-transactional context, it will behave exactly as REQUIRED - it will create a new transaction.
That does not mean that there will only be one single transaction for all your clients - each client will start from a non-transactional context, and as soon as the request processing hits a @Transactional method, it will create a new transaction.
So, with that in mind, if using REQUIRES_NEW makes sense for the semantics of that operation, then I wouldn't worry about performance - this would be textbook premature optimization. I would rather stress correctness and data integrity and worry about performance once performance metrics have been collected, and not before.
On rollback - using REQUIRES_NEW will force the start of a new transaction, and so an exception will rollback that transaction. If there is also another transaction that was executing as well - that will or will not be rolled back depending on if the exception bubbles up the stack or is caught - your choice, based on the specifics of the operations.
Also, for a more in-depth discussion on transactional strategies and rollback, I would recommend: «Transaction strategies: Understanding transaction pitfalls», Mark Richards.
If you really need to do it in a separate transaction, you need to use REQUIRES_NEW and live with the performance overhead. Watch out for deadlocks.
I'd rather do it the other way:
Validate data on Java side.
Run everything in one transaction.
If anything goes wrong on DB side -> it's a major error of DB or validation design. Rollback everything and throw critical top level error.
Write good unit tests.