Isolation in distributed (global) transactions using JTA

As we know, isolation and atomicity are two different properties. Atomicity is the "all or nothing" property: a transaction either completes successfully or fails altogether. Atomicity is definitely supported by JTA and by the X/Open XA two-phase commit standard on which JTA is based.
My question is: Does JTA support isolation? I'm referring only to the case when we use EJBs and JDBC, no frameworks (e.g. Spring) or transaction managers other than JTA.
In other words, let's take the case that we have multiple threads, and let's say that one of them executes the global transaction which performs access and modifications on multiple databases. The other threads perform modifications on the databases but each thread performs modification on only one database and it does it within a transaction.
Are we going to have any concurrency issues like dirty, non-repeatable or phantom reads inside the global transaction?
AFAIK there is no way to specify the isolation level in JTA.

Isolation is the black sheep of the ACID family. It's not, strictly speaking, a property of the transaction manager. It's entirely controlled by the resource manager i.e. the database. All transactions against the database run at some isolation level. The difference in XA (JTA) transactions is in how that level is selected.
For the most part it isn't possible to achieve the per-transaction isolation level selection you have with regular transactions, though some resource managers may allow a SQL SET TRANSACTION ISOLATION command as the first statement in an XA-controlled transaction branch. The other model sometimes used is custom flags to XAResource.start, an approach taken by e.g. Oracle. For database engines supporting neither of these, the XA transaction defaults to the isolation level configured globally for the database server.
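As a rough illustration of the first option, here is a minimal sketch that builds the ANSI-style statement from the standard java.sql.Connection isolation constants. The class IsolationLevels and its method names are hypothetical, and whether a particular database accepts such a statement inside an XA branch is an assumption to check against that vendor's documentation.

```java
import java.sql.Connection;

// Hypothetical helper, not part of JDBC or JTA: maps the standard JDBC
// isolation constants to the ANSI SQL statement that some resource managers
// accept as the first statement of a transaction branch.
final class IsolationLevels {
    private IsolationLevels() {}

    // Translates a java.sql.Connection isolation constant to its SQL name.
    static String sqlName(int jdbcLevel) {
        switch (jdbcLevel) {
            case Connection.TRANSACTION_READ_UNCOMMITTED: return "READ UNCOMMITTED";
            case Connection.TRANSACTION_READ_COMMITTED:   return "READ COMMITTED";
            case Connection.TRANSACTION_REPEATABLE_READ:  return "REPEATABLE READ";
            case Connection.TRANSACTION_SERIALIZABLE:     return "SERIALIZABLE";
            default:
                throw new IllegalArgumentException("Unsupported isolation level: " + jdbcLevel);
        }
    }

    // Builds the statement text; executing it is left to the caller.
    static String setTransactionStatement(int jdbcLevel) {
        return "SET TRANSACTION ISOLATION LEVEL " + sqlName(jdbcLevel);
    }
}
```

With a real connection, the resulting string could be run through a java.sql.Statement before any other work in the branch. Some drivers may instead honour Connection.setTransactionIsolation, and some may reject both inside a global transaction.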
Note that even for 'serializable' transactions, JTA or otherwise, you are still going to have headaches. Read Peter Bailis's excellent ACIDRain paper and then go find a corner to weep quietly in.
http://www.bailis.org/papers/acidrain-sigmod2017.pdf

Related

Spring Chained Transaction Manager versus Atomikos

Hi, I have distributed transactions and I have to manage them somehow.
In the Spring ecosystem ChainedTransactionManager can do that. On the other hand, the Spring documentation says Atomikos can be used for distributed transactions:
https://docs.spring.io/spring-boot/docs/2.1.6.RELEASE/reference/html/boot-features-jta.html
Which one should I use? I would prefer to stay within the Spring libraries, but is Atomikos much more than a Spring transaction manager? If someone has used both, could you compare the pros and cons?

Using Atomikos is a better overall solution. The ChainedTransactionManager is something you can use in some cases. The assumptions it makes are stated in the javadocs:
PlatformTransactionManager implementation that orchestrates transaction creation, commits and rollbacks to a list of delegates. Using this implementation assumes that errors causing a transaction rollback will usually happen before the transaction completion or during the commit of the most inner PlatformTransactionManager.
The configured instances will start transactions in the order given and commit/rollback in reverse order, which means the PlatformTransactionManager most likely to break the transaction should be the last in the list configured. A PlatformTransactionManager throwing an exception during commit will automatically cause the remaining transaction managers to roll back instead of committing.
The chance of committing one transaction and the other one failing still remains with ChainedTransactionManager.
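To show where that window comes from, here is a small self-contained simulation of the ordering described in the javadoc. It does not use Spring at all: SimpleTx, RecordingTx and ChainedCommitSketch are hypothetical names invented for this sketch, and the real ChainedTransactionManager has more detail than this.

```java
import java.util.List;

// Hypothetical stand-in for one resource's local transaction (not a Spring type).
interface SimpleTx {
    void commit() throws Exception;
    void rollback();
}

// Records its outcome and can be configured to fail on commit.
final class RecordingTx implements SimpleTx {
    final String name;
    final boolean failOnCommit;
    String outcome = "active";

    RecordingTx(String name, boolean failOnCommit) {
        this.name = name;
        this.failOnCommit = failOnCommit;
    }

    @Override
    public void commit() throws Exception {
        if (failOnCommit) {
            outcome = "failed";
            throw new Exception(name + " commit failed");
        }
        outcome = "committed";
    }

    @Override
    public void rollback() {
        outcome = "rolled back";
    }
}

final class ChainedCommitSketch {
    private ChainedCommitSketch() {}

    // Commits in reverse start order. After the first commit failure, the
    // transactions not yet committed are rolled back instead. Anything that
    // was committed before the failure stays committed.
    static void commitAll(List<? extends SimpleTx> inStartOrder) {
        boolean failed = false;
        for (int i = inStartOrder.size() - 1; i >= 0; i--) {
            SimpleTx tx = inStartOrder.get(i);
            if (failed) {
                tx.rollback();
                continue;
            }
            try {
                tx.commit();
            } catch (Exception e) {
                failed = true;
            }
        }
    }
}
```

If the last transaction in the list fails, the earlier ones are rolled back and both resources stay consistent. If the first one in the list fails (it commits last), the later ones have already committed and cannot be undone, which is the remaining risk.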
Using Atomikos gives you a real distributed transaction: all or nothing on both databases. But this also has some consequences that can affect the support of the application, for example when the TX is fully committed on one DB and prepared on the other, and at that point the application crashes. You'll need to ensure that your application can recover from this scenario. Usually the TX would be fully committed on the second DB when the app is restarted, but that might not happen.
So which one is the right one? It depends.

Can we write custom Spring transaction manager?

Suppose that we have a businessLogic() method that does 2 things: write some information in a local cache and save the same information in the DB using JDBC so that the contents of the cache and the DB are always the same.
I know we can use Spring's JDBC DataSourceTransactionManager to automatically roll back the DB in case of an exception. However, how can we define a custom transaction manager that also rolls back the content of the cache in this case, so that the contents of the cache and the DB are always in sync?
Thanks all.

Gab's answer is right, except for the parts that aren't.
XA is indeed the standard way to coordinate update of multiple resources... except that where the cache is local i.e. in-process, it's not necessarily a resource.
A cache doesn't exactly 'implement JTA', it acts in one of two roles in the XA protocol, according to how it's deployed. It can be an XAResource, but that's usually only required where its lifecycle is distinct from that of the client process. For in-process use, it's more likely to be a Synchronization.
The key difference between these roles is: XAResource is fault-tolerant, but Synchronization is not. For a volatile cache that's in-memory with the client process, it's sufficient to rebuild the cache after a crash by querying the db. For a cache that's out of process, a client crash after the db tx commit but before the cache update would leave the cache out of sync, at least until it expired or was manually refreshed.
Depending on the cache implementation, there is no guarantee it will pick the right mode automatically. See the configuration reference for your chosen implementation e.g. https://infinispan.org/docs/stable/user_guide/user_guide.html#tx_sync_enlist
Spring isn't actually a JTA XA transaction manager either, though it does provide an abstraction layer over them. It's possible to use Spring to drive a database in non-XA mode, but then you have no standard hook for the cache Synchronizations and you need a proprietary interface instead. Or you can have the database do pseudo-XA via a one-phase resource adapter. Full-on 2PC is probably overkill for your use case.
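For the in-process case, the Synchronization role can be sketched roughly as follows. To keep the example compilable without a JTA jar, TxCallback is a local stand-in that only mirrors the shape of javax.transaction.Synchronization, and the status constants copy the values of javax.transaction.Status. BufferedCache and TxCachingSketch are hypothetical names, and a real cache product would do considerably more.

```java
import java.util.HashMap;
import java.util.Map;

// Local stand-in mirroring the shape of javax.transaction.Synchronization.
// Defined here only so the sketch has no dependency on a JTA jar.
interface TxCallback {
    void beforeCompletion();
    void afterCompletion(int status);
}

final class TxCachingSketch {
    private TxCachingSketch() {}

    // Same numeric values as javax.transaction.Status.STATUS_COMMITTED
    // and javax.transaction.Status.STATUS_ROLLEDBACK.
    static final int STATUS_COMMITTED = 3;
    static final int STATUS_ROLLEDBACK = 4;

    // Buffers writes made during the transaction and publishes them only
    // once the transaction manager reports a commit.
    static final class BufferedCache implements TxCallback {
        private final Map<String, String> visible = new HashMap<>();
        private final Map<String, String> pending = new HashMap<>();

        void put(String key, String value) {
            pending.put(key, value);
        }

        String get(String key) {
            return visible.get(key);
        }

        @Override
        public void beforeCompletion() {
            // Nothing to flush for a purely in-memory cache.
        }

        @Override
        public void afterCompletion(int status) {
            if (status == STATUS_COMMITTED) {
                visible.putAll(pending);
            }
            // On rollback (or any other status) the buffered writes are dropped.
            pending.clear();
        }
    }
}
```

Because the callback runs after the database outcome is known, a crash between the commit and afterCompletion loses only the in-memory update, which matches the non-fault-tolerant behaviour described above.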

First of all, I believe that transaction management for the cache is redundant. I advise you to update the cache only if the database-level transaction has been committed successfully.
Most caching scenarios remain completely acceptable if there is a small window between the update of an entity in the database and the update of its cached state.
If your case rules out any possibility of an outdated cache, then you probably have to avoid using a cache or use something special for caching, probably the same database as your original data, with transaction support. Otherwise you will have problems trying to maintain consistency between two different systems: the db level and the cache level. Most of the time the best you can achieve is eventual consistency: you will still have windows of inconsistent state, and only afterwards (eventually) will the data become consistent.

The standard way to deal with a transaction distributed among multiple resources is to use XA.
You must then access your database using an xa-datasource and use a cache implementation that supports JTA, e.g. ehcache.
I'm not very familiar with Spring Boot, but the transaction manager should manage the transaction synchronization across both resources out of the box with the appropriate configuration (no need to override anything).

Exception if no transactions are configured?

I am using a standalone Spring/Hibernate application. If I don't configure transactions I get the exception below.
Exception in thread "Thread-1" org.hibernate.HibernateException: No Hibernate Session bound to thread, and configuration does not allow creation of non-transactional one here
In an integrated Spring/Hibernate application, is it mandatory to have transaction configuration?
Thanks!

Basically, yes. The Hibernate documentation says:
Database, or system, transaction boundaries are always necessary. No
communication with the database can occur outside of a database
transaction (this seems to confuse many developers who are used to the
auto-commit mode). Always use clear transaction boundaries, even for
read-only operations. Depending on your isolation level and database
capabilities this might not be required, but there is no downside if
you always demarcate transactions explicitly. Certainly, a single
database transaction is going to perform better than many small
transactions, even for reading data.

JTA or LOCAL transactions in JPA2+Hibernate 3.6.0?

We are in the process of re-thinking our tech stack and below are our choices (We can't live without Spring and Hibernate due to the complexity etc of the app). We are also moving from J2EE 1.4 to Java EE 5.
Technology stack
Java EE 5
JPA 2.0 (I know Java EE 5 only supports JPA 1.0, but we want to use Hibernate as the JPA provider)
Hibernate 3.6.0 (We already have lots of hbm files with custom types etc., so we don't want to migrate them to JPA at this time. This means we want both JPA and hbm mappings to work together, hence Hibernate as the JPA provider instead of the default that comes with the app server)
Now the problem is that I want to stick with local transactions but other team members want to use JTA. I have been working with J2EE for the last 9 years and I've heard time and again people suggesting to stick with local transactions if I don't need two-phase commits. This is not only for performance reasons; debugging/troubleshooting a local transaction is also a lot easier than a JTA one (even if JTA only does a single-phase commit when required).
My suggestion is to use Spring declarative transaction management + local transactions (HibernateTransactionManager) instead of container JTA.
I want to make sure whether I am being paranoid or have a valid point. I'd like to hear what the rest of the Java EE world thinks. Or please point me to an appropriate article.

As Duffy already mentioned, JTA is not synonymous with 2 phase commit, which is something done via the XA protocol.
In JBoss AS for example, you can explicitly choose whether you want a given data source to be an xa-datasource or a tx-datasource. In both cases, transactions are managed via JTA.
In some cases you might already have been using JTA without knowing it. If you send a JMS message transactionally, or update a transactional cache in the same transaction where you modify something in a database, the transaction manager automatically switches to XA mode. The datasource representing your DB may not be XA, but in an XA transaction one resource is allowed to be non-XA. Updates to this resource then happen via the last resource commit optimization.
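The idea behind the last resource commit optimization can be sketched in a few lines. This is a simplified simulation under stated assumptions, not how any particular transaction manager implements it: the interfaces TwoPhaseResource and OnePhaseResource and the classes RecordingXaResource and LastResourceCommitSketch are hypothetical, and real implementations also log the decision for crash recovery.

```java
import java.util.List;

// Hypothetical two-phase participant, a simplified stand-in for an XA resource.
interface TwoPhaseResource {
    boolean prepare();   // true means the resource votes to commit
    void commit();
    void rollback();
}

// Hypothetical one-phase participant, e.g. a non-XA datasource.
interface OnePhaseResource {
    boolean commit();    // true means the local commit succeeded
}

// Simple participant that records its final state for inspection.
final class RecordingXaResource implements TwoPhaseResource {
    private final boolean voteYes;
    String state = "active";

    RecordingXaResource(boolean voteYes) {
        this.voteYes = voteYes;
    }

    @Override
    public boolean prepare() {
        if (voteYes) {
            state = "prepared";
        }
        return voteYes;
    }

    @Override
    public void commit() {
        state = "committed";
    }

    @Override
    public void rollback() {
        state = "rolled back";
    }
}

final class LastResourceCommitSketch {
    private LastResourceCommitSketch() {}

    // Prepares every XA resource first, then uses the one-phase commit of
    // the single non-XA resource as the overall commit decision.
    static boolean complete(List<? extends TwoPhaseResource> xaResources, OnePhaseResource last) {
        for (TwoPhaseResource r : xaResources) {
            if (!r.prepare()) {
                // A "no" vote aborts everything. The non-XA resource is never
                // committed (its local rollback is omitted from this sketch).
                for (TwoPhaseResource each : xaResources) {
                    each.rollback();
                }
                return false;
            }
        }
        boolean committed = last.commit();
        for (TwoPhaseResource r : xaResources) {
            if (committed) {
                r.commit();
            } else {
                r.rollback();
            }
        }
        return committed;
    }
}
```

The non-XA resource goes last because it cannot promise anything in a prepare phase. Its actual commit result is the only safe point at which the prepared XA resources can be told which way to go.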
Although you should always calculate the risks and test for yourself, I do want to warn against unfounded fear. XA seems to be one of those things we as developers have been brought up to fear. There was an interesting discussion on the JBoss forum about this recently: when to use xa-datasource.
The thing is that XA might have been a complex technology with sub-par implementations in the past, but almost a decade and a half after this FUD started, that might not be the case anymore. What was complex big-enterprise stuff in 1995 is common run-of-the-mill technology in 2011.
Compare this with the fear of EJB we were once brought up with, which is now completely irrelevant, or the fear of virtual machines (obviously not a problem for Java programmers), or, if you have really been in this industry for a long time, the fear of doing something as basic as function calls ;)

JTA doesn't mean two phase commits. I think it's the combination of JTA and XA drivers that makes two phase commits possible.
I'd still recommend using JTA and declarative transactions over embedding transaction logic in code. Transactions are best done in aspect oriented fashion, a la Spring.
UPDATE:
With the additional information you've posted, I agree with your argument. I'd recommend using Spring declarative transactions and the HibernateTransactionManager class.

Is it possible to use more than one persistence unit in a transaction, without it being XA?

Our application needs to use (read-only) a couple different persistence units pointing to different databases (different, commercial vendors as well).
We do not have the budget to enable 2pc on one of them (Sybase). Is there a way to use these in a transaction without it having to be an XA transaction?
We're using Websphere 6.1, Sybase 12.5.3, Oracle 10g, Java EE 5, and JPA with Hibernate Entity Manager.
Update: The Oracle PU is updated rarely (once or twice per month); the Sybase PU is updated very frequently (many times per day). Isolation is definitely a concern for the latter; consistency between the two does not need to be enforced.

Careful.
Read-only does not always mean that 2PC does not apply. If you have two databases, and you read both but only update one, you need a transaction to guarantee consistent results. Suppose you have a scenario where you read database A, then use those results to read and update database B. If you fail to use a transaction with database A, then it is possible that while your operation is active, the data you have read from database A can be read and updated by another application. In this case you can get inconsistent data in database B.
If you truly are reading BOTH databases and updating neither, again you may think that a distributed transaction and its accompanying locking is unnecessary. Once again though, maybe not. You may get inconsistent reads in this scenario as well, if other applications are updating the same databases. It depends on your requirements and the other users of the database.
I would suggest reading up on isolation levels to get some insight into the locking that applies, even during read operations, for all durable stores like databases. Transactional locking may be unnecessary; for example it is unnecessary if you are dealing with data that effectively does not change (no writes by any app).
Maybe there is a business solution here - negotiate with your vendor to drop the price of XA enablement, and pay it. With the economy, you may get a deal you can afford. Side note: I am surprised that you can license a database and NOT get transactions. I was not aware that it was possible to license Sybase in that way.

Atomikos TransactionsEssentials is a free, open source JTA/XA implementation with connection pools for JDBC (and JMS).
One of its features is its added support for non-XA datasources. If access is read-only (your case), it is safe and easy to use our non-XA datasource to include your Sybase database in a JTA transaction.
Best
Guy
