Spring Transactional slows down complete process

I am trying to analyze a situation involving two classes. ProcessImpl is the starting point and internally calls other, child transactional methods. I don't know what is going wrong.
ProcessImpl imports some data and writes the related records to the database.
Specs
Spring-orm version : 3.2.18.RELEASE.
JDK version : 1.8.
Db : H2 (on any db same performance is recorded).
Issue
If I remove @Transactional from ProcessImpl.processStage(), the process takes ~50 seconds.
If I keep @Transactional on ProcessImpl.processStage(), the process takes ~15 minutes.
I don't know why this is happening.
I have been trying to solve this for a long time with no luck. Please have a look at the code below.
Requirement:
The whole of processStage() should either complete or roll back completely, even if only one of the child transactions fails.
FYI: I also get a lot of messages like "Participating in existing transaction". I tried to get rid of them by adding propagation=Propagation.NESTED to processStage(), but that did not work.
ProcessImpl Class.
public class ProcessImpl {
    /* This is the big transaction that calls other transactional stuff from MyServiceImpl.
     * This is the starting point, you can say, for the process...
     *
     * If we remove @Transactional from here the process is lightning fast:
     * With @Transactional: 15 minutes
     * Without @Transactional: 50 seconds
     */
    @Transactional
    public void processStage() {
        MyServiceImpl mp = new MyServiceImpl();
        // do some stuff
        mp.doWork1();
        // do more work
        mp.doWork2();
    }
}
MyServiceImpl Class
class MyServiceImpl {
    @Transactional
    public void doWork1() {
        Object o = doChildWork();
        // and more stuff
        // calls other class services and dao layers
    }
    @Transactional
    public void doWork2() {
        // some stuff
        doChildWork2();
        doChildWork();
        // more work
    }
    @Transactional
    public Object doChildWork() {
        return new Object(); // hypothetical, I am returning list and other collection stuff
    }
    @Transactional
    public Object doChildWork2() {
        return new Object(); // hypothetical, I am returning list and other collection stuff
    }
}
Also, will I run into the self-invocation issue here, which is said to be inadvisable with @Transactional?

It is hard to guess exactly what is happening in your code, but these are the likely problems:
Locking at the DB level.
This can happen when you update the same DB object within doWork1() and doWork2(). Since both methods run within one transaction, the updates made inside doWork1() are not committed until doWork2() has completed. Both methods might try to lock the same DB object and wait for it. Technically it could be any DB object: a row in a table, an index, a whole table, etc.
Analyze your code and try to find what could be locked. You can also look at the DB transaction log while the method is running. All popular DBs provide functionality that helps to find problematic places.
Slowdown while the Hibernate context is refreshed. When you update too many objects, the ORM engine (let's say Hibernate) has to sync them and keep them in memory. Hibernate literally has to hold all the old states and all the new states of the updated objects. Sometimes it does not do this in an optimal way.
You can pinpoint this with a debugger. Try to find the slowest place and check what exactly is being invoked there. My guess is that it slows down when Hibernate updates the state of its cache.
One more issue: I see that you create MyServiceImpl with its constructor inside processStage(). I'd recommend replacing this with Spring autowiring. First of all, this is not how it was designed to be used: an object created with new is not managed by Spring, so with the default proxy-based setup its @Transactional annotations are not applied. In theory it could also influence the execution in other ways.
will I get self invocation issue, which is not advisable in Transactional?
No, it will work just fine; the annotations are simply ignored. The calls to doChildWork() and doChildWork2() inside doWork2() are treated as ordinary Java calls (Spring cannot add any "magic" to them as long as you invoke them directly).
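The reason direct calls bypass the annotations can be shown without Spring at all. The following is only a sketch that uses a JDK dynamic proxy as a stand-in for Spring's transactional proxy; the Work interface, WorkImpl and SelfInvocationDemo are hypothetical names, and the handler merely records which calls it sees where a real transaction interceptor would act.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

// Hypothetical interface standing in for the service from the question.
interface Work {
    void doWork2();
    void doChildWork();
}

class WorkImpl implements Work {
    @Override
    public void doWork2() {
        // Plain Java call on "this": it never passes through the proxy.
        doChildWork();
    }

    @Override
    public void doChildWork() {
        // nothing to do in this sketch
    }
}

class SelfInvocationDemo {
    // Names of the methods that actually went through the proxy.
    static final List<String> intercepted = new ArrayList<>();

    // Wraps the target in a proxy that records every call made through it.
    static Work proxied(Work target) {
        InvocationHandler handler = (proxy, method, args) -> {
            intercepted.add(method.getName()); // a transaction interceptor would act here
            return method.invoke(target, args);
        };
        return (Work) Proxy.newProxyInstance(
                Work.class.getClassLoader(), new Class<?>[] {Work.class}, handler);
    }

    public static void main(String[] args) {
        Work work = proxied(new WorkImpl());
        work.doWork2();
        System.out.println(intercepted); // prints [doWork2]
    }
}
```

Only the outer call is intercepted; the nested doChildWork() call is invisible to the proxy, which is why annotations on it have no effect in a proxy-based setup.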

Any answers here will really only be (very well informed) conjecture. In this sort of situation the best thing to do is to get hold of a Java profiler and do detailed CPU-level profiling to work out exactly what is going on.
I suggest the excellent YourKit which is commercial but you can get a free trial.
https://www.yourkit.com/docs/java/help/performance_bottlenecks.jsp

Related

Recommended approach when restoring a Spring State Machine instance

I am planning to use Spring State Machine to control an execution workflow. The system is expected to receive requests from multiple users, and each user may be assigned to multiple workflows. My initial idea was to have one SM instance per workflow: every time a user performs a step in the workflow, I would use its identifier to restore the machine from persistent storage, feed in the new event and store the updated SM.
I've read that initialising an SM is an expensive operation, and some people recommend having a single instance of it and "rehydrating" that instance with some data. My understanding is that this would be more efficient, but I think it would become a "blocking" operation; in other words, one workflow would need to wait for the previous one to be finished/released beforehand. Since I'm a newbie on this topic, can anyone shed some light on the best alternatives for my use case, perhaps with pieces of code to illustrate the differences? (PS: I'm using v2.4.0)

At first I implemented the "rehydrate" mechanism because, as you said, it made sense and it is also used in the "persist" example of spring-statemachine.
However, running performance tests against my API showed that using a single instance fails when the StateMachine is an @Autowired bean with prototype scope, as described in that example. What happens is that simultaneous requests against my API overwrite that StateMachine bean, and the first request fails because the state machine changes while it is being written back to the DB (I used Redis).
So now I actually build a fresh state machine every time a request comes in and rehydrate that object:
public String getStatesGuest(HttpServletRequest httpServletRequest) throws Exception {
    StateMachine<States, Events> stateMachine = stateMachineConfig.stateMachine();
    resetStateMachineFromStore(httpServletRequest.getSession().getId(), stateMachine);
    return convertToJson(buildGetStateResponse(stateMachine));
}
It still is very performant, I was testing with around 30 reqs/s and still got a median of 12ms. (Docker with 2 Cores for spring boot, 1 Core for redis).

Reusing same @Transactional method for different DataSources (JdbcTemplate) in Spring

We have code where the same service method calls different DAOs, each using a different DataSource (and a different JdbcTemplate). We would like to use the @Transactional annotation, but as far as I know, this annotation is always linked to a specific TransactionManager (and thus to a specific DataSource).
So my question is: is there a way to choose dynamically which DataSource (or TransactionManager) to use when calling a @Transactional method, so that I can reuse that method against different databases?

The @Transactional annotation doesn't allow dynamic evaluation of the value attribute, which selects the TransactionManager (possibly by design; at least it doesn't look like it's going to change any time soon). So you can't have something like @Transactional("#{@getTxManager}"), which would resolve the tx manager at call time.
In simple cases you might be able to get away with the following, but it would only be worth considering when, for example, you have a primary DS and a secondary DS that's used only in some cases. Otherwise you'd be peppering the code that chooses between calling foo and bar all over the place, and that wouldn't look clean at all.
// TX boundary on the "top" abstraction layer
@Transactional("foo")
public void foo() {
    doWork();
}
@Transactional("bar")
public void bar() {
    doWork();
}
private void doWork() {
    // Work done here, no concern for tx management
}
For more complex cases like multitenancy, AbstractRoutingDataSource is an easy and robust choice if you haven't considered it yet, although depending on how much switching you need, it may require tweaking or even be unsuitable.
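The routing idea itself is small. The following is a Spring-free sketch of the concept only, not of Spring's API: TenantContext and RoutingSketch are hypothetical names, and plain strings stand in for real DataSource objects. In Spring, the equivalent would be a subclass of AbstractRoutingDataSource whose determineCurrentLookupKey() returns the current key.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical holder for the key that selects the target for the current thread.
class TenantContext {
    private static final ThreadLocal<String> CURRENT = new ThreadLocal<>();

    static void set(String key) { CURRENT.set(key); }
    static String get() { return CURRENT.get(); }
    static void clear() { CURRENT.remove(); }
}

// Stand-in for a routing data source: String targets replace real DataSource objects.
class RoutingSketch {
    private final Map<String, String> targets = new HashMap<>();
    private final String defaultTarget;

    RoutingSketch(String defaultTarget) {
        this.defaultTarget = defaultTarget;
    }

    void register(String key, String target) {
        targets.put(key, target);
    }

    // Looks up the target for the current thread's key, falling back to the default.
    String resolve() {
        String key = TenantContext.get();
        if (key == null) {
            return defaultTarget;
        }
        return targets.getOrDefault(key, defaultTarget);
    }
}

class RoutingSketchDemo {
    public static void main(String[] args) {
        RoutingSketch routing = new RoutingSketch("fooDb");
        routing.register("bar", "barDb");

        System.out.println(routing.resolve()); // fooDb (no key set)
        TenantContext.set("bar");
        try {
            System.out.println(routing.resolve()); // barDb
        } finally {
            TenantContext.clear(); // always reset, threads are usually pooled
        }
    }
}
```

The key has to be set before the transaction starts, because the connection is typically obtained when the transaction begins.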
Finally, you might be able to create your own annotation that does choose the DS dynamically (although I can't guarantee it), but that would be the riskiest approach for possibly very little gain.

The safest way for you would be to create a separate service for each DAO... I wouldn't like to be debugging such code. Think about maintaining this code and the failures that might happen.
If I were you I'd ask myself the following questions:
1.) Why separate dbs?
2.) Isn't the context somehow mixed up? Maybe create some boundaries between them?
3.) Do my queries really need to be transactional?
I probably don't know the full context of your problem, but to me it seems that you've modeled your application in the wrong way, and I'd focus on that.

Struts2 application scope instances

I've inherited a Struts2 project which needs some functionality added. When I dug into the code to work out how the previous guy did things, I found that when he wants a class to be instantiated only once, when the Tomcat server starts (because it has to read heavy loads of data from disk, but only once, to get its config, for instance), he does it in the following way:
public class ExampleClass {
    public ExampleClass() {
        // Read files and stuff to initialize
    }
    public Object method(Object[] args) {
        // The job to do
    }
}
And then, in the struts action which uses it he instantiates it this way:
public class SomeAction extends ActionSupport {
    ExampleClass example = new ExampleClass();
    public String execute() {
        // Do stuff every time the action is called
        Object result = example.method(args);
        // Do stuff with results
    }
}
I know from servlet times that this does the trick, however, I feel like the guy who handled this before was as inexperienced in Struts2 as I am, so here comes my question:
Is this the proper way to do so according to style recommendations and best practices? Does struts2 provide a more controlled way to do so?
I found some answers related to simple parameters here, but I'm not sure if this is the proper way for objects like those? What would happen if ExampleClass instance is really heavy? I don't want them to be copied around:
How to set a value in application scope in struts2?
Some background about ExampleClass: when the constructor is called, it reads large sets of files and extracts its configuration from them, creating complex internal representations.
When method() is called, it analyzes its parameters using the rules and outputs results to the user. This process usually takes seconds and doesn't modify the previously initialized rule values.
This is running in Tomcat 7; however, I'm planning to upgrade to Tomcat 8.5 when everything is in place. I'd like to know if there are known issues with this setup as well (there are no other incompatibilities in the code).
BTW: he's not checking whether ExampleClass is broken or anything like that, so this definitely looks like a recipe for disaster xD. In fact, if I remove the source files, it still tries to execute method()... Poor soul...
Ideally, I need a way to instantiate all my application-level objects on start-up (they're the application itself, the rest is just a mere interface) in a way that if they fail Struts2 will tell Tomcat not to start that war, with the corresponding error logging and so on.
If Struts2 doesn't support this, which is the commonly accepted work-around? Maybe some Interceptor to check the object status and return to a error page if it hasn't been correctly instantiated? Execute a partial stop of tomcat from within?
All the objects in this project are thread safe (the only write operation inside them is performed on initialization), but I'd like to know the best practices for Struts2 when objects are not so simple. What happens if a user can actually break one? (I know I should avoid that by any means, and I do, but mistakes happen, so I need a secure way to get through them and be properly alerted, and of course I need a way to reinstantiate it safely or to stop the whole service.)
Right now, I can manually execute something like:
public class SomeAction extends ActionSupport {
    ExampleClass example = new ExampleClass();
    private boolean otherIsBuildingExample = false;
    public String execute() {
        if (otherIsBuildingExample) return "500 error";
        if (example == null || example.isBroken()) {
            otherIsBuildingExample = true;
            example = new ExampleClass();
            otherIsBuildingExample = false;
        }
        Object result = example.method(args);
        // Do stuff with results
    }
}
Indeed, this would be cleaner with Interceptors or the like. However, it sounds like a pain in the *** for concurrency, especially considering that example takes several seconds to start and that more requests can come in meanwhile. That raises further concerns, like: what if two people evaluate if(otherIsBuildingExample) and the second one reads the value before the first one performs otherIsBuildingExample = true? Nothing good... If the class is simple enough, both will instantiate it and the slower one will prevail, but if one instantiation blocks the other's resources... well, more problems.
The only clean solution I can think of is to make ExampleClass robust enough that it can be repaired using its own methods (without reinstantiating it), and to make those methods thread safe in the usual way (if 10 people try to repair it, only one proceeds, while the others just wait for the first to finish before continuing, for instance).
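The "only one proceeds while the others wait" behaviour can be sketched in plain Java with a lock and a re-check. This is only an illustration under assumptions: SharedInstanceHolder and SharedInstanceDemo are hypothetical names, the factory and the broken-check are supplied by the caller, and nothing here is specific to Struts2.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Predicate;
import java.util.function.Supplier;

// Hypothetical holder: keeps one shared instance and rebuilds it under a lock.
class SharedInstanceHolder<T> {
    private final Supplier<T> factory;
    private final Predicate<T> isBroken;
    private volatile T instance;

    SharedInstanceHolder(Supplier<T> factory, Predicate<T> isBroken) {
        this.factory = factory;
        this.isBroken = isBroken;
    }

    T get() {
        T current = instance;
        if (current == null || isBroken.test(current)) {
            synchronized (this) {
                // Re-check: another thread may have rebuilt it while we waited.
                current = instance;
                if (current == null || isBroken.test(current)) {
                    current = factory.get();
                    instance = current;
                }
            }
        }
        return current;
    }
}

class SharedInstanceDemo {
    // Calls get() from several threads and returns how many times the factory ran.
    static int buildsWithThreads(int threadCount) {
        AtomicInteger builds = new AtomicInteger();
        SharedInstanceHolder<Object> holder = new SharedInstanceHolder<>(
                () -> {
                    builds.incrementAndGet();
                    return new Object();
                },
                candidate -> false); // never reports the instance as broken
        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            threads[i] = new Thread(holder::get);
            threads[i].start();
        }
        for (Thread thread : threads) {
            try {
                thread.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return builds.get();
    }

    public static void main(String[] args) {
        System.out.println(buildsWithThreads(10)); // prints 1
    }
}
```

Callers that arrive during a rebuild block on the lock and then reuse the instance the first thread created, so the factory runs once per repair.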
Or maybe every time you call execute() you get a copy of example, so there is nothing to worry about?
I'm digging into the Struts documentation.
Thanks in advance.

How can a Spring Application be made to be single threaded or made to have one instance?

I have a Spring application that is controlled with an API that we need to be single threaded, but I can not figure out how to accomplish this. The application is a re-factoring of an app that is single threaded. They want the same basic design for the new version, while using our new programming methods (i.e. Java, Spring, etc.) and adding extra functionality.
There is an API resource to start the application:
@RequestMapping("/start")
public String startProcess(){...}
If this gets called twice then the application will start another thread. We want to stop this from happening. But, we still want the stop API resource to work:
@RequestMapping("/stop")
public String stopProcess(){...}
The app has a typical Spring structure:
@SpringBootApplication
public class MyApplication{...}
@RestController
public class MyController{
    @Autowired
    private MyService myService;
    ...}
@Service
@Transactional
public class MyService{
    @Autowired
    private MyDAO myDAO;
    ...}
@Repository
public class MyDAO{...}
How do I make sure that there is only one instance of this application running at a time? Please Help! And, thanks in advance!

You have actually two different problems: making your API single-threaded and making sure that there is only one instance of this application running at a time.
The solution is conceptually the same: you have to synchronize on some mutex. But it's much easier to do in the first case than in the second.
To make your API single-threaded you'll need to synchronize on something. If you have just one controller, just make the API methods synchronized. If you have more than one controller, you'll need to create an application-scoped bean, inject it into every controller and synchronize on it. In the old days there was also something like SingleThreadModel, but it has been deprecated (since Servlet 2.4). I haven't seen it around for a few years, but I wouldn't be surprised if Spring had a way of setting it somehow.
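A minimal sketch of the single-controller case, assuming the long-running work runs on its own thread so that a stop call is not blocked behind start. ProcessControl, ProcessControlDemo and idleJob() are hypothetical names, and plain Java stands in for the Spring controller, whose /start and /stop handlers would delegate to one shared instance.

```java
// Hypothetical guard object; in a Spring app it could be a singleton bean
// that the /start and /stop handlers delegate to.
class ProcessControl {
    private Thread worker; // guarded by the monitor of "this"

    // Starts the job unless one is already running.
    synchronized String start(Runnable job) {
        if (worker != null && worker.isAlive()) {
            return "already running";
        }
        worker = new Thread(job, "process-worker");
        worker.start();
        return "started";
    }

    // Asks the running job to stop by interrupting its thread.
    synchronized String stop() {
        if (worker == null || !worker.isAlive()) {
            return "not running";
        }
        worker.interrupt();
        return "stopping";
    }
}

class ProcessControlDemo {
    // Hypothetical job: idles until it is interrupted.
    static Runnable idleJob() {
        return () -> {
            try {
                Thread.sleep(Long.MAX_VALUE);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // restore the flag and exit
            }
        };
    }

    public static void main(String[] args) {
        ProcessControl control = new ProcessControl();
        System.out.println(control.start(idleJob())); // started
        System.out.println(control.start(idleJob())); // already running
        System.out.println(control.stop());           // stopping
    }
}
```

Because both methods synchronize on the same object, a second start cannot slip in between the check and the thread creation.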
Making sure that there is only one instance of this application running at a time is much harder. You basically want to prevent anybody from starting several copies of the application in parallel. One way to achieve this is to have some central shared resource like a database. On start-up the application tries to "acquire" the mutex by creating a record in some table (which allows at most one record). If the record was created successfully, the application starts normally; if not, it fails. You'll need some mechanism to detect a stale mutex record, probably as simple as saving a timestamp in the mutex record and constantly updating it via a scheduled task (heartbeat).
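The acquire-with-heartbeat logic can be sketched in memory. This is an illustration under assumptions, not a database implementation: LeaseMutex is a hypothetical name, an AtomicReference stands in for the single-row table, and the caller passes in the current time so that staleness is easy to reason about.

```java
import java.util.concurrent.atomic.AtomicReference;

// In-memory stand-in for the single-row mutex table described above.
class LeaseMutex {
    // Hypothetical mutex record: who holds it and when it was last refreshed.
    private static final class Lease {
        final String owner;
        final long heartbeatMillis;

        Lease(String owner, long heartbeatMillis) {
            this.owner = owner;
            this.heartbeatMillis = heartbeatMillis;
        }
    }

    private final AtomicReference<Lease> row = new AtomicReference<>();
    private final long staleAfterMillis;

    LeaseMutex(long staleAfterMillis) {
        this.staleAfterMillis = staleAfterMillis;
    }

    // Acquires the mutex, or refreshes the heartbeat if the caller already holds it.
    // Fails while another owner holds a lease that is not yet stale.
    boolean tryAcquire(String owner, long nowMillis) {
        Lease current = row.get();
        boolean heldByOther = current != null
                && !current.owner.equals(owner)
                && nowMillis - current.heartbeatMillis < staleAfterMillis;
        if (heldByOther) {
            return false;
        }
        // compareAndSet plays the role of an atomic insert/update in the database.
        return row.compareAndSet(current, new Lease(owner, nowMillis));
    }

    // Releases the mutex if the caller holds it.
    void release(String owner) {
        Lease current = row.get();
        if (current != null && current.owner.equals(owner)) {
            row.compareAndSet(current, null);
        }
    }
}
```

With a real database the same roles would be played by a unique constraint (for the insert) and a conditional update (for the heartbeat and the takeover of a stale record).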
We recently had a similar task in an application running many instances of many microservices. We needed exactly one microservice to regularly execute a certain maintenance task. We solved it by synchronising over a central MongoDB database. Microservices try to acquire the mutex by creating a document in a database collection. By design, at most one document may exist in that collection; the microservice which created the document performs the regular task and removes the document at the end. The collection is configured with automatic clean-up, so if the microservice fails to remove the document for whatever reason, the database removes it automatically.

Should I always prefer member variables to parameters method when it makes sense?

Storing a value in a field makes sense only if it represents part of the object's state.
I wonder whether this "rule" should be set aside in certain cases.
Here is an example. Suppose we have this class:
public class DbCar {
    private ResultSet rs;
    public DbCar(ResultSet rs) {
        this.rs = rs;
    }
    public Car buildObject() {
        // ... does some mappings, then returns the built car ...
    }
}
So we see that the ResultSet is stored as a member variable, and that makes sense, since every DbMapper like DbCar manipulates a ResultSet retrieved from a JDBC query.
The caller would look as follows:
while (rs.next()) {
    items.add(new DbCar(rs).buildObject());
}
But imagine that the current query returned 15000 records.
In a nutshell, 15000 instances of DbCar were created.
So my question is: is the garbage collector efficient enough that I shouldn't worry about this huge number of instances?
Of course, to avoid all these instances, we can refactor the code as follows:
public class DbCar {
    public Car buildObject(ResultSet rs) {
        // ... does some mappings, then returns the built car ...
    }
}
In this case, a single instance of DbCar (in the current thread and in the current method) would be created, and the caller would look as follows:
DbCar dbCar = new DbCar();
while (rs.next()) {
    items.add(dbCar.buildObject(rs));
}
So, which solution should I choose? Trust the garbage collector and keep the more elegant code, or write in a more procedural style with the ResultSet passed as a method parameter?
To make the choice harder, imagine that the DbCar class divides its "build" method into elegant small methods, each dedicated to a specific responsibility, for instance:
"buildEngine", "buildDoors", etc. If I go with the method parameter, I would have to pass the ResultSet into all these methods... boring and redundant, isn't it?

The 2nd is better. 15000 objects is nothing for the garbage collector, but hanging on to the ResultSet object is not recommended. #2 is better for that reason.

That does not constitute a problem for the GC but it all boils down to your application requirements.
In a previous project, I was involved in developing a very big "near real time" application that ran on a Solaris server (it required 10GB of RAM to start). The application created something like 150000 DTO objects every 4 seconds or so. At first glance this had no impact on the GC, but after some hours of operation the users started complaining about the software losing data coming out of the hardware. We spent a long night investigating the problem and finally found out that the GC was taking the full CPU to clean up the unused objects, which made the application look like it hung for a second or so (trust me, a second of data loss costs more than $1000).

I would trust that the garbage collector can handle this nicely, and choose the clearest implementation. If you find later that this causes a performance problem, you can refactor then. You should always write the cleaner implementation unless you have a proven reason not to (like a profiler showing you that this is a bottleneck and that you should sacrifice some clarity to make it faster).
I would be surprised if the GC couldn't handle this nicely.

Your particular problem is mostly a matter of functional design. One normally should not worry about the garbage collector when designing a program. If you have to think about the GC, I would assume you are working on an application with non-functional requirements to use very little memory, or you are optimizing a very big architecture. Is that your case?
There are existing patterns for implementing templates like this. I'm referring to Spring's JDBC template, for example:
http://static.springsource.org/spring/docs/2.0.x/reference/jdbc.html
