I am trying to get Multi-Threading working in my Java web application and it seems like no matter what I try I run into some sort of issues with connection pooling.
My current process is that I have am looping through all my departments and processing them, which finally generates a display. This takes time, so I want to spawn a thread for each department and have them process concurrently.
After alot of time of first figuring out how to get my hibernate session to stay open in a thread to prevent the lazy initializion loading errors, I finally had the solution of creating a Spring bean of my Thread class, and creating a new instance of that bean for each thread. I have tried 2 different versions
1) I directly inject the DAO classes into the Bean. FAILED - After loading the page a few times I would get "Cannot get a connection, pool error Timeout waiting for idle object" for each thread and the app would crash.
2) Ok so then I tried injecting the spring SessionFactory into my bean, then create new instances of my DAO and set it with the SessionFactory. My DAO objects all extend HibernateDaoSupport. FAILED - After loading the page a few times I would get "too many connections" error messages.
Now what I am confused about is that my SessionFactory bean is a singleton which I understand to mean that it is a single Object that is shared throughout the Spring container. If that is true then why does it appear like each Thread is creating a new connection when it should just be sharing that single instance? It appears that all my connection pool is getting filled up and I don't understand why. Once the threads are done, all the connections that were created should be released but there not. I even tried running a close() operation on the injected SessionFactory but that has no effect. I tried limiting how many threads run concurrently at 5, hoping that would cause not so many connections to be created at one time but no luck.
I am obviously doing something wrong but I am not sure what. Am I taking the wrong approach entirely in trying to get my hibernate Session into my Threads? Am I somehow not managing my connection pool properly? any ideas would be greatly appreciated!
More info I thought of: My process creates about 25 threads total, which are run 5 at a time. I am able to refresh my page about 3 times before I start getting the errors. So apparently each refresh creates and holds on to a bunch of connections.
Here is my spring config file:
<bean id="mainDataSource" class="org.apache.commons.dbcp.BasicDataSource"
destroy-method="close">
....
<!-- Purposely put big values in here trying to figure this issue out
since my standard smaller values didn't help. These big values Fail just the same -->
<property name="maxActive"><value>500</value></property>
<property name="maxIdle"><value>500</value></property>
<property name="minIdle"><value>500</value></property>
<property name="maxWait"><value>5000</value></property>
<property name="removeAbandoned"><value>true</value></property>
<property name="validationQuery"><value>select 0</value></property>
</bean>
<!--Failed attempt #1 -->
<bean id="threadBean" class="controller.manager.ManageApprovedFunds$TestThread" scope="prototype">
<property name="dao"><ref bean="dao" /></property>
</bean>
<!--Failed attempt #2 -->
<bean id="threadBean" class="manager.ManageApprovedFunds$TestThread" scope="prototype">
<property name="sessionFactory"><ref bean="sessionFactory" /></property>
</bean>
Java code:
ExecutorService taskExecutor = Executors.newFixedThreadPool(5);
for(Department dept : getDepartments()) {
TestThread t = (TestThread)springContext.getBean("threadBean");
t.init(dept, form, getSelectedYear());
taskExecutor.submit(t);
}
taskExecutor.shutdown();
public static class TestThread extends HibernateDaoSupport implements Runnable, ApplicationContextAware {
private ApplicationContext appContext;
#Override
public void setApplicationContext(ApplicationContext arg0)
throws BeansException {
this.appContext = arg0;
}
#Override
public void run() {
try {
MyDAO dao = new MyDAO();
dao.setSessionFactory(getSessionFactory());
//SOME READ OPERATIONS
getSessionFactory().close();
} catch (Exception e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
}
Basically you shouldn't try to share a single hibernate Session between threads. Hibernate sessions, nor the entity objects, are threadsafe and that can lead to some suprising problems. Here and [here]3[] is a small but nice read there is also some value in the comments. Basically don't try to share a Session between threads.
There is also another problem as the whole transaction management is based around ThreadLocal objects, the same goes for obtaining the current session and underlying JDBC connection. Now trying to spawn threads will lead to suprising problems, one of them would be connection pool starvation. (Note: don't open too many connections, more information here).
Next to not to opening to much connections you should be aware of starting to many threads. A Thread is bound to a cpu (or core if you have multiple cores). Adding to many threads might lead to heavy sharing of a single cpu/core between to many threads. This can kill your performance instead of increasing it.
In short IMHO your approach is wrong, each thread should simply read the entity it cares about, do its thing and commit the transaction. I would suggest using something like Spring Batch for this instead of inventing your own mechanism. (Although if it is simple enough I would probably go for it).
A database connection is not associated with the SessionFactory but with the Session. To get your handling correct you have to take care of the following:
Only create one instance of SessionFactory (per persistence context) - but you are doing this already
Don't close() the SessionFactory - it's lifetime should end when the application dies - that is at undeployment or server-shutdown.
Make sure to always close() your Session - you use one database connection per open Session and if these don't get closed you are leaking connections.
Related
There is a service that connects to Oracle DB for reading data and it uses Hibernate-3.6 and SpringData-JPA-1.10.x. Heap dumps are getting generated frequently which results in out of memory on the hosts.
After analyzing few heapdumps using Eclipse MAT, found that the majority of the memory is accumulated in one instance of org.hibernate.engine.StatefulPersistenceContext -> org.hibernate.util.IdentityMap -> java.util.LinkedHashMap.
And the leak suspect says
The thread java.lang.Thread # 0x84427e10 ... : 29 keeps local
variables with total size 1,582,637,976 (95.04%) bytes.
The memory is accumulated in one instance of "java.util.LinkedHashMap"
loaded by "".
Searched it on StackOverflow and it says SessionFactory should be singleton and session.flush() and session.clear() should be invoked before each call to clear the cache. But SessionFactory is not explicitly initialized or used in the code.
What's causing the memory leak here (looks like the result of each query is cached and not cleared) and how to fix it?
More info about the Spring Data configuration:
TransactionManager is initialized as:
<tx:annotation-driven mode='proxy' proxy-target-class='true' />
<bean id="transactionManager" class="org.springframework.orm.jpa.JpaTransactionManager">
<property name="entityManagerFactory" ref="entityManagerFactory" />
</bean>
<bean id="entityManagerFactory" class="org.springframework.orm.jpa.LocalContainerEntityManagerFactoryBean">
....
</bean>
<bean id="dataSource" class="com.mchange.v2.c3p0.ComboPooledDataSource" depends-on="...">
....
</bean>
To interact with the table, an interface is declared extending Spring Data Repository and JpaSpecificationExecutor. Both are typed into the domain class that it will handle.
API activity method has the annotation #Transactional(propagation = Propagation.SUPPORTS, readOnly = true).
From what you describe this is what I expect to be going on:
Hibernate (actually JPA in general) keeps a reference to all entities it loads or saves for the lifetime of the session.
In a typical web application setup, this isn't a problem, because. A new session starts with each request and gets closed once the request is finished and it doesn't involve that many entities.
But for your application, it looks like the session keeps growing and growing.
I can imagine the following reasons:
something runs in an open session all the time without it ever closing. Maybe something like a batch job or a scheduled job which runs at regular intervals.
Hibernate is configured in such a way that it reuses the same session without ever closing it.
In order to find the culprit enable logging for opening and closing the session. Judging from https://hibernate.atlassian.net/browse/HHH-2425 org.hibernate.impl.SessionImpl should be the right log category and you probably need trace level logging.
Now test the various requests to your server and see if there are any sessions that get opened but not closed.
The question contains information about the creations of some beans. But the problem doesn't lie there. The problem is in your code, where have you use these beans.
Please check your code. Probably you are loading items in a loop. And the loop is wrapped with a transaction.
Hibernate creates huge intermediate objects, and it doesn't clean these before the transaction being completed (commit/rollback).
I have a question regarding #Transactional annotation.
Nothing special defined, so as I understand is PROPAGATION_REQUIRED
Let’s say I have a transactional annotation which on both service and dao layer.
Service
#Transactional
public long createStudentInDB(Student student) {
final long id = addStudentToDB (student);
addStudentToCourses (id, student.getCourseIds());
return id;
}
private long addStudentToDB (Student student) {
StudentEntity entity = new StudentEntity ();
convertToEntity(student, entity);
try {
final id = dao.create(entity);
} catch (Exception e){
//
}
return id;
}
private void addStudentToCourses (long studentId, List<String> coursesIds){
//add user to group
if(coursesIds!= null){
List<StudentCourseEntity> studentCourses = new ArrayList<>();
for(String coursesId: coursesIds){
StudentCourseEntity entity = new StudentCourseEntity ();
entity.setCourseId(coursesId);
entity.setStudentId(userId);
studentCourses.add(studentId);
}
anotherDao.saveAll(studentCourses);
}
}
DAO
#Transactional
public UUID create(StudentEntity entity) {
if ( entity == null ) { throw new Exception(//…); }
getCurrentSession().save(entity);
return entity.getId();
}
ANOTHER DAO:
#Transactional
public void saveAll(Collection< StudentCourseEntity > studentCourses) {
List< StudentCourseEntity > result = new ArrayList<>();
if(studentCourses!= null) {
for (StudentCourseEntity studentCourse : studentCourses) {
if (studentCourse!= null) {
save(studentCourse);
}
}
}
}
Despite the fact that’s not optimal, it seems it causing deadlocks.
Let’s say I have max 2 connections to the database.
And I am using 3 different threads to run the same code.
Thread-1 and thread-2 receive a connection, thread-3 is not getting any connection.
More than that, it seems that thread-1 become stuck when trying to get a connection in dao level, same for thread-2. Causing a deadlock.
I was sure that by using propagation_required this would not happen.
Am I missing something?
What’s the recommendation for something like that? Is there a way I can have #transactional on both layers? If not which is preferred?
Thanks
Fabrizio
As the dao.doSomeStuff is expected to be invoked from within other transactions I would suggest you to configure this method as:
#Transaction(propagation=REQUIRES_NEW)
Thanks to that the transaction which is invoking this method will halted until the one with REQUIRES_NEW will be finished.
Not sure if this is the fix for your particular deadlock case but your example fits this particular set-up.
You are right, Propagation.REQUIRED is the default. But that also means that the second (nested) invocation on dao joins / reuses the transaction created on service level. So there is no need to create another transaction for the nested call.
In general Spring (on high level usage) should manage resource handling by forwarding it to the underlying ORM layer:
The preferred approach is to use Spring's highest level template based
persistence integration APIs or to use native ORM APIs with
transaction- aware factory beans or proxies for managing the native
resource factories. These transaction-aware solutions internally
handle resource creation and reuse, cleanup, optional transaction
synchronization of the resources, and exception mapping. Thus user
data access code does not have to address these tasks, but can be
focused purely on non-boilerplate persistence logic.
Even if you handle it on your own (on low level API usage) the connections should be reused:
When you want the application code to deal directly with the resource
types of the native persistence APIs, you use these classes to ensure
that proper Spring Framework-managed instances are obtained,
transactions are (optionally) synchronized, and exceptions that occur
in the process are properly mapped to a consistent API.
...
If an existing transaction already has a connection synchronized
(linked) to it, that instance is returned. Otherwise, the method call
triggers the creation of a new connection, which is (optionally)
synchronized to any existing transaction, and made available for
subsequent reuse in that same transaction.
Source
Maybe you have to find out what is happening down there.
Each Session / Unit of Work will be bound to a thread and released (together with the assigned connection) after the transaction has ended. Of course when your thread gets stuck it won't release the connection.
Are you sure that this 'deadlock' is caused by this nesting? Maybe that has another reason. Do you have some test code for this example? Or a thread dump or something?
#Transactional works by keeping ThreadLocal state, which can be accessed by the (Spring managed) Proxy EntityManager. If you are using Propagation.REQUIRED (the default), and you have a non-transactional method which calls two different DAOs (or two Transactional methods on the same DAO), you will get two transactions, and two calls to acquire a pooled connection. You may get the same connection twice or two different connections, but you should only use one connection at the time.
If you call two DAOs from a #Transactional method, there will only be one transaction, as the DAO will find and join the existing transaction found in the ThreadLocal state, again you only need one connection from the pool.
If you get a deadlock then something is very wrong, and you may want to debug when your connections and transaction are created. A transaction is started by calling Connection.setAutoCommit(false), in Hibernate this happens in org.hibernate.resource.jdbc.internal.AbstractLogicalConnectionImplementor#begin(). Connections are managed by a class extending org.hibernate.resource.jdbc.internal.AbstractLogicalConnectionImplementor so these are some good places to put a break point, and trace the call-stack back to your code to see which lines created connections.
Using Spring and Hibernate, I want to write to one MySQL master database, and read from one more more replicated slaves in cloud-based Java webapp.
I can't find a solution that is transparent to the application code. I don't really want to have to change my DAOs to manage different SessionFactories, as that seems really messy and couples the code with a specific server architecture.
Is there any way of telling Hibernate to automatically route CREATE/UPDATE queries to one datasource, and SELECT to another? I don't want to do any sharding or anything based on object type - just route different types of queries to different datasources.
An example can be found here: https://github.com/afedulov/routing-data-source.
Spring provides a variation of DataSource, called AbstractRoutingDatasource. It can be used in place of standard DataSource implementations and enables a mechanism to determine which concrete DataSource to use for each operation at runtime. All you need to do is to extend it and to provide an implementation of an abstract determineCurrentLookupKey method. This is the place to implement your custom logic to determine the concrete DataSource. Returned Object serves as a lookup key. It is typically a String or en Enum, used as a qualifier in Spring configuration (details will follow).
package website.fedulov.routing.RoutingDataSource
import org.springframework.jdbc.datasource.lookup.AbstractRoutingDataSource;
public class RoutingDataSource extends AbstractRoutingDataSource {
#Override
protected Object determineCurrentLookupKey() {
return DbContextHolder.getDbType();
}
}
You might be wondering what is that DbContextHolder object and how does it know which DataSource identifier to return? Keep in mind that determineCurrentLookupKey method will be called whenever TransactionsManager requests a connection. It is important to remember that each transaction is "associated" with a separate thread. More precisely, TransactionsManager binds Connection to the current thread. Therefore in order to dispatch different transactions to different target DataSources we have to make sure that every thread can reliably identify which DataSource is destined for it to be used. This makes it natural to utilize ThreadLocal variables for binding specific DataSource to a Thread and hence to a Transaction. This is how it is done:
public enum DbType {
MASTER,
REPLICA1,
}
public class DbContextHolder {
private static final ThreadLocal<DbType> contextHolder = new ThreadLocal<DbType>();
public static void setDbType(DbType dbType) {
if(dbType == null){
throw new NullPointerException();
}
contextHolder.set(dbType);
}
public static DbType getDbType() {
return (DbType) contextHolder.get();
}
public static void clearDbType() {
contextHolder.remove();
}
}
As you see, you can also use an enum as the key and Spring will take care of resolving it correctly based on the name. Associated DataSource configuration and keys might look like this:
....
<bean id="dataSource" class="website.fedulov.routing.RoutingDataSource">
<property name="targetDataSources">
<map key-type="com.sabienzia.routing.DbType">
<entry key="MASTER" value-ref="dataSourceMaster"/>
<entry key="REPLICA1" value-ref="dataSourceReplica"/>
</map>
</property>
<property name="defaultTargetDataSource" ref="dataSourceMaster"/>
</bean>
<bean id="dataSourceMaster" class="org.apache.commons.dbcp.BasicDataSource">
<property name="driverClassName" value="com.mysql.jdbc.Driver"/>
<property name="url" value="${db.master.url}"/>
<property name="username" value="${db.username}"/>
<property name="password" value="${db.password}"/>
</bean>
<bean id="dataSourceReplica" class="org.apache.commons.dbcp.BasicDataSource">
<property name="driverClassName" value="com.mysql.jdbc.Driver"/>
<property name="url" value="${db.replica.url}"/>
<property name="username" value="${db.username}"/>
<property name="password" value="${db.password}"/>
</bean>
At this point you might find yourself doing something like this:
#Service
public class BookService {
private final BookRepository bookRepository;
private final Mapper mapper;
#Inject
public BookService(BookRepository bookRepository, Mapper mapper) {
this.bookRepository = bookRepository;
this.mapper = mapper;
}
#Transactional(readOnly = true)
public Page<BookDTO> getBooks(Pageable p) {
DbContextHolder.setDbType(DbType.REPLICA1); // <----- set ThreadLocal DataSource lookup key
// all connection from here will go to REPLICA1
Page<Book> booksPage = callActionRepo.findAll(p);
List<BookDTO> pContent = CollectionMapper.map(mapper, callActionsPage.getContent(), BookDTO.class);
DbContextHolder.clearDbType(); // <----- clear ThreadLocal setting
return new PageImpl<BookDTO>(pContent, p, callActionsPage.getTotalElements());
}
...//other methods
Now we can control which DataSource will be used and forward requests as we please. Looks good!
...Or does it? First of all, those static method calls to a magical DbContextHolder really stick out. They look like they do not belong the business logic. And they don't. Not only do they not communicate the purpose, but they seem fragile and error-prone (how about forgetting to clean the dbType). And what if an exception is thrown between the setDbType and cleanDbType? We cannot just ignore it. We need to be absolutely sure that we reset the dbType, otherwise Thread returned to the ThreadPool might be in a "broken" state, trying to write to a replica in the next call. So we need this:
#Transactional(readOnly = true)
public Page<BookDTO> getBooks(Pageable p) {
try{
DbContextHolder.setDbType(DbType.REPLICA1); // <----- set ThreadLocal DataSource lookup key
// all connection from here will go to REPLICA1
Page<Book> booksPage = callActionRepo.findAll(p);
List<BookDTO> pContent = CollectionMapper.map(mapper, callActionsPage.getContent(), BookDTO.class);
DbContextHolder.clearDbType(); // <----- clear ThreadLocal setting
} catch (Exception e){
throw new RuntimeException(e);
} finally {
DbContextHolder.clearDbType(); // <----- make sure ThreadLocal setting is cleared
}
return new PageImpl<BookDTO>(pContent, p, callActionsPage.getTotalElements());
}
Yikes >_< ! This definitely does not look like something I would like to put into every read only method. Can we do better? Of course! This pattern of "do something at the beginning of a method, then do something at the end" should ring a bell. Aspects to the rescue!
Unfortunately this post has already gotten too long to cover the topic of custom aspects. You can follow up on the details of using aspects using this link.
I don't think that deciding that SELECTs should go to one DB (one slave) and CREATE/UPDATES should go to a different one (master) is a very good decision. The reasons are:
replication is not instantaneous, so you could CREATE something in the master DB and, as part of the same operation, SELECT it from the slave and notice that the data hasn't yet reached the slave.
if one of the slaves is down, you shouldn't be prevented from writing data in the master, because as soon as the slave is back up, its state will be synchronized with master. In your case though, your write operations are dependent on both master and slave.
How would you then define transactionality if you're in fact using 2 dbs?
I would advise using the master DB for all the WRITE flows, with all the instructions they might require (whether they are SELECTs, UPDATE or INSERTS). Then, the application dealing with the read-only flows can read from the slave DB.
I'd also advise having separate DAOs, each with its own methods, so that you'll have a clear distinction between read-only flows and write/update flows.
You could create 2 session factories and hava a BaseDao wrapping the 2 factories(or the 2 hibernateTemplates if you use them) and use the get methods with on factory and the saveOrUpdate methods with the other
Try this way : https://github.com/kwon37xi/replication-datasource
It works nicely and very easy to implement without any extra annotation or code. It requires only #Transactional(readOnly=true|false).
I have been using this solution with Hibernate(JPA),Spring JDBC Template, iBatis.
You can use DDAL to implement writting master database and reading slave database in a DefaultDDRDataSource without modifying your Daos, and what's more, DDAL provided loading balance for mulit-slave databases. It doesn't rely on spring or hibernate. There is a demo project to show how to use it: https://github.com/hellojavaer/ddal-demos and the demo1 is just what you described scene.
Do we have to explicitly manage database resources when using Spring Framework.. liking closing all open connections etc?
I have read that Spring relieves developer from such boiler plate coding...
This is to answer an error that I am getting in a Spring web app:
org.springframework.jdbc.CannotGetJdbcConnectionException:
Could not get JDBC Connection; nested
exception is java.sql.SQLException:
ORA-00020: maximum number of processes
(150) exceeded
The jdbcTemplate is configured in the xml file and the DAO implementation has reference to this jdbcTemplate bean which is used to query the database.
Do we have to explicitly manage database resources when using Spring Framework, like closing all open connections etc?
If you are using Spring abstraction like JbdcTemplate, Spring handles that for you and it is extremely unlikely that that there is a bug in that part.
Now, without more information on your configuration (your applicationContext.xml), on the context (how do you create your application context, when does this happen exactly?), it is a hard to say anything. So this is a shot in the dark: do you have the attribute destroy-method="close" set on your datasource configuration? Something like that:
<bean id="dataSource" class="org.apache.commons.dbcp.BasicDataSource" destroy-method="close">
In certain circumstances, not using the destroy-method combined with some other bad practices may eventually end up with exhausting resources.
It could be due to connections not being closed. How are you accessing your connections within spring? Are you are using JdbcTemplate to query the database? Or just getting the connection from spring?
I have read that Spring relieves
developer from such boiler plate
coding
That depends which level of Spring you operate at. JdbcTemplate provides many different operations, some of which are fire-and-forget, some of which still require you to manage your JDBC resources (connections, resultsets, statements, etc) properly. The rule of thumb is that if you find yourself calling getConnection(), then at some point you need to call releaseConnection() also.
ORA-00020: maximum number of processes
(150) exceeded
Are you using a connection pool? If so, then make sure that it isn't configured with a larger number of max connections than your database is capable of handling (150, in this case). If you're not using a connection pool, then you really, really should be.
you say "The jdbcTemplate is configured in the xml file". You should normally create a new instance of the jdbcTemplate for each usage, not have it managed by spring.
I would guess that each time you request a new jdbcTemplate bean from spring, it is creating a new one with a new connection to the database, but after it falls out of scope in your code it is still referenced by spring's applicationContext, and so does not close the connection.
My hosing, provide only 20 connection. I done by manually close the connection on every request to db. I not declared a destory-method in bean(this not worked "i dont know why"), but i done in every requst call. (Hint : extends JdbcDaoSupport in dao class).
public void cleanUp() {
try {
if (!this.getJdbcTemplate().getDataSource().getConnection().isClosed()) {
this.getJdbcTemplate().getDataSource().getConnection().close();
}
} catch (Exception e) {
Logger.getLogger(myDAOImpl.class.getName()).log(Level.SEVERE, null, e);
}
}
IMPORTANT: Here I mention that, what I have done to solve my problem. You should not use this code directly. You should use only one connection. Alter if as per your code.
my service layer methods are transactional, when i use ExecutorService and submit task to threads, i cannot pass servicelayer as parameter to each threads, as i get error
Dec 14, 2009 10:40:18 AM com.companyx.applicationtest.applicationtestcompanyx.services.threadtestRunnable run
SEVERE: null
org.hibernate.HibernateException: No Hibernate Session bound to thread, and conf
iguration does not allow creation of non-transactional one here
at org.springframework.orm.hibernate3.SpringSessionContext.currentSessio
n(SpringSessionContext.java:63)
at org.hibernate.impl.SessionFactoryImpl.getCurrentSession(SessionFactor
yImpl.java:542)
my service layer
ExecutorService executor = Executors.newFixedThreadPool(10);
for (final Object item : CollectionsTest{
executor.submit(new threadtestRunnable((Long)item,collectionAfterfiltered,this)); //'this' is service layer
}
should i pass the service layer to each thread like this?
what is the proper way to do it, i need each thread to call method in service layer? (i'm using spring)
Generally, as said in the comments, transactions shouldn't be run in multiple threads. However, there are cases, where it is acceptable.
you need to make some asynchronous communication with a web-service (without making the user wait for the result), and store the result when it comes
you need read-only transactions in the multiple threads.
If you create your thread using new, it is not part of the spring context. Hence, when the method creating the thread finishes, your transaction interceptor will close the transaction (and session, eventually), and you will get the above exception.
(For more details - Spring docs, see "Lookup injection")
You need to create your threads within the spring context. And since you are probably creating them from a singleton bean, it is the rare case of creating prototype beans from a singleton bean. So in order to create a thread in the spring context, you can use:
<bean id="mainBean"
class="com.my.MyClass">
<lookup-method name="createThread" bean="myThreadBean"/>
</bean>
You should also map your ThreadtestRunnable class in the applicationContext.xml or annotate it as #Component("myThreadBean").
Then define an abstract method on your main bean named createThread and returning your thread class. Annotate your run method with #Transactional (or define the appropriate aop rules), and try it. Perhaps you will need to set propagation=Propagation.REQUIRES_NEW" in your #Transactional. If anything is wrong, get back here.