I'm running a web service on Heroku and using New Relic to monitor its performance. I'm using MySQL with Hibernate on top. My non-default c3p0 settings are the following:
hibernate.c3p0.maxStatementsPerConnection, 5
hibernate.c3p0.maxPoolSize, 35
hibernate.c3p0.minPoolSize, 5
hibernate.c3p0.initialPoolSize, 10
hibernate.c3p0.acquireIncrement, 10
Every single request to my web service hits the database at least a couple of times. After running a load test of about 200 requests/minute for 10 minutes, I see that most of the time is spent in
com.mchange.v2.c3p0.impl.AbstractPoolBackedDataSource.getConnection
My guess is that it's waiting for a connection from the connection pool? The interesting part is that as I increased
hibernate.c3p0.maxPoolSize, 40
the performance was worse off (longer wait time in the same getConnection call). During the test I can see that the max number of c3p0 connections is indeed open at the MySQL server (max connections set on MySQL's end is 300, definitely not exhausted).
All of my database functions use the same format
public void executeTransaction( Session session, IGenericQuery<T> query, T entity )
{
    Transaction tx = null;
    try
    {
        tx = session.beginTransaction();
        query.execute( session, entity );
        tx.commit();
    }
    catch ( RuntimeException e )
    {
        try
        {
            // tx is still null if beginTransaction() itself failed
            if ( tx != null )
            {
                tx.rollback();
            }
        }
        catch ( RuntimeException e2 )
        {
            // ignore a rollback failure so the original exception propagates
        }
        throw e;
    }
    finally
    {
        if ( session != null )
        {
            session.close();
        }
    }
}
so I'm certain all sessions are closed, which should translate into connections being returned to the pool. Why does the wait time grow as I increase the max number of connections? It seems like performance improves from hibernate.c3p0.maxPoolSize, 25 to hibernate.c3p0.maxPoolSize, 30, but drops after hibernate.c3p0.maxPoolSize, 35. Are my values far off?
Thanks!
as a guess, i would try increasing numHelperThreads. you have a heavy load; maybe c3p0's administrative Threads are getting backed up. (You should be able to see this if you dump stack traces or use JMX to monitor c3p0. If you have enough helper threads, they should generally be idle(), wait()ing. If they are getting backed up, you'll see them mostly active and runnable, and by JMX you'll see tasks queued.)
an insufficiency of helper threads is consistent with your observed better-then-worse performance with maxPoolSize. initially you get what you want, more Connections at the ready, but then the helper Threads fail to keep up and adding more Connections just makes things worse.
given your settings, helper Threads shouldn't have too much work to do, UNLESS maxStatementsPerConnection is too small. if your app has more than 5 PreparedStatements that are run frequently, then you will end up churning through Statements and tying up helper Threads with Statement close() tasks. you might try making this value larger. it should be approximately (rounding up) the number of distinct PreparedStatements used on an ongoing basis by your application. (You can ignore single or very rarely used PreparedStatements, involved for example in setup or cleanup.) again, monitoring what helper threads are up to would give you information about whether this is the issue. (you'd see backed-up Statement close() tasks.)
so, things to try: increase numHelperThreads, increase maxStatementsPerConnection (or set it to zero, to turn off Statement caching entirely.)
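To make the two suggestions concrete, here is a minimal sketch. It assumes Hibernate hands any hibernate.c3p0.* key through to c3p0 (the question already sets maxStatementsPerConnection that way). The class name, the usage map and the threshold of 50 executions are made up for illustration; they are not part of c3p0 or Hibernate.

```java
import java.util.Map;
import java.util.Properties;

final class C3p0Tuning {
    // Counts the statements that run often enough to be worth caching.
    // Rarely used statements (setup, cleanup) fall under the threshold and are ignored.
    static int frequentStatementCount(Map<String, Integer> executionsBySql, int minExecutions) {
        int count = 0;
        for (int executions : executionsBySql.values()) {
            if (executions >= minExecutions) {
                count++;
            }
        }
        return count;
    }

    // Builds the two settings discussed above; the numbers passed in are guesses to be
    // checked by watching the helper threads, not recommended values.
    static Properties tunedSettings(int frequentStatements, int helperThreads) {
        Properties settings = new Properties();
        settings.setProperty("hibernate.c3p0.maxStatementsPerConnection",
                Integer.toString(frequentStatements));
        settings.setProperty("hibernate.c3p0.numHelperThreads",
                Integer.toString(helperThreads));
        return settings;
    }
}
```

Whether the result helps is something only monitoring can tell you: if the helper threads are still backed up after the change, the bottleneck is elsewhere.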
good luck!
I have a scenario in production for a web app, where when a form is submitted the data gets stored in 3 tables in Oracle DB through JDBC. Sometimes I am seeing connection time out errors in logs while the app is trying to connect to Oracle DB through Java code. This is intermittent.
Below is the exception:
SQL exception while storing data in table
java.sql.SQLRecoverableException: IO Error: Connection timed out
Most of the time the web app is able to connect to the database and insert values, but sometimes I get this timeout error and the data cannot be inserted. I am not sure why I am getting this intermittent issue. When I checked the connection pool config in my application, I noticed the following things:
Pool Size (Maximum number of Connections that this pool can open) : 10
Pool wait (Maximum wait time, in milliseconds, before throwing an Exception if all pooled Connections are in use) : 1000
Since the pool size is just 10, will this connection timeout issue occur if multiple users are trying to connect to the database?
Also, although the data insertion touches 3 tables, we do the whole insertion on a single connection. We are not opening a separate DB connection for each individual table.
NOTE: This application is deployed on AEM (Content Management system) server and connections pool config is provided by them.
Update: I tried setting the validation query in the connections pool but still I am getting the connection time out error. I am not sure whether the connections pool has checked the validation query or not. I have attached the connections pool above for reference.
I would try two things:
Try setting a validation query so each time the pool leases a connection, you're sure it's actually available. select 1 from dual should work. On recent JDBC drivers that should not be required but you might give it a go.
Estimate the concurrency of your form. A 10 connections pool is not too small depending on the complexity of your work on DB. It seems you're saving a form so it should not be THAT complex. How many users per day do you expect? Then, on peak time, how many users do you expect to be using the form at the same time? A 10 connections pool often leases and retrieves connections quite fast so it can handle several transactions per second. If you expect more, increase the size slightly (more than 25-30 actually degrades DB performance as more queries compete for resources there).
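A rough way to put numbers on that estimate is Little's law: the average number of connections in use equals the arrival rate multiplied by the time each submission holds a connection. The sketch below is only that arithmetic; the class and parameter names are hypothetical, and the result is an average, so leave headroom for bursts.

```java
final class PoolSizeEstimate {
    // Little's law: average connections in use = submissions per second * seconds held each.
    static int connectionsNeeded(double submissionsPerSecond, double secondsHeldPerSubmission) {
        return (int) Math.ceil(submissionsPerSecond * secondsHeldPerSubmission);
    }
}
```

For example, 20 form submissions per second that each hold a connection for a quarter of a second keep about 5 connections busy on average, so a pool of 10 would still have room.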
If nothing seems to work, it would be good to check what's happening on your DB. If possible, use Enterprise Manager to see if there are latches while doing stuff on those three tables.
I'm answering this from a programming point of view. There are several possible causes, listed below with a solution for each. A connection timeout means that a new thread did not get database access within the configured wait time. That can be due to:
Possibility I: Connections are not being closed, so there is a connection leak somewhere in your application.
Solution: Check for the leak and make sure every connection is closed after use.
Possibility II: Big transaction.
Solution:
i. Are these insertions synchronized? If so, use synchronization very carefully: at block level, not method level, and keep the synchronized block as small as possible.
If the synchronized block is big, a thread is given a connection but then sits waiting, because the block takes a long time to execute, so the waiting time of the other threads increases. Suppose 100 users each have a thread performing that operation. The first one is executing and takes a long time while the others wait, so the 80th or 90th thread may hit the timeout. That is why the issue occurs only for some threads.
So you need to reduce the size of the synchronized block.
ii. Also check whether the transaction is big; if it is, try to split it into smaller ones where possible.
For example, use one small transaction for the first insertion, another for the second, and so on, so that three small transactions complete the operation.
Possibility III: The pool size is not enough if the application is heavily used.
Solution: Increase the pool size. (This applies only if you properly close every connection after use.)
You can use the Java ExecutorService in this case: one thread, one connection, all asynchronous. Once a transaction has completed, release the connection back to the pool. That way you can get rid of this timeout issue.
If one connection is inserting the data in 3 tables and other threads trying to make connection are waiting, timeout is bound to happen.
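Here is a minimal sketch of the executor idea, with the worker pool sized to match the connection pool so that no task ever waits for a connection. To keep it self-contained, a Semaphore stands in for the real connection pool and a short sleep stands in for the three inserts; both are assumptions of the sketch, not how AEM or the JDBC pool actually works.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

final class BoundedFormWriter {
    // Runs `submissions` simulated form saves on `poolSize` worker threads and returns
    // the highest number of "connections" that were in use at the same moment.
    static int run(int poolSize, int submissions) throws Exception {
        Semaphore pool = new Semaphore(poolSize);       // stand-in for the connection pool
        AtomicInteger inUse = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        ExecutorService workers = Executors.newFixedThreadPool(poolSize);
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (int i = 0; i < submissions; i++) {
                pending.add(workers.submit(() -> {
                    pool.acquireUninterruptibly();      // "get a connection"
                    try {
                        peak.accumulateAndGet(inUse.incrementAndGet(), Math::max);
                        Thread.sleep(5);                // stand-in for the three inserts
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    } finally {
                        inUse.decrementAndGet();
                        pool.release();                 // "return it to the pool"
                    }
                }));
            }
            for (Future<?> save : pending) {
                save.get();                             // wait for every save to finish
            }
        } finally {
            workers.shutdown();
        }
        return peak.get();
    }
}
```

Extra submissions queue up inside the executor instead of timing out on the pool, which trades the timeout error for a longer wait on the caller's side.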
One jdbc "select" statement takes 5 secs to complete.
So doing 5 statements takes 25 secs.
Now I try to do the job in parallel. The db is mysql with innodb.
I start 5 threads and give each thread its own db connection. But it still takes 25 secs for all to complete?
Note I provide java with enough heap and have 8 cores but only one hd (maybe having only one hd is the bottleneck here?)
Is this the expected behaviour with MySQL out of the box?
here is example code:
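public void doWork(int n) throws SQLException {
    // Each call scans one block of a million ids.
    try (Connection conn = pool.getConnection();
         PreparedStatement stmt = conn.prepareStatement(
             "select id from big_table where id between " + (n * 1000000)
                 + " and " + (n * 1000000 + 1000000))) {
        try (ResultSet rs = stmt.executeQuery()) {
            while (rs.next()) {
                Long itemId = rs.getLong("id");
            }
        }
    }
}

public void doWorkBatch() throws SQLException {
    for (int i = 1; i <= 5; i++) // five statements, as described above
        doWork(i);
}

public void doWorkParallel() {
    for (int i = 1; i <= 5; i++) {
        final int n = i; // a lambda can only capture an effectively final variable
        new Thread(() -> {
            try {
                doWork(n);
            } catch (SQLException e) {
                e.printStackTrace();
            }
        }).start();
    }
    System.console().readLine(); // keep the process alive while the threads run
}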
(I don't recall where but I read that a standard mysql installation can easily handle 1000 connections in parallel)
Looking at your problem, multi-threading will definitely improve your performance; I once converted a 4-5 hour batch job into a 7-10 minute job by doing exactly what you're proposing. But you need to know the following things beforehand while designing:
1) You need to think about inter-task dependencies, i.e. between tasks executing on different threads.
2) Using a connection pool is a good idea, since creating database connections is a slow process in Java.
3) Each thread needs its own JDBC connection. Connections can't be shared between threads because each connection is also a transaction.
4) Cut tasks into several work units where each unit does one job.
5) Particularly for your case, i.e. using MySQL: the database engine you use also affects performance. InnoDB uses row-level locking, so it will handle much higher traffic. The usual alternative, MyISAM, does not support row-level locking; it uses table locking.
I'm talking about the case where another thread comes in and wants to update the same row before the first thread commits.
6) Another way to improve the performance of a Java database application is to run queries with setAutoCommit(false). By default a new JDBC connection has auto-commit mode on, which means every individual SQL statement is executed in its own transaction. Without auto-commit you can group SQL statements into a logical transaction, which can then be committed or rolled back by calling commit() or rollback().
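As a sketch of point 6, the helper below groups whatever statements you run inside `work` into one transaction. It uses only the standard java.sql.Connection methods (setAutoCommit, commit, rollback); the ManualCommit class and the SqlWork interface are hypothetical names introduced for the example.

```java
import java.sql.Connection;
import java.sql.SQLException;

final class ManualCommit {
    // Hypothetical callback type: the statements to run inside the transaction.
    interface SqlWork {
        void run(Connection conn) throws SQLException;
    }

    static void inOneTransaction(Connection conn, SqlWork work) throws SQLException {
        boolean previous = conn.getAutoCommit();
        conn.setAutoCommit(false);          // stop committing after every statement
        try {
            work.run(conn);
            conn.commit();                  // one commit for the whole group
        } catch (SQLException | RuntimeException e) {
            conn.rollback();                // undo the partial work
            throw e;
        } finally {
            conn.setAutoCommit(previous);   // leave the connection as we found it
        }
    }
}
```

Restoring the previous auto-commit mode matters when the connection comes from a pool, because the next borrower would otherwise inherit the changed setting.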
You can also check out Spring Batch, which is designed for batch processing.
Hope this helps.
It depends where the bottleneck in your system is...
If your queries spend a few seconds each establishing the connection to the database, and only a fraction of that actually running the query, you'd see a nice improvement.
However if the time is spent in mysql, running the actual query, you wouldn't see as much of a difference.
The first thing I'd do, rather than trying concurrent execution is to optimize the query, maybe add indices to your tables, and so forth.
Concurrent execution may be faster. You should also consider batch execution.
Concurrent execution will help if there is any room for parallelization. In your case, there seems to be no room for parallelization, because you have a very simple query which performs a sequential read of a huge amount of data, so your bottleneck is probably the disk transfer and then the data transfer from the server to the client.
When we say that RDBMS servers can handle thousands of requests per second we are usually talking about the kind of requests that we usually see in web applications, where each SQL query is slightly more complicated than yours, but results in much smaller disk reads (so they are likely to be found in a cache) and much smaller data transfers (stuff that fit within a web page.)
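If you still want to measure whether parallel execution helps, a tidier way to run the chunks than raw threads is an ExecutorService. The sketch below splits an id range into contiguous chunks and runs a caller-supplied function on each; in real use that function would run the select for its chunk on its own connection. The class name and the LongBinaryOperator callback are choices made for this example only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.LongBinaryOperator;

final class ParallelRanges {
    // Splits [from, to) into at most `parts` contiguous chunks, runs `work` on each chunk
    // in its own thread, and returns the sum of the per-chunk results.
    static long sumOverRanges(long from, long to, int parts, LongBinaryOperator work)
            throws Exception {
        ExecutorService workers = Executors.newFixedThreadPool(parts);
        try {
            List<Callable<Long>> tasks = new ArrayList<>();
            long step = (to - from + parts - 1) / parts;   // ceiling division
            for (long start = from; start < to; start += step) {
                final long lo = start;
                final long hi = Math.min(start + step, to);
                tasks.add(() -> work.applyAsLong(lo, hi));
            }
            long total = 0;
            for (Future<Long> chunk : workers.invokeAll(tasks)) {
                total += chunk.get();                       // rethrows a failure from the chunk
            }
            return total;
        } finally {
            workers.shutdown();
        }
    }
}
```

Timing this against a plain loop over the same chunks tells you which case you are in: if the disk or the network is the bottleneck, as suggested above, the two timings will be close.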
I'm having an issue with the JDBC Connection Pool on Glassfish handing out dead database connections. I'm running Glassfish 3.1.2.2 using jconn3 (com.sybase.jdbc3) to connect to Sybase 12.5. Our organization has a nightly reboot process during which we restart the Sybase server. My issue manifests itself when an attempt is made to use a database connection during the reboot. Here is the order of operations that produces my issue:
Sybase is down for restart.
Connection is requested from the pool.
DB operation fails as expected.
Connection is returned to the pool in a closed state.
Sybase is back up.
Connection is requested from the pool.
DB operation fails due to "Connection is already closed" exception.
Connection is returned to the pool
I've implemented a database recovery singleton that attempts to recover from this scenario. Any time a database exception occurs I make a JMX call to pause all queues and execute a flushConnectionPool operation on the JDBC Connection Pool. If the database connection is still not up, the process sets up a timer to retry in 10 minutes. While this process works, it's not without flaws.
I realize there's a setting on the pool so that you can require validation on the database connection prior to handing it out but I've shied away from this for performance reasons. My process performs approximately 5 million database transactions a day.
My question is, does anyone know of a way to avoid returning a dead connection back to the pool in the first place?
You've pretty well summed up your options. We had that problem, the midnight DB going down. For us, we turned on connection validation, but we don't have your transaction volume.
Glassfish offers a custom validation option, with which a class can be specified to do the validation.
By default, all that the classes provided by Glassfish do (you'll see them offered as options in the console) is run a SQL statement like this:
SELECT 1;
The syntax varies a bit between databases: SQL Server uses '1', whereas Postgres just uses 1. But the intent is the same.
The net is that it will cost you an extra DB hit every time you try to get a connection, but it's a really, really cheap hit. But still, it's a hit.
But you could implement your own version. It could do the check, say, every 10th request, or even less frequent. Roll a random number from 1 to N (N = 10, 20, 100...), if you get a '1', do the select (and fail, if it fails), otherwise return "true". But at the same time, configure it so that if you do detect an error, purge the entire pool. Clearly tweak this so you have a good chance of it happening when your db goes down at night (dunno how busy your system is at night) vs peak processing.
You could even "lower the odds" during peak processing. "if time between 6am and 6pm then odds = 1000 else odds = 100; if (random(odds) == 1) { do select... }"
A random option removes the need to maintain a thread safe counter.
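The random check described above could look roughly like this. It is only the decision logic: the class and method names are made up, the 6am-6pm window and the 1000/100 odds are taken from the pseudocode above, and wiring it into Glassfish's custom validation hook is left out.

```java
import java.time.LocalTime;
import java.util.Random;

final class SampledValidation {
    // Odds of 1 in N: probe less often during the 6am-6pm peak, more often at night.
    static int oddsFor(LocalTime now) {
        boolean peak = !now.isBefore(LocalTime.of(6, 0)) && now.isBefore(LocalTime.of(18, 0));
        return peak ? 1000 : 100;
    }

    // True roughly once every `odds` calls, with no shared counter to keep thread safe.
    static boolean shouldProbe(Random random, int odds) {
        return random.nextInt(odds) == 0;
    }
}
```

A validator would call shouldProbe(ThreadLocalRandom.current(), oddsFor(LocalTime.now())) first, report the connection as valid when it returns false, and only otherwise run the select and flush the pool on failure.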
In the end, it doesn't really matter, you just want a timely note that the DB is down so you can ask GF to abort the pool.
I can definitely see it thrashing a bit at the very beginning as the DB comes up, possibly refreshing the pool more than once, but that should be harmless.
Different ways you could play with that, but that's an avenue to consider.
While stress testing my JPA-based DAO layer (running 500 simultaneous updates, each in a separate thread), I encountered the following: the system always got stuck, unable to make any progress.
The problem was that at some point there were no connections available for any thread, so no running thread could make any progress.
I investigated this for a while, and the root cause was the REQUIRES_NEW annotation on the add method in one of my JPA DAOs.
So the scenario was:
Test starts by acquiring a new Connection from the ConnectionPool to start a transaction.
After some initial phase, I call add on my DAO, causing it to request another Connection from the ConnectionPool, which has none left, because by that time all the Connections have been taken by the tests running in parallel.
I tried to play with DataSource configurations
c3p0 stucks
DBCP stucks
BoneCP stucks
MySQLDataSource fails some requests with the error: number of connections exceeded allowed.
Although I solved it by getting rid of REQUIRES_NEW, after which all DataSources worked perfectly, the best result still seems to be that of MySQLDataSource, since it did not get stuck and just failed :)
So it seems you should not use REQUIRES_NEW at all if you expect high throughput.
And my question:
Is there a configuration for any of these DataSources that could prevent this REQUIRES_NEW problem?
I played with checkout timeout in c3p0, and tests started to fail, as expected.
2 sec - 8 % passed
4 sec - 12 % passed
6 sec - 16 % passed
8 sec - 26 % passed
10 sec - 34 % passed
12 sec - 36 % passed
14/16/18 sec - 40 % passed
This is highly subjective, of course.
MySQLDataSource with plain configurations gave 20% of passed tests.
What about configuring a timeout for obtaining the connection? If the connection can't be obtained in say 2 seconds, the pool will abort and throw an exception.
Also note that REQUIRED is more typical. Often you'd want a call chain to share a transaction instead of starting a new transaction for each new call in a chain.
Probably any of the Connection pools can be configured to deal with this in any number of ways. Ultimately, all that REQUIRES_NEW is probably doing is forcing your app to acquire more than one Connection per client, which multiplies the stressfulness of your stress tests. If pools are hanging, it's probably because they are running out of Connections. If you set a sufficiently large pool size, you might resolve the issue. Alternatively, as Arjan suggests above, you can configure pools to "fail fast" instead of hanging indefinitely if clients have to wait for a Connection. With c3p0, the config param for that would be checkoutTimeout.
Without more information about exactly what's going on when you say a Connection pool "stucks", this is necessarily guesswork. But under very high concurrent load, with any Connection pool, you'll either need to make lots of resources available (high maxPoolSize + numHelperThreads in c3p0), kick out excess clients (checkoutTimeout), or let clients just endure long (but finite!) wait times.
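To see why REQUIRES_NEW exhausts a pool and what "fail fast" buys you, here is a toy model. A Semaphore stands in for the pool and the timeout plays the role of c3p0's checkoutTimeout; none of this is c3p0 code, and the class name is invented for the sketch.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class FailFastPool {
    private final Semaphore permits;
    private final long timeoutMillis;

    FailFastPool(int size, long timeoutMillis) {
        this.permits = new Semaphore(size, true);
        this.timeoutMillis = timeoutMillis;
    }

    // Waits up to timeoutMillis for a free slot, then gives up instead of hanging forever.
    void checkout() throws TimeoutException, InterruptedException {
        if (!permits.tryAcquire(timeoutMillis, TimeUnit.MILLISECONDS)) {
            throw new TimeoutException("no connection free after " + timeoutMillis + " ms");
        }
    }

    void checkin() {
        permits.release();
    }
}
```

With a pool of 2, two clients that each hold their outer transaction's connection and then ask for a second one (the REQUIRES_NEW call) can never be served. Without a timeout both wait forever; with one, each inner checkout throws and the client can release its outer connection.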
Encountering the following error with our J2EE application:
java.sql.SQLException: Error in allocating a connection. Cause: In-use connections equal max-pool-size and expired max-wait-time. Cannot allocate more connections.
How do I know how many connections the application is currently using, and what should be the optimal connection pool settings for a heavily trafficked application? Can I change it, and how can I determine what I should set it to (is it a memory issue, bandwidth, etc.)?
How to know how many connections the application is currently using
You don't give enough information to answer that. Most appservers will have some sort of JMX reporting on things like this. Alternatively, depending on the database, you could find the number of currently open connections.
what should be the optimal connection pool settings for a heavily trafficked application
Higher than what you've got?
The above of course assumes that you're not mishandling connections. By that I mean if you're using them directly in code you should always use this idiom:
Connection conn = null;
try {
    conn = ... ; // get connection
    // do stuff
} finally {
    if (conn != null) try { conn.close(); } catch (Exception e) { }
}
If you're not releasing connections back to the pool and are waiting for the garbage collector to clean them up and release them, you're going to use way more connections than you actually need.
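On Java 7 and later, try-with-resources gives you the same guarantee with less code. A minimal sketch, assuming the pool is exposed as a javax.sql.DataSource; the ScopedConnection class and the SqlWork interface are names invented for the example.

```java
import java.sql.Connection;
import java.sql.SQLException;
import javax.sql.DataSource;

final class ScopedConnection {
    // Hypothetical callback type: whatever you want to do with the connection.
    interface SqlWork {
        void run(Connection conn) throws SQLException;
    }

    static void withConnection(DataSource pool, SqlWork work) throws SQLException {
        // close() runs when the block exits, whether normally or by exception,
        // which for a pooled connection hands it back to the pool.
        try (Connection conn = pool.getConnection()) {
            work.run(conn);
        }
    }
}
```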
The first thing to check is whether you have resource leaks. Try monitoring with jconsole or jvisualvm to see how your application behaves and whether anything stands out.
Then the actual jdbc pool connection is inherently Java EE container specific so you need to give more information.
Are you sure you are closing everything you need to? Take a look here.
Make sure that the close goes in the finally block. You will see this code in the link:
finally
{
    closeAll(resultSet, statement, connection);
}
I helped someone else find a similar issue (theirs was a memory issue instead... but if the process had gone on longer it would have had this result) where they had not closed the result set or the connection.
I think you may need to ask yourself some questions. Here are some to think about:
A. Are you using your connections correctly? Are you closing them and returning them to the pool after usage?
B. Are you using long running transactions, e.g. "conversations" with users? Can they be left hanging if the user terminates his usage of the application?
C. Have you designed your data access to fit your application? E.g. are you using caching techniques in areas where you expect repeated reads frequently?
D. Is your connection pool big enough? 5 years ago I had an application with 250 simultaneous connections to an Oracle database, a lot more than what you typically find out of the box. Running the application on, say, 50 didn't work.
To give more detailed answer, you need to provide more info on the app.
Good luck!