I'm trying to write a multithreading program that connects to a MySQL database and processes the returned set for a query (which has thousands of rows). The problem is that I have implemented the connection pool and I get every thread to open the connection to the database and get the resulting set. But I don't understand which is the advantage of using connection-pooling if retrieving that big set takes such a lot of time. It wouldn't be better if I get the whole set with only one connection (without using pooling) and then I use thread pooling to process it? Or is there a way that every thread takes the next row of the resulting set?
If you have a limited number of threads, I would have a connection per thread.
A connection pool is more efficient if the number of threads which could use a connection is too high and those thread use the connections a relatively low percentage of the time.
Related
According to lettuce, we don't need connection pool and it uses single thread safe shared connection
https://github.com/lettuce-io/lettuce-core/wiki/Connection-Pooling
But, according to hikari-cp we need to have pool of connections of preferably size connections = ((core_count * 2) + effective_spindle_count)
https://github.com/brettwooldridge/HikariCP/wiki/About-Pool-Sizing
I am confused why we don't need pooling in one case but required in other?
Hikari is a JDBC connection pool. Connection pooling is required because database has long things to do and while that connection open you can not run another query over it, so it is better to have multiple "open" connection ready. Read more here
On the other hand Redis is not a database. Redis is a key value based in memory store/cache used for fast access. There you do not need connection pool because it is simple and fast, you ask to redis is "key1" exist give me the data and you get it or not. On the other hand with database you run long running SQLs sometime stored procedures depends to complexity but it is not one step work.
I am building a Java application which queries my SQL server once a minute. Right now, the application is using a connection pool with a single connection (min pool size 1, max pool size 1).
I figured that a pool size of 1 would be enough because of the infrequent queries (once a minute, as mentioned before).
Do I need a connection pool at all, and if the answer is yes, is 1 connection enough? Or should I just not use a pool and open a new connection every minute?
using a connection pool with a single connection (min pool size 1, max
pool size 1).
In such case don't see any need or benefit of using connection pool since at any point in time there will be only one connection object and if it's in use then other request has to wait (or) create a non-pooled connection object.
Connection pooling is generally used to save time/resource from creation/tearing-up the connection object.
In your specific case, you can probably create a connection instance and dispose it off once done with your work.
I would like to give some reason to use the connection pool, potentially with multiple connections. Not sure if you are considering this negative case.
In real world, a query might run more than 1 minute with various reason.
Do you want the application to wait for the hung connection? Or what is your expected behavior for this?
Also if you use connection pool, DB connection initialize process(time and resource consuming) is done while the pool is generated. When you actually use the DB connection pool, some of initialization step should be already done so it reduce repetitive overhead when the application is running query.
If you're accessing the database that rarely, then in practice having a connection pool will not provide any significant benefit. However, I'd keep it if it's already written. Maybe one day your application will grow and you'll find the pool useful. It's not a bad thing to have.
I have a scenario in production for a web app, where when a form is submitted the data gets stored in 3 tables in Oracle DB through JDBC. Sometimes I am seeing connection time out errors in logs while the app is trying to connect to Oracle DB through Java code. This is intermittent.
Below is the exception:
SQL exception while storing data in table
java.sql.SQLRecoverableException: IO Error: Connection timed out
Most of the times the web app is able to connect to data base and insert values in it but some times and I am getting this time out error and unable to insert data in it. I am not sure why am I getting this intermittent issue. When I checked the connections pool config in my application, I noticed the following things:
Pool Size (Maximum number of Connections that this pool can open) : 10
Pool wait (Maximum wait time, in milliseconds, before throwing an Exception if all pooled Connections are in use) : 1000
Since the pool size is just 10 and if there are multiple users trying to connect to data base will this connection time out issue occur ?
Also since there are 3 tables where the data insertion occurs we are doing the whole insertion in just one connection itself. We are not opneing each DB connection for each individual table.
NOTE: This application is deployed on AEM (Content Management system) server and connections pool config is provided by them.
Update: I tried setting the validation query in the connections pool but still I am getting the connection time out error. I am not sure whether the connections pool has checked the validation query or not. I have attached the connections pool above for reference.
I would try two things:
Try setting a validation query so each time the pool leases a connection, you're sure it's actually available. select 1 from dual should work. On recent JDBC drivers that should not be required but you might give it a go.
Estimate the concurrency of your form. A 10 connections pool is not too small depending on the complexity of your work on DB. It seems you're saving a form so it should not be THAT complex. How many users per day do you expect? Then, on peak time, how many users do you expect to be using the form at the same time? A 10 connections pool often leases and retrieves connections quite fast so it can handle several transactions per second. If you expect more, increase the size slightly (more than 25-30 actually degrades DB performance as more queries compete for resources there).
If nothing seems to work, it would be good to check what's happening on your DB. If possible, use Enterprise Manager to see if there are latches while doing stuff on those three tables.
I give this answer from programming point of view. There are multiple possibilities for this problem. These are following and i have added appropriate solution for it. As connection timeout occurs, means your new thread do not getting database access within mentioned time and it is due to:
Possibility I: Not closing connection, there should be connection leakage somewhere in your application Solution
You need to ensure this thing and need to check for this leakage and close the connection after use.
Possibility II: Big Transaction Solution
i. Is these insertion synchronized, if it is so then use it very carefully. Use it at block level not method level. And your synchronized block size should be minimum as much as possible.
What happen is if we have big synchronized block, we give connection, but it will be in waiting state as this synchronized block needs too much time for execution. so other thread waiting time increases. Suppose we have 100 users, each have 100 threads for that operation. 1st is executing and it takes too long time. and others are waiting. So there may be a case where 80th 90th,etc thread throw timeout. And For some thread this issue occurs.
So you must need to reduce size of the synchronized block.
ii. And also for this case also check If the transaction is big, then try to cut the transaction into smaller ones if possible:-
For an example here, for one insertion one small transaction. for second other small transaction, like this. And these three small transaction completes operation.
Possibility III: Pool size is not enough if usability of application is too high Solution
Need to increase the pool size. (It is applicable if you properly closes all the connection after use)
You can use Java Executor service in this case .One thread One connection , all asynchronous .Once transaction completed , release the connection back to pool.That way , you can get rid of this timeout issue.
If one connection is inserting the data in 3 tables and other threads trying to make connection are waiting, timeout is bound to happen.
I use thousands of H2 databases via TCP using Hikari Connection Pools. In a period of 1-30min a lot of queries will be performed on about five of the databases. There will be some queries to some of the other databases too but it is not predictable which and how many databases are affected by those side queries.
I store the already used HikariDataSources in a HashMap for the subsequent queries but I fear that the HashMap (contained objects) is getting too large and therefore I want to create a cleanup thread that closes HikariDataSources and removes them from the HashMap after they weren't used for a certain period of time.
To remove the right connection pools I need to know if a pool was not used for a defined period of time. How do I get that information?
And is there a better way to handle the number of Connection Pools? Maybe there is something that is made for Connection Pools similar as HikariCP is made for Connections. Is there a Pool for Connection Pools? :D
I came across this interview question : How will you manage a db connection pool?
My thought was:
I will create an ArrayBlockingQueue<Connection>, create come connection objects and put them in the queue when ajvm starts up. Then wrap this in some form of an enum singleton so there is only one such queue and it stays alive for the life of the JVM.
Then use some kind of utility/driver class that will take connections from the queue and return them back to the queue.
I am thinking what else i need to say to this? Do i need to make the queue thread safe so that multiple requests dont have the same connection?
In my opninion you're missing several points here:
Connections should be returned to the initial state when returning back to the pool. For instance, connection.setAutocommit(...); should definitely be reverted
I't a good idea to wrap native connection into your own implementation of javax.sql.Connection interface to control and monitor actions performed on the connection. With this pattern you can also implement a valuable feature: return the connection back to pool on close(); call
You need some menans to control the number of the connections in a pool based on the actual pool utilization. Take a look at how "capacity" and "load factor" are implemented in Java collection to get a rough implementation idea
Connections should be monitored if they're alive. It's not that easy to archive for all possible databases.