Single connection vs Connection Pool - java

According to lettuce, we don't need connection pool and it uses single thread safe shared connection
https://github.com/lettuce-io/lettuce-core/wiki/Connection-Pooling
But, according to hikari-cp we need to have pool of connections of preferably size connections = ((core_count * 2) + effective_spindle_count)
https://github.com/brettwooldridge/HikariCP/wiki/About-Pool-Sizing
I am confused why we don't need pooling in one case but required in other?

Hikari is a JDBC connection pool. Connection pooling is required because database has long things to do and while that connection open you can not run another query over it, so it is better to have multiple "open" connection ready. Read more here
On the other hand Redis is not a database. Redis is a key value based in memory store/cache used for fast access. There you do not need connection pool because it is simple and fast, you ask to redis is "key1" exist give me the data and you get it or not. On the other hand with database you run long running SQLs sometime stored procedures depends to complexity but it is not one step work.

Related

Connection pool with single connection vs connection for infrequent queries

I am building a Java application which queries my SQL server once a minute. Right now, the application is using a connection pool with a single connection (min pool size 1, max pool size 1).
I figured that a pool size of 1 would be enough because of the infrequent queries (once a minute, as mentioned before).
Do I need a connection pool at all, and if the answer is yes, is 1 connection enough? Or should I just not use a pool and open a new connection every minute?
using a connection pool with a single connection (min pool size 1, max
pool size 1).
In such case don't see any need or benefit of using connection pool since at any point in time there will be only one connection object and if it's in use then other request has to wait (or) create a non-pooled connection object.
Connection pooling is generally used to save time/resource from creation/tearing-up the connection object.
In your specific case, you can probably create a connection instance and dispose it off once done with your work.
I would like to give some reason to use the connection pool, potentially with multiple connections. Not sure if you are considering this negative case.
In real world, a query might run more than 1 minute with various reason.
Do you want the application to wait for the hung connection? Or what is your expected behavior for this?
Also if you use connection pool, DB connection initialize process(time and resource consuming) is done while the pool is generated. When you actually use the DB connection pool, some of initialization step should be already done so it reduce repetitive overhead when the application is running query.
If you're accessing the database that rarely, then in practice having a connection pool will not provide any significant benefit. However, I'd keep it if it's already written. Maybe one day your application will grow and you'll find the pool useful. It's not a bad thing to have.

How much time passed after the last active connection?

I use thousands of H2 databases via TCP using Hikari Connection Pools. In a period of 1-30min a lot of queries will be performed on about five of the databases. There will be some queries to some of the other databases too but it is not predictable which and how many databases are affected by those side queries.
I store the already used HikariDataSources in a HashMap for the subsequent queries but I fear that the HashMap (contained objects) is getting too large and therefore I want to create a cleanup thread that closes HikariDataSources and removes them from the HashMap after they weren't used for a certain period of time.
To remove the right connection pools I need to know if a pool was not used for a defined period of time. How do I get that information?
And is there a better way to handle the number of Connection Pools? Maybe there is something that is made for Connection Pools similar as HikariCP is made for Connections. Is there a Pool for Connection Pools? :D

Why is it impossible to serve connections from pool according to the prepared statement already executed with them?

I've been researching all around the web the most efficient way to design a connection pool and tried to analyze into details the available libraries (HikariCP, BoneCP, etc.).
Our application is a heavy-load consumer webapp and most of the time the users are working on similar business objects (thus the underlying SQL queries executed are the often the same, but still there are numerous).
It is designed to work with different DBMS (Oracle and MS SQL Server especially).
So a simplified use case would be :
User goes on a particular JSP page (e.g. Enterprise).
A corresponding Bean is created.
Each time it realizes an action (e.g. getEmployees(), computeTurnover()), the Bean asks the pool for a connection and returns it back when done.
If we want to take advantage of the Prepared Statement caching of the underlying JDBC driver (as PStatements are attached to a connection - jTDS doc.), from what I understand an optimal way of doing it would be :
Analyze what kind of SQL query a particular Bean want to execute before providing it an available connection from the pool.
Find a connection where the same prepared statement has already been executed if possible.
Serve the connection accordingly (and use the benefits of the cache/precompiled statement).
Return the connection to the pool and start over.
Am I missing an important point here (like JDBC drivers capable of reusing cached statements regardless of the connection) or is my analysis correct ?
The different sources I found state it is not possible, but why ?
For your scheme to work, you'd need to be able to get the connection that already has that statement prepared.
This falls foul on two points:
In JDBC you obtain the connection first,
Cached prepared statements (if a driver or connection pool even supports that) aren't exposed in a standardized way (if at all) nor would you be able to introspect them.
The performance overhead of finding the right connection (and the subsequent contention on the few connections that already have it prepared) would probably undo any benefit of reusing the prepared statement.
Also note that some database systems also have a serverside cache for prepared statements (meaning that it already has the plan etc available), limiting the overhead from a new prepare from the client.
If you really think the performance benefit is big enough, you should consider using a data source specific for this functionality (so it is almost guaranteed that the connection will have the statement in its cache).
A solution could be for a connection pool implementation to delay retrieving the connection from the pool until the Connection.prepareStatement() is called. At that time a connection pool would look up available connections by the SQL statement text and then play forward all the calls made before Connection.prepareStatement(). This way it would be possible to get a connection with a ready PreparedStatement without the issues other guys suggested.
In other words, when you request a connection from the pool, it would return a wrapper that logs everything until the first operation requiring DB access (such as prepareStatement() is requested.
You'd need to ask a vendor of your connection pool functionality to add this feature.
I've logged this request with C3P0:
https://github.com/swaldman/c3p0/issues/55
Hope this helps.

Why use a connection pool instead of a static connection variable?

Why should I prefer using a connection pool instead of static variable from a static class in Tomcat to hold the database connection?
This seems to me equivalent of using a connection pool having the capacity to store just one connection. So, a related question is: why the capacity of a connection pool needs to be bigger than one connection?
Thank's in advance.
With a pool, you can have multiple threads using different connections. Do you really want to limit your web application to handling one db-related request at a time? :) (And adding the complication of synchronizing to make sure that one thread doesn't try to use that single connection while another request is doing so...)
It would be generally a very bad idea to have a connection pool with a capacity of 1 - but at least if you did so, you could later increase the capacity without changing any other code, of course.
(EDIT: And as noted in other answers, connections in a pool can be closed and reopened if they become stale or broken in some way.)
The reason is to increase scalability, robustness and speed.
If you're creating a web application, there can be many concurrent HTTP requests coming in, each served by a different thread.
If you have only one static connection to the database, you need synchronization around that connection. You can't share a connection between several threads, That means each HTTP request have to wait for someone else using the database. And you need to fix up/reconnect that connection if something goes wrong with it at one point or another.
You could open a connection at the start of each HTTP request - however opening a new database connection can be expensive, and you don't get much control over how many database connections you have. Databases can be overwhelmed by having too many connections.
A connection pool solves this, as you have a pool of already opened connections that can be handed out to serve an HTTP request, or to different parts of the applications that needs to do database operations, and is returned to the pool when the database operation is finished, ready to be used again by something else.
A connection pool of just 1 connection rarely makes sense though - however connection pools take care of many other things as well, such a closing the connection and opening a new one if a connection goes stale or is otherwise in a bad state, as well as it takes care of the synchronization when there is are no more connections to hand out at a particular time.
If you're using a connection pool with only one connection it is equivalent to have one static connection - like you mentioned and there's no advantage for connection pool in regards.
The strength of a connection pool is when you're using multiple connections (multiple threads) - it saves you the effort of managing the connections (open/close connections, boilerplate-code, smart resource handling etc).
Using a connection pool for one connection only is kind of like paving a 10-lane road that will be used by one car only - lot of overhead with (almost) no gain.
Using a connection pool is not just about sharing connections: it is about leveraging years of experience with broken JDBC drivers and all the weird ways in which a Connection can become unusable. Using a single static connection is not only a bottleneck, but a very fragile solution. It will cause your application to break on a regular basis, and even restarting the application will not clean up the problems: you may get a connection leak.

Correct way to implement a connection pool

I'm trying to write a multithreading program that connects to a MySQL database and processes the returned set for a query (which has thousands of rows). The problem is that I have implemented the connection pool and I get every thread to open the connection to the database and get the resulting set. But I don't understand which is the advantage of using connection-pooling if retrieving that big set takes such a lot of time. It wouldn't be better if I get the whole set with only one connection (without using pooling) and then I use thread pooling to process it? Or is there a way that every thread takes the next row of the resulting set?
If you have a limited number of threads, I would have a connection per thread.
A connection pool is more efficient if the number of threads which could use a connection is too high and those thread use the connections a relatively low percentage of the time.

Categories

Resources