I came across this interview question: How will you manage a db connection pool?
My thought was:
I will create an ArrayBlockingQueue<Connection>, create some connection objects and put them in the queue when the JVM starts up. Then wrap this in some form of an enum singleton so there is only one such queue and it stays alive for the life of the JVM.
Then use some kind of utility/driver class that will take connections from the queue and return them back to the queue.
I am wondering what else I need to say about this. Do I need to make the queue thread safe so that multiple requests don't get the same connection?
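A minimal sketch of the queue idea described above, under stated assumptions: BlockingPool, borrow, giveBack and available are hypothetical names, and the generic type T stands in for java.sql.Connection so the example runs without a database. ArrayBlockingQueue is itself thread-safe, so two threads polling at the same time cannot receive the same element.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

/** Hypothetical fixed-size pool; T stands in for java.sql.Connection. */
final class BlockingPool<T> {
    private final BlockingQueue<T> idle;

    BlockingPool(int size, Supplier<T> factory) {
        idle = new ArrayBlockingQueue<>(size);
        for (int i = 0; i < size; i++) {
            idle.add(factory.get()); // pre-create every resource at startup
        }
    }

    /** Waits up to timeoutMillis for a free resource; returns null on timeout. */
    T borrow(long timeoutMillis) {
        try {
            return idle.poll(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            return null;
        }
    }

    /** Hands a resource back; returns false if the queue is already full. */
    boolean giveBack(T resource) {
        return idle.offer(resource);
    }

    /** Number of resources currently waiting in the queue. */
    int available() {
        return idle.size();
    }
}
```

A caller would use it roughly as: T c = pool.borrow(500); then, if c is not null, do the work inside try and call pool.giveBack(c) in finally. The bounded wait is a design choice for this sketch; a blocking take() would also work if callers are allowed to wait indefinitely.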
In my opinion you're missing several points here:
Connections should be reset to their initial state when they are returned to the pool. For instance, connection.setAutoCommit(...); should definitely be reverted.
It's a good idea to wrap the native connection in your own implementation of the java.sql.Connection interface to control and monitor the actions performed on the connection. With this pattern you can also implement a valuable feature: returning the connection to the pool on the close() call.
You need some means to control the number of connections in the pool based on the actual pool utilization. Take a look at how "capacity" and "load factor" are implemented in the Java collections to get a rough implementation idea.
Connections should be monitored to check that they're alive. That is not easy to achieve for all possible databases.
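The first two points above can be sketched with a dynamic proxy instead of a hand-written implementation of every Connection method. This is only an illustration under assumptions: RecyclingHandler and wrap are hypothetical names, the idle queue is assumed to belong to the pool, and rolling back before restoring auto-commit is one possible reset policy, not the only one.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.sql.Connection;
import java.util.concurrent.BlockingQueue;

/** Hypothetical handler: intercepts close() and recycles the real connection. */
final class RecyclingHandler implements InvocationHandler {
    private final Connection real;
    private final BlockingQueue<Connection> idle;
    private boolean closed;

    private RecyclingHandler(Connection real, BlockingQueue<Connection> idle) {
        this.real = real;
        this.idle = idle;
    }

    /** Returns a Connection whose close() hands `real` back to `idle`. */
    static Connection wrap(Connection real, BlockingQueue<Connection> idle) {
        return (Connection) Proxy.newProxyInstance(
                RecyclingHandler.class.getClassLoader(),
                new Class<?>[] {Connection.class},
                new RecyclingHandler(real, idle));
    }

    @Override
    public synchronized Object invoke(Object proxy, Method method, Object[] args)
            throws Throwable {
        if ("close".equals(method.getName())) {
            if (!closed) {                    // a second close() is a no-op
                closed = true;
                if (!real.getAutoCommit()) {
                    real.rollback();          // discard unfinished work
                    real.setAutoCommit(true); // restore the initial state
                }
                idle.offer(real);             // recycle instead of closing
            }
            return null;
        }
        if (closed) {
            throw new IllegalStateException("connection already returned to the pool");
        }
        try {
            return method.invoke(real, args); // everything else is delegated
        } catch (InvocationTargetException e) {
            throw e.getCause();               // surface the driver's own exception
        }
    }
}
```

The pool would hand out RecyclingHandler.wrap(real, idle) rather than the raw connection, so client code that calls close() as usual ends up returning the connection. Other per-connection state (transaction isolation, read-only flag, and so on) would need the same kind of reset.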
Related
Why should I prefer using a connection pool instead of a static variable in a static class in Tomcat to hold the database connection?
This seems to me equivalent to using a connection pool with the capacity to store just one connection. So, a related question is: why does the capacity of a connection pool need to be bigger than one connection?
Thanks in advance.
With a pool, you can have multiple threads using different connections. Do you really want to limit your web application to handling one db-related request at a time? :) (And adding the complication of synchronizing to make sure that one thread doesn't try to use that single connection while another request is doing so...)
It would be generally a very bad idea to have a connection pool with a capacity of 1 - but at least if you did so, you could later increase the capacity without changing any other code, of course.
(EDIT: And as noted in other answers, connections in a pool can be closed and reopened if they become stale or broken in some way.)
The reason is to increase scalability, robustness and speed.
If you're creating a web application, there can be many concurrent HTTP requests coming in, each served by a different thread.
If you have only one static connection to the database, you need synchronization around that connection. You can't share a connection between several threads. That means each HTTP request has to wait for whoever else is using the database. And you need to fix up/reconnect that connection if something goes wrong with it at one point or another.
You could open a connection at the start of each HTTP request - however opening a new database connection can be expensive, and you don't get much control over how many database connections you have. Databases can be overwhelmed by having too many connections.
A connection pool solves this, as you have a pool of already opened connections that can be handed out to serve an HTTP request, or to different parts of the application that need to do database operations. Each connection is returned to the pool when the database operation is finished, ready to be used again by something else.
A connection pool of just 1 connection rarely makes sense though. However, connection pools take care of many other things as well, such as closing a connection and opening a new one if it goes stale or is otherwise in a bad state, and handling the synchronization when there are no more connections to hand out at a particular time.
If you're using a connection pool with only one connection, it is equivalent to having one static connection, like you mentioned, and there's no advantage to a connection pool in that regard.
The strength of a connection pool is when you're using multiple connections (multiple threads) - it saves you the effort of managing the connections (open/close connections, boilerplate-code, smart resource handling etc).
Using a connection pool for one connection only is kind of like paving a 10-lane road that will be used by one car only: a lot of overhead with (almost) no gain.
Using a connection pool is not just about sharing connections: it is about leveraging years of experience with broken JDBC drivers and all the weird ways in which a Connection can become unusable. Using a single static connection is not only a bottleneck, but a very fragile solution. It will cause your application to break on a regular basis, and even restarting the application will not clean up the problems: you may get a connection leak.
I have gone through a couple of articles with singleton examples, and I could see that developers sometimes implement the database connection object or connection manager as a singleton. Some of the posts even advised using a database connection pool.
Singleton means creating a single instance, so basically we restrict access, e.g. for a printer, other hardware, or a logger, where we use a singleton to restrict access to one user at a time. However, what is the purpose of using a singleton for DB connection objects?
If I understand correctly, creating a database connection as a singleton means that the app server will have only one instance. Does this mean only one user can access the database connection and the next user has to wait until the connection is closed?
Please advise.
I think you understand correctly the implication of making the connection itself a singleton. Generally it is not a good idea (although in some very particular case it could make sense).
Making a connection manager or a connection pool a singleton is completely different. The pool itself handles a collection of connections, and it can create new ones as they are needed (up to a limit) or re-use the ones that have already been used and released.
Having several connection pools at the same time would lose the advantages of the pool:
It would be harder to control the total number of connections open
One pool could be creating connections while another has connections available
Hope this helps to clarify the subject. You might want to read more on connection pools.
Q: "However what is the purpose of using singleton in DB connection objects?" A: There is (almost always) none. So your thinking is correct.
Q: "Does this mean only one user can access the Database connection and next user has to wait until the connection is closed?"
A: It depends (for the first part) and no (for the second part, after "and"). In a single-threaded application only one user will use the database at a time and another will wait until the first user's request has been dispatched, not until the connection is closed. Once the connection is closed, you need to create another connection to use the database. In a multi-threaded application many threads may be using the same connection instance, and the result really depends on the vendor implementation: it may block dispatching (effectively turning your app into a single-threaded app), throw exceptions, or do something else entirely. However, such a design in a multi-threaded app is, in my opinion, a programmer error.
I know that the connection pool mechanism for databases lets you keep a database connection open across many transactions and close it only at the end.
I am using sshxcute (http://code.google.com/p/sshxcute/) to connect to a Unix machine from Java code. But if I have to execute Unix commands from different Java files, the entire process, starting from connecting to the machine, is repeated each time. I want to keep the session open between many calls to this machine. How can I achieve this? Basically I want some mechanism like a connection pool which lets me connect to the Unix machine only once, execute as many instructions as I want from different Java classes or methods, and finally close the session/connection to the Unix machine once and for all.
I've had to create such pools. It's not that hard. In a nutshell, my general approach is:
Create a class to manage the pool.
In that class, create a Collection to hold the available connections. (Ooh, nice rhyme to that.) Probably a LinkedList but it could be an ArrayList, etc. Create a second Collection to hold the connections that are currently in use. Initially these collections are empty.
Create a function in that class that can be called to request a connection. That function checks if there are any connections in the pool. If so, it picks one, removes it from the available collection, adds it to the used collection, and returns it to the caller. If not, it creates a new one, adds it to the used collection, and returns it to the caller.
Create a function that can be called to release a connection. That function takes the connection as a parameter, removes it from the used collection, and adds it to the available collection. (If it's not in the used collection, that means someone got a connection without going through the pool. You may want to add it to the available collection anyway, or just close it and throw it away.)
That's basically it. There are, of course, bunches of details to be considered. Like:
Should there be a limit on maximum number of connections? If so, you must keep count of how many connections you've given out, and if a new request would put you over that limit, throw an exception instead of returning a connection. (Or maybe return a null, depending on just how you want to handle it.)
Should there be a limit on the number of connections to keep in the available pool? If so, when a connection is released, instead of automatically adding it to the available pool, check if the pool is already at maximum size and if so, close the connection instead of returning it to the pool.
It's a good idea for the get-connection function to test a connection before returning it. It's possible that the connection has timed out while it was sitting in the pool, for example. Perhaps you can send some low-cost message and make sure you get a valid response.
The main reason for having a used collection is so you can watch for connection leaks, i.e. someone requests a connection and then never gives it back. Rather than putting the connection directly into the used collection, I usually create a wrapper object to hold it that also keeps the time that it was given out. Then I put in a function that is called with a timer that loops through the used collection and checks if there is anything that has been there for a ridiculously long amount of time. Depending on the type of connection, you may be able to check when it was last actually used or do some other test to see if the caller is really still using it or if it is a connection leak. If you're confident that you can recognize a connection leak, you might close it or return it to the available pool. Otherwise you can at least write a message to a log, and periodically check the logs to see if you have leakage problems and hunt them down. If you don't do any connection-leak tracking, then the used collection is probably superfluous and can be eliminated.
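The steps and details above can be sketched roughly as follows. This is an illustration, not the answerer's actual code: TrackingPool and its method names are hypothetical, T stands in for whatever connection type is pooled (for example an SSH session), the isAlive predicate stands in for the "low-cost message" test, and the checkout time is stored directly in a map rather than in a wrapper object.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;
import java.util.function.Supplier;

/** Hypothetical pool with an "available" and a "used" collection. */
final class TrackingPool<T> {
    private final Deque<T> available = new ArrayDeque<>();
    private final Map<T, Long> inUse = new IdentityHashMap<>(); // resource -> checkout time (ms)
    private final Supplier<T> factory;
    private final Predicate<T> isAlive;
    private final int maxInUse;

    TrackingPool(Supplier<T> factory, Predicate<T> isAlive, int maxInUse) {
        this.factory = factory;
        this.isAlive = isAlive;
        this.maxInUse = maxInUse;
    }

    /** Reuses a live idle resource if there is one, otherwise creates a new one. */
    synchronized T acquire() {
        while (!available.isEmpty()) {
            T candidate = available.pop();
            if (isAlive.test(candidate)) {
                inUse.put(candidate, System.currentTimeMillis());
                return candidate;
            }
            // stale resource: dropped here; a real pool would also close it
        }
        if (inUse.size() >= maxInUse) {
            throw new IllegalStateException("pool exhausted");
        }
        T fresh = factory.get();
        inUse.put(fresh, System.currentTimeMillis());
        return fresh;
    }

    /** Returns false if the resource was not handed out by this pool. */
    synchronized boolean release(T resource) {
        if (inUse.remove(resource) == null) {
            return false;
        }
        available.push(resource);
        return true;
    }

    /** Resources checked out for at least maxAgeMillis: candidates for a leak report. */
    synchronized List<T> checkedOutLongerThan(long maxAgeMillis) {
        long now = System.currentTimeMillis();
        List<T> suspects = new ArrayList<>();
        for (Map.Entry<T, Long> e : inUse.entrySet()) {
            if (now - e.getValue() >= maxAgeMillis) {
                suspects.add(e.getKey());
            }
        }
        return suspects;
    }
}
```

A timer task could call checkedOutLongerThan(...) periodically and log the result. Throwing on exhaustion is just one of the options mentioned above; returning null or blocking are equally plausible choices.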
Several ideas here:
A. Use an object pool. For example, you can use this one.
This will let you hold a pool of several Unix connections.
B. Have some sort of session management at your application level, and have a session context which will hold a reference to an object taken from the pool.
Once a session starts - you will try to obtain a connection from the pool.
Once the session ends, you will return the connection to the pool.
Bear in mind that you may not have a 1:1 ratio between your application sessions and the objects held in the pool.
A possible strategy to handle this is to create a pool at initial size X, and let it grow up to size Y, if needed.
Another issue that you will need to handle is some sort of "keep alive" check to see that your connections are still alive.
The two strategies I can think of here are:
A. Have a periodic check on the connections (let's say, use ping).
B. Create a new connection if one of the connections in the pool breaks.
Today:
I continue to work on an application which constantly creates and closes a LOT of (SQL) connections to the same (Oracle) database, using the same credentials and running very basic selects, updates and inserts.
Currently, a singleton CustomConnectionManager checks in its pool to see if any CustomConnectors are available to be given out. Once given out they may be closed by the client. Upon close() the underlying SQLConnection (created and maintained by the CustomConnector) is also closed. If no CustomConnector is available, a new CustomConnector is created.
The good thing about this is that ultimately the SQLConnection is closed after each use. However, very little reuse is going on, as the value lies in the SQLConnection, not in the CustomConnector.
Since all users of the system will connect to the same database, the same single connection may be used to accommodate all requests. The original solution, which created new Connectors upon each request, seems very wasteful.
Proposed:
singleton CustomConnectionManager will maintain 2 queues:
a queue of available CustomConnectors, each of which will maintain its own SQLConnection, and
a queue of inUse CustomConnectors. Upon request a new CustomConnector will be created and given out.
Client interaction only happens with the singleton CustomConnectionManager.
When a new connection is needed, the manager creates it, gives it out to the client and places it in the inUse queue. When the client is done using the connection, instead of closing it, the client will call .markConnectorAvailable(), which would put it back into the availableConnectors queue. (The client will no longer be able to control the underlying SQLConnection.)
Question 1: What do you think? Will this work? Is there an existing solution that already does this well?
Question 2: If the proposed approach is not a complete waste, what is a good point for a CustomConnector to close its SQLConnection?
That's connection pooling. ADO.NET's pool kills connections that have gone unused for a set time (this can be set in the connection string; the default is two minutes, as I remember).
I have a database connection pool. There are consumers who take connections from that pool.
But I can't trust those consumers, because a lot of them are not returning my connections. Hence the pool starves and many consumers are forced to wait forever.
For example:
class Consumer {
    void someMethod() {
        Connection con = Pool.getConnection();
        // some bloody steps which throw an exception
        con.goodBye(); // giving back the connection to the pool
    }
}
Because of the exception, and maybe because of arrogance, the connection is not always given back. I have no way to restrict the usage of the Pool API in the consumers' classes.
How can I get my connection back? (I have no way to force the Consumer.)
I believe there is no foolproof solution for this (maybe I'm not that smart). Still, can anyone come up with a pretty good solution?
One solution I came up with is checking whether any exception occurs in the Consumer class, and if an exception occurs, forcefully taking back the connection.
Or is there some new, revolutionary but not yet popular DBPool design pattern for this type of scenario? (I think my case is very generic: anyone can forget to give the connection back to the pool.)
That's bad client code. The code should handle the case of an exception and close the connection when done.
There's no way for you to know from your code if that's not being done, though. It's the client code's fault and problem if it doesn't do that.
Having a sane timeout is the only way to limit this, but it still does not "solve" it, ultimately.
--
You mention in comments that this pool is shared among multiple clients. That shifts the responsibility back to you, of course.
Can you limit each client to only using X connections at once? This way, at least they can only tie up so many at one time.
Otherwise, you could create separate pools per client. That sort of just moves the problem down the stack, but might be appropriate, depending on the logistics involved.
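One way to cap each client at X connections, sketched under assumptions: ClientQuota and its methods are hypothetical names, clients are assumed to be identifiable by a string id, and the pool is assumed to call tryAcquire before handing out a connection and release when the connection comes back.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Semaphore;

/** Hypothetical per-client cap on simultaneously held connections. */
final class ClientQuota {
    private final int perClient;
    private final ConcurrentHashMap<String, Semaphore> permits = new ConcurrentHashMap<>();

    ClientQuota(int perClient) {
        this.perClient = perClient;
    }

    /** True if the client may take one more connection right now. */
    boolean tryAcquire(String clientId) {
        return permits
                .computeIfAbsent(clientId, id -> new Semaphore(perClient))
                .tryAcquire();
    }

    /** Call once for every successful tryAcquire when the connection is returned. */
    void release(String clientId) {
        Semaphore s = permits.get(clientId);
        if (s != null) {
            s.release();
        }
    }
}
```

A leaky client then starves only itself: once its X permits are tied up, tryAcquire returns false for that client while other clients keep working.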
Do not return connection objects; return proxy objects representing connections instead. These proxy objects, when finalized, should say goodbye to the connection they stand for. If a proxy is not properly closed, it will eventually be garbage-collected and release the connection at that time.
Two issues here. First, the time before GC is unpredictable. Better than forever, but it can still be very long. Second, be aware of the side effects of complex calls in the finalizer, object resurrection in particular. There are some rare but ugly scenarios that prevent objects from being collected at all.
Why don't you look at using WeakReference? You can adjust your code to return a weak reference to a connection; when the thread using the connection dies, the object will have no reference (except from your WeakHashMap), and you can then periodically identify these objects and call the goodBye method from a separate thread.
Here is an article which can help you understand this better.
.NET also has a WeakReference class which behaves very similarly.
Have you tried catching the exception and closing the connection in the catch clause?
class Consumer {
    void someMethod() {
        Connection con = Pool.getConnection();
        try {
            // some bloody steps which throw an exception
        } catch (Exception e) {
            con.goodBye(); // give the connection back when the steps fail
        }
        con.goodBye(); // giving back the connection to the pool (also runs after a caught exception)
    }
}
Edit: you could also use a finally block to remove the redundant code and make sure your connection gets closed in every case. I am assuming this is Java code; I have no experience with C#.
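A sketch of the finally variant mentioned in the edit, with a try-with-resources variant next to it. DemoConnection and SafeConsumer are hypothetical stand-ins for the question's Connection and Consumer classes; the counter exists only to show that goodBye() runs exactly once per call.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for the pooled connection in the question. */
final class DemoConnection implements AutoCloseable {
    private final AtomicInteger returns;

    DemoConnection(AtomicInteger returns) {
        this.returns = returns;
    }

    /** Gives the connection back to the pool (here it only counts the call). */
    void goodBye() {
        returns.incrementAndGet();
    }

    @Override
    public void close() {
        goodBye(); // lets try-with-resources return the connection
    }
}

final class SafeConsumer {
    /** goodBye() runs exactly once, whether or not the work throws. */
    static void withFinally(DemoConnection con, Runnable work) {
        try {
            work.run();
        } finally {
            con.goodBye();
        }
    }

    /** Same guarantee via try-with-resources, which calls close() for us. */
    static void withResources(DemoConnection con, Runnable work) {
        try (DemoConnection c = con) {
            work.run();
        }
    }
}
```

In both variants an exception thrown by the work still propagates to the caller, but only after the connection has been given back.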
The only way I can think of is not to give the consumers access to the connection pool directly, but have your own list of connections. Then reclaim the connection after a timeout, say 60 seconds.
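A rough sketch of the timeout idea, under assumptions: Lease and LeasingPool are hypothetical names, clients are assumed to receive a lease object rather than the raw connection, and the current time is passed in explicitly so that a timer thread (or a test) decides when reclaiming happens.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

/** Hypothetical handle given to clients instead of the raw resource. */
final class Lease<T> {
    private T resource;
    final long deadlineMillis;

    Lease(T resource, long deadlineMillis) {
        this.resource = resource;
        this.deadlineMillis = deadlineMillis;
    }

    /** Fails once the pool has taken the resource back. */
    synchronized T get() {
        if (resource == null) {
            throw new IllegalStateException("lease expired or returned");
        }
        return resource;
    }

    /** Detaches the resource; returns null if it was already detached. */
    synchronized T revoke() {
        T r = resource;
        resource = null;
        return r;
    }
}

final class LeasingPool<T> {
    private final Deque<T> idle = new ArrayDeque<>();
    private final List<Lease<T>> out = new ArrayList<>();

    LeasingPool(List<T> resources) {
        idle.addAll(resources);
    }

    /** Returns null when nothing is idle. */
    synchronized Lease<T> lease(long nowMillis, long ttlMillis) {
        if (idle.isEmpty()) {
            return null;
        }
        Lease<T> lease = new Lease<>(idle.pop(), nowMillis + ttlMillis);
        out.add(lease);
        return lease;
    }

    /** Normal return path; harmless if the lease was already reclaimed. */
    synchronized void giveBack(Lease<T> lease) {
        T r = lease.revoke();
        if (r != null && out.remove(lease)) {
            idle.push(r);
        }
    }

    /** Takes back every lease whose deadline has passed; returns how many. */
    synchronized int reclaimExpired(long nowMillis) {
        int reclaimed = 0;
        for (Iterator<Lease<T>> it = out.iterator(); it.hasNext(); ) {
            Lease<T> lease = it.next();
            if (nowMillis >= lease.deadlineMillis) {
                T r = lease.revoke();
                it.remove();
                if (r != null) {
                    idle.push(r);
                    reclaimed++;
                }
            }
        }
        return reclaimed;
    }

    synchronized int idleCount() {
        return idle.size();
    }
}
```

This only protects clients that go through get() for every use. A client that keeps the raw object from an earlier get() can still touch it after the deadline, so a reclaimed connection should probably be reset or re-validated before it is handed to someone else.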