Today:
I continue to work on an application that constantly creates and closes a LOT of SQL connections to the same Oracle database, using the same credentials, running very basic selects, updates, and inserts.
Currently, a singleton CustomConnectionManager checks its pool to see whether any CustomConnectors are available to be given out; if none is available, a new CustomConnector is created. Once given out, a connector may be closed by the client. Upon close(), the underlying SQLConnection (created and maintained by the CustomConnector) is also closed.
The good thing about this is that the SQLConnection ultimately remains closed after each use; however, very little reuse is going on, since the value lies in the SQLConnection, not in the CustomConnector.
Since all users of the system connect to the same database, the same single connection could be used to accommodate all requests. The original solution, which creates a new Connector for each request, seems very wasteful.
Proposed:
The singleton CustomConnectionManager will maintain two queues:
a queue of available CustomConnectors, each of which maintains its own SQLConnection, and
a queue of inUse CustomConnectors. Upon request, a new CustomConnector will be created and given out.
Client interaction happens only with the singleton CustomConnectionManager.
When a new connection is needed, the manager creates it, gives it to the client, and places it in the inUse queue. When the client is done with the connection, instead of closing it, the client calls .markConnectorAvailable(), which puts the connector back into the availableConnectors queue. (The client will no longer be able to control the underlying SQLConnection.)
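A minimal sketch of the proposed two-queue manager, with the JDBC details stubbed out (the real CustomConnector would wrap a java.sql.Connection; here it is empty so the queue mechanics stand on their own):

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Stub: the real class would create and hold a java.sql.Connection.
class CustomConnector {
}

final class CustomConnectionManager {
    private static final CustomConnectionManager INSTANCE = new CustomConnectionManager();
    private final Queue<CustomConnector> available = new ArrayDeque<>();
    private final Queue<CustomConnector> inUse = new ArrayDeque<>();

    private CustomConnectionManager() {}

    static CustomConnectionManager getInstance() { return INSTANCE; }

    // Hand out a pooled connector if one is free, otherwise create a new one.
    synchronized CustomConnector getConnector() {
        CustomConnector c = available.poll();
        if (c == null) {
            c = new CustomConnector();
        }
        inUse.add(c);
        return c;
    }

    // Called by the client instead of close(): the connector (and its
    // underlying SQLConnection) stays open and goes back into the pool.
    synchronized void markConnectorAvailable(CustomConnector c) {
        if (inUse.remove(c)) {
            available.add(c);
        }
    }
}
```

The synchronized methods matter because many clients will hit the singleton concurrently.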
Question 1: What do you think? Will this work? Is there an existing solution that already does this well?
Question 2: If the proposed approach is not a complete waste, what is a good point for a CustomConnector to close its SQLConnection?
That's connection pooling. ADO.NET kills them after a set period of no use (configurable in the connection string; the default is two minutes, as I remember).
Related
So I have a Java process that runs indefinitely as a TCP server (receives messages from another process, and has onMsg handlers).
One of the things I want to do with the message in my Java program is to write it to disk using a database connection to postgres. Right now, I have one single static connection object which I call every time a message comes in. I do NOT close and reopen the connection for each message.
I am still a bit new to Java, and I wanted to know: 1) are there any pitfalls or dangers in keeping one connection object open indefinitely, and 2) are there performance benefits to never closing the connection, as opposed to reopening and closing it every time I want to hit the database?
Thanks for the help!
I do NOT close and reopen the connection for each message.
Yes you do... at least as far as the plain Connection object is concerned. Otherwise, if you ever end up with a broken connection, it'll be broken forever, and if you ever need to perform multiple operations concurrently, you'll have problems.
What you want is a connection pool to manage the "real" connections to the database, and you just ask for a connection from the pool for each operation and close it when you're done with it. Closing the "logical" connection just returns the "real" connection to the pool for another operation. (The pool can handle keeping the connection alive with a heartbeat, retiring connections over time etc.)
There are lots of connection pool technologies available, and it's been a long time since I've used "plain" JDBC so I wouldn't like to say where the state of the art is at the moment - but that's research you can do for yourself :)
Creating a database connection is always a performance hit. Only a very naive implementation would create and close a connection for each operation. If you only needed to do something once an hour, then it would be acceptable.
However, if you have a program that performs several database accesses per minute (or even per second for larger apps), you don't want to actually close the connection.
So when do you close the connection? Easy answer: let a connection pool handle that for you. You ask the pool for a connection, it'll give you an open connection (that it either has cached, or if it really needs to, a brand new connection). When you're finished with your queries, you close() the connection, but it actually just returns the connection to the pool.
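The "logical close" described above can be shown in miniature. This sketch (not any particular pool library) hands out a java.lang.reflect.Proxy over the real Connection; calling close() on the proxy just returns the real connection to the pool:

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.sql.Connection;
import java.util.ArrayDeque;
import java.util.Queue;

// Minimal sketch of a pool that hands out proxy Connections whose close()
// returns the underlying connection to the pool instead of closing it.
final class TinyPool {
    private final Queue<Connection> free = new ArrayDeque<>();

    TinyPool(Connection... physical) {
        for (Connection c : physical) free.add(c);
    }

    synchronized Connection borrow() {
        final Connection real = free.poll();
        if (real == null) throw new IllegalStateException("pool exhausted");
        InvocationHandler h = (proxy, method, args) -> {
            if (method.getName().equals("close")) {
                giveBack(real);                  // logical close: back to pool
                return null;
            }
            return method.invoke(real, args);    // delegate everything else
        };
        return (Connection) Proxy.newProxyInstance(
                Connection.class.getClassLoader(),
                new Class<?>[] { Connection.class }, h);
    }

    synchronized void giveBack(Connection real) { free.add(real); }

    synchronized int freeCount() { return free.size(); }
}
```

Real pools add validation, timeouts, and retirement on top of this, but the close-returns-to-pool trick is the core idea.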
For very simple programs, setting up a connection pool might be extra work, but it's not very difficult and is definitely something you'll want to get the hang of. There are several open-source connection pools, such as DBCP from Apache and c3p0.
I came across this interview question : How will you manage a db connection pool?
My thought was:
I would create an ArrayBlockingQueue<Connection>, create some connection objects, and put them in the queue when the JVM starts up. Then I'd wrap this in some form of enum singleton so there is only one such queue and it stays alive for the life of the JVM.
Then use some kind of utility/driver class that will take connections from the queue and return them back to the queue.
I am wondering what else I need to say here. Do I need to make the queue thread-safe so that multiple requests don't get the same connection?
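The design described above could be sketched like this (plain Objects stand in for real java.sql.Connections so the queue mechanics can be shown without a database; ArrayBlockingQueue is already thread-safe, so two threads can never take() the same element):

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Enum singleton: exactly one pool for the life of the JVM.
enum ConnectionPool {
    INSTANCE;

    private static final int SIZE = 5;
    private final BlockingQueue<Object> pool = new ArrayBlockingQueue<>(SIZE);

    ConnectionPool() {
        for (int i = 0; i < SIZE; i++) {
            // Real code: pool.add(DriverManager.getConnection(url, user, pass));
            pool.add(new Object());
        }
    }

    // Blocks until a connection is free; the queue's own locking guarantees
    // no two callers receive the same connection.
    public Object take() throws InterruptedException {
        return pool.take();
    }

    public void release(Object conn) {
        pool.offer(conn);
    }

    public int availableCount() {
        return pool.size();
    }
}
```

So the direct answer to the thread-safety worry: ArrayBlockingQueue handles it for you.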
In my opinion you're missing several points here:
Connections should be returned to their initial state when going back to the pool. For instance, connection.setAutoCommit(...) should definitely be reverted.
It's a good idea to wrap the native connection in your own implementation of the java.sql.Connection interface to control and monitor actions performed on the connection. With this pattern you can also implement a valuable feature: returning the connection to the pool on a close() call.
You need some means to control the number of connections in the pool based on actual pool utilization. Take a look at how "capacity" and "load factor" are implemented in the Java collections to get a rough implementation idea.
Connections should be monitored to check whether they're still alive. That's not easy to achieve for all possible databases.
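The first point (resetting state on return) might look like the sketch below. `resetForPool` and `fakeConnection` are hypothetical names; the fake connection is only there to exercise the reset logic without a database:

```java
import java.lang.reflect.Proxy;
import java.sql.Connection;

final class ConnectionReset {
    // Revert session state before the connection re-enters the pool.
    static void resetForPool(Connection c) throws Exception {
        if (!c.getAutoCommit()) {
            c.rollback();          // discard any half-finished transaction
            c.setAutoCommit(true); // restore the default the pool hands out
        }
        c.clearWarnings();
    }

    // Hypothetical stand-in for a driver connection: tracks only the
    // autoCommit flag, treats everything else as a no-op.
    static Connection fakeConnection(boolean[] autoCommit) {
        return (Connection) Proxy.newProxyInstance(
            Connection.class.getClassLoader(), new Class<?>[]{ Connection.class },
            (p, m, a) -> {
                switch (m.getName()) {
                    case "getAutoCommit": return autoCommit[0];
                    case "setAutoCommit": autoCommit[0] = (Boolean) a[0]; return null;
                    default: return null; // rollback, clearWarnings: no-ops here
                }
            });
    }
}
```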
I have gone through a couple of articles with singleton examples, and I can see that developers sometimes make the database connection object or connection manager a singleton. In some posts it was even advised to use a database connection pool.
Well, a singleton means creating a single instance, so basically we restrict access, e.g. printer or hardware access, or logger access, where we try to restrict the user's access to one at a time using a singleton. However, what is the purpose of using a singleton for DB connection objects?
If I understand correctly, creating a database connection as a singleton means the app server will have only one instance. Does this mean only one user can access the database connection, and the next user has to wait until the connection is closed?
Please advise.
I think you understand correctly the implication of making the connection itself a singleton. Generally it is not a good idea (although in some very particular case it could make sense).
Making a connection manager or a connection pool a singleton is completely different. The pool itself would handle a collection of connections and it can create new as they are needed (up to a limit) or re-use the ones that have been already used and discarded.
Having several connection pools at the same time would lose the advantages of the pool:
It would be harder to control the total number of connections open
One pool could be creating connections while other could have connections available
Hope this helps to clarify the subject. You might want to read more on connection pools.
Q: "However what is the purpose of using singleton in DB connection objects?" A: There is (almost always) none. So your thinking is correct.
Q: "Does this mean only one user can access the Database connection and next user has to wait until the connection is closed?"
A: It depends (for the first part) and no (for the part after "and"). In a single-threaded application, only one user will use the database at a time, and another will wait until the dispatch of the first user ends, not until the connection is closed. Once the connection is closed, you need to create another connection to make use of the database. In a multi-threaded application, many threads may use the same connection instance, and the result really depends on the vendor implementation: it may block dispatching (effectively transforming your app into a single-threaded app), throw exceptions, or do something else entirely. However, such a design in a multi-threaded app is, in my opinion, a programmer error.
I know that the connection pool mechanism lets you keep database connections open across many transactions and close them only at the end.
I am using sshxcute (http://code.google.com/p/sshxcute/) to connect to a Unix machine from Java code, but if I have to execute Unix commands from different Java files, the entire process, right from connecting to the machine, takes place every time. I want to keep the session open between many calls to this machine. How do I achieve this? Basically, I want some mechanism like a connection pool that lets me open (connect to) the Unix machine only once, execute as many instructions as I want from different Java classes or methods, and finally, once and for all, close the session/connection to the Unix machine.
I've had to create such pools. It's not that hard. In a nutshell, my general approach is:
Create a class to manage the pool.
In that class, create a Collection to hold the available connections. (Ooh, nice rhyme to that.) Probably a LinkedList but it could be an ArrayList, etc. Create a second Collection to hold the connections that are currently in use. Initially these collections are empty.
Create a function in that class that can be called to request a connection. That function checks whether there are any connections in the pool. If so, it picks one, removes it from the available collection, adds it to the used collection, and returns it to the caller. If not, it creates a new one, adds it to the used collection, and returns it to the caller.
Create a function that can be called to release a connection. That function takes the connection as a parameter, removes it from the used collection, and adds it to the available collection. (If it's not in the used collection, that means someone got a connection without going through the pool. You may want to add it to the available collection anyway, or just close it and throw it away.)
That's basically it. There are, of course, bunches of details to be considered. Like:
Should there be a limit on maximum number of connections? If so, you must keep count of how many connections you've given out, and if a new request would put you over that limit, throw an exception instead of returning a connection. (Or maybe return a null, depending on just how you want to handle it.)
Should there be a limit on the number of connections to keep in the available pool? If so, when a connection is released, instead of automatically adding it to the available pool, check if the pool is already at maximum size and if so, close the connection instead of returning it to the pool.
It's a good idea for the get-connection function to test a connection before returning it. It's possible that the connection has timed out while it was sitting in the pool, for example. Perhaps you can send some low-cost message and make sure you get a valid response.
The main reason for having a used collection is so you can watch for connection leaks, i.e. someone requests a connection and then never gives it back. Rather than putting the connection directly into the used collection, I usually create a wrapper object to hold it that also keeps the time that it was given out. Then I put in a function that is called on a timer, loops through the used collection, and checks whether anything has been there for a ridiculously long amount of time. Depending on the type of connection, you may be able to check when it was last actually used, or do some other test to see whether the caller is really still using it or whether it is a connection leak. If you're confident that you can recognize a connection leak, you might close it or return it to the available pool. Otherwise you can at least write a message to a log, and periodically check the logs to see if you have leakage problems and hunt them down. If you don't do any connection-leak tracking, then the used collection is probably superfluous and can be eliminated.
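The used-collection-with-timestamp idea above can be sketched as follows. The class and method names are illustrative, and plain objects stand in for connections so the bookkeeping can be shown on its own:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Each checked-out connection is recorded with the time it was handed out,
// so a periodic sweep can flag anything held suspiciously long.
final class LeakTrackingPool<C> {
    private final Map<C, Long> used = new HashMap<>(); // connection -> checkout time

    synchronized void checkOut(C conn, long nowMillis) {
        used.put(conn, nowMillis);
    }

    synchronized void checkIn(C conn) {
        used.remove(conn);
    }

    // Called on a timer: anything held longer than maxAgeMillis is a
    // suspected leak, to be logged, closed, or investigated.
    synchronized List<C> findSuspectedLeaks(long nowMillis, long maxAgeMillis) {
        List<C> leaks = new ArrayList<>();
        for (Map.Entry<C, Long> e : used.entrySet()) {
            if (nowMillis - e.getValue() > maxAgeMillis) {
                leaks.add(e.getKey());
            }
        }
        return leaks;
    }
}
```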
Several ideas here:
A. Use an object pool. For example, you can use this one.
This will let you hold a pool of several Unix connections.
B. Have some sort of session management at your application level, and have a session context which holds a reference to an object taken from the pool.
Once a session starts, you try to obtain a connection from the pool.
Once the session ends, you return the connection to the pool.
Bear in mind that you may not have a 1:1 ratio between your application sessions and the objects held in the pool.
A possible strategy to handle this is to create a pool at initial size X, and let it grow up to size Y, if needed.
Another issue that you will need to handle is having some sort of "keep alive" check to see that your connections are still alive.
The two strategies I can think of here are:
A. Have periodic check on connections (let's say - use ping).
B. Create a new connection, if one of the connections at the pool got broken.
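Strategy B above could be sketched like this: on each sweep, connections that fail a ping are dropped and replaced with fresh ones so the pool keeps its size. The class is hypothetical, and the ping and connector are injected so the logic works for SSH sessions, JDBC connections, or anything else:

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.function.Predicate;
import java.util.function.Supplier;

final class KeepAlivePool<C> {
    private final Queue<C> available = new ArrayDeque<>();
    private final Predicate<C> ping;     // e.g. run a cheap command over SSH
    private final Supplier<C> connector; // opens a brand-new connection

    KeepAlivePool(int size, Predicate<C> ping, Supplier<C> connector) {
        this.ping = ping;
        this.connector = connector;
        for (int i = 0; i < size; i++) available.add(connector.get());
    }

    // Run this periodically, e.g. from a ScheduledExecutorService.
    synchronized void sweep() {
        int n = available.size();
        for (int i = 0; i < n; i++) {
            C c = available.poll();
            // Keep connections that answer the ping; replace broken ones.
            available.add(ping.test(c) ? c : connector.get());
        }
    }

    synchronized int size() { return available.size(); }
}
```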
Let's say we've got a web application or web service on an application server supporting JDBC connection pooling.
Should I be grabbing a new Connection on a per-thread or per-query basis?
Thanks!
Hopefully you are grabbing them on a per-transactional-unit-of-work basis.
Per query implies that you never have any logical unit of work in your system that spans more than a single query. (Maybe that's true, but you still might want to think about the future!)
Per-thread (which I assume to mean request-scoped, rather than for the entire life of the thread?) will probably result in holding them for longer than absolutely necessary, but it does allow you to manage transactions much better. (and it's how plenty of leading frameworks have worked or did work for a long time. A pattern known as Open Entity Manager In View, if you'd like to do some google-fu on it)
Assigning it indefinitely to a single thread means your max number of active request processors is capped at the max size of your database pool, which is a definite failure in scalability.
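The per-unit-of-work scoping recommended above usually looks like a borrow, a block of statements that commit or fail together, and a close() that hands the connection back to the pool. A sketch, with a hypothetical recording stub standing in for a pooled connection so the call sequence can be checked without a database:

```java
import java.lang.reflect.Proxy;
import java.sql.Connection;
import java.util.List;

final class UnitOfWork {
    static void runTransaction(Connection c) throws Exception {
        try (c) {                       // close() returns the pooled connection
            c.setAutoCommit(false);
            // ... all statements that must succeed or fail together ...
            c.commit();
        }
    }

    // Hypothetical stand-in that records which Connection methods were
    // called, so the pattern can be exercised without a database.
    static Connection recording(List<String> calls) {
        return (Connection) Proxy.newProxyInstance(
            Connection.class.getClassLoader(), new Class<?>[]{ Connection.class },
            (p, m, a) -> {
                calls.add(m.getName());
                return m.getReturnType() == boolean.class ? false : null;
            });
    }
}
```

With a pool behind it, this grabs a connection per unit of work, not per thread and not per query.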
per-thread
Each new request will grab a new connection (new thread = new request). There is no need to get a new connection for each query, since the connection can be reused after each query.