So I have a Java process that runs indefinitely as a TCP server (receives messages from another process, and has onMsg handlers).
One of the things I want to do with the message in my Java program is to write it to disk using a database connection to postgres. Right now, I have one single static connection object which I call every time a message comes in. I do NOT close and reopen the connection for each message.
I am still a bit new to Java, so I wanted to know 1) whether there are any pitfalls or dangers in keeping one connection object open indefinitely, and 2) whether there are performance benefits to never closing the connection, as opposed to reopening/closing it every time I want to hit the database.
Thanks for the help!
I do NOT close and reopen the connection for each message.
Yes you do... at least as far as the plain Connection object is concerned. Otherwise, if you ever end up with a broken connection, it'll be broken forever, and if you ever need to perform multiple operations concurrently, you'll have problems.
What you want is a connection pool to manage the "real" connections to the database, and you just ask for a connection from the pool for each operation and close it when you're done with it. Closing the "logical" connection just returns the "real" connection to the pool for another operation. (The pool can handle keeping the connection alive with a heartbeat, retiring connections over time etc.)
There are lots of connection pool technologies available, and it's been a long time since I've used "plain" JDBC so I wouldn't like to say where the state of the art is at the moment - but that's research you can do for yourself :)
Creating a database connection is always a performance hit. Only a very naive implementation would create and close a connection for each operation. If you only needed to do something once an hour, then it would be acceptable.
However, if you have a program that performs several database accesses per minute (or even per second for larger apps), you don't want to actually close the connection.
So when do you close the connection? Easy answer: let a connection pool handle that for you. You ask the pool for a connection, it'll give you an open connection (that it either has cached, or if it really needs to, a brand new connection). When you're finished with your queries, you close() the connection, but it actually just returns the connection to the pool.
For very simple programs setting up a connection pool might be extra work, but it's not very difficult and definitely something you'll want to get the hang of. There are several open source connection pools, such as DBCP from Apache and C3P0.
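As a rough illustration of the "close() just returns the connection to the pool" idea described above, here is a minimal sketch using only the JDK. TinyPool, TinyPoolDemo and fakeConnection() are hypothetical names made up for this example, and the fake connection is a stand-in so the code runs without a database. A real pool such as DBCP or C3P0 also validates, times out and retires connections, which this sketch does not attempt.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.sql.Connection;
import java.sql.SQLException;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Supplier;

/** Hypothetical, illustration-only pool; not the API of DBCP, C3P0 or any real library. */
class TinyPool {
    private final BlockingQueue<Connection> idle;

    /** Opens `size` "real" connections up front using the supplied factory. */
    TinyPool(int size, Supplier<Connection> factory) {
        idle = new ArrayBlockingQueue<>(size);
        for (int i = 0; i < size; i++) {
            idle.add(factory.get());
        }
    }

    /** Borrows a real connection and wraps it so that close() hands it back. */
    Connection getConnection() throws SQLException {
        final Connection real;
        try {
            real = idle.take(); // blocks while every connection is in use
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new SQLException("Interrupted while waiting for a connection", e);
        }
        AtomicBoolean returned = new AtomicBoolean(false);
        InvocationHandler handler = (proxy, method, args) -> {
            if (method.getName().equals("close")) {
                // Logical close: the real connection stays open and goes back to the pool.
                if (returned.compareAndSet(false, true)) {
                    idle.add(real);
                }
                return null;
            }
            try {
                return method.invoke(real, args); // everything else is delegated
            } catch (InvocationTargetException e) {
                throw e.getCause();
            }
        };
        return (Connection) Proxy.newProxyInstance(
                TinyPool.class.getClassLoader(), new Class<?>[] {Connection.class}, handler);
    }

    /** Number of real connections currently sitting idle in the pool. */
    int idleCount() {
        return idle.size();
    }
}

class TinyPoolDemo {
    /** Stand-in for a driver connection so the sketch runs without a database. */
    static Connection fakeConnection() {
        return (Connection) Proxy.newProxyInstance(
                TinyPoolDemo.class.getClassLoader(), new Class<?>[] {Connection.class},
                (proxy, method, args) -> null);
    }

    public static void main(String[] args) throws SQLException {
        TinyPool pool = new TinyPool(2, TinyPoolDemo::fakeConnection);
        try (Connection con = pool.getConnection()) {
            System.out.println("idle while borrowed: " + pool.idleCount());
        }
        System.out.println("idle after close(): " + pool.idleCount());
    }
}
```

With a pool of two, the demo reports one idle connection while one is borrowed and two again after the try-with-resources block closes it. The calling code looks the same as it would with a plain connection; only the meaning of close() changes.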
I am working on mongodb connection pooling & I came across this option we can set in mongo client : "MaxConnectionIdleTime".
It basically means that connection will die after this time when sitting idle.
The default value is zero & it's stated that in case of zero, there's no limit.
Does it mean that once a connection has been created it won't die at all & will be kept in pool forever ?
Assume there's room for new connections to be created in the pool (for example, min connections = 10 and max connections = 1000), and that the max connection time isn't set.
If you can suggest a method to test it out on my own, that'll be really helpful too.
Please let me know if there's any way I can improve the question.
Thanks!
"Die" is a sloppy term and it is not clear what you mean by it.
If no idle time is set, the connection will not be proactively closed by the driver upon that idle time elapsing. That's all.
Connections may be closed by the driver in the following other circumstances:
It experiences a network error or a timeout.
Another connection to the server that this connection is associated with experiences a network error, causing the server to be marked unusable by the driver.
A connection may become unusable because of a network error. The connection may be unusable without the driver knowing about it. The driver often detects unusability of a connection when trying to use it (i.e. write or read something) which means a connection may be unusable (one might say dead) while the driver thinks it is perfectly fine, for a long time.
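To make the "zero means no limit" rule concrete, here is a small sketch of how a pool might decide whether to close an idle connection proactively. IdlePolicy and shouldCloseIdle are hypothetical names for illustration only; this is not the MongoDB driver's code, just the rule as described above.

```java
import java.time.Duration;

/** Hypothetical illustration of an idle-time rule; not MongoDB driver code. */
class IdlePolicy {
    private final Duration maxIdle;

    /** A zero duration is treated as "no idle limit", mirroring the documented default. */
    IdlePolicy(Duration maxIdle) {
        if (maxIdle.isNegative()) {
            throw new IllegalArgumentException("maxIdle must not be negative");
        }
        this.maxIdle = maxIdle;
    }

    /** True only when a limit is configured and the connection has been idle for longer. */
    boolean shouldCloseIdle(Duration idleFor) {
        if (maxIdle.isZero()) {
            return false; // no limit: never closed just for being idle
        }
        return idleFor.compareTo(maxIdle) > 0;
    }

    public static void main(String[] args) {
        IdlePolicy noLimit = new IdlePolicy(Duration.ZERO);
        IdlePolicy thirtySeconds = new IdlePolicy(Duration.ofSeconds(30));
        System.out.println(noLimit.shouldCloseIdle(Duration.ofDays(365)));
        System.out.println(thirtySeconds.shouldCloseIdle(Duration.ofSeconds(31)));
    }
}
```

Note that this only covers proactive closing on idle time. A connection that is never closed for idleness can still be closed for the other reasons listed above, or be silently broken until the next attempt to use it.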
I will try to explain what I want to know.
I have a Java web app that makes connections to a MySQL database.
If I execute SHOW PROCESSLIST in MySQL I have rows like this:
id: xxx
User: xxx
Host: XXX
db: XXX
Command: Sleep
Time: 12352
State:
Info: NULL
I understand that each process is an open connection to the database.
To manage the connections I use a pool like this
http://www.chuidiang.com/java/mysql/BasicDataSource-Pool-Conexiones.php
(It's in Spanish but I think you see the idea)
So to "open" the connection I do something like this:
DataSource ds = ...;
con = ds.getConnection();
Internally, my class is not opening the connection; it's only getting a connection already opened by the pool.
The question is:
Can I know which class is holding (or has made con = ds.getConnection()) a particular connection shown in SHOW PROCESSLIST?
MySQL doesn't know, so it can't tell you.
Your Java code could tell you, by running the query SELECT CONNECTION_ID() AS connection_id and logging the output. This will be the same ID shown in SHOW [FULL] PROCESSLIST.
Note, of course, that this doesn't mean the class still holds the connection, only that it obtained it. Sleeping threads are idle and if you're pooling, nobody may be holding them. Whoever held the connection last may have already released it to the pool.
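A minimal sketch of that logging idea with plain JDBC might look like the following. ConnectionIdLogger and logConnectionId are hypothetical names, and passing the calling class as a parameter is just one assumed way of recording who obtained the connection.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.logging.Logger;

/** Hypothetical helper; the class and method names are illustrative only. */
class ConnectionIdLogger {
    private static final Logger LOG = Logger.getLogger(ConnectionIdLogger.class.getName());

    /** Asks MySQL for this session's ID and logs it together with the caller's class name. */
    static long logConnectionId(Connection con, Class<?> owner) throws SQLException {
        try (Statement st = con.createStatement();
             ResultSet rs = st.executeQuery("SELECT CONNECTION_ID() AS connection_id")) {
            if (!rs.next()) {
                throw new SQLException("CONNECTION_ID() returned no row");
            }
            long id = rs.getLong("connection_id");
            LOG.info(owner.getName() + " obtained MySQL connection " + id);
            return id;
        }
    }
}
```

You would call it right after con = ds.getConnection(), for example ConnectionIdLogger.logConnectionId(con, getClass()), and then match the logged ID against the id column of SHOW PROCESSLIST.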
Also, remember, that sleeping threads are sleeping, so they generally aren't hurting anything, sitting there sleeping like that. The time is time since last used for a query.
You may also find that, in the presence of a stateful firewall between MySQL and the application, a connection is idled out of the firewall's memory due to excessive idleness (sometimes with crazy low timeouts like 15 minutes). The application tends to initiate most interactions with MySQL, so the application will figure out that the connection is useless when its attempts to communicate with the server after such an idle period are rebuffed by the firewall; it will then abandon the connection and get a new one, but the server will sit happily waiting for requests that never come. These connections waste server resources when this happens. Enabling TCP keepalives in the kernels of the servers will mitigate this, if it's happening. The firewall could be a part of your infrastructure that you don't realize is doing this.
Why should I prefer using a connection pool instead of static variable from a static class in Tomcat to hold the database connection?
This seems to me equivalent of using a connection pool having the capacity to store just one connection. So, a related question is: why the capacity of a connection pool needs to be bigger than one connection?
Thanks in advance.
With a pool, you can have multiple threads using different connections. Do you really want to limit your web application to handling one db-related request at a time? :) (And adding the complication of synchronizing to make sure that one thread doesn't try to use that single connection while another request is doing so...)
It would be generally a very bad idea to have a connection pool with a capacity of 1 - but at least if you did so, you could later increase the capacity without changing any other code, of course.
(EDIT: And as noted in other answers, connections in a pool can be closed and reopened if they become stale or broken in some way.)
The reason is to increase scalability, robustness and speed.
If you're creating a web application, there can be many concurrent HTTP requests coming in, each served by a different thread.
If you have only one static connection to the database, you need synchronization around that connection, because you can't safely share a connection between several threads. That means each HTTP request has to wait for whoever else is using the database. And you need to fix up or reconnect that connection if something goes wrong with it at one point or another.
You could open a connection at the start of each HTTP request - however opening a new database connection can be expensive, and you don't get much control over how many database connections you have. Databases can be overwhelmed by having too many connections.
A connection pool solves this: you have a pool of already opened connections that can be handed out to serve an HTTP request, or to different parts of the application that need to do database operations. Each connection is returned to the pool when the database operation is finished, ready to be used again by something else.
A connection pool of just 1 connection rarely makes sense, though. However, connection pools take care of many other things as well, such as closing a connection and opening a new one if it goes stale or is otherwise in a bad state, and handling the synchronization when there are no more connections to hand out at a particular time.
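The effect of pool capacity on concurrency can be sketched with plain JDK concurrency tools. In this hypothetical example (BoundedPoolDemo and peakConcurrency are made-up names), integers stand in for connections, a short sleep stands in for a query, and the method reports the highest number of "connections" in use at once.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a pool: hands out at most `capacity` tokens at once. */
class BoundedPoolDemo {
    /** Runs `tasks` simulated requests against a pool of `capacity`; returns the peak number in use. */
    static int peakConcurrency(int capacity, int tasks) throws InterruptedException {
        BlockingQueue<Integer> pool = new ArrayBlockingQueue<>(capacity);
        for (int i = 0; i < capacity; i++) {
            pool.add(i);
        }
        AtomicInteger inUse = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        ExecutorService requests = Executors.newFixedThreadPool(tasks);
        for (int i = 0; i < tasks; i++) {
            requests.submit(() -> {
                Integer conn = pool.take(); // blocks when the pool is exhausted
                try {
                    peak.accumulateAndGet(inUse.incrementAndGet(), Math::max);
                    Thread.sleep(20); // pretend to run a query
                } finally {
                    inUse.decrementAndGet();
                    pool.add(conn); // hand the "connection" back
                }
                return null; // makes this a Callable, so checked exceptions are allowed
            });
        }
        requests.shutdown();
        requests.awaitTermination(10, TimeUnit.SECONDS);
        return peak.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("peak with capacity 1: " + peakConcurrency(1, 4));
        System.out.println("peak with capacity 4: " + peakConcurrency(4, 8));
    }
}
```

With a capacity of 1 the peak is always 1, so every request queues behind the one before it, which is the same bottleneck as a single static connection. With a larger capacity the peak can rise up to the capacity, and the blocking take() is the synchronization the pool does for you when nothing is free.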
If you're using a connection pool with only one connection, it is equivalent to having one static connection, as you mentioned, and the pool gives you no advantage in that regard.
The strength of a connection pool is when you're using multiple connections (multiple threads) - it saves you the effort of managing the connections (open/close connections, boilerplate-code, smart resource handling etc).
Using a connection pool for one connection only is kind of like paving a 10-lane road that will be used by one car only - lot of overhead with (almost) no gain.
Using a connection pool is not just about sharing connections: it is about leveraging years of experience with broken JDBC drivers and all the weird ways in which a Connection can become unusable. Using a single static connection is not only a bottleneck, but a very fragile solution. It will cause your application to break on a regular basis, and even restarting the application will not clean up the problems: you may get a connection leak.
I have gone through a couple of articles on singleton examples, and I could see that developers sometimes implement the database connection object or connection manager as a singleton. Some posts even advised using a database connection pool.
Well, singleton means creating a single instance, so basically we restrict access. For example, for a printer or other hardware, or for a logger, we use a singleton to restrict access to one user at a time. However, what is the purpose of using a singleton for DB connection objects?
If I understand correctly, creating a database connection as a singleton means that the app server will have only one instance. Does this mean only one user can access the database connection and the next user has to wait until the connection is closed?
Please advise.
I think you understand correctly the implication of making the connection itself a singleton. Generally it is not a good idea (although in some very particular case it could make sense).
Making a connection manager or a connection pool a singleton is completely different. The pool itself handles a collection of connections, and it can create new ones as they are needed (up to a limit) or re-use ones that have already been used and released.
Having several connection pools at the same time would lose the advantages of the pool:
It would be harder to control the total number of connections open
One pool could be creating connections while another has connections available
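The difference between a singleton connection and a singleton pool can be sketched like this. AppConnectionPool is a hypothetical class, String handles stand in for real connections, and the limit of 5 is an arbitrary assumption. For brevity this sketch creates all its "connections" up front rather than on demand.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/** Hypothetical singleton pool; String handles stand in for real connections. */
final class AppConnectionPool {
    private static final int MAX_CONNECTIONS = 5; // assumed limit, for illustration only

    // Initialization-on-demand holder: the JVM creates the one instance lazily and thread-safely.
    private static final class Holder {
        static final AppConnectionPool INSTANCE = new AppConnectionPool();
    }

    private final BlockingQueue<String> idle = new ArrayBlockingQueue<>(MAX_CONNECTIONS);

    private AppConnectionPool() {
        for (int i = 1; i <= MAX_CONNECTIONS; i++) {
            idle.add("connection-" + i);
        }
    }

    /** The pool is the singleton; the connections inside it are not. */
    static AppConnectionPool getInstance() {
        return Holder.INSTANCE;
    }

    /** Hands out an idle connection, waiting if all of them are in use. */
    String borrow() throws InterruptedException {
        return idle.take();
    }

    /** Returns a borrowed connection so another caller can use it. */
    void giveBack(String connection) {
        idle.add(connection);
    }

    int available() {
        return idle.size();
    }

    public static void main(String[] args) throws InterruptedException {
        AppConnectionPool pool = AppConnectionPool.getInstance();
        String first = pool.borrow();
        String second = pool.borrow(); // a second user does not wait for the first
        System.out.println(first + " and " + second + " in use, " + pool.available() + " idle");
        pool.giveBack(first);
        pool.giveBack(second);
    }
}
```

Because there is exactly one pool, the total number of connections is bounded in one place, while several users can still hold different connections at the same time.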
Hope this helps to clarify the subject. You might want to read more on connection pools.
Q: "However what is the purpose of using singleton in DB connection objects?" A: There is (almost always) none. So your thinking is correct.
Q: "Does this mean only one user can access the Database connection and next user has to wait until the connection is closed?"
A: It depends (for the first part) and no (for the second part, after "and"). In a single-threaded application only one user uses the database at a time, and the next one waits until the first user's request has been handled, not until the connection is closed. Once a connection is closed, you need to create another connection to use the database. In a multi-threaded application many threads may be using the same connection instance, and the result really depends on the vendor's implementation: it may block (effectively turning your app into a single-threaded one), throw exceptions, or do something else entirely. However, such a design in a multi-threaded app is, in my opinion, a programmer error.
We have seen connection droughts in our system every once in a while, and the problem seems to be that Sessions are not being returned to the connection pool quickly enough. I wrote a test that seems to confirm that calling Session.disconnect() on each session once we are done with it solves this problem. However, I also timed these calls, and using disconnect seems to triple the running time.
According to the docs (http://docs.jboss.org/hibernate/core/3.5/api/org/hibernate/Session.html#disconnect() ), disconnect should be returning it to the connection pool. However, the doc also says it "closes" the connection. I'm not sure what it means because I know for a fact that Session.close() does more than disconnect, and what good would a connection pool be if you close the connection before returning it?
In any case, I'm wondering why a method that returns the session to the connection pool would be anything but instantaneous and essentially free. Surely that's the whole point of a connection pool, right?
Any ideas would be appreciated.