Abandoned connection cleanup in mariadb (compared to mysql)?

Switching from mysql-connector to mariadb client library:
What is the equivalent of the mysql class com.mysql.cj.jdbc.AbandonedConnectionCleanupThread.checkedShutdown()?
If there is any at all?
(I'm also using hikari connection pool).

I don't believe there is an equivalent; it looks like this feature was not carried over to MariaDB. It would be more prudent to fix the connection leak in the application instead.
As the HikariCP author explains in this message, force-closing abandoned connections has a number of problems:
Yes, we have considered it (removing abandoned connections), but ultimately we decided to pass. The problem with closing leaked connections is several fold. Some thread is possibly using that connection, and its going to blow-up (in production) somewhere if we close it. Or nothing is using that connection, and closing it has no negative impact, but now we've just covered up a leak that will cause constant cycling of connections in the pool.
Applications are responsible for cleaning up resources. Java developers tend to get lazy compared to C/C++ programmers. This is leak just like a memory leak, and both can and rightfully should eventually kill your application. How else would you 1) know a problem exist, and 2) be motivated to track it down and fix it.
We do appreciate all input, even if not adopted. In this case, users looking for a library to defensively cover-up coding errors should probably look to tomcat-jdbc.
Note, leak detection can be run in production, and can be enabled at runtime through a JMX console, so there's not a lot of justification for adding proactive connection reclamation.

Related

JPA/Hibernate under high concurrency: "Unable to acquire JDBC Connection"

When my project runs under high concurrency with JPA/Hibernate, it reports an "Unable to acquire JDBC Connection" error after running for a while.
After I added the HikariCP connection pool, the problem was solved. Why does this happen, and is there no other way to solve it?

It depends on which pool you used before.
HikariCP's maxLifetime defaults to 30 minutes. After that, the pool retires the connection and closes it, which frees a slot on the DBMS, which normally limits the maximum number of connections.
DBCP sets no such limit by default.
If you did not use a pool at all, nobody closes the connections unless you do it yourself.
That might be why you don't get the exceptions anymore. But be aware that a leak might remain: there might be Hibernate sessions held somewhere in your code that are never used and never closed.

Is there an error in the Oracle Database 12.2.0.1 JDBC Driver?

In 2006 I wrote my own JDBC connection pool for Oracle connections.
I stored the connections in a Vector, and every night I instantiated a new Vector object to reinitialize the connection pool:
connections = new Vector(poolsize);
As a result, all the existing connections were collected by the garbage collector, and Oracle removed the connections.
To be honest, it's a very poor solution, but it worked for 12 years without problems!
This year we updated our Oracle version to 12.2.0.1.0, and I updated the Oracle JDBC driver in my highly sophisticated programs.
I am currently using the Oracle Database 12.2.0.1 JDBC Driver (ojdbc8.jar), downloaded from this website:
https://www.oracle.com/technetwork/database/features/jdbc/jdbc-ucp-122-3110062.html
Database access works fine, except for my poor connection pool.
After calling "connections = new Vector(poolsize)", the Oracle DB does NOT remove the open connections, and the number of open JDBC connections increases every day until Oracle breaks down (too many open JDBC connections).
I know that I must close every JDBC connection with close() instead of only reinitializing the Vector holding the connections.
But I am wondering why the new Oracle JDBC driver does not remove all the connections after garbage collection has run.
Is this an error in the new JDBC driver?
With all older JDBC drivers this error did not occur; it occurs only with the new ojdbc8.jar.
A JDBC driver should automatically close all database-related objects (for example ResultSets) if they are no longer reachable.
I do not believe that every JDBC developer closes the ResultSet objects after the database operation has finished.
I have not tested whether ojdbc8.jar closes this kind of unclosed ResultSet object, but if not, some programs will blow up in the future.
What do you think: is there a bug in the new JDBC driver, because unreachable JDBC connections are not closed automatically?

Depending on finalizers in Java to discard resources is discouraged 1, 2.
You should switch to a well-tested pooling library, e.g. HikariCP. There are a lot of gotchas, e.g. how to properly reset a connection after a rollback has occurred; see Pool Analysis or Bad Behavior: Handling Database Down. Writing and maintaining this code yourself is counterproductive.

Not every JDBC developer will close resources, but every good developer will.
If you have connections in a pool, they're strongly reachable. This means they won't be GC'd and the finalize() method (if one exists in the driver's classes) that would free resources won't be called.
Don't blame the driver first; drivers tend to be heavily tested, unlike your code.
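As a minimal sketch of the explicit cleanup the question itself admits is needed, the nightly reset could close each connection before the old Vector is dropped. The names PoolReset and reset are hypothetical and not part of any driver API; error handling is reduced to a log line.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.util.List;
import java.util.Vector;

// Hypothetical helper: the names are illustrative, not part of the Oracle driver.
final class PoolReset {

    /** Closes every connection in the old pool, then returns a fresh, empty pool. */
    static Vector<Connection> reset(List<Connection> oldPool, int poolSize) {
        for (Connection c : oldPool) {
            try {
                if (c != null && !c.isClosed()) {
                    c.close(); // explicit close instead of waiting for the garbage collector
                }
            } catch (SQLException e) {
                // Keep going so that one broken connection does not leak the rest.
                System.err.println("close failed: " + e.getMessage());
            }
        }
        return new Vector<>(poolSize);
    }
}
```

With this in place, the nightly step would read `connections = PoolReset.reset(connections, poolsize);` rather than a bare `new Vector(poolsize)`.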

You can use Universal Connection Pool (UCP), which is Oracle's Java connection pool. Refer to UCPSample.java. Also, refer to the UCP Developer's Guide for more details.

Connection Pooling - How much of an overhead is it?

I am running a webapp inside WebSphere Application Server 6.1. This webapp has a kind of rules engine, where every rule obtains its very own connection from the WebSphere data source pool. So, I see that when a use case is run, for 100 records of input, about 400-800 connections are obtained from the pool and released back to the pool. I have a feeling that if this engine goes to production, it might take too much time to complete processing.
Is it a bad practice to obtain connections from pool that frequently? What are the overhead costs involved in obtaining connections from pool? My guess is that costs involved should be minimal as pool is nothing but a resource cache. Please correct me if I am wrong.

Connection pooling keeps connections alive in anticipation of use: when another user connects, a ready connection to the db is handed over, and the database does not have to open a connection all over again.
This is actually a good idea because opening a connection is not just a one-go thing. There are many trips to the server (authentication, retrieval, status, etc.). So if you've got a connection pool on your website, you're serving your customers faster.
Unless nobody visits your website, you can't afford not to have a connection pool working for you.

The pool doesn't seem to be your problem. The real problem lies in the fact that your "rules engine" doesn't release connections back to the pool before completing the entire calculation. The engine doesn't scale well, so it seems. If the number of database connections somehow depends on the number of records being processed, something is almost always very wrong!
If you manage to get your engine to release connections as soon as possible, it may be that you only need a few connections instead of a few hundred. Failing that, you could use a connection wrapper that re-uses the same connection every time the rules engine asks for one, although that somewhat negates the benefits of having a connection pool.
It also introduces many multithreading and transaction isolation issues; if the connections are read-only, though, it might be an option.

A connection pool is all about connection re-use.
If you are holding on to a connection at times when you don't need one, then you are preventing that connection from being re-used somewhere else. And if you have a lot of threads doing this, then you must also run with a larger pool of connections to prevent pool exhaustion. More connections take longer to create and establish, and they take more resources to maintain; there will be more reconnecting as the connections grow old, and your database server will also be impacted by the greater number of connections.
In other words: you want to run with the smallest possible pool without exhausting it. And the way to do that is to hold on to your connections as little as possible.
I have implemented a JDBC connection pool myself and, although many pool implementations out there probably could be faster, you are likely not going to notice because any slack going on in the pool is most likely dwarfed by the time it takes to execute queries on your database.
In short: connection pools just love it when you return their connections. Or they should anyway.

To really check whether your pool is a bottleneck, you should profile your program. If you find the pool is a problem, then you have a tuning problem. A simple pool should be able to handle 100K allocations per second or more, i.e. about 10 microseconds or less each. However, as soon as you use a connection, it will take between 200 and 2,000 microseconds to do something useful.
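A rough way to get a feel for the bookkeeping cost alone is a toy timing loop like the one below. It is only a sketch: the "connections" are plain objects in an ArrayBlockingQueue, the class name PoolOverhead is made up, and a real profiler run against your actual pool is what the advice above really calls for.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical micro-benchmark: measures queue bookkeeping only,
// not any real driver, validation, or network cost.
final class PoolOverhead {

    /** Returns the average nanoseconds per borrow-and-return cycle (poolSize must be at least 1). */
    static double averageNanosPerCycle(int poolSize, int cycles) {
        BlockingQueue<Object> pool = new ArrayBlockingQueue<>(poolSize);
        for (int i = 0; i < poolSize; i++) {
            pool.add(new Object()); // stand-in for an idle connection
        }
        long start = System.nanoTime();
        for (int i = 0; i < cycles; i++) {
            Object conn = pool.poll(); // borrow
            pool.offer(conn);          // return
        }
        return (System.nanoTime() - start) / (double) cycles;
    }

    public static void main(String[] args) {
        averageNanosPerCycle(10, 100_000); // warm-up pass so the JIT can compile the loop
        System.out.printf("avg per cycle: %.0f ns%n", averageNanosPerCycle(10, 1_000_000));
    }
}
```

Whatever number this prints on your machine, compare it against the 200 to 2,000 microseconds quoted above for actually using a connection.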

I think this is a poor design. Sounds like a Rete rules engine run amok.
If you assume 0.5-1.0 MB minimum per thread (e.g. for stack, etc.) you'll be thrashing a lot of memory. Checking the connections in and out of the pool will be the least of your problems.
The best way to know is to do a performance test and measure memory, wall times for each operation, etc. But this doesn't sound like it'll end well.
Sometimes I see people throw all their rules into Blaze or ILOG or JRules or Drools simply because it's "standard" and high tech. It's a terrific resume item, but how many of those solutions would be better served by a simpler table-driven decision tree? Maybe your problem is one of those.
I'd recommend that you get some data, see if there's a problem, and be prepared to redesign if the data tells you it's necessary.

Could you provide more details on what your rules engine does exactly? If each rule "firing" is performing data updates, you may want to verify that the connection is being properly released (put the release in a finally block to ensure that the connections really are being released).
If possible, you may want to consider capturing your data updates to a memory buffer, and write to the database only at the end of the rule session/invocation.
If the database operations are read-only, consider caching the information.
As bad as you think 400-800 connections being created and released to the pool is, I suspect it'll be much much worse if you have to create and close 400-800 unpooled connections.
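A minimal sketch of the finally-block advice, assuming each rule can be expressed as a callback that receives a borrowed connection. RuleRunner and RuleAction are hypothetical names; the DataSource stands for whatever pool the application server provides.

```java
import java.sql.Connection;
import java.sql.SQLException;
import javax.sql.DataSource;

// Hypothetical rule runner: RuleRunner and RuleAction are illustrative names.
final class RuleRunner {

    /** One rule's database work, expressed against a borrowed connection. */
    interface RuleAction {
        void apply(Connection conn) throws SQLException;
    }

    /** Borrows a connection for exactly one rule and always hands it back. */
    static void fire(DataSource pool, RuleAction rule) throws SQLException {
        Connection conn = pool.getConnection();
        try {
            rule.apply(conn);
        } finally {
            conn.close(); // runs even if the rule throws; a pooled DataSource takes the connection back
        }
    }
}
```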

Do I have to explicitly disconnect from a database when using Java?

Is it necessary to disconnect from the database after the job is done in Java? If it is not disconnected, will it lead to memory leaks?

You must always close all your Connections, Statements and ResultSets.
If you don't, you are more likely to be unable to obtain new connections from the pool than to have a memory leak.

You should provide more details like which framework you are using or something.
Anyway, are you using JDBC? If so you should close the following objects by using their respective close() methods: Statement, ResultSet and Connection.
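Since Java 7, try-with-resources can call those close() methods for you. The sketch below assumes a DataSource is available and uses a made-up class name (CloseInOrder); the SQL string is whatever the caller supplies.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import javax.sql.DataSource;

// Hypothetical example class; only the java.sql and javax.sql types are real API.
final class CloseInOrder {

    /** Counts the rows returned by the caller-supplied query. */
    static int countRows(DataSource ds, String sql) throws SQLException {
        // Resources are closed in reverse order of declaration:
        // ResultSet first, then Statement, then Connection.
        try (Connection conn = ds.getConnection();
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(sql)) {
            int rows = 0;
            while (rs.next()) {
                rows++;
            }
            return rows;
        }
    }
}
```

All three objects are closed whether the method returns normally or throws.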

Assuming you are using JDBC, the answer is yes. If you don't close the connection, then the JDBC driver might try to close it in a finalizer, but that could hold the connection open for a very long time, causing resource issues (the number of database connections allowed to be open at one time is finite). Typically JDBC programming is done with a database pool, and not closing the connection will mean that the pool will run out of available connections very quickly.
Some application servers (e.g. JBoss) will detect when a connection wasn't closed and close it for you if it is managing the transactions, but you should not rely on that.
Of course, some JDBC drivers are not pure Java drivers, at which point memory leaks become a very real possibility.

I don't have a source, but I believe (if I remember right, it's been a while since I've touched JDBC) that it depends on the JDBC driver implementation. You should always close your connections and clean up after yourself as not all JDBC drivers do it for you (although some might).
This goes back to a rule that I like to follow - If I create or open something, I'm responsible for deleting or closing it.

yes and yes

Reusing a connection while polling a database in JDBC?

I have a use case in which I need to keep a connection to a database open in order to execute queries periodically.
Is it advisable to close the connection after executing the query and then reopen it after the polling interval (10 minutes)? I would guess not, since opening a connection to a database is expensive.
Is connection pooling the alternative, so that I keep reusing the connections?

You should use connection pooling. Write your application code to request a connection from the pool, use the connection, then return the connection back to the pool. This keeps your code clean. Then you rely on the pool implementation to determine the most efficient way to manage the connections (for example, keeping them open vs closing them).
Generally it is "expensive" to open a connection, typically due to the overhead of setting up a TCP/IP connection, authentication, etc. However, it can also be expensive to keep a connection open "too long", because the database (probably) has reserved resources (like memory) for use by the connection. So keeping a connection open can tie-up those resources.
You don't want to pollute your application code managing these types of efficiency trade-offs, so use a connection pool.

Yes, connection pooling is the alternative. Open the connection each time (as far as your code is concerned) and close it as quickly as you can. The connection pool will handle the physical connection in an appropriately efficient manner (including any keepalives required, occasional "liveness" tests etc).
I don't know what the current state of the art is, but I used c3p0 very successfully for my last Java project involving JDBC (quite a while ago).

The answer here really depends on the application. If there are other connections being used simultaneously for the same database from the same application, then a pool is definitely your answer.
If all your application does is query the db, wait 10 minutes, then query again, then simply connect and reconnect. A connection is considered to be an expensive operation, but all things are relative. It is not expensive if you do it only once every 10 minutes. If the application is this simple, don't introduce unnecessary complexity.
NOTE:
OK, complexity is also relative, so if you are already using something like Spring and already know how to use its pooling mechanism, then apply it for this case. If this is not true, keep it simple.
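For the keep-it-simple case, a sketch along these lines opens a fresh connection on each run and closes it again straight away. PeriodicPoller, ConnectionFactory and PollTask are hypothetical names; the factory would typically wrap DriverManager.getConnection with your own URL and credentials.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical poller: all names in this class are illustrative.
final class PeriodicPoller {

    /** Opens a new physical connection, e.g. by wrapping DriverManager.getConnection(...). */
    interface ConnectionFactory {
        Connection open() throws SQLException;
    }

    /** The query work to perform on each run. */
    interface PollTask {
        void run(Connection conn) throws SQLException;
    }

    /** Once per period: open a connection, run the task, close the connection. */
    static ScheduledFuture<?> start(ScheduledExecutorService scheduler, ConnectionFactory factory,
                                    PollTask task, long period, TimeUnit unit) {
        return scheduler.scheduleWithFixedDelay(() -> {
            try (Connection conn = factory.open()) {
                task.run(conn);
            } catch (SQLException e) {
                // Log and carry on: an exception escaping this Runnable would cancel all later runs.
                System.err.println("poll failed: " + e.getMessage());
            }
        }, 0, period, unit);
    }
}
```

For the question's interval this would be started with a period of 10 and TimeUnit.MINUTES on a single-threaded scheduler from Executors.newSingleThreadScheduledExecutor().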

Connection pooling would be an option for you. You can then leave your code as it is, including opening and closing connections; the connection pool will take care of the physical connections. If you close a connection obtained from a pool, it is not actually closed but just made available in the pool again. If you then open a connection and there is an idle connection in the pool, the pool returns that one. In an application server you can use the built-in connection pools. For simple Java applications, most JDBC drivers also include a pooling driver.
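To illustrate the "close just returns it to the pool" behaviour, here is a toy sketch using a dynamic proxy. TinyPool is a made-up class, not how any particular pool is implemented; real pools such as HikariCP, DBCP or c3p0 also validate connections, reset their state, and enforce timeouts.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.sql.Connection;
import java.util.Queue;
import java.util.concurrent.atomic.AtomicBoolean;

// Illustrative only: TinyPool is a hypothetical name, not a real library class.
final class TinyPool {
    private final Queue<Connection> idle;

    TinyPool(Queue<Connection> physicalConnections) {
        this.idle = physicalConnections;
    }

    /** Hands out a wrapper whose close() returns the physical connection to the pool. */
    Connection borrow() {
        Connection physical = idle.poll();
        if (physical == null) {
            throw new IllegalStateException("pool exhausted");
        }
        AtomicBoolean returned = new AtomicBoolean(false);
        InvocationHandler handler = (proxy, method, args) -> {
            if (method.getName().equals("close")) {
                if (returned.compareAndSet(false, true)) {
                    idle.offer(physical); // back to the pool; the physical connection stays open
                }
                return null;
            }
            try {
                return method.invoke(physical, args); // every other call is delegated
            } catch (InvocationTargetException e) {
                throw e.getCause(); // surface the driver's original exception
            }
        };
        return (Connection) Proxy.newProxyInstance(
                TinyPool.class.getClassLoader(), new Class<?>[] {Connection.class}, handler);
    }

    int idleCount() {
        return idle.size();
    }
}
```

Application code keeps calling close() as usual; only the wrapper knows that nothing was physically closed.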

There are many, many tradeoffs in opening and closing connections, keeping them open, making sure that connections that have been "kept alive" are still "valid" when you start to use them again, invalidating connections that get corrupted, etc. These kinds of complex tradeoffs make it difficult (but certainly not impossible) to implement the "best" connection management strategy for your specific case. The "safest" method is to open a connection, use it, and then close it. But, as you already realize, that is not at all the most efficient method. If you manage your own connections, then as you do things to make your strategy more efficient, the complexity will rise very quickly (especially in the presence of any less-than-perfect JDBC drivers, of which there are many.)
There are many connection pooling libraries available out there that can take care of all of this for you in extremely configurable ways (they almost always come pre-configured out-of-the-box for the most typical cases, and until you get up to the point that you're doing high-load activities, you probably don't have to worry about all that configurability - but you will be glad to have it if you scale up!) As is always the case, the libraries themselves may be of variable quality.
I have successfully used both C3P0 and Apache DBCP. If I were choosing again today, I would probably go with DBCP.
