I've created a mariadb cluster and I'm trying to get a Java application to be able to failover to another host when one of them dies.
I've created an application that creates a connection with "jdbc:mysql:sequential://host1,host2,host3/database?socketTimeout=2000&autoReconnect=true". The application makes a query in a loop every second. If I kill the node where the application is currently executing the query (Statement.executeQuery()) I get a SQLException because of a timeout. I can catch the exception and re-execute the statement and I see that the request is being sent to another server, so failover in that case works ok. But I was expecting that executeQuery() would not throw an exception and silently retry another server automatically.
Am I wrong in assuming that I shouldn't have to handle an exception and explicitely retry the query? Is there something more I need to configure for that to happen?
It is dangerous to auto reconnect for the following reason. Let's say you have this code:
BEGIN;
SELECT ... FROM tbl WHERE ... FOR UPDATE;
(line 3)
UPDATE tbl ... WHERE ...;
COMMIT;
Now let's say the server crashes at (line 3). The transaction will be rolled back. In my fabricated example, that only involves releasing the lock on tbl.
Now let's say that some other connection succeeds in performing the same transaction on the same row while you are auto-reconnecting.
Now, with auto-reconnect, the first thread is oblivious that the first half of the transaction was rolled back and proceeds to do the UPDATE based on data that is now out of date.
You need to get an exception so that you can go back to the BEGIN so that you can be "transaction safe".
You need this anyway -- With Galera, and no crashes, a similar thing could happen. Two threads performing that transaction on two different nodes at the same time... Each succeeds until it gets to the COMMIT, at which point the Galera magic happens and one of the COMMITs is told to fail. The 'right' response is replay the entire transaction on the server that was chosen for failure.
Note that Galera, unlike non-Galera, requires checking for errors on COMMIT.
More Galera tips (aimed at devs and dbas migrating from non-Galera)
Failover doesn't mean that application doesn't have to handle exceptions.
Driver will try to reconnect to another server when connection is lost.
If driver fail to reconnect to another server a SQLNonTransientConnectionException will be thrown, pools will automatically discard those connection.
If connection is recovered, there is some marginals cases where relaunching query is safe: when query is not in a transaction, and connection is currently in read-only mode (using Spring #Transactional(readOnly = false)) for example. For thoses cases, MariaDb java connection will then relaunch query automatically. In those particular cases, no exception will be thrown, and failover is transparent.
Driver cannot re-execute current query during a transaction.
Even without without transaction, if query is an UPDATE command, driver cannot know if the last request has been received by the database server and executed.
Then driver will send an SQLException (with SQLState begining by "25" = INVALID_TRANSACTION_STATE), and it's up to the application to handle those cases.
Related
How can I check that a connection to db is active or lost using spring data jpa?
Only the way is to make a query "SELECT 1"?
Nothing. Just execute your query. If the connection has died, either your JDBC driver will reconnect (if it supports it, and you enabled it in your connection string--most don't support it) or else you'll get an exception.
If you check the connection is up, it might fall over before you actually execute your query, so you gain absolutely nothing by checking.
That said, a lot of connection pools validate a connection by doing something like SELECT 1 before handing connections out. But this is nothing more than just executing a query, so you might just as well execute your business query.
our best chance is to just perform a simple query against one table, e.g.:
select 1 from SOME_TABLE;
https://docs.oracle.com/javase/6/docs/api/java/sql/Connection.html#isValid%28int%29
if you can use Spring Boot. Spring Boot Actuator is useful for you.
actuator will configure automatically and after it has activated ,
you can get know database status to request "health"
http://[CONTEXT_ROOT]/health
and it will return , database status like below
{"status":"UP","db":{"status":"UP","database":"PostgreSQL","hello":1}}
In my organization, Oracle database is a 2 node RAC database. Each member of the cluster is on a reboot schedule that is:
Node 1 - First Sunday of each month at 1:00am
Node 2 - Second Sunday of each month at 1:00am
Whenever these node gets rebooted, I see below exception in my J2EE application log file:
org.hibernate.engine.transaction.spi.AbstractTransactionImpl.rollback(AbstractTransactionImpl.java:209)
... 154 more
Caused by: java.sql.SQLRecoverableException: ORA-01089: immediate shutdown in progress - no operations are permitted
at oracle.jdbc.driver.T4CTTIoer.processError(T4CTTIoer.java:445)
at oracle.jdbc.driver.T4CTTIoer.processError(T4CTTIoer.java:389)
at oracle.jdbc.driver.T4CTTIoer.processError(T4CTTIoer.java:382)
at oracle.jdbc.driver.T4C7Ocommoncall.processError(T4C7Ocommoncall.java:93)
at oracle.jdbc.driver.T4CTTIfun.receive
Our DBAs said that "We do a shutdown transactional local. It is supposed to try to wait for in flight transactions to finish and not allow new transactions. "
As I mentioned above, out of 2 nodes, only one node gets rebooted at a time, and considering DBAs answer.. our app should never block on database during reboot process.
My question is, why my application is throwing this exception then? And why my application is trying to connect to a DB node for which shut down is in progress?
Your application isn't trying to connect to a node that is shutting down. It's already connected when the shutdown starts.
I assume that your application maintains a connection pool in the middle tier. So, presumably, just before one of the nodes restarts, your connection pool has open connections to both nodes. When the DBA does a shutdown transactional, sessions that have active transactions are allowed to complete but most of the sessions in your connection pool that are connected to the node that is shutting down will not have active transactions at that point. When you get one of these connections from the connection pool and try to start a transaction, you'll get this error.
Most likely, you want to catch this error and reconnect which should cause a new connection to be made to the remaining node. The error you're getting is a SQLRecoverableException so it would generally make sense to try to recover.
I'm using latest derby10.11.1.1.
Doing something like this:
DriverManager.registerDriver(new org.apache.derby.jdbc.EmbeddedDriver())
java.sql.Connection connection = DriverManager.getConnection("jdbc:derby:filePath", ...)
Statement stmt = connection.createStatement();
stmt.setQueryTimeout(2); // shall stop query after 2 seconds but id does nothing
stmt.executeQuery(strSql);
stmt.cancel(); // this in fact would run in other thread
I get exception "java.sql.SQLFeatureNotSupportedException: Caused by: ERROR 0A000: Feature not implemented: cancel"
Do you know if there is way how to make it work? Or is it really not implemented in Derby and I would need to use different embedded database? Any tip for some free DB, which I can use instead of derby and which would support SQL timeout?
As i got in java docs
void cancel() throws SQLException
Cancels this Statement object if both the DBMS and driver support aborting an SQL statement. This method can be used by one thread to cancel a statement that is being executed by another thread.
and it will throws
SQLFeatureNotSupportedException - if the JDBC driver does not support this method
you can go with mysql.
there are so many embedded database available you can go through
embedded database
If you get Feature not implemented: cancel then that is definite, cancel is not supported.
From this post by H2's author it looks like H2 supports two ways to timeout your queries, both through the JDBC API and through a setting on the JDBC URL.
Actually I found that there is deadlock timeout in derby as well only set to 60 seconds by default and I never have patience to reach it :).
So the correct answer would be:
stmt.setQueryTimeout(2); truly seems not working
stmt.cancel(); truly seems not implemented
But luckily timeout in database manager exists. And it is set to 60 seconds. See derby dead-locks.
Time can be changed using command:
statement.executeUpdate("CALL SYSCS_UTIL.SYSCS_SET_DATABASE_PROPERTY(" +
"'derby.locks.waitTimeout', '5')");
And it works :)
I have been handling a application which uses wicket+JPA+springs technologies.Recently we got many 5XX error in logs(greater than threshhold).During that time,There were some general problems due to unstable response times of the mainframe db2 which is backend for our application.
But after that once the mainframe is OK this application servers did not come to normal again.
There are a lot of hanging transactions (from my appplication).
There are many threads in the server that may be hung.
As users will go on keeping login or will access the links in aplication during that time the situation becomes worse.
When I look at webspehere logs I found following exceptions:
00000035 ThreadMonitor W WSVR0605W: Thread "WebContainer : 88" (000005ac)
has been active for 637111 milliseconds and may be hung.
There is/are 43 thread(s) in total in the server that may be hung.
In application logs i found following exceptions:
-->CouldNotLockPageException: Could not lock page 4. Attempt lasted 3 minutes
-->DefaultExceptionMapper - Connection lost, give up responding.
org.apache.wicket.protocol.http.servlet.ResponseIOException:
com.ibm.wsspi.webcontainer.ClosedConnectionException: OutputStream encountered error during
write.
--> JDBCExceptionReporter - [jcc][t4][2030][11211][3.67.27] A communication error occurred
during operations on the connection's underlying socket, socket input stream,
or socket output stream.
Error location: Reply.fill() - socketInputStream.read (-1). Message:
Connection reset. ERRORCODE=-4499, SQLSTATE=08001DSRA0010E: SQL State = 08001, Error Code = - 4.499
Now we are working on the solutions to this problem.The follwing are two solutions that we are thinking as of now.
1.I have gone through many forums and found that whenever we get CouldNotLockPageException then it would be better to invaidate the session and force user to login page.Currently We do not have session invalidation (logout) mechanism.So we will implement that one.
2.We need to implement transaction timeouts so that we can stop hanging transactions.
I need solution for this problem from java or server side.Here we are using wicket,jpa and springs frameworks.I have few queries.
1.How can we implement transaction timeouts in the above frameworks?
2.Will invalidating session can stop hanging transaction or threads that may hung?
Since you are already using Spring, it's as simple as that:
#Transactional(timeout = 300)
The Transaction annotation allow you to supply a timeout value(in seconds) and the transaction manager will forward it to the JTA transaction manager or your Data Source connection pool. It works nice with Bitronix Transaction Manager, which automatically picks it up.
You also need to make sure the java.sql.Conenction are always being closed and Transaction are always committed (when all operations succeeded) or rollbacked on failure.
Invalidating the user http session has nothing to do with jdbc connections. Your jdbc connection should always be committed/rollbacked and closed(which in case on connection pooling, will release the connection to the pool).
And make sure the max pool size is not greater than tour db max concurrent connections setting.
I presume i have a close to default HicariConfiguration with MaximumPoolSize(5).
The problem i faced with is there're a lot of attempts to connect to database even the first one failed. I mean, for instance, the password i'm going to use to connect to Oracle is wrong and connection fails, but then we have one more attempts to connect to database which lock the account as a result.
Question: What HicariCP setting is supposed to be used to limit up to 1 number of attempt to connect?
Thanks for any information!
### UPDATE
env.conf:
jdbc {
test1 {
datasourceClassName="oracle.jdbc.pool.OracleDataSource"
dataSourceUrl=.....jdbc url
dataSourceUser=USER
dataSourcePassword=password
setMaximumPoolSize = 5
setJdbc4ConnectionTest = true
}
}
Conf file is read by means of ConfigFactory, and create HicariConfig based on conf file (setDriverClassName etc).
Output of HikariConfig:
autoCommit.....................true
connectionTimeOut..............30000
idleTimeOut....................600000
initializationFailFast.........false
isolateInternalQueries.........false
jdbc4ConnectionTest............test
maxLifetime....................1800000
minimumIdle....................5
https://github.com/brettwooldridge/HikariCP/issues/312, As explained at the end of this issue, HikariCP will keep trying to acquire a connection. It removed the acquireRetries parameters deliberately. so the way is to configure the right username/password, since DB only lock after authenticaions failures.
Here's extracted from the issue. HikariCP intends to retry forever.
Back to acquireRetries... Without a concept of acquireRetries, how
long does the dedicated thread continue to try to create a new
connection? Forever. The background creation thread will continue to
try to add a connection to the pool forever, or until one of three
conditions is met: