I'm using MyBatis to connect to Oracle.
My MyBatis config is:
<settings>
<setting name="lazyLoadingEnabled" value="true" />
<setting name="aggressiveLazyLoading" value="false" />
<setting name="logImpl" value="${logImpl}" />
<setting name="defaultStatementTimeout" value="10" />
</settings>
<environments default="default">
<environment id="default">
<transactionManager type="JDBC" />
<dataSource type="POOLED">
<property name="driver" value="${jdbc.driver}" />
<property name="url" value="${jdbc.url}" />
<property name="username" value="${jdbc.username}" />
<property name="password" value="${jdbc.password}" />
<property name="poolPingConnectionsNotUsedFor" value="290000"/>
<property name="poolPingQuery" value="SELECT COUNT(*) FROM RESORT"/>
<property name="poolPingEnabled" value="true"/>
</dataSource>
</environment>
</environments>
My code for opening a session looks like this:
SqlSession sqlSession = factory.openSession();
Object result = null;
try
{
QueryInfoMapper mapper = sqlSession.getMapper(QueryInfoMapper.class);
result = mapper.queryInfoFromOpera(mybatisMapping);
} finally
{
sqlSession.close();
}
Because the class is application-scoped and a SqlSession cannot be kept in application scope, I have to manage the SqlSession myself.
The log is
2019-04-11 15:30:35,773 INFO [stdout] (default task-60) Opening JDBC Connection
2019-04-11 15:30:41,860 INFO [stdout] (default task-57) Bad connection. Could not roll back
2019-04-11 15:30:41,861 INFO [stdout] (default task-57) Claimed overdue connection 962608913.
2019-04-11 15:30:41,861 INFO [stdout] (default task-57) A bad connection (962608913) was returned from the pool, getting another connection.
2019-04-11 15:30:41,895 INFO [stdout] (default task-57) Created connection 1812494479.
2019-04-11 15:30:41,895 INFO [stdout] (default task-57) Setting autocommit to false on JDBC Connection [oracle.jdbc.driver.T4CConnection#6c08788f]
2019-04-11 15:30:41,895 INFO [stdout] (default task-57) ==> Preparing: SELECT TRAVEL_AGENT_NAME FROM( SELECT TRAVEL_AGENT_NAME FROM OPERA.NAME_RESERVATION WHERE RESV_NAME_ID = ? ) WHERE ROWNUM = 1
2019-04-11 15:30:41,896 INFO [stdout] (default task-57) ==> Parameters: 288541(String)
2019-04-11 15:30:41,900 INFO [stdout] (default task-57) <== Columns: TRAVEL_AGENT_NAME
2019-04-11 15:30:41,900 INFO [stdout] (default task-57) <== Row: null
2019-04-11 15:30:41,900 INFO [stdout] (default task-57) <== Total: 1
2019-04-11 15:30:41,900 INFO [stdout] (default task-57) Resetting autocommit to true on JDBC Connection [oracle.jdbc.driver.T4CConnection#6c08788f]
2019-04-11 15:30:41,900 INFO [stdout] (default task-57) Closing JDBC Connection [oracle.jdbc.driver.T4CConnection#6c08788f]
2019-04-11 15:31:00,788 INFO [stdout] (default task-60) Bad connection. Could not roll back
2019-04-11 15:31:00,788 INFO [stdout] (default task-60) Claimed overdue connection 1228464923.
2019-04-11 15:31:00,788 INFO [stdout] (default task-60) A bad connection (1228464923) was returned from the pool, getting another connection.
2019-04-11 15:31:00,820 INFO [stdout] (default task-60) Created connection 265625885.
2019-04-11 15:31:00,820 INFO [stdout] (default task-60) Setting autocommit to false on JDBC Connection [oracle.jdbc.driver.T4CConnection#fd5211d]
2019-04-11 15:31:00,820 INFO [stdout] (default task-57) Returned connection 1812494479 to pool.
Judging by the timestamps in the log, the delay seems to happen while the connection (here, the transaction) is being closed.
It takes 9 s or 19 s to close it, and the second log line is "Bad connection. Could not roll back". I can't work out the real cause or which method takes so much time. The issue doesn't happen every time; it occurs randomly.
I am thinking of setting <property name="poolMaximumActiveConnections" value="40" /> to allow more connections, but I'm not sure whether it would help.
What could cause the connection/transaction to fail to close, and how can I avoid it?
===========================
Update: I hit this issue again, and this time the log looks different:
2019-04-13 15:42:31,812 INFO [stdout] (default task-86) Opening JDBC Connection
2019-04-13 15:42:35,493 INFO [stdout] (default task-62) Execution of ping query 'SELECT COUNT(*) FROM RESORT' failed: IO Error: Socket read timed out
2019-04-13 15:42:35,493 INFO [stdout] (default task-62) Connection 1963609369 is BAD: IO Error: Socket read timed out
2019-04-13 15:42:35,493 INFO [stdout] (default task-62) A bad connection (1963609369) was returned from the pool, getting another connection.
2019-04-13 15:42:35,493 INFO [stdout] (default task-62) Checked out connection 195963529 from pool.
2019-04-13 15:42:35,493 INFO [stdout] (default task-62) Testing connection 195963529 ...
2019-04-13 15:42:54,448 INFO [stdout] (default task-62) Execution of ping query 'SELECT COUNT(*) FROM RESORT' failed: IO Error: Socket read timed out
2019-04-13 15:42:54,448 INFO [stdout] (default task-62) Connection 195963529 is BAD: IO Error: Socket read timed out
2019-04-13 15:42:54,448 INFO [stdout] (default task-62) A bad connection (195963529) was returned from the pool, getting another connection.
2019-04-13 15:42:54,479 INFO [stdout] (default task-62) Created connection 741137137.
By the way, I'll change the ping SQL to SELECT 1 FROM DUAL.
What could cause this socket read timeout?
I can see several problems here:
potentially heavy ping query (as pointed out by beny23)
long close connection operation
incorrect behaviour of the MyBatis connection pool
You definitely need to use SELECT 1 FROM DUAL as the ping query. Otherwise you are running a not-so-cheap operation every time the pool pings a connection.
The long close and IO Error: Socket read timed out suggest that there is either a network connectivity issue, an Oracle server availability issue, or both.
It makes sense to check Oracle's health at the time the issue happens. Does it respond to other queries at that time? What are the CPU, I/O, memory and swap usage? If the server is under very high load, it may not respond in time.
Diagnosing network connectivity issues is a very broad topic. The most reliable (and also most complex) way I know is to capture network traffic (with tools like tcpdump or Wireshark) on both ends and compare the captures.
Then there's an issue with the MyBatis connection pool.
First, some background on how the MyBatis connection pool works.
One important and non-obvious thing is that the MyBatis connection pool implementation forcefully returns connections to the pool if they are used for too long. Here's the quote from the documentation:
poolMaximumCheckoutTime – This is the amount of time that a Connection can be "checked out" of the pool before it will be forcefully returned. Default: 20000ms (i.e. 20 seconds)
It means that if the application tries to open a new connection and all connections are busy, MyBatis will forcefully reclaim the oldest connection if it has been in use for more than 20 seconds (by default).
This by itself may be very unexpected behaviour if you have long-running queries. Another, probably bigger, problem is how this is implemented in MyBatis. To grab the connection, the rollback of the transaction is requested from the thread that asked for the new connection (in the example above, thread default task-57 is holding the connection and thread default task-60 tries to get the connection from the pool).
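To make the reclaim rule concrete, here is a toy model of it in Java. It is only a sketch and not the MyBatis implementation: the class name ToyPool, its fields and the returned labels are made up for illustration, and it tracks checkout timestamps instead of real connections (it also assumes a pool size of at least 1). It shows that once every slot is busy and the oldest checkout is older than the maximum checkout time, the slot is handed to the new requester while the original holder may still be using it.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy model of the "claim overdue connection" rule described above.
// Hypothetical names; this is NOT the MyBatis implementation.
class ToyPool {
    private final int maxActive;
    private final long maxCheckoutMillis;
    // Checkout timestamps of the active connections, oldest first.
    private final Deque<Long> activeSince = new ArrayDeque<>();

    ToyPool(int maxActive, long maxCheckoutMillis) {
        this.maxActive = maxActive;
        this.maxCheckoutMillis = maxCheckoutMillis;
    }

    /** Returns "NEW", "CLAIMED_OVERDUE" or "MUST_WAIT" for a checkout at nowMillis. */
    String checkout(long nowMillis) {
        if (activeSince.size() < maxActive) {
            activeSince.addLast(nowMillis);
            return "NEW";
        }
        long oldest = activeSince.peekFirst();
        if (nowMillis - oldest > maxCheckoutMillis) {
            // The slot is taken away from its current holder,
            // which may still be in the middle of using it.
            activeSince.removeFirst();
            activeSince.addLast(nowMillis);
            return "CLAIMED_OVERDUE";
        }
        return "MUST_WAIT";
    }

    public static void main(String[] args) {
        ToyPool pool = new ToyPool(1, 20_000);
        System.out.println(pool.checkout(0));       // NEW
        System.out.println(pool.checkout(10_000));  // MUST_WAIT
        System.out.println(pool.checkout(25_000));  // CLAIMED_OVERDUE
    }
}
```

With a pool size of 2 instead of 1, the second request in this sketch would simply get a new slot, which is why a larger pool makes the forced reclaim rarer.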
This cross-thread rollback is a problem because the Oracle JDBC driver requires proper synchronization when a connection is accessed from multiple threads, and MyBatis does not provide it:
Controlled serial access to a connection, such as that provided by connection caching, is both necessary and encouraged. However, Oracle strongly discourages sharing a database connection among multiple threads. Avoid allowing multiple threads to access a connection simultaneously. If multiple threads must share a connection, use a disciplined begin-using/end-using protocol.
So this failure to synchronize access from multiple threads to the shared resource (the connection) may cause all kinds of consistency problems, and I do not exclude the possibility that the problem with closing the connection arises because the connection had got into an inconsistent state earlier due to the lack of synchronization.
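As an illustration of what a "disciplined begin-using/end-using protocol" can look like, here is a minimal sketch using only the JDK. The GuardedResource class is hypothetical and is not part of MyBatis or the Oracle driver; a plain int[] stands in for the connection. The point is only that every access goes through one lock, so two threads never touch the resource at the same time.

```java
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Function;

// Hypothetical wrapper illustrating a begin-using/end-using protocol:
// only one thread at a time may touch the wrapped resource.
class GuardedResource<R> {
    private final R resource;
    private final ReentrantLock lock = new ReentrantLock();

    GuardedResource(R resource) {
        this.resource = resource;
    }

    <T> T use(Function<R, T> action) {
        lock.lock();                 // begin-using
        try {
            return action.apply(resource);
        } finally {
            lock.unlock();           // end-using
        }
    }

    // Two threads increment a shared counter 10,000 times each through the guard.
    static int runDemo() {
        // The int[] is a stand-in for a connection; it is not thread-safe on its own.
        GuardedResource<int[]> guarded = new GuardedResource<>(new int[1]);
        Runnable work = () -> {
            for (int i = 0; i < 10_000; i++) {
                guarded.use(counter -> ++counter[0]);
            }
        };
        Thread a = new Thread(work);
        Thread b = new Thread(work);
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        Integer total = guarded.use(counter -> counter[0]);
        return total;
    }

    public static void main(String[] args) {
        System.out.println(runDemo()); // prints 20000
    }
}
```

Without the lock, the two threads could interleave their read-modify-write steps and lose updates, which is the same class of problem as two threads issuing calls on one JDBC connection.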
Increasing the pool size avoids the forced reclaim for the given load, because the pool is then never exhausted (or is exhausted less frequently).
Note that concurrency issues are very tricky to reproduce, and a passing synthetic test gives you virtually no guarantee. This is a broad topic, so I recommend Goetz's book, Java Concurrency in Practice, for details.
I would change the connection pool implementation, namely use https://github.com/swaldman/c3p0 or https://commons.apache.org/proper/commons-dbcp/ or https://brettwooldridge.github.io/HikariCP/.
Related
I currently have an AWS RDS Postgres Multi-AZ (13.4) database and multiple Java (Spring + Hibernate) applications that use the database with:
<artifactId>postgresql</artifactId>
<version>42.2.24</version>
and
<dependency>
<groupId>org.hibernate</groupId>
<artifactId>hibernate-hikaricp</artifactId>
<version>5.6.7.Final</version>
<scope>compile</scope>
</dependency>
<dependency>
<groupId>com.zaxxer</groupId>
<artifactId>HikariCP</artifactId>
<version>4.0.3</version>
<scope>compile</scope>
</dependency>
I configured HikariCP like this:
public DataSource hikariDataSource() {
// Adjusting Cache since RDS failover can/will result in a different IP for the Database
java.security.Security.setProperty("networkaddress.cache.ttl", "1");
java.security.Security.setProperty("networkaddress.cache.negative.ttl", "3");
....
final HikariConfig config = new HikariConfig();
config.setJdbcUrl(jdbcURL);
config.setUsername(user);
config.setPassword(password);
// Initial Connect wait time
config.setConnectionTimeout(60000);
// "We strongly recommend setting this value, and it should be several seconds shorter than any database or
// infrastructure imposed connection time limit."
config.setMaxLifetime(50000);
config.addDataSourceProperty("socketTimeout", "60");
// This value must be less than the maxLifetime value
config.setKeepaliveTime(30000);
config.setMaximumPoolSize(6);
config.setMinimumIdle(2);
config.setIdleTimeout(45000);
return new HikariDataSource(config);
When I trigger a failover of the AWS Multi-AZ database, HikariCP acknowledges that the connection is lost and reconnects after a while. So far so good. If my application was in the middle of a query when the failover occurred, the application throws an error because the connection was lost, something like:
2022-06-21 10:45:10 WARN com.zaxxer.hikari.pool.PoolBase - HikariPool-1 - Failed to validate connection org.postgresql.jdbc.PgConnection#7ca90935 (This connection has been closed.). Possibly consider using a shorter maxLifetime value.
2022-06-21 10:45:10 DEBUG com.zaxxer.hikari.pool.PoolBase - HikariPool-1 - Closing connection org.postgresql.jdbc.PgConnection#7ca90935: (connection is dead)
2022-06-21 10:45:10 DEBUG com.zaxxer.hikari.pool.HikariPool - HikariPool-1 - Added connection org.postgresql.jdbc.PgConnection#7f52ac10
2022-06-21 10:45:20 WARN com.zaxxer.hikari.pool.ProxyConnection - HikariPool-1 - Connection org.postgresql.jdbc.PgConnection#24118275 marked as broken because of SQLSTATE(08006), ErrorCode(0)
org.postgresql.util.PSQLException: An I/O error occurred while sending to the backend.
at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:349)
at org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:481)
at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:401)
...
and
2022-06-21 10:45:20 DEBUG com.zaxxer.hikari.pool.PoolBase - HikariPool-1 - Closing connection org.postgresql.jdbc.PgConnection#24118275: (connection is broken)
2022-06-21 10:45:20 WARN org.hibernate.engine.jdbc.spi.SqlExceptionHelper - SQL Error: 0, SQLState: 08006
2022-06-21 10:45:20 ERROR org.hibernate.engine.jdbc.spi.SqlExceptionHelper - An I/O error occurred while sending to the backend.
2022-06-21 10:45:20 WARN org.hibernate.engine.loading.internal.LoadContexts - HHH000100: Fail-safe cleanup (collections) : org.hibernate.engine.loading.internal.CollectionLoadContext#314419c8<rs=HikariProxyResultSet#320236451 wrapping org.postgresql.jdbc.PgResultSet#af06e>
2022-06-21 10:45:20 WARN org.hibernate.engine.loading.internal.CollectionLoadContext - HHH000160: On CollectionLoadContext#cleanup, localLoadingCollectionKeys contained [1] entries
2022-06-21 10:45:20 ERROR org.springframework.transaction.interceptor.TransactionInterceptor - Application exception overridden by rollback exception
org.springframework.dao.DataAccessResourceFailureException: could not extract ResultSet; nested exception is org.hibernate.exception.JDBCConnectionException: could not extract ResultSet
at org.springframework.orm.jpa.vendor.HibernateJpaDialect.convertHibernateAccessException(HibernateJpaDialect.java:255)
at org.springframework.orm.jpa.vendor.HibernateJpaDialect.translateExceptionIfPossible(HibernateJpaDialect.java:233)
at org.springframework.orm.jpa.AbstractEntityManagerFactoryBean.translateExceptionIfPossible(AbstractEntityManagerFactoryBean.java:551)
at org.springframework.dao.support.ChainedPersistenceExceptionTranslator.translateExceptionIfPossible(ChainedPersistenceExceptionTranslator.java:61)
...
Now I want to get rid of these errors.
In MongoDB there are Retryable Reads/Writes, which wait for a new primary to be elected before retrying a query once. Is there something similar I can do with Postgres? I already tried a Multi-AZ cluster setup, but even then the connection is simply lost, an error is thrown, and the connection is re-established.
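For illustration, the kind of application-level retry being asked about could be sketched as below. This is an assumption-laden sketch, not a feature of Postgres, the JDBC driver, HikariCP or Spring: the FailoverRetry class and its methods are hypothetical, and it treats any SQLSTATE starting with 08 (connection exception, such as the 08006 in the log above) as retryable. It is only safe when the whole unit of work is re-run from the start and repeating it has no unwanted side effects.

```java
import java.sql.SQLException;
import java.util.concurrent.Callable;

// Hypothetical helper: retries an operation when it fails with a
// connection-class SQLSTATE (class 08, e.g. 08006 as in the log above).
class FailoverRetry {

    static boolean isConnectionFailure(Throwable t) {
        // Walk the cause chain, because frameworks wrap the SQLException.
        for (Throwable c = t; c != null; c = c.getCause()) {
            if (c instanceof SQLException) {
                String state = ((SQLException) c).getSQLState();
                if (state != null && state.startsWith("08")) {
                    return true;
                }
            }
        }
        return false;
    }

    static <T> T withRetry(Callable<T> operation, int maxAttempts, long backoffMillis)
            throws Exception {
        for (int attempt = 1; ; attempt++) {
            try {
                return operation.call();
            } catch (Exception e) {
                if (attempt >= maxAttempts || !isConnectionFailure(e)) {
                    throw e; // not retryable, or out of attempts
                }
                Thread.sleep(backoffMillis * attempt); // simple linear backoff
            }
        }
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulated operation: fails twice with a wrapped 08006, then succeeds.
        String result = withRetry(() -> {
            if (++calls[0] < 3) {
                throw new RuntimeException(new SQLException("I/O error", "08006"));
            }
            return "ok";
        }, 5, 10);
        System.out.println(result + " after " + calls[0] + " attempts"); // ok after 3 attempts
    }
}
```

In a Spring application the retry would have to wrap the transactional method from the outside, so that each attempt starts a fresh transaction on a fresh connection.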
When I start HBase on my cluster, the HMaster and HQuorumPeer processes start on the master node, while only the HQuorumPeer process starts on the slaves.
In the GUI console, in the task section, I can see the master (node0) in the state RUNNING and the status "Waiting for region servers count to settle; currently checked in 0, slept for 250920 ms, expecting minimum of 1, maximum of 2147483647, timeout of 4500 ms, interval of 1500 ms".
In the software attributes section I can find all my nodes in the zookeeper quorum with the description "Addresses of all registered ZK servers".
So it seems that ZooKeeper is working, but the log files suggest that it is the problem.
Log hbase-clusterhadoop-master:
2016-09-08 12:26:14,875 INFO [main-SendThread(node0:2181)] zookeeper.ClientCnxn: Opening socket connection to server node0/192.168.1.113:2181. Will not attempt to authenticate using SASL (java.lang.SecurityException: Impossibile trovare una configurazione di login)
2016-09-08 12:26:14,882 WARN [main-SendThread(node0:2181)] zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599)
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
2016-09-08 12:26:14,994 WARN [main] zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper, quorum=node3:2181,node2:2181,node1:2181,node0:2181, exception=org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase
........
2016-09-08 12:32:53,063 INFO [master:node0:60000] zookeeper.ZooKeeper: Initiating client connection, connectString=node3:2181,node2:2181,node1:2181,node0:2181 sessionTimeout=90000 watcher=replicationLogCleaner0x0, quorum=node3:2181,node2:2181,node1:2181,node0:2181, baseZNode=/hbase
2016-09-08 12:32:53,064 INFO [master:node0:60000-SendThread(node3:2181)] zookeeper.ClientCnxn: Opening socket connection to server node3/192.168.1.112:2181. Will not attempt to authenticate using SASL (java.lang.SecurityException: Impossibile trovare una configurazione di login)
2016-09-08 12:32:53,065 INFO [master:node0:60000-SendThread(node3:2181)] zookeeper.ClientCnxn: Socket connection established to node3/192.168.1.112:2181, initiating session
2016-09-08 12:32:53,069 INFO [master:node0:60000-SendThread(node3:2181)] zookeeper.ClientCnxn: Session establishment complete on server node3/192.168.1.112:2181, sessionid = 0x357095a4b940001, negotiated timeout = 90000
2016-09-08 12:32:53,072 INFO [master:node0:60000] zookeeper.RecoverableZooKeeper: Node /hbase/replication/rs already exists and this is not a retry
2016-09-08 12:32:53,072 DEBUG [master:node0:60000] cleaner.CleanerChore: initialize cleaner=org.apache.hadoop.hbase.replication.master.ReplicationLogCleaner
2016-09-08 12:32:53,075 DEBUG [master:node0:60000] cleaner.CleanerChore: initialize cleaner=org.apache.hadoop.hbase.master.snapshot.SnapshotLogCleaner
2016-09-08 12:32:53,076 DEBUG [master:node0:60000] cleaner.CleanerChore: initialize cleaner=org.apache.hadoop.hbase.master.cleaner.HFileLinkCleaner
2016-09-08 12:32:53,077 DEBUG [master:node0:60000] cleaner.CleanerChore: initialize cleaner=org.apache.hadoop.hbase.master.snapshot.SnapshotHFileCleaner
2016-09-08 12:32:53,078 DEBUG [master:node0:60000] cleaner.CleanerChore: initialize cleaner=org.apache.hadoop.hbase.master.cleaner.TimeToLiveHFileCleaner
2016-09-08 12:32:53,078 INFO [master:node0:60000] master.ServerManager: Waiting for region servers count to settle; currently checked in 0, slept for 0 ms, expecting minimum of 1, maximum of 2147483647, timeout of 4500 ms, interval of 1500 ms.
2016-09-08 12:32:54,607 INFO [master:node0:60000] master.ServerManager: Waiting for region servers count to settle; currently checked in 0, slept for 1529 ms, expecting minimum of 1, maximum of 2147483647, timeout of 4500 ms, interval of 1500 ms.
2016-09-08 12:32:56,137 INFO [master:node0:60000] master.ServerManager: Waiting for region servers count to settle; currently checked in 0, slept for 3059 ms, expecting minimum of 1, maximum of 2147483647, timeout of 4500 ms, interval of 1500 ms.
Log hbase-clusterhadoop-zookeeper-node0 (master):
2016-09-08 12:26:18,315 WARN [WorkerSender[myid=0]] quorum.QuorumCnxManager: Cannot open channel to 1 at election address node1/192.168.1.156:3888
java.net.ConnectException: Connection refused
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:382)
at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:241)
at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:228)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:431)
at java.net.Socket.connect(Socket.java:527)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:368)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.toSend(QuorumCnxManager.java:341)
at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.process(FastLeaderElection.java:449)
at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.run(FastLeaderElection.java:430)
at java.lang.Thread.run(Thread.java:695)
Log hbase-clusterhadoop-regionserver-node1 (one of the slaves):
2016-09-08 12:33:32,690 INFO [regionserver60020-SendThread(node3:2181)] zookeeper.ClientCnxn: Opening socket connection to server node3/192.168.1.112:2181. Will not attempt to authenticate using SASL (java.lang.SecurityException: Impossibile trovare una configurazione di login)
2016-09-08 12:33:32,691 INFO [regionserver60020-SendThread(node3:2181)] zookeeper.ClientCnxn: Socket connection established to node3/192.168.1.112:2181, initiating session
2016-09-08 12:33:32,692 INFO [regionserver60020-SendThread(node3:2181)] zookeeper.ClientCnxn: Unable to read additional data from server sessionid 0x0, likely server has closed socket, closing socket connection and attempting reconnect
2016-09-08 12:33:32,793 WARN [regionserver60020] zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper, quorum=node3:2181,node2:2181,node1:2181,node0:2181, exception=org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/master
2016-09-08 12:33:32,794 ERROR [regionserver60020] zookeeper.RecoverableZooKeeper: ZooKeeper exists failed after 4 attempts
2016-09-08 12:33:32,794 WARN [regionserver60020] zookeeper.ZKUtil: regionserver:600200x0, quorum=node3:2181,node2:2181,node1:2181,node0:2181, baseZNode=/hbase Unable to set watcher on znode /hbase/master
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/master
at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1045)
at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:222)
at org.apache.hadoop.hbase.zookeeper.ZKUtil.watchAndCheckExists(ZKUtil.java:427)
at org.apache.hadoop.hbase.zookeeper.ZooKeeperNodeTracker.start(ZooKeeperNodeTracker.java:77)
at org.apache.hadoop.hbase.regionserver.HRegionServer.initializeZooKeeper(HRegionServer.java:778)
at org.apache.hadoop.hbase.regionserver.HRegionServer.preRegistrationInitialization(HRegionServer.java:751)
at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:884)
at java.lang.Thread.run(Thread.java:695)
2016-09-08 12:33:32,794 ERROR [regionserver60020] zookeeper.ZooKeeperWatcher: regionserver:600200x0, quorum=node3:2181,node2:2181,node1:2181,node0:2181, baseZNode=/hbase Received unexpected KeeperException, re-throwing exception
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/master
at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1045)
at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:222)
at org.apache.hadoop.hbase.zookeeper.ZKUtil.watchAndCheckExists(ZKUtil.java:427)
at org.apache.hadoop.hbase.zookeeper.ZooKeeperNodeTracker.start(ZooKeeperNodeTracker.java:77)
at org.apache.hadoop.hbase.regionserver.HRegionServer.initializeZooKeeper(HRegionServer.java:778)
at org.apache.hadoop.hbase.regionserver.HRegionServer.preRegistrationInitialization(HRegionServer.java:751)
at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:884)
at java.lang.Thread.run(Thread.java:695)
2016-09-08 12:33:32,795 FATAL [regionserver60020] regionserver.HRegionServer: ABORTING region server node1,60020,1473330794709: Unexpected exception during initialization, aborting
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/master
at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1045)
at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:222)
at org.apache.hadoop.hbase.zookeeper.ZKUtil.watchAndCheckExists(ZKUtil.java:427)
at org.apache.hadoop.hbase.zookeeper.ZooKeeperNodeTracker.start(ZooKeeperNodeTracker.java:77)
at org.apache.hadoop.hbase.regionserver.HRegionServer.initializeZooKeeper(HRegionServer.java:778)
at org.apache.hadoop.hbase.regionserver.HRegionServer.preRegistrationInitialization(HRegionServer.java:751)
at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:884)
at java.lang.Thread.run(Thread.java:695)
2016-09-08 12:33:32,798 FATAL [regionserver60020] regionserver.HRegionServer: RegionServer abort: loaded coprocessors are: []
2016-09-08 12:33:32,798 INFO [regionserver60020] regionserver.HRegionServer: STOPPED: Unexpected exception during initialization, aborting
2016-09-08 12:33:32,867 INFO [regionserver60020-SendThread(node0:2181)] zookeeper.ClientCnxn: Opening socket connection to server node0/192.168.1.113:2181. Will not attempt to authenticate using SASL (java.lang.SecurityException: Impossibile trovare una configurazione di login)
Log hbase-clusterhadoop-zookeeper-node1:
2016-09-08 12:33:32,075 WARN [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0%0:2181] quorum.Learner: Unexpected exception, tries=0, connecting to node3/192.168.1.112:2888
java.net.ConnectException: Connection refused
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:382)
at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:241)
at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:228)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:431)
at java.net.Socket.connect(Socket.java:527)
at org.apache.zookeeper.server.quorum.Learner.connectToLeader(Learner.java:225)
at org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:71)
at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:786)
2016-09-08 12:33:32,227 INFO [node1/192.168.1.156:3888] quorum.QuorumCnxManager: Received connection request /192.168.1.113:49844
2016-09-08 12:33:32,233 INFO [WorkerReceiver[myid=1]] quorum.FastLeaderElection: Notification: 1 (message format version), 0 (n.leader), 0x10000002d (n.zxid), 0x1 (n.round), LOOKING (n.state), 0 (n.sid), 0x1 (n.peerEpoch) FOLLOWING (my state)
2016-09-08 12:33:32,239 INFO [WorkerReceiver[myid=1]] quorum.FastLeaderElection: Notification: 1 (message format version), 3 (n.leader), 0x10000002d (n.zxid), 0x1 (n.round), LOOKING (n.state), 0 (n.sid), 0x1 (n.peerEpoch) FOLLOWING (my state)
2016-09-08 12:33:32,725 INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181] server.NIOServerCnxnFactory: Accepted socket connection from /192.168.1.111:49534
2016-09-08 12:33:32,725 WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181] server.NIOServerCnxn: Exception causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not running
2016-09-08 12:33:32,725 INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181] server.NIOServerCnxn: Closed socket connection for client /192.168.1.111:49534 (no session established for client)
The conf file hbase-site.xml:
<configuration>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<property>
<name>hbase.rootdir</name>
<value>hdfs://node0:9000/hbase</value>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>node0,node1,node2,node3</value>
</property>
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>/Users/clusterhadoop/usr/local/zookeeper</value>
</property>
<property>
<name>hbase.tmp.dir</name>
<value>/Users/clusterhadoop/usr/local/hbtmp</value>
</property>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
<property>
<name>hbase.master</name>
<value>node0:60000</value>
</property>
<property>
<name>hbase.zookeeper.property.clientPort</name>
<value>2181</value>
</property>
<property>
<name>hbase.zookeeper.property.maxClientCnxns</name>
<value>1000</value>
</property>
</configuration>
Hosts file:
127.0.0.1 localhost
127.0.0.1 node3
192.168.1.112 node3
192.168.1.156 node1
192.168.1.111 node2
192.168.1.113 node0
Any idea what the problem is and how to solve it?
Today I tried Voldemort on my Linux machine, but I can't get it fully working. I get the following error when I try to store a value using the "PUT" keyword.
Please look at this information and suggest a solution.
server.properties
bdb.sync.transactions=false
bdb.cache.size=1000MB
max.threads=5000
http.enable=true
socket.enable=true
node.id=0
kumaran#mohandoss-Vostro-1550:/home/kumaran/voldemort-1.3.0$ ./bin/voldemort-admin-tool.sh --get-metadata --url tcp://localhost:5555
[19:05:55,373 voldemort.store.socket.clientrequest.ClientRequestExecutorFactory$ClientRequestSelectorManager] INFO Closed, exiting [voldemort-niosocket-client-3]
[19:05:55,374 voldemort.store.socket.clientrequest.ClientRequestExecutorFactory$ClientRequestSelectorManager] INFO Closed, exiting [voldemort-niosocket-client-8]
[19:05:55,374 voldemort.store.socket.clientrequest.ClientRequestExecutorFactory$ClientRequestSelectorManager] INFO Closed, exiting [voldemort-niosocket-client-7]
[19:05:55,373 voldemort.store.socket.clientrequest.ClientRequestExecutorFactory$ClientRequestSelectorManager] INFO Closed, exiting [voldemort-niosocket-client-6]
[19:05:55,373 voldemort.store.socket.clientrequest.ClientRequestExecutorFactory$ClientRequestSelectorManager] INFO Closed, exiting [voldemort-niosocket-client-5]
[19:05:55,373 voldemort.store.socket.clientrequest.ClientRequestExecutorFactory$ClientRequestSelectorManager] INFO Closed, exiting [voldemort-niosocket-client-4]
[19:05:55,373 voldemort.store.socket.clientrequest.ClientRequestExecutorFactory$ClientRequestSelectorManager] INFO Closed, exiting [voldemort-niosocket-client-1]
[19:05:55,373 voldemort.store.socket.clientrequest.ClientRequestExecutorFactory$ClientRequestSelectorManager] INFO Closed, exiting [voldemort-niosocket-client-2]
[19:05:55,376 voldemort.store.socket.clientrequest.ClientRequestExecutor] WARN No client associated with Socket[unconnected] [main]
[19:05:55,376 voldemort.store.socket.clientrequest.ClientRequestExecutor] INFO Closing remote connection from Socket[unconnected] [main]
localhost:0
Key - cluster.xml
version() ts:1379597320733
: <cluster>
<name>Kumaran</name>
<server>
<id>0</id>
<host>localhost</host>
<http-port>8081</http-port>
<socket-port>5554</socket-port>
<admin-port>5555</admin-port>
<partitions>0, 1</partitions>
</server>
<server>
<id>1</id>
<host>localhost</host>
<http-port>8082</http-port>
<socket-port>5556</socket-port>
<admin-port>5557</admin-port>
<partitions>2, 3</partitions>
</server>
</cluster>
Key - stores.xml
version() ts:1379597320847
: <stores>
<store>
<name>test1</name>
<persistence>bdb</persistence>
<routing-strategy>consistent-routing</routing-strategy>
<routing>client</routing>
<replication-factor>2</replication-factor>
<required-reads>2</required-reads>
<required-writes>2</required-writes>
<key-serializer>
<type>string</type>
<schema-info version="0">UTF-8</schema-info>
</key-serializer>
<value-serializer>
<type>string</type>
<schema-info version="0">UTF-8</schema-info>
</value-serializer>
</store>
</stores>
Key - server.state
version() ts:1379597320870
: NORMAL_SERVER
Key - node.id
version() ts:1379597320865
: 0
Key - rebalancing.steal.info.key
version() ts:1379597320869
: []
localhost:1
Key - cluster.xml
Error in retrieving Failure while checking out socket for localhost:5557(ad1):
Key - stores.xml
Error in retrieving Failure while checking out socket for localhost:5557(ad1):
Key - server.state
Error in retrieving Failure while checking out socket for localhost:5557(ad1):
Key - node.id
Error in retrieving Failure while checking out socket for localhost:5557(ad1):
Key - rebalancing.steal.info.key
Error in retrieving Failure while checking out socket for localhost:5557(ad1):
Kumaran
Unreachable store exception: try caching your client connection. You have probably opened too many connections, and new ones are now being refused.
Make sure you have enough client-side threads to handle your server connection threads.
I can't give much more help, as your question is not very detailed.
The setup of my project is -
Spring JDBC for persistence
Apache DBCP 1.4 for connection pooling
MySQL 5 on Linux
Here is the log of my application that captures the interactions with the database.
2013-01-29 15:52:21,549 DEBUG http-bio-8080-exec-3 org.springframework.jdbc.core.JdbcTemplate - Executing SQL query [SELECT id from emp]
2013-01-29 15:52:21,558 DEBUG http-bio-8080-exec-3 org.springframework.jdbc.datasource.DataSourceUtils - Fetching JDBC Connection from DataSource
2013-01-29 15:52:31,878 INFO http-bio-8080-exec-3 jdbc.connection - 1. Connection opened org.apache.commons.dbcp.DriverConnectionFactory.createConnection(DriverConnectionFactory.java:38)
2013-01-29 15:52:31,878 DEBUG http-bio-8080-exec-3 jdbc.connection - open connections: 1 (1)
2013-01-29 15:52:31,895 INFO http-bio-8080-exec-3 jdbc.connection - 1. Connection closed org.apache.commons.dbcp.DelegatingConnection.close(DelegatingConnection.java:247)
2013-01-29 15:52:31,895 DEBUG http-bio-8080-exec-3 jdbc.connection - open connections: none
2013-01-29 15:52:41,950 INFO http-bio-8080-exec-3 jdbc.connection - 2. Connection opened org.apache.commons.dbcp.DriverConnectionFactory.createConnection(DriverConnectionFactory.java:38)
2013-01-29 15:52:41,950 DEBUG http-bio-8080-exec-3 jdbc.connection - open connections: 2 (1)
2013-01-29 15:52:52,001 INFO http-bio-8080-exec-3 jdbc.connection - 3. Connection opened org.apache.commons.dbcp.DriverConnectionFactory.createConnection(DriverConnectionFactory.java:38)
2013-01-29 15:52:52,002 DEBUG http-bio-8080-exec-3 jdbc.connection - open connections: 2 3 (2)
2013-01-29 15:53:02,058 INFO http-bio-8080-exec-3 jdbc.connection - 4. Connection opened org.apache.commons.dbcp.DriverConnectionFactory.createConnection(DriverConnectionFactory.java:38)
2013-01-29 15:53:02,058 DEBUG http-bio-8080-exec-3 jdbc.connection - open connections: 2 3 4 (3)
2013-01-29 15:53:03,403 DEBUG http-bio-8080-exec-3 org.springframework.jdbc.core.BeanPropertyRowMapper - Mapping column 'id' to property 'id' of type int
2013-01-29 15:53:04,494 DEBUG http-bio-8080-exec-3 org.springframework.jdbc.datasource.DataSourceUtils - Returning JDBC Connection to DataSource
Two things are clear from the log -
The connection pool only starts creating connections when the first request to execute a query is received.
A pool of 4 connections takes nearly 30 seconds to initialize.
My questions are -
How should one configure DBCP to initialize on startup automatically?
Should it really take that long to create connections?
Note: Please don't suggest switching to C3P0 or Tomcat connection pool. I'm aware of those solutions. I'm more interested in understanding the problem at hand than just a quick fix.
Besides I'm sure something so basic should be possible with DBCP as well.
Contents of dbcontext -
<bean id="dataSource" class="org.apache.commons.dbcp.BasicDataSource" destroy-method="close">
<property name="driverClassName" value="${db.driver}" />
<property name="url" value="${db.jdbc.url}" />
<property name="username" value="${db.user}" />
<property name="password" value="${db.password}" />
<property name="maxActive" value="20" />
<property name="initialSize" value="4" />
<property name="testOnBorrow" value="true" />
<property name="validationQuery" value="SELECT 1" />
</bean>
The initialSize doesn't take effect until you first request a connection. From the Javadoc for BasicDataSource#setInitialSize:
Sets the initial size of the connection pool.
Note: this method currently has no effect once the pool has been
initialized. The pool is initialized the first time one of the
following methods is invoked: getConnection, setLogwriter,
setLoginTimeout, getLoginTimeout, getLogWriter.
Try adding init-method="getLoginTimeout" to your bean to confirm this.
For a web application, you can implement the ServletContextListener.contextInitialized() method and fire a test query (e.g. SELECT id FROM emp LIMIT 1) through your data access layer. This initializes the connection pool so that it is ready before your application starts serving real users.
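A minimal sketch of such a warm-up step, written against the standard javax.sql.DataSource interface only. PoolWarmup and warmUp are hypothetical names, not part of DBCP or Spring. Going by the Javadoc quoted earlier, the first getConnection() call is what makes BasicDataSource build its pool, so borrowing one connection and returning it straight away should be enough.

```java
import java.sql.Connection;
import java.sql.SQLException;
import javax.sql.DataSource;

// Hypothetical helper, not a DBCP or Spring class.
final class PoolWarmup {
    private PoolWarmup() {}

    /**
     * Borrows one connection and immediately returns it to the pool.
     * For a lazily initialized pool, this first borrow is the call
     * that forces the pool to be created.
     */
    static void warmUp(DataSource dataSource) throws SQLException {
        try (Connection ignored = dataSource.getConnection()) {
            // Nothing to do: opening the connection is the whole point,
            // and try-with-resources closes (returns) it afterwards.
        }
    }
}
```

In a web application this would be called from contextInitialized() with the dataSource bean. A failure then shows up at startup instead of on the first user request.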
Have a look at the initialSize property, especially the part about when the pool is initialized. As sbridges points out, you can use the init-method property on beans to call one of the methods that trigger pool creation.
Also, you should look into why each connection takes so long to create: the log shows roughly 10 seconds between successive "Connection opened" entries.
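The per-connection delay can be read straight off the timestamps in the question's log. Here is a small sketch that does the arithmetic with java.time; ConnectionGaps and gapsMillis are hypothetical names, and the timestamps are copied from the "Fetching JDBC Connection" line and the four "Connection opened" lines.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for measuring the time between log entries.
final class ConnectionGaps {
    // Matches the log's timestamp layout, e.g. "2013-01-29 15:52:21,558".
    private static final DateTimeFormatter LOG_TS =
            DateTimeFormatter.ofPattern("uuuu-MM-dd HH:mm:ss,SSS");

    /** Returns the gap in milliseconds between each pair of consecutive timestamps. */
    static List<Long> gapsMillis(List<String> timestamps) {
        List<Long> gaps = new ArrayList<>();
        for (int i = 1; i < timestamps.size(); i++) {
            LocalDateTime previous = LocalDateTime.parse(timestamps.get(i - 1), LOG_TS);
            LocalDateTime current = LocalDateTime.parse(timestamps.get(i), LOG_TS);
            gaps.add(Duration.between(previous, current).toMillis());
        }
        return gaps;
    }

    public static void main(String[] args) {
        List<String> fromLog = Arrays.asList(
                "2013-01-29 15:52:21,558",  // Fetching JDBC Connection from DataSource
                "2013-01-29 15:52:31,878",  // 1. Connection opened
                "2013-01-29 15:52:41,950",  // 2. Connection opened
                "2013-01-29 15:52:52,001",  // 3. Connection opened
                "2013-01-29 15:53:02,058"); // 4. Connection opened
        System.out.println(gapsMillis(fromLog)); // prints [10320, 10072, 10051, 10057]
    }
}
```

Every gap is just over 10 seconds, which looks more like a fixed timeout somewhere during connection setup than like ordinary connection cost.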
I have the following code
Configuration config = new Configuration().configure();
config.buildMappings();
serviceRegistry = new ServiceRegistryBuilder().applySettings(config.getProperties()).buildServiceRegistry();
SessionFactory factory = config.buildSessionFactory(serviceRegistry);
Session hibernateSession = factory.openSession();
Transaction tx = hibernateSession.beginTransaction();
ObjectType ot = (ObjectType)hibernateSession.merge(someObj);
tx.commit();
return ot;
hibernate.cfg.xml contains:
<session-factory>
<property name="connection.url">jdbc:postgresql://127.0.0.1:5432/dbase</property>
<property name="hibernate.dialect">org.hibernate.dialect.PostgreSQLDialect</property>
<property name="connection.driver_class">org.postgresql.Driver</property>
<property name="connection.username">username</property>
<property name="connection.password">password</property>
<property name="connection.provider_class">org.hibernate.connection.C3P0ConnectionProvider</property>
<property name="hibernate.c3p0.acquire_increment">1</property>
<property name="hibernate.c3p0.min_size">5</property>
<property name="hibernate.c3p0.max_size">20</property>
<property name="hibernate.c3p0.max_statements">50</property>
<property name="hibernate.c3p0.timeout">300</property>
<property name="hibernate.c3p0.idle_test_period">3000</property>
<property name="hibernate.c3p0.acquireRetryAttempts">1</property>
<property name="hibernate.c3p0.acquireRetryDelay">250</property>
<property name="hibernate.show_sql">true</property>
<property name="hibernate.use_sql_comments">true</property>
<property name="hibernate.transaction.factory_class">org.hibernate.transaction.JDBCTransactionFactory</property>
<property name="hibernate.current_session_context_class">thread</property>
<mapping class="...." />
</session-factory>
After a few seconds and some successful inserts, the following exception appears:
org.postgresql.util.PSQLException: FATAL: sorry, too many clients already
at org.postgresql.core.v3.ConnectionFactoryImpl.doAuthentication(ConnectionFactoryImpl.java:291)
at org.postgresql.core.v3.ConnectionFactoryImpl.openConnectionImpl(ConnectionFactoryImpl.java:108)
at org.postgresql.core.ConnectionFactory.openConnection(ConnectionFactory.java:66)
at org.postgresql.jdbc2.AbstractJdbc2Connection.<init>(AbstractJdbc2Connection.java:125)
at org.postgresql.jdbc3.AbstractJdbc3Connection.<init>(AbstractJdbc3Connection.java:30)
at org.postgresql.jdbc3g.AbstractJdbc3gConnection.<init>(AbstractJdbc3gConnection.java:22)
at org.postgresql.jdbc4.AbstractJdbc4Connection.<init>(AbstractJdbc4Connection.java:30)
at org.postgresql.jdbc4.Jdbc4Connection.<init>(Jdbc4Connection.java:24)
at org.postgresql.Driver.makeConnection(Driver.java:393)
at org.postgresql.Driver.connect(Driver.java:267)
at com.mchange.v2.c3p0.DriverManagerDataSource.getConnection(DriverManagerDataSource.java:135)
at com.mchange.v2.c3p0.WrapperConnectionPoolDataSource.getPooledConnection(WrapperConnectionPoolDataSource.java:182)
at com.mchange.v2.c3p0.WrapperConnectionPoolDataSource.getPooledConnection(WrapperConnectionPoolDataSource.java:171)
at com.mchange.v2.c3p0.impl.C3P0PooledConnectionPool$1PooledConnectionResourcePoolManager.acquireResource(C3P0PooledConnectionPool.java:137)
at com.mchange.v2.resourcepool.BasicResourcePool.doAcquire(BasicResourcePool.java:1014)
at com.mchange.v2.resourcepool.BasicResourcePool.access$800(BasicResourcePool.java:32)
at com.mchange.v2.resourcepool.BasicResourcePool$AcquireTask.run(BasicResourcePool.java:1810)
at com.mchange.v2.async.ThreadPoolAsynchronousRunner$PoolThread.run(ThreadPoolAsynchronousRunner.java:547)
12:24:19.151 [ Thread-160] WARN internal.JdbcServicesImpl - HHH000342: Could not obtain connection to query metadata : Connections could not be acquired from the underlying database!
12:24:19.151 [ Thread-160] INFO dialect.Dialect - HHH000400: Using dialect: org.hibernate.dialect.PostgreSQLDialect
12:24:19.151 [ Thread-160] INFO internal.LobCreatorBuilder - HHH000422: Disabling contextual LOB creation as connection was null
12:24:19.151 [ Thread-160] INFO internal.TransactionFactoryInitiator - HHH000268: Transaction strategy: org.hibernate.engine.transaction.internal.jdbc.JdbcTransactionFactory
12:24:19.151 [ Thread-160] INFO ast.ASTQueryTranslatorFactory - HHH000397: Using ASTQueryTranslatorFactory
12:24:19.151 [ Thread-160] INFO hbm2ddl.SchemaUpdate - HHH000228: Running hbm2ddl schema update
12:24:19.151 [ Thread-160] INFO hbm2ddl.SchemaUpdate - HHH000102: Fetching database metadata
12:24:19.211 [Runner$PoolThread-#0] WARN resourcepool.BasicResourcePool - com.mchange.v2.resourcepool.BasicResourcePool$AcquireTask#ee4084 -- Acquisition Attempt Failed!!! Clearing pending acquires. While trying to acquire a needed new resource, we failed to succeed more than the maximum number of allowed acquisition attempts (1). Last acquisition attempt exception:
org.postgresql.util.PSQLException: FATAL: sorry, too many clients already
at org.postgresql.core.v3.ConnectionFactoryImpl.doAuthentication(ConnectionFactoryImpl.java:291)
It seems that Hibernate doesn't release the connection. But calling hibernateSession.close() throws a "Session is closed" exception, because tx.commit() has already been called.
I'm not quite sure what's going on here, but I'd recommend you not set hibernate.c3p0.acquireRetryAttempts to 1. First, that renders your next setting, hibernate.c3p0.acquireRetryDelay, irrelevant: it sets the length of time between retry attempts, but if there is only one attempt (OK, the param name is misleading; it sets the total number of tries), there are no retries. The effect of your settings is simply to have the pool try to fetch a Connection whenever a client comes in, then throw an Exception to clients immediately if that fails. It doesn't at all limit the number of Connections the pool will try to acquire (unless you set breakAfterAcquireFailure to true, in which case, with your settings, any failure to acquire a Connection would invalidate the whole pool).
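To make the "total number of tries" point concrete, here is a toy model of an acquire loop. It is not c3p0's code; AcquireAttempts and acquire are hypothetical names, and the model assumes totalTries is at least 1. It only illustrates why a delay between tries can never matter when there is exactly one try.

```java
import java.util.concurrent.Callable;

// Toy model only; this is not how c3p0 is implemented.
final class AcquireAttempts {
    private AcquireAttempts() {}

    /**
     * Calls source up to totalTries times, pausing delayMillis between
     * consecutive tries. With totalTries == 1 the pause is never reached.
     */
    static <T> T acquire(Callable<T> source, int totalTries, long delayMillis) throws Exception {
        if (totalTries < 1) {
            throw new IllegalArgumentException("this model assumes totalTries >= 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= totalTries; attempt++) {
            try {
                return source.call();
            } catch (Exception e) {
                last = e;
                if (attempt < totalTries) {
                    Thread.sleep(delayMillis); // only runs when another try remains
                }
            }
        }
        throw new Exception("gave up after " + totalTries + " tries", last);
    }
}
```

With totalTries set to 1, the first failure goes straight to the caller, which matches the "Acquisition Attempt Failed" warning in the question's log.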
I share sola's concern about your lack of reliable resource cleanup. If, under your settings, commit() means close() (and you are not allowed to call close explicitly? that seems bad), then it is commit that should be in the finally block (but commit in a finally block also seems bad, sometimes you don't want to commit). Whatever the issue with close/commit, with the code you have, occasional Exceptions between openSession and commit will lead to Connection leaks.
But that should not be the cause of your too-many-open-Connections problem. If you leak Connections, you'll find that the Connection pool eventually freezes (as maxPoolSize Connections are checked out forever due to the leaks). You'd only ever have maxPoolSize open Connections (20 with the max_size shown in the question). Something else is going on. Try reviewing your logs. Is more than one Connection pool somehow being initialized? (c3p0 dumps config information at INFO level on pool init, so if multiple pools are getting opened, you should see multiple messages. Alternatively, you can inspect running c3p0 pools via JMX to see whether and why more Connections than that have been opened.)
Good luck!
I found out why c3p0 behaved this way. The issue was quite trivial...
This part of code:
Configuration config = new Configuration().configure();
config.buildMappings();
serviceRegistry = new ServiceRegistryBuilder().applySettings(config.getProperties()).buildServiceRegistry();
SessionFactory factory = config.buildSessionFactory(serviceRegistry);
was executed multiple times. Thank you Steve for the tip.
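The usual remedy is to build the SessionFactory once and reuse it everywhere, since each SessionFactory built with the c3p0 provider brings its own connection pool. Below is a minimal, library-free sketch of that "build once" idea. BuildOnce is a hypothetical helper, not a Hibernate class; in the real application the builder passed to it would be the configuration code shown above.

```java
import java.util.function.Supplier;

// Hypothetical helper: runs the builder on first use and caches the result.
final class BuildOnce<T> implements Supplier<T> {
    private final Supplier<T> builder;
    private T instance;

    BuildOnce(Supplier<T> builder) {
        this.builder = builder;
    }

    /** Returns the cached instance, invoking the builder only on the first call. */
    @Override
    public synchronized T get() {
        if (instance == null) {
            instance = builder.get();
        }
        return instance;
    }
}
```

Kept in a static final field (or as a singleton bean), every caller then gets the same factory and calls openSession() on it, so only one pool is ever created.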
I suggest using a try-catch-finally block and closing the session in the finally clause, i.e.
try {
tx.commit();
} catch (HibernateException e) {
handleException(e);
} finally {
hibernateSession.close();
}
Also, the max_connections setting in postgresql.conf is 100 by default. Increase it if you need to.