I'm using ActiveMQ 5.9 with Camel 2.10.3, and under load (during a performance test) I'm experiencing problems: the broker seems to be stuck trying to close a connection, and I can't understand why.
The JMS system is configured as follows: there are two brokers (in failover mode) and many client nodes that act both as consumers and as producers on several queues (let's take one as an example: the 'customer_update' queue).
I'm using a PooledConnectionFactory with the default configuration, the 'CACHE_CONSUMER' cache level, and a maximum of 10 concurrent consumers per client node.
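For reference, this is roughly how I wire the client side; a minimal sketch with hypothetical broker hostnames (broker1/broker2) and plain Java configuration instead of my actual Spring XML:

import org.apache.activemq.ActiveMQConnectionFactory;
import org.apache.activemq.pool.PooledConnectionFactory;
import org.apache.camel.component.jms.JmsComponent;
import org.apache.camel.component.jms.JmsConfiguration;

public class JmsClientConfig {
    public static JmsComponent activemq() {
        // Failover between the two brokers (hostnames are placeholders)
        ActiveMQConnectionFactory amq =
                new ActiveMQConnectionFactory("failover:(tcp://broker1:61616,tcp://broker2:61616)");

        // Pooled factory with default settings, as described above
        PooledConnectionFactory pooled = new PooledConnectionFactory();
        pooled.setConnectionFactory(amq);

        JmsConfiguration config = new JmsConfiguration(pooled);
        config.setCacheLevelName("CACHE_CONSUMER"); // consumer-level caching
        config.setConcurrentConsumers(10);          // 10 concurrent consumers per client node

        JmsComponent jms = new JmsComponent();
        jms.setConfiguration(config);
        return jms;
    }
}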
The broker transport connector is configured as follows: tcp://0.0.0.0:61616?maximumConnections=1000&wireFormat.maxFrameSize=104857600
Here's the thread on the broker that holds the lock and never releases it:
"ActiveMQ Transport: tcp:///10.128.43.206:38694#61616" #5774 daemon prio=5 os_prio=0 tid=0x00007f2a4424e800 nid=0
xaba4 waiting on condition [0x00007f29fe397000]
java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000004fd008fb0> (a java.util.concurrent.CountDownLatch$Sync)
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277)
at org.apache.activemq.broker.TransportConnection.stop(TransportConnection.java:983)
at org.apache.activemq.broker.TransportConnection.processAddConnection(TransportConnection.java:699)
- locked <0x000000050401eed0> (a java.lang.Object)
at org.apache.activemq.broker.jmx.ManagedTransportConnection.processAddConnection(ManagedTransportConnection.java:79)
at org.apache.activemq.command.ConnectionInfo.visit(ConnectionInfo.java:139)
at org.apache.activemq.broker.TransportConnection.service(TransportConnection.java:292)
at org.apache.activemq.broker.TransportConnection$1.onCommand(TransportConnection.java:149)
at org.apache.activemq.transport.MutexTransport.onCommand(MutexTransport.java:50)
at org.apache.activemq.transport.WireFormatNegotiator.onCommand(WireFormatNegotiator.java:113)
at org.apache.activemq.transport.AbstractInactivityMonitor.onCommand(AbstractInactivityMonitor.java:270)
at org.apache.activemq.transport.TransportSupport.doConsume(TransportSupport.java:83)
at org.apache.activemq.transport.tcp.TcpTransport.doRun(TcpTransport.java:214)
at org.apache.activemq.transport.tcp.TcpTransport.run(TcpTransport.java:196)
at java.lang.Thread.run(Thread.java:748)
I have more than 500 other threads on the broker that look like this:
"ActiveMQ Transport: tcp:///10.128.43.206:52074#61616" #2962 daemon prio=5 os_prio=0 tid=0x00007f2a440c9000 nid=0xa01f waiting for monitor entry [0x00007f29fc768000]
java.lang.Thread.State: BLOCKED (on object monitor)
at org.apache.activemq.broker.TransportConnection.processAddConnection(TransportConnection.java:696)
- waiting to lock <0x000000050401eed0> (a java.lang.Object)
at org.apache.activemq.broker.jmx.ManagedTransportConnection.processAddConnection(ManagedTransportConnection.java:79)
at org.apache.activemq.command.ConnectionInfo.visit(ConnectionInfo.java:139)
at org.apache.activemq.broker.TransportConnection.service(TransportConnection.java:292)
at org.apache.activemq.broker.TransportConnection$1.onCommand(TransportConnection.java:149)
at org.apache.activemq.transport.MutexTransport.onCommand(MutexTransport.java:50)
at org.apache.activemq.transport.WireFormatNegotiator.onCommand(WireFormatNegotiator.java:113)
at org.apache.activemq.transport.AbstractInactivityMonitor.onCommand(AbstractInactivityMonitor.java:270)
at org.apache.activemq.transport.TransportSupport.doConsume(TransportSupport.java:83)
at org.apache.activemq.transport.tcp.TcpTransport.doRun(TcpTransport.java:214)
at org.apache.activemq.transport.tcp.TcpTransport.run(TcpTransport.java:196)
at java.lang.Thread.run(Thread.java:748)
The first error I see on the broker is this one:
2018-05-16 16:36:59,336 [org.apache.activemq.broker.TransportConnection.Transport:856] WARN - Transport Connection to: tcp://10.128.43.206:48747 failed: java.io.EOFException
On the client node referenced by the broker (10.128.43.206), I see the following logs. It seems that the node keeps trying to reconnect but gets disconnected again right afterwards, over and over.
2018-05-16 16:36:59,322 [org.apache.activemq.transport.failover.FailoverTransport:856] WARN - Transport (tcp://10.128.43.169:61616) failed, reason: java.io.IOException, attempting to automatically reconnect
2018-05-16 16:36:59,322 [org.apache.activemq.transport.failover.FailoverTransport:856] WARN - Transport (tcp://10.128.43.169:61616) failed, reason: java.io.IOException, attempting to automatically reconnect
2018-05-16 16:36:59,375 [org.apache.activemq.transport.failover.FailoverTransport:856] INFO - Successfully reconnected to tcp://10.128.43.169:61616
2018-05-16 16:36:59,375 [org.apache.activemq.transport.failover.FailoverTransport:856] INFO - Successfully reconnected to tcp://10.128.43.169:61616
2018-05-16 16:36:59,375 [org.apache.activemq.TransactionContext:856] INFO - commit failed for transaction TX:ID:52374-1526300283331-1:1:898
javax.jms.TransactionRolledBackException: Transaction completion in doubt due to failover. Forcing rollback of TX:ID:52374-1526300283331-1:1:898
at org.apache.activemq.state.ConnectionStateTracker.restoreTransactions(ConnectionStateTracker.java:231)
at org.apache.activemq.state.ConnectionStateTracker.restore(ConnectionStateTracker.java:169)
at org.apache.activemq.transport.failover.FailoverTransport.restoreTransport(FailoverTransport.java:827)
at org.apache.activemq.transport.failover.FailoverTransport.doReconnect(FailoverTransport.java:1005)
at org.apache.activemq.transport.failover.FailoverTransport$2.iterate(FailoverTransport.java:136)
at org.apache.activemq.thread.PooledTaskRunner.runTask(PooledTaskRunner.java:129)
at org.apache.activemq.thread.PooledTaskRunner$1.run(PooledTaskRunner.java:47)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
2018-05-16 16:37:00,091 [org.apache.activemq.transport.failover.FailoverTransport:856] INFO - Successfully reconnected to tcp://10.128.43.169:61616
2018-05-16 16:37:00,091 [org.apache.activemq.transport.failover.FailoverTransport:856] INFO - Successfully reconnected to tcp://10.128.43.169:61616
2018-05-16 16:37:00,112 [org.apache.activemq.transport.failover.FailoverTransport:856] WARN - Transport (tcp://10.128.43.169:61616) failed, reason: java.io.IOException, attempting to automatically reconnect
In the end, the broker reaches the maximum number of available connections (1000) and needs to be restarted.
Could it be caused by the fact that a client node acts both as consumer and producer on the same connection pool, generating some kind of deadlock?
Do you have some suggestions?
Thanks
Giulio
Most probably I was affected by this issue:
https://issues.apache.org/jira/browse/AMQ-5090
Updating to ActiveMQ 5.10.0 and Camel 2.13.1 solved the issue (system is much more stable, even during performance tests).
Thanks
Giulio
Library versions in use: Spring 2.2.3.RELEASE, Spring Kafka 2.x, Kafka client 2.0.0
Java version: OpenJDK 1.8
Platform Kafka version: Apache Kafka 2.2.0 (https://docs.confluent.io/5.2.1/release-notes.html#apache-kafka-2-2-0-cp2)
Configuration: the topic has 6 partitions.
Data rate: the incoming data rate is 3-4 messages/second.
Scenario:
With the above configuration, a test was run continuously for 3 days. After about 2.5 days, our consumer group stopped receiving messages from the topic.
Upon detailed investigation, I found that the consumer threads were blocked; precisely 18 threads were in the BLOCKED state. The thread graph also suggests that the consumer threads were waiting on java.nio.
Please find the logs below (screenshots omitted).
"kafka-coordinator-heartbeat-thread | release-registry-group" #128 daemon prio=5 os_prio=0 tid=0x00007f1954001800 nid=0x8a waiting for monitor entry [0x00007f197f7f6000]
java.lang.Thread.State: BLOCKED (on object monitor)
at org.apache.kafka.clients.consumer.internals.AbstractCoordinator$HeartbeatThread.run(AbstractCoordinator.java:1005)
- waiting to lock <0x00000000c3382070> (a org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
"org.springframework.kafka.KafkaListenerEndpointContainer#1-2-C-1" #127 prio=5 os_prio=0 tid=0x00007f1b6dbc1800 nid=0x89 runnable [0x00007f197f8f6000]
java.lang.Thread.State: RUNNABLE
at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93)
at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
- locked <0x00000000c33822d8> (a sun.nio.ch.Util$3)
- locked <0x00000000c33822c8> (a java.util.Collections$UnmodifiableSet)
- locked <0x00000000c33822e8> (a sun.nio.ch.EPollSelectorImpl)
at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
at org.apache.kafka.common.network.Selector.select(Selector.java:689)
at org.apache.kafka.common.network.Selector.poll(Selector.java:409)
at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:510)
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:271)
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:242)
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:218)
at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureCoordinatorReady(AbstractCoordinator.java:230)
- locked <0x00000000c3382070> (a org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:314)
at org.apache.kafka.clients.consumer.KafkaConsumer.updateAssignmentMetadataIfNeeded(KafkaConsumer.java:1218)
at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1175)
at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1154)
at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.pollAndInvoke(KafkaMessageListenerContainer.java:732)
at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.run(KafkaMessageListenerContainer.java:689)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.lang.Thread.run(Thread.java:748)
I would like to understand the root cause and possible solutions to fix it. Thanks in advance.
My application is under heavy load and I am getting the logs below from:
sudo -u tomcat jstack <java_process_id>
The thread below consumes messages from Kafka, and it got stuck. Since this thread is in the WAITING state, no more Kafka messages are being consumed.
"StreamThread-3" #91 daemon prio=5 os_prio=0 tid=0x00007f9b5c606000 nid=0x1e4d waiting on condition [0x00007f9b506c5000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x000000073aad9718> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
at java.util.concurrent.ArrayBlockingQueue.put(ArrayBlockingQueue.java:353)
at ch.qos.logback.core.AsyncAppenderBase.put(AsyncAppenderBase.java:160)
at ch.qos.logback.core.AsyncAppenderBase.append(AsyncAppenderBase.java:148)
at ch.qos.logback.core.UnsynchronizedAppenderBase.doAppend(UnsynchronizedAppenderBase.java:84)
at ch.qos.logback.core.spi.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:51)
at ch.qos.logback.classic.Logger.appendLoopOnAppenders(Logger.java:270)
at ch.qos.logback.classic.Logger.callAppenders(Logger.java:257)
at ch.qos.logback.classic.Logger.buildLoggingEventAndAppend(Logger.java:421)
at ch.qos.logback.classic.Logger.filterAndLog_0_Or3Plus(Logger.java:383)
at ch.qos.logback.classic.Logger.error(Logger.java:538)
at com.abc.system.solr.repo.AbstractSolrRepository.doSave(AbstractSolrRepository.java:316)
at com.abc.system.solr.repo.AbstractSolrRepository.save(AbstractSolrRepository.java:295)
I also found this post
WAITING at sun.misc.Unsafe.park(Native Method)
but it didn't help me in my case.
What else could I investigate to get more details in such a case?
I also ran into the same problem, but luckily I got it resolved by tuning the size of the thread pool and the number of producers and consumers.
Try to check whether you can configure the following (a rough sketch follows the list):
Size of your thread pool
Number of consumers/producers (if that is configurable in Kafka)
Make sure the thread pool has enough threads to serve both consumers and producers.
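For example, since the stuck thread above is a Kafka Streams StreamThread, these knobs would typically live in the streams configuration; this is only a rough sketch, where the application id, bootstrap servers, and thread count are placeholder assumptions rather than values from the original post:

import java.util.Properties;

import org.apache.kafka.streams.StreamsConfig;

public class StreamsThreadingConfig {
    public static Properties streamsProperties() {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "my-streams-app");    // placeholder
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
        // Number of StreamThreads (the "StreamThread-N" threads in the dump);
        // size this so consumers and producers are both served under load.
        props.put(StreamsConfig.NUM_STREAM_THREADS_CONFIG, 4);
        return props;
    }
}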
I am analyzing an application hang, and in the thread dumps about 90% of the worker threads are in this state:
"pool-3-thread-352" #13082 prio=5 os_prio=0 tid=0x00007ff6407fc800
nid=0x1e94 waiting on condition [0x00007ff5a53b4000]
java.lang.Thread.State: TIMED_WAITING (parking) at
sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x000000044af6bcd0> (a java.util.concurrent.SynchronousQueue$TransferStack) at
java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
at
java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460)
at
java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:362)
at
java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:941)
at
java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1073)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
"pool-21-thread-214" #13081 prio=5 os_prio=0 tid=0x0000000002e6a800
nid=0x1e92 waiting on condition [0x00007ff5a54b5000]
java.lang.Thread.State: TIMED_WAITING (parking) at
sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000004ad95fba8> (a java.util.concurrent.SynchronousQueue$TransferStack) at
java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
at
java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460)
at
java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:362)
at
java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:941)
at
java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1073)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
As per my understanding, these are basically request worker threads on a Tomcat server, waiting on a blocking queue until a request comes. When a request arrives, one thread gets a permit and runs to execute it.
So if no tasks are available, these threads wait (park) on the queue. When a task becomes available, one worker thread gets a permit, becomes a running thread, and executes the task.
But these threads can still cause issues if too many of them are created in the thread pool, since they eat up resources.
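For reference, a small self-contained sketch like the following (not code from my application) reproduces exactly this parked state: a pool backed by a SynchronousQueue, like the one Executors.newCachedThreadPool() creates, leaves idle workers in TIMED_WAITING inside SynchronousQueue.poll():

import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class IdleWorkerDemo {
    public static void main(String[] args) throws InterruptedException {
        // Same setup Executors.newCachedThreadPool() uses: a SynchronousQueue with
        // a 60s keep-alive, so idle workers park in SynchronousQueue.poll(...).
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                0, Integer.MAX_VALUE, 60L, TimeUnit.SECONDS, new SynchronousQueue<>());

        pool.execute(() -> System.out.println("handled by " + Thread.currentThread().getName()));

        // Give the worker time to finish and go idle, then take a thread dump
        // (jstack <pid>): the worker shows TIMED_WAITING (parking) on a
        // SynchronousQueue$TransferStack, matching the stacks above.
        Thread.sleep(5_000);
        pool.shutdown();
    }
}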
Zero deadlocks were found, but the app still hangs, with exceptions almost everywhere of this type:
javax.ws.rs.ProcessingException: RESTEASY004655: Unable to invoke request
at org.jboss.resteasy.client.jaxrs.engines.ApacheHttpClient4Engine.invoke(ApacheHttpClient4Engine.java:287)
at com.agfa.orbis.core.client.service.rest.ClientHttpEngineWrapper.invoke(ClientHttpEngineWrapper.java:59)
at org.jboss.resteasy.client.jaxrs.internal.ClientInvocation.invoke(ClientInvocation.java:436)
at org.jboss.resteasy.client.jaxrs.internal.ClientInvocationBuilder.get(ClientInvocationBuilder.java:159)
at com.agfa.hap.crs.commons.client.rest.RestClient.getResponse(RestClient.java:238)
at com.agfa.hap.crs.commons.client.rest.RestClient.get(RestClient.java:70)
at com.agfa.hap.crs.alertsystem.client.orbis.ForwardedUserAlertsMonitor.getSharedAlertState(ForwardedUserAlertsMonitor.java:88)
at com.agfa.hap.crs.alertsystem.client.orbis.ForwardedUserAlertsMonitor.getCurrentAlertState(ForwardedUserAlertsMonitor.java:79)
at com.agfa.hap.crs.alertsystem.client.orbis.AbstractAlertMonitor.requestMonitorUpdate(AbstractAlertMonitor.java:275)
at com.agfa.hap.crs.alertsystem.client.orbis.AbstractAlertMonitor$10.execute(AbstractAlertMonitor.java:823)
at com.agfa.hap.crs.alertsystem.client.orbis.AbstractAlertMonitor$Task.call(AbstractAlertMonitor.java:952)
at com.agfa.hap.crs.alertsystem.client.orbis.AbstractAlertMonitor$Task.call(AbstractAlertMonitor.java:942)
at com.agfa.hap.crs.alertsystem.client.orbis.AbstractAlertMonitor$TaskWrapper.call(AbstractAlertMonitor.java:925)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: javax.net.ssl.SSLHandshakeException: Remote host closed connection during handshake
at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:992)
at sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1375)
at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1403)
at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1387)
at org.apache.http.conn.ssl.SSLSocketFactory.connectSocket(SSLSocketFactory.java:535)
at org.apache.http.conn.ssl.SSLSocketFactory.connectSocket(SSLSocketFactory.java:403)
at org.apache.http.impl.conn.DefaultClientConnectionOperator.openConnection(DefaultClientConnectionOperator.java:177)
at org.apache.http.impl.conn.ManagedClientConnectionImpl.open(ManagedClientConnectionImpl.java:304)
at org.apache.http.impl.client.DefaultRequestDirector.tryConnect(DefaultRequestDirector.java:611)
at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:446)
at org.apache.http.impl.client.AbstractHttpClient.doExecute(AbstractHttpClient.java:863)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:57)
at org.jboss.resteasy.client.jaxrs.engines.ApacheHttpClient4Engine.invoke(ApacheHttpClient4Engine.java:283)
... 16 more
Caused by: java.io.EOFException: SSL peer shut down incorrectly
at sun.security.ssl.InputRecord.read(InputRecord.java:505)
at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:973)
... 29 more
I am trying to link these exceptions to the threads' activity.
Any idea why the connection is closed incorrectly?
These threads are waiting for something to happen. As you wrote:
these are basically request worker threads on a tomcat sever, waiting on a blocking queue until a request comes
As far as I understand, this happens under low load, so an oversized thread pool will not be a problem. If you're really worried about it, you can configure a maxIdleTime for the thread pool; Tomcat will then kill old idle threads until the pool reaches minSpareThreads.
This is the thread pool documentation for Tomcat 8.
This is the thread pool documentation for Tomcat 7.
This is the thread pool documentation for Tomcat 6.
Every now and then, an old Java application I have to maintain stops responding. I managed to get a couple of thread stack traces, and most threads are blocked like this, trying to obtain a connection:
"tomcat-http-8180-168" - Thread t#10137
java.lang.Thread.State: BLOCKED
at oracle.jdbc.pool.OracleImplicitConnectionCache.retrieveCacheConnection(OracleImplicitConnectionCache.java:566)
- waiting to lock <566080> (a oracle.jdbc.pool.OracleImplicitConnectionCache) owned by "Thread-6" t#29
The thread holding the lock shows this:
"Thread-6" - Thread t#29
java.lang.Thread.State: BLOCKED
at oracle.jdbc.driver.PhysicalConnection.closeLogicalConnection(PhysicalConnection.java:3849)
- waiting to lock <146da30> (a oracle.jdbc.driver.T4CConnection) owned by "tomcat-http-8180-369" t#17665
at oracle.jdbc.driver.LogicalConnection.cleanupAndClose(LogicalConnection.java:304)
at oracle.jdbc.pool.OracleImplicitConnectionCache.closeCheckedOutConnection(OracleImplicitConnectionCache.java:1392)
at oracle.jdbc.pool.OracleImplicitConnectionCacheThread.runAbandonedTimeout(OracleImplicitConnectionCacheThread.java:250)
- locked <566080> (a oracle.jdbc.pool.OracleImplicitConnectionCache)
at oracle.jdbc.pool.OracleImplicitConnectionCacheThread.run(OracleImplicitConnectionCacheThread.java:81)
This thread seems to be responsible for closing abandoned connections, and it is itself blocked, waiting for this other thread to finish:
"tomcat-http-8180-369" - Thread t#17665
java.lang.Thread.State: RUNNABLE
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:129)
at oracle.net.ns.Packet.receive(Packet.java:282)
at oracle.net.ns.DataPacket.receive(DataPacket.java:103)
at oracle.net.ns.NetInputStream.getNextPacket(NetInputStream.java:230)
at oracle.net.ns.NetInputStream.read(NetInputStream.java:175)
at oracle.net.ns.NetInputStream.read(NetInputStream.java:100)
at oracle.net.ns.NetInputStream.read(NetInputStream.java:85)
at oracle.jdbc.driver.T4CSocketInputStreamWrapper.readNextPacket(T4CSocketInputStreamWrapper.java:122)
at oracle.jdbc.driver.T4CSocketInputStreamWrapper.read(T4CSocketInputStreamWrapper.java:78)
at oracle.jdbc.driver.T4CSocketInputStreamWrapper.readB1(T4CSocketInputStreamWrapper.java:149)
at oracle.jdbc.driver.T4CMAREngine.buffer2Value(T4CMAREngine.java:2393)
at oracle.jdbc.driver.T4CMAREngine.unmarshalSB8(T4CMAREngine.java:1401)
at oracle.jdbc.driver.T4C8TTILob.readRPA(T4C8TTILob.java:837)
at oracle.jdbc.driver.T4CTTIfun.receive(T4CTTIfun.java:292)
at oracle.jdbc.driver.T4CTTIfun.doRPC(T4CTTIfun.java:186)
at oracle.jdbc.driver.T4C8TTIClob.read(T4C8TTIClob.java:240)
at oracle.jdbc.driver.T4CConnection.getChars(T4CConnection.java:3015)
- locked <146da30> (a oracle.jdbc.driver.T4CConnection)
at oracle.sql.CLOB.getChars(CLOB.java:402)
at oracle.jdbc.driver.OracleClobReader.needChars(OracleClobReader.java:187)
at oracle.jdbc.driver.OracleClobReader.read(OracleClobReader.java:142)
at java.io.Reader.read(Reader.java:123)
at oracle.jdbc.driver.ClobAccessor.getString(ClobAccessor.java:291)
at oracle.jdbc.driver.T4CClobAccessor.getString(T4CClobAccessor.java:481)
at oracle.jdbc.driver.OracleResultSetImpl.getString(OracleResultSetImpl.java:1251)
- locked <146da30> (a oracle.jdbc.driver.T4CConnection)
at oracle.jdbc.driver.OracleResultSet.getString(OracleResultSet.java:494)
This last thread is running a heavy query, which is also why it's considered abandoned; Tomcat is trying to close it but seemingly can't, as the connection is in use and holds a lock.
I don't understand why:
Oracle can't close the connection.
The other threads can't get a connection from the pool until the abandoned connection is closed.
Because, looking at the thread stacks above, that is what's happening. When the extra-long query finished (I know I have to look into that query), the app started responding again as the threads unblocked.
This is Tomcat's (v6) pool config (sensitive details omitted):
<Resource
name="jdbc/MainDBPool"
AutoCommit="true"
defaultReadOnly="false"
driverClassName="oracle.jdbc.OracleDriver"
factory="oracle.jdbc.pool.OracleDataSourceFactory"
fairQueue="false"
initialSize="10"
jdbcInterceptors="ConnectionState;StatementFinalizer"
jmxEnabled="true"
logAbandoned="false"
maxActive="100"
maxIdle="75"
maxWait="30000"
minEvictableIdleTimeMillis="5000"
minIdle="10"
removeAbandoned="true"
removeAbandonedTimeout="60"
testOnBorrow="false"
testOnReturn="false"
testWhileIdle="false"
timeBetweenEvictionRunsMillis="5000"
type="oracle.jdbc.pool.OracleDataSource"
connectionCachingEnabled="true"
connectionCacheName="tomcatConnectionCache1"
fastConnectionFailoverEnabled="true"
implicitCachingEnabled="true"
connectionCacheProperties="(ValidateConnection=true, PropertyCheckInterval=60, AbandonedConnectionTimeout=300, InitialLimit=10, MinLimit=30, MaxLimit=200, ConnectionWaitTimeout=30, InactivityTimeout=300)"
useEquals="false"
validationInterval="30000"
/>
I also looked into other possible causes, like a long full GC, but GC logging was enabled and there isn't a long pause that would explain this.
Thanks in advance.
Oracle Implicit Connection Cache is desupported. You must use Universal Connection Pool (UCP), which is its replacement. You can download ucp.jar and check out the "UCP with Tomcat" whitepaper for more details.
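As a rough illustration of the programmatic side of that switch (the whitepaper covers the Tomcat Resource configuration), here is a hedged UCP sketch; the URL, credentials, and pool sizes below are placeholders that only loosely mirror the old connectionCacheProperties and would need tuning for your app:

import java.sql.Connection;
import java.sql.SQLException;

import oracle.ucp.jdbc.PoolDataSource;
import oracle.ucp.jdbc.PoolDataSourceFactory;

public class UcpSetup {
    public static PoolDataSource mainDbPool() throws SQLException {
        PoolDataSource pds = PoolDataSourceFactory.getPoolDataSource();
        pds.setConnectionFactoryClassName("oracle.jdbc.pool.OracleDataSource");
        pds.setURL("jdbc:oracle:thin:@//dbhost:1521/SERVICE");  // placeholder
        pds.setUser("app_user");                                // placeholder
        pds.setPassword("app_password");                        // placeholder

        // Loosely mirrors InitialLimit/MinLimit/MaxLimit/ConnectionWaitTimeout/
        // AbandonedConnectionTimeout from the old connectionCacheProperties
        pds.setInitialPoolSize(10);
        pds.setMinPoolSize(30);
        pds.setMaxPoolSize(200);
        pds.setConnectionWaitTimeout(30);        // seconds
        pds.setAbandonedConnectionTimeout(300);  // seconds
        pds.setValidateConnectionOnBorrow(true);
        return pds;
    }

    public static void main(String[] args) throws SQLException {
        try (Connection c = mainDbPool().getConnection()) {
            System.out.println("connected: " + !c.isClosed());
        }
    }
}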
I am now facing a new problem with the GlassFish server :(
The server starts successfully, but after some time it hangs and stops processing requests.
The following are the exceptions in the server log:
[#|2011-05-08T17:44:34.027+0300|WARNING|sun-appserver2.1|javax.enterprise.system.stream.err|_ThreadID=17;_ThreadName=Timer-18;_RequestID=a47329c8-5efa-4e29-b90d-a5bda9c8f725;|
java.util.logging.ErrorManager: 5: Error in formatting Logrecord|#]
[#|2011-05-08T17:44:34.036+0300|WARNING|sun-appserver2.1|javax.enterprise.system.stream.err|_ThreadID=17;_ThreadName=Timer-18;_RequestID=a47329c8-5efa-4e29-b90d-a5bda9c8f725;|
java.security.ProviderException: implNextBytes() failed
at sun.security.pkcs11.P11SecureRandom.implNextBytes(P11SecureRandom.java:170)
at sun.security.pkcs11.P11SecureRandom.engineNextBytes(P11SecureRandom.java:117)
at java.security.SecureRandom.nextBytes(SecureRandom.java:413)
at java.util.UUID.randomUUID(UUID.java:161)
at com.sun.enterprise.admin.monitor.callflow.AgentImpl$ThreadLocalState.<init>(AgentImpl.java:162)
at com.sun.enterprise.admin.monitor.callflow.AgentImpl$1.initialValue(AgentImpl.java:256)
at com.sun.enterprise.admin.monitor.callflow.AgentImpl$1.initialValue(AgentImpl.java:255)
at java.lang.ThreadLocal$ThreadLocalMap.getAfterMiss(ThreadLocal.java:374)
at java.lang.ThreadLocal$ThreadLocalMap.get(ThreadLocal.java:347)
at java.lang.ThreadLocal$ThreadLocalMap.access$000(ThreadLocal.java:225)
at java.lang.ThreadLocal.get(ThreadLocal.java:127)
at com.sun.enterprise.admin.monitor.callflow.AgentImpl.getThreadLocalData(AgentImpl.java:1070)
at com.sun.enterprise.server.logging.UniformLogFormatter.uniformLogFormat(UniformLogFormatter.java:320)
at com.sun.enterprise.server.logging.UniformLogFormatter.format(UniformLogFormatter.java:151)
at java.util.logging.StreamHandler.publish(StreamHandler.java:179)
at com.sun.enterprise.server.logging.FileandSyslogHandler.publish(FileandSyslogHandler.java:512)
at java.util.logging.Logger.log(Logger.java:440)
at java.util.logging.Logger.doLog(Logger.java:462)
at java.util.logging.Logger.log(Logger.java:526)
at com.sun.enterprise.deployment.autodeploy.AutoDeployControllerImpl$AutoDeployTask.run(AutoDeployControllerImpl.java:391)
at java.util.TimerThread.mainLoop(Timer.java:512)
at java.util.TimerThread.run(Timer.java:462)
Caused by: sun.security.pkcs11.wrapper.PKCS11Exception: CKR_DEVICE_ERROR
at sun.security.pkcs11.wrapper.PKCS11.C_GenerateRandom(Native Method)
at sun.security.pkcs11.P11SecureRandom.implNextBytes(P11SecureRandom.java:167)
... 21 more
|#]
[#|2011-05-08T17:44:36.034+0300|SEVERE|sun-appserver2.1|javax.enterprise.system.tools.deployment|_ThreadID=17;_ThreadName=Timer-18;_RequestID=a47329c8-5efa-4e29-b90d-a5bda9c8f725;|"DPL8011: autodeployment failure while deploying the application : null"|#]
[#|2011-05-08T17:44:38.034+0300|SEVERE|sun-appserver2.1|javax.enterprise.system.tools.deployment|_ThreadID=17;_ThreadName=Timer-18;_RequestID=a47329c8-5efa-4e29-b90d-a5bda9c8f725;|"DPL8011: autodeployment failure while deploying the application : null"|#]
********
When the server was in this hung state, I took a thread dump and found one deadlock, shown below:
Found one Java-level deadlock:
=============================
"SelectorThread-4242":
waiting to lock monitor 0x0040ef60 (object 0x56cd6dd0, a org.apache.jasper.util.SystemLogHandler),
which is held by "Timer-2"
"Timer-2":
waiting to lock monitor 0x0040f5d8 (object 0x568bc440, a com.sun.enterprise.server.logging.SystemOutandErrHandler$LoggingPrintStream),
which is held by "SelectorThread-4242"
Java stack information for the threads listed above:
===================================================
"SelectorThread-4242":
at java.lang.Throwable.printStackTrace(Throwable.java:461)
- waiting to lock <0x56cd6dd0> (a org.apache.jasper.util.SystemLogHandler)
at java.lang.Throwable.printStackTrace(Throwable.java:452)
at java.util.logging.ErrorManager.error(ErrorManager.java:78)
- locked <0x5a4fe388> (a java.util.logging.ErrorManager)
at com.sun.enterprise.server.logging.UniformLogFormatter.uniformLogFormat(UniformLogFormatter.java:364)
at com.sun.enterprise.server.logging.UniformLogFormatter.format(UniformLogFormatter.java:151)
at java.util.logging.StreamHandler.publish(StreamHandler.java:179)
- locked <0x568825e0> (a com.sun.enterprise.server.logging.FileandSyslogHandler)
at com.sun.enterprise.server.logging.FileandSyslogHandler.publish(FileandSyslogHandler.java:512)
- locked <0x568825e0> (a com.sun.enterprise.server.logging.FileandSyslogHandler)
at java.util.logging.Logger.log(Logger.java:440)
at java.util.logging.Logger.doLog(Logger.java:462)
at java.util.logging.Logger.log(Logger.java:485)
at com.sun.enterprise.server.logging.SystemOutandErrHandler$LoggingByteArrayOutputStream.flush(SystemOutandErrHandler.java:368)
- locked <0x568b84f0> (a com.sun.enterprise.server.logging.SystemOutandErrHandler$LoggingByteArrayOutputStream)
at java.io.PrintStream.write(PrintStream.java:414)
- locked <0x568bc440> (a com.sun.enterprise.server.logging.SystemOutandErrHandler$LoggingPrintStream)
at com.sun.enterprise.server.logging.SystemOutandErrHandler$LoggingPrintStream.write(SystemOutandErrHandler.java:293)
at sun.nio.cs.StreamEncoder$CharsetSE.writeBytes(StreamEncoder.java:336)
at sun.nio.cs.StreamEncoder$CharsetSE.implFlushBuffer(StreamEncoder.java:404)
at sun.nio.cs.StreamEncoder.flushBuffer(StreamEncoder.java:115)
- locked <0x568bc470> (a java.io.OutputStreamWriter)
at java.io.OutputStreamWriter.flushBuffer(OutputStreamWriter.java:169)
at java.io.PrintStream.write(PrintStream.java:459)
- locked <0x568bc440> (a com.sun.enterprise.server.logging.SystemOutandErrHandler$LoggingPrintStream)
at java.io.PrintStream.print(PrintStream.java:602)
at com.sun.enterprise.server.logging.SystemOutandErrHandler$LoggingPrintStream.print(SystemOutandErrHandler.java:205)
at java.io.PrintStream.println(PrintStream.java:739)
- locked <0x568bc440> (a com.sun.enterprise.server.logging.SystemOutandErrHandler$LoggingPrintStream)
at com.sun.enterprise.server.logging.SystemOutandErrHandler$LoggingPrintStream.println(SystemOutandErrHandler.java:187)
at org.apache.jasper.util.SystemLogHandler.println(SystemLogHandler.java:254)
at org.apache.jasper.util.SystemLogHandler.println(SystemLogHandler.java:254)
at org.apache.jasper.util.SystemLogHandler.println(SystemLogHandler.java:254)
at org.apache.jasper.util.SystemLogHandler.println(SystemLogHandler.java:254)
at java.util.logging.ErrorManager.error(ErrorManager.java:76)
- locked <0x5a4fe3a8> (a java.util.logging.ErrorManager)
at com.sun.enterprise.server.logging.UniformLogFormatter.uniformLogFormat(UniformLogFormatter.java:364)
at com.sun.enterprise.server.logging.UniformLogFormatter.format(UniformLogFormatter.java:151)
at com.sun.enterprise.server.logging.AMXLoggingHook.publish(AMXLoggingHook.java:198)
at com.sun.enterprise.server.logging.FileandSyslogHandler.publish(FileandSyslogHandler.java:531)
- locked <0x568825e0> (a com.sun.enterprise.server.logging.FileandSyslogHandler)
at java.util.logging.Logger.log(Logger.java:440)
at java.util.logging.Logger.doLog(Logger.java:462)
at java.util.logging.Logger.log(Logger.java:551)
at com.sun.enterprise.web.connector.grizzly.SelectorThread.doSelect(SelectorThread.java:1445)
at com.sun.enterprise.web.connector.grizzly.SelectorThread.startListener(SelectorThread.java:1316)
- locked <0x56cdf878> (a [Ljava.lang.Object;)
at com.sun.enterprise.web.connector.grizzly.SelectorThread.startEndpoint(SelectorThread.java:1279)
at com.sun.enterprise.web.connector.grizzly.SelectorThread.run(SelectorThread.java:1255)
"Timer-2":
at java.io.PrintStream.println(PrintStream.java:738)
- waiting to lock <0x568bc440> (a com.sun.enterprise.server.logging.SystemOutandErrHandler$LoggingPrintStream)
at com.sun.enterprise.server.logging.SystemOutandErrHandler$LoggingPrintStream.println(SystemOutandErrHandler.java:178)
at org.apache.jasper.util.SystemLogHandler.println(SystemLogHandler.java:258)
at org.apache.jasper.util.SystemLogHandler.println(SystemLogHandler.java:258)
at org.apache.jasper.util.SystemLogHandler.println(SystemLogHandler.java:258)
at org.apache.jasper.util.SystemLogHandler.println(SystemLogHandler.java:258)
at java.lang.Throwable.printStackTrace(Throwable.java:462)
- locked <0x56cd6dd0> (a org.apache.jasper.util.SystemLogHandler)
at java.lang.Throwable.printStackTrace(Throwable.java:452)
at com.sun.jbi.management.system.AdminService.heartBeat(AdminService.java:975)
at com.sun.jbi.management.system.AdminService.handleNotification(AdminService.java:198)
at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor$ListenerWrapper.handleNotification(DefaultMBeanServerInterceptor.java:1652)
at javax.management.NotificationBroadcasterSupport.handleNotification(NotificationBroadcasterSupport.java:221)
at javax.management.NotificationBroadcasterSupport.sendNotification(NotificationBroadcasterSupport.java:184)
at javax.management.timer.Timer.sendNotification(Timer.java:1295)
- locked <0x57d2e590> (a javax.management.timer.TimerNotification)
at javax.management.timer.Timer.notifyAlarmClock(Timer.java:1264)
at javax.management.timer.TimerAlarmClock.run(Timer.java:1347)
at java.util.TimerThread.mainLoop(Timer.java:512)
at java.util.TimerThread.run(Timer.java:462)
Found 1 deadlock.
Please suggest a solution.
Thanks
Ali
I got the same issue.
I resolved it: my .properties file contained a single quote ('), so when the Jersey logger tried to log it, it could not parse the JSON sent back to the server.
So look at what kind of string Jersey is trying to log and be careful with special characters.