Using ehCache 2.4.4, I seem to have gotten into a deadlock on the ehCache Segment object. From other logging, I know that the 'waiting thread', 1694, last ran anything 9 hours before this stack trace was generated. In the meantime, 1696 has gone on and done a lot of other work, so this lock is definitely being held in error.
I'm pretty confident that I am not locking any Segment instances directly, so I assume this is some kind of issue internal to the library. Any ideas?
"Model Executor - 1696" Id=1696 in TIMED_WAITING on lock=java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject#92eb1ed
at sun.misc.Unsafe.park(Native Method)
at java.util.concurrent.locks.LockSupport.parkNanos(Unknown Source)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(Unknown Source)
at java.util.concurrent.PriorityBlockingQueue.poll(Unknown Source)
at com.rtrms.application.modeling.local.BlockingTaskList.takeTask(BlockingTaskList.java:20)
at com.rtrms.application.modeling.local.ModelExecutor.executeNextTask(ModelExecutor.java:71)
at com.rtrms.application.modeling.local.ModelExecutor.run(ModelExecutor.java:46)
Locked synchronizers: count = 1
- java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync#4a3d767f
"Model Executor - 1694" Id=1694 in WAITING on lock=java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync#4a3d767f
owned by Model Executor - 1696 Id=1696
at sun.misc.Unsafe.park(Native Method)
at java.util.concurrent.locks.LockSupport.park(Unknown Source)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(Unknown Source)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared(Unknown Source)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(Unknown Source)
at java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.lock(Unknown Source)
at net.sf.ehcache.store.compound.Segment.unretrievedGet(Segment.java:248)
at net.sf.ehcache.store.compound.CompoundStore.unretrievedGet(CompoundStore.java:191)
at net.sf.ehcache.store.compound.impl.DiskPersistentStore.containsKeyInMemory(DiskPersistentStore.java:72)
at net.sf.ehcache.Cache.searchInStoreWithStats(Cache.java:1884)
at net.sf.ehcache.Cache.get(Cache.java:1549)
at com.rtrms.amoeba.cache.DistributedModeledSecurities.get(DistributedModeledSecurities.java:57)
at com.rtrms.amoeba.modeling.AssertPersistedModeledSecurities.get(AssertPersistedModeledSecurities.java:44)
at com.rtrms.application.modeling.tasks.ExpandableModelingTask.getNextUnexecutedTask(ExpandableModelingTask.java:35)
at com.rtrms.application.modeling.local.BlockingTaskList.takeTask(BlockingTaskList.java:36)
at com.rtrms.application.modeling.local.ModelExecutor.executeNextTask(ModelExecutor.java:71)
at com.rtrms.application.modeling.local.ModelExecutor.run(ModelExecutor.java:46)
Locked synchronizers: count = 0
Turns out that calls like Cache.acquireWriteLockOnKey end up obtaining a lock on the internal Segment, so this apparent deadlock was caused by a .unlock call that wasn't in a finally block.
Editorial comment: It also implies that you can get contention trying to lock two different keys that just happen to be in the same Segment, which is pretty unfortunate.
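Both points can be illustrated without Ehcache itself. The following is a minimal sketch using only the JDK: `SegmentLockSketch` and its methods are hypothetical names, and the hash-to-segment mapping is a simplified stand-in, not Ehcache's actual implementation. It shows the lock being released in a finally block, and two different keys sharing one segment lock.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Simplified stand-in for a segmented store; not Ehcache's real code. */
final class SegmentLockSketch {
    private final ReentrantReadWriteLock[] segments;

    SegmentLockSketch(int segmentCount) {
        segments = new ReentrantReadWriteLock[segmentCount];
        for (int i = 0; i < segmentCount; i++) {
            segments[i] = new ReentrantReadWriteLock();
        }
    }

    /** Maps a key to the index of the segment whose lock guards it. */
    int segmentIndex(Object key) {
        return Math.floorMod(key.hashCode(), segments.length);
    }

    /** Runs the action under the key's write lock and always releases the lock. */
    void withWriteLock(Object key, Runnable action) {
        ReentrantReadWriteLock.WriteLock lock = segments[segmentIndex(key)].writeLock();
        lock.lock();
        try {
            action.run();
        } finally {
            lock.unlock(); // runs even if the action throws
        }
    }

    /** True while any thread holds the write lock of the key's segment. */
    boolean isWriteLocked(Object key) {
        return segments[segmentIndex(key)].isWriteLocked();
    }

    public static void main(String[] args) {
        SegmentLockSketch store = new SegmentLockSketch(2);
        // Integer keys 1 and 3 land in the same one of the two segments.
        System.out.println(store.segmentIndex(1) == store.segmentIndex(3)); // true
        try {
            store.withWriteLock(1, () -> { throw new IllegalStateException("boom"); });
        } catch (IllegalStateException expected) {
            // The finally block has already released the lock.
        }
        System.out.println(store.isWriteLocked(1)); // false
    }
}
```

With Ehcache the same shape applies: call acquireWriteLockOnKey, do the work inside try, and put the matching release call in finally.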
One of the threads holds a lock for more than 3 seconds when querying an Oracle database. This causes many blocked threads when accessing the Oracle database, and hence sudden increases in the number of threads and unresponsiveness of the application. I am using Tomcat 8.5, the Tomcat connection pool, and Java 8. Trace for the blocking thread:
***"http-nio-80-exec-433" #4207 daemon prio=5 os_prio=0 tid=0x00007fd9d8042000 nid=0x503b runnable [0x00007fd839f04000]
java.lang.Thread.State: RUNNABLE
at java.util.Hashtable.get(Hashtable.java:363)
- locked <0x000000070193caa0> (a java.util.Hashtable)
at java.lang.ConditionalSpecialCasing.lookUpTable(ConditionalSpecialCasing.java:151)
at java.lang.ConditionalSpecialCasing.toUpperCaseEx(ConditionalSpecialCasing.java:123)
at java.lang.String.toUpperCase(String.java:2775)
at java.lang.String.toUpperCase(String.java:2833)
at oracle.jdbc.driver.OracleStatement.doExecuteWithTimeout(OracleStatement.java:1638)
at oracle.jdbc.driver.OraclePreparedStatement.executeInternal(OraclePreparedStatement.java:4401)
at oracle.jdbc.driver.OraclePreparedStatement.executeQuery(OraclePreparedStatement.java:4482)
- locked <0x000000074cd7d868> (a oracle.jdbc.driver.T4CConnection)
at oracle.jdbc.driver.OraclePreparedStatementWrapper.executeQuery(OraclePreparedStatementWrapper.java:6272)
at sun.reflect.GeneratedMethodAccessor400.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.tomcat.jdbc.pool.interceptor.AbstractQueryReport$StatementProxy.invoke(AbstractQueryReport.java:210)
at com.sun.proxy.$Proxy637.executeQuery(Unknown Source)
at sun.reflect.GeneratedMethodAccessor400.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.tomcat.jdbc.pool.StatementFacade$StatementProxy.invoke(StatementFacade.java:114)
at com.sun.proxy.$Proxy637.executeQuery(Unknown Source)
at org.hibernate.jdbc.AbstractBatcher.getResultSet(AbstractBatcher.java:208)
at org.hibernate.loader.Loader.getResultSet(Loader.java:1953)
at org.hibernate.loader.Loader.doQuery(Loader.java:802)
at org.hibernate.loader.Loader.doQueryAndInitializeNonLazyCollections(Loader.java:274)
at org.hibernate.loader.Loader.doList(Loader.java:2533)
at org.hibernate.loader.Loader.listIgnoreQueryCache(Loader.java:2276)
at org.hibernate.loader.Loader.list(Loader.java:2271)***
Here is a trace for one of the 10+ BLOCKED threads
***"http-nio-80-exec-271" #2777 daemon prio=5 os_prio=0 tid=0x00007fd9c8941800 nid=0x19c3 waiting for monitor entry [0x00007fd8356ca000]
java.lang.Thread.State: BLOCKED (on object monitor)
at java.util.Hashtable.get(Hashtable.java:363)
- waiting to lock <0x000000070193caa0> (a java.util.Hashtable)
at java.lang.ConditionalSpecialCasing.lookUpTable(ConditionalSpecialCasing.java:151)
at java.lang.ConditionalSpecialCasing.toUpperCaseEx(ConditionalSpecialCasing.java:123)
at java.lang.String.toUpperCase(String.java:2775)
at java.lang.String.toUpperCase(String.java:2833)
at oracle.jdbc.driver.CharCommonAccessor.init(CharCommonAccessor.java:164)
at oracle.jdbc.driver.VarcharAccessor.<init>(VarcharAccessor.java:88)
at oracle.jdbc.driver.T4CVarcharAccessor.<init>(T4CVarcharAccessor.java:108)
at oracle.jdbc.driver.T4CTTIdcb.fillupAccessors(T4CTTIdcb.java:431)
at oracle.jdbc.driver.T4CTTIdcb.receiveCommon(T4CTTIdcb.java:209)
at oracle.jdbc.driver.T4CTTIdcb.receive(T4CTTIdcb.java:145)
at oracle.jdbc.driver.T4C8Oall.readDCB(T4C8Oall.java:963)
at oracle.jdbc.driver.T4CTTIfun.receive(T4CTTIfun.java:447)
at oracle.jdbc.driver.T4CTTIfun.doRPC(T4CTTIfun.java:235)
at oracle.jdbc.driver.T4C8Oall.doOALL(T4C8Oall.java:543)
at oracle.jdbc.driver.T4CPreparedStatement.doOall8(T4CPreparedStatement.java:239)
at oracle.jdbc.driver.T4CPreparedStatement.executeForDescribe(T4CPreparedStatement.java:1246)
at oracle.jdbc.driver.OracleStatement.executeMaybeDescribe(OracleStatement.java:1500)
at oracle.jdbc.driver.OracleStatement.doExecuteWithTimeout(OracleStatement.java:1717)
at oracle.jdbc.driver.OraclePreparedStatement.executeInternal(OraclePreparedStatement.java:4401)
at oracle.jdbc.driver.OraclePreparedStatement.executeQuery(OraclePreparedStatement.java:4482)
- locked <0x000000074d203f60> (a oracle.jdbc.driver.T4CConnection)
at oracle.jdbc.driver.OraclePreparedStatementWrapper.executeQuery(OraclePreparedStatementWrapper.java:6272)
at sun.reflect.GeneratedMethodAccessor400.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.tomcat.jdbc.pool.interceptor.AbstractQueryReport$StatementProxy.invoke(AbstractQueryReport.java:210)
at com.sun.proxy.$Proxy637.executeQuery(Unknown Source)
at sun.reflect.GeneratedMethodAccessor400.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.tomcat.jdbc.pool.StatementFacade$StatementProxy.invoke(StatementFacade.java:114)
at com.sun.proxy.$Proxy637.executeQuery(Unknown Source)
at org.hibernate.jdbc.AbstractBatcher.getResultSet(AbstractBatcher.java:208)***
I have no idea why toUpperCase() would lock something (it is some Integer object being locked, as far as I found) for 30+ seconds, but this keeps occurring multiple times per day. Thread dump analysers did not find any deadlocks in the dump. The Tomcat pool logs show that the query for blocking thread http-nio-80-exec-433 took 5 minutes to complete.
Could this be a problem with the JVM, memory, or something else, like a JDBC driver or connection pool configuration problem?
It appears the problem was not code related. We had 10GB size catalina.out log file, and 4 bash scripts which analyzed that file for specific errors every five minutes, and because of large file size, each such analysis (mostly tail/wc commands) took 3-4 minutes. I do not know if catalina.out was being locked, but CPU usage for "tail" and "wc" commands was quite significant. Memory usage did not increase significantly though.
After manually rolling catalina.out, the problem is gone. Admin has been tasked with figuring out why logrotate is failing.
Update: The problem kept reappearing under higher loads (>50 users), so after some testing, locale was changed from "lt" to "en". Together with fixing another MyFaces cache bug, response times from Tomcat decreased 10-20 times, and number of concurrent users that can use application increased >10 times.
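Why the locale matters, as far as I can tell from the JDK 8 sources: the no-argument String.toUpperCase() uses the default locale, and for a few languages (Lithuanian "lt" among them) it takes the locale-sensitive path through ConditionalSpecialCasing and its shared, synchronized Hashtable, which is the monitor in the traces above. A minimal sketch of the two ways around it follows. `UpperCaseLocaleSketch` and `upperAscii` are hypothetical names, and since the driver's internal calls cannot be edited, only the default-locale switch helps there.

```java
import java.util.Locale;

final class UpperCaseLocaleSketch {
    /** Locale-independent upper-casing for identifiers, SQL keywords and similar text. */
    static String upperAscii(String s) {
        return s.toUpperCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        // Programmatic equivalent of starting the JVM with -Duser.language=en
        Locale.setDefault(Locale.ENGLISH);
        System.out.println(Locale.getDefault().getLanguage()); // en
        System.out.println(upperAscii("select")); // SELECT
    }
}
```

In your own code, passing an explicit locale avoids depending on the server's default. For third-party code such as the JDBC driver, changing the default locale (as described in the update) is the only lever.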
I have a problem with my production Tomcat, in which my web application is deployed. It runs fine for a long time, but after that, whenever any user tries to access the application through a web browser, a "connection reset" message is shown in the browser; the application is effectively down.
I tried increasing the Tomcat memory, but the problem continues. I have also taken a thread dump, in which all the threads are in the BLOCKED state.
Can anyone help me out?
Thread dump sample
"http-80-173" daemon prio=6 tid=0x55392800 nid=0xf84 waiting for monitor entry [0x64c8e000..0x64c8f9e8]
java.lang.Thread.State: BLOCKED (on object monitor)
at java.util.Arrays.copyOf(Arrays.java:2882)
at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:100)
at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:572)
at java.lang.StringBuffer.append(StringBuffer.java:320)
- locked <0x4472eeb8> (a java.lang.StringBuffer)
at java.text.MessageFormat.applyPattern(MessageFormat.java:436)
at java.text.MessageFormat.<init>(MessageFormat.java:350)
at java.text.MessageFormat.format(MessageFormat.java:811)
at org.apache.naming.StringManager.getString(StringManager.java:121)
at org.apache.naming.StringManager.getString(StringManager.java:144)
at org.apache.naming.resources.FileDirContext.getAttributes(FileDirContext.java:432)
at org.apache.naming.resources.BaseDirContext.getAttributes(BaseDirContext.java:747)
at org.apache.naming.resources.ProxyDirContext.cacheLoad(ProxyDirContext.java:1531)
at org.apache.naming.resources.ProxyDirContext.cacheLookup(ProxyDirContext.java:1454)
at org.apache.naming.resources.ProxyDirContext.lookup(ProxyDirContext.java:288)
at org.apache.naming.resources.DirContextURLConnection.getInputStream(DirContextURLConnection.java:368)
at org.apache.jasper.compiler.Compiler.isOutDated(Compiler.java:435)
at org.apache.jasper.compiler.Compiler.isOutDated(Compiler.java:392)
All threads are in the same state.
"http-80-160" daemon prio=6 tid=0x5538dc00 nid=0x1e80 waiting for monitor entry [0x6453e000..0x6453fce8]
java.lang.Thread.State: BLOCKED (on object monitor)
at java.lang.AbstractStringBuilder.<init>(AbstractStringBuilder.java:45)
at java.lang.StringBuffer.<init>(StringBuffer.java:79)
at com.microsoft.sqlserver.jdbc.SQLCollation.readCollation(Unknown Source)
at com.microsoft.sqlserver.jdbc.TypeInfo.init(Unknown Source)
at com.microsoft.sqlserver.jdbc.StreamColumns.processBytes(Unknown Source)
at com.microsoft.sqlserver.jdbc.IOBuffer.processPackets(Unknown Source)
at com.microsoft.sqlserver.jdbc.SQLServerStatement.getNextResult(Unknown Source)
at com.microsoft.sqlserver.jdbc.SQLServerStatement.doExecuteStatement(Unknown Source)
at com.microsoft.sqlserver.jdbc.SQLServerStatement$StatementExecutionRequest.executeStatement(Unknown Source)
at com.microsoft.sqlserver.jdbc.CancelableRequest.execute(Unknown Source)
at com.microsoft.sqlserver.jdbc.SQLServerConnection.executeRequest(Unknown Source)
- locked <0x444b72c0> (a com.microsoft.sqlserver.jdbc.TDSWriter)
at com.microsoft.sqlserver.jdbc.SQLServerStatement.executeQuery(Unknown Source)
at org.apache.commons.dbcp.DelegatingStatement.executeQuery(DelegatingStatement.java:208)
This line also appears in most of the blocked threads:
locked <0x444b72c0> (a com.microsoft.sqlserver.jdbc.TDSWriter)
I am using JDBC driver 2, and my operating system is Windows 2007.
I am also using connection pooling, and threads show as blocked while getting a connection:
"http-80-172" daemon prio=6 tid=0x55392400 nid=0x2548 waiting for monitor entry [0x64bfe000..0x64bffa68]
java.lang.Thread.State: BLOCKED (on object monitor)
at org.apache.commons.pool.impl.GenericObjectPool.getNumIdle(GenericObjectPool.java:911)
- waiting to lock <0x3ef89f88> (a org.apache.commons.dbcp.AbandonedObjectPool)
at org.apache.commons.dbcp.AbandonedObjectPool.borrowObject(AbandonedObjectPool.java:78)
at org.apache.commons.dbcp.PoolingDriver.connect(PoolingDriver.java:176)
at java.sql.DriverManager.getConnection(DriverManager.java:582)
at java.sql.DriverManager.getConnection(DriverManager.java:207)
at com.cong.vts.dbutil.JDBCConnection.getConnection(JDBCConnection.java:42)
at com.cong.vts.dbutil.SQLFunctions.createCSSFile(SQLFunctions.java:10069)
I'm seeing a problem with multiple Threads deadlocking on the same line of code.
I cannot reproduce the problem locally or in any test, yet Thread Dumps from Production show the problem quite clearly.
I can't see why the Threads would become blocked on the synchronized line below, since there is no other synchronization on the Object in the call stack or in any other Thread. Does anyone have any idea what is going on, or how I can even reproduce this issue? (I am currently trying with 15 Threads all hitting trim() in a loop while processing 2000 tasks through my Queue, but have been unable to reproduce it.)
In the Thread dump below, I think the multiple Threads with the 'locked' status may be a manifestation of Java Bug: http://bugs.java.com/view_bug.do?bug_id=8047816 where JStack reports Threads in wrong state.
(I'm using JDK Version: 1.7.0_51)
Cheers!
Here is a view of the Threads in the Thread dump.....
"xxx>Job Read-3" daemon prio=10 tid=0x00002aca001a6800 nid=0x6a3b waiting for monitor entry [0x0000000052ec4000]
java.lang.Thread.State: BLOCKED (on object monitor)
at com.mycompany.collections.CustomQueue.remove(CustomQueue.java:101)
- locked <0x00002aae6465a650> (a java.util.ArrayDeque)
at com.mycompany.collections.CustomQueue.trim(CustomQueue.java:318)
at com.mycompany.collections.CustomQueue.itemProcessed(CustomQueue.java:302)
at com.mycompany.collections.CustomQueue.trackCompleted(CustomQueue.java:147)
at java.util.concurrent.ThreadPoolExecutor.afterExecute(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
Locked ownable synchronizers:
- <0x00002aaf5f9c2680> (a java.util.concurrent.ThreadPoolExecutor$Worker)
"xxx>Job Read-2" daemon prio=10 tid=0x00002aca001a5000 nid=0x6a3a waiting for monitor entry [0x0000000052d83000]
java.lang.Thread.State: BLOCKED (on object monitor)
at com.mycompany.collections.CustomQueue.remove(CustomQueue.java:101)
- locked <0x00002aae6465a650> (a java.util.ArrayDeque)
at com.mycompany.collections.CustomQueue.trim(CustomQueue.java:318)
at com.mycompany.collections.CustomQueue.itemProcessed(CustomQueue.java:302)
at com.mycompany.collections.CustomQueue.trackCompleted(CustomQueue.java:147)
at java.util.concurrent.ThreadPoolExecutor.afterExecute(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
Locked ownable synchronizers:
- <0x00002aaf5f9ed518> (a java.util.concurrent.ThreadPoolExecutor$Worker)
"xxx>Job Read-1" daemon prio=10 tid=0x00002aca00183000 nid=0x6a39 waiting for monitor entry [0x0000000052c42000]
java.lang.Thread.State: BLOCKED (on object monitor)
at com.mycompany.collections.CustomQueue.remove(CustomQueue.java:101)
- waiting to lock <0x00002aae6465a650> (a java.util.ArrayDeque)
at com.mycompany.collections.CustomQueue.trim(CustomQueue.java:318)
at com.mycompany.collections.CustomQueue.itemProcessed(CustomQueue.java:302)
at com.mycompany.collections.CustomQueue.trackCompleted(CustomQueue.java:147)
at java.util.concurrent.ThreadPoolExecutor.afterExecute(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
Locked ownable synchronizers:
- <0x00002aaf5f9ecde8> (a java.util.concurrent.ThreadPoolExecutor$Worker)
"xxx>Job Read-0" daemon prio=10 tid=0x0000000006a83000 nid=0x6a36 waiting for monitor entry [0x000000005287f000]
java.lang.Thread.State: BLOCKED (on object monitor)
at com.mycompany.collections.CustomQueue.remove(CustomQueue.java:101)
- waiting to lock <0x00002aae6465a650> (a java.util.ArrayDeque)
at com.mycompany.collections.CustomQueue.trim(CustomQueue.java:318)
at com.mycompany.collections.CustomQueue.itemProcessed(CustomQueue.java:302)
at com.mycompany.collections.CustomQueue.trackCompleted(CustomQueue.java:147)
at java.util.concurrent.ThreadPoolExecutor.afterExecute(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
Here is the Java code extracted, which shows where the error is...
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

public class Deadlock {
    final Deque<Object> delegate = new ArrayDeque<>();
    final long maxSize = Long.MAX_VALUE;
    private final AtomicLong totalExec = new AtomicLong();
    private final Map<Object, AtomicLong> totals = new HashMap<>();
    private final Map<Object, Deque<Long>> execTimes = new HashMap<>();

    public void trim() {
        //Possible optimization is evicting in chunks, segmenting by arrival time
        while (this.totalExec.longValue() > this.maxSize) {
            final Object t = this.delegate.peek();
            final Deque<Long> execTime = this.execTimes.get(t);
            final Long exec = execTime.peek();
            if (exec != null && this.totalExec.longValue() - exec > this.maxSize) {
                //If Job Started Inside of Window, remove and re-loop
                remove();
            }
            else {
                //Otherwise exit the loop
                break;
            }
        }
    }

    public Object remove() {
        Object removed;
        synchronized (this.delegate) { //4 Threads deadlocking on this line !
            removed = this.delegate.pollFirst();
        }
        if (removed != null) {
            itemRemoved(removed);
        }
        return removed;
    }

    public void itemRemoved(final Object t) {
        //Decrement Total & Queue
        final AtomicLong catTotal = this.totals.get(t);
        if (catTotal != null) {
            if (!this.execTimes.get(t).isEmpty()) {
                final Long exec = this.execTimes.get(t).pollFirst();
                if (exec != null) {
                    catTotal.addAndGet(-exec);
                    this.totalExec.addAndGet(-exec);
                }
            }
        }
    }
}
From the documentation for HashMap
Note that this implementation is not synchronized. If multiple threads
access a hash map concurrently, and at least one of the threads
modifies the map structurally, it must be synchronized externally.
(Emphasis theirs)
You are both reading and writing to/from the Maps in an unsynchronized manner.
I see no reason to assume that your code is thread safe.
I suggest that you have an infinite loop in trim caused by this lack of thread safety.
Entering a synchronized block is relatively slow, so it's likely that a thread dump will always show at least a few threads waiting to obtain the lock.
Your first thread is holding the lock while waiting for pollFirst.
"xxx>Job Read-3" daemon prio=10 tid=0x00002aca001a6800 nid=0x6a3b waiting for monitor entry [0x0000000052ec4000]
java.lang.Thread.State: BLOCKED (on object monitor)
at com.mycompany.collections.CustomQueue.remove(CustomQueue.java:101)
- locked <0x00002aae6465a650> (a java.util.ArrayDeque)
at com.mycompany.collections.CustomQueue.trim(CustomQueue.java:318)
The other threads are waiting to obtain the lock.
You will need to provide the entire thread dump to determine which thread is holding the lock on 0x0000000052ec4000, which is what is preventing your pollFirst call from returning.
For a deadlock, you need at least two threads, each holding one lock while waiting for a lock that another holds, which is something the code you posted doesn't appear to do. The bug you point to may apply, but as I read it, it's a cosmetic issue: the threads have not 'locked' the object in question (the ArrayDeque) but are waiting to acquire a lock on it. If you had a real deadlock, you should see a "deadlock" message in the thread dump, calling out the threads that are blocking each other.
I don't believe the thread dump says there are deadlocks. It's simply telling you how many threads are waiting on the monitor at the moment you took the dump. Since only one thread may have the monitor at a given moment, it shouldn't be very surprising.
What behavior are you seeing in your application that leads you to believe you have a deadlock? There's a lot missing from your code, particularly where the objects in the delegate Deque are coming from. My guess is you don't have an outright deadlock but some other issue that may look like one.
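If you want to check for a real deadlock from inside the application rather than by reading dumps, the JDK's ThreadMXBean can be asked directly. This is only a sketch: `DeadlockCheckSketch` and `deadlockedThreadNames` are made-up names, and you would have to decide yourself where to call it (a scheduled task, a health endpoint, and so on).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

final class DeadlockCheckSketch {
    /** Describes the threads the JVM considers deadlocked; empty if there are none. */
    static List<String> deadlockedThreadNames() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        long[] ids = threads.findDeadlockedThreads(); // null when no deadlock exists
        List<String> names = new ArrayList<>();
        if (ids != null) {
            for (ThreadInfo info : threads.getThreadInfo(ids)) {
                if (info != null) { // a thread may have died in the meantime
                    names.add(info.getThreadName() + " waiting on " + info.getLockName()
                            + " held by " + info.getLockOwnerName());
                }
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(deadlockedThreadNames());
    }
}
```

An empty result while threads still pile up on one monitor points away from a deadlock and towards contention or a loop that never ends.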
Thanks to the responses here, it became clear that the issue was non-thread-safe usage of multiple Collections.
To resolve the issue, I've made the trim method synchronized and replaced usage of HashMap with ConcurrentHashMap and ArrayDeque with LinkedBlockingDeque.
(Concurrent Collections FTW!)
A further planned enhancement is to change the usage of 2 separate Maps into a single Map containing a Custom Object, that way keeping the operations (in itemRemoved) atomic.
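A rough sketch of what that single-Map version might look like is below. It assumes Java 8 or later for ConcurrentHashMap.compute (the original code ran on JDK 1.7), and `ExecTracker`, `Stats`, `itemExecuted` and `totalFor` are hypothetical names rather than the real CustomQueue API.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicLong;

final class ExecTracker {
    /** Per-key state that previously lived in two separate maps. */
    private static final class Stats {
        long total;                                    // sum of recorded execution times
        final Deque<Long> execTimes = new ArrayDeque<>();
    }

    private final ConcurrentMap<Object, Stats> stats = new ConcurrentHashMap<>();
    private final AtomicLong totalExec = new AtomicLong();

    /** Records one execution time for the key. */
    void itemExecuted(Object key, long exec) {
        // compute() runs atomically per key, so queue and total change together.
        stats.compute(key, (k, s) -> {
            if (s == null) {
                s = new Stats();
            }
            s.execTimes.addLast(exec);
            s.total += exec;
            return s;
        });
        totalExec.addAndGet(exec);
    }

    /** Removes the oldest execution time for the key, if there is one. */
    void itemRemoved(Object key) {
        stats.computeIfPresent(key, (k, s) -> {
            Long exec = s.execTimes.pollFirst();
            if (exec != null) {
                s.total -= exec;
                totalExec.addAndGet(-exec);
            }
            return s;
        });
    }

    /** Reads the per-key total inside the same per-key atomic section. */
    long totalFor(Object key) {
        long[] out = {0};
        stats.computeIfPresent(key, (k, s) -> { out[0] = s.total; return s; });
        return out[0];
    }

    long total() {
        return totalExec.get();
    }
}
```

The per-key queue and total can no longer drift apart. The grand total is still updated separately, so a reader may briefly see it lag behind a per-key change.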
I'm having difficulties in understanding the thread dump I got from jstack for a Spring MVC web application running on Tomcat 6 (java 1.6.0_22, Linux).
I see blocking threads (that cause other threads to wait) which are blocked themselves, however the thread dump doesn't tell me why or for which monitor they are waiting.
Example:
"TP-Processor75" daemon prio=10 tid=0x00007f3e88448800 nid=0x56f5 waiting for monitor entry [0x00000000472bc000]
java.lang.Thread.State: BLOCKED (on object monitor)
at java.lang.Class.initAnnotationsIfNecessary(Class.java:3067)
- locked <0x00007f3e9a0b3830> (a java.lang.Class for org.catapultframework.resource.ResourceObject)
at java.lang.Class.getAnnotation(Class.java:3029)
...
I.e. I am missing the "waiting to lock ..." line in the stack trace. Apparently the thread locks a Class object, but I don't see why the thread itself is blocked.
The thread-dump does not contain any hints for deadlocks.
What can I do to identify the locking monitor?
Thanks,
Oliver
Apparently the situations in which we observed these kinds of blocked threads were related to heavy memory consumption and therefore massive garbage collection.
The question "Java blocking issue: Why would JVM block threads in many different classes/methods?" describes a similar situation, so I believe these threads were simply blocked by the garbage collector.
(Anyway, after solving the memory issue this problem with the blocking threads was gone.)
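One way to test the garbage-collection theory is to record the JVM's own GC counters next to each thread dump and see whether they jump when the blocked threads appear. A minimal sketch with the standard management API follows; `GcTimeSketch` and `totalGcMillis` are invented names, and the reported time is the collectors' approximate accumulated elapsed time, not an exact measure of pauses.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

final class GcTimeSketch {
    /** Approximate milliseconds spent collecting so far, summed over all collectors. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 if the collector does not report it
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + ": " + gc.getCollectionCount()
                    + " collections, " + gc.getCollectionTime() + " ms");
        }
        System.out.println("total: " + totalGcMillis() + " ms");
    }
}
```

If the total grows by seconds between two dumps taken a few seconds apart, the "blocked" threads are most likely just paused for collection.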
Check if the finalizer thread is blocked or waiting.
During a GC sweep, the GC will "stop the world" to perform its cleanup. The definition of "world" depends on the garbage collector being used and context. It may be a small cluster of threads or all of them. Before officially collecting garbage, GC will invoke the object's finalize().
If you are in the undesirable situation where you are implementing finalizer methods, the finalization code may be blocking it from finishing and the 'world' stays stopped.
This is most obvious when seeing lots of threads being permanently-blocked by some unknown magic force: Look up the code where the blocking occurs and it will make no sense; there is no blocking code to be found anywhere near it and the dumps will not divulge what monitor it is waiting on because there isn't one. The GC has paused the threads.
I had a similar problem just now using an Applet in Google Chrome.
In short:
The BLOCKED threads can be blocked when the VM needs to load a class.
When the process of loading the class itself is blocked by something a freeze for the whole app can occur.
In Detail:
I had the following scenario:
I am using an Applet in Chrome with codebase = folder to single class-files (no jar)
The Website passes focus-events to the applet using LiveConnect
The incoming JS-calls are using an Executor with new Runnable() ... to detach the calls in order to reduce the wait times and thus hangs in JS.
That's where the problem occurred!
Explanation:
The new Runnable() is an anonymous inner class which was not loaded before the JS call happened.
The JS call therefore triggers the class load.
But now the class loader is blocked because it needs to talk to the browser (I am guessing) via the same queue or mechanism that is processing the incoming JS call.
Here is the blocked thread that is trying to load the class:
"Thread-20" daemon prio=4 tid=0x052e8400 nid=0x4608 in Object.wait() [0x0975d000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at sun.plugin2.message.Queue.waitForMessage(Unknown Source)
- locked <0x29fbc5d8> (a sun.plugin2.message.Queue)
at sun.plugin2.message.Pipe$2.run(Unknown Source)
at com.sun.deploy.util.Waiter$1.wait(Unknown Source)
at com.sun.deploy.util.Waiter.runAndWait(Unknown Source)
at sun.plugin2.message.Pipe.receive(Unknown Source)
at sun.plugin2.main.client.MessagePassingExecutionContext.doCookieOp(Unknown Source)
at sun.plugin2.main.client.MessagePassingExecutionContext.getCookie(Unknown Source)
at sun.plugin2.main.client.PluginCookieSelector.getCookieFromBrowser(Unknown Source)
at com.sun.deploy.net.cookie.DeployCookieSelector.getCookieInfo(Unknown Source)
at com.sun.deploy.net.cookie.DeployCookieSelector.get(Unknown Source)
- locked <0x298da868> (a sun.plugin2.main.client.PluginCookieSelector)
at sun.net.www.protocol.http.HttpURLConnection.setCookieHeader(Unknown Source)
at sun.net.www.protocol.http.HttpURLConnection.writeRequests(Unknown Source)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(Unknown Source)
- locked <0x2457cdc0> (a sun.net.www.protocol.http.HttpURLConnection)
at com.sun.deploy.net.HttpUtils.followRedirects(Unknown Source)
at com.sun.deploy.net.BasicHttpRequest.doRequest(Unknown Source)
at com.sun.deploy.net.BasicHttpRequest.doGetRequestEX(Unknown Source)
at com.sun.deploy.cache.ResourceProviderImpl.checkUpdateAvailable(Unknown Source)
at com.sun.deploy.cache.ResourceProviderImpl.isUpdateAvailable(Unknown Source)
at com.sun.deploy.cache.DeployCacheHandler.get(Unknown Source)
- locked <0x245727a0> (a java.lang.Object)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect(Unknown Source)
at sun.net.www.protocol.http.HttpURLConnection.connect(Unknown Source)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(Unknown Source)
- locked <0x24572020> (a sun.net.www.protocol.http.HttpURLConnection)
at java.net.HttpURLConnection.getResponseCode(Unknown Source)
at sun.plugin2.applet.Applet2ClassLoader.getBytes(Unknown Source)
at sun.plugin2.applet.Applet2ClassLoader.access$000(Unknown Source)
at sun.plugin2.applet.Applet2ClassLoader$1.run(Unknown Source)
at java.security.AccessController.doPrivileged(Native Method)
at sun.plugin2.applet.Applet2ClassLoader.findClass(Unknown Source)
at sun.plugin2.applet.Plugin2ClassLoader.loadClass0(Unknown Source)
at sun.plugin2.applet.Plugin2ClassLoader.loadClass(Unknown Source)
- locked <0x299726b8> (a sun.plugin2.applet.Applet2ClassLoader)
at sun.plugin2.applet.Plugin2ClassLoader.loadClass(Unknown Source)
- locked <0x299726b8> (a sun.plugin2.applet.Applet2ClassLoader)
at java.lang.ClassLoader.loadClass(Unknown Source)
As you can see it is waiting for a message --> waitForMessage().
At the same time there is our incoming JS call being BLOCKED here:
"Applet 1 LiveConnect Worker Thread" prio=4 tid=0x05231800 nid=0x1278 waiting for monitor entry [0x0770e000]
java.lang.Thread.State: BLOCKED (on object monitor)
at MyClass.myMethod(MyClass.java:23)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at sun.plugin.javascript.Trampoline.invoke(Unknown Source)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at sun.plugin.javascript.JSClassLoader.invoke(Unknown Source)
at sun.plugin2.liveconnect.JavaClass$MethodInfo.invoke(Unknown Source)
at sun.plugin2.liveconnect.JavaClass$MemberBundle.invoke(Unknown Source)
at sun.plugin2.liveconnect.JavaClass.invoke0(Unknown Source)
at sun.plugin2.liveconnect.JavaClass.invoke(Unknown Source)
at sun.plugin2.main.client.LiveConnectSupport$PerAppletInfo$DefaultInvocationDelegate.invoke(Unknown Source)
at sun.plugin2.main.client.LiveConnectSupport$PerAppletInfo$3.run(Unknown Source)
at java.security.AccessController.doPrivileged(Native Method)
at sun.plugin2.main.client.LiveConnectSupport$PerAppletInfo.doObjectOp(Unknown Source)
at sun.plugin2.main.client.LiveConnectSupport$PerAppletInfo$LiveConnectWorker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
Additional other threads were blocked in the same manner. I suppose all subsequent class-load requests were blocked by the first blocked class-loading thread.
As mentioned before, my guess is that the class-loading process is blocked by the pending JS call, which by itself is blocked by the missing class to be loaded.
Solutions:
1. Trigger loading all relevant classes in the constructor of the applet before any calls can be made from JS.
2. It might help if the class-files are not being loaded individually, but from a jar-file. The theory behind this is: The class loader does not need to talk to the browser to load the classes from the jar-file (which would be
3. In combination with 1.: Use a dynamic Proxy class to wrap all incoming JS calls and run them independently in an Executor.
My implementation for #3:
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

// Declared at top level (package-private): javac rejects a class that implements
// its own member interface, and an interface cannot be declared final.
interface JsCallInterface
{
    void myMethod1Intern(String param1, String param2);
}

public class MyClass implements JsCallInterface
{
    private final JsCallInterface jsProxy;

    // Runs one forwarded JS call against the real MyClass instance.
    private final class JsCallRunnable implements Runnable
    {
        private final Method method;
        private final Object[] args;

        private JsCallRunnable(Method method, Object[] args)
        {
            this.method = method;
            this.args = args;
        }

        public void run()
        {
            try
            {
                method.invoke(MyClass.this, args);
            }
            catch (Exception e)
            {
                e.printStackTrace();
            }
        }
    }

    public MyClass()
    {
        // Touch these classes up front so they are loaded before any JS call arrives.
        MyUtilsClass.class.getName(); // load class
        JsCallRunnable.class.getName(); // load class
        InvocationHandler jsCallHandler = new InvocationHandler()
        {
            public Object invoke(final Object proxy, final Method method, final Object[] args) throws Throwable
            {
                // Hand the call off to the executor and return to the JS caller immediately.
                MyUtilsClass.executeInExecutor(new JsCallRunnable(method, args));
                return null;
            }
        };
        jsProxy = (JsCallInterface) Proxy.newProxyInstance(MyClass.class.getClassLoader(), new Class<?>[] { JsCallInterface.class }, jsCallHandler);
    }

    // Entry point called from JS.
    public void myMethod1(String param1, String param2)
    {
        jsProxy.myMethod1Intern(param1, param2);
        // needs to be named differently than the external method or else the proxy will call this method recursively
        // alternatively the target-class in "method.invoke(MyClass.this, args);" could be a different instance of JsCallInterface
    }

    public void myMethod1Intern(String param1, String param2)
    {
        // do actual work here
    }
}
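The MyUtilsClass.executeInExecutor helper used above is not shown. A minimal sketch of what it might look like follows, assuming a single daemon worker thread so that forwarded JS calls run in arrival order. The thread name "js-call-worker" and the choice of a single-thread executor are assumptions for illustration, not part of the original implementation.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;

// Hypothetical stand-in for the MyUtilsClass helper referenced above.
final class MyUtilsClass {
    // One worker thread: calls run off the JS/LiveConnect thread, in submission order.
    private static final ExecutorService EXECUTOR =
        Executors.newSingleThreadExecutor(new ThreadFactory() {
            public Thread newThread(Runnable r) {
                Thread t = new Thread(r, "js-call-worker");
                t.setDaemon(true); // do not keep the JVM alive once the applet is gone
                return t;
            }
        });

    private MyUtilsClass() {
    }

    // Queues the task and returns immediately; the caller never waits for it.
    static void executeInExecutor(Runnable task) {
        EXECUTOR.execute(task);
    }
}
```

Because the proxy handler returns null right away, this approach only fits JS calls that do not need a return value.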
This is a cosmetic bug in Oracle's HotSpot JVM: where your stack trace shows "- locked <0x00007f3e9a0b3830>", it should actually say "- waiting to lock <0x00007f3e9a0b3830>".
See this bug for more details.
While developing an application that consumes an external web service, I generated the sources from the WSDL URL and then created a client:
GeoIPServiceClient service = new GeoIPServiceClient();
GeoIPServiceSoap geoIPClient = service.getGeoIPServiceSoap();
Since creating this proxy takes some time, I keep the client as a field in my service class.
But I'm worried that the client isn't thread-safe, and this web service is heavily used by concurrent threads in the application (a webapp). I can't find any documentation on this.
As a precaution I've started to use an object pool of SOAP clients instead of a single shared one.
Is this an unnecessary precaution? What is the best practice when writing XFire clients?
I suspect some kind of concurrency problem in XFire, since under high load I regularly get blocked threads, and the application crashes as a result. Here's a partial thread dump:
"http-xx.xx.xx.xx-80-17" daemon prio=10 tid=0x00007f560d437000 nid=0x66cb waiting for monitor entry [0x00000000412b8000]
java.lang.Thread.State: BLOCKED (on object monitor)
at com.sun.xml.bind.v2.runtime.reflect.opt.Injector.inject(Injector.java:174)
- waiting to lock <0x00007f561d44e1c0> (a com.sun.xml.bind.v2.runtime.reflect.opt.Injector)
at com.sun.xml.bind.v2.runtime.reflect.opt.Injector.inject(Injector.java:85)
at com.sun.xml.bind.v2.runtime.reflect.opt.AccessorInjector.prepare(AccessorInjector.java:87)
at com.sun.xml.bind.v2.runtime.reflect.opt.OptimizedAccessorFactory.get(OptimizedAccessorFactory.java:165)
at com.sun.xml.bind.v2.runtime.reflect.Accessor$FieldReflection.optimize(Accessor.java:253)
at com.sun.xml.bind.v2.runtime.reflect.TransducedAccessor$CompositeTransducedAccessorImpl.<init>(TransducedAccessor.java:231)
at com.sun.xml.bind.v2.runtime.reflect.TransducedAccessor.get(TransducedAccessor.java:173)
at com.sun.xml.bind.v2.runtime.property.SingleElementLeafProperty.<init>(SingleElementLeafProperty.java:83)
at sun.reflect.GeneratedConstructorAccessor165.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
at com.sun.xml.bind.v2.runtime.property.PropertyFactory.create(PropertyFactory.java:124)
at com.sun.xml.bind.v2.runtime.ClassBeanInfoImpl.<init>(ClassBeanInfoImpl.java:171)
at com.sun.xml.bind.v2.runtime.JAXBContextImpl.getOrCreate(JAXBContextImpl.java:481)
at com.sun.xml.bind.v2.runtime.JAXBContextImpl.<init>(JAXBContextImpl.java:315)
at com.sun.xml.bind.v2.ContextFactory.createContext(ContextFactory.java:139)
at com.sun.xml.bind.v2.ContextFactory.createContext(ContextFactory.java:117)
at com.sun.xml.bind.v2.ContextFactory.createContext(ContextFactory.java:188)
at sun.reflect.GeneratedMethodAccessor176.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at javax.xml.bind.ContextFinder.newInstance(ContextFinder.java:128)
at javax.xml.bind.ContextFinder.find(ContextFinder.java:277)
at javax.xml.bind.JAXBContext.newInstance(JAXBContext.java:372)
at javax.xml.bind.JAXBContext.newInstance(JAXBContext.java:337)
at javax.xml.bind.JAXBContext.newInstance(JAXBContext.java:244)
at org.codehaus.xfire.jaxb2.JaxbType.getJAXBContext(JaxbType.java:306)
- locked <0x00007f565b3aee60> (a org.codehaus.xfire.jaxb2.JaxbType)
at org.codehaus.xfire.jaxb2.JaxbType.writeObject(JaxbType.java:230)
at org.codehaus.xfire.aegis.AegisBindingProvider.writeParameter(AegisBindingProvider.java:229)
at org.codehaus.xfire.service.binding.AbstractBinding.writeParameter(AbstractBinding.java:273)
at org.codehaus.xfire.service.binding.WrappedBinding.writeMessage(WrappedBinding.java:90)
at org.codehaus.xfire.soap.SoapSerializer.writeMessage(SoapSerializer.java:80)
at org.codehaus.xfire.transport.http.HttpChannel.writeWithoutAttachments(HttpChannel.java:56)
at org.codehaus.xfire.transport.http.OutMessageRequestEntity.writeRequest(OutMessageRequestEntity.java:51)
at org.apache.commons.httpclient.methods.EntityEnclosingMethod.writeRequestBody(EntityEnclosingMethod.java:499)
at org.apache.commons.httpclient.HttpMethodBase.writeRequest(HttpMethodBase.java:2114)
at org.apache.commons.httpclient.HttpMethodBase.execute(HttpMethodBase.java:1096)
at org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(HttpMethodDirector.java:398)
at org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:171)
at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397)
at org.codehaus.xfire.transport.http.CommonsHttpMessageSender.send(CommonsHttpMessageSender.java:369)
at org.codehaus.xfire.transport.http.HttpChannel.sendViaClient(HttpChannel.java:123)
at org.codehaus.xfire.transport.http.HttpChannel.send(HttpChannel.java:48)
at org.codehaus.xfire.handler.OutMessageSender.invoke(OutMessageSender.java:26)
at org.codehaus.xfire.handler.HandlerPipeline.invoke(HandlerPipeline.java:131)
at org.codehaus.xfire.client.Invocation.invoke(Invocation.java:79)
at org.codehaus.xfire.client.Invocation.invoke(Invocation.java:114)
at org.codehaus.xfire.client.Client.invoke(Client.java:336)
at org.codehaus.xfire.client.XFireProxy.handleRequest(XFireProxy.java:77)
at org.codehaus.xfire.client.XFireProxy.invoke(XFireProxy.java:57)
at $Proxy143.getMyMethod(Unknown Source)
The thread dump contains a lot of blocked threads that look like this.
Since you are getting a lot of blocked threads rather than corrupted object data, I guess the client actually is thread-safe :). But I agree it's not handling the concurrency in a good way.
1) One observation is that the final lock seems to be in the JAXB implementation and not in XFire. What if you try a different JAXB implementation, such as JaxMe?
2) Also, the method getJAXBContext in JaxbType is synchronised. Most likely your threads are blocked because they are all accessing the same JaxbType instance.
Looking at that method, I would actually move the synchronisation into the method, after the presence of the context is checked:
if (context == null) {
synchronized (this) {
...
This allows callers that already have an initialised JAXBContext to skip the expensive synchronisation.
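For the check-then-synchronise idea to be safe, the field has to be volatile and the null check has to be repeated inside the synchronized block (double-checked locking). Below is a minimal, generic sketch of that pattern. LazyContext and its factory argument are hypothetical names standing in for JaxbType and its JAXBContext creation, whose actual fields are not shown here.

```java
import java.util.function.Supplier;

// Hypothetical holder illustrating double-checked locking; not XFire code.
final class LazyContext<T> {
    private final Supplier<T> factory;
    // volatile ensures other threads never see a partially constructed value.
    private volatile T context;

    LazyContext(Supplier<T> factory) {
        this.factory = factory;
    }

    T get() {
        T local = context;              // first check, without taking the lock
        if (local == null) {
            synchronized (this) {
                local = context;        // second check, now holding the lock
                if (local == null) {
                    local = factory.get(); // expensive creation runs at most once
                    context = local;
                }
            }
        }
        return local;
    }
}
```

Once the value has been created, get() costs a single volatile read and threads no longer queue on the monitor.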
My suggestion is to either fix the code yourself and test it, or file a bug against XFire, or both :).
It depends on the version of XFire you are using, as a few thread-safety issues were fixed in version 1.2.5. You can check the bug raised at http://jira.codehaus.org/browse/XFIRE-886 , and see more details in the release notes at hxxp://xfire.codehaus.org/XFire+1.2.5+Release+Notes