While parsing a date string, a thread hung in a native method. The thread has been stuck in the same state for the last 18 hours (thread dumps taken at different times show the same stack for the thread in question).
Please suggest what might be causing this.
"DateParsingThread - 001" #228 daemon prio=5 os_prio=0 tid=0x00007f98e0029000 nid=0x9c2 runnable [0x00007f98016ee000]
java.lang.Thread.State: RUNNABLE
at java.security.AccessController.doPrivileged(Native Method)
at sun.util.resources.LocaleData.getBundle(LocaleData.java:163)
at sun.util.resources.LocaleData.getDateFormatData(LocaleData.java:127)
at sun.util.locale.provider.LocaleResources.getCalendarNames(LocaleResources.java:325)
at sun.util.locale.provider.CalendarNameProviderImpl.getDisplayNamesImpl(CalendarNameProviderImpl.java:157)
at sun.util.locale.provider.CalendarNameProviderImpl.getDisplayNames(CalendarNameProviderImpl.java:139)
at sun.util.locale.provider.CalendarDataUtility$CalendarFieldValueNamesMapGetter.getObject(CalendarDataUtility.java:178)
at sun.util.locale.provider.CalendarDataUtility$CalendarFieldValueNamesMapGetter.getObject(CalendarDataUtility.java:154)
at sun.util.locale.provider.LocaleServiceProviderPool.getLocalizedObjectImpl(LocaleServiceProviderPool.java:281)
at sun.util.locale.provider.LocaleServiceProviderPool.getLocalizedObject(LocaleServiceProviderPool.java:265)
at sun.util.locale.provider.CalendarDataUtility.retrieveFieldValueNames(CalendarDataUtility.java:88)
at java.util.Calendar.getDisplayNames(Calendar.java:2178)
at java.text.SimpleDateFormat.getDisplayNamesMap(SimpleDateFormat.java:2366)
at java.text.SimpleDateFormat.subParse(SimpleDateFormat.java:1972)
at java.text.SimpleDateFormat.parse(SimpleDateFormat.java
at java.text.DateFormat.parse(DateFormat.java:364)
Any suggestions to find the solutions are welcome.
On analysing my thread dumps with the online tool fastthread.io, I get the error "3 java daemon threads are looping on".
The stack traces of the looping threads are:
Signal Dispatcher
threadId:4 - state:RUNNABLE
stackTrace:
java.lang.Thread.State: RUNNABLE
prio=9 blockedtime=0 blockedcount=0 waitedtime=0 waitedcount=0
Attach Listener
threadId:5 - state:RUNNABLE
stackTrace:
java.lang.Thread.State: RUNNABLE
prio=5 blockedtime=0 blockedcount=0 waitedtime=0 waitedcount=0
DestroyJavaVM
threadId:188 - state:RUNNABLE
stackTrace:
java.lang.Thread.State: RUNNABLE
prio=5 blockedtime=0 blockedcount=0 waitedtime=0 waitedcount=0
I am currently on Java 11.0.9.
I am seeing CPU spikes on my Windows server.
I am using a Java Thread to connect via SMTP to our mail provider, because sending can take some time and I don't want the request to wait for it.
But it looks like the threads are not closed after they have finished.
I noticed this in the debug mode of Eclipse:
Each time I create a new Thread(), one more running thread is added, but it is never closed (at least I assume so, as Eclipse still shows Running).
This is my code:
Thread mailThread = new Thread() {
    public void run() {
        System.out.println("Does it work?");
        try {
            Transport t = session.getTransport("smtp");
            t.connect("user","pass");
            t.sendMessage(message,message.getAllRecipients());
            t.close();
            System.out.println("SENT");
            return;
        } catch (MessagingException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
            return;
        }
    }
};
mailThread.start();
Is this working as intended? Or does Running in Eclipse mean something different?
I suggest not relying only on the debugger to see which threads exist at a certain point in time. A debugger might display threads that are alive while execution is suspended at a breakpoint but would not be there under normal conditions.
It is preferable to use the command-line tool jstack to create thread dumps. It dumps all the threads in a JVM at a certain point in time.
Here are some instructions on how to use it: https://helpx.adobe.com/uk/experience-manager/kb/TakeThreadDump.html
Another thing that can help with debugging and with finding threads in the dump: give each thread a name by passing a string to one of the Thread constructors.
new Thread("foo")
Then it becomes easier to find these in the thread dump.
If you call a thread "foo" then it will show up in a thread dump like this:
"foo" #16 prio=5 os_prio=0 tid=0x0000000041970800 nid=0x41f8 waiting on condition [0x000000004244e000]
java.lang.Thread.State: TIMED_WAITING (sleeping)
at java.lang.Thread.sleep(java.base@9/Native Method)
at stackoverflow.ThreadReferenceTest$1.run(ThreadReferenceTest.java:14)
Locked ownable synchronizers:
- None
"Service Thread" #15 daemon prio=9 os_prio=0 tid=0x0000000041914000 nid=0x3d90 runnable [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
Locked ownable synchronizers:
- None
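The same check can be made from inside the JVM. The following is a minimal sketch, assuming a hypothetical thread name "mail-sender" and a trivial task standing in for the SMTP call from the question: it names the thread, waits for run() to return, and then looks the name up among the live threads.

```java
public class NamedThreadCheck {

    // Returns true if a live thread with the given name exists in this JVM.
    static boolean isThreadAlive(String name) {
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            if (t.isAlive() && t.getName().equals(name)) {
                return true;
            }
        }
        return false;
    }

    // Starts a short-lived named thread, waits for it to finish,
    // and reports whether it is still listed as alive afterwards.
    static boolean aliveAfterCompletion(String name) {
        Thread worker = new Thread(
                () -> System.out.println("working in " + Thread.currentThread().getName()),
                name);
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return isThreadAlive(name);
    }

    public static void main(String[] args) {
        // "mail-sender" is a hypothetical name chosen for this sketch.
        System.out.println("still alive after run() returned: "
                + aliveAfterCompletion("mail-sender"));
    }
}
```

If a thread created this way still shows up after its run() method has returned, that would point to something other than normal thread termination.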
I ran into a strange issue with logback: a thread FREEZES when logging in a spawned child Java process. Briefly:
ParentProcess creates 1 ChildProcess
In ChildProcess, logback prints 1000 lines.
Note:
Logging is output to the console
The app freezes
The issue does not happen when ChildProcess is run directly (without ParentProcess), or when logging to a file only (no console log)
I pushed simple code that reproduces this phenomenon: https://github.com/huymluu/logbackfreeze
EDIT: add thread dump
"process reaper@687" daemon prio=10 tid=0xc nid=NA runnable
java.lang.Thread.State: RUNNABLE
at java.lang.UNIXProcess.waitForProcessExit(UNIXProcess.java:-1)
at java.lang.UNIXProcess.lambda$initStreams$3(UNIXProcess.java:290)
at java.lang.UNIXProcess$$Lambda$7.687241927.run(Unknown Source:-1)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
"main@1" prio=5 tid=0x1 nid=NA waiting
java.lang.Thread.State: WAITING
at java.lang.Object.wait(Object.java:-1)
at java.lang.Object.wait(Object.java:502)
at java.lang.UNIXProcess.waitFor(UNIXProcess.java:396)
at parent.ParentProcess.main(ParentProcess.java:20)
"Finalizer@689" daemon prio=8 tid=0x3 nid=NA waiting
java.lang.Thread.State: WAITING
at java.lang.Object.wait(Object.java:-1)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:143)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:164)
at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:209)
"Reference Handler@690" daemon prio=10 tid=0x2 nid=NA waiting
java.lang.Thread.State: WAITING
at java.lang.Object.wait(Object.java:-1)
at java.lang.Object.wait(Object.java:502)
at java.lang.ref.Reference.tryHandlePending(Reference.java:191)
at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:153)
"Signal Dispatcher@688" daemon prio=9 tid=0x4 nid=NA runnable
java.lang.Thread.State: RUNNABLE
I think logback has a problem with the child process's console output, though I don't know in detail why it happens.
Redirecting the child process's output solves the issue, e.g. by calling redirectOutput() on your ProcessBuilder instance before starting the child:
processBuilder.redirectOutput(new File("/dev/null"));
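One plausible explanation, which is an assumption and not confirmed by the posts above, is that the parent never reads the child's stdout pipe, so the child blocks on write once the OS pipe buffer is full. Under that assumption, a portable variant of the redirect could look like the sketch below. The class and method names are hypothetical, and ProcessBuilder.Redirect.DISCARD requires Java 9 or later (on Java 8, redirecting to a file or using inheritIO() are the alternatives).

```java
import java.io.File;
import java.util.List;

public class QuietChildLauncher {

    // Builds a launcher whose stdout and stderr are discarded,
    // so the child never has to wait for the parent to read its output.
    static ProcessBuilder discardingOutput(List<String> command) {
        ProcessBuilder pb = new ProcessBuilder(command);
        pb.redirectErrorStream(true);                       // merge stderr into stdout
        pb.redirectOutput(ProcessBuilder.Redirect.DISCARD); // Java 9+, OS-independent /dev/null
        return pb;
    }

    public static void main(String[] args) throws Exception {
        // Uses the JVM that runs this program as a stand-in for ChildProcess.
        String javaBin = System.getProperty("java.home")
                + File.separator + "bin" + File.separator + "java";
        Process child = discardingOutput(List.of(javaBin, "-version")).start();
        System.out.println("child exited with " + child.waitFor());
    }
}
```

If the child's output is actually needed, the parent would instead have to drain the stream (for example on a separate thread) while waiting for the process to exit.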
I have been trying to use the LMAX Disruptor to buffer the content produced by one of my programs and publish it to another program as a batch of records (I still have not managed to get the consumer batching part done). Even without batching the records, it works as it should. My problem is that even though I call
`disruptor.shutdown()` and `executorService.shutdownNow()`
as shown in one of the examples, the program does not stop executing. It even executes the statements below those calls. When I print
executorService.isShutdown();
it returns true. Can someone help me with this?
Edit
"pool-1-thread-1" prio=10 tid=0x00007f57581b9800 nid=0x1bec waiting on condition [0x00007f573eb0d000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000000d9110148> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
at com.lmax.disruptor.BlockingWaitStrategy.waitFor(BlockingWaitStrategy.java:45)
at com.lmax.disruptor.ProcessingSequenceBarrier.waitFor(ProcessingSequenceBarrier.java:55)
at com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:123)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Some tips that helped me:
1. Set daemon flag on ExecutorService threads
Do this with a ThreadFactory (or Guava's ThreadFactoryBuilder).
Example:
final ThreadFactory threadFactory =
    new ThreadFactory() {
        @Override
        public Thread newThread(Runnable r) {
            // Delegate thread creation to the default factory, then mark the
            // thread as a daemon so it cannot keep the JVM alive.
            final ThreadFactory delegate = Executors.defaultThreadFactory();
            final Thread thread = delegate.newThread(r);
            thread.setDaemon(true);
            return thread;
        }
    };
final ExecutorService executorService =
    Executors.newFixedThreadPool(threadCount, threadFactory);
2. Shutdown order
Disruptor.shutdown(long, TimeUnit)
Disruptor.halt()
ExecutorService.shutdown()
ExecutorService.awaitTermination(long, TimeUnit)
Impatient shutdown example:
try {
    disruptor.shutdown(0, TimeUnit.NANOSECONDS);
    // if shutdown is successful:
    // 1. exception is not thrown (obviously)
    // 2. Disruptor.halt() is called automatically (less obvious)
}
catch (TimeoutException e) {
    disruptor.halt();
}
executorService.shutdown();
executorService.awaitTermination(0, TimeUnit.NANOSECONDS);
3. Use a Shutdown Hook
These are called even when System.exit(int) is called, but not if your JVM is killed with SIGKILL (or the equivalent on non-POSIX platforms).
Runtime.getRuntime()
    .addShutdownHook(
        new Thread(
            () -> {
                // shutdown here
            }));
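Tips 1 and 2 can be tried out without the Disruptor on the classpath. The following is a minimal sketch using only java.util.concurrent; the class name GracefulShutdown and its helper methods are hypothetical, and the Disruptor-specific calls (shutdown/halt) would come before the executor shutdown in a real program.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.TimeUnit;

public class GracefulShutdown {

    // Wraps the default factory and marks every created thread as a daemon.
    static ThreadFactory daemonFactory() {
        ThreadFactory delegate = Executors.defaultThreadFactory();
        return r -> {
            Thread t = delegate.newThread(r);
            t.setDaemon(true);
            return t;
        };
    }

    // Orderly shutdown first; forced shutdown if the timeout expires.
    // Returns true if the pool has terminated.
    static boolean shutdownAndWait(ExecutorService pool, long timeout, TimeUnit unit) {
        pool.shutdown();
        try {
            if (!pool.awaitTermination(timeout, unit)) {
                pool.shutdownNow(); // interrupts the worker threads
                return pool.awaitTermination(timeout, unit);
            }
            return true;
        } catch (InterruptedException e) {
            pool.shutdownNow();
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(2, daemonFactory());
        pool.submit(() -> System.out.println("daemon? " + Thread.currentThread().isDaemon()));
        System.out.println("terminated: " + shutdownAndWait(pool, 1, TimeUnit.SECONDS));
    }
}
```

Note that shutdownNow() only interrupts the workers; a task that ignores interruption can still keep a non-daemon thread alive, which is why the daemon flag is a useful safety net.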
Your Java process only stops when all non-daemon threads have finished.
Probably some thread is still running, perhaps waiting on a lock or stuck in a loop.
To see which threads are still running, you can use the JDK tools:
Use jps to get the IDs of the running Java processes:
C:\DVE\jdk\jdk8u45x64\jdk1.8.0_45\bin>jps
4112 TestExMain
With the right id for your program use the command jstack:
C:\DVE\jdk\jdk8u45x64\jdk1.8.0_45\bin>jstack 4112
2015-09-17 09:12:45
Full thread dump Java HotSpot(TM) 64-Bit Server VM (25.45-b02 mixed mode):
"Service Thread" #9 daemon prio=9 os_prio=0 tid=0x000000001d208800 nid=0x1b7c runnable [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"main" #1 prio=5 os_prio=0 tid=0x0000000002260800 nid=0x1324 waiting on condition [0x000000000224f000]
java.lang.Thread.State: TIMED_WAITING (sleeping)
at java.lang.Thread.sleep(Native Method)
at com.example.TestExMain.main(TestExMain.java:8)
For example, here you'll see a thread named Service Thread, which is a daemon; this thread won't stop your program from shutting down.
The thread main is not a daemon thread; the Java process will wait for this thread to finish before it stops.
For each thread you'll see a stack trace showing where the thread currently is; with that you can find the code that possibly keeps the thread from finishing.
The specific thread in your dump is blocked somehow (I'm not sure why; it could be a wait() call, a synchronized block or some other locking mechanism). If that thread doesn't stop when you call disruptor.shutdown(), it might be a bug in the library you use.
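The same distinction can be checked from inside the program, for instance just before the point where it is expected to exit. This is a minimal sketch (the class name NonDaemonThreads is hypothetical) that lists the live non-daemon threads, i.e. the ones that keep the JVM running:

```java
import java.util.ArrayList;
import java.util.List;

public class NonDaemonThreads {

    // Collects the names of all live threads that are not daemons.
    static List<String> liveNonDaemonThreadNames() {
        List<String> names = new ArrayList<>();
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            if (t.isAlive() && !t.isDaemon()) {
                names.add(t.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        for (String name : liveNonDaemonThreadNames()) {
            System.out.println("keeps the JVM alive: " + name);
        }
    }
}
```

A worker thread of a default ExecutorService, such as "pool-1-thread-1" in the dump from the question, would appear in this list for as long as it is parked waiting for work.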
I think I found a situation where mixed usage of log4j a) directly and b) via commons-logging causes some kind of class-loading deadlock. I'm not sure if such a situation is possible at all (shouldn't the JVM detect that?) and what to do about it.
The problem
In our build system, we currently run our unit tests sequentially. To make the build faster, we obviously can change that to run our unit tests in parallel. However, if we do so, some builds run into an execution timeout. When analysing the thread dumps of such "hanging builds", we find ourselves in different modules with different tests involved most of the time. But it always boils down to two threads trying to initialize a Logger: one with Logger.getLogger (directly using log4j), the other with LogFactory.getLog (using commons-logging).
My analysis
So we have one thread (the one using log4j directly) waiting at this place:
"pool-1-thread-3" prio=10 tid=0x00007f6528ca6000 nid=0x6f8f in Object.wait() [0x00007f64d9ca6000]
java.lang.Thread.State: RUNNABLE
at org.apache.log4j.LogManager.<clinit>(LogManager.java:82)
at org.apache.log4j.Logger.getLogger(Logger.java:117)
at de.is24.platform.contacts.domain.PhoneNumberFormat.<clinit>(PhoneNumberFormat.java:21)
which, unfortunately, is a rather "crowded" line:
Hierarchy h = new Hierarchy(new RootLogger((Level) Level.DEBUG));
And another thread (using commons-logging) waiting here:
"pool-1-thread-2" prio=10 tid=0x00007f6528bf9800 nid=0x6f8e in Object.wait() [0x00007f64d9da7000]
java.lang.Thread.State: RUNNABLE
at org.apache.log4j.Priority.<clinit>(Priority.java:45)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:171)
at org.apache.commons.logging.impl.Log4JLogger.class$(Log4JLogger.java:37)
at org.apache.commons.logging.impl.Log4JLogger.<clinit>(Log4JLogger.java:45)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
at org.apache.commons.logging.impl.LogFactoryImpl.newInstance(LogFactoryImpl.java:529)
at org.apache.commons.logging.impl.LogFactoryImpl.getInstance(LogFactoryImpl.java:235)
at org.apache.commons.logging.impl.LogFactoryImpl.getInstance(LogFactoryImpl.java:209)
at org.apache.commons.logging.LogFactory.getLog(LogFactory.java:351)
which is straightforward:
final static public Priority FATAL = new Level(FATAL_INT, "FATAL", 0);
So to me, it seems like the second thread is in the process of initializing class Priority and waits to load the Level class.
The first thread probably attempts to load the Level class (the other stuff in that line seems unrelated), and as the Level class extends Priority, waits for the Priority class to be loaded.
There we have our deadlock.
So, can you tell me if this analysis is correct? Or did I miss something?
UPDATE I
I wrote some test cases, you can find them here: https://github.com/sebastiankirsch/classloading
There are several test cases demonstrating the problem:
TestLoadingByClassForName should cause a deadlock rather quickly (every third run on my machine)
TestLoadingMixed represents a version of the problem being stripped down to the minimum of the observed stack trace; this one fails much more infrequently (roughly by factor 4)
TestMixedLoggerInstantiation mimics the behaviour: one class instantiates a logger using log4j, another using commons-logging. The deadlock is hard to catch here, as there is much more code involved - I needed to add a random sleep which certainly needs to be adapted to your machine
Here's a stack trace of the TestMixedLoggerInstantiation test case hanging.
Full thread dump Java HotSpot(TM) 64-Bit Server VM (20.1-b02 mixed mode):
"UseLog4JLogger" prio=10 tid=0x00007fa1f017d800 nid=0x7bd8 in Object.wait() [0x00007fa1e5ba4000]
java.lang.Thread.State: RUNNABLE
at org.apache.log4j.LogManager.<clinit>(LogManager.java:82)
at org.apache.log4j.Logger.getLogger(Logger.java:117)
at net.tcc.classloading.UseLog4JLogger.run(UseLog4JLogger.java:23)
"UseCommonsLoggingLogFactory" prio=10 tid=0x00007fa1f00e5000 nid=0x7bd7 in Object.wait() [0x00007fa1e5ca4000]
java.lang.Thread.State: RUNNABLE
at org.apache.log4j.Priority.<clinit>(Priority.java:45)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:169)
at org.apache.commons.logging.impl.Log4JLogger.class$(Log4JLogger.java:37)
at org.apache.commons.logging.impl.Log4JLogger.<clinit>(Log4JLogger.java:45)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
at org.apache.commons.logging.impl.LogFactoryImpl.newInstance(LogFactoryImpl.java:529)
at org.apache.commons.logging.impl.LogFactoryImpl.getInstance(LogFactoryImpl.java:235)
at org.apache.commons.logging.impl.LogFactoryImpl.getInstance(LogFactoryImpl.java:209)
at org.apache.commons.logging.LogFactory.getLog(LogFactory.java:351)
at net.tcc.classloading.UseCommonsLoggingLogFactory.run(UseCommonsLoggingLogFactory.java:13)
"ReaderThread" prio=10 tid=0x00007fa1f00ed000 nid=0x7bd6 runnable [0x00007fa1e5da6000]
java.lang.Thread.State: RUNNABLE
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:129)
at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:264)
at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:306)
at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:158)
- locked <0x00000007d7088a00> (a java.io.InputStreamReader)
at java.io.InputStreamReader.read(InputStreamReader.java:167)
at java.io.BufferedReader.fill(BufferedReader.java:136)
at java.io.BufferedReader.readLine(BufferedReader.java:299)
- locked <0x00000007d7088a00> (a java.io.InputStreamReader)
at java.io.BufferedReader.readLine(BufferedReader.java:362)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner$ReaderThread.run(RemoteTestRunner.java:140)
"Low Memory Detector" daemon prio=10 tid=0x00007fa1f009d800 nid=0x7bd4 runnable [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"C2 CompilerThread1" daemon prio=10 tid=0x00007fa1f009b800 nid=0x7bd3 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"C2 CompilerThread0" daemon prio=10 tid=0x00007fa1f0098800 nid=0x7bd2 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"Signal Dispatcher" daemon prio=10 tid=0x00007fa1f0096800 nid=0x7bd1 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"Finalizer" daemon prio=10 tid=0x00007fa1f007a000 nid=0x7bd0 in Object.wait() [0x00007fa1e6c54000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x00000007d7001300> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:118)
- locked <0x00000007d7001300> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:134)
at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:159)
"Reference Handler" daemon prio=10 tid=0x00007fa1f0078000 nid=0x7bcf in Object.wait() [0x00007fa1e6d55000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x00000007d70011d8> (a java.lang.ref.Reference$Lock)
at java.lang.Object.wait(Object.java:485)
at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:116)
- locked <0x00000007d70011d8> (a java.lang.ref.Reference$Lock)
"main" prio=10 tid=0x00007fa1f000c000 nid=0x7bc5 in Object.wait() [0x00007fa1f50b0000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x00000007d730dfd8> (a net.tcc.classloading.UseCommonsLoggingLogFactory)
at java.lang.Thread.join(Thread.java:1186)
- locked <0x00000007d730dfd8> (a net.tcc.classloading.UseCommonsLoggingLogFactory)
at java.lang.Thread.join(Thread.java:1239)
at net.tcc.classloading.TestMixedLoggerInstantiation.test(TestMixedLoggerInstantiation.java:21)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:50)
at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:467)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:683)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:390)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:197)
"VM Thread" prio=10 tid=0x00007fa1f0071800 nid=0x7bce runnable
"GC task thread#0 (ParallelGC)" prio=10 tid=0x00007fa1f001f000 nid=0x7bc6 runnable
"GC task thread#1 (ParallelGC)" prio=10 tid=0x00007fa1f0021000 nid=0x7bc7 runnable
"GC task thread#2 (ParallelGC)" prio=10 tid=0x00007fa1f0022800 nid=0x7bc8 runnable
"GC task thread#3 (ParallelGC)" prio=10 tid=0x00007fa1f0024800 nid=0x7bc9 runnable
"GC task thread#4 (ParallelGC)" prio=10 tid=0x00007fa1f0026800 nid=0x7bca runnable
"GC task thread#5 (ParallelGC)" prio=10 tid=0x00007fa1f0028000 nid=0x7bcb runnable
"GC task thread#6 (ParallelGC)" prio=10 tid=0x00007fa1f002a000 nid=0x7bcc runnable
"GC task thread#7 (ParallelGC)" prio=10 tid=0x00007fa1f002c000 nid=0x7bcd runnable
"VM Periodic Task Thread" prio=10 tid=0x00007fa1f00a8800 nid=0x7bd5 waiting on condition
JNI global references: 1168
Reproducing the deadlock
Download the code from https://github.com/sebastiankirsch/classloading.
TestLoadingByClassForName should easily cause a deadlock for you (just run it a few times); this is a prerequisite for your system/JVM eventually running into a deadlock in the complex scenario.
Now run TestMixedLoggerInstantiation several times. Note the average run time, open up UseLog4JLogger and set the sleep timer sum to a little less than the average run time. This will eventually cause a deadlock, but it occurs rarely.
Therefore, run it on the command line like this:
for (( ; ; )) do
    testExecution
done
Instead of putting the testExecution command together manually, just set a breakpoint in the test, debug, type ps -ef | grep TestMixedLoggerInstantiation in a shell, and copy the command your IDE uses.
I finally found the answer in the Java Language Specification, specifically in chapter 12.4.2 Detailed Initialization Procedure.
There it says
[...]
2) If the Class object for C indicates that initialization is in progress for C by some other thread, then [...] block the current thread until informed that the in-progress initialization has completed, [...]
7) Next, if C is a class rather than an interface, and its superclass SC has not yet been initialized, then recursively perform this entire procedure for SC
10) If the execution of the initializers completes normally, [...] label the Class object for C as fully initialized, notify all waiting threads, [...]
So the behaviour observed is exactly as specified by the JLS. I'm still a bit baffled that there's no means to detect such a deadlock. And why the threads are marked as RUNNABLE - but I guess this isn't your typical bytecode to be executed, so who knows...
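The interplay of steps 2 and 7 can be reproduced without any logging library. The following is a minimal sketch with hypothetical classes Parent and Child standing in for Priority and Level: Parent's static initializer creates a Child (like the FATAL constant), and Child extends Parent. The 300 ms pause is an assumption that merely makes the unlucky interleaving very likely, and the threads are daemons so the JVM can still exit.

```java
import java.util.concurrent.CountDownLatch;

public class ClassInitDeadlockDemo {

    // Signals that thread A has entered Parent's static initializer.
    static final CountDownLatch parentInitStarted = new CountDownLatch(1);

    static class Parent {
        static final Parent DEFAULT;
        static {
            parentInitStarted.countDown();
            pause(300);            // give thread B time to start initializing Child
            DEFAULT = new Child(); // needs Child, which thread B is initializing
        }
    }

    static class Child extends Parent { }

    static void pause(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Returns true if both threads are still stuck after the timeout.
    static boolean reproduce(long timeoutMillis) {
        Thread a = new Thread(() -> new Parent(), "init-parent");
        Thread b = new Thread(() -> {
            try {
                parentInitStarted.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            new Child(); // starts initializing Child, then waits for Parent
        }, "init-child");
        a.setDaemon(true); // daemon threads: the JVM can exit despite the deadlock
        b.setDaemon(true);
        a.start();
        b.start();
        try {
            a.join(timeoutMillis);
            b.join(timeoutMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        System.out.println(a.getName() + ": " + a.getState());
        System.out.println(b.getName() + ": " + b.getState());
        return a.isAlive() && b.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("deadlocked: " + reproduce(2000));
    }
}
```

Thread "init-parent" is inside Parent's initializer and waits for Child (step 2), while "init-child" has begun initializing Child and waits for its superclass Parent (steps 7 and 2), so neither can ever be notified (step 10).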
How to fix the issue
The solution to the problem was to get rid of commons-logging. As @Robert Johnson pointed out, this can easily be done by using org.slf4j:jcl-over-slf4j instead. I also checked the SLF4J code: it does not "take advantage" of the unlucky log4j design.
Your analysis is correct.
You can try to run your concurrent tests in different classloaders; see the discussion here on how to do it. There is an open bug in Surefire and a discussion in the JUnit groups on this issue.
As a workaround you can use org.apache.myfaces.test.runners.TestPerClassLoaderRunner as described in the link above.