Running jstack -e produces a dump like this (at least in Java 19):
"Thread-0" #25 [23276] prio=5 os_prio=0 cpu=0.00ms elapsed=593.30s allocated=6720B defined_classes=1 tid=0x0000023dafe60b20 nid=23276 waiting for monitor entry [0x000000796a4ff000]
What does "defined_classes" mean here?
This output comes from the enhancement JDK-8200720. Its implementation defines this value as follows:
defined_classes=... : The number of classes defined by this thread
This might hint at a thread that loads too many classes.
It was added in commit d1b24f2ceca5 on 25 Jun 2018.
This attribute gives the number of classes defined by this thread.
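To compare this counter across threads or across successive dumps, it can be extracted from each thread header line. The following is only a minimal sketch: DefinedClassesParser is a hypothetical helper, not part of the JDK, and it assumes the header format shown in the dump above.

```java
import java.util.OptionalLong;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not part of the JDK): reads defined_classes from a jstack -e header line.
final class DefinedClassesParser {
    private static final Pattern DEFINED_CLASSES = Pattern.compile("\\bdefined_classes=(\\d+)");

    /** Returns the defined_classes value, or empty if the line does not carry that field. */
    static OptionalLong definedClasses(String headerLine) {
        Matcher m = DEFINED_CLASSES.matcher(headerLine);
        return m.find() ? OptionalLong.of(Long.parseLong(m.group(1))) : OptionalLong.empty();
    }

    public static void main(String[] args) {
        // Header line taken from the dump in the question.
        String line = "\"Thread-0\" #25 [23276] prio=5 os_prio=0 cpu=0.00ms elapsed=593.30s "
                + "allocated=6720B defined_classes=1 tid=0x0000023dafe60b20 nid=23276 "
                + "waiting for monitor entry [0x000000796a4ff000]";
        System.out.println(definedClasses(line)); // prints OptionalLong[1]
    }
}
```

A thread whose value keeps growing between dumps would be the one to inspect for excessive class loading.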
Related
While parsing a date string, the thread hung in a native method. It has been stuck in the same state for the last 18 hours (thread dumps taken at different times show the same stack for the thread in question).
Please suggest what might be the cause.
"DateParsingThread - 001" #228 daemon prio=5 os_prio=0 tid=0x00007f98e0029000 nid=0x9c2 runnable [0x00007f98016ee000]
java.lang.Thread.State: RUNNABLE
at java.security.AccessController.doPrivileged(Native Method)
at sun.util.resources.LocaleData.getBundle(LocaleData.java:163)
at sun.util.resources.LocaleData.getDateFormatData(LocaleData.java:127)
at sun.util.locale.provider.LocaleResources.getCalendarNames(LocaleResources.java:325)
at sun.util.locale.provider.CalendarNameProviderImpl.getDisplayNamesImpl(CalendarNameProviderImpl.java:157)
at sun.util.locale.provider.CalendarNameProviderImpl.getDisplayNames(CalendarNameProviderImpl.java:139)
at sun.util.locale.provider.CalendarDataUtility$CalendarFieldValueNamesMapGetter.getObject(CalendarDataUtility.java:178)
at sun.util.locale.provider.CalendarDataUtility$CalendarFieldValueNamesMapGetter.getObject(CalendarDataUtility.java:154)
at sun.util.locale.provider.LocaleServiceProviderPool.getLocalizedObjectImpl(LocaleServiceProviderPool.java:281)
at sun.util.locale.provider.LocaleServiceProviderPool.getLocalizedObject(LocaleServiceProviderPool.java:265)
at sun.util.locale.provider.CalendarDataUtility.retrieveFieldValueNames(CalendarDataUtility.java:88)
at java.util.Calendar.getDisplayNames(Calendar.java:2178)
at java.text.SimpleDateFormat.getDisplayNamesMap(SimpleDateFormat.java:2366)
at java.text.SimpleDateFormat.subParse(SimpleDateFormat.java:1972)
at java.text.SimpleDateFormat.parse(SimpleDateFormat.java
at java.text.DateFormat.parse(DateFormat.java:364)
Any suggestions to find the solutions are welcome.
Has anyone had any experience with deadlocks in BeanShell? We have been encountering this recently in our production system, where script execution blocks other threads due to its lock on class loading via Tomcat. The following is the stack trace for the lock owner in the thread dump:
"Thread-64" : 150 : BLOCKED : cpu=37812500000 : cpuLoad= 0.0
BlockedCount:93354 BlockedTime:-1 LockName:java.lang.Object#219d66b6 LockOwnerID:151 LockOwnerName:Thread-65
WaitedCount:13 WaitedTime:-1 InNative:false IsSuspended:false at org.apache.catalina.webresources.AbstractSingleArchiveResourceSet.getArchiveEntries(AbstractSingleArchiveResourceSet.java:66)
at org.apache.catalina.webresources.AbstractArchiveResourceSet.getResource(AbstractArchiveResourceSet.java:262)
at org.apache.catalina.webresources.StandardRoot.getResourceInternal(StandardRoot.java:281)
at org.apache.catalina.webresources.Cache.getResource(Cache.java:62)
at org.apache.catalina.webresources.StandardRoot.getResource(StandardRoot.java:216)
at org.apache.catalina.webresources.StandardRoot.getClassLoaderResource(StandardRoot.java:225)
at org.apache.catalina.loader.WebappClassLoaderBase.findClassInternal(WebappClassLoaderBase.java:2173)
at org.apache.catalina.loader.WebappClassLoaderBase.findClass(WebappClassLoaderBase.java:811)
at org.apache.catalina.loader.WebappClassLoaderBase.loadClass(WebappClassLoaderBase.java:1260)
at org.apache.catalina.loader.WebappClassLoaderBase.loadClass(WebappClassLoaderBase.java:1119)
at java.lang.Class.forName0(Class.java:-2)
at java.lang.Class.forName(Class.java:348)
at bsh.classpath.ClassManagerImpl.classForName(null:-1)
at bsh.NameSpace.classForName(null:-1)
at bsh.NameSpace.getImportedClassImpl(null:-1)
at bsh.NameSpace.getClassImpl(null:-1)
at bsh.NameSpace.getClass(null:-1)
at bsh.Name.consumeNextObjectField(null:-1)
at bsh.Name.toObject(null:-1)
at bsh.BSHAmbiguousName.toObject(null:-1)
at bsh.BSHAmbiguousName.toObject(null:-1)
at bsh.BSHPrimaryExpression.eval(null:-1)
at bsh.BSHPrimaryExpression.eval(null:-1)
at bsh.BSHVariableDeclarator.eval(null:-1)
at bsh.BSHTypedVariableDeclaration.eval(null:-1)
at bsh.Interpreter.eval(null:-1)
at bsh.Interpreter.eval(null:-1)
at bsh.Interpreter.eval(null:-1)
at my.package.MyClassFile(MyClassFile:2332)
I see that Groovy is a more popular choice for Java scripting, but I haven't seen many posts where it says that bsh can cause deadlocks.
It would be good to get some ideas from SO users.
There is a fix for one deadlock, reported in the issue "GUI does not start in Java 8", which was found in BeanShell version 2.0b5 (almost the latest).
You can open a new issue in the BeanShell project.
It may be connected to ClassManagerImpl:
Bsh has a multi-tiered class loading architecture. No class loader is
created unless/until a class is generated, the classpath is modified,
or a class is reloaded.
Note: we may need some synchronization in here
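Before filing an issue, it may help to check whether the JVM itself sees a true deadlock cycle or only heavy contention on the class-loading lock. The sketch below uses the standard ThreadMXBean API; the class name DeadlockCheck is made up for illustration, and the report format is an assumption, not something BeanShell or Tomcat provides.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Illustrative helper: asks the JVM for threads that are in a lock cycle.
final class DeadlockCheck {
    /** Returns one line per deadlocked thread, or an empty string if none are found. */
    static String report() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        long[] ids = mx.findDeadlockedThreads(); // null when no deadlock is detected
        if (ids == null) {
            return "";
        }
        StringBuilder sb = new StringBuilder();
        for (ThreadInfo info : mx.getThreadInfo(ids, true, true)) {
            if (info == null) {
                continue; // the thread terminated in the meantime
            }
            sb.append(info.getThreadName())
              .append(" waits for ").append(info.getLockName())
              .append(" held by ").append(info.getLockOwnerName())
              .append(System.lineSeparator());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String r = report();
        System.out.println(r.isEmpty() ? "no deadlock detected" : r);
    }
}
```

If this reports nothing while threads stay BLOCKED, the problem is more likely a long-held lock (for example slow archive scanning under the class loader) than a cycle.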
I am seeing very strange behaviour. My code executes fine up to a point, but then one method calls another and the called method never appears to run (I cannot see the log statement that is on its first line).
"jaxws-engine-1-thread-2" id=447 idx=0x73c tid=4031 prio=5 alive, parked, native_blocked, daemon
at jrockit/vm/Locks.park0(J)V(Native Method)
at jrockit/vm/Locks.park(Locks.java:2230)
at sun/misc/Unsafe.park(ZJ)V(Native Method)
at java/util/concurrent/locks/LockSupport.parkNanos(LockSupport.java:196)
at java/util/concurrent/SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:424)
at java/util/concurrent/SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:323)
at java/util/concurrent/SynchronousQueue.poll(SynchronousQueue.java:874)
at java/util/concurrent/ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:955)
at java/util/concurrent/ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:917)
at java/lang/Thread.run(Thread.java:682)
at jrockit/vm/RNI.c2java(JJJJJ)V(Native Method)
-- end of trace
Code -
public static void startMicroSessionTimer(TimerName timerName, Data Data) {
    logger.debug("Starting a micro-timer for timer name: " + timerName);
    // Start a micro timer to process the SOAP response in a worker thread
    SipApplicationSession applicationSession = Util.getAppSession((String) Data.get(DataAttribute.ID));
    Util.AbcTimer(applicationSession, 1L, timerName.getTimerName());
}
public static void AbcTimer(SipApplicationSession appSession,
        long timeInMillies, String timerName) {
    logger.debug("Inside AbcTimer");
    // Some logic
}
Logs -
16 May 2018 09:13:07,506 [jaxws-engine-1-thread-12] DEBUG -----SOME LOGS…..
16 May 2018 09:13:07,506 [jaxws-engine-1-thread-12] DEBUG [AbcUtils] [ODhlNjQ0ZjAzMTMzN2U5MGNhMTE2MTgxOTg2MTdmYjA.] Starting a micro-timer for timer name: HAHAHA
I cannot see any log after the above line for thread jaxws-engine-1-thread-12. The message "Inside AbcTimer" should appear, since it is logged at the start of the called method, AbcTimer. No exception occurred.
I have also taken a thread dump, which I have posted above.
I am not sure, but I think this is a machine-specific issue. I also googled it and saw that other people have hit this kind of issue, but I did not find a solution.
I am using the JRockit version below:
java version "1.6.0_141"
Java(TM) SE Runtime Environment (build 1.6.0_141-b12)
Oracle JRockit(R) (build R28.3.13-15-173128-1.6.0_141-20161219-1845-linux-x86_64, compiled mode)
I am using the Curator framework to connect to a ZooKeeper server, but I am running into a weird DNS resolution issue. Here is the jstack dump for the thread:
#21 prio=5 os_prio=0 tid=0x0000000001888800 nid=0x3a46 runnable [0x00007f25e69f3000]
java.lang.Thread.State: RUNNABLE
at java.net.Inet4AddressImpl.lookupAllHostAddr(Native Method)
at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:928)
at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1323)
at java.net.InetAddress.getAllByName0(InetAddress.java:1276)
at java.net.InetAddress.getAllByName(InetAddress.java:1192)
at java.net.InetAddress.getAllByName(InetAddress.java:1126)
at org.apache.zookeeper.client.StaticHostProvider.resolveAndShuffle(StaticHostProvider.java:117)
at org.apache.zookeeper.client.StaticHostProvider.<init>(StaticHostProvider.java:81)
at org.apache.zookeeper.ZooKeeper.<init>(ZooKeeper.java:1096)
at org.apache.zookeeper.ZooKeeper.<init>(ZooKeeper.java:1006)
at org.apache.zookeeper.ZooKeeper.<init>(ZooKeeper.java:804)
at org.apache.zookeeper.ZooKeeper.<init>(ZooKeeper.java:679)
at com.netflix.curator.HandleHolder$1.getZooKeeper(HandleHolder.java:72)
- locked <0x00000000fd761f40> (a com.netflix.curator.HandleHolder$1)
at com.netflix.curator.HandleHolder.getZooKeeper(HandleHolder.java:46)
at com.netflix.curator.ConnectionState.reset(ConnectionState.java:122)
at com.netflix.curator.ConnectionState.start(ConnectionState.java:95)
at com.netflix.curator.CuratorZookeeperClient.start(CuratorZookeeperClient.java:137)
at com.netflix.curator.framework.imps.CuratorFrameworkImpl.start(CuratorFrameworkImpl.java:167)
The thread seems to be stuck in the native method and never returns. It also occurs very randomly, so I haven't been able to reproduce it consistently. Any ideas?
We are also trying to solve this problem. It looks like this is due to either a glibc bug: https://bugzilla.kernel.org/show_bug.cgi?id=99671 or a kernel bug: https://bugzilla.redhat.com/show_bug.cgi?id=1209433, depending on who you ask ;)
Also worth reading: https://access.redhat.com/security/cve/cve-2013-7423 and https://alas.aws.amazon.com/ALAS-2015-617.html
To confirm that this is indeed the case, attach gdb to the Java process:
gdb --pid <JavaProcessPid>
then from gdb:
info threads
find a thread that does recvmsg:
thread <HangingThreadId>
and then
backtrace
and if you see something like this then you know that glibc/kernel upgrade will help:
#0 0x00007fc726ff27cd in recvmsg () from /lib64/libc.so.6
#1 0x00007fc727018765 in make_request () from /lib64/libc.so.6
#2 0x00007fc727018b9a in __check_pf () from /lib64/libc.so.6
#3 0x00007fc726fdbd57 in getaddrinfo () from /lib64/libc.so.6
#4 0x00007fc706dd9635 in Java_java_net_Inet6AddressImpl_lookupAllHostAddr () from /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.65-0.b17.el6_7.x86_64/jre/lib/amd64/libnet.so
Update: Looks like the kernel wins. Please see this thread: http://www.gossamer-threads.com/lists/linux/kernel/2264958 for details.
There is also a tool to verify whether your system is affected by the kernel bug. You can use this simple program: https://gist.github.com/stevenschlansker/6ad46c5ccb22bc4f3473
To verify:
curl -o pf_dump.c https://gist.githubusercontent.com/stevenschlansker/6ad46c5ccb22bc4f3473/raw/22cfe72f6708de1e3468c1e0fa3888aafae42db4/pf_dump.c
gcc pf_dump.c -pthread -o pf_dump
./pf_dump
And if the output is:
[26170] glibc: check_pf: netlink socket read timeout
Aborted
Then the system is affected. If the output is something like:
exit success [7618] exit success [7265] exit success
then the system is ok.
In the AWS context, upgrading AMIs to (2016.3.2) with the new kernel seems to have fixed the problem.
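Until the kernel or glibc can be upgraded, one way to keep the calling thread from hanging forever is to run the lookup on a separate thread and give up after a timeout. This is only a workaround sketch, not a fix: BoundedDnsLookup and its resolve method are hypothetical names, and a lookup thread that is stuck in native code stays stuck (it is merely abandoned). It also does not change how ZooKeeper or Curator resolve hosts internally.

```java
import java.net.InetAddress;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper: bounds the time the caller waits for a DNS lookup.
final class BoundedDnsLookup {
    // Daemon threads, so a lookup stuck in native code cannot keep the JVM alive.
    private static final ExecutorService POOL = Executors.newCachedThreadPool(r -> {
        Thread t = new Thread(r, "dns-lookup");
        t.setDaemon(true);
        return t;
    });

    /** Resolves host, or returns null if resolution fails or exceeds timeoutMillis. */
    static InetAddress[] resolve(String host, long timeoutMillis) {
        Callable<InetAddress[]> task = () -> InetAddress.getAllByName(host);
        Future<InetAddress[]> future = POOL.submit(task);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException | ExecutionException e) {
            future.cancel(true); // best effort; cannot unblock a thread stuck in native code
            return null;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        }
    }

    public static void main(String[] args) {
        InetAddress[] addrs = resolve("localhost", 2000);
        System.out.println(addrs == null ? "lookup failed or timed out" : addrs.length + " address(es)");
    }
}
```

A cached thread pool is used so that one abandoned lookup does not block later ones; if hangs are frequent, the number of stuck threads will grow, which is itself a useful signal in the next thread dump.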
I see this thread in my jstack that does not appear to be moving at all. Any pointers on how to figure out why it's stuck? I don't see any locks or anything, the only suspicious thing is the "Object.wait()" reference.
"main" prio=10 tid=0x00007f3a8000b000 nid=0x942 in Object.wait() [0x00007f3a89539000]
java.lang.Thread.State: RUNNABLE
at org.joda.time.DateTimeZone.<clinit>(DateTimeZone.java:95)
at org.joda.time.format.DateTimeFormatter.withZoneUTC(DateTimeFormatter.java:301)
at com.amazonaws.auth.AWS4Signer.<clinit>(AWS4Signer.java:44)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:525)
at java.lang.Class.newInstance0(Class.java:372)
at java.lang.Class.newInstance(Class.java:325)
at com.amazonaws.auth.SignerFactory.createSigner(SignerFactory.java:121)
at com.amazonaws.auth.SignerFactory.lookupAndCreateSigner(SignerFactory.java:107)
at com.amazonaws.auth.SignerFactory.getSigner(SignerFactory.java:80)
at com.amazonaws.AmazonWebServiceClient.computeSignerByServiceRegion(AmazonWebServiceClient.java:311)
at com.amazonaws.AmazonWebServiceClient.computeSignerByURI(AmazonWebServiceClient.java:284)
at com.amazonaws.AmazonWebServiceClient.setEndpoint(AmazonWebServiceClient.java:160)
Also, line 95 in DateTimeZone.java at the top of the stack is this:
public static final DateTimeZone UTC = new FixedDateTimeZone("UTC", "UTC", 0, 0);
There is another thread that's also stuck in a similar place:
"FeatureManagerService" daemon prio=10 tid=0x00007f3a8056a800 nid=0x94f in Object.wait() [0x00007f3a84151000]
java.lang.Thread.State: RUNNABLE
at com.amazonaws.util.DateUtils.<clinit>(DateUtils.java:35)
at com.amazonaws.services.s3.internal.ServiceUtils.<clinit>(ServiceUtils.java:59)
at com.amazonaws.services.s3.internal.S3Signer.sign(S3Signer.java:123)
at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:348)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:245)
at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3711)
at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3664)
at com.amazonaws.services.s3.AmazonS3Client.listObjects(AmazonS3Client.java:620)
at com.amazonaws.services.s3.AmazonS3Client.listObjects(AmazonS3Client.java:603)
And DateUtils.java:35 is:
private static final DateTimeZone GMT = new FixedDateTimeZone("GMT", "GMT", 0, 0);
I already tried looking into it with jvisualvm/jhat but didn't really get very far.
Note that this is a live process, not something I am running in my debugger locally and after restart it works fine so it appears to be intermittent.
Any help would be appreciated!
Thanks!
Update: using the mixed mode in jstack seems to give some more insight. The thread is waiting in pthread_cond_wait:
----------------- 2370 -----------------
0x00007f3a89115414 __pthread_cond_wait + 0xc4
0x00007f3a8833a03c _ZN13ObjectMonitor4waitElbP6Thread + 0x7dc
0x00007f3a88117fbb _ZN13instanceKlass15initialize_implE19instanceKlassHandleP6Thread + 0x36b
0x00007f3a881182ca _ZN13instanceKlass10initializeEP6Thread + 0x6a
0x00007f3a8814d3f3 _ZN18InterpreterRuntime4_newEP10JavaThreadP19constantPoolOopDesci + 0x143
0x00007f3a7d01d9ee * org.joda.time.DateTimeZone.<clinit>() bci:0 line:95 (Interpreted frame)
0x00007f3a7d0004f7 <StubRoutines>
...
Maybe it's not stuck. It's just calling new DateTimeZone() in a loop, and the constructor does some computations. Every time you look at this thread, it's inside DateTimeZone() - but it's a different DateTimeZone() each time.
Which then gets discarded. Happened to me quite a few times.
As found by @naumcho, this proved to be a bug (https://github.com/JodaOrg/joda-time/issues/171).
Based on the information provided (stack traces of two different threads + source line) one could have suspected a deadlock because both threads are trying to instantiate a new object of the same type FixedDateTimeZone.
The next step to confirm that would be to use GDB to inspect the stack frames around __pthread_cond_wait().
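The kind of deadlock suspected here can be reproduced in isolation. The sketch below is a simplified analogue, not Joda-Time's actual code: the classes A and B and the latch are invented for illustration. Each thread starts initializing one class and, from inside the static initializer, needs the other class, whose initialization is held by the other thread.

```java
import java.util.concurrent.CountDownLatch;

// Simplified, hypothetical reproduction of a class-initialization deadlock.
final class ClassInitDeadlockDemo {
    // Released once both static initializers are running, so each thread holds one init lock.
    static final CountDownLatch BOTH_STARTED = new CountDownLatch(2);

    static void rendezvous() {
        BOTH_STARTED.countDown();
        try {
            BOTH_STARTED.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    static class A {
        static { rendezvous(); B.touch(); } // needs B while A is still initializing
        static void touch() { }
    }

    static class B {
        static { rendezvous(); A.touch(); } // needs A while B is still initializing
        static void touch() { }
    }

    /** Triggers both initializations and reports whether both threads are still stuck after waitMillis. */
    static boolean deadlocks(long waitMillis) {
        Thread t1 = new Thread(() -> A.touch(), "init-A");
        Thread t2 = new Thread(() -> B.touch(), "init-B");
        t1.setDaemon(true); // daemon threads let the JVM exit despite the deadlock
        t2.setDaemon(true);
        t1.start();
        t2.start();
        try {
            t1.join(waitMillis);
            t2.join(waitMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return t1.isAlive() && t2.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("deadlocked: " + deadlocks(500)); // prints "deadlocked: true"
    }
}
```

No Java monitor is held in such a cycle, which is consistent with the question's dump showing no locks; the native frames above (instanceKlass::initialize_impl calling ObjectMonitor::wait) are what a thread waiting for another thread's class initialization would be expected to look like.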