I am using AsyncHttpClient 2.3.0 with the default configuration.
I've noticed that AHC creates two types of threads (from the thread dump):
1)
"AsyncHttpClient-timer-478-1" - Thread t#30390 java.lang.Thread.State: TIMED_WAITING
at java.lang.Thread.$$YJP$$sleep(Native Method)
at java.lang.Thread.sleep(Thread.java)
at io.netty.util.HashedWheelTimer$Worker.waitForNextTick(HashedWheelTimer.java:560)
at io.netty.util.HashedWheelTimer$Worker.run(HashedWheelTimer.java:459)
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.lang.Thread.run(Thread.java:748)
2)
"AsyncHttpClient-3-4" - Thread t#20320 java.lang.Thread.State: RUNNABLE
at sun.nio.ch.EPollArrayWrapper.$$YJP$$epollWait(Native Method)
at sun.nio.ch.EPollArrayWrapper.epollWait(EPollArrayWrapper.java)
at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93)
at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
- locked <16163575> (a io.netty.channel.nio.SelectedSelectionKeySet)
- locked <49280039> (a java.util.Collections$UnmodifiableSet)
- locked <2decd496> (a sun.nio.ch.EPollSelectorImpl)
at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
at io.netty.channel.nio.SelectedSelectionKeySetSelector.select(SelectedSelectionKeySetSelector.java:62)
at io.netty.channel.nio.NioEventLoop.select(NioEventLoop.java:753)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:409)
at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:886)
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.lang.Thread.run(Thread.java:748)
I expected AsyncHttpClient to use only a few threads under the hood. But after a few days of running, it has created ~500 AsyncHttpClient-timer-xxx-x threads and only a few AsyncHttpClient-x-x threads.
The client is not called very intensively, probably also around 500 times over this period.
Only executeRequest is used (execute the request and call get on the returned future): https://static.javadoc.io/org.asynchttpclient/async-http-client/2.3.0/org/asynchttpclient/AsyncHttpClient.html#executeRequest-org.asynchttpclient.Request-org.asynchttpclient.AsyncHandler-
<T> ListenableFuture<T> executeRequest(Request request, AsyncHandler<T> handler);
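A minimal sketch of that call pattern (client is the shared AsyncHttpClient instance; the URL and handler below are illustrative, not the actual application code):

import org.asynchttpclient.AsyncCompletionHandler;
import org.asynchttpclient.Dsl;
import org.asynchttpclient.ListenableFuture;
import org.asynchttpclient.Request;
import org.asynchttpclient.Response;

// Build a request, execute it, then block on the returned future.
Request request = Dsl.get("http://example.com/api").build();   // illustrative URL
ListenableFuture<Response> future = client.executeRequest(request,
        new AsyncCompletionHandler<Response>() {
            @Override
            public Response onCompleted(Response response) {
                return response;
            }
        });
Response response = future.get();   // "get on returned future"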
I've seen a page about connection pool configuration (https://github.com/AsyncHttpClient/async-http-client/wiki/Connection-pooling), but nothing about thread pool configuration.
What is the difference between these two types of threads, and what can cause such a large number of threads to be created? Is there any configuration I should apply?
AHC has two types of threads:
1) For I/O operations. In your dump, these are the AsyncHttpClient-x-x threads; AHC creates 2 * core_number of those.
2) For timeouts. In your dump, this is the AsyncHttpClient-timer-1-1 thread; there should be only one.
Any different number means you're creating multiple clients.
Source: issue on GitHub https://github.com/AsyncHttpClient/async-http-client/issues/1658
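For reference, a minimal sketch of the usual remedy when the timer threads multiply, assuming the extra threads come from constructing a new client per request: create one client, reuse it everywhere, and close it only on shutdown (Dsl.asyncHttpClient() is AHC 2.x's standard factory; the field name is illustrative):

import org.asynchttpclient.AsyncHttpClient;
import org.asynchttpclient.Dsl;

// One client for the whole application. Each asyncHttpClient() call creates its own
// Netty event loop group and its own HashedWheelTimer ("AsyncHttpClient-timer-..."),
// so building a client per request leaves behind one timer thread per client that is never closed.
private static final AsyncHttpClient CLIENT = Dsl.asyncHttpClient();

// ... use CLIENT.executeRequest(...) for every request ...

// CLIENT.close();   // once, on application shutdown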
Related
I am using Log4j2 with async logging. There are more than 10 threads running in parallel in my application, and one thread is blocking all the others.
The thread dump shows the following:
"Scheduler_Worker-7" #143 prio=5 os_prio=0 tid=0x00007fae2ac55800 nid=0x18fc runnable [0x00007faceab12000]
java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:338)
at com.lmax.disruptor.MultiProducerSequencer.next(MultiProducerSequencer.java:136)
at com.lmax.disruptor.MultiProducerSequencer.next(MultiProducerSequencer.java:105)
at com.lmax.disruptor.RingBuffer.publishEvent(RingBuffer.java:465)
at com.lmax.disruptor.dsl.Disruptor.publishEvent(Disruptor.java:331)
at org.apache.logging.log4j.core.async.AsyncLoggerDisruptor.enqueueLogMessageWhenQueueFull(AsyncLoggerDisruptor.java:236)
- locked <0x00000005522ce890> (a java.lang.Object)
at org.apache.logging.log4j.core.async.AsyncLogger.handleRingBufferFull(AsyncLogger.java:246)
at org.apache.logging.log4j.core.async.AsyncLogger.publish(AsyncLogger.java:230)
at org.apache.logging.log4j.core.async.AsyncLogger.logWithThreadLocalTranslator(AsyncLogger.java:202)
at org.apache.logging.log4j.core.async.AsyncLogger.access$100(AsyncLogger.java:67)
at org.apache.logging.log4j.core.async.AsyncLogger$1.log(AsyncLogger.java:157)
at org.apache.logging.log4j.core.async.AsyncLogger.logMessage(AsyncLogger.java:130)
at org.apache.logging.log4j.spi.ExtendedLoggerWrapper.logMessage(ExtendedLoggerWrapper.java:222)
at org.apache.logging.log4j.spi.AbstractLogger.log(AbstractLogger.java:2117)
at org.apache.logging.log4j.spi.AbstractLogger.tryLogMessage(AbstractLogger.java:2205)
at org.apache.logging.log4j.spi.AbstractLogger.logMessageTrackRecursion(AbstractLogger.java:2159)
at org.apache.logging.log4j.spi.AbstractLogger.logMessageSafely(AbstractLogger.java:2142)
at org.apache.logging.log4j.spi.AbstractLogger.logMessage(AbstractLogger.java:2034)
at org.apache.logging.log4j.spi.AbstractLogger.logIfEnabled(AbstractLogger.java:1899)
at com.ragi.common.util.Log4jWrapper.debug(Log4jWrapper.java:106)
The stacks of the other threads look like this:
"Scheduler_Worker-6" #142 prio=5 os_prio=0 tid=0x00007fae2ad2c800 nid=0x18fb waiting for monitor entry [0x00007faceac13000]
java.lang.Thread.State: BLOCKED (on object monitor)
at org.apache.logging.log4j.core.async.AsyncLoggerDisruptor.enqueueLogMessageWhenQueueFull(AsyncLoggerDisruptor.java:236)
- waiting to lock <0x00000005522ce890> (a java.lang.Object)
at org.apache.logging.log4j.core.async.AsyncLogger.handleRingBufferFull(AsyncLogger.java:246)
at org.apache.logging.log4j.core.async.AsyncLogger.publish(AsyncLogger.java:230)
at org.apache.logging.log4j.core.async.AsyncLogger.logWithThreadLocalTranslator(AsyncLogger.java:202)
at org.apache.logging.log4j.core.async.AsyncLogger.access$100(AsyncLogger.java:67)
at org.apache.logging.log4j.core.async.AsyncLogger$1.log(AsyncLogger.java:157)
at org.apache.logging.log4j.core.async.AsyncLogger.logMessage(AsyncLogger.java:130)
at org.apache.logging.log4j.spi.ExtendedLoggerWrapper.logMessage(ExtendedLoggerWrapper.java:222)
at org.apache.logging.log4j.spi.AbstractLogger.log(AbstractLogger.java:2117)
at org.apache.logging.log4j.spi.AbstractLogger.tryLogMessage(AbstractLogger.java:2205)
at org.apache.logging.log4j.spi.AbstractLogger.logMessageTrackRecursion(AbstractLogger.java:2159)
at org.apache.logging.log4j.spi.AbstractLogger.logMessageSafely(AbstractLogger.java:2142)
at org.apache.logging.log4j.spi.AbstractLogger.logMessage(AbstractLogger.java:2022)
at org.apache.logging.log4j.spi.AbstractLogger.logIfEnabled(AbstractLogger.java:1875)
at com.ragi.common.util.Log4jWrapper.debug(Log4jWrapper.java:94)
This clearly shows that the Scheduler_Worker-7 thread is holding the lock on 0x00000005522ce890 while other threads (for example, Scheduler_Worker-6) are waiting to acquire it.
Also, from heap dumps, I observed that the log ring buffer is full. Since I am using the default DefaultAsyncQueueFullPolicy, I expected that when the buffer is full, new log events would go directly to the appender instead of the buffer.
Why is the Scheduler_Worker-7 thread holding the lock on this object for so long (ending up in the TIMED_WAITING state), and why are the other threads that are trying to log blocked?
Note: no deadlocks were observed in the thread dump.
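For context, the queue-full behaviour referenced above is pluggable via system properties (property names as defined in Log4j2's AsyncQueueFullPolicyFactory; the values here are only illustrative, not a recommendation), e.g. in log4j2.component.properties or as -D JVM arguments:

log4j2.AsyncQueueFullPolicy=Discard
log4j2.DiscardThreshold=INFO

With the Discard policy, events at or below the threshold level are dropped instead of blocking the logging thread when the ring buffer is full.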
Software library versions in use: Spring 2.2.3.RELEASE, Spring Kafka 2.x, Kafka-Client 2.0.0
Java version: OpenJDK 1.8
Platform Kafka version: Apache Kafka 2.2.0 (https://docs.confluent.io/5.2.1/release-notes.html#apache-kafka-2-2-0-cp2)
Configuration: Topic has 6 partitions.
Data rate: The incoming data rate is 3-4 msgs/second.
Scenario:
With the above configuration, a test was run continuously for 3 days. After about 2.5 days, I ran into a problem: our consumer group stopped receiving messages from the topic.
Upon detailed investigation, I found that the consumer threads were blocked; precisely 18 threads were in the BLOCKED state. The thread graph also suggests that the consumer threads were waiting on "java.nio.
Please find the logs below (screenshots omitted):
"kafka-coordinator-heartbeat-thread | release-registry-group" #128 daemon prio=5 os_prio=0 tid=0x00007f1954001800 nid=0x8a waiting for monitor entry [0x00007f197f7f6000]
java.lang.Thread.State: BLOCKED (on object monitor)
at org.apache.kafka.clients.consumer.internals.AbstractCoordinator$HeartbeatThread.run(AbstractCoordinator.java:1005)
- waiting to lock <0x00000000c3382070> (a org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
"org.springframework.kafka.KafkaListenerEndpointContainer#1-2-C-1" #127 prio=5 os_prio=0 tid=0x00007f1b6dbc1800 nid=0x89 runnable [0x00007f197f8f6000]
java.lang.Thread.State: RUNNABLE
at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93)
at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
- locked <0x00000000c33822d8> (a sun.nio.ch.Util$3)
- locked <0x00000000c33822c8> (a java.util.Collections$UnmodifiableSet)
- locked <0x00000000c33822e8> (a sun.nio.ch.EPollSelectorImpl)
at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
at org.apache.kafka.common.network.Selector.select(Selector.java:689)
at org.apache.kafka.common.network.Selector.poll(Selector.java:409)
at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:510)
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:271)
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:242)
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:218)
at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureCoordinatorReady(AbstractCoordinator.java:230)
- locked <0x00000000c3382070> (a org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:314)
at org.apache.kafka.clients.consumer.KafkaConsumer.updateAssignmentMetadataIfNeeded(KafkaConsumer.java:1218)
at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1175)
at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1154)
at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.pollAndInvoke(KafkaMessageListenerContainer.java:732)
at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.run(KafkaMessageListenerContainer.java:689)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.lang.Thread.run(Thread.java:748)
I would like to understand the root cause and possible solutions to fix it. Thanks in advance.
My application is under heavy load, and I am getting the thread dump below from
sudo -u tomcat jstack <java_process_id>
The thread below consumes messages from Kafka, and it is stuck. Since this thread is in the WAITING state, no more Kafka messages are being consumed.
"StreamThread-3" #91 daemon prio=5 os_prio=0 tid=0x00007f9b5c606000 nid=0x1e4d waiting on condition [0x00007f9b506c5000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x000000073aad9718> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
at java.util.concurrent.ArrayBlockingQueue.put(ArrayBlockingQueue.java:353)
at ch.qos.logback.core.AsyncAppenderBase.put(AsyncAppenderBase.java:160)
at ch.qos.logback.core.AsyncAppenderBase.append(AsyncAppenderBase.java:148)
at ch.qos.logback.core.UnsynchronizedAppenderBase.doAppend(UnsynchronizedAppenderBase.java:84)
at ch.qos.logback.core.spi.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:51)
at ch.qos.logback.classic.Logger.appendLoopOnAppenders(Logger.java:270)
at ch.qos.logback.classic.Logger.callAppenders(Logger.java:257)
at ch.qos.logback.classic.Logger.buildLoggingEventAndAppend(Logger.java:421)
at ch.qos.logback.classic.Logger.filterAndLog_0_Or3Plus(Logger.java:383)
at ch.qos.logback.classic.Logger.error(Logger.java:538)
at com.abc.system.solr.repo.AbstractSolrRepository.doSave(AbstractSolrRepository.java:316)
at com.abc.system.solr.repo.AbstractSolrRepository.save(AbstractSolrRepository.java:295)
I also found this post:
WAITING at sun.misc.Unsafe.park(Native Method)
but it didn't help in my case.
What else could I investigate to get more details in such a case?
I also ran into the same problem, but luckily I got my issue resolved by tuning the pool size and the number of producers and consumers.
Try to check whether there is any way to configure the following (a sketch follows below):
The size of your thread pool
The number of consumers/producers (if that is configurable in Kafka)
Make sure the thread pool has enough threads to serve both the consumers and the producers.
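As one concrete example (not necessarily the exact knob I changed), for a Kafka Streams application — which the StreamThread-x names in the question suggest — the number of processing threads is controlled by num.stream.threads; the application id and broker address below are placeholders:

import java.util.Properties;
import org.apache.kafka.streams.StreamsConfig;

Properties props = new Properties();
props.put(StreamsConfig.APPLICATION_ID_CONFIG, "my-stream-app");       // placeholder
props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");   // placeholder
// More stream threads means more parallel consumption, up to the partition count of the input topics.
props.put(StreamsConfig.NUM_STREAM_THREADS_CONFIG, 4);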
My application listens on a Kafka topic and dumps data into Cassandra. The threads also load some information from Mongo. The lag on the Kafka topic keeps increasing. I have seen that most threads are blocked while loading some class. I am attaching my thread dump below.
"KafkaConsumer-49" prio=10 tid=0x00007f1178fdd000 nid=0x78e0 waiting for monitor entry [0x00007f1155fb5000]
java.lang.Thread.State: BLOCKED (on object monitor)
at java.lang.ClassLoader.loadClass(ClassLoader.java:403)
- waiting to lock <0x00000006c0655b58> (a java.lang.Object)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
at org.springframework.util.ClassUtils.forName(ClassUtils.java:258)
at org.springframework.data.convert.SimpleTypeInformationMapper.resolveTypeFrom(SimpleTypeInformationMapper.java:56)
at org.springframework.data.convert.DefaultTypeMapper.readType(DefaultTypeMapper.java:103)
at org.springframework.data.convert.DefaultTypeMapper.getDefaultedTypeToBeUsed(DefaultTypeMapper.java:144)
at org.springframework.data.convert.DefaultTypeMapper.readType(DefaultTypeMapper.java:121)
at org.springframework.data.mongodb.core.convert.MappingMongoConverter.read(MappingMongoConverter.java:186)
at org.springframework.data.mongodb.core.convert.MappingMongoConverter.read(MappingMongoConverter.java:176)
at org.springframework.data.mongodb.core.convert.MappingMongoConverter.read(MappingMongoConverter.java:172)
at org.springframework.data.mongodb.core.convert.MappingMongoConverter.read(MappingMongoConverter.java:75)
at org.springframework.data.mongodb.core.MongoTemplate$ReadDbObjectCallback.doWith(MongoTemplate.java:1840)
at org.springframework.data.mongodb.core.MongoTemplate.executeFindMultiInternal(MongoTemplate.java:1536)
at org.springframework.data.mongodb.core.MongoTemplate.doFind(MongoTemplate.java:1336)
at org.springframework.data.mongodb.core.MongoTemplate.doFind(MongoTemplate.java:1322)
at org.springframework.data.mongodb.core.MongoTemplate.find(MongoTemplate.java:495)
at org.springframework.data.mongodb.core.MongoTemplate.find(MongoTemplate.java:486)
at com.snapdeal.coms.timemachine.mao.TimeMachineMao.getVendorProductsForUploadId(TimeMachineMao.java:32)
at com.snapdeal.coms.timemachine.service.TimeMachineService.getVendorProductsForUploadIdAndSupc(TimeMachineService.java:35)
at com.snapdeal.coms.timemachine.event.SupcUploadIdStateUpdateEventHandler.handleEvent(SupcUploadIdStateUpdateEventHandler.java:40)
"KafkaConsumer-48" prio=10 tid=0x00007f1178fdb000 nid=0x78df waiting for monitor entry [0x00007f11560b6000]
java.lang.Thread.State: BLOCKED (on object monitor)
at java.lang.ClassLoader.loadClass(ClassLoader.java:403)
- waiting to lock <0x00000006c0655b58> (a java.lang.Object)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
at org.springframework.util.ClassUtils.forName(ClassUtils.java:258)
at org.springframework.data.convert.SimpleTypeInformationMapper.resolveTypeFrom(SimpleTypeInformationMapper.java:56)
at org.springframework.data.convert.DefaultTypeMapper.readType(DefaultTypeMapper.java:103)
at org.springframework.data.convert.DefaultTypeMapper.getDefaultedTypeToBeUsed(DefaultTypeMapper.java:144)
at org.springframework.data.convert.DefaultTypeMapper.readType(DefaultTypeMapper.java:121)
at org.springframework.data.mongodb.core.convert.MappingMongoConverter.read(MappingMongoConverter.java:186)
at org.springframework.data.mongodb.core.convert.MappingMongoConverter.read(MappingMongoConverter.java:176)
at org.springframework.data.mongodb.core.convert.MappingMongoConverter.read(MappingMongoConverter.java:172)
at org.springframework.data.mongodb.core.convert.MappingMongoConverter.read(MappingMongoConverter.java:75)
at org.springframework.data.mongodb.core.MongoTemplate$ReadDbObjectCallback.doWith(MongoTemplate.java:1840)
at org.springframework.data.mongodb.core.MongoTemplate.executeFindMultiInternal(MongoTemplate.java:1536)
at org.springframework.data.mongodb.core.MongoTemplate.doFind(MongoTemplate.java:1336)
at org.springframework.data.mongodb.core.MongoTemplate.doFind(MongoTemplate.java:1322)
at org.springframework.data.mongodb.core.MongoTemplate.find(MongoTemplate.java:495)
at org.springframework.data.mongodb.core.MongoTemplate.find(MongoTemplate.java:486)
at com.snapdeal.coms.timemachine.mao.TimeMachineMao.getVendorProductsForUploadId(TimeMachineMao.java:32)
at com.snapdeal.coms.timemachine.service.TimeMachineService.getVendorProductsForUploadIdAndSupc(TimeMachineService.java:35)
at com.snapdeal.coms.timemachine.event.SupcUploadIdStateUpdateEventHandler.handleEvent(SupcUploadIdStateUpdateEventHandler.java:40)
at com.snapdeal.coms.timemachine.TimeMachine.onEvent(TimeMachine.java:109)
"KafkaConsumer-47" prio=10 tid=0x00007f1178fd9800 nid=0x78de waiting for monitor entry [0x00007f11561b7000]
java.lang.Thread.State: BLOCKED (on object monitor)
at java.lang.ClassLoader.loadClass(ClassLoader.java:403)
- waiting to lock <0x00000006c0655b58> (a java.lang.Object)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
at org.springframework.util.ClassUtils.forName(ClassUtils.java:258)
at org.springframework.data.convert.SimpleTypeInformationMapper.resolveTypeFrom(SimpleTypeInformationMapper.java:56)
at org.springframework.data.convert.DefaultTypeMapper.readType(DefaultTypeMapper.java:103)
at org.springframework.data.convert.DefaultTypeMapper.getDefaultedTypeToBeUsed(DefaultTypeMapper.java:144)
at org.springframework.data.convert.DefaultTypeMapper.readType(DefaultTypeMapper.java:121)
at org.springframework.data.mongodb.core.convert.MappingMongoConverter.read(MappingMongoConverter.java:186)
at org.springframework.data.mongodb.core.convert.MappingMongoConverter.read(MappingMongoConverter.java:176)
at org.springframework.data.mongodb.core.convert.MappingMongoConverter.read(MappingMongoConverter.java:172)
at org.springframework.data.mongodb.core.convert.MappingMongoConverter.read(MappingMongoConverter.java:75)
at org.springframework.data.mongodb.core.MongoTemplate$ReadDbObjectCallback.doWith(MongoTemplate.java:1840)
at org.springframework.data.mongodb.core.MongoTemplate.executeFindMultiInternal(MongoTemplate.java:1536)
at org.springframework.data.mongodb.core.MongoTemplate.doFind(MongoTemplate.java:1336)
at org.springframework.data.mongodb.core.MongoTemplate.doFind(MongoTemplate.java:1322)
at org.springframework.data.mongodb.core.MongoTemplate.find(MongoTemplate.java:495)
at org.springframework.data.mongodb.core.MongoTemplate.find(MongoTemplate.java:486)
"KafkaConsumer-46" prio=10 tid=0x00007f1178fd8000 nid=0x78dd waiting for monitor entry [0x00007f11562b8000]
java.lang.Thread.State: BLOCKED (on object monitor)
at java.lang.ClassLoader.loadClass(ClassLoader.java:403)
- waiting to lock <0x00000006c0655b58> (a java.lang.Object)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
at org.springframework.util.ClassUtils.forName(ClassUtils.java:258)
at org.springframework.data.convert.SimpleTypeInformationMapper.resolveTypeFrom(SimpleTypeInformationMapper.java:56)
at org.springframework.data.convert.DefaultTypeMapper.readType(DefaultTypeMapper.java:103)
at org.springframework.data.convert.DefaultTypeMapper.getDefaultedTypeToBeUsed(DefaultTypeMapper.java:144)
at org.springframework.data.convert.DefaultTypeMapper.readType(DefaultTypeMapper.java:121)
at org.springframework.data.mongodb.core.convert.MappingMongoConverter.read(MappingMongoConverter.java:186)
at org.springframework.data.mongodb.core.convert.MappingMongoConverter.read(MappingMongoConverter.java:176)
at org.springframework.data.mongodb.core.convert.MappingMongoConverter.read(MappingMongoConverter.java:172)
at org.springframework.data.mongodb.core.convert.MappingMongoConverter.read(MappingMongoConverter.java:75)
at org.springframework.data.mongodb.core.MongoTemplate$ReadDbObjectCallback.doWith(MongoTemplate.java:1840)
at org.springframework.data.mongodb.core.MongoTemplate.executeFindMultiInternal(MongoTemplate.java:1536)
at org.springframework.data.mongodb.core.MongoTemplate.doFind(MongoTemplate.java:1336)
at org.springframework.data.mongodb.core.MongoTemplate.doFind(MongoTemplate.java:1322)
at org.springframework.data.mongodb.core.MongoTemplate.find(MongoTemplate.java:495)
at org.springframework.data.mongodb.core.MongoTemplate.find(MongoTemplate.java:486)
at com.snapdeal.coms.timemachine.mao.TimeMachineMao.getVendorProductsForUploadId(TimeMachineMao.java:32)
at com.snapdeal.coms.timemachine.service.TimeMachineService.getVendorProductsForUploadIdAndSupc(TimeMachineService.java:35)
at com.snapdeal.coms.timemachine.event.SupcUploadIdStateUpdateEventHandler.handleEvent(SupcUploadIdStateUpdateEventHandler.java:40)
I am not sure why all the threads are blocked. I thought a class gets loaded only once, and after that no lock should need to be taken.
Did you try using the ConsumerOffsetChecker to see whether your consumers are still alive? You can run the following command from inside your $KAFKA_ROOT_DIR/ folder:
bin/kafka-run-class.sh kafka.tools.ConsumerOffsetChecker --group consumer-group1 --zkconnect zkhost:zkport --topic topic1
Here are a few notes taken from their FAQ page:
If consumer offset is not moving after some time, then consumer is likely to have stopped. If consumer offset is moving, but consumer lag (difference between the end of the log and the consumer offset) is increasing, the consumer is slower than the producer. If the consumer is slow, the typical solution is to increase the degree of parallelism in the consumer. This may require increasing the number of partitions of a topic.
The FAQ page also explains possible reasons behind your consumer getting blocked; it might be worth taking a look at it.
The problem was with fetching data from Mongo: there was a huge amount of data, pagination was not implemented, and there was no socket timeout on that particular request, hence the threads were getting blocked.
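A minimal sketch of the two fixes described above, assuming the MongoDB Java driver 3.x and Spring Data's MongoTemplate; the host, database name, timeout values, page size, and VendorProduct type are placeholders, not the original code:

import com.mongodb.MongoClient;
import com.mongodb.MongoClientOptions;
import com.mongodb.ServerAddress;
import org.springframework.data.mongodb.core.MongoTemplate;
import org.springframework.data.mongodb.core.query.Query;

// 1) Fail fast instead of blocking indefinitely when Mongo is slow.
MongoClientOptions options = MongoClientOptions.builder()
        .connectTimeout(5000)    // ms, placeholder value
        .socketTimeout(10000)    // ms, placeholder value
        .build();
MongoClient mongoClient = new MongoClient(new ServerAddress("mongo-host", 27017), options);
MongoTemplate template = new MongoTemplate(mongoClient, "mydb");

// 2) Fetch results page by page instead of loading everything in one query.
int pageSize = 500;                                   // placeholder value
Query firstPage = new Query().skip(0).limit(pageSize);
// List<VendorProduct> page = template.find(firstPage, VendorProduct.class);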
While developing an application that consumes an external web service, I generated the sources from the WSDL URL and then created a client:
GeoIPServiceClient service = new GeoIPServiceClient();
GeoIPServiceSoap geoIPClient = service.getGeoIPServiceSoap();
Since the creation of this proxy takes some time, I set the client as an attribute in my service class.
But I'm worried that the client isn't thread-safe, and this web service is heavily used by concurrent threads in the application (a webapp). I can't find any documentation on this.
As a precaution I've started to use an object pool of SOAP clients instead of a shared one.
Is this an unnecessary precaution? What is the best practice when writing XFire clients?
I suspect some kind of concurrency problem with XFire since, under high load, I regularly get blocked threads, and as a result the application crashes. Here's a partial thread dump:
"http-xx.xx.xx.xx-80-17" daemon prio=10 tid=0x00007f560d437000 nid=0x66cb waiting for monitor entry [0x00000000412b8000]
java.lang.Thread.State: BLOCKED (on object monitor)
at com.sun.xml.bind.v2.runtime.reflect.opt.Injector.inject(Injector.java:174)
- waiting to lock <0x00007f561d44e1c0> (a com.sun.xml.bind.v2.runtime.reflect.opt.Injector)
at com.sun.xml.bind.v2.runtime.reflect.opt.Injector.inject(Injector.java:85)
at com.sun.xml.bind.v2.runtime.reflect.opt.AccessorInjector.prepare(AccessorInjector.java:87)
at com.sun.xml.bind.v2.runtime.reflect.opt.OptimizedAccessorFactory.get(OptimizedAccessorFactory.java:165)
at com.sun.xml.bind.v2.runtime.reflect.Accessor$FieldReflection.optimize(Accessor.java:253)
at com.sun.xml.bind.v2.runtime.reflect.TransducedAccessor$CompositeTransducedAccessorImpl.<init>(TransducedAccessor.java:231)
at com.sun.xml.bind.v2.runtime.reflect.TransducedAccessor.get(TransducedAccessor.java:173)
at com.sun.xml.bind.v2.runtime.property.SingleElementLeafProperty.<init>(SingleElementLeafProperty.java:83)
at sun.reflect.GeneratedConstructorAccessor165.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
at com.sun.xml.bind.v2.runtime.property.PropertyFactory.create(PropertyFactory.java:124)
at com.sun.xml.bind.v2.runtime.ClassBeanInfoImpl.<init>(ClassBeanInfoImpl.java:171)
at com.sun.xml.bind.v2.runtime.JAXBContextImpl.getOrCreate(JAXBContextImpl.java:481)
at com.sun.xml.bind.v2.runtime.JAXBContextImpl.<init>(JAXBContextImpl.java:315)
at com.sun.xml.bind.v2.ContextFactory.createContext(ContextFactory.java:139)
at com.sun.xml.bind.v2.ContextFactory.createContext(ContextFactory.java:117)
at com.sun.xml.bind.v2.ContextFactory.createContext(ContextFactory.java:188)
at sun.reflect.GeneratedMethodAccessor176.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at javax.xml.bind.ContextFinder.newInstance(ContextFinder.java:128)
at javax.xml.bind.ContextFinder.find(ContextFinder.java:277)
at javax.xml.bind.JAXBContext.newInstance(JAXBContext.java:372)
at javax.xml.bind.JAXBContext.newInstance(JAXBContext.java:337)
at javax.xml.bind.JAXBContext.newInstance(JAXBContext.java:244)
at org.codehaus.xfire.jaxb2.JaxbType.getJAXBContext(JaxbType.java:306)
- locked <0x00007f565b3aee60> (a org.codehaus.xfire.jaxb2.JaxbType)
at org.codehaus.xfire.jaxb2.JaxbType.writeObject(JaxbType.java:230)
at org.codehaus.xfire.aegis.AegisBindingProvider.writeParameter(AegisBindingProvider.java:229)
at org.codehaus.xfire.service.binding.AbstractBinding.writeParameter(AbstractBinding.java:273)
at org.codehaus.xfire.service.binding.WrappedBinding.writeMessage(WrappedBinding.java:90)
at org.codehaus.xfire.soap.SoapSerializer.writeMessage(SoapSerializer.java:80)
at org.codehaus.xfire.transport.http.HttpChannel.writeWithoutAttachments(HttpChannel.java:56)
at org.codehaus.xfire.transport.http.OutMessageRequestEntity.writeRequest(OutMessageRequestEntity.java:51)
at org.apache.commons.httpclient.methods.EntityEnclosingMethod.writeRequestBody(EntityEnclosingMethod.java:499)
at org.apache.commons.httpclient.HttpMethodBase.writeRequest(HttpMethodBase.java:2114)
at org.apache.commons.httpclient.HttpMethodBase.execute(HttpMethodBase.java:1096)
at org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(HttpMethodDirector.java:398)
at org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:171)
at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397)
at org.codehaus.xfire.transport.http.CommonsHttpMessageSender.send(CommonsHttpMessageSender.java:369)
at org.codehaus.xfire.transport.http.HttpChannel.sendViaClient(HttpChannel.java:123)
at org.codehaus.xfire.transport.http.HttpChannel.send(HttpChannel.java:48)
at org.codehaus.xfire.handler.OutMessageSender.invoke(OutMessageSender.java:26)
at org.codehaus.xfire.handler.HandlerPipeline.invoke(HandlerPipeline.java:131)
at org.codehaus.xfire.client.Invocation.invoke(Invocation.java:79)
at org.codehaus.xfire.client.Invocation.invoke(Invocation.java:114)
at org.codehaus.xfire.client.Client.invoke(Client.java:336)
at org.codehaus.xfire.client.XFireProxy.handleRequest(XFireProxy.java:77)
at org.codehaus.xfire.client.XFireProxy.invoke(XFireProxy.java:57)
at $Proxy143.getMyMethod(Unknown Source)
The thread dump contains a lot of blocked threads that look like this.
I guess that since you only get a lot of blocked threads, the client is actually thread-safe, as the object data is not corrupted :). But I agree it's not handling concurrency in a good way.
1) One observation is that the final lock seems to be in the JAXB implementation and not in XFire. What if you try a different JAXB implementation, like JaxMe?
2) Also, the getJAXBContext method in JaxbType is synchronised. Most likely your threads are blocked because they are all accessing the same JaxbType instance.
Looking at that method, I would actually move the synchronisation into the method body, after the presence of the context has been checked:
if (context == null) {
synchronized (this) {
...
This will allow callers whose JAXBContext is already initialised to skip the expensive synchronisation.
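In other words, a classic double-checked lazy initialisation; a minimal sketch of what the full method might look like (buildContext() stands in for the existing JAXBContext creation code, and the field would need to be volatile for this to be safe under the Java memory model):

private volatile JAXBContext context;

JAXBContext getJAXBContext() throws JAXBException {
    if (context == null) {                  // fast path: no lock once initialised
        synchronized (this) {
            if (context == null) {          // re-check under the lock
                context = buildContext();   // placeholder for the existing creation logic
            }
        }
    }
    return context;
}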
My suggestion is to either try fixing the code yourself and write a test, or submit a bug report to XFire, or do both :).
It depends on the version of XFire you are using, as a few thread-safety issues were fixed in version 1.2.5. You can check the bug raised at http://jira.codehaus.org/browse/XFIRE-886 and see more details in the release notes at http://xfire.codehaus.org/XFire+1.2.5+Release+Notes