My company runs around 60 microservices in Kubernetes. Two of them develop high CPU usage after some time: the pod's CPU jumps to 100% (each pod is allocated one CPU of a newer AMD EPYC generation) and stays there. The problem appears at a random point between application start and the moment we notice it and restart or delete the pod in the Kubernetes dashboard.
The microservices all run on Spring Boot 2.7.8 with Undertow instead of Tomcat. As HTTP libraries we use Retrofit2 2.9.0 and OkHttp3 3.14.9. We also use Project Reactor.
The thread dump has one potentially abnormal entry:
"getProductVariants-260" #354 prio=5 os_prio=0 cpu=678915664.69ms elapsed=687202.16s tid=0x00007f63d800a800 nid=0x170 runnable [0x00007f643ccec000]
java.lang.Thread.State: RUNNABLE
at java.lang.Throwable.fillInStackTrace(java.base#11.0.17/Native Method)
at java.lang.Throwable.fillInStackTrace(java.base#11.0.17/Throwable.java:787)
- locked <0x00000000f4132990> (a okhttp3.internal.connection.RouteException)
at java.lang.Throwable.<init>(java.base#11.0.17/Throwable.java:315)
at java.lang.Exception.<init>(java.base#11.0.17/Exception.java:102)
at java.lang.RuntimeException.<init>(java.base#11.0.17/RuntimeException.java:96)
at okhttp3.internal.connection.RouteException.<init>(RouteException.java:31)
at okhttp3.internal.connection.ExchangeFinder.find(ExchangeFinder.java:96)
at okhttp3.internal.connection.Transmitter.newExchange(Transmitter.java:169)
at okhttp3.internal.connection.ConnectInterceptor.intercept(ConnectInterceptor.java:41)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:142)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:117)
at okhttp3.internal.cache.CacheInterceptor.intercept(CacheInterceptor.java:94)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:142)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:117)
at okhttp3.internal.http.BridgeInterceptor.intercept(BridgeInterceptor.java:93)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:142)
at okhttp3.internal.http.RetryAndFollowUpInterceptor.intercept(RetryAndFollowUpInterceptor.java:88)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:142)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:117)
at de.md.services.microframework.http.RetryInterceptor.intercept(RetryInterceptor.java:22)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:142)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:117)
at de.md.services.microframework.http.HttpLoggingInterceptor.intercept(HttpLoggingInterceptor.java:98)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:142)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:117)
at de.md.services.microframework.http.HeaderInterceptor.intercept(HeaderInterceptor.java:32)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:142)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:117)
at de.md.services.microframework.http.RetrofitClient.lambda$buildOkHttpClient$0(RetrofitClient.java:166)
at de.md.services.microframework.http.RetrofitClient$$Lambda$1044/0x0000000100959c40.intercept(Unknown Source)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:142)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:117)
at okhttp3.RealCall.getResponseWithInterceptorChain(RealCall.java:229)
at okhttp3.RealCall.execute(RealCall.java:81)
at retrofit2.OkHttpCall.execute(OkHttpCall.java:204)
at com.jakewharton.retrofit2.adapter.reactor.ExecuteSinkConsumer.accept(ExecuteSinkConsumer.java:39)
at com.jakewharton.retrofit2.adapter.reactor.ExecuteSinkConsumer.accept(ExecuteSinkConsumer.java:24)
at reactor.core.publisher.FluxCreate.subscribe(FluxCreate.java:95)
at reactor.core.publisher.FluxSubscribeOn$SubscribeOnSubscriber.run(FluxSubscribeOn.java:194)
at reactor.core.scheduler.WorkerTask.call(WorkerTask.java:84)
at reactor.core.scheduler.WorkerTask.call(WorkerTask.java:37)
at java.util.concurrent.FutureTask.run(java.base#11.0.17/FutureTask.java:264)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(java.base#11.0.17/ScheduledThreadPoolExecutor.java:304)
at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base#11.0.17/ThreadPoolExecutor.java:1128)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base#11.0.17/ThreadPoolExecutor.java:628)
at java.lang.Thread.run(java.base#11.0.17/Thread.java:829)
This thread does not look like it is actually running, and that was the case every time I captured a dump. I also noticed that this thread stays around, while other threads handling similar HTTP calls die after some time. It has existed for almost as long as the microservice has been running. The thread does seem to be alive, though, because its stack trace sometimes changes.
The method getProductVariants looks like this:
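For what it's worth, the `cpu=` and `elapsed=` fields in the header of the dump entry already say how busy the thread has been over its lifetime. A minimal sketch of that arithmetic follows; the class and method names are made up for illustration and are not part of any library.

```java
/** Hypothetical helper: share of one core a thread has used, from thread-dump header fields. */
public final class ThreadCpuShare {

    private ThreadCpuShare() {}

    /**
     * @param cpuMillis      the "cpu=" value of the dump entry, in milliseconds
     * @param elapsedSeconds the "elapsed=" value of the dump entry, in seconds
     * @return fraction of a single core used since the thread started (1.0 = one core fully busy)
     */
    public static double cpuShare(double cpuMillis, double elapsedSeconds) {
        if (elapsedSeconds <= 0) {
            throw new IllegalArgumentException("elapsedSeconds must be positive");
        }
        // Convert milliseconds to seconds, then divide by wall-clock lifetime.
        return (cpuMillis / 1000.0) / elapsedSeconds;
    }

    public static void main(String[] args) {
        // Values copied from the "getProductVariants-260" entry above.
        double share = cpuShare(678915664.69, 687202.16);
        System.out.println("Fraction of one core used: " + share);
    }
}
```

With the values from the dump this comes out at roughly 0.988, so this one thread has been using almost a full core for about eight days, which matches the 100% CPU seen on a pod with a single CPU.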
public Mono<ProductVariantList> getProductVariants(String productVariantIds, HttpServletRequest httpServletRequest) {
    ProductConnector productConnector = retrofitClient.buildRetrofitClientReactorDefault(
            connectorConfig.getProductService(), connectorConfig.getDefaultTimeout(), getProductVariantsScheduler,
            httpServletRequest).create(ProductConnector.class);
    return productConnector.getProductVariants(productVariantIds, connectorConfig.getSlapToken()) //
            .doOnNext(res -> ThirdPartyUtils.checkRetrofitResponse(res, LOGGER, "ProductService"))
            .flatMap(MonoHelper::toResponseContentMono) //
            // The German log message means "getProductVariants ran into an error."
            .doOnError(error -> LOGGER.error("getProductVariants ist auf einen Fehler gelaufen.", error));
}
The default timeout is 10 seconds. The scheduler is from Project Reactor and has a thread cap of 40 and a queue cap of 100000. The connector just executes a REST request and fetches less than 150 KB of data from another microservice that is not related to this problem.
This microservice receives about 10 to 20 requests per minute, spread across two instances.
The memory consumption of the service is normal and so is the memory dump. Does anyone have an idea how to look further into this matter?
I'm using the Google Maps Geocode API (https://github.com/googlemaps/google-maps-services-java) in a Dataflow job. My DoFn prepares the GeoApiContext at Setup. The processElement function looks like this:
public void processElement(ProcessContext c) {
    String address = c.element().get("Address").toString();
    String id = c.element().get("Id").toString();
    Gson gson = new GsonBuilder().create();
    try {
        GeocodingResult[] results = GeocodingApi.newRequest(this.geocodeContext).address(address).language("pt-BR").components(ComponentFilter.country("BR")).await();
        if (results.length == 0) {
            TableRow outputRow = new TableRow();
            outputRow.set("Id", id);
            c.output(outputRow);
        } else {
            for (GeocodingResult r : results) {
                TableRow outputRow = convertTableRow(gson.toJson(r).toString());
                outputRow.set("Id", id);
                c.output(outputRow);
            }
        }
    } catch (ApiException e) {
        LOGGER.error("ApiException on address: {}", address, e);
    } catch (InterruptedException e) {
        LOGGER.error("InterruptedException on address: {}", address, e);
    } catch (IOException e) {
        LOGGER.error("IOException on address: {}", address, e);
    }
}
This code worked fine locally, but when deployed to Dataflow it throws a network error:
exception: "java.net.ConnectException: Failed to connect to maps.googleapis.com/2607:f8b0:4001:c05:0:0:0:5f:443
at okhttp3.internal.connection.RealConnection.connectSocket(RealConnection.java:265)
at okhttp3.internal.connection.RealConnection.connect(RealConnection.java:183)
at okhttp3.internal.connection.ExchangeFinder.findConnection(ExchangeFinder.java:224)
at okhttp3.internal.connection.ExchangeFinder.findHealthyConnection(ExchangeFinder.java:108)
at okhttp3.internal.connection.ExchangeFinder.find(ExchangeFinder.java:88)
at okhttp3.internal.connection.Transmitter.newExchange(Transmitter.java:169)
at okhttp3.internal.connection.ConnectInterceptor.intercept(ConnectInterceptor.java:41)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:142)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:117)
at okhttp3.internal.cache.CacheInterceptor.intercept(CacheInterceptor.java:94)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:142)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:117)
at okhttp3.internal.http.BridgeInterceptor.intercept(BridgeInterceptor.java:93)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:142)
at okhttp3.internal.http.RetryAndFollowUpInterceptor.intercept(RetryAndFollowUpInterceptor.java:88)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:142)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:117)
at okhttp3.RealCall.getResponseWithInterceptorChain(RealCall.java:229)
at okhttp3.RealCall$AsyncCall.execute(RealCall.java:172)
at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: java.net.ConnectException: Network is unreachable (connect failed)
at java.base/java.net.PlainSocketImpl.socketConnect(Native Method)
at java.base/java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:399)
at java.base/java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:242)
at java.base/java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:224)
at java.base/java.net.SocksSocketImpl.connect(SocksSocketImpl.java:403)
at java.base/java.net.Socket.connect(Socket.java:591)
at okhttp3.internal.platform.Platform.connectSocket(Platform.java:130)
at okhttp3.internal.connection.RealConnection.connectSocket(RealConnection.java:263)
... 22 more
I've ensured that the VM spawned has internet access and I can even ping the maps.googleapis.com endpoint from inside the container:
USER#test-geocode-07020834-qmrj-harness-3k2l ~ $ docker container ls
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
b2fd123138aa 3a1cb7aedd54 "/opt/google/dataflo…" 6 minutes ago Up 5 minutes k8s_healthchecker_dataflow-test-geocode-07020834-qmrj-harness-3k2l_default_5648e9815f2ca5beea8b0eb945e12d1f_0
086e36c3dd23 4127911f4769 "/opt/google/dataflo…" 6 minutes ago Up 5 minutes k8s_vmmonitor_dataflow-test-geocode-07020834-qmrj-harness-3k2l_default_5648e9815f2ca5beea8b0eb945e12d1f_0
2890fa415af5 664bd8972b23 "/opt/google/dataflo…" 6 minutes ago Up 6 minutes k8s_shuffle_dataflow-test-geocode-07020834-qmrj-harness-3k2l_default_5648e9815f2ca5beea8b0eb945e12d1f_0
eea757bf6be7 gcr.io/cloud-dataflow/v1beta3/beam-java11-batch "/opt/google/dataflo…" 6 minutes ago Up 6 minutes k8s_java-batch_dataflow-test-geocode-07020834-qmrj-harness-3k2l_default_5648e9815f2ca5beea8b0eb945e12d1f_0
b636784118f5 k8s.gcr.io/pause:3.1 "/pause" 6 minutes ago Up 6 minutes k8s_POD_dataflow-test-geocode-07020834-qmrj-harness-3k2l_default_5648e9815f2ca5beea8b0eb945e12d1f_0
lucas#test-geocode-07020834-qmrj-harness-3k2l ~ $ docker exec -it eea /bin/sh
# ping maps.googleapis.com
PING maps.googleapis.com (172.217.214.95) 56(84) bytes of data.
64 bytes from 172.217.214.95: icmp_seq=1 ttl=115 time=1.08 ms
64 bytes from 172.217.214.95: icmp_seq=2 ttl=115 time=1.28 ms
64 bytes from 172.217.214.95: icmp_seq=3 ttl=115 time=1.15 ms
64 bytes from 172.217.214.95: icmp_seq=4 ttl=115 time=1.41 ms
^C
--- maps.googleapis.com ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3004ms
rtt min/avg/max/mdev = 1.089/1.235/1.414/0.131 ms
#
Regarding versions, I'm using the latest Beam version (2.22.0) and the latest Google Maps client version (0.14.0).
No idea what else to look at, and any help is appreciated.
UPDATE
The problem seems to be that the request is made to an IPv6 address. However, GCE machines seem to have no IPv6 support, and the call simply fails without falling back to IPv4.
Considering that, there doesn't seem to be any way out of this problem:
Configuring the JVM to prefer IPv4 addresses can't be done with Dataflow (JVM flags are ignored).
There's also no way to customize the GCE machine (since a base Dataflow image is used).
The library doesn't seem to expose any option to choose between IPv4 and IPv6.
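To make the "prefer IPv4" idea concrete, here is a minimal JDK-only sketch that reorders a list of resolved addresses so IPv4 candidates are tried first. OkHttp, which shows up in the stack trace, has a `Dns` interface whose `lookup` method returns such a list, so a custom implementation could apply this ordering. Whether the Geocode client lets you plug in a custom `Dns` or `OkHttpClient` is something I have not verified, so treat that wiring as an assumption. The class and method names below are hypothetical.

```java
import java.net.Inet4Address;
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical helper: order resolved addresses so IPv4 candidates come first. */
public final class Ipv4First {

    private Ipv4First() {}

    /** Returns a new list with all IPv4 addresses ahead of all IPv6 ones. */
    public static List<InetAddress> order(List<InetAddress> resolved) {
        List<InetAddress> sorted = new ArrayList<>(resolved);
        // false sorts before true, so IPv4 entries move to the front.
        // List.sort is stable, so addresses of the same family keep their resolver order.
        sorted.sort(Comparator.comparing((InetAddress a) -> !(a instanceof Inet4Address)));
        return sorted;
    }

    /** Parses an IP literal (no DNS lookup is needed for literals); wraps the checked exception. */
    public static InetAddress literal(String ip) {
        try {
            return InetAddress.getByName(ip);
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("cannot parse address: " + ip, e);
        }
    }

    public static void main(String[] args) {
        List<InetAddress> resolved = List.of(
                literal("2607:f8b0:4001:c05:0:0:0:5f"), // IPv6 address from the exception above
                literal("172.217.214.95"));             // IPv4 address from the ping output above
        System.out.println(order(resolved));
    }
}
```

Sorting rather than filtering keeps the IPv6 addresses as a fallback, so the same code would still work on a machine that only has IPv6 connectivity.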
Thanks
I had this exact same issue come up after upgrading from 2.17 to 2.24 and changing from Java 8 to Java 11. After trying to fix this on 2.24 and Java 11 I gave up and went back to 8 and it's working now.
I couldn't find this documented anywhere, but it looks like the userAgent depends on the Java version used for the build.
When I build the self-executable with Java 8, Dataflow shows the userAgent as
Apache_Beam_SDK_for_Java/2.24.0(JRE_8_environment)
and with Java 11 it shows Apache_Beam_SDK_for_Java/2.24.0(JDK_11_environment)
One of the threads holds a lock for more than 3 seconds when querying an Oracle database. This blocks many other threads that access the Oracle database, which leads to sudden increases in the number of threads and makes the application unresponsive. I am using Tomcat 8.5, the Tomcat connection pool, and Java 8. Trace for the blocking thread:
***"http-nio-80-exec-433" #4207 daemon prio=5 os_prio=0 tid=0x00007fd9d8042000 nid=0x503b runnable [0x00007fd839f04000]
java.lang.Thread.State: RUNNABLE
at java.util.Hashtable.get(Hashtable.java:363)
- locked <0x000000070193caa0> (a java.util.Hashtable)
at java.lang.ConditionalSpecialCasing.lookUpTable(ConditionalSpecialCasing.java:151)
at java.lang.ConditionalSpecialCasing.toUpperCaseEx(ConditionalSpecialCasing.java:123)
at java.lang.String.toUpperCase(String.java:2775)
at java.lang.String.toUpperCase(String.java:2833)
at oracle.jdbc.driver.OracleStatement.doExecuteWithTimeout(OracleStatement.java:1638)
at oracle.jdbc.driver.OraclePreparedStatement.executeInternal(OraclePreparedStatement.java:4401)
at oracle.jdbc.driver.OraclePreparedStatement.executeQuery(OraclePreparedStatement.java:4482)
- locked <0x000000074cd7d868> (a oracle.jdbc.driver.T4CConnection)
at oracle.jdbc.driver.OraclePreparedStatementWrapper.executeQuery(OraclePreparedStatementWrapper.java:6272)
at sun.reflect.GeneratedMethodAccessor400.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.tomcat.jdbc.pool.interceptor.AbstractQueryReport$StatementProxy.invoke(AbstractQueryReport.java:210)
at com.sun.proxy.$Proxy637.executeQuery(Unknown Source)
at sun.reflect.GeneratedMethodAccessor400.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.tomcat.jdbc.pool.StatementFacade$StatementProxy.invoke(StatementFacade.java:114)
at com.sun.proxy.$Proxy637.executeQuery(Unknown Source)
at org.hibernate.jdbc.AbstractBatcher.getResultSet(AbstractBatcher.java:208)
at org.hibernate.loader.Loader.getResultSet(Loader.java:1953)
at org.hibernate.loader.Loader.doQuery(Loader.java:802)
at org.hibernate.loader.Loader.doQueryAndInitializeNonLazyCollections(Loader.java:274)
at org.hibernate.loader.Loader.doList(Loader.java:2533)
at org.hibernate.loader.Loader.listIgnoreQueryCache(Loader.java:2276)
at org.hibernate.loader.Loader.list(Loader.java:2271)***
Here is a trace for one of the 10+ BLOCKED threads:
***"http-nio-80-exec-271" #2777 daemon prio=5 os_prio=0 tid=0x00007fd9c8941800 nid=0x19c3 waiting for monitor entry [0x00007fd8356ca000]
java.lang.Thread.State: BLOCKED (on object monitor)
at java.util.Hashtable.get(Hashtable.java:363)
- waiting to lock <0x000000070193caa0> (a java.util.Hashtable)
at java.lang.ConditionalSpecialCasing.lookUpTable(ConditionalSpecialCasing.java:151)
at java.lang.ConditionalSpecialCasing.toUpperCaseEx(ConditionalSpecialCasing.java:123)
at java.lang.String.toUpperCase(String.java:2775)
at java.lang.String.toUpperCase(String.java:2833)
at oracle.jdbc.driver.CharCommonAccessor.init(CharCommonAccessor.java:164)
at oracle.jdbc.driver.VarcharAccessor.<init>(VarcharAccessor.java:88)
at oracle.jdbc.driver.T4CVarcharAccessor.<init>(T4CVarcharAccessor.java:108)
at oracle.jdbc.driver.T4CTTIdcb.fillupAccessors(T4CTTIdcb.java:431)
at oracle.jdbc.driver.T4CTTIdcb.receiveCommon(T4CTTIdcb.java:209)
at oracle.jdbc.driver.T4CTTIdcb.receive(T4CTTIdcb.java:145)
at oracle.jdbc.driver.T4C8Oall.readDCB(T4C8Oall.java:963)
at oracle.jdbc.driver.T4CTTIfun.receive(T4CTTIfun.java:447)
at oracle.jdbc.driver.T4CTTIfun.doRPC(T4CTTIfun.java:235)
at oracle.jdbc.driver.T4C8Oall.doOALL(T4C8Oall.java:543)
at oracle.jdbc.driver.T4CPreparedStatement.doOall8(T4CPreparedStatement.java:239)
at oracle.jdbc.driver.T4CPreparedStatement.executeForDescribe(T4CPreparedStatement.java:1246)
at oracle.jdbc.driver.OracleStatement.executeMaybeDescribe(OracleStatement.java:1500)
at oracle.jdbc.driver.OracleStatement.doExecuteWithTimeout(OracleStatement.java:1717)
at oracle.jdbc.driver.OraclePreparedStatement.executeInternal(OraclePreparedStatement.java:4401)
at oracle.jdbc.driver.OraclePreparedStatement.executeQuery(OraclePreparedStatement.java:4482)
- locked <0x000000074d203f60> (a oracle.jdbc.driver.T4CConnection)
at oracle.jdbc.driver.OraclePreparedStatementWrapper.executeQuery(OraclePreparedStatementWrapper.java:6272)
at sun.reflect.GeneratedMethodAccessor400.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.tomcat.jdbc.pool.interceptor.AbstractQueryReport$StatementProxy.invoke(AbstractQueryReport.java:210)
at com.sun.proxy.$Proxy637.executeQuery(Unknown Source)
at sun.reflect.GeneratedMethodAccessor400.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.tomcat.jdbc.pool.StatementFacade$StatementProxy.invoke(StatementFacade.java:114)
at com.sun.proxy.$Proxy637.executeQuery(Unknown Source)
at org.hibernate.jdbc.AbstractBatcher.getResultSet(AbstractBatcher.java:208)***
I have no idea why toUpperCase() would lock something (it is some Integer object being locked, as far as I found) for 30+ seconds, but this keeps occurring multiple times per day. Thread dump analysers did not find any deadlocks in the dump. The Tomcat pool logs show that the query for the blocking thread http-nio-80-exec-433 took 5 minutes to complete.
Could this be a problem with the JVM, memory, or something else, such as the JDBC driver or the connection pool configuration?
It appears the problem was not code related. We had a 10 GB catalina.out log file and 4 bash scripts that analyzed that file for specific errors every five minutes. Because of the large file size, each such analysis (mostly tail/wc commands) took 3-4 minutes. I do not know if catalina.out was being locked, but CPU usage for the "tail" and "wc" commands was quite significant. Memory usage did not increase significantly though.
After manually rolling catalina.out, the problem is gone. The admin has been tasked with figuring out why logrotate is failing.
Update: The problem kept reappearing under higher loads (>50 users), so after some testing, the locale was changed from "lt" to "en". Together with fixing a separate MyFaces cache bug, response times from Tomcat decreased 10-20 times, and the number of concurrent users the application can handle increased more than 10 times.
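The stack traces above show the parameterless String.toUpperCase() going through ConditionalSpecialCasing, whose lookup table is a synchronized Hashtable. As far as I understand the Java 8 sources, that path is only taken for a few languages with special casing rules (Lithuanian, Turkish, Azeri), which would explain why changing the default locale removed the contention. The sketch below only illustrates that upper-casing depends on the locale and how the JVM default can be switched; the class and method names are made up for illustration.

```java
import java.util.Locale;

/** Hypothetical illustration: upper-casing depends on the locale that is in effect. */
public final class DefaultLocaleDemo {

    private DefaultLocaleDemo() {}

    /** Upper-cases with an explicit locale, so the result does not depend on Locale.getDefault(). */
    public static String upper(String s, Locale locale) {
        return s.toUpperCase(locale);
    }

    /** Switches the JVM-wide default locale and returns the language now in effect. */
    public static String useDefaultLanguage(String languageTag) {
        Locale.setDefault(Locale.forLanguageTag(languageTag));
        return Locale.getDefault().getLanguage();
    }

    public static void main(String[] args) {
        // Turkish is the textbook locale-specific mapping: "i" becomes dotted capital I (U+0130).
        System.out.println(upper("i", Locale.forLanguageTag("tr")).equals("\u0130")); // true
        System.out.println(upper("i", Locale.ROOT).equals("I"));                      // true

        // Similar in effect to starting the JVM with -Duser.language=en,
        // but it only influences calls made after this point.
        System.out.println(useDefaultLanguage("en"));                                 // en
    }
}
```

In our case the toUpperCase() calls are inside the Oracle driver and cannot be given an explicit locale, so changing the default locale of the JVM was the practical option.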
I recently got the following messages in our logs, followed by a JVM crash (due to an OOME). I am not sure what to make of this and would really appreciate any guidance.
2015-03-19 21:15:02,457 [Timer-0] WARN (ThreadPoolAsynchronousRunner.java [run]:608) - com.mchange.v2.async.ThreadPoolAsynchronousRunner$DeadlockDetector#6824f21c -- APPARENT DEADLOCK!!! Creating emergency threads for unassigned pending tasks!
2015-03-19 21:26:29,543 [Timer-0] WARN (ThreadPoolAsynchronousRunner.java [run]:624) - com.mchange.v2.async.ThreadPoolAsynchronousRunner$DeadlockDetector#6824f21c -- APPARENT DEADLOCK!!! Complete Status:
Managed Threads: 3
Active Threads: 3
Active Tasks:
com.mchange.v2.resourcepool.BasicResourcePool$AcquireTask#15da1b6b (com.mchange.v2.async.ThreadPoolAsynchronousRunner$PoolThread-#2)
com.mchange.v2.resourcepool.BasicResourcePool$AcquireTask#b35b08a (com.mchange.v2.async.ThreadPoolAsynchronousRunner$PoolThread-#1)
com.mchange.v2.resourcepool.BasicResourcePool$AcquireTask#51cfdd17 (com.mchange.v2.async.ThreadPoolAsynchronousRunner$PoolThread-#0)
Pending Tasks:
com.mchange.v2.resourcepool.BasicResourcePool$AcquireTask#19397937
com.mchange.v2.resourcepool.BasicResourcePool$AcquireTask#5c7d3838
com.mchange.v2.resourcepool.BasicResourcePool$AcquireTask#7aea62dd
com.mchange.v2.resourcepool.BasicResourcePool$AcquireTask#55622ff2
com.mchange.v2.resourcepool.BasicResourcePool$AcquireTask#74004a8
Pool thread stack traces:
Thread[com.mchange.v2.async.ThreadPoolAsynchronousRunner$PoolThread-#0,5,main]
com.mchange.v2.async.ThreadPoolAsynchronousRunner$PoolThread.run(ThreadPoolAsynchronousRunner.java:560)
Thread[com.mchange.v2.async.ThreadPoolAsynchronousRunner$PoolThread-#2,5,main]
java.net.SocketOutputStream.socketWrite0(Native Method)
java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:109)
java.net.SocketOutputStream.write(SocketOutputStream.java:153)
java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
com.mysql.jdbc.MysqlIO.send(MysqlIO.java:3227)
com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:1917)
com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2060)
com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2536)
com.mysql.jdbc.ConnectionImpl.configureClientCharacterSet(ConnectionImpl.java:1751)
com.mysql.jdbc.ConnectionImpl.initializePropsFromServer(ConnectionImpl.java:3425)
com.mysql.jdbc.ConnectionImpl.createNewIO(ConnectionImpl.java:2196)
com.mysql.jdbc.ConnectionImpl.<init>(ConnectionImpl.java:718)
com.mysql.jdbc.JDBC4Connection.<init>(JDBC4Connection.java:46)
sun.reflect.GeneratedConstructorAccessor306.newInstance(Unknown Source)
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
java.lang.reflect.Constructor.newInstance(Constructor.java:408)
com.mysql.jdbc.Util.handleNewInstance(Util.java:406)
com.mysql.jdbc.ConnectionImpl.getInstance(ConnectionImpl.java:302)
com.mysql.jdbc.NonRegisteringDriver.connect(NonRegisteringDriver.java:282)
com.mchange.v2.c3p0.DriverManagerDataSource.getConnection(DriverManagerDataSource.java:135)
com.mchange.v2.c3p0.WrapperConnectionPoolDataSource.getPooledConnection(WrapperConnectionPoolDataSource.java:182)
com.mchange.v2.c3p0.WrapperConnectionPoolDataSource.getPooledConnection(WrapperConnectionPoolDataSource.java:171)
com.mchange.v2.c3p0.impl.C3P0PooledConnectionPool$1PooledConnectionResourcePoolManager.acquireResource(C3P0PooledConnectionPool.java:137)
com.mchange.v2.resourcepool.BasicResourcePool.doAcquire(BasicResourcePool.java:1014)
com.mchange.v2.resourcepool.BasicResourcePool.access$800(BasicResourcePool.java:32)
com.mchange.v2.resourcepool.BasicResourcePool$AcquireTask.run(BasicResourcePool.java:1810)
com.mchange.v2.async.ThreadPoolAsynchronousRunner$PoolThread.run(ThreadPoolAsynchronousRunner.java:547)
Thread[com.mchange.v2.async.ThreadPoolAsynchronousRunner$PoolThread-#1,5,main]
com.mchange.v2.async.ThreadPoolAsynchronousRunner$PoolThread.run(ThreadPoolAsynchronousRunner.java:560)
2015-03-19 21:56:59,137 [Timer-0] WARN (ThreadPoolAsynchronousRunner.java [run]:608) - com.mchange.v2.async.ThreadPoolAsynchronousRunner$DeadlockDetector#6824f21c -- APPARENT DEADLOCK!!! Creating emergency threads for unassigned pending tasks!
2015-03-19 21:56:59,143 [eXistThread-18676] ERROR (XQueryServlet.java [process]:566) - Java heap space
java.lang.OutOfMemoryError: Java heap space
at org.exist.storage.btree.BTree$BTreeNode.read(BTree.java:1269)
at org.exist.storage.btree.BTree$BTreeNode.access$16(BTree.java:1239)
at org.exist.storage.btree.BTree.getBTreeNode(BTree.java:460)
at org.exist.storage.btree.BTree.scanSequential(BTree.java:413)
at org.exist.storage.btree.BTree$BTreeNode.scanNextPage(BTree.java:2039)
at org.exist.storage.btree.BTree$BTreeNode.query(BTree.java:1835)
at org.exist.storage.btree.BTree$BTreeNode.query(BTree.java:1759)
at org.exist.storage.btree.BTree$BTreeNode.query(BTree.java:1759)
at org.exist.storage.btree.BTree$BTreeNode.query(BTree.java:1759)
at org.exist.storage.btree.BTree$BTreeNode.access$12(BTree.java:1734)
at org.exist.storage.btree.BTree.query(BTree.java:379)
at org.exist.storage.structural.NativeStructuralIndexWorker.scanByType(NativeStructuralIndexWorker.java:259)
at org.exist.dom.VirtualNodeSet.getNodesFromIndex(VirtualNodeSet.java:457)
at org.exist.dom.VirtualNodeSet.realize(VirtualNodeSet.java:585)
at org.exist.dom.VirtualNodeSet.iterator(VirtualNodeSet.java:860)
at org.exist.dom.AbstractNodeSet.iterator(AbstractNodeSet.java:1)
at org.exist.storage.structural.NativeStructuralIndexWorker.findDescendantsByTagName(NativeStructuralIndexWorker.java:162)
at org.exist.xquery.LocationStep.getAttributes(LocationStep.java:645)
at org.exist.xquery.LocationStep.eval(LocationStep.java:434)
at org.exist.xquery.AbstractExpression.eval(AbstractExpression.java:71)
at org.exist.xquery.PathExpr.eval(PathExpr.java:264)
at org.exist.xquery.Predicate.selectByNodeSet(Predicate.java:446)
at org.exist.xquery.Predicate.evalPredicate(Predicate.java:326)
at org.exist.xquery.LocationStep.processPredicate(LocationStep.java:251)
at org.exist.xquery.LocationStep.applyPredicate(LocationStep.java:238)
at org.exist.xquery.LocationStep.eval(LocationStep.java:462)
at org.exist.xquery.AbstractExpression.eval(AbstractExpression.java:71)
at org.exist.xquery.PathExpr.eval(PathExpr.java:264)
at org.exist.xquery.LetExpr.eval(LetExpr.java:142)
at org.exist.xquery.LetExpr.eval(LetExpr.java:187)
at org.exist.xquery.LetExpr.eval(LetExpr.java:187)
at org.exist.xquery.BindingExpression.eval(BindingExpression.java:164)
2015-03-19 21:56:59,147 [Timer-0] WARN (ThreadPoolAsynchronousRunner.java [run]:624) - com.mchange.v2.async.ThreadPoolAsynchronousRunner$DeadlockDetector#6824f21c -- APPARENT DEADLOCK!!! Complete Status:
Managed Threads: 3
Active Threads: 3
Active Tasks:
com.mchange.v2.resourcepool.BasicResourcePool$AsyncTestIdleResourceTask#79180a12 (com.mchange.v2.async.ThreadPoolAsynchronousRunner$PoolThread-#0)
com.mchange.v2.resourcepool.BasicResourcePool$AsyncTestIdleResourceTask#243c6d0c (com.mchange.v2.async.ThreadPoolAsynchronousRunner$PoolThread-#1)
com.mchange.v2.resourcepool.BasicResourcePool$AsyncTestIdleResourceTask#50191373 (com.mchange.v2.async.ThreadPoolAsynchronousRunner$PoolThread-#2)
Pending Tasks:
com.mchange.v2.resourcepool.BasicResourcePool$AsyncTestIdleResourceTask#3a9d08ca
com.mchange.v2.resourcepool.BasicResourcePool$AsyncTestIdleResourceTask#3ecdd11
com.mchange.v2.resourcepool.BasicResourcePool$AsyncTestIdleResourceTask#44ff846d
com.mchange.v2.resourcepool.BasicResourcePool$AsyncTestIdleResourceTask#5ce5850a
com.mchange.v2.resourcepool.BasicResourcePool$AsyncTestIdleResourceTask#eec1d04
com.mchange.v2.resourcepool.BasicResourcePool$AsyncTestIdleResourceTask#6b8d4d9d
com.mchange.v2.resourcepool.BasicResourcePool$AsyncTestIdleResourceTask#53e9706d
com.mchange.v2.resourcepool.BasicResourcePool$AsyncTestIdleResourceTask#23d472cf
com.mchange.v2.resourcepool.BasicResourcePool$AsyncTestIdleResourceTask#4dbe4f8c
com.mchange.v2.resourcepool.BasicResourcePool$AsyncTestIdleResourceTask#4c5e0203
com.mchange.v2.resourcepool.BasicResourcePool$AsyncTestIdleResourceTask#54ac79fd
com.mchange.v2.resourcepool.BasicResourcePool$AsyncTestIdleResourceTask#546e2bad
com.mchange.v2.resourcepool.BasicResourcePool$1DestroyResourceTask#6b13cc83
com.mchange.v2.resourcepool.BasicResourcePool$1DestroyResourceTask#57e185f8
com.mchange.v2.resourcepool.BasicResourcePool$1DestroyResourceTask#60357d68
com.mchange.v2.resourcepool.BasicResourcePool$1DestroyResourceTask#45231180
com.mchange.v2.resourcepool.BasicResourcePool$1DestroyResourceTask#3021aa73
com.mchange.v2.resourcepool.BasicResourcePool$1DestroyResourceTask#6bb437ca
com.mchange.v2.resourcepool.BasicResourcePool$1DestroyResourceTask#2021c9e9
com.mchange.v2.resourcepool.BasicResourcePool$1DestroyResourceTask#7d53637c
com.mchange.v2.resourcepool.BasicResourcePool$1DestroyResourceTask#409c2c97
com.mchange.v2.resourcepool.BasicResourcePool$1DestroyResourceTask#adc5929
com.mchange.v2.resourcepool.BasicResourcePool$1DestroyResourceTask#241ca71a
com.mchange.v2.resourcepool.BasicResourcePool$1DestroyResourceTask#42b26866
com.mchange.v2.resourcepool.BasicResourcePool$1DestroyResourceTask#636b1c33
com.mchange.v2.resourcepool.BasicResourcePool$1DestroyResourceTask#b160466
com.mchange.v2.resourcepool.BasicResourcePool$AcquireTask#4af34669
com.mchange.v2.resourcepool.BasicResourcePool$AcquireTask#1b53e609
com.mchange.v2.resourcepool.BasicResourcePool$AcquireTask#2062ebd4
com.mchange.v2.resourcepool.BasicResourcePool$AcquireTask#1b6cfe8a
com.mchange.v2.resourcepool.BasicResourcePool$AcquireTask#4b7c2380
com.mchange.v2.resourcepool.BasicResourcePool$AcquireTask#4f9be748
com.mchange.v2.resourcepool.BasicResourcePool$AcquireTask#78108924
com.mchange.v2.resourcepool.BasicResourcePool$AcquireTask#474b002
com.mchange.v2.resourcepool.BasicResourcePool$AcquireTask#2ebee32f
com.mchange.v2.resourcepool.BasicResourcePool$AcquireTask#3e0fe017
com.mchange.v2.resourcepool.BasicResourcePool$AcquireTask#42aa175b
com.mchange.v2.resourcepool.BasicResourcePool$AcquireTask#637f5bac
com.mchange.v2.resourcepool.BasicResourcePool$AcquireTask#3a017b77
com.mchange.v2.resourcepool.BasicResourcePool$AcquireTask#7b4f2b78
Pool thread stack traces:
Thread[com.mchange.v2.async.ThreadPoolAsynchronousRunner$PoolThread-#1,5,main]
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
java.lang.reflect.Constructor.newInstance(Constructor.java:408)
com.mysql.jdbc.Util.handleNewInstance(Util.java:406)
com.mysql.jdbc.ResultSetImpl.getInstance(ResultSetImpl.java:370)
com.mysql.jdbc.MysqlIO.buildResultSetWithRows(MysqlIO.java:2532)
com.mysql.jdbc.MysqlIO.getResultSet(MysqlIO.java:477)
com.mysql.jdbc.MysqlIO.readResultsForQueryOrUpdate(MysqlIO.java:2510)
com.mysql.jdbc.MysqlIO.readAllResults(MysqlIO.java:1746)
com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2135)
com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2536)
com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2465)
com.mysql.jdbc.StatementImpl.executeQuery(StatementImpl.java:1383)
com.mysql.jdbc.DatabaseMetaData$9.forEach(DatabaseMetaData.java:4826)
com.mysql.jdbc.IterateBlock.doForAll(IterateBlock.java:50)
com.mysql.jdbc.DatabaseMetaData.getTables(DatabaseMetaData.java:4804)
com.mchange.v2.c3p0.impl.DefaultConnectionTester.activeCheckConnectionNoQuery(DefaultConnectionTester.java:185)
com.mchange.v2.c3p0.impl.DefaultConnectionTester.activeCheckConnection(DefaultConnectionTester.java:62)
com.mchange.v2.c3p0.AbstractConnectionTester.activeCheckConnection(AbstractConnectionTester.java:67)
com.mchange.v2.c3p0.impl.C3P0PooledConnectionPool$1PooledConnectionResourcePoolManager.testPooledConnection(C3P0PooledConnectionPool.java:368)
com.mchange.v2.c3p0.impl.C3P0PooledConnectionPool$1PooledConnectionResourcePoolManager.refurbishIdleResource(C3P0PooledConnectionPool.java:310)
com.mchange.v2.resourcepool.BasicResourcePool$AsyncTestIdleResourceTask.run(BasicResourcePool.java:1999)
com.mchange.v2.async.ThreadPoolAsynchronousRunner$PoolThread.run(ThreadPoolAsynchronousRunner.java:547)
Thread[com.mchange.v2.async.ThreadPoolAsynchronousRunner$PoolThread-#2,5,main]
com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2596)
com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2465)
com.mysql.jdbc.StatementImpl.executeQuery(StatementImpl.java:1383)
com.mysql.jdbc.DatabaseMetaData$9.forEach(DatabaseMetaData.java:4826)
com.mysql.jdbc.IterateBlock.doForAll(IterateBlock.java:50)
com.mysql.jdbc.DatabaseMetaData.getTables(DatabaseMetaData.java:4804)
com.mchange.v2.c3p0.impl.DefaultConnectionTester.activeCheckConnectionNoQuery(DefaultConnectionTester.java:185)
com.mchange.v2.c3p0.impl.DefaultConnectionTester.activeCheckConnection(DefaultConnectionTester.java:62)
com.mchange.v2.c3p0.AbstractConnectionTester.activeCheckConnection(AbstractConnectionTester.java:67)
com.mchange.v2.c3p0.impl.C3P0PooledConnectionPool$1PooledConnectionResourcePoolManager.testPooledConnection(C3P0PooledConnectionPool.java:368)
com.mchange.v2.c3p0.impl.C3P0PooledConnectionPool$1PooledConnectionResourcePoolManager.refurbishIdleResource(C3P0PooledConnectionPool.java:310)
com.mchange.v2.resourcepool.BasicResourcePool$AsyncTestIdleResourceTask.run(BasicResourcePool.java:1999)
com.mchange.v2.async.ThreadPoolAsynchronousRunner$PoolThread.run(ThreadPoolAsynchronousRunner.java:547)
Thread[com.mchange.v2.async.ThreadPoolAsynchronousRunner$PoolThread-#0,5,main]
java.util.concurrent.ConcurrentHashMap.putVal(ConcurrentHashMap.java:1012)
java.util.concurrent.ConcurrentHashMap.put(ConcurrentHashMap.java:1006)
com.newrelic.agent.TransactionService.addTransaction(TransactionService.java:142)
com.newrelic.agent.Transaction.getTransaction(Transaction.java:1104)
com.newrelic.agent.Transaction.getTransaction(Transaction.java:1087)
com.newrelic.agent.TracerService$TracerServiceImpl.getTracer(TracerService.java:136)
com.newrelic.agent.TracerService.getTracer(TracerService.java:41)
com.newrelic.agent.instrumentation.InvocationPoint.invoke(InvocationPoint.java:55)
com.mysql.jdbc.StatementImpl.executeQuery(StatementImpl.java)
com.mysql.jdbc.DatabaseMetaData$9.forEach(DatabaseMetaData.java:4826)
com.mysql.jdbc.IterateBlock.doForAll(IterateBlock.java:50)
com.mysql.jdbc.DatabaseMetaData.getTables(DatabaseMetaData.java:4804)
com.mchange.v2.c3p0.impl.DefaultConnectionTester.activeCheckConnectionNoQuery(DefaultConnectionTester.java:185)
com.mchange.v2.c3p0.impl.DefaultConnectionTester.activeCheckConnection(DefaultConnectionTester.java:62)
com.mchange.v2.c3p0.AbstractConnectionTester.activeCheckConnection(AbstractConnectionTester.java:67)
com.mchange.v2.c3p0.impl.C3P0PooledConnectionPool$1PooledConnectionResourcePoolManager.testPooledConnection(C3P0PooledConnectionPool.java:368)
com.mchange.v2.c3p0.impl.C3P0PooledConnectionPool$1PooledConnectionResourcePoolManager.refurbishIdleResource(C3P0PooledConnectionPool.java:310)
com.mchange.v2.resourcepool.BasicResourcePool$AsyncTestIdleResourceTask.run(BasicResourcePool.java:1999)
com.mchange.v2.async.ThreadPoolAsynchronousRunner$PoolThread.run(ThreadPoolAsynchronousRunner.java:547)
We are using Hibernate and c3p0, with the following Maven artifacts:
<dependency>
<groupId>org.hibernate</groupId>
<artifactId>hibernate-core</artifactId>
<version>4.3.6.Final</version>
</dependency>
<dependency>
<groupId>org.hibernate</groupId>
<artifactId>hibernate-c3p0</artifactId>
<version>4.3.6.Final</version>
</dependency>
with the following c3p0 configuration:
configuration = new Configuration()
    .setProperty("hibernate.dialect", "org.hibernate.dialect.MySQL5InnoDBDialect")
    .setProperty("hibernate.connection.provider_class", "org.hibernate.connection.C3P0ConnectionProvider")
    .setProperty("hibernate.c3p0.idle_test_period", "1000")
    .setProperty("hibernate.c3p0.min_size", "20")
    .setProperty("hibernate.c3p0.max_size", "50")
    .setProperty("hibernate.c3p0.timeout", "1800")
    .setProperty("hibernate.c3p0.max_statements", "50");
The server is under very light load, ~5 queries a second (Java 8).
I stumbled into the same exception and the reason was a wrong password of the db user...
c3p0 you're a funny guy
So, the direct issue is that the Connection pool was trying to acquire new Connections, but the tasks attempting to acquire those Connections froze for a long period of time, so long that c3p0 decided the tasks must be deadlocked and then discarded and replaced the Thread pool. Later, the Thread pool was hung on idle connection test tasks.
Normally, "hung" tasks tend to look like the second Thread underneath the first "Pool thread stack traces:" label above: performing network IO that fails to complete. Your circumstances are odd in that two of the three threads are not stuck in IO. They have barely begun to do anything, yet they aren't live. Then you experience an OutOfMemoryError, and you get another APPARENT DEADLOCK on idle connection test tasks that also look like they mostly should be live.
Maybe your application is very close to some kind of resource limit that is causing things to run very sluggishly? Straightforwardly, you might increase the amount of memory available to this application (or modify it to have a lower memory footprint). You experience an OOME the second time the Thread pool tries to flush and recreate hung Threads, not directly provoked by that, but quite possibly caused by the growing Thread footprint. (In your logs, are there lots of these APPARENT DEADLOCKs previously? If you force a JVM Thread dump, do you see lots of com.mchange.v2.async.ThreadPoolAsynchronousRunner$PoolThread instances around, still hung on old tasks that never complete?)
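As a rough way to check for leftover pool threads without reading a full thread dump, something like the following could be run inside the application. It is only a sketch: the class and method names are made up for illustration, and the name prefix is taken from the thread names shown in the log excerpt in the question (other c3p0 versions or configurations may name their helper threads differently).

```java
final class PoolThreadCensus {
    // Prefix copied from the thread names in the log excerpt; an assumption
    // for any other c3p0 version or configuration.
    static final String C3P0_PREFIX =
            "com.mchange.v2.async.ThreadPoolAsynchronousRunner$PoolThread";

    /** Counts live threads whose name starts with the given prefix. */
    static long countThreadsWithPrefix(String prefix) {
        return Thread.getAllStackTraces().keySet().stream()
                .filter(t -> t.getName().startsWith(prefix))
                .count();
    }

    public static void main(String[] args) {
        System.out.println("c3p0 pool threads alive: "
                + countThreadsWithPrefix(C3P0_PREFIX));
    }
}
```

If this number keeps growing across APPARENT DEADLOCK messages, old threads are probably still stuck on tasks that never finish.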
Some general comments: If your load is no more than 5-ish simultaneous queries, why such a heavy pool? Why not min_size 5-ish, max 10-ish? Your max_statements setting is way too small for the number of Connections you permit in the pool. I'd omit this entirely until you get things working smoothly. Then, to get a bit better performance, you might set the more-straightforward-to-reason-about maxStatementsPerConnection parameter if you want.
Mostly you need to keep your application's footprint (memory? Threads?) well below the resources allotted to it, either by increasing resources, reducing its footprint, or fixing any issue that might cause its resource footprint to grow into those limits. I'd start by making the pool much smaller and the available memory larger.
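The pool-sizing suggestions above could be sketched roughly as follows. The values 5 and 10 are just the "5-ish" and "10-ish" from the text, not tuned numbers, and the class name is hypothetical. The other keys are the ones already used in the question; max_statements is deliberately left out.

```java
import java.util.Properties;

final class SmallerPoolSettings {
    /** Builds the reduced c3p0 settings; sizes are illustrative, not tuned. */
    static Properties c3p0Settings() {
        Properties p = new Properties();
        p.setProperty("hibernate.c3p0.min_size", "5");
        p.setProperty("hibernate.c3p0.max_size", "10");
        // Unchanged from the question's configuration.
        p.setProperty("hibernate.c3p0.timeout", "1800");
        p.setProperty("hibernate.c3p0.idle_test_period", "1000");
        // hibernate.c3p0.max_statements is intentionally omitted until the
        // pool behaves; statement caching can be revisited afterwards.
        return p;
    }

    public static void main(String[] args) {
        c3p0Settings().forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

These could be applied with the same setProperty calls as in the question. Once things are stable, maxStatementsPerConnection could be added; I'd expect it to be settable under the same hibernate.c3p0. prefix, but check that against your Hibernate and c3p0 versions.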
Authentication issue for me too. I just added the domain part to the database server name and it all worked. Very misleading error.
One of the applications I work on generates reports from large volumes of data returned from the DB. The application receives the request as an AJAX async request, executes a query based on the input parameters, stores the data in JSON format, generates an Excel file with the same contents and then returns the JSON result back to the browser to display the records.
At times, due to the high volume of data, the server crashes because of high memory usage, and at other times the server threads hang. The exception captured in the logs is shown below:
[3/20/14 21:55:07:051 IST] 0000001a ThreadMonitor W WSVR0605W: Thread "WebContainer : 6" (00000038) has been active for 761754 milliseconds and may be hung. There is/are 1 thread(s) in total in the server that may be hung.
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:141)
at com.ibm.db2.jcc.a.ab.b(ab.java:168)
at com.ibm.db2.jcc.a.ab.c(ab.java:222)
at com.ibm.db2.jcc.a.ab.c(ab.java:337)
at com.ibm.db2.jcc.a.ab.v(ab.java:1447)
at com.ibm.db2.jcc.a.db.a(db.java:42)
at com.ibm.db2.jcc.a.r.a(r.java:30)
at com.ibm.db2.jcc.a.sb.g(sb.java:152)
at com.ibm.db2.jcc.b.zc.n(zc.java:1186)
at com.ibm.db2.jcc.b.ad.db(ad.java:1761)
at com.ibm.db2.jcc.b.ad.d(ad.java:2203)
at com.ibm.db2.jcc.b.ad.U(ad.java:489)
at com.ibm.db2.jcc.b.ad.executeQuery(ad.java:472)
at com.ibm.ws.rsadapter.jdbc.WSJdbcPreparedStatement.pmiExecuteQuery(WSJdbcPreparedStatement.java:1102)
at com.ibm.ws.rsadapter.jdbc.WSJdbcPreparedStatement.executeQuery(WSJdbcPreparedStatement.java:723)
at com.abc(ABCDao.java:1250)
at com.abc(ABCFacade.java:159)
at com.abcservlet.doPost(ABCServlet.java:56)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:738)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:831)
Environment details are as follows :
1. Application Server - WebSphere Application Server 7.0.0.29
2. DB Server - DB2
3. JVM Memory - 512MB - 2048 MB
I wanted to know what the best approach is to handle these kinds of scenarios. We use POI to generate the reports, and we are using "org.apache.poi.xssf.streaming.SXSSFWorkbook" to keep less data in memory on the server. But if the data exceeds 200000 records, the server thread hangs.
When developing an application which consumes an external webservice I have generated the sources from the wsdl-url and then created a client:
GeoIPServiceClient service = new GeoIPServiceClient();
GeoIPServiceSoap geoIPClient = service.getGeoIPServiceSoap();
Since the creation of this proxy takes some time I set the client as an attribute in my service class.
But I'm worried that the client isn't thread safe and this webservice is heavily used in the application by concurrent threads (webapp). I can't find any documentation on this.
As a precaution I've started to use an object pool of soap clients instead of a shared one.
Is this an unnecessary precaution? What is the best practice when writing xfire clients?
I suspect some kind of concurrency problem with xfire since I regularly, under high load, get blocked threads and as a result of this the application crashes. Here's a partial thread dump:
"http-xx.xx.xx.xx-80-17" daemon prio=10 tid=0x00007f560d437000 nid=0x66cb waiting for monitor entry [0x00000000412b8000]
java.lang.Thread.State: BLOCKED (on object monitor)
at com.sun.xml.bind.v2.runtime.reflect.opt.Injector.inject(Injector.java:174)
- waiting to lock <0x00007f561d44e1c0> (a com.sun.xml.bind.v2.runtime.reflect.opt.Injector)
at com.sun.xml.bind.v2.runtime.reflect.opt.Injector.inject(Injector.java:85)
at com.sun.xml.bind.v2.runtime.reflect.opt.AccessorInjector.prepare(AccessorInjector.java:87)
at com.sun.xml.bind.v2.runtime.reflect.opt.OptimizedAccessorFactory.get(OptimizedAccessorFactory.java:165)
at com.sun.xml.bind.v2.runtime.reflect.Accessor$FieldReflection.optimize(Accessor.java:253)
at com.sun.xml.bind.v2.runtime.reflect.TransducedAccessor$CompositeTransducedAccessorImpl.<init>(TransducedAccessor.java:231)
at com.sun.xml.bind.v2.runtime.reflect.TransducedAccessor.get(TransducedAccessor.java:173)
at com.sun.xml.bind.v2.runtime.property.SingleElementLeafProperty.<init>(SingleElementLeafProperty.java:83)
at sun.reflect.GeneratedConstructorAccessor165.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
at com.sun.xml.bind.v2.runtime.property.PropertyFactory.create(PropertyFactory.java:124)
at com.sun.xml.bind.v2.runtime.ClassBeanInfoImpl.<init>(ClassBeanInfoImpl.java:171)
at com.sun.xml.bind.v2.runtime.JAXBContextImpl.getOrCreate(JAXBContextImpl.java:481)
at com.sun.xml.bind.v2.runtime.JAXBContextImpl.<init>(JAXBContextImpl.java:315)
at com.sun.xml.bind.v2.ContextFactory.createContext(ContextFactory.java:139)
at com.sun.xml.bind.v2.ContextFactory.createContext(ContextFactory.java:117)
at com.sun.xml.bind.v2.ContextFactory.createContext(ContextFactory.java:188)
at sun.reflect.GeneratedMethodAccessor176.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at javax.xml.bind.ContextFinder.newInstance(ContextFinder.java:128)
at javax.xml.bind.ContextFinder.find(ContextFinder.java:277)
at javax.xml.bind.JAXBContext.newInstance(JAXBContext.java:372)
at javax.xml.bind.JAXBContext.newInstance(JAXBContext.java:337)
at javax.xml.bind.JAXBContext.newInstance(JAXBContext.java:244)
at org.codehaus.xfire.jaxb2.JaxbType.getJAXBContext(JaxbType.java:306)
- locked <0x00007f565b3aee60> (a org.codehaus.xfire.jaxb2.JaxbType)
at org.codehaus.xfire.jaxb2.JaxbType.writeObject(JaxbType.java:230)
at org.codehaus.xfire.aegis.AegisBindingProvider.writeParameter(AegisBindingProvider.java:229)
at org.codehaus.xfire.service.binding.AbstractBinding.writeParameter(AbstractBinding.java:273)
at org.codehaus.xfire.service.binding.WrappedBinding.writeMessage(WrappedBinding.java:90)
at org.codehaus.xfire.soap.SoapSerializer.writeMessage(SoapSerializer.java:80)
at org.codehaus.xfire.transport.http.HttpChannel.writeWithoutAttachments(HttpChannel.java:56)
at org.codehaus.xfire.transport.http.OutMessageRequestEntity.writeRequest(OutMessageRequestEntity.java:51)
at org.apache.commons.httpclient.methods.EntityEnclosingMethod.writeRequestBody(EntityEnclosingMethod.java:499)
at org.apache.commons.httpclient.HttpMethodBase.writeRequest(HttpMethodBase.java:2114)
at org.apache.commons.httpclient.HttpMethodBase.execute(HttpMethodBase.java:1096)
at org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(HttpMethodDirector.java:398)
at org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:171)
at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397)
at org.codehaus.xfire.transport.http.CommonsHttpMessageSender.send(CommonsHttpMessageSender.java:369)
at org.codehaus.xfire.transport.http.HttpChannel.sendViaClient(HttpChannel.java:123)
at org.codehaus.xfire.transport.http.HttpChannel.send(HttpChannel.java:48)
at org.codehaus.xfire.handler.OutMessageSender.invoke(OutMessageSender.java:26)
at org.codehaus.xfire.handler.HandlerPipeline.invoke(HandlerPipeline.java:131)
at org.codehaus.xfire.client.Invocation.invoke(Invocation.java:79)
at org.codehaus.xfire.client.Invocation.invoke(Invocation.java:114)
at org.codehaus.xfire.client.Client.invoke(Client.java:336)
at org.codehaus.xfire.client.XFireProxy.handleRequest(XFireProxy.java:77)
at org.codehaus.xfire.client.XFireProxy.invoke(XFireProxy.java:57)
at $Proxy143.getMyMethod(Unknown Source)
The thread dump contains a lot of blocked threads that look like this.
I guess as you get a lot of blocked threads, the client is actually thread-safe as object data is not corrupted :). But I agree it's not handling the concurrency in a good way.
1) One observation is that the final lock seems to be in the JAXB implementation and not in XFire. What if you try using a different JAXB implementation, like JaxMe?
2) Also, the method getJAXBContext in JaxbType is synchronised. Your threads are most likely accessing the same JaxbType instance, which is why they may be blocked.
Looking at that method, I would actually move the synchronisation inside the method, after the presence of the context has been checked:
if (context == null) {
synchronized (this) {
...
This will allow for clients that already have JAXBContext initialised to skip expensive synchronisation.
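A fuller sketch of that idea (double-checked locking) is below. It is not XFire's actual code: the holder class and the Supplier standing in for the expensive JAXBContext.newInstance(...) call are made up for illustration. Note that the field has to be volatile and the null check has to be repeated inside the synchronized block, otherwise the pattern is not safe.

```java
import java.util.function.Supplier;

final class LazyContext<T> {
    // volatile is required for double-checked locking to be correct
    // under the Java memory model (Java 5 and later).
    private volatile T context;
    private final Supplier<T> factory; // stands in for the expensive creation

    LazyContext(Supplier<T> factory) {
        this.factory = factory;
    }

    T get() {
        T local = context;              // fast path: no lock once initialised
        if (local == null) {
            synchronized (this) {
                local = context;        // re-check: another thread may have won
                if (local == null) {
                    local = factory.get();
                    context = local;
                }
            }
        }
        return local;
    }
}
```

Only the first callers contend for the lock; after initialisation every call is a single volatile read.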
My suggestion is either try fixing the code yourself and make a test or submit a bug to XFire or do both :).
It depends on the version of XFire you are using, as a few thread-safety issues were fixed in version 1.2.5. You can check the bug raised at http://jira.codehaus.org/browse/XFIRE-886 and see more details in the release notes at http://xfire.codehaus.org/XFire+1.2.5+Release+Notes