Hystrix command fails with "timed-out and no fallback available" - java

I've noticed that some of the commands in my application fail with
Caused by: com.netflix.hystrix.exception.HystrixRuntimeException: GetAPICommand timed-out and no fallback available.
    at com.netflix.hystrix.HystrixCommand.getFallbackOrThrowException(HystrixCommand.java:1631)
    at com.netflix.hystrix.HystrixCommand.access$2000(HystrixCommand.java:97)
    at com.netflix.hystrix.HystrixCommand$TimeoutObservable$1$1.tick(HystrixCommand.java:1025)
    at com.netflix.hystrix.HystrixCommand$1.performBlockingGetWithTimeout(HystrixCommand.java:621)
    at com.netflix.hystrix.HystrixCommand$1.get(HystrixCommand.java:516)
    at com.netflix.hystrix.HystrixCommand.execute(HystrixCommand.java:425)
Caused by: java.util.concurrent.TimeoutException: null
    ... 11 common frames omitted
This is my Hystrix configuration override:
hystrix.command.default.execution.isolation.thread.timeoutInMilliseconds=210000
hystrix.threadpool.default.coreSize=50
hystrix.threadpool.default.maxQueueSize=100
hystrix.threadpool.default.queueSizeRejectionThreshold=50
What kind of timeout is this? Is it a read/connection timeout to the external application? How do I go about debugging this?

This is a Hystrix command timeout. The timeout is enabled by default for each command, and you set its value with the property execution.isolation.thread.timeoutInMilliseconds:
This property sets the time in milliseconds after which the caller will observe a timeout and walk away from the command execution. Hystrix marks the HystrixCommand as a TIMEOUT, and performs fallback logic.
So you can either increase the timeout value or, if that applies in your case, disable the default timeout for your command using the property:
hystrix.command.default.execution.timeout.enabled=false
You can find more information here: https://github.com/Netflix/Hystrix/wiki/Configuration#CommandExecution
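If you use the hystrix-javanica annotations, the same settings can also be applied per command. A minimal sketch, assuming a hypothetical ApiClient bean (the method body, timeout value, and class name are placeholders, not from the question):

import com.netflix.hystrix.contrib.javanica.annotation.HystrixCommand;
import com.netflix.hystrix.contrib.javanica.annotation.HystrixProperty;
import org.springframework.stereotype.Service;

@Service
public class ApiClient {

    // Raise the execution timeout for this command only; inside the annotation
    // the property name is used without the "hystrix.command.<key>." prefix.
    @HystrixCommand(commandProperties = {
            @HystrixProperty(name = "execution.isolation.thread.timeoutInMilliseconds",
                             value = "210000")
            // or disable the timeout for this command entirely:
            // @HystrixProperty(name = "execution.timeout.enabled", value = "false")
    })
    public String getApi() {
        return callRemoteApi(); // placeholder for the actual remote call
    }

    private String callRemoteApi() {
        return "response";
    }
}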

It might be that you are running in debug mode or that your connection is too slow. The default thread execution timeout is only 1 second, so you can hit this message easily if, say, you put a breakpoint in your command.
Although this is not your case, it might help somebody else.

After adding the following dependency
<dependency>
    <groupId>com.netflix.hystrix</groupId>
    <artifactId>hystrix-javanica</artifactId>
    <version>1.5.18</version>
</dependency>
I was able to resolve the timeout issue.
What you need to do is set the hystrix.command.default.execution.isolation.thread.timeoutInMilliseconds=210000 property in application.properties or bootstrap.properties.
This works for me; hopefully it works for you too.

Looking at the stack trace, this is an exception thrown by Hystrix after the 210 seconds you defined above.
TimeoutException is a checked exception that needs to be declared on each method that could throw it; you would see it declared on the run() method of your command.
You can debug this like any other program, but be aware that the run() method runs in a thread separate from the caller. After 210 seconds the caller will just continue despite your debugging session.
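For illustration, here is a minimal sketch of what such a command might look like; the class name GetAPICommand comes from the stack trace, but the body and the helper method are assumptions:

import com.netflix.hystrix.HystrixCommand;
import com.netflix.hystrix.HystrixCommandGroupKey;

public class GetAPICommand extends HystrixCommand<String> {

    protected GetAPICommand() {
        super(HystrixCommandGroupKey.Factory.asKey("ExampleGroup"));
    }

    @Override
    protected String run() throws Exception {
        // Runs on a Hystrix thread-pool thread, not on the caller's thread.
        // If this takes longer than execution.isolation.thread.timeoutInMilliseconds,
        // the caller gets a HystrixRuntimeException with a TimeoutException as its cause.
        return callRemoteApi();
    }

    private String callRemoteApi() throws Exception {
        // placeholder for the actual call to the external application
        return "response";
    }
}

The caller then runs it with new GetAPICommand().execute(), which is the execute() frame visible in your stack trace.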

You should increase the readTimeout property of your REST client's HTTP client.
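For example, if the command calls out via Spring's RestTemplate (an assumption; the question does not say which HTTP client is used), a sketch would be:

import org.springframework.http.client.SimpleClientHttpRequestFactory;
import org.springframework.web.client.RestTemplate;

public class RestClientConfig {

    public RestTemplate restTemplate() {
        SimpleClientHttpRequestFactory factory = new SimpleClientHttpRequestFactory();
        factory.setConnectTimeout(5_000);   // connect timeout in ms (assumed value)
        factory.setReadTimeout(210_000);    // read timeout in ms, aligned with the Hystrix timeout
        return new RestTemplate(factory);
    }
}

Keep the HTTP client's read timeout at or below the Hystrix command timeout; otherwise Hystrix gives up before the client does.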

Related

How to avoid "failed to send operations" errors after switching to the "exactly-once" delivery strategy?

I recently tried to switch my subscriptions in GCP Pub/Sub to the "exactly-once" delivery strategy. However, I started seeing the following warnings ~10 times every 30 minutes in my application logs:
com.google.api.gax.rpc.InvalidArgumentException: io.grpc.StatusRuntimeException: INVALID_ARGUMENT: Some acknowledgement ids in the request were invalid. This could be because the acknowledgement ids have expired or the acknowledgement ids were malformed.
at com.google.api.gax.rpc.ApiExceptionFactory.createException(ApiExceptionFactory.java:92)
at com.google.api.gax.grpc.GrpcApiExceptionFactory.create(GrpcApiExceptionFactory.java:98)
at com.google.api.gax.grpc.GrpcApiExceptionFactory.create(GrpcApiExceptionFactory.java:66)
at com.google.api.gax.grpc.GrpcExceptionCallable$ExceptionTransformingFuture.onFailure(GrpcExceptionCallable.java:97)
at com.google.api.core.ApiFutures$1.onFailure(ApiFutures.java:67)
at com.google.common.util.concurrent.Futures$CallbackListener.run(Futures.java:1041)
at com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:30)
at com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:1215)
at com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:983)
at com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:771)
at io.grpc.stub.ClientCalls$GrpcFuture.setException(ClientCalls.java:574)
at io.grpc.stub.ClientCalls$UnaryStreamToFuture.onClose(ClientCalls.java:544)
at io.grpc.PartialForwardingClientCallListener.onClose(PartialForwardingClientCallListener.java:39)
at io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:23)
at io.grpc.ForwardingClientCallListener$SimpleForwardingClientCallListener.onClose(ForwardingClientCallListener.java:40)
at com.google.api.gax.grpc.ChannelPool$ReleasingClientCall$1.onClose(ChannelPool.java:535)
at io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:563)
at io.grpc.internal.ClientCallImpl.access$300(ClientCallImpl.java:70)
at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInternal(ClientCallImpl.java:744)
at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInContext(ClientCallImpl.java:723)
at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
at io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:133)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
at java.base/java.util.concurrent.FutureTask.run(Unknown Source)
at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.base/java.lang.Thread.run(Unknown Source)
Caused by: io.grpc.StatusRuntimeException: INVALID_ARGUMENT: Some acknowledgement ids in the request were invalid. This could be because the acknowledgement ids have expired or the acknowledgement ids were malformed.
at io.grpc.Status.asRuntimeException(Status.java:535)
... 17 more
They're immediately followed by the following INFO log messages in the same thread:
Permanent error invalid ack id message, will not resend
I didn't see any problems caused by these warnings, but it's a bit hard to tell because my application is processing a decent number of messages (~1000/hour). I initially thought that these warnings were just short-term "aftershocks" from switching to the "exactly-once" strategy. However, I waited for about 2 hours and they kept occurring with the same frequency, showing no sign of disappearing. I then disabled the "exactly-once" strategy and they went away immediately after. Can anyone tell me whether these warnings are dangerous, what side effects I can expect, and most importantly how I can get rid of them?
I'm using version 3.4.0 of spring-cloud-gcp-dependencies and spring-cloud-gcp-starter-pubsub. I'm also using Spring Cloud Stream to process the incoming messages and I rely on it to automatically acknowledge the messages.
I have the following configuration set in my application.yaml file:
spring:
  cloud:
    gcp:
      pubsub:
        subscriber:
          executor-threads: 15
          max-ack-extension-period: 23400 # 6 hours and 30 minutes
          acknowledgement-deadline: 600 # Maximum value
For context: The messages in my application represent jobs for execution, and they could take quite a while to finish - hence the 6h30m maximum acknowledgement extension period.
I also saw the following StackOverflow question:
How to handle errors during message acknowledgement using google pubsub java library?
From what I understand, the consequence of these warnings is that the messages will be redelivered to my application, but this is exactly what I want to avoid.
Thanks for the question, Alexander.
The errors you are seeing happen when a modifyAckDeadline or acknowledgement request to the service fails because the acknowledgement id has already expired. In such cases the service considers the expired acknowledgement id invalid, since a newer delivery might already be in flight. This is as per the guarantees for exactly-once delivery. There could be a few reasons for it:
The request was delayed by the network, and by the time it arrived at the server the acknowledgement id lease had already expired.
The client issuing the modifyAckDeadline or acknowledgement requests is overwhelmed (high CPU/memory/network usage), delaying those requests.
I suggest setting min-duration-per-ack-extension to a high value to reduce the issues mentioned above; this helps mitigate the cases where the acknowledgement id lease expires. The highest value you can set for this field is 600 seconds.
Additionally, as mentioned in the other Stack Overflow question, you should consider checking the response of your acknowledgement operations, which tells your application whether it can expect a redelivery. Sample.
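Since the question relies on Spring Cloud Stream's automatic acknowledgement, the following lower-level sketch with the plain google-cloud-pubsub client is only illustrative of what checking the acknowledgement response looks like on an exactly-once subscription; the project and subscription names are placeholders:

import com.google.cloud.pubsub.v1.AckReplyConsumerWithResponse;
import com.google.cloud.pubsub.v1.AckResponse;
import com.google.cloud.pubsub.v1.MessageReceiverWithAckResponse;
import com.google.cloud.pubsub.v1.Subscriber;
import com.google.pubsub.v1.ProjectSubscriptionName;
import com.google.pubsub.v1.PubsubMessage;
import java.util.concurrent.Future;

public class ExactlyOnceAckCheck {

    public static void main(String[] args) {
        ProjectSubscriptionName subscription =
                ProjectSubscriptionName.of("my-project", "my-subscription"); // placeholders

        MessageReceiverWithAckResponse receiver =
                (PubsubMessage message, AckReplyConsumerWithResponse consumer) -> {
                    // ... run the long-lived job represented by the message ...
                    Future<AckResponse> ackFuture = consumer.ack();
                    try {
                        AckResponse response = ackFuture.get();
                        if (response != AckResponse.SUCCESSFUL) {
                            // e.g. INVALID: the ack id lease expired, so expect a redelivery
                            System.err.println("Ack failed: " + response);
                        }
                    } catch (Exception e) {
                        e.printStackTrace();
                    }
                };

        Subscriber subscriber = Subscriber.newBuilder(subscription, receiver).build();
        subscriber.startAsync().awaitRunning();
    }
}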

How to deal with the H2 database's inability to deal with interrupts

I was occasionally seeing that my embedded H2 database appeared to get corrupted. In particular, I had amended an ExecutorService so that if a task took too long it would cancel the task. The task would be cancelled okay, but then subsequent database access failed with exceptions such as
23/07/2019 14.23.31:BST:DeleteDuplicatesController:start:SEVERE: commit failed
org.hibernate.TransactionException: commit failed
at org.hibernate.engine.transaction.spi.AbstractTransactionImpl.commit(AbstractTransactionImpl.java:187)
at com.jthink.songkong.db.ReportCache.save(ReportCache.java:46)
at com.jthink.songkong.reports.AbstractReport.setReportDatabaseObject(AbstractReport.java:365)
at com.jthink.songkong.reports.DeleteDuplicatesReport.setReportDatabaseObject(DeleteDuplicatesReport.java:333)
at com.jthink.songkong.reports.DeleteDuplicatesReport.closeReport(DeleteDuplicatesReport.java:377)
at com.jthink.songkong.analyse.toplevelanalyzer.DeleteDuplicatesController.deleteAnyDups(DeleteDuplicatesController.java:606)
at com.jthink.songkong.analyse.toplevelanalyzer.DeleteDuplicatesController.start(DeleteDuplicatesController.java:665)
at com.jthink.songkong.ui.swingworker.DeleteDuplicates.doInBackground(DeleteDuplicates.java:43)
at com.jthink.songkong.ui.swingworker.DeleteDuplicates.doInBackground(DeleteDuplicates.java:20)
at javax.swing.SwingWorker$1.call(SwingWorker.java:295)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at javax.swing.SwingWorker.run(SwingWorker.java:334)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.hibernate.TransactionException: unable to commit against JDBC connection
at org.hibernate.engine.transaction.internal.jdbc.JdbcTransaction.doCommit(JdbcTransaction.java:116)
at org.hibernate.engine.transaction.spi.AbstractTransactionImpl.commit(AbstractTransactionImpl.java:180)
... 14 more
Caused by: org.h2.jdbc.JdbcSQLNonTransientException: General error: "java.lang.IllegalStateException: Reading from nio:C:/Users/Paul/AppData/Roaming/SongKong/Database/Database.mv.db failed; file length -1 read length 4096 at 1541494 [1.4.199/1]"; SQL statement:
COMMIT [50000-199]
at org.h2.message.DbException.getJdbcSQLException(DbException.java:502)
at org.h2.message.DbException.getJdbcSQLException(DbException.java:427)
at org.h2.message.DbException.get(DbException.java:194)
at org.h2.message.DbException.convert(DbException.java:347)
at org.h2.command.Command.executeUpdate(Command.java:280)
at org.h2.jdbc.JdbcConnection.commit(JdbcConnection.java:542)
at com.mchange.v2.c3p0.impl.NewProxyConnection.commit(NewProxyConnection.java:1284)
at org.hibernate.engine.transaction.internal.jdbc.JdbcTransaction.doCommit(JdbcTransaction.java:112)
... 15 more
Caused by: java.lang.IllegalStateException: Reading from nio:C:/Users/Paul/AppData/Roaming/SongKong/Database/Database.mv.db failed; file length -1 read length 4096 at 1541494 [1.4.199/1]
at org.h2.mvstore.DataUtils.newIllegalStateException(DataUtils.java:883)
at org.h2.mvstore.DataUtils.readFully(DataUtils.java:420)
at org.h2.mvstore.FileStore.readFully(FileStore.java:98)
at org.h2.mvstore.MVStore.readBufferForPage(MVStore.java:1048)
at org.h2.mvstore.MVStore.readPage(MVStore.java:2186)
at org.h2.mvstore.MVMap.readPage(MVMap.java:554)
at org.h2.mvstore.Page$NonLeaf.getChildPage(Page.java:1086)
at org.h2.mvstore.Page.get(Page.java:221)
at org.h2.mvstore.MVMap.get(MVMap.java:402)
at org.h2.mvstore.MVMap.get(MVMap.java:389)
at org.h2.mvstore.MVStore.getMapName(MVStore.java:2737)
at org.h2.mvstore.MVStore.renameMap(MVStore.java:2650)
at org.h2.mvstore.tx.TransactionStore.commit(TransactionStore.java:453)
at org.h2.mvstore.tx.Transaction.commit(Transaction.java:389)
at org.h2.engine.Session.commit(Session.java:691)
at org.h2.command.dml.TransactionCommand.update(TransactionCommand.java:46)
at org.h2.command.CommandContainer.update(CommandContainer.java:133)
at org.h2.command.Command.executeUpdate(Command.java:267)
... 18 more
Caused by: java.nio.channels.ClosedChannelException
at sun.nio.ch.FileChannelImpl.ensureOpen(FileChannelImpl.java:110)
at sun.nio.ch.FileChannelImpl.read(FileChannelImpl.java:721)
at org.h2.store.fs.FileNio.read(FilePathNio.java:74)
at org.h2.mvstore.DataUtils.readFully(DataUtils.java:406)
... 34 more
23/07/2019 14.23.31:BST:Errors:addError:SEVERE: Adding Error:commit failed
I have since found this issue.
Basically, if H2 is used in embedded mode and it receives an interrupt, then all subsequent access fails until the thread pool is closed and reopened. In the example I give, where a process has to be cancelled because it appears to be stuck, there is no alternative to interrupting.
I also have another case where the controller thread doesn't directly do any database work itself, so I was struggling to see why an interrupt on it would cause database errors. I have now worked out that the issue is that I'm using an ExecutorService with a fixed-size BlockingQueue (so that we don't build up a big queue in memory), but if the queue is full then a new task is actually executed by the controller thread (because of CallerRunsPolicy), so the controller thread can be making calls to the database after all, as the sketch below illustrates.
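A minimal sketch of that executor setup (pool sizes and the task body are assumptions, not taken from the original code):

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class BoundedExecutorSketch {

    public static void main(String[] args) {
        // A bounded queue keeps memory use down, but combined with CallerRunsPolicy
        // it means that once the queue is full, execute() runs the task directly
        // on the submitting (controller) thread.
        ThreadPoolExecutor executor = new ThreadPoolExecutor(
                4, 4,                              // core/max pool size (assumed)
                0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(100),     // fixed-size queue
                new ThreadPoolExecutor.CallerRunsPolicy());

        executor.execute(() -> {
            // Task that may use Hibernate/H2; with a full queue this lambda runs
            // on the controller thread, so interrupting that thread later can
            // break the embedded H2 file channel.
        });

        executor.shutdown();
    }
}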
I'm using H2 with Hibernate, and in both cases calling the following immediately after the interrupt
HibernateUtil.closeFactory();
seems to solve the issue. However, I guess this means that any other threads with Hibernate sessions will be broken, but at least newly opened sessions will be okay. So I'm not particularly happy with this workaround; any other ideas?
Using H2 as a server is not a solution, since the whole point of H2 was an embedded database self-contained within the application.
Although it is not properly documented, using the async protocol allows a connection to be interrupted without breaking all other connections.
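For reference, the pluggable file system prefix goes into the JDBC URL. The path below is the one from the stack trace; the exact URL form and the default credentials are my assumptions to be checked against the H2 documentation:

import java.sql.Connection;
import java.sql.DriverManager;

public class H2AsyncConnection {

    public static Connection open() throws Exception {
        // The stack trace shows the nio: file system (FileChannel-based), which is
        // closed when the reading thread is interrupted. The async: file system is
        // the alternative referred to above.
        return DriverManager.getConnection(
                "jdbc:h2:async:C:/Users/Paul/AppData/Roaming/SongKong/Database/Database",
                "sa", ""); // default H2 credentials assumed
    }
}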

Why am I always getting "{"status":504,"error":"Gateway Timeout","message":"com.netflix.zuul.exception.ZuulException: Hystrix Readed time out"}"?

I am always getting "2019-04-09 07:24:23.389 WARN 11676 --- [nio-9095-exec-5] o.s.c.n.z.filters.post.SendErrorFilter : Error during filtering" for any request that takes more than 1 second.
I have already tried to increase the timeout, but none of my changes worked.
2019-04-09 07:24:23.389 WARN 11676 --- [nio-9095-exec-5] o.s.c.n.z.filters.post.SendErrorFilter : Error during filtering
com.netflix.zuul.exception.ZuulException:
at org.springframework.cloud.netflix.zuul.filters.post.SendErrorFilter.findZuulException(SendErrorFilter.java:114) ~[spring-cloud-netflix-zuul-2.1.0.RELEASE.jar:2.1.0.RELEASE]
at org.springframework.cloud.netflix.zuul.filters.post.SendErrorFilter.run(SendErrorFilter.java:76) ~[spring-cloud-netflix-zuul-2.1.0.RELEASE.jar:2.1.0.RELEASE]
at com.netflix.zuul.ZuulFilter.runFilter(ZuulFilter.java:117) ~[zuul-core-1.3.1.jar:1.3.1]
at com.netflix.zuul.FilterProcessor.processZuulFilter(FilterProcessor.java:193) ~[zuul-core-1.3.1.jar:1.3.1]
at com.netflix.zuul.FilterProcessor.runFilters(FilterProcessor.java:157) ~[zuul-core-1.3.1.jar:1.3.1]
at com.netflix.zuul.FilterProcessor.error(FilterProcessor.java:105) ~[zuul-core-1.3.1.jar:1.3.1]
at com.netflix.zuul.ZuulRunner.error(ZuulRunner.java:112) ~[zuul-core-1.3.1.jar:1.3.1]
at com.netflix.zuul.http.ZuulServlet.error(ZuulServlet.java:145) ~[zuul-core-1.3.1.jar:1.3.1]
at com.netflix.zuul.http.ZuulServlet.service(ZuulServlet.java:83) ~[zuul-core-1.3.1.jar:1.3.1]
at org.springframework.web.servlet.mvc.ServletWrappingController.handleRequestInternal(ServletWrappingController.java:165) ~[spring-webmvc-5.1.5.RELEASE.jar:5.1.5.RELEASE]
at java.lang.Thread.run(Thread.java:834) ~[na:na]
You can check my answer: here
The Hystrix read timeout is 1 second by default, and you can change that in your application.yaml file, either globally or per service.
The above issue is caused by the Hystrix timeout.
It can be solved by disabling the Hystrix timeout or increasing it, as below:
# Disable Hystrix timeout globally (for all services)
hystrix.command.default.execution.timeout.enabled: false
# Disable the timeout for a particular service
hystrix.command.<serviceName>.execution.timeout.enabled: false
# Increase the Hystrix timeout to 60s (globally)
hystrix.command.default.execution.isolation.thread.timeoutInMilliseconds: 60000
# Increase the Hystrix timeout to 60s (per service)
hystrix.command.<serviceName>.execution.isolation.thread.timeoutInMilliseconds: 60000
The above solution will work if you are using discovery service for service lookup and routing.
Here is the detailed explanation: spring-cloud-netflix-issue-321
You may also be timing out when testing against the H2 console with Postman or any other HTTP tester because, going through Zuul and Hystrix, you are trying to send the exact same object to the H2 database. This may happen because you also have validators on your models. To resolve it, make sure the JSON, XML, or other payload objects are reasonably unique by re-editing them, then try sending the request again.

Every 15 minutes this exception occurs; see the fillInStackTrace information below

Problem description: the MongoDB version is 3.4.
Nothing unusual is being done, just normal queries and writes; the system is in the testing phase, so QPS is small.
Questions:
1. How is this exception produced?
2. What configuration or adjustment needs to be done?
02-01 15:11:47 WARN - Got socket exception on connection [connectionId{localValue:43}] to 172.16.199.96:22001. All connections to 172.16.199.96:22001 will be closed.
02-01 15:11:47 INFO - Closed connection [connectionId{localValue:43}] to 172.16.199.96:22001 because there was a socket exception raised by this connection.
org.springframework.data.mongodb.UncategorizedMongoDbException: Exception receiving message; nested exception is com.mongodb.MongoSocketReadException: Exception receiving message
at org.springframework.data.mongodb.core.MongoExceptionTranslator.translateExceptionIfPossible(MongoExceptionTranslator.java:107)
at org.springframework.data.mongodb.core.MongoTemplate.potentiallyConvertRuntimeException(MongoTemplate.java:2135)
at org.springframework.data.mongodb.core.MongoTemplate.executeFindMultiInternal(MongoTemplate.java:1978)
at org.springframework.data.mongodb.core.MongoTemplate.doFind(MongoTemplate.java:1784)
at org.springframework.data.mongodb.core.MongoTemplate.doFind(MongoTemplate.java:1767)
at org.springframework.data.mongodb.core.MongoTemplate.find(MongoTemplate.java:641)
at org.springframework.data.mongodb.core.MongoTemplate.findOne(MongoTemplate.java:606)
at org.springframework.data.mongodb.core.MongoTemplate.findOne(MongoTemplate.java:598)
at com.xxx.xxx.xxx.xxx(xxxService.java:46)
at com.xxx.xxx.xxx.xxx(xxxService.java:157)
at com.xxx.xxx.xxx.xxx(xxxService.java:142)
at com.xxx.xxx.xxx.xxx(xxxService.java:87)
at com.alibaba.dubbo.common.bytecode.Wrapper2.invokeMethod(Wrapper2.java)
at com.alibaba.dubbo.rpc.proxy.javassist.JavassistProxyFactory$1.doInvoke(JavassistProxyFactory.java:46)
at com.alibaba.dubbo.rpc.proxy.AbstractProxyInvoker.invoke(AbstractProxyInvoker.java:72)
at com.alibaba.dubbo.rpc.protocol.InvokerWrapper.invoke(InvokerWrapper.java:53)
at com.alibaba.dubbo.rpc.filter.ExceptionFilter.invoke(ExceptionFilter.java:64)
at com.alibaba.dubbo.rpc.protocol.ProtocolFilterWrapper$1.invoke(ProtocolFilterWrapper.java:69)
at com.alibaba.dubbo.monitor.support.MonitorFilter.invoke(MonitorFilter.java:75)
at com.alibaba.dubbo.rpc.protocol.ProtocolFilterWrapper$1.invoke(ProtocolFilterWrapper.java:69)
at com.alibaba.dubbo.rpc.filter.TimeoutFilter.invoke(TimeoutFilter.java:42)
at com.alibaba.dubbo.rpc.protocol.ProtocolFilterWrapper$1.invoke(ProtocolFilterWrapper.java:69)
at com.alibaba.dubbo.rpc.protocol.dubbo.filter.TraceFilter.invoke(TraceFilter.java:78)
at com.alibaba.dubbo.rpc.protocol.ProtocolFilterWrapper$1.invoke(ProtocolFilterWrapper.java:69)
at com.alibaba.dubbo.rpc.filter.ContextFilter.invoke(ContextFilter.java:61)
at com.alibaba.dubbo.rpc.protocol.ProtocolFilterWrapper$1.invoke(ProtocolFilterWrapper.java:69)
at com.alibaba.dubbo.rpc.filter.GenericFilter.invoke(GenericFilter.java:132)
at com.alibaba.dubbo.rpc.protocol.ProtocolFilterWrapper$1.invoke(ProtocolFilterWrapper.java:69)
at com.alibaba.dubbo.rpc.filter.ClassLoaderFilter.invoke(ClassLoaderFilter.java:38)
at com.alibaba.dubbo.rpc.protocol.ProtocolFilterWrapper$1.invoke(ProtocolFilterWrapper.java:69)
at com.alibaba.dubbo.rpc.filter.EchoFilter.invoke(EchoFilter.java:38)
at com.alibaba.dubbo.rpc.protocol.ProtocolFilterWrapper$1.invoke(ProtocolFilterWrapper.java:69)
at com.alibaba.dubbo.rpc.protocol.dubbo.DubboProtocol$1.reply(DubboProtocol.java:100)
at com.alibaba.dubbo.remoting.exchange.support.header.HeaderExchangeHandler.handleRequest(HeaderExchangeHandler.java:98)
at com.alibaba.dubbo.remoting.exchange.support.header.HeaderExchangeHandler.received(HeaderExchangeHandler.java:170)
at com.alibaba.dubbo.remoting.transport.DecodeHandler.received(DecodeHandler.java:52)
at com.alibaba.dubbo.remoting.transport.dispatcher.ChannelEventRunnable.run(ChannelEventRunnable.java:81)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: com.mongodb.MongoSocketReadException: Exception receiving message
at com.mongodb.connection.InternalStreamConnection.translateReadException(InternalStreamConnection.java:483)
at com.mongodb.connection.InternalStreamConnection.receiveMessage(InternalStreamConnection.java:228)
at com.mongodb.connection.UsageTrackingInternalConnection.receiveMessage(UsageTrackingInternalConnection.java:96)
at com.mongodb.connection.DefaultConnectionPool$PooledConnection.receiveMessage(DefaultConnectionPool.java:440)
at com.mongodb.connection.CommandProtocol.execute(CommandProtocol.java:112)
at com.mongodb.connection.DefaultServer$DefaultServerProtocolExecutor.execute(DefaultServer.java:168)
at com.mongodb.connection.DefaultServerConnection.executeProtocol(DefaultServerConnection.java:289)
at com.mongodb.connection.DefaultServerConnection.command(DefaultServerConnection.java:176)
at com.mongodb.operation.CommandOperationHelper.executeWrappedCommandProtocol(CommandOperationHelper.java:216)
at com.mongodb.operation.CommandOperationHelper.executeWrappedCommandProtocol(CommandOperationHelper.java:207)
at com.mongodb.operation.CommandOperationHelper.executeWrappedCommandProtocol(CommandOperationHelper.java:113)
at com.mongodb.operation.FindOperation$1.call(FindOperation.java:516)
at com.mongodb.operation.FindOperation$1.call(FindOperation.java:510)
at com.mongodb.operation.OperationHelper.withConnectionSource(OperationHelper.java:431)
at com.mongodb.operation.OperationHelper.withConnection(OperationHelper.java:404)
at com.mongodb.operation.FindOperation.execute(FindOperation.java:510)
at com.mongodb.operation.FindOperation.execute(FindOperation.java:81)
at com.mongodb.Mongo.execute(Mongo.java:836)
at com.mongodb.Mongo$2.execute(Mongo.java:823)
at com.mongodb.DBCursor.initializeCursor(DBCursor.java:870)
at com.mongodb.DBCursor.hasNext(DBCursor.java:142)
at org.springframework.data.mongodb.core.MongoTemplate.executeFindMultiInternal(MongoTemplate.java:1964)
... 37 common frames omitted
Caused by: java.net.SocketException: Connection reset
at java.net.SocketInputStream.read(SocketInputStream.java:210)
at java.net.SocketInputStream.read(SocketInputStream.java:141)
at com.mongodb.connection.SocketStream.read(SocketStream.java:85)
at com.mongodb.connection.InternalStreamConnection.receiveResponseBuffers(InternalStreamConnection.java:494)
at com.mongodb.connection.InternalStreamConnection.receiveMessage(InternalStreamConnection.java:224)
... 57 common frames omitted
java version 1.8.
spring boot version 1.5.3.
deployed with docker.
mongo.hosts=ip:port,ip:port,ip:port
mongo.database.name=dbname
mongo.username=username
mongo.password=pwd
mongo.connections.per.host=32
mongo.max.wait.time=2000
mongo.connect.timeout=2000
You can try autoConnectRetry:
autoConnectRetry simply means the driver will automatically attempt to reconnect to the server(s) after unexpected disconnects. In production environments you usually want this set to true.
This is from another post: How to configure MongoDB Java driver MongoOptions for production use?
For everybody who is experiencing the same random MongoSocketReadException: you may need the socketTimeoutMS or maxIdleTimeMS parameters instead. The autoConnectRetry parameter is no longer exposed in the MongoDB connection string.
Our situation: we switched to the MongoDB Atlas serverless solution for our development and testing environments, and ever since then we got this MongoSocketReadException roughly every 15 minutes, or randomly. We are also behind an enterprise firewall.
According to https://www.mongodb.com/docs/v6.0/tutorial/connection-pool-performance-tuning/:
a misconfigured firewall closes a socket connection incorrectly and the driver cannot detect that the connection closed improperly.
So you need to use socketTimeoutMS to ensure that sockets are always closed: set socketTimeoutMS to two or three times the length of the slowest operation that the driver runs. This matters because socketTimeoutMS defaults to 0, which never times out.
The maxIdleTimeMS parameter may also affect the connection: if the socket is closed but the client does not detect it, the connection will keep waiting idle and never be closed. By default it is also 0, meaning it waits forever with no upper bound.
So configuring these to a small value may help the driver close the problematic connection with its dead socket before it tries to reuse that same connection and assumes it is still alive.
So our solution:
...mongodbUri...?socketTimeoutMS=150000&maxIdleTimeMS=150000
We changed socketTimeoutMS from 0 to 150 seconds (150000 ms) and did the same for maxIdleTimeMS.
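If you configure the legacy 3.x Java driver programmatically instead of via the connection string, the equivalent would be something like the following sketch; the builder values mirror the URI above, and the URI itself is a hypothetical stand-in since the real one is elided in the answer:

import com.mongodb.MongoClient;
import com.mongodb.MongoClientOptions;
import com.mongodb.MongoClientURI;

public class MongoClientConfig {

    public static MongoClient create(String hosts) {
        MongoClientOptions.Builder options = MongoClientOptions.builder()
                .socketTimeout(150_000)          // ms; same effect as socketTimeoutMS=150000
                .maxConnectionIdleTime(150_000); // ms; same effect as maxIdleTimeMS=150000

        // Hypothetical URI; the real one is elided as ...mongodbUri... above
        MongoClientURI uri = new MongoClientURI("mongodb://" + hosts + "/dbname", options);
        return new MongoClient(uri);
    }
}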

HystrixRuntimeException: timed out

I am getting this error:
com.netflix.hystrix.exception.HystrixRuntimeException:
getCatchmentsByAreaType timed-out and fallback failed.
When I try calling the same method through the API it works. I don't know why!
I have set the following properties in ZuulApplication:
hystrix.command.default.execution.timeout.enabled=false
hystrix.command.default.execution.isolation.thread.timeoutInMilliseconds=2000000
hystrix.threadpool.default.coreSize=100
hystrix.threadpool.default.maxQueueSize=100
hystrix.threadpool.default.queueSizeRejectionThreshold=100
ribbon.IsSecured=true
ribbon.ConnectTimeout=2000000
ribbon.ReadTimeout=2000000
ribbon.maxAutoRetries=3
