I am running a load test using JMeter 5.2 against a web application.
Whenever I run the thread group with more than 500 threads, I get errors.
I have made the following JMeter configuration:
In user.properties I have set:
httpclient4.retrycount=1
hc.parameters.file=hc.parameters
In hc.parameters I have set:
http.connection.stalecheck$Boolean=true
and in the HTTP Request sampler: Client implementation > HttpClient4.
Despite this, I still see the following exception:
org.apache.http.NoHttpResponseException: server-address:80 failed to respond
at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:141)
at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:56)
at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:259)
at org.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader(DefaultBHttpClientConnection.java:163)
at org.apache.http.impl.conn.CPoolProxy.receiveResponseHeader(CPoolProxy.java:157)
at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:273)
at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125)
at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:272)
at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:186)
at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89)
at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110)
at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
at org.apache.jmeter.protocol.http.sampler.HTTPHC4Impl.executeRequest(HTTPHC4Impl.java:850)
at org.apache.jmeter.protocol.http.sampler.HTTPHC4Impl.sample(HTTPHC4Impl.java:561)
at org.apache.jmeter.protocol.http.sampler.HTTPSamplerProxy.sample(HTTPSamplerProxy.java:67)
at org.apache.jmeter.protocol.http.sampler.HTTPSamplerBase.sample(HTTPSamplerBase.java:1282)
at org.apache.jmeter.protocol.http.sampler.HTTPSamplerBase.sample(HTTPSamplerBase.java:1271)
at org.apache.jmeter.threads.JMeterThread.doSampling(JMeterThread.java:627)
at org.apache.jmeter.threads.JMeterThread.executeSamplePackage(JMeterThread.java:551)
at org.apache.jmeter.threads.JMeterThread.processSampler(JMeterThread.java:490)
at org.apache.jmeter.threads.JMeterThread.run(JMeterThread.java:257)
at java.lang.Thread.run(Unknown Source)
Any idea how to resolve this?
Most probably your application gets overloaded and cannot handle 500 concurrent connections. You need to check the following:
Your application server configuration, i.e. increase the maximum number of incoming connections
Your backend health in terms of CPU, RAM, available network sockets, etc.; this can be monitored using the JMeter PerfMon Plugin
Your application configuration (connection pools, database configuration, etc.) as it might not be properly tuned for high loads
Related
Context:
We have a Spring Boot (2.3.1.RELEASE) web app
It is written in Java 8 but runs inside a container with Java 11 (openjdk:11.0.6-jre-stretch).
It has a DB connection and an upstream service that is called via https (simple RestTemplate#exchange method) (this is important!)
It is deployed inside of a Kubernetes cluster (not sure if this is important)
Problem:
Every day, I see a small percentage of requests towards the upstream service fail with this error: I/O error on GET request for "https://upstream.xyz/path": Connection reset; nested exception is javax.net.ssl.SSLException: Connection reset
The errors are totally random and happen intermittently.
We previously had a similar error (javax.net.ssl.SSLProtocolException: Connection reset) that was related to JRE 11 and its TLS 1.3 negotiation issue. Updating our Docker image to the one mentioned above fixed that.
This is the stack trace from the error:
java.net.SocketException: Connection reset
at java.base/java.net.SocketInputStream.read(Unknown Source)
at java.base/java.net.SocketInputStream.read(Unknown Source)
at java.base/sun.security.ssl.SSLSocketInputRecord.read(Unknown Source)
at java.base/sun.security.ssl.SSLSocketInputRecord.bytesInCompletePacket(Unknown Source)
at java.base/sun.security.ssl.SSLSocketImpl.readApplicationRecord(Unknown Source)
at java.base/sun.security.ssl.SSLSocketImpl$AppInputStream.read(Unknown Source)
at org.apache.http.impl.io.SessionInputBufferImpl.streamRead(SessionInputBufferImpl.java:137)
at org.apache.http.impl.io.SessionInputBufferImpl.fillBuffer(SessionInputBufferImpl.java:153)
at org.apache.http.impl.io.SessionInputBufferImpl.readLine(SessionInputBufferImpl.java:280)
at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:138)
at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:56)
at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:259)
at org.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader(DefaultBHttpClientConnection.java:163)
at org.apache.http.impl.conn.CPoolProxy.receiveResponseHeader(CPoolProxy.java:157)
at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:273)
at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125)
at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:272)
at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:186)
at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89)
at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110)
at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56)
at org.springframework.http.client.HttpComponentsClientHttpRequest.executeInternal(HttpComponentsClientHttpRequest.java:87)
at org.springframework.http.client.AbstractBufferingClientHttpRequest.executeInternal(AbstractBufferingClientHttpRequest.java:48)
at org.springframework.http.client.AbstractClientHttpRequest.execute(AbstractClientHttpRequest.java:53)
at org.springframework.web.client.RestTemplate.doExecute(RestTemplate.java:739)
at org.springframework.web.client.RestTemplate.execute(RestTemplate.java:674)
at org.springframework.web.client.RestTemplate.exchange(RestTemplate.java:583)
....
Configuration:
public static RestTemplate create(final int maxTotal, final int defaultMaxPerRoute,
                                  final int connectTimeout, final int readTimeout,
                                  final String userAgent) {
    final Registry<ConnectionSocketFactory> schemeRegistry = RegistryBuilder.<ConnectionSocketFactory>create()
            .register("http", PlainConnectionSocketFactory.getSocketFactory())
            .register("https", SSLConnectionSocketFactory.getSocketFactory())
            .build();
    final PoolingHttpClientConnectionManager connManager = new PoolingHttpClientConnectionManager(schemeRegistry);
    connManager.setMaxTotal(maxTotal);
    connManager.setDefaultMaxPerRoute(defaultMaxPerRoute);
    final CloseableHttpClient httpClient = HttpClients.custom()
            .setConnectionManager(connManager)
            .setUserAgent(userAgent)
            .setDefaultRequestConfig(RequestConfig.custom()
                    .setConnectTimeout(connectTimeout)
                    .setSocketTimeout(readTimeout)
                    .setExpectContinueEnabled(false).build())
            .build();
    return new RestTemplateBuilder()
            .requestFactory(() -> new HttpComponentsClientHttpRequestFactory(httpClient))
            .build();
}
Has anyone experienced this issue?
When I turn on debug logging on the HTTP client, the output is flooded with noise and I am unable to discern anything useful.
We had a similar problem when migrating to AWS/Kubernetes.
I've found out why.
You're using a connection pool. The default behavior of the PoolingHttpClientConnectionManager is to reuse connections, so connections are not closed immediately when your request is done. This saves resources by not having to reconnect all the time.
A Kubernetes cluster uses NAT (Network Address Translation) for outgoing connections. When a connection is not used for a certain amount of time, it is removed from the NAT table and the connection is broken. This causes the seemingly random SSLExceptions.
On AWS, connections are removed from the NAT table after they have been idle for 350 seconds. Other Kubernetes environments may have different settings.
See https://docs.aws.amazon.com/vpc/latest/userguide/nat-gateway-troubleshooting.html
The solution:
Disable connection-reuse:
final CloseableHttpClient closeableHttpClient = HttpClients.custom()
.setConnectionReuseStrategy(NoConnectionReuseStrategy.INSTANCE)
.setConnectionManager(poolingHttpClientConnectionManager)
.build();
Or, let the httpClient evict connections that are idle for too long:
return HttpClients.custom()
.evictIdleConnections(300, TimeUnit.SECONDS) //Read the javadocs, may not be used when the instance of HttpClient is created inside an EJB container.
.setConnectionManager(poolingHttpClientConnectionManager)
.build();
Or call setKeepAliveStrategy(...) with a custom ConnectionKeepAliveStrategy that never returns -1 or a timeout longer than 300 seconds.
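As a rough sketch of that last option (assuming Apache HttpClient 4.x and the 350-second NAT idle timeout mentioned above; the 300-second cap is an illustrative choice):
import java.util.concurrent.TimeUnit;
import org.apache.http.conn.ConnectionKeepAliveStrategy;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.DefaultConnectionKeepAliveStrategy;
import org.apache.http.impl.client.HttpClients;

final ConnectionKeepAliveStrategy cappedKeepAlive = (response, context) -> {
    // Honour the server's Keep-Alive header, but never keep a pooled
    // connection longer than 300 s (below the assumed NAT idle timeout).
    final long cap = TimeUnit.SECONDS.toMillis(300);
    final long suggested = DefaultConnectionKeepAliveStrategy.INSTANCE
            .getKeepAliveDuration(response, context); // -1 means "keep indefinitely"
    return (suggested < 0 || suggested > cap) ? cap : suggested;
};

final CloseableHttpClient httpClient = HttpClients.custom()
        .setKeepAliveStrategy(cappedKeepAlive)
        .setConnectionManager(poolingHttpClientConnectionManager)
        .build();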
I will share my experience with this error; comparing stack traces, it is probably the same problem you are facing.
The fact that it happens randomly is the key phrase that makes me suspect it is the same problem.
HTTP connections are made through an HTTP client library (Apache HttpClient).
The HTTP client usually manages a reusable pool of connections, and this pool has a limit. In our case, the pool sometimes (randomly) becomes fully occupied, leaving no free connections to use.
You can either increase the pool size, or
implement a back-off retry mechanism that tries the HTTP request again when executing it fails (see the sketch below).
If you are wondering how to tune the underlying HTTP client used by Spring Boot, check out this post.
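As a rough illustration of the back-off option (the helper name, retry count, and delays below are made up for the example, not from the original post; requires org.springframework.http.* and org.springframework.web.client.*):
// Hypothetical helper: retries the RestTemplate call with exponential back-off.
public <T> ResponseEntity<T> exchangeWithRetry(final RestTemplate restTemplate, final String url,
                                               final HttpMethod method, final HttpEntity<?> entity,
                                               final Class<T> responseType) throws InterruptedException {
    final int maxAttempts = 3;
    long backoffMillis = 200;
    for (int attempt = 1; ; attempt++) {
        try {
            return restTemplate.exchange(url, method, entity, responseType);
        } catch (final ResourceAccessException e) { // RestTemplate wraps I/O failures in this
            if (attempt == maxAttempts) {
                throw e;
            }
            Thread.sleep(backoffMillis);
            backoffMillis *= 2; // wait longer before the next attempt
        }
    }
}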
I guess the issue is related to k8s.
If you use flannel as the k8s network, please check the flannel status and whether it has restarted repeatedly, using the command below:
kubectl get pod -n kube-system | grep flannel
What version is your Linux kernel? If it is not 4.x or above, please upgrade to 4.x.
# to check the Linux kernel version
uname -sr
# upgrade steps
1) rpm --import https://www.elrepo.org/RPM-GPG-KEY-elrepo.org
   rpm -Uvh http://www.elrepo.org/elrepo-release-7.0-4.el7.elrepo.noarch.rpm
   yum --enablerepo=elrepo-kernel -y install kernel-lt
2) open /etc/default/grub and set "GRUB_DEFAULT=0"
3) grub2-mkconfig -o /boot/grub2/grub.cfg
4) reboot
Hope this helps solve the issue.
An SSL stack trace like this can have many different causes that have nothing to do with SSL itself. The stack trace alone will not help you much, and furthermore this issue has nothing to do with Spring, RestTemplate, etc.
What will help is implementing a logging/monitoring/tracing framework (I use Elasticsearch). Monitor the behavior for a couple of days and make sure you record as much information in these logs as needed, such as the container id and connection details (when the connection was initiated, etc.). You might find, for example, that this occurs after a connection has lived for a certain amount of time (e.g. 1 hour), and that the issue goes away if you simply make connections live for less time.
This way you may be able to fix the issue without needing to figure out the root cause, which could take many days of work and get you nowhere. Tinkering with the connection parameters may well resolve the issue instead, but for that you need more visibility; the information you have posted so far is not enough to troubleshoot it.
We have a Java application hosted on two different application servers with 64 GB RAM and 32 vCPUs each, and HAProxy as a load balancer in front.
With this setup on our VPS servers, we cannot reach 100 concurrent users. The application keeps throwing the error below even though we have plenty of free memory (approximately 40 GB RAM).
ERROR org.springframework.boot.context.web.ErrorPageFilter - Forwarding to error page from request [/create-new] due to exception [unable to create new native thread]
java.lang.OutOfMemoryError: unable to create new native thread
at java.lang.Thread.start0(Native Method) [na:1.8.0_262]
at java.lang.Thread.start(Thread.java:717) [na:1.8.0_262]
at org.codehaus.groovy.runtime.DefaultGroovyStaticMethods.createThread(DefaultGroovyStaticMethods.java:104) ~[groovy-2.4.5.jar:2.4.5]
at org.codehaus.groovy.runtime.DefaultGroovyStaticMethods.start(DefaultGroovyStaticMethods.java:58) ~[groovy-2.4.5.jar:2.4.5]
at sun.reflect.GeneratedMethodAccessor2752.invoke(Unknown Source) ~[na:na]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.8.0_262]
at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_262]
at org.codehaus.groovy.runtime.metaclass.ReflectionMetaMethod.invoke(ReflectionMetaMethod.java:54) ~[groovy-2.4.5.jar:2.4.5]
at org.codehaus.groovy.runtime.metaclass.NewStaticMetaMethod.invoke(NewStaticMetaMethod.java:53) ~[groovy-2.4.5.jar:2.4.5]
at org.codehaus.groovy.runtime.callsite.StaticMetaMethodSite$StaticMetaMethodSiteNoUnwrapNoCoerce.invoke(StaticMetaMethodSite.java:151) ~[groovy-2.4.5.jar:2.4.5]
RAM
App1 # free -m
total used free shared buff/cache available
Mem: 65536 21480 44056 2 10341 33715
Swap: 0 0 0
However, on AWS with the same configuration and setup, the app can handle more than 200 concurrent users.
We are at a loss; any suggestions would be greatly appreciated!
I deployed an Apache Ignite cluster and need to perform various cache operations from my Vert.x backend.
I successfully connect to the cluster using the Apache Ignite client (not the thin client). The Ignite client runs inside a Vert.x verticle:
vertx.deployVerticle(new IgniteVerticle(),
new DeploymentOptions().setInstances(1).setWorker(true),
apacheIgniteVerticleDeployment.completer());
But some time later I start receiving the following messages:
SEVERE: Blocked system-critical thread has been detected.
This can lead to cluster-wide undefined behaviour [threadName=tcp-comm-worker, blockedFor=28s]
SEVERE: Critical system error detected. Will be handled accordingly to configured handler
[hnd=NoOpFailureHandler [super=AbstractFailureHandler [ignoredFailureTypes=[SYSTEM_WORKER_BLOCKED]]], failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED,
err=class o.a.i.IgniteException: GridWorker [name=tcp-comm-worker, igniteInstanceName=null, finished=false, heartbeatTs=1567112815022]]]
class org.apache.ignite.IgniteException: GridWorker [name=tcp-comm-worker, igniteInstanceName=null, finished=false, heartbeatTs=1567112815022]
at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance$2.apply(IgnitionEx.java:1831)
at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance$2.apply(IgnitionEx.java:1826)
at org.apache.ignite.internal.worker.WorkersRegistry.onIdle(WorkersRegistry.java:233)
at org.apache.ignite.internal.util.worker.GridWorker.onIdle(GridWorker.java:297)
at org.apache.ignite.internal.processors.timeout.GridTimeoutProcessor$TimeoutWorker.body(GridTimeoutProcessor.java:221)
at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
at java.lang.Thread.run(Thread.java:748)
These messages appear about every 10 seconds. My guess is that this may somehow be related to the way Vert.x works.
What could be the reason for these exceptions?
You can try increasing systemWorkerBlockedTimeout on IgniteConfiguration to make this message go away. See more in docs:
https://apacheignite.readme.io/docs/critical-failures-handling
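A minimal sketch, assuming the setter is available in your Ignite version (the 60-second value is just an illustration, not a recommendation):
import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.IgniteConfiguration;

// Raise the threshold after which a blocked system worker is reported.
IgniteConfiguration cfg = new IgniteConfiguration()
        .setSystemWorkerBlockedTimeout(60_000L); // milliseconds

Ignite ignite = Ignition.start(cfg);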
I have deployed a Java EE application on Linux with Apache Tomcat 7.0.42.
Everything works fine when I load test with 100 users using JMeter (100 concurrent thread requests).
But as soon as I change the users (or number of threads) to 1000, the server chokes and returns a "Connection refused" error for all requests after roughly the first 600.
I have done all the fine-tuning I can in the application and it is more of a static web page now; even then it comes back with the error.
Server Configuration: Ubuntu, 8 vCPU / 32 GB RAM / 960 GB HD
PS: The same application works well on AWS (Amazon Web Services), so you can rule out any problem with my machine running JMeter (the client).
org.apache.http.conn.HttpHostConnectException: Connection to http://a.b.c.d:8080 refused
at org.apache.http.impl.conn.DefaultClientConnectionOperator.openConnection(DefaultClientConnectionOperator.java:190)
at org.apache.http.impl.conn.ManagedClientConnectionImpl.open(ManagedClientConnectionImpl.java:294)
at org.apache.http.impl.client.DefaultRequestDirector.tryConnect(DefaultRequestDirector.java:645)
at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:480)
at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906)
at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805)
at org.apache.jmeter.protocol.http.sampler.HTTPHC4Impl.sample(HTTPHC4Impl.java:286)
at org.apache.jmeter.protocol.http.sampler.HTTPSamplerProxy.sample(HTTPSamplerProxy.java:62)
at org.apache.jmeter.protocol.http.sampler.HTTPSamplerBase.sample(HTTPSamplerBase.java:1088)
at org.apache.jmeter.protocol.http.sampler.HTTPSamplerBase.sample(HTTPSamplerBase.java:1077)
at org.apache.jmeter.threads.JMeterThread.process_sampler(JMeterThread.java:428)
at org.apache.jmeter.threads.JMeterThread.run(JMeterThread.java:256)
at java.lang.Thread.run(Unknown Source)
Caused by: java.net.ConnectException: Connection timed out: connect
at java.net.DualStackPlainSocketImpl.waitForConnect(Native Method)
at java.net.DualStackPlainSocketImpl.socketConnect(Unknown Source)
at java.net.AbstractPlainSocketImpl.doConnect(Unknown Source)
at java.net.AbstractPlainSocketImpl.connectToAddress(Unknown Source)
at java.net.AbstractPlainSocketImpl.connect(Unknown Source)
at java.net.PlainSocketImpl.connect(Unknown Source)
at java.net.SocksSocketImpl.connect(Unknown Source)
at java.net.Socket.connect(Unknown Source)
at org.apache.http.conn.scheme.PlainSocketFactory.connectSocket(PlainSocketFactory.java:127)
at org.apache.http.impl.conn.DefaultClientConnectionOperator.openConnection(DefaultClientConnectionOperator.java:180)
... 12 more
Try adjusting the maxThreads and acceptCount attributes of the http connector in server.xml:
Each incoming request requires a thread for the duration of that request. If more simultaneous requests are received than can be handled by the currently available request processing threads, additional threads will be created up to the configured maximum (the value of the maxThreads attribute). If still more simultaneous requests are received, they are stacked up inside the server socket created by the Connector, up to the configured maximum (the value of the acceptCount attribute). Any further simultaneous requests will receive "connection refused" errors, until resources are available to process them.
Reference: http://tomcat.apache.org/tomcat-7.0-doc/config/http.html
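For illustration, a Connector tuned along those lines might look like this (the numbers are placeholders to show the attributes, not recommended values):
<!-- server.xml: HTTP connector with a larger thread pool and accept queue -->
<Connector port="8080" protocol="HTTP/1.1"
           connectionTimeout="20000"
           maxThreads="800"
           acceptCount="300"
           redirectPort="8443" />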
Thank you all!!
The problem was actually with the network: when we tested using different IP addresses (IP spoofing), all requests were successful. The network was treating the load as a DoS attack.
Thanks all. I had already tried maxThreads and acceptCount and done a lot of tuning in Linux.
So the lesson is: conduct the performance test from a server located in the same zone.
Possibly 1000 concurrent requests (in one second) is unrealistic. A better test would be to distribute the 1000 requests over an interval of time.
For example, 100 requests executed over a period of 60 seconds, i.e. almost two requests per second.
I have two web apps deployed to one Tomcat 7 container. They communicate with each other over HTTP on localhost. I'm occasionally seeing connection reset errors:
Caused by: java.net.SocketException: Connection reset
at java.net.SocketInputStream.read(SocketInputStream.java:189)
at java.net.SocketInputStream.read(SocketInputStream.java:121)
at org.apache.http.impl.io.AbstractSessionInputBuffer.fillBuffer(AbstractSessionInputBuffer.java:149)
at org.apache.http.impl.io.SocketInputBuffer.fillBuffer(SocketInputBuffer.java:110)
at org.apache.http.impl.io.AbstractSessionInputBuffer.readLine(AbstractSessionInputBuffer.java:260)
at org.apache.http.impl.conn.LoggingSessionInputBuffer.readLine(LoggingSessionInputBuffer.java:115)
at org.apache.http.impl.conn.DefaultResponseParser.parseHead(DefaultResponseParser.java:98)
at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:252)
at org.apache.http.impl.AbstractHttpClientConnection.receiveResponseHeader(AbstractHttpClientConnection.java:281)
at org.apache.http.impl.conn.DefaultClientConnection.receiveResponseHeader(DefaultClientConnection.java:247)
at org.apache.http.impl.conn.AbstractClientConnAdapter.receiveResponseHeader(AbstractClientConnAdapter.java:219)
at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:298)
at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125)
at org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:633)
at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:454)
at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:820)
at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:776)
at com.sun.jersey.client.apache4.ApacheHttpClient4Handler.handle(ApacheHttpClient4Handler.java:170)
My questions are:
How can I find out what's causing these?
Should I expect connection resets over the lo interface?
If they are expected, what's the best way of dealing with them? Retry idempotent calls?
There can be various reasons for that. Usually the cause is a TCP-level problem or abnormal connection termination due to excessive load, an internal problem, etc.
I see no reason why a loopback device should be special in this regard.
Yes, it is. Automatic retry of idempotent methods would be the most reasonable recovery measure.
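As a rough sketch (assuming HttpClient 4.x underneath the Jersey ApacheHttpClient4Handler shown in your stack trace; the retry count is an arbitrary choice), automatic retry can be enabled on the underlying client like this:
import org.apache.http.impl.client.DefaultHttpClient;
import org.apache.http.impl.client.DefaultHttpRequestRetryHandler;

// Retry up to 3 times; 'false' means requests that were already fully sent
// and are not idempotent will not be replayed, so only safe calls are retried.
DefaultHttpClient httpClient = new DefaultHttpClient();
httpClient.setHttpRequestRetryHandler(new DefaultHttpRequestRetryHandler(3, false));
// ... then pass this client to the Jersey ApacheHttpClient4Handler as before.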