Here I wanted to register to 2 endpoints and send requests to them. You can see this in the code below. I name one env1 and the other env2.
val client = Http.client
.configured(Transport.Options(noDelay = false, reuseAddr = false))
.newService("gexampleapi-env1.localhost.net:8081,gexampleapi-env2.localhost.net:8081")
So far everything is normal. But env1 instance had to be down for some reason(for a few hours' maintenance etc. not sure why.). Under normal circumstances, our expectation is that it continues to send requests through the env2 instance. But this didn't happen. Could not send requests to both servers. Normally it was working correctly, but it didn't work that day for a reason we don't know.
Since the event took place months ago, I only have the following log.
2022-02-15 12:09:40,181 [finagle/netty4-1-3] INFO com.twitter.finagle
FailureAccrualFactory marking connection to "gExampleAPI" as dead.
Remote Address:
Inet(gexampleapi-env1.localhost.net/10.0.0.1:8081,Map())
To solve the problem, we removed gexampleapi-env1.localhost.net:8081 host from the config file. and after restarting it continued to process requests. If you have any ideas about why we may have experienced this problem and how to avoid this next time, I would appreciate it if you could share them.
We are setting up a cluster to handle inferencing (with Tensorflow Serving) over gRPC. We intend to use a layer-7 load balancer (AWS ALB) to distribute the load. For our work load, inferencing will occur many times per minute from each client account. It is my understand that gRPC holds connection state for each of these channels. As a result, in order for the ALB to do its job, we need to periodically teardown and rebuild the connection on the client instance.
My question: what is the best practice for cycling a connection in Java?
Below is my proposed code, which would be called every couple minutes on each client channel. I assume that while the first connection is being shutdown, we can go about creating new one and immediately issue a request on it; or do we need to wait while the prior channel is shutdown first. In our situation, the channel will (very likely) be empty since the previous request will have been 10 seconds earlier.
if (mChannel != null)
mChannel.shutdown();
mChannel = ManagedChannelBuilder.forAddress(mHost, mPort).usePlaintext().build();
mStub = PredictionServiceGrpc.newBlockingStub(mChannel);
The best practice is to use Lookaside Load Balancing.
However, you can do few tweaks to terminate client connections.
var builder = ManagedChannelBuilder.forAddress(mHost, mPort)
.keepAliveTime(15, TimeUnit.SECONDS)
.keepAliveTimeout(5, TimeUnit.SECONDS);
The above config will ensure to terminate sticky gRPC connections, and AWS ALB can do its job to load balance requests uniformly.
There are other options that you can try depending upon your use case, e.g retries, etc. See ManagedChannelBuilder
I am working with a commercial application which is throwing a SocketException with the message,
An existing connection was forcibly closed by the remote host
This happens with a socket connection between client and server. The connection is alive and well, and heaps of data is being transferred, but it then becomes disconnected out of nowhere.
Has anybody seen this before? What could the causes be? I can kind of guess a few causes, but also is there any way to add more into this code to work out what the cause could be?
Any comments / ideas are welcome.
... The latest ...
I have some logging from some .NET tracing,
System.Net.Sockets Verbose: 0 : [8188] Socket#30180123::Send() DateTime=2010-04-07T20:49:48.6317500Z
System.Net.Sockets Error: 0 : [8188] Exception in the Socket#30180123::Send - An existing connection was forcibly closed by the remote host DateTime=2010-04-07T20:49:48.6317500Z
System.Net.Sockets Verbose: 0 : [8188] Exiting Socket#30180123::Send() -> 0#0
Based on other parts of the logging I have seen the fact that it says 0#0 means a packet of 0 bytes length is being sent. But what does that really mean?
One of two possibilities is occurring, and I am not sure which,
The connection is being closed, but data is then being written to the socket, thus creating the exception above. The 0#0 simply means that nothing was sent because the socket was already closed.
The connection is still open, and a packet of zero bytes is being sent (i.e. the code has a bug) and the 0#0 means that a packet of zero bytes is trying to be sent.
What do you reckon? It might be inconclusive I guess, but perhaps someone else has seen this kind of thing?
This generally means that the remote side closed the connection (usually by sending a TCP/IP RST packet). If you're working with a third-party application, the likely causes are:
You are sending malformed data to the application (which could include sending an HTTPS request to an HTTP server)
The network link between the client and server is going down for some reason
You have triggered a bug in the third-party application that caused it to crash
The third-party application has exhausted system resources
It's likely that the first case is what's happening.
You can fire up Wireshark to see exactly what is happening on the wire to narrow down the problem.
Without more specific information, it's unlikely that anyone here can really help you much.
Using TLS 1.2 solved this error.
You can force your application using TLS 1.2 with this (make sure to execute it before calling your service):
ServicePointManager.SecurityProtocol = SecurityProtocolType.Tls12
Another solution :
Enable strong cryptography in your local machine or server in order to use TLS1.2 because by default it is disabled so only TLS1.0 is used.
To enable strong cryptography , execute these commande in PowerShell with admin privileges :
Set-ItemProperty -Path 'HKLM:\SOFTWARE\Wow6432Node\Microsoft\.NetFramework\v4.0.30319' -Name 'SchUseStrongCrypto' -Value '1' -Type DWord
Set-ItemProperty -Path 'HKLM:\SOFTWARE\Microsoft\.NetFramework\v4.0.30319' -Name 'SchUseStrongCrypto' -Value '1' -Type DWord
You need to reboot your computer for these changes to take effect.
This is not a bug in your code. It is coming from .Net's Socket implementation. If you use the overloaded implementation of EndReceive as below you will not get this exception.
SocketError errorCode;
int nBytesRec = socket.EndReceive(ar, out errorCode);
if (errorCode != SocketError.Success)
{
nBytesRec = 0;
}
Had the same bug. Actually worked in case the traffic was sent using some proxy (fiddler in my case). Updated .NET framework from 4.5.2 to >=4.6 and now everything works fine. The actual request was:
new WebClient().DownloadData("URL");
The exception was:
SocketException: An existing connection was forcibly closed by the
remote host
Simple solution for this common annoying issue:
Just go to your ".context.cs" file (located under ".context.tt" which located under your "*.edmx" file).
Then, add this line to your constructor:
public DBEntities()
: base("name=DBEntities")
{
this.Configuration.ProxyCreationEnabled = false; // ADD THIS LINE!
}
I've got this exception because of circular reference in entity.In entity that look like
public class Catalog
{
public int Id { get; set; }
public int ParentId { get; set; }
public Catalog Parent { get; set; }
public ICollection<Catalog> ChildCatalogs { get; set; }
}
I added [IgnoreDataMemberAttribute] to the Parent property. And that solved the problem.
If Running In A .Net 4.5.2 Service
For me the issue was compounded because the call was running in a .Net 4.5.2 service. I followed #willmaz suggestion but got a new error.
In running the service with logging turned on, I viewed the handshaking with the target site would initiate ok (and send the bearer token) but on the following step to process the Post call, it would seem to drop the auth token and the site would reply with Unauthorized.
Solution
It turned out that the service pool credentials did not have rights to change TLS (?) and when I put in my local admin account into the pool, it all worked.
I had the same issue and managed to resolve it eventually. In my case, the port that the client sends the request to did not have a SSL cert binding to it. So I fixed the issue by binding a SSL cert to the port on the server side. Once that was done, this exception went away.
For anyone getting this exception while reading data from the stream, this may help. I was getting this exception when reading the HttpResponseMessage in a loop like this:
using (var remoteStream = await response.Content.ReadAsStreamAsync())
using (var content = File.Create(DownloadPath))
{
var buffer = new byte[1024];
int read;
while ((read = await remoteStream.ReadAsync(buffer, 0, buffer.Length)) != 0)
{
await content.WriteAsync(buffer, 0, read);
await content.FlushAsync();
}
}
After some time I found out the culprit was the buffer size, which was too small and didn't play well with my weak Azure instance. What helped was to change the code to:
using (Stream remoteStream = await response.Content.ReadAsStreamAsync())
using (FileStream content = File.Create(DownloadPath))
{
await remoteStream.CopyToAsync(content);
}
CopyTo() method has a default buffer size of 81920. The bigger buffer sped up the process and the errors stopped immediately, most likely because the overall download speeds increased. But why would download speed matter in preventing this error?
It is possible that you get disconnected from the server because the download speeds drop below minimum threshold the server is configured to allow. For example, in case the application you are downloading the file from is hosted on IIS, it can be a problem with http.sys configuration:
"Http.sys is the http protocol stack that IIS uses to perform http communication with clients. It has a timer called MinBytesPerSecond that is responsible for killing a connection if its transfer rate drops below some kb/sec threshold. By default, that threshold is set to 240 kb/sec."
The issue is described in this old blogpost from TFS development team and concerns IIS specifically, but may point you in a right direction. It also mentions an old bug related to this http.sys attribute: link
In case you are using Azure app services and increasing the buffer size does not eliminate the problem, try to scale up your machine as well. You will be allocated more resources including connection bandwidth.
I got the same issue while using .NET Framework 4.5. However, when I update the .NET version to 4.7.2 connection issue was resolved. Maybe this is due to SecurityProtocol support issue.
For me, it was because the app server I was trying to send email from was not added to our company's SMTP server's allowed list.
I just had to put in SMTP access request for that app server.
This is how it was added by the infrastructure team (I don't know how to do these steps myself but this is what they said they did):
1. Log into active L.B.
2. Select: Local Traffic > iRules > Data Group List
3. Select the appropriate Data Group
4. Enter the app server's IP address
5. Select: Add
6. Select: Update
7. Sync config changes
Yet another possibility for this error to occur is if you tried to connect to a third-party server with invalid credentials too many times and a system like Fail2ban is blocking your IP address.
I was trying to connect to the MQTT broker using the GO client,
broker address was given as address + port, or tcp://address:port
Example: ❌
mqtt://test.mosquitto.org
which indicates that you wish to establish an unencrypted connection.
To request MQTT over TLS use one of ssl, tls, mqtts, mqtt+ssl or tcps.
Example: ✅
mqtts://test.mosquitto.org
In my case, enable the IIS server & then restart and check again.
We are using a SpringBoot service. Our restTemplate code looks like below:
#Bean
public RestTemplate restTemplate(final RestTemplateBuilder builder) {
return builder.requestFactory(() -> {
final ConnectionPool okHttpConnectionPool =
new ConnectionPool(50, 30, TimeUnit.SECONDS);
final OkHttpClient okHttpClient =
new OkHttpClient.Builder().connectionPool(okHttpConnectionPool)
// .connectTimeout(30, TimeUnit.SECONDS)
.retryOnConnectionFailure(false).build();
return new OkHttp3ClientHttpRequestFactory(okHttpClient);
}).build();
}
All our call were failing after the ReadTimeout set for the restTemplate. We increased the time, and our issue was resolved.
This error occurred in my application with the CIP-protocol whenever I didn't Send or received data in less than 10s.
This was caused by the use of the forward open method. You can avoid this by working with an other method, or to install an update rate of less the 10s that maintain your forward-open-connection.
I am working with a Spring LDAP application for which I did none of the setup of the LDAP workings, but now I need to add a failover feature.
We supply our ContextSource with two space-seperated URLs:
String theseUrls = primaryLdapUrl + " " + secondaryLdapUrl;
environment.put("java.naming.provider.url", theseUrls);
ilc = new InitialLdapContext(environment, null);
If the primary URL is functional, then it connects to that. If not, it connects to the secondary just fine. The connections are then pooled, however I am having trouble figuring out the exact mechanics. But, as it is, due to the pooling if the established connection goes down, the whole application shits the bed.
Is there a way to disable pooling, or create a short timeout for it? I have done some research but can't find exact mechanics that have worked for me (including trying to call setPooled(false)). Ideally, the secondary server is only queried if the first is down. When the first is restored, then it will go back to that.
NOTE: This URL (http://forum.spring.io/forum/spring-projects/data/ldap/34643-switching-ldap-contexts-for-failover) has given me a lot of ideas, but I can't get anything to work.
I'm using Java to stream files from Amazon S3, on Linux (Ubuntu 10) 64-bit servers.
I'm using a separate thread for each file, and each file opens an HttpURLConnection which downloads and processes each file concurrently.
Everything works beautifully until I reach a certain number of streams (usually around 2-300 concurrent streams). At irregular points after this, several (say 10) of the threads will start experiencing java.net.IOException: Connection reset errors simultaneously.
I am throttling the download speed, and am way below the 250mbit/s limit of an m1.large instance. There is also insignificant load on all other server aspects (e.g. CPU, load average and memory usage are all fine).
What could be causing this, or how could I track it down?
not trivial to guess what may happen but this is a couple of hints , may be some may apply into your context:
can you check your shell (linux bash /zsh or any other) to see if you raise up the standard limits restricting the number of file descriptors (but sockets too),
man ulimit with bash shell
did you close the streams explicitly in your Java code ? not closing streams may induce such clever problems
try to google for Linux TCP kernel tuning to try to see if your ubuntu server has a well suited stack for such load context...
HTH
Jerome
They might have spillover problem at VIPs because of number of con-current connections reached the limit. You may decrease the size and see...
The problem here is largely in your language. The high load is triggering the error condition, and the error condition results in the exception. Not the other way around.
One relatively common reason for problems like this is that an intermediate proxy (firewall, load balancer) drops what it deems inactive (or too long-lived) HTTP connection.
But beyond this general possibility, EC2 definitely has more kinks as others have suggested.
You are probably running out of ephemeral ports. This happens under load when many short lived connections are opened and closed rapidly. The standard Java HttpURLConnection is not going to get you the flexibility you need to set the proper socket options. I recommend going with the Apache HttpComponents project, and setting options like so...
...
HttpGet httpGet = new HttpGet(uri);
HttpParams params = new BasicHttpParams();
params.setParameter(CoreConnectionPNames.CONNECTION_TIMEOUT, 16 * 1000); // 16 seconds
params.setParameter(CoreConnectionPNames.SO_REUSEADDR, true); // <-- teh MOJO!
DefaultHttpClient httpClient = new DefaultHttpClient(connectionManager, params);
BasicHttpContext httpContext = new BasicHttpContext();
HttpResponse httpResponse = httpClient.execute(httpGet, httpContext);
StatusLine statusLine = httpResponse.getStatusLine();
if (statusLine.getStatusCode() >= HTTP_STATUS_CODE_300)
{
...
I've omitted some code, like the connectionManager setup, but you can grok that from their docs.
[Update]
You might also add params.setParameter(CoreConnectionPNames.SO_LINGER, 1); to keep ephemeral ports from lingering around before reclamation.