Having an issue with the following configuration,
Driver version : 3.12.1, mongodb-driver for Java
Server Version: 3.2 of Mongo API for Azure Cosmos DB (Ancient, I know)
We run some fairly high read/write loads and may hit rate limiting from the Cosmos API for Mongo. In this case, I expect an exception to occur. We're doing pretty vanilla queries, code snippet looks similar to
public DatabaseQueryResult find(String collectionName, Map<String, Object> queryData) {
    Document toFind = new Document(queryData);
    MongoCollection<Document> collection = this.mongoDatabase.getCollection(collectionName);
    FindIterable<Document> findResults = collection.find(toFind);
    // first() returns null when nothing matches
    Document dataFound = findResults.first();
    if (dataFound != null) {
        return new DatabaseQueryResult(dataFound.toJson(this.settings));
    }
    // other stuff...
}
When rate limited by Azure, you'll receive a response like so
{
"$err":"Message: {\"Errors\":[\"Request rate is large. More Request Units may be needed, so no changes were made. Please retry this request later. Learn more: http://aka.ms/cosmosdb-error-429\"]}\r\n s",
"code":16500,
"_t":"OKMongoResponse",
"errmsg":"Message: {\"Errors\":[\"Request rate is large. More Request Units may be needed, so no changes were made. Please retry this request later. Learn more: http://aka.ms/cosmosdb-error-429\"]}\r\n",
"ok":0
}
I expect an exception to be thrown here - but that doesn't seem to be the case with the later driver. What's happening is,
collection.find is returning a FindIterable with the JSON error result as above as the first document
We're eventually returning a DatabaseQueryResult with JSON error as the query payload
I don't want this to happen - I'd much prefer the mongo driver to throw a MongoCommandException/MongoQueryException if a query operation returns an OKMongoResponse where "ok" is 0. This seems fine on writes, which use a CommandProtocol object and validate the response as I'd expect - it's just reads that seem to have changed.
Comparing the 2 driver versions, this seems to be a change in read behaviour - perhaps due to retryable reads that were introduced in version 3.11? Response validation now seems to be around this section.
Q: Is there a way to configure my Mongo client so that the driver validates server responses on read operations and throws an exception if it receives an OKMongoResponse with ok == 0?
I can of course validate the results myself, but I'd prefer not to and let the driver do this if possible
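If validating in application code turns out to be the only option, a minimal sketch of such a check might look like the following. It assumes the error reply has the shape shown above ("ok" equal to 0 together with "errmsg" or "$err"). `ResponseGuard`, `requireOk`, and the choice of `IllegalStateException` are hypothetical names for illustration, not driver API. The check accepts any `Map<String, Object>`, which `org.bson.Document` implements, so the sketch itself needs nothing beyond the JDK.

```java
import java.util.Map;

final class ResponseGuard {
    private ResponseGuard() {}

    /**
     * Returns the document unchanged unless it looks like an error reply:
     * "ok" is numerically 0 and an error message field is present.
     */
    static <T extends Map<String, Object>> T requireOk(T doc) {
        if (doc == null) {
            return null; // nothing matched, so there is nothing to validate
        }
        Object ok = doc.get("ok");
        boolean hasMessage = doc.containsKey("errmsg") || doc.containsKey("$err");
        // "ok" may arrive as an int or a double, so compare numerically.
        if (hasMessage && ok instanceof Number && ((Number) ok).doubleValue() == 0.0) {
            throw new IllegalStateException(
                "Server returned error " + doc.get("code") + ": " + doc.get("errmsg"));
        }
        return doc;
    }
}
```

In the `find` method above this would be used as `Document dataFound = ResponseGuard.requireOk(findResults.first());`. A genuine document that happens to contain both "ok": 0 and "errmsg" would be misclassified, so treat this as a stopgap rather than a replacement for driver-level validation.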
I'm not sure why Mongo changed this driver. There is something on the Cosmos side which may help. You can raise a support ticket and ask them to turn on server-side retries. This will change the behavior of Cosmos so that requests queue up rather than fail with 429s when there are too many.
This more closely reflects how Mongo behaves when running on a VM or in Atlas (which also runs on VMs) rather than a multi-tenant service like Cosmos DB.
With 3.2-3.4 servers the drivers use the find command described here, not OP_QUERY.
The driver surely is not "returning OKMongoResponse" since it isn't written for cosmosdb.
If you think there is a driver issue, update the question with exact wire protocol response received and the exact result you receive from the driver.
Retryable writes require sessions (which cosmosdb advertises but does not support, see Importing BSON to CosmosDB MongoDB API using mongorestore) and normally use the OP_MSG protocol, which comes with 3.6+ servers. I don't know what drivers would do if a 3.2 server advertises session support; this isn't a combination that is possible with MongoDB.
Note that MongoDB does not support cosmosdb (and consequently MongoDB drivers don't, officially, either).
I am working with a commercial application which is throwing a SocketException with the message,
An existing connection was forcibly closed by the remote host
This happens with a socket connection between client and server. The connection is alive and well, and heaps of data is being transferred, but it then becomes disconnected out of nowhere.
Has anybody seen this before? What could the causes be? I can kind of guess a few causes, but also is there any way to add more into this code to work out what the cause could be?
Any comments / ideas are welcome.
... The latest ...
I have some logging from some .NET tracing,
System.Net.Sockets Verbose: 0 : [8188] Socket#30180123::Send() DateTime=2010-04-07T20:49:48.6317500Z
System.Net.Sockets Error: 0 : [8188] Exception in the Socket#30180123::Send - An existing connection was forcibly closed by the remote host DateTime=2010-04-07T20:49:48.6317500Z
System.Net.Sockets Verbose: 0 : [8188] Exiting Socket#30180123::Send() -> 0#0
Based on other parts of the logging I have seen the fact that it says 0#0 means a packet of 0 bytes length is being sent. But what does that really mean?
One of two possibilities is occurring, and I am not sure which,
The connection is being closed, but data is then being written to the socket, thus creating the exception above. The 0#0 simply means that nothing was sent because the socket was already closed.
The connection is still open, and a packet of zero bytes is being sent (i.e. the code has a bug) and the 0#0 means that a packet of zero bytes is trying to be sent.
What do you reckon? It might be inconclusive I guess, but perhaps someone else has seen this kind of thing?
This generally means that the remote side closed the connection (usually by sending a TCP/IP RST packet). If you're working with a third-party application, the likely causes are:
You are sending malformed data to the application (which could include sending an HTTPS request to an HTTP server)
The network link between the client and server is going down for some reason
You have triggered a bug in the third-party application that caused it to crash
The third-party application has exhausted system resources
It's likely that the first case is what's happening.
You can fire up Wireshark to see exactly what is happening on the wire to narrow down the problem.
Without more specific information, it's unlikely that anyone here can really help you much.
Using TLS 1.2 solved this error.
You can force your application to use TLS 1.2 with this (make sure to execute it before calling your service):
ServicePointManager.SecurityProtocol = SecurityProtocolType.Tls12
Another solution:
Enable strong cryptography on your local machine or server in order to use TLS 1.2, because by default it is disabled so only TLS 1.0 is used.
To enable strong cryptography, run these commands in PowerShell with admin privileges:
Set-ItemProperty -Path 'HKLM:\SOFTWARE\Wow6432Node\Microsoft\.NetFramework\v4.0.30319' -Name 'SchUseStrongCrypto' -Value '1' -Type DWord
Set-ItemProperty -Path 'HKLM:\SOFTWARE\Microsoft\.NetFramework\v4.0.30319' -Name 'SchUseStrongCrypto' -Value '1' -Type DWord
You need to reboot your computer for these changes to take effect.
This is not a bug in your code. It is coming from .NET's Socket implementation. If you use the overloaded implementation of EndReceive as below you will not get this exception.
SocketError errorCode;
int nBytesRec = socket.EndReceive(ar, out errorCode);
if (errorCode != SocketError.Success)
{
    nBytesRec = 0;
}
Had the same bug. It actually worked when the traffic was sent through a proxy (Fiddler in my case). I updated the .NET Framework from 4.5.2 to >=4.6 and now everything works fine. The actual request was:
new WebClient().DownloadData("URL");
The exception was:
SocketException: An existing connection was forcibly closed by the remote host
Simple solution for this common annoying issue:
Just go to your ".context.cs" file (located under ".context.tt", which is located under your "*.edmx" file).
Then, add this line to your constructor:
public DBEntities()
    : base("name=DBEntities")
{
    this.Configuration.ProxyCreationEnabled = false; // ADD THIS LINE!
}
I got this exception because of a circular reference in an entity. The entity looked like this:
public class Catalog
{
    public int Id { get; set; }
    public int ParentId { get; set; }
    public Catalog Parent { get; set; }
    public ICollection<Catalog> ChildCatalogs { get; set; }
}
I added [IgnoreDataMemberAttribute] to the Parent property. And that solved the problem.
If Running In A .Net 4.5.2 Service
For me the issue was compounded because the call was running in a .NET 4.5.2 service. I followed @willmaz's suggestion but got a new error.
Running the service with logging turned on, I saw that the handshake with the target site would start fine (and send the bearer token), but on the following step, processing the POST call, it would seem to drop the auth token and the site would reply with Unauthorized.
Solution
It turned out that the service pool credentials did not have rights to change TLS (?) and when I put in my local admin account into the pool, it all worked.
I had the same issue and managed to resolve it eventually. In my case, the port that the client sends the request to did not have an SSL cert bound to it. So I fixed the issue by binding an SSL cert to the port on the server side. Once that was done, this exception went away.
For anyone getting this exception while reading data from the stream, this may help. I was getting this exception when reading the HttpResponseMessage in a loop like this:
using (var remoteStream = await response.Content.ReadAsStreamAsync())
using (var content = File.Create(DownloadPath))
{
    var buffer = new byte[1024];
    int read;
    while ((read = await remoteStream.ReadAsync(buffer, 0, buffer.Length)) != 0)
    {
        await content.WriteAsync(buffer, 0, read);
        await content.FlushAsync();
    }
}
After some time I found out the culprit was the buffer size, which was too small and didn't play well with my weak Azure instance. What helped was to change the code to:
using (Stream remoteStream = await response.Content.ReadAsStreamAsync())
using (FileStream content = File.Create(DownloadPath))
{
    await remoteStream.CopyToAsync(content);
}
The CopyToAsync() method has a default buffer size of 81920 bytes. The bigger buffer sped up the process and the errors stopped immediately, most likely because the overall download speed increased. But why would download speed matter in preventing this error?
It is possible that you get disconnected because the download speed drops below the minimum threshold the server is configured to allow. For example, if the application you are downloading the file from is hosted on IIS, it can be a problem with the http.sys configuration:
"Http.sys is the http protocol stack that IIS uses to perform http communication with clients. It has a timer called MinBytesPerSecond that is responsible for killing a connection if its transfer rate drops below some kb/sec threshold. By default, that threshold is set to 240 kb/sec."
The issue is described in this old blog post from the TFS development team and concerns IIS specifically, but it may point you in the right direction. It also mentions an old bug related to this http.sys attribute: link
In case you are using Azure app services and increasing the buffer size does not eliminate the problem, try to scale up your machine as well. You will be allocated more resources including connection bandwidth.
I got the same issue while using .NET Framework 4.5. However, when I updated the .NET version to 4.7.2, the connection issue was resolved. Maybe this is due to a SecurityProtocol support issue.
For me, it was because the app server I was trying to send email from was not added to our company's SMTP server's allowed list.
I just had to put in SMTP access request for that app server.
This is how it was added by the infrastructure team (I don't know how to do these steps myself but this is what they said they did):
1. Log into active L.B.
2. Select: Local Traffic > iRules > Data Group List
3. Select the appropriate Data Group
4. Enter the app server's IP address
5. Select: Add
6. Select: Update
7. Sync config changes
Yet another possibility for this error to occur is if you tried to connect to a third-party server with invalid credentials too many times and a system like Fail2ban is blocking your IP address.
I was trying to connect to an MQTT broker using the Go client.
The broker address was given as address + port, or tcp://address:port.
Example: ❌
mqtt://test.mosquitto.org
which indicates that you wish to establish an unencrypted connection.
To request MQTT over TLS use one of ssl, tls, mqtts, mqtt+ssl or tcps.
Example: ✅
mqtts://test.mosquitto.org
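The scheme check described above can be sketched as a small pre-flight test on the broker URL. This is a Java illustration under assumptions, not the Go client's own logic: `BrokerUrl` and `requestsTls` are hypothetical names, and the set of TLS schemes is simply the list quoted in this answer.

```java
import java.net.URI;
import java.util.Locale;
import java.util.Set;

final class BrokerUrl {
    // Schemes listed above as requesting MQTT over TLS.
    private static final Set<String> TLS_SCHEMES =
        Set.of("ssl", "tls", "mqtts", "mqtt+ssl", "tcps");

    private BrokerUrl() {}

    /** Returns true when the URL's scheme is one of the TLS schemes. */
    static boolean requestsTls(String brokerUrl) {
        String scheme = URI.create(brokerUrl).getScheme();
        return scheme != null && TLS_SCHEMES.contains(scheme.toLowerCase(Locale.ROOT));
    }
}
```

Logging the result of `BrokerUrl.requestsTls(url)` before connecting makes it obvious when a plain `mqtt://` or `tcp://` address is being pointed at a TLS-only port.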
In my case, the fix was to enable IIS on the server, then restart and check again.
We are using a SpringBoot service. Our restTemplate code looks like below:
@Bean
public RestTemplate restTemplate(final RestTemplateBuilder builder) {
    return builder.requestFactory(() -> {
        final ConnectionPool okHttpConnectionPool =
            new ConnectionPool(50, 30, TimeUnit.SECONDS);
        final OkHttpClient okHttpClient =
            new OkHttpClient.Builder().connectionPool(okHttpConnectionPool)
                // .connectTimeout(30, TimeUnit.SECONDS)
                .retryOnConnectionFailure(false).build();
        return new OkHttp3ClientHttpRequestFactory(okHttpClient);
    }).build();
}
All our calls were failing once the read timeout set for the RestTemplate elapsed. We increased the timeout, and our issue was resolved.
This error occurred in my application with the CIP protocol whenever I did not send or receive data for 10 s.
This was caused by the use of the forward open method. You can avoid it by working with another method, or by setting an update rate of less than 10 s that keeps your forward-open connection alive.
I am trying to set the write timeout in Cassandra with the Java driver. SocketOptions allows me to set a read and connect timeout but not a write timeout.
Does anyone know a way to do this without changing cassandra.yaml?
thanks
Altober
The name is misleading, but SocketOptions.getReadTimeoutMillis() applies to all requests from the driver to Cassandra. You can think of it as a client-level timeout. If a response hasn't been returned by a Cassandra node in that period of time, an OperationTimedOutException will be raised and another node will be tried. Refer to the javadoc link above for more nuanced information about when the exception is raised to the client. Generally, you will want this timeout to be greater than your timeouts in cassandra.yaml, which is why 12 seconds is the default.
If you want to effectively manage timeouts at the client level, you can control this on a per query basis by using executeAsync along with a timed get on the ResultSetFuture to give up on the request after a period of time, i.e.:
ResultSet result = session.executeAsync("your query").get(300, TimeUnit.MILLISECONDS);
This will throw a TimeoutException if the request hasn't been completed in 300 ms.
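A minimal sketch of handling that timeout follows. It is written against the plain `java.util.concurrent.Future` interface so that it runs with only the JDK; the assumption is that the driver's future can be passed in because it implements `Future`. `TimedGet` and `getOrFallback` are hypothetical names, and cancelling here only abandons the client-side wait; whether the server stops working on the request is not something this sketch addresses.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class TimedGet {
    private TimedGet() {}

    /** Waits up to timeoutMillis for the result; on timeout, cancels and returns fallback. */
    static <T> T getOrFallback(Future<T> future, long timeoutMillis, T fallback)
            throws InterruptedException, ExecutionException {
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // give up on a reply we no longer want
            return fallback;
        }
    }
}
```

Returning a fallback is only one option; rethrowing the `TimeoutException` or retrying are equally reasonable, depending on how the caller wants to treat a slow write.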
We are developing an application that uses Google Cloud Datastore. Important detail: it's not a GAE application!
Everything works fine for normal usage. We designed a test that fetches over 30000 records but when we tried to run the test we got the following error:
java.net.SocketTimeoutException: Read timed out
We found that a Timeout Exception occurs after 30 seconds, so this explains the error.
I have two questions:
Is there a way to increase this timeout?
Is it possible to use pagination to query the datastore? We found that a GAE application can use cursors, but our application isn't one.
You can use cursors in the exact same way as a GAE app using Datastore. Take a look at this page for info.
In particular, the QueryResultBatch object has a getEndCursor() method, whose result you can then use when you reissue a Query with setStartCursor(...). Here's a code snippet from the page above:
Query q = ...
if (response.getBatch().getMoreResults() == QueryResultBatch.MoreResultsType.NOT_FINISHED) {
    ByteString endCursor = response.getBatch().getEndCursor();
    q.setStartCursor(endCursor);
    // reissue the query to get more results...
}
You should definitely use cursors to ensure that you get all your results. The RPC has constraints in addition to time, such as total RPC size, so you shouldn't depend on a single RPC answering your entire query.
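The overall shape of that loop, independent of the Datastore client, can be sketched as follows. Everything here is hypothetical scaffolding: `CursorPaging`, `Page`, and the `fetchPage` function stand in for "run the query with this start cursor and return the batch plus its end cursor", and a `String` cursor is used in place of the API's `ByteString` so the sketch runs on the JDK alone.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

final class CursorPaging {
    private CursorPaging() {}

    /** One batch of results plus the cursor to resume from (null when finished). */
    static final class Page<T> {
        final List<T> items;
        final String nextCursor;

        Page(List<T> items, String nextCursor) {
            this.items = items;
            this.nextCursor = nextCursor;
        }
    }

    /** Calls fetchPage repeatedly, feeding each end cursor back in, until none is returned. */
    static <T> List<T> fetchAll(Function<String, Page<T>> fetchPage) {
        List<T> all = new ArrayList<>();
        String cursor = null; // null means "start from the beginning"
        do {
            Page<T> page = fetchPage.apply(cursor);
            all.addAll(page.items);
            cursor = page.nextCursor;
        } while (cursor != null);
        return all;
    }
}
```

With the real API, `fetchPage` would set the start cursor on the query, run it, and return a null `nextCursor` once `getMoreResults()` no longer reports `NOT_FINISHED`. Each call stays small, so no single request has to run anywhere near the 30-second limit.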
I am measuring the cost of requests to GAE by inspecting the x-appengine-estimated-cpm-us-dollars header. This works great, and in combination with x-appengine-resource-usage and x-traceurl I can get even more detailed information.
However, a large part of my application runs in the context of task queues. Thus, a huge part of the instance hour costs is consumed by queues. Whenever code is executed outside of a request, its costs are not included in the x-appengine-estimated-cpm-us-dollars header.
I am looking for a way to measure the full costs consumed by each request. I.e. costs generated by the request itself and the cost of the tasks that have been added by this request.
That would be overkill. There is a tool that downloads Google App Engine logs and converts them to SQLite.
http://code.google.com/p/google-app-engine-samples/source/browse/trunk/logparser/logparser.py
With this tool, the cpm USD values for task requests and normal requests are downloaded together. You can store each day's log in a separate SQLite file and do as much analysis as you want.
As for relating the cost of a task back to the original request: the log data downloaded with this tool includes the full output of the logging module, so you can simply:
1. Generate an id in the original request and log it.
2. Pass the id to the task.
3. Log the received id again in the task request.
4. Find the matching normal and task requests via the id.
For example:
# in the original request
a_id = generate_a_random_id()
logging.info(a_id)  # the id will be included in the downloaded log
taskqueue.add(url='/path_to_task', params={'id': a_id})
# in the task request
a_id = self.request.get('id')
logging.info(a_id)
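Once the log rows are available, pairing them up is a group-and-sum over the logged id. The sketch below is in Java and assumes the rows have already been parsed into (id, cost) pairs; `CostByRequest`, `Entry`, and the field names are hypothetical, and reading the rows from the SQLite file is not shown.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class CostByRequest {
    private CostByRequest() {}

    /** A parsed log row: the id that was logged and that request's estimated cost. */
    static final class Entry {
        final String correlationId;
        final double costUsd;

        Entry(String correlationId, double costUsd) {
            this.correlationId = correlationId;
            this.costUsd = costUsd;
        }
    }

    /** Sums the cost of every request (original or task) that logged the same id. */
    static Map<String, Double> totalPerId(List<Entry> entries) {
        Map<String, Double> totals = new LinkedHashMap<>();
        for (Entry e : entries) {
            totals.merge(e.correlationId, e.costUsd, Double::sum);
        }
        return totals;
    }
}
```

Each key in the result is one original request, and its value is that request's own cost plus the cost of every task that carried its id.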
EDIT1
I think there is another possible way to estimate the cost of normal request + task request.
The trick is to change the async task to a sync one (assuming the cost would be the same).
I haven't tried it, but it is much easier to try.
# in the original request, add a parameter to flag debug mode
debug = self.request.get('DEBUG')
if debug:
    self.redirect('/path_to_task')
else:
    taskqueue.add(url='/path_to_task')
Thus, when you test the normal request with the DEBUG parameter, it first processes the normal request and returns x-appengine-estimated-cpm-us-dollars for it. It then redirects your test client to the corresponding task URL (a task URL can also be requested directly by a client, like a normal request), which returns x-appengine-estimated-cpm-us-dollars for the task request. Add the two values together to get the total cost.