Mongo CDC throws BSONObjectTooLarge. How to ignore this and proceed further? - java

I would like to listen to only 3 collections in a database: c1, c2, c3. I was not able to figure out how to limit the listening to just these 3 collections. Below is my code.
I would like to ignore this error and proceed further. How can I do that? In this case the cursor itself is not getting created.
As I said previously, is there a way to limit the listening to the c1, c2, c3 collections only, on the db side? The code below listens to the full db and then filters the collections on the Java side.
List<Bson> pipeline = singletonList(match(in("operationType", asList("insert", "delete", "update"))));
MongoChangeStreamCursor<ChangeStreamDocument<Document>> cursor;
String resumeTokenStr = getResumeTokenFromS3(cdcConfig);
if (resumeTokenStr == null) {
    cursor = mongoClient.watch(pipeline).fullDocument(FullDocument.UPDATE_LOOKUP).cursor();
} else {
    BsonDocument resumeToken = BsonDocument.parse(resumeTokenStr);
    cursor = mongoClient.watch(pipeline).batchSize(1).maxAwaitTime(60, TimeUnit.SECONDS).startAfter(resumeToken).fullDocument(FullDocument.UPDATE_LOOKUP).cursor();
}
return cursor;
The above code throws the below error
com.mongodb.MongoCommandException: Command failed with error 10334 (BSONObjectTooLarge): 'BSONObj size: 16795345 (0x10046D1) is invalid. Size must be between 0 and 16793600(16MB) First element: _id: { _data: "826337A73B0000000A2B022C0100296E5A1004B317A529F739433BA840730515AC0EAC46645F6964006462624E8146E0FB000934F6560004" }' on server crm-mongo-report01.prod.phenom.local:27017. The full response is {"operationTime": {"$timestamp": {"t": 1664707966, "i": 25}}, "ok": 0.0, "errmsg": "BSONObj size: 16795345 (0x10046D1) is invalid. Size must be between 0 and 16793600(16MB) First element: _id: { _data: \"826337A73B0000000A2B022C0100296E5A1004B317A529F739433BA840730515AC0EAC46645F6964006462624E8146E0FB000934F6560004\" }", "code": 10334, "codeName": "BSONObjectTooLarge", "$clusterTime": {"clusterTime": {"$timestamp": {"t": 1664707966, "i": 26}}, "signature": {"hash": {"$binary": {"base64": "NZDJKhCse19Eud88kNh7XRWRgas=", "subType": "00"}}, "keyId": 7113062344413937666}}}
at com.mongodb.internal.connection.ProtocolHelper.getCommandFailureException(ProtocolHelper.java:198)
at com.mongodb.internal.connection.InternalStreamConnection.receiveCommandMessageResponse(InternalStreamConnection.java:413)
at com.mongodb.internal.connection.InternalStreamConnection.sendAndReceive(InternalStreamConnection.java:337)
at com.mongodb.internal.connection.UsageTrackingInternalConnection.sendAndReceive(UsageTrackingInternalConnection.java:116)
at com.mongodb.internal.connection.DefaultConnectionPool$PooledConnection.sendAndReceive(DefaultConnectionPool.java:644)
at com.mongodb.internal.connection.CommandProtocolImpl.execute(CommandProtocolImpl.java:71)
at com.mongodb.internal.connection.DefaultServer$DefaultServerProtocolExecutor.execute(DefaultServer.java:240)
at com.mongodb.internal.connection.DefaultServerConnection.executeProtocol(DefaultServerConnection.java:226)
at com.mongodb.internal.connection.DefaultServerConnection.command(DefaultServerConnection.java:126)
at com.mongodb.internal.connection.DefaultServerConnection.command(DefaultServerConnection.java:116)
at com.mongodb.internal.connection.DefaultServer$OperationCountTrackingConnection.command(DefaultServer.java:345)
at com.mongodb.internal.operation.CommandOperationHelper.createReadCommandAndExecute(CommandOperationHelper.java:232)
at com.mongodb.internal.operation.CommandOperationHelper.lambda$executeRetryableRead$4(CommandOperationHelper.java:214)
at com.mongodb.internal.operation.OperationHelper.lambda$withSourceAndConnection$2(OperationHelper.java:575)
at com.mongodb.internal.operation.OperationHelper.withSuppliedResource(OperationHelper.java:600)
at com.mongodb.internal.operation.OperationHelper.lambda$withSourceAndConnection$3(OperationHelper.java:574)
at com.mongodb.internal.operation.OperationHelper.withSuppliedResource(OperationHelper.java:600)
at com.mongodb.internal.operation.OperationHelper.withSourceAndConnection(OperationHelper.java:573)
at com.mongodb.internal.operation.CommandOperationHelper.lambda$executeRetryableRead$5(CommandOperationHelper.java:211)
at com.mongodb.internal.async.function.RetryingSyncSupplier.get(RetryingSyncSupplier.java:65)
at com.mongodb.internal.operation.CommandOperationHelper.executeRetryableRead(CommandOperationHelper.java:217)
at com.mongodb.internal.operation.CommandOperationHelper.executeRetryableRead(CommandOperationHelper.java:197)
at com.mongodb.internal.operation.AggregateOperationImpl.execute(AggregateOperationImpl.java:195)
at com.mongodb.internal.operation.ChangeStreamOperation$1.call(ChangeStreamOperation.java:347)
at com.mongodb.internal.operation.ChangeStreamOperation$1.call(ChangeStreamOperation.java:343)
at com.mongodb.internal.operation.OperationHelper.withReadConnectionSource(OperationHelper.java:538)
at com.mongodb.internal.operation.ChangeStreamOperation.execute(ChangeStreamOperation.java:343)
at com.mongodb.internal.operation.ChangeStreamOperation.execute(ChangeStreamOperation.java:58)
at com.mongodb.client.internal.MongoClientDelegate$DelegateOperationExecutor.execute(MongoClientDelegate.java:191)
at com.mongodb.client.internal.ChangeStreamIterableImpl.execute(ChangeStreamIterableImpl.java:221)
at com.mongodb.client.internal.ChangeStreamIterableImpl.cursor(ChangeStreamIterableImpl.java:174)
at com.company.cdc.services.CDCMain.getCursorAtResumeToken(CdcServiceMain.java:217)
line 217 points to the line : cursor = mongoClient.watch(pipeline).batchSize(1).maxAwaitTime(60, TimeUnit.SECONDS).startAfter(resumeToken).fullDocument(FullDocument.UPDATE_LOOKUP).cursor();

This is how I ended up solving the problem (in case somebody else is searching for a solution to this problem).
I removed FullDocument.UPDATE_LOOKUP when creating the cursor, so my code now looks like cursor = mongoClient.watch(pipeline).batchSize(20000).cursor(); This avoids the gigantic field that may end up in the looked-up document, which was eventually erroring out. This worked.
In my case I did not have to listen to updates of the collection that had this bad data, so I modified my cursor to listen only to the databases and collections of my interest, instead of listening to the entire database and ignoring the unwanted collections later. Below is the code.
When writing to the destination I did a bulk lookup on MongoDB, constructed the full document and then wrote it. This lazy-lookup approach also reduced the memory footprint of the Java program a lot (a sketch of such a lookup is shown after the cursor code below).
private List<Bson> generatePipeline(CdcConfigs cdcConfig) {
    List<String> whiteListedCollections = getWhitelistedCollection(cdcConfig);
    List<String> whiteListedDbs = getWhitelistedDbs(cdcConfig);
    log.info("whitelisted dbs:" + whiteListedDbs + " coll:" + whiteListedCollections);

    List<Bson> pipeline;
    if (whiteListedDbs.size() > 0)
        pipeline = singletonList(match(and(
                in("ns.db", whiteListedDbs),
                in("ns.coll", whiteListedCollections),
                in("operationType", asList("insert", "delete", "update")))));
    else
        pipeline = singletonList(match(and(
                in("ns.coll", whiteListedCollections),
                in("operationType", asList("insert", "delete", "update")))));
    return pipeline;
}

private MongoChangeStreamCursor<ChangeStreamDocument<Document>> getCursorAtResumeToken(CdcConfigs cdcConfig, MongoClient mongoClient) {
    List<Bson> pipeline = generatePipeline(cdcConfig);
    MongoChangeStreamCursor<ChangeStreamDocument<Document>> cursor;
    String resumeTokenStr = getResumeTokenFromS3(cdcConfig);
    if (resumeTokenStr == null) {
        // cursor = mongoClient.watch(pipeline).fullDocument(FullDocument.UPDATE_LOOKUP).cursor();
        cursor = mongoClient.watch(pipeline).batchSize(20000).cursor();
        log.warn("RESUME TOKEN IS NULL. READING CDC FROM CURRENT TIMESTAMP FROM MONGO DB !!! ");
    } else {
        BsonDocument resumeToken = BsonDocument.parse(resumeTokenStr);
        cursor = mongoClient.watch(pipeline).batchSize(20000).maxAwaitTime(30, TimeUnit.MINUTES).startAfter(resumeToken).cursor();
        // cursor = mongoClient.watch(pipeline).startAfter(resumeToken).fullDocument(FullDocument.UPDATE_LOOKUP).cursor();
    }
    return cursor;
}
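For the bulk lookup on the destination side mentioned above, here is a minimal sketch of what such a method might look like (the method name, signature and map keying are illustrative assumptions, not part of the original code). It collects the _id from each change event in a batch and fetches the current documents in one query, instead of relying on FullDocument.UPDATE_LOOKUP:

import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

import com.mongodb.client.MongoClient;          // assuming the 4.x sync driver
import com.mongodb.client.model.Filters;
import com.mongodb.client.model.changestream.ChangeStreamDocument;
import org.bson.BsonValue;
import org.bson.Document;

// Illustrative helper: one find() per batch and collection instead of one lookup per event.
private Map<Object, Document> bulkLookup(MongoClient mongoClient, String db, String coll,
                                         List<ChangeStreamDocument<Document>> events) {
    // Collect the _id of every change event in this batch.
    List<BsonValue> ids = events.stream()
            .map(event -> event.getDocumentKey().get("_id"))
            .collect(Collectors.toList());
    // Fetch the current version of all changed documents in a single round trip.
    Map<Object, Document> fullDocuments = new HashMap<>();
    for (Document doc : mongoClient.getDatabase(db)
                                   .getCollection(coll)
                                   .find(Filters.in("_id", ids))) {
        fullDocuments.put(doc.get("_id"), doc);
    }
    return fullDocuments;
}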
The solutions available so far are more inclined towards Python code, so it was a challenge translating them into Java.

My case is a little bit different. I have a trigger used to align documents from one collection to another, with the option to include the fullDocument in the change event. The problem is that the metadata added to the event, combined with the fullDocument, exceeds the BSON max size (16 MB).
I solved it by removing the fullDocument from the trigger and fetching the document from the collection itself with an aggregation:
collection1.aggregate([
    { $match: { _id: changeEvent.documentKey._id } },
    { $merge: { into: "collection", on: "_id", whenMatched: "replace", whenNotMatched: "insert" } }
]);
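If the same copy has to be done from Java rather than from the trigger, a roughly equivalent pipeline can be run with the Java driver. This is a hedged sketch, assuming driver 3.11+ and an illustrative target collection name ("collection2"):

import static com.mongodb.client.model.Aggregates.match;
import static com.mongodb.client.model.Aggregates.merge;
import static com.mongodb.client.model.Filters.eq;

import com.mongodb.client.MongoCollection;
import com.mongodb.client.model.MergeOptions;
import org.bson.BsonValue;
import org.bson.Document;
import java.util.Arrays;

// Replaces (or inserts) the changed document into the target collection, like the $merge above.
void copyChangedDocument(MongoCollection<Document> collection1, BsonValue documentId) {
    collection1.aggregate(Arrays.asList(
            match(eq("_id", documentId)),
            merge("collection2",
                    new MergeOptions()
                            .uniqueIdentifier("_id")
                            .whenMatched(MergeOptions.WhenMatched.REPLACE)
                            .whenNotMatched(MergeOptions.WhenNotMatched.INSERT))
    )).toCollection(); // toCollection() is required to actually execute a pipeline ending in $merge
}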

Related

Fetching new documents on insertion in ElasticSearch with Java

I have been looking for a solution to create a sort of alert when new documents are added to ES via Logstash. I have seen some threads on here, such as stackoverflow.com/a/51980618/4604579, but that does not really serve my purposes as the plug-ins mentioned do not work with the newest version of ELK and there is no Changes API yet.
So I have resorted to trying 2 different approaches:
1. Create a scroll and run over all the documents in a given index using the Search API, retain the last document's ID and use it after a given timeout period to get all documents that were added after it.
2. Create a Watcher that checks after a given interval (for example 5 minutes) whether new documents have been added to an index.
I have advanced on approach 1, where I can scroll through the ~50k documents that are currently in ES and retrieve the last document's ID (I sort the query by timestamp in ascending order, so I know the last document is the latest one inserted). But I don't know how efficient this approach is, and I know a scroll may time out after a given delay, so if no new documents are inserted the scroll will be removed.
I was also looking into using a Watcher, but I don't really understand how I can set up the condition to check if a new document was inserted in a given index.
I imagine I can do something along these lines:
PUT _watcher/watch/new_docs
{
  "trigger" : {
    "schedule" : {
      "interval" : "5s"
    }
  },
  "input" : {
    "search" : {
      "request" : {
        "indices" : "logstash",
        "body" : {
          "size" : 0,
          "query" : { "match" : { "#timestamp" : "now-5s" } }
        }
      }
    }
  },
  "condition" : {
    "compare" : { ?? }
  },
  "actions" : {
    "my_webhook" : {
      "webhook" : {
        "method" : "POST",
        "host" : "mylisteninghost",
        "port" : 9200,
        "path" : "/{{watch_id}}",
        "body" : "New document {{document ID}} errors"
      }
    }
  }
}
I am not exactly sure how to define or use the Watcher and if it would even work.
Can anyone let me know what the best course of action would be?
Thank you
EDIT:
For those interested, I found a way to poll the ES REST API using Search After. The difference is that with Scroll a snapshot is taken of the documents in the ES DB, so any new documents added won't be in that snapshot. In contrast, Search After is stateless: it uses unique sorting parameters (in my case timestamp/id) and holds on to the last one fetched, and afterwards we query all documents that come after the held parameters. This way, if any new documents are added, they come after the held timestamp and are fetched by the query.
Code:
public static void searchAfterElasticData()
        throws FileNotFoundException, IOException, InterruptedException {
    // create a search request for a given index
    SearchRequest search_request = new SearchRequest(elastic_index);
    SearchSourceBuilder source_builder =
            getSearchSourceBuilder("#timestamp", "_id", 100);
    search_request.source(source_builder);
    SearchResponse search_response = null;
    try {
        search_response = client.search(search_request, RequestOptions.DEFAULT);
    } catch (ElasticsearchException | ConnectException ex) {
        log.info("Error while querying Elastic API: {}", ex.toString());
    }

    if (search_response != null) {
        SearchHit[] search_hits = search_response.getHits().getHits();
        Object[] sort_values = null;
        while (search_hits != null) {
            if (search_hits.length > 0) {
                // if there are records retrieved, parse them
                for (SearchHit hit : search_hits) {
                    Map<String, Object> source_map = hit.getSourceAsMap();
                    try {
                        parse((String) source_map.get("message"));
                    } catch (Exception ex) {
                        log.error("Error while parsing: {}",
                                (String) source_map.get("message"));
                    }
                }
                // get sorting value of last record and do new request
                log.info("Getting sorting values");
                sort_values = search_response.getHits()
                        .getAt(search_hits.length - 1).getSortValues();
            } else {
                log.info("Waiting 1 minute for new entries");
                Thread.sleep(60000);
            }
            source_builder.searchAfter(sort_values);
            search_request.source(source_builder);
            search_response =
                    client.search(search_request, RequestOptions.DEFAULT);
            search_hits = search_response.getHits().getHits();
            log.info("Fetched hits: {}", search_hits.length);
            log.info("Searching after for new hits");
        }
    }
}
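The helper getSearchSourceBuilder used above is not shown in the post. A minimal sketch of what it might look like, assuming it just builds a match-all query sorted by the timestamp field with the ID as a tie-breaker (which is what searchAfter needs for deterministic paging):

import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.builder.SearchSourceBuilder;
import org.elasticsearch.search.sort.SortBuilders;
import org.elasticsearch.search.sort.SortOrder;

private static SearchSourceBuilder getSearchSourceBuilder(String timestampField, String idField, int pageSize) {
    SearchSourceBuilder source_builder = new SearchSourceBuilder();
    source_builder.query(QueryBuilders.matchAllQuery());
    source_builder.size(pageSize);
    // Sort ascending by timestamp, then by ID, so searchAfter can resume right after the last hit.
    source_builder.sort(SortBuilders.fieldSort(timestampField).order(SortOrder.ASC));
    source_builder.sort(SortBuilders.fieldSort(idField).order(SortOrder.ASC));
    return source_builder;
}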
I still would like to know if it is possible to do the same using a Watcher, also if anyone has any suggestions to make the code more elegant, please share.
Thank you

Why would a Cosmos stored procedure run differently when called from browser vs. called from Java?

I have a stored procedure in Cosmos DB Emulator. All this procedure does is: delete ALL documents from mycollection. When I run it in browser (https://localhost:8081/_explorer/index.html), it works great. Then I try to call it from Java:
RequestOptions requestOptions = new RequestOptions();
requestOptions.setPartitionKey(new PartitionKey(null));
System.out.println("START DELETE PROCEDURE");
StoredProcedureResponse spr = client.executeStoredProcedure(sprocLink, requestOptions, null);
System.out.println(spr.getResponseAsString());
And get the following result: {"deleted": 0,"continuation": false}
This is CRAZY. I run this stored procedure from the browser and get this result: {"deleted": 10,"continuation": false}. Then (after of course adding back those 10 documents) I run the procedure from Java and get this result: {"deleted": 0,"continuation": false}
So when the stored procedure is run from Java, it is called but doesn't do the job; it deletes nothing. Why would this happen?
Below is the stored procedure
function testStoredProcedure() {
    var collection = getContext().getCollection();
    var collectionLink = collection.getSelfLink();
    var response = getContext().getResponse();
    var responseBody = {
        deleted: 0,
        continuation: true
    };
    var query = 'SELECT * FROM mycollection ';

    // Validate input.
    if (!query) throw new Error("The query is undefined or null.");

    tryQueryAndDelete();

    // Recursively runs the query w/ support for continuation tokens.
    // Calls tryDelete(documents) as soon as the query returns documents.
    function tryQueryAndDelete(continuation) {
        var requestOptions = {continuation: continuation};
        var isAccepted = collection.queryDocuments(collectionLink, query, requestOptions, function (err, retrievedDocs, responseOptions) {
            if (err) throw err;
            if (retrievedDocs.length > 0) {
                // Begin deleting documents as soon as documents are returned from the query results.
                // tryDelete() resumes querying after deleting; no need to page through continuation tokens.
                // - this is to prioritize writes over reads given timeout constraints.
                tryDelete(retrievedDocs);
            } else if (responseOptions.continuation) {
                // Else if the query came back empty, but with a continuation token; repeat the query w/ the token.
                tryQueryAndDelete(responseOptions.continuation);
            } else {
                // Else if there are no more documents and no continuation token - we are finished deleting documents.
                responseBody.continuation = false;
                response.setBody(responseBody);
            }
        });

        // If we hit execution bounds - return continuation: true.
        if (!isAccepted) {
            response.setBody(responseBody);
        }
    }

    // Recursively deletes documents passed in as an array argument.
    // Attempts to query for more on empty array.
    function tryDelete(documents) {
        if (documents.length > 0) {
            // Delete the first document in the array.
            var isAccepted = collection.deleteDocument(documents[0]._self, {}, function (err, responseOptions) {
                if (err) throw err;
                responseBody.deleted++;
                documents.shift();
                // Delete the next document in the array.
                tryDelete(documents);
            });

            // If we hit execution bounds - return continuation: true.
            if (!isAccepted) {
                response.setBody(responseBody);
            }
        } else {
            // If the document array is empty, query for more documents.
            tryQueryAndDelete();
        }
    }
}
For partitioned containers, when executing a stored procedure, a partition key value must be provided in the request options. Stored procedures are always scoped to a single partition key; items that have a different partition key value are not visible to the stored procedure. This also applies to triggers.
You are setting the partition key to null in requestOptions. null is a valid partition key value, and it looks like null is not the partition key value of your 10 documents.
Humbly reposting @Jay Gong's answer How to specify NONE partition key for deleting a document in Document DB java SDK?
Maybe it will help someone. Put:
PartitionKey partitionKey = new PartitionKey(Undefined.Value());
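Put together with the code from the question, that would look roughly like this. This is a hedged sketch, assuming the classic com.microsoft.azure.documentdb SDK that the question appears to use, and that the 10 documents were written without a partition key value (i.e. they live under the "undefined" partition key rather than null):

import com.microsoft.azure.documentdb.PartitionKey;
import com.microsoft.azure.documentdb.RequestOptions;
import com.microsoft.azure.documentdb.StoredProcedureResponse;
import com.microsoft.azure.documentdb.Undefined;

RequestOptions requestOptions = new RequestOptions();
// Target the "undefined" partition key instead of null.
requestOptions.setPartitionKey(new PartitionKey(Undefined.Value()));
StoredProcedureResponse spr = client.executeStoredProcedure(sprocLink, requestOptions, null);
System.out.println(spr.getResponseAsString());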

Spring data aggregation query elasticsearch

I am trying to make the Elasticsearch query below work with Spring Data. The intent is to return unique results for the field "serviceName", just like a SELECT DISTINCT serviceName FROM table would do in a SQL database.
{
  "aggregations": {
    "serviceNames": {
      "terms": {
        "field": "serviceName"
      }
    }
  },
  "size": 0
}
I configured the field as a keyword and it made the query work perfectly against the index_name/_search API, as per the response snippet below:
"aggregations": {
"serviceNames": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "service1",
"doc_count": 20
},
{
"key": "service2",
"doc_count": 8
},
{
"key": "service3",
"doc_count": 8
}
]
}
}
My problem is that the same query doesn't work in Spring Data: when I try to run it with a StringQuery, I get the error below. I am guessing it uses a different API to run queries.
Cannot execute jest action , response code : 400 , error : {"root_cause":[{"type":"parsing_exception","reason":"no [query] registered for [aggregations]","line":2,"col":19}],"type":"parsing_exception","reason":"no [query] registered for [aggregations]","line":2,"col":19} , message : null
I have tried using the SearchQuery type to achieve the same result (no duplicates and no object loading), but I had no luck. The snippet below shows how I tried doing it.
final TermsAggregationBuilder aggregation = AggregationBuilders
        .terms("serviceName")
        .field("serviceName")
        .size(1);

SearchQuery searchQuery = new NativeSearchQueryBuilder()
        .withIndices("index_name")
        .withQuery(matchAllQuery())
        .addAggregation(aggregation)
        .withSearchType(SearchType.DFS_QUERY_THEN_FETCH)
        .withSourceFilter(new FetchSourceFilter(new String[] {"serviceName"}, new String[] {""}))
        .withPageable(PageRequest.of(0, 10000))
        .build();
Does anyone know how to achieve a distinct aggregation on an object property, without loading the objects, in Spring Data?
I tried many things to print the queries Spring Data generates, without success, maybe because I am using the com.github.vanroy.springdata.jest.JestElasticsearchTemplate implementation.
I got the query parts with the below:
logger.info("query:" + searchQuery.getQuery());
logger.info("agregations:" + searchQuery.getAggregations());
logger.info("filter:" + searchQuery.getFilter());
logger.info("search type:" + searchQuery.getSearchType());
It prints:
query:{"match_all":{"boost":1.0}}
agregations:[{"serviceName":{"terms":{"field":"serviceName","size":1,"min_doc_count":1,"shard_min_doc_count":0,"show_term_doc_count_error":false,"order":[{"_count":"desc"},{"_key":"asc"}]}}}]
filter:null
search type:DFS_QUERY_THEN_FETCH
I figured it out; maybe this can help someone. The aggregation doesn't come with the query results, but in a result of its own, and it is not mapped to any object. The object results that do come back are apparently samples of the query Elasticsearch ran to compute your aggregation (not sure, maybe).
I ended up creating a method that simulates what SELECT DISTINCT your_column FROM your_table would do in SQL, but I think this will only work on keyword fields, which have a limit of 256 characters if I am not wrong. I explained some lines in comments.
Thanks @Val, since I was only able to figure it out after debugging into the Jest code and checking the generated request and raw response.
public List<String> getDistinctField(String fieldName) {
    List<String> result = new ArrayList<>();
    try {
        final String distinctAggregationName = "distinct_field"; // name the aggregation
        final TermsAggregationBuilder aggregation = AggregationBuilders
                .terms(distinctAggregationName)
                .field(fieldName)
                .size(10000); // limits the size of the aggregation list; mine can be huge, adjust yours

        SearchQuery searchQuery = new NativeSearchQueryBuilder()
                .withIndices("your_index") // maybe can be omitted
                .addAggregation(aggregation)
                .withSourceFilter(new FetchSourceFilter(new String[] { fieldName }, new String[] { "" })) // retrieve only the field we are interested in; probably we can take this out
                .withPageable(PageRequest.of(0, 1)) // can't be zero, and I don't want to load 10 results every time it runs; will always return one object since I found no "size":0 in the query builder
                .build();

        // Had to use the JestResultsExtractor because com.github.vanroy.springdata.jest.JestElasticsearchTemplate
        // doesn't have an implementation for ResultsExtractor; if you use the Spring defaults, you can probably use that.
        final JestResultsExtractor<SearchResult> extractor = new JestResultsExtractor<SearchResult>() {
            @Override
            public SearchResult extract(SearchResult searchResult) {
                return searchResult;
            }
        };

        final SearchResult searchResult = ((JestElasticsearchTemplate) elasticsearchOperations).query(searchQuery,
                extractor);
        final MetricAggregation aggregations = searchResult.getAggregations();
        final TermsAggregation termsAggregation = aggregations.getTermsAggregation(distinctAggregationName); // this is where your aggregation results are, in "buckets"
        result = termsAggregation.getBuckets().parallelStream().map(TermsAggregation.Entry::getKey)
                .collect(Collectors.toList());
    } catch (Exception e) {
        // handle your error here
        e.printStackTrace();
    }
    return result;
}
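Usage is then a single call, for example:

// Equivalent of SELECT DISTINCT serviceName FROM table
List<String> serviceNames = getDistinctField("serviceName");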

How to query nested mongodb fields efficiently in Java?

I'm not very experienced with running Mongo queries from Java, so I'm no expert at commands. I have a Mongo collection with ~6500 documents, each containing multiple fields (some of which have sub-fields), like below:
"_id" : NumberLong(714847),
"franchiseIds" : [
NumberLong(714848),
NumberLong(714849)
],
"profileSettings" : {
"DISCLAIMER_SETUP" : {
"settingType" : "DISCLAIMER_SETUP",
"attributeMap" : {
...
I want to have an operation which will go through the entire collection from time to time and calculate how many franchiseIds are present, since different documents could have anywhere from 1 to 4 franchiseIds.
From the Mongo shell, I did a very simple script to get this, and it calculated the result immediately:
rs_default:SECONDARY> var totalCount = 0;
rs_default:SECONDARY> db.profiles.find().forEach( function(profile) { totalCount += profile.franchiseIds.length } );
rs_default:SECONDARY> totalCount
However, when I attempted to do the same thing in Java, which is where this would run on the server from time to time, it was much less performant, taking around 15 seconds to complete:
int result = 0;
List<Profile> allProfiles = mongoTemplate.findAll(Profile.class, PROFILE_COLLECTION);
for (Profile profile : allProfiles) {
    result += profile.getFranchiseIds().size();
}
return result;
I realize the above isn't performant in Java as it's having to allocate memory for all of the Profiles being loaded in. In the Mongo shell script, is Mongo simply taking care of this itself?
Any ideas how I can do something similar in Java?
EDIT:
I returned only the franchiseIds field on the response from Mongo, and that helped significantly. Below is the improved code:
int result = 0;
final Query query = new Query();
query.fields().include(FRANCHISE_IDS);
final List<Profile> allProfiles = mongoTemplate.find(query, Profile.class, PROFILE_COLLECTION);
for (Profile profile : allProfiles) {
    result += profile.getFranchiseIds().size();
}
return result;
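If even the projected documents are more than you want to load, the counting can also be pushed entirely to the server with an aggregation. A hedged sketch, assuming Spring Data MongoDB's MongoTemplate and the same PROFILE_COLLECTION constant (the method name and the intermediate "franchiseCount"/"total" field names are illustrative, and it assumes every document has a franchiseIds array):

import org.bson.Document;
import org.springframework.data.mongodb.core.aggregation.Aggregation;
import org.springframework.data.mongodb.core.aggregation.AggregationResults;

public int countFranchiseIds() {
    // $project each document to the size of its franchiseIds array, then $group/$sum them all.
    Aggregation aggregation = Aggregation.newAggregation(
            Aggregation.project().and("franchiseIds").size().as("franchiseCount"),
            Aggregation.group().sum("franchiseCount").as("total"));
    AggregationResults<Document> results =
            mongoTemplate.aggregate(aggregation, PROFILE_COLLECTION, Document.class);
    Document doc = results.getUniqueMappedResult();
    return doc == null ? 0 : ((Number) doc.get("total")).intValue();
}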

Why does sorting in Elasticsearch not sort the data properly?

First question: I have around 45000 documents.
I want to sort that data on the chrom and pos keys. I have written the sort query shown below.
// The script below sorts the chromosomes
SortBuilder builder = new ScriptSortBuilder("s = doc['chrom'].value; s=s.substring(3); s.indexOf('X')!=-1?23:s.indexOf('Y')!=-1?24:s.indexOf('MT')!=-1?25:s.indexOf('M')!=-1?25:s;" +
        "n = org.elasticsearch.common.primitives.Ints.tryParse(s); if (n != null) { String.format(\"%010d\",n)} else { s }", String.class.getSimpleName().toLowerCase());

SearchRequestBuilder setQuery = this.getClient().prepareSearch(this.getIndex()).setTypes(this.getType())
        .addSort(builder)
        .addSort(Keys.POS.toLowerCase(), SortOrder.ASC)
        .setQuery(QueryBuilders.matchQuery(Keys.SAMPLE_ID_DB_KEY, entityID.toLowerCase()))
        .setSize(100)
        .setSearchType(SearchType.QUERY_AND_FETCH)
        .setScroll(new TimeValue(60000000));
However, after firing the query I received multiple bunches of data, where each bunch is sorted, but irrespective of the data in the other bunches (i.e. if an entry 1:11111 is present in the 1st bunch, there can be an entry in the second bunch with a value less than 1:11111).
Am I missing something?
Second question: when I do not specify the size in the query, it does not return all 45000 entries. Why is that?
Edit
Data in JSON format
{
    "chrom": "chr1",
    "pos": 762273,
    "isIndel": false,
    "interpretation": "",
    "sampleID": "xyz",
    "isSignedOff": false,
    "ownerID": null,
    "entityType": 0
}
Switch to SearchType.QUERY_THEN_FETCH instead of SearchType.QUERY_AND_FETCH. With QUERY_AND_FETCH the query and fetch phases are combined and each shard returns its own hits, so every batch is sorted only within a shard; QUERY_THEN_FETCH merges the sorted hits from all shards before returning them, which gives a globally sorted result.
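In the code from the question, the only change is the search type (shown here against the same request builder, assuming everything else stays as-is):

SearchRequestBuilder setQuery = this.getClient().prepareSearch(this.getIndex()).setTypes(this.getType())
        .addSort(builder)
        .addSort(Keys.POS.toLowerCase(), SortOrder.ASC)
        .setQuery(QueryBuilders.matchQuery(Keys.SAMPLE_ID_DB_KEY, entityID.toLowerCase()))
        .setSize(100)
        .setSearchType(SearchType.QUERY_THEN_FETCH) // was QUERY_AND_FETCH
        .setScroll(new TimeValue(60000000));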
