MongoDB: Getting entries with maximum version-id after grouping - java

I'm rather new to MongoDB and I'm trying to create a query which I thought would be pretty trivial (well, at least with SQL it would be), but I can't get it done.
I have a collection patients. In this collection a single patient is identified by the id property (NOT MongoDB's _id!). There can be multiple versions of a single patient; the version is determined by the meta.versionId field.
In order to query for all "current versions of patients", I need to get, for every patient id, the document with the maximum versionId.
So far I've got this:
AggregateIterable<Document> allPatients = db.getCollection("patients").aggregate(Arrays.asList(
new Document("$group", new Document("_id", "$id")
.append("max", new Document("$max", "$meta.versionId")))));
allPatients.forEach(new Block<Document>() {
@Override
public void apply(final Document document) {
System.out.println(document.toJson());
}
});
Which results in the following output (using my very limited test data):
{ "_id" : "2.25.260185450267055504591276882440338245053", "max" : "5" }
{ "_id" : "2.25.260185450267055504591276882441338245099", "max" : "0" }
This seems to work so far, but I need the whole patient documents, not just the id and the maximum version.
Now I only know that for the id 2.25.260185450267055504591276882440338245053 the max version is "5", and so on. Of course I could now issue a separate query for every single entry and sequentially fetch each patient document for a specific id/versionId combination from MongoDB, but this seems like a terrible solution! Is there any other way to get it done?

If you know the fields that you want to retrieve, say patient name, address, etc., I guess you can append those fields to the document with value 1.
AggregateIterable<Document> allPatients = db.getCollection("patients").aggregate(Arrays.asList(
new Document("$group", new Document("_id", "$id")
.append("max", new Document("$max", "$meta.versionId")).append("name",1).append("address",1))));

An approach that could work for you would be to first order the documents entering the pipeline by the meta.versionId field using the $sort pipeline operator. However, be aware that the $sort stage has a limit of 100 megabytes of RAM. By default, if it exceeds this limit, $sort will produce an error.
To allow for the handling of large datasets, set the allowDiskUse option to true to enable $sort operations to write to temporary files. See the allowDiskUse option in aggregate() method for details.
After sorting, you can then group the ordered documents, carry out the aggregation using the $first or $last operators (depending on the previous sort direction) to get the other fields.
Consider running the following mongo shell pipeline operation as a way of demonstrating this concept:
Mongo shell
pipeline = [
{ "$sort": {"meta.versionId": -1}}, // order the documents by the versionId field descending
{
"$group": {
"_id": "$id",
"max": { "$first": "$meta.versionId" }, // get the maximum versionId
"active": { "$first": "$active" }, // Whether this patient's record is in active use
"name": { "$first": "$name" }, // A name associated with the patient
"telecom": { "$first": "$telecom" }, // A contact detail for the individual
"gender": { "$first": "$gender" }, // male | female | other | unknown
"birthDate": { "$first": "$birthDate" } // The date of birth for the individual
/*
And many other fields
*/
}
}
]
db.patients.aggregate(pipeline);
Java test implementation
import java.net.UnknownHostException;
import java.util.Arrays;
import java.util.List;

import com.mongodb.AggregationOutput;
import com.mongodb.BasicDBObject;
import com.mongodb.DB;
import com.mongodb.DBCollection;
import com.mongodb.DBObject;
import com.mongodb.MongoClient;

public class JavaAggregation {
    public static void main(String[] args) throws UnknownHostException {
        MongoClient mongo = new MongoClient();
        DB db = mongo.getDB("test");
        DBCollection coll = db.getCollection("patients");
        // create the pipeline operations, first with the $sort
        DBObject sort = new BasicDBObject("$sort",
            new BasicDBObject("meta.versionId", -1)
        );
        // build the $group operations
        DBObject groupFields = new BasicDBObject("_id", "$id");
        groupFields.put("max", new BasicDBObject("$first", "$meta.versionId"));
        groupFields.put("active", new BasicDBObject("$first", "$active"));
        groupFields.put("name", new BasicDBObject("$first", "$name"));
        groupFields.put("telecom", new BasicDBObject("$first", "$telecom"));
        groupFields.put("gender", new BasicDBObject("$first", "$gender"));
        groupFields.put("birthDate", new BasicDBObject("$first", "$birthDate"));
        // append any other necessary fields
        DBObject group = new BasicDBObject("$group", groupFields);
        List<DBObject> pipeline = Arrays.asList(sort, group);
        AggregationOutput output = coll.aggregate(pipeline);
        for (DBObject result : output.results()) {
            System.out.println(result);
        }
    }
}
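To make the sort-then-$first idea concrete without a database, here is a minimal plain-Java sketch that works on in-memory maps. The class LatestVersionSketch, its helpers, and the sample data are hypothetical and only model the result of the pipeline. The sketch parses meta.versionId as a number because the output in the question shows it stored as a string, and strings compare lexicographically (so "10" would rank below "5").

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class LatestVersionSketch {

    // Reads meta.versionId from a document and parses it as a number.
    static long versionOf(Map<String, Object> doc) {
        @SuppressWarnings("unchecked")
        Map<String, Object> meta = (Map<String, Object>) doc.get("meta");
        return Long.parseLong(String.valueOf(meta.get("versionId")));
    }

    // Models "$sort by version descending, then $group by id keeping the first document".
    static Map<String, Map<String, Object>> latestPerId(List<Map<String, Object>> patients) {
        List<Map<String, Object>> sorted = new ArrayList<>(patients);
        Comparator<Map<String, Object>> byVersion =
            Comparator.comparingLong(LatestVersionSketch::versionOf);
        sorted.sort(byVersion.reversed());
        Map<String, Map<String, Object>> latest = new LinkedHashMap<>();
        for (Map<String, Object> doc : sorted) {
            // The first document seen per id is the one with the highest version.
            latest.putIfAbsent((String) doc.get("id"), doc);
        }
        return latest;
    }

    // Builds a hypothetical patient document for the demo.
    static Map<String, Object> patient(String id, String versionId, String name) {
        Map<String, Object> meta = new LinkedHashMap<>();
        meta.put("versionId", versionId);
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("id", id);
        doc.put("meta", meta);
        doc.put("name", name);
        return doc;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> patients = Arrays.asList(
            patient("p1", "5", "Ann (v5)"),
            patient("p1", "10", "Ann (v10)"),
            patient("p2", "0", "Bob (v0)"));
        latestPerId(patients).values().forEach(System.out::println);
    }
}
```

Keeping the whole map per id mirrors what the pipeline would return if every field were carried along with $first.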

Related

Spring Data MongoOperations: Not able to remove sub-document in array with pull method

Following is the structure of my MongoDB document userActivity.
{
"_id" : ObjectId("5e49569f93e956eeb28eb8a6"),
"userId" : "123",
"likes" : {
"videos" : [
{
"_id" : "abc",
"title" : "This video is part of test setup"
}
]
}
}
I am using Spring Data MongoOperations to manipulate MongoDB collections. And below is the code to remove a video from videos array in likes sub-document. I have tried to first filter the document as per the user's userId. And then apply filter to update function as per videoId.
public UpdateResult removeVideoLike(String videoId, String userId) {
Query queryUser = Query.query( Criteria.where("userId").is(userId) );
Query queryVideo = Query.query( Criteria.where("id").is(videoId) );
Update update = new Update().pull("likes.videos", queryVideo );
return mongoOperations.updateFirst( queryUser , update, UserActivity.class );
}
This runs without errors but the entry is not removed. The UpdateResult has following values
matchedCount = 1
modifiedCount = 0
upsertedId = null
I am confused: if it is able to match the entry in the array, why is it not removing it? What am I missing?
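One thing worth checking, offered as an assumption rather than a confirmed diagnosis: the stored array elements use the key _id, while the pull condition filters on id. matchedCount = 1 only says that the userId filter found the document; if the pull condition matches no array element, modifiedCount stays 0. The plain-Java sketch below mimics a simplified, equality-only version of that matching rule on in-memory maps. PullSketch and pull are hypothetical names, and no Spring or MongoDB API is involved.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class PullSketch {

    // Removes every element that contains all key/value pairs of the condition
    // and returns how many elements were removed (0 means "nothing modified").
    static int pull(List<Map<String, Object>> array, Map<String, Object> condition) {
        int before = array.size();
        array.removeIf(element -> element.entrySet().containsAll(condition.entrySet()));
        return before - array.size();
    }

    // Builds the likes.videos array from the question's sample document.
    static List<Map<String, Object>> sampleVideos() {
        Map<String, Object> video = new LinkedHashMap<>();
        video.put("_id", "abc");
        video.put("title", "This video is part of test setup");
        List<Map<String, Object>> videos = new ArrayList<>();
        videos.add(video);
        return videos;
    }

    public static void main(String[] args) {
        // Condition on "id": no element has that key, so nothing is removed.
        System.out.println(pull(sampleVideos(), Collections.singletonMap("id", "abc")));  // prints 0
        // Condition on "_id": the element matches and is removed.
        System.out.println(pull(sampleVideos(), Collections.singletonMap("_id", "abc"))); // prints 1
    }
}
```

If this is what is happening, building the pull condition on _id instead of id would be the first thing to try.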

Iterate over aggregation on MongoDB with Java

I'm currently having some problems using Java, aggregations, and MongoDB together.
I have 2 collections in a MongoDB database.
example collection: person
{
id: 1,
name: "Oliver",
companyId: 5
}
example collection: company
{
id: 5,
name: "asdf"
}
Now I want to join those collections by companyId/id (lookup aggregation?) and iterate over the result. I don't want to load the whole result set into memory, but rather iterate one by one. I think I need some kind of cursor (MongoCursor?).
I'm working with Java and Spring, so I can use either the MongoDB Java driver (version 3.7.1) or the options provided by the Spring Framework (version 5.0.6).
edit:
In the following example Cursor.hasNext() is always false.
DBObject match = new BasicDBObject("$match",
new BasicDBObject("companyId", "id"));
DBObject lookupFields = new BasicDBObject("from", "company");
lookupFields.put("localField", "companyId");
lookupFields.put("foreignField", "id");
lookupFields.put("as", "personWithCompany");
DBObject lookup = new BasicDBObject("$lookup", lookupFields);
DBObject projectFields = new BasicDBObject("id", 1);
projectFields.put("name", 1);
projectFields.put("companyName", "$company.name");
DBObject project = new BasicDBObject("$project", projectFields);
List<DBObject> pipeline = Arrays.asList(match, lookup, project);
Cursor cursor = mongoTemplate.getCollection("person").aggregate(pipeline, AggregationOptions.builder().allowDiskUse(true).build());
while (cursor.hasNext()) {
DBObject dbObject = cursor.next();
}
db.person.aggregate([
{
"$lookup": {
"from": "company",
"localField": "companyId",
"foreignField": "id",
"as": "PersonDetails"
}
},
{ "$unwind": "$PersonDetails" }
]);
This joins the person collection with the company collection.
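Note that the $match stage in the question compares companyId with the literal string "id", which none of the sample documents satisfy, so an empty cursor is to be expected from that pipeline. For iterating one by one with the 3.7 driver, MongoCollection.aggregate(...) returns an AggregateIterable whose iterator() gives a MongoCursor. What $lookup followed by $unwind produces can be pictured with the plain-Java sketch below, which yields joined rows lazily from a Stream. LookupSketch, doc, and joinLazily are hypothetical names, and the data is in memory rather than in MongoDB.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class LookupSketch {

    // Builds a document from alternating keys and values.
    static Map<String, Object> doc(Object... keyValues) {
        Map<String, Object> d = new LinkedHashMap<>();
        for (int i = 0; i < keyValues.length; i += 2) {
            d.put((String) keyValues[i], keyValues[i + 1]);
        }
        return d;
    }

    // Yields one joined document per (person, matching company) pair, like
    // $lookup on companyId/id followed by $unwind. The company index is built
    // up front; the joined rows are only produced as the stream is consumed.
    static Stream<Map<String, Object>> joinLazily(List<Map<String, Object>> persons,
                                                  List<Map<String, Object>> companies) {
        Map<Object, List<Map<String, Object>>> companiesById =
            companies.stream().collect(Collectors.groupingBy(c -> c.get("id")));
        return persons.stream().flatMap(person ->
            companiesById.getOrDefault(person.get("companyId"), Collections.emptyList())
                .stream()
                .map(company -> {
                    Map<String, Object> joined = new LinkedHashMap<>(person);
                    joined.put("PersonDetails", company);
                    return joined;
                }));
    }

    public static void main(String[] args) {
        List<Map<String, Object>> persons =
            Arrays.asList(doc("id", 1, "name", "Oliver", "companyId", 5));
        List<Map<String, Object>> companies =
            Arrays.asList(doc("id", 5, "name", "asdf"));
        joinLazily(persons, companies).forEach(System.out::println);
    }
}
```

Persons without a matching company drop out of the result, which is also what $unwind does by default when the joined array is empty.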

Retrieve only the queried element in an object array in MongoDB using java

Suppose we have the following documents in a MongoDB collection:
{
"_id":ObjectId("562e7c594c12942f08fe4192"),
"shapes":[
{
"shape":"square",
"color":"blue"
},
{
"shape":"circle",
"color":"red"
}
]
},
{
"_id":ObjectId("562e7c594c12942f08fe4193"),
"shapes":[
{
"shape":"square",
"color":"black"
},
{
"shape":"circle",
"color":"green"
}
]
}
And the MongoDB query is
db.test.find({"shapes.color": "red"}, {_id: 0, 'shapes.$': 1});
Can someone tell me how to write it in Java?
I am using:
List<BasicDBObject> obj1 = new ArrayList<>();
obj1.add(new BasicDBObject("shapes.color", "red"));
List<BasicDBObject> obj2 = new ArrayList<>();
obj2.add(new BasicDBObject("shapes.$", "1"));
BasicDBObject parameters1 = new BasicDBObject();
parameters1.put("$and", obj1);
DBCursor cursor = table.find(parameters1,obj2).limit(500);
and I am not getting anything.
The syntax of the Mongo Shell find function is:
db.collection.find(query, projection)
query document Optional. Specifies selection filter using query operators. To return all documents in a collection, omit this parameter or pass an empty document ({}).
projection document Optional. Specifies the fields to return in the documents that match the query filter.
When translating this for execution by the Mongo Java driver you need to construct separate BasicDBObject instances for:
the query
the projection
Here's an example:
MongoCollection<Document> table = ...;
// {"shapes.color": "red"}
BasicDBObject query = new BasicDBObject("shapes.color", "red");
// {_id: 0, 'shapes.$': 1}
BasicDBObject projection = new BasicDBObject("shapes.$", 1).append("_id", 0);
FindIterable<Document> documents = table
// assign the query
.find(query)
// assign the projection
.projection(projection);
System.out.println(documents.first().toJson());
Given the sample documents included in your question the above code will print out:
{
"shapes": [
{
"shape": "circle",
"color": "red"
}
]
}
This is identical to the output from db.test.find({"shapes.color": "red"}, {_id: 0, 'shapes.$': 1});.
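As a rough picture of what the 'shapes.$' projection does, the plain-Java sketch below keeps only the first array element that satisfies the query condition. It runs on in-memory maps; PositionalProjectionSketch, shape, and firstMatching are hypothetical names used for illustration only.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

final class PositionalProjectionSketch {

    // Builds one element of the shapes array.
    static Map<String, String> shape(String shape, String color) {
        Map<String, String> s = new LinkedHashMap<>();
        s.put("shape", shape);
        s.put("color", color);
        return s;
    }

    // Returns the first element whose "color" equals the requested value, if any.
    static Optional<Map<String, String>> firstMatching(List<Map<String, String>> shapes,
                                                       String color) {
        return shapes.stream()
            .filter(s -> color.equals(s.get("color")))
            .findFirst();
    }

    public static void main(String[] args) {
        List<Map<String, String>> shapes =
            Arrays.asList(shape("square", "blue"), shape("circle", "red"));
        firstMatching(shapes, "red").ifPresent(System.out::println);
    }
}
```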

Convert this Mongodb command to java (aggregate, remove)

How can I convert this MongoDB command to Java?
This command finds the duplicate documents and counts them, then removes the duplicates while keeping one of each.
I'm using MongoDB 3.
var duplicates = [];
db.sitesToVisit.aggregate([
{ $match: {
Link: { "$ne": '' } // discard selection criteria
}},
{ $group: {
_id: { Link: "$Link"}, // can be grouped on multiple properties
dups: { "$addToSet": "$_id" },
count: { "$sum": 1 }
}},
{ $match: {
count: { "$gt": 1 } // Duplicates considered as count greater than one
}}
]).forEach(function(doc) {
doc.dups.shift(); // First element skipped for deleting
doc.dups.forEach( function(dupId){
duplicates.push(dupId); // Getting all duplicate ids
}
)
})
// If you want to Check all "_id" which you are deleting else print statement not needed
printjson(duplicates);
// Remove all duplicates in one go
db.sitesToVisit.remove({_id:{$in:duplicates}})
And this is the Java code I tried, but I got confused by the forEach part and the removal.
DBCollection links = db.getCollection("sitesToVisit");
DBObject match = new BasicDBObject("$match", new BasicDBObject("Link", new BasicDBObject("$ne", "")));
DBObject group = new BasicDBObject("$group",
new BasicDBObject("_id", new BasicDBObject("Link", "$Link"))
.append("dups", new BasicDBObject("$addToSet", "$_id"))
.append("count", new BasicDBObject("$sum", 1)));
DBObject matchDuplicates = new BasicDBObject("$match", new BasicDBObject("count",new BasicDBObject("$gt", 1)));
AggregationOutput output = links.aggregate(match,group,matchDuplicates);
output.results().forEach((s)->{
System.out.println(s);
});
Thanks in advance.
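The forEach part of the shell script can be sketched in plain Java, independent of the driver: for each group, skip the first id in dups and collect the rest. DuplicateIdsSketch and idsToRemove are hypothetical names. With the legacy driver, each dups value would have to be read from the result DBObject first, and the collected ids could then go into a remove with $in, as in the shell version.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class DuplicateIdsSketch {

    // For every group keep the first id and collect all remaining ids for deletion.
    static List<Object> idsToRemove(List<List<Object>> dupsPerGroup) {
        List<Object> duplicates = new ArrayList<>();
        for (List<Object> dups : dupsPerGroup) {
            // subList(1, size) skips the first element, like dups.shift() in the shell.
            duplicates.addAll(dups.subList(Math.min(1, dups.size()), dups.size()));
        }
        return duplicates;
    }

    public static void main(String[] args) {
        List<List<Object>> groups = Arrays.asList(
            Arrays.<Object>asList("id1", "id2", "id3"),
            Arrays.<Object>asList("id4", "id5"));
        System.out.println(idsToRemove(groups)); // prints [id2, id3, id5]
    }
}
```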

using $$ROOT inside Spring Data Mongodb for retrieving whole document

Using the mongodb shell, I am able to perform an aggregation query that retrieves the whole document.
In order to do that I use the $$ROOT variable.
db.reservations.aggregate([
{ $match : { hotelCode : "0360" } },
{ $sort : { confirmationNumber : -1 , timestamp: -1 } },
{ $group : {
_id : "$confirmationNumber",
timestamp :{$first : "$timestamp"},
fullDocument :{$first : "$$ROOT"}
}}
])
It retrieves objects whose content is confirmationNumber, timestamp, fullDocument.
fullDocument is the whole document.
I am wondering if it is possible to do the same with Spring-Data and the aggregation framework.
My java code is:
TypedAggregation<ReservationImage> aggregation = newAggregation(
ReservationImage.class,
match(where("hotelCode").is(hotelCode)),
sort(Direction.DESC,"confirmationNumber","timestamp"),
group("confirmationNumber").
first("timestamp").as("timestamp").
first("$$ROOT").as("reservationImage"));
List<myClass> items = mongoTemplate.aggregate(
aggregation,
myClass.class).getMappedResults();
the error is :
org.springframework.data.mapping.PropertyReferenceException: No property $$ found for type myClass
Do you have any ideas?
Thanks.
We created https://jira.spring.io/browse/DATAMONGO-954 to track the support for accessing System Variables from MongoDB Pipeline expressions.
Once that is in place, you should be able to write:
Aggregation agg = newAggregation( //
match(where("hotelCode").is("0360")), //
sort(Direction.DESC, "confirmationNumber", "timestamp"), //
group("confirmationNumber") //
.first("timestamp").as("timestamp") //
.first(Aggregation.ROOT).as("reservationImage") //
);
I've seen this sort of thing before and it is not just limited to variable names such as $$ROOT. Spring Data has its own ideas about how to map "properties" of the document in the pipeline. Another common problem is simply projecting a new or calculated field that essentially has a new "property" name that does not get recognized.
Probably the best approach is to "step down" from using the helper classes and methods and construct the pipeline as BSON documents. You can even get the underlying collection object and the raw output as a BSON document, yet still cast to your typed List at the end.
Your mileage may vary depending on your actual approach, but essentially:
DBObject match = new BasicDBObject(
"$match", new BasicDBObject(
"hotelCode", "0360"
)
);
DBObject sort = new BasicDBObject(
"$sort", new BasicDBObject(
"confirmationNumber", -1
).append("timestamp", -1)
);
DBObject group = new BasicDBObject(
"$group", new BasicDBObject(
"_id", "$confirmationNumber"
).append(
"timestamp", new BasicDBObject(
"$first", "$timestamp"
)
).append(
"reservationImage", new BasicDBObject(
"$first", "$$ROOT"
)
)
);
List<DBObject> pipeline = Arrays.asList(match,sort,group);
DBCollection collection = mongoOperation.getCollection("collection");
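// collection.aggregate(...) returns an AggregationOutput, not a DBObject
AggregationOutput output = collection.aggregate(pipeline);
// Map each raw result back to the typed class (assumes the template's MongoConverter can read myClass)
List<myClass> items = new ArrayList<myClass>();
for (DBObject result : output.results()) {
items.add(mongoOperation.getConverter().read(myClass.class, result));
}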
The main thing is moving away from the helpers that are getting in the way and constructing the pipeline as it should be free of the imposed restrictions.
