How can I convert this MongoDB command to Java?
This command finds the duplicate documents and counts them, then removes the duplicates while keeping one document from each group.
I'm using MongoDB 3.
var duplicates = [];
db.sitesToVisit.aggregate([
    { $match: {
        Link: { "$ne": '' } // discard selection criteria
    }},
    { $group: {
        _id: { Link: "$Link" }, // can be grouped on multiple properties
        dups: { "$addToSet": "$_id" },
        count: { "$sum": 1 }
    }},
    { $match: {
        count: { "$gt": 1 } // Duplicates considered as count greater than one
    }}
]).forEach(function(doc) {
    doc.dups.shift(); // First element skipped for deleting
    doc.dups.forEach(function(dupId) {
        duplicates.push(dupId); // Getting all duplicate ids
    });
});
// If you want to check all "_id" values which you are deleting; otherwise the print statement is not needed
printjson(duplicates);
// Remove all duplicates in one go
db.sitesToVisit.remove({_id: {$in: duplicates}})
This is the Java code I tried, but I got confused by the forEach part and the removal.
DBCollection links = db.getCollection("sitesToVisit");
DBObject match = new BasicDBObject("$match",
        new BasicDBObject("Link", new BasicDBObject("$ne", "")));
DBObject group = new BasicDBObject("$group",
        new BasicDBObject("_id", new BasicDBObject("Link", "$Link"))
                .append("dups", new BasicDBObject("$addToSet", "$_id"))
                .append("count", new BasicDBObject("$sum", 1)));
DBObject sort = new BasicDBObject("$match",
        new BasicDBObject("count", new BasicDBObject("$gt", 1)));
AggregationOutput output = links.aggregate(match, group, sort);
output.results().forEach((s) -> {
    System.out.println(s);
});
Thanks in advance.
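For the forEach and removal part, one possible shape is sketched below. It is plain Java with no driver dependency: each aggregation result is modelled simply as its list of dups ids, and the class name, method name and sample ids are made up for illustration. With the legacy driver used above, the same loop would presumably read "dups" from each DBObject in output.results(), and the collected ids would then go into something like links.remove(new BasicDBObject("_id", new BasicDBObject("$in", duplicates))).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: mirrors doc.dups.shift() followed by duplicates.push(dupId).
final class DuplicateIds {

    // For every group, keep the first id and collect the remaining ones for deletion.
    static <T> List<T> idsToRemove(List<List<T>> dupsPerGroup) {
        List<T> duplicates = new ArrayList<>();
        for (List<T> dups : dupsPerGroup) {
            if (dups.size() > 1) {
                // subList(1, size) skips the first element without mutating the input
                duplicates.addAll(dups.subList(1, dups.size()));
            }
        }
        return duplicates;
    }

    public static void main(String[] args) {
        // Sample ids are invented; real values would be the _id values from "dups".
        List<List<String>> groups = Arrays.asList(
                Arrays.asList("id1", "id2", "id3"),
                Arrays.asList("id4", "id5"));
        System.out.println(idsToRemove(groups)); // [id2, id3, id5]
    }
}
```

Collecting all ids first and issuing a single $in removal keeps the behaviour of the shell script, which also deletes everything in one go.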
Suppose we have the following documents in a MongoDB collection:
{
"_id":ObjectId("562e7c594c12942f08fe4192"),
"shapes":[
{
"shape":"square",
"color":"blue"
},
{
"shape":"circle",
"color":"red"
}
]
},
{
"_id":ObjectId("562e7c594c12942f08fe4193"),
"shapes":[
{
"shape":"square",
"color":"black"
},
{
"shape":"circle",
"color":"green"
}
]
}
And the MongoDB query is
db.test.find({"shapes.color": "red"}, {_id: 0, 'shapes.$': 1});
Can someone tell me how to write it in Java?
I am using:
List<BasicDBObject> obj = new ArrayList<>();
obj1.add(new BasicDBObject("shapes.color", "red"));
List<BasicDBObject> obj1 = new ArrayList<>();
obj2.add(new BasicDBObject("shapes.$", "1"));
BasicDBObject parameters1 = new BasicDBObject();
parameters1.put("$and", obj1);
DBCursor cursor = table.find(parameters1,obj2).limit(500);
and I am not getting anything.
The syntax of the Mongo Shell find function is:
db.collection.find(query, projection)
query (document): Optional. Specifies the selection filter using query operators. To return all documents in a collection, omit this parameter or pass an empty document ({}).
projection (document): Optional. Specifies the fields to return in the documents that match the query filter.
When translating this for execution by the Mongo Java driver, you need to construct separate BasicDBObject instances for the query and for the projection.
Here's an example:
MongoCollection<Document> table = ...;
// {"shapes.color": "red"}
BasicDBObject query = new BasicDBObject("shapes.color", "red");
// {_id: 0, 'shapes.$': 1}; note the numeric 1, matching the shell query
BasicDBObject projection = new BasicDBObject("shapes.$", 1).append("_id", 0);
FindIterable<Document> documents = table
        // assign the query
        .find(query)
        // assign the projection
        .projection(projection);
System.out.println(documents.first().toJson());
Given the sample documents included in your question the above code will print out:
{
"shapes": [
{
"shape": "circle",
"color": "red"
}
]
}
This is identical to the output from db.test.find({"shapes.color": "red"}, {_id: 0, 'shapes.$': 1});.
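To illustrate what the positional shapes.$ projection does to a matching document, here is a small in-memory sketch in plain Java. It is not driver code: the class and helper names are invented, and it only models the "keep the first array element that matches the query condition" behaviour for a single document.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model of the 'shapes.$' projection for one document.
final class PositionalProjection {

    // Convenience builder for one element of the "shapes" array.
    static Map<String, String> shape(String shape, String color) {
        Map<String, String> m = new LinkedHashMap<>();
        m.put("shape", shape);
        m.put("color", color);
        return m;
    }

    // Returns a one-element list holding the first shape with the given color,
    // or an empty list if none matches.
    static List<Map<String, String>> project(List<Map<String, String>> shapes, String color) {
        for (Map<String, String> s : shapes) {
            if (color.equals(s.get("color"))) {
                return Collections.singletonList(s);
            }
        }
        return Collections.emptyList();
    }

    public static void main(String[] args) {
        List<Map<String, String>> shapes = Arrays.asList(
                shape("square", "blue"),
                shape("circle", "red"));
        System.out.println(project(shapes, "red")); // [{shape=circle, color=red}]
    }
}
```

In a real find() call, documents without any matching element are not returned at all, so the empty-list branch only exists to keep the sketch total.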
The docs note:
To use $text in the $match stage, the $match stage has to be the first stage of the pipeline.
Some example JSON:
{"pid":"b00l16vp", "title": "in our time","categories":{"category1":["factual", "arts culture and the media", "history"]}}
{"pid":"b0079mpp", "title": "doctor who", "categories":{"category2":["childrens", "entertainment and comedy", "animation"],"category1":["signed"]}}
{"pid":"b00htbn3"}
{"pid":"b00gdhqw","categories":{"category2":["factual"],"category3":["scotland"],"category4":["lifestyle and leisure", "food and drink"],"category1":["entertainment", "games and quizzes"]}}
I have the following query:
List<BasicDBObject> pipeline = new ArrayList<>()
BasicDBObject criteria = new BasicDBObject()
BasicDBObject theProjections = new BasicDBObject()
AggregateIterable iterable
//value is coming from a parameter
if (value != null) {
//a text index has been created on the title field
criteria.put('$text', new BasicDBObject('$search', value))
}
//cats is coming from a parameter but it will be an array of Strings
if (cats.length != 0) {
ArrayList<BasicDBObject> orList = new ArrayList<>()
ArrayList<BasicDBObject> andList = new ArrayList<>()
BasicDBList theMegaArray = new BasicDBList()
for (int i = 1; i <= 5; i++) {
String identifier = "categories.category" + i
String cleanIdentifier = '$' + identifier
//If the category does not exist, put in a blank category
theMegaArray.add(new BasicDBObject('$ifNull', Arrays.asList(cleanIdentifier, Collections.EMPTY_LIST)))
}
//merges all of the category arrays into 1
theProjections.put("allCategories", new BasicDBObject('$setUnion', theMegaArray))
orList.add(new BasicDBObject("allCategories", new BasicDBObject('$all', cats)))
andList.add(new BasicDBObject('$or', orList))
criteria.put('$and', andList)
}
pipeline.add(new BasicDBObject('$project', theProjections))
pipeline.add(new BasicDBObject('$match', criteria))
//and by default
iterable = collection.aggregate(pipeline)
The issue is that to filter on cats, the $project has to come before the $match in the pipeline, but to use $text, the $match has to be the first stage. Is there any way I can do both?
It is a pretty simple solution after all.
I created a new criteria object
BasicDBObject criteriaCat = new BasicDBObject()
Added the categories to this instead of the original criteria.
criteriaCat.put('$and', andList)
Put the $match first in the pipeline, then the $project, and if there are cats, run a second $match on the results.
pipeline.add(new BasicDBObject('$match', criteria))
pipeline.add(new BasicDBObject('$project', theProjections))
if (cats.length != 0) {
pipeline.add(new BasicDBObject('$match', criteriaCat))
}
pipeline.add(new BasicDBObject('$sort', sorting))
//and by default
iterable = collection.aggregate(pipeline)
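For readers unsure what the projected allCategories field and the second $match actually check, the following plain-Java sketch models the same logic in memory. It is only an approximation under stated assumptions: the class and method names are invented, a missing categoryN key is treated like the $ifNull fallback to an empty list, and the sample data is taken from the "doctor who" document above.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory model of the $setUnion projection and the $all match.
final class CategoryMatch {

    // Union of category1..category5; a missing key contributes nothing ($ifNull -> []).
    static Set<String> allCategories(Map<String, List<String>> categories) {
        Set<String> union = new LinkedHashSet<>();
        for (int i = 1; i <= 5; i++) {
            List<String> values = categories.get("category" + i);
            if (values != null) {
                union.addAll(values);
            }
        }
        return union;
    }

    // $all semantics: every requested category must be present in the union.
    static boolean matches(Map<String, List<String>> categories, List<String> cats) {
        return allCategories(categories).containsAll(cats);
    }

    public static void main(String[] args) {
        Map<String, List<String>> doctorWho = new LinkedHashMap<>();
        doctorWho.put("category2", Arrays.asList("childrens", "entertainment and comedy", "animation"));
        doctorWho.put("category1", Arrays.asList("signed"));

        System.out.println(matches(doctorWho, Arrays.asList("signed", "animation"))); // true
        System.out.println(matches(doctorWho, Arrays.asList("factual")));             // false
    }
}
```

Because allCategories only exists after the $project, the category filter has to live in a $match that comes after it, which is why the answer splits the criteria into two stages.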
I'm trying to perform an aggregation operation in Java using the mongo-java-driver. I've performed some other find operations, but I'm unable to do the following aggregation correctly in Java:
db.I1.aggregate([
{ "$match": { "ci": 862222} },
{ "$match": { "gi": { "$ne": null } }},
{ "$group": {
"_id": {
"ci": "$ci",
"gi": "$gi",
"gn": "$gn",
"si": "$si"
}
}},
{ "$group": {
"_id": {
"ci": "$_id.ci",
"gi": "$_id.gi",
"gn": "$_id.gn"
},
"sn": { "$sum": 1 }
}},
{ "$sort" : { "_id.gi" : 1}}
])
I've tried several ways and methods to perform that aggregation in Java, but I'm unable to include the group fields "ci", "gi", "gn","si" correctly in the coll.aggregate(asList()) method. What I got so far, is the following:
MongoCollection<Document> coll = mongo.getCollection("I1");
Document matchCourse = new Document("$match",
new Document("ci", Integer.parseInt(courseid)));
Document matchGroupNotNull = new Document("$match",
new Document("gi", new Document("$ne", null)));
List<Object> list1 = new BasicDBList();
list1.add(new BasicDBObject("ci", "$ci"));
list1.add(new BasicDBObject("gi", "$gi"));
list1.add(new BasicDBObject("gn", "$gn"));
list1.add(new BasicDBObject("si", "$si"));
Document group1 = new Document(
"_id", list1).append("count", new Document("$sum", 1));
List<Object> list2 = new BasicDBList();
list2.add(new BasicDBObject("ci", "$_id.ci"));
list2.add(new BasicDBObject("gi", "$_id.gi"));
list2.add(new BasicDBObject("gn", "$_id.gn"));
Document group2 = new Document(
"_id", list2).append("sn", new Document("$sum", 1));
Document sort = new Document("$sort",
new Document("_id.gi", 1));
AggregateIterable<Document> iterable = coll.aggregate(asList(matchCourse,
matchGroupNotNull, group1, group2, sort));
I know it's not correct, but I included it to give you an idea of what I am doing. I've googled this in many different ways and read several pages, but I didn't find any solution. The available documentation for MongoDB-Java(1, 2) is too short for me and doesn't include this case.
How can I perform that query in Java? Any help would be appreciated.
Thank you very much!!
Finally I've come to a solution. There were some errors in the question that I posted, as it was the last attempt after reaching some point of desperation, but finally, here is the final solution:
MongoDatabase mongo = ...; // initialize your connection
Document matches = new Document("$match",
new Document("gi", new Document("$ne", null))
.append("ci", Integer.parseInt(courseid)));
Document firstGroup = new Document("$group",
new Document("_id",
new Document("ci", "$ci")
.append("gi", "$gi")
.append("gn", "$gn")
.append("si", "$si"))
.append("count", new Document("$sum", 1)));
Document secondGroup = new Document("$group",
new Document("_id",
new Document("ci", "$_id.ci")
.append("gi", "$_id.gi")
.append("gn", "$_id.gn"))
.append("ns", new Document("$sum", 1)));
Document sort = new Document("$sort",
new Document("_id.gi", 1));
List<Document> pipeline = Arrays.asList(matches, firstGroup,
secondGroup, sort);
AggregateIterable<Document> cursor = mongo.getCollection("I1")
.aggregate(pipeline);
for (Document doc : cursor) {
    // do stuff with doc
}
Instead of trying to create lists of key-values, I just appended the elements to the documents. Hope it will be useful for somebody!!
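To make it clearer what the two $group stages compute, here is a rough in-memory equivalent in plain Java: the first stage collapses the rows to distinct (ci, gi, gn, si) combinations, and the second counts how many distinct combinations remain per (ci, gi, gn). The Row class and the sample values are invented for illustration, and the final $sort is left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory model of the two consecutive $group stages.
final class TwoStageGroup {

    // Stand-in for a document holding the four grouped fields.
    static final class Row {
        final int ci;
        final int gi;
        final String gn;
        final int si;

        Row(int ci, int gi, String gn, int si) {
            this.ci = ci;
            this.gi = gi;
            this.gn = gn;
            this.si = si;
        }
    }

    static Map<List<Object>, Integer> countPerGroup(List<Row> rows) {
        // Stage 1: one entry per distinct (ci, gi, gn, si) key
        Set<List<Object>> distinct = new LinkedHashSet<>();
        for (Row r : rows) {
            distinct.add(Arrays.<Object>asList(r.ci, r.gi, r.gn, r.si));
        }
        // Stage 2: count the distinct entries per (ci, gi, gn), like { "$sum": 1 }
        Map<List<Object>, Integer> counts = new LinkedHashMap<>();
        for (List<Object> key : distinct) {
            counts.merge(new ArrayList<>(key.subList(0, 3)), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<Row> rows = Arrays.asList(
                new Row(862222, 1, "A", 10),
                new Row(862222, 1, "A", 10), // same si twice: counted once
                new Row(862222, 1, "A", 11),
                new Row(862222, 2, "B", 10));
        System.out.println(countPerGroup(rows)); // {[862222, 1, A]=2, [862222, 2, B]=1}
    }
}
```

The compound key is a single value with several components, which corresponds to the single nested Document used for _id in the solution rather than a list of one-field objects.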
This question is quite old, but it was the top match on Google when I searched, so in case anyone is still looking for a solution: with Spring Data MongoDB (Kotlin syntax here), I managed to do it in the following way:
Aggregation.group(Fields.fields()
.and("field1")
.and("field2"))
.first("name")
.`as`("name")
.count().`as`("count")
This will produce the following MongoDB query:
{ "$group" :
    { "_id" :
        { "field1" : "$field1", "field2" : "$field2" },
      "name" : { "$first" : "$name" },
      "count" : { "$sum" : 1 }
    }
}
I'm rather new to MongoDB and I'm trying to create a query which I thought would be pretty trivial (well, at least with SQL it would be), but I can't get it done.
I have a collection called patients. In this collection a single patient is identified by the id property (NOT MongoDB's _id!). There can be multiple versions of a single patient; the version is determined by the meta.versionId field.
In order to query for all "current versions of patients" I need to get for every patient with a specific id the one with the maximum versionId.
So far I've got this:
AggregateIterable<Document> allPatients = db.getCollection("patients").aggregate(Arrays.asList(
new Document("$group", new Document("_id", "$id")
.append("max", new Document("$max", "$meta.versionId")))));
allPatients.forEach(new Block<Document>() {
@Override
public void apply(final Document document) {
System.out.println(document.toJson());
}
});
Which results in the following output (using my very limited test data):
{ "_id" : "2.25.260185450267055504591276882440338245053", "max" : "5" }
{ "_id" : "2.25.260185450267055504591276882441338245099", "max" : "0" }
This seems to work so far, but I need to get the whole patient documents, not just the id and the maximum version.
Right now I only know that for the id 2.25.260185450267055504591276882440338245053 the max version is "5", and so on. Of course I could now issue a separate query for every single entry and fetch each patient document for a specific id/versionId combination from MongoDB one by one, but this seems like a terrible solution! Is there any other way to get it done?
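If you know the fields that you want to retrieve, say the patient's name, address, etc., I guess you can add them to the $group stage. Note that every field in $group other than _id has to use an accumulator, so a plain value of 1 is rejected; $first is one option, though without a preceding $sort the value it picks does not necessarily come from the document with the maximum versionId.
AggregateIterable<Document> allPatients = db.getCollection("patients").aggregate(Arrays.asList(
        new Document("$group", new Document("_id", "$id")
                .append("max", new Document("$max", "$meta.versionId"))
                .append("name", new Document("$first", "$name"))
                .append("address", new Document("$first", "$address")))));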
An approach that could work for you would be to first order the documents entering the pipeline by the meta.versionId field using the $sort pipeline operator. However, be aware that the $sort stage has a limit of 100 megabytes of RAM. By default, if it exceeds this limit, $sort will produce an error.
To allow for the handling of large datasets, set the allowDiskUse option to true to enable $sort operations to write to temporary files. See the allowDiskUse option in aggregate() method for details.
After sorting, you can then group the ordered documents, carry out the aggregation using the $first or $last operators (depending on the previous sort direction) to get the other fields.
Consider running the following mongo shell pipeline operation as a way of demonstrating this concept:
Mongo shell
pipeline = [
{ "$sort": {"meta.versionId": -1}}, // order the documents by the versionId field descending
{
"$group": {
"_id": "$id",
"max": { "$first": "$meta.versionId" }, // get the maximum versionId
"active": { "$first": "$active" }, // Whether this patient's record is in active use
"name": { "$first": "$name" }, // A name associated with the patient
"telecom": { "$first": "$telecom" }, // A contact detail for the individual
"gender": { "$first": "$gender" }, // male | female | other | unknown
"birthDate": { "$first": "$birthDate" } // The date of birth for the individual
/*
And many other fields
*/
}
}
]
db.patients.aggregate(pipeline);
Java test implementation
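import com.mongodb.AggregationOutput;
import com.mongodb.BasicDBObject;
import com.mongodb.DB;
import com.mongodb.DBCollection;
import com.mongodb.DBObject;
import com.mongodb.MongoClient;

import java.net.UnknownHostException;
import java.util.Arrays;
import java.util.List;

public class JavaAggregation {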
public static void main(String args[]) throws UnknownHostException {
MongoClient mongo = new MongoClient();
DB db = mongo.getDB("test");
DBCollection coll = db.getCollection("patients");
// create the pipeline operations, first with the $sort
DBObject sort = new BasicDBObject("$sort",
new BasicDBObject("meta.versionId", -1)
);
// build the $group operations
DBObject groupFields = new BasicDBObject( "_id", "$id");
groupFields.put("max", new BasicDBObject( "$first", "$meta.versionId"));
groupFields.put("active", new BasicDBObject( "$first", "$active"));
groupFields.put("name", new BasicDBObject( "$first", "$name"));
groupFields.put("telecom", new BasicDBObject( "$first", "$telecom"));
groupFields.put("gender", new BasicDBObject( "$first", "$gender"));
groupFields.put("birthDate", new BasicDBObject( "$first", "$birthDate"));
// append any other necessary fields
DBObject group = new BasicDBObject("$group", groupFields);
List<DBObject> pipeline = Arrays.asList(sort, group);
AggregationOutput output = coll.aggregate(pipeline);
for (DBObject result : output.results()) {
System.out.println(result);
}
}
}
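The sort-then-$first idea can also be tried out in memory with plain Java. The sketch below is only an illustration: the Patient class, its fields and the sample values are made up. It parses the version as a number before comparing, on the assumption that versionId is stored as a string (the output in the question shows "max" : "5"); strings compare lexicographically, so "10" would sort before "5".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model of "$sort descending, then $group with $first".
final class LatestVersions {

    // Stand-in for a patient document; only three fields are modelled.
    static final class Patient {
        final String id;
        final String versionId;
        final String name;

        Patient(String id, String versionId, String name) {
            this.id = id;
            this.versionId = versionId;
            this.name = name;
        }
    }

    static Map<String, Patient> latestPerId(List<Patient> patients) {
        List<Patient> sorted = new ArrayList<>(patients);
        // Like { "$sort": { "meta.versionId": -1 } }, but comparing numerically
        sorted.sort(Comparator.comparingInt((Patient p) -> Integer.parseInt(p.versionId)).reversed());

        Map<String, Patient> latest = new LinkedHashMap<>();
        for (Patient p : sorted) {
            // Like $first: the first document seen for each id wins
            latest.putIfAbsent(p.id, p);
        }
        return latest;
    }

    public static void main(String[] args) {
        List<Patient> patients = Arrays.asList(
                new Patient("A", "5", "old name"),
                new Patient("A", "10", "new name"),
                new Patient("B", "0", "other"));
        Map<String, Patient> latest = latestPerId(patients);
        System.out.println(latest.get("A").versionId); // 10
        System.out.println(latest.get("B").versionId); // 0
    }
}
```

If versionId really is stored as a string, the server-side $sort would order it as text, so it may be worth checking how the field is typed before relying on the pipeline above.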
How do I convert the query below into Java code for the MongoDB Java driver?
db.post.aggregate(
[
{ $match : {"name" :{'$in': ["michael", "jordan"] } }},
{ $group : { _id : "$game.id" , count : { $sum : 1 } } }
]
)
My function is not working:
DBObject match = new BasicDBObject('$match', new BasicDBObject("name", names));
The $in operator takes an array or list of arguments, so basically any list will do. But you need to form the corresponding BSON. Indenting your code helps to visualize the structure:
BasicDBList inArgs = new BasicDBList();
inArgs.add("michael");
inArgs.add("jordan");
DBObject match = new BasicDBObject("$match",
new BasicDBObject("name",
new BasicDBObject("$in", inArgs )
)
);
DBObject group = new BasicDBObject("$group",
new BasicDBObject("_id","$game.id").append(
"count", new BasicDBObject("$sum",1)
)
);
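According to the aggregation documentation, your query should look like this (Java string literals need double quotes, and names is assumed to be a list of the values to match):
DBObject match = new BasicDBObject("$match", new BasicDBObject("name", new BasicDBObject("$in", names)));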