Related
How do I perform the SQL Join equivalent in MongoDB?
For example say you have two collections (users and comments) and I want to pull all the comments with pid=444 along with the user info for each.
comments
{ uid:12345, pid:444, comment="blah" }
{ uid:12345, pid:888, comment="asdf" }
{ uid:99999, pid:444, comment="qwer" }
users
{ uid:12345, name:"john" }
{ uid:99999, name:"mia" }
Is there a way to pull all the comments with a certain field (eg. ...find({pid:444}) ) and the user information associated with each comment in one go?
At the moment, I am first getting the comments which match my criteria, then figuring out all the uid's in that result set, getting the user objects, and merging them with the comment's results. Seems like I am doing it wrong.
As of Mongo 3.2 the answers to this question are mostly no longer correct. The new $lookup operator added to the aggregation pipeline is essentially identical to a left outer join:
https://docs.mongodb.org/master/reference/operator/aggregation/lookup/#pipe._S_lookup
From the docs:
{
$lookup:
{
from: <collection to join>,
localField: <field from the input documents>,
foreignField: <field from the documents of the "from" collection>,
as: <output array field>
}
}
Of course Mongo is not a relational database, and the devs are being careful to recommend specific use cases for $lookup, but at least as of 3.2 doing join is now possible with MongoDB.
We can merge/join all data inside only one collection with a easy function in few lines using the mongodb client console, and now we could be able of perform the desired query.
Below a complete example,
.- Authors:
db.authors.insert([
{
_id: 'a1',
name: { first: 'orlando', last: 'becerra' },
age: 27
},
{
_id: 'a2',
name: { first: 'mayra', last: 'sanchez' },
age: 21
}
]);
.- Categories:
db.categories.insert([
{
_id: 'c1',
name: 'sci-fi'
},
{
_id: 'c2',
name: 'romance'
}
]);
.- Books
db.books.insert([
{
_id: 'b1',
name: 'Groovy Book',
category: 'c1',
authors: ['a1']
},
{
_id: 'b2',
name: 'Java Book',
category: 'c2',
authors: ['a1','a2']
},
]);
.- Book lending
db.lendings.insert([
{
_id: 'l1',
book: 'b1',
date: new Date('01/01/11'),
lendingBy: 'jose'
},
{
_id: 'l2',
book: 'b1',
date: new Date('02/02/12'),
lendingBy: 'maria'
}
]);
.- The magic:
db.books.find().forEach(
function (newBook) {
newBook.category = db.categories.findOne( { "_id": newBook.category } );
newBook.lendings = db.lendings.find( { "book": newBook._id } ).toArray();
newBook.authors = db.authors.find( { "_id": { $in: newBook.authors } } ).toArray();
db.booksReloaded.insert(newBook);
}
);
.- Get the new collection data:
db.booksReloaded.find().pretty()
.- Response :)
{
"_id" : "b1",
"name" : "Groovy Book",
"category" : {
"_id" : "c1",
"name" : "sci-fi"
},
"authors" : [
{
"_id" : "a1",
"name" : {
"first" : "orlando",
"last" : "becerra"
},
"age" : 27
}
],
"lendings" : [
{
"_id" : "l1",
"book" : "b1",
"date" : ISODate("2011-01-01T00:00:00Z"),
"lendingBy" : "jose"
},
{
"_id" : "l2",
"book" : "b1",
"date" : ISODate("2012-02-02T00:00:00Z"),
"lendingBy" : "maria"
}
]
}
{
"_id" : "b2",
"name" : "Java Book",
"category" : {
"_id" : "c2",
"name" : "romance"
},
"authors" : [
{
"_id" : "a1",
"name" : {
"first" : "orlando",
"last" : "becerra"
},
"age" : 27
},
{
"_id" : "a2",
"name" : {
"first" : "mayra",
"last" : "sanchez"
},
"age" : 21
}
],
"lendings" : [ ]
}
I hope this lines can help you.
This page on the official mongodb site addresses exactly this question:
https://mongodb-documentation.readthedocs.io/en/latest/ecosystem/tutorial/model-data-for-ruby-on-rails.html
When we display our list of stories, we'll need to show the name of the user who posted the story. If we were using a relational database, we could perform a join on users and stores, and get all our objects in a single query. But MongoDB does not support joins and so, at times, requires bit of denormalization. Here, this means caching the 'username' attribute.
Relational purists may be feeling uneasy already, as if we were violating some universal law. But let’s bear in mind that MongoDB collections are not equivalent to relational tables; each serves a unique design objective. A normalized table provides an atomic, isolated chunk of data. A document, however, more closely represents an object as a whole. In the case of a social news site, it can be argued that a username is intrinsic to the story being posted.
You have to do it the way you described. MongoDB is a non-relational database and doesn't support joins.
With right combination of $lookup, $project and $match, you can join mutiple tables on multiple parameters. This is because they can be chained multiple times.
Suppose we want to do following (reference)
SELECT S.* FROM LeftTable S
LEFT JOIN RightTable R ON S.ID = R.ID AND S.MID = R.MID
WHERE R.TIM > 0 AND S.MOB IS NOT NULL
Step 1: Link all tables
you can $lookup as many tables as you want.
$lookup - one for each table in query
$unwind - correctly denormalises data , else it'd be wrapped in arrays
Python code..
db.LeftTable.aggregate([
# connect all tables
{"$lookup": {
"from": "RightTable",
"localField": "ID",
"foreignField": "ID",
"as": "R"
}},
{"$unwind": "R"}
])
Step 2: Define all conditionals
$project : define all conditional statements here, plus all the variables you'd like to select.
Python Code..
db.LeftTable.aggregate([
# connect all tables
{"$lookup": {
"from": "RightTable",
"localField": "ID",
"foreignField": "ID",
"as": "R"
}},
{"$unwind": "R"},
# define conditionals + variables
{"$project": {
"midEq": {"$eq": ["$MID", "$R.MID"]},
"ID": 1, "MOB": 1, "MID": 1
}}
])
Step 3: Join all the conditionals
$match - join all conditions using OR or AND etc. There can be multiples of these.
$project: undefine all conditionals
Complete Python Code..
db.LeftTable.aggregate([
# connect all tables
{"$lookup": {
"from": "RightTable",
"localField": "ID",
"foreignField": "ID",
"as": "R"
}},
{"$unwind": "$R"},
# define conditionals + variables
{"$project": {
"midEq": {"$eq": ["$MID", "$R.MID"]},
"ID": 1, "MOB": 1, "MID": 1
}},
# join all conditionals
{"$match": {
"$and": [
{"R.TIM": {"$gt": 0}},
{"MOB": {"$exists": True}},
{"midEq": {"$eq": True}}
]}},
# undefine conditionals
{"$project": {
"midEq": 0
}}
])
Pretty much any combination of tables, conditionals and joins can be done in this manner.
You can join two collection in Mongo by using lookup which is offered in 3.2 version. In your case the query would be
db.comments.aggregate({
$lookup:{
from:"users",
localField:"uid",
foreignField:"uid",
as:"users_comments"
}
})
or you can also join with respect to users then there will be a little change as given below.
db.users.aggregate({
$lookup:{
from:"comments",
localField:"uid",
foreignField:"uid",
as:"users_comments"
}
})
It will work just same as left and right join in SQL.
As others have pointed out you are trying to create a relational database from none relational database which you really don't want to do but anyways, if you have a case that you have to do this here is a solution you can use. We first do a foreach find on collection A( or in your case users) and then we get each item as an object then we use object property (in your case uid) to lookup in our second collection (in your case comments) if we can find it then we have a match and we can print or do something with it.
Hope this helps you and good luck :)
db.users.find().forEach(
function (object) {
var commonInBoth=db.comments.findOne({ "uid": object.uid} );
if (commonInBoth != null) {
printjson(commonInBoth) ;
printjson(object) ;
}else {
// did not match so we don't care in this case
}
});
Here's an example of a "join" * Actors and Movies collections:
https://github.com/mongodb/cookbook/blob/master/content/patterns/pivot.txt
It makes use of .mapReduce() method
* join - an alternative to join in document-oriented databases
$lookup (aggregation)
Performs a left outer join to an unsharded collection in the same database to filter in documents from the “joined” collection for processing. To each input document, the $lookup stage adds a new array field whose elements are the matching documents from the “joined” collection. The $lookup stage passes these reshaped documents to the next stage.
The $lookup stage has the following syntaxes:
Equality Match
To perform an equality match between a field from the input documents with a field from the documents of the “joined” collection, the $lookup stage has the following syntax:
{
$lookup:
{
from: <collection to join>,
localField: <field from the input documents>,
foreignField: <field from the documents of the "from" collection>,
as: <output array field>
}
}
The operation would correspond to the following pseudo-SQL statement:
SELECT *, <output array field>
FROM collection
WHERE <output array field> IN (SELECT <documents as determined from the pipeline>
FROM <collection to join>
WHERE <pipeline> );
Mongo URL
It depends on what you're trying to do.
You currently have it set up as a normalized database, which is fine, and the way you are doing it is appropriate.
However, there are other ways of doing it.
You could have a posts collection that has imbedded comments for each post with references to the users that you can iteratively query to get. You could store the user's name with the comments, you could store them all in one document.
The thing with NoSQL is it's designed for flexible schemas and very fast reading and writing. In a typical Big Data farm the database is the biggest bottleneck, you have fewer database engines than you do application and front end servers...they're more expensive but more powerful, also hard drive space is very cheap comparatively. Normalization comes from the concept of trying to save space, but it comes with a cost at making your databases perform complicated Joins and verifying the integrity of relationships, performing cascading operations. All of which saves the developers some headaches if they designed the database properly.
With NoSQL, if you accept that redundancy and storage space aren't issues because of their cost (both in processor time required to do updates and hard drive costs to store extra data), denormalizing isn't an issue (for embedded arrays that become hundreds of thousands of items it can be a performance issue, but most of the time that's not a problem). Additionally you'll have several application and front end servers for every database cluster. Have them do the heavy lifting of the joins and let the database servers stick to reading and writing.
TL;DR: What you're doing is fine, and there are other ways of doing it. Check out the mongodb documentation's data model patterns for some great examples. http://docs.mongodb.org/manual/data-modeling/
There is a specification that a lot of drivers support that's called DBRef.
DBRef is a more formal specification for creating references between documents. DBRefs (generally) include a collection name as well as an object id. Most developers only use DBRefs if the collection can change from one document to the next. If your referenced collection will always be the same, the manual references outlined above are more efficient.
Taken from MongoDB Documentation: Data Models > Data Model Reference >
Database References
Before 3.2.6, Mongodb does not support join query as like mysql. below solution which works for you.
db.getCollection('comments').aggregate([
{$match : {pid : 444}},
{$lookup: {from: "users",localField: "uid",foreignField: "uid",as: "userData"}},
])
You can run SQL queries including join on MongoDB with mongo_fdw from Postgres.
MongoDB does not allow joins, but you can use plugins to handle that. Check the mongo-join plugin. It's the best and I have already used it. You can install it using npm directly like this npm install mongo-join. You can check out the full documentation with examples.
(++) really helpful tool when we need to join (N) collections
(--) we can apply conditions just on the top level of the query
Example
var Join = require('mongo-join').Join, mongodb = require('mongodb'), Db = mongodb.Db, Server = mongodb.Server;
db.open(function (err, Database) {
Database.collection('Appoint', function (err, Appoints) {
/* we can put conditions just on the top level */
Appoints.find({_id_Doctor: id_doctor ,full_date :{ $gte: start_date },
full_date :{ $lte: end_date }}, function (err, cursor) {
var join = new Join(Database).on({
field: '_id_Doctor', // <- field in Appoints document
to: '_id', // <- field in User doc. treated as ObjectID automatically.
from: 'User' // <- collection name for User doc
}).on({
field: '_id_Patient', // <- field in Appoints doc
to: '_id', // <- field in User doc. treated as ObjectID automatically.
from: 'User' // <- collection name for User doc
})
join.toArray(cursor, function (err, joinedDocs) {
/* do what ever you want here */
/* you can fetch the table and apply your own conditions */
.....
.....
.....
resp.status(200);
resp.json({
"status": 200,
"message": "success",
"Appoints_Range": joinedDocs,
});
return resp;
});
});
You can do it using the aggregation pipeline, but it's a pain to write it yourself.
You can use mongo-join-query to create the aggregation pipeline automatically from your query.
This is how your query would look like:
const mongoose = require("mongoose");
const joinQuery = require("mongo-join-query");
joinQuery(
mongoose.models.Comment,
{
find: { pid:444 },
populate: ["uid"]
},
(err, res) => (err ? console.log("Error:", err) : console.log("Success:", res.results))
);
Your result would have the user object in the uid field and you can link as many levels deep as you want. You can populate the reference to the user, which makes reference to a Team, which makes reference to something else, etc..
Disclaimer: I wrote mongo-join-query to tackle this exact problem.
playORM can do it for you using S-SQL(Scalable SQL) which just adds partitioning such that you can do joins within partitions.
Nope, it doesn't seem like you're doing it wrong. MongoDB joins are "client side". Pretty much like you said:
At the moment, I am first getting the comments which match my criteria, then figuring out all the uid's in that result set, getting the user objects, and merging them with the comment's results. Seems like I am doing it wrong.
1) Select from the collection you're interested in.
2) From that collection pull out ID's you need
3) Select from other collections
4) Decorate your original results.
It's not a "real" join, but it's actually alot more useful than a SQL join because you don't have to deal with duplicate rows for "many" sided joins, instead your decorating the originally selected set.
There is alot of nonsense and FUD on this page. Turns out 5 years later MongoDB is still a thing.
I think, if You need normalized data tables - You need to try some other database solutions.
But I've foun that sollution for MOngo on Git
By the way, in inserts code - it has movie's name, but noi movie's ID.
Problem
You have a collection of Actors with an array of the Movies they've done.
You want to generate a collection of Movies with an array of Actors in each.
Some sample data
db.actors.insert( { actor: "Richard Gere", movies: ['Pretty Woman', 'Runaway Bride', 'Chicago'] });
db.actors.insert( { actor: "Julia Roberts", movies: ['Pretty Woman', 'Runaway Bride', 'Erin Brockovich'] });
Solution
We need to loop through each movie in the Actor document and emit each Movie individually.
The catch here is in the reduce phase. We cannot emit an array from the reduce phase, so we must build an Actors array inside of the "value" document that is returned.
The code
map = function() {
for(var i in this.movies){
key = { movie: this.movies[i] };
value = { actors: [ this.actor ] };
emit(key, value);
}
}
reduce = function(key, values) {
actor_list = { actors: [] };
for(var i in values) {
actor_list.actors = values[i].actors.concat(actor_list.actors);
}
return actor_list;
}
Notice how actor_list is actually a javascript object that contains an array. Also notice that map emits the same structure.
Run the following to execute the map / reduce, output it to the "pivot" collection and print the result:
printjson(db.actors.mapReduce(map, reduce, "pivot"));
db.pivot.find().forEach(printjson);
Here is the sample output, note that "Pretty Woman" and "Runaway Bride" have both "Richard Gere" and "Julia Roberts".
{ "_id" : { "movie" : "Chicago" }, "value" : { "actors" : [ "Richard Gere" ] } }
{ "_id" : { "movie" : "Erin Brockovich" }, "value" : { "actors" : [ "Julia Roberts" ] } }
{ "_id" : { "movie" : "Pretty Woman" }, "value" : { "actors" : [ "Richard Gere", "Julia Roberts" ] } }
{ "_id" : { "movie" : "Runaway Bride" }, "value" : { "actors" : [ "Richard Gere", "Julia Roberts" ] } }
We can merge two collection by using mongoDB sub query. Here is example,
Commentss--
`db.commentss.insert([
{ uid:12345, pid:444, comment:"blah" },
{ uid:12345, pid:888, comment:"asdf" },
{ uid:99999, pid:444, comment:"qwer" }])`
Userss--
db.userss.insert([
{ uid:12345, name:"john" },
{ uid:99999, name:"mia" }])
MongoDB sub query for JOIN--
`db.commentss.find().forEach(
function (newComments) {
newComments.userss = db.userss.find( { "uid": newComments.uid } ).toArray();
db.newCommentUsers.insert(newComments);
}
);`
Get result from newly generated Collection--
db.newCommentUsers.find().pretty()
Result--
`{
"_id" : ObjectId("5511236e29709afa03f226ef"),
"uid" : 12345,
"pid" : 444,
"comment" : "blah",
"userss" : [
{
"_id" : ObjectId("5511238129709afa03f226f2"),
"uid" : 12345,
"name" : "john"
}
]
}
{
"_id" : ObjectId("5511236e29709afa03f226f0"),
"uid" : 12345,
"pid" : 888,
"comment" : "asdf",
"userss" : [
{
"_id" : ObjectId("5511238129709afa03f226f2"),
"uid" : 12345,
"name" : "john"
}
]
}
{
"_id" : ObjectId("5511236e29709afa03f226f1"),
"uid" : 99999,
"pid" : 444,
"comment" : "qwer",
"userss" : [
{
"_id" : ObjectId("5511238129709afa03f226f3"),
"uid" : 99999,
"name" : "mia"
}
]
}`
Hope so this will help.
I am trying to query from my collection of documents which looks like:
{ "_id" : ObjectId("94"), "EmailAddress" :"adam#gmail.com","Interests": "CZ1001,CE2004" }
{ "_id" : ObjectId("44"), "EmailAddress" :"ben#gmail.com", "Interests":"CE1001,CE4002" }
{ "_id" : ObjectId("54"), "EmailAddress" :"chris#gmail.com","Interests":"CE1001,CE2002" }
An example is that i want to retrieve the email addresses, given that the field "Interests" has that value i am looking for.
Example if i search CZ1001, i will get back Obj 1 EmailAddress details.
If i search CE1001, i will get back Obj 2 and Obj 3 EmailAddress details.
The object id are uniquely created when i inserted records at the start if that helps.
I am able to get the objects on MongoDB Shell using
db.users.find(Module: {"$regex": "CE1001"}})
Only the email addresses are needed.
I am trying to get all the email addresses and got stuck at this code.
Document doc = (Document) collection.find(new BasicDBObject("Module", {"$regex":"CE1001"})) .projection(Projections.fields(Projections.include("EmailAddress"), Projections.excludeId())).first();
Where
new BasicDBObject("Module", {"$regex":"CE1001"}) is not allowed.
new BasicDBObject("Module", String_variable) is allowed
For a particular Interest, your mongoDB query would look like(taking CE1001 as an example):
db.collection.find({
$or: [
{
Interests: {
$regex: "^CE1001,"
}
},
{
Interests: {
$regex: ",CE1001,"
}
},
{
Interests: {
$regex: ",CE1001$"
}
}
]
},
{
"EmailAddress": 1
})
For others who happens to drop by this post. Below are the working codes. Credits to #Veeram.
FindIterable <Document> results = collection.find(new BasicDBObject("Module", new BasicDBObject("$regex", "CE1001")) .projection(Projections.fields(Projections.include("EmailAddress"), Projections.excludeId()));
for(Document doc : results) {
doc.getString("EmailAddress")
}
How do I perform the SQL Join equivalent in MongoDB?
For example say you have two collections (users and comments) and I want to pull all the comments with pid=444 along with the user info for each.
comments
{ uid:12345, pid:444, comment="blah" }
{ uid:12345, pid:888, comment="asdf" }
{ uid:99999, pid:444, comment="qwer" }
users
{ uid:12345, name:"john" }
{ uid:99999, name:"mia" }
Is there a way to pull all the comments with a certain field (eg. ...find({pid:444}) ) and the user information associated with each comment in one go?
At the moment, I am first getting the comments which match my criteria, then figuring out all the uid's in that result set, getting the user objects, and merging them with the comment's results. Seems like I am doing it wrong.
As of Mongo 3.2 the answers to this question are mostly no longer correct. The new $lookup operator added to the aggregation pipeline is essentially identical to a left outer join:
https://docs.mongodb.org/master/reference/operator/aggregation/lookup/#pipe._S_lookup
From the docs:
{
$lookup:
{
from: <collection to join>,
localField: <field from the input documents>,
foreignField: <field from the documents of the "from" collection>,
as: <output array field>
}
}
Of course Mongo is not a relational database, and the devs are being careful to recommend specific use cases for $lookup, but at least as of 3.2 doing join is now possible with MongoDB.
We can merge/join all data inside only one collection with a easy function in few lines using the mongodb client console, and now we could be able of perform the desired query.
Below a complete example,
.- Authors:
db.authors.insert([
{
_id: 'a1',
name: { first: 'orlando', last: 'becerra' },
age: 27
},
{
_id: 'a2',
name: { first: 'mayra', last: 'sanchez' },
age: 21
}
]);
.- Categories:
db.categories.insert([
{
_id: 'c1',
name: 'sci-fi'
},
{
_id: 'c2',
name: 'romance'
}
]);
.- Books
db.books.insert([
{
_id: 'b1',
name: 'Groovy Book',
category: 'c1',
authors: ['a1']
},
{
_id: 'b2',
name: 'Java Book',
category: 'c2',
authors: ['a1','a2']
},
]);
.- Book lending
db.lendings.insert([
{
_id: 'l1',
book: 'b1',
date: new Date('01/01/11'),
lendingBy: 'jose'
},
{
_id: 'l2',
book: 'b1',
date: new Date('02/02/12'),
lendingBy: 'maria'
}
]);
.- The magic:
db.books.find().forEach(
function (newBook) {
newBook.category = db.categories.findOne( { "_id": newBook.category } );
newBook.lendings = db.lendings.find( { "book": newBook._id } ).toArray();
newBook.authors = db.authors.find( { "_id": { $in: newBook.authors } } ).toArray();
db.booksReloaded.insert(newBook);
}
);
.- Get the new collection data:
db.booksReloaded.find().pretty()
.- Response :)
{
"_id" : "b1",
"name" : "Groovy Book",
"category" : {
"_id" : "c1",
"name" : "sci-fi"
},
"authors" : [
{
"_id" : "a1",
"name" : {
"first" : "orlando",
"last" : "becerra"
},
"age" : 27
}
],
"lendings" : [
{
"_id" : "l1",
"book" : "b1",
"date" : ISODate("2011-01-01T00:00:00Z"),
"lendingBy" : "jose"
},
{
"_id" : "l2",
"book" : "b1",
"date" : ISODate("2012-02-02T00:00:00Z"),
"lendingBy" : "maria"
}
]
}
{
"_id" : "b2",
"name" : "Java Book",
"category" : {
"_id" : "c2",
"name" : "romance"
},
"authors" : [
{
"_id" : "a1",
"name" : {
"first" : "orlando",
"last" : "becerra"
},
"age" : 27
},
{
"_id" : "a2",
"name" : {
"first" : "mayra",
"last" : "sanchez"
},
"age" : 21
}
],
"lendings" : [ ]
}
I hope this lines can help you.
This page on the official mongodb site addresses exactly this question:
https://mongodb-documentation.readthedocs.io/en/latest/ecosystem/tutorial/model-data-for-ruby-on-rails.html
When we display our list of stories, we'll need to show the name of the user who posted the story. If we were using a relational database, we could perform a join on users and stores, and get all our objects in a single query. But MongoDB does not support joins and so, at times, requires bit of denormalization. Here, this means caching the 'username' attribute.
Relational purists may be feeling uneasy already, as if we were violating some universal law. But let’s bear in mind that MongoDB collections are not equivalent to relational tables; each serves a unique design objective. A normalized table provides an atomic, isolated chunk of data. A document, however, more closely represents an object as a whole. In the case of a social news site, it can be argued that a username is intrinsic to the story being posted.
You have to do it the way you described. MongoDB is a non-relational database and doesn't support joins.
With right combination of $lookup, $project and $match, you can join mutiple tables on multiple parameters. This is because they can be chained multiple times.
Suppose we want to do following (reference)
SELECT S.* FROM LeftTable S
LEFT JOIN RightTable R ON S.ID = R.ID AND S.MID = R.MID
WHERE R.TIM > 0 AND S.MOB IS NOT NULL
Step 1: Link all tables
you can $lookup as many tables as you want.
$lookup - one for each table in query
$unwind - correctly denormalises data , else it'd be wrapped in arrays
Python code..
db.LeftTable.aggregate([
# connect all tables
{"$lookup": {
"from": "RightTable",
"localField": "ID",
"foreignField": "ID",
"as": "R"
}},
{"$unwind": "R"}
])
Step 2: Define all conditionals
$project : define all conditional statements here, plus all the variables you'd like to select.
Python Code..
db.LeftTable.aggregate([
# connect all tables
{"$lookup": {
"from": "RightTable",
"localField": "ID",
"foreignField": "ID",
"as": "R"
}},
{"$unwind": "R"},
# define conditionals + variables
{"$project": {
"midEq": {"$eq": ["$MID", "$R.MID"]},
"ID": 1, "MOB": 1, "MID": 1
}}
])
Step 3: Join all the conditionals
$match - join all conditions using OR or AND etc. There can be multiples of these.
$project: undefine all conditionals
Complete Python Code..
db.LeftTable.aggregate([
# connect all tables
{"$lookup": {
"from": "RightTable",
"localField": "ID",
"foreignField": "ID",
"as": "R"
}},
{"$unwind": "$R"},
# define conditionals + variables
{"$project": {
"midEq": {"$eq": ["$MID", "$R.MID"]},
"ID": 1, "MOB": 1, "MID": 1
}},
# join all conditionals
{"$match": {
"$and": [
{"R.TIM": {"$gt": 0}},
{"MOB": {"$exists": True}},
{"midEq": {"$eq": True}}
]}},
# undefine conditionals
{"$project": {
"midEq": 0
}}
])
Pretty much any combination of tables, conditionals and joins can be done in this manner.
You can join two collection in Mongo by using lookup which is offered in 3.2 version. In your case the query would be
db.comments.aggregate({
$lookup:{
from:"users",
localField:"uid",
foreignField:"uid",
as:"users_comments"
}
})
or you can also join with respect to users then there will be a little change as given below.
db.users.aggregate({
$lookup:{
from:"comments",
localField:"uid",
foreignField:"uid",
as:"users_comments"
}
})
It will work just same as left and right join in SQL.
As others have pointed out you are trying to create a relational database from none relational database which you really don't want to do but anyways, if you have a case that you have to do this here is a solution you can use. We first do a foreach find on collection A( or in your case users) and then we get each item as an object then we use object property (in your case uid) to lookup in our second collection (in your case comments) if we can find it then we have a match and we can print or do something with it.
Hope this helps you and good luck :)
db.users.find().forEach(
function (object) {
var commonInBoth=db.comments.findOne({ "uid": object.uid} );
if (commonInBoth != null) {
printjson(commonInBoth) ;
printjson(object) ;
}else {
// did not match so we don't care in this case
}
});
Here's an example of a "join" * Actors and Movies collections:
https://github.com/mongodb/cookbook/blob/master/content/patterns/pivot.txt
It makes use of .mapReduce() method
* join - an alternative to join in document-oriented databases
$lookup (aggregation)
Performs a left outer join to an unsharded collection in the same database to filter in documents from the “joined” collection for processing. To each input document, the $lookup stage adds a new array field whose elements are the matching documents from the “joined” collection. The $lookup stage passes these reshaped documents to the next stage.
The $lookup stage has the following syntaxes:
Equality Match
To perform an equality match between a field from the input documents with a field from the documents of the “joined” collection, the $lookup stage has the following syntax:
{
$lookup:
{
from: <collection to join>,
localField: <field from the input documents>,
foreignField: <field from the documents of the "from" collection>,
as: <output array field>
}
}
The operation would correspond to the following pseudo-SQL statement:
SELECT *, <output array field>
FROM collection
WHERE <output array field> IN (SELECT <documents as determined from the pipeline>
FROM <collection to join>
WHERE <pipeline> );
Mongo URL
It depends on what you're trying to do.
You currently have it set up as a normalized database, which is fine, and the way you are doing it is appropriate.
However, there are other ways of doing it.
You could have a posts collection that has imbedded comments for each post with references to the users that you can iteratively query to get. You could store the user's name with the comments, you could store them all in one document.
The thing with NoSQL is it's designed for flexible schemas and very fast reading and writing. In a typical Big Data farm the database is the biggest bottleneck, you have fewer database engines than you do application and front end servers...they're more expensive but more powerful, also hard drive space is very cheap comparatively. Normalization comes from the concept of trying to save space, but it comes with a cost at making your databases perform complicated Joins and verifying the integrity of relationships, performing cascading operations. All of which saves the developers some headaches if they designed the database properly.
With NoSQL, if you accept that redundancy and storage space aren't issues because of their cost (both in processor time required to do updates and hard drive costs to store extra data), denormalizing isn't an issue (for embedded arrays that become hundreds of thousands of items it can be a performance issue, but most of the time that's not a problem). Additionally you'll have several application and front end servers for every database cluster. Have them do the heavy lifting of the joins and let the database servers stick to reading and writing.
TL;DR: What you're doing is fine, and there are other ways of doing it. Check out the mongodb documentation's data model patterns for some great examples. http://docs.mongodb.org/manual/data-modeling/
There is a specification that a lot of drivers support that's called DBRef.
DBRef is a more formal specification for creating references between documents. DBRefs (generally) include a collection name as well as an object id. Most developers only use DBRefs if the collection can change from one document to the next. If your referenced collection will always be the same, the manual references outlined above are more efficient.
Taken from MongoDB Documentation: Data Models > Data Model Reference >
Database References
Before 3.2.6, Mongodb does not support join query as like mysql. below solution which works for you.
db.getCollection('comments').aggregate([
{$match : {pid : 444}},
{$lookup: {from: "users",localField: "uid",foreignField: "uid",as: "userData"}},
])
You can run SQL queries including join on MongoDB with mongo_fdw from Postgres.
MongoDB does not allow joins, but you can use plugins to handle that. Check the mongo-join plugin. It's the best and I have already used it. You can install it using npm directly like this npm install mongo-join. You can check out the full documentation with examples.
(++) really helpful tool when we need to join (N) collections
(--) we can apply conditions just on the top level of the query
Example
var Join = require('mongo-join').Join, mongodb = require('mongodb'), Db = mongodb.Db, Server = mongodb.Server;
db.open(function (err, Database) {
Database.collection('Appoint', function (err, Appoints) {
/* we can put conditions just on the top level */
Appoints.find({_id_Doctor: id_doctor ,full_date :{ $gte: start_date },
full_date :{ $lte: end_date }}, function (err, cursor) {
var join = new Join(Database).on({
field: '_id_Doctor', // <- field in Appoints document
to: '_id', // <- field in User doc. treated as ObjectID automatically.
from: 'User' // <- collection name for User doc
}).on({
field: '_id_Patient', // <- field in Appoints doc
to: '_id', // <- field in User doc. treated as ObjectID automatically.
from: 'User' // <- collection name for User doc
})
join.toArray(cursor, function (err, joinedDocs) {
/* do what ever you want here */
/* you can fetch the table and apply your own conditions */
.....
.....
.....
resp.status(200);
resp.json({
"status": 200,
"message": "success",
"Appoints_Range": joinedDocs,
});
return resp;
});
});
You can do it using the aggregation pipeline, but it's a pain to write it yourself.
You can use mongo-join-query to create the aggregation pipeline automatically from your query.
This is how your query would look like:
const mongoose = require("mongoose");
const joinQuery = require("mongo-join-query");
joinQuery(
mongoose.models.Comment,
{
find: { pid:444 },
populate: ["uid"]
},
(err, res) => (err ? console.log("Error:", err) : console.log("Success:", res.results))
);
Your result would have the user object in the uid field and you can link as many levels deep as you want. You can populate the reference to the user, which makes reference to a Team, which makes reference to something else, etc..
Disclaimer: I wrote mongo-join-query to tackle this exact problem.
playORM can do it for you using S-SQL(Scalable SQL) which just adds partitioning such that you can do joins within partitions.
Nope, it doesn't seem like you're doing it wrong. MongoDB joins are "client side". Pretty much like you said:
At the moment, I am first getting the comments which match my criteria, then figuring out all the uid's in that result set, getting the user objects, and merging them with the comment's results. Seems like I am doing it wrong.
1) Select from the collection you're interested in.
2) From that collection pull out ID's you need
3) Select from other collections
4) Decorate your original results.
It's not a "real" join, but it's actually alot more useful than a SQL join because you don't have to deal with duplicate rows for "many" sided joins, instead your decorating the originally selected set.
There is alot of nonsense and FUD on this page. Turns out 5 years later MongoDB is still a thing.
I think, if You need normalized data tables - You need to try some other database solutions.
But I've foun that sollution for MOngo on Git
By the way, in inserts code - it has movie's name, but noi movie's ID.
Problem
You have a collection of Actors with an array of the Movies they've done.
You want to generate a collection of Movies with an array of Actors in each.
Some sample data
db.actors.insert( { actor: "Richard Gere", movies: ['Pretty Woman', 'Runaway Bride', 'Chicago'] });
db.actors.insert( { actor: "Julia Roberts", movies: ['Pretty Woman', 'Runaway Bride', 'Erin Brockovich'] });
Solution
We need to loop through each movie in the Actor document and emit each Movie individually.
The catch here is in the reduce phase. We cannot emit an array from the reduce phase, so we must build an Actors array inside of the "value" document that is returned.
The code
map = function() {
for(var i in this.movies){
key = { movie: this.movies[i] };
value = { actors: [ this.actor ] };
emit(key, value);
}
}
reduce = function(key, values) {
actor_list = { actors: [] };
for(var i in values) {
actor_list.actors = values[i].actors.concat(actor_list.actors);
}
return actor_list;
}
Notice how actor_list is actually a javascript object that contains an array. Also notice that map emits the same structure.
Run the following to execute the map / reduce, output it to the "pivot" collection and print the result:
printjson(db.actors.mapReduce(map, reduce, "pivot"));
db.pivot.find().forEach(printjson);
Here is the sample output, note that "Pretty Woman" and "Runaway Bride" have both "Richard Gere" and "Julia Roberts".
{ "_id" : { "movie" : "Chicago" }, "value" : { "actors" : [ "Richard Gere" ] } }
{ "_id" : { "movie" : "Erin Brockovich" }, "value" : { "actors" : [ "Julia Roberts" ] } }
{ "_id" : { "movie" : "Pretty Woman" }, "value" : { "actors" : [ "Richard Gere", "Julia Roberts" ] } }
{ "_id" : { "movie" : "Runaway Bride" }, "value" : { "actors" : [ "Richard Gere", "Julia Roberts" ] } }
We can merge two collection by using mongoDB sub query. Here is example,
Commentss--
`db.commentss.insert([
{ uid:12345, pid:444, comment:"blah" },
{ uid:12345, pid:888, comment:"asdf" },
{ uid:99999, pid:444, comment:"qwer" }])`
Userss--
db.userss.insert([
{ uid:12345, name:"john" },
{ uid:99999, name:"mia" }])
MongoDB sub query for JOIN--
`db.commentss.find().forEach(
function (newComments) {
newComments.userss = db.userss.find( { "uid": newComments.uid } ).toArray();
db.newCommentUsers.insert(newComments);
}
);`
Get result from newly generated Collection--
db.newCommentUsers.find().pretty()
Result--
`{
"_id" : ObjectId("5511236e29709afa03f226ef"),
"uid" : 12345,
"pid" : 444,
"comment" : "blah",
"userss" : [
{
"_id" : ObjectId("5511238129709afa03f226f2"),
"uid" : 12345,
"name" : "john"
}
]
}
{
"_id" : ObjectId("5511236e29709afa03f226f0"),
"uid" : 12345,
"pid" : 888,
"comment" : "asdf",
"userss" : [
{
"_id" : ObjectId("5511238129709afa03f226f2"),
"uid" : 12345,
"name" : "john"
}
]
}
{
"_id" : ObjectId("5511236e29709afa03f226f1"),
"uid" : 99999,
"pid" : 444,
"comment" : "qwer",
"userss" : [
{
"_id" : ObjectId("5511238129709afa03f226f3"),
"uid" : 99999,
"name" : "mia"
}
]
}`
Hope so this will help.
How do I make a MongoDB query using BasicDBObjects in Java, when I wish to find all documents that contain an array of nested documents, where one of those nested documents meets all the specified criteria?
Taking the example data:
[
{
"_id":"blood_0",
"type":"O",
"list":[
{
"firstname":"John",
"lastname":"Smith",
"zipcode":"12345"
},
{
"firstname":"John",
"lastname":"Hamilton",
"zipcode":"54627"
},
{
"firstname":"Ben",
"lastname":"Brick",
"zipcode":"12345"
},
{
"firstname":"William",
"lastname":"Tell",
"zipcode":"15487"
}
]
},
{
"_id":"blood_1",
"type":"AB",
"list":[
{
"firstname":"Mary",
"lastname":"Smith",
"zipcode":"12345"
},
{
"firstname":"John",
"lastname":"Henry",
"zipcode":"54624"
},
{
"firstname":"Jacob",
"lastname":"Tell",
"zipcode":"19283"
},
{
"firstname":"William",
"lastname":"Dirk",
"zipcode":"15999"
}
]
}
]
If I only want to return the objects that contain a contact in the list that meets the criteria of firstname = William, lastname = Tell how would I go about doing that? The queries I am doing are not grouping the criteria, so I would get two results where I actually only should be getting one.
How would I do the same query but also checking for type = AB, as well as the other criteria, which would return no results?
You are looking for the $elemMatch operator. It restricts the query operators to a single element within the array of values.
In the shell your query will look like:
db.people.find( { list : { $elemMatch : { lastName:"Smith", firstName: "John" } } } )
To add the blood type:
db.people.find( {
type : "AB",
list : { $elemMatch : { lastName:"Smith", firstName: "John" } }
} )
This gets a bit verbose using the Java Driver.
DBObject elemMatch = new BasicDBObject();
elemMatch.put("lastName","Smith");
elemMatch.put("firstName","John");
DBObject query = new BasicDBObject();
query.append( "type", "AB");
query.append( "list", elemMatch);
Pass that query to one of the find() methods on the collection and you should get the documents you are looking for.
Note that the $elemMatch query operator will return the entire document, including all of the elements in the array. There is a similarly named projection operator to limit the array elements returned to only those matched.
HTH - Rob.
First things first. I really think your model is utterly wrong. Nested arrays which potentially grow indefinetly are bad for multiple reasons:
If the document exceeds it's padding when new members are written to this array, the document needs to be migrated within a data file. That is a pretty costly operation and you want to prevent it as much as you can.
BSON documents are limited to 16MB. So per blood type you could only have a limited number of people.
All queries tend to be a bit more complicated, the according code more bloated and hence slower.
So how to do it? Take these documents:
{
_id: ObjectId(),
firstName: "Mary",
lastName: "Smith",
zip: "12345",
bt: "AB"
},
{
_id: ObjectId(),
firstName: "John",
lastName: "Smith",
zip: "12345",
bt: "0"
}
With indices set like
db.people.ensureIndex({lastName:1,firstName:1})
db.people.ensureIndex({bt:1})
on the MongoDB shell, you can get what you want with
db.people.find({ lastName:"Smith", firstName: "John"})
or
db.people.find({ bt: "AB" })
This query for example translates to the following
MongoClient client = new MongoClient("localhost");
DB db = client.getDB("yourDB");
DBCollection coll = db.getCollection("yourCollection");
BasicDBOBject query = new BasicDBObject("bt","AB");
DBCursor cursor = coll.find(query);
try {
while( cursor.hasNext() ) {
System.out.println( cursor.next() );
}
} finally {
cursor.close();
}
You might find the MongoDB introduction for working with a Java driver interesting.
I'm using Spring-Data MongoDB aggregation framework.
Here is a code sample:
Aggregation agg = Aggregation.newAggregation(
match(Criteria.where("type").is("PROMO")),
group("locale")//.count().as("counts")
);
AggregationResults<Message> results = mongoTemplate.aggregate(agg, "message", Message.class);
return results.getMappedResults();
Throw:
org.springframework.core.convert.ConversionFailedException: Failed to convert from type java.lang.String to type java.math.BigDecimal for value 'CL'; nested exception is java.lang.NumberFormatException
CL is a value on locale field, but i dont understand why throw that exception.
Im using a similar example from the documentation.
SOLVED:
Aggregation agg = Aggregation.newAggregation(
match(Criteria.where("type").is("PROMO")),
group("created", "text").addToSet("locale").as("countries").addToSet("device.deviceType").as("platforms").count().as("count")
);
I try a simple example on mongo console. After that, map the operations to the builder.
I dont undesrtand why dont work before. If someone can clear the problem will be great.
The model "message":
{ "_id" : "90.0", "device" : { "_id" : "5faf92fd-37f2-4d42-a01a-dd1abce0c1af", "deviceType" : "iPhone", "countryId" : "AR" }, "text" : "Text", "created" : ISODate("2014-01-03T15:56:27.096Z"), "status" : "SENT", "type" : "PROMO" }
My simple answer is avoid using the MongoDB Spring Data classes directly for aggregation and use the standard MongoDB Java objects e.g. DBObject / AggregationOutput. The reason for that is I have lost several hours trying to get anything but basic aggregation queries working in MongoDB Spring data (and that is using the latest which as of today is spring-data-mongodb 1.5.0.RELEASE).
However, constructing aggregation queries using the standard MongoDB Java objects can be a pain (especially if nested / complex) as you end up creating countless DBObject groupFields = new BasicDBObject("_id", null); and the code looks a mess.
I recommend adding the following 3 wrapper methods to your code.
protected DBObject dbObj (String key, Object value) {
return new BasicDBObject (key, value);
}
protected DBObject dbObj (Object ... objs) {
DBObject dbObj = new BasicDBObject();
if (objs.length % 2 == 0) {
for (int i = 0; i < objs.length; i+=2) {
dbObj.put((String)objs[i], objs[i+1]);
}
}
return dbObj;
}
protected DBObject dbList (Object ... objs) {
BasicDBList dbList = new BasicDBList();
for (Object obj : objs) {
dbList.add(obj);
}
return (DBObject)dbList;
}
This enables easy translation between your JSON based queries and your Java code. e.g. if you had the following complex query (taken from http://docs.mongodb.org/manual/tutorial/aggregation-zip-code-data-set/)
db.zipcodes.aggregate(
{
$group: {
_id: { state: "$state", city: "$city" },
pop: { $sum: "$pop" }
}
},{
$sort: { pop: 1 }
},{
$group: {
_id: "$_id.state",
biggestCity: { $last: "$_id.city" },
biggestPop: { $last: "$pop" },
smallestCity: { $first: "$_id.city" },
smallestPop: { $first: "$pop" }
}
},{
$project: {
_id: 0,
state: "$_id",
biggestCity: {
name: "$biggestCity",
pop: "$biggestPop"
},
smallestCity: {
name: "$smallestCity",
pop: "$smallestPop"
}
}
});
... then your Java code would look something like this ...
List<DBObject> aggregation = Arrays.asList (
dbObj ("$group", dbObj (
"_id", dbObj ("state", "$state", "city", "$city"),
"pop", dbObj ("$sum", "$post")
)),
dbObj ("$sort", dbObj ("pop", 1)),
dbObj ("$group", dbObj (
"_id", "$_id.state",
"biggestCity", dbObj ("$last", "$_id.city"),
"biggestPop", dbObj ("$last", "$pop"),
"smallestCity", dbObj ("$first", "$_id.city"),
"smallestPop", dbObj ("$first", "$pop")
)),
dbObj ("$project", dbObj (
"_id", 0,
"state", "$_id",
"biggestCity", dbObj ("name", "$biggestCity", "pop", "$biggestPop"),
"smallestCity", dbObj ("name", "$smallestCity", "pop", "$smallestPop")
))
);
// Run aggregation query
DBCollection collection = mongoTemplate.getCollection(COLLECTION_NAME);
AggregationOutput output = collection.aggregate (aggregation);
Doing it this way, if it works in your editor (e.g. RoboMongo) it will work in your Java code although you will have to manually convert the objects from the result, which isn't too painful i.e.
List<MyResultClass> results = new ArrayList<MyResultClass>();
Iterator<DBObject> it = output.results().iterator();
while (it.hasNext()) {
DBObject obj = it.next();
MyResultClass result = mongoTemplate.getConverter().read(MyResultClass.class, obj);
results.add(result);
}
However, you may find the Spring Data Aggregation stuff does work OK for you. I love Spring and I do use Mongo Spring Data in various parts of my code, it is the aggregation support that lets it down e.g. doing a "$push" inside a "$group" with multiple items doesn't seem to work. I'm sure it will improve with time (and better documentation). Other people have echoed these thoughts e.g. http://movingfulcrum.tumblr.com/post/61693014502/spring-data-and-mongodb-a-mismatch-made-in-hell - see section 4.
Happy coding!