In my Activity document, I want to update the status of a collection of actions whose ids are in a list:
{
"_id" : "...",
"actions" : [
{
"_id" : 1,
"status" : "todo"
},
{
"_id" : 2,
"status" : "in progress"
},
{
"_id" : 3,
"status" : "done"
},
{
"_id" : 4,
"status" : "done"
},
{
"_id" : 5,
"status" : "todo"
}
]
}
I tried to write code using MongoOperations.updateMulti, but it only updates a single status:
mongoOperation.updateMulti(
new Query().addCriteria(
Criteria.where("_id").is(activityId).and("actionsActivite._id").in(actionsIds)),
new Update().set("actionsActivite.$.status", newStatut),
ActivityModel.class
);
I don't know where the problem is. Is my Query wrong? My Update?
I finally found the solution. I just added the $[] operator in the update, like this:
mongoOperation.updateMulti(
new Query().addCriteria(
Criteria.where("_id").is(activityId).and("actionsActivite._id").in(actionsIds)),
new Update().set("actionsActivite.$[].status", newStatut),
ActivityModel.class
);
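Note that $[] applies the update to every element of actionsActivite, not only to the actions whose _id is in actionsIds (the $in criteria only selects the document, not the elements). If only those elements should change, the filtered positional operator $[identifier] with arrayFilters may be closer to the intent. A minimal sketch, assuming MongoDB 3.6+ and a Spring Data MongoDB version that exposes Update.filterArray:
mongoOperation.updateMulti(
    new Query().addCriteria(Criteria.where("_id").is(activityId)),
    new Update()
        // $[action] only touches elements matched by the array filter below
        .set("actionsActivite.$[action].status", newStatut)
        .filterArray(Criteria.where("action._id").in(actionsIds)),
    ActivityModel.class
);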
I would like to write an aggregation with Spring Data MongoDB, and I don't know how to build this group stage:
$group: {
_id: {
field1: "$field1",
field2: "2017-06-21",
field3: "$field3"
},
...
}
I don't know how to put the constant date into the second field of the _id.
For the moment I do this:
groupOperation = group("field1","field3")
But I'm not sure that it groups on the values of those fields, and I don't know how to add a new field into the _id.
I can't find good documentation about the operators for the different stages of an aggregation in Spring Data MongoDB.
If someone has an idea, I'm interested.
Thank you in advance
Fields fields = Fields.fields("field1", "field2", "field3");
GroupOperation groupOp = Aggregation.group(fields);
This will produce the following group block:
$group: {
_id: {
field1: "$field1",
field2: "$field2",
field3: "$field3"
}
}
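Regarding the constant date from the question: group(fields) can only reference document fields, so one option is to drop down to a raw stage. A minimal sketch, assuming Spring Data MongoDB 2.x (where a custom AggregationOperation produces an org.bson.Document) and reusing the field names from the question:
import org.bson.Document;
import org.springframework.data.mongodb.core.aggregation.Aggregation;
import org.springframework.data.mongodb.core.aggregation.AggregationOperation;
import org.springframework.data.mongodb.core.aggregation.AggregationOperationContext;

// Sketch: a hand-written $group stage whose _id mixes field references
// with the constant date "2017-06-21".
AggregationOperation groupWithConstant = new AggregationOperation() {
    @Override
    public Document toDocument(AggregationOperationContext context) {
        return new Document("$group",
                new Document("_id",
                        new Document("field1", "$field1")
                                .append("field2", "2017-06-21")
                                .append("field3", "$field3")));
    }
};

Aggregation aggregation = Aggregation.newAggregation(groupWithConstant /*, further stages */);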
Here is an example of grouping on multiple fields and counting the values.
Aggregation aggregate = Aggregation.newAggregation(Aggregation.group("category", "status").count().as("Categoury_Status_Count"));
AggregationResults<String> aggregateResult = mongoOperations.aggregate(aggregate, "category", String.class);
System.out.println(aggregateResult.getMappedResults());
My Sample Data:-
/* 1 */
{
"_id" : 1,
"category" : "cafe",
"status" : "A"
}
/* 2 */
{
"_id" : 2,
"category" : "cafe",
"status" : "B"
}
/* 3 */
{
"_id" : 3,
"category" : "cafe1",
"status" : "A"
}
/* 4 */
{
"_id" : 4,
"category" : "cafe1",
"status" : "B"
}
/* 5 */
{
"_id" : 5,
"category" : "cafe1",
"status" : "B"
}
Output:-
[{ "category" : "cafe1" , "status" : "A" , "Categoury_Status_Count" : 1}, { "category" : "cafe" , "status" : "B" , "Categoury_Status_Count" : 1}, { "category" : "cafe1" , "status" : "B" , "Categoury_Status_Count" : 2}, { "category" : "cafe" , "status" : "A" , "Categoury_Status_Count" : 1}]
To get the _id in the output:-
You can add the _id to the set.
Aggregation aggregate = Aggregation.newAggregation(Aggregation.group("category", "status").count().as("Categoury_Status_Count").addToSet("_id").as("ids"));
Output:-
[{ "category" : "cafe1" , "status" : "A" , "Categoury_Status_Count" : 1 , "ids" : [ 3.0]}, { "category" : "cafe" , "status" : "B" , "Categoury_Status_Count" : 1 , "ids" : [ 2.0]}, { "category" : "cafe1" , "status" : "B" , "Categoury_Status_Count" : 2 , "ids" : [ 5.0 , 4.0]}, { "category" : "cafe" , "status" : "A" , "Categoury_Status_Count" : 1 , "ids" : [ 1.0]}]
Hi, I'm reading data from MongoDB into a Spark application.
My MongoDB contains 2 collections.
One is profile_data (the actual data with field names), which holds all the input data including some unique fields:
{
"MessageStatus" : 2,
"Origin" : 1,
"_id" : ObjectId("596340fe8b0fa35d2880db1a"),
"accerlation" : 19.4,
"cylinders" : 4,
"displacement" : 119,
"file_id" : ObjectId("59633e48b760e7c8071a6c1c"),
"horsepower" : 82,
"modelyear" : 82,
"modified_date" : ISODate("2017-07-10T08:47:01.641Z"),
"mpg" : 31,
"snet_id" : "new_project",
"unique_id" : "784",
"username" : "chevy s-10",
"weight" : 2720
}
And the other collection is predictive_model_details, which holds the ML model details like the model name, feature fields and prediction field, just like metadata:
{
"_id" : ObjectId("56b4351be4b064bb19a90324"),
"algorithm_id" : "55d717a53d9e22022ff2a1e9",
"algorithm_name" : "K- Nearest Neighbours (IBK)",
"client_id" : "562e1d51b760d0e408151b91",
"feature_fields" : [
{
"name" : "Origin",
"type" : "int"
},
{
"name" : "accerlation",
"type" : "Double"
},
{
"name" : "displacement",
"type" : "Int"
},
{
"name" : "horsepower",
"type" : "Int"
},
{
"name" : "modelyear",
"type" : "Int"
}
],
"makeActiveStatus" : "0",
"model_name" : "test1",
"parameter_type" : "system_defined",
"parameters" : [
{
"symbol" : "-K",
"value" : "1"
}
],
"predictor" : {
"name" : "mpg"
"type" : "Int"
},
"result_exists" : true,
"snet_id" : "new_project"
}
So I've created 2 Datasets in Spark for the two collections in MongoDB. Now I want to combine these 2 Datasets so that all the feature fields and the prediction field end up together.
And common field in 2 datasets is snet_id.
Could anyone please help?
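A join on the common field snet_id is roughly what this amounts to. A minimal sketch in Spark's Java API, assuming both collections have already been loaded into Datasets (for example via the MongoDB Spark connector) named profileData and modelDetails; both names are placeholders:
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;

// Sketch: combine the input data with the model metadata on the shared column snet_id.
Dataset<Row> joined = profileData.join(
        modelDetails.select("snet_id", "feature_fields", "predictor"),
        "snet_id");   // inner join on the common field

joined.show(false);   // inspect the combined rows
From there the feature columns listed in feature_fields and the predictor field (mpg) are available side by side in one Dataset.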
The project is a visual analysis of business data: MongoDB through Spring Data, a REST interface, then d3.js.
On the database level, my data looks like this:
/* 1 */
{
"_id" : ObjectId("58ac466160fb39e5e8dc8b70"),
"dots" : {
"x" : 4,
"y" : 3
}
}
/* 2 */
{
"_id" : ObjectId("58ac468060fb39e5e8dc8b7e"),
"squares" : {
"x" : 12,
"y" : 2
}
}
The REST interface (spring) delivers this:
{
"_embedded" : {
"JSON" : [ {
"squares" : null,
"dots" : {
"dot" : {
"x" : 4,
"y" : 3
}
},
"_links" : {
"self" : {
"href" : "http://localhost:8080/JSON/58ac466160fb39e5e8dc8b70"
},
"jSON" : {
"href" : "http://localhost:8080/JSON/58ac466160fb39e5e8dc8b70"
}
}
}, {
"squares" : {
"square" : {
"x" : 12,
"y" : 2
}
},
"dots" : null,
"_links" : {
"self" : {
"href" : "http://localhost:8080/JSON/58ac468060fb39e5e8dc8b7e"
},
"jSON" : {
"href" : "http://localhost:8080/JSON/58ac468060fb39e5e8dc8b7e"
}
}
} ]
},
"_links" : {
"self" : {
"href" : "http://localhost:8080/JSON"
},
"profile" : {
"href" : "http://localhost:8080/profile/JSON"
}
},
"page" : {
"size" : 20,
"totalElements" : 2,
"totalPages" : 1,
"number" : 0
}
}
Now I'm having trouble accessing this for processing in d3.js for visualization. Whatever way I try to access the data, I only get "undefined" with no values back on the console.
Should I reformat to get rid of the "_embedded" part, or "flatten" the data generally, or do I need a specific way to access it?
As of now, I'm just using d3.json("/JSON") to access the interface, but I can't extract any data.
OK, so I am making API requests to retrieve certain things like movies and songs, or to ping the server. However, all of these responses are contained within the same response JSON object, which has varying fields depending on the request. Below are three examples.
ping
{
"response" : {
"status" : "ok",
"version" : "0.9.1"
}
}
getIndexes
{
"response" : {
"status" : "ok",
"version" : "0.9.1",
"indexes" : {
"index" : [ {
"name" : "A",
"movie" : [ {
"id" : "150",
"name" : "A Movie"
}, {
"id" : "2400",
"name" : "Another Movie"
} ]
}, {
"name" : "S",
"movie" : [ {
"id" : "439",
"name" : "Some Movie"
}, {
"id" : "209",
"name" : "Some Movie Part 2"
} ]
} ]
}
}
}
getRandomSongs
{
"response" : {
"status" : "ok"
"version" : "0.9.1"
"randomSongs" : {
"song": [ {
"id" : "72",
"parent" : "58",
"isDir" : false,
"title" : "Letter From Yokosuka",
"album" : "Metaphorical Music",
"artist" : "Nujabes",
"track" : 7,
"year" : 2003,
"genre" : "Hip-Hop",
"coverArt" : "58",
"size" : 20407325,
"contentType" : "audio/flac",
"suffix" : "flac",
"transcodedContentType" : "audio/mpeg",
"transcodedSuffix" : "mp3",
"duration" : 190,
"bitRate" : 858,
"path" : "Nujabes/Metaphorical Music/07 - Letter From Yokosuka.flac",
"isVideo" : false,
"created" : "2015-06-06T01:18:05.000Z",
"albumId" : "2",
"artistId" : "0",
"type" : "music"
}, {
"id" : "3135",
"parent" : "3109",
"isDir" : false,
"title" : "Forty One Mosquitoes Flying In Formation",
"album" : "Tame Impala",
"artist" : "Tame Impala",
"track" : 4,
"year" : 2008,
"genre" : "Rock",
"coverArt" : "3109",
"size" : 10359844,
"contentType" : "audio/mpeg",
"suffix" : "mp3",
"duration" : 258,
"bitRate" : 320,
"path" : "Tame Impala/Tame Impala/04 - Forty One Mosquitoes Flying In Formation.mp3",
"isVideo" : false,
"created" : "2015-06-29T21:50:16.000Z",
"albumId" : "101",
"artistId" : "30",
"type" : "music"
} ]
}
}
}
So basically my question is: how should I structure my model classes for parsing these responses? At the moment I have an abstract response object that only contains fields for the status and version. However, with this approach I will need a response class that extends this abstract class for every request I make (e.g. AbstractResponse, IndexesResponse, RandomSongsResponse). Also, some models with the same name may have different fields depending on the API request made. I would prefer to avoid making a model class for every possible scenario.
And as an extra note, I am using GSON for JSON serialization/deserialization and Retrofit to communicate with the API.
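One Gson-friendly alternative is a single envelope class whose optional sections simply stay null when a given call does not return them, since Gson leaves absent fields at their defaults. A rough sketch under that assumption (the class and field names below are invented for illustration, apart from the JSON keys they mirror):
import java.util.List;

// Sketch: one wrapper matching {"response": {...}}; sections a given
// request does not return simply remain null after deserialization.
public class Envelope {
    ApiResponse response;
}

class ApiResponse {
    String status;
    String version;
    Indexes indexes;          // only populated by getIndexes
    RandomSongs randomSongs;  // only populated by getRandomSongs
}

class Indexes { List<Index> index; }
class Index { String name; List<Movie> movie; }
class Movie { String id; String name; }
class RandomSongs { List<Song> song; }
class Song {
    String id;
    String title;
    String artist;
    // ... remaining song fields as needed
}
The trade-off is that callers have to null-check whichever section they expect for each request, instead of getting a request-specific type.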
I'm new to Mongo and I use the MongoDB aggregation framework for my queries. I need to retrieve some records which satisfy certain conditions (including pagination + sorting) and also get the total count of records.
Currently, I perform the following steps:
Create $match operator
{ "$match" : { "year" : "2012" , "author.authorName" : { "$regex" :
"au" , "$options" : "i"}}}
Add sorting and pagination
{ "$sort" : { "some_field" : -1}} , { "$limit" : 10} , { "$skip" : 0}
After querying I receive the expected result: 10 documents with all fields.
For pagination I need to know the total count of records which satisfy these conditions, in my case 25.
I use the following query to get the count: { "$match" : { "year" : "2012" , "author.authorName" : { "$regex" : "au" , "$options" : "i"}}} , { "$group" : { "_id" : "$all" , "reviewsCount" : { "$sum" : 1}}} , { "$sort" : { "some_field" : -1}} , { "$limit" : 10} , { "$skip" : 0}
But I don't want to perform two separate queries: one for retrieving the documents and a second for the total count of records which satisfy those conditions.
I want to do it in one single query and get the result in the following format:
{
"result" : [
{
"my_documets": [
{
"_id" : ObjectId("512f1f47a411dc06281d98c0"),
"author" : {
"authorName" : "author name1",
"email" : "email1#email.com"
}
},
{
"_id" : ObjectId("512f1f47a411dc06281d98c0"),
"author" : {
"authorName" : "author name2",
"email" : "email2#email.com"
}
}, .......
],
"total" : 25
}
],
"ok" : 1
}
I tried modifying the group operator: { "$group" : { "_id" : "$all" , "author" : "$author" , "reviewsCount" : { "$sum" : 1}}}
But in this case I got: "exception: the group aggregate field 'author' must be defined as an expression inside an object". If I add all the fields to the _id, then reviewsCount is always 1 because all records are different.
Does anybody know how this can be implemented in a single query? Maybe MongoDB has some features or operators for this case? An implementation using two separate queries reduces performance when querying thousands or millions of records. In my application this is a very critical performance issue.
I've been working on this all day and haven't been able to find a solution, so I thought I'd turn to the Stack Overflow community.
Thanks.
You can try using $facet in the aggregation pipeline, as follows:
db.name.aggregate([
{$match:{your match criteria}},
{$facet: {
data: [{$sort: sort},{$skip:skip},{$limit: limit}],
count:[{$group: {_id: null, count: {$sum: 1}}}]
}}
])
In data you'll get your paginated list, and in count the count variable will hold the total count of matched documents.
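If you are driving this from Spring Data MongoDB (as in the earlier questions), a rough equivalent using Aggregation.facet() might look like the sketch below; the "reviews" collection name and the use of Document as the result type are assumptions, not part of the original answer:
import org.bson.Document;
import org.springframework.data.domain.Sort;
import org.springframework.data.mongodb.core.aggregation.Aggregation;
import org.springframework.data.mongodb.core.query.Criteria;

// Sketch: one aggregation returning both a page of documents and the total count.
Aggregation aggregation = Aggregation.newAggregation(
        Aggregation.match(Criteria.where("year").is("2012")
                .and("author.authorName").regex("au", "i")),
        Aggregation.facet()
                .and(Aggregation.sort(Sort.Direction.DESC, "some_field"),
                     Aggregation.skip(0L),
                     Aggregation.limit(10))
                .as("data")
                .and(Aggregation.group().count().as("count"))
                .as("count"));

Document result = mongoOperations.aggregate(aggregation, "reviews", Document.class)
        .getUniqueMappedResult();   // result.get("data") holds the page, result.get("count") the count document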
OK, I have one example, but I think it's a really crazy query; I put it here only for fun. But if this example is faster than 2 queries, please tell us about it in the comments.
For this question I created a collection called "so", and put 25 documents like this into it:
{
"_id" : ObjectId("512fa86cd99d0adda2a744cd"),
"authorName" : "author name1",
"email" : "email1#email.com",
"c" : 1
}
My query uses the aggregation framework:
db.so.aggregate([
{ $group:
{
_id: 1,
collection: { $push : { "_id": "$_id", "authorName": "$authorName", "email": "$email", "c": "$c" } },
count: { $sum: 1 }
}
},
{ $unwind:
"$collection"
},
{ $project:
{ "_id": "$collection._id", "authorName": "$collection.authorName", "email": "$collection.email", "c": "$collection.c", "count": "$count" }
},
{ $match:
{ c: { $lte: 10 } }
},
{ $sort :
{ c: -1 }
},
{ $skip:
2
},
{ $limit:
3
},
{ $group:
{
_id: "$count",
my_documets: {
$push: {"_id": "$_id", "authorName":"$authorName", "email":"$email", "c":"$c" }
}
}
},
{ $project:
{ "_id": 0, "my_documets": "$my_documets", "total": "$_id" }
}
])
Result for this query:
{
"result" : [
{
"my_documets" : [
{
"_id" : ObjectId("512fa900d99d0adda2a744d4"),
"authorName" : "author name8",
"email" : "email8#email.com",
"c" : 8
},
{
"_id" : ObjectId("512fa900d99d0adda2a744d3"),
"authorName" : "author name7",
"email" : "email7#email.com",
"c" : 7
},
{
"_id" : ObjectId("512fa900d99d0adda2a744d2"),
"authorName" : "author name6",
"email" : "email6#email.com",
"c" : 6
}
],
"total" : 25
}
],
"ok" : 1
}
In the end, I think that for a big collection, 2 queries (the first for data, the second for the count) work faster. For example, you can count the total for the collection like this:
db.so.count()
or like this:
db.so.find({},{_id:1}).sort({_id:-1}).count()
I'm not fully sure about the first example, but in the second example we use only the cursor (an index-only query, as the explain output shows), which means higher speed:
db.so.find({},{_id:1}).sort({_id:-1}).explain()
{
"cursor" : "BtreeCursor _id_ reverse",
"isMultiKey" : false,
"n" : 25,
"nscannedObjects" : 25,
"nscanned" : 25,
"nscannedObjectsAllPlans" : 25,
"nscannedAllPlans" : 25,
"scanAndOrder" : false,
!!!!!>>> "indexOnly" : true, <<<!!!!!
"nYields" : 0,
"nChunkSkips" : 0,
"millis" : 0,
...
}
For completeness (the full discussion was on the MongoDB Google Groups), here is the aggregation you want:
db.docs.aggregate( [
{
"$match" : {
"year" : "2012"
}
},
{
"$group" : {
"_id" : null,
"my_documents" : {
"$push" : {
"_id" : "$_id",
"year" : "$year",
"author" : "$author"
}
},
"reviewsCount" : {
"$sum" : 1
}
}
},
{
"$project" : {
"_id" : 0,
"my_documents" : 1,
"total" : "$reviewsCount"
}
}
] )
By the way, you don't need the aggregation framework here - you can just use a regular find. You can get count() from the cursor without having to re-query.
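For reference, a regular find plus count in Spring Data MongoDB terms might look roughly like this; the Review class and the mongoOperations bean are assumptions, and the field names are carried over from the question:
import java.util.List;
import org.springframework.data.domain.Sort;
import org.springframework.data.mongodb.core.query.Criteria;
import org.springframework.data.mongodb.core.query.Query;

// Sketch: one Query object reused for the total count and for a page of results.
Query query = new Query(Criteria.where("year").is("2012")
        .and("author.authorName").regex("au", "i"));

long total = mongoOperations.count(query, Review.class);   // total matching documents

List<Review> page = mongoOperations.find(
        query.with(Sort.by(Sort.Direction.DESC, "some_field"))
             .skip(0)
             .limit(10),
        Review.class);                                      // one page of 10 documents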