The following is the document I'm trying to update:
{
"_id" : "12",
"cm_AccAmt" : 30,
"cmPerDaySts" : [
{
"cm_accAmt" : 30,
"cm_accTxnCount" : 2,
"cm_cpnCount" : 2,
"cm_accDate" : "2018-02-12"
},
{
"cm_accAmt" : 15,
"cm_accTxnCount" : 1,
"cm_cpnCount" : 1,
"cm_accDate" : "2018-02-13"
}
],
"cpnPerDaySts" : {
"cpnFile" : "path",
"perDayAcc" : [
{
"cm_accAmt" : 0,
"cm_accTxnCount" : 0,
"cm_cpnCount" : 0,
"cm_accDate" : "2018-02-12"
},
{
"cm_accAmt" : 0,
"cm_accTxnCount" : 0,
"cm_cpnCount" : 0,
"cm_accDate" : "2018-02-13"
}
]
}
}
I want to update the two lists, cmPerDaySts and cpnPerDaySts.perDayAcc, based on the string date field cm_accDate, if a matching entry exists.
The code I've tried so far is:
ArrayList<BasicDBObject> filter = new ArrayList<>();
filter.add(new BasicDBObject("_id", "12").append("cmPerDaySts.cm_accDate", "2018-02-12"));
filter.add(new BasicDBObject("_id", "12").append("cpnPerDaySts.perDayAcc.cm_accDate", "2018-02-12"));
Document document2 = mongoCollection.findOneAndUpdate(new BasicDBObject("$or", filter),
new BasicDBObject("$inc",
new BasicDBObject("cmPerDaySts.$.cm_accAmt", 15).append("cm_AccAmt", 15).append("cmPerDaySts.$.cm_accTxnCount", 1)
.append("cmPerDaySts.$.cm_cpnCount", 1).append("cpnPerDaySts.perDayAcc.cm_accTxnCount", 1)),
new FindOneAndUpdateOptions().upsert(false).returnDocument(ReturnDocument.AFTER));
System.out.println(document2.toJson());
But it fails with the following exception:
Exception in thread "main" com.mongodb.MongoCommandException: Command failed with error 16837: 'The positional operator did not find the match needed from the query. Unexpanded update:
I want to achieve this in a single update query, not several. Can anyone point me in the right direction or suggest an approach?
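One possible direction, assuming MongoDB 3.6 or later: the positional `$` operator stands for a single array match from the query, whereas the filtered positional operator `$[<identifier>]` combined with `arrayFilters` lets one update address matching elements in both arrays. The sketch below only builds the update specification as plain Java maps; the class name and the identifiers `cm` and `cpn` are made up for illustration, and nothing is sent to a server.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: builds the update specification only, no driver calls.
final class ArrayFilterUpdateSketch {

    // The $inc document. "cm" and "cpn" are identifiers chosen for this sketch;
    // each one is bound to a condition in arrayFilters(...) below.
    static Map<String, Object> incSpec(int amount) {
        Map<String, Object> inc = new LinkedHashMap<>();
        inc.put("cm_AccAmt", amount);
        inc.put("cmPerDaySts.$[cm].cm_accAmt", amount);
        inc.put("cmPerDaySts.$[cm].cm_accTxnCount", 1);
        inc.put("cmPerDaySts.$[cm].cm_cpnCount", 1);
        inc.put("cpnPerDaySts.perDayAcc.$[cpn].cm_accTxnCount", 1);
        Map<String, Object> update = new LinkedHashMap<>();
        update.put("$inc", inc);
        return update;
    }

    // One filter per identifier used in incSpec: only array elements whose
    // cm_accDate equals the given date are touched.
    static List<Map<String, Object>> arrayFilters(String date) {
        Map<String, Object> cm = new LinkedHashMap<>();
        cm.put("cm.cm_accDate", date);
        Map<String, Object> cpn = new LinkedHashMap<>();
        cpn.put("cpn.cm_accDate", date);
        return Arrays.asList(cm, cpn);
    }

    public static void main(String[] args) {
        System.out.println(incSpec(15));
        System.out.println(arrayFilters("2018-02-12"));
    }
}
```

With the Java driver, these maps could presumably be wrapped in `Document` objects and the filters passed through `FindOneAndUpdateOptions.arrayFilters(...)`, with the query reduced to `_id` equal to "12". That wiring is not shown here and depends on the driver version in use.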
I have the following document:
"Demo" : {
"SI" : {
"Value1" : 40,
"Value2" : [
10,
15,
20
]
} ,
"RS" : {
"Value1" : 4,
"Value2" : [
1,
2,
3,
4
]
}
}
I want to fetch the data for the sub-document 'SI'. I have tried the following query:
db.getCollection('input').find({"Demo.SI":"SI"}), but it does not return any record for the 'SI' document. The desired output is:
"SI" : {
"Value1" : 40,
"Value2" : [
10,
15,
20
]
}
Please specify where the query goes wrong.
First check whether SI exists using $exists, and then add it to the projection as below:
db.input.find({"Demo.SI":{"$exists":true}},{"Demo.SI":1,"_id":0}).pretty()
db.collection.find({ "Demo.SI": { $exists: true, $ne: null } },{"Demo.SI":1,"_id":0})
This query will return all the documents that have the SI key.
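To see why the original query matched nothing: {"Demo.SI":"SI"} asks for documents where the value at the path Demo.SI equals the string "SI", but that value is a sub-document. Below is a minimal in-memory sketch of that comparison, using nested Java maps in place of a real collection. The class and method names are hypothetical, arrays are not handled, and the existence check is simplified to a null test.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of dotted-path lookup; not MongoDB's implementation.
final class DottedPathSketch {

    // Follows a dotted path through nested maps; returns null if a step is missing.
    static Object resolve(Map<String, Object> doc, String path) {
        Object current = doc;
        for (String key : path.split("\\.")) {
            if (!(current instanceof Map)) {
                return null;
            }
            current = ((Map<?, ?>) current).get(key);
        }
        return current;
    }

    // A cut-down version of the document from the question.
    static Map<String, Object> sampleDoc() {
        Map<String, Object> si = new LinkedHashMap<>();
        si.put("Value1", 40);
        Map<String, Object> demo = new LinkedHashMap<>();
        demo.put("SI", si);
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("Demo", demo);
        return doc;
    }

    public static void main(String[] args) {
        Object value = resolve(sampleDoc(), "Demo.SI");
        System.out.println("SI".equals(value)); // false: a sub-document is not the string "SI"
        System.out.println(value != null);      // true: the key exists
    }
}
```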
I am running a 3-node Mongo cluster (version 3.0, WiredTiger storage engine) with 10 GB RAM.
I have around 2 million documents, each with 25-30 fields, of which 2 are elementary arrays.
I am running an aggregation query that takes around 150-170 milliseconds.
When I generate a load of 100 queries/sec, performance starts degrading and query time reaches up to 2 sec.
Query
db.testCollection.aggregate( [
{ $match: { vid: { $in: ["001","002"]} , ss :"N" , spt : { $gte : new Date("2016-06-29")}, spf :{ $lte : new Date("2016-06-27")}}},
{ $match: {$or : [{sc:{$elemMatch :{$eq : "TEST"}}},{sc :{$exists : false}}]}},
{ $match: {$or : [{pt:{$ne : "RATE"}},{rpis :{$exists : true}}]}},
{ $project: { vid: 1, pid: 1, pn: 1, pt: 1, spf: 1, spt: 1, bw: 1, bwe: 1, st: 1, et: 1, ls: 1, dw: 1, at: 1, dt: 1, d1: 1, d2: 1, mldv: 1, aog: 1, nn: 1, mn: 1, rpis: 1, lmp: 1, cid: 1, van: 1, vad: 1, efo: 1, sc: 1, ss: 1, m: 1, pr: 1, obw: 1, osc: 1, m2: 1, crp: 1, sce: 1, dce: 1, cns: 1 }},
{ $group: { _id: null , data: { $push: "$$ROOT" } }
},
{ $project: { _id: 1 , data : 1 } }
]
)
There is a compound index on all the fields, in the same order as used in the query (except "rpis", since a compound index can have only one array field).
Please suggest where I am going wrong.
The last two stages are unnecessary.
The final $group is very heavy, as it builds a new array in memory; the result should instead be consumed by the application as it comes out of the pipeline, without grouping.
It may also be worth removing the preceding $project, as it could be cheaper to send the full documents to the client; this is worth a try.
The first $match stage can use the index, but there is a real risk that the 2nd and 3rd $match stages work on the result set of the first stage instead of using the indexes. If you can, try merging the $match stages into one and see how the query performs.
Simplified version of the query below:
db.testCollection.aggregate([{
$match : {
vid : {
$in : ["001", "002"]
},
ss : "N",
spt : {
$gte : new Date("2016-06-29")
},
spf : {
$lte : new Date("2016-06-27")
}
}
}, {
$match : {
$or : [{
sc : {
$elemMatch : {
$eq : "TEST"
}
}
}, {
sc : {
$exists : false
}
}
]
}
}, {
$match : {
$or : [{
pt : {
$ne : "RATE"
}
}, {
rpis : {
$exists : true
}
}
]
}
}])
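The suggestion to merge the $match stages into one could look roughly like the sketch below. Because one document cannot hold two $or keys, the two $or conditions are wrapped in a single $and. This only builds the stage as plain Java maps; the class and helper names are hypothetical, and whether the merged stage actually performs better would need to be measured.

```java
import java.time.Instant;
import java.util.Arrays;
import java.util.Date;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical builder for a single merged $match stage; no driver calls.
final class SingleMatchSketch {

    // Single-entry document such as { key: value }.
    static Map<String, Object> doc(String key, Object value) {
        Map<String, Object> m = new LinkedHashMap<>();
        m.put(key, value);
        return m;
    }

    // { $or: [a, b] }
    static Map<String, Object> or(Map<String, Object> a, Map<String, Object> b) {
        return doc("$or", Arrays.asList(a, b));
    }

    static Map<String, Object> singleMatch() {
        Map<String, Object> match = new LinkedHashMap<>();
        match.put("vid", doc("$in", Arrays.asList("001", "002")));
        match.put("ss", "N");
        match.put("spt", doc("$gte", Date.from(Instant.parse("2016-06-29T00:00:00Z"))));
        match.put("spf", doc("$lte", Date.from(Instant.parse("2016-06-27T00:00:00Z"))));
        // Both $or conditions from the 2nd and 3rd stages, combined under one $and key.
        match.put("$and", Arrays.asList(
                or(doc("sc", doc("$elemMatch", doc("$eq", "TEST"))),
                   doc("sc", doc("$exists", false))),
                or(doc("pt", doc("$ne", "RATE")),
                   doc("rpis", doc("$exists", true)))));
        return doc("$match", match);
    }

    public static void main(String[] args) {
        System.out.println(singleMatch());
    }
}
```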
Another issue could be the business rules that affect scaling the system to a sharded environment. Did you have an estimate of the load before you started working with this document structure?
I have this document:
{
"_id" : ObjectId("54140782b6d2ca6018585093"),
"user_id" : ObjectId("53f4ae1ae750619418a20467"),
"date" : ISODate("2014-09-13T08:59:46.709Z"),
"type" : 0,
"tot" : 2,
"additional_info" : {
"item_id" : ObjectId("540986159ef9ebafd3dcb5d0"),
"shop_id" : ObjectId("53f4cc5a6e09f788a103d0a4"),
"ap_id" : ObjectId("53f4cc5a6e09f788a103d0a5")
},
"transactions" : [
{
"_id" : ObjectId("54140782b6d2ca6018585091"),
"date_creation" : ISODate("2014-09-13T08:59:46.711Z"),
"type" : -1
},
{
"_id" : ObjectId("54140782b6d2ca6018585092"),
"date_creation" : ISODate("2014-09-13T08:59:46.788Z"),
"type" : 1
}
]
}
and I need to add 2 more fields to the first transaction object:
- date_execution: date
- result: this BSON document
{ "server_used" : "xxx.xxx.xxx.xxx:27017" , "ok" : 1 , "n" : 1 , "updated_executed" : true} (m_OR.getDocument() in the following code example)
to obtain this document:
{
"_id" : ObjectId("54140811b6d25137753c1a1a"),
"user_id" : ObjectId("53f4ae1ae750619418a20467"),
"date" : ISODate("2014-09-13T09:02:09.098Z"),
"type" : 0,
"tot" : 2,
"additional_info" : {
"item_id" : ObjectId("540986159ef9ebafd3dcb5d0"),
"shop_id" : ObjectId("53f4cc5a6e09f788a103d0a4"),
"ap_id" : ObjectId("53f4cc5a6e09f788a103d0a5")
},
"transactions" : [
{
"_id" : ObjectId("54140811b6d25137753c1a18"),
"date_creation" : ISODate("2014-09-13T09:02:09.100Z"),
"type" : -1,
"result" : {
"server_used" : "xxx.xxx.xxx.xxx:27017",
"ok" : 1,
"n" : 1,
"updated_executed" : true
},
"date_execution" : ISODate("2014-09-13T09:02:15.370Z")
},
{
"_id" : ObjectId("54140811b6d25137753c1a19"),
"date_creation" : ISODate("2014-09-13T09:02:09.179Z"),
"type" : 1
}
]
}
The only way I was able to do that is to do 2 separate updates (update is my wrapper function that executes the real updates in MongoDB, and it works fine):
// where
BasicDBObject query = new BasicDBObject();
query.append("transactions._id", m_Task.ID());
// new value for date_execution - 1st upd
BasicDBObject value = new BasicDBObject();
value.put("$set",new BasicDBObject("transactions.$.date_execution",new Date()));
update(this._systemDB, "activities", query, value);
// new value for result - 2nd upd
value = new BasicDBObject();
value.put("$set",new BasicDBObject("transactions.$.result",m_OR.getDocument()));
update(this._systemDB, "activities", query, value);
If I try to do this:
BasicDBObject value = new BasicDBObject();
value.put("$set",new BasicDBObject("transactions.$.date_execution",new Date()));
value.put("$set",new BasicDBObject("transactions.$.result",m_OR.getDocument()));
or = update(this._systemDB, "activities", query, value);
only the 2nd $set is applied.
Is there any way to avoid the double execution and apply the update with just one call?
The basic rule of "hash/map" objects is that each key can appear only once. It's the "highlander" rule ("There can be only one") applied in general. So just build the update differently:
BasicDBObject value = new BasicDBObject();
value.put("$set",
new BasicDBObject("transactions.$.date_execution",new Date())
.add( new BasicDBObject("transactions.$.result",m_OR.getDocument() )
);
So basically "both" field arguments are part of the "$set" statement as in the serialized form:
{
"$set": {
"transactions.$.date_execution": new Date(),
"transactions.$.result": m_OR.getDocument()
}
}
Which is basically what you want in the end.
Your suggestion was right; I just had to fix the syntax slightly, this way:
BasicDBObject value = new BasicDBObject();
value.put("$set",
new BasicDBObject("transactions.$.date_execution",new Date())
.append("transactions.$.result",m_OR.getDocument())
);
This worked perfectly ;)
Thanks!
Samuel
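The overwrite seen in the question follows from ordinary map behaviour, which BasicDBObject shares: a second put with the same key replaces the first value. Below is a minimal sketch with plain java.util maps. The class name and the placeholder values "now" and "ok" are made up for illustration and stand in for new Date() and m_OR.getDocument().

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration using plain maps instead of BasicDBObject.
final class SingleSetKeySketch {

    // Single-entry map such as { key: value }.
    static Map<String, Object> field(String key, Object value) {
        Map<String, Object> m = new LinkedHashMap<>();
        m.put(key, value);
        return m;
    }

    // Two put calls with the same "$set" key: the second value replaces the first.
    static Map<String, Object> overwritten() {
        Map<String, Object> value = new LinkedHashMap<>();
        value.put("$set", field("transactions.$.date_execution", "now"));
        value.put("$set", field("transactions.$.result", "ok"));
        return value;
    }

    // One "$set" key whose value carries both fields.
    static Map<String, Object> combined() {
        Map<String, Object> set = field("transactions.$.date_execution", "now");
        set.put("transactions.$.result", "ok");
        Map<String, Object> value = new LinkedHashMap<>();
        value.put("$set", set);
        return value;
    }

    public static void main(String[] args) {
        System.out.println(overwritten()); // {$set={transactions.$.result=ok}}
        System.out.println(combined());    // {$set={transactions.$.date_execution=now, transactions.$.result=ok}}
    }
}
```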
I need to create a classifier by feature. I have 15M rows of data like:
{
"app_entertainment" : 1,
"app_widgets" : 2,
"arcade" : 8,
"books_and_reference" : 2,
"comics" : 0,
"brain" : 20,
"business" : 0,
"cards" : 5,
"casual" : 1,
"communication" : 4,
"education" : 0,
"finance" : 1,
"game_wallpaper" : 0,
"game_widgets" : 0,
"health_fitness" : 0,
"libraries_demo" : 0,
"racing" : 1,
"lifestyle" : 1,
"media_video" : 0,
"medical" : 0,
"music_and_audio" : 7,
"news_magazines" : 2,
"personalization" : 1,
"photography" : 0,
"productivity" : 4,
"shopping" : 1,
"social" : 1,
"sports_apps" : 1,
"sports_games" : 7,
"tools" : 15,
"transportation" : 2,
"travel_and_local" : 8,
"weather" : 3,
"app_wallpaper" : 0,
"entertainment" : 0,
"health_and_fitness" : 0,
"libraries_and_demo" : 0,
"media_and_video" : 0,
"news_and_magazines" : 0,
"sports" : 0
}
Also, for every row like this I know whether its label is true or false;
the boolean indicates whether the user with this data clicked on an ad or not.
How can I use Mahout to train a classifier, and how do I classify after I have trained it?
Everything I found on the net is very abstract, with few examples of how to do it in Java.
There are very few materials on Mahout on the internet. I referred to the Mahout source code and the source code from Mahout in Action.
You could refer to the 20newsgroup source code for classification.
Here is a simple example using a Naive Bayes classifier. The vector holds the features of the row to classify.
public List<String> classifyCase(Vector vector) {
TreeMap<Double, String> resultMap = new TreeMap<Double, String>();
Vector result = classifier.classifyFull(vector);
for (Vector.Element element: result) {
int categoryId = element.index();
double score = element.get();
resultMap.put(-score, labels.get(categoryId));
}
return new ArrayList<String>(resultMap.values());
}
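One thing to watch in the snippet above: a TreeMap keyed by score keeps only one label per distinct score, so categories with tied scores are silently dropped. Below is a hedged variant of the ranking step alone, working on a plain double[] of scores rather than a Mahout Vector. The class name and the labels in the usage example are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical ranking helper; takes raw scores instead of a Mahout Vector.
final class RankLabelsSketch {

    // Returns the labels ordered by descending score. The sort is stable,
    // so categories with equal scores are all kept, in index order.
    static List<String> rank(double[] scores, List<String> labels) {
        List<Integer> order = new ArrayList<>();
        for (int i = 0; i < scores.length; i++) {
            order.add(i);
        }
        order.sort((a, b) -> Double.compare(scores[b], scores[a]));
        List<String> ranked = new ArrayList<>();
        for (int index : order) {
            ranked.add(labels.get(index));
        }
        return ranked;
    }

    public static void main(String[] args) {
        // Two classes as in the question: did the user click on the ad or not.
        List<String> labels = Arrays.asList("not_clicked", "clicked");
        System.out.println(rank(new double[] {-3.5, -1.2}, labels)); // [clicked, not_clicked]
    }
}
```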
I have a class that contains a List as one of its fields. How can I update this field?
I found an example for updating a field:
BasicDBObject newDocument3 = new BasicDBObject().append("$set", new
BasicDBObject().append("type", "dedicated server"));
collection.update(new BasicDBObject().append("hosting", "hostA"), newDocument3);
From link -> http://www.mkyong.com/mongodb/java-mongodb-update-document/
So this is what I have tried:
BasicDBObject objectUpdateCommand = new BasicDBObject("$push", new
BasicDBObject("someList", stringValue));
collection.update(new BasicDBObject().append("id", user.getId()).append("email",
user.getEmail()), objectUpdateCommand);
Result: No change.
What am I missing?
I tried it in the shell and it worked (I know it's not matching all the ids, but it works for test purposes):
db.user.update( { Id: 'yourid'}, {$push: { someList: 'appendNewValue'} } )
I have inserted the following rows into the collection.
{ "_id" : ObjectId("50bc89ef88555f5ad35da8ba"), "id" : 1, "email" : "test1#test.com", "list" : [ "list1", "list2" ] }
{ "_id" : ObjectId("50bc89f788555f5ad35da8bb"), "id" : 2, "email" : "test2#test.com", "list" : [ "list1", "list2" ] }
Then, by using the following code, I am able to update the document with id=1.
BasicDBObject cmd = new BasicDBObject().append("$push", new BasicDBObject("list", "list3"));
coll.update(new BasicDBObject().append("id", 1).append("email","test1#test.com"), cmd);
After the update, the rows look like:
{ "_id" : ObjectId("50bc89f788555f5ad35da8bb"), "id" : 2, "email" : "test2#test.com", "list" : [ "list1", "list2" ] }
{ "_id" : ObjectId("50bc89ef88555f5ad35da8ba"), "email" : "test1#test.com", "id" : 1, "list" : [ "list1", "list2", "list3" ] }
Check your code again; it should work with this approach.
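When an update reports no change, one thing worth checking is whether the filter matched any document at all; with the legacy driver, the WriteResult returned by update exposes a count through getN(). Note that the shell test in the question filters on "Id" while the Java code filters on "id". The in-memory sketch below, with hypothetical names, shows how a filter misses on a field-name or value-type mismatch. It uses simplified exact equality rather than MongoDB's real comparison rules.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical illustration of exact-match filtering; not MongoDB's matcher.
final class FilterMatchSketch {

    // Builds a map from alternating key/value arguments.
    static Map<String, Object> doc(Object... keyValues) {
        Map<String, Object> m = new LinkedHashMap<>();
        for (int i = 0; i < keyValues.length; i += 2) {
            m.put((String) keyValues[i], keyValues[i + 1]);
        }
        return m;
    }

    // True only when every filter field is present in the document with an equal value.
    static boolean matches(Map<String, Object> document, Map<String, Object> filter) {
        for (Map.Entry<String, Object> entry : filter.entrySet()) {
            if (!document.containsKey(entry.getKey())
                    || !Objects.equals(document.get(entry.getKey()), entry.getValue())) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Map<String, Object> stored = doc("id", 1, "email", "test1#test.com");
        System.out.println(matches(stored, doc("id", 1, "email", "test1#test.com"))); // true
        System.out.println(matches(stored, doc("Id", 1)));   // false: field names are case-sensitive
        System.out.println(matches(stored, doc("id", "1"))); // false: the string "1" is not the number 1
    }
}
```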