A sample JSON Document
{
"_id" : "ab85cebc-8c7a-43bf-8efc-1151ccaa4f84",
"address" : {
"city" : "Bangalore",
"postcode" : "560080",
"countrycode":"in",
"street" : "SADASHIVNAGAR,SANKEY TANK RD"
},
"brand" : "Shell",
"prices" : [
{
"fuelType" : "DIESEL",
"price" : 52.7
},
{
"fuelType" : "PETROL",
"price" : 67.05
}
]
}
I have around 20 documents for brand: Shell, at different locations in and around Bangalore.
For all of them I have to update the Diesel and Petrol prices.
Say the current DIESEL and PETROL prices are 57.9 and 71.4 respectively.
How do I update all the documents with these latest prices using Java (in the Eclipse IDE)?
Code (incomplete)
public class UpdateGasstationFuelPrice {
public static void main(String[] args) {
MongoClient client = new MongoClient("localhost",27017);
MongoDatabase db = client.getDatabase( "notes" );
MongoCursor<Document> cursor = db.getCollection( "gasstation" ).find( new BasicDBObject( "address.countrycode","in" )
.append("address.city","Bangalore")
.append("brand","Shell")).iterator();
if (cursor.hasNext()){
Document doc = cursor.next();
}
client.close();
}
}
Update with Query
db.getCollection("gasstation").update({"address.countrycode":"in","address.city":"Bangalore","brand":"Shell"},
//Query to get the position
{
"prices": { $exists: true }
},
// Use the positional $ operator to update specific element (which matches your query
{
$set:
{
//set value specific to elements field/key
"prices" : [
{
"fuelType" : "DIESEL",
"price" : 502.7
},
{
"fuelType" : "PETROL",
"price" : 607.05
}
]
}
}
);
You cannot update an array element until you know the position of the element you want to update.
So basically what you can do is:
First, query to find the position.
Then use the positional operator to update the array.
db.gasstation.update(
//Query to get the position
{
"prices.fuelType": "DIESEL"
},
// Use the positional $ operator to update the specific element (the one that matches your query)
{
$set:
{
"prices.$" :
//Element/new value to update
{
"fuelType" : "DIESEL",
"price" : 999.7
}
}
}
);
If you want to update only a specific field inside the JSON element embedded in the array, you can do it as follows:
db.gasstation.update(
//Query to get the position
{
"prices.fuelType": "DIESEL"
},
// Use the positional $ operator to update the specific element (the one that matches your query)
{
$set:
{
//set value specific to elements field/key
//i.e. Update documents in an Array
"prices.$.price" : 999.7
}
}
);
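The two steps described above (find the position, then update at that position) can be pictured with a small in-memory Java sketch. It does not talk to MongoDB at all: the class PositionalUpdateSketch, the FuelPrice type and both method names are made up for illustration, and the code only mimics what "prices.$.price" does to the array of a single document.

```java
import java.util.ArrayList;
import java.util.List;

// In-memory illustration only: no MongoDB driver is involved.
class PositionalUpdateSketch {

    // Hypothetical stand-in for one element of the "prices" array.
    static final class FuelPrice {
        final String fuelType;
        double price;

        FuelPrice(String fuelType, double price) {
            this.fuelType = fuelType;
            this.price = price;
        }
    }

    // Step 1 (the query part): index of the first element whose fuelType
    // matches, or -1 if no element matches.
    static int findPosition(List<FuelPrice> prices, String fuelType) {
        for (int i = 0; i < prices.size(); i++) {
            if (prices.get(i).fuelType.equals(fuelType)) {
                return i;
            }
        }
        return -1;
    }

    // Step 2 (the update part): set the price at the matched position only.
    // Returns false when nothing matched, so nothing was changed.
    static boolean setPrice(List<FuelPrice> prices, String fuelType, double newPrice) {
        int position = findPosition(prices, fuelType);
        if (position < 0) {
            return false;
        }
        prices.get(position).price = newPrice;
        return true;
    }

    public static void main(String[] args) {
        List<FuelPrice> prices = new ArrayList<>();
        prices.add(new FuelPrice("DIESEL", 52.7));
        prices.add(new FuelPrice("PETROL", 67.05));

        setPrice(prices, "DIESEL", 57.9);
        setPrice(prices, "PETROL", 71.4);

        for (FuelPrice p : prices) {
            System.out.println(p.fuelType + " " + p.price);
        }
    }
}
```

Only the first matching element is touched, which is also how the positional $ operator behaves. With the Java driver, the same filter and $set documents would presumably be passed to updateMany on the collection, once per fuel type, so that every matching station is changed rather than just the first one.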
Updates based on comments:
db.gasstation.update(
//Query to match
{
"address.city":"Bangalore",
"brand":"Shell",
"address.countrycode":"in",
"prices": { $exists: true }
},
// Use $set operator & overwrite entire array
{
$set:
{
//Overwrite value
"prices" : [
{
"fuelType" : "DIESEL",
"price" : 502.7
},
{
"fuelType" : "PETROL",
"price" : 607.05
}
]
}
},
// Update every matching document, not just the first one
{ multi: true }
);
I am new to mongodb and the aggregation framework.
We have a class UserMetaData and a list of UserMetaData. I need to fetch data according to the userMetaDataList that is passed to the method solve().
Currently I am iterating over the list and fetching the corresponding document from mongodb one by one. Since a db call is made for each element in the list, this becomes a highly expensive operation.
Is there any way to fetch all the required data from mongodb in one shot (more like a bulk fetch operation)?
The solution provided in "mongodb - perform batch query" does not fulfill the requirements of the current scenario.
Please help!!
This is how I am doing currently.
class UserMetaData{
String userId;
String vehicleId;
String vehicleColour;
String orderId;
}
public List<String> getOrderIds(List<UserMetaData> userMetaDataList) {
List<String> orderIds = new ArrayList<>();
for (UserMetaData userMetadata : userMetaDataList) {
try {
BasicDBObject matchDBObject = new BasicDBObject("user_id", new BasicDBObject("$eq", userMetadata.getUserId()));
matchDBObject.append("vehicle_id", new BasicDBObject("$eq", userMetadata.getVehicleID()));
matchDBObject.append("vehicle_colour", new BasicDBObject("$in", ImmutableSet.of("WHITE", "BLACK")));
Document document = eventCollection.find(matchDBObject)
.projection(new BasicDBObject("order_id", "1"))
.first();
orderIds.add(document.get("order_id").toString());
} catch (Exception e) {
log.info("Exception occurred while fetching order id for user_id: {} vehicle_id:{} - {}", userMetadata.getUserId(), userMetadata.getVehicleID(), e);
}
}
return orderIds;
}
I want to fetch all the corresponding data in a single query.
Requesting help.
You can join all the filters with an $or condition and fetch the full list at once ...
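A minimal sketch of that idea in plain Java, assuming the field names from the question (user_id, vehicle_id, vehicle_colour). The class OrFilterSketch and the UserVehicle type are hypothetical, and the nested maps only model the shape of the filter document; converting them into the driver's own filter type is left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Builds one combined filter instead of one filter per list element.
class OrFilterSketch {

    // Hypothetical pair of values taken from each UserMetaData entry.
    static final class UserVehicle {
        final String userId;
        final String vehicleId;

        UserVehicle(String userId, String vehicleId) {
            this.userId = userId;
            this.vehicleId = vehicleId;
        }
    }

    static Map<String, Object> buildOrFilter(List<UserVehicle> pairs) {
        // MongoDB rejects an empty $or array, so refuse to build one.
        if (pairs.isEmpty()) {
            throw new IllegalArgumentException("at least one user/vehicle pair is required");
        }

        // One clause per list element: { user_id: ..., vehicle_id: ... }
        List<Map<String, Object>> clauses = new ArrayList<>();
        for (UserVehicle pair : pairs) {
            Map<String, Object> clause = new LinkedHashMap<>();
            clause.put("user_id", pair.userId);
            clause.put("vehicle_id", pair.vehicleId);
            clauses.add(clause);
        }

        // The colour condition is the same for every element, so it stays
        // outside the $or and is combined with it by the implicit AND.
        Map<String, Object> colour = new LinkedHashMap<>();
        colour.put("$in", Arrays.asList("WHITE", "BLACK"));

        Map<String, Object> filter = new LinkedHashMap<>();
        filter.put("vehicle_colour", colour);
        filter.put("$or", clauses);
        return filter;
    }

    public static void main(String[] args) {
        List<UserVehicle> pairs = Arrays.asList(
                new UserVehicle("u1", "v1"),
                new UserVehicle("u2", "v2"));
        System.out.println(buildOrFilter(pairs));
    }
}
```

Note that a single combined query returns every matching document, so if several documents can match the same user and vehicle, the caller still has to pick one per pair to reproduce the `.first()` behaviour of the loop.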
I want to fetch all the corresponding data in a single query.
You can use this approach and perform the query as a single operation (avoids the for-loop).
Consider sample documents in the collection test:
{ "_id" : ObjectId("621762e2cda7c6394d557f37"), "userid" : 1, "name" : "ijk", "orderid" : "11" }
{ "_id" : ObjectId("621762efcda7c6394d557f38"), "userid" : 12, "name" : "abc", "orderid" : "99" }
{ "_id" : ObjectId("621762fccda7c6394d557f39"), "userid" : 13, "name" : "xyz", "orderid" : "100" }
The array of objects to filter:
var DOCS = [
{ userid: 12, name: "abc" },
{ userid: 13, name: "xyz" }
]
The query to filter by DOCS:
db.test.find(
{
$expr: {
$in: [ { userid: "$userid", name: "$name" }, DOCS ]
}
},
{
orderid: 1
}
)
The output has documents with userids 12 and 13.
[ EDIT - ADD ]
This aggregation is an improvement over the find:
db.test.aggregate([
// This matches the 'userid' and 'name' fields with the input list 'DOCS'
{
$match: {
$expr: {
$in: [ { userid: "$userid", name: "$name" }, DOCS ]
}
}
},
// The grouping will select only the first matching for the 'userid' and 'name'
// (this is as per the question post's code: `.first()`)
{
$group: {
_id: {
userid: "$userid",
name: "$name"
},
orderid: {
$first: "$orderid"
}
}
},
// Remove the '_id' field
// Now the result has just the 'orderid' field only
{
$unset: "_id"
}
])
How can I find the number of duplicates in each document in Java-MongoDB
I have collection like this.
Collection example:
{
"_id": {
"$oid": "5fc8eb07d473e148192fbecd"
},
"ip_address": "192.168.0.1",
"mac_address": "00:A0:C9:14:C8:29",
"url": "https://people.richland.edu/dkirby/141macaddress.htm",
"datetimes": {
"$date": "2021-02-13T02:02:00.000Z"
}
}
{
"_id": {
"$oid": "5ff539269a10d529d88d19f4"
},
"ip_address": "192.168.0.7",
"mac_address": "00:A0:C9:14:C8:30",
"url": "https://people.richland.edu/dkirby/141macaddress.htm",
"datetimes": {
"$date": "2021-02-12T19:00:00.000Z"
}
}
{
"_id": {
"$oid": "60083d9a1cad2b613cd0c0a2"
},
"ip_address": "192.168.1.5",
"mac_address": "00:0A:05:C7:C8:31",
"url": "www.facebook.com",
"datetimes": {
"$date": "2021-01-24T17:00:00.000Z"
}
}
example query:
BasicDBObject whereQuery = new BasicDBObject();
DBCursor cursor = table1.find(whereQuery);
while (cursor.hasNext()) {
DBObject obj = cursor.next();
String ip_address = (String) obj.get("ip_address");
String mac_address = (String) obj.get("mac_address");
Date datetimes = (Date) obj.get("datetimes");
String url = (String) obj.get("url");
System.out.println(ip_address + ", " + mac_address + ", " + datetimes + ", " + url);
}
In Java, how can I find out which "url" values are duplicated, and how many duplicates each one has?
In MongoDB you can solve this problem with an aggregation pipeline, which you then need to implement with the MongoDB Java driver. The pipeline below returns only the duplicated URLs, together with their duplicate counts.
db.getCollection('table1').aggregate([
{
"$group": {
// group by url and calculate count of duplicates by url
"_id": "$url",
"url": {
"$first": "$url"
},
"duplicates_count": {
"$sum": 1
},
"duplicates": {
"$push": {
"_id": "$_id",
"ip_address": "$ip_address",
"mac_address": "$mac_address",
"url": "$url",
"datetimes": "$datetimes"
}
}
}
},
{ // keep only the groups whose duplicates count is higher than 1
"$match": {
"duplicates_count": {
"$gt": 1
}
}
},
{
"$project": {
"_id": 0
}
}
]);
Output Result:
{
"url" : "https://people.richland.edu/dkirby/141macaddress.htm",
"duplicates_count" : 2.0,
"duplicates" : [
{
"_id" : ObjectId("5fc8eb07d473e148192fbecd"),
"ip_address" : "192.168.0.1",
"mac_address" : "00:A0:C9:14:C8:29",
"url" : "https://people.richland.edu/dkirby/141macaddress.htm",
"datetimes" : {
"$date" : "2021-02-13T02:02:00.000Z"
}
},
{
"_id" : ObjectId("5ff539269a10d529d88d19f4"),
"ip_address" : "192.168.0.7",
"mac_address" : "00:A0:C9:14:C8:30",
"url" : "https://people.richland.edu/dkirby/141macaddress.htm",
"datetimes" : {
"$date" : "2021-02-12T19:00:00.000Z"
}
}
]
}
If I understand your question correctly, you're trying to find the number of duplicate entries for the field url. You could iterate over all your documents and add each url to a Set. A Set only stores unique values: when you add a value that is already in the Set, it is not added again. Thus the number of documents minus the number of entries in the Set is the number of duplicate entries for the given field.
If you wanted to know which URLs are non-unique, you could check the return value of Set.add(Object), which is false when the given value was already in the Set. If it is false, you've got yourself a duplicate.
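A minimal sketch of the Set-based approach, assuming the url values have already been read from the cursor into a list. The class UrlDuplicateSketch and its method names are made up for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

class UrlDuplicateSketch {

    // Number of entries that repeat a url seen earlier in the list.
    static int countDuplicateEntries(List<String> urls) {
        Set<String> seen = new HashSet<>();
        int duplicates = 0;
        for (String url : urls) {
            // add() returns false when the url was already in the set.
            if (!seen.add(url)) {
                duplicates++;
            }
        }
        return duplicates;
    }

    // For each url that occurs more than once, how many times it occurs.
    static Map<String, Integer> duplicateCounts(List<String> urls) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String url : urls) {
            counts.merge(url, 1, Integer::sum);
        }
        // Drop the urls that occur only once.
        counts.values().removeIf(count -> count < 2);
        return counts;
    }

    public static void main(String[] args) {
        List<String> urls = Arrays.asList(
                "https://people.richland.edu/dkirby/141macaddress.htm",
                "https://people.richland.edu/dkirby/141macaddress.htm",
                "www.facebook.com");
        System.out.println(countDuplicateEntries(urls));
        System.out.println(duplicateCounts(urls));
    }
}
```

This does the counting on the client, so every document has to be transferred first; for large collections the aggregation pipeline is likely the better fit.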
I have a list of objects that are given somewhat arbitrary Object keys as a result of using the async Java driver + BSON.
My issue is given the fact that jobStatuses are an arbitrary list of Dictionary items where I don't know the key, I have no idea how to access its sub-values. In the end, I'm trying to build a query that returns if ANY of jobStatus.*._id are true given a list of potential Object ID's.
So I'd be giving a list of ID's and want to return true if ANY of the items in jobStatuses have any of the given ID's. Any ideas?
Let's try this :
db.yourCollectionName.aggregate([
{
$project: {
_id: 0,
jobStatutses: { $arrayElemAt: [{ $objectToArray: "$jobStatutses" }, 0] }
}
}, {
$match: { 'jobStatutses.v._id': { $in: [ObjectId("5d6d8c3a5a0d22d3c84dd6dc"), ObjectId("5d6d8c3a5a0d22d3c84dd6ed")] } }
}
])
Collection Data :
/* 1 */
{
"_id" : ObjectId("5e06319c400289966eea6a07"),
"jobStatutses" : {
"5d6d8c3a5a0d22d3c84dd6dc" : {
"_id" : ObjectId("5d6d8c3a5a0d22d3c84dd6dc"),
"accepted" : "123",
"completed" : 0
}
},
"something" : 1
}
/* 2 */
{
"_id" : ObjectId("5e0631ad400289966eea6dd1"),
"jobStatutses" : {
"5d6d8c3a5a0d22d3c84dd6ed" : {
"_id" : ObjectId("5d6d8c3a5a0d22d3c84dd6ed"),
"accepted" : "456",
"completed" : 0
}
},
"something" : 2
}
/* 3 */
{
"_id" : ObjectId("5e0631cd400289966eea7542"),
"jobStatutses" : {
"5e06319c400289966eea6a07" : {
"_id" : ObjectId("5e06319c400289966eea6a07"),
"accepted" : "789",
"completed" : 0
}
},
"something" : 3
}
Output :
/* 1 */
{
"jobStatutses" : {
"k" : "5d6d8c3a5a0d22d3c84dd6dc",
"v" : {
"_id" : ObjectId("5d6d8c3a5a0d22d3c84dd6dc"),
"accepted" : "123",
"completed" : 0
}
}
}
/* 2 */
{
"jobStatutses" : {
"k" : "5d6d8c3a5a0d22d3c84dd6ed",
"v" : {
"_id" : ObjectId("5d6d8c3a5a0d22d3c84dd6ed"),
"accepted" : "456",
"completed" : 0
}
}
}
All you need is to check whether at least one document is returned from the DB for the given list, so you don't need to worry about the document structure: just check result.length in your code to see whether at least one document matched the input list.
I want to get a specific element of the array through responsaveis.$ (daniela.morais#sofist.com.br), but there is no result. Is there a problem in my syntax?
{
"_id" : ObjectId("54fa059ce4b01b3e086c83e9"),
"agencia" : "Abc",
"instancia" : "dentsuaegis",
"cliente" : "Samsung",
"nomeCampanha" : "Serie A",
"ativa" : true,
"responsaveis" : [
"daniela.morais#sofist.com.br",
"abc#sofist.com.br"
],
"email" : "daniela.morais#sofist.com.br"
}
Syntax 1
mongoCollection.findAndModify("{'responsaveis.$' : #}", oldUser.get("email"))
.with("{$set : {'responsaveis.$' : # }}", newUser.get("email"))
.returnNew().as(BasicDBObject.class);
Syntax 2
db.getCollection('validatag_campanhas').find({"responsaveis.$" : "daniela.morais#sofist.com.br"})
Result
Fetched 0 record(s) in 1ms
The $ positional operator is only used in update(...) or projection calls; you can't use it in a query filter to return the position within an array.
The correct syntax would be:
Syntax 1
mongoCollection.findAndModify("{'responsaveis' : #}", oldUser.get("email"))
.with("{$set : {'responsaveis.$' : # }}", newUser.get("email"))
.returnNew().as(BasicDBObject.class);
Syntax 2
db.getCollection('validatag_campanhas').find({"responsaveis" : "daniela.morais#sofist.com.br"})
If you just want to project the specific element, you can use the positional operator $ in the projection, as follows:
{"responsaveis.$":1}
db.getCollection('validatag_campanhas').find({"responsaveis" : "daniela.morais#sofist.com.br"},{"responsaveis.$":1})
Try with this
db.validatag_campanhas.aggregate([
{ $unwind : "$responsaveis" },
{
$match : {
"responsaveis": "daniela.morais#sofist.com.br"
}
},
{ $project : { responsaveis: 1, _id:0 }}
]);
That would give you all documents that meet that condition
{
"result" : [
{
"responsaveis" : "daniela.morais#sofist.com.br"
}
],
"ok" : 1
}
If you want the document that has the element "daniela.morais#sofist.com.br" in its responsaveis array, you can drop the $project stage, like this
db.validatag_campanhas.aggregate([
{ $unwind : "$responsaveis" },
{
$match : {
"responsaveis": "daniela.morais#sofist.com.br"
}
}
]);
And that will give you
{
"result" : [
{
"_id" : ObjectId("54fa059ce4b01b3e086c83e9"),
"agencia" : "Abc",
"instancia" : "dentsuaegis",
"cliente" : "Samsung",
"nomeCampanha" : "Serie A",
"ativa" : true,
"responsaveis" : "daniela.morais#sofist.com.br",
"email" : "daniela.morais#sofist.com.br"
}
],
"ok" : 1
}
Hope it helps
I am having trouble using ElasticSearch in my Java application.
To explain: I have a mapping, which is something like:
{
"products": {
"properties": {
"id": {
"type": "long",
"ignore_malformed": false
},
"locations": {
"properties": {
"category": {
"type": "long",
"ignore_malformed": false
},
"subCategory": {
"type": "long",
"ignore_malformed": false
},
"order": {
"type": "long",
"ignore_malformed": false
}
}
},
...
So, as you can see, I receive a list of products, each of which has locations. In my model, these locations are all the categories the product belongs to. This means that a product can be in one or more categories. In each of these categories, the product has an order, which is the position in which the client wants it to be shown.
For instance, a diamond product can have first place in Jewelry, but third place in Woman (my examples are not so logical ^^).
So, when I click on Jewelry, I want to show these products, ordered by the field locations.order in this specific category.
For the moment, when I search for all the products in a specific category, the response I receive from ElasticSearch is something like:
{"id":5331880,"locations":[{"category":5322606,"order":1},
{"category":5883712,"subCategory":null,"order":3},
{"category":5322605,"subCategory":6032961,"order":2},.......
Is it possible to sort these products by the locations.order element for the specific category I am searching for? For instance, if I am querying category 5322606, I want the order 1 to be used for this product.
Thank you very much in advance!
Regards,
Olivier.
First a correction of terminology: in Elasticsearch, "parent/child" refers to completely separate docs, where the child doc points to the parent doc. Parent and children are stored on the same shard, but they can be updated independently.
With your example above, what you are trying to achieve can be done with nested docs.
Currently, your locations field is of type:"object". This means that the values in each location get flattened to look something like this:
{
"locations.category": [5322606, 5883712, 5322605],
"locations.subCategory": [6032961],
"locations.order": [1, 3, 2]
}
In other words, the "sub" fields get flattened into multi-value fields, which is of no use to you, because there is no correlation between category: 5322606 and order: 1.
However, if you change locations to be type:"nested" then internally it will index each location as a separate doc, meaning that each location can be queried independently, using the dedicated nested query and filter.
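The difference between the two mappings can be illustrated with a small in-memory Java sketch. It does not use Elasticsearch; the class NestedVsObjectSketch and its methods are hypothetical and only model the matching behaviour described above, using the three locations from the question.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// In-memory model only: no Elasticsearch client is involved.
class NestedVsObjectSketch {

    static final class Location {
        final long category;
        final long order;

        Location(long category, long order) {
            this.category = category;
            this.order = order;
        }
    }

    // type "object": values are flattened into independent multi-value
    // fields, so category and order are checked separately.
    static boolean matchesFlattened(List<Location> locations, long category, long order) {
        Set<Long> categories = new HashSet<>();
        Set<Long> orders = new HashSet<>();
        for (Location location : locations) {
            categories.add(location.category);
            orders.add(location.order);
        }
        return categories.contains(category) && orders.contains(order);
    }

    // type "nested": each location is its own hidden doc, so both
    // conditions have to hold for the same location.
    static boolean matchesNested(List<Location> locations, long category, long order) {
        for (Location location : locations) {
            if (location.category == category && location.order == order) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<Location> locations = Arrays.asList(
                new Location(5322606L, 1L),
                new Location(5883712L, 3L),
                new Location(5322605L, 2L));

        // No single location has category 5322606 together with order 3.
        System.out.println(matchesFlattened(locations, 5322606L, 3L));
        System.out.println(matchesNested(locations, 5322606L, 3L));
    }
}
```

The flattened check accepts the combination of category 5322606 and order 3 even though they come from different locations, while the nested check rejects it.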
By default, the nested query will return a _score based upon how well each location matches, but in your case you want to return the highest value of the order field from any matching children. To do this, you'll need to use a custom_score query.
So let's start by creating the index with the appropriate mapping:
curl -XPUT 'http://127.0.0.1:9200/test/?pretty=1' -d '
{
"mappings" : {
"products" : {
"properties" : {
"locations" : {
"type" : "nested",
"properties" : {
"order" : {
"type" : "long"
},
"subCategory" : {
"type" : "long"
},
"category" : {
"type" : "long"
}
}
},
"id" : {
"type" : "long"
}
}
}
}
}
'
Then we index your example doc:
curl -XPOST 'http://127.0.0.1:9200/test/products?pretty=1' -d '
{
"locations" : [
{
"order" : 1,
"category" : 5322606
},
{
"order" : 3,
"subCategory" : null,
"category" : 5883712
},
{
"order" : 2,
"subCategory" : 6032961,
"category" : 5322605
}
],
"id" : 5331880
}
'
And now we can search for it using the queries we discussed above:
curl -XGET 'http://127.0.0.1:9200/test/products/_search?pretty=1' -d '
{
"query" : {
"nested" : {
"query" : {
"custom_score" : {
"script" : "doc[\u0027locations.order\u0027].value",
"query" : {
"constant_score" : {
"filter" : {
"and" : [
{
"term" : {
"category" : 5322605
}
},
{
"term" : {
"subCategory" : 6032961
}
}
]
}
}
}
}
},
"score_mode" : "max",
"path" : "locations"
}
}
}
'
Note: the single quotes within the script have been escaped as \u0027 to get around shell quoting. The script actually looks like this: "doc['locations.order'].value"
If you look at the _score from the results, you can see that it has used the order value from the matching location:
{
"hits" : {
"hits" : [
{
"_source" : {
"locations" : [
{
"order" : 1,
"category" : 5322606
},
{
"order" : 3,
"subCategory" : null,
"category" : 5883712
},
{
"order" : 2,
"subCategory" : 6032961,
"category" : 5322605
}
],
"id" : 5331880
},
"_score" : 2,
"_index" : "test",
"_id" : "cXTFUHlGTKi0hKAgUJFcBw",
"_type" : "products"
}
],
"max_score" : 2,
"total" : 1
},
"timed_out" : false,
"_shards" : {
"failed" : 0,
"successful" : 5,
"total" : 5
},
"took" : 9
}
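The _score of 2 in the result above can be reproduced by hand. The following in-memory Java sketch, which does not use Elasticsearch, models what the query asks for: among the locations that match the category and subCategory filters, take the order value as the score and keep the maximum. The class NestedScoreSketch and its method are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.OptionalLong;

// In-memory model of "score each matching location by its order, keep the max".
class NestedScoreSketch {

    static final class Location {
        final long category;
        final Long subCategory; // may be null, as in the example doc
        final long order;

        Location(long category, Long subCategory, long order) {
            this.category = category;
            this.subCategory = subCategory;
            this.order = order;
        }
    }

    // Returns the highest order among locations matching both filters,
    // or an empty result when no location matches.
    static OptionalLong score(List<Location> locations, long category, long subCategory) {
        OptionalLong best = OptionalLong.empty();
        for (Location location : locations) {
            boolean matches = location.category == category
                    && location.subCategory != null
                    && location.subCategory == subCategory;
            if (matches && (!best.isPresent() || location.order > best.getAsLong())) {
                best = OptionalLong.of(location.order);
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<Location> locations = Arrays.asList(
                new Location(5322606L, null, 1L),
                new Location(5883712L, null, 3L),
                new Location(5322605L, 6032961L, 2L));
        System.out.println(score(locations, 5322605L, 6032961L));
    }
}
```

Only the third location passes both filters, so its order of 2 becomes the score, matching the search result.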
Just adding a more up-to-date version related to sorting parents by a child field.
We can query the parent doc type sorted by a child field ('count', for example) along the following lines.
https://gist.github.com/robinloxley1/7ea7c4f37a3413b1ca16