Handling Long formatted DateTime field with Java Mongo template - java

I have a field in a MongoDB collection that stores a date-time as a number (a long holding epoch milliseconds).
I want to write the following query using the aggregation pipeline:
db.my_collection.aggregate(
[{ "$match" : { "createdAt" : { "$gte" : 1656786600000, "$lt" : 1657391400000}}},
{ "$project" : { "createdAt" : {"$dateToString": {format: "%Y-%m-%d", "date": {"$toDate" : "$createdAt"}}}}},
{ "$group" : { "_id" : "$createdAt", "count" : { "$sum" : 1}}}]
)
The output of this query is:
/* 1 */
{
"_id" : "2022-07-04",
"count" : 2888.0
}
/* 2 */
{
"_id" : "2022-07-06",
"count" : 1992.0
}
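As a rough cross-check of what the $toDate and $dateToString stages compute, the same bucketing can be sketched in plain Java with java.time. This is only an illustration under assumptions: it does not use MongoTemplate, the names EpochDayBuckets, dayKey and countPerDay are made up here, and the day boundary is taken in UTC, which is what $dateToString uses when no timezone option is given.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of any MongoDB or Spring API.
class EpochDayBuckets {

    // Same shape as the "%Y-%m-%d" format, evaluated in UTC.
    private static final DateTimeFormatter DAY =
            DateTimeFormatter.ofPattern("yyyy-MM-dd").withZone(ZoneOffset.UTC);

    // Mirrors $toDate followed by $dateToString for one epoch-millisecond value.
    static String dayKey(long epochMillis) {
        return DAY.format(Instant.ofEpochMilli(epochMillis));
    }

    // Mirrors the $match ($gte / $lt) plus the $group with {$sum: 1}.
    static Map<String, Long> countPerDay(List<Long> createdAt, long fromInclusive, long toExclusive) {
        Map<String, Long> counts = new TreeMap<>();
        for (long millis : createdAt) {
            if (millis >= fromInclusive && millis < toExclusive) {
                counts.merge(dayKey(millis), 1L, Long::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Sample values are illustrative; only the two range bounds come from the question.
        List<Long> sample = Arrays.asList(
                1656786600000L, 1656790200000L, 1656900000000L, 1657391400000L);
        System.out.println(countPerDay(sample, 1656786600000L, 1657391400000L));
    }
}
```

The lower bound from the question, 1656786600000, is 2022-07-02T18:30:00Z, so documents at the start of the range fall into the 2022-07-02 bucket unless a timezone is passed to $dateToString.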


MongoDB Search nested Objects without knowing Key

I have a set of objects stored under somewhat arbitrary object keys, a result of using the async Java driver with BSON.
My issue is that jobStatuses is an arbitrary dictionary whose keys I don't know, so I have no idea how to access its sub-values. In the end, I'm trying to build a query that tells me whether ANY of jobStatuses.*._id matches one of a given list of ObjectIds.
So I'd be giving a list of IDs and want to return true if ANY of the items in jobStatuses has any of the given IDs. Any ideas?
Try this:
db.yourCollectionName.aggregate([
{
$project: {
_id: 0,
jobStatutses: { $arrayElemAt: [{ $objectToArray: "$jobStatutses" }, 0] }
}
}, {
$match: { 'jobStatutses.v._id': { $in: [ObjectId("5d6d8c3a5a0d22d3c84dd6dc"), ObjectId("5d6d8c3a5a0d22d3c84dd6ed")] } }
}
])
Collection Data :
/* 1 */
{
"_id" : ObjectId("5e06319c400289966eea6a07"),
"jobStatutses" : {
"5d6d8c3a5a0d22d3c84dd6dc" : {
"_id" : ObjectId("5d6d8c3a5a0d22d3c84dd6dc"),
"accepted" : "123",
"completed" : 0
}
},
"something" : 1
}
/* 2 */
{
"_id" : ObjectId("5e0631ad400289966eea6dd1"),
"jobStatutses" : {
"5d6d8c3a5a0d22d3c84dd6ed" : {
"_id" : ObjectId("5d6d8c3a5a0d22d3c84dd6ed"),
"accepted" : "456",
"completed" : 0
}
},
"something" : 2
}
/* 3 */
{
"_id" : ObjectId("5e0631cd400289966eea7542"),
"jobStatutses" : {
"5e06319c400289966eea6a07" : {
"_id" : ObjectId("5e06319c400289966eea6a07"),
"accepted" : "789",
"completed" : 0
}
},
"something" : 3
}
Output :
/* 1 */
{
"jobStatutses" : {
"k" : "5d6d8c3a5a0d22d3c84dd6dc",
"v" : {
"_id" : ObjectId("5d6d8c3a5a0d22d3c84dd6dc"),
"accepted" : "123",
"completed" : 0
}
}
}
/* 2 */
{
"jobStatutses" : {
"k" : "5d6d8c3a5a0d22d3c84dd6ed",
"v" : {
"_id" : ObjectId("5d6d8c3a5a0d22d3c84dd6ed"),
"accepted" : "456",
"completed" : 0
}
}
}
All you need is to check whether at least one document is returned for the given list, so the document structure doesn't matter: just check result.length in your code to tell whether at least one document matched the input list.
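For comparison, the "any nested _id is in the given list" check can be sketched in plain Java over a document that has already been loaded. This is a minimal sketch under assumptions: documents are modelled as nested java.util.Map values, IDs are plain strings rather than ObjectId, and the names JobStatusMatcher and anyStatusIdIn are hypothetical. Unlike the pipeline above, which inspects only the first entry produced by $objectToArray (because of $arrayElemAt with index 0), this loop looks at every entry.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical helper; not part of the MongoDB driver.
class JobStatusMatcher {

    // Returns true if any value stored under the unknown keys has an "_id" contained in wantedIds.
    static boolean anyStatusIdIn(Map<String, Map<String, Object>> jobStatuses, Set<String> wantedIds) {
        for (Map<String, Object> status : jobStatuses.values()) {
            Object id = status.get("_id");
            if (id != null && wantedIds.contains(id.toString())) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Map<String, Object> status = new LinkedHashMap<>();
        status.put("_id", "5d6d8c3a5a0d22d3c84dd6dc");
        status.put("accepted", "123");
        status.put("completed", 0);

        // The outer key is arbitrary; the check never needs to know it.
        Map<String, Map<String, Object>> jobStatuses = new LinkedHashMap<>();
        jobStatuses.put("5d6d8c3a5a0d22d3c84dd6dc", status);

        Set<String> wanted = new HashSet<>(Arrays.asList(
                "5d6d8c3a5a0d22d3c84dd6dc", "5d6d8c3a5a0d22d3c84dd6ed"));
        System.out.println(anyStatusIdIn(jobStatuses, wanted));
    }
}
```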

How to write this Aggregate query in mongo template in spring

I want to write this aggregate query in mongo template using spring.
This is my query:
db.getCollection('CANdata_fc_distance_report').aggregate(
{$match: { device_datetime : { $gte :1462041000000, $lte: 1462732200000 }}},
{"$group" : {_id:{fc :"$fc",vehicle_name:"$vehicle_name",
device_id : "$device_id"},
count:{$sum:1}}}
)
This is the result of the above query
/* 1 */
{
"_id" : {
"fc" : NumberLong(1),
"vehicle_name" : "WPD 9020",
"device_id" : NumberLong(157)
},
"count" : 2
}
/* 2 */
{
"_id" : {
"fc" : NumberLong(2),
"vehicle_name" : "VVD 8966",
"device_id" : NumberLong(137)
},
"count" : 1
}
This is my data in the collection:
/* 1 */
{
"_id" : ObjectId("581829855d08921ee6f0ac39"),
"_class" : "com.analysis.model.mongo.fc_distance_report",
"device_id" : NumberLong(137),
"vehicle_name" : "VVD 8966",
"distance" : 125.01,
"fc" : NumberLong(1),
"device_datetime" : NumberLong(1462041000000)
}
/* 2 */
{
"_id" : ObjectId("581830335d08921ee6f0ad6b"),
"_class" : "com.analysis.model.mongo.fc_distance_report",
"device_id" : NumberLong(137),
"vehicle_name" : "VVD 8966",
"distance" : 171.88,
"fc" : NumberLong(2),
"device_datetime" : NumberLong(1462127400000)
}
I found an example through a Google search and added the match criteria, but I have no idea how to write a grouping on 3 columns:
Aggregation agg = newAggregation(match(Criteria.where("device_datetime").exists(true)
.andOperator(
Criteria.where("device_datetime").gte(startDate),
Criteria.where("device_datetime").lte(endDate))),
group("hosting").count().as("total"),
project("total").and("hosting").previousOperation(),
sort(Sort.Direction.DESC, "total")
);
Please help me. Thank you
You can try something like this. Just chain the group keys together with and().
Aggregation agg = newAggregation(match(Criteria.where("device_datetime").exists(true)
.andOperator(
Criteria.where("device_datetime").gte(startDate),
Criteria.where("device_datetime").lte(endDate))),
group(Fields.fields().and("fc", "$fc").and("vehicle_name", "$vehicle_name").and("device_id", "$device_id"))
.count().as("count"));
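To make the compound key concrete, here is a plain-Java sketch of what grouping by fc, vehicle_name and device_id with a count amounts to. It deliberately does not use Spring Data; the Row class and the countByKey method are hypothetical names, and the sample rows are the two documents shown in the question.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory illustration of a three-field group key.
class CompositeGroupCount {

    // Stand-in for one document of the collection.
    static final class Row {
        final long fc;
        final String vehicleName;
        final long deviceId;
        final long deviceDatetime;

        Row(long fc, String vehicleName, long deviceId, long deviceDatetime) {
            this.fc = fc;
            this.vehicleName = vehicleName;
            this.deviceId = deviceId;
            this.deviceDatetime = deviceDatetime;
        }
    }

    // Keeps rows inside the inclusive range, then counts rows per (fc, vehicle_name, device_id).
    static Map<List<Object>, Long> countByKey(List<Row> rows, long start, long end) {
        Map<List<Object>, Long> counts = new LinkedHashMap<>();
        for (Row r : rows) {
            if (r.deviceDatetime >= start && r.deviceDatetime <= end) {
                List<Object> key = Arrays.<Object>asList(r.fc, r.vehicleName, r.deviceId);
                counts.merge(key, 1L, Long::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<Row> rows = Arrays.asList(
                new Row(1L, "VVD 8966", 137L, 1462041000000L),
                new Row(2L, "VVD 8966", 137L, 1462127400000L));
        System.out.println(countByKey(rows, 1462041000000L, 1462732200000L));
    }
}
```

Because fc differs between the two sample rows, they end up in two separate groups even though vehicle_name and device_id are identical.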

how to read all fields and their type of a mongo db collection using java?

How can I read all the fields and their types from a MongoDB collection using Java?
The collection looks like the following. Not every document has all fields.
{ "_id" : { "oid" : "5117fa92f1d3a4093d0d3903"} },
{ "_id" : { "oid" : "5117fa93f1d3a4093d0d3904"} , "ip" : "127.0.0.1" },
{ "_id" : { "oid" : "5117fa93f1d3a4093d0d3904"} , "ip" : "127.0.0.1" ,"price" : 2000}
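One possible approach, sketched here in plain Java under assumptions, is to walk every document and record the runtime type seen for each field name. Documents are modelled as java.util.Map (the driver's org.bson.Document also implements Map<String, Object>), only top-level fields are inspected, and FieldTypeScanner and collectFieldTypes are hypothetical names.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical helper; the documents would normally come from a find() over the collection.
class FieldTypeScanner {

    // Maps each top-level field name to the set of Java type names observed for it.
    static Map<String, Set<String>> collectFieldTypes(List<Map<String, Object>> docs) {
        Map<String, Set<String>> types = new TreeMap<>();
        for (Map<String, Object> doc : docs) {
            for (Map.Entry<String, Object> entry : doc.entrySet()) {
                Object value = entry.getValue();
                String typeName = (value == null) ? "null" : value.getClass().getSimpleName();
                types.computeIfAbsent(entry.getKey(), k -> new TreeSet<>()).add(typeName);
            }
        }
        return types;
    }

    public static void main(String[] args) {
        // Sample documents loosely based on the question; _id is simplified to a string.
        Map<String, Object> first = new LinkedHashMap<>();
        first.put("_id", "5117fa92f1d3a4093d0d3903");

        Map<String, Object> second = new LinkedHashMap<>();
        second.put("_id", "5117fa93f1d3a4093d0d3904");
        second.put("ip", "127.0.0.1");
        second.put("price", 2000);

        List<Map<String, Object>> docs = new ArrayList<>();
        docs.add(first);
        docs.add(second);
        System.out.println(collectFieldTypes(docs));
    }
}
```

A field that holds different types in different documents simply ends up with more than one entry in its set.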

Gets documents and total count of them in single query include pagination

I'm new to MongoDB and use the aggregation framework for my queries. I need to retrieve some records that satisfy certain conditions (including pagination and sorting) and also get the total count of matching records.
Currently I perform the following steps:
Create the $match operator:
{ "$match" : { "year" : "2012" , "author.authorName" : { "$regex" :
"au" , "$options" : "i"}}}
Add sorting and pagination:
{ "$sort" : { "some_field" : -1}} , { "$limit" : 10} , { "$skip" : 0}
After querying, I receive the expected result: 10 documents with all fields.
For pagination I need to know the total count of records that satisfy these conditions, in my case 25.
I use the following query to get the count: { "$match" : { "year" : "2012" , "author.authorName" : { "$regex" : "au" , "$options" : "i"}}} , { "$group" : { "_id" : "$all" , "reviewsCount" : { "$sum" : 1}}} , { "$sort" : { "some_field" : -1}} , { "$limit" : 10} , { "$skip" : 0}
But I don't want to perform two separate queries: one to retrieve the documents and a second for the total count of records that satisfy the conditions.
I want to do it in a single query and get the result in the following format:
{
"result" : [
{
"my_documets": [
{
"_id" : ObjectId("512f1f47a411dc06281d98c0"),
"author" : {
"authorName" : "author name1",
"email" : "email1#email.com"
}
},
{
"_id" : ObjectId("512f1f47a411dc06281d98c0"),
"author" : {
"authorName" : "author name2",
"email" : "email2#email.com"
}
}, .......
],
"total" : 25
}
],
"ok" : 1
}
I tried modifying the group operator: { "$group" : { "_id" : "$all" , "author" : "$author" "reviewsCount" : { "$sum" : 1}}}
But in this case I got: "exception: the group aggregate field 'author' must be defined as an expression inside an object". If I add all fields to _id, then reviewsCount is always 1 because all records are different.
Does anybody know how this can be implemented in a single query? Maybe MongoDB has some feature or operator for this case? An implementation using two separate queries reduces performance when querying thousands or millions of records. In my application this is a critical performance issue.
I've been working on this all day and haven't been able to find a solution, so I thought I'd turn to the Stack Overflow community.
Thanks.
You can try using $facet in the aggregation pipeline, like this:
db.name.aggregate([
{$match:{your match criteria}},
{$facet: {
data: [{$sort: sort},{$skip:skip},{$limit: limit}],
count:[{$group: {_id: null, count: {$sum: 1}}}]
}}
])
In data you'll get your paginated list, and in count, the count field holds the total number of matched documents.
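The idea behind the two facets, one page of data plus a total computed from the same matched set, can be illustrated in plain Java. This sketch assumes an in-memory list instead of a collection; FacetLikePagination, PageResult and paginate are hypothetical names, not part of the MongoDB driver.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Hypothetical in-memory analogue of a $match followed by a two-branch $facet.
class FacetLikePagination {

    // Holds one page of results together with the total number of matches.
    static final class PageResult<T> {
        final List<T> data;
        final long total;

        PageResult(List<T> data, long total) {
            this.data = data;
            this.total = total;
        }
    }

    static <T> PageResult<T> paginate(List<T> all, Predicate<T> match, Comparator<T> sort,
                                      int skip, int limit) {
        // Equivalent of $match: computed once and shared by both branches.
        List<T> matched = all.stream().filter(match).collect(Collectors.toList());
        // "data" branch: sort, then skip, then limit.
        List<T> page = matched.stream().sorted(sort).skip(skip).limit(limit)
                .collect(Collectors.toList());
        // "count" branch: size of the matched set, independent of skip and limit.
        return new PageResult<>(page, matched.size());
    }

    public static void main(String[] args) {
        List<Integer> all = new ArrayList<>();
        for (int i = 1; i <= 25; i++) {
            all.add(i);
        }
        PageResult<Integer> result = paginate(all, x -> true, Comparator.reverseOrder(), 2, 3);
        System.out.println(result.data + " total=" + result.total);
    }
}
```

Note that skip is applied before limit. With the order reversed, as in { "$limit" : 10 } followed by { "$skip" : 0 } from the question, any non-zero skip would remove rows from a page that has already been truncated.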
OK, I have one example, but I think it's a really crazy query. I'm posting it only for fun, but if it turns out to be faster than 2 queries, please tell us about it in the comments.
For this question I created a collection called "so" and put 25 documents like this into it:
{
"_id" : ObjectId("512fa86cd99d0adda2a744cd"),
"authorName" : "author name1",
"email" : "email1#email.com",
"c" : 1
}
My query uses the aggregation framework:
db.so.aggregate([
{ $group:
{
_id: 1,
collection: { $push : { "_id": "$_id", "authorName": "$authorName", "email": "$email", "c": "$c" } },
count: { $sum: 1 }
}
},
{ $unwind:
"$collection"
},
{ $project:
{ "_id": "$collection._id", "authorName": "$collection.authorName", "email": "$collection.email", "c": "$collection.c", "count": "$count" }
},
{ $match:
{ c: { $lte: 10 } }
},
{ $sort :
{ c: -1 }
},
{ $skip:
2
},
{ $limit:
3
},
{ $group:
{
_id: "$count",
my_documets: {
$push: {"_id": "$_id", "authorName":"$authorName", "email":"$email", "c":"$c" }
}
}
},
{ $project:
{ "_id": 0, "my_documets": "$my_documets", "total": "$_id" }
}
])
Result for this query:
{
"result" : [
{
"my_documets" : [
{
"_id" : ObjectId("512fa900d99d0adda2a744d4"),
"authorName" : "author name8",
"email" : "email8#email.com",
"c" : 8
},
{
"_id" : ObjectId("512fa900d99d0adda2a744d3"),
"authorName" : "author name7",
"email" : "email7#email.com",
"c" : 7
},
{
"_id" : ObjectId("512fa900d99d0adda2a744d2"),
"authorName" : "author name6",
"email" : "email6#email.com",
"c" : 6
}
],
"total" : 25
}
],
"ok" : 1
}
In the end, I think that for a big collection 2 queries (the first for the data, the second for the count) work faster. For example, you can count the total for the collection like this:
db.so.count()
or like this:
db.so.find({},{_id:1}).sort({_id:-1}).count()
I'm not fully sure about the first example, but the second example uses only the index, which means higher speed:
db.so.find({},{_id:1}).sort({_id:-1}).explain()
{
"cursor" : "BtreeCursor _id_ reverse",
"isMultiKey" : false,
"n" : 25,
"nscannedObjects" : 25,
"nscanned" : 25,
"nscannedObjectsAllPlans" : 25,
"nscannedAllPlans" : 25,
"scanAndOrder" : false,
!!!!!>>> "indexOnly" : true, <<<!!!!!
"nYields" : 0,
"nChunkSkips" : 0,
"millis" : 0,
...
}
For completeness (the full discussion was on the MongoDB Google Groups), here is the aggregation you want:
db.docs.aggregate( [
{
"$match" : {
"year" : "2012"
}
},
{
"$group" : {
"_id" : null,
"my_documents" : {
"$push" : {
"_id" : "$_id",
"year" : "$year",
"author" : "$author"
}
},
"reviewsCount" : {
"$sum" : 1
}
}
},
{
"$project" : {
"_id" : 0,
"my_documents" : 1,
"total" : "$reviewsCount"
}
}
] )
By the way, you don't need aggregation framework here - you can just use a regular find. You can get count() from a cursor without having to re-query.

parse a particular amount of data from json file using mongodb and java

I am using MongoDB from Java for one of my projects.
The user enters a time that he knows is present in the JSON file.
What I want to do is find the document that contains that time and output every document from that one up to the next LoginRequest document.
For example:
{ "_id" : { "$oid" : "4ceb753a70fdf877ef5113ca"}, "LoginRequest" : { "Time" : "11-06-2012 11:59:33", "innerAttr4" : "innerValue4"} }
{ "_id" : { "$oid" : "4ceb753a70fdf877ef5113cc"}, "LoginResponse" : { "innerAttr1" : "innerValue1", "innerAttr4" : "innerValue4"} }
{ "_id" : { "$oid" : "4ceb753a70fdf877ef5113cb"}, "OtherRequest" : { "innerAttr3" : "innerValue3"} }
{ "_id" : { "$oid" : "4ceb753a70fdf877ef5113cd"}, "OtherResponse" : { "innerAttr2" : "innerValue2", "innerAttr4" : "innerValue4"} }
{ "_id" : { "$oid" : "4ceb753a70fdf877ef5113ce"}, "LoginRequest" : { "Time" : "11-06-2012 12:34:05", "innerAttr4" : "innerValue4"} }
{ "_id" : { "$oid" : "4ceb753a70fdf877ef5113cf"}, "LoginResponse" : { "innerAttr1" : "innerValue1", "innerAttr4" : "innerValue4"} }
{ "_id" : { "$oid" : "4ceb753a70fdf877ef5113cg"}, "OtherRequest" : { "innerAttr3" : "innerValue3"} }
{ "_id" : { "$oid" : "4ceb753a70fdf877ef5113ci"}, "LoginRequest" : { "Time" : "11-06-2012 14:59:33", "innerAttr4" : "innerValue4"} }
{ "_id" : { "$oid" : "4ceb753a70fdf877ef5113cm"}, "LoginResponse" : { "innerAttr1" : "innerValue1", "innerAttr4" : "innerValue4"} }
{ "_id" : { "$oid" : "4ceb753a70fdf877ef5113cj"}, "OtherRequest" : { "innerAttr3" : "innerValue3"} }
{ "_id" : { "$oid" : "4ceb753a70fdf877ef5113cs"}, "OtherResponse" : { "innerAttr2" : "innerValue2", "innerAttr4" : "innerValue4"} }
Suppose the user enters the time "11-06-2012 12:34:05".
So the output for this should be:
Output:
{ "_id" : { "$oid" : "4ceb753a70fdf877ef5113ce"}, "LoginRequest" : { "Time" : "11-06-2012 12:34:05", "innerAttr4" : "innerValue4"} }
{ "_id" : { "$oid" : "4ceb753a70fdf877ef5113cf"}, "LoginResponse" : { "innerAttr1" : "innerValue1", "innerAttr4" : "innerValue4"} }
{ "_id" : { "$oid" : "4ceb753a70fdf877ef5113cg"}, "OtherRequest" : { "innerAttr3" : "innerValue3"} }
I am able to get { "_id" : { "$oid" : "4ceb753a70fdf877ef5113ce"}, "LoginRequest" : { "Time" : "11-06-2012 12:34:05", "innerAttr4" : "innerValue4"} } as output, but I want the output to be as shown above.
You are not storing anything in your LoginResponse or OtherResponse documents that associates them with the LoginRequest that preceded them. Hence, with your current schema, you cannot construct a query to return the LoginRequest followed by all the other documents until the next LoginRequest.
Without knowing the details of your application's purpose and architecture, it is hard to give you a definitive solution. Here, however, are a few suggestions:
(a) Store a timestamp in all documents rather than just in the LoginRequest. Thus, given a LoginRequest, you could find the next LoginRequest (do a query ordered by time) and then search for all other documents with a timestamp between the timestamps of the two LoginRequests.
(b) If your application architecture allows it, store the id of the LoginRequest in the LoginResponse and OtherRequest documents that follow it (until the next LoginRequest).
(c) Don't store separate documents for LoginRequest, LoginResponse and OtherRequest, but instead store a single document in the collection for all the interactions for a particular login. Then it will be a simple single query to retrieve all that information.
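Suggestion (a) can be sketched in plain Java. The sketch assumes every document carries a timestamp in the day-first dd-MM-yyyy HH:mm:ss format that the question appears to use; the Event class, its fields, and sessionStartingAt are hypothetical names, and the in-memory list stands in for the collection. The two loops correspond to the two lookups described in (a): find the next LoginRequest, then collect everything between the two timestamps.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical in-memory illustration of suggestion (a).
class LoginWindow {

    static final DateTimeFormatter FORMAT = DateTimeFormatter.ofPattern("dd-MM-yyyy HH:mm:ss");

    // Stand-in for a stored document that now always has a timestamp.
    static final class Event {
        final String type;
        final LocalDateTime time;

        Event(String type, String time) {
            this.type = type;
            this.time = LocalDateTime.parse(time, FORMAT);
        }
    }

    // Returns all events from loginTime (inclusive) up to the next LoginRequest (exclusive).
    static List<Event> sessionStartingAt(List<Event> events, String loginTime) {
        LocalDateTime start = LocalDateTime.parse(loginTime, FORMAT);

        // First lookup: the earliest LoginRequest strictly after the start time.
        LocalDateTime end = null;
        for (Event e : events) {
            if (e.type.equals("LoginRequest") && e.time.isAfter(start)
                    && (end == null || e.time.isBefore(end))) {
                end = e.time;
            }
        }

        // Second lookup: everything with start <= time < end (no upper bound if end is null).
        List<Event> window = new ArrayList<>();
        for (Event e : events) {
            if (!e.time.isBefore(start) && (end == null || e.time.isBefore(end))) {
                window.add(e);
            }
        }
        return window;
    }

    public static void main(String[] args) {
        // Timestamps on the non-login events are invented for the example.
        List<Event> events = Arrays.asList(
                new Event("LoginRequest", "11-06-2012 11:59:33"),
                new Event("LoginResponse", "11-06-2012 11:59:34"),
                new Event("LoginRequest", "11-06-2012 12:34:05"),
                new Event("LoginResponse", "11-06-2012 12:34:06"),
                new Event("OtherRequest", "11-06-2012 12:40:00"),
                new Event("LoginRequest", "11-06-2012 14:59:33"));
        for (Event e : sessionStartingAt(events, "11-06-2012 12:34:05")) {
            System.out.println(e.type + " " + e.time);
        }
    }
}
```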
