I have a MongoDB collection. The documents have two fields: rtd (number of days, an int value) and timestamp (a long value). I need to get all documents that satisfy the following condition using a Criteria query:
for a document x,
(currentTimestamp - x.timestamp) converted to days < x.rtd
try {
    return mongoTemplate.find(query(criteria), PredictiveEntity.class).stream()
            .filter(predictiveEntity -> predictiveEntity.getRtd() >=
                    TimeUnit.DAYS.convert(Instant.now().toEpochMilli() - predictiveEntity.getTimestamp(), TimeUnit.MILLISECONDS))
            .collect(Collectors.toList());
} catch (Exception e) {
    return null;
}
The following query can get you the expected output:
db.predictiveentry.find({
$expr:{
$lt:[
{
$toInt:{
$divide:[
{
$subtract:[new Date().getTime(), "$timestamp"]
},
86400000
]
}
},
"$rtd"
]
}
})
Since $expr is still not supported in Criteria, we need to follow a different route, i.e. parse the BSON query directly.
Query query = new BasicQuery("{ $expr:{ $lt:[ { $toInt:{ $divide:[ { $subtract:[new Date().getTime(), '$timestamp'] }, 86400000 ] } }, '$rtd' ] } }");
return mongoTemplate.find(query, PredictiveEntity.class).stream().collect(Collectors.toList());
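Note that new Date().getTime() is mongo shell JavaScript rather than JSON, so depending on the Spring Data / driver version the JSON parser behind BasicQuery may reject it. A minimal sketch of the same idea, assuming Spring Data MongoDB 2.x (where BasicQuery accepts an org.bson.Document), builds the $expr programmatically so the current timestamp is computed in Java:
// Hedged sketch: programmatic equivalent of the JSON query above
long nowMillis = System.currentTimeMillis();
Document expr = new Document("$expr",
        new Document("$lt", Arrays.asList(
                new Document("$toInt",
                        new Document("$divide", Arrays.asList(
                                new Document("$subtract", Arrays.asList(nowMillis, "$timestamp")),
                                86400000L))),
                "$rtd")));
Query query = new BasicQuery(expr);
return mongoTemplate.find(query, PredictiveEntity.class);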
We have a collection of scrips :
{
"_id" : ObjectId("xxxxxxx"),
"scrip" : "3647"
}
{
"_id" : ObjectId("yyyyyy"),
"scrip" : "5647"
}
...
We are simply attempting to return the scrip values as an array of strings using the Java driver 3.7.
ArrayList<Document> scriplist = scrips.aggregate(Arrays.asList(
Aggregates.group(
Accumulators.push("scripids",
new Document("_id", "$id").
append("scripids", "$scripid"))
)
)).into(new ArrayList<>());
System.out.println(scriplist.toString());
Expected output is ['3647','5647'].
However, we get a 'Can't find a codec for class com.mongodb.client.model.BsonField.' exception.
How is this to be done?
The following query can get us the expected output:
db.scrips.distinct("scrip");
Output:
["3647","5647"]
Equivalent code in Java:
DistinctIterable<String> iterable = scrips.distinct("scrip", String.class);
List<String> scrips = new ArrayList<>();
Block<String> block = scrip -> scrips.add(scrip);
iterable.forEach(block);
The 'scrips' list would hold the distinct scrip values.
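As a side note, the same result can also be collected in one line via MongoIterable.into, which is available on the 3.x driver:
List<String> distinctScrips = scrips.distinct("scrip", String.class).into(new ArrayList<>());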
Some other ways to do the same:
db.scrips.aggregate([
{
$group:{
"_id":"$scrip"
}
},
{
$group:{
"_id":null,
"scrips":{
$push:"$_id"
}
}
},
{
$project:{
"_id":0
}
}
])
Java code:
scrips.aggregate(
Arrays.asList(Aggregates.group("$scrip"), Aggregates.group(null, Accumulators.push("scrips", "$_id")),
Aggregates.project(Projections.exclude("_id"))));
db.scrips.aggregate([
{
$group:{
"_id":null,
"scrips":{
$addToSet:"$scrip"
}
}
},
{
$project:{
"_id":0
}
}
])
Java code:
scrips.aggregate(Arrays.asList(Aggregates.group(null, Accumulators.addToSet("scrips", "$scrip")),
        Aggregates.project(Projections.exclude("_id"))));
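Either pipeline returns a single document containing a "scrips" array. A minimal sketch of reading it back on the 3.7 driver (a cast is used because Document.getList only arrived in later driver versions):
Document result = scrips.aggregate(Arrays.asList(
        Aggregates.group(null, Accumulators.addToSet("scrips", "$scrip")),
        Aggregates.project(Projections.exclude("_id")))).first();
List<String> scripValues = result != null
        ? (List<String>) result.get("scrips")   // unchecked cast; the array holds strings
        : new ArrayList<>();
System.out.println(scripValues); // e.g. [3647, 5647]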
Data in MongoDB (shown as a screenshot in the original post): each document has a contents field plus comments and comments2 arrays whose elements have a score field.
db.test2.aggregate([
{
"$project" : {
"contents" : 1,
"comments" : {
"$filter" : {
"input" : "$comments",
"as" : "item",
"cond" : {"$gt" : ['$$item.score', 2]}
},
},
"comments2" : {
"$filter" : {
"input" : "$comments2",
"as" : "item",
"cond" : {"$gt" : ["$$item.score", 5]}
}
}
}
},
{
"$project" : {
"content" : 1,
"commentsTotal" : {
"$reduce" : {
"input" : "$comments",
"initialValue" : 0,
"in" : {"$add" : ["$$value", "$$this.score"]}
}
},
"comments2Total" : {
"$reduce" : {
"input" : "$comments2",
"initialValue" : 0,
"in" : {"$add" : ["$$value", "$$this.score"]}
}
}
}
},
{$skip : 0},
{$limit: 3}
]);
So you can see, this does the following:
1. filter comments (score > 2) and comments2 (score > 5);
2. compute the total score of each comments array.
I wrote the aggregation query in Spring like this:
AggregationExpression reduce = ArithmeticOperators.Add.valueOf("$$value").add("$$this.score");
Aggregation aggregation = Aggregation.newAggregation(
Aggregation.project().andExclude("_id")
.andInclude("content")
.and("comments").filter("item", ComparisonOperators.Gt.valueOf("item.score").greaterThanValue(3)).as("comments")
.and("comments2").filter("item", ComparisonOperators.Gt.valueOf("item.score").greaterThanValue(3)).as("comments2"),
Aggregation.project("comments", "comments2")
.and(ArrayOperators.Reduce.arrayOf("comments").withInitialValue("0").reduce(reduce)).as("commentsTotal")
);
When I run the above, it throws this exception:
java.lang.IllegalArgumentException: Invalid reference '$$value'!
You can try the below aggregation by wrapping the $filter inside the $reduce operation.
Something like this:
AggregationExpression reduce1 = new AggregationExpression() {
    @Override
    public DBObject toDbObject(AggregationOperationContext aggregationOperationContext) {
        DBObject filter = new BasicDBObject("$filter", new BasicDBObject("input", "$comments").append("as", "item").append("cond",
                new BasicDBObject("$gt", Arrays.<Object>asList("$$item.score", 2))));
        DBObject reduce = new BasicDBObject("input", filter).append("initialValue", 0).append("in", new BasicDBObject("$add", Arrays.asList("$$value", "$$this.score")));
        return new BasicDBObject("$reduce", reduce);
    }
};
Aggregation aggregation = newAggregation(
Aggregation.project().andExclude("_id")
.andInclude("content")
.and(reduce1).as("commentsTotal")
);
This is an old question, but in case someone winds up here like me, here's how I was able to solve it.
You cannot access the "$$this" and "$$value" variables directly like this in Spring:
AggregationExpression reduce = ArithmeticOperators.Add.valueOf("$$value").add("$$this.score");
To do this we have to use the Reduce variable enum, like this:
AggregationExpression reduce = ArithmeticOperators.Add.valueOf(ArrayOperators.Reduce.Variable.VALUE.getTarget()).add(ArrayOperators.Reduce.Variable.THIS.referringTo("score").getTarget());
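For context, a rough sketch of how that expression could slot into the question's pipeline (field names are taken from the question, the $filter stage is omitted for brevity, and this is not the original answer's code):
AggregationExpression reduce = ArithmeticOperators.Add
        .valueOf(ArrayOperators.Reduce.Variable.VALUE.getTarget())
        .add(ArrayOperators.Reduce.Variable.THIS.referringTo("score").getTarget());
Aggregation aggregation = Aggregation.newAggregation(
        Aggregation.project("content", "comments"),
        Aggregation.project("content")
                .and(ArrayOperators.Reduce.arrayOf("comments")
                        .withInitialValue(0)
                        .reduce(reduce)).as("commentsTotal"));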
Hope this helps!
I had to solve the following task and couldn't find any solutions, so I hope my answer helps somebody.
A user has a list of rights plus a list of roles, and each role has its own list of rights; I needed to find the user's full list of rights.
(The user and role document structures were shown as images in the original post.)
First, I $lookup the roles into roleDto (for example); then I collect the rights from the roles into one list:
ArrayOperators.Reduce reduce = ArrayOperators.Reduce.arrayOf("$roleDto.rights")
.withInitialValue(new ArrayList<>())
.reduce(ArrayOperators.ConcatArrays.arrayOf("$$value").concat("$$this"));
As a result, the reduce expression yields a single list of rights collected from the roles.
After that I apply:
SetOperators.SetUnion.arrayAsSet(reduce).union("$rights")
to the previous result. The result type is AggregationExpression, because AbstractAggregationExpression implements AggregationExpression.
So, finally, I get something like this (sorry for the messy code):
private static AggregationExpression getAllRightsForUser() {
// concat rights from list of roles (each role have list of rights) - list of list to list
ArrayOperators.Reduce reduce = ArrayOperators.Reduce.arrayOf("$roleDto.rights")
.withInitialValue(new ArrayList<>())
.reduce(ArrayOperators.ConcatArrays.arrayOf("$$value").concat("$$this"));
// union result with user.rights
return SetOperators.SetUnion.arrayAsSet(reduce).union("$rights");
}
The result of this operation can finally be used somewhere like here ;) :
public static AggregationOperation addFieldOperation(AggregationExpression aggregationExpression, String fieldName) {
return aoc -> new Document("$addFields", new Document(fieldName, aggregationExpression.toDocument(aoc)));
}
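A hypothetical usage sketch tying the two helpers together; the "roles" collection, the "roleIds" local field and the "allRights" output field are assumptions, not from the original post:
Aggregation aggregation = Aggregation.newAggregation(
        Aggregation.lookup("roles", "roleIds", "_id", "roleDto"),   // assumed collection/field names
        addFieldOperation(getAllRightsForUser(), "allRights"));     // assumed output field name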
I had the same issue; one of the solutions is to create a custom reduce expression. Here's a union example:
public class SetUnionReduceExpression implements AggregationExpression {
    @Override
    public Document toDocument(AggregationOperationContext context) {
        return new Document("$setUnion", ImmutableList.of("$$value", "$$this"));
    }
}
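A minimal usage sketch, assuming the same "$roleDto.rights" array as in the previous answer (the surrounding reduce is not part of the original snippet):
AggregationExpression allRights = ArrayOperators.Reduce.arrayOf("$roleDto.rights")
        .withInitialValue(new ArrayList<>())
        .reduce(new SetUnionReduceExpression());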
I use NodeJS to insert 8,000,000 records into my OrientDB database, but after about 2,000,000 inserted records my app stops and shows a "Java heap" error.
Is there a way to release memory after every inserted record?
RAM usage:
- before starting the app: 2.6 GB
- after inserting 2 million records: 7.6 GB
My app.js (NodeJS):
var dbConn = [];
var dbNext = 0;
var dbMax = 25;
for (var i = 0; i <= dbMax; i++) {
var db = new ODatabase({
host: orientdb.host,
port: 2424,
username: 'root',
password: orientdb.password,
name: 'test',
});
dbConn.push(db);
}
//---------------------------------------------------
//Start loop
// record = {name: 'test'}
record["#class"] = "table";
var db = nextDB();
db.open().then(function () {
return db.record.create(record);
}).then(function (res) {
db.close().then(function () {
//----resume loop
});
}).error(function (err) {
//------
});
// end loop - iteration loop
//---------------------------------------------------
function nextDB() {
if (++dbNext >= dbMax) {
dbNext -= dbMax;
}
return dbConn[dbNext];
}
OrientJS wasn't efficient for inserting massive amounts of data from SQL Server into OrientDB. I used the ETL module for the massive insert instead; it is the fastest way and a good option for transporting massive data without memory use growing beyond 2 GB.
I could transport about 7,000 records per minute.
My ETL's config.json:
{
    "config": {
        "log": "debug"
    },
    "extractor": {
        "jdbc": {
            "driver": "com.microsoft.sqlserver.jdbc.SQLServerDriver",
            "url": "jdbc:sqlserver://10.10.10.10;databaseName=My_DB;",
            "userName": "sa",
            "userPassword": "123",
            "query": "select * from My_Table"
        }
    },
    "transformers": [
        { "vertex": { "class": "Company" } }
    ],
    "loader": {
        "orientdb": {
            "dbURL": "plocal:D:\\DB\\Orient_DB",
            "dbUser": "admin",
            "dbPassword": "admin",
            "dbAutoCreate": true,
            "tx": false,
            "batchCommit": 1000,
            "wal": false,
            "dbType": "graph"
        }
    }
}
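Once the config file is saved, the ETL process is started with OrientDB's oetl.sh (oetl.bat on Windows) script, passing the path of this config file as its argument.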
From the documentation, for massive insertion you should declare your intent:
db.declareIntent( new OIntentMassiveInsert() );
// YOUR MASSIVE INSERTION
db.declareIntent( null );
But it does not seem to be implemented in the OrientJS driver yet.
Another thing is that you should not open/close your database for each new record created. This is in general bad practice.
I do not have a Node.js environment at hand right now, but something like this should do the trick:
db.open().then(function () {
// when available // db.declareIntent( new OIntentMassiveInsert() );
for (var i = 0; i < 8000000; i++) {
// create a new record
myRecord = { "#class" : "myClass", "attributePosition" : i };
db.record.create(myRecord);
}
// when available // db.declareIntent( null );
}).then(function () { db.close() });
I have tried a few example codes for the suggester feature of Elasticsearch from the net, but I couldn't get my autocomplete solution working.
My index:
client.prepareIndex("kodcucom", "article", "1")
.setSource(putJsonDocument("ElasticSearch: Java",
"ElasticSeach provides Java API, thus it executes all operations " +
"asynchronously by using client object..",
new Date(),
new String[]{"elasticsearch"},
"Hüseyin Akdoğan")).execute().actionGet();
and I used SuggestBuilder to obtain the keyword and then scan through the "content" field, and here is where the NullPointerException occurs because no result is returned:
CompletionSuggestionBuilder skillNameSuggest = new CompletionSuggestionBuilder("skillNameSuggest");
skillNameSuggest.text("lien");
skillNameSuggest.field("content");
SuggestRequestBuilder suggestRequestBuilder = client.prepareSuggest("kodcucom").addSuggestion(skillNameSuggest);
SuggestResponse suggestResponse = suggestRequestBuilder.execute().actionGet();
Iterator<? extends Suggest.Suggestion.Entry.Option> iterator =
suggestResponse.getSuggest().getSuggestion("skillNameSuggest").iterator().next().getOptions().iterator();
Am I missing some filters or input criteria needed to get a result? Any result would be fine, such as an autocomplete suggestion or a matched record.
EDIT 1:
This is where I get the NPE; in debug mode I could see that no results are returned in suggestResponse:
Iterator<? extends Suggest.Suggestion.Entry.Option> iterator =
suggestResponse.getSuggest().getSuggestion("skillNameSuggest").iterator().next().getOptions().iterator();
EDIT 2:
I am using version 2.1.1 of the Elasticsearch Java API.
EDIT 3:
I tried splitting the iterator line up into several statements; the NPE occurs at the last line, when converting the suggestion into an iterator, but that didn't help much:
Suggest tempSuggest = suggestResponse.getSuggest();
Suggestion tempSuggestion = tempSuggest.getSuggestion("skillNameSuggest");
Iterator tempIterator = tempSuggestion.iterator();
I can see that this code:
SuggestRequestBuilder suggestRequestBuilder = client.prepareSuggest("kodcucom").addSuggestion(skillNameSuggest);
SuggestResponse suggestResponse = suggestRequestBuilder.execute().actionGet();
already contains an empty array/dataset. Am I using the suggest request builder incorrectly?
In order to use the completion feature, you need to dedicate one field, which will be of type completion, and you have to specify a special mapping for it.
For example:
"mappings": {
"article": {
"properties": {
"content": {
"type": "string"
},
"completion_suggest": {
"type": "completion"}
}
}
}
The completion_suggest field is the field we will use for the autocomplete function in the code sample above. After this mapping definition, the data must be indexed as follows:
curl -XPOST localhost:9200/kodcucom/article/1 -d '{
"content": "elasticsearch",
"completion_suggest": {
"input": [ "es", "elastic", "elasticsearch" ],
"output": "ElasticSearch"
}
}'
Then the Java API can be used as follows to get suggestions:
CompletionSuggestionBuilder skillNameSuggest = new CompletionSuggestionBuilder("complete");
skillNameSuggest.text("es");
skillNameSuggest.field("completion_suggest");
SearchResponse searchResponse = client.prepareSearch("kodcucom")
.setTypes("article")
.setQuery(QueryBuilders.matchAllQuery())
.addSuggestion(skillNameSuggest)
.execute().actionGet();
CompletionSuggestion compSuggestion = searchResponse.getSuggest().getSuggestion("complete");
List<CompletionSuggestion.Entry> entryList = compSuggestion.getEntries();
if(entryList != null) {
CompletionSuggestion.Entry entry = entryList.get(0);
List<CompletionSuggestion.Entry.Option> options =entry.getOptions();
if(options != null) {
CompletionSuggestion.Entry.Option option = options.get(0);
System.out.println(option.getText().string());
}
}
The following link provides the details of how to create a suggester index: https://www.elastic.co/blog/you-complete-me
I use the asynchronous SuggestionBuilder Java API to generate suggestions based on terms.
SearchRequestBuilder suggestionsExtractor = elasticsearchService.suggestionsExtractor("yourIndexName", "yourIndexType//not necessary", "name_suggest", term);
System.out.println(suggestionsExtractor);
Map<String,Object> suggestionMap = new HashMap<>();
suggestionsExtractor.execute(new ActionListener<SearchResponse>() {
@Override
public void onResponse(SearchResponse searchResponse) {
if(searchResponse.status().equals(RestStatus.OK)) {
searchResponse.getSuggest().getSuggestion("productsearch").getEntries().forEach(e -> {
e.getOptions().forEach(s -> {
ArrayList<Object> contents = new ArrayList<>();
suggestionMap.put(s.getText().string(), s.getScore());
});
});
}
}
@Override
public void onFailure(Exception e) {
Helper.sendErrorResponse(routingContext,new JsonObject().put("details","internal server error"));
e.printStackTrace();
}
});
The following is how the suggestion builder is created:
public SearchRequestBuilder suggestionsExtractor(String indexName, String typeName, String field, String term) {
CompletionSuggestionBuilder csb = SuggestBuilders.completionSuggestion(field).text(term);
SearchRequestBuilder suggestBuilder = client.prepareSearch()
.suggest(new SuggestBuilder().addSuggestion(indexName, csb));
return suggestBuilder;
}
I have a collection in mongodb - "text_failed" which has all the numbers on which I failed to send an SMS, the time they failed and some other information.
A document in this collection looks like this:
{
_id(ObjectId): xxxxxx2af8....
failTime(String): 2015-05-15 01:15:48
telNum(String): 95634xxxxx
//some other information
}
I need to fetch the top 500 numbers that failed the most over a month's duration. A number can occur any number of times during this month (e.g., one number failed 143 times, another 46, etc.).
The problem I have is that during this period the number of failures crossed 7M. It's difficult to process this much information using the following code, which doesn't use aggregation:
DBCollection collection = mongoDB.getCollection("text_failed");
BasicDBObject query = new BasicDBObject();
query.put("failTime", new BasicDBObject("$gt", "2015-05-15 00:00:00").append("$lt", "2015-06-15 00:00:00"));
BasicDBObject field = new BasicDBObject();
field.put("telNum", 1);
DBCursor cursor = collection.find(query, field);
HashMap<String, Integer> hm = new HashMap<String, Integer>();
//int count = 1;
System.out.println(cursor);
while(cursor.hasNext()) {
//System.out.println(count);
//count++;
DBObject object = cursor.next();
if(hm.containsKey(object.get("telNum").toString())) {
hm.put(object.get("telNum").toString(), hm.get(object.get("telNum").toString()) + 1);
}
else {
hm.put(object.get("telNum").toString(), 1);
}
}
This fetches 7M+ documents for me. I need only the top 500 numbers. The result should look something like this:
{
telNum: xxxxx54654 //the number which failed
count: 129 //number of times it failed
}
I tried aggregation myself but didn't get the desired results. Can this be accomplished with aggregation, or is there a more efficient way to do this?
You could try the following aggregation pipeline:
db.getCollection("text_failed").aggregate([
{
"$match": {
"failTime": { "$gt": "2015-05-01 00:00:00", "$lt": "2015-06-01 00:00:00" }
}
},
{
"$group": {
"_id": "$telNum",
"count": { "$sum": 1 }
}
},
{
"$sort": { "count": -1 }
},
{
"$limit": 500
}
])
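For completeness, a rough Java sketch of the same pipeline using the legacy driver classes from the question (DBCollection/BasicDBObject); "collection" is the text_failed DBCollection from the question, and this is an illustration rather than tested code:
DBObject match = new BasicDBObject("$match",
        new BasicDBObject("failTime",
                new BasicDBObject("$gt", "2015-05-01 00:00:00").append("$lt", "2015-06-01 00:00:00")));
DBObject group = new BasicDBObject("$group",
        new BasicDBObject("_id", "$telNum").append("count", new BasicDBObject("$sum", 1)));
DBObject sort = new BasicDBObject("$sort", new BasicDBObject("count", -1));
DBObject limit = new BasicDBObject("$limit", 500);
AggregationOptions options = AggregationOptions.builder().allowDiskUse(true).build();
Cursor cursor = collection.aggregate(Arrays.asList(match, group, sort, limit), options);
while (cursor.hasNext()) {
    System.out.println(cursor.next()); // e.g. { "_id" : "xxxxx54654", "count" : 129 }
}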