Elastic search range dates - java

I have created an Elasticsearch index from a Mongo database.
The documents in Mongo have the following structure:
{
"_id" : ObjectId("525facace4b0c1f5e78753ea"),
"time" : ISODate("2013-10-17T09:23:56.131Z"),
"type" : "A",
"url" : "www.google.com",
"name" : "peter",
}
The index was created (apparently) without any problems.
Now I am trying to use Elasticsearch to retrieve the documents in the index between two dates. I have read that I have to use range queries, but I have repeatedly tried things like this:
MatchQueryBuilder queryBuilder = QueryBuilders.matchQuery("name", "peter").type(Type.PHRASE).minimumShouldMatch("99%");
LocalDateTime toLocal = new LocalDateTime(2013,12,18, 0, 0);
Date to = toLocal.toDate();
LocalDateTime fromLocal = new LocalDateTime(2013,12,17, 0, 0);
Date from = fromLocal.toDate();
RangeQueryBuilder queryDate = QueryBuilders.rangeQuery("time").to(to).from(from);
FilterBuilder filterDate = FilterBuilders.queryFilter(queryDate);
srb = esH.client.prepareSearch("my_index");
srb.setQuery(queryBuilder);
srb.setFilter(filterDate);
sr = srb.execute().actionGet();
and I get 0 hits, although there should be many results. I have also tried passing strings instead of dates, with the same result.
When I perform a basic query without filters such as:
MatchQueryBuilder queryBuilder = QueryBuilders.matchQuery("name", "peter").type(Type.PHRASE).minimumShouldMatch("99%");
SearchRequestBuilder srb = esH.client.prepareSearch("my_index");
srb.setQuery(queryBuilder);
SearchResponse sr = srb.execute().actionGet();
I get hits that look like this:
{
"_index" : "my_index",
"_type" : "type",
"_id" : "5280d3c2e4b05e95aa703e34",
"_score" : 1.375688, "_source" : {"type":["A"],"time":["Mon Nov 11 13:55:30 CET 2013"],"name":["peter"]}
}
Here the time field no longer has the format ISODate("2013-10-17T09:23:56.131Z").
To sum up, what would be the Java code (and types) for querying between two dates (and times), taking into account the format?

You are probably passing the wrong field name to the range query at this line:
RangeQueryBuilder queryDate = QueryBuilders.rangeQuery("time").to(to).from(from);
It should probably be @timestamp (or whichever field you use to store your timestamp) instead of time. Additionally, it seems that there is no proper time field in Elasticsearch for the example document you included, which suggests that the time field was not converted correctly from Mongo to Elasticsearch.
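If the time values really were indexed as Date.toString() output (as the "Mon Nov 11 13:55:30 CET 2013" in the hit suggests), one option is to write them as ISO-8601 strings when copying documents into the index, a form that Elasticsearch's default date handling is expected to accept. A minimal sketch using only the JDK; the class and method names are hypothetical, and how the string is then handed to your indexing code is left out:

```java
import java.time.Instant;
import java.util.Date;

/** Hypothetical helper: renders a java.util.Date as an ISO-8601 UTC string. */
final class IsoDates {
    private IsoDates() {}

    /** Returns the instant held by the Date in ISO-8601 form, in UTC. */
    static String toIso(Date date) {
        // Date.toInstant() keeps millisecond precision;
        // Instant.toString() prints ISO-8601 with a trailing "Z".
        Instant instant = date.toInstant();
        return instant.toString();
    }
}
```

Calling IsoDates.toIso(someDate) on the Date read from Mongo gives a string in the same shape as the value inside the original ISODate(...).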

Can you try
FilterBuilders.rangeFilter("@timestamp").from("from time").to("toTime")

This will work:
You can pass in Long timestamps to the gte and lte params.
QueryBuilders.rangeQuery("time").gte(startTime).lte(endTime);
If you write startTime and endTime as literals, make sure to add an "L" at the end so that the compiler knows it's a long and not an int.
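As a rough illustration of where such long values can come from, the sketch below computes epoch milliseconds for the start of a day in UTC with java.time. The class and method names are hypothetical, and treating the bounds as UTC is an assumption; use whatever zone your data was written in.

```java
import java.time.LocalDate;
import java.time.ZoneOffset;

/** Hypothetical helper: epoch-millisecond bounds for whole days in UTC. */
final class EpochRange {
    private EpochRange() {}

    /** Milliseconds since the epoch at 00:00 UTC on the given day. */
    static long startOfDayMillis(int year, int month, int day) {
        return LocalDate.of(year, month, day)
                .atStartOfDay(ZoneOffset.UTC) // midnight in UTC
                .toInstant()
                .toEpochMilli();              // already a long, no "L" suffix needed
    }
}
```

With this, startTime could be EpochRange.startOfDayMillis(2013, 12, 17) and endTime EpochRange.startOfDayMillis(2013, 12, 18), matching the dates used in the question.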

Related

How to change name of a field in MongoDB with java for each document in the collection?

Because of some design decisions, I have to rename some fields in all documents of a single collection. For automated testing, I insert documents and then check some logic.
Let's assume that after the insert I have the following documents:
{
"_id" : ObjectId("60c10042d"),
"Name" : "Mike",
"years" : 25,
"Country" : "England"
},
{
"_id" : ObjectId("40r10042t"),
"Name" : "Smith",
"years" : 32,
"Country" : "England"
}
When inserting the document(s) I want to rename the field "Country" to "Occupation" using Java. Here is an example of the code I'm using:
MongoCollection<Document> documentMongo = MongoDb.getCollection("collectionName");
Document document = Document.parse(readJsonFile(json));
//I've tried this way:
//documentMongo.updateMany(document, Updates.rename("Country", "Occupation"));
//didn't work
documentMongo.insertOne(document);
Oh, the rename should be after the insert is done.
documentMongo.insertOne(document);
documentMongo.updateMany(document, Updates.rename("Country", "Occupation"));
Anyway, this may help others who are looking for an easy way to change field names.
Sadly, when I try to rename more than one field this way, it works only for the first one.
Final solution:
documentMongoCollection.insertOne(document);
// An empty filter matches every document in the collection.
BasicDBObject searchQuery = new BasicDBObject();
BasicDBObject updateQuery = new BasicDBObject();
// A single $rename operator can carry several old-name/new-name pairs.
updateQuery.append("$rename", new BasicDBObject().append("oldField", "newField").append("oldField1", "newField1").append("oldField2", "newField2"));
documentMongoCollection.updateMany(searchQuery, updateQuery);

Unable to parse 2022-10-04T19:24:50Z format in ElasticSearch Java Implementation

SearchRequest searchRequest = Requests.searchRequest(indexName);
SearchSourceBuilder builder = new SearchSourceBuilder();
Gson gson = new Gson();
QueryBuilder querybuilder = QueryBuilders.wrapperQuery(query);
query : {
"range": {
"timecolumn": {
"gte":"2022-10-07T09:45:13Z",
"lte":"2022-10-07T09:50:50Z"
}
}
}
When I pass the above query I get a parser exception. I cannot change the date format, because the data in the DB is inserted in this same format.
I need advice on:
How can we parse this kind of timestamp in the Elasticsearch Java client? If that is not possible:
How can we control the pattern during data insertion? My column is defined as text, and it receives dates such as "2022-10-07T09:45:13Z" as text.
Either I have to pass this format to the ES parser, or I have to change the format to 2022-10-07 09:45:13 during insertion itself.
I cannot convert each row after inserting because we have lakhs of data.
As you mention, Elasticsearch is storing timecolumn as a text field, so I suggest changing the mapping of timecolumn to the date type; you will then be able to use a range query with dates. If you store dates as text and apply a range query, it will not return the expected results.
{
"mappings": {
"properties": {
"timecolumn": {
"type": "date"
}
}
}
}
Now, coming to your Java code issue: since you are using the Java client, you can use the code below to create the range query.
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
QueryBuilder query = QueryBuilders.rangeQuery("timecolumn").gte("2022-10-07T09:45:13Z").lte("2022-10-07T09:50:50Z");
searchSourceBuilder.query(query);
searchRequest.source(searchSourceBuilder);
Regarding your concern about reindexing data:
I cannot convert for each row after inserting because we have lakhs of data
You can use the Reindex API to move the data from the original index to a temporary index and then delete the original index. Then define the index again with the proper mapping and use the same Reindex API to copy the data from the temporary index back into the original index with the new mapping.
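The question also raises the alternative of rewriting the value as 2022-10-07 09:45:13 before insertion. With a date mapping that may not be necessary, but if the space-separated form is needed, a JDK-only conversion could look like the sketch below. The class and method names are hypothetical, and it assumes every input is an ISO-8601 instant in UTC, as in the question.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

/** Hypothetical helper: rewrites an ISO-8601 instant as "yyyy-MM-dd HH:mm:ss" in UTC. */
final class TimestampFormats {
    private static final DateTimeFormatter SPACE_SEPARATED =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss").withZone(ZoneOffset.UTC);

    private TimestampFormats() {}

    static String toSpaceSeparated(String isoInstant) {
        // Instant.parse throws DateTimeParseException when the input is not ISO-8601,
        // which surfaces malformed rows instead of silently storing them.
        return SPACE_SEPARATED.format(Instant.parse(isoInstant));
    }
}
```

For example, TimestampFormats.toSpaceSeparated("2022-10-07T09:45:13Z") yields "2022-10-07 09:45:13".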

How to query MarkLogic with punctuation-sensitive terms using Java?

I have the following values stored in MarkLogic in these JSON files:
1.json>> "dateSubmitted" : "2017/10/11 09:15:14"
2.json>> "dateSubmitted" : "2017/10/11 10:13:14"
3.json>> "dateSubmitted" : "2017/10/14 11:12:13"
My query term is:
String dateQuery = "2017/10/11";
I tried two methods and neither seems to work.
Method 1:
StructuredQueryBuilder qb = new StructuredQueryBuilder();
QueryDefinition queryDef = qb.and(qb.word(qb.jsonProperty("dateSubmitted"), dateQuery));
queryDef.setDirectory(DIRECTORY);
SearchHandle resultsHandle = new SearchHandle();
queryManager.search(queryDef, resultsHandle, start);
Method 2:
StructuredQueryBuilder qb = new StructuredQueryBuilder();
String[] wordQueryOptions = {"punctuation-sensitive", "space-sensitive"};
QueryDefinition queryDef = qb.and(qb.word(qb.jsonProperty("dateSubmitted"),
FragmentScope.DOCUMENTS,
wordQueryOptions, 100.0, dateQuery));
queryDef.setDirectory(DIRECTORY);
SearchHandle resultsHandle = new SearchHandle();
queryManager.search(queryDef, resultsHandle, start);
The expected result is to return only 1.json and 2.json.
However 3.json was also returned.
Are there settings I'm missing in the MarkLogic admin interface to activate options such as punctuation-sensitive?
Working with dates is often easier and more powerful if you index the property as a date. That way, you can do before and after matches on the date as well as sort on the date.
To index a property as a date, you can create a range index on the date. You can then use a range query on the date.
In MarkLogic 9, you can also use TDE to project rows from the documents with a column for the dates.
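To illustrate the idea of treating dateSubmitted as a date rather than as text, here is a plain-Java sketch (not MarkLogic API code). It converts the stored "yyyy/MM/dd HH:mm:ss" value to ISO-8601, which is the form a dateTime range index would be expected to need, and shows the before/after comparison that a range query performs. The class and method names are hypothetical.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

/** Hypothetical helper: handles dateSubmitted as a date-time instead of text. */
final class SubmittedDates {
    private static final DateTimeFormatter STORED =
            DateTimeFormatter.ofPattern("yyyy/MM/dd HH:mm:ss");
    private static final DateTimeFormatter DAY =
            DateTimeFormatter.ofPattern("yyyy/MM/dd");

    private SubmittedDates() {}

    /** Converts "2017/10/11 09:15:14" to the ISO-8601 form "2017-10-11T09:15:14". */
    static String toIso(String stored) {
        return DateTimeFormatter.ISO_LOCAL_DATE_TIME.format(LocalDateTime.parse(stored, STORED));
    }

    /** True when the stored value falls on the given day: start <= value < start + 1 day. */
    static boolean isOnDay(String stored, String day) {
        LocalDateTime value = LocalDateTime.parse(stored, STORED);
        LocalDateTime start = LocalDate.parse(day, DAY).atStartOfDay();
        LocalDateTime end = start.plusDays(1);
        return !value.isBefore(start) && value.isBefore(end);
    }
}
```

With the three sample values from the question and the day "2017/10/11", isOnDay accepts the values from 1.json and 2.json and rejects the one from 3.json, which is the result the word query failed to produce.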
Hoping that helps.

Setting last modification timestamp when using insertOne and findOneAndUpdate

I need all my inserted/updated documents in MongoDB to have an auto-updated currentDate.
So let's assume I have the following Json shipment object (which I'm getting from a 3rd party restful API) :
String jsonString = {"tracking_number": "123", "deliveryAddress": { "street_line_1": "12 8th St.", "city": "NY", "state": "NY" }, "cutomers": [ { "firstName": "John", "email": "john@gmail.com" }, { "firstName": "Alex", "email": "alex@gmail.com" } ] }
Problem #1, I need to insert the object into the DB and set "currentDate", but insertOne does not work for me:
MongoClient mongo = new MongoClient(mongodb_host, mongodb_port);
MongoDatabase db = mongo.getDatabase("Test");
MongoCollection<Document> collection = db.getCollection("InsertOneExample");
Document doc = Document.parse(jsonString);
doc.append("lastModifiedTs", new BSONTimestamp());
collection.insertOne(doc);
System.out.println(doc);
This one does not populate "lastModifiedTs", as you can see below:
Document{{tracking_number=123, deliveryAddress=Document{{street_line_1=12 8th St., city=NY, state=NY}}, cutomers=[Document{{firstName=John, email=john@gmail.com}}, Document{{firstName=Alex, email=alex@gmail.com}}], lastModifiedTs=TS time:null inc:0, _id=5a6b88a66cafd010f1f2cffd}}
Problem #2
If I'm getting an update on my shipment, the tracking number is the same, but all the other fields may change.
The following code crashes:
FindOneAndUpdateOptions options = new FindOneAndUpdateOptions();
options.returnDocument(ReturnDocument.AFTER);
options.upsert(true);
Bson update = Updates.combine(Document.parse(jsonString), Updates.currentTimestamp("lastModifiedTs"));
Document query = new Document("tracking_number", "123");
Document result = collection.findOneAndUpdate(query, update, options);
With the exception: "Invalid BSON field name equipmentShipmentAddress"
So it looks like I cannot just put the entire updated document into the "update".
If I set update just to Updates.currentTimestamp("lastModifiedTs"), the code will update just the field "lastModifiedTs", but I need it to modify all the fields.
If I set the query to be the new object, then due to my "upsert" setting, it'll add the new document without replacing the old one.
Note: needless to say, I can perform several operations: (1) insert the object and get the "_id" field, (2) update the "lastModifiedTs" field, (3) read the object by "_id" and get the updated "lastModifiedTs" value. But that is three operations where I expect to be able to achieve everything with a single one.
How can I achieve my goal elegantly?
Thanks
Solution to Problem #1 - Insert new Date() to provide new datetime.
Document doc = Document.parse(jsonString);
doc.append("lastModifiedTs", new Date());
collection.insertOne(doc);
Solution to Problem #2 - Use findOneAndReplace
FindOneAndReplaceOptions options = new FindOneAndReplaceOptions();
options.returnDocument(ReturnDocument.AFTER);
Document replace = Document.parse(jsonString);
replace.append("lastModifiedTs", new Date());
Document query = new Document("tracking_number", "123");
Document result = collection.findOneAndReplace(query, replace, options);

Mongodb + Java Drivers. Search by date range

This is my first shot at using Mongodb with the java drivers. I can query the database via command line using javascript and the Date() object, however, I am having trouble using the driver. Based on my query, can anybody see what the problem is? Thanks
Date current = new Date();
DBCollection coll = db.getCollection("messages");
BasicDBObject query = new BasicDBObject("created_on", new BasicDBObject("$gte", new Date(current.getYear(), current.getMonth(), current.getDate())).
append("created_on", new BasicDBObject("$lt", new Date(current.getYear(), current.getMonth() - 1, current.getDate()))));
System.out.println("Query: " + query);
DBCursor cursor = coll.find(query);
Query: { "created_on" : { "$gte" : { "$date" : "2012-12-06T05:00:00.000Z"} , "created_on" : { "$lt" : { "$date" : "2012-11-06T05:00:00.000Z"}}}}
P.S. In case it is not obvious, I'm trying to find all of the records within the last month.
It seems you are constructing the query incorrectly. Please try the one below:
BasicDBObject query = new BasicDBObject("created_on",
new BasicDBObject("$gte", new DateTime().toDate()).append("$lt", new DateTime().toDate()));
DateTime comes from Joda-Time, a library which simplifies date manipulation in Java. You can check it out here:
http://joda-time.sourceforge.net/
Also, Morphia is a nice Java object-document mapper (ODM) framework for working with MongoDB through the Java driver. It simplifies querying from Java.
https://github.com/jmkgreen/morphia
Based on the query that was output, you are looking for a document with a field created_on that also has a child named created_on. I assume no such document exists. In other words, your query is not correctly formed.
Your query object should look like this:
// Lower bound: the same day one month ago. Upper bound: today.
BasicDBObject dateRange = new BasicDBObject("$gte", new Date(current.getYear(), current.getMonth() - 1, current.getDate()));
dateRange.put("$lt", new Date(current.getYear(), current.getMonth(), current.getDate()));
BasicDBObject query = new BasicDBObject("created_on", dateRange);
Also, as a sidebar, you should probably avoid the three-argument constructor of the java.util.Date class, as it is deprecated. When working with dates in the MongoDB Java driver, I typically use the java.util.Calendar class and its getTime() method.
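As one way to avoid the deprecated constructor, the bounds for "within the last month" can be computed without it. The sketch below uses java.time rather than the Calendar class mentioned above, which is a substitution on my part; the class and method names are hypothetical, and computing the month step in UTC is an assumption.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.util.Date;

/** Hypothetical helper: bounds for "within the last month" as java.util.Date values. */
final class LastMonthRange {
    private LastMonthRange() {}

    /** Lower bound: one calendar month before the given instant, computed in UTC. */
    static Date from(Instant now) {
        return Date.from(now.atZone(ZoneOffset.UTC).minusMonths(1).toInstant());
    }

    /** Upper bound: the given instant itself. */
    static Date to(Instant now) {
        return Date.from(now);
    }
}
```

The two values then take the places of the Date arguments in the range object: LastMonthRange.from(now) for "$gte" and LastMonthRange.to(now) for "$lt", with now obtained once from Instant.now() so both bounds refer to the same moment.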
I have not used the Java driver for mongo before, but it seems that the query you have created is not correct.
Query: { "created_on" : { "$gte" : { "$date" : "2012-12-06T05:00:00.000Z"} , "created_on" : { "$lt" : { "$date" : "2012-11-06T05:00:00.000Z"}}}}
The query should in fact end up looking like:
Query: { "created_on" : {$gte: start, $lt: end}}
Where start and end are dates. It seems like the second time you refer to "created_on" is unnecessary and in fact might be breaking your query.
NOTE: I have not had the chance to test out this theory, but I am working from http://cookbook.mongodb.org/patterns/date_range/ which seems to be very relevant to the question at hand.
The Joda-Time library is very useful. Pass DateTimeZone.UTC as the time zone parameter of DateTime; once the time zone is set explicitly, you will get accurate results. Try this:
Calendar cal = Calendar.getInstance();
// Get the current year, month and day using Calendar
int year=cal.get(Calendar.YEAR);
int monthNumber=cal.get(Calendar.MONTH);
int dateNumber=cal.get(Calendar.DAY_OF_MONTH);
// Calendar months are zero-based, Joda-Time months are one-based
monthNumber+=1;
BasicDBObject query = new BasicDBObject("dateCreated",new BasicDBObject("$gte", new DateTime(year, monthNumber, dateNumber, 0, 0,DateTimeZone.UTC).toDate()).append("$lte",new DateTime(year, monthNumber, dateNumber, 23, 59,DateTimeZone.UTC).toDate()));
System.out.println("formed query: "+query);
DBCursor cursor = collection.find(query);
while(cursor.hasNext())
{
System.out.println("found doc in given time range: "+cursor.next().toString());
}
