Elasticsearch QueryBuilder Match Multiple Terms

Given JSON in ES index in the following format:
{
"pin": {
"id": 123,
"location": {
"lat": 456,
"lon":-789
}
}
}
The following gets the document matching the id field:
client.prepareSearch("index_name")
.setTypes("pin")
.setQuery(QueryBuilders.termQuery("id", 123))
.execute()
.actionGet();
Instead, I'm trying to match on multiple fields, i.e. location.lat and location.lon.
QueryBuilders.termQuery(); // accepts only a single term
I tried a few alternatives, but none of them seems to work, e.g.:
QueryBuilder queryBuilder = QueryBuilders.boolQuery()
.must(QueryBuilders.termQuery("location.lat", 456))
.must(QueryBuilders.termQuery("location.lon", -789));
client.prepareSearch("index_name")
.setTypes("pin")
.setQuery(queryBuilder)
.execute()
.actionGet();

By default, a geo_point field is not indexed as two fields (location.lat and location.lon); it is indexed as a single field that contains both latitude and longitude.
You can index latitude and longitude separately by enabling the lat_lon mapping option. However, the latitude and longitude values in your example are out of range, so they are normalized, converted to double, and indexed as -84.0 and -69.0 instead of 456 and -789. If you enable lat_lon and use these values in the queries, you should get results.
Note that latitude and longitude values are converted to double before indexing, so term queries may not be practical in the long run because you always have to account for rounding errors. Range queries or Elasticsearch geospatial queries are likely to be more useful.
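The rounding concern can be illustrated in plain Java, without any Elasticsearch client. This is only a sketch: the class and method names are hypothetical, exactMatch stands in for a term query on a double field, and rangeMatch stands in for a range query with gte/lte bounds around the target value.

```java
// Plain-Java illustration of exact versus range matching on doubles.
// No Elasticsearch API is used; all names here are hypothetical.
class CoordinateMatch {
    // Exact equality, analogous to a term query on a double field.
    static boolean exactMatch(double indexed, double queried) {
        return indexed == queried;
    }

    // Tolerance window, analogous to a range query with gte/lte bounds.
    static boolean rangeMatch(double indexed, double queried, double tolerance) {
        return indexed >= queried - tolerance && indexed <= queried + tolerance;
    }

    public static void main(String[] args) {
        double indexed = 0.1 + 0.2; // a value that went through floating-point arithmetic
        double queried = 0.3;
        System.out.println(exactMatch(indexed, queried));       // prints "false"
        System.out.println(rangeMatch(indexed, queried, 1e-9)); // prints "true"
    }
}
```

The tolerance of 1e-9 is an arbitrary choice for the illustration; a real query would pick bounds that suit the precision of the stored coordinates.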

Check out the bool query in Elasticsearch. You can specify must, should, or must_not clauses to get the appropriate mix of AND/OR/NOT for your query.
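As a rough mental model, the bool query clauses behave like predicate composition: must clauses are ANDed, should clauses act like OR when no must clause is present, and must_not negates. The sketch below uses only java.util.function.Predicate, not the Elasticsearch client; the Pin class and the helper names are hypothetical and only mirror the sample document from the question.

```java
import java.util.function.Predicate;

// Plain-Java analogy for must / should / must_not; not Elasticsearch code.
class BoolLogic {
    // Hypothetical in-memory stand-in for the "pin" document.
    static final class Pin {
        final int id;
        final double lat;
        final double lon;

        Pin(int id, double lat, double lon) {
            this.id = id;
            this.lat = lat;
            this.lon = lon;
        }
    }

    static Predicate<Pin> idEquals(int value) {
        return pin -> pin.id == value;
    }

    static Predicate<Pin> latEquals(double value) {
        return pin -> pin.lat == value;
    }

    static Predicate<Pin> lonEquals(double value) {
        return pin -> pin.lon == value;
    }

    public static void main(String[] args) {
        Pin pin = new Pin(123, 456, -789);
        // Two "must" clauses: both have to hold.
        Predicate<Pin> both = latEquals(456).and(lonEquals(-789));
        // Two "should" clauses without a "must": at least one has to hold.
        Predicate<Pin> either = latEquals(0).or(lonEquals(-789));
        // A "must_not" clause: the condition must not hold.
        Predicate<Pin> notOther = idEquals(999).negate();
        System.out.println(both.test(pin));     // prints "true"
        System.out.println(either.test(pin));   // prints "true"
        System.out.println(notOther.test(pin)); // prints "true"
    }
}
```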

QueryBuilder qb = boolQuery()
.must(termsQuery("content", Arrays.asList("test1", "test4")));
https://www.elastic.co/guide/en/elasticsearch/guide/current/_finding_multiple_exact_values.html

You can use QueryBuilders.multiMatchQuery instead of QueryBuilders.termQuery

Related

Exception in a basic query with relational operator in marklogic nosql database in Java application

I am new to MarkLogic. I am trying to execute a simple less-than/greater-than query against a MarkLogic NoSQL database from a Java application.
Java: v14
Marklogic: v9
Let's say I have a "user" database, and a sample document looks like this:
{
"name": "some name",
"dateOfBirth": "1991-07-01",
...
...
}
The SQL version of my expected query is
select * from user where dateOfBirth > "1980-01-01"
I used the code below in Java:
StructuredQueryBuilder qb = new StructuredQueryBuilder();
StructuredQueryDefinition structuredQueryDefinition = qb.range(qb.pathIndex("/dateOfBirth"),
"xs:string",
(String[]) null,
StructuredQueryBuilder.Operator.GT,
eachCriteria.getValue());
markLogicTemplate.search(CombinedQueryDefinitionBuilder.combine(structuredQueryDefinition), User.class);
I created the path index using the code below:
xquery version "1.0-ml";
import module namespace admin = "http://marklogic.com/xdmp/admin"
at "/MarkLogic/admin.xqy";
let $config := admin:get-configuration()
let $dbid := xdmp:database("user")
let $pathspec := admin:database-range-path-index(
$dbid,
"string",
"/dateOfBirth",
"http://marklogic.com/collation/",
fn:false(),
"ignore")
return
admin:database-add-range-path-index($config, $dbid, $pathspec)
I get the exception below in Java:
com.marklogic.client.FailedRequestException: Local message: search failed: Bad Request. Server Message: XDMP-PATHRIDXNOTFOUND: cts:search(fn:collection(), cts:and-query((cts:collection-query("User"), cts:path-range-query("/dateOfBirth", ">", "1980-01-01", ("collation=http://marklogic.com/collation/"), 1)), ()), ("unfiltered", cts:score-order("descending")), xs:double("0"), ()) -- No string path range index for /dateOfBirth collation=http://marklogic.com/collation/
I tried restarting the MarkLogic server after creating the index, but still no luck.
Thanks in advance for your help.

To use a range index in a query, the range index and query must specify
the same data type
the same collation for a string data type
I'm guessing that the data type should be date for this range index and query (so a greater date matches even if the string is lesser). String values in JSON documents can be indexed as date values (and as many other data types).
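The difference between comparing as strings and comparing as dates can be seen in plain Java. This is an analogy only: String.compareTo orders by UTF-16 code units, which is not the same as a MarkLogic collation, and java.time.LocalDate merely stands in for a date-typed index. The class and method names are hypothetical.

```java
import java.time.LocalDate;

// Plain-Java analogy for string-typed versus date-typed comparison.
class DateVersusString {
    // Compares the raw text, character by character.
    static boolean laterAsString(String a, String b) {
        return a.compareTo(b) > 0;
    }

    // Parses both values as ISO-8601 dates and compares them chronologically.
    static boolean laterAsDate(String a, String b) {
        return LocalDate.parse(a).isAfter(LocalDate.parse(b));
    }

    public static void main(String[] args) {
        // Zero-padded four-digit years: both comparisons agree.
        System.out.println(laterAsString("1991-07-01", "1980-01-01")); // prints "true"
        System.out.println(laterAsDate("1991-07-01", "1980-01-01"));   // prints "true"
        // A five-digit year is written with a leading '+' in ISO-8601.
        // As text it sorts before "1980-01-01", as a date it is later.
        System.out.println(laterAsString("+10000-01-01", "1980-01-01")); // prints "false"
        System.out.println(laterAsDate("+10000-01-01", "1980-01-01"));   // prints "true"
    }
}
```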
Depending on your requirements, you might consider indexing the documents with TDE and projecting rows out of the documents. In the Java API, you can then use RowManager to retrieve the rows. See:
https://docs.marklogic.com/guide/app-dev/TDE
https://docs.marklogic.com/guide/java/OpticJava
Hoping that helps,

There are fundamental issues with the aforesaid range index and Java implementation:
The "/dateOfBirth" is W3C non-compliant XML/XPath. XML document must have a root element. Java API will not return any result with invalid XPath.
My sample documents:
/person1.xml
<person>
<name>Alice Alice</name>
<dob>1991-07-01</dob>
</person>
/person2.xml
<person>
<name>Lewis Carroll</name>
<dob>1981-07-01</dob>
</person>
You can create either date or string type of range index depending on the application needs. The Java query notations must conform to MarkLogic Java API standard.
Solution one: I create a string path range index -> /person/dob
Java code:
StructuredQueryDefinition queryDef = sqb.range(sqb.pathIndex("/person/dob"), "xs:string", Operator.GT, "1980-01-01");
Java Logging:
Query result:
<search:response snippet-format="snippet" total="2" start="1" page-length="10" xmlns:search="http://marklogic.com/appservices/search">
<search:result index="1" uri="/person1.xml" path="fn:doc(&quot;/person1.xml&quot;)" score="0" confidence="0" fitness="0" href="/v1/documents?uri=%2Fperson1.xml" mimetype="application/xml" format="xml">
<search:snippet>
<search:match path="fn:doc(&quot;/person1.xml&quot;)/person/dob"><search:highlight>1991-07-01</search:highlight></search:match>
</search:snippet>
</search:result>
<search:result index="2" uri="/person2.xml" path="fn:doc(&quot;/person2.xml&quot;)" score="0" confidence="0" fitness="0" href="/v1/documents?uri=%2Fperson2.xml" mimetype="application/xml" format="xml">
<search:snippet>
<search:match path="fn:doc(&quot;/person2.xml&quot;)/person/dob"><search:highlight>1981-07-01</search:highlight></search:match>
</search:snippet>
</search:result>
...............
</search:response>
Session completed: 2020-09-07T15:49:26.409488
Solution two: I create a date path range index -> /person/dob
Java API produces the same result in this scenario.
Java code:
StructuredQueryDefinition queryDef = sqb.range(sqb.pathIndex("/person/dob"), "xs:date", Operator.GT, "1980-01-01");
You can always use other query constructs to achieve the desired results.

MongoCollection : How to get value of nested key

I have some mongo data that looks like this
{
"_id": {
"$oid": "5984cfb276c912dd03c1b052"
},
"idkey": "123",
"objects": [{
"key1": "481334",
"key2": {
"key3":"val3",
"key4": "val4"
}
}]
}
I want to know the value of key4. I also need to filter the results by idkey and key1. So I tried
doc = mongoCollection.find(and(eq("idKey", 123),eq("objects.key1", 481334))).first();
and this works. But I want to check the value of key4 without having to unwrap the entire object. Is there a query I can run that gives me just the value of key4? Note that I can update the value of key4 as
mongoCollection.updateOne(and(eq("idKey", 123), eq("objects.key1", 481334)),Updates.set("objects.$.key2.key4", "someVal"));
Is there a similar query I can run just to get the value of key4?
Update
Thanks a lot @dnickless for your help. I tried both of your suggestions, but I am getting null. Here is what I tried:
existingDoc = mongoCollection.find(and(eq("idkey", 123), eq("objects.key1", 481334))).first();
this gives me
Document{{_id=598b13ca324fb0717c509e2d, idkey="2323", objects=[Document{{key1="481334", key2=Document{{key3=val3, key4=val4}}}}]}}
So far so good. Next I tried
mongoCollection.updateOne(and(eq("idkey", "123"), eq("objects.key1", "481334")),Updates.set("objects.$.key2.key4", "newVal"));
Now I tried to get the updated document as
updatedDoc = mongoCollection.find(and(eq("idkey", "123"),eq("objects.key1","481334"))).projection(Projections.fields(Projections.excludeId(), Projections.include("key4", "$objects.key2.key4"))).first();
for this i got
Document{{}}
and finally I tried
updatedDoc = mongoCollection.aggregate(Arrays.asList(Aggregates.match(and(eq("idkey", "123"), eq("objects.key1", "481334"))),
Aggregates.unwind("$objects"), Aggregates.project(Projections.fields(Projections.excludeId(), Projections.computed("key4", "$objects.key2.key4")))))
.first();
and for this I got
Document{{key4="newVal"}}
So I'm happy :) But can you think of a reason why the first approach did not work?
Final answer
Thanks for the update @dnickless
document = collection.find(and(eq("idkey", "123"), eq("objects.key1", "481334"))).projection(fields(excludeId(), include("key4", "objects.key2.key4"))).first();

Your data sample contains a lowercase "idkey", whereas your query uses "idKey". In my examples below, I use the lowercase version. You are also querying for the integers 123 and 481334, but your sample data stores these values as strings. The code below uses the string versions so that it works against the provided sample data.
You have two options:
Either you simply limit your result set but keep the same structure using a simple find + projection:
document = collection.find(and(eq("idkey", "123"), eq("objects.key1", "481334"))).projection(fields(excludeId(), include("objects.key2.key4"))).first();
Or, probably nicer in terms of output (not necessarily speed, though), you use the aggregation framework in order to really just get what you want:
document = collection.aggregate(Arrays.asList(match(and(eq("idkey", "123"), eq("objects.key1", "481334"))), unwind("$objects"), project(fields(excludeId(), computed("key4", "$objects.key2.key4"))))).first();
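With the first option the projected result keeps its nesting, so the value still has to be reached by walking objects, then key2, then key4. The sketch below shows that walk with plain java.util maps and lists standing in for the driver's Document type (which implements Map<String, Object>). The class name, the method name, and the choice to read only the first array element are assumptions made for this illustration.

```java
import java.util.List;
import java.util.Map;

// Walks a projected result of the shape {objects: [{key2: {key4: ...}}]}.
// Plain collections stand in for the MongoDB Document type.
class NestedValue {
    @SuppressWarnings("unchecked")
    static String key4Of(Map<String, Object> doc) {
        List<Map<String, Object>> objects = (List<Map<String, Object>>) doc.get("objects");
        if (objects == null || objects.isEmpty()) {
            return null; // nothing was projected
        }
        // Assumption: only the first array element is of interest.
        Map<String, Object> key2 = (Map<String, Object>) objects.get(0).get("key2");
        return key2 == null ? null : (String) key2.get("key4");
    }

    public static void main(String[] args) {
        Map<String, Object> projected =
                Map.of("objects", List.of(Map.of("key2", Map.of("key4", "val4"))));
        System.out.println(key4Of(projected)); // prints "val4"
    }
}
```

The aggregation variant avoids this walk because the computed field places the value at the top level of the result.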

Range queries in Elasticsearch Java API

I have two fields in my ES index: min_duration and max_duration. I want to create a query that finds all documents for an input duration such that:
min_duration<=duration<=max_duration
For example, if duration is 30 seconds, I should get all docs whose min_duration is less than or equal to 30 and whose max_duration is greater than or equal to 30.
I am using the ES Java API, and a range filter seems like the way to go. I have constructed the range filter as follows:
val filter = FilterBuilders.andFilter( FilterBuilders.rangeFilter("min_duration").lte(duration),FilterBuilders.rangeFilter("max_duration").gte(duration))
However, it still does not seem to work for me. Is this the correct way to build this type of query, or am I missing something?
Thanks.

Try doing it with a bool query. Wrap your two range clauses inside it like this:
QueryBuilder qb = boolQuery()
.must(rangeQuery("max_duration").gte(duration))
.must(rangeQuery("min_duration").lte(duration));
Does this help?
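The condition the two must clauses express can be checked in plain Java against in-memory data. This is a sketch of the logic only, not of the Elasticsearch API; the Doc class, its fields, and the sample values are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// In-memory version of: min_duration <= duration <= max_duration.
class DurationFilter {
    // Hypothetical stand-in for an indexed document.
    static final class Doc {
        final String id;
        final int minDuration;
        final int maxDuration;

        Doc(String id, int minDuration, int maxDuration) {
            this.id = id;
            this.minDuration = minDuration;
            this.maxDuration = maxDuration;
        }
    }

    // Both bounds must hold, like two "must" range clauses.
    static boolean matches(Doc doc, int duration) {
        return doc.minDuration <= duration && duration <= doc.maxDuration;
    }

    static List<String> matchingIds(List<Doc> docs, int duration) {
        List<String> ids = new ArrayList<>();
        for (Doc doc : docs) {
            if (matches(doc, duration)) {
                ids.add(doc.id);
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        List<Doc> docs = List.of(
                new Doc("a", 10, 20),
                new Doc("b", 25, 40),
                new Doc("c", 30, 30));
        System.out.println(matchingIds(docs, 30)); // prints "[b, c]"
    }
}
```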

How to get facet value for all fields using Java API

I'm new to Elasticsearch and tried to query some sample documents. I issued the following query using the Java API. It returned the correct result: the names of all categories. Now I want the count of all names in a category. Could you explain how to do that? I'm sorry for my bad English.
SearchResponse sr = client.prepareSearch()
.addField("Category")
.setQuery(QueryBuilders.matchAllQuery())
.addFacet(FacetBuilders.termsFacet("f")
.field("Category"))
.execute()
.actionGet();

Look at the Count API if you do not want the result set (the matched documents) but only the count.
If you do want the result set, the response to every filter or query request also includes the result size.

Max/Min value(without field name) in mongodb with java

I am using the MongoDB query below in Java to find the maximum value of the price field:
DBCursor cursor = coll.find(query,fields).sort(new BasicDBObject("price",1)).limit(1);
The fields argument passed to coll.find contains only the price field.
So I get output of the form:
{"price" : value}
Is there any way to get only the value, without the field name and braces, so that it can be assigned to a variable or returned to the calling function?
Or is there any other query or mechanism that I can use for the same purpose?
Please suggest.
Thanks & Regards

You can get value of price from the DBCursor object as follows.
while (cursor.hasNext()) {
Double price = (Double) cursor.next().get("price");
}
In the mongo shell you can do it as follows:
db.priceObj.find({},{_id:0, price:1}).sort({price:-1}).limit(1)[0].price
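One caveat with the (Double) cast in the Java snippet: field values come back typed as Object, and the cast throws a ClassCastException if the price happens to be stored as an integer. A more defensive extraction goes through Number. The sketch below uses a plain Map as a stand-in for the returned document, and the class and method names are hypothetical.

```java
import java.util.Map;

// Extracts a numeric "price" regardless of whether it is stored as
// an Integer, Long, or Double. A Map stands in for the returned document.
class PriceValue {
    static Double priceOf(Map<String, Object> doc) {
        Object raw = doc.get("price");
        // Number covers Integer, Long, and Double alike.
        return (raw instanceof Number) ? ((Number) raw).doubleValue() : null;
    }

    public static void main(String[] args) {
        Map<String, Object> storedAsInt = Map.of("price", 42);
        Map<String, Object> storedAsDouble = Map.of("price", 19.99);
        System.out.println(priceOf(storedAsInt));    // prints "42.0"
        System.out.println(priceOf(storedAsDouble)); // prints "19.99"
    }
}
```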

You cannot do this, because MongoDB communicates using BSON.
A single bare value like the one you want would not be valid BSON. It is easy enough to extract it on your side.
