Get the field names in elasticsearch where search value matches - java

I want to get the names of the fields in which my search value matches. The fields use a lowercase_keyword analyzer.
I tried a highlight query, but it only works for exact matches.
{
  "query": {
    "query_string" : {
      "query" : "cosmic"
    }
  },
  "highlight" : {
    "fields" : {
      "*" : { }
    },
    "require_field_match": false
  }
}
I want a solution that also returns the field names for nested fields and for partially matching fields, not only for exactly matching fields.

Related

How to get an exact-match result from a field that holds an array value in Elasticsearch

In my documents, I have a field that holds an array of strings, like
weapon : ["Bland Blade", "Defender Quelthalas", "Thousand Lies", "Frozen Bonespike"]
I would like to get all the documents whose weapon field contains "Frozen Bonespike",
but not documents that only have a single word of it, like "Frozen" or "Bonespike",
and not documents whose value merely contains the phrase, like "Winter Frozen Bonespike".
I want the documents in which one array value is exactly equal to the query string.
Do you have any idea?
To achieve that, the weapon field has to be indexed as type keyword, and you then run a term query against it.
By default (dynamic mapping), Elasticsearch creates both a text field (here weapon) and a keyword sub-field (weapon.keyword) for string values.
Ingesting document
POST test_seo/_doc
{
  "weapon": [
    "Bland Blade",
    "Defender Quelthalas",
    "Thousand Lies",
    "Frozen Bonespike"
  ]
}
Query
POST test_seo/_search
{
  "query": {
    "term": {
      "weapon.keyword": {
        "value": "Frozen Bonespike"
      }
    }
  }
}
If you want to search by many different weapons you can use a terms query
POST test_seo/_search
{
  "query": {
    "terms": {
      "weapon.keyword": [
        "Bland Blade",
        "Thousand Lies"
      ]
    }
  }
}
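The matching rule described in this answer can be illustrated without a cluster. The sketch below is plain Java over in-memory lists, not Elasticsearch client code; the class and method names are made up for illustration. It assumes the keyword sub-field stores each array element unchanged, so a term query matches when any single element equals the query value exactly.

```java
import java.util.List;

// In-memory illustration only; the names here are hypothetical.
class KeywordTermSketch {
    // term query on a keyword field: matches when any one array value
    // is exactly equal to the query value.
    static boolean termMatches(List<String> fieldValues, String value) {
        return fieldValues.contains(value);
    }

    // terms query: matches when any array value equals any of the query values.
    static boolean termsMatches(List<String> fieldValues, List<String> values) {
        return fieldValues.stream().anyMatch(values::contains);
    }

    public static void main(String[] args) {
        List<String> weapon = List.of("Bland Blade", "Defender Quelthalas",
                "Thousand Lies", "Frozen Bonespike");
        System.out.println(termMatches(weapon, "Frozen Bonespike")); // true
        System.out.println(termMatches(weapon, "Frozen"));           // false
        System.out.println(termMatches(
                List.of("Winter Frozen Bonespike"), "Frozen Bonespike")); // false
        System.out.println(termsMatches(weapon,
                List.of("Bland Blade", "Thousand Lies")));           // true
    }
}
```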

ElasticSearch search through an array field as exclusive search

I have an array of values in an Elasticsearch field of type keyword. I want to search this array exclusively, i.e. to exclude documents whose array holds values that are not part of my search. Please see the details below.
Thanks!
I have the following Elasticsearch index mapping:
"exgroups": {
  "type": "keyword",
  "eager_global_ordinals": true
},
With the following sample data:
"id": 1,
"exgroups": ["TSX"]
"id": 2,
"exgroups": ["TSX", "OTC", "NSD"]
My search is like this:
{
  "bool" : {
    "filter" : {
      "term" : {
        "exgroups" : {
          "value" : "TSX"
        }
      }
    }
  }
}
I've tried MatchQueryBuilder, TermQueryBuilder and TermsQueryBuilder to no avail. According to the Elasticsearch term query documentation (https://www.elastic.co/guide/en/elasticsearch/reference/6.2/query-dsl-term-query.html), a term query should do the trick, but it does not, probably because the field is an array.
In general, the Term*Query behaves like this:
iterate over all the documents; for each document,
check whether exgroups contains 'TSX';
if it does, return the document.
This returns documents 1 and 2, since document 2 also contains TSX. However, I want it to return only document 1, whose array holds TSX and nothing else.
How do I accomplish this?
Thanks in advance.
Re-index solution:
I recently found this documentation from ElasticSearch:
https://www.elastic.co/guide/en/elasticsearch/guide/current/_finding_multiple_exact_values.html
Both TermQuery and TermsQuery, and Elasticsearch in general, use 'must contain' rather than 'must equal' semantics because of the inverted index.
According to them, the best solution possible is:
If you do want that behavior—entire field equality—the best way to accomplish it involves indexing a secondary field. In this field, you index the number of values that your field contains. Using our two previous documents. Once you have the count information indexed, you can construct a constant_score that enforces the appropriate number of terms. https://www.elastic.co/guide/en/elasticsearch/guide/current/_finding_multiple_exact_values.html#_equals_exactly
Steps:
Add an additional field named exgroups_count to the index mapping.
Use Logstash to count the length of the exgroups array and store it in the exgroups_count field.
Re-index the documents.
Another solution, without re-indexing:
Adding a field and re-indexing everything has limitations. As the index grows, adding fields and computing the counts becomes intrusive and operationally expensive, and you also have to save and maintain the mapping.
I found a solution that needs no re-index. Looking at ScriptQueryBuilder, I can in principle add a script filter that counts the length of the array and checks that it equals 1.
"filter" : {
  "script" : {
    "script" : "doc['exgroups'].values.length == 1"
  }
}
So the full query now looks like this:
"bool" : {
  "must" : [
    {
      "term" : {
        "exgroups" : {
          "value" : "TSX",
          "boost" : 1.0
        }
      }
    }
  ],
  "filter" : [
    {
      "script" : {
        "script" : {
          "source" : "doc['exgroups'].values.length == 1",
          "lang" : "painless"
        },
        "boost" : 1.0
      }
    }
  ],
  "adjust_pure_negative" : true,
  "boost" : 1.0
}
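In Java:
import org.elasticsearch.index.query.BoolQueryBuilder;
import org.elasticsearch.index.query.ScriptQueryBuilder;
import org.elasticsearch.index.query.TermQueryBuilder;
import org.elasticsearch.script.Script;
// must: exgroups contains the code; filter: the exgroups array holds exactly one value
BoolQueryBuilder qBool = new BoolQueryBuilder();
TermQueryBuilder query = new TermQueryBuilder("exgroups", exchangeGroup.getCode());
qBool.must(query);
ScriptQueryBuilder sQuery = new ScriptQueryBuilder(new Script("doc['exgroups'].values.length == 1"));
qBool.filter(sQuery);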
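To make the combined condition concrete, here is a minimal in-memory sketch in plain Java. It does not call Elasticsearch; the class and method names are hypothetical, and the two sample documents are the ones from the question. A document is kept only when its array contains the value (the term clause) and the array holds exactly one value (the script filter, or equally an indexed exgroups_count of 1).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// In-memory illustration only; the names here are hypothetical.
class ExclusiveMatchSketch {
    // Returns the ids of documents whose exgroups array is exactly [value].
    static List<Integer> exclusiveMatch(Map<Integer, List<String>> docs, String value) {
        List<Integer> ids = new ArrayList<>();
        for (Map.Entry<Integer, List<String>> entry : docs.entrySet()) {
            List<String> exgroups = entry.getValue();
            boolean termHit = exgroups.contains(value);   // "must contain" semantics
            boolean singleValue = exgroups.size() == 1;   // the length == 1 filter
            if (termHit && singleValue) {
                ids.add(entry.getKey());
            }
        }
        return ids;
    }

    // The two sample documents from the question, keyed by id.
    static Map<Integer, List<String>> sampleDocs() {
        Map<Integer, List<String>> docs = new LinkedHashMap<>();
        docs.put(1, List.of("TSX"));
        docs.put(2, List.of("TSX", "OTC", "NSD"));
        return docs;
    }

    public static void main(String[] args) {
        System.out.println(exclusiveMatch(sampleDocs(), "TSX")); // [1]
        System.out.println(exclusiveMatch(sampleDocs(), "OTC")); // []
    }
}
```

For a search on several values at once, the same idea would compare the array length with the number of searched values; that variant is not covered in the thread.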

Elastic field mapping with tokenizer for numeric values, and have to execute via Java api

I am trying to create a mapping for a field that contains alphanumeric values with special characters, like the one below:
AB-7000-6000-Wk-21
I created the following pattern tokenizer for it:
"tokenizer" : {
  "code" : {
    "pattern" : "[^\\p{L}\\d]+",
    "type" : "pattern"
  }
},
This pattern works fine for alphabetic values, but it does not work for alphanumeric values with special characters.
So if I search with
GET items/_search
{
  "query" : {
    "match" : {
      "code" : "7000-6000-"
    }
  }
}
I expect AB-7000-6000-Wk-21 in the result.
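Since the pattern tokenizer splits text on matches of a Java regular expression, the pattern from the question can be tried with plain java.util.regex. The sketch below is only an approximation: the class and method names are hypothetical, and it ignores any token filters (such as lowercasing) because the rest of the analyzer is not shown in the question.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Approximates the tokenizer with a regex split; the names here are hypothetical.
class PatternTokenizerSketch {
    // Same regex as in the question: one or more characters that are
    // neither letters nor digits act as a separator.
    private static final Pattern SEPARATORS = Pattern.compile("[^\\p{L}\\d]+");

    static List<String> tokenize(String text) {
        return Arrays.stream(SEPARATORS.split(text))
                .filter(token -> !token.isEmpty())
                .collect(Collectors.toList());
    }

    // A match query with the default OR operator needs at least one shared token.
    static boolean sharesToken(String indexed, String query) {
        List<String> indexedTokens = tokenize(indexed);
        return tokenize(query).stream().anyMatch(indexedTokens::contains);
    }

    public static void main(String[] args) {
        System.out.println(tokenize("AB-7000-6000-Wk-21")); // [AB, 7000, 6000, Wk, 21]
        System.out.println(tokenize("7000-6000-"));         // [7000, 6000]
        System.out.println(sharesToken("AB-7000-6000-Wk-21", "7000-6000-")); // true
    }
}
```

Under these assumptions the query tokens 7000 and 6000 are both present in the indexed value, so the pattern alone would not explain a missing hit. It may be worth checking that the code field's mapping actually references an analyzer built on this tokenizer.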

How to get nested documents by field value from other documents (ElasticSearch)?

Example query:
localhost:8080/content/child/123
This query should return the document whose parent id equals "123". Then I need to take the id from the returned document and search Elasticsearch for documents whose parent id equals that id.
Example result:
{
  "id": "test",
  "parentId" : "123",
  "name" : "bambo"
}
{
  "id": "someId",
  "parentId" : "test",
  "name" : "bambo 2"
}
First of all, GET /content/child/123 means searching for the specific document where _index=content; _type=child; _id=123, which is not what you're looking for. The _id field is completely different from your id and parentId fields.
As far as I know, ES does not currently have a "nested search" feature as you described. You need to do two separate searches.
To search for documents with field ("parentId") containing a specific value ("123"), you need the following search query
{
  "query": {
    "bool": {
      "filter": {
        "term": {
          "parentId": "123"
        }
      }
    }
  }
}
After your first search, the response will be a JSON object resp. You may find a list of returned results in resp["hits"]["hits"]. Then, parse a result object to obtain the id field you want. For example, resp["hits"]["hits"][0]["_source"]["id"] will give you the id field.
Check out the documentation here https://www.elastic.co/guide/en/elasticsearch/reference/current/_the_search_api.html
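The two-step flow described above can be sketched as follows. This is plain Java over an in-memory list that stands in for the index; the class and method names are hypothetical, and in a real application findByParentId would send the term-filter search to Elasticsearch and read the documents from resp["hits"]["hits"].

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// In-memory stand-in for two separate searches; the names here are hypothetical.
class TwoStepLookupSketch {
    // Stand-in for one search with a term filter on parentId.
    static List<Map<String, String>> findByParentId(List<Map<String, String>> index,
                                                    String parentId) {
        return index.stream()
                .filter(doc -> parentId.equals(doc.get("parentId")))
                .collect(Collectors.toList());
    }

    // First search by the given parent id, then search again with the id
    // of every document found in the first step.
    static List<Map<String, String>> lookupChain(List<Map<String, String>> index,
                                                 String parentId) {
        List<Map<String, String>> firstHits = findByParentId(index, parentId);
        List<Map<String, String>> result = new ArrayList<>(firstHits);
        for (Map<String, String> hit : firstHits) {
            result.addAll(findByParentId(index, hit.get("id")));
        }
        return result;
    }

    // The two example documents from the question.
    static List<Map<String, String>> sampleIndex() {
        return List.of(
                Map.of("id", "test", "parentId", "123", "name", "bambo"),
                Map.of("id", "someId", "parentId", "test", "name", "bambo 2"));
    }

    public static void main(String[] args) {
        List<String> ids = lookupChain(sampleIndex(), "123").stream()
                .map(doc -> doc.get("id"))
                .collect(Collectors.toList());
        System.out.println(ids); // [test, someId]
    }
}
```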

Elasticsearch aggregation histogram by date no longer works with script

I have an ES query, which returns some 26 results.
The query has a date histogram aggregation that looks like this:
"aggregations" : {
  "by_date" : {
    "date_histogram" : {
      "field" : "startDate",
      "interval" : "month"
    }
  }
}
The aggregation element of the search result looks like this:
"aggregations": {
  "date_histogram": {
    "buckets":[
      {"key_as_string":"2016-01-01T00:00:00.000Z", "key":1451606400000, "doc_count":18},
      {"key_as_string":"2016-02-01T00:00:00.000Z", "key":1454284800000, "doc_count":8}
    ]
  }
}
So far so good. But what I want is to run a script against the search results to remove elements that do not match certain criteria. So I added this to the query:
"aggregations" : {
  "by_date" : {
    "date_histogram" : {
      "field" : "startDate",
      "interval" : "month",
      "script" : {
        "inline" : "if (condition) {return 1} else {return 0}"
      }
    }
  }
}
Unfortunately, this produces a single result bucket, and the aggregation by month is lost:
"date_histogram": {
  "buckets": [
    {"key_as_string": "1970-01-01T00:00:00.000Z", "key": 0, "doc_count": 26 }
  ]
}
What I have tried:
Reducing the inline script to just return 1. This still results in a broken aggregation.
Returning the value of the date field itself. This results in a ClassCastException: the result should be a number.
Checking the ES config settings. I have enabled everything for script.engine.groovy.{file|indexed|inline}.{aggs|mapping|search|update|plugin}, as well as script.inline, script.indexed and script.aggs.
Checking the 2.0 breaking aggregation changes, but none seem to be relevant.
I know I can run separate queries that have the filter in the query itself (rather than in the aggregation part), which would let me aggregate without a script. The point is that I have a dozen different aggregations that take the same set of search results and do different types of filtering (and aggregation). Running the same query multiple times is counterproductive and not acceptable.
As far as I know, this used to work in version 1.4.4 but no longer works in version 2.2.0.
Is this a bug? Or could the same logic be reimplemented differently, e.g. via a Bucket Script Aggregation or something else?
Have you tried the new aggregation framework with an inline ternary in a Groovy-style script?
I previously ran into the same kind of issue, and that's how I solved it.
Your aggregation query would look like this:
"aggs": {
  "2": {
    "date_histogram": {
      "field": "startDate",
      "interval": "month"
    },
    "aggs": {
      "1": {
        "sum": {
          "script": "((condition) ? 1 : 0)",
          "lang": "expression"
        }
      }
    }
  }
}
Note that you can also define your script as a .groovy file in the scripts folder of the Elasticsearch installation.
Hope that it'll help.
Regards.
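One plausible reading of the single 1970 bucket in the question, not confirmed in this thread, is that a script given together with the field supplies the value that gets bucketed, so the returned 1 or 0 is read as epoch milliseconds. The sketch below is plain Java with hypothetical names; it only shows how such values round down to a monthly key and is not Elasticsearch code.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.time.temporal.ChronoUnit;

// Illustrates monthly bucket keys in UTC; the names here are hypothetical.
class MonthBucketSketch {
    // Rounds an epoch-millisecond value down to the first instant of its UTC month.
    static long monthBucketKey(long epochMillis) {
        ZonedDateTime time = Instant.ofEpochMilli(epochMillis).atZone(ZoneOffset.UTC);
        return time.withDayOfMonth(1)
                .truncatedTo(ChronoUnit.DAYS)
                .toInstant()
                .toEpochMilli();
    }

    public static void main(String[] args) {
        // A real startDate in January 2016 lands in the 2016-01 bucket.
        System.out.println(monthBucketKey(1452000000000L)); // 1451606400000
        // Script results 1 and 0, read as milliseconds, both land in 1970-01.
        System.out.println(monthBucketKey(1L)); // 0
        System.out.println(monthBucketKey(0L)); // 0
    }
}
```

Under that assumption all 26 documents share key 0, which agrees with the output shown in the question. Moving the condition into a sub-aggregation, as in the answer above, leaves the bucketing on startDate untouched.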
