The regex works in Java but does not work in Elasticsearch.
Java:
Pattern pattern = Pattern.compile("(\\d{8}-[01],)*(((202210((2[89])|(3[01])))|(2022((1[12]))\\d{2})|(20((2[3-9])|([3-9][0-9]))\\d{4}))-[01])*([,]\\d{8}-[01])*");
Matcher matcher = pattern.matcher("20221027-0,20221028-1");
System.out.println(matcher.matches());
It prints true.
But when I use the same regex in Elasticsearch, it does not work.
The following JSON is the document I want to query in Elasticsearch.
{
"_index": "eagle_clue_v1",
"_type": "_doc",
"_id": "51740",
"_score": 0.0,
"_source": {
"id": 51740,
"next_follow_time": "20221027-0,20221028-1"
}
}
The following query does not work:
POST /eagle_clue_v1/_search
{
"from": 0,
"size": 10,
"query": {
"bool": {
"must": [
{
"bool": {
"filter": [
{
"terms": {
"id": [
"51740"
]
}
},
{
"regexp": {
"next_follow_time.keyword": {
"value": "(\\d{8}-[01],)*(((202210((2[89])|(3[01])))|(2022((1[12]))\\d{2})|(20((2[3-9])|([3-9][0-9]))\\d{4}))-[01])([,]\\d{8}-[01])*"
}
}
}
]
}
}
]
}
}
}
Check the regular expression syntax page in the Elasticsearch documentation.
Use [0-9] instead of \d:
{
"from": 0,
"size": 10,
"query": {
"bool": {
"must": [
{
"bool": {
"filter": [
{
"regexp": {
"next_follow_time.keyword": {
"value": """([0-9]{8}-[01],)*(((202210((2[89])|(3[01])))|(2022((1[12]))[0-9]{2})|(20((2[3-9])|([3-9][0-9]))[0-9]{4}))-[01])([,][0-9]{8}-[01])*"""
}
}
}
]
}
}
]
}
}
}
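As a rough way to check the [0-9] rewrite before sending it to Elasticsearch, the substitution can be tried in plain Java first. This is only a sketch: the class and method names (LuceneSafeRegex, expandDigitShorthand) are made up for illustration, and it only shows that the rewritten pattern still matches the sample value under java.util.regex. It does not prove that Lucene's regexp engine accepts every construct.

```java
import java.util.regex.Pattern;

class LuceneSafeRegex {
    // Swap the \d shorthand for an explicit [0-9] class. Assumes \d never
    // appears inside another character class, which holds for the pattern here.
    static String expandDigitShorthand(String regex) {
        return regex.replace("\\d", "[0-9]");
    }

    public static void main(String[] args) {
        String javaRegex = "(\\d{8}-[01],)*(((202210((2[89])|(3[01])))"
                + "|(2022((1[12]))\\d{2})|(20((2[3-9])|([3-9][0-9]))\\d{4}))-[01])"
                + "([,]\\d{8}-[01])*";
        String portable = expandDigitShorthand(javaRegex);
        // The string to paste into the "value" field of the regexp query.
        System.out.println(portable);
        // Pattern.matches anchors at both ends, like the regexp query does.
        System.out.println(Pattern.matches(portable, "20221027-0,20221028-1"));
    }
}
```

Pattern.matches is used rather than find() because the regexp query always matches against the whole term, so an unanchored check in Java could pass where Elasticsearch would not.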
We are migrating to Elasticsearch 8. When we try to fetch the parent document's inner hits using a has_parent query, Elasticsearch returns an exception.
https://discuss.elastic.co/t/inner-hits-in-has-parent-giving-error-couldnt-find-nested-source-for-path-currentcompany/318232
I had the same problem.
The query below gives me the same error. I had to select the parent document by a field instead of _id.
GET /user_data_factory/_search?from=0&size=20
{
"query": {
"bool": {
"must": [
{
"match": {
"relation_type": "uinsp"
}
},
{
"bool": {
"minimum_should_match": 1,
"should": [
{
"bool": {
"must": [
{
"match": {
"userInspirer": "63bef9f9a8c98000126589eb"
}
},
{
"bool": {
"minimum_should_match": 1,
"should": [
{
"has_parent": {
"parent_type": "user",
"query": {
"match": {
"_id": "63bd1ff29510390012760322"
}
},
"inner_hits": {
"_source": ["id"]
}
}
}]
}
}
]
}
}]
}
}
]
}
}
}
Using a field instead of _id:
GET /user_data_factory/_search?from=0&size=20
{
"query": {
"bool": {
"must": [
{
"match": {
"relation_type": "uinsp"
}
},
{
"bool": {
"minimum_should_match": 1,
"should": [
{
"bool": {
"must": [
{
"match": {
"userInspirer": "63bef9f9a8c98000126589eb"
}
},
{
"bool": {
"minimum_should_match": 1,
"should": [
{
"has_parent": {
"parent_type": "user",
"query": {
"match": {
"id": "63bd1ff29510390012760322"
}
},
"inner_hits": {
"_source": ["id"]
}
}
}]
}
}
]
}
}]
}
}
]
}
}
}
I am doing a terms aggregation on the field [type] as shown below, but Elasticsearch returns a count of only 1 for the term instead of 2. It is not aggregating over the nested objects: comments.data.comments is a list, and it contains 2 entries with this type.
{
"aggs": {
"genres": {
"terms": {
"field": "comments.data.comments.type"
}
}
}
}
Gotta utilize the nested field type:
PUT events
{
"mappings": {
"properties": {
"events": {
"type": "nested",
"properties": {
"ecommerceData": {
"type": "nested",
"properties": {
"comments": {
"type": "nested",
"properties": {
"recommendationType": {
"type": "keyword"
}
}
}
}
}
}
}
}
}
}
POST events/_doc
{
"events": [
{
"eventId": "1",
"ecommerceData": [
{
"comments": [
{
"rank": 1,
"recommendationType": "abc"
},
{
"rank": 1,
"recommendationType": "abc"
}
]
}
]
}
]
}
GET events/_search
{
"size": 0,
"aggs": {
"genres": {
"nested": {
"path": "events.ecommerceData.comments"
},
"aggs": {
"nested_comments_recomms": {
"terms": {
"field": "events.ecommerceData.comments.recommendationType"
}
}
}
}
}
}
I read the documentation of BoolQuery and, according to it, this is the purpose:
filter
The clause (query) must appear in matching documents. However unlike
must the score of the query will be ignored. Filter clauses are
executed in filter context, meaning that scoring is ignored and
clauses are considered for caching.
Also from BoolQueryBuilder class:
/**
* Adds a query that <b>must</b> appear in the matching documents but will
* not contribute to scoring. No {@code null} value allowed.
*/
public BoolQueryBuilder filter(QueryBuilder queryBuilder) {
if (queryBuilder == null) {
throw new IllegalArgumentException("inner bool query clause cannot be null");
}
filterClauses.add(queryBuilder);
return this;
}
but I can't get my head around this. When should I use filter vs. should or must?
Here is the example I am working on:
I want to filter records based on the following conditions:
Fetch all
1) Records where deleted=0 and isPrivate=true
AND
2) Records where (isPrivate=false or [isPrivate=true and createdBy=loggedInUser])
Here are the two queries, which give the same result. I want to know what the filter query signifies.
Query without filter, using just must and should clauses:
"query": {
"bool": {
"must": [
{
"term": {
"deleted": {
"value": "0",
"boost": 1
}
}
},
{
"match": {
"isPrivate": {
"query": true
}
}
},
{
"bool": {
"should": [
{
"term": {
"isPrivate": {
"value": "false",
"boost": 1
}
}
},
{
"bool": {
"must": [
{
"term": {
"createdBy": {
"value": "1742991596",
"boost": 1
}
}
},
{
"term": {
"isPrivate": {
"value": "true",
"boost": 1
}
}
}
],
"adjust_pure_negative": true,
"boost": 1
}
}
]
}
}
]
}
},
Query using filter:
"query": {
"bool": {
"adjust_pure_negative": true,
"boost": 1,
"filter": [
{
"bool": {
"must": [
{
"term": {
"deleted": {
"value": "0",
"boost": 1
}
}
},
{
"match": {
"isPrivate": {
"query": true
}
}
}
],
"should": [
{
"term": {
"isPrivate": {
"value": "false",
"boost": 1
}
}
},
{
"bool": {
"must": [
{
"term": {
"createdBy": {
"value": "1742991596",
"boost": 1
}
}
},
{
"term": {
"isPrivate": {
"value": "true",
"boost": 1
}
}
}
],
"adjust_pure_negative": true,
"boost": 1
}
}
],
"adjust_pure_negative": true,
"boost": 1
}
}
]
}
}
In your case, you should definitely use bool/filter: none of your constraints contributes to scoring, since they are all yes/no matches, and by using filter you can benefit from filter caches (which you don't when using must).
So definitely go with the filter option, but with a slight modification (you don't really need must at all and your boolean logic is not properly translated to bool queries):
{
"query": {
"bool": {
"minimum_should_match": 1,
"filter": [
{
"term": {
"deleted": {
"value": "0",
"boost": 1
}
}
},
{
"bool": {
"minimum_should_match": 1,
"should": [
{
"term": {
"isPrivate": {
"value": "false",
"boost": 1
}
}
},
{
"bool": {
"filter": [
{
"term": {
"createdBy": {
"value": "1742991596",
"boost": 1
}
}
},
{
"term": {
"isPrivate": {
"value": "true",
"boost": 1
}
}
}
]
}
}
]
}
}
]
}
}
}
So to sum up:
should = OR condition
must = AND condition (when scoring is desired)
filter = AND condition (when scoring is not desired and/or when you want to benefit from filter caching)
Bonus: must_not = NOT condition
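The mapping from boolean logic to bool clauses can be sketched in plain Java with predicates. This is an illustration of the logic only, not Elasticsearch client code: the Doc class, the all/any helpers, and visibleTo are hypothetical names, with all standing in for filter (AND) and any standing in for should with minimum_should_match set to 1 (OR).

```java
import java.util.List;
import java.util.function.Predicate;

class BoolLogicSketch {
    // Hypothetical stand-in for an indexed document.
    static final class Doc {
        final int deleted;
        final boolean isPrivate;
        final String createdBy;

        Doc(int deleted, boolean isPrivate, String createdBy) {
            this.deleted = deleted;
            this.isPrivate = isPrivate;
            this.createdBy = createdBy;
        }
    }

    // filter / must: every clause has to match (AND).
    static Predicate<Doc> all(List<Predicate<Doc>> clauses) {
        return doc -> clauses.stream().allMatch(c -> c.test(doc));
    }

    // should with minimum_should_match = 1: at least one clause matches (OR).
    static Predicate<Doc> any(List<Predicate<Doc>> clauses) {
        return doc -> clauses.stream().anyMatch(c -> c.test(doc));
    }

    // deleted=0 AND (isPrivate=false OR (isPrivate=true AND createdBy=user))
    static Predicate<Doc> visibleTo(String user) {
        Predicate<Doc> notDeleted = d -> d.deleted == 0;
        Predicate<Doc> isPublic = d -> !d.isPrivate;
        Predicate<Doc> isPrivate = d -> d.isPrivate;
        Predicate<Doc> ownedByUser = d -> user.equals(d.createdBy);
        return all(List.of(
                notDeleted,
                any(List.of(isPublic, all(List.of(isPrivate, ownedByUser))))));
    }

    public static void main(String[] args) {
        Predicate<Doc> visible = visibleTo("1742991596");
        System.out.println(visible.test(new Doc(0, false, "someone-else"))); // public record
        System.out.println(visible.test(new Doc(0, true, "1742991596")));    // own private record
        System.out.println(visible.test(new Doc(0, true, "someone-else")));  // someone else's private record
        System.out.println(visible.test(new Doc(1, false, "1742991596")));   // deleted record
    }
}
```

The nesting of all and any mirrors the nesting of filter and should in the query above, which makes it easier to see that no must clause is needed when nothing has to be scored.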
I'm trying to get counts grouped by the repeated items in an array, without de-duplicating them. I tried a terms aggregation, but it does not work:
GET /my_index/_search
{
"size": 0,
"aggs": {
"keywords": {
"terms": {
"field": "keywords"
}
}
}
}
The documents look like:
"keywords": [
"value1",
"value1",
"value2"
],
but the result is:
"buckets": [
{
"key": "value1",
"doc_count": 1
},
{
"key": "value2",
"doc_count": 1
}
]
How can I get a result like:
"buckets": [
{
"key": "value1",
"doc_count": 2
},
{
"key": "value2",
"doc_count": 1
}
]
In the end, I changed the mapping to use nested:
"keywords": {
"type": "nested",
"properties": {
"count": {
"type": "integer"
},
"keyword": {
"type": "keyword"
}
}
},
and query:
GET /my_index/_search
{
"size": 0,
"aggs": {
"keywords": {
"nested": {
"path": "keywords"
},
"aggs": {
"keyword_name": {
"terms": {
"field": "keywords.keyword"
},
"aggs": {
"sums": {
"sum": {
"field": "keywords.count"
}
}
}
}
}
}
}
}
result:
"buckets": [{
"key": "value1",
"doc_count": 495,
"sums": {
"value": 609
}
},
{
"key": "value2",
"doc_count": 440,
"sums": {
"value": 615
}
},
{
"key": "value3",
"doc_count": 319,
"sums": {
"value": 421
}
},
...]
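The gap between the two results comes from what doc_count measures: it counts documents that contain a term, so a value repeated inside one document's array still counts once. The plain Java sketch below contrasts the two ways of counting on the sample document. The class and method names are hypothetical, and it models the counting only, not how Elasticsearch stores or aggregates the data.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class KeywordCounts {
    // doc_count semantics: each document adds at most 1 per distinct keyword.
    static Map<String, Integer> docCounts(List<List<String>> docs) {
        Map<String, Integer> counts = new TreeMap<>();
        for (List<String> keywords : docs) {
            for (String keyword : new HashSet<>(keywords)) {
                counts.merge(keyword, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Occurrence semantics: every array element is counted, duplicates included.
    static Map<String, Integer> occurrenceCounts(List<List<String>> docs) {
        Map<String, Integer> counts = new TreeMap<>();
        for (List<String> keywords : docs) {
            for (String keyword : keywords) {
                counts.merge(keyword, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<List<String>> docs = List.of(List.of("value1", "value1", "value2"));
        System.out.println(docCounts(docs));        // {value1=1, value2=1}
        System.out.println(occurrenceCounts(docs)); // {value1=2, value2=1}
    }
}
```

Storing an explicit count per keyword in a nested object and summing it, as in the mapping above, is one way to recover the occurrence semantics on the Elasticsearch side.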
I get zero results when combining a range filter and a missing filter in one query. The query is given below. I only get this issue when combining them; individually, both missing and range work fine.
Any help correcting the query or the code is appreciated. I am on Elasticsearch version 1.7.3.
{
"query": {
"filtered": {
"query": {
"match_all": {}
},
"filter": {
"bool": {
"must": [
{
"bool": {
"should": {
"missing": {
"field": "OrderData.XXXX.XXXXQueue"
}
}
}
},
{
"range": {
"OrderData.XXXX.priority": {
"from": 1,
"to": 5,
"include_lower": true,
"include_upper": true
}
}
}
]
}
}
}
}
}
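Does this query get you the expected results? It puts both the missing and the range filter into a single should clause, so a document matches if either condition holds: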
{
"query": {
"filtered": {
"query": {
"match_all": {}
},
"filter": {
"bool": {
"must": {
"bool": {
"should": [{
"missing": {
"field": "OrderData.XXXX.XXXXQueue"
}
}, {
"range": {
"OrderData.XXXX.priority": {
"from": 1,
"to": 5,
"include_lower": true,
"include_upper": true
}
}
}]
}
}
}
}
}
}
}