ElasticSearch get top 2 results per specific term in aggregations - java

Given this query, I want to retrieve only the top 2 cities from each country.
So, starting with the most popular country, retrieve its top 2 cities, then the next country and its top 2 cities, and so on.
{
"size": 0,
"aggs": {
"user_city_id": {
"terms": {
"field": "user.city_id",
"size": 999
},
"aggs": {
"user_country_id": {
"terms": {
"field": "user.country_id",
"size": 1
},
"aggs": {
"user_name": {
"terms": {
"field": "user.name",
"size": 1
}
}
}
}
}
}
},
"query": {
"bool": {
"must": [
{
"term": {
"user.category": 1
}
}
]
}
}
}

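The aggregation query below gives you the top 2 cities per country (for up to 10 countries). It aggregates on the country first and nests the city aggregation, with size 2, inside it: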
{
  "aggs": {
    "user_country_id": {
      "terms": {
        "field": "user.country_id",
        "size": 10
      },
      "aggs": {
        "user_city_id": {
          "terms": {
            "field": "user.city_id",
            "size": 2
          }
        }
      }
    }
  }
}
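To make the nesting order concrete, here is a minimal plain-Java sketch of what this two-level terms aggregation computes. The `User` class, the `topKeys` helper, and the sample data are hypothetical and only model the `user.country_id` and `user.city_id` fields. It is not Elasticsearch client code, and it ignores that real terms aggregations across several shards can be approximate.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical in-memory model of "terms on country, then terms on city".
class TopCitiesPerCountry {

    // Only the two fields the aggregation reads.
    static final class User {
        final String countryId;
        final String cityId;

        User(String countryId, String cityId) {
            this.countryId = countryId;
            this.cityId = cityId;
        }
    }

    // Returns the `size` most frequent keys, most frequent first.
    // Ties are broken by key so the result is deterministic.
    static <T> List<String> topKeys(List<T> docs, Function<T, String> key, int size) {
        Map<String, Long> counts = docs.stream()
                .collect(Collectors.groupingBy(key, Collectors.counting()));
        return counts.entrySet().stream()
                .sorted((a, b) -> {
                    int byCount = Long.compare(b.getValue(), a.getValue());
                    return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
                })
                .limit(size)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    // Outer bucket: top countries. Inner bucket: top cities within each country.
    static Map<String, List<String>> topCitiesPerCountry(
            List<User> users, int countries, int citiesPerCountry) {
        Map<String, List<String>> result = new LinkedHashMap<>();
        for (String country : topKeys(users, u -> u.countryId, countries)) {
            List<User> inCountry = users.stream()
                    .filter(u -> u.countryId.equals(country))
                    .collect(Collectors.toList());
            result.put(country, topKeys(inCountry, u -> u.cityId, citiesPerCountry));
        }
        return result;
    }

    // Made-up sample data: 6 users in US, 3 in FR.
    static List<User> sampleUsers() {
        List<User> users = new ArrayList<>();
        add(users, "US", "nyc", 3);
        add(users, "US", "la", 2);
        add(users, "US", "sf", 1);
        add(users, "FR", "paris", 2);
        add(users, "FR", "lyon", 1);
        return users;
    }

    private static void add(List<User> users, String country, String city, int n) {
        for (int i = 0; i < n; i++) {
            users.add(new User(country, city));
        }
    }

    public static void main(String[] args) {
        // prints {US=[nyc, la], FR=[paris, lyon]}
        System.out.println(topCitiesPerCountry(sampleUsers(), 10, 2));
    }
}
```

The point of the sketch is the order of grouping: the limit of 2 applies inside each country bucket, which is why the city aggregation has to be the inner one.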

Related

Couldn't find nested source for path when running inner hit query for has parent query

We are migrating to Elasticsearch 8. When we try to fetch the parent document's data through inner hits on a has_parent query, Elasticsearch throws an exception.
https://discuss.elastic.co/t/inner-hits-in-has-parent-giving-error-couldnt-find-nested-source-for-path-currentcompany/318232
I had the same problem.
The query below gave me the same error; I had to select the parent document by a field instead of by _id:
GET /user_data_factory/_search?from=0&size=20
{
"query": {
"bool": {
"must": [
{
"match": {
"relation_type": "uinsp"
}
},
{
"bool": {
"minimum_should_match": 1,
"should": [
{
"bool": {
"must": [
{
"match": {
"userInspirer": "63bef9f9a8c98000126589eb"
}
},
{
"bool": {
"minimum_should_match": 1,
"should": [
{
"has_parent": {
"parent_type": "user",
"query": {
"match": {
"_id": "63bd1ff29510390012760322"
}
},
"inner_hits": {
"_source": ["id"]
}
}
}]
}
}
]
}
}]
}
}
]
}
}
}
Using a field instead of _id:
GET /user_data_factory/_search?from=0&size=20
{
"query": {
"bool": {
"must": [
{
"match": {
"relation_type": "uinsp"
}
},
{
"bool": {
"minimum_should_match": 1,
"should": [
{
"bool": {
"must": [
{
"match": {
"userInspirer": "63bef9f9a8c98000126589eb"
}
},
{
"bool": {
"minimum_should_match": 1,
"should": [
{
"has_parent": {
"parent_type": "user",
"query": {
"match": {
"id": "63bd1ff29510390012760322"
}
},
"inner_hits": {
"_source": ["id"]
}
}
}]
}
}
]
}
}]
}
}
]
}
}
}

How to term query nested json objects/fields in elastic search?

I am doing a terms aggregation on the field [type] as shown below, but Elasticsearch returns a count of only 1 for the term instead of 2. It is not aggregating over the nested objects: comments.data.comments is a list, and it contains 2 entries with a type.
{
  "aggs": {
    "genres": {
      "terms": {
        "field": "comments.data.comments.type"
      }
    }
  }
}
You need to use the nested field type, together with a nested aggregation:
PUT events
{
"mappings": {
"properties": {
"events": {
"type": "nested",
"properties": {
"ecommerceData": {
"type": "nested",
"properties": {
"comments": {
"type": "nested",
"properties": {
"recommendationType": {
"type": "keyword"
}
}
}
}
}
}
}
}
}
}
POST events/_doc
{
"events": [
{
"eventId": "1",
"ecommerceData": [
{
"comments": [
{
"rank": 1,
"recommendationType": "abc"
},
{
"rank": 1,
"recommendationType": "abc"
}
]
}
]
}
]
}
GET events/_search
{
"size": 0,
"aggs": {
"genres": {
"nested": {
"path": "events.ecommerceData.comments"
},
"aggs": {
"nested_comments_recomms": {
"terms": {
"field": "events.ecommerceData.comments.recommendationType"
}
}
}
}
}
}
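The difference between the two counts can be illustrated with a small plain-Java sketch. It is a hypothetical model, not Elasticsearch code: each inner list stands for the `recommendationType` values of one parent document, and the two methods mimic counting per parent document versus counting per nested object.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical model of doc_count with and without the nested type.
class NestedTermCounts {

    // Plain object arrays: a parent document counts at most once per distinct value.
    static Map<String, Integer> countPerDocument(List<List<String>> docs) {
        Map<String, Integer> counts = new TreeMap<>();
        for (List<String> commentTypes : docs) {
            for (String type : new HashSet<>(commentTypes)) {
                counts.merge(type, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Nested type: every nested object is counted on its own.
    static Map<String, Integer> countPerNestedObject(List<List<String>> docs) {
        Map<String, Integer> counts = new TreeMap<>();
        for (List<String> commentTypes : docs) {
            for (String type : commentTypes) {
                counts.merge(type, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // One parent document holding two comments of type "abc", as in the POST above.
        List<List<String>> docs = Arrays.asList(Arrays.asList("abc", "abc"));
        System.out.println(countPerDocument(docs));     // prints {abc=1}
        System.out.println(countPerNestedObject(docs)); // prints {abc=2}
    }
}
```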

What is the purpose of BoolQuery's "filter" in ElasticSearch?

I read the documentation of BoolQuery, and according to it, this is the purpose:
filter
The clause (query) must appear in matching documents. However unlike
must the score of the query will be ignored. Filter clauses are
executed in filter context, meaning that scoring is ignored and
clauses are considered for caching.
Also from BoolQueryBuilder class:
/**
 * Adds a query that <b>must</b> appear in the matching documents but will
 * not contribute to scoring. No {@code null} value allowed.
 */
public BoolQueryBuilder filter(QueryBuilder queryBuilder) {
    if (queryBuilder == null) {
        throw new IllegalArgumentException("inner bool query clause cannot be null");
    }
    filterClauses.add(queryBuilder);
    return this;
}
But I can't get my head around this. When should I use filter vs. (should or must)?
Here is the example I am working on.
I want to filter records based on the following conditions:
Fetch all
1) Records where deleted=0 and isPrivate=true
AND
2) Records where (isPrivate=false or [isPrivate=true and
createdBy=loggedInUser])
Here are the 2 queries, which give the same result. I want to know what the filter query signifies.
Result without filter, using just must and should clauses:
"query": {
"bool": {
"must": [
{
"term": {
"deleted": {
"value": "0",
"boost": 1
}
}
},
{
"match": {
"isPrivate": {
"query": true
}
}
},
{
"bool": {
"should": [
{
"term": {
"isPrivate": {
"value": "false",
"boost": 1
}
}
},
{
"bool": {
"must": [
{
"term": {
"createdBy": {
"value": "1742991596",
"boost": 1
}
}
},
{
"term": {
"isPrivate": {
"value": "true",
"boost": 1
}
}
}
],
"adjust_pure_negative": true,
"boost": 1
}
}
]
}
}
]
}
},
Query using filter:
"query": {
"bool": {
"adjust_pure_negative": true,
"boost": 1,
"filter": [
{
"bool": {
"must": [
{
"term": {
"deleted": {
"value": "0",
"boost": 1
}
}
},
{
"match": {
"isPrivate": {
"query": true
}
}
}
],
"should": [
{
"term": {
"isPrivate": {
"value": "false",
"boost": 1
}
}
},
{
"bool": {
"must": [
{
"term": {
"createdBy": {
"value": "1742991596",
"boost": 1
}
}
},
{
"term": {
"isPrivate": {
"value": "true",
"boost": 1
}
}
}
],
"adjust_pure_negative": true,
"boost": 1
}
}
],
"adjust_pure_negative": true,
"boost": 1
}
}
]
}
}
In your case, you should definitely use bool/filter. None of your constraints contributes to scoring (they are all yes/no matches), and by using filter you can benefit from filter caching, which you don't get when using must.
So definitely go with the filter option, but with a slight modification: you don't really need must at all, and your boolean logic is not properly translated to bool queries.
{
"query": {
"bool": {
"minimum_should_match": 1,
"filter": [
{
"term": {
"deleted": {
"value": "0",
"boost": 1
}
}
},
{
"bool": {
"minimum_should_match": 1,
"should": [
{
"term": {
"isPrivate": {
"value": "false",
"boost": 1
}
}
},
{
"bool": {
"filter": [
{
"term": {
"createdBy": {
"value": "1742991596",
"boost": 1
}
}
},
{
"term": {
"isPrivate": {
"value": "true",
"boost": 1
}
}
}
]
}
}
]
}
}
]
}
}
}
So to sum up:
should = OR condition
must = AND condition (when scoring is desired)
filter = AND condition (when scoring is not desired and/or when you want to benefit from filter caching)
Bonus: must_not = NOT condition
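As a rough illustration of that summary, the boolean logic of the rewritten query can be modelled with plain Java predicates. This is only a sketch: the `Doc` class and `visibleTo` method are hypothetical names, scoring and caching are not modelled, and the predicate mirrors the answer's query (deleted=0 AND (isPrivate=false OR (createdBy=user AND isPrivate=true))).

```java
import java.util.function.Predicate;

// Hypothetical model of the bool/filter query as predicates.
class VisibilityFilter {

    // Only the three fields the query reads.
    static final class Doc {
        final int deleted;
        final boolean isPrivate;
        final String createdBy;

        Doc(int deleted, boolean isPrivate, String createdBy) {
            this.deleted = deleted;
            this.isPrivate = isPrivate;
            this.createdBy = createdBy;
        }
    }

    static Predicate<Doc> visibleTo(String loggedInUser) {
        // filter clause: AND, no scoring
        Predicate<Doc> notDeleted = d -> d.deleted == 0;
        // first should clause
        Predicate<Doc> isPublic = d -> !d.isPrivate;
        // second should clause: an inner bool with two filter clauses (AND)
        Predicate<Doc> ownPrivate = d -> d.isPrivate && d.createdBy.equals(loggedInUser);
        // should with minimum_should_match 1 behaves as OR
        return notDeleted.and(isPublic.or(ownPrivate));
    }

    public static void main(String[] args) {
        Predicate<Doc> visible = visibleTo("1742991596");
        System.out.println(visible.test(new Doc(0, false, "someone-else"))); // prints true
        System.out.println(visible.test(new Doc(0, true, "1742991596")));    // prints true
        System.out.println(visible.test(new Doc(0, true, "someone-else")));  // prints false
        System.out.println(visible.test(new Doc(1, false, "1742991596")));   // prints false
    }
}
```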

Combining missing term filter and range check in elastic search

I get zero results when combining a range filter and a missing filter in one query. The query is given below. The issue only occurs when the two are combined; individually, missing and range both work fine.
Any help with correcting the query or the code is appreciated. I am using Elasticsearch version 1.7.3.
{
"query": {
"filtered": {
"query": {
"match_all": {}
},
"filter": {
"bool": {
"must": [
{
"bool": {
"should": {
"missing": {
"field": "OrderData.XXXX.XXXXQueue"
}
}
}
},
{
"range": {
"OrderData.XXXX.priority": {
"from": 1,
"to": 5,
"include_lower": true,
"include_upper": true
}
}
}
]
}
}
}
}
}
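Does this query get you the expected results? It puts the missing and range filters in the same should clause, so a document matches if either one applies: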
{
"query": {
"filtered": {
"query": {
"match_all": {}
},
"filter": {
"bool": {
"must": {
"bool": {
"should": [{
"missing": {
"field": "OrderData.XXXX.XXXXQueue"
}
}, {
"range": {
"OrderData.XXXX.priority": {
"from": 1,
"to": 5,
"include_lower": true,
"include_upper": true
}
}
}]
}
}
}
}
}
}
}
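The two queries differ in how they combine the filters: the question's version requires both (must), while the one above accepts either (should). A minimal plain-Java sketch of that difference follows. The `Order` class and the sample data are made up for illustration, with a null `queue` standing in for a missing field.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical model: "missing AND range" versus "missing OR range".
class MissingAndRange {

    static final class Order {
        final String queue;     // null models a missing field
        final Integer priority;

        Order(String queue, Integer priority) {
            this.queue = queue;
            this.priority = priority;
        }
    }

    static boolean queueMissing(Order o) {
        return o.queue == null;
    }

    // Inclusive on both ends, like include_lower/include_upper = true.
    static boolean priorityInRange(Order o) {
        return o.priority != null && o.priority >= 1 && o.priority <= 5;
    }

    // must: both filters have to match.
    static long countAnd(List<Order> orders) {
        return orders.stream().filter(o -> queueMissing(o) && priorityInRange(o)).count();
    }

    // should: at least one filter has to match.
    static long countOr(List<Order> orders) {
        return orders.stream().filter(o -> queueMissing(o) || priorityInRange(o)).count();
    }

    static List<Order> sampleOrders() {
        return Arrays.asList(
                new Order(null, 3),   // missing queue, priority in range
                new Order("q1", 2),   // has queue, priority in range
                new Order(null, 9),   // missing queue, priority out of range
                new Order("q2", 7));  // has queue, priority out of range
    }

    public static void main(String[] args) {
        System.out.println(countAnd(sampleOrders())); // prints 1
        System.out.println(countOr(sampleOrders()));  // prints 3
    }
}
```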

How to print the full elasticsearch request for debug in java

I use
ElasticSearchTemplate().queryForPage(SearchQuery, CLASS)
How can I print the full JSON request?
I managed to print only the filter by doing:
searchQuery.getFilter().toString()
But I can't manage to do the same with:
searchQuery.getAggregations().toString();
I would like to print something like this in the console:
"aggs": {
"agg1": {
"terms": {
"field": "basket_id_1",
"size": 0
},
"aggs": {
"basket_id_2": {
"terms": {
"field": "basket_id_2",
"size": 0
},
"aggs": {
"basket_id_3": {
"terms": {
"field": "basket_id_3",
"size": 0
}
}
}
}
}
}
}
This is what I've started using to do the same thing.
{
"top_agg": {
"terms": {
"field": "id",
"size": 100
},
"aggregations": {
"parent": {
"nested": {
"path": "transactions"
},
"aggregations": {
"totals": {
"filter": {
"terms": {
"transactions.type": [
"ttype"
]
}
},
"total_events": {
"cardinality": {
"field": "parent.field"
}
}
}
}
}
}
}
}
NativeSearchQuery query = queryBuilder.build();
// The query part already has a readable toString()
if (query.getQuery() != null) {
    log.debug(query.getQuery().toString());
}
if (query.getAggregations() != null) {
    try {
        // Serialize every aggregation into one wrapping JSON object
        XContentBuilder builder = XContentFactory.contentBuilder(XContentType.JSON);
        builder.startObject();
        for (AbstractAggregationBuilder subAgg : query.getAggregations()) {
            subAgg.toXContent(builder, ToXContent.EMPTY_PARAMS);
        }
        builder.endObject();
        log.debug(builder.string());
    } catch (IOException e) {
        log.debug("Error parsing aggs");
    }
}
Could you use SearchResponse.getAggregations().asList()?
