Elasticsearch Java query: match any value in my list

I am trying to build a query in Java that filters all hits by a list.
Let's say I have a list of different names, and I want to build a query that returns all elements whose name is stored in my list.
Since there are going to be 100+ names in this list, I just want to pass the whole list to my query.
First I tried to build a raw query in my Elasticsearch head plugin to make it easier to implement in Java.
At the moment my raw query looks like this:
{
"query": {
"bool": {
"filter": {
"term": {
"name": {
"value": [
"name1",
"name2"
]
}
}
}
}
}
}
I know that I have at least one element with the name "name1", and the same for "name2". But this query doesn't return anything.
What am I doing wrong?
Thanks,
Asiemie

The term query does not support arrays of values. However, the terms query does, so you can do the following:
{
"query": {
"bool": {
"filter": {
"terms": {
"name": [
"name1",
"name2"
]
}
}
}
}
}
https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-terms-query.html
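Since the question asks about Java, here is a minimal sketch of how the whole list could be turned into the terms query body above. It uses only the standard library and builds the JSON by hand; the class and method names (TermsQueryJson, termsFilter, quote) are made up for illustration. With the Elasticsearch Java client you would more likely pass the list to QueryBuilders.termsQuery("name", names) and add it as a filter on a bool query.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class TermsQueryJson {

    // Wraps a value in double quotes, escaping backslashes and quotes.
    // Simplified on purpose: control characters are not handled.
    static String quote(String value) {
        return "\"" + value.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // Builds a bool/filter/terms body that matches documents whose
    // field equals any of the given values.
    static String termsFilter(String field, List<String> values) {
        String array = values.stream()
                .map(TermsQueryJson::quote)
                .collect(Collectors.joining(",", "[", "]"));
        return "{\"query\":{\"bool\":{\"filter\":{\"terms\":{"
                + quote(field) + ":" + array + "}}}}}";
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("name1", "name2");
        System.out.println(termsFilter("name", names));
    }
}
```

The list can hold 100+ names without changing the code; only the array inside "terms" grows.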
You can also wrap term queries into a bool -> should query like so:
{
"query": {
"bool": {
"filter": {
"bool": {
"should": [
{
"term": {
"name": "name1"
}
},
{
"term": {
"name": "name2"
}
}
]
}
}
}
}
}

Related

Elastic Query multi_match conditional query when no results

I have following query which is used for almost all search terms.
Query
GET test_partial/_search
{
"query": {
"function_score": {
"query": {
"bool": {
"filter": [],
"must": [
{
"multi_match": {
"fields": [
"title^30",
"description^10"
],
"operator": "and",
"query": "pamers diap",
"type": "most_fields"
}
}
]
}
}
}
}
}
Document
[
{
"title": "Huggies diapers"
},
{
"title": "Huggies wipes"
},
{
"title": "papmpers wipes"
},
{
"title": "natureval diapers"
}
]
If you check the query, "operator": "and" works perfectly fine in terms of relevancy for all other search terms.
I have no "pampers diapers" document, so I get no results.
But I have a few documents with "Huggies diapers" and "pampers wipes".
If I change to "operator": "or", I get both documents in the results.
To keep relevancy high, I need to keep operator=and and switch to "OR" only when there are no results. To achieve this I currently need to make 2 ES calls. Is there a way to specify a conditional query that switches to "OR" when there are no results, so that I can avoid the second call?
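Complementing my comment, I would try something like the query below. The match clauses with "operator": "and" give precision (documents containing all terms get the highest boost), and the multi_match clause with "operator": "or" gives recall (documents containing only some of the terms are still returned).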
{
"query": {
"function_score": {
"query": {
"bool": {
"filter": [],
"should": [
{
"match": {
"title": {
"query": "natureval diapers",
"operator": "and",
"boost": 50
}
}
},
{
"match": {
"description": {
"query": "natureval diapers",
"operator": "and",
"boost": 30
}
}
},
{
"multi_match": {
"fields": [
"title^30",
"description^10"
],
"operator": "or",
"query": "natureval diapers",
"type": "most_fields"
}
}
]
}
}
}
}
}
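As a rough illustration of how the combined query could be produced for any search text in a single request, here is a stdlib-only Java sketch. The class and method names are hypothetical, the field names and boosts are copied from the query above, and the function_score wrapper and empty filter are left out for brevity.

```java
class StrictThenLenientQuery {

    // Wraps a value in double quotes, escaping backslashes and quotes (simplified).
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // Strict clause: every term must be present in the field ("operator": "and").
    static String strictMatch(String field, String text, int boost) {
        return "{\"match\":{" + quote(field) + ":{\"query\":" + quote(text)
                + ",\"operator\":\"and\",\"boost\":" + boost + "}}}";
    }

    // Lenient fallback: any term may match ("operator": "or").
    static String lenientMultiMatch(String text) {
        return "{\"multi_match\":{\"fields\":[\"title^30\",\"description^10\"],"
                + "\"operator\":\"or\",\"query\":" + quote(text)
                + ",\"type\":\"most_fields\"}}";
    }

    // All three clauses go into "should": documents matching a strict clause
    // collect the extra boost and rank first, the rest still match via "or".
    static String build(String text) {
        return "{\"query\":{\"bool\":{\"should\":["
                + strictMatch("title", text, 50) + ","
                + strictMatch("description", text, 30) + ","
                + lenientMultiMatch(text) + "]}}}";
    }

    public static void main(String[] args) {
        System.out.println(build("pampers diapers"));
    }
}
```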

Elastic Spring Data OR Java High Level REST Client?

I'm new to both Elasticsearch and Spring. I've written a JavaScript POC that converts a JSON string into an Elasticsearch query (and performs the request).
It takes a string like this:
{
"period": "years",
"format": "xml",
"criteria": {
"operator": "OR",
"operands": [
{
"operator": "AND",
"operands": [
{
"operator": "exists",
"field": "def"
},
{
"operator": "includes",
"field": "keywords",
"value": [
"abcd"
]
}
]
},
{
"operator": "AND",
"operands": [
{
"operator": "from",
"field": "links",
"value": 1
},
{
"operator": "includes",
"field": "keywords",
"value": [
"abcd",
"efgh"
]
}
]
}
]
}
}
(Note: This query may have any levels of nesting)
... and converts it into this:
{
"query": {
"constant_score": {
"filter": {
"bool": {
"should": [
{
"bool": {
"must": [
{
"bool": {
"must": [
{
"exists": {
"field": "def"
}
},
{
"range": {
"effectiveDate": {
"gte": 1543982400,
"lt": 1575518400
}
}
}
]
}
},
{
"bool": {
"must": [
{
"terms": {
"keywords.name": [
"abcd",
"efgh"
]
}
},
{
"range": {
"effectiveDate": {
"gte": 1543982400,
"lt": 1575518400
}
}
}
]
}
}
]
}
},
{
"bool": {
"must": [
{
"bool": {
"must": {
"terms": {
"links": [
11048,
34618,
34658
]
}
}
}
},
{
"bool": {
"must": [
{
"terms": {
"keywords.name": [
"abcd",
"efgh"
]
}
},
{
"range": {
"effectiveDate": {
"gte": 1543982400,
"lt": 1575518400
}
}
}
]
}
}
]
}
}
]
}
}
}
},
"size": 0,
"aggs": {
"by_id": {
"composite": {
"sources": [
{
"agg_on_id": {
"terms": {
"field": "id"
}
}
}
],
"size": 10000,
"after": {
"agg_on_id": -1
}
},
"aggs": {
"latest_snapshot": {
"top_hits": {
"sort": [
{
"effectiveDate": "desc"
}
],
"_source": true,
"size": 1
}
}
}
}
}
}
It first creates a query (similar to above) for a first trip to Elasticsearch to extract some info ('links') needed for building this query.
Each trip to Elasticsearch may return millions of results, so it does paging using the "search_after" mechanism.
I need to convert this POC to a Spring application.
Question: Which one is most appropriate for this case, Spring Data Elasticsearch or the Elasticsearch Java High Level REST Client?
Spring Data Elasticsearch seems to do a good job at creating simple queries without much effort, but would it help me in this case?
Any suggestions are much appreciated.
Thanks!
Spring Data Elasticsearch uses the high level client provided by Elasticsearch for the non-reactive implementation.
You can use the query builders from Elasticsearch together with Spring Data Elasticsearch too; this gives you the greatest flexibility.
On top of that, Spring Data Elasticsearch adds entity mapping (POJO to JSON), repository functions and the other features of Spring Data.
So the question is not whether to use one or the other, but whether you need or want the additional functionality that Spring Data Elasticsearch offers.
Edit:
When using Spring Data Elasticsearch, you configure the RestHighLevelClient to use (see the documentation) and then have it injected into your other Spring beans. So you can even mix access to ES through Spring Data's ElasticsearchOperations or repositories with direct use of the RestHighLevelClient.
I would suggest using the official Java High Level REST Client, which is actively developed at Elastic. You can also look at all the query builders it supports (it has query builders for almost all queries).
Previously Elasticsearch didn't have an official Java client, but now that they have one and are actively improving it, IMHO you should go with it, as it also provides a lot of out-of-the-box options, and who understands Elasticsearch better than the company behind it :)
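Whichever client is chosen, the arbitrarily nested criteria from the question map naturally onto a recursive translation. The sketch below is only an illustration under stated assumptions: Node and CriteriaTranslator are hypothetical names, only the AND, OR, exists and includes operators are handled, field names are used as given (the question's POC maps keywords to keywords.name), and the effectiveDate range and the "from" lookup are left out. It emits JSON with the standard library; with the Elasticsearch query builders the same recursion would build bool queries instead of strings.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

class CriteriaTranslator {

    // Hypothetical in-memory form of one node of the "criteria" tree.
    static final class Node {
        final String operator;      // "AND", "OR", "exists" or "includes"
        final String field;         // null for AND/OR groups
        final List<String> values;  // used by "includes" only
        final List<Node> operands;  // used by AND/OR groups only

        Node(String operator, String field, List<String> values, List<Node> operands) {
            this.operator = operator;
            this.field = field;
            this.values = values;
            this.operands = operands;
        }

        static Node group(String operator, Node... operands) {
            return new Node(operator, null, Collections.<String>emptyList(), Arrays.asList(operands));
        }

        static Node exists(String field) {
            return new Node("exists", field, Collections.<String>emptyList(), Collections.<Node>emptyList());
        }

        static Node includes(String field, String... values) {
            return new Node("includes", field, Arrays.asList(values), Collections.<Node>emptyList());
        }
    }

    // Wraps a value in double quotes, escaping backslashes and quotes (simplified).
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // Recursively maps a node to a query clause: AND -> bool.must, OR -> bool.should.
    static String toQuery(Node node) {
        switch (node.operator) {
            case "AND":
                return boolOf("must", node.operands);
            case "OR":
                return boolOf("should", node.operands);
            case "exists":
                return "{\"exists\":{\"field\":" + quote(node.field) + "}}";
            case "includes":
                return termsOf(node);
            default:
                throw new IllegalArgumentException("Unsupported operator: " + node.operator);
        }
    }

    // Translates every operand first, then wraps the results in one bool clause.
    private static String boolOf(String clause, List<Node> operands) {
        List<String> children = operands.stream()
                .map(CriteriaTranslator::toQuery)
                .collect(Collectors.toList());
        return "{\"bool\":{\"" + clause + "\":[" + String.join(",", children) + "]}}";
    }

    // "includes" becomes a terms clause on the node's field.
    private static String termsOf(Node node) {
        List<String> quoted = node.values.stream()
                .map(CriteriaTranslator::quote)
                .collect(Collectors.toList());
        return "{\"terms\":{" + quote(node.field) + ":[" + String.join(",", quoted) + "]}}";
    }

    public static void main(String[] args) {
        Node criteria = Node.group("OR",
                Node.group("AND", Node.exists("def"), Node.includes("keywords", "abcd")));
        System.out.println(toQuery(criteria));
    }
}
```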

Elasticsearch wildcard search on multiple fields

I'm building a filter function and what I want is a wildcard filter. If the value is "roj", all records containing "roj" in any field should be displayed.
How can I implement this?
Here's my query:
{
"query": {
"bool": {
"must": [
{
"exists": {
"field": "Project_error"
}
},
{
"wildcard": {
"api_name": {
"wildcard": "*roj*"
}
}
},
{
"wildcard": {
"error_Code": {
"wildcard": "*roj*"
}
}
}
]
}
}
}
Java code
BoolQueryBuilder bqb = new BoolQueryBuilder();
bqb.must(QueryBuilders.existsQuery("Project_error"));
if(!filter.isEmpty()) {
bqb.filter(QueryBuilders.wildcardQuery(fields[0],"*"+filter+"*"));
bqb.must(QueryBuilders.wildcardQuery(fields[1],"*"+filter+"*"));
...
}
searchSourceBuilder.query(bqb);
This code returns data only if both fields contain "roj", which is not correct.
Using the query below, you can achieve the result.
GET <index_name>/_search
{
"query": {
"bool": {
"must": [
{
"query_string": {
"query": "*roj*",
"fields":["field_1", "field_2"]
}
}
]
}
}
}
If you want to apply your query term to all fields, remove the fields attribute from the above query.
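An alternative that stays closer to the Java code in the question is to keep the wildcard clauses but put them in "should" rather than "must"/"filter", so that a match in any one field is enough. Because the bool query also has a "must" clause, the "should" clauses are optional by default, so "minimum_should_match": 1 is needed. The stdlib-only sketch below builds that body; the class and method names are hypothetical. With the query builders this would roughly correspond to calling bqb.should(QueryBuilders.wildcardQuery(field, pattern)) for each field and then bqb.minimumShouldMatch(1).

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class AnyFieldWildcard {

    // Wraps a value in double quotes, escaping backslashes and quotes (simplified).
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // One wildcard clause that matches values containing the text.
    static String wildcardClause(String field, String text) {
        return "{\"wildcard\":{" + quote(field) + ":{\"value\":" + quote("*" + text + "*") + "}}}";
    }

    // requiredField must exist; at least one of the fields must contain the text.
    static String anyFieldContains(String requiredField, List<String> fields, String text) {
        String should = fields.stream()
                .map(f -> wildcardClause(f, text))
                .collect(Collectors.joining(",", "[", "]"));
        return "{\"query\":{\"bool\":{"
                + "\"must\":[{\"exists\":{\"field\":" + quote(requiredField) + "}}],"
                + "\"should\":" + should + ","
                + "\"minimum_should_match\":1}}}";
    }

    public static void main(String[] args) {
        System.out.println(anyFieldContains(
                "Project_error", Arrays.asList("api_name", "error_Code"), "roj"));
    }
}
```

Note that leading wildcards such as *roj* can be slow on large indices, whichever query type carries them.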

How to build an Elasticsearch-Query with startsWith-functionality and special characters

I have JsonObjects that I search with Elasticsearch from a Java application, using the Java API to build search queries. The objects contain a field called "such" that holds a search string by which the JsonObject should be found; a simple example would be "STVBBM160A". Besides the usual characters a-Z and 0-9, the search string could also look like the following examples:
"STV-157ABR", "F-G/42-W3" or "DDM000.074.6652"
The search should already return results when only the first characters are typed into a search field, which it does for a search like "F-G/42".
My problem: the search sometimes doesn't return results at all, but when the last character is typed it finds the right document.
What I tried: first I wanted to use a WildcardQuery where the query would be "typedStuff*", but the WildcardQuery didn't return any results at all as soon as I typed anything but * (it used to work for other search fields with other values).
Now I am using a QueryStringQuery, which also takes the input and appends a * character. By escaping the query string, I am able to search for strings like "F-G/42" and so on, but the search for "DDM000.074.6652" doesn't return any results until Elasticsearch has the whole string to search. Also, when I type "STV", all results with "STV-xxxxx" (containing the "-" after STV) are returned, but not the object with "STVBBM160A", again until the whole string is given (no results are shown in between once the search string is "STVB").
This is the query I'm using right now:
{
"size": 1000,
"min_score": 1,
"query": {
"bool": {
"must": [
{
"query_string": {
"query": "MY_DATA_TYPE",
"fields": [
"doc.db_doc_type"
]
}
},
{
"query_string": {
"query": "MY_SPECIFIC_TYPE",
"fields": [
"doc.db_doc_specific"
]
}
}
],
"should": {
"query_string": {
"query": "STV*",
"fields": [
"doc.such"
],
"boost": 3,
"escape": true
}
}
}
}
}
This is the old query with the WildcardQuery, which doesn't return any results at all unless the query string is just *:
{
"size": 50,
"min_score": 1,
"query": {
"bool": {
"must": [
{
"query_string": {
"query": "MY_DATA_TYPE",
"fields": [
"doc.db_doc_type"
]
}
},
{
"query_string": {
"query": "MY_SPECIFIC_TYPE",
"fields": [
"doc.db_doc_specific"
]
}
}
],
"should": {
"wildcard": {
"doc.such": {
"wildcard": "STV*",
"boost": 3
}
}
}
}
}
}
When using a PrefixQuery, the search also doesn't return any results at all (with and without the *):
{
"size": 50,
"min_score": 1,
"query": {
"bool": {
"must": [
{
"query_string": {
"query": "MY_DATA_TYPE",
"fields": [
"doc.db_doc_type"
]
}
},
{
"query_string": {
"query": "MY_SPECIFIC_TYPE",
"fields": [
"doc.db_doc_specific"
]
}
}
],
"should": {
"prefix": {
"doc.such": {
"prefix": "HSTKV*",
"boost": 3
}
}
}
}
}
}
How can this query be changed to return all results starting with the specified string, no matter whether the field doc.such also contains numbers or special characters like "_", "." or "/"?
Thanks in advance
As soon as you want to query prefixes, suffixes or substrings in a serious way, you need to leverage nGrams. In your case, since you're only after prefixes, an edgeNGram tokenizer would be in order. You need to change the settings of your index to look like this:
PUT your_index
{
"settings": {
"analysis": {
"analyzer": {
"prefix_analyzer": {
"tokenizer": "prefix_tokenizer",
"filter": [
"lowercase"
]
},
"search_prefix_analyzer": {
"tokenizer": "keyword",
"filter": [
"lowercase"
]
}
},
"tokenizer": {
"prefix_tokenizer": {
"type": "edgeNGram",
"min_gram": "1",
"max_gram": "25"
}
}
}
},
"mappings": {
"your_type": {
"properties": {
"doc": {
"properties": {
"such": {
"type": "string",
"fields": {
"starts_with": {
"type": "string",
"analyzer": "prefix_analyzer",
"search_analyzer": "search_prefix_analyzer"
}
}
}
}
}
}
}
}
}
What will happen with this analyzer is that when indexing F-G/42-W3 the following tokens will be indexed: f, f-, f-g, f-g/, f-g/4, f-g/42, f-g/42-, f-g/42-w, f-g/42-w3.
At search time, we'll simply lowercase the user input and the prefix will be matched against the indexed tokens.
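To make the token list above concrete, here is a small Java sketch that imitates what the prefix_analyzer does at index time and what the keyword-plus-lowercase analyzer does at search time. It is a simplified stand-in rather than Elasticsearch code, and the class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class EdgeNGramDemo {

    // Imitates an edge n-gram tokenizer followed by a lowercase filter:
    // emits every prefix of the input with a length from minGram to maxGram.
    static List<String> edgeNGrams(String text, int minGram, int maxGram) {
        String lower = text.toLowerCase(Locale.ROOT);
        List<String> tokens = new ArrayList<>();
        int longest = Math.min(maxGram, lower.length());
        for (int len = minGram; len <= longest; len++) {
            tokens.add(lower.substring(0, len));
        }
        return tokens;
    }

    // Search side: the input is kept whole and lowercased, then has to equal
    // one of the indexed prefixes (min_gram 1, max_gram 25 as in the settings).
    static boolean matches(String indexedValue, String userInput) {
        return edgeNGrams(indexedValue, 1, 25).contains(userInput.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        System.out.println(edgeNGrams("F-G/42-W3", 1, 25));
        System.out.println(matches("DDM000.074.6652", "DDM000.07")); // true
        System.out.println(matches("STVBBM160A", "STVB"));           // true
        System.out.println(matches("STV-157ABR", "STVB"));           // false
    }
}
```

Because special characters such as "-", "/" and "." are simply part of each prefix, they need no escaping or special handling at search time.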
Then your query can simply be transformed to a match query:
{
"size": 1000,
"min_score": 1,
"query": {
"bool": {
"must": [
{
"query_string": {
"query": "MY_DATA_TYPE",
"fields": [
"doc.db_doc_type"
]
}
},
{
"query_string": {
"query": "MY_SPECIFIC_TYPE",
"fields": [
"doc.db_doc_specific"
]
}
}
],
"should": {
"match": {
"doc.such.starts_with": {
"query": "F-G/4"
}
}
}
}
}
}

Why can't I query correctly against a HashMap field in Elasticsearch?

I am using Elasticsearch v1.5.2. I have a JSON document that looks like the following.
{
"id": "RRRZe32",
"metadata": {
"published": "2010-07-29T18:11:43.000Z",
"codeId": "AChdUxnsuRyoCo7roK6gqZSg",
"codeTitle": "something"
}
}
My Java POJO object that backs this JSON looks like the following. Note that I am using Spring-Boot v1.3.0.M2 with spring-boot-starter-data-elasticsearch dependency.
@Document(indexName="ws", type="vid")
public class Video {
@Id
private String id;
@Field(type=FieldType.Object, index=FieldIndex.not_analyzed)
private Map<String, Object> metadata;
}
My mapping is defined as follows.
{
"ws": {
"mappings": {
"vid": {
"properties": {
"id": {
"type": "string"
},
"metadata": {
"properties": {
"codeId": {
"type": "string"
},
"codeTitle": {
"type": "string"
}
}
}
}
}
}
}
}
I can query the document (using Sense) successfully by metadata.codeTitle but not metadata.codeId. My query for metadata.codeTitle looks like the following.
{
"query": {
"bool": {
"must": [
{
"term": {
"metadata.codeTitle": {
"value": "something"
}
}
}
]
}
}
}
My query for metadata.codeId looks like the following.
{
"query": {
"bool": {
"must": [
{
"term": {
"metadata.codeId": {
"value": "AChdUxnsuRyoCo7roK6gqZSg"
}
}
}
]
}
}
}
Any ideas on what I am doing wrong?
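This happens because your codeId field is analyzed, so the value is lowercased at indexing time, while a term query is not analyzed and compares the value exactly as you typed it. The codeTitle query works only because "something" is already all lowercase. So you have two solutions:
1. You can query like this (i.e. all lowercase):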
{
"query": {
"bool": {
"must": [
{
"term": {
"metadata.codeId": {
"value": "achduxnsuryoco7rok6gqzsg"
}
}
}
]
}
}
}
2. Or you can declare your codeId field as not_analyzed and keep your query as it is.
In your case, it looks like option 1 will be easier to implement.
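The mismatch can be pictured with a few lines of Java. This is a deliberately simplified model rather than Elasticsearch code: it assumes the value is a single token and that analysis only lowercases it, and the class and method names are invented for the example.

```java
import java.util.Locale;

class TermVsAnalyzedDemo {

    // Simplified stand-in for index-time analysis of a one-word value:
    // the token is lowercased before it is stored in the index.
    static String indexedToken(String original) {
        return original.toLowerCase(Locale.ROOT);
    }

    // A term query is not analyzed: the query value is compared
    // with the stored token exactly as given.
    static boolean termMatches(String storedValue, String queryValue) {
        return indexedToken(storedValue).equals(queryValue);
    }

    public static void main(String[] args) {
        String codeId = "AChdUxnsuRyoCo7roK6gqZSg";
        System.out.println(termMatches(codeId, codeId));                           // false
        System.out.println(termMatches(codeId, codeId.toLowerCase(Locale.ROOT))); // true
        System.out.println(termMatches("something", "something"));                // true
    }
}
```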
