I have the string "Jhon Abraham 18". I want to create a search query that splits the string on spaces and searches for the resulting words in an index. The search has to cover all fields of the index, because I don't know which value maps to which field.
So, I have a document:
{
  "_index": "recipient",
  "_type": "recipient",
  "_id": "37a15258d9",
  "_version": 1,
  "_score": 1,
  "_source": {
    "name": "Jhon",
    "surname": "Abraham",
    "age": "18"
  }
}
and I don't know which index fields the values Jhon, Abraham and 18 correspond to. I just have a string, and with this string I want to search across all fields of the index documents. I can split it into separate words by spaces, but I don't know the exact fields to search against. Also, I want to do it in Java.
I'd appreciate any help.
I think you should use a query_string query in Elasticsearch.
https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-query-string-query.html
This will solve your problem.
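A minimal sketch of what this could look like with the Java transport-client API (the index name recipient is taken from your example; client is assumed to be an already-connected Client). With no fields specified, query_string targets all fields and splits the input on whitespace itself:

import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.Client;
import org.elasticsearch.index.query.QueryBuilders;

// "client" is assumed to be an already-connected transport Client
SearchResponse response = client.prepareSearch("recipient")
        .setQuery(QueryBuilders.queryStringQuery("Jhon Abraham 18"))
        .execute()
        .actionGet();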
You can use a multi_match query, listing all the fields explicitly or using wildcards.
Multi Match Query
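A hedged sketch of the multi_match variant in Java; the field list here (name, surname, age) is only illustrative, and wildcard patterns are also accepted:

import org.elasticsearch.index.query.QueryBuilder;
import org.elasticsearch.index.query.QueryBuilders;

// list the fields explicitly, or use wildcard patterns instead
QueryBuilder qb = QueryBuilders.multiMatchQuery("Jhon Abraham 18",
        "name", "surname", "age");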
I'm using Spring Data Elasticsearch 3.2.7 and ElasticsearchRepository.search(), which takes a QueryBuilder as an argument (doc).
I have a BoolQueryBuilder and use it like this:
boolQuery.must(termQuery("myObject.code", value);
var results = searchRepository.search(boolQuery);
The definition of the field code is as follows:
"myObject": {
"properties": {
"code": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
The issue is that when I search with a value that has an underscore inside, for example FOO_BAR, it doesn't return any results. When I search with other values that have either a leading or a trailing underscore, it's fine.
I've read that ES may ignore the special character and split the value into separate words on it, so an exact-match search is needed. But I've also read that the keyword sub-field guarantees exactly that, so right now I'm confused.
Yes, you are correct: using the keyword sub-field you can achieve the exact match. You need to use the query below:
boolQuery.must(termQuery("myObject.code.keyword", value)); // note the addition of .keyword
var results = searchRepository.search(boolQuery);
You can use the _analyze API to see the tokens produced for your indexed documents and for your search term; basically, the tokens in the index must match the search term's tokens for ES to return a match :)
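If it helps, here is a rough sketch of calling the _analyze API from Java with the high-level REST client (6.x-style API; restClient, access to the client, and the index name my_index are assumptions on my side):

import org.elasticsearch.action.admin.indices.analyze.AnalyzeRequest;
import org.elasticsearch.action.admin.indices.analyze.AnalyzeResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;

// prints the tokens the analyzed "myObject.code" field produces for "FOO_BAR";
// compare them with the single token stored in "myObject.code.keyword"
AnalyzeRequest request = new AnalyzeRequest("my_index") // index name is a placeholder
        .field("myObject.code")
        .text("FOO_BAR");
AnalyzeResponse response = restClient.indices().analyze(request, RequestOptions.DEFAULT);
response.getTokens().forEach(token -> System.out.println(token.getTerm()));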
I'm using Solr version 6.6.0. I have a schema with title (text_general), description (text_general), and id (integer). When I search for a keyword to list the results in ascending order of the title, my code returns the error can not sort on multivalued field: title.
I have tried to set the sort using the following 3 methods
SolrQuery query = new SolrQuery();
1. query.setSort("title", SolrQuery.ORDER order);
2. query.addSort("title", SolrQuery.ORDER order);
3. SortClause ab = new SolrQuery.SortClause("title", SolrQuery.ORDER.asc);
query.addSort(ab);
but all of these return the same error.
I found a solution by referring to this answer
It says to use min/max functions.
query.setSort(field("pageTitle",min), ORDER.asc);
This is what I'm trying to set as the query, but I don't understand what the arguments used here mean.
This is the maven dependency that I'm using
<dependency>
<groupId>org.apache.solr</groupId>
<artifactId>solr-solrj</artifactId>
<version>6.5.1</version>
</dependency>
Unless title actually is multivalued (can your post have multiple titles?), you should define it as multiValued="false" in your schema. However, there's a second issue: a field of the default type text_general isn't suited for sorting, as it'll generate multiple tokens, one for each word in the title. This is useful for searching, but will give weird and non-intuitive results when sorting.
So instead, define a title_sort field and use a field type with a KeywordTokenizer and LowerCaseFilter attached (if you want case insensitive sort), or if you want case sensitive sort, use the already defined string field type for the title_sort field.
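Once such a field exists, the SolrJ side is straightforward. A small sketch, assuming the single-valued sort copy is called title_sort (the name is my assumption):

import org.apache.solr.client.solrj.SolrQuery;

// search as before, but sort on the dedicated single-valued field
SolrQuery query = new SolrQuery("your keyword");
query.addSort("title_sort", SolrQuery.ORDER.asc);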
The first thing to check is: do you really need that title field to be multivalued, or do your documents really have multiple titles? If not, you just need to fix the field definition by setting multiValued="false".
That said, sorting on a multivalued field doesn't make sense unless you determine which one of these multiple values should be used to sort on, or how to combine them into one.
Let's say we need to sort a given result set by title (alphabetically), first using a single-valued title field:
# Unsorted
"docs": [
{ "id": "1", "title": "One" },
{ "id": "2", "title": "Two" },
{ "id": "3", "title": "Three" },
]
# Sorted
"docs": [
{ "id": "1", "title": "One" },
{ "id": "3", "title": "Three" },
{ "id": "2", "title": "Two" },
]
# -> ok no problem here
Now, applying the same logic with a multi-valued field is not possible as is; you would necessarily need to determine which title to use in each document to sort them properly:
# Unsorted
"docs": [
{ "id": "1", "title": ["One", "z-One", "a-One"] },
{ "id": "2", "title": ["Two", "z-Two", "a-Two"] },
{ "id": "3", "title": ["Three", "z-Three", "a-Three"] }
]
Fortunately, Solr allows sorting results by the output of a function, meaning you can use any of Solr's function queries to reduce the multivalued title field to a single value. The answer you referred to is a good example even though it may not work for you (title would need docValues enabled, which depends on the field definition, and the max/min functions should only be used with numeric values), but just to give the idea:
# here the 2nd argument tells field() to use max() to pick a single value from title
sort=field(title,max) asc
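In SolrJ, that sort parameter can be passed as the "field" of a sort clause. This is only a sketch, and as noted above it requires docValues on title and really suits numeric fields:

import org.apache.solr.client.solrj.SolrQuery;

SolrQuery query = new SolrQuery("your keyword");
// produces sort=field(title,max) asc
query.addSort("field(title,max)", SolrQuery.ORDER.asc);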
I have some mock data for banks in elastic search which looks like this:
{
"_index": "test_data",
"_type": "test_type",
"_id": "AVobMd1YHpQD-9cT3TmO",
"_score": 1,
"_source": {
"bank_name": "BOFA",
"transactions_sent": 79,
"transactions_received": 27,
}
}
I want to be able to add the values of transactions_sent and transactions_received to get total transactions and then have an aggregation over total transactions. I'm using elasticsearch 2.4.
I kind of figured out the solution using the script query.
"sum":
"script":{
"inline": "doc['transactions_sent'].value+doc['transactions_received'].value"
}
}
The query time increases by about 8x when I aggregate using the script above, compared to aggregating on just one of transactions_sent or transactions_received. Is there any other way to do this apart from the script query?
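For what it's worth, the same inline-script sum aggregation can be expressed with the ES 2.x Java API. This is only a sketch mirroring the REST snippet above (index and type names are taken from the mock data; client is assumed to be an already-connected transport Client), and since the script still runs per document it won't address the slowdown:

import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.Client;
import org.elasticsearch.script.Script;
import org.elasticsearch.search.aggregations.AggregationBuilders;
import org.elasticsearch.search.aggregations.metrics.sum.Sum;

// sum of (transactions_sent + transactions_received) across all documents
SearchResponse response = client.prepareSearch("test_data")
        .setTypes("test_type")
        .setSize(0)
        .addAggregation(AggregationBuilders.sum("total_transactions")
                .script(new Script("doc['transactions_sent'].value + doc['transactions_received'].value")))
        .execute()
        .actionGet();
Sum total = response.getAggregations().get("total_transactions");
System.out.println(total.getValue());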
I have been working with Elasticsearch for quite some time now, and I have recently run into a problem.
I want to group by a particular column in an Elasticsearch index. The values of that particular column contain hyphens and other special characters.
SearchResponse res1 = client.prepareSearch("my_index")
.setTypes("data")
.setSearchType(SearchType.QUERY_AND_FETCH)
.setQuery(QueryBuilders.rangeQuery("timestamp").gte(from).lte(to))
.addAggregation(AggregationBuilders.terms("cat_agg").field("category").size(10))
.setSize(0)
.execute()
.actionGet();
Terms termAgg=res1.getAggregations().get("cat_agg");
for(Bucket item :termAgg.getBuckets()) {
cat_number =item.getKey();
System.out.println(cat_number+" "+item.getDocCount());
}
This is the query I have written in order to get the data grouped by the "category" column in "my_index".
The output I expected after running the code is:
category-1 10
category-2 9
category-3 7
But the output I am getting is :
category 10
1 10
category 9
2 9
category 7
3 7
I have already gone through some questions like this one, but couldn't solve my issue with those answers.
That's because your category field has the default string mapping and is analyzed, hence category-1 gets tokenized into two tokens, namely category and 1, which explains the results you're getting.
In order to prevent this, you can update your mapping to include a sub-field category.raw, which is not_analyzed, with the following command:
curl -XPUT localhost:9200/my_index/data/_mapping -d '{
  "properties": {
    "category": {
      "type": "string",
      "fields": {
        "raw": {
          "type": "string",
          "index": "not_analyzed"
        }
      }
    }
  }
}'
After that, you need to re-index your data and your aggregation will work and return you what you expect.
Just make sure to change the following line in your Java code:
.addAggregation(AggregationBuilders.terms("cat_agg").field("category.raw").size(10))
^
|
add .raw here
When you index "category-1" you will get (by default) two terms, "category", and "1". Therefore when you aggregate you will get back two results for that.
If you want it to be considered a single "term", then you need to change the analyzer used on that field when indexing: set it to use the keyword analyzer.
Let's say I've indexed a document like this :
{
"_index": "indexapm",
"_type": "membres",
"_id": "3708",
"_score": 1,
"_source": {
"firstname": "John",
"lastname": "GUERET-TALON"
}
}
I want to retrieve this document when searching for "GUER", "GUERET", "TAL" for example.
I have a Java application and I tried this :
MoreLikeThisQueryBuilder qb = QueryBuilders.moreLikeThisQuery(
"firstname^3",
"lastname^3")
.likeText("GUER");
SearchResponse response = client.prepareSearch("myindex")
.setTypes("mytype")
.setSearchType(SearchType.DFS_QUERY_AND_FETCH)
.setQuery(qb) // Query
.setFrom(0)
.setSize(limit)
.setExplain(true)
.execute()
.actionGet();
But this search doesn't retrieve my document. Of course, if I try an exact match query and search for "GUERET", it works.
Does anyone know what kind of query I have to use and how to make it work with the Java library? Thanks!
The More Like This Query isn't the best choice in this case.
If, as you described, you're looking for documents using the first letters of words, you should use a Prefix Query instead, but they are limited to one field. For a search on more than one field, use the MultiMatch Query (providing the PHRASE_PREFIX type). I would try something like:
QueryBuilders.multiMatchQuery("GUER", "firstname", "lastname")
.type(MatchQueryBuilder.Type.PHRASE_PREFIX);
QueryBuilders.boolQuery().should(QueryBuilders.wildcardQuery("lastname", "*GUER*"));
I got the result using WildcardQueryBuilder; it generates the following:
{"bool":{"should":[{"wildcard":{"firstname":"*GUER*"}}]}}