Query data from Elasticsearch using the Java RestHighLevelClient - java

How can I query data from Elasticsearch based on a property that sits inside the stored object?
Format of the data stored in Elasticsearch:
{
"principals": [
{
"id": 1,
"account": {
"account_id": 2
}
}
]
}
Search query in postman:
{
"query": {
"terms": {
"account_id": [
1
]
}
}
}
This returns the required result in Postman.
How can I achieve the same in Java using RestHighLevelClient?

I am not sure how your search query above managed to fetch the corresponding document.
I indexed and searched your document this way:
Mapping:
{
"mappings": {
"properties": {
"principals": {
"properties": {
"id": { "type": "integer" },
"account": {
"properties": {
"account_id": { "type": "integer" }
}
}
}
}
}
}
}
Search query:
{
"query": {
"terms": {
"principals.account.account_id": [2]
}
}
}
Search result:
"hits": [
{
"_index": "nestedindex",
"_type": "_doc",
"_id": "2",
"_score": 1.0,
"_source": {
"principals": [
{
"id": 1,
"account": {
"account_id": 2
}
}
]
}
}
]
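The reason the full path is needed can be sketched in plain Java. This is only an illustration of the idea that fields inside objects (and arrays of objects) are addressed by their dotted path; the class name FieldPathSketch and its flatten method are hypothetical helpers and not part of any Elasticsearch API.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: lists every leaf value of a document under its dotted path.
final class FieldPathSketch {

    public static Map<String, List<Object>> flatten(Map<String, Object> source) {
        Map<String, List<Object>> out = new LinkedHashMap<>();
        walk("", source, out);
        return out;
    }

    private static void walk(String prefix, Object node, Map<String, List<Object>> out) {
        if (node instanceof Map) {
            // Each object level adds one segment to the path.
            for (Map.Entry<?, ?> e : ((Map<?, ?>) node).entrySet()) {
                String path = prefix.isEmpty() ? e.getKey().toString() : prefix + "." + e.getKey();
                walk(path, e.getValue(), out);
            }
        } else if (node instanceof List) {
            // Arrays add no path segment; all elements share the same path.
            for (Object item : (List<?>) node) {
                walk(prefix, item, out);
            }
        } else {
            out.computeIfAbsent(prefix, k -> new ArrayList<>()).add(node);
        }
    }

    public static void main(String[] args) {
        Map<String, Object> account = new LinkedHashMap<>();
        account.put("account_id", 2);
        Map<String, Object> principal = new LinkedHashMap<>();
        principal.put("id", 1);
        principal.put("account", account);
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("principals", List.of(principal));

        // The keys are principals.id and principals.account.account_id;
        // there is no top-level key called account_id.
        System.out.println(flatten(doc));
    }
}
```

In this sketch the value 2 is only reachable as principals.account.account_id, which is the field name the terms query has to use.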
Search query through the Elasticsearch RestHighLevelClient:
// Use your own index name here; Elasticsearch index names must be lowercase.
SearchRequest searchRequest = new SearchRequest("nestedindex");
SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
List<Integer> accountIds = new ArrayList<>();
accountIds.add(2);
// Address the inner property by its full dotted path.
sourceBuilder.query(QueryBuilders.termsQuery("principals.account.account_id", accountIds));
sourceBuilder.timeout(new TimeValue(60, TimeUnit.SECONDS));
searchRequest.source(sourceBuilder);
// client is the RestHighLevelClient instance
SearchResponse searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);
return searchResponse; // read the hits through searchResponse.getHits().getHits()
The Elasticsearch client can be instantiated in a Spring Boot application by creating a configuration class in your project and autowiring the client where required:
@Configuration
@Primary
public class ElasticsearchConfig {
// Spring calls close() on the client when the application context shuts down.
@Bean(destroyMethod = "close")
public RestHighLevelClient client() {
return new RestHighLevelClient(
RestClient.builder(new HttpHost("localhost", 9200, "http")));
}
}

Related

Build terms_set query in elasticsearch java api

I am trying to find out how to build the following query with the Elasticsearch Java API:
{
"query": {
"bool": {
"must": [
{
"terms_set": {
"names": {
"minimum_should_match_field": "some_match_field",
"terms": [
"Ala",
"Bob"
]
}
}
}
]
}
}
}
I tried to build this query with the following code, but the API has neither a termsSetQuery method nor a minimumShouldMatchField option:
NativeSearchQuery build = new NativeSearchQueryBuilder()
.withQuery(boolQuery()
.must(termsQuery("names", List.of("Ala", "Bob"))))
.build();
but it results in the query below:
{
"bool": {
"must": [
{
"terms": {
"names": [
"Ala",
"Bob"
]
}
}
]
}
}
You need to use TermsSetQueryBuilder to create a query like the one below:
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
List<String> terms = new ArrayList<>();
terms.add("Ala");
terms.add("Bob");
searchSourceBuilder.query(QueryBuilders.boolQuery()
.must(new TermsSetQueryBuilder("names", terms).setMinimumShouldMatchField("some_match_field")));
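For readers unsure what a terms_set query with minimum_should_match_field decides per document, here is a rough plain-Java sketch. It assumes the semantics "a document matches when at least as many query terms occur in its field as the number stored in its own minimum-should-match field", and it assumes that number is 1 or more. TermsSetSketch and matches are hypothetical names, not Elasticsearch classes.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration of the per-document decision of a terms_set query.
final class TermsSetSketch {

    // docValues: values of the "names" field in one document
    // required:  value of that document's "some_match_field" (assumed >= 1)
    // queryTerms: the terms listed in the query
    public static boolean matches(List<String> docValues, long required, List<String> queryTerms) {
        Set<String> values = new HashSet<>(docValues);
        long matched = queryTerms.stream().distinct().filter(values::contains).count();
        return matched >= required;
    }

    public static void main(String[] args) {
        List<String> query = List.of("Ala", "Bob");
        // Two of the query terms are present and two are required.
        System.out.println(matches(List.of("Ala", "Bob", "Eve"), 2, query));
        // Only one query term is present although the document requires two.
        System.out.println(matches(List.of("Ala"), 2, query));
    }
}
```

This also shows why the plain terms query from the question is not equivalent: it accepts a document as soon as any single term matches.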

ElasticSearch - JavaApi searching by each character instead of term (word)

I am fetching documents from Elasticsearch using the Java API. My documents contain the following code, and I am trying to search for it with the following inputs.
code : MS-VMA1615-0D
Input : MS-VMA1615-0D -- I get the result (MS-VMA1615-0D).
Input : VMA1615 -- I get the result (MS-VMA1615-0D).
Input : VMA -- I get the result (MS-VMA1615-0D).
But if I give inputs like the ones below, I do not get results.
Input : V -- no results.
Input : MS -- no results.
Input : -V -- no results.
Input : 615 -- no results.
I expect each of these to return the code MS-VMA1615-0D. In short, I am trying to search character by character instead of by term (word).
It should not return the code MS-VMA1615-0D for the following inputs, because they do not match my code.
Input : VK -- should not return results.
Input : MS3 -- should not return results.
This is the Java code I am using:
private final String INDEX = "products";
private final String TYPE = "doc";
SearchRequest searchRequest = new SearchRequest(INDEX);
searchRequest.types(TYPE);
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
QueryStringQueryBuilder qsQueryBuilder = new QueryStringQueryBuilder(code);
qsQueryBuilder.defaultField("code");
searchSourceBuilder.query(qsQueryBuilder);
searchSourceBuilder.size(50);
searchRequest.source(searchSourceBuilder);
SearchResponse searchResponse = null;
try {
searchResponse = SearchEngineClient.getInstance().search(searchRequest);
} catch (IOException e) {
e.getLocalizedMessage();
}
Item item = null;
SearchHit[] searchHits = searchResponse.getHits().getHits();
My mapping details:
PUT products
{
"settings": {
"analysis": {
"analyzer": {
"custom_analyzer": {
"type": "custom",
"tokenizer": "my_pattern_tokenizer",
"char_filter": [
"html_strip"
],
"filter": [
"lowercase",
"asciifolding"
]
}
},
"tokenizer": {
"my_pattern_tokenizer": {
"type": "pattern",
"pattern": "-|\\d"
}
}
}
},
"mappings": {
"doc": {
"properties": {
"code": {
"type": "text",
"analyzer": "custom_analyzer"
}
}
}
}
}
Update after applying the new answer:
This is my request via the Java API:
'SearchRequest{searchType=QUERY_THEN_FETCH, indices=[products], indicesOptions=IndicesOptions[id=38, ignore_unavailable=false, allow_no_indices=true, expand_wildcards_open=true, expand_wildcards_closed=false, allow_aliases_to_multiple_indices=true, forbid_closed_indices=true, ignore_aliases=false], types=[doc], routing='null', preference='null', requestCache=null, scroll=null, maxConcurrentShardRequests=0, batchedReduceSize=512, preFilterShardSize=128, source={"size":50,"query":{"match_phrase":{"code":{"query":"1615","slop":0,"boost":1.0}}}}}
'. However, I am getting a null response.
Follow up: ElasticSearch - JavaApi searching not happening without (*) in my input query
Your mapping should look like:
PUT products
{
"settings": {
"analysis": {
"analyzer": {
"custom_analyzer": {
"type": "custom",
"tokenizer": "ngram",
"char_filter": [
"html_strip"
],
"filter": [
"lowercase",
"asciifolding"
]
}
}
}
},
"mappings": {
"doc": {
"properties": {
"code": {
"type": "text",
"analyzer": "custom_analyzer"
}
}
}
}
}
And you should be using a match_phrase query.
In Kibana:
GET products/_search
{
"query": {
"match_phrase": {
"code": "V"
}
}
}
will return the result:
"hits": [
{
"_index": "products",
"_type": "doc",
"_id": "EoGtdGQBqdof7JidJkM_",
"_score": 0.2876821,
"_source": {
"code": "MS-VMA1615-0D"
}
}
]
But this:
GET products/_search
{
"query": {
"match_phrase": {
"code": "VK"
}
}
}
won't:
{
"took": 10,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 0,
"max_score": null,
"hits": []
}
}
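Why "V" matches and "VK" does not can be approximated in plain Java. The sketch below assumes the ngram tokenizer's default gram lengths (1 and 2) plus the lowercase filter, and it treats a match_phrase query as "the query's tokens appear as one consecutive run in the document's tokens", which is a simplification of how positions are handled. NGramPhraseSketch and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Hypothetical model of ngram tokenization followed by a phrase match.
final class NGramPhraseSketch {

    // For every start offset, emit the grams of length minGram..maxGram that still fit.
    public static List<String> ngrams(String text, int minGram, int maxGram) {
        String lower = text.toLowerCase(Locale.ROOT);
        List<String> grams = new ArrayList<>();
        for (int start = 0; start < lower.length(); start++) {
            for (int len = minGram; len <= maxGram && start + len <= lower.length(); len++) {
                grams.add(lower.substring(start, start + len));
            }
        }
        return grams;
    }

    // Simplified phrase match: the query tokens must occur consecutively in the document tokens.
    public static boolean phraseMatches(String indexedValue, String input) {
        List<String> docTokens = ngrams(indexedValue, 1, 2);
        List<String> queryTokens = ngrams(input, 1, 2);
        return !queryTokens.isEmpty() && Collections.indexOfSubList(docTokens, queryTokens) >= 0;
    }

    public static void main(String[] args) {
        String code = "MS-VMA1615-0D";
        for (String input : new String[] {"V", "MS", "-V", "615", "VK", "MS3"}) {
            System.out.println(input + " -> " + phraseMatches(code, input));
        }
    }
}
```

Under these assumptions the inputs V, MS, -V and 615 match the code, while VK and MS3 do not, because the tokens "k" and "s3" never occur in the indexed value.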
Based on your comment:
Instead of using a Query string:
QueryStringQueryBuilder qsQueryBuilder = new QueryStringQueryBuilder(code);
qsQueryBuilder.defaultField("code");
searchSourceBuilder.query(qsQueryBuilder);
searchSourceBuilder.size(50);
searchRequest.source(searchSourceBuilder);
Use a match phrase query:
QueryBuilder query = QueryBuilders.matchPhraseQuery("code", code);
searchSourceBuilder.query(query);
searchSourceBuilder.size(50);
searchRequest.source(searchSourceBuilder);

Best practice to search ingest-attachment from documents (2k+ documents with ingest-attachment)

I am fetching indexed documents from Elasticsearch using the Java API, but I get null as the response when the index holds a large number of documents (2k+).
If the index holds fewer documents (under roughly 500), the Java API code below works properly.
The larger number of documents in the index seems to cause the issue. (Is it some kind of performance issue while fetching?)
I use the ingest-attachment processor plugin for attachments; my documents have PDFs attached.
If I run the same query from Kibana, however, I do get a response and can see the results.
My Java code is below:
private final static String ATTACHMENT = "document_attachment";
private final static String TYPE = "doc";
public static void main(String args[])
{
RestHighLevelClient restHighLevelClient = null;
try {
restHighLevelClient = new RestHighLevelClient(RestClient.builder(new HttpHost("localhost", 9200, "http"),
new HttpHost("localhost", 9201, "http")));
} catch (Exception e) {
System.out.println(e.getMessage());
}
SearchRequest contentSearchRequest = new SearchRequest(ATTACHMENT);
SearchSourceBuilder contentSearchSourceBuilder = new SearchSourceBuilder();
contentSearchRequest.types(TYPE);
QueryStringQueryBuilder attachmentQB = new QueryStringQueryBuilder("Activa");
attachmentQB.defaultField("attachment.content");
contentSearchSourceBuilder.query(attachmentQB);
contentSearchSourceBuilder.size(50);
contentSearchRequest.source(contentSearchSourceBuilder);
SearchResponse contentSearchResponse = null;
try {
contentSearchResponse = restHighLevelClient.search(contentSearchRequest); // returning null response
} catch (IOException e) {
e.getLocalizedMessage();
}
System.out.println("Request --->"+contentSearchRequest.toString());
System.out.println("Response --->"+contentSearchResponse.toString());
SearchHit[] contentSearchHits = contentSearchResponse.getHits().getHits();
long contenttotalHits=contentSearchResponse.getHits().totalHits;
System.out.println("condition Total Hits --->"+contenttotalHits);
This is the script I use in Kibana; it returns a response:
GET document_attachment/_search?pretty
{
"query" :{
"match": {"attachment.content": "Activa"}
}
}
This is the search request from the Java API:
SearchRequest{searchType=QUERY_THEN_FETCH, indices=[document_attachment], indicesOptions=IndicesOptions[id=38, ignore_unavailable=false, allow_no_indices=true, expand_wildcards_open=true, expand_wildcards_closed=false, allow_aliases_to_multiple_indices=true, forbid_closed_indices=true, ignore_aliases=false], types=[doc], routing='null', preference='null', requestCache=null, scroll=null, maxConcurrentShardRequests=0, batchedReduceSize=512, preFilterShardSize=128, source={"size":50,"query":{"match":{"attachment.content":{"query":"Activa","operator":"OR","prefix_length":0,"max_expansions":50,"fuzzy_transpositions":true,"lenient":false,"zero_terms_query":"NONE","auto_generate_synonyms_phrase_query":true,"boost":1.0}}}}}
My mapping details:
{
"document_attachment": {
"mappings": {
"doc": {
"properties": {
"app_language": {
"type": "text"
},
"attachment": {
"properties": {
"author": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"content": {
"type": "text",
"analyzer": "custom_analyzer"
},
"content_length": {
"type": "long"
},
"content_type": {
"type": "text"
},
"date": {
"type": "date"
},
"language": {
"type": "text"
},
"title": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
},
"catalog_description": {
"type": "text"
},
"fileContent": {
"type": "text"
}
}
}
}
}
}
}
My settings (ingest pipeline) details:
PUT _ingest/pipeline/document_attachment
{
"description" : "Extract attachment information",
"processors" : [
{
"attachment" : {
"field" : "fileContent"
}
}
]
}
I get this error only when I search on attachment.content; if I search on some other field, I get results.
I am using Elasticsearch version 6.2.3.
The error is below:
org.apache.http.ContentTooLongException: entity content is too long [105539255] for the configured buffer limit [104857600]
at org.elasticsearch.client.HeapBufferedAsyncResponseConsumer.onEntityEnclosed(HeapBufferedAsyncResponseConsumer.java:76)
at org.apache.http.nio.protocol.AbstractAsyncResponseConsumer.responseReceived(AbstractAsyncResponseConsumer.java:131)
at org.apache.http.impl.nio.client.MainClientExec.responseReceived(MainClientExec.java:315)
at org.apache.http.impl.nio.client.DefaultClientExchangeHandlerImpl.responseReceived(DefaultClientExchangeHandlerImpl.java:147)
at org.apache.http.nio.protocol.HttpAsyncRequestExecutor.responseReceived(HttpAsyncRequestExecutor.java:303)
at org.apache.http.impl.nio.DefaultNHttpClientConnection.consumeInput(DefaultNHttpClientConnection.java:255)
at org.apache.http.impl.nio.client.InternalIODispatch.onInputReady(InternalIODispatch.java:81)
at org.apache.http.impl.nio.client.InternalIODispatch.onInputReady(InternalIODispatch.java:39)
at org.apache.http.impl.nio.reactor.AbstractIODispatch.inputReady(AbstractIODispatch.java:114)
at org.apache.http.impl.nio.reactor.BaseIOReactor.readable(BaseIOReactor.java:162)
at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvent(AbstractIOReactor.java:337)
at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvents(AbstractIOReactor.java:315)
at org.apache.http.impl.nio.reactor.AbstractIOReactor.execute(AbstractIOReactor.java:276)
at org.apache.http.impl.nio.reactor.BaseIOReactor.execute(BaseIOReactor.java:104)
at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:588)
at java.lang.Thread.run(Thread.java:748)
Exception in thread "main" java.lang.NullPointerException
at com.es.utility.DocumentSearch.main(DocumentSearch.java:88)

elasticsearch aggs groupby field and orderby score

Currently I have this working from Postman, but I can't implement it in Java code. Below is the POST JSON body.
I only want a full-text search for "Innovation", grouped by email and ordered by score. At first the sum of the scores did not seem to work. Who can help me out? Update: it works now.
{
"from": 0,
"size": 10,
"query": {
"bool": {
"must": [
{
"query_string": {
"query": "Innovation"
}
}
]
}
},
"highlight": {
"require_field_match": false,
"pre_tags" : [ "<b>" ],
"post_tags" : [ "</b>" ],
"order" : "score",
"highlight_filter" : false,
"fields": {
"*": {}
}
},
"aggs": {
"group_by_emails": {
"terms": { "field": "email" }
}
}
}
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
searchSourceBuilder.from(0);
searchSourceBuilder.size(10);
searchSourceBuilder.timeout(new TimeValue(60, TimeUnit.SECONDS));
QueryStringQueryBuilder queryString = new QueryStringQueryBuilder(keyword);
searchSourceBuilder.query(queryString);
Script script = new Script("_score;");
AggregationBuilder aggregation = AggregationBuilders
.terms("agg")
.field("email")
.order(BucketOrder.aggregation("sum_score", false))
.subAggregation(AggregationBuilders.sum("sum_score").script(script))
;
searchSourceBuilder.aggregation(aggregation);
System.out.println(searchSourceBuilder.toString());
SearchRequest searchRequest = new SearchRequest(indexName);
searchRequest.source(searchSourceBuilder);
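What the terms aggregation with the sum_score sub-aggregation is meant to compute can be sketched in plain Java: sum the score of every matching document per email, then order the buckets by that sum, highest first. The Hit class and the sumScoreByEmail method are hypothetical stand-ins; the real aggregation runs inside Elasticsearch over all matching documents, not only over the 10 returned hits.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of "group by email, order by summed score".
final class ScoreByEmailSketch {

    // Minimal stand-in for one matching document and its relevance score.
    static final class Hit {
        final String email;
        final double score;

        Hit(String email, double score) {
            this.email = email;
            this.score = score;
        }
    }

    public static List<Map.Entry<String, Double>> sumScoreByEmail(List<Hit> hits) {
        Map<String, Double> sums = new LinkedHashMap<>();
        for (Hit hit : hits) {
            // Same role as the sum("sum_score") sub-aggregation.
            sums.merge(hit.email, hit.score, Double::sum);
        }
        List<Map.Entry<String, Double>> buckets = new ArrayList<>(sums.entrySet());
        // Same role as BucketOrder.aggregation("sum_score", false): descending order.
        buckets.sort(Map.Entry.<String, Double>comparingByValue().reversed());
        return buckets;
    }

    public static void main(String[] args) {
        List<Hit> hits = List.of(
                new Hit("a@example.test", 1.0),
                new Hit("b@example.test", 2.5),
                new Hit("a@example.test", 0.5));
        for (Map.Entry<String, Double> bucket : sumScoreByEmail(hits)) {
            System.out.println(bucket.getKey() + " " + bucket.getValue());
        }
    }
}
```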

Elasticsearch complex query using Java API

I am trying to query on ElasticSearch using Java API, my query is:
curl -XGET 'http://localhost:9200/logstash-*/_search?search_type=count' -d '
{
"query": {
"filtered": {
"query": {
"match_all": {}
},
"filter": {
"and" : [
{
"range": {
"timestamp": {
"gte": "2015-08-20",
"lt": "2015-08-21",
"format": "yyyy-MM-dd",
"time_zone": "+8:00"
}
}
},
{"query": {
"match": {
"request": {
"query": "/v2/brand"
}
}
}
},
{"term": { "response" : "200"}
}
]
}
}
},
"aggs": {
"group_by_device_id": {
"terms": {
"field": "clientip"
}
}
}
}'
The roughly equivalent SQL logic is:
select distinct(clientip) from table where timestamp between '2015-08-20' and '2015-08-21' and request like '/v2/brand%' and response = '200'
How can I implement it using the Java API?
Please guide me; I am new to Elasticsearch. Thanks in advance!
I have resolved the problem; my code is below:
SearchResponse scrollResp1 = client.prepareSearch("logstash-*")
    .setSearchType(SearchType.SCAN)
    .setQuery(QueryBuilders.filteredQuery(
        QueryBuilders.matchAllQuery(),
        FilterBuilders.andFilter(
            FilterBuilders.termFilter("response", "200"),
            FilterBuilders.rangeFilter("timestamp").gte(startDate).lt(endDate),
            FilterBuilders.queryFilter(QueryBuilders.matchQuery("request", "signup")))))
    .addAggregation(AggregationBuilders.terms("group_by_client_ip").size(0).field("clientip"))
    .get();
