Elasticsearch complex query using Java API

I am trying to query Elasticsearch using the Java API. My query is:
curl -XGET 'http://localhost:9200/logstash-*/_search?search_type=count' -d '
{
"query": {
"filtered": {
"query": {
"match_all": {}
},
"filter": {
"and" : [
{
"range": {
"timestamp": {
"gte": "2015-08-20",
"lt": "2015-08-21",
"format": "yyyy-MM-dd",
"time_zone": "+8:00"
}
}
},
{"query": {
"match": {
"request": {
"query": "/v2/brand"
}
}
}
},
{"term": { "response" : "200"}
}
]
}
}
},
"aggs": {
"group_by_device_id": {
"terms": {
"field": "clientip"
}
}
}
}'
The equivalent SQL logic is:
select distinct(clientip) from table where timestamp between '2015-08-20' and '2015-08-21' and request like '/v2/brand%' and response = '200'
How can I implement this using the Java API?
Please guide me, I am new to Elasticsearch. Thanks in advance!

I have resolved the problem. Below is my code:
SearchResponse scrollResp1 = client.prepareSearch("logstash-*")
    .setSearchType(SearchType.SCAN)
    .setQuery(QueryBuilders.filteredQuery(
        QueryBuilders.matchAllQuery(),
        FilterBuilders.andFilter(
            FilterBuilders.termFilter("response", "200"),
            FilterBuilders.rangeFilter("timestamp").gte(startDate).lt(endDate),
            FilterBuilders.queryFilter(QueryBuilders.matchQuery("request", "signup")))))
    .addAggregation(AggregationBuilders.terms("group_by_client_ip").size(0).field("clientip"))
    .get();
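To make the intended semantics concrete, here is a minimal plain-Java sketch (no Elasticsearch client involved) that applies the same three conditions to in-memory records and counts hits per client IP, roughly what the filtered query plus terms aggregation computes. The LogEntry class, the sample data and the use of startsWith for the request condition are assumptions for illustration; they follow the SQL in the question, whereas a real match query is an analyzed full-text match rather than a prefix test.

```java
import java.time.LocalDate;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class LogFilterSketch {

    // Hypothetical in-memory stand-in for one logstash document.
    static final class LogEntry {
        final LocalDate day;
        final String request;
        final String response;
        final String clientIp;

        LogEntry(LocalDate day, String request, String response, String clientIp) {
            this.day = day;
            this.request = request;
            this.response = response;
            this.clientIp = clientIp;
        }
    }

    // Keeps entries with gte <= day < lt, a request starting with the prefix and
    // response "200", then counts them per client IP (like a terms aggregation).
    static Map<String, Integer> countByClientIp(List<LogEntry> entries, LocalDate gte,
                                                LocalDate lt, String requestPrefix) {
        Map<String, Integer> counts = new TreeMap<>();
        for (LogEntry e : entries) {
            boolean inRange = !e.day.isBefore(gte) && e.day.isBefore(lt);
            boolean requestMatches = e.request.startsWith(requestPrefix);
            boolean responseOk = "200".equals(e.response);
            if (inRange && requestMatches && responseOk) {
                counts.merge(e.clientIp, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Made-up sample data, only for demonstration.
    static List<LogEntry> sample() {
        LocalDate d20 = LocalDate.of(2015, 8, 20);
        LocalDate d21 = LocalDate.of(2015, 8, 21);
        return Arrays.asList(
            new LogEntry(d20, "/v2/brand/list", "200", "10.0.0.1"),
            new LogEntry(d20, "/v2/brand/1", "200", "10.0.0.1"),
            new LogEntry(d20, "/v2/brand/1", "404", "10.0.0.2"),  // wrong response
            new LogEntry(d21, "/v2/brand/1", "200", "10.0.0.3"),  // outside the range (lt)
            new LogEntry(d20, "/v2/user", "200", "10.0.0.4"),     // wrong request
            new LogEntry(d20, "/v2/brand", "200", "10.0.0.5"));
    }

    public static void main(String[] args) {
        System.out.println(countByClientIp(sample(),
            LocalDate.of(2015, 8, 20), LocalDate.of(2015, 8, 21), "/v2/brand"));
    }
}
```

The keys of the returned map correspond to the distinct clientip values, and the counts correspond to the doc_count of each terms bucket.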

Combination of script score and function score filter in java api

I can't find a way to combine script score and function score filters in the Elasticsearch Java API.
I have the following query:
GET index/type/_search
{
"query": {
"nested": {
"path": "field",
"query": {
"function_score": {
"query": {
"bool": {
"must": [
{
"match": {
"field.name": "NAME"
}
}
]
}
},
"functions": [
{
"filter": {
"match": {
"field.type":"TYPE"
}
},
"weight": 3
},
{
"script_score": {
"script":"doc['field.count'].value"
}
}
]
}
}
}
}
}
And I tried to write it as an ElasticSearchQuery:
ElasticSearchQuery query = new ElasticSearchQuery(Indexes.NAME, Types.TYPE)
.setQueryBuilder(QueryBuilders.nestedQuery(FIELD, QueryBuilders.functionScoreQuery(
QueryBuilders.boolQuery().must(QueryBuilders.matchQuery(FIELD_NAME, fieldName)),
new FunctionScoreQueryBuilder.FilterFunctionBuilder[]{
new FunctionScoreQueryBuilder.FilterFunctionBuilder(
QueryBuilders.matchQuery(FIELD_TYPE, fieldType),
ScoreFunctionBuilders.weightFactorFunction(3.0F)
)
}), ScoreMode.None));
But how do I add the script score?
The solution is pretty simple:
FunctionScoreQueryBuilder.FilterFunctionBuilder[] filterFunctionBuilders = new FunctionScoreQueryBuilder.FilterFunctionBuilder[]{
new FunctionScoreQueryBuilder.FilterFunctionBuilder(
QueryBuilders.matchQuery(FIELD_TYPE, fieldType),
ScoreFunctionBuilders.weightFactorFunction(3)
),
new FunctionScoreQueryBuilder.FilterFunctionBuilder(
ScoreFunctionBuilders.scriptFunction(format("doc['%s'].value", FIELD_COUNT))
)
};
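As a rough illustration of what these two functions do to the final score, here is a plain-Java sketch of the arithmetic. It assumes the defaults that the query above does not override (score_mode multiply and boost_mode multiply); the class and parameter names are hypothetical and nothing here calls Elasticsearch.

```java
final class FunctionScoreSketch {

    // queryScore:  score produced by the inner bool/match query
    // typeMatches: whether the filter on field.type matched this document
    // count:       the value the script reads from doc['field.count']
    static double score(double queryScore, boolean typeMatches, double count) {
        double functionScore = 1.0;
        if (typeMatches) {
            functionScore *= 3.0;   // weight function, applied only when its filter matches
        }
        functionScore *= count;     // script_score has no filter, so it always applies
        return queryScore * functionScore; // boost_mode "multiply" (assumed default)
    }

    public static void main(String[] args) {
        System.out.println(score(2.0, true, 5.0));
        System.out.println(score(2.0, false, 5.0));
    }
}
```

If a different score_mode or boost_mode is set on the builder, the combination step changes accordingly.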

ElasticSearch - matchPhraseQuery API to search with multiple fields

I am searching a specific field (code) that uses an ngram tokenizer. I query it with matchPhraseQuery and it works fine.
Now I want to search across 3 fields. How can I do this?
Here is my Java code, which searches only one field (code):
SearchRequest searchRequest = new SearchRequest(INDEX);
searchRequest.types(TYPE);
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
QueryBuilder qb = QueryBuilders.matchPhraseQuery("code", code);
searchSourceBuilder.query(qb);
searchSourceBuilder.size(10);
searchRequest.source(searchSourceBuilder);
My mapping details are below:
PUT products
{
"settings": {
"analysis": {
"analyzer": {
"custom_analyzer": {
"type": "custom",
"tokenizer": "ngram",
"char_filter": [
"html_strip"
],
"filter": [
"lowercase",
"asciifolding"
]
}
}
}
},
"mappings": {
"doc": {
"properties": {
"code": {
"type": "text",
"analyzer": "custom_analyzer"
},
"attribute" : {
"type" : "text",
"analyzer" : "custom_analyzer"
},
"term" : {
"type" : "text",
"analyzer" : "custom_analyzer"
}
}
}
}
}
Now I want to build a search query for 3 fields: code, attribute, and term.
I have tried the Java code below, which is not working as expected:
BoolQueryBuilder orQuery = QueryBuilders.boolQuery();
QueryBuilder qb1 = QueryBuilders.matchPhraseQuery("catalog_keywords", keyword);
QueryBuilder qb2 = QueryBuilders.matchPhraseQuery("product_keywords", keyword);
orQuery.should(qb1);
orQuery.should(qb2);
orQuery.minimumShouldMatch(1);
searchSourceBuilder.query(orQuery);
searchSourceBuilder.size(10);
searchRequest.source(searchSourceBuilder);
My input query:
Logi
The output I am getting:
"Materiały, programy doborowe | Marketing | Katalogi, broszury"
This is totally irrelevant to my query. The expected result is Logiciels.
My field values contain | delimiters, so I want only the exact matching word/characters to be returned, not the whole delimited value.
Use a bool query:
GET products/_search
{
"query": {
"bool": {
"must": [
{
"match_phrase": {
"FIELD": "PHRASE"
}
},
{
"match_phrase": {
"FIELD": "PHRASE"
}
},
{
"match_phrase": {
"FIELD": "PHRASE"
}
}
]
}
}
}

ElasticSearch - JavaApi searching by each character instead of term (word)

I am fetching documents from Elasticsearch using the Java API. My documents contain the following code, and I am trying to search for it with the following patterns.
code : MS-VMA1615-0D
Input : MS-VMA1615-0D -- I get the result (MS-VMA1615-0D).
Input : VMA1615 -- I get the result (MS-VMA1615-0D).
Input : VMA -- I get the result (MS-VMA1615-0D).
But if I give input like the following, I get no results.
Input : V -- no results.
Input : MS -- no results.
Input : -V -- no results.
Input : 615 -- no results.
I expect the code MS-VMA1615-0D to be returned. In short, I am trying to search character by character instead of by term (word).
It should not return the code MS-VMA1615-0D in the following cases, because they do not match my code.
Input : VK -- should not return results.
Input : MS3 -- should not return results.
Here is the Java code I am using:
private final String INDEX = "products";
private final String TYPE = "doc";
SearchRequest searchRequest = new SearchRequest(INDEX);
searchRequest.types(TYPE);
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
QueryStringQueryBuilder qsQueryBuilder = new QueryStringQueryBuilder(code);
qsQueryBuilder.defaultField("code");
searchSourceBuilder.query(qsQueryBuilder);
searchSourceBuilder.size(50);
searchRequest.source(searchSourceBuilder);
SearchResponse searchResponse = null;
try {
searchResponse = SearchEngineClient.getInstance().search(searchRequest);
} catch (IOException e) {
e.getLocalizedMessage();
}
Item item = null;
SearchHit[] searchHits = searchResponse.getHits().getHits();
Here are my mapping details:
PUT products
{
"settings": {
"analysis": {
"analyzer": {
"custom_analyzer": {
"type": "custom",
"tokenizer": "my_pattern_tokenizer",
"char_filter": [
"html_strip"
],
"filter": [
"lowercase",
"asciifolding"
]
}
},
"tokenizer": {
"my_pattern_tokenizer": {
"type": "pattern",
"pattern": "-|\\d"
}
}
}
},
"mappings": {
"doc": {
"properties": {
"code": {
"type": "text",
"analyzer": "custom_analyzer"
}
}
}
}
}
After updating with the new answer:
This is my request via the Java API:
SearchRequest{searchType=QUERY_THEN_FETCH, indices=[products], indicesOptions=IndicesOptions[id=38, ignore_unavailable=false, allow_no_indices=true, expand_wildcards_open=true, expand_wildcards_closed=false, allow_aliases_to_multiple_indices=true, forbid_closed_indices=true, ignore_aliases=false], types=[doc], routing='null', preference='null', requestCache=null, scroll=null, maxConcurrentShardRequests=0, batchedReduceSize=512, preFilterShardSize=128, source={"size":50,"query":{"match_phrase":{"code":{"query":"1615","slop":0,"boost":1.0}}}}}
But I am getting a null response.
Follow up: ElasticSearch - JavaApi searching not happening without (*) in my input query
Your mapping should look like:
PUT products
{
"settings": {
"analysis": {
"analyzer": {
"custom_analyzer": {
"type": "custom",
"tokenizer": "ngram",
"char_filter": [
"html_strip"
],
"filter": [
"lowercase",
"asciifolding"
]
}
}
}
},
"mappings": {
"doc": {
"properties": {
"code": {
"type": "text",
"analyzer": "custom_analyzer"
}
}
}
}
}
And you should be using a match_phrase query.
In Kibana:
GET products/_search
{
"query": {
"match_phrase": {
"code": "V"
}
}
}
will return the result:
"hits": [
{
"_index": "products",
"_type": "doc",
"_id": "EoGtdGQBqdof7JidJkM_",
"_score": 0.2876821,
"_source": {
"code": "MS-VMA1615-0D"
}
}
]
But this:
GET products/_search
{
"query": {
"match_phrase": {
"code": "VK"
}
}
}
won't:
{
"took": 10,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 0,
"max_score": null,
"hits": []
}
}
Based on your comment:
Instead of using a query string query:
QueryStringQueryBuilder qsQueryBuilder = new QueryStringQueryBuilder(code);
qsQueryBuilder.defaultField("code");
searchSourceBuilder.query(qsQueryBuilder);
searchSourceBuilder.size(50);
searchRequest.source(searchSourceBuilder);
Use a match phrase query:
QueryBuilder query = QueryBuilders.matchPhraseQuery("code", code);
searchSourceBuilder.query(query);
searchSourceBuilder.size(50);
searchRequest.source(searchSourceBuilder);
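To see why "V" or "615" can match while "VK" or "MS3" cannot, here is a simplified plain-Java sketch of n-gram matching. It assumes the ngram tokenizer's default gram sizes (min_gram 1, max_gram 2) and models only the lowercase filter; it ignores token positions, which match_phrase also checks, as well as html_strip and asciifolding. The class and method names are made up for illustration.

```java
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

final class NGramSketch {

    // All substrings of length minGram..maxGram, lowercased (stand-in for the analyzer).
    static Set<String> ngrams(String text, int minGram, int maxGram) {
        String s = text.toLowerCase(Locale.ROOT);
        Set<String> grams = new LinkedHashSet<>();
        for (int start = 0; start < s.length(); start++) {
            for (int len = minGram; len <= maxGram && start + len <= s.length(); len++) {
                grams.add(s.substring(start, start + len));
            }
        }
        return grams;
    }

    // Simplified match: every gram of the input must exist among the indexed grams.
    static boolean allGramsIndexed(String indexedValue, String input) {
        return ngrams(indexedValue, 1, 2).containsAll(ngrams(input, 1, 2));
    }

    public static void main(String[] args) {
        String code = "MS-VMA1615-0D";
        for (String input : new String[] {"V", "MS", "-V", "615", "VK", "MS3"}) {
            System.out.println(input + " -> " + allGramsIndexed(code, input));
        }
    }
}
```

In this model "VK" fails because neither "vk" nor "k" is among the grams of the indexed code, and "MS3" fails on "s3" and "3".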

How to build an Elasticsearch-Query with startsWith-functionality and special characters

I have JsonObjects that I search with Elasticsearch from a Java application, using the Java API to build the search queries. The objects contain a field called "such" that holds a search string by which the JsonObject should be found; a simple example would be "STVBBM160A". Besides the usual characters a-z, A-Z and 0-9, the search string can also look like the following examples:
"STV-157ABR", "F-G/42-W3" or "DDM000.074.6652"
The search should already return results when only the first characters have been typed into the search field, which it does for a search like "F-G/42".
My problem: the search sometimes doesn't return any results at all, but once the last character is typed it finds the right document.
What I tried: first I wanted to use a WildcardQuery where the query would be "typedStuff*", but the WildcardQuery didn't return any results at all as soon as I typed anything but * (it used to work for other search fields with other values).
Now I am using a QueryStringQuery, which also takes the input and appends a * character. By escaping the query string, I am able to search for strings like "F-G/42" and so on, but the search for "DDM000.074.6652" doesn't return any results until Elasticsearch has the whole string. Also, when I type "STV", all results with "STV-xxxxx" (containing the "-" after STV) are returned, but not the object with "STVBBM160A", again until the whole string is given (nothing is shown in between, for example when the search string is "STVB").
This is the query I'm using right now:
{
"size": 1000,
"min_score": 1,
"query": {
"bool": {
"must": [
{
"query_string": {
"query": "MY_DATA_TYPE",
"fields": [
"doc.db_doc_type"
]
}
},
{
"query_string": {
"query": "MY_SPECIFIC_TYPE",
"fields": [
"doc.db_doc_specific"
]
}
}
],
"should": {
"query_string": {
"query": "STV*",
"fields": [
"doc.such"
],
"boost": 3,
"escape": true
}
}
}
}
}
This is the old query with the WildcardQuery, which doesn't return any results at all unless the query string is just *:
{
"size": 50,
"min_score": 1,
"query": {
"bool": {
"must": [
{
"query_string": {
"query": "MY_DATA_TYPE",
"fields": [
"doc.db_doc_type"
]
}
},
{
"query_string": {
"query": "MY_SPECIFIC_TYPE",
"fields": [
"doc.db_doc_specific"
]
}
}
],
"should": {
"wildcard": {
"doc.such": {
"wildcard": "STV*",
"boost": 3
}
}
}
}
}
}
When using a PrefixQuery, the search also doesn't return any results at all (with and without the *):
{
"size": 50,
"min_score": 1,
"query": {
"bool": {
"must": [
{
"query_string": {
"query": "MY_DATA_TYPE",
"fields": [
"doc.db_doc_type"
]
}
},
{
"query_string": {
"query": "MY_SPECIFIC_TYPE",
"fields": [
"doc.db_doc_specific"
]
}
}
],
"should": {
"prefix": {
"doc.such": {
"prefix": "HSTKV*",
"boost": 3
}
}
}
}
}
}
How can this query be changed to return all results starting with the specified string, regardless of whether the field doc.such also contains numbers or special characters like "_", "." or "/"?
Thanks in advance
As soon as you want to query prefixes, suffixes or substrings in a serious way, you need to leverage nGrams. In your case, since you're only after prefixes, an edgeNGram tokenizer would be in order. You need to change the settings of your index to look like this:
PUT your_index
{
"settings": {
"analysis": {
"analyzer": {
"prefix_analyzer": {
"tokenizer": "prefix_tokenizer",
"filter": [
"lowercase"
]
},
"search_prefix_analyzer": {
"tokenizer": "keyword",
"filter": [
"lowercase"
]
}
},
"tokenizer": {
"prefix_tokenizer": {
"type": "edgeNGram",
"min_gram": "1",
"max_gram": "25"
}
}
}
},
"mappings": {
"your_type": {
"properties": {
"doc": {
"properties": {
"such": {
"type": "string",
"fields": {
"starts_with": {
"type": "string",
"analyzer": "prefix_analyzer",
"search_analyzer": "search_prefix_analyzer"
}
}
}
}
}
}
}
}
}
What will happen with this analyzer is that when indexing F-G/42-W3 the following tokens will be indexed: f, f-, f-g, f-g/, f-g/4, f-g/42, f-g/42-, f-g/42-w, f-g/42-w3.
At search time, we'll simply lowercase the user input and the prefix will be matched against the indexed tokens.
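The token list above can be reproduced with a few lines of plain Java. This is only a sketch of the idea, assuming min_gram 1 and max_gram 25 as in the settings and modelling just the lowercase filter; the class and method names are made up and no Elasticsearch code is involved.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class EdgeNGramSketch {

    // Prefixes of the lowercased text, from minGram up to maxGram characters long.
    static List<String> edgeNGrams(String text, int minGram, int maxGram) {
        String s = text.toLowerCase(Locale.ROOT);
        List<String> tokens = new ArrayList<>();
        int longest = Math.min(maxGram, s.length());
        for (int len = minGram; len <= longest; len++) {
            tokens.add(s.substring(0, len));
        }
        return tokens;
    }

    // Search side: the input is only lowercased (keyword tokenizer), then looked up as one term.
    static boolean matchesPrefix(String indexedValue, String userInput) {
        return edgeNGrams(indexedValue, 1, 25).contains(userInput.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        System.out.println(edgeNGrams("F-G/42-W3", 1, 25));
        System.out.println(matchesPrefix("STVBBM160A", "STVB"));
        System.out.println(matchesPrefix("DDM000.074.6652", "DDM000.07"));
    }
}
```

Because special characters are just characters inside the prefix tokens, inputs containing "-", "/" or "." need no escaping in this scheme.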
Then your query can simply be transformed to a match query:
{
"size": 1000,
"min_score": 1,
"query": {
"bool": {
"must": [
{
"query_string": {
"query": "MY_DATA_TYPE",
"fields": [
"doc.db_doc_type"
]
}
},
{
"query_string": {
"query": "MY_SPECIFIC_TYPE",
"fields": [
"doc.db_doc_specific"
]
}
}
],
"should": {
"match": {
"doc.such.starts_with": {
"query": "F-G/4"
}
}
}
}
}
}

Why can't I query correctly against a HashMap field in Elasticsearch?

I am using Elasticsearch v1.5.2. I have a JSON document that looks like the following.
{
"id": "RRRZe32",
"metadata": {
"published": "2010-07-29T18:11:43.000Z",
"codeId": "AChdUxnsuRyoCo7roK6gqZSg",
"codeTitle": "something"
}
}
My Java POJO object that backs this JSON looks like the following. Note that I am using Spring-Boot v1.3.0.M2 with spring-boot-starter-data-elasticsearch dependency.
@Document(indexName="ws", type="vid")
public class Video {
@Id
private String id;
@Field(type=FieldType.Object, index=FieldIndex.not_analyzed)
private Map<String, Object> metadata;
}
My mapping is defined as follows.
{
"ws": {
"mappings": {
"vid": {
"properties": {
"id": {
"type": "string"
},
"metadata": {
"properties": {
"codeId": {
"type": "string"
},
"codeTitle": {
"type": "string"
}
}
}
}
}
}
}
}
I can query the document (using Sense) successfully by metadata.codeTitle but not metadata.codeId. My query for metadata.codeTitle looks like the following.
{
"query": {
"bool": {
"must": [
{
"term": {
"metadata.codeTitle": {
"value": "something"
}
}
}
]
}
}
}
My query for metadata.codeId looks like the following.
{
"query": {
"bool": {
"must": [
{
"term": {
"metadata.codeId": {
"value": "AChdUxnsuRyoCo7roK6gqZSg"
}
}
}
]
}
}
}
Any ideas on what I am doing wrong?
It is because your codeId field is analyzed and the value is lowercased at indexing time. So you have two solutions:
You can query like this (i.e. all lowercase):
{
"query": {
"bool": {
"must": [
{
"term": {
"metadata.codeId": {
"value": "achduxnsuryoco7rok6gqzsg"
}
}
}
]
}
}
}
Or you can declare your codeId field as not_analyzed and keep your query as it is.
In your case, it looks like the first option will be easier to implement.
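The mismatch can be pictured with a small plain-Java sketch: the index stores the lowercased token, while a term query looks up the input exactly as given. This is a simplification that assumes the value is analyzed into a single lowercased token (which holds for an unbroken alphanumeric string like this one); the names are hypothetical.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

final class TermLookupSketch {

    // Stand-in for indexing an analyzed string field: one lowercased token.
    static Set<String> index(String value) {
        Set<String> terms = new HashSet<>();
        terms.add(value.toLowerCase(Locale.ROOT));
        return terms;
    }

    // A term query does not analyze its input; it is an exact lookup.
    static boolean termQuery(Set<String> indexedTerms, String term) {
        return indexedTerms.contains(term);
    }

    public static void main(String[] args) {
        Set<String> codeId = index("AChdUxnsuRyoCo7roK6gqZSg");
        System.out.println(termQuery(codeId, "AChdUxnsuRyoCo7roK6gqZSg")); // mixed case input
        System.out.println(termQuery(codeId, "achduxnsuryoco7rok6gqzsg")); // lowercased input
    }
}
```

The codeTitle query works without changes because "something" is already all lowercase.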
