I have test data shown below.
{
"SequenceLocation":{
"Assembly":"GPR7",
"Chr": "10",
"start": 1111
}
}
Whenever I fire a query like the one below, it returns the proper values.
{
"query" : {
"bool" : {
"must" : [
{
"term" : {
"SequenceLocation.Chr": "10"
}
}
]
}
}
}
But when I change the query to
{
"query" : {
"bool" : {
"must" : [
{
"term" : {
"SequenceLocation.Assembly": "GPR7"
}
}
]
}
}
}
It does not return any hits from Elasticsearch. Could you please explain what I am doing wrong?
I think you have the wrong mapping for SequenceLocation.Assembly. The default analyzer splits GPR7.p10 into two tokens, gpr7 and p10.
According to the documentation, a term query does not analyze the query text, so you are asking Elasticsearch for GPR7.p10, but it is indexed as the tokens gpr7 and p10, so it can't match.
You should recreate the index with the mapping set to "index" : "not_analyzed" for the SequenceLocation.Assembly field.
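For example, the relevant part of the mapping could look like this (a sketch for Elasticsearch 1.x/2.x, where "index": "not_analyzed" applies to string fields; my_type is a placeholder for your actual type name):
{
  "mappings": {
    "my_type": {
      "properties": {
        "SequenceLocation": {
          "properties": {
            "Assembly": {
              "type": "string",
              "index": "not_analyzed"
            }
          }
        }
      }
    }
  }
}
On 5.x and later the equivalent is "type": "keyword". You can also run the value through the _analyze API to confirm how it is being tokenized.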
I'm wondering what the equivalent of the following query is in Elasticsearch 5 - 7 (the exact version doesn't matter to me).
I know that this query has been deprecated, but I'm actually trying to make a legacy 1.7.5 cluster work with the High Level REST Client.
I did some tests, and although the documentation says this isn't supported, I tried it and most of the simple actions work. What is left is to convert some queries like the example below:
{
"size" : 3000,
"query" : {
"filtered" : {
"filter" : {
"bool" : {
"must" : [ {
"terms" : {
"source" : [ "o365mail" ]
}
}, {
"range" : {
"bckdate" : {
"from" : "1549360021398l",
"to" : null,
"include_lower" : true,
"include_upper" : true
}
}
} ]
}
}
}
},
"fields" : "*"
}
What I've tried so far with 7.9.3:
https://www.elastic.co/guide/en/elasticsearch/client/java-rest/7.9/java-rest-high.html
SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
BoolQueryBuilder boolQueryBuilder = new BoolQueryBuilder();
boolQueryBuilder
.must(QueryBuilders.termsQuery(IndexFields.SOURCE.getIndexName(),Arrays.asList(Source.O365MAIL.toString().toLowerCase())))
.must(QueryBuilders.rangeQuery("bckdate").gte(1549360021398L).lte(null));
sourceBuilder.query(boolQueryBuilder);
SearchRequest sr = new SearchRequest();
sr.source(sourceBuilder);
SearchResponse searchResponse2 = client.search(sr, RequestOptions.DEFAULT);
The query from debugging is:
{
"bool" : {
"must" : [
{
"terms" : {
"source" : [
"o365mail"
],
"boost" : 1.0
}
},
{
"range" : {
"bckdate" : {
"from" : 1549360021398,
"to" : null,
"include_lower" : true,
"include_upper" : true,
"boost" : 1.0
}
}
}
],
"adjust_pure_negative" : true,
"boost" : 1.0
}
}
I'm wondering if this is equivalent to the filters in the legacy code, because the data that comes back is pretty much the same.
I must not break the logic of all the filters in the legacy query...
Thanks for the help.
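The documented replacement for the removed filtered query is a bool query whose clauses go in filter rather than must; filter context skips scoring and can be cached, but it matches exactly the same set of documents, which is why the data you get back looks the same. A sketch of your snippet rewritten that way (same field names and constants as in your code, imports as in your example):
SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
sourceBuilder.size(3000); // matches the "size": 3000 of the legacy request

// Both clauses go into filter context, mirroring the legacy filtered/filter block.
BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery()
        .filter(QueryBuilders.termsQuery(IndexFields.SOURCE.getIndexName(),
                Arrays.asList(Source.O365MAIL.toString().toLowerCase())))
        .filter(QueryBuilders.rangeQuery("bckdate").gte(1549360021398L)); // open upper bound instead of lte(null)

sourceBuilder.query(boolQueryBuilder);
SearchRequest sr = new SearchRequest();
sr.source(sourceBuilder);
SearchResponse searchResponse = client.search(sr, RequestOptions.DEFAULT);
So keeping everything in filter preserves the legacy filter semantics without changing the result set; only scoring and caching behavior differ from your must version.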
The following query works correctly and returns the results that I need. I am struggling to write it using the Java APIs, though.
{
"query": {
"bool": {
"filter": [
{
"nested": {
"path": "somepath",
"query": {
"bool": {
"filter": [
{
"terms": {
"somepath.key": ["key1", "key2", "key3"]
}
}
]
}
}
}
}
]
}
}
}
I am using this in Java. What am I missing? commaSeparatedKeyString = "key1, key2, key3"
QueryBuilders.boolQuery().must(QueryBuilders.nestedQuery(
"somepath",
QueryBuilders.boolQuery().filter(QueryBuilders.termsQuery("somepath.key", commaSeparatedKeyString)),
ScoreMode.Total));
For debugging purposes, it can be helpful to check the JSON serialization of the query you are building. Fortunately, the toString() methods in the query builders do that for you, so you can simply use System.out.println to print the query builder to stdout (or log it with a logging framework). Assuming that the variable commaSeparatedKeyString is set to "key1,key2,key3" (it sounds like it is, but you don't tell us), you are actually creating the following query:
{
"bool" : {
"must" : [
{
"nested" : {
"query" : {
"bool" : {
"filter" : [
{
"terms" : {
"somepath.key" : [
"key1,key2,key3"
],
"boost" : 1.0
}
}
],
"adjust_pure_negative" : true,
"boost" : 1.0
}
},
"path" : "somepath",
"ignore_unmapped" : false,
"score_mode" : "sum",
"boost" : 1.0
}
}
],
"adjust_pure_negative" : true,
"boost" : 1.0
}
}
As you can see, there are at least two relevant differences between the query you require and the query you are building:
At the top level, the query you want starts with "bool.filter...", but you are building a query with "bool.must...":
QueryBuilders.boolQuery().must(QueryBuilders.nestedQuery(
The innermost terms query is supposed to have an array of terms (key1, key2, key3). You can't simply pass one string with comma-separated values to achieve that; you have to pass the terms one by one:
termsQuery("somepath.key", "key1", "key2", "key3")
I have a structure like this:
{
"_id" : ObjectId("5a9da40e87661b3448b7dfe4"),
"userList" : [
{
"user" : {
"email" : "Arnold#mail.com",
"name" : "Arnold"
},
"key" : "ArnoldKey"
}
]
}
This query in Java works fine:
{'userList.user.email' : 'Arnold#mail.com'}
And this does not find anything:
{'userList.user.email':{ '$regex' : '.*arnold.*' , '$options' : 'i'}}
When I remove the [] brackets from the structure it works fine, but that's not a solution for me. How should I query to get the regex working? Any help appreciated.
You need to update your query like this:
db.getCollection('users').aggregate([{
$match: {
'userList.user.name': {
$regex: `.*arnold.*`,
$options: 'i'
}
}
}]);
I would like to query Elasticsearch to retrieve all the documents that have a field value like a given string.
For example, field LIKE "abc" has to return:
"abc"
"abcdef"
"abcd"
"abc1"
So, all the fields that have the string "abc" inside.
I tried this query, but it returns only the documents with field = "abc":
{
  "query": {
    "more_like_this": {
      "fields": ["FIELD"],
      "like_text": "abc",
      "min_term_freq": 1,
      "max_query_terms": 12
    }
  }
}
What is the correct query?
Thanks
If you're trying to do a prefix query, then you can use this:
{
  "query": {
    "prefix" : { "field" : "abc" }
  }
}
See the Elasticsearch Prefix Query documentation.
Although your question is incomplete, I will try to give you several ideas.
One way is certainly a prefix query, but it is much more efficient to build an edge ngram analyzer. That way your data is prepared at index time and queries will be much faster. Edge ngram is also the most flexible way to implement this functionality, because you can autocomplete words that appear in any order. If you don't need that, and you only need "search as you type" queries, then the best option is the completion suggester. If you need to find strings that appear in the middle of words, then you can look at the ngram analyzer.
Here is how I set up an edge ngram analyzer in my code:
"settings": {
"analysis": {
"filter" : {
"edge_filter" : {
"type" : "edge_ngram",
"min_gram": 1,
"max_gram": 256
}
},
"analyzer": {
"edge_analyzer" : {
"type" : "custom",
"tokenizer": "whitespace",
"filter" : ["lowercase", "edge_filter"]
},
"lowercase_whitespace": {
"type": "custom",
"tokenizer": "whitespace",
"filter": [ "lowercase" ]
}
}
}
},
"mappings": {
"my_type": {
"properties": {
"name": {
"type": "keyword",
"fields": {
"suggest": {
"type": "text",
"analyzer" : "edge_analyzer",
"search_analyzer": "lowercase_whitespace"
}
}
}
}
}
}
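With this mapping you query the name.suggest sub-field rather than name itself, so that the search text goes through the lowercase_whitespace analyzer while the indexed edge ngrams handle the prefix matching. A minimal example, assuming the index above:
{
  "query": {
    "match": {
      "name.suggest": "abc"
    }
  }
}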
You should be able to perform a wildcard query, as described in Elasticsearch like query:
{
"query": {
"wildcard": {
"<<FIELD NAME>>": "*<<QUERY TEXT>>*"
}
}
}
This is how my document looks in Elasticsearch:
{
"entityId": "CAMP_ID",
"txnId": "TXN_ID",
"changeSummary": [{
"field_name": "status_id",
"old_val": "1",
"new_val": "2"
}, {
"field_name": "budget",
"old_val": "100",
"new_val": "250"
}]
}
I get a list of txnId values and a list of changeSummary.field_name values, and I have to get all matching documents. I initially tried this query:
{
"query" : {
"bool" : {
"should" : [ {
"terms" : {
"transactionId" : [ "2915315c03b3420280eae04116fb303f" ]
}
}, {
"terms" : {
"transactionId" : [ "18faf80b3eb44e4b85be993a6d5fd40b" ]
}
} ]
}
},
"post_filter" : {
"bool" : {
"must" : [ {
"match" : {
"changeSummary.fieldName" : "abc"
}
},{
"match" : {
"changeSummary.fieldName" : "xyz"
}
}]
}
}
}
This works fine, but my problem is that the list of transactionId values can be really huge, from 10 to 1,000,000 (or even more), and if the size of the transactionId list is more than 1024 then I start getting:
too_many_clauses: maxClauseCount is set to 1024
Could someone please help with this?
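One approach that may help, assuming you can first index the transaction IDs into a helper document, is the terms lookup form of the terms query: instead of inlining every ID as a separate clause, the query references a field of another document, so the clause count stays at one. A sketch, keeping your post_filter as-is (the txn-filters index, document id and ids field are made-up names; on versions before 6.0 a type parameter is also required):
{
  "query": {
    "bool": {
      "filter": {
        "terms": {
          "transactionId": {
            "index": "txn-filters",
            "id": "current-batch",
            "path": "ids"
          }
        }
      }
    }
  },
  "post_filter": {
    "bool": {
      "must": [
        { "match": { "changeSummary.fieldName": "abc" } },
        { "match": { "changeSummary.fieldName": "xyz" } }
      ]
    }
  }
}
Alternatively, indices.query.bool.max_clause_count can be raised in elasticsearch.yml, but that only moves the limit rather than removing it.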