I am using OpenSearch 2.4. I created an index with some fields and later started saving a new field to it. Now, when I query on the newly added field, I get no results.
Example, query 1:
POST abc/_search
{
"query": {
"bool": {
"must": [
{
"terms": {
"name": [
"john"
]
}
}
]
}
}
}
The above works fine because the name field has existed since the index was created.
Query 2:
POST abc/_search
{
"query": {
"bool": {
"must": [
{
"terms": {
"lastname": [
"William"
]
}
}
]
}
}
}
The above query returns nothing, even though I have some documents with lastname William.
When you index a new string field without declaring it in the mapping first, OpenSearch/Elasticsearch maps it dynamically as a text field with a keyword sub-field (here lastname and lastname.keyword).
There are two ways to get results with the terms query. First, remember that term-level queries match exact terms.
The first option is to use the keyword sub-field.
{
"terms": {
"lastname.keyword": [
"William"
]
}
}
The second option is to search the text field, but remember that the default (standard) analyzer is applied at index time, and its lowercase filter leaves the token as: william.
In this case, the query should be:
{
"terms": {
"lastname": [
"william"
]
}
}
When you use "terms" there must be an exact match (including casing).
So make sure your document contains William and not william or Williams
If you want more tolerance you can explore the match query:
https://opensearch.org/docs/latest/opensearch/query-dsl/full-text/#match
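To see why the casing matters, here is a minimal Python sketch of the two lookups. It is only a toy model of an inverted index, not OpenSearch code: `analyze` is a rough stand-in for the standard analyzer (split on non-alphanumerics, lowercase), and all helper names are made up for illustration.

```python
import re

def analyze(text):
    """Rough stand-in for the standard analyzer: split on non-alphanumerics, lowercase."""
    return [tok.lower() for tok in re.findall(r"[A-Za-z0-9]+", text)]

def index_value(value):
    """Return the terms stored for a dynamically mapped string field (toy model)."""
    return {
        "lastname": set(analyze(value)),   # text field: analyzed tokens
        "lastname.keyword": {value},       # keyword sub-field: the raw value
    }

def terms_match(indexed, field, candidates):
    """A terms query matches if any candidate equals a stored term exactly."""
    return any(c in indexed[field] for c in candidates)

doc = index_value("William")
print(terms_match(doc, "lastname", ["William"]))          # False: stored token is "william"
print(terms_match(doc, "lastname.keyword", ["William"]))  # True: raw value kept as-is
print(terms_match(doc, "lastname", ["william"]))          # True: matches the lowercased token
```

The search term is never analyzed in this model, which is the behaviour the answers above describe for term-level queries.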
I am having a problem while querying Elasticsearch. Below is my query:
GET _search {
"query": {
"bool": {
"must": [{
"match": {
"name": "SomeName"
}
},
{
"match": {
"type": "SomeType"
}
},
{
"match": {
"productId": "ff134be8-10fc-4461-b620-79s51199c7qb"
}
},
{
"range": {
"request_date": {
"from": "2018-08-22T12:16:37,392",
"to": "2018-08-28T12:17:41,137",
"format": "YYYY-MM-dd'T'HH:mm:ss,SSS"
}
}
}
]
}
}
}
I am using three match queries and a range query inside the bool query. My intention is to get docs with these exact matches and within this date range. If I change the name or type value, I get no results. But for productId, if I put just ff134be8, I still get results. Does anyone know why? The exact match works on name and type but not on productId.
You need to set the mapping of your productId to keyword to avoid the tokenization. With the standard tokenizer "ff134be8-10fc-4461-b620-79s51199c7qb" will create ["ff134be8", "10fc", "4461", "b620", "79s51199c7qb"] as tokens.
You have different options:
1/ use a term query, which does not analyze the search term (productId must be mapped as keyword for the full value to match)
...
{
"term": {
"productId": "ff134be8-10fc-4461-b620-79s51199c7qb"
}
},
...
2/ if you are on Elasticsearch 6.X you could change your request to
...
{
"match": {
"productId.keyword": "ff134be8-10fc-4461-b620-79s51199c7qb"
}
},
...
This works because Elasticsearch creates a keyword sub-field of type keyword for every dynamically mapped string field.
The best option is, of course, the first one. Always use a term query if you are trying to match the exact content.
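The tokenization described above can be sketched in a few lines of Python. This is a simplified approximation of the standard tokenizer (the regex and the function names are assumptions for illustration, not Elasticsearch code), but it reproduces the token list given in the answer and shows why a match on just ff134be8 still hits.

```python
import re

def standard_tokens(text):
    """Approximate the standard analyzer: split on non-alphanumerics, lowercase."""
    return [t.lower() for t in re.findall(r"[A-Za-z0-9]+", text)]

def match_any(field_value, query_text):
    """Toy match query with the default OR operator: any shared token is a hit."""
    return bool(set(standard_tokens(field_value)) & set(standard_tokens(query_text)))

def term_exact(stored_terms, query_text):
    """Toy term query: the unanalyzed query text must equal a stored term."""
    return query_text in stored_terms

product_id = "ff134be8-10fc-4461-b620-79s51199c7qb"

print(standard_tokens(product_id))
# ['ff134be8', '10fc', '4461', 'b620', '79s51199c7qb']

print(match_any(product_id, "ff134be8"))     # True: one token in common is enough
print(term_exact({product_id}, "ff134be8"))  # False: keyword field stores the whole value
print(term_exact({product_id}, product_id))  # True: exact value matches
```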
I am fetching documents from Elasticsearch indexes, using a whitespace tokenizer with a stemmer.
Please find my mapping below.
PUT stemmer_lower_test
{
"settings": {
"analysis": {
"analyzer": {
"value_analyzer": {
"type": "custom",
"tokenizer": "whitespace",
"char_filter": [
"html_strip"
],
"filter": ["lowercase", "asciifolding", "my_stemmer"]
}
},
"filter" : {
"my_stemmer" : {
"type" : "stemmer",
"name" : "minimal_english"
}
}
}
},
"mappings": {
"doc": {
"properties": {
"product_attr_value": {
"type": "text",
"analyzer": "value_analyzer"
},
"product_id": {
"type": "long"
},
"product_name":{
"type": "text"
}
}
}
}
}
Here is the fuzzy query I am using:
QueryBuilder qb1 = QueryBuilders.boolQuery()
.must(QueryBuilders.fuzzyQuery("product_attr_value", keyword).boost(0.0f).prefixLength(3).fuzziness(Fuzziness.AUTO).transpositions(true));
If I search for value (all lowercase), I get a count of around 1555. If I search for Value (only the first character in uppercase), I get a count of 8979.
I expect both counts to be the same, i.e. I want the search to be case insensitive.
The fuzzy query is a term-level query, so Elasticsearch won't apply any analyzer to your search term. You have to normalize it yourself before submitting your search to ES. The same goes for several other query types.
While full-text queries analyze the query string before executing, term-level queries operate on the exact terms stored in the inverted index.
See https://www.elastic.co/guide/en/elasticsearch/reference/current/term-level-queries.html
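As a rough illustration of "normalize it before submitting", the sketch below lowercases and ASCII-folds the search term on the client and then builds the fuzzy query body. It is written in Python rather than with the Java QueryBuilders from the question, the function names are hypothetical, and it only mirrors the lowercase and asciifolding filters; it does not reproduce the minimal_english stemmer, so stemmed terms in the index may still differ from the normalized input.

```python
import unicodedata

def normalize_term(term):
    """Lowercase and strip diacritics, roughly like lowercase + asciifolding.

    Assumption: stemming is NOT applied here.
    """
    decomposed = unicodedata.normalize("NFKD", term.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def build_fuzzy_query(field, term):
    """Build a fuzzy query body with the same options as the question's Java code."""
    return {
        "query": {
            "fuzzy": {
                field: {
                    "value": normalize_term(term),
                    "fuzziness": "AUTO",
                    "prefix_length": 3,
                    "transpositions": True,
                }
            }
        }
    }

# "Value" and "value" now produce the same request body.
print(build_fuzzy_query("product_attr_value", "Value"))
```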
I have a sample json which I want to index into elasticsearch.
Sample Json Indexed:
put test/names/1
{
"1" : {
"name":"abc"
},
"2" : {
"name":"def"
},
"3" : {
"name":"xyz"
}
}
where ,
index name : test,
type name : names,
id :1
Now the default mapping generated by elasticsearch is :
{
"test": {
"mappings": {
"names": {
"properties": {
"1": {
"properties": {
"name": {
"type": "string"
}
}
},
"2": {
"properties": {
"name": {
"type": "string"
}
}
},
"3": {
"properties": {
"name": {
"type": "string"
}
}
},
"metadataFieldDefinition": {
"properties": {
"name": {
"type": "string"
}
}
}
}
}
}
}
}
If the map size grows from 3 (currently) to, say, a thousand or a million, Elasticsearch will create a mapping for each key, which may cause a performance issue as the mapping will be huge.
I tried creating a mapping by setting:
"dynamic":false,
"type":object
but it was overridden by ES, since it didn't match the indexed data.
Please let me know how I can define a mapping so that ES does not create one like the above.
I think there might be a little confusion here in terms of how we index documents.
put test/names/1
{...
document
...}
This says: the following document belongs to index test and is of type names with id 1. The entire document is treated as a single document of type names. Using the PUT API as you currently are, you cannot index multiple documents at once. ES interprets 1, 2, and 3 as properties of type object, each containing a property name of type string.
Effectively, ES thinks you are trying to index ONE document instead of three.
To get many documents into index test with a type of names, you could do this, using curl:
curl -XPUT "http://your-es-server:9200/test/names/1" -d '
{
"name": "abc"
}'
curl -XPUT "http://your-es-server:9200/test/names/2" -d '
{
"name": "def"
}'
curl -XPUT "http://your-es-server:9200/test/names/3" -d '
{
"name": "xyz"
}'
This specifies the document ID in the endpoint you are indexing to. Your mapping will then look like this:
"test": {
"mappings": {
"names": {
"properties": {
"name": {
"type": "string"
}
}
}
}
}
Final Word: Split your indexing up into discrete operations, or check out the Bulk API to see the syntax on how to POST multiple operations in a single request.
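For the Bulk API route, the request body is newline-delimited JSON: one action line followed by one source line per document, with a trailing newline. The Python sketch below only builds that body as a string; `build_bulk_body` is a hypothetical helper, and the `_type` key is included on the assumption that you are on an older Elasticsearch version that still uses mapping types, as in the question.

```python
import json

def build_bulk_body(index, doc_type, docs):
    """Build an NDJSON bulk body: an action line, then the document source."""
    lines = []
    for doc_id, source in docs.items():
        action = {"index": {"_index": index, "_type": doc_type, "_id": doc_id}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(source))
    # The bulk body must end with a newline.
    return "\n".join(lines) + "\n"

body = build_bulk_body(
    "test",
    "names",
    {"1": {"name": "abc"}, "2": {"name": "def"}, "3": {"name": "xyz"}},
)
print(body)
```

The resulting string would be sent to the _bulk endpoint; each entry becomes its own document, so the mapping keeps a single name property.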
I've been messing around with this problem for quite some time now and haven't managed to fix it.
Take the following case:
I have 2 employees in my company who each have their own blog page:
POST blog/page/1
{
"author": "Byron",
"author-title": "Junior Software Developer",
"content" : "My amazing bio"
}
and
POST blog/page/2
{
"author": "Jason",
"author-title": "Senior Software Developer",
"content" : "My amazing bio is better"
}
After they created their blog posts, we would like to keep track of the 'views' of their blogs and boost search results based on their 'views'.
This can be done by using the function score query:
GET blog/_search
{
"query": {
"function_score": {
"query": {
"match": {
"author-title": "developer"
}
},
"functions": [
{
"filter": {
"range": {
"views": {
"from": 1
}
}
},
"field_value_factor": {
"field": "views"
}
}
]
}
}
}
I use the range filter to make sure the field_value_factor doesn't affect the score when the amount of views is 0 (score would be also 0).
Now when I try to run this query, I will get the following exception:
nested: ElasticsearchException[Unable to find a field mapper for field [views]]; }]
Which makes sense, because the field doesn't exist anywhere in the index.
If I were to add views = 0 at index time, I wouldn't have the above issue, as the field would be known within the index. But in my use case I'm unable to add this either at index time or to a mapping.
Based on the ability to use a range filter within the function score query, I thought I would be able to use an exists filter to make sure that the field_value_factor part would only be executed when the field is actually present in the index, but no such luck:
GET blog/_search
{
"query": {
"function_score": {
"query": {
"match": {
"author-title": "developer"
}
},
"functions": [
{
"filter": {
"bool": {
"must": [
{
"exists": {
"field": "views"
}
},
{
"range": {
"views": {
"from": 1
}
}
}
]
}
},
"field_value_factor": {
"field": "views"
}
}
]
}
}
}
Still gives:
nested: ElasticsearchException[Unable to find a field mapper for field [views]]; }]
Where I'd expect Elasticsearch to apply the filter first, before parsing the field_value_factor.
Any thoughts on how to fix this issue without using mapping files, index-time changes, or scripts?
The error you're seeing occurs at query parsing time, i.e. nothing has been executed yet. At that time, the FieldValueFactorFunctionParser builds the field_value_factor function to be executed later, but it notices that the views field doesn't exist in the mapping type.
Note that the filter has not been executed yet; just like the field_value_factor function, it has only been parsed by FunctionScoreQueryParser.
I'm wondering why you can't simply add the field to your mapping type; it's as easy as running this:
curl -XPUT 'http://localhost:9200/blog/_mapping/page' -d '{
"page" : {
"properties" : {
"views" : {"type" : "integer"}
}
}
}'
If this is REALLY not an option, another possibility would be to use script_score instead, like this:
{
"query": {
"function_score": {
"query": {
"match": {
"author-title": "developer"
}
},
"functions": [
{
"filter": {
"range": {
"views": {
"from": 1
}
}
},
"script_score": {
"script": "_score * doc.views.value"
}
}
]
}
}
}
I need to run queries with 1000 objects. Using the /batch endpoint I can get this to work, but it is too slow (30 seconds with 300 items).
So I'm trying the same approach as described on this docs page: http://docs.neo4j.org/chunked/2.0.1/rest-api-cypher.html#rest-api-create-mutiple-nodes-with-properties
POST this JSON to http://localhost:7474/db/data/cypher
{
"params": {
"props": [
{
"_user_id": "177032492760",
"_user_name": "John"
},
{
"_user_id": "177032492760",
"_user_name": "Mike"
},
{
"_user_id": "100007496328",
"_user_name": "Wilber"
}
]
},
"query": "MERGE (user:People {id:{_user_id}}) SET user.id = {_user_id}, user.name = {_user_name} "
}
The problem is I'm getting this error:
{ message: 'Expected a parameter named _user_id',
exception: 'ParameterNotFoundException',
fullname: 'org.neo4j.cypher.ParameterNotFoundException',
stacktrace:
...
Maybe this works only with CREATE queries, as shown on the docs page?
Use FOREACH and MERGE with ON CREATE SET:
FOREACH (p IN {props} |
MERGE (user:People {id: p._user_id})
ON CREATE SET user.name = p._user_name)
POST this JSON to http://localhost:7474/db/data/cypher
{
"params": {
"props": [
{
"_user_id": "177032492760",
"_user_name": "John"
},
{
"_user_id": "177032492760",
"_user_name": "Mike"
},
{
"_user_id": "100007496328",
"_user_name": "Wilber"
}
]
},
"query": "FOREACH (p IN {props} | MERGE (user:People {id: p._user_id}) ON CREATE SET user.name = p._user_name)"
}
Actually, the equivalent to the example in the doc would be:
{
"params": {
"props": [
{
"id": "177032492760",
"name": "John"
},
{
"id": "177032492760",
"name": "Mike"
},
{
"id": "100007496328",
"name": "Wilber"
}
]
},
"query": "CREATE (user:People {props})"
}
It might be legal to replace CREATE with MERGE, but the query may not do what you expect.
For example, if a node with the id "177032492760" already exists, but it does not have the name "John", then the MERGE will create a new node; and you'd end up with 2 nodes with the same id (but different names).
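The duplicate-node pitfall can be illustrated with a toy in-memory model of MERGE in Python. This is not Neo4j code: `merge` is a hypothetical function that only mimics the "match on every property in the pattern, otherwise create" behaviour described above, and the name "Johnny" is made-up sample data.

```python
def merge(nodes, pattern):
    """Toy MERGE: reuse a node that matches every property in pattern, else create one."""
    for node in nodes:
        if all(node.get(key) == value for key, value in pattern.items()):
            return node
    node = dict(pattern)
    nodes.append(node)
    return node

# Merging on the whole map: the name differs, so nothing matches and a duplicate id appears.
graph_a = [{"id": "177032492760", "name": "Johnny"}]
merge(graph_a, {"id": "177032492760", "name": "John"})
print(len(graph_a))  # 2

# Merging on the id only: the existing node is found and reused.
graph_b = [{"id": "177032492760", "name": "Johnny"}]
merge(graph_b, {"id": "177032492760"})
print(len(graph_b))  # 1
```

This is why the answers here merge on the id alone and set the remaining properties separately.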
Yes, a CREATE statement can take an array of maps and implicitly convert it to several statements with one map each, but you can't use arrays of maps that way outside of simple CREATE statements. In fact, you can't use literal maps the same way either when you use MERGE and MATCH. You can CREATE ({map}), but you have to MATCH/MERGE ({prop:{map}.val}), i.e.
// {props:{name:Fred, age:2}}
MERGE (a {name:{props}.name})
ON CREATE SET a = {props}
For your purposes, either send individual parameter maps with a query like the one above or, for an array of maps, iterate through it with FOREACH:
FOREACH (p IN {props} |
MERGE (user:People {id:p._user_id})
ON CREATE SET user = p)
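Putting the pieces together, the request body for the legacy /db/data/cypher endpoint could be assembled like this in Python. It is a sketch only: `build_cypher_payload` is a hypothetical helper, the query uses the Neo4j 2.0 style `{props}` parameter syntax from this thread, and it sets just the name on creation (as in the earlier FOREACH answer) so the _user_id and _user_name keys are not copied onto the node as-is.

```python
import json

# One statement that iterates over the whole parameter list.
QUERY = (
    "FOREACH (p IN {props} | "
    "MERGE (user:People {id: p._user_id}) "
    "ON CREATE SET user.name = p._user_name)"
)

def build_cypher_payload(props):
    """Wrap the list of user maps as the 'props' parameter of a single Cypher call."""
    return {"query": QUERY, "params": {"props": list(props)}}

payload = build_cypher_payload([
    {"_user_id": "177032492760", "_user_name": "John"},
    {"_user_id": "100007496328", "_user_name": "Wilber"},
])
print(json.dumps(payload, indent=2))
```

Sending all maps in one request avoids a round trip per object, which is the slowdown the question describes with the /batch endpoint.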