elasticsearch still creating mappings ad-hoc with dynamic mapping disabled [duplicate] - java

I'm trying to disable dynamic mapping creation for specific indexes only, not for all of them. For some reason I can't put a default mapping with 'dynamic' : 'false'.
So, as far as I can see, two options are left:
Specify the property 'index.mapper.dynamic' in the file elasticsearch.yml.
Set 'index.mapper.dynamic' at index creation time, as described here https://www.elastic.co/guide/en/kibana/current/setup.html#kibana-dynamic-mapping
The first option accepts only the values true, false and strict, so there is no way to restrict it to a subset of indexes (as we can do by pattern with the property 'action.auto_create_index' https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-index_.html#index-creation).
The second option simply does not work.
I created an index:
POST http://localhost:9200/test_idx/
{
"settings" : {
"mapper" : {
"dynamic" : false
}
},
"mappings" : {
"test_type" : {
"properties" : {
"field1" : {
"type" : "string"
}
}
}
}
}
Then I checked the index settings:
GET http://localhost:9200/test_idx/_settings
{
"test_idx" : {
"settings" : {
"index" : {
"mapper" : {
"dynamic" : "false"
},
"creation_date" : "1445440252221",
"number_of_shards" : "1",
"number_of_replicas" : "0",
"version" : {
"created" : "1050299"
},
"uuid" : "5QSYSYoORNqCXtdYn51XfA"
}
}
}
}
and the mapping:
GET http://localhost:9200/test_idx/_mapping
{
"test_idx" : {
"mappings" : {
"test_type" : {
"properties" : {
"field1" : {
"type" : "string"
}
}
}
}
}
}
So far so good. Let's index a document with an undeclared field:
POST http://localhost:9200/test_idx/test_type/1
{
"field1" : "it's ok, field must be in mapping and in source",
"somefield" : "but this field must be in source only, not in mapping"
}
Then I checked the mapping again:
GET http://localhost:9200/test_idx/_mapping
{
"test_idx" : {
"mappings" : {
"test_type" : {
"properties" : {
"field1" : {
"type" : "string"
},
"somefield" : {
"type" : "string"
}
}
}
}
}
}
As you can see, the mapping is extended regardless of the index setting "dynamic" : false.
I've also tried to create the index exactly as described in the docs:
PUT http://localhost:9200/test_idx
{
"index.mapper.dynamic": false
}
but I got the same behavior.
Have I missed something?
Thanks a lot in advance!

You're almost there: the value needs to be set to strict.
And the correct usage is the following:
PUT /test_idx
{
"mappings": {
"test_type": {
"dynamic":"strict",
"properties": {
"field1": {
"type": "string"
}
}
}
}
}
And pushing this a bit further, if you want to forbid the creation even of new types, not only fields in that index, use this:
PUT /test_idx
{
"mappings": {
"_default_": {
"dynamic": "strict"
},
"test_type": {
"properties": {
"field1": {
"type": "string"
}
}
}
}
}
Without the _default_ template:
PUT /test_idx
{
"settings": {
"index.mapper.dynamic": false
},
"mappings": {
"test_type": {
"dynamic": "strict",
"properties": {
"field1": {
"type": "string"
}
}
}
}
}

Note that the part below only means that ES cannot create a type dynamically:
"mapper" : {
"dynamic" : false
}
You should configure ES like this:
PUT http://localhost:9200/test_idx/_mapping/test_type
{
"dynamic":"strict"
}
Then you can no longer index a field that has no mapping, and you get an error like the following:
mapping set to strict, dynamic introduction of [hatae] within [data] is not allowed
If you want to keep the data in _source but leave the field unindexed, use this setting instead:
PUT http://localhost:9200/test_idx/_mapping/test_type
{
"dynamic":false
}
Hope this helps people with the same issue :).

The answer is in the docs (7.x): https://www.elastic.co/guide/en/elasticsearch/reference/7.x/dynamic.html
The dynamic setting controls whether new fields can be added
dynamically or not. It accepts three settings:
true
Newly detected fields are added to the mapping. (default)
false
Newly detected fields are ignored. These fields will not be indexed so
will not be searchable but will still appear in the _source field of
returned hits. These fields will not be added to the mapping, new
fields must be added explicitly.
strict
If new fields are detected, an exception is thrown and the document is
rejected. New fields must be explicitly added to the mapping.
PUT my_index
{
"mappings": {
"dynamic": "strict",
"properties": {
"user": {
"properties": {
"name": {
"type": "text"
},
"social_networks": {
"dynamic": true,
"properties": {}
}
}
}
}
}
}
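The three values described above can be illustrated with a small toy model in plain Python. This is only a sketch of the documented behaviour for top-level fields, not how Elasticsearch implements it: the helper index_document, the exception StrictMappingError, and the choice to map every new field as "text" are assumptions made for illustration.

```python
from copy import deepcopy


class StrictMappingError(Exception):
    """Raised by this toy model when "strict" meets an unmapped field."""


def index_document(mapping, doc):
    """Toy model of how top-level fields are handled under each `dynamic` value.

    Returns (mapping_after, source, indexed_fields). Hypothetical helper,
    not an Elasticsearch API.
    """
    dynamic = str(mapping.get("dynamic", True)).lower()  # the default is true
    if dynamic not in ("true", "false", "strict"):
        raise ValueError("unsupported dynamic value: %r" % dynamic)

    mapping_after = deepcopy(mapping)
    properties = mapping_after.setdefault("properties", {})
    indexed_fields = []

    for field in doc:
        if field in properties:
            indexed_fields.append(field)
        elif dynamic == "true":
            # Simplification: every newly detected field is mapped as "text".
            properties[field] = {"type": "text"}
            indexed_fields.append(field)
        elif dynamic == "strict":
            raise StrictMappingError(
                "dynamic introduction of [%s] is not allowed" % field
            )
        # dynamic == "false": the field is not indexed, but stays in the source.

    return mapping_after, dict(doc), indexed_fields


mapping = {"dynamic": False, "properties": {"field1": {"type": "text"}}}
doc = {"field1": "mapped", "somefield": "unmapped"}
mapping_after, source, indexed = index_document(mapping, doc)
print(sorted(mapping_after["properties"]))  # ['field1']
print(sorted(source))                       # ['field1', 'somefield']
print(indexed)                              # ['field1']
```

With "dynamic": "strict" the same call raises StrictMappingError instead, which mirrors the rejected document described in the docs excerpt.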

You cannot disable dynamic mapping in ES 7 anymore. What you can do, if you have completely unstructured data, is disable the mapping for the index completely, like this:
curl -X PUT "localhost:9200/my_index?pretty" -H 'Content-Type: application/json' -d'
{
"mappings": {
"enabled": false
}
}
'
If you are using Python, you can do this:
from elasticsearch import Elasticsearch

# Connect to the Elasticsearch cluster
es = Elasticsearch([{'host': 'localhost', 'port': 9200}])

# Disable the mapping for the whole index
request_body = {
    "mappings": {
        "enabled": False
    }
}
es.indices.create(index='my_index', body=request_body)

For ES 7 if you want to update an existing index:
PUT customers/_mapping
{
"dynamic": "strict"
}

First, be aware that the values false and strict work differently.
With "dynamic": false, when you create documents with fields not covered by the mapping, those fields are ignored for indexing: they are not added to the mapping and are not searchable, but they still show up in _source when you GET the document.
With strict, the document is not created at all; an exception is thrown instead.
Inner objects inherit the dynamic setting from their parent object or from the mapping type. In the following example, dynamic mapping is disabled at the type level, so no new top-level fields will be added dynamically.
However, the user.social_networks object enables dynamic mapping, so you can add fields to this inner object.
https://www.elastic.co/guide/en/elasticsearch/reference/current/dynamic.html
PUT my-index-000001
{
"mappings": {
"dynamic": false,
"properties": {
"user": {
"properties": {
"name": {
"type": "text"
},
"social_networks": {
"dynamic": true,
"properties": {}
}
}
}
}
}
}
If you are using the Node.js client:
await this.client.indices.putMapping({
index: ElasticIndex.UserDataFactory,
body: {
dynamic: 'strict',
properties: {
...this.schema,
},
},
});

Related

Jolt Transform JSON Spec

I need to transform the input JSON below into the output JSON, and I'm not sure how to write the spec for that. I need to re-position one field ("homePage") as a root element. Any help or suggestion would be appreciated.
Input JSON :
[{
"uuid": "cac40601-ffc9-4fd0-c5a1-772ac65f0587",
"pageId": 123456,
"page": {
"indexable": true,
"rootLevel": false,
"homePage": false
}
}]
Output JSON :
[{
"uuid": "cac40601-ffc9-4fd0-c5a1-772ac65f0587",
"pageId": 123456,
"homePage": false,
"page": {
"indexable": true,
"rootLevel": false
}
}]
This Jolt Spec should work for you. Tested with https://jolt-demo.appspot.com/
[
{
"operation": "shift",
"spec": {
"*": {
"uuid": "[&1].uuid",
"pageId": "[&1].pageId",
"page": {
"indexable": "[&2].page.indexable",
"rootLevel": "[&2].page.rootLevel",
"homePage": "[&2].homePage"
}
}
}
}
]
input:
[ {
"uuid" : "cac40601-ffc9-4fd0-c5a1-772ac65f0587",
"pageId" : 123456,
"page" : {
"indexable" : true,
"rootLevel" : false,
"homePage" : false
}
} ]
output:
[ {
"uuid" : "cac40601-ffc9-4fd0-c5a1-772ac65f0587",
"pageId" : 123456,
"page" : {
"indexable" : true,
"rootLevel" : false
},
"homePage" : false
} ]
Explanation:
From the javadoc
& Path lookup
As Shiftr processes data and walks down the spec, it maintains a data structure describing the path it has walked.
The & wildcard can access data from that path in a 0 major, upward oriented way.
Example:
{
"foo" : {
"bar": {
"baz": // &0 = baz, &1 = bar, &2 = foo
}
}
}
Next thing: How to wrap the output object into the array?
A good example can be found in this post.
So, in our case:
"[&1].uuid" says:
Place the uuid value in the object inside the array. The index of the array is indicated by the &1 wildcard. For uuid it will be the index of the array, where the object with uuid key is placed in the original json.
Next, [&2] is similar to [&1]. However, the "indexable" key is one level deeper in the input JSON. That's why we used [&2] instead of [&1] (have another look at the foo-bar example from the docs).
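For readers who want to check the expected result outside Jolt, the same re-positioning can be written as a few lines of plain Python. This is not Jolt and not a translation of the spec syntax; it is only a sketch of the transformation the spec performs, and the function name hoist_home_page is made up for illustration.

```python
def hoist_home_page(pages):
    """Return a new list in which each item's page.homePage is moved to the item root."""
    result = []
    for item in pages:
        moved = dict(item)  # shallow copy so the input is left untouched
        if isinstance(item.get("page"), dict):
            page = dict(item["page"])
            if "homePage" in page:
                moved["homePage"] = page.pop("homePage")
            moved["page"] = page
        result.append(moved)
    return result


input_json = [{
    "uuid": "cac40601-ffc9-4fd0-c5a1-772ac65f0587",
    "pageId": 123456,
    "page": {"indexable": True, "rootLevel": False, "homePage": False},
}]
print(hoist_home_page(input_json))
```

The list index is preserved simply by iterating in order, which is the job the [&1] and [&2] references do in the spec.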

How to get nested types in Elasticsearch

I have the following document:
{
"_index" : "testdb",
"_type" : "artWork",
"_id" : "0",
"_version" : 4,
"found" : true,
"_source":{"uuid":0,
"StatusHistoryList":[
{
"ArtWorkDate":"2015-08-28T15:52:03.030+05:00",
"ArtworkStatus":"ACTIVE"
},
{
"ArtWorkDate":"2015-08-28T15:52:03.030+05:00",
"ArtworkStatus":"INACTIVE"
}
]
}
}
and here is the mapping of the document:
{
"testdb" : {
"mappings" : {
"artWork" : {
"properties" : {
"StatusHistoryList" : {
"type" : "nested",
"properties" : {
"ArtWorkDate" : {
"type" : "string",
"store" : true
},
"ArtworkStatus" : {
"type" : "string",
"store" : true
}
}
},
"uuid" : {
"type" : "integer",
"store" : true
}
}
}
}
}
}
Now I want to access the values of StatusHistoryList. I get null values if I do it like this:
val get = client.prepareGet("testdb", "artWork", Id.toString()).setOperationThreaded(false)
.setFields("uuid","StatusHistoryList.ArtworkStatus","StatusHistoryList.ArtWorkDate","_source")
.execute()
.actionGet()
var artworkStatusList= get.getField("StatusHistoryList.ArtworkStatus").getValues.toArray()
var artWorkDateList= get.getField("StatusHistoryList.ArtWorkDate").getValues.toArray()
The code returned null values even though my document contains the values. Then I found this question,
so after that I tried to do it like this:
var smap = get.getSource.get("StatusHistoryList").asInstanceOf[Map[String,Object]]
but then a ClassCastException is thrown:
java.lang.ClassCastException: java.util.ArrayList cannot be cast to java.util.Map
Please help: how can I get the ArtworkStatus and ArtWorkDate values of StatusHistoryList? I would be very thankful for any guidance.
You have almost derived the solution. The GET request has retrieved the response; the problem is in parsing it.
Let's look at the problem. Below is the document that Elasticsearch returns as the source:
{
"uuid":0,
"StatusHistoryList":[
{
"ArtWorkDate":"2015-08-28T15:52:03.030+05:00",
"ArtworkStatus":"ACTIVE"
},
{
"ArtWorkDate":"2015-08-28T15:52:03.030+05:00",
"ArtworkStatus":"INACTIVE"
}
]
}
When we call get.getSource.get("StatusHistoryList"), it returns a List of objects (a java.util.ArrayList), not a Map. That is the reason for the ClassCastException.
So if you cast the response to a list of objects, your problem will be solved.
But this would not be an ideal solution. Libraries such as Jackson (FasterXML) do the job for you: with Jackson you can bind the JSON to an equivalent POJO.
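The shape of the problem is easy to see in a language-neutral way. The Python sketch below is only a stand-in for the Scala code in the question: it treats the _source as a dictionary whose StatusHistoryList value is a list of dictionaries, and reads the two fields by iterating over that list. The helper name status_history is hypothetical.

```python
def status_history(source):
    """Collect ArtworkStatus and ArtWorkDate values from a parsed _source dict."""
    history = source.get("StatusHistoryList", [])  # a list, not a map
    statuses = [entry["ArtworkStatus"] for entry in history]
    dates = [entry["ArtWorkDate"] for entry in history]
    return statuses, dates


source = {
    "uuid": 0,
    "StatusHistoryList": [
        {"ArtWorkDate": "2015-08-28T15:52:03.030+05:00", "ArtworkStatus": "ACTIVE"},
        {"ArtWorkDate": "2015-08-28T15:52:03.030+05:00", "ArtworkStatus": "INACTIVE"},
    ],
}
statuses, dates = status_history(source)
print(statuses)  # ['ACTIVE', 'INACTIVE']
```

On the JVM the same idea applies: treat the value as a list of maps and iterate over it, rather than casting the whole value to a single Map.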

ElasticSearch mapping for dynamic keys for indexing a map

I have a sample json which I want to index into elasticsearch.
Sample Json Indexed:
put test/names/1
{
"1" : {
"name":"abc"
},
"2" : {
"name":"def"
},
"3" : {
"name":"xyz"
}
}
where ,
index name : test,
type name : names,
id :1
Now the default mapping generated by elasticsearch is :
{
"test": {
"mappings": {
"names": {
"properties": {
"1": {
"properties": {
"name": {
"type": "string"
}
}
},
"2": {
"properties": {
"name": {
"type": "string"
}
}
},
"3": {
"properties": {
"name": {
"type": "string"
}
}
},
"metadataFieldDefinition": {
"properties": {
"name": {
"type": "string"
}
}
}
}
}
}
}
}
If the map grows from 3 entries (currently) to, say, thousands or millions, Elasticsearch will create a mapping for each key, which may cause performance issues because the mapping will be huge.
I tried creating a mapping by setting:
"dynamic":false,
"type":object
but it was overridden by ES, since it didn't match the indexed data.
Please let me know how I can define a mapping so that ES does not create one like the above.
I think there might be a little confusion here in terms of how we index documents.
put test/names/1
{...
document
...}
This says: the following document belongs to index test and is of type names with id 1. The entire document is treated as type names. Using the PUT API as you currently are, you cannot index multiple documents at once. ES immediately interprets 1, 2, and 3 as properties of type object, each containing a property name of type string.
Effectively, ES thinks you are trying to index ONE document instead of three.
To get many documents into index test with a type of names, you could do this, using the curl syntax:
curl -XPUT "http://your-es-server:9200/test/names/1" -d'
{
"name": "abc"
}'
curl -XPUT "http://your-es-server:9200/test/names/2" -d'
{
"name": "ghi"
}'
curl -XPUT "http://your-es-server:9200/test/names/3" -d'
{
"name": "xyz"
}'
This specifies the document ID in the endpoint you are indexing to. Your mapping will then look like this:
"test": {
"mappings": {
"names": {
"properties": {
"name": {
"type": "string"
}
}
}
}
}
Final Word: Split your indexing up into discrete operations, or check out the Bulk API to see the syntax on how to POST multiple operations in a single request.
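As a rough sketch of the Bulk API route, the snippet below turns the keyed map from the question into a newline-delimited bulk body, with one action line and one source line per document. The helper name build_bulk_body is hypothetical, the _type entry assumes an older Elasticsearch version that still has mapping types (as in the question), and sending the body to the _bulk endpoint is left out.

```python
import json


def build_bulk_body(index, doc_type, docs_by_id):
    """Build a newline-delimited bulk body that indexes one document per map entry."""
    lines = []
    for doc_id, doc in docs_by_id.items():
        # Action line: which index, type and ID the next line belongs to.
        action = {"index": {"_index": index, "_type": doc_type, "_id": doc_id}}
        lines.append(json.dumps(action))
        # Source line: the document itself.
        lines.append(json.dumps(doc))
    # The bulk body must end with a newline.
    return "\n".join(lines) + "\n"


names = {
    "1": {"name": "abc"},
    "2": {"name": "def"},
    "3": {"name": "xyz"},
}
print(build_bulk_body("test", "names", names))
```

Each map key becomes a document ID, so the mapping only ever contains the single name property, no matter how many entries the map has.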

Mappings are not getting updated in ElasticSearch using java code

I have tried updating my mappings in Elasticsearch using Java code,
but to my dismay the mappings are not getting updated.
The following is my code:
String settingsAsJson =getJsonString("settingFile");
client.admin().indices().close(new CloseIndexRequest(indexName)).actionGet().isAcknowledged();
boolean settingsAck = client.admin().indices().prepareUpdateSettings().setSettings(settingsAsJson).setIndices(indexName).execute().actionGet().isAcknowledged();
System.out.println("Applied SETTINGS "+ settingsAck);
String mappingAsJson = getJsonString("mappingFile");
boolean mappingack = client.admin().indices().preparePutMapping().setIndices(indexName).setType(indexType).setSource(mappingAsJson).execute().actionGet().isAcknowledged();
System.out.println("Applied Mappings "+ mappingack);
client.admin().indices().open(new OpenIndexRequest(indexName));
client.admin().indices().prepareFlush(indexName);
The following is my mapping file:
{
"profiles": {
"dynamic" : "true",
"_all": {
"type": "string",
"analyzer": "standard"
},
"properties": {
"employee_id": {
"type": "integer",
"analyzer":"standard"
}
}
}
}
Also note that only the dynamic property changes; none of the other new mapping updates are applied.
Is there any other way to update mappings using Java code?

Function score query with field_value_factor on not (yet) existing field

I've been messing around with this problem for quite some time now and can't get round to fixing this.
Take the following case:
I have 2 employees in my company which have their own blog page:
POST blog/page/1
{
"author": "Byron",
"author-title": "Junior Software Developer",
"content" : "My amazing bio"
}
and
POST blog/page/2
{
"author": "Jason",
"author-title": "Senior Software Developer",
"content" : "My amazing bio is better"
}
After they created their blog posts, we would like to keep track of the 'views' of their blogs and boost search results based on their 'views'.
This can be done by using the function score query:
GET blog/_search
{
"query": {
"function_score": {
"query": {
"match": {
"author-title": "developer"
}
},
"functions": [
{
"filter": {
"range": {
"views": {
"from": 1
}
}
},
"field_value_factor": {
"field": "views"
}
}
]
}
}
}
I use the range filter to make sure the field_value_factor doesn't affect the score when the amount of views is 0 (score would be also 0).
Now when I try to run this query, I will get the following exception:
nested: ElasticsearchException[Unable to find a field mapper for field [views]]; }]
Which makes sense, because the field doesn't exist anywhere in the index.
If I were to add views = 0 at index time, I wouldn't have the above issue, as the field would be known within the index. But in my use case I'm unable to add this either at index time or to a mapping.
Based on the ability to use a range filter within the function score query, I thought I would be able to use an exists filter to make sure that the field_value_factor part would only be executed when the field is actually present in the index, but no such luck:
GET blog/_search
{
"query": {
"function_score": {
"query": {
"match": {
"author-title": "developer"
}
},
"functions": [
{
"filter": {
"bool": {
"must": [
{
"exists": {
"field": "views"
}
},
{
"range": {
"views": {
"from": 1
}
}
}
]
}
},
"field_value_factor": {
"field": "views"
}
}
]
}
}
}
Still gives:
nested: ElasticsearchException[Unable to find a field mapper for field [views]]; }]
Where I'd expect Elasticsearch to apply the filter first, before parsing the field_value_factor.
Any thoughts on how to fix this issue, without the use of mapping files or fixing during index-time or scripts??
The error you're seeing occurs at query parsing time, i.e. nothing has been executed yet. At that time, the FieldValueFactorFunctionParser builds the field_value_factor function to be executed later, but it notices that the views field doesn't exist in the mapping type.
Note that the filter has not been executed yet; just like the field_value_factor function, it has only been parsed by FunctionScoreQueryParser.
I'm wondering why you can't simply add a field to your mapping type. It's as easy as running this:
curl -XPUT 'http://localhost:9200/blog/_mapping/page' -d '{
"page" : {
"properties" : {
"views" : {"type" : "integer"}
}
}
}'
If this is REALLY not an option, another possibility would be to use script_score instead, like this:
{
"query": {
"function_score": {
"query": {
"match": {
"author-title": "developer"
}
},
"functions": [
{
"filter": {
"range": {
"views": {
"from": 1
}
}
},
"script_score": {
"script": "_score * doc.views.value"
}
}
]
}
}
}
