Elasticsearch: sort by value and dynamic templates - java

I need sorting for the documents like :
{
customer: {
fullname: "Lorem ipsum"
},
order_number: "12313131",
company: {
name: "Test Inc."
},
date: "10.06.2015 18:00"
}
But as far as I unterstood I can not sort by values in analysed fields. There I am trying to create a mapping :
{
"mappings": {
"_default_": {
"dynamic_templates": [
{
"base": {
"match": "*",
"mapping": {
"type": "multi_field",
"fields": {
"{name}": {"type": "string"},
"_sort": {"type": "string", "analyzer": "sort"}
}
}
}
}
]
}
},
"settings": {
"analysis": {
"analyzer": {
"sort": {
"type": "custom",
"tokenizer": "keyword",
"filter": "lowercase"
}
}
}
}
}
But if I put this configuration, I am getting an exception : ElasticsearchIllegalArgumentException: unknown property. Without this mapping my indexing works fine.
What i want to do is create a multifield called name_sort (not_analysed) so I can sort by values.
****
At leas I can able to create a mapping correctly. My mapping looks like:
{
"muhamo": {
"mappings": {
"bookings": {
"dynamic_templates": [
{
"base": {
"mapping": {
"index": "analyzed",
"type": "{dynamic_type}",
"fields": {
"{name}_sort": {
"index": "not_analyzed",
"type": "{dynamic_type}"
}
}
},
"match": "*",
"match_mapping_type": "string"
}
},
{
"catch_all": {
"mapping": {
"fields": {
"{name}_sort": {
"index": "not_analyzed",
"type": "{dynamic_type}"
}
}
},
"match": "*",
"match_mapping_type": "*"
}
}
],
"properties": {
"bookingType": {
"type": "string",
"fields": {
"bookingType_sort": {
"type": "string",
"index": "not_analyzed"
}
}
},
"comment": {
"type": "string",
"fields": {
"comment_sort": {
"type": "string",
"index": "not_analyzed"
}
}
},
"costLocation": {
"type": "string",
"fields": {
"costLocation_sort": {
"type": "string",
"index": "not_analyzed"
}
}
},
"customer": {
"properties": {
"fullname": {
"type": "string",
"fields": {
"fullname_sort": {
"type": "string",
"index": "not_analyzed"
}
}
}
}
},
"date": {
"type": "string",
"fields": {
"date_sort": {
"type": "string",
"index": "not_analyzed"
}
}
},
"deleted": {
"type": "boolean"
},
"toAirport": {
"type": "boolean"
}
}
}
}
}
}
But if I try to sort my results by customer.fullname_sort I am getting an exception as
query[ConstantScore(*:*)],from[-1],size[-1]: Parse Failure [No mapping found for [customer.fullname_sort] in order to sort on]

You should sort on customer.fullname.fullname_sort. That's the path to your field, according to the mapping of the index.

Related

How to term query nested json objects/fields in elastic search?

I am doing term aggregation based on field [type] like below but elastic is returning only 1 term count instead of 2 it is not doing nested object aggregation i.e under comments.data.comments[is a list] under this i have 2 type.
{
"aggs": {
"genres": {
"terms": {
"field": "comments.data.comments.type"
}
}
}
}
Gotta utilize the nested field type:
PUT events
{
"mappings": {
"properties": {
"events": {
"type": "nested",
"properties": {
"ecommerceData": {
"type": "nested",
"properties": {
"comments": {
"type": "nested",
"properties": {
"recommendationType": {
"type": "keyword"
}
}
}
}
}
}
}
}
}
}
POST events/_doc
{
"events": [
{
"eventId": "1",
"ecommerceData": [
{
"comments": [
{
"rank": 1,
"recommendationType": "abc"
},
{
"rank": 1,
"recommendationType": "abc"
}
]
}
]
}
]
}
GET events/_search
{
"size": 0,
"aggs": {
"genres": {
"nested": {
"path": "events.ecommerceData.comments"
},
"aggs": {
"nested_comments_recomms": {
"terms": {
"field": "events.ecommerceData.comments.recommendationType"
}
}
}
}
}
}

elasticsearch how to group by repetitive items in array without distinct

I'm trying to get the counts group by the repetitive items in array without distinct, use aggs terms but not work
GET /my_index/_search
{
"size": 0,
"aggs": {
"keywords": {
"terms": {
"field": "keywords"
}
}
}
}
documents like:
"keywords": [
"value1",
"value1",
"value2"
],
but the result is:
"buckets": [
{
"key": "value1",
"doc_count": 1
},
{
"key": "value2",
"doc_count": 1
}
]
how can i get the result like:
"buckets": [
{
"key": "value1",
"doc_count": 2
},
{
"key": "value2",
"doc_count": 1
}
]
finally I modify the mapping use nested:
"keywords": {
"type": "nested",
"properties": {
"count": {
"type": "integer"
},
"keyword": {
"type": "keyword"
}
}
},
and query:
GET /my_index/_search
{
"size": 0,
"aggs": {
"keywords": {
"nested": {
"path": "keywords"
},
"aggs": {
"keyword_name": {
"terms": {
"field": "keywords.keyword"
},
"aggs": {
"sums": {
"sum": {
"field": "keywords.count"
}
}
}
}
}
}
}
}
result:
"buckets": [{
"key": "value1",
"doc_count": 495,
"sums": {
"value": 609
}
},
{
"key": "value2",
"doc_count": 440,
"sums": {
"value": 615
}
},
{
"key": "value3",
"doc_count": 319,
"sums": {
"value": 421
}
},
...]

Validating json payload against swagger file - json-schema-validator

I am trying to validate a json payload against a swagger file that contains the service agreement. I am using the json-schema-validator(2.1.7) library to achieve this, but at the moment it's not validating against the specified patterns or min/max length.
Java Code:
public void validateJsonData(final String jsonData) throws IOException, ProcessingException {
ClassLoader classLoader = getClass().getClassLoader();
File jsonSchemaFile = new File (classLoader.getResource("coachingStatusUpdate.json").getFile());
String jsonSchema = new String(Files.readAllBytes(jsonSchemaFile.toPath()));
final JsonNode dataNode = JsonLoader.fromString(jsonData);
final JsonNode schemaNode = JsonLoader.fromString(jsonSchema);
final JsonSchemaFactory factory = JsonSchemaFactory.byDefault();
JsonValidator jsonValidator = factory.getValidator();
ProcessingReport report = jsonValidator.validate(schemaNode, dataNode);
System.out.println(report);
if (!report.toString().contains("success")) {
throw new ProcessingException (
report.toString());
}
}
Message I am sending through
{
"a": "b",
"c": "d",
"e": -1,
"f": "2018-10-30",
"g": "string" }
The swagger definition:
{
"swagger": "2.0",
"info": {
"version": "1.0.0",
"title": "Test",
"termsOfService": "http://www.test.co.za",
"license": {
"name": "Test"
}
},
"host": "localhost:9001",
"basePath": "/test/",
"tags": [
{
"name": "controller",
"description": "Submission"
}
],
"paths": {
"/a": {
"put": {
"tags": [
"controller"
],
"summary": "a",
"operationId": "aPUT",
"consumes": [
"application/json;charset=UTF-8"
],
"produces": [
"application/json;charset=UTF-8"
],
"parameters": [
{
"in": "body",
"name": "aRequest",
"description": "aRequest",
"required": true,
"schema": {
"$ref": "#/definitions/aRequest"
}
}
],
"responses": {
"200": {
"description": "Received",
"schema": {
"$ref": "#/definitions/a"
}
},
"400": {
"description": "Bad Request"
},
"401": {
"description": "Unauthorized"
},
"408": {
"description": "Request Timeout"
},
"500": {
"description": "Generic Error"
},
"502": {
"description": "Bad Gateway"
},
"503": {
"description": "Service Unavailable"
}
}
}
}
},
"definitions": {
"aRequest": {
"type": "object",
"required": [
"a",
"b",
"c",
"d"
],
"properties": {
"a": {
"type": "string",
"description": "Status",
"enum": [
"a",
"b",
"c",
"d",
"e",
"f",
"g",
"h"
]
},
"aReason": {
"type": "string",
"description": "Reason",
"enum": [
"a",
"b",
"c",
"d",
"e",
"f",
"g",
"h",
"i",
"j",
"k",
"l",
"m",
"n"
]
},
"correlationID": {
"type": "integer",
"format": "int32",
"description": "",
"minimum": 1,
"maximum": 9999999
},
"effectiveDate": {
"type": "string",
"format": "date",
"description": ""
},
"f": {
"type": "string",
"description": "",
"minLength": 1,
"maxLength": 100
}
}
},
"ResponseEntity": {
"type": "object",
"properties": {
"body": {
"type": "object"
},
"statusCode": {
"type": "string",
"enum": [
"100",
"101",
"102",
"103",
"200",
"201",
"202",
"203",
"204",
"205",
"206",
"207",
"208",
"226",
"300",
"301",
"302",
"303",
"304",
"305",
"307",
"308",
"400",
"401",
"402",
"403",
"404",
"405",
"406",
"407",
"408",
"409",
"410",
"411",
"412",
"413",
"414",
"415",
"416",
"417",
"418",
"419",
"420",
"421",
"422",
"423",
"424",
"426",
"428",
"429",
"431",
"451",
"500",
"501",
"502",
"503",
"504",
"505",
"506",
"507",
"508",
"509",
"510",
"511"
]
},
"statusCodeValue": {
"type": "integer",
"format": "int32"
}
}
}
}
}
As you can see I am sending through a correlationID of -1, which should fail validation, but at the moment is's returning as successful:
com.github.fge.jsonschema.report.ListProcessingReport: success
I suggest using this library, which worked for me:
https://github.com/bjansen/swagger-schema-validator
Example:
invalid-pet.json
{
"id": 0,
"category": {
"id": 0,
"name": "string"
},
"named": "doggie",
"photoUrls": [
"string"
],
"tags": [
{
"id": 0,
"name": "string"
}
],
"status": "available"
}
My SchemaParser:
#Component
public class SchemaParser {
private Logger logger = LoggerFactory.getLogger(getClass());
public boolean isValid(String message, Resource schemaLocation) {
try (InputStream inputStream = schemaLocation.getInputStream()) {
SwaggerValidator validator = SwaggerValidator.forJsonSchema(new InputStreamReader(inputStream));
ProcessingReport report = validator.validate(message, "/definitions/Pet");
return report.isSuccess();
} catch (IOException e) {
logger.error("IOException", e);
return false;
} catch (ProcessingException e) {
e.printStackTrace();
return false;
}
}
}
A test:
#Test
void shouldFailValidateWithPetstoreSchema() throws IOException {
final Resource validPetJson = drl.getResource("http://petstore.swagger.io/v2/swagger.json");
try (Reader reader = new InputStreamReader(validPetJson.getInputStream(), UTF_8)) {
final String petJson = FileCopyUtils.copyToString(reader);
final boolean valid = schemaParser.isValid(petJson, petstoreSchemaResource);
assertFalse(valid);
}
}
json-schema-validator seems to work with pure JSON Schema only. OpenAPI Specification uses an extended subset of JSON Schema, so the schema format is different. You need a library that can validate specifically against OpenAPI/Swagger definitions, such as Atlassian's swagger-request-validator.

Filtered query on below mapping

I have created Elastic search mapping as below.
PUT indexcloud
{
"mappings": {
"_default_": {
"_all": {
"enabled": false
},
"_source": {
"compressed": true
},
"properties": {
"term": {
"fields": {
"raw": {
"index": "not_analyzed",
"analyzer": "lowercase_analyzer",
"type": "string"
}
},
"analyzer": "concat_all_alpha",
"type": "string"
},
"relation": {
"type": "nested",
"properties": {
"term": {
"type": "string",
"analyzer": "concat_all_alpha",
"fields": {
"raw": {
"index": "not_analyzed",
"analyzer": "lowercase_analyzer",
"type": "string"
}
}
}
}
}
}
}
},
"settings": {
"index": {
"analysis": {
"analyzer": {
"concat_all_alpha": {
"char_filter": [
"only_alphanum"
],
"filter": [
"lowercase"
],
"tokenizer": "keyword"
},
"uppercase_analyzer": {
"filter": "uppercase",
"tokenizer": "keyword"
},
"lowercase_analyzer": {
"filter": "lowercase",
"tokenizer": "keyword"
}
},
"char_filter": {
"only_alphanum": {
"pattern": "[^A-Z^a-z^0-9]|\\^",
"replacement": "",
"type": "pattern_replace"
}
}
},
"max_result_window": "1000000"
}
}
}
Sample index doc
POST indexcloud/skill
{"term":"Java Language","relation":[{"term":"java8"},{"term":"struct"},{"term":"j2ee"},{"term":"Progamming Language"}]}
I want to search using filtered query as below
GET indexcloud/_search
{
"query" : {
"constant_score" : {
"filter" : {
"term" : {
"term" : "Java Language"
}
}
}
}
}
But this is not working. How can i achieve this ?. Note : i dont want like below
GET indexcloud/_search
{
"query" : {
"constant_score" : {
"filter" : {
"term" : {
"term" : "javalanguage"
}
}
}
}
}
Because i want to search, the way i index.

Elasticsearch: Multi-level nested query not working

My mapping is as follows:
{
"mappings": {
"person": {
"properties": {
"lastUpdated": {
"type": "long"
},
"isDeleted": {
"type": "boolean"
},
"person": {
"properties": {
"car": {
"type": "nested",
"properties": {
"model": {
"type": "string"
},
"make": {
"type": "string"
}
}
},
"last_name": {
"type": "string"
},
"first_name": {
"type": "string"
}
}
}
}
}
}
}
I have two documents:
{
"person": {
"first_name": "Bob",
"last_name": "Doe",
"car": [
{
"make": "Saturn",
"model": "Imprezza"
},
{
"make": "Honda",
"model": "Accord"
}
]
},
"isDeleted": false,
"lastUpdated": 1433257051959
}
and
{
"person": {
"first_name": "Zach",
"last_name": "Foobar",
"car": [
{
"make": "Saturn",
"model": "SL"
},
{
"make": "Subaru",
"model": "Imprezza"
}
]
},
"isDeleted": false,
"lastUpdated": 1433257051959
}
I wanted to query the car.make field and so, I wrote the following query:
{
"query": {
"nested": {
"path": "person.person.car",
"query": {
"match": {
"car.make": "Saturn"
}
},
"inner_hits": {}
}
}
}
However, I am not getting anything back results back in return. When I remove the person level object and try to search, then it works. Any idea how to go about doing multi-level nested queries?
EDIT: On the other hand, when I structure my data like this and query then it works.
{
"mappings": {
"person": {
"properties": {
"car": {
"type": "nested",
"properties": {
"model": {
"type": "string"
},
"make": {
"type": "string"
}
}
},
"last_name": {
"type": "string"
},
"first_name": {
"type": "string"
}
}
}
}
}
{
"first_name": "Zach",
"last_name": "Foobar",
"car": [
{
"make": "Saturn",
"model": "SL"
},
{
"make": "Subaru",
"model": "Imprezza"
}
]
}
{
"first_name": "Bob",
"last_name": "Doe",
"car": [
{
"make": "Saturn",
"model": "Imprezza"
},
{
"make": "Honda",
"model": "Accord"
}
]
}
{
"query": {
"nested": {
"path": "person.car",
"query": {
"match": {
"car.make": "Honda"
}
},
"inner_hits": {}
}
}
}
This way the query works. I feel like this has something to do with multi-level nesting. Multi-level nesting is not working.
The nested path attribute needs to be "person.car".
Add "type": "nested", above the (2nd level) person properties line if you wish person to be a nested field type, which is required for Nested Query searches. The default field type is object field.
The naming you are using is confusing, try to rename your mapping not to use person twice.
{
"query": {
"nested": {
"path": "person.car",
"query": {
"match": {
"make": "Saturn"
}
},
"inner_hits": {}
}
}
}

Categories

Resources