Dynamic JSON key/value pairs generation in ESQL

How to transform JSON response retrieved from external system to meaningful data (key/value pairs) in ESQL?
Retrieved JSON:
{
"data": [
{
"name": "application.info.header",
"value": "headerValue"
},
{
"name": "entity.statistics.name.fullName",
"value": "fullNameValue"
},
{
"name": "application.info.matter",
"value": "matterValue"
},
{
"name": "entity.statistics.skill",
"value": "skillValue"
}
]
}
where,
name ~ hierarchy of JSON (last attribute being the key)
value ~ value against the key
Expected JSON:
{
"data": {
"application": {
"info": {
"header": "headerValue",
"matter": "matterValue"
}
},
"entity": {
"statistics": {
"name": {
"fullName": "fullNameValue"
},
"skill": "skillValue"
}
}
}
}
Needless to say, this is easy to achieve in Java with the split method; I'm looking for an equivalent approach in ESQL.
Current ESQL Module:
CREATE COMPUTE MODULE getDetails_prepareResponse
CREATE FUNCTION Main() RETURNS BOOLEAN
BEGIN
DECLARE data REFERENCE TO InputRoot.JSON.Data.data.Item[1];
SET OutputRoot.JSON.Data = InputRoot.JSON.Data;
SET OutputRoot.JSON.Data.data = NULL;
WHILE LASTMOVE(data) DO
DECLARE keyA CHARACTER SUBSTRING(data.name BEFORE '.');
DECLARE name CHARACTER SUBSTRING(data.name AFTER '.');
DECLARE keyB CHARACTER SUBSTRING(name BEFORE '.');
DECLARE key CHARACTER SUBSTRING(name AFTER '.');
CREATE LASTCHILD OF OutputRoot.JSON.Data.data.{EVAL('keyA')}.{EVAL('keyB')}
NAME key VALUE data.value;
MOVE data NEXTSIBLING;
END WHILE;
RETURN TRUE;
END;
END MODULE;
This is currently handled with the SUBSTRING function in ESQL, which only works for names with exactly three levels. The requirements have changed: the number of levels is now dynamic, with no limit on the depth of a key.

You could implement your own procedure to split a string. Take a look at this answer for an example:
ESQL for splitting a string into multiple values
That procedure splits S on Delim into an array in the Environment tree (Environment.Split.Array[]); it deletes Environment.Split before refilling it.
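For reference, here is a minimal Java sketch of the algorithm the ESQL version has to reproduce: split each name on the dot, walk or create one nested object per intermediate segment, and store the value under the last segment. The class and method names (DottedNameExpander, expand) are hypothetical, and the input is assumed to arrive as (name, value) string pairs. In ESQL the same walk could presumably be done with the split array and a reference that moves down one level per segment.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class DottedNameExpander {

    /**
     * Builds a nested map from (dotted name, value) pairs.
     * Assumes no name is both a leaf and a prefix of another name
     * (for example "a.b" and "a.b.c"); that case would fail the cast below.
     */
    @SuppressWarnings("unchecked")
    static Map<String, Object> expand(List<String[]> pairs) {
        Map<String, Object> root = new LinkedHashMap<>();
        for (String[] pair : pairs) {
            // Split on a literal dot; an unescaped "." is a regex wildcard.
            String[] keys = pair[0].split("\\.");
            Map<String, Object> cursor = root;
            // Walk or create one nested object per intermediate segment.
            for (int i = 0; i < keys.length - 1; i++) {
                cursor = (Map<String, Object>) cursor.computeIfAbsent(
                        keys[i], k -> new LinkedHashMap<String, Object>());
            }
            // The last segment is the key that receives the value.
            cursor.put(keys[keys.length - 1], pair[1]);
        }
        return root;
    }
}
```

Calling DottedNameExpander.expand with the four pairs from the question yields the nested application/entity structure shown under "Expected JSON", at any depth.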

Related

Jolt Transform JSON Spec

I need to transform below Input JSON to output JSON and not sure about how to write spec for that. Need to re-position one field ("homePage") as a root element. Any help or suggestion would be appreciated.
Input JSON :
[{
"uuid": "cac40601-ffc9-4fd0-c5a1-772ac65f0587",
"pageId": 123456,
"page": {
"indexable": true,
"rootLevel": false,
"homePage": false
}
}]
Output JSON :
[{
"uuid": "cac40601-ffc9-4fd0-c5a1-772ac65f0587",
"pageId": 123456,
"homePage": false,
"page": {
"indexable": true,
"rootLevel": false
}
}]

This Jolt Spec should work for you. Tested with https://jolt-demo.appspot.com/
[
{
"operation": "shift",
"spec": {
"*": {
"uuid": "[&1].uuid",
"pageId": "[&1].pageId",
"page": {
"indexable": "[&2].page.indexable",
"rootLevel": "[&2].page.rootLevel",
"homePage": "[&2].homePage"
}
}
}
}
]
input:
{
"uuid" : "cac40601-ffc9-4fd0-c5a1-772ac65f0587",
"pageId" : 123456,
"page" : {
"indexable" : true,
"rootLevel" : false
},
"homePage" : false
}
output:
[ {
"uuid" : "cac40601-ffc9-4fd0-c5a1-772ac65f0587",
"pageId" : 123456,
"page" : {
"indexable" : true,
"rootLevel" : false
},
"homePage" : false
} ]
Explanation:
From the javadoc
& Path lookup
As Shiftr processes data and walks down the spec, it maintains a data structure describing the path it has walked.
The & wildcard can access data from that path in a 0 major, upward oriented way.
Example:
{
"foo" : {
"bar": {
"baz": // &0 = baz, &1 = bar, &2 = foo
}
}
}
Next thing: How to wrap the output object into the array?
A good example can be found in this post.
So, in our case:
"[&1].uuid" says:
Place the uuid value in the object inside the array. The index of the array is indicated by the &1 wildcard. For uuid it will be the index of the array, where the object with uuid key is placed in the original json.
Next, [&2] is similar to [&1]. However, the "indexable" key sits one level deeper in the input JSON. That's why we used [&2] instead of [&1] (have another look at the foo-bar example from the docs).
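The & lookup described above can be pictured with a small plain-Java sketch. It does not use the Jolt library; it only models the "path walked so far" as a list and resolves &N by counting upward from the current key. The names AmpersandLookup, amp and expand are hypothetical, and the sketch assumes Java 9 or later.

```java
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class AmpersandLookup {

    private static final Pattern AMP = Pattern.compile("&(\\d+)");

    /** Returns the path segment n levels up from the current key (&0 is the key itself). */
    static String amp(List<String> path, int n) {
        return path.get(path.size() - 1 - n);
    }

    /** Replaces every &N in an output-path template with the matching path segment. */
    static String expand(String template, List<String> path) {
        Matcher m = AMP.matcher(template);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            String segment = amp(path, Integer.parseInt(m.group(1)));
            m.appendReplacement(out, Matcher.quoteReplacement(segment));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

For the first array element, the path walked to "uuid" is ["0", "uuid"], so "[&1].uuid" expands to "[0].uuid". The path to "indexable" is ["0", "page", "indexable"], one level deeper, so the array index is reached with &2.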

Elastic search exact match query issue

I am having a problem while querying Elasticsearch. Below is my query:
GET _search {
"query": {
"bool": {
"must": [{
"match": {
"name": "SomeName"
}
},
{
"match": {
"type": "SomeType"
}
},
{
"match": {
"productId": "ff134be8-10fc-4461-b620-79s51199c7qb"
}
},
{
"range": {
"request_date": {
"from": "2018-08-22T12:16:37,392",
"to": "2018-08-28T12:17:41,137",
"format": "YYYY-MM-dd'T'HH:mm:ss,SSS"
}
}
}
]
}
}
}
I am using three match queries and a range query inside the bool query. My intention is to get docs that match these values exactly and fall within this date range. If I change the name or type value, I get no results, as expected. But for productId, if I put just ff134be8, I still get results. Does anyone know why? The exact match works for name and type but not for productId.

You need to map productId as keyword to avoid tokenization. With the standard tokenizer, "ff134be8-10fc-4461-b620-79s51199c7qb" is indexed as the tokens ["ff134be8", "10fc", "4461", "b620", "79s51199c7qb"], so a match query for just "ff134be8" hits one of those tokens.
You have different options:
1/ Use a term query, which compares against the field without analyzing the query text
...
{
"term": {
"productId": "ff134be8-10fc-4461-b620-79s51199c7qb"
}
},
...
2/ If you are on Elasticsearch 6.x, you could change your request to
...
{
"match": {
"productId.keyword": "ff134be8-10fc-4461-b620-79s51199c7qb"
}
},
...
This works because Elasticsearch, with its default dynamic mapping, creates a subfield named keyword, of type keyword, for every string field.
The best option is, of course, the first one. Always use a term query if you are trying to match the exact content.
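To see why the partial value matches, here is a rough Java approximation of what happens at index and query time. It is not Elasticsearch code: the real standard tokenizer follows Unicode text segmentation rules, whereas this sketch simply lowercases and splits on non-alphanumeric characters, which happens to give the same tokens for this value. RoughTokenizer, matchHits and termHits are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class RoughTokenizer {

    /** Lowercases the text and splits it on runs of non-alphanumeric characters. */
    static List<String> tokens(String text) {
        List<String> result = new ArrayList<>();
        for (String part : text.toLowerCase(Locale.ROOT).split("[^\\p{Alnum}]+")) {
            if (!part.isEmpty()) {
                result.add(part);
            }
        }
        return result;
    }

    /** Mimics a match query with the default OR operator: any shared token is a hit. */
    static boolean matchHits(String indexedValue, String queryText) {
        List<String> indexed = tokens(indexedValue);
        for (String token : tokens(queryText)) {
            if (indexed.contains(token)) {
                return true;
            }
        }
        return false;
    }

    /** Mimics a term query on a keyword field: only the whole value matches. */
    static boolean termHits(String indexedValue, String queryText) {
        return indexedValue.equals(queryText);
    }
}
```

With the productId from the question, matchHits returns true for the query text "ff134be8", while termHits returns true only for the complete identifier.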

ElasticSearch mapping for dynamic keys for indexing a map

I have a sample json which I want to index into elasticsearch.
Sample Json Indexed:
put test/names/1
{
"1" : {
"name":"abc"
},
"2" : {
"name":"def"
},
"3" : {
"name":"xyz"
}
}
where ,
index name : test,
type name : names,
id :1
Now the default mapping generated by elasticsearch is :
{
"test": {
"mappings": {
"names": {
"properties": {
"1": {
"properties": {
"name": {
"type": "string"
}
}
},
"2": {
"properties": {
"name": {
"type": "string"
}
}
},
"3": {
"properties": {
"name": {
"type": "string"
}
}
},
"metadataFieldDefinition": {
"properties": {
"name": {
"type": "string"
}
}
}
}
}
}
}
}
If the map grows from 3 entries (currently) to, say, a thousand or a million, Elasticsearch will create a mapping entry for each key, which may cause performance problems because the mapping becomes huge.
I tried creating a mapping by setting:
"dynamic":false,
"type":object
but it was overridden by ES, since it didn't match the indexed data.
Please let me know how I can define a mapping so that ES does not create one like the above.

I think there might be a little confusion here in terms of how we index documents.
put test/names/1
{...
document
...}
This says: the following document belongs to index test and is of type names with id 1. The entire document is treated as type names. With the PUT API as you are using it, you cannot index multiple documents at once. ES interprets 1, 2, and 3 as properties of type object, each containing a property name of type string.
Effectively, ES thinks you are trying to index ONE document instead of three.
To get many documents into index test with a type of names, you could do this, using curl syntax:
curl -XPUT "http://your-es-server:9200/test/names/1" -d '
{
"name": "abc"
}'
curl -XPUT "http://your-es-server:9200/test/names/2" -d '
{
"name": "ghi"
}'
curl -XPUT "http://your-es-server:9200/test/names/3" -d '
{
"name": "xyz"
}'
This specifies the document ID in the endpoint you are indexing to. Your mapping will then look like this:
"test": {
"mappings": {
"names": {
"properties": {
"name": {
"type": "string"
}
}
}
}
}
Final Word: Split your indexing up into discrete operations, or check out the Bulk API to see the syntax on how to POST multiple operations in a single request.
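As a rough illustration of the Bulk API request format, the Java sketch below builds a newline-delimited body with one action line and one source line per document. It assumes an Elasticsearch version that still uses mapping types (hence the _type field, matching the test/names example) and values that need no JSON escaping. BulkBodyBuilder is a hypothetical helper, not part of any Elasticsearch client.

```java
import java.util.Map;

final class BulkBodyBuilder {

    /**
     * Builds a _bulk request body: for each document, an "index" action line
     * followed by the document source, each terminated by a newline.
     * Assumes ids and names contain no characters that need JSON escaping.
     */
    static String build(String index, String type, Map<String, String> namesById) {
        StringBuilder body = new StringBuilder();
        for (Map.Entry<String, String> e : namesById.entrySet()) {
            body.append("{\"index\":{\"_index\":\"").append(index)
                .append("\",\"_type\":\"").append(type)
                .append("\",\"_id\":\"").append(e.getKey()).append("\"}}\n");
            body.append("{\"name\":\"").append(e.getValue()).append("\"}\n");
        }
        return body.toString();
    }
}
```

The resulting string would be sent in a single POST to the _bulk endpoint. Each entry becomes its own document, so the mapping keeps a single name property no matter how many entries there are.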

neo4j - how to run queries with 1000 objects via rest api

I need to run queries with 1000 objects. Using the /batch endpoint I can get this to work, but it is too slow (30 seconds with 300 items).
So I'm trying the approach described on this docs page: http://docs.neo4j.org/chunked/2.0.1/rest-api-cypher.html#rest-api-create-mutiple-nodes-with-properties
POST this JSON to http://localhost:7474/db/data/cypher
{
"params": {
"props": [
{
"_user_id": "177032492760",
"_user_name": "John"
},
{
"_user_id": "177032492760",
"_user_name": "Mike"
},
{
"_user_id": "100007496328",
"_user_name": "Wilber"
}
]
},
"query": "MERGE (user:People {id:{_user_id}}) SET user.id = {_user_id}, user.name = {_user_name} "
}
The problem is I'm getting this error:
{ message: 'Expected a parameter named _user_id',
exception: 'ParameterNotFoundException',
fullname: 'org.neo4j.cypher.ParameterNotFoundException',
stacktrace:
...
Maybe this only works with CREATE queries, as shown on the docs page?

Use FOREACH and MERGE with ON CREATE SET:
FOREACH (p IN {props} |
MERGE (user:People {id: p._user_id})
ON CREATE SET user.name = p._user_name)
POST this JSON to http://localhost:7474/db/data/cypher
{
"params": {
"props": [
{
"_user_id": "177032492760",
"_user_name": "John"
},
{
"_user_id": "177032492760",
"_user_name": "Mike"
},
{
"_user_id": "100007496328",
"_user_name": "Wilber"
}
]
},
"query": "FOREACH (p IN {props} | MERGE (user:People {id: p._user_id}) ON CREATE SET user.name = p._user_name)"
}

Actually, the equivalent to the example in the doc would be:
{
"params": {
"props": [
{
"id": "177032492760",
"name": "John"
},
{
"id": "177032492760",
"name": "Mike"
},
{
"id": "100007496328",
"name": "Wilber"
}
]
},
"query": "CREATE (user:People {props})"
}
It might be legal to replace CREATE with MERGE, but the query may not do what you expect.
For example, if a node with the id "177032492760" already exists, but it does not have the name "John", then the MERGE will create a new node; and you'd end up with 2 nodes with the same id (but different names).

Yes, a CREATE statement can take an array of maps and implicitly convert it to several statements with one map each, but you can't use arrays of maps that way outside of simple CREATE statements. In fact, you can't use literal maps the same way either when you use MERGE and MATCH. You can CREATE ({map}), but you have to MATCH/MERGE ({prop:{map}.val}), i.e.
// {props:{name:Fred, age:2}}
MERGE (a {name:{props}.name})
ON CREATE SET a = {props}
For your purposes, either send individual parameter maps with a query like the one above or, for an array of maps, iterate through it with FOREACH:
FOREACH (p IN {props} |
MERGE (user:People {id:p._user_id})
ON CREATE SET user = p)
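The effect of merging on the id alone can be pictured with an in-memory analogy in Java. This is not Neo4j code: it only models "look the node up by id, and populate it only if it had to be created". MergeOnIdSketch is a hypothetical name, and the store is assumed to be a plain map keyed by id.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class MergeOnIdSketch {

    /** Applies "merge on id, set properties only on create" to each parameter map. */
    static Map<String, Map<String, String>> mergeAll(List<Map<String, String>> props) {
        Map<String, Map<String, String>> nodesById = new LinkedHashMap<>();
        for (Map<String, String> p : props) {
            // putIfAbsent stores p only when no node with that id exists yet,
            // which corresponds to the ON CREATE branch; an existing node is left untouched.
            nodesById.putIfAbsent(p.get("_user_id"), new LinkedHashMap<>(p));
        }
        return nodesById;
    }
}
```

With the three maps from the question, this yields two nodes rather than three: "John" and "Mike" share an id, so the second map finds the existing node and changes nothing.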

JSON Representation of Map with Complex Key

I want to serialize to JSON the following (java) data structure:
class Machine {
String name;
Map<PartDescriptor, Part> parts;
}
class PartDescriptor {
String group;
String id;
hashCode()
equals()
}
class Part {
String group;
String id;
String description;
String compat;
...
...
}
What would be JSON representation of one Machine?
Also (optional), point me to a JSON to Java serializer/deserializer that will support your representation

I'd do something like:
{
"name": "machine name",
"parts": [
{ "group": "part group", "id": "part id", "description": "...", ... },
{ "group": "part group", "id": "part id", "description": "...", ... },
// ...
]
}
If the "id" for each Part is unique, then the "parts" property can be an object instead of an array, with the "id" of each part serving as the key.
{
"name": "machine name",
"parts": {
"1st part id": { "group": "part group", "description": "...", ... },
"2nd part id": { "group": "part group", "description": "...", ... },
// ...
}
}

You don't need annotations or custom serializers. Assuming you already have getters for all the fields in Part and Machine, all that's really missing is a toString() on PartDescriptor. If, for some reason, you don't have getter functions, you'll need to annotate the fields of interest with @JsonProperty so Jackson knows which fields to include in the serialized output. However, it's preferable (and easier) to simply create getters.
The toString() on PartDescriptor should return the key you want to use in your mapping. As another answer suggests, you might simply concatenate the relevant fields:
@Override
public String toString() {
return group + "|" + id;
}
Then you'll magically get this form when you attempt to serialize a Machine with Jackson's ObjectMapper:
{
"name" : "Toaster",
"parts" : {
"Electrical|Descriptor1" : {
"group" : "Electrical",
"id" : "Part1",
"description" : "Heating Element",
"compat" : "B293"
},
"Exterior|Descriptor2" : {
"group" : "Exterior",
"id" : "Part2",
"description" : "Lever",
"compat" : "18A"
}
}
}

I would do this. The parts key of the top-level object would be a JSONArray of JSONObjects that each have a key and a value. The key would be an object representing your PartDescriptor, and the value would be your Part.
{
"name":"theName",
"parts":[
{
"key":{
"group":"theGroup",
"id":"theId"
},
"value":{
"group":"theGroup",
"id":"theId",
"description":"theDescription",
"compat":"theCompat",
...
}
},
...
]
}

Assuming that group+id gives a unique combination, and that ":" is a permissible delimiter:
{
"name": "machine name",
"parts": {
"somegroup:01465": {
"group":"somegroup",
"id": "01465",
...
},
"othergroup:32409": {
"group":"othergroup",
"id": "32409",
...
}
}
}

JSON requires the key to be a string, so if you truly need the data to be represented as keyed (e.g. you don't want to use an array, as in Pointy's answer, because you'd like the contract to guarantee that there are no duplicate entries with the same key), then you need to decide yourself how to serialize the complex key into a string.
Two things to note if you go with an approach that concatenates with a separator (e.g. group1|part1):
You'll want a separator that cannot itself occur in the key parts, or you'll need to escape it when serializing (e.g. double it). The problems this prevents may be rare, but in reusable, general-purpose code the guarantee is worth having: per Murphy's law, if something can go wrong, it eventually will.
To truly prevent "more than one value with the same compound key", you'll want to keep the key parts in a fixed order, e.g. sort them alphabetically.
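A minimal Java sketch of the escaping idea follows. Instead of doubling the separator, it uses a backslash as an escape character, a swapped-in variant that makes decoding straightforward. CompositeKey, encode and decode are hypothetical names, and "|" is assumed as the separator; the caller is expected to pass the parts in a fixed order.

```java
import java.util.ArrayList;
import java.util.List;

final class CompositeKey {

    private static final char SEP = '|';
    private static final char ESC = '\\';

    /** Joins the parts with SEP, prefixing any SEP or ESC inside a part with ESC. */
    static String encode(List<String> parts) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < parts.size(); i++) {
            if (i > 0) {
                out.append(SEP);
            }
            for (char c : parts.get(i).toCharArray()) {
                if (c == SEP || c == ESC) {
                    out.append(ESC);
                }
                out.append(c);
            }
        }
        return out.toString();
    }

    /** Reverses encode(): splits on unescaped SEP and drops the escape characters. */
    static List<String> decode(String key) {
        List<String> parts = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < key.length(); i++) {
            char c = key.charAt(i);
            if (c == ESC && i + 1 < key.length()) {
                current.append(key.charAt(++i)); // take the escaped character literally
            } else if (c == SEP) {
                parts.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        parts.add(current.toString());
        return parts;
    }
}
```

Without escaping, the pairs ("a|b", "c") and ("a", "b|c") would both serialize to "a|b|c"; with it, they produce distinct keys that decode back to the original parts.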
As for:
Also (optional), point me to a JSON to Java serializer/deserializer that will support your representation
One notable example might be Gson - Google's JSON serializer library for Java - which uses this representation:
{
"(group1,part1)": { description: ... },
"(group1,part2)": { description: ... },
"(group2,part1)": { description: ... },
...
"(groupX,partX)": {description: ... },
}
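Note: string keys like these come from the key object's toString(), which Gson uses for map keys by default. Setting enableComplexMapKeySerialization (off by default for backwards compatibility) instead serializes each key as JSON and, for non-primitive keys, writes the map as an array of [key, value] pairs.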

It can be rendered as the following table:
<table class="machine" name="">
<tr>
<th class="partdescriptor" colspan="2">
<th class="part" colspan="4">
</tr>
<tr>
<td class="partdescriptor group"></td>
<td class="partdescriptor" id=""></td>
<td class="part group"></td>
<td class="part" id=""></td>
<td class="description"></td>
<td class="compat"></td>
</tr>
</table>
The markup decomposes into the following JSON object due to the lack of metadata via attributes:
{
"HTMLTableElement":
[
{
"classname": "machine",
"name": ""
},
{
"HTMLTableRowElement":
[
{
"HTMLTableCellElement": {"classname":"partdescriptor","colspan":2}
},
{
"HTMLTableCellElement": {"classname":"part","colspan":4}
}
]
},
{
"HTMLTableRowElement":
[
{
"HTMLTableCellElement": {"classname":"partdescriptor group"}
},
{
"HTMLTableCellElement": {"classname":"partdescriptor","id":""}
},
{
"HTMLTableCellElement": {"classname":"part","id":""}
},
{
"HTMLTableCellElement": {"classname":"description"}
},
{
"HTMLTableCellElement": {"classname":"compat"}
}
]
}
]
}
Alternatively, Unicode can simplify the mapping:
{"name":"","[{\u0022group\u0022:\u0022\u0022},{\u0022id\u0022:\u0022\u0022}]":
[
{"group":""},
{"id":""},
{"description":""},
{"compat":""}
]
}
Which can be stringified:
JSON.stringify({"name":"","[{\u0022group\u0022:\u0022\u0022},{\u0022id\u0022:\u0022\u0022}]":[{"group":""},{"id":""},{"description":""},{"compat":""}]})
to produce:
"{\"name\":\"\",\"[{\\\"group\\\":\\\"\\\"},{\\\"id\\\":\\\"\\\"}]\":[{\"group\":\"\"},{\"id\":\"\"},{\"description\":\"\"},{\"compat\":\"\"}]}"
which can be parsed:
JSON.parse("{\"name\":\"\",\"[{\\\"group\\\":\\\"\\\"},{\\\"id\\\":\\\"\\\"}]\":[{\"group\":\"\"},{\"id\":\"\"},{\"description\":\"\"},{\"compat\":\"\"}]}")
to produce an object literal:
({name:"", '[{"group":""},{"id":""}]':[{group:""}, {id:""}, {description:""}, {compat:""}]})
References
HTMLTableRowElement
HTMLTableCellElement
