Flatten JSON documents in Java

I am a novice in Java and I am looking for a way to flatten JSON documents.
I have tried ObjectMapper without success, and I have also tried working with JsonNode, but still with no success.
I found this library, but its result is not what I need: https://github.com/wnameless/json-flattener
I have also been helped before, but that example was too specific and I cannot apply the same approach because my documents are too long, which is why I am looking for a generic solution: Flatten json documents in Java
I need to transform "any" JSON document like in the example below.
Here is an example of my documents.
Document received:
{
"took": 7,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 10,
"max_score": 0,
"hits": []
},
"aggregations": {
"groupe": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "a",
"doc_count": 1,
"date": {
"buckets": [
{
"key_as_string": "2017-05-03T00:00:00.000Z",
"key": 1493769600000,
"doc_count": 1,
"value": {
"value": 1
}
},
{
"key_as_string": "2017-05-03T01:00:00.000Z",
"key": 1493776800000,
"doc_count": 1,
"value": {
"value": 3
}
}
]
}
},
{
"key": "b",
"doc_count": 4,
"date": {
"buckets": [
{
"key_as_string": "2017-05-03T00:00:00.000Z",
"key": 1493769600000,
"doc_count": 1,
"value": {
"value": 4
}
},
{
"key_as_string": "2017-05-03T01:00:00.000Z",
"key": 1493773200000,
"doc_count": 1,
"value": {
"value": 3
}
}
]
}
}
]
}
}
}
Document Transformed:
{
"took": 7,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 10,
"max_score": 0,
"hits": []
},
"aggregations": {
"groupe": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "a",
"doc_count": 1,
"date": {
"buckets": [
{
"key_as_string": "2017-05-03T00:00:00.000Z",
"key": 1493769600000,
"doc_count": 1,
"value": {
"value": 1
}
}
]
}
},
{
"key": "a",
"doc_count": 1,
"date": {
"buckets": [
{
"key_as_string": "2017-05-03T02:00:00.000Z",
"key": 1493776800000,
"doc_count": 1,
"value": {
"value": 3
}
}
]
}
},
{
"key": "b",
"doc_count": 1,
"date": {
"buckets": [
{
"key_as_string": "2017-05-03T02:00:00.000Z",
"key": 1493776800000,
"doc_count": 1,
"value": {
"value": 4
}
}
]
}
},
{
"key": "b",
"doc_count": 1,
"date": {
"buckets": [
{
"key_as_string": "2017-05-03T02:00:00.000Z",
"key": 1493776800000,
"doc_count": 1,
"value": {
"value": 4
}
}
]
}
}
]
}
}
}
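One generic way to get this kind of "explosion" is a recursive walk with Jackson: whenever a bucket contains a nested object that has its own "buckets" array, the parent bucket is duplicated once per inner bucket. Below is a minimal sketch under that assumption; the class and method names are mine, not from any library, and note that it copies parent fields such as doc_count verbatim rather than recomputing them per clone:

import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.node.ArrayNode;
import com.fasterxml.jackson.databind.node.ObjectNode;
import java.util.Iterator;
import java.util.Map;

public class BucketExploder {

    private static final ObjectMapper MAPPER = new ObjectMapper();

    // Returns a deep copy of root in which every "buckets" array has been
    // exploded so that each element carries at most one inner bucket.
    public static JsonNode flatten(JsonNode root) {
        JsonNode copy = root.deepCopy();
        explodeInPlace(copy);
        return copy;
    }

    private static void explodeInPlace(JsonNode node) {
        if (node.isObject()) {
            JsonNode buckets = node.get("buckets");
            if (buckets != null && buckets.isArray()) {
                ((ObjectNode) node).set("buckets", explode((ArrayNode) buckets));
            }
        }
        for (JsonNode child : node) { // recurse into objects and arrays alike
            explodeInPlace(child);
        }
    }

    // One output element per inner bucket: a parent bucket whose child object
    // holds an n-element "buckets" array becomes n copies, each keeping a
    // single inner bucket.
    private static ArrayNode explode(ArrayNode buckets) {
        ArrayNode out = MAPPER.createArrayNode();
        for (JsonNode bucket : buckets) {
            String nested = findFieldWithBuckets(bucket);
            if (nested == null) { // leaf bucket: keep as-is
                out.add(bucket);
                continue;
            }
            ArrayNode inner = explode((ArrayNode) bucket.get(nested).get("buckets"));
            for (JsonNode innerBucket : inner) {
                ObjectNode clone = bucket.deepCopy();
                ((ObjectNode) clone.get(nested)).set("buckets",
                        MAPPER.createArrayNode().add(innerBucket));
                out.add(clone);
            }
        }
        return out;
    }

    // Finds a child object that itself contains a "buckets" array, e.g. "date".
    private static String findFieldWithBuckets(JsonNode bucket) {
        Iterator<Map.Entry<String, JsonNode>> it = bucket.fields();
        while (it.hasNext()) {
            Map.Entry<String, JsonNode> e = it.next();
            if (e.getValue().isObject() && e.getValue().has("buckets")) {
                return e.getKey();
            }
        }
        return null;
    }
}

Calling BucketExploder.flatten(new ObjectMapper().readTree(json)) on the document above should then yield one outer bucket per innermost date bucket, as in the transformed example.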

Related

Elastic Search Should clause

I'm trying to fetch users from ES based on the status of some of their fields.
There are 5 fields whose status I want to check, and if any of these fields has the failed status I want to fetch that record. Since it's an OR condition across these 5 fields, I tried using should in ES and adding terms clauses to it, but it returns records of users who don't match the criteria as well.
{
"from": 0,
"size": 50,
"query": {
"bool": {
"must": [
{
"nested": {
"query": {
"bool": {
"must": [
{
"range": {
"segment_status.updated_at": {
"from": "2021-01-24",
"to": null,
"include_lower": true,
"include_upper": true,
"boost": 1
}
}
}
],
"should": [
{
"terms": {
"segment_status.bse_status": [
2,
3,
4
],
"boost": 1
}
},
{
"terms": {
"segment_status.nse_status": [
2,
3
],
"boost": 1
}
}
],
"adjust_pure_negative": true,
"boost": 1
}
},
"path": "segment_status",
"ignore_unmapped": false,
"score_mode": "avg",
"boost": 1
}
}
],
"must_not": [
{
"term": {
"marked_failed_manually": {
"value": true,
"boost": 1
}
}
}
],
"adjust_pure_negative": true,
"boost": 1
}
},
"sort": [
{
"segment_status.updated_at": {
"order": "asc",
"mode": "min",
"nested_filter": {
"term": {
"segment_status.segment_type": {
"value": "CASH",
"boost": 1
}
}
},
"nested_path": "segment_status"
}
}
]
}
That is the query generated by the code; I'm using Spring Boot to build it.
Just for reference, I tried this query and it seems to work:
{
"from": 0,
"size": 50,
"query": {
"bool": {
"must": [
{
"nested":{
"query":{
"bool":{
"must" : [
{
"range" : {
"segment_status.updated_at" : {
"from" : "2021-08-30",
"to" : null,
"include_lower" : true,
"include_upper" : true,
"boost" : 1.0
}
}
}
]
}
},
"path" : "segment_status",
"ignore_unmapped" : false,
"score_mode" : "avg",
"boost" : 1.0
}
},
{
"nested": {
"query": {
"bool": {
"should": [
{
"terms": {
"segment_status.bse_status": [
2,
3
],
"boost": 1
}
}
],
"adjust_pure_negative": true,
"boost": 1
}
},
"path": "segment_status",
"ignore_unmapped": false,
"score_mode": "avg",
"boost": 1
}
}
],
"must_not": [
{
"term": {
"marked_failed_manually": {
"value": true,
"boost": 1
}
}
}
],
"adjust_pure_negative": true,
"boost": 1
}
},
"sort": [
{
"segment_status.updated_at": {
"order": "asc",
"mode": "min",
"nested_filter": {
"term": {
"segment_status.segment_type": {
"value": "CASH",
"boost": 1
}
}
},
"nested_path": "segment_status"
}
}
]
}
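A likely cause: inside a bool query that also has a must clause, the should clauses stop being required and only contribute to scoring unless minimum_should_match is set. Since the query is built in Java, a minimal sketch of the fix with the high-level client's QueryBuilders (field names and values taken from the query above) could look like this:

import org.apache.lucene.search.join.ScoreMode;
import org.elasticsearch.index.query.BoolQueryBuilder;
import org.elasticsearch.index.query.QueryBuilder;
import org.elasticsearch.index.query.QueryBuilders;

BoolQueryBuilder nestedBool = QueryBuilders.boolQuery()
        .must(QueryBuilders.rangeQuery("segment_status.updated_at").gte("2021-01-24"))
        .should(QueryBuilders.termsQuery("segment_status.bse_status", 2, 3, 4))
        .should(QueryBuilders.termsQuery("segment_status.nse_status", 2, 3))
        .minimumShouldMatch(1); // force at least one "should" clause to match

// wrap it in the nested query exactly as before
QueryBuilder query = QueryBuilders.nestedQuery("segment_status", nestedBool, ScoreMode.Avg);

The working query above sidesteps the problem differently, by putting the should clauses in their own bool with no must next to them.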

Elasticsearch Rest High Level Client aggregate fields dynamically

I am trying to generate a query dynamically based on the inputs, but in the generated query I can see that only two levels of aggregation get generated. How can I make each field have its own, separately nested aggregation? Below is the code I have tried and the response I'm getting.
From main() I'm calling:
buildSearchCriteria("1");
Here I am setting the aggregation type and respective values:
public static void buildSearchCriteria(String... exceptionId) {
    SearchCriteria searchCriteria = new SearchCriteria();
    Map<String, List<FieldNameAndPath>> stringListMap = new HashMap<>();
    stringListMap.put("nested", asList(new FieldNameAndPath("nested", "recommendations",
            "recommendations", null, emptyList(), 1)));
    stringListMap.put("filter", asList(new FieldNameAndPath("filter", "exceptionIds", "recommendations.exceptionId.keyword",
            asList(exceptionId),
            asList(new NestedAggsFields("terms", "exceptionIdsMatch")), 2)));
    stringListMap.put("terms", asList(new FieldNameAndPath("terms", "by_exceptionId", "recommendations.exceptionId.keyword", null, emptyList(), 3),
            new FieldNameAndPath("terms", "by_item", "recommendations.item.keyword", null, emptyList(), 4),
            new FieldNameAndPath("terms", "by_destination", "recommendations.location.keyword", null, emptyList(), 5),
            new FieldNameAndPath("terms", "by_trans", "recommendations.transportMode.keyword", null, emptyList(), 6),
            new FieldNameAndPath("terms", "by_sourcelocation", "recommendations.sourceLocation.keyword", null, emptyList(), 7),
            new FieldNameAndPath("terms", "by_shipdate", "recommendations.shipDate", null, emptyList(), 8),
            new FieldNameAndPath("terms", "by_arrival", "recommendations.arrivalDate", null, emptyList(), 9)));
    stringListMap.put("sum", asList(new FieldNameAndPath("sum", "quantity", "recommendations.transferQuantity", null, emptyList(), 10),
            new FieldNameAndPath("sum", "transfercost", "recommendations.transferCost", null, emptyList(), 11),
            new FieldNameAndPath("sum", "revenueRecovered", "recommendations.revenueRecovered", null, emptyList(), 12)));
    System.out.println(stringListMap);
    searchCriteria.setStringListMap(stringListMap);
    aggregate(searchCriteria);
}
Below is the aggregate function, which takes the above information and builds the query:
public static void aggregate(SearchCriteria searchCriteria) throws IOException {
    Map<String, List<FieldNameAndPath>> map = searchCriteria.getStringListMap();
    List<FieldNameAndPath> nesteds = map.get("nested");
    List<FieldNameAndPath> filter = map.get("filter");
    List<FieldNameAndPath> terms = map.get("terms");
    List<FieldNameAndPath> sums = map.get("sum");
    SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
    AggregationBuilder aggregationBuilder = new SamplerAggregationBuilder("parent");
    nesteds.stream().forEach(l -> buildAggregations(l, aggregationBuilder));
    filter.stream().forEach(l -> buildAggregations(l, aggregationBuilder));
    terms.stream().forEach(l -> buildAggregations(l, aggregationBuilder));
    sums.stream().forEach(l -> buildAggregations(l, aggregationBuilder));
    SearchRequest searchRequest = new SearchRequest();
    searchRequest.indices("index");
    searchRequest.types("type");
    sourceBuilder.aggregation(aggregationBuilder);
    searchRequest.source(sourceBuilder);
    System.out.println(searchRequest.source().toString());
}
buildAggregations method:
private static AggregationBuilder buildAggregations(FieldNameAndPath fieldNameAndPath, AggregationBuilder parentAggregationBuilder) {
    if (fieldNameAndPath.getAggType().equals("nested")) {
        parentAggregationBuilder = AggregationBuilders.nested(fieldNameAndPath.getFieldName(), fieldNameAndPath.getFieldPath());
    }
    if (fieldNameAndPath.getAggType().equals("filter")) {
        parentAggregationBuilder.subAggregation(AggregationBuilders
                .filter(fieldNameAndPath.getFieldName(),
                        QueryBuilders.termsQuery(fieldNameAndPath.getNestedAggs()
                                .stream().map(nestedAggsFields -> nestedAggsFields.getFieldName()).findFirst().get(), fieldNameAndPath.getFieldValues())));
    }
    if (fieldNameAndPath.getAggType().equals("terms")) {
        parentAggregationBuilder.subAggregation(AggregationBuilders.terms(fieldNameAndPath.getFieldName())
                .field(fieldNameAndPath.getFieldPath()));
    }
    if (fieldNameAndPath.getAggType().equals("sum")) {
        parentAggregationBuilder.subAggregation(AggregationBuilders
                .sum(fieldNameAndPath.getFieldName()).field(fieldNameAndPath.getFieldPath()));
    }
    return parentAggregationBuilder;
}
SearchCriteria class:
@Data
public class SearchCriteria {
    Map<String, List<FieldNameAndPath>> stringListMap;
    private List<String> searchFields;
}
And the DTO FieldNameAndPath:
public class FieldNameAndPath {
    private String aggType;
    private String fieldName;
    private String fieldPath;
    private List<String> fieldValues;
    private List<NestedAggsFields> nestedAggs;
    private int order;
}
And the query output from the above code is:
{
"aggregations": {
"parent": {
"sampler": {
"shard_size": 100
},
"aggregations": {
"exceptionIds": {
"filter": {
"terms": {
"exceptionIdsMatch": [
"1"
],
"boost": 1
}
}
},
"by_exceptionId": {
"terms": {
"field": "recommendations.exceptionId.keyword",
"size": 10,
"min_doc_count": 1,
"shard_min_doc_count": 0,
"show_term_doc_count_error": false,
"order": [
{
"_count": "desc"
},
{
"_key": "asc"
}
]
}
},
"by_item": {
"terms": {
"field": "recommendations.item.keyword",
"size": 10,
"min_doc_count": 1,
"shard_min_doc_count": 0,
"show_term_doc_count_error": false,
"order": [
{
"_count": "desc"
},
{
"_key": "asc"
}
]
}
},
"by_destination": {
"terms": {
"field": "recommendations.location.keyword",
"size": 10,
"min_doc_count": 1,
"shard_min_doc_count": 0,
"show_term_doc_count_error": false,
"order": [
{
"_count": "desc"
},
{
"_key": "asc"
}
]
}
},
"by_trans": {
"terms": {
"field": "recommendations.transportMode.keyword",
"size": 10,
"min_doc_count": 1,
"shard_min_doc_count": 0,
"show_term_doc_count_error": false,
"order": [
{
"_count": "desc"
},
{
"_key": "asc"
}
]
}
},
"by_sourcelocation": {
"terms": {
"field": "recommendations.sourceLocation.keyword",
"size": 10,
"min_doc_count": 1,
"shard_min_doc_count": 0,
"show_term_doc_count_error": false,
"order": [
{
"_count": "desc"
},
{
"_key": "asc"
}
]
}
},
"by_shipdate": {
"terms": {
"field": "recommendations.shipDate",
"size": 10,
"min_doc_count": 1,
"shard_min_doc_count": 0,
"show_term_doc_count_error": false,
"order": [
{
"_count": "desc"
},
{
"_key": "asc"
}
]
}
},
"by_arrival": {
"terms": {
"field": "recommendations.arrivalDate",
"size": 10,
"min_doc_count": 1,
"shard_min_doc_count": 0,
"show_term_doc_count_error": false,
"order": [
{
"_count": "desc"
},
{
"_key": "asc"
}
]
}
},
"quantity": {
"sum": {
"field": "recommendations.transferQuantity"
}
},
"transfercost": {
"sum": {
"field": "recommendations.transferCost"
}
},
"revenueRecovered": {
"sum": {
"field": "recommendations.revenueRecovered"
}
}
}
}
}
}
The expected query is:
{
"size": 0,
"aggregations": {
"exceptionIds": {
"nested": {
"path": "recommendations"
},
"aggregations": {
"exceptionIdsMatch": {
"filter": {
"terms": {
"recommendations.exceptionId.keyword": [
"1"
],
"boost": 1
}
},
"aggregations": {
"by_exceptionId": {
"terms": {
"field": "recommendations.exceptionId.keyword",
"size": 10,
"min_doc_count": 1,
"shard_min_doc_count": 0,
"show_term_doc_count_error": false,
"order": [
{
"_count": "desc"
},
{
"_key": "asc"
}
]
},
"aggregations": {
"by_item": {
"terms": {
"field": "recommendations.item.keyword",
"size": 10,
"min_doc_count": 1,
"shard_min_doc_count": 0,
"show_term_doc_count_error": false,
"order": [
{
"_count": "desc"
},
{
"_key": "asc"
}
]
},
"aggregations": {
"by_destination": {
"terms": {
"field": "recommendations.location.keyword",
"size": 10,
"min_doc_count": 1,
"shard_min_doc_count": 0,
"show_term_doc_count_error": false,
"order": [
{
"_count": "desc"
},
{
"_key": "asc"
}
]
},
"aggregations": {
"by_trans": {
"terms": {
"field": "recommendations.transportMode.keyword",
"size": 10,
"min_doc_count": 1,
"shard_min_doc_count": 0,
"show_term_doc_count_error": false,
"order": [
{
"_count": "desc"
},
{
"_key": "asc"
}
]
},
"aggregations": {
"by_sourcelocation": {
"terms": {
"field": "recommendations.sourceLocation.keyword",
"size": 10,
"min_doc_count": 1,
"shard_min_doc_count": 0,
"show_term_doc_count_error": false,
"order": [
{
"_count": "desc"
},
{
"_key": "asc"
}
]
},
"aggregations": {
"by_shipdate": {
"terms": {
"field": "recommendations.shipDate",
"size": 10,
"min_doc_count": 1,
"shard_min_doc_count": 0,
"show_term_doc_count_error": false,
"order": [
{
"_count": "desc"
},
{
"_key": "asc"
}
]
},
"aggregations": {
"by_arrival": {
"terms": {
"field": "recommendations.arrivalDate",
"size": 10,
"min_doc_count": 1,
"shard_min_doc_count": 0,
"show_term_doc_count_error": false,
"order": [
{
"_count": "desc"
},
{
"_key": "asc"
}
]
},
"aggregations": {
"quantity": {
"sum": {
"field": "recommendations.transferQuantity"
}
},
"transfercost": {
"sum": {
"field": "recommendations.transferCost"
}
},
"revenueRecovered": {
"sum": {
"field": "recommendations.revenueRecovered"
}
}
}
}
}
}
}
}
}
}
}
}
}
}
}
}
}
}
}
}
}
}
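The output comes out flat because every call to buildAggregations attaches its sub-aggregation to the same parent builder instead of to the builder created in the previous step. A sketch of the chaining logic, under the assumption that orderedFields is the question's nested/filter/terms lists merged and sorted by their order field (the variable names here are mine):

// Hang each new builder off the previously created one, so the
// aggregations nest instead of becoming siblings under the root.
AggregationBuilder root = null;
AggregationBuilder current = null;
for (FieldNameAndPath f : orderedFields) {
    AggregationBuilder next;
    if ("nested".equals(f.getAggType())) {
        next = AggregationBuilders.nested(f.getFieldName(), f.getFieldPath());
    } else if ("filter".equals(f.getAggType())) {
        next = AggregationBuilders.filter(f.getFieldName(),
                QueryBuilders.termsQuery(f.getFieldPath(), f.getFieldValues()));
    } else { // "terms"
        next = AggregationBuilders.terms(f.getFieldName()).field(f.getFieldPath());
    }
    if (root == null) {
        root = next;                  // the outermost (nested) aggregation
    } else {
        current.subAggregation(next); // chain, do not attach to the root
    }
    current = next;
}
for (FieldNameAndPath f : sums) {     // the sums are siblings at the innermost level
    current.subAggregation(AggregationBuilders.sum(f.getFieldName()).field(f.getFieldPath()));
}
sourceBuilder.size(0);
sourceBuilder.aggregation(root);

This is only a sketch: the exact aggregation names (exceptionIds vs exceptionIdsMatch) would still need to be mapped from the DTO the way the expected query lays them out.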

elasticsearch how to group by repetitive items in array without distinct

I'm trying to get counts grouped by the repetitive items in an array, without deduplicating them. I used a terms aggregation, but it does not work:
GET /my_index/_search
{
"size": 0,
"aggs": {
"keywords": {
"terms": {
"field": "keywords"
}
}
}
}
documents like:
"keywords": [
"value1",
"value1",
"value2"
],
but the result is:
"buckets": [
{
"key": "value1",
"doc_count": 1
},
{
"key": "value2",
"doc_count": 1
}
]
How can I get a result like this instead?
"buckets": [
{
"key": "value1",
"doc_count": 2
},
{
"key": "value2",
"doc_count": 1
}
]
Finally, I solved it by modifying the mapping to use a nested type:
"keywords": {
"type": "nested",
"properties": {
"count": {
"type": "integer"
},
"keyword": {
"type": "keyword"
}
}
},
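With this mapping, each document stores one entry per distinct value together with its occurrence count, so a document that previously held ["value1", "value1", "value2"] would be indexed as (a hypothetical example):

"keywords": [
{ "keyword": "value1", "count": 2 },
{ "keyword": "value2", "count": 1 }
]

A sum sub-aggregation over count then produces duplicate-aware totals.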
and the query becomes:
GET /my_index/_search
{
"size": 0,
"aggs": {
"keywords": {
"nested": {
"path": "keywords"
},
"aggs": {
"keyword_name": {
"terms": {
"field": "keywords.keyword"
},
"aggs": {
"sums": {
"sum": {
"field": "keywords.count"
}
}
}
}
}
}
}
}
result:
"buckets": [{
"key": "value1",
"doc_count": 495,
"sums": {
"value": 609
}
},
{
"key": "value2",
"doc_count": 440,
"sums": {
"value": 615
}
},
{
"key": "value3",
"doc_count": 319,
"sums": {
"value": 421
}
},
...]

regex expression to parse logs

Hello hackers, I wonder if anyone can help out. I need to match all types of log entries with a pattern. Any suggestions on how to match the full object in {}?
Sample
[1517725731.300][DEBUG]: DEVTOOLS EVENT Runtime.consoleAPICalled {
"args": [ {
"className": "Array",
"description": "Array(2)",
"objectId": "{\"injectedScriptId\":13,\"id\":11}",
"preview": {
"description": "Array(2)",
"overflow": false,
"properties": [ {
"name": "0",
"type": "string",
"value": "tracking fired"
}, {
"name": "1",
"subtype": "array",
"type": "object",
"value": "Arguments(4)"
} ],
"subtype": "array",
"type": "object"
},
"subtype": "array",
"type": "object"
} ],
"executionContextId": 13,
"stackTrace": {
"callFrames": [ {
"columnNumber": 39,
"functionName": "window.debug.that.(anonymous function)",
"lineNumber": 184,
"scriptId": "234",
"url": "https://secure.somerandomsite.com/js/common/ba-debug.js"
}, {
"columnNumber": 475,
"functionName": "loggerTrackingHandler",
"lineNumber": 0,
"scriptId": "60",
"url": "https://secure.somerandomsite.com/jsmin/gzip_1642294/bundles/mainTracking.min.js"
}, {
"columnNumber": 436,
"functionName": "CsApplicationTrackingControllerBase.commontTrackEvent",
"lineNumber": 4,
"scriptId": "60",
"url": "https://secure.somerandomsite.com/jsmin/gzip_1642294/bundles/mainTracking.min.js"
}, {
"columnNumber": 5,
"functionName": "CsApplicationTrackingControllerBase.parentTrackEvent",
"lineNumber": 8,
"scriptId": "60",
"url": "https://secure.somerandomsite.com/jsmin/gzip_1642294/bundles/mainTracking.min.js"
}, {
"columnNumber": 525,
"functionName": "CsApplicationTrackingController.trackEvent",
"lineNumber": 16,
"scriptId": "60",
"url": "https://secure.somerandomsite.com/jsmin/gzip_1642294/bundles/mainTracking.min.js"
}, {
"columnNumber": 2124,
"functionName": "CsApplicationTrackingController.trackChangeTab",
"lineNumber": 19,
"scriptId": "60",
"url": "https://secure.somerandomsite.com/jsmin/gzip_1642294/bundles/mainTracking.min.js"
}, {
"columnNumber": 22,
"functionName": "MyCreditAnalysisViewModel.self.selectTab",
"lineNumber": 6,
"scriptId": "263",
"url": "https://secure.somerandomsite.com/jsmin/gzip_N1393604112/bundles/newOverview.min.js"
}, {
"columnNumber": 48,
"functionName": "",
"lineNumber": 53,
"scriptId": "281",
"url": "https://secure.somerandomsite.com/js/common/knockout/knockout-2.1.0.js"
}, {
"columnNumber": 4815,
"functionName": "dispatch",
"lineNumber": 2,
"scriptId": "228",
"url": "https://secure.somerandomsite.com/3rd_party/jquery-1.7.2.min.js"
}, {
"columnNumber": 708,
"functionName": "i",
"lineNumber": 2,
"scriptId": "228",
"url": "https://secure.somerandomsite.com/3rd_party/jquery-1.7.2.min.js"
} ]
},
"timestamp": 1517725731298.069,
"type": "log"
}
[1517725731.305][DEBUG]: DEVTOOLS RESPONSE Input.dispatchMouseEvent (id=260) {
}
[1517725731.305][INFO]: Waiting for pending navigations...
[1517725731.305][DEBUG]: DEVTOOLS COMMAND Runtime.evaluate (id=261) {
"expression": "1"
}
Any line starting with something like [1517725731.305][INFO]: is a new log entry. Look for those.
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.regex.Pattern;

public static void processLogFile(Path logFile) throws IOException {
    // a new log entry starts with a "[timestamp][LEVEL]: " prefix
    Pattern logstart = Pattern.compile("^\\[\\d+\\.\\d+\\]\\[\\w+\\]: ");
    try (BufferedReader log = Files.newBufferedReader(logFile)) {
        StringBuilder logEntry = new StringBuilder();
        for (String line; (line = log.readLine()) != null; ) {
            if (logstart.matcher(line).find()) {
                if (logEntry.length() != 0)
                    processLogEntry(logEntry.toString());
                logEntry.setLength(0); // clear buffer
            }
            logEntry.append(line).append(System.lineSeparator());
        }
        if (logEntry.length() != 0) // flush the last entry
            processLogEntry(logEntry.toString());
    }
}

private static void processLogEntry(String logEntry) {
    // code here
}
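Once an entry has been isolated this way, extracting just the {...} payload does not need a regex at all. A sketch, assuming the payload always starts at the entry's first { and runs to the end of the entry:

private static String extractJson(String logEntry) {
    int start = logEntry.indexOf('{'); // the first brace opens the DEVTOOLS payload
    return start < 0 ? null : logEntry.substring(start);
}

processLogEntry can then hand that string to any JSON parser.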

how to parse the json response which starts with an array

response:
[
{
"id": "e9299032e8a34d168def176af7d62da3",
"createdAt": "Nov 8, 2017 9:46:40 AM",
"model": {
"id": "eeed0b6733a644cea07cf4c60f87ebb7",
"name": "color",
"app_id": "main",
"created_at": "May 11, 2016 11:35:45 PM",
"model_version": {}
},
"input": {
"id": "df6eae07cd86483f811c5a2202e782eb",
"data": {
"concepts": [],
"metadata": {},
"image": {
"url": "http://www.sachinmittal.com/wp-content/uploads/2017/04/47559184-image.jpg"
}
}
},
"data": [
{
"hex": "#f59b2d",
"webSafeHex": "#ffa500",
"webSafeColorName": "Orange",
"value": 0.0605
},
{
"hex": "#3f1303",
"webSafeHex": "#000000",
"webSafeColorName": "Black",
"value": 0.2085
},
{
"hex": "#a33303",
"webSafeHex": "#8b0000",
"webSafeColorName": "DarkRed",
"value": 0.3815
},
{
"hex": "#000000",
"webSafeHex": "#000000",
"webSafeColorName": "Black",
"value": 0.34275
},
{
"hex": "#f7ce93",
"webSafeHex": "#ffdead",
"webSafeColorName": "NavajoWhite",
"value": 0.00675
}
],
"status": {}
}
]
I need to parse this response as JSON. Please help me out.
You can try something like this:
try {
    JSONArray array = new JSONArray(yourResponse);
    for (int i = 0; i < array.length(); i++) { // note: < not <=, or the last pass throws
        JSONObject jsonObject = array.getJSONObject(i);
        String id = jsonObject.getString("id");
        String createdAt = jsonObject.getString("createdAt");
        String modelId = jsonObject.getJSONObject("model").getString("id");
        String appId = jsonObject.getJSONObject("model").getString("app_id");
        // and so on... depends on your requirements. It's just an idea!
    }
} catch (Exception e) {
    e.printStackTrace();
}
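To also read the nested "data" color entries of each element, the same org.json API can be extended inside the loop (a sketch; the variable names are mine):

JSONArray data = jsonObject.getJSONArray("data");
for (int j = 0; j < data.length(); j++) {
    JSONObject color = data.getJSONObject(j);
    String hex = color.getString("hex");              // e.g. "#f59b2d"
    String name = color.getString("webSafeColorName");
    double value = color.getDouble("value");
}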
