I have a streaming app that listens to some data, transforms it, and pushes the result to a new topic. I use Avro schemas to read and write the data on both topics. The problem appears when I consume the data from the final destination topic using the command below. My data is a little complex, with an array and nested JSON inside it, and I suspect that my Avro schema might not be correct for my purpose. There is no error: I can see all my data on the final topic, but the "Pets" field is duplicated for some reason and I can't understand why. I only add one new field (job_id) to my existing data in the Avro schema; I don't make big changes when I transform it.
./bin/kafka-console-consumer --topic my_topic \
--bootstrap-server localhost:9092 \
Here's the JSON data I have:
{
"Person":{
"id":"104440",
"Name":"William",
"LastName":"Dorsey",
"archived":false,
"Timezone":"America/Los_Angeles",
"brandCompanyName":"Twitter",
"brandID":"cf545a7b",
"creatorID":"1234",
"currency":"USD",
"dateCreated":"2020-09-07T02:56:22Z",
"dateModified":"2020-09-07T02:57:24Z",
"disabled":false,
"endDate":"2020-11-29T19:51:00-08:00",
"startDate":"2020-08-31T20:55:00-07:00",
"totalBudget":0
},
"Pets":[
{
"Name":"Pawny",
"Id":"4214",
"budget":"0",
"adoptionDate":"2020-09-07T02:56:22Z",
"year":"2",
"type":"Golden",
"gender":"male"
}
],
"CreationTime":"1604036638"
}
My Avro schema:
{
"name": "MyClass",
"type": "record",
"namespace": "com.acme.avro",
"fields": [
{
"name": "Person",
"type": {
"name": "Person",
"type": "record",
"fields": [
{
"name": "id",
"type": "string"
},
{
"name": "Name",
"type": "string"
},
{
"name": "LastName",
"type": "string"
},
{
"name": "archived",
"type": "boolean"
},
{
"name": "Timezone",
"type": "string"
},
{
"name": "brandCompanyName",
"type": "string"
},
{
"name": "brandID",
"type": "string"
},
{
"name": "creatorID",
"type": "string"
},
{
"name": "currency",
"type": "string"
},
{
"name": "dateCreated",
"type": "int",
"logicalType": "date"
},
{
"name": "dateModified",
"type": "int",
"logicalType": "date"
},
{
"name": "disabled",
"type": "boolean"
},
{
"name": "endDate",
"type": "int",
"logicalType": "date"
},
{
"name": "startDate",
"type": "int",
"logicalType": "date"
},
{
"name": "totalBudget",
"type": "int"
}
]
}
},
{
"name": "Pets",
"type": {
"type": "array",
"items": {
"name": "Pets_record",
"type": "record",
"fields": [
{
"name": "Name",
"type": "string"
},
{
"name": "Id",
"type": "string"
},
{
"name": "budget",
"type": "string"
},
{
"name": "adoptionDate",
"type": "int",
"logicalType": "date"
},
{
"name": "year",
"type": "string"
},
{
"name": "type",
"type": "string"
},
{
"name": "gender",
"type": "string"
}
]
}
}
},
{
"name": "CreationTime",
"type": "string"
},
{
"name":"jobID",
"type":"string"
}
]
}
This is the output when I consume the topic. The pets field is duplicated for some reason, and I can't figure out why:
{
"id":"104440",
"Name":"William",
"LastName":"Dorsey",
"archived":false,
"Timezone":"America/Los_Angeles",
"brandCompanyName":"Twitter",
"brandID":"cf545a7b",
"creatorID":"1234",
"currency":"USD",
"dateCreated":"2020-09-07T02:56:22Z",
"dateModified":"2020-09-07T02:57:24Z",
"disabled":false,
"endDate":"2020-11-29T19:51:00-08:00",
"startDate":"2020-08-31T20:55:00-07:00",
"totalBudget":0,
"Pets":[
{
"Name":"Pawny",
"Id":"4214",
"budget":"0",
"adoptionDate":"2020-09-07T02:56:22Z",
"year":"2",
"type":"Golden",
"gender":"male"
}
],
"CreationTime":1604036638,
"jobID":12512,
"pets":[
{
"Name":"Pawny",
"Id":"4214",
"budget":"0",
"adoptionDate":"2020-09-07T02:56:22Z",
"year":"2",
"type":"Golden",
"gender":"male"
}
]
}
It turned out to be caused by the uppercase letters in my field names. After wandering in endless loops for 24 hours, I finally figured it out, so I'm posting this in case anyone runs into the same issue. Please read here and use lowercase names for your field names. When I changed my field name to "pet", the duplicates were gone.
I'm trying to flatten the response below without having to parse it into a class. The reason is that the server could add or remove fields at any time, so the handling needs to be dynamic. We also have another service that returns lookup paths that we use to get data out of the flattened response, like "$.detail.att_one". There is a library for iOS that does exactly what I'm looking for, but as far as I can find there is nothing similar for Android: https://github.com/infinum/Japx
{
"data": [
{
"type": "items",
"id": "14",
"attributes": {
"item_type": "shape_circle",
"code": null,
"size": "70"
},
"relationships": {
"detail": {
"data": {
"type": "circle",
"id": "90"
}
},
"metadata": {
"data": "metadata"
}
},
"links": {
"self": "http://url/item/14"
}
}
],
"included": [
{
"type": "circle",
"id": "90",
"attributes": {
"att_one": 4,
"att_two": "11111111111",
"att_three": "Bob"
}
}
]}
The result I'm looking for:
{
"data": [
{
"id": "14",
"type": "items",
"item_type": "shape_circle",
"code": null,
"size": "70",
"metadata": {
"data": "metadata"
},
"detail": {
"type": "circle",
"id": "90",
"att_one": 4,
"att_two": "11111111111",
"att_three": "Bob"
},
"links": {
"self": "http://url/item/14"
}
}
]}
There is a nice JSON:API library that does this, but you have to define resource classes for it.
Check out jasminb/jsonapi-converter.
It recursively flattens all the included relationships and handles inheritance.
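If defining resource classes is not an option, the flattening can also be done on a generic tree. Below is a minimal sketch, not a drop-in replacement for Japx: it assumes the response has already been parsed into plain java.util Maps and Lists by whatever JSON library you use, and the class and method names (JsonApiFlattener, flatten) are made up for illustration. It hoists "attributes" to the top level and replaces each to-one relationship with the matching "included" resource. To-many relationships and relationships nested inside included resources are passed through untouched.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class JsonApiFlattener {

    /** Returns {"data": [...]} with attributes hoisted and relationships resolved. */
    @SuppressWarnings("unchecked")
    static Map<String, Object> flatten(Map<String, Object> document) {
        // Index "included" resources by "type:id" so linkage objects can be looked up.
        Map<String, Map<String, Object>> index = new HashMap<>();
        if (document.get("included") instanceof List) {
            for (Object o : (List<Object>) document.get("included")) {
                Map<String, Object> resource = (Map<String, Object>) o;
                index.put(resource.get("type") + ":" + resource.get("id"), resource);
            }
        }
        List<Object> flatData = new ArrayList<>();
        if (document.get("data") instanceof List) {
            for (Object o : (List<Object>) document.get("data")) {
                flatData.add(flattenResource((Map<String, Object>) o, index));
            }
        }
        Map<String, Object> result = new LinkedHashMap<>();
        result.put("data", flatData);
        return result;
    }

    @SuppressWarnings("unchecked")
    private static Map<String, Object> flattenResource(
            Map<String, Object> resource, Map<String, Map<String, Object>> index) {
        Map<String, Object> flat = base(resource);
        if (resource.get("relationships") instanceof Map) {
            Map<String, Object> rels = (Map<String, Object>) resource.get("relationships");
            for (Map.Entry<String, Object> rel : rels.entrySet()) {
                flat.put(rel.getKey(), resolve(rel.getValue(), index));
            }
        }
        if (resource.containsKey("links")) {
            flat.put("links", resource.get("links"));
        }
        return flat;
    }

    /** Copies id and type, then hoists everything under "attributes" to the top level. */
    @SuppressWarnings("unchecked")
    private static Map<String, Object> base(Map<String, Object> resource) {
        Map<String, Object> flat = new LinkedHashMap<>();
        flat.put("id", resource.get("id"));
        flat.put("type", resource.get("type"));
        if (resource.get("attributes") instanceof Map) {
            flat.putAll((Map<String, Object>) resource.get("attributes"));
        }
        return flat;
    }

    @SuppressWarnings("unchecked")
    private static Object resolve(Object relationship, Map<String, Map<String, Object>> index) {
        if (relationship instanceof Map
                && ((Map<String, Object>) relationship).get("data") instanceof Map) {
            Map<String, Object> ref =
                    (Map<String, Object>) ((Map<String, Object>) relationship).get("data");
            Map<String, Object> target = index.get(ref.get("type") + ":" + ref.get("id"));
            // Unresolvable references keep their {type, id} linkage.
            return target != null ? base(target) : ref;
        }
        // Anything else, such as {"data": "metadata"}, is passed through untouched.
        return relationship;
    }
}
```

Calling JsonApiFlattener.flatten(parsedResponse) on the document from the question should give a map whose "detail" entry carries att_one, att_two and att_three, which you can serialize back to JSON and query with your lookup paths.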
I'm trying to write a JSON schema for my JSON object, and I can't make sense of the error I get.
I want my JSON object to be stored in the following manner in Java:
public class Category {
private Map<String, List<String>> categoryMapping;
}
Sample JSON:
{
"categoryMapping": {
"categoryA": ["a","b","c"],
"categoryB": ["x","y","z"],
"categoryC": ["x","y","z"]
}
}
However if I write the schema in the following way:
{
"$schema": "http://json-schema.org/draft-04/schema#",
"type": "object",
"title": "$id$",
"description": "list_of_values-1",
"required": [
"categoryMapping"
],
"properties": {
"categoryMapping": {
"$id": "#/properties/categoryMapping",
"type": "object",
"title": "The categoryMapping Schema",
"properties": {
"type": "array",
"items": {
"type": "string"
}
}
}
}
}
I get the following error: The property '#/properties/categoryMapping/properties/type' of type String did not match the following type: object in schema http://json-schema.org/draft-04/schema#
But if I specify the types of categories it works:
{
"$schema": "http://json-schema.org/draft-04/schema#",
"type": "object",
"title": "$id$",
"description": "list_of_values-1",
"properties": {
"categoryMapping": {
"$id": "#/properties/categoryMapping",
"type": "object",
"title": "The Categorymapping Schema",
"required": [
"categoryA",
"categoryB",
"categoryC"
],
"properties": {
"categoryA": {
"$id": "#/properties/categoryMapping/properties/categoryA",
"type": "array",
"title": "The Categorya Schema",
"items": {
"$id": "#/properties/categoryMapping/properties/categoryA/items",
"type": "string",
"title": "The Items Schema",
"default": "",
"examples": [
"a",
"b",
"c"
],
"pattern": "^(.*)$"
}
},
"categoryB": {
"$id": "#/properties/categoryMapping/properties/categoryB",
"type": "array",
"title": "The Categoryb Schema",
"items": {
"$id": "#/properties/categoryMapping/properties/categoryB/items",
"type": "string",
"title": "The Items Schema",
"default": "",
"examples": [
"x",
"y",
"z"
],
"pattern": "^(.*)$"
}
},
"categoryC": {
"$id": "#/properties/categoryMapping/properties/categoryC",
"type": "array",
"title": "The Categoryc Schema",
"items": {
"$id": "#/properties/categoryMapping/properties/categoryC/items",
"type": "string",
"title": "The Items Schema",
"default": "",
"examples": [
"x",
"y",
"z"
],
"pattern": "^(.*)$"
}
}
}
}
}
}
Is there a way to write the schema without explicitly specifying a list of all category types?
Your sample JSON is actually an object with three distinct properties, which is why the schema you generated requires you to define each property explicitly, even though they all effectively have the same type.
If you are willing to modify your sample JSON a little bit, however:
{
"categoryMapping": [
{
"name": "categoryA",
"map": ["a","b","c"]
},
{
"name": "categoryB",
"map": ["x","y","z"]
},
{
"name": "categoryC",
"map": ["x","y","z"]
}
]
}
Then you could validate it with the following schema:
{
"$schema": "http://json-schema.org/draft-07/schema#",
"type": "object",
"properties": {
"categoryMapping": {
"type": "array",
"items": {
"type": "object",
"required": [
"name",
"map"
],
"properties": {
"name": {
"type": "string"
},
"map": {
"type": "array",
"items": {
"type": "string"
}
}
}
}
}
}
}
Because you can specify the minimum and maximum number of items allowed in an array (minItems and maxItems), you could also restrict the number of categories to 3 and the number of "map" entries to 3 if you wanted to.
I want to convert a JSON document into a JSON schema. I googled it but did not find anything that matches my requirement.
Here is the JSON:
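If the JSON is restructured this way but the Java side should still end up with the Map<String, List<String>> from the question, the array can be folded back into a map after parsing. This is only a sketch: it assumes the "categoryMapping" array has already been parsed into a List of Maps by your JSON library, and the class name CategoryMappingConverter is hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class CategoryMappingConverter {

    /** Turns [{"name": ..., "map": [...]}, ...] into name -> list of values. */
    static Map<String, List<String>> toMap(List<Map<String, Object>> entries) {
        Map<String, List<String>> result = new LinkedHashMap<>();
        for (Map<String, Object> entry : entries) {
            String name = (String) entry.get("name");
            List<String> values = new ArrayList<>();
            for (Object v : (List<?>) entry.get("map")) {
                values.add((String) v);
            }
            // Assumption: entries that repeat a name are merged rather than overwritten.
            result.computeIfAbsent(name, k -> new ArrayList<>()).addAll(values);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> entries = List.of(
                Map.of("name", "categoryA", "map", List.of("a", "b", "c")),
                Map.of("name", "categoryB", "map", List.of("x", "y", "z")));
        System.out.println(toMap(entries)); // {categoryA=[a, b, c], categoryB=[x, y, z]}
    }
}
```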
{
"empId":1001,
"firstName":"jonh",
"lastName":"Springer",
"title": "Engineer",
"address": {
"city": "Mumbai",
"street": "FadkeStreet",
"zipCode":"420125",
"privatePhoneNo":{
"privateMobile": "2564875421",
"privateLandLine":"251201546"
}
},
"salary": 150000,
"department":{
"departmentId": 10521,
"departmentName": "IT",
"companyPhoneNo":{
"cMobile": "8655340546",
"cLandLine": "10251215465"
},
"location":{
"name": "mulund",
"locationId": 14500
}
}
}
I want to generate a schema like this:
{
"$schema": "http://json-schema.org/draft-04/schema#",
"type": "object",
"title": "Employee",
"properties": {
"empId": {
"type": "integer"
},
"firstName":{
"type":"string"
},
"lastName": {
"type": "string"
},
"title": {
"type": "string"
},
"address": {
"type": "object",
"properties": {
"city": {
"type": "string"
},
"street": {
"type": "string"
},
"zipCode": {
"type": "string"
},
"privatePhoneNo": {
"type": "object",
"properties": {
"privateMobile": {
"type": "string"
},
"privateLandLine": {
"type": "string"
}
}
}
}
},
"salary": {
"type": "number"
},
"department": {
"type": "object",
"properties": {
"departmentId": {
"type": "integer"
},
"departmentName": {
"type": "string"
},
"companyPhoneNo": {
"type": "object",
"properties": {
"cMobile": {
"type": "string"
},
"cLandLine": {
"type": "string"
}
}
},
"location": {
"type": "object",
"properties": {
"name": {
"type": "string"
},
"locationId": {
"type": "integer"
}
}
}
}
}
}
}
Is there a library that does this, or is there another way?
https://github.com/perenecabuto/json_schema_generator
http://jsonschema.net/#/
I think these might help.
It's been a while since this was asked but I was having the same issue. So far the best solution I have come across is this library:
https://github.com/saasquatch/json-schema-inferrer
I found this from the json-schema doc itself. It has links to implementations for other languages as well:
https://json-schema.org/implementations.html#from-data
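As for "another way": if none of the libraries fit, the basic inference is a short recursion over the parsed document. The sketch below is illustrative only and assumes the JSON has already been parsed into java.util Maps, Lists, Strings, Numbers and Booleans; SchemaSketch and infer are made-up names. It emits only "type", "properties" and "items", so the top-level "$schema" and "title" would have to be added by hand. It also reports whole numbers as "integer", so a field like salary would come out as "integer" rather than "number".

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class SchemaSketch {

    /** Infers a minimal schema ("type", "properties", "items") from a parsed JSON value. */
    static Map<String, Object> infer(Object value) {
        Map<String, Object> schema = new LinkedHashMap<>();
        if (value instanceof Map) {
            schema.put("type", "object");
            Map<String, Object> properties = new LinkedHashMap<>();
            for (Map.Entry<?, ?> e : ((Map<?, ?>) value).entrySet()) {
                properties.put(String.valueOf(e.getKey()), infer(e.getValue()));
            }
            schema.put("properties", properties);
        } else if (value instanceof List) {
            schema.put("type", "array");
            List<?> list = (List<?>) value;
            // Simplification: the first element stands in for all items.
            if (!list.isEmpty()) {
                schema.put("items", infer(list.get(0)));
            }
        } else if (value instanceof String) {
            schema.put("type", "string");
        } else if (value instanceof Integer || value instanceof Long) {
            schema.put("type", "integer");
        } else if (value instanceof Number) {
            schema.put("type", "number");
        } else if (value instanceof Boolean) {
            schema.put("type", "boolean");
        } else if (value == null) {
            schema.put("type", "null");
        } else {
            throw new IllegalArgumentException("Not a JSON value: " + value.getClass());
        }
        return schema;
    }

    public static void main(String[] args) {
        Map<String, Object> address = new LinkedHashMap<>();
        address.put("city", "Mumbai");
        Map<String, Object> employee = new LinkedHashMap<>();
        employee.put("empId", 1001);
        employee.put("firstName", "jonh");
        employee.put("address", address);
        System.out.println(infer(employee));
    }
}
```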
My JSON response body looks like this:
{
"status": true,
"responseData": {
"category": "Seeds",
"image": "https://s3.ap-south-1.amazonaws.com/agrostarcatalog/static/Seeds.jpg",
"display": "Seeds",
"children": [
{
"category": "Vegetables",
"image": "https://s3.ap-south-1.amazonaws.com/agrostarcatalog/static/Vegetables.jpg",
"display": "Vegetables",
"children": [
{
"category": "Cabbage",
"image": "https://s3.ap-south-1.amazonaws.com/agrostarcatalog/static/Cabbage.jpg",
"display": "Cabbage",
"children": [],
"id": "06523d5d-c2c4-4f83-a94b-8209173f05c8"
},
{
"category": "Cowpea (Chauli)",
"image": "https://s3.ap-south-1.amazonaws.com/agrostarcatalog/static/Cowpea (Chauli).jpg",
"display": "Cowpea (Chauli)",
"children": [],
"id": "f9ccc378-58d7-49b5-a3e2-7639ca967b86"
},
{
"category": "Hot Pepper (Chilli)",
"image": "https://s3.ap-south-1.amazonaws.com/agrostarcatalog/static/Hot Pepper (Chilli).jpg",
"display": "Hot Pepper (Chilli)",
"children": [],
"id": "4226ebc8-4932-48c2-a2c1-0e52bbd852a9"
},
{
"category": "Cauliflower",
"image": "https://s3.ap-south-1.amazonaws.com/agrostarcatalog/static/Cauliflower.jpg",
"display": "Cauliflower",
"children": [],
"id": "759fac6c-a3fa-42d0-8b15-20797c827c67"
},
{
"category": "Bottle Gourd (Dudhi)",
"image": "https://s3.ap-south-1.amazonaws.com/agrostarcatalog/static/Bottle Gourd (Dudhi).jpg",
"display": "Bottle Gourd (Dudhi)",
"children": [],
"id": "6719800f-edbc-4fe3-8ae1-e537f6943693"
},
{
"category": "Cucumber",
"image": "https://s3.ap-south-1.amazonaws.com/agrostarcatalog/static/Cucumber.jpg",
"display": "Cucumber",
"children": [],
"id": "310a650a-52b8-4090-9b43-9d1b0b4dbdd1"
},
{
"category": "Ginger",
"image": "https://s3.ap-south-1.amazonaws.com/agrostarcatalog/static/Ginger.jpg",
"display": "Ginger",
"children": [],
"id": "bd2f9443-5b04-4609-95de-2ae69da053bd"
}
],
"id": "488fafe3-5940-4eef-b7d5-3361a3855a0d"
},
{
"category": "Pulses",
"image": "https://s3.ap-south-1.amazonaws.com/agrostarcatalog/static/Pulses.jpg",
"display": "Pulses",
"children": [
{
"category": "Urid Bean (Urad)",
"image": "https://s3.ap-south-1.amazonaws.com/agrostarcatalog/static/Urid Bean (Urad).jpg",
"display": "Urid Bean (Urad)",
"children": [],
"id": "5a38dbef-2118-4cbc-b596-ed2ab58b18cb"
},
{
"category": "Pigeon Pea(Tur)",
"image": "https://s3.ap-south-1.amazonaws.com/agrostarcatalog/static/Pigeon Pea(Tur).jpg",
"display": "Pigeon Pea (Tur)",
"children": [],
"id": "040e2713-05ab-43e4-b09b-d19f52f96f15"
}
],
"id": "b9526fc2-db2b-485e-a2ce-c2068684d66d"
},
{
"category": "Cash Crop",
"image": "https://s3.ap-south-1.amazonaws.com/agrostarcatalog/static/Cash Crop.jpg",
"display": "Cash Crop",
"children": [
{
"category": "Gum Guar",
"image": "https://s3.ap-south-1.amazonaws.com/agrostarcatalog/static/Gum Guar.jpg",
"display": "Gum Guar",
"children": [],
"id": "1e7ccb01-274a-40bb-8446-379ea8870047"
},
{
"category": "Cotton",
"image": "https://s3.ap-south-1.amazonaws.com/agrostarcatalog/static/Cotton.jpg",
"display": "Cotton",
"children": [],
"id": "7764aaba-ae3c-470e-a20b-55d6ce8ae095"
}
],
"id": "9636b237-7bc0-40f8-9dec-d6d6c442e293"
},
{
"category": "Cereals",
"image": "https://s3.ap-south-1.amazonaws.com/agrostarcatalog/static/Cereals.jpg",
"display": "Cereals",
"children": [
{
"category": "Pearl Millet (Bajara)",
"image": "https://s3.ap-south-1.amazonaws.com/agrostarcatalog/static/Pearl Millet (Bajara).jpg",
"display": "Pearl Millet (Bajara)",
"children": [],
"id": "4c0a7b9f-1dc8-4eed-914b-2dcfea568caa"
},
{
"category": "Maize",
"image": "https://s3.ap-south-1.amazonaws.com/agrostarcatalog/static/Maize.jpg",
"display": "Maize",
"children": [],
"id": "812c864d-2491-4ce9-bb75-72bb1da1fc53"
}
],
"id": "693a3515-8c2d-4d7c-9071-ce4dd400ae27"
},
{
"category": "Oil Seeds",
"image": "https://s3.ap-south-1.amazonaws.com/agrostarcatalog/static/Oil Seeds.jpg",
"display": "Oil Seeds",
"children": [
{
"category": "Sesame (Tal)",
"image": "https://s3.ap-south-1.amazonaws.com/agrostarcatalog/static/Sesame (Tal).jpg",
"display": "Sesame",
"children": [],
"id": "21eaacf9-7dd4-4890-9e11-985b2dc6eb52"
},
{
"category": "Castor",
"image": "https://s3.ap-south-1.amazonaws.com/agrostarcatalog/static/Castor.jpg",
"display": "Castor",
"children": [],
"id": "c4118dea-4ccb-499c-817e-2e8a6b59c6a4"
},
{
"category": "Groundnut",
"image": "https://s3.ap-south-1.amazonaws.com/agrostarcatalog/static/Groundnut.jpg",
"display": "Groundnut",
"children": [],
"id": "e8227101-3a80-468b-84fc-fc9758440b2d"
},
{
"category": "Mustard",
"image": "https://s3.ap-south-1.amazonaws.com/agrostarcatalog/static/Mustard.jpg",
"display": "Mustard",
"children": [],
"id": "bf087d9c-46fc-4317-8915-5c29c9d4be43"
}
],
"id": "8bf12226-353d-4227-aa0d-19f02f8ea76c"
},
{
"category": "Fruits",
"image": "https://s3.ap-south-1.amazonaws.com/agrostarcatalog/static/Fruits.jpg",
"display": "Fruits",
"children": [
{
"category": "Papaya",
"image": "https://s3.ap-south-1.amazonaws.com/agrostarcatalog/static/Papaya.jpg",
"display": "Papaya",
"children": [],
"id": "de8e5fdc-cc01-44be-ba94-679fe98fd370"
},
{
"category": "Watermelon",
"image": "https://s3.ap-south-1.amazonaws.com/agrostarcatalog/static/Watermelon.jpg",
"display": "Watermelon",
"children": [],
"id": "466aeb9e-baf1-48a8-82ab-eaaa5e2d12ec"
},
{
"category": "Mango",
"image": "https://s3.ap-south-1.amazonaws.com/agrostarcatalog/static/Mango.jpg",
"display": "Mango",
"children": [],
"id": "0e1f7681-9cc2-4efa-b726-2102438eec88"
},
{
"category": "Muskmelon",
"image": "https://s3.ap-south-1.amazonaws.com/agrostarcatalog/static/Muskmelon.jpg",
"display": "Muskmelon",
"children": [],
"id": "3b7156ef-5158-4a90-ae1a-4a02688cc4b3"
}
],
"id": "6a8c7f46-277a-4ebf-97ad-6d6ab2554ff2"
},
{
"category": "Spices",
"image": "https://s3.ap-south-1.amazonaws.com/agrostarcatalog/static/Spices.jpg",
"display": "Spices",
"children": [
{
"category": "Fennel",
"image": "https://s3.ap-south-1.amazonaws.com/agrostarcatalog/static/Fennel.jpg",
"display": "Fennel",
"children": [],
"id": "cc8940d3-ca03-4ccc-ba53-4fbf62d560ea"
},
{
"category": "Cumin",
"image": "https://s3.ap-south-1.amazonaws.com/agrostarcatalog/static/Cumin.jpg",
"display": "Cumin",
"children": [],
"id": "432a3fa6-d3ab-4c44-a119-9a6f396a64fd"
},
{
"category": "Fenugreek",
"image": "https://s3.ap-south-1.amazonaws.com/agrostarcatalog/static/Fenugreek.jpg",
"display": "Fenugreek",
"children": [],
"id": "862ea254-95e3-48e4-ac77-7937f91f9a60"
}
],
"id": "d4ee8a4e-a983-4f8b-b6d9-5a10a87bfba5"
},
{
"category": "Flowers",
"image": "https://s3.ap-south-1.amazonaws.com/agrostarcatalog/static/Flowers.jpg",
"display": "Flowers",
"children": [
{
"category": "Tuberose",
"image": "https://s3.ap-south-1.amazonaws.com/agrostarcatalog/static/Tuberose.jpg",
"display": "Tuberose",
"children": [],
"id": "4af24831-9f91-4065-8b33-5af13a310f04"
}
],
"id": "70a34981-b6d9-4f6d-8d85-b2680297867a"
}
],
"id": "5241bf95-38b4-4b66-a4e0-a073a5bf7bc2"
},
"message": ""
}
I would like to read all the values for the key "image" from the entire response body into an ArrayList.
I tried the code below to achieve this:
String yourJson = res.asString();
JsonParser parser = new JsonParser();
JsonElement element = parser.parse(yourJson);
JsonObject obj = element.getAsJsonObject();
Set<Map.Entry<String, JsonElement>> entries = obj.entrySet();
for (Map.Entry<String, JsonElement> entry: entries) {
System.out.println(entry.getKey()); //prints keys
System.out.println(entry.getValue()); //prints values
}
I've done a good amount of searching, but I haven't been able to find an elegant and reusable solution for this problem. I'm open to using any library for this. Please suggest one.
The OP asked how to extract all values of the "image" keys.
This can be achieved in many ways. A simple, classical way uses just regular Java: I have built a method that takes a JSON string and a key and returns a list of URL strings:
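import java.util.ArrayList;
import java.util.List;

// Naive approach: splits on commas, so it only works when no value contains a comma.
public static List<String> extract(String json, String key) {
    List<String> urls = new ArrayList<>();
    String keyStr = "\"" + key + "\":";
    for (String s : json.split(",")) {
        String part = s.trim();
        if (part.startsWith(keyStr)) {
            // Drop the key prefix (whatever its length), then the whitespace and quotes.
            urls.add(part.substring(keyStr.length()).trim().replace("\"", ""));
        }
    }
    return urls;
}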
The output is shown at the end of this answer.
Another way: I found a very interesting and helpful post covering a general way to flatten and map JSON objects. Note that this is geared towards Java 8.
My solution is based on that post.
First of all, import the json-flattener jar (com.github.wnameless). I used Maven to import it:
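Splitting on commas breaks as soon as a value itself contains a comma. A somewhat more tolerant variant of the same plain-Java idea is a regular expression that matches the key followed by a quoted string. This is a sketch under stated assumptions, not a full JSON parser: it only handles string values, and it leaves escape sequences such as \" undecoded. The names KeyValueExtractor and extractStrings are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class KeyValueExtractor {

    /** Returns every string value stored under the given key, in document order. */
    static List<String> extractStrings(String json, String key) {
        // Matches "key" : "value", where value may contain escaped characters.
        Pattern pattern = Pattern.compile(
                "\"" + Pattern.quote(key) + "\"\\s*:\\s*\"((?:[^\"\\\\]|\\\\.)*)\"");
        List<String> values = new ArrayList<>();
        Matcher matcher = pattern.matcher(json);
        while (matcher.find()) {
            values.add(matcher.group(1)); // escape sequences are left undecoded
        }
        return values;
    }

    public static void main(String[] args) {
        String json = "{\"image\":\"a.jpg\",\"children\":[{\"image\": \"b, c.jpg\"}]}";
        // Prints a.jpg and then b, c.jpg, one value per line.
        extractStrings(json, "image").forEach(System.out::println);
    }
}
```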
<dependency>
<groupId>com.github.wnameless</groupId>
<artifactId>json-flattener</artifactId>
<version>0.2.2</version>
</dependency>
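Secondly, I have built a method that takes your JSON content and the key you want to extract, and returns all URLs as a list. It also uses JSONObject from org.json.
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

import com.github.wnameless.json.flattener.JsonFlattener;
import org.json.JSONObject;

public static List<String> extract(String json, String key) {
    List<String> urls = new ArrayList<>();
    try {
        // Parsing with JSONObject first fails early on malformed input.
        JSONObject jsonObject = new JSONObject(json);
        // Each nested value becomes one entry keyed by its path in the document.
        Map<String, Object> flattenedJsonMap =
                JsonFlattener.flattenAsMap(jsonObject.toString());
        flattenedJsonMap.forEach((k, v) -> {
            // Matches any path that contains the key; null values are skipped.
            if (k.contains(key) && v != null) {
                urls.add(v.toString());
            }
        });
    } catch (Exception e) {
        e.printStackTrace();
    }
    return urls;
}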
In your main method, call it with the response body as a string:
List<String> urls = extract(res.asString(), "image");
You can then print the result either the Java 8 way:
urls.forEach(System.out::println);
or the regular Java way:
for (String url : urls) {
System.out.println(url);
}
And here is the resulting list of all URLs:
https://s3.ap-south-1.amazonaws.com/agrostarcatalog/static/Seeds.jpg
https://s3.ap-south-1.amazonaws.com/agrostarcatalog/static/Vegetables.jpg
https://s3.ap-south-1.amazonaws.com/agrostarcatalog/static/Cabbage.jpg
https://s3.ap-south-1.amazonaws.com/agrostarcatalog/static/Cowpea (Chauli).jpg
......etc.
Note: you can also use it to extract the values of other keys. It is also possible to parse a JSON file, but that requires a bit of code modification.
Links:
https://github.com/wnameless/json-flattener
http://crunchify.com/in-java-how-to-flatten-or-unflatten-complex-json-objects-into-flat-map-like-structure/