Parse multiple JSON objects in one file - java

I have multiple JSON objects stored in one file, separated by newline characters (but one object can span multiple lines) - it's output from the MongoDB shell.
What is the easiest way to parse them (get them into an array or collection) using Gson and Java?
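With Gson, one option is JsonStreamParser, which iterates over multiple top-level JSON values in a single stream. A minimal sketch (the class name and file handling are just for illustration):
import com.google.gson.JsonElement;
import com.google.gson.JsonStreamParser;

import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Minimal sketch: JsonStreamParser hands back one top-level JSON value at a time,
// so objects separated by newlines (even ones spanning several lines) come back one by one.
public class MongoShellDump {
    public static List<JsonElement> readAll(String filename) throws IOException {
        List<JsonElement> objects = new ArrayList<>();
        try (FileReader reader = new FileReader(filename)) {
            JsonStreamParser parser = new JsonStreamParser(reader);
            while (parser.hasNext()) {
                objects.add(parser.next()); // one complete JSON object per call
            }
        }
        return objects;
    }
}
Each JsonElement can then be mapped to a POJO with gson.fromJson(element, MyType.class) if needed.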

Another possibility is to use Jackson and its ObjectReader.readValues() methods:
public <T> Iterator<T> readStream(final InputStream _in) throws IOException {
    ObjectMapper mapper = new ObjectMapper();
    // configure object mappings
    ...
    // and then
    return mapper.reader(MapObject.class).readValues(_in);
}
This works well even on large (multi-gigabyte) JSON data files.
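A more self-contained sketch of the same idea, assuming Jackson 2.x (where readerFor() replaces the older reader()) and binding each top-level object to a plain Map:
import com.fasterxml.jackson.databind.MappingIterator;
import com.fasterxml.jackson.databind.ObjectMapper;

import java.io.IOException;
import java.io.InputStream;
import java.util.Map;

// Sketch: MappingIterator binds one top-level JSON object at a time,
// so the whole file never has to fit in memory.
public class JsonStreamRead {
    public static void readAll(InputStream in) throws IOException {
        ObjectMapper mapper = new ObjectMapper();
        MappingIterator<Map<String, Object>> it =
                mapper.readerFor(Map.class).readValues(in);
        while (it.hasNext()) {
            Map<String, Object> obj = it.next(); // one JSON object per iteration
            System.out.println(obj);
        }
    }
}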

Related

Does the StringBuffer change the order of JSON?

I'm using StringBuffer to get JSON from a URL.
This is the original JSON
[{"name":"Italy","topLevelDomain":[".it"],"alpha2Code":"IT","alpha3Code":"ITA","callingCodes":["39"],"capital":"Rome","altSpellings":["IT","Italian Republic","Repubblica italiana"],"region":"Europe","subregion":"Southern Europe","population":60665551,"latlng":[42.83333333,12.83333333],"demonym":"Italian","area":301336.0,"gini":36.0,"timezones":["UTC+01:00"],"borders":["AUT","FRA","SMR","SVN","CHE","VAT"],"nativeName":"Italia","numericCode":"380","currencies":[{"code":"EUR","name":"Euro","symbol":"€"}],"languages":[{"iso639_1":"it","iso639_2":"ita","name":"Italian","nativeName":"Italiano"}],"translations":{"de":"Italien","es":"Italia","fr":"Italie","ja":"イタリア","it":"Italia","br":"Itália","pt":"Itália","nl":"Italië","hr":"Italija","fa":"ایتالیا"},"flag":"https://restcountries.eu/data/ita.svg","regionalBlocs":[{"acronym":"EU","name":"European Union","otherAcronyms":[],"otherNames":[]}],"cioc":"ITA"}]
This is the JSON that I end up with once I convert it to a string from the response:
[{"area":301336,"nativeName":"Italia","capital":"Rome","demonym":"Italian","flag":"https://restcountries.eu/data/ita.svg","alpha2Code":"IT","languages":[{"nativeName":"Italiano","iso639_2":"ita","name":"Italian","iso639_1":"it"}],"borders":["AUT","FRA","SMR","SVN","CHE","VAT"],"subregion":"Southern Europe","callingCodes":["39"],"regionalBlocs":[{"otherNames":[],"acronym":"EU","name":"European Union","otherAcronyms":[]}],"gini":36,"population":60665551,"numericCode":"380","alpha3Code":"ITA","topLevelDomain":[".it"],"timezones":["UTC+01:00"],"cioc":"ITA","translations":{"br":"Itália","de":"Italien","pt":"Itália","ja":"イタリア","hr":"Italija","it":"Italia","fa":"ایتالیا","fr":"Italie","es":"Italia","nl":"Italië"},"name":"Italy","altSpellings":["IT","Italian Republic","Repubblica italiana"],"region":"Europe","latlng":[42.83333333,12.83333333],"currencies":[{"symbol":"\u20ac","code":"EUR","name":"Euro"}]}]
This is my code for getting the JSON and converting it:
JSONArray JSON = null;
// Reading variables
BufferedReader r = new BufferedReader(new InputStreamReader(con.getInputStream()));
String input;
StringBuffer response = new StringBuffer();
// Adding response to StringBuffer
while ((input = r.readLine()) != null) {
    response.append(input);
}
// Stopping the reader
r.close();
System.out.println(response);
// Convert StringBuffer to JSON
JSON = new JSONArray(response.toString());
System.out.println(JSON);
return JSON;
Is there a way of preventing it from doing this?
It's not the StringBuffer but the JSONArray.
The order of elements in an array [] is maintained, like the list ["AUT","FRA","SMR","SVN","CHE","VAT"] in both examples.
Anything that is a name/value pair surrounded by {} can be reordered, like {"code":"EUR","name":"Euro","symbol":"€"} and {"symbol":"\u20ac","code":"EUR","name":"Euro"}.
To prevent this, you can keep it as a String, or create your own object and define the toString method (a small sketch follows below).
Your question is similar to Keep the order of the JSON keys during JSON conversion to CSV.
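For illustration, a minimal sketch of the "own object" idea: a tiny value class (Currency is just an illustrative name) whose toString() emits the fields in a fixed order, so the textual form never depends on map iteration order.
// Sketch: the field order in toString() is fixed by hand instead of depending on a Map.
public class Currency {
    private final String code;
    private final String name;
    private final String symbol;

    public Currency(String code, String name, String symbol) {
        this.code = code;
        this.name = name;
        this.symbol = symbol;
    }

    @Override
    public String toString() {
        return String.format("{\"code\":\"%s\",\"name\":\"%s\",\"symbol\":\"%s\"}",
                code, name, symbol);
    }
}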
It is not StringBuffer doing this. It is the JSON implementation itself.
For a start, according to all of the JSON specifications that I have seen, the order of the attributes of a JSON object is not significant. A JSON parser is not expected to preserve attribute order, and neither is the in-memory representation of a JSON object. So, for example, a typical in-memory representation of a JSON object uses a HashMap to hold the attribute names and values.
So my first piece of advice to you would be to change your application so that the order of the JSON attributes doesn't matter. If you design a JSON API where attribute order matters, then your API will be problematic.
(If this is in a testcase, it is not difficult to compare JSON properly. For example, parse the JSON and compare objects attribute by attribute.)
If you are lumbered with a (so-called) JSON API where the order of attributes has some meaning, my advice is:
Complain. Submit a bug report. This is not a proper JSON API.
Look for a JSON library that provides a way to work around the bad design. For example, some libraries allow you to provide a Map class to be used when constructing a JSONObject. The default is usually HashMap, but you could use LinkedHashMap instead.
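If Jackson is an option, a hedged sketch of that workaround: Jackson binds untyped JSON objects to LinkedHashMap by default, so attribute order survives a parse/re-serialize round trip (relying on that order is still discouraged, as explained above).
import com.fasterxml.jackson.core.type.TypeReference;
import com.fasterxml.jackson.databind.ObjectMapper;

import java.util.LinkedHashMap;
import java.util.Map;

// Sketch: bind to LinkedHashMap explicitly so the keys keep their original order.
public class OrderPreservingParse {
    public static void main(String[] args) throws Exception {
        String json = "{\"code\":\"EUR\",\"name\":\"Euro\",\"symbol\":\"€\"}";
        ObjectMapper mapper = new ObjectMapper();
        Map<String, Object> obj = mapper.readValue(
                json, new TypeReference<LinkedHashMap<String, Object>>() {});
        // keys come back in the original order: code, name, symbol
        System.out.println(mapper.writeValueAsString(obj));
    }
}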

List of HashMap into JSONs string with new line in Java

I have to convert a List into a JSON string with new lines.
Right now the code I am using converts the List of HashMap into a single JSON string, like below:
List<HashMap> mapList = new ArrayList<>();
HashMap hashmap = new HashMap();
hashmap.put("name", "SO");
hashmap.put("rollNo", "1");
mapList.add(hashmap);
HashMap hashmap1 = new HashMap();
hashmap1.put("name", "SO1");
hashmap1.put("rollNo", "2");
mapList.add(hashmap1);
Now I am converting it into a JSON string using ObjectMapper, and the output is:
ObjectMapper mapper = new ObjectMapper();
String output = mapper.writeValueAsString(mapList);
Output:
[{"name":"SO","rollNo":1},{"name":"SO1","rollNo":2}]
It's working fine, but I need the output in the format shown below, i.e. for every HashMap there should be a new line in the JSON string:
[{"name":"SO","rollNo":1},
{"name":"SO1","rollNo":2}]
If I understand the question correctly, you can use:
output.replaceAll(",",",\n");
or you can go through each map in the list and call
mapper.writeValueAsString(map);
or use configuration (in Jackson 2.x the equivalent is SerializationFeature.INDENT_OUTPUT):
ObjectMapper objectMapper = new ObjectMapper();
objectMapper.configure(SerializationConfig.Feature.INDENT_OUTPUT, true);
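A rough sketch of the per-map variant (assuming Jackson 2.x; the class and variable names are just for illustration): each map is serialized separately and the rows are joined with ",\n".
import com.fasterxml.jackson.databind.ObjectMapper;

import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch: one writeValueAsString() call per map, joined with ",\n",
// so every map ends up on its own line inside [ ... ].
public class PerLineJson {
    public static void main(String[] args) throws Exception {
        List<Map<String, Object>> mapList = new ArrayList<>();
        Map<String, Object> m1 = new LinkedHashMap<>();
        m1.put("name", "SO");
        m1.put("rollNo", 1);
        mapList.add(m1);
        Map<String, Object> m2 = new LinkedHashMap<>();
        m2.put("name", "SO1");
        m2.put("rollNo", 2);
        mapList.add(m2);

        ObjectMapper mapper = new ObjectMapper();
        StringBuilder sb = new StringBuilder("[");
        for (int i = 0; i < mapList.size(); i++) {
            sb.append(mapper.writeValueAsString(mapList.get(i)));
            if (i < mapList.size() - 1) {
                sb.append(",\n");
            }
        }
        sb.append("]");
        System.out.println(sb); // prints each map on its own line
    }
}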
I suggest a slightly different path, and that is to use a custom serializer, as outlined here, for example.
It boils down to having your own
public static class MgetSerializer extends JsonSerializer<Mget> {
which works for a List, for example.
The point is: I would avoid "mixing" things, i.e. having a solution where your code writes part of the output and Jackson creates other parts of the output. Rather, enable Jackson to do exactly what you want it to do.
Beyond that, I find the whole approach a bit dubious in the first place. JSON strings do not care about newlines. So, if you care how things are formatted, rather look into the tools you are using to look at your JSON.
Meaning: why waste your time formatting a string that isn't meant for direct human consumption in the first place? Browser consoles will show you JSON strings in a "folded" way, and any decent editor has similar capabilities these days.
In other words: I think you are investing your energy in the wrong place. JSON is a transport format, and you should only worry about the content you want to transmit, not about (essentially meaningless) formatting effects.
You can use String methods to change/replace the output String. However, this is not safe for JSON strings, as they may contain commas or other characters that you would have to escape in the String replace methods.
Alternatively, you can parse the JSON string and iterate over it with JsonNode, as below:
ObjectMapper mapper = new ObjectMapper();
String output = mapper.writeValueAsString(mapList);
JsonNode jsonNode = mapper.readTree(output);
Iterator<JsonNode> iter = jsonNode.iterator();
String result = "[";
while (iter.hasNext()) {
    result += iter.next().toString() + ",\n";
}
result = result.substring(0, result.length() - 2) + "]";
System.out.println(result);
Result:
[{"rollNo":"1","name":"SO"},
{"rollNo":"2","name":"SO1"}]
This approach also works for strings containing characters like commas; for example, consider the input hashmap.put("n,,,ame", "SO");
The result is:
[{"n,,,ame":"SO","rollNo":"1"},
{"rollNo":"2","name":"SO1"}]

Jackson - Read different object one by one from file

I have a file like this:
[{
"messageType": "TYPE_1",
"someData": "Data"
},
{
"messageType": "TYPE_2",
"dataVersion": 2
}]
As you can see, there is a file which contains different types of JSON objects. I also have an ObjectMapper which is able to parse both types. I have to read the JSON objects one by one (because this file can be pretty huge) and get the right object (Type1Obj or Type2Obj) for each of them.
My question is how I can read the JSON objects one by one from the file with Jackson.
You could read the array as a generic Jackson JSON tree, similar to:
ObjectMapper objectMapper = new ObjectMapper();
JsonNode rootNode = objectMapper.readTree(jsonData);
then traverse all the children of the array using
rootNode#elements()
and parse every one of the JsonNode children into the respective type using a check on messageType, similar to:
if ("TYPE_1".equals(childNode.get("messageType")) {
Type1Class type1 = objectMapper.treeToValue(childNode, Type1Class.class);
} else // ...
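Note that readTree() on the whole file loads the entire array into memory. Since the file can be huge, here is a hedged sketch of a streaming variant (assuming Jackson 2.x; Type1Class/Type2Class are minimal stand-ins for the poster's own message types):
import com.fasterxml.jackson.core.JsonParser;
import com.fasterxml.jackson.core.JsonToken;
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;

import java.io.File;
import java.io.IOException;

// Sketch: advance a JsonParser over the array and read one object at a time,
// so only a single element is ever held in memory.
public class StreamingDispatch {
    static class Type1Class { public String messageType; public String someData; }
    static class Type2Class { public String messageType; public int dataVersion; }

    public static void readFile(File file) throws IOException {
        ObjectMapper mapper = new ObjectMapper();
        try (JsonParser parser = mapper.getFactory().createParser(file)) {
            if (parser.nextToken() != JsonToken.START_ARRAY) {
                throw new IllegalStateException("Expected a JSON array");
            }
            while (parser.nextToken() == JsonToken.START_OBJECT) {
                JsonNode node = mapper.readTree(parser); // reads exactly one object
                String type = node.get("messageType").asText();
                if ("TYPE_1".equals(type)) {
                    Type1Class t1 = mapper.treeToValue(node, Type1Class.class);
                    // handle t1 ...
                } else if ("TYPE_2".equals(type)) {
                    Type2Class t2 = mapper.treeToValue(node, Type2Class.class);
                    // handle t2 ...
                }
            }
        }
    }
}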

Hadoop + Jackson parsing: ObjectMapper reads Object and then breaks

I am implementing a JSON RecordReader in Hadoop with Jackson.
By now I am testing locally with JUnit + MRUnit.
The JSON files contain one object each; after some headers, it has a field whose value is an array of entries, each of which I want to treat as a record (so I need to skip those headers).
I am able to do this by advancing the FSDataInputStream up to the point of reading.
In my local testing, I do the following:
fs = FileSystem.get(new Configuration());
in = fs.open(new Path(filename));
long offset = getOffset(in, "HEADER_START_HERE");
in.seek(offset);
where getOffset is a function that positions the InputStream at the point where the field value starts - which works OK, judging by the in.getPos() value.
I am reading the first record by:
ObjectMapper mapper = new ObjectMapper();
JsonNode actualObj = mapper.readValue (in, JsonNode.class);
The first record comes back fine. I can use mapper.writeValueAsString(actualObj); it has been read correctly and is valid JSON.
Fine till here.
So I try to iterate over the objects by doing:
ObjectMapper mapper = new ObjectMapper();
JsonNode actualObj = null;
do {
    actualObj = mapper.readValue(in, JsonNode.class);
    if (actualObj != null) {
        LOG.info("ELEMENT:\n" + mapper.writeValueAsString(actualObj));
    }
} while (actualObj != null);
And it reads the first one, but then it breaks:
java.lang.NullPointerException: null
at org.apache.hadoop.fs.BufferedFSInputStream.getPos(BufferedFSInputStream.java:54)
at org.apache.hadoop.fs.FSDataInputStream.getPos(FSDataInputStream.java:57)
at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker.readChunk(ChecksumFileSystem.java:243)
at org.apache.hadoop.fs.FSInputChecker.readChecksumChunk(FSInputChecker.java:273)
at org.apache.hadoop.fs.FSInputChecker.read1(FSInputChecker.java:225)
at org.apache.hadoop.fs.FSInputChecker.read(FSInputChecker.java:193)
at java.io.DataInputStream.read(DataInputStream.java:132)
at org.codehaus.jackson.impl.ByteSourceBootstrapper.ensureLoaded(ByteSourceBootstrapper.java:340)
at org.codehaus.jackson.impl.ByteSourceBootstrapper.detectEncoding(ByteSourceBootstrapper.java:116)
at org.codehaus.jackson.impl.ByteSourceBootstrapper.constructParser(ByteSourceBootstrapper.java:197)
at org.codehaus.jackson.JsonFactory._createJsonParser(JsonFactory.java:503)
at org.codehaus.jackson.JsonFactory.createJsonParser(JsonFactory.java:365)
at org.codehaus.jackson.map.ObjectMapper.readValue(ObjectMapper.java:1158)
Why is this exception happening?
Does it have to do with being reading locally?
Is it needed some kind of reset or something when reusing an ObjectMapper or its underlying stream?
I managed to work around it. In case it helps:
First of all, I'm using Jackson 1.x latest version.
It seems that once a JsonParser is instantiated with an InputStream, it takes control of it.
So, when using readValue(), once the value is read, the stream gets closed (internally it calls _readMapAndClose(), which automatically closes the stream).
There is a setting you can use to tell the JsonParser not to close the underlying stream. You can pass it to your JsonFactory like this, before you create your JsonParser:
JsonFactory f = new MappingJsonFactory();
f.configure(JsonParser.Feature.AUTO_CLOSE_SOURCE, false);
Beware that you are then responsible for closing the stream (FSDataInputStream in my case).
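Put together, a rough sketch of the workaround (Jackson 1.x, matching the stack trace above; it assumes the stream contains whitespace-separated JSON values, and the processing is just a placeholder):
import java.io.IOException;
import java.io.InputStream;

import org.codehaus.jackson.JsonFactory;
import org.codehaus.jackson.JsonNode;
import org.codehaus.jackson.JsonParser;
import org.codehaus.jackson.map.MappingJsonFactory;

// Sketch: AUTO_CLOSE_SOURCE is disabled so reading a value never closes the stream,
// and values are read until the parser runs out of tokens.
public class NonClosingReader {
    public static void readAll(InputStream in) throws IOException {
        JsonFactory f = new MappingJsonFactory();
        f.configure(JsonParser.Feature.AUTO_CLOSE_SOURCE, false);
        JsonParser parser = f.createJsonParser(in);
        while (parser.nextToken() != null) {
            JsonNode node = parser.readValueAsTree(); // one complete JSON value
            System.out.println(node);
        }
        // the caller, not Jackson, is responsible for closing 'in' afterwards
    }
}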
So, answers:
Why is this exception happening?
Because the parser manages the stream, and closes it after readValue().
Does it have to do with being reading locally?
No
Is it needed some kind of reset or something when reusing an ObjectMapper or its underlying stream?
No. What you need to be aware of when mixing the streaming API with ObjectMapper-style methods is that the mapper/parser may sometimes take control of the underlying stream. Refer to the Javadoc of JsonParser and check the documentation of each reading method to meet your needs.

Alternatives to JSON-object binding in Android application

From my Android application I need to use a RESTful web service that returns a list of objects in JSON format.
This list can be very long (about 1000-2000 objects).
What I need to do is search for and retrieve just some of the objects inside the JSON file.
Due to the limited memory of a mobile device, I was thinking that using object binding (for example with the GSON library) could be dangerous.
What are the alternatives for solving this problem?
If you are using Gson, use Gson streaming.
I've taken the sample from the link and added my comments inside it:
public List<Message> readJsonStream(InputStream in) throws IOException {
    JsonReader reader = new JsonReader(new InputStreamReader(in, "UTF-8"));
    List<Message> messages = new ArrayList<Message>();
    reader.beginArray();
    while (reader.hasNext()) {
        Message message = gson.fromJson(reader, Message.class);
        // TODO : write an if statement
        if (someCase) {
            messages.add(message);
            // if you want to use less memory, don't add the objects into an array.
            // write them to the disk (i.e. use sql lite, shared preferences or a file...)
            // and retrieve them when you need.
        }
    }
    reader.endArray();
    reader.close();
    return messages;
}
For example:
1) Read the list as a stream, handle the individual JSON entities on the fly, and save only those that are of interest to you.
2) Read the data into String object(s), then find the JSON entities and handle them one by one instead of all at once. Ways to analyse the String for JSON structures include regular expressions or manual indexOf combined with substring-style analysis (a rough sketch follows after this list).
1) is more efficient but requires a bit more work, as you have to handle the stream as you go, whereas 2) is probably simpler but requires quite big Strings as temporary storage.
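A rough sketch of option 2), splitting a String into top-level {...} blocks by tracking brace depth. It is only a toy illustration: it ignores braces that appear inside JSON string values, which a real parser handles correctly.
import java.util.ArrayList;
import java.util.List;

// Toy sketch: collect every top-level {...} substring by counting brace depth.
public class NaiveJsonSplitter {
    public static List<String> splitObjects(String data) {
        List<String> objects = new ArrayList<String>();
        int depth = 0;
        int start = -1;
        for (int i = 0; i < data.length(); i++) {
            char c = data.charAt(i);
            if (c == '{') {
                if (depth == 0) {
                    start = i;
                }
                depth++;
            } else if (c == '}') {
                depth--;
                if (depth == 0 && start >= 0) {
                    objects.add(data.substring(start, i + 1));
                }
            }
        }
        return objects;
    }
}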
