I want to be able to read a stream (from a socket) of JSON messages using Jackson 2.
There are ways to pass a Reader as the source, such as doing:
ObjectMapper mapper = new ObjectMapper();
MyObject obj = mapper.readValue(aReader, MyObject.class);
but that will block until the entire json message has arrived and I want to avoid that.
Is there a way to have a buffer to which I can keep adding bytes with the ability to ask if the buffer contains a full json representation of a specific class?
Something like:
JsonBuffer buffer = new JsonBuffer(MyObject.class);
...
buffer.add(readBytes);
if (buffer.hasObject()) {
MyObject obj = buffer.readObject();
}
Thanks.
Jackson supports non-blocking JSON stream parsing as of 2.9. You can find an example of how to use it in Spring Framework 5's Jackson2Tokenizer.
(I know this thread is old, but since there is no accepted answer, I wanted to add mine, just in case anyone still reads this).
I just published a new library called Actson (https://github.com/michel-kraemer/actson). It works almost like the OP suggested. You can feed it with bytes until it returns one or more JSON events. When it has consumed all input data, you feed it with more bytes and get the next JSON events. This process continues until the JSON text has been fully consumed.
If you know Aalto XML (https://github.com/FasterXML/aalto-xml) then you should be able to familiarise yourself with Actson quickly because the interface is almost the same.
Here's a quick example:
// JSON text to parse
byte[] json = "{\"name\":\"Elvis\"}".getBytes(StandardCharsets.UTF_8);
JsonParser parser = new JsonParser(StandardCharsets.UTF_8);
int pos = 0; // position in the input JSON text
int event; // event returned by the parser
do {
// feed the parser until it returns a new event
while ((event = parser.nextEvent()) == JsonEvent.NEED_MORE_INPUT) {
// provide the parser with more input
pos += parser.getFeeder().feed(json, pos, json.length - pos);
// indicate end of input to the parser
if (pos == json.length) {
parser.getFeeder().done();
}
}
// handle event
System.out.println("JSON event: " + event);
if (event == JsonEvent.ERROR) {
throw new IllegalStateException("Syntax error in JSON text");
}
} while (event != JsonEvent.EOF);
You can use JsonParser to get individual events/tokens (which is what ObjectMapper uses internally), and this allows more granular access. But all current functionality uses blocking IO, so there is no way to do so-called non-blocking (aka "async") parsing.
EDIT: 2019-09-18 -- correction: Jackson 2.9 (https://github.com/FasterXML/jackson/wiki/Jackson-Release-2.9) added support for non-blocking/async JSON parsing (issue https://github.com/FasterXML/jackson-core/issues/57)
This is not the answer to my question, but more of a workaround I came up with.
Instead of dealing with the non-blocking IO on the Jackson side of things, I implemented it in my protocol.
All JSON messages, when sent, are prefixed with a 4-byte int that holds the length of the rest of the message.
Reading a JSON message now becomes easy: I find out what the length is, read that many bytes asynchronously, and then use Jackson with the resulting string.
If anyone does know how this can be done straight from Jackson, I'm still interested to know.
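The length-prefix workaround described above can be sketched with the JDK alone. This is a minimal illustration, not the poster's actual code: the class name LengthPrefixedBuffer and its methods are hypothetical, and a big-endian 4-byte length followed by a UTF-8 payload is an assumption about the wire format.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Queue;

/** Hypothetical helper: collects bytes and splits them into length-prefixed frames. */
final class LengthPrefixedBuffer {
    private byte[] pending = new byte[0];               // bytes received but not yet framed
    private final Queue<String> messages = new ArrayDeque<>();

    /** Sender side: 4-byte big-endian length, then the UTF-8 payload (assumed format). */
    static byte[] frame(String json) {
        byte[] payload = json.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(4 + payload.length)
                .putInt(payload.length)
                .put(payload)
                .array();
    }

    /** Receiver side: adds newly read bytes and extracts every complete frame; never blocks. */
    void add(byte[] chunk, int offset, int length) {
        byte[] joined = Arrays.copyOf(pending, pending.length + length);
        System.arraycopy(chunk, offset, joined, pending.length, length);
        ByteBuffer buf = ByteBuffer.wrap(joined);
        while (buf.remaining() >= 4) {
            buf.mark();
            int size = buf.getInt();
            if (size < 0) {
                throw new IllegalStateException("Negative frame length: " + size);
            }
            if (buf.remaining() < size) {
                buf.reset();                            // payload incomplete: keep the header for later
                break;
            }
            byte[] payload = new byte[size];
            buf.get(payload);
            messages.add(new String(payload, StandardCharsets.UTF_8));
        }
        pending = Arrays.copyOfRange(joined, buf.position(), joined.length);
    }

    boolean hasMessage() {
        return !messages.isEmpty();
    }

    String readMessage() {
        return messages.remove();
    }
}
```

Each string returned by readMessage() is a complete JSON text, so it can be handed to mapper.readValue(text, MyObject.class) without the parser ever waiting on the socket. Copying the pending bytes on every add() keeps the sketch short; a production version would probably reuse one growing buffer.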
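I found a working solution for this problem. You can use the method inputStream.available to check whether there are any bytes in the stream; this method is non-blocking. So you can use it to check whether there is something to read: if yes, parse a value; if not, wait some time and check again. Note that available() > 0 only means that some bytes have arrived, not a complete message, so the parse call itself can still block until the rest of the message arrives. Two examples are shown below.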
Safe style - with checking for START_OBJECT json token:
while (run.get()) {
if (inputStream.available() > 0) {
for (JsonToken jsonToken; (null != (jsonToken = jsonParser.nextToken())); ) {
if (JsonToken.START_OBJECT.equals(jsonToken)) {
outputStream.onNext(jsonParser.readValueAs(tClass));
break;
}
}
} else {
Thread.sleep(200); // Or can be another checking time.
}
}
Or the simplest style:
while (run.get()) {
if (inputStream.available() > 0) {
outputStream.onNext(jsonParser.readValueAs(tClass));
} else {
Thread.sleep(200); // Or can be another checking time.
}
}
What about using Gson?
private fun handleActions(webSocketMessage: WebSocketMessage, webSocketSession: WebSocketSession): Mono<WebSocketMessage> {
val action = Gson().fromJson(webSocketMessage.payloadAsText, ActionWs::class.java)
return when (action.action) {
"REGISTER" -> companyService.createCompany(action.company)
.map { webSocketSession.textMessage(jacksonObjectMapper().writeValueAsString(it)) }
else -> Mono.just(webSocketSession.textMessage(Gson().toJson(CompanyInfo(0, 0, 0, "ACAO INVALIDA!"))))
}
}
Related
I have a JSON document that looks like this:
It's a collection of arrays - findings, assets, assetGroups, etc. I wrote a function that takes the filename and the requested array name, and returns an ArrayList<> of the array entries as strings (which I re-parse to JSON on the client side).
This works great when the files are smaller, but this one file is over 1.6GB in size so I blow out memory if I try to instantiate it all as a JSONObject. I want to try the Jackson or Gson streaming APIs, but I'm getting wrapped around the axle trying to mix streaming APIs with direct DOM access. Like, I want to stream the JSON until I reach the "assetGroups" node, then iterate over that array and return the List<> of its contents.
Does that make sense? Any help?
(The following is for Gson)
You would probably have to start with a JsonReader to parse the JSON document in a streaming way. You would then first use its beginObject() method to enter the top level JSON object. Afterwards in a loop guarded by a hasNext() check you would obtain the next member name with nextName() and compare it with the desired name. If the name is not the one you are interested in, you can call skipValue() to skip its value:
JsonReader jsonReader = new JsonReader(reader);
jsonReader.beginObject();
while (jsonReader.hasNext()) {
String name = jsonReader.nextName();
if (name.equals(desiredName)) {
... // extract the data; see next section of this answer
// Alternatively you could also return here already, ignoring the remainder
// of the JSON data
} else {
jsonReader.skipValue();
}
}
jsonReader.endObject();
if (jsonReader.peek() != JsonToken.END_DOCUMENT) {
throw new IllegalArgumentException("Trailing data after JSON value");
}
You might also want to add checks to verify that the desired member actually exists in the JSON document, and to verify that it only exists exactly once and not multiple times.
If you only want to write back a subset of the JSON document without performing any content modifications to it, then you don't need to parse it into DOM objects. Instead you could directly read from the JsonReader and write to a JsonWriter:
static void transferTo(JsonReader reader, JsonWriter writer) throws IOException {
NumberHolder numberHolder = new NumberHolder();
int nestingDepth = 0;
while (true) {
JsonToken token = reader.peek();
switch (token) {
case BEGIN_ARRAY:
reader.beginArray();
writer.beginArray();
nestingDepth++;
break;
case END_ARRAY:
reader.endArray();
writer.endArray();
nestingDepth--;
if (nestingDepth <= 0) {
return;
}
break;
case BEGIN_OBJECT:
reader.beginObject();
writer.beginObject();
nestingDepth++;
break;
case NAME:
writer.name(reader.nextName());
break;
case END_OBJECT:
reader.endObject();
writer.endObject();
nestingDepth--;
if (nestingDepth <= 0) {
return;
}
break;
case BOOLEAN:
writer.value(reader.nextBoolean());
break;
case NULL:
reader.nextNull();
writer.nullValue();
break;
case NUMBER:
// Read the number as string
String numberAsString = reader.nextString();
// Slightly hacky workaround to preserve the original number value
// without having to parse it (which could lead to precision loss)
numberHolder.value = numberAsString;
writer.value(numberHolder);
break;
case STRING:
writer.value(reader.nextString());
break;
case END_DOCUMENT:
throw new IllegalStateException("Unexpected end of document");
default:
throw new AssertionError("Unknown JSON token: " + token);
}
}
}
You would call this method at the location marked with "extract the data" in the first code snippet.
If instead you do need the relevant JSON document section as DOM, then you can first use Gson.getAdapter to obtain the adapter, for example for your List<...> or for JsonArray (the generic DOM class; here it is less likely that you risk any precision loss during conversion). And then you can use read(JsonReader) of that adapter at the location marked with "extract the data" in the first code snippet.
I would not recommend directly using Gson.fromJson(JsonReader, ...) because it changes the configuration of the given JsonReader; unfortunately this pitfall is not properly documented at the moment (Gson issue).
Note that both approaches do not preserve the original JSON formatting, but content-wise that should not make a difference.
I'm trying to parse a huge JSON file (like http://eu.battle.net/auction-data/258993a3c6b974ef3e6f22ea6f822720/auctions.json) using the Gson library (http://code.google.com/p/google-gson/) in Java.
I would like to know the best approach to parsing this kind of big file (about 80k lines), and whether you know a good API that can help me process it.
Some ideas...
read line by line and get rid of the JSON format: but that's nonsense.
reduce the JSON file by splitting it into many others: but I did not find any good Java API for this.
use this file directly as a NoSQL database: keep the file and use it as my database.
I would really appreciate advice/help/messages :-)
Thanks.
You don't need to switch to Jackson. Gson 2.1 introduced a new TypeAdapter interface that permits mixed tree and streaming serialization and deserialization.
The API is efficient and flexible. See Gson's Streaming doc for an example of combining tree and binding modes. This is strictly better than mixed streaming and tree modes; with binding you don't waste memory building an intermediate representation of your values.
Like Jackson, Gson has APIs to recursively skip an unwanted value; Gson calls this skipValue().
I suggest having a look at the Jackson API. It makes it very easy to combine the streaming and tree-model parsing options: you can move through the file as a whole in a streaming way, and then read individual objects into a tree structure.
As an example, let's take the following input:
{
"records": [
{"field1": "aaaaa", "bbbb": "ccccc"},
{"field2": "aaa", "bbb": "ccc"}
] ,
"special message": "hello, world!"
}
Just imagine the fields being sparse or the records having a more complex structure.
The following snippet illustrates how this file can be read using a combination of stream and tree-model parsing. Each individual record is read in a tree structure, but the file is never read in its entirety into memory, making it possible to process JSON files gigabytes in size while using minimal memory.
import org.codehaus.jackson.map.*;
import org.codehaus.jackson.*;
import java.io.File;
public class ParseJsonSample {
public static void main(String[] args) throws Exception {
JsonFactory f = new MappingJsonFactory();
JsonParser jp = f.createJsonParser(new File(args[0]));
JsonToken current;
current = jp.nextToken();
if (current != JsonToken.START_OBJECT) {
System.out.println("Error: root should be object: quitting.");
return;
}
while (jp.nextToken() != JsonToken.END_OBJECT) {
String fieldName = jp.getCurrentName();
// move from field name to field value
current = jp.nextToken();
if (fieldName.equals("records")) {
if (current == JsonToken.START_ARRAY) {
// For each of the records in the array
while (jp.nextToken() != JsonToken.END_ARRAY) {
// read the record into a tree model,
// this moves the parsing position to the end of it
JsonNode node = jp.readValueAsTree();
// And now we have random access to everything in the object
System.out.println("field1: " + node.get("field1").getValueAsText());
System.out.println("field2: " + node.get("field2").getValueAsText());
}
} else {
System.out.println("Error: records should be an array: skipping.");
jp.skipChildren();
}
} else {
System.out.println("Unprocessed property: " + fieldName);
jp.skipChildren();
}
}
}
}
As you can guess, the nextToken() call each time gives the next parsing event: start object, start field, start array, start object, ..., end object, ..., end array, ...
The jp.readValueAsTree() call lets you read what is at the current parsing position, a JSON object or array, into Jackson's generic JSON tree model. Once you have this, you can access the data randomly, regardless of the order in which things appear in the file (in the example field1 and field2 are not always in the same order). Jackson supports mapping onto your own Java objects too. The jp.skipChildren() call is convenient: it lets you skip over a complete object tree or an array without having to step through all the events contained in it yourself.
The Declarative Stream Mapping (DSM) library allows you to define mappings between your JSON or XML data and your POJO, so you don't need to write a custom parser. It has powerful scripting support (JavaScript, Groovy, JEXL). You can filter and transform data while you are reading it, and you can call functions to operate on partial data as it is read. DSM reads data as a stream, so it uses very little memory.
For example,
{
"company": {
....
"staff": [
{
"firstname": "yong",
"lastname": "mook kim",
"nickname": "mkyong",
"salary": "100000"
},
{
"firstname": "low",
"lastname": "yin fong",
"nickname": "fong fong",
"salary": "200000"
}
]
}
}
Imagine the above snippet is part of a huge and complex JSON document. We only want to get staff that have a salary higher than 10000.
First of all, we must define the mapping definitions as follows. As you can see, it is just a YAML file that contains the mapping between POJO fields and fields of the JSON data.
result:
  type: object # result is a map or an object
  path: /.+staff # path is a regex; it matches /company/staff
  function: processStuff # call the processStuff function when the /company/staff tag is closed
  filter: self.data.salary>10000 # any expression valid in JavaScript, Groovy or JEXL
  fields:
    name:
      path: firstname
    sureName:
      path: lastname
    userName:
      path: nickname
    salary: long
Create a FunctionExecutor to process staff.
FunctionExecutor processStuff=new FunctionExecutor(){
@Override
public void execute(Params params) {
// directly serialize Stuff class
//Stuff stuff=params.getCurrentNode().toObject(Stuff.class);
Map<String,Object> stuff= (Map<String,Object>)params.getCurrentNode().toObject();
System.out.println(stuff);
// process stuff ; save to db. call service etc.
}
};
Use DSM to process JSON
DSMBuilder builder = new DSMBuilder(new File("path/to/mapping.yaml")).setType(DSMBuilder.TYPE.JSON);
// register processStuff Function
builder.registerFunction("processStuff",processStuff);
DSM dsm= builder.create();
Object object = dsm.toObject(jsonContent);
Output: (Only staff with a salary higher than 10000 are included)
{firstName=low, lastName=yin fong, nickName=fong fong, salary=200000}
I am using google GSON API to parse a JSON file for my Android project but I have an issue of performance.
Here is the source code I use for parsing the JSON with google GSON API :
public void loadJsonInDb(String path){
InputStream isJson = context.getAssets().open(path);
if (isJson != null) {
int sizeJson = isJson.available();
byte[] bufferJson = new byte[sizeJson];
isJson.read(bufferJson);
isJson.close();
String jsonStr = new String(bufferJson, "UTF-8");
JsonParser parser = new JsonParser();
JsonObject object = parser.parse(jsonStr).getAsJsonObject();
JsonArray array = object.getAsJsonArray("datas");
Gson gson = new Gson();
for(JsonElement jsonElement : array){
MyEntity entity = gson.fromJson(jsonElement, MyEntity.class);
// Do insert into Db stuffs
}
}
}
The problem with this is that after parsing I have to go through the JsonArray with a for loop and perform the desired action (an insertion into a SQLite DB with ORMLite for each element in the array). I would like to know if it is possible to perform the insertion on the fly during parsing, instead of waiting for the array to be computed. I have seen in the documentation that maybe JsonStreamParser can do the job, but I am not sure how to use it.
I have a few notes regarding the use of Gson and other stuff.
You should close I/O resources in finally blocks to ensure you don't have resource leaks (available and read may throw an exception that prevents the resource from being closed). (Also I'm not sure if using available is a good idea here.)
You don't have to use Strings in this case. Strings are generally a performance/memory killer in such a scenario (much depends on their sizes) because the whole string is accumulated in memory; you lose the on-the-fly idea if everything is collected into memory first. In the worst case, it can end your application with an OutOfMemoryError.
You can read input streams with a specified encoding, so no string-buffering is necessary.
JsonParser is designed to return JSON trees: JsonElement contains the whole JSON tree in memory. Sounds similar to the strings case above, right? Another performance penalty here.
Creating Gson instances may be somewhat expensive (depending on what you compare it to, of course), and you can instantiate it once: it's thread-safe.
JsonStreamParser is not an option either, because each next() will produce another JSON tree branch in memory (again, it depends on how big your JSON documents, the $.datas array and its elements are).
Gson.fromJson uses a lookup to find the best type adapter; you can ask a Gson instance for a type adapter once and then not waste time on lookups anymore. Type adapters are usually perfectly thread-safe too, and thus can be cached.
Summarizing the above, you could implement it as follows:
private static final Gson gson = new Gson();
private static final TypeAdapter<MyEntity> myEntityTypeAdapter = gson.getAdapter(MyEntity.class);
private static void loadJsonInDb(final String path)
throws IOException {
// Java 7 language features can be easily converted to Java 6 try/finally
// Note the way how you can decorate (wrap) everything: an input stream (byte streams) to a reader (character streams, UTF-8 here) to a JSON reader (more high-level character reader)
try ( final JsonReader jsonReader = new JsonReader(new InputStreamReader(context.getAssets().open(path), "UTF-8")) ) {
// Ensure that we're about to open the root object
jsonReader.beginObject();
// And iterate each object property
while ( jsonReader.hasNext() ) {
// And check its name
final String name = jsonReader.nextName();
// Another Java 7 language feature
switch ( name ) {
// Is it datas?
case "datas":
// Then consume its opening array token
jsonReader.beginArray();
// And iterate each array element
while ( jsonReader.hasNext() ) {
// Read the current value as an MyEntity instance
final MyEntity myEntity = myEntityTypeAdapter.read(jsonReader);
// Now do what you want here
}
// "Close" the array
jsonReader.endArray();
break;
default:
// If it's something other than "datas" -- just skip the entire value -- Gson will do it efficiently (I hope, not sure)
jsonReader.skipValue();
break;
}
}
// "Close" the object
jsonReader.endObject();
}
}
Simply speaking, you just have to write a parser to consume each token. Now, having the following JSON document:
{
"object": {
},
"number": 2,
"array": [
],
"datas": [
{
"k": "v1"
},
{
"k": "v2"
},
{
"k": "v3"
}
]
}
the parser above would extract $.datas.* only, consuming as few resources as possible. Substituting // Now do what you want here with System.out.println(myEntity.k); would produce:
v1
v2
v3
assuming that MyEntity is final class MyEntity{final String k=null;}. Note that you can process infinite JSON documents using this approach too.
I have 2 suggestions here:
Deserialize the entire collection in 3 lines:
Gson gson = new Gson();
Type listType = new TypeToken<ArrayList<MyEntity>>(){}.getType();
List<MyEntity> listOf = gson.fromJson(jsonStr, listType);
Once you have the whole list of entities, use bulkInsert with a single transaction. There you can get an idea of how to use it.
P.S.
To use bulkInsert you have to create list of ContentValues from your Entities.
I am attempting to use Gson to take some Java object, serialize it to JSON, and get a byte array that represents that JSON. I need a byte array because I am passing the output on to an external dependency that requires it to be a byte array.
public byte[] serialize(Object object){
return gson.toJson(object).getBytes();
}
I have 2 questions:
If the input is a String gson seems to return the String as is. It doesn't do any validation of the input. Is this expected? I'd like to use Gson in a way that it would validate that the input object is actually Json. How could I do this?
I'm gonna be invoking this serialize function several thousands of times over a short period. Converting to String and then to byte[] could be some unwanted overhead. Is there a more optimal way to get the byte[]?
edit: my answer on point 1 was misinformed.
2) There will be a lot of unnecessary overhead from reflection if you just use the vanilla Gson converter. Writing a custom adapter would very much be a performance benefit in your case. Here is an article with more info on that:
https://open.blogs.nytimes.com/2016/02/11/improving-startup-time-in-the-nytimes-android-app/?_r=0
If the input is a String gson seems to return the String as is. It doesn't do any validation of the input. Is this expected?
Yes, this is fine. It just returns a JSON string representation of the given string.
I'd like to use Gson in a way that it would validate that the input object is actually Json. How could I do this?
No need per se. The Gson.toJson() method accepts objects to be serialized and always generates valid JSON. If you mean deserialization, then Gson fails fast on invalid JSON documents during reading/parsing/deserialization (actually during reading, which is the bottom-most layer of Gson).
I'm gonna be invoking this serialize function several thousands of times over a short period. Converting to String and then to byte[] could be some unwanted overhead. Is there a more optimal way to get the byte[]?
Yes, accumulating a JSON string just to expose a clone of its internal char[] is a waste of memory, of course. Gson is basically a stream-oriented tool, and note that there are Gson.toJson method overloads accepting an Appendable that are basically the Gson core (just take a quick look at how Gson.toJson(Object) works -- it just creates a StringWriter instance to accumulate a string because of the Appendable interface). It would be extremely cool if Gson could emit JSON tokens via a Reader rather than writing to an Appendable, but this idea was refused and most likely will never be implemented in Gson, unfortunately. Since Gson does not emit JSON tokens during serialization in a read-semantics manner (from your code's perspective), you have to buffer the whole result:
private static byte[] serializeToBytes(final Object object)
throws IOException {
final ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
final OutputStreamWriter writer = new OutputStreamWriter(outputStream, StandardCharsets.UTF_8);
gson.toJson(object, writer);
writer.flush();
return outputStream.toByteArray();
}
This one does not use a StringWriter, so it does not accumulate an intermediate string and copy its characters back and forth between arrays. I don't know if there are writers/output streams that can utilize/re-use existing byte arrays, but I believe there should be some, because that would serve the performance purposes you mentioned in your question.
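One JDK-only possibility, as a rough sketch: ByteArrayOutputStream.reset() discards the written content but keeps the internal array, so a single instance can be reused across calls instead of being reallocated each time. The ReusableByteSink class and its JsonSource callback are hypothetical names used for illustration; the callback stands in for a call such as gson.toJson(object, writer).

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: reuses one growing byte buffer for many serializations. */
final class ReusableByteSink {
    /** Stand-in for whatever writes the JSON text, e.g. w -> gson.toJson(object, w). */
    interface JsonSource {
        void writeTo(Writer writer) throws IOException;
    }

    private final ByteArrayOutputStream buffer = new ByteArrayOutputStream(1024);

    /** Not thread-safe: use one sink per thread. */
    byte[] toBytes(JsonSource source) {
        buffer.reset();                         // drops old content, keeps the internal array
        try {
            Writer writer = new OutputStreamWriter(buffer, StandardCharsets.UTF_8);
            source.writeTo(writer);
            writer.flush();                     // push buffered characters into the byte buffer
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return buffer.toByteArray();            // still one copy, sized exactly to the content
    }
}
```

The final toByteArray() call still copies, so this only avoids regrowing the working buffer on every call. Whether that matters for a few thousand calls is something to measure rather than assume.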
If possible, you can also check your library interface/API for exposing/accepting OutputStreams somehow -- then you could probably easily pass such output streams to the serializeToBytes method or even remove the method. If it can use input streams, not just byte arrays, you could also take a look at converting output streams to input streams so the serializeToBytes method could return an InputStream or a Reader (requires some overhead, but can process infinite data -- need to find the balance):
private static InputStream serializeToByteStream(final Object object)
throws IOException {
final PipedInputStream inputStream = new PipedInputStream();
final OutputStream outputStream = new PipedOutputStream(inputStream);
new Thread(() -> {
try {
final OutputStreamWriter writer = new OutputStreamWriter(outputStream, StandardCharsets.UTF_8);
gson.toJson(object, writer);
writer.flush();
} catch ( final IOException ex ) {
throw new RuntimeException(ex);
} finally {
try {
outputStream.close();
} catch ( final IOException ex ) {
throw new RuntimeException(ex);
}
}
}).start();
return inputStream;
}
Example of use:
final String value = "foo";
System.out.println(Arrays.toString(serializeToBytes(value)));
try ( final InputStream inputStream = serializeToByteStream(value) ) {
int b;
while ( (b = inputStream.read()) != -1 ) {
System.out.print(b);
System.out.print(' ');
}
System.out.println();
}
Output:
[34, 102, 111, 111, 34]
34 102 111 111 34
Both are the ASCII codes of the JSON string literal "foo", including its surrounding double quotes (34).
I am working on a personal project that uses a custom config file. The basic format of the file looks like this:
[users]
name: bob
attributes:
	hat: brown
	shirt: black
another_section:
	key: value
	key2: value2

name: sally
sex: female
attributes:
	pants: yellow
	shirt: red
There can be an arbitrary number of users and each can have different key/value pairs and there can be nested keys/values under a section using tab-stops. I know that I can use json, yaml, or even xml for this config file, however, I'd like to keep it custom for now.
Parsing shouldn't be difficult at all, as I have already written code to parse it. My question is: what is the best way to parse this using clean, structured code, written in a way that won't make future changes difficult (there might be multiple levels of nesting in the future)? Right now, my code looks utterly disgusting. For example,
private void parseDocument() {
String current;
while((current = reader.readLine()) != null) {
if(current.equals("") || current.startsWith("#")) {
continue; //comment
}
else if(current.startsWith("[users]")) {
parseUsers();
}
else if(current.startsWith("[backgrounds]")) {
parseBackgrounds();
}
}
}
private void parseUsers() {
String current;
while((current = reader.readLine()) != null) {
if(current.startsWith("attributes:")) {
while((current = reader.readLine()) != null) {
if(current.startsWith("\t")) {
//add user key/values to User object
}
else if(current.startsWith("another_section:")) {
while((current = reader.readLine()) != null) {
if(current.startsWith("\t")) {
//add user key/values to new User object
}
else if (current.equals("")) {
//newline means that a new user is up to parse next
}
}
}
}
}
else if(!current.isEmpty()) {
//
}
}
}
As you can see, the code is pretty messy, and I have cut it short for the presentation here. I feel there are better ways to do this, maybe without using BufferedReader. Can someone please suggest a better approach that is not as convoluted as mine?
I would suggest not creating custom code for config files. What you're proposing isn't too far removed from YAML (getting started). Use that instead.
See Which java YAML library should I use?
Everyone will recommend using XML because it's simply better.
However, in case you're on a quest to prove your programmer's worth to yourself...
...there is nothing really fundamentally wrong with the code you posted in the sense that it's clear and it's obvious to potential readers what's going on, and unless I'm totally out of the loop on file operations, it should perform pretty much as well as it could.
The one criticism I could offer is that it's not recursive. Every level requires a new level of code to support. I would probably make a recursive function (a function that calls itself with sub-content as parameter and then again if there's sub-sub-content etc.), that could be called, reading all of this stuff into a hashtable with hashtables or something, and then I'd use that hashtable as a configuration object.
Then again, at that point I would probably stop seeing the point and use XML. ;)
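The recursive idea above can be sketched roughly as follows. This is an illustration under assumptions, not a drop-in replacement: the class name IndentedConfigParser is hypothetical, nesting is assumed to be expressed with leading tab characters, a line of the form "key:" with no value is assumed to open a nested section, and section headers such as [users] and the blank-line separator between users are left out.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: parses tab-indented "key: value" lines into nested maps. */
final class IndentedConfigParser {
    private final List<String> lines;
    private int pos = 0;                        // index of the next unread line

    private IndentedConfigParser(List<String> lines) {
        this.lines = lines;
    }

    static Map<String, Object> parse(List<String> lines) {
        return new IndentedConfigParser(lines).parseBlock(0);
    }

    /** Reads all entries at the given depth and recurses for every nested section. */
    private Map<String, Object> parseBlock(int depth) {
        Map<String, Object> block = new LinkedHashMap<>();
        while (pos < lines.size()) {
            String line = lines.get(pos);
            String trimmed = line.trim();
            if (trimmed.isEmpty() || trimmed.startsWith("#")) {
                pos++;                          // skip blank lines and comments
                continue;
            }
            int lineDepth = countLeadingTabs(line);
            if (lineDepth < depth) {
                break;                          // line belongs to an enclosing block
            }
            int colon = trimmed.indexOf(':');
            if (lineDepth > depth || colon < 0) {
                throw new IllegalArgumentException("Malformed line " + (pos + 1) + ": " + line);
            }
            String key = trimmed.substring(0, colon).trim();
            String value = trimmed.substring(colon + 1).trim();
            pos++;
            // An empty value opens a nested section one level deeper.
            block.put(key, value.isEmpty() ? parseBlock(depth + 1) : value);
        }
        return block;
    }

    private static int countLeadingTabs(String line) {
        int count = 0;
        while (count < line.length() && line.charAt(count) == '\t') {
            count++;
        }
        return count;
    }
}
```

Because each nested section is handled by the same method, adding another level of nesting to the file needs no new code.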
I'd recommend changing the configuration file's format to JSON and using an existing library to parse the JSON objects such as FlexJSON.
{
"users": [
{
"name": "bob",
"hat": "brown",
"shirt": "black",
"another_section": {
"key": "value",
"key2": "value2"
}
},
{
"name": "sally",
"sex": "female",
"another_section": {
"pants": "yellow",
"shirt": "red"
}
}
]
}
It looks simple enough for a state machine.
while((current = reader.readLine()) != null) {
if(current.startsWith("[users]"))
state = PARSE_USER;
else if(current.startsWith("[backgrounds]"))
state = PARSE_BACKGROUND;
else if (current.equals("")) {
// Store the user or background that you've been building up if you have one.
switch(state) {
case PARSE_USER:
case USER_ATTRIBUTES:
case USER_OTHER_ATTRIBUTES:
state = PARSE_USER;
break;
case PARSE_BACKGROUND:
case BACKGROUND_ATTRIBUTES:
case BACKGROUND_OTHER_ATTRIBUTES:
state = PARSE_BACKGROUND;
break;
}
} else switch(state) {
case PARSE_USER:
case USER_ATTRIBUTES:
case USER_OTHER_ATTRIBUTES:
if(current.startsWith("attributes:"))
state = USER_ATTRIBUTES;
else if(current.startsWith("another_section:"))
state = USER_OTHER_ATTRIBUTES;
else {
// Split the line into key/value and store into user
// object being built up as appropriate based on state.
}
break;
case PARSE_BACKGROUND:
case BACKGROUND_ATTRIBUTES:
case BACKGROUND_OTHER_ATTRIBUTES:
if(current.startsWith("attributes:"))
state = BACKGROUND_ATTRIBUTES;
else if(current.startsWith("another_section:"))
state = BACKGROUND_OTHER_ATTRIBUTES;
else {
// Split the line into key/value and store into background
// object being built up as appropriate based on state.
}
break;
}
}
// If you have an unstored object, store it.
If you could utilise XML or JSON or other well-known data encoding as the data format, it will be a lot easier to parse/deserialize the text content and extract the values.
For example.
name: bob
attributes:
	hat: brown
	shirt: black
another_section:
	key: value
	key2: value2
can be expressed as the following XML (there are other ways to express it in XML as well):
<config>
<User hat="brown" shirt="black" >
<another_section>
<key>value</key>
<key2>value2</key2>
</another_section>
</User>
</config>
Custom ( Extremely simple )
As I mentioned in the comment below, you can just make them all name and value pairs.
e.g.
name :bob
attributes_hat :brown
attributes_shirt :black
another_section_key :value
another_section_key2 :value2
and then do string split on '\n' (newline) and ':' to extract the key and value or build a dictionary/map object.
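A minimal sketch of that split-based approach, assuming one "name :value" pair per line; the class name FlatConfig is hypothetical. Splitting with a limit of 2 keeps any further colons inside the value.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch: turns flat "name :value" lines into an ordered map. */
final class FlatConfig {
    static Map<String, String> parse(String text) {
        Map<String, String> result = new LinkedHashMap<>();
        for (String line : text.split("\n")) {
            if (line.trim().isEmpty()) {
                continue;                               // ignore blank lines
            }
            String[] parts = line.split(":", 2);        // limit 2: only the first ':' separates
            if (parts.length < 2) {
                throw new IllegalArgumentException("Missing ':' in line: " + line);
            }
            result.put(parts[0].trim(), parts[1].trim());
        }
        return result;
    }
}
```

With the example above, FlatConfig.parse(text).get("attributes_hat") would give "brown". The nesting lives only in the key names, so grouping by prefix is left to the caller.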
A nice way to clean it up would be to use a table, i.e. replace your conditionals with a Map. You can then invoke your parsing methods through reflection (simple) or create a few more classes implementing a common interface (more work but more robust).
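The common-interface variant of the table idea might look like the sketch below. The names SectionDispatcher and SectionHandler are hypothetical, and the handlers here only receive raw lines; the real per-section parsing would go inside them.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: a table of section handlers replaces the if/else chain. */
final class SectionDispatcher {
    /** Common interface for all section parsers (an assumption for this sketch). */
    interface SectionHandler {
        void handleLine(String line);
    }

    private final Map<String, SectionHandler> handlers = new HashMap<>();

    void register(String header, SectionHandler handler) {
        handlers.put(header, handler);
    }

    /** Routes every line to the handler of the most recently seen section header. */
    void parse(BufferedReader reader) {
        SectionHandler current = null;
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.startsWith("#")) {
                    continue;                           // comment line
                }
                SectionHandler next = handlers.get(line.trim());
                if (next != null) {
                    current = next;                     // a registered header switches section
                } else if (current != null) {
                    current.handleLine(line);           // blank lines are forwarded as separators
                } else if (!line.trim().isEmpty()) {
                    throw new IllegalStateException("Content before any section header: " + line);
                }
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }
}
```

Supporting a new section then means one more register("[backgrounds]", handler) call rather than another else-if branch.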