I am using the Jackson streaming api to read in a json file like so:
// Go through json model and grab needed resources.
JsonFactory jsonfactory = new JsonFactory();
JsonParser jp = jsonfactory.createParser(fis);
JsonToken current;
current = jp.nextToken();
ObjectMapper mapper = new ObjectMapper();
if (current != JsonToken.START_OBJECT) {
System.out.println("Error: root should be object: quitting.");
return null;
}
while (jp.nextToken() != JsonToken.END_OBJECT) {
String fieldName = jp.getCurrentName();
// move from field name to field value
if ("Field1".equals(fieldName)) {
jp.nextToken();
JsonNode json = mapper.readTree(jp);
//Manipulate JsonNode
/*Want to write back into json file in place of
old object with manipulated node*/
}
else {
jp.skipChildren();
}
}
The code above parses the JSON file until it finds the field I am looking for, reads that field's value into a JsonNode, and then manipulates some of the data in that node. Is there a way to delete that node from the JSON file and write a newly created POJO into the file under the same field name in its place? Everything I can find online involves reading the whole JSON file into a JsonNode, which I would like to avoid because the file can be quite large.
In-place editing of a file like that is usually pretty complicated. A simpler approach is to create a new temporary file and, for the most part, just copy what you read into it, until you hit the condition where you want to write modified content to the new file instead.
Then, at the end, you could delete the original file and rename the temporary one to "replace" it. Unless disk space is an issue, though, I personally like keeping the original source around (especially in automated systems) for troubleshooting.
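The temporary-file approach described above can be sketched with java.nio.file alone. This is a minimal sketch, not a definitive implementation: the class name TempFileRewrite, the Transformer callback, and the ".bak" backup suffix are all hypothetical names chosen for illustration. In the JSON case, the callback is where a streaming parser-to-generator copy loop would go; the demo callback below simply upper-cases the content to keep the example self-contained.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class TempFileRewrite {

    /** Hypothetical callback: reads the old content from 'in', writes the new content to 'out'. */
    @FunctionalInterface
    public interface Transformer {
        void transform(InputStream in, OutputStream out) throws IOException;
    }

    /**
     * Streams 'original' through 'transformer' into a temporary file in the same
     * directory, copies the original to a ".bak" file, then moves the temporary
     * file into the original's place. Returns the path of the backup.
     */
    public static Path rewrite(Path original, Transformer transformer) {
        Path dir = original.toAbsolutePath().getParent();
        Path backup = dir.resolve(original.getFileName() + ".bak");
        try {
            Path temp = Files.createTempFile(dir, "rewrite-", ".tmp");
            try (InputStream in = Files.newInputStream(original);
                 OutputStream out = Files.newOutputStream(temp)) {
                transformer.transform(in, out);
            } catch (IOException | RuntimeException e) {
                Files.deleteIfExists(temp); // do not leave a half-written file behind
                throw e;
            }
            Files.copy(original, backup, StandardCopyOption.REPLACE_EXISTING);
            Files.move(temp, original, StandardCopyOption.REPLACE_EXISTING);
            return backup;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("rewrite-demo");
        Path file = dir.resolve("data.txt");
        Files.write(file, "hello".getBytes(StandardCharsets.UTF_8));

        // Demo transformer only; a real one would copy token by token instead of
        // loading everything into memory.
        Path backup = rewrite(file, (in, out) ->
                out.write(new String(in.readAllBytes(), StandardCharsets.UTF_8)
                        .toUpperCase()
                        .getBytes(StandardCharsets.UTF_8)));

        System.out.println(new String(Files.readAllBytes(file), StandardCharsets.UTF_8));   // HELLO
        System.out.println(new String(Files.readAllBytes(backup), StandardCharsets.UTF_8)); // hello
    }
}
```

Creating the temporary file in the same directory as the original keeps the final move on one file system, and the original is only replaced after the new content has been written completely.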
I have a stable Spring Boot project that runs. I want to add an endpoint that reads a JSON file from the classpath and passes it through to the response without having to create any model objects (pass-through).
I have no issues reading the JSON file into a JsonNode or ObjectNode; I'm struggling with where to go next to set the data in my response object.
Caveat added later: I do need to update the JSON from a database.
OK, I paired up with a colleague at work and we tried two things. Returning a String was no good: the REST output escapes it and returns one big string. What worked was setting the response object's data to a Map and calling mapper.readValue(jsonFeed, Map.class), which returned the JSON in proper object notation.
@Value("${metadata.json.file}") //defined in application.context
private Resource metaJsonFileName;
public String getJsonFromFile(List<UnitUiItem> uiitems) throws IOException {
JsonNode root;
ObjectMapper mapper = new ObjectMapper();
InputStream stream = metaJsonFileName.getInputStream();
root = mapper.readTree(stream);
JsonNode dataNode = root.get("data");
JsonNode optionDataNode = dataNode.get("storelocation");
((ObjectNode)optionDataNode).putArray("units");
for(UnitUiItem item : uiitems){
JsonNode unitNode = ((ObjectNode)optionDataNode).withArray("units").addObject();
((ObjectNode)unitNode).put("code",item.getCode());
((ObjectNode)unitNode).put("displayName",item.getDisplayName());
}
LOGGER.info("buildMetaJson exit");
return root.toString();
}
//calling method
String jsonFeed = getJsonFromFile(uiitems); // uiitems: the unit items loaded from the database
ObjectMapper mapper = new ObjectMapper();
response.setData(mapper.readValue(jsonFeed, Map.class));
I have some code cleanup to do.. any cleaner ways of doing this?
I am working with an existing API that expects a "Metadata" field as part of its JSON payload. That "Metadata" field is a JSON object that is completely free-form. Currently, I need to read this data from another source, do some enrichment, then pass it on. I am struggling with how to define this "Metadata" object so that it can be any valid JSON object or, if that field was not provided, an empty JSON object.
I attempted to use org.json.JSONObject like so.
//meta is the json string read from the db
JSONObject jsonobject = new JSONObject(meta);
message.Metadata = jsonobject;
However, jackson, not unexpectedly, threw a serialization error:
com.fasterxml.jackson.databind.JsonMappingException: No serializer found for class org.json.JSONObject and no properties discovered...
This is a critical requirement, and I'm guessing I am missing some relatively obvious solution. Any help would be greatly appreciated.
UPDATED FIX
As suggested by @shmosel, I just switched the JSON object to a com.fasterxml.jackson.databind.JsonNode and all works beautifully.
// working code (rough of course)
ObjectMapper mapper = new ObjectMapper();
JsonNode rootNode = null;
try {
rootNode = mapper.readTree(meta);
} catch (IOException e) {
e.printStackTrace();
}
message.Metadata = rootNode;
I am trying to use the Jackson streaming API to deserialize huge objects from XML. The idea is to combine the streaming API and ObjectMapper to parse XML (or JSON) in small chunks. However, I see some inconsistent behavior with the XML parser.
With this code snippet:
try {
String xml1 = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><foo></foo>";
String xml2 = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><foo><bar></bar></foo>";
XmlFactory xmlFactory = new XmlFactory();
JsonParser jp = xmlFactory.createParser(new ByteArrayInputStream(xml1.getBytes()));
JsonToken token = jp.nextToken();
while (token != null) {
System.out.println("xml1 token=" + token);
token = jp.nextToken();
}
jp = xmlFactory.createParser(new ByteArrayInputStream(xml2.getBytes()));
token = jp.nextToken();
while (token != null) {
System.out.println("xml2 token=" + token);
token = jp.nextToken();
}
} catch (IOException e) {
e.printStackTrace();
}
I am getting:
xml1 token=START_OBJECT
xml1 token=END_OBJECT
xml2 token=START_OBJECT
xml2 token=FIELD_NAME
xml2 token=VALUE_NULL
xml2 token=END_OBJECT
Why is the FIELD_NAME token missing for xml1? Why is there just one START_OBJECT token for the second xml? Is there any setting that would allow me to see FIELD_NAME of outer tag?
Problem is quite simple: XML module is different from most other Jackson dataformat modules in that direct access via Streaming API is not supported.
This is mentioned on project README (along with mention that "tree model" is similarly not supported).
Not supported does not necessarily mean "can not be used at all", just that its behavior is different from handling for JSON so callers really need to know what they are doing above and beyond API used for JSON content (and Smile, CBOR, YAML -- even CSV content is represented in a way that is compatible with JSON access).
While you can try to use XmlFactory and the streaming parser/generator, their behavior is controlled by XmlMapper based on metadata from Java classes, so that things work correctly via the databinding API (that is, XmlMapper).
With that, the reason for observed tokens is that such translation is necessary to map to expected Java object structure:
public class Foo {
public Bar bar;
}
which would map to JSON like:
{
"bar" : null
}
as well as XML of
<foo>
<bar></bar>
</foo>
Another way to put this is that XML and JSON data models are fundamentally different, and they can not be trivially translated. Since Jackson's token model is based on JSON, some work is needed to translate XML elements and attributes into the structure that equivalent JSON would have.
Above is not to say that what you try to do is impossible. There are 2 ways you might be able to make things work:
Knowing the translation that XmlParser does, call nextToken() and expect the translated token sequence.
Instead of using XmlParser directly, construct an XMLStreamReader (the low-level StAX streaming parser), read "raw" tokens, then construct a separate XmlParser (via XmlFactory) at the expected location and use that for reading.
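To illustrate the "raw" StAX half of the second option, here is a minimal sketch that uses only the JDK's javax.xml.stream API on the two documents from the question. It covers only the low-level reading; the hand-off to an XmlParser at the chosen element is left out. The class and method names (RawStaxTokens, startElementNames) are made up for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class RawStaxTokens {

    /** Returns the local names of all start elements, in document order. */
    public static List<String> startElementNames(String xml) {
        List<String> names = new ArrayList<>();
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newFactory().createXMLStreamReader(new StringReader(xml));
            try {
                while (reader.hasNext()) {
                    if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                        // Unlike the translated Jackson tokens, the outer element
                        // name is visible here. This is the point where a real
                        // program would decide whether to bind the element.
                        names.add(reader.getLocalName());
                    }
                }
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Malformed XML", e);
        }
        return names;
    }

    public static void main(String[] args) {
        String xml1 = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><foo></foo>";
        String xml2 = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><foo><bar></bar></foo>";
        System.out.println(startElementNames(xml1)); // [foo]
        System.out.println(startElementNames(xml2)); // [foo, bar]
    }
}
```

Here both the outer foo element and the nested bar element show up as start-element events, which is the information the question was missing from the translated token stream.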
I hope this helps.
A kid with a hammer...
I don't know much about Jackson; in fact, I just started using it, thinking of using JSON or YAML instead of XML. But for XML, we have been using XStream with success.
//Consumer side
FileInputStream fis = new FileInputStream(filename);
XStream xs = new XStream();
Object obj = xs.fromXML(fis);
fis.close();
Also, if the case is that you are also originating the serialization and it is from Java, you could use Java serialization altogether for a lower footprint and faster operation.
//producer side
FileOutputStream fos = new FileOutputStream(filename);
ObjectOutputStream oos = new ObjectOutputStream(new BufferedOutputStream(fos));
oos.writeObject(yourVeryComplexObjectStructure); //I am writing a list of ten 1MB objects
oos.flush();
oos.close();
fos.close();
//Consumer side
final FileInputStream fin = new FileInputStream(filename);
final ObjectInputStream ois = new ObjectInputStream(new BufferedInputStream(fin));
@SuppressWarnings("unchecked")
final YourVeryComplexObjectStructureType object = (YourVeryComplexObjectStructureType) ois.readObject();
ois.close();
fin.close();
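For reference, the same producer/consumer round trip can be written with try-with-resources so that the streams are closed even when an exception is thrown. This is a minimal sketch that serializes to a byte array to stay self-contained; for files, FileOutputStream and FileInputStream (wrapped in buffered streams, as above) would take the place of the byte-array streams. The class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class SerializationRoundTrip {

    /** Producer side: writes the object graph into a byte array. */
    public static byte[] serialize(Serializable value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
            oos.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The stream was flushed and closed by try-with-resources above.
        return bytes.toByteArray();
    }

    /** Consumer side: reads the object graph back. */
    public static Object deserialize(byte[] data) {
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return ois.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException("Class of serialized object not on classpath", e);
        }
    }

    public static void main(String[] args) {
        ArrayList<String> original = new ArrayList<>(List.of("first", "second"));
        Object copy = deserialize(serialize(original));
        System.out.println(copy.equals(original)); // true
    }
}
```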
I have been trying to create a JSON string from a large number of documents using the code below, but I either get an out-of-range error or have to wait up to 5 minutes before the string is created. Any idea how I could optimize the code?
public String getJson() throws NotesException {
...
View view1 = ...;
ViewNavigator nav =view1.createViewNav();
ViewEntry ve = nav.getFirst();
JSONObject jsonMain = new JSONObject();
JSONArray items = new JSONArray();
Document docRoot = null;
while (ve != null) {
docRoot= ve.getDocument();
items.add(getJsonDocAndChildren(docRoot));
ViewEntry veTemp = nav.getNextSibling(ve);
ve.recycle();
ve = veTemp;
}
jsonMain.put("identifier", "name");
jsonMain.put("label", "name");
jsonMain.put("items", items);
return jsonMain.toJSONString();
}
private JSONObject getJsonDocAndChildren(Document doc) throws NotesException {
String name = doc.getItemValueString("Name");
JSONObject jsonDoc = new JSONObject();
jsonDoc.put("name", name);
jsonDoc.put("field", doc.getItemValueString("field"));
DocumentCollection responses = doc.getResponses();
JSONArray children = new JSONArray();
getDocEntry(name, children); // adds every document whose field has the same value as name to children
if (responses.getCount() > 0) {
Document docResponse = responses.getFirstDocument();
while (docResponse != null) {
children.add(getJsonDocAndChildren(docResponse));
Document docTemp = responses.getNextDocument(docResponse);
docResponse.recycle();
docResponse = docTemp;
}
}
jsonDoc.put("children", children);
return jsonDoc;
}
There are a few things here, ranging from general efficiency to optimizations based on how you want to use the code.
The big one that would likely speed up your processing would be to do view operations only, without cracking open the documents. Since it looks like you want to get responses indiscriminately, you could add the response documents to the original view, with the "Show responses in hierarchy" option turned on. Then, if you have columns for Name and field in the view (and no "Show responses only" columns), a nav.getNext() walk down the view will get them in turn. By storing the entry.getIndentLevel() value of the previous entry and comparing it at the start of the loop, you could "step" up and down the JSON tree: when the indent level increases by one, create a new array and add it to the existing object; when it decreases, step up one. It may be a little conceptually awkward at first, having to track previous states in a flat loop, but it'd be much more efficient.
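The indent-level stepping described above can be sketched independently of Domino. In this minimal sketch, the Row class is a hypothetical stand-in for a view entry (its indent field plays the role of entry.getIndentLevel(), assumed to be 0 for top-level documents), and a stack of ancestors replaces the "previous indent" bookkeeping. Names are assumed to need no JSON escaping.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

public class IndentTreeBuilder {

    /** Hypothetical stand-in for a view entry: indent level plus one column value. */
    public static final class Row {
        public final int indent;
        public final String name;
        public Row(int indent, String name) { this.indent = indent; this.name = name; }
    }

    /** A node of the output tree; 'children' plays the role of the JSON array. */
    public static final class Node {
        public final String name;
        public final List<Node> children = new ArrayList<>();
        public Node(String name) { this.name = name; }

        /** Renders the node as compact JSON (no escaping, for illustration only). */
        public String toJson() {
            StringBuilder sb = new StringBuilder("{\"name\":\"").append(name).append("\",\"children\":[");
            for (int i = 0; i < children.size(); i++) {
                if (i > 0) sb.append(',');
                sb.append(children.get(i).toJson());
            }
            return sb.append("]}").toString();
        }
    }

    /** Builds the tree in one flat pass over rows that are in view (hierarchy) order. */
    public static List<Node> build(List<Row> rows) {
        List<Node> roots = new ArrayList<>();
        Deque<Node> ancestors = new ArrayDeque<>(); // top of stack = most recent row
        for (Row row : rows) {
            // Step up until the stack holds only the ancestors of this row.
            while (ancestors.size() > row.indent) {
                ancestors.pop();
            }
            Node node = new Node(row.name);
            if (ancestors.isEmpty()) {
                roots.add(node);                     // top-level document
            } else {
                ancestors.peek().children.add(node); // response to the entry above
            }
            ancestors.push(node);
        }
        return roots;
    }

    public static void main(String[] args) {
        List<Row> rows = Arrays.asList(
                new Row(0, "A"), new Row(1, "A1"), new Row(2, "A1a"),
                new Row(1, "A2"), new Row(0, "B"));
        for (Node root : build(rows)) {
            System.out.println(root.toJson());
        }
        // {"name":"A","children":[{"name":"A1","children":[{"name":"A1a","children":[]}]},{"name":"A2","children":[]}]}
        // {"name":"B","children":[]}
    }
}
```

The stack depth always equals the indent level of the last row plus one, so an increase in indent attaches to the previous entry and a decrease pops back up to the right parent.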
Another option, also having the benefit of not having to crack open each individual document, would be to have a view of the response documents categorized by @Text($REF) and then make your recursive method look more like:
public static void walkTree(final View treeView, final String documentId) {
ViewNavigator nav = treeView.createViewNavFromCategory(documentId);
nav.setBufferMaxEntries(400);
for (ViewEntry entry : nav) {
// Do code here
walkTree(treeView, entry.getUniversalID());
}
}
(That example is using the OpenNTF Domino API, but, if you're not using that, you could down-convert the for loop to the legacy style)
As a minor improvement any time you traverse through ViewNavigators, you can set view.setAutoUpdate(false) and then nav.setBufferMaxEntries(400) to improve the internal caching.
And finally, depending on your needs - say, if you're outputting the JSON directly to an HTTP response's output stream - you could use JsonWriter instead of JsonObject to stream the content out instead of building a huge object in memory. I wrote about it with some simple code here: https://frostillic.us/blog/posts/EF0B875453B3CFC285257D570072F78F
You should first determine where the time is spent in your code. Maybe it is in doc.getResponses() or responses.getNextDocument() which you did not show here.
The obvious optimization which could be done within your code snippet is the following:
Basically you have some data structure called Document and build up a corresponding in memory JSON structure consisting of JSONObjects and JSONArrays. This JSON structure is then serialized to a String and returned.
Instead of building the JSON structure you could directly use a JsonWriter (don't know what JSON library you are using but there must be something like a JsonWriter). This avoids the memory allocations for the temporary JSON structure.
In getJson() you start:
StringWriter stringOut = new StringWriter();
JsonWriter out = new JsonWriter(stringOut);
and end
return stringOut.toString();
Now, everywhere you create JSONObjects or JSONArrays, you invoke the corresponding writer methods instead, e.g.:
private void getJsonDocAndChildren(Document doc, JsonWriter out) throws NotesException {
out.name("name");
out.value(doc.getItemValueString("Name"));
out.name("field");
out.value(doc.getItemValueString("field"));
DocumentCollection responses = doc.getResponses();
if (responses.getCount() > 0) {
Document docResponse = responses.getFirstDocument();
out.startArray();
...
Hope you get the idea.
Let's say I have a json that looks like this:
{"body":"abcdef","field":"fgh"}
Now suppose the value of the 'body' element is huge(~100 MB or more). I would like to stream out the value of the body element instead of storing it in a String.
How can I do this? Is there any Java library I could use for this?
This is the line of code that fails with an OutOfMemoryError when a large JSON value comes in:
String inputStreamString = (String) JsonPath.read(textValue.toString(), "$.body");
'textValue' here is a hadoop.io.Text object.
I'm assuming that the OutOfMemory error occurs because we try to do method calls like toString() (which creates a new object), and JsonPath.read(), all of which are done in-memory. I need to know if there is an approach I could take while handling large-sized textValue objects.
Please let me know if you need additional info.
JsonSurfer is good for processing very large JSON data with selective extraction.
An example of surfing JSON data and collecting matched values in listeners:
BufferedReader reader = new BufferedReader(new FileReader(jsonFile));
JsonSurfer surfer = new JsonSurfer(GsonParser.INSTANCE, GsonProvider.INSTANCE);
SurfingConfiguration config = surfer.configBuilder().bind("$.store.book[*]", new JsonPathListener() {
@Override
public void onValue(Object value, ParsingContext context) throws Exception {
JsonObject book = (JsonObject) value;
}
}).build();
surfer.surf(reader, config);
Jackson offers a streaming API for generating and processing JSON data.